Generated Module Layout

This document explains the structure of a generated C inference module and how to integrate it with different build systems: Generic CMake, Zephyr, and neuralSPOT.

Each generated module depends on our optimized kernel library, ns-cmsis-nn, a fork of CMSIS-NN with additional kernels and fixes. At generation time the AOT tool ranks the available kernels and picks the best one compatible with the operator configuration and target platform; if no specialized kernel applies, a safe fallback is selected automatically.

Key artifacts:

  • Public headers under includes-api/
  • Implementation in src/
  • Build glue for your environment: CMakeLists.txt, zephyr/ (module), or module.mk (neuralSPOT)

At-a-glance

Each generated module ships as a small, portable C library plus headers. The packaging differs slightly depending on the target ecosystem, but the API (includes-api/) and sources (src/) are the same.

<module-name>/
├─ LICENSE                     # License for this generated module
├─ README.md                   # This file
├─ includes-api/               # Public headers (stable API surface)
└─ src/                        # Implementation (operators, model glue, test helpers)

Depending on the selected module type, you will also see one of the following:

Generic CMake:

<module-name>/
...
└─ CMakeLists.txt            # Builds a static library of the module

Zephyr:

<module-name>/
...
└─ zephyr/
   ├─ CMakeLists.txt         # Zephyr CMake module
   ├─ Kconfig                # Zephyr Kconfig file
   └─ module.yml             # Zephyr module metadata

neuralSPOT:

<module-name>/
...
└─ module.mk                 # neuralSPOT module makefile

What’s in includes-api/

Public headers (names may vary slightly depending on the generated content):

  • prefix_common.h – Common types, helpers, and compile-time configuration.
  • prefix_context.h – Runtime context struct and types.
  • prefix_model.h – Model entry points (e.g., prefix_model_init, prefix_model_run).
  • prefix_operators.h – Operator function declarations.
  • prefix_tensors.h – Tensor descriptors, constants, and arenas (if internal).
  • prefix_test_case.h – Optional test harness helpers (stimulus structure, smoke tests).
  • prefix_platform.h – Platform abstraction hooks and macros (e.g., the PREFIX_PRINTF macro).

Note: The API is designed to be freestanding C (C99), easy to embed into firmware projects with or without an RTOS. The prefix is configurable at generation time to allow multiple modules to coexist in the same project.


What’s in src/

Typical source files include:

  • prefix_context.c – Initializes the runtime context and wires arenas/tensors.
  • prefix_model.c – The model’s init and run functions.
  • prefix_tensors.c – Tensor descriptor tables and (optionally) constant data.
  • prefix_test_case.c – Optional smoke test / reference runner.
  • prefix_fully_connected_*.c, prefix_conv2d_*.c, … – Generated operator wrappers.

Multiple kernel variants may exist for the same logical operator. The AOT generator ranks and selects the best kernel for the layer configuration and platform.


Integration Guide

HeliosAOT currently supports targeting the CMake, Zephyr, and neuralSPOT frameworks. The integration steps differ slightly depending on the target; follow the section for your framework below.

Generic CMake

STEP 1: Add NS-CMSIS-NN dependency to your project

Copy or submodule ns-cmsis-nn module into your project's modules directory:

cp -r <path-to-ns-cmsis-nn> <project-root>/modules/ns-cmsis-nn

STEP 2: Add your AOT module to your project

Copy or submodule AOT module folder into your project's modules directory:

cp -r <path-to-aot-module> <project-root>/modules/

Alternatively, you can have HeliosAOT generate directly into your project modules directory:

helios-aot convert ... --module.path <project-root>/modules

STEP 3: Update your application CMakeLists.txt

In your app CMakeLists.txt:

# Add modules (these must contain a top-level CMakeLists.txt)
add_subdirectory(modules/ns-cmsis-nn)
add_subdirectory(modules/<module-name>)

add_executable(app
  src/main.c
  # ... your sources ...
)

target_link_libraries(app PRIVATE
  cmsis-nn # <- this is the NS-CMSIS-NN library target
  <module-name> # <- this is the generated AOT module
)

# (Optional) If you need to set CPU/ABI flags here:
# target_compile_options(app PRIVATE -mcpu=cortex-m55 -mthumb -Ofast)
# target_link_options(app    PRIVATE -mcpu=cortex-m55 -mthumb -Wl,--gc-sections)

Zephyr

STEP 1: Add the NS-CMSIS-NN Zephyr module dependency

If you have access to the NS-CMSIS-NN repository, you can add it as a west project. Inside your Zephyr project, add the following to your west.yml manifest file:

remotes:
  # ... existing remotes
  - name: ambiqai              # New remote for AmbiqAI modules
    url-base: git@github.com:AmbiqAI

projects:
  # ... existing projects
  - name: ns-cmsis-nn
    remote: ambiqai
    repo-path: ns-cmsis-nn
    path: modules/ns-cmsis-nn
    revision: main             # or pin a tag

Ensure the manifest schema version is supported by your installed west (e.g., version: "1.2"), and run west update after editing.

Alternatively, if you do not manage the dependency with west, copy or submodule the NS-CMSIS-NN module into your Zephyr project under modules:

cp -r <path-to-ns-cmsis-nn> <project-root>/modules/ns-cmsis-nn

Inside your Zephyr application CMakeLists.txt, add:

list(APPEND ZEPHYR_EXTRA_MODULES
    ${CMAKE_CURRENT_SOURCE_DIR}/modules/ns-cmsis-nn
)
# IMPORTANT: ^ must appear **before** find_package(Zephyr)

STEP 2: Add your AOT module to your project

Copy or submodule your AOT module folder into your Zephyr project under modules:

cp -r <path-to-aot-module> <project-root>/modules/

Inside your Zephyr application CMakeLists.txt, add:

list(APPEND ZEPHYR_EXTRA_MODULES
    ... # Existing modules
    ${CMAKE_CURRENT_SOURCE_DIR}/modules/<module-name>
)
# IMPORTANT: ^ must appear **before** find_package(Zephyr)

STEP 3: Update your prj.conf

... # Existing configurations

# Enable FPU support (needed for DSP and MVE)
CONFIG_FPU=y
# If multiple threads/ISRs may use MVE/FP:
# CONFIG_FPU_SHARING=y

# Enable NS-CMSIS-NN
CONFIG_NS_CMSIS_NN=y
# Enable all kernels
CONFIG_NS_CMSIS_NN_ALL=y
# -OR- selectively add kernels
# CONFIG_NS_CMSIS_NN_ACTIVATION=y
# CONFIG_NS_CMSIS_NN_BASICMATH=y
# CONFIG_NS_CMSIS_NN_CONCATENATION=y
# CONFIG_NS_CMSIS_NN_CONVOLUTION=y
# CONFIG_NS_CMSIS_NN_FULLYCONNECTED=y
# CONFIG_NS_CMSIS_NN_LSTM=y
# CONFIG_NS_CMSIS_NN_NNSUPPORT=y
# CONFIG_NS_CMSIS_NN_PAD=y
# CONFIG_NS_CMSIS_NN_POOLING=y
# CONFIG_NS_CMSIS_NN_QUANTIZATION=y
# CONFIG_NS_CMSIS_NN_RESHAPE=y
# CONFIG_NS_CMSIS_NN_SOFTMAX=y
# CONFIG_NS_CMSIS_NN_STRIDED_SLICE=y
# CONFIG_NS_CMSIS_NN_SVD=y
# CONFIG_NS_CMSIS_NN_TRANSPOSE=y

# Enable the <module-name> module
CONFIG_<MODULE_NAME>=y
# ^ Uppercase, with hyphens replaced by underscores
#   (e.g. a module named "kws-model" -> CONFIG_KWS_MODEL=y)

STEP 4: Refresh your west manifest

west update

Verify you see both NS-CMSIS-NN and AOT module folders inside <project-root>/modules.


STEP 5: Build your Zephyr application

Rebuild your application with a pristine build to ensure all modules are rebuilt:

west build -p always -b <board> -s <app-path> -d <build-dir>
# -OR-
west build -b <board> -s <app-path> -d <build-dir> --pristine=always

neuralSPOT

The module ships a module.mk so you can drop it straight into your neuralSPOT app.

STEP 1: Add NS-CMSIS-NN dependency to your project

Copy or submodule ns-cmsis-nn module into your project's modules directory:

cp -r <path-to-ns-cmsis-nn> <project-root>/modules/ns-cmsis-nn

STEP 2: Add AOT module to your project

Copy or submodule your AOT module folder into your project's modules directory:

cp -r <path-to-aot-module> <project-root>/modules/

Alternatively, you can have HeliosAOT generate directly into your project modules directory:

helios-aot convert ... --module.path <project-root>/modules

STEP 3: Update your application makefile

Add the modules into your makefile:

# In your app's makefile
modules      += <path-to-ns-cmsis-nn>
modules      += <path-to-module>

STEP 4: Build your application

make clean
make

The module leverages neuralSPOT's make-library makefile helper to generate a static library.


Usage Example

Minimal runtime sequence in your main.c:

#include "prefix_model.h"
#include "prefix_test_case.h"        // optional
#include "arm_nnsupportfunctions.h"  // for arm_memset_s8

static void model_layer_status_cb(
    int32_t op,
    prefix_operator_state_t state,
    int32_t status,
    void *user_data
) {
    if (state == prefix_op_state_init_finished) {
        PREFIX_PRINTF("Operator %d init finished with status %d\n", (int)op, (int)status);
    }
    if (state == prefix_op_state_run_finished) {
        PREFIX_PRINTF("Operator %d run finished with status %d\n", (int)op, (int)status);
    }
}

static prefix_model_context_t prefix_model_ctx = {
    .callback = model_layer_status_cb,
    .user_data = NULL
};

int main(void) {

    int32_t status;

    // 1. Initialize model
    status = prefix_model_init(&prefix_model_ctx);

    // 2. Fill the input tensors in the context (zeroed here as a
    //    placeholder; copy your real input data instead)
    for (int i = 0; i < prefix_num_inputs; i++) {
        arm_memset_s8(
            prefix_model_ctx.inputs[i].data,
            0,
            prefix_model_ctx.inputs[i].size
        );
    }

    // 3. Invoke model
    status = prefix_model_run(&prefix_model_ctx);

    // 4. Retrieve output tensors from context
    // ...

    // Alternatively, run included smoke test
    prefix_test_case_init();
    status = prefix_test_case_run();

    return 0;
}

Printing & Macros

The module uses PREFIX_PRINTF(...) for printing in the smoke test. A number of macro stubs are also used for memory placement, e.g. PREFIX_PUT_IN_DTCM. These macros live in includes-api/prefix_platform.h. For certain frameworks they come pre-wired; otherwise they default to no-ops, and you are responsible for defining them as needed. For example:

#include <stdio.h>
#define PREFIX_PRINTF(...)  printf(__VA_ARGS__)

Memory & Arena Buffers

The generated module relies on arena buffers for temporary scratch tensors. Unlike TFLM, we know exactly the size of these arenas ahead of time. By default, the module will internally pre-allocate these arenas which are accessible via the context struct once initialized. If desired, the arenas can be allocated externally and passed into the context before calling prefix_model_init(). This is useful for multi-model scenarios.

Based on the configuration and planner, all tensors are placed in one of the available memory regions (e.g. DTCM, SRAM, etc.). Pre-processor macros are used to assign the locations. These macros (e.g. PREFIX_PUT_IN_DTCM) are defined in prefix_platform.h. As discussed in Printing & Macros, these macros are only pre-wired for certain frameworks. Otherwise, the user must define these macros based on their memory map.


Quick Checklist

  • Add the module to your project (CMake, Zephyr, or neuralSPOT).
  • Include <module>/includes-api in your compiler include paths.
  • Link the generated static library (and NS-CMSIS-NN).
  • Provide definitions for PREFIX_PRINTF and the other platform macros.
  • Verify stack/heap sizes and memory regions in your linker script.
  • Call prefix_model_init() and then run your model via prefix_model_run().

Support

If you have questions about platform bring-up, memory placement, or NS-CMSIS-NN kernel selections, please reach out to the project maintainers.