# Generated Module Layout
This document explains the structure of a generated C inference module and how to integrate it with different build systems: Generic CMake, Zephyr, and neuralSPOT.
It depends on our optimized kernel library, ns-cmsis-nn, a fork of CMSIS-NN with extra kernels and fixes. At build time the AOT runtime picks the best-ranked kernel that is compatible with the operator configuration and target platform; if a specialized kernel isn’t applicable, a safe fallback is selected automatically.
Key artifacts:
- Public headers under includes-api/
- Implementation in src/
- Build glue for your environment: CMakeLists.txt, zephyr/ (module), or module.mk (neuralSPOT)
## At-a-glance

Each generated module ships as a small, portable C library plus headers. The packaging differs slightly depending on the target ecosystem, but the API (`includes-api/`) and sources (`src/`) are the same.

```
<module-name>/
├─ LICENSE          # License for this generated module
├─ README.md        # This file
├─ includes-api/    # Public headers (stable API surface)
└─ src/             # Implementation (operators, model glue, test helpers)
```
Depending on the selected module type, you will also see one of the following:

- `CMakeLists.txt` – generic CMake library target
- `zephyr/` – Zephyr module glue (module manifest and Kconfig)
- `module.mk` – neuralSPOT makefile glue
## What's in `includes-api/`

Public headers (names may vary slightly depending on the generated content):

- `prefix_common.h` – Common types, helpers, and compile-time configuration.
- `prefix_context.h` – Runtime context struct and types.
- `prefix_model.h` – Model entry points (e.g., `prefix_model_init`, `prefix_model_run`).
- `prefix_operators.h` – Operator function declarations.
- `prefix_tensors.h` – Tensor descriptors, constants, and arenas (if internal).
- `prefix_test_case.h` – Optional test harness helpers (stimulus structure, smoke tests).
- `prefix_platform.h` – Platform abstraction hooks and macros (e.g., the `PREFIX_PRINTF` macro).

Note: The API is designed to be freestanding C (C99), easy to embed into firmware projects with or without an RTOS. The `prefix` is configurable at generation time so that multiple modules can coexist in the same project.
## What's in `src/`

Typical source files include:

- `prefix_context.c` – Initializes the runtime context and wires arenas/tensors.
- `prefix_model.c` – The model's `init` and `run` functions.
- `prefix_tensors.c` – Tensor descriptor tables and (optionally) constant data.
- `prefix_test_case.c` – Optional smoke test / reference runner.
- `prefix_fully_connected_*.c`, `prefix_conv2d_*.c`, … – Generated operator wrappers.
Multiple kernel variants may exist for the same logical operator. The AOT generator ranks and selects the best kernel for the layer configuration and platform.
## Integration Guide

HeliosAOT currently supports targeting the generic CMake, Zephyr, and neuralSPOT frameworks. The integration steps differ slightly per target; follow the subsection below that matches your framework.
### Generic CMake

#### STEP 1: Add NS-CMSIS-NN dependency to your project

Copy or submodule the ns-cmsis-nn module into your project's modules directory:
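For example, assuming the AmbiqAI GitHub remote used in the Zephyr manifest below:

```bash
git submodule add git@github.com:AmbiqAI/ns-cmsis-nn.git modules/ns-cmsis-nn
```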
#### STEP 2: Add your AOT module to your project

Copy or submodule the AOT module folder into your project's modules directory:
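For example (the repository URL is a placeholder if your module lives in its own repo):

```bash
cp -r /path/to/<module-name> modules/<module-name>
# or, if the module lives in its own repository:
git submodule add <module-repo-url> modules/<module-name>
```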
Alternatively, you can have HeliosAOT generate the module directly into your project's modules directory.
#### STEP 3: Update your application CMakeLists.txt

In your app CMakeLists.txt:

```cmake
# Add modules (these must contain a top-level CMakeLists.txt)
add_subdirectory(modules/ns-cmsis-nn)
add_subdirectory(modules/<module-name>)

add_executable(app
    src/main.c
    # ... your sources ...
)

target_link_libraries(app PRIVATE
    cmsis-nn       # <- this is the NS-CMSIS-NN library target
    <module-name>  # <- this is the generated AOT module
)

# (Optional) If you need to set CPU/ABI flags here:
# target_compile_options(app PRIVATE -mcpu=cortex-m55 -mthumb -Ofast)
# target_link_options(app PRIVATE -mcpu=cortex-m55 -mthumb -Wl,--gc-sections)
```
### Zephyr

#### STEP 1: Add NS-CMSIS-NN Zephyr module dependency

If you have access to the NS-CMSIS-NN repository, you can add it as a west project. Inside your Zephyr project, add the following to your west.yml manifest file:

```yaml
manifest:
  remotes:
    # ... existing remotes ...
    - name: ambiqai              # New remote for AmbiqAI modules
      url-base: git@github.com:AmbiqAI
  projects:
    # ... existing projects ...
    - name: ns-cmsis-nn
      remote: ambiqai
      repo-path: ns-cmsis-nn
      path: modules/ns-cmsis-nn
      revision: main             # or pin a tag
```

Ensure the manifest schema version is supported by your west installation (e.g., `version: "1.2"`) and run `west update` after editing.
Alternatively, copy or submodule the NS-CMSIS-NN module into your Zephyr project under `modules/`:
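For example:

```bash
git submodule add git@github.com:AmbiqAI/ns-cmsis-nn.git modules/ns-cmsis-nn
```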
Inside your Zephyr application CMakeLists.txt, add:
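Mirroring the AOT module snippet in STEP 2 below:

```cmake
list(APPEND ZEPHYR_EXTRA_MODULES
    ${CMAKE_CURRENT_SOURCE_DIR}/modules/ns-cmsis-nn
)
# IMPORTANT: ^ must appear **before** find_package(Zephyr)
```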
#### STEP 2: Add your AOT module to your project

Copy or submodule your AOT module folder into your Zephyr project under `modules/`:
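For example:

```bash
cp -r /path/to/<module-name> modules/<module-name>
```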
Inside your Zephyr application CMakeLists.txt, add:
```cmake
list(APPEND ZEPHYR_EXTRA_MODULES
    # ... existing modules ...
    ${CMAKE_CURRENT_SOURCE_DIR}/modules/<module-name>
)
# IMPORTANT: ^ must appear **before** find_package(Zephyr)
```
#### STEP 3: Update your prj.conf

```
# ... existing configuration ...

# Enable FPU support (needed for DSP and MVE)
CONFIG_FPU=y
# If multiple threads/ISRs may use MVE/FP:
# CONFIG_FPU_SHARING=y

# Enable NS-CMSIS-NN
CONFIG_NS_CMSIS_NN=y

# Enable all kernels
CONFIG_NS_CMSIS_NN_ALL=y
# -OR- selectively enable kernels
# CONFIG_NS_CMSIS_NN_ACTIVATION=y
# CONFIG_NS_CMSIS_NN_BASICMATH=y
# CONFIG_NS_CMSIS_NN_CONCATENATION=y
# CONFIG_NS_CMSIS_NN_CONVOLUTION=y
# CONFIG_NS_CMSIS_NN_FULLYCONNECTED=y
# CONFIG_NS_CMSIS_NN_LSTM=y
# CONFIG_NS_CMSIS_NN_NNSUPPORT=y
# CONFIG_NS_CMSIS_NN_PAD=y
# CONFIG_NS_CMSIS_NN_POOLING=y
# CONFIG_NS_CMSIS_NN_QUANTIZATION=y
# CONFIG_NS_CMSIS_NN_RESHAPE=y
# CONFIG_NS_CMSIS_NN_SOFTMAX=y
# CONFIG_NS_CMSIS_NN_STRIDED_SLICE=y
# CONFIG_NS_CMSIS_NN_SVD=y
# CONFIG_NS_CMSIS_NN_TRANSPOSE=y

# Enable the <module-name> module
# (uppercase, with hyphens replaced by underscores)
CONFIG_<MODULE_NAME>=y
```
#### STEP 4: Refresh your west manifest
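If you added ns-cmsis-nn via west.yml, pull it into your workspace:

```bash
west update
```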
Verify that you see both the NS-CMSIS-NN and AOT module folders inside `<project-root>/modules`.
#### STEP 5: Build your Zephyr application

Rebuild your application with a pristine build to ensure all modules are rebuilt:
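For example (the board and application path are placeholders):

```bash
west build -p always -b <your-board> <path-to-app>
```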
### neuralSPOT

The module ships a `module.mk` so you can easily drop it into your neuralSPOT app.
#### STEP 1: Add NS-CMSIS-NN dependency to your project

Copy or submodule the ns-cmsis-nn module into your project's modules directory:
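For example:

```bash
git submodule add git@github.com:AmbiqAI/ns-cmsis-nn.git modules/ns-cmsis-nn
```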
#### STEP 2: Add AOT module to your project

Copy or submodule your AOT module folder into your project's modules directory:
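For example:

```bash
cp -r /path/to/<module-name> modules/<module-name>
```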
Alternatively, you can have HeliosAOT generate the module directly into your project's modules directory.
#### STEP 3: Update your application makefile

Add the modules to your makefile (see the sketch below):
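The exact variable depends on your neuralSPOT version; as a sketch, assuming your application makefile collects components in a `modules` list the way the neuralSPOT examples do:

```make
# Register the kernel library and the generated AOT module
modules += modules/ns-cmsis-nn
modules += modules/<module-name>
```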
#### STEP 4: Build your application

The module leverages neuralSPOT's `make-library` makefile helper to generate a static library.
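A regular neuralSPOT build then picks up the registered modules:

```bash
make
```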
## Usage Example

Minimal runtime sequence in your `main.c`:

```c
#include "prefix_model.h"
#include "prefix_test_case.h" // optional

static void model_layer_status_cb(
    int32_t op,
    prefix_operator_state_t state,
    int32_t status,
    void *user_data
) {
    (void)user_data; // unused in this example
    if (state == prefix_op_state_init_finished) {
        PREFIX_PRINTF("Operator %d init finished with status %d\n", op, status);
    }
    if (state == prefix_op_state_run_finished) {
        PREFIX_PRINTF("Operator %d run finished with status %d\n", op, status);
    }
}

static prefix_model_context_t prefix_model_ctx = {
    .callback = model_layer_status_cb,
    .user_data = NULL
};

int main(void) {
    int32_t status;

    // 1. Initialize model
    status = prefix_model_init(&prefix_model_ctx);

    // 2. Fill the input tensors in the context
    //    (zeroed here as a placeholder; copy your real input data instead)
    for (int i = 0; i < prefix_num_inputs; i++) {
        arm_memset_s8(
            prefix_model_ctx.inputs[i].data,
            0,
            prefix_model_ctx.inputs[i].size
        );
    }

    // 3. Invoke model
    status = prefix_model_run(&prefix_model_ctx);

    // 4. Retrieve output tensors from context
    // ...

    // Alternatively, run the included smoke test
    prefix_test_case_init();
    status = prefix_test_case_run();

    return 0;
}
```
## Printing & Macros

The module uses `PREFIX_PRINTF(...)` for printing in the smoke test. Additionally, a number of macro stubs are used for memory placement, e.g. `PREFIX_PUT_IN_DTCM`. These macros live in `includes-api/prefix_platform.h`. For certain frameworks they are already pre-wired; otherwise they default to no-ops, and the end user is responsible for defining them as needed.
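For example, on a GCC-based build you might route printing to your logger and map the DTCM placement macro onto a linker section. This is a sketch, assuming `prefix_platform.h` guards its defaults with `#ifndef` so they can be overridden, and that your linker script provides a `.dtcm` section:

```c
// Define before including the module headers (or pass via -D compile flags):
#include <stdio.h>

#define PREFIX_PRINTF(...)  printf(__VA_ARGS__)
#define PREFIX_PUT_IN_DTCM  __attribute__((section(".dtcm")))
```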
## Memory & Arena Buffers

The generated module relies on arena buffers for temporary scratch tensors. Unlike TFLM, we know the exact size of these arenas ahead of time. By default, the module internally pre-allocates the arenas, which are accessible via the context struct once initialized. If desired, the arenas can be allocated externally and passed into the context before calling `prefix_model_init()`. This is useful for multi-model scenarios.
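A minimal sketch of external arena allocation; the field names (`arena`, `arena_size`) and the size constant are hypothetical, so check `prefix_context.h` for the actual context layout:

```c
#include <stdint.h>
#include "prefix_model.h"

// Hypothetical size constant; the generated headers expose the real one.
static uint8_t model_arena[PREFIX_ARENA_SIZE];

static prefix_model_context_t ctx;

void setup_model(void) {
    ctx.arena      = model_arena;          // hypothetical field name
    ctx.arena_size = sizeof(model_arena);  // hypothetical field name
    prefix_model_init(&ctx);
}
```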
Based on the configuration and planner, every tensor is placed in one of the available memory regions (e.g. DTCM, SRAM). Preprocessor macros assign these locations; the macros (e.g. `PREFIX_PUT_IN_DTCM`) are defined in `prefix_platform.h`. As discussed in Printing & Macros, they are only pre-wired for certain frameworks; otherwise you must define them to match your memory map.
## Quick Checklist

- Add the module to your project (CMake, Zephyr, or neuralSPOT).
- Include `<module>/includes-api` in your compiler include paths.
- Link the generated static library (and NS-CMSIS-NN).
- Provide `PREFIX_PRINTF` and the other preprocessor macro definitions.
- Verify stack/heap sizes and memory regions in your linker script.
- Call `prefix_model_init()` and then run your model via `prefix_model_run()`.
## Support
If you have questions about platform bring-up, memory placement, or NS-CMSIS-NN kernel selections, please reach out to the project maintainers.