Operator & Kernel Coverage
heliaCORE is organized around operator families and kernel variants. Operators are the public neural-network operations users care about (Conv2D, Pad, Softmax, LeakyReLU, and so on). Kernel variants are the implementations selected for a target CPU, data type, and acceleration path.
The goal is not to document every function on this page. For exact prototypes, parameters, scratch-buffer queries, and per-function behavior, use the generated API reference described in API.
Acceleration paths
Kernel selection happens at compile time and follows the feature flags that the compiler and build system produce for Ambiq targets.
| Path | Typical Ambiq use | Notes |
|---|---|---|
| Pure C | Baseline path for smallest Cortex-M targets. | Always available; keeps functionality portable inside Ambiq builds. |
| DSP | Apollo-class Cortex-M targets with DSP extension. | Uses SIMD-style DSP intrinsics where they help latency and code size. |
| MVE / Helium | Cortex-M55-class Apollo targets. | Primary acceleration focus where MVE is present; vectorized kernels target real HELIA workloads. |
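
As a rough illustration of compile-time path selection, the sketch below picks one of the three paths from the standard Arm C Language Extensions (ACLE) feature macros `__ARM_FEATURE_MVE` and `__ARM_FEATURE_DSP`, which the compiler defines from the target flags. The enum, macro, and function names are hypothetical and chosen for this example only; the macros heliaCORE actually tests may differ, so treat this as a sketch of the idea and not as the library's own dispatch code.

```c
/* Hypothetical names for illustration; these are not heliaCORE identifiers. */
typedef enum {
    KERNEL_PATH_PURE_C,
    KERNEL_PATH_DSP,
    KERNEL_PATH_MVE
} kernel_path_t;

/*
 * ACLE feature macros are set by the compiler from the target options.
 * Bit 0 of __ARM_FEATURE_MVE indicates integer MVE (Helium) support.
 * __ARM_FEATURE_DSP is defined to 1 when the DSP extension is available.
 */
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
#define SELECTED_KERNEL_PATH KERNEL_PATH_MVE
#elif defined(__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP == 1)
#define SELECTED_KERNEL_PATH KERNEL_PATH_DSP
#else
#define SELECTED_KERNEL_PATH KERNEL_PATH_PURE_C
#endif

/* The choice is fixed when this file is compiled; nothing is probed at run time. */
static kernel_path_t selected_kernel_path(void)
{
    return SELECTED_KERNEL_PATH;
}

/* Human-readable label matching the "Path" column of the table. */
static const char *kernel_path_name(kernel_path_t path)
{
    switch (path) {
    case KERNEL_PATH_MVE:
        return "MVE / Helium";
    case KERNEL_PATH_DSP:
        return "DSP";
    default:
        return "Pure C";
    }
}
```

Building the same file for a host machine or a core without either extension falls through to the Pure C branch, which mirrors the "always available" note in the table.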
Operator families
The table below is intentionally grouped by family. It shows the overall shape of coverage without duplicating the full API reference.
| Family | Examples | Coverage notes |
|---|---|---|
| Convolution | Conv2D, DepthwiseConv2D, TransposeConv2D | int8 is the common path; selected operators also include int16 and int4-weight variants. |
| Dense / matrix | Fully Connected, Batch MatMul, SVDF | Includes helper kernels under |
| Recurrent | LSTM, SVDF | Used by speech and sequence workloads in Ambiq model flows. |
| Activation | ReLU, ReLU6, LeakyReLU, PReLU, Hard-Swish, Logistic, Tanh, Clamp | Includes Ambiq coverage for glue operators that appear frequently in field-like models. |
| Elementwise / math | Add, Sub, Mul, Min/Max, Abs, Squared Difference, Sqrt/Rsqrt, Mean | Often small individually, but can matter for end-to-end latency. |
| Comparison / reduction | Equal/NotEqual/Less/Greater, ArgMin/ArgMax, Reduce Min/Max | Maintains quantized operator coverage used by embedded inference graphs. |
| Pooling / softmax | MaxPool, AvgPool, Softmax | Common post-convolution and classifier tail operations. |
| Data movement | Pad, Transpose, Reshape, Concatenation, Split, StridedSlice, Gather/GatherND | Important in real graphs; these operators can dominate if left unoptimized. |
| Quantization | Quantize, Dequantize, requantization helpers | Supports TFLM-style affine quantization and HELIA deployment paths. |
| Layout transforms | Resize Nearest Neighbor, Space-to-Batch, Batch-to-Space, Space-to-Depth, Depth-to-Space | Provides coverage for model-conversion and graph-shaping patterns. |
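
The affine quantization named in the Quantization row relates a real value `r` to a stored integer `q` through `r ≈ scale * (q - zero_point)`. The sketch below shows that mapping for int8 in both directions. The function names are hypothetical and written for this page; they are not heliaCORE entry points, and the library's own Quantize and Dequantize kernels should be looked up in the generated API reference.

```c
#include <stdint.h>

/*
 * Illustrative helpers only; hypothetical names, not heliaCORE functions.
 * Affine mapping: real ~= scale * (q - zero_point).
 */
static int8_t quantize_int8(float real, float scale, int32_t zero_point)
{
    float scaled = real / scale;

    /* Round half away from zero without depending on libm. */
    float rounded = (scaled >= 0.0f) ? (scaled + 0.5f) : (scaled - 0.5f);

    /* Clamp in float first so the conversion to int32_t cannot overflow. */
    if (rounded > 1.0e6f) {
        rounded = 1.0e6f;
    }
    if (rounded < -1.0e6f) {
        rounded = -1.0e6f;
    }

    /* The cast truncates toward zero, which completes the rounding step. */
    int32_t q = (int32_t)rounded + zero_point;

    /* Saturate to the int8 range. */
    if (q > INT8_MAX) {
        q = INT8_MAX;
    }
    if (q < INT8_MIN) {
        q = INT8_MIN;
    }
    return (int8_t)q;
}

static float dequantize_int8(int8_t q, float scale, int32_t zero_point)
{
    return scale * (float)((int32_t)q - zero_point);
}
```

With `scale = 0.5` and `zero_point = -3`, the real value `1.0` quantizes to `-1`, and dequantizing `-1` returns `1.0`. Values outside the representable range saturate to `-128` or `127`.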
Data types
heliaCORE focuses on quantized inference for Ambiq devices:
- int8 activations / weights for the broadest operator surface.
- int16 activations where models need wider activation precision.
- int4 weights with int8 activations for selected high-value kernels such as Conv2D, DepthwiseConv2D, and Fully Connected.
Exact dtype support is per function and per variant. Check the generated API reference before depending on a specific dtype/backend combination.
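
To make the int4-weight case concrete, the sketch below unpacks 4-bit signed weights stored two per byte and accumulates a dot product against int8 activations in an int32 accumulator. The packing order (first weight in the low nibble) is an assumption borrowed from the common TFLite convention, and the function names are hypothetical; confirm the layout heliaCORE expects in the generated API reference before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative helpers only; hypothetical names, not heliaCORE functions.
 * ASSUMPTION: element 2k sits in the low nibble of byte k and element
 * 2k + 1 sits in the high nibble.
 */
static int8_t unpack_int4(const int8_t *packed, size_t index)
{
    uint8_t byte = (uint8_t)packed[index / 2];
    uint8_t nibble = (index % 2 == 0) ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);

    /* Sign-extend 4-bit two's complement: 0x8..0xF map to -8..-1. */
    return (int8_t)((nibble ^ 0x08) - 0x08);
}

/* Dot product of `count` packed int4 weights with int8 activations. */
static int32_t dot_int4_weights_int8_acts(const int8_t *packed_weights,
                                          const int8_t *activations,
                                          size_t count)
{
    int32_t acc = 0;
    for (size_t i = 0; i < count; ++i) {
        acc += (int32_t)unpack_int4(packed_weights, i) * (int32_t)activations[i];
    }
    return acc;
}
```

Under the stated packing assumption, the bytes `0xF1, 0x87` decode to the weights `1, -1, 7, -8`. A scalar loop like this is only a reference for the arithmetic; an optimized variant would typically unpack and multiply several elements per instruction.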
Reading the source tree
The source layout mirrors operator families:
| Directory | Purpose |
|---|---|
|  | Conv-family kernels and variants. |
|  | Dense layer kernels. |
|  | Activation and clamp-style operators. |
|  | Shared matmul, requantization, and helper kernels. |
|  | Data-movement and graph-shaping operators. |
The public C API is declared primarily in Include/arm_nnfunctions.h; helper
APIs are declared in Include/arm_nnsupportfunctions.h.
Scope
heliaCORE is intended for HELIA AI workflows on Ambiq silicon. Arm CMSIS-NN remains the vendor-neutral upstream ecosystem reference for general Cortex-M kernel development.