Ethos-U with HeliaAOT

HeliaAOT understands the Ethos-U command streams generated by Arm Vela. When you feed a Vela-converted LiteRT/TFLite model into HeliaAOT, each Ethos-U-supported subgraph is collapsed into a single ETHOS_U operator whose first input is the command stream. The generated C module includes a thin wrapper around ethosu_invoke_v3 that passes the command stream along with the base addresses for inputs, outputs, and Vela-added auxiliary buffers. If the real ethosu_driver headers are not present, HeliaAOT falls back to a stub so you can still build and inspect the output. To run on real hardware or the FVP, link the Ethos-U core driver.
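To make the wiring described above concrete, here is a minimal sketch of how a command stream and its base addresses could be collected into the flat arrays that an ethosu_invoke_v3-style call takes. The struct, the demo_ names, and the table size are hypothetical and are not taken from HeliaAOT's generated sources. The order in which buffers are appended is also only illustrative, since the order Vela expects depends on the converted model.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity for this sketch; real models may need more or fewer. */
#define DEMO_MAX_BASES 8

/* Hypothetical description of one ETHOS_U operator (not HeliaAOT's layout). */
typedef struct {
    const void *cmd_stream;             /* first input of the ETHOS_U op */
    size_t      cmd_stream_size;        /* size of the command stream in bytes */
    uint64_t    base_addr[DEMO_MAX_BASES]; /* inputs, outputs, auxiliary buffers */
    size_t      base_size[DEMO_MAX_BASES]; /* size in bytes of each buffer */
    int         num_bases;              /* number of valid table entries */
} demo_ethosu_op_t;

/* Append one buffer to the base address table.
 * Returns 0 on success, -1 if the table is already full. */
static int demo_add_base(demo_ethosu_op_t *op, const void *buf, size_t size)
{
    if (op->num_bases >= DEMO_MAX_BASES) {
        return -1;
    }
    op->base_addr[op->num_bases] = (uint64_t)(uintptr_t)buf;
    op->base_size[op->num_bases] = size;
    op->num_bases++;
    return 0;
}
```

With a table like this filled in, the wrapper only has to hand cmd_stream, base_addr, base_size, and num_bases to the driver call.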

Run a LiteRT model through Vela and HeliaAOT

Compile with Vela. Install Vela (python -m pip install ethos-u-vela) and run it on the original .tflite:

vela ./models/my_model.tflite \
  --accelerator-config ethos-u55-128 \
  --system-config Ethos_U55_High_End_Embedded \
  --memory-mode Shared_Sram \
  --output-dir build/vela_out \
  --output-format tflite

Adjust the accelerator and system configs to match your target. Vela writes a new .tflite (here build/vela_out/my_model_vela.tflite) in which each Ethos-U region is encoded as a single custom op plus a command stream tensor.

Convert with HeliaAOT. Point HeliaAOT at the Vela output to generate the C module:

helia-aot convert \
  --model.path build/vela_out/my_model_vela.tflite \
  --module.path ./out/ethos_u_module \
  --platform.name apollo510_evb \
  --test.enabled

No extra flags are needed for Ethos-U. The converter detects the ETHOS_U custom op, lifts the command stream into a constant tensor, and keeps any non-Ethos-U operators as standard CPU/Helium kernels.

Integrate the driver. The generated sources call ethosu_invoke_v3 if <ethosu_driver.h> is available. Link against the Ethos-U core driver (or the FVP-provided driver) to run on real hardware or the FVP. Without the driver, the stub implementation lets you compile and review the wiring, but nothing executes on the NPU.
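One way such a header-dependent fallback can be written is sketched below. This is not HeliaAOT's actual generated code. The demo_ names, the DEMO_HAVE_ETHOSU macro, and the stub's return value are assumptions made for illustration. The sketch relies on __has_include, which GCC and Clang support and which C23 standardizes, and it assumes the ethosu_invoke_v3 signature found in recent Ethos-U core driver releases, so check it against the header you link.

```c
#include <stddef.h>
#include <stdint.h>

/* Detect the real driver header when the compiler supports __has_include. */
#if defined(__has_include)
#  if __has_include(<ethosu_driver.h>)
#    include <ethosu_driver.h>
#    define DEMO_HAVE_ETHOSU 1
#  endif
#endif
#ifndef DEMO_HAVE_ETHOSU
#  define DEMO_HAVE_ETHOSU 0
#endif

/* Hypothetical status returned by the stub path: nothing ran on the NPU. */
#define DEMO_ETHOSU_NOT_RUN (-1)

/* Hypothetical wrapper: forwards to the driver when it is available,
 * otherwise compiles to a stub so the wiring can still be built and read. */
static int demo_ethosu_run(void *drv, const void *cmd, int cmd_size,
                           uint64_t *bases, const size_t *sizes, int num_bases)
{
#if DEMO_HAVE_ETHOSU
    /* Assumed signature; verify against your driver version. */
    return ethosu_invoke_v3((struct ethosu_driver *)drv, cmd, cmd_size,
                            bases, sizes, num_bases, NULL);
#else
    (void)drv; (void)cmd; (void)cmd_size;
    (void)bases; (void)sizes; (void)num_bases;
    return DEMO_ETHOSU_NOT_RUN;
#endif
}
```

Returning a distinct error code from the stub is a choice made in this sketch so that a build without the driver cannot be mistaken for a successful inference.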