Functional Testing Plan for Helia AOT Conversion
This document outlines a comprehensive strategy for validating the functional correctness of the six conversion stages implemented in `helia_aot/converter.py`:
- Load (backend → AIR)
- Transform (graph-level optimizations)
- Resolve (handler initialization)
- Plan (memory & handler planning)
- Emit (code generation)
- Export (module export)
For each stage we describe the observable behaviors, required fixtures, and the concrete assertions that functional tests should perform. The goal is to cover both happy-path scenarios and critical failure modes so that the conversion pipeline is reliable end to end.
1. Load Stage — `convert_backend_model`
Key behavior: Convert a TFLite/LiteRT flatbuffer to an `AirModel` instance that exposes stable operator topology, tensor metadata, and typed options for downstream stages.【F:helia_aot/converter.py†L43-L71】【F:helia_aot/converters/__init__.py†L17-L64】
Fixtures
- On-the-fly LiteRT builders: Instead of checking in serialized graphs, we synthesize minimal single-op and composite TFLite models during tests (see `tests/unit/utils/operator_models.py` and `tests/unit/utils/composite_models.py`). Each generator encodes deterministic tensor shapes, buffers, builtin options, quantization metadata, and sparsity so the loader exercises every registered parser.
- Temporary file helpers: Tests materialize models under `tmp_path` and delete them after execution; we do not require a dedicated CLI `ConvertArgs` harness for this stage.
Tests
- Successful load path
  - Invoke `convert_backend_model` on the generated `.tflite` files and verify the returned `AirModel` contains the expected operator ordering and tensor connectivity (sequential and branching graphs).
  - Assert unsupported suffixes raise `ValueError`, missing files raise `FileNotFoundError`, and corrupted flatbuffers surface exceptions so validation guards stay in place.
- Operator coverage & tensor metadata
  - Parameterized tests span every registered LiteRT parser, checking that each `AirOperator` reports the correct `AirOpType`, named tensors (e.g., `weights`, `bias`), constant tensor payloads, typed options, and that tensor quantization/sparsity metadata is preserved (including blockwise quantization and dimension metadata for models that expose it).
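To make the failure-mode assertions concrete, here is a minimal, self-contained sketch. The `load_model` stand-in and its error messages are hypothetical stand-ins for `convert_backend_model`'s validation guards; real tests would call the actual function and use `pytest.raises` instead of the helper shown here:

```python
from pathlib import Path


def load_model(path):
    # Hypothetical stand-in for convert_backend_model's input validation.
    p = Path(path)
    if p.suffix != ".tflite":
        raise ValueError(f"unsupported model suffix: {p.suffix!r}")
    if not p.is_file():
        raise FileNotFoundError(str(p))
    return p.read_bytes()


def expect(exc_type, fn, *args):
    # Poor man's pytest.raises: True when fn(*args) raises exc_type.
    try:
        fn(*args)
    except exc_type:
        return True
    return False


# Unsupported suffix must be rejected before any file I/O happens.
assert expect(ValueError, load_model, "model.onnx")
# A well-suffixed but missing file must raise FileNotFoundError.
assert expect(FileNotFoundError, load_model, "missing.tflite")
```

The same pattern extends to the corrupted-flatbuffer case by writing garbage bytes to a `.tflite` file under `tmp_path` and asserting that the loader surfaces an exception.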
2. Transform Stage — `TransformPipeline.apply`
Key behavior: Build a pipeline from specs (wildcard default handling + per-transform toggles) and mutate the model according to registered transforms.【F:helia_aot/converter.py†L73-L80】【F:helia_aot/transforms/transform_pipeline.py†L1-L61】
Fixtures
- Synthetic AIR inputs: The tests reuse the Stage 1 LiteRT builders to emit minimal models containing identity ops, depthwise convolutions, and transpose convolutions. Each model is loaded once via `convert_backend_model` before the transform pipeline is exercised.
- Transform specs: Tests assemble `TransformSpec` instances inline to cover wildcard defaults, per-transform overrides, and invalid names.
Tests
- Pipeline construction
  - `apply_wildcard_and_validate` is verified to fan out the wildcard entry across the registered transforms, preserving explicit overrides. `TransformPipeline.from_config` is then checked to ensure each transform instance reflects the requested `enabled` flag and options payload, and that unknown transform names raise `ValueError`.
- PruneIdentityOps
  - Using a single-`RESHAPE` graph, the transform is expected to mark the output tensor as an alias of the input while leaving non-identity operators untouched. A follow-up call confirms the transform is idempotent.
- DepthwiseToConv
  - Positive-path coverage asserts that a convertible depthwise convolution is rewritten as `CONV_2D`, produces `AirConv2DOptions`, and rewrites the weights tensor with the anticipated permutation. A separate test increases the efficiency threshold to prove the transform opts out.
- TransposeReverseConv
  - Exercises both the reversible conversion (weights flipped + transposed and `AirConv2DOptions` emitted) and the negative path, where a large threshold keeps the original operator intact.
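The wildcard fan-out behavior can be sketched with a toy model. The registry contents, the `TransformSpec` shape, and the `apply_wildcard` helper below are all illustrative assumptions, not the real `apply_wildcard_and_validate` implementation; they exist only to show the shape of the assertions:

```python
from dataclasses import dataclass

# Hypothetical transform registry (names assumed for illustration).
REGISTERED_TRANSFORMS = ["prune_identity_ops", "depthwise_to_conv", "transpose_reverse_conv"]


@dataclass
class TransformSpec:
    name: str
    enabled: bool = True


def apply_wildcard(specs):
    # Fan the "*" entry out across registered transforms, keeping explicit overrides.
    explicit = {s.name: s for s in specs if s.name != "*"}
    wildcard = next((s for s in specs if s.name == "*"), None)
    for s in specs:
        if s.name != "*" and s.name not in REGISTERED_TRANSFORMS:
            raise ValueError(f"unknown transform: {s.name}")
    result = []
    for name in REGISTERED_TRANSFORMS:
        if name in explicit:
            result.append(explicit[name])
        elif wildcard is not None:
            result.append(TransformSpec(name, wildcard.enabled))
    return result


# Wildcard default enables everything; the explicit override wins for its entry.
expanded = apply_wildcard([TransformSpec("*", enabled=True),
                           TransformSpec("depthwise_to_conv", enabled=False)])
assert [(s.name, s.enabled) for s in expanded] == [
    ("prune_identity_ops", True),
    ("depthwise_to_conv", False),
    ("transpose_reverse_conv", True),
]
```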
3. Resolve Stage — Handler resolution
Key behavior: Instantiate each handler and invoke `resolve()` so they populate the shared `CodeGenContext` (operators, interpreter, directory scaffolding).【F:helia_aot/converter.py†L93-L122】
Fixtures
- Context builder: Tests materialize LiteRT models on disk, load them via `convert_backend_model`, and spin up a `CodeGenContext` pointed at a temporary `work_path`. No CLI harness or Stage 1 test code is reused beyond the shared model builders.
- Monkeypatch hooks: Lightweight stubs replace side-effectful pieces (e.g., `create_interpreter`) so we can observe calls without invoking real interpreters or touching the filesystem outside of the temporary work directory.
Tests
- OperatorHandler resolution
  - Running `OperatorHandler.resolve()` produces one `AotOperator` per AIR operator and marks each as resolved (the `_has_resolved` flag is set).【F:helia_aot/aot/handlers/operator_handler.py†L12-L55】
- ModuleHandler cleanup
  - Verify it wipes any stale contents under `work_path` and recreates the `src/` and `includes-api/` directories used by later stages.【F:helia_aot/aot/handlers/module_handler.py†L52-L76】
- TestHandler interpreter wiring (optional)
  - With tests enabled, `resolve()` must call `create_interpreter` with the configured model path and stash the returned interpreter for emission.【F:helia_aot/aot/handlers/test_handler.py†L12-L45】
- Failure propagation
  - Monkeypatching a handler's `resolve()` to raise should surface the exception through `AotConverter.convert()`, demonstrating that StepContext aborts the pipeline instead of swallowing the error.
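The failure-propagation expectation can be illustrated with stub handlers. The classes and the `run_resolve` driver below are hypothetical stand-ins (the real pipeline is `AotConverter.convert()` with StepContext); the sketch only shows that an exception from one handler must abort the loop rather than be swallowed:

```python
class OkHandler:
    # Stand-in handler that records whether resolve() ran.
    def __init__(self):
        self.resolved = False

    def resolve(self):
        self.resolved = True


class FailingHandler:
    # Stand-in for a monkeypatched handler whose resolve() raises.
    def resolve(self):
        raise RuntimeError("resolve failed")


def run_resolve(handlers):
    # Mirrors the expectation: the first handler error aborts the stage.
    for h in handlers:
        h.resolve()


before, after = OkHandler(), OkHandler()
try:
    run_resolve([before, FailingHandler(), after])
    aborted = False
except RuntimeError:
    aborted = True

assert aborted
assert before.resolved       # handlers before the failure did run
assert not after.resolved    # handlers after the failure were never reached
```

In the real tests, `monkeypatch.setattr` would replace one handler's `resolve` and the assertion would use `pytest.raises` around `AotConverter.convert()`.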
4. Plan Stage — Memory planner & handler hooks
Key behavior: Run the configured memory planner and invoke `plan()` on every handler.【F:helia_aot/converter.py†L124-L139】
Fixtures
- A temporary `CodeGenContext` is constructed for each registered platform using on-the-fly LiteRT models so the greedy planner runs under a variety of memory maps.
- The context is seeded with constants, persistent tensors, and scratch tensors (including multi-layer graphs) to exercise allocation reuse and arena growth.
Tests
- Planner integration
  - Instantiate the greedy planner for every platform and confirm the resulting `MemoryPlan` includes allocations for constants/persistent tensors and that arenas reflect the platform's memory sizes. Scratch tensors must land in writable arenas with non-zero peak usage.【F:helia_aot/memory/greedy_planner.py†L1-L143】
- Constraint validation
  - Check that invalid constraints on type/size raise `ValueError` and that tight limits cause insufficient-memory failures. Custom constraints also verify arena-resizing behavior.
- Tensor constraints
  - Apply attribute overrides (e.g., force PSRAM) to specific tensor IDs and assert allocations respect those directives.
- Gap reuse & preferred order
  - Use multi-layer graphs to ensure scratch buffers reuse freed offsets and fall back through each platform's preferred memory order when earlier arenas are constrained.
- Handler plan hooks
  - Monkeypatch handler `plan()` methods to verify they are invoked after planning so downstream emit stages receive the prepared context.【F:helia_aot/aot/handlers/operator_handler.py†L57-L60】
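The gap-reuse assertion is the subtlest of these, so here is a toy first-fit planner that models what the tests check. This is not the real planner in `helia_aot/memory/greedy_planner.py`; the lifetime-tuple interface and single-arena simplification are assumptions made purely to illustrate the expected allocation behavior:

```python
def greedy_plan(requests):
    """Toy first-fit planner over one arena.

    requests: list of (tensor_id, size, start_step, end_step) lifetimes.
    Returns ({tensor_id: offset}, peak_usage). Illustrative only.
    """
    live = []      # (offset, size, end_step) of currently live allocations
    offsets = {}
    peak = 0
    for tid, size, start, end in sorted(requests, key=lambda r: r[2]):
        live = [a for a in live if a[2] > start]   # free expired buffers
        cursor = 0
        for off, sz, _ in sorted(live):            # first-fit gap scan
            if cursor + size <= off:
                break
            cursor = max(cursor, off + sz)
        offsets[tid] = cursor
        live.append((cursor, size, end))
        peak = max(peak, cursor + size)
    return offsets, peak


# Disjoint lifetimes: the second buffer must reuse the freed offset,
# so the arena's peak stays at a single buffer's size.
offsets, peak = greedy_plan([("a", 64, 0, 1), ("b", 64, 2, 3)])
assert offsets["a"] == offsets["b"] == 0
assert peak == 64

# Overlapping lifetimes force distinct offsets and arena growth.
offsets, peak = greedy_plan([("a", 64, 0, 3), ("b", 64, 1, 2)])
assert offsets["b"] == 64 and peak == 128
```

Real tests make the same assertions against the `MemoryPlan` produced from multi-layer graphs, per platform and per memory arena.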
5. Emit Stage — Artifact generation
Key behavior: Call `emit(save_path)` on all handlers to create headers, sources, docs, and optional tests within the staging directory.【F:helia_aot/converter.py†L141-L147】
Fixtures
- Use a temporary workspace and configure module type variations (`neuralspot`, `zephyr`, `cmake`) to cover all code paths.
- Provide deterministic `CodeGenContext` values (operators list, memory plan) so templates render reproducibly.
Tests
- ModuleHandler outputs
  - Assert the expected file set exists per module type and check key template substitutions (e.g., prefix, CMSIS version) in generated files.【F:helia_aot/aot/handlers/module_handler.py†L14-L115】
- Operator/Tensor/Model handlers
  - Confirm the expected headers/sources exist with the configured prefix (e.g., `{prefix}_model.c`, `{prefix}_tensors.h`).【F:helia_aot/aot/handlers/tensor_handler.py†L1-L66】【F:helia_aot/aot/handlers/model_handler.py†L1-L59】
- DocHandler/TestHandler
  - With documentation HTML disabled, ensure license/README files are generated and MkDocs artifacts are absent; when tests are enabled, assert test-case sources are emitted (mocking the interpreter as needed).【F:helia_aot/aot/handlers/doc_handler.py†L1-L88】【F:helia_aot/aot/handlers/test_handler.py†L47-L82】
- Idempotent emit
  - Re-run `emit` on the same context and ensure files are overwritten deterministically without residual artifacts.
- Complex graphs
  - Emit from a multi-layer model to guarantee operator-specific artifacts (e.g., generated conv kernels) materialize under `src/` and operator manifests update accordingly.
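The idempotent-emit check reduces to "same inputs, same file set, no leftovers". The `emit` stand-in below (and its file names) is a hypothetical sketch, not the real handler; it only demonstrates the double-run assertion:

```python
import tempfile
from pathlib import Path


def emit(save_path, prefix="net"):
    # Stand-in emitter: writes a deterministic file set (names hypothetical).
    save_path.mkdir(parents=True, exist_ok=True)
    files = {
        f"{prefix}_model.c": "// model definition",
        f"{prefix}_tensors.h": "// tensor declarations",
    }
    for name, body in files.items():
        (save_path / name).write_text(body)
    return sorted(files)


with tempfile.TemporaryDirectory() as d:
    out = Path(d) / "src"
    first = emit(out)
    second = emit(out)  # re-run emit on the same context
    # Same file set both times, and nothing extra left on disk.
    assert first == second
    assert sorted(p.name for p in out.iterdir()) == first
```

Real tests run the handlers' `emit(save_path)` twice against the same `CodeGenContext` and diff the resulting directory listing.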
6. Export Stage — Packaging the workspace
Key behavior: Move or archive the staged module to the configured output path, handling zip archives and directory exports with overwrite protection.【F:helia_aot/converter.py†L149-L188】
Fixtures
- Temporary output directories with and without pre-existing content.
- Toggle `config.force` to exercise both the overwrite-rejection and forced-clobbering paths.
Tests
- Zip export
  - Configure `module.path` with a `.zip` suffix, run conversion, and assert the resulting archive contains the staged structure (inspect via `zipfile`). Re-run without `force` to ensure the expected `FileExistsError` is raised when the archive already exists.
- Directory export & force
  - Point `module.path` to a directory, verify the workspace is copied into `<path>/<module.name>`, and confirm that re-running with `force=True` removes stale contents before re-exporting.
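Both export paths can be modeled with a small stand-in. The `export` function below is a hypothetical sketch of the behavior described above (zip vs. directory, overwrite guard, `force` clobbering), not the implementation in `helia_aot/converter.py`:

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def export(workspace: Path, target: Path, force: bool = False):
    # Stand-in export step: archive when target ends in .zip, else copy the
    # directory tree; refuse to overwrite unless force is set.
    if target.exists() and not force:
        raise FileExistsError(str(target))
    if target.suffix == ".zip":
        if target.exists():
            target.unlink()
        with zipfile.ZipFile(target, "w") as zf:
            for f in workspace.rglob("*"):
                zf.write(f, f.relative_to(workspace))
    else:
        if target.exists():
            shutil.rmtree(target)
        shutil.copytree(workspace, target)


with tempfile.TemporaryDirectory() as d:
    ws = Path(d) / "module"
    (ws / "src").mkdir(parents=True)
    (ws / "src" / "net_model.c").write_text("// model")

    archive = Path(d) / "out.zip"
    export(ws, archive)
    with zipfile.ZipFile(archive) as zf:
        assert "src/net_model.c" in zf.namelist()  # staged structure preserved

    try:
        export(ws, archive)  # second run without force
        raised = False
    except FileExistsError:
        raised = True
    assert raised  # overwrite protection fired
```

The directory-export test follows the same shape: export once, pre-populate the target with a stale file, then re-export with `force=True` and assert the stale file is gone.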
Running the tests locally
All functional tests live under `tests/unit`. From the project root you can run:

```shell
uv sync --group ci
uv run pytest tests/unit
```

Or target specific stages:

```shell
uv run pytest tests/unit/converters   # Stage 1 load
uv run pytest tests/unit/transforms   # Stage 2 transform
uv run pytest tests/unit/aot/test_resolve_stage.py
uv run pytest tests/unit/aot/test_plan_stage.py
uv run pytest tests/unit/aot/test_emit_stage.py
uv run pytest tests/unit/aot/test_export_stage.py
uv run pytest tests/unit/platforms/test_platform_resolution.py
```

All commands assume pytest is invoked through uv (i.e., `uv run pytest …`).