greedy_planner
GreedyMemoryPlanner API
The GreedyMemoryPlanner class implements a greedy algorithm for memory allocation in the context of AOT (Ahead-Of-Time) compilation for machine learning models.
It is designed to optimize memory usage by reusing gaps from tensors whose lifetimes have ended.
Classes:
- GreedyMemoryPlanner – Implements a greedy memory allocation strategy for tensors in a shared buffer.
Copyright 2025 Ambiq. All Rights Reserved.
Classes
GreedyMemoryPlanner
Greedy memory planner for AOT models.
Functions
plan
plan(model: AirModel, memory_constraints: list[MemoryConstraint] | None = None, tensor_constraints: list[AttributeRuleset] | None = None) -> MemoryPlan
Allocate offsets for each tensor in a shared buffer using a first-fit algorithm that reuses gaps from tensors whose lifetimes have ended.
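The first-fit placement described above can be illustrated with a minimal, self-contained sketch. It is not the planner's actual implementation: `TensorLife`, `first_fit_offsets`, the visiting order (by first use), and the absence of alignment handling are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorLife:
    """Hypothetical stand-in for a tensor: byte size plus an inclusive live range."""
    name: str
    size: int
    first_use: int
    last_use: int


def first_fit_offsets(tensors: list[TensorLife]) -> tuple[dict[str, int], int]:
    """Give each tensor the lowest offset that is free during its lifetime."""
    placed: list[tuple[int, TensorLife]] = []  # (offset, tensor) pairs already planned
    offsets: dict[str, int] = {}
    arena_size = 0
    # Assumed order: by first use, so tensors that have ended leave reusable gaps.
    for t in sorted(tensors, key=lambda x: (x.first_use, x.name)):
        # Only tensors whose lifetimes overlap with t can block an offset.
        busy = sorted(
            (off, off + p.size)
            for off, p in placed
            if p.first_use <= t.last_use and t.first_use <= p.last_use
        )
        candidate = 0
        for start, end in busy:
            if candidate + t.size <= start:
                break  # the gap before this block is large enough
            candidate = max(candidate, end)
        offsets[t.name] = candidate
        placed.append((candidate, t))
        arena_size = max(arena_size, candidate + t.size)
    return offsets, arena_size


demo = [
    TensorLife("a", 64, 0, 1),
    TensorLife("b", 32, 1, 2),
    TensorLife("c", 64, 2, 3),
    TensorLife("d", 16, 3, 4),
]
demo_offsets, demo_size = first_fit_offsets(demo)
print(demo_offsets, demo_size)  # {'a': 0, 'b': 64, 'c': 0, 'd': 64} 96
```

In this example `c` reuses the gap left by `a`, and `d` reuses the gap left by `b`, so the arena needs 96 bytes rather than the 176 bytes required if every tensor had its own region.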
Every tensor is bound to an arena slot — there are no per-tensor C symbols. Three arena families are populated:
- MemoryPlan.arena_usages – writable scratch arenas (one per writable memory bank), reused via liveness analysis.
- MemoryPlan.persistent_arenas – writable persistent arenas (one per writable memory bank that received a persistent tensor), bump-allocated and never reclaimed.
- MemoryPlan.constant_arenas – constant arenas, keyed by runtime (kernel-visible) memory. Two shapes share the map: cold (arena.source_memory == arena.memory, arena buffer is the read-only blob in cold storage) and staged (arena.source_memory != arena.memory, arena buffer is a writable runtime copy hydrated from a separate source blob). The shape is per-arena and follows from the constant_destination_memory attribute on the constituent tensors.
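Two ideas from the list above lend themselves to a short sketch: bump allocation for persistent arenas, and the cold versus staged distinction for constant arenas. The classes below are hypothetical stand-ins, not the library's types. Only the `memory` and `source_memory` field names come from the text; the `alignment` parameter and the memory names `"MRAM"` and `"SRAM"` are assumptions for illustration.

```python
from dataclasses import dataclass


class BumpArena:
    """Persistent-style arena: offsets only grow and space is never reclaimed."""

    def __init__(self) -> None:
        self.size = 0

    def allocate(self, nbytes: int, alignment: int = 16) -> int:
        # Round the current end of the arena up to the requested alignment
        # (alignment handling is an assumption, not taken from the source).
        offset = -(-self.size // alignment) * alignment
        self.size = offset + nbytes
        return offset


@dataclass
class ConstantArena:
    """Hypothetical stand-in carrying only the two fields the text mentions."""
    memory: str         # runtime (kernel-visible) memory
    source_memory: str  # memory that holds the source blob


def arena_shape(arena: ConstantArena) -> str:
    """Classify an arena as cold or staged using the rule stated above."""
    return "cold" if arena.source_memory == arena.memory else "staged"


persistent = BumpArena()
first = persistent.allocate(10)   # 0
second = persistent.allocate(20)  # 16, the next 16-byte boundary after 10

cold = ConstantArena(memory="MRAM", source_memory="MRAM")
staged = ConstantArena(memory="SRAM", source_memory="MRAM")
print(first, second, arena_shape(cold), arena_shape(staged))  # 0 16 cold staged
```

Because a bump allocator never frees, it suits tensors that must stay valid for the lifetime of the model, whereas scratch arenas rely on liveness analysis to reuse space.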
Parameters:
- model (AirModel) – The target model.
- memory_constraints (list[MemoryConstraint] | None, default: None) – Memory constraints.
- tensor_constraints (list[AttributeRuleset] | None, default: None) – Tensor constraints.
Returns:
- MemoryPlan (MemoryPlan) – The memory plan containing tensor allocations and arena usages.