defines

Classes

ArenaRole

Logical role of a memory arena.

Every tensor in a MemoryPlan lives in an arena tagged with one of these roles. Roles partition the planner output so that scratch reuse, persistent zero-init, and constant residency can be reasoned about independently.

Attributes:

  • scratch

    Writable activation/scratch arena. The arena lives for the whole program and is shared by all ops; slots within it are reused based on liveness analysis.

  • persistent

    Whole-program-live writable storage (resource variables and similar).

  • constant

    Read-only weight storage. Can be cold (the arena buffer is the cold-storage blob and kernels read it in place) or staged (the arena buffer is a writable runtime copy hydrated from a separate source blob).

TensorBinding

Planner's emit intent for a single tensor's storage.

Every tensor in a MemoryPlan is bound to an arena slot at (role, memory, offset). There are no longer any per-tensor C symbols: scratch, persistent, and constant tensors are all descriptors against arena buffers exposed via arena_buffers[region].

Constant arenas come in two shapes, distinguished purely by whether source_memory matches memory:

  • Cold (source_memory == memory): the arena buffer itself is a static const blob in cold storage. Kernels read it directly; no hydration is required.
  • Staged (source_memory != memory): the arena buffer is a writable runtime copy in memory; a separate read-only source blob lives in source_memory and the caller (or the weak <prefix>_hydrate_constants helper) copies it in before the first model_run.

Scratch and persistent bindings always have source_memory == memory (they are writable arenas with no cold-storage source).

Attributes:

  • role (ArenaRole) –

    Logical role of the storage region.

  • memory (MemoryType) –

    Runtime (kernel-visible) memory of the arena slot.

  • offset (int) –

    Byte offset within the arena.

  • source_memory (MemoryType) –

    Cold-storage memory the slot is sourced from. Equals memory for scratch, persistent, and cold constants; differs from memory for staged constants.
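
The cold-versus-staged rule above can be sketched in a few lines of Python. This is an illustration only, not the library's implementation: plain strings stand in for MemoryType, and the `describe` helper is a hypothetical name introduced here.

```python
from dataclasses import dataclass
from enum import Enum


class ArenaRole(Enum):
    SCRATCH = "scratch"
    PERSISTENT = "persistent"
    CONSTANT = "constant"


@dataclass(frozen=True)
class TensorBinding:
    role: ArenaRole
    memory: str         # assumption: a string stands in for MemoryType
    offset: int
    source_memory: str  # assumption: a string stands in for MemoryType


def describe(binding: TensorBinding) -> str:
    """Hypothetical helper: classify a binding as 'writable', 'cold', or 'staged'."""
    if binding.role is not ArenaRole.CONSTANT:
        # Scratch and persistent arenas have no cold-storage source.
        if binding.source_memory != binding.memory:
            raise ValueError("writable bindings must have source_memory == memory")
        return "writable"
    # Constant arenas: the shape is decided purely by the memory comparison.
    return "cold" if binding.source_memory == binding.memory else "staged"


cold = TensorBinding(ArenaRole.CONSTANT, "MRAM", 0, "MRAM")
staged = TensorBinding(ArenaRole.CONSTANT, "DTCM", 64, "MRAM")
scratch = TensorBinding(ArenaRole.SCRATCH, "SRAM", 128, "SRAM")
```

In this sketch `staged` is the only binding that would need a hydration copy before the first model_run.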

TensorLifetime

Lifetime of a tensor, expressed as a range of operator indices.

Attributes:

  • tensor_id (TensorId) –

    Unique identifier for the tensor

  • start_op (int) –

    Index of the first operator that defines or uses the tensor

  • end_op (int) –

    Index of the last operator that uses it

Functions

add_op
add_op(op_idx: int)

Add an operator index to the lifetime.

Parameters:

  • op_idx
    (int) –

    Operator index to add

merge
merge(other: TensorLifetime)

Merge another lifetime into this one.

This updates the start and end operators to encompass both lifetimes. Useful when tensors have aliases.

Parameters:

  • other
    (TensorLifetime) –

    The lifetime to merge into this one
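
A minimal sketch of how add_op and merge could widen a lifetime. It assumes both operations simply extend the [start_op, end_op] range; the real class may track more state, and a plain string stands in for TensorId.

```python
from dataclasses import dataclass


@dataclass
class TensorLifetime:
    tensor_id: str  # assumption: a string stands in for TensorId
    start_op: int
    end_op: int

    def add_op(self, op_idx: int) -> None:
        # Assumed behavior: widen the range so it covers op_idx.
        self.start_op = min(self.start_op, op_idx)
        self.end_op = max(self.end_op, op_idx)

    def merge(self, other: "TensorLifetime") -> None:
        # Encompass both lifetimes, e.g. when two tensor IDs alias one buffer.
        self.start_op = min(self.start_op, other.start_op)
        self.end_op = max(self.end_op, other.end_op)


lifetime = TensorLifetime("t0", start_op=3, end_op=5)
lifetime.add_op(7)                           # range is now 3..7
lifetime.merge(TensorLifetime("t0_alias", 1, 4))  # range is now 1..7
```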

TensorAllocation

Metadata for a tensor allocation.

Attributes:

  • tensor_id (TensorId) –

    Unique identifier for the tensor

  • memory (MemoryType) –

    Memory type for the allocation (runtime / kernel-visible memory of the arena slot)

  • offset (int) –

    Offset within the arena

  • size (int) –

    Size of the allocation in bytes

  • binding (TensorBinding | None) –

    Planner's binding describing (role, memory, offset, source_memory) for the slot. Defaults to None for callers that construct TensorAllocation directly; the bundled GreedyMemoryPlanner always populates it. Emit templates dispatch on binding.role and on the binding.source_memory == binding.memory predicate (cold vs staged constant).

ArenaUsage

Metadata for a memory arena.

Attributes:

  • memory (MemoryType) –

    Runtime (kernel-visible) memory of the arena.

  • total_size (int) –

    Total bytes in this arena.

  • used (int) –

    Bytes actually used. For bump-allocated arenas (constant, persistent) this equals total_size.

  • role (ArenaRole) –

    Logical role of the arena.

  • source_memory (MemoryType) –

    The cold-storage memory the arena's contents are sourced from. For scratch and persistent arenas this equals memory (purely runtime, no source blob). For constant arenas it equals memory in the cold case (arena buffer is the cold blob in place) and differs from memory in the staged case (arena buffer is a writable runtime copy of a source blob in source_memory).

Attributes

is_staged property
is_staged: bool

True iff this arena has a distinct cold source memory.

Equivalent to source_memory is not None and source_memory != memory. Centralizes the cold-vs-staged predicate so templates and handlers do not duplicate it.

For scratch and persistent arenas this is always False (they have no cold source). For constant arenas it distinguishes the two emit shapes:

  • False (cold): the arena buffer itself is a static const blob in cold storage.
  • True (staged): the arena buffer is a writable runtime copy hydrated from a separate source blob in source_memory.

Functions

allocate
allocate(size: int) -> int

Allocate a segment of the arena and return its offset
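
The is_staged predicate and a bump-style allocate can be sketched as follows. This assumes allocate appends at the current end of the arena with no alignment padding, which may differ from the real behavior; strings stand in for MemoryType and the role field is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArenaUsage:
    memory: str  # assumption: a string stands in for MemoryType
    total_size: int = 0
    used: int = 0
    source_memory: Optional[str] = None

    @property
    def is_staged(self) -> bool:
        # Same predicate as documented above.
        return self.source_memory is not None and self.source_memory != self.memory

    def allocate(self, size: int) -> int:
        # Assumed bump allocation: hand out the next free offset and grow.
        offset = self.used
        self.used += size
        self.total_size = max(self.total_size, self.used)
        return offset


staged_arena = ArenaUsage("DTCM", source_memory="MRAM")
cold_arena = ArenaUsage("MRAM", source_memory="MRAM")
first = staged_arena.allocate(32)   # offset 0
second = staged_arena.allocate(16)  # offset 32
```

With pure bump allocation, used and total_size stay equal, matching the note on the used attribute.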

MemoryConstraint

User-defined memory constraint

Attributes:

  • name (MemoryType) –

    Memory type (e.g., DTCM, ITCM)

  • max_size (int | None) –

    Maximum size in bytes, or None for no limit

  • arena_alignment (int | None) –

    Optional per-arena alignment floor in bytes. When set, the planner uses max(arena_alignment, platform.min_alignment) for both the arena base symbol and the per-slot offset of every tensor packed into it. When None (the default), only the implementation floor (16 bytes) is applied to the arena base symbol; per-slot alignment continues to use max(platform.min_alignment, dtype_alignment_floor, tensor.alignment_hint), so dtype-natural and kernel-driven alignment are still honored. Raise arena_alignment above the default for memories that back DMA-driven hydration paths needing stronger alignment than the platform's MVE/Helium floor (e.g. cacheline-sized PSRAM transfers).
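
The alignment rule for arena_alignment can be written out as two small functions. This sketch follows the description literally; the function names and the `align_up` helper are hypothetical, and the planner's actual code may combine the terms differently.

```python
from typing import Optional

IMPLEMENTATION_FLOOR = 16  # bytes, as stated in the description above


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def arena_base_alignment(arena_alignment: Optional[int],
                         platform_min_alignment: int) -> int:
    if arena_alignment is not None:
        return max(arena_alignment, platform_min_alignment)
    # Default: only the implementation floor applies to the base symbol.
    return IMPLEMENTATION_FLOOR


def slot_alignment(arena_alignment: Optional[int],
                   platform_min_alignment: int,
                   dtype_alignment_floor: int,
                   alignment_hint: int) -> int:
    if arena_alignment is not None:
        return max(arena_alignment, platform_min_alignment)
    # Default: dtype-natural and kernel-driven alignment are still honored.
    return max(platform_min_alignment, dtype_alignment_floor, alignment_hint)


# Example: a PSRAM arena with a 32-byte floor on a platform whose minimum is 16.
base = arena_base_alignment(32, 16)
slot = slot_alignment(32, 16, 4, 8)
next_offset = align_up(70, slot)
```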

MemoryPlan

Top-level memory plan for tensors.

Every tensor allocation is a slot in some arena. Three arena maps partition storage by role:

  • arena_usages: writable scratch arenas (one per writable memory bank).
  • persistent_arenas: writable persistent arenas (one per writable memory bank that received a persistent tensor). Separate from scratch so the scratch arena stays purely transient and may be aliased across models.
  • constant_arenas: constant arenas. Two shapes share the same map:

    • Cold: arena.source_memory == arena.memory. The arena buffer itself is the read-only blob in cold storage; kernels read it directly. No hydration required.
    • Staged: arena.source_memory != arena.memory. The arena buffer is a writable runtime copy; a separate source blob in arena.source_memory is hydrated into it before the first model_run.

Attributes:

  • tensor_allocs (dict[str, TensorAllocation]) –

    Tensor allocations. Each carries a TensorBinding recording (role, memory, offset, source_memory).

  • arena_usages (dict[MemoryType, ArenaUsage]) –

    Scratch arenas, keyed by MemoryType.

  • constant_arenas (dict[MemoryType, ArenaUsage]) –

    Constant arenas, keyed by runtime (destination) MemoryType. A per-arena single-source-memory invariant holds (every constant in the same arena has the same source_memory), so both cold and staged arenas are guaranteed to be backed by a single contiguous source blob.

  • persistent_arenas (dict[MemoryType, ArenaUsage]) –

    Persistent arenas, keyed by writable MemoryType.

  • tensor_lifetimes (dict[TensorId, TensorLifetime]) –

    Tensor lifetimes.

Functions

get_allocation
get_allocation(tensor_id: str) -> TensorAllocation

Get the allocation metadata for a given tensor ID.

Parameters:

  • tensor_id
    (str) –

    The tensor ID to look up.

Raises:

  • KeyError

    If the tensor ID is not found in the allocations.

Returns:

  • TensorAllocation –

    The allocation metadata for the tensor.
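
A short usage sketch for get_allocation, showing the lookup and the KeyError path. The classes below are stripped-down stand-ins defined only for this example (most fields are omitted), and the tensor name is made up.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TensorAllocation:
    tensor_id: str
    memory: str  # assumption: a string stands in for MemoryType
    offset: int
    size: int


@dataclass
class MemoryPlan:
    tensor_allocs: Dict[str, TensorAllocation] = field(default_factory=dict)

    def get_allocation(self, tensor_id: str) -> TensorAllocation:
        # A plain dict lookup raises KeyError for unknown tensor IDs.
        return self.tensor_allocs[tensor_id]


plan = MemoryPlan({"conv1/out": TensorAllocation("conv1/out", "SRAM", 0, 1024)})
alloc = plan.get_allocation("conv1/out")

try:
    plan.get_allocation("missing")
    found_missing = True
except KeyError:
    found_missing = False
```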

MemoryPlannerType

Memory planner type

Attributes:

  • greedy (str) –

    Greedy memory planner

TensorAttributes

This class provides the baseline set of tensor attribute rules.

Attributes:

  • memory (MemoryType) –

    Memory placement for tensors. For constants this is the source (cold-storage) memory where the tensor's bytes live in the image. When constant_destination_memory is set and differs from memory, the runtime arena lives in that destination memory and a hydration copy is required; otherwise the arena lives in memory itself and is read-only (cold).

  • constant_destination_memory (MemoryType | None) –

    Override the runtime (kernel-visible) memory for a constant. Required only when the runtime memory must differ from the source memory (e.g. weights stored in MRAM but read from DTCM/SRAM at runtime). When None (default), the runtime memory equals memory and no hydration step is needed. Must reference a writable memory of the target platform when set.
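
The interaction between memory and constant_destination_memory for a constant can be sketched as a small resolver. The function and tuple names are hypothetical, and strings stand in for MemoryType; the logic mirrors the two attribute descriptions above.

```python
from typing import NamedTuple, Optional


class ConstantPlacement(NamedTuple):
    runtime_memory: str   # where kernels read the constant
    source_memory: str    # where the bytes live in the image
    needs_hydration: bool


def resolve_constant_placement(
    memory: str,
    constant_destination_memory: Optional[str] = None,
) -> ConstantPlacement:
    # With no override, the runtime memory is the source memory (cold).
    runtime = memory if constant_destination_memory is None else constant_destination_memory
    # A hydration copy is needed only when the two memories differ (staged).
    return ConstantPlacement(runtime, memory, runtime != memory)


cold = resolve_constant_placement("MRAM")
staged = resolve_constant_placement("MRAM", constant_destination_memory="DTCM")
```

Validation that the destination is a writable memory of the target platform is left out of this sketch.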

MemoryFreeBlock

Metadata for a free memory block.

Attributes:

  • offset (int) –

    Offset of the free block

  • size (int) –

    Size of the free block

MemoryActiveBlock

Metadata for an active memory block.

Attributes:

  • end_time (int) –

    End time of the active block

  • offset (int) –

    Offset of the active block

  • size (int) –

    Size of the active block
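
One plausible way free and active blocks could cooperate in a greedy first-fit planner is sketched below. This is an assumption about how such records are typically used, not a description of GreedyMemoryPlanner itself; `plan_offsets` is a hypothetical function, and the sketch ignores alignment and does not coalesce adjacent free blocks.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MemoryFreeBlock:
    offset: int
    size: int


@dataclass
class MemoryActiveBlock:
    end_time: int
    offset: int
    size: int


def plan_offsets(tensors: List[Tuple[str, int, int, int]]) -> Tuple[Dict[str, int], int]:
    """tensors holds (tensor_id, start_op, end_op, size); returns (offsets, arena size)."""
    free: List[MemoryFreeBlock] = []
    active: List[MemoryActiveBlock] = []
    offsets: Dict[str, int] = {}
    arena_size = 0
    for tensor_id, start_op, end_op, size in sorted(tensors, key=lambda t: t[1]):
        # Retire active blocks whose last use precedes this tensor's first use.
        still_active = []
        for block in active:
            if block.end_time < start_op:
                free.append(MemoryFreeBlock(block.offset, block.size))
            else:
                still_active.append(block)
        active = still_active
        # First fit: reuse the first free block that is large enough.
        for i, block in enumerate(free):
            if block.size >= size:
                offset = block.offset
                if block.size == size:
                    free.pop(i)
                else:
                    free[i] = MemoryFreeBlock(block.offset + size, block.size - size)
                break
        else:
            # Nothing fits: grow the arena.
            offset = arena_size
            arena_size += size
        offsets[tensor_id] = offset
        active.append(MemoryActiveBlock(end_op, offset, size))
    return offsets, arena_size


offsets, arena_size = plan_offsets([("a", 0, 1, 64), ("b", 1, 2, 32), ("c", 2, 3, 48)])
```

In this example tensor "c" starts after "a" has died, so it reuses the slot at offset 0 and the arena does not grow beyond what "a" and "b" needed together.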