rsqrt
Classes
RsqrtOperator
Functions
resolve_rsqrt_lut_mode
Resolve the RSQRT LUT mode from operator attributes.
warn_on_rsqrt_lut_mode
Log a guidance warning when multiple INT16 RSQRT ops all use per_op mode.
Only considers INT16 RSQRT operators; INT8 RSQRT ignores lut_mode entirely.
Parameters:
make_rsqrt_lut_s8
make_rsqrt_lut_s8(input_scale: float, input_zero_point: int, output_scale: float, output_zero_point: int) -> np.ndarray
Build a 256-entry int8 RSQRT LUT using fixed-point emulation.
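For orientation, the sketch below builds a comparable 256-entry table in plain floating point. It is not the fixed-point emulation that make_rsqrt_lut_s8 performs, so individual entries may differ by a rounding step. The function name `rsqrt_lut_s8_reference` and the choice to saturate non-positive inputs to 127 are assumptions made for illustration.

```python
import numpy as np


def rsqrt_lut_s8_reference(input_scale, input_zero_point,
                           output_scale, output_zero_point):
    """Float reference for an int8 RSQRT table (hypothetical helper)."""
    # One entry per possible int8 input value, stored at index q + 128.
    q_in = np.arange(-128, 128, dtype=np.int32)
    x_real = input_scale * (q_in - input_zero_point).astype(np.float64)

    positive = x_real > 0
    y_real = np.zeros_like(x_real)
    y_real[positive] = 1.0 / np.sqrt(x_real[positive])

    # Requantize into the output domain.
    y_q = np.round(y_real / output_scale) + output_zero_point
    # Assumption: rsqrt is undefined for x <= 0, so saturate to int8 max.
    y_q[~positive] = 127
    return np.clip(y_q, -128, 127).astype(np.int8)


lut = rsqrt_lut_s8_reference(1 / 16, -128, 1 / 32, -128)
```

With these power-of-two scales, the entry for x_real = 1.0 (index 16) holds the quantized value of 1.0, and the entry for x_real = 4.0 (index 64) holds the quantized value of 0.5.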
make_universal_rsqrt_lut_s16
Build a universal RSQRT base LUT shared by all INT16 operators.
The table stores 1 / sqrt(q) in Q30 for positive quantized values q
sampled every 64 steps. Per-operator output scaling is applied later during
code generation/runtime.
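A minimal sketch of such a base table follows. The entry count (512), the starting sample (q = 64), and the helper name `universal_rsqrt_lut_sketch` are assumptions; the documentation above only fixes the Q30 format and the 64-step sampling.

```python
import numpy as np

Q30 = 1 << 30   # fixed-point one in Q30
STEP = 64       # sampling interval in quantized units


def universal_rsqrt_lut_sketch(num_entries=512):
    """Sample 2^30 / sqrt(q) at q = 64, 128, ... (hypothetical layout)."""
    # q = 0 is skipped because 1 / sqrt(0) is undefined.
    q = STEP * np.arange(1, num_entries + 1, dtype=np.float64)
    return np.round(Q30 / np.sqrt(q)).astype(np.int32)


base_lut = universal_rsqrt_lut_sketch()
```

Because the table depends only on q, it carries no quantization scales; those enter later through the per-operator multiplier.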
extract_per_op_rsqrt_lut_s16
Extract a 513-entry INT16 RSQRT LUT by probing a throwaway TFLite model.
Builds a minimal single-op TFLite model with the given quantization
parameters and feeds grid-aligned inputs (frac=0) so that TFLite returns
the exact LUT entries. This guarantees bit-exact agreement with TFLite's
LUTPopulate<int16_t> (float32 arithmetic) without reimplementing its
rounding in Python.
The positive domain (LUT indices 257-511) is probed directly. The negative domain (indices 0-256, never accessed by valid RSQRT inputs) and the last endpoint (index 512) are computed in float32 to match TFLite's Prepare().
Parameters:
- input_scale (float) – Quantization scale for the INT16 input tensor.
- output_scale (float) – Quantization scale for the INT16 output tensor.
Returns:
- ndarray – A 513-entry INT16 LUT matching TFLite's internal table.
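The probing idea can be illustrated without TFLite. The sketch below assumes the interpolation scheme of TFLite's reference INT16 LUT lookup (index = 256 + (x >> 7), with a 7-bit fraction); the function name `lut_lookup_s16` and the random stand-in table are illustrative only.

```python
import numpy as np


def lut_lookup_s16(x, lut):
    """Interpolated lookup in a 513-entry int16 table (assumed scheme)."""
    index = 256 + (x >> 7)          # upper 9 bits select the segment
    frac = x & 0x7F                 # lower 7 bits are the fraction
    base = int(lut[index])
    slope = int(lut[index + 1]) - base
    delta = (slope * frac + 64) >> 7  # rounded linear interpolation
    return base + delta


# Stand-in table; a real one would come from the operator under test.
rng = np.random.default_rng(0)
lut = rng.integers(-32768, 32768, size=513).astype(np.int16)

# Grid-aligned inputs have frac == 0, so the lookup returns lut[i] exactly.
indices = range(257, 512)
probed = [lut_lookup_s16((i - 256) * 128, lut) for i in indices]
```

Index 512 would require the input 32768, which does not fit in int16, so that endpoint cannot be recovered by probing.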
compute_rsqrt_scale_s16
Compute the fixed-point scale for the universal RSQRT base LUT.
The shared LUT stores base samples in Q30 for 1 / sqrt(q) where
x_real = input_scale * q.
RSQRT in real space is:
y_real = 1 / sqrt(x_real)
       = 1 / sqrt(input_scale * q)
       = (1 / sqrt(input_scale)) * (1 / sqrt(q))
Output quantization is:
y_real = output_scale * y_q
y_q = y_real / output_scale
Combining both gives:
y_q = (1 / (output_scale * sqrt(input_scale))) * (1 / sqrt(q))
The LUT stores 2^30 * (1 / sqrt(q)), so converting that LUT value into
the quantized output domain requires one extra division by 2^30:
real_multiplier = 1 / (output_scale * sqrt(input_scale) * 2^30)
This real multiplier is then converted into the CMSIS-NN fixed-point pair
(multiplier, shift) via AirFixedPointScale.from_real_multiplier,
and the generated C passes that pair to arm_rsqrt_s16_universal(...).
Parameters:
- input_scale (float) – Quantization scale for the INT16 input tensor.
- output_scale (float) – Quantization scale for the INT16 output tensor.
Returns:
- AirFixedPointScale – Fixed-point multiplier/shift pair for the universal INT16 RSQRT path, passed to arm_rsqrt_s16_universal(...).
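The derivation above can be checked numerically. The sketch below computes the real multiplier and splits it into a Q31 multiplier and a power-of-two shift in the style commonly used by TFLite. Both helper names are hypothetical, and AirFixedPointScale.from_real_multiplier may use a different convention.

```python
import math


def rsqrt_real_multiplier(input_scale, output_scale):
    """real_multiplier = 1 / (output_scale * sqrt(input_scale) * 2^30)."""
    return 1.0 / (output_scale * math.sqrt(input_scale) * 2.0 ** 30)


def quantize_multiplier(real_multiplier):
    """Split a positive real into (q31_multiplier, shift) (assumed format)."""
    # frexp gives real = mantissa * 2**shift with mantissa in [0.5, 1).
    mantissa, shift = math.frexp(real_multiplier)
    q31 = round(mantissa * (1 << 31))
    if q31 == (1 << 31):  # rounding pushed the mantissa up to 1.0
        q31 //= 2
        shift += 1
    return q31, shift


real = rsqrt_real_multiplier(input_scale=1 / 1024, output_scale=1 / 32768)
multiplier, shift = quantize_multiplier(real)
```

For these power-of-two scales the real multiplier is exactly 2^-10, and the pair satisfies multiplier * 2^(shift - 31) = real_multiplier.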