Function arm_nn_mat_mult_kernel_s4_s16

Defined in File arm_nnsupportfunctions.h

Function Documentation

int8_t *arm_nn_mat_mult_kernel_s4_s16(const int8_t *input_a, const int16_t *input_b, const uint16_t output_ch, const int32_t *out_shift, const int32_t *out_mult, const int32_t out_offset, const int32_t activation_min, const int32_t activation_max, const int32_t num_col_a, const int32_t *const output_bias, int8_t *out_0)

Matrix-multiplication function for convolution with per-channel requantization and 4-bit weights.

Parameters:

input_a – [in] pointer to operand A (the weight matrix), stored as int8 with two int4 values packed per byte.
input_b – [in] pointer to operand B, which always consists of 2 vectors.
output_ch – [in] number of rows of A.
out_shift – [in] pointer to per output channel requantization shift parameter.
out_mult – [in] pointer to per output channel requantization multiplier parameter.
out_offset – [in] output tensor offset.
activation_min – [in] minimum value to clamp the output to. Range: int8
activation_max – [in] maximum value to clamp the output to. Range: int8
num_col_a – [in] number of columns of A.
output_bias – [in] per output channel bias. Range : int32
out_0 – [inout] pointer to output
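
The packed format of input_a can be illustrated with a minimal sketch. The helper names pack_s4_pair and unpack_s4_pair are hypothetical and not part of the library, and the nibble order (first weight in the low nibble, second in the high nibble) is an assumption that should be checked against the implementation in use.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend a 4-bit two's-complement value (0..15 -> -8..7). */
static int8_t sign_extend_s4(uint8_t nibble)
{
    return (int8_t)((int)(nibble ^ 0x8) - 0x8);
}

/* Hypothetical helper: pack two int4 values (each in -8..7) into one int8.
 * Assumption: the first value goes into the low nibble, the second into the high nibble. */
static int8_t pack_s4_pair(int8_t first, int8_t second)
{
    uint8_t lo = (uint8_t)first & 0x0F;
    uint8_t hi = (uint8_t)second & 0x0F;
    return (int8_t)(uint8_t)(lo | (hi << 4));
}

/* Hypothetical helper: recover both int4 values from one packed byte (same nibble-order assumption). */
static void unpack_s4_pair(int8_t packed, int8_t *first, int8_t *second)
{
    uint8_t byte = (uint8_t)packed;
    *first = sign_extend_s4(byte & 0x0F);
    *second = sign_extend_s4(byte >> 4);
}
```

Under this assumption, a weight row with num_col_a int4 entries occupies num_col_a / 2 bytes when num_col_a is even.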

Returns:

The function returns one of the following:

The incremented output pointer if the operation succeeds, or
NULL if the implementation is not available.

This function multiplies the weight matrix for all output channels by 2 columns from im2col and produces two elements per output channel. The outputs are clamped to the range given by activation_min and activation_max. Supported framework: TensorFlow Lite Micro.
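
The computation described above can be sketched as a plain C reference loop. This is not the library implementation: the function name mat_mult_s4_s16_sketch and its helpers are hypothetical, and the sketch assumes low-nibble-first packing, two contiguous input vectors of length num_col_a, an output layout in which the second column's results follow the first column's, and a simplified rounding scheme for requantization (multiply by out_mult / 2^31, then apply the shift).

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend a 4-bit two's-complement value. */
static int32_t s4_to_s32(uint8_t nibble)
{
    return (int32_t)(nibble ^ 0x8) - 0x8;
}

/* Hypothetical, simplified requantization: acc * mult / 2^31, then shift
 * (positive = left, negative = right), rounding half up. Assumed rounding,
 * not necessarily bit-exact with the library. */
static int32_t requantize_sketch(int32_t acc, int32_t mult, int32_t shift)
{
    int64_t v = (int64_t)acc * mult;
    int32_t right = 31;
    if (shift > 0) v *= ((int64_t)1 << shift); else right -= shift;
    int64_t d = (int64_t)1 << right;
    int64_t n = v + d / 2;
    int64_t q = n / d;
    if (n % d < 0) q -= 1; /* turn truncation into floor for negative values */
    return (int32_t)q;
}

static int8_t clamp_to_s8(int32_t v, int32_t lo, int32_t hi)
{
    if (v < lo) v = lo;
    if (v > hi) v = hi;
    return (int8_t)v;
}

/* Reference-style sketch of the kernel's arithmetic (hypothetical name). */
static int8_t *mat_mult_s4_s16_sketch(const int8_t *input_a, const int16_t *input_b,
                                      uint16_t output_ch, const int32_t *out_shift,
                                      const int32_t *out_mult, int32_t out_offset,
                                      int32_t activation_min, int32_t activation_max,
                                      int32_t num_col_a, const int32_t *output_bias,
                                      int8_t *out_0)
{
    int8_t *out_1 = out_0 + output_ch;          /* assumed: second column's outputs follow the first */
    const int16_t *col_0 = input_b;
    const int16_t *col_1 = input_b + num_col_a; /* assumed: two contiguous vectors */

    for (int32_t ch = 0; ch < output_ch; ch++) {
        /* Defensive NULL check on the bias; an assumption of this sketch. */
        int32_t acc_0 = output_bias ? output_bias[ch] : 0;
        int32_t acc_1 = acc_0;
        for (int32_t k = 0; k < num_col_a; k++) {
            int32_t idx = ch * num_col_a + k;   /* flat index into the int4 weights */
            uint8_t byte = (uint8_t)input_a[idx / 2];
            int32_t w = s4_to_s32((idx & 1) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F));
            acc_0 += w * col_0[k];
            acc_1 += w * col_1[k];
        }
        /* Per-channel requantization, output offset, then clamping. */
        acc_0 = requantize_sketch(acc_0, out_mult[ch], out_shift[ch]) + out_offset;
        acc_1 = requantize_sketch(acc_1, out_mult[ch], out_shift[ch]) + out_offset;
        out_0[ch] = clamp_to_s8(acc_0, activation_min, activation_max);
        out_1[ch] = clamp_to_s8(acc_1, activation_min, activation_max);
    }
    return out_1 + output_ch; /* assumed: pointer just past both output columns */
}
```

A caller of the real function would additionally check the returned pointer for NULL before using the output, as described under Returns.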