Function arm_nn_mat_mult_kernel_s16

Function Documentation

int16_t *arm_nn_mat_mult_kernel_s16(const int8_t *input_a, const int16_t *input_b, const int32_t output_ch, const int32_t *out_shift, const int32_t *out_mult, const int32_t activation_min, const int32_t activation_max, const int32_t num_col_a, const cmsis_nn_bias_data *const bias_data, int16_t *out_0, const int32_t row_address_offset)

Matrix-multiplication function for 16-bit convolution with per-channel requantization.

Parameters:
  • input_a[in] pointer to operand A

  • input_b[in] pointer to operand B, which always consists of 2 vectors.

  • output_ch[in] number of rows of A

  • out_shift[in] pointer to per output channel requantization shift parameter.

  • out_mult[in] pointer to per output channel requantization multiplier parameter.

  • activation_min[in] minimum value to clamp the output to. Range: int16

  • activation_max[in] maximum value to clamp the output to. Range: int16

  • num_col_a[in] number of columns of A

  • bias_data[in] pointer to struct with bias vector. The length of this vector is equal to the number of output columns (or RHS input rows). The vector can be int32 or int64, as indicated by a flag in the struct.

  • out_0[inout] pointer to output

  • row_address_offset[in] Address offset between rows in output.

Returns:

The function returns one of the following:

  1. The incremented output pointer for a successful operation or

  2. NULL if implementation is not available.

    This function multiplies the weight matrix for all output channels by 2 columns from im2col and produces two elements per output channel. The outputs are clamped to the range given by activation_min and activation_max. Supported framework: TensorFlow Lite Micro.
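The computation described above can be illustrated with a plain C reference sketch. This is not the library implementation. Several details are assumptions made for illustration only: the names sketch_requantize, sketch_clamp and sketch_mat_mult_s16 are hypothetical; operand A is assumed to be stored row by row; the two vectors of operand B are assumed to be stored back to back; the second output element of each channel is assumed to be written row_address_offset elements after the first; the multiplier is assumed to be a Q31 value with the shift acting as a power-of-two exponent; and the bias is simplified to an optional int64 array instead of the cmsis_nn_bias_data struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scale acc by (mult / 2^31) * 2^shift, rounding to nearest.
 * Assumes shift <= 30 and an arithmetic right shift for negative values. */
static int32_t sketch_requantize(int64_t acc, int32_t mult, int32_t shift)
{
    const int32_t total_shift = 31 - shift;
    const int64_t rounding = (int64_t)1 << (total_shift - 1);
    return (int32_t)((acc * mult + rounding) >> total_shift);
}

/* Hypothetical helper: clamp v to [lo, hi] and narrow to int16. */
static int16_t sketch_clamp(int32_t v, int32_t lo, int32_t hi)
{
    if (v < lo) v = lo;
    if (v > hi) v = hi;
    return (int16_t)v;
}

/* Reference sketch: every row of A (one per output channel) is multiplied
 * with both vectors of B, giving two output elements per channel. */
static int16_t *sketch_mat_mult_s16(const int8_t *input_a, const int16_t *input_b,
                                    int32_t output_ch, const int32_t *out_shift,
                                    const int32_t *out_mult, int32_t activation_min,
                                    int32_t activation_max, int32_t num_col_a,
                                    const int64_t *bias, /* simplified; may be NULL */
                                    int16_t *out_0, int32_t row_address_offset)
{
    const int16_t *col_0 = input_b;             /* first im2col vector (assumed layout) */
    const int16_t *col_1 = input_b + num_col_a; /* second im2col vector (assumed layout) */
    int16_t *out_1 = out_0 + row_address_offset;

    for (int32_t ch = 0; ch < output_ch; ch++) {
        const int8_t *row = input_a + (size_t)ch * (size_t)num_col_a;
        int64_t acc_0 = bias ? bias[ch] : 0;
        int64_t acc_1 = acc_0;

        for (int32_t k = 0; k < num_col_a; k++) {
            acc_0 += (int64_t)row[k] * col_0[k];
            acc_1 += (int64_t)row[k] * col_1[k];
        }

        /* Per-channel requantization followed by clamping to the activation range. */
        out_0[ch] = sketch_clamp(sketch_requantize(acc_0, out_mult[ch], out_shift[ch]),
                                 activation_min, activation_max);
        out_1[ch] = sketch_clamp(sketch_requantize(acc_1, out_mult[ch], out_shift[ch]),
                                 activation_min, activation_max);
    }

    /* Assumed meaning of "incremented output pointer": past both written rows. */
    return out_0 + output_ch + row_address_offset;
}
```

With row_address_offset equal to output_ch, the two output rows are contiguous and the returned pointer of the sketch points just past the second row, so a caller could pass it as out_0 for the next pair of im2col columns.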