Function arm_nn_depthwise_conv_nt_t_s4

Function Documentation

arm_cmsis_nn_status arm_nn_depthwise_conv_nt_t_s4(const int8_t *lhs, const int8_t *rhs, const int32_t lhs_offset, const int32_t active_ch, const int32_t total_ch, const int32_t *out_shift, const int32_t *out_mult, const int32_t out_offset, const int32_t activation_min, const int32_t activation_max, const uint16_t row_x_col, const int32_t *const output_bias, int8_t *out)

Depthwise convolution of a transposed rhs matrix with 4 lhs matrices, for use in non-padded cases. The rhs matrix consists of packed int4 data. The lhs and rhs matrices have the same dimensions.
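To illustrate what "packed int4 data" means, the following is a minimal sketch of unpacking such a buffer into int8 values. It assumes that element 2*i occupies the low nibble and element 2*i+1 the high nibble of byte i; that ordering is an assumption and is not stated in this documentation. The helper names sign_extend_int4 and unpack_int4 are hypothetical and are not part of the library.

```c
#include <stdint.h>

/* Sign-extend a 4-bit two's-complement value held in the low nibble. */
static int8_t sign_extend_int4(uint8_t nibble)
{
    return (int8_t)(((nibble & 0x0F) ^ 0x08) - 0x08);
}

/* Unpack `count` int4 values from `packed` into one int8 value each.
 * Assumption: even-indexed elements sit in the low nibble, odd-indexed
 * elements in the high nibble of the same byte. */
static void unpack_int4(const int8_t *packed, int32_t count, int8_t *unpacked)
{
    for (int32_t i = 0; i < count; i++)
    {
        const uint8_t byte = (uint8_t)packed[i / 2];
        const uint8_t nibble = (i % 2 == 0) ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);
        unpacked[i] = sign_extend_int4(nibble);
    }
}
```

Each unpacked value lies in the int4 range -8 to 7.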

Note

If the number of channels is not a multiple of 4, up to 3 elements beyond the buffer boundary are read for each of the following:

  • Output shift

  • Output multiplier

  • Output bias

  • rhs
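One way to make these over-reads safe is to size the buffers with a few spare elements. The sketch below is an assumption about how a caller might do that, not something the library prescribes: it rounds the per-channel vector length up to a multiple of 4 and reserves room for 3 extra int4 elements at the end of rhs. It also assumes rhs holds row_x_col * total_ch int4 elements. All helper names are hypothetical.

```c
#include <stdint.h>

/* Round a channel count up to the next multiple of 4. */
static int32_t round_up_to_4(int32_t channels)
{
    return ((channels + 3) / 4) * 4;
}

/* Bytes needed to hold `elements` int4 values packed two per byte. */
static int32_t packed_int4_bytes(int32_t elements)
{
    return (elements + 1) / 2;
}

/* Conservative rhs size in bytes: the assumed element count
 * (row_x_col * total_ch) plus 3 spare int4 elements past the end. */
static int32_t padded_rhs_bytes(int32_t row_x_col, int32_t total_ch)
{
    return packed_int4_bytes(row_x_col * total_ch + 3);
}
```

For example, with 5 channels the out_shift, out_mult and output_bias vectors would be allocated with round_up_to_4(5) entries, so that reads of up to 3 entries past the fifth stay inside the allocation.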

Parameters:
  • lhs[in] Input left-hand side matrix

  • rhs[in] Input right-hand side matrix (transposed). Consists of int4 data packed in an int8 buffer.

  • lhs_offset[in] LHS matrix offset (input offset). Range: -127 to 128

  • active_ch[in] Subset of total_ch processed

  • total_ch[in] Number of channels in LHS/RHS

  • out_shift[in] Per channel output shift. Length of vector is equal to number of channels.

  • out_mult[in] Per channel output multiplier. Length of vector is equal to number of channels.

  • out_offset[in] Offset to be added to the output values. Range: -127 to 128

  • activation_min[in] Minimum value to clamp the output to. Range: int8

  • activation_max[in] Maximum value to clamp the output to. Range: int8

  • row_x_col[in] (row_dimension * col_dimension) of LHS/RHS matrix

  • output_bias[in] Per channel output bias. Length of vector is equal to number of channels.

  • out[out] Output pointer
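The documented ranges can be checked on the caller's side before invoking the function. The following is a minimal sketch of such a check. The helper dw_params_in_range is hypothetical and not part of the library, and the requirement that activation_min does not exceed activation_max is an assumption that follows from the clamping behaviour rather than from an explicit statement here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if the scalar arguments fall inside the ranges listed in the
 * parameter documentation. Hypothetical helper, not a library function. */
static bool dw_params_in_range(int32_t lhs_offset,
                               int32_t out_offset,
                               int32_t activation_min,
                               int32_t activation_max,
                               int32_t active_ch,
                               int32_t total_ch)
{
    /* Offsets are documented as -127 to 128. */
    if (lhs_offset < -127 || lhs_offset > 128)
    {
        return false;
    }
    if (out_offset < -127 || out_offset > 128)
    {
        return false;
    }
    /* Activation bounds are documented as int8 values. */
    if (activation_min < INT8_MIN || activation_max > INT8_MAX)
    {
        return false;
    }
    /* Assumption: a clamp needs min <= max. */
    if (activation_min > activation_max)
    {
        return false;
    }
    /* active_ch is a subset of total_ch. */
    return active_ch >= 0 && active_ch <= total_ch;
}
```

A check like this is most useful in debug builds or unit tests, since the function itself is a low-level support routine.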

Returns:

The function returns one of the following arm_cmsis_nn_status values:

  • ARM_CMSIS_NN_SUCCESS if an implementation is available

  • ARM_CMSIS_NN_NO_IMPL_ERROR if no implementation is available.