metaformer
MetaFormer: Meta-Learning with Transformers
Overview
MetaFormer is a transformer-based model whose blocks combine a spatial (token) mixer with a channel mixer. The architecture is designed to learn from few examples and generalize to new tasks.
For more information, refer to the original paper, MetaFormer: Meta-Learning with Transformers.
Classes:
- MetaFormerParams – MetaFormer parameters
- MetaFormerModel – Helper class to generate model from parameters
Functions:
- patch_embedding – Patch embedding layer
- pool_token_mixer – Token mixer using average pooling
- conv_token_mixer – Token mixer using separable convolution
- attention_token_mixer – Token mixer using multi-head attention
- mlp_channel_mixer – Channel mixer using MLP via 1x1 convolutions
- metaformer_block – MetaFormer block
- metaformer_layer – MetaFormer functional layer
Classes
MetaFormerBlockParams
MetaFormerParams
MetaFormer parameters
Attributes:
- blocks (list[MetaFormerBlockParams]) – MetaFormer blocks
- output_filters (int) – Output filters
- output_activation (str | None) – Output activation
- include_top (bool) – Include top
- dropout (float) – Dropout rate
- drop_connect_rate (float) – Drop connect rate
- name (str) – Model name
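As a rough illustration of how these attributes fit together, the sketch below mirrors them in a plain dataclass. `MetaFormerParamsSketch` is a hypothetical stand-in written for this example, not the library class. All default values are illustrative assumptions, and `blocks` is kept as a list of opaque dicts because the fields of MetaFormerBlockParams are not listed in this section.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class MetaFormerParamsSketch:
    """Hypothetical stand-in mirroring the documented attribute names."""

    # Block fields are not documented here, so entries stay opaque.
    blocks: list = field(default_factory=list)
    output_filters: int = 0                  # illustrative default (assumption)
    output_activation: Optional[str] = None  # illustrative default (assumption)
    include_top: bool = True                 # illustrative default (assumption)
    dropout: float = 0.0                     # illustrative default (assumption)
    drop_connect_rate: float = 0.0           # illustrative default (assumption)
    name: str = "MetaFormer"                 # illustrative default (assumption)


params = MetaFormerParamsSketch(output_filters=64, dropout=0.2, name="mf_demo")
params_dict = asdict(params)  # the same settings as a plain dict
```

The dict form is shown because the `layer_from_params` and `model_from_params` signatures accept `MetaFormerParams | dict`.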
MetaFormerModel
Helper class to generate model from parameters
Functions
layer_from_params
staticmethod
layer_from_params(inputs: keras.Input, params: MetaFormerParams | dict, num_classes: int | None = None)
Create layer from parameters
Source code in neuralspot_edge/models/metaformer.py
model_from_params
staticmethod
model_from_params(inputs: keras.Input, params: MetaFormerParams | dict, num_classes: int | None = None)
Create model from parameters
Functions
patch_embedding
patch_embedding(embed_dim: int, patch_shape: tuple[int, int], stride_shape: tuple[int, int] | None = None, padding: str = 'same') -> keras.layers.Layer
Patch embedding layer using 2D convolution
Parameters:
- embed_dim (int) – Embedding dimension
- patch_shape (tuple[int, int]) – Patch shape
- stride_shape (tuple[int, int], default: None) – Stride shape. Defaults to None.
- padding (str, default: 'same') – Padding. Defaults to 'same'.
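Because the patch embedding is a 2D convolution, the size of the embedded grid follows from ordinary convolution shape arithmetic. The helper below sketches that arithmetic for Keras-style 'same' and 'valid' padding. `patch_grid_shape` is a hypothetical name, and the stride is passed explicitly because this section does not state what `stride_shape=None` resolves to.

```python
import math


def patch_grid_shape(input_hw, patch_shape, stride_shape, padding="same"):
    """Spatial output shape of a strided 2D convolution (Keras conventions)."""
    out = []
    for size, kernel, stride in zip(input_hw, patch_shape, stride_shape):
        if padding == "same":
            # 'same' pads the input so only the stride shrinks the output.
            out.append(math.ceil(size / stride))
        elif padding == "valid":
            # 'valid' uses no padding, so the kernel must fit entirely inside.
            out.append((size - kernel) // stride + 1)
        else:
            raise ValueError(f"unsupported padding: {padding!r}")
    return tuple(out)


same_grid = patch_grid_shape((224, 224), patch_shape=(7, 7), stride_shape=(4, 4))
valid_grid = patch_grid_shape((224, 224), (7, 7), (4, 4), padding="valid")
print(same_grid)   # (56, 56)
print(valid_grid)  # (55, 55)
```

Each cell of the resulting grid then carries a vector of length `embed_dim`.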
pool_token_mixer
Token mixer using average pooling
Parameters:
Returns:
- Layer – keras.layers.Layer: Token mixer layer
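To make the idea of mixing tokens by average pooling concrete, here is a minimal pure-Python sketch on a single-channel grid. It is not the library implementation: the `pool_size` argument, the stride of 1, and the choice to average only over in-bounds neighbours at the border are all assumptions of this example.

```python
def average_pool_2d(grid, pool_size=3):
    """Stride-1 average pooling over a 2D grid, averaging in-bounds cells only."""
    rows, cols = len(grid), len(grid[0])
    half = pool_size // 2
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            # Collect the neighbourhood, clipped at the grid border.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


tokens = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
mixed = average_pool_2d(tokens)
print(mixed[1][1])  # 5.0, the mean of all nine values
print(mixed[0][0])  # 3.0, the mean of 1, 2, 4 and 5
```

Each position is replaced by the mean of its spatial neighbourhood, so information moves between tokens without any learned weights.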
conv_token_mixer
conv_token_mixer(embed_dim: int, kernel_size: tuple[int, int] = (3, 3), strides: tuple[int, int] = (1, 1)) -> keras.Layer
Token mixer using separable convolution
Parameters:
- embed_dim (int) – Embedding dimension
- kernel_size (tuple[int, int], default: (3, 3)) – Kernel size. Defaults to (3, 3).
- strides (tuple[int, int], default: (1, 1)) – Strides. Defaults to (1, 1).
Returns:
- Layer – keras.layers.Layer: Token mixer layer
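One practical reason to use a separable convolution as a token mixer is its weight count. The arithmetic below compares a full 2D convolution with a depthwise-plus-pointwise factorization. It assumes `embed_dim` input and output channels, a depth multiplier of 1, and no bias terms; the function names are hypothetical.

```python
def conv_weight_count(channels, kernel_size=(3, 3)):
    """Weights in a full 2D convolution mapping channels -> channels (no bias)."""
    kh, kw = kernel_size
    return kh * kw * channels * channels


def separable_conv_weight_count(channels, kernel_size=(3, 3)):
    """Weights in a depthwise conv (one filter per channel) plus a 1x1 pointwise conv."""
    kh, kw = kernel_size
    depthwise = kh * kw * channels
    pointwise = channels * channels
    return depthwise + pointwise


embed_dim = 64
full = conv_weight_count(embed_dim)                 # 3*3*64*64 = 36864
separable = separable_conv_weight_count(embed_dim)  # 3*3*64 + 64*64 = 4672
print(full, separable)
```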
attention_token_mixer
Token mixer using multi-head attention
Parameters:
- embed_dim (int) – Embedding dimension
- num_heads (int) – Number of heads
- dropout (float, default: 0.1) – Dropout rate. Defaults to 0.1.
Returns:
- Layer – keras.layers.Layer: Token mixer layer
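The following is a minimal sketch of multi-head self-attention used as a token mixer, written in pure Python so the data flow is visible. It is deliberately simplified: the query, key, value and output projections are identity maps (no learned weights), dropout is omitted, and splitting `embed_dim` evenly across `num_heads` is an assumption of this example rather than something this section states.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention for a single head."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    width = len(values[0])
    output = []
    for q in queries:
        weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        output.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(width)])
    return output


def multi_head_self_attention(tokens, num_heads):
    """Split channels into heads, attend within each head, then concatenate."""
    embed_dim = len(tokens[0])
    if embed_dim % num_heads:
        raise ValueError("embed_dim must be divisible by num_heads in this sketch")
    head_dim = embed_dim // num_heads
    mixed = [[] for _ in tokens]
    for h in range(num_heads):
        # Slice out this head's channels for every token.
        head = [t[h * head_dim:(h + 1) * head_dim] for t in tokens]
        for row, head_out in zip(mixed, attend(head, head, head)):
            row.extend(head_out)
    return mixed


tokens = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
mixed = multi_head_self_attention(tokens, num_heads=2)
```

Every output token is a weighted average of all input tokens, with the weights computed separately per head. The output has the same number of tokens and channels as the input.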
mlp_channel_mixer
mlp_channel_mixer(embed_dim: int, ratio: int = 4, activation: str = 'gelu', dropout: float = 0) -> keras.Layer
Channel mixer using MLP via 1x1 convolutions
Parameters:
- embed_dim (int) – Embedding dimension
- ratio (int, default: 4) – Expansion ratio. Defaults to 4.
- activation (str, default: 'gelu') – Activation function. Defaults to "gelu".
- dropout (float, default: 0) – Dropout rate. Defaults to 0.
Returns:
- Layer – keras.layers.Layer: Channel mixer layer
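A 1x1 convolution applies the same dense layer at every spatial position, so an MLP built from 1x1 convolutions mixes channels without mixing tokens. The sketch below illustrates this in pure Python. The expand, activate, project structure and the reading of `ratio` as `hidden_dim = embed_dim * ratio` are assumptions based on the usual transformer MLP, and the constant weights are toy values; dropout is omitted.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def dense(vector, weights, bias):
    """y[j] = sum_i vector[i] * weights[i][j] + bias[j]."""
    return [
        sum(v * row[j] for v, row in zip(vector, weights)) + bias[j]
        for j in range(len(bias))
    ]


def mlp_channel_mix(grid, w_in, b_in, w_out, b_out):
    """Apply the same expand -> GELU -> project MLP at every spatial position."""
    return [
        [dense([gelu(h) for h in dense(token, w_in, b_in)], w_out, b_out) for token in row]
        for row in grid
    ]


embed_dim, ratio = 2, 4
hidden_dim = embed_dim * ratio  # assumed meaning of the expansion ratio
# Constant toy weights keep the example deterministic.
w_in = [[0.1] * hidden_dim for _ in range(embed_dim)]
b_in = [0.0] * hidden_dim
w_out = [[0.1] * embed_dim for _ in range(hidden_dim)]
b_out = [0.0] * embed_dim

grid = [[[1.0, 2.0], [0.0, 0.0]]]  # shape (1, 2, embed_dim)
mixed = mlp_channel_mix(grid, w_in, b_in, w_out, b_out)
```

Because each position is processed independently, the all-zero token stays all-zero regardless of its neighbour's values.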
metaformer_block
metaformer_block(token_mixer: keras.layers.Layer | None = None, channel_mixer: keras.layers.Layer | None = None, name: str = 'mf_block') -> keras.layers.Layer
MetaFormer block
Parameters:
- token_mixer (Layer, default: None) – Token mixer layer. Defaults to None.
- channel_mixer (Layer, default: None) – Channel mixer layer. Defaults to None.
- name (str, default: 'mf_block') – Block name. Defaults to 'mf_block'.
Returns:
- Layer – keras.layers.Layer: MetaFormer block
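As an illustration of how a token mixer and a channel mixer might be combined, the sketch below chains two residual sub-blocks, each preceded by a per-token normalization. This is a common arrangement for transformer-style blocks, not a description of this library's internals: the normalization placement, the residual connections, and the choice to skip a branch whose mixer is None are all assumptions, and the two toy mixers are hypothetical.

```python
import math


def layer_norm(vector, eps=1e-6):
    """Normalize one token's channels to zero mean and unit variance."""
    mean = sum(vector) / len(vector)
    var = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(var + eps) for v in vector]


def metaformer_block_sketch(tokens, token_mixer=None, channel_mixer=None):
    """Two residual sub-blocks: token mixing, then channel mixing.

    Assumption: a mixer left as None contributes nothing (its branch is skipped).
    """
    if token_mixer is not None:
        mixed = token_mixer([layer_norm(t) for t in tokens])
        tokens = [[a + b for a, b in zip(t, m)] for t, m in zip(tokens, mixed)]
    if channel_mixer is not None:
        mixed = [channel_mixer(layer_norm(t)) for t in tokens]
        tokens = [[a + b for a, b in zip(t, m)] for t, m in zip(tokens, mixed)]
    return tokens


def mean_token_mixer(tokens):
    """Toy token mixer: every position receives the mean over all positions."""
    count = len(tokens)
    mean = [sum(column) / count for column in zip(*tokens)]
    return [list(mean) for _ in tokens]


def scale_channel_mixer(token):
    """Toy channel mixer: transforms one token's channels, ignoring other tokens."""
    return [0.5 * v for v in token]


x = [[1.0, 3.0], [2.0, 6.0]]
y = metaformer_block_sketch(x, mean_token_mixer, scale_channel_mixer)
identity = metaformer_block_sketch(x)  # no mixers: input passes through
```

The token mixer sees all positions at once, while the channel mixer is applied to each position separately; the block's output keeps the input's shape.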
metaformer_layer
metaformer_layer(x: keras.KerasTensor, params: MetaFormerParams, num_classes: int | None = None) -> keras.KerasTensor
MetaFormer functional layer
Parameters:
- x (KerasTensor) – Input tensor
- params (MetaFormerParams) – Model parameters.
- num_classes (int, default: None) – Number of classes.
Returns:
- KerasTensor – keras.KerasTensor: Output tensor