metaformer
MetaFormer: Meta-Learning with Transformers
Overview
MetaFormer is a transformer-based model built from spatial (token) mixing blocks and channel mixing blocks. The architecture is designed to learn from few examples and generalize to new tasks.
For more information, refer to the original paper, MetaFormer: Meta-Learning with Transformers.
Classes:
- MetaFormerParams: MetaFormer parameters
- MetaFormerModel: Helper class to generate model from parameters
Functions:
- patch_embedding: Patch embedding layer
- pool_token_mixer: Token mixer using average pooling
- conv_token_mixer: Token mixer using separable convolution
- attention_token_mixer: Token mixer using multi-head attention
- mlp_channel_mixer: Channel mixer using MLP via 1x1 convolutions
- metaformer_block: MetaFormer block
- metaformer_layer: MetaFormer functional layer
Classes
MetaFormerBlockParams
MetaFormerParams
MetaFormer parameters
Attributes:
- blocks (list[MetaFormerBlockParams]): MetaFormer blocks
- output_filters (int): Output filters
- output_activation (str | None): Output activation
- include_top (bool): Include top
- dropout (float): Dropout rate
- drop_connect_rate (float): Drop connect rate
- name (str): Model name
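Since the factory methods documented on this page accept `params` as either a `MetaFormerParams` instance or a plain dict, the attributes above can be sketched as a dict. This is only an illustration: every value is a placeholder rather than a library default, `check_params` is a hypothetical helper that is not part of the library, and `blocks` is left empty because the fields of `MetaFormerBlockParams` are not listed on this page.

```python
# Hypothetical sketch: the documented MetaFormerParams attributes as a plain dict.
# Names come from the attribute list; expected types mirror the documented annotations.
DOCUMENTED_FIELDS = {
    "blocks": list,
    "output_filters": int,
    "output_activation": (str, type(None)),
    "include_top": bool,
    "dropout": float,
    "drop_connect_rate": float,
    "name": str,
}

# Placeholder values chosen for illustration only (not library defaults).
params = {
    "blocks": [],  # MetaFormerBlockParams entries; their fields are not documented here
    "output_filters": 64,
    "output_activation": None,
    "include_top": True,
    "dropout": 0.1,
    "drop_connect_rate": 0.0,
    "name": "MetaFormer",
}


def check_params(candidate: dict) -> list[str]:
    """Return a list of problems: missing keys, wrong types, or unknown keys."""
    problems = []
    for key, expected in DOCUMENTED_FIELDS.items():
        if key not in candidate:
            problems.append(f"missing: {key}")
        elif not isinstance(candidate[key], expected):
            problems.append(f"wrong type: {key}")
    for key in candidate:
        if key not in DOCUMENTED_FIELDS:
            problems.append(f"unknown: {key}")
    return problems
```

A check like this catches misspelled keys before the dict is handed to the model factory; the library itself may validate or coerce values differently.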
MetaFormerModel
Helper class to generate model from parameters
Functions
layer_from_params
staticmethod
layer_from_params(inputs: keras.Input, params: MetaFormerParams | dict, num_classes: int | None = None)
Create layer from parameters
Source code in neuralspot_edge/models/metaformer.py
model_from_params
staticmethod
model_from_params(inputs: keras.Input, params: MetaFormerParams | dict, num_classes: int | None = None)
Create model from parameters
Functions
patch_embedding
patch_embedding(embed_dim: int, patch_shape: tuple[int, int], stride_shape: tuple[int, int] | None = None, padding: str = 'same') -> keras.layers.Layer
Patch embedding layer using 2D convolution
Parameters:
- embed_dim (int): Embedding dimension
- patch_shape (tuple[int, int]): Patch shape
- stride_shape (tuple[int, int], default: None): Stride shape.
- padding (str, default: 'same'): Padding.
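To see how `patch_shape`, `stride_shape`, and `padding` determine the size of the patch grid, the standard output-size rule for a strided 2D convolution can be written out by hand. This is a sketch, not the library's code: `conv_output_size` and `patch_grid` are hypothetical helpers, and the fallback of a missing stride to the patch shape (non-overlapping patches) is an assumption about what `stride_shape=None` means.

```python
import math


def conv_output_size(size: int, kernel: int, stride: int, padding: str = "same") -> int:
    """Output length along one axis of a strided convolution."""
    if padding == "same":
        # 'same' pads the input so that only the stride shrinks the output.
        return math.ceil(size / stride)
    if padding == "valid":
        # 'valid' uses no padding, so the kernel must fit entirely inside the input.
        return (size - kernel) // stride + 1
    raise ValueError(f"unsupported padding: {padding!r}")


def patch_grid(input_shape, patch_shape, stride_shape=None, padding="same"):
    """Height and width of the patch grid for a given input size."""
    # Assumption: a missing stride falls back to the patch shape.
    stride_shape = stride_shape or patch_shape
    return tuple(
        conv_output_size(n, k, s, padding)
        for n, k, s in zip(input_shape, patch_shape, stride_shape)
    )
```

For example, `patch_grid((224, 224), (16, 16))` gives a 14 x 14 grid, and each grid cell then carries `embed_dim` channels.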
pool_token_mixer
Token mixer using average pooling
Parameters:
Returns:
- keras.layers.Layer: Token mixer layer
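The idea behind a pooling token mixer can be illustrated without any framework: each token is replaced by the average of its neighbours, which mixes information across positions with no learned weights. The sketch below works on a single channel of a 1D sequence; `average_pool_tokens` and its `pool_size` argument are hypothetical, since this page does not list the parameters of `pool_token_mixer`, and the edge handling shown is one possible choice.

```python
def average_pool_tokens(tokens: list[float], pool_size: int = 3) -> list[float]:
    """Stride-1 average pooling over a 1D token sequence, keeping its length.

    Windows are clipped at the sequence edges, so border tokens average over
    fewer neighbours. Intended for odd pool sizes, which give a centred window.
    """
    half = pool_size // 2
    mixed = []
    for i in range(len(tokens)):
        window = tokens[max(0, i - half): i + half + 1]
        mixed.append(sum(window) / len(window))
    return mixed
```

Because the output has the same length as the input, a mixer of this kind can sit inside a residual block without any reshaping.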
conv_token_mixer
conv_token_mixer(embed_dim: int, kernel_size: tuple[int, int] = (3, 3), strides: tuple[int, int] = (1, 1)) -> keras.Layer
Token mixer using separable convolution
Parameters:
- embed_dim (int): Embedding dimension
- kernel_size (tuple[int, int], default: (3, 3)): Kernel size.
- strides (tuple[int, int], default: (1, 1)): Strides.
Returns:
- keras.layers.Layer: Token mixer layer
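A separable convolution splits spatial filtering (depthwise) from channel mixing (pointwise), which is why it is much cheaper than a standard convolution of the same kernel size. The parameter counts below are a back-of-the-envelope sketch that assumes a depth multiplier of 1 and a single bias on the output; the helper names are hypothetical and the library's layer may be configured differently.

```python
def standard_conv_params(channels_in: int, channels_out: int, kernel_size=(3, 3)) -> int:
    """Weights plus bias of a standard 2D convolution."""
    kh, kw = kernel_size
    return kh * kw * channels_in * channels_out + channels_out


def separable_conv_params(channels_in: int, channels_out: int, kernel_size=(3, 3)) -> int:
    """Weights plus bias of a depthwise-separable 2D convolution."""
    kh, kw = kernel_size
    depthwise = kh * kw * channels_in      # one kh x kw filter per input channel
    pointwise = channels_in * channels_out  # 1x1 convolution that mixes channels
    return depthwise + pointwise + channels_out  # assumes one bias vector on the output
```

With `embed_dim = 64` input and output channels and the default 3 x 3 kernel, the standard convolution has 36,928 parameters and the separable one has 4,736.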
attention_token_mixer
Token mixer using multi-head attention
Parameters:
- embed_dim (int): Embedding dimension
- num_heads (int): Number of heads
- dropout (float, default: 0.1): Dropout rate.
Returns:
- keras.layers.Layer: Token mixer layer
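As a reminder of what an attention-based token mixer computes, here is a minimal single-head scaled dot-product self-attention in plain Python. It is a simplification under stated assumptions: the query, key, and value projections are the identity (no learned weights), there is one head instead of `num_heads`, and no dropout is applied. The function names are illustrative only.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: list[list[float]]) -> list[list[float]]:
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    mixed = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all tokens.
        mixed.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )
    return mixed
```

Unlike pooling or convolution, the mixing weights here depend on the input itself, so every token can attend to every other token regardless of distance.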
mlp_channel_mixer
mlp_channel_mixer(embed_dim: int, ratio: int = 4, activation: str = 'gelu', dropout: float = 0) -> keras.Layer
Channel mixer using MLP via 1x1 convolutions
Parameters:
- embed_dim (int): Embedding dimension
- ratio (int, default: 4): Expansion ratio.
- activation (str, default: 'gelu'): Activation function.
- dropout (float, default: 0): Dropout rate.
Returns:
- keras.layers.Layer: Channel mixer layer
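The effect of `ratio` is easiest to see as arithmetic. A channel MLP built from two 1x1 convolutions typically expands `embed_dim` channels to `embed_dim * ratio` and then projects back. The sketch below assumes exactly that two-layer layout with a bias on each convolution; `mlp_channel_mixer_shapes` is a hypothetical helper and the library's layer may differ in detail.

```python
def mlp_channel_mixer_shapes(embed_dim: int, ratio: int = 4) -> dict:
    """Hidden width and parameter count of a two-layer 1x1-convolution MLP."""
    hidden_dim = embed_dim * ratio
    # A 1x1 convolution has one weight per (input channel, output channel) pair.
    expand = embed_dim * hidden_dim + hidden_dim   # embed_dim -> hidden_dim, plus bias
    project = hidden_dim * embed_dim + embed_dim   # hidden_dim -> embed_dim, plus bias
    return {"hidden_dim": hidden_dim, "params": expand + project}
```

For `embed_dim = 64` and the default `ratio = 4`, the hidden width is 256 and the two convolutions hold 33,088 parameters in total under these assumptions.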
metaformer_block
metaformer_block(token_mixer: keras.layers.Layer | None = None, channel_mixer: keras.layers.Layer | None = None, name: str = 'mf_block') -> keras.layers.Layer
MetaFormer block
Parameters:
- token_mixer (Layer, default: None): Token mixer layer.
- channel_mixer (Layer, default: None): Channel mixer layer.
- name (str, default: 'mf_block'): Block name.
Returns:
- keras.layers.Layer: MetaFormer block
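How a token mixer and a channel mixer might be combined can be sketched with plain callables. The version below follows the residual pattern commonly used for MetaFormer-style blocks (normalize, mix, add back to the input, once per mixer). It is an assumption that the library's block has this shape, that a `None` mixer simply skips its sub-block, and that the normalization is a layer norm; all names here are illustrative.

```python
import math
from typing import Callable, Optional

Mixer = Callable[[list], list]


def layer_norm(x: list, eps: float = 1e-6) -> list:
    """Toy normalization: zero mean and unit variance over the whole vector."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def metaformer_block_sketch(
    token_mixer: Optional[Mixer] = None,
    channel_mixer: Optional[Mixer] = None,
) -> Mixer:
    """Build a callable that applies two residual sub-blocks in sequence."""

    def apply(x: list) -> list:
        for mixer in (token_mixer, channel_mixer):
            if mixer is None:
                continue  # Assumption: a missing mixer skips its sub-block
            y = mixer(layer_norm(x))
            x = [a + b for a, b in zip(x, y)]  # residual connection
        return x

    return apply
```

Because each sub-block only adds to its input, swapping the token mixer (pooling, convolution, or attention) leaves the rest of the block untouched, which is the main appeal of this design.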
metaformer_layer
metaformer_layer(x: keras.KerasTensor, params: MetaFormerParams, num_classes: int | None = None) -> keras.KerasTensor
MetaFormer functional layer
Parameters:
- x (KerasTensor): Input tensor
- params (MetaFormerParams): Model parameters.
- num_classes (int, default: None): Number of classes.
Returns:
- keras.KerasTensor: Output tensor