patching
Patching Layers API
This module provides classes to extract patches from 2D data and mask a proportion of them.
Classes:
- PatchLayer2D: Extracts patches from 2D data.
- MaskedPatchEncoder2D: Encodes patches and masks a proportion of them.
Classes
PatchLayer2D
Extracts patches from 2D data (e.g. an image) and reshapes each patch into a flattened vector. Useful as a preprocessing step for patch-based self-supervised learning methods such as DINO and masked autoencoders. For in-model patching, consider using convolutional layers instead.
Parameters:
- height (int): The height of the data.
- width (int): The width of the data.
- ch (int): The number of channels in the data.
- patch_height (int): The height of the patch.
- patch_width (int): The width of the patch.
Source code in helia_edge/layers/patching.py
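The patch extraction described above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper name `extract_patches` is hypothetical, and it assumes non-overlapping patches, channels-last input of shape (B, H, W, C), and an output of shape (B, P, A), where P is the number of patches and A = patch_height * patch_width * C.

```python
import numpy as np


def extract_patches(images, patch_height, patch_width):
    """Split a (B, H, W, C) batch into flattened, non-overlapping patches.

    Hypothetical helper for illustration only; returns shape (B, P, A).
    """
    b, h, w, c = images.shape
    if h % patch_height or w % patch_width:
        raise ValueError("Image size must be divisible by the patch size.")
    rows, cols = h // patch_height, w // patch_width
    # Split the height and width axes into (grid index, offset within patch).
    x = images.reshape(b, rows, patch_height, cols, patch_width, c)
    # Bring the two grid axes together so each patch is contiguous.
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # Flatten the grid into P and each patch into a vector of length A.
    return x.reshape(b, rows * cols, patch_height * patch_width * c)


images = np.arange(2 * 4 * 6 * 3).reshape(2, 4, 6, 3)
patches = extract_patches(images, patch_height=2, patch_width=3)
print(patches.shape)
```

In this sketch the patches are ordered row by row across the image grid, so patch 0 is the top-left block of each image.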
Functions
show_patched_image
Utility function that accepts a batch of images and their corresponding patches and helps visualize one image and its patches side by side.
NOTE: Assumes the image size is divisible by the patch size.
Parameters:
- images (KerasTensor): A batch of images of shape (B, H, W, C).
- patches (KerasTensor): A batch of patches of shape (B, P, A).
Returns:
- int: The index of the image that was visualized.
reconstruct_from_patch
Takes the patches of a single image and reconstructs the image from them.
NOTE: Assumes the image size is divisible by the patch size.
Parameters:
- patch (KerasTensor): A patch of shape (P, A).
Returns:
- keras.KerasTensor: The reconstructed image of shape (H, W, C).
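As a rough illustration of the reconstruction step, the sketch below reverses a row-by-row patch split with NumPy. It is not the library's code: the function name `reconstruct_from_patches` and its explicit size arguments are hypothetical, and it assumes the (P, A) input holds P non-overlapping patches of one image, each flattened in (patch_height, patch_width, C) order.

```python
import numpy as np


def reconstruct_from_patches(patches, height, width, patch_height, patch_width):
    """Rebuild an (H, W, C) image from its (P, A) flattened patches.

    Hypothetical helper for illustration only; assumes row-by-row patch order.
    """
    rows, cols = height // patch_height, width // patch_width
    if patches.shape[0] != rows * cols:
        raise ValueError("Number of patches does not match the image grid.")
    ch = patches.shape[-1] // (patch_height * patch_width)
    # Restore the patch grid and the per-patch spatial layout.
    x = patches.reshape(rows, cols, patch_height, patch_width, ch)
    # Interleave grid rows with patch rows, and grid columns with patch columns.
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(height, width, ch)


image = np.arange(4 * 6 * 3).reshape(4, 6, 3)
# Build the patches of this single image: a 2 x 2 grid of 2 x 3 patches.
patches = image.reshape(2, 2, 2, 3, 3).transpose(0, 2, 1, 3, 4).reshape(4, 18)
restored = reconstruct_from_patches(patches, 4, 6, 2, 3)
print(np.array_equal(restored, image))
```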
MaskedPatchEncoder2D
MaskedPatchEncoder2D(patch_height: int, patch_width: int, ch_size: int, projection_dim: int, mask_proportion: float, downstream: bool = False, **kwargs)
Given a batch of patches, this layer will:
1. Project the patches and apply positional embeddings.
2. Mask a proportion of patches.
3. Return the masked and unmasked patches along with
Parameters:
- patch_height (int): The height of the patch.
- patch_width (int): The width of the patch.
- ch_size (int): The number of channels in the patch.
- projection_dim (int): The dimension of the projection layer.
- mask_proportion (float): The proportion of patches to mask.
- downstream (bool, default: False): Whether to use the layer in the downstream task. Defaults to False.
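One common way to mask a proportion of patches is to draw a random permutation of patch indices per sample and split it at `int(mask_proportion * num_patches)`. The NumPy sketch below shows that idea only; whether MaskedPatchEncoder2D selects indices this way is an assumption, and the helper name `random_mask_indices` is hypothetical.

```python
import numpy as np


def random_mask_indices(batch_size, num_patches, mask_proportion, rng):
    """Return (mask_indices, unmask_indices) for each sample in the batch.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= mask_proportion <= 1.0:
        raise ValueError("mask_proportion must be between 0 and 1.")
    num_mask = int(mask_proportion * num_patches)
    # Sorting random scores yields an independent permutation per sample.
    scores = rng.uniform(size=(batch_size, num_patches))
    order = np.argsort(scores, axis=-1)
    return order[:, :num_mask], order[:, num_mask:]


rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2, 16, 8))  # (B, P, projection_dim)
mask_idx, unmask_idx = random_mask_indices(2, 16, 0.75, rng)
# Gather only the patch embeddings that stay visible.
unmasked = np.take_along_axis(embeddings, unmask_idx[..., None], axis=1)
print(mask_idx.shape, unmask_idx.shape, unmasked.shape)
```

Because both index sets come from the same permutation, every patch is either masked or unmasked, never both.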