Patching Layers API

This module provides classes to extract patches from 2D data and mask a proportion of them.

Classes

PatchLayer2D

PatchLayer2D(height: int, width: int, ch: int, patch_height: int, patch_width: int, **kwargs)

This layer extracts patches from 2D data (e.g., an image) and reshapes each patch into a flattened vector. It is useful as a preprocessing step for patch-based self-supervised learning methods such as DINO and Masked Autoencoders. For in-model patching, consider using convolutional layers.

Parameters:

  • height (int) –

    The height of the data.

  • width (int) –

    The width of the data.

  • ch (int) –

    The number of channels in the data.

  • patch_height (int) –

    The height of the patch.

  • patch_width (int) –

    The width of the patch.

Source code in neuralspot_edge/layers/patching.py
def __init__(
    self,
    height: int,
    width: int,
    ch: int,
    patch_height: int,
    patch_width: int,
    **kwargs,
):
    """Extracts patches from 2D data (e.g., an image) and reshapes each patch into a flattened vector.
    Useful as a preprocessing step for patch-based self-supervised learning methods such as
    DINO and Masked Autoencoders. For in-model patching, consider using convolutional layers.

    Args:
        height (int): The height of the data.
        width (int): The width of the data.
        ch (int): The number of channels in the data.
        patch_height (int): The height of the patch.
        patch_width (int): The width of the patch.
    """
    super().__init__(**kwargs)
    self.height = height
    self.width = width
    self.ch_size = ch
    self.patch_height = patch_height
    self.patch_width = patch_width

    # Each patch of size (patch_height, patch_width, ch) is flattened into one vector.
    self.resize = keras.layers.Reshape((-1, patch_height * patch_width * ch))
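
The constructor above only stores the sizes and builds the flattening `Reshape`; the extraction step itself is not shown on this page. As a minimal sketch of what non-overlapping patch extraction computes, assuming the image size is divisible by the patch size, the following uses NumPy as a stand-in for the layer. The function name `extract_patches_np` is hypothetical and is not part of the library.

```python
import numpy as np


def extract_patches_np(images, patch_height, patch_width):
    """Split (B, H, W, C) images into flattened, non-overlapping patches.

    Hypothetical NumPy helper for illustration; not the library's implementation.
    """
    b, h, w, c = images.shape
    if h % patch_height or w % patch_width:
        raise ValueError("image size must be divisible by patch size")
    rows, cols = h // patch_height, w // patch_width

    # Expose the patch grid: (B, rows, patch_height, cols, patch_width, C).
    x = images.reshape(b, rows, patch_height, cols, patch_width, c)
    # Bring the grid axes together: (B, rows, cols, patch_height, patch_width, C).
    x = x.transpose(0, 1, 3, 2, 4, 5)
    # Flatten the grid into P patches and each patch into one vector.
    return x.reshape(b, rows * cols, patch_height * patch_width * c)


images = np.arange(2 * 4 * 6 * 3).reshape(2, 4, 6, 3)
patches = extract_patches_np(images, patch_height=2, patch_width=3)
print(patches.shape)  # (2, 4, 18)
```

Patches come out in row-major order over the patch grid, which is the ordering that `reconstruct_from_patch` expects when it splits the patches into rows.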

Functions

show_patched_image

show_patched_image(images: keras.KerasTensor, patches: keras.KerasTensor) -> int

Utility function that takes a batch of images and their corresponding patches and visualizes one randomly selected image alongside its reconstruction from patches.

NOTE: Assumes the image size is divisible by the patch size.

Parameters:

  • images (KerasTensor) –

    A batch of images of shape (B, H, W, C).

  • patches (KerasTensor) –

    A batch of patches of shape (B, P, A).

Returns:

  • int –

    The index of the image that was visualized
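
The page does not define P and A explicitly. Reading them off the `Reshape` in the `PatchLayer2D` constructor, A appears to be the flattened patch length and P the number of patches per image. A small sketch of that arithmetic, under those assumptions (the helper name `patch_shape` is hypothetical):

```python
def patch_shape(height, width, ch, patch_height, patch_width):
    """Return (P, A) for an image of shape (height, width, ch).

    Hypothetical helper; assumes non-overlapping patches.
    """
    if height % patch_height or width % patch_width:
        raise ValueError("image size must be divisible by patch size")
    num_patches = (height // patch_height) * (width // patch_width)  # P
    patch_area = patch_height * patch_width * ch  # A
    return num_patches, patch_area


print(patch_shape(32, 32, 3, 8, 8))  # (16, 192)
```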

Source code in neuralspot_edge/layers/patching.py
def show_patched_image(self, images: keras.KerasTensor, patches: keras.KerasTensor) -> int:
    """Utility function that takes a batch of images and their
    corresponding patches and visualizes one randomly selected image
    alongside its reconstruction from patches.

    NOTE: Assumes the image size is divisible by the patch size.

    Args:
        images (keras.KerasTensor): A batch of images of shape (B, H, W, C).
        patches (keras.KerasTensor): A batch of patches of shape (B, P, A).

    Returns:
        int: The index of the image that was visualized
    """

    idx = np.random.choice(patches.shape[0])

    image = images[idx]
    patch = patches[idx]
    reconstructed_image = self.reconstruct_from_patch(patch)

    plt.figure(figsize=(4, 4))
    plt.imshow(keras.utils.array_to_img(image))
    plt.axis("off")
    plt.show()

    plt.figure(figsize=(4, 4))
    plt.imshow(keras.utils.array_to_img(reconstructed_image))
    plt.axis("off")
    plt.show()

    return idx

reconstruct_from_patch

reconstruct_from_patch(patch: keras.KerasTensor) -> keras.KerasTensor

Takes the patches of a single image and reassembles them into the image.

NOTE: Assumes the image size is divisible by the patch size.

Parameters:

  • patch (KerasTensor) –

    The patches of one image, of shape (P, A).

Returns:

  • KerasTensor –

    The reconstructed image of shape (H, W, C).

Source code in neuralspot_edge/layers/patching.py
def reconstruct_from_patch(self, patch: keras.KerasTensor) -> keras.KerasTensor:
    """Takes the patches of a *single* image and reassembles them into the image.

    NOTE: Assumes the image size is divisible by the patch size.

    Args:
        patch (keras.KerasTensor): The patches of one image, of shape (P, A).

    Returns:
        keras.KerasTensor: The reconstructed image of shape (H, W, C).

    """
    num_patches = patch.shape[0]
    # Number of patch rows; integer division avoids a float round trip.
    n = self.height // self.patch_height

    patch = keras.ops.reshape(patch, (num_patches, self.patch_height, self.patch_width, self.ch_size))
    rows = keras.ops.split(patch, n, axis=0)
    rows = [keras.ops.concatenate(keras.ops.unstack(x), axis=1) for x in rows]
    reconstructed = keras.ops.concatenate(rows, axis=0)
    return reconstructed
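
To check the reassembly logic without Keras, the same steps can be mirrored in NumPy. This is a sketch under the stated divisibility assumption; `reconstruct_np` is a hypothetical name, and `np.split` and `np.concatenate` stand in for the `keras.ops` calls used above.

```python
import numpy as np


def reconstruct_np(patch, height, patch_height, patch_width, ch):
    """Reassemble (P, A) patches of one image into (H, W, C).

    Hypothetical NumPy mirror of the method above, for illustration only.
    """
    num_patches = patch.shape[0]
    n = height // patch_height  # number of patch rows

    tiles = patch.reshape(num_patches, patch_height, patch_width, ch)
    # One group of tiles per patch row.
    rows = np.split(tiles, n, axis=0)
    # Join the tiles of each row along the width axis.
    rows = [np.concatenate(list(r), axis=1) for r in rows]
    # Stack the rows along the height axis.
    return np.concatenate(rows, axis=0)


# Round trip on a 4x6 single-channel image cut into 2x3 patches (row-major order).
image = np.arange(4 * 6 * 1).reshape(4, 6, 1)
patch = np.stack(
    [image[r:r + 2, c:c + 3].reshape(-1) for r in (0, 2) for c in (0, 3)]
)
restored = reconstruct_np(patch, height=4, patch_height=2, patch_width=3, ch=1)
print(np.array_equal(restored, image))  # True
```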

MaskedPatchEncoder2D

MaskedPatchEncoder2D(patch_height: int, patch_width: int, ch_size: int, projection_dim: int, mask_proportion: float, downstream: bool = False, **kwargs)

Given a batch of patches, this layer will:

  1. Project the patches and apply positional embeddings.
  2. Mask a proportion of patches.
  3. Return the masked and unmasked patches along with

Parameters:

  • patch_height (int) –

    The height of the patch.

  • patch_width (int) –

    The width of the patch.

  • ch_size (int) –

    The number of channels in the patch.

  • projection_dim (int) –

    The dimension of the projection layer.

  • mask_proportion (float) –

    The proportion of patches to mask.

  • downstream (bool, default: False ) –

    Whether the layer is used in a downstream task. Defaults to False.

Source code in neuralspot_edge/layers/patching.py
def __init__(
    self,
    patch_height: int,
    patch_width: int,
    ch_size: int,
    projection_dim: int,
    mask_proportion: float,
    downstream: bool = False,
    **kwargs,
):
    """Given a batch of patches, this layer will
    1. Project the patches and apply positional embeddings.
    2. Mask a proportion of patches.
    3. Return the masked and unmasked patches along with

    Args:
        patch_height (int): The height of the patch.
        patch_width (int): The width of the patch.
        ch_size (int): The number of channels in the patch.
        projection_dim (int): The dimension of the projection layer.
        mask_proportion (float): The proportion of patches to mask.
        downstream (bool, optional): Whether the layer is used in a downstream task. Defaults to False.
    """
    super().__init__(**kwargs)
    self.patch_height = patch_height
    self.patch_width = patch_width
    self.ch_size = ch_size
    self.projection_dim = projection_dim
    self.mask_proportion = mask_proportion
    self.downstream = downstream
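
The constructor only stores configuration, so the projection and masking steps are not visible on this page. The following is a rough NumPy sketch of what such steps commonly look like in masked-autoencoder setups, not the layer's actual implementation. The function name `mask_patches_np`, the random stand-ins for learned weights, the `seed` argument, and the choice to return index arrays are all assumptions.

```python
import numpy as np


def mask_patches_np(patches, projection_dim, mask_proportion, seed=0):
    """Project (B, P, A) patches and split them into unmasked and masked sets.

    Hypothetical illustration; weights are random stand-ins for learned ones.
    """
    rng = np.random.default_rng(seed)
    batch, num_patches, area = patches.shape

    # 1. Linear projection plus one embedding per patch position.
    kernel = rng.normal(size=(area, projection_dim))
    pos_emb = rng.normal(size=(num_patches, projection_dim))
    embeddings = patches @ kernel + pos_emb  # (B, P, D)

    # 2. Draw an independent random permutation of patch indices per sample.
    num_mask = int(mask_proportion * num_patches)
    order = np.argsort(rng.uniform(size=(batch, num_patches)), axis=-1)
    mask_idx, unmask_idx = order[:, :num_mask], order[:, num_mask:]

    # 3. Gather the two groups along the patch axis.
    masked = np.take_along_axis(embeddings, mask_idx[..., None], axis=1)
    unmasked = np.take_along_axis(embeddings, unmask_idx[..., None], axis=1)
    return unmasked, masked, mask_idx, unmask_idx


patches = np.ones((2, 8, 12))
unmasked, masked, mask_idx, unmask_idx = mask_patches_np(
    patches, projection_dim=16, mask_proportion=0.75
)
print(unmasked.shape, masked.shape)  # (2, 2, 16) (2, 6, 16)
```

Sorting uniform noise gives each sample its own permutation, so every patch index lands in exactly one of the two groups.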

Functions