
PPG Workflow

This page describes the current production-oriented workflow in compressionKIT. Today, the most complete path in the toolkit is a PPG compression and reconstruction pipeline built around a residual vector quantization autoencoder.

What Is Supported Today

The currently supported task is:

  • Input: 64 Hz PPG signals, typically sourced from EDF recordings with a Pleth channel.
  • Training objective: reconstruct the waveform after neural compression while preserving useful physiological structure.
  • Deployment path: export the encoder as INT8 LiteRT plus a C header for embedded integration.
  • Evaluation path: compare original versus reconstructed signals with waveform, reconstruction, and compression metrics.

In practice, this gives teams a path from offline dataset preparation to embedded export and browser-based inspection without changing toolchains midway through the project.

End-To-End Flow

  1. Load PPG segments from EDF data and resample to 64 Hz.
  2. Build deterministic training or validation windows with cache-backed dataset support.
  3. Normalize and optionally augment the input with noise and synthetic mixing.
  4. Compress the signal through an RVQ autoencoder configured for a target operating point.
  5. Track validation metrics and keep the best-performing checkpoint weights.
  6. Export encoder artifacts for deployment and save summaries, plots, and reconstruction samples.
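At a glance, the six stages compose into a simple linear pipeline. The sketch below uses illustrative stub functions, not the real compressionkit API, just to show how data flows from loading through evaluation:

```python
# Illustrative pipeline sketch. Function names and signatures are
# hypothetical stand-ins for the real compressionkit stages.

def load_windows(edf_path, sample_rate_hz=64, frame_len=256):
    """Stages 1-2: load a PPG channel and cut fixed-length windows."""
    return [[0.0] * frame_len]  # placeholder window

def normalize(windows):
    """Stage 3: normalize (and optionally augment) each window."""
    return windows

def encode_decode(windows):
    """Stage 4: RVQ autoencoder round-trip (identity placeholder here)."""
    return [list(w) for w in windows]

def evaluate(originals, reconstructions):
    """Stage 5: compare original and reconstructed windows."""
    return {"mse": 0.0}

windows = load_windows("recordings/example.edf")
recon = encode_decode(normalize(windows))
metrics = evaluate(windows, recon)
```

Stage 6 (export) would then serialize the trained encoder; it is omitted here because the artifact formats are target-specific.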

Core Design Choices

Fixed-window processing

The default flow uses a fixed frame size so training, evaluation, and deployment all share the same model assumptions. That keeps exported inference predictable on constrained targets.
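A minimal sketch of that fixed-frame assumption, using non-overlapping frames over plain Python lists (the real implementation likely operates on cached tensors, and the frame length here is illustrative):

```python
def frame_signal(samples, frame_len=256, hop=256):
    """Split a 1-D sample sequence into fixed-length frames.

    Trailing samples that do not fill a complete frame are dropped, so
    every frame the model ever sees has exactly the same shape -- the
    property that keeps exported inference predictable.
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames
```

With `hop` equal to `frame_len` the frames are non-overlapping; a smaller hop would produce overlapping training windows at the cost of more data per recording.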

YAML-driven configuration

The full run is driven from YAML, which makes it easy to keep experiments reproducible and to compare configurations across compression ratios.
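A hypothetical config sketch is shown below to illustrate the shape of a YAML-driven run; the key names are invented for this example and may not match the actual schema of configs/ppg_rvq_08x.yaml:

```yaml
# Hypothetical sketch -- key names are illustrative, not the real schema.
data:
  sample_rate_hz: 64
  frame_len: 256
model:
  compression_ratio: 8
training:
  epochs: 100
  augment_noise: true
```

Keeping every knob in one file like this is what makes runs at different compression ratios directly comparable.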

Embedded-oriented outputs

The workflow does not stop at training. It produces deployment artifacts, evaluation summaries, and reconstruction samples in the same result directory.

Primary Inputs And Outputs

| Stage | Input | Output |
| --- | --- | --- |
| Data loading | EDF PPG recordings | Windowed training and validation datasets |
| Preprocessing | Raw or cached windows | Normalized model-ready tensors |
| Codec model | PPG frames | Quantized latent representation and reconstructed waveform |
| Evaluation | Original and reconstructed signals | MSE, PRD, cosine, plots, summary JSON |
| Export | Best encoder weights | LiteRT encoder and C header |
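The evaluation metrics named above can be computed directly from paired signals. A self-contained sketch in plain Python (no framework dependencies; the toolkit itself presumably computes these on tensors):

```python
import math

def reconstruction_metrics(original, reconstructed):
    """Return (MSE, PRD in percent, cosine similarity) for equal-length signals."""
    assert len(original) == len(reconstructed) and len(original) > 0
    n = len(original)
    sq_err = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    mse = sq_err / n
    energy = sum(o ** 2 for o in original)
    # Percent root-mean-square difference: distortion relative to signal energy.
    prd = 100.0 * math.sqrt(sq_err / energy)
    dot = sum(o * r for o, r in zip(original, reconstructed))
    cosine = dot / (math.sqrt(energy) * math.sqrt(sum(r ** 2 for r in reconstructed)))
    return mse, prd, cosine
```

A perfect reconstruction yields MSE 0, PRD 0%, and cosine 1; PRD is the conventional distortion figure for biosignal compression because it normalizes by the signal's own energy.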

Recommended Entry Points

Use the CLI when you want reproducible runs from config files:

train-ppg-rvq --config configs/ppg_rvq_08x.yaml

Or call the module directly:

python -m compressionkit.cli.train_ppg_rvq --config configs/ppg_rvq_08x.yaml

Four Reference Operating Points

compressionKIT currently documents four reference PPG variants for the v1 flow:

  • 2× for highest fidelity.
  • 4× for balanced quality and savings.
  • 8× for aggressive compression with strong signal preservation.
  • 16× for maximum reduction when system efficiency matters most.
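As a rough bitrate illustration of what these ratios mean, assuming 16-bit raw samples at 64 Hz (the bit depth is an assumption for this example, not something this page specifies):

```python
def bitrates(ratio, sample_rate_hz=64, bits_per_sample=16):
    """Raw and compressed bitrates (bits/s) for a given compression ratio.

    The 16-bit sample width is an assumed value for illustration.
    """
    raw_bps = sample_rate_hz * bits_per_sample
    return raw_bps, raw_bps / ratio
```

Under that assumption, a 64 Hz stream is 1024 bits/s raw, so the 2x-16x variants span roughly 512 down to 64 bits/s at the encoder output.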

The detailed walkthrough is in PPG 2x-16x Examples.

The workflow is also exposed through a customer-facing browser demo that shows reconstruction quality, compression tradeoffs, and live controls in a more visual format. See PPG Codec Demo.