# Model Training

## Introduction

Each task provides a mode to train a model on the specified features. The training mode can be invoked either via the CLI or from the `sleepkit` Python package. At a high level, the training mode performs the following actions based on the provided configuration parameters:
- Load the configuration data (e.g. `configuration.json`)
- Load features (e.g. `FS-W-A-5`)
- Initialize custom model architecture (e.g. `tcn`)
- Define the metrics, loss, and optimizer (e.g. `accuracy`, `categorical_crossentropy`, `adam`)
- Train the model (e.g. `model.fit`)
- Save artifacts (e.g. `model.keras`)
Example configuration:

```json
{
  "name": "sd-2-tcn-sm",
  "job_dir": "./results/sd-2-tcn-sm",
  "verbose": 2,
  "datasets": [{
    "name": "cmidss",
    "params": {
      "path": "./datasets/cmidss"
    }
  }],
  "feature": {
    "name": "FS-W-A-5",
    "sampling_rate": 0.2,
    "frame_size": 12,
    "loader": "hdf5",
    "feat_key": "features",
    "label_key": "detect_labels",
    "mask_key": "mask",
    "feat_cols": null,
    "save_path": "./datasets/store/fs-w-a-5-60",
    "params": {}
  },
  "sampling_rate": 0.0083333,
  "frame_size": 240,
  "num_classes": 2,
  "class_map": {"0": 0, "1": 1, "2": 1, "3": 1, "4": 1, "5": 1},
  "class_names": ["WAKE", "SLEEP"],
  "samples_per_subject": 100,
  "val_samples_per_subject": 100,
  "test_samples_per_subject": 50,
  "val_size": 4000,
  "test_size": 2500,
  "val_subjects": 0.20,
  "batch_size": 128,
  "buffer_size": 10000,
  "epochs": 200,
  "steps_per_epoch": 25,
  "val_steps_per_epoch": 25,
  "val_metric": "loss",
  "lr_rate": 1e-3,
  "lr_cycles": 1,
  "label_smoothing": 0,
  "test_metric": "f1",
  "test_metric_threshold": 0.02,
  "tflm_var_name": "sk_detect_flatbuffer",
  "tflm_file": "sk_detect_flatbuffer.h",
  "backend": "pc",
  "display_report": true,
  "quantization": {
    "qat": false,
    "mode": "INT8",
    "io_type": "int8",
    "concrete": true,
    "debug": false
  },
  "model_file": "model.keras",
  "use_logits": false,
  "architecture": {
    "name": "tcn",
    "params": {
      "input_kernel": [1, 5],
      "input_norm": "batch",
      "blocks": [
        {"depth": 1, "branch": 1, "filters": 16, "kernel": [1, 5], "dilation": [1, 1], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"},
        {"depth": 1, "branch": 1, "filters": 32, "kernel": [1, 5], "dilation": [1, 2], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"},
        {"depth": 1, "branch": 1, "filters": 48, "kernel": [1, 5], "dilation": [1, 4], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"},
        {"depth": 1, "branch": 1, "filters": 64, "kernel": [1, 5], "dilation": [1, 8], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"}
      ],
      "output_kernel": [1, 5],
      "include_top": true,
      "use_logits": true,
      "model_name": "tcn"
    }
  }
}
```
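One detail of the configuration worth highlighting is `class_map`, which collapses the six raw sleep-stage labels into the two classes named in `class_names`. The sketch below illustrates that remapping with plain `json` on a minimal subset of the example configuration (the subset and the sample labels are illustrative, not from a real dataset):

```python
import json

# Minimal, illustrative subset of the example configuration above.
config_text = """
{
  "name": "sd-2-tcn-sm",
  "num_classes": 2,
  "class_map": {"0": 0, "1": 1, "2": 1, "3": 1, "4": 1, "5": 1},
  "class_names": ["WAKE", "SLEEP"]
}
"""
params = json.loads(config_text)

# class_map collapses 6 raw stage labels into 2 classes:
# stage 0 (wake) -> class 0; stages 1-5 -> class 1 (sleep).
raw_labels = [0, 2, 4, 0, 5]
mapped = [params["class_map"][str(y)] for y in raw_labels]
named = [params["class_names"][y] for y in mapped]
print(mapped)  # [0, 1, 1, 0, 1]
print(named)   # ['WAKE', 'SLEEP', 'SLEEP', 'WAKE', 'SLEEP']
```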
```mermaid
graph LR
A("`Load
configuration
__TaskParams__
`")
B("`Load
features
__FeatureFactory__
`")
C("`Initialize
model
__ModelFactory__
`")
D("`Define
_metrics_, _loss_,
_optimizer_
`")
E("`Train
__model__
`")
F("`Save
__artifacts__
`")
A ==> B
B ==> C
subgraph "Model Training"
C ==> D
D ==> E
end
E ==> F
```
## Usage

### CLI
The following command will train a sleep stage model using the reference configuration.
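A sketch of the invocation is shown below. The exact flag names are assumptions based on the configuration above, not confirmed from the CLI itself; consult `sleepkit --help` and the `TaskParams` reference for the authoritative interface:

```shell
# Hedged sketch: task name, mode, and flag spellings are assumptions.
# Verify against `sleepkit --help` before use.
sleepkit --task detect --mode train --config ./configuration.json
```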
### Python

The model can be trained using the following snippet:
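The snippet below is a sketch of the Python workflow implied by the diagram above (`TaskParams`, task factory, `train`). The factory name `TaskFactory`, the task key `"detect"`, and the exact way `TaskParams` is constructed are assumptions; check the SleepKit API reference for the actual entry points:

```python
import json

import sleepkit as sk  # assumes the sleepkit package is installed

# Load the configuration JSON and build the task parameters.
# Assumption: TaskParams accepts the configuration keys shown above.
with open("./configuration.json") as f:
    cfg = json.load(f)
params = sk.TaskParams(**cfg)

# Assumption: tasks are resolved by name via a TaskFactory.
task = sk.TaskFactory.get("detect")

# Runs model.fit per the configuration and saves artifacts
# (e.g. model.keras) under params.job_dir.
task.train(params)
```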
## Arguments

Please refer to `TaskParams` for the list of arguments that can be used with the `train` command.
## Logging

SleepKit provides built-in support for logging to several third-party services, including Weights & Biases (WANDB) and TensorBoard.
### WANDB

The training mode can log all metrics and artifacts (i.e. models) to Weights & Biases (WANDB). To enable WANDB logging, set the environment variable `WANDB=1`. Remember to sign in prior to running experiments by running `wandb login`.
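The WANDB setup described above amounts to two commands, run before starting training:

```shell
# One-time sign-in to Weights & Biases.
wandb login

# Enable WANDB logging for subsequent training runs in this shell.
export WANDB=1
```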
### TensorBoard

The training mode can log all metrics to TensorBoard. To enable TensorBoard logging, set the environment variable `TENSORBOARD=1`.
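For example, enable the flag before training and then point the TensorBoard viewer at the results directory (the log location under `job_dir` is an assumption based on the example configuration):

```shell
# Enable TensorBoard logging for subsequent training runs in this shell.
export TENSORBOARD=1

# After training, inspect the logged metrics (log path is an assumption).
tensorboard --logdir ./results/sd-2-tcn-sm
```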