
Features

SleepKit includes a mini feature store for building the feature sets used to train and evaluate models. The store ships with several built-in feature set generators that cover a variety of use cases. Each feature set inherits from sk.FeatureSet and implements the logic for generating features from a dataset.

The main method, generate_subject_features, receives the target dataset name, the subject ID, and the high-level sk.TaskParams, whose feature field holds the feature-set-specific parameters. The resulting data and labels are stored in HDF5 format and can then be used for training via the sk.H5Dataloader class. Custom feature set generators can also be added to the feature store by subclassing sk.FeatureSet and registering the new generator with the feature store.
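
A custom generator might look roughly like the sketch below. It does not import SleepKit: `FeatureSet` here is a hypothetical stand-in base class, and the argument names (`ds_name`, `subject_id`, `params`), the `name` method, and the returned `(features, labels)` pair are assumptions, so check the real sk.FeatureSet interface before adapting it.

```python
from abc import ABC, abstractmethod
from statistics import mean, pstdev


class FeatureSet(ABC):
    """Hypothetical stand-in for sk.FeatureSet (the real interface may differ)."""

    @staticmethod
    @abstractmethod
    def name() -> str:
        """Slug used to look the feature set up."""

    @staticmethod
    @abstractmethod
    def generate_subject_features(ds_name: str, subject_id: str, params: dict):
        """Return (features, labels) for one subject."""


class ToyFeatureSet(FeatureSet):
    """Toy generator: per-frame mean and standard deviation of a 1-D signal."""

    @staticmethod
    def name() -> str:
        return "FS-TOY-2"

    @staticmethod
    def generate_subject_features(ds_name, subject_id, params):
        # Assumption: the raw signal and per-sample labels are passed in via
        # params; a real generator would load them from the dataset instead.
        signal, stages = params["signal"], params["stages"]
        frame = params["frame_size"]
        features, labels = [], []
        # Walk the signal in non-overlapping frames, dropping any partial tail.
        for start in range(0, len(signal) - frame + 1, frame):
            window = signal[start:start + frame]
            features.append([mean(window), pstdev(window)])
            labels.append(stages[start + frame - 1])  # label of the frame's last sample
        return features, labels


feats, labels = ToyFeatureSet.generate_subject_features(
    "toy",
    "s001",
    {"signal": [1, 3, 2, 2, 4, 8, 9], "stages": [0, 0, 0, 1, 1, 1, 1], "frame_size": 2},
)
```

With a frame size of 2, the seven-sample toy signal yields three frames, and the trailing sample is discarded.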

The overall feature generation workflow is:

  1. Load the configuration data (e.g. configuration.json (1))
  2. Load the datasets from DatasetFactory (e.g. mesa)
  3. Load the feature generator from FeatureFactory (e.g. FS-W-PA-14)
  4. Generate features for each dataset/subject (e.g. generate_subject_features)
  5. Save each subject's features to disk (e.g. subject.h5)

  (1) Example configuration:
    {
        "name": "sd-2-tcn-sm",
        "job_dir": "./results/sd-2-tcn-sm",
        "verbose": 2,
    
        "datasets": [{
            "name": "cmidss",
            "params": {
                "path": "./datasets/cmidss"
            }
        }],
    
        "feature": {
            "name": "FS-W-A-5",
            "sampling_rate": 0.2,
            "frame_size": 12,
            "loader": "hdf5",
            "feat_key": "features",
            "label_key": "detect_labels",
            "mask_key": "mask",
            "feat_cols": null,
            "save_path": "./datasets/store/fs-w-a-5-60",
            "params": {}
        },
    
        "sampling_rate": 0.0083333,
        "frame_size": 240,
    
        "num_classes": 2,
        "class_map": {
            "0": 0,
            "1": 1,
            "2": 1,
            "3": 1,
            "4": 1,
            "5": 1
        },
        "class_names": ["WAKE", "SLEEP"],
    
        "samples_per_subject": 100,
        "val_samples_per_subject": 100,
        "test_samples_per_subject": 50,
    
        "val_size": 4000,
        "test_size": 2500,
    
        "val_subjects": 0.20,
        "batch_size": 128,
        "buffer_size": 10000,
        "epochs": 200,
        "steps_per_epoch": 25,
        "val_steps_per_epoch": 25,
        "val_metric": "loss",
        "lr_rate": 1e-3,
        "lr_cycles": 1,
        "label_smoothing": 0,
    
        "test_metric": "f1",
        "test_metric_threshold": 0.02,
        "tflm_var_name": "sk_detect_flatbuffer",
        "tflm_file": "sk_detect_flatbuffer.h",
    
        "backend": "pc",
        "display_report": true,
    
        "quantization": {
            "qat": false,
            "mode": "INT8",
            "io_type": "int8",
            "concrete": true,
            "debug": false
        },
    
        "model_file": "model.keras",
        "use_logits": false,
        "architecture": {
            "name": "tcn",
            "params": {
                "input_kernel": [1, 5],
                "input_norm": "batch",
                "blocks": [
                    {"depth": 1, "branch": 1, "filters": 16, "kernel": [1, 5], "dilation": [1, 1], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"},
                    {"depth": 1, "branch": 1, "filters": 32, "kernel": [1, 5], "dilation": [1, 2], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"},
                    {"depth": 1, "branch": 1, "filters": 48, "kernel": [1, 5], "dilation": [1, 4], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"},
                    {"depth": 1, "branch": 1, "filters": 64, "kernel": [1, 5], "dilation": [1, 8], "dropout": 0.10, "ex_ratio": 1, "se_ratio": 4, "norm": "batch"}
                ],
                "output_kernel": [1, 5],
                "include_top": true,
                "use_logits": true,
                "model_name": "tcn"
            }
        }
    }
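
To make a few of these fields concrete, the sketch below reads a trimmed copy of the configuration and derives some quantities from it. It assumes that `sampling_rate` is given in Hz and `frame_size` in samples, which the configuration itself does not state; the helper names `frame_seconds` and `remap_labels` are illustrative and not part of SleepKit.

```python
import json

# Trimmed copy of the example configuration (only the fields used here).
CONFIG = json.loads("""
{
    "feature": {"name": "FS-W-A-5", "sampling_rate": 0.2, "frame_size": 12},
    "sampling_rate": 0.0083333,
    "frame_size": 240,
    "num_classes": 2,
    "class_map": {"0": 0, "1": 1, "2": 1, "3": 1, "4": 1, "5": 1},
    "class_names": ["WAKE", "SLEEP"]
}
""")


def frame_seconds(sampling_rate: float, frame_size: int) -> float:
    """Duration of one frame, assuming a rate in Hz and a size in samples."""
    return frame_size / sampling_rate


def remap_labels(labels, class_map):
    """Apply class_map; JSON object keys are strings, so convert them to int."""
    int_map = {int(src): dst for src, dst in class_map.items()}
    return [int_map[label] for label in labels]


feature_cfg = CONFIG["feature"]
feature_frame_s = frame_seconds(feature_cfg["sampling_rate"], feature_cfg["frame_size"])
model_frame_h = frame_seconds(CONFIG["sampling_rate"], CONFIG["frame_size"]) / 3600

# Source labels 0 through 5 collapse onto the two target classes.
remapped = remap_labels([0, 1, 2, 3, 4, 5], CONFIG["class_map"])
names = [CONFIG["class_names"][c] for c in remapped]

print(f"feature frame: {feature_frame_s:.0f} s")
print(f"model frame:   {model_frame_h:.1f} h")
print(names)
```

Under these assumptions, a feature frame spans about 60 seconds and a model input frame about 8 hours, while the class map sends source label 0 to WAKE and labels 1 through 5 to SLEEP.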
    


Load configuration (TaskParams) → Load datasets (DatasetFactory) → Load feature generator (FeatureFactory) → Generate features → Save features

Available Feature Sets

  • FS-W-PA-14: 14 features derived from PPG and IMU on wrist for sleep stage classification.
  • FS-C-EAR-9: 9 features derived from ECG, RSP, and IMU on chest for sleep stage classification.
  • FS-W-A-5: 5 features derived from IMU on wrist for actigraphy-style sleep detection.
  • FS-H-E-10: 10 features derived from ECG and EOG on head for sleep stage classification.
  • FS-W-P-5: 5 features derived from PPG on wrist for sleep apnea classification.
  • BYOFS: Bring-Your-Own-Feature-Set (BYOFS) to create custom feature sets.
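
The built-in slugs above appear to follow a pattern: `FS`, a body-location letter, a sensor code, and the feature count. The sketch below parses that pattern. The convention is inferred from the list rather than documented, and the lookup table and the `parse_slug` function are illustrative names, not part of SleepKit.

```python
import re

# Location letters as they appear in the list above (inferred, not documented).
LOCATIONS = {"W": "wrist", "C": "chest", "H": "head"}

SLUG_RE = re.compile(r"^FS-(?P<loc>[A-Z])-(?P<sensors>[A-Z]+)-(?P<count>\d+)$")


def parse_slug(slug: str) -> dict:
    """Split a feature set slug into location, sensor code, and feature count."""
    match = SLUG_RE.match(slug)
    if match is None:
        raise ValueError(f"unrecognized feature set slug: {slug!r}")
    return {
        "location": LOCATIONS.get(match["loc"], "unknown"),
        "sensors": match["sensors"],
        "num_features": int(match["count"]),
    }


info = parse_slug("FS-W-PA-14")
```

Slugs that do not fit the pattern, such as BYOFS, raise a `ValueError` in this sketch.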

Feature Factory

The feature factory, sk.FeatureFactory, provides a convenient way to access the feature sets. The factory is a thread-safe singleton class that provides a single point of access to the feature sets via their slug names. Its main benefit is that new feature sets can be registered and then used by both existing and new tasks.

The feature factory provides the following methods:

  • sk.FeatureFactory.register: Register a custom feature set
  • sk.FeatureFactory.unregister: Unregister a custom feature set
  • sk.FeatureFactory.has: Check if a feature set is registered
  • sk.FeatureFactory.get: Get a feature set
  • sk.FeatureFactory.list: List all available feature sets
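
The registry pattern behind these methods can be sketched as follows. This is a self-contained stand-in, not SleepKit's implementation: the method names come from the list above, but the argument names, the class-level state used to model the singleton, and the choice to reject duplicate registrations are all assumptions.

```python
import threading


class FeatureFactory:
    """Hypothetical stand-in for sk.FeatureFactory: a thread-safe registry."""

    # Class-level state shared by all callers, guarded by a single lock.
    _lock = threading.Lock()
    _registry = {}

    @classmethod
    def register(cls, name, feature_set):
        """Register a feature set under a slug (assumed to reject duplicates)."""
        with cls._lock:
            if name in cls._registry:
                raise ValueError(f"feature set {name!r} is already registered")
            cls._registry[name] = feature_set

    @classmethod
    def unregister(cls, name):
        """Remove a feature set; unknown slugs are ignored in this sketch."""
        with cls._lock:
            cls._registry.pop(name, None)

    @classmethod
    def has(cls, name):
        """Return True if the slug is registered."""
        with cls._lock:
            return name in cls._registry

    @classmethod
    def get(cls, name):
        """Return the registered feature set; raises KeyError if missing."""
        with cls._lock:
            return cls._registry[name]

    @classmethod
    def list(cls):
        """Return all registered slugs in sorted order."""
        with cls._lock:
            return sorted(cls._registry)


class MyFeatureSet:
    """Placeholder for a sk.FeatureSet subclass."""


FeatureFactory.register("FS-CUSTOM-1", MyFeatureSet)
generator = FeatureFactory.get("FS-CUSTOM-1")
```

Keeping the registry behind one lock means concurrent registrations and lookups always see a consistent dictionary.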