dataset
Classes
Dataset
Dataset serves as a base class to download and provide unified access to datasets.
Parameters:
- path (PathLike | None, default: None) – Dataset base path. Defaults to None.
Example:
from __future__ import annotations  # annotations stay unevaluated at runtime

import contextlib
from typing import Generator

import numpy as np
import numpy.typing as npt
import sleepkit as sk

# PatientData and PatientGenerator appear in annotations only; they are
# assumed to be type aliases provided by sleepkit.


class MyDataset(sk.Dataset):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    @property
    def name(self) -> str:
        return 'my-dataset'

    @property
    def sampling_rate(self) -> int:
        return 100

    def get_train_patient_ids(self) -> npt.NDArray:
        return np.arange(80)

    def get_test_patient_ids(self) -> npt.NDArray:
        return np.arange(80, 100)

    @contextlib.contextmanager
    def patient_data(self, patient_id: int) -> Generator[PatientData, None, None]:
        # Random placeholder signal and segment boundaries
        data = np.random.randn(1000)
        segs = np.random.randint(0, 1000, (10, 2))
        yield {"data": data, "segmentations": segs}

    def signal_generator(
        self,
        patient_generator: PatientGenerator,
        frame_size: int,
        samples_per_patient: int = 1,
        target_rate: int | None = None,
    ) -> Generator[npt.NDArray, None, None]:
        for patient in patient_generator:
            for _ in range(samples_per_patient):
                with self.patient_data(patient) as pt:
                    yield pt["data"]

    def download(self, num_workers: int | None = None, force: bool = False):
        pass


# Register dataset
sk.DatasetFactory.register("my-dataset", MyDataset)
Source code in sleepkit/datasets/dataset.py
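The final line of the example registers the class under a string name. The snippet below is a minimal stand-alone sketch of that name-to-class registry pattern. `Registry`, its `get` method, the overwrite check, and `ToyDataset` are hypothetical names chosen for illustration; this is not sleepkit's `DatasetFactory`, whose behavior beyond `register` is not described here.

```python
from typing import Dict


class Registry:
    """Hypothetical name -> class registry (not sleepkit's DatasetFactory)."""

    def __init__(self) -> None:
        self._items: Dict[str, type] = {}

    def register(self, name: str, cls: type) -> None:
        # Assumption: refuse silent overwrites so a typo cannot shadow an entry.
        if name in self._items:
            raise ValueError(f"'{name}' is already registered")
        self._items[name] = cls

    def get(self, name: str) -> type:
        # Raises KeyError for unknown names.
        return self._items[name]


class ToyDataset:
    """Stand-in for a dataset class; only stores its base path."""

    def __init__(self, path=None):
        self.path = path


registry = Registry()
registry.register("toy", ToyDataset)

# Look the class up by name, then construct it like any other class.
ds = registry.get("toy")(path="./data")
```

Looking classes up by name lets configuration files refer to a dataset by string without importing the class directly.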
Attributes
subject_ids (property)
Functions
uniform_subject_generator
uniform_subject_generator(
subject_ids: list[str] | None = None, repeat: bool = True, shuffle: bool = True
) -> SubjectGenerator
Yield data for each subject in the array.
Parameters:
- subject_ids (ArrayLike, default: None) – Array of subject ids.
- repeat (bool, default: True) – Whether to repeat the generator. Defaults to True.
- shuffle (bool, default: True) – Whether to shuffle subject ids. Defaults to True.
Returns:
- SubjectGenerator (SubjectGenerator) – Subject generator
Yields:
- SubjectGenerator – Iterator[SubjectGenerator]
Source code in sleepkit/datasets/dataset.py
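The interaction of `repeat` and `shuffle` can be illustrated with a small stand-alone generator. This is a sketch under assumptions, not the library's implementation: it assumes the generator yields subject ids one at a time and reshuffles at the start of every pass. The function name `uniform_subject_generator_sketch` and the `seed` parameter are hypothetical additions for reproducibility.

```python
import itertools
import random
from typing import Iterator, Optional, Sequence


def uniform_subject_generator_sketch(
    subject_ids: Sequence[str],
    repeat: bool = True,
    shuffle: bool = True,
    seed: Optional[int] = None,  # hypothetical parameter, not in the documented signature
) -> Iterator[str]:
    """Yield each subject id once per pass, optionally shuffled and repeated."""
    ids = list(subject_ids)
    if not ids:
        return  # avoid spinning forever on an empty input
    rng = random.Random(seed)
    while True:
        if shuffle:
            rng.shuffle(ids)  # new order on every pass
        yield from ids
        if not repeat:
            break


# A single ordered pass terminates on its own.
one_pass = list(
    uniform_subject_generator_sketch(["a", "b", "c"], repeat=False, shuffle=False)
)

# With repeat=True the generator is endless, so bound it explicitly.
first_six = list(
    itertools.islice(uniform_subject_generator_sketch(["a", "b", "c"], seed=0), 6)
)
```

Because every id appears exactly once per pass, each subject is visited equally often over many passes, which is presumably what "uniform" refers to.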
download
Download dataset
This will download preprocessed HDF5 files from S3.
Parameters:
- num_workers (int | None, default: None) – Number of parallel workers. Defaults to None.
- force (bool, default: False) – Force redownload. Defaults to False.
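One way `num_workers` and `force` could interact is sketched below with a thread pool and a skip-if-present check. This is an illustration under assumptions, not sleepkit's download code: `download_files`, the `fetch` callback, and `fake_fetch` are hypothetical, and the real S3 transfer is replaced by writing a small local file.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List, Optional


def download_files(
    names: Iterable[str],
    dest: Path,
    fetch: Callable[[str, Path], None],
    num_workers: Optional[int] = None,
    force: bool = False,
) -> List[str]:
    """Fetch each name into dest in parallel; return the names actually fetched."""
    dest.mkdir(parents=True, exist_ok=True)
    # Skip files already on disk unless force is set.
    todo = [n for n in names if force or not (dest / n).exists()]
    # max_workers=None lets the executor pick its default pool size.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # list() drains the iterator so worker exceptions surface here.
        list(pool.map(lambda n: fetch(n, dest / n), todo))
    return todo


def fake_fetch(name: str, target: Path) -> None:
    # Stand-in for a real remote download.
    target.write_text(f"contents of {name}")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = download_files(["a.h5", "b.h5"], root, fake_fetch, num_workers=2)
    second = download_files(["a.h5", "b.h5"], root, fake_fetch, num_workers=2)
    forced = download_files(["a.h5", "b.h5"], root, fake_fetch, force=True)
```

In this sketch the second call fetches nothing because both files already exist, while `force=True` fetches them again.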