icentia11k
Classes
IcentiaRhythm
Icentia rhythm labels
IcentiaBeat
Icentia beat labels
IcentiaDataset
The Icentia11k dataset consists of ECG recordings from 11,000 patients and 2 billion labelled beats.
Parameters:
Source code in heartkit/datasets/icentia11k.py
Attributes
patient_ids
property
Functions
get_train_patient_ids
get_test_patient_ids
label_key
Get local label key for HDF5 file
Parameters:
- label_type (str, default: 'rhythm') – Label type. Defaults to "rhythm".
Returns:
- str – Label key
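As a rough illustration of this lookup, the sketch below maps a label type to the HDF5 dataset names listed under patient_data (rlabels and blabels). The function name `label_key_sketch` and the `"beat"` label-type string are assumptions made for the example; only `"rhythm"` is confirmed by this page.

```python
def label_key_sketch(label_type: str = "rhythm") -> str:
    """Return the HDF5 dataset name for a label type (illustrative only)."""
    # Assumed mapping: "rhythm" -> "rlabels", "beat" -> "blabels".
    keys = {"rhythm": "rlabels", "beat": "blabels"}
    if label_type not in keys:
        raise ValueError(f"Invalid label type: {label_type!r}")
    return keys[label_type]


print(label_key_sketch())        # rlabels
print(label_key_sketch("beat"))  # blabels
```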
patient_data
Get patient data
Patient data is stored in HDF5 format with the following structure:
- {segment_id}/data: ECG data (1 x N)
- {segment_id}/rlabels: Rhythm labels (N x 2)
- {segment_id}/blabels: Beat labels (N x 2)
segment_id is a sequential number for each segment in the patient data.
Parameters:
- patient_id (int) – Patient ID
Returns:
- Generator[h5py.Group, None, None] – Patient data
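The segment layout described above can be walked with ordinary mapping access, since h5py groups expose a dictionary-like interface. The sketch below is a minimal, hypothetical helper (`iter_segments` is not part of the library) and uses a plain nested dict with made-up values as a stand-in for one patient's HDF5 group. The meaning of the two label columns is not stated on this page, so the example only counts rows.

```python
from collections.abc import Iterator, Mapping
from typing import Any


def iter_segments(patient: Mapping[str, Any]) -> Iterator[tuple[str, Any, Any, Any]]:
    """Yield (segment_id, data, rlabels, blabels) for each segment of a patient."""
    for segment_id, segment in patient.items():
        yield segment_id, segment["data"], segment["rlabels"], segment["blabels"]


# Stand-in for one patient's HDF5 group; segment names and values are made up.
fake_patient = {
    "0": {"data": [0.1, 0.2, 0.3, 0.4], "rlabels": [[0, 1], [3, 1]], "blabels": [[1, 2]]},
    "1": {"data": [0.0, -0.1], "rlabels": [], "blabels": [[0, 1]]},
}

for segment_id, data, rlabels, blabels in iter_segments(fake_patient):
    print(segment_id, len(data), len(rlabels), len(blabels))
```

With a real file, the same function should accept the group yielded by patient_data, assuming each segment holds the three datasets listed above.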
signal_generator
signal_generator(
patient_generator: PatientGenerator, frame_size: int, samples_per_patient: int = 1, target_rate: int | None = None
) -> Generator[npt.NDArray, None, None]
Generate random frames.
Parameters:
- patient_generator (PatientGenerator) – Generator that yields patient data.
- frame_size (int) – Frame size
- samples_per_patient (int, default: 1) – Samples per patient. Defaults to 1.
- target_rate (int | None, default: None) – Target rate. Defaults to None.
Returns:
- SampleGenerator – Generator of input data of shape (frame_size, 1)
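To make "random frames" concrete, here is a minimal sketch of drawing fixed-size windows at random offsets from a single 1-D signal. It is an illustration only: `random_frames` is a hypothetical helper, it works on a plain Python sequence rather than patient HDF5 data, and it does not resample to target_rate.

```python
import random
from collections.abc import Iterator, Sequence
from typing import Optional


def random_frames(
    signal: Sequence[float],
    frame_size: int,
    samples: int = 1,
    rng: Optional[random.Random] = None,
) -> Iterator[list[list[float]]]:
    """Yield `samples` random contiguous windows, each shaped (frame_size, 1)."""
    rng = rng or random.Random()
    if len(signal) < frame_size:
        return  # Signal too short to hold a single frame.
    for _ in range(samples):
        start = rng.randint(0, len(signal) - frame_size)  # inclusive bounds
        yield [[value] for value in signal[start : start + frame_size]]


signal = [float(i) for i in range(20)]
frames = list(random_frames(signal, frame_size=5, samples=3, rng=random.Random(0)))
print(len(frames), len(frames[0]), len(frames[0][0]))  # 3 5 1
```

Passing a seeded `random.Random` keeps the draws reproducible, which is convenient when comparing runs.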
download
Download dataset
This will download preprocessed HDF5 files from S3.
Parameters:
- num_workers (int | None, default: None) – Number of parallel workers. Defaults to None.
- force (bool, default: False) – Force redownload. Defaults to False.
split_train_test_patients
split_train_test_patients(
patient_ids: NDArray, test_size: float, label_map: dict[int, int] | None = None, label_type: str | None = None
) -> list[list[int]]
Perform train/test split on patients for given task.
Parameters:
- patient_ids (NDArray) – Patient IDs
- test_size (float) – Test size
- label_map (dict[int, int], default: None) – Label map. Defaults to None.
- label_type (str, default: None) – Label type. Defaults to None.
Returns:
- list[list[int]] – Train and test patient IDs
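The key idea of a patient-level split is that every patient lands in exactly one of the two sets, so no patient's recordings leak from train into test. The sketch below shows that idea with a plain shuffle. It is a simplified stand-in, not the library's implementation: `split_patients` and its `seed` argument are hypothetical, and it ignores label_map and label_type, whose effect on the real split is not described on this page.

```python
import random


def split_patients(patient_ids: list[int], test_size: float, seed: int = 0) -> list[list[int]]:
    """Shuffle patient IDs and split them into [train_ids, test_ids]."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be between 0 and 1 (exclusive)")
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one patient in the test set.
    num_test = max(1, round(len(shuffled) * test_size))
    return [shuffled[num_test:], shuffled[:num_test]]


train_ids, test_ids = split_patients(list(range(10)), test_size=0.2)
print(len(train_ids), len(test_ids))  # 8 2
```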
filter_patients_for_labels
filter_patients_for_labels(
patient_ids: NDArray, label_map: dict[int, int] | None = None, label_type: str | None = None
) -> npt.NDArray
Filter patients based on labels. Useful for removing patients without labels for a task, which speeds up data loading.
Parameters:
- patient_ids (NDArray) – Patient IDs
- label_map (dict[int, int], default: None) – Label map. Defaults to None.
- label_type (str, default: None) – Label type. Defaults to None.
Returns:
- NDArray – Filtered patient IDs
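A minimal sketch of this kind of filtering is shown below. It rests on assumptions that this page does not confirm: that label_map keys are the dataset's source labels, that a patient is kept when at least one of their labels appears among those keys, and that no filtering happens when label_map is None. The helper `filter_ids_sketch` and the precomputed `labels_by_patient` lookup are hypothetical.

```python
from typing import Optional


def filter_ids_sketch(
    patient_ids: list[int],
    labels_by_patient: dict[int, set[int]],
    label_map: Optional[dict[int, int]] = None,
) -> list[int]:
    """Keep patients that have at least one label present in label_map."""
    if label_map is None:
        return list(patient_ids)  # Assumed behaviour: nothing to filter on.
    wanted = set(label_map)  # Source labels the task cares about.
    return [pid for pid in patient_ids if labels_by_patient.get(pid, set()) & wanted]


labels_by_patient = {1: {0, 2}, 2: {5}, 3: set()}
print(filter_ids_sketch([1, 2, 3], labels_by_patient, {2: 0, 3: 1}))  # [1]
```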
get_patients_labels
get_patients_labels(patient_ids: NDArray, label_map: dict[int, int], label_type: str = 'rhythm') -> list[list[int]]
Get class labels for each patient
Parameters:
- patient_ids (NDArray) – Patient IDs
- label_map (dict[int, int]) – Label map
- label_type (str, default: 'rhythm') – Label type. Defaults to "rhythm".
Returns:
- list[list[int]] – Class labels for each patient
get_patient_labels
get_patient_labels(patient_id: int, label_map: dict[int, int], label_type: str = 'rhythm') -> list[int]
Get class labels for patient
Parameters:
- patient_id (int) – Patient ID
- label_map (dict[int, int]) – Label map
- label_type (str, default: 'rhythm') – Label type. Defaults to "rhythm".
Returns:
- list[int] – Class labels for patient
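One way to picture how a label map turns raw labels into class labels is the sketch below. It assumes label_map maps source labels to task classes and that labels missing from the map are dropped. Whether the library returns unique, sorted classes or one entry per occurrence is not stated here, so the deduplication is also an assumption, and `patient_classes_sketch` is a hypothetical name.

```python
def patient_classes_sketch(raw_labels: list[int], label_map: dict[int, int]) -> list[int]:
    """Map raw labels to task classes, dropping labels that are not in label_map."""
    return sorted({label_map[label] for label in raw_labels if label in label_map})


print(patient_classes_sketch([1, 1, 4, 2, 9], {1: 0, 2: 1, 4: 1}))  # [0, 1]
```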
download_raw_dataset
Downloads the full Icentia dataset zipfile and converts it into individual patient HDF5 files.
NOTE: This is a very long process (e.g. 24 hrs). Please use icentia11k.download_dataset instead.
Parameters:
- force (bool, default: False) – Whether to force re-download if destination exists. Defaults to False.
- num_workers (int, default: None) – Number of parallel workers. Defaults to os.cpu_count().