ptbxl
Classes
PtbxlScpCode
PTBXL SCP codes
PtbxlDataset
The PTBXL dataset consists of 21837 clinical 12-lead ECGs from 18885 patients.
Parameters:
Source code in heartkit/datasets/ptbxl.py
Attributes
patient_ids
property
Functions
get_train_patient_ids
get_test_patient_ids
label_key
patient_data
Get patient data
Note
If the dataset is cacheable, data is cached in memory and returned as a dict. Otherwise, data is provided as HDF5 objects.
Patient Data Format
- data: ECG data of shape (12, N)
- slabels: SCP labels of shape (N, 2)
- blabels: Beat labels of shape (N, 2)
Parameters:
- patient_id (int) – Patient ID
Returns:
- Generator[PatientData, None, None] – Patient data
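To make the patient data format above concrete, the following sketch builds an in-memory stand-in for one record with NumPy. The `make_fake_patient` helper, the 1000-sample length, and the label counts are assumptions for illustration only; they are not taken from the heartkit source, and the meaning of the two label columns is deliberately left open.

```python
import numpy as np


def make_fake_patient(
    num_samples: int = 1000, num_scp: int = 3, num_beats: int = 10, seed: int = 0
) -> dict:
    """Hypothetical helper: build a synthetic record with the documented shapes."""
    rng = np.random.default_rng(seed)
    return {
        # 12 leads by num_samples time steps
        "data": rng.standard_normal((12, num_samples)).astype(np.float32),
        # One row per SCP label; the meaning of the two columns is not assumed here
        "slabels": rng.integers(0, 5, size=(num_scp, 2)),
        # One row per beat label
        "blabels": rng.integers(0, 5, size=(num_beats, 2)),
    }


patient = make_fake_patient()
first_lead = patient["data"][0]  # 1-D array with num_samples entries
```

Indexing the first axis of `data` selects a lead, so downstream code that wants single-lead frames only needs to slice along the second axis.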
signal_generator
signal_generator(
patient_generator: PatientGenerator, frame_size: int, samples_per_patient: int = 1, target_rate: int | None = None
) -> Generator[npt.NDArray, None, None]
Generate random frames.
Parameters:
- patient_generator (PatientGenerator) – Generator that yields patient data.
- frame_size (int) – Frame size
- samples_per_patient (int, default: 1) – Samples per patient.
- target_rate (int | None, default: None) – Target rate.
Returns:
- Generator[npt.NDArray, None, None] – Generator of input data of shape (frame_size, 1)
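The idea of drawing random frames can be sketched without the library. The version below is an illustrative assumption, not heartkit's implementation: `random_frames` is a hypothetical name, patients are assumed to be dicts with a `data` array of shape (12, N), a random lead is picked for each frame, and resampling to `target_rate` is omitted.

```python
from collections.abc import Generator, Iterable
from typing import Optional

import numpy as np


def random_frames(
    patients: Iterable[dict],
    frame_size: int,
    samples_per_patient: int = 1,
    seed: Optional[int] = None,
) -> Generator[np.ndarray, None, None]:
    """Yield random single-lead frames of shape (frame_size, 1)."""
    rng = np.random.default_rng(seed)
    for patient in patients:
        data = patient["data"]  # assumed shape: (12, N)
        num_leads, num_samples = data.shape
        if num_samples < frame_size:
            continue  # record too short to hold one frame
        for _ in range(samples_per_patient):
            lead = rng.integers(num_leads)
            # The upper bound is exclusive, so the frame always fits
            start = rng.integers(0, num_samples - frame_size + 1)
            yield data[lead, start : start + frame_size].reshape(frame_size, 1)


fake_patients = [{"data": np.zeros((12, 500), dtype=np.float32)} for _ in range(3)]
frames = list(random_frames(fake_patients, frame_size=128, samples_per_patient=2, seed=0))
```

With three synthetic patients and two samples per patient, the generator yields six frames in total.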
signal_label_generator
signal_label_generator(
patient_generator: PatientGenerator,
frame_size: int,
samples_per_patient: int = 1,
target_rate: int | None = None,
label_map: dict[int, int] | None = None,
label_type: str = "scp",
label_format: str | None = None,
) -> Generator[tuple[npt.NDArray, int], None, None]
Generate frames with labels using a patient generator.
Parameters:
- patient_generator (PatientGenerator) – Patient generator
- frame_size (int) – Frame size
- samples_per_patient (int, default: 1) – Samples per patient.
- target_rate (int | None, default: None) – Target rate.
- label_map (dict[int, int] | None, default: None) – Label map.
- label_type (str, default: 'scp') – Label type.
- label_format (str | None, default: None) – Label format.
Returns:
- Generator[tuple[npt.NDArray, int], None, None] – Generator of input data and labels
Yields:
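A labeled variant of frame generation can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: `labeled_frames` is a hypothetical name, column 0 of `slabels` is assumed to hold the raw code, `label_map` is assumed to map raw codes to class indices, and patients with no mapped code are skipped.

```python
from collections.abc import Generator, Iterable
from typing import Optional

import numpy as np


def labeled_frames(
    patients: Iterable[dict],
    frame_size: int,
    label_map: dict,
    samples_per_patient: int = 1,
    seed: Optional[int] = None,
) -> Generator[tuple, None, None]:
    """Yield (frame, class_index) pairs for patients that have a mapped label."""
    rng = np.random.default_rng(seed)
    for patient in patients:
        data, slabels = patient["data"], patient["slabels"]
        # Assumption: column 0 of slabels holds the raw code
        classes = sorted({label_map[int(c)] for c in slabels[:, 0] if int(c) in label_map})
        if not classes or data.shape[1] < frame_size:
            continue  # nothing to label, or record too short
        for _ in range(samples_per_patient):
            lead = rng.integers(data.shape[0])
            start = rng.integers(0, data.shape[1] - frame_size + 1)
            label = int(rng.choice(classes))
            yield data[lead, start : start + frame_size].reshape(frame_size, 1), label


fake_patients = [
    {"data": np.zeros((12, 300), dtype=np.float32), "slabels": np.array([[3, 1], [7, 1]])},
    {"data": np.zeros((12, 300), dtype=np.float32), "slabels": np.array([[9, 1]])},
]
samples = list(
    labeled_frames(fake_patients, frame_size=100, label_map={3: 0}, samples_per_patient=2, seed=0)
)
```

Only the first synthetic patient carries a code present in `label_map`, so both yielded samples come from that patient and carry class 0.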
split_train_test_patients
split_train_test_patients(
patient_ids: NDArray,
test_size: float,
label_map: dict[int, int] | None = None,
label_type: str | None = None,
label_threshold: int | None = 2,
) -> list[list[int]]
Perform a train/test split on patients for the given task. Note: only inter-patient splits are performed, not intra-patient splits, so no patient appears in both sets.
Parameters:
- patient_ids (NDArray) – Patient ids
- test_size (float) – Test size
- label_map (dict[int, int] | None, default: None) – Label map.
- label_type (str | None, default: None) – Label type.
- label_threshold (int | None, default: 2) – Label threshold.
Returns:
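The inter-patient property can be illustrated with a plain shuffle-and-cut over patient IDs. This sketch is an assumption for illustration: `split_patients` is a hypothetical helper, and it ignores `label_map`, `label_type`, and `label_threshold`, whose exact effect on the split is not described in this page.

```python
import random


def split_patients(patient_ids: list, test_size: float, seed: int = 0) -> tuple:
    """Split patient IDs so that each patient lands in exactly one set."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    ids = sorted(set(patient_ids))  # deduplicate so a patient cannot appear twice
    random.Random(seed).shuffle(ids)
    num_test = max(1, round(len(ids) * test_size))
    return ids[num_test:], ids[:num_test]


train_ids, test_ids = split_patients(list(range(100)), test_size=0.2)
```

Because the split is made on patient IDs rather than on individual frames, every frame from a given patient ends up on the same side.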
filter_patients_for_labels
filter_patients_for_labels(
patient_ids: NDArray, label_map: dict[int, int] | None = None, label_type: str | None = None
) -> npt.NDArray
Filter patients based on labels. Useful for removing patients without labels for the task, which speeds up data loading.
Parameters:
- patient_ids (NDArray) – Patient ids
- label_map (dict[int, int] | None, default: None) – Label map.
- label_type (str | None, default: None) – Label type.
Returns:
- npt.NDArray – Filtered patient ids
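The filtering step can be sketched against an in-memory table of codes. Everything here is an illustrative assumption: `filter_ids` is a hypothetical helper, and `codes_by_patient` stands in for whatever per-patient label lookup the dataset performs.

```python
from typing import Optional


def filter_ids(
    patient_ids: list, codes_by_patient: dict, label_map: Optional[dict] = None
) -> list:
    """Keep only patients that have at least one code present in label_map."""
    if label_map is None:
        return list(patient_ids)  # no task labels given, so nothing to filter on
    return [
        pid
        for pid in patient_ids
        if any(code in label_map for code in codes_by_patient.get(pid, []))
    ]


codes_by_patient = {1: [3, 7], 2: [9], 3: []}
kept = filter_ids([1, 2, 3], codes_by_patient, label_map={3: 0, 4: 1})
```

Dropping unlabeled patients up front means later generators never have to load a record only to discard it.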
get_patients_labels
get_patients_labels(patient_ids: NDArray, label_map: dict[int, int], label_type: str = 'scp') -> list[list[int]]
Get class labels for each patient
Parameters:
- patient_ids (NDArray) – Patient ids
- label_map (dict[int, int]) – Label map
- label_type (str, default: 'scp') – Label type.
Returns:
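The `list[list[int]]` return type suggests one list of class labels per patient. A minimal sketch of that mapping, assuming `label_map` translates raw codes to class indices and using a hypothetical `codes_by_patient` lookup in place of the dataset's storage:

```python
def map_codes(codes: list, label_map: dict) -> list:
    """Translate raw codes to class indices, dropping unmapped codes and duplicates."""
    return sorted({label_map[code] for code in codes if code in label_map})


def labels_for_patients(patient_ids: list, codes_by_patient: dict, label_map: dict) -> list:
    """Return one list of class indices per patient, in input order."""
    return [map_codes(codes_by_patient.get(pid, []), label_map) for pid in patient_ids]


codes_by_patient = {1: [3, 7, 4], 2: [9], 3: [4, 4]}
labels = labels_for_patients([1, 2, 3], codes_by_patient, label_map={3: 0, 4: 1})
```

A patient with no mapped code gets an empty list, which keeps the output aligned with the input IDs.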
get_patient_scp_codes
Get SCP codes for patient
Parameters:
- patient_id (int) – Patient id
Returns:
get_patient_labels
get_patient_labels(patient_id: int, label_map: dict[int, int], label_type: str = 'scp') -> list[int]
Get class labels for patient
Parameters:
- patient_id (int) – Patient id
- label_map (dict[int, int]) – Label map
- label_type (str, default: 'scp') – Label type.
Returns:
download
Download dataset
This will download preprocessed HDF5 files from S3.
Parameters:
- num_workers (int | None, default: None) – Number of parallel workers.
- force (bool, default: False) – Force redownload.
download_raw_dataset
Downloads the full dataset zipfile and converts it into individual patient HDF5 files.
Parameters:
- force (bool, default: False) – Whether to force re-download if the destination exists.
- num_workers (int, default: None) – Number of parallel workers. Defaults to os.cpu_count().
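The `num_workers` fallback to `os.cpu_count()` follows a common fan-out pattern, sketched below. This is an assumption-laden illustration, not the library's code: `convert_patient` and `convert_all` are hypothetical names, the file name format is invented, no HDF5 file is written, and a thread pool is used here although the real conversion may use a different executor.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def convert_patient(patient_id: int) -> str:
    """Hypothetical stand-in for converting one patient's records to an HDF5 file."""
    return f"{patient_id:05d}.h5"  # illustrative file name only


def convert_all(patient_ids: list, num_workers: Optional[int] = None) -> list:
    # Fall back to the CPU count when no worker count is given
    workers = num_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order even though work runs concurrently
        return list(pool.map(convert_patient, patient_ids))


outputs = convert_all([1, 2, 3], num_workers=2)
```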