Model Evaluation
Introduction
Evaluate mode is used to test the performance of the model on the reserved test set for the specified task. Similar to training, the routine can be customized via CLI configuration file or by setting the parameters directly in the code. The evaluation process involves testing the model's performance on the test data to measure its accuracy, precision, recall, and F1 score. A number of results and metrics will be generated and saved to the job_dir
.
Usage
Example
The following command will evaluate the rhythm model using the reference configuration:
CLI Python
heartkit --mode evaluate --task rhythm --config ./configs/rhythm-class-2.json
from pathlib import Path
import heartkit as hk
task = hk . TaskFactory . get ( "rhythm" )
task . evaluate ( hk . HKTestParams (
job_dir = Path ( "./results/rhythm-class-2" ),
ds_path = Path ( "./datasets" ),
datasets = [{
"name" : "icentia11k" ,
"params" : {}
}],
num_classes = 2 ,
class_map = {
0 : 0 ,
1 : 1 ,
2 : 1
},
class_names = [
"NONE" , "AFIB/AFL"
],
sampling_rate = 200 ,
frame_size = 800 ,
test_samples_per_patient = [ 100 , 800 ],
test_patients = 1000 ,
test_size = 100000 ,
data_parallelism = lambda : os . cpu_count () or 1 ,
preprocesses = [
hk . PreprocessParams (
name = "znorm" ,
params = dict (
eps = 0.01 ,
axis = None
)
)
]
))
Arguments
The following tables lists the arguments that can be used with the evaluate
command.
HKTestParams
Argument
Type
Opt/Req
Default
Description
job_dir
Path
Optional
tempfile.gettempdir
Job output directory
ds_path
Path
Optional
Path()
Dataset directory
datasets
list[DatasetParams]
Optional
Datasets
sampling_rate
int
Optional
250
Target sampling rate (Hz)
frame_size
int
Optional
1250
Frame size
num_classes
int
Optional
1
# of classes
class_map
dict[int, int]
Optional
Class/label mapping
class_names
list[str]
Optional
None
Class names
test_samples_per_patient
int|list[int]
Optional
1000
# test samples per patient
test_patients
float|None
Optional
None
# or proportion of patients for testing
test_size
int
Optional
200000
# samples for testing
test_file
Path|None
Optional
None
Path to load/store pickled test file
preprocesses
list[PreprocessParams]
Optional
Preprocesses
augmentations
list[AugmentationParams]
Optional
Augmentations
model_file
Path|None
Optional
None
Path to save model file (.keras)
threshold
float|None
Optional
None
Model output threshold
seed
int|None
Optional
None
Random state seed
data_parallelism
int
Optional
os.cpu_count
# of data loaders running in parallel
QuantizationParams
Argument
Type
Opt/Req
Default
Description
enabled
bool
Optional
False
Enable quantization
qat
bool
Optional
False
Enable quantization aware training (QAT)
ptq
bool
Optional
False
Enable post training quantization (PTQ)
input_type
str|None
Optional
None
Input type
output_type
str|None
Optional
None
Output type
supported_ops
list[str]|None
Optional
None
Supported ops
DatasetParams
Argument
Type
Opt/Req
Default
Description
name
str
Required
Dataset name
params
dict[str, Any]
Optional
{}
Dataset parameters
weight
float
Optional
1
Dataset weight
PreprocessParams
Argument
Type
Opt/Req
Default
Description
name
str
Required
Preprocess name
params
dict[str, Any]
Optional
{}
Preprocess parameters
AugmentationParams
Argument
Type
Opt/Req
Default
Description
name
str
Required
Augmentation name
params
dict[str, Any]
Optional
{}
Augmentation parameters