🧠 soundKIT Speech Enhancement Tutorial
This tutorial walks through the soundKIT CLI workflow for speech enhancement: data preparation, training, evaluation, export to TFLite, and deployment.
⚙️ Setup
In [2]:
# Set working directory to project root (one level up)
import os
os.chdir("..")
print("✅ Working directory set to project root:", os.getcwd())
✅ Working directory set to project root: /home/poyupaulchen/Documents/soundkit
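Because `os.chdir("..")` is relative, re-running the cell above climbs one more directory each time, and every later command expects `configs/se.yaml` to resolve from the current directory. The sketch below shows one way to make that step repeatable: search upward for the directory that contains the config file. `find_project_root` and the marker-based lookup are illustrative helpers, not part of soundKIT.

```python
import tempfile
from pathlib import Path


def find_project_root(start: Path, marker: str = "configs/se.yaml") -> Path:
    """Return the nearest ancestor of `start` (inclusive) that contains `marker`.

    Hypothetical helper for this tutorial; soundKIT does not provide it.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No ancestor of {start} contains {marker}")


# Self-contained demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve() / "soundkit_demo"
    (root / "configs").mkdir(parents=True)
    (root / "configs" / "se.yaml").write_text("name: crnn_experiment\n")
    notebooks = root / "notebooks"
    notebooks.mkdir()

    found_from_notebooks = find_project_root(notebooks)
    found_from_root = find_project_root(root)  # already at the root: no change
    print(found_from_notebooks == root, found_from_root == root)
```

In the notebook this would be used as `os.chdir(find_project_root(Path.cwd()))`, which gives the same result no matter how many times the cell is run.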
In [ ]:
# Clone the soundKIT repo and navigate to it
!git clone https://github.com/AmbiqAI/soundkit.git
%cd soundkit
# Install required dependencies
!pip install -e .
🔧 Configuration
In [3]:
# Show editable training config
config_path = "configs/se.yaml"
# Optionally print the config content
with open(config_path, "r") as f:
    print(f.read())
name: crnn_experiment
project: se
job_dir: ./soundkit/tasks/se
data:
  path_tfrecord: ${job_dir}/tfrecords
  tfrecord_datalist_name: # list of saved tfrecords
    train: train_tfrecord.csv
    val: val_tfrecord.csv
  num_samples_per_noise:
    train: 1000
    val: 250
  force_download: false
  reverb_prob: 0.2
  num_processes: 8
  corpora:
    - {name: train-clean-360, type: train, path: wavs/LibriSpeech/train-clean-360}
    - {name: train-clean-100, type: train, path: wavs/LibriSpeech/train-clean-100}
    - {name: dev-clean, type: val, path: wavs/LibriSpeech/dev-clean}
    - {name: thchs30, type: train-val, path: {train: wavs/data_thchs30/train, dev: wavs/data_thchs30/dev}}
    - {name: ESC-50-master, type: noise}
    - {name: FSD50K, type: noise}
    - {name: musan, type: noise}
    - {name: wham_noise, type: noise}
    - {name: rirs_noises, type: reverb}
  snr_dbs: [-6, -3, 0, 3, 6, 9, 12, 15, 30] # mixture of signal-to-noise ratios
  target_length_in_secs: 5
  min_amp: 0.1
  max_amp: 0.95
  signal:
    sampling_rate: 16000
    dc_removal: true
  debug: false
train:
  initial_lr: 4e-4
  batchsize: 32
  epochs: 150
  warmup_epochs: 5
  epoch_loaded: random
  loss_function: {
    type: mse,
    params: {exp: 0.6, eps: 1e-8}
  }
  path:
    model_dir: ${job_dir}/models_trained/${name}
    tensorboard_dir: ${job_dir}/tensorboard/${name}
  num_lookahead: 0
  feature:
    frame_size: 480
    hop_size: 160
    fft_size: 512
    type: mel
    bins: 72
  standardization: true
  model:
    config_dir: ./soundkit/models/arch_configs
    config_file: config_crnn.yaml
  debug: false
evaluate:
  epoch_loaded: best
  data:
    dir: "./wavs/se/test_wavs"
    files: [keyboard_steak.wav, i_like_steak.wav, steak_hairdryer.wav]
    # # dir: ./wavs/LibriSpeech/test-clean
    # # files:
    result_folder: ${job_dir}/test_results/${name}
export:
  epoch_loaded: best
  tflite_dir: ${job_dir}/tflite
demo:
  epoch_loaded: best
  tflite_dir: ${job_dir}/tflite
  evb_dir: ${job_dir}/evb
  pre_gain: 1
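The `signal` and `feature` values in this config are given in samples, which hides the time scales involved. The short calculation below converts them, using only numbers from the config. Two points are assumptions rather than documented soundKIT behavior: that the frame-to-frame overlap is `frame_size - hop_size`, and that the 72 mel bins are computed from a one-sided FFT of `fft_size // 2 + 1` bins. The exact frame count per clip depends on the padding convention, which the config does not show, so only the number of hop steps is computed.

```python
# Values copied from configs/se.yaml
sampling_rate = 16000          # data.signal.sampling_rate, in Hz
frame_size = 480               # train.feature.frame_size, in samples
hop_size = 160                 # train.feature.hop_size, in samples
fft_size = 512                 # train.feature.fft_size
target_length_in_secs = 5      # data.target_length_in_secs

frame_ms = 1000 * frame_size / sampling_rate          # analysis window length
hop_ms = 1000 * hop_size / sampling_rate              # step between windows
overlap = frame_size - hop_size                       # assumed carried-over samples
fft_bins = fft_size // 2 + 1                          # one-sided spectrum size
clip_samples = sampling_rate * target_length_in_secs  # samples per training clip
hops_per_clip = clip_samples // hop_size              # hop steps per clip

print(f"frame: {frame_ms} ms, hop: {hop_ms} ms, overlap: {overlap} samples")
print(f"FFT bins: {fft_bins}, samples per clip: {clip_samples}, hops per clip: {hops_per_clip}")
```

With `num_lookahead: 0`, a 10 ms hop is also the granularity at which a streaming deployment would consume new audio, under the same assumptions.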
📁 Step 1: Prepare Data
In [4]:
!soundkit -t se -m data -c configs/se.yaml
2025-05-25 01:16:00.130646: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-05-25 01:16:00.130671: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-05-25 01:16:00.131318: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-05-25 01:16:00.135195: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-05-25 01:16:00.690089: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2025-05-25 01:16:01.286619: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.324943: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.325075: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.325955: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.326050: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.326250: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.378412: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.378542: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.378630: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:16:01.378694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 19180 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
Mode: data, Task: se
Processing [train] set with [ESC-50-master] noise
Processing [train] set with [FSD50K] noise
Processing [train] set with [musan] noise
Processing [train] set with [wham_noise] noise
Processing [val] set with [ESC-50-master] noise
Processing [val] set with [FSD50K] noise
Processing [val] set with [musan] noise
Processing [val] set with [wham_noise] noise
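The data step pairs speech with each noise corpus at the ratios listed in `snr_dbs`. As a rough illustration of what mixing at a target SNR means, the sketch below scales a noise signal so that the speech-to-noise RMS ratio matches a requested value in dB. This is a generic textbook formulation in pure Python, not soundKIT's implementation: the function names are made up for this example, and reverb, `min_amp`/`max_amp` scaling, and DC removal are ignored.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise ratio equals `snr_db`, then add it."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB computed from the RMS of each component."""
    return 20 * math.log10(rms(speech) / rms(noise))


rng = random.Random(0)
sampling_rate = 16000
# One second of a 220 Hz tone stands in for speech; uniform noise for the noise clip.
speech = [0.5 * math.sin(2 * math.pi * 220 * t / sampling_rate) for t in range(sampling_rate)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(sampling_rate)]

snr_dbs = [-6, -3, 0, 3, 6, 9, 12, 15, 30]  # same list as in configs/se.yaml
achieved = []
for snr_db in snr_dbs:
    mixture, scaled = mix_at_snr(speech, noise, snr_db)
    achieved.append(round(measured_snr_db(speech, scaled), 6))
print(achieved)
```

Lower values such as -6 dB produce mixtures where the noise is louder than the speech, while 30 dB is close to clean speech.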
🎯 Step 2: Train Model
In [7]:
!soundkit -t se -m train -c configs/se.yaml
2025-05-25 01:18:28.018874: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-05-25 01:18:28.018901: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-05-25 01:18:28.019559: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-05-25 01:18:28.023413: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-05-25 01:18:28.580924: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2025-05-25 01:18:29.155739: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.193461: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.193582: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.194454: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.194561: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.194726: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.249032: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.249183: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.249276: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:29.249342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 19203 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
Mode: train, Task: se
Training SE model with params: name='crnn_experiment' project='se' job_dir=PosixPath('soundkit/tasks/se') data={'path_tfrecord': './soundkit/tasks/se/tfrecords', 'tfrecord_datalist_name': {'train': 'train_tfrecord.csv', 'val': 'val_tfrecord.csv'}, 'num_samples_per_noise': {'train': 1000, 'val': 250}, 'force_download': False, 'reverb_prob': 0.2, 'num_processes': 8, 'corpora': [{'name': 'train-clean-360', 'type': 'train', 'path': 'wavs/LibriSpeech/train-clean-360'}, {'name': 'train-clean-100', 'type': 'train', 'path': 'wavs/LibriSpeech/train-clean-100'}, {'name': 'dev-clean', 'type': 'val', 'path': 'wavs/LibriSpeech/dev-clean'}, {'name': 'thchs30', 'type': 'train-val', 'path': {'train': 'wavs/data_thchs30/train', 'dev': 'wavs/data_thchs30/dev'}}, {'name': 'ESC-50-master', 'type': 'noise'}, {'name': 'FSD50K', 'type': 'noise'}, {'name': 'musan', 'type': 'noise'}, {'name': 'wham_noise', 'type': 'noise'}, {'name': 'rirs_noises', 'type': 'reverb'}], 'snr_dbs': [-6, -3, 0, 3, 6, 9, 12, 15, 30], 'target_length_in_secs': 5, 'min_amp': 0.03, 'max_amp': 0.95, 'signal': {'sampling_rate': 16000, 'dc_removal': True}, 'debug': False} train={'initial_lr': 0.0004, 'batchsize': 32, 'epochs': 150, 'warmup_epochs': 5, 'epoch_loaded': 'random', 'loss_function': {'type': 'compressed_mse', 'params': {'exp': 0.6, 'eps': 1e-08}}, 'path': {'model_dir': './soundkit/tasks/se/models_trained/crnn_experiment', 'tensorboard_dir': './soundkit/tasks/se/tensorboard/crnn_experiment'}, 'num_lookahead': 0, 'feature': {'frame_size': 480, 'hop_size': 160, 'fft_size': 512, 'type': 'mel', 'bins': 72}, 'standardization': True, 'model': {'config_dir': './soundkit/models/arch_configs', 'config_file': 'config_crnn.yaml'}, 'debug': False} evaluate={'epoch_loaded': 'best', 'data': {'dir': './wavs/se/test_wavs', 'files': ['keyboard_steak.wav', 'i_like_steak.wav', 'steak_hairdryer.wav'], 'result_folder': './soundkit/tasks/se/test_results/crnn_experiment'}} export={'epoch_loaded': 'best', 'tflite_dir': 
'./soundkit/tasks/se/tflite'} demo={'epoch_loaded': 'best', 'tflite_dir': './soundkit/tasks/se/tflite', 'evb_dir': './soundkit/tasks/se/evb', 'pre_gain': 1} and more
Downloading model weights for crnn_experiment to soundkit/tasks/se
Model: "crnn"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dropout (Dropout)           multiple                  0
 conv2d (Conv2D)             multiple                  36372
 dropout_1 (Dropout)         multiple                  0
 lstm (LSTM)                 multiple                  56784
 dropout_2 (Dropout)         multiple                  0
 dense (Dense)               multiple                  7140
 dropout_3 (Dropout)         multiple                  0
 dense_1 (Dense)             multiple                  7140
 dropout_4 (Dropout)         multiple                  0
 dense_2 (Dense)             multiple                  21845
=================================================================
Total params: 146177 (571.00 KB)
Trainable params: 129281 (505.00 KB)
Non-trainable params: 16896 (66.00 KB)
_________________________________________________________________
Epoch 0/150
2025-05-25 01:18:34.679417: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8904
2025-05-25 01:18:34.964459: I external/local_xla/xla/service/service.cc:168] XLA service 0x72a65173acf0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2025-05-25 01:18:34.964480: I external/local_xla/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2025-05-25 01:18:34.968224: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1748161115.037346 1438135 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
[train] | 125/125 | Loss: 11.2496 |
[val] | 31/31 | Loss: 11.0736 |
Epoch 1/150
[train] | 125/125 | Loss: 10.1950 |
[val] | 31/31 | Loss: 9.3877 |
Epoch 2/150
[train] | 125/125 | Loss: 8.6472 |
[val] | 31/31 | Loss: 7.8945 |
Epoch 3/150
[train] | 125/125 | Loss: 7.8091 |
[val] | 31/31 | Loss: 7.1544 |
Epoch 4/150
[train] | 125/125 | Loss: 7.3638 |
[val] | 31/31 | Loss: 6.5285 |
Epoch 5/150
[train] | 125/125 | Loss: 7.0209 |
[val] | 31/31 | Loss: 6.3599 |
Epoch 6/150
[train] | 125/125 | Loss: 6.7351 |
[val] | 31/31 | Loss: 6.0633 |
Epoch 7/150
[train] | 125/125 | Loss: 6.5289 |
[val] | 31/31 | Loss: 6.0435 |
Epoch 8/150
[train] | 125/125 | Loss: 6.3136 |
[val] | 31/31 | Loss: 5.8874 |
Epoch 9/150
[train] | 125/125 | Loss: 6.2481 |
[val] | 31/31 | Loss: 5.8651 |
Epoch 10/150
[train] | 125/125 | Loss: 6.0919 |
[val] | 31/31 | Loss: 5.7019 |
Epoch 11/150
[train] | 125/125 | Loss: 5.9949 |
[val] | 31/31 | Loss: 5.8791 |
Epoch 12/150
[train] | 125/125 | Loss: 5.9605 |
[val] | 31/31 | Loss: 5.6243 |
Epoch 13/150
[train] | 125/125 | Loss: 5.8215 |
[val] | 31/31 | Loss: 5.6699 |
Epoch 14/150
[train] | 125/125 | Loss: 5.7624 |
[val] | 31/31 | Loss: 5.5465 |
Epoch 15/150
[train] | 125/125 | Loss: 5.7405 |
[val] | 31/31 | Loss: 5.4573 |
Epoch 16/150
[train] | 125/125 | Loss: 5.6025 |
[val] | 31/31 | Loss: 5.4954 |
Epoch 17/150
[train] | 125/125 | Loss: 5.5708 |
[val] | 31/31 | Loss: 5.2935 |
Epoch 18/150
[train] | 125/125 | Loss: 5.5097 |
[val] | 31/31 | Loss: 5.2901 |
Epoch 19/150
[train] | 125/125 | Loss: 5.4354 |
[val] | 31/31 | Loss: 5.3616 |
Epoch 20/150
[train] | 125/125 | Loss: 5.3939 |
[val] | 31/31 | Loss: 5.3062 |
Epoch 21/150
[train] | 125/125 | Loss: 5.3996 |
[val] | 31/31 | Loss: 5.1070 |
Epoch 22/150
Traceback (most recent call last):
File "/home/poyupaulchen/Documents/soundkit/.venv/bin/soundkit", line 8, in <module>
sys.exit(main())
File "/home/poyupaulchen/Documents/soundkit/soundkit/cli.py", line 86, in main
parser()
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/argdantic/core.py", line 150, in __call__
return args.__func__(args)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/argdantic/core.py", line 82, in __call__
return self.callback(**destructured)
File "/home/poyupaulchen/Documents/soundkit/soundkit/cli.py", line 73, in run
task_handler.train(params)
File "/home/poyupaulchen/Documents/soundkit/soundkit/tasks/se/__init__.py", line 35, in train
train(params)
File "/home/poyupaulchen/Documents/soundkit/soundkit/tasks/se/train.py", line 297, in train
loss = run_epoch(
File "/home/poyupaulchen/Documents/soundkit/soundkit/tasks/se/train.py", line 117, in run_epoch
feat_sn, spec_sn, states_audio_sn = feat_extractor(
File "/home/poyupaulchen/Documents/soundkit/soundkit/utils/feature_utils.py", line 70, in __call__
states_udpate = tf.identity(audio_sn[:, -overlap:])
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/util/dispatch.py", line 1260, in op_dispatch_handler
return dispatch_target(*args, **kwargs)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/ops/array_ops.py", line 1130, in _slice_helper
array_ops_stack.stack(begin),
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
return fn(*args, **kwargs)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/util/dispatch.py", line 1260, in op_dispatch_handler
return dispatch_target(*args, **kwargs)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/ops/array_ops_stack.py", line 74, in stack
return ops.convert_to_tensor(values, name=name)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/profiler/trace.py", line 183, in wrapped
return func(*args, **kwargs)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 696, in convert_to_tensor
return tensor_conversion_registry.convert(
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/framework/tensor_conversion_registry.py", line 234, in convert
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/framework/constant_op.py", line 335, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/ops/weak_tensor_ops.py", line 142, in wrapper
return op(*args, **kwargs)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/framework/constant_op.py", line 271, in constant
return _constant_impl(value, dtype, shape, name, verify_shape=False,
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/framework/constant_op.py", line 284, in _constant_impl
return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/framework/constant_op.py", line 296, in _constant_eager_impl
t = convert_to_eager_tensor(value, ctx, dtype)
File "/home/poyupaulchen/Documents/soundkit/.venv/lib/python3.10/site-packages/tensorflow/python/framework/constant_op.py", line 103, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
KeyboardInterrupt
^C
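The per-epoch counters in the log above (`125/125` for train, `31/31` for val) can be reproduced from the config. The calculation below assumes that each of the four noise corpora contributes `num_samples_per_noise` clips and that an incomplete final batch is dropped. Both assumptions are inferred from the numbers in the log, not confirmed by the soundKIT source.

```python
# Values copied from configs/se.yaml
num_samples_per_noise = {"train": 1000, "val": 250}
noise_corpora = ["ESC-50-master", "FSD50K", "musan", "wham_noise"]
batchsize = 32

# Assumed: clips per split = samples per noise corpus * number of noise corpora,
# and floor division models dropping the last partial batch.
steps_per_epoch = {
    split: (count * len(noise_corpora)) // batchsize
    for split, count in num_samples_per_noise.items()
}
print(steps_per_epoch)  # {'train': 125, 'val': 31}
```

Under these assumptions, raising `num_samples_per_noise` or adding a noise corpus lengthens each epoch proportionally.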
To monitor training with TensorBoard:
In [6]:
!soundkit -t se -m train --tensorboard -c configs/se.yaml
# Then run this in another terminal:
# tensorboard --logdir=soundkit/tasks/se/your_tensorboard_dir
2025-05-25 01:18:11.307317: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-05-25 01:18:11.307343: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-05-25 01:18:11.308028: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-05-25 01:18:11.311922: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-05-25 01:18:11.878821: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2025-05-25 01:18:12.469211: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.506868: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.507001: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.508036: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.508132: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.508214: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.555868: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.555991: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.556082: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:12.556147: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 19200 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
Mode: train, Task: se
🚀 Launching TensorBoard at: soundkit/tasks/se/tensorboard
2025-05-25 01:18:15.484963: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-05-25 01:18:15.484991: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-05-25 01:18:15.485663: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-05-25 01:18:15.489396: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-05-25 01:18:16.060973: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2025-05-25 01:18:16.596399: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:16.633784: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-05-25 01:18:16.633942: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
NOTE: Using experimental fast data loading logic. To disable, pass
"--load_fast=false" and report issues on GitHub. More details:
https://github.com/tensorflow/tensorboard/issues/4784
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.15.2 at http://localhost:6007/ (Press CTRL+C to quit)
^C
📊 Step 3: Evaluate Model
In [ ]:
!soundkit -t se -m evaluate -c configs/se.yaml
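Evaluation reads the files listed under `evaluate.data` in the config, and the rest of the pipeline is configured for 16 kHz audio. Whether soundKIT resamples test files automatically is not stated here, so it can be worth checking your own recordings first. The sketch below uses Python's standard `wave` module, which only handles uncompressed PCM WAV files. `check_wav` is a hypothetical helper, and the demonstration runs on a generated file rather than the tutorial's test clips.

```python
import tempfile
import wave
from pathlib import Path


def check_wav(path: Path, expected_rate: int = 16000) -> dict:
    """Read a WAV header and report whether its sample rate matches `expected_rate`."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        info = {
            "channels": wav.getnchannels(),
            "sampling_rate": rate,
            "duration_secs": wav.getnframes() / rate,
        }
    info["rate_ok"] = info["sampling_rate"] == expected_rate
    return info


# Demonstration on a generated one-second, mono, 16-bit silent file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "silence.wav"
    with wave.open(str(demo_path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)       # 2 bytes per sample = 16-bit PCM
        wav.setframerate(16000)
        wav.writeframes(b"\x00\x00" * 16000)
    report = check_wav(demo_path)
print(report)
```

To check the tutorial's own clips, loop over the names in `evaluate.data.files` and call `check_wav(Path("./wavs/se/test_wavs") / name)` for each.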
📤 Step 4: Export Model to TFLite
In [ ]:
!soundkit -t se -m export -c configs/se.yaml
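After the export step, a quick sanity check is to confirm that the output directory (`export.tflite_dir`, which resolves to `./soundkit/tasks/se/tflite` in this config) contains well-formed model files. TFLite models are FlatBuffers carrying the file identifier `TFL3` at byte offset 4, so this can be tested without importing TensorFlow. The helper names below are hypothetical, the exported file name is not shown in this tutorial (hence the glob), and the demonstration uses stand-in files rather than a real model.

```python
import tempfile
from pathlib import Path


def looks_like_tflite(path: Path) -> bool:
    """Return True if bytes 4..8 hold the TFLite FlatBuffer identifier 'TFL3'."""
    with open(path, "rb") as f:
        header = f.read(8)
    return len(header) == 8 and header[4:8] == b"TFL3"


def summarize_tflite_dir(tflite_dir: Path) -> dict:
    """Map each *.tflite file name to (size in bytes, identifier check result)."""
    return {
        p.name: (p.stat().st_size, looks_like_tflite(p))
        for p in sorted(tflite_dir.glob("*.tflite"))
    }


# Demonstration with stand-in files: one with a valid identifier, one without.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "good.tflite").write_bytes(b"\x1c\x00\x00\x00TFL3" + b"\x00" * 24)
    (demo_dir / "bad.tflite").write_bytes(b"not a model")
    summary = summarize_tflite_dir(demo_dir)
print(summary)
```

The identifier check only catches truncated or mislabeled files. It does not validate the graph itself, which still requires loading the model in a TFLite interpreter.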
🚀 Step 5: Deploy to Embedded EVB (e.g., Apollo)
In [ ]:
!soundkit -t se -m demo -c configs/se.yaml demo.platform=evb # For EVB demo, ensure you have a microphone connected
!soundkit -t se -m demo -c configs/se.yaml demo.platform=pc # For PC demo, ensure you have a microphone connected
2025-06-09 00:12:14.811082: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-06-09 00:12:14.811107: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-06-09 00:12:14.811958: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-06-09 00:12:14.816316: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-06-09 00:12:15.374533: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2025-06-09 00:12:15.940246: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:15.976081: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:15.976214: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:15.977362: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:15.977453: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:15.977532: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:16.033370: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:16.033500: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:16.033593: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2025-06-09 00:12:16.033658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20841 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6
🔧 Mode: demo, Task: se
🛠️ Overrides: ['demo.platform=evb']
Traceback (most recent call last):
File "/home/poyupaulchen/Documents/soundkit/.venv/bin/soundkit", line 8, in <module>
sys.exit(main())
File "/home/poyupaulchen/Documents/soundkit/soundkit/cli.py", line 77, in main
run_task(
File "/home/poyupaulchen/Documents/soundkit/soundkit/cli.py", line 32, in run_task
params = parse_config(config, overrides=extra_overrides)
File "/home/poyupaulchen/Documents/soundkit/soundkit/cli.py", line 17, in parse_config
raise FileNotFoundError(f"Config file not found: {path}")
FileNotFoundError: Config file not found: configs/se.yaml
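The traceback above is unrelated to the override itself: `configs/se.yaml` is a relative path, so the command fails whenever the notebook's working directory is not the project root. The override `demo.platform=evb`, echoed in the log as `Overrides: ['demo.platform=evb']`, uses a dotted key to address a nested config entry. The sketch below shows one common way such overrides can be applied to a nested dictionary. It is a generic illustration with hypothetical function names, not soundKIT's parser, and `demo.pre_gain=2` is an invented second override used only to show value parsing.

```python
import copy


def parse_scalar(text: str):
    """Convert 'true'/'false', integers, and floats; leave other text as a string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def apply_overrides(config: dict, overrides: list) -> dict:
    """Return a copy of `config` with each 'a.b.c=value' override applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})  # create missing sections on the way down
        node[leaf] = parse_scalar(raw_value)
    return result


base = {"demo": {"epoch_loaded": "best", "pre_gain": 1}}
updated = apply_overrides(base, ["demo.platform=evb", "demo.pre_gain=2"])
print(updated)
```

Working on a deep copy keeps the values loaded from the YAML file unchanged, so the same base config can be reused with different overrides.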
✅ Notes
- Edit se.yaml to customize parameters such as sampling rate, model type, and loss function
- Change training and export paths as needed
- Logs and spectrograms are saved in their respective folders
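When changing paths, note that most output locations in the config are built from `${job_dir}` and `${name}`, so editing those two values moves the model, TensorBoard, result, and TFLite folders together. The sketch below imitates that expansion with Python's `string.Template`, which happens to use the same `${...}` syntax. It only illustrates the resulting paths. The mechanism soundKIT actually uses to resolve placeholders is not shown in this tutorial.

```python
from string import Template

# Top-level values from configs/se.yaml
values = {"job_dir": "./soundkit/tasks/se", "name": "crnn_experiment"}

# Path entries from the config that depend on those values
templates = {
    "model_dir": "${job_dir}/models_trained/${name}",
    "tensorboard_dir": "${job_dir}/tensorboard/${name}",
    "result_folder": "${job_dir}/test_results/${name}",
    "tflite_dir": "${job_dir}/tflite",
}

resolved = {key: Template(text).substitute(values) for key, text in templates.items()}
for key, path in resolved.items():
    print(f"{key}: {path}")
```

The expanded `model_dir` and `tensorboard_dir` match the values printed in the training log, which suggests that changing `name` is enough to keep separate experiments from overwriting each other.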