A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) using Deep Filtering.
Demo
DeepFilterNet-Demo-new.mp4
News
- New DeepFilterNet Demo: DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
  - Paper: https://arxiv.org/abs/2305.08227
  - Video: https://youtu.be/EO7n96YwnyE
- New Multi-Frame Filtering Paper: Deep Multi-Frame Filtering for Hearing Aids
- Real-time version and a LADSPA plugin
  - Pre-compiled binary, no python dependencies. Usage: `deep-filter audio-file.wav`
  - LADSPA plugin with pipewire filter-chain integration for real-time noise reduction on your mic.
- DeepFilterNet2 Paper: DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio
- Original DeepFilterNet Paper: DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering
  - Paper: https://arxiv.org/abs/2110.05588
  - Samples: https://rikorose.github.io/DeepFilterNet-Samples/
  - Demo: https://huggingface.co/spaces/hshr/DeepFilterNet
  - Video Lecture: https://youtu.be/it90gBqkY6k
Usage
deep-filter
Download a pre-compiled deep-filter binary from the release page.
You can use deep-filter to suppress noise in .wav audio files. Currently, only wav files with a sampling rate of 48 kHz are supported.
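For example, after downloading the binary for your platform (the file names below are placeholders):

```bash
# Make the downloaded release binary executable and run it on a 48 kHz wav file
chmod +x deep-filter
./deep-filter noisy_recording.wav
```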
DeepFilterNet Framework
This framework supports Linux, macOS, and Windows. Training is only tested under Linux. The framework is structured as follows:
- libDF contains Rust code used for data loading and augmentation.
- DeepFilterNet contains DeepFilterNet code for training, evaluation and visualization, as well as pretrained model weights.
- pyDF contains a Python wrapper of the libDF STFT/ISTFT processing loop.
- pyDF-data contains a Python wrapper of the libDF dataset functionality and provides a PyTorch data loader.
- ladspa contains a LADSPA plugin for real-time noise suppression.
- models contains pretrained models for usage in DeepFilterNet (Python) or libDF/deep-filter (Rust).
DeepFilterNet Python: PyPI
Install the DeepFilterNet Python wheel via pip:
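A minimal install sketch, assuming the wheel is published on PyPI as `deepfilternet` and that a matching PyTorch build is installed separately (pick the CPU or CUDA variant for your system):

```bash
# Install PyTorch/torchaudio first (CPU or CUDA build, depending on your system)
pip install torch torchaudio
# Install the DeepFilterNet Python package from PyPI
pip install deepfilternet
# Optional, assumed extra for training/data-loading support (Linux only):
# pip install deepfilternet[train]
```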
After installation, you can enhance noisy audio files with the deepFilter command:

```bash
# Specify an output directory with --output-dir [OUTPUT_DIR]
deepFilter path/to/noisy_audio.wav
```
Manual Installation
Install cargo via rustup. Usage of a conda or virtualenv environment is recommended.
Installation of python dependencies and libDF:
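The exact commands are not reproduced here; below is a hedged sketch assuming the pyDF/pyDF-data bindings are built with maturin, following the repository layout described above:

```bash
# Hedged sketch of a manual build; adjust paths/tools to the repository's instructions
pip install maturin
# Build and install the libDF Python bindings (pyDF) into the active environment
maturin develop --release -m pyDF/Cargo.toml
# Optional: dataset/data-loader bindings (pyDF-data), needed for training
maturin develop --release -m pyDF-data/Cargo.toml
# Install the DeepFilterNet Python package itself in editable mode
pip install -e DeepFilterNet/
```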
To enhance noisy audio files using DeepFilterNet, run:
```bash
$ python DeepFilterNet/df/enhance.py --help
usage: enhance.py [-h] [--model-base-dir MODEL_BASE_DIR] [--pf] [--output-dir OUTPUT_DIR]
                  [--log-level LOG_LEVEL] [--compensate-delay]
                  noisy_audio_files [noisy_audio_files ...]

positional arguments:
  noisy_audio_files     List of noise files to mix with the clean speech file.

optional arguments:
  -h, --help            show this help message and exit
  --model-base-dir MODEL_BASE_DIR, -m MODEL_BASE_DIR
                        Model directory containing checkpoints and config. To load a
                        pretrained model, you may just provide the model name, e.g.
                        `DeepFilterNet`. By default, the pretrained DeepFilterNet2 model
                        is loaded.
  --pf                  Post-filter that slightly over-attenuates very noisy sections.
  --output-dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory in which the enhanced audio files will be stored.
  --log-level LOG_LEVEL
                        Logger verbosity. Can be one of (debug, info, error, none)
  --compensate-delay, -D
                        Add some padding to compensate the delay introduced by the
                        real-time STFT/ISTFT implementation.
```

```bash
# Enhance audio with original DeepFilterNet
python DeepFilterNet/df/enhance.py -m DeepFilterNet path/to/noisy_audio.wav

# Enhance audio with DeepFilterNet2
python DeepFilterNet/df/enhance.py -m DeepFilterNet2 path/to/noisy_audio.wav
```
Use DeepFilterNet within your Python script
```py
from df import enhance, init_df

model, df_state, _ = init_df()  # Load default model
enhanced_audio = enhance(model, df_state, noisy_audio)
```
See here for a full example.
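As a slightly fuller, hedged sketch of the snippet above: the use of torchaudio for I/O and the file names are assumptions made for illustration; the linked example is the authoritative reference.

```py
import torchaudio
from df import enhance, init_df

model, df_state, _ = init_df()                   # load the default pretrained model
noisy_audio, sr = torchaudio.load("noisy.wav")   # assumed 48 kHz input file
assert sr == 48000, "DeepFilterNet operates on 48 kHz audio"
enhanced_audio = enhance(model, df_state, noisy_audio)
# enhanced_audio is expected to have the same (channels, samples) shape as the input
torchaudio.save("enhanced.wav", enhanced_audio, sr)
```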
Training
The entry point is DeepFilterNet/df/train.py. It expects a data directory containing HDF5 datasets as well as a dataset configuration json file.
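A hedged sketch of what a training invocation might look like; the argument order (dataset config, data directory, base/checkpoint directory) and the file names are assumptions, so check the script's --help output for the authoritative interface.

```bash
# Hedged sketch; run `python DeepFilterNet/df/train.py --help` for the real interface
python DeepFilterNet/df/train.py path/to/dataset_config.json path/to/data_dir/ path/to/base_dir/
```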