
Wiener filter implementation

General context: Wiener filtering was originally used for speech enhancement for human listeners, but it is also used for Automatic Speech Recognition (ASR), e.g. in [1].

Below are explanations of some of the parameters in Wiener_iter_mel.m:

niter (integer > 0):
How many times the Wiener filter is applied to the signal. Each pass removes more noise but also some speech, so with niter > 1 the resulting "cleaned" speech may sound distorted ("muffled").
noise_est_factor (real number > 0.0, default 3.0):
The noise spectrum En is estimated as the geometric mean of the spectra of "silent" frames, then simply multiplied by noise_est_factor.
T_smooth_frame (integer number of time frames >= 0, default 1):
"Fills the small gaps" in the Wiener filter response H, to avoid losing weak speech close to strong speech (in time-frequency space). Implementation: a dilation followed by an erosion, i.e. a morphological closing [2]. If T_smooth_frame == 0, no smoothing is applied.
new_M (real number between 0.0 and 1.0, default 0.9):
Scaling applied to the amplitude of the final time-domain signal (waveform), simply to avoid saturation when saving or playing the waveform. It is only a linear scaling; no filtering is involved.
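
To make three of the parameters above more concrete, here is a minimal Python sketch. It is not taken from Wiener_iter_mel.py: the function names (estimate_noise, close_gaps, rescale_peak) are hypothetical, and it rests on assumptions that the actual script may not share. In particular, it assumes that the spectra of "silent" frames are strictly positive, that the smoothing window is flat, runs along the time axis only and spans 2 * T_smooth_frame + 1 frames, and that new_M is the target peak absolute amplitude of the waveform.

```python
import math


def estimate_noise(silent_frames, noise_est_factor=3.0):
    """Noise spectrum En: per-bin geometric mean over the "silent" frames,
    multiplied by noise_est_factor.

    silent_frames: list of frames, each a list of strictly positive values
    (one per frequency bin). Assumption: no zero-valued bins.
    """
    n_frames = len(silent_frames)
    n_bins = len(silent_frames[0])
    noise = []
    for k in range(n_bins):
        # Geometric mean = exp(arithmetic mean of the logs).
        mean_log = sum(math.log(frame[k]) for frame in silent_frames) / n_frames
        noise.append(noise_est_factor * math.exp(mean_log))
    return noise


def _running(values, half_width, op):
    """Apply op (max or min) over a window of up to 2*half_width+1 samples,
    truncated at the edges of the sequence."""
    return [
        op(values[max(0, t - half_width): t + half_width + 1])
        for t in range(len(values))
    ]


def close_gaps(gain_track, t_smooth_frame=1):
    """Dilation (running max) followed by erosion (running min) of the filter
    response H along time, for one frequency band.

    Assumption: flat window of half-width t_smooth_frame, time axis only.
    """
    if t_smooth_frame == 0:
        return list(gain_track)  # no smoothing
    dilated = _running(gain_track, t_smooth_frame, max)
    return _running(dilated, t_smooth_frame, min)


def rescale_peak(waveform, new_M=0.9):
    """Linear scaling so that the largest absolute sample equals new_M.

    Assumption: new_M is a peak amplitude relative to a full scale of 1.0.
    """
    peak = max(abs(x) for x in waveform)
    if peak == 0.0:
        return list(waveform)  # all-zero signal: nothing to scale
    scale = new_M / peak
    return [x * scale for x in waveform]
```

For example, under these assumptions, close_gaps([0.0, 0.0, 0.9, 0.1, 0.8, 0.0, 0.0], 1) raises the isolated low value 0.1 between the two strong frames to 0.8, while the zeros on either side stay at zero.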

[1] http://glat.info/pub/2006-Lathoud-RR-06-09-Channel-normalization-for-unsupervised-spectral-subtraction.pdf

[2] http://en.wikipedia.org/wiki/Mathematical_morphology


/com/mmm/shared/lathoud/Wiener_standalone/
README.m
Wiener_iter_mel.m
Wiener_iter_mel.py
Wiener_standalone.tar.gz
audioMix_cut.wav
audioMix_cut_Wiener_iter_mel.wav
index.html

Last updated on 2011-12-07 by Guillaume Lathoud - glathoud at yahoo dot fr