Online resources for audio processing
by Guillaume Lathoud
These are mostly data (video and audio), and MATLAB/C code for microphone array speaker localization,
as well as single microphone noise/channel removal.
Feel free to use any of these resources, as long as you properly cite the corresponding papers
in your own publications and implementations.
[ DATA ]
[ MATLAB & C code ]
- av16.3_v6/
The AV16.3 Corpus for Audio-Visual Speaker Localization and Tracking
(two microphone arrays, three cameras, 3-D mouth location known within 1.2 cm).
Includes MATLAB code, annotation GUIs and examples (2004 MLMI paper).
- 2005-SAM-SPARSE-MEAN/
2005-SAM-SPARSE-MEAN.tar.gz (whole directory: 141 M)
MATLAB and MATLAB/C implementations of sector-based detection-localization with a microphone array
(ICASSP 2005, EURASIP 2006 and IDIAP RR-04-67, by G. Lathoud and others).
- 2006-CHN-USS/
2006-CHN-USS.tar.gz (whole directory: 16 M)
MATLAB implementation of single microphone noise & channel removal (IDIAP RR-06-09).
- 2006-CHN-USS-TIME-DOMAIN/
2006-CHN-USS-TIME-DOMAIN.tar.gz (whole directory: 269 k)
MATLAB implementation of single microphone noise & channel removal (IDIAP RR-06-09)
+ reconstruction of the cleaned waveform through overlap-add.
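For readers unfamiliar with overlap-add, the idea can be sketched in a few lines of Python. This is a generic illustration, not the code in the archive: the function names, the periodic Hann window and the 50% overlap are all assumptions, chosen so that overlapping windows sum to one and unmodified frames add back up to the original waveform.

```python
import math

def hann_periodic(n):
    # Periodic Hann window: at a hop of n/2, shifted copies sum to exactly 1.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def split_frames(signal, frame_len, hop):
    # Cut the signal into overlapping, windowed frames.
    window = hann_periodic(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames

def overlap_add(frames, hop):
    # Sum each frame back into the output at its original position.
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for k, frame in enumerate(frames):
        for i, value in enumerate(frame):
            out[k * hop + i] += value
    return out

# Hypothetical usage: a real system would clean each frame (for example in
# the spectral domain) between split_frames() and overlap_add().
signal = [math.sin(0.3 * n) for n in range(64)]
frames = split_frames(signal, frame_len=16, hop=8)
rebuilt = overlap_add(frames, hop=8)
```

With these settings, every sample covered by two frames is reconstructed exactly (up to rounding); only the first and last half-frame are attenuated by the window.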
- 2006-short-term-clustering/
2006-short-term-clustering.tar.gz (whole directory: 2.8 M)
MATLAB implementation of short-term clustering (ML & confident),
along with simple examples on synthetic data,
including threshold-free detection of trajectory crossings
(NISTRT04 paper).
- 2006-multidetloc/
2006-multidetloc-code.tar.gz (whole directory, first part: 1.9 M)
2006-multidetloc-data.tar.gz (whole directory, second part: 308 M)
MATLAB/C implementation of multispeaker detection-localization
applied to real microphone array recordings of multiple humans (AV16.3 Corpus)
(this demonstrates short-term clustering on real data)
+ evaluation of the result against the 3-D location ground-truth
(IDIAP RR-06-26).
- 2006-distant-speaker/
2006-distant-speaker-code.tar.gz (whole directory, first part: 183 k)
2006-distant-speaker-data.tar.gz (whole directory, second part: 194 M)
MATLAB cross-correlation study on speech signals received by distant microphones.
Signals are emitted by real human speakers (seq01 & seq03 in the AV16.3 Corpus)
(corresponding paper: IDIAP RR-06-74).
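As background for the cross-correlation study, the sketch below shows the basic operation in Python: the delay between two microphone signals appears as the lag at which their cross-correlation peaks. It is a brute-force, unnormalized time-domain version written for illustration only. The function names and the `max_lag` parameter are assumptions, and it is not the method used in the archive or the paper.

```python
def cross_correlation(x, y, max_lag):
    # For each lag, sum x[n] * y[n + lag] over the samples where both exist.
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                total += x[n] * y[m]
        result[lag] = total
    return result

def estimate_delay(x, y, max_lag):
    # Lag (in samples) at which y best matches a delayed copy of x.
    cc = cross_correlation(x, y, max_lag)
    return max(cc, key=cc.get)
```

A positive result means that `y` lags behind `x` by that many samples. On real recordings the peak is usually much less sharp than on a clean delayed copy, because of reverberation and noise.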
- 2006-various-MATLAB-tools/
2006-various-MATLAB-tools.tar.gz (whole directory: 84 k)
A collection of miscellaneous MATLAB tools, as the directory name says.
- MULTISEG/
MULTISEG.tar.gz (whole directory: 28 M)
Multispeaker speech/silence segmentation using multiple lapel microphones.
This is a MATLAB implementation of the lapel segmentation baseline in the 2004 NIST Workshop paper.
- USS-EXAMPLE/
USS-EXAMPLE.zip (whole directory: 11 M)
MATLAB implementation of the single channel noise removal procedure
called "Unsupervised Spectral Subtraction" (ASRU 2005, IDIAP RR-05-42).
- Wiener_standalone/
Wiener_standalone.tar.gz (whole directory: 790 k)
MATLAB implementation of Wiener filtering in the MEL domain (one or multiple iterations).
A Python wrapper is also given.
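As a reminder of what a Wiener filter does per frequency band, here is a minimal Python sketch of the textbook gain rule, assuming that power estimates of the noisy signal and of the noise are already available for each band. The function name, its arguments and the `floor` parameter are hypothetical. This is neither the interface of the Python wrapper in the archive nor its exact estimator.

```python
def wiener_gains(noisy_power, noise_power, floor=0.0):
    # Textbook Wiener gain per band: G = snr / (1 + snr), with
    # snr = max(noisy - noise, 0) / noise, which simplifies to
    # G = max(1 - noise / noisy, 0). The floor limits over-suppression.
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        if p_noisy <= 0.0:
            gains.append(floor)  # empty band: nothing to keep
        else:
            gains.append(max(1.0 - p_noise / p_noisy, floor))
    return gains
```

Bands dominated by speech get a gain close to 1, while bands where the noise estimate meets or exceeds the observed power are reduced to the floor.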
Last updated on 2008-09-19 by Guillaume Lathoud - glathoud at yahoo dot fr