[ file USS-EXAMPLE/README ]

Example of the Unsupervised Spectral Subtraction scheme proposed in:

"Unsupervised Spectral Subtraction", by G. Lathoud, M. Magimai.-Doss,
B. Mesot and H. Bourlard, to appear in Proceedings of ASRU 2005.

----------
CONTENTS

[ USS-EXAMPLE.zip ]
An archive file containing this entire directory and its subdirectories.

[ DATA ]
Directory containing various WAV files. The "av163*wav" files come from
the AV16.3 corpus (http://mmm.idiap.ch/Lathoud/av16.3_v6/AV163.pdf).
The "*NU-1028*wav" files come from the Numbers 95 corpus, with
NOISEX Factory or Lynx noise added.

[ README ]
This file.

[ USS-MATLAB-FUNCTIONS ]
All MATLAB functions used by the example. The main function of interest
is "fit_raylsherl.m".

[ apply_uss_to_wav.m ]
The main code.

----------
More details on "apply_uss_to_wav.m"

The idea is to show the original magnitude spectrogram of a given WAV
file, then how the ( Rayleigh + Shifted Erlang ) distribution is fitted,
and finally the filtered magnitude spectrogram.

You can try it on the various examples stored in the DATA directory.
The following MATLAB lines can be pasted directly into an interactive
MATLAB command line.

Meeting room examples:

apply_uss_to_wav( 'DATA/av163-meetingroom-distant.wav', [5 15] ); % Limit to the first 30 seconds
apply_uss_to_wav( 'DATA/av163-meetingroom-lapel.wav', [5 15] ); % Limit to the first 30 seconds

Numbers 95 + NOISEX examples:

( clean: show that the information is not lost )

apply_uss_to_wav( 'DATA/clean-NU-1028.streetaddr.wav' );

( stationary Lynx noise at 12, 6 and 0 dB )

apply_uss_to_wav( 'DATA/lynx12-NU-1028.streetaddr.wav' );
apply_uss_to_wav( 'DATA/lynx06-NU-1028.streetaddr.wav' );
apply_uss_to_wav( 'DATA/lynx00-NU-1028.streetaddr.wav' );

( non-stationary Factory noise at 12, 6 and 0 dB )

apply_uss_to_wav( 'DATA/fact12-NU-1028.streetaddr.wav' );
apply_uss_to_wav( 'DATA/fact06-NU-1028.streetaddr.wav' );
apply_uss_to_wav( 'DATA/fact00-NU-1028.streetaddr.wav' );

Note: the strange-looking fit in the "clean-NU-1028.streetaddr.wav" case
is explained by artificial data in the silent portions of the waveform,
most likely generated by the telephone speech/silence detector. This
effect does not appear in any of the other files, and it did not affect
recognition performance in our experiments.
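
The Numbers 95 calls listed above can also be run in one batch. The
following is only a sketch, not part of the distributed code. It assumes
that MATLAB is started in the USS-EXAMPLE directory and that adding
USS-MATLAB-FUNCTIONS to the path is needed (it may not be, if
"apply_uss_to_wav.m" already takes care of it). The variable names
"conditions" and "files" are illustrative.

```matlab
% Sketch only: batch-run the Numbers 95 + NOISEX examples from this README.
% Assumption: the current directory is the USS-EXAMPLE directory.
% Assumption: the helper functions may need to be on the MATLAB path;
% adding the directory is harmless if it is already there.
if exist( 'USS-MATLAB-FUNCTIONS', 'dir' )
    addpath( 'USS-MATLAB-FUNCTIONS' );
end

% Noise conditions, in the same order as the examples listed in this README.
conditions = { 'clean', 'lynx12', 'lynx06', 'lynx00', 'fact12', 'fact06', 'fact00' };
files = cell( size( conditions ) );

for k = 1:numel( conditions )
    % Build the file name, e.g. DATA/lynx12-NU-1028.streetaddr.wav
    files{ k } = [ 'DATA/' conditions{ k } '-NU-1028.streetaddr.wav' ];

    if exist( files{ k }, 'file' )
        fprintf( 'Processing %s\n', files{ k } );
        apply_uss_to_wav( files{ k } );
    else
        % Skip missing files instead of stopping the whole batch.
        fprintf( 'Skipping %s (file not found)\n', files{ k } );
    end
end
```

Depending on how "apply_uss_to_wav.m" handles its figure windows, plots
from one file may be replaced by those of the next. Adding a "pause"
after each call is one way to inspect each fit before continuing.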