[ file MULTISEG/README ]

Contact: lathoud@idiap.ch

===========
DESCRIPTION
===========

Tool to segment multichannel recordings (e.g. lapels or headsets):

1) You have recorded several WAV files (one for each person),
   synchronized in time.

2) You can use the MATLAB function "segment_multichannel.m" to produce
   a single ".txt" file containing a list of speech segments for each
   channel.

3) You can use the scripts in the TXT2XML directory to convert the
   ".txt" file into XML format (see TXT2XML/README).

This is an implementation of the baseline multichannel segmentation
presented in a paper at the 2004 NIST Meeting Recognition workshop
(RT-04), titled "Unsupervised Location-Based Segmentation of
Multi-Party Speech".

MATLAB files written by Guillaume Lathoud (lathoud@idiap.ch).
TXT2XML tools written by Maël Guillemot (mmmAdmin@idiap.ch).

==========================
CONTENTS OF THIS DIRECTORY
==========================

DATA/                   -> an example (WAV files of one short meeting).
README                  -> this file
TXT2XML/                -> additional script to convert the ".txt" output of
                           "segment_multichannel.m" into XML format
                           (see TXT2XML/README)
check_param.m           -> MATLAB script used by "segment_multichannel.m"
dilation.m                 (idem)
energy.m                   (idem)
erosion.m                  (idem)
fill_default.m             (idem)
segment_multichannel.m  -> the main segmentation script

==========
AN EXAMPLE
==========

% Some useful MATLAB lines & information.

% Advice: do not use MATLAB 6.5
% Use MATLAB 6.1 instead

% Information
% This prints most of the information you need (also copied at the end of this README file).
% Whatever is missing can be found by directly looking at the code:
%   dbtype segment_multichannel
% or drop me an e-mail.

help segment_multichannel;

% Example of use

clear in_p;
in_p.wavfile_list{ 1 } = 'DATA/Lapel-1_T000010.0_T0459.201.wav';
in_p.wavfile_list{ 2 } = 'DATA/Lapel-2_T000010.0_T0459.201.wav';
in_p.wavfile_list{ 3 } = 'DATA/Lapel-3_T000010.0_T0459.201.wav';
in_p.wavfile_list{ 4 } = 'DATA/Lapel-4_T000010.0_T0459.201.wav';
in_p.out_dir = 'DATA';
in_p.output_basename = 'myMeeting';

[ multiseg, p ] = segment_multichannel( in_p );

% Once this is done, you will obtain a file
%
% 'DATA/myMeeting-multiseg.txt'
%
% as well as several intermediary '.mat' files.
% Keep the '.mat' files if you want to experiment with the parameters
% of "segment_multichannel"; otherwise you can delete them.

================
MORE INFORMATION
================

>> help segment_multichannel

  FUNCTION [ MULTISEG, P ] = SEGMENT_MULTICHANNEL( IN_P )

  Segment a multichannel audio recording (e.g. lapels or headsets).
  Produces intermediary ".mat" files as well as a human-readable
  "-multiseg.txt" file. The output is also returned in the form of a
  struct array: MULTISEG (see below).

  This is an implementation of the baseline multichannel segmentation
  presented in a paper at the 2004 NIST Meeting Recognition workshop
  (RT-04), titled "Unsupervised Location-Based Segmentation of
  Multi-Party Speech".

  IN_P is a structure containing parameters, of which two are
  mandatory:

  IN_P.WAVFILE_LIST    = cell array of NCHANNELS strings (filenames).
  IN_P.OUTPUT_BASENAME = string, the base name for all output files.

  There can be many other parameters (look at the code, e.g. with
  "dbtype segment_multichannel").

  MULTISEG is a struct array of NCHANNELS elements.
  MULTISEG( i_channel ).SEG is a 2-row matrix. Each column describes
  a speech segment (row 1 = start time in seconds, row 2 = end time
  in seconds).

  P is IN_P enriched with default values; in particular, it gives the
  names of the output files.

  NOTE: for some reason this script does not work well with MATLAB 6.5.
  Use MATLAB 6.1 instead.

  By Guillaume Lathoud - lathoud@idiap.ch
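
The MULTISEG layout described in the help text (one struct element per
channel, each holding a 2-row matrix of start and end times) can be
walked with a couple of loops. The lines below are only a minimal
sketch: the segment values are made up for illustration, and the
lowercase field name "seg" is an assumption (the help text writes it in
capitals, as MATLAB help usually does). The variable "total_dur" is a
hypothetical name and is not produced by "segment_multichannel".

```matlab
% Hypothetical stand-in for the MULTISEG output (made-up values).
% In real use it would come from:
%   [ multiseg, p ] = segment_multichannel( in_p );
clear multiseg;
multiseg( 1 ).seg = [ 0.5 3.0 ; 1.5 4.5 ];   % two segments: 0.5-1.5 s and 3.0-4.5 s
multiseg( 2 ).seg = [ 2.0 ; 2.75 ];          % one segment: 2.0-2.75 s

nchannels = length( multiseg );
total_dur = zeros( 1, nchannels );           % total speech time per channel, in seconds

for i_channel = 1:nchannels
    seg = multiseg( i_channel ).seg;         % row 1 = start times, row 2 = end times

    % Each column is one speech segment.
    for i_seg = 1:size( seg, 2 )
        fprintf( 'channel %d: %.2f s -> %.2f s\n', ...
                 i_channel, seg( 1, i_seg ), seg( 2, i_seg ) );
    end

    % Sum of (end - start) over all segments of this channel.
    if ~isempty( seg )
        total_dur( i_channel ) = sum( seg( 2, : ) - seg( 1, : ) );
    end
end
```

With the real output of "segment_multichannel", the first three
assignments would be dropped and the loops applied to the returned
MULTISEG directly, provided the field name matches.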