% MERGESOUNDSFROMSCRIPT Add noise to all files listed in a script file.
% Write the output files to the directory FILEINFO.OUTPUTDIR.
%
% Note: this implementation ensures we obtain exactly the desired SNR
% in each output file. Thus more noise will be added to lively speakers,
% whereas less noise will be added to quiet speakers.
%
% MERGESOUNDSFROMSCRIPT(FILEINFO, DESIREDSNRDB, SAMPLERATE, SPEECHTHRESHOLD)
%
% Audio files must be 16-bit.
%
% FILEINFO is a structure containing all needed info about the audio files
% (description below).
% DESIREDSNRDB is a float, the target SNR level in decibels.
% SAMPLERATE is a float, the sample frequency common to all audio files.
% SPEECHTHRESHOLD is a float (optional, default 1e-3). It is a ratio used to
% separate speech from non-speech segments in the segmental energy
% calculation. See mergeSounds().
%
% ---
% FILEINFO.INPUTSCRIPT is a string. This is the name of the script file
% that lists all (typically clean speech) audio file names.
%
% FILEINFO.INPUT_HS is an integer, the size of the audio file header
% of all the files listed by FILEINFO.INPUTSCRIPT.
% Typically 0 for raw, 44 for Microsoft WAVE, 1024 for NIST_1A.
%
% FILEINFO.OUTPUTDIR is a string. This is the directory where we will write
% the noisy files.
%
% FILEINFO.INSERT is the string we use to produce output filenames.
% It is inserted in the original audio file names just before the extension.
%
% FILEINFO.INPUT2 is a string, the name of the noise file.
%
% FILEINFO.INPUT2_HS is an integer, the size of the audio file header
% for FILEINFO.INPUT2.
% Typically 0 for raw, 44 for Microsoft WAVE, 1024 for NIST_1A.
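%
% Example (a minimal sketch; every file and directory name below is
% hypothetical, and the output directory is assumed to exist already):
%
%   fileinfo.inputscript = 'clean_files.txt'; % one speech file name per line
%   fileinfo.input_hs    = 1024;              % speech files have NIST_1A headers
%   fileinfo.outputdir   = 'noisy';           % where the noisy files are written
%   fileinfo.insert      = '_snr10';          % 'a.wav' is written as 'a_snr10.wav'
%   fileinfo.input2      = 'noise.raw';       % noise file
%   fileinfo.input2_hs   = 0;                 % raw noise file, no header
%   mergeSoundsFromScript(fileinfo, 10, 16000); % 10 dB SNR, 16 kHz, default threshold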
%
% by Guillaume LATHOUD at IDIAP (lathoud@idiap.ch)

function mergeSoundsFromScript(fileinfo, desiredSNRdB, samplerate, speechthreshold)

if nargin < 3
    error('I need at least three parameters!')
end

if nargin < 4
    speechthreshold = 1e-3; % Ratio on max energy in a signal (used in mergeSounds())
end

% 0) read parameters
listfilename1 = fileinfo.inputscript;
headersize1 = fileinfo.input_hs;
outputdir = fileinfo.outputdir;
insert1 = fileinfo.insert;
filename2 = fileinfo.input2;
headersize2 = fileinfo.input2_hs;

% 1) read the list of speech files
%    for each filename prepare the output filename
fd = fopen(listfilename1, 'r');
if fd < 0
    error(['Cannot open script file: ' listfilename1])
end

filelist = {};
while ~feof(fd)
    s = fgetl(fd);
    if ~ischar(s)
        % fgetl returns -1 when only the end of file is left
        break
    end
    s = deblank(s);
    if ~isempty(s)
        % build the output filename
        % first remove the path, if there is one
        s2 = s;
        aux = strfind(s, '/');
        if ~isempty(aux)
            s2 = s(aux(end)+1:end);
        end
        % second insert string 'insert1' before the extension
        aux = strfind(s2, '.');
        if isempty(aux)
            s2 = strcat(s2, insert1);
        else
            s2 = strcat(s2(1:aux(end)-1), insert1, s2(aux(end):end));
        end
        i = length(filelist) + 1;
        filelist{i}.in = s;
        filelist{i}.out = fullfile(outputdir, s2);
    end
end
fclose(fd);

% 2) read the noise file
[sound2, n2] = rawread(filename2, headersize2);

% 3) loop through the list of files
for i = 1:length(filelist)
    disp([filelist{i}.in ' with ' filename2 ' to ' filelist{i}.out])

    % read the speech file
    [sound1, n1] = rawread(filelist{i}.in, headersize1);

    % merge the speech signal with the noise signal
    [soundMerged, a1, a2] = mergeSounds(sound1, sound2, desiredSNRdB, speechthreshold, samplerate, 16);

    % write out the result
    nistwrite(soundMerged, samplerate, 16, filelist{i}.out);
end