2004-08-17
==========
contact: mmmAdmin@idiap.ch

Tools to convert raw txt files to an XML format suitable for Transcriber.

The current directory is organized as follows:

1) README.txt: this file
========================

2) speakerSeg2XML.awk
=====================
description: awk script that converts one *.txt file to *.xml
usage: awk -f speakerSeg2XML.awk /path/file.txt
input: TXT file /path/file.txt
output: XML file dataAMI/AMI-Meeting-001/AMI-Meeting-001seg.xml
        where the string "AMI-Meeting-001" is extracted from the first
        line of the file /path/file.txt.

Example of first line (in dataAMI/AMI-Meeting-001/AMI-Meeting-001-lapelseg.txt):
/com/mmm/shared/tmp_meetings/AMI-Meeting-001/Headset-1_T000010.0_T01710.150.wav

The output path and filename dataAMI/***/***seg.xml are hardcoded in the awk
script. To use the script on other data, edit it and adapt lines 19 and 21,
which parse the first line of the *.txt files.

3) convertTXT2XMLall.pl
=======================
description: perl script that runs the awk script on every input file
usage: $ ./convertTXT2XMLall.pl

To use the script on other data, edit it and adapt line 7, which finds the
*.txt files in the data directory.
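The file discovery and output naming described above can be sketched in shell
as follows. This is not the code of convertTXT2XMLall.pl or speakerSeg2XML.awk.
The function names xml_path_for and list_conversions, the *-lapelseg.txt
pattern, and the rule that the meeting name is the directory holding the .wav
file are assumptions drawn from the example first line in this README.

```shell
#!/bin/sh
# Sketch only: hypothetical stand-in for the discovery and naming steps.

# Print the XML path implied by the first line of a segmentation file.
# Assumption: the meeting name is the directory that contains the .wav file.
xml_path_for() {
    head -n 1 "$1" | awk -F/ '{ m = $(NF-1); print "dataAMI/" m "/" m "seg.xml" }'
}

# List every input file under a data directory with the XML file it maps to.
# Assumption: input files are named *-lapelseg.txt.
list_conversions() {
    find "$1" -type f -name '*-lapelseg.txt' | sort | while IFS= read -r txt; do
        printf '>>> converting %s in XML\n' "$txt"
        printf 'Generating %s\n' "$(xml_path_for "$txt")"
    done
}

# Demo on a throwaway tree that mirrors the dataAMI layout.
demo=$(mktemp -d)
mkdir -p "$demo/dataAMI/AMI-Meeting-001"
printf '%s\n' \
    '/com/mmm/shared/tmp_meetings/AMI-Meeting-001/Headset-1_T000010.0_T01710.150.wav' \
    > "$demo/dataAMI/AMI-Meeting-001/AMI-Meeting-001-lapelseg.txt"
(cd "$demo" && list_conversions dataAMI)
rm -rf "$demo"
```

The sketch only reports what would be converted; the real conversion is done
by the awk script, which is not reproduced here.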
When running the script locally, you should get the following output from the
shell, showing that the XML files are created:

######################################
$ ./convertTXT2XMLall.pl
>>> converting dataAMI/AMI-Meeting-001/AMI-Meeting-001-lapelseg.txt in XML
Generating dataAMI/AMI-Meeting-001/AMI-Meeting-001seg.xml
>>> converting dataAMI/AMI-Meeting-002/AMI-Meeting-002-lapelseg.txt in XML
Generating dataAMI/AMI-Meeting-002/AMI-Meeting-002seg.xml
>>> converting dataAMI/AMI-Meeting-003/AMI-Meeting-003-lapelseg.txt in XML
Generating dataAMI/AMI-Meeting-003/AMI-Meeting-003seg.xml
>>> converting dataAMI/AMI-Meeting-004/AMI-Meeting-004-lapelseg.txt in XML
Generating dataAMI/AMI-Meeting-004/AMI-Meeting-004seg.xml
######################################

4) dataAMI directory
====================
organized as follows:

dataAMI/:
AMI-Meeting-001  AMI-Meeting-002  AMI-Meeting-003  AMI-Meeting-004

dataAMI/AMI-Meeting-001:
AMI-Meeting-001-lapelseg.txt  AMI-Meeting-001seg.xml

dataAMI/AMI-Meeting-002:
AMI-Meeting-002-lapelseg.txt  AMI-Meeting-002seg.xml

dataAMI/AMI-Meeting-003:
AMI-Meeting-003-lapelseg.txt  AMI-Meeting-003seg.xml

dataAMI/AMI-Meeting-004:
AMI-Meeting-004-lapelseg.txt  AMI-Meeting-004seg.xml

where AMI-Meeting-00*-lapelseg.txt are the input txt files and
AMI-Meeting-00*seg.xml are the results of the XML conversion.

5) speakerSeg2XMLnew.awk
========================
This script is the same as speakerSeg2XML.awk, except that it takes the output
file name as a parameter (outputTRSfile) and creates that file instead of
using a hardcoded path.

description: awk script that converts one *.txt file to *.xml
usage example: awk -v outputTRSfile=output.trs -f speakerSeg2XMLnew.awk /path/file.txt
input: TXT file /path/file.txt
output: TRS file (the path given in outputTRSfile)
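The way an awk script can receive its output file name through -v, as in the
usage example above, can be sketched as follows. This is an illustration of
the mechanism only, not the contents of speakerSeg2XMLnew.awk. The helper name
convert_to and the "line N:" output are hypothetical, and no real TRS markup
is produced.

```shell
#!/bin/sh
# Sketch only: shows how -v outputTRSfile=... reaches the awk program.

# Hypothetical helper: copy each line of file $1, numbered, into file $2.
convert_to() {
    awk -v outputTRSfile="$2" '
        # Create (or truncate) the output file even if the input is empty.
        BEGIN { printf "" > outputTRSfile }
        # Every input line goes to the file named on the command line.
        { print "line " NR ": " $0 > outputTRSfile }
        END { close(outputTRSfile) }
    ' "$1"
}

# Demo with temporary files; the first line is the example from this README.
src=$(mktemp)
dst=$(mktemp)
printf '%s\n' \
    '/com/mmm/shared/tmp_meetings/AMI-Meeting-001/Headset-1_T000010.0_T01710.150.wav' \
    'second line' > "$src"
convert_to "$src" "$dst"
cat "$dst"
```

Inside an awk program, the first "> file" redirection truncates the file and
later ones append to it, so a single output name set with -v is enough for the
whole run.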