[ Daniel-AV-recordings/README ]

-------------------- CONTENTS --------------------

- Three audio-visual sequences called

    Daniel_Darren_Guillaume_AV_Occlusion
    Daniel_Darren_VisualClutter
    DanielTest

  For each sequence, 8 audio files "*-channel01.wav" to "*-channel08.wav",
  and 1 video directory containing PNM files of the video.

- "rr03-25.ps.gz": a previous research report giving an example of use

- this README file.

-------------------- DESCRIPTION --------------------

The three recordings were made with one camera and one circular,
8-microphone array, placed at different locations. No precise
calibration information is available. The spatial information that I
could gather goes as follows:

- the distance between the wall supporting the camera and the wall
  visible behind the people is 3.6 m.

- of the two microphone arrays visible in the videos, only the left
  one was used (8 channels). The centre of the microphone array is
  halfway between the two walls (1.8 m from each).

The 3D audio referent (reference frame) is defined as follows:

- the (X,Y) origin is at the centre of the array, the Z origin is at
  the level of the table,
- the X axis points towards the back wall (as seen in the videos),
- the Y axis points towards the left (as seen in the videos),
- the Z axis points towards the ceiling.

Below are a few MATLAB lines to define the geometry of the circular
array. The "0.04" value for Z indicates that the array is 4 cm above
the table.

    % Microphone array geometry for all those files: 10 cm-radius circle
    a = (-2:5) * pi / 4;
    geometry = [.1 * cos(a); .1 * sin(a); 0 * a + 0.04];

Since precise 3D calibration of the camera is not available for those
sequences, we (Gatica et al.) used a simplified calibration strategy:
one of the sequences was used to build a codebook linking 2D position
in the video and (azimuth,elevation) estimates in the audio. For more
details, see paragraph 2.5 in "rr03-25.ps.gz", in this directory.

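As an illustration of how the array geometry and the coordinate axes
defined above relate to an (azimuth,elevation) direction, here is a
minimal MATLAB sketch that computes far-field time delays between
microphones for one direction. It is not the method of the research
report. The angle conventions (azimuth measured from +X towards +Y,
elevation measured above the XY plane), the speed of sound "c" and the
example angles are all assumptions made for this sketch.

```matlab
% Microphone positions (metres), same definition as in this README.
a = (-2:5) * pi / 4;
geometry = [.1 * cos(a); .1 * sin(a); 0 * a + 0.04];

% ASSUMPTIONS (not taken from the recordings):
c  = 343;              % speed of sound in m/s, room temperature
az = 30 * pi / 180;    % hypothetical azimuth, from +X towards +Y
el = 10 * pi / 180;    % hypothetical elevation above the XY plane

% Unit vector pointing from the origin towards the source.
u = [cos(el) * cos(az); cos(el) * sin(az); sin(el)];

% Far-field arrival time at each microphone, relative to the origin:
% a microphone displaced towards the source hears the wavefront earlier.
tau = -(geometry' * u) / c;                      % 8x1, seconds

% Delay between every pair of microphones: tdoa(i,j) = tau(i) - tau(j).
tdoa = tau * ones(1, 8) - ones(8, 1) * tau';     % 8x8, seconds
```

Multiplying "tdoa" by the sampling rate of the ".wav" files gives the
delays in samples. Since the array diameter is 20 cm, no pairwise delay
can exceed 0.2 / c seconds under the far-field assumption.
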
As for time synchronization between audio and video, I believe we had
some trouble with the hardware for those particular sequences, so
that:

- in theory, the first audio sample in a ".wav" file corresponds to
  the video frame with video timestamp 00:00:10:00 (the timestamp is
  visible on the videos themselves).

- in practice, for those particular videos, the actual synchronization
  may differ.

Guillaume Lathoud, May 31st, 2005