HAL Id: hal-01969317
https://hal.laas.fr/hal-01969317
Submitted on 4 Jan 2019
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Database of binaural room impulse responses of an
apartment-like environment
Fiete Winter, Hagen Wierstorf, Ariel Podlubne, Thomas Forgue, Jérôme
Manhes, Matthieu Herrb, Sascha Spors, Alexander Raake, Patrick Danès
To cite this version:
Fiete Winter, Hagen Wierstorf, Ariel Podlubne, Thomas Forgue, Jérôme Manhes, et al.. Database of
binaural room impulse responses of an apartment-like environment. Convention e-Brief of the Audio
Engineering Society, Jun 2016, Paris, France. 4p. �hal-01969317�
Convention e-Brief
Presented at the 140
thConvention
2016 June 4–7, Paris, France
This Engineering Brief was selected on the basis of a submitted synopsis. The author is solely responsible for its presentation, and the AES takes no responsibility for the contents. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Audio Engineering Society.
Database of binaural room impulse responses of an
apartment-like environment
Fiete Winter1, Hagen Wierstorf2, Ariel Podlubne3, Thomas Forgue3, Jérôme Manhès3, Matthieu Herrb3, Sascha Spors1, Alexander Raake2, and Patrick Danès4
1Institute of Communications Engineering, University of Rostock, Rostock, D-18119, Germany 2Audiovisual Technology Group, Technische Universität Ilmenau, Ilmenau, D-98693, Germany 3LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
4LAAS-CNRS, Université de Toulouse, CNRS, UPS, Toulouse, France
Correspondence should be addressed to Fiete Winter (fiete.winter@uni-rostock.de) ABSTRACT
We present a database of binaural room impulse responses (BRIRs) measured in an apartment-like environment. The BRIRs were captured at four different sound source positions, each combined with four listener positions. A head and torso simulator (HATS) with varying head-orientation in the range of ±78◦with 2◦resolution was used. Additionally, BRIRs of 20 listener positions along a trajectory connecting two of the four positions were measured, each with a fixed head-orientation. The data is provided in the Spatially Oriented Format for Acoustics (SOFA) and it is freely available under the Creative Commons (CC-BY-4.0) license. It can be used to simulate complex acoustic scenes in order to study the process of auditory scene analysis for humans and machines.
1
Introduction
One of the outstanding capabilities of the human audi-tory system is to recover information on single audiaudi-tory objects out of a mixture of sounds [1]. The EU FET-OPEN project TWO!EARSaims at developing a com-putational model that mimics this behavior. Listeners are regarded as multi-modal agents that develop their concept of the world by active, exploratory sensing. In the course of this process, they interpret percepts, applying existing and collecting new knowledge and concepts accordingly. Consequently, the TWO!EARS model involves bottom-up (signal-driven) as well as top-down (hypothesis-driven) processes.
A first open-source release of the model software is available for download [2]. It will be complemented
until the end of the project so as to provide a com-prehensive set of functions enabling new perspectives, including the neighbouring field of robot audition. The model is deployed on a binaural robot, in order to assess its suitability in real dynamic auditory scene analysis (DASA) scenarios. In addition, a software simulation stage of the robot is provided to foster the development process.
The synthesis of ear signals is thus an important basis for the development and evaluation of the TWO!EARS model. It allows to generate reproducible conditions in contrast to slightly varying signals from real-world scenarios. Binaural synthesis using Head-Related Im-pulse Responses (HRIRs) for anechoic environments and Binaural Room Impulse Responses (BRIRs) for reverberant conditions is a versatile tool to generate
Winter et al. BRIR database of an apartment-like environment
Fig. 1: ADREAM Laboratory
ear signals. For the latter, a-priori measured BRIRs de-scribing the acoustic sound propagation from a sound source to the ears of a human listener are needed. This engineering brief provides detailed information about the measurement process for the BRIRs of a complex, apartment-like environment. It is organised as follows: The details of the measurement process are described in Sec. 2. A brief data analysis is presented in Sec. 3 followed by information about accessibility of measured data given in Sec. 4.
2
Measurement Details
The impulse response measurements were conducted in the G. Giralt building of the Laboratory for Analysis and Architecture of Systems (LAAS-CNRS), Toulouse, France, which hosts the Reconfigurable Dynamic Ar-chitectures for Embedded Autonomous Mobile Sys-tems (ADREAM) project. As depicted in Fig. 1, this environment constitutes a drywall installation built into in a big hall, representing an apartment with two closed rooms (L × W × H = 4 × 4 × 2.5 m and 3 × 4 × 2.5 m) and one half-open area (4 × 4 × 2.5 m). The whole apartment has no ceiling.
2.1 Measurement Setup
The involved hardware and the data connections be-tween the devices are depicted in Fig. 2. The BRIRs were measured using logarithmic sweeps of 219 sam-ples length at a sampling rate of 44.1 kHz. The measurement signals are D/A-converted by a RME FIREFACE UC USB-soundcard and emitted by one
PC Robotics Platform (Jido) Sound Card Optical Tracking System HATS (KEMAR) Loudspeakers USB audio E T H E R N E T contr ol commands E T H E R N E T location data SERIAL head azimuth X L R / B N C audio X L R audio
Fig. 2: Involved hardware for the measurement includ-ing data flows between the components. The type of the respective physical data connections is written in capital letters, while the data con-tent is marked in bold font.
out of four Genelec 8020A two-way active loudspeak-ers. Simultaneously, the emitted sound is measured using a Head and Torso Simulator (HATS), namely the G.R.A.S. Knowles Electronics Manikin for Acous-tic Research (KEMAR) 45BB-4 [3] with Large Pin-nae (KB0091 + KB0090). The length of the BRIRs was limited to 2 seconds and the delay of the mea-surement chain was compensated a-priori. The HATS was mounted on a mobile robotics platform, named Jido (see Fig. 3a). The manikin is furthermore en-dowed with a servo motor to enable azimuthal head rotations. The motor is associated to a gearhead (ra-tio of 9) and a quadrature incremental encoder with 2048 steps (4 pulses by step), so that the 73728 pulses per revolution ensure an accuracy of 4.88 × 10−3 de-grees. It is interfaced via a serial connection through an ELMO controller that manages the sensors (Hall effect sensors, encoders and limit sensors) and controls the motor in position or velocity mode. The setpoints are sent through a standard CAN serial link.
AES 140thConvention, Paris, France, 2016 June 4–7 Page 2 of 4
(a) (b)
Fig. 3: (a) Jido mobile robot with the KEMAR mounted on its top. (b) Genelec 8020A Loud-speaker (×4). Both devices were endowed with optical tracking markers (small gray balls). The location1of the robotics platform and loudspeakers are captured by an optical tracking system composed of 28 infrared cameras. It uses small optical markers made with a retroreflective material (cf. Fig. 3) mounted on the respective devices. Its accuracy is in the millimeter range.
2.2 Measurement Positions
In the first part of the measurement, BRIRs were cap-tured for four different loudspeaker locations, each combined with four listener locations with a head-above-torso-orientation varying in the range of ±78◦ with 2◦resolution (see Fig. 5a). Hereby, a negative an-gle corresponds to turning the head to the right above the torso. While the loudspeakers remained steady over the whole measurement, the HATS had to be moved to the next location by operating the robot manually. As shown in Fig. 5c, the first loudspeaker was oriented towards a wall in order to create a case, where the direct sound has a lower amplitude than the first reflection, due to the directivity of the loudspeaker.
Within the second part, the listener was gradually moved along a trajectory connecting the measurement locations 1 and 2 from the first part (see Fig. 5b). BRIRs for each loudspeaker were measured for 0◦ 1location subsumes position and orientation within this treatise
−60◦ −40◦ −20◦ 0◦ 20◦ 40◦ 60◦ 5 10 15 20 25 30 head-orientation time / ms 5 10 15 20 25 30 time / ms Left Ear 0 dB −60 −45 −30 −15 Right Ear
(a) Loudspeaker 1, Listener Position 1
5 10 15 20 10 15 20 25 30 position on trajectory time / ms 10 15 20 25 30 time / ms Left Ear 0 dB −60 −45 −30 −15 Right Ear (b) Loudspeaker 2, Trajectory Fig. 4: Magnitude of measured BRIRs head-orientation every ≈ 25 cm on the trajectory. The loudspeaker positions remained the same compared to part one. During the whole measurement, that is part one and two, the individual sound pressure levels of all four loudspeakers were kept constant.
3
Data Analysis
A small excerpt of the measured impulse responses is shown in Fig. 4 in order to illustrate how the complex environment is affecting the structure of the BRIRs. The directivity of the loudspeakers and the orientation of the first loudspeaker towards the wall leads to the expected strong reflection, which is considerably higher than the direct sound (cf. 4a). The influence of the listener’s head is also clearly visible, as the amplitude of the reflection is reduced for the contra-lateral ear. As the listener "moves" along the trajectory and passes the doorframe connecting both closed rooms (cf. location 11 to 16 in Fig. 5b), loudspeaker 2 is no longer occluded by the walls and the direct sound for the right ear (cf. Fig. 4b) significantly increases.
Winter et al. BRIR database of an apartment-like environment x y ϕ 1.0 m 1.0 m 1 2 3 4 1 2 3 4 1: (2.55, 7.47) m, 4◦ 2: (−3.66, 7.15) m, 0◦ 3: (−3.30, 5.09) m, -5◦ 4: (−1.44, 2.98) m, 43◦ 1: (2.37, 9.96) m, -106◦ 2: (0.88, 4.70) m, -166◦ 3: (−1.17, 4.57) m, -157◦ 4: (−0.14, 6.91) m, 139◦
(a) 4 listener positions with varying head-above-torso-orientation x y ϕ 1.0 m 1.0 m 1 2 3 4 1: (2.55, 7.47) m, 4◦ 2: (−3.66, 7.15) m, 0◦ 3: (−3.30, 5.09) m, -5◦ 4: (−1.44, 2.98) m, 43◦ 1: (2.54, 9.66) m, -105◦ 2: (2.49, 9.56) m, -105◦ 3: (2.32, 9.23) m, -104◦ 4: (2.10, 8.82) m, -103◦ 5: (1.97, 8.55) m, -103◦ 6: (1.87, 8.34) m, -103◦ 7: (1.76, 8.14) m, -102◦ 8: (1.64, 7.91) m, -102◦ 9: (1.51, 7.68) m, -101◦ 10: (1.39, 7.45) m, -101◦ 11: (1.31, 7.25) m, -100◦ 12: (1.20, 7.01) m, -100◦ 13: (1.10, 6.80) m, -99◦ 14: (1.00, 6.58) m, -99◦ 15: (0.88, 6.32) m, -98◦ 16: (0.81, 6.10) m, -98◦ 17: (0.78, 5.85) m, -82◦ 18: (0.89, 5.33) m, -80◦ 19: (1.05, 4.85) m, -78◦ 20: (1.12, 4.63) m, -76◦ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
(b) Listener trajectory with 0◦ head-above-torso-orientation
(c) Loudspeaker 1 and HATS at listener posi-tion 1
(d) Loudspeaker 2, 3, and 4 and HATS at lis-tener position 3
Fig. 5: Fig. (a) and (b) show the xy-position and horizontal orientation of the listener’s torso drawn in black. Respective quantities for the four loudspeakers are drawn in red. The legend shows the respective position followed by the orientation’s azimuth angle ϕ. In addition, an arc around the listener symbol is used in (a) to illustrate the varying head-above-torso-orientation in the range of ±78◦. Fig. (c) and (d) show exemplary combinations of the listener and loudspeaker locations from (a).
4
BRIR Database
All measured BRIRs are stored in the Spatially Ori-ented Format for Acoustics (SOFA), which is the stan-dard format for spatial acoustic data of the Audio En-gineering Society [4]. The data accompanied by pho-tos from the setup is freely available at [5] and is li-censed under Creative Commons Attribution 4.0 (CC BY 4.0)2.
5
Acknowledgements
This research has been supported by EU FET grant TWO!EARS, ICT-618075.
References
[1] Cherry, E. C., “Some experiments on the recogni-tion of speech, with one and with 2 ears,” J. Acoust. Soc. Am., 25, pp. 975–979, 1953.
2http://creativecommons.org/licenses/by/4.0/
[2] Two!Ears Project, “Two!Ears Auditory Model 1.2,” 2016, doi:10.5281/zenodo.47487.
[3] G.R.A.S. Sound & Vibration, “Instruction Manual - G.R.A.S. 45BB KEMAR Head and Torso / 45BC KEMAR Head and Torso with Mouth Simulator,” 2016.
[4] Audio Engineering Society, Inc., “AES692015 -AES standard for file exchange - Spatial acoustic data file format,” 2015.
[5] Winter, F., Hagen, W., Podlubne, A., Forgue, T., Manhès, J., Herrb, M., Spors, S., Raake, A., and Danès, P., “Binaural room impulse responses of an apartment- like environment,” 2016, doi:10.5281/ zenodo.49357.
AES 140thConvention, Paris, France, 2016 June 4–7 Page 4 of 4