Virtual Acoustics is a powerful tool for the auralization of virtual acoustic scenes and the reproduction thereof. Getting started with VA includes three important steps
The usage of VA can often be divided into two user groups
- those who seek for quick experiments with spatial audio and are happy with conventional playback over headphones
- those who want to employ VA for a sophisticated loudspeaker setup for (multi modal) listening experiments and Virtual Reality applications
If you are willing to dive deeper into the VA framework you are probably interested in how to adapt the software package for your purposes. The following sections will describe how you can set up VA for your goal from the very beginning.
Virtual Acoustics configuration
VA can be configured using a section-based key-value parameter collection which is passed on to the core instance during initialization. This is usually done by providing a path to a text-based INI file which will be referred to as
VACore.ini but can be of arbitrary name. If you use the
VAServer application you will work with this file only. If you only use the
Redstart GUI application you will probably never use it. However, the INI file can be exported from a Redstart session in case you need it.
Paths section allows for adding search paths to the core. If resources like head-related transfer functions (HRTFs), geometry files, or audio files, are required these search paths guarantee to locate the requested files. Relative paths are resolved from the execution folder where the VA server application is started from. When using the provided batch start scripts on Windows it is recommended to add
[Paths] data = data conf = conf my_data = C:/Users/Me/Documents/AuralizationData my_other_data = /home/me/auralization/input
Files section, you can name files that will be included as further configuration files. This is helpful when certain configuration sections must be outsourced to be reused efficiently. Outsourcing is especially convenient when switching between static sections like hardware descriptions for laboratories or setups, but can also be used for rendering and reproduction modules (see below). Avoid copying larger configuration sections that are re-used frequently. Use different configuration files, instead.
[Files] old_lab = VASetup.OldLab.Loudspeakers.ini #new_lab = VASetup.NewLab.Loudspeakers.ini
Macros section is helpful to write tidy scripts. Use macros if it is not explicitly required to use a specific input file. For example, if any HRTF can be used for a receiver in the virtual scene the
DefaultHRIR will point to the default HRTF data set, or head-related impulse response (HRIR) in time domain. Any defined macros will be replaced through a given value by the core.
Usage: "$(MyMacroName)/file.abc" -> "MyValue/file.abc"
Macros are substituted forwardly by key name order (use with care), and otherwise stay untouched: A = B; C = $(A) -> $(C) is B
The example macros provided below are a good practice set which should be present in a configuration file in order to keep the example scripts valid.
Macros are also very helpful if certain exported file prefixes are desired, e.g., to get better structured file names for input and output recordings.
[Macros] DefaultHRIR = HRIR/ITA-Kunstkopf_HRIR_AP11_Pressure_Equalized_3x3_256.v17.ir.daff HumanDir = Directivity/Singer.v17.ms.daff Trumpet = Directivity/Trumpet1.v17.ms.daff # Define some other macros (examples) ProjectName = MyVirtualAcousticsProject
Debug section configures the initial behavior of the core as, for example, log level and input/output recording. If input and output recording is enabled the entire channel number of your physical or abstract device will be logged. For devices with a lot of digital inputs and outputs, the channel count may reach up to 256 channels, the maximum channel number as defined per WAV format. Additionally, the data is stored as PCM data at a resolution of 32 bit leading to high storage requirements. To avoid such excessive storage demands, only use this option if absolutely necessary. Otherwise it is recommended to only record the output channels which were set, for example, in the playback modules (see below).
In the following, some macros are used (see Macros section above).
[Debug] # Record device input and store to hard drive (will record every available input channel) InputRecordEnabled = false InputRecordFilePath = $(ProjectName)_in.wav # Record device output and store to hard drive (will record every available output channel) OutputRecordEnabled = false OutputRecordFilePath = $(ProjectName)_out.wav # Set log level: 0 = quiet; 1 = errors; 2 = warnings (default); 3 = info; 4 = verbose; 5 = trace; LogLevel = 3
To properly calibrate a rendering and reproduction system, every component in the chain has to be carefully configured. Hence the lack of being scaled by physical means, digital signals stored, for example, in a WAV file or in the buffers of the sound card, a reference point enabling a proper calibration was set. In VA, a digital value of 1.0 refers to 1 Pascal at a distance of 1 m per default. For example, a sine wave with peak value of \sqrt(2)) will retain 94 dB SPL at a distance of 1m. But this value can also be changed to 124 dB if lower amplitudes are necessary (and a sample type conversion from float to integer is performed along the output chain). This makes it necessary to use a powerful amplifier facilitating the reproduction of small sample values. Setting the internal conversion value to 124 dB avoids clipping at high values (but introduces a higher noise floor). To do so, include the following section into the configuration (the clarification comment can be dropped):
[Calibration] # The amplitude calibration mode either sets the internal conversion from # sound pressure to an electrical or digital amplitude signal (audio stream) # to 94dB (default) or to 124dB. The rendering modules will use this calibration # mode to calculate from physical values to an amplitude that can be forwarded # to the reproduction modules. If a reproduction module operates in calibrated # mode, the resulting physical sound pressure at receiver location can be maintained. DefaultAmplitudeCalibrationMode = 94dB
Audio interface configuration
The audio interface controls the backend driver and the device. In the current version, for the
Driver backend key,
ASIO is supported on Windows only, whereas
Portaudio is available on all platforms. By default, Portaudio with the default driver is used that usually produces audible sound without further ado. However, the block sizes are high and the update rates are not sufficient for real-time auralization using motion tracking. Therefore, dedicated hardware and small block sizes should be used - and ASIO is recommended for Windows platforms.
ASIO example using ASIO4ALL v2
ASIO4ALL is a useful and well-implemented intermediate layer for audio I/O making it possible to use ASIO drivers for the internal hardware (and any other audio device available). It must be installed on the PC, first.
Although it appears that the buffer size can be defined for ASIO devices, the ASIO backend will automatically detect the buffer size that has been configured by the driver when the
[Audio driver] Driver = ASIO Samplerate = 44100 Buffersize = AUTO Device = ASIO4ALL v2
AUTOvalue is set (recommended). Set the buffer size in the ASIO driver dialog of your physical device, instead. Make sure, that the sampling rates are matching.
ASIO requires a device name to be defined by each driver host. Further common hardware device names are
|Manufacturer||Device||ASIO device name|
|Focusrite||2i2, 2i4, ...||
|M-Audio||Fast Track Ultra||
|Realtek||Realtek Audio HD||
|ASIO4ALL||any windows device||
|Reaper (x64)||any Reaper device||
|Table 1: Common ASIO device driver host names|
If you do not have any latency requirements you can also use
Portaudio under Windows and other platforms. The specific device names of Portaudio interfaces can be detected, for example, using the VLC player or with Audacity. But the
default device is recommended simply because it will pick the audio device that is also registered as the default device of your system. This is, what most people need anyway, and the system tools can be used to change the output device.
Buffersize is unkown, at least the native buffer size of the audio device should be used (which is most likely
1024 for on-board chips). Otherwise, timing will behave oddly which has a negative side effect on the rendering.
[Audio driver] Driver = Portaudio Samplerate = 44100 Buffersize = 1024 Device = default
Audio hardware configuration
Setup section describes the hardware environment in detail. It might seem a bit over the top but the complex definition of hardware groups with logical and physical layers eases re-using of physical devices for special setups and also allows for multiple assignments - similar to the RME matrix concept of TotalMix, except that volume control and mute toggling can be manipulated in real-time using the VA interface instead of the ASIO control panel GUI.
The hardware configuration can be separated into inputs and outputs, but they are basically handled in the same manner. More importantly, the setup can be devided into devices of specialized types and groups that combine devices. Often, this concept is unnecessary and appears cumbersome, but there are situations where this level of complexity is required.
A device is a physical emmitter (
OutputDevice) or transducer (
InputDevice) with a fixed number of channels and assignment using (arbitrary but unique) channel indices. A broadband loudspeaker with one line input is a typical representative of the single channel
OutputDevicethat has a fixed pose in space. A pair of headphones is assigned the type
HP and usually has two channels, but no fixed pose in space.
So far, there is only an input device type called
MIC that has a single channel.
Physical devices can not directly be used for a playback in VA. A reproduction module can rather be connected with one or many
Outputs - logical groups of
Again, for headphones this seems useless because a headphone device will be represented by a virtual group of only one device. However, for loudspeaker setups this makes sense as, for example, a setup of 7 loudspeakers for spatial reproduction may be used by different groups which combine only 5, 4, 3, or 2 of the available loudspeakers to form an output group. In this case, only the loudspeaker identifiers are required and channels and positions are made available by the physical device description. Following this strategy, repositioning of loudspeakers and re-assignment of channel indices is less error prone due to its organization in one configuration section, only.
Headphone setup example
Let us assume you have a pair of Sennheiser HD 650 headphones at your disposal and you want to use it for binaural rendering and reproduction. This is the most common application of VA and will result in the following configuration:
If you want to use another output jack for some reason change your channels accordingly, say to
[Setup] [OutputDevice:SennheiserHD650] Type = HP Description = Sennheiser HD 650 headphone hardware device Channels = 1,2 [Output:DesktopHP] Description = Desktop user with headphones Devices = SennheiserHD650
Loudspeaker setup example
Let us assume you have a square-shaped loudspeaker setup of Neumann KH120 at your disposal. You want to use it for binaural rendering and reproduction. This is the a common application of VA for a dynamic listening experiment in a hearing booth. For this scenario, the configuration file may like this:
Note: The order of devices in the output group is irrelevant for the final result. Each LS will receive the corresponding signal on the channel of the device.
[Setup] [OutputDevice:NeumannKH120_FL] Type = LS Description = Neumann KH 120 in front left corner of square Channels = 1 [OutputDevice:NeumannKH120_FR] Type = LS Description = Neumann KH 120 in front right corner of square Channels = 2 [OutputDevice:NeumannKH120_RR] Type = LS Description = Neumann KH 120 in rear right corner of square Channels = 3 [OutputDevice:NeumannKH120_RL] Type = LS Description = Neumann KH 120 in rear left corner of square Channels = 4 [Output:HearingBoothLabLS] Description = Hearing booth laboratory loudspeaker setup Devices = NeumannKH120_FL, NeumannKH120_FR, NeumannKH120_RR, NeumannKH120_RL
Microphone setup example
The audio input configuration is similar to the output configuration but is not yet fully included in VA. If you want to use input channels as signal sources for a virtual sound source assign the provided unmanaged signals called
audioinput1, audioinput2, ... . The number refers to the input channel index beginning with 1 and you can get the signals by using the getters
[Setup] [InputDevice:NeumannTLM170] Type = MIC Description = Neumann TLM 170 Channels = 1 [Input:BodyMic] Description = Hearing booth talk back microphone Devices = NeumannTLM170
To override default values concerning the homogeneous medium that is provided by VA, include the following section and modify the values to your needs (the default values are shown here).
[HomogeneousMedium] DefaultSoundSpeed = 344.0 # m/s DefaultStaticPressure = 101125.0 # [Pa] DefaultTemperature = 20.0 # [Degree centigrade] DefaultRelativeHumidity = 20.0 # [Percent] DefaultShiftSpeed = 0.0, 0.0, 0.0 # 3D vector in m/s
Rendering module configuration
To instantiate a rendering module, a section with a
Renderer: suffix has to be included. The statement following
: will be the unique identifier of this rendering instance. If you want to change parameters during execution this identifier is required to call the instance. Although all renderers require some obligatory definitions, a detailed description is necessary for the specific parameter set. For typical renderers, some examples are given below.
Required rendering module parameters
The rendering class refers to the type of renderer which can be taken from the tables in the overview section.
Class = RENDERING_CLASS Reproductions = REPRODUCTION_INSTANCE(S)
Reproductionsdescribes how to configure connections to reproduction modules. At least one reproduction module has to be defined but the rendering stream can also be connected to multiple reproductions of same or different type (e.g., talkthrough, equalized headphones and cross-talk cancellation). The only restriction is that the rendering output channel number has to match the reproduction module's input channel number. This prevents connecting a two-channel binaural renderer with, for example, an Ambisonics reproduction which would take at least 4 channels.
Optional rendering module parameters
Description = Some informative description of this rendering module instance Enabled = true OutputDetectorEnabled = false RecordOutputEnabled = false RecordOutputFileName = renderer_out.wav RecordOutputBaseFolder = recordings/MyRenderer
Note: until version 2018a, the record output file was only controlled by a file path key namedRendering modules can be enabled and disabled to speed up setup changes without copying & pasting larger parts of a configuration section, as especially reproduction modules can only be instantiated if the sound card provides enough channels. This makes testing on a desktop PC and switching to a laboratory environment easier.
RecordOutputFilePath. The file name and base folder has been introduced in 2018b. Now, the folder and file name can be modified during runtime, see simulation and recording section.
For rendering modules, only the output can be observed. A stream detector for the output can be activated that will produce level meter values, for example, for a GUI widget. The output of the active listener can also be recorded and exported as a WAV file. Recording starts with initialization and is exported to the hard disc drive after finalization impliciting that data is kept in the RAM. If a high channel number is required and/or long recording sessions are planned it is recommended to route the output through a DAW, instead, i.e. with ASIO re-routing software devices like Reapers ReaRoute ASIO driver. To include a more versatile output file name (macros are allowed).
Binaural free field renderer (class
This example with all available key/value configuration pairs is include in the default
VACore.ini settings which is generated from the repository's
VACore.ini.proto (by CMake). It requires a reproduction called
MyTalkthroughHeadphones, shown further below.
A more detailed explanation of the motion model and further parameters are provided in the documentation specifying how the rendering works.
[Renderer:MyBinauralFreeField] Class = BinauralFreeField Enabled = true Reproductions = MyTalkthroughHeadphones HRIRFilterLength = 256 MotionModelNumHistoryKeys = 10000 MotionModelWindowSize = 0.1 MotionModelWindowDelay = 0.1 MotionModelLogInputSources = false MotionModelLogEstimatedOutputSources = false MotionModelLogInputReceiver = false MotionModelLogEstimatedOutputReceiver = false SwitchingAlgorithm = linear OutputDetectorEnabled = false RecordOutputEnabled = false RecordOutputFilePath = MyRenderer_filename_may_including_$(ProjectName)_macro.wav
VBAP free field renderer (class
Output (3-d positions of a loudspeaker setup) to render channel-based audio. Otherwise, it works similar to other free field renderers.
[Renderer:MyVBAPFreefield] Class = VBAPFreeField Enabled = true Output = VRLab_Horizontal_LS Reproductions = MixdownHeadphones
Ambisonics free field renderer (class
Similar to binaural free field renderer, but evaluates receiver directions based on a decomposition into spherical harmonics with a specific order (
TruncationOrder). It requires a reproduction called
MyAmbisonicsDecoder which is shown further below.
[Renderer:MyAmbisonicsFreeField] Class = AmbisonicsFreeField Enabled = true Reproductions = MyAmbisonicsDecoder TruncationOrder = 3 MotionModelNumHistoryKeys = 10000 MotionModelWindowSize = 0.1 MotionModelWindowDelay = 0.1 MotionModelLogInputSources = false MotionModelLogEstimatedOutputSources = false MotionModelLogInputReceiver = false MotionModelLogEstimatedOutputReceiver = false SwitchingAlgorithm = linear OutputDetectorEnabled = false RecordOutputEnabled = false RecordOutputFilePath = MyRenderer_filename_may_including_$(ProjectName)_macro.wav
Ambient mixing renderer (class
The ambient mixer takes the value of the key
OutputGroup and accordingly sets the channel count for playback as subsequent reproduction modules require matching channels. However, an arbitrary number of reproduction modules can be specified, as shown in the following example.
[Renderer:MyAmbientMixer] Class = AmbientMixer Description = Low-cost renderer to make sound audible without spatializations Enabled = true OutputGroup = MyDesktopHP Reproductions = MyDesktopHP, MySubwooferArray
Binaural artificial room acoustics renderer (class
Values and angles are specified in SI units (e.g., seconds, meters, watts, etc.) and angles, respectively. The reverberation time may exceed the reverberation filter length (divided by the sampling rate) resulting in a cropped impulse response. This renderer requires and uses the sound receiver HRIR for spatialization and applies a sound power correction to match with direct sound energy if used together with the binaural free field renderer.
[Renderer:MyBinauralArtificialRoom] Class = BinauralArtificialReverb Description = Low-cost per receiver artificial reverberation effect Enabled = true Reproductions = MyTalkthroughHeadphones ReverberationTime = 0.71 RoomVolume = 200 RoomSurfaceArea = 88 MaxReverbFilterLengthSamples = 88200 PositionThreshold = 1.0 AngleThresholdDegree = 30 SoundPowerCorrectionFactor = 0.05 TimeSlotResolution = 0.005 MaxReflectionDensity = 12000.0 ScatteringCoefficient = 0.1
Binaural room acoustics renderer (class
Requires the Room Acoustics for Virtual ENvironments (RAVEN) software module (see Research section) or other room acoustics simulation backends. Note that the reverberation time may exceed the reverberation filter length (divided by the sampling rate) with the consequence that the generated impulse response will be cropped. This renderer requires and uses the specified sound receiver HRIR data set for spatialization and applies a sound power correction to match with direct sound energy if combined with binaural free field renderer.
[Renderer:MyBinauralRoomAcoustics] Class = BinauralRoomAcoustics Enabled = true Description = Renderer with room acoustics simulation backend (RAVEN) for a source-receiver-pair with geometry-aware propagation Reproductions = MyTalkthroughHeadphones # Setup options: Local, Remote, Hybrid Setup = Local ServerIP = PC-SEACEN HybridLocalTasks = DS HybridRemoteTasks = ER_IS, DD_RT RavenDataBasePath = $(raven_data) # Task processing (Timeout = with desired update rate, for resource efficient processing; EventSync = process on request (for sporadic updates); Continuous = update as often as possible, for standalone server) TaskProcessing = Timeout # Desired update rates in Hz, may lead to resource issues UpdateRateDS = 12.0 UpdateRateER = 6.0 UpdateRateDD = 1.0 MaxReverbFilterLengthSamples = 88200 DirectSoundPowerCorrectionFactor = 0.3
Prototype free field renderer (class
Similar to binaural free field renderer with the capability of handling multi-channel receiver directivities. This renderer can, for example, be used for recording the output of microphone array simulations.
[Renderer:MyPrototypeFreeField] Class = PrototypeFreeField Enabled = true Reproductions = MyTalkthroughHeadphones MotionModelNumHistoryKeys = 10000 MotionModelWindowSize = 0.2 MotionModelWindowDelay = 0.1 MotionModelLogInputSources = false MotionModelLogEstimatedOutputSources = false MotionModelLogInputReceivers = false MotionModelLogEstimatedOutputReceivers = false SwitchingAlgorithm = linear
Prototype generic path renderer (class
Channel count and length can be specified arbitrarily but is limited by the computational power available. Filtering is done individually for each source-receiver pair.
[Renderer:MyPrototypeGenericPath] Class = PrototypeGenericPath Enabled = true Reproductions = MyTalkthroughHeadphones NumChannels = 2 IRFilterLengthSamples = 88200 IRFilterDelaySamples = 0 OutputMonitoring = true
Binaural air traffic noise renderer (class
Filtering is done individually for each source-receiver pair. Involved filters the simulation of propagation paths can also be exchanged by the user for prototyping (requires a modification of simulation flags in the configuration file).
[Renderer:MyAirTrafficNoiseRenderer] Class = BinauralAirTrafficNoise Enabled = true Reproductions = MyTalkthroughHeadphones MotionModelNumHistoryKeys = 1000 MotionModelWindowSize = 2 MotionModelWindowDelay = 1 MotionModelLogInputSources = false MotionModelLogEstimatedOutputSources = false MotionModelLogInputReceivers = false MotionModelLogEstimatedOutputReceivers = false GroundPlanePosition = 0.0 PropagationDelayExternalSimulation = false GroundReflectionExternalSimulation = false DirectivityExternalSimulation = false AirAbsorptionExternalSimulation = false SpreadingLossExternalSimulation = false TemporalVariationsExternalSimulation = false SwitchingAlgorithm = cubicspline
Dummy renderer (class
Useful for a quick configuration of your own prototype renderer.
[Renderer:MyDummyRenderer] class = PrototypeDummy Description = Dummy renderer for testing, benchmarking and building upon Enabled = true OutputGroup = MyDesktopHP Reproductions = MyTalkthroughHeadphones
Other rendering module examples
Every specific rendering module has its own specific set of parameters. The discussion of every functional detail is out of scope of this introduction. As all configurations are parsed in the constructor of the respective module, their functionality can sometimes only be fully understood by investigating the source code. For facilitation, the Redstart GUI application includes dialogs to create and interact with those renderers, additionally offering information when hovering over the GUI elements.
Reproduction module configuration
To instantiate a reproduction module, a section with a
Reproduction: suffix has to be included. The statement following
: will be the unique identifier of this reproduction instance. If you want to change parameters during execution, this identifier is required to call the instance. All reproduction modules require some obligatory definitions but for every specific parameter set, a detailed description is necessary. For typical reproduction modules, some examples are given below.
Required reproduction module parameters
The reproduction class refers to the type of reproduction as provided in the section overview.
Class = REPRODUCTION_CLASS Outputs = OUTPUT_GROUP(S)
Outputsdescribes the connections to logical output groups that forward audio based on the configured channels. At least one output group has to be defined but the reproduction stream can also be connected to multiple outputs of same or different type (e.g., different pairs of headphones). The only restriction is that the reproduction channel number has to match with the channel count of the output group(s).
Optional reproduction module parameters
Description = Some informative description of this reproduction module instance Enabled = true InputDetectorEnabled = false RecordInputEnabled = false RecordInputFilePath = MyReproInput_filename_may_including_$(ProjectName)_macro.wav OutputDetectorEnabled = false RecordInputEnabled = false RecordInputFileName = reproduction_in.wav RecordInputBaseFolder = recordings/MyReproduction RecordOutputEnabled = false RecordOutputFileName = reproduction_out.wav RecordOutputBaseFolder = recordings/MyReproduction
Note: until version 2018a, the record input / output file was only controlled by a file path key namedReproduction modules can be enabled and disabled to speed up setup changes without copy & pasting larger parts of a configuration section as especially output groups can only be instantiated if the sound card provides enough channels. This makes testing on a desktop and switching to a lab environment easier.
RecordOutputFilePath. The file name and base folder has been introduced in 2018b. Now, the folder and file name can be modified during runtime, see simulation and recording section.
For reproduction modules, the input and output can be observed. A stream detector on input and output can be activated that will produce level meter values, to be used in a GUI widget or so. The input of a reproduction module may include several superposed rendering streams (in constrast to the rendering output), for example, for direct sound and reverberant sound. The output of a reproduction can also be recorded and exported to a WAV file. The recording starts at initialization and is exported to hard drive after finalization implicating that data is kept in the RAM. If a lot of channel numbers are required and/or long recording sessions are planned it is recommended to route the output through a DAW using, for example, ASIO re-routing software devices like Reapers ReaRoute ASIO driver. Macros are useful to include a more versatile output file name.
Talkthrough reproduction (class
The following example with all available key/value configuration pairs is taken from the default
VACore.ini settings which is generated from the repository's
VACore.ini.proto (by CMake). It requires an output called
[Reproduction:MyTalkthroughHeadphones] Class = Talkthrough Enabled = true Description = Generic talkthrough to output group Outputs = MyDesktopHP InputDetectorEnabled = false OutputDetectorEnabled = false RecordInputEnabled = false RecordInputFilePath = $(ProjectName)_Reproduction_MyTalkthroughHeadphones_Input.wav RecordOutputEnabled = false RecordOutputFilePath = $(ProjectName)_Reproduction_MyTalkthroughHeadphones_Output.wav
Low-frequency / subwoofer mixing reproduction (class
[Reproduction:MySubwooferMixer] Class = LowFrequencyMixer Enabled = true Description = Generic low frequency (subwoofer) loudspeaker mixer Outputs = Cave_SW MixingChannels = ALL # Can also be a single channel, e.g. zero order of Ambisonics stream
Equalized headphones reproduction (class
Two-channel equalization using FIR filtering based on post-processed inverse headphone impulse responses measured through in-ear microphones.
[Reproduction:MyHD600] Class = Headphones Description = Equalized Sennheiser HD600 headphones Enabled = true # Headphone impulse response inverse file path (can be normalized, but gain must then be applied for calibration) HpIRInvFile = HD600_all_eq_128_stereo.wav HpIRInvFilterLength = 22050 # optional, can also be obtained from IR filter length # Headphone impulse response inverse gain for calibration ( HpIR * HpIRInv == 0dB ) HpIRInvCalibrationGainDecibel = 0.1 Outputs = MyHD600HP
Multi-channel cross-talk cancellation reproduction (class
Requires an output called
MyDesktopLS. In case of a dynamic NCTC reproduction, only one receiver can be tracked (indicated by
TrackedListenerID which is orientated and located based on a real-world pose).
DelaySamples shifts the final CTC filters to obtain causal filters. The amount of the delay has to be set reasonably regarding
CTCFilterLength (e.g., apply a shift of half the filter length).
[Reproduction:MyNCTC] Class = NCTC Enabled = true Description = Crosstalk cancellation for N loudspeaker Outputs = MyDesktopLS TrackedListenerID = 1 # algorithm: reg|... Algorithm = reg RegularizationBeta = 0.001 DelaySamples = 2048 CrossTalkCancellationFactor = 1.0 WaveIncidenceAngleCompensationFactor = 1.0 UseTrackedListenerHRIR = false CTCDefaultHRIR = $(DefaultHRIR) Optimization = OPTIMIZATION_NONE
Higher-order Ambisonics decoding (class
Creates a decoding matrix based on a given output configuration, but can only be used for one output.
[Reproduction:MyAmbisonics] Class = HOA Enabled = true Description = Higher-Order Ambisonics TruncationOrder = 3 Algorithm = HOA Outputs = VRLab_Horizontal_LS ReproductionCenterPos = AUTO # or x,y,z
Ambisonics binaural mixdown (class
Encodes the individual orientations of loudspeakers in a loudspeaker setup using binaural technology based on the
VirtualOutput group. It can also be used for a virtual Ambisonics downmix with ideal spatial sampling layout.
[Reproduction:AmbisonicsBinauralMixdown] Class = AmbisonicsBinauralMixdown Enabled = true Description = Binaural mixdown of virtual loudspeaker setup using HRIR techniques TruncationOrder = 3 Outputs = MyDesktopHP VirtualOutput = MyDesktopLS TrackedListenerID = 1 HRIRFilterLength = 128
Other reproduction module examples
Every specific reproduction module has its own specific set of parameters. The discussion of every functional detail is out of scope of this introduction. As all configurations are parsed in the constructor of the respective module, their functionality can sometimes only be fully understood by investigating the source code. For facilitation, the Redstart GUI application includes dialogs to create and interact with those renderers, additionally offering information when hovering over the GUI elements.
Controlling a Virtual Acoustics instance
Once your VA application is running as configured, you eventually want to create a virtual scene and modify its entities. Scene control is possible via scripts and tracking devices (e.g, NaturalPoint's OptiTrack). The VA interface provides a list of methods which lets you trigger updates and control settings.
Control VA using Matlab
The most common way to control VA for prototyping, testing, and in the scope of listening experiments is by using MathWorks' Matlab. VA provides a Matlab binding and a convenience class called
itaVA. Once initialized, the class object can be connected to the VA server application over a TCP/IP network connection (or the local network port), as already described in the overview section on controlling VA.
You can find the
itaVA.m Matlab class along with the required files for communication with VA in the VA package under the
matlab folder. In case you are building and deploying
VAMatlab on your own (for your platform), or if it is missing, look out for
build_itaVA*.m scripts that will generate the convenience class around the
VAMatlab executable. Adding this folder to the Matlab path list, will enable permanent access from the console, independently of the current working directory.
To get started, inspect the example files and use Matlab's bash completion on an instance of the
itaVA class to receive self explanatory functions, i.e., when executing
The list of available methods is sorted by getter and setter nomenclature (
va = itaVA
va.set_*), followed by the entity (
sound_portal), and the actual action. To create an entity, directivities and more, use the
Note: All example calls to control VA are shown in Matlab code style. The naming convention in other scripting languages, however, is very similar. C++ and C# methods use capitalized words without underscores.
Control VA using Python
A Python VA module is available facilitating network access. It can be installed to be executed from anywhere, or it can be copied to and executed from a local folder. To obtain the package and example scripts, download a package that includes the Python binding (only available for Python 3.6 and recent compilers).
Control VA using Unity
Unity, a 3D and scripting development environment for games and Virtual Reality applications, allows a more intuitive and playful way to use VA. The
VAUnity C# scripts extend a Unity
GameObject and communicates properties to a VA server. Therefore, a C# VA binding, which comes with the binary packages in the download section, is required. No knowledge of a scripting or programming language, only a copy of Unity is required using this method. How to use VA and Unity is described in the README file of the project repository.
Global gain and muting
To control the global input gains (sound card software input channels), use
va.set_input_gain( 1.0 ) # for values between 0 and 1
To mute the input, use
va.set_input_muted( true ) # or false to unmute
The same is true for the global output gain (sound card software output channels)
va.set_output_gain( 1.0 )
va.set_output_muted( true ) # or false to unmute
Global auralization mode
The auralization mode is combined in the renderers by a logical AND combination of global auralization mode, sound receiver auralization mode and sound source auralization mode. The deactivation of an acoustic phenomenon such as, for example, the spreading loss, will affect all rendered sound paths.
Find the appropriate identifier for every auralization mode in the overview table.
va.set_global_auralization_mode( '-DS' ) # ... to disable direct sound
va.set_global_auralization_mode( '+SL' ) # ... to enable spreading loss, e.g. 1/r distance law
The VA log level at server side can be changed using
Increasing the log level is potentially helpful to detect problems if the current log level is not high enough to throw an indicative warning message.
va.set_log_level( 3 ) # 0 = quiet; 1 = errors; 2 = warnings (default); 3 = info; 4 = verbose; 5 = trace;
At runtime, search paths can be added to the VA server using
Note, that the search path has to be available at server side if you are not running VA on the same machine. Wherever possible, add search paths and use file names only. Never use absolute paths for input files. If your server is not running on the same machine, consider adding search paths via the configuration at startup.
va.add_search_path( './your/data/' )
Query registered modules
To retrieve information on the available modules, use
This method will return any registered VA module, including all renderer and reproduction modules as well as the core itself.
modules = va.get_modules()
All modules can be called using
out_args = va.call_module( 'module_id', in_args )
out_argare structs with specific fields which depend on the module you are calling. Usually, a struct field with the name
inforeturns useful information on how to work with the respective module:
va.call_module( 'module_id', struct('help',true) )
To work with renderers, use
Again, all parameters are returned as structs. More information on a parameter set can be obtained using structs containing the field
renderers = va.get_rendering_modules()
params = va.get_renderer_parameters( 'renderer_id' )
va.set_renderer_parameters( 'renderer_id', params )
info. It is good practice to use the parameter getter and inspect the key/value pairs before modifying and re-setting the module with the new parameters.
For reproduction modules, use
Querying and re-setting parameters works in the same way as described for rendering and reproduction modules.
reproductions = va.get_reproduction_modules()
params = va.get_reproduction_parameters( 'reproduction_id' )
va.set_reproduction_parameters( 'reproduction_id', params )
How to create and modify a scene in Virtual Acoustics
In VA, everything that is not static is considered part of a dynamic scene. All sound sources, sound portals, sound receivers, underlying geometry and source/receiver directivities are potentially dynamic and therefore are stored and accessed using a history concept. They can be modified, however, during lifetime. Renderers are picking up modifications and react upon the new state, for example, when a sound source is moved or a sound receiver is rotated.
Updates are triggered asynchronously by the user or by another application and can also be synchronized ensuring that all signals are started or stopped within one audio frame.
Sound sources can be created by using
S = va.create_sound_source()
or created and optionally assigned a name
S = va.create_sound_source( 'Car' )
Swill contain a unique numerical identifier which is required to modify the sound source.
A sound source (as well as a sound receiver) can only be auralized if it has been placed somewhere in 3D space. Otherwise it remains in an invalid state.
Specify a position as a three-dimensional vector ...
va.set_sound_source_position( S, [ x y z ] )
... and an orientation using a four-dimensional quaternion
following the quaternion coefficient order
va.set_sound_source_orientation( S, [ a b c d ] )
a + bi + cj + dk.
It is also possible to set both values at once using a pose (position and orientation)
va.set_sound_source_pose( S, [ x y z ], [ a b c d ] )
You may also use a special view-and-up vector orientation, where the default view vector points towards negative Z direction and the up vector points towards positive Y direction according to a right-handed OpenGL coordinate system.
va.set_sound_source_orientation_view_up( S, [ vx vy vz ], [ ux uy uz ] )
The corresponding getter functions are
p = va.get_sound_source_position( S ) q = va.get_sound_source_orientation( S ) [ p, q ] = va.get_sound_source_pose( S ) [ v, u ] = va.get_sound_source_orientation_view_up( S )
p = [x y z]',
q = [a b c d]',
v = [vx vy vz]', and
u = [ux uy uz]', where
'symbolizes the vector transpose.
To get or set the name of a sound source, use
va.set_sound_source_name( S, 'AnotherCar' ) sound_source_name = va.get_sound_source_name( S )
Specific parameter structs can be set or retrieved. They depend on special features and are used for prototyping, for example, if sound sources require additional values for new renderers.
va.set_sound_source_parameters( S, params ) params = va.get_sound_source_parameters( S )
The auralization mode can be modified and returned using
This call would, for example, activate the direct sound. Other variants include
va.set_sound_source_auralization_mode( S, '+DS' ) am = va.get_sound_source_auralization_mode( S )
va.set_sound_source_auralization_mode( S, '-DS' ) va.set_sound_source_auralization_mode( S, 'DS, IS, DD' ) va.set_sound_source_auralization_mode( S, 'ALL' ) va.set_sound_source_auralization_mode( S, 'NONE' ) va.set_sound_source_auralization_mode( S, '' )
Sound sources can be assigned a directivity with a numerical identifier by
The handling of directivities is described below in the input data section.
va.set_sound_source_directivity( S, D ) D = va.get_sound_source_directivity( S )
To mute (true) and unmute (false) a source, type
va.set_sound_source_muted( S, true ) mute_state = va.get_sound_source_muted( S )
To control the level of a sound source, assign the sound power in watts
The default value of 31.67 mW (65 dB re 10e-12 Watts) corresponds to 1 Pascal (94.0 dB SPL re 20e-6 Pascal ) in a distance of 1 m for spherical spreading. The final gain of a sound source is linked to the input signal, which is explained below. However, a digital signal with an RMS value of 1.0 (e.g., a sine wave with peak value of sqrt(2)) will retain 94 dB SPL @ 1m. A directivity may alter this value for a certain direction, but a calibrated directivity will not change the overall excited sound power of the sound source when integrating over a hull.
va.set_sound_source_sound_power( S, P ) P = va.get_sound_source_sound_power( S )
A list of all available sound sources returns the function
source_ids = va.get_sound_source_ids()
Sound sources can be deleted with
va.delete_sound_source( S )
In contrast to all other sound objects, sound sources can be assigned a signal source. It feeds the sound pressure time series for that source and is referred to as the signal (speech, music, sounds). See below for more information on signal sources. The combination with the sound power and the directivity (if assigned), the signal source influences the time-dependent sound emitted from the source. For a calibrated auralization, the combination of the three components have to match physically.
va.set_sound_source_signal_source( sound_source_id, signal_source_id )
Except for the sound power method and the signal source adapter, all sound source methods are equally valid for sound receivers (see above). Just substitute
receiver. A receiver can also be a human listener, in which case the receiver directivity will be an HRTF.
The VA interfaces provides some special features for receivers that are meaningful only in binaural technology. The head-above-torso orientation (HATO) of a human listener can be set and received as quaternion by the methods
In common datasets like the FABIAN HRTF dataset (can be obtained from the OpenDAFF project website), only a certain range within the horizontal plane (around positive Y axis according to right-handed, Cartesian OpenGL coordinates) is present, that accounts for simplified head rotations with a fixed torso. Many listening experiments are conducted in a fixed seat and the user's head orientation is tracked. Here, a HATO HRTF appears more suitable, at least if an artificial head is used.
va.set_sound_receiver_head_above_torso_orientation( sound_receiver_id, [ a b c d ] ) q = va.get_sound_receiver_head_above_torso_orientation( sound_receiver_id )
Additionally, in Virtual Reality applications with loudspeaker-based setups, user motion is typically tracked inside a specific area. Some reproduction systems require knowledge on the exact position of the user's head and torso to apply adaptive sweet spot handling (like cross-talk cancellation). The VA interface therefore includes some receiver-oriented methods that extend the virtual pose with a so called real-world pose. Hardware in a lab and the user's absolute position and orientation (pose) should be set using one of the following setters
Corresponding getters are
va.set_sound_receiver_real_world_pose( sound_receiver_id, [ x y z ], [ a b c d ] ) va.set_sound_receiver_real_world_position_orientation_vu( sound_receiver_id, [ x y z ], [ vx vy vz ], [ ux uy uz ] )
Also, HATOs are supported (in case a future reproduction module makes use of HATO HRTFs)
[ p, q ] = va.get_sound_receiver_real_world_pose( sound_receiver_id ) [ p, v, u ] = va.get_sound_receiver_real_world_position_orientation_vu( sound_receiver_id )
va.set_sound_receiver_real_world_head_above_torso_orientation( sound_receiver_id, [ x y z w ] ) q = va.set_sound_receiver_real_world_head_above_torso_orientation( sound_receiver_id )
Sound portals have been added to the interface for future usage but are currently not supported by the available renderer. Their main purpose will include building acoustics applications, where portals are combined to form flanking transmissions through walls and ducts.
Sound signals or signal sources represent the sound pressure time series that are emitted by a source.
Some are unmanaged and are directly available, others have to be created. To get a list with detailed information on currently available signal sources (including those created at runtime), type
In general, a signal source is attached to one ore many sound sources like this:
va.set_sound_source_signal_source( sound_source_id, signal_source_id )
Buffer signal source
Audio files that can be attached to sound sources are usually single channel anechoic WAV files. In VA, an audio clip can be loaded as a buffer signal source with special control mechanisms. It supports macros and uses the search paths to locate a file. Using relative paths is highly recommended. Two examples are provided in the following:
signal_source_id = va.create_signal_source_buffer_from_file( 'filename.wav' )
demo_signal_source_id = va.create_signal_source_buffer_from_file( '$(DemoSound)' )
DemoSoundmacro points to the 'Welcome to Virtual Acoustics' anechoically recorded file in WAV format, which resides in the common
datafolder. Make sure that the VA application can find the common
datafolder, which is also added as a search path in the default configurations.
Now, the signal source can be attached to a sound source using
Any buffer signal source can be started, stopped and paused. Also, it can be set to looping or non-looping mode (default).
va.set_sound_source_signal_source( sound_source_id, signal_source_id )
To receive the current state of the buffer signal source, use
va.set_signal_source_buffer_playback_action( signal_source_id, 'play' ) va.set_signal_source_buffer_playback_action( signal_source_id, 'pause' ) va.set_signal_source_buffer_playback_action( signal_source_id, 'stop' ) va.set_signal_source_buffer_looping( signal_source_id, true )
playback_state = va.get_signal_source_buffer_playback_state( signal_source_id )
Input device signal sources
Input channels from the sound card can be directly used as signal sources (microphones, electrical instruments, etc) and are unmanaged (can not be created or deleted). All channels are made available individually on startup and are integrated as list of signal sources by
va.set_sound_source_signal_source( sound_source_id, 'inputdevice1' )
for the first channel, and so on.
Text-to-speech (TTS) signal source
The TTS signal source allows to generate speech from text input. Because it uses the commercial CereProc's CereVoice third party library, it is not included in the VA package for public download. However, if you have access to the CereVoice library and can build VA with TTS support, this is how it works in
Do not forget that a signal source can only be auralized in combination with a sound source. For more information, refer to the text-to-speech example in the ITA-Toolbox for Matlab.
tts_signal_source = va.create_signal_source_text_to_speech( 'Heathers beautiful voice' ) tts_in = struct(); tts_in.voice = 'Heather'; tts_in.id = 'id_welcome_to_va'; tts_in.prepare_text = 'welcome to virtual acoustics'; tts_in.direct_playback = true; va.set_signal_source_parameters( tts_signal_source, tts_in )
Other signal sources
VA also provides specialized signal sources which can not be covered in detail in this introduction. Please refer to the source code for proper usage.
Scenes are a prototype-like definition to allow renderers to act differently depending on the requested scene identifier. This is useful when implementing different behaviour based on a user-triggered scene that should be loaded as, for example, a room acoustic situation or a city soundscape. Most renderers will ignore these calls, but renderers like the room acoustics renderer uses this concept as long as direct geometry handling is not fully implemented.
Directivities (including HRTFs)
Sound source and receiver directivities are usually made available as a file resource including multiple directions on a sphere for far-field usage. VA currently supports the OpenDAFF format with time domain and magnitude spectrum content type. They can be loaded with
VA ships with the ITA artificial head HRTF dataset (actually, the DAFF exports this dataset as HRIR in time domain), which is available under Creative Commons license for academic use.
directivity_id = va.create_directivity_from_file( 'my_individual_hrtf.daff' )
The default configuration files and Redstart sessions include this HRTF dataset as
DefaultHRIRmacro, and it can be created using
Make sure that the VA application can find the common
directivity_id = va.create_directivity_from_file( '$(DefaultHRIR)' )
datafolder, which is also added as an include path in default configurations.
Directivities can be assigned to a source or receiver with
va.set_sound_source_directivity( sound_source_id, directivity_id )
va.set_sound_receiver_directivity( sound_source_id, directivity_id )
VA provides support for rudimentary homogeneous medium parameters that can be set by the user. The data is accessed by rendering and reproduction modules (mostly to receive the sound speed value for delay calculation). Values are always in SI units (meters, seconds, etc). Additionally, a user-defined set of parameters is provided in case a prototyping renderer requires further specialized medium information (may also be used for non-homogeneous definitions). Here is the overview of setters and getters:
Speed of sound in m/s
va.set_homogeneous_medium_sound_speed( 343.0 ) sound_speed = va.get_homogeneous_medium_sound_speed()
Temperature in degree Celsius
va.set_homogeneous_medium_temperature( 20.0 ) temperature = va.get_homogeneous_medium_temperature()
Static pressure in Pascal, defaults to the norm atmosphere
va.set_homogeneous_medium_static_pressure( 101325.0 ) static_pressure = va.get_homogeneous_medium_static_pressure()
Relative humidity in percentage (ranging from 0.0 to 100.0 or above)
va.set_homogeneous_medium_relative_humidity( 75.0 ) humidity = va.get_homogeneous_medium_relative_humidity()
Medium shift / 3D wind speed in m/s
va.set_homogeneous_medium_shift_speed( [ x y z ] ) shift_speed = va.get_homogeneous_medium_relative_humidity()
Prototyping parameters (user-defined struct)
va.set_homogeneous_medium_parameters( medium_params ) medium_params = va.get_homogeneous_medium_relative_humidity()
Geometry interface calls are for future use and are currently not supported by the available renderers. The concept behind geometry handling is real-time environment manipulation for indoor and outdoor scenarios using VR technology like Unity or plugin adapters from CAD modelling applications like SketchUp.
Acoustic material interface calls are for future use and are currently not supported by available renderers. Materials are closely connected to geometry, as a geometrical surface can be linked to acoustic properties represented by the material.
Solving synchronisation issues
Scripting languages like Matlab are problematic by nature when it comes to timing: evaluation duration scatters unpredictability and timers are not precise enough. This becomes a major issue when, for example, a continuous motion of a sound source should be performed with a clean Doppler shift. A simple loop with a timeout will result in audible motion jitter as the timing for each loop body execution is significantly diverging. Also, if a music band should start playing at the same time and the start is executed by subsequent scripting lines, it is very likely that they end up out of sync.
To avoid timing problems, the VA Matlab binding provides a high-performance timer that is implemented in C++. It should be used wherever a synchronous update is required, mostly for moving sound sources or sound receivers. An example for a properly synchronized update loop at 60 Hertz that incrementally drives a source from the origin into positive X direction until it is 100 meters away:
S = va.create_sound_source() va.set_timer( 1 / 60 ) x = 0 while( x < 100 ) va.wait_for_timer; va.set_sound_source_position( S, [ x 0 0 ] ) x = x + 0.01 end va.delete_sound_source( S )
Synchronizing multiple updates
VA can execute updates synchronously in the granularity of the block rate of the audio stream process. Every scene update will be withhold until the update is unlocked. This feature is mainly used for simultaneous playback start.
va.lock_update va.set_signal_source_buffer_playback_action( drums, 'play' ) va.set_signal_source_buffer_playback_action( keys, 'play' ) va.set_signal_source_buffer_playback_action( base, 'play' ) va.set_signal_source_buffer_playback_action( sax, 'play' ) va.set_signal_source_buffer_playback_action( vocals, 'play' ) va.unlock_update
It is also useful for uniform movements of spatially static sound sources (like a vehicle with four wheels). However, locking updates will inevitably lock out other clients (like trackers) and should be released as soon as possible.
va.lock_update va.set_sound_source_position( wheel1, p1 ) va.set_sound_source_position( wheel2, p2 ) va.set_sound_source_position( wheel3, p3 ) va.set_sound_source_position( wheel4, p4 ) va.unlock_update
Audio rendering, next to reproduction, is the heart of VA. Rendering instances combine user information to auralize sound, in a unique way and with a dedicated purpose. Audio renderers are informed by the VA core about scene changes (asynchronous updates) which are triggered by the user. The task of each rendering instance is to adapt the requested changes as fast as possible.
Rendering modules work pretty much on their own. They feature, however, some common and some specialized methods for interaction.
To get a list of available modules, use
renderer_ids = va.get_rendering_modules()
Every rendering instance can be muted/unmuted and the output gain can be controlled.
va.set_rendering_module_muted( renderer_id, true ) va.set_rendering_module_gain( renderer_id, 1.0 ) mute_state = va.get_rendering_module_muted( renderer_id ) gain = va.get_rendering_module_gain( renderer_id )
Renderers may also be masked by auralization modes. To enable or disable certain auralization modes, use for example
va.set_rendering_module_auralization_mode( renderer_id, '-DS' ) va.set_rendering_module_auralization_mode( renderer_id, '+DS' )
To obtain and set parameters, type
va.set_rendering_module_parameters( renderer_id, in_params ) out_params = va.get_rendering_module_parameters( renderer_id, request_params )
request_paramscan usually be empty, but if a key
infois present, the rendering module will provide usage information.
A special feature that has been requested for Virtual Reality (background music, instructional speech, operator's voice) provides a sound source and sound receiver create method that will only be effective for the given rendering instance (explicit renderer). This is required if ambient clips should be played back without spatialization, or if certain circumstances demand that a source is only processed by one single renderer. In this way, computational power can be saved.
The sound sources and receivers created with this method are handled like normal entities but are only effective for the explicit rendering instance.
sound_source_id = va.create_sound_source_explicit_renderer( renderer_id, 'HitButtonEffect' ) sound_receiver_id = va.create_sound_receiver_explicit_renderer( renderer_id, 'SurveillanceCamMic' )
Binaural free field renderer
For a proper time synchronization of this renderer with other renderers, a static delay which is added to the propagation delay simulation can be set. This static delay is defined by a special parameter using a struct. In the following example, it is set to 100ms.
in_struct = struct() in_struct.AdditionalStaticDelaySeconds = 0.100 va.set_rendering_module_parameters( renderer_id, in_struct )
Special features of this renderer include individualized HRIRs. The anthropometric parameters are derived from a specific key/value layout of the receiver parameters combined under the key
anthroparams. All parameters are provided in units of meters.
in_struct = struct() in_struct.anthroparams = struct() in_struct.anthroparams.headwidth = 0.12 in_struct.anthroparams.headheight = 0.10 in_struct.anthroparams.headdepth = 0.15 va.set_sound_receiver_parameters( sound_receiver_id, in_struct )
The current anthropometric parameters can be obtained by
params = va.get_sound_receiver_parameters( sound_receiver_id, struct() ) disp( params.anthroparams )
Prototype generic path renderer
This renderer can update impulse responses through the VA interface and will exchange incoming data in real-time for a requested source-receiver pair. It can be used as a powerful prototyping tool that gives instant audible results for A/B comparisons. At ITA, it is used to create binaural (two-channel) FIR filtering-based renderer within Matlab as part of the laboratory course on Acoustic Virtual Reality.
In the examples below, the propagation path from source 1 to receiver 1 is updated. If no verbose output is required, just drop the verbose key.
To trigger an update from a file resource, a specialized struct has to be created:
in_struct = struct() in_struct.receiver = 1 in_struct.source = 1 in_struct.verbose = 1 in_struct.filepath = CologneDomeAmbisonicsIRMeasurement.wav va.set_rendering_module_parameters( renderer_id, in_struct )
If a certain channel should be updated, say channel 3, add
in_struct.channel = 3
To trigger an update by sending impulse response samples directly (in this example, two channels are used, but also more channels are possible), compile another specialized struct like the following:
in_struct = struct() in_struct.receiver = 1 in_struct.source = 1 in_struct.verbose = 1 in_struct.ch1 = [ 1 0 0 0 ... ] in_struct.ch2 = [ 0 0 1 0 ... ] va.set_rendering_module_parameters( renderer_id, in_struct )
This example struct will exchange a non-delayed Dirac impulse for the first channel and a Dirac with 2 samples delay on the second channel. Of course, an entire measured or simulated impulse response will be used in common applications.
Audio reproduction modules receive spatialized audio streams from audio renderer modules. Most of them work independently from user input but some require knowledge about the user's real-world pose in the reproduction environment.
Rendering modules work pretty much on their own. They feature, however, some common and some specialized methods for interaction.
To get a list of available modules, use
reproduction_ids = va.get_reproduction_modules()
Every reproduction instance can be muted/unmuted and its output gain can be controlled.
va.set_reproduction_module_muted( reproduction_id, true ) va.set_reproduction_module_gain( reproduction_id, 1.0 ) gain = mute_state = va.get_reproduction_module_muted( reproduction_id ) va.get_reproduction_module_gain( reproduction_id )
To obtain and set parameters, type
va.set_reproduction_module_parameters( reproduction_id, in_params ) out_params = va.get_reproduction_module_parameters( reproduction_id, request_params )
request_paramscan usually be empty, but if a key
infois present, the reproduction module will provide usage information.
Multi-channel cross-talk cancellation
The N-CTC reproduction requires exact knowledge on the user's ear canal positions. Therefore, it can only be used for a single sound receiver and the module evaluates the real-world pose that can be set by the interface call (or by tracking as described below).
Some additional parameters can be modified during real-time processing for immediate evaluation. The additional delay (in seconds) shifts the resulting CTC filters to create causality. The CTC factor and WICK factor control for the smoothing of the initial HRTF in order to gain better transmission quality and a wider sweet spot while trading off signal-to-noise ratio of the binaural performance. When setting the WICK factor to zero, the N-CTC module acts like a multi-channel transaural stereo reproduction with simple panning and constant group delay.
in_params = struct() in_params.AdditionalDelayTime = 0.100 in_params.CrossTalkCancellationFactor = 1.0 in_params.WaveIncidenceAngleCompensation = 1.0 va.set_reproduction_module_parameters( reproduction_id, in_params )
During runtime, the inverted FIR filter for the headphone equalization can be exchanged. This is helpful to investigate the effect of the equalization performance by direct comparison. To maintain 0 dB playback, if the inverse FIR filter changes the signal's energy, a calibration gain factor can optionally be passed, either as factor or as decibel value.
in_params = struct() in_params.HpIRInvFile = HD650_individualized_eq.wav in_params.HPIRInvCalibrationGain = 1.0 in_params.HPIRInvCalibrationGainDecibel = 0.0 va.set_reproduction_module_parameters( reproduction_id, in_params )
VA does not support tracking internally but facilitates the integration of tracking devices to update VA entities. For external tracking, the
VAMatlab project currently supports NaturalPoint's OptiTrack devices to be connected to a server instance. It can automatically forward rigid body poses (head and torso, separately) to one sound receiver and one sound source. Another possibility is to use an HMD such as Oculus Rift and HTC Vive and update VA through Unity.
OptiTrack via VAMatlab
To connect an OptiTrack rigid body to a VA sound entity (here, a receiver with id 1 was defined), use
va.set_tracked_sound_receiver( 1 )
To also include the real-world pose (as required by some reproduction modules like the N-CTC reproduction module), also execute
va.set_tracked_real_world_sound_receiver( 1 )
If the rigid body index should be changed (e.g., to index 3 for head and 4 for torso), use
va.set_tracked_sound_receiver_head_rigid_body_index( 3 ) va.set_tracked_sound_receiver_torso_rigid_body_index( 4 )
The head rigid body (rb) can also be locally transformed using a translation and (quaternion) rotation method, e.g., if the rigid body barycenter is not between the ears or is rotated against the default orientation:
va.set_tracked_sound_receiver_head_rb_trans( [ x y z ] ) va.set_tracked_real_world_sound_receiver_head_rb_rotation( [ a b c d ] )
For the real-world sound receiver, similar methods exist:
va.set_tracked_real_world_sound_receiver_head_rigid_body_index( 3 ) va.set_tracked_real_world_sound_receiver_torso_rigid_body_index( 4 ) va.set_tracked_real_world_sound_receiver_head_rb_trans( [ x y z ] ) va.set_tracked_real_world_sound_receiver_head_rb_rotation( [ a b c d ] )
The sound source methods are almost equal, except that the
receiver has to be substituted, as shown for a sound source with id 1 at rigid body index 5 in the following example:
va.set_tracked_sound_source( 1 ) va.set_tracked_sound_source_rigid_body_index( 5 ) va.set_tracked_sound_source_rigid_body_translation( [ x y z ] ) va.set_tracked_sound_source_rigid_body_rotation( [x y z w ] )
To finally connect to the tracker that is running on the same machine and pushes to
localhost network loopback device, use
In case that the tracker is running on another machine, OptiTrack requires to both set the remote (in this example 192.168.1.2) AND the client machine IP (in this example 192.168.1.143) like this
va.connect_tracker( '22.214.171.124', '126.96.36.199' )
HMD via VAUnity
To connect an HMD, set up a Unity scene and connect the tracked GameObject (usually the MainCamera) with a VAUSoundReceiver instance. For further details, please read the README files of VAUnity.
Simulation and recording
As already pointed out, VA can be used to record simulated acoustic environments. The only requirement is to activate the output recording flag in the configuration and add a target file name and base folder where to store the recordings, as described in the rendering and reproduction module setup sections. Outputs from the rendering modules and inputs to reproduction modules can be used to record spatial audio samples (like binaural clips or Ambisonics B-format / HOA tracks). Outputs from reproductions can be used for offline playback over the given loudspeaker setup, e.g. for (audio-visual) demonstrations or non-interactive listening experiments.
Two different approaches can be used:
a) capturing the real-time audio streams
b) emulating a sound card and processing the audio stream offline
Capturing the output of rendering and reproduction modules in real-time
If you want to capture the real-time output of rendering and reproduction modules, VA is driven by the sound card and live updates - like tracking device data or HMD movements - are effective. The scene is updated according to the user interaction and are bound to the update rate of the tracking device or control loop timeout.
This approach is helpful to record simple scenes with synchronized audio-visual content, e.g. when using Unity3D and preparing a video for demonstration purposes.
Capturing the output of rendering and reproduction modules by an emulated virtual audio device (offline mode)
For sound field simulations with high-precision timing for physics-based audio rendering, using a lot of scene adaptions in combination with a very small audio processing block sizes is a wise desision. Therefore, one should switch to the offline simulation capability of VA.
A virtual sound device can be activated that suspends the timeout-driven block processing and gives the control of the audio processing speed into the user's hands. This allows to change the scene without time limits and every audio processing block can be triggered to process the entire new scene, no matter how small the block size (in real-time mode, typically 10 times more audio blocks are processed during a single scene update, for example a translation of the listener triggered by a tracking device). Additionally, there is no hardware constraint as the rendering and reproduction calculations are not bound to deliver real-time update rates. This makes it possible to set up scenes of arbitrary complexity for the cost of a longer calculation of the processing chain including rendering, reproduction, recording and export to generate the audio files.
Virtual sound card audio driver configuration
To enable the emulated sound card and set it up for the Matlab example script
itaVA_example_offline_simulation.m (open in git) and
itaVA_example_offline_simulation_ir.m (open in git), modify your configuration as follows
[Audio driver] Driver = Virtual Device = Trigger Samplerate = 44100 Buffersize = 64 Channels = 2
Controlling the virtual card audio processing (
To modify the internal timing, set the clock by an increment of the blocks to be processed, e.g.
To trigger a new audio block to be processed, run
% Clock increment of 64 samples at a sampling rate of 44.1kHz manual_clock = manual_clock + 64 / 44100; va.call_module( 'manualclock', struct( 'time', manual_clock ) );
% Process audio chain by incrementing one block va.call_module( 'virtualaudiodevice', struct( 'trigger', true ) );
These incrementations are usually called at the end of a simulation processing loop. Any scene change prior to that will be effectively auralized. For example to implement a dynamic room acoustics situation with an animation path, a generic path renderer can be used and a full room acoustics simulation of 10 seconds runtime can be executed prior and the filter exchange, making every simulation step change audible.
Changing the recording file paths during runtime (
When recording offline simulations, it is often helpful to store the recorded audio file to a different folder each time a script is executed. To do so, VA accepts a parameter call to modify the base folder and the file name as follows:
% Single-line calls va.set_rendering_module_parameters( 'MyRenderer', struct( 'RecordOutputFileName', 'modified_reproduction_out.wav' ) ); va.set_rendering_module_parameters( 'MyRenderer', struct( 'RecordOutputBaseFolder', ... fullfile( 'recordings/MyRenderer', datestr( datetime( 'now' ) ) ) ) ); % with date and time folder % Struct call mStruct = struct(); mStruct.RecordOutputFileName = 'modified_rendering_out.wav'; mStruct.RecordOutputBaseFolder = fullfile( 'recordings/MyRenderer', datestr( datetime( 'now' ) ) ); % with date and time folder va.set_rendering_module_parameters( 'MyRenderer', mStruct );
For reproduction modules, the same parameter update is possible including the input stream file name and base folder:
% Struct call mStruct = struct(); mStruct.RecordInputFileName = 'modified_reproduction_in.wav'; mStruct.RecordInputBaseFolder = fullfile( 'recordings/MyReproduction', datestr( datetime( 'now' ) ) ); % with date and time folder mStruct.RecordOutputFileName = 'modified_reproduction_out.wav'; mStruct.RecordOutputBaseFolder = fullfile( 'recordings/MyReproduction', datestr( datetime( 'now' ) ) ); % with date and time folder va.set_reproduction_module_parameters( 'MyReproduction', mStruct );
Note: this feature is available since v2018b.
Here are some common use cases and a full description on how to set up a VA server and create a corresponding scene.
Binaural sound source circulating around a listener
Involved application: Redstart
Recommended playback device: Headphones
Note: Shortcuts are indicated in brackets.
- Open up Redstart and create a binaural session (N, B). Leave everything to default.
- Start the session (F5)
- Now, open Run > Circulating source (R, C), leave everything to default and hit the Start button
- You should hear the welcome track as virtual sound source circulating around your head (based on the default ITA artificial head HRTF data set).
Now, try to change parameters and listen to the effect these changes have on the auralization. To test your own files, create a new binaural session and override the default macros.