Getting started

Auralization with Virtual Acoustics
This content is available under CC BY 4.0

Preface

Virtual Acoustics is a powerful tool for the auralization of virtual acoustic scenes and their reproduction. Getting started with VA involves three important steps:

  • Configuring the application
  • Controlling the core
  • Setting up a scene

The overall design goal is to keep things as simple as possible, but some aspects are inherently complex and cannot be simplified further. VA addresses professionals and is mainly used by scientists. Important features are never traded for convenience if the system's integrity is at stake. Hence, getting the most out of VA requires a profound understanding of the technologies involved. VA is designed to offer the highest flexibility, which comes at the price of a demanding configuration. In the beginning, configuring VA is not trivial, especially if a loudspeaker-based audio reproduction is to be used.

Users of VA can roughly be divided into two groups:
  • those who seek quick experiments with spatial audio and are happy with conventional playback over headphones
  • those who want to employ VA for sophisticated loudspeaker setups for (multi-modal) listening experiments and Virtual Reality applications

For the first group of users, some simple setups will already suffice for most of what you aspire to. Such setups include, for example, a configuration for binaural audio rendering over a non-equalized off-the-shelf pair of headphones. Another configuration example contains a self-crafted interactive rendering application that exchanges pre-recorded or simulated FIR filters using Matlab or Python scripts for different purposes, such as room acoustics simulations, building acoustics, or A/B live switching tests to assess the influence of equalization. The configuration effort is minimal and works out of the box if you use the Redstart applications or start a VA command line server with the corresponding core configuration file. If you consider yourself part of this group of users, skip the configuration part and have a look at the examples. Thereafter, read the control section and the scene handling section.

If you are willing to dive deeper into the VA framework you are probably interested in how to adapt the software package for your purposes. The following sections will describe how you can set up VA for your goal from the very beginning.



Virtual Acoustics configuration

VA can be configured using a section-based key-value parameter collection, which is passed to the core instance during initialization. This is usually done by providing the path to a text-based INI file, which will be referred to as VACore.ini but can have any name. If you use the VAServer application, you will work with this file only. If you only use the Redstart GUI application, you will probably never need it. However, the INI file can be exported from a Redstart session in case you do.

Basic configuration

Paths

The Paths section allows for adding search paths to the core. If resources like head-related transfer functions (HRTFs), geometry files, or audio files are required, these search paths are used to locate the requested files. Relative paths are resolved from the execution folder where the VA server application is started. When using the provided batch start scripts on Windows, it is recommended to add the data and conf folders.

[Paths]

data = data
conf = conf

my_data = C:/Users/Me/Documents/AuralizationData
my_other_data = /home/me/auralization/input

Files

In the Files section, you can name files that will be included as further configuration files. This is helpful when certain configuration sections should be outsourced for efficient reuse. Outsourcing is especially convenient when switching between static sections like hardware descriptions for laboratories or setups, but can also be used for rendering and reproduction modules (see below). Avoid copying larger configuration sections that are reused frequently; use different configuration files instead.

[Files]

old_lab = VASetup.OldLab.Loudspeakers.ini
#new_lab = VASetup.NewLab.Loudspeakers.ini

Macros

The Macros section is helpful for writing tidy scripts. Use macros whenever a specific input file is not explicitly required. For example, if any HRTF can be used for a receiver in the virtual scene, the DefaultHRIR macro will point to the default HRTF data set, or head-related impulse response (HRIR) in the time domain. Any defined macro will be replaced with the given value by the core.
Usage: "$(MyMacroName)/file.abc" -> "MyValue/file.abc"
Macros are substituted forwardly in key name order (use with care) and otherwise stay untouched: with A = B and C = $(A), $(C) resolves to B.
The example macros provided below are a good-practice set that should be present in a configuration file in order to keep the example scripts valid.
Macros are also very helpful if certain exported file prefixes are desired, e.g., to get better structured file names for input and output recordings.

[Macros]

DefaultHRIR = HRIR/ITA-Kunstkopf_HRIR_AP11_Pressure_Equalized_3x3_256.v17.ir.daff
HumanDir = Directivity/Singer.v17.ms.daff
Trumpet = Directivity/Trumpet1.v17.ms.daff

# Define some other macros (examples)
ProjectName = MyVirtualAcousticsProject

Debug

The Debug section configures the initial behavior of the core, for example, the log level and input/output recording. If input or output recording is enabled, every channel of your physical or virtual device will be recorded. For devices with many digital inputs and outputs, the channel count may reach up to 256 channels, the maximum channel number defined by the WAV format. Additionally, the data is stored as PCM data at a resolution of 32 bit, leading to high storage requirements. To avoid excessive storage demands, only use this option if absolutely necessary. Otherwise it is recommended to record only the output channels that were set, for example, in the playback modules (see below).
In the following, some macros are used (see Macros section above).

[Debug]

# Record device input and store to hard drive (will record every available input channel)
InputRecordEnabled = false
InputRecordFilePath = $(ProjectName)_in.wav

# Record device output and store to hard drive (will record every available output channel)
OutputRecordEnabled = false
OutputRecordFilePath = $(ProjectName)_out.wav

# Set log level: 0 = quiet; 1 = errors; 2 = warnings (default); 3 = info; 4 = verbose; 5 = trace;
LogLevel = 3

Calibration

To properly calibrate a rendering and reproduction system, every component in the chain has to be carefully configured. Since digital signals, stored, for example, in a WAV file or in the buffers of the sound card, lack a physical scale, a reference point enabling proper calibration was defined. In VA, a digital value of 1.0 refers to 1 Pascal at a distance of 1 m by default. For example, a sine wave with a peak value of sqrt(2) corresponds to 94 dB SPL at a distance of 1 m. This reference can also be changed to 124 dB if lower amplitudes are necessary (and a sample type conversion from float to integer is performed along the output chain); this requires a powerful amplifier that facilitates the reproduction of small sample values. Setting the internal conversion value to 124 dB avoids clipping at high values (but introduces a higher noise floor). To do so, include the following section in the configuration (the clarifying comment can be dropped):

[Calibration]

# The amplitude calibration mode either sets the internal conversion from
# sound pressure to an electrical or digital amplitude signal (audio stream)
# to 94dB (default) or to 124dB. The rendering modules will use this calibration
# mode to calculate from physical values to an amplitude that can be forwarded
# to the reproduction modules. If a reproduction module operates in calibrated
# mode, the resulting physical sound pressure at receiver location can be maintained.

DefaultAmplitudeCalibrationMode = 94dB
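
As a quick plausibility check of this convention, the level of a unit-amplitude digital signal can be verified in a few lines of Matlab (a sketch following the 94 dB reference described above):

% Sanity check: a digital RMS value of 1.0 is interpreted as 1 Pascal,
% which corresponds to 94 dB SPL re 20 uPa at 1 m distance.
p_rms = 1.0;    % digital RMS value, interpreted as sound pressure in Pascal
p_ref = 20e-6;  % reference sound pressure in Pascal
spl = 20 * log10( p_rms / p_ref )   % yields approximately 94 dB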

Audio interface configuration

The audio interface controls the backend driver and the device. In the current version, the Driver backend key supports ASIO on Windows only, whereas Portaudio is available on all platforms. By default, Portaudio with the default driver is used, which usually produces audible sound without further ado. However, the block sizes are large and the update rates are not sufficient for real-time auralization using motion tracking. Therefore, dedicated hardware and small block sizes should be used - and ASIO is recommended for Windows platforms.

ASIO example using ASIO4ALL v2

ASIO4ALL is a useful and well-implemented intermediate layer for audio I/O that makes it possible to use ASIO drivers for internal hardware (and any other audio device available). It must be installed on the PC first.

[Audio driver]

Driver = ASIO
Samplerate = 44100
Buffersize = AUTO
Device = ASIO4ALL v2

Although it appears that the buffer size can be defined for ASIO devices, the ASIO backend will automatically detect the buffer size configured by the driver when the AUTO value is set (recommended). Set the buffer size in the ASIO driver dialog of your physical device instead. Make sure that the sampling rates match.
ASIO requires a device name to be defined by each driver host. Further common hardware device names are

Manufacturer   Device               ASIO device name
RME            Hammerfall DSP       ASIO Hammerfall DSP
RME            Fireface USB         ASIO Fireface USB
RME            MADIface USB         ASIO MADIface USB
Focusrite      2i2, 2i4, ...        Focusrite USB 2.0 Audio Driver
M-Audio        Fast Track Ultra     M-Audio Fast Track Ultra ASIO
Steinberg      UR22 MK2             Yamaha Steinberg USB ASIO
Realtek        Realtek Audio HD     Realtek ASIO
Zoom           H6                   ZOOM H and F Series ASIO
ASIO4ALL       any Windows device   ASIO4ALL v2
Reaper (x64)   any Reaper device    ReaRoute ASIO (x64)

Table 1: Common ASIO device driver host names

If you do not have latency requirements, you can also use Portaudio under Windows and other platforms. The specific device names of Portaudio interfaces can be detected, for example, using the VLC player or Audacity. The default device is recommended, though, simply because it will pick the audio device that is registered as the default device of your system. This is what most people need anyway, and the system tools can be used to change the output device.
If the Buffersize is unknown, the native buffer size of the audio device should be used (most likely 1024 for on-board chips). Otherwise, timing will behave oddly, which has a negative side effect on the rendering.

[Audio driver]

Driver = Portaudio
Samplerate = 44100
Buffersize = 1024
Device = default

Audio hardware configuration

The Setup section describes the hardware environment in detail. It might seem a bit over the top, but the definition of hardware groups with logical and physical layers eases the reuse of physical devices for special setups and also allows for multiple assignments - similar to the RME TotalMix matrix concept, except that volume control and mute toggling can be manipulated in real time using the VA interface instead of the ASIO control panel GUI.
The hardware configuration can be separated into inputs and outputs, but both are handled in the same manner. More importantly, the setup can be divided into devices of specialized types and groups that combine devices. Often this concept is unnecessary and appears cumbersome, but there are situations where this level of complexity is required.
A device is a physical sound emitter (OutputDevice) or receiver (InputDevice) with a fixed number of channels and an assignment using (arbitrary but unique) channel indices. A broadband loudspeaker with one line input is a typical representative of the single-channel LS type OutputDevice that has a fixed pose in space. A pair of headphones is assigned the type HP and usually has two channels, but no fixed pose in space.
So far, there is only one input device type, called MIC, that has a single channel.

Physical devices cannot be used directly for playback in VA. Instead, a reproduction module is connected with one or many Outputs - logical groups of OutputDevices.
Again, for headphones this seems pointless because a headphone device will be represented by a group of only one device. For loudspeaker setups, however, this makes sense: for example, a setup of 7 loudspeakers for spatial reproduction may be used by different groups that combine only 5, 4, 3, or 2 of the available loudspeakers to form an output group. In this case, only the loudspeaker identifiers are required, and channels and positions are provided by the physical device descriptions. Following this strategy, repositioning loudspeakers and re-assigning channel indices is less error prone because everything is organized in a single configuration section.
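
For illustration, a sketch of output groups sharing the same physical devices could look like this (all device and group names are made-up examples):

[OutputDevice:LS01]
Type = LS
Description = Loudspeaker in front left corner
Channels = 1

[OutputDevice:LS02]
Type = LS
Description = Loudspeaker in front right corner
Channels = 2

[OutputDevice:LS03]
Type = LS
Description = Loudspeaker at center position
Channels = 3

[Output:StereoPair]
Description = Two-channel subset of the full setup
Devices = LS01, LS02

[Output:FrontTriplet]
Description = Three-channel group re-using the same devices
Devices = LS01, LS02, LS03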

Headphone setup example

Let us assume you have a pair of Sennheiser HD 650 headphones at your disposal and you want to use it for binaural rendering and reproduction. This is the most common application of VA and will result in the following configuration:

[Setup]

[OutputDevice:SennheiserHD650]
Type = HP
Description = Sennheiser HD 650 headphone hardware device
Channels = 1,2

[Output:DesktopHP]
Description = Desktop user with headphones
Devices = SennheiserHD650

If you want to use another output jack for some reason change your channels accordingly, say to 3,4.

Loudspeaker setup example

Let us assume you have a square-shaped loudspeaker setup of Neumann KH 120 monitors at your disposal and want to use it for binaural rendering and reproduction. This is a common application of VA for a dynamic listening experiment in a hearing booth. For this scenario, the configuration file may look like this:

[Setup]

[OutputDevice:NeumannKH120_FL]
Type = LS
Description = Neumann KH 120 in front left corner of square
Channels = 1

[OutputDevice:NeumannKH120_FR]
Type = LS
Description = Neumann KH 120 in front right corner of square
Channels = 2

[OutputDevice:NeumannKH120_RR]
Type = LS
Description = Neumann KH 120 in rear right corner of square
Channels = 3

[OutputDevice:NeumannKH120_RL]
Type = LS
Description = Neumann KH 120 in rear left corner of square
Channels = 4

[Output:HearingBoothLabLS]
Description = Hearing booth laboratory loudspeaker setup
Devices = NeumannKH120_FL, NeumannKH120_FR, NeumannKH120_RR, NeumannKH120_RL

Note: The order of devices in the output group is irrelevant for the final result. Each loudspeaker will receive the corresponding signal on the channel of its device.

Microphone setup example

The audio input configuration is similar to the output configuration but is not yet fully integrated in VA. If you want to use input channels as signal sources for a virtual sound source, assign the provided unmanaged signals called audioinput1, audioinput2, ... . The number refers to the input channel index beginning with 1, and you can inspect the signals using the getters GetSignalSourceInfos or GetSignalSourceIDs.

[Setup]

[InputDevice:NeumannTLM170]
Type = MIC
Description = Neumann TLM 170
Channels = 1

[Input:BodyMic]
Description = Hearing booth talk back microphone
Devices = NeumannTLM170
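
In a control script, such an input channel can then be attached to a virtual sound source directly (a sketch using the unmanaged signal identifiers introduced above):

% Feed the first sound card input channel into a virtual sound source
S = va.create_sound_source( 'TalkbackMic' );
va.set_sound_source_signal_source( S, 'audioinput1' );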

Homogeneous medium

To override default values concerning the homogeneous medium that is provided by VA, include the following section and modify the values to your needs (the default values are shown here).

[HomogeneousMedium]

DefaultSoundSpeed = 344.0 # m/s
DefaultStaticPressure = 101325.0 # [Pa]
DefaultTemperature = 20.0 # [Degree centigrade]
DefaultRelativeHumidity = 20.0 # [Percent]
DefaultShiftSpeed = 0.0, 0.0, 0.0 # 3D vector in m/s

Rendering module configuration

To instantiate a rendering module, a section with a Renderer: prefix has to be included. The identifier following the colon will be the unique name of this rendering instance. If you want to change parameters during execution, this identifier is required to address the instance. Although all renderers share some obligatory definitions, the specific parameter set of each class requires a detailed description. For typical renderers, some examples are given below.

Required rendering module parameters

Class = RENDERING_CLASS
Reproductions = REPRODUCTION_INSTANCE(S)
The rendering class refers to the type of renderer, which can be taken from the tables in the overview section.
The Reproductions key describes the connections to reproduction modules. At least one reproduction module has to be defined, but the rendering stream can also be connected to multiple reproductions of the same or different type (e.g., talkthrough, equalized headphones, and cross-talk cancellation). The only restriction is that the rendering output channel number has to match the reproduction module's input channel number. This prevents connecting a two-channel binaural renderer to, for example, an Ambisonics reproduction, which would take at least 4 channels.

Optional rendering module parameters

Description = Some informative description of this rendering module instance
Enabled = true
OutputDetectorEnabled = false
RecordOutputEnabled = false
RecordOutputFilePath = MyRenderer_filename_may_including_$(ProjectName)_macro.wav
Rendering modules can be enabled and disabled to speed up setup changes without copying and pasting larger parts of a configuration section, especially since reproduction modules can only be instantiated if the sound card provides enough channels. This makes testing on a desktop PC and switching to a laboratory environment easier.
For rendering modules, only the output can be observed. A stream detector for the output can be activated that produces level meter values, for example, for a GUI widget. The output of the renderer can also be recorded and exported as a WAV file. Recording starts with initialization and is exported to the hard disk drive after finalization, implying that the data is kept in RAM. If a high channel count is required and/or long recording sessions are planned, it is recommended to route the output through a DAW instead, e.g., with ASIO re-routing software devices like Reaper's ReaRoute ASIO driver. Macros are allowed in the record file path and help to create more versatile output file names.

Binaural free field renderer (class BinauralFreeField) example

This example with all available key/value configuration pairs is included in the default VACore.ini settings, which is generated from the repository's VACore.ini.proto (by CMake). It requires a reproduction module called MyTalkthroughHeadphones, shown further below.

[Renderer:MyBinauralFreeField]
Class = BinauralFreeField
Enabled = true
Reproductions = MyTalkthroughHeadphones
HRIRFilterLength = 256
MotionModelNumHistoryKeys = 10000
MotionModelWindowSize = 0.1
MotionModelWindowDelay = 0.1
MotionModelLogInputSources = false
MotionModelLogEstimatedOutputSources = false
MotionModelLogInputReceiver = false
MotionModelLogEstimatedOutputReceiver = false
SwitchingAlgorithm = linear
OutputDetectorEnabled = false
RecordOutputEnabled = false
RecordOutputFilePath = MyRenderer_filename_may_including_$(ProjectName)_macro.wav
A more detailed explanation of the motion model and further parameters are provided in the documentation specifying how the rendering works.

VBAP free field renderer (class VBAPFreeField) example

Requires an Output (the 3D positions of a loudspeaker setup) to render channel-based audio. Otherwise, it works similarly to the other free field renderers.

[Renderer:MyVBAPFreefield]
Class = VBAPFreeField
Enabled = true
Output = VRLab_Horizontal_LS
Reproductions = MixdownHeadphones

Ambisonics free field renderer (class AmbisonicsFreeField) example

Similar to binaural free field renderer, but evaluates receiver directions based on a decomposition into spherical harmonics with a specific order (TruncationOrder). It requires a reproduction called MyAmbisonicsDecoder which is shown further below.

[Renderer:MyAmbisonicsFreeField]
Class = AmbisonicsFreeField
Enabled = true
Reproductions = MyAmbisonicsDecoder
TruncationOrder = 3
MotionModelNumHistoryKeys = 10000
MotionModelWindowSize = 0.1
MotionModelWindowDelay = 0.1
MotionModelLogInputSources = false
MotionModelLogEstimatedOutputSources = false
MotionModelLogInputReceiver = false
MotionModelLogEstimatedOutputReceiver = false
SwitchingAlgorithm = linear
OutputDetectorEnabled = false
RecordOutputEnabled = false
RecordOutputFilePath = MyRenderer_filename_may_including_$(ProjectName)_macro.wav

Ambient mixing renderer (class AmbientMixer) example

The ambient mixer takes the value of the key OutputGroup and sets the channel count for playback accordingly, as subsequent reproduction modules require matching channel counts. An arbitrary number of reproduction modules can be specified, as shown in the following example.

[Renderer:MyAmbientMixer]
Class = AmbientMixer
Description = Low-cost renderer to make sound audible without spatializations
Enabled = true
OutputGroup = MyDesktopHP
Reproductions = MyDesktopHP, MySubwooferArray

Binaural artificial room acoustics renderer (class BinauralArtificialReverb) example

Values are specified in SI units (e.g., seconds, meters, watts) and angles in degrees. The reverberation time may exceed the reverberation filter length (divided by the sampling rate), resulting in a cropped impulse response. This renderer requires and uses the sound receiver HRIR for spatialization and applies a sound power correction to match the direct sound energy if used together with the binaural free field renderer.

[Renderer:MyBinauralArtificialRoom]
Class = BinauralArtificialReverb
Description = Low-cost per receiver artificial reverberation effect
Enabled = true
Reproductions = MyTalkthroughHeadphones
ReverberationTime = 0.71
RoomVolume = 200
RoomSurfaceArea = 88
MaxReverbFilterLengthSamples = 88200
PositionThreshold = 1.0
AngleThresholdDegree = 30
SoundPowerCorrectionFactor = 0.05
TimeSlotResolution = 0.005
MaxReflectionDensity = 12000.0
ScatteringCoefficient = 0.1

Binaural room acoustics renderer (class BinauralRoomAcoustics) example

Requires the Room Acoustics for Virtual ENvironments (RAVEN) software module (see Research section) or another room acoustics simulation backend. Note that the reverberation time may exceed the reverberation filter length (divided by the sampling rate), with the consequence that the generated impulse response will be cropped. This renderer requires and uses the specified sound receiver HRIR data set for spatialization and applies a sound power correction to match the direct sound energy if combined with the binaural free field renderer.

[Renderer:MyBinauralRoomAcoustics]
Class = BinauralRoomAcoustics
Enabled = true
Description = Renderer with room acoustics simulation backend (RAVEN) for a source-receiver-pair with geometry-aware propagation
Reproductions = MyTalkthroughHeadphones
# Setup options: Local, Remote, Hybrid
Setup = Local
ServerIP = PC-SEACEN
HybridLocalTasks = DS
HybridRemoteTasks = ER_IS, DD_RT
RavenDataBasePath = $(raven_data)
# Task processing (Timeout = with desired update rate, for resource efficient processing; EventSync = process on request (for sporadic updates); Continuous = update as often as possible, for standalone server)
TaskProcessing = Timeout
# Desired update rates in Hz, may lead to resource issues
UpdateRateDS = 12.0
UpdateRateER = 6.0
UpdateRateDD = 1.0
MaxReverbFilterLengthSamples = 88200
DirectSoundPowerCorrectionFactor = 0.3

Prototype free field renderer (class PrototypeFreeField) example

Similar to the binaural free field renderer, but capable of handling multi-channel receiver directivities. This renderer can, for example, be used for recording the output of microphone array simulations.

[Renderer:MyPrototypeFreeField]
Class = PrototypeFreeField
Enabled = true
Reproductions = MyTalkthroughHeadphones
MotionModelNumHistoryKeys = 10000
MotionModelWindowSize = 0.2
MotionModelWindowDelay = 0.1
MotionModelLogInputSources = false
MotionModelLogEstimatedOutputSources = false
MotionModelLogInputReceivers = false
MotionModelLogEstimatedOutputReceivers = false
SwitchingAlgorithm = linear

Prototype generic path renderer (class PrototypeGenericPath) example

Channel count and filter length can be specified arbitrarily but are limited by the available computational power. Filtering is done individually for each source-receiver pair.

[Renderer:MyPrototypeGenericPath]
Class = PrototypeGenericPath
Enabled = true
Reproductions = MyTalkthroughHeadphones
NumChannels = 2
IRFilterLengthSamples = 88200
IRFilterDelaySamples = 0
OutputMonitoring = true

Binaural air traffic noise renderer (class BinauralAirTrafficNoise) example

Filtering is done individually for each source-receiver pair. The filters involved in the simulation of propagation paths can also be exchanged by the user for prototyping (this requires modifying the simulation flags in the configuration file).


[Renderer:MyAirTrafficNoiseRenderer]
Class = BinauralAirTrafficNoise
Enabled = true
Reproductions = MyTalkthroughHeadphones
MotionModelNumHistoryKeys = 1000
MotionModelWindowSize = 2
MotionModelWindowDelay = 1
MotionModelLogInputSources = false
MotionModelLogEstimatedOutputSources = false
MotionModelLogInputReceivers = false
MotionModelLogEstimatedOutputReceivers = false
GroundPlanePosition = 0.0
PropagationDelayExternalSimulation = false
GroundReflectionExternalSimulation = false
DirectivityExternalSimulation = false
AirAbsorptionExternalSimulation = false
SpreadingLossExternalSimulation = false
TemporalVariationsExternalSimulation = false
SwitchingAlgorithm = cubicspline

Dummy renderer (class PrototypeDummy) example

Useful for a quick configuration of your own prototype renderer.

[Renderer:MyDummyRenderer]
Class = PrototypeDummy
Description = Dummy renderer for testing, benchmarking and building upon
Enabled = true
OutputGroup = MyDesktopHP
Reproductions = MyTalkthroughHeadphones

Other rendering module examples

Every specific rendering module has its own specific set of parameters. The discussion of every functional detail is out of scope of this introduction. As all configurations are parsed in the constructor of the respective module, their functionality can sometimes only be fully understood by investigating the source code. For facilitation, the Redstart GUI application includes dialogs to create and interact with those renderers, additionally offering information when hovering over the GUI elements.

Reproduction module configuration

To instantiate a reproduction module, a section with a Reproduction: prefix has to be included. The identifier following the colon will be the unique name of this reproduction instance. If you want to change parameters during execution, this identifier is required to address the instance. All reproduction modules share some obligatory definitions, but the specific parameter set of each class requires a detailed description. For typical reproduction modules, some examples are given below.

Required reproduction module parameters

Class = REPRODUCTION_CLASS
Outputs = OUTPUT_GROUP(S)
The reproduction class refers to the type of reproduction as provided in the overview section.
The Outputs parameter describes the connections to logical output groups that forward audio based on the configured channels. At least one output group has to be defined, but the reproduction stream can also be connected to multiple outputs of the same or different type (e.g., different pairs of headphones). The only restriction is that the reproduction channel number has to match the channel count of the output group(s).

Optional reproduction module parameters

Description = Some informative description of this reproduction module instance
Enabled = true
InputDetectorEnabled = false
RecordInputEnabled = false
RecordInputFilePath = MyReproInput_filename_may_including_$(ProjectName)_macro.wav
OutputDetectorEnabled = false
RecordOutputEnabled = false
RecordOutputFilePath = MyReproOutput_filename_may_including_$(ProjectName)_macro.wav
Reproduction modules can be enabled and disabled to speed up setup changes without copying and pasting larger parts of a configuration section, especially since output groups can only be instantiated if the sound card provides enough channels. This makes testing on a desktop and switching to a lab environment easier.
For reproduction modules, both the input and the output can be observed. A stream detector on input and output can be activated that produces level meter values, to be used in a GUI widget, for example. The input of a reproduction module may include several superposed rendering streams (in contrast to the rendering output), for example, for direct sound and reverberant sound. The output of a reproduction can also be recorded and exported to a WAV file. The recording starts at initialization and is exported to the hard drive after finalization, implying that the data is kept in RAM. If many channels are required and/or long recording sessions are planned, it is recommended to route the output through a DAW using, for example, ASIO re-routing software devices like Reaper's ReaRoute ASIO driver. Macros are useful to include a more versatile output file name.

Talkthrough reproduction (class Talkthrough) example

The following example with all available key/value configuration pairs is taken from the default VACore.ini settings which is generated from the repository's VACore.ini.proto (by CMake). It requires an output called MyDesktopHP.

[Reproduction:MyTalkthroughHeadphones]
Class = Talkthrough
Enabled = true
Description = Generic talkthrough to output group
Outputs = MyDesktopHP
InputDetectorEnabled = false
OutputDetectorEnabled = false
RecordInputEnabled = false
RecordInputFilePath = $(ProjectName)_Reproduction_MyTalkthroughHeadphones_Input.wav
RecordOutputEnabled = false
RecordOutputFilePath = $(ProjectName)_Reproduction_MyTalkthroughHeadphones_Output.wav

Low-frequency / subwoofer mixing reproduction (class LowFrequencyMixer) example

[Reproduction:MySubwooferMixer]
Class = LowFrequencyMixer 
Enabled = true
Description = Generic low frequency (subwoofer) loudspeaker mixer
Outputs = Cave_SW
MixingChannels = ALL # Can also be a single channel, e.g. zero order of Ambisonics stream

Equalized headphones reproduction (class Headphones) example

Two-channel equalization using FIR filtering based on post-processed inverse headphone impulse responses measured through in-ear microphones.

[Reproduction:MyHD600]
Class = Headphones
Description = Equalized Sennheiser HD600 headphones
Enabled = true
# Headphone impulse response inverse file path (can be normalized, but gain must then be applied for calibration)
HpIRInvFile = HD600_all_eq_128_stereo.wav
HpIRInvFilterLength = 22050 # optional, can also be obtained from IR filter length
# Headphone impulse response inverse gain for calibration ( HpIR * HpIRInv == 0dB )
HpIRInvCalibrationGainDecibel = 0.1
Outputs = MyHD600HP

Multi-channel cross-talk cancellation reproduction (class NCTC) example

Requires an output called MyDesktopLS. In the case of a dynamic NCTC reproduction, only one receiver can be tracked (indicated by TrackedListenerID, which is oriented and located based on a real-world pose). DelaySamples shifts the final CTC filters to obtain causal filters. The amount of delay has to be set reasonably with regard to CTCFilterLength (e.g., apply a shift of half the filter length).

[Reproduction:MyNCTC]
Class = NCTC
Enabled = true
Description = Crosstalk cancellation for N loudspeaker
Outputs = MyDesktopLS
TrackedListenerID = 1
# algorithm: reg|...
Algorithm = reg
RegularizationBeta = 0.001
DelaySamples = 2048
CrossTalkCancellationFactor = 1.0
WaveIncidenceAngleCompensationFactor = 1.0
UseTrackedListenerHRIR = false
CTCDefaultHRIR = $(DefaultHRIR)
Optimization = OPTIMIZATION_NONE

Higher-order Ambisonics decoding (class HOA) example

Creates a decoding matrix based on a given output configuration, but can only be used for one output.

[Reproduction:MyAmbisonics]
Class = HOA
Enabled = true
Description = Higher-Order Ambisonics
TruncationOrder = 3
Algorithm = HOA
Outputs = VRLab_Horizontal_LS
ReproductionCenterPos = AUTO # or x,y,z

Ambisonics binaural mixdown (class AmbisonicsBinauralMixdown) example

Encodes the individual orientations of loudspeakers in a loudspeaker setup using binaural technology based on the VirtualOutput group. It can also be used for a virtual Ambisonics downmix with ideal spatial sampling layout.

[Reproduction:AmbisonicsBinauralMixdown]
Class = AmbisonicsBinauralMixdown
Enabled = true
Description = Binaural mixdown of virtual loudspeaker setup using HRIR techniques
TruncationOrder = 3
Outputs = MyDesktopHP
VirtualOutput = MyDesktopLS
TrackedListenerID = 1
HRIRFilterLength = 128

Other reproduction module examples

Every specific reproduction module has its own specific set of parameters. The discussion of every functional detail is out of scope of this introduction. As all configurations are parsed in the constructor of the respective module, their functionality can sometimes only be fully understood by investigating the source code. For facilitation, the Redstart GUI application includes dialogs to create and interact with those reproduction modules, additionally offering information when hovering over the GUI elements.



Controlling a Virtual Acoustics instance

Once your VA application is running as configured, you will eventually want to create a virtual scene and modify its entities. Scene control is possible via scripts and tracking devices (e.g., NaturalPoint's OptiTrack). The VA interface provides a list of methods that let you trigger updates and control settings.

Control VA using Matlab

The most common way to control VA for prototyping, testing, and in the scope of listening experiments is by using MathWorks' Matlab. VA provides a Matlab binding and a convenience class called itaVA. Once initialized, the class object can be connected to the VA server application over a TCP/IP network connection (or the local network port), as already described in the overview section on controlling VA.
You can find the itaVA.m Matlab class along with the required files for communication with VA in the VA package under the matlab folder. In case you are building and deploying VAMatlab on your own (for your platform), or if it is missing, look for the build_itaVA*.m scripts that generate the convenience class around the VAMatlab executable. Adding this folder to the Matlab path list enables permanent access from the console, independent of the current working directory.
To get started, inspect the example files and use Matlab's tab completion on an instance of the itaVA class to discover the self-explanatory functions, i.e., after executing

va = itaVA
The list of available methods is sorted by getter and setter nomenclature (va.get_* and va.set_*), followed by the entity (sound_receiver, sound_source, sound_portal) and the actual action. To create entities, directivities, and more, use the va.create_* methods.
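
A minimal first session could look like this (a sketch; it assumes a VA server is already running on the local machine):

va = itaVA;                  % create the interface object
va.connect( 'localhost' );   % connect to the running VA server
L = va.create_sound_receiver( 'MyListener' )   % returns a numerical identifier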

Note: All example calls to control VA are shown in Matlab code style. The naming convention in other scripting languages, however, is very similar. C++ and C# methods use capitalized words without underscores.

Control VA using Python

A Python VA module is available that facilitates network access. It can be installed for execution from anywhere, or it can be copied to and executed from a local folder. To obtain the package and example scripts, download a package that includes the Python binding (only available for Python 3.6 and recent compilers).

Control VA using Unity

Unity, a 3D and scripting development environment for games and Virtual Reality applications, allows a more intuitive and playful way to use VA. The VAUnity C# scripts extend Unity GameObjects and communicate their properties to a VA server. A C# VA binding, which comes with the binary packages in the download section, is therefore required. Using this method requires no knowledge of a scripting or programming language, only a copy of Unity. How to use VA and Unity is described in the README file of the project repository.

 

Global gain and muting

To control the global input gains (sound card software input channels), use

va.set_input_gain( 1.0 ) % for values between 0 and 1

To mute the input, use

va.set_input_muted( true ) % or false to unmute

The same is true for the global output gain (sound card software output channels)

va.set_output_gain( 1.0 )
va.set_output_muted( true ) % or false to unmute

Global auralization mode

In the renderers, the auralization mode is determined by a logical AND combination of the global auralization mode, the sound receiver auralization mode, and the sound source auralization mode. Deactivating an acoustic phenomenon globally, such as the spreading loss, will therefore affect all rendered sound paths.

va.set_global_auralization_mode( '-DS' ) % ... to disable direct sound
va.set_global_auralization_mode( '+SL' ) % ... to enable spreading loss, e.g., the 1/r distance law
Find the appropriate identifier for every auralization mode in the overview table.

Log level

The VA log level at server side can be changed using

va.set_log_level( 3 ) % 0 = quiet; 1 = errors; 2 = warnings (default); 3 = info; 4 = verbose; 5 = trace
Increasing the log level is potentially helpful to detect problems if the current log level is not high enough to throw an indicative warning message.

Search paths

At runtime, search paths can be added to the VA server using

va.add_search_path( './your/data/' )
Note that the search path has to be available on the server side if you are not running VA on the same machine. Wherever possible, add search paths and use file names only. Never use absolute paths for input files. If your server is not running on the same machine, consider adding search paths via the configuration at startup.

Query registered modules

To retrieve information on the available modules, use

modules = va.get_modules()
This method will return all registered VA modules, including the rendering and reproduction modules as well as the core itself.

All modules can be called using

out_args = va.call_module( 'module_id', in_args )
where in_args and out_args are structs with specific fields that depend on the module you are calling. Usually, a struct field with the name help or info returns useful information on how to work with the respective module:

va.call_module( 'module_id', struct('help',true) )

To work with renderers, use

renderers = va.get_renderers()
params = va.get_renderer_parameters( 'renderer_id' )
va.set_renderer_parameters( 'renderer_id', params )
Again, all parameters are returned as structs. More information on a parameter set can be obtained using structs containing the field help or info. It is good practice to use the parameter getter and inspect the key/value pairs before modifying and re-setting the module with the new parameters.
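
A typical round trip could look like this (a sketch; the renderer identifier and the modified field are taken from the configuration examples above):

% Inspect the current parameter set, modify a field and re-apply it
params = va.get_renderer_parameters( 'MyBinauralFreeField' );
disp( params )                     % inspect the available key/value pairs first
params.HRIRFilterLength = 512;     % example field from the renderer configuration
va.set_renderer_parameters( 'MyBinauralFreeField', params );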

For reproduction modules, use

reproductions = va.get_reproductions()
params = va.get_reproduction_parameters( 'reproduction_id' )
va.set_reproduction_parameters( 'reproduction_id', params )
Querying and re-setting parameters works in the same way as described above for the rendering modules.



How to create and modify a scene in Virtual Acoustics

In VA, everything that is not static is considered part of a dynamic scene. All sound sources, sound portals, sound receivers, underlying geometry, and source/receiver directivities are potentially dynamic and are therefore stored and accessed using a history concept. They can be modified during their lifetime. Renderers pick up modifications and react to the new state, for example, when a sound source is moved or a sound receiver is rotated.
Updates are triggered asynchronously by the user or by another application and can also be synchronized, ensuring that all signals are started or stopped within one audio frame.

Sound sources

Sound sources can be created by using

S = va.create_sound_source()

or created and optionally assigned a name

S = va.create_sound_source( 'Car' )
S will contain a unique numerical identifier which is required to modify the sound source.

A sound source (as well as a sound receiver) can only be auralized if it has been placed somewhere in 3D space. Otherwise it remains in an invalid state.

Specify a position as a three-dimensional vector ...

va.set_sound_source_position( S, [ x y z ] )

... and an orientation using a four-dimensional quaternion

va.set_sound_source_orientation( S, [ a b c d ] )

following the quaternion coefficient order a + bi + cj + dk.
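
For example, the neutral orientation (no rotation relative to the default view along negative Z) is the identity quaternion:

va.set_sound_source_orientation( S, [ 1 0 0 0 ] ) % identity quaternion, a = 1, b = c = d = 0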

It is also possible to set both values at once using a pose (position and orientation)

va.set_sound_source_pose( S, [ x y z ], [ a b c d ] )

You may also use a special view-and-up vector orientation, where the default view vector points towards negative Z direction and the up vector points towards positive Y direction according to a right-handed OpenGL coordinate system.

va.set_sound_source_orientation_view_up( S, [ vx vy vz ], [ ux uy uz ] )

The corresponding getter functions are

p = va.get_sound_source_position( S )
q = va.get_sound_source_orientation( S )
[ p, q ] = va.get_sound_source_pose( S )
[ v, u ] = va.get_sound_source_orientation_view_up( S )
with p = [x y z]', q = [a b c d]', v = [vx vy vz]', and u = [ux uy uz]', where ' symbolizes the vector transpose.

To get or set the name of a sound source, use

va.set_sound_source_name( S, 'AnotherCar' )
sound_source_name = va.get_sound_source_name( S )

Specific parameter structs can be set or retrieved. They depend on special features and are used for prototyping, for example, if sound sources require additional values for new renderers.

va.set_sound_source_parameters( S, params )
params = va.get_sound_source_parameters( S )

The auralization mode can be modified and returned using

va.set_sound_source_auralization_mode( S, '+DS' )
am = va.get_sound_source_auralization_mode( S )
This call would, for example, activate the direct sound. Other variants include
va.set_sound_source_auralization_mode( S, '-DS' )
va.set_sound_source_auralization_mode( S, 'DS, IS, DD' )
va.set_sound_source_auralization_mode( S, 'ALL' )
va.set_sound_source_auralization_mode( S, 'NONE' )
va.set_sound_source_auralization_mode( S, '' )

Sound sources can be assigned a directivity with a numerical identifier by

va.set_sound_source_directivity( S, D )
D = va.get_sound_source_directivity( S )
The handling of directivities is described below in the input data section.

To mute (true) and unmute (false) a source, type

va.set_sound_source_muted( S, true )
mute_state = va.get_sound_source_muted( S )

To control the level of a sound source, assign the sound power in watts

va.set_sound_source_sound_power( S, P )
P = va.get_sound_source_sound_power( S )
The default value of 31.67 mW (105 dB re 1e-12 W) corresponds to 1 Pascal (94.0 dB SPL re 20e-6 Pascal) at a distance of 1 m for spherical spreading. The final gain of a sound source is linked to the input signal, which is explained below. However, a digital signal with an RMS value of 1.0 (e.g., a sine wave with peak value of sqrt(2)) will produce 94 dB SPL @ 1 m. A directivity may alter this value for a certain direction, but a calibrated directivity will not change the overall radiated sound power of the sound source when integrating over a hull.
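
Since the power is given in watts, level offsets have to be converted from the decibel domain. A sketch raising the source level by 6 dB relative to the default power:

% Raise the source level by 6 dB relative to the default sound power
P_default = 31.67e-3;             % default sound power in watts
P = P_default * 10^( 6 / 10 );    % +6 dB in the power domain
va.set_sound_source_sound_power( S, P )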

A list of all available sound sources is returned by

source_ids = va.get_sound_source_ids()

Sound sources can be deleted with

va.delete_sound_source( S )

In contrast to all other sound objects, sound sources can be assigned a signal source. It feeds the sound pressure time series for that source and is referred to as the signal (speech, music, sounds). See below for more information on signal sources. In combination with the sound power and the directivity (if assigned), the signal source determines the time-dependent sound emitted by the source. For a calibrated auralization, the combination of the three components has to be physically consistent.

va.set_sound_source_signal_source( sound_source_id, signal_source_id )

Sound receivers

Except for the sound power methods and the signal source assignment, all sound source methods are equally valid for sound receivers (see above); just substitute source with receiver. A receiver can also be a human listener, in which case the receiver directivity will be an HRTF.

The VA interface provides some special features for receivers that are meaningful only in binaural technology. The head-above-torso orientation (HATO) of a human listener can be set and retrieved as a quaternion by the methods

va.set_sound_receiver_head_above_torso_orientation( sound_receiver_id, [ a b c d ] )
q = va.get_sound_receiver_head_above_torso_orientation( sound_receiver_id )
In common datasets like the FABIAN HRTF dataset (which can be obtained from the OpenDAFF project website), only a certain range within the horizontal plane (around the positive Y axis according to right-handed Cartesian OpenGL coordinates) is present, accounting for simplified head rotations with a fixed torso. Many listening experiments are conducted in a fixed seat while the user's head orientation is tracked. Here, a HATO HRTF appears more suitable, at least if an artificial head is used.
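
As a sketch, a horizontal head rotation of 30 degrees to the left (a rotation about the positive Y axis) can be set like this:

theta = 30 * pi / 180;   % head-above-torso rotation angle in radians
q = [ cos( theta / 2 ) 0 sin( theta / 2 ) 0 ];   % quaternion for a rotation about +Y
va.set_sound_receiver_head_above_torso_orientation( sound_receiver_id, q )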

Additionally, in Virtual Reality applications with loudspeaker-based setups, user motion is typically tracked inside a specific area. Some reproduction systems require knowledge of the exact position of the user's head and torso to apply adaptive sweet spot handling (like cross-talk cancellation). The VA interface therefore includes some receiver-oriented methods that extend the virtual pose by a so-called real-world pose. The user's absolute position and orientation (pose) relative to the hardware in the lab should be set using one of the following setters
va.set_sound_receiver_real_world_pose( sound_receiver_id, [ x y z ], [ a b c d ] )
va.set_sound_receiver_real_world_position_orientation_vu( sound_receiver_id, [ x y z ], [ vx vy vz ], [ ux uy uz ] )
Corresponding getters are
[ p, q ] = va.get_sound_receiver_real_world_pose( sound_receiver_id )
[ p, v, u ] = va.get_sound_receiver_real_world_position_orientation_vu( sound_receiver_id )
Also, HATOs are supported (in case a future reproduction module makes use of HATO HRTFs)
va.set_sound_receiver_real_world_head_above_torso_orientation( sound_receiver_id, [ a b c d ] )
q = va.get_sound_receiver_real_world_head_above_torso_orientation( sound_receiver_id )

Sound portals

Sound portals have been added to the interface for future usage but are currently not supported by the available renderers. Their main purpose will be building acoustics applications, where portals are combined to model flanking transmissions through walls and ducts.

Signal sources

Sound signals or signal sources represent the sound pressure time series emitted by a source.
Some are unmanaged and directly available; others have to be created. To get a list with detailed information on the currently available signal sources (including those created at runtime), type

va.get_signal_source_infos()

In general, a signal source is attached to one or many sound sources like this:

va.set_sound_source_signal_source( sound_source_id, signal_source_id )

Buffer signal source

Audio files that can be attached to sound sources are usually single-channel anechoic WAV files. In VA, an audio clip can be loaded as a buffer signal source with special control mechanisms. It supports macros and uses the search paths to locate a file. Using relative paths is highly recommended. Two examples are provided in the following:

signal_source_id = va.create_signal_source_buffer_from_file( 'filename.wav' )
demo_signal_source_id = va.create_signal_source_buffer_from_file( '$(DemoSound)' )
The DemoSound macro points to the 'Welcome to Virtual Acoustics' anechoically recorded file in WAV format, which resides in the common data folder. Make sure that the VA application can find the common data folder, which is also added as a search path in the default configurations.

Now, the signal source can be attached to a sound source using
va.set_sound_source_signal_source( sound_source_id, signal_source_id )
Any buffer signal source can be started, stopped, and paused. It can also be set to looping or non-looping (default) mode.
va.set_signal_source_buffer_playback_action( signal_source_id, 'play' )
va.set_signal_source_buffer_playback_action( signal_source_id, 'pause' )
va.set_signal_source_buffer_playback_action( signal_source_id, 'stop' )
va.set_signal_source_buffer_looping( signal_source_id, true )
To receive the current state of the buffer signal source, use
playback_state = va.get_signal_source_buffer_playback_state( signal_source_id )
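
Putting the pieces together, a minimal playback sketch could look like this (source name and position are arbitrary examples):

% Create a source, attach a looping buffer signal and start playback
S = va.create_sound_source( 'DemoSource' );
va.set_sound_source_position( S, [ 0 1.7 -2 ] );
X = va.create_signal_source_buffer_from_file( '$(DemoSound)' );
va.set_sound_source_signal_source( S, X );
va.set_signal_source_buffer_looping( X, true );
va.set_signal_source_buffer_playback_action( X, 'play' );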

Input device signal sources

Input channels from the sound card can be used directly as signal sources (microphones, electrical instruments, etc.) and are unmanaged (they cannot be created or deleted). All channels are made available individually on startup and are addressed as signal sources by

va.set_sound_source_signal_source( sound_source_id, 'inputdevice1' )

for the first channel, and so on.

Text-to-speech (TTS) signal source

The TTS signal source generates speech from text input. Because it uses CereProc's commercial CereVoice third-party library, it is not included in the VA package for public download. However, if you have access to the CereVoice library and can build VA with TTS support, this is how it works in Matlab:

tts_signal_source = va.create_signal_source_text_to_speech( 'Heathers beautiful voice' )
tts_in = struct();
tts_in.voice = 'Heather';
tts_in.id = 'id_welcome_to_va';
tts_in.prepare_text = 'welcome to virtual acoustics';
tts_in.direct_playback = true;
va.set_signal_source_parameters( tts_signal_source, tts_in )
Do not forget that a signal source can only be auralized in combination with a sound source. For more information, refer to the text-to-speech example in the ITA-Toolbox for Matlab.

Other signal sources

VA also provides specialized signal sources that cannot be covered in detail in this introduction. Please refer to the source code for proper usage.

Scenes

Scenes are a prototype-like concept that allows renderers to act differently depending on a requested scene identifier. This is useful when implementing different behaviour based on a user-triggered scene that should be loaded, for example, a room acoustics situation or a city soundscape. Most renderers will ignore these calls, but renderers like the room acoustics renderer use this concept as long as direct geometry handling is not fully implemented.

Directivities (including HRTFs)

Sound source and receiver directivities are usually made available as a file resource including multiple directions on a sphere for far-field usage. VA currently supports the OpenDAFF format with time-domain and magnitude-spectrum content types. Directivities can be loaded with

directivity_id = va.create_directivity_from_file( 'my_individual_hrtf.daff' )
VA ships with the ITA artificial head HRTF dataset (actually, the DAFF file provides this dataset as HRIRs in the time domain), which is available under a Creative Commons license for academic use.
The default configuration files and Redstart sessions include this HRTF dataset via the DefaultHRIR macro, and it can be loaded using
directivity_id = va.create_directivity_from_file( '$(DefaultHRIR)' )
Make sure that the VA application can find the common data folder, which is also added as a search path in the default configurations.

Directivities can be assigned to a source or receiver with
va.set_sound_source_directivity( sound_source_id, directivity_id )
va.set_sound_receiver_directivity( sound_receiver_id, directivity_id )

Homogeneous medium

VA provides support for rudimentary homogeneous medium parameters that can be set by the user. The data is accessed by rendering and reproduction modules (mostly to receive the sound speed value for delay calculation). Values are always in SI units (meters, seconds, etc). Additionally, a user-defined set of parameters is provided in case a prototyping renderer requires further specialized medium information (may also be used for non-homogeneous definitions). Here is the overview of setters and getters:

Speed of sound in m/s

va.set_homogeneous_medium_sound_speed( 343.0 )
sound_speed = va.get_homogeneous_medium_sound_speed()

Temperature in degree Celsius

va.set_homogeneous_medium_temperature( 20.0 )
temperature = va.get_homogeneous_medium_temperature()

Static pressure in Pascal, defaults to the norm atmosphere

va.set_homogeneous_medium_static_pressure( 101325.0 )
static_pressure = va.get_homogeneous_medium_static_pressure()

Relative humidity in percentage (ranging from 0.0 to 100.0 or above)

va.set_homogeneous_medium_relative_humidity( 75.0 )
humidity = va.get_homogeneous_medium_relative_humidity()

Medium shift / 3D wind speed in m/s

va.set_homogeneous_medium_shift_speed( [ x y z ] )
shift_speed = va.get_homogeneous_medium_shift_speed()

Prototyping parameters (user-defined struct)

va.set_homogeneous_medium_parameters( medium_params )
medium_params = va.get_homogeneous_medium_parameters()


Geometry

Geometry interface calls are for future use and are currently not supported by the available renderers. The concept behind geometry handling is real-time environment manipulation for indoor and outdoor scenarios using VR technology like Unity or plugin adapters from CAD modelling applications like SketchUp.

Acoustic materials

Acoustic material interface calls are for future use and are currently not supported by available renderers. Materials are closely connected to geometry, as a geometrical surface can be linked to acoustic properties represented by the material.

Solving synchronisation issues

Scripting languages like Matlab are problematic by nature when it comes to timing: evaluation durations scatter unpredictably, and timers are not precise enough. This becomes a major issue when, for example, a continuous motion of a sound source should be performed with a clean Doppler shift. A simple loop with a timeout will result in audible motion jitter, as the timing of each loop body execution diverges significantly. Also, if a music band should start playing at the same time and the start is executed by subsequent scripting lines, it is very likely that they end up out of sync.

High-performance timeout

To avoid timing problems, the VA Matlab binding provides a high-performance timer that is implemented in C++. It should be used wherever a synchronous update is required, mostly for moving sound sources or sound receivers. An example for a properly synchronized update loop at 60 Hertz that incrementally drives a source from the origin into positive X direction until it is 100 meters away:

S = va.create_sound_source()

va.set_timer( 1 / 60 )
x = 0
while( x < 100 )
	va.wait_for_timer;
	va.set_sound_source_position( S, [ x 0 0 ] )
	x = x + 0.01
end

va.delete_sound_source( S )

Synchronizing multiple updates

VA can execute updates synchronously at the granularity of the block rate of the audio stream process. Every scene update will be withheld until the update is unlocked. This feature is mainly used for a simultaneous playback start.

va.lock_update
va.set_signal_source_buffer_playback_action( drums, 'play' )
va.set_signal_source_buffer_playback_action( keys, 'play' )
va.set_signal_source_buffer_playback_action( base, 'play' )
va.set_signal_source_buffer_playback_action( sax, 'play' )
va.set_signal_source_buffer_playback_action( vocals, 'play' )
va.unlock_update

It is also useful for uniform movements of sound sources that are static relative to each other (like the four wheels of a vehicle). However, locking updates will inevitably lock out other clients (like trackers) and should be released as soon as possible.

va.lock_update
va.set_sound_source_position( wheel1, p1 )
va.set_sound_source_position( wheel2, p2 )
va.set_sound_source_position( wheel3, p3 )
va.set_sound_source_position( wheel4, p4 )
va.unlock_update


Audio rendering

Audio rendering, next to reproduction, is the heart of VA. Each rendering instance combines scene information to auralize sound in a unique way and with a dedicated purpose. Audio renderers are informed by the VA core about scene changes (asynchronous updates) that are triggered by the user. The task of each rendering instance is to apply the requested changes as fast as possible.

Rendering modules work pretty much on their own. They feature, however, some common and some specialized methods for interaction.

To get a list of available modules, use

renderer_ids = va.get_rendering_modules()

Every rendering instance can be muted/unmuted and the output gain can be controlled.

va.set_rendering_module_muted( renderer_id, true )
va.set_rendering_module_gain( renderer_id, 1.0 )
mute_state = va.get_rendering_module_muted( renderer_id )
gain = va.get_rendering_module_gain( renderer_id )

Renderers may also be masked by auralization modes. To disable or re-enable certain auralization modes, for example the direct sound (DS), use

va.set_rendering_module_auralization_mode( renderer_id, '-DS' )
va.set_rendering_module_auralization_mode( renderer_id, '+DS' )

To obtain and set parameters, type

va.set_rendering_module_parameters( renderer_id, in_params )
out_params = va.get_rendering_module_parameters( renderer_id, request_params )

The request_params can usually be empty, but if a key help or info is present, the rendering module will provide usage information.
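
For instance, a minimal sketch of such a usage query (the value assigned to the help key is arbitrary here, since only the presence of the key matters):

request_params = struct()
request_params.help = true % presence of the key triggers the usage information
usage_info = va.get_rendering_module_parameters( renderer_id, request_params )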

A special feature that has been requested for Virtual Reality applications (background music, instructional speech, operator's voice) is a pair of create methods for sound sources and sound receivers that are only effective for a given rendering instance (explicit renderer). This is required if ambient clips should be played back without spatialization, or if certain circumstances demand that a source is processed by one single renderer only. In this way, computational power can be saved.

sound_source_id = va.create_sound_source_explicit_renderer( renderer_id, 'HitButtonEffect' )
sound_receiver_id = va.create_sound_receiver_explicit_renderer( renderer_id, 'SurveillanceCamMic' )

The sound sources and receivers created with this method are handled like normal entities but are only effective for the explicit rendering instance.
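
As a minimal sketch, such an explicit sound source could be fed with an ambient clip like this (the file name is hypothetical, and it is assumed that buffer signal sources can be attached to explicit sources just like to regular ones):

signal_id = va.create_signal_source_buffer_from_file( 'ambient_music.wav' ) % hypothetical file
va.set_sound_source_signal_source( sound_source_id, signal_id )
va.set_signal_source_buffer_playback_action( signal_id, 'play' )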

Binaural free field renderer

For a proper time synchronization of this renderer with other renderers, a static delay that is added to the simulated propagation delay can be set. This static delay is defined by a special parameter using a struct. In the following example, it is set to 100 ms.

in_struct = struct()
in_struct.AdditionalStaticDelaySeconds = 0.100
va.set_rendering_module_parameters( renderer_id, in_struct )

Special features of this renderer include individualized HRIRs. The anthropometric parameters are derived from a specific key/value layout of the receiver parameters combined under the key anthroparams. All parameters are provided in units of meters.

in_struct = struct()
in_struct.anthroparams = struct()
in_struct.anthroparams.headwidth = 0.12
in_struct.anthroparams.headheight = 0.10
in_struct.anthroparams.headdepth = 0.15
va.set_sound_receiver_parameters( sound_receiver_id, in_struct )

The current anthropometric parameters can be obtained by

params = va.get_sound_receiver_parameters( sound_receiver_id, struct() )
disp( params.anthroparams )
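
To adjust a single value, the current parameter set can be fetched, modified, and written back (a minimal sketch, assuming the complete anthroparams struct may be passed to the setter shown above):

params = va.get_sound_receiver_parameters( sound_receiver_id, struct() );
params.anthroparams.headwidth = 0.14; % adjust the head width only
va.set_sound_receiver_parameters( sound_receiver_id, params )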

Prototype generic path renderer

This renderer can update impulse responses through the VA interface and will exchange the incoming data in real time for a requested source-receiver pair. It can be used as a powerful prototyping tool that gives instant audible results for A/B comparisons. At ITA, it is used to create a binaural (two-channel) FIR-filtering-based renderer within Matlab as part of the laboratory course on Acoustic Virtual Reality.
In the examples below, the propagation path from source 1 to receiver 1 is updated. If no verbose output is required, just drop the verbose key.

To trigger an update from a file resource, a specialized struct has to be created:

in_struct = struct()
in_struct.receiver = 1
in_struct.source = 1
in_struct.verbose = 1
in_struct.filepath = 'CologneDomeAmbisonicsIRMeasurement.wav'
va.set_rendering_module_parameters( renderer_id, in_struct )

If a certain channel should be updated, say channel 3, add

in_struct.channel = 3

To trigger an update by sending impulse response samples directly (in this example, two channels are used, but also more channels are possible), compile another specialized struct like the following:

in_struct = struct()
in_struct.receiver = 1
in_struct.source = 1
in_struct.verbose = 1
in_struct.ch1 = [ 1 0 0 0 ... ]
in_struct.ch2 = [ 0 0 1 0 ... ]
va.set_rendering_module_parameters( renderer_id, in_struct )

This example struct will exchange a non-delayed Dirac impulse on the first channel and a Dirac with a delay of two samples on the second channel. In common applications, of course, an entire measured or simulated impulse response would be used.
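
For example, a measured two-channel impulse response could be loaded from a file and passed on as follows (a minimal sketch; the file name is hypothetical):

ir = audioread( 'measured_brir.wav' ); % hypothetical two-channel IR file
in_struct = struct();
in_struct.receiver = 1;
in_struct.source = 1;
in_struct.ch1 = ir( :, 1 )'; % samples are passed as row vectors
in_struct.ch2 = ir( :, 2 )';
va.set_rendering_module_parameters( renderer_id, in_struct )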



Audio reproduction

Audio reproduction modules receive spatialized audio streams from the audio rendering modules. Most of them work independently of user input, but some require knowledge about the user's real-world pose in the reproduction environment.

Reproduction modules work pretty much on their own. They feature, however, some common and some specialized methods for interaction.

To get a list of available modules, use

reproduction_ids = va.get_reproduction_modules()

Every reproduction instance can be muted/unmuted and its output gain can be controlled.

va.set_reproduction_module_muted( reproduction_id, true )
va.set_reproduction_module_gain( reproduction_id, 1.0 )
mute_state = va.get_reproduction_module_muted( reproduction_id )
gain = va.get_reproduction_module_gain( reproduction_id )

To obtain and set parameters, type

va.set_reproduction_module_parameters( reproduction_id, in_params )
out_params = va.get_reproduction_module_parameters( reproduction_id, request_params )

The request_params can usually be empty, but if a key help or info is present, the reproduction module will provide usage information.

Multi-channel cross-talk cancellation

The N-CTC reproduction requires exact knowledge of the user's ear canal positions. Therefore, it can only be used for a single sound receiver, and the module evaluates the real-world pose that can be set by the corresponding interface call (or by tracking, as described below).
Some additional parameters can be modified during real-time processing for immediate evaluation. The additional delay (in seconds) shifts the resulting CTC filters to establish causality. The CTC factor and the WICK (wave incidence angle compensation) factor control the smoothing of the initial HRTF in order to gain better transmission quality and a wider sweet spot, while trading off the signal-to-noise ratio of the binaural performance. When the WICK factor is set to zero, the N-CTC module acts like a multi-channel transaural stereo reproduction with simple panning and constant group delay.

in_params = struct()
in_params.AdditionalDelayTime = 0.100 % in seconds
in_params.CrossTalkCancellationFactor = 1.0
in_params.WaveIncidenceAngleCompensation = 1.0 % 0.0 yields transaural stereo mode
va.set_reproduction_module_parameters( reproduction_id, in_params )

Headphones reproduction

During runtime, the inverse FIR filter for the headphone equalization can be exchanged. This is helpful to investigate the effect of the equalization performance by direct comparison. If the inverse FIR filter changes the signal's energy, a calibration gain can optionally be passed, either as a factor or as a decibel value, to maintain 0 dB playback.

in_params = struct()
in_params.HpIRInvFile = 'HD650_individualized_eq.wav'
in_params.HPIRInvCalibrationGain = 1.0 % as a factor ...
in_params.HPIRInvCalibrationGainDecibel = 0.0 % ... or as a decibel value
va.set_reproduction_module_parameters( reproduction_id, in_params )


Tracking

VA does not support tracking internally but facilitates the integration of tracking devices to update VA entities. For external tracking, the VAMatlab project currently supports connecting NaturalPoint's OptiTrack devices to a server instance. It can automatically forward rigid body poses (head and torso, separately) to one sound receiver and one sound source. Another possibility is to use an HMD such as the Oculus Rift or the HTC Vive and update VA through Unity.

OptiTrack via VAMatlab

To connect an OptiTrack rigid body to a VA sound entity (here, a receiver with id 1 was defined), use

va.set_tracked_sound_receiver( 1 )

To also include the real-world pose (as required by some reproduction modules like the N-CTC reproduction module), execute

va.set_tracked_real_world_sound_receiver( 1 )

If the rigid body index should be changed (e.g., to index 3 for head and 4 for torso), use

va.set_tracked_sound_receiver_head_rigid_body_index( 3 )
va.set_tracked_sound_receiver_torso_rigid_body_index( 4 )

The head rigid body (rb) can also be locally transformed using a translation and (quaternion) rotation method, e.g., if the rigid body barycenter is not between the ears or is rotated against the default orientation:

va.set_tracked_sound_receiver_head_rb_trans( [ x y z ] )
va.set_tracked_sound_receiver_head_rb_rotation( [ a b c d ] )

For the real-world sound receiver, similar methods exist:

va.set_tracked_real_world_sound_receiver_head_rigid_body_index( 3 )
va.set_tracked_real_world_sound_receiver_torso_rigid_body_index( 4 )
va.set_tracked_real_world_sound_receiver_head_rb_trans( [ x y z ] )
va.set_tracked_real_world_sound_receiver_head_rb_rotation( [ a b c d ] )

The sound source methods are almost identical, except that receiver is substituted by source, as shown for a sound source with id 1 at rigid body index 5 in the following example:

va.set_tracked_sound_source( 1 )
va.set_tracked_sound_source_rigid_body_index( 5 )
va.set_tracked_sound_source_rigid_body_translation( [ x y z ] )
va.set_tracked_sound_source_rigid_body_rotation( [ x y z w ] )

To finally connect to a tracker that is running on the same machine and pushes data to the localhost network loopback device, use

va.connect_tracker

In case the tracker is running on another machine, OptiTrack requires setting both the remote machine IP (in this example, 192.168.1.2) AND the client machine IP (in this example, 192.168.1.143) like this

va.connect_tracker( '192.168.1.2', '192.168.1.143' )
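
When tracking is no longer needed, the connection can be released again (the disconnect method is assumed here to be the counterpart of connect_tracker in the VAMatlab binding):

va.disconnect_tracker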

HMD via VAUnity

To connect an HMD, set up a Unity scene and connect the tracked GameObject (usually the MainCamera) with a VAUSoundReceiver instance. For further details, please read the README files of VAUnity.



Simulation and recording

As already pointed out, VA can be used for simulations and recordings. The only requirement is to activate the recording by configuration before runtime, as described in the rendering and reproduction module setup sections. Outputs from the rendering modules can be used to store spatial audio samples (like binaural clips or Ambisonics B-format / HOA tracks). Outputs from reproductions can be used for offline playback with a given loudspeaker setup for (audio-visual) demonstrations or for non-interactive listening experiments.
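
As a sketch, enabling the recording of a rendering module's output in the core configuration file might look like the following (the section name is an example, and the record key names should be verified against the rendering module setup section, so treat them as assumptions):

[Renderer:MyBinauralFreeField]
Class = BinauralFreeField
Reproductions = MyTalkthroughHeadphones
RecordOutputEnabled = true
RecordOutputFilePath = recordings/renderer_output.wav
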
A current issue of VA's simulation and recording capability is that it can only be driven by a physical sound card, so only real-time rendering/reproduction can be captured. Therefore, capabilities are limited by the available resources, and simulations for many sources/receivers might have to be run one after another (with potential syncing issues). We are planning to provide a virtual sound card that can slow down the rendering to a user-triggered speed for offline rendering. In this way, even scenes with high complexity can be handled in the future, albeit not in real time.



Examples

Here are some common use cases and a full description of how to set up a VA server and create a corresponding scene.

Binaural sound source circulating around a listener

Involved application: Redstart
Recommended playback device: Headphones

Note: Shortcuts are indicated in brackets.
  • Open up Redstart and create a binaural session (N, B). Leave everything to default.
  • Start the session (F5)
  • Now, open Run > Circulating source (R, C), leave everything to default and hit the Start button
  • You should hear the welcome track as a virtual sound source circulating around your head (based on the default ITA artificial head HRTF data set).

Now, try to change parameters and listen to the effect these changes have on the auralization. To test your own files, create a new binaural session and override the default macros.