Perceptual Approximation of Room Impulse Responses
with Artificial Reverberation Algorithms
Audio Engineering Project Thesis
Lukas Knöbl
Assessor: MSc. Georgios Marentakis
Graz, 31.05.2015
Abstract
The goal of the project is to design and implement a reverberation algorithm with various
perceptual controls. The literature in the field will be reviewed and synthesized to
arrive at a realistic and efficient reverberation algorithm implementation. Furthermore,
the algorithm will be tested with respect to its ability to simulate given impulse responses and
to provide perceptual controls.
Summary
The aim of the project is the development and implementation of a room simulation
algorithm whose perceptual properties can be changed by means of various
controls. A literature review is to be carried out,
which serves as the basis for creating a realistic and resource-efficient reverberation
algorithm. Subsequently, the algorithm is tested with regard to its ability
to simulate impulse responses of real rooms.
Lukas Knoebl, Simulation of Room Impulse Response
3
Table of Contents
1 Introduction
2 Artificial Reverberation Algorithms
2.1 Properties of room acoustics
2.2 EDC and EDR
2.3 Time & Frequency Density
2.3.1 Frequency Density
2.3.2 Time Density
2.4 Echo Density Profile
2.5 Convolution Reverb
2.6 Algorithmic Reverberation
2.6.1 The Recursive Comb Filter
2.6.2 The Allpass Filter
2.7 Classic Filter Network Structures
2.7.1 Parallel Comb Filters
2.7.2 Combination of Comb- and Allpass Filters
2.7.3 Nested Series Allpass Network
3 Feedback Delay Networks
3.1 General FDN Structure
3.2 Obtaining frequency dependent reverberation times
3.2.1 Implementation using 1st order filters
3.2.2 Control of decay characteristics in FDNs
3.3 Parameterization of FDNs
4 The Final Algorithm
4.1 Impulse Response Analysis
4.1.1 Segmentation of Early Reflections and Late Reverberation
4.1.2 Measuring Reverberation Time and Total Energy
4.2 Improving the Quality of Late Reverberation
4.2.1 Delay Lines and Delay Lengths
4.2.2 Modulation
4.2.3 Output Vectors and Feedback Matrix
4.3 Controlling the Reverberation Time
4.4 Spectral Correction Filter
4.5 Synthesizing the Early Reflections
4.6 Summary of the Algorithm
5 Evaluation and Results
5.1.1 Simulation of a Concert Hall
5.1.2 Simulation of a Cavern
5.1.3 Simulation of a Large Studio
5.1.4 Simulation of a Small Studio
5.1.5 Simulation of an Algorithmic Reverb
5.2 General Considerations
6 Perceptual Controls
7 Conclusion and Outlook
Bibliography
Picture Credits
APPENDIX
1 Introduction
Artificial reverberators are used in almost every audio production to add space, depth and life
to dry recordings. In the very beginning analogue devices - such as plates, springs or even
echo chambers - were used for this purpose, but as technology developed further digital
reverberators have become the new studio standard [JC91].
There are two main approaches to creating artificial reverberation: one technique is based on
the convolution of a room's impulse response with the input signal, whereas the other one is
based on delay networks. While the former approach produces very accurate and authentic
results, its computational cost increases with the length of the impulse
response [CNW14]. Algorithmic reverberators, in contrast, usually provide a larger set of
perceptual controls to adjust certain parameters like diffusion or reverberation time, but they
have to be designed carefully in order to produce a natural sounding decay without
undesirable artefacts.
The reverberation tail of a real room's impulse response consists of a large number of echoes
whose density increases over time, and it is perceived as exponentially decaying white noise. Various
algorithms have been developed to recreate these characteristics in real time as closely as
possible while using only a reasonable amount of processing power [Jot92].
One can separate such impulse responses into two parts. The very beginning of the
impulse response usually consists of discrete reflections, which are called early reflections. In
most cases algorithms attempt to model these reflections by using a set of delay lines. The
early reflections contain important information about the characteristics of a room, such as its size,
geometry and architecture [Zöl08].
Fig. 1.1: Echogram showing three parts of the reverberation process: direct sound, early
reflections and late reverberation.
However, in the late part of the impulse response the reflections get much denser and thus their
exact prediction (e.g. via the image-source model) with delay lines becomes impractical [Schro62].
Usually, a more statistical approach is chosen to model the late part of the reverberation tail
[Ruo12]; in fact, most algorithms use a combination of feedback loops and delay taps. The
recursive structures guarantee that the input signal is repeated and attenuated in each loop. By
combining many of these structures into networks, it is possible to create an overall output
that sounds like exponentially decaying white noise. But even with a high number of delay
elements it is difficult to avoid discrete frequency peaks in the artificial reverberator's reverb
tail, which are caused by the fixed lengths of the delay lines. The resulting sound can be
described as 'metallic', because certain frequencies resonate more than others, and this sounds
very unnatural [Fre00].
In chapter two, some properties of room acoustics, which are strongly associated with
artificial reverberation, are reviewed and basic feedback structures like comb- and allpass
filters are explained. In addition, some different approaches and algorithms which have been
used in the past are compared. Chapter three will emphasise the importance of a very specific
and common type of algorithms, the Feedback Delay Networks (FDNs).
Chapter four will describe the final algorithm in detail. All basic blocks will be discussed step
by step, starting with the impulse response analysis and automatic parametrisation procedure.
Next, some methods will be suggested in order to increase the subjective quality of the
reverberation tail generated by the FDN. Finally, it will be explained how to adjust the
frequency dependent reverberation time and initial energy to the statistics of the original
impulse response.
In chapter five we will investigate the algorithm’s ability to model various impulse responses
of different lengths and sources. Finally, chapter six will describe the perceptual controls
which were added to the reverberator in order to provide the option to modify certain
parameters like reverberation time or diffusion.
2 Artificial Reverberation Algorithms
2.1 Properties of room acoustics
Any sound source placed in an enclosure will produce a dense pattern of reflections at the
listening position, which is perceived as reverberation. These echoes occur because the sound
is partly reflected and diffracted by the walls and objects within the room, depending on their
shape and the material of their surfaces. A concrete wall will reflect almost all frequencies
equally, whereas other materials (like wood or fabric) will absorb more of the higher
frequencies. A reflection is therefore a delayed and attenuated copy of the direct sound
[Zöl11]. A variety of room acoustical measures describe the
acoustical quality of a room, one of them being Sabine's reverberation time
[JC91].
$$T = 0.161 \left[\frac{\mathrm{s}}{\mathrm{m}}\right] \cdot \frac{V}{A_{ges}}$$ (Eq. 2.1)
where V is the volume of the room in m³ and A_ges is the total, frequency dependent absorption of the
room (equivalent absorption area in m²).
The reverberation time is the time required for the energy of a sound signal to decay by a certain
level. Most commonly used is the RT60, which corresponds to a decay of 60 dB [GW09].
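As a minimal numerical sketch of Eq. 2.1, the following Python snippet computes Sabine's reverberation time for a hypothetical shoebox room. The room dimensions and the uniform absorption coefficient are illustrative assumptions and are not taken from this thesis; the absorption area is assumed to be the surface area multiplied by the absorption coefficient.

```python
def sabine_rt(volume_m3, absorption_area_m2):
    """Sabine reverberation time T = 0.161 * V / A (Eq. 2.1), in seconds."""
    return 0.161 * volume_m3 / absorption_area_m2


# Hypothetical shoebox room (assumed values, for illustration only)
length, width, height = 10.0, 8.0, 4.0
alpha = 0.2  # assumed uniform absorption coefficient of all surfaces

volume = length * width * height
surface = 2.0 * (length * width + length * height + width * height)
absorption_area = alpha * surface  # equivalent absorption area in m^2

rt = sabine_rt(volume, absorption_area)
print(f"V = {volume:.0f} m^3, A = {absorption_area:.1f} m^2, T = {rt:.2f} s")
```

Doubling the absorption area in this sketch halves the reverberation time, which is the proportionality Eq. 2.1 expresses.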
The transmission from a static sound source to a listening position is described by the
impulse response, which is a function of time and describes the reaction of the system (in this case the
room) at all frequencies. It can be divided into three parts: the direct sound, the
early reflections and the late reverberation [Fre00].
The initial direct sound is the unreflected sound following the shortest acoustical path from
the source to the listening position. It is followed by a discrete set of early reflections, which,
though normally not perceived separately, are important for the perception of both
the size of the room and the size of the sound source [Zöl11]. Finally, the density of the late
reverberation is so high that it closely resembles Gaussian noise [MS07]. This latter portion
gives information about the room size as well as the distance of the sound source [Zöl11].
Fig. 2.1: Room impulse response illustrating the three different periods of reverberation.
2.2 EDC and EDR
To obtain the reverberation time, Schroeder introduced the Energy Decay Curve (EDC),
which is computed by the time reversed integration of the squared impulse response h(t). It
describes the remaining energy in the impulse response after any time t [Jot92]:
$$EDC(t) = \int_{t}^{+\infty} h^2(\tau)\, d\tau$$ (Eq. 2.2)
Since the energy decreases over time, the reverberation time can be derived from the slope of
this decay.
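The backward integration of Eq. 2.2 and the slope-based estimate of the reverberation time can be sketched in Python as follows. This is only an illustration under stated assumptions: the test signal is a synthetic exponential decay (not a measured room response), and the fit range of -5 dB to -35 dB as well as the function names are choices made for this example, not prescriptions from the thesis.

```python
import math


def energy_decay_curve_db(h):
    """Discrete form of Eq. 2.2: EDC[n] = sum of h[k]^2 for k >= n, in dB re total energy."""
    edc = [0.0] * len(h)
    running = 0.0
    for n in range(len(h) - 1, -1, -1):  # integrate backwards in time
        running += h[n] * h[n]
        edc[n] = running
    total = edc[0]
    # Zero-energy samples can only occur at the very end, so indices stay aligned
    return [10.0 * math.log10(e / total) for e in edc if e > 0.0]


def rt60_from_edc(edc_db, fs, start_db=-5.0, stop_db=-35.0):
    """Least-squares line fit between start_db and stop_db, extrapolated to -60 dB."""
    pts = [(n / fs, v) for n, v in enumerate(edc_db) if stop_db <= v <= start_db]
    k = len(pts)
    sx = sum(t for t, _ in pts)
    sy = sum(v for _, v in pts)
    sxx = sum(t * t for t, _ in pts)
    sxy = sum(t * v for t, v in pts)
    slope = (k * sxy - sx * sy) / (k * sxx - sx * sx)  # dB per second
    return -60.0 / slope


# Synthetic impulse response: amplitude falls by 60 dB within true_rt60 seconds
fs = 8000
true_rt60 = 0.5
decay = 3.0 * math.log(10.0) / true_rt60
h = [math.exp(-decay * n / fs) for n in range(fs)]

estimated_rt60 = rt60_from_edc(energy_decay_curve_db(h), fs)
print(f"estimated RT60 = {estimated_rt60:.3f} s")
```

Applying the same two functions to band-filtered versions of an impulse response would give one decay curve per band, which is the idea behind the Energy Decay Relief.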
Jot further extended this method by computing the integral for several frequency bands of the
impulse response. The resulting plot is called the Energy Decay Relief (EDR), which can be
used to visualize the reverberation time as a function of frequency and time in a three-
dimensional graph [Jot92].
It can be seen that the reverberation time is larger for low frequencies because usually more of
the higher frequencies are absorbed by the walls and objects within the room. This is an
important characteristic and has to be considered during the development of artificial reverb
algorithms.
Fig. 2.2: Energy Decay Relief of a concert hall.
2.3 Time & Frequency Density
2.3.1 Frequency Density
The resonant frequencies of a rectangular room can be determined by
$$|f_r| = \frac{c}{2} \sqrt{\left(\frac{n_x}{l_x}\right)^2 + \left(\frac{n_y}{l_y}\right)^2 + \left(\frac{n_z}{l_z}\right)^2}$$ (Eq. 2.3)
where l_x, l_y, l_z are the room dimensions, c is the speed of sound and n_x, n_y, n_z ∈ ℕ [GW09].
The modal density is defined as the number of modes per Hertz, and is dependent on
frequency f, volume of the room V and the speed of sound [Kut91]:
$$\frac{dN}{df} \sim \frac{4\pi V}{c^3} f^2$$ (Eq. 2.4)
It can be seen that the modal density increases with the square of the frequency. Above a
certain frequency, the modal density is so high that the human ear can no longer perceive individual
frequency peaks [Smi10]. For average rooms, this limiting frequency can be given
as:
$$f_g = 2000 \sqrt{\frac{T}{V}}$$ (Eq. 2.5)
where T is the reverberation time in seconds and V is the volume of the room [JC91].
Above this critical frequency, the mean spacing between the frequency peaks can be
approximated by [Fre00]:
$$\Delta f_{max} = \frac{4}{T} \; [\mathrm{Hz}]$$ (Eq. 2.6)
The theoretical modal density can be distinguished from the perceivable frequency density.
The frequency response of a room fluctuates about 10dB on the average, but psycho-acoustic
experiments have shown that such irregularities cannot be detected by the human ear when
the density of modes is high enough [Schro62]. Also, some modes with lower amplitudes
cannot be perceived and therefore the frequency density is always lower than the modal
density [Zöl11].
As explained in chapter 2.6.1, increasing the delay time of a comb filter will increase its
modal density and, accordingly, its frequency density.
For a large concert hall with V = 30000 m³ and a reverberation time of 3.5 seconds, the
critical frequency is 21.6 Hz. At this frequency the modal density is 4.35 modes per Hz, and
Δf_max is 1.14 Hz. The frequency density will therefore be 0.88 modes per Hz (1/Δf_max)
[Fre00].
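The numbers of this concert hall example can be retraced with a few lines of Python based on Eq. 2.4 to Eq. 2.6. The speed of sound of 343 m/s is an assumption made for this sketch (the text does not state the value used), so the modal density may differ slightly in the last digit from the figure quoted above.

```python
import math


def critical_frequency(t_rev, volume):
    """Limiting frequency f_g = 2000 * sqrt(T / V) (Eq. 2.5), in Hz."""
    return 2000.0 * math.sqrt(t_rev / volume)


def modal_density(freq, volume, c=343.0):
    """Modes per Hz, dN/df ~ 4 * pi * V * f^2 / c^3 (Eq. 2.4); c is an assumed value."""
    return 4.0 * math.pi * volume * freq ** 2 / c ** 3


def mean_peak_spacing(t_rev):
    """Mean spacing of frequency peaks above f_g, 4 / T (Eq. 2.6), in Hz."""
    return 4.0 / t_rev


volume = 30000.0  # m^3
t_rev = 3.5       # s

f_g = critical_frequency(t_rev, volume)
modes_per_hz = modal_density(f_g, volume)
delta_f_max = mean_peak_spacing(t_rev)
frequency_density = 1.0 / delta_f_max

print(f"f_g = {f_g:.1f} Hz, modal density = {modes_per_hz:.2f} modes/Hz")
print(f"delta_f_max = {delta_f_max:.2f} Hz, frequency density = {frequency_density:.2f} per Hz")
```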
2.3.2 Time Density
The echo density in a room is defined as the number of echoes per unit of time at a certain time t
[Schro62]:
$$\frac{dN_t}{dt} \sim \frac{4\pi c^3}{V} t^2$$ (Eq. 2.7)
According to Griesinger, an echo density of 10000 echoes per second for transient input
signals should be sufficient for a natural, colourless reverberation [JC91].
As with frequency peaks, the amplitudes of successive echoes can be very different and
masking may occur. This suggests introducing the term 'time density', which refers to the
perceived number of echoes. Again, the time density is generally lower than the echo density.
Also, the time tc after which individual echoes are not distinguishable anymore is dependent
on the length ∆t of the exciting impulse:
$$t_c = 5 \cdot 10^{-5} \sqrt{\frac{V}{\Delta t}}$$ (Eq. 2.8)
Another property of good sounding, large rooms is the absence of flutter echoes and an equal
decay rate for all modes within the same frequency region [Schro62].
2.4 Echo Density Profile
As discussed previously, the echo density increases over time until the number of reflections
is so high that the impulse response is very similar to exponentially decaying Gaussian noise.
The rate of this increase is dependent on the size and shape of the room and whether it is
empty or cluttered with reflective objects.
There are some reasons why counting reflections over a period of time (most commonly one
second) is not always the best way to express echo density. One reason is that there is
no general definition of a reflection (i.e. when an impulse response tap is
counted as a reflection and when not); another reason is that the sampling
frequency limits the number of echoes that can be detected [AH06].
Abel and Huang [AH06] introduced a measure called the 'Echo Density Profile',
which relies on the property that the impulse response taps take on a Gaussian distribution
after a sufficient amount of time. For this purpose, a sliding window over the impulse
response is used. The echo density measure counts the taps lying outside the standard deviation for
the window and normalizes by the count expected for Gaussian noise. As a result, the value for the
echo density, which is always somewhere between zero and one (due to normalization), will
be low if there are only a few pronounced reflections, because these few taps enlarge the
standard deviation and most other taps then lie inside it. On the other hand, the value will be close to one for extremely dense
reflection patterns.
Accordingly, the echo density profile η(t) is defined as the fraction of impulse response taps
which lie outside the window standard deviation.
$$\eta(t) = \frac{1}{\mathrm{erfc}(1/\sqrt{2})} \sum_{\tau=t-\delta}^{t+\delta} w(\tau)\, \mathbf{1}\{|h(\tau)| > \sigma\}$$ (Eq. 2.10)
where h(t) is the (zero mean) impulse response, 2δ + 1 is the window length (in samples),
erfc(1/√2) ≐ 0.3173 is the expected fraction of samples lying outside a standard deviation
from the mean for a Gaussian distribution, 1{·} is the indicator function returning one when its argument is true and
zero otherwise, w(t) is a positive weighting function (to reduce the effect of the taps at the
window edges) and σ is the window standard deviation
$$\sigma = \left[ \sum_{\tau=t-\delta}^{t+\delta} w(\tau)\, h^2(\tau) \right]^{1/2}$$ (Eq. 2.11)
where w(τ) is normalized to have unit sum, $\sum_{\tau} w(\tau) = 1$.
For a typical room impulse response or an impulse response taken from a high-quality
artificial reverberator the echo density profile starts near zero and increases to around one,
which indicates the start of the late field. The rate of increase and the time at which the value
of one is reached will depend on the properties of the room or algorithm.
Abel and Huang suggest setting the length of the sliding window to about 20-30 ms. A shorter
window will be more responsive to short-term echo density changes, but can also cause jumps
in the profile, which should be avoided. Individual echoes with large amplitudes can be
responsible for sudden changes in the echo density profile and thus a weighting function (e.g.
Hanning window) should be used in order to smooth out the curve (taps at the window edges
will be de-emphasized).
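A direct, unoptimised Python sketch of Eq. 2.10 and Eq. 2.11 with a Hann-shaped weighting might look as follows. The function names, the test signals and the window of 401 taps (roughly 9 ms at 44.1 kHz, shorter than the 20-30 ms suggested above, chosen only to keep this pure-Python example fast) are assumptions of this sketch and do not reproduce the implementation used in the thesis.

```python
import math
import random

ERFC_1_SQRT2 = math.erfc(1.0 / math.sqrt(2.0))  # about 0.3173


def hann_weights(num_taps):
    """Strictly positive Hann-shaped weights, normalised to unit sum."""
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * (k + 1) / (num_taps + 1))
         for k in range(num_taps)]
    total = sum(w)
    return [v / total for v in w]


def echo_density_profile(h, half_width):
    """Echo density (Eq. 2.10) for every window position that fits inside h."""
    w = hann_weights(2 * half_width + 1)
    profile = []
    for t in range(half_width, len(h) - half_width):
        segment = h[t - half_width:t + half_width + 1]
        # Weighted window standard deviation (Eq. 2.11)
        sigma = math.sqrt(sum(wk * x * x for wk, x in zip(w, segment)))
        # Weighted fraction of taps lying outside one standard deviation
        outside = sum(wk for wk, x in zip(w, segment) if abs(x) > sigma)
        profile.append(outside / ERFC_1_SQRT2)
    return profile


# Two synthetic test signals: dense Gaussian noise and a sparse train of echoes
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(3000)]
sparse = [1.0 if n % 100 == 0 else 0.0 for n in range(3000)]

eta_noise = echo_density_profile(noise, 200)
eta_sparse = echo_density_profile(sparse, 200)
mean_eta_noise = sum(eta_noise) / len(eta_noise)
max_eta_sparse = max(eta_sparse)
print(f"Gaussian noise: mean echo density {mean_eta_noise:.2f}")
print(f"sparse echoes:  max echo density {max_eta_sparse:.2f}")
```

The noise signal yields values near one and the sparse echo train values near zero, which corresponds to the two extremes of the profile described above.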
Fig. 2.5.1: Top: Echo density profiles (Hanning weighting, window length 20 ms) for
the impulse responses of a lobby (red, green) and a hallway (blue, yellow). Bottom:
Corresponding impulse responses.
As shown in Fig. 2.5.1, the echo density profile and the time at which the late field is reached
are very similar for the impulse responses measured in the same room. This suggests that the
echo density profile is not noticeably affected by the microphone placement or source
position. Generally, in big rooms it takes more time for the reflections to build up whereas in
smaller rooms the initial echo density is higher. The slope of the curve is dependent on the
number of reflective surfaces as well as their distances to the sound source and to the listening
position [AH06].
Fig. 2.5.2: Echo density profiles (Hanning weighting, window length 26 ms) for
the impulse responses of a concert hall located in Pori, Finland (top), and for the
impulse response taken from a Lexicon Vocal Hall algorithm (bottom).
2.5 Convolution Reverb
A very common approach to reverberate a dry signal is to simply convolve the input signal
with a room impulse response. For long impulse responses the convolution is usually
performed in the frequency domain, where it reduces to a simple multiplication.
This can be done by computing the Fourier transform of the impulse response and the block
by block Fourier transform of the input signal. Both spectra can then be multiplied point by
point and the result is transformed back to the time domain [Zöl11].
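The principle of frequency domain convolution can be illustrated with a small, self-contained Python sketch. It uses a simple recursive radix-2 FFT written for this example and processes the whole input as a single block; the block-by-block (overlap-add) processing mentioned above is deliberately left out, and the short test signals are arbitrary values chosen for illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. No 1/N scaling."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def fft_convolve(x, h):
    """Linear convolution via zero-padding, spectral multiplication and inverse FFT."""
    n_out = len(x) + len(h) - 1
    n = 1
    while n < n_out:  # next power of two that avoids circular wrap-around
        n *= 2
    spec_x = fft([complex(v) for v in x] + [0j] * (n - len(x)))
    spec_h = fft([complex(v) for v in h] + [0j] * (n - len(h)))
    product = [a * b for a, b in zip(spec_x, spec_h)]
    y = fft(product, inverse=True)
    return [v.real / n for v in y[:n_out]]


def direct_convolve(x, h):
    """Time domain reference implementation."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


dry = [1.0, 0.5, -0.25, 0.0, 2.0]   # arbitrary dry input
ir = [1.0, 0.0, 0.6, 0.0, 0.36]     # arbitrary toy impulse response
wet_fft = fft_convolve(dry, ir)
wet_direct = direct_convolve(dry, ir)
print(max(abs(a - b) for a, b in zip(wet_fft, wet_direct)))
```

Both paths give the same result up to rounding; the frequency domain version only pays off for long impulse responses, where the cost of the direct double loop dominates.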
Though convolution is a very accurate method to simulate the characteristics of a room, the
quality of the resulting reverberation naturally depends on the quality of the impulse response.
Another drawback is that with convolution there are no parameters to manipulate the decay
characteristics like reverberation time and frequency properties.
2.6 Algorithmic Reverberation
A variety of techniques, like ray-tracing or the image-source model, can be used to create an
impulse response by modelling how sound propagates in a room with a given geometry
[Ruo12]. However, a common approach is to simulate the room acoustics from a perceptual
point of view, since the human hearing is not very sensitive to details when it comes to
evaluating reverb tails. Most reverberation algorithms use feedback structures to create echoes
of the input signal, and some of them are described in more detail in the following chapters
[Zöl11].
2.6.1 The Recursive Comb Filter
The recursive comb filter was introduced by Manfred Schroeder at Bell Laboratories in 1961
as a computationally inexpensive module to create artificial reverberation [Zöl11].
It consists of a delay line which is inserted into a feedback loop with gain g (less than one for
stability) in order to produce multiple echoes over time [Schro62].
Fig. 2.3.0: Structure of a comb filter
The impulse response of such a structure corresponds to an exponentially decaying repeated
echo.
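A minimal Python sketch of such a comb filter is given below. The text does not state the difference equation explicitly, so the form y(n) = x(n - m) + g·y(n - m), with the delay line inside the feedback loop, is an assumption of this example; the delay of four samples is chosen only to keep the printed impulse response short.

```python
def comb_filter(x, m, g):
    """Recursive comb filter, assumed form: y(n) = x(n - m) + g * y(n - m)."""
    y = [0.0] * len(x)
    for n in range(m, len(x)):
        y[n] = x[n - m] + g * y[n - m]
    return y


# Impulse response for a delay of m = 4 samples and a loop gain of g = 0.5
impulse = [1.0] + [0.0] * 12
comb_ir = comb_filter(impulse, m=4, g=0.5)
print(comb_ir)
```

Every m samples an echo appears whose amplitude is g times that of the previous one, i.e. the exponentially decaying repeated echo described above.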
Fig 2.3.1: Impulse response of a comb filter with 𝜏=10ms and g=0.5
Fig 2.3.2: Energy decay relief of a comb filter with 𝜏=10ms and g=0.5
The amplitude-spectrum of this impulse train is given as
$$|H(\omega)| = \frac{1}{\left(1 + g^2 - 2g\cos(\omega\tau)\right)^{1/2}}$$ (Eq. 2.12)
where the ratio of the response maxima to minima is
$$\frac{H_{max}}{H_{min}} = \frac{1+g}{1-g}$$ (Eq. 2.13)
Fig 2.3.3: Magnitude response and phase of a comb filter with
τ=0.2ms and g=0.5, Fs=44100 Hz
With the above gain setting of g=0.5 (which corresponds to an attenuation of 6 dB for every trip
around the feedback loop), the ratio of the maximum to the minimum of the magnitude response
is about 10 dB, which results in a very unnatural, 'ringing' sound [Schro62].
As a result, the resonant frequencies we perceive as 'metallic' are spaced by 1/τ Hz on the
frequency axis [Schro62].
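The figures quoted for g = 0.5 and τ = 0.2 ms follow directly from Eq. 2.12 and Eq. 2.13, as the following Python sketch shows. The function names are chosen for this illustration only.

```python
import math


def comb_magnitude(freq_hz, tau, g):
    """Magnitude response of the comb filter (Eq. 2.12) at frequency freq_hz."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / math.sqrt(1.0 + g * g - 2.0 * g * math.cos(omega * tau))


def max_to_min_ratio_db(g):
    """Ratio of response maxima to minima (Eq. 2.13), expressed in dB."""
    return 20.0 * math.log10((1.0 + g) / (1.0 - g))


tau = 0.2e-3  # delay time in seconds
g = 0.5       # loop gain

peak_spacing_hz = 1.0 / tau                            # spacing of the resonances
h_max = comb_magnitude(peak_spacing_hz, tau, g)        # on a resonance peak
h_min = comb_magnitude(0.5 * peak_spacing_hz, tau, g)  # halfway between two peaks
ratio_db = max_to_min_ratio_db(g)
loop_attenuation_db = -20.0 * math.log10(g)

print(f"peak spacing: {peak_spacing_hz:.0f} Hz")
print(f"max/min ratio: {ratio_db:.2f} dB, loop attenuation: {loop_attenuation_db:.2f} dB")
```

Evaluating max_to_min_ratio_db for larger gains shows how quickly the peaks grow: the ratio rises from about 9.5 dB at g = 0.5 to about 25.6 dB at g = 0.9.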
Fig 2.3.4: Pole-Zero map of a comb filter with 𝜏=0.2ms and g=0.5, Fs=44100
There are m poles in the pole-zero diagram (m being the delay length in samples) and each pole is
responsible for one frequency peak in the spectrum, which further means that there are m/2
frequency peaks below half of the sampling frequency. Therefore, reducing the delay time
will on the one hand increase the number of echoes per second, but on the other hand reduce
the modal density as there are fewer peaks in the spectrum [Fre00]. In real rooms, the density
of modes above a certain frequency becomes so high that they interfere and cannot be
distinguished by the human ear [SL61]. A combination of high gain settings and short delay
times exposes the unnatural resonances of the recursive comb filter even more, because the
maximum-to-minimum ratio of the magnitude response increases [Fre00].
2.6.2 The Allpass Filter
To avoid the problem of unnatural resonances in the frequency domain, Schroeder introduced
the delay based allpass filter. This structure uses a feed-forward path, where the input signal
is multiplied with the negative gain of the feedback path. As opposed to the comb filter, the
resulting frequency response is flat while still producing a dense impulse response [Zöl11].
Fig. 2.3.5: Flowgraph of an allpass filter
The output of this structure is given by
$$y(n) = -g \cdot x(n) + x(n-m) + g \cdot y(n-m)$$ (Eq. 2.14)
The resulting impulse response shares the property of exponential decay of energy and is
similar to the comb filter's impulse response except for the negative peak at the beginning
[Schro62, Fre00].
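Eq. 2.14 translates almost literally into code. The following Python sketch computes the impulse response of a single allpass section; the delay of three samples is an arbitrary choice that keeps the example short, and the function name is hypothetical.

```python
def allpass_filter(x, m, g):
    """Delay based allpass: y(n) = -g * x(n) + x(n - m) + g * y(n - m) (Eq. 2.14)."""
    y = []
    for n in range(len(x)):
        x_delayed = x[n - m] if n >= m else 0.0
        y_delayed = y[n - m] if n >= m else 0.0
        y.append(-g * x[n] + x_delayed + g * y_delayed)
    return y


# Impulse response for m = 3 samples and g = 0.5
impulse = [1.0] + [0.0] * 120
allpass_ir = allpass_filter(impulse, m=3, g=0.5)
allpass_energy = sum(v * v for v in allpass_ir)

print(allpass_ir[:10])
print(f"total energy of the impulse response: {allpass_energy:.6f}")
```

The first tap is the negative peak -g, followed by echoes of amplitude (1 - g²), g(1 - g²), and so on. The total energy of the response equals that of the input impulse, which is consistent with a flat magnitude response.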
Fig. 2.3.6: Impulse response of an allpass filter with 𝜏=10ms and g=0.5
Fig. 2.3.7: EDR of an allpass filter with 𝜏=10ms and g=0.5
Due to its flat frequency response, the allpass filter does not colour the sound from a
perceptual point of view as long as the delay time is much shorter than the integration time
of the ear (i.e. the duration of the interval in which stimuli appear to be summated [Tou98]),
which is about 50 ms. In any other case, the time domain effects become audible and
coloration effects can be perceived [Fre00].
Also, in the case of a stationary input, the coloration created by the comb filter is suppressed
by the allpass filter due to its flat frequency response. However, for short transient signals, the
low echo density still causes fluttering sound and the timbre of the comb filter is still present
[JC91].
Fig 2.3.9: Magnitude response and phase of an allpass filter with 𝜏=10ms and g=0.5
Some zeros at the conjugate reciprocal locations to the poles are now added to the pole-zero
map, as the z-transform of the allpass filter is given by [Fre00]:
$$H(z) = \frac{z^{-m} - g}{1 - g z^{-m}}$$ (Eq. 2.15)
The relation between loop gain, delay and reverberation time of a single allpass filter module
can be given as [Schro62]:
$$T_{60} = \left(\frac{3}{\log_{10}\left|\frac{1}{g}\right|}\right) \cdot \tau$$ (Eq. 2.16)
For a reverberation time of T = 2 s and a gain setting of 0.708, the delay time must be 100 ms.
This produces only ten echoes per second, which is far too few for a natural reverberation. For
a single component, whether allpass or comb filter, the echo density is nowhere near the
echo density of a real room, and thus a solution to this problem is to connect
several of these modules within larger structures [Schro62].
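The relation of Eq. 2.16 can be solved for either the delay time or the gain. The Python sketch below retraces the example with T = 2 s and g = 0.708; the helper names are chosen for this illustration.

```python
import math


def delay_for_t60(t60, g):
    """Delay time (s) that yields reverberation time t60 for loop gain g (Eq. 2.16)."""
    return t60 * math.log10(1.0 / abs(g)) / 3.0


def gain_for_t60(t60, tau):
    """Loop gain that yields reverberation time t60 for delay time tau (Eq. 2.16)."""
    return 10.0 ** (-3.0 * tau / t60)


tau = delay_for_t60(2.0, 0.708)
echoes_per_second = 1.0 / tau
gain = gain_for_t60(2.0, 0.1)

print(f"delay time: {1000.0 * tau:.1f} ms, echoes per second: {echoes_per_second:.1f}")
print(f"gain for T60 = 2 s and a delay of 100 ms: {gain:.3f}")
```

Shortening the delay raises the echo rate, but the gain then has to move closer to one to keep the same reverberation time, which in turn sharpens the resonances of a comb filter.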
2.7 Classic Filter Network Structures
This section will discuss some common filter networks, consisting of various connections of
comb and allpass filters.
2.7.1 Parallel Comb Filters
If several parallel comb filters are used, as shown in Fig. 2.6.1, it is not possible to achieve a
flat frequency response. However, as long as a sufficient number of different frequency peaks
are added together, the impulse response will become closer to that of real rooms. Jot and
Chaigne have shown that all comb filters have the same decay rate if the magnitudes of their
poles are made equal. This will reduce the effect of the individual resonances of the
comb filters and make them less noticeable [JC91].
Fig. 2.6.1: Four parallel comb filters
When connecting P comb filters in parallel, the system transfer function can be given as:
$$C(z) = \sum_{p=0}^{P-1} \frac{g_p}{z^{m_p} - g_p} = \sum_{p=0}^{P-1} \sum_{k_p=0}^{m_p-1} \left[ \frac{1}{m_p} \cdot \frac{z_{k_p}}{z - z_{k_p}} \right]$$ (Eq. 2.17)
where m_p are the delay lengths in samples, g_p are the gains and $z_{k_p} = \gamma \cdot e^{j\omega_{k_p}}$, where $\gamma = g_p^{1/m_p}$
and $\omega_{k_p} = 2\pi k_p / m_p$.
For the above system, the following condition for equal magnitude of the poles must be
fulfilled for any p:
$$\gamma = g_p^{1/m_p}$$ (Eq. 2.18)
In this context the frequency density Df and time density Dt (as discussed in chapter 2.3) for a
parallel comb filter with P delays and corresponding delay lengths τ_p can be approximated as
follows:
$$D_f = \sum_{p=0}^{P-1} \tau_p \approx P \cdot \tau$$ (Eq. 2.19)
$$D_t = \sum_{p=0}^{P-1} \frac{1}{\tau_p} \approx \frac{P}{\tau}$$ (Eq. 2.20)
Given that about 10000 echoes per second and a frequency density of 0.15 for a reverberation time of one
second are considered necessary to obtain a natural and smooth reverberation for transient
input signals, a total of 40 comb filters with an average delay length of about 12 ms is
needed. Even more comb filters will be required when approximating large rooms, because
the average separation of frequency peaks is inversely proportional to the reverberation time
[JC91].
2.7.2 Combination of Comb- and Allpass Filters
In search for a more efficient algorithm, Schroeder proposed a network of four parallel comb
filters with incommensurate delay lengths in series with two allpass filters, as shown in Fig.
2.6.2. The comb filters are responsible for a sufficient frequency density, whereas the allpass
filters should increase the echo density [Schro62].
Fig. 2.6.2: Delay Network consisting of four parallel comb filters followed by two allpass
filters in series.
This structure has been implemented in Matlab with the following settings (as suggested by
Schroeder [Schro62]), and plots are provided below.
Sampling Frequency: Fs = 44100 Hz
Comb Filter Delays: τ1 = 1327, τ2 = 1553, τ3 = 1801, τ4 = 1979 [Samples]
Comb Filter Gains: g1 = 0.81, g2 = 0.78, g3 = 0.754, g4 = 0.733
Allpass Filter Delays: τ5 = 221, τ6 = 75 [Samples]
Allpass Filter Gains: g5 = g6 = 0.7
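For readers without access to the Matlab implementation, the structure can be sketched in plain Python with the settings listed above. This is not the code used for the plots in this thesis: it assumes that the four gains belong to the four comb filters in the listed order, that each comb filter has the form y(n) = x(n - m) + g·y(n - m), and that the allpass sections follow Eq. 2.14. The sketch also evaluates the frequency and time density of the comb bank according to Eq. 2.19 and Eq. 2.20.

```python
FS = 44100
COMB_DELAYS = [1327, 1553, 1801, 1979]   # samples
COMB_GAINS = [0.81, 0.78, 0.754, 0.733]  # assumed to map to the delays in order
ALLPASS_DELAYS = [221, 75]               # samples
ALLPASS_GAIN = 0.7


def comb(x, m, g):
    """Recursive comb filter, assumed form y(n) = x(n - m) + g * y(n - m)."""
    y = [0.0] * len(x)
    for n in range(m, len(x)):
        y[n] = x[n - m] + g * y[n - m]
    return y


def allpass(x, m, g):
    """Allpass section y(n) = -g * x(n) + x(n - m) + g * y(n - m) (Eq. 2.14)."""
    y = []
    for n in range(len(x)):
        x_delayed = x[n - m] if n >= m else 0.0
        y_delayed = y[n - m] if n >= m else 0.0
        y.append(-g * x[n] + x_delayed + g * y_delayed)
    return y


def schroeder_reverb(x):
    """Four parallel comb filters followed by two allpass filters in series."""
    branches = [comb(x, m, g) for m, g in zip(COMB_DELAYS, COMB_GAINS)]
    y = [sum(taps) for taps in zip(*branches)]
    for m in ALLPASS_DELAYS:
        y = allpass(y, m, ALLPASS_GAIN)
    return y


# Impulse response over the first 250 ms
impulse = [1.0] + [0.0] * (FS // 4 - 1)
reverb_ir = schroeder_reverb(impulse)

# Densities of the comb bank (Eq. 2.19 and Eq. 2.20), delays converted to seconds
freq_density = sum(m / FS for m in COMB_DELAYS)  # modes per Hz
echo_density = sum(FS / m for m in COMB_DELAYS)  # echoes per second
print(f"frequency density: {freq_density:.3f} per Hz, echo density: {echo_density:.0f} per s")
```

With these delays the comb bank provides a frequency density of about 0.15 modes per Hz but only on the order of a hundred echoes per second before the allpass sections, which illustrates why the allpass filters are needed to raise the echo density.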
Fig. 2.6.4: EDR of Schroeder's combined comb- and allpass filter reverberator
Fig. 2.6.5: Echo Density Profile of Schroeder's combined comb- and allpass filter reverberator
While this network provides a reasonable frequency density, the echo density is much lower
than in real rooms, as shown in Fig. 2.6.5. The Echo Density Profile does not reach Gaussian
distribution and consequently this structure is not able to provide flutter free reverberation for
transient test signals such as clicks.
2.7.3 Nested Series Allpass Network
In his early papers Schroeder [SL61] proposed a second reverberator which consists of five
allpass filters in series, nested within an outer allpass filter, as shown in Fig. 2.6.6. He
suggested choosing the delay length of each inner module to be about one third of the
preceding delay length. Again, these ratios should be made incommensurate to avoid echo
cancellation or superposition. The feedback gains g_n are most commonly set to
around 0.7.
The delay line of the outer filter can be used to introduce a time gap (predelay) between the
direct sound and the onset of reverberation. For large concert halls, this value is often set to
around 30 ms, depending on the position of the listener. The absolute value of the outer
feedback gain should be made less than one to guarantee stability. The ratio of the direct sound to
the reverberant sound is given by g²/(1−g²) [SL61].
Fig. 2.6.6: In this structure a series of five allpass filters (figured as the ‘All-pass reverberator
gain, 1’ block) is nested within an outer allpass.
Due to its nested structure this reverberator provides a surprisingly high echo density. As it
exclusively consists of allpass filters, each producing a flat frequency response, the resulting
overall frequency response of the whole structure is also flat.
Fig. 2.6.7: EDR of Schroeder's nested allpass cascade algorithm. The network consists of
several allpass units; consequently, the resulting frequency response is flat.
When analysing the Echo Density Profile (as shown in Fig. 2.6.8) of this structure, a
fundamental property can be observed: The profile increases over time, as it is the case in real
acoustic spaces. Accordingly, putting allpass filters into feedback loops can be used as an
efficient method to increase the echo density of artificial reverberation algorithms.
Fig. 2.6.8: Echo Density Profile of Schroeder's nested allpass cascade algorithm. The profile
increases at a slow rate but finally reaches a Gaussian distribution at about 600 ms.
3 Feedback Delay Networks
This chapter will focus on a very common reverberator structure, the Feedback Delay
Network (FDN). This algorithm was first introduced by Stautner and Puckette [SP82] and is
based on delay lines which are connected by means of a feedback matrix [Zöl11]. This
structure is intended to combine the high echo density provided by series allpass filters
with the ability to simulate the frequency response of real rooms, as can be achieved with
parallel comb filters [Fre00].
3.1 General FDN Structure
The FDN can be described as a vector generalization of the recursive comb filter. The order N
of the system is defined by the number of delay lines, each being $\tau_i = m_i T_s$ seconds long,
where $m_i$ is the delay length in samples and $T_s = 1/F_s$ is the sampling interval. The feedback
gain of the unit comb filter is replaced by an N x N feedback matrix A with elements $a_{i,j}$. A
fourth order FDN is shown in Fig. 3.1.1 [Zöl11].
Fig. 3.1.1: Structure of a fourth order FDN
The FDN is completely described by the following properties:
$$y(n) = \sum_{i=1}^{N} c_i\, s_i(n) + d\, x(n) \qquad \text{(Eq. 3.1)}$$
$$s_i(n + m_i) = \sum_{j=1}^{N} a_{i,j}\, s_j(n) + b_i\, x(n) \qquad \text{(Eq. 3.2)}$$
The output signal y(n) is a linear combination of the input signal x(n) and the individual
outputs of the delay lines $s_i(n)$ [JC91]. The delay lengths $m_i$ are generally large integers
on the order of hundreds or thousands [Zöl11]. Jot and Chaigne have investigated the
possibilities of FDNs very thoroughly. Using the z-transform, the above equations can be
written as:
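$$y(z) = \mathbf{c}^T \mathbf{s}(z) + d\, x(z) \qquad \text{(Eq. 3.3)}$$
$$\mathbf{s}(z) = \mathbf{D}(z)\, [\mathbf{A}\, \mathbf{s}(z) + \mathbf{b}\, x(z)] \qquad \text{(Eq. 3.4)}$$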
Column vectors b and c can be used for multiple input-output systems, A is called the
feedback matrix and the delay matrix D(z) is defined as:
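$$\mathbf{D}(z) = \begin{bmatrix} z^{-m_1} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & z^{-m_N} \end{bmatrix} \qquad \text{(Eq. 3.5)}$$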
Now it is possible to find the system's transfer function H(z) [JC91]:
$$H(z) = \frac{y(z)}{x(z)} = \mathbf{c}^T [\mathbf{D}(z^{-1}) - \mathbf{A}]^{-1} \mathbf{b} + d \qquad \text{(Eq. 3.6)}$$
The poles of the system can be found by solving the characteristic equation of the system:
$$\det[\mathbf{A} - \mathbf{D}(z^{-1})] = 0 \qquad \text{(Eq. 3.7)}$$
It is not trivial to solve this equation, but it has been shown that the stability of this system can
be ensured if the feedback matrix A is unitary, i.e. A* A = I, where A* is the Hermitian
transpose of A. Moreover, this choice leads to a lossless FDN prototype, because all poles of a
unitary feedback loop are located on the unit circle. Consequently, the system has only non-
decaying eigenmodes [JC91 and Jot97].
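The recursion of Eq. 3.1 and Eq. 3.2 and the lossless property can be illustrated with a small time-domain simulation. This is a minimal sketch under stated assumptions: a fourth order network, very short illustrative delay lengths, a real Householder matrix as the unitary feedback matrix, and all input and output coefficients set to one. The function names `householder` and `fdn` are chosen here for illustration.

```python
def householder(n):
    """A = (2/N) e e^T - I, a real orthogonal (unitary) matrix."""
    return [[2.0 / n - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]

def fdn(x, delays, A, b, c, d):
    """Run Eq. 3.1 / Eq. 3.2 sample by sample.

    Each delay line is a circular buffer of length m_i: the value written at
    time n is read again at time n + m_i.
    Returns the output signal and the energy left in the delay lines.
    """
    n_lines = len(delays)
    buffers = [[0.0] * m for m in delays]
    pos = [0] * n_lines
    y = []
    for xn in x:
        # s_i(n): current outputs of the delay lines
        s = [buffers[i][pos[i]] for i in range(n_lines)]
        # Eq. 3.1: output is a mix of delay line outputs plus the direct signal
        y.append(sum(c[i] * s[i] for i in range(n_lines)) + d * xn)
        # Eq. 3.2: feed back through A and add the input
        for i in range(n_lines):
            buffers[i][pos[i]] = sum(A[i][j] * s[j] for j in range(n_lines)) + b[i] * xn
            pos[i] = (pos[i] + 1) % delays[i]
    stored_energy = sum(v * v for buf in buffers for v in buf)
    return y, stored_energy

delays = [3, 5, 7, 11]                      # illustrative lengths in samples
impulse = [1.0] + [0.0] * 999
h, stored_energy = fdn(impulse, delays, householder(4),
                       b=[1.0] * 4, c=[1.0] * 4, d=0.0)
```

The impulse puts one unit of energy into each of the four delay lines. Because the feedback matrix preserves the norm of the vector it is applied to, `stored_energy` is still 4 after a thousand samples, i.e. the prototype does not decay.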
The matrix A should have no null coefficients to provide a faster increase of the echo density
and ideally all coefficients should have the same magnitude in order to provide a minimum
crest-factor (ratio of largest coefficient over RMS average of all coefficients). The latter will
speed up the convergence to a Gaussian amplitude distribution [Jot97].
As suggested by Jot [Jot97], several classes of unitary matrices can be used for this purpose:
- Householder matrices of the type $\mathbf{A} = (2/N)\, \mathbf{e}\mathbf{e}^T - \mathbf{I}$, where $\mathbf{e} = [1 \ldots 1]^T$ and I is the
  identity matrix. In this way the complexity of an implementation is reduced to 2N
  numerical operations for an N by N matrix. However, Householder matrices have a high
  crest factor for large N.
- Hadamard matrices can be implemented with butterfly networks and require $N \log_2 N$
  additions and N multiplications.
- Circulant matrices can also be implemented very efficiently by using two FFTs and N
  complex products.
Generally, with the matrices described above about 8 to 16 delay units should be sufficient to
provide a natural reverberation with an adequate time and frequency density [Jot97].
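The difference in crest factor between the matrix classes can be made concrete with a short calculation. The sketch below builds a Householder matrix and a normalized Hadamard matrix of order 16, checks that both are orthogonal, and evaluates the crest factor as defined above. The Sylvester construction used for the Hadamard matrix is an assumption made here for illustration; the thesis does not state how its matrices were generated.

```python
import math

def householder(n):
    """A = (2/N) e e^T - I."""
    return [[2.0 / n - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]

def hadamard(n):
    """Normalized Hadamard matrix via the Sylvester construction (n a power of two)."""
    h = [[1.0]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]

def is_orthogonal(a, tol=1e-12):
    """Check A^T A = I column by column."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            dot = sum(a[k][i] * a[k][j] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

def crest_factor(a):
    """Largest coefficient magnitude divided by the RMS of all coefficients."""
    values = [abs(v) for row in a for v in row]
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return max(values) / rms

N = 16
crest_householder = crest_factor(householder(N))   # diagonal entries dominate
crest_hadamard = crest_factor(hadamard(N))         # all entries have equal magnitude
```

For N = 16 the Householder matrix has diagonal entries of magnitude 0.875 against an RMS of 0.25, giving a crest factor of 3.5, whereas the Hadamard matrix reaches the minimum of 1.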
3.2 Obtaining frequency-dependent reverberation times
The choice of a unitary feedback matrix leads to a lossless prototype FDN, which creates an
infinite, non-decaying impulse response. However, in real rooms low frequencies will
naturally decay slower than high frequencies due to the absorption properties of the walls. In
order to obtain a frequency dependent reverberation time each delay line can be connected in
series with a corresponding absorbent filter h(z).
Jot [JC91] has investigated this method on the basis of Schroeder’s parallel comb filter, which
is equivalent to the well-known case of a diagonal feedback matrix.
To avoid any unpleasant resonating frequencies it is important that every comb filter decays at
the same relative rate. More precisely, all system poles corresponding to neighbouring
eigenmodes must have the same magnitude. This condition is called the continuity of the pole
locus [JC91].
Generally, when the feedback gain of a comb filter is replaced by an absorbent filter, both
the decay time and the magnitude of the frequency response are modified. As long as the
continuity of the pole locus is fulfilled, this magnitude can be kept independent of the decay
characteristics by connecting a tone correction filter t(z) in series with the comb filter [JC91].
3.2.1 Implementation using 1st order filters
For each delay line there will be one absorbent filter with the same structure but different
coefficients. The absorbent filters can be implemented as first order IIR filters with the
following transfer function hp(z):
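$$h_p(z) = k_p\, \delta k_p(z) \quad \text{where} \quad \delta k_p(z) = \frac{1 - b_p}{1 - b_p z^{-1}} \qquad \text{(Eq. 3.8)}$$
If $0 \le b_p < 1$ this will provide a low-pass filter, and the gains $k_p$ can be computed from the
desired reverberation time $T_r$ at zero frequency:
$$K_p = 20 \log_{10}(k_p) = \frac{-60\, \tau_p}{T_r(0)} \qquad \text{(Eq. 3.9)}$$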
The coefficients $b_p$ can be determined as follows:
$$b_p = K_p\, \frac{\ln(10)}{80} \left[ 1 - \frac{1}{\alpha^2} \right] \quad \text{where} \quad \alpha = \frac{T_r(\pi)}{T_r(0)} \qquad \text{(Eq. 3.10)}$$
The tone corrector t(z) is responsible for compensating the frequency response as the
absorbent filters will modify the system poles as described above.
$$t(z) = \frac{1 - b\, z^{-1}}{1 - b} \quad \text{with} \quad b \approx \frac{1 - \alpha}{1 + \alpha} \qquad \text{(Eq. 3.11)}$$
These formulas are valid for small values of $b_p$ and not too long delay times. They can be used
to achieve a desired reverberation time at zero frequency and Nyquist frequency [JC91].
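As a worked example, the coefficients of one absorbent filter can be computed for a single delay line. The sketch below takes the gain $k_p$ from Eq. 3.9. For $b_p$ it does not use the small-value approximation of Eq. 3.10; instead it solves the first order filter of Eq. 3.8 exactly for the attenuation required at the Nyquist frequency, which is a substitution made here for illustration. The delay length of 1151 samples and the reverberation times of 3.5 s and 1 s are example values.

```python
import math

def absorbent_filter(m_p, fs, tr_dc, tr_nyquist):
    """Return (k_p, b_p) for h_p(z) = k_p * (1 - b_p) / (1 - b_p * z^-1)."""
    tau_p = m_p / fs                          # delay length in seconds
    gain_dc_db = -60.0 * tau_p / tr_dc        # Eq. 3.9: attenuation per pass at 0 Hz
    gain_nyq_db = -60.0 * tau_p / tr_nyquist  # same rule applied at Nyquist
    k_p = 10.0 ** (gain_dc_db / 20.0)
    # At Nyquist the low-pass part has magnitude (1 - b_p) / (1 + b_p).
    # Solve that expression for the extra attenuation needed (assumption:
    # exact match at Nyquist instead of the approximation in Eq. 3.10).
    r = 10.0 ** ((gain_nyq_db - gain_dc_db) / 20.0)
    b_p = (1.0 - r) / (1.0 + r)
    return k_p, b_p

def magnitude_db(k_p, b_p, omega):
    """Magnitude of h_p at normalized frequency omega (0 .. pi) in dB."""
    re = 1.0 - b_p * math.cos(omega)
    im = b_p * math.sin(omega)
    return 20.0 * math.log10(k_p * (1.0 - b_p) / math.hypot(re, im))

k_p, b_p = absorbent_filter(m_p=1151, fs=44100, tr_dc=3.5, tr_nyquist=1.0)
loss_dc_db = magnitude_db(k_p, b_p, 0.0)
loss_nyquist_db = magnitude_db(k_p, b_p, math.pi)
```

With these values each pass through the delay line attenuates the signal by about 0.45 dB at zero frequency and about 1.57 dB at the Nyquist frequency, so that the accumulated loss reaches 60 dB after 3.5 s and 1 s respectively.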
More generally, the tone corrector can be implemented as a filter with a magnitude response
that is equal to:
$$|t(z)| = \sqrt{\frac{1}{T_r(\omega)}} \qquad \text{(Eq. 3.12)}$$
where $T_r(\omega)$ is the frequency dependent reverberation time in seconds [JC91].
3.2.2 Control of decay characteristics in FDNs
Following the indications of Jot [JC91] every delay line within the FDN can be cascaded with
a gain kp:
$$k_p = a^{m_p} \qquad \text{(Eq. 3.13)}$$
This condition ensures that the reverberation time can be modified without violating the
principle of equal magnitudes of the poles. Since all poles are contracted by the same factor a,
this is equivalent to replacing D(z) with D(z/a).
Additionally, the attenuation coefficients can be replaced by low-pass filters using the method
described in 3.2.1. By inserting the absorbent filters at the output of each delay unit in the
general network as shown in Fig. 3.1.1 it is possible to obtain a frequency dependent
reverberation time while still satisfying condition 3.13. Finally, the tone corrector t(z) can be
added to the output of the reverberator.
Fig. 3.2.1: General structure of a FDN of order N=3, including absorbent filters hN(z) and
the tone corrector t(z).
3.3 Parameterization of FDNs
As discussed in chapter 3.2 FDNs can provide natural reverberation at low computational
costs. The overall quality of the resulting impulse response depends on the number of delay
lines, their lengths and the choice of coefficients of the unitary feedback matrix. The
frequency dependent reverberation time can be controlled with absorbent filters connected in
series with the output of the delay lines, as shown in Fig. 3.2.1.
The intention of this chapter is to find out about the capabilities of an order N=8 FDN in
terms of its maximum echo density, subjective sound quality and overall variability. Finally it
should be investigated if this structure can be used as the basic tool in order to simulate the
late reverberation of any given impulse responses of real enclosures.
For the eight delay lines, the following lengths in samples have been chosen at a sampling
rate of 44100 Hz:
$m_1 = 587$, $m_2 = 661$, $m_3 = 743$, $m_4 = 827$, $m_5 = 883$, $m_6 = 967$, $m_7 = 1049$, $m_8 = 1151$
The summation of these values leads to a total delay length of 6868 samples, which
corresponds to about 156 ms at a sampling frequency of 44100 Hz. It is important to choose
prime numbers for the delay lengths in order to avoid echo cancellation or superposition.
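The arithmetic behind these figures is easy to verify. The following sketch sums the eight delay lengths listed above, converts the total to milliseconds and confirms that the lengths share no common factors; only the variable names are introduced here.

```python
from itertools import combinations
from math import gcd

FS = 44100                                   # sampling rate in Hz
delays = [587, 661, 743, 827, 883, 967, 1049, 1151]

total_samples = sum(delays)
total_ms = 1000.0 * total_samples / FS

# Mutually prime lengths never line up their echoes on a common period.
mutually_prime = all(gcd(a, b) == 1 for a, b in combinations(delays, 2))

# Individual delay times in milliseconds, e.g. for plotting or documentation.
delays_ms = [1000.0 * m / FS for m in delays]
```

The individual delays range from roughly 13 ms to 26 ms, which is short compared to the lengths that are used later for the final algorithm.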
The choice of the feedback matrix affects the echo density and is also responsible for the
computational efficiency. In general, null-coefficients should be avoided and, as discussed in
the previous chapters, the matrix has to be unitary for a lossless prototype reverberator. In this
case the feedback matrix was taken from the class of Householder matrices, as proposed by
Frenette [Fre00]:
$$\mathbf{A} = \mathbf{J} - \frac{2}{N}\, \mathbf{e}\, \mathbf{e}^T \qquad \text{(Eq. 3.14)}$$
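where J is an N x N circular permutation matrix and e is an N x 1 column vector of ones.
The circular N x N permutation matrix J was implemented as follows:
$$\mathbf{J} = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & 0 & 0 \\ 0 & \cdots & 0 & 1 & 0 \end{bmatrix} \qquad \text{(Eq. 3.15)}$$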
In order to simplify the subjective assessment of the reverberation effect, a longer decay time of
3.5 seconds was chosen for zero frequency, whereas the decay time at the Nyquist
frequency (in this case 22050 Hz) was set to 1 second to mimic the high frequency absorption
of the walls.
Finally, coefficients of the input vector b have all been set to one to provide maximum echo
density and the stereo output matrix c has been set to:
$$\mathbf{c} = \begin{bmatrix} 1 & 1 \\ -1 & 1 \\ 1 & -1 \\ -1 & -1 \\ \vdots & \vdots \end{bmatrix} \qquad \text{(Eq. 3.16)}$$
The first column corresponds to the left output channel. Jot noticed that with the above
feedback matrix periodic clicks can occur in the impulse response. To avoid this sort of clicks,
the signs of every other coefficient can be inverted. The second column was chosen to be as
different as possible from the first column so that the two outputs are perceived as being
uncorrelated [Fre00].
For figures 3.3.1 and 3.3.2 the two outputs of the reverberator were summed to a single mono
output. As expected, the desired frequency dependent reverberation time could be obtained
very accurately with a smooth decay curve across the whole frequency range, as shown in
figure 3.3.1. In addition the echo density profile was computed and it can be observed that
Gaussian distribution was reached after about 200 ms. The initial slope of the profile seems to
be somewhat slower compared to the profile illustrated in Fig. 2.5.2.
Fig. 3.3.1: Energy Decay Relief of the order N=8 Feedback Delay Network
Fig. 3.3.2: Echo Density Profile for the order N=8 Feedback Delay Network
The subjective quality of the reverberator was tested with a mono, 16-bit wav signal of a snare
drum hit. Although the overall decay is very smooth, a very subtle periodic ringing could be
observed. This may be an indicator for an insufficient frequency density or inappropriate
delay times.
4 The Final Algorithm
This chapter will provide information about the development of an algorithm which can
approximate a given room impulse response by automatic parameterization of the used
reverberation system. In general, the reverberation process can be divided into two separate
sections, one for synthesizing the early reflections and one for the simulation of the
reverberation tail.
Fig. 4.6.1: Complete structure of the algorithm. Blocks within the red dashed line are
performed offline.
The individual blocks of the diagram above will be discussed in the following sections.
First of all, in chapter 4.1, we will discuss how we can use the Energy Decay Relief and the
Echo Density Profile to get necessary information about reverberation time, spectral
properties and temporal progress of the impulse response we would like to model. We will
further establish a criterion, according to which we can clearly separate the early part from the
more diffuse late part.
Section 4.2 will describe the changes which have been applied to the Feedback Delay
Network to improve the subjective quality of the reverb tail for various types of input signals.
In the following part (4.3) we will look at how to achieve a frequency dependent
reverberation time, corresponding to the data obtained from the methods used in 4.1.
Furthermore, chapter 4.4 discusses how a spectral correction filter can be used to adjust the
frequency response of the FDN output, so that it is closer to the response of the modelled
room.
Finally, chapter 4.5 describes an approach to simulate the early reflections by using a simple
tapped delay line.
4.1 Impulse Response Analysis
The sonic quality of reverberation is for the most part influenced by the length and
distribution of the early reflections, the frequency dependent decay time and consequently its
overall frequency response [CNW14]. Therefore it is necessary to investigate the room
impulse response regarding these parameters in order to use the knowledge gained from this
process to be able to adjust the controls of the algorithm properly.
4.1.1 Segmentation of Early Reflections and Late Reverberation
The first step in the structure of the algorithm is to load the impulse response specified by the
user. The audio file should be of good quality in terms of noise floor, sample rate and
resolution.
Next, the algorithm needs to decide at which time the sparse
early reflections give way to the very diffuse late part of the reverberation, since both of
those segments are modelled by separate sections of the reverberator.
In order to do this, the Echo Density Profile described in section 2.5 will be calculated. A
good condition for the choice of the transition point would be the time where the Echo
Density Profile first reaches a value of one, since the late reverberation tends towards a
normal distribution. However, tests have shown that this condition is quite prone to errors, so
it was extended in the way suggested by Rebecca Stewart and Damian Murphy [MS07].
The transition point is now defined as the maximum of the Echo Density Profile within a time
window of 60 ms, starting right after a Gaussian distribution was reached for the first time.
Additionally, the transition point has to lie between 60 ms and 150 ms.
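The selection rule can be sketched as a small function that operates on a precomputed Echo Density Profile. Two points are assumptions made here and are not stated in the text: the limits of 60 ms and 150 ms are enforced by clamping the result, and the upper limit is returned when the profile never reaches a value of one. The function name and the synthetic profile are hypothetical.

```python
def transition_time(profile, times_ms, window_ms=60.0, t_min_ms=60.0, t_max_ms=150.0):
    """Pick the early/late transition time from an Echo Density Profile.

    profile  -- echo density values (1.0 corresponds to a Gaussian distribution)
    times_ms -- time stamp in milliseconds for every profile value
    """
    first = next((i for i, v in enumerate(profile) if v >= 1.0), None)
    if first is None:
        return t_max_ms                      # assumption: fall back to the upper limit
    t_start = times_ms[first]
    # Candidates: all profile points inside the window after the first crossing.
    candidates = [i for i, t in enumerate(times_ms) if t_start <= t <= t_start + window_ms]
    best = max(candidates, key=lambda i: profile[i])
    # Assumption: the 60..150 ms constraint is applied by clamping.
    return min(max(times_ms[best], t_min_ms), t_max_ms)

# Synthetic profile sampled every 10 ms, rising to one after 50 ms.
times_ms = [10.0 * i for i in range(31)]
profile = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0, 1.02, 1.05, 1.01, 0.99, 1.0, 1.0] + [1.0] * 19
t_trunc = transition_time(profile, times_ms)
```

In this example the profile first reaches one at 50 ms and peaks at 70 ms inside the following window, so the transition is placed at 70 ms.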
Figure 4.1.1: Echo Density Profile and selected truncation point for a large storage room
(left) and a concert hall (right).
4.1.2 Measuring Reverberation Time and Total Energy
As described in 2.1, the frequency dependent reverberation time can easily be measured by
using the normalized Energy Decay Relief. For each frequency band the reverberation time is
the time at which the EDR has decayed to -60 dB. Moreover, the EDR provides information
about the total energy of the impulse response at any desired time instant t. As the timbre of the
impulse response is strongly connected to the initial energy at t=0, this value can be used to
compare the spectral properties of the original and the synthesized impulse response
[JCW97]. (See chapter 4.3.2)
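The -60 dB criterion can be illustrated for a single band. The sketch below is a simplification: it computes a broadband energy decay curve by backward integration of the squared impulse response instead of a full Energy Decay Relief, and it uses a synthetic, purely exponential decay as test signal. For the EDR the same search would be repeated in every frequency band.

```python
import math

def energy_decay_curve_db(h):
    """Backward-integrated energy of h, normalized to 0 dB at t = 0."""
    edc = [0.0] * len(h)
    acc = 0.0
    for n in range(len(h) - 1, -1, -1):
        acc += h[n] * h[n]
        edc[n] = acc
    total = edc[0]
    return [10.0 * math.log10(e / total) if e > 0.0 else float("-inf") for e in edc]

def reverberation_time(h, fs, level_db=-60.0):
    """Time in seconds at which the decay curve first falls to level_db."""
    for n, value in enumerate(energy_decay_curve_db(h)):
        if value <= level_db:
            return n / fs
    return None                              # the response never decays that far

# Synthetic decay whose amplitude drops by 60 dB within 0.5 s.
FS = 1000
T60 = 0.5
h = [10.0 ** (-3.0 * n / (T60 * FS)) for n in range(2 * FS)]
t60_estimate = reverberation_time(h, FS)
```

The estimate recovers the decay time of 0.5 s that was used to generate the test signal.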
Figure 4.1.2: Reverberation time (top) and the total energy (bottom) for a concert hall. Both
values are calculated using the Energy Decay Relief.
4.2 Improving the Quality of Late Reverberation
A Feedback Delay Network of order N=8 was described earlier in chapter 3. While this
network provided good control over reverberation time for very low and very high
frequencies, the decay suffered from a subtle but still unnatural sounding periodic ringing.
The following sections describe the changes that have been made to the FDN to further
improve the subjective quality of the reverberation tail.
4.2.1 Delay Lines and Delay Lengths
The first step of improvement was to change the order of the FDN from N=8 to N=12, which
implies that four additional delay lines are added to the network. This leads to an increased
total delay length and, in consequence, to a higher modal density. Furthermore the echo
density builds up faster, depending on the length of the individual delays, provided that they
are chosen mutually prime.
Though in theory any set of delay lengths would have fulfilled the objective criteria of this
thesis, their choice is extremely important for the naturalness and smoothness of the
reverberation tail. Tuning the delay lengths of the FDN was consequently one of the most
time consuming parts of the whole development of the proposed algorithm. There does not
seem to exist any systematic approach or rule for improving them,
except that they should be chosen mutually prime, thus avoiding echo cancellation and
superposition. Changing only one of the delay lengths of a decent sounding set to a
neighbouring prime number turned out to be an impractical optimization method. Finally, a
Matlab program was written in order to create random sets of prime numbers within
different intervals. In this way thousands of different delay length combinations were tested
by simply listening to the corresponding impulse responses of the FDN. While the
evaluation still took a lot of time, the various
impulse responses were at least created automatically.
Generally, sets containing prime numbers on an interval with a ratio of about 1:1.6 turned out
to be the most effective. Combinations of primes on a rather low interval from around 600 to
960 (at a sampling frequency of 44100 Hz) produced a fast echo density build-up, but tended
to sound more metallic. Additionally, they often revealed a periodic ringing, which sometimes
can even be perceived as a very disturbing sound of distinct pitch.
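The random generation of candidate delay sets can be sketched as follows. This is not the original Matlab program, which is not reproduced in the text, but a minimal Python equivalent under stated assumptions: primes are found with a sieve, a set is drawn without repetition from the interval 600 to 960 mentioned above, and a fixed seed is used so that a set can be reproduced. The function names are hypothetical.

```python
import random

def primes_between(lo, hi):
    """All primes p with lo <= p <= hi, found with the sieve of Eratosthenes."""
    sieve = [True] * (hi + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(hi ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, hi + 1, p):
                sieve[q] = False
    return [p for p in range(lo, hi + 1) if sieve[p]]

def random_delay_set(lo, hi, count, rng):
    """Draw `count` distinct prime delay lengths (in samples) from [lo, hi]."""
    return sorted(rng.sample(primes_between(lo, hi), count))

rng = random.Random(0)                       # fixed seed for reproducible candidates
candidate = random_delay_set(600, 960, 12, rng)
```

Each call yields one candidate set for an order N=12 network. Distinct primes are automatically mutually prime, so the remaining selection has to be done by listening, as described above.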
The most convincing results could be achieved by raising the interval from which the prime
numbers are taken to around 3000 to 5000. This corresponds to individual delay
lengths of about 68 ms to 113 ms at a sampling frequency of 44100 Hz. Periodicities within the
reverberation tail could still be heard, but they are generally slower, occur in lower registers
and, as a result, are less obvious. In some cases they were very hard to perceive, even with
good headphones and high-end D/A converters. However, the drawback of the increased
delay length is again a slower slope of the echo density profile. Normal distribution was only
reached after 300-400 ms, which does not match the properties of a real room’s impulse
response, where it is usually reached within the first 150 ms. In order to solve this problem,
two further improvements have been applied to the FDN: Firstly, the number of delay lines
has again been increased to 16. We will later see that this provides a good compromise
between quality and computational cost. Moreover, a diffusion section has been added to the
input of the network, consisting of four short allpass filters in series, as suggested by Dattorro.
The purpose of these filters is to de-correlate the signal quickly and to reduce peakedness by
randomizing the signal phase [Dat97].
Figure 4.2.1: Echo Density Profile of the FDN without input diffusion (left) and with input
diffusion (right). The series allpass filters at the input of the network rapidly increase the echo
density, while it takes much longer to reach normal distribution when no diffusion is applied
to the input signal.
4.2.2 Modulation
Another method to improve the quality of late reverberation is to add modulation to delay
lengths, the feedback matrix or the output vector coefficients of the FDN [Fre00]. It can be
used to continuously modify one or several of these parameters over time in order to avoid
repetitive patterns in the reverberation tail. When employed sparingly, it will add a slight
amount of motion and blurriness to the decay without introducing any unnatural pitch-
shifting, as it intentionally happens with time-varying effects like chorus or flanger. Although
there does not really exist a physical equivalent to modulation in real rooms, many well-
known commercial reverberators use it in their algorithms. As a result of constantly changing
the delay lengths, the resonant frequencies of the system will also be changed, which helps to
achieve a flatter frequency response [Fre00]. In theory, within an enclosure this effect could
only be produced by moving the walls of a room back and forth, or – to a certain degree – if
the air temperature and consequently the speed of sound are altered continuously.
Generally, the output of a modulated delay line is given by [Zöl11]:
$$y(n) = x(n - D - m(n)) \qquad \text{(Eq. 4.1)}$$
where D corresponds to a fixed integer delay, and m(n) is the excursion of the modulation on
each side of this value.
$$m(n) = \mathit{width} \cdot \mathit{MOD}(n) \qquad \text{(Eq. 4.2)}$$
MOD is a constantly changing modulation signal, typically a sinusoid or any continuous
signal with values between minus one and one. Width corresponds to the modulation depth in
samples. Thus, the maximum possible delay of the input signal is x(n-D-width) and the
smallest possible delay is x(n-D+width). In general, non-integer delay values become
necessary, and they can be expressed as a whole number plus a fractional part (on the interval
0 < frac < 1). Interpolation is used to calculate the output sample y(n), which
lies in between two consecutive samples. In this way it is possible to avoid signal
discontinuities when the delay times are modulated continuously, as shown in figure 4.2.2
[Zöl11 and Dat97,2].
Figure 4.2.2: A fractional delay line with interpolation.
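Eq. 4.1 and Eq. 4.2 can be sketched as a modulated delay line with a sinusoidal modulation signal. For simplicity this sketch reads the fractional position with linear interpolation between the two neighbouring samples; the function name and all parameter values are illustrative, and the base delay must not be smaller than the modulation width.

```python
import math

def modulated_delay(x, base_delay, width, rate_hz, fs):
    """y(n) = x(n - D - m(n)) with m(n) = width * sin(2*pi*rate*n/fs)."""
    y = []
    for n in range(len(x)):
        delay = base_delay + width * math.sin(2.0 * math.pi * rate_hz * n / fs)
        whole = int(math.floor(delay))       # integer part of the delay
        frac = delay - whole                 # fractional part, 0 <= frac < 1
        newer = x[n - whole] if n - whole >= 0 else 0.0
        older = x[n - whole - 1] if n - whole - 1 >= 0 else 0.0
        # Linear interpolation between the two samples around the read position.
        y.append((1.0 - frac) * newer + frac * older)
    return y

FS = 44100
ramp = [float(n) for n in range(3000)]
y = modulated_delay(ramp, base_delay=100, width=12, rate_hz=2.0, fs=FS)
```

A ramp is a convenient test input, because its delayed version is simply the ramp minus the momentary delay, so `y[n]` follows `n - 100 - m(n)` once the delay line has filled.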
There exist several interpolation algorithms which are suitable for audio signals and delay line
interpolation. The most commonly used and most straightforward method is linear
interpolation [Smi10]. This technique works best for lowpass signals or in combination with
oversampling. However, it introduces a high frequency loss, which can be a problem when
used for the delay lines inside a reverberator: the linear interpolator acts like a
high frequency damping filter, thus affecting the frequency dependent reverberation time.
Fortunately, this does not occur with first-order Allpass Interpolation, which has a flat
frequency response and – like Linear Interpolation – requires only one multiply and two adds
per sample [Smi10].
Figure 4.2.3: First-order Allpass Interpolation.
The difference equation for first order allpass interpolation is given by [Smi10]:
$$x(n - \mathit{frac}) = y(n) = \eta\, [x(n) - y(n-1)] + x(n-1) \qquad \text{(Eq. 4.3)}$$
where $\eta \approx \frac{1 - \mathit{frac}}{1 + \mathit{frac}}$
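The difference equation translates directly into a few lines of code. The following sketch applies a fixed fractional delay to a signal and keeps the filter state in two variables; the function name and the test signals are chosen here for illustration, and a time-varying delay would additionally require updating $\eta$ for every sample.

```python
def allpass_fractional_delay(x, frac):
    """First order allpass interpolation: y(n) = eta*(x(n) - y(n-1)) + x(n-1)."""
    eta = (1.0 - frac) / (1.0 + frac)
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for xn in x:
        yn = eta * (xn - y_prev) + x_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

# Impulse response: the energy stays at one, i.e. no frequency is attenuated.
h = allpass_fractional_delay([1.0] + [0.0] * 199, frac=0.5)
energy = sum(v * v for v in h)

# Slowly varying input: a ramp comes out delayed by `frac` samples.
ramp = [float(n) for n in range(400)]
delayed = allpass_fractional_delay(ramp, frac=0.5)
```

The unit energy of the impulse response reflects the flat magnitude response, while the ramp shows that low frequency content is delayed by the requested half sample once the start-up transient has died away.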
Consequently, this type of interpolation was used for the modulation of all 16 delay lines. Finally, a
modulation depth of around 12 samples and a modulation rate of 1.5 to 2 Hz turned out to be
a good compromise. In this way the quality of the reverberation tail could be improved by
successfully suppressing undesired resonances, while no pitch-shifting artefacts were
noticeable. A simple sinusoid was used as the modulation signal, with different phase shifts
applied in order to change the individual delay lengths in more arbitrary directions, as
suggested by Frenette [Fre00].
4.2.3 Output Vectors and Feedback Matrix
It is worth mentioning that the choice of the output vector c does not only affect the spatial
impression (for detailed information, please refer to chapter 3.3), but also influences the
timbre and resonances of the whole system. A combination of delay lengths which sounds
good with all coefficients of c set to one is not necessarily the best choice when c is set
differently. The same applies to the choice of the feedback matrix A. For this reason the
output vector c and the feedback matrix A of the proposed algorithm correspond to extended
versions of the respective matrices described in 3.3.
Again, the feedback matrix which was used for the FDN is given by:
$$\mathbf{A} = \mathbf{I} - \frac{2}{N}\, \mathbf{e}\, \mathbf{e}^T \qquad \text{(Eq. 4.4)}$$
One advantage of this class of Householder feedback matrices is that the N x N matrix-times-vector
operation can be implemented by adding the values of the input vector, multiplying
the sum by 2/N and subtracting the result from each element of the input vector [Smi10]. In the case where I is an
N x N circular permutation matrix (see Equation 3.15), the values of the resulting vector have
to be circularly shifted by one.
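The shortcut can be compared against the full matrix product. In the sketch below, `householder_apply` implements the sum, scale and subtract procedure with an optional circular shift, and `dense_apply` multiplies with the explicitly constructed matrix for reference. Both function names are introduced here; the shift direction follows the permutation matrix of Equation 3.15, where the last input element moves to the first position.

```python
def householder_apply(x, permute=False):
    """Compute (I - (2/N) e e^T) x, or (J - (2/N) e e^T) x if permute is True."""
    n = len(x)
    offset = (2.0 / n) * sum(x)              # N - 1 additions and one multiplication
    base = [x[-1]] + list(x[:-1]) if permute else list(x)
    return [v - offset for v in base]        # N subtractions

def dense_apply(x, permute=False):
    """Reference implementation using the full N x N matrix (N*N multiplications)."""
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(n):
            if permute:
                first = 1.0 if j == (i - 1) % n else 0.0   # entry of J
            else:
                first = 1.0 if j == i else 0.0             # entry of I
            acc += (first - 2.0 / n) * x[j]
        out.append(acc)
    return out

x = [1.0, -2.0, 3.0, 0.5, 4.0, -1.5, 2.5, 0.0]
fast = householder_apply(x, permute=True)
reference = dense_apply(x, permute=True)
```

Both versions return the same vector, and since the matrix is orthogonal the result has the same norm as the input.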
4.3 Controlling the Reverberation Time
The reverberation time of the FDN can be controlled by inserting absorbent filters with
transfer function hi(z) into all of the N feedback channels. Each absorbent filter introduces a
frequency dependent absorption and is chosen such that the logarithm of its magnitude
response is proportional to the delay length mi, and thus inversely proportional to the
reverberation time Tr(𝜔), derived from the EDR as described in 4.1.2. [Jot92 and CNW14]
By neglecting the absorptive filter’s phase delay, the desired magnitude response of each filter
can be calculated as:
$$|h_i(e^{j\omega})| = 10^{\frac{-60\, m_i\, T}{20\, T_r(\omega)}} \qquad \text{(Eq. 4.5)}$$
where $0 \le \omega = 2\pi f T \le \pi$, f is the frequency in Hz, T is the sampling period in seconds
and Tr(𝜔) is the desired frequency-dependent reverberation time in seconds [Jot92].
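Equation 4.5 can be evaluated and inverted with a few lines of code, which also shows how sensitive the reverberation time is to the filter gain. The sketch below uses a hypothetical delay length of 4001 samples and a desired reverberation time of 4 s; the function names and the magnitude error of 0.3 dB are illustrative values and are not taken from the measurements.

```python
import math

def absorbent_gain(m_i, fs, tr):
    """Desired filter magnitude for a delay of m_i samples (Eq. 4.5, T = 1/fs)."""
    return 10.0 ** (-60.0 * m_i / (20.0 * fs * tr))

def implied_reverberation_time(m_i, fs, gain):
    """Invert Eq. 4.5: reverberation time that results from a given magnitude."""
    return -3.0 * m_i / (fs * math.log10(gain))

FS = 44100
M = 4001                                     # hypothetical delay length in samples
target_tr = 4.0                              # desired reverberation time in seconds

gain = absorbent_gain(M, FS, target_tr)
gain_db = 20.0 * math.log10(gain)

# Assume the fitted filter is 0.3 dB too low at this frequency.
gain_with_error = gain * 10.0 ** (-0.3 / 20.0)
resulting_tr = implied_reverberation_time(M, FS, gain_with_error)
```

The target attenuation per pass is only about 1.36 dB, so an error of 0.3 dB already shortens the reverberation time from 4 s to roughly 3.3 s. Small magnitude errors of the absorbent filters therefore translate into large deviations of the decay time.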
Figure 4.3.1: Desired magnitude response (concert hall) according to equation 4.5 for
absorbent filters related to different delay lengths.
The absorptive filters are implemented as direct-form-I biquad filters and their coefficients
are calculated with the Matlab function yulewalk. This function uses a least-squares fit
to find the numerator and denominator coefficients of an IIR filter such
that the filter’s magnitude response matches the specified desired magnitude response
(equation 4.5). On the one hand, yulewalk provides very accurate approximations for high
frequencies (>10 kHz), but on the other hand it also produces quite considerable errors for the
lower octaves. Unfortunately, it does not offer any way to apply a higher weight to
the lower octaves, so those frequencies are forced to decay too quickly, as shown in figure
4.3.2.
Figure 4.3.2: The red line shows that the magnitude approximation is slightly too low for
frequencies up to 4 kHz and slightly too high for frequencies above around 10 kHz (top).
Even very small errors for low frequencies can result in drastic errors of around one second
for the reverberation time (bottom).
The magnitude error could be optimized by manipulating the measured reverberation time for
the calculation of the absorbent filters. This was done by adding a certain percentage of the
reverberation time for each frequency, depending on the average decay time of the IR. In this
way, it was possible to apply an ‘artificial’ frequency dependent weight to the Matlab function
yulewalk.
Figure 4.3.3: By adding an offset to the reverberation time, the magnitude error for lower
frequencies could be improved. Due to the usage of second order filters, a bigger error for the
very high frequencies was introduced (top). Interestingly, this is not reflected in the resulting
reverberation time of the synthesized impulse response, which is very close to the original for
frequencies above 500 Hz.
As shown in Fig 4.3.3, very good results could be achieved for mid- and high frequencies.
Due to the limited filter order of N=2, the filter is not capable of approximating all of the
ripples for the bass frequencies successfully. One way to further improve this approach would
be to increase the filter order, which would significantly increase the computation
time of the FDN. However, subjective listening tests have shown that second
order filters are a good trade-off between modelling accuracy and computational efficiency, as
differences within the lowest octaves are generally harder to perceive.
4.4 Spectral Correction Filter
The sound of a room and its related EDR is not only characterized by its frequency dependent
reverberation time Tr(f), but also by its initial power spectrum P(f). In addition to the FDN’s
tone-correction filter described in chapter 3.2.2, a spectral correction filter is introduced to
match the initial spectrum of the FDN with that of the original impulse response. The filter is
applied to the output of the FDN and - in reference to [CNW14] - its magnitude is given as:
$$S_c(f) = \sqrt{\frac{EDR_{IR}(t = trunc, f)}{EDR_{FDN}(t = trunc, f)}} \qquad \text{(Eq. 4.6)}$$
The EDRs are both evaluated at the transition time t=trunc, when the phase of late
reverberation and thus normal distribution of the echo density has been reached (see chapter
4.1.1). The coefficients of the order N=12 linear-phase spectral correction filter are again
calculated with the Matlab function yulewalk.
Figure 4.4.1: The blue line shows the magnitude response of the spectral correction filter
applied to the output of the FDN.
4.5 Synthesizing the Early Reflections
The early reflections typically arrive within less than 150 ms and can be separated from the late
part of the reverberation by truncating the impulse response at the transition time computed in
chapter 4.1.1. It is very important to model these first reflections very accurately, since they
preserve the naturalness and spatial impression of the room response [CNW14]. For this
reason, each channel is analyzed separately.
When it comes to the perceptual approximation of room impulse responses, in many existing
approaches the first part of the original impulse response is convolved with the input signal
[CNW14]. By using this method the correlation between the separate channels is preserved,
thus giving a very natural impression. However, this algorithm concentrates on an alternative
approach, where only the most prominent early reflections are synthesized. The goal is to
further reduce the computational cost and still produce a result that sounds very similar to,
and in the best case indistinguishable from, the original ERs.
The synthesis of the early reflections is realized in the form of a tapped delay line, with delay
times and corresponding gains extracted from the first part of the original impulse response.
In order to reduce the amount of reflections that need to be generated and thus to reduce the
computation time, only the most prominent reflections are synthesized. This is done by
neglecting all reflections with amplitude below 0.08 (for a normalized impulse response).
Furthermore, a sliding window of 0.5 milliseconds is applied to the truncated impulse response.
If a window still contains more than a single reflection, only the strongest echo is
retained. Consequently, the number of early reflections is reduced from a maximum of about
6000 to a maximum of about 270 echoes which have to be generated by the tapped delay line.
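The reduction step and the resulting tapped delay line can be sketched as follows. The threshold of 0.08 and the window length of 0.5 ms are taken from the description above. As a simplifying assumption, the sliding window is realized here as consecutive, non-overlapping blocks, which may differ from the original implementation; the function names and the short test response are hypothetical.

```python
def extract_early_reflections(h, fs, threshold=0.08, window_ms=0.5):
    """Return (delay in samples, gain) pairs of the most prominent reflections."""
    peak = max(abs(v) for v in h)
    norm = [v / peak for v in h]                     # normalized impulse response
    win = max(1, round(window_ms * 1e-3 * fs))       # window length in samples
    taps = []
    for start in range(0, len(norm), win):
        block = norm[start:start + win]
        # Keep only the strongest echo of each window ...
        k = max(range(len(block)), key=lambda i: abs(block[i]))
        # ... and only if it is not below the amplitude threshold.
        if abs(block[k]) >= threshold:
            taps.append((start + k, block[k]))
    return taps

def tapped_delay_line(x, taps):
    """Sum delayed and scaled copies of x, one per extracted reflection."""
    y = [0.0] * len(x)
    for delay, gain in taps:
        for n in range(delay, len(x)):
            y[n] += gain * x[n - delay]
    return y

FS = 44100
h = [0.0] * 200
h[0], h[5], h[30], h[100], h[150] = 1.0, 0.5, -0.3, 0.05, 0.2
taps = extract_early_reflections(h, FS)
synth = tapped_delay_line([1.0] + [0.0] * 199, taps)
```

In this example the echo at sample 5 is discarded because it shares a window with the stronger direct sound, and the echo at sample 100 falls below the threshold, so three taps remain.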
Finally, a spectral correction filter is applied to the output of the tapped delay line. The
coefficients are calculated in the same way as described in chapter 4.4, except that the EDRs
are evaluated at time t=0.
Fig. 4.5.1: Comparison between early reflections of a concert hall impulse response (top) and
their synthesized version (bottom).
Subjective listening tests on headphones and speakers have shown that the approach described
above provides very accurate results for a large variety of impulse responses. In some cases
the synthesized version sounds slightly brighter, while still retaining the character and spatial
impression of its reference.
4.6 Summary of the Algorithm
As discussed above, the structure of the algorithm can be separated into three parts.
Firstly, the original impulse response is analysed offline. The transition time t=trunc
between the early reflections and the reverb tail is determined by evaluating the Echo Density
Profile. The Energy Decay Relief is used to estimate the frequency-dependent reverberation
time and the total energy at the time instants t=trunc and t=0. These parameters are necessary
to calculate the coefficients of the absorbent filters and the spectral correction filters.
Additionally, the Early Reflections Analysis block provides the delay times and gains of the
most prominent early reflections.
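As a rough illustration of how a reverberation time can be read from an energy decay, the sketch below uses Schroeder's backward integration on the broadband impulse response. The Energy Decay Relief applies the same idea per frequency band. The function names and the evaluation range of -5 dB to -35 dB are assumptions chosen for this example, not values taken from the thesis.

```python
import math

def energy_decay_curve_db(ir):
    """Schroeder backward integration, normalized to 0 dB at t = 0."""
    edc = [0.0] * len(ir)
    acc = 0.0
    for n in range(len(ir) - 1, -1, -1):   # accumulate energy from the end
        acc += ir[n] * ir[n]
        edc[n] = acc
    total = edc[0]
    return [10.0 * math.log10(e / total) if e > 0.0 else -math.inf for e in edc]

def estimate_t60(ir, fs, upper_db=-5.0, lower_db=-35.0):
    """Extrapolate the decay between upper_db and lower_db to 60 dB."""
    edc_db = energy_decay_curve_db(ir)
    n_upper = next(n for n, v in enumerate(edc_db) if v <= upper_db)
    n_lower = next(n for n, v in enumerate(edc_db) if v <= lower_db)
    slope = (lower_db - upper_db) / (n_lower - n_upper)  # dB per sample, negative
    return -60.0 / slope / fs
```

For an ideal exponential decay the estimate does not depend on the chosen evaluation range; for measured responses the range usually has to stay above the noise floor.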
The second part creates the early part of the reverberation. This is achieved by synthesizing
the first reflections with a tapped delay line and applying a spectral correction filter to its
output. Finally, the RMS level is adjusted so that it equals the level of the original early
reflections.
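The tapped delay line and the RMS matching step can be sketched as follows. This is an offline, non-optimized illustration with hypothetical function names; the spectral correction filter is omitted, and a real-time version would use a circular buffer instead of processing whole lists.

```python
import math

def tapped_delay_line(x, taps):
    """Sum delayed, scaled copies of x; taps are (delay_in_samples, gain) pairs."""
    max_delay = max(delay for delay, _ in taps)
    y = [0.0] * (len(x) + max_delay)
    for delay, gain in taps:
        for n, sample in enumerate(x):
            y[n + delay] += gain * sample
    return y

def rms(signal):
    """Root mean square level of a non-empty signal."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def match_rms(signal, reference):
    """Scale signal so that its RMS level equals that of reference."""
    current = rms(signal)
    if current == 0.0:
        return list(signal)          # nothing to scale for a silent signal
    scale = rms(reference) / current
    return [scale * s for s in signal]
```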
The third part begins with an input diffusion section that increases the echo density. Next,
the Feedback Delay Network provides the late reverberation. Again, a spectral correction filter
is applied to the output of the FDN, and RMS matching is performed to avoid a noticeable
transition between the different parts of the reverberation.
Both the synthesis of the early reflections and the late reverberation can be implemented in
real time. The final output of the algorithm is obtained by the summation of these
components, with possibilities to adjust their individual volume.
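A minimal sketch of this summation, assuming the individual volumes are given in dB and that a level of -inf dB mutes a component (as the ER Gain and Tail Gain controls in chapter 6 suggest). The names `db_to_gain` and `mix_reverb` are illustrative.

```python
import math

def db_to_gain(level_db):
    """Convert a level in dB to a linear gain; -inf dB maps to 0 (mute)."""
    if level_db == -math.inf:
        return 0.0
    return 10.0 ** (level_db / 20.0)

def mix_reverb(early, late, er_gain_db=0.0, tail_gain_db=0.0):
    """Sum early reflections and late reverberation with individual gains."""
    g_er = db_to_gain(er_gain_db)
    g_tail = db_to_gain(tail_gain_db)
    out = [0.0] * max(len(early), len(late))
    for n, s in enumerate(early):
        out[n] += g_er * s
    for n, s in enumerate(late):
        out[n] += g_tail * s
    return out
```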
5 Evaluation and Results
This chapter gives an overview of the results achieved when the proposed algorithm is used to
simulate an arbitrary room impulse response. For this purpose a variety of reference impulse
responses was taken from commercial reverberators (both convolution and algorithmic
reverberators). To investigate the flexibility of the proposed approach, a large number of
very different types of room responses have been modelled. The following sections present the
results for one example from each of the reverberation categories Concert Hall, Cavern, Large
Studio, Small Studio and Algorithmic.
5.1.1 Simulation of a Concert Hall
The reference concert hall impulse response was taken from a commercial convolution reverb.
According to the manual, it was captured in true stereo, has a reverberation time of 3.5
seconds and is well suited to many different input signals, such as orchestral music, vocals
or piano. Figures 5.1.1 to 5.1.3 compare the original and the synthesized version with regard
to the EDR, impulse response, reverberation time and energy.
Fig. 5.1.1.a: Normalized Energy Decay Relief of the
original impulse response (Concert Hall)
Fig. 5.1.1.b: Normalized Energy Decay Relief of the
synthesized impulse response (Concert Hall)
Fig. 5.1.2.a: Original impulse response left and right
channel (Concert Hall)
Fig. 5.1.2.b: Synthesized impulse response left and
right channel (Concert Hall)
Fig. 5.1.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f)
(Concert Hall)
As shown above, the algorithm models the concert hall accurately. The reverberation time,
energy and impulse response look very similar, except for a slightly overestimated
reverberation time between about 150 Hz and 500 Hz. This is due to the limitation of the
second order absorbent filters, which cannot recreate the fluctuation of the decay rates in
this frequency range. The most obvious difference, however, is that the synthesized version
generally sounds a lot smoother than its reference. This is also reflected in the Energy Decay
Reliefs in Fig. 5.1.1: two very distinct resonances at around 250 Hz and 1 kHz can be detected
in the decay of the original impulse response, whereas the decay of the algorithmic version is
a lot straighter. While the synthesized reverberation retains the overall character of its
reference, it does not contain such obvious resonances and generally sounds a bit softer. This
can be explained by the fact that the Feedback Delay Network was intentionally designed to
provide a reverberation that is as neutral and flutter-free as possible.
5.1.2 Simulation of a Cavern
The reference cavern impulse response was taken from the same commercial convolution
reverb as the concert hall. According to its attributes it features a reverberation time of 6.2
seconds. The cavern sounds a lot brighter than the concert hall, and there is a lot of movement
within the decay.
Fig. 5.2.1.a: Normalized Energy Decay Relief of the
original impulse response (Cavern)
Fig. 5.2.1.b: Normalized Energy Decay Relief of the
synthesized impulse response (Cavern)
Fig. 5.2.2.a: Original impulse response left and right
channel (Cavern)
Fig. 5.2.2.b: Synthesized impulse response left and
right channel (Cavern)
Fig. 5.2.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f)
(Cavern)
As shown in Fig. 5.2.3, the algorithm also provides convincing results for longer impulse
responses. Subtle resonances can be perceived in both versions. In the reference they seem to
be slightly less distinct and located at higher frequencies, so the synthesized impulse
response sounds somewhat darker.
5.1.3 Simulation of a Large Studio
The modelled large studio is an orchestral scoring stage originally located in the
United States. The impulse response was again taken from a commercial convolution-based
reverberator and has a reverberation time of 1.8 seconds.
Fig. 5.3.1.a: Normalized Energy Decay Relief of the
original impulse response (Large Studio)
Fig. 5.3.1.b: Normalized Energy Decay Relief of the
synthesized impulse response (Large Studio)
Fig. 5.3.2.a: Original impulse response left and right
channel (Large Studio)
Fig. 5.3.2.b: Synthesized impulse response left and
right channel (Large Studio)
Fig. 5.3.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f)
(Large Studio)
Again, the algorithm was able to simulate the large studio quite well. The original impulse
response shows a bit more energy at very low frequencies; otherwise the reverberation sounds
very similar. Like the concert hall described in 5.1.1, the scoring stage features a
noticeable resonance which builds up over time. This resonance does not exist in the decay of
the emulation, while the overall character of the room is preserved very accurately.
5.1.4 Simulation of a Small Studio
With about 0.7 seconds, the reference small studio features the shortest reverberation time
of all test cases. Rooms of this kind are often chosen as recording rooms for audio
productions that require dry source material, so that the desired amount of reverberation can
be added during the mixing phase. Typically, this applies to drum, guitar or vocal
recordings.
Fig. 5.4.1.a: Normalized Energy Decay Relief of the
original impulse response (Small Studio)
Fig. 5.4.1.b: Normalized Energy Decay Relief of the
synthesized impulse response (Small Studio)
Fig. 5.4.2.a: Original impulse response left and right
channel (Small Studio)
Fig. 5.4.2.b: Synthesized impulse response left and
right channel (Small Studio)
Fig. 5.4.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f)
(Small Studio)
As shown above, the reverberation time is notably too low for frequencies up to 250 Hz. This
case is another good example of the limitations of the second order absorbent filters.
Surprisingly, the audible differences are not too drastic, because the most significant
deviations occur at low frequencies. Due to the overall short reverberation time, the
remaining absolute error is still small enough to provide an adequate emulation.
5.1.5 Simulation of an Algorithmic Reverb
Finally, the proposed approach was tested for its capability to recreate the reverberation
generated by algorithmic reverberation devices. For this purpose a medium hall impulse response
with a decay time of about 2.1 seconds was taken from a widely known hardware unit. This preset
is characterized by a fairly slowly increasing echo density and a smooth reverberation tail.
Fig. 5.5.1.a: Normalized Energy Decay Relief of the
original impulse response (Algorithmic)
Fig. 5.5.1.b: Normalized Energy Decay Relief of the
synthesized impulse response (Algorithmic)
Fig. 5.5.2.a: Original impulse response left and right
channel (Algorithmic)
Fig. 5.5.2.b: Synthesized impulse response left and
right channel (Algorithmic)
Fig. 5.5.3: Comparison of reverberation time Tr(f), total energy EDR(t=0,f) and initial energy EDR(t=0,f)/Tr(f)
(Algorithmic)
When listening closely to both impulse responses, it is noticeable that the echo density
increases faster in the emulated version. However, this can easily be fixed by reducing the
allpass coefficients of the input diffusion section described in chapter 4.2.1. Due to the
allpasses this is also the first case where the proposed algorithm provides a slightly brighter
reverberation than the reference. Moreover, the original response is characterized by a
coloured, distinct decay which is quite different from the sound of the real rooms discussed
earlier. This colouration varies from algorithm to algorithm and depends on the underlying
structure of the individual reverberator. Ultimately this means that the special
characteristics of the reference algorithm cannot be modelled properly with the proposed
approach unless both algorithms are based on the same basic structure (in this case a Feedback
Delay Network). However, the overall sound is again modelled quite accurately, despite the
slight overestimation of the reverberation time from around 1.5 kHz to 8 kHz. Interestingly,
the early reflections are emulated very convincingly. This is probably because artificial
reverberators generally generate fewer reflections than occur naturally in real rooms.
5.2 General Considerations
Besides the case studies mentioned above, the proposed algorithm was also used to emulate a
large number of other room impulse responses. It can be observed that the emulations generally
tend to sound subtly darker than their references, especially if the reference impulse
response was recorded in a real room. In some cases the stereo image is also slightly wider.
However, the most obvious quality of the algorithm is that it does not reproduce conspicuous
resonances, as the Feedback Delay Network is designed to provide reverberation that sounds as
smooth as possible. Nonetheless, the characteristic timbre of the original room is retained
and in general the reverberation times are modelled precisely. Differences occur mainly at low
frequencies, especially in combination with very short reverberation times. In such cases the
accuracy can be improved further by increasing the order of the absorbent filters (at the cost
of computation time). A close look at the diagrams showing the initial energy of the impulse
responses reveals that the Feedback Delay Network reverberator produces slightly too much
energy below 70 Hz. This can be compensated by applying a first order high pass filter to the
output signal. Additionally, the echo density build-up can be modified by adjusting the
allpass coefficients of the input diffusion section.
The algorithm has also been tested with more practical audio signals such as drums, vocals,
orchestral music or piano. Furthermore, it is a common technique to mix the output of the
reverberator with the direct signal, as is the case in a real enclosure. In such cases it is
even more difficult to detect whether the signal was reverberated by convolving it with the
original impulse response or by the proposed algorithm. However, the development of a
dedicated listening test is beyond the scope of this thesis.
6 Perceptual Controls
A variety of perceptual controls has been implemented in order to modify certain parameters
of the reverberation. They are described briefly in the following section.
Generally, a so-called ‘preset’ is created for each impulse response the user chooses to
emulate. The parameters of each preset (such as filter coefficients, reverberation time
etc.) are derived offline from the analysis of the impulse response as described in chapter 4.
General Controls:
Reverberation time: This value describes the desired maximum reverberation time in
seconds. The default value is equal to the maximum reverberation time of the
corresponding impulse response. The desired reverberation time is achieved by
multiplying the numerator coefficients of the absorbent filters by a gain factor.
Predelay: This is the time gap between the direct signal and the onset of the
reverberation in milliseconds. For concert halls, the predelay is usually set to around
25 ms, depending on the desired position of the listener [Ber04].
Mono/Stereo: Defines if the output of the reverberator will be in mono or stereo.
Dry Level: Level of the direct signal in dB.
Wet Level: Level of reverberation in dB.
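How the gain factor for the reverberation time control is derived is not stated in this section. The sketch below shows one standard relation for feedback delay networks in the frequency-independent case: a delay line of m samples needs a gain of 10^(-3m/(fs·T60)) so that the signal decays by 60 dB within T60 seconds. The function names are hypothetical, and treating the control as a ratio of two such gains is an assumption.

```python
def delay_line_gain(delay_samples, t60_seconds, fs=44100):
    """Broadband gain for one delay line that yields a 60 dB decay in t60_seconds."""
    return 10.0 ** (-3.0 * delay_samples / (fs * t60_seconds))

def rt_gain_factor(delay_samples, t60_old, t60_new, fs=44100):
    """Factor that turns the gain for t60_old into the gain for t60_new.

    Assumption: applied as a multiplier on the numerator coefficients of the
    absorbent filter that belongs to this delay line.
    """
    return (delay_line_gain(delay_samples, t60_new, fs)
            / delay_line_gain(delay_samples, t60_old, fs))
```

A factor above one lengthens the decay and a factor below one shortens it. Because each delay line has a different length, each line gets its own factor.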
Controls for Early Reflections:
ER Gain: Sets the volume of the early reflections from -inf to 0 dB. -inf mutes the
early reflections.
ER Slope: This control adjusts the attack of the early reflections from 0% to 100%. The
extreme value of 0% multiplies the gains of the early reflections by an increasing
ramp, and a value of 100% multiplies them by a decreasing ramp. A value of 50%
(default) leaves the gains as they are.
ER Spread: The early reflections can be time-stretched or time-compressed. A value of
0% compresses them to half their length and a value of 100% stretches them to twice
their length. A value of 50% leaves them unaltered.
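The ER Slope and ER Spread controls could be mapped to the taps of the delay line as sketched below. Only the behaviour at 0%, 50% and 100% is taken from the description above. The exponential interpolation of the stretch factor, the linear blend of the ramps and all function names are assumptions made for this example.

```python
def spread_factor(spread_percent):
    """Time-scaling factor: 0 % gives 0.5, 50 % gives 1.0, 100 % gives 2.0."""
    return 2.0 ** ((spread_percent - 50.0) / 50.0)

def apply_er_controls(taps, slope_percent=50.0, spread_percent=50.0):
    """Reshape (delay_in_samples, gain) taps according to ER Slope and ER Spread."""
    if not taps:
        return []
    last = max(delay for delay, _ in taps)
    s = (slope_percent - 50.0) / 50.0        # -1 at 0 %, 0 at 50 %, +1 at 100 %
    stretch = spread_factor(spread_percent)
    shaped = []
    for delay, gain in taps:
        pos = delay / last if last > 0 else 0.0   # position within the ERs, 0..1
        ramp = pos if s < 0.0 else 1.0 - pos      # increasing or decreasing ramp
        weight = (1.0 - abs(s)) + abs(s) * ramp   # blend between flat and ramp
        shaped.append((round(delay * stretch), gain * weight))
    return shaped
```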
Controls for Late Reverberation:
Diffusion 1: Controls the coefficients of the first two allpasses of the input diffusion
section. A value close to one means maximum diffusion, a value of zero corresponds
to no diffusion [Dat97].
Diffusion 2: Controls the coefficients of the last two allpasses of the input diffusion
section. A value close to one means maximum diffusion, a value of zero corresponds
to no diffusion [Dat97].
Depth: Controls the modulation depth from 0% to 100%. A value of 0% turns the
modulation off, and a value of 100% sets the maximum modulation depth of 30 samples.
The default is 50%.
Tail Gain: Sets the volume of the late reverberation from -inf to 0 dB. -inf mutes
the reverb tail.
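To illustrate the effect of the diffusion coefficients, the sketch below implements one common form of the Schroeder allpass filter. The exact structure and sign convention used in the input diffusion section are not shown in this chapter, so the difference equation here is an assumption.

```python
def schroeder_allpass(x, delay, g):
    """Allpass section: y[n] = -g*x[n] + x[n-delay] + g*y[n-delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + x_delayed + g * y_delayed
    return y
```

With g = 0 the section reduces to a pure delay, which corresponds to no diffusion. Larger coefficients spread an impulse into a longer series of decaying echoes.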
Output Filters:
A simple filtering section has been added to the output of the reverberator. It provides
controls for the cut-off frequencies of a first order highpass filter and a first order
lowpass filter.
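A first order output filter pair can be sketched with a one-pole lowpass and its complementary highpass. The coefficient formula a = exp(-2π·fc/fs) is a common approximation and is assumed here; the thesis does not state which first order design is used.

```python
import math

def one_pole_lowpass(x, cutoff_hz, fs=44100):
    """First order lowpass: y[n] = (1 - a) * x[n] + a * y[n-1]."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for sample in x:
        state = (1.0 - a) * sample + a * state
        y.append(state)
    return y

def one_pole_highpass(x, cutoff_hz, fs=44100):
    """Complementary first order highpass: input minus its lowpassed version."""
    low = one_pole_lowpass(x, cutoff_hz, fs)
    return [s - l for s, l in zip(x, low)]
```

A highpass of this kind with a cut-off around 70 Hz is one way to reduce the excess low-frequency energy mentioned in section 5.2.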
7 Conclusion and Outlook
This thesis proposed an algorithm designed to emulate the reverberation of common enclosures
such as rooms, concert halls or cathedrals. While this effect is typically achieved by
convolving a signal with a measured room impulse response, this thesis describes an approach
in which the early reflections are recreated by a tapped delay line and the reverberation tail
is synthesized with a Feedback Delay Network. Common impulse response analysis methods, such
as the Energy Decay Relief and the Echo Density Profile, as well as the most basic building
blocks of reverberators, the comb and allpass filters, have been reviewed. The final algorithm
consists of an offline analysis block and an artificial reverberator designed to work in real
time. The former gathers information about the reference impulse response and then
automatically tunes the necessary parameters of the reverberator in order to emulate the
reverberation effect. Second order absorbent filters proved to be adequate to approximate the
frequency-dependent reverberation time. Additionally, a spectral correction was applied to the
output of the Feedback Delay Network to match the initial spectrum to that of the original
impulse response.
The duration of the early reflections was determined by calculating and evaluating the Echo
Density Profile: the start of the late field is usually indicated by the maximum that occurs
right after the Echo Density Profile has approached a value of one for the first time. The
initial part of the impulse response up to this point is used to extract the delay times and
gains of the early reflections. After a threshold is applied to eliminate the weaker echoes,
only the most prominent reflections are retained. In this way the computation time is
decreased while the characteristic spatial impression of the first part of the reverberation
is retained.
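The echo density measure of Abel and Huang [AH06] counts, within a short window, the fraction of samples that lie outside one standard deviation and normalizes it by the value expected for Gaussian noise, erfc(1/√2). The sketch below uses a rectangular window instead of a weighting window and reads the transition rule as "first local maximum after the profile reaches one"; both choices, as well as the function names, are assumptions.

```python
import math

# Expected fraction of samples outside one standard deviation for Gaussian noise.
ERFC_NORM = math.erfc(1.0 / math.sqrt(2.0))

def echo_density_profile(ir, half_window):
    """Normalized echo density per sample, using a rectangular window."""
    profile = []
    for n in range(len(ir)):
        frame = ir[max(0, n - half_window): n + half_window + 1]
        sigma = math.sqrt(sum(s * s for s in frame) / len(frame))
        outliers = sum(1 for s in frame if abs(s) > sigma)
        profile.append(outliers / len(frame) / ERFC_NORM)
    return profile

def transition_index(profile, target=1.0):
    """Index of the first local maximum at or after the first value >= target."""
    for n, value in enumerate(profile):
        if value >= target:
            while n + 1 < len(profile) and profile[n + 1] > profile[n]:
                n += 1
            return n
    return None   # the profile never reaches the target
```

A sparse response with a few isolated reflections yields values far below one, whereas a dense, noise-like tail yields values near one.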
A large amount of time was spent on tuning the delay times of the FDN. The quality of the
reverberation tail was improved further by introducing modulation and increasing the order of
the network to 16. The decay generally sounds smooth and shows almost no obvious, undesirable
resonances. A few subtle periodicities could be detected only when Dirac impulses were used as
input signals and the output was monitored with high-end headphones and DA converters.
Finally, the algorithm was tested with a variety of impulse responses, including small rooms,
large rooms, halls, caverns and even impulse responses taken from algorithmic reverb devices.
In most cases the emulations were very accurate, with the property that the most prominent
resonances of the original reverberation are suppressed while its overall sonic character is
retained. However, a dedicated listening test still has to be conducted to verify the
subjective impressions.
Further research could address finding an optimal method for determining the coefficients of
the absorbent filters. While the proposed adaptive approach provides adequate results for
frequencies above about 500 Hz, the algorithm sometimes underestimates the reverberation time
at lower frequencies. This issue can probably be fixed by connecting two of those second order
filters in series, where the latter corrects the remaining magnitude error at low
frequencies. This will, however, significantly increase the computation time, as every delay
line requires this filtering process.
Though Feedback Delay Networks have the advantage of being generally well studied, the
approach is in theory not limited to this kind of structure. Other systems, such as the single
loop reverberators proposed in [Dat97], can alternatively be used to create the late
reverberation. As such systems use only a single feedback path, fewer absorbent filters are
necessary to control the reverberation time and consequently more accurate filters can be
used. Ultimately the goal is to find a structure that is computationally as efficient as
possible while still providing good control and high quality reverberation.
At this time the proposed algorithm is implemented in Matlab. A real-time implementation in
the form of an audio plugin is left for future work.
Bibliography
[JC91] J.M. Jot and A. Chaigne, “Digital Delay Networks for Designing Artificial
Reverberators,” 90th AES Convention, 1991.
[Zöl11] U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.
[GW09] G. Graber and W. Weselak, “Raumakustik Skriptum,” Institut für
Breitbandkommunikation, TU Graz, Graz, A, 2009.
[MS07] R. Stewart and D. Murphy, “A Hybrid Artificial Reverberation Algorithm,” Audio
Eng. Society Convention Paper 7021, 2007.
[Jot92] J. M. Jot, “An Analysis/Synthesis Approach To Real Time Artificial Reverberation,”
IEEE Int. Conf. Acoustics, vol. 2, 1992.
[Schroe62] M.R. Schroeder, “Natural Sounding Artificial Reverberation,” J. Audio Eng.
Society, vol. 10, no. 3, 1962.
[Fre00] J. Frenette, “Reducing Artificial Reverberation Requirements Using Time Variant
Feedback Delay Networks,” Thesis, University Of Miami, Miami, USA, 2000. URL:
http://www.music.miami.edu/programs/mue/research/jfrenette/index.html (09.06.2015).
[SL61] M. R. Schroeder and B. F. Logan, “Colorless Artificial Reverberation,” J. Audio Eng.
Society, vol. 9, no. 3, 1961.
[Smi10] J. O. Smith, Physical Audio Signal Processing for Virtual Musical Instruments And
Audio Effects. W3K Publishing. 2010.
[Kut91] H. Kuttruff, Room Acoustics. Elsevier Science Publishing Company (New York),
1991.
[AH06] J. S. Abel and P. Huang, “A Simple, Robust Measure of Reverberation Echo
Density,” Proceedings of the 121st AES Convention, San Francisco, CA, USA, October 2006.
[SP82] J. Stautner and M. Puckette, “Designing Multi-Channel Reverberators,” Computer
Music Journal, vol. 6, no. 1, pp. 52-65, Spring 1982.
[Jot97] J. M. Jot, “Efficient Models for Reverberation and Distance Rendering in Computer
Music and Virtual Reality,” Computer Music Conf., Thessaloniki, GRE, 1997.
[Jot96] J. M. Jot, “Synthesizing Three-Dimensional Sound Scenes in Audio or Multimedia
Production and Interactive Human-Computer Interfaces,” 5th International Conference:
Interface to Real & Virtual Worlds, Montpellier, France, May 1996.
[JCW97] J. M. Jot, L. Cerveau and O. Warusfel, “Analysis and Synthesis of Room
Reverberation Based on a Statistical Time-Frequency Model,” Proceedings of the 103rd AES
Convention, New York, USA, 1997.
[Dat97] J. Dattorro, “Effect Design Part 1: Reverberator and Other Filters,” J. Audio Eng.
Society, vol. 45, 1997.
[Dat97,2] J. Dattorro, “Effect Design Part 2: Delay-Line Modulation and Chorus,” J.
Audio Eng. Soc., vol. 45, 1997.
[CNW14] T. Carpentier, M. Noisternig and O. Warusfel, “Hybrid Reverberation Processor
with Perceptual Control,” Proc. of the 17th Int. Conference on Dig. Audio Effects, Erlangen,
GER, 2014.
[Ber04] L. Beranek, Concert Halls and Opera Houses: Music, Acoustics, and Architecture, 2nd
edition. Springer, 2004.
[Zöl08] U. Zölzer, Digital Audio Signal Processing. Wiley, 2008.
[Ruo12] M. Ruohonen, “Measurement-Based Automatic Parameterization of a Virtual
Acoustic Room Model,” Thesis, School of Electrical Engineering, Aalto University, Espoo,
FI, 2012.
[Tou98] J. Tougaard, “Detection of short pure-tone stimuli in the noctuid ear: what are
temporal integration and integration time all about?,” J. Comp Physiol A, 1998.
Picture Credits
Fig. 1.1: J.M. Jot: Synthesizing Three-Dimensional Sound Scenes in Audio or Multimedia
Production and Interactive Human-Computer Interfaces (1996), URL:
http://articles.ircam.fr/textes/Jot96a/ (accessed: 06.06.2015)
Fig. 2.1: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.
Fig. 2.3.0: M.R. Schroeder, “Natural Sounding Artificial Reverberation,” Bell Telephone
Laboratories, 1962.
Fig. 2.3.5: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.
Fig. 2.5.1: J. S. Abel and P. Huang, “A Simple, Robust Measure of Reverberation Echo
Density,” Proceedings of the 121st AES Convention, San Francisco, CA, USA, October 2006.
Fig. 2.6.1: J.M. Jot and A. Chaigne, “Digital Delay Networks for Designing Artificial
Reverberators,” 90th AES Convention, 1991.
Fig. 2.6.2: M.R. Schroeder, “Natural Sounding Artificial Reverberation,” Bell Telephone
Laboratories, 1962.
Fig. 2.6.6: M.R. Schroeder, “Natural Sounding Artificial Reverberation,” Bell Telephone
Laboratories, 1962.
Fig. 3.1.1: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.
Fig. 3.2.1: J.M. Jot and A. Chaigne, “Digital Delay Networks for Designing Artificial
Reverberators,” Proc. of the 90th AES Convention, 1991.
Fig. 4.2.2: U. Zölzer, Digital Audio Effects – Second Edition. Wiley, 2011.
APPENDIX
Delay Lengths in samples of the FDN in the final implementation (@44100Hz), in sequential
order:
3011 3083 3251 3307 3433 3461 3727 3797 4057 4153 4229 4451 4517
4999 5081 5209
Feedback Matrix (N=16) of the final implementation, construction code for Matlab:
N = 16; A = eye(N);               % order of the FDN, identity matrix
idx = [N, 1:N-1]; A = A(idx,:);   % cyclically shift the rows (permutation matrix)
F = A - (2/N)*ones(N);            % F: feedback matrix of the FDN
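The construction above yields a cyclic permutation matrix minus 2/N times the all-ones matrix. The following Python sketch rebuilds the same matrix and checks numerically that it is orthogonal, which is the usual requirement for a lossless feedback loop before the absorbent filters are inserted. The function names are illustrative and not part of the thesis code.

```python
def feedback_matrix(n=16):
    """Cyclic permutation matrix minus (2/n) times the all-ones matrix."""
    return [[(1.0 if j == (i - 1) % n else 0.0) - 2.0 / n for j in range(n)]
            for i in range(n)]

def is_orthogonal(matrix, tol=1e-12):
    """Check that the rows are orthonormal, i.e. F * F^T equals the identity."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            dot = sum(matrix[i][k] * matrix[j][k] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True
```

Calling `is_orthogonal(feedback_matrix(16))` returns `True`. The matrix is the product of a permutation matrix and a Householder reflection, both of which are orthogonal.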