Articulation in Audio Processing (Feb 2015)
Audio Processing
• Critical part of an “internet radio” station
– Very important for digital streaming to avoid artifacts of compression algorithm
• Must operate on diverse array of files
– Think in grand scale: 80’s music to current day
• Instruments, mastering, music structure, recording technology
• Dynamically adjust to maintain high probability of clarity or articulation of original file structure in spectral and temporal domains while providing a uniform “sound signature” on these diverse files
– Articulation or intelligibility of complex music audio files is preserved and enhanced
• Maintain transient “punch”
• Goal is to make sure “everyone wins” in the mix
– “Muddiness” must be avoided
– Preserve loudness, pitch and timbre
• Purpose
– A consistent “sonic signature” for the station format
7/30/2013 http://www.linkedin.com/in/gpbrefini/
Motivation: The Connected Dash
• Delivering Internet audio to the car is hard
– Carriers’ signals are not ubiquitous ... yet
– Every day more people access high-speed content, particularly during rush hour
– Carriers have limited bandwidth
• Dash applications differ from one auto manufacturer to another
• Early studies show humans want Internet radio in car to work like conventional radio
• NO ONE ARGUES: Internet radio is the FUTURE!
– The automobile is the listening “theater”
Good Audio Processing is Multi-Band Processing!
• Perfected by Mike Dorrough (based on Altec-Lansing design of the 1950’s)
– “Monolith” installed at KRLA, 8 band analog processor
– DAP-310, 3 band analog
– Both had phase equalized pass-bands before combining back to composite
• Minimize phase rotations at band edges
• Linear Group Delay!!!!
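The phase-equalized band split described above can be illustrated with a small sketch. It is not Dorrough's analog design: it swaps in a digital windowed-sinc FIR crossover, and the tap count (101), crossover frequency (200 Hz) and sample rate (44.1 kHz) are assumptions chosen only for illustration. Symmetric taps give every frequency the same delay, so the two bands recombine into an exact delayed copy of the input.

```python
import math

def lowpass_taps(num_taps, cutoff_hz, sample_rate_hz):
    """Windowed-sinc (Hamming) low-pass FIR. Odd num_taps gives symmetric taps,
    which means linear phase and a constant group delay of (num_taps - 1) / 2 samples."""
    if num_taps % 2 == 0:
        raise ValueError("use an odd tap count so the delay is a whole number of samples")
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [t / dc_gain for t in taps]  # unity gain at DC

def complementary_highpass(low_taps):
    """High band = delayed unit impulse minus the low band, so low + high is a pure delay."""
    mid = (len(low_taps) - 1) // 2
    return [(1.0 if n == mid else 0.0) - t for n, t in enumerate(low_taps)]

def fir_filter(taps, signal):
    """Direct-form FIR convolution; output has the same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * signal[n - k]
        out.append(acc)
    return out

SAMPLE_RATE = 44100  # assumed sample rate
low = lowpass_taps(101, 200.0, SAMPLE_RATE)  # assumed crossover at 200 Hz
high = complementary_highpass(low)
group_delay_samples = (len(low) - 1) // 2

# Test signal: a single click (a transient) followed by a 1 kHz tone.
signal = [0.0] * 300
signal[10] = 1.0
for n in range(100, 300):
    signal[n] = 0.5 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)

low_band = fir_filter(low, signal)
high_band = fir_filter(high, signal)
recombined = [a + b for a, b in zip(low_band, high_band)]
```

With no per-band processing applied, `recombined` matches `signal` shifted by `group_delay_samples`, so the click keeps its shape. A minimum-phase crossover would delay different frequencies by different amounts around the band edge and smear it.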
“Process for the Stream”
• Streams use “lossy” data compression such as:
– MPEG, Real Audio, Microsoft's MSV2 codec
– A linear 44.1 kHz stereo audio stream runs at ~1.4 Mb/s
– At 128 kb/s the MPEG Layer 3 compression ratio is approximately 11:1
– At 256 kb/s the MPEG Layer 3 compression ratio is about 6:1
• The critical area of these perceptual coding schemes is the high-frequency area
– Maintain consistent amplitude near FS for the codec
– Keep the upper spectrum free from clipping distortion or excessive high-frequency processing
– Consistent spectral balance over a wide range of material is a must!
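The compression ratios quoted above follow from simple arithmetic. The sketch below assumes 16-bit samples (the slide does not state the word length), which puts the linear 44.1 kHz stereo rate at about 1.41 Mb/s; the function names are illustrative only.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed linear PCM in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def compression_ratio(pcm_bps, stream_bps):
    """How many times smaller the coded stream is than the linear source."""
    return pcm_bps / stream_bps

# Assumption: 16-bit samples, as on an audio CD.
linear_rate = pcm_bit_rate(44100, 16, 2)               # 1,411,200 b/s
ratio_128 = compression_ratio(linear_rate, 128_000)    # about 11:1
ratio_256 = compression_ratio(linear_rate, 256_000)    # about 5.5:1

print(f"linear PCM: {linear_rate / 1e6:.2f} Mb/s")
print(f"128 kb/s -> {ratio_128:.1f}:1, 256 kb/s -> {ratio_256:.1f}:1")
```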
Codec Magic: Masking!
• Codecs remove redundant information that humans will not perceive as missing
– Audio spectrum is split into 500 bands
– Algorithm models the human ear
• The codec dynamically computes a “best frequency-domain fit” in which certain signals present can be removed
• The codec also performs “level masking”, taking advantage of how human hearing focuses on what’s going on in the foreground
• Typically only 20% of the original audio file needs to be transmitted!
Avoiding “watery sound” of Internet Radio
• Coders do not like hard-limited audio: harmonics get squirreled into the pass-band that the algorithm cannot model
• RMS is more important than peak for all waveforms
– It is a measure of energy over time
– Normalize FS to RMS (can’t exceed 0 dB FS peak)
• Peak-to-RMS ratio is critical
• Contemporary Hit Music format uses processing to make it more exciting
– As in movie production: frame by frame, the image is color corrected and exposure corrected
• The better we understand the transmission system and its technical challenges, and the more we can minimize or hide sonic problems, the better we sound!
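The RMS and peak-to-RMS points above can be made concrete with a short sketch. It assumes floating-point samples where full scale is ±1.0 and a non-silent input; the function names and the target levels are illustrative, not taken from the station's processors.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (full scale = 1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_dbfs(samples):
    """RMS level in dBFS: a measure of energy over time."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB."""
    return peak_dbfs(samples) - rms_dbfs(samples)

def gain_for_target_rms(samples, target_rms_dbfs):
    """Gain in dB that moves RMS toward the target, capped so the peak stays at or below 0 dBFS."""
    wanted = target_rms_dbfs - rms_dbfs(samples)
    headroom = -peak_dbfs(samples)
    return min(wanted, headroom)

# Ten whole periods of a 1 kHz sine at half scale, sampled at 48 kHz.
sine = [0.5 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]
# The same sine hard-limited at quarter scale.
clipped = [max(-0.25, min(0.25, s)) for s in sine]

sine_crest = crest_factor_db(sine)        # about 3 dB for a pure sine
clipped_crest = crest_factor_db(clipped)  # lower: hard limiting flattens the peaks
```

Hard limiting lowers the peak-to-RMS ratio, which is the condition the slide warns coders handle poorly. `gain_for_target_rms(sine, -1.0)` returns only the roughly 6 dB of available headroom, because the requested RMS level would push the peak above 0 dBFS.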
http://schedule.sxsw.com/2011/events/event_MP7661 http://www.digido.com
Articulation Processing Example
• Process to bring the kick drum beater slaps forward
– Use linear phase to keep the transient rise/fall times steep
• Bring near-infrasonics forward during transients in the mid-range (2.1 – 6.4 kHz)
– This gives the audio a subtle “thump”
Example:
• Modern Hit Music Station format
– Today’s Hits, the 2K’s, the 90’s and the 80’s
• Processing Challenges
– Modern music has
• very limited dynamic range
• large bottom
• digitally corrected vocals
– 80’s music has
• larger dynamic range
• more traditional instruments, less synthesized
• SPARS code was highly likely AAD
• Need processing that makes for a consistent air sound
The Nation’s Hit Music Station!
Radio XL5: live stream and website (linked from the original slide)
The Audio Chain
• Mild multiband processing with impact/thump enhancements
– Articulation processing
• Second-stage multi-band processing
– More bands
– Clip/Bass distortion correction
– Mild stereo enhancement
• Articulation processing
• Two-band DSP limit/compression
• Analog “fast” compressor
• High-end Soundcards
BreakAway Proc
StereoTool Proc
Behringer Digital Proc
Alesis Analog Proc
“Proprietary System” Articulation Proc
Internet “Air Sound” vs FM “Air Sound”
• Internet Radio
– Flat audio processing throughout the air chain
• FM Radio
– Requires pre-emphasis: 17 dB gain at 15 kHz for 75 µs (US radio)!
• Most modern music is highly clipped/limited
– FM pre-emphasis really increases distortion
– Internet audio is flat, and it is easy to use de-clipping algorithms
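The 17 dB figure can be checked with a short calculation. The sketch assumes the usual first-order pre-emphasis response H(f) = 1 + j·2π·f·τ with τ = 75 µs; the function name is illustrative.

```python
import math

def preemphasis_gain_db(freq_hz, time_constant_s=75e-6):
    """Gain in dB of a first-order pre-emphasis network, |1 + j*2*pi*f*tau|."""
    wt = 2 * math.pi * freq_hz * time_constant_s
    return 10 * math.log10(1 + wt * wt)

gain_15k = preemphasis_gain_db(15000)      # about 17 dB of boost at 15 kHz
corner_hz = 1 / (2 * math.pi * 75e-6)      # +3 dB corner, about 2.1 kHz

for f in (1000, 5000, 10000, 15000):
    print(f"{f:>6} Hz: +{preemphasis_gain_db(f):.1f} dB")
```

Heavily clipped material already carries strong high-frequency harmonics, so a boost of this size ahead of the FM limiter makes them worse. A flat Internet chain avoids that step.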
Summary
• Playback using high quality sources
• Multi-stage, multi-band processors
– Less is more
– Phase & bass correct
– Declipper function is important
– Goal is for spectral balance
• Apply multiple articulation/transient “punch” processing at front and back end of chain
– Avoid anything that does not have linear group delay
– Preserve original transient information (spectral components)
• Minimal analog processing if possible
• Use high-end sound cards
– A/D and D/A low-jitter clocks, preferably locked to a common source