Windows audio architecture
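[Diagram: layered architecture. User mode: WinMM application and DirectSound application, on top of WinMM.DLL and DSound.DLL. Kernel mode: SysAudio.SYS, Kmixer.SYS, and a device driver container holding the USB device driver, IEEE 1394 device driver, PCI card driver, and ISA card driver. Legend: Windows components; components supplied by hardware vendor.]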
Windows Driver Model
– supported by Win 98, Win ME, Win 2K and Win XP
– a single audio driver works for multiple Windows versions
APIs
– DirectSound
– WinMM
Kernel streaming
– multiple audio streams can be played at the same time
– SysAudio.SYS decides the optimal audio format and sample rate conversion
– Kmixer.SYS performs the actual conversion
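To make the idea of playing several streams at once concrete, here is a minimal sketch of mixing two 16-bit PCM streams by summing samples and clamping the result. It only illustrates the general technique; the function name mixStreams is hypothetical, and nothing here is taken from the actual Kmixer.SYS implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: mix two 16-bit PCM streams sample by sample.
// The sum is computed in 32 bits and clamped to the int16 range so that
// loud passages saturate instead of wrapping around.
std::vector<int16_t> mixStreams(const std::vector<int16_t>& a,
                                const std::vector<int16_t>& b) {
    std::vector<int16_t> out(std::max(a.size(), b.size()));
    for (std::size_t i = 0; i < out.size(); ++i) {
        int32_t sum = 0;
        if (i < a.size()) sum += a[i];  // a shorter stream contributes silence
        if (i < b.size()) sum += b[i];
        sum = std::max<int32_t>(-32768, std::min<int32_t>(32767, sum));
        out[i] = static_cast<int16_t>(sum);
    }
    return out;
}
```

Both streams are assumed to already share one sample rate and channel layout; bringing them to a common format is the conversion step mentioned above.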
WinMM API
Simple, but
– high latency
– inability to take advantage of hardware acceleration
– no easy way to implement features, e.g. 3-D positioning, effect processing
Play audio
– waveOutOpen(…) - open the output audio device
– waveOutWrite(…) - write the waveform audio data
– waveOutClose(…) - close the output audio device
– need to use callback or polling to check the result
Not very interesting for real-time applications
DirectSound API - overview
Audio component of DirectX package
– low latency
– use hardware acceleration
– direct access to sound device
– support capturing sound
Two programming interfaces
– COM (Component Object Model) in C++
– .NET in C++, C#, Visual Basic, etc.
Important objects
– secondary buffers: write/read audio data
– buffer cursors: point to current captured/played audio data
– buffer notifications: send events when buffer cursors reach a position
DirectSound API - COM interfaces
IDirectSound8
– CreateSoundBuffer(descriptor, bufferPointer, …)
  create a sound buffer object to manage audio samples
  fields of descriptor
  – buffer size
  – audio format: commonly 16-bit linear PCM
  – buffer features
– SetCooperativeLevel(windowHandle, level)
  set the application's priority level for the sound device
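The buffer size in the descriptor follows from the audio format and the amount of audio the buffer should hold. The sketch below shows that arithmetic for linear PCM. The PcmFormat struct and both function names are hypothetical stand-ins for illustration, not DirectSound types.

```cpp
#include <cstdint>

// Hypothetical description of a linear PCM format.
struct PcmFormat {
    uint32_t sampleRate;     // samples per second, per channel
    uint16_t channels;       // 1 = mono, 2 = stereo
    uint16_t bitsPerSample;  // commonly 16
};

// Bytes occupied by one sample frame (one sample for every channel).
uint32_t blockAlign(const PcmFormat& f) {
    return static_cast<uint32_t>(f.channels) * f.bitsPerSample / 8;
}

// Bytes needed to hold `ms` milliseconds of audio in format `f`.
uint32_t bufferBytes(const PcmFormat& f, uint32_t ms) {
    uint64_t frames = static_cast<uint64_t>(f.sampleRate) * ms / 1000;
    return static_cast<uint32_t>(frames * blockAlign(f));
}
```

For example, 20 ms of 8 kHz mono 16-bit audio is 160 frames of 2 bytes each, so bufferBytes returns 320.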
DirectSound API - COM interfaces
IDirectSoundBuffer8
– Lock(offset, size, addr1, size1, addr2, size2, flag)
  ready all or part of the buffer for a data write and return pointers to which data can be written
– Play(reserved, priority, flags)
  cause the sound buffer to play, starting from the play cursor
– Unlock(addr1, size1, addr2, size2)
  release a locked sound buffer
– Stop()
  cause the sound buffer to stop playing
DirectSound API - COM interfaces
IDirectSoundNotify8
– SetNotificationPositions(NumberOfNotifyStructure, ArrayofNotifyStructure)
  set the notification positions; during playback, whenever the play cursor reaches one of the specified offsets, the associated event is signaled
  fields of NotifyStructure
  – buffer offset
  – notify event
Sound capturing is similar
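One common way to choose notification offsets is to split the buffer into equal segments and place one notification at the start of each. The sketch below computes such offsets and the segment that is safe to refill when a notification fires. Both function names are hypothetical, and the equal-segment layout is an assumption rather than something DirectSound requires.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: byte offsets at the start of each of `segments`
// equal parts of a buffer of `bufferBytes` bytes.
std::vector<uint32_t> notifyOffsets(uint32_t bufferBytes, uint32_t segments) {
    std::vector<uint32_t> offsets;
    if (segments == 0) return offsets;
    uint32_t segmentBytes = bufferBytes / segments;
    for (uint32_t k = 0; k < segments; ++k) {
        offsets.push_back(k * segmentBytes);
    }
    return offsets;
}

// When the play cursor reaches the start of segment k, the previous segment
// (wrapping around at the start of the buffer) has finished playing and can
// be refilled.
uint32_t segmentToRefill(uint32_t k, uint32_t segments) {
    return (k + segments - 1) % segments;
}
```

With a 1280-byte buffer and four segments, the offsets are 0, 320, 640, and 960. The notification at offset 0 means the last segment has just played.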
DirectSound API - code example
1. Streaming audio in an event-driven thread
while (true) {
    DWORD r = WaitForSingleObject(event, INFINITE);
    // receives notification of refilling buffer
    if (r == WAIT_OBJECT_0) {
        Buffer.Lock(offset, size, &addr1, &size1, &addr2, &size2, 0);
        // copy audio to buffer addresses returned by DirectSound
        // could be two addresses because of buffer wrap-around
        memcpy(addr1, audio, size1);
        if (size2 != 0) {
            // left: the audio data remaining after the first size1 bytes
            memcpy(addr2, left, size2);
        }
        Buffer.Unlock(addr1, size1, addr2, size2);
    }
} // while
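The two address/size pairs come from treating the sound buffer as a ring. The portable sketch below imitates that behavior on a plain byte vector so the wrap-around is easy to see. LockedRegion, lockRegion, and writeWrapped are hypothetical names, and the code assumes a non-empty buffer, offset < buffer size, and size <= buffer size. It is not the DirectSound implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical result of locking part of a ring buffer.
struct LockedRegion {
    uint8_t* addr1;      // first part, starting at the requested offset
    std::size_t size1;
    uint8_t* addr2;      // second part at the buffer start, or nullptr
    std::size_t size2;
};

// Split [offset, offset + size) into at most two contiguous parts.
LockedRegion lockRegion(std::vector<uint8_t>& buf, std::size_t offset,
                        std::size_t size) {
    LockedRegion r;
    r.size1 = std::min(size, buf.size() - offset);  // bytes up to buffer end
    r.addr1 = buf.data() + offset;
    r.size2 = size - r.size1;                       // bytes that wrap around
    r.addr2 = r.size2 != 0 ? buf.data() : nullptr;
    return r;
}

// Copy `size` bytes of audio into the ring, handling the wrap-around.
void writeWrapped(std::vector<uint8_t>& buf, std::size_t offset,
                  const uint8_t* audio, std::size_t size) {
    LockedRegion r = lockRegion(buf, offset, size);
    std::memcpy(r.addr1, audio, r.size1);
    if (r.size2 != 0) {
        std::memcpy(r.addr2, audio + r.size1, r.size2);
    }
}
```

Writing 5 bytes at offset 6 of an 8-byte buffer puts 2 bytes at the end and the remaining 3 at the beginning, which is the case where the second memcpy is needed.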
Windows audio architecture revisited
Can we achieve lower latency?
– kernel mixing introduces at least 30 ms of delay
– kernel mixing is not necessary if I’m the only application generating audio streams
– How about interacting with device drivers directly?
DirectKS - the unofficial audio API
[Diagram: the same architecture as before, with one addition in user mode: a DirectKS application on top of DirectKS, alongside the WinMM and DirectSound applications. Legend: Windows components; components supplied by hardware vendor.]
Pros
– very low latency
Cons
– only one application can play sound at one time
– applications need to handle audio format and sample rate conversion
– might not work in future versions of Windows
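Handling sample rate conversion in the application can be as simple as linear interpolation when quality demands are modest. The sketch below shows that approach for mono 16-bit PCM. The function name resampleLinear is hypothetical, and linear interpolation is chosen here only for brevity. It is not what DirectKS or Kmixer.SYS uses, and a production converter would normally add low-pass filtering.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert mono 16-bit PCM from inRate to outRate by
// linear interpolation between neighbouring input samples.
std::vector<int16_t> resampleLinear(const std::vector<int16_t>& in,
                                    uint32_t inRate, uint32_t outRate) {
    if (in.empty() || inRate == 0 || outRate == 0) return {};
    std::size_t outLen =
        static_cast<std::size_t>(static_cast<uint64_t>(in.size()) * outRate / inRate);
    std::vector<int16_t> out(outLen);
    for (std::size_t i = 0; i < outLen; ++i) {
        double pos = static_cast<double>(i) * inRate / outRate;  // input position
        std::size_t i0 = static_cast<std::size_t>(pos);
        std::size_t i1 = std::min(i0 + 1, in.size() - 1);  // clamp at the end
        double frac = pos - static_cast<double>(i0);
        out[i] = static_cast<int16_t>(std::lround(in[i0] + frac * (in[i1] - in[i0])));
    }
    return out;
}
```

Upsampling the ramp 0, 100, 200, 300 from 8000 Hz to 16000 Hz yields eight samples with the midpoints filled in, and the last input sample is repeated at the end.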
The next-generation Windows audio
– None of the current audio interfaces satisfies real-time applications
  transition between user mode and kernel mode for each I/O request
  blocking upon completion of an I/O request
  CPU cycles for copying data
– WaveRT (wave real-time) drivers in the next version of Windows - “Longhorn”
  data flow directly between the client and the audio hardware
Learn more
– URLs
overview
– http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwmt/html/audiooverview.asp
Windows Driver Model (WDM)
– http://www.microsoft.com/whdc/hwdev/tech/audio/wdmaudio.mspx#wdm1
DirectKS
– http://www.microsoft.com/whdc/hwdev/tech/audio/DirectKS.mspx
WaveRT
– http://www.microsoft.com/whdc/hwdev/tech/audio/WaveRTport.mspx
Audio library overview
– Transmit audio over the internet
  use low latency audio APIs
  – DirectSound or DirectKS
  pluggable codecs
  – G.711, GSM, Speex, iLBC
  modular playout buffer
  integrated with rtplib++
– System Requirements
  Windows XP or Windows 2K
  DirectSound 9.x runtime libraries
  Visual C++ runtime libraries
– Initialization:
  session.setUserName(<name>)
  session.setUserInfo(map<code, value>)
  session.setRemoteAddress(<host/ip>, <port>)
  session.setLocalAddress(<host/ip>, <port>)
Audio library architecture
[Diagram: two endpoints connected through the network. Each endpoint consists of: Audio tool GUI, SIP user agent, Encoder, Decoder, Playout buffer, Rtplib++, DirectSound/DirectKS, and Socket.]
Audio library API
– Initialization
  setUserName(name)
  – set the local user name
  setRemoteAddress(host/IP, port)
  – send audio to this address
  setLocalAddress(host/IP, port)
  – receive audio from this address
  setPlayerAudioFormat(audioFormat)
  – play audio in this format
  setCapturerAudioFormat(audioFormat)
  – capture audio in this format
Audio library API
– Initialization (Cont.)
  setEncoder(encoder)
  – use this encoder to encode audio
  – encoder can be created by
    encoder = SpeexEncoder - create a Speex encoder instance
    encoder.setPayloadType(payLoadType) - set RTP payload type
    encoder.setOutputAudioFormat(audioFormat) - set the encoded format
  setDecoder(decoder)
  – … (similar to encoder)
– Start
  startReceiver()/startSender()
  – start to receive/send audio
Audio library delay
             Min     Max      Avg.
DirectSound  68 ms   195 ms   121 ms
DirectKS     42 ms   162 ms   111 ms
– One-way mouth-to-ear delay measurement of audio library using DirectSound and DirectKS
DirectKS shows close to 30 ms improvement over DirectSound in minimum and maximum delay (26 ms and 33 ms); the average improves by 10 ms
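The differences between the two rows of measurements above can be checked with a few lines of arithmetic. The DelayStats struct and the improvement function are hypothetical names used only for this check.

```cpp
// Hypothetical container for one row of the delay measurements, in ms.
struct DelayStats {
    int minMs;
    int maxMs;
    int avgMs;
};

// Per-column reduction in delay when moving from `base` to `other`.
DelayStats improvement(const DelayStats& base, const DelayStats& other) {
    DelayStats d;
    d.minMs = base.minMs - other.minMs;
    d.maxMs = base.maxMs - other.maxMs;
    d.avgMs = base.avgMs - other.avgMs;
    return d;
}
```

Applied to the DirectSound row (68, 195, 121) and the DirectKS row (42, 162, 111), this gives reductions of 26 ms, 33 ms, and 10 ms for the minimum, maximum, and average delay.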