
Windows audio architecture

[Figure: Windows audio stack. User mode: WinMM application (WinMM.DLL) and DirectSound application (DSound.DLL). Kernel mode: SysAudio.SYS and Kmixer.SYS (Windows components); device driver container holding the USB, IEEE 1394, PCI card, and ISA card device drivers (by hardware vendor).]

Windows Driver Model
– supported by Win 98, Win ME, Win 2K and Win XP
– a single audio driver works for multiple Windows versions

APIs
– DirectSound
– WinMM

Kernel streaming
– multiple audio streams can be played at the same time
– SysAudio.SYS decides the optimal audio format and sample rate conversion
– Kmixer.SYS performs the actual conversion
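To make the two kernel-mixing steps concrete, here is a minimal sketch of sample rate conversion followed by mixing. It is an illustration only: the function names are invented for this example, and linear interpolation with saturating addition is an assumed, simple technique, not the algorithm Kmixer.SYS actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Resample 16-bit mono PCM from srcRate to dstRate by linear interpolation.
// Illustrative only; not the algorithm Kmixer.SYS actually uses.
std::vector<int16_t> resampleLinear(const std::vector<int16_t>& in,
                                    unsigned srcRate, unsigned dstRate) {
    if (in.empty() || srcRate == 0 || dstRate == 0) return {};
    const std::size_t outLen = static_cast<std::size_t>(
        static_cast<uint64_t>(in.size()) * dstRate / srcRate);
    std::vector<int16_t> out(outLen);
    for (std::size_t i = 0; i < outLen; ++i) {
        // Position of output sample i on the input time axis.
        const double pos = static_cast<double>(i) * srcRate / dstRate;
        const std::size_t i0 = std::min(static_cast<std::size_t>(pos), in.size() - 1);
        const std::size_t i1 = std::min(i0 + 1, in.size() - 1);
        const double frac = pos - static_cast<double>(i0);
        out[i] = static_cast<int16_t>(in[i0] + frac * (in[i1] - in[i0]));
    }
    return out;
}

// Mix two streams sample by sample, clamping the sum to the 16-bit range.
std::vector<int16_t> mixSaturating(const std::vector<int16_t>& a,
                                   const std::vector<int16_t>& b) {
    std::vector<int16_t> out(std::max(a.size(), b.size()));
    for (std::size_t i = 0; i < out.size(); ++i) {
        const int32_t sa = i < a.size() ? a[i] : 0;
        const int32_t sb = i < b.size() ? b[i] : 0;
        const int32_t sum = std::max<int32_t>(-32768, std::min<int32_t>(32767, sa + sb));
        out[i] = static_cast<int16_t>(sum);
    }
    return out;
}
```

Upsampling {0, 100, 200, 300} from 8 kHz to 16 kHz yields eight samples, with 50 interpolated between the first two inputs. Streams at different rates would first be resampled to a common rate and then mixed.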

WinMM API

Simple, but
– high latency
– inability to take advantage of hardware acceleration
– no easy way to implement features, e.g. 3-D positioning, effect processing

Play audio
– waveOutOpen(…) - open the output audio device
– waveOutWrite(…) - write the waveform audio data
– waveOutClose(…) - close the output audio device
– need to use callback or polling to check the result

Not well suited to real-time applications

DirectSound API - overview

Audio component of DirectX package
– low latency
– use hardware acceleration
– direct access to sound device
– support capturing sound

Two programming interfaces
– COM (Component Object Model) in C++
– .NET in C++, C#, Visual Basic, etc.

Important objects
– secondary buffers: write/read audio data
– buffer cursors: point to current captured/played audio data
– buffer notifications: send events when buffer cursors reach a position

DirectSound API - COM interfaces

IDirectSound8
– CreateSoundBuffer(descriptor, bufferPointer, …)
    create a sound buffer object to manage audio samples
    fields of descriptor
    – buffer size
    – audio format: commonly 16-bit linear PCM
    – buffer features
– SetCooperativeLevel(windowHandle, level)
    set the application's priority level for the sound device
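As a rough illustration of how the descriptor's buffer size follows from the audio format, the sketch below computes byte counts for linear PCM. The `PcmFormat` struct and the helper names are made up for this example; they mirror the block-alignment and bytes-per-second quantities of a wave format description and are not DirectSound types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch; not a DirectSound structure.
struct PcmFormat {
    uint32_t sampleRate;     // frames per second, e.g. 8000 or 44100
    uint32_t channels;       // 1 = mono, 2 = stereo
    uint32_t bitsPerSample;  // 16 for the common 16-bit linear PCM case
};

// Bytes occupied by one frame (one sample for every channel).
uint32_t blockAlign(const PcmFormat& f) {
    assert(f.bitsPerSample % 8 == 0);
    return f.channels * f.bitsPerSample / 8;
}

// Bytes consumed by one second of audio.
uint32_t bytesPerSecond(const PcmFormat& f) {
    return f.sampleRate * blockAlign(f);
}

// Buffer size that holds `ms` milliseconds, always a whole number of frames.
uint32_t bufferBytesForMs(const PcmFormat& f, uint32_t ms) {
    const uint64_t frames = static_cast<uint64_t>(f.sampleRate) * ms / 1000;
    return static_cast<uint32_t>(frames * blockAlign(f));
}
```

For 8 kHz mono 16-bit audio, a 20 ms buffer comes to 320 bytes; for 44.1 kHz stereo 16-bit audio, 100 ms comes to 17640 bytes. Smaller buffers lower latency but leave less time to refill before the play cursor catches up.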


IDirectSoundBuffer8
– Lock(offset, size, addr1, size1, addr2, size2, flag)
    ready all or part of the buffer for a data write and return pointers to which data can be written
– Play(reserved, priority, flags)
    cause the sound buffer to play, starting from the play cursor
– Unlock(addr1, size1, addr2, size2)
    release a locked sound buffer
– Stop()
    cause the sound buffer to stop playing


IDirectSoundNotify8
– SetNotificationPositions(NumberOfNotifyStructure, ArrayofNotifyStructure)
    set the notification positions; during playback, whenever the play cursor reaches one of the specified offsets, the associated event is signaled
    fields of NotifyStructure
    – buffer offset
    – notify event
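One common way to choose the buffer offsets is to split the buffer into equal segments and notify at the end of each, so that a finished segment can be refilled while the next one plays. The sketch below only computes those offsets; `NotifyPoint` is a hypothetical stand-in for the notify structure (the event handle is left out), and the even split is an assumption of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one notify structure: only the buffer offset
// is modelled here; the event to signal is omitted.
struct NotifyPoint {
    uint32_t offset;
};

// Split a buffer of bufferBytes into `segments` equal parts and place one
// notification at the last byte of each part.
std::vector<NotifyPoint> evenNotifyPoints(uint32_t bufferBytes, uint32_t segments) {
    assert(segments > 0 && bufferBytes % segments == 0);
    const uint32_t segBytes = bufferBytes / segments;
    std::vector<NotifyPoint> points;
    for (uint32_t i = 0; i < segments; ++i) {
        points.push_back(NotifyPoint{(i + 1) * segBytes - 1});
    }
    return points;
}

// Given the index of the notification that fired, return the start offset
// of the segment that has just finished playing and can now be refilled.
uint32_t refillOffset(uint32_t firedIndex, uint32_t bufferBytes, uint32_t segments) {
    assert(segments > 0 && firedIndex < segments);
    return firedIndex * (bufferBytes / segments);
}
```

For a 3200-byte buffer split four ways, the notification offsets are 799, 1599, 2399 and 3199; when the third one fires, the segment starting at byte 1600 is free to be rewritten.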

Sound capturing is similar

DirectSound API - code example

1. Streaming audio in an event-driven thread

    // buffer: the IDirectSoundBuffer8 being streamed
    // audio:  byte pointer to the next `size` bytes to play
    while (true) {
        // receives notification of refilling buffer
        DWORD r = WaitForSingleObject(event, INFINITE);
        if (r == WAIT_OBJECT_0) {
            buffer->Lock(offset, size, &addr1, &size1, &addr2, &size2, 0);
            // copy audio to the buffer addresses returned by DirectSound;
            // there could be two addresses because of buffer wrap-around
            memcpy(addr1, audio, size1);
            if (size2 != 0) {
                // the remainder goes to the wrapped-around part
                memcpy(addr2, audio + size1, size2);
            }
            buffer->Unlock(addr1, size1, addr2, size2);
        }
    } // while
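The wrap-around case in the loop above can be tried out without a sound device. The following sketch models the circular buffer with a plain `std::vector` and reproduces the "one or two regions" split that a lock at a given offset produces. `LockedRegions`, `lockRegions` and `writeCircular` are names invented for this example and are not DirectSound calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// The two regions a wrap-around lock yields, expressed as offsets.
struct LockedRegions {
    std::size_t offset1, size1;  // from the requested offset toward the buffer end
    std::size_t offset2, size2;  // wrapped part at the buffer start; size2 == 0 if no wrap
};

// Work out which parts of a circular buffer a write of `size` bytes at
// `offset` touches.
LockedRegions lockRegions(std::size_t bufferBytes, std::size_t offset, std::size_t size) {
    assert(bufferBytes > 0 && offset < bufferBytes && size <= bufferBytes);
    const std::size_t size1 = std::min(size, bufferBytes - offset);
    return LockedRegions{offset, size1, 0, size - size1};
}

// Copy `size` bytes of audio into the circular buffer starting at `offset`.
void writeCircular(std::vector<uint8_t>& buffer, std::size_t offset,
                   const uint8_t* audio, std::size_t size) {
    const LockedRegions r = lockRegions(buffer.size(), offset, size);
    std::memcpy(buffer.data() + r.offset1, audio, r.size1);
    if (r.size2 != 0) {
        // The remainder of the audio continues at the start of the buffer.
        std::memcpy(buffer.data() + r.offset2, audio + r.size1, r.size2);
    }
}
```

Writing 400 bytes at offset 800 into a 1000-byte buffer gives a first region of 200 bytes at offset 800 and a second region of 200 bytes at offset 0, which is why the second copy must start 200 bytes into the source audio.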

Windows audio architecture revisited

Can we achieve lower latency?

– kernel mixing introduces at least 30 ms of delay

– kernel mixing is not necessary if I’m the only application generating audio streams

– How about interacting with device drivers directly?

[Figure: Windows audio stack, repeated. User mode: WinMM application. Kernel mode: SysAudio.SYS and Kmixer.SYS (Windows components); USB, PCI card, and ISA card device drivers (by hardware vendor).]

DirectKS - the unofficial audio API

[Figure: Windows audio stack with DirectKS. User mode: WinMM application, DirectKS application, DirectKS. Kernel mode: SysAudio.SYS and Kmixer.SYS (Windows components); USB, PCI card, and ISA card device drivers (by hardware vendor).]

Pros
– very low latency

Cons
– only one application can play sound at one time
– applications need to handle audio format and sample rate conversion
– might not work in future versions of Windows

The next-generation Windows audio

– None of the current audio interfaces satisfies real-time applications
    transition between user mode and kernel mode for each I/O request
    blocking upon completion of an I/O request
    CPU cycles for copying data
– WaveRT (wave real-time) drivers in the next version of Windows - “Longhorn”
    data flow directly between the client and the audio hardware

Learn more

– URLs
    Overview
    – http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwmt/html/audiooverview.asp
    Windows Driver Model (WDM)
    – http://www.microsoft.com/whdc/hwdev/tech/audio/wdmaudio.mspx#wdm1
    DirectKS
    – http://www.microsoft.com/whdc/hwdev/tech/audio/DirectKS.mspx
    WaveRT
    – http://www.microsoft.com/whdc/hwdev/tech/audio/WaveRTport.mspx

Audio library overview

– Transmit audio over the internet
    use low latency audio APIs
    – DirectSound or DirectKS
    pluggable codecs
    – G.711, GSM, Speex, iLBC
    modular playout buffer
    integrated with rtplib++
– System Requirements
    Windows XP or Windows 2K
    DirectSound 9.x runtime libraries
    Visual C++ runtime libraries
– Initialization:
    session.setUserName(<name>)
    session.setUserInfo(map<code, value>)
    session.setRemoteAddress(<host/ip>, <port>)
    session.setLocalAddress(<host/ip>, <port>)
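To show what "pluggable codecs" could look like in practice, here is a small sketch of an encoder interface with a G.711 mu-law implementation behind it. The `Encoder` base class and its method names are assumptions made for this example; the slides do not show the library's real codec interface. The mu-law routine follows the widely used bias-and-segment formulation for 16-bit input.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical encoder interface for this sketch; the library's real
// interface is not shown in the slides.
class Encoder {
public:
    virtual ~Encoder() {}
    virtual int payloadType() const = 0;
    virtual std::vector<uint8_t> encode(const std::vector<int16_t>& pcm) const = 0;
};

// G.711 mu-law: compresses each 16-bit linear sample to 8 bits.
class MuLawEncoder : public Encoder {
public:
    int payloadType() const override { return 0; }  // static RTP payload type of PCMU

    static uint8_t encodeSample(int16_t pcm) {
        const int kBias = 0x84;
        const int kClip = 32635;
        int sample = pcm;
        const int sign = sample < 0 ? 0x80 : 0x00;
        if (sample < 0) sample = -sample;
        if (sample > kClip) sample = kClip;
        sample += kBias;
        // The segment number is the position of the highest set bit, 7..14.
        int exponent = 7;
        for (int mask = 0x4000; (sample & mask) == 0 && exponent > 0; mask >>= 1) {
            --exponent;
        }
        const int mantissa = (sample >> (exponent + 3)) & 0x0F;
        // Mu-law bytes are transmitted bit-inverted.
        return static_cast<uint8_t>(~(sign | (exponent << 4) | mantissa));
    }

    std::vector<uint8_t> encode(const std::vector<int16_t>& pcm) const override {
        std::vector<uint8_t> out;
        out.reserve(pcm.size());
        for (int16_t s : pcm) out.push_back(encodeSample(s));
        return out;
    }
};
```

Because callers only see `Encoder`, a GSM, Speex or iLBC implementation could be swapped in without touching the sending code. With mu-law, silence (sample value 0) encodes to 0xFF and the output is half the size of the 16-bit input.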

Audio library architecture

[Figure: two endpoints connected through the network. Each endpoint contains: audio tool GUI, SIP user agent, DirectSound/DirectKS, encoder, decoder, playout buffer, rtplib++, socket.]
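The playout buffer in this architecture sits between the network and the decoder and absorbs jitter and reordering. The sketch below is one minimal way such a component might work, assuming a fixed prefill depth and RTP-style 16-bit sequence numbers; the `PlayoutBuffer` class is hypothetical and is not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical fixed-depth playout buffer: reorders packets by sequence
// number and starts playback only after `prefill` packets are queued.
// Simplification: sequence-number wrap-around before playback starts is ignored.
class PlayoutBuffer {
public:
    explicit PlayoutBuffer(std::size_t prefill)
        : prefill_(prefill), started_(false), nextSeq_(0) {}

    // Called by the receiver for every arriving packet.
    void push(uint16_t seq, const std::vector<uint8_t>& payload) {
        if (started_) {
            // Signed distance from the next packet due; negative means too late.
            const int16_t ahead =
                static_cast<int16_t>(static_cast<uint16_t>(seq - nextSeq_));
            if (ahead < 0) return;
        }
        packets_[seq] = payload;
    }

    // Called by the player once per packet interval. Returns false when
    // playback has not started or the due packet is missing; the caller
    // would then play silence or conceal the loss.
    bool pop(std::vector<uint8_t>& out) {
        if (!started_) {
            if (packets_.empty() || packets_.size() < prefill_) return false;
            started_ = true;
            nextSeq_ = packets_.begin()->first;
        }
        std::map<uint16_t, std::vector<uint8_t> >::iterator it = packets_.find(nextSeq_);
        ++nextSeq_;
        if (it == packets_.end()) return false;
        out = it->second;
        packets_.erase(it);
        return true;
    }

private:
    std::size_t prefill_;
    bool started_;
    uint16_t nextSeq_;
    std::map<uint16_t, std::vector<uint8_t> > packets_;
};
```

A larger prefill tolerates more jitter but adds directly to the mouth-to-ear delay, which is the trade-off a modular playout buffer lets an application tune.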

Audio library API

– Initialization
    setUserName(name)
    – set the local user name
    setRemoteAddress(host/IP, port)
    – send audio to this address
    setLocalAddress(host/IP, port)
    – receive audio from this address
    setPlayerAudioFormat(audioFormat)
    – play audio in this format
    setCapturerAudioFormat(audioFormat)
    – capture audio in this format

Audio library API

– Initialization (cont.)
    setEncoder(encoder)
    – use this encoder to encode audio
    – encoder can be created by
        encoder = SpeexEncoder - create a Speex encoder instance
        encoder.setPayloadType(payLoadType) - set RTP payload type
        encoder.setOutputAudioFormat(audioFormat) - set the encoded format
    setDecoder(decoder)
    – … (similar to encoder)
– Start
    startReceiver()/startSender()
    – start to receive/send audio
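The call order implied by the two API slides (configure first, then start) can be sketched with a stand-in class. Everything below is a placeholder: the method names come from the slides, but the parameter types, the `bool` return values and the bodies are assumptions made only so the example compiles. Audio formats and codecs are left out.

```cpp
#include <cassert>
#include <string>

// Stand-in with method names taken from the slides. The bodies only record
// configuration; parameter types and return values are assumptions.
class Session {
public:
    Session() : remotePort_(0), localPort_(0), sending_(false), receiving_(false) {}

    void setUserName(const std::string& name) { userName_ = name; }
    void setRemoteAddress(const std::string& host, int port) {
        remoteHost_ = host;
        remotePort_ = port;
    }
    void setLocalAddress(const std::string& host, int port) {
        localHost_ = host;
        localPort_ = port;
    }

    // In this sketch, starting fails unless the matching address was set.
    bool startSender() {
        sending_ = !remoteHost_.empty() && remotePort_ > 0;
        return sending_;
    }
    bool startReceiver() {
        receiving_ = !localHost_.empty() && localPort_ > 0;
        return receiving_;
    }

    bool sending() const { return sending_; }
    bool receiving() const { return receiving_; }

private:
    std::string userName_, remoteHost_, localHost_;
    int remotePort_, localPort_;
    bool sending_, receiving_;
};

// Typical sequence: initialization calls first, then start both directions.
// The user name, host names and ports are example values.
bool setUpCall(Session& session) {
    session.setUserName("alice");
    session.setRemoteAddress("192.0.2.10", 5004);
    session.setLocalAddress("0.0.0.0", 5004);
    return session.startReceiver() && session.startSender();
}
```

In the real library the player and capturer formats and the encoder and decoder would also be set before the start calls.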

Audio library delay

               Min      Max      Avg.
DirectSound    68 ms    195 ms   121 ms
DirectKS       42 ms    162 ms   111 ms

– One-way mouth-to-ear delay measurement of the audio library using DirectSound and DirectKS
    DirectKS shows close to 30 ms improvement over DirectSound in minimum and maximum delay (26 ms and 33 ms); the average improves by 10 ms
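The size of the improvement can be checked column by column from the measured figures. The short sketch below does that subtraction; the struct and function names are invented for this example, and the numbers are the ones reported in the table.

```cpp
#include <cassert>

// One row of the delay table, in milliseconds.
struct DelayMs {
    int min;
    int max;
    int avg;
};

// Per-column improvement of `b` over `a` (positive means `b` has less delay).
DelayMs improvement(const DelayMs& a, const DelayMs& b) {
    return DelayMs{a.min - b.min, a.max - b.max, a.avg - b.avg};
}

// Measured one-way mouth-to-ear delays from the table.
const DelayMs kDirectSound = {68, 195, 121};
const DelayMs kDirectKS = {42, 162, 111};
```

Calling `improvement(kDirectSound, kDirectKS)` gives 26 ms for the minimum, 33 ms for the maximum and 10 ms for the average, so the "close to 30 ms" figure describes the best and worst cases rather than the average.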
