Team Jarvis Poster


Real Time Voice Actuation System
Pragya Agrawal, Dominic Calabrese, David Martel, Nathan Sawicki

Project Description
The goal of our project is to design and build a real-time speech recognition system. This project presents many hardware and software challenges for implementation in an embedded environment and is a perfect project for EECS 452. Using a source-filter model of speech and a support vector machine classifier trained on a prerecorded library of vocal commands, our team was able to recognize the commands “one,” “two,” “three,” and “four” with high accuracy. The finished system executes in real time and has GPIO-based actuation to demonstrate functional voice recognition.

Hardware

TMS320C5515 eZdsp™ USB Stick Development Tool [1]
- Vocal input and feature extraction are handled on this fixed-point DSP chip
- High-speed autocorrelation function ideal for feature extraction

Raspberry Pi Model B+ [3]
- 700 MHz Broadcom SoC with a floating-point unit
- 40-pin GPIO
- Ideal for running the classification algorithm in real time

SparkFun Bluetooth Modem – Bluesmirf [2]
- Pairs the C5515 and the Raspberry Pi
- Pairs with other Bluesmirf modules with relative ease
- 115200 baud rate, capable of real-time transmission and reception

Raspberry Pi: Classification & Actuation
- Developed a preliminary algorithm using MATLAB and desktop computers to mimic the functions of the C5515 and Raspberry Pi
- Used MATLAB Coder to convert the core algorithm into C code
- Implemented a wrapper around the algorithm that handles UART communication, GPIO toggling (sketched below), and OpenVG image generation and data plotting
- Implemented LibSVM for classification
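
A minimal sketch of the GPIO-based actuation on the Raspberry Pi, assuming the common sysfs GPIO interface and an arbitrary pin; the poster does not say which pin or interface the wrapper actually uses:

    #include <stdio.h>
    #include <unistd.h>

    /* Pin choice is an assumption; the poster does not name the actuation pin. */
    #define GPIO_PIN "17"

    static void write_str(const char *path, const char *value)
    {
        FILE *f = fopen(path, "w");
        if (f) { fputs(value, f); fclose(f); }
    }

    /* Export the pin, configure it as an output, and pulse it when a command is recognized. */
    void actuate_pulse(void)
    {
        write_str("/sys/class/gpio/export", GPIO_PIN);
        write_str("/sys/class/gpio/gpio" GPIO_PIN "/direction", "out");
        write_str("/sys/class/gpio/gpio" GPIO_PIN "/value", "1");
        usleep(500000);                                   /* hold high for 0.5 s */
        write_str("/sys/class/gpio/gpio" GPIO_PIN "/value", "0");
    }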

C5515: Detection & Feature Extraction
- Implements the threshold shown in the figure on the right to detect commands
- The DSP library provides simple functions useful for real-time DSP on the C5515
- Computes the 16-bit autocorrelation of each speech command and transmits this data over Bluetooth UART (sketched below)
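
A fixed-point sketch of this detection and feature-extraction step; the frame length, energy threshold, scaling shifts, and number of lags are illustrative assumptions rather than the values used on the C5515:

    #include <stdint.h>

    #define FRAME_LEN      256         /* samples per frame (assumed) */
    #define NUM_LAGS       11          /* lag 0 plus 10 shifts for ~10 filter coefficients */
    #define ENERGY_THRESH  5000000L    /* empirical detection threshold (assumed) */

    /* Return nonzero when the frame energy crosses the detection threshold. */
    static int detect_command(const int16_t *frame)
    {
        int32_t energy = 0;
        int i;
        for (i = 0; i < FRAME_LEN; i++)
            energy += ((int32_t)frame[i] * frame[i]) >> 8;   /* pre-scale to keep the sum in 32 bits */
        return energy > ENERGY_THRESH;
    }

    /* 16-bit autocorrelation for the first NUM_LAGS lags; these are the values
     * the system transmits to the Raspberry Pi over Bluetooth UART. */
    static void autocorr16(const int16_t *frame, int16_t *r)
    {
        int32_t acc[NUM_LAGS];
        int lag, n;
        for (lag = 0; lag < NUM_LAGS; lag++) {
            acc[lag] = 0;
            for (n = lag; n < FRAME_LEN; n++)
                acc[lag] += ((int32_t)frame[n] * frame[n - lag]) >> 8;
        }
        /* Normalize by lag 0 into Q14 so the features are independent of input level. */
        for (lag = 0; lag < NUM_LAGS; lag++)
            r[lag] = (acc[0] > 0) ? (int16_t)(((int64_t)acc[lag] << 14) / acc[0]) : 0;
    }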

Support Vector Machine Learning
- A support vector machine (SVM) is a supervised learning algorithm used for binary classification and regression analysis.
- We use a multi-class SVM, an extension of the basic SVM, for real-time classification of spoken words into one of four classes: “ONE”, “TWO”, “THREE”, “FOUR”.
- Our algorithm uses the one-against-one method to construct k(k − 1)/2 classifiers (k = number of classes, so six here), one SVM for each pair of classes. Each SVM is trained on data from two classes to distinguish the samples of one class from the other. An unknown pattern is classified by maximum voting, where each SVM votes for one class.
- We use the LIBSVM tool, an integrated package for multi-class support vector classification, with a radial basis function kernel (classification call sketched below).
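
A minimal sketch of the classification call using LIBSVM’s C API on the Raspberry Pi; the model filename, feature count, and label encoding are assumptions, and the model is presumed to have been trained offline on the prerecorded command library:

    #include "svm.h"   /* LIBSVM header */

    #define NUM_FEATURES 10   /* ~10 autocorrelation-derived features (assumed) */

    /* Return the predicted class label, or -1 if the model cannot be loaded. */
    int classify(const double *features)
    {
        struct svm_model *model = svm_load_model("commands.model");  /* filename assumed */
        if (!model) return -1;

        /* LIBSVM uses 1-based sparse indices terminated by index = -1. */
        struct svm_node x[NUM_FEATURES + 1];
        int i;
        for (i = 0; i < NUM_FEATURES; i++) {
            x[i].index = i + 1;
            x[i].value = features[i];
        }
        x[NUM_FEATURES].index = -1;

        /* One-against-one voting over the k(k-1)/2 = 6 pairwise SVMs happens inside svm_predict. */
        int label = (int)svm_predict(model, x);
        svm_free_and_destroy_model(&model);
        return label;   /* e.g. 1..4 for "ONE".."FOUR", matching the training labels (assumed) */
    }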

Source-Filter Model of Speech
- Word characterization should be independent of the volume, pitch, and duration of the word
- Simplify the speech-production model to:
  1. Source – vibration of the vocal cords
  2. Filter – the vocal tract (i.e., positioning of the tongue, mouth, etc.)
- The filter most affects the sound of a word: we position the vocal tract differently to say different words
- Accurately modeling the filter provides a basis for word recognition [4]

Broad sweeps of the spectrum (formants) result from the filter configuration; rapidly varying peaks come from source resonances.
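
In its standard linear-prediction form (see [5]; not spelled out on the poster), the filter in this model is all-pole, and its coefficients satisfy normal equations built from the autocorrelation values R(i):

    H(z) = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}},
    \qquad
    \sum_{k=1}^{p} a_k \, R(|i - k|) = R(i), \quad i = 1, \dots, p

This is why the first p autocorrelation lags are enough to estimate roughly p filter coefficients, as described in the All-Pole Filter Coefficients section.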

All-Pole Filter Coefficients
- The first n filter coefficients can be roughly calculated from the first n time shifts of the autocorrelation of the signal
- The autocorrelation is computed by correlating the signal with time-shifted copies of itself
- The Levinson-Durbin recursion allows quick computation because the matrix is Toeplitz and symmetric (sketched below)
- We want to capture the spectral envelope, so ~10 filter coefficients are used [5]; too many coefficients lead to over-fitting of the curve
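
A floating-point sketch of the Levinson-Durbin recursion that solves those Toeplitz, symmetric equations; the order and the use of double precision are simplifications relative to the project’s generated code:

    /* Solve the order-p normal equations for LPC coefficients a[1..p], given
     * autocorrelation values r[0..p] (a[0] is set to 1 by convention).
     * Because the system is Toeplitz and symmetric, the recursion runs in O(p^2). */
    void levinson_durbin(const double *r, double *a, int p)
    {
        double err = r[0];                       /* prediction error at order 0 */
        int i, j;
        for (i = 0; i <= p; i++) a[i] = 0.0;
        a[0] = 1.0;

        for (i = 1; i <= p; i++) {
            double k, tmp;
            if (err <= 0.0) break;               /* degenerate (e.g. silent) frame */

            /* Reflection coefficient for order i. */
            k = r[i];
            for (j = 1; j < i; j++)
                k -= a[j] * r[i - j];
            k /= err;

            /* In-place symmetric update of the lower-order coefficients. */
            a[i] = k;
            for (j = 1; j <= i / 2; j++) {
                tmp = a[j];
                a[j] -= k * a[i - j];
                if (j != i - j)
                    a[i - j] -= k * tmp;
            }

            err *= (1.0 - k * k);                /* error shrinks at every order */
        }
    }

With p set to about 10, the resulting coefficients capture the spectral envelope without over-fitting, as noted above.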

References
[1] http://www.spectrumdigital.com/product_info.php?cPath=31&products_id=238
[2] https://www.sparkfun.com/products/12577
[3] http://www.adafruit.com/product/1914
[4] Dutoit, T., Moreau, N., and Kroon, P., “How is speech processed in a cell phone conversation?”, 2009.
[5] Rabiner, L., and Schafer, R., Introduction to Digital Speech Processing, 2007.

Bluesmirf: Bluetooth Communication
Bluetooth implementation was a goal for the team. Using pre-configured Bluesmirf devices, the C5515 is able to transmit UART data to the Raspberry Pi.
- Transmit ‘$$$’ over UART to put the Bluesmirf into command mode
- Program the Bluetooth address of the other Bluesmirf module into memory, then transmit a ‘C’ command to pair the devices (sketched below)
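
A sketch of that pairing sequence from a host over a POSIX serial port; the device path and remote address are placeholders, and the SR command used here to store the remote address follows the common Bluesmirf (RN-42) command set, so it should be checked against the module’s documentation:

    #include <fcntl.h>
    #include <string.h>
    #include <termios.h>
    #include <unistd.h>

    /* Serial device path and the remote module's address are assumptions. */
    #define UART_DEV     "/dev/ttyAMA0"
    #define REMOTE_ADDR  "00066600ABCD"

    static void send_cmd(int fd, const char *cmd)
    {
        write(fd, cmd, strlen(cmd));
        sleep(1);   /* give the modem time to acknowledge */
    }

    int pair_bluesmirf(void)
    {
        int fd = open(UART_DEV, O_RDWR | O_NOCTTY);
        if (fd < 0) return -1;

        /* 115200 baud, raw mode, matching the link rate given above. */
        struct termios tio;
        tcgetattr(fd, &tio);
        cfmakeraw(&tio);
        cfsetispeed(&tio, B115200);
        cfsetospeed(&tio, B115200);
        tcsetattr(fd, TCSANOW, &tio);

        send_cmd(fd, "$$$");                    /* enter command mode (no trailing newline) */
        send_cmd(fd, "SR," REMOTE_ADDR "\r");   /* store the remote module's address */
        send_cmd(fd, "C\r");                    /* connect to the stored address */
        return fd;                              /* the connection now carries the UART data stream */
    }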


EECS 452, Digital Signal Processing Design Lab, Fall 2014