Virtual Theremin Design Project Nicholas Dargus, Daniel DeAraujo and Jeremy Gillard
ECE496Y Design Project Course - Final Report
Title: Virtual Theremin using IEEE1394
Project I.D. # 2002105
Prepared by: Nicholas Dargus - [email protected] Daniel DeAraujo - [email protected] Jeremy Gillard - [email protected]
Supervisor: Prof. James MacLean
Section #: 5
Section Coordinator:
Phil Anderson
Date: Friday, April 11, 2003
The Edward S. Rogers Sr. Dept of Electrical and Computer Engineering University of Toronto
Executive Summary
This report outlines the design of the Virtual Theremin project, which makes use of
computer vision techniques. In this project we explore the capabilities of tracking a
Theremin performer in real-time using an IEEE1394-based camera. Development of this
project was done in the C/C++ programming language, under a Linux environment with
IEEE1394 support.
The main goals of our project were: to detect a performer’s hands in real-time, to apply
algorithms to track a performer’s hands in real-time, and to emulate the sound of a physical
Theremin instrument. Although there were several changes in design, the project was
completed as scheduled and was a complete success.
The design project was broken down, accordingly, into three main components: image
acquisition, image processing and tracking, and sound generation. This report
concentrates on the methods and materials used to complete these components and is
discussed in detail in the body of our report.
Nicholas Dargus’ Contributions

Nicholas contributed to several different aspects of the design project. He worked on the
initial computer system setup and later moved to working on the main graphical user
interface (GUI) that would serve as the front end for the Virtual Theremin.
Nicholas’ first responsibility was the basic initialization and setup of a computer with a
Linux-based operating system. He also needed to perform the IEEE1394 Linux driver
installation on this system, along with all of the necessary fixes to ensure its proper
operation. After the system setup was complete, Nicholas was originally scheduled to set
up a GUI that would control the IEEE1394 camera and run the Virtual Theremin. It was
later decided that this would be too complex a task, and that modifying an existing
interface would be more efficient use of time.
Making use of the user-interface code builder, Glade, he modified the Coriander GUI to
add the extra functionality needed to allow the Virtual Theremin to operate correctly. He also
created the necessary interfaces for Daniel’s tracking modules and Jeremy’s sound
generation module.
The following sections were written by Nicholas, with contributions and suggestions
from Daniel and Jeremy: Introduction, Design (4.1, and 4.2), Appendix A and Appendix
B.
Daniel DeAraujo’s Contributions
Daniel was responsible for the skin detection and hand tracking algorithms used in the
Virtual Theremin. His module would take an image provided by Nicholas’s module in
YUV format, search it for skin colored pixels, determine the center of mass of the hands,
and return the (x,y) co-ordinates of each hand to Jeremy’s sound generation module.
Daniel implemented his initial algorithms using Octave. This code consisted of an RGB-to-YUV
converter, a skin detection filter, a blob-growing algorithm, a full-image hand
locator, and a simple heuristic hand locator. He also implemented a script that was used
to extract U-V information from images consisting only of skin-colored pixels, which was
used to determine the approximate region in U-V space that skin tone occupies.
Once the Octave simulations were completed, Daniel converted each of his algorithms
into C in the quickest and most direct fashion possible, to ensure that the algorithms
functioned in C before beginning the optimization step. Once the code was functional in
C, the algorithms were tested and integrated into Nicholas’s module. The integrated
package was tested and fixed, while Daniel optimized the hand detection functions for
maximum speed. The optimized code can be found in Appendix C.
The following sections were written by Daniel, with contributions and suggestions from
Nicholas and Jeremy: Materials and Programming Methods, Design (4.3) and
Conclusion.
Jeremy Gillard’s Contributions

Jeremy contributed to the design project in multiple areas. His first responsibility was to
perform research on the Theremin instrument to determine how it functions and how it is
played so that an accurate conversion model could be created relating hand position to
sound output.
He researched the Linux sound environment so that the sound card would be correctly
enabled and available as an output device to the system. Considerable research also went
into determining what sound APIs were available, and how they could be used to access
the soundcard.
Afterwards, a function was created that would determine which digital samples had to be
written to the soundcard to produce the desired Theremin sound. He then programmed an
interactive software application that mimicked the Theremin output using the keyboard.
This software was used to adjust and fine tune the sound output function.
Next, a conversion algorithm was created that would map hand positions on the screen to
the specific amplitude and frequency the Theremin would produce at those positions.
To thoroughly test this algorithm, a mouse-driven application was created. This
application gave specific real-time feedback on how the system would respond to
changing hand positions, allowing for more accurate mapping between hand position and
sound output.
The following sections were written by Jeremy, with contributions and suggestions from
Daniel and Nicholas: Introduction and Design (4.4).
Acknowledgements
The authors of this report would like to take the opportunity to thank the silent member of
the design team, our project supervisor, Professor W. James MacLean. He has been an
outstanding supervisor. Not only is he an immense source of knowledge, but he is also
very friendly and a pleasure to work for.
Professor MacLean kept the design team on track by holding weekly meetings and was
always available for consultation. He guided us throughout the duration of the project
and made the work fun. Sincerest thanks to James MacLean from the design team.
In addition, the authors would also like to acknowledge that many of the concepts and
material used in our design originated from the publications listed in our references at the
end of this report. Without them, this project would not have been possible.
Table of Contents

CONTRIBUTIONS 2
SECTION 1: ACKNOWLEDGEMENTS 6
SECTION 2: INTRODUCTION 7
  2.1 Project Background
    2.1.1 The Theremin 7
    2.1.2 YUV Color Space 8
    2.1.3 Octave 9
    2.1.4 IEEE1394 10
  2.2 Motivation 10
  2.3 Project Objectives 11
  2.4 Report Outline 11
  2.5 Literature Review 12
    2.5.1 Image Acquisition 12
    2.5.2 Image Processing and Tracking 13
    2.5.3 Sound Generation 14
    2.5.4 Programming References 16
SECTION 3: Materials and Programming Methods 17
  3.1 Materials Used
    3.1.1 Hardware 18
    3.1.2 Software 18
  3.2 Programming Techniques 18
    3.2.1 Freestore 18
    3.2.2 Concurrent Code Development 19
    3.2.3 Incremental Software Development 19
SECTION 4: Design 20
  4.1 Testing the IEEE1394 Bus 20
  4.2 Graphical User Interface Implementation and Image Acquisition 22
    4.2.1 Toggle Buttons 23
    4.2.2 Spin Boxes 23
  4.3 Image Processing 25
    4.3.1 Skin Detection 25
    4.3.2 Hand Tracking 28
    4.3.3 Development Details 30
  4.4 Sound Card Support in Linux 32
    4.4.1 Interactive Program for Sound Emulation Using Keyboard Support 33
    4.4.2 Mouse-Driven Application for Sound Mapping Testing 34
SECTION 5: CONCLUSIONS 36
  5.1 Further Work 37
  5.2 Possible Applications 37
SECTION 6: REFERENCES 38
APPENDIX A: COMPUTER SYSTEM AND SOFTWARE VERSIONS 40
APPENDIX B: SETUP AND INSTALLATION OF LINUX 41
APPENDIX C: SOURCE CODE 43
Section 2: Introduction
The purpose of our project was to create a Virtual Theremin musical instrument using
computer vision techniques under a Linux-based environment. The project was broken
down into three main components: image acquisition, image processing and target
tracking, and sound generation, each of which were created as individual modules for the
main program.
2.1 Project Background

2.1.1 The Theremin
The Theremin musical instrument was invented by
physicist Lev Sergeivitch Termen (anglicized to Leon
Theremin) in 1919 while working for the Russian
government on alarm devices. One of the alarm devices he
created caused a whistling noise which changed in a
predictable way with the approaching of a body. He found
that he could play out melodies by moving his hand in
discrete amounts in front of the alarm. This gave him the
inspiration to create a musical instrument that could be played without physical contact.
Theremin was granted a US patent on February 28, 1928 for the "Thereminvox", as it
was called at the time.
Figure: Leon Theremin playing his creation. [8]

The Theremin is based on the theory of beat frequencies. When you play a note that is
slightly out of tune relative to a reference note, there is a recognizable
pulse until both notes are brought to the same frequency. As these notes get further away
in frequency, beats are produced at a faster rate. The Theremin uses an electronic
oscillator to create a stable high frequency reference tone. Another electronic oscillator
which is initially in tune with the first oscillator is controlled by a capacitive sensing
antenna. The difference between the two pitches created by the oscillators is in the
auditory range, and is amplified. Ideally, the Theremin would produce a perfect sine
wave as its output, but due to coupling of its oscillators, it produces an asymmetrically
slewed sine wave.
The capacitance of the antenna is changed by
moving a hand towards the antenna or away
from it. This allows one to alter the pitch
produced by the instrument. A second antenna
is used to control the volume. Changing the
capacitance of the set of antennae with your
hands allows you to play the Theremin musical
instrument.
Figure 2.1: Image of a Theremin

2.1.2 YUV Color Space

The YUV color model is an alternate way of representing a standard Red, Green, Blue
(RGB) image. It was originally devised as a method of transmitting a full-color signal
from a television station while maintaining compatibility with existing black and white
television sets.
A YUV image is composed of a Luminance channel (Y) which contains information
about the intensity of the pixels in the image and two Chrominance channels (U and V)
which contain all the necessary color information required for accurate reconstruction of
the image. The Y channel is the grayscale image that would be constructed from the
original RGB image. The U channel is derived from the difference between the Blue
channel and the Y channel, and the V channel is constructed in a similar fashion from
the difference between the Red channel and the Y channel.
We use the YUV representation of colors instead of HSI, as previously outlined in our
proposal [13], or other similar color spaces because our ADS Pyro camera can directly
capture images in YUV format, which negates the need for a costly conversion stage.
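For reference, a common form of the RGB-to-YUV conversion can be sketched as below. The coefficients are the widely used BT.601-based values; the exact constants used by the camera or by our own converter may differ slightly:

```c
/* Sketch of a standard RGB-to-YUV conversion. Y is a weighted sum
 * of the three channels (the grayscale image); U and V are scaled
 * differences between the Blue/Red channels and the luminance. */

typedef struct { double y, u, v; } yuv_t;

yuv_t rgb_to_yuv(double r, double g, double b)
{
    yuv_t p;
    p.y = 0.299 * r + 0.587 * g + 0.114 * b; /* luminance   */
    p.u = 0.492 * (b - p.y);                 /* blue chroma */
    p.v = 0.877 * (r - p.y);                 /* red chroma  */
    return p;
}
```

Note that for a pure gray input (r = g = b) both chroma channels are zero, which is what preserved compatibility with black and white receivers.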
2.1.3 Octave
(From [9]) GNU Octave is a high-level language, primarily intended for numerical
computations. It provides a convenient command line interface for solving linear and
nonlinear problems numerically, and for performing other numerical experiments using a
language that is mostly compatible with (Mathworks’) Matlab. GNU Octave is also freely
redistributable software.
Octave was used as the development platform for the initial skin detection and hand
tracking algorithms because it has a simpler language and syntax compared to C, and
provides easy access to all the data being manipulated. However, it is much too slow to be
considered as a platform for a real-time system and was therefore only used to test the
correctness of the above algorithms.
2.1.4 IEEE1394
IEEE1394 is a very fast external bus standard that can support high data
transfer rates [26]. Currently there are two versions available: 1394a, which has a transfer
rate of 400Mbps, and 1394b, which has a transfer rate of 800Mbps.
A single IEEE 1394 port can be used to connect up to 63 external devices. In addition to
its high speed, IEEE 1394 also supports isochronous data transfers. Isochronous data
transfers are time-dependent and refer to processes where data must be delivered within
certain time constraints without corruption. This mechanism is ideal for our project
because we are tracking in real-time and need to transfer large amounts of data from our
camera to the computer.
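As a rough sanity check (our arithmetic, not a figure taken from the camera documentation), the bandwidth needed by an uncompressed video stream can be computed directly:

```c
/* Back-of-the-envelope bandwidth estimate for uncompressed video.
 * The frame geometry and pixel depth below are illustrative values
 * for a typical 640x480 YUV 4:2:2 stream (16 bits per pixel). */

long required_bits_per_second(long width, long height,
                              long bits_per_pixel, long fps)
{
    return width * height * bits_per_pixel * fps;
}
```

For 640x480 YUV 4:2:2 at 30 frames per second this gives about 147 Mbps, which fits comfortably within 1394a's 400 Mbps.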
2.2 Motivation
In the past when performing with a Theremin instrument, the performer had to move his
or her hands in and out of several electro-magnetic fields in order to create music. The
positions of their hands, with reference to the Theremin, are what generate the music. Using
knowledge gained from the study of Theremins, our design group wanted to be able to
create a Virtual Theremin that made use of real-time object tracking.
Our Virtual Theremin will work by tracking a performer’s hands in real time and then,
depending on the locations of the hands, create the sound a Theremin would produce.
Several problems arise from this approach that must be addressed. With a physical
Theremin, the music is created by the performer’s hands changing the capacitance of a
set of antennae. Our Virtual Theremin will not use antennae. Instead we will
track the performer’s hands using our camera and treat the hands as if they were actually
causing changes in a real Theremin. We will also have to determine how the different
hand locations affect the sounds that should be created. The time delay from when the
performer’s hands move, to when the Virtual Theremin outputs a new sound will have to
be small so as not to be noticeable to the performer. In addition, any user feedback video
should not have a perceivable time delay between hand movement and the displayed image.
2.3 Project Objectives
Our primary project objective was to create a Virtual Theremin instrument that closely
mimics the ideal Theremin. This includes tracking players’ hands in real time and
translating their positions relative to a set of virtual antennae to produce an appropriate
sound response from the system.
In addition, we wanted to create an interface for the system that gives Theremin players
feedback as to how the system is tracking and allows for control over system parameters.
This includes:
•Adjustments to account for different lighting conditions.
•Controls to set the video feed parameters.
•Toggles to switch between multiple user feedback modes including hand location
markers and skin highlighting.
2.4 Report Outline
This report explains the development of the Virtual Theremin system. Section 2 gives
the introduction and motivation for the Virtual Theremin project. Explanations of
system development and testing are discussed in Sections 3 and 4. Section 4 is divided into
the separate system modules: image acquisition and GUI development, image processing
and tracking, and sound generation.
Finally, Section 5 concludes the report and discusses future work and other possible
applications that this system could be used for, given further development.
2.5 Literature Review
As mentioned previously, our project was broken down into three main components:
image acquisition, image processing and tracking, and sound generation, each of which
was thoroughly investigated. The following works comprise our research into each area.
2.5.1 Image Acquisition

The Embedded Systems Programming: Fundamentals of FireWire webpage contains an
overview of how the IEEE1394 protocol operates and is hosted by CMP Media LLC [1].
It discusses the protocol’s topology, data transfer and transaction processes, protocol
layers and configuration. It also goes into details about how the bus is managed and the
methods used to provide an easy-to-use, low-cost, high-speed connection. This website
also provides technical diagrams and links to other IEEE1394 references and
documentation.
The 1394 Trade Association is incorporated as a non-profit trade organization founded to
support the development of computer and consumer electronics systems that are easily
connected with each other via a single serial multimedia link. The association’s webpage
provides information on IEEE1394 technology and development [2]. A short history of
the IEEE1394 technology is provided along with the benefits and future of using such a
technology. Links to other IEEE1394 documentation are also available.
The Linux1394.org webpage is run by Dan Dennedy, a maintainer of several Linux
libraries and applications [3]. The site is devoted to providing 1394 hardware access
under the Linux operating system and contains information and frequently asked
questions on how to get started using IEEE1394. Also available on the website, in the
form of downloads, are several software packages and C libraries that allow direct access
to IEEE1394 devices. The software packages and C libraries come with their own
documentation on how to access data coming through the IEEE1394 port. In our project
we used several of these libraries to communicate with the Pyro camera in a Linux
environment. Links to other web-based references and data archives were also available
through this webpage.
2.5.2 Image Processing and Tracking

In Digital Image Processing by Pratt, and the similarly titled book by Gonzalez, we found
in-depth information on many image processing concepts useful to our project, such as
filtering, image segmentation, pattern recognition and color space conversions. Image
segmentation methods found in both books were used to detect the location of the hands
in space, and their accuracy was improved by applying basic filters.
In Pratt, there is detailed information and algorithms for transforming images from Red
Green Blue (RGB) space to hue, saturation and intensity (HSI) space [4]. Since we
originally planned to use the HSI space to look for skin tone colours, the
transformation algorithms available in this book were extremely important.
The decision to switch from HSI to YUV required us to find documentation relating to
this new color space. Joe Maller’s webpage [17] discussed the history of YUV and its
use in various filters used by the Final Cut Pro video editing software package. He also
provides the formulae necessary to convert an RGB value into the YUV color space.
The Naked People Skin Filter site has information on the different ranges of skin color
that can easily be searched for in an HSI and YUV domain [5]. This information was
relevant to our project because it will allow us to locate the different areas of skin-tone,
such as the hands and head of the performer, and make our program more efficient when
trying to locate the hands.
Bare-Hand Human-Computer Interaction describes one possible way to track movement
by processing each frame in its entirety, as our prototype does [24]. This is a simple
yet effective way to implement basic tracking functionality. It is robust since the tracking
algorithm can never really “lose” the object it is tracking, because it performs a full
search every time. The document also establishes that the maximum acceptable latency
for ease of use is 50ms, and endeavors to produce an algorithm that can process the
video faster than 20Hz.
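The full-frame search idea can be sketched as a centre-of-mass scan over a skin mask. This is an illustrative simplification, not the project's actual tracking code:

```c
/* Minimal sketch of full-frame tracking: scan every pixel of a
 * binary skin mask and return the centre of mass of the marked
 * pixels. Because the whole frame is searched each time, the
 * tracker cannot permanently "lose" its target. */

typedef struct { int x, y, found; } centroid_t;

/* mask: w*h bytes, nonzero where a pixel was classified as skin. */
centroid_t mask_centroid(const unsigned char *mask, int w, int h)
{
    long sx = 0, sy = 0, n = 0;
    centroid_t c = { 0, 0, 0 };
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            if (mask[y * w + x]) { sx += x; sy += y; n++; }
    if (n > 0) {
        c.x = (int)(sx / n);
        c.y = (int)(sy / n);
        c.found = 1;
    }
    return c;
}
```

A practical system would run this per hand region rather than over the whole frame at once, but the scan itself is the same.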
2.5.3 Sound Generation

Described at the Take a Look at Theremins website are methods by which the Theremin can
produce music without physically touching the instrument [15]. This site is an excellent
resource for Theremin technical documents and playing techniques, and it provided an
excellent starting point for learning how to mimic the Theremin’s responses to stimuli
on the computer.
The Open Sound System Programmer’s Guide manual gives detailed instructions on how
to program applications for the Open Sound System (OSS) [7]. OSS is a device driver
for sound cards and other sound devices under UNIX and UNIX-compatible operating
systems. The manual progresses from background
information on OSS devices and programming techniques to a detailed description of
programming of specific aspects of sound cards, such as the mixer. Both approaches to
output, using either a digitized voice or a synthesizer, are explained, which helped in
determining which type of sound output would be best for this project.
The online Linux Sound HOWTO resource gives useful information on sound support
for Linux [8]. To be able to program an application for outputting sound, we needed to
make sure that our soundcard was installed and functioning correctly. This
document lists supported Linux hardware, describes how to configure the kernel drivers,
and answers frequently asked questions relating to Linux sound. An overview of how a
sound card functions is also present, which was useful in helping to determine whether to
use a digitized voice or synthesis.
The Method for the Theremin instructional manual written by Robert B. Sexton on how
the Theremin musical instrument is played gave us two important pieces of
information [15]. First, it helped in our tracking design model. We needed to know how
specific hand movements affect the Theremin. With this information, we can correctly
respond to the movements of the Virtual Theremin player’s hands, so as to duplicate the
response that is to be expected of a physical Theremin instrument. Secondly, it served as
a guide to learning how to play the instrument effectively, so as to demonstrate
that our system correctly duplicates the use of a physical Theremin.
Sine Wave Modulation Synthesis for Programmers was created by Ian Miller [12]. It is a
valuable resource for determining how sine waves can be used to synthesize sounds.
Equations for creating an appropriate sine wave at a particular frequency based on
sampling rates are presented, as well as more advanced information on sine wave
manipulation.
John Simonton writes an interesting article on the properties of Theremin sound, as well
as how the sounds are produced [25]. It is worthwhile reading for anyone wondering
why the Theremin sounds the way that it does, as well as for those who are interested in
harmonics and pitch sensitivity.
2.5.4 Programming References

Documentation was needed to explain the usage of several IEEE1394 programming
libraries. Geocrawler [18] was found to be most beneficial as it provided examples and
group discussions. Geocrawler is the leading news, collaboration and distribution
community for IT and Open Source development, implementation and innovation. It
currently boasts approximately 6,000,000 emails archived that discuss open source
development.
The DFS’s C page webpage was created by DF Stermole [14]. It contains basic
information about Linux systems, as well as information about programming in C. The
pertinent information from this source comes in the form of a sample C program to allow
a user to obtain input from the keyboard using a single key press under a Linux
environment.
The A Pthreads Tutorial page was created by Andrae Muys [10]. The website contains
information on how to create multi-threaded applications under a Unix-type environment.
Programming concepts are explained with short, easy-to-understand sample
programs. Topics covered include the benefits of concurrency, creating threads, mutexes and
synchronization and examples of classical concurrency problems.
The Debian website contains information about the Debian operating system [11].
Documentation on the setup of the operating system is available, as well as information
on packaging and usage of the system. It is a valuable resource for learning how to
get started with a Debian Linux system.
During the development of the simulation in Octave, the Octave Online Documentation
[9] was found to be an invaluable resource, as it covers all of the built- in functions in
detail. Correct syntactical use and valid parameter options are described, and many usage
examples are provided on the site.
Section 3: Materials and Programming Methods
3.1 Materials Used
3.1.1 Hardware
For our project, we decided to use the following hardware components:
• Standard PC, 2GHz processor, 256MB RAM
• IEEE1394 PCI interface card, with OHCI compliant chipset
• IEEE1394 video camera
3.1.2 Software
The following software was also used in the development of the Virtual Theremin:
• GNU GCC and G++ compilers, version 2.96
• DDD graphical debugger, version 3.3.1
• Pthreads library, version 1003.1c
• Coriander source code, version 0.26
3.2 Programming Techniques
During software development, we learned and applied the following programming
techniques. The use of these techniques as they were applied in our software will be
discussed in the relevant sections below.
3.2.1 Freestore
A freestore is used to improve access times to dynamically allocated memory space
during run time. Instead of allocating and deallocating memory blocks as they are
needed, a large amount of memory blocks are allocated and stored in a linked list. When
a new block of memory is required by the program, it calls a macro that returns the next
available block from the linked list, instead of having to find appropriate space in
memory. This offers large performance gains, since dynamically allocating new
storage is one of the most time-intensive operations a program can perform.
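A minimal sketch of the freestore idea is shown below. The block size, pool size, and names are illustrative, and the project itself used a macro rather than functions, but the mechanism is the same: allocation becomes a pointer pop from a pre-built free list instead of a call into the allocator:

```c
#include <stdlib.h>

#define BLOCK_SIZE  64   /* illustrative payload size       */
#define POOL_BLOCKS 256  /* illustrative number of blocks   */

/* While a block sits on the free list it stores the link pointer;
 * once handed out, the same bytes hold user data. */
typedef union block {
    union block *next;
    char payload[BLOCK_SIZE];
} block_t;

static block_t pool[POOL_BLOCKS];
static block_t *free_list = NULL;

/* Chain every block in the pool onto the free list once, up front. */
void freestore_init(void)
{
    free_list = NULL;
    for (int i = 0; i < POOL_BLOCKS; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Pop the next available block; NULL when the pool is exhausted. */
void *freestore_get(void)
{
    block_t *b = free_list;
    if (b)
        free_list = b->next;
    return b;
}

/* Push a finished block back onto the free list for reuse. */
void freestore_put(void *p)
{
    block_t *b = (block_t *)p;
    b->next = free_list;
    free_list = b;
}
```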
3.2.2 Concurrent Code Development
To make our code development more efficient, we modularized our Virtual Theremin
design, which allowed each member to work independently. Each module was planned
such that it had a rigid set of input and output requirements placed on it. Each member
was aware of what they needed to take as inputs and what was expected at the outputs.
By taking this approach, our group was able to maximize our overall design efficiency
and minimize integration time.
3.2.3 Incremental Software Development
To ensure that our code was of the highest quality, we decided to implement incremental
software development methods. We agreed that each team member would write their
code in such a way that every existing part of the module was tested before beginning
development of a new one. During the earlier stages of development, we made daily backups of
our code, and once integration began and integration problems were solved, we backed
up our code every few hours. The version of our software found in Appendix C is the
most recent version, Mar19-2.
Section 4: Design
This project was made up of three main components: image acquisition, image
processing and tracking, and sound generation. The methods used to develop, integrate
and test each of these components are listed below.
4.1 Testing the IEEE1394 Bus
After completing the installation of Debian as outlined in Appendix B, the IEEE1394 bus
was checked using two separate tests to ensure correct operation. The first test used a
small graphical program called gscanbus, which can be found at [3]. Gscanbus was
created to scan the IEEE1394 bus, verify that it exists, and then provide information about any
connected devices. The second test used a video capturing program called Coriander,
which can also be found at [3]. Coriander was created to output video feed from an
IEEE1394 video device to the screen.
The gscanbus test was successful and found the IEEE1394 bus on the first trial, verifying
that the IEEE1394 bus existed and was functioning properly. Information about our Pyro
camera connected to the bus was displayed to the screen.
The Coriander test was successful in verifying the operation of the bus, as the video feed
from the Pyro camera was output to the screen in an appropriate manner, but after a
few minutes the video transmission from the camera would terminate. The transmission
would simply stop for no apparent reason and would resume only after a clean reboot of
the entire system. Thinking other IEEE1394 users may have experienced this problem,
we searched through [3] for some information. It was soon discovered that this
transmission problem was caused by the existing OHCI1394 driver.
The driver had been written in such a way that control of the IEEE1394 bus would be
handled by attached devices, not the Linux kernel. This wouldn’t have been a problem
had our camera been designed to control and manage the bus. Instead the camera was
designed assuming the kernel would be in control. Both kernel and camera were confused
about which had control of the bus, and at a certain point during transmission neither
the kernel nor the camera would have control and the bus would lock, ceasing all
transmissions.
This problem was solved by editing the OHCI1394 driver to force the kernel to take
control of the bus. Following instructions listed on [3], the ohci1394.c driver file was
edited and the value of the bus control variable, attempt_root on line 162, was changed
from 0 to 1 as depicted in Figure 4.1 below.
It should be noted that this modification is considered a “hack” by [3] and can cause
major problems for anyone who is connecting more than one PC to the IEEE1394 bus.
Currently a more complete implementation of the bus management is being developed.
159 /* Module Parameters */
160 MODULE_PARM(attempt_root,"i");
161 MODULE_PARM_DESC(attempt_root, "Attempt to make the host root (default = 0).");
162 static int attempt_root = 0;    /* <-- changed to 1 */

Figure 4.1: Code fragment from ohci1394.c
4.2 Graphical User Interface Implementation and Image Acquisition
At the outset of the project, it was assumed that our group would have to write the software to capture images from the camera and display them on the monitor. It was also assumed that we would have to design a graphical user interface (GUI) to make our software easier to use. After some preliminary work, we discovered that creating an image capturing GUI would be more complex than we anticipated.
To solve this problem and allow more time to focus on the image processing portion of our design project, our group came up with a simple solution. Taking advantage of open source licensing under Linux, we modified the original Coriander source code and added our own functionality and algorithms to it. This was an ideal solution for several reasons, but mainly because Coriander already provided a functioning, easy to use GUI. Furthermore, it also provided all the functionality for capturing images from the camera and outputting them to the screen.
Figure 4.2: Coriander
Alterations to the Coriander source code included adding interfaces to call the different functions provided in Daniel's and Jeremy's modules. Modifications were also made to Coriander's GUI using a user-interface builder called Glade. These changes can be seen by comparing Figure 4.3 to the original, displayed in Figure 4.2. The new version includes several toggle buttons as well as several spin boxes, each of which is explained below.
4.2.1 Toggle Buttons
As can be seen in Figure 4.3, five extra toggle buttons were added to the Coriander GUI: “Save RGB & YUV”, “Toggle Tracking”, “Toggle Sound”, “Toggle Skin Highlighting” and “Toggle Hand Markers”.

The “Save RGB & YUV” toggle button was added purely for the benefit of grabbing a snapshot of what the camera was displaying and writing it to disk. The image was saved in both RGB and YUV formats and was mainly used by Daniel for testing the skin detection algorithms.
The “Toggle Tracking” button enables the hand tracking algorithm that constitutes the
main component of the Virtual Theremin. Clicking this button will pass images from the
Pyro camera to Daniel's hand tracking module.
Figure 4.3: Modified Coriander
Enabling tracking also enables the “Toggle Sound”, “Toggle Skin Highlighting” and “Toggle Hand Markers” buttons. Toggling the sound feature enables the sound generation provided by Jeremy's sound module. “Toggle Skin Highlighting” and “Toggle Hand Markers” are present in the GUI to provide visual feedback to the performing user.
The skin highlighting feature shows the user that the program is working by displaying, in pink, what is currently being tracked as skin. The hand markers feature is another visual aid that, when enabled, displays markers on the screen showing the user what the computer is interpreting as hands. The left hand gets a blue square, while the right gets a green one. An example of these buttons in action can be seen in Figure 4.4.
The exact processes for performing the tracking, highlighting and sound generation are
explained in more detail by Daniel and Jeremy, later in this section.
Figure 4.4: Visual example of toggling skin highlighting and hand markers
4.2.2 Spin Boxes
Along with the toggle buttons, four spin boxes were also added to the Coriander GUI: “U MIN”, “U MAX”, “V MIN”, and “V MAX”. Since our program is affected by different lighting conditions, as will be explained in section 4.3.1, these spin boxes allow a user to adapt the program to different lighting environments.
4.3 Image Processing
The image processing module makes use of the information acquired by the image acquisition stage to determine the location of skin colored pixels in the image. Once this information is generated, the processing module searches the image to compute the locations of the player’s hands and returns this information to the program, where it is used to simulate the Theremin sound in the sound generation module.
4.3.1 Skin Detection
The skin detection for each image is performed in YUV space. Since the determination of skin colored pixels depends very little on the intensity of the pixel, we can safely ignore the Y component of each pixel and use the color information stored in the U and V channels to decide whether a pixel is skin colored or not. This reduces the skin detection problem from a three-dimensional problem in RGB space to a two-dimensional problem in UV space.
Originally, we had planned to perform skin detection in the HSI color space, but we soon
discovered that the conversion from HSI to RGB was very complex and time consuming.
After discussing the situation with Professor MacLean, we began looking into other color
spaces that would be acceptable for our project. We decided to use the YUV color space
because our camera could be configured to output images in this format, removing the
need for any color space conversion to take place.
In addition to the reduction in complexity, YUV and similar color spaces which separate
intensity from color information offer another advantage in the skin detection problem. In
these color spaces, skin colored pixels of any background or ethnic origin all fall within a
small connected region of the color information space [5]. Since we did not have any
information regarding the values of skin tone in the YUV color space, it was necessary to
take sample images of skin and determine their chrominance values. Using Octave, we
created a function that would take a directory of images and extract the necessary
chrominance values (U and V), assuming that the images only contained skin colored
pixels or black pixels. The number of occurrences of each set of values was recorded and
plotted. The preliminary results can be seen in the following plot.
Figure 4.4: Relative Frequency of U, V pairs for skin colored pixels

We used the information collected to determine an appropriate range of U and V, which was used to segment the image into skin colored and non-skin colored pixels. Results of this segmentation can be seen below:
Figure 4.5: Test Image A
Figure 4.6: Pixels detected as skin-color
The results obtained from our filter are quite adequate for our project, but it is worth noting that the output of the filter is affected by the lighting conditions in the playing environment. In particular, we see in Figure 4.6 that the palm of the hand is not completely identified as skin. Looking at Figure 4.5, we see that the palm has a blue tint, due to the subject being positioned close to the computer monitor at the time the image was taken. This is a significant property of the skin detection algorithm, so we have integrated a manual adjustment of the U-V range into the GUI, allowing us to adjust the sensitivity of the detection algorithm based on the current lighting conditions.
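As a rough illustration of the approach (this is not the project's actual code; the structure names and the default threshold values below are invented for the example), the per-pixel test reduces to two interval checks on the chrominance channels:

```c
#include <stdint.h>

/* Illustrative U-V range test.  The SkinRange values stand in for the
 * numbers set through the GUI spin boxes ("U MIN", "U MAX", "V MIN",
 * "V MAX"); the defaults used in the test are invented for the example. */
typedef struct { uint8_t u_min, u_max, v_min, v_max; } SkinRange;

static int is_skin(const SkinRange *r, uint8_t u, uint8_t v)
{
    /* The Y (intensity) channel is deliberately ignored. */
    return u >= r->u_min && u <= r->u_max &&
           v >= r->v_min && v <= r->v_max;
}
```

Because only two of the three channels are inspected, the classifier is largely insensitive to brightness, which is what makes the YUV formulation attractive here.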
4.3.2 Hand Tracking
Once the image has been segmented with the skin detection algorithm, it must be searched to determine the location of the left and right hands in the image. The hand location section of the module consists of three main parts: a blob growing function which finds all connected skin pixels from a given start pixel, a full image search algorithm which creates a list of blobs from the entire image and determines which blobs
are the hands, and a quick search which searches the image based on the previous known
location of the hands.
The first function is used by the other functions to determine which pixels are part of a
connected component of skin. Called grow_blob() in handtracker.c (see Appendix X for
source code), this function uses packets from a freestore to create a list of pixels that
should be visited. The algorithm searches the 4-neighbourhood of the current pixel and
adds any skin colored pixels to the list. The algorithm then removes the next pixel from
the list and repeats the same process until the list is empty. At each pixel insertion, the algorithm also accumulates information on the blob: in particular, the size of the blob in pixels, and the pixel co-ordinate representing the center of mass of the blob.
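The process just described can be sketched as an iterative flood fill. This is an illustrative re-creation, not the project's grow_blob(): the real function draws its pixel packets from a freestore, while this sketch simply allocates a worklist.

```c
#include <stdlib.h>

/* Iterative 4-neighbourhood flood fill that accumulates blob size and
 * centre of mass, in the spirit of grow_blob() in handtracker.c.
 * Images are flat row-major arrays; all names here are illustrative. */
typedef struct { int size; double cx, cy; } Blob;

static Blob grow_blob(const unsigned char *skin, unsigned char *visited,
                      int w, int h, int sx, int sy)
{
    static const int dx[4] = { 1, -1, 0, 0 };
    static const int dy[4] = { 0, 0, 1, -1 };
    Blob b = { 0, 0.0, 0.0 };
    /* to-visit list; each pixel is added at most once, so w*h suffices */
    int (*list)[2] = malloc(sizeof(int[2]) * (size_t)(w * h));
    int n = 0;

    list[n][0] = sx; list[n][1] = sy; n++;
    visited[sy * w + sx] = 1;
    while (n > 0) {
        n--;
        int x = list[n][0], y = list[n][1];
        b.size++;                 /* accumulate blob statistics */
        b.cx += x;
        b.cy += y;
        for (int i = 0; i < 4; i++) {
            int nx = x + dx[i], ny = y + dy[i];
            if (nx >= 0 && nx < w && ny >= 0 && ny < h &&
                skin[ny * w + nx] && !visited[ny * w + nx]) {
                visited[ny * w + nx] = 1;
                list[n][0] = nx; list[n][1] = ny; n++;
            }
        }
    }
    free(list);
    if (b.size > 0) { b.cx /= b.size; b.cy /= b.size; } /* centre of mass */
    return b;
}
```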
The full image search function, find_hands() begins at the top left of the image and
searches each pixel until it finds a skin colored one that has not already been added to
another blob. It then passes the selected pixel to the grow_blob() function and stores the
resulting blob into a linked list, sorted by size. Once the entire image has been searched,
the three largest blobs are selected from the list, which are assumed to be the left and
right hands, and the head. We assume that the hands will be the leftmost and rightmost
blobs, and return the center of mass co-ordinates of these two blobs.
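The selection step can be sketched as follows. The real find_hands() keeps a linked list sorted by size; here a simple top-three scan over an array stands in, and the function and field names are invented for the example.

```c
/* Keep the three largest blobs (assumed to be the two hands and the
 * head), then label the leftmost of them the left hand and the
 * rightmost the right hand.  Assumes n >= 3; names are illustrative. */
typedef struct { int size; double cx, cy; } Blob;

static void pick_hands(const Blob *blobs, int n, Blob *left, Blob *right)
{
    int top[3] = { -1, -1, -1 };          /* indices of the three largest */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < 3; j++)
            if (top[j] < 0 || blobs[i].size > blobs[top[j]].size) {
                for (int k = 2; k > j; k--)   /* shift smaller entries down */
                    top[k] = top[k - 1];
                top[j] = i;
                break;
            }
    *left = *right = blobs[top[0]];
    for (int j = 1; j < 3; j++) {         /* leftmost and rightmost by centre */
        if (blobs[top[j]].cx < left->cx)  *left  = blobs[top[j]];
        if (blobs[top[j]].cx > right->cx) *right = blobs[top[j]];
    }
}
```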
Once a full image search has been performed, it is no longer necessary to search the entire image for the hands, since we assume that the player’s hands will not have moved significantly from their positions in the previous frame. We begin our search from the previous known location of each hand and search the 8-neighbourhood of that pixel for a skin colored pixel. If one is found, we call the grow_blob() function on that pixel and
assume that the returned blob is the hand. If for some reason we are unable to locate
either of the two hands based on the previous location, we resort to a full image search.
The time saved by the quick search is significant, and is up to 5 times faster than a full
search if the image is noisy. Fortunately, the penalty incurred by a miss in the quick
search will not affect the performance of the system, since the full image search is still
much faster than the required 50ms. The relevant timing results obtained by searching a
noisy image can be found in the following table:
(All times in ms)   Full search   Quick search hit   Quick search miss
320x240                  3              <1                   5
640x480                 17               5                  24
In addition to the three major functions discussed above, there is a wrapper function, locate_hands(), which serves as an interface to the module and automatically determines which of the two hand detection functions should be called, and an initialization function, locator_init(), which allocates empty blob and pixel packets to the respective freestores.
4.3.3 Development Details
Once the module was complete, it was found that the full search algorithm was excessively slow. The first trial run of the full search took 273ms to completely search a 320x240 image. However, the quick search was found to be quite fast, taking less than 3ms to find the hands. Since there were real time constraints to be dealt with, we decided that we could only spend time performing one type of optimization. We could optimize
the skin detection filter to minimize the number of full searches required, or we could improve the processing time of the full search. We decided on the latter course of action, since we could apply the optimizations to both the full and the quick searches, and hopefully gain more improvement for an equivalent time investment. This selection yielded improvements much greater than we anticipated.
At the start of the optimization process, we spent time removing unneeded variables and
redundant calculations, and using more efficient methods of initializing internal data
storage, but this gave us only marginal improvements, a few milliseconds at best. We
then considered the internal data structures themselves, and realized that the same storage
space was being used every time through the detection process, but new memory was
being allocated every time. This concern was brought up to Professor MacLean, who
suggested we investigate the freestore programming technique, to reuse existing memory
allocations instead of finding new memory every time. We first implemented the pixel
storage and blob information storage as a freestore, with moderate speed improvements
of up to 50ms. However, once we began to reuse large existing arrays used for temporary
image storage, our processing times dropped dramatically. The end result of our
optimizations was a 100-fold improvement over our original full search times, from
296ms to 3ms. The quick search also improved, from 5ms to less than 1ms. As such, we
were able to make our system more robust, by inserting full frame searches at regular
intervals to ensure that, even if the hands were “lost” while tracking, they could be found
again within a certain time frame and there would be no noticeable delay in tracking.
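The freestore technique can be sketched in a few lines. This is an illustrative miniature, not the project's implementation: all packets are allocated once up front and recycled through a free list, so the per-frame search never calls malloc().

```c
#include <stddef.h>

/* Minimal freestore sketch: a pool of packets chained into a free
 * list at start-up, then handed out and returned without any further
 * allocation.  Structure and names are invented for the example. */
typedef struct Packet {
    int x, y;               /* payload: a pixel co-ordinate */
    struct Packet *next;    /* link used while on the free list */
} Packet;

static Packet *free_list = NULL;

static void freestore_init(Packet *pool, int n)
{
    for (int i = 0; i < n; i++) {   /* chain the whole pool together */
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

static Packet *packet_get(void)     /* reuse instead of malloc() */
{
    Packet *p = free_list;
    if (p) free_list = p->next;
    return p;
}

static void packet_put(Packet *p)   /* return the packet to the pool */
{
    p->next = free_list;
    free_list = p;
}
```

The payoff is exactly the one described above: the cost of memory management is paid once at initialization, not on every frame.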
4.4 Sound card support in Linux
One of the major benefits of the Linux operating system is that it is easily customizable, allowing users to tweak the system to suit their individual needs. Sound setup in Linux is more complicated than on most other operating systems, but the result is more efficient at managing resources.
Recent kernel distributions for Debian Linux include many of the standard audio device drivers, but they are not enabled by default. To use them, they must be enabled and the kernel recompiled. Our system contains a soundcard that uses one of the drivers provided by the kernel.
Linux devices have the interesting property that they are registered as files in the file system of the operating system. This is a convenient property, as it allows some of the standard file operations to be performed on the devices. Therefore, if you wish to write samples to the soundcard for output, you can simply write a file containing a set of samples to the soundcard device file. This is an excellent way to verify that the soundcard installation has been performed correctly: by writing a standard audio file to the device, sound should be heard from the system speakers.
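A sketch of this test, treating the soundcard like any other file ("/dev/dsp" is the conventional OSS device name; the O_CREAT flag is only there so the example can also be pointed at an ordinary file path for a dry run):

```c
#include <fcntl.h>
#include <unistd.h>

/* Write raw 8-bit samples to a device file such as "/dev/dsp".
 * Because Linux exposes the soundcard as a file, this is ordinary
 * open()/write().  Returns bytes written, or -1 on failure. */
static int play_samples(const char *device, const unsigned char *samples, int n)
{
    int fd = open(device, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    int written = (int)write(fd, samples, (size_t)n);
    close(fd);
    return written;
}
```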
In addition, the Linux mixer device for the soundcard needs to be assigned to the users who should have access to it. In our system, we want any user to have access to mixer functionality, and thus the device's privileges were set to accommodate this.
To design our software to interface with the soundcard device, the Open Sound System (OSS) application programming interface was used. The Open Sound System is a device
driver for sound cards operating under a Unix-compatible operating system. Sound cards
normally have different devices or ports which produce or record sound. The OSS
provides a common programming interface that can control all of the devices and ports
associated with the soundcard. By using this API, our Virtual Theremin software can be run on any Linux system that uses OSS. This is beneficial as it does not constrain us to any specific type of soundcard.
4.4.1 Interactive Program for sound emulation using keyboard support
The first step in generating Theremin sound was to create a test application. This
framework allowed the sound output characteristics to be adjusted until it conformed to
what an ideal Theremin should sound like.
The test application consisted of a thread that would continually output samples from a waveform, defined below, to the soundcard device. Using the keyboard, two variables could be adjusted up and down; these correspond to an amplitude component and a frequency component. In the C programming language, keyboard input is buffered so that multiple keys are read until the enter key is pressed. Since we wanted key presses to have a direct effect on the output sound, this buffering was disabled. The test was successful, as the sound thread read the variables and output samples from the waveform to the soundcard.
Once the testing application was complete, the function which creates samples for a specific waveform was written. A sine wave was used as our output waveform, since this waveform defines the ideal Theremin output. Since the Theremin sound is continuous, complete cycles
of the waveform were defined to eliminate chop, which is an undesirable property where
the output sound temporarily halts.
    f = (2*PI*hz)/sr;
    count = (int)floor(sr/hz);  /* number of samples in buffer for complete sine wave */
    for (i = 0; i < (count/2); i++) {
        audio_buffer[buffer_index] = vol*sin(f*i) + 128;  /* so no negative values */
        buffer_index++;
        audio_buffer[buffer_index] = vol*sin(f*i) + 128;  /* so no negative values */
        buffer_index++;
        if (buffer_index == 2) {  /* write to audio device when buffer is full */
            write(audio_fd, audio_buffer, BUF_SIZE);
            buffer_index = 0;
        }
    }

Figure 4.7: Code segment from keyboard testing program

By observing the above code segment, you can see the manner in which amplitude and frequency were defined to produce complete cycles of samples from a sine wave. Amplitude corresponds to the ‘vol’ variable, which is mapped between zero and one hundred, with zero meaning no amplitude and one hundred meaning full amplitude. The frequency of the output waveform is the ‘f’ variable, which corresponds to a set output frequency.

This application allowed the output waveform to be adjusted until the sound produced was consistent with what our design group believed to be an ideal Theremin.

4.4.2 Mouse Driven Application for Sound Mapping Testing

To create a mapping between hand position and sound, a new test application was created that used the mouse cursor position in place of hand positions. Since we defined that the left hand would control the volume, by being tracked vertically, and the right hand would control the pitch, by being tracked horizontally, the current mouse position could be used to represent both hand positions. We ignored the vertical position of the right hand and the horizontal position of the left hand. This test application was written using OpenGL, taking advantage of the glut library, which provided the mouse handling functionality.
To map a given hand position to an output amplitude and frequency for our sine wave, a look-up table was created. This table translates a given x and y position for each hand into an amplitude and frequency for output. The table was created by taking the given set of pixels in the image and defining a linear range of output for each dimension. The vertical pixel positions map onto the required output volume range; similarly, the horizontal pixel positions map onto the given frequency range.
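The linear mapping described above amounts to a single interpolation formula per axis. A sketch follows; the image dimensions and the output ranges (0-100 volume, and 220-440 Hz as an example of a one-octave pitch span) are invented for the example, not the project's calibrated values.

```c
/* Map a pixel co-ordinate linearly onto an output range. */
static double map_linear(int pixel, int pix_max, double out_min, double out_max)
{
    return out_min + (out_max - out_min) * (double)pixel / (double)pix_max;
}

/* Left hand, tracked vertically, controls volume (0-100);
 * right hand, tracked horizontally, controls pitch (220-440 Hz here,
 * i.e. one octave).  A 320x240 image is assumed for the example. */
static double volume_for_y(int y) { return map_linear(y, 239, 100.0, 0.0); }
static double pitch_for_x(int x)  { return map_linear(x, 319, 220.0, 440.0); }
```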
While using this test application, it was discovered that too many samples were being written to the soundcard, causing a noticeable delay between the movements of the mouse and the sound generated. This was corrected by implementing a timer for the sound output thread: the thread was timed to run at a rate matched to the soundcard sampling rate, ensuring that no extra samples were written to the audio buffer that might cause a delay between mouse movement and sound output.
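The timer fix boils down to matching the thread's period to the duration of one buffer of audio. A sketch of that arithmetic (the buffer size and sample rate in the test are illustrative; in the real thread the result would feed a sleep call such as nanosleep()):

```c
/* Nanoseconds that one buffer of audio lasts at a given sample rate
 * (8-bit mono is assumed, so one byte per sample).  Writing a buffer
 * and then sleeping this long keeps sample production matched to the
 * rate at which the soundcard consumes them. */
static long buffer_period_ns(int samples_per_buffer, int sample_rate)
{
    return (long)(1000000000LL * samples_per_buffer / sample_rate);
}
```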
With the hand position to sound output module complete, it was integrated into the Virtual Theremin system. The integrated system was then thoroughly tested by our design group, checking that the hand positions shown in the GUI matched the sound output.
Section 5: Conclusions
Overall, our project was successful. We achieved all the objectives that were outlined in
our proposal [13] and were able to demonstrate our functioning system at the design fair.
Most notable is our processing speed, which is far better than the required 50ms. With a 320x240 pixel image, we were able to keep the processing delay under 5ms at all times, and a full-size 640x480 image could be processed in less than 25ms. Theoretically, this could yield frame rates upwards of 40fps with a full-size video feed, substantially faster than our original requirements. Our system was also able to track the
player’s hands without an explicit setup phase. Though a small amount of initial
calibration was required to compensate for the lighting conditions present in the playing
environment, the player could walk into the camera’s field of vision, and the system
would automatically begin tracking him. The system can track the user's hands through regular Theremin movements and will track them throughout the entire image, except when the user places a hand over his face. Once the hand is relocated, however, the system quickly resumes correct tracking.
Our Virtual Theremin mimics a real Theremin, both in terms of sound and physical
movements. Even though we are limited to a one octave pitch range, appropriate hand
movements will cause the system to respond with the required changes in pitch and
volume. Since a real Theremin should theoretically produce a pure sine wave, we
consider our computer generated sine wave to be an appropriate approximation.
The UI we developed is very intuitive to use. All the essential features are easily accessible via toggle buttons and spin boxes, and system properties such as video capture size, skin detection range and user feedback modes are easily customizable.
5.1 Further Work
Though the Virtual Theremin was an instructive and fun way to learn about hand tracking
and human-computer interaction, the applications of our system are very broad, and serve
as a sound basis for further development in computer vision. Our system is quite robust
but could still use some adjustments. In particular, the skin detection algorithm could be
improved by using a data range that more accurately represents the distribution of skin-
colored pixels in U-V space. It may also be useful to augment the system with an
algorithm to automatically adjust the U-V range based on the current lighting conditions
detected by the camera.
5.2 Possible Applications
One particularly useful project could consist of adapting the tracking data provided by
our project to control a cursor on a computer screen in a similar fashion to the control
provided by a mouse. This could be useful in giving presentations in locations where it is
not always convenient to use a mouse (limited space, inconvenient placement of
computer equipment with respect to other apparatus, such as microphones). With further
development, we could use image segmentation techniques to recognize various hand and
finger positions to provide additional controls to the user.
Section 6: References
1. (2002, Sept 1). CMP Media LLC. [Online]. Available: http://www.embedded.com/1999/9906/9906feat2.htm
2. (2002, Sept 22). 1394 Trade Association. [Online]. Available: http://www.1394ta.org/
3. Dennedy, Daniel. (2002, Sept 1). Linux1394. [Online]. Available: http://www.linux1394.org
4. Pratt, William K. Digital Image Processing, 2nd Ed. New York: John Wiley & Sons, Inc., 1991.
5. Fleck & Forsyth. (2002, Sept. 22). Naked People Skin Filter. [Online]. Available: http://www.cs.hmc.edu/~fleck/naked-skin.html
6. Sexton, Robert. “Method for the Theremin – Book 1 Basics.” Texas: Tactus Press, 1996.
7. Tranter, Jeff. (2002, Sept. 26). Open Sound System Programmer’s Guide (1.11). [Online]. Available: http://www.4front-tech.com/pguide/oss.pdf
8. Tranter, Jeff. (2002, Sept. 26). The Linux Sound HOWTO. [Online]. Available: http://www.tldp.org/HOWTO/Sound-HOWTO/index.html
9. Eaton, John W. (2002, Nov 17). GNU Octave – Table of Contents. [Online]. Available: http://www.octave.org/doc/octave_toc.html
10. Muys, Andrae. (2003, Jan. 5). A Pthreads Tutorial. [Online]. Available: http://www.cs.nmsu.edu/~jcook/Tools/pthreads/pthreads.html
11. (2002, Sept. 22). Debian. [Online]. Available: http://www.debian.org
12. Wilson, Ian. (2003, Jan. 5). Sine Wave Modulation Synthesis for Programmers. [Online]. Available: http://www.geocities.com/SiliconValley/Campus/8645/synth.html
13. Dargus, DeAraujo & Gillard. Virtual Theremin using IEEE1394 – Technical Proposal. Toronto: University of Toronto, 2002.
14. Stermole, DF. (2003, Jan. 4). DFS's C Page 2001-2002. [Online]. Available: http://www.macdonald.egate.net/CompSci/index.html
15. Sexton, Robert. (2002, Sept. 22). Take a Look at Theremins. [Online]. Available: http://www.ccsi.com/~bobs/theremin.html
16. Hawksley, John. (2003, Jan. 5). Sampling. [Online]. Available: http://www.armory.com/~greebo/sampling.html
17. (2003, Jan 6). RGB and YUV Color. [Online]. Available: http://www.joemaller.com/fcp/fxscript_yuv_color.shtml
18. (2003, Jan 8). Geocrawler. [Online]. Available: http://www.geocrawler.com/archives/3/2711/2000/11/0/4732161/
19. (2002, Oct 1). The Linux Kernel HOWTO. [Online]. Available: http://www.linux.org/docs/ldp/howto/Kernel-HOWTO.html
20. (2002, Nov 22). Octave-forge Combined Index. [Online]. Available: http://octave.sourceforge.net/index/index.html
21. Dennedy, Daniel. (2002, Sept 1). Linux1394. [Online]. Available: http://www.linux1394.org
22. Gonzales, Rafael C. & Woods, Richard E. Digital Image Processing, New Ed. New York: Addison-Wesley Publishing Company Inc., 1992.
23. Hardenberg & Bérard. (2002, Sept. 26). Bare-Hand Human-Computer Interaction. Berlin, Germany. [Online]. Available: http://iihm.imag.fr/publs/2001/PUI01_Hardenberg.pdf
24. Simonton, John. (2003, Feb. 10). On Theremin Tone. [Online]. Available: http://www.paia.com/thereton.htm
25. (2003, April 4). Webopedia. [Online]. Available: http://www.webopedia.com/TERM/I/IEEE_1394.html
Appendix A: Computer System and Software Versions

A.1 Computer System

• i386-based processor, 2GHz
• 256MB DDR RAM
• VIA KT333 chipset-based motherboard
• NVidia GeForce4 MX 440-based video card with 64MB DDR RAM
• D-Link DFW-500 Rev. 1.69 1394 PCI card
• SoundBlaster compatible soundcard on the motherboard
• 7200 RPM ATA-133 hard disk
A.2 Software Versions

Software name   Version     Description
Coriander [3]   0.26        A graphical utility that lets you control all of the features of an IEEE-1394 digital camera
gscanbus        0.7.1-1.1   Utility to display connected devices and do transactions
libraw1394-d    0.9.0-2     A library to control A/V devices using the 1394ta AV/C commands. This library also contains librom1394 for reading and decoding the CSR Config ROM of any device on the bus.
libdc1394       0.9.0-2     A library that is intended to provide a high level programming interface for application developers who wish to control IEEE1394 based cameras that conform to the 1394-based Digital Camera Specification (found at http://www.1394ta.org/).
Linux kernel    2.4.16      Linux kernel that supports IEEE1394
Appendix B: Setup and Installation of Linux with IEEE1394 Support
Prior to starting the image acquisition process, a suitable Linux environment was created. A complete listing of all hardware and software components used in the system can be found in Appendix A.
B.1 Installation of Debian
As outlined in [13], our group selected the Debian version of Linux because of its relatively simple installation procedure. Debian makes use of a command called “dselect” that selects, downloads and installs packages available to the operating system, as well as determining whether a package has any dependencies; these dependencies are then downloaded and installed as well. Debian was also selected because it was readily available for download from the University of Toronto’s network, which provided our group with quick access times.
B.2 Installation and Setup of IEEE1394 and Sound
After completing the Debian installation process, it was discovered that the kernel used by the OS was insufficient for our needs, as it did not support IEEE1394. Since IEEE1394 is the basis behind our real-time image capturing, a compatible kernel version was required. Presently our system uses kernel build 2.4.16, which was selected for two reasons. The first was that versions below build 2.4 did not support the IEEE1394 technology, and the second was that builds both preceding and following 2.4.16 had issues with other pieces of hardware in the computer system. For example, the IEEE1394 would work, but the sound card would not, and vice versa.
With an IEEE1394-compatible kernel installed, the IEEE1394 drivers were activated by accessing the kernel’s configuration menu and enabling IEEE1394 support, OHCI support and RAW1394 I/O support. Sound support was also enabled at this point.
Appendix C: Source Code

The relevant source code for all the modules can be found on the following pages.