Post on 19-Dec-2015
Facial Tracking and Animation
Todd Belote, Bryan Harris, David Brown, Brad Busse
Problem Background
• Speech driven facial animation
• Correlate captured facial movements to audio patterns
  – Capture facial movements
  – Analyze corresponding audio
Goals and Objectives
• Develop an inexpensive, robust real-time system to track facial motion and process corresponding audio. The system must:
  – Cost around $1000
  – Run on a personal computer
  – Allow for long periods of data acquisition
  – Handle head movements
  – Recover from point occlusion
  – Output only necessary information
System Description
[Diagram: top-level blocks DATA ACQUISITION, POINT INITIALIZATION, POINT TRACKING, AUDIO PROCESSING, FAP GENERATION]
• Top level system organization
  – Illustrates data flow
  – Functional block division
Division of Work
• Subsystem Leads:
  – Data Acquisition – Todd Belote
  – Point Initialization – David Brown
  – Facial Tracking – Brad Busse & Brian Harris
  – FAP Generation – Brian Harris
Data Acquisition
[Diagram: Total System. Blocks: CAMERA, MICROPHONE, CAPTURE CARD, FRAMEGRABBER, SW TIMER, AVI MOVIE FILE, two EH blocks, VIDEO PROCESSING, AUDIO PROCESSING, DEBUG MODE]
EH = EVENT HANDLER
Data Acquisition – Phase 1
[Diagram: blocks AVI MOVIE FILE, FRAMEGRABBER, SW TIMER, VIDEO PROCESSING, BMP FILE]
• Camera Emulation
  – Parses AVI movie file
  – Sends video frame data to Video Processing
  – Standalone
Data Acquisition – Phase 2
[Diagram: blocks CAMERA, CAPTURE CARD, AVI MOVIE FILE, FRAMEGRABBER, SW TIMER, EH, VIDEO PROCESSING]
• Begin Hardware Interface
  – Capture and record camera data to AVI file
Data Acquisition – Phase 3
[Diagram: blocks CAMERA, CAPTURE CARD, AVI MOVIE FILE, FRAMEGRABBER, SW TIMER, EH, VIDEO PROCESSING]
• Hardware to Processing
  – Real capture data to Processing
  – Mode switch implemented (Emulator / Hardware)
Data Acquisition – Phase 4
[Diagram: blocks CAMERA, MICROPHONE, CAPTURE CARD, FRAMEGRABBER, SW TIMER, AVI MOVIE FILE, WAV FILE, two EH blocks, VIDEO PROCESSING, AUDIO PROCESSING]
• Final Implementation
  – Audio capture to Processing
Point Initialization
• Given: Grayscale bitmap of initial frame
• Retrieve: Point locations and identification
[Diagram: stages Find Points and Identify Points, placed between DATA ACQUISITION and POINT TRACKING; signal labels RGB, Points, BOOL::DONE]
Point Initialization
• Design Constraints
  – Comparison of noise to points
  – Point motion within one frame
• Process
  – Find a point that meets the minimum point criteria
  – Find the center of the point
  – Identify all points
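The three process steps above can be sketched as a simple bright-blob search. This is only an illustrative sketch, not the team's implementation: the brightness threshold, the minimum pixel count used as the "minimum point criteria", the 4-connected flood fill, and the row-major labeling scheme (`P0`, `P1`, ...) are all assumptions made for the example.

```python
from collections import deque


def find_points(image, threshold=200, min_pixels=4):
    """Return (row, col) centroids of bright blobs in a grayscale image.

    image is a list of rows of 0-255 values. Blobs smaller than
    min_pixels are treated as noise (assumed criterion).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centers = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood-fill one 4-connected blob starting at (r, c).
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) >= min_pixels:
                # Center of the point = mean of its pixel coordinates.
                cy = sum(p[0] for p in blob) / len(blob)
                cx = sum(p[1] for p in blob) / len(blob)
                centers.append((cy, cx))
    return centers


def identify_points(centers):
    """Assign hypothetical labels by sorting top-to-bottom, then left-to-right."""
    return {f"P{i}": c for i, c in enumerate(sorted(centers))}
```

A real identification step would map each centroid to a named facial feature; the sorted labels here only stand in for that mapping.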
Point Tracking
• Given: a frame of visual input and the initial positions of all the points
• Return: a list of displacements for use in FAP generation
[Diagram: stages Point Discovery and Point Transform; inputs RGB from DATA ACQUISITION and Initial Point Location from POINT INITIALIZATION; output Relative Point Location to FAP GENERATION]
Point Discovery
• Given a frame of visual data and the last known data point positions:
  – Finds new data points by searching the area around the last seen position of each old data point
  – Updates locations of facial parameters when possible (i.e. not missing or in conflict)
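One way to picture the search-and-update rule above is a nearest-neighbor match inside a fixed radius. The sketch below is an assumption-laden illustration, not the project's code: the search radius, the point names, and the rule that a detection claimed by two tracked points counts as a conflict are all hypothetical choices.

```python
import math


def discover_points(last_positions, detections, search_radius=10.0):
    """Match each tracked point to the nearest detection near its last position.

    last_positions: dict of name -> (x, y)
    detections: list of (x, y) found in the new frame
    Returns (updated positions, set of names that were missing or in conflict).
    """
    claims = {}  # detection index -> names of points that picked it
    for name, (px, py) in last_positions.items():
        best, best_d = None, search_radius
        for i, (dx, dy) in enumerate(detections):
            d = math.hypot(dx - px, dy - py)
            if d <= best_d:
                best, best_d = i, d
        if best is not None:
            claims.setdefault(best, []).append(name)

    updated = dict(last_positions)
    resolved = set()
    for i, names in claims.items():
        # Update only when exactly one tracked point claims the detection.
        if len(names) == 1:
            updated[names[0]] = detections[i]
            resolved.add(names[0])
    unresolved = set(last_positions) - resolved
    return updated, unresolved
```

Points in the unresolved set keep their last known position, which leaves room for a later stage to reassign them.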
Design: Point Transform
Phase 1: Facial Orientation Correction
Approach: Criminisi et al.
• Maps any arbitrary quadrilateral onto any other
• This can account for all six degrees of freedom as well as perspective distortion, greatly simplifying the computation required to reorient the face
• When using an orientation square that encompasses most of the face, this algorithm can be made as accurate as necessary
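A quadrilateral-to-quadrilateral mapping of this kind is commonly expressed as a 3x3 projective transform (homography) fitted to four corner correspondences. The sketch below shows that standard construction under the assumption that the bottom-right matrix entry can be fixed to 1; it is not necessarily the exact formulation used by Criminisi et al. or by this project, and the function names are made up for the example.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate quadrilateral")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for k in range(col, n + 1):
                    m[r][k] -= f * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def quad_to_quad(src, dst):
    """Return the 3x3 matrix mapping the four src corners onto the dst corners."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, with h[2][2] fixed to 1.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, point):
    """Map one (x, y) point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

In use, `src` would be the four corners of the orientation square as seen in the current frame and `dst` the same corners in the reference pose; every tracked data point is then passed through `apply_h` to obtain its rectified position.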
Point Transform: Demo
Design: Point Tracking
Phase 2: Data Point to Facial Parameter Conversion
• The rectified data points are then compared with their last known positions
• This will determine the displacement of the facial parameters they represent, or reassign them should the points be lost or in conflict
FAP Generation
• Convert pixels to centimeters
• Normalize coordinates
• Output File
[Diagram: Point Tracking passes Resolved Point Locations to the FAP Generator, which writes FAP Points to the FAP File]
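The three conversion steps can be illustrated as follows. Everything specific here is an assumption for the sake of the example: the `pixels_per_cm` calibration value, the choice of a single reference distance for normalization, the point names, and the one-line-per-frame text layout, which is hypothetical and not the actual FAP file format.

```python
import io


def to_centimeters(displacement_px, pixels_per_cm):
    """Convert a (dx, dy) displacement from pixels to centimeters."""
    dx, dy = displacement_px
    return (dx / pixels_per_cm, dy / pixels_per_cm)


def normalize(displacement_cm, reference_cm):
    """Scale a displacement by a reference facial distance (assumed scheme)."""
    dx, dy = displacement_cm
    return (dx / reference_cm, dy / reference_cm)


def write_frame(out, frame_index, displacements_px, pixels_per_cm, reference_cm):
    """Write one frame of normalized displacements as a single text line."""
    values = []
    for name in sorted(displacements_px):
        cm = to_centimeters(displacements_px[name], pixels_per_cm)
        nx, ny = normalize(cm, reference_cm)
        values.append(f"{name}:{nx:.3f},{ny:.3f}")
    out.write(f"{frame_index} " + " ".join(values) + "\n")


# Usage: a chin moved 12 px down and a brow moved 6 px left in frame 0.
buffer = io.StringIO()
write_frame(buffer, 0, {"chin": (0, 12), "brow": (-6, 0)},
            pixels_per_cm=20, reference_cm=6.0)
```

Normalizing by a distance measured on the speaker's own face keeps the output comparable across subjects and camera distances, which is presumably the motivation for the normalization step.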
Validation / Test – Data Acquisition
• Phase 1
  – SW TIMER:
    • Verify periodicity of the timer via calls to the high performance clock
      – Test at 1000 ms
      – Test at 500 ms
      – Test at 100 ms
      – Test at 50 ms
      – Test at 33 ms
      – Test at 25 ms
    • System validated with accuracy within 10% at 33 ms
  – FRAME GRABBER
    • Parse frames from an existing AVI movie file and save each frame as a BMP file
      – Verify that the number of frames corresponds to the length in the AVI header
      – Determine that the frames are the correct size
      – Repeat on multiple file formats to ensure robustness
  – PHASE 1 SYSTEM TEST
    • Display data passed to VIDEO PROCESSING as an on-screen bitmap at the rates listed above for the SW TIMER testing.
      – Perform timing tests similar to those performed for the SW TIMER
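The timer check described for Phase 1 amounts to comparing measured tick intervals against the target period. The sketch below assumes that "accuracy within 10%" means every individual interval lies within 10% of the target, which the slides do not spell out; it also uses Python's `time.perf_counter` as a stand-in for the Windows high performance clock, and `time.sleep` as a crude stand-in for the real timer.

```python
import time


def collect_ticks(period_ms, count):
    """Record `count` tick timestamps (ms) from a sleep-based stand-in timer."""
    ticks = []
    for _ in range(count):
        ticks.append(time.perf_counter() * 1000.0)
        time.sleep(period_ms / 1000.0)
    return ticks


def period_stats(timestamps_ms, target_ms):
    """Summarize how far the measured intervals stray from the target period."""
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    mean = sum(intervals) / len(intervals)
    worst = max(abs(i - target_ms) for i in intervals)
    return {
        "mean_ms": mean,
        "worst_error_ms": worst,
        "worst_error_pct": 100.0 * worst / target_ms,
    }


def within_tolerance(timestamps_ms, target_ms, tolerance_pct=10.0):
    """True if every interval is within tolerance_pct of the target period."""
    return period_stats(timestamps_ms, target_ms)["worst_error_pct"] <= tolerance_pct
```

For instance, `within_tolerance(collect_ticks(33, 100), 33)` would run the 33 ms case on the stand-in timer; results will vary with the operating system's scheduler.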
Validation / Test – Data Acquisition
• Phase 2
  – Record test videos of multiple lengths
    • 3 seconds
    • 30 seconds
    • 3 minutes
  – Play test video in Windows Media Player
    • Determine if coloring / video appears correct
  – Parse header information to ensure proper values
    • Compression = BI_RGB
    • Size = 320x240
    • Rate = 30 fps
  – Perform FRAMEGRABBER testing with the test AVI files.
• Phase 3
  – Perform PHASE 1 system test with the interface set to data from file.
  – Run system from camera and display VIDEO PROCESSING data on screen as a bitmap.
    • Determine if video appears correct
    • Run system for variable times to ensure stability (with movie record turned off)
      – 3 seconds
      – 30 seconds
      – 3 minutes
      – 30 minutes
  – Test error cases
    • Invalid file name, during acquisition from file
    • Camera not present, during acquisition from camera
Validation / Test – Data Acquisition
• Phase 4
  – Capture audio test files, using clock calls to verify that the length of capture equals the length of the WAV file.
    • 3 sec
    • 30 sec
    • 3 minutes
  – Play audio test files in Windows Media Player to determine length and audio quality
  – Data Acquisition System Timing Test
    • Run the system in hardware capture mode
    • Output the audio and video frame timestamps as they are delivered to processing
    • Output the corresponding time at which they are delivered
    • Analyze the data to check for synchrony, periodicity, and evidence of time shift.
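The analysis step of the timing test could look roughly like the sketch below. It assumes each logged record is a pair of (capture timestamp, delivery time) in milliseconds, treats "time shift" as a change in delivery latency between the first and last record, and treats "synchrony" as the gap between the audio and video latencies; these interpretations and the record layout are assumptions, not details given on the slides.

```python
def latency_report(records):
    """records: list of (timestamp_ms, delivered_ms) for one stream.

    Returns the mean delivery latency and the drift, i.e. how much the
    latency changed between the first and last record.
    """
    latencies = [delivered - stamp for stamp, delivered in records]
    return {
        "mean_latency_ms": sum(latencies) / len(latencies),
        "drift_ms": latencies[-1] - latencies[0],
    }


def av_skew(video_records, audio_records):
    """Difference between the mean audio and mean video delivery latency."""
    video = latency_report(video_records)
    audio = latency_report(audio_records)
    return audio["mean_latency_ms"] - video["mean_latency_ms"]
```

A drift near zero on both streams together with a small, stable skew would suggest that audio and video stay aligned over the run; a growing drift on one stream is the "evidence of time shift" the test is meant to catch.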
Validation/Test - Initialization
• Test location and identification of points on many faces
• Failure to complete the task may indicate a system failure or may reveal a design constraint
  – Distance from camera
  – Initial face orientation
Validation/Test - Tracking
• Test a number of different faces in a number of different poses at the limits of our specified allowances
• If the system accomplishes the following
  – Correctly extracts data points from raw visual data
  – Reorients the face to extract the correct displacements for every available data point
The system will have passed validation
Validation/Test – FAP Generation
• Use FAE Engine to observe synchronization between audio and facial movements
  – Perform specific facial motions and validate output
    • E.g. move chin down, move eyebrows up, smile
• This test will also be used to validate entire system
Environmental and Health Considerations
• All hardware is off the shelf
• No harm from infrared light
• No harm from other products
  – E.g. reflective markers
Social, Political and Ethical Considerations
• Provide low cost audiovisual capture
  – Increase research in the field by removing the cost barrier
  – Further advances
    • E.g. phone for the deaf
• No ethical issues
• No political effects
Economics and Sustainability
• No economies of scale due to narrow scope
• IBM PupilCAM is hard to locate, so sustainability with the current hardware is an issue
  – Other cameras could provide the same function