ECE532
Final Report
Motion Controlled Punching Game
10th April 2015
Syed Talal Ashraf
Syed Muhammad Adnan Karim
Yao Sun
Contents
Overview
Description
Motivation
Goals
Block Diagram
Outcome
Original Design
Final Design
Differences Outlined
Project Schedule
Original Milestones from Document
Revised Milestones
Description of Blocks
Description of IPs Used
Delta Calc Custom IP
Description of Design Tree
References
Description
The Motion Controlled Punching Game is a “Rock Em Sock Em” style punching game. A
player contests against a CPU-controlled opponent. The fighting characters are displayed on a
VGA screen, with the feed from a camera shown in the corner of the screen. Player input is
obtained by motion tracking of coloured patches attached to the upper arm and the outside of
the palm of the user. The scoring system measures the distance between the two patches to
determine the extension of the arm and scales the damage accordingly: the greater the
extension, the more damage is done.
A PMOD Camera is used to obtain visual input from the user. The camera operates at
320x240 pixels, and its output is processed by a custom IP block to track motion.
Below is a snapshot of the game.
Figure 1. Picture of game with feedback from camera top left.
Motivation
There are several motion tracking devices on the market today, e.g. Microsoft’s Xbox
Kinect and Sony’s PlayStation Camera. These devices track gestures and motion to create
highly interactive environments, using extremely complex algorithms to do so. All of us took
this course to learn about FPGAs and hardware in depth, but the reason we chose this project
was to understand motion tracking and how it is done in hardware versus software. We
wanted to create a game that used raw hardware algorithms to track motion and then fed
that tracking into gameplay. Initially we wanted to create a ping pong game, but we decided
on a punching game for its interactive and demonstrative appeal.
Goals
The goals of the project evolved over time but ended up as follows.
Animate the screen with two on screen characters
Create AI for the computer controlled character or player 2
Receive a video feed from the PMOD Camera and process it to extract the (x,y)
coordinates of two patches of red and green colour
Use the input from processed video to control player 1 on the screen
The players will be able to punch or duck to avoid a punch from the opponent
The punch will have a variable strength depending on the extension of the arm
Create a suitable game logic to determine winner
Block Diagram
This is the final block diagram that was implemented.
Figure 2. Block Diagram Final
Original Design
The original block design is shown below. The original design specifications were as follows.
a. The game shall interpret two motions. One to take the duck action and the
other to take the punch action.
i. At least 3 LEDs will be required.
ii. A camera for video input to the FPGA will be required
iii. Memory will be required to store the video input
b. The interpreted motion of the LED will be used to show and animate the
figures on screen.
i. A VGA monitor will be required
Furthermore, it was established that the punch will not be binary but its strength will be
determined by the velocity of the arm movement.
Figure 3. Original Block Design
Final Design
This is the block diagram for the final setup that we implemented.
Figure 4. Final Block Design
Differences
The final design differs from the original in three ways, outlined below.
Due to the lack of a sufficiently robust response from the motion tracking algorithm, we
could not implement punches with strength proportional to the velocity of the tracking
patches. Instead we used distance to determine the strength of the punches: the distance
between the two patches, one on the upper arm and the other on the outside of the palm,
determines the strength. The greater the distance at the time of capture, the greater the
damage done by the punch.
The audio module was omitted due to lack of time; that time was instead spent on the
tracking algorithm.
The VGA controller and the TFT controller were shown as two separate modules on the
original diagram. These are in fact the same module, which was corrected on the final block
diagram. This was not a design change but a technical correction.
Things To Do Differently
Start development of the tracking algorithm earlier, in parallel with the rest of the
system. We believed we needed the live camera feed to do this, but the algorithm could
have been developed and tested separately by using the UART to load memory with
custom images and processing those.
Use a proper source code control system. Google Drive is not a substitute for version
control such as SVN.
If someone picks up this project from here, we recommend timing the Delta_Calc
module to determine how frequently it can be run, and hence make the system more
robust. We also recommend keeping the camera output in RGB888 and letting the VGA
stage reduce it to 4 bits per channel later. This would allow the same camera to give a
better output.
Original Milestones
Week 1 - (Feb. 10)
Setup all animations and use simple buttons to implement the actions for those
animations. Test the animations out thoroughly.
Week 2 - (Feb. 17)
Implement a simple dodge game where a simple AI algorithm throws random punches
at you and you have to dodge them. The player has only one action at this point to dodge
the punches. The punches will get progressively faster and the longer a player survives
the higher the score.
Week 3 - (Feb. 24)
Implement punch/duck animations and detect collisions and misses for all three
functionalities. Use a keyboard for input and implement all these actions with a simple
keyboard interrupt.
Week 4 - (Mar. 3)
Introduce the motion detection via camera feedback system to control the character.
This will involve more complex ways to get our input to the game and make the game
more interactive.
Week 5 - (Mar. 10)
Put everything together to form a game where a person has to duck/punch and fight
against an opponent.
Revised Milestones
Week 1 - (Feb. 10)
Completed TFT module and tested it works with monitor output.
Week 2 - (Feb. 17)
Further tested the TFT module; it now works perfectly when controlled by the MicroBlaze.
Beginning work on the ov7670_top module.
Week 3 - (Feb. 24)
Revised approach to the ov7670_top module; it now uses VDMA instead of direct
memory access via the AXI-Lite protocol.
Week 4 - (Mar. 3)
Completed 2 basic character models representing the fighters
Completed character animations that include ducking, punching, and a fighting stance
Week 5 - (Mar. 10)
Completed 2 basic health bars representing the players' health
Started working on a background including clouds and moon
Week 6 - (Mar. 17)
Completed background
Completed work on OV7670 to VDMA module in 640x480 (VGA) resolution.
Completed a MATLAB algorithm for motion tracking
Began implementation of the MATLAB algorithm in Verilog (the delta_calc module)
Week 7 - (Mar. 24)
The health bar decrements when the player is hit
Animated what happens when both players punch each other at the same time
Completed game logic including an ending screen that displays the winner when a
player runs out of health
Continued implementation of delta_calc module
Week 8 - (Mar. 31)
The program simulates user input using the rand() function
Further progress on implementation of delta_calc module, FSM written, waiting to be
tested. Progress slow due to design fair.
Week 9 - (Apr. 7)
Integration between custom IP (Delta_Calc) and game logic completed
Delta_calc successfully completed and tested
Demo day, system fully functional
Description of IP Blocks
Microblaze (Xilinx IP Library): Provides a processor for the software layer. Initializes
several hardware modules (VDMA, TFT, Delta_Calc) and is responsible for game logic
and drawing.
Uartlite (Xilinx IP Library): Provides a command-line interface between the computer
and the Nexys 4 DDR board.
axi_tft (Xilinx IP Library): Provides the interface between a memory-mapped location
and the VGA signals output to the monitor. A crucial component of our design; it serves
as our main user interface.
axi_vdma (Xilinx IP Library): Provides the interface between stream and memory-mapped
bus channels. Enables us to easily transmit data from the OV7670 camera to an
arbitrary memory location.
ov7670_top (Modified code based on Professor Chow's original OV7670 design and [1]):
Provides the interface between the camera module and the VDMA, transmitting the
correct signals at the correct timing to allow the camera and the VDMA to talk. More
information below.
delta_calc (Original hardware block): Reads video from DDR2 memory and processes it
using colour tracking algorithms to detect red and green blocks. Outputs the x,y
coordinates of those blocks for use by the software game logic. More information below.
mig_7series (Xilinx IP Library): Memory controller.
Slice (Xilinx IP Library): Used for cutting down bus sizes.
Utility Vector Logic (Xilinx IP Library): Used for inverting signals (mainly OV7670_PCLK).
Below are two of the major modules we worked on throughout the semester; we would like to
introduce them in greater depth.
OV7670_top Custom IP
Fig 5. In-depth look at the submodules that make up the ov7670_top module. They include the I2C look-up
table and I2C controller derived from Professor Chow's camera example, and borrowed code from [1] for
the ov7670_capture module.
The OV7670_top module is the connection point between the camera hardware and the VDMA,
which converts the streamed camera bits and memory-maps them to a specified location.
Testing of this block was a large portion of the project, and the team tested in stages to verify
its correct functionality. Initially, work was done trying to connect the ov7670_top block
directly to memory using the AXI-Lite protocol. After consulting with TA Charles, this
approach was abandoned in favour of the more efficient stream to memory-mapped (S2MM)
method using the VDMA.
Inside the block, one can expect to find code stitched together from two sources. Essentially,
the team was tasked with modifying Professor Chow's code to work with the VDMA instead of
writing directly to the BRAM. We kept the I2C controller and I2C LUT roughly the same, and
connected the same external signals that they required from the camera.
The team had to read up on the documentation of the VDMA controller to find out which
signals were significant. Eventually we found that m_axis_tlast, m_axis_tuser,
m_axis_tvalid and, of course, m_axis_tdata[31:0] were the signals controlling the end of a
line, vsync, data validity, and data transmission.
The ov7670_capture module is a slightly modified version of a piece of code we found online
at [1]. This VHDL code generated the signals we needed at exactly the right timings to talk to
the VDMA. The team stumbled upon this code while looking for timings between the OV7670
and the Xilinx VDMA IP; fortunately, [1] described a similar setup connecting an OV7670
module to a Zynq board using the same timings.
Our most significant change to this portion of the code is:
-- Expand the 4-bit RGB channels in d_latch to 32-bit ARGB (8:8:8:8)
m_axis_tdata <= "11111111" & d_latch(11 downto 8) & "0000" &
                d_latch( 7 downto 4) & "0000" &
                d_latch( 3 downto 0) & "0000";
This piece of code controls how the output data is formatted from the data coming in from the
camera. The original code at [1] packed the pixels as RGB 8:8:8:8; we made small
modifications to fit our 4-bit-per-channel VGA standard.
Delta_Calc Custom IP
Fig 6. In depth look at the delta calc module.
The Delta_Calc custom IP is our own hardware block. It was coded entirely by the team,
with the only outside assistance being Xilinx's auto-generated code for initializing the slave
and master interfaces of a custom IP block. The block talks to the MicroBlaze processor
through eight memory-mapped slave registers, and reads from memory the video data placed
by the VDMA at a fixed location (0x81000000).
Delta_Calc & Microblaze Connection
Overview of the input and output registers:
slv_reg0: Initialization register; used to start one cycle of the hardware module.
slv_reg1: Debug register; used for outputting various data, such as FSM states, for debugging.
slv_reg2: Red threshold register; sets the thresholds for detecting the red colour. The input
register has the format: 0x00, red channel threshold (>=), green channel threshold (<=),
blue channel threshold (<=).
slv_reg3: Green threshold register; sets the thresholds for detecting the green colour. The
input register has the format: 0x00, red channel threshold (<=), green channel threshold
(>=), blue channel threshold (<=).
slv_reg4: Unused.
slv_reg5: Red output register; outputs the red x,y coordinates in the format: 16-bit red y
coordinate, 16-bit red x coordinate.
slv_reg6: Green output register; outputs the green x,y coordinates in the format: 16-bit
green y coordinate, 16-bit green x coordinate.
slv_reg7: System done register; only the bottom three bits are significant, and we use only
the bottom bit, which indicates whether the system has finished.
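The software side of one tracking cycle can be sketched in C. This is a hypothetical sketch, not the project's driver code: we assume slv_regN sits at base + 4*N so a pointer to the base address can be indexed as regs[N], and the threshold values (0xC0, 0x60) are illustrative, not the project's tuning.

```c
#include <stdint.h>

/* Pack a threshold word in the slv_reg2/slv_reg3 format:
 * 0x00, red threshold, green threshold, blue threshold. */
static uint32_t pack_threshold(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}

/* Unpack an output word (slv_reg5/slv_reg6):
 * 16-bit y coordinate in the top half, 16-bit x in the bottom half. */
static void unpack_coord(uint32_t word, int *x, int *y)
{
    *y = (int)(word >> 16);
    *x = (int)(word & 0xFFFFu);
}

/* Run one tracking cycle: set thresholds, start, poll done, read back. */
static void run_delta_calc(volatile uint32_t *regs,
                           int *rx, int *ry, int *gx, int *gy)
{
    regs[2] = pack_threshold(0xC0, 0x60, 0x60); /* red:   R >=, G <=, B <= */
    regs[3] = pack_threshold(0x60, 0xC0, 0x60); /* green: R <=, G >=, B <= */
    regs[0] = 1;                   /* start one cycle */
    while ((regs[7] & 1u) == 0)    /* wait on the done bit */
        ;
    unpack_coord(regs[5], rx, ry);
    unpack_coord(regs[6], gx, gy);
}
```

On the board, regs would point at the block's base address; off the board, the helpers can be exercised against a plain eight-word array standing in for the register file.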
The module is not independent of the MicroBlaze processor; it relies on the processor to
start it for a single cycle each time the block is run. We chose this design because we want to
process every 10th frame, and the easiest way to count frames is via write-done interrupts
from the VDMA. The processor therefore acts as a counter, counting to ten and then starting
the delta_calc module.
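That counting scheme can be sketched in C. The handler name and return-value convention are ours: on the board this logic would live in the VDMA write-done interrupt handler, and returning 1 stands in for the register write that starts Delta_Calc.

```c
/* Frame counter driven by the VDMA write-done interrupt: every 10th
 * frame it signals that one Delta_Calc cycle should be started. */
static int frame_count = 0;

static int vdma_write_done_handler(void)
{
    if (++frame_count >= 10) {
        frame_count = 0;
        return 1;   /* caller starts the tracker on this frame */
    }
    return 0;
}
```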
Delta_Calc Colour Detection Algorithm
We will give a high-level explanation of the algorithm implemented within Delta_Calc.
Fig 7. High Level look at the algorithm implemented within the delta_calc module for colour detection.
The concept is similar to the more complicated algorithm we first implemented in MATLAB.
The 320x240 screen is split into 32x24 blocks of 10x10 pixels each. Green and red thresholds
are set through the MicroBlaze, and the algorithm uses them to judge what counts as red and
what counts as green.
The outer layers of the algorithm iterate through the blocks: an FSM implements a 32x24
iteration over them. For each block, an inner FSM computes, via multiplication from the
initial memory address (0x81000000), the starting address of the top-left corner of the block.
It then iterates through the 10x10 pixel area, checking whether each pixel meets the red or
green criteria. Each pixel that satisfies a condition increments a red or green counter that is
reset for each block. Should the red/green count exceed 50% of the block (more than 50
pixels), that block is considered red/green, and its x and y coordinates are output as the
tracked red/green block.
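The nested iteration can be modelled in C as follows. This is a software model of the two FSMs, not the Verilog itself; the function and helper names are ours, and the example threshold values are illustrative.

```c
#include <stdint.h>

#define FRAME_W 320
#define FRAME_H 240
#define BLOCK   10   /* 10x10-pixel blocks give a 32x24 grid */

/* Threshold test for a red pixel: R >= r_min, G <= g_max, B <= b_max.
 * Pixels are modelled as 0x00RRGGBB words. */
static int is_red(uint32_t px, uint8_t r_min, uint8_t g_max, uint8_t b_max)
{
    uint8_t r = (px >> 16) & 0xFF, g = (px >> 8) & 0xFF, b = px & 0xFF;
    return r >= r_min && g <= g_max && b <= b_max;
}

/* Scan the 32x24 grid; for each block, count pixels passing the red test
 * and report the grid coordinates of the first block whose count exceeds
 * 50 pixels (50% of the block). Returns 1 if such a block is found. */
static int find_red_block(const uint32_t *frame, int *bx, int *by)
{
    for (int gy = 0; gy < FRAME_H / BLOCK; gy++)
        for (int gx = 0; gx < FRAME_W / BLOCK; gx++) {
            int count = 0;   /* reset for each block */
            for (int y = 0; y < BLOCK; y++)
                for (int x = 0; x < BLOCK; x++) {
                    uint32_t px = frame[(gy * BLOCK + y) * FRAME_W
                                        + gx * BLOCK + x];
                    count += is_red(px, 0xC0, 0x60, 0x60);
                }
            if (count > 50) { *bx = gx; *by = gy; return 1; }
        }
    return 0;
}
```

The hardware avoids the row-major multiply per pixel by computing only the block's top-left address with a multiplication and stepping through the 10x10 area from there.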
Delta_Calc testing
Testing was done in phases; we needed to build confidence that each layer of our algorithm
worked correctly. We separated testing into three stages:
Stage 1
Stage one simply tested the read/write capabilities of our block: we wanted to read one
piece of data from one memory location, hold it in a register inside our block, and write it on
the next cycle to another memory location.
Stage 2
In this stage full read/write functionality was implemented and tested through copying a large
chunk of data from one location to another.
Stage 3
In the final stage of testing, the algorithm was written and needed to be tested. We placed
perfect data consisting of pure red and green blocks in certain locations and checked whether
the blocks could be detected by the algorithm.
Software Level Design
The software was written in one file called helloworld.c that was run on the microblaze
processor. It has 3 major components: animation, game logic, and the integration of
hardware blocks into the program. The integration of hardware blocks involved
initializing and configuring the TFT controller, the VDMA, and the Delta_Calc hardware
blocks to function with the rest of the program. This integration was done by reading
and writing the appropriate memory addresses and registers for output from the
hardware (i.e. done signals, red_x, red_y, green_x, green_y, etc.).
The animations were performed through the use of 14 functions all of which used the
TFT controller’s XTft_SetPixel function. The following are the prototypes of the 14
functions:
1. void drawBox(int x, int y, int length, int height, int color);
2. void drawLeftRightDescent(int x, int starty, int endy, int thickness, int color);
3. void drawLeftRightAscent(int x, int starty, int endy, int thickness, int color);
4. void drawCircle(int centerx, int centery, int radius, int color);
5. void drawCloud1(int x, int y);
6. void drawCloud2(int x, int y);
7. void P1Stance(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
8. void P2Stance(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
9. void P1Punch(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
10. void P2Punch(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
11. void P1Duck(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
12. void P2Duck(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
13. void P1Erase(int* P1pstate, int* P1state, int* P2pstate);
14. void P2Erase(int* P2pstate, int* P2state, int* P1pstate);
The drawBox function is the simplest of these animation functions, and the other
13 follow a similar structure or depend on it. It is as follows:

void drawBox(int x, int y, int length, int height, int color)
{
    int i, j;
    for (i = x; i <= x + length; i++)
        for (j = y; j <= y + height; j++)
            XTft_SetPixel(&TftInstance, i, j, color);
}
As shown in the code above, the XTft_SetPixel function was utilized to draw shapes
pixel by pixel in for loops and this was the technique used for all of the animation.
The game logic portion of the program was essentially a while loop that ran until one
player's health reached zero.
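As a rough illustration of the distance-based damage this loop applies, here is a hedged C sketch. The Manhattan distance metric, the divisor, and the 30 HP cap are our own illustrative choices, not the report's exact formula:

```c
#include <stdlib.h>

/* Illustrative model of distance-scaled punch damage: the farther apart
 * the red (upper arm) and green (palm) patches at capture time, the
 * harder the punch. Distance metric and constants are assumptions. */
static int punch_damage(int rx, int ry, int gx, int gy)
{
    int dist = abs(rx - gx) + abs(ry - gy);
    int dmg = dist / 2;
    return dmg > 30 ? 30 : dmg;   /* cap a single punch */
}
```

In the real loop, (rx, ry) and (gx, gy) would be read from the Delta_Calc output registers each time a punch lands, and the result subtracted from the opponent's health until one player runs out.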
Description of Design Tree
All source code and documentation for our design, both hardware and software can be
found here: https://github.com/qoire/motion_controlled_fighting_game
Conclusion
In conclusion, we set out to design a game in hardware using Xilinx IPs and a custom IP.
Although we changed the specifications of the game and adjusted its logic in response to the
problems we faced over the course of the term, we were able to implement the intended
design. We had to cut the audio codec due to concerns about getting the tracking module
working, but in the end we demonstrated the tracking of objects from a camera feed. We
believe that with a better camera we could have achieved better and more robust results.
However, given the unexpected complexity of the hardware system and the time constraints
imposed by the duration of bitstream generation, we feel we did a good job. We have learnt
valuable lessons in hardware development and value the extensive, in-depth exposure we
received to Xilinx's Vivado design suite.
References
[1] L. Võsandi, 'Lauri's blog | Video capture with VDMA', lauri.xn--vsandi-pxa.com, 2015.
[Online]. Available: http://lauri.xn--vsandi-pxa.com/hdl/zynq/xilinx-video-capture.html.
[Accessed: 10-Apr-2015].