ECE532
Final Report
Motion Controlled Punching Game
10th April 2015
Syed Talal Ashraf
Syed Muhammad Adnan Karim
Yao Sun
Contents
Overview
Description
Motivation
Goals
Block Diagram
Outcome
Original Design
Final Design
Differences Outlined
Project Schedule
Original Milestones from Document
Revised Milestones
Description of Blocks
Description of IPs Used
Delta Calc Custom IP
Description of Design Tree
References
Description
The Motion Controlled Punching Game is a “Rock Em Sock Em” style punching game. A
player contests against a CPU-controlled opponent. The fighting characters are displayed on a
VGA screen, with the feed from a camera shown in the corner of the screen. Player input is
obtained by motion tracking of coloured patches attached to the upper arm and the outside of
the palm of the user. The scoring system measures the distance between the two patches to
determine the extension of the arm and scales the damage accordingly: the greater the
extension, the more damage is done.
A PMOD Camera is used to obtain visual input from the user. The camera operates at
320x240 pixels, and its output is processed by a custom IP block to track motion.
Below is a snapshot of the game.
Figure 1. Picture of game with feedback from camera top left.
Motivation
There are several motion tracking devices on the market today, e.g. Microsoft’s Xbox
Kinect and Sony’s PlayStation Camera. These devices track gestures and motion to create
highly interactive environments, using extremely complex algorithms to do so. All of us took
this course to learn about FPGAs and hardware in depth, but the reason we chose this project
was to understand motion tracking and how it is done in hardware versus software. We
wanted to create a game that used raw hardware algorithms to track motion and then fed
that tracking into gameplay. Initially we wanted to create a ping pong game, but we decided
on a punching game for its interactive and demonstrative appeal.
Goals
The goals of the project evolved over time but ended up as follows.
Animate the screen with two on screen characters
Create AI for the computer controlled character or player 2
Receive a video feed from the PMOD Camera and process it to extract the (x,y)
coordinates of two patches of red and green colour
Use the input from processed video to control player 1 on the screen
The players will be able to punch or duck to avoid a punch from the opponent
The punch will have a variable strength depending on the extension of the arm
Create a suitable game logic to determine winner
Block Diagram
This is the final block diagram that was implemented.
Figure 2. Block Diagram Final
Original Design
The original block design is shown below. The original design specifications were as follows.
a. The game shall interpret two motions. One to take the duck action and the
other to take the punch action.
i. At least 3 LEDs will be required.
ii. A camera for video input to the FPGA will be required
iii. Memory will be required to store the video input
b. The interpreted motion of the LED will be used to show and animate the
figures on screen.
i. A VGA monitor will be required
Furthermore, it was established that the punch will not be binary but its strength will be
determined by the velocity of the arm movement.
Figure 3. Original Block Design
Final Design
This is the block diagram for the final setup that we implemented.
Figure 4. Final Block Design
Differences
The final design differs from the original in three ways, outlined below.
Due to the lack of a sufficiently robust response from the motion tracking algorithm, we
could not implement punches with strength proportional to the velocity of the tracking
patches. Instead we used distance to determine the strength of the punches: the distance
between the two patches, one on the upper arm and the other on the outside of the palm,
determines the strength. The greater the distance at the time of capture, the greater the
damage done by the punch.
The audio module was omitted due to lack of time; that time was instead spent on the
tracking algorithm.
The VGA controller and the TFT controller were shown as two separate modules on the
original diagram. These are in fact the same module, which was corrected on the final block
diagram. This was not a design change but a technical correction.
Things To Do Differently
Start development of the tracking algorithm earlier, in parallel with the rest of the
system. We believed we needed the live camera feed to do this, but the algorithm could
have been developed and tested separately by using the UART to load memory with
custom images and processing those.
Use a proper source code control system. Google Drive is not a substitute for version
control such as SVN.
If someone picks up this project from here, we recommend timing the Delta_Calc
module to determine how frequently it can be run, and hence make the system more
robust. We also recommend keeping the camera output in RGB888 and letting the VGA
stage reduce it to 4 bits per channel later. This would allow the same camera to give a
better output.
Original Milestones
Week 1 - (Feb. 10)
Setup all animations and use simple buttons to implement the actions for those
animations. Test the animations out thoroughly.
Week 2 - (Feb. 17)
Implement a simple dodge game where a simple AI algorithm throws random punches
at you and you have to dodge them. The player has only one action at this point to dodge
the punches. The punches will get progressively faster and the longer a player survives
the higher the score.
Week 3 - (Feb. 24)
Implement punch/duck animations and detect collisions and misses for all three
functionalities. Use a keyboard for input and implement all these actions with a simple
keyboard interrupt.
Week 4 - (Mar. 3)
Introduce the motion detection via camera feedback system to control the character.
This will involve more complex ways to get our input to the game and make the game
more interactive.
Week 5 - (Mar. 10)
Put everything together to form a game where a person has to duck/punch and fight
against an opponent.
Revised Milestones
Week 1 - (Feb. 10)
Completed TFT module and tested it works with monitor output.
Week 2 - (Feb. 17)
Further tested the TFT module; it now works perfectly when controlled by the MicroBlaze.
Beginning work on the ov7670_top module.
Week 3 - (Feb. 24)
Revised approach to the ov7670_top module; it now uses VDMA instead of direct
memory access via the AXI-Lite protocol.
Week 4 - (Mar. 3)
Completed 2 basic character models representing the fighters
Completed character animations that include ducking, punching, and a fighting stance
Week 5 - (Mar. 10)
Completed 2 basic health bars representing the players' health
Started working on a background including clouds and moon
Week 6 - (Mar. 17)
Completed background
Completed work on OV7670 to VDMA module in 640x480 (VGA) resolution.
Completed a MATLAB algorithm for motion tracking
Began implementation of the MATLAB algorithm in Verilog (the delta_calc module)
Week 7 - (Mar. 24)
The health bar decrements when the player is hit
Animated what happens when both players punch each other at the same time
Completed game logic including an ending screen that displays the winner when a
player runs out of health
Continued implementation of delta_calc module
Week 8 - (Mar. 31)
The program simulates user input using the rand() function
Further progress on implementation of delta_calc module, FSM written, waiting to be
tested. Progress slow due to design fair.
Week 9 - (Apr. 7)
Integration between custom IP (Delta_Calc) and game logic completed
Delta_calc successfully completed and tested
Demo day, system fully functional
Description of IP Blocks
Microblaze (Xilinx IP Library): Provides a processor for the software layer. Initializes
several hardware modules (VDMA, TFT, Delta_Calc) and is responsible for game logic
and drawing.
Uartlite (Xilinx IP Library): Provides a command-line interface between the computer
and the Nexys 4 DDR board.
axi_tft (Xilinx IP Library): Provides the interface between a memory-mapped location
and the VGA signals output to the monitor. A crucial component of our design; it serves
as our main user interface.
axi_vdma (Xilinx IP Library): Provides the interface between stream and memory-mapped
bus channels. Enables us to easily transmit data from the OV7670 camera to an
arbitrary memory location.
ov7670_top (Modified code based on Professor Chow's original OV7670 design and [1]):
Provides the interface between the camera module and the VDMA, transmitting the
correct signals at the correct timing to allow the camera and the VDMA to talk. More
information below.
delta_calc (Original hardware block): Reads video from DDR2 memory and processes it
using colour tracking algorithms to detect red and green blocks. Outputs the x,y
coordinates of those blocks for use by the software game logic. More information below.
mig_7series (Xilinx IP Library): Memory controller.
Slice (Xilinx IP Library): Used for cutting down bus sizes.
Utility Vector Logic (Xilinx IP Library): Used for inverting signals (mainly OV7670_PCLK).
Below are two of the major modules we worked on throughout the semester; we would like to
introduce them in greater depth.
OV7670_top Custom IP
Fig 5. In-depth look at the submodules that make up the ov7670_top module. They include the I2C look-up
table and I2C controller derived from Professor Chow's camera example, and borrowed code from [1] for
the ov7670_capture module.
The OV7670_top module is the connection point between the camera hardware and the VDMA,
which converts the streamed camera bits and memory-maps them to a specified location.
Testing of this block was a large portion of the project, and the team tested in stages to verify
its correct functionality. Initially, work was done trying to connect the ov7670_top block
directly to memory using the AXI-Lite protocol. After consulting with TA Charles, this
approach was abandoned in favour of the more efficient stream to memory-mapped (S2MM)
method using the VDMA.
Inside the block, one can expect to find code stitched together from two sources. Essentially,
the team was tasked with modifying Professor Chow's code to work with the VDMA instead of
writing directly to the BRAM. We kept the I2C controller and I2C LUT roughly the same, and
connected the same external signals that they required from the camera.
The team had to read up on the documentation of the VDMA controller to find out which
signals were significant. Eventually we found that m_axis_tlast, m_axis_tuser,
m_axis_tvalid and, of course, m_axis_tdata[31:0] were the signals controlling the end of a
line, vsync, data validity, and data transmission.
The ov7670_capture module is a slightly modified version of a piece of code we found online
at [1]. This VHDL code generated the signals we needed at exactly the right timings to talk to
the VDMA. The team stumbled upon this code while looking for timings between the OV7670
and the Xilinx VDMA IP; fortunately, [1] described a similar setup connecting an OV7670
module to a Zynq board using the same timings.
Our most significant change to this portion of the code is:
-- Expand the 4-bit RGB channels in d_latch to 32-bit ARGB (8:8:8:8)
m_axis_tdata <= "11111111" & d_latch(11 downto 8) & "0000" &
                d_latch( 7 downto 4) & "0000" &
                d_latch( 3 downto 0) & "0000";
This piece of code controls how the output data is formatted from the data coming in from the
camera. The original code at [1] packed the pixels as RGB 8:8:8:8; we made small
modifications to fit our 4-bit-per-channel VGA standard.
Delta_Calc Custom IP
Fig 6. In depth look at the delta calc module.
The Delta_Calc custom IP is our own hardware block. It was coded entirely by the team,
with the only outside assistance being Xilinx's auto-generated code for initializing the slave
and master interfaces of a custom IP block. The block talks to the MicroBlaze processor
through eight memory-mapped slave registers, and reads from memory the video data placed
by the VDMA at a fixed location (0x81000000).
Delta_Calc & Microblaze Connection
Overview of the input and output registers:
slv_reg0: Initialization register; used to start one cycle of the hardware module.
slv_reg1: Debug register; used for outputting various data, such as FSM states, for debugging.
slv_reg2: Red threshold register; sets the thresholds for detecting the red colour. The input
register has the format: 0x00, red channel threshold (>=), green channel threshold (<=),
blue channel threshold (<=).
slv_reg3: Green threshold register; sets the thresholds for detecting the green colour. The
input register has the format: 0x00, red channel threshold (<=), green channel threshold
(>=), blue channel threshold (<=).
slv_reg4: Unused.
slv_reg5: Red output register; outputs the red x,y coordinates in the format: 16-bit red y
coordinate, 16-bit red x coordinate.
slv_reg6: Green output register; outputs the green x,y coordinates in the format: 16-bit
green y coordinate, 16-bit green x coordinate.
slv_reg7: System done register; only the bottom three bits are significant, and we use only
the bottom bit, which indicates whether the system has finished.
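The software side of one tracking cycle can be sketched in C. This is a hypothetical sketch, not the project's driver code: we assume slv_regN sits at base + 4*N so a pointer to the base address can be indexed as regs[N], and the threshold values (0xC0, 0x60) are illustrative, not the project's tuning.

```c
#include <stdint.h>

/* Pack a threshold word in the slv_reg2/slv_reg3 format:
 * 0x00, red threshold, green threshold, blue threshold. */
static uint32_t pack_threshold(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
}

/* Unpack an output word (slv_reg5/slv_reg6):
 * 16-bit y coordinate in the top half, 16-bit x in the bottom half. */
static void unpack_coord(uint32_t word, int *x, int *y)
{
    *y = (int)(word >> 16);
    *x = (int)(word & 0xFFFFu);
}

/* Run one tracking cycle: set thresholds, start, poll done, read back. */
static void run_delta_calc(volatile uint32_t *regs,
                           int *rx, int *ry, int *gx, int *gy)
{
    regs[2] = pack_threshold(0xC0, 0x60, 0x60); /* red:   R >=, G <=, B <= */
    regs[3] = pack_threshold(0x60, 0xC0, 0x60); /* green: R <=, G >=, B <= */
    regs[0] = 1;                   /* start one cycle */
    while ((regs[7] & 1u) == 0)    /* wait on the done bit */
        ;
    unpack_coord(regs[5], rx, ry);
    unpack_coord(regs[6], gx, gy);
}
```

On the board, regs would point at the block's base address; off the board, the helpers can be exercised against a plain eight-word array standing in for the register file.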
The module is not independent of the MicroBlaze processor; it relies on the processor to
start it for a single cycle each time the block is run. We chose this design because we want to
process every 10th frame, and the easiest way to count frames is via write-done interrupts
from the VDMA. The processor therefore acts as a counter, counting to ten and then starting
the delta_calc module.
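That counting scheme can be sketched in C. The handler name and return-value convention are ours: on the board this logic would live in the VDMA write-done interrupt handler, and returning 1 stands in for the register write that starts Delta_Calc.

```c
/* Frame counter driven by the VDMA write-done interrupt: every 10th
 * frame it signals that one Delta_Calc cycle should be started. */
static int frame_count = 0;

static int vdma_write_done_handler(void)
{
    if (++frame_count >= 10) {
        frame_count = 0;
        return 1;   /* caller starts the tracker on this frame */
    }
    return 0;
}
```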
Delta_Calc Colour Detection Algorithm
We will give a high-level explanation of the algorithm implemented within Delta_Calc.
Fig 7. High Level look at the algorithm implemented within the delta_calc module for colour detection.
The concept is similar to the more complicated algorithm we first implemented in MATLAB.
The 320x240 screen is split into 32x24 blocks of 10x10 pixels each. Green and red thresholds
are set through the MicroBlaze, and the algorithm uses them to judge what counts as red and
what counts as green.
The outer layers of the algorithm iterate through the blocks: an FSM implements a 32x24
iteration over them. For each block, an inner FSM computes, via multiplication from the
initial memory address (0x81000000), the starting address of the top-left corner of the block.
It then iterates through the 10x10 pixel area, checking whether each pixel meets the red or
green criteria. Each pixel that satisfies a condition increments a red or green counter that is
reset for each block. Should the red/green count exceed 50% of the block (more than 50
pixels), that block is considered red/green, and its x and y coordinates are output as the
tracked red/green block.
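The nested iteration can be modelled in C as follows. This is a software model of the two FSMs, not the Verilog itself; the function and helper names are ours, and the example threshold values are illustrative.

```c
#include <stdint.h>

#define FRAME_W 320
#define FRAME_H 240
#define BLOCK   10   /* 10x10-pixel blocks give a 32x24 grid */

/* Threshold test for a red pixel: R >= r_min, G <= g_max, B <= b_max.
 * Pixels are modelled as 0x00RRGGBB words. */
static int is_red(uint32_t px, uint8_t r_min, uint8_t g_max, uint8_t b_max)
{
    uint8_t r = (px >> 16) & 0xFF, g = (px >> 8) & 0xFF, b = px & 0xFF;
    return r >= r_min && g <= g_max && b <= b_max;
}

/* Scan the 32x24 grid; for each block, count pixels passing the red test
 * and report the grid coordinates of the first block whose count exceeds
 * 50 pixels (50% of the block). Returns 1 if such a block is found. */
static int find_red_block(const uint32_t *frame, int *bx, int *by)
{
    for (int gy = 0; gy < FRAME_H / BLOCK; gy++)
        for (int gx = 0; gx < FRAME_W / BLOCK; gx++) {
            int count = 0;   /* reset for each block */
            for (int y = 0; y < BLOCK; y++)
                for (int x = 0; x < BLOCK; x++) {
                    uint32_t px = frame[(gy * BLOCK + y) * FRAME_W
                                        + gx * BLOCK + x];
                    count += is_red(px, 0xC0, 0x60, 0x60);
                }
            if (count > 50) { *bx = gx; *by = gy; return 1; }
        }
    return 0;
}
```

The hardware avoids the row-major multiply per pixel by computing only the block's top-left address with a multiplication and stepping through the 10x10 area from there.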
Delta_Calc testing
Testing was done in phases; we needed to build confidence that each layer of our algorithm
worked correctly. We separated testing into three stages:
Stage 1
Stage one simply tested the read/write capabilities of our block: we wanted to read one
piece of data from one memory location, hold it in a register inside our block, and write it on
the next cycle to another memory location.
Stage 2
In this stage full read/write functionality was implemented and tested through copying a large
chunk of data from one location to another.
Stage 3
In the final stage of testing, the algorithm was written and needed to be tested. We placed
perfect data consisting of pure red and green blocks in certain locations and checked whether
the blocks could be detected by the algorithm.
Software Level Design
The software was written in one file called helloworld.c that was run on the microblaze
processor. It has 3 major components: animation, game logic, and the integration of
hardware blocks into the program. The integration of hardware blocks involved
initializing and configuring the TFT controller, the VDMA, and the Delta_Calc hardware
blocks to function with the rest of the program. This integration was done by reading
and writing the appropriate memory addresses and registers for output from the
hardware (i.e. done signals, red_x, red_y, green_x, green_y, etc.).
The animations were performed through the use of 14 functions all of which used the
TFT controller’s XTft_SetPixel function. The following are the prototypes of the 14
functions:
1. void drawBox(int x, int y, int length, int height, int color);
2. void drawLeftRightDescent(int x, int starty, int endy, int thickness, int color);
3. void drawLeftRightAscent(int x, int starty, int endy, int thickness, int color);
4. void drawCircle(int centerx, int centery, int radius, int color);
5. void drawCloud1(int x, int y);
6. void drawCloud2(int x, int y);
7. void P1Stance(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
8. void P2Stance(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
9. void P1Punch(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
10. void P2Punch(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
11. void P1Duck(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
12. void P2Duck(int* P1pstate, int* P2pstate, int* P1state, int* P2state);
13. void P1Erase(int* P1pstate, int* P1state, int* P2pstate);
14. void P2Erase(int* P2pstate, int* P2state, int* P1pstate);
The drawBox function is the simplest of these animation functions, and the other
13 follow a similar structure or depend on it. It is as follows:

void drawBox(int x, int y, int length, int height, int color)
{
    int i, j;
    for (i = x; i <= x + length; i++)
        for (j = y; j <= y + height; j++)
            XTft_SetPixel(&TftInstance, i, j, color);
}
As shown in the code above, the XTft_SetPixel function was utilized to draw shapes
pixel by pixel in for loops and this was the technique used for all of the animation.
The game logic portion of the program was essentially a while loop that ran until one
player's health reached zero.
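As a rough illustration of the distance-based damage this loop applies, here is a hedged C sketch. The Manhattan distance metric, the divisor, and the 30 HP cap are our own illustrative choices, not the report's exact formula:

```c
#include <stdlib.h>

/* Illustrative model of distance-scaled punch damage: the farther apart
 * the red (upper arm) and green (palm) patches at capture time, the
 * harder the punch. Distance metric and constants are assumptions. */
static int punch_damage(int rx, int ry, int gx, int gy)
{
    int dist = abs(rx - gx) + abs(ry - gy);
    int dmg = dist / 2;
    return dmg > 30 ? 30 : dmg;   /* cap a single punch */
}
```

In the real loop, (rx, ry) and (gx, gy) would be read from the Delta_Calc output registers each time a punch lands, and the result subtracted from the opponent's health until one player runs out.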
Description of Design Tree
All source code and documentation for our design, both hardware and software can be
found here: https://github.com/qoire/motion_controlled_fighting_game
Conclusion
In conclusion, we set out to design a game in hardware using Xilinx IPs and a custom IP.
Although we changed the specifications of the game and adjusted its logic in response to the
problems we faced over the course of the term, we were able to implement the intended
design. We had to cut the audio codec due to concerns about getting the tracking module
working, but in the end we demonstrated the tracking of objects from a camera feed. We
believe that with a better camera we could have achieved better and more robust results.
However, given the unexpected complexity of the hardware system and the time constraints
imposed by the duration of bitstream generation, we feel we did a good job. We have learnt
valuable lessons in hardware development and value the extensive, in-depth exposure we
received to Xilinx's Vivado design suite.
References
[1] L. Võsandi, 'Lauri's blog | Video capture with VDMA', lauri.xn--vsandi-pxa.com, 2015.
[Online]. Available: http://lauri.xn--vsandi-pxa.com/hdl/zynq/xilinx-video-capture.html.
[Accessed: 10-Apr-2015].