dragged, kicking and screaming: multicore architecture and video games
Summary of Topics:
Console Architecture
Meaning of Paper’s Title/Why the Video Game Developer HATED the new
Techniques/Problems
The Future
Video Game Architecture
For the most part, the same as a computer:
Very operating system-linked.
PCs have almost always had games.
Mac gaming is sparse, but has recently increased.
Linux users have to compile/make their own.
Console Games = primarily single-core processors…until 2005.
XBOX 360
• 3.2 GHz “Xenon” triple-core PowerPC, 2 hardware threads per core
• 512 MB main RAM
• 500 MHz ATI “Xenos” GPU
• CPU accesses memory through the GPU!
• GPU has 10 MB RAM embedded frame buffer
XBOX 360 vs. Playstation 3
Triple-Core PPC
• Xbox 360 - 512 MB, 700 MHz, GDDR3, shared by CPU and GPU
• CPU accesses memory through the GPU!
• GPU has 10 MB RAM embedded frame buffer
Multicore Cell Engine
PS3 - 512 MB total
256 MB 3.2 GHz XDR main RAM for the CPU
256 MB 700 MHz GDDR3 video RAM for the GPU
Multiple Synergistic Processing Elements (SPEs), each attached to its own local store, which feeds through a DMA engine onto the on-chip bus. One separate PPE
(Power Processing Element), with L1 and L2 caches. Developers are having some serious problems with this model.
Cell Architecture
Why So Unhappy?
Delays, setbacks, et cetera = unhappy fans.
Yu Suzuki; Saturn Virtua Fighter: “One very fast central processor would be preferable...I think that only one in 100 programmers are good enough to get this kind of speed out of the Saturn.”
Not implementing parallelism, use of multicore architecture, etc = unhappy fans.
If game developers utilize parallelism, the game will be delayed – 6 months, 1 year?
Beginning Techniques
• Patches, so computers at least realize there are multiple cores available.
• Intel released several multicore aids, especially in the beginning (coaxing people into it)
• Building Blocks
• Codeplay’s sieve compilers
• Broke a program into “sieve blocks” where automatic parallelization could be utilized
What do we do today?
Multithreading from the ground up
Decent (and fast!) parallelization
One of two main ways:
Every process on a different thread
Dependencies galore~!
Main gaming thread, with branches coming off for specific parts of the game and splintering into other threads.
Particularly beastly programs get their own multithreading implementations.
Networking and I/O get their own threads.
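The "main gaming thread with branches" layout above can be sketched roughly as follows. This is a minimal illustration, not any engine's actual design: the subsystem names (`update_physics`, `update_ai`, `render`) and their toy return values are hypothetical, and Python is used only for brevity, so the threads here show the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical subsystems; the names and the toy "work" are illustrative only.
def update_physics(frame):
    return f"physics@{frame}"


def update_ai(frame):
    return f"ai@{frame}"


def render(frame, physics_state, ai_state):
    return f"frame {frame}: {physics_state}, {ai_state}"


def run_frames(n_frames):
    """Main game loop: branch work out to worker threads, then join."""
    rendered = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for frame in range(n_frames):
            # Branch: independent subsystems run on worker threads.
            physics = pool.submit(update_physics, frame)
            ai = pool.submit(update_ai, frame)
            # Join: the main thread waits for both results before rendering,
            # which makes the dependency between the stages explicit.
            rendered.append(render(frame, physics.result(), ai.result()))
    return rendered
```

The main thread stays in charge of ordering, so the only dependencies to reason about are the two `result()` calls per frame.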
“Best” Multithreading Approach
CASE EXAMPLE: Kameo, which reached the equivalent of 2.2 to 2.5 cores in 6 months.
Rendering and decompression ran on separate threads. The latter saved space on the DVD and improved load times for the game. Additionally, file I/O was split across two threads: one for reading and one for decompressing.
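The read/decompress split described for Kameo is a two-stage pipeline. A minimal sketch of that pattern follows, assuming zlib-compressed chunks and an in-memory list standing in for the disc. The function names, the queue size, and the sentinel are all illustrative choices, not details from the game.

```python
import queue
import threading
import zlib

_DONE = object()  # sentinel marking the end of the stream


def reader(chunks, out_q):
    """Stand-in for the file-reading thread: hands compressed chunks along."""
    for chunk in chunks:
        out_q.put(chunk)
    out_q.put(_DONE)


def decompressor(in_q, results):
    """Decompresses chunks as they arrive, overlapping with further reads."""
    while True:
        chunk = in_q.get()
        if chunk is _DONE:
            break
        results.append(zlib.decompress(chunk))


def load_assets(compressed_chunks):
    q = queue.Queue(maxsize=4)  # bounded, so the reader cannot run far ahead
    results = []
    threads = [
        threading.Thread(target=reader, args=(compressed_chunks, q)),
        threading.Thread(target=decompressor, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With one reader and one decompressor sharing a FIFO queue, chunks come out in the order they were read, so no reordering step is needed.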
CASE EXAMPLE: Kameo
Best Processes for MT
File decompression – improve load times.
Rendering – separate update and render; can be problematic
Physics Engine? – Physics/Update/Render, but latency issues.
Graphical Fluff – always and forever.
Artificial Intelligence - position independency of data, cache coherency
Cascade Project
Fix dataflow by sending data from the parent to the child before the parent had completed!
Respect dependencies, divided AI
Resulted in reducing “the average time per frame from 15.5ms using a single thread to 7.8ms using eight threads.”
About a 2x speedup (roughly 50% less time per frame)!
Work in progress – CDML
List constraints in the language instead of working them out later.
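Working through the quoted Cascade frame times gives a feel for what eight threads actually bought. The two timings and the thread count come from the quote above. The derived quantities (speedup, time saved, per-thread efficiency) are standard definitions applied here for illustration.

```python
single_thread_ms = 15.5  # average time per frame, single thread (quoted)
eight_thread_ms = 7.8    # average time per frame, eight threads (quoted)
threads = 8

# Speedup: how many times faster a frame completes.
speedup = single_thread_ms / eight_thread_ms

# Fraction of the original frame time that was eliminated.
time_saved = 1 - eight_thread_ms / single_thread_ms

# Parallel efficiency: speedup divided by the number of threads used.
efficiency = speedup / threads

print(f"speedup    = {speedup:.2f}x")
print(f"time saved = {time_saved:.1%}")
print(f"efficiency = {efficiency:.1%} per thread")
```

A speedup near 2x on eight threads means each thread contributes about a quarter of ideal scaling, which fits the point that dependencies limit how much of the work can run in parallel.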
Multithreading is Tricky
Threads can fight over the cache
Dependencies
Data corruption, deadlocks
Bugs might not be apparent right away
Debugging sets developers back
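The data-corruption risk listed above usually comes from two threads performing a read-modify-write on the same value at once. A minimal sketch of the usual remedy, a lock around the shared update, follows. The `Score` class and the thread counts are hypothetical, chosen only to make the pattern visible.

```python
import threading


class Score:
    """Shared state touched by several threads (hypothetical example)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, points):
        # The read-modify-write below is not atomic; the lock keeps two
        # threads from interleaving inside it and losing an update.
        with self._lock:
            self.value += points


def run(n_threads=8, n_adds=10_000):
    score = Score()

    def worker():
        for _ in range(n_adds):
            score.add(1)

    workers = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return score.value
```

When more than one lock is involved, acquiring them in the same fixed order everywhere is a common way to avoid deadlock.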
The Future
ARM’s GPU/CPU Chip
Intel’s Larrabee Chip
Mobile Gaming Platforms laugh for now…
Unreal 4 Engine – “We’re waiting for massively multicore processors.”