TRANSCRIPT
24.06.2011 DIMA – TU Berlin 1
Fachgebiet Datenbanksysteme und Informationsmanagement, Technische Universität Berlin
http://www.dima.tu-berlin.de/
GPU Processing in Database Systems
Max Heimel, Nikolaj Leischner,
Volker Markl, Michael Säcker
Agenda
■ Database Systems Background
■ Overview of GPU Technology
■ Architecture of a Hybrid CPU/GPU System
■ Database Operations on the GPU
■ Alternative Architectures
■ Research Challenges of Hybrid Architectures
■ Current GPU/CPU Hybrid DBMS Systems
Database Query
■ SQL is declarative: specifies what data is needed, not how to get it
SELECT DISTINCT o.name, a.driver
FROM owner o, car c, demographics d, accidents a
WHERE c.ownerid = o.id AND o.id = d.ownerid
  AND c.id = a.id
  AND c.make = 'Mazda' AND c.model = '323'
  AND o.country3 = 'EG' AND o.city = 'Cairo'
  AND d.age < 30;

Find the owner and driver of Mazda 323s that have been involved in accidents in Cairo, Egypt, where the driver was less than 30 years old.
Relational Database Operators
cars:
make    model   ownerid
Honda   Accord  1
Toyota  Camry   1
Mazda   323     2

owner:
ownerid  city
1        Berlin
2        Potsdam
3        Berlin

Selection/Filtering: σcity='Berlin'(owner)
ownerid  city
1        Berlin
3        Berlin

Projection: πcity(owner)
city
Berlin
Potsdam

Join: cars ⋈cars.ownerid=owner.ownerid owner
make    model   ownerid  city
Honda   Accord  1        Berlin
Toyota  Camry   1        Berlin
Mazda   323     2        Potsdam

Further important database operators: union, intersection, set difference, Cartesian cross product, grouping/aggregation, sort
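The slide's three operators can be sketched in a few lines of sequential Python, with lists of dicts standing in for tables (toy code for illustration only; table and column names follow the slide's example, and the projection removes duplicates, as the slide's result does):

```python
# Toy relational operators over rows represented as dicts.
owner = [
    {"ownerid": 1, "city": "Berlin"},
    {"ownerid": 2, "city": "Potsdam"},
    {"ownerid": 3, "city": "Berlin"},
]
cars = [
    {"make": "Honda",  "model": "Accord", "ownerid": 1},
    {"make": "Toyota", "model": "Camry",  "ownerid": 1},
    {"make": "Mazda",  "model": "323",    "ownerid": 2},
]

def select(rows, pred):
    # Selection: keep the rows that satisfy the predicate.
    return [r for r in rows if pred(r)]

def project(rows, cols):
    # Projection: keep only the listed columns, removing duplicates
    # (set semantics, as in the slide's example result).
    seen, out = set(), []
    for r in rows:
        t = tuple(r[c] for c in cols)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(cols, t)))
    return out

def join(left, right, key):
    # Equi-join on a column shared by both inputs.
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

berlin = select(owner, lambda r: r["city"] == "Berlin")
cities = project(owner, ["city"])
joined = join(cars, owner, "ownerid")
```

Each call reproduces one of the result tables shown above; a real DBMS runs the same logic over columnar, often compressed data with far more attention to memory layout.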
Query Execution Plan (QEP)
■ Bottom-up data flow graph
■ Strings together operators
■ Many QEPs for a query
  □ physical operators
  □ operator order
  □ pipelining vs. bulk transfer
■ Query optimization
  □ determines the "best" QEP
■ Cost model
  □ for each operator
  □ for operator combinations
Query:
SELECT o.name, a.driver
FROM owner o, car c, demographics d, accidents a
WHERE c.ownerid = o.id AND o.id = d.ownerid
  AND c.id = a.id
  AND c.make = 'Mazda' AND c.model = '323'
  AND o.country3 = 'EG' AND o.city = 'Cairo'
  AND d.age < 30;

Execution Plan:
[Operator tree: RETURN at the root over a nested-loop join (NLJN); one input is a SORT over a hash join (HSJN) that combines a SCAN of DEMOGRAPHICS with a second HSJN of index-based FETCHes (ISCAN) on CAR and OWNER; the other input is a SCAN/FETCH via ISCAN on ACCIDENTS. Each operator is annotated with its estimated cardinality (e.g., 605999 rows scanned from DEMOGRAPHICS, 7 rows in the final result).]
Agenda
■ Database Systems Background
■ Overview of GPU Technology
■ Architecture of a Hybrid CPU/GPU System
■ Database Operations on the GPU
■ Alternative Architectures
■ Research Challenges of Hybrid Architectures
■ Current GPU/CPU Hybrid DBMS Systems
A Case for GPU architectures
■ Challenge in processor architecture
  □ Moore's law is hitting a wall
    − power wall
    − memory wall
    − ILP wall
    − …
  □ Solution: parallel processing
■ GPGPUs offer massively parallel processing
  □ Many more cores
  □ High GFLOPS and memory bandwidth
  □ Lower cost per core
  □ Can be used to build and test massively parallel systems
GFLOPS & GB/s
■ GFLOPS & GB/s for different CPU & GPU chips
■ Beware: theoretical peak numbers
[Two charts (2006–2012), data from Wikipedia, Intel, AMD, Nvidia, IBM: memory bandwidth in GB/s (axis 0–160) and compute throughput in GFLOP/s (axis 0–600) for CPUs (Xeon E5320, X7460, W3540, X5680, E8870; Opteron 2360SE, 2435, 6180SE; POWER7 4.04 GHz; SPARC64 VIIIfx) and GPUs (FireStream 9270, 9370; Tesla C870, C1060, C2050).]
What is the difference? Look at a modern CPU…
■ Most die space devoted to control logic & caches
■ Maximize performance for arbitrary, sequential programs
[Die photo: AMD K8L; source: www.chip-architect.com]
And at a GPU…
■ Little control logic, a lot of execution logic
■ Maximize parallel computational throughput
[Die photo: ATI RV770; source: ixbtlabs.com]
GPU architecture
■ SIMD cores with small local memories (16–48 kB)
■ Shared high-bandwidth, high-latency DRAM (1–6 GB)
■ Multi-threading hides memory latency
■ Bulk-synchronous execution model
[Diagram: an x86 host attached to a GPU with several SIMD cores; each core is an array of ALUs with its own local memory, and all cores share one DRAM.]
GPUs & databases: opportunities
■ Most database operations are well-suited for massively parallel processing
■ Efficient algorithms for many database primitives exist
  □ scatter, gather/reduce, prefix sum, sort, …
■ Find ways to utilize the higher bandwidth & arithmetic throughput
GPUs & databases: challenges & constraints
■ Find a way to live with architectural constraints
  □ GPU-to-CPU bandwidth (PCIe) is smaller than CPU-to-RAM bandwidth
  □ Fast GPU memory is smaller than CPU RAM
■ Technical hurdles
  □ The GPU is a co-processor; it
    − needs the CPU to orchestrate work
    − cannot fetch data from storage devices itself
    − etc.
  □ GPU programming models (e.g., OpenCL, CUDA) are low-level
    − need architecture-specific tuning
    − lack database primitives
  □ Limited support for multi-tasking
    − requires extra work, such as aggregating operations into bulks
■ J. Owens: GPU architecture overview. In ACM SIGGRAPH 2007 courses (SIGGRAPH '07). ACM, New York, NY, USA, Article 2
■ H. Sutter: The Free Lunch is Over: A Fundamental Turn Toward Concurrency in Software, Dr. Dobb's Journal, 30(3), March 2005, http://www.gotw.ca/publications/concurrency-ddj.htm (Visited May 2011)
■ V. W. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim, A. D. Nguyen, N. Satish, M. Smelyanskiy, S. Chennupaty, P. Hammarlund, R. Singhal, and P. Dubey: Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. SIGARCH Comput. Archit. News 38, 3 (June 2010), 451-460.
References & Further Reading
Agenda
■ Database Systems Background
■ Overview of GPU Technology
■ Architecture of a Hybrid CPU/GPU System
■ Database Operations on the GPU
■ Alternative Architectures
■ Research Challenges of Hybrid Architectures
■ Current GPU DBMS Systems
Hybrid CPU+GPU system
■ 1 or several (interconnected) multicore CPUs
■ 1 or several GPUs
[Diagram: two interconnected nodes; in each, a CPU with four RAM banks connects through an I/O hub to a GPU and an HDD.]
Bottlenecks
■ PCIe bandwidth & latency
■ GPU memory size
■ GPU has no direct access to storage & network
[Same two-node system diagram as above, with the PCIe link to the GPU, the GPU memory, and the storage/network paths as the bottlenecks.]
GDB: GPU query processing
■ Fully-fledged GPU query processor
  □ GPU hash indices, B+ trees, hash join, sort-merge join
  □ Handles data sizes larger than GPU memory, supports non-numeric data types
(B. He, M. Lu, K. Yang, R. Fang, N. K. Govindaraju, Q. Luo, P. V. Sander: Relational query coprocessing on graphics processors)
GDB: results
■ 2x–7x speedup for compute-intensive operators if the data fits inside GPU memory
■ Mixed-mode operators provide no speedup
■ No speedup if the data does not fit into GPU memory
[Bar chart: performance of SQL queries in seconds (0–50) for a non-equi-join (NEJ) and an equi-join as hash join (EJ (HJ)), CPU-only vs. GPU-only.]
GPUTx: GPU OLTP query processing
■ Offload large bulks of OLTP queries to the GPU
■ The CPU generates a processing plan based on a transaction dependency graph; the GPU does the actual query processing
  □ System with 1 CPU + 1 GPU
  □ Data must fit inside GPU memory
  □ Only supports stored procedures
(B. He, J. Xu Yu: High-Throughput Transaction Executions on Graphics Processors)
GPUTx: results
■ Implementation based on H-Store
[Two bar charts of normalized throughput, CPU (4 cores) vs. GPUTx: TPC-B at scale factors 128, 256, 512, 1024, 2048 (axis 0–40) and TPC-C at scale factors 20, 40, 60, 80 (axis 0–25).]
■ B. He and J. Xu Yu: High-throughput transaction executions on graphics processors. Proc. VLDB Endow. 4, 5 (February 2011), 314-325.
■ B. He, M. Lu, K. Yang, R. Fang, N. K. Govindaraju, Q. Luo, and P. V. Sander: Relational query coprocessing on graphics processors. ACM Trans. Database Syst. 34, 4, Article 21 (December 2009), 39 pages.
References & Further Reading
Challenges of GPU Programming
■ No dynamic memory allocation
■ Divergence
  □ Threads taking different code paths ⇒ serialization
■ Limited synchronization possibilities
  □ Only on block level
  □ Limited block size
Agenda
■ Database Systems Background
■ Overview of GPU Technology
■ Architecture of a Hybrid CPU/GPU System
■ Database Operations on the GPU
  □ Index Operations
  □ Compression
  □ Sorting
  □ Relational Operators
  □ Further Operations
■ Alternative Architectures
■ Research Challenges of Hybrid Architectures
■ Current GPU DBMS Systems
Opportunities and Challenges
► GPGPU architectures offer massively data-parallel processing
► "Fruit fly" architecture: today's GPU code may run on future multicore CPUs
► Bottlenecks (PCIe, local memory, disk and network access)
► Ensuring correctness while fully utilizing data parallelism is hard
► Synchronization mechanisms for read/write conflicts are limited
► Cost modeling is hard: some GPU details are unknown or change quickly
Primitives for DBMS Operator Implementations
■ Second-order functions on arrays
  □ Map
  □ Scatter (indexed write)
  □ Gather (indexed read)
  □ Prefix-scan (combines/reduces entries)
  □ Split (partitioning, e.g., hash or range)
  □ Sort (e.g., bitonic sort/merge sort, radix sort, quicksort)
■ Library of primitives (building blocks) for GPGPU query processing
(B. He, M. Lu, K. Yang, R. Fang, N. K. Govindaraju, Q. Luo, P. V. Sander: Relational joins on graphics processors)
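As a rough illustration, four of these primitives can be written down sequentially in Python; on the GPU, each loop iteration would be handled by its own thread (the helper names are ours, not from the cited paper):

```python
def gather(values, idx):
    # Gather: out[i] = values[idx[i]] (indexed read).
    return [values[i] for i in idx]

def scatter(values, idx, size):
    # Scatter: out[idx[i]] = values[i] (indexed write).
    out = [None] * size
    for v, i in zip(values, idx):
        out[i] = v
    return out

def exclusive_scan(values):
    # Prefix-scan with '+': out[i] = sum of values[0..i-1].
    out, acc = [], 0
    for v in values:
        out.append(acc)
        acc += v
    return out

def split(values, num_parts, part_of):
    # Split: partition by a hash/range function, built from the
    # primitives above (histogram, exclusive scan, scatter).
    parts = [part_of(v) for v in values]
    counts = [0] * num_parts
    for p in parts:
        counts[p] += 1
    offsets = exclusive_scan(counts)   # start position of each partition
    pos = list(offsets)
    out = [None] * len(values)
    for v, p in zip(values, parts):
        out[pos[p]] = v
        pos[p] += 1
    return out, offsets
```

`split` shows how the primitives compose: a histogram plus an exclusive scan yields the scatter offsets for each partition, which is exactly the pattern GPU radix sort and hash partitioning build on.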
Indexing
■ Classic: B+ tree
■ The directory accelerates access by exploiting the fact that the data is sorted and organized hierarchically
■ Compression to map variable-length data to fixed size
  □ Dictionary encoding
■ CSS-tree: encodes directory and payload in an array; cache-aware
[Diagram: a tree whose inner nodes form the directory of keys and whose leaves hold the payload.]
Prefix Tree Index
■ The path of a key within the tree is defined by its absolute value
■ Split the key into equally sized sub-keys
■ Tree depth depends on key length
[Example: the INT key 7939 (decimal) = 0001 1111 0000 0011 (binary) is split into the 4-bit sub-keys 1, 15, 0, 3, which select the branch at each tree level.]
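For the slide's example key, splitting the 16-bit pattern 0001 1111 0000 0011 (decimal 7939) into 4-bit sub-keys yields the path 1 → 15 → 0 → 3. A minimal sketch (the helper function is hypothetical):

```python
def subkeys(key, bits_per_level, levels):
    # Split a key into fixed-width sub-keys, most significant first.
    # At each tree level, one sub-key selects the child slot, so the
    # tree depth is key_width / bits_per_level, independent of the data.
    mask = (1 << bits_per_level) - 1
    return [(key >> (bits_per_level * (levels - 1 - i))) & mask
            for i in range(levels)]

path = subkeys(7939, bits_per_level=4, levels=4)  # 0b0001_1111_0000_0011
```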
Speculative Hierarchical Index Traversal
■ Idea: group processing to exploit the faster local memory
■ Group computational trees into blocks
■ A tree block is executed by a GPU thread block
  □ uses shared memory for intermediate results
■ Generate the global result by traversing an internal result list
Database Compression
GPUs have
□ very limited memory capacity
□ high computation capabilities and bandwidth
⇒ Compression reduces the overhead of data transfer
⇒ Compression allows GPUs to address bigger problems

Usually, compression schemes are combined for higher compression ratios, e.g., an auxiliary scheme (RLE) with a main scheme (zero suppression).

Goal: carry operations (e.g., filtering) out on the compressed representation of the data.

Example: 1000, 1003, 1005 → 1000, +0003, +0002 → 1000, +3, +2

(W. Fang, B. He, Q. Luo: Database Compression on Graphics Processors, Proc. VLDB Endow. 3, 1-2 (September 2010), 670-680)
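The example line above (delta encoding, after which the small deltas can be stored in far fewer digits or bits) can be sketched as:

```python
def delta_encode(values):
    # Keep the first value, then store successive differences; for
    # sorted or slowly-changing columns the deltas are small and
    # compress well under a zero/width-suppression main scheme.
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(encoded):
    # A running sum restores the original values.
    out = [encoded[0]]
    for d in encoded[1:]:
        out.append(out[-1] + d)
    return out

enc = delta_encode([1000, 1003, 1005])  # [1000, 3, 2], as on the slide
```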
Sorting with GPUs
■ Numerous GPU algorithms published
  □ bitonic sort, bucket sort, merge sort, quicksort, radix sort, sample sort, …
■ Existing work focuses on in-memory sorting with 1 GPU
■ State of the art: merge sort, radix sort
■ GPUs outperform multicore CPUs when the data transfer over PCIe is not considered
Radix Sort @ GPGPU
(D. Merrill, A. Grimshaw: Revisiting Sorting for GPGPU Stream Architectures)
[Dataflow diagram of one digit pass across four GPU cores: a bottom-level reduction phase in which each core bins its keys and accumulates per-digit counts, a top-level scan that combines the per-core histograms, and a bottom-level scan phase in which each core re-bins its keys, scans locally, and scatters them to their output positions.]
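Sequentially, one digit pass of such a radix sort is binning, an exclusive scan, and a stable scatter; the GPU version distributes the binning and scatter across cores and splices the scans together, but the per-digit logic is the same. A sketch (ours, not Merrill and Grimshaw's implementation):

```python
def radix_sort(keys, bits=4, key_width=32):
    # LSD radix sort: one binning + scan + scatter pass per digit.
    radix = 1 << bits
    for shift in range(0, key_width, bits):
        # Binning/reduction: histogram of the current digit.
        counts = [0] * radix
        for k in keys:
            counts[(k >> shift) & (radix - 1)] += 1
        # Exclusive prefix scan turns counts into scatter offsets.
        offsets, acc = [], 0
        for c in counts:
            offsets.append(acc)
            acc += c
        # Stable scatter of each key to its bucket position.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & (radix - 1)
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys
```

Stability of the scatter within each pass is what makes the digit-by-digit approach correct.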
GPU sorting performance
■ For GPU radix sort, the PCIe transfer dominates the running time
[Bar chart: throughput in million items per second (0–1200) for 32-bit keys and 32-bit key/value pairs, comparing GPU radix sort with PCIe transfer, GPU radix sort alone, and CPU radix sort.]
(D. Merrill, A. Grimshaw: Revisiting Sorting for GPGPU Stream Architectures; J. Wassenberg, P. Sanders: Faster Radix Sort via Virtual Memory and Write-Combining)
Example: Non-indexed Nested-Loop Join
1. Split the relations into blocks
2. Join the smaller blocks in parallel
[Diagram: relations R and S are partitioned into blocks R' and S'; each block pair is processed by one thread group (1,1) … (i,j), with threads 1 … T inside a group.]
Problem: to allocate memory for the result, the join has to be performed twice!
Also known as "fragment and replicate join" (symmetric/asymmetric). There is more: indexed NLJ, hash join, sort-merge join, shuffling rings, etc.
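The two-pass trick behind that problem can be sketched sequentially; each (R', S') block pair would map to one thread group, and the first pass only counts matches so that the exact output size is known before any result is written:

```python
def block_nlj(R, S, pred, block=2):
    # Pass 1: count matches per block pair. On the GPU this sizes the
    # output buffer, since device memory cannot be grown mid-kernel.
    blocks = [(i, j) for i in range(0, len(R), block)
                     for j in range(0, len(S), block)]
    counts = []
    for i, j in blocks:
        c = sum(1 for r in R[i:i+block] for s in S[j:j+block] if pred(r, s))
        counts.append(c)
    # Exclusive scan over the counts gives each block pair's write offset.
    offsets, acc = [], 0
    for c in counts:
        offsets.append(acc)
        acc += c
    # Pass 2: re-evaluate the join and scatter results to their offsets.
    out = [None] * acc
    for (i, j), off in zip(blocks, offsets):
        for r in R[i:i+block]:
            for s in S[j:j+block]:
                if pred(r, s):
                    out[off] = (r, s)
                    off += 1
    return out
```

The count pass plus exclusive scan is the standard GPU answer to the missing dynamic memory allocation mentioned earlier.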
Example: Aggregation
■ Brent-Kung circuit strategy
■ Only the upsweep phase is necessary, because only the final result is needed
■ Elements are permuted to minimize memory bank conflicts
■ A separate thread group combines the results of the blocks
[Diagram: a binary reduction tree over the inputs x0 … x15.]
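The upsweep over x0 … x15 amounts to log2(n) rounds of pairwise partial sums; within a round all additions are independent, so each round is one synchronized step of a thread block. A sketch (the bank-conflict-avoiding permutation is omitted, and a power-of-two input length is assumed):

```python
def upsweep_sum(xs):
    # Brent-Kung upsweep: after the round with stride s, index i
    # (for i = 2s-1, 4s-1, ...) holds the sum of the 2s elements
    # ending at i; the last slot ends up holding the full aggregate.
    data = list(xs)
    stride = 1
    while stride < len(data):
        for i in range(2 * stride - 1, len(data), 2 * stride):
            data[i] += data[i - stride]  # independent adds -> one parallel step
        stride *= 2
    return data[-1]
```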
There is more...
■ Map/Reduce
■ Regular expression matching
■ K-means clustering
■ Apriori frequent itemset mining
■ Exact string matching
■ …
■ J. Rao, K. A. Ross: Cache Conscious Indexing for Decision-Support in Main Memory, In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB '99), Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, and Michael L. Brodie (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 78-89
■ C. Kim, J. Chhugani, N. Satish, E. Sedlar, A. D. Nguyen, T. Kaldewey, V. W. Lee, S. Brandt, P. Dubey: FAST: fast architecture sensitive tree search on modern CPUs and GPUs, ACM SIGMOD ‘10, 339-350, 2010
■ P. B. Volk, D. Habich, W. Lehner: GPU-Based Speculative Query Processing for Database Operations, First International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures, in conjunction with VLDB 2010
References & Further Reading: Indexing
■ D. G. Merrill and A. S. Grimshaw: Revisiting sorting for GPGPU stream architectures. In Proceedings of the 19th international conference on Parallel architectures and compilation techniques (PACT '10). ACM, New York, NY, USA, 545-546
■ J. Wassenberg, P. Sanders: Faster Radix Sort via Virtual Memory and Write-Combining, CoRR 2010, http://arxiv.org/abs/1008.2849 (Visited May 2011)
■ N. Satish, C. Kim, J. Chhugani, A. D. Nguyen, V. W. Lee, D. Kim, and P. Dubey: Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort. In Proceedings of the 2010 international conference on Management of data (SIGMOD '10). ACM, New York, NY, USA, 351-362.
References & Further Reading: Sorting
■ B. He, K. Yang, R. Fang, M. Lu, N. Govindaraju, Q. Luo, P. Sander: Relational joins on graphics processors. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data (SIGMOD '08). ACM, New York, NY, USA, 511-524.
■ D. Merrill, A. Grimshaw: Parallel Scan for Stream Architectures. Technical Report CS2009-14, Department of Computer Science, University of Virginia. December 2009.
References & Further Reading: Rel. Operators
■ N. Cascarano, P. Rolando, F. Risso, R. Sisto: iNFAnt: NFA Pattern Matching on GPGPU Devices. SIGCOMM Comput. Commun. Rev. 40, 5, 20-26.
■ M.C. Schatz, C. Trapnell: Fast Exact String Matching on the GPU, Technical Report.
■ W. Fang, K. K. Lau, M. Lu, X. Xiao, C. K. Lam, P. Y. Yang, B. He, Q. Luo, P. V. Sander, K. Yang: Parallel Data Mining on Graphics Processors, Technical Report HKUST-CS08-07, Oct 2008.
■ B. He, W. Fang, Q. Luo, N. K. Govindaraju, and T. Wang. 2008. Mars: a MapReduce framework on graphics processors. In Proceedings of the 17th international conference on Parallel architectures and compilation techniques (PACT '08). ACM, New York, NY, USA, 260-269.
■ many more …
References & Further Reading: Further Ops
Agenda
■ Database Systems Background
■ Overview of GPU Technology
■ Architecture of a Hybrid CPU/GPU System
■ Database Operations on the GPU
■ Alternative Architectures
■ Research Challenges of Hybrid Architectures
■ Current GPU DBMS Systems
A blast from the past: DIRECT
■ Specialized database processors existed already in 1978
■ Back then, they could not keep up with the rise of commodity CPUs
■ Today is different: memory wall, ILP wall, power wall, …
[Diagram: a host and controller in front of several query processors, each paired with a memory module and connected to mass storage.]
(D. J. DeWitt: DIRECT - a multiprocessor organization for supporting relational data base management systems)
FPGAs
■ Massively parallel & reconfigurable processors
■ Lower clock speeds, but configuration for specific tasks makes up for this
■ Very difficult to program
■ Used in real-world appliances: Kickfire/Teradata, Netezza/IBM, …
[Diagram: an FPGA placed between the network/HDD and the CPU with its RAM.]
(J. Teubner, R. Mueller: FPGA: what's in it for a database?)
CPU/GPU hybrid (AMD Fusion)
■ CPU & GPU integrated on one die, with shared memory
■ Uses the GPU programming model (OpenCL)
■ No PCIe bottleneck, but only CPU-class memory bandwidth
[Diagram: x86 cores and SIMD-core arrays with local memories sharing one memory controller and RAM.]
(AMD Whitepaper: AMD Fusion Family of APUs)
There is more...
■ Network processors
  □ Many-core architectures
  □ Associative memory (constant-time search)
■ CELL
  □ Similar to a CPU/GPU hybrid: a CPU core plus several wide SIMD cores with local scratchpad memories
Common goals, common problems
■ Similar approaches
  □ Many-core, massively parallel
  □ Distributed on-chip memory
  □ Hope for better perf/$ and perf/watt than traditional CPUs
■ Similar difficulties
  □ Memory bandwidth is always a bottleneck
  □ Hard to program
    − parallelism
    − synchronization
    − low-level & architecture-specific
■ D. J. DeWitt: DIRECT - a multiprocessor organization for supporting relational data base management systems. In Proceedings of the 5th annual symposium on Computer architecture (ISCA '78). ACM, New York, NY, USA, 182-189.
■ R. Mueller and J. Teubner: FPGA: what's in it for a database?. In Proceedings of the 35th SIGMOD international conference on Management of data (SIGMOD '09), Carsten Binnig and Benoit Dageville (Eds.). ACM, New York, NY, USA, 999-1004.
■ AMD White paper: AMD Fusion family of APUs, http://sites.amd.com/us/Documents/48423B_fusion_whitepaper_WEB.pdf (Visited May 2011)
■ N. Bandi, A. Metwally, D. Agrawal, and A. El Abbadi: Fast data stream algorithms using associative memories. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data (SIGMOD '07). ACM, New York, NY, USA, 247-256.
■ B. Gedik, P. S. Yu, and R. R. Bordawekar: Executing stream joins on the cell processor. In Proceedings of the 33rd international conference on Very large data bases (VLDB '07). VLDB Endowment 363-374.
References & Further Reading
Agenda
■ Database Systems Background
■ Overview of GPU Technology
■ Architecture of a Hybrid CPU/GPU System
■ Database Operations on the GPU
■ Alternative Architectures
■ Research Challenges of Hybrid Architectures
■ Current GPU DBMS Systems
Architectural constraints
■ PCIe bottleneck
  □ Direct access to storage devices, network, etc.?
  □ Caching strategies for device memory
■ GPU memory size
  □ Deeper memory hierarchy (e.g., a few GB of fast GDDR + large "slow" DRAM)?
Performance portability challenges
■ Want forward scalability: performance should scale with the next generations of GPUs
  □ Existing work is often optimized for exactly one type of GPU chip
■ Need higher-level programming models
  □ Hide hardware details (processor count, SIMD width, local memory size, …)
■ Cost estimation of operators
Database-specific challenges
■ Big data volumes and limited memory
■ An execution plan consists of multiple operators
■ Where to execute each operator?
  □ Trade off the transfer time between CPU and GPU against the computational advantage
  □ Cost models for GPGPU, CPU, and hybrid execution
  □ Amdahl's law
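A toy cost model makes the operator-placement trade-off concrete (all constants and function names here are hypothetical; a real optimizer would calibrate per operator and device):

```python
def offload_wins(bytes_moved, pcie_gbps, cpu_secs, gpu_secs):
    # Sending an operator to the GPU only pays off if GPU compute time
    # plus the PCIe transfer it triggers still beats staying on the CPU.
    transfer_secs = bytes_moved / (pcie_gbps * 1e9)
    return gpu_secs + transfer_secs < cpu_secs

def amdahl_speedup(offloaded_fraction, operator_speedup):
    # Amdahl's law: the plan's overall speedup is capped by the
    # fraction of work that stays serial on the CPU.
    return 1.0 / ((1.0 - offloaded_fraction)
                  + offloaded_fraction / operator_speedup)
```

For example, a GPU operator that is 10x faster but covers only 90% of the plan's work caps the overall speedup near 5.3x, which is why transfer costs and the serial remainder dominate placement decisions.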
Agenda
■ Database Systems Background
■ Overview of GPU Technology
■ Architecture of a Hybrid CPU/GPU System
■ Database Operations on the GPU
■ Alternative Architectures
■ Research Challenges of Hybrid Architectures
■ Current GPU DBMS Systems
  □ MonetDB/CUDA
  □ ParStream
General Overview: Hybrid MonetDB
■ Joint project of TU Berlin and CWI
■ Goal: investigate GPGPU in the context of database systems
  □ We extend the open-source database system MonetDB using OpenCL
■ Why MonetDB?
  □ Open source
  □ Column-oriented in-memory database system
■ Why OpenCL?
  □ Open standard with support from many big players
  □ Not restricted to graphics cards
Hybrid MonetDB System Outline
■ Proof-of-concept implementation
  □ Supports only fixed-size columns fitting into device memory
  □ Implemented operators
    − Table scan
    − Nested-loop join
    − Aggregation
    − Group by
■ Implementation consists of:
  □ OpenCL context handling
  □ GPU memory management
  □ Integration into the MonetDB Assembly Language (MAL)
  □ Support for multiple GPUs
ParStream High-Performance Indexing
[Diagram keywords: Big Data, Partition, Index Compression, Parallel Execution.]
ParStream – Building Blocks
ParStream combines state-of-the-art database technologies with unique technologies:
■ Column store: fast data access for analytical processing
■ Custom query operators: SQL, JDBC, and a powerful C++ API enable fast query processing
■ In-memory technology: data and indices can stay in memory due to efficient compression
■ High-performance index: a unique index structure allows highly parallel execution of queries (patent pending)
■ S. Hong and H. Kim: An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. SIGARCH Comput. Archit. News 37, 3 (June 2009), 152-163.
■ L. Bic and R. L. Hartmann: AGM: a dataflow database machine. ACM Trans. Database Syst. 14, 1 (March 1989), 114-146.
■ H. Chafi, A. K. Sujeeth, K. J. Brown, H. Lee, A. R. Atreya, and K. Olukotun: A domain-specific approach to heterogeneous parallelism. In Proceedings of the 16th ACM symposium on Principles and practice of parallel programming (PPoPP '11). ACM, New York, NY, USA, 35-46.
■ http://www.dima.tu-berlin.de
■ http://www.monetdb.nl
■ http://www.parstream.com
References & Further Reading
Thank You
Danke · Merci · Grazie · Gracias · Obrigado
[The slide also shows "Thank You" in Japanese, Russian, Arabic, Traditional and Simplified Chinese, Hindi, Tamil, Thai, and Korean.]
Disclaimer & Attribution
■ The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
■ The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. There is no obligation to update or otherwise correct or revise this information. However, we reserve the right to revise this information and to make changes from time to time to the content hereof without obligation to notify any person of such revisions or changes.
■ NO REPRESENTATIONS OR WARRANTIES ARE MADE WITH RESPECT TO THE CONTENTS HEREOF AND NO RESPONSIBILITY IS ASSUMED FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
■ ALL IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. IN NO EVENT WILL ANY LIABILITY TO ANY PERSON BE INCURRED FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
■ AMD, the AMD arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. All other names used in this presentation are for informational purposes only and may be trademarks of their respective owners.
■ The contents of this presentation were provided by individual(s) and/or company listed on the title page. The information and opinions presented in this presentation may not represent AMD’s positions, strategies or opinions. Unless explicitly stated, AMD is not responsible for the content herein and no endorsements are implied.