Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001


Page 1: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Advanced Micro Devices - Athlon

Buddy Guest Mike Lewitt Bill McCorkle

November 28, 2001

Page 2: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

RISC

IA-64

IA-32

What Have We Seen So Far?

Where is the Competition?

Page 3: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Overview of Today’s Events
- Company History
- Differences in AMD Athlon Architecture
  - System Bus
  - Macro vs. Micro Operations
  - Floating Point Operations
  - Branch Prediction
  - Memory Management
- Comparing Processor Performance

Page 4: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

AMD
- May 1, 1969: founded as a semiconductor company
- 1975: 8080A and AM2900
- 1976: signs cross-licensing agreement
- 1987: AMD and Intel go to court
- 1991: AM386 (breaks Intel monopoly)
- 1992: court awards AMD full rights to produce the AM386 processor
- 1993: AM486
- 1997: AMD-K6
- 1999: Athlon, the first 7th-generation processor

Intel
- July 18, 1968: founded as a semiconductor memory company
- 1971: 4004 introduced
- 1972: 8008 introduced
- 1976: signs cross-licensing agreement
- 1978: 16-bit 8086
- 1982: 286 (on-board memory)
- 1985: 32-bit 386
- 1989: 486
- 1993: Pentium
- 1997: Pentium II
- 1998: Celeron

Page 5: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Architecture Summary

AMD Approach
- Balanced approach: optimize work per clock (IPC) and improve the operating frequency at the same time.

Intel Approach
- Increased pipeline depth to keep more instructions in flight, which lowered work per clock (IPC).
- Solution: compensated with a much higher frequency to offset the lower IPC and stay competitive.

Page 6: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Architecture Summary

Overall Improvement to Performance

Frequency Improvements
- Smaller geometries
- Faster transistors (“process shrinks”)
- Deeper pipelines
- Fewer gates per clock cycle

Work Per Clock Improvements
- Superscalar architectures
- Dynamic instruction schedulers
- Larger on-chip caches
- Advanced branch prediction

Page 7: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Architecture Summary

Clock Speed / EV6 Bus
- Designed with very high clock speeds in mind
- K7 has very deep buffers to enable those high clock speeds, allowing up to 72 x86 instructions in flight
- Bus transfers data on both the rising edge and the falling edge of the clock
  - 100 MHz clock: 200 MHz effective at the processor
  - 133 MHz clock: 266 MHz effective at the processor

AMD vs. Intel, comparing at the same clock

Page 8: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Architecture Summary

EV6 Bus on AMD Athlon
- Scalable up to 200 MHz, yielding an effective frequency of 400 MHz
- Multiprocessor support
- Highest bus bandwidth (1.60 GB/s)
- Intel using 133 MHz (1.01 GB/s)
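The 1.60 GB/s figure can be reproduced with a short calculation. This is a minimal sketch that assumes a 64-bit (8-byte) data bus, which the slides do not state, and treats peak bandwidth as clock rate times transfers per clock times bytes per transfer. The function name and parameters are illustrative only.

```python
def peak_bandwidth_bytes_per_s(clock_hz, transfers_per_clock, bus_width_bits=64):
    """Peak bus bandwidth = clock rate x transfers per clock x bytes per transfer.

    bus_width_bits=64 is an assumption, not a value taken from the slides.
    """
    return clock_hz * transfers_per_clock * (bus_width_bits // 8)


# 100 MHz clock with data on both clock edges (2 transfers per clock)
ev6 = peak_bandwidth_bytes_per_s(100e6, 2)

# 133 MHz clock with one transfer per clock
single_rate = peak_bandwidth_bytes_per_s(133e6, 1)

print(f"100 MHz, both edges: {ev6 / 1e9:.2f} GB/s")
print(f"133 MHz, one edge:   {single_rate / 1e9:.2f} GB/s")
```

Under these assumptions the first case matches the 1.60 GB/s on the slide. The simple product for the 133 MHz bus comes out slightly above the quoted 1.01 GB/s, so that figure was probably derived with a different convention.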

Page 9: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

AMD Athlon

PIII

Page 10: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Architecture Summary

Instruction Control Unit
- Holds 72 MOps before assignment (MOp = x86 instruction, therefore Athlon can have 72 “in-flight” instructions)
- P6 only holds 13 in-flight MOps

Page 11: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Architecture Summary

Execution Ports
- AMD has no fewer than 9
- Intel has 5
  - 2 dedicated to memory stores
- Enhanced parallelism inside Athlon

Page 12: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Micro-OPs / Macro-OPs
- Athlon has 3 parallel x86 instruction decoders that translate instructions into Macro-Ops for the 72-entry ICU
- Uses 2 decoding pipelines (Intel uses 1)
  - Decoding common instructions (direct path)
  - Decoding complex x86 instructions (vector path)
- Integer scheduler is fed from the ICU and holds a maximum of 15 Macro-Ops, representing up to 30 Ops at a time
- Leads to 3 parallel integer execution units

Page 13: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Micro-OPs / Macro-OPs

Athlon Decoders: 3-Way Instruction
- Has 3 parallel decoding units
- Can handle any combination of instructions with any of its decoders, which are all “fully capable” decoders
- Handles Complex and Simple Instructions

Intel Decoders
- Has 3 parallel decoding units
  - 1 Complex
  - 2 Simple
- Handles Complex / Simple / Simple
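The effect of the two decoder layouts can be illustrated with a toy model. This sketch assumes strictly in-order decoding, one instruction per decoder per cycle, and instructions labeled only as simple ("S") or complex ("C"). It follows the simplified picture on the slide and is not a description of either real decoder. All names here are hypothetical.

```python
# Per-decoder capability: True means the decoder can take a complex instruction.
ATHLON = (True, True, True)   # three "fully capable" decoders
P6 = (True, False, False)     # complex / simple / simple


def decode_cycles(stream, capable):
    """Count cycles needed to decode `stream` ("S"/"C" characters) in order."""
    if not capable[0]:
        raise ValueError("first decoder must accept complex instructions")
    cycles = 0
    i = 0
    while i < len(stream):
        for can_complex in capable:
            if i >= len(stream):
                break
            # A complex instruction reaching a simple-only decoder must wait
            # for the next cycle, where it lands on the first decoder.
            if stream[i] == "C" and not can_complex:
                break
            i += 1
        cycles += 1
    return cycles


for stream in ("SSSSSS", "CSSCSS", "SCSSCS", "CCC"):
    print(stream, "Athlon:", decode_cycles(stream, ATHLON),
          "P6:", decode_cycles(stream, P6))
```

In this model the two layouts tie whenever complex instructions happen to fall on the first decoder. The complex/simple/simple layout loses cycles when they fall elsewhere.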

Page 14: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

3DNOW!

                                   3DNOW! (Athlon)   SSE (Intel)
Pipelines (parallel)               2                 2
Instructions (how wide)            2                 4
Effective Instructions per Cycle   4*                4
Registers Used                     3DNOW! / FPU      No FPU

Every 4-wide Intel SSE instruction is actually 2 Athlon micro-ops

*AMD takes advantage of rising edge as well as falling edge

**SSE cannot be used with MMX registers

MMX was developed when FPUs were not as important

Page 15: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

3DNOW!

Each pipeline can do any instruction above.

The second pipeline can do any instruction in any group except the group the first pipeline has chosen.

Page 16: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

3DNOW!

Conclusion of 3DNOW! vs. SSE
- Both have pairing restrictions
- SSE is implemented as a separate unit: more difficult to program, but with more freedom
- MMX-add and prefetch instructions are slightly better for SSE
- Final conclusion: DRAW

Page 17: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Full Architecture Views

AMD Athlon

PIII

Page 18: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Looking at the ALUs

Page 19: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Floating Point Operations
- Fully pipelined FPU
- 3 ported, parallel floating point execution units
- Pentium has 3 also, but they are behind only one port
- FPU can execute two 80-bit extended Ops
- Intel can currently only execute one

Page 20: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Pipelining Differences

Determining the length
- Execution rate of pipeline (ALU)
- Degree of Parallelism

                                 AMD Athlon   Intel Pentium III
Integer Pipeline Length          10           12-17
Floating Point Pipeline Length   15           25

(AMD-Athlon)

Page 21: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Branch Prediction

Example:

    if (x > 0) {
        a = 0;
        b = 1;
        c = 2;
    }
    d = 3;

When x>0

Cycle  Fetch     Decode    Execute   Save
1      if (x>0)  -         -         -
2      a=0       if (x>0)  -         -
3      b=1       a=0       if (x>0)  -
4      c=2       b=1       a=0       if (x>0)
5      -         c=2       b=1       a=0
6      -         -         c=2       b=1
7      -         -         -         c=2

When x<0

Cycle  Fetch     Decode        Execute       Save
1      if (x>0)  -             -             -
2      a=0       if (x>0)      -             -
3      b=1       a=0           if (x>0)      -
4      d=3       b=1 (squash)  a=0 (squash)  if (x>0)
5      -         d=3           b=1 (squash)  a=0 (squash)
6      -         -             d=3           b=1 (squash)
7      -         -             -             d=3

Predicting x<0

Cycle  Fetch     Decode    Execute   Save
1      if (x>0)  -         -         -
2      d=3       if (x>0)  -         -
3      -         d=3       if (x>0)  -
4      -         -         d=3       if (x>0)
5      -         -         -         d=3
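The three cycle tables follow one rule: each fetched instruction moves one stage per cycle, and a squashed instruction still occupies its slot. Below is a minimal sketch of that rule, assuming a 4-stage in-order pipeline that fetches one instruction per cycle. The function and label names are illustrative and do not come from the slides.

```python
STAGES = ("Fetch", "Decode", "Execute", "Save")


def pipeline_table(fetch_order):
    """Return one (cycle, fetch, decode, execute, save) row per cycle.

    fetch_order lists instructions in the order they are fetched, including
    any that are later squashed. Empty stages are shown as "-".
    """
    n = len(fetch_order)
    depth = len(STAGES)
    rows = []
    for cycle in range(1, n + depth):
        cells = []
        for stage in range(depth):
            idx = cycle - 1 - stage  # instruction sitting in this stage now
            cells.append(fetch_order[idx] if 0 <= idx < n else "-")
        rows.append((cycle, *cells))
    return rows


taken = ["if (x>0)", "a=0", "b=1", "c=2"]
mispredicted = ["if (x>0)", "a=0 (squash)", "b=1 (squash)", "d=3"]
predicted_not_taken = ["if (x>0)", "d=3"]

for name, order in [("x>0", taken), ("x<0", mispredicted),
                    ("predicting x<0", predicted_not_taken)]:
    print(name, "->", len(pipeline_table(order)), "cycles")
```

The mispredicted case takes as many cycles as the taken case even though it only completes two useful instructions. The correctly predicted case finishes two cycles sooner.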

Page 22: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Branch Prediction

AMD Athlon
- Branch Target Buffer size of 2048 entries
- Branch History Table can store 4096 entries

Intel Pentium III
- Dynamic Branch Predictor can store 512 entries

Approximate Correct Branch Predictions
- AMD Athlon: 95%
- Intel Pentium III: 90-92%
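To give a feel for what a branch history table does, here is a sketch of the textbook 2-bit saturating-counter predictor. It is a generic scheme used for illustration, not a description of the actual Athlon or Pentium III predictors. Only the 4096-entry table size is borrowed from the slide. The class name, the indexing by address modulo table size, and the test trace are all assumptions.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by branch address."""

    def __init__(self, entries=4096):
        self.entries = entries
        # Counter values 0..3; 0-1 predict not taken, 2-3 predict taken.
        self.counters = [1] * entries

    def _index(self, pc):
        return pc % self.entries

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, trace):
    """Fraction of (pc, taken) outcomes in `trace` predicted correctly."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


# Hypothetical trace: a loop branch taken 9 times, then not taken once,
# with the whole loop executed 10 times.
loop_trace = ([(0x400, True)] * 9 + [(0x400, False)]) * 10
print(f"accuracy: {accuracy(TwoBitPredictor(), loop_trace):.2f}")
```

A larger table mainly helps when many different branches compete for entries. This single-branch trace would behave the same with any table size.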

Page 23: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Memory Management

Level 2 Cache
- 512 kB to 8 MB
- Runs at 1/3, 1/2, 2/3, or 1/1 of the clock frequency
- External to the CPU (weakness of Athlon)
- Intel L2: 256 kB ‘on-die’
- Intel moving away from Slot 1 and back to socket
- AMD will need to move to ‘on-die’ and socket connections to stay competitive
- Main push towards the 0.18 µm process

Level 1 Cache
- 64 kB data and instruction caches (4x Pentium III)
- Scalability
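The L2 ratios translate directly into an L2 cache clock. This small sketch uses a 600 MHz core clock, which is a hypothetical example value and not a figure from the slides. The function name is illustrative.

```python
from fractions import Fraction

# L2 cache speed ratios listed on the slide
L2_RATIOS = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3), Fraction(1, 1)]


def l2_clock_mhz(core_mhz, ratio):
    """L2 cache clock for a given core clock and L2:core ratio."""
    return core_mhz * ratio


core_mhz = 600  # hypothetical core clock, for illustration only
for ratio in L2_RATIOS:
    print(f"ratio {ratio}: L2 runs at {float(l2_clock_mhz(core_mhz, ratio)):.0f} MHz")
```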

Page 24: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Which One Is Better?

In the past (286, 386, 486)
- Performance = Frequency

In Today’s World
- Performance = IPC * Frequency

How else do we compare?
- Benchmarking
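The relation Performance = IPC * Frequency shows why clock speed alone is misleading. In the sketch below the IPC values are made up for illustration. They are not measurements of any Athlon or Pentium part.

```python
def performance(ipc, frequency_hz):
    """Instructions per second = instructions per clock x clocks per second."""
    return ipc * frequency_hz


def required_frequency(target_performance, ipc):
    """Clock needed to reach a target instruction rate at a given IPC."""
    return target_performance / ipc


# Hypothetical chips: A has higher IPC, B has lower IPC.
chip_a = performance(ipc=1.2, frequency_hz=1.0e9)
chip_b_clock = required_frequency(chip_a, ipc=0.9)

print(f"Chip A at 1.00 GHz: {chip_a / 1e9:.2f} billion instructions/s")
print(f"Chip B needs {chip_b_clock / 1e9:.2f} GHz to match")
```

With these example numbers, the lower-IPC design needs roughly a third more clock speed to deliver the same instruction rate.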

Page 25: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Benchmarking
- Software that performs different tasks to obtain comparisons between processors.

Problems:
- Processor frequencies.
- Other processes already running.
- Types of programs
  - Some programs are written to take advantage of a certain architecture.
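A tiny timing harness illustrates two of the problems listed above. This is a minimal sketch with made-up workloads, not one of the benchmark suites shown in the following slides. Each task is repeated and the best time is kept, to reduce interference from other running processes. An integer-heavy and a float-heavy task are timed separately because the type of program affects the result.

```python
import time


def benchmark(task, repeats=5):
    """Run task() several times and return the best wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return min(timings)


def integer_workload():
    # Integer multiply-and-add loop (illustrative workload)
    return sum(i * i for i in range(200_000))


def float_workload():
    # Floating point multiply and square root loop (illustrative workload)
    return sum((i * 0.5) ** 0.5 for i in range(200_000))


if __name__ == "__main__":
    for name, task in [("integer", integer_workload), ("float", float_workload)]:
        print(f"{name}: {benchmark(task):.4f} s")
```

Only results from the same workload are directly comparable between machines. The ratio between the two workloads can differ from one architecture to another.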

Page 26: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Photo Editing Software

Page 27: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Animation Software

Page 28: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

3D Graphics Editor

Page 29: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

3D Gaming

Page 30: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Various Benchmarks

Page 31: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Summary
- Over the past couple of years, AMD and Intel have taken different approaches.
- We have gone over the main architectural differences.
- We have shown how they compare.
- It will be very interesting to see how the market plays out.

Page 32: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

Questions?

Page 33: Advanced Micro Devices - Athlon Buddy Guest Mike Lewitt Bill McCorkle November 28, 2001

References
- http://www.amd.com
- http://www.amdzone.com
- http://www.intel.com
- Gardner, Ryan. AMD employee, CPU Specialist. email: [email protected]
- Hsieh, Paul. 7th Generation CPU Comparisons. http://www.azillionmonkeys.com/qed/cpujihad.shtml. 11/30/00
- Pabst, Thomas. The New Athlon Processor – AMD is Finally Overtaking Intel. http://www6.tomshardware.com/cpu/99q3/990809/index.html. 8/9/99
- Pabst, Thomas. AMD Processors vs. Intel Processors – Facts and Lies. http://www6.tomshardware.com/cpu/00q4/001017/athlon-02.html. 10/12/00
- Morgan, Rob. Power Mac G4 Dual 500 vs. Pentium 4 vs. Athlon. http://www.barefeats.com/pentium.html. 1/08/01