comp/elec 525 advanced microprocessor …courses/ee8207/lecture01.pdfadvanced microprocessor...

17
1 COMP/ELEC 525 Advanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 [email protected] Scott Rixner Lecture 1 2 Goals Introduction to research topics in processor design Solid understanding of processor microarchitecture Exposure to systems simulation

Upload: others

Post on 22-Jul-2020

3 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

1

COMP/ELEC 525Advanced Microprocessor Architecture

January 13, 2004

Prof. Scott RixnerDuncan Hall [email protected]

Scott Rixner Lecture 1 2

Goals

Introduction to research topics in processor design�

Solid understanding of processor microarchitecture�

Exposure to systems simulation

Page 2: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

2

Scott Rixner Lecture 1 3

Logistics

Lectures: T/Th 10:50-12:05 KH 101

Lecturer: Prof. Scott Rixner

Labby: Hyong-youb Kim

Grading: 20% Discussion Participation

20% Discussion Leading

60% Project

Readings: Selected research paper from the literature

Available on the web

Scott Rixner Lecture 1 4

Information

URL: http://www.owlnet.rice.edu/~elec525/

Check the web page for announcements

E-mail: [email protected]

[email protected]

Page 3: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

3

Scott Rixner Lecture 1 5

Readings

Each student will lead a class discussion�

Prepare a presentation of the papers– Overview– Critique– Open discussion questions

Lead the ensuing discussion (this will be much easier if the discussion questions are well thought out!)

All students expected to read papers and contribute to the discussion after the presentation

Scott Rixner Lecture 1 6

Project

Work in groups�

Extension of issues discussed in class�

Requires a well-thought out hypothesis and evaluation

Final result– Paper– Presentation in class

Multiple project checkpoints and presentations�

Resulting papers/presentations will be posted on the web

Page 4: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

4

Scott Rixner Lecture 1 7

What Is Computer Architecture?

Technology

ApplicationsComputer Architect

Interfaces

Machine Organization

Measurement &Evaluation

ISA

AP

I

Link

I/O C

han

Regs

IR

Scott Rixner Lecture 1 8

Applications Drive Machine Balance

Numerical simulations– Floating-point performance– Main memory bandwidth

Transaction processing– I/Os per second– Integer CPU performance

Decision support– I/O bandwidth

Embedded control– I/O timing

Media processing– Low-precision ‘pixel’

arithmetic

Page 5: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

5

Scott Rixner Lecture 1 9

Processor Microarchitecture

Iterative process– Generate proposed architecture– Estimate cost– Measure performance

Current emphasis is on overcoming sequential nature of programs– Deep pipelining– Multiple issue– Dynamic scheduling– Branch prediction/speculation

Scott Rixner Lecture 1 10

1971: Intel 4004

First microprocessor�

10 micron process�

2,300 transistors�

3x4 mm die�

4-bit bus�

640 bytes of addressable memory

750 KHz

Page 6: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

6

Scott Rixner Lecture 1 11

1999: Intel Pentium III

3-way superscalar out-of-order issue

MMX and SSE�

0.25 micron process�

9.5 million transistors�

10.2x12.1 mm die�

64-bit bus�

16KB data and 16KB instruction caches

450 MHz

Scott Rixner Lecture 1 12

2000: Intel Pentium 4

Issues up to 5 ops per cycle�

MMX, SSE, and SSE2�

0.18 micron process�

42 million transistors�

217 mm2 die�

64-bit bus�

8KB data cache�

~12000 µµµµop trace cache�

256KB L2 cache�

1.4 GHz

Page 7: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

7

Scott Rixner Lecture 1 13

2003: Intel Pentium 4

Issues up to 5 ops per cycle�

HT, MMX, SSE, and SSE2�

0.13 micron process�

55 million transistors�

146 mm2 die�

64-bit bus�

8KB data cache�

~12000 µµµµop trace cache�

512KB L2 cache�

3.2 GHz

Scott Rixner Lecture 1 14

2011: Projected

0.05 micron process�

7 billion transistors�

30x30 mm die�

10 GHz

???

Page 8: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

8

Scott Rixner Lecture 1 15

Comparison

1979 1999 2011

Technology (nm) 10000 250 50Transistors (millions) 0.0023 9.5 7000Area (mm2) 12 123 817Clock Speed (MHz) 0.75 450 10000

What were the trends in the past?– How did that affect microprocessor design?

What do we expect in the future?– How will this affect microprocessor design?

Scott Rixner Lecture 1 16

Chip Dimensions

Feature size down 13% per year�

Edge length up 6% per year

Page 9: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

9

Scott Rixner Lecture 1 17

Available Area and Delay

Grids up 50% per year�

Gate delay down 13% per year

Scott Rixner Lecture 1 18

Chip Capability

Capacity/delay up 70% per year

Page 10: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

10

Scott Rixner Lecture 1 19

Performance Trends

-Microprocessor Report 1/95

Scott Rixner Lecture 1 20

Wire Delay

Wire delay up 100% per generation�

10’s of clocks to cross chip in 0.1 micron process

-Microelectronic Research Corporation

Page 11: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

11

Scott Rixner Lecture 1 21

Pin Bandwidth

Capability up 70% per year�

Bandwidth up 28% per year

Scott Rixner Lecture 1 22

Power

Power grows proportionally to performance

Page 12: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

12

Scott Rixner Lecture 1 23

DRAMs

Capacity up 60% per year�

Access time up 19% per year

Scott Rixner Lecture 1 24

Summary

Exponential improvement from technology�

Enormous capabilities�

Wire delay, pin bandwidth, power limited�

Impact on processor architecture?

Page 13: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

13

Scott Rixner Lecture 1 25

Other Influences

Compilers– Should the compiler or hardware do it?

Compatibility– What software needs to run?

Cost– Is the performance/functionality worth the cost?

Scott Rixner Lecture 1 26

Areas of Focus

Front end�

Instruction-level parallelism�

Memory system issues�

Technology implications�

Novel processor architectures

Page 14: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

14

Scott Rixner Lecture 1 27

Front End

Delivery of instructions to arithmetic units�

Processor speed is irrelevant if you don’t have operations to perform

Issues– Limited memory/cache bandwidth– Small basic blocks

Scott Rixner Lecture 1 28

Instruction-level Parallelism

Execute multiple, independent instructions in parallel

Most commonly exploited type of parallelism�

Issues– Limited architectural state– Complex scheduling– Complex bookkeeping– Limited accessible ILP

Page 15: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

15

Scott Rixner Lecture 1 29

Memory System Issues

Storage and delivery of data for/to arithmetic units�

Processor speed is irrelevant if you don’t have data to operate on

Issues– Capacity– Speed– Bandwidth

Scott Rixner Lecture 1 30

Technology Implications

How do technology improvements affect microarchitecture?

Architectural scaling?�

Power dissipation?�

Clock speed?�

Wire delay?

Page 16: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

16

Scott Rixner Lecture 1 31

Novel Processor Architectures

What is industry/academia looking at right now?�

Do these architectures address technology issues?�

Do these architecture address changes in application domains?

Scott Rixner Lecture 1 32

Next Class

Lecture/discussion

Papers– “PC Processor Microarchitecture”– “2001 Technology Roadmap for Semiconductors”– “SPEC CPU2000: Measuring CPU Performance in the

New Millenium”

Page 17: COMP/ELEC 525 Advanced Microprocessor …courses/ee8207/lecture01.pdfAdvanced Microprocessor Architecture January 13, 2004 Prof. Scott Rixner Duncan Hall 3028 rixner@rice.edu Scott

17

Scott Rixner Lecture 1 33

Next Week

Tuesday - Branch Prediction– “Branch Prediction Strategies and Branch Target Buffer

Design”– “A Comparison of Dynamic Branch Predictors that use

Two Levels of Branch History”

Thursday - Instruction Fetch– “Cooperative Prefetching: Compiler and Hardware

Support for Effective Instruction Prefetching in Modern Processors”

– “A Scalable Front-End Architecture for Fast Instruction Delivery”