
ECE 475/CS 416 Computer Architecture - Introduction

Edward Suh Computer Systems Laboratory [email protected]

Today’s Agenda

Question 1: What is this course about? What will I learn from it?

Question 2: How will the course be run? What do I need to know?

ECE 475/CS 416 — Computer Architecture, Fall 2008, Suh


Title = “Computer Architecture”

 What is “Computer Architecture”?

 Old definition (80s)=

 Today’s architects must do more; implementation hurdles are more challenging than those in instruction set design


Role of the Computer Architect

To design and engineer the various levels of a computer system to maximize “performance” and programmability within limits of technology and cost.

 Architect must be aware of
•  application characteristics and benchmarks
•  measures of cost and performance
•  technology trends
•  software and hardware interaction


“Performance”?

 Desktop computers
•  Largest market in dollar terms

 Web servers
•  Amazon.com had $1.35MM revenue / hour (2005)

 Embedded / mobile computers


Single-Processor Performance

From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th Edition, 2006


Technology

“If […] history […] teaches us anything, it is that man, in his quest for knowledge and progress, is determined and cannot be deterred.” John F. Kennedy (1962)

 Amazing yearly advances
•  ~60% more devices per chip (doubles every 18 months)
•  ~15% faster devices (doubles every 5 years)
•  disks increase ~60% in capacity
•  circuit boards increase ~5% in wire density

 Faster devices and advances in circuit design improve performance
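The doubling times quoted for the yearly advances follow from compound growth. A minimal sketch, assuming each rate is a constant annual growth rate (the function name `doubling_time_years` is made up for this illustration):

```python
import math

def doubling_time_years(annual_growth):
    """Years needed to double at a fixed annual growth rate (0.60 means 60%)."""
    return math.log(2) / math.log(1 + annual_growth)

# Annual rates as quoted on the slide.
rates = {
    "devices per chip": 0.60,
    "device speed": 0.15,
    "disk capacity": 0.60,
    "circuit-board wire density": 0.05,
}

for name, rate in rates.items():
    years = doubling_time_years(rate)
    print(f"{name}: doubles every {years:.1f} years ({years * 12:.0f} months)")
```

Under this assumption, 60% per year works out to roughly 18 months per doubling and 15% per year to roughly 5 years, which matches the figures in parentheses above.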


Clock Frequency Growth Rate

Source: Intel

30% per year


Architecture Contribution

Part I. Single-Core Processors

What kinds of architectural innovations enabled the uni-processor performance improvement over the past 20 years?

Same program (binary) Runs 1.58x faster each year!!
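One rough way to read the two growth numbers together (1.58x per year overall, 30% per year from clock frequency) is to divide them. This is a sketch under the assumption that performance is clock rate times work done per cycle, so whatever clock frequency does not explain is attributed to architecture. The function name is hypothetical.

```python
def non_clock_factor(perf_growth, clock_growth):
    """Yearly speedup left over once clock-frequency growth is factored out."""
    return perf_growth / clock_growth

PERF_GROWTH = 1.58   # "Runs 1.58x faster each year"
CLOCK_GROWTH = 1.30  # "30% per year" clock frequency growth

yearly = non_clock_factor(PERF_GROWTH, CLOCK_GROWTH)
print(f"Per-year factor not explained by clock rate: {yearly:.2f}x")
print(f"Compounded over 20 years: {yearly ** 20:.0f}x")
```

The split is only as good as the assumption behind it. Faster clocks also depend on architectural changes such as deeper pipelines, so the two factors are not truly independent.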


Moore’s Law: 2X transistors / “year”

 “Cramming More Components onto Integrated Circuits”, Gordon Moore, Electronics, 1965
 The number of transistors on a cost-effective integrated circuit doubles every N months (12 ≤ N ≤ 24)

Source UCB EECS 252 notes
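The range 12 ≤ N ≤ 24 matters more than it may look. A small sketch of how much the choice of N changes the projected growth over a fixed span (the 10-year horizon and the function name are arbitrary choices for illustration):

```python
def growth_factor(years, doubling_months):
    """Transistor-count multiplier after `years` if the count doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

for n in (12, 18, 24):
    print(f"N = {n} months: {growth_factor(10, n):.0f}x after 10 years")
```

Doubling every 12 months gives about a thousandfold increase in a decade, while doubling every 24 months gives 32x.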


CPUs: Archaic vs. Modern

1982 Intel 80286
•  12.5 MHz
•  2 MIPS (peak)
•  Latency 320 ns
•  134,000 xtors, 47 mm²
•  16-bit data bus, 68 pins
•  Microcode interpreter, separate FPU chip
•  (no caches)

2001 Intel Pentium 4
•  1500 MHz (120X)
•  4500 MIPS (peak) (2250X)
•  Latency 15 ns (20X)
•  42,000,000 xtors, 217 mm² (310X)
•  64-bit data bus, 423 pins
•  3-way superscalar, dynamic translation to RISC, superpipelined (22 stages), out-of-order execution
•  On-chip 8KB data cache, 96KB instruction trace cache, 256KB L2 cache

Source UCB EECS 252 notes


Memory: Archaic vs. Modern

1980 DRAM (async)
•  0.06 Mbits/chip
•  64,000 xtors, 35 mm²
•  16-bit data bus per module, 16 pins/chip
•  13 Mbytes/sec
•  Latency: 225 ns
•  (no block transfer)

2000 DDR SDRAM (clocked)
•  256.00 Mbits/chip (4000X)
•  256,000,000 xtors, 204 mm²
•  64-bit data bus per DIMM, 66 pins/chip (4X)
•  1600 Mbytes/sec (120X)
•  Latency: 52 ns (4X)
•  Block transfers (page mode)

Source UCB EECS 252 notes


Disk: Archaic vs. Modern

CDC Wren I, 1983
•  3600 RPM
•  0.03 GBytes capacity
•  Tracks/Inch: 800
•  Bits/Inch: 9550
•  Three 5.25” platters
•  Bandwidth: 0.6 MBytes/sec
•  Latency: 48.3 ms
•  Cache: none

Seagate 373453, 2003
•  15000 RPM (4X)
•  73.4 GBytes (2500X)
•  Tracks/Inch: 64000 (80X)
•  Bits/Inch: 533,000 (60X)
•  Four 2.5” platters (in 3.5” form factor)
•  Bandwidth: 86 MBytes/sec (140X)
•  Latency: 5.7 ms (8X)
•  Cache: 8 MBytes

Source UCB EECS 252 notes


LANs: Archaic vs. Modern

Ethernet 802.3, 1978
•  Bandwidth: 10 Mbits/s
•  Latency: 3000 µsec
•  Shared media
•  Coaxial cable

Ethernet 802.3ae, 2003
•  Bandwidth: 10,000 Mbits/s (1000X)
•  Latency: 190 µsec (15X)
•  Switched media
•  Category 5 copper wire

Source UCB EECS 252 notes

[Figure: cable cross-sections]
Coaxial cable: copper core, insulator, braided outer conductor, plastic covering
Twisted pair: copper, 1 mm thick, twisted to avoid antenna effect; “Cat 5” is 4 twisted pairs in a bundle
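The four “Archaic vs. Modern” comparisons share a pattern that can be checked directly from the numbers on the slides: bandwidth improved far more than latency. The sketch below recomputes the ratios. Treating peak MIPS as the CPU's “bandwidth” is an assumption made here for the comparison, and the table and function names are illustrative only.

```python
# (old bandwidth, new bandwidth, old latency, new latency), as printed on the slides.
# Units differ per row but cancel inside each ratio.
TECH = {
    "CPU (peak MIPS; ns)": (2, 4500, 320, 15),
    "DRAM (Mbytes/sec; ns)": (13, 1600, 225, 52),
    "Disk (MBytes/sec; ms)": (0.6, 86, 48.3, 5.7),
    "LAN (Mbits/s; latency as printed)": (10, 10_000, 3000, 190),
}

def improvement(bw_old, bw_new, lat_old, lat_new):
    """Return (bandwidth gain, latency gain), both as 'times better'."""
    return bw_new / bw_old, lat_old / lat_new

for name, numbers in TECH.items():
    bw_gain, lat_gain = improvement(*numbers)
    print(f"{name}: bandwidth {bw_gain:.0f}X, latency {lat_gain:.0f}X")
```

In every row the bandwidth gain is at least an order of magnitude larger than the latency gain. This is one motivation for techniques that hide latency, or trade bandwidth for it, rather than try to reduce it directly.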


How Did We Get Performance?

 Trade off transistors and bandwidth for latency
 Take advantage of parallelism
 Principle of locality
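To make the principle of locality concrete, here is a toy direct-mapped cache model, a sketch only and not any particular processor's design. The cache geometry (64 lines of 32 bytes) and the three access patterns are assumptions chosen for illustration.

```python
def hit_rate(addresses, num_lines=64, line_bytes=32):
    """Simulate a direct-mapped cache and return the fraction of accesses that hit."""
    lines = [None] * num_lines          # block number currently held by each line
    hits = 0
    for addr in addresses:
        block = addr // line_bytes      # which memory block the byte address falls in
        index = block % num_lines       # the single line that block may occupy
        if lines[index] == block:
            hits += 1
        else:
            lines[index] = block        # miss: fetch the block, evicting the old one
    return hits / len(addresses)

# Spatial locality: walk an array of 4-byte words in order.
sequential = [4 * i for i in range(4096)]
# Temporal locality: loop repeatedly over a 256-word array that fits in the cache.
looped = [4 * (i % 256) for i in range(4096)]
# No locality: a stride equal to the cache size maps every access to the same line.
strided = [64 * 32 * i for i in range(4096)]

for name, trace in [("sequential", sequential), ("looped", looped), ("strided", strided)]:
    print(f"{name}: hit rate {hit_rate(trace):.3f}")
```

The sequential walk misses once per 32-byte line and then hits on the other seven words. The loop misses only on its first pass, and the large stride never hits at all.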


Pipelined Instruction Execution

[Figure: four instructions in program order (vertical axis) against time in clock cycles, Cycle 1 through Cycle 7 (horizontal axis). Each instruction passes through the stages Ifetch, Reg, ALU, DMem, Reg.]
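The pipeline diagram can be regenerated as text. This is a minimal sketch assuming an ideal five-stage pipeline with no stalls, in which each instruction enters one cycle after the previous one. The stage names come from the figure; the function names are hypothetical.

```python
STAGES = ["Ifetch", "Reg", "ALU", "DMem", "Reg"]

def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one {cycle: stage} map per instruction for an ideal, stall-free pipeline."""
    return [
        {i + s + 1: name for s, name in enumerate(stages)}  # instruction i starts in cycle i + 1
        for i in range(num_instructions)
    ]

def total_cycles(num_instructions, stages=STAGES):
    """Cycles until the last instruction leaves the pipeline."""
    return num_instructions + len(stages) - 1

n = 4
cycles = total_cycles(n)
print("        " + "".join(f"C{c:<7}" for c in range(1, cycles + 1)))
for i, row in enumerate(pipeline_schedule(n)):
    cells = "".join(f"{row.get(c, ''):<8}" for c in range(1, cycles + 1))
    print(f"instr {i + 1} {cells}")
```

Without pipelining, n instructions would need n × 5 cycles. With it they need n + 4, so throughput approaches one instruction per cycle while each instruction's own latency is unchanged.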


Why Slowdown?

From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th Edition, 2006


Architecture at a Crossroads

 How many cores does your computer have?

 Uniprocessor performance now 2x / 5(?) years
•  “Power wall”: power consumption limits the transistors that can be turned on
•  “ILP wall”: law of diminishing returns on more HW for ILP
•  “Memory wall”: off-chip memory accesses take hundreds of CPU cycles

 Change in chip design: multiple “cores”: Thread Level Parallelism (TLP)

 All microprocessor companies switch to multiprocessors (AMD, Intel, IBM, Sun; all new Apples have 2 CPUs)
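A back-of-the-envelope view of the “memory wall”: even rare off-chip accesses dominate execution time when each one costs hundreds of cycles. The numbers below (base CPI of 1, one off-chip miss per 100 instructions, 200-cycle penalty) are hypothetical values picked for illustration and are not taken from the slides.

```python
def effective_cpi(base_cpi, misses_per_instruction, miss_penalty_cycles):
    """Average cycles per instruction once memory stall cycles are added in."""
    return base_cpi + misses_per_instruction * miss_penalty_cycles

BASE_CPI = 1.0               # hypothetical: one instruction per cycle when everything hits
MISSES_PER_INSTRUCTION = 0.01  # hypothetical: 1 off-chip access per 100 instructions
MISS_PENALTY = 200           # hypothetical: "hundreds of CPU cycles"

cpi = effective_cpi(BASE_CPI, MISSES_PER_INSTRUCTION, MISS_PENALTY)
print(f"Effective CPI: {cpi:.1f}")
print(f"Fraction of time stalled on memory: {(cpi - BASE_CPI) / cpi:.0%}")
```

With these assumed values the processor spends about two thirds of its time waiting on memory, so making the core itself faster yields little.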

“We are dedicating all of our future product development to multicore designs. … This is a sea change in computing”

Paul Otellini, President, Intel (2004)


A Peek at the Syllabus

 Cost and performance
 In-order processors
 Memory hierarchy
 Out-of-order processors
 Branch prediction
 Speculative execution
 Superscalar processors
 VLIW, Vector
 Simultaneous multithreading (a.k.a. Hyperthreading™)
 Multicore hardware, parallel processing
 Virtual machines, I/O, networks


Labs

 Verilog design projects
•  incredibly useful language to know; industry loves Verilog
•  projects done in teams of two

 Expand on a basic MIPS R3000 processor
•  Lab 0: Welcome to Verilog (not graded)
•  Lab 1: Get used to processor model, fix bugs, add instructions
•  Lab 2: Pipeline model, add forwarding logic
•  Lab 3: Add caches and cache controller
•  Lab 4: Final Lab (next)


Final Lab

 Superscalar (dual-issue) pipeline

 Design a processor extension of your choosing
•  branch prediction
•  dynamic scheduling
•  hardware prefetchers
•  speculative loads
•  multiple-level caches
•  instruction set extensions
•  [your idea here]

 Project report required


What You Will Learn

 How to evaluate architectural decisions
•  You will need to choose among different designs

 Architectural techniques in modern microprocessors
•  Go from 1986 (ECE 314) to 2002
•  Apply to your own designs

 Why processors are moving towards “multi-cores”

 Problems and solutions in multi-core systems


What Do I Need to Know?

 You are expected to know MIPS ISA and Verilog
•  alternatively, you are expected to learn them quickly and on your own

 What about C/C++?
•  as a computer engineer you should know C
•  we use small C programs to test Verilog designs

 What about Unix/Linux?
•  basic Unix skills you should have or acquire:
–  elementary tasks: logging in, changing password, manipulating files, etc.
–  familiarity with a Unix text editor of your choosing (e.g., vi, emacs)


ECE 475/CS 416 Requirements

 Prerequisites
•  ENGRD 230 or equivalent, and ECE 314 or equivalent
–  logic design, FSM design
–  basic computer organization

 Assets
•  passion for computer hardware
•  prior exposure to Unix and/or Verilog
•  ability to work nonstop for extended periods of time

 You should not take this course if any of these apply
•  you do not meet the prerequisites
•  your schedule and/or lifestyle won’t fit a(nother) high-workload course


Staff

 Instructor: Edward Suh, 338 Rhodes, office hours TuTh 11am-Noon

 Teaching assistants: (office hours TuWTh 7-10pm, PH329)
•  Jiho Choi
•  Mark Cianchetti
•  Richard Hough
•  Yuan Ning
•  KK Yu

 If you must, use the staff’s email: [email protected]
•  but we may post your question (and our answer) on Blackboard


Computing Resources

 Blackboard is used for course communication
•  Announcements (e.g., errata, date changes)
•  Handouts and lecture notes: print out before coming to lectures
•  Questions / Answers
http://blackboard.cornell.edu/

 All assignments handled through CMS
http://cscms.cit.cornell.edu/

 ECE Computing Labs for lab assignments

Course Components

 Lectures: TuTh 2:55-4:10, PH219
•  Download notes from Blackboard “Course Documents”
•  5 min break in the middle

 4 Homeworks
•  Individual assignment

 4 Labs
•  Group of (one or) two

 2 Exams
•  Prelim & Final


Grading

 Grade distribution:
•  Homework 15%
–  Individual
•  Midterm 15% (Oct. 16)
•  Final 25% (TBA)
•  Verilog projects 40% (5% + 5% + 10% + 20%)
–  Group of one or two
•  Class participation 5% / Half grade at my discretion

 Late policy: 1 min late = not submitted = zero (I’m not kidding)
•  but you have one lifeline on one assignment – 24 hours
–  all parties involved must have lifeline available


A Few Rules

 When in trouble with the material
•  Use Blackboard! It’s likely your question has been asked and answered
–  do not send me questions by email
•  Observe office hours; we are all very busy
–  do not randomly drop by
•  Ask in class!
–  good citizen’s hallmark: in-class participation

 I have a keen eye and no tolerance for cheating
•  disciplinary hearings are no fun
•  check Cornell’s Code of Academic Integrity
http://cuinfo.cornell.edu/Academic/AIC.html


Textbook

Computer Architecture: A Quantitative Approach, 4th Ed. by John L. Hennessy and David A. Patterson

Morgan Kaufmann Publishers


FAQ

 I have a question about ECE 475/CS 416
•  office hours: TuTh 11-Noon, 338 Rhodes Hall

 I have a question about conducting research in your group
•  office hours: TuTh 11-Noon, 338 Rhodes Hall

 What courses complement ECE 475/CS 416?
•  ECE 474 (VLSI), CS 412/413 (Compilers), CS 414 (OS)