
088768 - ARCHITETTURE AVANZATE DEI CALCOLATORI

Prof. Cristina Silvano
email: [email protected]

Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)
Politecnico di Milano

http://home.deib.polimi.it/silvano/aac.htm

Academic Year (AA) 2013/2014


Cristina Silvano – Politecnico di Milano - 2 -

Goals of the AAC course

Provide an overview of the most recent and advanced computer architectures

Introduce the basic micro-architectural mechanisms found in modern microprocessor architectures

Provide the reasoning behind the adoption of advanced computer architectures


Advanced Computer Architectures: IBM Blue Gene/P Supercomputer


Advanced Computer Architectures: Smartphones


Advanced Computer Architectures: Intel® Core™ i7-3770T Processor (Ivy Bridge, up to 3.70 GHz)

160 mm² die at 22 nm, 1.40 billion transistors.

# of Cores: 4
# of Threads: 8
Clock Speed: 2.5 GHz
Max Turbo Frequency: 3.7 GHz
Intel® Smart Cache: 8 MB
Instruction Set: 64-bit
Instruction Set Extensions: SSE4.1/4.2, AVX
Embedded Options Available: No
Lithography: 22 nm
Max TDP: 45 W
Recommended Customer Price: TRAY: $294.00
Max Memory Size: 32 GB
Memory Types: DDR3-1333/1600
# of Memory Channels: 2
Max Memory Bandwidth: 25.6 GB/s
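The maximum memory bandwidth figure follows from the memory type and the channel count listed above. The sketch below reproduces it, assuming the standard 64-bit data width of a DDR3 channel (the width is not stated on the slide); the function name is illustrative only.

```python
def peak_bandwidth_gb_s(transfers_per_s: float, bus_width_bits: int, channels: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_s * bytes_per_transfer * channels / 1e9


# DDR3-1600 performs 1600 million transfers per second.
# Assumption: each channel is 64 bits wide, as in standard DDR3 modules.
ddr3_1600_dual = peak_bandwidth_gb_s(1600e6, bus_width_bits=64, channels=2)
print(f"{ddr3_1600_dual:.1f} GB/s")  # prints "25.6 GB/s"
```

This is a theoretical peak; sustained bandwidth in real workloads is lower.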


ARM Cortex-A8 core processor in Apple A4 System-on-Chip

Based on the ARMv7 architecture

It's a dual-issue in-order execution design

The Apple A4 at 1 GHz (45 nm, manufactured by Samsung from March 2010 to present), a System-on-Chip that combines an ARM Cortex-A8 and a PowerVR GPU, is in the:
• Original iPad: April 2010
• iPhone 4: June 2010 (Black; GSM), February 2011 (Black; CDMA), April 2011 (White; GSM & CDMA)
• iPod Touch (4th generation): September 2010 (Black model), October 2011 (White model)
• Apple TV (2nd generation): September 2010


ARM Cortex-A9 MP core processor in Apple A5 System-on-Chip

Based on the ARMv7 architecture

It's a dual-issue out-of-order execution design

The Apple A5 at 1 GHz (45 nm to 32 nm, manufactured by Samsung from March 2011 to present), a System-on-Chip that combines a dual-core ARM Cortex-A9 with NEON SIMD accelerator and a dual-core PowerVR GPU, is in the:
• iPad 2 (A5 dual-core 45 nm): March 2011; (A5 dual-core 32 nm): March 2012
• iPhone 4S (A5 dual-core 45 nm): October 2011
• Apple TV 3rd generation (A5 single-core, 32 nm): March 2012
• iPod Touch 5th generation (A5 dual-core 32 nm): October 2012
• iPad Mini (A5 dual-core 32 nm): November 2012


Apple A6 System-on-Chip

The Apple A6 SoC was introduced in September 2012 for the iPhone 5.

Apple states that it is up to twice as fast and has up to twice the graphics power of its predecessor, the Apple A5.

The A6 uses a 1.3 GHz custom Apple-designed ARMv7-based dual-core CPU, called Swift, and an integrated triple-core PowerVR SGX 543MP3 GPU.

The A6 chip for the iPhone 5 incorporates 1 GB of LPDDR2-1066 RAM, doubling the memory capacity of the iPhone 4S while increasing the theoretical memory bandwidth from 6.4 GB/s to 8.5 GB/s.


Apple A6 System-on-Chip

• ARMv7s ISA dual core
• Triple-core PowerVR SGX 543MP3 GPU
• 1 MB L2 cache
• 1.3 GHz
• 32 nm Samsung
• 96.71 mm² (22% smaller than A5)


Moore's Law (1965) states that the number of transistors on a processor doubles every 18 to 24 months
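To get a feel for what the doubling rule implies, the following sketch projects a transistor count forward for both ends of the 18 to 24 month range. The starting count and the 6-year horizon are illustrative assumptions, not data from the slides, and the function name is hypothetical.

```python
def projected_transistors(initial: float, months_elapsed: float, doubling_months: float) -> float:
    """Transistor count after months_elapsed, doubling every doubling_months."""
    return initial * 2 ** (months_elapsed / doubling_months)


start = 1.4e9        # illustrative starting count (assumption)
horizon_months = 72  # 6 years (assumption)

for doubling_months in (18, 24):
    projected = projected_transistors(start, horizon_months, doubling_months)
    growth = projected / start
    print(f"doubling every {doubling_months} months: x{growth:.0f} after {horizon_months} months")
```

Over the same horizon the two doubling periods differ by a factor of two in the projected count (16x versus 8x growth), which shows how sensitive the projection is to the chosen period.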


Stopper: Max. Clock Freq. Wall

Chip density continues to increase ~2x every 2 years

Clock speed does not

Expose parallelism at a coarser level than ILP

Source: Intel, Microsoft (Sutter) and Stanford (Olukotun, Hammond)


Stopper: On-Chip Temperature Wall


Paradigm shift: Multi-core architectures

[Figure: ARM9 core area by technology node]
• 180 nm: 11.8 mm²
• 130 nm: 5.2 mm²
• 90 nm: 2.6 mm²
• 65 nm: 1.4 mm²

Source: STMicroelectronics


Intel 80 core


NVIDIA Fermi GPU


NVIDIA Tesla GPU

Kepler GK110 Architecture
• 7.1B Transistors
• 15 SMX units (2880 cores)
• >1 TFLOP FP64
• 1.5 MB L2 Cache
• 384-bit GDDR5
• PCI Express Gen3


Dark Silicon Problem

DARK SILICON: chip fraction not usable due to the power budget

Processor frequency is affected by technology effects (e.g. Vth)


AAC Course Schedule

Schedule: First Semester 2013-2014 (FALL 2013)

WEDNESDAY 10.15 - 12.15 Location: L.26.11 Leonardo Campus

THURSDAY 10.15 - 12.15 Location: L.26.16 Leonardo Campus


Contact Information

Office hours for students: Tuesday 10.00 - 11.00 at DEIB, Via Ponzio 34/5, first floor. Internal phone number: 3692 (it is better to send an email to make an appointment).

Main contact: students can contact prof. Cristina Silvano by e-mail ([email protected]), indicating: Subject: AAC COURSE Milano, Your_Surname, Your_Name, Your_POLIMI_ID_NUMBER

Please use your POLIMI student e-mail account: [email protected]


AAC Teaching Assistants

Prof. Giovanni Agosta, e-mail ([email protected])

Prof. Gerardo Pelosi, e-mail ([email protected])


AAC Course Info

Teaching Activity: The course is worth 5 CFU (university credits) and is organized into 30 hours of lectures and 20 hours of written/tool-based exercises that put into practice the concepts presented during the lectures.

Prerequisites: Basic concepts of logic design and computer architecture.


AAC Final Exam

FINAL EXAM: The final exam is a written exam. Each written exam is scored out of a maximum of 33 points: up to 15 points for the exercise part and up to 18 points for the theory part.

Students may ask the instructor for an OPTIONAL project. The project must be completed by January 31st, 2014 (firm deadline). The project awards up to 5 additional points. These points are added to the written exam score only if the written exam score is sufficient (>=18).


AAC Teaching Material

Additional information in slides and papers is available through the course webpage: http://home.dei.polimi.it/silvano/AAC.htm
• If you use MOZILLA FIREFOX as your web browser, please use the SAVE AS option and save the PDF file on your laptop so that the PDF slides display and print correctly.

Reference Book: "Computer Architecture, A Quantitative Approach", John Hennessy, David Patterson, Morgan Kaufmann, Fourth Edition.


Support for the international students

The AAC course is offered in Italian

Teaching materials (slides/papers/textbook) are available in English

The final exam can be taken in English

Teaching support is available in English

Please note that international students can follow the HPPS course (High Performance Processors and System) held by prof. Donatella Sciuto during the Second Semester 2013 - 2014. The HPPS course is offered entirely in English. The AAC course objectives and program are aligned with the HPPS course.

Cristina Silvano – Politecnico di Milano March 2013- 24 -


Overview of the AAC topics

How can we increase performance while decreasing the design cost?
• RISC: Reduced Instruction Set Computer
• Pipeline

Can we gain more?
• Branch prediction
• Instruction Level Parallelism (ILP)
• Multithreading
• Multiprocessors

Performance still does not scale?
• Memory hierarchy
• Cache organization


Main lecture topics (1)

Review of basic computer architecture definitions and components (Central Processing Unit, Memory System, Input/Output Interfaces, Communication System)

Basic performance evaluation metrics of computer architectures

Memory hierarchy: Basic and advanced concepts. Multi-level caches. Performance evaluation, optimisation techniques.

Central Processing Unit: the RISC approach (Reduced Instruction Set Computer).


Main lecture topics (2)

Techniques for performance optimization:
• Pipelining: the problem of hazards (structural, control and data hazards); optimization techniques to solve the problem of hazards
• Branch prediction techniques: static and dynamic branch prediction techniques
• Speculative execution


Sequential vs. Pipelining Instruction Execution

[Figure: timing diagram of instructions I1 and I2, each going through the stages IF, ID, EX, MEM, WB; each instruction is labeled 10 ns.]
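The difference between sequential and pipelined execution can be sketched with a simple timing model. The sketch assumes five stages (IF, ID, EX, MEM, WB) of 2 ns each, which matches the 10 ns per instruction shown in the figure, and an ideal pipeline with no hazards or stalls; the function names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
STAGE_NS = 2  # assumption: 5 stages x 2 ns = 10 ns per instruction


def sequential_time_ns(n_instructions: int, stage_ns: int = STAGE_NS) -> int:
    """Each instruction runs all stages before the next one starts."""
    return n_instructions * len(STAGES) * stage_ns


def pipelined_time_ns(n_instructions: int, stage_ns: int = STAGE_NS) -> int:
    """Ideal pipeline: the first instruction fills the pipeline,
    then one instruction completes per clock cycle (no hazards assumed)."""
    if n_instructions == 0:
        return 0
    return (len(STAGES) + n_instructions - 1) * stage_ns


for n in (2, 1000):
    seq = sequential_time_ns(n)
    pipe = pipelined_time_ns(n)
    print(f"{n} instructions: sequential {seq} ns, pipelined {pipe} ns, speedup {seq / pipe:.2f}")
```

For long instruction streams the speedup approaches the number of stages (5 here); hazards, covered later in the course topics, reduce it in practice.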


Main lecture topics (3)

Instruction Level Parallelism (ILP):
• Static and dynamic scheduling
• Superscalar architectures
• VLIW (Very Long Instruction Word) architectures


Instruction Level Parallelism: Example of 2-issue processor

[Figure: pipeline timing diagram of a 2-issue processor. Instructions I1 to I10 enter the stages IF, ID, EX, MEM, WB two at a time; the clock cycle is labeled 2 ns; the horizontal axis is time.]

IPC = Instructions Per Clock = 2

CPI = Clocks Per Instruction = 0.5
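The IPC and CPI values above are ideal steady-state figures. A minimal sketch of the arithmetic, assuming a 5-stage pipeline, no hazards, and fully independent instructions (function names are illustrative):

```python
import math


def ideal_cycles(n_instructions: int, issue_width: int, depth: int = 5) -> int:
    """Clock cycles to complete n instructions on an ideal multiple-issue
    pipeline: instructions enter in groups of issue_width, one group per cycle."""
    if n_instructions == 0:
        return 0
    groups = math.ceil(n_instructions / issue_width)
    return depth + groups - 1


def steady_state_cpi(issue_width: int) -> float:
    """Ideal CPI once the pipeline is full: the reciprocal of the IPC."""
    return 1 / issue_width


issue_width = 2
print("ideal IPC:", issue_width)                    # 2
print("ideal CPI:", steady_state_cpi(issue_width))  # 0.5

# Ten instructions, as in the figure: 5 groups of 2 need 5 + 5 - 1 = 9 cycles.
cycles = ideal_cycles(10, issue_width)
print("cycles for 10 instructions:", cycles)
print("effective CPI including pipeline fill:", cycles / 10)
```

For a short run of 10 instructions the effective CPI is 0.9 because of the pipeline fill time; it tends toward 0.5 only as the instruction count grows.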


Beyond ILP: Multithreading

Threads: Independent sequences of instructions

Single-threaded program Multi-threaded program
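As a software-level illustration of the two program structures, the sketch below computes the same sum once as a single sequence of instructions and once split across independent threads. The data, the thread count, and the function names are assumptions made for this example. Note that in standard CPython the global interpreter lock prevents threads from executing Python bytecode in parallel, so the sketch shows the structure of a multi-threaded program rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one thread: an independent sequence of instructions."""
    return sum(chunk)


def single_threaded(data):
    """One thread processes all the data."""
    return sum(data)


def multi_threaded(data, n_threads=4):
    """Split the data into chunks and let each thread sum one chunk."""
    size = max(1, (len(data) + n_threads - 1) // n_threads)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(partial_sum, chunks))


data = list(range(1000))
print(single_threaded(data), multi_threaded(data))
```

The chunks are independent of one another, which is what lets hardware with thread-level parallelism run such threads concurrently.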


Main lecture topics (4)

Beyond ILP:
• Multithreading (Thread Level Parallelism, TLP)
• Multiprocessors and multicore systems: taxonomy, topologies, communication management, memory management, cache coherency protocols, examples of architectures
• System-on-Chip and Network-on-Chip architectures; Digital Signal Processors; Stream processors and vector processors; Graphic Processors