Graphite Power Models Core, Cache and Network

Upload: vuquynh

Post on 26-Jul-2018


Graphite Power Models

Core, Cache and Network

Power Models

• Cache & Directory (using McPAT)

– http://www.hpl.hp.com/research/mcpat/

– Uses CACTI for modeling data and tag arrays

• Network (using Orion 2.0)

– http://projects.csail.mit.edu/cgi-bin/wiki/view/LSPgroup/OrionPage

– Upgrading to DSENT [Chen et al., NOCS 2012]

• Core (using McPAT)

– Currently validating against real hardware


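Modeling Framework

[Figure: block diagram of the modeling framework. Blocks shown:]

• Graphite Config File: Core Params, Network Params, Memory Subsystem
• Technology Parameters (VDD, T, Wmin)
• Modeling tools: McPAT / CACTI, Orion 2.0 / DSENT
• Benchmark
• Graphite models: Core Models, Memory Subsystem Models, Network Models
• Results: Area, Performance, Energy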

Cache (/Directory) Power Models: Modeled Components

• Tag Array
• Data Array
• Miss Status Buffer
• Fill Buffer
• Write-back Buffer

[Figure: example cache contents showing tag-array entries (1 S 00 0x1AC9, 1 M 10 0xB456, 0 I), data-array cache lines, and a read of address 0x1AB9]

Core Power Models: Modeled Components

• Execution Unit
– Instruction Window
– Integer ALUs
– Floating Point Units (FPUs)
– Complex ALUs (Mul/Div)
– Results Broadcast Bus
• Instruction Fetch Unit
– Instruction Buffer
– Instruction Decoder
• Load Store Unit
– Load/Store Buffers
• Memory Management Unit
– I-TLB
– D-TLB
• Register Files
– Integer RF
– Floating Point RF

Network Power Models: DSENT

Chen et al., NOCS 2012

Georgas et al., CICC 2011

Network Power Models: DSENT Calibration on Network Components

• DSENT fully validated against SPICE

• Energy estimates within 10% of SPICE, with timing constraints satisfied

Power Models Current Status

• Cache Model

– Models for L1-I cache, L1-D cache, L2 cache and directory are in place

• Network Model

– Currently uses Orion 2.0

– Integration of Graphite with DSENT is being carried out

• Core Model

– Currently validating model against real hardware running multicore applications


Backup Slides


Core Architectural Configuration

• General Parameters
– clock_rate
– core_tech_node
– instruction_length
– opcode_width
– machine_type
– num_hardware_threads
– fetch_width
– num_instruction_fetch_ports
– decode_width
– issue_width
– commit_width
– fp_issue_width
– prediction_width
– integer_pipeline_depth
– fp_pipeline_depth
– ALU_per_core
– MUL_per_core
– FPU_per_core
– instruction_buffer_size
– decoded_stream_buffer_size
• Register File
– arch_regs_IRF_size
– arch_regs_FRF_size
– phy_regs_IRF_size
– phy_regs_FRF_size
• Load-Store Unit
– LSU_order
– store_buffer_size
– load_buffer_size
– num_memory_ports
– RAS_size

Core Event Counters

• Instruction Counters
– total_instructions
– int_instructions
– fp_instructions
– branch_instructions
– branch_mispredictions
– load_instructions
– store_instructions
– committed_instructions
– committed_int_instructions
– committed_fp_instructions
• Cycle Counters
– total_cycles
– idle_cycles
– busy_cycles
• Reg File Access Counters
– ialu_accesses
– mul_accesses
– fpu_accesses
– cdb_alu_accesses
– cdb_mul_accesses
– cdb_fpu_accesses
• Execution Unit Access Counters
– ialu_accesses
– mul_accesses
– fpu_accesses
– cdb_alu_accesses
– cdb_mul_accesses
– cdb_fpu_accesses

Network Power Modeling

• Modeling Tool:
– “Orion: A Power-Performance Simulator for Interconnection Networks”
– http://projects.csail.mit.edu/cgi-bin/wiki/view/LSPgroup/OrionPage
• Tracked Events:
– Link Traversals
– Router Buffer Reads/Writes
– Router Switch Allocator Requests
– Router Crossbar Traversals (Unicast/Multicast)

Cache Power Modeling

• Modeling Tool:
– “McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures”
– http://www.hpl.hp.com/research/mcpat/
• Tracked Events:
– Directory Cache Accesses
– L1/L2 Cache Data Reads
– L1/L2 Cache Data Writes
– L1/L2 Cache Tag Accesses

Core Power Modeling

• Modeling Tool:
– “McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures”
– http://www.hpl.hp.com/research/mcpat/
• Example Events:
– Integer/Floating Point add
– Integer/Floating Point subtract
– Integer/Floating Point multiply
– Integer/Floating Point divide

Power Models

• Activity counters track events in each modeled component

– Total Dynamic Energy = sum over event types of (Event Count x Dynamic Energy per event)

– Total Static Energy = Completion Time x Static Power of each component, summed over components
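The two formulas above can be sketched in a few lines of Python. This is only an illustration of the accounting: the function names, event names, per-event energies and static-power figures are hypothetical placeholders, not values produced by McPAT, Orion or DSENT.

```python
def total_dynamic_energy(event_counts, energy_per_event_j):
    """Sum over event types of (event count x dynamic energy per event), in joules."""
    return sum(count * energy_per_event_j[name]
               for name, count in event_counts.items())


def total_static_energy(completion_time_s, static_power_w):
    """Completion time x static power, summed over components, in joules."""
    return completion_time_s * sum(static_power_w.values())


# Placeholder inputs (assumed values, for illustration only).
event_counts = {"tag_access": 1000, "data_read": 800, "data_write": 200}
energy_per_event_j = {"tag_access": 1e-12, "data_read": 5e-12, "data_write": 6e-12}
static_power_w = {"l1_d_cache": 0.002, "l2_cache": 0.010}
completion_time_s = 1e-3

dynamic_j = total_dynamic_energy(event_counts, energy_per_event_j)
static_j = total_static_energy(completion_time_s, static_power_w)
total_j = dynamic_j + static_j
```

Dynamic energy depends only on the event counts, while static energy grows with completion time, so a slower run of the same benchmark costs more static energy even if the counts are unchanged.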


Overall Modeling Flow

[Figure: overall modeling flow, organized as Inputs, Tools and Outputs]

• Inputs
– Graphite running the Benchmark: Core Models, Cache Models, Network Models
– Core Counters
– Cache Counters
– Electrical / Optical Router & Link Counters
– Electrical Technology Parameters
– Optical / Electrical Technology Parameters
• Tools
– McPAT Core
– McPAT Cache
– Orion 2.0 / DSENT
• Outputs
– Core Energy & Area
– Cache Energy & Area
– Network Router & Link Energy & Area

Core Power Modeling Structure

[Figure: layered structure of the core power model. Blocks shown:]

• Graphite Core Model
• Graphite-McPAT Interface
• McPAT Processor Model
• McPAT Core Model
• McPAT Data Structure

Labels on the connections (each shown twice): Architectural Parameters, Event Counters

Core Power Modeling Process

[Figure: the same layered structure, annotated with the results it produces. Blocks shown:]

• Graphite Core Model
• Graphite-McPAT Interface
• McPAT Processor Model
• McPAT Core Model
• McPAT Data Structure

Labels on the connections (each shown twice): Architectural Parameters, Event Counters

Results: Area, Static Power, Dynamic Energy
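One way to read the process above is as two steps: architectural parameters configure the model once, and event counters are then supplied to obtain dynamic energy. The sketch below illustrates that split under loud assumptions: `CorePowerInterface` and `CoreEstimate` are hypothetical names, not the actual Graphite-McPAT interface, and area, static power and per-event energies are passed in as given constants, whereas McPAT would derive them from the architectural and technology parameters.

```python
from dataclasses import dataclass


@dataclass
class CoreEstimate:
    area_mm2: float
    static_power_w: float
    dynamic_energy_j: float


class CorePowerInterface:
    """Hypothetical stand-in for an interface layer between simulator and power model."""

    def __init__(self, arch_params, area_mm2, static_power_w, energy_per_event_j):
        # Step 1: architectural parameters are fixed for the run, so area and
        # static power are treated as constants once the model is configured.
        self.arch_params = dict(arch_params)
        self.area_mm2 = area_mm2
        self.static_power_w = static_power_w
        self.energy_per_event_j = dict(energy_per_event_j)

    def estimate(self, event_counters):
        # Step 2: event counters gathered during simulation scale the
        # per-event energies into a total dynamic energy.
        dynamic_j = sum(count * self.energy_per_event_j[name]
                        for name, count in event_counters.items())
        return CoreEstimate(self.area_mm2, self.static_power_w, dynamic_j)


# Placeholder values (assumed, for illustration only).
interface = CorePowerInterface(
    arch_params={"issue_width": 2, "ALU_per_core": 2},
    area_mm2=4.0,
    static_power_w=0.25,
    energy_per_event_j={"ialu_accesses": 2e-12, "fpu_accesses": 8e-12},
)
result = interface.estimate({"ialu_accesses": 500, "fpu_accesses": 100})
```

Keeping the configuration step separate from the counter update means the same configured model can be queried repeatedly as counters accumulate.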

McPAT Cache Model

• Inputs: Parameters
– Cache Size
– Cache Block Size
– Associativity
– Miss / Writeback / Fill Buffer Size
– Frequency
– Latency
– Throughput
• Inputs: Event Counters
– Tag array reads
– Tag array writes
– Data array reads
– Data array writes
– Miss / Writeback / Fill buffer accesses
• Outputs
– Area
– Leakage Power
– Dynamic Energy
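Cache size, block size and associativity together fix the shape of the tag and data arrays. The sketch below shows only the standard set-associative arithmetic, not McPAT's or CACTI's internal sizing. The class name `CacheParams`, the 48-bit address width and the example cache are assumptions, and the arithmetic assumes power-of-two sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheParams:
    size_bytes: int
    block_size_bytes: int
    associativity: int
    address_bits: int = 48  # assumed address width, not taken from the slides

    @property
    def num_sets(self):
        # Each set holds `associativity` blocks.
        return self.size_bytes // (self.block_size_bytes * self.associativity)

    @property
    def offset_bits(self):
        # Bits that select a byte within a block (power-of-two block size).
        return (self.block_size_bytes - 1).bit_length()

    @property
    def index_bits(self):
        # Bits that select a set (power-of-two number of sets).
        return (self.num_sets - 1).bit_length()

    @property
    def tag_bits(self):
        # Remaining address bits are stored in the tag array.
        return self.address_bits - self.index_bits - self.offset_bits


# Hypothetical example: a 32 KiB, 4-way cache with 64-byte blocks.
l1 = CacheParams(size_bytes=32 * 1024, block_size_bytes=64, associativity=4)
```

For this example cache the arithmetic gives 128 sets, 6 offset bits, 7 index bits and 35 tag bits per entry.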