ETI SCC Baremetal Framework: Bandwidth and Power Findings
Rishi Khan, 3/30/11
©2011 ET International, Inc


Page 1: 7 eti pres

©2011 ET International, Inc

ETI SCC Baremetal Framework
Bandwidth and Power Findings

Rishi Khan
3/30/11

Page 2: 7 eti pres

Outline

• SCC Framework Overview
• Bandwidth Findings
• Power Findings
• Software Access

Page 3: 7 eti pres

SCC Framework Overview

Page 4: 7 eti pres

Messaging Goals

• Asynchronous Communications
• Single Threaded
• Possibly Long Latency until data is received
• Maximize bandwidth
• Handle big and small messages
• Extensible layer that supports MPI, BSD sockets, etc.

Page 5: 7 eti pres

Design Choices

• One channel per core-pair per direction
• Large window size (up to 1MB/channel)
• Fast polling of incoming data (use MPB)
• Circular buffer with 16 slots and read/write pointers
• Poll local pointers, signal remote pointers
• Use separate cache lines to avoid locking
  ◦ 2 cache lines * 48 channels = 3K per core
• Double map read and write pages
  ◦ Read – L2 cache enabled
  ◦ Write – L2 cache disabled (write back)
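The "3K per core" figure can be checked with a small layout sketch. It assumes a 32-byte cache line, which is what makes 2 * 48 lines come to 3072 bytes; the type names are hypothetical, and the field names `mpb_write` and `mpb_read` are borrowed from the Circular Buffer Example slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32   /* assumption: SCC cache line size in bytes */
#define CHANNELS 48           /* channel count from the slide */

/* One pointer padded to a full cache line so no two pointers share a line. */
typedef struct {
    volatile uint32_t value;
    uint8_t pad[CACHE_LINE_BYTES - sizeof(uint32_t)];
} padded_ptr_t;

/* The two per-channel pointers a core keeps in its MPB (hypothetical layout). */
typedef struct {
    padded_ptr_t mpb_write;
    padded_ptr_t mpb_read;
} channel_mpb_t;

/* MPB bytes one core spends on channel pointers. */
size_t mpb_bytes_per_core(void)
{
    assert(sizeof(padded_ptr_t) == CACHE_LINE_BYTES);
    return sizeof(channel_mpb_t) * CHANNELS;
}
```

Under these assumptions `mpb_bytes_per_core()` evaluates to 3072, matching the 3K on the slide.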

Page 6: 7 eti pres

Circular Buffer Example

Core 0 (reader)
• Cache: Channel->local_read
• MPB: Channel->mpb_write

Core 1 (writer)
• Cache: Channel->local_write
• MPB: Channel->mpb_read

DRAM: Channel->body[]

Writer steps:
• Is there space?
• Write the data (with length as first 2 bytes)
• Update write pointer

Reader steps:
• Poll local write pointer
• Read data
• Update Read Pointer
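The writer and reader steps above can be sketched as a single-process ring buffer. This is only an illustration, not the framework's code: the slot size, the type `channel_t`, and the functions `chan_send` and `chan_recv` are hypothetical, and a real SCC version would keep the pointers in the MPBs, the body in DRAM, and add the cache flushes that a single process does not need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 16        /* slot count from the Design Choices slide */
#define SLOT_BYTES 64   /* hypothetical slot size chosen for this sketch */

typedef struct {
    volatile uint32_t write_idx;      /* advanced only by the writer */
    volatile uint32_t read_idx;       /* advanced only by the reader */
    uint8_t body[SLOTS][SLOT_BYTES];  /* message slots */
} channel_t;

/* Writer: returns 0 on success, -1 if the ring is full or len is too large. */
int chan_send(channel_t *ch, const void *buf, uint16_t len)
{
    assert(ch != NULL && buf != NULL);
    if (len > SLOT_BYTES - 2)
        return -1;
    /* "Is there space?" One slot stays empty so full and empty differ. */
    uint32_t next = (ch->write_idx + 1) % SLOTS;
    if (next == ch->read_idx)
        return -1;
    uint8_t *slot = ch->body[ch->write_idx];
    memcpy(slot, &len, 2);        /* length as the first 2 bytes */
    memcpy(slot + 2, buf, len);   /* then the payload */
    ch->write_idx = next;         /* publish by updating the write pointer */
    return 0;
}

/* Reader: returns the payload length, or -1 if nothing is pending. */
int chan_recv(channel_t *ch, void *buf, size_t cap)
{
    assert(ch != NULL && buf != NULL);
    if (ch->read_idx == ch->write_idx)   /* poll the write pointer */
        return -1;
    const uint8_t *slot = ch->body[ch->read_idx];
    uint16_t len;
    memcpy(&len, slot, 2);
    if (len > cap)
        return -1;                       /* caller's buffer is too small */
    memcpy(buf, slot + 2, len);
    ch->read_idx = (ch->read_idx + 1) % SLOTS;  /* free the slot */
    return (int)len;
}
```

Because each index has exactly one writer, the two sides never need a lock, which is the point of the "poll local pointers, signal remote pointers" design choice.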

Page 7: 7 eti pres

Socket API

• int stream_recv(int nid, void *buf, size_t len, int nb);
• int stream_send(int nid, const void *buf, size_t len);

[Figure: Messaging Bandwidth (MB/sec) vs. Message Size (bytes). Series: Intel RCCE; ETI Streams (DRAM, Blocking); ETI Streams (MPBs, Blocking); ETI Streams (MPBs, Non-Blocking). L1 and L2 are marked on the plot.]

Page 8: 7 eti pres

MPI

[Figure: Messaging Bandwidth (MB/sec) vs. Message Size (bytes). Series: Linux MPI (Intel, Blocking, TCP); Baremetal MPI (ETI, Blocking); Baremetal MPI (ETI, Non-blocking); RCKMPI. L1 and L2 are marked on the plot.]

Page 9: 7 eti pres

Power Goals

• External monitoring of voltage and current
• Backend Power API
  ◦ Update time functions with frequency changes
  ◦ Keep chip under safe conditions!!
• Internal synchronization of clocks
• External synchronization of host and SCC

Page 10: 7 eti pres

External Monitoring

• Read /opt/sccKit/systemSettings.ini
• Telnet BMC 5010
• Request Status / Parse Data
• Store timestamps

Page 11: 7 eti pres

Backend Power API

• power_session scc_open_power(heap h);
• void scc_close_power(power_session ps);
• int scc_set_freq(power_session ps, u32 requested_frequency);
• int scc_set_voltage(power_session ps, u32 requested_millivolts);
• char* scc_error_string(status_code code);

Allowable Frequency (MHz) by Voltage (V)

Frequency: 100 106 114 123 133 145 160 178 200 266 320 400 533 800

0.7: X X X X X X X X X X X X
0.8: X X X X X X X X X X X X X
0.9: X X X X X X X X X X X X X
1.0: X X X X X X X X X X X X X
1.1: X X X X X X X X X X X X X X
1.2: X X X X X X X X X X X X X X
1.3: X X X X X X X X X X X X X X
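Since only the listed frequencies are allowable, a caller might snap a request to the list before passing it to `scc_set_freq`. The sketch below is a hypothetical helper, not part of the Backend Power API; it uses only the 14 frequencies from the table and does not model the per-voltage restrictions.

```c
#include <assert.h>
#include <stddef.h>

/* Frequencies (MHz) listed in the table, in ascending order. */
static const unsigned listed_mhz[] = {
    100, 106, 114, 123, 133, 145, 160, 178, 200, 266, 320, 400, 533, 800
};

/* Hypothetical helper: highest listed frequency not above the request,
 * or 0 if the request is below every listed frequency. */
unsigned snap_to_listed_mhz(unsigned requested_mhz)
{
    unsigned best = 0;
    size_t n = sizeof listed_mhz / sizeof listed_mhz[0];
    for (size_t i = 0; i < n; i++) {
        /* The scan relies on the table being sorted ascending. */
        assert(i == 0 || listed_mhz[i - 1] < listed_mhz[i]);
        if (listed_mhz[i] <= requested_mhz)
            best = listed_mhz[i];
    }
    return best;
}
```

Rounding down rather than to the nearest entry is a design choice made here so that a request never yields a higher frequency than the caller asked for.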

Page 12: 7 eti pres

Internal Synchronization

• Cores come out of sccReset in 20ms intervals
• Each core’s clock starts at cycle 0 at reset
• Each core’s frequency may be different
• Solution:
  ◦ Set all cores to 400MHz
  ◦ Barrier
  ◦ After Barrier, set internal integrator to 0

Page 13: 7 eti pres

Formulas for Time

• Use this formula for time:
    count = scc_cycle_count() - _integral_cycle;
    ns = _integral_time_ns + count * _current_ns_in_cycles;

• Use this for frequency change:
    current_time = scc_cycle_count();
    _integral_time_ns += (current_time - _integral_cycle) * _current_ns_in_cycles;
    _integral_cycle = current_time;
    _current_ns_in_cycles = 1.0e9 / ((double)_global_clock / (double)freq_divider);

[Figure: diagram labeled Integral Time, Freq, scc_cycle_count(), and _integral_cycle.]
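The two formulas can be exercised with a simulated cycle counter. In this sketch `scc_cycle_count()` is a stand-in backed by a plain variable, `scc_clock_t` and the `clock_*` functions are hypothetical names, and the 1.6 GHz global clock used in the usage note is an assumption made only for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware counter: the demo advances it by hand. */
uint64_t sim_cycles;
uint64_t scc_cycle_count(void) { return sim_cycles; }

typedef struct {
    double   integral_time_ns;  /* ns accumulated up to the last frequency change */
    uint64_t integral_cycle;    /* cycle count at the last frequency change */
    double   ns_per_cycle;      /* the slide's _current_ns_in_cycles */
    double   global_clock_hz;   /* the slide's _global_clock */
} scc_clock_t;

/* Start integrating from zero at the current cycle count. */
void clock_init(scc_clock_t *c, double global_clock_hz, unsigned divider)
{
    assert(divider != 0);
    c->global_clock_hz = global_clock_hz;
    c->integral_time_ns = 0.0;
    c->integral_cycle = scc_cycle_count();
    c->ns_per_cycle = 1.0e9 / (global_clock_hz / (double)divider);
}

/* Formula for time: banked ns plus cycles since the last change. */
double clock_now_ns(const scc_clock_t *c)
{
    uint64_t count = scc_cycle_count() - c->integral_cycle;
    return c->integral_time_ns + (double)count * c->ns_per_cycle;
}

/* Formula for frequency change: bank elapsed time, then switch the rate. */
void clock_set_divider(scc_clock_t *c, unsigned divider)
{
    assert(divider != 0);
    uint64_t now = scc_cycle_count();
    c->integral_time_ns += (double)(now - c->integral_cycle) * c->ns_per_cycle;
    c->integral_cycle = now;
    c->ns_per_cycle = 1.0e9 / (c->global_clock_hz / (double)divider);
}
```

With a 1.6 GHz global clock, 400 cycles at divider 4 (400 MHz) followed by 800 cycles at divider 2 (800 MHz) add up to 2000 ns, because each segment is converted at the rate that was active while it ran.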

Page 14: 7 eti pres

Syncing Front/Back

• Change voltage from 0.7 to 1.1 every 1 second
• Measure changes on frontend
• Cannot get better than 0.5 seconds

[Figure: two panels, Amps and Residuals.]

Page 15: 7 eti pres

Bug in BMC Voltage Readings

• 3 power islands
• Drop voltage from 1.2 to 0.7 immediately
• Raise Voltage after 20 seconds

[Figure: two panels. Voltage vs. Time for V0, V1, and V2; Amps vs. Time. Annotations: 20.5 Seconds, 0.6 Seconds.]

Page 16: 7 eti pres

Other SCC issues

• If more than 24 cores pound on one MPB, contention overtakes system.
  ◦ Sleep required between polling
• Allowable Voltage/freq are chip specific
• BMC telnet response is > 100ms
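The note about sleeping between polls can be illustrated with a polling loop that pauses after each miss. This is a generic sketch, not the framework's polling code: `poll_with_pause`, `spin_delay`, and `ready_after` are hypothetical names, and the busy-wait delay stands in for whatever sleep primitive a baremetal core provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*poll_fn)(void *ctx);   /* returns nonzero once data is ready */

/* Busy-wait for `iters` loop iterations; stands in for a real sleep. */
void spin_delay(uint32_t iters)
{
    for (volatile uint32_t i = 0; i < iters; i++) { }
}

/* Polls until ready or max_polls is reached.
 * Returns the number of polls used, or -1 if the limit was hit. */
int poll_with_pause(poll_fn poll, void *ctx, uint32_t delay_iters, int max_polls)
{
    assert(poll != NULL);
    for (int n = 1; n <= max_polls; n++) {
        if (poll(ctx))
            return n;
        spin_delay(delay_iters);   /* back off so the MPB is not hammered */
    }
    return -1;
}

/* Demo poll target: reports ready after a fixed number of calls. */
int ready_after(void *ctx)
{
    int *remaining = ctx;
    return --(*remaining) <= 0;
}
```

For example, with a counter initialised to 3, `poll_with_pause(ready_after, &counter, 10, 10)` returns 3: two misses, each followed by a pause, then a hit.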

Page 17: 7 eti pres

Future Work

• DARPA UHPC: Study how voltage/freq affect power dissipation
• Allan Snavely (UCSD)
  ◦ Systematically study loops over a number of parameters to find the best voltage/freq.
  ◦ Create formulas to approximate good power settings for unknown loops

Page 18: 7 eti pres

Access to Software

• Email [email protected]
• Beta available
• Considering open sourcing SCC-specific portions of our work for others to test/learn/improve

Page 19: 7 eti pres

Acknowledgements

• Mark Deazley (ETI)
• Eric Hoffman (ETI)
• Allan Snavely (UCSD)
• Intel:
  ◦ Tim Mattson
  ◦ Ted Kubaska
  ◦ Rob Noradki
  ◦ Wilf Pinfold, Shekhar Borkar (UHPC)