pulp: an open hardware platform the story so far · 2018-11-09 · 1 february 2016 first release of...

67
2 Integrated Systems Laboratory 1 Department of Electrical, Electronic and Information Engineering PULP: an Open Hardware Platform The story so far NIPS summer school 2018 - Perugia 19.07.2018 Frank K. Gürkaynak Davide Rossi http://pulp-platform.org

Upload: others

Post on 13-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

2Integrated Systems Laboratory

1Department of Electrical, Electronicand Information Engineering

PULP: an Open Hardware PlatformThe story so far

NIPS summer school 2018 - Perugia 19.07.2018

Frank K. Gürkaynak

Davide Rossi

http://pulp-platform.org

Page 2: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ 14:30 - 16:00 / Part I / Frank K. GürkaynakWhat is PULP and what are its building blocks§ Brief history of the project and its goals§ Open source HW, why and why not§ Components of a PULP based System§ RISC-V cores of PULP project§ PULP systems and implementations

§ 16:00 - 16:30 / Coffee Break§ 16:30 - 18:00 / Part II / Davide Rossi

Secrets of PULP based systems

§ Tomorrow / Hands on sessions with PULP systems

What will we cover today

Page 3: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Project started in 2013 by Luca Benini§ A collaboration between University of Bologna and ETH Zürich§ Key goal is

§ We were able to start with a clean slate, no need to remain compatible to legacy systems.

Parallel Ultra Low Power (PULP)

How to get the most BANGfor the ENERGY consumed in a computing system

Page 4: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Energy efficiency is the key driver in the PULP project

Maximum Energy Efficiency @ 0.5V + 0.5V FBBà 60GOPS/W

Wide operating range

Low leakage(< 2%)

1.8GOPS

~10mW @ 100 MHz, 0.75V

Mea

sure

men

t res

ults

from

PU

LPv1

a fo

ur c

ore

clus

ter i

n ST

FD

SOI

28nm

tech

nolo

gy ru

nnin

g a

para

llel m

atrix

mul

tiplic

atio

ns

Page 5: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ We concentrate on programmable systems§ Can not have custom hardware, need to be flexible§ We need to make the system accessible to application developers

§ Scalable over a wide operating range§ Work just as well when processing 0.001 GOPS as 1000 GOPS

§ Don’t waste idle energy§ Eliminate sources where cores and systems are idly wasting energy

§ Make good use of options provided by the technology§ Body biasing, power gating, library, memory selection etc..

§ Take advantage of heterogeneous acceleration§ Allow an architecture where accelerators can be added efficienty

To reach our goal we work on the following

Page 6: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Luca Benini holds a dual appointment in ETH Zürich and Bologna§ Total team about 50-60 members (60% in Zürich, 40% in Bologna)

§ 1 Professor§ 3 Assistant Professors and 2 Senior Scientists§ 8 Post Doctoral researchers§ 30+ Ph.D. students§ 6 Technical staff and 2 embedded staff in industrial partners

§ Most of this team works on projects that are related to PULP§ Core group of 15 designers that concentrate on PULP development§ Many others contribute to application development, software support

§ In ETH Zürich, the Microelectronics Design Center supports the project§ Permanent staff of 4§ Maintains design flows for ASIC and FPGA

PULP is maintained by a large group

Page 7: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

We have designed over 20 PULP based ASICs already

www.pulp-platform.org

28nm 28nm 28nm 65nm 65nm 65nm 65nm 65nm 65nm65nm

130nm 180nm28nm65nm180nm40nm 65nm130nm130nm

65nm

180nm

http://asic.ethz.ch

22nm65nm65nm130nm

Page 8: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

We firmly believe in Open Source movement

http://pulp-platform.org§ First launched in February 2016 (github)

§ HDL code, testbenches, testcases, SW, scripts, debug support.§ We use solderpad (v0.51) license (Apache like license adapted for HW)

http://www.solderpad.org/licenses/

Page 9: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ The way we design ICs has changed, big part is now infrastructure§ Processors, peripherals, memory subsystems are now considered infrastructure§ Very few (if any) groups design complete IC from scratch§ High quality building blocks (IP) needed

§ We need an easy and fast way to collaborate with people§ Currently complicated agreements have to be made between all partners§ In many cases, too difficult for academia and SMEs

§ Hardware is a critical for security, we need to ensure it is secure§ Being able to see what is really inside will improve security§ Having a way to design open HW, will not prevent people from keeping secrets.

Open Hardware is a necessity, not an ideological crusade

Page 10: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ The differences complicate things§ Not a uniform way to discuss the issue, depends on the design flow§ This talk about ASIC flow (the most complex one).

§ Different actors, different revenue streams§ Not every actor in the current design flow, earns money the same way§ EDA companies are long complaining that they are out of the loop

Their income is not based on the amount of ICs produced.§ Not everybody will be happy if open hardware will be more common§ Important to understand the relationships

§ Interestingly most intermediate file formats are open, readable§ Verilog, Liberty, SPICE, EDIF, CIF, LEF, DEF, OA (open access)

Frank. K. Gürkaynak - Open Source Hardware 10

Hardware design flows for PCB/FPGA/ASIC are different

Page 11: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

TechnologyProvider

IP Providers

EDA Companies

ASIC Design Flow and Main Actors

Integrated Systems Laboratory 11

HDLNetlist GDSII ChipLogic

Synth. P&R Foundry

PDK

OtherIP

Simu-lator

Sim.Models

High-Level Code

HLS

Std.Cells

§ Only beginning of the pipeline can be open HW§ Later stages contain business interest of various actors

Page 12: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

FPGAVendor

FPGA Design Flow and Main Actors

Integrated Systems Laboratory 12

HDLNetlist BitfileLogic

Synth. P&R FPGASimu-lator

Sim.Models

High-Level Code

HLS

FPGAResour.

FPGA VENDOR

§ FPGA vendors want to sell the FPGAs§ Little commercial interested in intermediate files

§ Most vendors allow bitfiles to be published -> will sell more FPGAs

Page 13: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Similar to Apache/BSD, adapted specifically for Hardware§ Allows you to:

§ Use § Modify§ Make products and sell them without restrictions.

§ Note the difference to GPL§ Systems that include PULP do not have to be open source (Copyright not Copyleft)§ They can be released commercially§ LGPL may not work as you think for HW

We provide PULP with SOLDER Pad License

http://www.solderpad.org/licenses/

Page 14: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ The following are ok:§ RTL code written in HDL, or a high-level language for HLS flow§ Testbenches in HDL and associated makefiles, golden models

§ How about support scripts for different tools?§ Synthesis scripts, tool startup files, configurations

§ And these are currently no go :§ Netlists mapped to standard cell libraries§ Placement information (DEF) § Actual Physical Layout (GDSII)

At the moment, open HW can (mostly/only) be HDL code

Page 15: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

OPEN

NOT YET

SWHW

Where are we now?

Frank. K. Gürkaynak - Open Source Hardware

15

Application

Operating System

Instruction Set Arch

Microarchitecture

Components (RAM, ALU)

Gates

Transistors

Emacs

Linux

RISC-V

PULP

Page 16: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULP Open-Source Releases and External Contributions

1 February 2016First release of PULPino, our single-core microcontroller

2

3

May 2016Toolchain and compiler for our RISC-V implementation (RI5CY), DSP extensions

August 2017PULPino updates, new cores Zero-riscyand Micro-riscy, FPU, toolchain updates

February 2018PULPissimo, ARIANE, PULP

1

2

4

September 2017Porting of ARM CMSIS to PULPinohttps://github.com/misaleh/CMSIS-DSP-PULPino

June 2017Porting of Verilator and BEEBS benchmarks to PULPinohttps://github.com/embecosm/ri5cy

December 2017STING: Open-Source Verification Environment for PULPinohttp://valtrix.in/programming/running-sting-on-pulpino

Releases Community Contributions

November 2017Numerous Bug fixes to RI5CY in PULPinohttps://github.com/pulp-platform/riscv

3

March 2018Open PULP

4

5

Page 17: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

We try to leverage open source as much as possible

Compiler Infrastructure

Processor & Hardware IPs

Virtualization Layer

ProgrammingModel

Low-Power Silicon Technology

Page 18: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Many companies (we know of) are actively using PULP§ They value that it is silicon proven§ They like that it uses a permissive open source license

Silicon and Open Hardware fuel PULP success

§ GreenWaves Technologies§ Dolphin§ IQ Analog (14nm chips)§ Embecosm§ lowRISC§ Mentor Graphics§ Cadence Design Systems§ ST Microelectronics (IT,F)§ Micron§ Google§ IBM§ NXP

§ Stanford§ Cambridge§ UCLA§ CEA/LETI§ EPFL§ National Chia Tung

University§ Politecnico di Milano§ Politecnico di Torino§ Universita Roma I§ Instituto Superior Tecnico –

U. de Lisboa§ Fondazione Bruno Kessler

§ Zagreb FER§ Universita di Genova§ Istanbul Technical U. § RWTH Aachen§ Lund§ USI – Lugano§ Bar-Ilan§ TU-Kaiserslautern§ TU-Graz§ UC San Diego§ CSEM§ IBM Research

§ SIAE Microelectronica§ Advanced Circuit

Pursuit§ Shanghai Xidian

Technology§ SCS Zurich§ IMT technologies§ Microsemi§ Arduino§ RacyICs

Companies that are using/evaluating PULP Research Centers/Universities using PULP

Com

pani

es w

ith a

nnou

nced

pro

duct

s, b

usin

ess

Com

pani

es th

at u

se P

ULP

inte

rnal

ly o

r for

trai

ning

Com

pani

es e

xplo

ring

oppo

rtun

ities

Page 19: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Platforms

InterconnectPeripheralsRISC-V Cores

The PULP family explained

Accelerators

Page 20: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

RISC-V Cores

We have developed several optimized RISC-V cores

RI5CY

32b

Microriscy32b

Zeroriscy32b

Ariane

64b

Page 21: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Our research was not developing processors…§ … but we needed good processors for systems we build for research§ Initially (2013) our options were

§ Build our own (support for SW and tools)§ Use a commercial processor (licensing, collaboration issues)§ Use what is openly available (OpenRISC,.. )

§ We started with OpenRISC§ First chips until mid-2016 were all using OpenRISC cores§ We spent time improving the microarchitecture

§ Moved to RISC-V later§ Larger community, more momentum§ Transition was relatively simple (new decoder)

How we started with open source processors

Page 22: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Started by UC-Berkeley in 2010§ Open Standard governed by RISC-V

foundation§ ETHZ is a founding member of the

foundation§ Necessary for the continuity§ Extensions are still being developed

§ Defines 32, 64 and 128 bit ISA§ No implementation, just the ISA§ Different RISC-V implementations (both

open and close source) are available

§ At IIS we specialize in efficient implementations of RISC-V cores

Spec separated into “extensions”

RISC-V Instruction Set Architecture

I Integer instructions

E Reduced number of registers

M Multiplication and Division

A Atomic instructions

F Single-Precision Floating-Point

D Double-Precision Floating-Point

C Compressed Instructions

X Non Standard Extensions

Page 23: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Our RISC-V cores

§ Zero-riscy§ RV32-ICM

§ Micro-riscy§ RV32-CE

§ Ariane§ RV64-ICM

§ RI5CY§ RV32-ICMX§ SIMD§ HW loops§ Bit

manipulation§ Fixed point

§ RI5CY+FPU§ RV32-ICMFX

Low Cost Core

Linux capable Core

Core with DSP enhancements

Floating-point capable Core

32 bit 64 bit

ARM Cortex-M0+ ARM Cortex-M4 ARM Cortex-A55ARM Cortex-M4F

Page 24: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ 4-stage pipeline, optimized for energy efficiency§ 40 kGE, 30 logic levels, Coremark/MHZ 3.19§ Includes various extensions (X) to RISC-V for DSP applications

RI5CY – Our workhorse 32-bit core

Page 25: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

RI5CY – ISA Extensions improve performance

for (i = 0; i < 100; i++) d[i] = a[i] + b[i];

mv x5, 0mv x4, 100Lstart:

lb x2, 0(x10)lb x3, 0(x11)addi x10,x10, 1addi x11,x11, 1add x2, x3, x2sb x2, 0(x12)addi x4, x4, -1addi x12,x12, 1

bne x4, x5, Lstart

Baseline

11 cycles/output

mv x5, 0mv x4, 100Lstart:

lb x2, 0(x10!)lb x3, 0(x11!)addi x4, x4, -1add x2, x3, x2sb x2, 0(x12!)

bne x4, x5, Lstart

8 cycles/output

Auto-incr load/store

lp.setupi 100, Lendlb x2, 0(x10!)lb x3, 0(x11!)add x2, x3, x2

Lend: sb x2, 0(x12!)

HW Loop

5 cycles/output

lp.setupi 25, Lendlw x2, 0(x10!)lw x3, 0(x11!)pv.add.b x2, x3, x2

Lend: sw x2, 0(x12!)

Packed-SIMD

1,25 cycles/output

Page 26: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ RI5CY was built for energy efficiency for DSP applications§ Ideally all parts of the core are running all the time doing something useful§ This does not always mean it is low-power§ The core is rather large (> 40 kGE without FPU)

§ People asked us about a simple and small core§ Not all processor cores are used for DSP applications§ The DSP extensions are mostly idle for control applications§ Zero-Riscy was designed to as a simple and efficient core.

§ Some people wanted the smallest possible RISC-V core§ It is possible to further reduce area by using 16 registers instead of 32 (E)§ Also the multiplier can be removed saving a bit more§ Micro-Riscy is a parametrized variation of Zero-Riscy with minimal area

Why we designed newer 32-bit cores

Page 27: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Only 2-stage pipeline, simplified register file§ Zero-Riscy (RV32-ICM), 19kGE, 2.44 Coremark/MHz§ Micro-Riscy (RV32-EC), 12kGE, 0.91 Coremark/MHz§ Used as SoC level controller in newer PULP systems

Zero/Micro-riscy, small area core for control applications

Page 28: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Different 32-bit cores with different area requirements

RI5CY Zero-riscy Micro-riscy

Page 29: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Different cores for different types of workload

RI5CY fo

r DSP

Zero-ris

cygood

trade-o

ff

Micro-ris

cyfor

contro

l

Normalized Static Power Consumption Dynamic Power Consumption

Page 30: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ For the first 4 years of the PULP project we used only 32bit cores§ Luca once famously said “We will never build a 64bit core”.§ Most IoT applications work well with 32bit cores.§ A typical 64bit core is much more than 2x the size of a 32bit core.

§ But times change:§ Using a 64bit Linux capable core allows you to share the same address space as

main stream processors.§ We are involved in several projects where we (are planning to) use this capability

§ There is a lot of interest in the security community for working on a contemporary open source 64bit core.

§ Open research questions on how to build systems with multiple cores.

Finally the step into 64-bit cores

Page 31: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

ARIANE: Our Linux Capable 64-bit core

Page 32: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Tuned for high frequency, 6 stage pipeline, integrated cache§ In order issue, out-of-order write-back, in-order-commit§ Supports privilege spec 1.11, M, S and U modes§ Hardware Page Table Walker

§ Implemented in GF22nm (Poseidon), and UMC65 (Scarabaeus)§ In 22nm: 910 MHz worst case conditions

(SSG, 125/-40C, 0.72V)§ 8-way 32kByte Data cache and

4-way 32kByte Instruction Cache § Core area: 175 kGE

Main properties of Ariane

7%8% 3%

21%

44%

9%

8%AreaPC GenIFIDIssueExReg FileCSR

Page 33: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Full FPGA implementation§ Xilinx Vertex 7 – VC707§ Core: 50 – 100 MHz§ Core: 15 kLUTs§ 1 GB DDR3

§ Mapping to simpler boards underway

Ariane mapped to FPGA boots Linux

Page 34: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Accelerators

InterconnectPeripheralsRISC-V Cores

Only processing cores are not enough, we need more

RI5CY

32b

Microriscy32b

Zeroriscy32b

Ariane

64bAXI4 – InterconnectDMA GPIO

APB – Peripheral BusI2SUART

Logarithmic interconnectSPIJTAG

Neurostream(ML)

HWCrypt(crypto)

PULPO(1st order opt)

HWCE(convolution)

Page 35: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Platforms

Accelerators

InterconnectPeripheralsRISC-V Cores

The pulp-platforms put everything together

RI5CY

32b

Microriscy32b

Zeroriscy32b

Ariane

64bAXI4 – InterconnectDMA GPIO

APB – Peripheral BusI2SUART

Logarithmic interconnectSPIJTAG

Neurostream(ML)

HWCrypt(crypto)

PULPO(1st order opt)

HWCE(convolution)

R5

MI

O

interconnect

A

Single Core• PULPino• PULPissimo

Page 36: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Simple design§ Meant as a quick release

§ Separate Data and Instruction memory§ Makes it easy in HW§ Not meant as a Harvard arch.

§ Can be configured to work with all our 32bit cores§ RI5CY, Zero/Micro-Riscy

§ Peripherals copied from its larger brothers§ Any AXI and APB peripherals

could be used

PULPino our first single core platform

PULPinoDataMem

RISC-Vcore

InstMem

I$

APB

-inte

rcon

nect

GPIO

AXI

-in

terc

onne

ct

BusAdapt

SPI M

UART

I2C

UART

SPI S

BootROM

Page 37: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Shared memory§ Unified Data/Instruction Memory§ Uses the multi-core infrastructure

§ Support for Accelerators § Direct shared memory access§ Programmed through APB bus§ Number of TCDM access ports

determines max. throughput § uDMA for I/O subsystem

§ Can copy data directly from I/O to memory without involving the core

§ Used as a fabric controller in larger PULP systems

PULPissimo the improved single core platform

RI5CYIbuf/ I$

instr data

Event Unit

Tightly Coupled Data Memory Interconnect

MemBank

MemBank

MemBank

MemBank

MemBank

MemBank

uDMA

APB / Peripheral Interconnect

Clock / ResetGenerator

DebugUnit

FLLs

I/Ointfs

UARTSPII2SI2C

SDIOCPI

JTAG

HardwareAccelerator Ext

Page 38: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Coreplex

§ Still work in progress§ Current version is very

simple§ Useful for in-house testing§ A more advanced version

will likely be developed soon.

Kerbin the single core support structure for Ariane

Kerbin

APB

-inte

rcon

nect

GPIO

AXI

-in

terc

onne

ct

BusAdapt

SPI M

UART

I2C

UART

SPI S

BootROM

Aria

neI$

D$

Timer

AXI2Per

Debug

FLL

Page 39: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Platforms

Accelerators

InterconnectPeripheralsRISC-V Cores

Finally for HPC applications we have multi-cluster systems

RI5CY

32b

Microriscy32b

Zeroriscy32b

Ariane

64bAXI4 – InterconnectDMA GPIO

APB – Peripheral BusI2SUART

Logarithmic interconnectSPIJTAG

M

I

Ocluster

interconnect

A R5R5R5

M MMMin

terc

onne

ct

Neurostream(ML)

HWCrypt(crypto)

PULPO(1st order opt)

HWCE(convolution)

R5

MI

O

inte

rcon

nect

A

Single Core• PULPino• PULPissimo

Multi-core• Fulmine• Mr. Wolf

R5

Page 40: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Multiple RISC-V cores§ Individual cores can be started/stopped with little overhead§ DSP extensions in cores

§ Multi-banked scratchpad memory (TCDM)§ Not a cache, there is no L1 data cache in our systems

§ Logarithmic Interconnect allowing all cores to access all banks§ Cores will be stalled during contention, includes arbitration

§ DMA engine to copy data to and from TCDM§ Data in TCDM managed by software§ Multiple channels, allows pipelined operation

§ Hardware accelerators with direct access to TCDM§ No data copies necessary between cores and accelerators.

The main components of a PULP cluster

Page 41: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

CLUSTER

PULP cluster contains multiple RISC-V cores

RISC-Vcore

RISC-Vcore

RISC-Vcore

RISC-Vcore

Page 42: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

CLUSTER

Tightly Coupled Data Memory

All cores can access all memory banks in the cluster

interconnect

RISC-Vcore

Mem Mem MemMem

RISC-Vcore

RISC-Vcore

RISC-Vcore

Mem Mem MemMem

Mem

Mem

Page 43: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

CLUSTER

Tightly Coupled Data Memory

Data is copied from a higher level through DMA

interconnect

RISC-Vcore

MemDMA Mem MemMem

RISC-Vcore

RISC-Vcore

RISC-Vcore

Mem Mem MemMem

Mem

Memin

terc

onne

ct

L2Mem

Page 44: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

CLUSTER

Tightly Coupled Data Memory

There is a (shared) instruction cache that fetches from L2

interconnect

RISC-Vcore

MemDMA Mem MemMem

RISC-Vcore

RISC-Vcore

RISC-Vcore

Mem Mem MemMem

I$

Mem

Memin

terc

onne

ct

L2Mem

I$ I$ I$

Page 45: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

CLUSTER

Tightly Coupled Data Memory

Hardware Accelerators can be added to the cluster

interconnect

RISC-Vcore

MemDMA Mem MemMem

RISC-Vcore

RISC-Vcore

RISC-Vcore

Mem Mem MemMem

I$

HWACCEL

Mem

Memin

terc

onne

ct

L2Mem

I$ I$ I$

Page 46: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

CLUSTER

Tightly Coupled Data Memory

Event unit to manage resources (fast sleep/wakeup)

interconnect

RISC-Vcore

MemDMA Mem MemMem

RISC-Vcore

RISC-Vcore

RISC-Vcore

Mem Mem MemMem

I$

HWACCEL

Mem

Memin

terc

onne

ct

L2Mem

I$ I$ I$

EventUnit

Page 47: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

An additional microcontroller system (PULPissimo) for I/O

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 48: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

How do we work: Initiate a DMA transfer

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 49: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

Data copied from L2 into TCDM

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 50: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

Once data is transferred, event unit notifies cores/accel

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 51: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

Cores can work on the data transferred

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 52: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

Accelerators can work on the same data

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 53: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

Once our work is done, DMA copies data back

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 54: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULPissimo CLUSTER

Tightly Coupled Data Memory

During normal operation all of these occur concurrently

interconnect

RISC-V

core

MemDMA Mem MemMem

RISC-V

core

RISC-V

core

RISC-V

core

Mem Mem MemMem

I$

HW

ACCEL

Mem

Memin

terc

on

ne

ct

L2

Mem

Mem

Cont

I/O

RISC-V

core

I$ I$ I$

Ext.

Mem

Event

Unit

Page 55: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Platforms

Accelerators

InterconnectPeripheralsRISC-V Cores

Finally for HPC applications we have multi-cluster systems

RI5CY

32b

Microriscy32b

Zeroriscy32b

Ariane

64bAXI4 – InterconnectDMA GPIO

APB – Peripheral BusI2SUART

Logarithmic interconnectSPIJTAG

M

I

Ocluster

interconnect

A R5R5R5

M MMMin

terc

onne

ct

cluster

interconnect

R5 R5R5R5

M MMM

cluster

interconnect

R5 R5R5R5

M MMM

cluster

interconnect

A R5R5R5

M MMMM

I

O inte

rcon

nect

Neurostream(ML)

HWCrypt(crypto)

PULPO(1st order opt)

HWCE(convolution)

R5

MI

O

inte

rcon

nect

A

Single Core• PULPino• PULPissimo

Multi-core• Fulmine• Mr. Wolf

Multi-cluster• Hero

IOT HPC

R5R5

Page 56: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

We will release HERO soon

§ Multi-cluster PULP system on FPGA connected to ARM cores§ Will be made available in 3Q2018 (soon).

Page 57: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

HERO is modifiable and expandable

§ Up to 8 clusters, each with 8 cores§ All components are open source and written in System Verilog§ Standard interfaces (mostly AXI)§ New components can easily be added to the memory map

TLX-4 0 0

So

C B

usMa ilb ox

L2Me m

Clu s t e r 0L1 Me m

Clu s t e r 1L1 Me m

Clu s t e r L-1L1 Me m

RAB

X-Ba r In t e rcon n e ct

Clu

ster

Bu

s

DMA

L1 S

PMB

ank M

-1

RISC-VPE N-1

Sh a re d L1 I$

L1 S

PMB

ank 0

L1 S

PMB

ank 1

L1 S

PMB

ank 2

RISC-VPE 1

RISC-VPE 0Pe

rip

he

ral B

us

TRYXPe r2 AXI

AXI2 Pe r

Tim e r

Eve n t Un it

TRYX TRYX

Sh a re d APU

DEMUX DEMUX DEMUX

# of clusters� {1, 2, 4, 8}

# of PEs� {2, 4, 8}

FPU� { private, shared (APU), of f}

integer DSP unit� { private, shared (APU)}

L1 SPM size and # of banks

I$ design, size, # of banks

L2 SPM size

system-levelinterconnecttopology

RAB L1 TLB sizeand L2 TLB size,

associativity,and # of banks

Page 58: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ All are 28 FDSOI technology, RVT, LVT and RVT flavor§ Uses OpenRISC cores§ Chips designed in collaboration with STM, EPFL, CEA/LETI§ PULPv3 has ABB control

Brief illustrated history of selected ASICs

PULP PULPv2 PULPv3

Page 59: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ First multi-core systems that were designed to work on development boards. Each have several peripherals (SPI, I2C, GPIO)

§ Mia Wallace and Fulmine (UMC65) use OpenRISC cores§ Honey Bunny (GF28 SLP) uses RISC-V cores§ All chips also have our own FLL designs.

The first system chips, meant for designing boards

Mia Wallace

HoneyBunny

Fulmine

Page 60: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Designed in collaboration with the Analog group of Prof. Huang§ All chips with SMIC130 (because of analog IPs)§ First three with OpenRISC, VivoSoC3 with RISC-V

Combining PULP with analog front-end for Biomedical apps

VivoSoC VivoSoCv2.0

VivoSoCv2.001

VivoSoCv3

Page 61: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Mr. Wolf our latest system chip will solve problems

§ TSMC 40LP - 9mm2

§ 64 pin QFN package§ Cluster with

§ 8x RI5CY§ 2x shared FPUs § 64 kByte TCDM

§ SoC domain with§ micro-riscy core to control

power modes§ DC-DC converter (Dolphin)

to regulate power§ 512 kBytes L2

Page 62: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

Mr. Wolf will (hopefully) solve problems

§ TSMC 40LP - 9mm2

§ 64 pin QFN package§ Cluster with

§ 8x RI5CY§ 2x shared FPUs § 64 kByte TCDM

§ SoC domain with§ micro-riscy core to control

power modes§ DC-DC converter (Dolphin)

to regulate power§ 512 kBytes L2

Page 63: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

PULP in the IoT domain: Latest version Mr. Wolf

Page 64: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

GAP8: commercial big brother of PULP from Greenwaves

L2

SPIM x 2

SoCTCDMBANK

#0

TCDMBANK

#1

TCDMBANK #M-1

SHARED I$

RISCV#0

RISCV#1

RISCV#7

BRIDGE

CLUSTER

TCDM INTERCONNECT

...

...CL

UST

ER B

US

PERI

PHER

AL

INTE

RCO

NN

ECT

DC FIFO

DC FIFOID R.

AXI2MEM

SOC

BUS

ROM

SoC Ctrl

JTAG2AXI

JTAG

FLL Ctr

UART

GPIO

Top

DU DU DU

PAD

MU

X

JTAGTAP

I2C x 2

I2S x 2

SOC

APB

TMC

uDM

A

SPI S

FLL x 2

DMA

CLUSTERPERIPHS

LVDS I/Q

AXI TO MEM+ MPU FILTER

FC SUBSYSTEM

L2 REFILLMASTER

UDMASLAVE

ROM REFILL MASTER

APB SLAVE

AXIMASTER

PWM x 4

10b // (CPI)

PP Im

PP Au

DC/DCRTC

PMU

BOR BOR TRC

Q-SPIM

AXI TO APB + MPU FILTER

Hyper-Bus

Serial I/Q

AQFN84 package

HWCE

Page 65: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ Taped out Jan 2018§ Arrived back June 2018§ Quentin

§ PULPissimo implementation§ RI5CY+ FPU + HWCE§ 512 kByte RAM

§ Kerbin§ Ariane core + Caches§ Uses the memory infrastructure

of Quentin§ Hyperdrive

§ Binary CNN Accelerator

Poseidon: Our latest chip in GF 22FDX3 mm

QuentinPULPissimo

Hyperdrive

KerbinAriane

Page 66: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

§ For the workshop:§ Davide will give details on how we do thing in PULP next§ Tomorrow there will be hands on sessions for GAP8 and Multicluster

PULP§ For us:

§ Plenty of chips in pipeline (Kosmodrom, Vega, Villalobos)§ Multi-core Ariane§ Atomic instructions, 64bit FPU, Vector Unit

§ For you:§ There is plenty to be done§ Let us know if you want to/can contribute

What is next

Page 67: PULP: an Open Hardware Platform The story so far · 2018-11-09 · 1 February 2016 First release of PULPino, our single-core microcontroller 2 3 May 2016 Toolchain and compiler for

The image part with relationship ID rId9 was not found in the file.

PULP @ ETH ZürichQUESTIONS?

@pulp_platformhttp://pulp-platform.org