using asic flow to quantify cycle time · • complex digital asic design • activity 1 case...

16
Using ASIC Flow to Quantify Cycle Time ECE 4750 Computer Architecture Discussion Section

Upload: others

Post on 31-Jul-2020

4 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

Using ASIC Flow to Quantify Cycle Time

ECE 4750 Computer ArchitectureDiscussion Section

Page 2: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

• Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2

Full Custom Design vs. Standard-Cell Design

L01 –Introduction 22

6.884 –S

pring 20052 Feb 2005

Full Custom D

esignDesigner is free to do anything, anywhere

–though each design team

usually imposes som

e disciplineM

ost time consum

ing design style–

Reserved for very high performance or very high volum

e devices (Intel m

icroprocessors, RF power amps for cellphones)

Requires complete custom

ization of all layers of wafer

Piece of full-custom m

ultiplier array, 1.0Pm

2-metal

Full-custom layoutin 1.0µm w/ 2 metal

layers

I Full-Custom Design

. Designer is free to do anything, anywhere; thoughteam usually imposes some design discipline

. Most time consuming design style; reserved forvery high performance or very high volume chips(Intel microprocessors, RF power amps forcellphones)

I Standard-Cell Design

. Fixed library of “standard cells” and SRAMmemory generators

. Register-transfer-level description is automaticallymapped to this library of standard cells, then thesecells are placed and routed automatically

. Enables agile hardware design methodology

ECE 5745 Course Overview 10 / 45

Page 3: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

• Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2

Standard-Cell Design Methodology

L01 – Introduction 246.884 – Spring 2005 2 Feb 2005

Standard Cell ASICs• Also called Cell-Based ICs (CBICs)• Fixed library of cells plus memory generators• Cells can be synthesized from HDL, or entered in schematics• Cells placed and routed automatically• Requires complete set of custom masks for each design• Currently most popular hard-wired ASIC type (6.884 will use this)

Mem1 Mem

2

Cells arranged in rows

Generated memory arrays

L01 – Introduction 256.884 – Spring 2005 2 Feb 2005

Standard Cell DesignCells have standard height but vary in widthDesigned to connect power, ground, and wells by abutment

VDD Rail

GND Rail

Clock Rail

Cell I/O on M2Power

Rails in M1

Clock Rail (not typical)

NAND2 Flip-flop

Well Contact under Power Rail

Ripple carry adder with carrychain highlighted

ECE 5745 Course Overview 11 / 45

Page 4: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

• Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2

Standard-Cell Design Methodology

Design in HDL

HDLSimulator

Place&Route

Switching Activity

PowerAnalysis

Synthesis

Standard Cells

Gate-Level Model

Layout

Area (�m2)

Execution Time(cycles/task)

Cycle Time (ns)Energy (J/task)

Area (�m2)Cycle Time (ns)Energy (J/task)

Energy (J/task)

ECE 5745 Course Overview 12 / 45

Page 5: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

TinyRV2 Stalling Processor • 90nm Tech

Page 6: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

Experiences Using a Novel Python-Based Hardware Modeling Framework for Computer Architecture Test Chips

PyMTL in Practice: BRG Test Chip 1

Taped-out Layout for BRGTC1

HostInterface

debu

g

RISCCore

SortAccel

Memory Arbitration Unit

SRAMBank(2KB)

SRAMBank(2KB)

SRAMBank(2KB)

SRAMBank(2KB)

diff clk (+)diff clk (�)

singleended clk

rese

t

CtrlReghost2chip

chip2host

LVDSRecv

clkdiv

clk treeresettree

clk

out

The testing platform enables running smalltest programs on BRGTC1 to compare the

performance and energy of pure-software kernelsversus the HLS-generated sorting accelerator

2x2mm 1.3M transistors in IBM 130nmRISC processor, 16KB SRAMHLS-generated accelerators

Static Timing Analysis Freq. @ 246 MHz

Testing Plans After Fabrication

Taped out in March 2016Expected return in Fall 2016LV

DS

divi

ded

clk

out

LVD

Scl

k ou

t

Cornell University

Christopher Torng

4 / 4

Page 7: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

$OO�QXPEHUV�KHUH�DUH�JHQHUDWH�IURP�«�

� 6\QRSV\V�'HVLJQ�&RPSLOHU���,&�FRPSLOHU� 6\QRSV\V���QP�*HQHULF�/LEUDU\�IRU�7HDFKLQJ�,&�'HVLJQ

4XRWH��,QWHO¶V�3HQWLXP�'��6PLWKILHOG��±����QP�SURFHVV�WHFKQRORJ\������±����*+]�

Ɣ ,QWURGXFHG�0D\���������Ɣ ����±����*+]��PRGHO�QXPEHUV����±����Ɣ 1XPEHU�RI�WUDQVLVWRUV�����PLOOLRQƔ ��0%�î����QRQ�VKDUHG����0%�WRWDO��/��FDFKHƔ &DFKH�FRKHUHQF\�EHWZHHQ�FRUHV�UHTXLUHV�FRPPXQLFDWLRQ�RYHU�WKH�)6%Ɣ 3HUIRUPDQFH�LQFUHDVH�RI�����RYHU�VLPLODUO\�FORFNHG�3UHVFRWWƔ �����*+]������0+]�)6%��3HQWLXP�'�����LQWURGXFHG�'HFHPEHU�����Ɣ &RQWDLQV��[�3UHVFRWW�GLHV�LQ�RQH�SDFNDJHƔ )DPLO\����0RGHO��

Page 8: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

immgen

[19:15]

[24:20]

ex_result_reg_M

regfile(read)

regfile(write)alu

wb_result_sel_M

rf_wen_W

rf_waddr_W

Fetch (F) Decode (D) Execute (X) Memory (M) Writeback (W)

br_target_reg_X

op2_sel_D

reg_en_D

inst_D

reg_en_Xdmemreq_msg_

addr

dmemresp_msg_data

reg_en_M reg_en_W

proc_to_mngr_data

mngr_to_proc_data

pc_sel_F

imemreq_msg_addr

reg_en_F

+4pc_incr_F

br_target_X

pc_reg_F

pc_sel_mux_F

pc_plus4_F

pc_next_F

pc_reg_D

inst_D_reg

inst_D

rf_rdata0_D

rf_rdata1_D

rf

op2_sel_mux_D

wb_result_sel_mux_M

wb_result_reg_W

rf_wdata_W

pc_F

alu_fn_X

br_cond_eq_X

imemresp_msg_data

+

imm_type_D

csrr_sel_D

csrr_sel_mux_D

num_corescore_id

pc_plus_imm_D

jalr_target_X

pc_reg_X

op1_reg_X

op2_reg_X

op1_sel_mux_D

op1_sel_D

dmem_write_data_reg_X

dmemreq_msg_data

br_cond_lt_Xbr_cond_ltu_X

imulimul_req_val_Dimul_req_rdy_D

imul_resp_val_Ximul_resp_rdy_X

+4pc_incr_X

imul_req_msg

ex_result_sel_mux_X

ex_result_sel_X

jal_target_D

rf

imul_resp_msg

wb_result_reg_W

stats_en_wen_W

stats_en

imm_gen_D

TinyRV2 Stalling Processor Datapath

Page 9: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

��ENQEM�KFGCNAENQEM��TKUG�GFIG�������������������������������������������������������

��ENQEM�PGVYQTM�FGNC[�RTQRCICVGF�����������������������������������������������������

��FRCVJ�KPUVA&ATGI�QWVATGIA��A�%.-�&((:�����������������������������������������������T

��FRCVJ�KPUVA&ATGI�QWVATGIA��A�3�&((:�������������������������������������������������H

��FRCVJ�KPUVA&ATGI�QWV=��?�4GI'P4UV���������������������������������������������������H

��FRCVJ�KPUVA&=��?�2TQE#NV&RCVJ246.���������������������������������������������������H

��EVTN�+0��2TQE#NV%VTN246.������������������������������������������������������������H

%NQEM�RTQRCICVKQP� �%.-�VQ�3�������PU
Tracing the Critical Path

Page 10: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

����

����

����

��%\SDVVLQJ�ORJLF�LV�QRW�VKRZQ�KHUH��\RX�ZLOO�KDYH�WKH�FKDQFH�WR�GUDZ�LW�LQ�WKH�ODE�UHSRUW�

Tracing the Critical Path

Page 11: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

��EVTN�KPUVAV[RGAFGEQFGTA&�KPA=��?�&GEQFG+PUV6[RG�������������������������������������H

�����

��EVTN�KPUVAV[RGAFGEQFGTA&�FRAKRQ��3�14�:���������������������������������������������H

��EVTN�KPUVAV[RGAFGEQFGTA&�QWV=�?�&GEQFG+PUV6[RG��������������������������������������H

�����

��EVTN�KOOAV[RGA&=�?�2TQE#NV%VTN246.��������������������������������������������������T

��FRCVJ�KOOAV[RGA&=�?�2TQE#NV&RCVJ246.������������������������������������������������T

&GEQFG�CP�KPUVTWEVKQP�������PU
Tracing the Critical Path

Page 12: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

FWUO����

����

����

Tracing the Critical Path

Page 13: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

��FRCVJ�KOOAV[RGA&=�?�2TQE#NV&RCVJ246.������������������������������������������������T

��FRCVJ�KOOAIGPA&�KOOAV[RG=�?�+OO)GP246.����������������������������������������������T

��FRCVJ�KOOAIGPA&�7��+0��014�:��������������������������������������������������������T

�����

��FRCVJ�KOOAIGPA&�7���3�#1��:���������������������������������������������������������H

��FRCVJ�KOOAIGPA&�KOO=��?�+OO)GP246.��������������������������������������������������H

��FRCVJ�REARNWUAKOOA&�KP�=��?�#FFGT���������������������������������������������������H

�����

��FRCVJ�REARNWUAKOOA&�QWV=��?�#FFGT���������������������������������������������������T

)GPGTCVKPI�KOO�������PU��CFFKVKQP�������PU
Tracing the Critical Path

Page 14: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

FWUO����

����

��������

����

����

Tracing the Critical Path

Page 15: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

��FRCVJ�REAUGNAOWZA(�KPAA���=��?�/WZ��������������������������������������������������T

�����

��FRCVJ�REAUGNAOWZA(�QWV=��?�/WZ������������������������������������������������������T

��FRCVJ�KOGOTGSAOUI=��?�2TQE#NV&RCVJ246.����������������������������������������������T

�����

��KOGOTGSASWGWG�S��FRCVJ�D[RCUUAOWZ�QWV=��?�/WZ���������������������������������������T

�����

��KOGOTGSAOUI=��?�QWV�����������������������������������������������������������������T

$[RCUU�SWGWG�������PU
Tracing the Critical Path

Page 16: Using ASIC Flow to Quantify Cycle Time · • Complex Digital ASIC Design • Activity 1 Case Study: Scalar vs. Vector Processors Activity 2 Full Custom Design vs. Standard-Cell Design

FWUO����

����

��������

��������

����

����

����

LPHPUHTBE\SDVVBTXHXH

Tracing the Critical Path