h r refactoring for heterogeneous computing with...

42
HETEROREFACTOR: Refactoring for Heterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian Zhang*, Muhammad Ali Gulzar, Jason Cong, Miryung Kim University of California, Los Angeles *Equal co-first authors in alphabetical order

Upload: others

Post on 06-Sep-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

HETEROREFACTOR: Refactoring forHeterogeneous Computing with FPGA

Jason Lau*, Aishwarya Sivaraman*, Qian Zhang*,Muhammad Ali Gulzar, Jason Cong, Miryung KimUniversity of California, Los Angeles

*Equal co-first authors in alphabetical order

Page 2: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

HETEROREFACTOR: Refactoring forHeterogeneous Computing with FPGA

Jason Lau Aishwarya Sivaraman

Qian Zhang Muhammad Ali Gulzar

Jason Cong Miryung Kim

Page 3: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

FPGA*-based Acceleration

Fast Efficient

* FPGA: Field Programmable Gate Array

3

Page 4: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

FPGA*-based Acceleration

Fast Efficient Effort

* Field Programmable Gate Array

4

Page 5: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Evolution of Programming Modelmodule vecdot(a, b, c, clk, rst);

input [67:0] a, b;output [16:0] c;reg [5:0] s; reg [16:0] prod [0:7]; ...

always @(posedge clk or posedge rst)if (!rst) begin

if (s == 6’b00001)prod[0] = a[..] * b[..]; prod[1] =...s = 6’b00010;

else if (s == 6’b00010)reg1 = prod[0] + prod[1] + prod[2];s = 6’b00100; // goto L00100;

else if (s == 6’b00100)reg1 = reg1 + prod[3] + prod[4];s = 6’b01000;

else ... ;...endmodule

goto-style control.

instructions.

typeless.

registers.

VerilogHDL*

5

* HDL: Hardware Description Language

Page 6: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Evolution of Programming Modelfpga_float<8,15> vecdot( fpga_float<8,15> a[], fpga_float<8,15> b[],

fpga_int<31> n) {for (fpga_int<31> i = 0; i < n; i++)

sum += a[i] * b[i];return sum;

}

MerlinHLS*,etc.

auto optimization.

typed.

auto schedule.

auto resource.

6

* HLS: High-Level Synthesis

Page 7: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Something is missing...fpga_float<8,15> vecdot( fpga_float<8,15> a[], fpga_float<8,15> b[],

fpga_int<31> n) {for (fpga_int<31> i = 0; i < n; i++)

sum += a[i] * b[i];return sum;

}

MerlinHLS*,etc.

bit-width.

waste scarcememory! FPGA memory:

< 100 MB

7

* HLS: High-Level Synthesis

bitwidth = 31

Page 8: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Something is missing...fpga_float<8,15> vecdot( fpga_float<8,15> a[], fpga_float<8,15> b[],

fpga_int<31> n) {for (fpga_int<31> i = 0; i < n; i++)

sum += a[i] * b[i];return sum;

}

MerlinHLS*,etc.

bit-width.

floating-point precision.

precision? memory?

8

* HLS: High-Level Synthesis

exponent 8 bits fraction 15 bits

Page 9: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Something is missing...struct Node {

Node *left, *right;int val; };

void init(Node **root) {*root = (Node *)malloc(sizeof(Node)); }

void insert(Node **root, int *arr);void delete_tree(Node *root) {// … free(root); }void traverse(Node *curr) {

if (curr == NULL) return;int ret = visit(curr->val);

traverse(curr->left);traverse(curr->right);

}

recursive data structure.

MerlinHLS*,etc.

bit-width.

floating-point precision.

4 errors in 14 lines of code

nested pointers

9

* HLS: High-Level Synthesis

Page 10: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Something is missing...struct Node {

Node *left, *right;int val; };

void init(Node **root) {*root = (Node *)malloc(sizeof(Node)); }

void insert(Node **root, int *arr);void delete_tree(Node *root) {// … free(root); }void traverse(Node *curr) {

if (curr == NULL) return;int ret = visit(curr->val);

traverse(curr->left);traverse(curr->right);

}

recursive data structure.

MerlinHLS*,etc.

bit-width.

floating-point precision.

4 errors in 14 lines of code

nested pointers

dynamic mem mgmt

preallocated size?

10

* HLS: High-Level Synthesis

Page 11: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Something is missing...struct Node {

Node *left, *right;int val; };

void init(Node **root) {*root = (Node *)malloc(sizeof(Node)); }

void insert(Node **root, int *arr);void delete_tree(Node *root) {// … free(root); }void traverse(Node *curr) {

if (curr == NULL) return;int ret = visit(curr->val);

traverse(curr->left);traverse(curr->right);

}

recursive data structure.

MerlinHLS*,etc.

bit-width.

floating-point precision.

4 errors in 14 lines of code

nested pointers

dynamic mem mgmt

pointer operations

preallocated size?

11

* HLS: High-Level Synthesis

Page 12: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Something is missing...struct Node {

Node *left, *right;int val; };

void init(Node **root) {*root = (Node *)malloc(sizeof(Node)); }

void insert(Node **root, int *arr);void delete_tree(Node *root) {// … free(root); }void traverse(Node *curr) {

if (curr == NULL) return;int ret = visit(curr->val);

traverse(curr->left);traverse(curr->right);

}

recursive data structure.

MerlinHLS*,etc.

bit-width.

floating-point precision.

4 errors in 14 lines of code

nested pointers

dynamic mem mgmt

pointer operations

recursion functions

preallocated size?

12

* HLS: High-Level Synthesis

Page 13: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Evolution of Programming Model

evolvi

ng

ANSI CC++ 03

C++ 11

C++ 14

C++ 17

CPU

FPGA

Vivado HLS C/C++

Year

Programmability(languages features,programming difficulty, etc.)

HDL

gap

untimed descriptions

Merlin HLS C/C++

1989 2003 2011 2014 2017

pragma simplified evolving

Credit: A Multi-Paradigm Programming Infrastructure for Heterogeneous Architectures by Cong et al.

13

Page 14: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Evolution of Programming Model

significant human effort.

& error-prone.

14

evolvi

ng

ANSI CC++ 03

C++ 11

C++ 14

C++ 17

CPU

FPGA

Vivado HLS C/C++

Year

Programmability(languages features,programming difficulty, etc.)

HDL

gap

untimed descriptions

Merlin HLS C/C++

1989 2003 2011 2014 2017

pragma simplified evolving

Credit: A Multi-Paradigm Programming Infrastructure for Heterogeneous Architectures by Cong et al.

Page 15: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Evolution of Programming Model

significant human effort.

& error-prone.

waste scarce memory

15

evolvi

ng

ANSI CC++ 03

C++ 11

C++ 14

C++ 17

CPU

FPGA

Vivado HLS C/C++

Year

Programmability(languages features,programming difficulty, etc.)

HDL

gap

untimed descriptions

Merlin HLS C/C++

1989 2003 2011 2014 2017

pragma simplified evolving

Credit: A Multi-Paradigm Programming Infrastructure for Heterogeneous Architectures by Cong et al.

Page 16: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

I want it to run!

16

Page 17: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

I want it to run efficiently!

17

Page 18: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Automation!

18

Page 19: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

HETEROREFACTOR

Instrumentation

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Recursive Data StructuresSupport and Optimization

IntegersBitwidth Optimization

Floating PointsBitwidth Optimization

on

e-click

19

Page 20: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Part 1. Dynamic Data Structures

Instrumentation

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

on

e-click

Recursive Data StructuresSupport and Optimization

IntegersBitwidth Optimization

Floating PointsBitwidth Optimization

20

Page 21: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Instrumentation

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

on

e-click Data Structure Size Recursion Depth

Recursive Data Structures

Dynamic Data Structures: Instrumentation

21

Page 22: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Dynamic Data Structures: Refactoring

Instrumentation

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Rewrite Memory Management

Modify Pointer Access

Convert Recursion

on

e-click

Recursive Data StructuresSupport and Optimization

Data Structure Size Recursion Depth

Recursive Data Structures

22

Page 23: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Example Program

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

void init(Node **root) {*root = (Node *)malloc(sizeof(Node)); }

void delete_tree(Node *root) { // …free(root); }

void traverse(Node *curr) { // entryif (curr == NULL)

return;int ret = visit(curr->val);traverse(curr->left);traverse(curr->right); // return

}

// top-level functionfloat kernel(float input[], int n) {

float value = computation(float(..), ..);}

Instrumentation

23

Page 24: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Refactoring Rule 1: Rewrite Mem. Mgmt.

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Instrumentation

void init(Node **root) { *root = (Node *)malloc(sizeof(Node)); }

void delete_tree(Node *root) { // … free(root); }

void init(Node_ptr *root) {*root = (Node_ptr)Node_malloc(sizeof(Node)); }

void delete_tree(Node_ptr root) { // …Node_free(root); }

24

Page 25: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Refactoring Rule 1: Rewrite Mem. Mgmt.

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Instrumentation

void init(Node **root) { *root = (Node *)malloc(sizeof(Node)); }

void delete_tree(Node *root) { // … free(root); }

void init(Node_ptr *root) {*root = (Node_ptr)Node_malloc(sizeof(Node)); }

void delete_tree(Node_ptr root) { // …Node_free(root); }

25

Page 26: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Refactoring Rule 2: Rewrite Pointer Access

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Instrumentation

void traverse(Node_ptr curr) {if (curr == NULL) return;int ret = visit(curr->val);traverse(curr->left);traverse(curr->right); }

Node Node_arr[NODE_ARR_SIZE];void traverse(Node_ptr curr) {

if (curr == NULL) return;int ret = visit(Node_arr[curr].val);traverse(Node_arr[curr].left);traverse(Node_arr[curr].right); }

26

Page 27: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Refactoring Rule 3: Convert Recursion

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Instrumentation

void traverse(Node_ptr curr) {traverse(Node_arr[curr].left);traverse(Node_arr[curr].right); }

void traverse_converted(Node_ptr curr) {stack<context> s(STACK_SIZE);while (!s.empty()) {

context c = s.pop();goto c.location;

L0:// traverse(Node_arr[curr].left);c.location = L1;s.push(c);s.push({curr: Node_arr[curr].left});continue;

L1:// ...

}}

27

Page 28: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Part 2. Integers

Instrumentation

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Recursive Data StructuresSupport and Optimization

IntegersBitwidth Optimization

Floating Points Bitwidth Optimization

on

e-click

28

Page 29: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Integers: Kvasir-based Instrumentation

Instrumentation

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

on

e-click Value Range

Integers

29

Page 30: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Integers: Refactoring

Instrumentation

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

on

e-click

Rewrite Integer Types

Value Range

Integers

30

Page 31: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Refactoring Rule: Modify Integer Type

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Instrumentation

Node Node_arr[NODE_ARR_SIZE];void init(Node_ptr *root) {

*root = (Node_ptr)Node_malloc(sizeof(Node)); }

void delete_tree(Node_ptr root) { // …Node_free(root); }

void traverse(Node_ptr curr) {if (curr == NULL) return;

// @invarants(ret[21,255])// int ret = visit(Node_arr[curr].val);

fpga_uint<8> ret = visit(Node_arr[curr].val);traverse(Node_arr[curr].left);traverse(Node_arr[curr].right); }

float kernel(float input[], int n) {float value = computation(float(..), ..);

}

31

Page 32: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Part 3. Floating Points

Program Variants Generation

Differential Execution and Probabilistic Verification

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Recursive Data StructuresSupport and Optimization

IntegersBitwidth Optimization

Floating Points Bitwidth Optimization

on

e-click

32

Page 33: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Program Variants Generation

Differential Execution and Probabilistic Verification

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

on

e-click

ddddC/ddddC/Pre-transformed Programs

with Different Precisions

Floating Points

Floating Points: Program Variants Generation

33

Page 34: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Floating Points: Program Variants Generation

Differential Execution and Probabilistic Verification

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

float kernel(float input[], int n) {float value = computation(float(..), ..);

}

float low_bit(float input[], int n) {fpga_float<8,16> value =

computation(fpga_float<8,16>(..), ..);}

float high_bit(float input[], int n) {fpga_float<8,23> value =

computation(fpga_float<8,23>(..), ..);}

Program Variants Generation

fpga_float<Exponent, Fraction> to customize FP precision * note: fpga_float<8,23> is 32 bit float type, fpga_float<5,16>uses 22 bits in total

34

Page 35: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Floating Points: Differential Execution

Program Variants Generation

Differential Execution and Probabilistic Verification

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

on

e-click

Precision Loss from Differential Execution

ddddC/ddddC/Pre-transformed Programs

with Different Precisions

Floating Points

35

Page 36: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Differential Execution and Probabilistic Verification

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

float kernel(float input[], int n) {float value = computation(float(..), ..);

}

float low_bit(float input[], int n) {fpga_float<8,16> value =

computation(fpga_float<8,16>(..), ..);}float high_bit(float input[], int n) {

fpga_float<8,23> value =computation(fpga_float<8,23>(..), ..);

}

void verification() {float diff = high_bit(..) - bit_ver(..);if (diff > epsilon) // failed sample

}

Program Variants Generation

Floating Points: Differential Execution

36

Page 37: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Floating Points: Probabilistic Verification

Differential Execution and Probabilistic Verification

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

void verification() {float diff = high_ver(..) - low_ver(..);if (diff > epsilon) // failed sample

}Program Variants

Generation

Use Hoeffding’s inequality [1] to calculate the number of samples to meet the required confidence level: alpha.

[1] Hoeffding, Wassily (1963). "Probability inequalities for sums of bounded random variables"

37

Page 38: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Guard Checking

Refactoring

Selective Offloading

C++ Inputs

Vivado HLS / Merlin

Instrumentation

● Input check on host and intermediate check on device

● Send a signal to the host to indicate fallback when:

○ Recursive programs: stack overflow, memory failure

○ Integers: overflow

● The host restart computation on CPU

38

Page 39: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Evaluation: Coding Effort

ID / ProgramOrig.LOC

Manual LOC

ΔLOC

Auto.LOC

Orig.Chars

Manual Chars

ΔChars

R1/A.-C. 190 291 33% 557 5673 8776 35%

R2/DFS 86 198 57% 464 2236 5699 61%

R3/L. List 131 235 44% 329 3061 6686 54%

R4/M. Sort 128 342 63% 390 3267 9124 64%

R5/Strassen’s 342 735 53% 1006 10026 40971 76%

Geomean 49% 56%

49%reductionin LOC

56%in chars

39

Page 40: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Evaluation: Resource Reduction

83%reductionin BRAM

Recursive Data Structures*

42%increasein Fmax

* assuming a typical size of 2k, + a conservative size of 16k

22%reductionin FF

Integer

21%reductionin LUT

41%reductionin BRAM

52%increasein DSP

61%reductionin FF

Floating-point

39%reductionin LUT

50%increasein DSP

40

Page 41: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

Acknowledgement

● Guy Van den Broeck, Brett Chalabian, Todd Millstein,

Peng Wei, Cody Hao Yu, Janice Wheeler

● Intel CAPA grant

● CRISP, one of six centers in JUMP, a SRC program

● NSF grants: CCF-1764077, CCF-1527923, CCF-1723773

● ONR grant: N00014-18-1-2037

● Samsung grant

● Center for Domain-Specific Computing (CDSC)

○ Xilinx and VMWare.

41

Page 42: H R Refactoring for Heterogeneous Computing with FPGAmiryung/Publications/icse2020-heterorefactor-slides.pdfHeterogeneous Computing with FPGA Jason Lau*, Aishwarya Sivaraman*, Qian

● We adapt and expand automated refactoring to heterogeneous computing with FPGA.

● HETEROREFACTOR provides a novel, end-to-end solution that combines:

○ dynamic invariant analysis for identifying common-case.

○ kernel refactoring to enhance HLS synthesizability and to reduce memory usage.

○ selective offloading with guard checking to guarantee correctness.

● The proposed combination is unique to the best of our knowledge.

HETEROREFACTOR: Refactoring forHeterogeneous Computing with FPGAJason Lau*, Aishwarya Sivaraman*, Qian Zhang*,Muhammad Ali Gulzar, Jason Cong, Miryung KimUniversity of California, Los Angeles

*Equal co-first authors in alphabetical order

42