Using FPGAs to Supplement Ray-Tracing Computations on the Cray XD-1



Page 1: Using FPGAs to Supplement Ray-Tracing Computations on the Cray XD-1

Charles B. Cameron

United States Naval Academy
Department of Electrical Engineering
105 Maryland Avenue, Stop 14B
Annapolis, Maryland 21402-5025

Research supported by:

• NASA Goddard Space Flight Center (Code 586)

• NRL Applied Optics Branch (Code 5630)

• DoD High Performance Computing Modernization Program at NRL (Code 5593)

• United States Naval Academy

• Xilinx, Inc.

Page 2: Topics

• Ray tracing

• Conventional parallel processing

• Modulo scheduling

• Coordination of sequential and parallel processing

• Expected Performance

Page 3: Ray tracing

• MODIS (Moderate-resolution Imaging Spectroradiometer)

• The Intersection Problem

• Finding the Perpendicular

• Refraction

• Reflection

Page 4: MODIS Optical System (Moderate-resolution Imaging Spectroradiometer)

Page 5: MODIS Optical System

• 485 pinholes

• 400 rays per pinhole

• 241 × 121 rays reflected from the diffuser

• 5.66 × 10^9 rays
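The total on this slide appears to be the product of the other three figures. The short check below assumes that reading (every ray from every pinhole gives rise to a 241 × 121 grid of rays at the diffuser); the variable names are illustrative only.

```python
# Assumption: the slide's total is the product of the counts listed above it.
pinholes = 485
rays_per_pinhole = 400
diffuser_rays = 241 * 121  # rays reflected from the diffuser, per incoming ray

total_rays = pinholes * rays_per_pinhole * diffuser_rays
print(f"{total_rays:.3g}")  # 5.66e+09
```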

Page 6: Ray Directed to a Surface

• MODIS (Moderate-resolution Imaging Spectroradiometer)

• The Intersection Problem

• Finding the Perpendicular

• Refraction

• Reflection

• Coordinate Transformation

Page 7: Calculate the Intercept Point

• MODIS (Moderate-resolution Imaging Spectroradiometer)

• The Intersection Problem

• Finding the Perpendicular

• Refraction

• Reflection

• Coordinate Transformation

Page 8: Find the Normal

• MODIS (Moderate-resolution Imaging Spectroradiometer)

• The Intersection Problem

• Finding the Perpendicular

• Refraction

• Reflection

• Coordinate Transformation

Page 9: Find the Refracted Ray

• MODIS (Moderate-resolution Imaging Spectroradiometer)

• The Intersection Problem

• Finding the Perpendicular

• Refraction

• Reflection

• Coordinate Transformation

Page 10: Find the Reflected Ray

• MODIS (Moderate-resolution Imaging Spectroradiometer)

• The Intersection Problem

• Finding the Perpendicular

• Refraction

• Reflection

• Coordinate Transformation

Page 11: Coordinate Transformation

• MODIS (Moderate-resolution Imaging Spectroradiometer)

• The Intersection Problem

• Finding the Perpendicular

• Refraction

• Reflection

• Coordinate Transformation (Hard to visualize this!)

Page 12: Topics

• Ray tracing

• Conventional parallel processing

• Modulo scheduling

• Coordination of sequential and parallel processing

• Expected Performance

Page 13: Parallelism

Page 14: Performance (5.66 × 10^9 rays)

DEC Alpha 3000 Series Model 800, 200 MHz

• Duration: 1.2 × 10^6 s (two weeks)

• Rate: 0.112 × 10^6 rays · surfaces / s

Cray XD-1 with 839 AMD Opteron 275 processors, 2.2 GHz

• Duration: 27 s

• Rate: 6.6 × 10^6 rays · surfaces / (s · processor) *

Reduction in time consumed: 99.998 %

Improvement in ray-tracing rate: 5,857 %

* Rate based on a linear regression of results obtained using varying numbers of processors.
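The quoted reduction in time follows directly from the two durations. A quick check, using only the rounded durations printed on the slide (the variable names are illustrative):

```python
alpha_seconds = 1.2e6  # DEC Alpha duration quoted on the slide
cray_seconds = 27.0    # Cray XD-1 duration quoted on the slide

reduction_percent = (1 - cray_seconds / alpha_seconds) * 100
speedup = alpha_seconds / cray_seconds

print(f"reduction in time: {reduction_percent:.3f} %")  # 99.998 %
print(f"wall-clock speed-up: about {speedup:,.0f} times")
```

The rate improvement is not recomputed here because the two rates are quoted on different bases (whole machine versus per processor).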

Page 15: Performance (5.66 × 10^9 rays)

[Bar chart, vertical axis from 0.00 to 7.00: DEC Alpha 3000 Series Model 800 versus Opteron alone]

Page 16: Efficiency

Page 17: Topics

• Ray tracing

• Conventional parallel processing

• Modulo scheduling

• Coordination of sequential and parallel processing

• Expected Performance

Page 18: Operations Required as a Function of Surface, Aperture, and Interaction Types

[Bar chart: number of operations (axis from 0 to 60) for each of the twelve cases numbered below]

• Plane, circular aperture: 1. Refraction, 7. Reflection

• Plane, rectangular aperture: 4. Refraction, 10. Reflection

• Sphere, circular aperture: 2. Refraction, 8. Reflection

• Sphere, rectangular aperture: 5. Refraction, 11. Reflection

• Conicoid, circular aperture: 3. Refraction, 9. Reflection

• Conicoid, rectangular aperture: 6. Refraction, 12. Reflection

Chart annotations: "Lots of these" and "Not too many of these"
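The twelve case numbers follow a regular pattern: the surface type varies fastest, then the aperture, then the interaction. The helper below is only an illustration of that numbering; the function and list names are not from the slides.

```python
SURFACES = ["Plane", "Sphere", "Conicoid"]
APERTURES = ["Circular", "Rectangular"]
INTERACTIONS = ["Refraction", "Reflection"]


def case_number(surface, aperture, interaction):
    """Return the 1-based case number used on the chart's horizontal axis."""
    return (1
            + SURFACES.index(surface)
            + 3 * APERTURES.index(aperture)
            + 6 * INTERACTIONS.index(interaction))


print(case_number("Sphere", "Rectangular", "Reflection"))  # 11
```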

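Page 19: Quadratic Equation

[Data-flow graph for evaluating $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ from the inputs $a$, $b$, $c$ and the constants 2 and 4. Intermediate nodes: $ac$, $4ac$, $b^2$, $b^2 - 4ac$, $\sqrt{b^2 - 4ac}$, $-b \pm \sqrt{b^2 - 4ac}$ and $2a$. Each operation is labelled with the latency of its unit.]

Critical path (data-flow limit): 88 cycles

Latency of each unit:

• Adder: 11 cycles

• Multiplier: 6 cycles

• Divider: 27 cycles

• Square-root extractor: 27 cycles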
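The 88-cycle figure can be reproduced as the longest path through the data-flow graph of the quadratic formula. The sketch below assumes that the subtraction and the ± step both use the adder and that negating b costs nothing; the node names are illustrative.

```python
from functools import lru_cache

LATENCY = {"add": 11, "mul": 6, "div": 27, "sqrt": 27}  # cycles, from the slide

# node -> (unit, inputs); names absent from GRAPH are primary inputs or constants
GRAPH = {
    "ac":   ("mul", ("a", "c")),
    "4ac":  ("mul", ("4", "ac")),
    "b2":   ("mul", ("b", "b")),
    "disc": ("add", ("b2", "4ac")),   # b^2 - 4ac
    "root": ("sqrt", ("disc",)),
    "num":  ("add", ("b", "root")),   # -b +/- sqrt(...)
    "2a":   ("mul", ("2", "a")),
    "x":    ("div", ("num", "2a")),
}


@lru_cache(maxsize=None)
def finish_time(node):
    """Earliest cycle at which a node's value is available (inputs ready at 0)."""
    if node not in GRAPH:
        return 0
    unit, inputs = GRAPH[node]
    return max(finish_time(i) for i in inputs) + LATENCY[unit]


print(finish_time("x"))  # 88
```

The longest chain is multiply, multiply, add, square root, add, divide: 6 + 6 + 11 + 27 + 11 + 27 = 88 cycles.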

Page 20: Modulo Scheduling: One Multiplier

Page 21: Modulo Scheduling: One Multiplier

Page 22: Modulo Scheduling: One Multiplier

Page 23: Modulo Scheduling: One Multiplier

Page 24: Modulo Scheduling: One Multiplier

Page 25: Modulo Scheduling: One Multiplier

Page 26: Modulo Scheduling: One Multiplier

Page 27: Modulo Scheduling: One Multiplier

Equal to the Data-Flow Limit

Page 28: Modulo Scheduling: Filling the Pipeline

One collective computation

[Pipeline diagram; horizontal axis: cycle number, 0c to 90c]

Page 29: Modulo Scheduling: Filling the Pipeline

[Pipeline diagram; horizontal axis: cycle number, 0c to 90c]

Page 30: Modulo Scheduling: Filling the Pipeline

Multipliers are 100 % utilized

[Pipeline diagram; horizontal axis: cycle number, 0c to 90c]

No schedule conflicts
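One way to see how a single multiplier can be fully used without conflicts is a modulo reservation check: if a new quadratic enters every II cycles, two operations on the same unit collide only when their start cycles are equal modulo II. The schedule below is a hypothetical one, not taken from the slides; it assumes fully pipelined units that accept one operation per cycle and the latencies quoted earlier.

```python
from collections import Counter

LATENCY = {"add": 11, "mul": 6, "div": 27, "sqrt": 27}
UNITS = {"add": 1, "mul": 1, "div": 1, "sqrt": 1}
II = 4  # assumed initiation interval: one new quadratic every 4 cycles

# Hypothetical schedule for one quadratic: operation -> (unit, start cycle)
SCHEDULE = {
    "a*c":        ("mul", 0),
    "b*b":        ("mul", 1),
    "2*a":        ("mul", 3),
    "4*(ac)":     ("mul", 6),
    "b2 - 4ac":   ("add", 12),
    "sqrt":       ("sqrt", 23),
    "-b +/- root": ("add", 50),
    "divide":     ("div", 61),
}


def slot_usage(schedule, ii):
    """Count the operations claiming each (unit, start cycle mod ii) slot."""
    return Counter((unit, start % ii) for unit, start in schedule.values())


usage = slot_usage(SCHEDULE, II)
conflicts = {slot: n for slot, n in usage.items() if n > UNITS[slot[0]]}
mul_slots_used = sum(n for (unit, _), n in usage.items() if unit == "mul")
multiplier_utilization = mul_slots_used / (II * UNITS["mul"])
latency = max(start + LATENCY[unit] for unit, start in SCHEDULE.values())

print(conflicts, multiplier_utilization, latency)  # {} 1.0 88
```

In this sketch the four multiplications occupy all four slots of the interval, and the schedule still finishes in 88 cycles, the data-flow limit.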

Page 31: Modulo Scheduling: Two Multipliers

Two multipliers with two multiplications each

Page 32: Modulo Scheduling: Two Multipliers

Two cycles

One adder with two additions

Maximum efficiency

Page 33: Modulo Scheduling: Two Multipliers

Improved efficiency: up from 25 %

Page 34: Modulo Scheduling: Two Multipliers

Page 35: Modulo Scheduling: Two Multipliers

Page 36: Modulo Scheduling: Two Multipliers

Less than the Data-Flow Limit

Page 37: Modulo Scheduling: Two Multipliers

Less than the Data-Flow Limit, but double the throughput.
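The doubling of throughput can be illustrated with the resource-constrained lower bound on the initiation interval: for each unit type, divide the number of operations per computation by the number of units and take the largest result. The operation counts below (four multiplications, two additions, one division, one square root) are an assumption read off the quadratic formula, and the units are assumed to be fully pipelined.

```python
from math import ceil

OPS = {"mul": 4, "add": 2, "div": 1, "sqrt": 1}  # operations per quadratic (assumed)


def resource_min_ii(units):
    """Smallest initiation interval the available units allow."""
    return max(ceil(OPS[u] / units[u]) for u in OPS)


def utilization(units):
    """Fraction of issue slots each unit type uses at the minimum interval."""
    ii = resource_min_ii(units)
    return {u: OPS[u] / (units[u] * ii) for u in OPS}


one_multiplier = {"mul": 1, "add": 1, "div": 1, "sqrt": 1}
two_multipliers = {"mul": 2, "add": 1, "div": 1, "sqrt": 1}

for name, units in [("one", one_multiplier), ("two", two_multipliers)]:
    print(name, resource_min_ii(units), utilization(units))
```

Under these assumptions the interval drops from four cycles to two, the adder becomes fully used, and the divider and square-root unit rise from 25 % to 50 % utilization.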

Page 38: Topics

• Ray tracing

• Conventional parallel processing

• Modulo scheduling

• Coordination of sequential and parallel processing

• Expected Performance

Page 39: Cray XD-1

• MPI (Message Passing Interface)

• Master node

– Reads file

– Distributes file

– Collates results

[Diagram: 220 nodes]
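The master node's three jobs (read, distribute, collate) can be sketched without any MPI library. The code below is a sequential stand-in, not the author's program: `trace_chunk` is a placeholder for the real per-node ray tracing, and in an actual MPI program the distribution and collation would typically use collective operations such as MPI_Scatter and MPI_Gather.

```python
def split_evenly(items, n_workers):
    """Divide items into n_workers contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def trace_chunk(rays):
    """Placeholder for one node's work; here it simply doubles each value."""
    return [2 * ray for ray in rays]


def master(rays, n_workers):
    chunks = split_evenly(rays, n_workers)               # distribute
    partial = [trace_chunk(chunk) for chunk in chunks]   # each node works on its share
    return [r for part in partial for r in part]         # collate in order


print(master(list(range(10)), 3))
```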

Page 40: One Node of the Cray XD-1

• OpenMP (Open Multi-Processing)

• 144 of 220 nodes have a Xilinx Virtex II Pro FPGA

• Opteron processors

– Sequential program

– Depth first

• FPGA

– Pipelined hardware

– Breadth first

[Diagram: AMD Opteron 0, AMD Opteron 1, AMD Opteron 2 and AMD Opteron 3, each running an RT thread, plus one FPGA served by an FPGA thread]
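The difference between the two traversal orders can be shown with a toy example. Depth first follows one ray through every surface before starting the next ray; breadth first pushes every ray through one surface before moving on, which is what keeps a deep pipeline supplied with independent work. `interact` is a hypothetical stand-in for a real ray/surface interaction.

```python
def interact(ray, surface):
    """Hypothetical stand-in for one ray/surface interaction."""
    return ray * surface + 1


def trace_depth_first(rays, surfaces):
    results = []
    for ray in rays:                  # finish one ray completely
        for surface in surfaces:
            ray = interact(ray, surface)
        results.append(ray)
    return results


def trace_breadth_first(rays, surfaces):
    rays = list(rays)
    for surface in surfaces:          # send every ray through one surface
        rays = [interact(ray, surface) for ray in rays]
    return rays


print(trace_depth_first([1, 2, 3], [2, 3]))    # [10, 16, 22]
print(trace_breadth_first([1, 2, 3], [2, 3]))  # [10, 16, 22]
```

Both orders give the same answers; only the order in which the work is issued changes.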

Page 41: Topics

• Ray tracing

• Conventional parallel processing

• Modulo scheduling

• Coordination of sequential and parallel processing

• Expected Performance

Page 42: Performance

• Opteron alone: 6.6 × 10^6 rays · surfaces / (s · processor) [measured]

• FPGA alone: 5.4 × 10^6 rays · surfaces / (s · processor) [estimated]. Reduction in speed = 20 %.

Page 43: Performance

• Opteron alone: 6.6 × 10^6 rays · surfaces / (s · processor) [measured]

• FPGA alone: 5.4 × 10^6 rays · surfaces / (s · processor) [estimated]. Reduction in speed = 20 %.

• Opteron with FPGA: 12.0 × 10^6 rays · surfaces / (s · processor) [estimated]. Increase in speed = +80 %. Floating-point units use 11 % of the FPGA (1 adder, 1 multiplier, 1 divider, 1 square-root unit).

Page 44: Performance

• Opteron alone: 6.6 × 10^6 rays · surfaces / (s · processor) [measured]

• FPGA alone: 5.4 × 10^6 rays · surfaces / (s · processor) [estimated]. Reduction in speed = 20 %.

• Opteron with FPGA: 12.0 × 10^6 rays · surfaces / (s · processor) [estimated]. Increase in speed = +80 %. Floating-point units use 11 % of the FPGA (1 adder, 1 multiplier, 1 divider, 1 square-root unit).

• Opteron with FPGA: 25.2 × 10^6 rays · surfaces / (s · processor) [estimated]. Increase in speed = +285 %. Floating-point units use 25 % of the FPGA (3 adders, 4 multipliers, 1 divider, 1 square-root unit).
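The percentages can be approximately reproduced from the rounded rates on the slide. The check below computes each change relative to the measured Opteron rate; it gives values close to, but not identical with, the quoted 20 %, +80 % and +285 %, presumably because the slide's figures are rounded or derive from unrounded rates.

```python
OPTERON = 6.6  # 10^6 rays · surfaces / (s · processor), measured


def percent_change(new, old):
    """Signed percentage change of new relative to old."""
    return (new - old) / old * 100


estimates = {
    "FPGA alone": 5.4,
    "Opteron with FPGA (11 % of FPGA)": 12.0,
    "Opteron with FPGA (25 % of FPGA)": 25.2,
}
changes = {name: percent_change(rate, OPTERON) for name, rate in estimates.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f} %")
```

Note that the first combined estimate equals the sum of the two stand-alone rates (6.6 + 5.4 = 12.0).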

Page 45: Performance

[Bar chart, vertical axis from 0.00 to 30.00: Opteron alone (measured); FPGA alone (estimate, Note 1); Opteron with FPGA (estimate, Note 1); Opteron with FPGA (estimate, Note 2)]

Note 1: 1 adder, 1 multiplier, 1 divider, 1 square-root taker

Note 2: 3 adders, 4 multipliers, 1 divider, 1 square-root taker

Page 46: Summary

• Modulo scheduling produces 100 % efficiency of critical resources.

• Sequential processors get a boost from supplemental FPGA processing.

• Deep pipelines are efficient only if filled much of the time.

• FPGAs beat ASICs only if they can take advantage of special problem knowledge.

• Opteron uses 55 W.

• Virtex II Pro FPGA uses 4 W to 45 W.

Page 47: Equations

• Intersection of a Ray with a Plane

• Intersection of a Ray with a Sphere

• Intersection of a Ray with a Conicoid

• Finding the Perpendicular

• Interaction of a Ray with an Optical Surface

• Coordinate Transformations

Page 48: Intersection of a Ray with a Plane

List of equations

Initial direction

Normal to the plane

Point in the plane

Initial point

Final point
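The slide's equations did not survive in this transcript. As a reminder of the standard textbook calculation (not necessarily the author's formulation), the final point can be found from the initial point, the initial direction, a point in the plane and the normal to the plane as follows; all names are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intersect_plane(initial_point, direction, plane_point, normal, eps=1e-12):
    """Return the point where the ray's line meets the plane, or None if parallel.

    Solves normal . (initial_point + t * direction - plane_point) = 0 for t.
    A negative t (plane behind the ray) is not rejected in this sketch.
    """
    denom = dot(normal, direction)
    if abs(denom) < eps:
        return None
    offset = [q - p for q, p in zip(plane_point, initial_point)]
    t = dot(normal, offset) / denom
    return [p + t * d for p, d in zip(initial_point, direction)]


print(intersect_plane([0, 0, 0], [0, 0, 1], [0, 0, 5], [0, 0, 1]))  # [0.0, 0.0, 5.0]
```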

Page 49: Intersection of a Ray with a Sphere

List of equations

Initial point

Final point

Initial direction
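The equations for this slide are also missing from the transcript. In the standard treatment, substituting the ray into the sphere's equation gives a quadratic in the distance along the ray, which is where the quadratic-equation hardware comes in. The sketch below is that textbook version, with illustrative names, and returns the nearest intersection in front of the initial point.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intersect_sphere(initial_point, direction, center, radius):
    """Return the nearest forward intersection of a ray with a sphere, or None."""
    oc = [p - c for p, c in zip(initial_point, center)]
    a = dot(direction, direction)
    b = 2 * dot(direction, oc)
    c = dot(oc, oc) - radius ** 2
    disc = b * b - 4 * a * c
    if disc < 0:
        return None  # the ray's line misses the sphere
    root = math.sqrt(disc)
    for t in sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)]):
        if t >= 0:
            return [p + t * d for p, d in zip(initial_point, direction)]
    return None  # sphere lies entirely behind the initial point


print(intersect_sphere([0, 0, -5], [0, 0, 1], [0, 0, 0], 1))  # [0.0, 0.0, -1.0]
```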

Page 50: Intersection of a Ray with a Conicoid

List of equations

Initial point

Final point

Initial direction

Page 51: Finding the Perpendicular

Unit Vector Normal to a Sphere

Unit Vector Normal to a Conicoid

List of equations
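For the sphere, the unit normal at a surface point is simply the normalized vector from the center to that point. The sketch below shows that standard result with illustrative names; the conicoid case is omitted because the transcript does not preserve the surface's defining equation.

```python
import math


def unit_normal_to_sphere(point, center):
    """Outward unit normal at a point on a sphere with the given center."""
    radial = [p - c for p, c in zip(point, center)]
    length = math.sqrt(sum(x * x for x in radial))
    if length == 0:
        raise ValueError("point coincides with the center; normal is undefined")
    return [x / length for x in radial]


print(unit_normal_to_sphere([0, 0, 2], [0, 0, 0]))  # [0.0, 0.0, 1.0]
```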

Page 52: Interaction of a Ray with an Optical Surface

Refraction

Reflection

List of equations

Initial index of refraction

Final index of refraction

Normal to the plane

Initial direction

Final direction
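The slide's equations are not preserved here. The sketch below uses the standard vector forms of the law of reflection and Snell's law, which may differ in notation from the author's. It assumes unit-length vectors and a normal that points back toward the side the ray arrives from; the names are illustrative.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reflect(direction, normal):
    """Final direction after mirror reflection: d - 2 (d . n) n."""
    k = 2 * dot(direction, normal)
    return [d - k * n for d, n in zip(direction, normal)]


def refract(direction, normal, n_initial, n_final):
    """Final direction after refraction, or None on total internal reflection."""
    eta = n_initial / n_final
    cos_i = -dot(direction, normal)
    k = 1 - eta * eta * (1 - cos_i * cos_i)
    if k < 0:
        return None
    scale = eta * cos_i - math.sqrt(k)
    return [eta * d + scale * n for d, n in zip(direction, normal)]


print(reflect([1, -1, 0], [0, 1, 0]))            # [1, 1, 0]
print(refract([0, -1, 0], [0, 1, 0], 1.0, 1.5))  # straight through at normal incidence
```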

Page 53: Coordinate Transformations

Rotation and Translation

Rotation

List of equations

Translation Vector

Rotation Matrix

Direction in Frame of Reference k

Direction in Frame of Reference k+1

Position in Frame of Reference k

Position in Frame of Reference k+1
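Since the equations themselves are missing, the sketch below shows one common convention for moving from frame k to frame k+1: positions are translated and then rotated, while directions are only rotated. The order of translation and rotation, and the sign of the translation vector, are assumptions; the author's convention may differ.

```python
def mat_vec(matrix, vector):
    """Multiply a 3 x 3 matrix (list of rows) by a 3-vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transform_position(rotation, translation, position_k):
    """Position in frame k+1, assuming p' = R (p - t)."""
    shifted = [p - t for p, t in zip(position_k, translation)]
    return mat_vec(rotation, shifted)


def transform_direction(rotation, direction_k):
    """Direction in frame k+1: directions rotate but do not translate."""
    return mat_vec(rotation, direction_k)


# Example: a 90-degree rotation about the z axis.
R = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]
print(transform_position(R, [1, 0, 0], [2, 0, 0]))  # [0, 1, 0]
print(transform_direction(R, [1, 0, 0]))            # [0, 1, 0]
```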