IEEEI 2010

Instruction Set Extensions for Computation on Complex Floating Point Numbers

Authors: Philipp Digeser, Marco Tubolino, Martin Klemm, Daniel Shapiro, Axel Sikora and Miodrag Bolic
Email: {digeserp, tubolinm, klemmm, sikora}@dhbw-loerrach.de, {dshap092, mbolic}@site.uottawa.ca

Overview

• Prior Art
• Complex Floating Point Division
• Instruction Set Extensions (ISE)
• Instruction Hardware
• Software Interface
• Experiment
• Performance Evaluation
• Hardware Resource Utilization
• Future Work
• Conclusion

Prior Art

• We described the possibility of accelerating scientific observation using ISEs instead of software libraries such as carith
• In this work we demonstrate this possibility
• The extension of our prior work can perform several operations (complex addition, subtraction, multiplication, and division), which improves the chances of our ISE being widely applicable

Complex Floating Point Computations

• Unlike real multiplication or division, mathematical operations for complex numbers are usually provided by slow software. Consider complex division:

\[
E + jF = \frac{A + jB}{C + jD}
       = \frac{(A + jB)(C - jD)}{(C + jD)(C - jD)}
       = \frac{AC + BD}{C^2 + D^2} + j\,\frac{BC - AD}{C^2 + D^2}
\]

• 3 Additions/Subtractions
• 6 Multiplications
• 2 Divisions
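As a rough illustration of where those operation counts come from, the division formula above can be written out in plain C. This is only a sketch of a generic software routine, not the code measured in the talk; the `cfloat` type and the `cdiv_sw` name are made up for this example, and the routine does not guard against C = D = 0 or overflow.

```c
/* Hypothetical helper type; the slides only name the scalars A..F. */
typedef struct {
    float re;
    float im;
} cfloat;

/* Computes (A + jB) / (C + jD) by multiplying with the conjugate of the
 * denominator. No check for a zero denominator or for overflow. */
cfloat cdiv_sw(cfloat x, cfloat y)
{
    float A = x.re, B = x.im, C = y.re, D = y.im;

    float denom  = C * C + D * D;   /* 2 multiplications, 1 addition    */
    float num_re = A * C + B * D;   /* 2 multiplications, 1 addition    */
    float num_im = B * C - A * D;   /* 2 multiplications, 1 subtraction */

    cfloat q;
    q.re = num_re / denom;          /* division 1 gives E */
    q.im = num_im / denom;          /* division 2 gives F */
    return q;
}
```

Counting the comments gives the 3 additions/subtractions, 6 multiplications, and 2 divisions listed on the slide.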

Complex Floating Point Computations

• Fast complex computations are necessary
– Image and audio manipulation
– Multi-antenna
– Correlation
– Others

• Example: STSDAS offers math libraries for image analysis, including stsdas.analysis.fourier.carith, which is used to multiply or divide two complex images [1].

Instruction Set Extension

• Instruction-set extension, as the name implies, involves the addition of custom instructions to a processor’s instruction set

Generic custom instruction datapath [2]

Instruction Set Extension

• An ISE candidate has limited I/O access to the register file
• We use multicycle reads/writes from/to the register bank in order to squeeze several operands into the two-input, one-output register file [4]
• The computations can be distributed to one adder, one multiplier and one divider
• They can be pipelined
• In case of divide by zero and overflow, flags are set

Original custom logic block [3]

Instruction Hardware

Operation when n=0 above, n=1 at right.

\[
E + jF = \frac{AC + BD}{C^2 + D^2} + j\,\frac{BC - AD}{C^2 + D^2}
\]

Software Interface

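• The designed hardware for complex division can be used easily from inline assembly or from C/C++ code, as shown below:

ALT_CI_COMPLEX_CORE_INST(0, in_A, in_C);             /* pass operands A and C; result unused */
out_real = ALT_CI_COMPLEX_CORE_INST(1, in_B, in_D);  /* pass operands B and D; read real part E */
out_imag = ALT_CI_COMPLEX_CORE_INST(0, 0, 0);        /* no new operands; read imaginary part F */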
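One way to keep the three-call sequence in a single place is a small wrapper function. The sketch below is an illustration only: the wrapper `complex_div_ise` and the host-side stand-in `complex_core_model` are hypothetical names, the operand type is assumed to be `float`, and the stand-in merely mimics the call order shown above (it is not a description of the actual hardware and ignores the selector `n`). On the target, the real `ALT_CI_COMPLEX_CORE_INST` macro from the generated system header would be used instead.

```c
/* Host-side stand-in so the wrapper can be tried without the hardware.
 * Assumption: calls always arrive in groups of three, as on the slide. */
#ifndef ALT_CI_COMPLEX_CORE_INST
static float model_a, model_b, model_c, model_d;
static int model_step;

static float complex_core_model(int n, float x, float y)
{
    float denom;
    (void)n;  /* the model ignores the selector; the hardware does not */

    switch (model_step++ % 3) {
    case 0:   /* first call: latch A and C */
        model_a = x;
        model_c = y;
        return 0.0f;
    case 1:   /* second call: latch B and D, return real part E */
        model_b = x;
        model_d = y;
        denom = model_c * model_c + model_d * model_d;
        return (model_a * model_c + model_b * model_d) / denom;
    default:  /* third call: return imaginary part F */
        denom = model_c * model_c + model_d * model_d;
        return (model_b * model_c - model_a * model_d) / denom;
    }
}
#define ALT_CI_COMPLEX_CORE_INST(n, x, y) complex_core_model((n), (x), (y))
#endif

/* Hypothetical wrapper: computes (in_A + j in_B) / (in_C + j in_D). */
static void complex_div_ise(float in_A, float in_B, float in_C, float in_D,
                            float *out_real, float *out_imag)
{
    ALT_CI_COMPLEX_CORE_INST(0, in_A, in_C);
    *out_real = ALT_CI_COMPLEX_CORE_INST(1, in_B, in_D);
    *out_imag = ALT_CI_COMPLEX_CORE_INST(0, 0, 0);
}
```

Wrapping the sequence this way keeps callers from interleaving the three calls by accident, which matters because the operands are delivered over several cycles.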

Experiment

• h(u,v) is a blurred picture taken by a telescope
– Motion blurring: long exposure time and movement of the camera, e.g. Hubble
• g(u,v) is the image to be recovered
• f(u,v) is the blurring error, called the point spread function, which can be calculated from the known movement of the target

Figure panels: h(u,v), g(u,v), f(u,v)

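Experiment

• To restore the image, the images must be transformed into the frequency domain by applying an FFT, and back again using an IFFT
• This transformation leads to complex arrays in the frequency domain that need to be divided (capital letters denote the transforms of the corresponding images):

Figure panels: h(u,v), f(u,v), g(u,v)

\[
f(u,v) * g(u,v) = h(u,v)
\qquad
G(u,v) = \frac{H(u,v)}{F(u,v)}
\]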
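The division step G = H / F amounts to one complex division per frequency-domain element. A minimal software sketch using C99 `<complex.h>` is shown here; the function name `spectrum_divide` and the flat array layout are assumptions for illustration, the FFT and IFFT steps are not included, and no special handling is done for elements where F is zero or very small.

```c
#include <complex.h>
#include <stddef.h>

/* Element-wise G[i] = H[i] / F[i] over `count` frequency-domain samples.
 * H, F and G are assumed to be the image spectra stored as flat arrays. */
void spectrum_divide(const float complex *H, const float complex *F,
                     float complex *G, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        G[i] = H[i] / F[i];   /* one complex division per element */
    }
}
```

For a 256x256 image, `count` would be 256 * 256 = 65536, so the cost of a single complex division dominates this step.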

Performance Evaluation

Approach                          Execution Time (s)   Loop Overhead (s)   Speedup
SW division                       9.17673              0.02258
ISE accelerated division          0.77180              0.02258             12.2182
SW multiplication                 6.41827              0.02273
ISE accelerated multiplication    0.76075              0.02273             8.6651
SW addition                       2.50610              0.02259
ISE accelerated addition          0.74385              0.02259             3.44344
SW subtraction                    2.58661              0.02260
ISE accelerated subtraction       0.74477              0.02260             3.55442

• Image size: 256x256 pixels
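The slide does not state how the speedup column was computed. The division row is consistent with subtracting the loop overhead from both execution times before taking the ratio; this is an inference from the numbers, and the other three rows agree with it only approximately. A small sketch of that calculation, with a hypothetical function name:

```c
/* Speedup with the loop overhead removed from both timings.
 * Assumed formula: (t_sw - t_overhead) / (t_ise - t_overhead). */
double speedup(double t_sw, double t_ise, double t_overhead)
{
    return (t_sw - t_overhead) / (t_ise - t_overhead);
}
```

For the division row, `speedup(9.17673, 0.77180, 0.02258)` evaluates to roughly 12.218, in line with the reported 12.2182.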

Hardware Resource Utilization

• Resource utilization is considerable
• The entire system requires 8864 logic elements and 27 9-bit DSP units
• The complex core requires 2520 logic elements and 23 9-bit DSP units
• Optimizing the ISE hardware to maximize reuse was essential to limiting the hardware size

Future Work

• Adding FFT and IFFT
• Accelerating other embedded complex mathematics algorithms
• Correlation of pictures
– Instead of doing a slow time-domain correlation
– Heavy complex multiplication in the frequency domain

Conclusion

• The designed ISE can be used to accelerate embedded complex mathematics operations

• Significant speedup (up to about 12x, measured for complex division)

Questions?

References

[1] Space Telescope Science Institute. (2010) carith. [Online]. Available: http://stsdas.stsci.edu/cgi-bin/gethelp.cgi?carith.hlp

[2] Altera Corporation. (2007) Nios II custom instruction user guide. [Online]. Available: http://www.altera.com/literature/tt/tt nios2 multiprocessor tutorial.pdf

[3] P. Digeser, M. Tubolino, M. Klemm, D. Shapiro, and M. Bolic, “Instruction set extension in the NIOS II: A floating point divider for complex numbers,” in CCECE, 2010.

[4] L. Pozzi and P. Ienne, “Exploiting pipelining to relax register-file port constraints of instruction-set extensions,” in CASES ’05: Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems. New York, NY, USA: ACM, 2005, pp. 2–10.