Accelerating Coherent Pulsar De-dispersion on Graphics Processing Units by Arjun Radhakrishnan supervised by Prof. Michael Inggs

Post on 17-Jan-2016

Accelerating Coherent Pulsar De-dispersion on Graphics Processing Units

by Arjun Radhakrishnan

supervised by Prof. Michael Inggs

Outline

Graphics Processing Units (GPUs)

Pulsars

Pulsar De-dispersion

Motivation

Implementation

Results

Conclusion & Future Work

Graphics Processing Units

GPUs are massively parallel processors that are present on consumer graphics cards

Generally used to render 3D objects on screen and calculate the colour of each pixel displayed

Mass-market products, driven by the video game industry

Performance tracks Moore's Law because most of the on-chip area is devoted to compute units, whereas CPUs devote much of theirs to cache

*Source: [7]

Why Use GPUs?

Figure 1: Peak floating point performance of NVIDIA GPUs vs Intel CPUs [2]

Pulsars

Highly magnetised, rapidly rotating neutron stars formed after a supernova

Pulsars emit beams of electromagnetic radiation from their magnetic poles

As the star rotates, the beams sweep around in a circle (the “lighthouse effect”)

A periodic pulse is observed each time a beam sweeps across Earth

Figure 2: Pulsar Model [3]

Pulsar Dispersion

Pulsar emissions are distorted upon passing through the ionised Interstellar Medium (ISM)

Lower frequency components of the pulse are delayed more than higher frequencies

Pulsar De-dispersion

Correct for the dispersion by shifting each frequency component of the received signal in time to undo its delay

Figure 3: Pulsar De-dispersion [4]
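The frequency dependence of the delay can be made concrete with the standard cold-plasma dispersion relation, in which the delay scales as DM / f². The sketch below is illustrative and not taken from the talk: the function name, the rounding of the dispersion constant, and the choice of band edges (1400 and 1500 MHz, i.e. a 100 MHz band centred on 1450 MHz, with a DM of 50 pc/cm³) are assumptions made for the example.

```python
# Dispersion constant for the cold-plasma approximation,
# in units of MHz^2 pc^-1 cm^3 s (commonly quoted value).
K_DM = 4.148808e3


def dispersion_delay_s(dm, f_lo_mhz, f_hi_mhz):
    """Extra arrival delay, in seconds, of f_lo relative to f_hi.

    dm        -- dispersion measure in pc/cm^3
    f_lo_mhz  -- lower frequency in MHz
    f_hi_mhz  -- higher frequency in MHz
    """
    return K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)


# Assumed example values: DM = 50 pc/cm^3 across a 1400-1500 MHz band.
smear_s = dispersion_delay_s(50.0, 1400.0, 1500.0)
print(f"delay across the band: {smear_s * 1e3:.2f} ms")
```

With these values the lowest frequency in the band arrives roughly 13.6 ms after the highest, which is the smearing that de-dispersion has to undo.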

Coherent De-dispersion

Coherent de-dispersion is the most accurate method of removing the dispersion effects of the Interstellar Medium (ISM)

Preserves the amplitude and phase information of the received signal

Convolve the voltage signal with the inverse transfer function of the ISM

The transfer function depends on the Dispersion Measure (DM) of the signal, which is obtained from models of the galactic electron density

In practice, the Fast Fourier Transform (FFT) is used to turn the convolution into a multiplication in the frequency domain, followed by an inverse FFT to return to the time domain
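A minimal sketch of the frequency-domain approach, under stated assumptions: the FFT here is a small pure-Python radix-2 routine standing in for a GPU FFT library, the block is a toy 1024 samples, and the DM is set to 0.01 pc/cm³ so that the smear fits inside one block. The chirp follows the form commonly quoted in the pulsar literature; the sign of the phase depends on FFT and sideband conventions and should be treated as an assumption. All function names are hypothetical and this is not the implementation presented in the talk.

```python
import cmath
import math

K_DM = 4.148808e3  # dispersion constant, MHz^2 pc^-1 cm^3 s


def fft(x, inverse=False):
    """Unnormalised recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ism_transfer_function(n, dm, f0_mhz, bw_mhz):
    """Unit-magnitude chirp H sampled on the n FFT bins of a complex
    baseband signal centred on f0_mhz with bandwidth bw_mhz."""
    response = []
    for k in range(n):
        # Map FFT bin k to a frequency offset in [-bw/2, bw/2) MHz.
        kk = k if k < n // 2 else k - n
        f = kk * bw_mhz / n
        # The factor 1e6 converts MHz * s into a dimensionless phase.
        phase = (2.0 * math.pi * K_DM * 1e6 * dm * f * f
                 / ((f + f0_mhz) * f0_mhz ** 2))
        response.append(cmath.exp(1j * phase))
    return response


def filter_block(samples, response):
    """Convolution as FFT, pointwise multiplication, then inverse FFT."""
    spectrum = fft(samples)
    product = [s * r for s, r in zip(spectrum, response)]
    return [v / len(product) for v in fft(product, inverse=True)]


n = 1024
H = ism_transfer_function(n, dm=0.01, f0_mhz=1450.0, bw_mhz=100.0)
H_inv = [h.conjugate() for h in H]  # |H| = 1, so the inverse is the conjugate

signal = [0j] * n
signal[100] = 1 + 0j  # a single sharp "pulse" in the voltage stream

dispersed = filter_block(signal, H)         # smearing applied by the ISM model
recovered = filter_block(dispersed, H_inv)  # coherent de-dispersion

print("peak after dispersion:", max(abs(v) for v in dispersed))
print("max recovery error:", max(abs(a - b) for a, b in zip(recovered, signal)))
```

Because the multiplication acts on the complex voltage spectrum, both amplitude and phase are restored. Note that the FFT product is a circular convolution, so a real pipeline would also need to handle the wrapped samples at block edges.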

Motivation

Why study pulsars?

A major SKA science driver:

Detection of gravitational waves and tests of strong-field relativity

Analysing black holes

GPU acceleration for MeerKAT:

Large frequency range (Low: 0.5 – 2.5 GHz, High: 8 – 14.5 GHz)

High bandwidth per polarisation (4 GHz final)

Large number of channels (16384)

>10 GB of data per second

Even more important for the SKA, where precision will be a high priority and storing the data is not feasible

Implementation Considerations

Both CPU and GPU were tested with single-precision floating point

A bottleneck for GPU computing is the time taken to send data to the GPU from main memory, so these transfers should be minimised as far as possible

Use asynchronous data transfers to hide the latency

Re-calculate values on the GPU rather than copying them across

Use shared memory on the GPU for calculations and store to global memory at the end

The source data file is simulated dual-polarisation data, generated with a DM of 50 pc/cm³ and a 100 MHz bandwidth centred on 1450 MHz

Basic Program Flow

Figure 4: Program flow

HOST: Read in Data → Allocate memory on GPU → Copy to GPU memory → Initiate GPU Kernel → Receive de-dispersed signal → Free Memory

DEVICE: Begin De-dispersion → Parallel FFT (one per branch) → V(f0) · H⁻¹(f0), V(f1) · H⁻¹(f1), ..., V(fn) · H⁻¹(fn) → Inverse FFT (one per branch) → + → Output Array → Send Data Back to Host

Results

Figure 5: Left: overall speedup (5x). Right: kernel speedup (12x).

Results

A single GPU was able to coherently de-disperse 50 MHz of bandwidth

Two GPUs were used for the full 100 MHz

Scaling across multiple GPUs was linear

Using larger transfer functions was found to increase performance, since the memory access overhead was lower

Conclusion

GPUs are significantly faster than CPUs for de-dispersion

Enabled real-time coherent de-dispersion for the dataset used

Coherent de-dispersion of a 100 MHz bandwidth signal requires multiple GPUs at present

Faster memory access would greatly improve overall speedup

Currently testing with real undetected pulsar data

Thank You!

Questions?

References

1. D. R. Lorimer and M. Kramer, Handbook of Pulsar Astronomy, Cambridge University Press, 2005.

2. NVIDIA CUDA Programming Guide.

3. D. Manchester, “CSIRO ATNF Pulsar Education Page”.

4. J. Cordes, “The SKA as a Radio Synoptic Survey Telescope: Widefield Surveys for Transients, Pulsars and ETI”, SKA Memo 97.

5. John Rowe Animation/Australia Telescope National Facility, CSIRO [Online]. http://www.atnf.csiro.au/research/pulsar/array/gallery.html

6. Cornell University Dept. of Astronomy, “Legacy Pulsars: Homepage” [Online]. http://arecibo.tc.cornell.edu/legacypulsardata/Default.aspx

7. VR-Zone, “The NVIDIA GeForce GTX 280 1GB bare” [Online]. http://vr-zone.com/articles/nvidia-geforce-gtx-280-preview/5872.html?doc=5872