International Networking for e-VLBI

Upload: bryce-henderson

Post on 01-Jan-2016



What is JIVE?

Operate the EVN correlator and support astronomers doing VLBI.

A collaboration of the major radio-astronomical research facilities in Europe, China and South Africa

A 3-year program to create a distributed astronomical instrument of inter-continental dimensions using e-VLBI, connecting up to 16 radio telescopes

Radio Astronomy

Courtesy of NRAO

Radio vs. Optical astronomy

The imaging accuracy (resolution) of a telescope: θ ≈ 1.2 λ/D (λ = wavelength, D = diameter)

Hubble Space Telescope: λ ≈ 600 nm (visible light), D = 2.4 m, θ = 0.06 arcsecond

Onsala Space Observatory: λ = 6 cm (5 GHz), D = 25 m, θ = 600 arcseconds

Wanted: a 240 km dish (the diameter needed to match Hubble's resolution at 6 cm)
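The numbers on this slide can be checked with a short sketch of the θ ≈ 1.2 λ/D rule. The function names are illustrative and not part of the talk; the factor 1.2 is the slide's own rounding.

```python
import math

# Arcseconds in one radian (about 206265).
ARCSEC_PER_RAD = math.degrees(1) * 3600


def resolution_arcsec(wavelength_m, diameter_m):
    """Angular resolution theta ~ 1.2 * lambda / D, converted to arcseconds."""
    return 1.2 * wavelength_m / diameter_m * ARCSEC_PER_RAD


def diameter_for_resolution(wavelength_m, theta_arcsec):
    """Dish diameter (m) needed to reach theta_arcsec at the given wavelength."""
    return 1.2 * wavelength_m * ARCSEC_PER_RAD / theta_arcsec


hubble = resolution_arcsec(600e-9, 2.4)    # visible light, 2.4 m mirror
onsala = resolution_arcsec(0.06, 25.0)     # 6 cm, 25 m dish
wanted = diameter_for_resolution(0.06, hubble)

print(f"Hubble: {hubble:.3f} arcsec")
print(f"Onsala: {onsala:.0f} arcsec")
print(f"Dish needed at 6 cm: {wanted / 1000:.0f} km")
```

The wavelength ratio (6 cm versus 600 nm) is 100,000, so the dish has to be 100,000 times larger than Hubble's 2.4 m mirror, which is where the 240 km comes from.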

Very Long Baseline Interferometry

• Create a huge radio telescope by using telescopes in different locations around the world at the same time

• Resolution depends on distance between dishes, milli-arcsecond level

• Sensitivity depends on dish area, time and bandwidth

• Requires atomic clock stability for timing

• Processed in a special-purpose supercomputer: the correlator, 16x 1024 Mb/s

Very Long Baseline Interferometry

• Initially (1990) we used large single-reel tapes

• Then came hard-disk packs

• And now: e-VLBI

Very Long Baseline Interferometry

“Never underestimate the bandwidth of a station wagon laden with computer tapes hurtling down the highway” (Andy Tanenbaum)

Latency: 2 weeks (shipping disks)

Latency: 150 ms (network)

Why e-VLBI

• Quick turn-around

• Rapid response

• Check data as it comes in, not weeks later (you can’t redo just 1 telescope)

• More bandwidth

• Logistics (disks delayed/deleted/damaged/destroyed)

Example: Cyg X-3

• Star + black hole

• Flares irregularly

• Timescale: days

• Left: 2 weeks late

• May: Observed flare with e-VLBI

International Year of Astronomy

Observation of J0204+15 (IYA)


Networking challenges

e-VLBI is:

• High Bandwidth: > 1 Gb/s

• Long Distance: Worldwide

• Real-time

• Long duration: 12 hours

• Can accept a little packet loss

TCP Research

• Mirror port (span)

• e-VLBI: RTT up to 375 ms

• Window size (kernel version)

• SACK bugs

• TCP tuning defeats fairness

• Conclusion: use UDP and ‘private’ connections (LP, VLAN, dark fiber)
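To see why window size matters at these round-trip times, here is a minimal sketch of the bandwidth-delay product. The 1 Gb/s rate and 375 ms RTT come from the slides; the 64 KiB window is only an illustrative un-tuned value, and the function names are made up for this example.

```python
def tcp_window_bytes(rate_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this rate and RTT full."""
    return rate_bps * rtt_s / 8


def max_tcp_rate_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


# Window needed to fill 1 Gb/s at 375 ms RTT.
needed = tcp_window_bytes(1e9, 0.375)

# Throughput ceiling with an illustrative 64 KiB window on the same path.
ceiling = max_tcp_rate_bps(64 * 1024, 0.375)

print(f"Window needed: {needed / 1e6:.1f} MB")
print(f"Ceiling with 64 KiB window: {ceiling / 1e6:.2f} Mb/s")
```

A single lost packet on such a path also costs at least one 375 ms round trip of recovery, which is part of why the talk settles on UDP over private circuits.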

Lightpaths

• Dedicated point-to-point circuit

• Based on SDH/Sonet timeslots (NOT a lambda)

• Stitched together at cross-connects

• Guaranteed bandwidth

• But also: a string of SPFs

e-VLBI

Lightpaths

• Especially the longer lightpaths have many outages

• NRENs are usually very good about announcing maintenance

• A -lot- of email.

• e-VLBI is becoming a ‘target of opportunity’ instrument, planned and unplanned observations

Telescope      CC   Bandwidth            RTT (ms)
Sheshan        CN   622M LP / 512M R     354 / 180
ATNF           AU   1G LP                343
Kashima        JP   512M R               288
Arecibo        PR   512M VLAN            154
TIGO           CL   95M R                150
Westford       US   512M R               92
Yebes          ES   512M R               42.1
Torun          PL   1G LP / 10G R        34.9
Onsala         SE   1.5G VLAN            34.2
Metsahovi      FI   10G R                32.7
Medicina       IT   1G LP                29.7
Jodrell Bank   UK   2x 1G LP             18.6
Effelsberg     DE   10G VLAN             13.5
WSRT           NL   2x 1G CWDM           0.57

Network Overview


JIVE Network Setup

The 1Gb/s speedbump

• VLBI (tape based) comes in fixed speeds, powers of 2: 128 Mb/s, 256 Mb/s, 512 Mb/s - and 1024 Mb/s

• 1024 Mb/s > 1 Gb/s! (with headers it’s more like 1030)

• Dropping packets works but is sub-optimal

• Dropping ‘tracks’ to <1 Gb/s: takes a LOT of CPU work

• Lightpaths come in ‘quanta’ of 150 Mb/s, but Ethernet doesn’t
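The mismatch between VLBI data rates and circuit sizes can be sketched as follows. The 150 Mb/s quantum is the slide's round figure, and `quanta_needed` is a name made up for this example.

```python
import math

QUANTUM_MBPS = 150  # lightpath granularity as quoted on the slide


def quanta_needed(rate_mbps):
    """Number of 150 Mb/s timeslots required to carry rate_mbps."""
    return math.ceil(rate_mbps / QUANTUM_MBPS)


for rate in (128, 256, 512, 1024, 1030):
    n = quanta_needed(rate)
    print(f"{rate:5d} Mb/s -> {n} quanta = {n * QUANTUM_MBPS} Mb/s circuit")
```

A lightpath of 7 quanta (1050 Mb/s) would carry the full 1024 Mb/s stream including headers, but a single Gigabit Ethernet port at either end still caps the rate at 1000 Mb/s.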

The Trouble with Trunking

• Standard trunking: LACP (802.3ad)

• Uses a hash of source/destination MAC, IP and/or port to choose the outgoing port

• This is to prevent re-ordering

• A single TCP/UDP stream will use only 1 link member!

• Recent Linux kernels come with bonding, ‘ifenslave’

• Round Robin traffic distribution

• Keep both halves in separate VLANs/lightpaths, as the switches in between only speak LACP

“Do NOT cross the streams!”
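The difference between hash-based trunking and round robin can be shown with a toy model. The CRC32 hash below is only a stand-in: real switches use vendor-specific hash functions, and the addresses and ports are hypothetical.

```python
import zlib
from collections import Counter


def hashed_link(src_ip, dst_ip, src_port, dst_port, n_links):
    """Pick a link from a hash of the flow identifiers (stand-in for a switch hash)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % n_links


def round_robin_link(packet_index, n_links):
    """Pick a link purely from the packet sequence number."""
    return packet_index % n_links


N_LINKS = 2
N_PACKETS = 10

# One e-VLBI UDP stream: every packet has the same addresses and ports.
hashed = Counter(
    hashed_link("192.0.2.10", "192.0.2.20", 2630, 2630, N_LINKS)
    for _ in range(N_PACKETS)
)
robin = Counter(round_robin_link(i, N_LINKS) for i in range(N_PACKETS))

print("hash-based :", dict(hashed))   # all packets land on a single link
print("round robin:", dict(robin))    # packets alternate over both links
```

Because the hash input never changes within one stream, the second link stays idle no matter how many members the trunk has. Round robin uses both links but gives up the re-ordering protection.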

CWDM from WSRT to JIVE

Much cheaper than upgrading to 10 Gb/s

All the colours of the rainbow... and then some.

[Figure: transceiver wavelengths. SX: 850 nm; LX: 1310 nm; ZX: 1550 nm; further labels: 1470 nm, 1510 nm, 1550 nm, 1570 nm, 1610 nm]

Timing is everything

• Originating network interface often has more bandwidth than the end-to-end network path

• Example: 1 Gb/s interface, 622 Mb/s lightpath

• Trying to send 520 Mb/s for e-VLBI - should fit...

• Ethernet either sends at 1 Gb/s, or 0 Gb/s

• Linux timer interrupt: 250 Hz (2.6) or 100 Hz (2.4)

• Data will be sent as bursts: 4 ms at 1 Gb/s is 512 kB

• Buffer space at choke-point is limited, so packets get dropped when queue fills up
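A rough model of the burst problem, using the slide's example figures (520 Mb/s average, 1 Gb/s interface, 622 Mb/s lightpath). It assumes the sender releases one timer tick's worth of data back-to-back at line rate, which is a simplification; the function name is illustrative.

```python
def burst_per_tick(rate_bps, tick_hz, line_rate_bps, bottleneck_bps):
    """Return (burst size in bytes, burst duration in s, queue build-up in bytes).

    Assumes all data for one timer tick leaves the interface as a single
    back-to-back burst at line rate.
    """
    tick_s = 1.0 / tick_hz
    burst_bytes = rate_bps * tick_s / 8
    burst_duration_s = burst_bytes * 8 / line_rate_bps
    # While the burst lasts, data arrives at the choke-point faster than it drains.
    queue_bytes = (line_rate_bps - bottleneck_bps) * burst_duration_s / 8
    return burst_bytes, burst_duration_s, queue_bytes


for hz in (250, 100):
    burst, duration, queue = burst_per_tick(520e6, hz, 1e9, 622e6)
    print(f"{hz} Hz tick: burst {burst / 1e3:.0f} kB in {duration * 1e3:.2f} ms, "
          f"queue grows by {queue / 1e3:.0f} kB")
```

Even though 520 Mb/s fits comfortably in 622 Mb/s on average, the switch at the choke-point needs on the order of 100 kB of buffer per burst at 250 Hz in this model, and more at 100 Hz.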

Timing is everything

• Linux timer interrupt, even at 1000 Hz, is too slow

• Use a CPU to run a calibrated delay at full speed

• And mount a big cooling fan

• Good packet spacing - problem solved?

• A real-time problem requires a real-time kernel

• High-resolution nanosleep() in 2.6.17 and beyond

• Available by default in e.g. Debian Lenny (2.6.26)

• Sleeping instead of spinning

• CPU-core load from 99% to 5% - much greener!
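The idea of sleeping between packets instead of spinning can be sketched in Python. This is not the JIVE sender, which is not shown in the talk; the 8192-byte payload is an assumed packet size, and the `clock` and `sleep` parameters exist only so the loop can be exercised without real delays.

```python
import time


def inter_packet_gap(payload_bytes, rate_bps):
    """Seconds between packet starts to average rate_bps with this payload size."""
    return payload_bytes * 8 / rate_bps


def paced_send(send, packets, gap_s, clock=time.perf_counter, sleep=time.sleep):
    """Call send(pkt) for each packet, spaced gap_s apart.

    Deadlines are absolute (previous deadline + gap) so that a late wake-up
    does not push every later packet back as well.
    """
    next_deadline = clock()
    for pkt in packets:
        delay = next_deadline - clock()
        if delay > 0:
            sleep(delay)          # yield the CPU instead of busy-waiting
        send(pkt)
        next_deadline += gap_s


# Assumed 8192-byte payloads at the 520 Mb/s from the earlier example.
gap = inter_packet_gap(8192, 520e6)
print(f"gap between packets: {gap * 1e6:.1f} microseconds")
```

How closely `time.sleep` honours gaps this short depends on the operating system's timer resolution, which is exactly the point the slide makes about high-resolution `nanosleep()`.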

Multicast - it’s not just for video

• MERLIN - Multi-Element Radio Linked Interferometer Network

• 6 UK telescopes, connected at 128 Mb/s

• Up to 4 recorded on one server

• Disk: copy the disks upon arrival

• e-VLBI: use the network to copy data

• Using a span/mirror port (ugly hack)

• Multicast 512 Mb/s UDP stream

• Needs hardware multicast support in switch/router

• Now in regular production use: ‘Merlincast’
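For readers unfamiliar with UDP multicast, a minimal sketch using the standard socket API follows. The group address and port are hypothetical placeholders, not the values used for Merlincast, and the helper names are made up for this example.

```python
import socket

GROUP = "239.1.2.3"   # hypothetical administratively scoped multicast group
PORT = 50000          # hypothetical UDP port


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group=GROUP, port=PORT):
    """Return a UDP socket that has joined the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def open_sender(ttl=1):
    """Return a UDP socket whose multicast packets survive ttl router hops."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock


# A sender would then call: open_sender().sendto(payload, (GROUP, PORT))
```

The sender transmits each packet once; the switches and routers replicate it to every receiver that joined the group, which is why hardware multicast support matters at 512 Mb/s.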

The Elliptical Robin

• Jodrell Bank ‘home’ telescope: full 1024 Mb/s

• 2 lightpaths to JIVE, 1 Gb/s each

• Use Ethernet bonding (‘Round Robin’) - distribute the packets over 2 links

• No room left over for 512 Mb/s Merlincast?

• 8 line changes to .../drivers/net/bonding/bond_main.c

• # modprobe bonding skew=4

• 5 packets on first interface, then 1 on the other
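The effect of the skewed distribution can be modelled in a few lines. This is a model of the 5:1 pattern described on the slide, not the actual kernel patch; `skew` is the author's own module parameter and is not part of the stock bonding driver. Equal-sized packets are assumed.

```python
def skewed_round_robin(n_packets, burst_first=5, burst_second=1):
    """Count packets per link when sending burst_first on link 0, then burst_second on link 1."""
    counts = [0, 0]
    cycle = burst_first + burst_second
    for i in range(n_packets):
        link = 0 if i % cycle < burst_first else 1
        counts[link] += 1
    return counts


def link_loads_mbps(total_mbps, burst_first=5, burst_second=1):
    """Average load per link for a stream of total_mbps, assuming equal-sized packets."""
    cycle = burst_first + burst_second
    return (total_mbps * burst_first / cycle, total_mbps * burst_second / cycle)


LINK_MBPS = 1000

for pattern in ((1, 1), (5, 1)):
    first, second = link_loads_mbps(1024, *pattern)
    spare = LINK_MBPS - second
    print(f"{pattern[0]}:{pattern[1]} -> {first:.0f} / {second:.0f} Mb/s, "
          f"spare on second link: {spare:.0f} Mb/s")
```

With plain 1:1 round robin each link carries 512 Mb/s and neither has 512 Mb/s to spare. With 5:1 the second link carries only about 171 Mb/s, leaving enough headroom for the Merlincast stream.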

e-VLBI today

• ToO observation of Cygnus X-3 outburst

• 3 observation epochs (May 23rd, June 10th, July 4th)

• Observations at K-band (1 cm, 22 GHz)

• Telescopes from Australia, Japan and Europe


Future e-VLBI

• More bandwidth increases sensitivity (by √B)

• Current correlator limited to 1024 Mb/s (per station)

• Researching software correlators, GPUs

• Researching FPGA-based correlators (64 Gb/s per station)

• New telescope backends

• More telescopes increase image accuracy

• Current correlator limited to 16 stations

• N * (N - 1) / 2 baselines

• FPGA-based design targeting 32 stations
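The two scaling rules on this slide, N * (N - 1) / 2 baselines and sensitivity proportional to √B, are easy to tabulate. The function names are illustrative.

```python
import math


def baselines(n_stations):
    """Number of telescope pairs the correlator must process."""
    return n_stations * (n_stations - 1) // 2


def sensitivity_gain(bandwidth_ratio):
    """Relative sensitivity improvement when bandwidth grows by bandwidth_ratio."""
    return math.sqrt(bandwidth_ratio)


print(baselines(16))   # 120 baselines for the current 16-station limit
print(baselines(32))   # 496 baselines for the 32-station target

# Going from 1024 Mb/s to 64 Gb/s per station is a factor of 62.5 in bandwidth.
print(f"{sensitivity_gain(64000 / 1024):.1f}x sensitivity")
```

Doubling the number of stations roughly quadruples the correlator workload, while a 62.5-fold bandwidth increase improves sensitivity by only about a factor of 8.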

Questions?