
Applications of Computing in Industry: What is Low Latency All About?

eFX – January 2014


Divyakant Bengani

Undergrad degree in Management and IT from Manchester

Vice President at CS, responsible for eFX Core Technologies

Working in the banking industry since 2003 & CS for ~3 years


EFX - What do we do?

Cash FX Only
  −Spot, Forwards and Swaps
Continuous Publication of Prices
  −Streaming Executable Rates
Response to Request for Quotes
Acceptance and Booking of Trades


Key Statistics

~200 Currency Pairs (e.g. EURUSD, GBPJPY)
3 billion prices broadcast a day
60,000 trades a day
>200 client connections


Technologies Used

Java
C# for UIs
GWT for Web UIs
Oracle Coherence
Oracle DB
Derby DB
Azul Zing JVM
Low Latency FIX Engine


Protocols

Socket Connections
Asynchronous JMS
Java RMI
HTTP (JSON, Hessian)


Payloads

Google Protobuf
Fixed Length Byte Arrays
FIX - Industry Standard
JMS Map Messages
Java Serialization


EFX - Overall Architecture


Service Discovery

Zero Conf
Dynamically add and remove services
Applications do not need to know about each other - just pick up what’s advertised


Automated Testing


Code Quality Analysis


Continuous Integration


How to Achieve Low Latency


Daniel Nolan-Neylan

Graduated from UCL in 2004
Started working at Credit Suisse in 2006
  −First, networking for 4 years
  −Now, Application Developer in FX IT
Different projects:
  −Distributed caching system for static data
  −Simplified credit checking library
  −Pricing and trading gateway (now team lead)

November 2011


Wait a second!

Reminder:

1 second is:
  −1,000 milliseconds
  −1,000,000 microseconds
  −1,000,000,000 nanoseconds


Latency Numbers Every Programmer Should Know

L1 cache reference                          0.5 ns
Branch mispredict                             5 ns
L2 cache reference                            7 ns           14x L1 cache
Mutex lock/unlock                            25 ns
Main memory reference                       100 ns           20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy              3,000 ns
Send 1K bytes over 1 Gbps network        10,000 ns  0.01 ms
Read 4K randomly from SSD*              150,000 ns  0.15 ms
Read 1 MB sequentially from memory      250,000 ns  0.25 ms
Round trip within same datacenter       500,000 ns   0.5 ms
Read 1 MB sequentially from SSD*      1,000,000 ns     1 ms  4x memory
Disk seek                            10,000,000 ns    10 ms  20x datacenter roundtrip
Read 1 MB sequentially from disk     20,000,000 ns    20 ms  80x memory, 20x SSD
Send packet CA->Netherlands->CA     150,000,000 ns   150 ms

By Jeff Dean: http://research.google.com/people/jeff/


FX Trading – Latency Numbers

250ms – A human responding to price update
30ms – Bank accepting trade
10ms – Credit checking client
9ms – JVM Garbage Collecting
5ms – Persisting a trade to disk
2ms – JMS networking round-trip
1ms – Raw socket networking round-trip
0.5ms – Max wire-to-wire pricing latency
0.05ms – Min pricing latency
0.005ms – Writing price to FIX engine


Optimization Quotes

Michael A. Jackson:
“The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet.”

Rob Pike:
“Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you have proven that's where the bottleneck is.”


Where to Optimize? Use a Profiler


Measuring Milliseconds and Nanoseconds in Java

Measure time taken for operations and log:
  −System.currentTimeMillis()
      Good for taking a time/date that can be compared against other systems
      Accuracy depends on OS, but 1ms accuracy achievable on modern Unix-based OS (Linux)
      Bad if more precise measurements are required
  −System.nanoTime()
      Good for sub-millisecond measurements
      Bad if comparable time with other systems required
  −Realistically, need to use both
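
One way to "use both" clocks is to capture a wall-clock timestamp for correlating with other systems and a System.nanoTime() reading for the latency figure itself. The sketch below is illustrative only: the LatencyTimer class and its method names are assumptions, not code from the eFX platform.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: pairs a wall-clock timestamp with a monotonic stopwatch. */
final class LatencyTimer {
    private final long wallClockStartMillis; // comparable with timestamps from other systems
    private final long monotonicStartNanos;  // only meaningful as a difference within this JVM

    private LatencyTimer(long wallClockStartMillis, long monotonicStartNanos) {
        this.wallClockStartMillis = wallClockStartMillis;
        this.monotonicStartNanos = monotonicStartNanos;
    }

    static LatencyTimer start() {
        return new LatencyTimer(System.currentTimeMillis(), System.nanoTime());
    }

    /** Epoch milliseconds at start; use this when correlating with logs from other hosts. */
    long wallClockStartMillis() {
        return wallClockStartMillis;
    }

    /** Elapsed time since start; use this for the latency measurement itself. */
    long elapsedNanos() {
        return System.nanoTime() - monotonicStartNanos;
    }

    long elapsedMicros() {
        return TimeUnit.NANOSECONDS.toMicros(elapsedNanos());
    }

    public static void main(String[] args) {
        LatencyTimer timer = LatencyTimer.start();
        long checksum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            checksum += i; // stand-in for the operation being measured
        }
        System.out.printf("started at %d ms since epoch, took %d us (checksum %d)%n",
                timer.wallClockStartMillis(), timer.elapsedMicros(), checksum);
    }
}
```

The nanoTime value is never logged on its own, because its origin is arbitrary and differs between JVMs; only the difference between two readings is meaningful.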


Quote Journalling – log latency of every price


Our Soak Test Harness


…and the graphs it can produce


Removing Millisecond Delays

Identify the longest-running tasks
  −Usually I/O delays
      Disk
        – Database activity
        – Synchronous logging
        – Writing files
      Network
        – Calling network services
        – Remote services far away (e.g. across the Atlantic, ~50ms)


Removing Millisecond Delays (2)

Analyze whether delays can be eliminated
  −Disk
      Database activity -> Use a cache
      Synchronous logging -> Use asynchronous logging
      Writing files -> Use buffers and write asynchronously
  −Network
      Calling network services -> Cache where possible
      Remote services far away -> Co-locate in same place
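
As an illustration of "synchronous logging -> asynchronous logging", the sketch below hands each message to a bounded queue and lets a background thread do the slow write. AsyncLogger and its API are hypothetical names chosen for this example; a production system would more likely use an established logging library's asynchronous appender.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical asynchronous logger: callers enqueue, a background thread writes. */
final class AsyncLogger implements AutoCloseable {
    // Distinct String instance used as a shutdown marker; compared by identity, not equals().
    private static final String SHUTDOWN = new String("shutdown");

    private final BlockingQueue<String> queue;
    private final Thread writer;

    AsyncLogger(int capacity, Consumer<String> slowSink) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.writer = new Thread(() -> drainTo(slowSink), "async-logger");
        this.writer.setDaemon(true);
        this.writer.start();
    }

    /** Never blocks the caller; returns false (message dropped) if the queue is full. */
    boolean log(String message) {
        return queue.offer(message);
    }

    private void drainTo(Consumer<String> slowSink) {
        try {
            while (true) {
                String message = queue.take();
                if (message == SHUTDOWN) {
                    return;
                }
                slowSink.accept(message); // the slow disk or network write happens off the hot path
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Flushes everything queued before the call, then stops the writer thread. */
    @Override
    public void close() {
        try {
            queue.put(SHUTDOWN);
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncLogger logger = new AsyncLogger(1024, System.out::println);
        logger.log("price published");
        logger.close();
    }
}
```

Dropping messages when the queue is full is a design choice made here to keep the caller's latency bounded; blocking the caller instead would preserve every message at the cost of occasional stalls.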


FX Trading – RFQ Example

E.g. Incoming request for a price, target response time is 10ms
  −Need to:
      Validate request parameters
      Internally subscribe for prices
      Obtain a globally unique transaction ID
      Perform a credit check
How to get all this done in just 10ms?


FX Trading – RFQ Example (2)

Credit check
  −Old one took 30-200ms
  −New one takes 5-10ms
      Using Caching and Co-location
Parallelize all validation
Pre-cache prices
  −by opening up price streams in advance of being required
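
A rough sketch of "parallelize all validation": if the checks are independent, they can be started together so the total time approaches that of the slowest check rather than the sum. Everything here is a stand-in: the class name, the three check methods and the AtomicLong-based ID (which is only unique within one JVM, not globally) are assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical RFQ pre-processing that runs independent checks concurrently. */
final class ParallelRfqValidation {
    private static final AtomicLong NEXT_ID = new AtomicLong(1);

    // Stand-ins for the real checks, which would call caches or other services.
    static boolean validParameters(String currencyPair) {
        return currencyPair != null && currencyPair.length() == 6;
    }

    static boolean creditOk(String client) {
        return client != null && !client.isEmpty();
    }

    static long newTransactionId() {
        return NEXT_ID.getAndIncrement();
    }

    /** Returns a positive transaction ID if every check passes, or -1 if any check fails. */
    static long process(String client, String currencyPair) {
        CompletableFuture<Boolean> params =
                CompletableFuture.supplyAsync(() -> validParameters(currencyPair));
        CompletableFuture<Boolean> credit =
                CompletableFuture.supplyAsync(() -> creditOk(client));
        CompletableFuture<Long> txId =
                CompletableFuture.supplyAsync(ParallelRfqValidation::newTransactionId);

        // Wait once for all three; elapsed time is roughly that of the slowest step.
        CompletableFuture.allOf(params, credit, txId).join();
        return (params.join() && credit.join()) ? txId.join() : -1L;
    }

    public static void main(String[] args) {
        System.out.println(process("client-a", "EURUSD") > 0); // all checks pass
        System.out.println(process("client-a", "EUR"));        // bad currency pair
    }
}
```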


Don’t Optimize Too Soon

Remember:
  −Only optimize what you need to optimize
  −Remove longest delays first
      No point removing micros if you still have delays of millis or worse
  −Always measure your operations carefully
      Determine the minimum, maximum, mean, standard deviation, and percentiles (99%, 99.9%, etc.)
  −Watch for jitter and solve separately
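
The statistics named above can be computed from a batch of recorded latencies along these lines. This is a minimal sketch: LatencyStats is a hypothetical class, the standard deviation is the population form, and percentiles use the nearest-rank method, which is one of several common definitions.

```java
import java.util.Arrays;

/** Hypothetical summary of a set of latency samples (for example, in nanoseconds). */
final class LatencyStats {
    final long min;
    final long max;
    final double mean;
    final double stdDev; // population standard deviation
    private final long[] sorted;

    LatencyStats(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        sorted = samples.clone();
        Arrays.sort(sorted);
        min = sorted[0];
        max = sorted[sorted.length - 1];

        double sum = 0;
        for (long s : sorted) {
            sum += s;
        }
        mean = sum / sorted.length;

        double squaredDiffs = 0;
        for (long s : sorted) {
            squaredDiffs += (s - mean) * (s - mean);
        }
        stdDev = Math.sqrt(squaredDiffs / sorted.length);
    }

    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    long percentile(double p) {
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        rank = Math.min(Math.max(rank, 1), sorted.length);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        long[] samples = {50_000, 52_000, 49_000, 51_000, 9_000_000}; // one GC-sized outlier
        LatencyStats stats = new LatencyStats(samples);
        System.out.printf("min=%d max=%d mean=%.0f p99=%d%n",
                stats.min, stats.max, stats.mean, stats.percentile(99));
    }
}
```

The sample data in main shows why the mean alone is misleading: a single outlier dominates it, while the minimum and lower percentiles still describe the typical case.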


Removing Microsecond Delays

Intra-process delays
  −Unbalanced / slow queues
  −Slow algorithms
      Expensive loops repeated many times
      Poor use of object creation / memory allocation
      Contended memory controlled with locks
      Wasted effort calculating unwanted results


FX Trading – Pricing Example

Achieving wire-to-wire latencies of 50μs
  −Google protobuf parsers replaced with versions that create little garbage
      Each GC stops the JVM for 9,000μs (i.e. 9ms)
  −LMAX Disruptors used instead of queues
      Busy spin consumer threads / single-writer principle
  −“PriceBigDecimal” class to replace Java BigDecimal class
      BigDecimal slow to instantiate and impossible to mutate
  −No synchronous logging or network calls
  −Pre-cache static data before starting price stream
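
The slides do not show the “PriceBigDecimal” class, so the following is only a guess at the general idea: a mutable fixed-point value held in a long plus a scale, which can be reused across price updates instead of allocating a new BigDecimal each time. The name MutablePrice and all of its methods are hypothetical.

```java
/** Hypothetical mutable fixed-point price: value = unscaled / 10^scale. */
final class MutablePrice {
    private long unscaled; // e.g. 135791 with scale 5 represents 1.35791
    private int scale;

    /** Overwrites this instance in place so one object can be reused for every update. */
    MutablePrice set(long unscaled, int scale) {
        if (scale < 0 || scale > 18) {
            throw new IllegalArgumentException("scale out of range: " + scale);
        }
        this.unscaled = unscaled;
        this.scale = scale;
        return this;
    }

    /** Adds another price in place; both must share a scale so no rescaling is needed. */
    MutablePrice add(MutablePrice other) {
        if (other.scale != scale) {
            throw new IllegalArgumentException("scale mismatch");
        }
        unscaled = Math.addExact(unscaled, other.unscaled);
        return this;
    }

    long unscaled() {
        return unscaled;
    }

    int scale() {
        return scale;
    }

    /** Appends the decimal text to a caller-supplied buffer instead of building a new String. */
    StringBuilder appendTo(StringBuilder out) {
        long divisor = 1;
        for (int i = 0; i < scale; i++) {
            divisor *= 10;
        }
        long abs = Math.abs(unscaled); // assumes unscaled != Long.MIN_VALUE
        if (unscaled < 0) {
            out.append('-');
        }
        out.append(abs / divisor);
        if (scale > 0) {
            out.append('.');
            long fraction = abs % divisor;
            for (long d = divisor / 10; d > 0; d /= 10) {
                out.append((char) ('0' + (fraction / d) % 10)); // keeps leading zeros
            }
        }
        return out;
    }

    public static void main(String[] args) {
        MutablePrice bid = new MutablePrice().set(135791, 5);
        MutablePrice spread = new MutablePrice().set(9, 5);
        System.out.println(bid.add(spread).appendTo(new StringBuilder())); // prints 1.35800
    }
}
```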


Disruptor or Blocking Queues?


Java BigDecimal or use Low Latency replacement?


Removing Nanoseconds?

Use specialist hardware (such as FPGA)
Understand low-level CPU interconnectivity with memory, and how CPU caching works (including cache-lines)
  −http://mechanical-sympathy.blogspot.com
eFX – No need to pursue this level of performance at the moment


Latency vs Throughput

Latency - time taken (typically mean, percentile or worst case) to complete a task

Throughput – the number of tasks completed in a given time period (typically, per second)

Throughput is 1/latency (per pipeline)


Increasing Throughput

Identify delays
  −Throughput constrained by latency
  −Blocking I/O calls delay unprocessed messages
Data bursts
  −What’s the peak throughput required?
  −What’s the gap typically between bursts?


Techniques to Increase Throughput

Batching
  −Sometimes high-latency calls are unavoidable
  −Using batching can strip overhead of making call per transaction
  −Cost of batching is the delay incurred waiting for new items to add to batch
  −More difficult to accurately measure delay per item when multiple items are in a batch
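
The trade-off described above (fewer calls versus time spent waiting for the batch to fill) can be made explicit with a size limit and a wait limit. The Batcher class below is a hypothetical sketch of that idea, not the mechanism used in the eFX system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical batcher: groups items so one slow call can carry several of them. */
final class Batcher<T> {
    private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
    private final int maxBatchSize;
    private final long maxWaitNanos;

    Batcher(int maxBatchSize, long maxWait, TimeUnit unit) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.maxBatchSize = maxBatchSize;
        this.maxWaitNanos = unit.toNanos(maxWait);
    }

    void submit(T item) {
        pending.add(item);
    }

    /**
     * Blocks until at least one item is available, then keeps collecting until the batch
     * is full or maxWait has elapsed. The wait is the latency cost paid for batching.
     * Returns whatever was collected so far (possibly empty) if the thread is interrupted.
     */
    List<T> nextBatch() {
        List<T> batch = new ArrayList<>(maxBatchSize);
        try {
            batch.add(pending.take());
            long deadline = System.nanoTime() + maxWaitNanos;
            while (batch.size() < maxBatchSize) {
                long remaining = deadline - System.nanoTime();
                T next = pending.poll(Math.max(remaining, 0L), TimeUnit.NANOSECONDS);
                if (next == null) {
                    break; // wait limit reached with nothing more to add
                }
                batch.add(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return batch;
    }

    public static void main(String[] args) {
        Batcher<String> batcher = new Batcher<>(5, 1, TimeUnit.MILLISECONDS);
        for (int i = 1; i <= 7; i++) {
            batcher.submit("trade-" + i);
        }
        System.out.println(batcher.nextBatch()); // first five trades
        System.out.println(batcher.nextBatch()); // remaining two, after the wait limit
    }
}
```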


FX Trading – Batching Example

Legacy global server in London
Regional trade acceptance components
Latency between New York and London - 50ms
Per thread: 1/0.05 = 20 trades per second max
How to increase?
  −More threads
  −Add batching per thread
Now, with batch size of 5, 100 trades per second per thread.
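
The arithmetic behind these figures, written out as a small calculation. The class and method names are illustrative; the formula simply assumes each thread makes one blocking round trip at a time and that a batch costs the same round trip as a single trade.

```java
/** Worked version of the slide's throughput figures for blocking calls. */
final class BatchingThroughput {
    /** Upper bound on trades per second when each blocking round trip carries batchSize trades. */
    static double maxTradesPerSecond(long roundTripMillis, int batchSize, int threads) {
        double roundTripsPerSecondPerThread = 1000.0 / roundTripMillis;
        return roundTripsPerSecondPerThread * batchSize * threads;
    }

    public static void main(String[] args) {
        System.out.println(maxTradesPerSecond(50, 1, 1)); // 20.0: one trade per 50ms round trip
        System.out.println(maxTradesPerSecond(50, 5, 1)); // 100.0: five trades per round trip
    }
}
```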


Techniques to Increase Throughput (2)

Use Asynchronous callbacks
  −Synchronous calls:
      boolean doCall()
      Wait for response
      Can be delayed for varying time
  −Asynchronous calls:
      void doCall(Callback callback)
      Do not wait and keep processing more events
      Can additionally overlay timeouts to improve resilience
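
A sketch of the asynchronous shape with a timeout overlaid, using CompletableFuture from the JDK (orTimeout requires Java 9 or later). The Callback interface, the simulated remote call and the extra delay/timeout parameters are assumptions added so the example can run on its own.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical asynchronous call with a timeout; the caller never blocks. */
final class AsyncVerification {
    /** Exactly one of the two arguments is non-null when the callback fires. */
    interface Callback {
        void onComplete(Boolean accepted, Throwable error);
    }

    /** Simulated remote service that takes remoteDelayMillis to answer. */
    private static boolean slowRemoteVerify(long remoteDelayMillis) {
        try {
            Thread.sleep(remoteDelayMillis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns immediately; the callback fires later with the answer or a timeout error. */
    static void doCall(Callback callback, long remoteDelayMillis, long timeoutMillis) {
        CompletableFuture
                .supplyAsync(() -> slowRemoteVerify(remoteDelayMillis))
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .whenComplete(callback::onComplete);
    }

    public static void main(String[] args) throws InterruptedException {
        doCall((accepted, error) ->
                System.out.println(error == null ? "accepted=" + accepted : "failed: " + error),
                50, 500);
        System.out.println("caller keeps processing other events");
        Thread.sleep(200); // keep the JVM alive long enough for the callback in this demo
    }
}
```

Note that the timeout only stops the caller from waiting indefinitely for the result; in this sketch the simulated remote work itself is not cancelled.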


FX Trading – Asynchronous Callbacks

Submission of trade to price service for verification – was originally synchronous
Call blocks for 50ms – max 20 trades per second per thread
After converting to asynchronous callbacks, the only delay is putting packets on network buffer (μs), so effectively no delay – max number of trades is very high!


Q & A

eFX – January 2014