the devops paas infusion - may meetup

Cloud I/O in FS John Davies 17th May 2012 Monday, 21 May 12

Upload: norm-leitman

Post on 18-Nov-2014


TRANSCRIPT

Page 1: The DevOps PaaS Infusion - May meetup

Cloud I/O in FS

John Davies • 17th May 2012

Monday, 21 May 12

Page 2: The DevOps PaaS Infusion - May meetup

© 2012 C24 Technologies Confidential Information of C24 Technologies

Agenda

•A quick look at data volumes in the front-office

•Front office enterprise architecture

•How can cloud help in this low-latency environment?

•Getting data into GigaSpaces

•A few examples

Page 3: The DevOps PaaS Infusion - May meetup

Clouds are old!

•Most of us (here) have been using “grid” for a good decade

• Cloud isn’t much different, it’s just a little more fluffy!

•Having computing resources in the cloud doesn’t solve integration issues

• In fact it just means they need to handle higher volumes

•Getting Financial Services messages into a cloud for processing requires some clever integration technology

•Even once it’s in the cloud you need some clever technology to make the best of what you have

Page 4: The DevOps PaaS Infusion - May meetup

Front-Office

•One area that may not seem ideal for cloud solutions is the front office

•Latency is critical: the 100 ms latency to the cloud would feel like years to your average arbitrage trader

•But the split-(milli)-second decisions made by the algo trading engines need to be based on reliable information

•The cloud is the perfect place to perform these operations...

Page 5: The DevOps PaaS Infusion - May meetup

We need more CPU power!

•The graph below shows the Dow Jones daily trading volumes since 1980; the y-axis is logarithmic

• Log(vol) vs time, i.e. 8 = 100m, 9 = 1 billion, 10 = 10 billion (per day)
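As a quick check on how to read that axis, here is a minimal sketch of the conversion between daily volume and the base-10 log scale. The function names are made up for illustration; only the 8 / 9 / 10 reference points come from the slide.

```python
import math


def log_volume(shares_per_day: float) -> float:
    """Value plotted on the y-axis: log10 of the daily trading volume."""
    return math.log10(shares_per_day)


def volume_from_log(y: float) -> float:
    """Inverse mapping: turn a y-axis reading back into shares per day."""
    return 10 ** y


if __name__ == "__main__":
    for volume in (100e6, 1e9, 10e9):
        print(f"{volume:>16,.0f} shares/day -> y = {log_volume(volume):.1f}")
```

Each step of 1 on the axis is a tenfold increase in volume, which is why a straight-looking line on this graph means exponential growth.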

Page 6: The DevOps PaaS Infusion - May meetup

Algo-trading

• It’s not that complicated... (in theory anyway)

• You need to have access to data from the exchange, come up with a trading strategy (algorithm), write the code (usually in C, Python, R or something similar), deploy it to a machine as close as you can physically get to the exchange (co-hosting)

• ... and collect your money :-)

•The algorithm is basically a program that says something like

• If AAPL < opening price and MSFT > 30min moving avg and MSFT > opening price then buy MSFT, sell AAPL

•But it can get way more complex

• Predictive models based on market data feed harmonics, trying to anticipate where the market will be in 200 µs

• Correlation trading of stocks that appear unrelated, but mathematically correlate
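The simple rule on this slide can be sketched in a few lines. This is only an illustration of the "if ... then buy/sell" shape: the class, function and parameter names are invented here, and a real engine would be fed by the exchange's market data rather than by hand-typed prices.

```python
from collections import deque


class MovingAverage:
    """Simple moving average over the most recent `window` prices."""

    def __init__(self, window: int):
        self.prices = deque(maxlen=window)  # old prices fall off automatically

    def update(self, price: float) -> float:
        self.prices.append(price)
        return sum(self.prices) / len(self.prices)


def decide(aapl_last, aapl_open, msft_last, msft_open, msft_avg_30min):
    """Return the orders implied by the slide's example rule (possibly none)."""
    if (aapl_last < aapl_open
            and msft_last > msft_avg_30min
            and msft_last > msft_open):
        return [("BUY", "MSFT"), ("SELL", "AAPL")]
    return []


if __name__ == "__main__":
    avg = MovingAverage(window=3)   # a tiny window, just for the demo
    for price in (30.0, 30.4, 30.6):
        msft_avg = avg.update(price)
    print(decide(aapl_last=99.0, aapl_open=100.0,
                 msft_last=31.0, msft_open=30.0, msft_avg_30min=msft_avg))
```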

Page 7: The DevOps PaaS Infusion - May meetup

So first get the data...

•Protocols to connect - each exchange is different!

• FIX, ITCH, OUCH, PINCH, SCRATCH; many are optimised for performance

• Proprietary APIs can reduce latency to the order of 20 µs

•FIX / FAST is a good standard approach; it comes in several versions and the latest can use 3 different encodings...

• Standard FIX (tag/value pairs, all tags are integers)

• FIXML (a very verbose XMLized version of the above)

• FAST (FIX Adapted for STreaming, like the standard version but compressed)

•When you’ve worked that out there are the venue-specific dialects that need tuning for each exchange

Page 8: The DevOps PaaS Infusion - May meetup

Now the OS & Language...

•Since the FIX engine is connecting to your back-end servers you usually have to make a choice between...

•C / C++

• Still regarded as the fastest FIX engines and usually the choice for the arbitrage traders

• Latency is reliable (i.e. no garbage collection)

• Supporting all the different versions of Linux, UNIX, MS, 32bit, 64bit etc. is a real pain

•Java

• Surprisingly only 3rd place; Java FIX engines are very fast, but unless they are carefully designed garbage collection can be a major issue

• Easiest to integrate into other architecture, most flexible

•.NET

• The most popular, simply because most small businesses start off on Microsoft platforms (Excel etc.) and this fits in best

• Less used in the larger businesses (banks, major firms etc.)

Page 9: The DevOps PaaS Infusion - May meetup

Enterprise Architecture...

• It’s not exactly complicated from an enterprise viewpoint

•Offices are often in “nice” locations

• Of course it’s nothing to do with the favourable tax

•Most of the infrastructure runs in co-located boxes

• Co-hosting costs money and limits what you can do

[Diagram: a Company FIX Engine and a Company Trading Engine at the New York Stock Exchange (NYSE) and at the Chicago Mercantile Exchange (CME), plus the Company Office]

Page 10: The DevOps PaaS Infusion - May meetup

So what about the cloud?

•So far no cloud, so where does it come into the picture?

• And I don’t mean the dark clouds appearing over the tax-havens

•The trader’s view of the world is a little window just a few milliseconds wide; what if we could expand that?

• It’s like watching a fast movie with no controls: you sneeze as a shot’s fired and you’ve missed half the plot (welcome to modern movies)

•What if we could skip back an hour or a day and replay scenarios through our algo trading engines?

Page 11: The DevOps PaaS Infusion - May meetup

First store the data...

•Tick data is the raw price feed from the exchange

• 8=FIXT.1.1^A9=0^A35=X^A49=CME^A34=2127825^A52=20100120150049656^A1128=8^A268=1^A279=0^A269=0^A48=109291^A22=8^A270=115060^A271=1^A273=150049000^A336=2^A346=1^A83=27750^A1023=2^A75=20081117^A10=000^A

•This needs to be stored for legal reasons as you have to be able to demonstrate “best execution” for clients

• We’ve seen the volumes earlier; we don’t need everything on the exchange but certainly everything we’re trading

•Typically these are tens of gigabytes per day per exchange

•An interesting solution is to ship it up to EC2 and store it on EBS and S3
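To make the tick message above a little less opaque, here is a rough sketch of splitting a tag/value FIX message into fields. The `^A` on the slide is the printable form of the SOH (0x01) delimiter. The function name and the small tag-name table are illustrative only; a real FIX engine also handles repeating groups, checksums and the venue dialect.

```python
from typing import List, Tuple

SOH = "\x01"  # FIX field delimiter, shown as ^A on the slide

# A handful of standard FIX tag names, just enough for the demo.
TAG_NAMES = {8: "BeginString", 34: "MsgSeqNum", 35: "MsgType",
             49: "SenderCompID", 52: "SendingTime",
             270: "MDEntryPx", 271: "MDEntrySize"}


def parse_fix(raw: str) -> List[Tuple[int, str]]:
    """Split a tag/value message into (tag, value) pairs, keeping order.

    A list is returned rather than a dict because tags can legitimately
    repeat inside repeating groups.
    """
    fields = []
    for pair in raw.replace("^A", SOH).split(SOH):
        if not pair:
            continue  # skip the empty piece after the trailing delimiter
        tag, _, value = pair.partition("=")
        fields.append((int(tag), value))
    return fields


if __name__ == "__main__":
    tick = ("8=FIXT.1.1^A9=0^A35=X^A49=CME^A34=2127825^A"
            "52=20100120150049656^A270=115060^A271=1^A")
    for tag, value in parse_fix(tick):
        print(f"{tag:>5} {TAG_NAMES.get(tag, '?'):<14} {value}")
```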

Page 12: The DevOps PaaS Infusion - May meetup

Shipping to Amazon

•There are two mechanisms to get data up onto Amazon’s EC2, streaming and batch

•Batch is more efficient, we compress the data hourly and “scp” it up to an EC2 box

-rw-r--r-- archive onix 12710903 7 Dec 2011 OrderBooksRepository_Channel_7_summary_20111125-14_00_54_154.gz
-rw-r--r-- archive onix 19452739 7 Dec 2011 OrderBooksRepository_Channel_7_summary_20111125-15_00_54_785.gz
-rw-r--r-- archive onix 27549005 7 Dec 2011 OrderBooksRepository_Channel_7_summary_20111125-16_00_55_417.gz

•Streaming adds the interesting advantage of having near-realtime data on EC2

• Not to mention an EC2 box doing little more than writing to disk

•So we started to add monitoring processes to the data

• Triggers, statistics, filters, aggregators etc.
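The batch route could look roughly like the sketch below: gzip the hour's capture file, then hand it to `scp`. This is not the script used in the talk. The function names, the remote user, host and path are all placeholders, and the command is only built here, not executed.

```python
import gzip
import shutil
from pathlib import Path
from typing import List


def compress_hour(src: Path) -> Path:
    """Gzip one hourly capture file next to the original and return the .gz path."""
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream, so large files never sit in memory
    return dst


def scp_command(archive: Path, remote: str) -> List[str]:
    """Build the scp argument list; `remote` is a hypothetical user@host:/path."""
    return ["scp", str(archive), remote]


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        capture = Path(tmp) / "channel_7_summary.txt"  # placeholder file name
        capture.write_text("8=FIXT.1.1|35=X|49=CME|\n" * 1000)
        archive = compress_hour(capture)
        print(archive.name, archive.stat().st_size, "bytes")
        print(" ".join(scp_command(archive, "archive@ec2-host.example.com:/data/ticks/")))
```

To actually ship the file, the argument list can be passed to `subprocess.run(cmd, check=True)` from an hourly scheduled job.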

Page 13: The DevOps PaaS Infusion - May meetup

Terabytes to Petabytes of data

•Tens of gigabytes per channel, per exchange, per day

• Several terabytes per channel per exchange per year

• With 20-plus exchanges (28 in one example) we’re into petabytes per year

•Fortunately it compresses well (10:1) so we can “archive” it as tar/gz and load it on-demand into our applications

• Recent data is loaded onto EBS, “old” data onto S3

• Data is mounted from EBS drives on demand

•We now have a few interesting possibilities...

• We can feed the data back for back-testing on-site

• We can sort/filter/analyse it in the cloud

• We can run the back-testing in the cloud
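A back-of-the-envelope version of the sizing above, as a sketch. Only the 28 exchanges and the 10:1 compression ratio come from the slides; the 20 GB per day, 252 trading days and 10 channels per exchange are assumptions picked to match "tens of gigabytes" and to show how the total reaches petabytes.

```python
GB_PER_CHANNEL_PER_DAY = 20   # assumption: "tens of gigabytes"
TRADING_DAYS_PER_YEAR = 252   # assumption: typical trading calendar
EXCHANGES = 28                # from the slide's example
CHANNELS_PER_EXCHANGE = 10    # assumption: not stated in the talk
COMPRESSION_RATIO = 10        # 10:1, from the slide


def yearly_volumes():
    """Return (TB per channel per year, raw PB per year, compressed TB per year)."""
    tb_per_channel = GB_PER_CHANNEL_PER_DAY * TRADING_DAYS_PER_YEAR / 1000
    raw_tb_total = tb_per_channel * EXCHANGES * CHANNELS_PER_EXCHANGE
    return tb_per_channel, raw_tb_total / 1000, raw_tb_total / COMPRESSION_RATIO


if __name__ == "__main__":
    per_channel, raw_pb, compressed_tb = yearly_volumes()
    print(f"{per_channel:.2f} TB per channel per exchange per year")
    print(f"{raw_pb:.2f} PB raw per year across all exchanges")
    print(f"{compressed_tb:.1f} TB per year after 10:1 compression")
```

With these inputs a single channel comes to about 5 TB a year, the raw total to roughly 1.4 PB, and the compressed archive to around 140 TB, which is why archiving to S3 and loading on demand is attractive.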

Page 14: The DevOps PaaS Infusion - May meetup

Introducing the Technologies

•C24 - Integration Objects

• Basically a Java-binding tool with built-in messaging standards for financial services

• For example FIX, FpML, ISO 20022, SEPA, SWIFT etc. as Java APIs and self-contained objects that can self-validate

•GigaSpaces

• The best implementation of Sun’s (now Oracle’s) Jini/JavaSpaces

• Powerful distributed implementation of the Master/Worker pattern

•C24 + GigaSpaces

• The ability to onboard / work with financial services messages (as above) directly in GigaSpaces with minimal work

• Take huge amounts of data, parse it (with C24-iO) and insert it into GigaSpaces

• The “workers” can “take” data from the “space”, execute the task and “write” back to the “space”
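The take / execute / write cycle can be illustrated with a toy, single-process stand-in for the space. To be clear, this is not the GigaSpaces or JavaSpaces API: the `Space` class and its method names are invented for the sketch, and two in-memory queues play the role of the distributed space.

```python
import queue
import threading


class Space:
    """In-memory stand-in for a shared space holding tasks and results."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._results = queue.Queue()

    def write_task(self, task):
        self._tasks.put(task)

    def take_task(self, timeout=0.1):
        # Removes and returns a task; raises queue.Empty if none arrives in time.
        return self._tasks.get(timeout=timeout)

    def write_result(self, result):
        self._results.put(result)

    def take_result(self, timeout=1.0):
        return self._results.get(timeout=timeout)


def worker(space, process):
    """Take a task, execute it, write the result back; stop when the space is empty."""
    while True:
        try:
            task = space.take_task()
        except queue.Empty:
            return
        space.write_result(process(task))


def run_master(tasks, process, n_workers=4):
    """Master: write all tasks, start the workers, then collect one result per task."""
    space = Space()
    for task in tasks:
        space.write_task(task)
    threads = [threading.Thread(target=worker, args=(space, process))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [space.take_result() for _ in range(len(tasks))]


if __name__ == "__main__":
    # Results come back in completion order, so sort them for display.
    print(sorted(run_master([1, 2, 3, 4], lambda x: x * x)))
```

The point of the pattern is that the master never talks to a worker directly; both only read from and write to the space, so adding workers needs no change to the master.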

Page 15: The DevOps PaaS Infusion - May meetup

FIX 5.0 PostTrade - Trade Capture Report...

Page 16: The DevOps PaaS Infusion - May meetup

A few clicks and a little code

•We click on the message library we need and deploy the code

•The result can now be used to insert FIX messages (for example) into GigaSpaces

•We can write using a very simple Java API (including an ESB such as Mule) or Spring

•A very simple Master/Worker pattern can be deployed into GigaSpaces to process/filter/enrich/sort the messages

Page 17: The DevOps PaaS Infusion - May meetup

Spring Integration

•Spring makes it all look very easy...

<!-- Source factory for reading FIX input as ASCII text -->
<bean id="FixSourceFactory" class="biz.c24.io.spring.source.FixSourceFactory">
    <property name="encoding" value="ASCII"/>
</bean>

<!-- C24 model for the FIX 5.0 SP2 Trade Capture Report -->
<c24:model id="inputFix" base-element="biz.c24.io.fix50sp2.TradeCaptureReportElement" />

<!-- Poll the input directory for *.fix files -->
<file:inbound-channel-adapter
    id="filesIn"
    directory="file:/Users/jdavies/dev/Spring_C24/spring-integration-samples/input"
    filename-pattern="*.fix">
    <int:poller id="poller" fixed-delay="0"/>
</file:inbound-channel-adapter>

<!-- Parse each file with the model above and send the result to the Fix-Space channel -->
<int-c24:unmarshalling-transformer
    source-factory-ref="FixSourceFactory"
    model-ref="inputFix"
    input-channel="filesIn"
    output-channel="Fix-Space" />

Page 18: The DevOps PaaS Infusion - May meetup

GigaSpaces does the rest

•We now have a space full of Java objects that represent the FIX messages

•We can now use generic workers or Map/Reduce to sort/process the messages

•The same architecture works on your laptop, a server, or a 4,000-CPU grid in the cloud

• The latter is obviously a lot faster :-)
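As a rough picture of the Map/Reduce option, here is a sketch that totals traded quantity per symbol. It runs in one process over plain dictionaries; the `symbol` and `qty` field names are stand-ins chosen for the example, not fields of the C24 objects, and none of this is GigaSpaces' own API.

```python
from collections import defaultdict
from functools import reduce


def map_phase(messages):
    """Map: emit one (symbol, quantity) pair per message."""
    for msg in messages:
        yield msg["symbol"], msg["qty"]


def shuffle(pairs):
    """Group the emitted values by key, as the grid would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each symbol's quantities into a single total."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in groups.items()}


if __name__ == "__main__":
    trades = [
        {"symbol": "MSFT", "qty": 100},
        {"symbol": "AAPL", "qty": 50},
        {"symbol": "MSFT", "qty": 200},
    ]
    totals = reduce_phase(shuffle(map_phase(trades)))
    for symbol in sorted(totals):
        print(symbol, totals[symbol])
```

On a grid the map and reduce steps run on many nodes at once, but the shape of the code stays the same, which is the "laptop to 4,000 CPUs" point.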

Page 19: The DevOps PaaS Infusion - May meetup

Some examples

•Front Office

• Huge usage of GigaSpaces, but latency tends to prohibit cloud; however, backtesting and post-trade processing are increasingly looking towards the cloud

•Middle Office

• Matching and reconciliation, “what if?” calculations, anti-fraud

• Data tends to need to be tokenised to conform to PCI regulations

•Prime Brokerage

• Large CSV files arriving via FTP

• Parsing, validation, enrichment, transformation & reconciliation etc.

•Payments

• Loyalty & offer calculations

Page 20: The DevOps PaaS Infusion - May meetup

Thank you

Financial Services in the cloud
