"an introduction to kx technology - a big data solution", kyra coyne, data scientist at...

Kx Technology – a Big Data Solution Kx Community Zurich Meetup Kyra Coyne November 2016

Upload: maya-lumbroso

Post on 06-Jan-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: "An introduction to Kx Technology - a Big Data solution", Kyra Coyne, Data Scientist at First Derivatives plc

Kx Technology – a Big Data Solution
Kx Community Zurich Meetup

Kyra Coyne
November 2016

Page 2

About Kx

• Global company, division of First Derivatives plc (listed on LSE)
• Large user community
• Widely adopted in financial services over two decades
• Software & industry solutions, consulting and implementation services

Known for:
• Processing and analysis of large volumes of real-time and historical time series data
• Extreme performance (low latency)
• Integrates with and co-exists with other technologies
• Ability to scale without requiring significant infrastructure

Page 3

About the Technology

• Integrated in-memory, columnar database & programming system
• Streaming, real-time and historical data
• Map-Reduce built-in
• Native time-series functions
• Light-weight (~500 KB)
• Standard OS & hardware
• Extreme Performance
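To make the "columnar" bullet above concrete, here is a minimal Python sketch contrasting a row-oriented layout with a column-oriented one. It is not q and it does not describe how kdb+ is implemented; the table, field names and values are invented for illustration.

```python
# Hypothetical trade data; field names and values are illustrative only.
rows = [
    {"sym": "AAA", "price": 10.0, "size": 100},
    {"sym": "BBB", "price": 20.0, "size": 200},
    {"sym": "AAA", "price": 11.0, "size": 300},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {key: [] for key in records[0]}
    for record in records:
        for key, value in record.items():
            columns[key].append(value)
    return columns


columns = to_columns(rows)

# Row layout: an aggregate over one field still visits every whole record.
total_size_rows = sum(record["size"] for record in rows)

# Column layout: the same aggregate reads a single list and nothing else.
total_size_cols = sum(columns["size"])

print(columns["sym"])
print(total_size_rows, total_size_cols)
```

Both totals are the same; the difference is how much unrelated data has to be touched to compute them, which is one reason column stores tend to suit time-series aggregations.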

Page 4

Kx Technology

WHAT IS Kx TECHNOLOGY
• Integrated columnar database & programming system
• Streaming, real time and historical data
• Built for massive data volumes

KEY FEATURES
• In-database analytics
• Parallelism
• Compression

q PROGRAMMING LANGUAGE
• Interpreted
• Event-driven
• Functional
• Array / Vector
• Query
• Time-series

WHY Kx TECHNOLOGY?

HIGH PERFORMANCE, LOW LATENCY
• We are fast not only because of our data architecture: our native programming language, q, runs inside the database, not in separate processes with costly data passing.

POWER
• We are one of the few fully 64-bit databases, and unique in having time as a native type, with nanosecond resolution and a full set of operations over time.

QUICK TO DELIVER
• Unlike many compile-link-and-run approaches, q is dynamic, allowing much shorter development and deployment cycles.

Page 5

The Big Data Landscape


• 20 year track record; mission critical systems

• Streaming, real-time and historical data

• Processing and analyzing data in microseconds

• Hundreds of millions of transactions per second

• Terabytes to petabytes

• Trusted globally by largest institutions

Page 6

COST
• Fully transparent costing models
• Reduction in personnel, training, hardware and facilities costs for clients

SECURITY
• Robust, high performance infrastructure
• Highly Secure
• Comprehensive disaster recovery and business continuity planning

SERVICE
• Best Practice and processes
• Large pool of highly skilled engineers

RISK
• Scalable model to respond to changes in demand

Page 7

Downstream Case Study

• Large Canadian Utility (IESO)
• Meter Data Management System
• Processes 4.7 million meters
• 120 million meter readings per day
• Oracle RDBMS has 300+ billion records
• Could not accommodate demand for analytics
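As a rough sanity check on the scale quoted in this case study, the short Python calculation below derives readings per meter per day and the amount of history implied by the record count. It assumes, beyond what the slide states, that each stored record corresponds to one meter reading and that "300+ billion" is treated as a lower bound of exactly 300 billion.

```python
# Figures quoted on the slide.
meters = 4_700_000
readings_per_day = 120_000_000
stored_records = 300_000_000_000  # "300+ billion", used here as a lower bound

# Average number of readings each meter contributes per day.
readings_per_meter_per_day = readings_per_day / meters

# Assumption: one stored record per reading, at a constant daily rate.
days_of_history = stored_records / readings_per_day
years_of_history = days_of_history / 365

print(round(readings_per_meter_per_day, 1))
print(round(days_of_history))
print(round(years_of_history, 1))
```

Under those assumptions each meter reports roughly once an hour, and the Oracle store holds on the order of several years of readings.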

Page 8

Downstream Pattern Case Study (before)

[Diagram, "Before". Components: Meter Data Management, Distributors, CIS / AMI System, Consumers. Flow labels: Meter Data, Master Data, Billing Request, Billing Response, Billing Statement, Web Presentment, Web Service Request, Web Service Response, Reports.]

Page 9

Downstream Pattern Case Study (after)

[Diagram, "After". Components as in the "before" diagram (Meter Data Management, Distributors, CIS / AMI System, Consumers), plus a Kx Technologies block containing a Kdb+ Database. Flow labels as before (Meter Data, Master Data, Billing Request, Billing Response, Billing Statement, Web Presentment, Web Service Request, Web Service Response, Reports). New labels: Meter Reads Retrieval, Web Services, Transform & Load, Bulk Data Extract, Change Data Capture (Real Time), Initial Extract, Intra-Day Extract, Queries & Visualisation.]

Page 10

Downstream Pattern Results

Service | Previous Solution | Kx Solution
Web-services response time | 30 sec | 30 ms
Number of requests | 30/sec | 20,000/sec
Ad-hoc queries, aggregations, bulk extracts | Hours-Days | Seconds-Minutes
Service Availability | 20x7, 6 days/week | 24x7, 7 days/week
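The improvement factors implied by these results can be worked out directly. The Python snippet below is only arithmetic on the quoted figures; it assumes that the previous availability means 20 hours a day on 6 days a week, which is an interpretation of the slide's wording rather than something it states.

```python
# Web-services response time, converted to milliseconds.
previous_latency_ms = 30 * 1000
kx_latency_ms = 30
latency_speedup = previous_latency_ms / kx_latency_ms

# Sustained request rate, in requests per second.
previous_requests_per_sec = 30
kx_requests_per_sec = 20_000
throughput_gain = kx_requests_per_sec / previous_requests_per_sec

# Weekly service hours (assumed reading: 20 h/day x 6 days before, 24 h/day x 7 days after).
previous_hours_per_week = 20 * 6
kx_hours_per_week = 24 * 7
availability_gain = kx_hours_per_week / previous_hours_per_week

print(latency_speedup)
print(round(throughput_gain))
print(previous_hours_per_week, kx_hours_per_week, availability_gain)
```

So the quoted response time improves by three orders of magnitude, the request rate by a factor of several hundred, and weekly service hours by 40% under the stated reading.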

Page 11

Downstream Pattern

Pros
• Maintains investment in existing system
• Rapid implementation of Kx technology
• Low risk or impact on existing system
• Functionality and availability improved

Cons
• Added storage and possibly licensing costs
• Updates must flow through existing system
• Doesn’t address streaming

[Diagram. Components: Data Feed; Existing System of Record; Kx Technologies (Time series & Master Data in kdb+); Ad hoc Queries; Analysis; Dashboards. Flow labels: Time series data; Real-time or scheduled replication; Query results.]

Page 12

From Technology to Solutions

Solution

Kx Tools

Kx Platform

Page 13

Kx Use Cases

BUSINESS USE CASE | APPLICATION
Real-time Analytics | Tick-capture and streaming data is analysed and enriched in real time to produce live indicators of current conditions for further action.
Quantitative Research | Run trade, quote and fundamental analysis on large datasets and produce trading indicators faster.
Risk Management | Intra-day, pricing, credit, exposure and P&L alerts with visual tools, including heat maps and OLAP drill-downs to monitor activity.
Market Surveillance | Implement trading control alerts related to in-house and regulatory requirements, as well as the generation of planning reports.
Depth-of-Book Analysis | Create real-time depth-of-book views for any instrument across thousands of symbols. Build order books from disk in under a second.
Network and Hardware Management | Manage multi-server distributed environments from a single dashboard. Monitor the health of thousands of processes and servers across plants spread throughout multiple regions.
Internet of Things (IoT) | Real-time capture and processing of data generated by sensors in machines, homes, cars, smart meters, mobile phones and other devices.

Page 14

Kx Performance

Page 15

Kx Performance

Page 16

Kx Performance

Legend

• DNF = Did Not Finish

• RAM is memory used for queries

• Query times are in milliseconds

kdb+ is 10 to 100 times faster than other column stores (vertical, big3accel, hadoop / impala / parquet, ...)

kdb+ is 100 to 1000 times faster than row stores (postgres, big3rdbms, mongodb, spark, ...)

Page 17

Reference Architecture

[Diagram. Three server groups, each labelled Server 1, 2 ... n:
• FH 1, FH 2 ... FH n and TP, receiving Real-time Data
• RDB 1 ... RDB n, RTE and GW
• PDB and HDB 1 ... HDB n
A Client sits outside the server groups. Legend (arrow types): Data Flow; Queries & Results; Data Persistence.]

Acronyms
FH = Feed Handler
TP = Tickerplant
RDB = Realtime DB
RTE = CEP Engine
GW = Gateway
HDB = Historic DB
PDB = Persisting DB
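The roles named in the acronym list can be sketched as a toy publish/subscribe pipeline. The Python below is an illustration only, not Kx code: all class and method names are hypothetical, the data flow (feed handler to tickerplant to real-time database, with a gateway querying both stores) is an assumed reading of the diagram, and the persisting step that the slide assigns to a separate PDB is folded into an `end_of_day` method for brevity.

```python
from collections import defaultdict


class Tickerplant:
    """TP: appends each message to a log and publishes it to subscribers."""

    def __init__(self):
        self.log = []
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, table, row):
        self.log.append((table, row))
        for callback in self.subscribers:
            callback(table, row)


class HistoricalDB:
    """HDB: rows stored per (date, table) partition."""

    def __init__(self):
        self.partitions = {}


class RealtimeDB:
    """RDB: holds the current day's rows in memory."""

    def __init__(self):
        self.tables = defaultdict(list)

    def upd(self, table, row):
        self.tables[table].append(row)

    def end_of_day(self, hdb, date):
        # Stand-in for the persisting step: hand the day's rows to the HDB.
        for table, rows in self.tables.items():
            hdb.partitions[(date, table)] = list(rows)
        self.tables.clear()


class Gateway:
    """GW: answers one query from both historical and real-time data."""

    def __init__(self, rdb, hdb):
        self.rdb = rdb
        self.hdb = hdb

    def select(self, table):
        historical = [
            row
            for (date, name), rows in sorted(self.hdb.partitions.items())
            if name == table
            for row in rows
        ]
        return historical + list(self.rdb.tables.get(table, []))


def feed_handler(tp, raw):
    """FH: normalise a raw 'sym,price' message and pass it to the TP."""
    sym, price = raw.split(",")
    tp.publish("trade", {"sym": sym, "price": float(price)})


tp, rdb, hdb = Tickerplant(), RealtimeDB(), HistoricalDB()
tp.subscribe(rdb.upd)
gw = Gateway(rdb, hdb)

feed_handler(tp, "AAA,10.0")
rdb.end_of_day(hdb, "2016-11-01")
feed_handler(tp, "AAA,11.0")

prices = [row["price"] for row in gw.select("trade")]
print(prices)
```

The client only ever talks to the gateway, so it does not need to know whether a given row currently lives in memory or on disk.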

Page 18

Kx Stream

Page 19

Engineering for Performance

The world's leading time series database, specifically designed for handling massive data volumes and real-time streaming analytics.

• To improve search performance and data consumption we apply sharding, where data is split between multiple servers.

• Stream for Kx offers built-in horizontal scaling across all data microservices.

• Horizontal scalability is applied at the point of data capture and data querying.

• In-built Map Reduce means that results are virtually instantaneous without the additional overhead of defining unique aggregation logic.

• For increased capacity, we use replication: the process of mirroring our dataset.
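The combination of sharding and map-reduce described in the bullets above can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the Stream for Kx implementation: the shard function, the `meter_id` and `reading` fields and the data are all hypothetical.

```python
def shard_by_meter(rows, n_shards):
    """Split rows across n_shards 'servers' by meter id."""
    shards = [[] for _ in range(n_shards)]
    for row in rows:
        shards[row["meter_id"] % n_shards].append(row)
    return shards


def map_partial(shard):
    """Map step: each shard returns a partial (sum, count) for its own rows."""
    return (sum(row["reading"] for row in shard), len(shard))


def reduce_average(partials):
    """Reduce step: combine the partial sums and counts into one average."""
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count if count else None


# Hypothetical readings: meters 1..10, each with a reading equal to its id.
rows = [{"meter_id": i, "reading": float(i)} for i in range(1, 11)]

shards = shard_by_meter(rows, 3)
partials = [map_partial(shard) for shard in shards]
average = reduce_average(partials)

print([len(shard) for shard in shards])
print(average)
```

Each shard returns a sum and a count rather than its own average, because averaging per-shard averages gives the wrong answer when shards hold different numbers of rows. Replication, by contrast, would keep a full copy of every shard on more than one server so that queries can be spread across the copies.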

Page 20

How do we use our technology?

TOOLS for Kx - HARNESS THE POWER OF DATA
• Simplify both real-time & historical data using one powerful enterprise platform.

Kx - CORE TECHNOLOGY
• Streaming analytics, in-memory compute and database technology, providing a full application server with a powerful functional scripting language.

Kx SOLUTIONS - BUILD POWERFUL BUSINESS INSIGHTS
• Our tools are used to accelerate implementation of proven solutions for complex problems.

OUR PEOPLE - CHALLENGE US WITH YOUR UNIQUE PROBLEMS
• Our engineers develop, deploy and support solutions for virtually any problem involving massive amounts of data.

Page 21

Our Industry Solutions

• Kx for Algo: Build, test, and deploy algorithmic trading strategies
• Kx for Surveillance: Real-time market monitoring, surveillance workflow
• Kx for Analytics: Real-time and historical market analytics
• Kx for Pharma: IMS patient record analytics, manufacturing, clinical research
• Kx for Sensors: Smart meters, utilities, geolocation, customer analytics
• Dashboards for Kx
• Stream for Kx: Tools to rapidly develop and deploy streaming real-time and historical analytics
• Kx Technology

Page 22

Kx® and kdb+ are registered trademarks of Kx Systems, Inc., a subsidiary of First Derivatives plc

Resources

• Free 32-bit download version: http://kx.com/software-download.php

• Kx Wiki: http://code.kx.com/wiki/Main_Page

• Google Group: https://groups.google.com/forum/#!forum/personal-kdbplus

• Kx Github: http://kxsystems.github.io/

• STAC benchmarks: https://stacresearch.com/kx

• Kx Meetups: http://kx.meetup.com/

Kyra Coyne

[email protected]

+4917659883653