tranSMART Hackathon Introduction, Amsterdam 2015

Hackathon Introduction TRANSMART ANNUAL MEETING 2015 AMSTERDAM, OCTOBER 19, 2015 Kees van Bochove, Chair Architecture Working Group @ tranSMART Foundation | CEO @ The Hyve





Hackathon Topics

• Building a proof of concept (POC) that uses SparkR on Amazon EC2 as a computational backend for tranSMART 1.3

• Improving the visual analytics in tranSMART by updating or adding analytics workflows in the SmartR plugin


Apache Spark

• Largest and most active open source project in data science as of 2015

• Seen by many as a ‘replacement’ for Hadoop [MapReduce] in the big data area

• Implements lessons learned from Hadoop; built from the ground up as a framework to support data scientists

• Core concept: in-memory datasets and lazy evaluation
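
The two core concepts named above can be illustrated with a toy model. The sketch below is plain Python and deliberately not the Spark API: `LazyDataset`, `expensive_square`, and the other names are hypothetical. It only shows the idea that transformations are recorded without running, that work happens when a result is requested, and that a cached intermediate dataset stays in memory for reuse.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (not the Spark API)."""

    def __init__(self, source: Callable[[], Iterable[int]]):
        self._source = source                    # recipe for producing the data
        self._cache: Optional[List[int]] = None  # filled on first use if enabled
        self._cache_enabled = False

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Records the transformation; nothing is computed yet.
        return LazyDataset(lambda: (fn(x) for x in self._iterate()))

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        return LazyDataset(lambda: (x for x in self._iterate() if pred(x)))

    def cache(self) -> "LazyDataset":
        # Ask for the results to be kept in memory once they are computed.
        self._cache_enabled = True
        return self

    def _iterate(self) -> Iterator[int]:
        if self._cache is not None:
            return iter(self._cache)
        data = self._source()
        if self._cache_enabled:
            self._cache = list(data)
            return iter(self._cache)
        return iter(data)

    def collect(self) -> List[int]:
        # The "action": only here is the recorded pipeline executed.
        return list(self._iterate())


calls: List[int] = []


def expensive_square(x: int) -> int:
    calls.append(x)  # track how often the costly step really runs
    return x * x


squares = LazyDataset(lambda: range(5)).map(expensive_square).cache()
evens = squares.filter(lambda x: x % 2 == 0)
assert calls == []        # defining the pipeline computed nothing

first = evens.collect()   # triggers the computation and fills the cache
second = evens.collect()  # served from the in-memory cache
print(first, len(calls))
```

In this toy, the second `collect()` does not call `expensive_square` again, which is the benefit that in-memory caching gives iterative analyses.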


SparkR architecture



Hackathon Goal: SparkR integration

• Task: Integrate Spark with the tranSMART database by implementing the Spark RDD interface in the tranSMART core API

• Goal: Demonstrate scalability on the compute side (scalability on the database side remains limited by the relational database paradigm)

• Benefit: Spark-compatible applications (e.g. machine learning and big data analytics tools) can run on top of tranSMART
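
An RDD is essentially a list of partitions plus a function that computes the rows of one partition. As a rough illustration of what exposing a relational table through such an interface involves, here is a minimal Python sketch. It is not tranSMART or Spark code: the `observation` table, its columns, and the `PartitionedObservations` class are hypothetical, and SQLite with a thread pool stands in for the real database and the Spark executors.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Partition:
    index: int
    lo: int  # inclusive lower bound on patient_id
    hi: int  # exclusive upper bound on patient_id


class PartitionedObservations:
    """Toy analogue of an RDD over a relational table (hypothetical schema)."""

    def __init__(self, db_path: str, num_partitions: int):
        self.db_path = db_path
        self.num_partitions = num_partitions

    def get_partitions(self) -> List[Partition]:
        # Split the key range into contiguous, non-overlapping slices.
        conn = sqlite3.connect(self.db_path)
        try:
            lo, hi = conn.execute(
                "SELECT MIN(patient_id), MAX(patient_id) FROM observation"
            ).fetchone()
        finally:
            conn.close()
        if lo is None:  # empty table
            return []
        step = -(-(hi - lo + 1) // self.num_partitions)  # ceiling division
        return [
            Partition(i, lo + i * step, lo + (i + 1) * step)
            for i in range(self.num_partitions)
        ]

    def compute(self, part: Partition) -> Iterator[Tuple[int, float]]:
        # Each partition opens its own connection, so workers stay independent.
        conn = sqlite3.connect(self.db_path)
        try:
            yield from conn.execute(
                "SELECT patient_id, value FROM observation "
                "WHERE patient_id >= ? AND patient_id < ?",
                (part.lo, part.hi),
            )
        finally:
            conn.close()


def partial_sum(ds: PartitionedObservations, part: Partition) -> float:
    return sum(value for _, value in ds.compute(part))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE observation (patient_id INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?)",
        [(i, float(i)) for i in range(1, 101)],
    )
    conn.commit()
    conn.close()

    ds = PartitionedObservations(path, num_partitions=4)
    parts = ds.get_partitions()
    # Compute side: partitions are processed in parallel and then combined.
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(lambda p: partial_sum(ds, p), parts))
    total = sum(partials)
    print(len(parts), total)
```

The sketch also reflects the limitation noted in the goal: the workers scale out, but every partition still issues a query against the same relational database.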


Current Architecture


Goal architecture


SparkR architecture


SmartR

• Plugin for tranSMART 1.3 written by Sascha Herzinger, University of Luxembourg, for IMI eTRIKS

• Currently being extended by The Hyve and Sanofi


Current SmartR analytics

• Boxplot

• Correlation Analysis

• Heatmap

• Timeline Analysis

• Your ideas?

• TODO: add screenshots


Hackathon Goal: Visual Analytics

• Improve existing analytics workflows, e.g. the heatmap

• Add new workflows

• D3.js library for building visualizations
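
As a possible starting point for the heatmap workflow, the sketch below shows one common way to prepare heatmap data on the backend: scale each row to z-scores and flatten the matrix into one record per cell, a shape that D3.js data joins handle easily. This is an assumption for illustration only. The slides do not describe SmartR's actual data format, and the function names, field names (`row`, `col`, `value`), and sample values are hypothetical.

```python
import json
from statistics import mean, pstdev
from typing import Dict, List


def zscore_rows(matrix: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Scale each row to mean 0 and standard deviation 1 (constant rows become 0)."""
    scaled = {}
    for row_name, values in matrix.items():
        mu, sigma = mean(values), pstdev(values)
        scaled[row_name] = [
            0.0 if sigma == 0 else (v - mu) / sigma for v in values
        ]
    return scaled


def to_cells(matrix: Dict[str, List[float]], samples: List[str]) -> List[dict]:
    """Flatten the matrix into one record per heatmap cell."""
    return [
        {"row": row_name, "col": sample, "value": round(v, 3)}
        for row_name, values in matrix.items()
        for sample, v in zip(samples, values)
    ]


# Made-up example values, purely for illustration.
samples = ["S1", "S2", "S3"]
expression = {
    "GENE_A": [1.0, 2.0, 3.0],
    "GENE_B": [5.0, 5.0, 5.0],
}

cells = to_cells(zscore_rows(expression), samples)
payload = json.dumps(cells)  # what a front end could fetch and bind to rectangles
print(payload)
```

Row-wise scaling keeps a single highly expressed row from dominating the colour scale. Whether that is the right default for a given workflow is a design choice for the hackathon teams.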
