"awesomeness of container orchestras - data crunching in a company builder", martin held,...


Upload: dataconomy-media

Post on 08-Jan-2017


Category:

Data & Analytics



TRANSCRIPT

Awesomeness of Container Orchestras Data Crunching in a Company Builder

Martin Held

What is FinLeap doing?

Tech Department’s Mission

Our journey

How we did it

[Diagram: Monolithic Application (typically RoR) with WebUI, Backend, Service A and Service B; SQL; Data]

Some Learnings

• finding developers is hard, finding developers experienced in a specific tech stack is even harder

• integrating data-science-like functionality is not straightforward

• scaling the monolithic application can be challenging

How we do it these days

[Diagram: WebUI, API Gateway, Msg Broker, Backend, Service A, Service B, SQL]

we plan and implement containerised microservice architectures

This allows us

• multilingual applications
  • better utilisation of existing dev resources
  • broader talent pool for recruitment
• plug-in data science solutions as a service
• scaling
  • tech-wise: ‘scale bottleneck services’
  • business-wise: dedicated teams for different tasks
• reuse of services (authentication etc.)
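To illustrate how a data science solution can be plugged in as a service behind the Msg Broker, here is a minimal sketch. It assumes an in-memory broker as a stand-in for a real networked one; the class `InMemoryBroker`, the topic names and the toy scoring rule are all hypothetical and not taken from the talk.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBroker:
    """Toy stand-in for a message broker (assumption: a real setup uses a networked broker)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        # Register a service callback for one topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver synchronously; iterate over a copy so handlers may subscribe during delivery.
        for handler in list(self._subscribers[topic]):
            handler(message)


def build_demo() -> List[dict]:
    """Wire a hypothetical scoring service between two topics and run one message through."""
    broker = InMemoryBroker()
    scored: List[dict] = []

    def scoring_service(msg: dict) -> None:
        # Hypothetical "data science" service: a toy score based on text length.
        score = min(1.0, len(msg["text"]) / 100.0)
        broker.publish("application.scored", {"id": msg["id"], "score": score})

    # The backend only knows topic names, not which service (or language) handles them.
    broker.subscribe("application.received", scoring_service)
    broker.subscribe("application.scored", scored.append)

    broker.publish("application.received", {"id": 1, "text": "x" * 50})
    return scored


if __name__ == "__main__":
    print(build_demo())
```

Because services only share topic names and message shapes, the scoring service could be replaced or scaled independently of the backend, which is the point the list above makes.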


How is life in a container?

A Docker Container

[Diagram: Container 1, Container 2, Container 3]

- packages an application with its dependencies

- more lightweight than virtual machines (containers share the host OS kernel)

- runs on any computer, any infrastructure, any cloud

Life in a (DataScientist) Container

FROM ubuntu:14.04

ENV PYTHONPATH /opt/caffe/python

# Add caffe binaries to path
ENV PATH $PATH:/opt/caffe/.build_release/tools

# Get dependencies
RUN apt-get update && apt-get install -y bc cmake curl gcc-4.6 g++-4.6 wget

# Use gcc 4.6 (the rest of this command is cut off on the slide)
RUN update-alternatives --install /usr/bin/cc cc /usr/bin/gcc-4.6 30

# Clone the Caffe repo
RUN cd /opt && git clone https://github.com/BVLC/caffe.git

# Build Caffe core (the rest of this command is cut off on the slide)
RUN cd /opt/caffe && cp Makefile.config.example Makefile.config

# Add ld-so.conf so it can find libcaffe.so
ADD caffe-ld-so.conf /etc/ld.so.conf.d/

# Run ldconfig again (not sure if needed)
RUN ldconfig

# Install python deps
RUN cd /opt/caffe && \
    cat python/requirements.txt | xargs -L 1 sudo pip install

Life in a (DataScientist) Container

bvlc_reference_caffenet.caffemodel
RUN wget -O models/deploy.prototxt https://raw.githubusercontent.com/BVLC/caffe/master/models/bvlc_reference_caffenet/deploy.prototxt

Life in a (DataScientist) Container

Use Ubuntu base image

Install OS dependencies (g++, python, git, fortran, curl, etc.)

Clone Caffe repo (open source deep learning library)

Build Caffe core

Install Python dependencies

Build Caffe Python bindings

Add model and source code

Specify execution command

Life in a (DataScientist) Container

Image Recognition AI

fish, aquarium, child
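A rough sketch of how a containerised classifier might turn raw class scores into the kind of label list shown above. This is an assumption for illustration only: the scores and the label set below are made up, and the talk's container runs Caffe rather than this pure-Python code.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(scores: Sequence[float], labels: Sequence[str], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable (label, probability) pairs, best first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical labels and scores, chosen only to mirror the slide's example output.
    labels = ["fish", "aquarium", "child", "car", "tree"]
    scores = [4.0, 3.5, 3.0, 0.5, 0.1]
    print(", ".join(label for label, _ in top_k_labels(scores, labels)))
```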

Orchestration

The Orchestra

Scheduling: place and start containers on host(s) offering the required resources

Service Discovery + Registration: allow containers to communicate with each other and the rest of the world

Implement Resilience: e.g. automatically restart containers in case of failure
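The three duties above can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular orchestrator works: the names `Host`, `Container` and `ToyOrchestrator` are hypothetical, placement is simple first-fit, and "restart" just flips a flag.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Host:
    name: str
    free_cpu: float
    free_mem_mb: int


@dataclass
class Container:
    name: str
    cpu: float
    mem_mb: int
    host: Optional[str] = None
    running: bool = False


class ToyOrchestrator:
    def __init__(self, hosts: List[Host]) -> None:
        self.hosts: Dict[str, Host] = {h.name: h for h in hosts}
        self.containers: Dict[str, Container] = {}
        self.registry: Dict[str, str] = {}  # service name -> host name

    def schedule(self, container: Container) -> bool:
        """Scheduling: first-fit placement on a host with enough free CPU and memory."""
        for host in self.hosts.values():
            if host.free_cpu >= container.cpu and host.free_mem_mb >= container.mem_mb:
                host.free_cpu -= container.cpu
                host.free_mem_mb -= container.mem_mb
                container.host, container.running = host.name, True
                self.containers[container.name] = container
                self.registry[container.name] = host.name  # registration
                return True
        return False  # no host offers the required resources

    def lookup(self, service: str) -> Optional[str]:
        """Service discovery: where is this service currently reachable?"""
        return self.registry.get(service)

    def mark_failed(self, name: str) -> None:
        """Simulate a crash: the container stops and is deregistered."""
        self.containers[name].running = False
        self.registry.pop(name, None)

    def reconcile(self) -> List[str]:
        """Resilience: restart every stopped container on its host and re-register it."""
        restarted = []
        for c in self.containers.values():
            if not c.running:
                c.running = True
                self.registry[c.name] = c.host
                restarted.append(c.name)
        return restarted


if __name__ == "__main__":
    orc = ToyOrchestrator([Host("h1", 1.0, 512), Host("h2", 4.0, 8192)])
    orc.schedule(Container("feature-extractor", cpu=2.0, mem_mb=4096))
    orc.mark_failed("feature-extractor")
    print(orc.reconcile(), orc.lookup("feature-extractor"))
```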

An example

Reverse Image Search

Query Image

Reverse Image Search

[Diagram: services connected through the Message Broker]

• Crawler: Product Pictures, Metadata → Product Picture, Raw Metadata
• Feature Extractor: Product Picture → Picture Features
• Metadata Parser: Raw Metadata → Structured Metadata
• StorageSink: Picture Features, Structured Data → Store
• NearestNeighbor Search API: Query Picture → Query Features → Similar Pictures
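The last step, going from Query Features to Similar Pictures, can be sketched as a brute-force nearest-neighbour lookup. The talk does not say which similarity measure or index is used; cosine similarity, the `FeatureStore` class and the three-dimensional example vectors are assumptions made purely for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class FeatureStore:
    """Hypothetical in-memory stand-in for the Store holding Picture Features."""

    def __init__(self) -> None:
        self._features: Dict[str, List[float]] = {}

    def add(self, picture_id: str, features: Sequence[float]) -> None:
        self._features[picture_id] = list(features)

    def nearest(self, query: Sequence[float], k: int = 2) -> List[Tuple[str, float]]:
        """Brute-force search: score every stored picture, return the k most similar."""
        scored = [(pid, cosine_similarity(query, feats)) for pid, feats in self._features.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    store = FeatureStore()
    # Made-up feature vectors; a real Feature Extractor would produce much longer ones.
    store.add("shoe-red", [1.0, 0.0, 0.0])
    store.add("shoe-blue", [0.9, 0.1, 0.0])
    store.add("lamp", [0.0, 0.0, 1.0])
    print([pid for pid, _ in store.nearest([1.0, 0.0, 0.05])])
```

A linear scan like this is only practical for small stores; larger collections typically need an approximate nearest-neighbour index, which is outside the scope of this sketch.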

To take home

containers are a great tool to package code with all its dependencies and make it usable by others

they allow us to develop scalable, plug-in-ready data science solutions

complex, scalable and resilient architectures for the masses

Thank you for your attention!

www.linkedin.com/in/martin-held
