New Approach to Big Data: The Snagajob Story
Robert Fehrmann, Principal Architect @ Snagajob


Post on 24-Sep-2020


TRANSCRIPT

Page 1: New Approach to Big Data - Snowflake Inc. … · Evangelist for polyglot data environments Community involvement (MongoDB User Groups / DevOps) ... Hadoop. Goals for Next Generation

New Approach to Big Data: The Snagajob Story

Robert Fehrmann, Principal Architect @ Snagajob


About Me

● Master's degree in Computer Science from “Technische Universitaet Braunschweig”

● 25 years building the data tier for applications in different verticals

● Evangelist for polyglot data environments

● Community involvement (MongoDB User Groups / DevOps)


Funnel Analysis

750,000 postings every day

600,000 unique visitors

X% find the posting interesting

Y% apply for the posting (candidate)

Z%

Using analytics to understand the funnel:
- Geographical Analysis
- Customer Analysis
- Historical Analysis
- Industry Analysis
- Click-through rate & abandoning the search
- What makes a posting interesting, ...
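The funnel above can be read as a chain of conversion rates, where each stage is measured against the one before it and against the top of the funnel. The following is a minimal sketch of that arithmetic, not Snagajob's actual analysis: the function name `funnel_conversion` and the stage counts other than the 600,000 unique visitors are hypothetical placeholders, since the slide leaves X, Y and Z unspecified.

```python
from typing import List, Tuple


def funnel_conversion(stages: List[Tuple[str, int]]) -> List[Tuple[str, float, float]]:
    """For each stage return (name, % of previous stage, % of first stage)."""
    if not stages:
        return []
    first = stages[0][1]
    prev = first
    results = []
    for name, count in stages:
        # Guard against empty stages so a zero count never divides by zero.
        step_pct = 100.0 * count / prev if prev else 0.0
        overall_pct = 100.0 * count / first if first else 0.0
        results.append((name, round(step_pct, 1), round(overall_pct, 1)))
        prev = count
    return results


# Only the visitor count comes from the slide; the other counts are made up.
example_stages = [
    ("unique visitors", 600_000),
    ("found a posting interesting", 120_000),  # hypothetical
    ("applied for the posting", 30_000),       # hypothetical
]

for name, step_pct, overall_pct in funnel_conversion(example_stages):
    print(f"{name}: {step_pct}% of previous stage, {overall_pct}% of visitors")
```

Running the same calculation per region, customer, or industry gives the kind of breakdown listed in the analyses above.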


Data Collection Framework V1

[Architecture diagram. Components shown: Web (x3), Message Bus, LB, Tracking Service (x2), Flume (x3), Hadoop, Hue, Impala, Report Console, SQL-DW, Looker, Vertica]


Evolution

[Timeline graphic covering 2012, 2014, 2015 and 2016. Labels shown, in extraction order:]

- “We want to be a cloud based company” (Peter Harris, CEO)
- Search continues for a true cloud solution till ….
- Data warehouse & platform software (on premise)
- Vertica Data Warehouse
- Hadoop
- Vertica Data Warehouse
- Move to Cloud: doesn’t solve all problems
- Hadoop


Goals for Next Generation Solution

● Horizontal Scalability

● PaaS

● Stability

● Ease of Use

● Can’t be more expensive


Architecture


Data Collection Framework V2

[Architecture diagram. Components shown: Web (x3), Message Bus, LB, Tracking Service (x2), FiveTran, Salesforce, Netsuite, Kinesis, Snowflake, Looker, Snowflake Portal, AdHoc, Spark, MongoDB]


Results: Performance


Results: Better Use of Resources


Snagajob Platform


Other Features

● Undrop (DB, Table, Schema), no restore required

● Clone (DB, Table, Schema) (metadata only operation)

● Native JSON Parsing (as well as CSV, AVRO, XML, Parquet)

● Automatic Encryption of Data

● Automatic Query Optimization (no tuning)

● All Data in one place (single source of truth)
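To make the first three features concrete, here is a small sketch that only builds the SQL text for them; it does not connect to Snowflake or run anything. The helper names (`undrop_sql`, `clone_sql`, `json_field_sql`) and the object names in the usage lines are hypothetical and do not come from the talk. The statement shapes (`UNDROP ...`, `CREATE ... CLONE ...`, and the `column:path::TYPE` accessor for semi-structured data) follow Snowflake's SQL syntax as I understand it.

```python
import re

# Unquoted identifier: starts with a letter or underscore, then letters, digits, _ or $.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")
_KINDS = {"DATABASE", "SCHEMA", "TABLE"}


def _check_name(name: str) -> str:
    """Validate a dotted, unquoted name such as db.schema.table."""
    if not all(_IDENT.match(part) for part in name.split(".")):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def _check_kind(kind: str) -> str:
    kind = kind.upper()
    if kind not in _KINDS:
        raise ValueError(f"unsupported object kind: {kind!r}")
    return kind


def undrop_sql(kind: str, name: str) -> str:
    """Statement that brings back a dropped object without a restore."""
    return f"UNDROP {_check_kind(kind)} {_check_name(name)}"


def clone_sql(kind: str, source: str, target: str) -> str:
    """Statement that creates a clone (a metadata-only copy) of an object."""
    return f"CREATE {_check_kind(kind)} {_check_name(target)} CLONE {_check_name(source)}"


def json_field_sql(table: str, column: str, path: str, cast: str = "STRING") -> str:
    """Query that reads one field out of a JSON column and casts it."""
    return (
        f"SELECT {_check_name(column)}:{_check_name(path)}::{_check_name(cast)} "
        f"FROM {_check_name(table)}"
    )


# Hypothetical object names, for illustration only.
print(undrop_sql("table", "analytics.events"))
print(clone_sql("schema", "prod", "dev"))
print(json_field_sql("events", "payload", "event_type"))
```

Keeping the statements as plain strings makes the sketch testable without a warehouse; in practice they would be passed to whichever Snowflake client is in use.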
