agile data science

Agile Data Science Alexander Bauer Lead Data Scientist @ Lidl Frankfurt Analytics Meetup, 2017/02/24


Post on 03-Mar-2017

TRANSCRIPT


Agenda
• Data Science
• Challenges
• Agile Data Science Projects
• Case Study

What is Data Science?
• Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured – Wikipedia

Business Goals
Why do companies hire data scientists?
• Reduce costs
• Increase revenue
• Reduce risk
• Create innovation

Deliverables
How do data scientists deliver?
• Actionable insights (reports)
• Data products
• New product features
• Trials, A/B Testing

Challenges
Why do many data science projects fail?
• Lack of Business Understanding
• Data Access (Security, Privacy)
• Deployment and Operation (Scalability, Acceptance)
• Time to market (Competition, Budget)

Case Study: Data Science for Sales Department

"I want a recommender system for my Sales Reps"

"Sure, we can use Alternating Least Squares, Singular Value Decomposition!"


"Show me what you can do with Deep Learning"

"Cool, we can do something with Tensorflow on your data"


"I want a dashboard of sales by country and product"

"Well, we can do visualizations - but that's actually not my job!"

Typical pitfalls during project execution
Phases: Modeling, Trial/Pilot, Operationalization
• No access to data
• Model does not scale
• Users don't accept solution
• Fails to meet business objective
• Not enough signal
12 months
Out of budget

Solution: Iterative Approach
CRISP-DM

Agile Data Science
How can we implement CRISP-DM in practice?
• Agile Product Management
• Agile Development
• Data Science Platform / Data Lake

Agile Product Management – The Product Vision Statement [1]

"Leverage data science to increase sales team productivity"

Target Group: Sales Reps, Sales Manager
Needs: Close deals, Prioritize leads, Prevent churn, Acquire new leads, Up-sell, Cross-sell
Product: ?
Business Goals: Increase conversion rate, Increase average basket size, Reduce churn rate, Grow customer base

[1] Roman Pichler: Agile Product Management with Scrum

User Stories – Bridging the gap between algorithms and business needs

Association Rules:
As a sales rep, I need to understand which products are often bought together, so that I can recommend additional products during sales calls and increase up-selling.

Churn Factor Analysis:
As a sales rep, I need to understand the factors that drive churn, so that I can select customers to call, make sure they are satisfied with our products and reduce churn.

Recommender system:
As a sales rep, for each customer I need to understand which products were bought by customers with similar purchase history, so that I can make personalized recommendations and increase up-selling.
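To make the first user story concrete, here is a minimal sketch of how "products often bought together" could be surfaced as pairwise association rules with support and confidence. The baskets, product names, the `pair_rules` helper and the `min_support` threshold are all hypothetical and not taken from the talk; a real project would likely use a full frequent-itemset algorithm on actual sales data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: each set is one order (product names are made up).
baskets = [
    {"coffee", "milk", "sugar"},
    {"coffee", "milk"},
    {"coffee", "sugar"},
    {"tea", "milk"},
]

def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for product pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both products
        if support < min_support:
            continue
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, the strongest rule is "sugar -> coffee": every basket with sugar also contains coffee. A rule like that is something a sales rep could act on during a call.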

Story Mapping and Release Planning
Themes: Up/Cross-Selling, Churn Prevention, Leads Prioritization, User Interface/Deployment
Releases: Release 1, Release 2, Release 3
Stories on the map:
• Association Rules
• Factor Analysis
• Conversion - Factor Analysis
• Item-Item Recommender
• Viz: Top N items per customer
• A/B Testing
• Simple predictive model for churn (sales history data)
• Improved predictive model for churn (incl. CRM data)
• Content-based recommender for cold-start (incl. CRM data)
• A/B Testing
• Viz: Top N customers likely to churn
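The "Item-Item Recommender" and "Top N items per customer" stories can be illustrated with a small sketch. It assumes binary purchase data and cosine similarity between items; the customers, products and the helper names `item_similarity` and `top_n` are hypothetical, and the talk does not say which similarity measure was actually used.

```python
from math import sqrt

# Hypothetical purchase history: customer -> set of purchased products.
purchases = {
    "c1": {"coffee", "milk", "sugar"},
    "c2": {"coffee", "milk"},
    "c3": {"coffee", "sugar"},
    "c4": {"tea", "milk"},
}

def item_similarity(purchases):
    """Cosine similarity between items over binary customer-purchase vectors."""
    buyers = {}
    for customer, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(customer)
    sim = {}
    for a in buyers:
        for b in buyers:
            if a != b:
                overlap = len(buyers[a] & buyers[b])
                sim[(a, b)] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return sim

def top_n(customer, purchases, sim, n=2):
    """Rank products the customer has not bought by summed similarity."""
    owned = purchases[customer]
    candidates = {b for (_, b) in sim} - owned
    scores = {c: sum(sim[(o, c)] for o in owned) for c in candidates}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:n] if score > 0]

sim = item_similarity(purchases)
recommendations = top_n("c2", purchases, sim)
print(recommendations)
```

A "good enough" first version like this can be shown to sales reps early, and replaced by a stronger model in a later release if the feedback justifies it.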

Agile Development with Scrum
Data Science is a Team Sport

Data Lake/Agile Platform
Data: CRM, Purchase Data, Call Center Tickets
Layers: Platform Layer, Application Layer
Components: Docker/VMs, App, REST, Security/Auth, Auditing, Monitoring, Unstructured Data, Structured Data, Scalable Job Execution / Query Engine, ETL, Query Interface/Notebooks, Visualization Tools, Scheduling, Legacy Systems
Users: Business Users, Analysts/Data Scientists

Summary / Call for Action
• Data science projects rarely fail because of insufficient modeling skills
• Focus on business value, deliver "good enough" models first
• Deliver in small increments that already provide value end-to-end, present in Sprint Reviews to all stakeholders
• Manage stakeholders using a clear product vision, a user story backlog and release plans
• Deploy as early as possible to ensure user acceptance, declare as "beta" mode
• Build an infrastructure that enables agile development

Thank you! Questions?