Agile Data Science
What is Data Science?
• Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured – Wikipedia
Business Goals
Why do companies hire data scientists?
• Reduce costs
• Increase revenue
• Reduce risk
• Create innovation
Deliverables
How do data scientists deliver?
• Actionable insights (reports)
• Data products
• New product features
• Trials, A/B Testing
Challenges
Why do many data science projects fail?
• Lack of Business Understanding
• Data Access (Security, Privacy)
• Deployment and Operation (Scalability, Acceptance)
• Time to market (Competition, Budget)
Case Study: Data Science for Sales Department
I want a recommender system for my Sales Reps
Sure, we can use Alternating Least Squares Singular Value Decomposition!
Case Study: Data Science for Sales Department
Show me what you can do with Deep Learning
Cool, we can do something with TensorFlow on your data
Case Study: Data Science for Sales Department
I want a dashboard of sales by country and product
Well, we can do visualizations - but that’s actually not my job!
Typical pitfalls during project execution
[Figure: project phases Modeling, Trial/Pilot, Operationalization; time label: 12 months]
• No access to data
• Model does not scale
• Users don’t accept solution
• Fails to meet business objective
• Not enough signal
• Out of budget
Agile Data Science
How can we implement CRISP-DM in practice?
• Agile Product Management
• Agile Development
• Data Science Platform / Data Lake
Agile Product Management – The Product Vision Statement¹
Vision: "Leverage data science to increase sales team productivity"
Target Group: Sales Reps, Sales Manager
Needs: Close deals, Prioritize leads, Prevent churn, Acquire new leads, Up-sell, Cross-sell
Product: ?
Business Goals: Increase conversion rate, Increase average basket size, Reduce churn rate, Grow customer base
¹ Roman Pichler: Agile Product Management with Scrum
User Stories – Bridging the gap between algorithms and business needs
Association Rules: As a sales rep, I need to understand which products are often bought together, so that I can recommend additional products during sales calls and increase up-selling.
Churn Factor Analysis: As a sales rep, I need to understand the factors that drive churn, so that I can select customers to call, make sure they are satisfied with our products and reduce churn.
Recommender system: As a sales rep, for each customer I need to understand which products were bought by customers with similar purchase history, so that I can make personalized recommendations and increase up-selling.
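The Association Rules story ("which products are often bought together") can be illustrated with a minimal sketch. It assumes purchase data is available as a list of baskets, one list of product names per order. The function name `pair_rules`, the `min_support` threshold and the sample baskets are hypothetical and only serve the illustration; a real project would likely use a dedicated library and rules over larger item sets.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for product pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both products
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]       # P(y | x)
            lift = confidence / (item_counts[y] / n)  # P(y | x) / P(y)
            rules.append((x, y, support, confidence, lift))
    # Highest lift first: these pairs co-occur more often than chance suggests
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Hypothetical sample data
baskets = [
    ["laptop", "mouse"],
    ["laptop", "mouse", "dock"],
    ["laptop", "dock"],
    ["mouse"],
    ["laptop", "mouse"],
]
rules = pair_rules(baskets)
for x, y, support, confidence, lift in rules:
    print(f"{x} -> {y}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A rule with lift above 1 (here "dock" and "laptop") is a candidate for an additional recommendation during a sales call; a rule with lift below 1 mostly reflects that both products are popular on their own.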
Story Mapping and Release Planning
[Figure: story map. Labels: Up/Cross-Selling, Churn Prevention, Leads Prioritization, User Interface/Deployment]
• Association Rules
• Factor Analysis
• Conversion - Factor Analysis
• Item-Item Recommender
• Viz: Top N Items per customer
• A/B Testing
• Simple Predictive Model for Churn (sales history data)
• Improved predictive model for churn (incl. CRM data)
• Content-based recommender for cold-start (incl. CRM data)
• A/B Testing
• Viz: Top N customers likely to churn
Releases: Release 1, Release 2, Release 3
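As an illustration of the "Item-Item Recommender" and "Top N Items per customer" entries, here is a minimal sketch of a "good enough" first version. It assumes purchase history is available as a mapping from customer to the set of products bought, and it uses cosine similarity on co-purchases, which is one common choice and not necessarily what the presenter had in mind. The customer names, products and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(purchases):
    """Cosine similarity between items, based on which customers bought them."""
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            overlap = len(buyers[a] & buyers[b])
            if overlap:
                sims[a][b] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def top_n(customer, purchases, sims, n=3):
    """Rank products the customer does not own by summed similarity to owned ones."""
    owned = purchases[customer]
    scores = defaultdict(float)
    for item in owned:
        for other, similarity in sims.get(item, {}).items():
            if other not in owned:
                scores[other] += similarity
    # Sort by score (descending), then by name for a stable order
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


# Hypothetical sample data
purchases = {
    "acme": {"laptop", "mouse"},
    "globex": {"laptop", "mouse", "dock"},
    "initech": {"laptop", "dock"},
    "umbrella": {"mouse", "headset"},
}
sims = item_similarities(purchases)
print(top_n("acme", purchases, sims))
```

The all-pairs loop is quadratic in the number of products, which is acceptable for a first release on a small catalog but would need a scalable job on the platform for larger ones.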
Data Lake/Agile Platform
[Figure: platform architecture with a Platform Layer and an Application Layer]
Data sources: CRM, Purchase Data, Call Center Tickets
Components: Docker/VMs, App, Security/Auth, Auditing, Monitoring, Unstructured Data, Structured Data, Scalable Job Execution / Query Engine, App REST, ETL, Query Interface/Notebooks, Visualization Tools, Scheduling, Legacy Systems
Users: Business Users, Analysts/Data Scientists
Summary / Call to Action
• Data science projects rarely fail because of insufficient modeling skills
• Focus on business value, deliver "good enough" models first
• Deliver in small increments that already provide value end-to-end, present in Sprint Reviews to all stakeholders
• Manage stakeholders using a clear product vision, a user story backlog and release plans
• Deploy as early as possible to ensure user acceptance, declare as "beta" mode
• Build an infrastructure that enables agile development