TRANSCRIPT
India Analytics and Big Data Summit 2015
Location : Mumbai
Date : 3 Feb 2015
Name of the Speaker : Kanwal Prakash Singh, Data Scientist
Company Name : Housing
www.unicomlearning.com
www.bigdatainnovation.org
● Information and data
● Data - Raw Facts or Figures
● Information - Processed facts, sensible
● Information is derived from data
● examples
● Why do we need data?
● Some scenarios where scarcity of data led to dangerous consequences
○ Earth is flat
○ Columbus and America vs India
○ "Prosperity will last forever", then stock markets crashed
● Take a guess: how much data is collected per day (scale)?
○ Housing
○ LinkedIn
○ Facebook
○ Zomato
● It’s not the data, it’s the questions you ask of the data
● Are you asking the right questions of the data?
○ Do you have an adequate amount of data to test your hypothesis?
○ If so, are you sure you are not forming strong beliefs by overlooking some bias in the data?
○ Correlation != causation
● Analytics @ Housing
● What we capture and what we do with it:
● Optimise Operations / data collection in-house / recommendations and understanding users / user bucketing
● Forecasting, Price Estimates
● Heatmaps - Demand Supply , Price , CFI
● How can data science be used for optimising operations?
○ Flat Duplication
○ Listing Decay
○ Forecasting - Supply / Demand / Load
○ Route Optimisation
● Problem formulation followed by solution through statistical methods
● Follow the curiosity and desire for perfection
● The utilization metric (useful DC hours / total available DC hours, where DC = Data Collector) was not up to the mark.
● Why?
● It could have been a load issue (not enough listing requests, so DCs sat idle), but that was not the case.
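The utilization ratio above can be sketched in a few lines of Python. This is only an illustration: the function name, the shift breakdown, and every number below are assumptions made up for the example, not figures from the talk. Splitting the available hours by activity shows whether low utilization comes from idle time or from something else.

```python
def utilization(useful_hours, available_hours):
    """Fraction of available Data Collector (DC) hours spent on useful work."""
    if available_hours <= 0:
        raise ValueError("available_hours must be positive")
    return useful_hours / available_hours


# Hypothetical 8-hour shift for one DC, in hours (illustrative numbers only).
shift = {"collecting": 3.0, "travelling": 4.0, "idle": 1.0}
available = sum(shift.values())

print(utilization(shift["collecting"], available))  # 0.375
print(shift["idle"] / available)                    # 0.125
```

In this made-up shift the DC is rarely idle, yet utilization is still low, which rules out a pure load problem.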
● In fact, it appeared we were overloaded.
● Again, how?
● Data Collectors were travelling a lot (between two jobs).
● Hence came the idea of Branching
● The aim was two-fold:
○ reduce the travel time per flat
○ develop capability to serve a request within 45 minutes
● Done? Awesome, problem identification done :)
● Not really done!
● There was considerable scope for improvement in the Scheduling Algorithm
● So, all in all, two problems:
○ Find new offices (delocalisation)
○ Optimise the Scheduling Algorithm
● New Office Identification (Constrained Cost Optimisation)
● Expanding through setup of new Branches
● Estimation of branch locations
● Costs and capacities of new branches
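The slides that detailed the formulation are not preserved in this transcript, so the following is only a sketch of how a constrained cost optimisation for branch locations might look, not the speaker's method. It assumes a greedy heuristic: keep opening the affordable candidate site that most reduces total straight-line distance from each demand point to its nearest branch, until the budget runs out. The coordinates, opening costs, budget, and function names are all hypothetical, and branch capacities are ignored for brevity.

```python
import math


def total_travel(demand, branches):
    """Sum of distances from each demand point to its nearest open branch."""
    return sum(min(math.dist(d, b) for b in branches) for d in demand)


def pick_branches(demand, existing, candidates, budget):
    """Greedy heuristic: candidates maps site -> opening cost (hypothetical units)."""
    open_sites = list(existing)
    chosen = []
    remaining = budget
    while True:
        current = total_travel(demand, open_sites)
        best, best_gain = None, 0.0
        for site, cost in candidates.items():
            if site in chosen or cost > remaining:
                continue  # already opened, or over budget
            gain = current - total_travel(demand, open_sites + [site])
            if gain > best_gain:
                best, best_gain = site, gain
        if best is None:
            return chosen  # nothing affordable improves travel any further
        chosen.append(best)
        open_sites.append(best)
        remaining -= candidates[best]


# Illustrative data only: listing locations on a grid and one head office.
demand = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 8)]
existing = [(0, 0)]
candidates = {(10, 0): 4, (5, 8): 4, (5, 0): 3}

chosen = pick_branches(demand, existing, candidates, budget=8)
print(chosen)  # [(10, 0), (5, 8)]
```

A greedy pass like this is not guaranteed to be optimal; an exact treatment would pose the same objective and budget constraint as an integer programme.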
Penalty Additions
● Bingo ! Nailed it
● Scheduling Algorithm for collection and distribution Systems
● Optimal allocation of timed tasks (Listing Requests) to the work force (Data Collectors)
● Minimum cost maximum matching in a graph
● Hungarian Algorithm
● Optimal Allocation of jobs to people - each person has some cost to perform a job
● Minimum Cost Maximum Matching in a Bipartite Graph
○ Matching - Set of Edges, with no vertices repeated
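The assignment step described above can be sketched in pure Python. The function below is a standard O(n^3) potentials-based form of the Hungarian algorithm; it is an illustration, not the production scheduler from the talk. The cost matrix is hypothetical: rows stand for Data Collectors, columns for listing requests, and each entry is an assumed travel time in minutes. The sketch expects at least as many columns as rows.

```python
def hungarian(cost):
    """Minimum-cost assignment of rows to columns (requires rows <= columns).

    Returns (assignment, total) where assignment[i] is the column given to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials
    v = [0] * (m + 1)    # column potentials
    p = [0] * (m + 1)    # p[j] = row currently matched to column j (1-based)
    way = [0] * (m + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: augmenting path found
                break
        while j0 != 0:      # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


# Hypothetical travel times (minutes): 3 Data Collectors x 3 listing requests.
travel = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = hungarian(travel)
print(assignment, total)  # [1, 0, 2] 5
```

Here collector 0 takes request 1, collector 1 takes request 0, and collector 2 takes request 2. No vertex is repeated, which is exactly the matching property, and no other one-to-one assignment has a lower total cost for this matrix.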
● Nailed it now :D
● Operational cost reduced by 30%
● The best part - the solution is transferable
○ All delivery and collection systems
○ Any general density-based branching model
● Takeaways
○ Data is brahmastra
○ A noob can't master brahmastra, so rise to the level of the Elite Warriors (the Mahabharata had several)
○ How?
■ Mindset - Curious / Hardworking / Focused
■ Read / Learn - Blogs / Books / Courses / Peers
■ Apply - Personal Projects / Kaggle
■ Teach
● Acknowledgements
○ Mr. Shanu Vivek, Operations BI, Housing
○ Mr. Vaibhav Krishan, Sr. Quant Analyst
○ Mr. Jaspreet Saluja, Co-Founder, Housing
○ Mr. Rishabh Gupta, Operations, Housing
○ Mr. Arpit Agarwal, Operations, Housing
○ Mr. Abhishek Anand, CTO, Housing
○ Mr. Nitin Sangwan, DSL, Housing
Speaker Name: Kanwal Prakash Singh
Email ID: [email protected]
India Analytics and Big Data Summit 2015
Organized by UNICOM Trainings & Seminars Pvt. Ltd.
THANK YOU