Project Management for Big Data Projects
TRANSCRIPT
PROJECT MANAGEMENT & BIG DATA ANALYTICS
Sandeep Kumar PMP®
INTRODUCTION
IT Strategy & Business Transformation
Industries: Media & Advertising, Telecom, BFSI, FMCG, Manufacturing, BPO/KPO
Services: Shared Service Delivery, GIC & Back Office, PMO, Lean Six Sigma, Continuous Improvement, Enterprise IT, ERP, Cloud & Infrastructure, Development, Outsourcing, and IT Security & Governance
TWO STREAMS
Project Management of Analytics
• Big Data
• Data Warehousing
• Lean Six Sigma
• In-memory computing
• Internet of Things
• Social Media
Analytics of Project Management
• Time-Cost-Spec Analytics
• Feasibility Analytics
• Resource Analytics
• Management of Collaboration
• Agile & Scrum
APOLOGIES, DISCLAIMERS, ET AL
• Big Data is over-hyped
• Big Data is still evolving
• Analytics is old, the tools are new!
• Project Management solves most of the problems
• Its importance is usually understated
• The success of a Big Data initiative depends primarily on the management, then on the PM & the DS
• Hold the PM responsible only if you know what you want!
• The roles I talk about here are essentially with respect to Big Data projects
If you have questions, I will try & answer them!
UNDER SCANNER!
Big Data / Analytics Myths
• It is mature and cool
• It is an extension of EDW
• Data quality can be slightly compromised
• A single pre-built technology (e.g. Hadoop) will suffice
• Data scientists are easy to get
• Virtualization / Clustering will take care of infra needs
• If you have huge data, every solution is a Big Data solution
Project Management Myths
• Managing only activities
• Just time and cost management
• Mere resource allocation
• Reaching the finishing-line!
• General management suffices
• Have time to learn
UNDER SCANNER!
Big Data / Analytics Facts
• Responsible for Business Case and ROI definitions
• Executive Sponsorship & Funds
• ‘Real’ Resource Provisioning
• Based on Enterprise Architecture
• Highly complex and iterative process
• Loads of scientific knowledge required
• Sources of data increase every day
• Should be able to adapt with time
Project Management Facts
• Responsible for Scope & acceptance by all parties
• Direction setting & KPI-SF definitions
• ‘Real’ Resource Management
• Right to procure & deploy the appropriate resource
• Stakeholder & Communication management
• Accountability and Responsibility for the success (and failure)
THE DATA SCIENTIST
Key skills of a Data Scientist – the hard skills guy
• Basic Tools: Knowledge of statistical programming languages, like R or Python, and SQL
• Basic Statistics: Familiar with statistical tests, distributions, maximum likelihood estimators, etc.
• ETL Tools: Best-in-class tools like Informatica, IBM InfoSphere, SAP BO, Oracle or SAS Data Integrator, Pentaho, Ab Initio
• Machine Learning / Artificial Intelligence / Pattern Recognition: Methods for Classification and Regression like k-nearest neighbours, random forests, etc.
• Multivariable Calculus & Linear Algebra: Especially required where data is used for predictive performance or algorithm optimization
• Data Munging / Scrubbing or Cleanliness: For example, inconsistent string formatting such as ND or Del for New Delhi; inconsistent date formats such as [mm-dd-yyyy], [dd-mm-yyyy] or [yyyy-dd-mm]
• Data Visualization & Communication tools: Principles and tools of data visualization, like ggplot and d3.js
• Software Engineering: Strong software engineering background, SDLC, Agile, Scrum, DB techniques, Data intensive product development
• Software Testing skills: To make sure the output delivers what the business needs
• Basic Project Management Skills: Thinking like a Project Manager
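The data munging bullet above can be illustrated with a minimal Python sketch. It assumes a hand-maintained alias table and that each source system declares which date layout it uses; the names `CITY_ALIASES`, `DATE_FORMATS`, `normalize_city` and `parse_date` are hypothetical and not from the talk.

```python
from datetime import date, datetime

# Hypothetical alias table: lower-cased variants mapped to one canonical name.
CITY_ALIASES = {
    "nd": "New Delhi",
    "del": "New Delhi",
    "new delhi": "New Delhi",
}

# Assumption: every source system states which of these layouts it uses.
DATE_FORMATS = {
    "mm-dd-yyyy": "%m-%d-%Y",
    "dd-mm-yyyy": "%d-%m-%Y",
    "yyyy-dd-mm": "%Y-%d-%m",
}


def normalize_city(raw: str) -> str:
    """Return the canonical city name, or the trimmed input if it is unknown."""
    cleaned = raw.strip()
    return CITY_ALIASES.get(cleaned.lower(), cleaned)


def parse_date(raw: str, layout: str) -> date:
    """Parse a date string using the layout declared by its source system."""
    return datetime.strptime(raw.strip(), DATE_FORMATS[layout]).date()


if __name__ == "__main__":
    print(normalize_city(" ND "))                   # canonical city name
    print(parse_date("03-04-2015", "mm-dd-yyyy"))   # read as month-day-year
    print(parse_date("03-04-2015", "dd-mm-yyyy"))   # same text, day-month-year
```

The same text, 03-04-2015, parses to two different dates depending on the declared layout, which is why the layout should come from the source system's metadata rather than be guessed from the value.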
THE PROJECT MANAGER
Key skills of a Project Manager – the soft skills guy
• Project Charter: Project Stakeholders and Objectives documented and signed-off
• Business Case: Asks the ‘whats’ and ‘whys’ of the business requirement
• Scheduling Tools: Creates a Plan of Action to answer the ‘hows’ of the project
• Vendor Management: Links up all first- and third-party resources
• Risk Management: The real management tool, with a mitigation for each risk
• Communication Management: The core of collaboration and Management
• Software Engineering: Software engineering background with fair knowledge of tools
• Software Testing skills: To make sure the output delivers what the business needs
• Basic Data Management Skills: Thinking like a Data Scientist
MERGING ROLES
The Data Scientist
• The Requirements guy
• The Data Tools guy
• The Resource guy
• The Specialist
• The Enterprise Architect
• The Software guy
Breaking the Technical Barriers
The Project Manager
• The Requirements & Scope guy
• The Project Tools guy
• The Resource guy
• The Generalist
• The Program Manager
• The Hardware & Software guy
Breaking the Cultural Barriers
WHY DO ANALYTICS PROJECTS FAIL?
When do Projects fail, in general?
• Not completed within budget
• Not completed on time
• Not completed as per specifications
Whys and Wherefores…
• Poor scoping – unclear objective
• Inadequate resources – lack of talent
• Inappropriate solution – wrong tool selection
• Bad planning – insufficient analysis
• Bad execution – poor Project Management
“3 out of 4 Big Data Projects fail”
• Inaccurate Project Scope
• Lack of Talent
• Challenging Tools
• Even more Challenging Concepts
• Poor Planning
• Ownership Issues – Business Initiative or IT Project?
Apache Hadoop vs. Apache Cassandra
55% of Big Data projects don’t get completed; for IT projects in general, the figure is only 25%.
THE SUCCESS MANTRA!
• Get a Data Scientist as the PM or SME
• Or, at least get the Data Scientist as one of the leads
• Ask the sponsors very tough questions for a Business Case
• Get the CFO and the End-User on your side – get the expectations right
• Take your time – use appropriate Project Management methodologies
• Use the most appropriate Platform / Tool
• Accept requirements’ volatility
• Scrum: accepting that the problem cannot be fully understood or defined, focusing instead on maximizing the team's ability to deliver quickly
• POC → Deployment → Acceptance → Enhancement → Deployment → …
• Agile: adaptive planning, evolutionary development, early delivery, continuous improvement
• Document and hand over to the End-User at every stage
PM answers “How” as long as the Business knows “Why & What”
QUESTIONS, ANY?