Data Scientist Enablement Roadmap 1.0


Post on 15-Jan-2015








  • 1. Data Scientist Enablement Roadmap: Advanced Center of Excellence, Modern Renaissance Corporation, in collaboration with SONO team and others

2. Acknowledgement: We thank our community of committed and passionate volunteers, experts, educators, innovators, benefactors, advisers, advocates and supporters. We are also grateful for the outstanding support and encouragement from the SONO team as well as other organizations such as the Open Courseware Consortium, MIT, IBM, Hortonworks, Stanford University and Caltech.
3. Principles: Philanthropy through Free and Open Education, Knowledge Dissemination and Social Innovation; Synthesize Data Science, Big Data Architecture, Technology Platforms and Systems Engineering for Decision Making; Collaboration, Crowdsourcing and Innovation Diffusion; Emphasis on Knowledge, Skills and Abilities (KSAs) over Abstract Mathematics or Theoretical Profundity. "Principles without programs are platitudes." - George Bernard Shaw
4. Motivation: Industry needs Data Scientists with a versatile background in Machine Learning, Statistics, Big Data Architecture, Advanced Analytics and Evidence-Oriented Systems Engineering. Aspiring Data Scientists and Big Data Engineers also need a well-rounded education, mentorship from experts and practical skills.
5. Goals: Prepare students and practitioners to have a set of T-shaped practical skills emphasizing depth and breadth across a range of relevant disciplines and capabilities in Data/Decision Sciences and Big Data Architecture/Engineering. Make the course delivery easy, engaging and engendering.
6. Data Science Enablement Roadmap 2014: 1 + 3 courses earn a Masters Level Certificate. Courses shown: Ramping up Machine Learning with R; Fast track to Data Science; Modern Data Platforms; Advanced Techniques in Big Data Analytics.
7. Data Science Enablement Roadmap - Future: Possible extensions in future. Courses shown: Data Mining Process, Methodologies and Tools; Advanced Techniques in Big Data Analytics; Ramping with R; Fast track to Data Science; Modern Data Platforms; Machine Learning/AI; Data Visualization.
8. Fast track to Data Science (DSE 400): Introductory course with no prerequisites.
Topics include Algorithms, Statistical Inference, Data Analysis, Model Building, Validation, Calibration, Data at rest and in motion, Causality, Meaning of Data, Data Engineering, Hadoop, R, Machine Learning, Data Mining, Visualization, Applications, Case Studies, and a variety of tools and techniques.
9. Ramping up with R (DSE 501): Prerequisite: DSE 400. Applied Statistics, Machine Learning, Data Mining, Graphing, Analytics and Visualization. Use cases and industry applications.
10. Modern Data Platforms (DSE 502): Prerequisite: DSE 400. Employ Hadoop and the Hadoop ecosystem to enable enterprises to handle the data explosion and derive actionable analytics. MapReduce, Pig, Hive, NoSQL, ZooKeeper etc. Also introduces stream computing with Storm and Kafka. Case studies: Fraud Prevention, Product Recommendation, Epidemic Prediction etc.
11. Modern Data Platforms (contd.)
12. Machine Learning and AI (DSE 503): Prerequisite: DSE 400. Explore and implement Machine Learning algorithms such as Classification, Clustering, Ranking, Recommendation, Neural Networks and Adaptive Learning. Also includes Knowledge Engineering, Expert Systems, Ontologies, NLP and Reasoning.
13. Data Mining Process and Methodologies (DSE 504): Prerequisite: DSE 400. Decision Trees, Regression, Classification, Clustering, Association Rules etc.
14. Data Visualization (DSE 505): Prerequisite: DSE 400. Story Telling/Data Journalism; Data Visualization Methodology; Tools and Techniques: HTML5, d3.js, BIRT, Prefuse.
15. Advanced Techniques for Big Data Analytics (DSE 600): Prerequisite: DSE 400. On Demand Data Integration; Data Virtualization; Enterprise Data Hub; Analytics Dashboards; Big Data Appliances; Analytics as a Service; OpenStack and Savanna; Privacy and Security.
16. Advanced Techniques for Big Data Analytics (contd.)
17. Next Steps: The DSE 2014 stream is set to commence on Jan 19, 2014. For more details, visit the DSE 400 Announcement Page. To enroll for DSE 400, visit the Enrollment Page. This presentation can also be accessed at Data Scientist Enrollment Roadmap 1.0. We welcome thoughts and suggestions. Write to us at
18. References: Data Jujitsu: The Art of Turning Data into Product by DJ Patil; Data Science eBook by Dr. Vincent Granville; Data Scientist: The Sexiest Job of the 21st Century (HBR); Doing Data Science by Rachel Schutt; Data Visualization: a successful design process by Andy Kirk; Disruptive Possibilities: How Big Data Changes Everything by Jeffrey Needham; How to process, analyze and Visualize Data (MIT OCW); Knowledge-based Systems (MIT OCW); Learning from Data (Caltech); Statistical Thinking and Data Analysis (MIT OCW); The Complete Guide to Business Analytics (Collection) by Thomas H. Davenport; Think Bayes, Think Python and Think Stats by Allen Downey. "Learn from the masters" - Johann Wolfgang von Goethe
19. References (contd.): Google white papers on GFS, MapReduce, Bigtable; Real-time Analytics with Big Data - Facebook Case Study; Interactive Data Visualization by Scott Murray; Data, Models and Decisions (MIT OCW); Communicating Data (MIT Sloan School of Management OCW); Unleashing the Power of Hadoop - DBTA Thought Leadership Series; How Educators Can Narrow Big Data Skills Gap - Jeff Bertolucci; Data Visualization with D3.js Cookbook; What is Data Science?; Agile Data Science
20. Thank You
