Review of Coursera Data Analysis Course
Jim Thompson
To make sense of my comments…
• Who’s the reviewer
• What is a MOOC
• Overview of the course (through this reviewer’s eyes)
The Reviewer (Who am I?)
Not a professional data analyst:
• Chemist by training
• Develop and commercialize new materials and applications by profession.
Not a data analysis layman:
• Data analysis as a hobby, on and off for 25 years.
• Downloaded R in Jan 2009, used it ever since
“Data Analysts Captivated by R’s Power”
The New York Times, January 2009
http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?pagewanted=all
How I taught myself R
Whatever fancies me at the moment
• No mentor, nor colleague
• Books (> 10 on R), Internet articles, R vignettes
• Learning by doing, mainly work data, for fun not for work.
Because it was a hobby, I lacked discipline in:
• Clean code
• Reporting
• Reproducible research
• Appropriate use of statistical techniques
I tried Open University
• Excellent teachers
• One-hour-long lectures
• Some class homework provided. No grading
• Complete at your own pace
Intro to Programming, Stanford
Don’t have one-hour chunks of time. Nor the discipline.
“The Year of the MOOC”, The New York Times [1]
• A massive open online course (MOOC) is … aimed at large-scale interactive participation and open access via the web. [2]
• www.Udacity.com
• www.edX.org
• www.Coursera.org
[1] http://www.nytimes.com/2012/11/04/education/edlife/massive-open-online-courses-are-multiplying-at-a-rapid-pace.html?pagewanted=all&_r=0
[2] http://en.wikipedia.org/wiki/Massive_open_online_course
Data Analysis by Jeffrey Leek
An applied statistics course focusing on data analysis, not mathematical details. How to:
• Organize and perform analysis
• Interpret results
• Diagnose potential problems
• Write up data analyses
Statistical methods:
Requires a working knowledge of R
How does this work?
• Time-bound (i.e., 6 weeks)
• Plan on 3-10 hrs/wk
• Watch three to five videos a week, 10-15 min long
• Weekly quizzes
• Submit two papers/reports
• Slides, video, R code available for download
• A certificate
Structure the analysis: tips on finding, organizing, and cleaning the data and the code.
Week 1 Week 2
Personal comments: Very useful.
Biggest Benefit I
Exploratory & Inferential: Clustering for exploratory analysis
Week 3 Week 4
Inferential & Predictive Analysis: learned new techniques, best practices
Week 5 Week 6
Advanced Techniques: Good stuff, but I was running out of gas
Week 5 Week 5
Submit Two Reports
1. Inference analysis of mortgage data:
“This analysis considers whether any other variables have an important association with interest rate after taking into account the applicant's FICO score”
2. Predictive modeling using sensors on cell phones:
“Given the output Samsung phone, can we predict whether the owner is sitting, laying, standing, walking, walking up stairs, or walking down stairs.”
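To give a feel for what the first report asks, here is a minimal sketch of "association after taking into account the FICO score": fit interest rate on FICO alone, then on FICO plus one more variable, and compare. It is written in Python rather than the R used in the course, and everything in it is an assumption made for illustration: the ten applicants, the noise-free rule that generates the rates, and the helper names `solve`, `design`, `ols` and `rss`. It is not the course data or the course's method.

```python
# Illustrative sketch only: the numbers below are synthetic, not the course data.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def design(rows):
    """Prepend an intercept column of ones to each row of predictors."""
    return [[1.0] + list(r) for r in rows]

def ols(rows, y):
    """Least-squares coefficients [intercept, slopes...] via the normal equations."""
    X = design(rows)
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

def rss(rows, y, coef):
    """Residual sum of squares of the fitted model."""
    fitted = [sum(c * v for c, v in zip(coef, x)) for x in design(rows)]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Hypothetical applicants: (FICO points above 700, loan term in years).
applicants = [(f, t) for f in (-60, -20, 20, 60, 100) for t in (3, 5)]
# Made-up, noise-free rule for the interest rate so the fit is easy to check.
rate = [10 - 0.08 * f + 1.2 * t for f, t in applicants]

fico_only = [(f,) for f, _ in applicants]
coef_fico = ols(fico_only, rate)   # rate ~ FICO
coef_full = ols(applicants, rate)  # rate ~ FICO + term

rss_fico = rss(fico_only, rate, coef_fico)
rss_full = rss(applicants, rate, coef_full)

print("FICO-only model:", coef_fico, "RSS:", rss_fico)
print("FICO + term model:", coef_full, "RSS:", rss_full)
```

If the residual sum of squares drops noticeably once the extra variable is added, that variable is associated with the rate beyond what FICO explains. A real report would also need noise, uncertainty estimates and a look at confounders.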
Biggest Benefit II
• Submitting mine
• Analyzing others’
Data analysis rubric
• Main text
Does the analysis have an introduction, methods, analysis, and conclusions?
Are figures labeled and referred to by number in the text?
Is the analysis written in grammatically correct English?
Are the names of variables reported in plain language, rather than in coded names?
Does the analysis report the number of samples?
Does the analysis report any missing data or other unusual features?
Does the analysis include a discussion of potential confounders?
Are the statistical models appropriately applied?
Are estimates reported with appropriate units and measures of uncertainty?
Are estimators/predictions appropriately interpreted?
Does the analysis make concrete conclusions?
Does the analysis specify potential problems with the conclusions?
Data analysis rubric
• Figure
Is the figure caption descriptive enough to stand alone?
Does the figure focus on a key issue in the processing/modeling of the data?
Are axes labeled and are the labels large enough to read?
• References
Does the analysis include references for the statistical methods used?
• R script
Can the analysis be reproduced with the code provided?
Final comments
On MOOC
• Thumbs up!
On Data Analysis by Jeffrey Leek
• Thumbs up!
• Target audience: I might be the sweet spot
• Excellent reference (links attached).
• On submitting reports:
  • Learned most by writing the reports and grading others’
NOTE: Intro to R course scheduled for September 2013
Data Analysis by Jeffrey Leek
The Class
• https://www.coursera.org/course/dataanalysis
• https://github.com/jtleek/dataanalysis
The Prof
• http://www.biostat.jhsph.edu/~jleek/
• http://simplystatistics.org/