TRANSCRIPT
Data Analysis using Stata
Kristin Bott, kbott@reed.edu
<< Log in to a computer while you’re waiting >>
K. Bott / Instructional Technology Services, Reed College / Portland, OR
blogs.reed.edu/datablog
Intro / K. Bott + Data@Reed
Associate Director, [email protected] / ETC 225
Geospatial / Statistical / Computational software
Data analysis + presentation
Data@Reed: [email protected]/data-at-reed
a very skeletal research process
Question
Data
Results
Conclusions
data@reed / research + data support
Data discovery
Data visualization
Data citation
Data analysis
Data management
Data wrangling
Where we’re headed (<60 min)
• Step zero: how to find and pilot Stata
• Step one: basic data skills
• Step two: external data
• Step three: data analysis
• Help and resources
where to find / how to access
Where you can find Stata
• Around campus:
  – IRCs (ETC)
  – Library
  – Eliot 110 / PPW (social sciences)
  – Psychology computer lab
• “GradPlan” / option for buying your own copy
Accessing Stata
On your computers --> Applications >> StataSE
open Stata!
…and let’s take a look.
The Stata window (pane labels from the screenshot)
Review: What did I do?
Variables: What data am I working with?
(variable) Properties: Data details
Command: Current action is here!
Results: What happened?
• Point + click (GUI) vs command-line
Stata datasets
• Pre-loaded, useful for training / learning
• Type:
. set more off
. sysuse dir
. sysuse auto
(1978 Automobile data)
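The loading step above can be sketched as a few lines for the Do-file Editor. The clear option and the describe, short call are additions for illustration, not part of the original slide; clear simply avoids an error if other data are already in memory.

```stata
* Stop Stata from pausing long output with --more--
set more off
* List the example datasets that ship with Stata
sysuse dir
* Load the 1978 automobile data, discarding anything already in memory
sysuse auto, clear
* Report only the number of observations and variables
describe, short
```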
stata: basic data skills
Data is loaded: now what?
What sort of information do you want to know about your data?
How can we get to this information?
Some things you may want to know
Range of data / outliers
Missing values [How many? How coded?]
Types of variables
Variation of data
(step one: look at your data – data browser)
. browse
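One possible way to answer the "how many missing?" and "what range?" questions from the command line, sketched on the stock auto dataset. The choice of rep78, price, mpg and weight is only for illustration.

```stata
sysuse auto, clear
* How many observations have no repair record?
count if missing(rep78)
* Table of every variable that contains missing values
misstable summarize
* Min and max give a quick look at range and possible outliers
summarize price mpg weight
```

Stata codes numeric missing values as a dot (.), which is how rep78 appears in the data browser.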
first-glance tools
. summarize (whole dataset)
. summarize rep78
  Observations / mean / Std. Dev. / Min / Max
. describe (whole dataset)
. describe [var]
  var name / storage type / display format / value label / var label
. codebook (whole dataset)
. codebook [var]
  type / range / units / unique / missing / mean / std. dev. / percentiles
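As a quick sketch, the three first-glance commands can be run on a single variable from the auto dataset. The detail option on summarize is an addition here, shown because it reports percentiles as well.

```stata
sysuse auto, clear
* Observations, mean, standard deviation, min, max
summarize rep78
* Same, plus percentiles, skewness and kurtosis
summarize price, detail
* Storage type, display format, value label, variable label
describe rep78
* Type, range, unique values, missing count, and more
codebook rep78
```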
first-glance tools
. summarize (whole dataset)
. summarize [var]
  Observations / mean / Std. Dev. / Min / Max
. describe (whole dataset)
. describe rep78
  var name / storage type / display format / value label / var label
. codebook (whole dataset)
. codebook [var]
  type / range / units / unique / missing / mean / std. dev. / percentiles
Variable storage types
• describe or the Variables window shows “storage type”
• Numbers
  – byte, int(eger), long, float, double
  – vary in precision + memory they use (bytes)
• Letters:
  – String
  – str1, str2, … str244
• Question: Why does this matter?
You can’t find the mean of words...
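A small sketch of why storage type matters, using the auto dataset: make is stored as a string and mpg as a number, so summarize has nothing to average for make.

```stata
sysuse auto, clear
* Compare the storage type column for a string and a numeric variable
describe make mpg
* A string variable: summarize reports 0 observations and no mean
summarize make
* A numeric variable: summarize reports the usual statistics
summarize mpg
```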
first-glance tools
. summarize (whole dataset)
. summarize [var]
  Observations / mean / Std. Dev. / Min / Max
. describe (whole dataset)
. describe [var]
  var name / storage type / display format / value label / var label
. codebook (whole dataset)
. codebook rep78
  type / range / units / unique / missing / mean / std. dev. / percentiles
some first-glance tools
. tabulate foreign
  variable / frequency / percent / cumulative %
. tabulate [var1] [var2]
  what does this do?
. tabulate [var2] [var1]
  what does this do?
some first-glance tools
. tabulate [var] (needs one or two variables; there is no whole-dataset form)
  variable / frequency / percent / cumulative %
. tabulate foreign rep78
  what does this do?
. tabulate [var2] [var1]
  what does this do?
some first-glance tools
. tabulate [var] (needs one or two variables; there is no whole-dataset form)
  variable / frequency / percent / cumulative %
. tabulate make rep78
  what does this do?
. tabulate [var2] [var1]
  what does this do?
Hmm… do I have observations for the same make + repair record?
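To make the "what does this do?" questions concrete, here is a sketch of one-way and two-way tables on the auto dataset. The missing and row options are additions for illustration; they are not on the original slides.

```stata
sysuse auto, clear
* One-way table: frequency, percent, cumulative percent
tabulate foreign
* Two-way table: foreign in rows, rep78 in columns
tabulate foreign rep78
* Same counts with rows and columns swapped
tabulate rep78 foreign
* Show missing repair records as their own column
tabulate foreign rep78, missing
* Add row percentages under each count
tabulate foreign rep78, row
```

Without the missing option, the two-way table silently leaves out cars with no repair record.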
kbott’s first-glance toolbox
For dataset or [var] or [var1] [var2]
. summarize
. codebook
. describe
. tabulate
. inspect
. browse
. list
be lazy! – use only a few (unique) letters for commands
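A short sketch of command abbreviation, assuming the auto dataset. Each line pairs an abbreviated command with the full name it stands for; the particular abbreviations shown are ones Stata accepts, chosen here only as examples.

```stata
sysuse auto, clear
* su = summarize
su rep78
* d = describe
d rep78
* tab = tabulate
tab foreign
* l = list
l make mpg in 1/3
* reg = regress
reg mpg weight
```

Spelling commands out in full tends to be easier to read in saved do files, so the short forms fit interactive work best.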
Basics: [browse] subsets of data
. browse if foreign == 1 (equals)
. browse if foreign ~= 1 (not equal)
. browse if foreign != 1 (not equal)
. browse if mpg > 5 & mpg < 20 (& joins multiple conditions)
. browse make mpg in 1/10 (range of observations)
Can also use list to see results in the main (Results) window
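The same if and in qualifiers work with list and count, which print to the Results window instead of opening the browser. A sketch on the auto dataset follows; the thresholds are arbitrary examples.

```stata
sysuse auto, clear
* Foreign cars only
list make mpg if foreign == 1
* First ten observations
list make mpg in 1/10
* How many observations satisfy both conditions?
count if mpg > 5 & mpg < 20
* Missing values sort as larger than any number, so exclude them explicitly
list make rep78 if rep78 >= 4 & !missing(rep78)
```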
Basics: alter data
. sort var (sorts observations from low to high)
. drop var (drop variable, keep rest)
. keep var (keep variable, drop rest)
. replace var = exp (replace the contents of an existing variable)
. generate var = exp (generate new variable)
. egen var = fcn(arguments) (extended generate)
. clear (clears the dataset from memory, does not erase data saved on disk)
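A minimal sketch of these commands in sequence on the auto dataset. The variable names price_k and mean_mpg are hypothetical, made up for this example.

```stata
sysuse auto, clear
* Sort observations from cheapest to most expensive
sort price
* Create a new variable: price in thousands of dollars
generate price_k = price / 1000
* Overwrite the contents of an existing variable
replace price_k = round(price_k, 0.1)
* Extended generate: mean mpg within each foreign/domestic group
egen mean_mpg = mean(mpg), by(foreign)
* Keep only these variables, dropping the rest
keep make price_k mpg mean_mpg foreign
* Drop a single variable, keeping the rest
drop price_k
```

None of this touches the copy of auto stored with Stata; changes live in memory until you save.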
data analysis + visualization
Basics (stock dataset)
• Histograms
. sysuse auto
. hist price, freq
. hist price, freq bin(5)
. hist price, freq bin(15)
. hist price if foreign==1, freq bin(15)
Basics (stock dataset)
• Scatterplots
. sysuse auto
. scatter mpg weight
. scatter mpg weight || lfit mpg weight
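The histogram and scatterplot commands can be extended to compare groups side by side. This is a sketch, not the presenter's own example: the by() and name() options and the graph names are additions chosen for illustration.

```stata
sysuse auto, clear
* Histogram of price with 15 bins, frequencies on the y axis
histogram price, frequency bin(15) name(hist_price, replace)
* One histogram panel per group (domestic, foreign)
histogram price, frequency by(foreign) name(hist_by_origin, replace)
* Scatterplot with a fitted line, one panel per group
twoway (scatter mpg weight) (lfit mpg weight), by(foreign) name(fit_by_origin, replace)
```

Naming each graph keeps earlier ones in memory instead of overwriting the default graph window.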
Analyze: two variables
. correlate mpg weight
. regress mpg weight
Examine relationship by foreign / domestic vehicles
. by foreign: regress mpg weight
Examine relationship for only foreign vehicles
. regress mpg weight if foreign==1
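A sketch of the same analysis with two small additions: bysort, which sorts before running by-group commands (plain by requires the data to be sorted by foreign already), and display, which prints stored results after regress.

```stata
sysuse auto, clear
* Pairwise correlation
correlate mpg weight
* Simple linear regression of mpg on weight
regress mpg weight
* Slope on weight and R-squared from the regression just run
display _b[weight]
display e(r2)
* Separate regression for each group, sorting first
bysort foreign: regress mpg weight
* Foreign vehicles only
regress mpg weight if foreign == 1
```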
Homework #3
additional + specific tools
Homework #3
• Measures of central tendency + dispersion
. summarize
. tabulate
• Visualization
. scatter
• Analysis
. regress
. predict
. help predict
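Since predict is the least self-explanatory command on this list, here is a hedged sketch of how it follows regress, using the auto dataset in place of the homework data. The variable names mpg_hat and mpg_res are hypothetical.

```stata
sysuse auto, clear
* Fit the model first; predict uses the most recent estimates
regress mpg weight
* Fitted values (the default for predict after regress)
predict mpg_hat
* Residuals: observed minus fitted
predict mpg_res, residuals
* Compare observed, fitted and residual values
summarize mpg mpg_hat mpg_res
list make mpg mpg_hat mpg_res in 1/5
```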
Help + additional resources
Help! + additional tools
• Stata Help Menu
  – contents
  – search
  – command
• At the command line
  – help
  – search
  – findit
• External (to Stata) resources
• Track work via do files
Key to collaboration: Do files
Save you time on repetitive tasks
Minimize errors
Store your data analysis process
. doedit or via GUI
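A do file is a plain text file of commands run top to bottom. The sketch below is one possible layout; the file names first_glance.do and first_glance.log are placeholders, and the analysis steps are borrowed from earlier slides.

```stata
* first_glance.do (hypothetical file name)
* Close any log left open by an earlier run, ignoring the error if none is open
capture log close
set more off
* Record commands and output in a plain text log
log using "first_glance.log", replace text
sysuse auto, clear
summarize price mpg weight
tabulate foreign rep78, missing
regress mpg weight
log close
```

Saved under that name, it can be rerun with do first_glance.do, which gives collaborators both the steps and the output.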
Kbott Office Hours by appointment
help documentation + more
D@R site: reed.edu/data-at-reed
[email protected] | ETC 225