![Page 1: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/1.jpg)
STATA I: An Introduction Into the Basics
Prof. Dr. Herbert Brücker
University of Bamberg
Seminar “Migration and the Labour Market”
May 23, 2013
![Page 2: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/2.jpg)
Contents
1 The STATA Software package
2 The Structure of STATA: Three files
3 Getting started
4 The STATA Menus
5 The General Structure of STATA
6 Working with DO FILES
7 Describe your data
8 Running regressions
![Page 3: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/3.jpg)
1 STATA SOFTWARE PACKAGE
An image of the STATA DVD is available in the campus network under:
“\\software\campliz” or: “\\software.uni-bamberg.de\campliz”
Then: Start -> Run (“Ausführen”)
Then: Insert your licence:
Serial number: ….
Code: ….
Authorisation key: ….
![Page 4: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/4.jpg)
2 Structure of STATA: Three files
1. The DATA file (.dta), where your data are stored.
• You can view your data with the DATA BROWSER and edit them with the DATA EDITOR.
2. The DO file (.do), where you write, run and save the commands of a session. It is very useful (i) to organise your data set, (ii) to see what you did in the last session, (iii) to replicate what you did in the last session, and (iv) to exchange work with your collaborators.
• You write and run your commands with the DO FILE EDITOR.
![Page 5: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/5.jpg)
2 Structure of STATA: Three files
3. The LOG file (.log), which records all commands and results of your session. Logging has to be switched on with the log using command; the file is saved when the log is closed. Not often used, but useful if something goes wrong.
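A minimal sketch of how a log can be opened and closed around a few commands. The file name session.log is a made-up example, and the auto data set shipped with STATA only stands in for your own data.

```stata
* Minimal logging sketch; "session.log" is a hypothetical file name
capture log close                       // close a log that may still be open
log using "session.log", text replace   // start recording commands and results
sysuse auto, clear                      // example data set shipped with STATA
summarize price
log close                               // stop recording; the file stays on disk
```

Everything shown in the main window between log using and log close ends up in the text file.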
![Page 6: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/6.jpg)
3 Getting started: the STATA empty window
![Page 7: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/7.jpg)
3 Getting started: The STATA empty window
The main window: shows commands, output and messages which arrive during your session
![Page 8: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/8.jpg)
The command window: here you can type your commands
![Page 9: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/9.jpg)
The variables window: Shows variables of your dataset
![Page 10: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/10.jpg)
The review window reports your previous commands
![Page 11: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/11.jpg)
3 Getting started: the windows after data loading
List of variables
Reports commands (one in this case)
Reports result of commands
![Page 12: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/12.jpg)
3 Getting started
• In principle, you can start your STATA session by (i) loading your data set and (ii) typing your commands in the command window.
• It is however recommended to use the DO FILE EDITOR right from the beginning.
• But let’s look at the STATA menus first.
![Page 13: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/13.jpg)
4 The STATA Menus
• For viewing your data and changing your data by hand you need the DATA BROWSER and the DATA EDITOR.
• For starting and running your DO files you need the DO FILE EDITOR.
• The other menus are not relevant for the beginning.
The data path
The data editor
The data browser
The do file editor
The variables manager
The help menu
![Page 14: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/14.jpg)
4 The STATA Menus: The DATA EDITOR/BROWSER
The difference between the data browser and the data editor is that you can manipulate data in the editor and only view them in the browser.
![Page 15: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/15.jpg)
4 The STATA Menus: The DATA EDITOR/BROWSER
You have two types of variables: NUMERICAL variables (black) and so-called STRING variables (blue) (e.g. text). STATA can identify STRING variables, but you cannot do numerical operations with them.
STRING variable
NUMERICAL variable
![Page 16: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/16.jpg)
4 The STATA Menus: The DATA EDITOR/BROWSER
HINT: You can transfer data, e.g. from an EXCEL file into a STATA file and vice versa, by copy and paste (Ctrl+C, Ctrl+V) in the data editor. But make sure that your EXCEL runs with English settings, otherwise your data might be read as STRING variables by STATA. Of course there are many other ways to transfer data from EXCEL to STATA.
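If pasted numbers do arrive as strings, one possible repair is the destring command. The sketch below assumes that the cause is decimal commas (the slide does not say so explicitly); the variable names wage_str and wage are made up for illustration.

```stata
* Hypothetical example: numbers pasted with decimal commas arrive as strings
clear
input str8 wage_str
"12,5"
"7,25"
"30"
end

* dpcomma tells destring to read the comma as the decimal separator
destring wage_str, generate(wage) dpcomma
list
```

The original string variable is kept, so the result can be checked against it before it is dropped.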
![Page 17: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/17.jpg)
5 The Grammar of STATA
General Structure of STATA commands
[prefix :] command [varlist] [if] [in] [weight] [, options]
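As an illustration of how the pieces of this grammar fit together, here is a small sketch using the auto data set shipped with STATA. The variables price, mpg, weight and foreign belong to that example data set, not to the seminar data.

```stata
sysuse auto, clear

* prefix : command varlist if , options
bysort foreign: summarize price mpg if price > 5000, detail

* command varlist if in weight
summarize price if foreign == 1 in 1/74 [aweight = weight]
```

Every element in square brackets is optional; only the command itself is always required.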
![Page 18: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/18.jpg)
5 General structure of STATA
We will concentrate on:
[prefix :] command [varlist] [if] [in] [weight] [, options]
![Page 19: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/19.jpg)
What do you want to do?
![Page 20: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/20.jpg)
5 General structure of STATA
• There are two types of variables (data):
• numerical variables, e.g.: 0, 1, 501, 0.5, -12 etc.
• string variables, e.g.: no voc train, male, female etc.
• How to deal with the data types:
• Numerical variables: you can do all mathematical operations, e.g. var1 + var2, var1/var2, var1*var2 etc.
• String variables: you have to use quotation marks for identification, e.g.
• replace var1 = 1 if sex == "female"
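The two data types can be tried out on a tiny made-up data set. The values and the names total, ratio and female are only assumptions for illustration.

```stata
* Small made-up data set with two numerical variables and one string variable
clear
input var1 var2 str6 sex
10 2 "female"
 6 3 "male"
 8 4 "female"
end

* Numerical variables: ordinary arithmetic works
gen total = var1 + var2
gen ratio = var1 / var2

* String variables: the value has to be put in quotation marks
gen female = 0
replace female = 1 if sex == "female"
list
```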
![Page 21: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/21.jpg)
6 Working with DO FILES
• The standard approach is to start your work with a DO FILE
• Click on the DO FILE editor button after starting STATA
• Load an existing DO FILE or start a new one
• Start the DO FILE with a command to load your data, e.g.
• use "path\data.dta", clear
or, more specifically, with
• use "C:\Users\Herbert\Documents\STATA\Wagecurve\DE.dta", clear
![Page 22: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/22.jpg)
Open your DO FILE editor
• After starting STATA click on the DO FILE editor button
The do file editor
![Page 23: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/23.jpg)
What does a DO FILE look like?
Commands
Comments describing what you have done, marked with stars (*)
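A short sketch of what such a DO FILE could contain. It is not the file shown on the slide; it uses the auto example data shipped with STATA and shows the comment styles that a DO FILE accepts.

```stata
* example.do: a hypothetical DO FILE
* Lines starting with a star are comments

clear all
sysuse auto, clear        // a comment can also follow a command after two slashes

* Step 1: look at the structure of the data
describe

/* Step 2: descriptive statistics
   (a comment spanning several lines) */
summarize price mpg
```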
![Page 24: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/24.jpg)
The DO FILE menu
Clicking this button runs the entire DO FILE (not recommended)
Clicking this button runs a selection of marked commands (recommended)
Note: STATA stops the DO File execution after the first mistake in your commands. That makes it advisable to proceed step by step.
![Page 25: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/25.jpg)
6 Step 1: Loading your data
• use "C:\Users\Herbert\Documents\STATA\Wagecurve\DE.dta", clear
• The use command loads the data
• "path\DE.dta" tells STATA the path where to find the data and the name of the data file (e.g. DE.dta)
• the clear option after the comma clears the memory, which is needed if you have used other data sets before
• Push the “Execute Selection (DO)” button to run the selected command(s)
• Pushing the “Execute Selection Quietly (RUN)” button runs the selected command(s), or the entire DO FILE if nothing is selected, without showing the output
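The loading step can be tried without the seminar data. In the sketch below a temporary file stands in for "path\DE.dta"; the name basic is an assumption for illustration.

```stata
sysuse auto, clear            // example data shipped with STATA
tempfile basic                // hypothetical temporary file standing in for "path\DE.dta"
save "`basic'"

* Load the saved file again; clear drops the data currently in memory
use "`basic'", clear
describe, short
```

Because tempfile stores the file name in a local macro, the lines have to be run together in one go.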
![Page 26: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/26.jpg)
Loading your data (I/II)
1. Write the command use "path\XXX.dta", clear
2. Mark the line and run the command by clicking the execution button
![Page 27: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/27.jpg)
Loading your data (II/II)
![Page 28: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/28.jpg)
6 Step 2: Manipulating your data (I/VI)
• It is useful to save only a basic data set and generate the variables you need at the beginning of each session. That saves storage space (recommended in case of large data sets)
• Generating DUMMY variables
• Use the gen command, e.g.
• gen D_ed1 = 0
• This creates a variable consisting only of zeros
• Then use the replace command, e.g.
• replace D_ed1 = 1 if ed1 == 1
• This replaces the zeros with 1 if the variable ed1 has a value of 1.
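The two steps can be checked on a few made-up observations. The one-line alternative at the end is an addition that is not on the slide; the name D_ed1_alt is hypothetical.

```stata
* Made-up values for ed1, including one missing value
clear
input ed1
1
0
1
.
end

* Two-step approach: start with zeros, then switch to 1 where ed1 equals 1
gen D_ed1 = 0
replace D_ed1 = 1 if ed1 == 1

* Compact alternative: a logical expression is 1 if true and 0 if false
gen D_ed1_alt = (ed1 == 1)
list
```

Both versions code a missing ed1 as 0, which may or may not be what you want.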
![Page 29: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/29.jpg)
Generating Dummy Variables: DO FILE commands
![Page 30: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/30.jpg)
Generating Dummy Variables: STATA main window
![Page 31: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/31.jpg)
6 Step 2: Manipulating your data (II/VI)
• Another example for generating dummy variables:
• Use the gen command, e.g.
• gen year_1 = 0
• This creates a variable consisting only of zeros
• Then use the replace command, e.g.
• replace year_1 = 1 if year == 1991
• This replaces the zeros with 1 if the year variable has a value of 1991
• Note: The STATA syntax requires a double equals sign (==) whenever you test for a value in an if condition
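A small sketch of the difference between = (assignment) and == (comparison), using made-up years. The tabulate line is an addition that is not on the slide: it creates one dummy per year in a single step, and the stub D_year is a hypothetical name.

```stata
clear
set obs 5
gen year = 1990 + _n                 // 1991, ..., 1995

gen year_1 = 0                       // single = assigns a value
replace year_1 = 1 if year == 1991   // double == compares values

* One dummy per year at once: D_year1, ..., D_year5
tabulate year, generate(D_year)
list
```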
![Page 32: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/32.jpg)
6 Step 2: Manipulating your data (III/VI)
• Creating series of dummy variables if it is too cumbersome to create them individually, e.g. in case of interaction dummies
• Syntax:
• forvalues i = 1/3 {
    forvalues j = 1/4 {
        gen D_ed`i'_ex`j' = D_ed`i' * D_ex`j'
    }
}
• i.e. for each value i = 1,2,3 and each value j = 1,2,3,4 you generate an interaction dummy by multiplying the dummy variables for education and experience (the name D_ed`i'_ex`j' of the new variable is only an example). Take care of the {}!
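The nested loop can be run end to end on an artificial data set. The group variables ed and ex and the names of the interaction dummies are assumptions made for this sketch.

```stata
* Artificial data: 3 education groups and 4 experience groups
clear
set obs 12
gen ed = mod(_n - 1, 3) + 1
gen ex = mod(_n - 1, 4) + 1

* Dummies for each education and each experience group
forvalues i = 1/3 {
    gen D_ed`i' = (ed == `i')
}
forvalues j = 1/4 {
    gen D_ex`j' = (ex == `j')
}

* Interaction dummies: 3 x 4 = 12 new variables
forvalues i = 1/3 {
    forvalues j = 1/4 {
        gen D_ed`i'_ex`j' = D_ed`i' * D_ex`j'
    }
}
describe D_ed*_ex*
```

Note the two different quotation characters around the loop index: a backtick on the left and an apostrophe on the right.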
![Page 33: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/33.jpg)
Generating Dummy Variables: Advanced techniques
![Page 34: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/34.jpg)
Generating Dummy Variables: Advanced techniques
![Page 35: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/35.jpg)
Generating Dummy Variables: Advanced techniques
![Page 36: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/36.jpg)
6 Step 2: Manipulating your data (IV/VI)
• Transforming variables into log variables
• Syntax:
• gen ln_wijt = ln(wijt)
• By using the gen command again you can transform the wage variable wijt into the natural logarithm of the wage by applying the ln() function
![Page 37: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/37.jpg)
Transforming data
![Page 38: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/38.jpg)
6 Step 2: Manipulating your data (V/VI)
• Useful operators and functions in STATA:
• + add
• - subtract
• * multiply
• / divide
• ln() transform into natural log
• exp() transform into exponential value
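The operators and functions listed above can be tried on a few made-up wage values. The variable names other than wijt and ln_wijt are hypothetical.

```stata
clear
input wijt
10
25.5
0
end

gen w_double = wijt * 2          // multiply
gen w_share  = wijt / 100        // divide
gen ln_wijt  = ln(wijt)          // natural log; ln(0) gives a missing value
gen w_back   = exp(ln_wijt)      // exp() reverses ln()
list
```

The missing value for the zero wage is worth remembering: observations with a wage of zero silently drop out of any analysis that uses the log wage.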
![Page 39: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/39.jpg)
6 Step 2: Manipulating your data (VI/VI)
• Check what you have done
• Check your variables for mistakes in the browse mode of the data set
• You can delete a wrong variable by using the drop command, e.g.
• drop ln_wijt
• This simply drops the variable from the data set. Then you can create the correct one.
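A sketch of the drop-and-recreate cycle with two made-up wage values. The "mistake" in the first gen line is deliberate.

```stata
clear
input wijt
10
20
end

gen ln_wijt = wijt            // mistake: the ln() was forgotten
drop ln_wijt                  // remove the wrong variable
gen ln_wijt = ln(wijt)        // create the correct one
list
```

Without the drop, the second gen would stop with an error because a variable of that name already exists.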
![Page 40: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/40.jpg)
6 Step 3: Organize your data with globals
• It is not convenient to work with too many variables, e.g. 200 dummy variables (it is cumbersome to type them all by hand)
• You can define globals, which comprise many variables
• Syntax:
• glo [name of global] [list of variables]
• glo D_i Ded_1 Ded_2 Ded_3
• i.e. the global D_i consists of the variables Ded_1, Ded_2 and Ded_3
• If you want to use the global later you have to type
• $[globalname], i.e. $D_i
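A runnable sketch of defining and using a global. The data are artificial, and global is written out in full here; glo is its abbreviation.

```stata
* Artificial data with three education dummies
clear
set obs 6
gen ed = mod(_n - 1, 3) + 1
forvalues i = 1/3 {
    gen Ded_`i' = (ed == `i')
}

* Collect the dummies in a global and use it in a command
global D_i Ded_1 Ded_2 Ded_3
summarize $D_i
display "$D_i"                // shows which variables the global contains
```

STATA simply replaces $D_i by the stored list of names before it runs the command.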
![Page 41: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/41.jpg)
Creating globals
![Page 42: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/42.jpg)
7 Describe your data (I/II)
• Any econometric analysis requires as a first step that you provide descriptive statistics to the reader. This helps to understand what is going on
• This can easily be done with the sum command
• sum [variable name(s)]
• sum LHijt LFijt wijt ln_wijt
• The sum command creates a table with the basic descriptive statistics, i.e. observations, mean, standard deviation, minimum, maximum
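For trying this out without the seminar data, a sketch with the auto example data shipped with STATA. sum is short for summarize; the detail option and the stored results are additions that go slightly beyond the slide.

```stata
sysuse auto, clear

* Basic table: observations, mean, standard deviation, minimum, maximum
summarize price mpg weight

* More statistics (percentiles, skewness, kurtosis) for one variable
summarize price, detail

* Results of the last summarize can be reused
display "Mean price: " r(mean) "   Median price: " r(p50)
```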
![Page 43: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/43.jpg)
Summary statistics
![Page 44: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/44.jpg)
Summary statistics
![Page 45: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/45.jpg)
7 Describe your data (II/II)
• Present your data graphically
• It is usually helpful to present the main information/variables in your data set graphically
• There are many graphical commands, use the Graphics menu
• The simplest way is to show the development of your variable(s) over time
• Syntax:
• graph twoway line [variable1] [variable2] if …
• graph twoway line wqjt year if ed==1 & ex == 1
• This produces a two-dimensional graph with the wage on the vertical and the year on the horizontal axis for education group 1 and experience group 1
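A sketch of the same kind of line graph on an artificial wage series. The variable wage and its values are made up; only the structure of the command follows the slide.

```stata
* Artificial wage series for one education-experience group
clear
set obs 10
gen year = 1990 + _n
gen ed = 1
gen ex = 1
gen wage = 10 + 0.5 * _n

* First variable on the vertical axis, second on the horizontal axis
sort year
graph twoway line wage year if ed == 1 & ex == 1, ytitle("Wage") xtitle("Year")
```

Sorting by year first matters for line plots, because the points are connected in the order in which they appear in the data.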
![Page 46: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/46.jpg)
Making a graph
![Page 47: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/47.jpg)
Graph of mean wage in education 1 and experience 1 group
![Page 48: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/48.jpg)
Graph of migration rate in edu 1 and exp 1 group
![Page 49: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/49.jpg)
8 Running regressions
• The standard OLS regression command in STATA is regress
• Syntax
• regress depvar [list of indepvars] [if] [, options]
• regress ln_wijt m_ijt D_i D_j D_t
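A sketch of the regress syntax on the auto example data shipped with STATA, including a global for the control variables. The global name controls and the choice of variables are assumptions for illustration.

```stata
sysuse auto, clear

* A global holding the control variables (hypothetical name)
global controls weight length

* Dependent variable first, then the independent variables
regress price mpg $controls

* With an if condition and an option (robust standard errors)
regress price mpg $controls if foreign == 0, robust

* Stored results of the last regression
display "N = " e(N) "   R-squared = " e(r2) "   b[mpg] = " _b[mpg]
```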
![Page 50: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/50.jpg)
8 Running Regressions
Recall: What is a linear regression model?
The general econometric model:
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i$$
yi: the dependent (or: endogenous) variable
x1i, …, xki: exogenous variables, explaining the dependent variable
β0: constant or the y-axis intercept (if x = 0)
β1, β2, …, βk: regression coefficients or parameters of the regression
εi: residual, disturbance term
![Page 51: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/51.jpg)
Running a regression model
Regression command
Dependent variable
Independent variables
Globals!
![Page 52: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/52.jpg)
Running a Regression: Output
![Page 53: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/53.jpg)
How to interpret the output of a regression
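1. Observations
2. Fit of the model
3. F-test
4. R-squared
5. Adjusted R-squared
6. Root mean squared error (Root MSE)
Analysis of significance levels
Variance of model
β0 (_cons)
β1 (mqkt)
Degrees of freedom
95% confidence interval

    . reg ln_wqkt mqkt

          Source |       SS       df       MS              Number of obs =     800
    -------------+------------------------------           F(  1,   798) =  212.53
           Model |  23.4146717     1  23.4146717           Prob > F      =  0.0000
        Residual |  87.9145738   798  .110168639           R-squared     =  0.2103
    -------------+------------------------------           Adj R-squared =  0.2093
           Total |  111.329246   799  .139335727           Root MSE      =  .33192

    ------------------------------------------------------------------------------
         ln_wqkt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            mqkt |  -1.369118    .093913   -14.58   0.000    -1.553464   -1.184772
           _cons |   4.706176    .017403   270.42   0.000     4.672015    4.740337
    ------------------------------------------------------------------------------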
![Page 54: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/54.jpg)
8 Running Regressions: Panel Models
• Very often you use panel models, i.e. models which have a group and a time series dimension
• There exist special estimators for this, e.g. fixed or random effects models
• A fixed effects model is a model where you have a fixed (constant) effect for each individual/group. This is equivalent to a dummy variable for each group
• A random effects model is a model where you have a random effect for each individual/group, which is based on assumptions on the distribution of the individual effects
![Page 55: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/55.jpg)
8 Running Regressions: Panel Models
Preparation for Panel Models:
• For running panel models STATA needs to identify the group (individual) and time series dimension
• Therefore you need an index for each group and an index for each time period
• Then use the tsset command to organize your dataset as a panel data set
• Syntax:
• tsset index year
• where index is the group/individual index and year the time index
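A sketch of the preparation step on an artificial panel with 3 groups and 4 years. The data are made up; only the names index and year follow the slide.

```stata
* Artificial panel: 3 groups observed over 4 years each
clear
set obs 12
gen index = ceil(_n / 4)                 // group index 1, 2, 3
bysort index: gen year = 1990 + _n       // 1991, ..., 1994 within each group

* Each combination of index and year may appear only once
isid index year

* Declare the panel structure: group index first, time index second
tsset index year
```

If a combination of index and year appears twice, tsset stops with an error, so the isid check is a cheap way to find the problem early.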
![Page 56: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/56.jpg)
Preparation: Running the tsset command
![Page 57: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/57.jpg)
8 Running Regressions: Panel Models
• Then you can use panel estimators, e.g. the xtreg estimator
• Syntax
• xtreg depvar [list of indepvars] [if] [, options]
• xtreg ln_wijt m_ijt, fe
• i.e. in the example we run a simple fixed effects panel regression model, which is equivalent to including a dummy variable for each group (in this case education-experience group)
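The equivalence between the fixed effects estimator and a regression with group dummies can be checked on simulated data. Everything in this sketch (the variables ln_w and m, the coefficients, the seed) is made up for illustration.

```stata
* Simulated panel: 4 groups, 10 years each
clear
set seed 12345
set obs 40
gen index = ceil(_n / 10)
bysort index: gen year = 1990 + _n
gen m = runiform()
gen ln_w = 2 + index - 1.5 * m + rnormal(0, 0.1)   // group effect plus noise

tsset index year

* Fixed effects panel regression
xtreg ln_w m, fe
scalar b_fe = _b[m]

* OLS with one dummy per group gives the same slope coefficient
regress ln_w m i.index
display "Difference between the two estimates: " b_fe - _b[m]
```

The displayed difference should be zero up to rounding error; only the way the group effects are reported differs between the two commands.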
![Page 58: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/58.jpg)
Running a Panel Regression: command
![Page 59: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/59.jpg)
Running a Panel Regression: Output
![Page 60: STATA I: An Introduction Into the Basics](https://reader036.vdocument.in/reader036/viewer/2022062314/56812da9550346895d92cd49/html5/thumbnails/60.jpg)
Next Meeting:
June 13, 2013
Room RZ 01.02