Getting Started With STATA
Getting Our Feet Wet with STATA
On https://ctools.umich.edu, you will find a dataset containing, among other things, some of the items from the SF-12 Assessments Of Physical And Mental Health from the National Longitudinal Survey of Youth (NLSY).
1) Download the dataset and brief codebook file to the desktop of the computer that you are working on.
2) Double-click the file to open it (it may have opened automatically when you downloaded it).
3) Open a log file on the desktop, or in your IFS space, to save your work.
4) Looking at your STATA program, try to answer the following questions:
a) Where are the different individuals in the dataset?
b) Where are the variables? Try generating a list of variables in the dataset by using the “codebook” command.
c) Where are the SF-12 questions (variables)? Use lookfor SF-12 or scroll through the variables window to find them.
d) How many individuals are represented in this dataset? (Hint: the describe command will help you.)
5) The names of the variables are less than intuitive. Using the rename command, can you rename one of the SF-12 variables so that it has a more intuitive name?
6) According to the codebook, some values of the data actually represent missing data. Using the SF-12 variable that you renamed, can you check that missing values are counted as missing by the program?
7) Try calculating the average score and running a frequency distribution of one of the SF-12 items that you have been working with. What does this tell you?
How do I do this?
It probably opened automatically, but you may have to save it to the desktop, and double-click it to open it.
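If you prefer typing to double-clicking, the dataset can also be opened from the Command window. This is only a minimal sketch, assuming NLSY.dta was saved in Stata's current working directory (that location is an assumption; adjust the path to wherever you saved the file):

```stata
* Show the current working directory so you know where Stata is looking.
pwd

* Load the dataset; "clear" discards any data already in memory.
use "NLSY.dta", clear
```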
How do I do this?
How do I do these things?
This one is easy! Pick one of the questions and type (in lowercase, because Stata commands are case-sensitive):
rename [oldname] [newname]
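As a sketch of what this might look like in practice, assume the NLSY data are already open and that the dataset contains a variable called r0000100. Both r0000100 and sf12_health are hypothetical names used only for illustration; pick a real variable from your Variables window.

```stata
* r0000100 is a hypothetical old name; sf12_health is the new name we chose.
rename r0000100 sf12_health

* Confirm that the variable now appears under its new name.
describe sf12_health
```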
How do I figure this out?
Pick a variable (question) and use the summarize or tabulate command to try to get some information about the average answers to that question.
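For instance, assuming the data are open and the variable you picked is called sf12_health (a hypothetical name; substitute your own), the two commands might be typed like this:

```stata
* Number of observations, mean, standard deviation, minimum, and maximum.
summarize sf12_health

* Frequency distribution; the "missing" option also shows missing values.
tabulate sf12_health, missing
```

If the minimum reported by summarize is a negative number or the maximum is something like 99, that is a hint that non-response codes are still being treated as real answers.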
Download the data set
Select NLSY.dta
Return to Main Slide
These .txt files are the codebooks
Open a log file
Log files are opened and closed with the little button that looks like a “scroll” next to a “stoplight”.
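The same thing can be done by typing commands instead of clicking the button. A minimal sketch, assuming you want a plain-text log in the current working directory (the file name getting_started.log is a hypothetical choice):

```stata
* Start a plain-text log; "replace" overwrites an existing file of that name.
log using "getting_started.log", text replace

* Everything typed from here on, and its output, is recorded in the log.
describe, short

* Close the log when you are done.
log close
```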
Try to answer the following questions
The spreadsheet containing rows of individuals, and columns of the questions they were asked, can be seen by clicking on the browse or edit data buttons.
The questions that were asked in the survey are just above.
You can “lookfor” or “describe” certain variables (questions).
“describe, short” will give you information about the characteristics of the data, including the number of respondents.
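Typed into the Command window, these steps might look like the following sketch. It assumes the NLSY data are already open; nothing here changes the data.

```stata
* Open the Data Browser: one row per individual, one column per variable.
browse

* List every variable with its storage type and label.
describe

* Show only the summary header, including the number of observations.
describe, short

* Search variable names and labels for the text "SF-12".
lookfor SF-12

* Report the number of observations (individuals) directly.
count
```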
Missing Values
This may strike you as a little bit complicated initially, but really, it’s a matter of common sense. Here’s an illustration of the problem:
• The responses to many survey questions are coded in the following way:
– Agree 1
– Neutral 2
– Disagree 3
• Often, survey responses such as “don’t know”, “refused to answer”, “was not interviewed” are assigned a special numeric code indicating the non-response such as 99, -8, -9.
• We will run into problems if we try to get an average value for this variable, because the codes for the missing responses will be averaged in with the codes for the actual responses, so we might get a meaningless average, such as a negative number on a 1-to-3 scale.
• We have to tell the software (using the recode command) that these answers are in fact missing, and should be excluded from our calculations.
• For many data sets, including the NLSY, this has already been done.
• The symbol for missing values in Stata, as in several other statistical packages, is a period (“.”).
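The problem described above can be reproduced with a tiny made-up dataset. This is only an illustrative sketch: the six values and the variable name answer are invented, and -8 and -9 stand in for the non-response codes. Note that clear removes whatever data are in memory, so run it in a fresh session rather than in the middle of your NLSY work.

```stata
* Build a toy dataset: four real answers (1 to 3) and two non-response codes.
clear
input byte answer
1
2
3
2
-8
-9
end

* The non-response codes are averaged in: the mean is -1.5, which is meaningless.
summarize answer

* Tell Stata that -8 and -9 are missing (mvdecode is an alternative command).
recode answer (-8 -9 = .)

* Now only the four real answers are used, and the mean is 2.
summarize answer

* The frequency table shows the two missing values as ".".
tabulate answer, missing
```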