getting started with stata how do i do this? it probably opened automatically, but you may have to...

Getting Started With STATA

Getting Our Feet Wet with STATA

On https://ctools.umich.edu, you will find a dataset containing, among other things, some of the items from the SF-12 assessments of physical and mental health from the National Longitudinal Survey of Youth (NLSY).

1) Download the dataset and brief codebook file to the desktop of the computer that you are working on.

2) Double-click on the file to open it (or it may have opened automatically when you downloaded it).

3) Open a log file on the desktop, or in your IFS space, to save your work.

4) Looking at your STATA program, try to answer the following questions:

a) Where are the different individuals in the dataset?

b) Where are the variables? Try generating a list of variables in the dataset by using the “codebook” command.

c) Where are the SF-12 questions (variables)? Use lookfor SF-12 or scroll through the variables window to find them.

d) How many individuals are represented in this dataset? (Hint: the describe command will help you.)

5) The names of the variables are less than intuitive. Using the rename command, can you rename one of the SF-12 variables so that it has a more intuitive name?

6) According to the codebook, some values of the data actually represent missing data. Using the SF-12 variable that you renamed, can you check that missing values are counted as missing by the program?

7) Try calculating the average score and running a frequency distribution of one of the SF-12 items that you have been working with. What does this tell you?


How do I do this?

It probably opened automatically, but you may have to save it to the desktop, and double-click it to open it.

How do I do this?

How do I do these things?

This one is easy! Pick one of the questions and type:

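rename oldname newname

(Stata commands are case-sensitive, so type rename in lowercase. Replace oldname and newname with your own variable names.)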
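Here is a minimal sketch of the rename step, using a tiny made-up dataset so that it can be run without the NLSY file. The variable name h0003400 and the new name sf12_general_health are hypothetical placeholders, not names taken from the NLSY codebook.

```stata
* Build a three-row toy dataset (made-up values, for illustration only)
clear
input id h0003400
1 2
2 3
3 1
end

* Give the cryptic variable a more descriptive name
rename h0003400 sf12_general_health

* Confirm that the new name now appears in the variable list
describe
```

With the real data, the old name would be whichever SF-12 variable you picked from the variables window.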

How do I figure this out?

Pick a variable (question) and use the summarize or tabulate command to try to get some information about the average answers to that question.
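The two commands can be tried on a small made-up dataset, as in the sketch below. The variable name sf12_item and its values are invented for illustration; with the NLSY data you would use the variable you renamed.

```stata
* Toy data: five respondents, one of them with a missing answer
clear
input id sf12_item
1 1
2 2
3 2
4 3
5 .
end

* Mean, standard deviation, minimum and maximum (missing values are skipped)
summarize sf12_item

* Frequency distribution; the missing option adds a row for missing values
tabulate sf12_item, missing
```

summarize answers the "average score" part of the exercise, and tabulate gives the frequency distribution.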

Download the data set

Select NLSY.dta


These .txt files are the codebooks
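If double-clicking does not open the file, the dataset can also be loaded from the Stata command window. The sketch below assumes NLSY.dta was saved to the desktop; the folder path is a placeholder and will differ from computer to computer.

```stata
* Show the current working directory
pwd

* Move to the folder that holds the download (placeholder path; edit it)
cd "~/Desktop"

* Load the dataset, replacing anything already in memory
use "NLSY.dta", clear
```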

Open a log file

Log files are opened and closed with the little button that looks like a “scroll” next to a “stoplight”.
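A log can also be opened and closed by typing commands instead of using the button. In the sketch below, the file name getting_started.log is a hypothetical choice; the file is written to the current working directory unless a full path is given.

```stata
* Start a plain-text log; replace overwrites an existing file of the same name
log using "getting_started.log", text replace

* Anything typed from here on is recorded, for example:
describe, short

* Stop recording and close the file
log close
```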


Try to answer the following questions

The spreadsheet containing rows of individuals, and columns of the questions they were asked, can be seen by clicking on the browse or edit data buttons.

The questions that were asked in the survey are just above.

You can “lookfor” or “describe” certain variables (questions).

“describe, short” will give you information about the characteristics of the data, including the number of respondents.
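The commands mentioned in this part of the exercise can be typed one after another, roughly as follows. This sketch assumes the NLSY data are already open in Stata; it does not load them itself.

```stata
* Open the spreadsheet view (rows = individuals, columns = questions)
browse

* Summary of the dataset, including the number of observations
describe, short

* The number of observations (individuals) on its own
count

* One line per variable: observations, unique values, mean, min, max, label
codebook, compact

* Search variable names and labels for the text SF-12
lookfor SF-12
```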

Missing Values

This may strike you as a little bit complicated initially, but really, it’s a matter of common sense. Here’s an illustration of the problem:

• The responses to many survey questions are coded in the following way:
– Agree 1
– Neutral 2
– Disagree 3

• Often, survey responses such as “don’t know”, “refused to answer”, “was not interviewed” are assigned a special numeric code indicating the non-response such as 99, -8, -9.

• We will run into problems if we try to get an average value for this variable, because the codes for the missing responses will be averaged in with the codes for the actual responses, so we might get a meaningless average, such as a value far outside the 1 to 3 range of the real answers.

• We have to tell the software (using the recode command) that these answers are in fact missing, and should be excluded from our calculations.

• For many data sets, including the NLSY, this has already been done.

• The symbol for missing values in Stata, as in several other statistical packages, is a period (“.”).
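The problem described in the bullets above can be reproduced on a small made-up dataset. In this sketch the variable name answer and the six values are invented for illustration; the non-response codes -8 and 99 are taken from the examples in the bullets.

```stata
* Toy data: four real answers (1 to 3) and two non-response codes
clear
input id answer
1 1
2 2
3 3
4 -8
5 99
6 2
end

* The non-response codes are averaged in: the mean is 16.5
summarize answer

* Tell Stata that -9, -8 and 99 are missing
recode answer (-9 -8 99 = .)

* Only the four real answers are used now: the mean is 2
summarize answer

* Count how many observations are missing on this variable
count if missing(answer)
```

The first mean is far outside the 1 to 3 range of the real answers, which is the warning sign to look for. Stata's mvdecode command (for example, mvdecode answer, mv(-9 -8 99)) should be another way to do the same conversion.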

