advanced stata workshop

Advanced Stata Workshop FHSS Research Support Center

Upload: sanjiv

Post on 24-Feb-2016


Page 1: Advanced  Stata  Workshop

Advanced Stata Workshop

FHSS Research Support Center

Page 2: Advanced  Stata  Workshop

Presentation Layout

• Visualization and Graphing• Macros and Looping• Panel and Survey Data• Postestimation

Page 3: Advanced  Stata  Workshop

Visualization and Graphing in Stata

[Figure: Life expectancy at birth vs. GNP per capita. Scatterplot of life expectancy at birth against loggnp, combined with a histogram (Fraction) of each variable. Source: 1998 data from The World Bank Group]

Page 4: Advanced  Stata  Workshop

Intro To Graphing In Stata

[Figure: scatterplot with Mileage (mpg) on the y-axis and Price on the x-axis]

“graph” is often optional. So is “twoway” in this case.

. sysuse auto, clear

. graph twoway scatter mpg weight //Note that you don't need to type graph or twoway

. scatter mpg weight

Note: Nearly all graphing commands start with “graph”, and “twoway” is a large family of graphs.

Page 5: Advanced  Stata  Workshop

Creating Multiple Graphs with “by()”

. twoway scatter mpg weight, by(foreign)

[Figure: scatterplots of Mileage (mpg) against Weight (lbs.) in two panels, Domestic and Foreign; note: Graphs by Car type]

Note that each value label (Domestic, Foreign) is displayed above its graph, and the variable label (Car type) is displayed in the “Graphs by” note at the bottom.
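As a small extension of the by() example, the sketch below adds the total suboption, which appends one more panel for all cars combined. It assumes the auto dataset used above; the total suboption does not appear in the original slides and is shown only for illustration.

```stata
sysuse auto, clear
* One panel per value of foreign, plus a "Total" panel for all cars
twoway scatter mpg weight, by(foreign, total)
```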

Page 6: Advanced  Stata  Workshop

Overlaying “twoway” graphs

The || tells Stata to put the second graph on top of the first one – order matters! You don’t need to type “twoway” twice; it applies to both.

. twoway scatter mpg weight || lfit mpg weight

[Figure: scatterplot of Mileage (mpg) against Weight (lbs.) with the Fitted values line overlaid]

. twoway (scatter mpg weight) (lfit mpg weight)

[Figure: the same graph, Mileage (mpg) against Weight (lbs.) with the Fitted values line overlaid]

This is another way of writing the command – it doesn’t matter which one you use.
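To see why the order of overlaid plots matters, here is a minimal sketch using a shaded confidence band. It assumes the auto dataset; the graph names band_first and band_last are made up for this illustration.

```stata
sysuse auto, clear
* Band drawn first, points on top: every point stays visible
twoway (lfitci mpg weight) (scatter mpg weight), name(band_first, replace)
* Points drawn first, band on top: the shaded band covers some points
twoway (scatter mpg weight) (lfitci mpg weight), name(band_last, replace)
```

Comparing the two windows side by side shows that later plots are drawn over earlier ones.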

Page 7: Advanced  Stata  Workshop

"by()" statements with overlaid graphs

. twoway (qfitci mpg weight, stdf) (scatter mpg weight), by(foreign)

“qfitci” is a type of graph which plots the prediction line from a quadratic regression and adds a confidence interval. The “stdf” option specifies that the confidence interval be created on the basis of the standard error of the forecast.

[Figure: quadratic fit with 95% CI band and scatterplot of Mileage (mpg) against Weight (lbs.) in two panels, Domestic and Foreign; legend: 95% CI, Fitted values, Mileage (mpg); note: Graphs by Car type]

stdf is an option of qfitci. by(foreign) is an option of twoway.

Page 8: Advanced  Stata  Workshop

"by()" statements with overlaid graphs

Another way of writing the previous command is:

. twoway qfitci mpg weight, stdf || scatter mpg weight ||, by(foreign)

[Figure: the same two-panel graph as on the previous slide; note: Graphs by Car type]

So:

. twoway (qfitci mpg weight, stdf) (scatter mpg weight), by(foreign)

This way is easier to read.

. twoway qfitci mpg weight, stdf || scatter mpg weight ||, by(foreign)

This way is easier to type.

Page 9: Advanced  Stata  Workshop

Graphs with Many Options and Overlays

You can make pretty impressive graphs just from code, if you overlay the graphs and specify certain options like: multiple axes, notes, titles and subtitles, axis titles and labels, and legends.

Page 10: Advanced  Stata  Workshop

Code for Previous Graph

. use http://www.stata-press.com/data/r12/uslifeexp, clear

. generate diff = le_wm - le_bm

. label var diff "Difference"

. #delimit ;

. twoway line le_wm year, yaxis(1 2) xaxis(1 2)
> || line le_bm year
> || line diff year
> || lfit diff year
> ||,
> ytitle( "", axis(2) )
> xtitle( "", axis(2) )
> xlabel( 1918, axis(2) )
> ylabel( 0(5)20, axis(2) grid gmin angle(horizontal) )
> ylabel( 0 20(10)80, gmax angle(horizontal) )
> ytitle( "Life expectancy at birth (years)" )
> title( "White and black life expectancy" )
> subtitle( "USA, 1900-1999" )
> note( "Source: National Vital Statistics, Vol 50, No. 6"
> "(1918 dip caused by 1918 Influenza Pandemic)" )
> legend(label(1 "White males") label(2 "Black males") );

. #delimit cr

This may look scary, but it is actually fairly straightforward. See the accompanying do-file for explanation of each component.
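The #delimit ; line lets one command run over many lines until a semicolon is reached. A shorter sketch of the same idea, written for a do-file with /// line continuation instead, might look like this. It assumes the uslifeexp dataset that ships with Stata and keeps only two of the lines and two of the options from the full command.

```stata
sysuse uslifeexp, clear
* Each continued line ends in ///, so no #delimit switch is needed
twoway (line le_wm year) (line le_bm year), ///
    title("White and black life expectancy") ///
    subtitle("USA, 1900-1999") ///
    legend(label(1 "White males") label(2 "Black males"))
```

Either style works in a do-file; /// avoids having to remember to switch the delimiter back with #delimit cr.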

Page 11: Advanced  Stata  Workshop

Using the Graph Editor

. tsline nci abc

It is often easier to make changes in the graph editor than to specify all the options in code.

Let’s make graph 1 into graph 2 by using the graph editor tools.

[Graph 1: default tsline plot of NASDAQ Composite Index and ABC.com, Inc. share price against date, 01oct2009 to 01jul2010]

[Graph 2: the same plot after editing. Title: ABC.com Inc. Closing Share Price vs. Nasdaq Composite Index; subtitle: Sep 24, 2009 - June 7, 2010; y-axis: Share Price (USD), 0 to 16 by 2; x-axis: monthly labels from Oct 1, 2009 to Jun 1, 2010; caption: Source: CRSP, Bloomberg]

Page 12: Advanced  Stata  Workshop

Recording Edits in the Graph Editor

Graph Element     Change

Graph Title       Enter title using quotes to separate lines, color = black

Graph Subtitle    Enter subtitle

Graph Region      Color = bluish-gray

Y-Axis            Range = 0 to 16 by 2, axis line = medium thick, add title, label angle = horizontal, grid lines = off

X-Axis            Title = off, minor ticks = off, suggest # of ticks = 8, alternate spacing of adjacent labels = on, change label format, label size = small, axis line = medium thick

Plot 1            Line color = green, width = thick

Plot 2            Line color = blue, width = thick

Caption           Add caption

Before you start making changes, click the record button. After you are done, click it again, and save your changes as a recording so you can “play” them back later. We will save this recording as advanced_workshop_1.

Page 13: Advanced  Stata  Workshop

Play Your Graph Recording

You can create a graph, open the graph editor, click the green play button, and then play back your recorded edits.

Or, you can play your edits right from the code:

. tsline nci abc, play(advanced_workshop_1)

You can also run all of your recorded edits on a different graph, and just change the title:

. tsline comp_world comp_planet , play(advanced_workshop_1)

[Figure: tsline plot of Computer World share price and Computer Planet share price with the recorded edits applied; the title, subtitle, axes, and caption are those of the ABC.com graph]

You can run your recorded edits on a graph of a different type, though in this case not all of your edits will make sense:

. twoway (scatter nci date) (scatter abc date) ///
> , play(advanced_workshop_1)

[Figure: scatterplots of NASDAQ Composite Index and ABC.com, Inc. share price against date with the recorded edits applied]

Page 14: Advanced  Stata  Workshop

Storing and Moving Your Recordings

Graph recordings are stored as .grec files in your “personal” folder, under the “grec” folder. Type “personal” to see where this is; normally it is C:\ado\personal. So by default Stata should store your .grec files in C:\ado\personal\grec.

. personal
your personal ado-directory is c:\ado\personal\

. dir c:\ado\personal\grec\
   1.3k  11/21/12 10:12  x grid.grec
   0.9k   5/17/12 15:47  line..grec
   0.7k   3/01/12  9:48  jeff_test_recording_graph_edits.grec
   0.4k   2/21/13  9:12  advanced_workshop_1.grec

Unfortunately, if you are not faculty, you are probably using Stata on lab computers, and when they are re-imaged you will lose the files in your grec folder. To avoid this, store the recordings on your flash drive by clicking the Browse button when you save your recording.

When you are in the graph editor and click the play button, a recording saved this way will not appear in the list, because it is not stored where Stata knows to look for it. Never fear: just click Browse and navigate to where your .grec file is.

If you want your recording to be available right from code, as in play(advanced_workshop_1), you will need to move it (at least temporarily) to the “grec” folder, or write the directory location in the code, in quotes because the path contains a space: play("E:\flashdrive\Graph Recordings\advanced_workshop_1")

Page 15: Advanced  Stata  Workshop

Using Schemes in Graphing

Recordings are great if you are going to be making the same kind of graph a lot. But a recording for a scatter plot will hardly affect a histogram at all, and might even make it look terrible. If you want to change the look of all graphs that you make, you may want to make a scheme. Schemes are text files which tell Stata how to draw graphs.

. sysuse uslifeexp2, clear

. scatter le year

. scatter le year, scheme(economist)

[Figure: two scatterplots of life expectancy (40 to 65) against Year (1900 to 1940), one drawn with the default scheme and one with the economist scheme]

Page 16: Advanced  Stata  Workshop

More on Schemes

. graph query, schemes

Available schemes are

    s2color      see help scheme_s2color
    s2mono       see help scheme_s2mono
    s2manual     see help scheme_s2manual
    s2gmanual
    s2gcolor
    s1color      see help scheme_s1color
    s1mono       see help scheme_s1mono
    s1rcolor     see help scheme_s1rcolor
    s1manual     see help scheme_s1manual
    sj           see help scheme_sj
    economist    see help scheme_economist

Schemes are very powerful, because they let you implement a certain look without specifying a long series of options in every graph or running every graph through the graph editor. However, creating schemes is fairly time-consuming.
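As a sketch of applying a scheme to every graph in a session rather than one graph at a time, the lines below use set scheme with one of the built-in schemes listed above. The local macro name oldscheme is made up for this illustration.

```stata
* Remember the scheme currently in use so it can be restored
local oldscheme = c(scheme)
set scheme s1mono
sysuse uslifeexp2, clear
* Drawn with s1mono even though no scheme() option is given
scatter le year
* Put the previous default back
set scheme `oldscheme'
```

Adding the permanently option to set scheme would keep the choice across Stata sessions; it is left out here so the example has no lasting effect.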

For more on creating your own schemes, see:

http://www3.eeg.uminho.pt/economia/nipe/2010_Stata_UGM/papers/Rising.pdf

and

http://www.ats.ucla.edu/stat/stata/seminars/stata_graph/graphsem.txt

Page 17: Advanced  Stata  Workshop

Manipulating Graphs: Memory vs. Disk

When you draw a graph, it is stored in memory under the name Graph.

If you draw another graph, it replaces the previous one in memory and takes over the name Graph.

If you want to have multiple graphs up at the same time, give each one its own name with the name() option.

graph save writes a graph from memory to disk as a .gph file. The graph also stays in memory until you drop it.

graph dir lists all graphs in memory and on disk (in the current directory).

graph drop drops a graph from memory. Graphs store a copy of the data they plot, so if the dataset is large, they can take up quite a bit of memory.

. sysuse auto, clear

. scatter price mpg

. scatter price length

. scatter price mpg, name(scatter1)

. cd C:\Users\nickj22\Downloads\

. graph save scatter1 mygraph1.gph

. graph dir
Graph scatter1 mygraph1.gph

. graph drop scatter1
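The sequence above can be collected into a short do-file sketch that also reloads the saved graph from disk. The names scatter1, scatter2, and mygraph1.gph are only examples, and the file is written to whatever the current directory is.

```stata
sysuse auto, clear
* Two named graphs can be held in memory at the same time
scatter price mpg, name(scatter1, replace)
scatter price length, name(scatter2, replace)
* Write scatter1 to disk in the current directory
graph save scatter1 mygraph1.gph, replace
* Remove it from memory, then bring it back from the .gph file
graph drop scatter1
graph use mygraph1.gph
* List graphs in memory and .gph files in the current directory
graph dir
```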

Page 18: Advanced  Stata  Workshop

Manipulating Graphs Demo

See do file for demo

Page 19: Advanced  Stata  Workshop

More Example Graphs

Note: Annotated code is in the do file for all of these

Histogram, with overlaid normal distribution

[Figure: histograms of average education level (Percent, with bar-height labels and an overlaid normal density), one panel per Census region: NE, N Cntrl, South, West; note: Graphs by Census region]

[Figure: histogram of average education level for all regions (Percent, with bar-height labels). Title: Avg. education level. Source: US Census, 1980 and 1990]

Page 20: Advanced  Stata  Workshop

More Example Graphs

[Figure: Average July and January temperatures by regions of the United States. Bar graph in degrees Fahrenheit with bar labels, July / January: N.E. 73.3 / 27.9, N. Central 73.5 / 21.7, South 81.0 / 46.1, West 72.1 / 46.2. Source: U.S. Census Bureau, U.S. Dept. of Commerce]

Use graph bar to make bar graphs
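A minimal sketch of a bar graph like the one described, assuming it was built from the citytemp dataset that ships with Stata (the slides do not say which dataset was used; the annotated do file has the original code):

```stata
sysuse citytemp, clear
* Mean July and January temperature, one pair of bars per region
graph bar (mean) tempjuly tempjan, over(region) ///
    blabel(bar, format(%4.1f)) ///
    legend(label(1 "July") label(2 "January")) ///
    ytitle("Degrees Fahrenheit") ///
    title("Average July and January temperatures") ///
    subtitle("by regions of the United States") ///
    note("Source: U.S. Census Bureau, U.S. Dept. of Commerce")
```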

Page 21: Advanced  Stata  Workshop

More Example Graphs

[Figure: Life expectancy at birth vs. GNP per capita. Scatterplot of life expectancy at birth against loggnp, combined with a histogram (Fraction) of each variable. Source: 1998 data from The World Bank Group]

Use graph combine to combine 3 graphs into one:
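One way such a combined graph could be assembled is sketched below. It assumes the lifeexp dataset that ships with Stata and a base-10 log of GNP per capita; the graph names yx, hy, and hx are made up for the example, and the annotated do file may differ in its details.

```stata
sysuse lifeexp, clear
generate loggnp = log10(gnppc)
* Build the three pieces in memory without drawing them
scatter lexp loggnp, name(yx, replace) nodraw
histogram lexp, fraction horizontal fxsize(25) name(hy, replace) nodraw
histogram loggnp, fraction fysize(25) name(hx, replace) nodraw
* Lay them out on a 2x2 grid, leaving the third cell empty
graph combine hy yx hx, hole(3) ///
    title("Life expectancy at birth vs. GNP per capita") ///
    note("Source: 1998 data from The World Bank Group")
```

The fxsize() and fysize() options shrink the two histograms so the scatterplot takes up most of the combined graph.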

Page 22: Advanced  Stata  Workshop

More Example Graphs

Graph matrix is a great alternative to a correlation matrix to investigate relationships between variables

[Figure: Correlations among 1998 life-expectancy data. Scatterplot matrix of Avg. annual % growth, Life expectancy at birth, Log GNP per capita, and safewater. Source: The World Bank Group]
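A sketch of how a scatterplot matrix like this can be produced next to the corresponding correlation matrix, assuming the lifeexp dataset that ships with Stata; the variable name lgnppc is made up for the example.

```stata
sysuse lifeexp, clear
generate lgnppc = ln(gnppc)
label variable lgnppc "Log GNP per capita"
* The numeric summary ...
correlate popgrowth lexp lgnppc safewater
* ... and the graphical one for the same four variables
graph matrix popgrowth lexp lgnppc safewater, ///
    title("Correlations among 1998 life-expectancy data") ///
    note("Source: The World Bank Group")
```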

Page 23: Advanced  Stata  Workshop

More Example Graphs

Get data labels (called marker labels in Stata) from the values of another variable

[Figure: Life expectancy vs. GNP per capita, North, Central, and South America. Scatterplot of life expectancy at birth (years) against GNP per capita (thousands of dollars), each marker labeled with its country name (Canada, Mexico, United States, Argentina, Brazil, and others). Data source: World Bank, 1998]
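Marker labels come from the mlabel() option. The sketch below assumes the lifeexp dataset that ships with Stata, and assumes that region values 2 and 3 are North and South America in that file (worth confirming with label list); gnp000 is a made-up variable name.

```stata
sysuse lifeexp, clear
generate gnp000 = gnppc/1000
label variable gnp000 "GNP per capita (thousands of dollars)"
* Assumed coding: region 2 = North America, region 3 = South America
scatter lexp gnp000 if inlist(region, 2, 3), mlabel(country) ///
    title("Life expectancy vs. GNP per capita") ///
    subtitle("North, Central, and South America") ///
    note("Data source: World Bank, 1998")
```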

Page 24: Advanced  Stata  Workshop

More Example Graphs

xtline, run on a panel dataset, can overlay one line for each value of the panel variable. The labels on the x-axis are often a bit off to start with, though, as shown.

[Figure: Calories Consumed by Subject, Jan 1 2002 - Jan 1 2003. Overlaid lines for Tess, Sam, and Arnold; y-axis: Calories consumed (3500 to 5000); x-axis: Date (01jan2002 to 01jan2003)]
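A minimal sketch of an overlaid xtline graph, assuming the xtline1 example dataset from the Stata website (it needs an internet connection, and the slides do not say which dataset was used). The tlabel(#3) option asks for about three labels on the time axis, which is one way to tidy labels that run together.

```stata
webuse xtline1, clear
* Declare the panel variable and the time variable
xtset person day
* One line per person on a single set of axes
xtline calories, overlay tlabel(#3)
```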