national survey of family growth, cycle k · national survey of family growth, cycle 1: sample...
TRANSCRIPT
DATA EVALUATIONI AND METHODS RESEARCH
National Survey ofFamily Growth, Cycle kSample Design,Estimation Procedures, and
VarianceEstimation
●
This report describes the procedures used to select the sample, esti-mate population parameters, and estimate sampling variances forCycle I of the National Survey of Family Growth.
DHEW Publication No, (PHS) 78-1350
Series 2Number 76
U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFAREPublic Health Service
National Center for Health Statistics sHyattsville, Md. January 1978
Library of Congress Cataloging in Publication Data
French, Dwight K.National survey of family growth, Cycle I.(Vital and health statistics; Series 2, Data evaluation and methods research; no 76)
(DHEW publication; (PHS) 78-1350)Includes bibliographical references.Supt. of Dots. no.: HE 20.6209:2/761. Family size–United States–Statistical methods. 2. Fertility, Human. 3. Birth con-
trol–United States. I. Title. II. Series: United States. National Center for Health Sta-tistics. Vital and health statistics: Series 2, Data evaluation and methods research; no. 76.HI. Series, United States Dept. of Health, Education, and Welfare. DHEW publication;(PHS) 78-1350.RA409.U45 no. 76 [HQ766.5.U5] 312’.07’23s [301.42’1]ISBN 0-8406 -0116-6 77-608168
FoL’sale by the Supenntcndent of Documents, L-.S. Govwmmnt Punting officeWashington, D.C. 20}02
Stock X“O. 017-022-00604-!/
NATIONAL CENTER FOR HEALTH STATISTICS
DOROTHY P. RICE, Director
ROBERT A. ISRAEL, Deputy Director
JACOB J. FELDMAN, Ph.D., Associate Director for Andy.sis
GAIL F. FISHER, Associate Director for the Cooperative Health .Stutistics System
ELIJAH L. WHITE, Associate Director for Data Systems
JAMES T. BAIRD, JR., Ph.D., Associate Director for International Statistics
ROBERT C. HUBER, Associate Director for Management
MONROE G. SIRKEN, Ph. D., Associate Director for ~~czthernatical S tatistics
PETER L. HURLEY, Associate Director for Operations
JAMES M. ROBEY, Ph. D., Associate Director for Program Development
PAUL E. LEAVERTON, Ph. D., Associate Director for Research
ALICE HAYWOOD, Information Officer
STATISTICAL METHODS STAFF, DATA SYSTEMS
E. EARL BRYANT, Chief
DIVISION OF VITAL Statistics
JOHN E. PATTERSON, Director
ALICE M. HETZEL, Deputy Director
WILLIAM F. PFuTT, Ph. D., Chiej Family Growth Survey Branch
RITA U. HOFFMAN, Chief Programming Branch
COOPERATION OF THE NATIONAL OPINION RESEARCH CENTER
In accordance with specifications established by the National Center for Hesdth Statistics,the National Opinion Research Center at the University of Chicago, under a contractrsid agree-ment, participated in the design and selection of the sample and carried out the data collection.
Vital and Health Statistics-Series 2-No. 76
. DHEW Publication No. (PHS) 78-1350
Library of Congress Catalog Card Number 77-608168
PREFACE
This report presents a detailed description of the sample design, estimationprocedures, and variance estimation method used in Cycle I of the National Sur-vey of Family Growth. The survey was designed and conducted by the NationalOpinion Research Center (NORC) of the University of Chicago under a contrac-tual arrangement with the National Center for Health Statistics (NCHS). Thesampling plan was developed under the supervision of Martin Frankel and Benja-min King of NORC, in consultation with E. Earl Bryant, Monroe G. Sirken, andWilliam F. Pratt of NCHS.
Much of the report, prepared by Dwight K. French of the Statistical Methodsstaff, is based upon survey specification documents prepared by NORC and uponinternal NCHS memoranda. Dr. Frankel, Dr. King, Mr. Bryant, and Dr. Pratt,along with Drs. Gordon Bonham and Dwight Brock of NCHS, were the primaryresource persons for methodological questions.
In addition to internal review, NCHS policy stipulates that methodologicalreports are to be given a peer review for technical merit and readabilityy by one ormore persons who are familiar with the subject matter area of the report but arenot involved in producing the report. Mr. Garrie Losee of NCHS carried out thepeer review of this report and made many constructive suggestions.
...Ill
CONTENTS
Preface . ... ... . ... .. .. .... .. .. .. ... . ... ... .. .. .. .. .. .. .. . .. ... .. .. .. ... . ... ... .. . ... .. .. . ... ... .. ... .. .. .. .. .. .. .. .. ... .. ... . .............................
Introduction . ... ... .. .. .. ... .. ... .. .. .. .. .. . ... .. .. .. .... .. .. .. .. .. .. ... .. . .... .. .. .. .. .... .. .. .. .... .. .. .. .. .. .. .. .... .. .. .. .. .... ....................
Design Specifications .. .. ... ... .. .. ... .. ... .. .. .... . .. .. ... .. .. .. .. .. . .. . ... ... .. .. .. .. .. .. .... . ... . ... .. .. .. ... .. ... .. . ... .. .. .. .. ... . .. ... . .. .. ..
Sample Design . . .. .. .. ... . ... .. ... .. .. .. .. . ... .. .. .. .. .. .. .. ... ... . ... .. .. .. .. ... ... . ... .. .. .. .. ... . ... .. ... .. . ... .. .. .. .. .... . ... .. .. ................Summary .. . .. .... . .... . ... ... .. .. .. ... .. ... .. .. ... .. .. .. ... .. .. .. .. ... ... .. . ... . ... .. .. .. .. . .. ... .. .. .. .. .. ... . .. .. .. .. .. .. .. .. ...................Determination of Sample Size .. .. ... .. . ... .. .. .. . . .... . .. ... . ... .. .. .. .. .. . ... . ... .. .. .. .. ... ... . .. ... . ... .. .. .. .. .. .. .. . ... .. .... .. . .. .
Stratification and Selection of PSU’S .. . ... .. . .. .. . ... .. .. .... .... . .. ... .. .. .. ... ... . ... .. .. .. .. .. .... ... ... .. .. .. .. .. .. .. .. ... .. ... .Selection of Second-Stage Units . ... ... . .. .. .. .. .. .. ... . .. .. . .. ... .. .. .. ... . .. . . .. .. .. .. .. .. .. .. .. .... .. .. .. .. .. .. . .. ... . ... .. .. .. .. ... .Selection of Third-Stage Units .. .. . .. .. ... .. .. .. . ... .. .. ... . ... .. . .... .. .. .. .. .. . .. .. .. ... .. .. .. .. ... ... . ... .. ... .. . .... .. . .. .. ... .. .. ..Selection of Fourth-Stage DwelHng Units .. .. ... . .. .. .. ... .. ... .. .. .. .. .... . ... .. ... . .. .... . .. ... ... .. . ... .. ... . ... .. ... . .. .. ... ..Fifth-Stage Selection of Sample Persons ... .. ... .. .. ... . .. . .. ... .. .. .. ... . .. . .. ... ... . .. ... ... . ... .. ... . ... ... .. ... . .. ... . .. . .. ... .
Characteristics of the Sample .. ... .. ... .. . .... .. ... .. . .. .... .. ... ... . ... .. .. .. ... .. ... .. .. ... .. .. . ... ... .. .. .. ... . .. .. .. ... .. ... . .. ... .. .. ... .
Estimation ... .. ... .. .. .. .... . .. ... ... ... ... . .... . ... .. . .... .. .. ... . ... .. ... .. .. ... .. . ... .. ... .. .. ... .. .. .. ... .. .... ... .. . .. .. .. .. ... . .. ...... ...........Weighting Procedure .. ... ... .. ... .. ... .. .. .. .. ... ... .. .. .. .. .. .. .. ... ... .. .. .. .. .. .... . . .. . ... .. ... . .. ... ... . ... .. .. ... .. ... .. .. . ... .. .. .. ...Estimating Equation ... .. .. ... ... .. .. ... .. ... .. . .. .... . .... . ... .. . .. . .. ... .. .. ... .. ... . .. ... .. .. .. . .. . ... ... . .... . .. .... .... .. .. .. .. .. ... . ...
Variance Estimation .. ... .. .. .. ... . ... ... .. ... .. .. ... .. .... .. ... . .... .. .. .. .. .. ... .. .. ... .. .. .. .. ... . .... . . .. .. .. .. .. .. .. .. .. .... .. .. ... . .. . .......Background .. .. .. .. .. .. .... .. . .. ... .. .. .. . ... .. .. .. .. . .. . ... ... .. .. .. .. .. .... . ... .. .. .. .. .. .. .. . . .... .. . ... .. .. .. . . .... . ... .. .. .. .. ...............Summary of Applicable Theory . .. ... ... .... . .. .. ... .. ... .. .. ... .. ... . ... .. . ... .. . ... ... ... ... .. ... . .. . .. ... .. .. .. ... . ... .. . .. . .. . .. .. .Application to the NSFG .... .. .. .. ... . .. .. .. .. .. .. .. ... ... .... .. .. .. ... .. . .. .... .. . ... .. ... . .. .. .. .. .. .. .. .. .. .. .... .... . ... .. .. .. .. .. .. . .
References . ... .. .. .. .... .. .. .. .. .. . ... .. .. .. . ... .. .. .. .. .... .. .. .. ... ... .. . .. ... . ... .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .... ........................
AppendixesL Glossary of Terms . .. ... ... .. .. ... .. .. ... .. ... .. .. .. .. ... . .. .. .. .. ... . ... .. ... .. .. ... . .. .. .. .. .. .. .. .. .. .. .. .... .. .. .. .. ... . .. ... ... . . .II. Evaluation of Alternative Estimators .. .. ... .. .. .. ... .. .. .. ..... . .. .. .. .. ... .... ... .. ... . ... ... .. .. .. ... .. ... . .. . .... .. .. ... .III. Household Screener .. .... . ... .. ... .. ... .. .. ... . .... . ... .. ... .. ... .. .. .. .... . ... .. ... . .. ... ... .. .. .. ... . .. .. .. .. .. .. ... . .. .. .. . ... .. ..
LIST OF FIGURES
1.
2.
3.
4.
.
A.
B.
Geographic divisions of the United States, and their order for the first-stage sample .... .. . ... .. ... . ... ...
Relative standard errors for aggregates of women, by race . .... .. . .. .. .. .. .. .... . ... .. ... . ... ... .... ... . . .. . ... .. . ... ..
Relative standard errors for percent of total and white women (base of percent shown in curvein thousands) .. ... . .. ... ... . ... ... .. ... ... . ... .. . ..... . .. .. .. ... .. ... . .. ... ... ... ... . ... .. .. .. .. .. .. .. .. .. .. ... . .. . ... .. ... . .. ... . .. ........
Relative s~andard errors for mean numbers of births expected by women of afl races and whitewomen .. .... .. .. .. .. ... ... .. .. .. .. .... .... .. .. .. .. .. .. .. .. .. .. .. .. .. .... .. . ... .. ... . .. .. .. .. ... . .. .... .. .. .. .. .. .. .. .... .. .. .. .................
LIST OF TEXT TABLES
Standard error tolerances of estimated proportions for selected population subgroups .. .. .. ... .. .. .. ...
Sample PSIJ’S for the National Survey of Family Growth ... .. .. .. .. .... .. .. .. .. .. .. .. ... .. .. ... .. .. .. .. ... . .. ... . .. ...
...m
1
1
3333789
10
11
131314
16161617
24
26
2829
4
1818
21
22
2
5
v
C. Ordering of States for sampling from the National Opinion Research Center non-SMSA frame .. .. .
D. Example of sampling table on the Household Screener ................. .............. .................................. .
E. Actual and expected number of sample DU’S, number of completed screeners, and screener com-pletion rates, by race and Stratum ........................... ............................ ........................................
F. Number of sample women, actual and expected number of completed interviews, and interviewcompletion rates, by race and Stratum ................ ................ ............ ................................ ............
G. Screener and interview completion rates and combined response rates, by Stratum .. .. . ... ... . .. .. .. .. .. .
H. Comparison of standard error tolerances with the corresponding values obtained from the Na-tional Survey of Family Growth (NSFG) .....................................................................................
J. Poststratification adjustment factors (ratio of September 1973 population control totals basedon Current Population Survey data to National Survey of Family Growth weighted estimates),hy race and age .......................... ...................... .......................................... ............ ...... ........ ........
K. Example of how to determine A& from PSU response data . ... .. .... . .... . ... . .. ... .. . .... .. .. . .. .... .. ... . ... ... .. .. .
L. Example of a half-sample replication pattern .... .. .. .. .... .. .... .. .. .. .. ... ... .. .. .. ... . ... ... .. .. .. .. .... .. .. .. .. ... ... ... . .
M. Estimates of A and B for relative standard error curves, by race ........................................ .............
SYMBOLS
Data not available ---------------------------------------- ---
Category not applicable -------------------------------- . . .
Quantity zero ---------------------------------------------- -
Quantity more than O but less than 0.05 ------ 0.0
Figure does not meet standards of
reliability or precision (more than
30 percent relative standard error) ----------- *
6
11
12
12
13
13
14
16
17
19
e
vi
NATIONAL SURVEY OF FAMILY GROWTH, CYCLE 1:
SAMPLE DESIGN, ESTIMATION PROCEDURES,
AND VARIANCE ESTIMATIONDwight K. French, Statistical Methods Staff
INTRODUCTION
The primary mission of the National Centerfor Health Statistics (NCHS) is to collect andpublish statistics relating to the health of theU.S. population. In carrying out its mission,NCHS collects data on vital events registered inthe united States, conducts inventories of healthfacilities and manpower, and conducts probabil-ity sample surveys based on household inter-views, health examinations, and medical records.Dtita collection programs are supplemented byresearch projects which investigate new tech-niques of data collection and evaluate currentlyoperating programs.
In response to the need for current informa-tion on the interrelated topics of fertility,family planning, and their effects on populationgrowth, the National Survey of Family Growth(NSFG) was established as an integral part ofthe NCHS program in 1972. Since the purposeof the survey was to collect data relating tonatality and the process of family formation anddissolution, it was placed within tlie Division ofVital Statistics. The NSFG is designed as acyclic survey; that is, data are to be collectedevery few years by means of a sample survey.
The target population for Cycle I of theNSFG consisted of civilian noninstitutionalized~vomen living in the conterminom UnitedStates who were less than 45 years of age andwho were currently married, previously married,or single mothers with children living in thehousehold at the time of interview. Data werecollected by means of personal interviews witha probability sample of these women. The inter-
views furnished information for determiningtrends and differentials in fertility, family plan-ning practices, sources of family planning adviceand services, effectiveness and acceptability ofvarious methods of family planning, and aspectsof maternal and child health that are closelyrelated to family planning.
The sample design and data collection forthe first cycle of the NSFG were contracted tothe National Opinion Research Center (NORC)of the University of Chicago. The sample con-sisted of 10,862 eligible women, of whom 9,800(90.2 percent) were interviewed during the 8-month period from July 1973 through Febru-ary 1974. Other reports will discuss the findingsof the survey. This report describes in detail thesample design and sample selection proceduresused in Cycle I, the techniques used to estimate -population parameters, and the procedure usedto estimate sampling variances.
DESiGN SPECIFICATIONS
The development of an efficient Sarilple
design must take into account the primary sur-vey objectives, amount of funds available, logis-tical problems, time limitations, organized spec-ulation concerning population parameters andunit operating costs. These requirements dic-tated a stratified multistage probability sampledesign for the NSFG, based essentially on thefollowing set of specifications:
1, The target population was defined to becivilian noninstitutionalized women liv-ing in the conterminous United States
1
2.
3.
4.
5.
who were less than 45 years of age andwho were either(a) currently married,(b) previously married, or(c) single mothers with one or more of 6“
the children born to them currentlyliving in the household. 7.
The sample would consist of approxi-mately 10,000 women, selected from aninitial probabilityy- sample of households.Trained field staff were to conduct ascreening interview with a responsiblemember of each sample household todetermine if there were any eligiblewomen (the screener questionnaire is re-produced in appendix III). When ahousehold contained one eligible woman, 8.she was included in the sample. In house-holds with more than one eligiblewoman, the staff member would ran-domly select one woman for the sample.
9.Data were to be collected from thesample women by means of personalinterviews lasting an average of 1 hour.
AH interviewers were required to befemale.
The interviewer would collect informa-tion on fertility, family planning prac-
tices, sources of family planning services,and related maternal and child healthpractices.
The fieldwork would be completed inapproximately 6 months.
The target interview rate for the totalsample and both major subsamples byrace was 90 percent of the expectednumber of women from all samplehouseholds (i.e., screener and interviewnonresponse combined would ideally beno more than 10 percent). Neitherscreener nor interview response was sup-posed to fall below a minimum rate of90 percent.
The contractor would design and imple-ment procedures to measure and controlthe quality of data collection and datapreparation.
For the population subgroups shown intable A, the sample design should yieldestimated proportions whose standarderrors are within the tolerances given inthe rightmost column. The toleranceswere based on a sample of 3,600 Negrowomen and 6,400 women of other races,and a design factor of approximately 1.4
Tabla A. Standard error tolerances of estimated proportions fors
(that is, it~as assumed ~hat the st~dard
Population subgroup
Negro.
Education:Less than high school .. ... . ... . .... . ... .. .. .. .. .. .. .. .... . ... .. .. .. .. ... .. ..
High school and more ... . ... .. .. .. .. .. ... .. .. .. ... .. . ... .. .. .. ... . ... .. .. ..
Parity:O-2 children .. .. . ... .. .. .. .. .. .. .. .. . . .... .. ... . .. .. .. ... . .. .. .. .. .. .. .. .. .. .. .. ..
3 children or more .. .. .. .. ... .. ... . . .... .. .. .. .. ... .. .. .. .. ... .. .. .. .. ... . .. .
Other races
Education:Less than high school ... ... .... .. .. .. .. .. .. ... .. ... .. .. .. . ... .. .. .. .. ... .. ..High school and more ... .. .. .. .. .. .. .. ... ... .. .. .. .. .. ... . .. .. ... . .. .... . ..
Parity:O-2 children .. .. .. .. .. ... .. .. ... . .. ... .. .. .. .. .. ... . .. .... .. .. .. . ... . .... . ... ... ..
3 children or more ... .. .. .. .. .. .. .. .. .. .... .. .. .. .. .. .. .. .. ... . .. . .. .. ... .. ..
Expected proportionof sample women in
the race group whoare in the subgroup
.48
.52
.50
.50
.30
.70
.59
.41
?cted population subgn
Estimated proportion
within the subgroup
.42
.25
,19,51
.23
.15
.10
.33
Standard errortolerance for
estimated proportionas specified by NCHS
.0164
.0139
.0128
.0163
.0133.0074
.0068
.0127
2
error of statistics based on the NSFGdesign would be about 40 percent largerthan their standard errors based on asimple random sample of the same size).
SAMPLE DESIGN
Summary
The sample design for Cycle I of the NationalSurvey of Family Growth (NSFG) was a five-stage probability design based on the NationalOpinion Research Center’s (NORC) 1972Master Probability SampIe.l The counties andindependent cities that make up the total land
area of the conterminous United States werecombined to form a frame of primary samplingunits (PSU’S). From this frame, the first stage ofsampling yielded 203 PSU’S which were dividedinto four replicate groups, three containing 51PSU’S and the fourth having the remaining 50.Two of these replicate groups, containing 101PSU’S, were chosen for the NSFG sample. Thenext two stages resulted in the selection of sev-eral segments (clusters of about 100 dwellingunits) from each sample PSU. A member of theNORC field staff listed the dwelling units(DU’s) within each segment, and a fourth-stagesystematic sample of DU’S was selected. At eachsample DU, an NORC interviewer attempted tocomplete the Household Screener questionnaireshown in appendix III and list the names of alleligible respondents. From this list one eligiblewoman was randomly selected for interviewing.
Determination of Sample Size
After the NSFG contract was awarded toNORC, they agreed to design the sample to pro-duce the race allocation that NCHS suggested intheir precision requirements—3,600 Negrowomen and 6,400 women of other races. Oncethe allocation was fixed, NORC proceeded tocalculate the number of sample dwelling unitsthat were needed to produce the final sampleof women, They started by collecting 1970 cen-
sus information on the number of occupiedDU’S with Negro heads and the number withheads of other races, as well as population countsof eligible women in the two race classes. The
ratio of occupied DU’S to eligible women forboth race groups was adjusted to account forthe following three factors:
1.
2.
3.
1970 census data indicated that 8 per-cent of all DU’S in the United Stateswere vacant.
Data from the NCHS Health InterviewSurvey indicated that 5 percent of alleligible women lived in households con-taining two or more eligible women.
Combined screener and interview nonre-sponse was expected to be 10 percent.
The adjusted DU-person ratios representedthe expected number of DU’S that would haveto be screened in order to find and interview onesample woman in each race class. By multiply-ing the final ratios by the desired number ofsample persons, NORC calculated the expectedminimum number of DU’S that wouId need tobe screened to yield 3,600 completed interviewswith Negro women and 6,400 completed inter-views with women of other races—9 ,141 NegroDU’S and 18,091 DU’S of other races, or a totalof 27,232 DU’S.
These minimum numbers of DU’S, however,do not represent the actual number that were re-quired to be screened for the NSFG. In areaswhere a large proportion of the population wasNegro, DU’S were oversampled to attain the re-quired number of Negro women. In order tokeep the sample essentially self-weighting forwomen of other races, DU’S of other races weresubsampled in these areas. However, it wasnecessary to screen all DU’S in order to deter-mine their race. This subsampling procedure in-creased the number required to be screenedfrom the minimum 27,232 to 31,842.
Stratification and Selection of PSU’S
The PSU’S in the NORC master sample wereselected from separate sampling frames of stand-ard metropolitan statistical areas (SMSA’S) andnonmetropolitan areas in the conterminousUnited States. The SMSA frame consisted ofthe 246 SMSA’S as defined by the U.S. Bureau
of the Census in March 1971. The frame was
3
ordered in the following manner, based on ad-vance 1970 census population data:
1.
2.
3.
The SMSA’S were first sorted by the ninecensus geographic divisions as shown infigure 1.
Within each geographic division, theSMSA’S were sorted into three sizeclasses: 1,000,000 persons or more,200,000 to 999,999 persons, and lessthan 200,000 persons.
In divisions 1, 2, 3, 4, 8, and 9, theSMSA’S in each size class were sorted byState, with the States placed in geo-graphic order from northwest to north-east to southeast to southwest. Withineach State the SMSA’S were ordered inthe same way. In divisions 5, 6, and 7,the SMSA’S in each size class were placedin descending order of the number of
residents of races other than white, inorder to increase the likelihood of select-ing an appropriate number of southernSMSA’S with large Negro populations.
The total population of the SMSA framebased on preliminary 1970 census data was138,789,636. After the ordering was completed,the frame was divided into 139 sequential zones,each having 1 million population (the 139thzone contained 789,636 persons and 310,364“blanks”). In each zone the numbers from 1 to1 million were assimed in ordered intervals tothe SMSA’S that were totally or partially in-cluded, each SMSA receiving an interval equalto its population within “the zone. A “hitnumber” between 1 and 1 million was randomlyand independently selected for each zone, andthe SMSA whose assigned interval included thatnumber was the PSU selected to represent thezone. The selection of a hit number was not
Figure 1. Geographic divisions of the United States, and their order for the first-stage sample
....,”, ,-
f’; ‘.
‘“WI!-----+”
.,.,,..;:.,*A.,... .Z!yll : -;.. .. ,:,. . ... ... , .:
,., ,;, .,. p
,,,,
,-] WI??’!{<..,,~;,fi:.<.:::‘ : ‘ ,.:.”;: ,,.; ,; ...?,,,,~: .< ,. i-)[ - F ‘“” ‘:” ‘,,.......y;>,;;,!2 ,, ,,
,:,,, ‘. *$- %J v..... ...‘%A+IONAL CENTER FOR I+EALTH STATISTICS
: u@ted foamtheU.S.8ur,f” of $. Cmau, : **—
necessary for zones that were completely cov-ered by one SMSA. This selection procedurediffers somewhat from classical sampling withprobability proportionate to size in that SMSA’Sthat overlapped zones could be selected morethan once, and very large SMSA’S had to beselected several times. For example, the NewYork City SMSA represented 11 zones, all ofwhich it covered completely. It was not selectedto represent two other zones that it did notcover completely.
After the PSU’S were selected, the 139 zones
were systematically separated into four groups(zones 1, 5, 9, . . .. 137 constituted group 1;zones 2, 6, 10 ,. ... 138 constituted group 2;etc. ). Groups 1 and 3 were randomly selected,and the SMSA’S that represented the zones inthose two groups became part of the NSFGsample. The remaining two groups were held inreserve, but were not used in the NSFG. Thesample of the SMSA population consisted of 70Psu’s, containing 56 distinct SMSA’S (seetable B).
All areas in the United States that were not
Table B. Sample PSU’S for the National Survey of Family Growth
SMSA’S
Akron, OhioAllentown-Beth lehem-Easton, Pa.-N .J.Appleton-Oshkosh, Wis.Atlanta, Ga,Baltimore, Md.Birmingham, Ala,Boston, Mass.Buffalo, N.Y.Chicago, I Il. (represented by 4 PSU’S)Clevelend, OhioColumbie, S.C.Columbus, OhioDallas, Tex,Denver, Colo.Des Moines, IowaDetroit, Mich. (represented by 2 PSU’S)Flint, Mich.Fresno, Calif.Gary-Hammond-E, Chicago, Ind.Great Falls, Mont.Hartford, Corm.Houston, Tex.Indianapolis, Ind.Jersay City, N.J,Knoxville, Term.Little Rock-N. Little Rock, Ark.Los Angales-Long Beach, Calif. (represented by 3 PSU’S)
Miami, Fla,Minneapolis-St, Paul, Minn.Nashville, Term,Naw Britain, Corm.New Orleans, La.New York, N,Y. (rapresented by 5 PSU’S)Newark, N,J. (rapresented by 2 PSU’S)Norfolk-Virginia Beach-Portsmouth, Va.Oklehoma City, Okla.Orlando, Fla.Pensacola, Fla.Philadelphia, Pa,-N.J, (represented by 2 PSU’S)Phoenix, Ariz,Pittsburgh, Pa,Portland, Oreg.-Wash.Racine, Wis.Reading, Pa.Rochester, N .Y,St. Louis, Mo.-l Il. (represented bv 2 PSU’S)San Bernardino-Rivarsida-Ontario, Calif.
San Diego, Calif.San FranciscoQakland, Calif. (represented by 2 PSU’S)Santa Rosa, Calif.Springfield-Chicopee-Holyoke, Mass.-Conn.Stamford, Corm.Tacoma, Wash.Topeka, Kan.Washington, D.C.Wichita Falls, Tex.
Non-SMSA PSU’S
Auglaize Co., OhioBurnett Co.-Washburn Co., Wis.Carroll Co,, Mo.Columbiana Co., OhioColusa Co., Calif.Florence Co., S.C.Gogebic Co., Mich.Houston Co., Ala.Iredell Co., N.C.Knox Co., OhioLincoln Co.-Rock Co., Minn.Long Branch-Asbury Park, N.J.Madison Co., S.C.Manitowoc Co,, Wis.Marion Co., W. Va.Marquatte Co,, Mich.
McDowell Co., W. Va.Meklenburg Co,, Va.Mesa Co., Colo.Newaygo Co., Mich.St. Lucie Co., Fla.Schuylkill Co., Pa.Smyth Co., Va.Sumter Co., Ga.Sussex Co., N .J.Upshur Co., W. Va.Victoria Co., Tex.Washington Co., Ala.Wilkes Co., N.C.Willacy Co., Tex.Yuba Co., Calif.
5
included in the SMSA frame. can be classifiedeither as counties, independent cities, or, in NewEngland, portions of counties (for convenience,all such areas will hereafter be referred to as“counties”). In certain instances, sparsely popu-lated counties were linked to provide an ade-quate population base for later stages of sam-pling. The non-SMSA frame consisted of theconterminous United States minus the 246SMSA’S, with the individual and linked countiesordered as follows:
1. Counties from census divisions 1 and 2with 50,000 persons or more, arrangedin descending order of population (DOP).
2. Counties from divisions 3 and 4 with60,000 persons or more, arranged inDOP within State. The order of Statesfor divisions 3 and 4 is given in table C.
3. Counties from divisions 3 and 4 withpopulations between 30,000 and 59,999,where the proportion of the populationliving in urban areas (as defined by theU.S. Census Bureau) “was greater thanor equal to 40 percent, arranged in DOPwithin State.
4. Counties from divisions 3 and 4 withpopulations less than 30,000, where theurban proportion was greater than orequal to 50 percent, arranged in DOPwithin State.
5. Counties from divisions 5, 6, and 7 withpopulations greater than or equal to30,000, where the urban proportion was30 percent or greater and the proportionof the population that was Negro wasless than 20 percent, arranged in DOPwithin State. The order of the States isgiven in tabIe C.
6. Counties from divisions 5, 6, and 7 with
populations greater than or equal to30,000, where the urban proportion was30 percent or greater and the Negro pro-
portion was 20 percent or greater,arranged in DOP within State.
7. Counties from division 8 arranged inDOP.
8.
9,
10.
Counties from division 9 arranged inDOP.
Counties from divisions 1 and 2 withpopulations less than 50,000 arrangedin DOP,
Counties from divisions 3 and 4 withpopulations between 30,000 and 59,999,where the urban proportion was lessthan 40 percent, arranged in DOP withinState.
11. Counties from divisions 3 and 4 withpopulations less than 30,000, where theurban proportion was less than 50 per-cent, arranged in DOP within State.
12. Counties from divisions 5, 6, and 7 with
13.
14.
populations greater than or equal to30,000, where the urban proportion wasless than 30 percent and the Negro pro-portion was less than 20 percent, ar-ranged in DOP within State.
Counties from divisions 5, 6, and 7 withpopulations less than 30,000, where theNegro proportion was less than 20 per-cent, arranged in DOP within State.
Counties from divisions 5, 6, and 7 withpopulations greater than or equal to30,000, where the urban proportion wasless than 30 percent and the Negro pro-portion was 20 percent or greater, ar-ranged in DOP within State.
Table C. Ordering of States for sampling from the NationalOpinion Research Center non-SMSA frame
Area I Order of States
North Central Region
Divisions 3 and 4 ............
South Region
Ohio, Michigan, Wisconsin, Minne.sotar Indiana, I Ilinois, Missouri,Iowa, Kansas, Nebraska, North Da-kota, South Dakota
Delaware, Maryland, West Virginia,Virginia, Kentucky, Tennessee,North Carolina, South Carolinar
Divisions 5, 6, and 7 ..... ..
Georgia, Alabama, Mississippi,Louisiana, Arkansas, Floride,Texes,Oklahoma.
6
15. Counties from divisions 5, 6, and 7 with Selection of Second-Stage Unitspopulations less than 30,000, where theNegro proportion was 20 percent orgreater, arranged in DOP within State.
The total population of the non-SMSAframe based on preliminary 1970 census countswas 63,456,729. This frame was partitioned inthe same way as the SMSA frame into 64 zonesof 1 miIIion persons (zone 64 contained 543,271“blanks”). The method of selecting a county torepresent each zone was exactly the samemethod used for SMSA’S; however, the methodof determining the primary and reserve PSU’Swas different. The 64 zones were divided into16 sets of 4 (zones 1-4 formed the first set,zones 5-8 the second set, and so forth). Each setof numbers was randomly permuted, and thecounties representing the first two zones in thesequence were assigned to the NSFG sample.For example, the permutation of zones 1-4 was3, 1, 4, 2. Therefore, the counties representingzones 3 and 1 were included in the survey, whilethe remaining two selections were placed inreserve status. Only 31 PSU’S were selectedfrom the non-SMSA frame because a blank wasselected as the hit number in zone 64, and zone64 was listed second in the permutation of set16. Thus the first-stage non-SMSA sample con-sisted of 31 counties, none appearing more thanonce, These 31 counties, when added to theShK3A sample, produced a total first-stage sam-ple of 101 PSU’S from 87 distinct localities.
NORC’S methods of ordering SMSA’S andnon-SMSA counties, selecting PSU’S from zonesof 1 million persons, and subsampling to deter-mine primary and reserve PSU’S ensured a rea-sonable geographic, racial, and urban-ruralbalance among the PSU’S. Even after the pre-liminary ordering, NORC was concerned that asample of four PSU’S from zones of 4 millionpersons would result in excessive geographicclustering. They therefore decided on the moredetailed method of selection from smaller zones.The systematic group method of selecting pri-mary and reserve SMSA’S was changed to ran-dom selection within sets for counties becausethe cyclical ordering of the non-SMSA framemight have caused geographic clustering if sys-tematic sampling had been used.
The units selected in the second stage ofsampling were block groups (BG’s) in areas wherecensus blocks were delineated and census enum-eration districts (ED’s) in other areas, Each ofthe 87 distinct localities was completely coveredby a nonoverlapping frame of BG’s, ED’s, or acombination of the two.
In order to reduce sampling error and ensureproportionate representation of women by raceand income, the second-stage frame within eachPSU was ordered by these variables prior tosampling. Because 1970 census income datawere not available at the BG and ED level,NORC attached to each unit the income andracial characteristics of the next higher ordercensus unit, which in most areas was the tract.In predominantly rural counties that were nottracted, the minor civil division (MCD ) or censuscount y division (CCD ) was the next level unit.The National Opinion Research Center purchasedfrom the National Planning Data Corporation ofIthaca, New York, population and housing datafor ED’s and BG’s, and income and racial com-position data for tracts, MCD ‘s, and CCD’Swithin each of the 87 distinct localities.
With these data the second-stage framewithin each PSU was ordered in the foIIowingmanner:
1. In SMSA’S the ED’s and BG’s in tractswith less than 10 percent Negro house-holds were placed before units in tractswith at least 10 percent Negro house-holds. (For this purpose, the race of thehousehold was defined as the race of thehead.) Within each race group, the unitswere arranged in ascending order ofmedian tract income. Within each tractthe ED’s and BG’s were arranged innumerical order, with BG’s precedingED’s if both types of unit were present.
2. In non-SMSA PSU’S the above methodof ordering was applied when at least 6percent of the total population wasNegro. In counties where Negroes con-stituted less than 6 percent of the popu-lation, the tracts, MCD’S, and CCD’S wereeither placed in ascending order by
median ‘income or, in sparseiy populated
7
counties, arranged from northwest tonortheast to southeast to southwest.Within each higher order unit, the BG’sor ED’s (usually ED’s) were arranged innumerical order.
After the ED’s and BG’s were ordered, smallunits were linked with the unit immediatelyfollowing to assure a minimum unit size of 100DU’S. The cumulative households in each PSUwere then divided into 18 zones of equal size.Within each of these second-stage zones, thenumbers between 1 and the zone size weregrouped into ordered intervals representing BG’sand ED’s in the same way that the intervals infirst-stage zones represented SMSA’S and coun-ties. A random hit number was selected for eachzone to determine the ED or BG that wouldrepresent the zone. For SMSA’S that appearedmore than once in the principal sample, thisprocedure was carried out for each first-stageappearance. Thus the Chicago SMSA, with 4 hitsin the sample, was represented by 4 X 18 or 72second-stage units.
Of the 18 second-stage units associated witheach PSU, a subsample of 12 was selected forthe NSFG. The 18 second-stage units were putinto 6 groups of 3 units apiece (group 1 con-sisted of ED’s and BG’s from zones 1, 2, and 3;group 2 included units from zones 4, 5, and 6;etc.). A random number was independentlyselected for each group to determine whichsecond-stage units were to be eliminated fromthe NSFG sample. The second-stage sample nowcontained 1,212 second-stage units within the101 Psu’s.
At this point NORC and NCHS decided tosupplement the sample by selecting additionalsecond-stage units in areas where the popula-tion was largely Negro, because the design re-quirements of the NSFG dictated an oversam-pling of Negro females. Without these additionalunits later-stage sampling rates in areas withmany Negroes would have been much largerthan the rates in areas with few Negroes. Theresulting large cluster sizes would have greatlyincreased the variances of survey estimates.
The first step in selecting the supplementalsample was to identify all second-stage units intracts, MCD’S, and CCD’S withpercent Negro households (again,
at least 10the race of a
household was determined by the race of itshead). The zones represented by these unitswere split into two half zones of equal size,say z. If the original hit number h for a zonefell into the first half zone, the number h -!- zwas used to select a tentative supplemental EDor BG. If the original hit number fell into thesecond half zone, h - z was the supplemental hitnumber.
The selection procedure had to be modifiedfor Washington County, Alabama, and SumterCount y, Georgia. The 36 original ED’s repre-senting these rural, predominantly Negro PSU’Sencompassed virtually all of their population.Therefore, it was necessary to return to the first-stage non-SMSA zones represented by thesePSU’S and randomly select an additional PSUfrom each. The two supplemental PSU’S wereHale County, Alabama, and Newton County,Georgia. Their BG’s and ED’s were orderedaccording to the procedure described earlier, andtheir cumulative DU’S were divided into 12zones of equal size. The second-stage zonemethod was used to select 12 tentative supple-mental units from each PSU.
To determine which of the tentative supple-mental units would be included in the sample,each one was paired with its corresponding origi-nal selection and the simple average percent ofNegro households was computed for each pair.It was estimated by NORC that in order to pro-duce the required sample of 3,600 Negro womenwith a sample of 32,000 DU’S, the supplementalunit should be included when the average per-cent of Negro households was 43.4 or larger.Otherwise, only the original unit should be re-tained. The average exceqded 43.4 percent for122 of the 1,212 zones. Therefore, the NSFGsecond stage consisted of 1,090+ 2(122) = 1,334ED’s and BG’s. For purposes of subsequentsampling, the units were divided into two Strata.Stratum I consisted of the 1,090 predominantlywhite units from the zones where the supple-mental unit was not used. Stratum 11 consistedof the remaining 122 pairs of units.
Selection of Third-Stage Units
The purpose of the third stage of samplingwas to select from each ED and BG a subareacalled a “segment” containing approximately
8
100 DU’S. When an entire second-stage unitcontained only about 100 DU’S, no third stageof sampling was required. When a third stagewas necessary, however, the logistics of selectinga segment from a BG and an ED were quitedifferent.
The BG is an urban geographic unit intro-duced during the 1970 census. It consists of aset of city blocks with the same highest orderdigit in the census block identification number.For the third stage of sampling, NORC subjec-tively split the BG into groups of blocks, eachgroup containing approximately 100 DU’Saccording to 1970 census counts. One groupwas randomly selected with probability propor-tionate to its size relative to the size of the totalBG.
The ED is a census divisional unit generallyused in nonmetropolitan areas. It is usuallythe lowest level unit for which decennial censusinformation on households is available. In someinstances, however, an area called an “ED” dur-ing the second stage was found to be covered byblocks. When that was the case, any necessarythird-stage sampling was carried out in the man-ner described in the preceding paragraph. Whenblocks were not delineated and the ED hadmany more than 100 DU’S, the ED was splitinto pseudoblocks which were bounded by roadsand other easily recognizable landmarks. Dwell-ing units within the pseudoblocks were fieldcounted in order to get rough measures of size,and a segment of approximately 100 DU’S wasselected with probability proportional to its sizerelative to the size of the entire ED accordingto the rough count.
The final third-stage NSFG sample consistedof 1,334 segments, one corresponding to eachsample BG and ED. Therefore, there were 1,090segments in Stratum I and 244 in Stratum II.
Selection of Fourth-Stage Dwelling Units
Members of the NORC field staff listed theDU’S within each of the 1,334 segments. Anindependent, systematic sample of DU’S wasselected from the listing sheet for each segment.The individual fourth-stage sampling rates werechosen so that the sample of DU’S was essen-tially self-weighting within Strata. That is, theoverall probabilities of selecting all sample DU’S
in Stratum I were approximately equal, as werethe probabilities associated with sample DU’Sfrom Stratum II.
In order for NORC to determine the fourth-stage sampling rate for each segment, theyfirst had to determine PI and PII, the uniformprobabilities of selection for the two Strata.To do this, preliminary census data were used toestimate the proportion of households in eachStratum with Negro heads. These proportions,along with the required number of screenedDU’S for Negroes and persons of other races,were the constraints used to calculate the re-quired number of screened DU’S for the twoStrata (hereafter denoted by SI and SII). Next,DT, the number of occupied DU’S in the UnitedStates at the time of the survey, was projectedfrom census housing totals for 1970 and 1972.The number of occupied DU’S in the UnitedStates which were in ED’s or BG’s that met thedefinition of Stratum I was estimated by
()1,090D1 = — D=1,212
where 1,090/ 1,212was the proportion of orig-inal second-stage sample units that fell into
Stratum I. A corresponding estimate of occu-pied DU’S for Stratum II units in the UnitedStates was
()122 DD1l = —
1,212 ~“
Because 1970 census data indicated that 92 per-cent of aIl DU’S in the United States were occu-pied, the selection probabilities for the twoStrata were given by
SI ● 0.92PI = = .000276219 and
D1
s~1s0.92P1l =
D1l= .002207277
Once NORC had calculated PI and PII, thefourth-stage sampling rate for any segment couldbe obtained by dividing PI or P1l by the product
9
of the known probabilities of selection at thefirst three stages. The fourth-stage rates yieldedan average of 16.7 screened DU’S per segment inStratum I and 56.9 in Stratum II. It’ is obviousfrom the difference between these averagecluster sizes that oversampling of ED’s and BG’sat the second stage could not by itself producethe required oversimple of Negro women.Further oversampling of DU’S was necessarywithin sample segments.
Because the field listing sheet was theframe for the fourth-stage sample, DU’S thatwere missed during the field listing, or that cameinto existence between the time of field Iistingand interviewing, had zero probability of beingsampled directly. In order to give these DU’Sthe same probability of selection as listed DU’Sin their respective segments, NORC developed aset of rules based on the half-open interval pro-cedure.2 These rules enabled the interviewerto link each missed DU and each new DU withexactly one existing line of the segment listingsheet.
The set of lines for a segment was first di-vided into subsets that represented listings forindividual blocks. In urban areas blocks wereusually well-defined units, but in many ruralsegments “blocks” were of widely varyingshapes and sizes. The only requirements for arural block were clearly defined boundaries anda complete up-to-date listing of DU’S by addressor location. Each line within a block repre-
sented a structure or a subunit within a structure(such as an apartment or room). The listingswere considered circular; that is, the last listingwithin a block was “followed” by the first.
When a sample line represented a completestructure, the interviewer was instructed to com-plete a screener for any DU’S witbin that struc-ture as well as any DU’S between that structureand the structure on the next line. “Between”was defined in terms of address numbers whenthey were available, and in terms of locationwhen they were not. If a sample line repre-sented a subunit within a structure, the inter-viewer’s instructions depended upon the positionof the sample line relative to the lines repre-senting the other subunits. Since the listingswere ordered, each multiunit structure had afirst- and last-listed subunit and any number(from O to n) of other subunits. If the sample
line was a first-listed subunit, the interviewerwas instructed to complete a screener for alladditional DU’S within the subunit and a.11DU’Sin the structure that were in unlisted subunits.If the sample line was a last-listed subunit, theinterviewer was instructed to complete a screenerfor all additional DU’S in that subunit and allDU’S between the structure containing the sub-unit and the next-listed structure. If the sampleline represented any other subunit, the inter-viewer was only responsible for additional DU’Sin the subunit.
In order to avoid large increases in samplesize due to additional DU’S, NORC set an arbi-trary limit of four DU’S per sample line. If morethan three additional DU’S were associated withthe original sample listing, the interviewer wasinstructed to call headquarters, where a randomsubsample of exactly four DU’S was chosen toreceive screeners.
The procedures for listing additional DU’Sand subsampling excess DU’S added 780 unlistedDU’S to the original sample, of which 439 werein Stratum I and 341 were in Stratum H. Thesubsampled DU’S were exceptions to the princi-ple of- equal probability if selection withinStrata.
Additional exceptions to equal probabilityof selection were made for 10 segments inStratum I which had grown rapidly from thetime of the census DU count to the time of theNORC field listing. The field staff in these“fast-growth” segments returned their listingsheets to NORC’S central office. The centraloffice reduced the fourth-stage sampling ratesto keep the number of sample DU’S from ex-ceeding 50, so that the interviewer’s workloadwould not become overly burdensome,
Fifth-Stage Selection of Sample Persons
To avoid the high correlation of informationfrom eligible women within the same DU, theNSFG design stipulated that no more than oneeligible woman from any sample DU would beinterviewed. During completion of the House-hold Screener, the NORC interviewer listed allmembers of the DU on the second page of theform and relisted the eligible females in orderof age in item 13 on the third page (see appendixIII). Item 13 provides space for listing up to six
persons because six was considered to be a rea-sonable limit for the number of eligible femalesto be expected from a single DU. When theinterviewer listed more than one eligible femaleshe referred to the sampling table on the firstpage of the questionnaire to determine the per-son she was supposed to interview. The samplingtable consists of five numbers that designatewhich, person to interview when the number ofeligible females is two, three, four, five, or six(see table D). National Opinion Research Cen-ter personnel filled in the table on every House-hold Screener by randomly ordering the 720possible sets of numbers and systematicallyassigning them to screeners. This method ofassigning interviews gave each eligible woman ina given DU the same probability of beingselected.
The oversampling in Stratum II at the sec-ond and fourth stages of selection led to thedesired oversampling of Negro females at thefifth stage, However, since 35 percent of theDU’S in Stratum II were expected to be of otherraces, oversampling would also have producedan unnecessarily large sample of females of otherraces. NORC avoided that costly problem bysubsampling DU’S of other races at a rate of 1out of 7.991. This rate gave the subsampledDU’S the same probability of selection as allDU’S in Stratum I. Subsampling was accom-plished by systematically printing “interviewregardless of race” on 1 of every 7.991 screenersused in Stratum II and “interview if Negro only”on the rest. For this purpose, the race of a DUwas defined as the race of the person who pro-vided the information for the Household
Table D. Example of sampling table on the Household Screener
IF NUNSER OF THEN INTERVIEWELIGIBLE FEkL4LES PERSON USTEOLISTED IN SUM- ON SIJWU,RiEOX
ILN+YBOX IS: LINE
J $$.
ElTwo /Three 3
Four I
Five +
Six or
more z
Screener because it seemed likely that thescreener respondent would often be an eligiblerespondent for the survey and because it mighthave been difficult to determine the race of allhousehold members in some instances.
CHARACTERISTICS OF THE SAMPLE
The first four stages of the design resulted inthe identification of 32,818 sample DU’S. Dur-ing the screening process, the interviewers dis-covered that 3,820 of these either were vacantor did not meet the definition of a DU. Com-plete Household Screeners were obtained for26,028 of the remaining 28,998 occupied DU’S,for a screener completion rate of 89.8 percent(table E). After 2,674 DU’S were removed fromthe sample by the subsampling procedure inStratum II, the final sample of 23,354 DU’Syielded a fifth-stage sample of 10,879 women,of which 4,362 were Negro and 6,517 were ofother races (table F). Complete interviews wereobtained for 9,817 women, of which 3,868 wereNegro and 5,949 were of other races, for aninterview completion rate of 90.2 percent (88.7percent for Negro women and 91.3 percent forwomen of other races). Combined screener andinterview response rates cannot be computed byrace because the race of the majority of nonre-sponding DU’S was unknown. Combinedscreener and interview response by Stratum isshown in table G.
Seventeen women (10 Negro, 7 of otherraces) were eliminated from the sample afterthey were interviewed because it was discoveredthat they had passed their 45th birthday beforethe date of interview. In addition, data for threesample women who were less than 15 years ofage (two Negro, one of another race) were ex-cluded from all tabulations so that analysiscould be conducted for eligible women 15-44years of age. Therefore, NSFG estimates arebased upon data from 9,797 women, of which3,856 are Negro and 5,941 are of other races.
While the number of interviewed Ne,growomen exceeded the desired sample size ‘of3,600, the number of interviewed women ofother races fell substantially short of the targetof 6,400. Most of the additional Negro sampIewas due to the unexpectedly large number-of
11
Table E. Actual and expected number of sample DU’S, number of completed screeners. and screener comrJetion rates. bv race and Stratum
Expectednumber oflegitimate.occupied
D U’s frompresurveyestimates
(4)
29,295
Number ofDU’S for Screenerwhich a completion
complete ratelscreener was (percent)
obtained
Total DU’SNumber of
Vacant or legitimate,in thesample not a DU occupied
DU’SRace and Stratum
All races
Both Strata ....................................................................
Stratum I ........................ ...............................................................Stratum 11......................................................................................
Negro
Both Strata ....................................................................
Stratum I .......................................................................................Stratum 11......................................................................................
Other races
Both Strata ....................................................................
Stratum I .......................................................................................Stratum 11......................................................................................
32,918 I 3,820 I 228,998 226,028 I 89.8
18,593 2,038 216,55514,225 1,782 212,443
. . . . . . 9,005
*15,134 91.4210,894 87.6
9,548 3. . .
16,57612,719
B,41O
. . . . . . 773
. . . . . . 8,232
. . . . . 18,438
6567,754
20,885
725 3.,.7,821 3.,.
17,328 3 ,..
. . . I ...I 15,059
. . . . . . 3,37974,278 3. . .3,050 3 . . .
15,9204,965
q(s)+ (3)] x 100.?Inclrrdes race unknown.3Appropriate screener completion rates by race cannot be derived because race information was not available for all legitimate, occupied DU‘s.
Table F. Number of sample women, actual and expected number of completed interviews, and interview completion rates, by race and Stratum
Race and Stratum
Number ofsamplewomen
(1)
Number ofcompletedinterviews
(2)
Interviewcompletion
rate 1(percent)
(3)
Approximatenumber ofcompletedinterviewsexpected
frompresurveyestimatas
(4)
All races
8oth Strata ............................................................................................................. 10,879 9,817 90.2 10,054
6,7584,121
4,362
3943,968
6,517
6,364153
26,4513,600
6,1643,653
3,86B
3413,527
5,949
5,823126
91.288.6
8B.7
86,588.9
91.3
Negro
Both Strata ............................................................................................................. 3,600
2813,319
Other races
Both Strata .................................................................................................. ........... 26,454
6,173291
91.582.4
1[(2) +(1)] x 100.2The target SampIc size f~)r women of ~aces other tlmn Negro was 6,400. However. the assigned probabilities of SekCtiOn fOr DU”S of race Other than
Negro in Stratum 1 and Stratum 11 yielded a slightly larger expected sample size thmr the original target.
Table G. Screener and interview completion rates and combined
response rates, by Stratum
Screener Interviewcomple- Combined
Stratumcomple-
tion tionresponse
rate rateratel
(1) (2) (3)
I Percent
Both Strata ... . ... . . 89.8 90.2 81.0
Stratum I ... .. .. .. ... .. ... . .... . 91.4 91.2 83.4Stratum II m... ... .. . .. .. .... .. . 87.6 88.6 77.6
~[(1) x (2)] + 100.
Negro DU’S identified in Stratum II (table E).This oversimple was enough to overcome thehigher-than-expected vacancy and nonresponserates. The number of DU’S of race other thanNegro in Stratum II was correspondingly muchlower than expected. However, because of thesubsampling procedure the number of inter-viewed women of races other than Negro wasonly 155 less than expected, In Stratum I thenumber of identified DU’S was also somewhatless than the expected value, and the vacancyand interview nonresponse rates were higher
than expected. These problems caused the re-maining sampling deficit for women of racesother than Negro, but had little effect on thesize of the Necgro sample because less than 5 per-cent of the DU’S in Stratum I were classified asNegro.
Table H shows that the precision of surveyestimates was not adversely affected because thetarget number of women of races other thanNegro was not interviewed. Standard errors forthe estimated proportions in table A were cal-culated from NSFG data using the balancedhalf-sample replication technique described laterin this report. For all population subgroupsexcept Negro women of parity 3 or more, thesestandard error estimates are substantially lowerthan the corresponding presurvey error toler-ance. The estimate for higher panty Negrowomen is slightly larger than its tolerance, butthe difference is unimportant, since the varianceestimates themselves are subject to variance.
ESTIMATION
Weighting Procedure
Since the NSFG is designed to produce un-biased estimates for the entire population of
Table H. Comparison of standard arror tolerances with the corresponding values obtained from the National Survey of FamilyGrowth (NSFG)
Population subgroup
Nagro
Education:Lass than high school .... .. .. .. .. .. ... .. .. ... . .. ... .. .. ... .. .. ... .. .. ... .. . ... .. ... .. .. ... .. .High school and mora ... .. .. .. . ... .. .. .. .. .... . ... . ... ... . ... .. ... .. .. .. .. ... .. ... .. .. .. .. ..
Parity:
O-2 children ... .. ... .. ... .. ... .. .. ... .. ... . .. .. .. ... . .. .. .. .. .. .. .. .. .. .. .. . ... .. .. .. .. .. . . .... .. ..3 children or more ... . ... ... .. .. .... . .. .... . ... .. .. .. ... ... .. .. .. ... . .. ... . ... .. ... .. .. .. .. .. .
Other races
Education:Less than high school . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
High school and more .. .. .... .. .. .. .. ... . .. .. .. ... .. ... .. .. .. .. ... ... .. .. .. .. .. .. .. ... ... . ..
Parity:
0.2 children . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 children or more .. .. ... ... . ... .. .. .. .... .. ... .. . .. .. .. .. .. ... .. ... .. ... . .. ... . .... .... .. .. ..
Proportion of
women in the ratagroup who are in
the subgroup
Fromtable A
.48
.52
.50
.50
.30
.70
,59.41
NSFG
astimate
.48
.52
.58
.42
.27
.73
.64
.36
Estimatedproportionwithin thesubgroup
.42
.25
.19
.51
.23
.15
.10
.33
Standarderror
tolerancefrom
table A
.0164
.0139
.0128
.0163
.0133
.0074
.0068
.0127
Estimated .
standarderror
basad onNSFG data
.0150
.0121
.0112
.0170
.0106
.0053
.0054
.0099
13
eligible women in the United States, the sampledata must be inflated to the level of the popula-tion from which the sample was drawn. The in-flation factor, or weight, for each woman isthe product of several adjustments, one or moreat each stage of sampling. Three types of adjust-ments are involved.
Inflation by the reciprocal of the probabili-ties of selection. –The weight for each womanwithin a sample PSU is the product of the re-ciprocals of the probabilities of selecting (1) theED or BG, (2) the segment, (3) the DU, and (4)the eligible sample person. Because of the possi-bility (sometimes certainty) that certain SMSA’Sand counties could be selected as sample PSU’Smore than once, the first-stage weight is thereciprocal of the expected relative frequency ofoccurrence of the PSU. The first-stage weight isexplained in more detail in the discussion of theestimating equation.
Nonresponse adjustment. —Each sampleweight is adjusted for nonresponse to the House-hold Screener (screener nonresponse) and nonre-sponse of sample women to the detailed NSFGquestionnaire (interview nonresponse). Theseadjustments are necessary because the phenom-enon of nonresponse introduces bias into anyprobability sample. The respondents to a surveymay have a much different distribution of demo-graphic or health characteristics than the nonre-spondents. Even if the distribution of demo-graphic and health characteristics is about thesame for respondents and nonrespondents, thetwo groups are by definition different becausethe respondents participated in the survey whilethe nonrespondents did not. Nonresponseadjustments minimize the impact of nonre-sponse bias on final estimates by imputing tononresponding DU’S and women the characteris-tics of “similar” respondents. Similar respondentswere judged to be DU’S in the same PSU andStratum, and women in the same age-race classand PSU. Screener response was 89.8 percent forthe entire survey (91.4 percent in Stratum I and87.6 percent in Stratum II) and ranged from98.9 percent to 62.5 percent in individual PSU’S.Interview response was 90.2 percent for the sur-vey (91.2 percent in Stratum I and 88.6 percentfor Stratum II) and ranged from54.5 percent in individual PSU’S.
100 pe[cent to
Table J. Poststratification adjustment factors (ratio of Septem-
ber 1973 population control totals based on CurrentPopulation Survey data to National Survey of FamilyGrowth weighted estimates), by race and age
I [
Age I Negro IOtherraces
14-19 years . .. .. ... .. .. .. ... .. .. ... . .. .. .. .. .. .. .... .... .. ... .
20-24 years .. ... .. .. ... .. . ... .. .. .... . ... .. .. ... .. .. .. ... .. ...25-29 years . .. .. ... . .. ... .. ... .. .. ... .. .. . .. .. .. ... . .... . ... ..30-34 years ... . .. ... .. .. .. .. ... ... .. . ... .. ... .. .. ... ... . .. .. ..35-39 years ... ... .. . .. ... ... .. .. .... .. .. .. .. .. .. ... . .. ... . .. ..
40-44 years . ... .. . ... .. ... . ... ... .. .. .. .. ... .. .. ... .. .. .. .. ...
0.7430.9270.9750.9600,876
1.082
1.0361.0881.0920.991
0.951
1.097
Poststratification by age and race.–Theweight for each ever-married respondent is mul-tiplied by a poststratification adjustment factorthat is determined by the woman’s age and race.The 12 adjustment factors shown in table Jmake NSFG estimates of ever-married women ineach age-race class equal to independent controltotals for September of 1973 (the approximatemidpoint of data collection). The control totalsare based on data from the U.S. Bureau of theCensus’ Current Population Survey (CPS). Nopoststratification adjustment is applied to theweights for single mothers because reliable con- ,trol totals for this population are not available.Poststratification achieves much of the improve-ment in precision that would have been attainedif the sample had been drawn from a populationstratified by age and race. The method used tocompute the CPS control totals is discussed inappendix II as part of the evacuation of alterna-tive estimators.
NCHS decided to use a poststratified esti-mator instead of a simple inflation estimatorafter conducting research to compare the pre-cision of the two estimators. The methodologyand results of the comparison are given inappendix H.
Estimating Equation
The estimate of an aggregate parameter 1’ isgiven by3
Y’=Y; +Y\,
where Y; is the estimate for ever-marriedwomen and Y; is the estimate for single mothers.
14
is a poststratified estimator. The nonresponse-adjusted estimates Y’ for the 12 age-race classesare multiplied by the poststratification adjust-ment factor X. ~/X~ 1. X; 1 represents the N$$FGestimate of the numb erAof ever-married womenin age-race class a and Xa 1 is m estimate of thesame population group based upon the CPS.
12
is simply the sum of the nonresponse-adjustedestimates for single mothers in the 12 age-raceclasses.
The nonresponse-adjusted estimator for ever-married women in age-race class a is given by
Yaghzjk “
I,)~z, the corresponding estimator for single
mothers, is exactly the same, except that
6aghfk =
w5ghzjk =
the observed value of characteristicY for the sample woman selectedfrom DU k, segment j, ED or BG i,Stratum h, PSU g, and age-raceclass ct.
I
1 if the sample woman whose ob-served value is Y&h@ has everbeen married;
O otherwise.the fifth-stage weight applied to thesample woman represented by
‘agh~k “ The weight is equal to thenumber of eligible women in herDU.the reciprocal of the original fourth-stage sampling rate within seg-ment j.7.991 if segment j is in Stratum IIand the race of DU k is other thanNegro; otherwise T4g~tik = 1.the subsampling rate that was ap-plied if segment j was one of the10 fast-growth segments in StratumI; otherwise, F4ghtik = 1.the subsampling rate that was ap-plied if DU k was associated withexcess missed DU’S; otherwise,
‘4gh~k = 1.
the number of sample DU’S in seg-ment j.the reciprocal of the probability yof selecting segment j, the segmentselected from ED or BG i.the reciprocal of the probability ofselecting ED or BG i from PSU gand Stratum h.the number of sample ED’s andBG’s in PSU g and Stratum h.
an adjustment for screener nonre -sponse (DU’S for which it was im-possible to determine whether ornot there were any eligible respond-ents). The value Mgh represents thetotal number of DU’S from PSU g.~d Stratum h, and Mgh is the num-
ber of DU’S that were classified aseither including or not includingeligible respondents.
a partial adjustment for person non-response, where n.. is the numberof ‘sample person; -in PSU g, age-race class a, and ~ga is the numberof persons who respond. However,NCHS decided that nonresponsegreater than 50 percent within aPSU should be adjusted at the classlevel rather than the PSU level.Therefore, when nga is greater than
.*twice nga the nonresponse adjust-
15
Table K. Examplel of how to determine A& from PSU response
data
Number Number Nonre-Excess
of of re- sponsePSU number sample spend. adjust-
nonre-
persons ents in mentspond-
in class a class a factor ents
ports4~5 is used to estimate NSFG variances. Anempirical study by Bean6 gives evidence that thehalf-sample technique produces highly reliable,essentially unbiased variance estimates.
There are three important practical reasonswhy hzdf-sample replication is being used:
Totals forclass . .. ... .
1..........................2 . ... .. ... .. ... . .. ... . ... ..
3 .... .. .. ... ... . ... . ... . ...
4 ... ... . ... .. .. .. .. .. ... .. .5 .. .. .. ... .. .. .. .. .. .. ... ..It
43 31
10 1012 1013 6
6 52 0 t
. . . 3
1 01.2 02 11.2 02 2
43 431For this example, A- = — ‘—”
43-3 40
ment is defined to be 2, and the ex-cess nonresponse is adjusted in fac-tor Aa described below.2,000,000/Pg, where P is the 1970census population of &3U g. Thisfirst-stage weight is the reciprocalof the expected frequency of occur-rence of PSUg in the NSFG sample.For PSU’S that were completelycontained in a single zone at thetime of selection, WI is the re-
“fciprocal of the probabl lty of selec-tion. However, Wlg deviates fromthe reciprocal of the probability ofselection for PSU’S that overlapzones. For SMSA’S with popula-tions greater than 2 million W1g isless than 1, whereas the reciprocalof the probabilityy of selection forany PSU is always greater than orequal to 1.an excess nonresponse adjustmentthat compensates at the a-class levelfor nonresponse greater than 50percent in individual PSU’S. TableK gives an example of how to de-termine A a,
VARIANCE ESTIMATION
Background
The balanced half-samplenique described in detail in
16
replication tech-other NCHS re-
1.
2.
3.
Programming difficulties are reduced be-cause half-sample variances are com-puted by taking a simple average ofsquared deviations of half-sample esti-mates from the estimate based on thefull sample. Instead of having to pro-gram an exceedingly difficult varianceformula, the programmer must simplyadjust the estimation formula to com-pute estimates from appropriately chosenhalf samples.
The complete algebraic formula forNSFG variances is unknown because ofthe complexity of the design. Althoughzdgebraic expressions can be derived forparticular subprocedures–such as theindividual stages of sampling and thepoststratification and nonresponse ad-justments–a single, exact variance equa-tion has not been developed.
As stated by McCarthy4 : “Varianceestimates based upon the replicated esti-mates will mirror the effects of allaspects of sampling and estimation thatare permitted to vary randomly fromreplicate to replicate. ” Also, replicatedhalf-sample variances include some of the\’amiabilitydue to nonsampling (measure-ment) error, as well as sampling vari-abilityy.
Summary of Applicable Theory
The population of interest is classified into Lstrata, and two sample PSU’S are drawn fromeach stratum. Selection of exactly two samplePSU’S reflects an essential element of the theory.This requirement may be met for noncertaintyPSU’S b~ collapsing t~vo strata having one PSUeach, or for certainty PSU’S by creating two arti-ficial, or pseudo, PSU’S by random methods froma single PSU. The collapsing method producessomewhat positively biased (overstated) variance
estimates by introducing a between-stratumcomponent of variance that does not exist.z
Let the parameter of interest be denoted byY, for which an estimate Y’ has been obtainedfrom the complete sample. Y’ is a linear com-
bination of the sample observations in fullyrigorous developments, although several empiri-cal investigations indicate that the bias of half-sample variance estimates for certain ratio esti-mators and correlation statistics is negligible,if detectable at all.ASs>7~s
A half-sample replicate is defined as a col-lection of L PSU’S obtained by selecting one ofthe paired sample PSU’S from each stratum. Ifthe PSU’S within each stratum are designated by
the subscript i = 1 or 2 and there are K halfsamples, where K > L, the pattern may be sum-marized as in Table L. The “+” indicates that aPSU falls into a particular half sample, and the“-” indicates that it does not.
Analogs of Y’ corresponding to each halfsample are then computed. That is, for the kthhalf sample, Y’ is given by
h=]
where z’ = either 1 or 2 depending on which PSU
of the stratum is the half-sample k, and ~’i is,
Table L. Exal
Half.samplereplica-
tion
1
2
3
K
Ie of a half-samDie replication oattern
1
Psu—1—+
-1-
—
—2
—
+
+
—
2
Psu.1—
+
+—
—2
—
+
+
—
Stratum
3
Psu—1—
+
+—
—2
+
+
—
. . .
. . .
. . .
. . .
. . .
.,.
. . .
. . .—
L
Psu—1—+
—
+—
—2—
+
i-
—
in this example, a total. The estimator Y’ is
L
y’=~(yhl+‘h2)h=l
and its variance is estimated by
K
Z(2 .L y;Syt - y’)2
K k=l
Because it is impractical to compute the Yj forthe entire set of 2L possible half samples when Lis large, a subset of half samples is selected toproduce the estimates. A set of side conditionsrelating to the selection of PSU’S for the halfsamples has been developed by McCarthy4~5based on work by Plackett and Burmang andGurney.10 These side conditions greatly increase
2 by eliminating a between-the stability of sy~strata component of variance that is otherwisepresent. The value of S$J obtained from a sub-set of half samples that are chosen according tothe McCarthy criteria is equal to the value thatwould be obtained using all 2L half samples. Aset of half samples that satisfy the McCarthy cri-teria is called a “balanced set,” and the pro-cedure is referred to as “balanced half-samplereplication. ”
Application to the NSFG
In order to apply the balanced half-samplereplication technique, NCHS grouped the 103NSFG PSU’S into 48 strata. Seven of the stratawere self-representing; that is, the y consisted ofSMSA’S that came into the sample with cer-tainty (except for Boston, which was lumped
with 2 of the 5 New York City PSU’S). Withineach of these strata two pseudo-PSU’s were cre-ated by listing the PSU’S in numerical order andlisting the segments in numerical order withineach PSU. The first segment, and every secondsegment thereafter was assigned to the firstpseudo-PSU, and the remaining segments wereassigned to the second pseudo-PSU. Within theother 41 strata, each of the 2 replication PSU’Sconsisted of 1 or more NSFG PSU’S (but nevermore than one locality). The PSU’S in each stra-
tum belonged to the same geographic region and
17
Figure 2. Relative standard errors for aggregates of women, by race
i=
+’ A’3 4 56789 A’3 456789 A’3 4 56789
A’3 456789 3 456789
10 100 1,000 10,000
ESTIMATED NUMBER OF WOMEN (IN THOUSANDS)
Example of use of chart: An aggregate of 2 million women (on the scale at the bottom of the chart) of all races has a relativestandard error of 4.8 percent, or a standard error of 96,000 (4.8 percent of 2 million).
were generally about the same size. The value 1was assigned to one of the replication PSU’S ineach stratum, and the value 2 was assigned tothe other. Forty-eight half samples were thenformed from the entries in successive columns ofan orthogonal 48 X 48 matrix of 1‘s and 2‘sadapted from Plackett and Burman.g
In order to estimate the variance of anNSFG aq regate statistic Y’, 48 half-sample ana-
‘9logs of 1’ were computed. The formula for theAth half-sample estimate is
where
and
~=1
Y~lk and X~l ~ were computed in exactly thesame way as Y~l and -Y: 1. No additionalweighting was necessary because of the poststrat-ification process. On the other hand, the Y~2kwere computed after multiplying the weight foreach sirrale mother bv 2 to com~ensate for thehalf-sam~le procedur;. Thethen estimated by
.variance of Y~ was
- y32
Types of aggregate statistics produced from theNSFG include number of currently marriedwomen, number of ever-married women, num-ber of pregnancies for ever-married women, andtotal number of children born to ever-marriedwomen. Half-sample variances were not com-puted for all aggregate statistics because the timeand money needed to do so would have beenprohibitive. In addition, data reports would bemuch more cumbersome if a variance estimatewere published for each statistic. To avoid these
problems, variances for each type of statisticwere computed only for selected populationsubgroups, which were chosen to represent awide variety of demographic characteristics anda wide variation in the size of the estimates.Curves were fitted to relative standard error(RSE) estimates for number of currently mar-ried women and number of pregnancies, accord-ing to the model
‘SE(Y’)=E=JA and B are parameters whose estimates deter-mine the shape of the curve. The rationale forthe model and the iterative method that wasused to estimate A and B are explained else-where.’ I
Figure 2 shows relative standard error curtiesfor estimates of women by race. The estimatesof A and B for the curves are shown in table M.Although the curves for women were fitted toRSE’s for estimates of currently married worn%only, RSE’S for estimates of ever-married womenfail close to the curves. Therefore, it is appropri-ate to obtain predicted RSE’S for estimates ofever-married women from the aggregate curvesfor women.
Separate curves were needed for Negrowomen and women of all other race classifica-tions (women of all races, women of race otherthan Negro, and white women). Because Negrowomen were oversampled, an estimate of a givennumber of Negro women has a smaller RSE thanan estimate of the same number of women of allraces. The curves in figure 2 clearly show thisrelationship. For example, an estimate of200,000 Negro women has an RSE of 9 percent,while an estimate of 200,000 women of allraces has an RSE of 15 percent.
Table M. Estimates of A and B for relative standard error curves,
by race
Curve A I B
Total and white women .. ... .. .. .. .. . I 0.000017613 4,493.7916Negro women ... .. .. .. ... .. .. .. .. .. ... .. ... 0.000040219 1,600.4393
t I
19
Variances of aggregate statistics were used toderive variances of percents, which are ratios oftwo aggregates with the numerator being a sub-class of the denominator. Percent estimatesusually show the proportion of a populationthat has a particular characteristic of interest.The RSE of the percent estimate
p’ =;:. 100”
is given by the expression 1
RSE (P’)= ~ [RSEZ (Y’) - RSEZ (Z’)]
an acceptable estimate for the RSE of a per-cent whose denominator is not one of the valuesshown in figure 3.
In addition to percents, the NSFG providesother types of ratio estimates where the numera-tor is not a subclass of the denominator, such asthe mean number of expected births per woman,the mean number of expected additional birthsper woman, and the probability that a womangives birth within a certain number of monthsfollowing her last previous birth. Variances forthese types of estimates could not be derivedfrom variances of aggregate statistics, so theywere computed directly by the half-sample tech-nique. The variance of the ratio estimate R‘=Y’/W’is given by
1
. J/l+;- (A+;)1 ‘E
JBZ’-BY’(P’/Y’)=Y’Z’(P’/Y’)
JB(lOO -P’)=P’z’
where B is the least squares estimate from theerror curve for Y’ and Z’.
The RSE of P’is dependent upon the valuesof both P’and Z’. In order to account for thevariation in error due to P’and Z’, a set of per-cent RSE curves was derived from each aggregateRSE curve. Each curve in the set yields RSE’Sfor percent estimates with a fixed denominator.
Figure 3 shows the set of curves for percentof total and white women, along with an exam-ple of how to use the error chart. Each curvesatisfies the equation
RsE@)=~where P’ is the estimated percent and Z’ is thedenominator of P’.Linear interpolation yields
where R: is the kth half-sample analog of R‘.As was the case for aggregate statistics,
approximate error curves were fitted to selectedvariance estimates. The model for these morecomplicated ratio estimates was
RSE (R’)=Jw
‘JzF7’where A, B, and C were the parameters to beestimated. As with percentage estimates, theRSE varies with the size of the estimate R‘ andits denominator W’. The method of estimatingA, B, and C is the same general method used foraggregate estimates.l 1 The RSE’S for each typeof ratio statistic are represented by a set ofcurves, as was the case for percent estimates.
Figure 4 shows the set of curves for meannumber of births expected by women of allraces, along with an example of ho~v to use theerror chart. The three cur;es inthe equation
the chart satisfy
RSE(R’) = #0.00002488 - 649.05239 /R’W’ + 1,733.8522/W’
20
Figure 3. Relativa standard errors for percent of total and white women (base of percent shown in curve in thousands)
100908070
60I I I I r 1 f IITI lJ-
50--- , , , I , , , , , , I Ill [ 1 1 1 1 r I ! 1 1 1 1 1 1 1
ml I I II1 I I 1111 Ill [ 1 I I I I 111111140 200 I
dllllttmm I I I I I Illllltttttt30
20
10987
6
5
Llll 1111111111111
I I I I I 1 I I 1 I I
I 1 I 1 I I I I I I Ill
4 , 1 1 t n 1 1 I 1 1 1 1I I I I I I 1!/!!
1 1 , , r , r r 1 r
3
2
1.9.8.7
.6
.5
I I ! I I 1 I I I I 1
.4
.3
.2
.1
A 2 3 456789A
2 3 456789i
2 3 456789A
1 10 100 1,000
ESTIMATED PERCENT
Example of use ofchzrt: An estimate of 10 percent (from the scale at the bottom of the chart) ofa population subgroup of 1 million women (fifth curve from the top) has a relative standard error of20.0 percent, or a standard error of 2.0 percent (2o.o
where R’ is the estimated mean and W’ is theestimated number of women (the denominatorof R’). Linear interpolation yields an acceptableestimate for the RSE of R‘ when R‘ is not oneof the \’alues shown in figure 4.
Figures 2-4 are examples of relative standard
percent of 10 percent).
error curves for the various kinds of statisticsproduced from Cycle I of the NSFG. Each re-port that discusses findings from Cycle I will in-clude similar charts for all types of statisticspresented therein.
An estimate of the standard error of the dif-
21
Figure 4. Relative standerd errors for mean numbers of Mrths expectedby women of all races and white women
IJJ>
1 10 100 1,000 10,000
ESTIMATED NUMBER OF WOMEN IN THE DENOMINATOR (IN THOUSANDS)
Example of use of chart: An estimate of 2.0 births per woman (middle line on the chart) for a r)o~ulation submouD of900,000 women (read- from the scale at the bottom of the “chart) has a ;elative standard error of ~.O per;e~t, or a stand~rd &rorof 0.08 (4.0 percent of 2.0).
ference between any two aggregates, percents, orother ratio statistics is given by
This expression provides a good estimate of thestandard error for uncorrelated statistics, butcan only be considered a rough approximationotherwise. Because NSFG estimates are basedupon a large sample of women, the distributionsof Yj and Yj (and, therefore, Y: - y:) areapproximately normal. Frankell z shows empiri-cally that, using balanced half-sample replicationestimates of variance, the test statistic
Y; - Y;
‘=s (Y1- Y2)
approximates the student’s t distribution underthe null hypothesis of no difference betweenthe parameters estimated by Y; and Yj against
a two-sided alternative. The number of strata inthe replication design (48 for the NSFG) canreasonably be used as the number of degrees offreedom for the t statistic, although the exactvalue for the degrees of freedom is unknown.Therefore, individual two-tailed significance
tests of differences between NSFG statistics canbe performed with an approximate significancelevel of a by computing t and comparing it tothe two-tailed 1- & critical value for the t distri-bution with 48 degrees of freedom.
Example: Suppose 500,000 currentlymarried Negro women expect an average of2.46 births per woman and 5 million cur-rently married women of other races expectan average of 2.08 births per woman. Totest this race difference at the Q’ = .05 levelof significance, compute
t= 2.46-2.08
-~ (2.46 )2RSE2 (2.46)+ (2.08 )2RSE2 (2.08)
From figure 4, RSE(2.46) * .055 andRSE(2.08) ~ .018 so that
2.46-2.08t=
~(2.46)2 (0.55)2 + (2.08)2 (.018)2
= 2.71
The two-tailed .95 critical value (1 - u) for.a t statistic with 48 degrees of freedom is2.01. Therefore, the difference is significantat the .05 level.
o 00
23
REFERENCES
1King, B. F., and Frankel, M. R.: Description of theSample Design for the National Family Growth Survey.Unpublished manuscript. ,
2Kish, L.: Survey Sampling. New York. John Wiley& Sons, 1965. ,
3Brock, D. B.: Estimation and Variance Specifica-tions for the National Sui-vey of Family Growth. Unpub-lished manuscript.
4National Center for Health Statistics: Replication:An approach to the analysis of data from complex sur-veys. Vital and Health Statistics. PHS Pub. No. 1000-Series 2-No. 14. Public Health Service. Washington.U.S. Government Printing Office, Apr. 1966.
5National Center for Health Statistics: Pseudorepli-cation: Further evaluation and application of the bal-anced half-sample technique. Vital and Health Statistics.PHS Pub. No. 1000-Series 2-No. 31. Public Health Serv-ice. Washington. U.S. Government Printing Office, Jan.1969.
6National Center for Health Statistics: Distributionand properties of variance estimators for complex multi-stage probability samples: An empirical distribution.Vital and Health Statistics. Series 2-No. 65. DHEWPub. No. (HRA) 75-1339. Health Resources Admin-istration. Washington. U.S. Government PrintingOffice, Mar. 1975.
7Simmons, W. R., and Baird, J.: Use of pseudorepli-cation in the NCHS Health Examination Survey. Pro-ceedings of the Social Statistics Section of ASA. Ameri-can Statistical Association, 1968.
8Kish, L., and Frankel, M. R.: Balanced repeatedreplications. ]. Am. Stat. Assoc. 65:1071-1094, Sept.1970.
‘Plackett, R. L., and Burman, J. P.: The design ofoptimum multifactorial experiments. Biometrika 33:305-325, 1946.
1 ‘Gurney, M.: The Variance of the ReplicationMethod for Estimating Variances for the CPS SampleDesign. Dittoed memorandum, U.S. Bureau of theCensus, 1962.
11 National Center for Health Statistics: Estimationand sampling variance in the Health Interview Survey.Vital and Health Statistics. PHS Pub. No. 1000-Series 2,No. 38. Public Health Service. Washington. U.S. Gov-ernment Printing Office, June 1970.
‘ 2 Frankel, M. R.: Inference from Survey Samples:An Empirical Investigation. Ann Arbor. Institute forSocial Research, University of Michigan, 1971.
‘ 3 French, D. K., and Bryant, E. E.: Samplc-Dependent Estimation for the National Survey ofFamily Growth. Proceedings of the Social StatisticsSection of ASA. American Statistical Association, 1976,
24
APPENDIXES
CONTENTS
I. Glossary of Terms .. .............................. ................ ...................................................... .................... 26
II. Evaluation of Mternative Estimators ............................................................................ .. ................ 28
111, Household Smeener ............................. .................................................... ............ .. .... ................. .... 29
25
APPENDIX
GLOSSARY OF
Conterminous United States. –The land areaconsisting of the District of Columbia and allStates except Alaska and Hawaii.
Dwelling unit (D U).–A single room, orgroup of rooms, that is intended for separate liv-ing quarters. The people who live there mustlive and eat separately from everyone else in thebuilding (or apartment ) and the room or groupof rooms must have either
1.
2.
A separate entrance directly from theout side of the building or through acommon hall, or
Comdete kitchen facilities for the useof t~is household only. Complete kit-chen facilities include all of the follow-ing:a. a range or cooking stove, andb. a sink with piped water, andc. a mechanical refrigerator,
Education. –The highest grade of schoolcompleted.
Geographic region.–For the purpose of clas-sif ying the population by geographic area, theU.S. Bureau of the Census has grouped the 50States and the District of Columbia into fourregions, as follows:
Region States included
Northeast ... ...... Maine, New Hampshire, Ver-mont, Massachusetts, RhodeIsland, Connecticut, NewYork, New Jersey, Pennsyl-vania
1
TERMS
North Central...
South ...............
West .................
Michigan, Ohio, Indiana, Illi-nois, Wisconsin, Minnesota,Iowa, Missouri, North Da-kota, South Dakota, Kansas,NebraskaDelaware, Maryland, Districtof Columbia, Virginia, WestVirginia, North Carolina,South CaroIina, Georgia,Florida, Kentucky, Texas,Tennessee, Alabama, Missis-sippi, Arkansas, Louisiana,OklahomaMontana, Idaho, Wyoming,Colorado, New Mexico, Ari-zona, Utah, Nevada, Washing-ton, Alaska, Oregon, Cali-fornia, Hawaii
Alaska and Hawaii are not included in the NSFGsample design.
Household. –A family living together, or fiveor fewer unrelated individuals living together ina DU.
Parity. –The number of live births a womanhas had.
Screener interview. –A preliminary interviewat the household to collect information aboutthe DU and to determine whether or not thehousehoId includes one or more women who areeligible for the detailed interview.
Standard metropolitan statistical area(S.MSA).–A county or group of contiguouscounties (except in New England) which con-tains at least one central city of 50,000 peopleor more, or “twin cities” with a combined popu-
26
~Iation of at least 50,000. In addition, other con- of the Census, the urban areas of the Unitedtiguous counties are included in an SMSA if, States include all cities or “twin cities” with ataccording to certain criteria, they are socially least 50,000 population in 1970 together withand economically integrated with the central the surrounding closely settled area and allcity. other incorporated or unincorporated popula-
Urban area.–As defined by the U.S. Bureau tion centers with 2,500 inhabitants or more.
000
27
APPENDIX II
EVALUATION OF ALTERNATIVE ESTIMATORS
An aggregate parameter Y for all eligiblewomen less than 45 years of age in the UnitedStates can be written as Y = YI + Yz, where YIis the total for all ever-married women and Y2 isthe total for all single mothers with one or moreof their children currently living in the house-hold. A1temative estimators of Y considered forthe NSFG were
Y’~p=Y; +Y; (1)
and
Y;= Y:+Y; (2)
where Y; and Y! are simple inflation estimatw-sand
is a poststratified estimator. The X&l representNSFG estimates of the number of ever-married~omen in each of 12 age-race classes and theXal are corresponding estimates for September1973 (the approximate midpoint of NSFG inter-viewing) based upon the U.S. Bureau of the Cen-sus’ Current Population Survey (CPS).
At the time the two estimators were pro-posed, it was decided that the choice of Y: or
Y~P as the official estimator for the NSFGwould be based on information from the sampledata. Values of certain statistics would be cal-culated using both estimators, and variances ofthe two competing estimates would be com-puted using identical half-sample replication pro-cedures. If the variance of Y: was the lower ofthe two variances for many of the experimentalstatistics, and if Y$ did not result in higher vari-ances for more than a few statistics, then itwould be chosen; otherwise, YjP would be used.
Simple inflation and poststratified esti-mates of the number of currently married U.S.women and the relative standard errors (RSE’S)of both estimates were computed for more than200 subdomains of the population. The post-stratified estimator had a smaller RSE than theinflation estimator for more than 80 percent ofthe domains. The improvement in precision was20 percent or better for almost two-fifths of thedomains. A more complete discussion of thedesign and results of the study can be foundelsewhere. 13
Because the RSE of Y; was much smallerthan the RSE of YjP for many aggregate esti-mates, and was no more than 10 percent largerthan the RSE of YjP for any parameter in the
study, the decision was made to use Y$ as theofficial NSFG estimator.
NOTE: The list of references follows the text.
000
28
APPENDIX Ill
HOUSEHOLD SCREENER
NORC-4604July, 1973
BEGIN DECK 21
r NOTICE:OFFICE USE ONLY
Final Status❑ 10
Tel. Re-Check:
I Yes. .o. lNo . ...2
11
ONB No. 68 - S72170
NATIONALOPINION RESEARCHCENTER Expires:April 30, 1974Universityof Chicago
Collectedfor the NationalCenter for Health Statistics
HOUSSHOLDSCREENER
All informationwhich would permit identificationof anyindividualwill be held in strictconfidence,will be usedonly by personsengagedin and for the purposeof the sur-vey, and will not be disclosedor releasedto others forany purpose,as in accordancewith Section305(a)of thePublicHealth ServiceAct, Section1.103(a)of the PublicHealth ServiceRegulations[42 CFR 1.103(a)]and underPublic HealthDelegationof AuthorityNmber 31.
ASSIGNMENTBOX
rSAMPLINGTABLE
IF NUMBER OF THEN lNTERVIW
1ELIGIBLE FENALES PERSON LISTEDLISTED IN SUM- ON SUMMARY EOXWRY BOX 1S: LINE
ElTwo
Three
Four
Five
L -J 1=’1 IINTERVLBWBR: Is this addressin a rural%rea or in some other kind of area?
Rural . . . 1 Other . . . 2 12
INTRODUCTION
Hello, I’m from the NationalOpinionResearch Center.(SHOWID BADGE.) A letterwas sent to you recentlyexplainingthestudywe are conductingfor the U.S. PublicHealth Service. As yOU
may recallfrom the letter,the study is being conductedall overthe countryand is about familysize.
RECORDOF ALL CONTACTSAT HOUSEHOLD
${k Date ‘lime ‘p’CODE ONE Interv’r
PerjTel Results Ser. uex initials
.
-2- TIME AMBEGAN: PM
rnOusEnOLDENUME8AT10N
n ,0,,
BEGIN DECK 22
1, TO start, how many people live in this household? NumER
2, What is the name of the heed of this household? (ENTER NANE ON LINE 01 BELOW. )
3. And the other members of this ho.sehold--uh.w are their name.? Let8s begin witheveryone related to (HEAD). BE su~ PERSON 1t:cL0TIE5 (HlM5ELFl Herself).NAIEs IN TABLE BELOW.)
(ENTER
4. Are there other people living here who are not related to (HEAD)?
5. I have listed (READ mm IN ORUER). Is there anyone else staying her. n.w, suchss friends, ,,lat i”,, or roomers?
AFTSRLISTING HOUSEHOLD,ASK QS. 6-9 FOR EACH PERSONAS APPROPRIATE A
what CODE(PE:OX) , S
SEX(ASK
relation- ~Fship to(HEALJOF y;ROUSE-som) 7
Vxous)
6. 7. 8. 9. IF 13 YEARS OR OWES ASK: 1S
flow old (PER30X) no. ci.rried,widowed,di.vorcedor annulled,separatedor has(he/she) neverbee.mrried? IF NE-vER >NRRIED m RE?ORTELIAS LmxGTOGETHER, CODE ‘TXFO!L.L4L.,$ IF WVERmRRItD ASD KOT LIVC+G TOG2THER BUTHAS m’N cHzLDR3N, CUDE I,WITH OLTCHILDREK.,,
ln-Si”gld
~rr for-wid.Div. Sep. with Never
FirstName LastName M F km. rated .;?. slam.. .,1d,,”
12 Is01 HEAD
14 15 1612 12 3456 7
17 1s02
19 2U 2112 12 3456 7
22 23 24 is
0326
12 12 3&56 7
27 28 29 3U
0431
12 12 3456 7
S2 S3 3U S505
3612 12
93456 7
37 38
0639 40 41
12 12 3456 7
,2 4307
4k 115 46
12 12 3456 7
47 6s08
., 50 51
12 12 3456 7
52 3,09
5, 5s 56
12 12 3456 7
57 Sn
10 ‘59 60 61
12 12 3456 7
IF MOHE THAN 10 PEOPLE IN HOUSEHOLD, GO T@ CON2’INIM’fIONBOOKLET,PAGE 2.
.,s(HESW
PERSOX)
(hi$her)last
birch-day7
10. 1, there anyone no..way from h.rm who usually 1iv.. her,?
❑ Ye. n . .
11. Do .,,3’ Qf the PC,P1, i,,this I,,usc!, cld ha,., , 1,.-. ,,:yxh.,, ,,,,:
12, Are my of the persons m th~s l:o. seh..ld now m full-time acti\.e duty with theAnneal Force, of the L’nicedSt.:e,-
❑ Yes D NO. ...
I
~
30
-3-
13. SELECTION OF RSSFONDENT
(cHECK ONE BOX)
DECK 21 CON’TINUEO
10
2EI
3CI
❑
NO ELIGIBLE 140MEN UI$DER 65 YRS. OF AGS,NO INTERVISW REQUIRSD; SKIP TO Q. 14.
ONE SLIGISLE VONAN UNDER 45 YRS. OF AGE,CURRENTLY ~ OR INFORMALLY MARRIED;CIRCLE R* S PERSON NUI!BER ON P. 2 ;SKIP TO Q, 14; THEN USE lcusRENTLY NARRIED wEETloNNAIR~
ONE ELIGIBLE !40NAN UNDER 45 YRS. OF AGE,CURRENTLY _, SEPARATED, _ OR ANNULLEO, OR SINGLE W/OrNNCHILDREN;CIRCLE R’ S PERSONN07LBERON P. 2:SKIP TO Q. 14; TH2N USE[P05T-NkRRIEo OUESTIONNAIRE~
NOSE THAN ONE ELIGIBLE wOitiN WDER 45 YRS. OF AGE,
FDLLOWSTEPS 1 - 4 ‘2U SELECT CORRECT RESPONDENT
S!SIG: List names of elisible wfemales in Summary Box--in order of ase, be-gfnning with the oldest.“ Ll”e #1. M
II SUMMARY BOX
14 11.!Line # Nam Age
I
Ha1
2
3
4
5
6
5
US. Sanpling Table on pase 1to determine which eligiblefemale to interview.
Circle R’s L%ne # in SummaryBox, and write selected R’sname here:
(Respondent’s Nune)
(cHECK ONE BOX)
•1
•1
—14. ASK EVERYOSE: Flay 1 have your telephone “.mber (in case my office wants to verify this interview)?
Telephone n.. : Area Code: I 17
No.hone . . . . . . . . . . . . ...2
IF P1lD1lE SO. GIVEN. CODE LOCATION “_
Se fad, . . . . . . . . . . . . ...3
lnllouseholJ . . . . . . . . . . ...4
I.hmnof neighbor . . . . . . . ...5
<— Other (SPECIFY) . . . . . . . . . ...6
TINE SCREENER AMThank you very much for your help. Eh3ED : PM
INTERVIEWER: FILL OUT A-F BELOW k SANPLING REPORT & INTERVIEWER SIGNATURE AND NUhBER ONPAGE 4 IMMEDIATELY AFTER YOU LEAVE THE HOUSEHOLD.
A.
s.
Race of household (by ob,erv.ation)
Slackltlegr o ...,..... .,1white . . . . . . . . . . . . . ..2Other (SPECIFT)
3
Nocable to observe..,.. . ..4
Code t,w of living quarters:
Dctacbed single family house . . . . . . 1
Trailer . . . . . . . . . . . . . . . ...2
2-4 fcanily houselapartment buildins . . . . 3
Row house (3 . . mare attached unit. ) , . k
Apartment house (5 or more unite.; freeaccess to housins units) . . . , . . . . 3
Aparttoe”t house (5 or more units; lockedentry, or g.ardod by doorman, or both) . 6
Other (SPECIFY)
7
c. LenStb of time
1s for Screener:min”tea
D. ‘“” ‘f “’’’””: Q mE. With whom did you ccmduct the Screener?
19 [same (,) of screener i. fomant (s)]
20 21
22 2s
28 27
F. If screener informant ~ household member,describe relationship to selected re*P.an-de”t . . co household.
(Relationship to Household)
[GO TO PAGE L I
31
G.YM
*c*
CHECK (~ HERE IF NOMISSED DU OR OTHER IICORRECTIONS
7]
%
AT SAMPLEADDRESSL
OFFICEUSE ONLYDU Line
Code(10 (11!17
?hone
callCol.
1
2
3
4
5
6
7
8
Psu
AddressCorrections,Additions,
orDeletions
SANPLINGREPORTON MISSED DUiS AND OTHER CORRECTIONS m— %. #_ Part
DescriptionofCorrections,Additions,
orDeletions
.
Total AdditionalI)ULs
sampleLine # Case # 1 uDU’s Added DU’S Deleted OFFICEUSECorrectionAddltIonsAddLtlons ddztzons
to at Same at Same between Line ONLY
Line #Reason BLOCK#
(f%%?) (Ci%~kij)‘Na%ip ‘} (77-80)
II
1
I
I
I
I
I,
I
I+-,
I
II
I
I,
I
I1
I I IIM I I !IF 1 TO 4 DUIS:
DECK 21FILL OUT HOUSEHOLDSCRSENSRFOR EACH & CONDUCTSCKEENRRIN2ERVIEW.
=T
IF 5 OR MORSDU’S:CALL SANFLINGDE~. (cOLLECTAC 312/684-56oo)FOR INSTRUCTIONS_ YOU DO ANT ADDITIONALSCKSENERINTERVIEWS.
INTERVIEWER
I I I I 20 32ID NUMBER:
AT TIIE OF PHONECALL, CIRCLECODE IN “PHONECALL COL1lFOR LINE(S)AT WHICH YOUARI INSTRUCTED’20IXlSCREENERINTERVIEW(S);THEN FILL OUT SCREENERFOR EACH OF THOSE & CCJ:DU.:rSC8.EEN8RINTERvIEw.
VITAL AND HEALTH STATISTICS PUBLICATIONS SERIES
Foremerly Public Health Service Publication No. 1000
Series 1. Programs and Collection Procedures. -Reports which describe the general programs of the National Center for Health Statistics and its offices and divisions, data collection methods used, definitions, and other material necessary for understanding the data.
Series 2. Data Evaluation and Methods Research . -Studies of new statistical methodology including experimental tests of new survey methods, studies of vital statistics collection methods, new analytical techniques, objective evaluations of reliability of collected data, contributions to statistical theory.
Series 3. AnaZy tical Studies. -Reports presenting analytical or interpretive studies based on vital and health statistics, carrying the analysis further th .an the expository types of reports in the other series.
Series 4. Documents and Committee Reports. -Final reports of major committees concerned with vital and health statistics, and documents such as recommended model vital registration laws and revised birth and death certificates.
Series 10. Data from the Health Interview Survey. -Statistics on illness; accidental injuries; disability; use of hospital, medical, dental, and other services; and other health-related topics, based on data collected in a continuing national household interview survey.
Series ll*, Data from the Health Examination Survey. -Data from direct examination, testing, and measurement of national samples of the civilian, noninstitutionalized population provide the basis for two types of reports: (I) estimates of the medically defined prevalence of specific diseases in the United States and the distributions of the population with respect to physical, physiological, and psychological charac- teristics; and (2) analysis of relationships among the various measurements without reference to an explicit finite universe of persons.
Series 12. Data from the Institutionalized Population Surveys. -Discontinued effective 1975. Future reports from these surveys will be in Series 13.
Series 13. Data on Health Resources Utilization. ,-Statistics on the utilization of health manpower providing long-term care, ambulatory care, hospital care, and family planning services.
and facilities
Series 14. Data on Health Resources: Manpower and Facilities. -Statistics on the numbers, geographic distrib- ution, and characteristics of health resources including physicians, dentists, nurses, other health occu- pations, hospitals, nursing homes, and outpatient facilities.
Series 20. Data on Mortality. -Various statistics on mortality other than as included in regular annual or monthly reports. Special analyses by cause of death, age, and other demographic variables; geographic and time series analyses; and statistics on characteristics of deaths not available from the vital records, based on sample surveys of those records.
Series 21, Data on Natality, Marriage, and Divorce. -Various statistics on natality, marriage, and divorce other than as included in regular annual or monthly reports. Special analyses by demographic variables; geographic and time series analyses; studies of fertility; and statistics on characteristics *of births not available from the vital records, based on sample surveys of those records.
Series 22. Data j?om the National Mortality and Natality Surveys. -Discontinued .effective 1975. Future reports from these sample surveys based on vital records will be included in Series 20 and 21, respectively.
Series 23. Data from the National Survey of Family Growth. -Statistics on fertility, family formation and disso- lution, family planning, and related maternal and infant health topics derived from a biennial survey of a nationwide probability sample of ever-married women 1544 years of age.
For a list of titles of reports published in these series, write to: Scientific and Technical Information Branch National Center for Health Statistics Public Health Service Hyattsville, Md. 20782