
Flow and diffusion of high-stakes test scores

M. Marder¹ and D. Bansal

Center for Nonlinear Dynamics and Department of Physics, University of Texas, Austin, TX 78712

Edited by Leo P. Kadanoff, University of Chicago, Chicago, IL, and approved August 18, 2009 (received for review December 2, 2008)

We apply visualization and modeling methods for convective and diffusive flows to public school mathematics test scores from Texas. We obtain plots that show the most likely future and past scores of students, the effects of random processes such as guessing, and the rate at which students appear in and disappear from schools. We show that student outcomes depend strongly upon economic class, and identify the grade levels where flows of different groups diverge most strongly. Changing the effectiveness of instruction in one grade naturally leads to strongly nonlinear effects on student outcomes in subsequent grades.

Fokker–Planck equation | convection | education

Texas began testing almost every student in almost every public school in grades 3-11 in 2003 with the Texas Assessment of Knowledge and Skills (TAKS). Every other state in the United States administers similar tests and gathers similar data, either because of its own testing history, or because of the Elementary and Secondary Education Act of 2001 (No Child Left Behind, or NCLB). Texas mathematics scores for the years 2003 through 2007 comprise a data set involving more than 17 million examinations of over 4.6 million distinct students. Here we borrow techniques from statistical mechanics (1) developed to describe particle flows with convection and diffusion and apply them to these mathematics scores. The methods we use to display data are motivated by the desire to let the numbers speak for themselves with minimal filtering by expectations or theories.

The most similar previous work describes schools using Markov models. “Demographic accounting” (2) predicts changes in the distribution of a population over time using Markov models and has been used to try to predict student enrollment year to year (3, 4), likely graduation times for students (5), and the production of and demand for teachers (6). We obtain a more detailed description of students based on large quantities of testing data that are just starting to become available. Working in a space of score and time we pursue approximations that lead from general Markov models to Fokker–Planck equations, and obtain the advantages in physical interpretation that follow from the ideas of convection and diffusion.

Results

Fig. 1A compares Texas mathematics scores taken in spring 2006 to those taken in spring 2007. Each arrow represents a group of students whose score fell into a range such as 80%-89% in 2006. The tail of each arrow is centered in the bin where students start. The tip of the arrow points in the direction of the average change in score of the students in this group. The number of students accounted for by the arrow is shown by its area (not its length). The question one answers by following the flow is “If I know students’ scores when they are in third grade, what is the most likely set of scores for them to have as they head towards 11th?” The plot shows a snapshot of student motion across all grades in one year, and does not show the motion of particular individuals all the way from third to 11th grade. The number of students represented by each arrow is large (40,000 students for the larger arrows in Fig. 1) so the standard error of the mean for changes in score is around 0.1%. Three shaded bands indicate the cut scores that divide commended from passing and failing performance.

Knowing students’ scores in one year does not completely determine their scores the next year. Score changes have a random component. The degree of randomness can be reduced by judiciously grouping similar students, but cannot be eliminated. Variations between students and schools, and the fact that students guess at problems they do not know, lead to uncertainty on the order of 10% in the change of individual scores from year to year. Adopting the language of fluids, both convection and diffusion contribute to the flow of students. Eq. 4 presents a Fokker–Planck equation (1) that makes this idea precise.
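As an order-of-magnitude check on the quoted standard error, the sketch below combines the two figures given above: roughly 40,000 students per large arrow and an individual uncertainty of order 10%. Treating that 10% as the standard deviation of individual score changes is an assumption made here for illustration, not a value reported in the paper.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean for n independent values with spread sd."""
    return sd / math.sqrt(n)

# Assumption: the ~10% individual uncertainty acts as a standard deviation
# (in percentage points); 40,000 is the arrow size quoted in the text.
se = standard_error(sd=10.0, n=40_000)
print(se)  # in percentage points
```

The result is a few hundredths of a percentage point, the same order as the roughly 0.1% quoted for the larger arrows.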

The diffusive contribution to the flow due to guessing can be modeled mathematically: students invariably guess the answers to questions they do not know, because there is no penalty for guessing, and the fraction of correct guesses can be modeled by a binomial distribution. Each question has four responses, so the probability of guessing correctly is $f = 1/4$, and students who know $n$ questions out of $N$ total questions on an exam will guess at the remaining $M = N - n$, resulting in a mean (normalized) score of $(n + fM)/N$ with a variance of $f(1 - f)M/N^2$. This variance provides a lower limit for the amount of diffusion in the absence of any other diffusive terms. The actual diffusion, as measured from the data, is several times greater than this lower limit in all cases, indicating that randomness due to guessing only provides a small part of the diffusion.
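The guessing model can be evaluated directly. The following is a minimal sketch of the two formulas above; the exam length of 50 questions and the 30 known answers are hypothetical values chosen only to make the arithmetic concrete.

```python
from fractions import Fraction

def guessing_score_stats(n_known, n_total, f=Fraction(1, 4)):
    """Mean and variance of the normalized score when a student knows
    n_known of n_total questions and guesses the rest with success
    probability f (binomial model)."""
    m = n_total - n_known                      # M = N - n questions guessed
    mean = (n_known + f * m) / n_total         # (n + f M) / N
    variance = f * (1 - f) * m / n_total**2    # f (1 - f) M / N^2
    return mean, variance

# Hypothetical student: knows 30 of 50 questions.
mean, var = guessing_score_stats(n_known=30, n_total=50)
print(float(mean), float(var), float(var) ** 0.5)
```

For these illustrative numbers the guessing-induced standard deviation is about 4% of the maximum score, smaller than the roughly 10% year-to-year uncertainty quoted earlier, in line with the statement that guessing supplies only part of the diffusion.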

One consequence of diffusion is that not all students follow the path predicted by their flow arrows. Fig. 2A is a graphical representation of the amount students’ scores are raised above or below the main flow by diffusion. These differences are due both to guessing and to differences in education, school, etc. A more subtle consequence of diffusion is that following students from the future into the past is different from following them from the past into the future. One can obtain a different set of flow arrows by choosing students whose score fell into a range such as 80%-89% in 2007 and computing their average score the prior year. This flow, which is displayed as Fig. 1B, answers the question “If I know students’ scores when they are in 11th grade, what is the most likely set of scores for them to have had coming from third?”

The upper and lower plots in Fig. 1 divide students into two groups according to their level of economic need. The upper plots show students not eligible for free and reduced meals (called “not low income”), and the bottom plots show those who are eligible (called “low income”). Many other groupings are possible, including those by race and gender, economic need according to school rather than by individual, or combining race and gender with economic need.

There are two additional phenomena that contribute to student flows. Students appear and disappear from one year to the next, and students can be required to repeat a grade. These contributions are reflected to some extent in the sizes of arrows in Fig. 1, but it is useful to bring them out directly. Vertical arrows in Fig. 2B show the number of students who appeared in each grade and were missing or had zero score the previous year minus those who had been present and now vanish or get zero score. The horizontal arrows show the numbers of students who repeat a grade.

Author contributions: M.M. designed research; M.M. and D.B. performed research; M.M. and D.B. analyzed data; and M.M. and D.B. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

¹To whom correspondence should be addressed. E-mail: [email protected].

www.pnas.org/cgi/doi/10.1073/pnas.0812221106 PNAS, October 13, 2009, vol. 106, no. 41, 17267–17270

Fig. 1. Flow plots recording score changes from spring 2006 to spring 2007. The top row has data from students not eligible for free and reduced meals whereas the bottom row shows students who are eligible. That is, students from wealthier families are on top, and those from poorer families are below. Flow arrows $v_S$ show the most likely future path of students whose starting point is known. Reverse flow arrows $v^-_S$ can be followed backwards to determine the most likely history of students whose ending point is known. The top band highlights commended students, the middle band highlights passing scores, and the bottom band highlights failing scores, using the cut scores published for each year and grade by the Texas Education Agency.

Fig. 3 shows how flow fields evolve over time for low-income students. The broad outlines of the flow pattern remain remarkably constant, while at the same time there are some systematic changes such as a rapid increase in the numbers of students obtaining commended scores at 10th grade.

Discussion

A characteristic pattern in Fig. 1 is a strong horizontal flow with arrows of decreasing size above and below it pointing towards the flow center. In fluids, this phenomenon results from the competition between fluctuations and dissipation: a particle that is moving much faster than those around it because it has just received a particularly large random kick is most probably going to slow down. In statistics this phenomenon is called regression to the mean (7) and explains why arrows above the center of the flow tend to point down. Regression to the mean can be caused by several factors, including the mathematics of guessing, or by the small likelihood of students having exceptional teachers several years in a row.
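Regression to the mean can be reproduced with a very small simulation. The sketch below is a toy model and not the authors' analysis: it assumes each student has a fixed underlying skill and that each year's score is that skill plus independent noise, with all numerical values (mean 0.60, spreads 0.15 and 0.10, 20,000 students) chosen purely for illustration. Scores are not clipped to the range 0 to 1.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical model: fixed skill per student, independent noise each year
# (standing in for guessing, changes of teacher, and similar effects).
skills = [random.gauss(0.60, 0.15) for _ in range(20_000)]
year1 = [s + random.gauss(0.0, 0.10) for s in skills]
year2 = [s + random.gauss(0.0, 0.10) for s in skills]

def mean_change(lo, hi):
    """Average year-to-year change for students whose year-1 score is in [lo, hi)."""
    changes = [b - a for a, b in zip(year1, year2) if lo <= a < hi]
    return sum(changes) / len(changes)

high = mean_change(0.80, 0.90)  # bin above the center of the distribution
low = mean_change(0.30, 0.40)   # bin below the center
print(high, low)
```

In this toy model the high bin moves down on average and the low bin moves up, which is the same qualitative pattern as arrows above and below the flow center pointing towards it.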

Educational outcomes for students from wealthy and poor families are very different in Texas. The flow fields show where the greatest divergences between these groups occur. The flow patterns in the top and bottom rows of Fig. 1 start out in nearly the same direction until the transition to middle school between fifth and seventh grade, when students from economically disadvantaged backgrounds flow downwards at a higher pace than their less disadvantaged counterparts and never recover. Ninth grade is another crucial time because students who are not passing the mathematics exams are forced to repeat a grade and consequently disappear from schools in large numbers. This effect is much stronger for those who are economically disadvantaged than for those who are not, as shown in Fig. 2B.

Fig. 2. Student diffusion, disappearance, and retention. (A) Diffusion plots. These show the changes in student numbers due to random processes such as guessing on the exam (Eq. 11). To find the total change in numbers of students in every cell including contributions from diffusion, add these vertical arrows to the convective arrows of Fig. 1A. (B) Students appearing and disappearing from school or retained in a grade, deduced from mathematics exams administered in spring 2006 and spring 2007. Vertical arrows show the net result of students appearing and disappearing from school, whereas horizontal arrows show the numbers of students required to repeat a grade. Downward pointing arrows mean that more children are disappearing from a grade than appearing in it. Areas of arrows are proportional to the numbers of children involved. The scale of arrows and meaning of the colored bands is the same as in Fig. 1.

Fig. 3. Flow plots for low-income students 2003–2007. The conventions for the arrows and colored bands in these plots are the same as in Fig. 1. The figure shows that the main features of the flow are very consistent from year to year. Arrows circled in the upper right corners highlight the small but rapidly increasing numbers of low-income students performing at the highest levels on the test at 10th grade.

Flow fields address many questions about the educational system. There is a debate over the student variables that should be used to describe effects of teachers and schools. Sanders (8) states that “models should not include socio-economic or ethnic accommodations but should only include measures of previous achievement of individual students.” In this view, prior year scores contain everything one needs to know about the state of the students. However, differences between flow directions have great statistical significance. For example, sixth graders not eligible for free and reduced meals with mathematics scores between 90% and 100% in 2006/2007 drop on average in score by 4.4% the next year, whereas those eligible for free and reduced meals drop in score by 7.0% ($N \sim 30{,}000$, $t = 34$, $p < 10^{-9}$). Similar statistical significance applies to the differences between virtually all the arrows in the upper and lower rows of Fig. 1. Changes in scores depend strongly, reproducibly, and with high statistical significance, upon poverty level even after controlling for previous achievements of students. It is possible that this difference in score changes is entirely due to the lower quality of teachers assigned to the least affluent students. However, it is difficult to reach such a conclusion simply from test data; the conclusion that ineffective teachers are largely to blame for unsatisfactory student performance risks being circular (9) if ineffective teachers are defined to be those whose students’ test scores decrease (10). Drawing conclusions about school effectiveness from test data presents comparable difficulties (11).

Another claim is that the difficulty of items on the TAKS exams is carefully chosen so as to maintain students’ scores at the same level over time (12). This claim is only partly supported by the data. Most flow vectors are close to horizontal but the slopes are not negligible and, over the course of several years, students flow to very different regions from where they began, a change which depends strongly on variables such as students’ ethnic group and economic class.

Because the impetus to pass No Child Left Behind came from Texas, there have been debates over the reasons Texas test scores have been rising. McNeil et al. (13) suggest that rises in Texas test scores can be attributed to an increasing pattern of retaining low-income students at ninth grade until they disappear. Data support this claim. Retention of low-income students (those eligible for free and reduced lunch) at ninth grade increased between 2003/2004 and 2006/2007 from 31,200 to >33,000, whereas the number of low-income students disappearing from ninth grade increased from 23,200 to 28,500.

Finally, we note that linear modeling is pervasive in analysis of educational data (14), but we see many effects that are inherently nonlinear. For example, suppose that through some form of improved instruction it is possible to increase the score gains of low-income students in sixth and seventh grades. This will have the effect of diverting the entire flow pattern slightly upward. In 10th grade, the number of low-income students in the highest score bracket (90%-100%) constitutes the exponentially small tail of a distribution centered at around 60%. Thus small motions of the distribution upward result in exponentially varying gains at the top. Such gains are evident. For example, consider the low-income students scoring >90% in 10th grade circled in Fig. 3. Between 2003 and 2006, the number of these students grew by 15–50% per year, starting with 2,150 students in 2003 and ending with 7,088 in 2006. The model presented here is nonparametric and makes minimal assumptions about the form of the underlying probability distributions for student score changes.
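The sensitivity of the top bracket to small shifts can be illustrated with a normal distribution. This is only a sketch: the paper's model is nonparametric, so the Gaussian shape, the standard deviation of 0.15, and the 0.03 shift are all assumptions introduced here for illustration. Only the center near 60% and the 90% threshold come from the text.

```python
import math

def tail_fraction(threshold, mean, sd):
    """Fraction of a normal distribution lying above threshold."""
    z = (threshold - mean) / (sd * math.sqrt(2.0))
    return 0.5 * math.erfc(z)

sd = 0.15                                   # hypothetical spread of scores
base = tail_fraction(0.90, 0.60, sd)        # distribution centered at 60%
shifted = tail_fraction(0.90, 0.63, sd)     # same distribution moved up by 0.03
ratio = shifted / base
print(base, shifted, ratio)
```

Under these assumptions, moving the center of the distribution up by three percentage points increases the fraction above 90% by more than half, which is the kind of disproportionate response at the top that the paragraph above describes.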

It should be possible to use our convection and diffusion models in order to predict quantitatively how improvements at lower grades affect flow of students at higher grades. These predictions would apply to the average behavior of large numbers of students although individuals would display considerable variation. The success of these predictions will partly hinge on the extent to which score changes in successive years are statistically independent. Our preliminary examination of this question indicates that knowledge of two prior years’ scores only improves prediction by ≈10%, so the assumption of independence is acceptable.


Materials and Methods

Dataset. We obtained all TAKS-related data that the Texas Education Agency, the government agency charged with administering and evaluating standardized tests in Texas, is able to release. Each row of the dataset contains information about a single examination of a single student, including that student’s demographic details as well as their responses to each question of the examination along with their score. Each student has a globally unique, anonymized identifier that allows us to follow them through time as they move between schools. Students are described by race, gender, and eligibility for free and reduced-price meals, which is an indication of whether family income is low or high. Every answer they have bubbled on each test is provided, and can be compared with correct answers. The schools and districts in which students take the exams are named. There are some things we would have liked to know that are not included: there is no indication of whether students have changed schools in the middle of the year. The State probably does not know. More puzzling, there is no information on the teacher under whose care the student took the exam. The State certainly does have this information, because it returns to each teacher a record of their students’ scores, and provides reports on the performance of every teacher to each district. However, the State must not retain these data, because the Texas Education Agency is required to release all information in its possession in accord with the Texas Freedom of Information Act, and data linking students to teachers are not available.

The data set has defects, some of which can partially be remedied and some of which cannot. There are over 27,000 students with invalid records who end up coded with the same unique identifier and must be removed. Such defects appear to involve tens of thousands of students but there is nothing to be done about them. Out of a population of millions we do not believe that these defects are likely to distort our results.

We transformed the dataset to make analysis more manageable, but without changing any entries. We created normalized sets of tables to speed up searches in MySQL. We also produced condensed files containing all relevant information about each student on one line, useful for analysis with Python scripts. We normalize all students’ scores by dividing by the maximum possible score for that student for that exam.

Statistical Methods. Let $N^a_S$ and $N^b_S$ be the numbers of students in years $a$ and $b$ with score $S$. $N$ will always depend upon some other variables as well, such as the grade level, perhaps economic need or race of the student, but we suppress additional indices for the moment so as to focus on the primary variables of scores and time. Let $R^{ab}_{S' \to S}$ be the number of students with score $S'$ in year $a$ who score $S$ in year $b$. The master equation is

$$N^b_S - N^a_S = \sum_{\delta S} \left( R^{ab}_{S-\delta S \to S} - R^{ab}_{S \to S+\delta S} \right) + R^{ab}_{0 \to S} - R^{ab}_{S \to 0}. \tag{1}$$
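The bookkeeping in Eq. 1 can be checked on a toy table of transitions. The sketch below uses a handful of hypothetical student records (score bins in percent, with bin 0 standing for a student who is absent or has no valid score in that year); the records and helper names are invented for illustration and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical records: (score bin in year a, score bin in year b).
# Bin 0 means the student was absent or had no valid score that year.
records = [(50, 60), (50, 50), (60, 50), (60, 70), (70, 70),
           (70, 60), (0, 60), (50, 0), (60, 60), (70, 0)]

R = Counter(records)  # R[(s_prev, s_next)] = number of such transitions

def count_in_year_a(s):
    """N^a_S: students with score s in year a."""
    return sum(c for (sa, _), c in R.items() if sa == s)

def count_in_year_b(s):
    """N^b_S: students with score s in year b."""
    return sum(c for (_, sb), c in R.items() if sb == s)

def master_rhs(s):
    """Right-hand side of Eq. 1: net inflow between nonzero bins,
    plus arrivals from bin 0, minus departures to bin 0."""
    inflow = sum(c for (sa, sb), c in R.items() if sb == s and sa not in (0, s))
    outflow = sum(c for (sa, sb), c in R.items() if sa == s and sb not in (0, s))
    return inflow - outflow + R[(0, s)] - R[(s, 0)]

balance = {s: (count_in_year_b(s) - count_in_year_a(s), master_rhs(s))
           for s in (50, 60, 70)}
print(balance)
```

Students who stay in the same bin contribute equally to both terms of the sum and cancel, so they are simply left out of the inflow and outflow counts.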

Transitions to and from the state with score zero have to be treated separately because they correspond to students who were sick, absent for other reasons, left school, left the country, or have an invalid exam. We do not distinguish between students who show up in the dataset with a zero score and those who do not appear at all. We define

$$\Delta^{ab}_S \equiv R^{ab}_{S \to 0} - R^{ab}_{0 \to S} \tag{2}$$

to be the disappearance of students with score $S$ between years $a$ and $b$.

To obtain a Fokker–Planck equation, assume that $R$ is slowly varying as a function of $S$, although not slowly varying as a function of $\delta S$. Then, to second order in score changes,

$$R^{ab}_{S-\delta S \to S} \approx R^{ab}_{S \to S+\delta S} - \delta S \frac{\partial}{\partial S} R^{ab}_{S \to S+\delta S} + \frac{1}{2} \delta S^2 \frac{\partial^2}{\partial S^2} R^{ab}_{S \to S+\delta S}. \tag{3}$$

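This gives

$$N^b_S - N^a_S \approx -\Delta^{ab}_S - \frac{\partial}{\partial S} v_S N^a_S + \frac{\partial^2}{\partial S^2} D_S N^a_S, \tag{4}$$

where the forward flow $v_S$ and the forward diffusion $D_S$ are defined by

$$v_S \equiv \sum_{\delta S} \delta S \, \frac{R^{ab}_{S \to S+\delta S}}{N^a_S} \tag{5}$$

$$D_S \equiv \sum_{\delta S} \frac{1}{2} \delta S^2 \, \frac{R^{ab}_{S \to S+\delta S}}{N^a_S}. \tag{6}$$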
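Eqs. 5 and 6 are weighted averages over the transition counts leaving one score bin. A minimal sketch follows, using hypothetical counts for a single starting bin (scores in percent). As a simplifying assumption, the toy table contains no students who disappear, so $N^a_S$ is just the sum of the listed counts.

```python
from collections import Counter

# Hypothetical transition counts R[(S, S_next)] for one starting bin,
# with scores in percent. No disappearing students in this toy table.
R = Counter({(50, 40): 10, (50, 50): 20, (50, 60): 30, (50, 70): 10})

def forward_flow_and_diffusion(R, s):
    """Return (v_S, D_S) of Eqs. 5 and 6 for starting score s."""
    rows = [(sb - sa, c) for (sa, sb), c in R.items() if sa == s]
    n_a = sum(c for _, c in rows)                        # N^a_S
    v = sum(ds * c for ds, c in rows) / n_a              # Eq. 5
    d = sum(0.5 * ds**2 * c for ds, c in rows) / n_a     # Eq. 6
    return v, d

v_50, d_50 = forward_flow_and_diffusion(R, 50)
print(v_50, d_50, (2 * d_50) ** 0.5)
```

The last printed value, the square root of $2 D_S$, is the root-mean-square score change for this bin, which gives a sense of the scale of random variation that $D_S$ encodes.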

The forward flow $v_S$ gives the average score change of students with score $S$ in year $a$ who also have a (nonzero) score in year $b$. The diffusion coefficient $D_S$ sets the magnitude of random variations in scores. One can repeat the derivation of Eq. 4 but Taylor expand the second term of Eq. 1 rather than the first. This leads to the reverse flow $v^-_S$ and reverse diffusion $D^-_S$:

$$v^-_S \equiv \sum_{\delta S} \delta S \, \frac{R^{ab}_{S-\delta S \to S}}{N^b_S} \tag{7}$$

$$D^-_S \equiv \sum_{\delta S} \frac{1}{2} \delta S^2 \, \frac{R^{ab}_{S-\delta S \to S}}{N^b_S}. \tag{8}$$

$$N^b_S - N^a_S \approx -\Delta^{ab}_S - \frac{\partial}{\partial S} v^-_S N^b_S - \frac{\partial^2}{\partial S^2} D^-_S N^b_S. \tag{9}$$
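The difference between the forward flow of Eq. 5 and the reverse flow of Eq. 7 shows up even in a tiny table. The sketch below uses hypothetical transition counts (scores in percent, no disappearing students) and compares the two flows for the same bin; the numbers are invented for illustration.

```python
from collections import Counter

# Hypothetical counts R[(score in year a, score in year b)], scores in percent.
R = Counter({(50, 50): 40, (50, 60): 10, (60, 50): 20, (60, 60): 30,
             (60, 70): 10, (70, 60): 5, (70, 70): 20})

def forward_flow(R, s):
    """Eq. 5: average change of students who START at score s in year a."""
    n_a = sum(c for (sa, _), c in R.items() if sa == s)
    return sum((sb - sa) * c for (sa, sb), c in R.items() if sa == s) / n_a

def reverse_flow(R, s):
    """Eq. 7: average change of students who END at score s in year b."""
    n_b = sum(c for (_, sb), c in R.items() if sb == s)
    return sum((sb - sa) * c for (sa, sb), c in R.items() if sb == s) / n_b

v_forward = forward_flow(R, 60)
v_reverse = reverse_flow(R, 60)
print(v_forward, v_reverse)
```

In this toy table, students who start at 60 drift downward on average, while students who end at 60 arrived on average from below, so the two arrows for the same bin point in different directions.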

The reverse flow $v^-_S$ answers the question “If a student has a score between 80% and 89% in 11th grade, what is the most likely path to have been followed since third grade?” The average of the forward and reverse flows is a current that predicts changes in student numbers without diffusion:

$$N^b_S - N^a_S + \Delta^{ab}_S \approx -\frac{\partial}{\partial S} \left( v_S N^a_S + v^-_S N^b_S \right) / 2 \equiv -\frac{\partial J}{\partial S}. \tag{10}$$

Subtracting Eq. 4 from Eq. 9 gives the identity

$$J_D \equiv -\frac{v_S N^a_S - v^-_S N^b_S}{2} = -\frac{1}{2} \frac{\partial}{\partial S} \left[ D_S N^a_S + D^-_S N^b_S \right] + \text{const}, \tag{11}$$

which means that the difference between the forward and reverse flows or the sum of the forward and reverse diffusions provides a measure of total diffusion.
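The intermediate steps behind Eq. 11 can be written out as follows. This is a sketch using only the symbols already defined; the integration constant is left unspecified, as in the text.

```latex
% Eq. 9 minus Eq. 4: the left-hand sides and the \Delta^{ab}_S terms cancel.
0 \approx \frac{\partial}{\partial S}\left( v_S N^a_S - v^-_S N^b_S \right)
        - \frac{\partial^2}{\partial S^2}\left( D_S N^a_S + D^-_S N^b_S \right)

% Integrating once with respect to S:
v_S N^a_S - v^-_S N^b_S \approx
    \frac{\partial}{\partial S}\left( D_S N^a_S + D^-_S N^b_S \right) + \text{const}

% Multiplying by -1/2 (and absorbing the factor into the constant) gives Eq. 11:
J_D \equiv -\frac{v_S N^a_S - v^-_S N^b_S}{2}
    = -\frac{1}{2}\frac{\partial}{\partial S}\left[ D_S N^a_S + D^-_S N^b_S \right] + \text{const}
```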

We now can interpret more precisely what we have plotted. Fig. 1A displays the vector $(\delta t, v_S) N^a_S$, where $\delta t = 1$ year is the horizontal distance from one grade level to the next, with the vector scaled in the vertical direction so that $v_S = 1$ corresponds to the height difference between 0 and 100%. The flow plots show streamlines of the most likely future path of students. Fig. 1B displays the vector $-(\delta t, v^-_S) N^b_S$. Fig. 2A plots triangles of height $J_D$, and whose width is proportional to $\sqrt{N^a_S + N^b_S}$. Note that $J_D + v_S N^a_S = J$, so the vertical components of the vectors plotted under Flow and Diffusion sum to the average score changes of all students including both the effects of convective flow and diffusion. The vertical arrows in Fig. 2B are the disappearance rate $\Delta^{ab}_S$. The horizontal arrows are computed similarly, and are obtained from the total number of students in two consecutive years found to be repeating a grade.

ACKNOWLEDGMENTS. We thank Philip Kromer for helping to put the data in normalized form, Stephen Stigler for posing stimulating questions following a presentation, the Texas Education Agency for supplying preliminary data, and the University of Texas Dallas Educational Research Center and the Ray Marshall Center at the University of Texas at Austin for access to final unfiltered data. This work was supported by National Science Foundation Grant DMR 0701373.

1. van Kampen NG (2007) in Stochastic Processes in Physics and Chemistry (North-Holland, Amsterdam), 3rd Ed, pp 197–200.

2. Stone R (1971) Demographic Accounting and Model-Building (Organization for Economic Cooperation and Development, Washington, DC).

3. Gani J (1986) Formulae for projecting enrolments and degrees awarded in universities. J R Stat Soc 126:400–409.

4. Nicholls MG (1982) Short term prediction of student numbers in the Victorian secondary education system. Aust New Zealand J Stat 24:179–190.

5. Shah C, Burke G (1999) An undergraduate student flow model: Australian higher education. High Educ 37:359–375.

6. Burke G (1976) Demographic accounting and modeling: An application to trainee secondary teachers in Victoria. Aust Econ Pap 15:240–251.

7. Stigler SM (1999) Statistics on the Table (Harvard Univ Press, Cambridge, MA), pp 157–188.

8. Sanders WL (2000) Value-added assessment from student achievement data: Opportunities and hurdles. J Pers Eval Educ 14:329–339.

9. Kupermintz H (2003) Teacher effects and teacher effectiveness: A validity investigation of the Tennessee Value Added Assessment System. Educ Eval Policy Anal 25:287–298.

10. Jordan HR, Mendro RL, Weerasinghe D (1997) Teacher Effects on Longitudinal Student Achievement: A Report on Research in Progress (CREATE, Dallas, TX).

11. Haertel E (2005) Using a longitudinal student tracking system to improve the design for public school accountability in California. Available at http://ed.stanford.edu/suse/faculty/haertel/Haertel-Value-Added.pdf.

12. Stroup W (2009) What Bernie Madoff can teach us about accountability in education. Education Weekly 28:22–23.

13. McNeil LM, Coppola E, Radigan J, Heilig JV (2008) Avoidable losses: High-stakes accountability and the dropout crisis. Educ Policy Anal Arch 16:1–45.

14. Weerasinghe D (2007) How to Compute School and Classroom Effectiveness Indices: The Value-Added Model Implemented in Dallas Independent School District (Office of Institutional Research, Dallas Independent School District, Dallas, TX).
