7/27/2019 Nagler - Coding Style and Good Computing Practices
http://slidepdf.com/reader/full/nagler-coding-style-and-good-computing-practices 1/6
Coding Style and Good Computing Practices
Author(s): Jonathan Nagler
Source: PS: Political Science and Politics, Vol. 28, No. 3 (Sep., 1995), pp. 488-492
Published by: American Political Science Association
Stable URL: http://www.jstor.org/stable/420315
Accessed: 06/10/2008 05:15
Coding Style and Good Computing Practices
Jonathan Nagler, University of California, Riverside
Replication of scholarly analysis depends on individual researchers being able to explain exactly what they have done. And being able to explain exactly what one has done requires keeping good records of it. This article describes basic good computing practices1 and offers advice for writing clear code that facilitates the task of replicating the research. The goals are simple. First, the researcher should be able to replicate his or her own work six hours later, six months later, and even six years later. Second, others should be able to look at the code and understand what was being done (and preferably why it was being done). In addition, following good computing practices encourages the researcher to maintain a thorough grasp of what is being done with the data. This allows for more efficient research: one is not always rereading one's own work and retracing one's own steps to perform the smallest bit of additional analysis.
This article is not meant only for sophisticated statistical researchers. In fact, the statistical procedures you ultimately use have nothing to do with the topic of this article. These practices should be used even if you are doing nothing more complex than producing 2 x 2 tables with a particular data set. The sequence this article is written in leads the reader from more general themes to more specific ones. I encourage readers who haven't the patience for the big picture to skip ahead rather than skip the whole article. Even learning basic conventions about variable names will put you ahead of most coders!
First, what do I mean by computing practices and "code"? Computing practices covers everything you do from the time you open the codebook to a data set, or begin to enter your own data, to the time you produce the actual numbers that will be placed in a table of an article. "Code" refers to the actual computer syntax, or computer program, used to perform the computations. This most likely means the set of commands issued to a higher-level statistics package such as SAS or SPSS. A given file of commands is a program. Most political scientists do not think of themselves as computer programmers. But when you write a line of syntax in SAS or SPSS, that is exactly what you are doing: programming a computer. It is coincidental that the language you use is SAS instead of Fortran or C. The paradox is that most political scientists are not trained as computer programmers. And so they never learned basic elements of programming style, nor are they practiced in the art of writing clean, readable, maintainable code. The classic text on programming style remains Kernighan and Plauger, The Elements of Programming Style (1974, 1978), and most books on programming include a chapter on style. I recommend including a section on coding style in every graduate methods sequence.

This article starts from the point at which a raw data set exists somewhere on a computer. It breaks analysis into two basic parts: coding the data to put it into a usable form, and computing with the data. It is useful to break the first part into two component steps: reading the essential data from a larger data set; and recoding it and computing new variables to be used in data analysis.
Lab Books
Our peers in the laboratory sciences maintain lab books. These books indicate what they did to perform their experiments. We are not generally performing experiments. But we have identical goals. We want to be able to retrieve a record of every step we have followed to produce a particular result. This means the lab book should include the name of every file of commands you wrote, with a brief synopsis of what the file was intended to do. The lab book should indicate which files produced results worth noting.
It is a good idea to have a template that you follow for each lab book entry; this encourages you to avoid becoming careless in your entries. A template could include date, file, author, purpose, results, and machine. You might have a set of purposes (recoding, data extraction, data analysis) that you feel each file fits into. It may seem superfluous to indicate on what machine the file was executed. But should you develop the habit of computing on several machines, or should you move from one machine to another in the course of a project, this information becomes invaluable in making sure you can locate all your files.
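As a sketch, a lab book entry following such a template might look like the following (the field names echo the template just described, and the entries themselves are hypothetical, drawn loosely from the command-file example later in this article):

```
Date:    Feb 2, 1994
File:    mnpl52.g
Author:  JN
Machine: billandal (IBM/RS6000)
Purpose: data analysis - multinomial probits on our basic model
Results: converged; estimates worth keeping; see mnpl52.out
```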
It makes a lot of sense to have the lab book on-line. It can be in either a WordPerfect file, or a plain ASCII text file, or whatever you are most comfortable writing in. First, it is easy to search for particular events if you remember their names. Second, it can be accessed by more than one researcher if you are doing joint work.
Rule: Maintain a lab book from the beginning of a project to the end.
Lab books should provide an audit trail of any of your results. The lab book should contain all the information necessary to take a given piece of data analysis and trace back where all of the data came from, including its source and all recode steps. So although you might want to keep many sorts of entries in a lab book, and there are many different styles in which to keep them, the point to keep in mind is that whatever style you choose, it should meet this purpose.
PS: Political Science & Politics 488
Command Files
The notion of command files may appear as an anachronism to some. After all, can't we all just work interactively by pointing and clicking to do our analysis, not having to go through the tedium of typing separate lines for each regression we want to run and submitting our job for analysis? Yes, we can. But we probably don't want to. And even if we do, modern software will keep a log for us of what we have done (a fact that seems to escape many users of the software). The reason I am not a fan of interactive work has to do with the nature of the analysis we do. The model we ultimately settle on was usually arrived at after several estimates testing many models. And this is appropriate: we all ought to be testing the robustness of our results to changes in model specification, changes in variable measurement, and so forth. By writing command files to perform estimates, and keeping each version, one has a record of all this.
You generally want a large set of comments at the beginning of each file, indicating what the file is intended to do. And each file should not do too much. Comments on the top of a file should list the following:

1. State the date it was written, and by whom.
2. Include a description of what the file does.
3. Note from what file it was immediately derived.
4. Note all changes from its predecessors, where appropriate.
5. Indicate any data sets the file uses as input. If the file uses a data set, there should be a comment indicating the source of the data set. This could be either another file you have that produced the data or a description of a public-use data set. If it is a public-use data set, include the date and version number.
6. Indicate any output files or data set created.
For example, the first nine lines of a command file might be:

/* File-Name: mnpl52.g
   Date: Feb 2, 1994
   Author: JN
   Purpose: This file does multinomial probits on our basic model.
   Data Used: nes921.dat (created by mkascic.cmd)
   Output File: mnpl52.out
   Data Output: None
   Machine: billandal (IBM/RS6000)
*/
You could keep a template with the fields above left blank, and read in the template to start each new command file. You should treat the comments at the top of a file the way you would the notes on a table; they should allow the file to stand alone and be interpreted by anyone opening the file without access to other files.
Know the Goal of the Code
It makes no sense to start coding variables if you do not know what the point of the analysis using the variables will be. You end up with a bunch of variables that are all being recoded later on and confusing you to no end. Before you start manipulating the data, figure out what you will be testing and how the variables need to be set up.

Example: say you want to test the claim that people who voted in 1992, but did not vote in 1988, were more likely to support Perot than Bush in 1992. How do you code the independent variable indicated here? Well, you really want a "new voter" variable, because your substantive hypothesis is stated most directly in terms of "new voters." Thus, the variable should be coded so that:

voted in 1992, did not vote in 1988 = 1
voted in 1992, voted in 1988 = 0
Rule: Code each variable so that it corresponds as closely as possible to a verbal description of the substantive hypothesis the variable will be used to test.
Fixing Mistakes
If you find errors, the errors should be corrected where they first appear in your code, not patched over later. This may mean rerunning many files. But this is preferable to the alternative. If you patch the error further downstream in the code, then you will need to remember to repeat the patch should you make any change in the early part of the data preparation (i.e., you decide you need to pull additional variables from a data set, etc.). If the patch is downstream, you are also likely to get confused as to which of your analysis runs are based on legitimate (patched) data, and which are based on incorrect data.

Rule: Errors in code should be corrected where they occur and the code rerun.
Data Manipulation Versus Data Analysis
Most data go through many manipulations from raw form to the form we use in our analyses. It makes sense to isolate as much of this as possible in a separate file. For instance, suppose you are using the 1992 National Election Study (NES) presidential election file. You have 100 variables in mind that you might use in your analysis. It makes sense to have a program that will pull those 100 variables from the NES data set, give them the names you want, do any basic recoding that you know you will maintain for all of your analysis, and create a "system file" of these 100 named and recoded variables that can be read by your statistics package. There are at least two reasons for this. First, it saves a lot of time. You are going to estimate at least 50 models before settling on one. Do you really want to read the whole NES data set off the disk 50 times when you could read a file 1/20th the size instead? Second, why do all that recoding and naming 50 times? You might accidentally alter the recodes in one of your files.

Rule: Separate tasks related to data manipulation vs. data analysis into separate files.
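To make the extraction step concrete, here is a minimal sketch in Python rather than a statistics package (the file names and the variable list are hypothetical, not taken from the NES codebook): a one-time program pulls only the needed variables from the raw file into a smaller "system file" that every later analysis run reads.

```python
import csv

# Hypothetical variable subset to carry into the smaller "system file".
KEEP = ["caseid", "partyid", "educ", "women"]

def extract(raw_path, out_path, keep=KEEP):
    """Write only the columns in `keep` to out_path; later analysis runs
    read the small file instead of rereading the whole raw data set."""
    with open(raw_path, newline="") as raw, open(out_path, "w", newline="") as out:
        reader = csv.DictReader(raw)
        writer = csv.DictWriter(out, fieldnames=keep)
        writer.writeheader()
        for row in reader:
            writer.writerow({k: row[k] for k in keep})
```

Running `extract("raw.csv", "small.csv")` once then isolates all later analysis from the raw file, which is the point of the rule: the recoding and naming happen in exactly one place.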
Modularity
Separating data manipulation and data analysis is an example of modularity. Modularity refers to the concept that tasks should be split up. If two tasks can be performed sequentially, rather than two at a time, then perform them sequentially. The logic for this is simple. Lots of things can go wrong. You want to be able to isolate what went wrong. You also want to be able to isolate what went right. After what specific change did things improve? Also, this makes for much more readable code.
Thus if you will be producing some tables before multivariate analysis, you might have a series of programs: descrip1.cmd, descrip2.cmd, ..., descrip9.cmd. Following this, you might produce: reg1.cmd, reg2.cmd, ..., reg99.cmd. You need not constrain yourself to one regression per file. But the regressions in each file should constitute a coherent set. For instance, one file might contain your three most likely models of vote choice, each disaggregated by sex. This does tend to lead to proliferation of files. One can start with reg1.cmd and finish with reg243.cmd. But disk space is cheap these days, and the files can easily be compressed and stored on floppies if disk space is getting tight.
Rule: Each program should perform only one task.
KISS
Keep it simple and don't get too clever. You may think of a very clever way to code something this week. Unfortunately you may not be as clever next week, and you might not be able to figure out what you did. Also, the next person to read your code might not be so clever. Finally, you might not be as clever as you think: the clever way you think of to do three steps at once might only work for five out of six possible cases you are implementing it on. Or it might create nonsense values out of what should be missing data. Why take the chance? Computers are very fast. Any gains you make through efficiency will be dwarfed by the confusion caused later on in trying to figure out what exactly your code is doing so efficiently.

Rule: Do not try to be as clever as possible when coding. Try to write code that is as simple as possible.
Variable Names
There is basically no place for variables named X1 other than in simulated data. Our data are real; they should have names that impart as much meaning as possible. Unfortunately many statistical packages still limit us to eight-character names (and for portability's sake, we are forced to stick with eight-character names even in packages that don't impose the limit). However, your keyboard has 84 keys, and the alphabet has 52 letters: 26 lower-case and 26 upper-case. Indulge yourself and make liberal use of them. There are also several additional useful characters, such as the underscore and the digits 0-9, at your disposal. It is a convention in programming to use UPPER-CASE characters to indicate constants and lower-case characters to indicate variables. This might not be as useful in statistical programming. You might adopt the convention that capitals refer to computed quantities (such as PROBCHC1: the estimated PROBability of CHoosing Choice 1). And if you are trying to have your code closely follow the notation of a particular econometrics article, you might use a capital U for utility, or a capital V for the systemic component of utility. Obviously in such a case comments would be in order! Some people like variable names such as NatlEcR because the use of capitals allows for clearly indicating where one word stops and another starts. NatlEcR makes it easier to think of 'National Economic-Retrospective' than natlecr might. You will need to make some tradeoffs in the conventions you choose. The important thing is to adopt a convention on the use of capitals and stick with it.

Rule: Use a consistent style regarding lower- and upper-case letters.

Rule: Use variable names that have substantive meaning.
When possible a variable name should reveal subject and direction. The simplest case is probably a dummy variable for a respondent's gender; imagine it is coded so that 0 = men, 1 = women. We could call the variable either "SEX" or "WOMEN." It is clear that "WOMEN" is the better name because it indicates the direction of the variable. When we see our coefficients in the output we won't have to guess whether we coded men = 1 or women = 1.

Rule: Use variable names that indicate direction where possible.
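The same point can be sketched in Python (the raw coding 1 = male, 2 = female is an assumed codebook scheme, not one given in the article): the derived dummy is named for its direction, so output listing "women" needs no guessing.

```python
# Derive a dummy named for its direction ("women"), not its subject ("sex").
# Assumed raw coding (hypothetical): 1 = male, 2 = female, other = missing.
def make_women(sex):
    """Return 1 for women, 0 for men, None for any other (missing) code."""
    if sex == 2:
        return 1
    if sex == 1:
        return 0
    return None  # defensively treat unexpected codes as missing

women = [make_women(s) for s in [1, 2, 2, 9]]  # 9: a missing-data code
```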
Similarly, value labels are useful for packages that permit them. The examples of computer syntax I use in this article are written in SST (Dubin/Rivers 1992), but they can be translated easily into SAS, SPSS, or most statistical packages. Here is a simple example. The variable natlecr indicates the respondent's retrospective view of performance of the national economy. Notice that the variable name can indicate only so much information in 8 characters. But the label of it and the values tell us what we need. And the fact that the label tells us where to look the variable up in the codebook is further protection.

label var[natlecr] lab[v3531: national economy - retro] val[1 gotbet 3 same 5 gotworse]

Some people using NES data, or any data produced by someone else accompanied by a codebook, follow the convention of naming the variable by its codebook number (i.e., V3531), and using labels for substantive meaning. I think this is a poor practice. Consider which of the following statements is easier to read:

logit dep[preschc] ind[one educ women partyid]

or:

logit dep[V5609] ind[one V3908 V4201 V3634]

The codebook name for the variable should definitely be retained; but it can be retained in the label statement. Without the codebook name one would not know which of the several party-identification variables the variable partyid refers to.
Writing Cleanly
Anything that reduces ambiguity is good. Parentheses do so, and so parentheses are good. A reader should not need to remember the precedence of operators. But in most cases parentheses are more valuable as visual cues to grouping expressions than to actual precedence of operators. Almost as useful as parentheses is white space. Proper spacing in your code can make it much easier to read. This means both vertical space between sections of a program that do different tasks, and indenting to make things easier to follow.

Rule: Use appropriate white space in your programs, and do so in a consistent fashion to make them easy to read.
Comments
There is probably nothing more important than having adequate comments in your code. Basically one should have comments before any distinct task that the code is about to perform. Beyond this, one can have a comment to describe what a single line does. The basic rule of thumb is this: is the line of code absolutely, positively self-explanatory to someone other than yourself without the comment? If there is any ambiguity, go ahead and put the comment in.

Remember though, the comments should add to the clarity of the code. Don't put a comment before each line repeating the content of the line. Put comments in before specific blocks of code. Only add a comment for a line where the individual line might not be clear. And remember, if the individual line is not clear without a comment, maybe you should rewrite it.

Rule: Include comments before each block of code describing the purpose of the code.

Rule: Include comments for any line of code if the meaning of the line will not be unambiguous to someone other than yourself.

Rule: Rewrite any code that is not clear.
Following is a case where a single comment lets us know what is going on. Most programmers think that well-written code should be self-documenting. This is partly true. But no matter how well written your code is, some comments can make it much clearer.

rem ***********************************************************
rem  Create party-id dummy variables
rem  Missing values are handled correctly here by SST.
rem  In other statistics packages these three variables might
rem  have to be initialized as missing first.
rem ***********************************************************
set dem = (pid < 3)
set ind = (pid == 3)
set rep = (pid > 3)

Recodes and Creating New Variables

Probably the most important thing to keep track of, both when recoding variables and creating new variables, is missing data. There is no general rule that can specify exactly how to do this, because treatment of missing data can vary across statistics packages. Thus the best rule is:

Rule: Verify that missing data are handled correctly on any recode or creation of a new variable.

In some statistics packages you may be best served by initializing all new variables as missing data, and allowing them to become legitimate values only when they are assigned a legitimate value. The best advice is to recode and create new variables defensively.

Rule: After creating each new variable or recoding any variable, produce frequencies or descriptive statistics of the new variable and examine them to be sure that you achieved what you intended.

Generally it is poor style to hard-wire values into your code. Any specific values are likely to change when some related piece of code somewhere else is altered or when the data set changes.

Rule: When possible, automate things and avoid placing hard-wired values (those computed "by hand") in code.

Finally, after you have done recodes and created new variables it is a good idea to list all the variables. This way, you can confirm that you and your statistics package agree on what data are available and how many observations are available for each variable. In SST this would be done with a list command; in SAS, PROC CONTENTS will produce a clean list of variables. Most statistics packages offer similar commands.
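The missing-data rules above can be sketched in Python (the party-id cutpoints mirror the SST example earlier in this section; the 0-6 scale, the variable names, and the code 99 for missing are assumptions for illustration): new variables start out missing, become legitimate only when assigned a legitimate value, and are then checked with frequencies.

```python
from collections import Counter

MISSING = None  # initialize new variables as missing, per the rule

def recode_party(pid):
    """Collapse a 0-6 party-id scale into (dem, ind, rep) dummies.
    Values outside 0-6 stay missing instead of becoming a bogus category."""
    if pid is None or not 0 <= pid <= 6:
        return MISSING, MISSING, MISSING
    return int(pid < 3), int(pid == 3), int(pid > 3)

# After recoding, produce frequencies and examine them (rule 16).
pids = [0, 1, 3, 5, 6, None, 99]
dem_freq = Counter(recode_party(p)[0] for p in pids)
```

Inspecting `dem_freq` shows three legitimate 0s, two 1s, and two missing values, confirming that the missing codes did not leak into the dummies.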
Procedures or Macros
Most political scientists do not ever have to write a macro or procedure, but maybe that's why they do so little secondary analysis once they generate some estimates. The purpose of a well-defined procedure is to automate a particular sequence of steps. Procedures and macros are useful both in making your code more readable and in allowing you to perform the same operation multiple times on different values or on different variables. The use of procedures and macros is a topic for a separate article, but political scientists should realize that the tools are available in most statistics packages.
Summing Up
The rules presented here represent one way of accomplishing the goal you should have in mind. That goal is to write clear code that will function reliably and that can be read and understood by you and others, and can serve as a road map for replicating and extending your research.

Most people are in a huge hurry when they write their code. Either they are excited about getting the results and want them as fast as possible, or they figure the code will be run once and then thrown out. If your program is not worth documenting, it probably is not worth running. The time you save by writing clean code and commenting it carefully may be your own.
Rules

1. Maintain a lab book from the beginning of a project to the end.
2. Code each variable so that it corresponds as closely as possible to a verbal description of the substantive hypothesis the variable will be used to test.
3. Correct errors in code where they occur, and rerun the code.
4. Separate tasks related to data manipulation vs. data analysis into separate files.
5. Design each program to perform only one task.
6. Do not try to be as clever as possible when coding. Try to write code that is as simple as possible.
7. Set up each section of a program to perform only one task.
8. Use a consistent style regarding lower- and upper-case letters.
9. Use variable names that have substantive meaning.
10. Use variable names that indicate direction where possible.
11. Use appropriate white space in your programs, and do so in a consistent fashion to make the programs easy to read.
12. Include comments before each block of code describing the purpose of the code.
13. Include comments for any line of code if the meaning of the line will not be unambiguous to someone other than yourself.
14. Rewrite any code that is not clear.
15. Verify that missing data are handled correctly on any recode or creation of a new variable.
16. After creating each new variable or recoding any variable, produce frequencies or descriptive statistics of the new variable and examine them to be sure that you achieved what you intended.
17. When possible, automate thingsand avoid placing hard-wiredvalues (those computed "byhand") in code.
Note
1. A version of this article appeared in The Political Methodologist, vol. 6, no. 2, spring 1995, 2-8.

I thank Charles Franklin, Bob Hanneman, Gary King, and Burt Kritzer for useful comments and suggestions.
References
Dubin, Jeffrey, and R. Douglas Rivers. 1992. Statistical Software Tools Users Guide: Volume 2.0. Pasadena, CA: Dubin/Rivers Research.

Kernighan, Brian W., and P. J. Plauger. 1978. The Elements of Programming Style, 2nd edition. New York: McGraw-Hill.
About the Author
Jonathan Nagler is associate professor at the University of California, Riverside. His research interests include qualitative analysis, campaigns and elections. He can be reached at [email protected].
Response: Potential Research Policies for Political Science
Paul S. Herrnson, University of Maryland, College Park
As evidenced by the thoughtful symposium participants, political scientists have a variety of opinions about verification, replication, and data archiving. I continue to have strong reservations regarding the proposed replication, verification, data relinquishment rules debated here. The symposium, however, has convinced me of one important thing: a still broader discussion is needed before editors or organizations change the norms governing research or publication. Walter Stone's (1995) proposal to survey political scientists to obtain their opinions on verification/replication is, therefore, a good one.

The survey should ask political scientists to consider a variety of policies applicable to quantitative studies, including the following:

1. A "True" Replication Policy
Journals would reserve a certain amount of space for studies that replicate a piece of research in its entirety, including data collection and analysis. Authors who seek to publish studies that are based on original data would be required to provide more details on their data collection process than are included in articles currently published in most political science journals, making it easier for others to replicate the original research. Neulip (1991) is an example of a publication based entirely on replication.

2. A Verification Policy
Journals would require scholars to include with their manuscript submissions the printout from which their statistical results were generated. The printout would include variable distributions, scatterplots, and other