Read - Statistical Methods and Reasoning in Archaeological Research



Statistical Methods and Reasoning in Archaeological Research: A Review of Praxis and Promise

DWIGHT W. READ
Department of Anthropology, UCLA, Los Angeles, CA 90024, U.S.A.

    INTRODUCTION

In 1959 the symposium, The Application of Quantitative Methods in Archaeology, was held at Burg Wartenstein. Despite its title, only a single paper, by Albert Spaulding (1960), discussed statistical methods. Spaulding's paper, "Statistical Description and Comparison of Artifact Assemblages," set forth the thesis that archaeological analysis can be extended effectively through reasoning based on statistical methods. What statistics offered, Spaulding suggested, was not merely a collection of methods applicable to archaeological data, but a way to represent and express more exactly the ideas of archaeology: hence to make these ideas amenable to more precise reasoning, much as he had accomplished with his oft-repeated definition of a type as the non-random association of attributes (Spaulding 1953).

Statistics as a means to reason about quantitatively expressed information will be the organizing theme of this review. Statistics in this sense contrasts with statistics viewed as methods to be applied to data. The former, as Spaulding indicated, expands upon, and can become part of, archaeological thinking; the latter primarily provides additional information to be used in archaeological arguments. But statistics taken as providing a means to represent archaeological ideas has not, as will be seen, been the dominant perception of what constitutes statistical applications in archaeology. Instead, statistics as a means to represent and reason about ideas and concepts has played a role whose potential is yet to be realized fully.

Though Spaulding's was the only conference paper that made use of statistical concepts, substantive application of statistical methods to archaeology had begun much earlier. In 1950 the book, Some Applications of Statistics to Archaeology, by Oliver Myers was published in Cairo. Myers' work was evidently unknown to Spaulding, for he commented (1960: 82) that he was unaware of any application of regression methods in archaeology, yet Myers made extensive use of regression and correlation analysis in an analysis of the surface materials from a series of Egyptian sites.
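Spaulding's definition of a type, noted above, translates directly into a familiar test of independence. A minimal sketch of the idea follows; the cross-tabulated counts are invented for illustration, not Spaulding's data:

```python
# Hypothetical cross-tabulation of two sherd attributes:
# rows = surface treatment (plain, corded), columns = temper (shell, grit).
import numpy as np
from scipy.stats import chi2_contingency

counts = np.array([[46, 12],
                   [ 9, 33]])

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
# A small p-value rejects independence: the attributes co-occur
# non-randomly, which is Spaulding's criterion for proposing a type.
```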


Whereas Spaulding emphasized the idea of developing archaeological ideas with statistical concepts, Myers' work was a forerunner (though apparently without descendant) of what today constitutes the bulk of quantitative and statistical applications in archaeology. Myers was concerned with using statistical methods to confirm or disconfirm several hypotheses about the covariance of pottery sherds and flint in surface-collected sites as a way to determine the chronological or cultural affinity of the sites. Myers commented that "if statistics be fully applied to archaeology, it will be possible not only to draw new knowledge from the clues we now use, but to obtain information from sources which have never been considered as evidence" (p. v). Though not a statistician - he admits to his lack of mathematical background - he hoped that "this admission will encourage other unmathematically minded archaeologists to make use of such [statistical] methods" (p. v).

The methods he used comprise most of the techniques introduced only much later into American archaeological research through the emphasis on quantitative methods which became associated with the "new archaeology."

    ology." These included computation of correlation coefficients (correlationbetween flint and sherd frequencies), curviIinear regression analysis(relationship between sherd thickness and maximum dimension), analysisof variance and covariance (significance of the regression model forthickness and maximum dimension; independence of the regression modelwith hardness of the sherd measured on the Mohs' scale as a proxymeasure for likelihood of destruction through time), density contours(analysis of the spatial distribution for each type of sherd and flint), aprobability model (to determine whether or not two types of sherds couldhave had essentially identical spatial distributions by chance through twoseparate occupations), simulation (to determine the optimal grid squaresize for recording flint and sherd frequencies per grid square), anddeductive reasoning (expected pattern of correlation amongst the severaltypes of sherds and flints according to the hypothesized sequence ofsettlements for the area).The book is remarkable not only for its early insights into the mannerin which statistical methods can address archaeological problems, but alsobecause it had no impact on the field despite careful work that currentlycould serve as a textbook example of statistical applications.Curiously, a paper that did have the effect of galvanizing archaeologists

Curiously, a paper that did have the effect of galvanizing archaeologists into awareness of the enormous potential of statistical methods is also a paper that, if anything, could serve as an example of incorrectly applied statistical methods. The paper, "A Preliminary Analysis of Functional Variability in the Mousterian of Levallois Facies," by S. Binford and L. Binford (1966), used factor analysis in an attempt to determine an underlying, hypothesized process accounting for the observed frequency distribution of artifacts in sites. On the one hand, the authors argued from the archaeological side about how assemblages might be structured through the way in which tools are organized as sets of tools (the "tool-kit") in the performance of tasks. On the other hand, they suggested that one could, apparently, use statistical methods to recover the posited organization from frequency counts of artifacts on sites. With this paper, possibility seemed to become reality; no longer was the archaeologist limited to making subjective inferences about past processes, but could, it seemed, objectively determine that reality from the patterning found in data by the archaeologist.
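In outline, the Binford and Binford approach amounts to factoring an assemblages-by-tool-types frequency matrix and reading high loadings on a common factor as a candidate tool-kit. A minimal sketch, with simulated counts standing in for the Mousterian data (the two "kits" and all numbers below are invented):

```python
# Factor analysis of an assemblages-by-tool-types matrix; tools loading
# on the same factor are read as traveling together (a "tool-kit").
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
kit_a = np.array([5, 4, 3, 0, 0, 0])     # hypothetical "butchery" kit
kit_b = np.array([0, 0, 1, 4, 5, 3])     # hypothetical "hide-working" kit
use = rng.poisson(2, (40, 2))            # intensity of each activity per site
X = use @ np.vstack([kit_a, kit_b]) + rng.poisson(1, (40, 6))

fa = FactorAnalysis(n_components=2, random_state=0).fit(X)
print(np.round(fa.components_, 1))       # loadings: rows = factors
# Whether such factors reflect real past behavior, rather than artifacts
# of sample size and method, is exactly what Cowgill later questioned.
```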


Spaulding's paper had given American archaeologists the intellectual justification for using quantitative methods as a basic part of archaeological reasoning; Binford and Binford's paper seemed to show that the potential of what could be achieved with quantitative methods was limited only by the creativeness of the archaeologist. A similar theme was also established in England through the publication in 1968 of David Clarke's magnum opus, Analytical Archaeology.

Clarke went beyond the arguments of Spaulding, and Binford and Binford, and laid out an entire program for what he called "analytical archaeology." It had the goal of "the continuous elucidation of the relationships which permeate archaeological data by means of disciplined procedures directed towards the precipitation of a body of general theory" (pp. 662-663). Unlike the "new archaeology," which emphasized archaeology as a science, Clarke argued that "analytical archaeology is not a science but it is a discipline, its primary machinery is mathematical rather than scientific" (p. 663). By this Clarke meant that archaeology is based on probabilistic, not certain, regularities, and only the latter are the hallmark of a science. This is not a drawback, but a difference in focus: "The relationships which analytical archaeology may hope to elucidate are . . . [the] domains of archaeological syntactics, pragmatics and semantics" (p. 663). These relate to "cultural systematics, cultural ethnology, and cultural ecology," respectively, and "the relationships within each domain form independent grammars . . . [that] may eventually be expressed in a symbolic and axiomatic calculus" (p. 663) - hence the linkage of archaeology with mathematics. Finally, Clarke observes: "it looks as though the 'observer language' or meta-language of archaeological syntactics will be the first grammar to uncover a calculus of symbolic expression over the next few decades" (p. 663). The beginnings of such a calculus "are very largely the result of the developing contributions of quantification, statistical procedure, and the impact of the computer" (p. 651).

Clarke's argument carries the idea underlying Spaulding's use of statistical ideas as part of archaeological reasoning to its logical conclusion.


If the proper role of statistics, and by extension, mathematics, is one of reasoning and not just of providing techniques, then it follows that there should be formal/mathematical representation of the structuring ideas and relationships upon which archaeological reasoning and theory is based. The form of such a representation would be a calculus through which the logic of those ideas and relationships could be developed. Though this is a theme that has been taken up by others as well - e.g., Gibbon (1984: 352-83) discusses the need for formalized (axiomatic) theory, and Read and Leblanc (1978) give an example in a restricted domain of what an axiomatically expressed theory might look like - Clarke alone has set forth a grandiose, programmatic statement.

In these four works by Spaulding, Myers, the Binfords and Clarke, we find the two main threads that will be used here to categorize topics that have been taken up extensively in the last several decades by researchers trying to derive more refined information from archaeological data through quantitative methods. The two threads are: (1) reasoning about, and representation of, archaeological concepts using the conceptual framework provided by statistics and mathematics (e.g., Clarke 1968, Spaulding 1960) and (2) statistics as a series of methods to be used in the analysis of archaeological data (e.g., Myers 1950, Binford and Binford 1966) for specific goals.

This review will also be divided into two parts. The first part will consider publications from four international conferences and one national conference on quantitative methods in archaeology. The focus will be on developing ideas about the role of quantitative methods as a reasoning system about archaeological concepts. The second part will review publications that make substantive use of quantitative methods and quantitatively based reasoning. Here the discussion will focus on the way statistical concepts translate into archaeological methods as a means to address archaeological issues. The distinction is not exact in that the former ultimately has to be expressed in terms of methods and the latter, implicitly if not explicitly, depends on the researcher's persuasion about the proper role of quantitative reasoning in archaeological reasoning.

Selection of material to be reviewed is generally restricted to publications appearing after 1975. The date is arbitrary; it is used to keep the review reasonably bounded. To the degree it was feasible, non-American publications were included, but coverage is uneven. This journal will hopefully promote increased communication among all archaeologists interested in quantification and mathematical modeling of archaeological problems.

Papers that simply used numbers, or applied methods mainly as an adjunct to other arguments, were not included. The latter were excluded because they would require a review focused more on the efficacy of the application of quantitative methods in archaeological research, such as the ones provided by Thomas (1978), Clark (1982) and Scheps (1982), and less on quantitative methods as a means to reason about archaeological ideas.

On the other side of the continuum from applied to theoretical, papers whose emphasis was mathematical rather than statistical were also not included, as these are the subject of a forthcoming review (Read 1987a).

PART I. CONFERENCES AND SYMPOSIA ON STATISTICAL AND QUANTITATIVE METHODS

Subsequent to the Burg Wartenstein Conference a number of international and national conferences and symposia devoted to the application of quantitative, statistical and mathematical methods in archaeology have been held. Publications from five of these conferences will be reviewed to illustrate changing perceptions of the role of quantitative and statistical methods in archaeology. The conferences, identified by location and the title of the conference proceedings, are: (1) Mathematics in the Archaeological and Historical Sciences, Mamaia (Hodson, Kendall, and Tautu, eds. 1971); (2) Archéologie et Calculateurs: Problèmes Sémiologiques, Marseille (Gardin, ed. 1970); (3) Raisonnement et Méthodes Mathématiques en Archéologie, Paris (Borillo, Fernandez de la Vega, and Guenoche, eds. 1977); (4) To Pattern the Past, Amsterdam (Voorrips and Loving, eds. 1985); and (5) Quantitative Research in Archaeology: Progress and Prospects, Portland (Aldenderfer, ed. 1987).

MATHEMATICS IN THE ARCHAEOLOGICAL AND HISTORICAL SCIENCES

Whereas the conference held at Burg Wartenstein had but a single paper that dealt with statistical methods, the Mamaia conference was structured to bring together archaeologists/historians and statisticians/mathematicians viewed as representing two separate disciplines. Participants were instructed to address their papers to the practitioners of the other discipline (Moberg 1971). In effect, the conference took the view that statistical techniques should be introduced into archaeology through problem identification by the archaeologist linked to methods provided by the statistician (see also Thomas 1975).

Spaulding, in his "Introductory Address" to the conference, made explicit his more implicitly stated theme from the Burg Wartenstein symposium when he observed that "these [statistical] techniques merely make explicit and extend the implicit mathematical reasoning that all archaeologists use" (Spaulding 1971: 15; emphasis in the original). Spaulding's claim recalls Clarke's (1968) view that analytical archaeology will need to develop a symbolic and axiomatic calculus to express the fundamental ideas and reasoning of the discipline.


The validity of the first part of Spaulding's assertion, namely that statistical techniques can serve to explicitly express otherwise implicit arguments, is exemplified by the conference papers that addressed artifact grouping and seriation.

Thus, Hodson (1971) argued that the archaeological interest in grouping handaxes for analytical purposes could be realized through the methods of numerical taxonomy. Similarly, Rowlett and Pollnac (1971) suggested that plotting factor loadings is a way to spatially group cemeteries from the Marne Culture in order to better understand those data. Both were trying to connect the archaeologist's analytical goals with what were thought to be appropriate quantitative techniques. Conversely, the statistician Rao (1971) discussed his work on decomposing a vector X of p standardized measurements, with correlation matrix R, into a size, s, and shape, h, component via

s = 1′R⁻¹X and h = [. . .]
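The transcript breaks off before the definition of h, so any completion is conjectural. The sketch below assumes a standard size-and-shape decomposition in which size is the generalized (GLS) mean s = (1′R⁻¹X)/(1′R⁻¹1) and shape is the residual h = X − s·1; both the normalization and the residual form are assumptions on my part, not Rao's text:

```python
# Size-and-shape split of a vector X of p standardized measurements.
# ASSUMED completion: s = (1'R^-1 X)/(1'R^-1 1), h = X - s*1; the source
# page is cut off, so this is a plausible reconstruction only.
import numpy as np

R = np.array([[1.0, 0.6, 0.4],
              [0.6, 1.0, 0.5],
              [0.4, 0.5, 1.0]])          # correlation matrix (hypothetical)
X = np.array([1.2, 0.7, -0.3])           # standardized measurements
ones = np.ones(3)

w = np.linalg.solve(R, ones)             # R^-1 1
s = w @ X / (w @ ones)                   # generalized (GLS) mean = "size"
h = X - s * ones                         # deviations from size = "shape"
print(round(s, 3), np.round(h, 3))
```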


[. . .] in question. This contrasts with the more common, implicit argument by analogy that mathematical rigor will entail archaeological rigor (Read 1985; see also Borillo 1977: 220).

Another conference on the role of mathematics in archaeology was held about the same time in Marseille under the auspices of the Colloque International du Centre National de la Recherche Scientifique. The conference was narrower in scope as it primarily addressed classification, but broader in approach as it focused on conceptualization of the problem of classification and not just on mathematical/statistical techniques. The papers in this conference highlight the philosophical split between realists who advocate methods aimed at the empirical level of patterns displayed in data - e.g., pattern recognition (Bordaz and Bordaz 1970), cluster analysis and proximity analysis (Doran 1970), scalograms (Elisseeff 1970) and multidimensional scaling (Lingoes 1970) - versus rationalists who attempt to derive a theory of data structures - e.g., papers on automatic classification by Régnier (1970), Lerman (1970) and de la Vega (1970). Automatic classification is not viewed in the more limited sense of numerical taxonomy, but as theory guided by the formal properties of a classification:

Sous le nom de h-classifiabilité nous désignerons l'aptitude d'une population E d'objets à être organisée en une hiérarchie de classifications . . . [Under the name of h-classifiability we designate the reasonableness for a population E of objects to be organized into a hierarchical classification . . .] (Lerman 1970: 319, translation added).

Turned around, it also becomes a theory explaining when it is not possible to construct a classification. In Lerman's paper, arguments about limitations are theoretical. More often, discussion about limitations reverts to empirical observation. Thus, Sparck-Jones (1970) noted that the lack of "global classification theory" means it cannot be assumed that "a grouping algorithm which works well in one case will work well in another" - a theme echoed many times by other authors: e.g., Blashfield and Aldenderfer (1977), Read and Christenson (1978), Aldenderfer and Blashfield (1984), among others. Sparck-Jones concludes that it will be necessary to "construct alternative classifications, and to apply any plausible means of selecting the best of these, post hoc . . ." (pp. 258-259). His logic is sound, but the conclusion is disturbing for those who would use a strictly empiricist approach, as it implies a kind of quantitative nihilism: try anything and everything until results are obtained that can be rationalized as informative.

Discussion of another kind of limitation was provided by Cowgill

(1970), who drew attention to the discrepancy between what archaeologists can sample and what they would like to sample (1970: 162), hence to the problem of linking the interpretation of analytical results based on what is sampled to the underlying domain of interest. Cowgill distinguished three situations: (1) a population made up of events "involving behavior of members of a specific community . . ." (p. 162), (2) a population of physical consequences made up of the "physical consequences of the events constituting elements of the first kind of population . . ." (p. 162) and (3) a physical finds population made up of "all those physical consequences of human behavior which are still present and detectable . . ." (p. 163). Cowgill made these distinctions to highlight the need for "clearer thinking about the intervening stages between statistical results and behavioral implications" (p. 164), a topic later taken up and popularized by the work of Schiffer (1976) on site formation processes.

Whereas Lerman was concerned with a topic that is properly in the province of the mathematical statistician, Cowgill raised questions that are properly in the province of the archaeologist, yet require integration of thinking from both sides for resolution. Cowgill correctly warned that statistical applications run the risk of being sterile exercises if the steps leading from data to analysis to interpretation cannot be adequately justified. Cowgill mildly chided those who have used factor analysis on small data sets (e.g., Binford and Binford 1966) for this reason, noting that "the source of error I have just illustrated is, by itself, enough to make us doubt whether such results can be trusted to tell us very much about any larger populations which the 16 entities may represent" (p. 165).

However, the problem of linking technique to data to interpretation is only hinted at by Cowgill's comments about sampling. At base, there is a much deeper issue of what constitutes the theoretical/symbolic terms used in a theory, on the one hand, and what constitutes the empirical/descriptive terms used to discuss the data to which the theory will be applied, on the other hand. The need for this distinction can be seen in the confounding of grouping and classification as if these are synonymous terms (e.g., Sparck-Jones 1970, Doran and Hodson 1975, among others); a confounding that underlies attempts to provide automatic classification - as opposed to automatic grouping - techniques and theory.

The rationale for the grouping/classification distinction as it applies to archaeology has been given in detail by Dunnell (1971). Dunnell uses a distinction between grouping as having reference to the phenomenological domain and classification as having reference to the ideational domain. That is, grouping involves forming sets of concrete objects; classification deals with symbolic representation of those sets. Automatic procedures do not produce symbols, only sets of objects; the archaeologist may infer regularity among the grouped objects and impose symbolic identification


of that regularity, but this cannot be automatic. Grouping is at the level of empirical phenomena; symbolic representation is an abstraction based on empirical phenomena. The symbolic/empirical dichotomy was discussed at some length by Borillo (1977) in a paper given in the Paris workshop on mathematical methods in archaeology.

In a workshop that took place in 1977 in Paris, Borillo (1977) presented an extensive discussion of the logic and rationale of using formal, mathematical methods in archaeology, and by extension in anthropology. Borillo noted that mathematics operates with symbols, hence with representations of empirical phenomena, not with the phenomena directly. He observed that there are two translations involved when mathematical formalism is used: one going from objects to their "représentation régulière," i.e. an unambiguous, well-defined representation linking object and symbol (p. 2); and the other a translation that gives the symbolic system interpretation through terms relevant to the archaeological domain, with the latter based on a research strategy and a coherent, theoretical framework (p. 3). Mathematics, he argued, can be used to examine hypotheses about how a set E of data and a set R of measures for those data are structured through a formal representation and a calculus which bears upon the hypothesized structure for the sets E and R. An example would be Binford and Binford's (1966) use of factor analysis as a way to recover groupings that allegedly could be interpreted as tool kits.

The linkage between formal methods and analytical results of archaeological interest can, he suggested, best be seen in the problem of automatic classification (the formal calculus) (p. 15). Formal results (e.g., separation of data into different groups by means of an algorithm) must be compared with independent data (p. 16) for their validation (e.g., use of discriminant analysis to show the distinctiveness of the groups). But circularity could enter in, as the validation techniques are themselves part of the same calculus. Borillo escapes the circularity through also requiring congruence with the opinions of archaeological experts. The analysis must be reexamined if there is disagreement with experts; however, if there can be no agreement, then the choice of a set E of objects, a priori classes, etc. are thrown into question (p. 21). Finally, there must be corroboration in the form of agreement with the logic of the framework in which the

problem is posed, and agreement between the interpretation made of the distinctive traits found through the calculus and other, relevant information (p. 23). But if the expertise of archaeologists who use non-mathematical, non-formal methods is to be the arbiter of results arrived at through formal

calculi, then one might well ask what is to be gained through using a formal calculus. Borillo gives several answers, two of which will be mentioned here. First, it becomes possible to formulate more powerful theories. When regularities or "laws" seen at the empirical level are recast as relations in a symbolic calculus, one goes from the more particular to the more general, from the more empirical to the more abstract, and in so doing to a level where these "laws" "s'organisent selon des structures plus générales, à des niveaux d'abstraction plus élevés [are organized through more general structures, at higher levels of abstraction]" (p. 26, translation added). Second, phenomena that appear empirically distinct, and often seen as pertaining to different disciplines, may become grouped in a revealing and significant way through formal representation (pp. 26-27).

For Borillo, then, formal methods are a means to arrive at a higher, more abstract level wherein the properties giving structure to phenomena at the more concrete level can be expressed and examined. Archaeological data can be embedded into this symbolic level through an appropriate representation, and if there is embedding into the symbolic level, there must also be the reverse. Unlike Clarke, Borillo does not suggest that archaeologists should develop an appropriate symbolic calculus to express and develop the theory appropriate to archaeology. Rather, Borillo seems to suggest that there must be simultaneous development of two systems: the theoretical system of the archaeologist through which structures for the data are hypothesized and the formal calculus through which these are analytically examined.

This leaves Borillo placing a curious emphasis on the archaeological expert, an emphasis which the 'new archaeologists' have rejected out of hand. Expertise, in the framework of the 'new archaeology,' has no special epistemological role; rather, it is even to be abhorred in that validation is to come out of confirmation and verification separated from expertise, per se. It may take an expert to come up with significant ideas; but the 'new archaeologists' have argued that ideas have validation only through confirmation via appropriate testing against empirical observation.

For 'new archaeologists' heavily steeped in an empirical, logical positivist approach, Borillo's argument is likely to seem almost irrelevant. Yet profound insights are often made by persons whose methods may seem cavalier and antithetical to "sound" scientific methodology. One has only to consider the insights and impact of persons such as Lévi-Strauss and Piaget.

Borillo correctly draws attention to validation as being more extensive than merely confirmation through empirical observation. Good theories also provide a coherent logic for the interpretation of phenomena. Doran (1977) took up a similar theme when he emphasized the need to explore the knowledge of the archaeologist: "more powerful techniques will be


obtained not by improving the quality of the mathematics, but by finding ways of using knowledge rigorously" (p. 180). But whereas Borillo used the archaeologist's knowledge for validation, Doran suggested, following the work done on the expert system DENDRAL (see Doran 1977 for references), that archaeologists should be concerned with understanding the origin (in the sense of what reasoning led to their formulation) of hypotheses and not just their validation.

Thus, what constitutes the relationship between archaeology and mathematical/statistical methods is not simple, and different persons have had varying views. Moberg (1977) gives a different viewpoint - and one which seems to characterize many archaeologists' conceptualization of this relationship - wherein mathematics (read: statistics) provides techniques that can be applied by archaeologists. Mathematics is needed, he suggested, not because one is going from concrete observation to symbolic representation but because the empirical context becomes too complex when the amount of data to be analyzed passes a certain threshold (p. 194). Mathematics, in his view, provides a series of techniques, and the problem the archaeologist faces is which technique should be used: "Les archéologues ont besoin d'aide pour choisir les méthodes adéquates [The archaeologists need help in choosing the right methods]" (p. 194, translation added). However, if there are "méthodes adéquates," then there would be no need for validation of the results as Borillo argues. Yet as Borillo (1977: 14) noted, all methods assume a certain structure, and without adequate and accepted arguments linking empirical phenomena to formal models (which do not yet exist), there cannot be a "right" method.

Whallon (1977) examined this linkage by arguing that however much the polythetic model for grouping data may be useful for statistically designed automatic "classification" procedures, it may not be valid for archaeological typologies (p. 216). Whallon (1977) discussed his earlier paper (1972) aimed at replicating an "intuitive" typology for Oswego pottery using a monothetic, hierarchical procedure. While no statistical methods more complex than a chi-square test were used, his analysis is in keeping with Spaulding's comment that statistical analysis of archaeological data is an extension of archaeological reasoning.

Whallon's paper bridges the two modes of reasoning - embedding archaeological concepts into an appropriate calculus to examine their logical consequences versus viewing mathematics/statistics as providing

methods to be applied by the archaeologist - represented by Borillo's and Moberg's discussions. Whallon clarifies the issue from the archaeologist's viewpoint, hence lays the foundation for how one might begin to build a symbolic calculus (in Clarke's sense) appropriate to archaeological reasoning. The outline of a formalized, symbolic system for expressing the logic of grouping procedures as they relate to a classification was independently discussed in a paper by Read (1974).

Other authors in the workshop reviewed the "symbolic calculus" side of Borillo's discussion; that is, they focused on the statistical analysis of structure: Ihm (1977a) reviewed uni- and multivariate statistics; Sibson (1977) discussed the theory behind multidimensional scaling; Lerman (1977) developed a new rank-order correlation coefficient which avoids bias in Kendall's τ; and Guenoche and Ihm (1977a) developed estimators that may be used in principal component and discriminant analysis for situations where data are missing.

The remaining papers for the workshop considered several aspects of seriation. Ihm's paper (1977b) is particularly interesting, for it formally defines what constitutes a solution for seriation. Given an incidence or abundance matrix X whose columns represent units (sites, graves, etc.) and rows the type of object being counted (presence/absence for an incidence matrix, frequency for an abundance matrix), then one needs to find a mapping T such that for the kth column vector, x_k, of measurements for a unit (i.e., the vector whose components are the values of the measures for the kth unit) T associates with x_k a real number:

T : x_k → t_k ∈ ℝ,

with t_k proportional to the date of origin for unit k. Ihm shows how such a mapping T may be defined if there is a distance measure, d, such that for the ith and jth units, with column vectors of measurements x_i and x_j respectively, d(x_i, x_j) is a monotone increasing function of |t_i − t_j|.

Hence the problem is to find such a distance measure. Ihm correctly points out that there is no general solution, only a solution if one has specified a "modèle mathématique de la variation de fréquence des différents types avec le temps [mathematical model for the change in frequency of types through time]" (p. 140, translation added). For an incidence matrix where types appear and disappear in a regular manner through time, Ihm noted that it suffices to use the L1 (city block) norm, i.e., d(x_i, x_j) = ‖x_i − x_j‖₁ = Σ_m |x_mi − x_mj|. For abundance matrices

where the frequency of a type follows a Gaussian distribution through time (the classical "battleship" curve), Ihm showed that the typical "horseshoe" shape that results from using multidimensional scaling techniques as discussed by Kendall (1971) can be linearized by taking a logarithmic transformation of the vector dot product of the ith and jth column vectors (which represent the ith and jth units) in the matrix X.
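The logic of this linearization can be sketched numerically. For Gaussian battleship curves of common width, −log of the dot product of two units' abundance vectors behaves (up to additive terms) like a squared time difference, so classical metric scaling of these values places the units on a line in their time order. A minimal sketch on simulated data (all parameters invented):

```python
# Ihm's linearization in outline: -log dot products of abundance vectors
# approximate squared time differences, so classical (metric) MDS on them
# recovers the time order of the units. Simulated "battleship" data.
import numpy as np

rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0, 10, 15))          # true (unknown) unit dates
peaks = np.linspace(0, 10, 30)               # peak date of each type
X = np.exp(-(t[:, None] - peaks[None, :])**2 / (2 * 2.0**2))  # abundances

D2 = -np.log(X @ X.T)                        # ~ const + b*(t_i - t_j)^2
J = np.eye(15) - np.ones((15, 15)) / 15
B = -0.5 * J @ D2 @ J                        # double centering (classical MDS)
vals, vecs = np.linalg.eigh(B)
score = vecs[:, -1]                          # leading coordinate
print(np.argsort(score))                     # 0..14 in order, or its reverse
```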

Ihm's (1977b) paper represents well Borillo's discussion of the double translation involved when using formal methods. First, there is a translation to a symbolic representation (the incidence or abundance matrix as the representation of an archaeological unit). Analysis takes place within the context of a symbolic calculus (definition of a distance measure, d;


deriving a mapping T from d that maps the symbolic representation, x_i, of units to an ordinal time scale, t_i, based on a model relating time to the presence/absence or frequency of types within units). Second, there is a translation back from the symbolic domain to the domain of archaeology (interpretation of the ordinal scale as a relative time scale for the archaeological units through verification of the model for change in presence/absence or abundance through time). Whereas the other authors (e.g., Régnier (1977), de la Vega (1977)) dealing with seriation emphasized more the algorithmic or procedural side of a seriation solution, Ihm effectively drew attention to the fact that no solution is possible in the symbolic domain without an a priori model; such a model can only be justified through observation and theory about the empirical domain.

TO PATTERN THE PAST

In 1981, at the tenth congress of the Union Internationale des Sciences Préhistoriques et Protohistoriques (UISPP) in Mexico City, what had been the Data Banks Commission expanded into Commission 4 on "Data Management and Mathematical Methods in Archaeology." Under its new aegis, Commission 4 sponsored an international symposium in Amsterdam in 1984, leading to the volume, To Pattern the Past. The volume contains 23 papers divided into four sections: (1) Documentation and Presentation, (2) Classification and Seriation, (3) Pattern Searching and Pattern Verification, and (4) Modelling and Simulation. While the sections generally represent the topics considered in the previous conferences, there are certain notable differences.

First, extensive work had now been done on the mathematical basis for seriation methods, particularly by Ihm (e.g., Ihm and van Groenewoud 1984), based on correspondence analysis. Seriation using correspondence analysis figures in three of the papers: those by Djindjian (1985), Slachmuylder (1985) and Scollar et al. (1985). In the first of these three papers, Djindjian (1985) pointed out that what had heretofore been discussed as different techniques (the Brainerd-Robinson seriation method (Robinson 1951), the Petrie matrix analysis method (Petrie 1899) and the multidimensional scaling method (Kendall 1971)) were in fact identical and could all be subsumed under correspondence analysis. Djindjian noted seriation based on correspondence analysis is an advancement in that it

can display the interfering effects of structures other than the supposed serial relation of the units being analyzed, hence the effects of these other structures can be removed. While other seriation methods mainly examine units separated from their spatial context, Djindjian suggested using toposeriation, which combines topochronology (Werner 1953), that is, "the chronological development of a site as expressed in space" (p. 126), with seriation.

In the second paper Slachmuylder (1985) applied correspondence analysis to seriate a series of Belgian sites using an extensive typology (121 types) developed by Rozoy (1978) for the Mesolithic of France and Belgium. The results agree reasonably well with ¹⁴C dates for the sites. The third paper, by Scollar, Weidner, and Herzog (1985), discussed a series of programs based on Ihm's algorithm for seriation using correspondence analysis and written in Pascal for a mainframe computer. Important here is that the suite of programs has been announced as available for world-wide distribution. Since the date of this conference, the entire series of programs has been ported to PCs and is now available on a cost-only basis, thus making a major addition to archaeological methods widely available. In effect, the development of this suite of programs signals that the basic, underlying mathematics for seriation has been satisfactorily worked out.

While the work of Ihm is a contribution to archaeological methodology by a statistician interested in archaeological research, the Symposium also included a contribution to statistical methods by an archaeologist. Voorrips, Loving, and Strackee (1985) developed what they call a gamma mix density function as a way to model a baseline condition of random behavior in the selection of animal species hunted during the Middle Paleolithic. Their model is motivated by a suggestion due to Johnson (1981) that settlement system integration may be a multiplicative function of various conditional probabilities. Using the assumption that species are procured at random from the environment, Voorrips et al. (1985) assigned to each species a "popularity" index representing the proportion of sites in which the species is found. These indices are multiplied together to form a multiplicative popularity index, v, for a site, following Johnson's suggestion of using a multiplicative function to represent system integration. Under the assumption that the popularity indices are uniformly distributed over the interval [0, 1], and that the number of species in a site is uniform over the interval [1, N], Voorrips et al. (1985) showed that the density function f(x) for x = −ln v will, for a fixed value of n, have a gamma distribution:

f(x) = x^(n−1) e^(−x) / (n − 1)!

If the number of terms, n, in the product making up v has a uniform distribution over [1, N], then the density function for x becomes

f(x) = (1/N) Σ_{n=1}^{N} x^(n−1) e^(−x) / (n − 1)!,

i.e., a mix of gamma functions. The model was applied to data on the coastal and inland Middle and Upper Paleolithic from West Central Italy. The authors found statistical fit to the gamma mix for the coastal data sets using the Kolmogorov-Smirnov one-sample test and concluded, assuming the validity of the model, that only the inland sites show evidence of specialization, with no detectable difference between the Middle and Upper Paleolithic sites. However, none of the observed distributions is well-modeled by the gamma mix distribution, since all but one of the cumulative, observed distributions lies above the cumulative distribution for the gamma mix distribution, whereas the null hypothesis of no difference implies that the observed distribution should be randomly distributed around the theoretical distribution.
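The baseline model is easy to state in simulation form: draw the number of species n uniformly from {1, . . . , N}, multiply n independent popularity indices drawn from U(0, 1), set x = −ln v, and compare x values against the gamma-mix CDF with a one-sample Kolmogorov-Smirnov test. A sketch with all data simulated (N and sample size invented):

```python
# Simulate the Voorrips et al. baseline model and test fit with a
# one-sample Kolmogorov-Smirnov test against the gamma-mix CDF.
import numpy as np
from scipy.stats import gamma, kstest

N = 8
rng = np.random.default_rng(3)
n = rng.integers(1, N + 1, 2000)                     # species count per site
x = np.array([-np.log(rng.uniform(size=k).prod()) for k in n])

def mix_cdf(x):
    # F(x) = (1/N) * sum over n = 1..N of GammaCDF(x; shape=n, scale=1)
    return sum(gamma.cdf(x, a=k) for k in range(1, N + 1)) / N

print(kstest(x, mix_cdf))    # large p-value: data consistent with the model
```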


Stochastic models such as the one developed by Voorrips et al. (1985), relevant to processes underlying how archaeological data are structured, play a more prominent role in this Symposium than in the previous conferences. Modeling a situation as the outcome of a stochastic process is also used by Chippindale (1985), who presented a simulation of the shapes of cow horns found in rock-art on Mont Bégo. The simulation used a directed random walk on a computer screen. From the current pixel location, the next pixel was selected according to a probability scheme for the eight surrounding pixels. By varying the probabilities from one computer run to the next, Chippindale was able to produce a variety of simulated horn shapes which resemble the rock-art forms; hence he was able to demonstrate that local "pattern" which might be used in a typology for the horn shapes in the rock-art is producible from a stochastic process, hence may have no relation to "intention" on the part of the artisans.
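A directed random walk of this kind is simple to sketch: at each step one of the eight neighboring pixels is chosen with direction-biased probabilities, and varying the bias varies the shapes produced. A minimal version (the probability values are invented for illustration):

```python
# Directed random walk on a pixel grid: at each step choose one of the
# eight neighbors, with probabilities biased toward a preferred direction.
import numpy as np

moves = [(-1,-1), (-1,0), (-1,1), (0,-1), (0,1), (1,-1), (1,0), (1,1)]
p = np.array([1, 4, 10, 1, 6, 0.5, 1, 2], dtype=float)   # invented bias
p /= p.sum()

rng = np.random.default_rng(4)
pos = (0, 0)
path = [pos]
for _ in range(200):
    dy, dx = moves[rng.choice(8, p=p)]
    pos = (pos[0] + dy, pos[1] + dx)
    path.append(pos)
# Different bias vectors p yield different curve shapes; the point is that
# "horn-like" forms can arise from a stochastic process with no intention.
print(path[-1])
```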

These two papers share commonality with several other papers in the Symposium that all ask a similar question: What is the basis, or lack thereof, for structure seen in archaeological data? Whereas Voorrips et al. and Chippindale examined baseline models using stochastic processes, Simek, Ammerman, and Kintigh (1985) saw the spatial distribution of data in archaeological sites as more complex than is generally supposed in arguments advanced about activity areas. They suggested the need for heuristic approaches that allow the analyst to decompose the spatial distribution into spatial clusters that can be examined in terms of their heterogeneity (p. 230) while simultaneously taking into account the effect of small sample sizes on the number of artifact classes recovered in the site (see Kintigh 1984 for an extensive discussion of this problem). Bietti, Burani, and Eanello (1985) used much the same philosophy and suggested that pattern recognition does not so much need automatic methods as interactive ones that allow the archaeologist "to perform several experiments on the data, to play with the data" (p. 224, emphasis in the original).

A paper by Carr (1985a) suggested that Fourier filtering methods may be a solution to the double bind discussed in Christenson and Read (1977), namely that data need to be prescreened prior to applying grouping techniques precisely with the information that one is attempting to gain from the analysis. O'Shea (1985: 107) reached a similar conclusion in his discussion of the difficulties with using cluster procedures to recover simulated groupings in mortuary data. And, as Christenson and Read had shown earlier, he found that the ability of cluster procedures to recover groupings diminishes with inclusion of measurements for nonessential, or irrelevant, attributes.
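The effect O'Shea and, earlier, Christenson and Read describe is easy to demonstrate: recovery of known groups by a cluster procedure degrades as irrelevant variables are added. A sketch with simulated groups (all parameters invented):

```python
# K-means recovery of two known groups, with and without irrelevant
# attributes; agreement is scored by the adjusted Rand index.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(5)
labels = np.repeat([0, 1], 50)
relevant = rng.normal(labels[:, None] * 2.0, 1.0, (100, 2))  # separates groups
noise = rng.normal(0, 3.0, (100, 8))                         # irrelevant

for X in (relevant, np.hstack([relevant, noise])):
    found = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(adjusted_rand_score(labels, found))
# The score drops once the irrelevant attributes are included.
```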

QUANTITATIVE RESEARCH IN ARCHAEOLOGY

In 1985, a national symposium on quantitative methods was held jointly with the 50th Annual Meeting of the SAA in Portland and with Commission 4 of the UISPP. Whereas the charge in the Mamaia Conference was for the archaeologist/historian to address the statistician/mathematician and vice versa, here the archaeologist participants were charged to "think about . . . how they would communicate their thoughts to archaeologists who are nonspecialists in the field of statistics or quantitative methods" (Aldenderfer 1987a: 7).

Also different about this symposium were the many papers that questioned the fit between statistical methods and the goals of archaeological research. Rather than viewing statistical methods as the standard towards which archaeologists should strive (see Aldenderfer 1987: 10), these papers saw discordance between goals and methods. The discordance was seen as stemming in part from the non-holistic nature of an assemblage (Brown 1987: 295), and in part from lack of concordance among the variables selected, data to be analyzed and models inherent in methods being used (Carr 1985, 1987a; Clark 1987; Read 1985, 1987b; Brown 1987).

Yet another sense of discordance was discussed by Aldenderfer (1987a), who pointed out that while there may have been conceptual acceptance of a shift to a quantitative/mathematical approach to classification in American archaeology, the means for carrying out the shift, as exemplified by Read's papers (1974a, 1974b; see also Read 1982) setting forth a mathematical basis for typologies, was not equally accepted. This happened because (1) the formal language of set theory was a barrier and (2) non-quantitatively formulated typologies allegedly "work," hence there was no perceived need to change past methods (p. 26). Aldenderfer attributed the reaction in part to a social and political dimension (p. 27) which continues to affect how quantitative methods are viewed in the discipline, separate from their intrinsic value.

Although the various authors gave different suggestions, common is the sense that something is amiss, that the presumed promise of quantitative methods as a means to provide objectivity and insight is defective. Proffered solutions varied: Q-mode analysis (Aldenderfer 1987b; Brown 1987); artificial intelligence (Doran 1987); "middle range theory" (Voorrips 1987); simple statistics (Whallon 1987); and causal modeling


(Kimball 1987) were all suggested. Read (1985, 1987b) did not so much advocate a specific method as examine the discordance that arises from a statistical framework implicitly oriented towards description, on the one hand, and research goals aimed at explanation, on the other hand. Read (1985, 1987b) pointed out that if statistical analysis is to have explanatory content, then a population brought forward for study must be homogeneous with respect to an underlying process giving these data their structure. Also, the variables used in the analysis must measure relevant aspects of the outcomes of the process. Otherwise, the analysis only has descriptive value in the sense that estimates for parameters in posited models may bear only an uncertain relationship to the properties of the processes allegedly being modeled. Read (1985, 1987b) exemplified the argument with previous work on the spatial configuration of huts in !Kung San camps and a typology for projectile points from the Chumash Indian site, 4VEN39, in California. Read concluded that

we must come to grips with how the cultural context . . . is structured, and how the cultural context relates to the phenomena that we observe and study. Statistical reasoning is one means to express those relations so that their consequences can be drawn out and expressed (1987b: 103, emphasis added).

Return has thus been made to Spaulding's implication that application of statistical methods should be but an extension of archaeological reasoning.

The theme of discordance and its resolution was also taken up by Carr (1985b, 1987a). Carr sees "specification of potentially relevant data or techniques using middle range theory, constrained exploratory data analysis [Carr 1985b], entry models, and stepwise cyclical analysis" (1987a: 237) as ways to minimize the problem of discordance between heterogeneity of data and presumed homogeneity of technique. Carr argued that the "theories, models, hypotheses, test implications, mathematical techniques, data collection methods, and/or the data that are involved in an investigation be relevant to and logically consistent with each other, and ultimately to the empirical phenomenon of interest" (1987a: 187).

Carr suggested that the "methodological double bind" discussed by Christenson and Read (1977) may be overcome by a multifaceted approach to data analysis aimed at exploration of data structure to gain insight into how the phenomena of interest are structured, and through using initial insights in a reflexive, cyclical manner (1987a: 213). Included in this approach is the use of multiple entry models for approaching the analysis of complex data. Carr (1987a: 224-35) considers an entry model to consist of three parts: (1) a mathematical model specifying the relationship between the variables in question and the implied data structure, (2) an enumeration of possible processes that could give rise to data

    structured according to the mathematical model, and (3) specification of


analytical techniques consistent with the mathematical model and data structure. Making the interconnections among data, structuring processes and analytical methods is, he points out, a basic aspect of scientific methodology (1987a: 238).

Related to the lack-of-fit problem between data structure and model structure is the concern Cowgill expressed in 1970 about data reliability. Data reliability is examined in the symposium paper by Nance (1987). Nance pointed out that archaeologists have not been sufficiently concerned about the reliability and validity of the data they collect, yet the methods for measuring the reliability of measurements are well established and can, as he demonstrates, be fruitfully applied to archaeological research.

SUMMARY OF CONFERENCE PAPERS

While these conferences all place attention on what can be achieved with quantitative methods in archaeological research, what this means as praxis has been less unified. Three different approaches are exemplified by these papers: (1) a statistical-method/archaeological-problem dichotomy, (2) statistical reasoning as an extension of archaeological reasoning and (3) formal/mathematical representation of archaeologically defined relationships.

The first approach is meant as identification of a specific archaeological problem along with a relevant statistical method. Many of the papers from the Mamaia conference are of this nature: the archaeologists identify a problem and want to know an appropriate method; the statisticians identify a method and want to know an appropriate problem. When done well, it can be effective; done poorly, it reintroduces subjectivity through contrived interpretations. Effective work using this framework depends upon having a well-argued archaeological problem that clearly identifies the specific relationship to be examined, rather than relying on an implicit analogy between problem and method (see Read 1985) for justification. That is, the difference is between an assertion such as "we want to know if records of hardness of sherds according to the Mohs' scale are useful for distinguishing groups of sherds, knowing already that the hardness is an indication of the method of manufacture or the manner of conservation" (Myers 1950: 17, emphasis added) versus "it seems preferable where taxonomy is the main interest, to carry out a special purpose cluster analysis" (Doran and Hodson 1975: 185, emphasis added).

In the second grouping are included statistical distinctions used to provide a framework for expressing concepts which are part of archaeological reasoning. Papers with this orientation are less frequent than those of the first group and are scattered through the conferences; e.g., of the papers mentioned above, those by Orton (1971), Whallon (1977), Chippindale (1985), Carr (1987a), and Read (1985, 1987b) are examples.


The third group includes papers that focus on developing properties in a formal framework once the basic relationships of interest have been identified. The mathematical work on seriation and on the mathematics of automatic classification, both of which figure heavily in the Marseille conference and the Paris workshop, are examples of this group.

These differences in approach are partially the consequence of different trajectories that archaeological research has taken in different countries,

hence there is no common, unidimensional time trajectory. Nonetheless, there has been a significant trend towards questioning the framework used for the first grouping. That framework assumes a division of labor between archaeologist and statistician. But as the UISPP and SAA symposiums make clear, this division is being eroded as archaeologists increasingly regard statistical methods as a means to reason quantitatively; hence they increasingly use statistical methods as part of, and as a means to extend, archaeological reasoning.

The third group has had limited development despite programmatic statements (e.g., Clarke 1968; Gibbon 1985) about the importance of developing formal representation of archaeological arguments (contra Salmon 1982: 178). Research fitting into this group requires shifting away from an empirical framework aimed at identification and interpretation of data structure to one centered on developing theory as an abstracted system of reasoning utilizing the principles and processes thought to structure the data of interest. The groundwork for such a shift has been laid; undoubtedly research of this kind will become more prevalent.

PART II. QUANTITATIVE METHODS: IMPLEMENTATION

The first part of this review has focused on the changing perceptions of what constitutes the role of quantitative methods in archaeological research. In this second part, focus will be on the manner in which quantitative methods have been used in research and on conceptual problems that arise from their application. For this part of the review, articles have been grouped in accordance with a basic analytical scheme: sampling, measurement, data structure, inference, and modeling. Measurement will be examined for three topics that share the common problem of constructing a measure for a quantity (or quality) that does not have simple formulation. The three topics are: (1) shape of artifacts and features, (2) number of individuals represented by a faunal assemblage, and (3) chronology. Means for assessment of structure will be taken up with two topics: (1) inference of structure as expressed in a data matrix and (2) structure in a spatial arrangement. Statistical modeling will be discussed as an extension of archaeological reasoning.

SAMPLING

Sampling in American Archaeology

The Burg Wartenstein conference concluded, with regard to statistical methods, that "a central problem appears to be the establishment of the correct method of sampling in the individual case" and "archaeologists [should] familiarize themselves with the basic principles of sampling theory" (Heizer and Cook 1960: 357). Shortly after this conference Binford (1964) proposed probability sampling methods as the basis for this "correct method of sampling." Yet almost two decades later, in a review and critique of the application of statistical sampling methods in archaeology, Hole (1980) noted that even though texts on sampling theory are accessible, and even though sampling theory has great potential for archaeological research, "its contributions to archaeological knowledge have been disappointing" (p. 217).

Hole is critical of archaeologists who view sampling merely as a construct to be mapped onto archaeological data recovery fieldwork procedures. Her criticism was foreshadowed in a commentary Binford (1974) made on the Symposium "Sampling in Archeology," held as part of the 1974 SAA Annual Meetings, in which he observed that "a sampling strategy must be evaluated with regard to the character of the target population to be evaluated, not in absolute terms" (p. 252). Similarly, Read (1974) had commented: "There is no best sampling procedure . . . The sampling procedure must take into account . . . the information desired, the distribution of that information in space, cost of obtaining samples, and degree of precision needed, etc." (p. 60), a theme also echoed by several of the other participants of the symposium. If so, the recommendation of the Burg Wartenstein conference was in error, and also in error were all subsequent papers centered on delineation of a specific sampling scheme as the best for archaeological research.

Hole described papers that limited their concern with sampling to the virtues (or lack thereof) of specific sampling schemes as "Algorithmic Archaeology" (p. 220) and "Paper Surveys" (p. 224). By these terms Hole refers to a shift from "papers exploring and advocating" probability sampling to "a veritable flood of prose regarding it as the only way" (p. 218). Hole discussed the Cache River project (Schiffer and House 1978) as an example of algorithmic archaeology and a paper by S. Plog (1978) as an example of paper surveys. Hole commented that in the Cache River project "the research design utilized no prior knowledge about the distribution of archaeological sites, their nature, the prior likelihood of sites existing in different areas of the survey tract . . . . One could design the same research for this area . . . without knowing much about archaeology


. . . [and] could equally employ the design anywhere else on the face of the earth" (pp. 222-223). In Plog's paper an attempt had been made to assess the relative value of different sampling designs through simulated surveys based on the distribution of sites in a single region. Hole noted that "Plog's critical misfortune is the failure to realize that the results are more informative of the data than the statistics" (p. 225). That is, simulations of this kind are neither about statistical properties of sampling procedures nor the relative efficacy of one or another sampling procedure for estimating a parameter such as the number of sites in the region; rather, they are about how spatial distribution properties of archaeological data differentially affect sampling designs. The latter is a mathematical issue; the former an archaeological one. The one depends upon mathematically framed arguments; the other upon archaeologically framed arguments.

The mathematical argument has two foci: one is inferential statistics - the relationship between sample statistics and population parameters as expressed through estimators of population parameters from sample data. The other has to do with extending the underlying statistical theory to sampling designs more complicated than simple random sampling. Read (1974) highlighted some of the mathematical results of sampling theory which bear on choices one might make in designing a sampling program for a region; for example, stratification is advantageous only if the resulting strata have greater internal homogeneity with respect to the variable of interest than does the region as a whole. Hence, the answer to the archaeologist's question, "Should I stratify?", from a sampling theory perspective is, "That depends on the spatial variability of the data."

The statistical argument for stratification is based upon a particular goal: selection of an estimator with minimum variance. An estimator that is constructed using a stratified sample based upon intra-stratum homogeneity and inter-stratum heterogeneity will have less variance associated with it than an estimator constructed using a simple random sample. Although these are important properties upon which to base research centered around inferential statistics, the rationale for designing an archaeological research program upon goals derived from sampling theory is less clear. Archaeologists do use stratification in regional studies, but stratification is generally seen in terms of the importance of ecological and/or geological characteristics in a materialist interpretative framework, not because of distributional properties of populations whose parameters are to be estimated through sample data.

Further, a property such as spatial variability in site density need not be a good archaeological criterion for stratification. To illustrate, a matter which arises with stratification is the size of the sub-sample allocated to a stratum. Sampling theory shows that if the size of the sub-sample for a stratum is proportional to the variance of the variable for that stratum (Neyman optimization), then the parameter value will be estimated with equal efficiency across all strata.
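In its textbook form, Neyman allocation assigns stratum sample sizes in proportion to stratum size times stratum standard deviation. A worked sketch (all stratum figures invented) shows how a clustered, high-variance stratum absorbs most of the sample:

```python
# Neyman allocation: n_h = n * (N_h * S_h) / sum over k of (N_k * S_k),
# where N_h is stratum size and S_h the stratum standard deviation.
import numpy as np

N_h = np.array([400, 300, 300])      # sampling units per stratum (invented)
S_h = np.array([0.5, 0.6, 4.0])      # sd of sites per unit; stratum 3 clustered
n = 100                              # total sample size

weight = N_h * S_h
n_alloc = np.round(n * weight / weight.sum()).astype(int)
print(n_alloc)                       # most of the sample goes to stratum 3
```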

Further, a property such as spatial variability in site density need not be a good archaeological criterion for stratification. To illustrate, a matter which arises with stratification is the size of the sub-sample allocated to a stratum. Sampling theory shows that if the size of the sub-sample for a stratum is proportional to the variance of the variable for that stratum (Neyman optimization), then the parameter value will be estimated with equal efficiency across all strata. From a sampling theory viewpoint it is reasonable to use Neyman optimization (see also the discussion by Voorrips et al. 1987: 233). But the same viewpoint need not be true from an archaeological perspective.

Suppose that in one stratum sites are perfectly uniformly arranged over space and in another stratum they are clustered. The former has low variance for the number of sites per sample unit; the latter has high variance. Under Neyman optimization most of the sample should be allocated to the latter and only a small sub-sample should be used for the former. It is unlikely that many archaeologists would accept such a sampling program - not because of any failure to appreciate the mathematics of sampling theory, but because archaeological research is unlikely to be focused solely on an estimate of the number of sites in the region (Plog, Weide, and Stewart 1977: 115-116). Instead, it requires many kinds of information, little of which is distributed in a parallel fashion with site density. As Hayes, Brugge, and Judge (1981: 109) noted in their discussion of a transect sample survey of Chaco Canyon National Monument - "The transect survey was . . . directed toward obtaining as much information as possible . . . [and] conducted specifically as an inductive search" - the antithesis of the rationale for the Neyman optimization criterion for allocating samples to strata. The Chaco Canyon survey is also instructive for what it informs about sources of bias in obtaining archaeological data.

The Chaco Canyon survey had two parts: (1) the transect survey, which used north-south running transects located randomly along an east-west line and covered 25% of the Monument, and (2) a 100% coverage of the 32 sq. mi. of the Monument. The latter was done to inventory all sites and so allowed for comparison of the "actual" and the estimated total number of sites in the region based on the transect survey.

Hayes et al. treated the transect samples as observational values from a simple random sample and computed a 95% confidence interval for the total number of sites of [547.8, 717.6] around the estimate of 644.1 sites in the region. However, in the complete inventory a total of 1,689 sites had been found - approximately a three-fold discrepancy between the estimated and the observed totals. The authors then proceeded to "test the accuracy of the transect design" (p. 123) by using the site data from the complete survey corresponding to the transect sample to estimate the total number of sites. Not surprisingly, a 95% confidence interval, [1593.9, 2159.7], based on these data includes the population total of 1,689. Indeed, the meaning of a 95% confidence interval is that the constructed interval has a 95% chance of including the true value, so long as no systematic bias had been used in locating transects and the distribution is approximately normal.


Of more importance than this elementary confirmation of a theoretical property from inferential statistics was their discovery that the discrepancy appears to be related to the different number of person-hours per transect used in the two surveys (11.7 versus 30.6 person days/sq. mi.). The total inventory had 2.62 times as many person-hours per transect, and if the number of sites found is approximately linear with intensity, as Hayes et al. argued, then the corrected estimated total would be 632.2 X 2.62 = 1,655.6 versus the actual value of 1,689 sites. Hence the difference in sampling intensity seems to account for the discrepancy.

Concern with finite population corrections, correction for sampling by spatial unit as a form of cluster sampling (e.g., Mueller 1975; Nance 1983) and other technical details which are part of sampling theory become insignificant when compared to a bias of this magnitude. This and other studies (e.g., Coombs 1978) suggest that precision and accuracy may need to be replaced by reliability and validity (see Nance 1987) when discussing research design and field methods.

The background theory for so doing already exists, as Nance (1987) has pointed out. Whether it will be used effectively depends upon archaeologists addressing questions such as "What are the archaeological problems to be faced when trying to get reliable and valid data?" rather than "Which test of reliability or validity is better for this data set?" Nance (1987: 290) correctly sees the need to "collect data in ways that permit reliability and validity to be assessed"; even more, data need to be collected in ways that are reliable and valid.

Sampling in British Archaeology

Symposia on sampling have not been limited to American archaeology. A conference titled "The Role of Sampling in Contemporary British Archaeology" was held in 1977 at Southampton. Unlike the earlier SAA symposium, the British conference did not have a background of indigenous archaeological work to draw upon, and only some of the American literature had been accessible to British archaeologists (Cherry, Gamble, and Shennan 1978: 1). In these papers there is a curious naivete about what statistical theory establishes - even a distrust of theory by British archaeologists (Cherry and Shennan 1978: 30) - that shows up in reliance upon experimentation "to compare the efficiency of sampling designs against one another . . . using archaeological materials" (Haselgrove 1978: 164; see also Voorrips et al. 1978: 241; Fasham and Monk 1978: 395). Archaeological data are seen as a different kind of data, and sampling theory has to be tested to see if it "really works" for these data.

More positively, there is often an attempt to fit sampling to archaeological practice rather than the reverse (e.g., Peacock 1978; Jones 1978; Champion 1978). Cherry et al. (1978: 154) see the need for a three-stage research program (see also Redman 1973) in which the first stage is aimed at site

reconnaissance, the second stage is to use probability sampling of the site based on the information obtained from stage 1, and the last stage should include judgmental selection of units from stage 2 which require more extensive excavation.

Nonetheless, discussion frequently reverts to the level of generalities which provide little information on what, substantively, needs to be considered when forming a sampling program. It does little good to say that "the specific research aims, previous knowledge of the population characteristics, and the ease or difficulty of implementation given cost, time, storage and sample size limitations must be kept clearly in mind" (Fasham and Monk 1978: 379) without effectively translating the general problem into specific guidelines for making decisions when formulating a sampling program. Attempts are made to do this, but they are marred by misconceptions. For example, Fasham and Monk (1978: 388) are encouraged that a sampling fraction as small as 1% yielded an unbiased estimate of the mean, even though bias is a property of estimators, not estimates, and (except for asymptotically unbiased estimators) is unrelated to sample fraction.
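The point about bias can be made concrete by simulation: the sample mean centers on the population mean at a 1% sampling fraction just as it does at 50%; only the spread of the estimates changes. A sketch, with a hypothetical population:

```python
import numpy as np

rng = np.random.default_rng(1)
population = rng.gamma(shape=2.0, scale=5.0, size=10_000)  # hypothetical
true_mean = population.mean()

# The sample mean is an unbiased estimator at any sampling fraction:
# averaged over repeated samples, it centres on the population mean.
for fraction in (0.01, 0.10, 0.50):
    n = int(fraction * len(population))
    means = [rng.choice(population, size=n, replace=False).mean()
             for _ in range(500)]
    print(f"fraction {fraction:4.0%}: mean of estimates = {np.mean(means):.3f} "
          f"(population mean = {true_mean:.3f})")
```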

Form Versus Substance in Sampling

Errors of this kind aside, both in this conference and in the earlier SAA symposium one senses that more often than not the arguments are about the form and not the substance of sampling theory. The idea of sampling as a way to reason about the relationship between what can be observed and what one wants to know gets translated into an exercise focusing on which method better answers questions as defined by sampling theory. Is the intent of archaeological sampling to estimate population parameters? Or is sampling the means to recover the data which constitute the domain of study for the archaeologist? The first question is answered by sampling theory; the second is not. Orton (1978), in a brief contribution to the conference, recognizes the difference when he rejects the sampled-population/target-population dichotomy as largely irrelevant to archaeology and proposes instead that one should set up "mathematical models which will include both archaeological and sampling ideas" and "it may even be that sampling theory itself is irrelevant . . . [except as] a starting point for discussion about why we quantify, how we quantify, and what sort of statements we can reasonably expect to prove useful" (pp. 400, 402). And using a mathematical idiom to reason about archaeological problems is, Groube (1978: 405) suggested in his closing remarks to the conference, but a shift from ad hoc to formal methods, hence an extension of already existing archaeological reasoning. The sentiment is reminiscent of Spaulding's comments at the Mamaia conference.

There is a place for testing sampling designs with archaeological data; that is, for testing how one can recover the data of concern to archaeologists in a manner consonant with the interpretive goals of archaeological research.


Archaeological research has gone from inferring patterns on the basis of single sites and single extensive excavations to patterns seen through aggregated data, where the data are obtained from physically distinct locations, be they sample units in a site or spatial units in a region. Interpretation that is based on pattern seen in the aggregate places new demands on the way data are obtained, to ensure that pattern used to make interpretations is not pattern created through data recovery methods. Read (1986) has argued that spatial sampling is primarily a means to create a representative (in the sense of preserving proportions) sample of the full range of archaeological data and is only secondarily a means to estimate population parameters - except for predictive modeling, which explicitly has as its goal estimation of population parameters. Read suggests that sampling efficiency should be seen as relating to data recovery, not to estimator variances; it should implement current knowledge about the spatial distribution of sites, both in relationship to ecological/geological features and to other sites as part of a settlement system. Efficient sampling in this sense requires the application of archaeological reasoning, not just sampling-theory reasoning. There may be overlap in the two modes of reasoning, but they are not identical.
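Representativeness in Read's sense - preservation of proportions - can be checked directly in simulated data. The sketch below (a hypothetical region and two hypothetical artifact classes, one evenly spread and one spatially graded) compares class proportions in a 10% simple random sample of collection units with the population proportions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical region of 50 x 50 collection units; each unit holds
# counts of two artifact classes with different spatial trends.
grid = np.zeros((50, 50, 2), dtype=int)
grid[..., 0] = rng.poisson(3.0, size=(50, 50))                        # even
grid[..., 1] = rng.poisson(np.linspace(0.2, 6.0, 50), size=(50, 50))  # graded

flat = grid.reshape(-1, 2)
pop_props = flat.sum(axis=0) / flat.sum()

# Simple random sample of 10% of the units.
idx = rng.choice(len(flat), size=len(flat) // 10, replace=False)
sample_props = flat[idx].sum(axis=0) / flat[idx].sum()

print("population proportions:", np.round(pop_props, 3))
print("sample proportions:    ", np.round(sample_props, 3))
```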

    MEASUREMENT

Artifacts are material objects, hence have form. Artifact form has both quantity (size) and quality (shape). Shape is comprised of the relationship of parts to one another; size is the magnitude of the parts. In addition, form, as it relates to artifacts, has both a material aspect (function) and an ideational aspect (emic conceptualization of forms). While both size and shape may involve the material and the ideational dimensions, shape would seem, in most cases, to contain the greater amount of information; hence the frequent emphasis on measuring shape separated from size. Though shape belongs to the domain of geometry, geometry does not adequately give us a vocabulary to discuss the separation of form into size and shape except in special cases; e.g., forms that have the same shape but different size (geometrically similar shapes) or forms which differ in shape but without specification of size (e.g., triangles versus rectangles). In the absence of geometric specification, archaeologists have largely relied on linear measurements to measure forms, and on ratios of measurements to infer shape.

ARTIFACT FORM

One of the more extensive and systematic of these schemes was given by Benfer (1967), who used a suite of 23 measurements on a projectile point. Benfer's system is but the elaboration of a tradition of measurement whose simplest canonical form is: length, width and thickness of an object. Shape in this canonical form is taken to be a ratio such as shape = length/width. This, for example, became the basis for Bordes' (1961: 6) distinction of a blade from a flake: if L/W < 2 the lithic is a flake; otherwise it is a blade. Bordes' system is arbitrary in the sense that it has no underlying geometric or cultural theory.

The extremes one can be led into by an arbitrary system of measurement are demonstrated in an article by Pitts (1978), who applied principal components, Ward's cluster technique, Gower's constellation analysis (described in Doran and Hodson 1975: 205-209) and spatial analysis to published data on flakes from England and Italy in which the length/width ratio was divided into six shape classes. Though Pitts noted the criticism (e.g., Hassan 1974, Chapter III) made of Bordes' blade/flake dichotomy as possibly being nothing more than an arbitrary subdivision of a continuum, Pitts only sees the problem as one of choice of the number of classes (the dichotomy used by the French, or the several-fold subdivision used in Italy and England) and standardization of classes.

Pitts finds that something non-constant has been measured - the data have spatial and temporal patterning - and concludes that the several-fold distinction is better than the dichotomous one. The implication is not seen: by the same logic an even finer distinction is likely to be more informative, and carried to its extreme one would say that the best approach is to make as many classes as possible; i.e., use length and width measurements to make each object the sole member of its own arbitrarily defined class. Clearly the problem lies not in whether one should use 2 or several shape classes, but in the premises upon which the original dichotomy was made as an imposed and arbitrary system for the numerical representation of shape.
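How strongly such a cutoff, rather than the material itself, controls the resulting classification can be seen in a few lines of code (the measurements are hypothetical, not from Bordes or Pitts):

```python
# A minimal illustration of how an imposed length/width cutoff, not the
# artifacts themselves, determines the blade/flake classification.
pieces = [(41, 30), (55, 26), (62, 33), (48, 25), (70, 22)]  # (length, width) mm

def classify(length, width, cutoff=2.0):
    return "blade" if length / width >= cutoff else "flake"

for cutoff in (1.5, 2.0, 2.5):
    labels = [classify(l, w, cutoff) for l, w in pieces]
    print(f"cutoff {cutoff}: {labels}")
# Several pieces change class as the cutoff moves; nothing about the
# lithics themselves privileges any one partition of the continuum.
```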

Similar criticisms of an arbitrary approach to artifact measurement appeared in two independent sets of papers (Main 1978, 1979; Read 1982). Both authors noted that the arbitrary approach fails to provide information from which the original outline can be reproduced. Call the ability to recreate the form from a numeric representation the archival property. In addition to the archival property, Read noted that all measures or coefficients making up a numeric representation should have meaningful interpretation within the theoretical framework used in the analysis. The two authors differed, however, on specific suggestions for a measurement system that would satisfy the archival property.


Main (1978) represented a smoothed outline given in digitized form through two coefficients: arc length and slope of the chord (tangent angle) connecting consecutive points in the digitization. By plotting arc length against tangent angle, a profile for a shape is produced, and profiles, or sections of profiles, can be compared and a similarity measure constructed. The similarity measure can then be used in a cluster analysis for grouping. Main applied the technique to cross sections of train rails and "achieved a very reasonable classification of 48 fairly irregularly shaped objects" (p. 46).
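Main's profile is straightforward to compute from a digitized outline. A minimal sketch, reconstructed from the description above (this is not Main's own program, and the outline is hypothetical):

```python
import numpy as np

def tangent_angle_profile(points):
    """Arc length vs. chord slope (tangent angle) for a digitized outline.

    points : (n, 2) array of consecutive outline coordinates.
    Returns the cumulative arc length at the start of each chord and the
    angle of that chord - together, a profile of the kind Main compared.
    """
    pts = np.asarray(points, dtype=float)
    deltas = np.diff(pts, axis=0)
    seg_len = np.hypot(deltas[:, 0], deltas[:, 1])
    arc_length = np.concatenate([[0.0], np.cumsum(seg_len)])[:-1]
    angles = np.arctan2(deltas[:, 1], deltas[:, 0])
    return arc_length, angles

# Hypothetical digitized outline: a coarse circle.
theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
outline = np.c_[np.cos(theta), np.sin(theta)]
s, phi = tangent_angle_profile(outline)
print(s[:5], phi[:5])
```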

Read (1982) discussed the need for a non-redundant and sufficient set of coefficients for representing the outline of projectile points. By sufficient he meant that the original form could be reproduced from the coefficient values, i.e., the archival property would be satisfied; by non-redundant he meant that no smaller number of coefficients would be sufficient. Read viewed the outline of a form as a series of connected arcs (curves with no inflection points) and represented a smooth arc via three coefficients: (1) the length of a chord connecting the endpoints of the arc, (2) the maximum height of the arc above the chord, and (3) the distance along the chord to the maximum height. This, along with the angle of connection for two arcs, leads to a representation in the form of a vector: the first three components of the vector represent an arc, the next component is the angle joining this and the next arc together, the next three components define the next arc, etc., until the outline has been completely traced. Read showed that this representation scheme was sufficient and non-redundant and applied it to Scottsbluff points so as to examine intersite differences in the assemblages from which these points came. Phagan (1985) has applied a similar scheme to projectile points from the Dolores Project.

Read noted that the distinctions upon which the representation was based - arcs and angles - could be interpreted for cultural and functional saliency through the form of frequency distributions for coefficient values. Read used this technique to show that the angle for the tip of the point is in accord with material constraints, whereas the shape of the base section of the Scottsbluff points is more in accord with ideational constraints.
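Read's three arc coefficients can likewise be computed directly from digitized points. A sketch based on the description above (not Read's own code); the test arc is hypothetical:

```python
import numpy as np

def arc_coefficients(arc_points):
    """Compute (chord length, maximum height above the chord,
    distance along the chord to that maximum) for one smooth arc.

    arc_points : (n, 2) array of digitized points along the arc.
    """
    pts = np.asarray(arc_points, dtype=float)
    start, end = pts[0], pts[-1]
    chord = end - start
    chord_len = np.hypot(*chord)
    unit = chord / chord_len
    rel = pts - start
    along = rel @ unit                             # distance along the chord
    height = rel @ np.array([-unit[1], unit[0]])   # signed height off the chord
    i = np.argmax(np.abs(height))
    return chord_len, abs(height[i]), along[i]

# Hypothetical arc: the upper half of a unit circle.
t = np.linspace(0, np.pi, 50)
arc = np.c_[np.cos(t), np.sin(t)]
print(arc_coefficients(arc))  # chord 2.0, height 1.0, reached at 1.0
```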

A third representation for the outline of an artifact was discussed by Gero and Mazzullo (1984), who illustrated that if the outline is embedded in a polar coordinate system centered at the centroid of the form, then the outline (at least for most lithics) could be viewed as the graph of a periodic function, the Fourier function (defined below), with period 2π.
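For outlines that meet each ray from the centroid exactly once - the assumption underlying this representation - the Fourier coefficients of r(θ) can be estimated numerically. A sketch under that assumption, with a hypothetical ovate outline:

```python
import numpy as np

def fourier_coefficients(radii, n_harmonics=8):
    """Fourier coefficients of a centroid-centered outline r(theta),
    sampled at equally spaced angles around the centroid."""
    n = len(radii)
    theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
    a0 = radii.mean()
    a = [2 * np.mean(radii * np.cos(k * theta)) for k in range(1, n_harmonics + 1)]
    b = [2 * np.mean(radii * np.sin(k * theta)) for k in range(1, n_harmonics + 1)]
    return a0, np.array(a), np.array(b)

# Hypothetical ovate outline with known harmonic content.
theta = np.linspace(0, 2 * np.pi, 256, endpoint=False)
radii = 3.0 + 0.6 * np.cos(theta) + 0.2 * np.cos(2 * theta)
a0, a, b = fourier_coefficients(radii)
print(np.round([a0, a[0], a[1]], 3))  # recovers 3.0, 0.6, 0.2
```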

Whereas Pitts carried the arbitrary measurement system to an extreme, Tyldesley, Johnson, and Snape (1985) did the same with so-called objective measurement procedures. They suggested representation in the form of a profile computed by constructing the angle between a major axis through the centroid of the form and a minor axis through the centroids of the two sub-shapes formed by the major axis (references for the method are given in the paper). The process starts by drawing a major axis vertically and measuring the angle between the major and the corresponding minor axis. The construction of the major and minor axes and the measurement of the angle between them is then repeated for each 1-degree increase in the angle between the major axis and a vertical line as one goes around the outline. The profile is formed from a plot of the interval angle (the angle between the major axis and a vertical line) versus the angle between the major and the minor axis. The authors commented that when the profiles are used to group paleolithic bifaces, groupings are formed "which may differ from those classified by more traditional methods, but which are worthy of consideration as an entirely objectively generated system" (p. 23, emphasis added).

The profile based on centroids does not satisfy the archival property, as geometrically similar shapes will have identical profiles, though one could add a size coefficient. The Fourier function can satisfy the archival property, but it does not satisfy the non-redundancy property, as the number of coefficients used to define the function relates to accuracy of fit to outlines with "corners" (points of discontinuity for the first derivative), not to shape properties per se.

As Read (1982) discussed, values for coefficients which are part of the characterization of a shape can be treated as values of variables when different shapes are compared. However, these need to have interpretation at the level of process. The Fourier coefficients, except possibly for the first 2 or 3, cannot be so interpreted. The tangent angle used in Main's representation is too "fine" a measure for interpretation, as it (potentially) varies at the scale of pixels. Read's vector representation is at the right scale, but whether arcs and angles between arcs have emic relevance was only partially resolved.

From these papers we may distinguish two separate problems that need not have identical solutions (see also Main 1979). The first problem has several solutions for simpler cases such as planar curves; the other has not yet been resolved.

The first problem is archiving a form in an efficient manner, or creating what Main (1979) calls an internal representation (IR). Main discussed properties an ideal IR should have, including a one-to-one mapping between outlines and IRs, ease of performing Euclidean transformations (changes of scale, rotations and translations) and amenability to analytical treatment, such as constructing similarity coefficients. Digitized information, for example, is an IR, but not an ideal one, as it requires large numbers of coordinate points for accurate representation. In addition to his profile technique, Main suggests using local curve-fitting techniques when representing the outline of a shape. Spline functions are of this nature and have seen extensive development in computer graphics for locally fitting curves and surfaces. They hold much promise as a general solution to archiving two- and three-dimensional shapes. Global techniques such as the fitting of Fourier functions are also possible, especially with elliptical Fourier functions that can represent complex planar curves (Kuhl and Giardina 1982).
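As an illustration of a spline-based internal representation, the sketch below fits a periodic parametric B-spline to a hypothetical digitized outline using SciPy; the knots and coefficients form a compact archive from which the outline can be regenerated at any resolution.

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Hypothetical digitized outline: a closed, slightly irregular form.
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
x = (3 + 0.4 * np.cos(3 * t)) * np.cos(t)
y = (2 + 0.4 * np.cos(3 * t)) * np.sin(t)

# Fit a periodic parametric B-spline; the knot vector and coefficients
# (tck) are the internal representation.
tck, u = splprep([x, y], s=0.01, per=True)
x_new, y_new = splev(np.linspace(0, 1, 400), tck)
print(len(tck[1][0]), "spline coefficients archive", len(x), "digitized points")
```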


The second problem is to find a representation that also has interpretation, as Borillo (1977) has discussed, in the context of archaeological theory. Most of the archiving methods given in these papers (including spline functions) do not satisfy this criterion. Read's (1982) system comes closest, but is still a first approximation. Better understanding is needed of the process by which shapes are formed or constructed; e.g., Krause's (1984) work on pottery production sequences using formal grammars.

FEATURE SHAPE

Whereas artifacts are single entities with form, features are composed of several smaller units which are combined to make a larger whole. These raise the question of what design principles were used to locate the smaller unit within the larger structure. More mundane features such as firepits, etc., have had less speculation about design principles than more spectacular features; many of the latter have also been given interpretation as devices for storing and retrieving astronomical and calendrical observations. Interpretation of features, structures and the like as early calculating or information storage devices depends upon linking aspects of form with appropriate physical observations, hence requires delineation of design principles.

Megalithic Stone Rings

In 1967, A. Thom published a treatise on megalithic sites in Britain and claimed to have discovered a number of design principles, including a unit of measurement called the "megalithic yard," or quantum, equal to 2.72 ± 0.003 ft. He said it was used as a standard for the construction of the more-or-less circular stone rings. Thom's work on isolating this quantum was challenged by Angell (1976, 1978a, 1978b, 1979) on both statistical grounds and failure to consider alternative design principles that could be used to construct the circles.

Angell (1978a) recast Thom's argument in the form of a statistical model:

y_i = m_i·δ + φ_i + ε_i,

where y_i is the diameter of the ith stone ring, δ is the "megalithic yard," m_i is a positive integer, φ_i is the error in locating the center of the circle, and ε_i is the error term for the measurement of y_i.
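The circularity Angell identified is easy to exhibit: if the integers m_i are chosen to fit a candidate δ, then some "best" quantum will be found even in diameters generated with no quantum at all. A sketch (not Angell's procedure; the diameters and scoring rule below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def quantum_score(diameters, delta):
    # Fit integers m_i to a candidate quantum, then score the residuals
    # against what blind rounding would give on quantum-free data
    # (variance delta^2 / 12). Low scores appear to "support" the quantum.
    m = np.maximum(1, np.round(diameters / delta))
    resid = diameters - m * delta
    return np.mean(resid ** 2) / (delta ** 2 / 12)

# Random "diameters" with no quantum built in at all.
diameters = rng.uniform(5, 50, size=30)
candidates = np.arange(1.0, 4.0, 0.01)
scores = [quantum_score(diameters, d) for d in candidates]
best = candidates[int(np.argmin(scores))]
print(f"'best' quantum: {best:.2f}, score: {min(scores):.2f}")
# Because the m_i were chosen to fit delta, some candidate always looks
# good - the circularity in using the m_i to justify the estimate of delta.
```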

Angell argued that Thom makes a circular argument by first using an estimate of δ to determine the integers m_i, and then using the m_i to justify the estimate for the "megalithic yard," δ. Angell used the model to determine appropriate statistical tests of the various assumptions made by Thom. The re-analysis by Angell showed that Thom's conclusion of having a "megalithic yard" is not supported by his data.

In two of the papers (1976, 1978b) Angell addressed the problem of fitting a curve to the approximately circular shape of the stone rings. In the earlier of these two papers he developed a polygonal method, and in the latter a method that is a modification of the well-known technique for drawing an ellipse by using two fixed points and a cord attached to these two points. In effect, the construction technique uses a sequence of elliptical arcs to match the outline of the megalithic stone rings. Both methods used a computer-based, iterative least squares solution (based on a minimization algorithm due to Powell (1964)) for fit between the constructed curve and the coordinate locations of the stones.

In the fourth paper (1979) Angell developed an "explanation" for the megalithic rings by suggesting that they were devices for recording and retrieving certain special days during the year. As part of the argument, Angell developed yet another way to fit a curve to the stone rings, this time based on the changing orientation and length of a shadow for an upright pole during the year. The model Angell developed uses two coefficients to define a curve: δ - the angular altitude of the center of the sun, and d - a fixed offset from the pole from which shadow lengths are measured. Four other coefficients are used to orient a constructed curve (origin, reference point, angular difference of coordinate systems and scale) with the megalithic ring. Angell again used an iterative fitting procedure based on Powell's algorithm and was able to get good fit to rings which vary from "flattened circles" to "egg-shapes."

In this model the stone locations correspond to calendrical days; hence the stones serve as "reminders" of certain special days each year. The author commented that "All existing theories reflect to some extent their author's particular interest, be it engineering, astronomy or mathematics. The present author therefore submits his theory as equally acceptable. Whether these hypotheses are fact or fancy will never be known" (p. 12) - a caution that is exemplary but often ill-heeded.

Leaving aside the interpretation to be made of the stone rings, this suite of papers highlights the problem noted above with measurement of artifact shapes: Is the goal an archival or an interpretive/explanatory one? These can overlap; e.g., Angell's procedure based on shadow lengths is both archival and was constructed to provide a hypothesized interpretation.
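Powell's algorithm remains a standard derivative-free minimizer. The sketch below fits a plain circle to hypothetical stone coordinates by least squares using SciPy's implementation; Angell's actual curve families (polygons, joined elliptical arcs, shadow curves) would replace the circle model, but the fitting logic is the same in outline.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# Hypothetical stone positions: a rough ring of 12 stones.
angles = np.sort(rng.uniform(0, 2 * np.pi, 12))
stones = np.c_[10 + 7 * np.cos(angles), -5 + 7 * np.sin(angles)]
stones += rng.normal(scale=0.3, size=stones.shape)  # placement error

def ring_misfit(params, pts):
    """Sum of squared radial residuals for a circle (cx, cy, r)."""
    cx, cy, r = params
    return np.sum((np.hypot(pts[:, 0] - cx, pts[:, 1] - cy) - r) ** 2)

x0 = [stones[:, 0].mean(), stones[:, 1].mean(), 5.0]  # crude starting values
fit = minimize(ring_misfit, x0=x0, args=(stones,), method="Powell")
print("center and radius:", np.round(fit.x, 2))
```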


Angell's procedure also satisfies Read's criterion of non-redundancy: the shape construction procedure has associated with it a minimal number of coefficients needed to recreate the form, and these coefficients then serve as its representation. It should be noted that this number of coefficients need be minimal only for the specific rule which relates the numeric values of the coefficients to a specific form. Further, the rule becomes an assertion about how the form can be constructed, hence is interpretable as having cultural meaning. The rule not only states how to construct the form from the coefficient values, but in so doing identifies a potential set of principles underlying production of that form. From here, it is a small step to the idea of a grammar of artifact shape, a topic taken up in a 1986 SAA symposium called "Form and design in archaeology: A grammatical approach" (C. Chippindale, organizer). Quantitative structures now become qualitative structures, reversing the earlier shift from quality to quantity as the means for representation of structure in data.

Postholes: Rectangular and Round Structures

Another group of papers (Cogbill 1980; Fletcher and Lock 1981, 1984a, 1984b; Litton and Restorick 1984; Bradley and Small 1985) considered the analysis of postholes for regular patterns that may have corresponded to round or rectangular structures. The papers by Fletcher and Lock and by Bradley and Small examine the likelihood that a regular shape (rectangular for Fletcher and Lock, circular for Bradley and Small) could be a chance occurrence.

In their paper, Fletcher and Lock (1984a) are concerned not only with methods for determining if the number of rectangles is greater than what would be expected in random data, but also with testing intuitive preconceptions that structures in English hillforts are generally between 2 and 4 meters in size and are square. The authors establish (1984b) that the expected number of rectangles among N points randomly distributed over an area of A m² is given by:

E(number of rectangles) = M·e^(-ε/2)·(1 - ε),

where ε = p(1 - e^(-t)), p = (N - 2)²(b - a)T/A, t = 0.41·T²(N - 2)/A, T = the tolerance for agreement between opposite sides, and a, b = the lengths of the sides of the rectangle.

Random data are generated corresponding to the parameters for the excavation, and the authors (1984a) use the above model to demonstrate that the number of rectangles found in the excavated data significantly exceeds what would be expected by chance. Their detailed analysis goes on to confirm the preconceptions about the hillfort based on visual inspection of posthole locations, yet it also shows that identification of square structures by eye is problematic.
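A Monte Carlo version of the Fletcher and Lock comparison can be sketched as follows. The rectangle test used here (equal distances from the common centroid, within a tolerance) and all densities are illustrative assumptions, not Fletcher and Lock's definitions:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)

def is_rectangle(pts, tol):
    # Four points form a (possibly very thin) rectangle exactly when they
    # are equidistant from their common centroid; tol absorbs posthole
    # placement error. This tolerance definition is an illustrative
    # assumption, not Fletcher and Lock's opposite-sides criterion.
    d = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
    return d.max() - d.min() < tol

def count_rectangles(points, tol=0.15):
    return sum(is_rectangle(points[list(q)], tol)
               for q in combinations(range(len(points)), 4))

# Reference distribution for random scatters of N "postholes" over a
# square area; density and tolerance are hypothetical.
N, side, reps = 25, 20.0, 20
chance = [count_rectangles(rng.uniform(0, side, size=(N, 2))) for _ in range(reps)]
print("mean chance rectangles per scatter:", np.mean(chance))
# An observed count well above this Monte Carlo baseline would, as in
# Fletcher and Lock's analysis, point to deliberate construction.
```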

Whereas the data used by Fletcher and Lock are relatively dense (935 postholes in 2,664 m²), the data used by Bradley and Small are quite thin. They based their analysis not on the random distribution of points over a region, but on a Poisson distribution for the nearest-neighbor distance between postholes.

From the Poisson distribution can be determined the expected number of instances, in a collection of points, where a subset of n points will fall on the circumference of a circle with tolerance ε in its estimated radius:

E(number of circles) = g(n)·p^n·R^(2n-2)·A·ε^n,

where p = the Poisson parameter, R = the radius of the circle, and g(n) is determined by Monte Carlo simulation (tabulated in Bradley and Small (1985)). The authors use the model to show that for the Aldermaston Wharf excavation (Bronze Age, England), two candidates for circular structures could just be chance occurrences, whereas for the South Lodge Camp excavation (Bronze Age, England), two candidates for circular structures are unlikely to have arisen by chance. The statistical conclusions are corroborated by other data.

Shape of Corbelled Domes

In a series of papers (Cavanagh and Laxton 1981, 1982, 1985; Cavanagh et al. 1985), the shape of corbelled domes was examined by fitting a power function f(x) = ax^b to the interior curve of the dome. In the last paper of this series, the question considered was whether the dome plus the supporting wall formed a single curve, or whether it was the join of two curves: one representing the dome and the other the supporting wall. To test the hypothesis, a two-phase regression model w