mel-tr-74-29 - eric · document.resume sd 097 642 cs 001 383 author williams, allan r., jr.; and...

DOCUMENT.RESUME

SD 097 642 CS 001 383

AUTHOR Williams, Allan R., Jr.; And OthersTITLE Readability of Textual Material--A Survey of the

Literature.INSTITUTION Air Force Human Resources Lab., Lowry AFB, Colo.

. Technical Training Div.REPORT NO MEL-TR-74-29PUB DATE Jul 74NOTE 67p.; See related document CS 001 381

EDRS PRICE MF-$0.75 HC-$3.15 PLUS POSTAGEDESCRIPTORS Armed Forces; *Literature Reviews; *Readability;

Reading; *Reading Comprehension; Reading Instruction;Reading Level; *Reading Materials; ReadingResearch

ABSTRACTThis literature review documents the state-of-the-art

of readability assessment and indicates directions for futureresearch in readability measurement. The period since 1953 isemphasized, although there is some consideration of earlier work.Primary emphasis within the review is based on readabilitymeasurement; however, a final section is included which reviewsrecent work bearing on the topic of increasing comprehensibilitythrough multimodal presentation methods. Various formulas forcalculating readability are presented and placed in historicalperspective. The contents include: "Introduction," which presents thescope and organization of the review; "Methods for MeasuringReliability,u which discusses rating methods, use tests, readabilityformulas, early formulas, detailed formulas, recent formulas, clozeprocedures, and multimodal presentations; and "Discussion," whichdiscusses the estimation of reading levels, formulas, and futureresearch. An appendix of various readability measures is alsoincluded. (NR)

'UR FORCE LI.;

H

UMAN

RES0URC

U S DEPARTMENT OF HEALTH.EDUCATION WELFARENATIONAL INSTITUTE OF

EDUCATIONT HI DO( oAAL NI HAS BEEN REPRODUCfh t pEcrivro FROMT1i1 PI RSON.:IR ORGANIZATION ORIGINAf:M,'T POINTS 01 VIEW OR OPINIONSSIA PO NOT NECESSARILY REPRE

.)I I IC 1AL NATIONAL INSTITUTE OFE.DU(nTIUN POSITION Ok POLICY

AFHRLTR-74-29

READABILITY OF TEXTUAL MATERIALA SURVEY OF THE LITERATURE

By

Allan R. Williams, Jr.Arthur I. Siegel

Applied Psychological Services, Inc.Wayne, Pennsylvania 19087

James R. Burkett

TECHNICAL TRAINING DIVISIONLowry Air Force Base, Colorado 80230

July 1974

Approved for public release; distribution unlimited.

LABORATORY

AIR FORCE SYSTEMS COMMANDBROOKS AIR FORCE BASE,TEXAS 78235

NOTICE

When US Government drawings, specifications, or other data are usedfor any purpose other than a definitely related Governmentprocurement operation, the Government thereby incurs noresponsibility nor any obligation whatsoever, and the fact that theGovernment may have formulated, furnished, or in any way suppliedthe said drawings, specifications, or other data is not to be regarded byimplication or otherwise, as in any manner licensing the holder or anyother person or corporation, or conveying any rights or permission tomanufacture, use, or sell any patented invention that may in any wayly: related thereto.

This report was submitted by Applied Psychological Services, Inc.,Wayne, Pennsylvania 19087, under contract F41609.72-C-0011, workunit 11210403, with the Technical Training Division, Air Force HumanResources Laboratory (AFSC), Lowry Air Force Base, Colorado 80230.Dr. James R. Burkett, Instructional Technology Branch, was thecontract monitor.

This report has been reviewed and cleared for open publication and/orpublic release by the appropriate Office of Information (Op inaccordance with AFR 190.17 and DoDD 5230.9. There is no objectionto unlimited distribution of this report to the public at large, or byDDC to the National Technical Information Service (NTIS).

This report has been reviewed and is approved.

MARTY R. ROCKWAY, Technical DirectorTechnical Training Division

Approved for publication.

HAROLD E. FISCHER, Colonel, USAFCommander

UnclassifiedSECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE BEFORE COMPLETING FORMI. REPORT NUMBER

AFHRITR-74-292. GOVT ACCESSION NO. 3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)READABILITY OF TEXTUAL MATERIALS ASURVEY OF THE LITERATURE

15. TYPE OF REPORT & PERIOD COVERED

6. PERFUHMING ORG. REPORT NUMBER

7. AUTHOR(a)Allan R. Williams, Jr.Arthur I. SiegelJames R. Burkett

8. CONTRACT OR GRANT NUMBER(a)

F41609-72-C-0011

9. PERFORMING ORGANIZATION NAME AND ADDRESS

Applied Psychological Services, Inc.Wayne, Pennsylvania 19087

10. PROGRAM ELEMENT. PROJECT, TASK

62185t 6 WORK UNIT NUMBERS

11210403

11. CONTROLLING OFFICE NAME AND ADDRESSHq Air Force Human Resources Laboratory (AFSC)Brooks Air Force Base, Texas

12. REPORT DATEJuly 1974

13. NUMBER OF PAGES70

1i. MONITORING AGENCY NAME & ADDRESS(it different from Controlling Office)

Technical Training DivisionAir Force Human Resources LaboratoryLowry Air Force Base, Colorado 80230

IS. SECURITY CLASS. (of this report)

UnclassifiedI5a. DECLASSIFICATION:DOWNGRADING

SCHEDULE

16. DISTRIBUTION STATEMENT (of Mitt Report)

Approved for public release, distribution unlimited.

17. DISTRIBUTION STATEMENT (of the abstract entered In Block 20, if different from Report)

18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on revere* side if necessary and identify by block number)

readabilityreading levelsreading comprehensiontraining materialstechnical writing

20. ABSTRACT (Continue on reverse side If necessary end Identify by block number)

The literature relating to methods of measuring the readability/comprehensibflity of textual materials is reviewedand analyzed. Various formulas for calculating readability are presented and placed in historical perspective. Thegeneral status of research into the development of readability indices is discussed.

Dn F RM147340 I JAN 73 EDITION OF 1 NOV SS IS OBSOLETE

Unclassified

SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

SUMMARY

Problem

It is expected that the mental aptitude and academic achieve-ment levels of enlistees into the United States Air Force may dropas the goal of an all-volunteer military force is approached. More-over, there is currently a gap between the reading achievement levelrequired to read certain Air Force technical literature and the read-ing ability levels of enlistees. Accordingly, the Human ResourcesLaboratory initiated a program to define methods to optimize thematching of technical training materials to the literacy skills of AirForce trainees. This review of the literature relating to methodsfor determining the readability of textual material represents thefirst result of this program. The two remaining aspects of the pres-ent study are: (1) experimental evaluation of modified training materi-als, when presented with and without auditory. supplementation, and(2) preparation of a training materials modification handbook, whichwill integrate information from the literature review, the materialmodification effort, and the experiment.

This literature review is intended to: (1) document the state-of-the-art of readability assessment, and (2) indicate directions forfuture research in readability measurement.

App roach

This report selectively reviews the literature relative to read-aoility-comprehensibility measurement. The period since 1953 is em-phasized, although there is some consideration of earlier work.

Sources searched for relevant literature included the Psycho-logical Abstracts, from 1950 through 1971, the Technical AbstractBulletin of the Defense Documentation Center, from 1962 through1971, and the U. S. Government Research and Development Reportsof the Department of Commerce, from 1968 through 1971. ThePASAR automated retrieval system of the American PsychologicalAssociation was employed to search more completely the literatureabstracted in the Psychological Abstracts between 1967 and 1970. Ofcourse, many additional references were found while reading the pa-pers indicated through the above sources.

1

NResults

Many readability formulas have been derived. The majorityare.based on linear regression equations relating various observedcharacteristics of text; i. e., sentence length or word length, to somecriterion of comprehension, such as a comprehension test score. Allof the formulas are highly intercorrelated and are all undoubtedly high-ly correlated with cloze score. Cloze score is based on a relativelynew procedure in which a judgment of readability is based on the per-centage of deleted words in textual material which subjects are ableto replace correctly. Cloze score has gained considerable recent ac-ceptance as a readability criterion. But practical considerations maymake application of one of the many other available formulas more ap-propriate in many instances.

Conclusions

The readability measurement field suffers from the lack of aunifying conceptual or theoretical structure. Clearly, readability ismultifactor in nature. The dimensions of readability must be deter-mined with consideration given to variables within both the text andthe reader. Upon isolation of these variables, studies to indicate howthe variables interact in dynamic reading situations will be required.While the call for a unifying theoretical structure may seem to evadepractical issues, it appears that little real progress can be made inreadability measurement without such a structure. Specific sugges-tions for required research are contained within the body of this re-port.

ii

TABLE OF CONTENTS

Pam.

CHAPTER I - INTRODUCTION 1

Scope and Organization of this Review 1

CHAPTER II- METHODS FOR MEASURING RELIABILITY 3

Rating Methods 3Use Tests 3The Readability Formulas 4Early Formulas 5Detailed Formulas 8Less Cumbersome Formulas 14Recent Formulas 21Cloze Procedures 30Application of Readability Formulas as

"Rules for Writing" 33Multimodal Presentation 34

CHAPTER III- DISCUSSION 39

Estimation of Reading Level 39Discussion of Formulas 41Future Research Avenues 44

REFERENCES 47

APPENDIX A - Glossary of Readability Measures 57

CHAPTER I

INTRODUCTION

It is expected that the mental aptitude and academic achieve-ment levels of enlistees into the U. S. Air Force will drop as thegoal of an all-volunteer military force is approached (Valentine &Vito la, 1970). Project 100, 000 has also had the effect of loweringthe overall academic achievement level of the Air Force. Accord-ingly, it is expected that efforts to increase the comprehensibility ofwritten materials, both those intended for training purposes and thoseused on the job, will yield significant advantages. An example of themismatch between the reading level of military personnel and the read-ability of the materials they use can be found in the work of Vineberg,Sticht, Taylor, and Caylor (1970), who reported that 75 per cent of asample of Army Military Occupational Specialties' reading materialswere written at ^ level six to eight school grades higher than the read-ing level of low- .ave1 (Category IV) personnel and four to six grade lev-els higher than the average reading levels of non-Category IV person-nel. The need tc match the reading level of Air Force technical litera-ture to the reading level of trainees and job incumbents is quite clear.

Scope and Organization of this Review

This report reviews the literature relevant to techniques formeasuring the readability/ comprehensibility of written materials.Use of such techniques could help to improve the intelligibility of AirForce reading materials and by implication reduce the gap betweenthe reading level of the reader and the material he reads. Primaryemphasis within the review is based on readability measurement. How-ever, a final section is included which reviews recent work bearing onthe topic of increasing comprehensibility through multimodal presenta-tion methods.

The term "readability" may be defined as:

the sum total (including the interactions) of all those elements within agiven piece of printed material that affects the success a group of readershave with it. The success is tne extent to which they understand it, readit at an optimum speed, and find it interesting (Dale & Chall, 1949).

OtSt

4011404

Since readability measurement within a training context is the topicof interest here, only the first part of this comprehensive definition- -readability as it affects ease of understanding or comprehension--willbe considered directly.

Within the measurement concept, "readability formulas, " inwhich attempts are made to predict the readability of written materialbased on quantitative analysis of the material, will be considered first.These methods have been traditionally based on items such as averageword length (in letters), average sentence length (in words), frequencyof occurrence of words not appearing on lists of common words, andfrequency of occurrence of prepositional phrases and independentclauses. Mare (1963) reports that various reviewers have countedbetween 29 and 56 such predictive formulas. However, work on thistype of formulation has greatly slowed within the past 15 years (Bormuth,1966; Tannenbaum, & Greenberg, 1968). Studies of readability sincethat time have tended to concentrate on the "cloze" procedure of W. L.Taylor (1953).

The cloze procedure is the second topic of discussion. In thistechnique of readability assessment, the percentage of deleted wordsin a passage correctly filled in by a reader is taken as an index ofreadability. This procedure overcomes the problem of measuring thereadability of technical literature having an unusual, specialized vocabu-lary which is familiar to individuals within a specific milieu--the pre-cise problem of interest in dealing with technical job-related literature.

2

CHAPTER II

METHODS FOR MEASURING READABILITY

Many approaches are possible to the problem of objectivelymeasuring the readability or comprehensibility bf written prose. Themost elementary methods; e. g. , rating methods and use tests, pos-sess serious drawbacks.

Rating Methods

In rating mm thods, judgments of the readability of written manu-als are made by samples which are representative of the intended userpopulation or by persons considered to be expert in the covered field.These judgments are necessarily quite subjective, requiring that alarge number of raters be employed to "average out" the variabilitybetween raters. Raters must also be presented with materials cover-ing a wide range of comprehensibility, so that they may choose theparticular materials exhibiting the optimum level of comprehensibilityfor the intended user population. This procedure may be very expen-sive to use due to the range of written materials which must be prepared,the necessary size of the rating group, and the effort required to inter-pret the ratings. Problems may also be encountered in selectingan ap-propriate rating group and in generalizing from the rating group to theusing population.

Use Tests

The use test method fol. evaluating readability involves admin-istering comprehension (use) tests to a sample of those for whom thematerial is intended, after the material has been read by the personsin the sample. High test scores dre assumed to be associated withhighly readable text, and low scores with less comprehensille text.In addition to the time required to collect and test the sample of users,large amounts of time are required to write tests based on the text.

3

Standardization of the tests is impossible, a priori, and spe-cific tests may well be much too easy or difficult to rate accuratelythe text versions of interest. Moreover, if low scores are attained,it is not known whether the low test scores can be attributed to thecharacteriatics of the text itself or, possibly, to the inability of thetested sample to comprehend in general. Finally, there is %theguidance available regarding whether the tests should test transferof factual information, transfer of main points, ability to apply in-formation, or what ?

The Readability Formulas

Quantitative analysis of written text has become the most popu-lar method of assessing readability. The reliability and economy ofsuch methods may be expected to far surpass that of the previouslymentioned methods. One reviewer (Mare, 1963) reports that be-tween the publications of the first such formula in 1923 and 1959,over 29 "readability formulas" were developed.

Quantitative analysis of written text has been conducted in hopeof finding what determines a readable style. Until very recently, notheoretical model of language behavior was available from which togenerate hypotheses. Hence, the method employed in analysis of texthas been one of correlation. Characteristically, the text is analyzedand possible factors affecting readability (such as sentence length andvocabulary measures) are conjectured.

A set of readings is then collected and ordered by readabil-ity according to a specific criterion (such as reading speed, testedcomprehension, judgment). The chosen variables are then meas-ured, their correlations with the readability criteria determined,and regression equations written. Until very recently, linear regres-sion techniques and an additive model have been used exclusively. Ingeneration of the equations, analysts have added additional factors un-til the increase in predictable variance accounted for by adding al atherfactor was negligible.

In recent years, activity in this area has sharply declined(Tannenberg& Greenbaum, 1968). We may, however, soon witnessa renewal of interest in measurement of readability as the science ofpsycholinguistics grows. Analysis of readability from a theoretical

4

431,Copp

point of view, e.g., Bormuth (1966), may contribute greatly to thescientific understanding of the written information transfer process.Use of electronic data processing equipment in analyzing readabilityis also a great aid in overcoming two of the traditional problems inreadability analysis: the time and effort required to perform the anal-ysis by hand, and reliability of measurement--both across time andacross individuals.

Early Formulas,

The first important attempt to measure objectively readabilityrepresented an attempt to aid teachers in choosing appropriate textsfor their classes.

In the years around 1920, science teachers at the junior highschool level complained that the number of technical and scientificterms present in textbooks intended for classroom use was becomingexcessively large. It was becoming necessary to devote large amountsof class time to teaching the meanings of the new vocabulary, lesseningthe time available to teach scientific concepts. Lively and Pressey(1923) attempted to develop an objective method for determining the vo-cabulary difficulty, or "burden, " of textbooks, so that those books withexceptionally dqficult vocabularies could be avoided.

Their measure was based on a simplification of Thorndike'sTeacher's Word Book of 10 000 Words, published in 1921. This booklists the frequency of occurrence of the most commonly occurring10,000 words in the English language. Lively and Pr..,ssey used a"weighted median index number'' as their measure of vocabulary dif-ficulty. In order to compute the index, a sample of one thousandwords evenly distributed through a text was taken. The individualwords were found on Thorndike's list, and an index was assigned toeach on the basis of its location in the list. For example, those wordsappearing in the most common thousand.(according to Thorndike) re-ceived an index number of ten. Those in the second thousand wereassigned index number nine, and so on through the ten, thousand-wordblocks in the list. Words not appearing on Thorndike's list were as-signed an index number of zero.

5

BESTcopy AVAILABLE

To compute the vocabulary burden, the median value of theindices of the words sampled from the text was determined, count-ing each value of zero twice. Substantial agreement was obtainedbetween the judged difficulty of a wide variety of reading selections,ranging from second grade to college level, and the ordering of theselections by the weighted median index number. These findings en-abled Lively and Pressey to conclude that the weighted median indexnumber provided an estimate of the vocabulary difficulty of texts.Lively and Pressey were aware of the weaknesses in their readabil-ity measure. They pointed out that their 'index relies on the appro-priateness of Thorndike's word count, and this may not be optimalfor their intended applications, since Thorndike's count appeared tothem to be based largely on materials employing a literary and evenpoetic vocabulary. They also admitted that larger samples than onethousand words from an entire book may be more appropriate, al-though they made no effort to evaluate the effects of varying samplesizes.

In addition to its purely historical significance, the readabil-ity assessment procedure of Lively and Pressey is significant becauseit led directly to the first of the "readability formulas" of the moderntype,-that of %V ashburne and Vogel (Chall, 1958)--in which various fac-tors correlating with a selected criterion of readability were combinedinto a single multiple regression equation.

Washburne and Vogel's effort began as one of generating normsfor use with the Lively and Pressey method, a need which was pointedout by Lively and Pressey when they presented their initial report.Working in the Winnetka, Illinois, school system, Washburne and Vogeldetermined the weighted median index numbers of 700 books reportedas having been read and liked during the preceding year by at least 25of the 37, 000 students in the school system. Categorization scorescomprising a weighted median index number were correlated with me-dian grade levels derived from the paragraph meaning section of theStanford Achievement Test for those students who reported readingand liking a given book. it was found that a correlation of . 80 existedbetween grade level and number of "zero-value" words present in thetested sample from the corresponding book (Washburne & Vogel, 1926).This information was summarized in the Winnetka Graded Book List,which received considerable use by parents, teachers, and librariansin selecting books to be made available to children in grades 3 through 9(Chall, 1958).

8E67COPY

AVAILABLE

A need was soon realized for a method of determining the ap-propriate grade level for books published after the Winnetka list wasprepared, without repeating the massive effort employed in developingthe original list.. Accordingly, Washburne and Vogel selected 150 ofthe books on the Winnetka list to isolate the internal factors related toreading difficulty; i, e. , factors which might be effective in distinguish-ing those books read by lower grade pupils from those read by studentsof higher grades. Ten factors were found, and their correlation withgrade level was determined. All factors correlated significantly withthe criterion of grade level, but'only four were subsequently used inderiving a regression equation for measuring readability, since manyof the other factors had very high intercorrelations (Vogel & Washburne,1928).

The readability formula developed by Vogel and Washburne isbased on a systematically chosen sample of 1000 words from the bookto be tested, which is analyzed as follows:

number of different words appearing (x2)

number of occurrences of prepositions (x3)

number of words not on Thomdikeis list of 10, 000 (x4)

numbs of simple sentences in 75 sample sentences (x5)

The difficulty of the book is expressed in terms of grade levels of theStanford Paragraph Meaning Test (X1). The regression equation de-rived was:

Xi = . 085x2 + . 101x3 . 604x4 - 411x5 + 17. 43.

This formula correlated . 845 with the reading test scores of studentswho read and liked the respective books during the previous year. Itis important to note that the criterion employed here is not strictly com-prehensibility of the text. It is confounded by subjective evaluation of thebooks by students and factors such as subject matter and number of pic-tures, which are not of direct interest to the question of readability.

7

RESTCOPY

4411ABLE

In addition to initiating the use of multiple regression equa-tions in the study of readability, Washburne and Vogel's study is sig-nificant because of their conceptual approach to the problem. Theywere the first to: (1) analyze the effects of structural factors in thetext, (2) employ an objective criterion of textual difficulty as opposedto qualitative judgment, and (3) describe readability in terms of schoolgrade levels, as opposed to purely relative measures (Chall, 1958).

The readability measures of Lively and Pressey and of Vogeland Washburne are typical of the early studies of readability in thatthey: (1) measure readability essentially as a function of vocabularyfactors, (2) depend heavily on Thorndike's word list, and (3) employrelatively crude criteria of difficulty of text. Other less significantearly attempts to measure readability include those of Johnson (1930),Patty and Painter (1931), and Thorndike (1934) himself.

Detailed Formulas

The readability studies of Ojemann, Dale and Tyler, and Gray.and Leary are the most significant of the efforts undertaken between 1934and 1938, a period characterized by' evaluation of larger numbers offactors potentially related td readability, reduced emphasis on wordfrequency lists, such as that of Thorndike, and concern over more ob-jective readability criteria.

The year 1934 is considered to be the beginning of the trendtoward detailed analyses of factors relating to readability, in which alarge number of factors, including qualitative ones, were evaluated.The first of these to appear was that of Ojemann (1934), who institutedthe use of comprehension test scores as the criterion of readability.

Ojemann collected 16 parent-education passages, each approxi-mately 500 words in length. The passages were read by adults, andcomprehension tests were given covering each passage. A reading achieve-ment test was also administered. The grade-level of difficulty of eachpassage was taken as the mean tested reading level of those people whocorrectly answered at least half of the comprehension test questionsfor that selection. After arranging the 16 selections in order of diffi-culty, Ojemann analyzed their contents for eight sentence factors, six

aer

vocabulary factors, and three qualitative factors. All of the 14 quan-titative factors (sequence factors and vocabulary) correlated signifi-cantly with the criterion.

Of the sentence factors, three exhibited correlations of . 60or above. These were: number of simple sentences, number ofprepositions, and number of prepositions plus infinitives. The fivesentence factors having correlations of less than .6 with the criteri-on of difficulty were: number of complete sentences, number of com-pound sentences, number of dependent clauses, mean length of de-pendent clauses, and ratio of total words in independent clauses tothe total words in a selection.

All six vocabulary factors correlated . 60 or higher with thecriterion. These were: percentage of words in Thorndike's first1000, percentage of words in Thorndike's first 2000, percentage ofwords known by 70 per cent of sixth-grade pupils, percentage ofwords known by 90 per cent of sixth-grade pupils, mean difficultyof different words, based on Thorndike's list, and mean difficultyfor each word.

No attempt was made to write a regression equation, sincethe qualitative factors, concreteness versus abstractness of rela-tions, obscurity in expression, and incoherence in expression, ap-peared to have considerable importance in determining the difficultyof the passages, although they could not be quantified. Instead of aformula, Ojemann presents his 16 tested selections, with their re-spective values for all the quantitative factorb and tests of grade-lev-el difficulty. This type of presentation was intended to allow evalua-tion of the qualitative factors in testing new passages, in addition tocomparing new passages according to the quantitative factors.

9

In addition to being the first application of comprehensiontest scores as a criterion of reading difficulty, Ojemann's studywas the first Lo employ adult subjects and adult reading materials.Furthermore, he was tly.: first to demonstrate the'importance ofqualitative, nonstatisticalfactors in the determination of readabil-ity.

Dale and Tyler (1934) evidenced vdry great concern over therange of applicability of then-available re,:dability measures. Theirinterest was in defining those factors determining the difficulty ofreading materials dealing with health education for adults of extreme-ly low reading ability. They stated that:

A critical analysis of the widely varying results of previous studies indi-cates the impossibility of determining the factors in the reading materials whichmake them understandable unless the investigations separate the influence offactors within the reading materials from those outside. Thereader's interestin the topic treated in the reading matter, his ability to read, the kind of com-prehension appropriate to the purposes of the reading matter, and the difficultyof the ideas developed in tht: reading matter are all factors which greatly affecthis comprehension of the rnaterial read but are distinct from the characteristicsinvolved in the materials themselves... the various factors not in the readingmaterials themselves must be controlled in order to determine the effects offactors within the materials (pp. 384-385).

In order to isolate the factors which affect reading difficulty,given a fixed topic, a fixed readership, and a fixed level of compre-hension, Dale ana Tylor asked adults of low reading ability to readhealth education articles collected from newspapers, magazines, andbooks. The readers then completed a multiple-choice comprehensiontest designed to measure their understanding and retention of the mainideas of the selections.

The type of comprehension test employed was one of the most dif-ficult that could have been used. Questions did not deal with particularitems of information which could be remembered with relative ease.by the subject. Rather, the subject was required to integrate the en-tire selection and determine its main point. Additionally, the testquestions were written so that of the five possible responses to each,one option was "best" and the others were characterized by varying de-grees of correctness; i. e., accurate statements of secondary points

10

OtstCopp

44494

of the article or slightly Inaccurate statements of the points in thearticle, requiring the subject to know what was covered in the selec-tion, and also what was not covered. Good performance on such atest must have been a formidable task for adults determined to bereading at the third to fifth grade level--the level of these subjects.Not surprisingly, adequate comprehension was not obtained on any ofthe collected reading materials for determining the correlation betweencomprehension and the 29 quantitative factors.

To obtain sufficient comprehension, Dale and Tyler found it nec-essary to write their own, much simplified, test passages. Three vin-

'ciples were found which produced articles with the required ease of read-ing: (1) use of very basic vocabulary, (2) use of informal style charac-terized by conversational manner and anecdotal examples, and (3) free-dom from digression from the topic of interest.

After rewriting., restoring, and readministering their materials,Dale and Tyler found that of the 29 factors studied, 10 correlated signi-ficantly with comprehension. Of these, three were included in a regres-sion equation for predicting comprehension; i. e. , technical words in thepassages (x2), the number of nontechnical words not known to 90 per-cent of sixth grade pupils (x3) [from an unpublished study by Dale], andthe number of indeterminate clauses (x4).

The equation predicted the percentage of adults of reading lev-els of grade 3 to grade 5 who could understand the main point of a pas-sage (X1) based on the above measures, and took the form

X1

-9. 4x2

- 0.4x3 + 2 2x4 + 114. 4.

Predicted comprehension correlated , 51 with the criterion.

A comprehensive study of factors influencing readability wasperformed by Gray and Leary (1935). They listed 82 factors relatingto readability as drawn from a study of: (1) adults' and children's read-ing materials, and (2) recommendations of adult students, teachers,and professors of English. Of the 82 factors, 18, such as "image bear-ing words, " "physical and psychic association, " etc., were discardedbecause they seemed to "defy objective measurement. " Twenty more

11

irsi

factors were discarded because of their infrequency of occurrencein the'samplea materials. Twenty of the remaining 44 factors cor-related significantly (r > . 27) with comprehension test scores of asample of 756 adult subjects who were given general adult readingmaterials, both fiction and nonfiction. The subjects were selectedin such a way as to be representative of the then current populationof adult readers.

When Gray and Leary separated out their upper quartile ("good"readers) and their lower quartile ("poor" readers), the factors corre-lated differently (by group) with the comprehension test scores. It wasfound that vocabulary measures were most highly correlated with com-prehension scores for poor readers (those achieving comprehensionscores in lowest quartile), while readers scoring in highest quartileshowed comprehension scores to be most closely related to sentencestructure and length.

Gray and Leary provided an entire family of regression equationsrelating quantifiable factors to predicted comprehension test scores.For low ability readers, as defined by the comprehension test admin-istered, their initial equation included eight factors, and correlated64 with their criterion. Further analysis showed that considerable

reduction in the number of factors included in the equation resulted inminimal change in the multiple correlation with the criterion measure,or in the probable error of the estimate. Nine formulas were presentedemploying various sets of four of the original eight factors. Multiplecorrelations computed for these formulas varied from .6350 to .6402,and the probable errors of the formulas varied from .2956 to .2975.These findings allowed Gray and Leary to suggest th't in employingtheir measure of readability, one of the four-factor formulae be em-ployed and that the sole criterion for choosing between them be theease of measuring the factors required in each formula.

However, the most popular application of Gray and Leary'sresults has been based ou their five-factor formula, which demonstratesa correlation of .64 with the criterion, and the lowest probable errorof all (by . 0004). This formula takes the form:

where:

X1

= 01029x2

+ 009012x5

- 02094x6

- 03313x7

-. 01485x8 + 3.774

12

X1

=2

x5

x6 =

x7 =

x8 =

predicted average comprehension score, alonga scale of +4 to -4

average number of hard words (words not appear-ing on Dale list of 769 easy words) appearing insamples*

average number of 1st, 2nd, and 3rd person pro-nouns appearing in sample

average sentence length in words

average percentage of different words in sample

average number of prepositional phrases appear-ing in sample

Such an equation obviously lacks broad utility since it is only applic-able to low ability readers--the sample on which it is based.

Three hundred fifty books were tested using the five factorformula. The distribution of predicted difficulty of the books was verynearly normal. Five categories of difficulty were defined and labeled"A" ("very easy") through "E" ("very difficult"). Predicted compre-hension scores in the "very easy" category ranged from 1.46 to 1.15,and the "very difficult" scores included values from 0.22 to -0.09.Correlation of indiv41uals' comprehension scores with their respectivetested reading grade levels indicated that category A materials werewritten at approximately the 2nd to 3rd grade levels, category B wasat roughly the 4th grade level, and category C at the 6th grade to JuniorHigh School level. The two highest difficulty categories did not corre-late reliably with reading grade level.

*a 100 word passage from each chapter analyzed.

13

8E41°PI 41/4/480,

Other narrower studies of this period are included in the sum-mary table in Appendix A and are briefly discussed by Mare (1963).

Less Cumbersome Formulas

The next trend in readability research, that toward more effici-ent and easily applied formulas, saw the development of the two mostpopular and widely applied readability measures--the Flesch and Dale-Chall formulas.

The appropriateness of searching for efficient formulas wasdemonstrated by Lorge (1939). He recomputed correlations and re-gression equations based on the structural variables contained in thefive factor formula of Gray and Leary. Lorge used as criterion ma-terials the reading passages of the Standard Test Lessons in Readingof McCall and Crabbs. This is a set of 376 reading passages normedin terms of number of comprehension test questions correctly answer-ed, and related to grade levels of the Thorndike-McCall Reading Scale.Since its initial application by Lorge, this has become the most fre-quently used and highly respected criterion in studies of reliability(Klare, 1963).

Using the variables of Gray and Leary, Lorge obtained highercorrelations with his criterion than had Gray and Leary with any oftheir formulas. He produced two regression equations, each based ononly two of the Gray-Leary variables, x2 and x6, and x2 and x8 (intheir notation). Multiple correlations of . 7406 and 7456, respective-ly, were reported. This is the first instance in which over one-half ofthe variance in a pure measure of comprehension has been accounted forby a readability formula. Lorge attributed his high multiple correlationscompletely to his use of more adequately standardized criterion materi-als and his use of a larger sample of criterion materials--376 passages,as opposed to 48 for Gray and Leary.

After study of additional variables, all of which were measuresof aspects of vocabulary, Lorge concluded that other variables add in-significantly to the predictive accuracy attainable from vocabulary alone.He suggested that this finding may be attributable to insufficient reliaLil-ity of available criteria, including that of McCall and Crabbs. But, hecontended that improved criterion measures would not negate his thesisthat vocabulary is the most important determinant of reading compre-hension.

14

ars,copr

40/48te

Lorge did not present his own readability formula until 1944(Lorge, 1944). His formulation included two of the Gray-Leary fac-tors, x6 (average sentence length in words) and x8 (number of prepo-sitional phrases per hundred words), as well as x9 (ratio of hard wordsto total words in the sample). Lorge categorized these as a sentencestructure factor, idea density factor, and vocabulary factor, respec-tively, and considers them the primary structural elements relative toreadability. A "hard word, " to Lorge, was one that does not appear onthe Dale list of 769 easy words. This list includes words common toThorndike's first thousand most frequent words and the first thousandmost frequent words known to children entering the first grade, fromthe International Kindergarten List (Gray & Leary, 1935).

Reanalysis of Lorge's data by Dale uncovered certain compu-tational errors, and Lorge's formula was subsequently recomputed in1948 (Lorge, 1948). The final formula is:

C50 = 10x

2+ 06x

6+ 10x

8+ 1.99

where C5a is the reading grade level of individuals answering one-halfof the Mcc;a11-Crabbs questions on that passage correctly. The ob-tained predictor-criterion correlation was . 6 ?, and addition of othervariables raised the correlation to .705.

Lorge is the first to emphasize that readability formulas ingeneral are not to be construed as prescriptions for writing, but areonly usable as an approximate of the reading difficulty of a passage.

Rudolf FlesCh was strongly dissatisfied with the then availablemeasures of readability as they could be applied to adult reading ma-terials (Flesch, 1943a). Readability formulas, when applied to adultliterature, such as magazines, tended to rankthe literature in ordersfar different from the ordering that wa.s produced by judgment or know-ledge of the educational levels of readers of each magazine. Fleschthought that this misranking.was due to excessive emphasis on rangeof vocabulary and insufficient stress on other factors.

15

litsi copy

Flesch hypothesized three types of factors that should highlyinfluence readability for adults. Pirst, he suggested that sentencelength should be an important correlate of adult reading difficulty,although it did not appear to be for children. He found support forthis idea in the work of Gray and Leary (1935), who found sentencelength to be an important variable associated with readability forgood readers only. Second, Flesch hypothesized that abstractness wascorrelated with readability for adults. Third, he thought that the read-er's interest in the topic of a reading passage should influence its read-ability.

Based on Lorge's data employing the McCall-Crabbs StandardTest Lessons in Reacam, Flesch developed his initial formula, inwhich adult reading material would be assigned to a wide range ofschool grade levels on the basis of: (1) average sentence length, (2)number of affixed morphemes (affixes and suffixes, with certain ex-ceptions), an index of abstractness, and (3) number of words of per-sonal reference (pronouns, names, words directly relating to people,such as aunt, baby, etc. )(Flesch, 1943b). His regression weightswere subsequently recomputed by Lorge when computational errorswere found by Dale (Lorge, 1948).

Almost simultaneously, due to difficulty of application, ap-parent lack of sensWOity to the human interest factor, and misuse ofthe formula as a rule for writing, Flesch completely revised his formu-la and published his "reading ease" and "human interest" formulas(Flesch, 1948). The two new formulas were again standardized againstthe McC..11-Crabbs lessons. The "reading ease" formula appeared as11. E. = 206. 835 - 846 WL 1. 015 SL, in which SL is average samplesentence length in words and WL is word length measured as syllablesper 100 words, Syllable count was substituted for the earlier count ofaffixes. The correlation between the two measures was .87, and thesyllable count was expected to be more easily and reliably taken.

The "human interest" index is presented as:

H. I. = 3. 635PW + . 314PS

where PW is average percentage of personal words, using a slightly

16

614,

narrower definition than previously, and PS is the percentage ofsentences spoken or addressed to the reader, including exclama-tions, or grammatically incomplete sentences whose meaning mustbe determined from the context. This factor tests the "conversa-tional quality" of the passage, and its inclusion represents an attemptto bring out the easy readability of direct conversational style. Thereading ease formula correlated 70 with the McCall-Crabbs; the hu-man interest index correlated .43 with the McCall-Crabbs,

The new formulas were held by Flesch to locate tested passageson scales which range from 0 to 100. A reading ease score of zero isconsidered to represent "practically unreadable" material, while 100represents text which is easily read by any literate person. A humaninterest score of zero indicates no human interest, and 100 indicatesthat the passage is "full of human interest."

The Flesch formulas have become the most widely applied inthe entire history of readability research. This wide application isdue in part to the ease of computation of his formulas and partly tothe wide exposure given to his formulas through a long series of popu-larized books (e.g., The Art of Plain Talk [Flesch, i946], The Art ofReadable Writing [Flesch, 1949], and How to Test Readability [Flesch,1951]). These books saw wide circulation in business, governmental,and journaliztic circles, where they were employed as rules for writ-ing (Chall, 1958).

A simplification of the Flesch reading ease formula was pro-posed by Farr, Jenkins, and Patterson (1951), who observed a corre-lation of -.91 between the average number of syllables per word andthe number of one syllable words in a passage. Accordingly, theysuggested that the number of one syllable words be used in Flesch'sformula so as to r.ermit more rapid testing with no loss of reliability.Farr, Jenkins, and Patterson generated such a formula, and found it tocorrelate .95 with scores produced by the Flesch formula. They there-fore concluded that their formula,

Reading Ease = 1.599 (number of one syllable words)

- 1.015 (mean sentence length in words) - 31,517

should be considered an acceptable substitute for the Flesch formula.

17

itir

Both Flesch (1952) and Klare (1952) immediately criticizedthis proposal. Flesch maintained that accuracy would be lost in eval-uating very easy or very difficult materials, and Klare suggested thatthe reliability of counting one syllable words should be lower than thatof counting syllables in the manner suggested by Flesch. England,Thomas, and Patterson (1953) experimentally tested these criticismsand were able to discount them completely. The Farr, Jenkins, andPatterson formula has since received a moderate amount of accept-ance--considering the rate at which it is reported as being used inapplied evaluations of materials. Chall (1958) pointed out the ironyof this return to word length as a criterion, of difficulty, one of thevery factors Flesch considered unsuitable.

Dale and Chall (1948) employed the Flesch formula to evaluateeducational materials published by the National Tuberculosis Associa-tion in terms of readability for the average adult. They eventuallysought to develop their own measure, because of two drawbacks en-countered in the use of the Flesch formula. First, they found low be-tween rater reliability for the count of affixes necessary for the Fleschformula. Although Dale and Chall expressed considerable respect forthe justification, presented by Flesch (1943b), for using a count of af-fixes as an index of abstractness, they wondered (since affixes corre-lated. 78 with abstractness) whether or not some other and more man-ageable correlate of abstractness could be employed. This question issupported by Lorge (1944), who stated that all measures of vocabularyload, including abstractness, are intercorrelated. Second, Dale andChall considered the use of personal references in the Flesch humaninterest formula to be oversimplified. References to senators, thoughpersonal, in the Flesch sense, are not generally associated with a low-ering of abstractness of text, as are references to "Dad" or "John, "which are generally associated with a very concrete type of statement.

Again using the McCall-Crabbs test data originally collectedby I,orge, Dale and Chall derived a new regression equation. Thisequation is based on average sentence length, and the "Dale score"--the relative number of words in the text samples not appearing ona list of 3000 words known to 80 per cent of a sample of fourth grad-ers. The form of the equation was:

1579 (Dale score) + . 0496 (sentence length) + 3. 5365XC50

in which the criterion, X is the reading grade level of an individualC50'

18

able to answer correctly half of the comprehension test questions.The score yielded by this equation correlated .70 with the McCall-Crabbs criterion.

Another formula based on the Flesch formula is that ofGunning (1952). In this formula, reading grade level necessary tounderstand tested material is equal to .4 of the sum of mean sen-tence length in words and percentage of words of three or more syl-lables. No numerical correlation was reported between this and oth-er estimates of reading difficulty, although there is little reason tobelieve that the correlation was low.

The McCall-Crabbs Standard Test Lessons in lieadint, whichwere the basis of the last five formulas discussed, were developed in1926. They were revised in 1950, and at least 60 of the passages inthis more recent edition are entirely new, dealing with modern topics,such as atomic energy and aviation (Powers, Sumner, & Kearl, 1958).Based on this revision of tilt_ McCall-Crabbs tests, Powers, Sumner,and Kearl undertook to recompute the Flesch, Da le-Cball, Farr-Jenkins-Patterson, and Gunning indices of readability. They hoped to produceformulas accounting for changes in reading abilities over the twenty-four year period between test editions, and to facilitate comparison ofthe formulas, since they would be based on identical measurement rules,mathematical procedures, etc.

The recomputed formulas and their respective correlations withthe constant criterion of grade score of pupils answering one-half of thetest questions correctly are:

Flesch:-2.2029 + (n0778)(mean sentence length) + (.0455)(number of syllables per 100 words)

Dale-Chall:3.2672 + (. 0596)( mean sentence length,' + (. 1155)(percentage of words not appearing or :)ale list of3000)

10

r = .6351

r = .7135

Farr - Jenkins - Patterson:8. 4355 + (. 0923)(mean sentence length)- (. 0648)(percentage of one syllable words)

Gunning:(.0984) (percentage of words of three or moresyllables)

test

r = . 5836

r = . 5865

In support of their recalculations, Powers, Sumner, and Kearlpointed out that the revised formulas give estimates more consistentwith one another than did the original Flesch and Dale-Chall formulas;the other formulas were not compared in their original forms. It wasconcluded that the Dale-Chall formula is the "best, " since it had thehighest predictive power (correlation) and the lowest standard error,. 77 grade levels, of the four formulas tested.

The final readability measure of this period to be discussed isthat of McElroy. This measure was unfortunately called a "fog count, "the same term applied by Gunning to his formula. This duplication ofnames has caused confusion on the part of some later researchers,such as Kincaid (1972).

McElroy's fog count is, again, related to the Flesch formulain that it is based upon a count of syllables (Klare, 1963). The pro-cedure to be followed in using this formula is assign a value of 1 toeach word of one or two syllables appearing in the sampled passage,and a value of 3 to each remaining word--which will have three ormore sylls.bles. The assigne"I values are added and the value thus de-termined is the fog count. '^o determine the reading grade level as-sociated with a particular fog count value, if the sum is over 20, it isdivided by 2. If the sum is under 20, 2 is subtracted and the resultis divided by 2.

No statistical information relating to the development, or ac-curacy of this formula is available (Kiare, 1963; Kincaid, 1972).

20

e ent Formulas

Klare (1963) considers the period 1953 to 1959, when his re-view was written, to be one exemplified by a trend toward speciali-zation in readability formulas. However, it seems that there is littleto be gained in naming a trend on the basis of only seven formulas ofrelatively minor importance which appeared in a six year period. Ex-tension of the period to include all recent studies of readability meas-urement techniques appearing from 1953 to the present does, however,appear appropriate and renders impossible the naming of a trend ofstudy characterizing this recent period.

During this period, in addition to formulas intended for a limitedrange of application, additional readability measures intended for gen-eral application were developed, and two new approaches to the problemof measuring readability were presented.

During the last 18 years, four formulas appeared which are in-tended for application to primary school texts. These are the formu-las of Spach.: (1953), Wheeler and Smith (1954), Bloomer (1959), andTribe (1956).

The formula of Spache (1953) predicts the primary grade level(grades 1-3) of textual material on the basis of average sentence lengthin words (x1) and percentage of words not appearing on the Dale list of769 common words (x2). The formula predicts grade level as equal to:.141x1 + . 086x2 + .839. The reported correlation between predictedscore of tested materials and usual grade level of application was .818.

Wheeler and Smith (1954) based their formula for prediction ofgrade level on the grade 'designation assigned by the publisher of thetextual materials in their sample, Their formula equated grade lev-els to 10 times the product of the mean length of units (sentences,with minor exceptions) in -words and the percentage of multisyllablewords. The value thus determined is located in a table and grade lev-el of 1 through 4 read ofi. This was the first formula to be based ona multiplicative model of factors. This type of equation allows inter-actions between the factors (as discussed by McLaughlin [1969]), sothat a particular change in factor A will affect the predicted value dif-ferently at varying levels of factor B. This may be a highly important

21

BEST COPY/4144kt

characteristic for a readability formula, in the light of data such asthat of Gray and Leary (1935), showing that the relative importanceof vocabulary and structural factors does not remain constant acrossall reading levels. Multiplicative formulas may be able to deal moreappropriately with findings. of this type than do additive formulas.But, it is doubtful that the simplified formula of Wheeler and Smithis a very large step in the proper direction.

The formula of Bloomer (1959) again used customary gradelevel application as the readability criterion. In Bloomer's develop-ment, reading grade level is predicted from abstraction level as indi-cated by the number of words per modifier (modifying phrase) and"sound complexity" (sic) of modifiers. Bloomer contended that thesevariables may be used as predictors of readability, although he pre-sented no formula, predictive method, or method for application.His approach is based on the observation that abstraction in text in-creases with grade level, and that the two variables (words per modi-fier and sound complexity) employed are closely associated with ab-straction. The multiple correlation between the two variables andassigned grade level was .78, whi'h Bloomer considers to comparefavorably with the correlations obtained through the procedures ofFlesch and of Lorge.

The McCall-Crabbs Standard Test Lessons in Reading, 1950revision, were again used as a criterion by Tribe (1956). Grade levelscore of children, in grades 2 through 8, who could correctly answerone-half of the reading test questions was found to be equal to:

.0719x1 + .1043x5 + 2.9347

where xi is the average sentence length and x5 is the percentage(times 100) of words not appearing on the Rinsland word list. A cor-rection factor is then applied to the pr edicted grade level. This fac-tor is based on a table presented by Dale and Chall (1948).

22

An unusual criterion was employed by Jacobson (1965) todevelop formulas to determine the readability levels of high schooland college chemistry and physics texts. The predicted criterionmeasure was the average number of words indicated as not beingunderstood (as indicated by reader underlining) by readers, basedon 200 word sample passages. This procedure was first employedin 1928, and is known as the Kyte test. Test-retest reliability ofthis test over a one week interim was reported by Jacobson to be.95 for the physics texts and .85 for the chemistry texts in hissample. Two formulas are presented - -one for physics texts andone for chemistry texts. The variables included are;

X mean underlining score (the criterion)

x1

mean number of words per independent clausein each 200 word sample passage

x2 mean number of mathematical terms in thesampled passages

x3 mean number of words in the sampled passagesthat are above the 6,000 word level in Thorndike's20,000 word list

x4 ; mean number of (technical) words per passage'which do not appear in the Powers list of 1828essential scientific terms

The formula derived for use with physics texts is:

X = -.0003 + 29.7059x2 + 21. 119x3 + 35.0029x 4

which exhibited a correlation with the criterion of .70.

23

The formula for chemistry texts is:

X = . 003 + . 1706x1

13.+ 13 7231x2

- 43.7262x3 - 2. 3577x4

Predictions from this formula correlated .67 with the criterion.

Flesch introduced three additional readability indexes (the term"formula" is probably inappropriate here) in 1954 and 1958 (Flesch, 1954,1958). All are relatively subjective measures in which counts are made,based on 100 word samples of text, and the counted values are convertedto arbitrary scales through reference to conversion tables. All werevalidated by inspection only.

The "r" score (Flesch, 1954) is an index of realism, based onthe number of references to specific human beings, their attributes orpossessions, locations, objects numbered or named, dates, times,and colors.

The "e" score (Flesch, 1954) is an index of energy, based onindications of voice communication, such as inflection.

The "formality-popularity" scale (Flesch, 1958) is based onthe total numbers of: capitalized, underlined or italicized words, num-bers (not spelled out), punctuation marks, symbols (#, $, etc.), be-ginnings of paragraphs, and endings of paragraphs.

Forbes developed a readability measure intended for applicationto the instructions and items contained in all types of standardized testsand opinion polls, except vocabulary tests (Forbes & Cottle, 1953). Todetermine a test's reading grade level, each word appearing difficultto the grader is looked up in the 1942 Thorndike Junior Century Dic-tionary, and its listed frequency of occurrence, from the first to the20, 000th words noted. The total of all indices above 4 is computedand divided by the total number of words sampled. This vocabularyindex is luoked up in a table, which indicates the corresponding gradelevel.

24

114rOpp

They reported that the readability of tests measured usingthis procedure correlates . 96 with the average of the readability ofthe same material calculated using five other procedures, includingthe Dale-Chall and Flesch methods. According to Forbes and Cottle,at the time of this work, no readability measure directly applicable totest forms was available. However, in view of the reported high cor-relation (. 96) between this technique and the others, it seems that thecontribution is minimal. Additionally, if the other measurement pro-cedures are not appropriate to this application, what is the value of theForbes and Cottle contribution? No other method of assessing the read-ability of the test materials is reported.

Six new readability measures of the traditional form have ap-peared in very recent years. The first of these was that of Coleman,developed in 1965 and discussed by Szalay (1965). Coleman developeda family of four formulas, using one through four measured variables,respectively, to predict readability level. The readability criterionin this case was the mean "cloze" score on the passage achieved bya sample of college students. (The "cloze" score is the percentageof deleted words of a passage that are correctly guessed and writtenin by a subject. This approach to readability is discussed in subse-quent paragraphs. )

Correlation's among the four formulas and criterion scoresvaried between . 85 and . 91 when tested independently by Szalay andColeman (Szalay, 1965). It is therefore recommended by Szalay thatonly the simplest formula be employed:

Predicted cloze score = 1. 29 (percentage of one syllable words)- 38. 45.

Factors present in the other three formulas but not making significantadditions to predictive ability are: (1) sentence length, (2) frequencyof occurrence of pronouns, and (3) frequency of occurrence of prepo-sitions.

25

BESTCOPT

AVAILABLE

A very similar formula was developed by HuMRRO personnelfor measuring the readability of Army technical literature (Caylor,Sticht, Fox, & Ford, 1972). Their measure is called the FORCASTformula after its developers, FORd, CAylor, and STicht. They con-sidered existing formulas inappropriate for their purpose becauseschool students and school or general texts had been most often em-ployed in developing readability formulas. This type of standardiza-tion was believed to make the applicability of prior formulas to tech-nical publications for adults suspect. Moreover, application of manyof the existing formulas required special grammatical or linguistic

,competence on the part of the person attempting to apply the formulas.Ford, Caylor, and Sticht elected to use doze score as the criterion ofreadability. They believed the doze test to be more objective thanmultiple choice tests or the other more traditional indices of corn-prehensioi They also pointed out that doze has "consistently yieldedvery high correlations with multiple choice tests and other more sub-jectively constructed measures of comprehension and difficulty"(Caylor, et al., 1972, p. 12).

Additionally, as part of their own work, they found a corre-lation of approximately .80 between doze score on 150-word passageschosen from the readings required in a wide range of Army jobs andachieved reading grade level, as measured by the United States ArmedForces Institute Reading Achievement Test III, Form A, AbbreviatedEdition.

Previous research had indicated that if a doze score of 35 percent is achieved by a given subject on a particular test passage, thenit may be reasonably expected that the subject will correctly answerapproximately 70 per cent of a set of multiple-choice questions basedon that passage. Hence, doze score appears to be a good indicator ofboth comprehension and reading achievement level.

A doze score of 35 per cent was arbitrarily, chosen as the cri-terion of potentially adequate comprehension of a text passage. Thereading grade level of a passage was defined as the lowest readinggrade level (as determined by USAFI Reading Achievement Test score)at which 50 per cent of the tested subjects achieved a doze score of35 per cent or higher on that passage.

26

Oar

A literature search provided Ford, Caylor, and Sticht witha list of 15 structural properties of text that had been applied inprevious readability formulas and required no special competence orequipment to measure. Correlations between cloze score and each ofthe structural properties were computed and several regression equa-tions were derived. Their preferred formula employed only a singlefactor, number of one-syllable words per passage. This factor isvery easily measured, and basing the equation on additional factorsallowed no practical increase in predictive power.

The FORCAST formula predicts reading grade level as equal to:

20 - (number of one-syllable words/10)

The correlation between predicted reading grade level of a passage withtested reading grade level associated with 35 per cent cloze score was.87. A subsequent application using new test passages and new sub-jects produced a correlation of .77.

The two last discussed formulas are extremely simple in form.However, the inclusion of additional factors in the formulas allowed in-consequential gains in predictive power, while greatly increasing theeffort required in applying the formulas. In contrast, the developersof the remaining four readability measures of recent years set outwith the stated purpose of attempting to provide a method of measur-ing readability with minimal effort.

Smith and Senter (1966) developed a readability equation whosedata may be collected from mechanical counters easily installed on anIBM Selectric typewriter. This technique allows measurement of read-ability at essentially rough draft typing speed. Mechanical countersare used to record the numbers of kay strokes, blank spaces, andsentences (an equal sign must be typed at the end of each sentence; thenumber of activations of this key indicating the number of sentencestyped). From these counts, the mean number of words per sentence[number of spaces divided by number of sentences (w/s)] and the meanlength of words (number of strokes aivided by number of spaces (s/w)]may be computed. Based on examination of graded school texts, theregression equation predicting grace level (GL) from the above ratiosis:

GL =0.50 (w/s) + 4.71 (s/w) 21.43.

27

BEST COPY AVAILABLE

This may be simplified to yield the arbitrarily scaled AutomatedReadability Index (ARI) equal to (w/s)+ 9 (s/w). The authors sup-port use of the ARI instead of predicted grade level because consider-able variability exists in characteristics of texts written for schoolstudents beyond the junion high school level. Accordingly, a precisestatement of grade equivalent appears inappropriate. It is also pointedout that readability, as measured by a formula such as this, increasesmore slowly with grade level at high levels than at low ones.

Dismissal of prediction of grade level removes the need tocomplicate the formula by attempting to deal with this nonlinearity.

As advantages of this procedure of estimating readability,Smith and Kincaid (1970) pointed out: (1) the speed of data collectionthat is possible, (2) the concrete nature of the acquired data, makingits collection extremely objective and reliable, and (3) the ease withwhich it could be incorporated into modern computerized typesettingmachinery.

Coke and Rothkopf (1970) have adapted the Flesch readabilityformula for automatic computation by computer. The determinationof words per sentence is straightforward in their algorithm, but wordlength in syllables is indexed by number of vowels per word. Usingthese two variables, their program produced Flesch reading easescores that exhibited a correlation of . 92 with scores calculated usingthe normal procedure. They discount the practical utility of theirprogram, but indicate that it is useful for testing the adequacy of textsampling procedures. They present a graph showing the probabilityof miscalculating reading ease score by five points or more as a func-tion of sample size.

F'ry (1968) presented his readability measure as a simple graphon which grade level may be looked up, given average sentence lengthin words and syllables. With this presentation, he hoped to reducegreatly the amount of time required to compute an index of readabil-ity, thereby increasing the popularity of such a measure. His gradelevel designations were based on inspection of textbooks used at vari-ous grade levels. The graphic presentation employed permits thereadab4ity measure to reflect accurately nonlinearities in the gradelevel function without resorting to higher order equations or arbitraryscaling.

28

McLaughlin, a psycholinguist, developed a readability formu-la which is based on the 1961 revision of the McCall-Crabbs Test

AMIONM11.

Lessons. His formula is based on a multiplicative rather than addi-tive model of factors (McLaughlin, 1969). McLaughlin feels that wordlength, a measure of semantic difficulty, and sentence length, a meas-ure of syntactic difficulty, interact with reading difficulty in a mannerthat cannot be accounted for by additive formulas. Interestingly enough,McLaughlin found that his readability formula, based on a multiplica-tive model, could be computed more easily than any previously existingmeasure. He first points out that the step of multiplication of wordand sentence lengths may be avoided, since a count of syllables in aset number of sentences is an equivalent procedure. Ile next pointsout that even counting all the syllables in N sentences is unnecessary.He found that the number of syllables in 100 words is equal to threetimes the number of words of over two syllables, plus 112. He thenderived his regression equation, and found that by adjusting testedsample size the formula for predicting grade level necessary to an-swer 100 per cent of the McCall-Crabbs questions correctly could begreatly simplified. A very close approximation to his formula is pre-sented:

SMOG Grade = " + square root of polysyllable count:in 30 sentences

Predictions from this formula correlated at approximately . 70 withthe criterion. The term SMOG grade, or SMOG count, is in tributeto the FOG count of Gunning, the first application of the count of num-ber of polysyllabic words to readability determination and to the charac-teristic atmospheric condition of his home city--London.

*number of words of three or more syllables.

29

Cloze Procedures

8E3rCOPP

41/4148lE

The doze procedure, which has been briefly mentioned pre-viously, was introduced by Taylor (1953) as a measure of readabiLtythat is free from many of the disadvantages of traditional readabilitymeasures. In the doze method; samples of text are presented withsome words deleted and replaced by blank spaces. The subject's taskis to fill in the blank spaces with .he correct words. The name "doze"was applied to this procedure by Taylor because of its resemblance tothe principle of closure of the Gestalt school of psychology: the "humantendency to complete a familiar but not-quite finished pattern--to 'see'a broken circle as a whole one... by mentally closing the gaps. "

In his initial report, Taylor presented data showing that thedoze procedure consistently ranked tested "standard" reading passagesin the same order as the readability formulas of Flesch and of Dale-Chall. He also indicated that the rank ordering of doze scores is main-tained regardless of system of word deletion employed, be it every nthword or random, with low (10 per cent) or high (20 per cent) rates ofword deletion. A random or, equivalently, every nth deletion systemis strongly defended. If enough words are deleted, all kinds of wordsare deleted in the proportion in which they actually occur in the text.

Analysis of the effects of scoring for only precise matcheswith the deleted word as compared with applying the more tedious pro-cedure of accepting synonyms as correct responses raises all scoresequally. Accordingly, there is no effect on discriminability and themore difficult procedure is not warranted.

Taylor also demonstrated that the doze test can handle unusualmaterials more effectively than readability formulas. Gertrude Stein,for example, writes in quite short sentences with a fairly simple vo-cabulary. However, her style is such that her material is very dif-ficult to read. This is accurately reflected by the doze test, but theFlesch and Dale-Chall formulas rate sample passages taken from herwritings as very easy reading.

30

SESTcOPY

411411481E

Taylor (1957) found correlations of . 70 to .80 between dozescores and comprehension scores of Air Force trai.ttees readingtypical Air Force technical material. Bormuth (1968) found cor-relations of . 90 to . 96 between doze scores and scores on testsof comprehension of passages from the Gray Oral Reading Tests.Bormuth's questions were of the transformational type, measuringretention of "facts" from the passages only. In constructing ques-tions of this type, one word or clause of a statement in the passageis deleted and replaced by a question marker. The answer to thequestion, then, is the deleted element. For example, "The boy rodethe horse, " becomes "Who rode the horse?" Bormuth indicated thatlimiting the test to questions of this type circumvented the problemof poor matching of the readability levels of the passages and the tests,since the questions are determined by the sentences of the passage.

Rankin and Culhane (1969) similarly correlated doze scores andcomprehension scores, but without limiting the types of questions in-cluded in the costprehension tests. Their test questions included itemsrelating to voc4ulary, fact, sequence, causal relationship, main idea,inference from facts, and author's purpose. They obtained a correla-tion of . 68 between doze scores based on excerpts from encyclopediaarticles and comprehension test score.

Bormuth (1967) determined the doze scores corresponding to:(1) 75 per cent comprehension test score, the comprehension levelgenerally considered necessary to allow effective classroom study ofa text with assistance of a teacher available, and (2) 90 per cent com-prehension test score, the level which is considered an indication thatthe text is sufficiently comprehensible to allow effective independentstudy. A 30 per cent doze score was found to be associated with 75per cent comprehension test score, and 50 per cent doze score wasassociated with 90 per cent comprehension score. Replication of thestudy in the succeeding year (Bormuth, 1968) showed a doze scoreof 44 per cent to be associated with the "classroom level" and 57 percent to be associated with the "independent level. " Similar proce-dures by Rankin and Culhane (1969) using their more difficult type ofcomprehension test,- as described previously, found doze scores of 41and 61 per cent at the comprehension score points of interest. The cor-respondence between these results and those of Bormuth is quite re-markable in view of the fact that Bormuth considered his 1967 results

31

oittcot411411480

relating doze score to 90 per cent comprehension score relativelyinvalid, because of ceiling effects present in the comprehensiontest used in that study.

Based on these results, Rankin and Culhane concluded thatit would be appropriate for teachers to consider books on which pu-pils cannot attain a doze score of approximately 40 per cent to betoo difficult for those students.

'ine doze procedure has numerous advantages and disadvan-tages. Among the advantages pointed out by Taylor (1953) and Kiare,Sinaiko, and Stolurow (1970) are: (1) scoring reliability is very high,(2) it works well with "non-standard" material, (3) it accounts for in-terest and prior knowledge of reader populations, (4) subjects of allabilities seem to enjoy it, and (5) test materials are easy to con-struct.

Among its disadvantages are: (1) doze is a measure, not a ,

predictor of readability, requiring testing. of sizable samples of people,(2) Klare et al. (1970) hypothesize that it may depend more on know-ledge of language than subject matter, (3) it may not accurately re-flect all types of comprehension, and (4) it may depend excessivelyon "short-range constraints"--the four or five words appearing oneach side of the deleted word.

32

litSt004441011E

A. lication of Readabilit Formulas as "Rules .for Writin

The urge to apply mechanically the traditional readabilityformulas for purposes of improving readability is very strong, butmay not be appropriate, Smith and Kincaid (1970) point out thatreadability scores are gross measures of difficulty at best and mustnot be taken as indices of good or bad writing. A deliberate attemptto shorten sentence and word length does not necessarily enhancereadability. In fact, readability may be degraded. A more advantage-ous approach is to make the writing more logical and precise, Whencombined with these principles, consideration of the structural fac-tors contained in readability formulas may, however, contribute toreadability.

The recommendations of Flesch (1951) are typical of thosemade in hope of improving readability, He suggests that: (1) a per-sonal type of discourse be adopted, (2) the importance of points pre-sented be discussed, (3) introductions and summary statement be in-cluded, (4) punctuation be used in such a manner as to nelp the read-er, (4) points be discussed in chronological order or in order of in-creasing importance, and (6) excessive wordiness be avoided. He doesrecommend, additionally, that short paragraphs, sentences, and wordsbe used.

The advantage to be gained by careful consideration of the over-all structure of a passage of text was demonstrated by Lee (1965). Sig-nificantly better learning was found from articles written in a highlystructured manner compared to "normal" articles or those in whichparagraph order has been randomized. The highly structured articlesdiffered from the normal in that they included: (1) an introductory para-graph outlining the points to follow, (2) a final summarizing paragraph,(3) a number of major and minor headings, and (4) transitional para-graphs elaborating on completed and subsequent topics and emphasizingthe organization of the paper.

33

gatCOPY

411414814

Multimodal Presentation

The field of information transfer has also turned to the in-vestigation of mult :modal information presentation to enhance andfacilitate the information transfer process. Research in this areahas been mainly devoted to answering the question: Can method Xtransfer information as well as some other method? It has beenargued (Phillips, 1966) that this is not a worthwhile issue becausethe question assumes that the method against which comparisons arebeing made has been the most effective in transmitting the informa-tion. What the research may actually demonstrate is that method Xmay transfer information just as poorly as method Y. Phillips (1966)suggests that a more relevant question is:

Which resource or combination of resources (people, places, media) is ap-propriate for teaching what type of subject matter to what type of learnerunder what conditions (time, place, size of group, and so on) to achievewhat purpose? (p. 374).

Research on improving instructional materials has often oc-curred mainly without any reference to any precise theoretical no-tions (Briggs, 1966; Lumsdaine, 1964). Theories of learning havenot been taken into account. Even though a great deal of researchhas been performed to improve the effectiveness with which materi-als are presented in certain media, when one asks which media willbe more effective in presenting a certain type of material to a specialclass of learner, one comes to a standstill.

An experiment by Bouriss,eau, Davis, and Yamamato (1965)demonstrated how assumptions in the audiovisual field often fail topossess merit. It is generally assumed that in tez*ms of direct sen-sory appeal, pictures are superior to printed or spoken words. "Apicture is worth a thousand words" is generally assumed. Contraryto this belief, Bourisseau, Davis, and Yamamato (1965) demonstratedthat pictorial stimuli are inferior to verbal (printed) stimuli in regardto both the number of subjects making sensory responses and the totalnumber of sensory responses evoked.

34

tercotesouszt

There have been a number of studies which suggest little, orno payoff from multimodal presentation.

Virag (1971) attempted to assess the effectiveness of threemodes in transmitting content to students with different aptitudes.The three instructional modes were: (1) low verbal (tape-slides andshort film episodes--materials presented at a fixed pace via the audio-visual communicative channels), (2) high verbal (written case studies--presented at a self pace through the channel of print), and (3) con-ventional (short lectures--presented at a fixed pace through the audio-communicative channel). The results indicated that no single mode ofinstruction was consistently more effective than the other two for anyparticular aptitude pattern.

In an attempt to prepare an audio-tutorial minicourse, Long(1970) made use of five methods. The procedures presented the ma-terials to be learned through: (1) printed text, (2) printed text withsupplementary programmed items, (3) printed text with supplementarylaboratory manipulations and observations, (4) taped audio programwith supplementary programmed items, and (5) taped tutorial programwith laboratory manipulations and observations. No significant differ-ences were found to exist between the five instructional methods em-ployed.

Travers (1965) reported a study performed by Van Mondfransin which verbal materials to be learned were presented auditorially,visually, and simultaneous audio-visual presentations. The resultsindicated that there was no difference between single sense channelpresentations and no difference between single and multiple sensechannel presentations. This research supports the earlier conclu-sion by Van Mondfrans and Travers (1964) that the use of two sensorymodalities has no advantage over one in the learning of material whichis redundant across modalities.

In a study by Goodrich (1971), four types of literary forms orsubjects were presented via four instructioal media: (1) fiction, (2)nonfiction (autobiography), (3) a lecture about literary symbolism,and (4) a lecture about composition. The four media were: (1) TV,(2) audio tape, (3) face-to-face lecture, and (4) text. There wereno significant differences between the different media. Goodrich con-cluded that the effects of the medium may not be noticeable undernormal learning situations.

35

BarCOPY

AVAILABLE

Sticht (1969) presented materials of differing levels of diffi-culty, to Ss of different aptitudes through the visual and auditory sen-sory channels. According to Sticht, the results indicated that listen-ing was as effective as reading in transmitting information of all threedifficulty levels for both average and low aptitude subjects. Siegel,Barciki and Macpherson (1965), at Applied Psychological Services,presented materials to four groups of college students via the auditoryand visual channels. Two groups, one auditory and one visual, alsoreceived adjunct programmed materials. The adjunct materials con-sisted of multiple choice questions with correct answers. The use ofthe adjunct materials provided feedback to the learner on what infor-mation was missed. The results indicated that both audio and visualpresentation with adjunct programmed materials were significantlybetter than without adjunct materials. there were no significant dif-ferences between the different sensory channel presentations.

On the other hand, a number of studies have suggested somegain to accrue from multichannel presentation.

Singer (1970) investigated comprehension as effected by vary-ing visual and auditory presentation. The information to be learnedwas presented: (1) textually, (2) verbally, and (3) verbally paired withreading. Comprehension was poorer when the materials were pre-sented only auditorially, but no difference existed between reading andreading with listening.

Severin (1967) presented materials via the auditory, the visual,and the audio-visual modes of presentation. The results indicated thatprint and print with audio were superior to audio for transferring in-formation. There was no difference between print and print with audio.These findings were substantiated in a later study (Singer, 1970).

The results of a recent study (Senour, 1971) concerning the ef-fects of student control of audio tape learning indicated that providingcontrol functions to the student; i, e. , the capability for starting, stop-ping, and replaying the tape, aided learner achievement as comparedto the situation which denied them. There was a significantly positivecorrelation between learner achievement and the number of times thelearner elected to use the controls. The subjects reported that theydid not have to take notes because they could replay the tape until theyknew it.

36

Nelson (1970) reported a study on the effects of visual-audi-tory modality preference on learning mode preference. The resultsindicated that auditory and visual perceptual subtests of readingreadiness tests are not sufficiently sensitive to disccrnrriodalitypreferences. Another report (Hueber, 1970) supported the resultsobtained by Nelson (1970).

If sensory modality preferences do affect information transfer,a means of determining these preferences must be developed. If audiotape presentations of information are going to be used, can subjectsbe taught to listen more attentively? A large body of research indi-cated that training to listen is possible (Brown, 1954; Erikson, 1954;Irwin, 1953; Nichols, 1949; Lewis, 1956). Other researchers (Erikson,1954; Irvin, 1954) suggest that low listening ability subjects benefit fromsuch training more than subjects of high listening ability.

CHAPTER 1II

DISCUSSION

The most important characteristics of the readability formu-las described here, as well as of those formulas appearing before1953, and considered of at least moderate importance are presentedin tabular fc m in Appendix A to this review.

In terms of application of these formulas for predicting read-ability, in the late fifties, there appeared to be rather general agree-ment (Chall, 1956; Klare, 1963; Powers et al. , 1958), that the mostprecise formula available, and therefore the one to be most generallyrecommended, was the Dale-Chall method. If reference to a wordlist was to be avoided, the Flesch formula was recommended, unlessother factors such as special reader populations, types of reading ma-terial, or particular advantages of one or another formula warrantedanother choice. It does not seem that this general set of recommenda-tions should be changed at this time.

Estimation of Reading Level

The majority of the readability formulas predict readabilityin terms of reading grade level. In order to apply them most effec-tively, knowledge must be obtained concerning the distribution of read-ing grade levels of the expected reader population. The variability ofreading grade level within school grades and within groups of adultshaving achieved particular levels of education found by Gray and Leary(1935) suggests that pure estimation of reading grade level is some-what unreliable. Reading grade level may be determined by admin-istering standard tests of reading ability, of which 37 are reported asbeing available in The Sixth Mental Measurements Yea rt--,ok (Buros,1a65), to a sample of the expected reading population.

39

OarCOPPANIGIOlt

However, in certain areas of application, most notably themilitary, a much more efficient, less time consuming, and lessexpensive procedure may be employed. It has been found in theAir Force (Madden & Tupes, 1966) and in the Army (Caylor et al.,1972) that reading grade level may be estimated from certain apti-tude test scores. Madden and Tupes noted that the general aptitudeindex (AI) of the Airman Qualifying Exam (AQE) correlated above .70with reading level. This is largely due to the inclusion of a readingvocabulary subtest score within the general AL Although readinggrade level as measured by the California Test of Reading Vocabularyand Reading Comprehension was estimable from the general AI alone,

more accurate prediction of an individual's reading grade level couldbe made by using a regression equation based on general AI and theindividual's selector AI score. The latter is one of three aptitude

area scores--administrative, mechanical, or electronicwhich arereferred to when assigning men to career fields in the Air Force. Forsome career fields, the selector variable is the administrative AI;for others, it is the mechanical AI, while for others, the electronic AIis employed. Selection to a few career fields is based solely on gen-eral Al. The regression equations appropriate for estimating readinggrade level of individuals in career fields which use the selector AIs

for personnel selection are:

administrative: RGL = . 0437 (GenAI) + . 050 l(MAI) + 5.0730

mechanical: RGL = . 099 l(GenAI) + 0085(MechAI) + 5.0459

electronic: RGL = .0743(GenAI) + . 0222(E1AI) + 4.6088

Caylor et al. (1972) similarly produced a regression equationto predict reading grade level as measured by the U. S. Armed ForcesInstitute (USAFI) Reading Achievement Test III, Form A, from know-ledge of an individual's Armed Forces Qualifying Test (AFQT) score.Their equation is:

RGL = . 75(AFQT score) + 5.52.

40

These regression formulas allow much more confidence tobe placed in statements concerning the appropriate readability lev-els of materials intended for use by Army or.Air Force enlistedpersonnel.

Discussion of Formulas

A number of general criticisms have been applied to readabil-ity formulas. The definition of the criterion of comprehensibility hasbeen a problem since the earliest studies (Bormuth, 1966). The usu-al practice has been to administer multiple choice criterion questionsjust after the passages being tested are read. Lorge (1939) criticizedthis procedure because test performance may be strongly influencedby the difficulty of the language of the test questions.

The difficulty of the questions may also be varied, so that thesubject may be able to score highly by simply remembering details ofa passage, or he may be required to determine the author's purposein writing the passage, and select the best of several highly similar al-ternatives.

In addition, Fry (1968) pointed out that reading grade levelsare not rigorously defined, so that different reading tests, especiallythose developed at different times, may provide different grade levelsfor identical subjects. New reading tests are more difficult than oldones, indicating that students at given grade levels read better thantheir predecessors. He summarizes the problem as one of "tryingto determine grade level when grade level won't stand still. " Itseems strange that the various predictions have not been correctedfor criterion unreliability. Also, there is no agreement on the levelof comprehension to be accepted. Fifty and 75 per cent were commonwhen McCall-Crabbs tests were employed, but the FORCAST formulauses 70 per cent (35 per cent cloze score) and the SMOG count wasbased on 100 per cent comprehension.

Most of the readabili-,, formulas do not account for the effectson readability of the reader's interests, experiences, tr aptitudes.Exceptions to this are the supplementary indices of human interest,abstraction, realism, energy, and formality-popularity of Flesch,

41

Cepr

along with the FORCAST formula, and Jacobson's measures of thereadability of physics and chemistry texts. However, some of thesemeasures will allow accurate reflection of the readability of materi-al appearing in professional or technical journals.

Finally, as Bormuth (1966) pointed out, until very recent yearsno theoretical base was available from which to generate testable hy-potheses relating to readability. Powerful theories of language behavi-or did not exist, so that only the most obvious statistical characteris-tics of the written text were studied.

Although he did not consider it appropriate to present his for-mulas in his 1966 paper, Bormuth reported that he has written regres-sion equations whose predictions correlate up to . 93 with doze testscore. He considers doze scores to be the only acceptable criterionof readability currently available. He reported that all c. the variablesin his formulas are new ones generated from modern linguistic theoriessuch as those of Chomsky and Yngve. None of the traditional readabil-ity variables were powerful enough to be included in his formulas.

Inter-user and test-retest reliabilities of formulas vary widely.Kincaid (1972) found that the test-retest reliability of the McElroy fogcount was only . 5. He also pointed out that he gets a headache aftertaking a fog count for 30 minutes. The highly objective ARI, however,has a measured test-retest reliability of over . 99 (Huff, 1970). England,Thomas, and Patterson (1948) found the interrater and test-retest reli-abilities of the Flesch reading ease formula were approximately . 90.

The reliabilities of other readability formulas had not been sat-isfactorily tested prior to 1958, according to Mare (1963), and workof this type has not appeared since that time.

42

04,49..

All of the readability measuring procedures discussed showroughly similar correlations with their respectiN e criteria of dif-ficulty. Additionally, all of the criteria seem to be highly intercor-related. Hence, from the validity standpoint, there is little basisfor selecting one readability measure over another. In this case,utility and practicality for the individual user seem to represent thecriteria to be employed when selecting a readability measure. Ashas been previously reported, the Dale-Chall and Flesch formulas,as revised by Powers, Sumner, and Kearl (1958), are the most highlyrespected of the traditional formulas, but considerations of particularcharacteristics of an individual situation may warrant choice of oneof the many other available formulas.

There ib, however, some indication that the mo13t powerfulmethod of measuring readability currently available is the cloze meth-od. However, cloze is less easy to apply than the otht..r techniques.Application of the cloze method requires preparation of numerous testforms, assembly of a group of subjects who are representative of theappropriate reader population, and considerable administration andscoring time. Also, cloze tests cannot be used to monitor the'read-ability of material as it is being written.

It has been knowri, since at least the time of the Dale-Tylerstudy in 1934, that using readability formulas as a basis for "rulesfor writing" is not an effective way to produce readable material.The recommendations made to writers interested in improving theinformation transfer capability of their output have always been ofa "qualitative" nature, as opposed to the "quantitative" nature of thereadability formulas. Recommendations to writers concern such con-siderations as stylistic variables, level of abstractness, coherency,obscurity of expression, difficulty of ideas expressed, ideationaldensity, and soon. These factors are generally ignored by readabil-ity formulas, since they are extremely difficult to measure or evendefine. This leaves us in the paradoxical situation of attempting tomeasure readability by measuring factors which do not determinereading ease. Readability formulas are based on relatively unim-portant characteristics of text which must be considered almost en-tirely artifactual.

43

Future Research Avenues

The search for more relevant variables and methods ofmeasuring them impresses us as the most pressing need for futureresearch in the area of readability measurement. It is likely thatmodern psycholinguistic theory could be a very stimulating areafrom which to draw important readability variables. Applicationsof information theory to the study of readability have not proven fruit-ful thus far. Such applications have been limited to attempts to meas-ure the redundancy of textual material as reflected by the variabilityof responses made in doze tests.

In order to me asure information transfer using the conceptsof entropy and bits, it is necessary to be able to specify accuratelythe stimulus alphabet. In this case, the stimulus variables are lettersor words and the probabilities and conditional probabilities of occur-rence of each symbol, from the point of view of the receiver, or read-er. Inability to specify adequately these probabilities makes the in-formation theoretic approach seem relatively inappropriate at thistime.

Research is also needed which focuses on development of amanageable criterion of readability. Cloze scores do not representa panacea, but the other criteria available have disadvantages. Com-prehension test scores are easily confounded with variable aspects ofthe test questions, and reading grade level is another step removedfrom comprehension, the topic of interest.

Second, the effects of the additional variables of general intel-ligence, interest, and prior experience on readability are not wellunderstood. In 1935, Gray and Leary observed that quantitative fac-tors related to readability were not the same for good and poor readers.But work on this variable has not continued. Few, if any, investigationshave sought to determine the effects of reader interest and experienceon readability, although a few formulas have attempted to account forthese factors, notably the Flesch reading ease formula and the FOR-CAST formula.

44

Third, the majority of readability formulas have been vali-dated on school students of various levels. Students comprise thepopulation of interest in only a small portion of the possible appli-cations of readability formulas. Hence, more cross-validationalstudies are needed which are based on samples of adult readers ofall reading ability levels.

Fourth, no measure of readability is available for evaluationof tests or programmed instructional material. These types of mate-rial are being used more and more in our society, and a method ofmeasuring their readability is greatly needed.

In all future research, it will be appropriate to evaluate eachreadability formula developed in mathematical forms other than theadditive linear model traditionally employed. Many authors havefound nonlinearities in the relationships between their chosen vari-ables and criteria. They have dealt with this problem by presentingtheir data graphically, or presenting a table of corrected values forthe predictions from their formulas. It is likely that nonlinear re-gression techniques account for available data in a significantly moreadequate manner than do formulas of the current, linear variety.

45 Lk.,.)k

REFERENCES

Bloomer, R.H. Level of abstraction as a function of modifier load.Journal of Educational Research, 1959, 52, 269-272.

Bormuth, J. R. Readability, a new approach. Reading ResearchQuarterly, 1966, 1, 79-132.

Bormuth, J. R. Comparable cloze and multiple-choice comprehen-sion test scores. Journal of Reading, 1967, 10, 291-299.

Bormuth, J. R. Cloze test readability; criterion referenced scores.Journal of Educational Measurement, 1968, 5, 189-196.

Bourisseau, W., Davis, 0. , & Yamamato, K. Sense-impressionresponses to differing pictorial and verbal stimuli. AV Com-munication Review, 1965, 13, 249-258.

Briggs, L. A procedure for the design of multimedia instruction. InProceedings of the 74th Annual Convention of the AmericanPsychological Association. Washington, D. C.: AmericanPsychological Association, 1966. P. 257-258.

Brown, J. How teachable is listening? Education Research Bulle-tin, 1954, 33, 85-93.

Buros, O.K. (Ed. ). The Sixth Mental Measurement Yearbook, High-land Park, N. J.: Gryphon Press, 1965.

Caylor, J. S., Sticht, T. G., Fox, L. C., & Ford, J. P. Methodolog-ies for determining reading requirements of military occupa-tional specialities. Presidio of Monterey, Calif.: HumanResources Research Organization, HumRRO Division No. 3,Draft Technical Report, January 1972.

Chall, J. S. Readability: An appraisal of research and application.Columbus, Ohio: Bureau of Educational Research, OhioState University, 1958.

Bar copy

Coke, E. U., & Rothkopf, E. Z. Note on a simple algorithm for acomputer-produced reading ease score. Journal of AppliedPsychology, 1970, 54, 208-210.

Dale, E., Chall, J. S. A formula for predicting readability (and)Instructions. Educational Research Bulletin, 1948, 27, 11-20,28, 37-54.

Dale, E., & Chall, J. S. The concept of readability. ElementaryEnglish, 1949, 26, 19-26.

Dale, E., & Tyler, R. W. A study of the factors influencing the diffi-culty of reading materials for adults of limited. reading ability.Library Quarterly, 1934, 4, 384-412.

De Long, V. R. Primary promotion by reading levels. ElementarySchool Journal, 1939, 38, 663-671.

Dolch, E. W. Fact burden and reading difficulty. Elementary EnglishReview, 1939, 16, 135-138.

Dolch, E. W. The use of vocabulary lists in predicting readability andin developing reading materials. Elementary English, 1949, 26,142-149.

England, G. W., Thomas, M. , Paterson, D. G. Reliability of the ori-ginal and the simplified Flesch reading ease formulas. Journalof Applied Psychology, 1953, 37, 111-113.

Erikson, G. Can listening efficiency be improved? Journal of Commu-nication, 1954, 4, 128-133.

Farr, J. N. , Jenkins, J. J. , Paterson, D. G. Simplification of Fleschreading ease formula. Journal of Applied Psychology, 1951, 35,333-337.

Flesch, R. Estimating the comprehension difficulty of magazine articles.Journal of General Psychology, 1943, 28, 63-80. (a)

Flesch, R. Marks of readable style. New York: Bureau of Publications,Teachers College, Columbia University, 1943. (b)

Flesch, R. The art of plain talk. New York: Harper & Bros., 1946.

48

147 Cat

Flesch, R. A new readability yardstick. Journal of Applied Psychol-o 1948, 32, 221-233.

Flesch, R. The art of readable writing. NewYork: Harper &. Bros., 1949.

Flesch. R. How to test readability. New York: Harper & Bros., 1951.

Flesch, R. F. Reply to "Simplification of Flesch reading ease formula.Journal of Applied Psychology., 1952, 36, 54-55.

Flesch, R. F. How to make sense. New York: Harper & Bros., 1954.

Forbes, F. W. , & Cottle, W. C. A new method for determining reada-bility of standardized tests. Journal of Applied Psychology, 1953,37, 185-190.

Fry, E. A readability formula that saves time. Journal of Reading, 1968,11, 513-516, 575-578.

Lillie, P. J. A simplified formula for measuring abstraction in writing.Journal of Applied Psychology, 1957, 41, 214-217.

Goodrich, H. An investigation of tile differential effects of four differentmedia on information acquisition and preception. DissertationAbstracts International, 1971, 31, 3775.

Gray, W. S. , & Leary, B. E. What makes a book readable. Chicago:University of Chicago Press, 1935.

II1

Gunning, R. The technique of clear writing. New York: McGraw-Hill,1952.

Huebner, R. Interaction between patterns of individual differences insensory modalities of students and methods of classroom instruc-tion. Dissertation Abstract International, 1970, 31, 679.

Huff, K. H. & Smith, E. A. Reliability, baseline data, and instructionsfor the automated readability index.. AFHRL-TR-70-14, AD-729209. Lowry AFB., Colo. : Technical Training Division, Air ForceHuman Resources Laboratory, October 1970.

Irvin, C. An analysis of certain aspects of listening training programamong college freshman at Michigan State College. Journal ofCommunication, 1954, 4, 2.

49

1406,fria._-w1"77/Irwin, C. Evaluating and training program in listening for college '4$0.

freshmen. School Review, 1953, 61 25-29. 4#04,

Jacobson, M. D. Reading difficulty of physics and chemistry textbooks.Educational and Psychological Measurement, 1965, 25 449-457.

Johnson, G. R. An objective measure of determining reading difficulty.Journal of Educational Research, 1930, 21, 283-287.

Kessler, E. The readability of selected contemporary bodks for leisurereading in high school biology. Science Education, 1941, 25,260-264.

Kincaid, J. P. Making technical writing readable. Human Factors So-ciety Bulletin, 1972, 15(5), 3-4.

Klare, G. R. A table for rapid determination of Dale-Chall readabilityscores. Educational Research Bulletin, 1952, 31, 43-47.

Klare, G. R. The measurement of readability. Ames, Ia. : Iowa StateUniversity Press, 1963.

Klare, G. R. , Sinaiko, H. W. , & Stolurow, L. M. The cloze procedure:A convenient readability, test for training materials and transla-tions. Arlington, Va. : Institute for Defense Analysis, .Scienceand Technology Division Paper P-660, 1971.

Lee, W. Supra-paragraph prose structure: Its specification, perception,and effects on learning. Psychological Reports, 1965, 17, 135-144.

Lewerenz, A. S. Measurement of the difficulty of reading materials.Educational Research Bulletin, Los Angeles City Schools, 1929,8, 11-16. Cited by G. R. Klare, The measurement of readability.Ames, Ia.: Iowa State University Press, 1963, P. 40.

Lewerenz, A. S. Vocabulary grade placement of typical newspaper con-tent. Educational Research Bulletin, Los Angeles City Schools,1930, 10, 4-6. Cited by G. R. Klare. The measurement of read-ability. Ames, Ia. : Iowa State University Press, 1963.

Lewerenz, A. S. A vocabulary grade placement formula. Journal of Ex-perimental Education, 1935, 3, 236.

Lewerenz, A. S. Selection of reading materials by pupil ability and pupilinterest. Elementary English Review, 1939, 16, 151-156.

50

iithrot

Lewis, N. Listen please! Clearing House, 1956, 30, 535-536.Lively, B.A., & Pressey, S. L. A method for measuring the "vo-

cabulary burden" of textbooks. Educational Administrationand Supervision, 1923, 9, 389-398.

Long, G. A study on the preparation of an audio-tutorial minicourse.Dissertation Abstracts International, 1970, 31, 1690-1.691.

Lorge, I. Predicting reading difficulty of selections for children.Elementary English Review, 1939, 16, 229-237.

Lorge, I. Predicting readability. Teachers College Record, 1944,45, 404-409.

Lorge, I. The Lorge and Flesch readability formulae: A correc-tion. School and Society, 1948, 67, 141-142.

Lumsdaine, A. Educational technology, programmed learning, andinstructional science. Theories of Learning and Instruction.63rd Year Book, National Society for the Study of Education,Chicago: University of Chicago Press, 1964, 371-401.

Madden, H. L. , & Tupes, E. C. Estimating reading ability levelfrom the AQE General Aptitude Index. PRL-TR-66-1, AD-632 182. Lackland AFB, Texas: Personnel Research Lab-oratory, Aerospace Medical Division, February, 1966.

McClusky, A. Y. A quantitative analysis of the difficulty of readingmaterials. Journal of Educational Research, 1934, 28, 276-282.

McLaughlin, G. H. SMOG grading: A new readability formula.Journal of Reading, 1969, 12, 639-646.

Morris, E. C. , & Halverson, D. Idea analysis technique. Unpub-lished. On file at Columbia University Library. 1938. Citedby G. R. Klare, The measurement of readability. Ames, Ia.:Iowa State University Press, 1936.

Nelson, J. The effects of visual-auditory modality preference onlearning mode preference in first grade children. DissertationAbstracts International, 1970, 31, 2219-2220.

Nichols, R. Teaching of listening. Chicago Schools Journal, 1949,30, 273-278.

51

ser 11°P1 111Ojemann, R. The reading ability of parents and factors associated 14matwith the reading difficulty of parent education materials. InG. D. Stoddard (Ed. ) Researchers in Parent Education II.Iowa City, Ia. : University of Iowa, 1934. Cited by J. S. Chall,Readability: An appraisal of research and application. Colum-bus, Ohio: Ohio State University, Bureau of Educational Re-search, 1958.

Patty, W. W. , & Painter, W.I. A technique for measuring the vocab-ulary burden of textbooks. Journal of Educational Research,1931, 24, 127-134.

Phillips, M. Learning materials and their implementation. Reviewof Educational Research, 1966, 36, 373-379.

Powers, R. D. , Sumner, W. A. , & Kearl, B. E. A recalculation offour adult readability formulas. Journal of Educational Psy-chology, 1958, 49, 99-105.

Rankin, E. F. , & Culhane, J. W. Comparable cloze and multiple -choice comprehension test scores. Journal of Reading, 1969,13, 193-198.

Senour, R. A study of the effects of student control of audio tapedlearning experiences (via the control functions incorporatedin the instructional device). Dissertation Abstracts Interna-tional, 1971, 31, 3351-3352.

Severin, W. The effectiveness of relevant pictures in multiple-chan-nel communications. AV Communications Review, 1967, 15,386-401.

Siegel, A. I. , Barcik, J. D. , & Macpherson, D. H. Mass trainingtechniques in civil defense. I. Introductory studies of tele-phonic adjunct training. Wayne, Pa.: Applied PsychologicalService 3, February 1965.

Siegel, A. I. , & Fischl, M. A. Mass training techniques in civil de-fense. II. A further study of telephonic adjunct training.Wayne, Pa.: Applied Psychological Services, July 1965.

Singer, C. Eye movements, reading comprehension, and vocabu-lary as effected by varying visual and auditory modalitiesin college students. Dissertation Abstracts International,1970, 31, 164.

Smith, E. A. , & Kincaid, J. P. Derivation and validation of theAutomated Readability Index for use with technical mate-rials. Human Factors, 1970, 12, 457-464.

52

Smith, E. A. & Senter, R. J. Automated Readability Index. Wright -Patterson Air Force Base, Ohio: Aerospace Medical Division,AMRL-TR-66-220, November 1970.

Spache, G. A new readability formula for primary grade reading ma-terials. Elementary School Journal, 1953, 53, 410-413.

Sticht, T. G. Learning by listening in relation to aptitude, reading, andrate-controlled speech. Presidio of Monterey, Calif. : HumanResources Research Organization, HumRRO Division No. 3,Technical Report 69-23, December, 1969.

Stone, C. R. Measures.of simplicity and beginning texts in reading.Journal of Educational Research, 1938, 31, 447-450.

Szalay, T.G. Validation of the Coleman readability formulas. Psy-chological Reports, 1965, 17, 965-966.

Tannenbaum, P. H. , & Greenberg, B. S. Mass communication. In P. R.Farnsworth (Ed. ), Annual Review of Psychology, 1968, 19, 351-387.

Taylor, W. Cloze procedure: a new tool for measuring readability.Journalism Quarterly, 1953, 30, 415-433.

Taylor, W. L. Recent developments in the use of "dime procedure. "Journalism Quarterly, 1956, 33, 42-46, 99.

Taylor, W. L. "Cloze" readability scores as indices of individual differ-ences in comprehension and aptitude. Journal of Applied Psychol-ogy, 1957, 41, 19-26.

Thorndike, E. L. Improving the ability to read. Teachers College Re-cord, 1934, 36, 1-19, 123-144, 229-241.

Travers, R. M. W. On the transmission of information to human receivers.Audiovisual Communication Review, 1964, 12, 373-385.

Tribe, E. B. A readability formula for the elementary school based uponthe Rinsland vocabulary. Unpublished Doctoral dissertation, Uni-versity of Oklahoma, 1956. Cited by G. R. Klare, The measure-ment of readability. Ames, Ia. : Iowa State University Press,1963, P. 69-70.

53

torzavAtakt

Valentine, L.D. Jr. , & Vito la, B. M. Comparison of self-motivatedAir Force enlistees with draft-motivated enlistees. AFHRL-TR-70-26, AD-713 638. Lack land AFB, Teitass PersonnelResearch Division, Air Force Human Resources Laboratory,July 1970.

Van Mondfrans, A., & Travers, R. Learning of redundant materialspresented through two sensory modalities. Perceptual & MotorSkills, 1964, 19, 743-751. .

Van Mondfrans, A. , & Travers, R. Paired-associate learning withinand across sense modalities and involving simultaneous and se-quential presentations. American Educational Research Journal,1965, 2, 89-99.

Vineberg, R. ,Sticht, T. G. , Taylor, E. N. , & Caylor, J. S. Effects ofaptitude (AFQT), job experience, and literacy on job performance:summar of HumRRO work units UTILITY and REALISTIC.Presidio of Monterey, Calif. : Human Resources Research Or-ganization, HumRRO Division No. 3, August 1970.

Virag, W. The effectiveness of three instructional modes employed totransmit content to students with different aptitude patterns.Dissertation Abstracts International, 1971, 31, 5520.

Vogel, M. & Washburne, C. W. An objective method of determining gradeplacement of children's reading material. Elementary SchoolJournal, 1928, 28, 373-381. Cited by C. R. Klare, The measure-ment of readability, Ames, Ia. : Iowa State University Press, 1963,P. 38-39.

Washburne, C. ,& Morphett, M. V. Grade placement of children's books.Elementary School Journal, 1938, 38, 355-364.

Washburne, C., & Vogel, M. What books fit what children? School andSociety, 1926, 23, 22-24. Cited by G. R. Klare, The measurementof readability, Ames, Ia. : Iowa State University Press, 1963.P. 38-39.

Wheeler, L. R. , & Smith, E.H. A practical readability formula for theclassroom teacher in the primary grades. Elementary English,1954, 31, 397-399..

54

OatisuPygripA

wietEWheeler, L. R. , & Wheeler, V.A. Selecting appropriate reading ma-terials. Elementary English, 1948, 25, 478-489.

Yoakam, G.A. A technique for determining the difficulty of readingmaterials. Unpublished. University of Pittsburgh, 1939.(Revised 1948.) In G. A. Yoakam, Basal Reading Instruction,New York: McGraw-Hill, 1955. Cited by J. S. Chall, Read-ability: An appraisal of research and application. Columbus,Ohio: Bureau of Educational Research, Ohio State University,1958. P. 28.

55/PI 64nk

APPENDIX A

Glossary of Readability Measures

Aut

hor(

s)D

ate

Cri

teri

onV

aria

bles

Pred

icte

dV

alue

Liv

ely

&19

23Ju

dged

dif

fi-

Wei

ghte

d m

edia

n W

ei(

Rel

ativ

ePr

esse

ycu

ltyno

. , b

ased

on

Tho

rndi

kew

ord

list

diff

icul

ty

Dol

ch19

28G

rade

leve

las

sign

ed b

ypu

blis

hers

Five

indi

ces

of v

ocab

u-la

ry "

load

Was

hbur

ne19

28M

ean

test

edX

2: N

umbe

r of

dif

fere

ntR

eadi

ng&

Vog

el(W

inne

tka)

read

ing

leve

lof

thos

e lik

ing

book

wor

ds in

100

0 w

ords

X3:

Num

ber

of p

repo

sitio

nsin

IC

C(

wor

ds

scor

e on

Stan

ford

Ach

ieve

-m

eat T

est

(X1)

X4:

Num

ber

of w

ords

not

on T

horn

dike

list

of

10 ,C

CG

wor

ds

X :

Num

ber

of s

impl

ese

nten

ces

in 7

5se

nten

ces

'Lew

eren

z19

29O

rder

of

pre-

sent

atio

n on

Perc

enta

ge o

f w

ords

be-

ginn

ing

with

W, H

, B (

easy

)G

rade

leve

l

Stan

ford

I, a

nd E

(ha

rd)

GP

Ach

ieve

men

tL

OT

est

Lew

eren

z19

301)

Rat

io o

f A

nglo

-Sax

on to

Gra

de le

vel

1935

Gre

ek a

nd R

oman

wor

ds19

39(d

iffi

culty

)

John

son

1930

Gra

de le

vel

assi

gned

by

publ

ishe

rs

Patty

&19

31N

one

(rel

ied

Pain

ter

on v

alid

ity o

fT

horn

dike

wor

dlis

t)

2) R

atio

of

wor

ds in

"C

lark

'sfi

rst 5

06"

toto

tal d

iffe

rent

wor

ds (

dive

rsity

)

3) E

stim

ate

of im

age

bear

ing

or s

enso

ry w

ords

(in

tere

st)

4) P

olys

ylla

bic

wor

d co

unt

5) V

ocab

ular

y m

ass

(num

ber

of d

iffe

rent

wee

ds in

sam

-pl

e:th

at a

re n

ot o

n C

lark

'slis

t of

500

com

mon

wor

ds

Perc

enta

ge o

f po

lysy

llabl

eG

rade

leve

lw

ords

1) T

horn

dlke

wor

d lis

t ind

ex R

elat

ive

num

bers

diff

icul

ty

2) N

umbe

r of

dif

fere

nt w

ords

in s

ampl

e

Form

ula

Typ

e of

Ran

ge o

fM

etho

d of

coa

t -M

ater

ial

Mat

eria

lpr

ison

with

Stud

ied

Stud

ied

crite

rion

Wei

ghte

d m

edia

n in

dex

num

-be

r

Bas

e ju

dgm

ent o

f di

ffic

ulty

on "

seve

ral o

f vo

cabu

lary

fact

ors

X1

=.0

85X

2 +

.101

X3

+ .6

04X

4

.411

X +

17.

43

Mea

n of

tabl

ed v

alue

s fo

r ea

chof

var

iabl

es

Mea

n of

tabl

ed v

alue

s fo

r ea

chof

var

iabl

es

Tab

led

valu

e of

per

cent

age

ofpo

lysy

llabl

es

Inde

x N

umbe

r=A

A.W

.W.V

.

Scho

ol,

Gra

de 2

-In

spec

tion

gene

ral,

colle

gesc

ient

ific

Rea

ding

Prim

er-

Insp

ectio

nte

xts

grad

e 4

Chi

ldre

n's

libra

rybo

oks

Gra

des

3-9

Cor

rela

tion

.845

(Thi

s va

lue

is c

on-

foun

ded

by e

ffec

tsof

pop

ular

ity o

fbo

oks)

Stan

ford

Gra

de 2

-G

raph

icA

chie

ve-

colle

gem

erit

Tes

tpa

ragr

aphs

Wid

ePr

imer

toIn

spec

tion

vari

ety

grad

e 8

Scho

olA

.W.W

. V. =

mea

n T

born

dike

text

sIn

dex

Num

ber

R =

num

ber

of d

iffe

rent

war

dsin

sam

ple

Gra

des

9-12

40)

Aut

hor(

s)D

ate

Cri

teri

onV

aria

bles

Pred

icte

dV

alue

Form

ula

Typ

e of

Ran

ge o

fM

ater

ial

Mat

eria

lSt

udie

dSt

udie

d

Met

hod

of c

om-

pari

son

with

crite

rion

Oie

man

n19

34X

C50

Dal

e &

1934

Tyl

erC

ompr

e-he

nsio

n sc

ore

of a

dults

of

low

rea

ding

leve

l

6 m

easu

res

of v

ocab

ular

ydi

ffic

ulty

, 8 m

easu

res

of s

truc

ture

, 4 q

ualit

ativ

efa

ctor

s

X -

Num

ber

of d

iffe

rent

tech

nica

l ter

ms

X3:

Num

ber

of d

iffe

rent

hard

non

-tec

hnic

alw

ords

X :

Num

ber

of in

dete

rmi-

4 na

te c

laus

es

McC

lusk

y19

34R

eadi

ngsp

eed,

com

-pr

ehen

sion

scor

e

Voc

abul

ary,

sen

tenc

est

ruct

ure

Tho

rndi

ke19

34V

ocab

ular

ydi

ffic

ulty

Num

ber

of w

ords

not

on

Tho

rndi

ke li

st in

10,

000

wor

d sa

mpl

e

Gra

y an

d19

35M

ean

com

-X

2: N

umbe

r of

dif

fere

ntL

eary

preh

ensi

onte

st s

core

of

adul

ts

wor

ds n

ot o

n D

ale

Lis

ta

769

Was

hbur

ne &

193

8M

orph

ett (

re-

visi

on o

f V

ogel

& W

ashb

urne

,19

28)

Mea

n te

sted

read

ing

leve

lof

thos

e lik

ing

book

. Als

ote

ache

r ju

dg-

men

t at g

rade

s1-

2.

X5:

Num

ber

of p

erso

nal

X6:

Mea

n se

nten

ce le

ngth

in w

ords

X7:

Per

cent

dif

fere

nt w

ords

X8:

Num

ber

of p

repo

sitir

mal

X2:

Num

ber

of d

iffe

rent

wor

ds p

er 1

000

wor

ds

X3:

Num

ber

of d

iffe

rent

wor

ds p

er 1

000

not i

nT

horn

dike

's m

ost

com

mon

150

0 w

ords

X4:

pron

ouns

phra

ses

Num

ber

of s

impl

ese

nten

ces

in 7

5 se

nten

-ce

s

Perc

ent o

fad

ults

of

low

RG

L(3

-5)

who

will

und

er-

stan

d pa

ssag

e(X

1)

Gra

de le

vel

Mea

n co

m-

preh

ensi

onte

st s

core

of

adul

ts (

X1)

Gra

de le

vel

(X1)

16 p

assa

ges

are

pres

ente

d in

ord

erof

dif

ficu

lty, t

o be

com

pare

dw

ith m

ater

ial t

o be

test

ed

Xl =

-9.

4X2

- . 4

.X3

+ 2

.2X

4

+ 1

14.4

± 9

.0

No

form

ula.

Des

crip

tion

ofvo

cabu

lary

and

sen

tenc

e st

ruc-

ture

cha

ract

eris

tics

'n e

asy

and

hard

mat

eria

ls.

Tab

led

valu

e of

num

ber

of w

ords

not o

n T

horn

dike

list

.

Xi =

.010

29X

2 +

.009

012X

5 -

.020

94X

6 -

033.

3X7

0148

5X8

+ 3

.774

X1

=. 0

0255

X2

+ .

Q45

8X3

0307

X4

+ 1

.294

Adu

ltG

rade

6-

educ

atio

nco

llege

Adu

lt he

alth

Bel

owed

ucat

ion

grad

e 8

mat

eria

l

Qua

ntita

tive

fact

ors:

corr

elat

ion.

Indi

vidu

alfa

ctor

s up

to .6

0Q

ualit

ativ

e fa

ctor

s:in

spec

tion.

Cor

rela

tion

.511

.

Wid

e va

riet

y A

bove

Insp

ectio

ngr

ade

8

Gen

eral

.fic

tion

and

non-

fict

ion

Gen

eral

adul

t

Gra

de 4

-9

Gra

de 2

-co

llege

Cor

rela

tion

.643

5

Chi

ldre

n's

Gra

des

1-9

Cor

rela

tion

.86

libra

ry b

ooks

(Thi

s va

lue

is c

on-

foun

ded

by e

ffec

tsof

pop

ular

ity o

fbo

oks)

Aut

hor(

s)D

ate

Cri

teri

onV

aria

bles

Ston

e19

38R

atio

of

new

wor

ds to

tota

lw

ords

, per

cent

sen

tenc

esco

mpl

ete

on o

ne li

ne

De

L o

ng19

38Pe

rcen

tage

of

wor

ds in

var

-io

us le

vels

of

Ston

e's

wor

dlis

t.

Lor

ge19

39M

cCal

l -X

2: P

erce

nt o

f w

ords

out

-19

44C

rabb

s te

stsi

de D

ale

list o

f 76

919

48no

rms

wor

ds

X6:

Mea

n se

nten

ce le

ngth

X: P

erce

nt p

repo

sitio

nal

8 ph

rase

s

Mor

ris

&H

alve

rson

1938

4 Q

ualit

ativ

e vo

cabu

lary

clas

ses:

I) w

ords

lear

ned

earl

y in

life

II)

loca

lism

s

III)

con

cret

e w

ords

IV)

abst

ract

wor

ds

Yoa

kam

1939

Voc

abul

ary

inde

x: b

ased

on

Tho

rndi

ke w

ord

list.

Wor

dsbe

twee

n 4t

h an

d 20

th th

ou-

sand

inde

x is

the

num

ber

of th

ousa

nd in

whi

ch w

ord

appe

ars.

Abo

ve 2

0th

thou

-sa

nd in

dex

= 2

0.

Kes

sler

1941

Judg

ed d

il-a

culty

Mea

n se

nten

ce le

ngth

inw

ords

Mea

n nu

mbe

r of

dif

fere

ntha

rd w

ords

per

10G

wor

ds

Fles

ch19

43Ju

dgm

ent a

ndX

s: M

ean

sent

ence

leng

thM

cCal

l-C

rabb

s te

s'no

rms

X: N

umbe

r of

aff

ixes

per

m10

0 w

ords

Xh:

Num

ber

of p

erso

nal

refe

renc

es p

er 1

00w

ords

Pred

icte

dV

alue

Form

ula

Typ

e of

Mat

eria

lSt

udie

d

Ran

ge o

fM

ater

ial

Stud

ied

Met

hod

of c

om-

pari

son

with

crite

rion

Rel

ativ

edi

ffic

ulty

Six

leve

lsof

rel

ativ

edi

ffic

ulty

Rea

ding

grad

e le

vel

need

ed to

answ

er 5

0%of

que

stio

nsco

rrec

tly(C

50)

Rel

ativ

edi

ffic

ulty

Gra

de le

vel

Rel

ativ

e di

ffic

ulty

bas

ed o

nva

riab

les

stat

ed.

Tab

led

valu

e of

per

cent

ages

of w

ords

at v

ario

us le

vels

of S

tone

's w

ord

list

C50

.10X

2C

6 N

1 O

' S

+ 1

.99

Tab

led

norm

s re

late

cou

nt o

fw

ords

in 4

cla

sses

to d

iffi

culty

Rea

ding

Gra

de 1

text

s

Rea

ding

text

s

McC

all-

Cra

bbs

test

pas

-sa

ges

Mea

n in

dex

num

ber

com

pari

son

.Sch

ool

with

tabu

late

d va

lues

text

s

Gra

de le

vel

Fact

ors

com

pare

d to

Gra

y an

dL

eary

sta

ndar

ds in

depe

nden

tly

Rea

ding

grad

e le

vel

need

ed to

answ

er 7

596

C75

= .1

338X

s +

.064

5Xm

-

.06S

9Xh

+ 4

.249

8

of q

uest

ions

Lat

er c

orre

cted

to;

corr

ectly

C75

= .0

7Xm

+ .0

7Xs

- .0

SXh

(C75

) (c

orre

c-tio

n fo

r hi

gh+

3.2

7gr

ade

leve

ls)

Pre

-pri

-m

er to

grad

e 2

Gra

des

Cor

rela

tion

3-12

.67

Mul

tiple

cor

rela

-tio

n of

.74

was

fou

ndbe

twee

n cl

asse

s I,

III,

IV

, and

McC

all-

Cra

bbs

test

nor

ms

( L

orge

, 193

9)

2nd

-14t

hIn

spec

tion

grad

e

35 te

xt-

Gra

de 2

-In

spec

tion

book

sco

llege

Adu

lt m

ag-

Gra

des

3-C

orre

latio

n .7

4w

ines

and

12M

cCal

l -C

rab'

- p

as.

sage

s

Aut

hor(

s)

Dat

eC

rite

rion

Fles

ch(r

evis

ion)

Dal

e &

Cha

ll

Var

iabl

esPr

edic

ted

Val

ueFo

rmul

a

Typ

e of

Mat

eria

lSt

udie

d

1948

Judg

men

t and

McC

all-

Cra

bbs

test

norm

s

1948

McC

all-

Cra

bbs

test

norm

s, c

om-

preh

ensi

onte

st s

core

s

Dol

ch19

48G

rade

leve

las

sign

ed b

ypu

blis

hers

CI

Whe

eler

&W

heel

er19

48N

one

(rel

ied

on v

alid

ity o

fT

horn

dike

wor

d lis

t)

Fles

ch19

50C

ompr

e-he

nsio

n te

stsc

ores

Farr

,Je

nkin

s, &

Patte

rson

1951

Fles

ch r

eadi

ngea

se s

core

Gun

ning

1952

Fles

ch s

core

s,M

cCal

l-C

rabb

sno

rms,

and

judg

men

t

wl =

Mea

n w

ord

leng

th (

syl-

Rel

ativ

ela

bles

per

10C

wor

ds)

diff

icul

ty,

rela

tive

sl =

Mea

n se

nten

ce le

ngth

inte

rest

pw =

Num

ber

of p

erso

nal

wor

ds p

er 1

0C w

ords

ps =

Num

ber

of p

erso

nal

sent

ence

s pe

r 10

Gse

nten

ces

X1:

Per

cent

of

wor

ds n

oton

Dal

e lis

t of

30G

Cco

mm

on w

ords

X2:

Mea

a. s

ente

nce

leng

thin

wor

ds

Rea

ding

Eas

e =

206

.835

-. 8

46w

1M

cCal

l--

1. 0

15si

Cra

bbs

pas-

sage

sH

uman

Int

eres

t = 3

.635

+ .3

14Pw

ps

Rea

ding

C50

= 1

579X

1 +

.049

6X2

+gr

ade

leve

lne

eded

to3.

6365

answ

er 5

0%of

que

stio

nsco

rrec

tly (

Cso

)

X1:

Med

ian

sent

ence

leng

th G

rade

leve

l

X2:

90t

h pe

rcen

tile

sent

ence

leng

th3:

Perc

ent h

ard

wor

ds (

not

on D

olch

list

of

100C

wor

ds).

Perc

ent o

f w

ords

in e

ach

thou

sand

of

Tho

rndi

ke20

,000

wor

d lis

t

wl:

Wor

d le

ngth

(sy

llabl

espe

r 10

0 w

ords

)dw

: per

cent

age

of "

defi

nite

wor

ds."

nose

s: N

umbe

r of

one

-syl

-la

ble

wor

ds

41: M

ean

sent

ence

leng

th

Mea

n se

nten

ce le

ngth

, per

-ce

nt o

f w

ords

of

3 or

mor

esy

llabl

es

Loo

k up

thre

e va

riab

les

in ta

bles

,av

erag

e th

e ta

bled

val

ues.

Inst

ruct

iona

l, T

abul

ate

wor

ds in

pas

sage

s by

Inde

pend

ent,

Tho

rndi

ke th

ousa

nds.

For

in-

and

aver

age

stru

ctio

nal l

evel

, 90%

of

wor

dsgr

ade

leve

lm

ust b

e be

low

stu

dent

's R

GL

.Fo

r in

depe

nden

t lev

el 9

5%.

R: L

evel

of

R =

168

.095

+ .5

32dw

- .8

11w

1ab

stra

ctio

n"on

arb

itrar

ysc

ale

Rel

ativ

edi

ffic

ulty

Fog

Inde

x(t

he r

eadi

nggr

ade

leve

lne

eded

tore

ad a

nd u

nder

-st

and

mat

eria

l)New

Rea

ding

Eas

e In

dex

=1.

599n

- 1.

015

51 -

31.

517

Fog

Inde

x =

.4 x

(m

ean

sen-

tenc

e le

ngth

+ p

erce

nt o

fw

ords

of

3 or

mor

e sy

Lla

bles

)

Ran

ee o

fM

ater

ial

Stud

ied

Gra

des

3-12

Met

hod

of c

om-

pari

son

with

crite

rion

McC

all-

Gra

des

3-C

iabb

s pa

s-12

, 3-1

6sa

ges,

sch

ool f

or h

ealth

mat

eria

l,pa

ssag

eshe

alth

art

i-cl

es

Rea

ding

text

s

Scho

ol te

xts

Adu

lt, p

ub-

lic r

elat

ions

Gen

eral

adul

t and

scho

ol

Cor

rela

tion

R.E

.: .7

0H

. I. :

. 43

Cor

rela

tion

.70

Gra

des

1-In

spec

tion

6 Gra

des

3-C

orre

latio

n .7

212 Fu

ll ra

nge

Cor

rela

tion

.95

of F

lesc

h"r

eadi

ngea

se"

valu

es

Gra

des

6-In

spec

tion

12

Aut

hor(

s)D

ate

Cri

teri

on

McE

lroy

1953

(AF

Mau

ual

11-3

)

Var

iabl

esPr

edic

ted

Val

ueFo

rmul

a

Typ

e of

Mat

eria

lSt

udie

d

Ran

ge o

fM

etho

d of

com

-M

ater

ial

pari

son

with

Stud

ied

crite

rion

Forb

es &

1953

Mea

n va

lue

Cot

tleof

fiv

e ex

ist-

ing

read

abili

-ty

for

mul

as

Spac

he19

53G

rade

leve

las

sign

ed b

ypu

blis

hers

Fles

ch19

54O

bser

ved

char

acte

rist

ics

of e

asy

and

hard

mat

eria

l

Whe

eler

&19

54G

rade

leve

lC

t)Sm

ithas

sign

ed b

yto

publ

ishe

rs

Tri

be19

56M

cCal

l-C

rabb

s te

stsc

ores

(re

-vi

sed

form

)

Gill

ie19

57Fl

esch

"le

vel

of ^

bst

ract

ion"

scor

e.

Eas

y w

ords

(pr

onou

nced

in1-

2 w

ords

), a

ssig

ned

valu

e 1

hard

wor

ds (

pron

ounc

ed in

3or

mor

e sy

llabl

es),

ass

igne

dva

lue

of 3

Mea

n T

horn

dike

voc

abu-

lary

dif

ficu

lty in

dex,

wor

dsin

4th

thou

sand

and

abo

ve.

X1: .

ylea

n se

nten

ce le

ngth

X: P

erce

ntag

e of

wor

ds n

ot2

on D

ale

list o

f 76

9 ea

syw

ords

Fog

Cou

nt(a

rbitr

ary

valu

e) a

ndre

adin

g gr

ade

leve

l

Rea

ding

grad

e le

vel

Gra

de le

vel

- N

umbe

r of

ref

eren

ces

toA

rbitr

ary

hum

an b

eing

s, e

tc. ,

per

100

sca

les

ofw

ords

real

ism

("r

")-

Num

ber

of in

dica

tions

of

and

ener

gyvo

ice

com

mun

icat

ion

per

("e"

)10

0 w

ords

.

Mea

n le

ngth

of

"uni

ts"

(sim

- R

eadi

ngila

r to

sen

tenc

es),

per

cent

age

grad

eof

mul

ti-sy

llabl

e w

ords

.le

vel

Fog

Cou

nt p

er s

ente

nce

= s

um o

fl's

and

3's

in s

ente

nce

RG

L =

sum

of

l's a

nd 3

's d

ivid

edby

num

ber

of s

ente

nces

.If

val

ueis

ove

r 20

, div

ide

by 2

.If

bel

ow20

, sub

trac

t 2, t

hen

divi

de b

y2.

Loo

k up

mea

n vo

cabu

lary

inde

xSt

anda

rd-

Gra

de 5

-C

orre

latio

n .9

6in

tabl

es to

obt

ain

RG

Liz

ed te

sts

colle

ge

Gra

de le

vel =

.141

X +

.086

X2

Gra

desc

hool

3+

.839

test

s

Gra

des

1-C

orre

latio

n .8

2

"r"

scor

e: ta

bled

val

ue c

orre

spon

d-in

g to

obs

erve

d nu

mbe

r of

ref

eren

c-es

to h

uman

bei

ngs,

etc

."e

" sc

ore:

tabl

ed v

alue

cor

resp

ond-

ing

to'rv

ed n

umbe

r of

indi

ca-

tions

of

dice

com

mun

icat

ion

Gen

eral

adul

tm

ater

ial

Inde

x =

10

(mea

n le

ngth

of

units

x S

choo

lpe

rcen

t mul

ti-sy

llabl

e w

ords

).te

xts

Loo

k up

inde

x in

tabl

e to

obt

ain

RG

L

Adu

ltIn

spec

tion

Gra

des

I-In

spec

tion

4

X :

Mea

n se

nten

ce le

ngJi

Rea

ding

C50

= .0

719X

1 +

.104

3X5

+M

cCal

l-G

rade

s 3-

1gr

ade

leve

lC

rabb

s pa

s-12

need

ed to

2.93

47sa

ges,

195

0X

s: P

erce

ntag

e of

wor

ds n

ot a

nsw

er 5

0%-C

orre

ctio

n is

then

app

lied

thro

ugh

revi

sion

on R

insl

and

wor

d lis

t,of

que

stio

nsre

fere

nce

to ta

ble

of D

ale

and

mul

tiplie

d by

100

.co

rrec

tly (

C50

) C

han

(194

8)

X :

Num

ber

of f

inite

ver

bsA

bstr

actio

nA

bstr

actio

n le

vel =

36

++

X1

1 pe

r 20

0 w

ords

leve

l (ar

-bi

trar

y sc

ale)

X2:

Num

ber

of d

efin

ite a

rti-

cles

with

nou

ns p

er 2

00w

ords

X.,:

Num

ber

of n

ouns

of

abst

ract

ion

per

200

wor

ds.

- (2

X 3

)

Scho

olG

rade

4-

Cor

rela

tion

.83

text

s,co

llege

gene

ral

adul

t ma-

teri

al

Aut

hor(

s)D

ate

Cri

teri

onV

aria

bles

Pred

icte

dV

alue

Form

ula

ype

ofrt

ange

or

Mat

eria

lM

ater

ial

Stud

ied

Stud

ied

Met

hod

of c

om-

pari

son

with

crite

rion

Ston

e19

57G

rade

leve

las

sign

edby

publ

ishe

rs

Pow

ers,

Sum

ner,

&K

earl

Fles

ch

1958

McC

all-

C r

ab b

s T

est

Les

sons

, 195

0re

visi

on

1958

Obs

erve

dch

arac

teri

stic

sof

wri

tten

mat

eria

l

Blo

omer

1959

Gra

de le

vel

assi

gned

by

publ

ishe

rs

Jaco

bson

1965

Mea

n nu

m-

ber

of w

ords

stat

ed a

s no

t-un

ders

tood

by

read

ers

X :

Num

ber

of f

inite

;ver

bsI

per

200

wor

ds

X :

Num

ber

of d

efin

ite2

artic

les

with

nou

ns p

er20

G w

ords

X3:

Num

ber

of n

ouns

of

ab-

stra

ctio

n pe

r 20

0 w

ords

Rec

alcu

latio

n of

fou

r ex

ist-

ing

form

ulas

: Fle

sch

Rea

d-in

g E

ase,

Dal

e-C

hall,

Far

r,Je

nkin

s, a

nd P

ater

son,

and

Gun

ning

. Var

iabl

es a

ndun

chan

ged.

Coe

ffic

ient

s al

tere

d as

ap-

prop

riat

e

Abs

trac

tion

Abs

trac

tion

leve

l = 3

6X

2X

1le

vel (

ar-

bitr

ary

scal

e)-

(2X

3)

XC

SO

XC

50

XC

SO

XC

SO

Cou

nt o

f ev

iden

ces

of in

for-

Arb

itrar

ym

ality

: cap

italiz

atio

n, u

n- s

cale

of

derl

inin

g, p

unct

uatio

n, s

ym-

"for

mal

ity-

bols

, beg

inni

ngs

or e

nds

ofpo

pula

rit

para

grap

hs, e

tc.

- N

umbe

r of

wor

ds p

er m

odi-

Lev

el o

ffi

erab

stra

ctio

n-

Soun

d co

mpl

exity

of

mod

-if

iers

X1:

Mea

n nu

mbe

r of

wor

dsY

: Mea

nnu

mbe

r of

per

inde

pend

ent c

laus

ew

ords

not

know

n pe

rX

2: C

once

ntra

tion

of m

athe

- sa

mpl

e

mat

ical

term

s

X3:

Con

cent

ratio

n of

wor

ds

outs

ide

of m

ost c

omm

on6,

000

on T

harn

dike

list

X4:

Con

cent

ratio

n of

wor

dsno

t on

Pow

er's

list

of

es-

sent

ial s

cien

tific

term

s.

Scho

olG

rade

4-

text

s, g

en-

colle

geer

al a

dult

mat

eria

I

Pesc

h:X

C50

= -

2.2

029

McC

all-

+ .0

778

(sen

tenc

e le

ngth

).0

445

Cra

bbs

pas-

(syl

labl

es p

er 1

0G w

ords

)sa

ges,

195

0re

visi

onD

ale-

C h

all:

Xc5

0=

3. 2

672

+ .

0596

McC

all-

Cra

bbs

pas-

(sen

tenc

e le

ngth

) -

.064

8 (p

erce

nt s

ages

, 195

0of

wor

ds n

ot o

n D

ale

list)

revi

sion

Cor

rela

tion

.83

Gra

des

3-C

orre

latio

n .6

412 G

rade

s 3-

12

Farr

-Jen

kins

-Pat

erso

n:M

cCal

l-G

rade

s 3-

Xcs

o=

8.4

355

' .09

23C

rabb

s pa

s-12

(sen

tenc

e le

ngth

) -

.064

8 (p

erce

nt s

ages

, 195

0of

one

syl

labl

e w

ords

)re

visi

on

Gun

ning

: Xcs

o=

3.0

680

+ .0

877

McC

all-

Gra

des

3-C

rabb

s pa

s-12

(sen

tenc

e le

ngth

)* .0

984

(per

cent

sag

es, 1

950

of o

ne s

ylla

ble

wor

ds)

revi

sion

Find

tota

l num

ber

of e

vide

nces

of

Popu

lar

toA

dult

info

rmal

ity p

er s

ampl

e, u

se ta

ble

scho

larl

yto

det

erm

ine

form

ality

-pop

ular

ity p

erio

dica

lsca

tego

ry.

Gra

de le

vel p

redi

cted

fro

m th

eR

eadi

ngva

riab

les

liste

d.Pr

oced

ure

not

text

sde

scri

bed.

Phys

ics

text

s: Y

= -

.001

3 +

29.7

059X

24

21.1

19X

3 4

35.0

029X

4

Che

mis

try

text

s: Y

= .0

C3

1706

X1

4 13

.723

1 X

2 -

43. 7

262X

3-

2. 3

577X

4

Phys

ics

&C

hem

istr

yte

xts

Cor

rela

tion

.71

Cor

rela

tion

.S8

Cor

rela

tion

.59

on

Gra

des

1-C

orre

latio

n .7

86 H

igh-

scho

ol &

colle

ge

Cor

rela

tion

Phys

ics

text

s:.7

0C

hem

istr

y te

xts:

.67

mel-tr-74-29 - eric · document.resume sd 097 642 cs 001 383 author williams, allan r., jr.; and...

Documents