ANALYTICAL REASONING SKILLS
INVOLVED IN GRADUATE STUDY:
PERCEPTIONS OF FACULTY IN
SIX FIELDS
Donald E. Powers Mary K. Enright
GRE Board Professional Report GREB No. 83-23P ETS Research Report 86-43
December 1986
This report presents the findings of a research project funded by and carried out under the auspices of the Graduate Record Examinations Board.
Copyright © 1986 by Educational Testing Service. All rights reserved.
Acknowledgments
The authors wish to thank the members of the GRE Research Committee for helpful suggestions; Laurie Barnett for programming the analyses; and Neal Kingston, William Ward, and Cheryl Wild for helpful reviews of the report.
Abstract
This study was intended to provide information on the role of analytical abilities in graduate study. Graduate faculty in six fields of study (chemistry, computer science, education, English, engineering, and psychology) were asked to judge:
(a) the importance for academic success of a wide variety of analytical skills
(b) the seriousness of various reasoning errors
(c) the degree to which a variety of “critical incidents” had affected their estimations of students’ analytical abilities.
Faculty members were generally able to discriminate among the various skills, errors, and incidents they were asked to consider, although the vast majority of skills, for example, were rated to be at least moderately important on average. Some skills were viewed as extremely important in all disciplines. More typically, however, disciplines varied quite markedly with respect to faculty perceptions of the importance, seriousness, and impact of these skills, errors, and incidents.
Several relatively interpretable dimensions were found to underlie particular clusters of reasoning skills. These dimensions or factors involved (a) the analysis and evaluation of arguments, (b) the drawing of inferences and the development of conclusions, (c) the definition and analysis of problems, (d) the ability to reason inductively, and (e) the generating of alternative explanations or hypotheses. These dimensions were judged to be differentially important for success in the six disciplines included in the study.
The study results would appear to have definite implications for developing future editions of the GRE analytical ability measure. For instance, some reasoning abilities that were perceived as very important are well represented in the current version of the analytical measure but others are not. That some abilities appear to be much more important in some fields than in others would imply the need to carefully balance the measurement of various reasoning skills to ensure fairness to test takers from different fields of study. Fairness should be aided by including for measurement those skills rated as consistently important across all disciplines. Finally, the several dimensions underlying various clusters of reasoning skills should be applicable to extending the test specifications for the current version of the analytical measure.
Analytical Reasoning Skills Involved in Graduate Study: Perceptions of Faculty in Six Fields
Despite the complexity of human cognitive abilities, standardized admissions tests have tended to focus almost exclusively on the measurement of broadly applicable verbal and quantitative aptitudes. One criticism of such omnibus verbal and quantitative ability measures is that they provide only limited descriptions of students' academic strengths and weaknesses, and that they therefore do not adequately reflect test takers' differential development in other important cognitive areas.
In 1974 the GRE Board approved a plan to restructure the GRE Aptitude Test in order to allow examinees to demonstrate a broader range of academic skills (Altman, Carlson, & Donlon, 1975). A survey of constituents revealed that, of several possible new areas of measurement (e.g., abstract reasoning, scientific thinking, and study skills), graduate faculty, administrators, and students were most receptive to assessing analytical or abstract reasoning skills (Miller & Wild, 1979). Developmental activities then followed and, after careful psychometric study of several alternative analytical item types, four distinct kinds of items were selected for the new analytical section of the GRE Aptitude Test, which was introduced operationally in the 1977-78 testing year. Graduate institutions were cautioned against using the scores from the new analytical section until further evidence could be generated on the validity of the new measure. Subsequently, the administration of the new measure to large numbers of examinees under operational conditions enabled the further collection of information about the new measure.
Some research strongly suggested the promise of the analytical section: it appeared to measure an ability that was distinguishable from the verbal and quantitative abilities measured by the test (Powers & Swinton, 1981), and the score derived from it was related to successful performance in graduate school (Wilson, 1982). Unfortunately, however, further research suggested serious problems with the two item types (analysis of explanations and logical diagrams) that constituted the bulk of the analytical section. Performance on these item types was shown to be extremely susceptible to special test preparation (Swinton & Powers, 1983; Powers & Swinton, 1984) and to within-test practice (Swinton, Wild, & Wallmark, 1983). Consequently, in 1981 the two problematic item types were deleted from the test, and additional numbers of analytical reasoning and logical reasoning items, which constituted a very small part of the original analytical measure, were inserted.
The most recent research on the General Test (Stricker & Rock, 1985; Wilson, 1984) has given us some reason to question both the convergent and the discriminant validity of the two remaining item types. Specifically, the two currently used analytical item types correlate more highly with other verbal items or with other quantitative items than they do with each other. Moreover, after reviewing the psychological and educational research literature on reasoning, Duran, Powers, and Swinton (in press) concluded that the two currently used GRE analytical item types reflect only a limited portion of the reasoning skills that are required of graduate students. The most notable omission is the assessment of inductive reasoning skills, i.e., reasoning from incomplete knowledge, where the purpose is to learn new subject matter, to develop hypotheses, or to integrate previously learned materials into a more useful and comprehensive body of information. Thus, it seemed, the analytical ability measure of the GRE General Test might be improved through further effort.
The objective of the study reported here was to generate information that might guide the development of future versions of the GRE analytical measure. More specifically, the intention was to gain a better understanding of what reasoning (or analytical) skills are involved in successful academic performance at the graduate level, and to determine the relative importance of these skills or abilities both within and across academic disciplines. It was thought that this information might be especially useful for developing additional analytical item types.
As mentioned earlier, the initial version of the GRE analytical ability measure was developed after a survey had suggested the importance of abstract reasoning to success in graduate education. This survey, however, was not designed to provide any detailed information on the importance of specific analytical skills, as was the intention here.
Method
Questionnaire Development
Initially, 30 department chairs (in English, education, engineering, chemistry, computer science, or psychology) were contacted in 30 graduate institutions, and asked to identify three faculty members in their departments who would be willing to provide their insights into the analytical or reasoning skills that are most critical for successful performance in graduate school. These 30 institutions were chosen from the GRE Directory of Graduate Programs in such a way as to ensure some degree of geographical representation. All of these departments require or recommend that applicants submit GRE General Test scores; it was felt that these departments might be more interested than nonrequiring departments in efforts to improve the GRE General Test.
At this preliminary stage, faculty members were informed of the purpose of the project and asked to give, in an open-ended fashion, examples of:
(a) the analytical, reasoning, or thinking skills they perceived as most important for successful graduate study in their fields (e.g., identifying assumptions on which an argument is based), particularly as these skills differentiate successful from marginal students
(b) specific critical incidents related to thinking or reasoning that caused them to either raise or lower their estimation of a student's analytical ability (e.g., failing to qualify a conclusion as appropriate)
(c) particular reasoning or thinking "flaws" they have observed in their students (e.g., using the conclusion as the premise of an argument).
Usable responses were obtained from 33 faculty members, who suggested a total of 138 important reasoning or thinking skills, 86 critical incidents, and 75 reasoning "flaws." Some of these responses were duplicates. Several other respondents did not specify discrete skills or errors but chose rather to send helpful discursive replies to our inquiry. All responses were condensed and edited, and generally evaluated with respect to whether they should be included in the larger, more structured questionnaire that was planned. Some responses constituted usable questionnaire items essentially as stated by respondents (e.g., "the ability to break complex problems into simpler components"). Other responses were revised or eliminated because they were too general (e.g., "the ability to think independently"), and others because they were too specific or applied only to a particular field (e.g., "the ability to resolve into enthymemic form any argumentative work" or "the ability to take ecological validity into account").
The structured questionnaire was constructed on the basis of this preliminary survey, on a review of relevant literature (Duran, Powers, & Swinton, in press), and on a number of additional books or texts on reasoning (e.g., Campbell, 1974; Fischer, 1970; Johnson & Blair, 1983; Kahane, 1984; Nosich, 1982; Salmon, 1984; Scriven, 1976; Toulmin, Rieke, & Janik, 1984; Wason & Johnson-Laird, 1972; and Weddle, 1978). Several other articles, e.g., a seminal work by Ennis (1962) and a list of skills by Arons (1979), proved especially useful. Various issues of CT News, published by the Critical Thinking Project at California State University at Sacramento, were also perused. Previous work on critical incidents in graduate student performance (Reilly, 1974a, 1974b) was also consulted, and several of the incidents related to critical facility were included in the present study. Finally, the list generated by Tucker (1985), who gathered the impressions of ETS test development staff, philosophers, and cognitive psychologists, also proved to be a valuable resource.
The final questionnaire (see Appendix A) was structured to include questions about the importance and frequency of various reasoning skills, of commonly observed errors in reasoning, and of specific incidents that may have led faculty to adjust their estimates of students' analytical abilities. Questions were grouped under several headings, mainly to give respondents some sense of their progress in responding to the rather lengthy questionnaire.
The Sample
Six academic fields (English, education, psychology, chemistry, computer science, and engineering) were included in the final survey. These fields were thought to represent the variety of fields of graduate study and the variation in the kinds of reasoning abilities involved in graduate education. Using the data tapes of the Higher Education General Information Survey (HEGIS), nonoverlapping samples of 64 graduate institutions with doctoral programs were drawn for each of the six graduate fields. A random sampling procedure was used such that eight institutions from each of the eight HEGIS geographic regions were selected for each field. This sampling was greatly facilitated by the work of Oltman (1982). The admission requirements of these institutions were determined from the Directory of Graduate Programs (GRE/CGS, 1983), and only those that either required or recommended GRE General Test scores were included in the sample. In this manner, 40 institutions were selected for the final sample for each field. In addition, one institution with a relatively large proportion of Black students and one with a relatively large percentage of Hispanic students were included in the samples for each field, thus raising the total number of institutions to 42 per field. Letters were then sent to departmental chairpersons, who were asked to nominate two faculty members who would be willing to complete the questionnaire. Respondents were paid $25 for their participation.
Data Analysis
Means and standard deviations were calculated for each question by academic field of study, and analyses of variance were run for each question to assess differences among the six fields. The various ratings were correlated within questionnaire sections. For example, within the section on reasoning skills, the ratings of frequency and importance were correlated; within the section on reasoning errors, the ratings of frequency and seriousness were correlated.
Finally, within each section (and for each kind of rating), the data were factor analyzed to effect some reduction in the large number of questions. A principal axis factoring, with squared multiple correlations as the initial estimates of communalities, was used to determine the number of factors to be retained for each section, according to both the magnitude of the eigenvalues and the breaks in their size. (Our inclination was to err on the side of retaining too many factors at this exploratory stage.) Various numbers of factors were then rotated according to the varimax criterion. Although oblique rotations could also have been used, it was felt that the detection of uncorrelated factors would best serve the objectives of further test development.
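For readers unfamiliar with the procedure, the extraction and rotation steps described above can be sketched in a few lines of linear algebra. The following Python fragment is an illustrative reconstruction, not the software used for the original analyses; the data are simulated and the choice of two factors is arbitrary.

```python
import numpy as np

def principal_axis(R, n_factors):
    """Principal axis factoring: replace the diagonal of the correlation
    matrix with squared multiple correlations (SMCs) as initial communality
    estimates, then eigendecompose the reduced matrix."""
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))   # SMC of each variable
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)
    eigvals, eigvecs = np.linalg.eigh(R_reduced)
    order = np.argsort(eigvals)[::-1]              # largest eigenvalues first
    keep = order[:n_factors]
    loadings = eigvecs[:, keep] * np.sqrt(np.maximum(eigvals[keep], 0))
    return eigvals[order], loadings

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation (Kaiser, 1958)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    obj = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag(np.sum(rotated**2, axis=0))
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated**3 - rotated @ col_ss / p))
        rotation = u @ vt
        new_obj = s.sum()
        if new_obj < obj * (1 + tol):
            break
        obj = new_obj
    return loadings @ rotation

# Simulated ratings: 200 raters x 10 items driven by two latent clusters
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
ratings = factors @ rng.normal(size=(2, 10)) + 0.5 * rng.normal(size=(200, 10))
R = np.corrcoef(ratings, rowvar=False)

eigvals, L = principal_axis(R, n_factors=2)
L_rot = varimax(L)
```

Rotation redistributes variance among the factors but leaves each item's communality (the row sum of squared loadings) unchanged, which is why the report can describe factor-by-factor shares of a single pool of common variance.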
Results
The Sample
A total of 165 chairpersons (65% of those contacted) nominated a total of 297 faculty members, of whom 255 (86%) returned usable questionnaires. The response rates across fields were generally comparable.
Full professors constituted a slight majority of the responding sample (51%); associate professors made up the next largest proportion (34%). About 13% were assistant professors, and the remaining small proportion were deans, associate deans, or lecturers.
Item-level Results
Tables 1-3 show the mean ratings by discipline for each question included in the survey instrument. The numbers in the total column are the grand means for all disciplines. Numbers under each discipline represent for each item the deviations from these means. The F tests in the right-most column indicate whether the means are significantly different among the six disciplines. Because the average ratings, over all respondents, for “frequency of use” and “importance for success” correlated .99, only the importance ratings are presented for reasoning skills. Likewise, only the “seriousness” ratings are presented for reasoning errors, since their correlation with frequency ratings was .98, and, for critical incidents, only the average “effect” ratings are presented, since their correlation with frequency ratings was .94.
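The table layout just described--a grand mean per item plus per-discipline deviations, with a one-way ANOVA F test across disciplines--can be reproduced mechanically. The sketch below uses made-up ratings and illustrative discipline labels; it is not the original analysis.

```python
import numpy as np

def item_summary(ratings_by_group):
    """For one questionnaire item's ratings grouped by discipline, return
    the grand mean, each group's deviation from it, and the one-way ANOVA
    F statistic testing for mean differences among the groups."""
    groups = [np.asarray(g, dtype=float) for g in ratings_by_group]
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()
    deviations = [g.mean() - grand_mean for g in groups]
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return grand_mean, deviations, f_stat

# Hypothetical 5-point importance ratings for one skill in three fields
chemistry   = [5, 4, 5, 4, 5]
english     = [2, 3, 2, 3, 2]
engineering = [4, 4, 5, 4, 4]

grand, devs, F = item_summary([chemistry, english, engineering])
```

A large F, as here, indicates that the discipline means differ by more than would be expected from the within-discipline variability alone; with equal group sizes the deviations in the discipline columns sum to zero by construction.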
Tables 1-3 show a substantial number of significant differences among disciplines with respect to the importance placed on various reasoning skills (Table 1), the seriousness with which they regard particular kinds of reasoning errors (Table 2), and the impact that various critical incidents have on the estimation of students’ analytical abilities (Table 3). Table 4, showing only the very highest rated skills and most critical errors and incidents, gives a flavor of the differences among these six disciplines. For example, chemistry faculty placed a high premium on being able to generate hypotheses, questions, or experiments, to draw sound inferences from observations, and to analyze and evaluate previous research. English faculty, on the other hand, saw greater importance in skills involving argumentation--being able to understand, evaluate, analyze, elaborate, recognize, and support aspects of an argument.
Faculty in the six disciplines also appeared to have quite different views as to the numbers of skills that were important in their respective disciplines. The numbers of reasoning skills that received average ratings of 4.0 or higher varied markedly by discipline as follows: 23 for chemistry, 5 for computer science, 27 for education, 22 for engineering, 29 for English, and 26 for psychology. These differences may have arisen, for example, from our particular choice of questions, from differences in standards among disciplines, or from some other factor(s).
It can be seen, even from Table 4, however, that some skills were
viewed as very important by several disciplines. For example, “breaking down complex problems into simpler ones” was rated as the single most important skill (of the 56 skills listed) in both computer science and engineering. “Determining whether conclusions are logically consistent with, and adequately supported by, the data” was rated as one of the three most important skills by both education and psychology faculty; “drawing sound inferences from observations” was the highest rated skill in chemistry and nearly the highest in education.
The extent to which faculty in different disciplines agreed on the importance of various skills, errors, or incidents can be examined in a slightly different manner. To get some idea of the skills, errors, and incidents that were viewed as relatively important, and for which average ratings did not differ significantly across disciplines, Table 5 was prepared. This table shows only those skills that received average ratings of importance of more than 3.5 over all six disciplines combined, and for which analyses of variance did not detect any significant differences among disciplines.
“Reasoning or problem solving in situations in which all the needed information is not known” was the skill rated as most important overall. Such skills as “detecting fallacies and logical contradictions in arguments,” “deducing new information from a set of relationships,” and “recognizing structural similarities between one type of problem or theory and another” were the next most highly rated skills. These were followed closely by “taking well-known principles and ideas from one area and applying them to a different specialty,” “monitoring one’s own progress in solving problems,” and “deriving from the study of single cases structural features or functional principles that can be applied to other cases.”
Table 6 lists the reasoning errors and critical incidents that were judged overall to be the most serious or to have the most effect on the estimation of students’ abilities. Three errors/incidents were judged to be most serious or critical: “accepting the central assumptions in an argument without questioning them,” “being unable to integrate and synthesize ideas from various sources,” and “being unable to generate hypotheses independently.”
It should be noted that there are many other decision rules, based on average ratings and differences among disciplines, that could have been used here to form a “common core” of skills or errors/incidents. Tables 1-3 could be consulted to apply alternative rules.
Factor Analytic Results
To condense the many questions into a more manageable form, factor analyses were computed for each section of the questionnaire.
For the section on reasoning skills, only the importance ratings were analyzed because they were so highly correlated with frequency ratings. Because frequency ratings were slightly less correlated with ratings of seriousness and criticality in the other two sections, they too were analyzed for the questionnaire sections on reasoning errors and critical incidents.
The reader should bear in mind that the factors resulting from this analysis should not be construed as representing dimensions of analytical ability, but rather only as reflecting the dimensions that underlie faculty perceptions of analytical abilities. These dimensions merely reflect the extent to which graduate faculty tended to rate certain skills as about equally important (or equally unimportant), not the degree to which these dimensions represent “factors of the mind.” Thus, the results presented below are intended to provide a parsimonious representation of faculty perceptions rather than a basis for postulating distinct analytical abilities.
Reasoning skills. For the ratings of importance of reasoning skills, the largest eigenvalues were 16.4, 3.9, 2.4, 1.6, and 1.1, and the application of a scree test (Cattell, 1966) suggested the appropriateness of a five-factor solution, which was then rotated according to the varimax criterion (Kaiser, 1958). The five-factor varimax rotation accounted for 80% of the common variance. The factor loadings and communalities are given in Appendix B. Table 7 summarizes the variables that were most instrumental in defining each factor.
Factor I, which accounted for about a third of the common variance, was characterized by highest loadings, generally, from skills involving arguments. Thus, Factor I seems to involve a kind of critical thinking related to argumentation.
Factor II accounted for about 29% of the common variance, and was defined primarily by variables related to the drawing of conclusions, e.g., generating valid explanations, supporting conclusions with sufficient data, and drawing sound inferences from observations. The conclusion-oriented skills that define this second critical thinking factor would seem to be of a more active or productive nature, involving the construction of inferences or conclusions, rather than evaluating the soundness of arguments or inferences, as is the case for Factor I.
Factors III-V each accounted for a somewhat smaller proportion of common variance (10% - 15%) than did Factors I and II. Factor III is best defined by skills related to defining and setting up problems or analyzing their components as a prelude to solving them. Factor IV is best characterized by inductive reasoning skills, i.e., the drawing of conclusions that have some evidential support, but not enough to indicate logical necessity. Factor V is somewhat difficult to define, but, by virtue of its two highest loadings, it seems to reflect an ability to generate alternatives.
Reasoning errors. For the ratings of seriousness of reasoning errors, the largest eigenvalues were 6.5 and 1.1, and the two factors accounted for 96% of the common variance. (Frequency ratings were also factor analyzed and are presented in Appendix B. Because the results were so similar to the analysis of seriousness ratings, they are not discussed here.) As shown in Table 8, Factor I, which explained about 52% of the common variance, was characterized by loadings from errors involved in the evaluation of evidence, e.g., offering irrelevant evidence to support a point. Factor II, on the other hand, seemed to involve more formal logical errors, particularly as related to reasoning with more statistically oriented material--for example, failing to take account of a base rate, failing to recognize differences between populations and samples, and confusing correlation with causation.
Critical incidents. As for reasoning errors, only ratings of the effects of critical incidents, not their frequencies, were factor analyzed. This analysis yielded eigenvalues of 8.2, 1.6, and 0.9 for the largest factors, and this three-factor solution accounted for 89% of the common variance. Table 9 summarizes the results, and the complete set of loadings is given in Appendix B.
Factor I, explaining about 38% of the common variance, was best defined by highest loadings from such incidents as accepting/supporting arguments based more on emotional appeal than evidence, offering nonconstructive or unsound criticism of other students’ presentations, and confusing anecdote and/or opinion with “hard data.” This factor appears to involve critical facility.
Factor II, accounting for about 34% of the common variance, appears to involve the ability to consider or to generate alternatives, being defined primarily by high loadings from such incidents as accepting conclusions without critically evaluating them, being able to criticize but unable to suggest better alternatives, being unable to integrate ideas from various sources, and being unable to generate hypotheses.
Factor III, defined by such incidents as applying a formula or rule without sufficient justification, and searching for a complicated solution when a simpler one is obvious, is difficult to interpret. One possible characterization might be a kind of rationality or critical facility that is sometimes referred to as practical judgment or perhaps “common sense.”
Scales based on reasoning skill factors. Table 10 gives the average scores for each discipline on scales composed of the questionnaire items that best defined the reasoning skill factors discussed earlier. As is clear, there are substantial differences among disciplines on these scales. Skills involved in analyzing/evaluating arguments (Scale 1) were rated as extremely important in English (m = 4.53), quite important in education (m = 3.83) and psychology (m = 3.73), and somewhat less important in the other three disciplines, particularly computer science (m = 2.97).
Critical thinking skills involved in developing or otherwise dealing with conclusions (Scale 2) were viewed as very important (means greater than 4.0) in all disciplines except computer science.
Abilities involved in analyzing and defining problems (Scale 3) were rated as extremely important in computer science (m = 4.05) and engineering (m = 4.00), but less important in other disciplines, especially English (m = 2.70).
The inductive reasoning skills reflected on Scale 4 were rated as moderately important in each of the six disciplines. The skills composing Scale 5, generating alternatives/hypotheses, were rated very high in psychology (m = 4.21) and in education (m = 3.93), and as somewhat less important in other disciplines, particularly computer science (m = 2.90).
Other Comments from Respondents
A number of general comments were made about the study--some positive and some negative. The study was described alternately as “very well done” and “interesting,” but also, by one respondent, as a “complete waste of time.” Most of the comments were positive, however, and many pertained more specifically to the kinds of questions that were asked. The consensus seemed to be that the questionnaire was not easy to complete. Moreover, faculty in the several disciplines sometimes had different ideas as to what kinds of questions would have been appropriate. For example, one English faculty member noted the lack of questions on the use of language in critical writing, and a computer science faculty member observed that questions on abilities involved in formulating proofs, which are vital to success in computer science, were only partially covered in the questionnaire. An education faculty member noted that the survey did a better job of assessing skills associated with hypothesis-testing than with other research skills.
Along these same lines, a number of other respondents also believed that the questions were more relevant to other disciplines than to theirs. Several computer science professors, for example, characterized the questions as oriented more toward argument than problem solving, in which they had greater interest. An engineering professor said that some of the questions were more pertinent to educational research than to scientific or technical research, and one English faculty member found that the questions seemed “geared to the hard sciences.” Finally, some noted ambiguities or redundancies, or lamented that the questions were “too fine.” Even with these difficulties, however, most of the comments about questions were positive: “Items seem especially well chosen,” “questions are appropriate,” “questions were quite thorough,” “a good set of questions,” “topics covered are critical,” and “your lists are right on target.” The majority of comments, therefore, suggested that the questionnaire was pitched at about the right level and included appropriate kinds of reasoning skills.
A number of comments were made about the relationship between subject matter and analytical skills, e.g., that successful problem solving is predicated on having specific knowledge in a field. One respondent believed that the questionnaire downplayed the importance of “context effects” in favor of “strict reasoning ability,” and another noted that the measurement of analytical abilities is quite discipline specific. Another commented on the difficulty of measuring analytical ability without regard to the amount of knowledge available.
Several faculty commented on the development of analytical skills in graduate school and on the differential importance of these skills at various stages of graduate education. One respondent said, “I rated entering behavior or behavior across the entire program (courses, internships, dissertation). If I were to rate the dissertation experience alone, the ratings would have been much higher.” Many noted that by the end of their programs, skills would be expected to increase and various reasoning errors could be expected to occur less frequently: “Entering students are more likely to make these errors and graduates to make far fewer.” Another said, “In some sense, the essence of graduate training is analytical skills. These are skills which students acquire. When they enter they make most of the mistakes you mentioned. If they can’t learn, they leave the program.” Another said, “I’m more concerned about the presence of these behaviors after my course than before it. One simply does not harshly judge a beginning student who makes an error, but one could be very critical of a student about to finish a Ph.D. thesis....”
Discussion
Summary
Some 255 graduate faculty in six fields of study (chemistry, computer science, education, engineering, English, and psychology) (a) rated the importance for success in their programs of a wide variety of reasoning skills, (b) judged the seriousness of various reasoning errors, and (c) indicated the degree to which a variety of “critical incidents” affected their estimations of students’ analytical abilities.
Faculty members were able to discriminate among the various skills they were asked to consider, although the vast majority of skills were seen, on average, as at least moderately important. Faculty also made distinctions with respect to both the seriousness of different kinds of reasoning errors and the effects of various critical incidents. Individually, such general skills as “reasoning or problem solving in situations in which all the needed information is not known” were viewed as extremely important in all disciplines. Such reasoning errors and critical incidents as “accepting the central assumptions in an argument without questioning them,” “being unable to integrate and synthesize ideas from various sources,” and “being unable to generate hypotheses independently” were judged to be very serious in all disciplines.
More typically, however, disciplines varied quite markedly with respect to faculty perceptions of the importance, seriousness, or impact of various reasoning skills, errors, or incidents. As might be expected, for example, whereas “knowing the rules of formal logic” was rated as one of the most important skills in computer science, this skill was rated as quite unimportant in all other disciplines.
Several relatively interpretable dimensions were found to underlie faculty perceptions of the substantial number of reasoning skills, errors, and incidents that were rated in the study. Skills involved in (a) the analysis and evaluation of arguments, (b) the drawing of inferences and the development of conclusions, (c) the definition and analysis of problems, (d) the ability to reason inductively, and (e) the generating of alternative explanations or hypotheses formed five distinct dimensions, which were perceived as differentially important for success in each of the six disciplines included in the study. Analysis and evaluation of arguments was judged to be most important in English, defining and analyzing problems most important in computer science and engineering, and generating alternatives most important in psychology and education. Inductive reasoning skills were judged to be about equally important in all disciplines, and drawing inferences/developing conclusions was rated as relatively important in all disciplines except computer science.
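The dimensions described above reflect a factor analysis of the importance ratings; the reference list cites Kaiser's (1958) varimax criterion and Cattell's (1966) scree test. As a minimal illustration of that general procedure only, not a reproduction of the study's actual analysis, the sketch below extracts principal-component loadings from a hypothetical ratings matrix and applies a varimax rotation. The data, the number of variables, and the number of retained factors are all invented for illustration.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Varimax rotation (Kaiser, 1958) of a p x k loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var_old = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # SVD-based update of the orthogonal rotation toward the varimax criterion
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated**3 - rotated * ((rotated**2).sum(axis=0) / p))
        )
        rotation = u @ vt
        var_new = s.sum()
        if var_new < var_old * (1 + tol):  # criterion stopped improving
            break
        var_old = var_new
    return loadings @ rotation

# Hypothetical importance ratings: 255 respondents x 10 skill variables (1-5 scale)
rng = np.random.default_rng(0)
data = rng.integers(1, 6, size=(255, 10)).astype(float)

# Principal components of the correlation matrix, eigenvalues descending
corr = np.corrcoef(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Scree inspection (Cattell, 1966): look for the "elbow" in these values
print(np.round(eigvals, 2))

# Retain k factors, form loadings, and rotate them for interpretability
k = 2
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
rotated = varimax(loadings)
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only the distribution of loading across factors shifts, which is what makes the rotated factors easier to label.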
Implications
In providing some information on faculty perceptions of the involvement of various reasoning skills in their disciplines, the study has, we hope, implications for developing future versions of the GRE analytical ability measure. Converting this information to operational test items will represent a significant step, however, and it is not yet clear exactly how helpful these results will eventually be. Nonetheless, the findings do seem to contain several useful pieces of information:
1. Among the specific reasoning skills perceived as the most important were several, e.g., "deducing new information from a set of relationships" and "understanding, evaluating, and analyzing arguments," that seem well represented in the two item types (analytical reasoning and logical reasoning) currently included in the analytical section of the General Test. This suggests that these item types should continue to play a role in future editions of the GRE General Test.
2. Some skills that are not measured by the current version of the analytical measure were rated as very important. "Reasoning or problem solving in situations in which all the needed information is not known" was among the skills rated as most important in each discipline, but it is currently unmeasured, at least in any explicit manner, by the analytical measure. In this regard, however, the previous GRE-sponsored work of Ward, Carlson, and Woisetschlaeger (1983) is noteworthy. These investigators studied “ill-structured” problems, i.e., problems that do not provide all the information necessary to solve the problem, and noted the resemblance of these problems to one variant of the logical reasoning item type used in the analytical measure. They concluded that there was no indication that “ill-structured” problems measure different aspects of analytical ability than do “well-structured” problems, and therefore that “ill-structured” problems could not be expected to extend the range of cognitive skills already measured by the GRE General Test. They did note, however, that the “ill-structured” item type could be used to increase the variety of item types in the test. The findings of the current study suggest that the inclusion of this item type would probably meet with faculty approval in most fields of study.
3. With respect to their perceived importance, skills involving the generation of hypotheses/alternatives/explanations tended to cluster together, and the inability to generate hypotheses independently was one of the incidents rated consistently as having a substantial effect on faculty perceptions of students’ analytical abilities.
A number of years ago the GRE Board sponsored a series of studies (Frederiksen & Ward, 1978; Ward, Frederiksen, & Carlson, 1978; Ward & Frederiksen, 1977; Frederiksen & Ward, 1975) that explored the development and validation of tests of scientific thinking, including one especially promising item type called “Formulating Hypotheses,” which required examinees to generate hypotheses. Although the research suggested that this item type complemented the GRE verbal and quantitative measures in predicting success in graduate school, the work was discontinued, largely because of problems in scoring items that require examinees to construct, not merely choose, a correct response. Carlson and Ward (1986) have proposed to renew work on the “Formulating Hypotheses” item type in light of recent advances in evaluating questions that involve constructed responses. The results of the faculty survey reported here would appear to support this renewal.
4. Some of the highly important skills that are currently well represented in the analytical measure are viewed as more important for success in some disciplines than in others. For example, “understanding, analyzing, and evaluating arguments” was seen as more important in English than in computer science. However, some skills seen as highly important in some disciplines but not in others may not be as well represented currently. For example, “breaking down complex problems into simpler ones” was perceived as extremely important in computer science and engineering but not at all important in English. This would suggest, perhaps, the need to balance the inclusion of items reflecting particular skills, so that skills thought to be important (or unimportant) in particular disciplines are neither over- nor underrepresented.

5. The several dimensions that appear to underlie clusters of reasoning skills may provide an appropriate way to extend the current test specifications for the analytical measure, especially if new item types are developed to represent some of these dimensions.

6. The reasoning skills that were rated as very important, and consistently so, across disciplines point to a potential common core of skills that could be appropriately included in an “all-purpose” measure like the GRE General Test. Other skills judged to be very important in only a few disciplines might best be considered for extending the measurement of reasoning skills in the GRE Subject Tests. Faculty comments about the difficulty of separating reasoning from subject matter knowledge would seem to support this strategy.
Limitations
Any study of this nature is necessarily limited in several respects. First of all, the survey approach used here is but one of several that can be used to inform decisions about extending the measurement of analytical abilities. Tucker’s (1985) results provide useful information from different perspectives--those of cognitive psychologists and philosophers. Other approaches that might also be informative include the methods of cognitive psychology, which could be used not only to supplement but also to extend the survey results reported here. These methods would seem especially appropriate because they relate more directly to actual skills and abilities than to perceptions.
Second, the diversity that characterizes graduate education renders the results of this study incomplete. Some clues have been gained as to similarities and differences among a limited sample of graduate fields. However, the substantial differences found among fields are a source of concern, since we cannot be certain whether some other sample of fields might exhibit even greater variation.
Finally, as several survey respondents pointed out, many of the reasoning skills about which we asked are expected to, and do, improve as the result of graduate study. In some sense these skills may represent competencies that differ from, say, the verbal skills measured by the GRE General Test in that these analytical skills may develop much more rapidly. A question of interest, then, is how to accommodate the measurement of these skills in the context of graduate admissions testing, which currently focuses on the predictive effectiveness of abilities that are presumed to develop slowly over a significant period of time.
Future Directions
The study suggested several possible future directions. Because of the substantial variation among fields, one possibility would involve extending the survey to include additional fields of graduate study. Some refinements could now be made on the basis of past experience. For example, ratings of the frequency with which skills are used, as well as the frequencies of errors and critical incidents, could probably be omitted without much loss of information. On the other hand, it would seem desirable to add categories allowing ratings of the differential importance of various reasoning skills at different stages of graduate education, ranging from entry level to dissertation writing.
Finally, based on the reasoning skills identified as most important, criterion tasks might be developed against which the validity of the current GRE analytical measure could be gauged. This strategy would make especially good sense for those important skills that may not be measurable in an operational test like the GRE General Test, but which might correlate highly with the abilities now measured by the test. One specific possibility would be the development of rating forms, which could be used by faculty to rate the analytical abilities of their students. These ratings could then be used as a criterion against which GRE analytical scores could be judged.
References
Altman, R. A., Carlson, A. B., & Donlon, T. F. (1975). Developing a plan for the renewal of the GRE norming test: A report on investigations into the possible shortening of the GRE-V and GRE-Q and the creation of a modular aptitude test (GRE Report GREB No. 74-3). Princeton, NJ: Educational Testing Service.
Arons, A. B. (1979). Some thoughts on reasoning capacities implicitly expected of college students. In J. Lochhead and J. Clement (Eds.), Cognitive process instruction. Philadelphia, PA: The Franklin Institute Press.
Campbell, S. K. (1974). Flaws and fallacies in statistical thinking. Englewood Cliffs, NJ: Prentice-Hall, Inc.
Carlson, S. B., & Ward, W. C. (1986). A new look at formulating hypotheses items (proposal to the GRE Research Committee). Princeton, NJ: Educational Testing Service.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276.
Duran, R. P., Powers, D. E., & Swinton, S. S. (in press). Construct validity of the GRE analytical test (Draft final report to the GRE Research Committee, January 1984). Princeton, NJ: Educational Testing Service.
Ennis, R. H. (1962). A concept of critical thinking: A proposed basis for research in the teaching and evaluation of critical thinking ability. Harvard Educational Review, 32, 81-111.
Fischer, D. H. (1970). Historians' fallacies: Toward a logic of historical thought. New York: Harper & Row, Publishers.
Frederiksen, N., & Ward, W. C. (1975). Development of measures for the study of creativity (GRE Board Professional Report GREB No. 72-2P; also ETS RB No. 75-18). Princeton, NJ: Educational Testing Service.
Frederiksen, N., & Ward, W. C. (1978). Measures for the study of creativity in scientific problem-solving. Applied Psychological Measurement, 2, 1-24.
Graduate Record Examinations Board and Council of Graduate Schools in the United States. (1983). Directory of graduate programs: 1984 and 1985 (4 volumes). Princeton, NJ: Educational Testing Service.
Johnson, R. H., & Blair, J. A. (1983). Logical self defense (2nd ed.). Canada: McGraw-Hill Ryerson Limited.
Kahane, H. (1984). Logic and contemporary rhetoric: The use of reason in everyday life (4th ed.). Belmont, CA: Wadsworth Publishing Company, Inc.
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187-200.
Miller, R., & Wild, C. (Eds.). (1979). Restructuring the Graduate Record Examinations Aptitude Test (GRE Board Technical Report). Princeton, NJ: Educational Testing Service.
Nosich, G. M. (1982). Reasons and arguments. Belmont, CA: Wadsworth Publishing Company, Inc.
Oltman, P. K. (1982). Content representativeness of the Graduate Record Examinations advanced tests in chemistry, computer science, and education (GRE Board Professional Report GREB No. 81-12P). Princeton, NJ: Educational Testing Service.
Powers, D. E., & Swinton, S. S. (1981). Extending the measurement of graduate admission abilities beyond the verbal and quantitative domains. Applied Psychological Measurement, 5, 141-158.
Powers, D. E., & Swinton, S. S. (1984). Effects of self-study for coachable item types. Journal of Educational Psychology, 76, 266-278.
Reilly, R. R. (1974a). Factors in graduate student performance (GRE Board Professional Report GREB No. 71-2P; also RB 74-2). Princeton, NJ: Educational Testing Service.
Reilly, R. R. (1974b). Critical incidents of graduate student performance (GRE Board Research Report GREB No. 70-5). Princeton, NJ: Educational Testing Service.
Salmon, M. H. (1984). Introduction to logic and critical thinking. New York: Harcourt Brace Jovanovich, Publishers.
Scriven, M. (1976). Reasoning. New York: McGraw-Hill Book Company.
Stricker, L. J., & Rock, D. A. (1985). The factor structure of the GRE General Test for older examinees: Implications for construct validity (GRE Board Research Report GREB No. 83-10R). Princeton, NJ: Educational Testing Service.
Swinton, S. S., & Powers, D. E. (1983). A study of the effects of special preparation on GRE analytical scores and item types. Journal of Educational Psychology, 75, 104-115.
Swinton, S. S., Wild, C. L., & Wallmark, M. M. (1983). Investigation of practice effects on item types in the Graduate Record Examinations Aptitude Test (GRE Board Professional Report GREB No. 80-1cP). Princeton, NJ: Educational Testing Service.
Toulmin, S., Rieke, R., & Janik, A. (1984). An introduction to reasoning (2nd ed.). New York: Macmillan Publishing Company, Inc.
Tucker, C. (1985). Delineation of reasoning processes important to the construct validity of the analytical test (GRE Report submitted to the GRE Research Committee). Princeton, NJ: Educational Testing Service.
Ward, W. C., Carlson, S. B., & Woisetschlaeger, E. (1983). Ill- structured problems as multiple-choice items (GRE Board Professional Report GREB No. 81-18P). Princeton, NJ: Educational Testing Service.
Ward, W. C., & Frederiksen, N. (1977). A study of the predictive validity of the tests of scientific thinking (GRE Board Professional Report GREB No. 74-6P; also ETS RB No. 77-6). Princeton, NJ: Educational Testing Service.
Ward, W. C., Frederiksen, N., & Carlson, S. (1978). Construct validity of free-response and machine-storable tests of scientific thinking (GRE Board Professional Report GREB No. 74-8P; also ETS RB No. 78-15). Princeton, NJ: Educational Testing Service.
Wason, P. C., & Johnson-Laird, P. N. (1972). Psychology of reasoning: Structure and content. Cambridge, MA: Harvard University Press.
Weddle, P. (1978). Argument: A guide to critical thinking. New York: McGraw-Hill Book Company.
Wilson, K. M. (1982). A study of the validity of the restructured GRE Aptitude Test for predicting first-year performance in graduate study (GRE Board Research Report GREB No. 78-6R). Princeton, NJ: Educational Testing Service.
Wilson, K. M. (1984). The relationship of GRE General Test item-type part scores to undergraduate grades (GRE Board Professional Report GREB No. 81-22P). Princeton, NJ: Educational Testing Service.
Table 1
Mean Ratings of Importance of Reasoning Skills by Disciplines
Discipline Ns: Total (N=255), Chemistry (N=37), Computer Science (N=43), Education (N=42), Engineering (N=43), English (N=44), Psychology (N=46).

Variable                                                   Total   Chem.   C.Sci.  Educ.   Engin.  Engl.   Psych.  F

Reasoning Skills--General
Deriving principles from facts or cases                    4.02    -.21    -.04    -.09    -.23     .23     .35    3.80**
Reasoning when facts are known explicitly                  3.63     .31     .55    -.06     .44    -.59    -.65    10.81***
Reasoning when all information is not known                4.24    -.11    -.15    -.10     .29    -.06     .13    1.60
Reasoning when inconsistencies are present                 4.03     .07    -.59    -.03    -.01     .22     .34    5.92***
Knowing rules of formal logic                              2.76    -.46    1.03    -.21     .10    -.31    -.15    9.62***
Applying principles to a different specialty               3.75    -.05    -.08    -.13     .23    -.11     .14    1.32
Supporting or refuting a given position                    3.52    -.23    -.62     .47     --      .47     .23    7.74***
Analyzing and evaluating previous research                 4.33     .13     --      --      --      .21     .25    7.63***
Incorporating isolated data into a framework               3.37     .38     --      --      --      --      --     3.19**
Monitoring progress in solving problems                    3.71     .20     --      --      --      --      --     1.71
Deriving principles that can be applied to other cases     3.68     .16    -.06     .09     --      --      --     0.79
Recognizing the probabilistic nature of events             3.24    -.63     --      --      --      --      --     9.49***
Understanding and evaluating arguments                     4.05     --      --      --      --      .70     --     10.51***
Deducing information from relationships                    3.86    -.16     --      --      --      --      --     0.60
Knowing what evidence will support a hypothesis or thesis  4.19     .32    -.75     .14    -.26     .31     .22    10.35***

Specific Skills--Problem Definition
Identifying central issues or hypotheses                   4.34    -.13    -.34     .16     .01     .09     .22    3.20**
Recognizing sides of an issue                              3.58    -.15    -.75     .39    -.38     .62     .26    10.59***
Generating alternative hypotheses                          3.87     .05    -.50     .16    -.24     .13     .39    4.80***
Breaking down complex problems                             4.06     .05     .57    -.13     --     -.85    -.17    14.34***
Identifying variables in a problem                         3.88     .36     .02     .16     --     -.59    -.47    7.65***
Identifying approaches to solving problems                 3.98    -.25    -.05    -.20     .00     --      --     3.03*
Recognizing similarities between problems                  3.83    -.25     .03    -.06     .13     --      --     1.16
Developing operational definitions                         3.66     .36     .02     --      --      --      .46    5.49***
Setting up formal models                                   3.36     --      --      --      --      --      --     13.49***
Recognizing the historical context                         2.97     --      --      --      --      --      --     26.65***

Specific Skills--Constructing Theses or Arguments
Supporting assertions with details                         3.92    -.08    -.60     .13    -.22     .74     --     11.22***
Making explicit components in a chain of reasoning         3.57    -.17     .06    -.10     .01     .29     --     1.10
Distinguishing relevant and irrelevant information         4.20     .05    -.45     .09    -.10     .30     --     3.78**
Perceiving relationships among observations                4.10     .15    -.70     .03     .04     .15     .33    7.74***
Drawing distinctions between similar ideas                 3.67    -.48    -.30     .19    -.13     .56     .16    6.67***
Using analogies appropriately                              3.36    -.38    -.26     .12     .18     .53    -.18    3.93**
Elaborating an argument                                    3.80    -.37    -.57     .30     --     1.02     .04    18.59***
Drawing sound inferences                                   4.25     .32    -.74     .25     --     -.04     .18    8.84***
Producing a consistent argument                            4.12    -.54     --      --      --      .35    -.01    4.67***
Synthesizing different positions                           3.10     --      --      --      --      .22     .16    2.90*
Translating graphs or symbolic statements                  3.55     --      --      --      --    -1.44     .07    18.17***

Specific Skills--Analyzing Arguments
Recognizing the central argument                           4.04     .12    -.57    -.02    -.11     .57     .00    7.13***
Identifying assumptions                                    3.98    -.17    -.12    -.03     .06     .47    -.20    3.14***
Testing the validity of an argument                        3.57    -.09     .27    -.23     .02    -.09     .13    1.19
Finding alternative explanations                           3.82     .16    -.56     .28    -.16    -.27     .55    7.89***
Recognizing supporting points                              3.64     .04    -.52     .01    -.20     .61     .06    6.84***
Detecting fallacies and contradictions                     3.92     .08    -.36     .06    -.22     .22     .21    2.18
Distinguishing major points in an argument                 3.28    -.12    -.52    -.14     .02     .63     .11    5.84***
Being sensitive to different evidence                      3.31    -.34    -.96     .71    -.42     .05     .95    18.86***
Recognizing shifts in meaning of an argument               2.90    -.90    -.69     .52    -.55    1.48     .14    33.50***
Judging whether a thesis has been supported                3.86    -.08    -.81     .24    -.32     .73     .25    12.76***
Determining whether conclusions are supported              4.25     .10    -.79     .27     .05     .11     .27    8.65***
Distinguishing among factors                               3.45    -.20    -.35     .10     .31    -.24     .38    3.31**

Specific Skills--Drawing Conclusions
Generating valid explanations                              4.16     .18    -.76     .27     .07     .10     .14    8.26***
Supporting conclusions                                     4.24     .15    -.77     .31     .02     .22     .07    8.92***
Recognizing implications of an interpretation              4.07    -.07    -.72     .26    -.28     .59     .23    14.82***
Qualifying conclusions                                     3.99     .06    -.73     .13    -.15     .40     .29    8.90***
Generating new questions or experiments                    3.88     .53    -.58     .07     .21    -.61     .38    10.08***
Considering alternative conclusions                        3.85     .24    -.59     .25    -.16     --      .41    6.33***
Comparing conclusions                                      3.91     .28    -.47     --      --      --      .05    3.25**
Revising views for new observations                        3.84     .30    -.77     .07     .18     --      --     6.38***

Note. Entries under individual disciplines are deviations from the grand means given under "Total." Dashes indicate entries not available. Standard deviations for totals range from .78 to 1.31, and 33 of 56 are between .9 and 1.1.
*p < .05. **p < .01. ***p < .001.
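As the table Note indicates, each discipline entry is a deviation from the grand mean given under "Total," and the F column reports a test of mean differences among the six disciplines. The sketch below illustrates that computation on made-up ratings, not the study's data; SciPy's one-way ANOVA is assumed here as a stand-in for whatever exact test was run.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical importance ratings (1-5 scale) for one skill, by discipline
ratings = {
    "Chemistry":        [4, 5, 4, 3, 5, 4],
    "Computer Science": [3, 2, 3, 4, 2, 3],
    "Education":        [4, 4, 5, 4, 3, 4],
}

all_ratings = np.concatenate(list(ratings.values()))
grand_mean = all_ratings.mean()            # the "Total" column
deviations = {
    field: np.mean(vals) - grand_mean      # the per-discipline table entries
    for field, vals in ratings.items()
}

# One-way analysis of variance across disciplines (the F column)
f_stat, p_value = f_oneway(*ratings.values())
print(round(grand_mean, 2),
      {k: round(v, 2) for k, v in deviations.items()},
      round(f_stat, 2), round(p_value, 3))
```

One property worth noting: the deviations, weighted by group size, must sum to zero, which is a useful sanity check when reading (or reconstructing) tables laid out this way.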
Table 2
Mean Ratings of Seriousness of Reasoning Errors by Disciplines
Discipline Ns: Total (N=255), Chemistry (N=37), Computer Science (N=43), Education (N=42), Engineering (N=43), English (N=44), Psychology (N=46).

Variable                                                   Total   Chem.   C.Sci.  Educ.   Engin.  Engl.   Psych.  F

Failing to recognize differences between sample
  and population                                           2.92    -.59    -.55     .64     .41    -.82     .91    16.90***
Failing to account for base rate                           2.64    -.18    -.62     .36     .14    -.03    1.34    24.59***
Offering irrelevant statements                             3.05    -.46    -.66     .58    -.53     .41     .66    8.27***
Failing to recognize a concealed conclusion                3.34    -.42    -.45     .34    -.24     .45     .32    6.31***
Confusing coincidence with causation                       3.60    -.47    -.44     .49    -.22    -.20     .83    9.51***
Using an ambiguous term                                    3.02    -.66    -.62     .18    -.38     .93     .54    15.13***
Arguing to accept an unproven conclusion                   3.20    -.20    -.07     .10     --     -.26     .65    3.96**
Accepting assumptions without questioning                  3.96    -.23    -.22     --      --      .24     .24    2.14
Failing to evaluate the credibility of a source            3.74    -.23    -.63     .35    -.09     .37     .23    5.55***
Offering irrelevant evidence                               3.48    -.40    -.53     .27    -.18     .61     .23    7.49***
Making generalizations from insufficient evidence          3.86    -.24    -.42     .31    -.14     .30     .20    3.97**
Inappropriately interpreting text                          3.49    -.52    -.61     .39    -.28     .62     .40    11.22***
Allowing anecdotal information                             3.23    -.64    -.33     .72     .01    -.51     .74    10.32***
Basing conclusions on partial data                         3.57    -.08    -.36     .09     .00     .16     .19    1.50
Failing to recognize similarities in analogies             3.06    -.33    -.42     .28     .03     .17     .28    3.46**

Note. Entries under individual disciplines are deviations from the grand means given under "Total." Dashes indicate entries not available. Standard deviations for totals range from .98 to 1.48, and 8 of 15 are between 1.0 and 1.2.
*p < .05. **p < .01. ***p < .001.
Table 3
Mean Ratings of Effects of Critical Incidents by Disciplines

Discipline Ns: Total (N=255), Chemistry (N=37), Computer Science (N=43), Education (N=42), Engineering (N=43), English (N=44), Psychology (N=46).

Variable                                                   Total   Chem.   C.Sci.  Educ.   Engin.  Engl.   Psych.  F

Making irrelevant remarks                                  3.23    -.23    -.46    -.33     --      --      --     --
Offering nonconstructive criticism                         2.79    -.22    -.40     --      --      --      --     --
Submitting a nonresponsive paper                           3.67     .17    -.32    -.05    -.23     --      --     --
Accepting an argument on emotional appeal                  3.34    -.23     .49    -.46     --      --      --     --
Confusing anecdote with "hard data"                        3.24    -.24     --      --      --      --      --     --
Accepting uncritically conclusions of authorities          3.46    -.42    -.23     --      --      --      --     --
Being unable to integrate ideas                            3.96     --      --      --      --      --      --     --
Failing to recognize that evidence can support
  conclusions                                              3.39    -.12    -.39     .11     --      --      --     --
Using inconsistent statements                              3.33     .12    -.38     --      --      --      --     --
Arguing for the obvious                                    2.76    -.09    -.16     --      --      --      --     --
Unable to perceive analogy breakdown                       2.98    -.03    -.35     --      --      --      --     --
Failing to draw conclusions about a pattern                3.41     --      --      --      --      --      --     --
Making implausible assumptions                             3.44     .23    -.42     --      --      --      --     --
Applying rules without justification                       3.38     .70     .11    -.33     .72   -1.06    -.14    16.24***
Relying solely on narrative                                3.68    -.12    -.11     .13    -.25     .50    -.16    2.99*
Preferring complex to simple explanations                  3.23     .07    -.11    -.06    -.23     .16     --     --
Unable to see a pattern                                    3.76     .13    -.32     .02     --      --      --     --
Failing to qualify conclusions                             3.44     .07    -.56    -.17    -.01     --      .28    4.99**
Writing a one-sided essay                                  3.01    -.30    -.60    -.28     --      .47     .25    7.46***
Searching for a complicated solution                       3.28     .02     .16     .23    -.19     --     -.04    1.11
Ignoring details                                           3.74     .20    -.37    -.04     .26     --      .23    --
Unwilling to respond in an unknown situation               3.28     .56     .06     --      --      --      --     3.76**
Unable to generate hypotheses                              3.94     .14     --      --      --      --      --     1.54
Unable to suggest alternatives                             3.43     --      --      --      --      --      --     .90
Unable to suggest tests of hypotheses                      3.52     --      --      --      --      --      --     8.15***

Note. Entries under individual disciplines are deviations from the grand means given under "Total." Dashes indicate entries not available. Standard deviations for totals range from .93 to 1.27, and 15 of 25 are between 1.0 and 1.2.
*p < .05. **p < .01. ***p < .001.
Table 4
Reasoning Skills, Errors, and Incidents Rated as Most Important or Critical by Disciplines
Chemistry
  Skills:
    Drawing sound inferences from observations (4.57)
    Critically analyzing and evaluating previous research or reports in a field (4.46)
    Generating new questions or experiments to extend or support the interpretation of data (4.42)
  Errors/Incidents:
    Applying a formula, algorithm, or other rule without sufficient justification (4.08)
    Being unable to generate hypotheses independently (4.08)

Computer Science
  Skills:
    Breaking down complex problems or situations into simpler ones (4.62)
    Reasoning or solving problems in situations in which all facts underlying a problem situation are known (4.19)
  Errors/Incidents:
    Being unable to integrate and synthesize ideas from various sources (3.77)

Education
  Skills:
    Supporting conclusions with sufficient data or information (4.55)
    Determining whether the conclusions drawn are logically consistent with, and adequately supported by, the data (4.52)
    Clearly identifying central issues and problems to be investigated or hypotheses to be tested (4.50)
    Drawing sound inferences from observations (4.50)
  Errors/Incidents:
    Making generalizations on the basis of insufficient evidence (4.17)
    Confusing coincidence and/or correlation with causation (4.10)
    Failing to evaluate the credibility or reliability of a source or text (4.10)

Engineering
  Skills:
    Breaking down complex problems or situations into simpler ones (4.60)
    Reasoning or solving problems in situations in which all the needed information is not known (4.53)
    Identifying all the variables involved in a problem (4.40)
  Errors/Incidents:
    Applying a formula, algorithm, or other rule without sufficient justification (4.09)

English
  Skills:
    Elaborating an argument and developing its implications (4.82)
    Understanding, analyzing, and evaluating arguments (4.75)
    Supporting general assertions with details (4.66)
    Recognizing the central argument or thesis in a work (4.61)
  Errors/Incidents:
    Being unable to integrate and synthesize ideas from various sources (4.27)
    Accepting the central assumptions in an argument without questioning them (4.20)
    Relying solely on narrative or description in papers and reports when analysis is appropriate (4.18)

Psychology
  Skills:
    Critically analyzing and evaluating previous research or reports in a field (4.58)
    Clearly identifying central issues and problems to be investigated or hypotheses to be tested (4.57)
    Determining whether the conclusions drawn are logically consistent with, and adequately supported by, the data (4.52)
  Errors/Incidents:
    Confusing coincidence and/or correlation with causation (4.43)
    Accepting the central assumptions in an argument without questioning them (4.20)

Note. Numbers in parentheses are the average ratings for each skill, error, or incident for each discipline.
Table 5
Reasoning Skills Rated Consistently As At Least Moderately Important
Skill                                                                      Mean Rating

Reasoning or solving problems in situations in which all the needed
  information is not known                                                    4.24
Detecting fallacies and logical contradictions in arguments                   3.92
Deducing new information from a set of relationships                          3.86
Recognizing structural similarities between one type of problem or
  theory and another                                                          3.83
Taking well-known principles and ideas from one area and applying them
  to a different specialty                                                    3.75
Monitoring one’s own progress in solving problems                             3.71
Deriving from the study of single cases structural features or
  functional principles that can be applied to other cases                    3.68
Making explicit all relevant components in a chain of logical reasoning       3.57
Testing the validity of an argument by searching for counterexamples          3.57

Note. Moderately important is defined as having an average rating over all disciplines of 3.5 or greater. There were no significant differences among disciplines with respect to the average importance of these skills.
Table 6
Errors or Incidents Rated Consistently as at Least Moderately Serious or Critical
Error/Incident                                                             Mean Rating
Accepting the central assumptions in an argument without questioning them 3.96
Being unable to integrate and synthesize ideas from various sources 3.96
Being unable to generate hypotheses independently 3.94
Being unable to see a pattern in results or to generalize when appropriate 3.76
Ignoring details that contradict an expected or desired result 3.74
Submitting a paper that failed to address the assigned issues 3.67
Basing conclusions on analysis of only part of a text or data set 3.57
Note. Moderately critical is defined as having an average rating over all disciplines of 3.5 or greater. There were no significant differences among disciplines with respect to the average ratings of seriousness or criticality.
Table 7
Summary of Variables Defining Factors Underlying Ratings of Importance of Reasoning Skills
Factor  Variables Loading Highest on Factors

I       Recognizing shifts in the meaning of a word in the course of an argument (73)
        Elaborating an argument and developing its implications (72)
        Recognizing supporting points in an argument (70)
        Recognizing the historical context of a problem (67)
        Understanding, analyzing, and evaluating arguments (64)
        Judging whether a thesis has been adequately supported (64)
        Recognizing the central argument or thesis in a work (63)

II      Generating valid explanations to account for observations (72)
        Supporting conclusions with sufficient data or information (69)
        Drawing sound inferences from observations (69)
        Determining whether the conclusions drawn are logically consistent with, and adequately supported by, the data (68)
        Comparing conclusions with what is already known (67)
        Considering alternative conclusions (59)

III     Setting up formal models for problems under consideration (69)
        Breaking down complex problems or situations into simpler ones (59)
        Translating graphs or symbolic statements into words and vice versa (57)
        Identifying all the variables involved in a problem (56)
        Knowing the rules of formal logic (53)

IV      Recognizing structural similarities between one type of problem or theory and another (61)
        Drawing distinctions between similar but not identical ideas (53)
        Synthesizing two different positions into a third position (44)
        Deriving from the study of single cases structural features or functional principles that can be applied to other cases (41)

V       Finding alternative explanations for observations (53)
        Generating alternative hypotheses (46)

Note. Only the variables that best characterize the factors are given. Loadings are given in parentheses for the factor on which the variable’s loading was most predominant.
Table 8
Summary of Variables Defining Factors Underlying Ratings of Seriousness of Reasoning Errors
Factor Variables Loading Highest on Factors

I Offering irrelevant evidence to support a point (75)
Making generalizations on the basis of insufficient evidence (72)
Failing to evaluate the credibility or reliability of a source or text (71)
Accepting the central assumptions in an argument without questioning them (70)

II Failing to take into account the base rate for a phenomenon in a population (76)
Failing to recognize differences between a sample and a population of interest (68)
Offering irrelevant statements about a person's character or circumstances to oppose his/her conclusion (63)
Confusing coincidence and/or correlation with causation (61)
Note. Only the variables that best characterize the factors are given. Loadings are given in parentheses for the factor on which the variable's loading was most predominant.
Table 9
Summary of Variables Defining Factors Underlying Ratings of the Effects of Critical Incidents
Factor Variables Loading Highest on Factors

I Accepting or supporting an argument based more on emotional appeal than on evidence (72)
Offering criticism of other students' presentations that was not constructive or well founded (65)
Confusing anecdote and/or opinion with "hard data" (65)
Submitting a paper that failed to address the assigned issues (62)

II Accepting the conclusions of recognized authorities without critically evaluating them (71)
Being able to criticize but unable to suggest better alternatives (59)
Being unable to integrate and synthesize ideas from various sources (58)
Being unable to generate hypotheses independently (56)
Failing to recognize that evidence can support more than one conclusion (55)

III Applying a formula, algorithm, or other rule without sufficient justification (67)
Searching for a complicated solution when an obvious simple one exists (58)
Being unwilling to respond in a new or unknown situation when the limits of a student's knowledge have been reached (54)
Note. Only the variables that best characterize the factors are given. Loadings are given in parentheses for the factor on which the variable's loading was most predominant.
Table 10
Means of Scales Based on Items that Best Defined Factors by Disciplines
Discipline          Scale 1  Scale 2  Scale 3  Scale 4  Scale 5
Chemistry            3.34     4.29     3.57     3.38     3.43
Computer Science     2.97     3.37     4.05     3.58     2.90
Education            3.83     4.29     3.44     3.73     3.93
Engineering          3.25     4.11     4.00     3.61     3.44
English              4.53     4.10     2.70     3.87     3.46
Psychology           3.73     4.26     3.37     3.78     4.21
Total                3.61     4.07     3.52     3.66     3.56
Key. Scales are unweighted composites of the following variables:

Scale 1 = Recognizing shifts in the meaning of a word in the course of an argument; Elaborating an argument and developing its implications; Recognizing supporting points in an argument; Recognizing the historical context of a problem; Judging whether a thesis has been adequately supported; Recognizing the central argument or thesis in a work; Understanding, analyzing, and evaluating arguments

Scale 2 = Generating valid explanations to account for observations; Drawing sound inferences from observations; Supporting conclusions with sufficient data or information; Determining whether the conclusions drawn are logically consistent with, and adequately supported by, the data; Comparing conclusions with what is already known; Considering alternative conclusions; Revising a previously held view to account for new observations

Scale 3 = Setting up formal models for problems under consideration; Breaking down complex problems or situations into simpler ones; Translating graphs or symbolic statements into words and vice versa; Identifying all the variables involved in a problem; Knowing the rules of formal logic

Scale 4 = Recognizing structural similarities between one type of problem or theory and another; Drawing distinctions between similar but not identical ideas; Deriving from the study of single cases structural features or functional principles that can be applied to other cases; Synthesizing two different positions into a third position; Deriving general or abstract principles from disparate facts or cases

Scale 5 = Finding alternative explanations for observations; Being sensitive to the strength of different types of evidence (correlational, causal, testimony); Generating alternative hypotheses; Recognizing the probabilistic nature of most events
Note. Means have been divided by the number of items on each scale.
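As the note indicates, each entry in Table 10 is an unweighted composite: a respondent's item ratings are summed and divided by the number of items on the scale, so the score stays on the original 1-5 rating metric. A minimal sketch of that computation (the ratings below are made-up illustrations, not data from the study):

```python
import numpy as np

# Hypothetical ratings: rows = faculty respondents, columns = the seven
# Scale 1 items, each rated on the 1-5 importance scale.
ratings = np.array([
    [4, 5, 3, 4, 4, 5, 4],
    [3, 4, 4, 3, 5, 4, 3],
    [5, 5, 4, 4, 4, 5, 5],
])

# Unweighted composite per respondent: sum the item ratings and divide
# by the number of items, keeping the score on the 1-5 metric.
composite = ratings.sum(axis=1) / ratings.shape[1]

# Discipline-level mean of the composites, as reported in Table 10.
scale_mean = composite.mean()
```

Averaging each respondent's item ratings directly gives the same result; dividing the summed composite by the item count is simply the formulation the table's note describes.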
I. REASONING SKILLS

The following are checklists of some general and specific reasoning skills that may be important to success in your graduate program. Please rate these skills with respect to:

(1) frequency: how often do students in your graduate program need to use this skill?

(2) importance for success: to what extent does this skill differentiate between marginal and successful students in your program?

For "importance for success," please refer to these descriptions:

1. This skill is not relevant to my field of teaching
2. There is little or no difference between marginal and successful students with respect to this skill
3. There is a moderately important difference between marginal and successful students with respect to this skill
4. There is a very important difference between marginal and successful students with respect to this skill
5. There is a critically important difference between marginal and successful students with respect to this skill
A. Reasoning Skills--General

Deriving general or abstract principles from disparate facts or cases
Reasoning or problem solving in situations in which all facts underlying a problem solution are known explicitly
Reasoning or problem solving in situations in which all the needed information is not known
Reasoning when inconsistencies are present in the information
Knowing the rules of formal logic
Taking well-known principles and ideas from one area and applying them to a different specialty
Being able to both support and refute a given position
Critically analyzing and evaluating previous research or reports in a field
Incorporating isolated instances or data into a preexisting framework
Monitoring one's own progress in solving problems
Deriving from the study of single cases structural features or functional principles that can be applied to other cases
Recognizing the probabilistic nature of most events
Understanding, analyzing, and evaluating arguments
Deducing new information from a set of relationships
Knowing what kind of evidence will support or refute a hypothesis or thesis
Other (specify)

[Each item was rated on two 5-point scales: Frequency, from 1 (Never/Hardly Ever) to 5 (Very Frequently), and Importance for Success/difference, from 1 (Not Relevant) through 2 (Little), 3 (Moderate), and 4 (Important) to 5 (Critical).]
B. Specific Skills--Problem Definition

Clearly identifying central issues and problems to be investigated or hypotheses to be tested
Recognizing two or more sides of an issue
Generating alternative hypotheses
Breaking down complex problems or situations into simpler ones
Identifying all the variables involved in a problem
Identifying more than one approach to solving a problem
Recognizing structural similarities between one type of problem or theory and another
Developing operational (or very precise) definitions of concepts
Setting up formal models for problems under consideration
Recognizing the historical context of a problem
Other (specify)

[Items rated on the same Frequency and Importance for Success scales as in Section A.]
C. Specific Skills--Constructing Theses or Arguments

Supporting general assertions with details
Making explicit all relevant components in a chain of logical reasoning
Distinguishing between relevant and irrelevant information
Perceiving relationships among observations
Drawing distinctions between similar but not identical ideas
Using analogies appropriately
Elaborating an argument and developing its implications
Drawing sound inferences from observations
Producing an argument that is internally consistent
Synthesizing two different positions into a third position
Translating graphs or symbolic statements into words and vice versa
Other (specify)

D. Specific Skills--Analyzing Arguments

Recognizing the central argument or thesis in a work
Identifying both stated and unstated assumptions
Testing the validity of an argument by searching for counterexamples
Finding alternative explanations for observations
Recognizing supporting points in an argument
Detecting fallacies and logical contradictions in arguments
Distinguishing major and minor points in an argument
Being sensitive to the strength of different types of evidence (correlational, causal, testimony)
Recognizing shifts in the meaning of a word in the course of an argument
Judging whether a thesis has been adequately supported
Determining whether the conclusions drawn are logically consistent with, and adequately supported by, the data
Distinguishing among contributing, necessary, and sufficient factors
Other (specify)

E. Specific Skills--Drawing Conclusions

Generating valid explanations to account for observations
Supporting conclusions with sufficient data or information
Recognizing the implications or consequences of an interpretation
Qualifying conclusions appropriately and recognizing ways in which conclusions could be challenged
Generating new questions or experiments to extend or support the interpretation of data
Considering alternative conclusions
Comparing conclusions with what is already known
Revising a previously held view to account for new observations
Other (specify)

[Items rated on the same Frequency and Importance for Success scales as in Section A.]
II. REASONING ERRORS

Please indicate which of the following kinds of reasoning errors you have observed in your graduate students and the frequency with which you have noted these errors.

Error

Failing to recognize differences between a sample and a population of interest
Failing to take into account the base rate for a phenomenon in a population
Offering irrelevant statements about a person's character or circumstances to oppose his/her conclusion
Failing to recognize when a conclusion is concealed within a premise
Confusing coincidence and/or correlation with causation
Using an ambiguous term to shift senses in the same argument
Arguing that a conclusion should be accepted (rejected) because it has not been disproved (proved)
Accepting the central assumptions in an argument without questioning them
Failing to evaluate the credibility or reliability of a source or text
Offering irrelevant evidence to support a point
Making generalizations on the basis of insufficient evidence
Reading into a text one's own views or inappropriately interpreting text or data on the basis of one's own experience
Allowing anecdotal information to override more extensive statistical data
Basing conclusions on analysis of only part of a text or data set
Failing to recognize relevant similarities and dissimilarities when using analogies to argue
Other (specify)
Other (specify)

[Each error was rated for Frequency, from 1 (Never/Hardly Ever) to 5 (Very Frequent), and for Seriousness, from 1 (Not Serious) to 5 (Very Serious).]
III. CRITICAL INCIDENTS

How frequently have you observed the following kinds of incidents or tendencies in your students? What effect have these had on your estimation of the analytical ability of your students?

Incidents

Repeatedly making irrelevant remarks during class discussions
Offering criticism of other students' presentations that was not constructive or well founded
Submitting a paper that failed to address the assigned issues
Accepting or supporting an argument based more on emotional appeal than on evidence
Confusing anecdote and/or opinion with "hard data"
Accepting the conclusions of recognized authorities without critically evaluating them
Being unable to integrate and synthesize ideas from various sources
Failing to recognize that evidence can support more than one conclusion
Using inconsistent statements to support the same position
Arguing in support of the obvious
Being unable to perceive when an analogy breaks down
Recognizing a pattern but failing to draw any conclusions about it or attempting to explain it
Making implausible assumptions
Applying a formula, algorithm, or other rule without sufficient justification
Relying solely on narrative or description in papers and reports when analysis is appropriate
Preferring complex or far-fetched explanations over simpler ones
Being unable to see a pattern in results or to generalize when appropriate
Failing to recognize that a definite conclusion cannot be reached (or failing to qualify a conclusion appropriately)
Writing a one-sided essay when a more balanced treatment is appropriate
Searching for a complicated solution when an obvious simple one exists
Ignoring details that contradict an expected or desired result
Being unwilling to respond in a new or unknown situation when the limits of a student's knowledge have been reached
Being unable to generate hypotheses independently
Being able to criticize but unable to suggest better alternatives
Being unable to suggest appropriate tests of hypotheses
Other (specify)
Other (specify)

[Each incident was rated for Frequency, from 1 (Never/Hardly Ever) to 5 (Very Frequent), and for its Effect on the rater's estimation of analytical ability, from 1 (No Effect) to 5 (Great Effect).]
IV. BACKGROUND INFORMATION

2. Academic rank (e.g., Associate Professor)
3. Institution
4. Department
5. Years of teaching experience at the undergraduate/graduate levels
6. Your current teaching load (number of contact hours per week): undergraduate; graduate
7. Altogether, about how many graduate students at each of the following levels have you advised over the past three years? Masters; Doctoral

Do you have any comments that you'd like to make about measuring students' analytical abilities?

Thanks for your contribution to our study.
Appendix B.l
Varimax Factor Loadings for Ratings of Importance of Reasoning Skills
Factor
Variable I II III IV V Communality
Reasoning Skills--General
Deriving general or abstract principles from disparate facts or cases
Reasoning or solving problems in situations in which all facts underlying a problem solution are known explicitly
Reasoning or solving problems in situations in which all the needed information is not known
48
36 28
27
38 35 31
Reasoning when inconsistencies are present in the information
Knowing the rules of formal logic
34 32 38 41
53 38
Taking well-known principles and ideas from one area and applying them to a different specialty
Being able to both support and refute a given position 51
Critically analyzing and evaluating previous research or reports in a field 40 45
Incorporating isolated instances or data into a preexisting framework
Monitoring one's own progress in solving problems
Deriving from the study of single cases structural features or functional principles that can be applied to other cases
Recognizing the probabilistic nature of most events
Understanding, analyzing and evaluating arguments 64
36 20
Deducing new information from a set of relationships 33 33
Knowing what kind of evidence will support or refute a hypothesis or thesis 53 43
41
16
31 40
30 46
23
21
32 25
49
32
52
Appendix B.l (Continued)
Factor
Variable I II III IV V Communality
Specific Skills--Problem Definition
Clearly identifying central issues and problems to be investigated or hypotheses to be tested 32 30
Recognizing two or more sides of an issue 58
Generating alternative hypotheses 32
Breaking down complex problems or situations into simpler ones
Identifying all the variables involved in a problem
Identifying more than one approach to solving a problem
Recognizing structural similarities between one type of problem or theory and another
Developing operational (or very precise) definitions of concepts
Setting up formal models for problems under consideration
Recognizing the historical context of a problem
Specific Skills --Constructing Theses or Arguments
30
67
59 44
32 56 43
31 36 39
48
69
Supporting general assertions with details 57 40 52
Making explicit all relevant components in a chain of logical reasoning
Distinguishing between relevant and irrelevant information
Perceiving relationships among observations
40 40
36 51
58
Drawing distinctions between similar but not identical ideas 46
Using analogies appropriately 35
61 47
37 52
53
39
24
36 54
46 39
33
53
55
42
46
54
29
Appendix B.l (Continued)
Factor
Variable I II III IV V Communality
Elaborating an argument and developing its implications 72
Drawing sound inferences from observations
Producing an argument that is internally consistent 51
Synthesizing two different positions into a third position 32
Translating graphs or symbolic statements into words and vice versa
Specific Skills--Analyzing Arguments
Recognizing the central argument or thesis in a work 63 34
Identifying both stated and unstated assumptions 52
Testing the validity of an argument by searching for counterexamples 36
Finding alternative explanations for observations
Recognizing supporting points in an argument 70
Detecting fallacies and logical contradictions in arguments 58 32
Distinguishing major and minor points in an argument 53
Being sensitive to the strength of different types of evidence (correlational, causal, testimony) 41 35 42 52
Recognizing shifts in the meaning of a word in the course of an argument 73 65
Judging whether a thesis has been adequately supported 64 41 61
64
69 56
43 49
44 34
57 36
38
33
44 33
53
53
47
46
53
64
54
43
Appendix B.l (Continued)
Factor
Variable I II III IV V Communality
Determining whether the conclusions drawn are logically consistent with, and adequately supported by, the data
Distinguishing among contributing, necessary, and sufficient factors
Specific Skills--Drawing Conclusions
Generating valid explanations to account for observations
Supporting conclusions with sufficient data or information
Recognizing the implications or consequences of an interpretation
Qualifying conclusions appropriately and recognizing ways in which conclusions could be challenged 48 56
Generating new questions or experiments to extend or support the interpretation of data
Considering alternative conclusions
57 37 52
59 37 60
67 56
58 49
Comparing conclusions with what is already known
Revising a previously held view to account for new observations
36 31 47
Common Variance 8.41 7.37 3.84 3.21 2.51 25.34
68
43 35 34
72
69
45
64
49
59
58
54
57
Note. Loadings less than .30 have been omitted, as have all decimal points.
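The loadings in Appendices B.1-B.3 come from varimax-rotated factor solutions. For readers who want to apply the same rotation to their own loading matrices, here is a minimal numpy sketch of the standard SVD-based varimax algorithm; the function name, defaults, and stopping rule are a generic implementation for illustration, not code from the study:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a (variables x factors) loading matrix to the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)          # accumulated orthogonal rotation
    crit = 0.0             # running value of the varimax criterion
    for _ in range(max_iter):
        Lr = L @ R
        # SVD of the gradient of the varimax criterion w.r.t. the rotation
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - (gamma / p) * Lr @ np.diag((Lr ** 2).sum(axis=0)))
        )
        R = u @ vt
        new_crit = s.sum()
        if crit != 0.0 and new_crit / crit < 1.0 + tol:
            break                      # criterion has stopped improving
        crit = new_crit
    return L @ R, R
```

Because the rotation matrix is orthogonal, each variable's communality (the sum of its squared loadings) is unchanged by the rotation, which is why the Communality columns in the appendices describe the unrotated solution equally well.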
Appendix B.2
Varimax Factor Loadings for Ratings of Frequency of Reasoning Errors
Error Factor
I II Communality
Failing to recognize differences between a sample and a population of interest
Failing to take into account the base rate for a phenomenon in a population
Offering irrelevant statements about a person's character or circumstances to oppose his/her conclusion
Failing to recognize when a conclusion is concealed within a premise
Confusing coincidence and/or correlation with causation
Using an ambiguous term to shift senses in the same argument
Arguing that a conclusion should be accepted (rejected) because it has not been disproved (proved)
Accepting the central assumptions in an argument without questioning them
Failing to evaluate the credibility or reliability of a source or text
Offering irrelevant evidence to support a point
Making generalizations on the basis of insufficient evidence
Reading into a text one's own views or inappropriately interpreting text or data on the basis of one's own experience
Allowing anecdotal information to override more extensive statistical data
Basing conclusions on analysis of only part of a text or data set
Failing to recognize relevant similarities and dissimilarities when using analogies to argue
48
51
30
40
40
53 38
65 47
37 34 25
63 40
63 44
67 47
63 46
70
42
58
74 56
75 58
53
30
39
43
55
46
42
50 40 42
Common Variance 4.06 2.60 6.66
Note. Loadings less than .30 have been omitted, as have all decimal points.
Appendix B.2
Varimax Factor Loadings for Ratings of Seriousness of Reasoning Errors
Error Factor
I II Communality
Failing to recognize differences between a sample and a population of interest
Failing to take into account the base rate for a phenomenon in a population
Offering irrelevant statements about a person's character or circumstances to oppose his/her conclusion
Failing to recognize when a conclusion is concealed within a premise
Confusing coincidence and/or correlation with causation
Using an ambiguous term to shift senses in the same argument
Arguing that a conclusion should be accepted (rejected) because it has not been disproved (proved)
Accepting the central assumptions in an argument without questioning them
Failing to evaluate the credibility or reliability of a source or text
Offering irrelevant evidence to support a point
Making generalizations on the basis of insufficient evidence
Reading into a text one's own views or inappropriately interpreting text or data on the basis of one's own experience
Allowing anecdotal information to override more extensive statistical data
Basing conclusions on analysis of only part of a text or data set
Failing to recognize relevant similarities and dissimilarities when using analogies to argue
30 63 48
40 56 47
38 61 51
50 48 48
35 60 48
70 53
71 56
75 61
72 56
65
43
55
68 48
76 59
33
56
53
50
39
46 44 40
Appendix B.3
Varimax Factor Loadings of Effects of Critical Incidents
Factor
Incident I II III Communality
Repeatedly making irrelevant remarks during class discussions 60 36
Offering criticism of other students' presentations that was not constructive or well founded 65 44
Submitting a paper that failed to address the assigned issues 62 42
Accepting or supporting an argument based more on emotional appeal than on evidence 72
Confusing anecdote and/or opinion with "hard data" 65
Accepting the conclusions of recognized authorities without critically evaluating them
Being unable to integrate and synthesize ideas from various sources
Failing to recognize that evidence can support more than one conclusion
Using inconsistent statements to support the same position
Arguing in support of the obvious
56 40 51
38 39 37
Being unable to perceive when an analogy breaks down 49 38 47
Recognizing a pattern but failing to draw any conclusions about it or attempting to explain it
Making implausible assumptions 48 53
Applying a formula, algorithm, or other rule without sufficient justification
Relying solely on narrative or description in papers and reports when analysis is appropriate 39 41
Preferring complex or far-fetched explanations over simpler ones 32 40 33
71 56
58 40
55 40
34 48
67
60
52
40
53
45
37
Appendix B.3 (Continued)
Factor
Incident I II III Communality
Being unable to see a pattern in results or to generalize when appropriate
Failing to recognize that a definite conclusion cannot be reached (or failing to qualify a conclusion appropriately)
Writing a one-sided essay when a more balanced treatment is appropriate
Searching for a complicated solution when an obvious simple one exists
Ignoring details that contradict an expected or desired result
Being unwilling to respond in a new or unknown situation when the limits of a student's knowledge have been reached
Being unable to generate hypotheses independently
Being able to criticize but unable to suggest better alternatives
Being unable to suggest appropriate tests of hypotheses
Common Variance
40 42 40
42 48 32 51
48 41 44
58 40
34 40 39 42
54 39
56 35 44
59 40
41 37 31
4.15 3.66 3.05 10.86
Note. Loadings less than .30 have been omitted, as have all decimal points.
Appendix B.3
Varimax Factor Loadings of Frequency of Critical Incidents
Factor
Incident I II III Communality
Repeatedly making irrelevant remarks during class discussions
Offering criticism of other students' presentations that was not constructive or well founded
Submitting a paper that failed to address the assigned issues
Accepting or supporting an argument based more on emotional appeal than on evidence
Confusing anecdote and/or opinion with "hard data"
Accepting the conclusions of recognized authorities without critically evaluating them
Being unable to integrate and synthesize ideas from various sources
Failing to recognize that evidence can support more than one conclusion
Using inconsistent statements to support the same position
Arguing in support of the obvious
Being unable to perceive when an analogy breaks down
Recognizing a pattern but failing to draw any conclusions about it or attempting to explain it
Making implausible assumptions
Applying a formula, algorithm, or other rule without sufficient justification
Relying solely on narrative or description in papers and reports when analysis is appropriate
Preferring complex or far-fetched explanations over simpler ones
52 31
46 22
37
71
59
41
36 27
53
39
40
53
47
47
47 39
44 29
36 35
53 58
32 35
33 30 42
42 28
32 50 42
54 34
39 28
49 30
Appendix B.3 (Continued)
Factor
Incident I II III Communality
Being unable to see a pattern in results or to generalize when appropriate 47 45 45
Failing to recognize that a definite conclusion cannot be reached (or failing to qualify a conclusion appropriately) 41 42 43
Writing a one-sided essay when a more balanced treatment is appropriate
Searching for a complicated solution when an obvious simple one exists
Ignoring details that contradict an expected or desired result
Being unwilling to respond in a new or unknown situation when the limits of a student's knowledge have been reached
Being unable to generate hypotheses independently 72 54
Being able to criticize but unable to suggest better alternatives
65
64 43
39 32 41 42
48 36
35 52 39
48
36
Being unable to suggest appropriate tests of hypotheses 68 52
Common Variance 3.75 3.27 2.72 9.75
Note. Loadings less than .30 have been omitted, as have all decimal points.