machine translation: linguistic characteristics of mt systems and general methodology of evaluation:...

590 BOOK REVIEWS

Up to this point in this book, about halfway, there are still no descriptions of research methods per se. Some of the articles described above may be of interest to individual scholars, but their anecdotal, descriptive focus is not likely to generate widespread interest. I would recommend Michael Hill’s essay for beginning library students because he makes a good, even-handed case for expanding the role of research in information service practice.

The third section of this book actually includes two articles on research methods, one by Blaise Cronin on marketing research methods and one by Christine Borgman on human factors research methods.

Cronin’s article is basically definitional; that is, he describes the nature, purpose and some of the forms of market analysis. His description is very superficial and serves to inform the reader about market analysis rather than providing methods for conducting market research. Borgman’s article on human factors research is more comprehensive than Cronin’s, and she provides the reader with some meta-level description of the state of the art in this area. She also includes an extensive bibliography to help the reader go beyond the article itself (in contrast to Cronin’s four citations). As well done as Borgman’s contribution is, human factors research remains a fairly narrow research domain that can hardly address the subtitle of this book.

After Borgman’s piece comes a bibliography on quantitative methods in library management by Schwartz & Ferligoj. It is difficult to assess the bibliography itself since the authors are only presenting examples of research reference materials that might be used for library management. Of the last five pieces in this book, four are articles focusing on research using bibliographic citation analysis as a method. Three of these (Ferligoj et al., Schubert et al., and Tudjman et al.) are research reports on specific research projects using citation analysis and may serve the reader interested in research questions for which citation analytic methods are appropriate. The same reader will be interested in Radosvet Todorov’s discussion of citation-based measures for evaluating scientific journals. A warning, however: Todorov’s essay is quite esoteric and is not recommended for the novice.

The book closes with a “conference summing-up” by Robert Hayes. He begins with a statement that the conferees had a problem agreeing on what information research is, and goes on to talk about information policy research as what he would have liked the conference to focus on. Hayes continues with an analysis of what was presented in the conference and concludes with the question “where should we go from here?”

My view of the misleading nature of the title was made clear at the beginning of this review. I think that the value of this book is largely resident in the individual articles that represent conference contributions by a diverse set of researchers and practitioners. The appropriate audience for this book is the scholar who is oriented towards library research rather than those who are interested in information science research without (at least necessarily) the library as the organizing metaphor for an information institution or system. There is little or no contribution across the collection that addresses a serious problem in both library and information science disciplines for research methods.

School of Information Studies Syracuse University Syracuse, NY

MICHAEL S. NILAN

Machine Translation: Linguistic Characteristics of MT Systems and General Methodology of Evaluation. J. LEHRBERGER AND L. BOURBEAU. John Benjamins Publishing Co., Amsterdam (1988). xiii + 241 pp. $30.00. ISBN 90-272-3124-9.

The Machine Translation (MT) field has reason to be concerned about user evaluations of its product. The 1966 ALPAC report, which concluded that contemporary MT systems were not cost-effective compared to manual translation, had a catastrophic effect on funding in the field. Because MT is regarded as more of an “engineering” discipline than other parts of computational linguistics, decisions about the quality of MT research will fall disproportionately to users of delivered systems. Since users of MT systems will be, more often than not, government and corporate bureaucracies, these important decisions will be made by relatively few large and (hopefully) sophisticated users. With such systems as EUROTRA nearing evaluation phases by end users, it is clear that the users’ ability to evaluate such complex systems is of vital importance.

It is to these users that Lehrberger and Bourbeau’s book is primarily addressed. The authors were contributors to the TAUM (Traduction Automatique de l’Université de Montréal) Project. As pointed out in an introduction by Maurice Gross, the TAUM project, supported by the Canadian government, accomplished a great deal of both fundamental and practical research on English and French linguistics as well as MT. Between 1973 and 1981, the TAUM project released two Fully-Automated Machine Translation (FAMT) systems, TAUM/METEO and TAUM/AVIATION. TAUM/METEO translates weather reports from English into French, and is in production use today by Environment Canada. TAUM/AVIATION, a much more ambitious project, had as its domain aircraft maintenance manuals. The authors have the perspective of having participated in this large and successful project from beginning to end. The resulting book is not a survey of the MT field, or even a technical discussion of analysis, transfer, and generation methodologies, but rather a systematic, carefully thought-out exposition of the “linguistic phenomena” faced by MT systems, with many useful contrastive examples in French and English. People new to the MT field, especially users and prospective users of MT systems, will find this book indispensable reading.

Lehrberger and Bourbeau follow a “System Evaluation” approach to MT systems, in which the requirements that a system must satisfy, as a function of its specific end-user environment, are carefully specified. From these characteristics (e.g., the degree of human interaction expected, the semantic domain) and a knowledge of the linguistic “phenomena” that occur in MT, a user can create appropriate tests of a prospective system to verify that it satisfies user needs. This approach to system evaluation borrows from ideas in software engineering, and provides the evaluator with a more balanced, informative evaluation than the type practiced in, for example, the ALPAC report.

The ALPAC report’s evaluation technique of concentrating on raw measures of output intelligibility and error rate yields only one measure (however important) of a system’s usability. For example, by taking into account the system’s expected patterns of usage, an evaluator is able to classify errors into useful subcategories (e.g., errors easily detectable and correctable by the user vs. errors that “silently” change the entire meaning of a translation). By identifying specific classes of phenomena (e.g., complex noun phrases, idioms) as potential problem areas, the evaluator can develop very specific tests of a system’s performance in areas important to an application.

The great bulk of the book is instructional in content. Chapter 1 provides a brief introduction to MT and an overview of the rest of the book. The authors explicitly state their goal of helping MT users “form some idea of what goes on inside a system’s black box.”

Chapter 2 provides a general taxonomy of MT systems, from more of a systems engineering viewpoint than is commonly encountered in a computational linguistics book. Its main purpose is to introduce the linguistic components (e.g., lexical, morphological, syntactic, semantic) to be discussed in more detail in later sections.

Chapter 3 discusses these components, not with respect to any particular design, but with respect to the problems that can arise in them. By far the major emphasis here is on dictionary design. The section on the lexical component is extensive, with complete examples of different dictionary design styles (in TAUM/METEO, TAUM/AVIATION, and ALPS, a commercial MT system developed by Automated Language Processing Systems of Provo, Utah). The section on morphology discusses not only the difficulties of inflectional morphology, but also the problems of dealing with word processing and typographical conventions, reflecting the TAUM project’s FAMT tradition. Even the sections on the syntactic and semantic components discuss these topics mainly from the viewpoint of identifying syntactic and semantic phenomena that must be handled in syntactic and semantic phases of a system, rather than from (for example) the viewpoint of parsing and transformations.

Chapter 4 discusses the difference between a corpus-based approach, based on sublanguage, and a standard grammar approach to designing a system. It is regrettably brief, giving only a tantalizing glimpse into an area that was one of the TAUM project’s major research areas.

Chapter 5, “Linguistic Evaluation by the User,” is the longest section of the book. It takes the reader (and presumably user) through all aspects of a system evaluation of an MT system, at an almost cookbook level of detail. Outlines of the procedure are provided as flowcharts, and guidelines for selecting sample tests, dictionaries, and test run documentation are suggested. An interesting sidelight of the discussion in this chapter is the advice given on inferring the grammar used by the system designer. For example, if an evaluator suspects that a particular rule concerning lexical or semantic attributes governs conjunctions of noun phrases, the book describes how a set of examples can be used to test for the existence of the suspected rule. The chapter gives many very specific examples (in the form of contrastive examples in English and French) of the linguistic phenomena described in earlier sections.

The book has two appendices. Appendix A is a stand-alone report, “A Synthesis of Evaluations on Machine Translation Systems” by Lehrberger. Appendix B is a detailed flowchart of the TAUM/AVIATION system. This flowchart is the most detailed flowchart of an MT system that this reviewer has ever seen in the standard (non-technical-report) literature, and is very instructive as to the difference between the rather sketchy environments that many research systems labor under and the environment required of a real production system.

The book has an interesting “slant” toward users rather than toward MT system builders. First, the examples given of MT system components are almost exclusively drawn from the TAUM/METEO and TAUM/AVIATION systems, rather than from a variety of systems. Considering the familiarity of the authors with the design and history of the systems, this is a valuable viewpoint. Second, the descriptions of the linguistic components of a system concentrate on lexical issues, such as selecting attributes for word entries and the different languages in which word entries can be expressed, rather than on syntax and semantics. A typical user, after all, has little or no input in the grammar design (and may not even be permitted to see the grammar of a proprietary system!). It should be noted, however, that the system cannot be too closed. A prospective user would find it difficult to create word entries reliably with no understanding of the syntactic and semantic processing of the system.

The authors do not discuss the area of interactive MT systems, including those coupled with speech recognition and generation systems. Such systems provide a rather different problem in evaluation, as their criteria for success depend not only on the ability to render individual sentences correctly, but on their ability to get “close enough” that intelligent, cooperating users on each side of a conversation can sort out a misunderstanding without needing to have a language in common. Further, different parsing algorithms must be used in systems that use speech input, and a knowledge of such requirements is crucial for system evaluation.

Lehrberger and Bourbeau point out that there are numerous areas for further work in the area of MT system evaluation (as indeed there are in deliverable computational linguistic systems in general). Since few, if any, MT systems have been as thoroughly evaluated as the TAUM systems, the authors are clearly in the forefront of this field. This book reflects their contribution to MT system evaluation, and should be studied by users and system designers alike.

AT&T Bell Laboratories Whippany, NJ

NEAL OLIVER

People and Computers: How to Evaluate Your Company’s New Technology. C. CLEGG ET AL. Ellis Horwood Limited, Chichester (1988). x + 245 pp. $49.95. ISBN 0-470-21207-1.

The major title of the work, People and Computers, is somewhat misleading in that it indicates an “impact of technology on people” or “coping with technology and systems” approach. The minor title is more accurate: evaluating technology in the organizational setting.

The work is a compendium of issues, techniques and checklists for evaluating the implementation of technology, primarily computer systems, in an organizational setting. Its strong point is this emphasis on the computer system being evaluated, not in light of itself, but in light of organizational goals.

Divided into four parts (equipment and working conditions; usability of the software/system; job quality and operator performance; and the wider organizational view, or overall effectiveness), the work presents a series of questions and checklists that can be applied to the evaluation of the system. The overall emphasis is on effectiveness, with questions such as productivity, quality of product or service, market share, comparative evaluation with the non-automated (or otherwise automated) output of the organization, and quality of user/employee satisfaction uppermost in the authors’ minds.

The authors include a section on how to carry out an interview, meeting, or working group (ongoing project evaluation/planning team) and a section on managing change, but by far the most valuable portions of the book are the questions they raise. These are presented first in a list of thirty key issues to be addressed, spread across the four areas of equipment/environment, usability, job quality and performance, and overall effectiveness. These are supplemented by more detailed questions and checklists for each area in chapters 8-12 (2-5 primary, 8-12 in depth).

The authors have done a good job in collecting, analyzing and explaining the various issues, including such esoteric or unusual issues as return on investment, personal commitment of employees, staff turnover, employees that are challenged, but not overwhelmed, ratio of skill to pay, the locus of decision making, formalized decision making versus flexibility of approach and the non-use of certain system functions in the interest of simplicity. One question of particular interest that the authors raise is that of long term costs versus short run benefits, e.g., the curtailing of training or