REAL-TIME SPEECH-TO-TEXT SERVICES

Michael Stinson, Sandy Eisenberg, Christy Horn, Judy Larson, Harry Levitt, and Ross Stuckless1

1 In the order listed above, the authors are associated with National Technical Institute for the Deaf (Rochester, New York), California State University, Northridge (Northridge, California), University of Nebraska (Lincoln, Nebraska), St. Louis Community College (St. Louis, Missouri), City University of New York (New York, NY), and National Technical Institute for the Deaf.

2 The report on notetaking made reference also to computer-assisted notetaking, C-Print™, and real-time captioning, each of which is an application of real-time speech-to-text. In the present report, frequent reference is made to the generation of notes as a secondary application of real-time speech-to-text.

INTRODUCTION

Real-time speech-to-text has been defined as the accurate transcription of words that make up spoken language into text momentarily after their utterance (Stuckless, 1994).

This report will describe and discuss several applications of new computer-based technologies, which enable deaf and hard of hearing students to read the text of the language being spoken by the instructor and fellow students, virtually in real time. In its various technological forms, real-time speech-to-text is a growing classroom option for these students.

This report is intended to complement several other such reports in this series which focus on notetaking (Hastings, Brecklein, Cermack, Reynolds, Rosen, & Wilson, 1997)2, assistive listening devices (Warick, Clark, Dancer, & Sinclair, 1997), and interpreting (Sanderson, Siple, & Lyons, 1999). It is notable that the Department of Justice has interpreted the Americans with Disabilities Act (P.L. 101-336) to include computer-aided transcription services under “appropriate auxiliary aids and services” (28 CFR §36.303).

It should be emphasized at the outset that the real-time speech-to-text services described and discussed in this report are intended to complement, not replace, the options that are already available.

DEVELOPMENT OF REAL-TIME SPEECH-TO-TEXT SYSTEMS

Over the past 20 years, several developments have made it possible to use real-time speech-to-text transcription services as we know them today. These began with the development of smaller, more powerful computer systems, including their capability of converting stenotypic phonetic abbreviations electronically into understandable words. These parallel developments led to the earliest applications of steno-based systems both to the classroom and to real-time captioning in 1982.

In the later 1980s, laptop computers became widely available. This enhanced portability led to the use of computers for notetaking in which the notetaker used a standard keyboard in the regular classroom. It was at this time that stenotype machines were also linked to laptop computers, enhancing their portability. In the late 1980s, abbreviation software became available for regular keyboards (Stinson & Stuckless, 1998).

Currently, both steno-based and standard keyboard approaches are being used with deaf and hard of hearing students in many mainstream secondary and postsecondary settings. Although the full extent of their usage nationwide remains to be documented, over the past 10 years there clearly has been an increased demand for speech-to-print transcription services in the classroom (Cuddihy, Fisher, Gordon, & Shumaker, 1994; Haydu & Patterson, 1990; James & Hammersley, 1993; McKee, Stinson, Everhart, & Henderson, 1995; Messerly & Youdelman, 1994; Moore, Bolesky, & Bervinchak, 1994; Smith & Rittenhouse, 1990; Stinson, Stuckless, Henderson, & Miller, 1988; Virvan, 1991).

TWO CURRENT SPEECH-TO-TEXT OPTIONS

Currently, two major options are available for providing real-time speech-to-text services to deaf and hard of hearing students. The first and second parts of this report will discuss these two options in order. But first, several general comments about the two systems should be made.

Steno-based systems. For these systems, a trained stenographer uses a 24-key machine to encode spoken English phonetically into a computer, where it is converted into English text and displayed on a computer screen or television monitor in real time. Generally, the text is produced verbatim. When used in schools, this system is often called CART (computer-aided real-time transcription), an apt acronym in view of the fact that stenotypists often transport their equipment from one classroom to another on wheels.

Computer-assisted notetaking systems. For these systems, a typist with special training uses a standard keyboard to input words into a laptop/PC as they are being spoken. Sometimes these take the form of summary notes, sometimes of nearly verbatim text. These systems are often abbreviated as CAN (computer-assisted notetaking).

Both types of systems provide a real-time text output that students can read on a computer or television screen in order to follow what is occurring in class. In addition, the text file can be examined by students, tutors, and instructors after class, either on the screen or as hard copy.

These technologies offer receptive communication to deaf and hard of hearing students. However, they provide limited options for expressive communication on the part of these students, and service providers need to keep this in mind.

We will begin by providing some basic “nuts and bolts” information that service providers need in order to implement a steno-based or computer-assisted notetaking (CAN) system. For each of these systems, we address four major questions:

(1) How do these systems work?
(2) What major considerations need to be addressed with respect to their implementation as a support service in the classroom?
(3) Who is qualified to provide the service, and what is his/her training?
(4) How can the system’s effectiveness be evaluated, and what has been learned from evaluations to date?

In considering these systems, we will discuss aspects of particular speech-to-text systems with which we have had personal experience. Our focus on particular systems or associated college programs is not intended as an endorsement over other systems or college programs.

The third part of this report pertains to the use of speech-to-print services relative to other forms of support service, and the fourth part to the development of new speech-to-text systems, focusing on the status and potential of automatic speech recognition (ASR).

STENO-BASED SYSTEMS

Steno-based systems began to be used in classrooms in 1982, with mainstreamed deaf and hard of hearing students at Rochester Institute of Technology (Stuckless, 1983). Today, steno-based systems rank as an effective support service for large numbers of deaf and hard of hearing students in mainstream college environments throughout the country. This growth is due to a number of factors, including refinements in the necessary software; faster, more reliable, and more portable computers; the increasing availability of stenographic reporters (and in many cases the lowering cost of their services); and most important, generally favorable classroom evaluations (Stinson, Stuckless, Henderson, & Miller, 1988).

HOW STENO-BASED SYSTEMS WORK

The person who provides this service in the educational setting may be called a stenotypist, stenographer, or stenographic or educational reporter. His/her equipment typically includes a laptop with several cables and special software, a stenographic machine that has been designed to interface with the laptop and its software, and a display of some kind for presenting the student with the text.

The stenotypist can display the text in real time in several ways, using a TV or computer monitor (including the screen of a second connected laptop), or projecting the text onto a screen by using an LCD or overhead projector. Unlike conventional captioning, which superimposes a line or two of text over a picture, real-time steno-generated text can fill a full screen. Depending on the need, the text output of a steno-based system can be displayed in the classroom itself and/or elsewhere via electronic connections.

Typically the stenotypist is present in the classroom with the deaf or hard of hearing student. However, depending on his/her level of skill and familiarity with the topic under discussion, it is also possible to use a phone link to transmit speech to a stenotypist in a distant location, returning the text to the student via a second telephone line or a computer modem. Cellular phones have also been used successfully for this purpose where fixed telephone lines were not available (Kanevsky, Nahamoo, Walls, & Levitt, 1992).

Equipment. The equipment consists of three basic components: a computer-compatible stenographic machine, an IBM-compatible laptop, and the software needed to convert the stenographic input into English text and display it.

Stenographic machine. The stenographic machine, similar to that used by “computer-connected” court reporters, permits the stenotypist to “write” (key in) verbatim dialogue at speeds of 200 wpm or greater.3 These speeds are possible in large part because he/she can “chord” keys, depressing several keys simultaneously instead of sequentially as in conventional typing.

Laptop computer. A Pentium 166 MHz or faster laptop, with at least 32 MB of memory and an active-matrix screen, is recommended. Two serial ports are preferred, but a PCMCIA slot is acceptable.

Software. The translation software is the heart of the system. Several companies produce the software, and each stenotypist has his/her favorite. Among the most popular are RapidText (Irvine, CA) and Cheetah Systems (Fremont, CA). Essentially, the software consists of four parts, often incorporated into a single software product:

(1) a large built-in dictionary (50,000 words or larger), with provisions for the stenotypist to make additions as new words arise in class,

(2) a program which selects words from the dictionary based on a specific logic and set of rules,

(3) a word processing program that arranges these words in a particular format and performs other editing tasks, and

(4) encoding software to format and display the text in tandem with any of several peripheral devices, e.g., TV or computer monitor, laptop screen, projected image, or printer/paper copy.

The following chart shows examples of steno code and their corresponding English words.

Steno code (input)    English text (output)
WREUG                 writing
O                     on
-T                    the
PH-PB/APS             machine’s
KAOE/PWORD            keyboard
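
To make the dictionary-lookup step above concrete, the following sketch translates a short sequence of steno strokes into English words. The stroke-to-word entries come from the chart above; the lookup logic, the fallback of echoing unknown strokes in capitals, and all names in the code are simplifying assumptions for illustration, not the workings of RapidText, Cheetah Systems, or any other commercial product.

```python
# Minimal sketch of the dictionary-lookup stage of a steno translation program.
# Real products apply far more elaborate rules (prefixes, suffixes, conflict
# resolution); this shows only the basic stroke-to-word mapping.

# Steno strokes and their English equivalents, taken from the chart above.
STENO_DICTIONARY = {
    "WREUG": "writing",
    "O": "on",
    "-T": "the",
    "PH-PB/APS": "machine's",
    "KAOE/PWORD": "keyboard",
}

def translate(strokes):
    """Convert a sequence of steno strokes into English text.

    Strokes missing from the dictionary are echoed back in capitals,
    which is how "untranslates" end up in the real-time display.
    """
    words = []
    for stroke in strokes:
        words.append(STENO_DICTIONARY.get(stroke, stroke.upper()))
    return " ".join(words)

if __name__ == "__main__":
    # Prints: writing on the machine's keyboard
    print(translate(["WREUG", "O", "-T", "PH-PB/APS", "KAOE/PWORD"]))
    # An unknown stroke appears as an untranslate: writing on TKOG
    print(translate(["WREUG", "O", "TKOG"]))
```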

Need for technical support. It should be emphasized that a steno-based system is a technologically sophisticated service. Software needs to be installed correctly, and hardware needs to be set up properly. Students depend on the system, and if it breaks down it will need to be repaired promptly, so technical support should be available and close at hand.

APPLICATIONS WITH DEAF AND HARD OF HEARING STUDENTS

Steno-based systems provide a two-fold service that includes real-time speech-to-text transcription for deaf and hard of hearing students to read almost instantly in the classroom, and a written record of the class that they can use later for review. We will discuss these two applications in turn.

Real-time classroom implementation. Steno-based systems can be used to cover a variety of campus events, sometimes as “real-time captioning” where the text appears under the video image of a speaker. However, their primary application with deaf and hard of hearing students is in the classroom. Steno-based systems as used in the regular classroom provide a means for the deaf or hard of hearing student to replace listening with reading what the teacher and fellow students are discussing, in near real time.

As indicated earlier, the stenotypist sits near the front of the classroom, sometimes to the side where he/she is in visual range of the teacher, students, the chalkboard, and other visual media that might be in use. Incidentally, the stenotypist’s equipment is silent and requires little space.

So long as the text is legible to the deaf or hard of hearing student, it can be displayed in a number of ways. If the service is being provided for a single student, a second laptop can be used as a screen. However, if a number of deaf and/or hard of hearing students are using the service, a large TV or projection screen is in order.

3 Parenthetically, the average speaking rate of college teachers as they lecture is around 150 words per minute, with a standard deviation across the faculty of about 30 wpm (Stuckless, 1994).


From a classroom perspective, the presence of a steno-based system or a computer-assisted notetaking system in the class is similar in some respects to having an interpreter there. More attention will be given to similarities and differences later in this report.

Hard copy text. Transcripts of lectures can be used as complete classroom notes, preserving the entire lecture and all students’ comments for subsequent review by deaf and hard of hearing students taking the course. Typically, these transcripts are shared with these students and with the instructor. Some instructors welcome the transcripts as a way of tightening their lectures and reviewing their students’ questions and comments.

If the instructor chooses, he/she should be at liberty to share them with hearing members of the class also.4 The transcripts can also be of value in tutoring deaf and hard of hearing students, enabling tutors to organize tutoring sessions in close accord with course content. Also, interpreters sometimes use them to improve their signing of course-specific words and expressions.

Once the stenotypist has completed the real-time transcription of a class for the deaf or hard of hearing student(s) enrolled in the course, he/she will edit the text. Depending on the particular class, a 50-minute class is likely to generate 25 to 30 pages of text.

If the stenotypist has a high accuracy rate in a given class, e.g., 98-99%, he/she may be able to correct errors and make the text more readable in one-half hour or less. Obviously, more errors (causes of which are discussed later under Accuracy) will require more editing time.

Many students who use the text for review purposes prefer receiving the text on disk as an ASCII file (edited or unedited) so they can organize their own format and decide for themselves what they want to retain or discard.

ACCURACY

The most important task for the stenotypist working in the classroom is to maintain high accuracy in the production of text from speech. When the accuracy drops below 95%, i.e., more than one word error in 20, intelligibility of the text drops off rapidly.5

The following excerpt from a lecture6 illustrates some of the types of errors that can appear with steno-based systems. In each pair of lines below, the upper line indicates what the teacher said, and the lower line indicates the transcribed text version.

(Speech) Interestingly enough one of the most popular courses
(Text)   INTERESTINGLY ENOUGH ONE OF THE MOST POPULAR COURSES

(Speech) on this campus is a course on death and dying. Since so many of us are
(Text)   ON THIS CAMPUS IS A COURSE ON DEARTH AND DYING. SINCE SO MANY ___ US ARE

(Speech) trying to avoid that I have some ambivalent feelings about the
(Text)   TRYING TO AVOID THAT I HAVE SOME ALL BEVELLENTD FEELINGS ABOUT THE

(Speech) popularity of that course. I do know that it’s a very popular
(Text)   POPULAR ARE THE OF THAT COURSE. I DO NO THAT IT’S A VERY POPULAR

(Speech) course and at the same time I know that it’s a subject that most of us
(Text)   COURSE AND AT THE SAME TIME I NO THAT IT’S A SUBJECT THAT MOST OF US

(Speech) want to avoid.
(Text)   WANT TO AVOID.

Types of errors. Based on the number of departures in the text from what was spoken, there are six word errors in the above 65-word spoken excerpt, yielding approximately 90% accuracy. We can see the four most common types of word errors illustrated in the text above:

mistranslate – death/DEARTH, popularity/POPULAR ARE THE
omission – of/___
untranslate – ambivalent/ALL BEVELLENTD
homonym – know/NO (2)
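
The accuracy figures used in this section are word-level ratios of correctly transcribed words to spoken words. The sketch below is a hypothetical illustration of how such a ratio can be computed by aligning the spoken and transcribed word sequences; it is not the procedure used in the studies cited here, and it ignores inserted words for simplicity.

```python
# Hypothetical sketch of a word-accuracy check for a real-time transcript.
# It aligns the spoken words with the transcribed words and counts how many
# spoken words survived intact; this is a simplification of a full word-
# error-rate calculation (insertions are ignored).
from difflib import SequenceMatcher

def word_accuracy(spoken: str, transcribed: str) -> float:
    ref = spoken.lower().split()
    hyp = transcribed.lower().split()
    matcher = SequenceMatcher(None, ref, hyp)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)

if __name__ == "__main__":
    spoken = "I do know that it's a very popular course"
    transcribed = "I DO NO THAT IT'S A VERY POPULAR COURSE"
    # 8 of the 9 spoken words are transcribed correctly -> about 89%
    print(f"Word accuracy: {word_accuracy(spoken, transcribed):.0%}")
```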

4 It is common for stenographic reporters in private practice to add a surcharge for distribution of extra copies of the text. In the educational environment, this should be discouraged.

5 This pertains to all the real-time speech-to-text systems discussed in this report.

6 This particular lecture was given in February 1982 at NTID/Rochester Institute of Technology, as part of the first course in which a steno-based system was ever used. Today, we look for better than 95% accuracy.

Page 5: REAL-TIME SPEECH-TO-TEXT SERVICES · (1) large built-in dictionary (50,000 words or larger), with provisions for the stenotypist to make additions as new words arise in class, (2)

5

Sources of errors. There are at least three general sources of errors:

(a) Stenotypist errors. The computer is unforgiving of input errors on the part of the stenotypist. Once made, they cannot be corrected “online”.

(b) Vocabulary limitations. Each stenotypist is expected to add and maintain his/her own special course-related dictionary of words beyond the large dictionary that comes with the software. The goal here is nearly perfect pre-edited text, so ongoing dictionary-building time (like editing time) should be built into the service.

The textbooks used in class should be made available to the stenotypist for the purpose of dictionary building. Instructors are also encouraged to share specialized vocabulary likely to be used in class with the stenotypist so he/she can enter this vocabulary into his/her dictionary prior to the class meeting. Over time, the accuracy of the stenotypist’s work will improve as he/she builds a specialized dictionary and his/her stenotyping errors diminish.

(c) Teacher/classroom/course content factors. Some teachers and hearing classmates of the deaf and hard of hearing students articulate more clearly and/or speak more slowly and deliberately than others. Also, some are more grammatically “correct” in their speech than others.

Adverse classroom factors include “noisy” classroom conditions, e.g., several people speaking simultaneously. The stenotypist cannot be expected to produce meaningful, accurate text under these conditions.

By their very nature, some areas of study lend themselves better to the use of steno-based systems than others. For example, courses demanding considerable physical activity and foreign language courses may be poor prospects for the use of steno-based systems.

THE STENOTYPIST

Some stenotypists provide their services on an hourly basis, and some by the academic term. Still others are employed as members of the college’s professional staff. Mostly this depends on the number and year-to-year continuity of deaf and hard of hearing students likely to be requesting the service.

A college with just one student requesting the service is unlikely to hire a stenotypist on a long-term basis when there is no assurance that the student will complete his/her program of studies in the same institution. At the other end, a college that has an ongoing need to provide steno-based services for numerous deaf and hard of hearing students each year is likely to prefer hiring stenotypists as regular staff members.

Training. The starting point for becoming a stenotypist is training in a stenographic or court-reporting school, of which there are more than 400 throughout the country. Many stenotypists and most active court reporters are affiliated with the National Court Reporters Association (NCRA). Both court reporting and stenotyping in the college setting require high-speed, accurate stenographic translation of the spoken word, often involving multiple speakers. Most court reporters, however, are not automatically adept at providing real-time transcription in the classroom. They have the luxury of being able to edit their material before producing a readable transcript.

In contrast, stenotypists in the classroom situation must produce near-perfect accuracy without the benefit of prior editing. This calls for special skills that overlap with those of real-time TV captionists and that come with training (if available) and experience. When feasible, it is useful for the beginning stenotypist to have a semester of practice time, and time to build his/her special dictionary, before taking on full responsibility for supporting students in the classroom. Another opportunity for practice is to produce transcripts from videotapes for captioning purposes.

Certification. The National Court Reporters Association offers certification at several levels. Some stenotypists argue that NCRA certification has little relevance to working as a stenotypist in the classroom, but certification undeniably provides added assurance of both speed and accuracy.7

Recruitment. Sometimes the most direct and efficient way to recruit stenotypists, at least for short-term, temporary support, is through local stenographic agencies. Insist on real-time experience, and require that they provide their own hardware and software (including their own dictionary).

7 The Center on Deafness at California State University, Northridge periodically offers workshops for stenotypists interested in working with deaf and hard of hearing students attending college.

Local court reporting/stenographic schools may be able to provide leads from among their own graduating students and graduates. For long-term recruitment of stenotypists into college programs for deaf and hard of hearing students, an internship agreement with one of these schools can be an effective way of incorporating newly graduated real-time stenotypists into the college’s support services for deaf and hard of hearing students.

Pay levels. Compensation standards for stenotypists working with deaf and hard of hearing students at the college level vary considerably, based on training and experience. Colleges with little or no prior experience using real-time stenotypists in the classroom may wish to check with other colleges that have, before varying much either way from the following ranges.

For “educational realtime reporters” with full-time (two-semester, 40-hour-week) college positions, the National Court Reporters Foundation (NCRF) of NCRA has suggested a salary range of $20,000 to $38,500 plus a full benefits package.8 This range can be adjusted for use in colleges that use another calendar, such as the quarter.

For those who are retained on an hourly fee basis, NCRF has suggested the following: $40-$75 per class hour (2-hour minimum), $15-$40 per hour for preparation time (30 minutes for each class hour), and $15-$40 per hour for production time (editing for distribution). However, fees of up to $150 per hour have been reported.

The importance of preparation and editing has already been discussed. Typically, those who provide the service on an hourly fee basis furnish their own steno machines, laptops, and software.

Workloads. On-line classroom stenotyping requires sustained and undivided attention. And like teaching and interpreting, when done without periodic breaks it can be mentally and physically fatiguing. As a rule, for full-time staff, course coverage should not exceed 20-22 class hours per week. Back-to-back classes should be infrequent. Between-class time, e.g., three to four hours a day, can be used mainly for preparation and editing purposes. First-time coverage of new courses (and different instructors teaching the same courses) will require more preparation and editing time than those previously covered.

Evaluation of the service. Support service providers need some way to determine whether students using a steno-based system are being adequately served. Two aspects of evaluation are (a) quality of the real-time display and the hard-copy text, and (b) student/consumer feedback regarding his/her benefits from use of the system.

Quality of real-time display and edited text. Early and later on in the course, the stenotypist’s college supervisor should appraise the quality of the display and the edited text for each course being covered by the stenotypist. The supervisor’s principal interest here is that the real-time display be relatively free of errors (recognizing that the stenotypist is not the source of all errors), and that its format contribute to its readability. This can be determined by examining the unedited text, including word correctness/errors, punctuation, paragraphing, and indications of changes in speakers.

The edited text should be appraised relative to its intelligibility and ease of student use for review purposes.

Student/consumer feedback. Students using the steno-based service should be asked to make a formal evaluation midway through the course. Information may be collected on the student’s perceptions regarding the skills and attitudes of the stenotypist. The Appendix shows a sample form used at California State University, Northridge to obtain student/consumer feedback.

In addition, each instructor who uses the steno system in his/her class should be given the opportunity to express his/her perception of the value of the service relative to its use by the deaf or hard of hearing student(s) in the class.

A study conducted with deaf and hard of hearing students at Rochester Institute of Technology taking courses in the College of Business and/or Liberal Arts indicated that students responded favorably to the system, although there was variability in their responses. A majority of the students reported that they understood more from the steno-based text display than from interpreting (Stinson, Stuckless, Henderson, & Miller, 1988).

When supporting an individual student, a steno-based or other speech-to-print system obviously is more expensive when combined with other services such as interpreting than when it is the only service provided, i.e., used “stand alone”. It may be difficult to justify the provision of both the speech-to-text service and interpreting services for a single student in the class.

Nor do there appear to be consistent policies for dealing with such requests in colleges around the country when one student in the class requests speech-to-text and another requests interpreting. In some circumstances, both services have been provided, whereas in others, students have been limited to only one of these services. Clear guidelines regarding when to provide one or both services remain to be developed.

8 Information from Realtime in the educational setting: Implementing new technology for access and ADA compliance (1994), National Court Reporters Foundation: Vienna, VA. Booklet available through NCRA Member Services and Information Center, 8224 Old Courthouse Rd., Vienna, VA 22182-3808.

A CAVEAT ON STENO-BASED SYSTEMS

In the hands of competent stenotypists, steno-based real-time speech-to-text offers a powerful support service to many deaf and hard of hearing students in college. Unfortunately, the relatively high costs of well-qualified stenotypists (not their equipment), together with their scarcity in most locations of the country, combine to make the service unavailable or underused in many colleges.

With this in mind, we proceed to examine some related alternatives.

COMPUTER-ASSISTED NOTETAKING (CAN): COMPUTER SYSTEMS WITH STANDARD KEYBOARDS

When used with deaf and hard of hearing students, computer-assisted notetaking (CAN) systems, like steno-based systems, are used primarily in the classroom, in lieu of interpreters and notetakers. Like steno-based systems, CAN converts speech into text in real time for the deaf or hard of hearing student to read in the classroom. And like steno-based systems, CAN provides the student with an edited or unedited copy of the text for use as notes.

Unlike steno-based systems, CAN involves the use of a standard keyboard and a typist with special training, referred to in this report as a captionist but called a transcriber in some settings. There are a number of CAN systems, each of which varies in its details. In general, these systems all involve a (hearing) captionist sitting in the classroom and using a standard keyboard and a commercially available word processing program (such as WordPerfect) to transcribe information as it is being spoken in class.

The text is displayed in real time for deaf and hard of hearing students to read on a TV monitor or on a second laptop display (depending upon the number of deaf or hard of hearing students using that system in the particular class). At the end of class, the text is saved as a word processing file that can then be edited, printed, and distributed to these students as hard copy.

KEYBOARD INPUT

Various CAN systems have “evolved” from the use of standard typing (character by character). The limitation of standard typing, even at high speed, is that it cannot keep up with the speed of speaking. Instructors’ speaking rates typically run around 150 words per minute, and sometimes in bursts exceeding 200 words per minute.

Nevertheless, one basic approach is simply to substitute the handwriting of notes (at around 30 words per minute) with typing (at around 60 words per minute) – that is, the typist takes down in summary what the instructor says. With the advent of laptop “notebook” computers, this has become common among students who take notes for themselves, and increasingly among those who take notes for deaf and hard of hearing students (Hastings, Brecklein, Cermak, Reynolds, Rosen, & Wilson, 1997).

Various CAN systems employ different strategies to enable the captionist to increase his/her speed of input in order to capture more spoken content and detail. The goal is to come as close as possible to capturing all the relevant information being discussed in class, in a readable format. Two strategies are employed to enable transcribers to cover as much information as possible: (a) computerized abbreviation systems to reduce keystrokes, and (b) text-condensing strategies to enable the transcriber to type fewer words without losing spoken information.


COST AND PERSONNEL ADVANTAGES OVER STENO-BASED SYSTEMS

CAN systems have several practical advantages over steno-based systems. CAN systems use portable, low-cost equipment. Also, the potential pool of typists/captionists is much larger than that of stenotypists, and the costs of their services are usually lower than those of well-qualified stenotypists or interpreters. In general, the special training required for a well-qualified typist to become an acceptable CAN captionist can be a month or less, depending on the specific goals of the system (McKee et al., 1995).

Several CAN systems have been developed for or used in providing support services to deaf and hard of hearing students. The following table presents a summary of characteristics of eight computer-assisted systems for which published information is available.

SUMMARY OF CHARACTERISTICS OF DIFFERENT COMPUTER-ASSISTED NOTETAKING SYSTEMS

CAN-Cleveland (Messerley & Youdelman, 1994)
  Uses abbreviation to increase speed: Not described
  Location of text or real-time display: Connect with monitor
  Attempts verbatim or summary notes: Generally, summary notes
  Communication between student and transcriber: One-way communication: transcriber to student
  Required skills and/or training: Minimal: must be able to summarize; type more than 60 wpm; good English use

Computer-Assisted Notetaking-NY (Kozma-Spytek & Balcke, 1995)
  Uses abbreviation to increase speed: Not described
  Location of text or real-time display: Connect with laptop
  Attempts verbatim or summary notes: Summary notes
  Communication between student and transcriber: Two-way communication between transcriber and student
  Required skills and/or training: None described

CAN-Washington, D.C. (Virvan, 1991)
  Uses abbreviation to increase speed: Abbreviations used as long as everyone understands; does not use a computerized abbreviation expansion program
  Location of text or real-time display: Connect with monitor or laptop
  Attempts verbatim or summary notes: Usually summary notes, but is capable of near-verbatim transcription
  Communication between student and transcriber: One-way communication: transcriber to student
  Required skills and/or training: Overall little training, but required skills are ability to type over 60 wpm and summarize well, good English skills, hear well

C-Note (Cuddihy, Fisher, Gordon, & Shumaker, 1994)
  Uses abbreviation to increase speed: Student and transcriber develop appropriate shorthand system
  Location of text or real-time display: Connect with laptop
  Attempts verbatim or summary notes: Varies from near-verbatim to summary notes
  Communication between student and transcriber: Two-way communication between transcriber and student
  Required skills and/or training: Not described

Project CONNECT (Knox-Quinn & Anderson-Inman, 1996)
  Uses abbreviation to increase speed: Not described
  Location of text or real-time display: Connect with laptop
  Attempts verbatim or summary notes: Summary notes and near-verbatim text
  Communication between student and transcriber: Two-way communication between transcriber and student
  Required skills and/or training: Not described

C-Print™ NTID System (McKee, Stinson, Everhart, & Henderson, 1995; Everhart, Stinson, McKee, Henderson, & Giles, 1996)
  Uses abbreviation to increase speed: Emphasizes extensive use of phonetically based abbreviation system to reduce keystrokes
  Location of text or real-time display: Connect with monitor or laptop
  Attempts verbatim or summary notes: Near-verbatim text
  Communication between student and transcriber: Two-way communication between transcriber and student
  Required skills and/or training: Transcriber should be able to type 60 wpm; formal course provided, with 62-page training manual and 50 training audiotapes

InstaCap (Hobelaid, 1988; Warick, 1994)
  Uses abbreviation to increase speed: Uses single keystrokes to invoke full words for 20 abbreviations
  Location of text or real-time display: Wireless connection with monitor
  Attempts verbatim or summary notes: Varies from near-verbatim to summary
  Communication between student and transcriber: One-way communication: transcriber to student
  Required skills and/or training: Not fully described; transcriber’s skills evaluated every 3-5 years

Notebook Computer Notetaking System (James & Hemmesley, 1993)
  Uses abbreviation to increase speed: Not described
  Location of text or real-time display: Connect with laptop
  Attempts verbatim or summary notes: Generally summary notes
  Communication between student and transcriber: Two-way communication between transcriber and student
  Required skills and/or training: Not described


HARDWARE

The hardware used for CAN systems is simpler than that required for steno-based systems. However, when used in tandem with appropriate software, it can be sufficient to produce an effective text display.

Laptop computer. The basic piece of equipment is a laptop computer. Some systems use IBM-compatible computers (e.g., IBM’s ThinkPad, NEC Versa 2000). Others report using Apple Macintosh PowerBooks (Messerley & Youdelman, 1994).

Display. The real-time text on the transcriber’s laptop can be displayed for the deaf or hard of hearing student using (a) a second laptop computer, (b) a VGA-to-TV adapter that connects the laptop to a regular TV monitor, or (c) an LCD projection display.

SOFTWARE

A CAN system requires word processing software and, in most instances, communication software. The more sophisticated systems also use abbreviation software.

Word processing software. Products such as WordPerfect 6 and Word 97 often have special built-in features that increase their effectiveness, such as WordPerfect’s “Macro” and “QuickCorrect” features. These permit creating abbreviations for a limited number of words and phrases for input into a computer.

Communication software. This software permits communication between two or more laptop computers by creating an asynchronous link. These systems include C-Note (Cuddihy et al., 1994) and Carbon Copy (McKee et al., 1995).

This software provides two ways of communicating between two computers: (a) a full-screen mode, where only one individual can enter a message at a time, and (b) a split-screen mode, where both individuals may enter messages simultaneously. Most of these programs permit scrolling back to review previous material on the student’s computer while new material is being entered on the captionist’s computer. (Cost: $200.)
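
The commercial packages named above handle the laptop-to-laptop link themselves. Purely as an illustration of that link, the sketch below pushes each line typed on a captionist’s machine to a student’s machine over an ordinary TCP socket; the host address, port, and one-way design are assumptions made for the example, not the protocol of C-Note, Carbon Copy, or any other product mentioned in this report.

```python
# Toy illustration of a one-way captionist-to-student text link over TCP.
# Start with the argument "student" on the student's laptop, then run it
# without arguments on the captionist's laptop (host/port are placeholders).
import socket
import sys

HOST, PORT = "192.168.0.20", 5900   # hypothetical student laptop address

def run_student():
    """Listen for text from the captionist and display it as it arrives."""
    with socket.create_server(("", PORT)) as server:
        conn, _ = server.accept()
        with conn, conn.makefile("r", encoding="utf-8") as stream:
            for line in stream:
                print(line, end="", flush=True)   # real-time display

def run_captionist():
    """Send each line typed by the captionist to the student's laptop."""
    with socket.create_connection((HOST, PORT)) as conn:
        for line in sys.stdin:                    # typed transcription
            conn.sendall(line.encode("utf-8"))

if __name__ == "__main__":
    run_student() if sys.argv[1:] == ["student"] else run_captionist()
```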

Word abbreviation software. Several software packages have been developed specifically for extensive abbreviation of words and phrases being entered into the computer. At this time, the two systems most commonly used with CAN appear to be the following:

Productivity Plus
Productivity Software International, Inc.
1220 Broadway
New York, NY 10001

Instant Text
Textware Solutions
83 Cambridge St.
Burlington, MA 01803-4181

Using one of these systems, the computer automatically converts the abbreviations typed by the captionist into the full words that appear on the screen. This software serves to increase the effective typing speed without increasing the necessary number of keystrokes, and permits the text to more closely approach the speed of the talker.

An example of the application of one of these abbreviation systems to a CAN service is a speech-to-text transcription system called C-Print™, which was developed at the National Technical Institute for the Deaf (McKee, Stinson, Giles, Colwell, Hager, Nelson-Nasca, & MacDonald, 1998).9 C-Print™ uses an extensive word-abbreviation dictionary, along with specific text-condensing strategies.

A major difference between C-Print™ and other CAN systems is its commitment to coming as close as possible to providing a verbatim transcription, due largely to the extensive abbreviation system it employs. As the teacher (or class participant) talks, the captionist types a series of abbreviations. For each abbreviation, Productivity Plus searches the dictionary for its equivalent full word and displays it on the screen. Two examples of abbreviations and their expansions as used in C-Print™ appear below.

Abbreviations     Full expansions
t kfe drqr        the coffee drinkers
slvg t pblm       solving the problem
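
As a simplified sketch of how abbreviation expansion behaves, the code below expands typed tokens word by word using only the two C-Print™ examples shown above; the tiny dictionary and the pass-through rule for unknown tokens are assumptions for illustration, not the actual Productivity Plus dictionary or the C-Print™ phonetic rules.

```python
# Simplified sketch of keystroke-saving abbreviation expansion.
# The entries below come from the two C-Print examples in the text;
# everything else about the dictionary is assumed for illustration.
ABBREVIATIONS = {
    "t": "the",
    "kfe": "coffee",
    "drqr": "drinkers",
    "slvg": "solving",
    "pblm": "problem",
}

def expand(typed: str) -> str:
    """Replace each abbreviation with its full word; leave other tokens alone."""
    return " ".join(ABBREVIATIONS.get(token, token) for token in typed.split())

if __name__ == "__main__":
    print(expand("t kfe drqr"))           # -> the coffee drinkers
    print(expand("slvg t pblm"))          # -> solving the problem
    print(expand("slvg t pblm quickly"))  # unknown words pass through unchanged
```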

The C-Print™ captionist is not required to memorize all the abbreviations in the C-Print™ system. Instead, she/he learns a set of phonetic rules developed specifically for C-Print™, which are then applied to any English word that has been added to the system’s general dictionary. The general dictionary developed by the C-Print™ staff currently contains approximately 10,000 words, including suffixes, which were selected from research on word frequencies in English. Specialized dictionaries can also be created that allow for the abbreviation of vocabulary, phrases, and acronyms unique to a course or subject area.

9 The C-Print™ project has been supported by grants 180J3011 and 180U6004 from the United States Department of Education, Office of Special Education.

TEXT DISPLAY

Format. The text display for a CAN system generally shows words appearing letter by letter, as opposed to a steno-based system that displays individual words or groups of words in a single burst. For the C-Print™ system, the student sometimes sees a split-second conversion from the abbreviation to the full word. Student feedback indicates this is not distracting.

The number of lines of text displayed in real time varies by the type of display and size of letters. A single-spaced laptop display may show 30 or more lines of text. A television monitor display with letters of a large font size, such as 30-point, may permit up to 15 lines, depending upon the particular system.

Content. For the C-Print™ system, the operator does not type every word, but does try to capture as much important information as possible. The text generated by some CAN systems (for both real-time display and hard copy) can be considerably more detailed than notes taken by trained notetakers, but is more condensed than the transcriptions provided by steno-based systems. Below is an unedited paragraph of text, with follow-up comments, produced in a history class by a C-Print™ captionist. Note the use of complete sentences.

Professor: King has successfully gone into Birmingham after the failure in Albany, and has provoked a great deal of violence and has gotten a great deal of press coverage. It is severe violence. Although violence is seen on national television and Kennedy responds by not defending the existing legislation as Eisenhower did, this is a crucial shift, but by saying he will create legislation in support of the cause. That is the Civil Rights Bill of June, 1963. He is initiating his own legislation. It would strengthen desegregation in all places. In response to this is the march on Washington that takes place on Aug. 28, 1963. This is in support of Kennedy’s bill.

Bayard Rustin and A. Philip Randolph come back into the picture to organize the event. King gives his famous “I have a dream” speech. It is a great symbolic event. It shows a great deal of unity within the country behind doing something about civil rights.

Student: Is that an all-Black march?

Professor: No. It was by no means an all-Black march, it was greatly diverse. A. Philip Randolph gets his dream of the march, but it is not all Black. The movement is unified around one strategy — provoke violence, get it on television, and get government to do something.

At the end of class, the CAN text is saved as a word processing file that can then be corrected and distributed to students as hard copy text, on a floppy disk, or electronically. Electronic distribution requires that the captionist have access to a computer and can send the file electronically to the student. The student in turn can download and print the text at his/her convenience. Student feedback indicates that an effort should be made to distribute the text on the same day as the class or the following day.

PREPARATION FOR CLASS

The captionist has a number of duties prior to actual in-class transcription. In preparation for each class, she/he needs to become familiar with new terms and concepts likely to be used in class. If working with a CAN system that uses extensive abbreviations, she/he may add abbreviations to the specialized dictionary so that words used frequently in a particular course (e.g., technical words, proper names, new terms) will appear when their corresponding abbreviations are typed.

Equipment must be set up prior to the class. This may mean connecting two laptops with each other. If a television monitor is to be used, it must be requisitioned and connected.

Prior to the first class, the captionist should discuss with the students for whom the speech-to-text will be used how the CAN system works, what they can expect from it, and their respective responsibilities. They may also need to discuss specific ways in which the captionist can be helpful during class. This may include matters such as repeating the students’ questions if they’re not understood in class, or reading aloud the questions and other comments the student types on his/her laptop with the intent of sharing them with the class. The latter assumes that the particular student chooses not to voice for him/herself, and that the particular CAN system being used has this interactive feature.

If the class activity is a small group discussion, it is desirable for the real-time display to be a laptop monitor rather than a television monitor. It seems easier for deaf and hard of hearing students to shift between viewing a laptop display directly in front of them and observing the speaker(s) than to shift attention between a television monitor and the speaker(s).

PREPARATION AND DISTRIBUTION OF NOTES10

The hard copy notes are intended to be educational tools, not necessarily near-verbatim accounts of what happened in class. Therefore, information that is extraneous to the educational content can be omitted. Also, any confidential information about the students or others should be omitted. The captionist should be sensitive to the wishes of the instructor regarding other information to be omitted from the hard copy notes for a particular class.

Assignments should be accurately recorded. Beyond assignments, a good approach for captionists to use when deciding what information to include and what to omit is to provide notes that would help a student who was absent know what educational information was presented. This approach will help captionists decide what to include, and what changes to make, to render the class content both accurate and understandable.

ERGONOMICS AND THE SCHEDULING OF CAPTIONISTS

Transcribing for more than one hour without a break increases the risk of what has variously been called repetitive motion injuries and cumulative trauma disorder. Captionists in the college environment are likely to engage in intense typing of continuous lectures for up to one hour and will generally need an hour of “down” time before resuming typing. This time can often be devoted to preparing notes or preparing for the next class.

In an attempt to minimize ergonomic risk factors, it is recommended that:

(a) captionists continue to develop their skills with the abbreviation system to reduce keystrokes, and use other text-condensing strategies;

(b) where possible, captionists choose seating that reduces discrepancies in table, elbow, and keyboard height;

(c) regular interviews with the captionist be conducted by her/his supervisor to monitor changes in comfort, fatigue, and effort; and

(d) where feasible, the college make the captionist’s position part time.

QUALIFICATIONS AND TRAINING OF A CAN CAPTIONIST

Qualified captionists need first to be skilled typists (with typing speeds of 60 words per minute or better), need to have good verbal and auditory skills, and need to be familiar with the operation of laptop computers. It is helpful if the captionist has familiarity with the course material, although this often is impractical as a requisite.

A survey of existing pay scales suggests an hourly rate ranging from $10 to $15, inclusive of preparation time and time required for text editing and distribution as notes. One college surveyed indicated a pay scale comparable to that of interpreters.

With respect to training, NTID, through its C-Print™ system, appears to be the only college offering CAN training as a course (McKee, Stinson, Everhart, & Henderson, 1995). This one-month course is designed to teach the abbreviation rules that enable the C-Print™ captionist to save substantial numbers of keystrokes. The course also teaches strategies to condense information. Training includes practice transcribing real college lectures from audiotapes. Training materials consist mostly of a 62-page manual and 50 audiotapes.

10 This topic and several others that follow draw extensively from McKee, B., Stinson, M., Giles, P., Colwell, J., Hager, A., Nelson-Nasca, M., & MacDonald, A. (1998). C-Print™: A Computerized Speech-to-Print Transcription System: A Guide for Implementing C-Print™. Rochester, NY: National Technical Institute for the Deaf.


Regardless of the CAN system that is used, a real issue is how soon captionists can become comfortable displaying what they are typing in real time in the classroom. Coming into the classroom and keying in rapidly spoken lecture material, which will be viewed by a student who is dependent upon it for learning, is a challenging and sometimes stressful task.

Captionists may be concerned about keeping up with a lecture pace, omitting important information, and making errors. Before they can become comfortable doing this, they may need in-class experience transcribing lectures where the text is not displayed in real time for the student.

ILLUSTRATIVE POLICIES AND PROCEDURES

As with steno-based systems, the cooperation of the captionist, the deaf and hard of hearing students, hearing classmates, and the instructor is necessary in order for the CAN service to work successfully in the classroom. The following policies and procedures are adapted from those developed for one college (NTID) in its use of a CAN system (Giles, 1996), and are organized around General Information, Captionist’s Responsibilities, and Student’s Responsibilities.

General Information

• CAN notes are intended to be used by supported student(s) registered in the course and should not be copied unless otherwise specified by the instructor.

• CAN notes are not a substitute for attending class.

• Because the notes need to be edited quickly and distributed as soon as possible, CAN notes are not guaranteed to have 100% correct grammar or spelling.

Captionist’s Responsibilities

The captionist will:

• provide an in-class text display for appropriate support service students. In addition, notes (generated from the text display) will be made available to supported students who attended class.

• make every effort to type spoken information word-for-word, and communicate the information in the manner in which it is intended. At times (during fast speech), the captionist will need to summarize information, but she/he will type as much of the important information as possible.

• assist by voicing comments or questions typed by students on the laptop provided (if it has the necessary communication software), or in another way mutually agreed upon.

• begin typing upon arrival of the students. Any announcements made by the instructor before the student(s) arrive will be typed. After 10 minutes, if none of the supported students are in attendance, the captionist will leave. However, if the student has notified the CAN office or the instructor at least 24 hours in advance, the captionist will take notes if approved by the instructor.

• indicate different speakers in the text by indicating “Professor”, “Female Student”, and “Male Student”.

• be responsible for facilitating communication between the supported student(s) and others, i.e., the instructor and other students. This includes asking for clarification from the instructor or other students when necessary.

• be responsible for trying to resolve any problems stemming from student or instructor concerns about CAN.

• arrive at least 10 minutes before class to allow time for equipment set up.

• become familiar with the scheduled lecture by preparing for class through reviewing the textbook and related materials.

• find a replacement if she/he is sick. If a replacement cannot be found, the captionist will notify the appropriate Support Department that will notify the supported student(s).

• provide on-the-spot troubleshooting for equipment breakdown with minimum disruption to the class. If no solution is found, the captionist will make an effort to accommodate the supported student(s) to the best of his/her ability. Technical breakdowns are unforeseen and most often require diagnoses outside the classroom environment.

• when necessary, request an interpreter for special circumstances such as an oral presentation by the supported student(s).

• provide class handouts to authorized individuals, e.g., tutors.

• summarize videotapes (captioned or uncaptioned).


Student’s Responsibilities

The student will:

• introduce him/herself to the captionist so the captionist is familiar with each student.

• be responsible for taking notes and diagrams from the blackboard and overhead.

• be responsible for notifying the CAN Office if he/she will not be attending class or has withdrawn from the course. Three consecutive unexcused absences will result in the termination of CAN services.

• be responsible for double-checking spelling on any vocabulary.

• raise her/his hand when interested in communicating comments or questions through typing on the laptop (if so equipped).

• inform the captionist of any special needs for special circumstances, e.g., interpreter, at least two weeks in advance.

EVALUATING CAN SERVICES

In evaluating the effectiveness of CAN services, college staff will want to consider (a) the quality of the real-time display in class, and (b) the quality of the hard-copy text or notes distributed to students after class (together with the timeliness of their distribution). Evaluation should be tied to the objectives of the system, i.e., summary notes vs. near-verbatim text.

If the intent is that the captionist record as much information as possible, there is a need for some kind of comparison between what the teacher and students actually said in class and what the captionist typed. For example, some preliminary data indicate that it is possible for a CAN system to capture 65 percent of the total ideas expressed in a lecture and 83 percent of the important ideas. These figures were obtained by using a standardized procedure for comparing recordings of teachers’ lecture material with the corresponding text typed by the captionists.
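As an illustration only, this kind of comparison reduces to a simple tally once a rater has marked which idea units from the recorded lecture appear in the captionist’s text. The sketch below is hypothetical: the idea units, their importance ratings, and the matching judgments are invented assumptions, not the standardized procedure or the data cited above; it only shows how capture percentages of this sort can be computed.

```python
# Hypothetical sketch of computing idea-capture rates for a CAN transcript.
# Each idea unit from the recorded lecture is marked by a rater as important
# or not, and as present or absent in the captionist's text. The records
# below are invented for illustration; they are not data from the cited study.

from dataclasses import dataclass

@dataclass
class IdeaUnit:
    text: str
    important: bool   # rated "important" by the evaluator
    captured: bool    # judged present in the captionist's text

def capture_rates(units):
    """Return (percent of all ideas captured, percent of important ideas captured)."""
    important = [u for u in units if u.important]
    pct_all = 100 * sum(u.captured for u in units) / len(units)
    pct_important = 100 * sum(u.captured for u in important) / len(important)
    return pct_all, pct_important

if __name__ == "__main__":
    sample = [
        IdeaUnit("definition of money", important=True, captured=True),
        IdeaUnit("aside about last week's quiz", important=False, captured=False),
        IdeaUnit("barter preceded coinage", important=True, captured=True),
        IdeaUnit("joke about vending machines", important=False, captured=False),
        IdeaUnit("paper money as a promise to pay", important=True, captured=False),
    ]
    overall, important_only = capture_rates(sample)
    print(f"All ideas captured: {overall:.0f}%")              # 40% in this toy sample
    print(f"Important ideas captured: {important_only:.0f}%") # 67% in this toy sample
```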

It is also important to obtain deaf and hard of hearing student feedback regarding (a) the benefit of the real-time display, (b) the extent of their understanding of the classroom discourse, (c) their ability to participate in class, (d) the professionalism of the captionist and appropriateness of her/his behavior, and (e) helpfulness of the notes.

Feedback should be obtained also from the captionist and the instructor. The evaluation form for stenotypists as shown in the Appendix can be modified for use in connection with CAN systems.

Questions for the instructor can include whether the role of the captionist was adequately explained, whether the captionist performed her/his job with minimum disruption to the class, whether teaching methods were altered to accommodate the CAN system, and whether the instructor was able to express her/his concerns to the captionist.

To date, the systematic collection of feedback regarding CAN systems from students and faculty has been limited. One major theme that emerges from all the reports is that students perceived these various systems as beneficial, particularly in increasing their understanding of classroom communication (Hobelaid, 1988; McKee et al., 1995; Everhart, Stinson, McKee, & Giles, 1996).

Data also have been collected in the process of evaluating the C-Print™ system at Rochester Institute of Technology. Questionnaire interview data from mainstreamed deaf and hard of hearing students indicated that they reported significantly greater understanding of information during a lecture with C-Print™ than with an interpreter. In addition, students stated a preference for the hard-copy detailed notes generated by the C-Print™ system over notes from a traditional notetaker (Everhart et al., 1996).

These findings are similar to those for steno-based systems, but should not be construed to suggest that such systems should replace these more traditional services. The important point is that these data do show that some students and some classes find the services beneficial.

RELATIVE ADVANTAGES OF STENO-BASED AND CAN SYSTEMS

Steno-based systems. Steno-based systems have the following advantages:

• Steno-based systems capture virtually every word that is spoken. Thus, it is possible for the student to read the text of exactly what was said in real time.

• One stenotypist can cover a two-hour class, with a brief break.

• The stenotype machine is virtually silent.


CAN systems. CAN systems have the following advantages:

• CAN systems yield notes that are briefer and potentially easier to study than the verbatim transcripts yielded by steno-based systems.

• CAN captionists require relatively little special keyboard training beyond the ability to type 60 words per minute, increasing their availability.

Consideration of the relative advantages of the two systems indicates that it is not possible to make a general recommendation of one system over the other. A college may even wish to include both services in its repertoire of technologies.

The decision regarding which of the two services to provide will depend on a variety of issues, including availability of potential staff to provide support, costs, the type of class, and individual student needs.

NEW TECHNOLOGIES FOR COMMUNICATION AMONG MACHINES

A relatively recent application of technology, used most often with steno-based systems, is the provision of real-time transcription between two “remote” sites by telephone lines. The voice of a speaker is picked up by a microphone and transmitted to a stenotypist at a remote location via the first of two telephone lines. The stenotypist relays the real-time text via a second telephone line back to a television or computer display for the deaf or hard of hearing individual to read where he/she is located (Preminger & Levitt, 1997; Eisenberg & Rosen, 1996; Levitt, 1994; Stuckless, 1994). Although reports of this approach describe applications only with steno systems, it should apply also with CAN systems.
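To make the two-line arrangement more concrete, the sketch below models only the return path, the stenotypist’s text streaming back to a display at the student’s site, as a simple line-oriented network relay. It is a minimal illustration under stated assumptions, not any vendor’s product or protocol; the host, port, and one-line-at-a-time framing are inventions for the example.

```python
# Minimal sketch of the "return path" in remote real-time transcription:
# text produced at the stenotypist's site is streamed, line by line, to a
# display program at the classroom site. Host, port, and framing are
# illustrative assumptions, not an actual product's protocol.

import socket

HOST, PORT = "0.0.0.0", 5050  # hypothetical values

def run_display_server():
    """Classroom side: accept one connection and print each text line as it arrives."""
    with socket.create_server((HOST, PORT)) as server:
        conn, _addr = server.accept()
        with conn, conn.makefile("r", encoding="utf-8") as stream:
            for line in stream:              # one transcribed line per network line
                print(line, end="", flush=True)

def send_caption_line(remote_host: str, text: str):
    """Stenotypist side: send one line of real-time text to the classroom display."""
    with socket.create_connection((remote_host, PORT)) as conn:
        conn.sendall((text + "\n").encode("utf-8"))

if __name__ == "__main__":
    run_display_server()
```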

Infrared and radio frequency-based networking devices use a technology that increases the portability and ease of use of speech-to-text systems in the classroom. This technology eliminates the need for the cables that are commonly used to connect laptop computers with each other. One drawback of these cables is that the two laptop computers, i.e., the one being used by the captionist or stenotypist and the one being used by the student, need to be relatively close to each other. Also, cable connections require set-up time (often between classes) and are inconvenient when strung out in a classroom setting.

Infrared networking devices use a PCMCIA adapter (such as Cooperative, which is produced by Photonics), or devices now being integrated into many laptop models, permitting wireless communication between computers. This means that the two (or more) computers do not need to be in close proximity to each other, and time does not need to be devoted to connecting the computers (Knox-Quinn & Anderson-Inman, 1996).

Software that permits two-way communication between the student and captionist or stenotypist already has been described. Network software (such as Aspects, produced by Group Logic) provides for real-time collaborative interaction among up to 32 persons working in the same word-processing or graphics document. This network software permits the stenotypist or captionist to simultaneously communicate with more than one other computer, i.e., with numerous students in different locations of the classroom.

Using this software, it is also possible to create a split-screen display in which students may communicate with each other or add their own notes on half the screen, while observing the CAN or steno-generated text on the other half (Knox-Quinn & Anderson-Inman, 1996). One particular benefit of such an arrangement is that it may encourage notetaking on the part of the deaf or hard of hearing student, since she/he need not look at the keyboard. An added feature is that the program can correlate the student’s own notes with the CAN or steno-generated text.
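A sketch of the correlation idea, pairing a student’s own timestamped notes with the most recent line of the running transcript, is given below. The data structures and the timestamp-matching rule are assumptions made for illustration; this is not the Aspects software or the Project CONNECT implementation.

```python
# Hypothetical sketch: correlating a student's own notes with the
# CAN/steno-generated transcript by timestamp. The matching rule (pair each
# note with the last transcript line at or before it) is an assumption.

from bisect import bisect_right

def correlate(transcript, notes):
    """
    transcript: list of (seconds_into_class, text) in time order
    notes:      list of (seconds_into_class, text) typed by the student
    Returns each note paired with the most recent transcript line at that moment.
    """
    times = [t for t, _ in transcript]
    paired = []
    for when, note in notes:
        i = bisect_right(times, when) - 1           # last transcript line not after the note
        context = transcript[i][1] if i >= 0 else ""
        paired.append((note, context))
    return paired

if __name__ == "__main__":
    transcript = [(10, "Professor: Today we cover supply and demand."),
                  (95, "Professor: Demand curves slope downward."),
                  (210, "Female Student: Does that hold for luxury goods?")]
    notes = [(120, "downward slope = price up, quantity down"),
             (215, "check: luxury goods exception?")]
    for note, context in correlate(transcript, notes):
        print(f"NOTE: {note}\n  ...taken during: {context}\n")
```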

USE OF REAL-TIME SPEECH-TO-TEXT RELATIVE TO OTHER CLASSROOM SUPPORT SERVICES

Real-time speech-to-text is one of four direct classroom support services that are discussed in this series of reports, the others consisting of assistive listening devices, interpreting, and notetaking. Some of the factors to consider in choosing one or more of these services for a given deaf or hard of hearing student taking a particular course follow. These factors are classified loosely under Individual deaf or hard of hearing student, Course and/or instructor, and Other considerations. For the purpose of this report, we will discuss these only in relation to real-time speech-to-text services.


FACTORS TO BE CONSIDERED IN SELECTION OF REAL-TIME SPEECH-TO-TEXT AND ALTERNATIVE SUPPORT SERVICES IN THE CLASSROOM

Individual deaf or hard of hearing student. Student-specific factors include:

• Preference of the student. Major consideration should be given to providing this service when it is the student’s preference over other services.

• Prior experience and satisfaction with specific classroom support service. Favorable prior experiences in using real-time speech-to-text in the classroom support the student’s preference.

• Ability to participate orally in question-asking and discussion. Real-time speech-to-text services require that students either use their own voice if their speech is intelligible, or type and have the captionist read the display aloud to the class. For students with intelligible speech, it generally is easier for them to speak than to type.

• Ability to make effective use of an assistive listening device in the classroom. If the student is able to make effective use of an assistive listening device in the classroom, if the device is well maintained, and if both the instructor and fellow students cooperate in its use, the student may have little need for the real-time service. However, she/he may continue to need its notetaking features.

• Level of reading proficiency. A requisite for functional use of real-time speech-to-text at the college level is the student’s ability to read the text.

• Level of signing proficiency. A deaf student is likely to have proficiency in sign language, and this may be her/his first language. If so, the student may profit more from the use of an interpreter than from real-time speech-to-text. However, this will not obviate the probable need for a notetaking service of some kind.

Course and/or instructor. Course/instructor factors include:

• Lecture vs. discussion-oriented course. Some courses involve more active in-class student participation than others. Because of the interactive constraints on real-time speech-to-text systems, they are better adapted to courses that feature a lecture mode than to courses that are highly discussion-oriented. This reservation may not apply to students with intelligible speech skills.

• Course content. In general, speech-to-print services may work less effectively with certain courses, such as mathematics. However, experience in providing services indicates that the student’s preferences and needs are critical in deciding which of his/her courses should use speech-to-text services. Where one student may not feel that a computer science class is appropriate for speech-to-text services, another student may.

• Duration of class period. Regardless of the type of service, a class extending beyond an hour without a break can be stressful for the service provider. Given a 10-minute break after the first hour, the stenotypist providing a steno-based service appears to be better able to continue through the second hour without relief than the captionist offering a CAN service or the interpreter providing the interpreting service.

• Instructor’s communication style. The perfect instructor for real-time speech-to-text services (and for interpreting and conventional notetaking services as well) is one who speaks at or below normal speaking rates, i.e., 150 wpm, articulates clearly, and tends to use grammatically correct sentence structures. She/he is well organized by topic, and shares her/his lecture notes with the service provider well in advance of the class.

Other considerations. The following two considerations can be administratively and legally complex. Conditions might include:

• Presence of more than one deaf or hard of hearing student in the class. In colleges with large enrollments of deaf and/or hard of hearing students, it is common for two or more of these students to be enrolled in the same class. This does not necessarily mean the same classroom support service(s) are needed by each. This pertains particularly to a situation where one student needs an interpreter and a second student needs real-time speech-to-text services. In this instance, both services should be provided, but presumably the speech-to-text service could supply notes to both, eliminating the need for a special notetaker.

• Availability/unavailability of qualified service provider(s). By law, a college cannot conclude that the most appropriate “type” of classroom support service for a given student is unavailable, without clear indication that considerable effort has been made to obtain the services of the needed provider(s). Because of the requisite training factor, one of the CAN systems should be considered among the most available, and a substitute for a steno-based system. The substitution of a transcription system for interpreting depends on several factors mentioned above, including reading proficiency (Brueggemann, 1995).

AUTOMATIC SPEECH RECOGNITION (ASR) IN THE CLASSROOM

At a national meeting in April 1997 on the topic of “Applications of automatic speech recognition with deaf and hard of hearing people” (Stuckless, 1997), numerous speech scientists spoke enthusiastically about recent developments in the ASR field, with particular reference to the recognition of continuous speech. This coincided with an announcement that Dragon Systems was about to release its first version of NaturallySpeaking, a major product breakthrough (Mandel, 1997). IBM followed later in the same year with ViaVoice.11

For many years, scientists have been seeking the model ASR system, one that would have three fundamental properties:12

• the capacity to recognize a large vocabulary
• the ability to process natural speech
• the ability to recognize different speakers

Large vocabulary. For more than a decade, systems have been available with vocabularies numbering in the thousands of words. Current products have “active” vocabularies of 30,000 words or more, with the capability of allowing the user to add thousands more, e.g., to add obscure names and technical terms. Vocabulary size per se is not a limiting factor for the use of ASR in the college classroom.

Natural speech. Until 1997, commercially available ASR products featured discrete speech recognition, requiring the speaker to pause briefly between each word. While these pauses were tolerable for dictation purposes, speaking in this manner was anything but natural. A secondary effect was that our rate of speech was severely curtailed.

Since 1997, we have been able to choose among a number of products that are capable of recognizing continuous speech. By continuous we simply mean that we no longer must pause between every word. The provision of continuous speech in ASR certainly enables us to speak more naturally than was possible previously. Also, it enables us to speak at or near our normal speaking rate. A third major advantage is that it tends to lead to greater accuracy, which has been reported as high as 97 percent.

That having been said, we must distinguish between continuous and natural speech. The two are not synonymous. Continuous speech per se does not include the recognition of some of the cues found in natural speech, such as voice inflection and pauses. As a consequence, it does not automatically produce punctuation and other markers, e.g., space between paragraphs, which contribute so much to the readability of text. This is illustrated by the following excerpt from an actual lecture, as transcribed from an audiotape into text, using continuous speech recognition.

Why do you think we might look at the history of the family history tends to dictate the future okay so there is some connection you’re saying what else evolution evolution you’re on the right track which changes faster technology or social systems technology

The above excerpt was transcribed with 100 percent verbatim accuracy, using continuous speech recognition. But imagine trying to read lecture text for an hour as it appears above, particularly when it is being displayed at the rate of 150 words per minute. Taken alone, high verbatim accuracy is no guarantee of readability.

As seen next, the same excerpt becomes much more readable when punctuation and speaker identification are added, using the appropriate voice commands.

11 Both products since have been upgraded and been joined by Lernout and Hauspie’s Voice Xpress and Philips’ Free Speech. See Alwang (1998) for a comparative review of these four products.

12 A recommended, clearly written reference source on ASR is Markowitz, J.A. (1996). Using speech recognition. Upper Saddle River, NJ: Prentice Hall.


Instructor: Why do you think we might look at the history of the family?
Student: History tends to dictate the future.
Instructor: Okay. So there is some connection you’re saying. What else?
Student: Evolution.
Instructor: Evolution. You’re on the right track. Which changes faster, technology or social systems?
Student: Technology.

Recognition of different speakers. A single speaker transcribed the excerpt above because at present, ASR products are incapable of recognizing more than a single speaker (user) at a time, i.e., they lack speaker-independence. To become a user, an individual must sign on and devote half an hour or more to (a) becoming oriented to the system, and (b) orienting the system to his/her distinctive speech characteristics. She/he can then become a user, with her/his own speech files. To use the system, the user identifies her/himself, calling up these speech files.

Without speaker-independent ASR, we cannot pass around a microphone to students in a class with the expectation that their speech will be recognizable. This is one of several reasons why ASR products cannot yet capture conversational speech (Allen, 1997; Woodcock, 1997).

Extending ASR applications into the classroom. Given the present (1999) state of the art, it is not feasible to apply ASR for general real-time classroom use with deaf and hard of hearing students. However, if the application consists of a single user, e.g., a single instructor presenting an uninterrupted lecture, the task becomes less formidable. The following passage was transcribed from an audiotape of another lecture, using ASR.

Today I’d like to discuss with you a little bit about the history of money my purposes to give you a flavor for the role of money and some of the interesting problems and types of money that existed throughout history to begin with I’d like to raise the question as to where did money come from today how to paper money get here

Note that this monologue is easier to read than the previous unpunctuated passage that involved numerous changes in speakers. Parenthetically, this passage contains two ASR errors (purposes/purpose is; to/did), and a 97% verbatim accuracy rate. Judge its readability for yourself, notwithstanding its absence of punctuation. You may agree that this passage is quite intelligible, in spite of its two ASR transcription errors.
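For readers who want to check the figure, verbatim accuracy here is simply the proportion of words transcribed correctly. The short sketch below reproduces the arithmetic; the counts (roughly 64 words, 2 errors) come from the excerpt itself.

```python
# Verbatim accuracy for the ASR passage above: errors counted against the
# number of words in the excerpt (roughly 64 words, 2 recognition errors).

def verbatim_accuracy(word_count: int, errors: int) -> float:
    """Percentage of words transcribed correctly."""
    return 100 * (word_count - errors) / word_count

print(f"{verbatim_accuracy(64, 2):.0f}%")  # prints 97%
```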

Now let’s say the instructor had said period or question mark as he was speaking to break up his four sentences. These commands not only insert punctuation but also lead automatically to capitalization of the first word in the following sentence, adding to readability. The passage would then have appeared as follows:

Today I’d like to discuss with you a little bit about the history of money. My purposes to give you a flavor for the role of money and some of the interesting problems and types of money that existed throughout history. To begin with I’d like to raise the question as to where did money come from today. How to paper money get here?

We are not suggesting that the instructor with a class consisting predominantly of hearing students use this strategy, but this sample does suggest how close we have come to making ASR feasible under specific conditions.
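The effect of such commands can be imitated in software. The sketch below is a simplified post-processor, not Dragon’s or IBM’s actual command handling: it scans an unpunctuated word stream, replaces the spoken tokens “period” and “question mark” with punctuation, and capitalizes the word that follows.

```python
# Simplified sketch of spoken-punctuation post-processing, in the spirit of
# the "period" / "question mark" commands described above. Real dictation
# products handle many more commands; this only illustrates the idea.

PUNCTUATION_COMMANDS = {
    "period": ".",
    "question mark": "?",
}

def apply_spoken_punctuation(words):
    out = []
    capitalize_next = True
    i = 0
    while i < len(words):
        two_word = " ".join(words[i:i + 2]).lower()
        one_word = words[i].lower()
        if two_word in PUNCTUATION_COMMANDS:        # e.g. "question mark"
            if out:
                out[-1] += PUNCTUATION_COMMANDS[two_word]
            capitalize_next = True
            i += 2
        elif one_word in PUNCTUATION_COMMANDS:      # e.g. "period"
            if out:
                out[-1] += PUNCTUATION_COMMANDS[one_word]
            capitalize_next = True
            i += 1
        else:
            out.append(words[i].capitalize() if capitalize_next else words[i])
            capitalize_next = False
            i += 1
    return " ".join(out)

if __name__ == "__main__":
    spoken = ("today I'd like to discuss the history of money period "
              "how did paper money get here question mark").split()
    print(apply_spoken_punctuation(spoken))
    # Today I'd like to discuss the history of money. How did paper money get here?
```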

One researcher is presently exploring the use of shadowing as an interim technique for the use of ASR in the college classroom. This project involves the services of someone with an aptitude for shadowing the speech of the instructor and students, together with a few hours of training and practice with ASR.

This person uses a special mask with a built-in microphone connected to a computer containing ASR software and her speech files. Her task is to listen to the instructor, restating what is being spoken as fully as possible, adding sentence-ending punctuation, and identifying each change in speakers, all in real time (Stuckless, in progress).

If recent progress is any indication, there is reason to be optimistic about extending the application of automatic speech recognition into the classroom (Levitt, 1997; Mandel, 1997; Picheny, 1997). Has its time arrived? The answer has to be no. However, within a few years, automatic speech recognition is likely to replace other real-time speech-to-text and notetaking services for many deaf and hard of hearing students in the college classroom. If and when this occurs, it will come about because of its demonstrated value to these students, its relatively low cost, its convenience including availability when needed, and the direct control it will give to the student.

CONCLUSIONS

Speech-to-text systems have expanded the tools available to educators for effectively supporting deaf and hard of hearing students who are educated with hearing classmates. Currently there are many mainstreamed students who cannot hear well enough to follow the classroom discussion, but have intelligible speech and good reading skills. Such students are sometimes given an interpreter, but this service is of limited benefit if the student does not understand signs well.

There are also some situations where the student understands sign communication, but for success in a particular class, it is important after class to be able to review a text that details the class discussion. Speech-to-text services provide a quality option that can effectively address such situations.

The two technologies currently in use to provide speech-to-text services are steno-based systems, in which a stenotype machine is linked to a computer, and CAN systems, which use standard keyboard laptop computers. Automatic speech recognition systems, in which the conversion to print is done entirely by computer and without an intermediary, will become available in the future and may support communication access even more effectively (Kurzweil, 1999). Other advances in technology are also likely to make these systems more flexible and easier to use.

A serious issue is the fact that none of the speech-to-text technologies discussed in this report adequately address expressive communication by deaf and hard of hearing people.

Individuals with intelligible speech, such as many who are hard of hearing or late deafened, may be able to use their voices to make a comment or ask a question. Others may write or type into a keyboard to produce text or synthetic speech, but in many situations these means may be limited or inadequate.

Speech-to-text services are not a panacea for the communication difficulties of deaf and hard of hearing students. In instructional situations such as small group discussions, laboratories, and one-to-one tutoring, these services may be less appropriate than they are in lecture situations (Haydu & Patterson, 1990). Furthermore, many deaf students prefer an interpreter to a speech-to-text system in most class situations (Stinson et al., 1988).

Even with these limitations, speech-to-text services have been used repeatedly to effectively support accessibility to information in the classroom. This experience has clearly demonstrated that these services are a viable option for supporting the communication access of many deaf and hard of hearing students in settings where they are interacting with hearing people. In the future, as the necessary technologies improve, and as we learn more about how these services can effectively support students, speech-to-text services should make even greater contributions to improving the postsecondary education of students who are deaf or hard of hearing.

POSTSCRIPT PERTAINING TO LAWS AND REGULATIONS13

With relation to deaf and hard of hearing students, higher education is currently on the horns of a dilemma: given the advent of various speech-to-text systems and advances in voice recognition software, will institutions forego the services of sign language interpreters in reliance on speech-to-text systems, and/or will the shortage of qualified sign language interpreters in certain areas of the country inadvertently push colleges and universities into taking this step?

There are no easy answers. This chapter lays out the pros and cons of various speech-to-text systems and the factors, both student related and instructional, which should enter into a college’s determination as to whether speech-to-text is a reasonable accommodation and, if so, which type of speech-to-text system would be appropriate in a given circumstance. It also demonstrates that the data suggest that speech-to-text systems can be very effective for a good number of students, but that regardless of future developments, speech-to-text systems will always have the limitations inherent in such a process, most notably, reducing the ability of deaf and hard of hearing students to fully participate in classes conducted in an interactive manner.

Ultimately, the law requires two things: (a) that communications with students with disabilities, here deaf and hard of hearing students, be “as effective as” those provided to students without disabilities; and (b) that an individualized assessment be made in order to determine what (a) is. This chapter goes a long way toward helping service providers make those assessments. In addition, public colleges and universities must give “primary consideration” to the communication preferences of deaf and hard of hearing students, although as discussed in other commentaries herein, this does not mean the student will always get what s/he wants.

For the most part, if a student prefers sign language and uses interpreters, institutions will opt for providing notes to students via notetaking systems, which are effective but less expensive than a speech-to-text system, which would arguably provide more complete notes. However, the law does not require that students with disabilities receive the “best” notes, only that they have notes which are “effective.” Deaf and hard of hearing students should bear in mind that most hearing students rarely take notes of the quality which would be provided by a speech-to-text system.

At present, speech-to-text systems are roughly as expensive as sign language interpreters. In the future, this may change, and lowered costs may become an incentive for institutions to choose speech-to-text over interpreters. Nevertheless, until and unless the law is amended, the legal analysis of which type of auxiliary aid or service should be provided, and thus whether access is achieved, will remain the same.

In addition, if a student’s communication preference is speech-to-text and this is not available, the Office for Civil Rights (OCR) has made clear that a good faith effort to locate and implement such a system must be demonstrated before a public institution may provide an alternative system of communication. While private colleges and universities do not have to give “primary consideration” to students’ communication preferences, they must nevertheless provide communications which are “as effective as” those provided to students without disabilities. Thus, in order for a private institution to provide an auxiliary aid or service which is arguably less effective than that requested by the student, it should likewise be able to demonstrate that it made a good faith effort to secure the auxiliary aid or service which is “as effective as” that provided nondisabled students, but nevertheless was unable to secure that aid or service.

13 Contributed by Jo Anne Simon, consultant/attorney specializing in laws and regulations pertaining to students with disabilities.


REFERENCES

Allen, J. (1997). Applications of automatic speech recognition to natural language and conversational speech. In R. Stuckless (Ed.), Frank W. Lovejoy symposium on applications of automatic speech recognition with deaf and hard of hearing people. Rochester, NY: Rochester Institute of Technology.

Alwang, G. (1998). Speech recognition: Finding its voice. PC Magazine, October 20 issue.

Brueggemann, B.J. (1995). The coming out of deaf culture and American Sign Language: An exploration into visual rhetoric and literacy. Rhetoric Review, 13, 409-420.

Cuddihy, A., Fisher, B., Gordon, R. & Schumaker, E. (1994, April). C-Note: A computerized notetaking system for hearing-impaired students in mainstream postsecondary education. Information and Technology for the Disabled, 1(2).

Eisenberg, S. & Rosen, H. (1996, April). Real-time in the classroom. Paper presented at the biennial conference on postsecondary education for persons who are deaf or hard of hearing. Knoxville, TN.

Everhart, V.S., Stinson, M.S., McKee, B.G., Henderson, J. & Giles, P. (1996, April). Evaluation of a speech-to-print transcription system as a resource for mainstreamed deaf students. Paper presented at the annual meeting of the American Educational Research Association, New York.

Giles, P. (1996). C-Print™ support service policy. Unpublished document. National Technical Institute for the Deaf, Rochester, NY.

Hastings, D., Brecklein, K., Cermak, S., Reynolds, R., Rosen, H. & Wilson, J. (1997). Notetaking for deaf and hard of hearing students. Report of the National Task Force on Quality of Services in the Postsecondary Education of Deaf and Hard of Hearing Students. Rochester, NY: Northeast Technical Assistance Center, Rochester Institute of Technology.

Haydu, M.L. & Patterson, K. (1990, October). The captioned classroom: Applications for the hearing-impaired adolescent. Paper presented at the Fourth National Conference on the Habilitation and Rehabilitation of Hearing Impaired Adolescents. Omaha, NB.

Hobelaid, E. (1988). The feasibility of using a portable instacap system in a college environment. ERIC Document 293 614. Canadian Captioning Development Agency, Toronto, Ontario, Canada.

James, V. & Hammersley, M. (1993). Notebook computers as notetakers for handicapped students. British Journal of Educational Technology, 24, 63-66.

Kanevsky, D., Nahamoo, D., Walls, T. & Levitt, H. (1992). Prospects for stenographic and semi-automatic relay service. Paper presented at meeting of the International Federation of the Hard of Hearing, Tel Aviv, Israel.

Knox-Quinn, C. & Anderson-Inman, L. (1996, April). Project CONNECT: An interim report on wirelessly-networked notetaking support for students with disabilities. Paper presented at the annual meeting of the American Educational Research Association, New York.

Kozma-Spytek, L. & Balcke, M. (1995). Computer-assisted notetaking: Alternative to oral interpreting. GA-SK Newsletter, 26, 8.

Kurzweil, R. (1999). The age of spiritual machines. New York: Viking Press.

Levitt, H. (1994). Communications technology and assistive hearing technology. In M. Ross (Ed.), Communication access for persons with hearing loss. Baltimore, MD: York Press.

Levitt, H. (1997). Automatic speech recognition: Exploring potential applications for people with hearing loss. In R. Stuckless (Ed.), Frank W. Lovejoy symposium on applications of automatic speech recognition with deaf and hard of hearing people. Rochester, NY: Rochester Institute of Technology.


Mandel, M. (1997). Discrete word recognition systems and beyond: Today and five-year projection. In R. Stuckless (Ed.), Frank W. Lovejoy symposium on applications of automatic speech recognition with deaf and hard of hearing people. Rochester, NY: Rochester Institute of Technology.

Markowitz, J.A. (1996). Using speech recognition. Upper Saddle River, NJ: Prentice Hall.

McKee, B.G., Stinson, M.S., Everhart, V.S. & Henderson, J.B. (1995, April). The C-Print™ project: Development and evaluation of a computer-aided speech-to-print transcription system. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA.

McKee, B.G., Stinson, M., Giles, P., Colwell, J., Hager, A., Nelson-Nasca, M. & MacDonald, A. (1998). C-Print™: A computerized speech-to-print transcription system: A guide for implementing C-Print™. Rochester, NY: National Technical Institute for the Deaf, Rochester Institute of Technology.

Messerly, C. & Youdelman, K. (1994, June). Computer-assisted notetaking for mainstreamed hearing-impaired high school students. Paper presented at the International Convention of the Alexander Graham Bell Association for the Deaf, Rochester, NY.

Moore, K., Bolesky, C. & Bervinchak, D. (1994, July). Assistive technology: Provides equal access in the classroom. Panel discussion presented at the International Convention of the Alexander Graham Bell Association for the Deaf. Rochester, NY.

Picheny, M. (1997). Continuous speech recognition systems: Today and five-year projection. In R. Stuckless (Ed.), Frank W. Lovejoy symposium on applications of automatic speech recognition with deaf and hard of hearing people. Rochester, NY: Rochester Institute of Technology.

Preminger, J. & Levitt, H. (1997). Computer-assisted remote transcription: A tool to aid people who are deaf or hard of hearing in the workplace. Volta Review, 99, 219-230.

Sanderson, G., Siple, P. & Lyons, B. (1999). Interpreting. Report of the National Task Force on Quality of Services in the Postsecondary Education of Deaf and Hard of Hearing Students. Rochester, NY: Northeast Technical Assistance Center, Rochester Institute of Technology.

Smith, S.B. & Rittenhouse, R.K. (1990). Real-time graphic display: Technology for mainstreaming. Perspectives in Education and Deafness, 9(2), 2-5.

Stinson, M.S. & Stuckless, R. (1998). Recent developments in speech-to-print transcription systems for deaf students. In A. Weisel (Ed.), Issues unresolved: New perspectives on language and deaf education. Washington, DC: Gallaudet University Press.

Stinson, M., Stuckless, R., Henderson, J.B. & Miller, L. (1988). Perceptions of hearing-impaired college students toward real-time speech-to-print: RTGD and other support services. Volta Review, 90, 336-348.

Stuckless, R. (1983). Real-time transliteration of speech into print for hearing-impaired students in regular classes. American Annals of the Deaf, 128, 619-624.

Stuckless, R. (1994). Developments in real-time speech-to-text communication for people with impaired hearing. In M. Ross (Ed.), Communication access for people with impaired hearing. Baltimore, MD: York Press.

Stuckless, R. (Ed.) (1997). Frank W. Lovejoy symposium on applications of automatic speech recognition with deaf and hard of hearing people. Rochester, NY: Rochester Institute of Technology.

Task Force on Realtime Reporting in the Classroom (1994). Realtime in the educational setting: Implementing new technology for access and ADA compliance. Vienna, VA: National Court Reporters Foundation.

Virvan, B. (1991). You don’t have to hate meetings - Try computer-assisted notetaking. SHHH Journal, 12, 25-28.


Warick, R. (1994, July). Campus access for students who are hard of hearing. Paper presented at the meeting of the Association for Higher Education and Disability, Columbus, OH.

Warick, R., Clark, C., Dancer, J. & Sinclair, S. (1997). Assistive listening devices. Report of the National Task Force on Quality of Services in the Postsecondary Education of Deaf and Hard of Hearing Students. Rochester, NY: Northeast Technical Assistance Center, Rochester Institute of Technology.

Woodcock, K. (1997). Ergonomics of applications of automatic speech recognition for deaf and hard of hearing people. In R. Stuckless (Ed.), Frank W. Lovejoy symposium on applications of automatic speech recognition with deaf and hard of hearing people. Rochester, NY: Rochester Institute of Technology.