Language Resources and CALL Applications

Helmer Strik (1), Jozef Colpaert (2), Joost van Doremalen (1), and Catia Cucchiarini (1)

(1) Centre for Language and Speech Technology (CLST), Dept. of Linguistics, Radboud University Nijmegen, The Netherlands
(2) Linguapolis, University of Antwerp, Antwerp, Belgium

Radboud University Nijmegen, LREC 2010, Malta, 22-05-2010

Language Resources and CALL

The current presentation: the relation between language resources and CALL systems

CALL: Computer Assisted Language Learning

We focus here on the DISCO project: Development and Integration of Speech technology into COurseware for language learning

Overview

- A short introduction to DISCO
- Resources used to develop a CALL system
- Resources obtained during development of a CALL system
- Resources obtained using a CALL system
- Conclusions

Dr. Spraak

(Dr. Speech)

A short introduction to DISCO

DISCO project: develop a prototype of a CALL system that can give feedback on spoken utterances

Levels:
- pronunciation (of sounds)
- grammar (syntax & morphology)

Syntax exercise

Morphology exercise

Pronunciation exercise – with feedback

Menu: conversation environment report, learner is listening to own speech in complete conversation

Menu: conversation environment report, learner is reviewing pronunciation mistakes by listening to own speech

Menu: remediation environment, overall scores for phonemes, learner can start remediation by clicking on a phoneme

Menu: remediation environment, pronunciation exercise

Menu: remediation environment, learner is reviewing progress

Characters in DISCO

ASR-based CALL

ASR: Automatic Speech Recognition

standard ASR: from (native) speech to words

ASR: Automatic Speech Recognition

[Diagram: the speech signal (input) enters the decoder, which uses acoustic models, a lexicon, and a language model to produce the words W1 W2 W3 W4 (output).]
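
The decoder in this diagram is commonly described with the standard statistical formulation of speech recognition. This is the textbook formulation, not something stated on the slides, and the symbols X (the acoustic observations), W (a candidate word sequence), and \hat{W} (the recognized word sequence) are notation introduced here:

```latex
\hat{W} = \operatorname*{arg\,max}_{W} P(W \mid X)
        = \operatorname*{arg\,max}_{W} P(X \mid W)\, P(W)
```

In this reading, P(X | W) is supplied by the acoustic models together with the lexicon, which maps words to sound sequences, and P(W) is supplied by the language model.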

ASR-based CALL

ASR for CALL, 2 phases:
1. content: what has been said; tolerant; recognize words despite non-native variation
2. form: how it has been said; strict; error detection, find deviations from native …
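
As a rough illustration of the two phases, the sketch below works on text strings rather than audio, so it is only an analogy for what an ASR-based system would do. Phase 1 tolerantly maps a noisy recognition result to the closest utterance in a list of predicted responses; phase 2 strictly compares the selected utterance with the target and reports every deviation. The function names, the use of Python's difflib, and the example data are assumptions made for this illustration, not the DISCO implementation.

```python
from difflib import SequenceMatcher

# Hypothetical exercise content: one target and a few predicted learner responses.
TARGET = "ik loop naar huis"
PREDICTED = [
    "ik loop naar huis",   # correct
    "ik lopen naar huis",  # predicted morphology error
    "ik naar huis lopen",  # predicted syntax error
]


def similarity(a, b):
    """Word-level similarity in [0, 1] between two utterances."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def phase1_content(recognized, candidates=PREDICTED):
    """Tolerant step: map a noisy input to the closest predicted utterance."""
    return max(candidates, key=lambda c: similarity(recognized, c))


def phase2_form(utterance, target=TARGET):
    """Strict step: list every word-level deviation from the target."""
    t, u = target.split(), utterance.split()
    opcodes = SequenceMatcher(None, t, u).get_opcodes()
    return [(tag, t[i1:i2], u[j1:j2])
            for tag, i1, i2, j1, j2 in opcodes if tag != "equal"]
```

For example, `phase1_content("ik lopen naa huis")` selects `"ik lopen naar huis"` despite the garbled third word, and `phase2_form` applied to that result reports that `loop` was replaced by `lopen`.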

Resources used to develop a CALL system (1)

More general, native resources:
- ASR toolkit, e.g. SPRAAK [from Stevin]
- Corpus with native speech, e.g. Spoken Dutch Corpus (CGN) [from TST-Centrale]
- Native lexicon, e.g. e-Lex [from TST-Centrale]

Resources used to develop a CALL system (2)

More specific, non-native resources (often not available) to develop / improve the 2 phases:

Phases 1 + 2. Corpora with non-native speech: JASMIN [from Stevin]; CITO, Triest, Dutch-CAPT

Phase 1. Word recognition, content
Resources and information to model non-native 'behavior', in order to improve:
- Acoustic Models: mainly by training on non-native audio (from speech corpora)
- Lexicon & Language Model: data-driven, from non-native audio, or knowledge-based, from the literature etc.
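
One way to picture the knowledge-based route for the lexicon is to add pronunciation variants generated by phoneme substitution rules, so that the recognizer can still match a word when a learner replaces one sound with another. The sketch below is a minimal illustration under that assumption; the phoneme notation, the two rules, and the function names are all hypothetical and are not taken from DISCO or from any existing lexicon.

```python
from itertools import product

# Hypothetical native lexicon: word -> list of phoneme symbols (made-up notation).
NATIVE_LEXICON = {
    "huis": ["h", "9y", "s"],
    "loop": ["l", "o:", "p"],
}

# Hypothetical rules: native phoneme -> sounds a learner might produce instead.
SUBSTITUTIONS = {
    "h": ["x"],
    "o:": ["O"],
}


def variants(phonemes, rules=SUBSTITUTIONS):
    """Return the native pronunciation followed by all rule-generated variants."""
    options = [[p] + rules.get(p, []) for p in phonemes]
    return [list(combo) for combo in product(*options)]


def expand_lexicon(lexicon, rules=SUBSTITUTIONS):
    """Build a lexicon that lists every pronunciation variant for each word."""
    return {word: variants(phones, rules) for word, phones in lexicon.items()}
```

Calling `expand_lexicon(NATIVE_LEXICON)` keeps the native pronunciation as the first entry for each word and appends the variants. In a data-driven setup the rules would instead be derived from annotated non-native audio.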

Resources used to develop a CALL system (3)

More specific, non-native resources (often not available) to develop / improve the 2 phases:

Phase 2. Error detection (classifiers), strict

A. Decide which errors to address
- criteria + selection => inventory
- data-driven and/or knowledge-based

B. Develop classifiers, train and test; data-driven

A & B. data-driven => resources needed: annotations for audio

Levels:
o Pronunciation: sounds [& prosody, not in DISCO]
o Grammar: syntax & morphology

Resources obtained during development

- Blueprint of the design
- Content:
  o specifications for exercises and feedback strategies
  o a list of predicted correct and incorrect utterances
- Modules for the 2 phases: 1. word recognition, 2. error detection
- The CALL system itself: the whole system, a prototype with content

Resources obtained using a CALL system

- Audio recordings
- Log files: user + system 'behavior'
- Videos
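
To make log files reusable as a language resource, each user or system action could be stored as one structured record that points to the matching audio recording. The slides do not describe the actual DISCO log format, so the sketch below is only an assumption: every field name and example value is hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class LogEvent:
    """One logged event; every field name here is hypothetical."""
    session_id: str
    actor: str      # "user" or "system"
    action: str     # e.g. "recording_submitted", "feedback_shown"
    exercise: str   # e.g. "pronunciation", "morphology", "syntax"
    audio_file: Optional[str] = None  # stored recording, if any
    details: dict = field(default_factory=dict)


def to_log_line(event):
    """Serialize an event as one JSON line that can be appended to a log file."""
    return json.dumps(asdict(event), sort_keys=True)
```

Writing one JSON object per line keeps the log easy to append to while the system runs and easy to filter afterwards, for instance to collect all recordings from syntax exercises.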

Conclusions

Language Resources play an important role in relation to CALL systems.

Language Resources
- are needed to develop a CALL system
- can be obtained during development of a CALL system
- can be obtained using a CALL system

Language Resources obtained give rise to new opportunities:
- research
- system development

Website DISCO: lands.let.ru.nl/~strik/research/DISCO/

Stevin project DISCO

Training speaking proficiency: pronunciation, morphology, syntax

Correct
- Example: Ik loop naar huis ("I walk home")

Errors
- Pronunciation: Ik lop nar guis
- Morphology: Ik lopen naar huis
- Syntax: Ik naar huis lopen

Detecting errors automatically by means of speech technology

[Diagram: system architecture with the components ASR, Display Logic, Prompt Generator, Words, Segmentation, Error Detection, Grading, and Feedback Generation.]