Correcting Misuse of Verb Forms

John Lee, Stephanie Seneff

Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge

ACL 2008

Outline

Introduction Background System Baselines Data Evaluation Conclusions

Introduction

Background

The goal is to correct confusions among the five verb forms (base, third person singular, past, -ing participle, -ed participle), as well as the infinitive, whether they are caused by semantic or syntactic errors.

Semantic Errors

Suppose one wants to say “I am prepared for the exam”, but writes “I am preparing for the exam”.

Syntactic Errors

Subject-Verb Agreement: He *have been living there since June.

Auxiliary Agreement: He has been *live there since June.

Complementation: He wants *live there.

System

Step 1: Automatic Parsing

“My father is *work in the laboratory.”

Step 2: Replacing the verb forms
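The replacement step can be sketched roughly as follows. This is only an illustration under stated assumptions: the `VERB_FORMS` table and the `candidate_sentences` function are hypothetical names that do not come from the slides, and the sketch works on a flat token list, whereas the system described here works on parse trees.

```python
# Hypothetical lookup table: each lemma maps to its inflected forms.
# For regular verbs the past and -ed participle coincide, so only four
# distinct strings are listed per verb.
VERB_FORMS = {
    "work": ["work", "works", "worked", "working"],
    "live": ["live", "lives", "lived", "living"],
}


def candidate_sentences(tokens, index):
    """Return copies of `tokens` with the verb at `index` swapped for each other form."""
    word = tokens[index]
    for forms in VERB_FORMS.values():
        if word in forms:
            return [
                tokens[:index] + [form] + tokens[index + 1:]
                for form in forms
                if form != word
            ]
    return []  # verb not in the table: no candidates are generated


tokens = "My father is work in the laboratory .".split()
for candidate in candidate_sentences(tokens, 3):
    print(" ".join(candidate))
```

Each printed sentence is one hypothesis; choosing among them is left to a later step.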

Step 3: N-gram counts as a filter

Using the Web 1T N-gram corpus, prepared by Google Inc.
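One plausible way to use n-gram counts as a filter is sketched below. The slides do not spell out the exact rule, so the rule here (keep a candidate only if its trigram is more frequent than the original's) is an assumption, and the counts in `NGRAM_COUNTS` are invented for illustration rather than taken from the Web 1T corpus. The names `trigram_at` and `filter_by_ngram` are likewise hypothetical.

```python
# Invented trigram counts for illustration; real counts would be looked up
# in the Web 1T N-gram corpus.
NGRAM_COUNTS = {
    ("is", "working", "in"): 120000,
    ("is", "work", "in"): 900,
    ("is", "worked", "in"): 1500,
}


def trigram_at(tokens, index):
    """Trigram centred on the token at `index` (assumes 1 <= index < len(tokens) - 1)."""
    return tuple(tokens[index - 1:index + 2])


def filter_by_ngram(original, candidates, index, counts=NGRAM_COUNTS):
    """Keep candidates whose trigram outnumbers the original's, most frequent first."""
    baseline = counts.get(trigram_at(original, index), 0)

    def score(candidate):
        return counts.get(trigram_at(candidate, index), 0)

    kept = [c for c in candidates if score(c) > baseline]
    return sorted(kept, key=score, reverse=True)


original = "My father is work in the laboratory .".split()
candidates = [original[:3] + [form] + original[4:] for form in ("works", "worked", "working")]
best = filter_by_ngram(original, candidates, 3)
print(" ".join(best[0]))
```

Unseen trigrams are treated as having a count of zero, so a candidate with no evidence in the table is never preferred over the original.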

Baselines

Majority baseline: no correction.

Verb-only baseline (used only for Auxiliary Agreement & Complementation): it attempts corrections only when the word in question is actually tagged as a verb.

Data

Development Data: AQUAINT Corpus (English news text)

Evaluation Data: JLE (Japanese Learners of English corpus)

Covers 167 of the transcribed interviews, totalling 15,637 sentences.

Test Set: 477 sentences (3.1%) contain subject-verb agreement errors, and 238 (1.5%) contain auxiliary agreement and complementation errors.

Evaluation Data: HKUST (Hong Kong University of Science and Technology)

It contains a total of 2,556 sentences.

Evaluation Metric

Accuracy: (true neg + true pos) / total number of sentences

Recall: true pos / (true pos + false neg + inv pos)

Detection Precision: (true pos + inv pos) / (true pos + inv pos + false pos)

Correction Precision: true pos / (true pos + false pos + inv pos)
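The four metrics can be computed directly from the five outcome counts, as in the sketch below. Two assumptions are made: `inv pos` is read as a sentence where an error was detected but the proposed correction was invalid (the slides do not expand the abbreviation), and every sentence falls into exactly one of the five categories, so the total is their sum. The function name `evaluate` and the example counts are made up for illustration.

```python
def evaluate(true_pos, false_pos, false_neg, inv_pos, true_neg):
    """Compute the four metrics from sentence-level outcome counts."""
    # Assumption: the five categories partition the test set.
    total = true_pos + false_pos + false_neg + inv_pos + true_neg
    return {
        "accuracy": (true_neg + true_pos) / total,
        "recall": true_pos / (true_pos + false_neg + inv_pos),
        "detection_precision": (true_pos + inv_pos) / (true_pos + inv_pos + false_pos),
        "correction_precision": true_pos / (true_pos + false_pos + inv_pos),
    }


# Made-up counts, not results from the talk.
scores = evaluate(true_pos=40, false_pos=5, false_neg=50, inv_pos=10, true_neg=895)
for name, value in scores.items():
    print(f"{name}: {value:.2%}")
```

Because the two precision formulas share a denominator and detection adds `inv pos` to the numerator, detection precision can never be lower than correction precision.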

Evaluation

JLE: Results for Subject-Verb Agreement

Corpus  Method    Accuracy  Precision (correction)  Precision (detection)  Recall
JLE     all       98.93%    81.61%                  83.93%                 80.92%
JLE     majority  96.95%    -                       -                      -

Results for Auxiliary Agreement & Complementation

Corpus  Method     Accuracy  Precision (correction)  Precision (detection)  Recall
JLE     all        98.94%    68.00%                  80.67%                 42.86%
JLE     verb-only  98.85%    71.43%                  84.75%                 31.51%
JLE     majority   98.47%    -                       -                      -
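The majority-baseline accuracies can be sanity-checked from the JLE test-set counts given under Data. A system that never corrects has no true positives, so under the accuracy definition its score is simply the share of sentences without the error in question. The sketch below is that arithmetic only; the function name `majority_accuracy` is hypothetical.

```python
TOTAL_SENTENCES = 15637  # JLE sentences used for evaluation


def majority_accuracy(sentences_with_errors, total=TOTAL_SENTENCES):
    """Accuracy of a system that never corrects: the fraction of error-free sentences."""
    return (total - sentences_with_errors) / total


# 477 sentences with subject-verb agreement errors,
# 238 with auxiliary agreement and complementation errors.
print(f"{majority_accuracy(477):.2%}")
print(f"{majority_accuracy(238):.2%}")
```

The first value matches the reported 96.95%. The second lands within about 0.01 percentage points of the reported 98.47%, a gap that may come down to rounding or a slightly different sentence count.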

HKUST: Results for Auxiliary Agreement & Complementation

Two native speakers of English were given the edited sentences, as well as the original input. For each pair, they were asked to select one of four statements: one of the two is better, or both are equally correct, or both are equally incorrect.

Corpus  Method  Accuracy       Precision (correction)  Precision (detection)  Recall
HKUST   all     not available  71.71%                  not available          not available

Kappa: 0.76
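The slides do not say which kappa statistic was used. Assuming it is Cohen's kappa for two annotators, agreement could be computed as in the sketch below. The label names and the six example judgements are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    # Undefined (division by zero) if both annotators always use one identical label.
    return (observed - expected) / (1 - expected)


# Invented judgements over six sentence pairs, using four hypothetical labels.
annotator_1 = ["edited", "edited", "original", "both_ok", "both_bad", "edited"]
annotator_2 = ["edited", "edited", "original", "both_ok", "edited", "edited"]
print(round(cohen_kappa(annotator_1, annotator_2), 2))
```

Kappa discounts the agreement that two annotators would reach by chance, which is why it is reported rather than the raw percentage of matching judgements.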

Conclusions

This paper proposes a method to correct English verb form errors made by non-native speakers.

It investigates the ways in which verb form errors affect parse trees.
