
Modeling Prompt Adherence in Student Essays
Isaac Persing and Vincent Ng
Human Language Technology Research Institute
University of Texas at Dallas

Automated Essay Scoring
• Important educational application of NLP
• Related research on essay scoring:
  – Grammatical errors
  – Coherence
  – Thesis clarity
  – Organization
  – Prompt adherence

What is Prompt Adherence?
• Refers to how related an essay's content is to the prompt for which it was written
  – An essay with a high prompt adherence score consistently remains on the topic introduced by the prompt and is free of irrelevant digressions

Goal
• Develop a model for scoring the prompt adherence of student essays

Related Work on Prompt Adherence Scoring
• Off-topic sentence detection (Higgins et al., 2004)
• Off-topic essay detection (Higgins et al., 2006)
• Off-topic essay detection with short prompts (Louis and Higgins, 2010)

These approaches are:
• Binary decisions (off-topic/on-topic)
• Knowledge-lean
  – Features derived from semantic similarity measures
    • Random indexing (RI) and Content Vector Analysis (CVA)

What are the differences between our work and previous work?
• Task: supervised prompt adherence scoring
  – Score can range from 1 to 4 points
• Approach: feature-rich

Plan for the Talk
• Corpus and Annotations
• Approach for scoring prompt adherence
• Evaluation

Selecting a Corpus
• International Corpus of Learner English (ICLE)
  – 4.5 million words in more than 6,000 essays
  – Written by university undergraduates who are learners of English as a foreign language
  – Mostly (91%) argumentative writing topics
• Essays selected for annotation:
  – 830 argumentative essays from 13 prompts
  – Annotate each essay with its prompt adherence score

Prompt Adherence Scoring Rubric
• 4 – essay fully addresses the prompt and consistently stays on topic
• 3 – essay mostly addresses the prompt or occasionally wanders off topic
• 2 – essay does not fully address the prompt or consistently wanders off topic
• 1 – essay does not address the prompt at all or is completely off topic
• Half-point increments (i.e., 1.5, 2.5, 3.5) allowed

Inter-Annotator Agreement
• 707 of the 830 essays scored by both annotators
• Perfect agreement on 38% of essays
• Scores within 0.5 points on 66% of essays
• Scores within 1.0 point on 89% of essays
• Whenever the annotators disagree, use the average score, rounded to the nearest half point (a small sketch follows)
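A minimal sketch of this merging rule; rounding ties (averages ending in .25 or .75) upward is our assumption, since the slide does not specify a tie-breaking direction:

```python
import math

def merge_scores(a: float, b: float) -> float:
    # Average the two annotators' scores, then snap to the nearest
    # half point.  Rounding ties upward is our assumption.
    return math.floor((a + b) / 2.0 * 2.0 + 0.5) / 2.0

print(merge_scores(3.0, 4.0))  # 3.5
print(merge_scores(2.0, 2.5))  # average 2.25 -> 2.5 under round-half-up
```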

Plan for the Talk
• Corpus and Annotations ✓
• Approach for scoring prompt adherence
• Evaluation

Approach for Scoring Prompt Adherence
• Recast the problem as a linear regression task
• Train one regressor per prompt
  – Common problems students have writing essays for one prompt may not apply to essays written for another
• One training instance per training essay
  – "Class" value: prompt adherence score
  – Learner: LIBSVM
  – Features: 7 feature types
(A training sketch follows.)
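A minimal sketch of the per-prompt regressor setup, assuming essays arrive as (prompt id, feature dict, gold score) triples. scikit-learn's SVR, which wraps LIBSVM, stands in for direct LIBSVM calls, and the linear kernel is our assumption:

```python
from collections import defaultdict
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVR  # scikit-learn's SVR wraps LIBSVM

def train_per_prompt(essays):
    """essays: iterable of (prompt_id, feature_dict, adherence_score).
    Returns one fitted (vectorizer, regressor) pair per prompt."""
    by_prompt = defaultdict(list)
    for prompt_id, feats, score in essays:
        by_prompt[prompt_id].append((feats, score))
    models = {}
    for prompt_id, data in by_prompt.items():
        vec = DictVectorizer()
        X = vec.fit_transform(f for f, _ in data)   # one instance per essay
        y = [s for _, s in data]                    # "class" value = score
        models[prompt_id] = (vec, SVR(kernel="linear").fit(X, y))
    return models
```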

Features
• 7 types of features:
  – baseline features
  – 6 types of new features

Baseline Features
• Features based on Random Indexing (RI)
  – Adapted from Higgins et al. (2004)
• Random indexing:
  – "An efficient and scalable alternative to LSI" (Sahlgren, 2005)
  – Generates a semantic similarity measure between any two words
  – Generalized to computing similarity between two groups of words (Higgins & Burstein, 2007)

Why Random Indexing (RI)?
• May help find text in an essay related to the prompt even if some of its words have been rephrased
  – E.g., the essay talks about "jail" while the prompt has "prison"
• Train an RI model on the English Gigaword corpus (a sketch follows)
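A toy sketch of random indexing, assuming the classic formulation: each word gets a sparse random index vector, a word's context vector accumulates the index vectors of its window neighbours, and two word groups are compared via the cosine of their summed vectors (in the style of Higgins & Burstein, 2007). The dimensionality, sparsity, and window size here are illustrative, not the talk's settings:

```python
import numpy as np

DIM, NONZERO, WINDOW = 2000, 8, 2
rng = np.random.default_rng(0)

def index_vector():
    # Sparse ternary index vector: a few random +/-1 entries.
    v = np.zeros(DIM)
    pos = rng.choice(DIM, NONZERO, replace=False)
    v[pos] = rng.choice([-1.0, 1.0], NONZERO)
    return v

def train_ri(sentences):
    # Context vector of w = sum of index vectors of its window neighbours.
    index, context = {}, {}
    for sent in sentences:
        for w in sent:
            index.setdefault(w, index_vector())
        for i, w in enumerate(sent):
            ctx = context.setdefault(w, np.zeros(DIM))
            for j in range(max(0, i - WINDOW), min(len(sent), i + WINDOW + 1)):
                if j != i:
                    ctx += index[sent[j]]
    return context

def group_vec(words, context):
    v = np.zeros(DIM)
    for w in words:
        v += context.get(w, 0.0)  # unseen words contribute nothing
    return v

def group_similarity(words_a, words_b, context):
    # Group-of-words similarity: cosine between summed context vectors.
    a, b = group_vec(words_a, context), group_vec(words_b, context)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0
```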

5 Random Indexing Features
• The entire essay's similarity to the prompt
• The highest similarity of an individual essay sentence to the prompt
• The highest entire-essay similarity to one of the prompt sentences
• The highest individual-sentence similarity to an individual prompt sentence
• The essay's similarity to a manually rewritten version of the prompt that excludes extraneous material
(These are computed below with the RI sketch above.)
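The five features can be computed directly with the group_similarity helper from the previous sketch; sentence tokenization and the manually rewritten prompt are assumed to be given:

```python
def ri_features(essay_sents, prompt_sents, rewritten_prompt, context):
    """essay_sents / prompt_sents: lists of token lists;
    rewritten_prompt: token list.  Returns the five RI features."""
    essay = [w for s in essay_sents for w in s]
    prompt = [w for s in prompt_sents for w in s]
    sim = lambda a, b: group_similarity(a, b, context)
    return {
        "essay_vs_prompt": sim(essay, prompt),
        "best_sentence_vs_prompt": max(sim(s, prompt) for s in essay_sents),
        "essay_vs_best_prompt_sentence": max(sim(essay, p) for p in prompt_sents),
        "best_sentence_vs_best_prompt_sentence":
            max(sim(s, p) for s in essay_sents for p in prompt_sents),
        "essay_vs_rewritten_prompt": sim(essay, rewritten_prompt),
    }
```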


1. Thesis Clarity Keyword Features
• Thesis clarity refers to how clearly an author explains the thesis of her essay
• Introduced in Persing & Ng (2013) for scoring the thesis clarity of an essay
• Generated based on thesis clarity keywords

What are Thesis Clarity Keywords?
• "Important" words in a prompt
  – Important word: a good word for a student to use when stating her thesis about the prompt

How to Identify Keywords?
• By hand

Hand-Selecting Keywords
• Hand-segment each multi-part prompt into parts
• For each part, hand-pick the most important (primary) and second most important (secondary) words that it would be good for a writer to use to address the part

Example prompt: "The prison system is outdated. No civilized society should punish its criminals; it should rehabilitate them."
  – Primary: rehabilitate
  – Secondary: society

Designing Keyword Features
• Example: for one feature, we
  1. compute the random indexing similarity between the essay and each group of primary keywords taken from the parts of the essay's prompt
  2. assign the feature the lowest of these values
• A low feature value suggests that the student ignored the prompt component from which the value came (see the sketch below)
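A minimal sketch of this min-over-parts feature, reusing group_similarity from the RI sketch; primary_groups is a hypothetical list of hand-picked keyword sets, one per prompt part:

```python
def primary_keyword_feature(essay_tokens, primary_groups, context):
    # The feature is the *lowest* essay-to-keyword-group RI similarity:
    # a low value hints that some part of the prompt was ignored.
    return min(group_similarity(essay_tokens, group, context)
               for group in primary_groups)
```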

Thesis Clarity Keyword Features
• Though these features were designed for scoring thesis clarity, some of them are useful for prompt adherence scoring

2. Prompt Adherence Keyword Features
• Motivation: rather than relying on keywords for thesis clarity, why not hand-pick keywords for prompt adherence and create features from them?

An Illustrative Example
Prompt: "Marx once said that religion was the opium of the masses. If he was alive at the end of the 20th century, he would replace religion with television."
• This question suggests that students discuss whether television is analogous to religion in this way
  – The prompt adherence keywords contain "religion"
  – The thesis clarity keywords do not contain "religion"
  – A thesis like "Television is bad" can be stated clearly without reference to "religion"
• An essay with this thesis could have a high thesis clarity score
  – But a low prompt adherence score: religion should be discussed

Creating Features from Keywords
• Two types of features; for each prompt component:
  1. take the RI similarity between the whole essay and the component's keywords
  2. compute the fraction of the component's keywords that appear in the essay
(A sketch follows.)
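A sketch of both per-component feature types, again reusing group_similarity; components is a hypothetical mapping from a component id to its hand-picked prompt adherence keywords:

```python
def pa_keyword_features(essay_tokens, components, context):
    essay_vocab = set(essay_tokens)
    feats = {}
    for cid, keywords in components.items():
        # 1. RI similarity between the whole essay and the keywords
        feats[f"ri_sim_{cid}"] = group_similarity(essay_tokens, keywords, context)
        # 2. fraction of the component's keywords appearing in the essay
        feats[f"kw_frac_{cid}"] = len(essay_vocab & set(keywords)) / len(keywords)
    return feats
```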

3. LDA Topics
• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt

Prompt: "All armies should consist entirely of professional soldiers; there is no value in a system of military service."
• An essay containing words like "peace", "patriotism", or "training" is probably not digressing and should not be penalized for discussing these topics
  – But the various measures of keyword similarity might not notice that anything related to the prompt is discussed
  – This might have effects like lowering the RI similarity scores

How to Create LDA Features
For each prompt:
1. Collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. Build an LDA model of 1000 topics
   – A soft clustering of the words into 1000 sets
   – E.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"
   – The model can tell us how much of an essay is spent on each topic, e.g.: Topic 1: 25%, Topic 2: 45%, Topic 3: 15%, …, Topic 1000: 5%
3. Construct 1000 features, one for each topic
   – The feature value encodes how much of the essay was spent discussing the topic
   – E.g., an essay written for the military prompt might spend 55% of its content on the topic {"fully", "count", "ordinary", "czech", "day"} and 45% on the topic {"man", "military", "service", "pay", "war"}
(A sketch follows.)
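A sketch of these features, assuming gensim (the talk does not name an LDA toolkit); each essay's feature vector is its per-topic proportion under the prompt-specific model:

```python
from gensim import corpora
from gensim.models import LdaModel

def lda_features(tokenized_essays, num_topics=1000):
    """tokenized_essays: all ICLE essays for one prompt, as token lists.
    Returns the trained model and one num_topics-long vector per essay."""
    dictionary = corpora.Dictionary(tokenized_essays)
    bows = [dictionary.doc2bow(essay) for essay in tokenized_essays]
    lda = LdaModel(bows, id2word=dictionary, num_topics=num_topics)
    feature_vectors = []
    for bow in bows:
        vec = [0.0] * num_topics
        for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
            vec[topic_id] = prob  # share of the essay spent on this topic
        feature_vectors.append(vec)
    return lda, feature_vectors
```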

4. Manually Annotated LDA Topics
• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression
  – An infrequent topic may not appear often enough in the training set for the regressor to make this judgment
• So, create manually annotated LDA features

How to Create Manually Annotated LDA Features
1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
   – Higher score = more adherent
3. For each essay, create 10 features from the labeled topics

10 Features from the Labeled Topics
• Five features encode the sum of contributions to an essay of the topics annotated with the number 0, 1, …, 4, respectively
• Five features encode the sum of contributions to an essay of the topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussion (see the sketch below)
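A sketch of the ten features, assuming topic_shares comes from the LDA feature sketch above and topic_labels holds the hand-annotated 0–5 ratings:

```python
def annotated_lda_features(topic_shares, topic_labels):
    """topic_shares: {topic_id: share of the essay on that topic};
    topic_labels: {topic_id: hand-annotated 0-5 adherence rating}."""
    def mass(keep):
        return sum(share for t, share in topic_shares.items()
                   if keep(topic_labels[t]))
    feats = {}
    for k in range(5):       # exact labels 0, 1, ..., 4
        feats[f"label_eq_{k}"] = mass(lambda lab, k=k: lab == k)
    for k in range(1, 6):    # labels >= 1, >= 2, ..., >= 5
        feats[f"label_ge_{k}"] = mass(lambda lab, k=k: lab >= k)
    return feats
```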

5. Predicted Thesis Clarity Errors
• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

5 Types of Thesis Clarity Errors
• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

Features Based on Error Types
• Introduce features for prompt adherence scoring that encode the error types an essay contains
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• So, predict which of the 5 error types an essay contains
  – Recast as a multi-label classification task

Creating Predicted "Error Type" Features
• Add a binary feature indicating the presence or absence of each error type (a sketch follows)
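A sketch of the multi-label error predictor and the resulting binary features; the talk does not name the learner, so a one-vs-rest logistic regression stands in, and the feature names are ours:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

ERROR_TYPES = ["ConfusingPhrasing", "MissingDetails", "WriterPosition",
               "IncompletePromptResponse", "RelevanceToPrompt"]

def error_type_features(X_train, Y_train, X_test):
    """Y_train: n_train x 5 binary matrix of gold error annotations.
    Returns a dict of 5 binary features per test essay."""
    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    clf.fit(X_train, Y_train)
    predictions = clf.predict(X_test)  # n_test x 5 binary matrix
    return [{f"err_{name}": int(flag)
             for name, flag in zip(ERROR_TYPES, row)}
            for row in predictions]
```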

6. N-gram Features
• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – Selected according to information gain (a sketch follows)
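A sketch of the n-gram selection, assuming scikit-learn; mutual_info_classif is an information-gain-style criterion over the discrete score labels and may differ in detail from the talk's exact computation:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def top_ngram_features(lemmatized_essays, scores, k=10_000):
    """lemmatized_essays: strings of lemmatized tokens;
    scores: adherence scores, doubled into integer class labels."""
    y = [int(round(2 * s)) for s in scores]  # 1..4 in half steps -> 2..8
    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)
    selector = SelectKBest(mutual_info_classif, k=k).fit(X, y)
    return vectorizer, selector, selector.transform(X)
```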

Summary of Features
• Baseline features
  – Random indexing features
• Six types of new features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

Plan for the Talk
• Corpus and Annotations ✓
• Model for scoring prompt adherence ✓
• Evaluation

Evaluation
• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

Scoring Metrics
• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei:
  – S1 (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from far misses
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between the Ai and the Ei
• S1, S2, S3 are error metrics: a smaller value is better
• PC is a correlation coefficient: a larger value is better
(The formulas are reconstructed below.)
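The formula images on these slides did not survive extraction; the following LaTeX reconstruction is consistent with the definitions above, where N is the number of essays and, for S1, Ê_i is the estimate snapped to the nearest of the seven legal scores (that rounding step is our reading of the definition):

```latex
S_1 = \frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\bigl[A_i \neq \hat{E}_i\bigr]
\qquad
S_2 = \frac{1}{N}\sum_{i=1}^{N}\bigl|A_i - E_i\bigr|
\qquad
S_3 = \frac{1}{N}\sum_{i=1}^{N}\bigl(A_i - E_i\bigr)^{2}
\qquad
PC = \frac{\operatorname{cov}(A,E)}{\sigma_A\,\sigma_E}
```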

Regressor Training
• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (a sketch follows)
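A minimal sketch of the tuning loop for one metric; the grid of C values is an assumption, and PC would be negated so that "smaller is better" holds uniformly:

```python
import numpy as np
from sklearn.svm import SVR

def s2_metric(gold, pred):
    # Average absolute error (the S2 metric above).
    return float(np.mean(np.abs(np.asarray(gold) - np.asarray(pred))))

def tune_regularization(X_tr, y_tr, X_dev, y_dev, metric=s2_metric):
    """Return the regressor whose C minimizes `metric` on dev data."""
    best_score, best_reg = None, None
    for C in (10.0 ** k for k in range(-3, 4)):  # assumed grid
        reg = SVR(kernel="linear", C=C).fit(X_tr, y_tr)
        score = metric(y_dev, reg.predict(X_dev))
        if best_score is None or score < best_score:
            best_score, best_reg = score, reg
    return best_reg
```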

Results

System       S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(Baseline = the RI features; S1 = probability of error, S2 = average absolute error, S3 = average squared error, PC = Pearson's correlation coefficient)
• Our system improves on the baseline w.r.t. all four scoring metrics
• The differences are significant

Feature Ablation
• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features

Feature Ablation Results
[Figure: for each scoring metric, the seven feature types ordered by how much their removal hurts performance. As best recoverable from the slide layout, each row runs from least to most important:
  S1: PA keywords, annotated LDA topics, N-grams, predicted TC errors, RI, unlabeled LDA topics, TC keywords
  S2: predicted TC errors, unlabeled LDA topics, PA keywords, RI, TC keywords, annotated LDA topics, N-grams
  S3: TC keywords, predicted TC errors, PA keywords, unlabeled LDA topics, RI, annotated LDA topics, N-grams
  PC: PA keywords, unlabeled LDA topics, predicted TC errors, RI, N-grams, TC keywords, annotated LDA topics]
• The relative importance of the features does not always remain consistent if we measure performance in different ways
• But there are features that tend to be more important than the others in the presence of the other features:
  – Most important: TC keywords, n-grams, annotated LDA topics
  – Middling importance: RI, unlabeled LDA topics
  – Least important: predicted TC errors, PA keywords

Summary
• Examined the problem of prompt adherence scoring in student essays
  – Feature-rich approach
• Released the annotations:
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 2: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

2

3

Automated Essay Scoring

bull Important educational application of NLP

bull Related research on essay scoring

ndash Grammatical errors

ndash Coherence

ndash Thesis clarity

ndash Organization

ndash Prompt adherence

3

4

What is Prompt Adherence

bull refers to how related an essayrsquos content is to

the prompt for which it was written

4

5

What is Prompt Adherence

bull refers to how related an essayrsquos content is to

the prompt for which it was written

ndash An essay with a high prompt adherence score

consistently remains on topic introduced by the

prompt and is free of irrelevant digressions

5

6

Goal

bull Develop a model for scoring the prompt

adherence of student essays

6

7

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

8

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

9

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

Knowledge-lean

ndash Features derived from semantic similarity measures

bull Random indexing (RI) and Content Vector Analysis (CVA)

10

What are the differences between

our work and previous work

10

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

Results

System        S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(Baseline: RI features. S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient.)

• Improvements wrt all four scoring metrics

• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance wrt each scoring metric

– Train a regressor on all but one type of features (a sketch of this loop follows below)

Feature Ablation Results

• Features ordered from least important (left) to most important (right) for each scoring metric (TC: thesis clarity; PA: prompt adherence; RI: random indexing):

S1: PA keywords | Annotated LDA topics | N-grams | Predicted TC errors | RI | Unlabeled LDA topics | TC keywords

S2: Predicted TC errors | Unlabeled LDA topics | PA keywords | RI | TC keywords | Annotated LDA topics | N-grams

S3: TC keywords | Predicted TC errors | PA keywords | Unlabeled LDA topics | RI | Annotated LDA topics | N-grams

PC: PA keywords | Unlabeled LDA topics | Predicted TC errors | RI | N-grams | TC keywords | Annotated LDA topics

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But there are features that tend to be more important than the others in the presence of other features:

– most important: TC keywords, n-grams, annotated LDA topics

– of middling importance: RI, unlabeled LDA topics

– least important: predicted TC errors, PA keywords
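Orderings like the ones above can be read off mechanically from the ablation loop's per-metric outputs; a small illustrative helper (assuming ablation_results from the sketch above):

    def importance_order(results, metric_index):
        # Order feature types from least to most important for one metric.
        # For the error metrics S1-S3, removing an important feature type makes
        # the dev-set error rise, so a larger ablated error means more important;
        # for PC the reverse holds (larger ablated correlation = less important).
        reverse = (metric_index == 3)
        return sorted(results, key=results.get, reverse=reverse)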

Summary

• Examined the problem of prompt adherence scoring in student essays

– feature-rich approach

• Released the annotations

– prompt adherence scores

– prompt adherence keywords

– manually annotated LDA topics

Page 3: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

3

Automated Essay Scoring

bull Important educational application of NLP

bull Related research on essay scoring

ndash Grammatical errors

ndash Coherence

ndash Thesis clarity

ndash Organization

ndash Prompt adherence

3

4

What is Prompt Adherence

bull refers to how related an essayrsquos content is to

the prompt for which it was written

4

5

What is Prompt Adherence

bull refers to how related an essayrsquos content is to

the prompt for which it was written

ndash An essay with a high prompt adherence score

consistently remains on topic introduced by the

prompt and is free of irrelevant digressions

5

6

Goal

bull Develop a model for scoring the prompt

adherence of student essays

6

7

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

8

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

9

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

Knowledge-lean

ndash Features derived from semantic similarity measures

bull Random indexing (RI) and Content Vector Analysis (CVA)

10

What are the differences between

our work and previous work

10

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 4: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

4

What is Prompt Adherence

bull refers to how related an essayrsquos content is to

the prompt for which it was written

4

5

What is Prompt Adherence

bull refers to how related an essayrsquos content is to

the prompt for which it was written

ndash An essay with a high prompt adherence score

consistently remains on topic introduced by the

prompt and is free of irrelevant digressions

5

6

Goal

bull Develop a model for scoring the prompt

adherence of student essays

6

7

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

8

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

9

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

Knowledge-lean

ndash Features derived from semantic similarity measures

bull Random indexing (RI) and Content Vector Analysis (CVA)

10

What are the differences between

our work and previous work

10

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression
  – an infrequent topic may not appear enough in the training set for the regressor to make this judgment

• Create manually annotated LDA features

67

How to create Manually Annotated LDA Features?

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
   • higher score = more adherent
3. For each essay, create 10 features from the labeled topics

71

10 Features from the Labeled Topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, respectively
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions

73
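A small sketch of how these ten aggregate features could be computed from an essay's 100-topic distribution and the hand-assigned 0–5 labels; the names are illustrative, not the authors' code.

```python
# Sketch: 10 aggregate features from hand-labeled LDA topics.
# topic_share[t] = fraction of the essay spent on topic t
# adherence[t]   = hand-assigned adherence label in {0, ..., 5} for topic t
def annotated_lda_features(topic_share, adherence):
    feats = []
    for label in range(5):       # contributions of topics labeled exactly 0..4
        feats.append(sum(s for t, s in enumerate(topic_share)
                         if adherence[t] == label))
    for thresh in range(1, 6):   # contributions of topics labeled >= 1 .. >= 5
        feats.append(sum(s for t, s in enumerate(topic_share)
                         if adherence[t] >= thresh))
    return feats  # 10 feature values
```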

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

76

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – though each essay was manually annotated with the errors it contains, in a realistic setting we won’t have access to these manual annotations

• Predict which of the 5 error types an essay contains
  – recast as a multi-label classification task

79

Creating Predicted “Error Type” Features

• Add a binary feature indicating the presence or absence of each error type

80
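One way to realize this step in code is a one-vs-rest sketch with scikit-learn, one binary classifier per error type; the choice of learner here is an assumption, not necessarily what was used in the original work.

```python
# Sketch: predicting the five thesis clarity error types as a
# multi-label task (scikit-learn assumed; learner choice illustrative).
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

def predicted_error_features(X_train, Y_train, X_test):
    # Y_train: binary indicator matrix with one column per error type
    clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)
    return clf.predict(X_test)  # one binary feature per error type
```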

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain

81
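A rough sketch of the n-gram feature selection, assuming scikit-learn: mutual information stands in for information gain, and mapping the half-point scores to discrete class labels is an assumption made so the selector applies.

```python
# Sketch: top-10K lemmatized 1-3 gram features chosen by an
# information-gain-style criterion (scikit-learn assumed).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def ngram_features(lemmatized_essays, scores, k=10000):
    vec = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vec.fit_transform(lemmatized_essays)
    labels = [int(round(2 * s)) for s in scores]  # 1, 1.5, ..., 4 -> 7 classes
    selector = SelectKBest(mutual_info_classif, k=min(k, X.shape[1]))
    return selector.fit_transform(X, labels), vec, selector
```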

Summary of Features

• Baseline features
  – Random indexing features

• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

82

Plan for the Talk

Corpus and Annotations
Model for scoring prompt adherence
• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

84

Scoring Metrics

• Define 4 evaluation metrics, where Ai is the annotated score and Ei the estimated score of essay i, over N test essays:

  S1 = (1/N) Σi 1[Ei ≠ Ai]  (probability of error): the probability that a system predicts the wrong score out of 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)

  S2 = (1/N) Σi |Ai − Ei|  (average absolute error): distinguishes near misses from far misses

  S3 = (1/N) Σi (Ai − Ei)²  (average squared error): prefers systems whose estimations are not too often far away from the correct scores

  PC: Pearson’s correlation coefficient between the Ai and the Ei

• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better

94
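The four metrics are straightforward to compute; a sketch follows (numpy assumed). Rounding the estimate to the nearest half point before scoring S1 is an assumption about how a “wrong score” is decided.

```python
# Sketch: the four scoring metrics over annotated scores A and estimates E.
import numpy as np

def scoring_metrics(A, E):
    A, E = np.asarray(A, float), np.asarray(E, float)
    nearest = np.round(E * 2) / 2            # snap to the 7 legal scores
    s1 = np.mean(nearest != A)               # probability of error
    s2 = np.mean(np.abs(A - E))              # average absolute error
    s3 = np.mean((A - E) ** 2)               # average squared error
    pc = np.corrcoef(A, E)[0, 1]             # Pearson's correlation coefficient
    return s1, s2, s3, pc
```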

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data

95
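A sketch of that tuning loop, with scikit-learn's SVR (a LIBSVM wrapper) standing in for the original learner; the candidate values for the regularization parameter C are illustrative.

```python
# Sketch: tune C on development data for one scoring metric.
from sklearn.svm import SVR

def train_regressor(X_tr, y_tr, X_dev, y_dev, metric, smaller_is_better=True):
    best_model, best_score = None, None
    for C in (0.01, 0.1, 1.0, 10.0, 100.0):
        model = SVR(kernel="linear", C=C).fit(X_tr, y_tr)
        score = metric(y_dev, model.predict(X_dev))
        better = best_score is None or (
            score < best_score if smaller_is_better else score > best_score)
        if better:
            best_model, best_score = model, score
    return best_model
```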

Results

System                   S1     S2     S3     PC
Baseline (RI features)   .517   .368   .234   .233
Our system               .488   .348   .197   .360

• S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson’s correlation coefficient
• Improvements w.r.t. all four scoring metrics; the differences are significant

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system’s performance w.r.t. each scoring metric
  – train a regressor on all but one type of features

105
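A compact sketch of the ablation loop; `feature_groups` maps each feature type to its column indices and `train_and_eval` is a hypothetical helper (e.g., 5-fold cross-validation returning one scoring metric).

```python
# Sketch: retrain with one feature type held out at a time (numpy assumed).
import numpy as np

def ablation(X, y, feature_groups, train_and_eval):
    results = {}
    for name, cols in feature_groups.items():
        keep = np.setdiff1d(np.arange(X.shape[1]), cols)
        results[name] = train_and_eval(X[:, keep], y)
        # a larger performance drop => a more important feature type
    return results
```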

Feature Ablation Results

[Table: for each scoring metric (S1, S2, S3, PC), the seven feature types (PA keywords, TC keywords, n-grams, RI, unlabeled LDA topics, annotated LDA topics, predicted TC errors) ranked from most important to least important; the ordering differs from metric to metric]

• The relative importance of the features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

118

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach

• Released the annotations:
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

119

Page 5: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

5

What is Prompt Adherence

bull refers to how related an essayrsquos content is to

the prompt for which it was written

ndash An essay with a high prompt adherence score

consistently remains on topic introduced by the

prompt and is free of irrelevant digressions

5

6

Goal

bull Develop a model for scoring the prompt

adherence of student essays

6

7

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

8

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

9

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

Knowledge-lean

ndash Features derived from semantic similarity measures

bull Random indexing (RI) and Content Vector Analysis (CVA)

10

What are the differences between

our work and previous work

10

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 6: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

6

Goal

bull Develop a model for scoring the prompt

adherence of student essays

6

7

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

8

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

9

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

Knowledge-lean

ndash Features derived from semantic similarity measures

bull Random indexing (RI) and Content Vector Analysis (CVA)

10

What are the differences between

our work and previous work

10

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

"Marx once said that religion was the opium of the masses. If he was alive at the end of the 20th century, he would replace religion with television."

• This question suggests that students discuss whether television is analogous to religion in this way

– prompt adherence keywords contain "religion"

– thesis clarity keywords do not contain "religion"

– A thesis like "Television is bad" can be stated clearly without reference to "religion"

• An essay with this thesis could have a high thesis clarity score

– but a low prompt adherence score: religion should be discussed

50

Creating Features from Keywords

• Two types of features; for each prompt component:

1. take the RI similarity between the whole essay and the component's keywords

2. compute the fraction of the component's keywords that appear in the essay (sketch below)
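A sketch of both feature types under the same assumptions as before (ri_similarity as above; component_keywords is a hypothetical structure with one keyword list per prompt component).

```python
def pa_keyword_features(essay_words, component_keywords, ri_vectors):
    """Two features per prompt component: RI similarity between the essay
    and the component's keywords, and the fraction of those keywords
    that appear in the essay."""
    essay_set = set(essay_words)
    feats = []
    for keywords in component_keywords:
        feats.append(ri_similarity(essay_words, keywords, ri_vectors))
        feats.append(sum(w in essay_set for w in keywords) / len(keywords))
    return feats
```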

52

3. LDA Topics

• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt

"All armies should consist entirely of professional soldiers; there is no value in a system of military service."

• An essay containing words like "peace", "patriotism", or "training" is probably not digressing and should not be penalized for discussing these topics

– But the various measures of keyword similarities might not notice that anything related to the prompt is discussed

– this might have effects like lowering the RI similarity scores

57

How to create LDA features

For each prompt (see the sketch below):

1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled

2. build an LDA model of 1000 topics

– Soft clustering of the words into 1000 sets

– E.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"

– The model can tell us how much of an essay is spent on each topic, e.g. 25% Topic 1, 45% Topic 2, 15% Topic 3, ……, 5% Topic 1000

3. construct 1000 features, one for each topic

• Feature value encodes how much of the essay was spent discussing the topic

– E.g., an essay written for the military prompt might spend 55% on ("fully", "count", "ordinary", "czech", "day") and 45% on ("man", "military", "service", "pay", "war")
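A sketch of these topic-distribution features using gensim, one possible toolkit (the talk does not name one); essays is assumed to be the list of tokenized ICLE essays written for one prompt.

```python
from gensim import corpora, models

def lda_topic_features(essays, num_topics=1000):
    """essays: list of token lists, one per ICLE essay for a single prompt.
    Returns one num_topics-dimensional vector per essay, where each value
    is the proportion of the essay spent on that topic."""
    dictionary = corpora.Dictionary(essays)
    corpus = [dictionary.doc2bow(e) for e in essays]
    lda = models.LdaModel(corpus, id2word=dictionary, num_topics=num_topics)
    features = []
    for bow in corpus:
        vec = [0.0] * num_topics
        # minimum_probability=0.0 makes gensim report every topic
        for topic_id, prop in lda.get_document_topics(bow, minimum_probability=0.0):
            vec[topic_id] = prop
        features.append(vec)
    return features
```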

64

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression

– An infrequent topic may not appear enough in the training set for the regressor to make this judgment

• Create manually annotated LDA features

67

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics

2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt

• higher score, more adherent

3. For each essay, create 10 features from the labeled topics

71

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, resp.

• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, resp. (sketch below)

• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions
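A sketch of these ten features; topic_props would come from a 100-topic model like the one above, and topic_labels are the hand-assigned 0-5 adherence scores.

```python
def annotated_lda_features(topic_props, topic_labels):
    """topic_props: contribution of each of the 100 topics to one essay.
    topic_labels: hand-assigned adherence score (0-5) for each topic.
    Returns 10 features: mass on topics labeled exactly 0..4, and mass
    on topics labeled >= 1 .. >= 5."""
    exact = [sum(p for p, l in zip(topic_props, topic_labels) if l == k)
             for k in range(5)]
    at_least = [sum(p for p, l in zip(topic_props, topic_labels) if l >= k)
                for k in range(1, 6)]
    return exact + at_least
```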

73

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we

– score an essay w.r.t. the clarity of its thesis

– determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

• Confusing Phrasing

• Missing Details

• Writer Position

• Incomplete Prompt Response

• Relevance to Prompt

76

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains

– Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations

• Predict which of the 5 error types an essay contains

– Recast as a multi-label classification task

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type (sketch below)
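One way to realize this multi-label prediction is one binary classifier per error type; this is an assumption, as the talk does not specify the learner. A scikit-learn sketch:

```python
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier

def predicted_error_features(X_train, Y_train, X_test):
    """Y_train: binary indicator matrix (n_essays x 5 error types).
    Returns 5 binary features per test essay, one per predicted
    error type."""
    clf = OneVsRestClassifier(LinearSVC())
    clf.fit(X_train, Y_train)
    return clf.predict(X_test)  # 0/1 matrix: presence of each error type
```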

80

6. N-gram Features

• Can capture useful words and phrases related to a prompt

• 10K lemmatized unigrams, bigrams, and trigrams

– selected according to information gain (sketch below)
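A sketch of the selection step with scikit-learn, using mutual information as a stand-in for the information gain criterion on the slide; the helper name is hypothetical and essays are assumed already lemmatized.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def ngram_features(lemmatized_essays, scores, k=10000):
    """lemmatized_essays: one whitespace-joined string of lemmas per essay.
    Keeps the k uni-/bi-/trigrams most informative about the score."""
    labels = [int(2 * s) for s in scores]  # 7 half-point scores -> discrete labels
    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)
    selector = SelectKBest(mutual_info_classif, k=min(k, X.shape[1]))
    return selector.fit_transform(X, labels)
```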

81

Summary of Features

• Baseline features

– Random indexing features

• Six types of new features

– Lemmatized n-grams

– Thesis clarity keyword features

– Prompt adherence keyword features

– LDA topics

– Manually annotated LDA topics

– Predicted thesis clarity errors

82

Plan for the Talk

• Corpus and Annotations

• Model for scoring prompt adherence

• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring

• 5-fold cross-validation

84

Scoring Metrics

• Define 4 evaluation metrics, comparing annotated scores A_i with estimated scores E_i over the N test essays (sketch below):

S1 = (1/N) Σ_i 1[A_i ≠ E_i]  (probability of error): probability that a system predicts the wrong score out of 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)

S2 = (1/N) Σ_i |A_i − E_i|  (average absolute error): distinguishes near misses from far misses

S3 = (1/N) Σ_i (A_i − E_i)²  (average squared error): prefers systems whose estimations are not too often far away from the correct scores

PC: Pearson's correlation coefficient between A_i and E_i

• S1, S2, S3: error metrics; smaller value is better

• PC: correlation coefficient; larger value is better
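A short sketch of the four metrics as reconstructed above; whether S2 and S3 are computed on rounded or raw estimates is an assumption here.

```python
import numpy as np
from scipy.stats import pearsonr

def scoring_metrics(A, E):
    """A: annotated scores; E: estimated scores."""
    A = np.asarray(A, dtype=float)
    E = np.asarray(E, dtype=float)
    # Round estimates to the nearest of the 7 valid scores for S1
    R = np.clip(np.round(E * 2) / 2, 1.0, 4.0)
    S1 = np.mean(A != R)           # probability of error
    S2 = np.mean(np.abs(A - E))    # average absolute error
    S3 = np.mean((A - E) ** 2)     # average squared error
    PC = pearsonr(A, E)[0]         # Pearson's correlation coefficient
    return S1, S2, S3, PC
```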

94

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (sketch below)
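A sketch of the tuning loop with scikit-learn's SVR (a LIBSVM wrapper); the C grid is hypothetical. Pass a single-metric function, e.g. lambda a, e: scoring_metrics(a, e)[1] for S2, and set smaller_is_better=False when tuning for PC.

```python
from sklearn.svm import SVR  # scikit-learn wraps LIBSVM

def tune_regressor(X_tr, y_tr, X_dev, y_dev, metric, smaller_is_better=True):
    """Pick the regularization parameter C that optimizes one scoring
    metric on held-out development data."""
    best_model, best_score = None, None
    for C in (0.01, 0.1, 1.0, 10.0, 100.0):  # hypothetical grid
        model = SVR(kernel="linear", C=C).fit(X_tr, y_tr)
        score = metric(y_dev, model.predict(X_dev))
        better = best_score is None or (
            score < best_score if smaller_is_better else score > best_score)
        if better:
            best_model, best_score = model, score
    return best_model
```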

95

Results

System        S1      S2      S3      PC
Baseline     .517    .368    .234    .233
Our system   .488    .348    .197    .360

(Baseline = RI features; S1 = probability of error, S2 = average absolute error, S3 = average squared error, PC = Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics

• Differences are significant

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric

– Train a regressor on all but one type of features (see the sketch below)
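A sketch of the ablation loop; train_and_score is a hypothetical callable wrapping regressor training and the chosen scoring metric (e.g. via cross-validation). Comparing each result with the full system's score shows how important the held-out feature type is: the bigger the degradation, the more important the type.

```python
import numpy as np

def feature_ablation(feature_groups, train_and_score):
    """feature_groups: dict mapping a feature-type name to its matrix.
    Returns the metric obtained when each type is held out in turn."""
    results = {}
    for held_out in feature_groups:
        X = np.hstack([M for name, M in feature_groups.items()
                       if name != held_out])  # all but one feature type
        results[held_out] = train_and_score(X)
    return results
```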

105

Feature Ablation Results

[Figure: for each scoring metric, the seven feature types arranged from least important (left) to most important (right) when ablated]

S1: PA keywords, Annotated LDA topics, N-grams, Predicted TC errors, RI, Unlabeled LDA topics, TC keywords
S2: Predicted TC errors, Unlabeled LDA topics, PA keywords, RI, TC keywords, Annotated LDA topics, N-grams
S3: TC keywords, Predicted TC errors, PA keywords, Unlabeled LDA topics, RI, Annotated LDA topics, N-grams
PC: PA keywords, Unlabeled LDA topics, Predicted TC errors, RI, N-grams, TC keywords, Annotated LDA topics

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But… there are features that tend to be more important than the others in the presence of other features

• most important: TC keywords, n-grams, annotated LDA topics

• middling importance: RI, unlabeled LDA topics

• least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays

– feature-rich approach

• Released the annotations

– prompt adherence scores

– prompt adherence keywords

– manually annotated LDA topics


36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 9: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

9

Related Work on Prompt Adherence Scoring

bull Off-topic sentence detection (Higgins et al 2004)

bull Off-topic essay detection (Higgins et al 2006)

bull Off-topic essay detection with short prompts

(Louis and Higgins 2010)

Binary decision (off-topicon-topic)

Knowledge-lean

ndash Features derived from semantic similarity measures

bull Random indexing (RI) and Content Vector Analysis (CVA)

10

What are the differences between

our work and previous work

10

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

[Chart: the seven feature types ranked by importance wrt each scoring metric; each row runs from least important (left) to most important (right), per the chart's arrows]

S1: PA keywords, Annotated LDA topics, N-grams, Predicted TC errors, RI, Unlabeled LDA topics, TC keywords

S2: Predicted TC errors, Unlabeled LDA topics, PA keywords, RI, TC keywords, Annotated LDA topics, N-grams

S3: TC keywords, Predicted TC errors, PA keywords, Unlabeled LDA topics, RI, Annotated LDA topics, N-grams

PC: PA keywords, Unlabeled LDA topics, Predicted TC errors, RI, N-grams, TC keywords, Annotated LDA topics

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But… there are features that tend to be more important than the others in the presence of other features:

– most important: TC keywords, n-grams, annotated LDA topics

– middling important: RI, unlabeled LDA topics

– least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays

– feature-rich approach

• Released the annotations

– prompt adherence scores

– prompt adherence keywords

– manually annotated LDA topics

Page 10: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

10

What are the differences between

our work and previous work

10

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 11: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

11

What are the differences between

our work and previous work

11

Task

Score can range from 1-4 points

Supervised prompt adherence scoring

Approach

Feature-rich

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

Results

System        S1     S2     S3     PC
Baseline      .517   .368   .234   .233
Our system    .488   .348   .197   .360

• Baseline: the RI features
• S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient
• Improvements w.r.t. all four scoring metrics
• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
– Train a regressor on all but one type of features (a minimal sketch of this protocol follows)
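A minimal sketch of the ablation protocol; the per-type feature matrices, train_fn, and metric are hypothetical stand-ins (train_fn could wrap the tune_svr routine above), not names from the talk:

import numpy as np

FEATURE_TYPES = ["RI", "n-grams", "TC keywords", "PA keywords",
                 "LDA topics", "annotated LDA topics", "predicted TC errors"]

def ablation(train_feats, y_train, dev_feats, y_dev, train_fn, metric):
    """Drop one feature type at a time and measure dev-set error; a larger
    error after removal means the removed feature type mattered more."""
    impact = {}
    for held_out in FEATURE_TYPES:
        kept = [t for t in FEATURE_TYPES if t != held_out]
        X_tr = np.hstack([train_feats[t] for t in kept])  # per-type matrices
        X_dv = np.hstack([dev_feats[t] for t in kept])
        model = train_fn(X_tr, y_train)  # returns a fitted regressor
        impact[held_out] = metric(y_dev, model.predict(X_dv))
    return impact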

Feature Ablation Results

• One row per scoring metric; within each row the feature types are ordered from least important (left) to most important (right), following the slide's most/least-important callouts:

S1: PA keywords, Annotated LDA topics, N-grams, Predicted TC errors, RI, Unlabeled LDA topics, TC keywords
S2: Predicted TC errors, Unlabeled LDA topics, PA keywords, RI, TC keywords, Annotated LDA topics, N-grams
S3: TC keywords, Predicted TC errors, PA keywords, Unlabeled LDA topics, RI, Annotated LDA topics, N-grams
PC: PA keywords, Unlabeled LDA topics, Predicted TC errors, RI, N-grams, TC keywords, Annotated LDA topics

• The relative importance of the feature types does not always remain consistent if we measure performance in different ways
• But… there are feature types that tend to be more important than the others in the presence of the other features:
– most important: TC keywords, n-grams, annotated LDA topics
– middling importance: RI, unlabeled LDA topics
– least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
– feature-rich approach
• Released the annotations:
– prompt adherence scores
– prompt adherence keywords
– manually annotated LDA topics

Page 12: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

12

bull Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

Plan for the Talk

12

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 13: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

13

Plan for the Talk

Corpus and Annotations

bull Approach for scoring prompt adherence

bull Evaluation

13

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

82

Plan for the Talk

Corpus and Annotations
Model for scoring prompt adherence
• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

84

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei (for N essays):

  S1 = (1/N) Σi 1[Ai ≠ Ei′]   (probability of error)
  – probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4); Ei′ is Ei rounded to the nearest possible score

  S2 = (1/N) Σi |Ai − Ei|   (average absolute error)
  – distinguishes near misses from far misses

  S3 = (1/N) Σi (Ai − Ei)²   (average squared error)
  – prefers systems whose estimations are not too often far away from the correct scores

  PC = Pearson's correlation coefficient between Ai and Ei

• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better

94
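A sketch of the four metrics as defined above, with `A` and `E` parallel sequences of annotated and estimated scores; rounding E to the nearest of the seven scores for S1 is an assumption consistent with the slide's "wrong score out of 7" description:

    import numpy as np

    SCORES = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])

    def scoring_metrics(A, E):
        A, E = np.asarray(A, float), np.asarray(E, float)
        E_rounded = SCORES[np.abs(E[:, None] - SCORES).argmin(axis=1)]
        S1 = np.mean(A != E_rounded)        # probability of error
        S2 = np.mean(np.abs(A - E))         # average absolute error
        S3 = np.mean((A - E) ** 2)          # average squared error
        PC = np.corrcoef(A, E)[0, 1]        # Pearson's correlation coefficient
        return S1, S2, S3, PC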

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data

95
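A sketch of the tuning loop, using scikit-learn's SVR (a wrapper around LIBSVM's epsilon-SVR); the candidate C grid and all variable names are assumptions:

    from sklearn.svm import SVR

    def tune_regressor(X_tr, y_tr, X_dev, y_dev, metric, smaller_is_better=True):
        best_model, best_val = None, None
        for C in [0.1, 1, 10, 100, 1000]:  # assumed candidate grid
            model = SVR(kernel="linear", C=C).fit(X_tr, y_tr)
            val = metric(y_dev, model.predict(X_dev))
            better = best_val is None or (val < best_val if smaller_is_better
                                          else val > best_val)
            if better:
                best_model, best_val = model, val
        return best_model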

Results

System        S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(Baseline: random indexing features; S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• Significant differences

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features

105
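A sketch of the ablation loop, assuming a dense feature matrix `X`, scores `y`, a `feature_groups` map from feature-type name to column indices, and a `train_eval` callback that runs the 5-fold experiment (all names hypothetical):

    import numpy as np

    def ablation(X, y, feature_groups, train_eval):
        results = {}
        for name, cols in feature_groups.items():
            keep = np.setdiff1d(np.arange(X.shape[1]), cols)
            # Retrain and re-evaluate with this feature type held out.
            results[name] = train_eval(X[:, keep], y)
        return results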

Feature Ablation Results

Feature types ranked for each scoring metric, ordered from least important (left) to most important (right):

S1: PA keywords, annotated LDA topics, n-grams, predicted TC errors, RI, unlabeled LDA topics, TC keywords
S2: predicted TC errors, unlabeled LDA topics, PA keywords, RI, TC keywords, annotated LDA topics, n-grams
S3: TC keywords, predicted TC errors, PA keywords, unlabeled LDA topics, RI, annotated LDA topics, n-grams
PC: PA keywords, unlabeled LDA topics, predicted TC errors, RI, n-grams, TC keywords, annotated LDA topics

(TC: thesis clarity; PA: prompt adherence; RI: random indexing)

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

118

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

119

Page 14: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

14

Selecting a Corpus

bull International Corpus of Learner English (ICLE)

ndash 45 million words in more than 6000 essays

ndash Written by university undergraduates who are

learners of English as a foreign language

ndash Mostly (91) argumentative writing topics

bull Essays selected for annotation

ndash 830 argumentative essays from 13 prompts

ndash annotate each essay with its prompt adherence score

14

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 15: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

15

Prompt Adherence Scoring Rubric

  4 ndash essay fully addresses the prompt and

consistently stays on topic

  3 ndash essay mostly addresses the prompt or

occasionally wanders off topic

  2 ndash essay does not fully address the prompt or

consistently wanders off topic

  1 ndash essay does not address the prompt at all or is

completely off topic

bull Half-point increments (ie 15 25 35) allowed15

16

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

16

17

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

17

18

Inter-Annotator Agreement

bull 707 of 830 essays scored by both annotators

bull Perfect agreement on 38 of essays

bull Scores within 05 points on 66 of essays

bull Scores within 10 point on 89 of essays

bull Whenever annotators disagree use the average

score rounded to the nearest half point

18

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an
   LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate
   it with a number from 0 to 5 representing how likely it is
   that the topic is adherent to the prompt
   • higher score = more adherent
3. For each essay, create 10 features from the labeled topics

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay
  of topics annotated with the number 0, 1, ..., 4, respectively
• Five features encode the sum of contributions to an essay
  of topics annotated with a number ≥ 1, ≥ 2, ..., ≥ 5, respectively
• These features should give the regressor a better idea of
  how much of an essay is composed of prompt-related vs.
  prompt-unrelated discussion (see the sketch below)
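A small sketch of this aggregation, directly mirroring the two bullet
points above. The names are illustrative: topic_props holds one essay's
100 topic proportions, labels the hand-assigned 0-5 adherence numbers:

    def annotated_lda_features(topic_props, labels):
        # topic_props[t]: share of the essay devoted to topic t
        # labels[t]: hand-assigned prompt-adherence number (0-5) for topic t
        exact = [sum(p for p, l in zip(topic_props, labels) if l == k)
                 for k in range(5)]        # topics labeled 0, 1, ..., 4
        at_least = [sum(p for p, l in zip(topic_props, labels) if l >= k)
                    for k in range(1, 6)]  # topics labeled >= 1, ..., >= 5
        return exact + at_least            # the 10 feature values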

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring
  (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that
    detract from the clarity of its thesis

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

Features based on Error Types

• Introduced features for prompt adherence scoring
  that encode the error types an essay contains
  – though each essay was manually annotated with the
    errors it contains, in a realistic setting we won't have
    access to these manual annotations
• Predict which of the 5 error types an essay contains
  – recast as a multi-label classification task

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or
  absence of each error type
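One plausible realization of this prediction step, shown after the
slide for concreteness. The talk does not specify the classifier, so
the one-vs-rest setup, the scikit-learn usage, and the data names
below are assumptions:

    # Hypothetical sketch: predict the 5 thesis clarity error types and
    # expose each prediction as a binary feature (scikit-learn assumed)
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import LinearSVC

    ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
                   "Incomplete Prompt Response", "Relevance to Prompt"]

    clf = OneVsRestClassifier(LinearSVC())
    clf.fit(X_train, Y_train)             # Y_train: n_essays x 5 binary indicator matrix
    error_features = clf.predict(X_test)  # one 0/1 feature per error type per essay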

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain
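A sketch of this selection step. It assumes information gain is
computed as the mutual information between an n-gram's presence and
the essay's score class (a common reading), with scikit-learn standing
in for whatever tooling was actually used and lemmatization done
beforehand:

    # Hypothetical sketch: keep the 10K lemmatized 1-3 grams with the
    # highest information gain w.r.t. the essay scores
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)   # n-gram presence matrix
    selector = SelectKBest(mutual_info_classif, k=10000)
    X_selected = selector.fit_transform(X, essay_score_labels)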

Summary of Features

• Baseline features
  – random indexing features
• Six types of new features
  – lemmatized n-grams
  – thesis clarity keyword features
  – prompt adherence keyword features
  – LDA topics
  – manually annotated LDA topics
  – predicted thesis clarity errors

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and
  the estimated scores Ei

S1: probability that a system predicts the wrong score, out of
    the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
    (probability of error)
S2: average absolute error
    – distinguishes near misses from far misses
S3: average squared error
    – prefers systems whose estimations are not too often far
      away from the correct scores
PC: Pearson's correlation coefficient between Ai and Ei

• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better
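The slide's formula images did not survive extraction; the following
LaTeX reconstruction is an assumption consistent with the descriptions
above, for N test essays and with E'_j denoting E_j rounded to the
nearest of the 7 possible scores:

    S_1 = \frac{1}{N} \sum_{j=1}^{N} \mathbf{1}\left[ A_j \neq E'_j \right]
    \qquad
    S_2 = \frac{1}{N} \sum_{j=1}^{N} \left| A_j - E_j \right|
    \qquad
    S_3 = \frac{1}{N} \sum_{j=1}^{N} \left( A_j - E_j \right)^2

with PC the Pearson correlation between the vectors A and E.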

Regressor Training

• SVM regressors are trained to maximize performance
  w.r.t. each scoring metric by tuning the regularization
  parameter on held-out development data (sketch below)
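A compact sketch of what this tuning loop might look like. The talk
trains with LIBSVM, so the scikit-learn SVR wrapper, the candidate C
grid, and the helper names here are assumptions:

    # Hypothetical sketch: choose the regularization parameter C that
    # optimizes one scoring metric on held-out development data
    from sklearn.svm import SVR

    def train_regressor(X_tr, y_tr, X_dev, y_dev, metric, smaller_is_better=True):
        best_value, best_c, best_model = None, None, None
        for c in [0.01, 0.1, 1, 10, 100]:  # assumed grid
            model = SVR(kernel="linear", C=c).fit(X_tr, y_tr)
            value = metric(y_dev, model.predict(X_dev))
            improved = (best_value is None or
                        (value < best_value if smaller_is_better
                         else value > best_value))
            if improved:
                best_value, best_c, best_model = value, c, model
        return best_model, best_c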

Results

System        S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(S1: probability of error; S2: average absolute error; S3: average
squared error; PC: Pearson's correlation coefficient; the baseline
uses the RI features)

• Improvements w.r.t. all four scoring metrics
• The differences are significant

Feature Ablation

• Goal: examine how much impact each of the feature
  types has on our system's performance w.r.t. each
  scoring metric
  – train a regressor on all but one type of features
    (a sketch of this leave-one-out loop follows the slide)
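Under the same assumptions as the earlier training sketch, the
ablation loop could look like this; build_features, the s1_error
metric helper, and the feature-group strings are illustrative:

    # Hypothetical sketch: leave-one-feature-type-out ablation
    FEATURE_TYPES = ["RI", "n-grams", "TC keywords", "PA keywords",
                     "LDA topics", "annotated LDA topics", "predicted TC errors"]

    for held_out in FEATURE_TYPES:
        kept = [ft for ft in FEATURE_TYPES if ft != held_out]
        X_tr = build_features(train_essays, kept)    # assumed helper
        X_dev = build_features(dev_essays, kept)
        model, _ = train_regressor(X_tr, y_tr, X_dev, y_dev, metric=s1_error)
        print(held_out, "ablated ->", s1_error(y_dev, model.predict(X_dev)))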

Feature Ablation Results

For each scoring metric, the feature types ordered from least
important (left) to most important (right) when ablated
(TC = thesis clarity, PA = prompt adherence):

S1: PA keywords < annotated LDA topics < n-grams < predicted TC
    errors < RI < unlabeled LDA topics < TC keywords
S2: predicted TC errors < unlabeled LDA topics < PA keywords < RI
    < TC keywords < annotated LDA topics < n-grams
S3: TC keywords < predicted TC errors < PA keywords < unlabeled
    LDA topics < RI < annotated LDA topics < n-grams
PC: PA keywords < unlabeled LDA topics < predicted TC errors < RI
    < n-grams < TC keywords < annotated LDA topics

• Relative importance of features does not always remain
  consistent if we measure performance in different ways
• But... there are features that tend to be more important than
  the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence
  scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics


detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 18: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

18

Inter-Annotator Agreement

• 707 of 830 essays scored by both annotators
• Perfect agreement on 38% of essays
• Scores within 0.5 points on 66% of essays
• Scores within 1.0 point on 89% of essays
• Whenever the annotators disagree, use the average score rounded to the nearest half point

19

Plan for the Talk

• Corpus and Annotations
• Approach for scoring prompt adherence
• Evaluation

22

Approach for Scoring Prompt Adherence

• Recast the problem as a linear regression task
• Train one regressor per prompt
  – common problems students have writing essays for one prompt may not apply to essays written for another
• One training instance per training essay
  – "class" value: prompt adherence score
  – learner: LIBSVM
  – features: 7 feature types

24

Features

• 7 types of features
  – baseline features
  – 6 types of new features

26

Baseline Features

• Features based on Random Indexing (RI)
  – adapted from Higgins et al. (2004)
• Random indexing
  – "an efficient and scalable alternative to LSI" (Sahlgren, 2005)
  – generates a semantic similarity measure between any two words
  – generalized to computing similarity between two groups of words (Higgins & Burstein, 2007)

27

Why Random Indexing (RI)?

• May help find text in an essay related to the prompt even if some of its words have been rephrased
  – e.g., the essay talks about "jail" while the prompt has "prison"
• Train an RI model on the English Gigaword corpus

28

5 Random Indexing Features

• The entire essay's similarity to the prompt
• The essay's highest individual sentence's similarity to the prompt
• The highest entire-essay similarity to one of the prompt sentences
• The highest individual sentence similarity to an individual prompt sentence
• The essay's similarity to a manually rewritten version of the prompt that excludes extraneous material

(a code sketch of these five features follows)
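The talk does not give an implementation, so below is a minimal sketch of how these five features could be computed. It assumes a trained RI model is available as a `word_vec` dictionary mapping words to vectors; the dictionary, tokenization, and helper names are hypothetical, not part of any named library.

```python
import numpy as np

def group_vector(words, word_vec):
    # Sum the vectors of the words found in the (assumed) RI model.
    vecs = [word_vec[w] for w in words if w in word_vec]
    return np.sum(vecs, axis=0) if vecs else None

def similarity(words_a, words_b, word_vec):
    # Cosine similarity between two groups of words, in the spirit of
    # Higgins & Burstein (2007); 0.0 if either group is out of vocabulary.
    a, b = group_vector(words_a, word_vec), group_vector(words_b, word_vec)
    if a is None or b is None:
        return 0.0
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def ri_features(essay_sents, prompt_sents, rewritten_prompt, word_vec):
    # essay_sents / prompt_sents: lists of token lists; rewritten_prompt: token list.
    essay = [w for s in essay_sents for w in s]
    prompt = [w for s in prompt_sents for w in s]
    return [
        similarity(essay, prompt, word_vec),                        # essay vs. prompt
        max(similarity(s, prompt, word_vec) for s in essay_sents),  # best essay sentence vs. prompt
        max(similarity(essay, p, word_vec) for p in prompt_sents),  # essay vs. best prompt sentence
        max(similarity(s, p, word_vec)                              # best sentence pair
            for s in essay_sents for p in prompt_sents),
        similarity(essay, rewritten_prompt, word_vec),              # essay vs. rewritten prompt
    ]
```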

29

Features

• 7 types of features
  – baseline features
  – 6 types of new features

32

1. Thesis Clarity Keyword Features

(thesis clarity refers to how clearly an author explains the thesis of her essay)

• introduced in Persing & Ng (2013) for scoring the thesis clarity of an essay
• generated based on thesis clarity keywords

34

What are Thesis Clarity Keywords?

• "important" words in a prompt
  – important word = a good word for a student to use when stating her thesis about the prompt

36

How to Identify Keywords?

• By hand

39

Hand-Selecting Keywords

• Hand-segment each multi-part prompt into parts
• For each part, hand-pick the most important (primary) and second most important (secondary) words that it would be good for a writer to use to address the part

  "The prison system is outdated. No civilized society should punish its criminals; it should rehabilitate them."

  Primary: rehabilitate
  Secondary: society

41

Designing Keyword Features

• Example: in one feature we
  1. compute the random indexing similarity between the essay and each group of primary keywords taken from parts of the essay's prompt
  2. assign the feature the lowest of these values
• A low feature value suggests that the student ignored the prompt component from which the value came (see the sketch below)
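A sketch of the example feature just described, under the same assumptions as the earlier RI sketch (precomputed essay and keyword-group vectors; `cos` is an ordinary cosine similarity written out here, not a library call):

```python
import numpy as np

def cos(a, b):
    # Plain cosine similarity; the epsilon guards against zero vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def min_primary_similarity(essay_vec, primary_group_vecs):
    # Feature value = the lowest RI similarity between the essay and any prompt
    # part's primary-keyword group; a low value flags an ignored component.
    return min(cos(essay_vec, g) for g in primary_group_vecs)
```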

42

Thesis Clarity Keyword Features

• Though these features were designed for scoring thesis clarity, some of them are useful for prompt adherence scoring

44

2. Prompt Adherence Keyword Features

• Motivation: rather than relying on keywords for thesis clarity, why not hand-pick keywords for prompt adherence and create features from them?

49

An Illustrative Example

  "Marx once said that religion was the opium of the masses. If he was alive at the end of the 20th century, he would replace religion with television."

• This question suggests that students discuss whether television is analogous to religion in this way
  – prompt adherence keywords contain "religion"
  – thesis clarity keywords do not contain "religion"
  – a thesis like "Television is bad" can be stated clearly without reference to "religion"
• An essay with this thesis could have a high thesis clarity score
  – but a low prompt adherence score: religion should be discussed

51

Creating Features from Keywords

• Two types of features
• For each prompt component:
  1. take the RI similarity between the whole essay and the component's keywords
  2. compute the fraction of the component's keywords that appear in the essay

(a code sketch follows)
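A hedged sketch of the two per-component keyword features, again assuming a `word_vec` dictionary from a trained RI model and token lists as input (all names are hypothetical):

```python
import numpy as np

def group_vec(words, word_vec):
    # Sum the vectors of in-vocabulary words; zero vector if none are found.
    vecs = [word_vec[w] for w in words if w in word_vec]
    dim = next(iter(word_vec.values())).shape
    return np.sum(vecs, axis=0) if vecs else np.zeros(dim)

def keyword_features(essay_tokens, component_keywords, word_vec):
    # For each prompt component: (1) RI similarity between the whole essay and
    # the component's keywords, (2) fraction of those keywords in the essay.
    e = group_vec(essay_tokens, word_vec)
    essay_set = set(essay_tokens)
    feats = []
    for keywords in component_keywords:
        k = group_vec(keywords, word_vec)
        denom = np.linalg.norm(e) * np.linalg.norm(k) + 1e-12
        feats.append(float(np.dot(e, k) / denom))                     # feature type 1
        feats.append(len(essay_set & set(keywords)) / len(keywords))  # feature type 2
    return feats
```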

56

3. LDA Topics

• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt

  "All armies should consist entirely of professional soldiers; there is no value in a system of military service."

• Words like "peace", "patriotism", or "training" in an essay are probably not digressions, and the essay should not be penalized for discussing these topics
  – but the various measures of keyword similarity might not notice that anything related to the prompt is discussed
  – this might have effects like lowering the RI similarity scores

63

How to create LDA features

For each prompt:
1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. build an LDA model of 1000 topics
   – a soft clustering of the words into 1000 sets
   – e.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"
   – the model can tell us how much of an essay is spent on each topic (e.g., 25% on Topic 1, 45% on Topic 2, 15% on Topic 3, ..., 5% on Topic 1000)
3. construct 1000 features, one for each topic
   – the feature value encodes how much of the essay was spent discussing the topic
   – e.g., an essay written for the military prompt might spend 55% on the topic "fully", "count", "ordinary", "czech", "day" and 45% on the topic "man", "military", "service", "pay", "war"

(a gensim sketch follows)
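The slides do not name a toolkit; below is a minimal sketch with gensim (an assumption) that builds the per-prompt 1000-topic model and turns each essay's topic distribution into the 1000 feature values:

```python
from gensim import corpora, models

def lda_topic_features(tokenized_essays, num_topics=1000):
    # Fit LDA on all essays written for one prompt and return, for each essay,
    # a num_topics-dimensional vector of topic proportions.
    dictionary = corpora.Dictionary(tokenized_essays)
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_essays]
    lda = models.LdaModel(corpus, id2word=dictionary, num_topics=num_topics)
    features = []
    for bow in corpus:
        vec = [0.0] * num_topics
        # minimum_probability=0.0 makes gensim report every topic's share
        for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
            vec[topic_id] = float(prob)
        features.append(vec)
    return features
```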

66

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression
  – an infrequent topic may not appear enough in the training set for the regressor to make this judgment
• Create manually annotated LDA features

70

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
   – higher score = more adherent
3. For each essay, create 10 features from the labeled topics

72

10 Features from the Labeled Topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, ..., 4, respectively
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, ..., ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions

(sketch below)
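A sketch of the 10 features, assuming we already have each essay's 100-dimensional topic-proportion vector and the hand-assigned 0-5 adherence labels for the topics:

```python
import numpy as np

def annotated_lda_features(topic_props, topic_labels):
    # topic_props: length-100 vector of one essay's topic proportions.
    # topic_labels: length-100 array of hand-assigned adherence scores (0-5).
    p = np.asarray(topic_props, dtype=float)
    lab = np.asarray(topic_labels)
    exact = [p[lab == k].sum() for k in range(5)]        # labels 0, 1, ..., 4
    at_least = [p[lab >= k].sum() for k in range(1, 6)]  # labels >=1, ..., >=5
    return exact + at_least                              # 10 feature values
```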

73

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

75

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

78

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• Predict which of the 5 error types an essay contains
  – recast as a multi-label classification task

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type

80

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain (sketch below)
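The talk says only that the 10K n-grams are chosen by information gain; here is a sketch with scikit-learn (an assumption), using mutual information as a stand-in for information gain over the seven discrete score classes:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def select_ngram_features(lemmatized_essays, scores, k=10000):
    # lemmatized_essays: list of lemmatized essay strings;
    # scores: the discrete prompt adherence scores (1, 1.5, ..., 4).
    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)
    # Keep the k n-grams most informative about the score labels.
    selector = SelectKBest(mutual_info_classif, k=min(k, X.shape[1]))
    X_selected = selector.fit_transform(X, scores)
    return X_selected, vectorizer, selector
```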

81

Summary of Features

• Baseline features
  – random indexing features
• Six types of new features
  – lemmatized n-grams
  – thesis clarity keyword features
  – prompt adherence keyword features
  – LDA topics
  – manually annotated LDA topics
  – predicted thesis clarity errors

82

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

93

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei (formulas below):
  – S1 (probability of error): the probability that a system predicts the wrong score out of 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from far misses
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3
  – error metrics
  – smaller value is better
• PC
  – correlation coefficient
  – larger value is better
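The metric formulas were images in the original slides and did not survive extraction; the following is a reconstruction consistent with the descriptions above, where N is the number of essays and E'_i denotes E_i rounded to the nearest of the seven scores (the rounding step is an assumption):

```latex
S_1 = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}\!\left[A_i \neq E'_i\right], \qquad
S_2 = \frac{1}{N}\sum_{i=1}^{N} \left|A_i - E_i\right|, \qquad
S_3 = \frac{1}{N}\sum_{i=1}^{N} \left(A_i - E_i\right)^2,

\mathrm{PC} = \frac{\sum_{i}\,(A_i - \bar{A})(E_i - \bar{E})}
                   {\sqrt{\sum_{i}(A_i - \bar{A})^2}\;\sqrt{\sum_{i}(E_i - \bar{E})^2}}
```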

94

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (a tuning sketch follows)
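A sketch of this tuning step. The talk uses LIBSVM; scikit-learn's SVR wraps the same library, so it is used here for brevity. The grid of C values and the choice of S2 as the tuning objective are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVR

def tune_regressor(X_train, y_train, X_dev, y_dev):
    # Pick the regularization parameter C that minimizes average absolute
    # error (S2) on the held-out development data.
    best_c, best_s2, best_model = None, float("inf"), None
    for c in [0.01, 0.1, 1, 10, 100]:  # hypothetical grid
        model = SVR(kernel="linear", C=c).fit(X_train, y_train)
        s2 = float(np.mean(np.abs(model.predict(X_dev) - y_dev)))
        if s2 < best_s2:
            best_c, best_s2, best_model = c, s2, model
    return best_model, best_c
```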

103

Results

  System       S1     S2     S3     PC
  Baseline     .517   .368   .234   .233
  Our system   .488   .348   .197   .360

  (S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• Significant differences w.r.t. the baseline

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – train a regressor on all but one type of features

118

Feature Ablation Results

[Figure: for each scoring metric (S1, S2, S3, PC), the seven feature types — PA keywords, TC keywords, n-grams, annotated LDA topics, unlabeled LDA topics, RI, and predicted TC errors — ranked from most important to least important]

• The relative importance of features does not always remain consistent if we measure performance in different ways
• But... there are features that tend to be more important than the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 19: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

19

Plan for the Talk

Corpus and Annotations

Approach for scoring prompt adherence

bull Evaluation

19

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

• Motivation: the features introduced so far have trouble
  identifying topics that are related to, but not explicitly
  mentioned in, the prompt

  All armies should consist entirely of professional soldiers;
  there is no value in a system of military service.

• An essay containing words like "peace", "patriotism", or
  "training" is probably not digressing and should not be
  penalized for discussing these topics
  – But the various measures of keyword similarity might not
    notice that anything related to the prompt is discussed
  – this might have effects like lowering the RI similarity scores

57

How to create LDA features

For each prompt:
1. collect all the essays in the ICLE corpus written in response
   to it, not just those we labeled
2. build an LDA model of 1000 topics
   – a soft clustering of the words into 1000 sets
     • E.g., for the most frequent topic for the military prompt,
       the five most important words are
       "man", "military", "service", "pay", and "war"
   – the model can tell us how much an essay spends on each topic
     • E.g., 25% Topic 1, 45% Topic 2, 15% Topic 3, …, 5% Topic 1000
3. construct 1000 features, one for each topic
   • Feature value encodes how much of the essay was spent
     discussing the topic
   • E.g., an essay written for the military prompt might spend
     55% on {"fully", "count", "ordinary", "czech", "day"} and
     45% on {"man", "military", "service", "pay", "war"}

64
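A sketch of how such per-essay topic-proportion features could be derived with gensim; the 1000-topic setting comes from the slide, while the library choice, tokenization, and names are illustrative assumptions.

```python
from gensim import corpora, models

def lda_topic_features(tokenized_essays, num_topics=1000):
    """One topic-proportion feature vector per essay (sketch).

    tokenized_essays: list of token lists, one per essay written
    for the same prompt (labeled or not).
    """
    dictionary = corpora.Dictionary(tokenized_essays)
    bows = [dictionary.doc2bow(tokens) for tokens in tokenized_essays]
    lda = models.LdaModel(bows, id2word=dictionary, num_topics=num_topics)

    features = []
    for bow in bows:
        vec = [0.0] * num_topics
        # minimum_probability=0 so every topic's share is reported
        for topic_id, share in lda.get_document_topics(
                bow, minimum_probability=0.0):
            vec[topic_id] = float(share)
        features.append(vec)
    return features
```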

4 Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able
  to distinguish an infrequent topic that is adherent to the
  prompt from one that is an irrelevant digression
  – An infrequent topic may not appear enough in the training
    set for the regressor to make this judgment
• Create manually annotated LDA features

67

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an
   LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it
   with a number from 0 to 5 representing how likely it is that
   the topic is adherent to the prompt
   • higher score = more adherent
3. For each essay, create 10 features from the labeled topics

71

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of
  topics annotated with the number 0, 1, …, 4, respectively
• Five features encode the sum of contributions to an essay of
  topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively
• These features should give the regressor a better idea of how
  much of an essay is composed of prompt-related vs.
  prompt-unrelated discussions

73
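A minimal sketch of these ten aggregate features, assuming each essay's 100 topic shares and the hand-assigned 0-5 adherence labels are already available:

```python
def annotated_topic_features(topic_shares, topic_labels):
    """The 10 aggregate features from hand-annotated topics (sketch).

    topic_shares: 100 floats, the share of the essay assigned to
                  each LDA topic (sums to ~1)
    topic_labels: 100 ints in 0..5, the hand-annotated adherence
                  label of each topic
    """
    exact = [0.0] * 5     # sums for topics labeled exactly 0..4
    at_least = [0.0] * 5  # sums for topics labeled >= 1 .. >= 5
    for share, label in zip(topic_shares, topic_labels):
        if label <= 4:
            exact[label] += share
        for threshold in range(1, 6):
            if label >= threshold:
                at_least[threshold - 1] += share
    return exact + at_least
```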

5 Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring
  (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that
    detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

76

Features based on Error Types

• Introduced features for prompt adherence scoring that encode
  the error types an essay contains
  – Though each essay was manually annotated with the errors it
    contains, in a realistic setting we won't have access to
    these manual annotations
• Predict which of the 5 error types an essay contains
  – Recast as a multi-label classification task

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of
  each error type

80
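One common way to realize such a multi-label setup is one independent binary classifier per error type; the sketch below uses scikit-learn as an assumed stand-in, since the talk does not name the learner used for this step.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

def predicted_error_features(X_train, Y_train, X_test):
    """Binary predicted-error-type features (sketch).

    Y_train: (n_train x 5) binary indicator matrix of gold error
             types; the returned (n_test x 5) matrix supplies one
             presence/absence feature per error type.
    """
    # One independent binary classifier per error type
    classifier = OneVsRestClassifier(LinearSVC())
    classifier.fit(X_train, Y_train)
    return classifier.predict(X_test)
```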

6 N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain

81
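A sketch of such a selection step; scikit-learn's mutual information scorer is used here as a stand-in for information gain (the two are closely related), and all names are illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def top_ngram_features(lemmatized_essays, labels, k=10000):
    """Select the k most informative 1-3 grams (sketch).

    lemmatized_essays: list of whitespace-joined lemmatized essays
    labels: per-essay (discretized) prompt adherence scores
    """
    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)
    selector = SelectKBest(mutual_info_classif, k=min(k, X.shape[1]))
    X_selected = selector.fit_transform(X, labels)
    return X_selected, vectorizer, selector
```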

Summary of Features

• Baseline features
  – Random Indexing features
• Six types of new features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

82

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

84

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and
  the estimated scores Ei:
  – S1 (probability of error): the probability that the system
    predicts the wrong score out of the 7 possible scores
    (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from
    far misses
  – S3 (average squared error): prefers systems whose estimates
    are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3
  – error metrics
  – smaller value is better
• PC
  – correlation coefficient
  – larger value is better

94
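The metric formulas on these slides were images and did not survive extraction; under the standard definitions implied by the slide annotations, the three error metrics over N test essays can be written as:

```latex
S_1 = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\!\left[ A_i \neq E_i \right],
\qquad
S_2 = \frac{1}{N} \sum_{i=1}^{N} \left| A_i - E_i \right|,
\qquad
S_3 = \frac{1}{N} \sum_{i=1}^{N} \left( A_i - E_i \right)^2
```

where A_i is the annotated score and E_i the estimated score of essay i.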

Regressor Training

• SVM regressors are trained to maximize performance w.r.t.
  each scoring metric by tuning the regularization parameter on
  held-out development data

95
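A sketch of this tuning loop; scikit-learn's SVR (which wraps LIBSVM) stands in for the talk's LIBSVM setup, the candidate C grid is an assumption, and for PC one would maximize the metric rather than minimize it.

```python
from sklearn.svm import SVR

def train_tuned_regressor(X_train, y_train, X_dev, y_dev, error_metric):
    """Tune the regularization parameter C on held-out data (sketch).

    error_metric: callable (y_true, y_pred) -> float for one of
                  S1, S2, S3 (lower is better).
    """
    best_model, best_error = None, float("inf")
    for C in (0.01, 0.1, 1.0, 10.0, 100.0):  # assumed grid
        model = SVR(kernel="linear", C=C).fit(X_train, y_train)
        error = error_metric(y_dev, model.predict(X_dev))
        if error < best_error:
            best_model, best_error = model, error
    return best_model
```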

Results

System         S1    S2    S3    PC
Baseline (RI)  .517  .368  .234  .233
Our system     .488  .348  .197  .360

• S1: probability of error; S2: average absolute error;
  S3: average squared error; PC: Pearson's correlation coefficient
• Improvements w.r.t. all four scoring metrics
• Significant differences

104

Feature Ablation

• Goal: examine how much impact each of the feature types has
  on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features

105
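The ablation protocol amounts to a leave-one-feature-type-out loop; a minimal sketch, with the training and evaluation routine left abstract:

```python
def feature_ablation(feature_groups, train_and_evaluate):
    """Leave-one-feature-type-out ablation (sketch).

    feature_groups: dict mapping a feature-type name to its
                    feature block
    train_and_evaluate: callable that trains a regressor on the
                        given blocks and returns its scores under
                        S1, S2, S3, and PC
    """
    results = {}
    for held_out in feature_groups:
        kept = {name: block for name, block in feature_groups.items()
                if name != held_out}
        results[held_out] = train_and_evaluate(kept)
    return results
```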

Feature Ablation Results

[Table: for each scoring metric (S1, S2, S3, PC), the seven
feature types (PA keywords, TC keywords, n-grams, RI, unlabeled
LDA topics, annotated LDA topics, predicted TC errors) ranked
from most important to least important when ablated]

• Relative importance of features does not always remain
  consistent if we measure performance in different ways
• But… there are features that tend to be more important than
  the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

118

Summary

• Examined the problem of prompt adherence scoring in student
  essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 20: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

20

bull recast problem as a linear regression task

bull train one regressor per prompt

Approach for Scoring Prompt Adherence

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 21: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

21

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

Approach for Scoring Prompt Adherence

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3. LDA Topics

• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt

"All armies should consist entirely of professional soldiers: there is no value in a system of military service."

• An essay containing words like "peace", "patriotism", or "training" is probably not a digression and should not be penalized for discussing these topics
  – But the various measures of keyword similarities might not notice that anything related to the prompt is discussed
  – this might have effects like lowering the RI similarity scores

How to create LDA features

For each prompt:
1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. build an LDA model of 1000 topics
   – Soft clustering of the words into 1000 sets
   – E.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"
   – Model can tell us how much an essay spends on each topic
   – E.g.: 25% Topic 1, 45% Topic 2, 15% Topic 3, ……, 5% Topic 1000
3. construct 1000 features, one for each topic
   – Feature value encodes how much of the essay was spent discussing the topic
   – E.g., an essay written for the military prompt might spend 55% on the topic {"fully", "count", "ordinary", "czech", "day"} and 45% on the topic {"man", "military", "service", "pay", "war"}
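A sketch of how such per-topic proportion features could be built; gensim is an assumed tool choice, since the talk does not name an LDA implementation:

from gensim import corpora, models

def lda_topic_features(tokenized_essays, num_topics=1000):
    dictionary = corpora.Dictionary(tokenized_essays)
    bows = [dictionary.doc2bow(doc) for doc in tokenized_essays]
    lda = models.LdaModel(bows, num_topics=num_topics, id2word=dictionary)
    features = []
    for bow in bows:
        vec = [0.0] * num_topics
        # proportion of the essay spent on each topic
        for topic_id, prop in lda.get_document_topics(bow, minimum_probability=0.0):
            vec[topic_id] = prop
        features.append(vec)
    return features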

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression
  – An infrequent topic may not appear enough in the training set for the regressor to make this judgment
• Create manually annotated LDA features

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
   • higher score = more adherent
3. For each essay, create 10 features from the labeled topics

10 Features from the Labeled Topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, respectively
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions
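A minimal sketch of these 10 feature values; topic_props (topic id → proportion of the essay) and topic_labels (topic id → 0–5 adherence annotation) are assumed representations:

def annotated_lda_features(topic_props, topic_labels):
    exact = [0.0] * 5     # contributions of topics labeled exactly 0, 1, ..., 4
    at_least = [0.0] * 5  # contributions of topics labeled >= 1, >= 2, ..., >= 5
    for topic_id, prop in topic_props.items():
        label = topic_labels[topic_id]
        if label <= 4:
            exact[label] += prop
        for k in range(1, 6):
            if label >= k:
                at_least[k - 1] += prop
    return exact + at_least  # 10 feature values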

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• Predict which of the 5 error types an essay contains
  – Recast as a multi-label classification task

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type (see the sketch below)
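One standard way to realize the multi-label recast is one binary classifier per error type; the sketch below is a hedged illustration (scikit-learn and LinearSVC are assumed choices, not necessarily the talk's classifier):

from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

def predicted_error_features(X_train, Y_train, X_test):
    # Y_train: binary indicator matrix with one column per error type
    # (Confusing Phrasing, Missing Details, Writer Position,
    #  Incomplete Prompt Response, Relevance to Prompt)
    clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)
    return clf.predict(X_test)  # 5 binary features per essay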

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain
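A sketch of the selection step; scikit-learn is an assumed tool choice, mutual information stands in for information gain, and the essays are assumed to be lemmatized beforehand:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def ngram_features(lemmatized_essays, labels, k=10000):
    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)  # uni-, bi-, and trigrams
    selector = SelectKBest(mutual_info_classif, k=k)  # keep the 10K most informative
    return selector.fit_transform(X, labels)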

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross validation

Scoring Metrics

• Define 4 evaluation metrics, computed over the annotated scores Ai and the estimated scores Ei:
  – S1 (probability of error): the probability that a system predicts the wrong score, out of 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from far misses
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3
  – error metrics
  – smaller value is better
• PC
  – correlation coefficient
  – larger value is better
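The metric formulas were rendered as images in the original slides; the LaTeX below is a reconstruction from the verbal definitions above, where Ai is the annotated score, Ei the estimated score, and averaging over the n test essays is an assumption:

S_1 = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}[E_i \neq A_i], \quad
S_2 = \frac{1}{n}\sum_{i=1}^{n} |E_i - A_i|, \quad
S_3 = \frac{1}{n}\sum_{i=1}^{n} (E_i - A_i)^2, \quad
PC = \frac{\mathrm{cov}(A, E)}{\sigma_A \, \sigma_E}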

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data
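A sketch of that tuning loop, with scikit-learn's SVR standing in for the LIBSVM regressor (an assumed substitution) and an error metric where smaller is better:

from sklearn.svm import SVR

def tune_regressor(X_train, y_train, X_dev, y_dev, metric, c_grid=(0.1, 1, 10, 100)):
    best_c, best_err = None, None
    for c in c_grid:
        model = SVR(C=c).fit(X_train, y_train)
        err = metric(y_dev, model.predict(X_dev))  # e.g., average absolute error
        if best_err is None or err < best_err:
            best_c, best_err = c, err
    return SVR(C=best_c).fit(X_train, y_train)  # (for PC, prefer the larger value instead)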

Results

System       S1    S2    S3    PC
Baseline    .517  .368  .234  .233
Our system  .488  .348  .197  .360

(S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features (see the sketch below)
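A sketch of the ablation loop; feature_groups (feature type name → column indices) and train_eval (a helper running the 5-fold cross-validation described earlier) are assumed names:

import numpy as np

def ablation(X, y, feature_groups, train_eval):
    results = {}
    for name, cols in feature_groups.items():
        X_reduced = np.delete(X, cols, axis=1)  # drop this feature type's columns
        results[name] = train_eval(X_reduced, y)
    return results  # compare each entry against train_eval(X, y) on all features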

Feature Ablation Results

(each row orders the feature types from least important to most important when ablated; TC = thesis clarity, PA = prompt adherence)

S1: PA keywords, Annotated LDA topics, N-grams, Predicted TC errors, RI, Unlabeled LDA topics, TC keywords
S2: Predicted TC errors, Unlabeled LDA topics, PA keywords, RI, TC keywords, Annotated LDA topics, N-grams
S3: TC keywords, Predicted TC errors, PA keywords, Unlabeled LDA topics, RI, Annotated LDA topics, N-grams
PC: PA keywords, Unlabeled LDA topics, Predicted TC errors, RI, N-grams, TC keywords, Annotated LDA topics

• Relative importance of the features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – of middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 22: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

22

bull recast problem as a linear regression task

bull train one regressor per prompt

ndash common problems students have writing essays for one

prompt may not apply to essays written for another

bull one training instance per training essay

ndash ldquoclassrdquo value prompt adherence score

ndash learner LIBSVM

ndash features 7 feature types

Approach for Scoring Prompt Adherence

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 23: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

23

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3. LDA Topics

• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt

  All armies should consist entirely of professional soldiers: there is no value in a system of military service.

• An essay containing words like "peace", "patriotism", or "training" is probably not digressing and should not be penalized for discussing these topics
  – But the various measures of keyword similarity might not notice that anything related to the prompt is discussed
  – This might have effects like lowering the RI similarity scores

How to create LDA features

For each prompt:
1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. build an LDA model of 1000 topics
   – a soft clustering of the words into 1000 sets
   – e.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"
   – the model can tell us how much of an essay is spent on each topic, e.g., 25% Topic 1, 45% Topic 2, 15% Topic 3, ..., 5% Topic 1000
3. construct 1000 features, one for each topic
   – the feature value encodes how much of the essay was spent discussing the topic
   – e.g., an essay written for the military prompt might spend 55% on the topic ("fully", "count", "ordinary", "czech", "day") and 45% on the topic ("man", "military", "service", "pay", "war")
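One plausible way to realize these features, sketched with the gensim toolkit (the talk does not name an implementation; tokenization and variable names are assumptions):

from gensim import corpora, models

# tokenized_essays: list of token lists, one per ICLE essay for this prompt
dictionary = corpora.Dictionary(tokenized_essays)
bow_corpus = [dictionary.doc2bow(toks) for toks in tokenized_essays]
lda = models.LdaModel(bow_corpus, num_topics=1000, id2word=dictionary)

def lda_topic_features(essay_tokens):
    # 1000 values: the share of the essay devoted to each topic
    feats = [0.0] * 1000
    bow = dictionary.doc2bow(essay_tokens)
    for topic_id, share in lda.get_document_topics(bow, minimum_probability=0.0):
        feats[topic_id] = share
    return feats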

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression
  – An infrequent topic may not appear often enough in the training set for the regressor to make this judgment
• Create manually annotated LDA features

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
   • higher score = more adherent
3. For each essay, create 10 features from the labeled topics

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, ..., 4, respectively
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, ..., ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions
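A minimal sketch of these ten aggregates, assuming each essay is represented as a topic_share map (topic id to fraction of the essay) and label maps each topic id to its 0-5 annotation; the names are illustrative:

def annotated_lda_features(topic_share, label):
    exact = [0.0] * 5    # sums for topics labeled exactly 0, 1, ..., 4
    atleast = [0.0] * 5  # sums for topics labeled >= 1, >= 2, ..., >= 5
    for topic_id, share in topic_share.items():
        lab = label[topic_id]
        if lab <= 4:
            exact[lab] += share
        for k in range(1, 6):
            if lab >= k:
                atleast[k - 1] += share
    return exact + atleast  # 10 feature values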

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• Predict which of the 5 error types an essay contains
  – Recast as a multi-label classification task

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type
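One way to set up the multi-label prediction, sketched with scikit-learn's one-vs-rest wrapper as a stand-in for whatever learner the authors used; X_train, Y_train, and X_test are assumed placeholders:

from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

# Y_train is a binary indicator matrix with one column per error type.
clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)
predicted = clf.predict(X_test)  # each row yields 5 binary "error type" features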

6. N-gram features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain
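A sketch of the selection step, under the assumption that information gain is computed against the discrete adherence scores; scikit-learn's mutual information scorer is used here as an approximation, and the variable names are illustrative:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

vec = CountVectorizer(ngram_range=(1, 3), binary=True)
X = vec.fit_transform(lemmatized_essays)              # uni/bi/trigram features
selector = SelectKBest(mutual_info_classif, k=10000)  # keep top 10K n-grams
X_top = selector.fit_transform(X, adherence_scores)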

Summary of Features

• Baseline features
  – Random indexing features
• Six types of new features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross validation
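For concreteness, the protocol might be wired up as in this sketch (fold construction, feature_matrix, and evaluate are assumptions; the talk states only that 5-fold cross validation is used):

from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(feature_matrix):
    # train on four folds (reserving some essays as development data for
    # tuning), then score predictions on the held-out fold
    evaluate(train_idx, test_idx)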

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei:
  – S1 (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from far misses
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3 are error metrics: a smaller value is better
• PC is a correlation coefficient: a larger value is better
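Spelled out, one plausible formalization of the four metrics over N test essays, with annotated scores A_i and estimated scores E_i, is the following (the talk gives only verbal definitions; rounding E_i to the nearest of the seven scores in S1 is an assumption):

S_1 = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\big[ A_i \neq \mathrm{round}(E_i) \big]
S_2 = \frac{1}{N} \sum_{i=1}^{N} \lvert A_i - E_i \rvert
S_3 = \frac{1}{N} \sum_{i=1}^{N} ( A_i - E_i )^2
PC  = \frac{\sum_i (A_i - \bar{A})(E_i - \bar{E})}{\sqrt{\sum_i (A_i - \bar{A})^2} \, \sqrt{\sum_i (E_i - \bar{E})^2}}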

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data
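A sketch of that tuning loop, with scikit-learn's SVR standing in for whatever SVM implementation the authors used; the grid of C values and the function names are assumptions:

from sklearn.svm import SVR

def tune_regressor(X_train, y_train, X_dev, y_dev, metric, c_grid=(0.1, 1, 10, 100)):
    # Pick the regularization parameter C that optimizes the given metric on
    # the development data (for the error metrics S1-S3 smaller is better;
    # for PC the comparison would be flipped).
    best_score, best_reg = None, None
    for c in c_grid:
        reg = SVR(C=c).fit(X_train, y_train)
        score = metric(y_dev, reg.predict(X_dev))
        if best_score is None or score < best_score:
            best_score, best_reg = score, reg
    return best_reg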

Results

System       S1     S2     S3     PC
Baseline    .517   .368   .234   .233
Our system  .488   .348   .197   .360

(S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient; smaller is better for S1-S3, larger is better for PC)

• Improvements w.r.t. all four scoring metrics
• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features
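The ablation protocol amounts to the loop below (build_features, train_and_eval, and the feature-type names are illustrative placeholders, not the authors' code):

FEATURE_TYPES = ["RI", "n-grams", "TC keywords", "PA keywords",
                 "unlabeled LDA topics", "annotated LDA topics",
                 "predicted TC errors"]

def ablation(build_features, train_and_eval, metrics):
    results = {}
    for held_out in FEATURE_TYPES:
        kept = [f for f in FEATURE_TYPES if f != held_out]
        X = build_features(kept)                 # feature matrix without one type
        results[held_out] = {m: train_and_eval(X, m) for m in metrics}
    return results                               # bigger drop => more important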

Feature Ablation Results

Feature types ordered by importance when ablated, from most important to least important, for each scoring metric:

S1: TC keywords > unlabeled LDA topics > RI > predicted TC errors > n-grams > annotated LDA topics > PA keywords
S2: n-grams > annotated LDA topics > TC keywords > RI > PA keywords > unlabeled LDA topics > predicted TC errors
S3: n-grams > annotated LDA topics > RI > unlabeled LDA topics > PA keywords > predicted TC errors > TC keywords
PC: annotated LDA topics > TC keywords > n-grams > RI > predicted TC errors > unlabeled LDA topics > PA keywords

• The relative importance of features does not always remain consistent if we measure performance in different ways
• But there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 24: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

24

Features

bull 7 types of features

ndash baseline features

ndash 6 types of new features

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 25: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

25

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57–63

How to create LDA features

For each prompt:

1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled

2. build an LDA model of 1000 topics

– Soft clustering of the words into 1000 sets

• E.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"

– The model can tell us how much of an essay is spent on each topic

• E.g., 25% Topic 1, 45% Topic 2, 15% Topic 3, …, 5% Topic 1000

3. construct 1000 features, one for each topic

• Feature value encodes how much of the essay was spent discussing the topic

• E.g., an essay written for the military prompt might spend 55% on the topic {"fully", "count", "ordinary", "czech", "day"} and 45% on the topic {"man", "military", "service", "pay", "war"}
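
A sketch of this pipeline using gensim; the 1000-topic setting comes from the slides, while the tokenization and variable names are illustrative:

    from gensim import corpora, models

    def lda_topic_features(prompt_essays, num_topics=1000):
        # prompt_essays: list of token lists, all ICLE essays for one prompt
        dictionary = corpora.Dictionary(prompt_essays)
        bows = [dictionary.doc2bow(tokens) for tokens in prompt_essays]
        lda = models.LdaModel(bows, id2word=dictionary, num_topics=num_topics)
        features = []
        for bow in bows:
            vec = [0.0] * num_topics
            # minimum_probability=0.0 makes gensim return every topic
            for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
                vec[topic_id] = prob  # share of the essay spent on this topic
            features.append(vec)
        return lda, features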

64–66

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression

– An infrequent topic may not appear enough in the training set for the regressor to make this judgment

• Create manually annotated LDA features

67–70

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics

2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt

• higher score = more adherent

3. For each essay, create 10 features from the labeled topics

71–72

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, resp.

• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, resp.

• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions
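
A small sketch of these ten features, assuming topic_props is an essay's 100-dimensional topic-proportion vector and topic_labels holds the hand-assigned 0–5 adherence labels:

    import numpy as np

    def annotated_lda_features(topic_props, topic_labels):
        props = np.asarray(topic_props, dtype=float)   # shape (100,)
        labels = np.asarray(topic_labels)              # shape (100,), values 0..5
        # sum of contributions of topics labeled exactly 0, 1, ..., 4
        exact = [props[labels == k].sum() for k in range(5)]
        # sum of contributions of topics labeled >= 1, >= 2, ..., >= 5
        at_least = [props[labels >= k].sum() for k in range(1, 6)]
        return exact + at_least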

73

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we

– score an essay w.r.t. the clarity of its thesis

– determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74–75

5 Types of Thesis Clarity Errors

• Confusing Phrasing

• Missing Details

• Writer Position

• Incomplete Prompt Response

• Relevance to Prompt

76–78

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains

– Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations

• Predict which of the 5 error types an essay contains

– Recast as a multi-label classification task

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type
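
A minimal sketch of this multi-label recasting with scikit-learn; the one-vs-rest linear SVM and the tf-idf representation are illustrative choices, not necessarily the learner used in the original work:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import LinearSVC

    ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
                   "Incomplete Prompt Response", "Relevance to Prompt"]

    def predicted_error_features(train_essays, train_labels, test_essays):
        # train_labels: binary indicator matrix of shape (n_train, 5),
        # one column per error type in ERROR_TYPES
        vec = TfidfVectorizer()
        clf = OneVsRestClassifier(LinearSVC())
        clf.fit(vec.fit_transform(train_essays), train_labels)
        # each row of the output is the 5 binary "error type" features
        return clf.predict(vec.transform(test_essays))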

80

6. N-gram features

• Can capture useful words and phrases related to a prompt

• 10K lemmatized unigrams, bigrams, and trigrams

– selected according to information gain
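
A sketch of the n-gram selection step; scikit-learn's mutual_info_classif is used here as a stand-in for information gain, and lemmatization is assumed to have happened upstream:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import SelectKBest, mutual_info_classif

    def select_ngram_features(lemmatized_essays, scores, k=10000):
        # lemmatized_essays: each essay as a whitespace-joined lemma string
        # scores: prompt adherence scores (7 discrete values) used as labels
        vec = CountVectorizer(ngram_range=(1, 3), binary=True)
        X = vec.fit_transform(lemmatized_essays)
        selector = SelectKBest(mutual_info_classif, k=k)
        X_selected = selector.fit_transform(X, scores)
        return X_selected, vec, selector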

81

Summary of Features

• Baseline features

– Random indexing features

• Six types of features

– Lemmatized n-grams

– Thesis clarity keyword features

– Prompt adherence keyword features

– LDA topics

– Manually annotated LDA topics

– Predicted thesis clarity errors

82

Plan for the Talk

• Corpus and Annotations

• Model for scoring prompt adherence

• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring

• 5-fold cross-validation

84–93

Scoring Metrics

• Define 4 evaluation metrics over N test essays, where Ai is the annotated score and Ei the estimated score of essay i:

S1 = (1/N) Σi 1[Ei ≠ Ai] (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)

S2 = (1/N) Σi |Ai − Ei| (average absolute error): distinguishes near misses from far misses

S3 = (1/N) Σi (Ai − Ei)² (average squared error): prefers systems whose estimations are not too often far away from the correct scores

PC: Pearson's correlation coefficient between the Ai and the Ei

• S1, S2, S3: error metrics; smaller value is better

• PC: correlation coefficient; larger value is better
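
The four metrics are straightforward to compute; a sketch with NumPy (for S1, the estimated scores are assumed to be already rounded to the seven valid values):

    import numpy as np

    def scoring_metrics(annotated, estimated):
        a = np.asarray(annotated, dtype=float)  # Ai, gold scores
        e = np.asarray(estimated, dtype=float)  # Ei, predicted scores
        s1 = np.mean(a != e)               # probability of error
        s2 = np.mean(np.abs(a - e))        # average absolute error
        s3 = np.mean((a - e) ** 2)         # average squared error
        pc = np.corrcoef(a, e)[0, 1]       # Pearson's correlation coefficient
        return s1, s2, s3, pc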

94

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data
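
A sketch of this tuning loop; the slides do not name an implementation, so scikit-learn's SVR, the linear kernel, and the C grid below are all assumptions:

    from sklearn.svm import SVR

    def tune_svm_regressor(X_train, y_train, X_dev, y_dev, metric,
                           c_grid=(0.01, 0.1, 1.0, 10.0, 100.0)):
        # metric(y_true, y_pred) must return a value where larger is better;
        # pass error metrics (S1, S2, S3) negated
        best = None
        for c in c_grid:
            model = SVR(kernel="linear", C=c).fit(X_train, y_train)
            val = metric(y_dev, model.predict(X_dev))
            if best is None or val > best[0]:
                best = (val, c, model)
        return best[2], best[1]  # tuned model and its C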

95–103

Results

System       S1    S2    S3    PC
Baseline     .517  .368  .234  .233
Our system   .488  .348  .197  .360

(Baseline = RI features; S1 = probability of error, S2 = average absolute error, S3 = average squared error, PC = Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics

• Significant differences

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric

– Train a regressor on all but one type of features

105–118

Feature Ablation Results

[Ablation table: for each metric (S1, S2, S3, PC), the seven feature types (TC keywords, PA keywords, N-grams, RI, unlabeled LDA topics, annotated LDA topics, predicted TC errors) ranked from most important to least important.]

• Relative importance of the feature types does not always remain consistent if we measure performance in different ways

• But… there are feature types that tend to be more important than the others in the presence of the other features

• most important: TC keywords, n-grams, annotated LDA topics

• middling importance: RI, unlabeled LDA topics

• least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays

– feature-rich approach

• Released the annotations

– prompt adherence scores

– prompt adherence keywords

– manually annotated LDA topics

Page 26: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

26

Baseline Features

bull Features based on Random Indexing (RI)

ndash adapted from Higgins et al (2004)

bull Random indexing

ndash ldquoan efficient and scalable alternative to LSIrdquo (Sahlgren

2005)

ndash generates a semantic similarity measure between any

two words

ndash generalized to computing similarity between two

groups of words (Higgins amp Burstein 2007)

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 27: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

27

Why Random Indexing (RI)

bull May help find text in essay related to the prompt

even if some of its words have been rephrased

ndash Eg essay talks about ldquojailrdquo while prompt has ldquoprisonrdquo

bull Train a RI model on the English Gigaword

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression.
  – An infrequent topic may not appear often enough in the training set for the regressor to make this judgment.
• Create manually annotated LDA features.

67

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics.
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt.
   • Higher score, more adherent.
3. For each essay, create 10 features from the labeled topics.

71

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, respectively.
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively.
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions.

73
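A sketch of the 10-feature computation (hypothetical helper; topic_dist maps each of the 100 topics to its share of the essay, and topic_labels maps each topic to its hand-assigned 0-5 adherence label):

def annotated_lda_features(topic_dist, topic_labels):
    feats = []
    # Five features: total contribution of topics labeled exactly 0, 1, ..., 4.
    for label in range(5):
        feats.append(sum(p for t, p in topic_dist.items()
                         if topic_labels[t] == label))
    # Five features: total contribution of topics labeled >= 1, >= 2, ..., >= 5.
    for threshold in range(1, 6):
        feats.append(sum(p for t, p in topic_dist.items()
                         if topic_labels[t] >= threshold))
    return feats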

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we:
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

76

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains.
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations.
• Predict which of the 5 error types an essay contains.
  – Recast as a multi-label classification task.

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type.

80
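One way to realize this step, assuming scikit-learn's one-vs-rest wrapper over five binary classifiers (the talk does not specify the learner used for error prediction):

from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

def predicted_error_type_features(X_train, Y_train, X_test):
    # Y_train: (n_essays, 5) binary matrix of gold error-type annotations.
    clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)
    # Each column of the prediction is one binary "error type" feature.
    return clf.predict(X_test)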

6. N-gram Features

• Can capture useful words and phrases related to a prompt.
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain

81
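A sketch of the selection step, assuming scikit-learn, with mutual information standing in for the information-gain criterion (lemmatization is assumed to happen upstream):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def ngram_features(lemmatized_essays, labels, k=10000):
    # Binary presence of lemmatized unigrams, bigrams, and trigrams.
    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)
    # Keep the 10K n-grams that are most informative about the scores.
    selector = SelectKBest(mutual_info_classif, k=min(k, X.shape[1]))
    return selector.fit_transform(X, labels), vectorizer, selector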

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

82

Plan for the Talk

Corpus and Annotations
Model for scoring prompt adherence
Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

84

Scoring Metrics

• Define 4 evaluation metrics, comparing the annotated scores Ai with the estimated scores Ei:
  – S1 (probability of error): the probability that a system predicts the wrong score, out of 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from far misses
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better

94
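In symbols, a plausible formalization of the four metrics consistent with the descriptions above, with $A_i$ the annotated and $E_i$ the estimated score of essay $i$ among $N$ essays (for $S_1$, the estimate is taken as the nearest of the 7 possible scores):

\[
S_1 = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}\,[A_i \neq E_i], \qquad
S_2 = \frac{1}{N}\sum_{i=1}^{N} \lvert A_i - E_i \rvert, \qquad
S_3 = \frac{1}{N}\sum_{i=1}^{N} (A_i - E_i)^2,
\]
\[
PC = \frac{\sum_{i}(A_i - \bar{A})(E_i - \bar{E})}
          {\sqrt{\sum_{i}(A_i - \bar{A})^2}\,\sqrt{\sum_{i}(E_i - \bar{E})^2}}.
\]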

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data.

95
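A sketch of the tuning loop, assuming scikit-learn's SVR (the talk says SVM regression but names no toolkit; the C grid is illustrative):

from sklearn.svm import SVR

def train_tuned_svr(X_train, y_train, X_dev, y_dev, error_metric):
    # error_metric(gold, predicted) -> value where smaller is better;
    # when tuning for PC, pass the negated correlation.
    best_score, best_model = None, None
    for C in [0.01, 0.1, 1.0, 10.0, 100.0]:
        model = SVR(kernel="linear", C=C).fit(X_train, y_train)
        score = error_metric(y_dev, model.predict(X_dev))
        if best_score is None or score < best_score:
            best_score, best_model = score, model
    return best_model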

Results

System        S1      S2      S3      PC
Baseline      .517    .368    .234    .233
Our system    .488    .348    .197    .360

(Baseline = the RI features; S1 = probability of error, S2 = average absolute error, S3 = average squared error, PC = Pearson's correlation coefficient. Smaller is better for S1–S3; larger is better for PC.)

• Improvements w.r.t. all four scoring metrics
• Significant differences

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric.
  – Train a regressor on all but one type of features (sketched below).

105
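The ablation protocol as a short sketch (feature_groups is a hypothetical mapping from each feature type to its column indices; train_eval runs one cross-validated train-and-score pass):

def ablate(X, y, feature_groups, train_eval):
    results = {}
    for name, cols in feature_groups.items():
        drop = set(cols)
        keep = [j for j in range(X.shape[1]) if j not in drop]
        # Train the regressor on all but this one type of features.
        results[name] = train_eval(X[:, keep], y)
    return results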

Feature Ablation Results

Each row ranks the seven feature types for one metric, from least important (left) to most important (right):

S1: PA keywords | Annotated LDA topics | N-grams | Predicted TC errors | RI | Unlabeled LDA topics | TC keywords
S2: Predicted TC errors | Unlabeled LDA topics | PA keywords | RI | TC keywords | Annotated LDA topics | N-grams
S3: TC keywords | Predicted TC errors | PA keywords | Unlabeled LDA topics | RI | Annotated LDA topics | N-grams
PC: PA keywords | Unlabeled LDA topics | Predicted TC errors | RI | N-grams | TC keywords | Annotated LDA topics

• The relative importance of features does not always remain consistent if we measure performance in different ways.
• But… there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 28: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

28

5 Random Indexing Features

bull The entire essayrsquos similarity to the prompt

bull The essayrsquos highest individual sentencersquos similarity to

the prompt

bull The highest entire essay similarity to one of the

prompt sentences

bull The highest individual sentence similarity to an

individual prompt sentence

bull The essayrsquos similarity to a manually rewritten version

of the prompt that excludes extraneous material

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 29: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

29

Features

bull 7 types of features

ndash baseline features

ndash 6 types of features

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type.
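A minimal sketch of the multi-label recast using scikit-learn's one-vs-rest wrapper (an assumption; the talk does not say how the classifiers are trained). `X_train`, `Y_train`, and `X_test` are illustrative names; `Y_train` is an essays × 5 binary matrix over the error types:

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# One binary classifier per error type, trained on the manually
# annotated essays.
error_clf = OneVsRestClassifier(LinearSVC())
error_clf.fit(X_train, Y_train)

# Each row holds the 5 binary "error type" features for one test essay.
predicted_error_features = error_clf.predict(X_test)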

6 N-gram features

• Can capture useful words and phrases related to a prompt.
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain
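A sketch of this feature type, assuming scikit-learn and pre-lemmatized essay strings; mutual information stands in for information gain here, which is an approximation rather than the authors' exact selection criterion:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Lemmatized unigrams, bigrams, and trigrams.
vectorizer = CountVectorizer(ngram_range=(1, 3))
X_ngrams = vectorizer.fit_transform(lemmatized_essays)

# Keep the 10K n-grams that are most informative about the essay scores.
selector = SelectKBest(mutual_info_classif, k=10000)
X_selected = selector.fit_transform(X_ngrams, essay_scores)
```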

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

Scoring Metrics

• Define 4 evaluation metrics, where Ai is the annotated score and Ei is the estimated score of the i-th of N essays:
  – S1 (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4):
      S1 = (1/N) Σi 1[Ai ≠ Ei]
  – S2 (average absolute error): distinguishes near misses from far misses:
      S2 = (1/N) Σi |Ai − Ei|
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores:
      S3 = (1/N) Σi (Ai − Ei)²
  – PC: Pearson's correlation coefficient between the Ai and the Ei
• S1, S2, S3 are error metrics: a smaller value is better.
• PC is a correlation coefficient: a larger value is better.
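The four metrics are straightforward to compute; a sketch with numpy/scipy, where `A` holds the annotated scores and `E` the system's estimates (snapped to the 7 valid scores when computing S1):

```python
import numpy as np
from scipy.stats import pearsonr

def scoring_metrics(A, E):
    S1 = np.mean(A != E)           # probability of error
    S2 = np.mean(np.abs(A - E))    # average absolute error
    S3 = np.mean((A - E) ** 2)     # average squared error
    PC = pearsonr(A, E)[0]         # Pearson's correlation coefficient
    return S1, S2, S3, PC
```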

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data.
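A rough sketch of this regime (the talk does not name an SVM package; scikit-learn's SVR is used here for illustration, reusing `scoring_metrics` from the sketch above). Predictions are snapped to the nearest half-point score before computing S1:

```python
import numpy as np
from sklearn.svm import SVR

def snap(scores):
    # Round continuous predictions to the 7 valid scores 1, 1.5, ..., 4.
    return np.clip(np.round(scores * 2) / 2, 1.0, 4.0)

# Tune the regularization parameter C for one scoring metric (here S1)
# on held-out development data; repeat per metric.
best_C, best_err = None, float("inf")
for C in [0.01, 0.1, 1, 10, 100]:
    reg = SVR(C=C).fit(X_train, y_train)
    err = scoring_metrics(y_dev, snap(reg.predict(X_dev)))[0]
    if err < best_err:
        best_C, best_err = C, err
```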

Results

System                   S1     S2     S3     PC
Baseline (RI features)   .517   .368   .234   .233
Our system               .488   .348   .197   .360

(S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• Significant differences over the baseline

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric.
  – Train a regressor on all but one type of features (see the sketch below).
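A sketch of the ablation loop, reusing the helpers above; `feature_groups` is an illustrative name for a map from each feature-type name to its column indices in a dense feature matrix:

```python
import numpy as np
from sklearn.svm import SVR

# Drop one feature type at a time, retrain, and score.
for held_out, cols in feature_groups.items():
    keep = np.setdiff1d(np.arange(X_train.shape[1]), cols)
    reg = SVR(C=best_C).fit(X_train[:, keep], y_train)
    S1, S2, S3, PC = scoring_metrics(y_test, snap(reg.predict(X_test[:, keep])))
    print(f"without {held_out}: S1={S1:.3f} S2={S2:.3f} S3={S3:.3f} PC={PC:.3f}")
```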

Feature Ablation Results

[Table: for each scoring metric (S1, S2, S3, PC), the seven feature types — TC keywords, PA keywords, n-grams, RI, unlabeled LDA topics, annotated LDA topics, predicted TC errors — ranked from most important to least important.]

• The relative importance of the features does not always remain consistent if we measure performance in different ways.
• But there are features that tend to be more important than the others in the presence of the other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 30: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

30

1 Thesis Clarity Keyword Features

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 31: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

31

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

System        S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(S1 = probability of error, S2 = average absolute error, S3 = average squared error, PC = Pearson's correlation coefficient; the baseline uses the RI features)

• Improvements w.r.t. all four scoring metrics
• Significant differences
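
For concreteness, a small helper (my sketch, not from the talk) that computes the four numbers reported above from vectors of annotated and estimated scores; snapping the estimates to the 7-point scale before computing S1 is an assumption:

```python
import numpy as np

def score_metrics(annotated, estimated):
    a = np.asarray(annotated, dtype=float)
    e = np.asarray(estimated, dtype=float)
    # snap estimates to the nearest of the 7 scores 1, 1.5, ..., 4 for S1
    e_rounded = np.clip(np.round(e * 2) / 2, 1.0, 4.0)
    s1 = np.mean(a != e_rounded)      # probability of error
    s2 = np.mean(np.abs(a - e))       # average absolute error
    s3 = np.mean((a - e) ** 2)        # average squared error
    pc = np.corrcoef(a, e)[0, 1]      # Pearson's correlation coefficient
    return s1, s2, s3, pc
```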

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features (see the sketch below)
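
A minimal sketch of that leave-one-type-out loop, assuming each feature type is stored as a separate matrix and fixing C for brevity:

```python
import numpy as np
from sklearn.svm import SVR

def ablation(train_blocks, y_train, dev_blocks, y_dev, metric):
    """train_blocks / dev_blocks map a feature-type name (e.g. "TC keywords",
    "n-grams") to its feature matrix; returns the error with each type removed."""
    errors = {}
    for left_out in train_blocks:
        keep = [name for name in sorted(train_blocks) if name != left_out]
        X_train = np.hstack([train_blocks[n] for n in keep])
        X_dev = np.hstack([dev_blocks[n] for n in keep])
        model = SVR(kernel="linear", C=1.0).fit(X_train, y_train)
        errors[left_out] = metric(y_dev, model.predict(X_dev))
    # the larger the error after removing a type, the more important it was
    return errors
```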

Feature Ablation Results

Feature types ranked from most to least important for each scoring metric:

S1: TC keywords > unlabeled LDA topics > RI > predicted TC errors > n-grams > annotated LDA topics > PA keywords
S2: n-grams > annotated LDA topics > TC keywords > RI > PA keywords > unlabeled LDA topics > predicted TC errors
S3: n-grams > annotated LDA topics > RI > unlabeled LDA topics > PA keywords > predicted TC errors > TC keywords
PC: annotated LDA topics > TC keywords > n-grams > RI > predicted TC errors > unlabeled LDA topics > PA keywords

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach

• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 32: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

32

bull introduced in Persing amp Ng (2013) for scoring the

thesis clarity of an essay

bull generated based on thesis clarity keywords

1 Thesis Clarity Keyword Features

refers to how clearly an author explains the thesis of her essay

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 33: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

33

What are Thesis Clarity Keywords

34

What are Thesis Clarity Keywords

bull ldquoimportantrdquo words in a prompt

ndash important word good word for a student to use when

stating her thesis about the prompt

35

How to Identify Keywords

36

How to Identify Keywords

bull By hand

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

Feature Ablation Results

[Figure: for each scoring metric (S1, S2, S3, PC), the seven feature types — TC keywords, PA keywords, N-grams, RI, unlabeled LDA topics, annotated LDA topics, predicted TC errors — arranged from most important to least important]

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling important: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords
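One way to recover per-metric orderings like those in the figure is to rank feature types by how much each metric degrades when the type is removed. A sketch, under the assumption that results comes from the ablation sketch above and full holds the full system's (S1, S2, S3, PC):

    METRICS = ["S1", "S2", "S3", "PC"]

    def importance_orderings(results, full):
        """Feature types from most to least important under each metric."""
        out = {}
        for m, name in enumerate(METRICS):
            sign = -1.0 if name == "PC" else 1.0  # PC: larger is better
            delta = {t: sign * (scores[m] - full[m])
                     for t, scores in results.items()}
            out[name] = sorted(delta, key=delta.get, reverse=True)
        return out

Under this reading, a large positive delta means the system got noticeably worse without that feature type, which matches the bullets above: TC keywords, n-grams, and annotated LDA topics hurt most when removed.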

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics


ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 37: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

37

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 38: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

38

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3. LDA Topics

• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt

All armies should consist entirely of professional soldiers; there is no value in a system of military service.

• An essay containing words like "peace", "patriotism", or "training" is probably not a digression and should not be penalized for discussing these topics
  – but the various measures of keyword similarity might not notice that anything related to the prompt is discussed
  – this might have effects like lowering the RI similarity scores

57

How to create LDA features

For each prompt:
1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. build an LDA model of 1000 topics
   – a soft clustering of the words into 1000 sets
   – e.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"
   – the model can tell us how much of an essay is spent on each topic, e.g., 25% on Topic 1, 45% on Topic 2, 15% on Topic 3, …, 5% on Topic 1000
3. construct 1000 features, one for each topic (sketched below)
   – the feature value encodes how much of the essay was spent discussing the topic
   – e.g., an essay written for the military prompt might spend 55% on the topic ("fully", "count", "ordinary", "czech", "day") and 45% on the topic ("man", "military", "service", "pay", "war")

64
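A minimal sketch of this feature construction, assuming gensim (the talk does not name a toolkit): train one 1000-topic model per prompt, then use each essay's topic proportions as feature values.

```python
# Sketch: per-prompt LDA topic features (gensim is an assumption).
from gensim import corpora, models

def lda_topic_features(tokenized_essays, num_topics=1000):
    """tokenized_essays: all ICLE essays for one prompt, as token lists."""
    dictionary = corpora.Dictionary(tokenized_essays)
    bows = [dictionary.doc2bow(toks) for toks in tokenized_essays]
    lda = models.LdaModel(bows, num_topics=num_topics, id2word=dictionary)
    features = []
    for bow in bows:
        # Proportion of the essay spent on each of the 1000 topics
        dist = dict(lda.get_document_topics(bow, minimum_probability=0.0))
        features.append([dist.get(t, 0.0) for t in range(num_topics)])
    return lda, features
```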

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression
  – an infrequent topic may not appear enough in the training set for the regressor to make this judgment
• Create manually annotated LDA features

67

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt (see the sketch below)
   – higher score = more adherent
3. For each essay, create 10 features from the labeled topics

71
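Continuing the gensim assumption above, step 2's annotation pass can be prepared by dumping each topic's top 10 words for the annotator to score from 0 to 5:

```python
# Sketch: list each topic's top 10 words for manual 0-5 adherence labeling.
def dump_topics_for_annotation(lda, num_topics=100, topn=10):
    for t in range(num_topics):
        words = [w for w, _ in lda.show_topic(t, topn=topn)]
        print(t, " ".join(words))  # annotator records a 0-5 score per topic
```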

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, respectively
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions

73
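A minimal sketch of how these 10 aggregate features can be computed from the hand-labeled topics:

```python
# Sketch of the 10 features built from hand-labeled topics.
# topic_share: topic id -> fraction of the essay spent on that topic
# topic_label: topic id -> hand-assigned adherence score in {0, ..., 5}

def annotated_lda_features(topic_share, topic_label):
    exact = [0.0] * 5     # contributions of topics labeled exactly 0..4
    at_least = [0.0] * 5  # contributions of topics labeled >=1 .. >=5
    for topic, share in topic_share.items():
        label = topic_label[topic]
        if label <= 4:
            exact[label] += share
        for k in range(1, 6):
            if label >= k:
                at_least[k - 1] += share
    return exact + at_least  # 10 feature values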

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

76

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• Predict which of the 5 error types an essay contains
  – Recast as a multi-label classification task

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type (a sketch follows)

80
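A sketch of the multi-label prediction step, assuming scikit-learn with one linear SVM per error type (the classifier choice is an assumption, not the talk's stated method):

```python
# Sketch: predict the 5 thesis clarity error types (multi-label), then
# use the 5 binary predictions as features for prompt adherence scoring.
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

def predicted_error_features(X_train, Y_train, X_test):
    """Y_train: (n_essays, 5) binary matrix of annotated error types."""
    clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)
    return clf.predict(X_test)  # one binary feature per error type
```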

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain (sketched below)

81
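A sketch of the n-gram feature selection, approximating information gain with scikit-learn's mutual information (an assumption; lemmatization is assumed to happen upstream):

```python
# Sketch: select 10K lemmatized uni/bi/trigram features by information gain.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def ngram_features(lemmatized_essays, scores, k=10000):
    vec = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vec.fit_transform(lemmatized_essays)
    y = (2 * np.asarray(scores)).astype(int)  # 7 half-point scores -> classes
    selector = SelectKBest(mutual_info_classif, k=k).fit(X, y)
    return selector.transform(X), vec, selector
```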

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

82

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross validation (see the sketch below)

84
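A minimal sketch of the evaluation loop; train_eval is a hypothetical routine that trains on one fold split and returns the metric scores:

```python
# Sketch: 5-fold cross validation over the annotated essays.
from sklearn.model_selection import KFold

def cross_validate(X, y, train_eval, n_splits=5):
    scores = []
    for tr, te in KFold(n_splits=n_splits, shuffle=True).split(X):
        scores.append(train_eval(X[tr], y[tr], X[te], y[te]))
    return scores
```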

Scoring Metrics

• Define 4 evaluation metrics, where Ai is the annotated score and Ei is the estimated score of essay i (N essays in total):
  – S1 = (1/N) Σi 1[Ai ≠ Ei]  (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 = (1/N) Σi |Ai − Ei|  (average absolute error): distinguishes near misses from far misses
  – S3 = (1/N) Σi (Ai − Ei)²  (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between the Ai and the Ei
• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better

94
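The four metrics translate directly into code (for S1, the estimated scores are assumed to be rounded to the 7 permissible values first):

```python
# The four scoring metrics; A = annotated scores, E = estimated scores.
import numpy as np
from scipy.stats import pearsonr

def scoring_metrics(A, E):
    A, E = np.asarray(A, float), np.asarray(E, float)
    s1 = np.mean(A != E)         # probability of error
    s2 = np.mean(np.abs(A - E))  # average absolute error
    s3 = np.mean((A - E) ** 2)   # average squared error
    pc = pearsonr(A, E)[0]       # Pearson's correlation coefficient
    return s1, s2, s3, pc
```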

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (a sketch follows)

95
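A minimal sketch of the tuning loop, assuming scikit-learn's SVR and a small grid over the regularization parameter C (the grid values are an assumption):

```python
# Sketch: tune an SVM regressor's C on held-out development data,
# once per scoring metric (negate PC so that smaller is better).
from sklearn.svm import SVR

def train_tuned_svr(X_tr, y_tr, X_dev, y_dev, metric, Cs=(0.1, 1, 10, 100)):
    """metric(gold, predicted) -> score, where smaller is better."""
    best, best_score = None, float("inf")
    for C in Cs:
        model = SVR(kernel="linear", C=C).fit(X_tr, y_tr)
        score = metric(y_dev, model.predict(X_dev))
        if score < best_score:
            best, best_score = model, score
    return best
```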

Results

System       S1    S2    S3    PC
Baseline    .517  .368  .234  .233
Our system  .488  .348  .197  .360

• Baseline: RI features
• S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient
• Our system improves w.r.t. all four scoring metrics
• Significant differences between Baseline and Our system

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features (a sketch of the loop follows)

105
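A sketch of the ablation loop; feature_groups maps a feature type's name to its column indices, and train_eval_cv is the hypothetical train-and-score routine from the evaluation sketch:

```python
# Sketch: retrain leaving out one feature group at a time.
def ablation(X, y, feature_groups, train_eval_cv):
    """train_eval_cv(X, y) -> dict of metric scores (S1, S2, S3, PC)."""
    results = {"all features": train_eval_cv(X, y)}
    for name, cols in feature_groups.items():
        keep = [j for j in range(X.shape[1]) if j not in set(cols)]
        results["all - " + name] = train_eval_cv(X[:, keep], y)
    return results
```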

Feature Ablation Results

Feature types ranked per scoring metric, from least important (left) to most important (right) when ablated:

S1: PA keywords, Annotated LDA topics, N-grams, Predicted TC errors, RI, Unlabeled LDA topics, TC keywords
S2: Predicted TC errors, Unlabeled LDA topics, PA keywords, RI, TC keywords, Annotated LDA topics, N-grams
S3: TC keywords, Predicted TC errors, PA keywords, Unlabeled LDA topics, RI, Annotated LDA topics, N-grams
PC: PA keywords, Unlabeled LDA topics, Predicted TC errors, RI, N-grams, TC keywords, Annotated LDA topics

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 39: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

39

Hand-Selecting Keywords

bull Hand-segment each multi-part prompt into parts

bull For each part hand-pick the most important (primary)

and second most important (secondary) words that it

would be good for a writer to use to address the part

The prison system is outdated No civilized society

should punish its criminals it should rehabilitate them

Primary rehabilitate

Secondary society

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 40: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

40

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

41

Designing Keyword Features

bull Example in one feature we

1 compute the random indexing similarity between

the essay and each group of primary keywords

taken from parts of the essayrsquos prompt

2 assign the feature the lowest of these values

bull A low feature value suggests that the student ignored

the prompt component from which the value came

42

Thesis Clarity Keyword Features

bull Though these features were designed for scoring thesis

clarity some of them are useful for prompt adherence

scoring

43

2 Prompt Adherence Keyword Features

bull Motivation rather than relying on keywords for thesis

clarity why not hand-pick keywords for prompt

adherence and create features from them

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• Predict which of the 5 error types an essay contains
  – Recast as a multi-label classification task

Creating Predicted “Error Type” Features

• Add a binary feature indicating the presence or absence of each error type (sketched below)
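A hedged sketch of the multi-label recasting: one binary classifier per error type, whose predictions become five binary features. The random arrays stand in for real essay feature vectors and gold error annotations.

```python
# Sketch: error-type prediction as multi-label classification.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

rng = np.random.default_rng(0)
X_train = rng.random((40, 100))                   # essay feature vectors
Y_train = rng.integers(0, 2, size=(40, 5))        # gold error-type labels
X_test = rng.random((10, 100))

clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)
error_type_features = clf.predict(X_test)         # one 0/1 feature per type
```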

6. N-gram features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain (see the sketch below)
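A hedged sketch of the selection step: extract uni-/bi-/trigrams and keep the 10K highest-scoring ones. sklearn's `mutual_info_classif` stands in for information gain here, lemmatization is assumed to have been done already, and the tiny example data is illustrative.

```python
# Sketch: top-10K n-gram features by (approximate) information gain.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

lemmatized_essays = ["army consist professional soldier",
                     "value military service"]
labels = [3.0, 2.5]   # prompt adherence scores used as selection targets

vec = CountVectorizer(ngram_range=(1, 3))
X = vec.fit_transform(lemmatized_essays)
selector = SelectKBest(mutual_info_classif, k=min(10000, X.shape[1]))
X_top = selector.fit_transform(X, labels)
```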

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross validation

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei (reconstructed formulas below):
  – S1: probability that a system predicts the wrong score out of 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2: average absolute error; distinguishes near misses from far misses
  – S3: average squared error; prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better
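The metric formulas on the original slides were embedded as images; the following is a plausible LaTeX reconstruction from the verbal definitions above, with A_j the annotated score and E_j the estimated score of essay j, and N the number of essays (for S1, the estimated score is assumed to be rounded to the nearest of the 7 possible scores before comparison):

```latex
S1 = \frac{1}{N}\sum_{j=1}^{N} \mathbf{1}\!\left[A_j \neq E_j\right], \qquad
S2 = \frac{1}{N}\sum_{j=1}^{N} \left|A_j - E_j\right|, \qquad
S3 = \frac{1}{N}\sum_{j=1}^{N} \left(A_j - E_j\right)^2
```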

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (see the sketch below)
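A hedged sketch of the tuning loop: the talk's SVM regressor is stood in for by sklearn's `SVR`, random arrays replace the real essay features, and the regularization parameter C is chosen on development data by the metric being optimized (S2, average absolute error, shown here).

```python
# Sketch: pick C by performance on held-out development data.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_train, y_train = rng.random((40, 10)), rng.uniform(1, 4, 40)
X_dev, y_dev = rng.random((10, 10)), rng.uniform(1, 4, 10)

def s2(annotated, estimated):
    # Average absolute error (metric S2).
    return float(np.mean(np.abs(annotated - estimated)))

best_c = min(
    (10.0 ** k for k in range(-3, 4)),
    key=lambda c: s2(y_dev,
                     SVR(kernel="linear", C=c).fit(X_train, y_train)
                                              .predict(X_dev)),
)
```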

Results

System        S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(S1 = probability of error; S2 = average absolute error; S3 = average squared error; PC = Pearson's correlation coefficient. The baseline uses the RI features.)

• Improvements w.r.t. all four scoring metrics
• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features (sketched below)
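A hedged sketch of the ablation protocol: drop one feature group at a time, retrain, and re-score. `feature_groups` (name to column indices) and the data arrays are illustrative stand-ins.

```python
# Sketch: leave-one-feature-group-out ablation, scored by S2.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X_train, y_train = rng.random((40, 30)), rng.uniform(1, 4, 40)
X_dev, y_dev = rng.random((10, 30)), rng.uniform(1, 4, 10)
feature_groups = {"PA keywords": range(0, 10),
                  "TC keywords": range(10, 20),
                  "n-grams": range(20, 30)}

for name, cols in feature_groups.items():
    keep = [i for i in range(X_train.shape[1]) if i not in set(cols)]
    model = SVR(kernel="linear").fit(X_train[:, keep], y_train)
    err = np.mean(np.abs(y_dev - model.predict(X_dev[:, keep])))
    print(f"without {name}: S2 = {err:.3f}")
```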

Feature Ablation Results

[Table: for each scoring metric (S1, S2, S3, PC), the seven feature types – TC keywords, PA keywords, n-grams, annotated LDA topics, unlabeled LDA topics, predicted TC errors, and RI – ranked from most important to least important]

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics


Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 43: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

43

2. Prompt Adherence Keyword Features

• Motivation: rather than relying on keywords for thesis clarity, why not hand-pick keywords for prompt adherence and create features from them?

45

An Illustrative Example

Marx once said that religion was the opium of the masses. If he was alive at the end of the 20th century, he would replace religion with television.

• This question suggests that students discuss whether television is analogous to religion in this way
– prompt adherence keywords contain “religion”
– thesis clarity keywords do not contain “religion”
– a thesis like “Television is bad” can be stated clearly without reference to “religion”
• an essay with this thesis could have a high thesis clarity score
– but a low prompt adherence score: religion should be discussed

50

Creating Features from Keywords

• Two types of features; for each prompt component:
1. take the RI similarity between the whole essay and the component’s keywords
2. compute the fraction of the component’s keywords that appear in the essay

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

For each prompt:
1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. build an LDA model of 1,000 topics
– a soft clustering of the words into 1,000 sets
– e.g., for the most frequent topic for the military prompt, the five most important words are “man”, “military”, “service”, “pay”, and “war”
– the model can tell us how much an essay spends on each topic, e.g. 25% on Topic 1, 45% on Topic 2, 15% on Topic 3, …, 5% on Topic 1000
3. construct 1,000 features, one for each topic
– the feature value encodes how much of the essay was spent discussing the topic
– e.g., an essay written for the military prompt might spend 55% on [“fully”, “count”, “ordinary”, “czech”, “day”] and 45% on [“man”, “military”, “service”, “pay”, “war”]
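A minimal sketch of this feature construction, using gensim as the LDA implementation (an assumption; the talk does not name a toolkit):

from gensim import corpora, models

def lda_topic_features(prompt_essays, essay_tokens, num_topics=1000):
    # prompt_essays: every tokenized ICLE essay written for this prompt
    dictionary = corpora.Dictionary(prompt_essays)
    corpus = [dictionary.doc2bow(doc) for doc in prompt_essays]
    lda = models.LdaModel(corpus, num_topics=num_topics, id2word=dictionary)
    # one feature per topic: the share of the essay spent on that topic
    feats = [0.0] * num_topics
    bow = dictionary.doc2bow(essay_tokens)
    for topic_id, share in lda.get_document_topics(bow, minimum_probability=0.0):
        feats[topic_id] = float(share)
    return feats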

64

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish an infrequent topic that is adherent to the prompt from one that is an irrelevant digression
– an infrequent topic may not appear often enough in the training set for the regressor to make this judgment
• Create manually annotated LDA features

67

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
• higher score = more adherent
3. For each essay, create 10 features from the labeled topics

71

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, respectively
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussion
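A minimal sketch of these ten features, assuming a per-essay topic distribution and the hand-assigned 0–5 topic labels as inputs (both hypothetical data structures):

def annotated_lda_features(topic_share, topic_label):
    # topic_share: topic id -> fraction of the essay spent on the topic
    # topic_label: topic id -> hand-assigned adherence label in 0..5
    feats = []
    for k in range(5):      # contributions of topics labeled exactly 0, 1, ..., 4
        feats.append(sum(s for t, s in topic_share.items() if topic_label[t] == k))
    for k in range(1, 6):   # contributions of topics labeled >= 1, >= 2, ..., >= 5
        feats.append(sum(s for t, s in topic_share.items() if topic_label[t] >= k))
    return feats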

73

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
– score an essay w.r.t. the clarity of its thesis
– determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

76

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
– though each essay was manually annotated with the errors it contains, in a realistic setting we won’t have access to these manual annotations
• Predict which of the 5 error types an essay contains
– recast as a multi-label classification task

79

Creating Predicted “Error Type” Features

• Add a binary feature indicating the presence or absence of each error type
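A minimal sketch of this step, casting error-type prediction as multi-label classification with scikit-learn's one-vs-rest wrapper (an assumption; the talk does not specify the classifier):

from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

def predicted_error_type_features(X_train, Y_train_errors, X_new):
    # Y_train_errors: an (n_essays x 5) 0/1 indicator matrix of gold error types
    clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train_errors)
    # one binary presence/absence feature per error type for each new essay
    return clf.predict(X_new)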

80

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
– selected according to information gain
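A minimal sketch of the n-gram selection, using scikit-learn with mutual information standing in for the information-gain criterion (toolkit and variable names are assumptions):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def select_ngrams(lemmatized_essays, scores, k=10000):
    # unigrams, bigrams, and trigrams over the lemmatized essay text
    vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vectorizer.fit_transform(lemmatized_essays)
    # keep the k n-grams most informative about the adherence scores
    selector = SelectKBest(mutual_info_classif, k=k)
    X_selected = selector.fit_transform(X, scores)
    return X_selected, vectorizer, selector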

81

Summary of Features

• Baseline features
– random indexing features
• Six types of features
– lemmatized n-grams
– thesis clarity keyword features
– prompt adherence keyword features
– LDA topics
– manually annotated LDA topics
– predicted thesis clarity errors

82

Plan for the Talk

Corpus and Annotations
Model for scoring prompt adherence
Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

84

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei:
– S1 (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4): S1 = (1/N) Σ 1[Ei ≠ Ai]
– S2 (average absolute error): distinguishes near misses from far misses: S2 = (1/N) Σ |Ei − Ai|
– S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores: S3 = (1/N) Σ (Ei − Ai)²
– PC: Pearson’s correlation coefficient between Ai and Ei
• S1, S2, S3 are error metrics: smaller value is better
• PC is a correlation coefficient: larger value is better
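A minimal sketch of the four metrics, assuming gold scores A and system estimates E, with estimates rounded to the seven legal values before computing S1 (the rounding detail is an assumption):

import numpy as np

def scoring_metrics(A, E):
    A, E = np.asarray(A, dtype=float), np.asarray(E, dtype=float)
    legal = np.arange(1.0, 4.5, 0.5)              # 1, 1.5, ..., 4
    rounded = legal[np.argmin(np.abs(E[:, None] - legal), axis=1)]
    s1 = float(np.mean(rounded != A))             # probability of error
    s2 = float(np.mean(np.abs(E - A)))            # average absolute error
    s3 = float(np.mean((E - A) ** 2))             # average squared error
    pc = float(np.corrcoef(A, E)[0, 1])           # Pearson's correlation
    return s1, s2, s3, pc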

94

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data
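A minimal sketch of this tuning loop with scikit-learn's SVR (the toolkit and the parameter grid are assumptions):

from sklearn.svm import SVR

def tune_svm_regressor(X_train, y_train, X_dev, y_dev, metric, smaller_is_better=True):
    # pick the regularization parameter C on held-out development data,
    # once per scoring metric (S1, S2, S3 prefer smaller; PC prefers larger)
    best_model, best_score = None, None
    for C in [10.0 ** k for k in range(-3, 4)]:
        model = SVR(C=C).fit(X_train, y_train)
        score = metric(y_dev, model.predict(X_dev))
        better = best_score is None or (score < best_score if smaller_is_better
                                        else score > best_score)
        if better:
            best_model, best_score = model, score
    return best_model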

95

Results

System                  S1     S2     S3     PC
Baseline (RI features) .517   .368   .234   .233
Our system             .488   .348   .197   .360

(S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson’s correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• The differences are significant

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system’s performance w.r.t. each scoring metric
– train a regressor on all but one type of features
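A minimal sketch of the ablation loop, assuming the features are stored column-wise with known index blocks per feature type (all plumbing here is hypothetical):

import numpy as np
from sklearn.svm import SVR

def ablation_results(X_train, y_train, X_test, y_test, feature_blocks, metric):
    # feature_blocks: feature-type name -> column indices of that type
    results = {}
    for name, cols in feature_blocks.items():
        keep = np.setdiff1d(np.arange(X_train.shape[1]), cols)  # all but one type
        model = SVR().fit(X_train[:, keep], y_train)
        results[name] = metric(y_test, model.predict(X_test[:, keep]))
    return results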

105

Feature Ablation Results

[Figure: the seven feature types (TC keywords, N-grams, annotated LDA topics, unlabeled LDA topics, RI, predicted TC errors, PA keywords) ranked from most important to least important under each scoring metric: S1, S2, S3, and PC]

• Relative importance of the features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features
– most important: TC keywords, n-grams, annotated LDA topics
– middling importance: RI, unlabeled LDA topics
– least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays
– feature-rich approach
• Released the annotations
– prompt adherence scores
– prompt adherence keywords
– manually annotated LDA topics

Page 44: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

44

2 Prompt Adherence Keyword Features

bull Rather than relying on keywords for thesis clarity why

not hand-pick keywords for prompt adherence and

create features from them

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 45: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

45

An Illustrative Example

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52–56

3. LDA Topics

• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt.

All armies should consist entirely of professional soldiers; there is no value in a system of military service.

• An essay containing words like "peace", "patriotism", or "training" is probably not digressing and should not be penalized for discussing these topics.
  – But the various measures of keyword similarity might not notice that anything related to the prompt is discussed.
  – This might have effects like lowering the RI similarity scores.

57–63

How to create LDA features

For each prompt (see the gensim-based sketch after this list):
1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. build an LDA model of 1,000 topics
   – a soft clustering of the words into 1,000 sets
   – e.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"
   – the model can tell us how much time an essay spends on each topic, e.g. 25% on Topic 1, 45% on Topic 2, 15% on Topic 3, ..., 5% on Topic 1000
3. construct 1,000 features, one for each topic
   – the feature value encodes how much of the essay was spent discussing the topic
   – e.g., an essay written for the military prompt might spend 55% on {"fully", "count", "ordinary", "czech", "day"} and 45% on {"man", "military", "service", "pay", "war"}
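A sketch of this pipeline using gensim, assuming `essays` holds the tokenized ICLE essays for one prompt; the original work's exact LDA toolkit and hyperparameters are not specified here, so treat this as illustrative.

```python
from gensim import corpora, models

# essays: list of token lists, one per ICLE essay written for this prompt
dictionary = corpora.Dictionary(essays)
bow_corpus = [dictionary.doc2bow(tokens) for tokens in essays]

# Build an LDA model of 1,000 topics for this prompt
lda = models.LdaModel(bow_corpus, num_topics=1000, id2word=dictionary)

def lda_topic_features(tokens):
    """One feature per topic: the share of the essay spent on that topic."""
    bow = dictionary.doc2bow(tokens)
    feats = [0.0] * 1000
    for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
        feats[topic_id] = prob
    return feats
```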

64–66

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression.
  – An infrequent topic may not appear often enough in the training set for the regressor to make this judgment.
• Create manually annotated LDA features.

67–70

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics.
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt (higher score = more adherent).
3. For each essay, create 10 features from the labeled topics.

71–72

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, ..., 4, respectively.
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, ..., ≥ 5, respectively.
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions (sketched below).
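A sketch of these 10 features, assuming `topic_share` is an essay's distribution over the 100 topics and `topic_label[t]` is the hand-assigned 0–5 adherence label of topic t; both names are illustrative.

```python
def annotated_lda_features(topic_share, topic_label):
    """10 features summarizing prompt-related vs. prompt-unrelated content."""
    # Features 1-5: total contribution of topics labeled exactly 0, 1, ..., 4
    exact = [sum(p for t, p in enumerate(topic_share) if topic_label[t] == k)
             for k in range(5)]
    # Features 6-10: total contribution of topics labeled >= 1, >= 2, ..., >= 5
    at_least = [sum(p for t, p in enumerate(topic_share) if topic_label[t] >= k)
                for k in range(1, 6)]
    return exact + at_least
```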

73

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we:
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74–75

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

76–78

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains.
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations.
• Predict which of the 5 error types an essay contains.
  – Recast as a multi-label classification task.

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type (a sketch of the error-type prediction follows).
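A sketch of the multi-label error-type prediction using scikit-learn; `X_train`, `X_test`, and `error_labels` are illustrative names, and the original system's learner need not be LinearSVC.

```python
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
               "Incomplete Prompt Response", "Relevance to Prompt"]

# error_labels: for each training essay, the set of error types it contains
mlb = MultiLabelBinarizer(classes=ERROR_TYPES)
Y_train = mlb.fit_transform(error_labels)  # one 0/1 column per error type

# One binary classifier per error type; its predictions become the five
# binary "error type" features of a test essay
clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)
error_type_features = clf.predict(X_test)  # shape (n_essays, 5)
```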

80

6. N-gram features

• Can capture useful words and phrases related to a prompt.
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain (sketched below)
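A sketch of the n-gram selection using scikit-learn, with mutual information standing in for information gain; `lemmatized_essays` (whitespace-joined lemma strings) and `scores` are illustrative names.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Extract lemmatized unigrams, bigrams, and trigrams
vectorizer = CountVectorizer(ngram_range=(1, 3), binary=True)
X = vectorizer.fit_transform(lemmatized_essays)

# Keep the 10K n-grams most informative about the prompt adherence scores
selector = SelectKBest(mutual_info_classif, k=10000)
X_selected = selector.fit_transform(X, scores)
```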

81

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

82

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross validation

84–93

Scoring Metrics

• Define 4 evaluation metrics, comparing the annotated scores A_i against the estimated scores E_i of the N test essays:

  S1 (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4), with estimates rounded to the nearest valid score:

    S1 = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}[A_i \neq E_i]

  S2 (average absolute error): distinguishes near misses from far misses:

    S2 = \frac{1}{N} \sum_{i=1}^{N} |A_i - E_i|

  S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores:

    S3 = \frac{1}{N} \sum_{i=1}^{N} (A_i - E_i)^2

  PC: Pearson's correlation coefficient between the A_i and the E_i

• S1, S2, and S3 are error metrics: a smaller value is better.
• PC is a correlation coefficient: a larger value is better.
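The four metrics are straightforward to compute from the two score arrays; a sketch with numpy/scipy, assuming `A` and `E` are the annotated and estimated scores (with E already mapped to the 7 valid scores before computing S1).

```python
import numpy as np
from scipy.stats import pearsonr

def s1(A, E):  # probability of error
    return np.mean(A != E)

def s2(A, E):  # average absolute error
    return np.mean(np.abs(A - E))

def s3(A, E):  # average squared error
    return np.mean((A - E) ** 2)

def pc(A, E):  # Pearson's correlation coefficient
    return pearsonr(A, E)[0]
```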

94

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (a tuning sketch follows).
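A sketch of this tuning loop with scikit-learn's SVR as the regressor; the parameter grid, the choice of S1 as the target metric (each metric would get its own loop), and the hypothetical `round_to_valid_scores` helper are all assumptions.

```python
import numpy as np
from sklearn.svm import SVR

best_c, best_err = None, np.inf
for c in [0.01, 0.1, 1.0, 10.0, 100.0]:  # candidate regularization parameters
    model = SVR(C=c).fit(X_train, y_train)
    pred = round_to_valid_scores(model.predict(X_dev))  # hypothetical helper
    err = s1(y_dev, pred)  # S1 on the held-out development data
    if err < best_err:
        best_c, best_err = c, err
```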

95–103

Results

System       S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient. The baseline uses the RI features.)

• Our system improves over the baseline w.r.t. all four scoring metrics.
• The differences are significant.

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric.
  – Train a regressor on all but one type of features (see the sketch below).
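A sketch of the ablation loop, assuming `feature_groups` maps each of the seven feature types to its column indices in the feature matrix `X` and `evaluate` runs the 5-fold cross validation; all names are illustrative.

```python
for held_out in feature_groups:
    kept = [c for group, cols in feature_groups.items()
            if group != held_out for c in cols]
    # Performance of a regressor trained on all but one feature type
    print(held_out, evaluate(X[:, kept], y))
```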

105–118

Feature Ablation Results

Within each row, the feature types are ordered by their importance to that metric, from least important (left) to most important (right):

S1: PA keywords | annotated LDA topics | n-grams | predicted TC errors | RI | unlabeled LDA topics | TC keywords
S2: predicted TC errors | unlabeled LDA topics | PA keywords | RI | TC keywords | annotated LDA topics | n-grams
S3: TC keywords | predicted TC errors | PA keywords | unlabeled LDA topics | RI | annotated LDA topics | n-grams
PC: PA keywords | unlabeled LDA topics | predicted TC errors | RI | n-grams | TC keywords | annotated LDA topics

• The relative importance of the features does not always remain consistent if we measure performance in different ways.
• But there are features that tend to be more important than the others in the presence of the other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – of middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

(PA = prompt adherence, TC = thesis clarity.)

119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations:
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 46: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

46

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 47: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

47

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

System                   S1     S2     S3     PC
Baseline (RI features)   .517   .368   .234   .233
Our system               .488   .348   .197   .360

(S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features
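A sketch of the ablation loop, assuming each feature type occupies a known block of columns (feature_groups, train_fn, and metric are hypothetical helpers):

```python
import numpy as np

def ablation_study(feature_groups, X_tr, y_tr, X_te, y_te, train_fn, metric):
    # feature_groups: {feature-type name: list of column indices}
    results = {}
    for name, cols in feature_groups.items():
        keep = np.setdiff1d(np.arange(X_tr.shape[1]), cols)
        model = train_fn(X_tr[:, keep], y_tr)
        results[name] = metric(y_te, model.predict(X_te[:, keep]))
    return results   # the bigger the drop without a type, the more important it is
```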

Feature Ablation Results

(table ranking the seven feature types — PA keywords, TC keywords, N-grams, RI, unlabeled LDA topics, annotated LDA topics, and predicted TC errors — from most important to least important under each scoring metric: S1, S2, S3, and PC)

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But… there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach

• Released the annotations:
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 48: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

48

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 49: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

49

An Illustrative Example

bull This question suggests that students discuss whether

television is analogous to religion in this way

ndash prompt adherence keywords contain ldquoreligionrdquo

ndash thesis clarity keywords do not contain ldquoreligionrdquo

ndash A thesis like ldquoTelevision is badrdquo can be stated clearly without

reference to ldquoreligionrdquo

bull essay with this thesis could have high thesis clarity score

ndash But low adherence score Religion should be discussed

Marx once said that religion was the opium of the

masses If he was alive at the end of the 20th century he would replace religion with television

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 50: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

50

Creating Features from Keywords

bull Two types of features

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

3. LDA Topics

• Motivation: the features introduced so far have trouble identifying topics that are related to, but not explicitly mentioned in, the prompt

    All armies should consist entirely of professional soldiers;
    there is no value in a system of military service.

• An essay containing words like "peace", "patriotism", or "training" is probably not digressing and should not be penalized for discussing these topics
  – but the various measures of keyword similarity might not notice that anything related to the prompt is discussed
  – this might have effects like lowering the RI similarity scores

How to create LDA features

For each prompt (a sketch follows this list):
1. collect all the essays in the ICLE corpus written in response to it, not just those we labeled
2. build an LDA model of 1,000 topics
   – a soft clustering of the words into 1,000 sets
     • e.g., for the most frequent topic for the military prompt, the five most important words are "man", "military", "service", "pay", and "war"
   – the model can tell us how much of an essay is spent on each topic
     • e.g., 25% on Topic 1, 45% on Topic 2, 15% on Topic 3, ..., 5% on Topic 1000
3. construct 1,000 features, one for each topic
   – the feature value encodes how much of the essay was spent discussing the topic
     • e.g., an essay written for the military prompt might spend 55% on the topic {"fully", "count", "ordinary", "czech", "day"} and 45% on {"man", "military", "service", "pay", "war"}
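A minimal sketch of steps 1–3, assuming gensim as the LDA implementation (the talk does not name one) and pre-tokenized essays:

    from gensim import corpora, models

    def lda_topic_features(prompt_essays, num_topics=1000):
        """prompt_essays: tokenized ICLE essays written for one prompt.
        Returns one feature vector per essay: the share of the essay
        spent on each of the num_topics topics."""
        dictionary = corpora.Dictionary(prompt_essays)
        bows = [dictionary.doc2bow(essay) for essay in prompt_essays]
        lda = models.LdaModel(bows, num_topics=num_topics, id2word=dictionary)
        features = []
        for bow in bows:
            # minimum_probability=0.0 returns the share of every topic
            topic_dist = lda.get_document_topics(bow, minimum_probability=0.0)
            features.append([share for _, share in topic_dist])
        return lda, features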

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish between an infrequent topic that is adherent to the prompt and one that is an irrelevant digression
  – an infrequent topic may not appear often enough in the training set for the regressor to make this judgment
• Create manually annotated LDA features

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
   • higher score = more adherent
3. For each essay, create 10 features from the labeled topics

10 Features from the Labeled Topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, ..., 4, respectively
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, ..., ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions
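A minimal sketch of the 10 features, assuming topic_share maps each of the 100 topic ids to the fraction of the essay spent on it and adherence_label holds the hand-assigned 0–5 scores:

    def annotated_lda_features(topic_share, adherence_label):
        exact = [0.0] * 5     # contributions of topics labeled exactly 0..4
        at_least = [0.0] * 5  # contributions of topics labeled >= 1 .. >= 5
        for topic, share in topic_share.items():
            label = adherence_label[topic]
            if label <= 4:
                exact[label] += share
            for k in range(1, 6):
                if label >= k:
                    at_least[k - 1] += share
        return exact + at_least  # the 10 feature values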

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• Predict which of the 5 error types an essay contains
  – recast as a multi-label classification task

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type (see the sketch below)
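A minimal sketch of the multi-label step, assuming scikit-learn (the talk does not name its learner); Y_train here is a binary indicator matrix with one column per error type:

    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import LinearSVC

    ERROR_TYPES = ["Confusing Phrasing", "Missing Details", "Writer Position",
                   "Incomplete Prompt Response", "Relevance to Prompt"]

    def predicted_error_features(X_train, Y_train, X_test):
        """Returns an (n_test, 5) matrix of 0/1 predicted-error features."""
        clf = OneVsRestClassifier(LinearSVC())  # one binary classifier per type
        clf.fit(X_train, Y_train)
        return clf.predict(X_test)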

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain (a selection sketch follows)
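A minimal sketch of the selection step, using scikit-learn's mutual information estimate as the information-gain score (an assumption) over essays that are lemmatized beforehand:

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import mutual_info_classif

    def top_ngram_features(lemmatized_essays, scores, k=10000):
        vec = CountVectorizer(ngram_range=(1, 3), binary=True)
        X = vec.fit_transform(lemmatized_essays)      # uni-, bi-, and trigrams
        ig = mutual_info_classif(X, scores, discrete_features=True)
        top = np.argsort(ig)[::-1][:k]                # k highest-IG n-grams
        return X[:, top], np.array(vec.get_feature_names_out())[top]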

Summary of Features

• Baseline features
  – random indexing features
• Six types of features
  – lemmatized n-grams
  – thesis clarity keyword features
  – prompt adherence keyword features
  – LDA topics
  – manually annotated LDA topics
  – predicted thesis clarity errors

Plan for the Talk

• Corpus and annotations
• Model for scoring prompt adherence
• Evaluation

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei (reconstructed formulas below):
  – S1 (probability of error): the probability that a system predicts the wrong score, out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from far misses
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3 are error metrics: a smaller value is better
• PC is a correlation coefficient: a larger value is better
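The formula images did not survive extraction; a plausible reconstruction from the slide descriptions, with A_j the annotated and E_j the estimated score of essay j out of N essays (for S_1, E_j is assumed to be rounded to the nearest of the 7 valid scores):

    % reconstructed from the slide descriptions, not copied from the paper
    S_1 = \frac{1}{N} \sum_{j=1}^{N} \mathbf{1}\!\left[A_j \neq E_j\right], \quad
    S_2 = \frac{1}{N} \sum_{j=1}^{N} \left|A_j - E_j\right|, \quad
    S_3 = \frac{1}{N} \sum_{j=1}^{N} \left(A_j - E_j\right)^2

    PC = \frac{\sum_{j}(A_j - \bar{A})(E_j - \bar{E})}
              {\sqrt{\sum_{j}(A_j - \bar{A})^2}\,\sqrt{\sum_{j}(E_j - \bar{E})^2}}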

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (a tuning sketch follows)
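A minimal sketch of the tuning loop, assuming scikit-learn's SVR and an illustrative C grid (neither is specified in the talk):

    from sklearn.svm import SVR

    def tune_regressor(X_train, y_train, X_dev, y_dev, metric, lower_is_better):
        """Pick the regularization parameter C that optimizes `metric`
        (e.g., S1, S2, S3, or PC) on held-out development data."""
        best_model, best_score = None, None
        for c in [10.0 ** k for k in range(-3, 4)]:
            model = SVR(kernel="linear", C=c).fit(X_train, y_train)
            score = metric(y_dev, model.predict(X_dev))
            if best_score is None or (score < best_score
                                      if lower_is_better else score > best_score):
                best_model, best_score = model, score
        return best_model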

Results

    System        S1     S2     S3     PC
    Baseline     .517   .368   .234   .233
    Our system   .488   .348   .197   .360

• Baseline: the random indexing (RI) features
• S1 = probability of error; S2 = average absolute error; S3 = average squared error; PC = Pearson's correlation coefficient
• Improvements w.r.t. all four scoring metrics
• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric (see the sketch below)
  – train a regressor on all but one type of features
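A minimal sketch of the ablation protocol: retrain with one feature type held out and score under each metric; feature types are stored as column blocks, and all names here are illustrative:

    import numpy as np

    def ablate(train_groups, y_train, dev_groups, y_dev, train_fn, metrics):
        """train_groups / dev_groups: dict feature-type -> matrix block.
        train_fn(X, y) -> fitted regressor; metrics: name -> metric fn."""
        results = {}
        for held_out in train_groups:
            keep = [g for g in sorted(train_groups) if g != held_out]
            X_tr = np.hstack([train_groups[g] for g in keep])
            X_de = np.hstack([dev_groups[g] for g in keep])
            model = train_fn(X_tr, y_train)
            preds = model.predict(X_de)
            results[held_out] = {m: f(y_dev, preds) for m, f in metrics.items()}
        return results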

Feature Ablation Results

[Ablation charts: for each scoring metric (S1, S2, S3, PC), the seven feature types — PA keywords, TC keywords, n-grams, RI, unlabeled LDA topics, annotated LDA topics, predicted TC errors — ranked by importance (TC = thesis clarity, PA = prompt adherence)]

• The relative importance of the feature types does not always remain consistent if we measure performance in different ways
• But there are feature types that tend to be more important than the others in the presence of the other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations:
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 51: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

51

Creating Features from Keywords

bull Two types of features

bull For each prompt component

1 take the RI similarity between the whole essay and the

componentrsquos keywords

2 compute the fraction of the componentrsquos keywords

that appear in the essay

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 52: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

52

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

53

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

All armies should consist entirely of professional soldiers

there is no value in a system of military service

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105-118

Feature Ablation Results

Features ordered from least important (left) to most important (right) for each scoring metric:

S1: PA keywords, Annotated LDA topics, N-grams, Predicted TC errors, RI, Unlabeled LDA topics, TC keywords
S2: Predicted TC errors, Unlabeled LDA topics, PA keywords, RI, TC keywords, Annotated LDA topics, N-grams
S3: TC keywords, Predicted TC errors, PA keywords, Unlabeled LDA topics, RI, Annotated LDA topics, N-grams
PC: PA keywords, Unlabeled LDA topics, Predicted TC errors, RI, N-grams, TC keywords, Annotated LDA topics

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But… there are features that tend to be more important than the others in the presence of other features:

  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords
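As an illustration, one way such per-metric orderings could be derived from the ablation() results sketched above; full_scores, a dict holding the all-features system's score for each metric, is an assumed input.

# Hypothetical: derive a least-to-most-important ordering for one metric.
def order_by_importance(results, full_scores, metric):
    # Removing an important feature type raises the error metrics S1-S3
    # but lowers the correlation PC, so flip the sign for PC.
    sign = -1.0 if metric == "PC" else 1.0
    def damage(feat):  # how much performance suffered when feat was removed
        return sign * (results[(feat, metric)] - full_scores[metric])
    return sorted(FEATURE_TYPES, key=damage)  # least -> most important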

119

Summary

• Examined the problem of prompt adherence scoring in student essays

  – feature-rich approach

• Released the annotations

  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics


Page 54: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

54

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

All armies should consist entirely of professional soldiers

there is no value in a system of military service

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 55: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

55

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

All armies should consist entirely of professional soldiers

there is no value in a system of military service

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (a sketch of this tuning loop follows)
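A minimal sketch of the tuning loop, assuming scikit-learn's SVR as the regressor (the talk does not name the SVM package) and S2 as the metric being optimized; the grid of C values is illustrative:

    import numpy as np
    from sklearn.svm import SVR

    def average_absolute_error(a, e):  # the S2 metric; smaller is better
        return float(np.mean(np.abs(np.asarray(a) - np.asarray(e))))

    def tune_c(X_train, y_train, X_dev, y_dev,
               metric=average_absolute_error, grid=(0.01, 0.1, 1.0, 10.0, 100.0)):
        best_c, best_err = None, float("inf")
        for c in grid:
            pred = SVR(C=c).fit(X_train, y_train).predict(X_dev)
            err = metric(y_dev, pred)
            if err < best_err:  # for PC one would instead keep the largest value
                best_c, best_err = c, err
        return best_c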

Results

System                   S1     S2     S3     PC
Baseline (RI features)  .517   .368   .234   .233
Our system              .488   .348   .197   .360

(S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• Significant differences

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features (see the sketch below)
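A sketch of that ablation loop, assuming each feature type is kept in its own matrix (a dict from feature-type name to matrix); the regressor and metric choices follow the tuning sketch above:

    import numpy as np
    from sklearn.svm import SVR

    def ablate(train_groups, y_train, dev_groups, y_dev, metric):
        """train_groups/dev_groups: dict of feature-type name -> feature matrix."""
        results = {}
        for held_out in train_groups:
            keep = [g for g in sorted(train_groups) if g != held_out]
            X_tr = np.hstack([train_groups[g] for g in keep])
            X_dv = np.hstack([dev_groups[g] for g in keep])
            pred = SVR().fit(X_tr, y_train).predict(X_dv)
            results[held_out] = metric(y_dev, pred)  # larger error on removal
        return results                               # => more important feature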

Feature Ablation Results

[Table: importance ranking of the seven feature types (PA keywords, TC keywords, n-grams, RI, unlabeled LDA topics, annotated LDA topics, predicted TC errors) under each of the scoring metrics S1, S2, S3, and PC]

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – of middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations:
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 56: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

56

3 LDA Topics

bull Motivation the features introduced so far have

trouble identifying topics that are related to but not

explicitly mentioned in the prompt

bull An essay containing words like ldquopeacerdquo ldquopatriotismrdquo or

ldquotrainingrdquo are probably not digressions and should not be

penalized for discussing these topics

ndash But the various measures of keyword similarities might not

notice that anything related to the prompt is discussed

ndash this might have effects like lowering the RI similarity scores

All armies should consist entirely of professional soldiers

there is no value in a system of military service

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 57: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

57

How to create LDA features

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 58: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

58

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei:
– S1 (probability of error): probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
– S2 (average absolute error): distinguishes near misses from far misses
– S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
– PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better

94
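The slide's formulas did not survive extraction; below is a sketch of the four metrics under the standard reading of the labels above. Whether the error metrics compare against the raw estimate or one rounded to the 7 half-point scores is an assumption here.

import numpy as np

SCORES = np.arange(1.0, 4.5, 0.5)  # the 7 possible scores: 1, 1.5, ..., 4

def scoring_metrics(A, E):
    # A: annotated scores, E: estimated scores
    A, E = np.asarray(A, float), np.asarray(E, float)
    # round each estimate to the nearest of the 7 possible scores
    Er = SCORES[np.abs(E[:, None] - SCORES[None, :]).argmin(axis=1)]
    s1 = np.mean(Er != A)           # probability of error
    s2 = np.mean(np.abs(E - A))     # average absolute error
    s3 = np.mean((E - A) ** 2)      # average squared error
    pc = np.corrcoef(A, E)[0, 1]    # Pearson's correlation coefficient
    return s1, s2, s3, pc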

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data

95
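A sketch of what this tuning loop might look like, assuming a linear SVR and a small grid of C values; both are illustrative assumptions rather than the talk's configuration.

from sklearn.svm import LinearSVR

def tune_regressor(X_train, y_train, X_dev, y_dev, metric, smaller_is_better):
    best = None  # (dev score, C, model)
    for c in [0.01, 0.1, 1.0, 10.0, 100.0]:
        model = LinearSVR(C=c).fit(X_train, y_train)
        score = metric(y_dev, model.predict(X_dev))
        # keep the model whose dev score is best for this metric's direction
        if best is None or (score < best[0]) == smaller_is_better:
            best = (score, c, model)
    return best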

Results

System       S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

(Baseline: random indexing (RI) features; S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient)

• Improvements w.r.t. all four scoring metrics
• Significant differences

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
– Train a regressor on all but one type of features (see the sketch below)

105
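A sketch of the ablation loop; the feature-type list comes from the summary slide above, while the two helper callables are placeholders for the training pipeline.

FEATURE_TYPES = ["RI", "N-grams", "TC keywords", "PA keywords",
                 "Unlabeled LDA topics", "Annotated LDA topics",
                 "Predicted TC errors"]

def ablate(build_matrices, train_and_score, metric):
    # build_matrices(kept_types) -> (X_train, y_train, X_test, y_test)
    # train_and_score(...)       -> the chosen metric on the test folds
    results = {}
    for held_out in FEATURE_TYPES:
        kept = [f for f in FEATURE_TYPES if f != held_out]
        results[held_out] = train_and_score(*build_matrices(kept), metric)
    # the larger the degradation without a feature type, the more important it is
    return results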

Feature Ablation Results

[Table: for each scoring metric (S1, S2, S3, PC), the seven feature types — TC keywords, PA keywords, n-grams, RI, unlabeled LDA topics, annotated LDA topics, predicted TC errors — ranked from most important to least important; the rankings differ across the four metrics]

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
– most important: TC keywords, n-grams, annotated LDA topics
– middling important: RI, unlabeled LDA topics
– least important: predicted TC errors, PA keywords

118

119

Summary

• Examined the problem of prompt adherence scoring in student essays
– feature-rich approach
• Released the annotations
– prompt adherence scores
– prompt adherence keywords
– manually annotated LDA topics

Page 59: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

59

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 60: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

60

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Soft clustering of the words into 1000 sets

bull Eg for the most frequent topic for the military prompt

the five most important words are

ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo and ldquowarrdquo

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 61: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

61

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

ndash Model can tell us how much an essay spends on each topic

bull Eg

5Topic1000

helliphellip

15Topic3

45Topic2

25Topic1

62

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

63

How to create LDA features

For each prompt

1 collect all the essays in the ICLE corpus written in response

to it not just those we labeled

2 build an LDA of 1000 topics

3 construct 1000 features one for each topic

bull Feature value encodes how much of the essay was spent

discussing the topic

Eg if an essay written for the military prompt spends

55ldquofullyrdquo ldquocountrdquo ldquoordinaryrdquo ldquoczechrdquo ldquodayrdquo

45ldquomanrdquo ldquomilitaryrdquo ldquoservicerdquo ldquopayrdquo ldquowarrdquo

64

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

65

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores A_i and the estimated scores E_i of the N test essays (computed in the sketch below):

S1 = (1/N) Σ_i 1[A_i ≠ E_i]  (probability of error)
– the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)

S2 = (1/N) Σ_i |A_i − E_i|  (average absolute error)
– distinguishes near misses from far misses

S3 = (1/N) Σ_i (A_i − E_i)²  (average squared error)
– prefers systems whose estimations are not too often far away from the correct scores

PC = Pearson's correlation coefficient between A_i and E_i

• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better
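A sketch of the four metrics under one stated assumption: the regressor's continuous output is snapped to the nearest of the 7 possible scores before the metrics are computed (the slides do not specify whether S2 and S3 use the raw or the snapped estimate).

```python
import numpy as np
from scipy.stats import pearsonr

POSSIBLE_SCORES = np.arange(1.0, 4.5, 0.5)  # 1, 1.5, ..., 4

def score_metrics(annotated, estimated):
    """Compute S1, S2, S3 and PC for annotated scores A and estimates E."""
    A = np.asarray(annotated, dtype=float)
    # Snap each continuous regressor output to the nearest of the
    # 7 possible scores (assumption; needed for the exact-match S1).
    E = POSSIBLE_SCORES[np.argmin(
        np.abs(np.asarray(estimated)[:, None] - POSSIBLE_SCORES), axis=1)]
    s1 = np.mean(A != E)          # probability of error
    s2 = np.mean(np.abs(A - E))   # average absolute error
    s3 = np.mean((A - E) ** 2)    # average squared error
    pc = pearsonr(A, E)[0]        # Pearson's correlation coefficient
    return s1, s2, s3, pc
```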

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (see the sketch below)
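A sketch of metric-specific tuning, assuming a small grid for the regularization parameter C and an error metric (smaller is better); for PC the comparison would flip to maximization. The kernel choice and grid values are assumptions.

```python
from sklearn.svm import SVR

def tune_svr(X_train, y_train, X_dev, y_dev, metric_fn,
             c_grid=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """Pick the regularization parameter C that minimizes one scoring
    metric on held-out development data, then return that regressor."""
    best_c, best_val = None, float("inf")
    for c in c_grid:
        model = SVR(kernel="linear", C=c).fit(X_train, y_train)
        val = metric_fn(y_dev, model.predict(X_dev))
        if val < best_val:
            best_c, best_val = c, val
    return SVR(kernel="linear", C=best_c).fit(X_train, y_train)
```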

Results

System       S1    S2    S3    PC
Baseline     .517  .368  .234  .233
Our system   .488  .348  .197  .360

• Baseline: random indexing (RI) features
• S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient
• Our system improves on the baseline w.r.t. all four scoring metrics, and the differences are significant

Feature Ablation

• Goal: examine how much impact each feature type has on our system's performance w.r.t. each scoring metric
– Train a regressor on all but one type of features (see the sketch below)
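A sketch of the leave-one-type-out loop; `feature_blocks` and `evaluate` are hypothetical names, with `evaluate` standing in for the full 5-fold evaluation above (it should return the four scoring metrics).

```python
import numpy as np

def ablation(feature_blocks, gold, evaluate):
    """Leave-one-feature-type-out ablation.

    feature_blocks: dict mapping a feature type name (e.g. "TC keywords",
    "N-grams") to its dense (n_essays x d_i) feature matrix.
    """
    results = {}
    for held_out in feature_blocks:
        # Train/evaluate on every feature type except the held-out one.
        X = np.hstack([m for name, m in feature_blocks.items()
                       if name != held_out])
        results[held_out] = evaluate(X, gold)  # S1, S2, S3, PC without it
    return results
```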

Feature Ablation Results

• Feature types ranked per scoring metric, from least important (left) to most important (right):
– S1: PA keywords, Annotated LDA topics, N-grams, Predicted TC errors, RI, Unlabeled LDA topics, TC keywords
– S2: Predicted TC errors, Unlabeled LDA topics, PA keywords, RI, TC keywords, Annotated LDA topics, N-grams
– S3: TC keywords, Predicted TC errors, PA keywords, Unlabeled LDA topics, RI, Annotated LDA topics, N-grams
– PC: PA keywords, Unlabeled LDA topics, Predicted TC errors, RI, N-grams, TC keywords, Annotated LDA topics
• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
– most important: TC keywords, n-grams, annotated LDA topics
– middling important: RI, unlabeled LDA topics
– least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
– feature-rich approach
• Released the annotations:
– prompt adherence scores
– prompt adherence keywords
– manually annotated LDA topics


S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 65: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

65–66

4. Manually Annotated LDA Topics

• Motivation: the regressor using LDA features may not be able to distinguish an infrequent topic that is adherent to the prompt from one that is an irrelevant digression
  – An infrequent topic may not appear often enough in the training set for the regressor to make this judgment
• Create manually annotated LDA features

67–70

How to Create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA model of 100 topics
2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt
   • higher score, more adherent
3. For each essay, create 10 features from the labeled topics

71–72

10 Features from the Labeled Topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, respectively (computed in the sketch below)
• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, respectively
• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions
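To make the construction concrete, here is a minimal sketch (not the authors' code) of the 10 features. It assumes we already have, for one essay, its topic distribution under the 100-topic LDA model (topic_weights) and the hand-assigned 0–5 adherence label for each topic (topic_labels); both names are illustrative.

# Minimal sketch: 10 features from manually annotated LDA topics.
# topic_weights[i] = contribution of topic i to this essay;
# topic_labels[i]  = hand-annotated adherence score (0..5) of topic i.
from typing import List

def annotated_lda_features(topic_weights: List[float],
                           topic_labels: List[int]) -> List[float]:
    """Return 10 features: mass of topics labeled exactly 0..4,
    then mass of topics labeled >= 1 .. >= 5."""
    assert len(topic_weights) == len(topic_labels)
    exact = [sum(w for w, l in zip(topic_weights, topic_labels) if l == k)
             for k in range(5)]                       # labels 0,1,2,3,4
    at_least = [sum(w for w, l in zip(topic_weights, topic_labels) if l >= k)
                for k in range(1, 6)]                 # labels >=1 .. >=5
    return exact + at_least

# toy usage: 4 topics instead of 100
print(annotated_lda_features([0.5, 0.2, 0.2, 0.1], [0, 3, 5, 5]))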

73

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we
  – score an essay w.r.t. the clarity of its thesis
  – determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74–75

5 Types of Thesis Clarity Errors

• Confusing Phrasing
• Missing Details
• Writer Position
• Incomplete Prompt Response
• Relevance to Prompt

76–78

Features Based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains
  – Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations
• Predict which of the 5 error types an essay contains
  – Recast as a multi-label classification task (see the sketch after slide 79)

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type
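A minimal sketch of this step, under assumed setup (not the authors' pipeline): one binary classifier per error type, whose predictions become the five binary features. X is any essay feature matrix; Y is an n_essays × 5 binary indicator matrix over the five error types; the classifier choice (a linear SVM) is illustrative.

# Minimal sketch: predicted thesis-clarity errors as multi-label classification.
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

ERROR_TYPES = ["confusing_phrasing", "missing_details", "writer_position",
               "incomplete_prompt_response", "relevance_to_prompt"]

def predicted_error_features(X_train, Y_train, X_test):
    """Train one binary classifier per error type; return a 0/1 feature
    matrix with one column per error type for the test essays."""
    clf = OneVsRestClassifier(LinearSVC())   # one binary SVM per error type
    clf.fit(X_train, Y_train)                # Y_train: multi-label 0/1 matrix
    return clf.predict(X_test)               # shape: (n_test, 5)

# toy usage with random data
rng = np.random.RandomState(0)
X_tr, X_te = rng.rand(40, 8), rng.rand(5, 8)
Y_tr = rng.randint(0, 2, size=(40, len(ERROR_TYPES)))
print(predicted_error_features(X_tr, Y_tr, X_te))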

80

6. N-gram Features

• Can capture useful words and phrases related to a prompt
• 10K lemmatized unigrams, bigrams, and trigrams
  – selected according to information gain (see the sketch below)
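A minimal sketch of this selection step, with assumptions: essays are taken to be pre-lemmatized strings, information gain is approximated by scikit-learn's mutual information estimator, and the 7 legal scores are mapped to integer class labels for the selector. The function and variable names are illustrative.

# Minimal sketch: keep the k uni/bi/trigrams with the highest information gain.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def ngram_features(lemmatized_essays, score_labels, k=10000):
    vec = CountVectorizer(ngram_range=(1, 3), binary=True)
    X = vec.fit_transform(lemmatized_essays)
    k = min(k, X.shape[1])                    # a toy corpus has fewer than 10K
    sel = SelectKBest(mutual_info_classif, k=k).fit(X, score_labels)
    return sel.transform(X), vec, sel

# toy usage: scores 2.0, 4.0, 3.5, 4.0 mapped to integer labels (2x the score)
essays = ["the club be bad for student", "smoking should be ban in public",
          "student benefit from part time work", "ban smoking everywhere now"]
labels = [4, 8, 7, 8]
X, vec, sel = ngram_features(essays, labels, k=20)
print(X.shape)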

81

Summary of Features

• Baseline features
  – Random indexing features
• Six types of features
  – Lemmatized n-grams
  – Thesis clarity keyword features
  – Prompt adherence keyword features
  – LDA topics
  – Manually annotated LDA topics
  – Predicted thesis clarity errors

82

Plan for the Talk

• Corpus and Annotations
• Model for scoring prompt adherence
• Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring
• 5-fold cross-validation

84–93

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei (see the sketch below):
  – S1 (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)
  – S2 (average absolute error): distinguishes near misses from far misses
  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores
  – PC: Pearson's correlation coefficient between Ai and Ei
• S1, S2, S3: error metrics; smaller value is better
• PC: correlation coefficient; larger value is better
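Written out from the definitions above, a minimal sketch of the four metrics (not the authors' code). One assumption: for S1 the estimate is first rounded to the nearest of the 7 legal scores before comparing against the annotated score, while S2 and S3 use the raw estimate.

# Minimal sketch: the four scoring metrics.
import numpy as np

LEGAL = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])

def round_to_legal(E):
    """Snap each estimate to the nearest of the 7 legal scores."""
    return LEGAL[np.argmin(np.abs(np.asarray(E)[:, None] - LEGAL), axis=1)]

def scoring_metrics(A, E):
    A, E = np.asarray(A, float), np.asarray(E, float)
    Er = round_to_legal(E)
    s1 = np.mean(Er != A)            # probability of predicting the wrong score
    s2 = np.mean(np.abs(E - A))      # average absolute error
    s3 = np.mean((E - A) ** 2)       # average squared error
    pc = np.corrcoef(A, E)[0, 1]     # Pearson's correlation coefficient
    return s1, s2, s3, pc

print(scoring_metrics([3.0, 2.5, 4.0], [2.9, 2.5, 3.4]))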

94

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data (a tuning sketch follows)
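A minimal sketch of that tuning loop, under assumed setup (a grid over C, one SVR per value, selection on the dev split by the target metric); the grid and helper names are illustrative, not the authors' exact protocol.

# Minimal sketch: tune the SVM regressor's C on held-out development data.
import numpy as np
from sklearn.svm import SVR

def tune_svr(X_tr, y_tr, X_dev, y_dev, metric, grid=(0.01, 0.1, 1, 10, 100)):
    """Fit one SVR per C value; keep the one the metric likes best.
    `metric` maps (gold, predicted) -> score; lower is taken to be better,
    so pass an error metric such as S2, or negate PC before passing it."""
    best = None
    for C in grid:
        model = SVR(C=C).fit(X_tr, y_tr)
        score = metric(y_dev, model.predict(X_dev))
        if best is None or score < best[0]:
            best = (score, model)
    return best[1]

# toy usage with average absolute error (the S2 metric above)
rng = np.random.RandomState(1)
X = rng.rand(60, 5); y = 1 + 3 * X[:, 0]
s2 = lambda a, e: float(np.mean(np.abs(np.asarray(a) - np.asarray(e))))
model = tune_svr(X[:40], y[:40], X[40:], y[40:], s2)
print(round(s2(y[40:], model.predict(X[40:])), 3))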

95–103

Results

System       S1     S2     S3     PC
Baseline     .517   .368   .234   .233
Our system   .488   .348   .197   .360

• Baseline: RI features
• S1: probability of error; S2: average absolute error; S3: average squared error; PC: Pearson's correlation coefficient
• Improvements w.r.t. all four scoring metrics
• Significant differences

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric
  – Train a regressor on all but one type of features (as in the sketch below)
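A minimal sketch of the ablation loop, under assumed setup: `feature_blocks` (an illustrative name) maps each feature-type name to its column indices in the full feature matrix, and performance is measured on the training data here only to keep the example short; in practice it would be cross-validated under each scoring metric.

# Minimal sketch: retrain with each feature type removed and measure the change.
import numpy as np
from sklearn.svm import SVR

def ablation(X, y, feature_blocks, metric):
    full = SVR().fit(X, y)
    base = metric(y, full.predict(X))
    results = {}
    for name, cols in feature_blocks.items():
        keep = [c for c in range(X.shape[1]) if c not in set(cols)]
        model = SVR().fit(X[:, keep], y)
        results[name] = metric(y, model.predict(X[:, keep])) - base
    return results   # larger increase in error => more important feature type

rng = np.random.RandomState(2)
X = rng.rand(50, 6); y = X[:, 0] + 0.1 * rng.rand(50)
s2 = lambda a, e: float(np.mean(np.abs(a - e)))
print(ablation(X, y, {"block_a": [0, 1], "block_b": [2, 3]}, s2))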

105–118

Feature Ablation Results

[Ablation grid: for each scoring metric (S1, S2, S3, PC), the seven feature types (TC keywords, PA keywords, N-grams, annotated LDA topics, unlabeled LDA topics, RI, predicted TC errors) are ranked from most important to least important; TC = thesis clarity, PA = prompt adherence, RI = random indexing]

• The relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 66: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

66

4 Manually Annotated LDA Topics

bull Motivation the regressor using LDA features may not be

able to distinguish an infrequent topic that is adherent to

the prompt and one that is an irrelevant digression

ndash An infrequent topic may not appear enough in the training

set for the regressor to make this judgment

bull Create manually annotated LDA features

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 67: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

67

How to create Manually Annotated

LDA Features

68

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 68: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

68–70

How to create Manually Annotated LDA Features

1. For each set of essays written for a given prompt, build an LDA of 100 topics

2. For each topic, inspect its top 10 words and hand-annotate it with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt

• higher score = more adherent

3. For each essay, create 10 features from the labeled topics
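A minimal sketch of step 1 (plus the top-10-word inspection that feeds the hand-annotation in step 2), assuming gensim as the LDA implementation; the function name and the pre-tokenized input are hypothetical:

```python
from gensim import corpora, models

def build_prompt_lda(tokenized_essays, num_topics=100):
    """Fit a 100-topic LDA model on the essays written for one prompt.

    tokenized_essays -- list of token lists, one per essay
    """
    dictionary = corpora.Dictionary(tokenized_essays)
    bow = [dictionary.doc2bow(tokens) for tokens in tokenized_essays]
    lda = models.LdaModel(bow, num_topics=num_topics, id2word=dictionary)
    # Step 2 hand-annotates each topic after inspecting its top 10 words:
    top_words = [lda.show_topic(t, topn=10) for t in range(num_topics)]
    return lda, dictionary, top_words
```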

71–72

10 Features from the labeled topics

• Five features encode the sum of contributions to an essay of topics annotated with the number 0, 1, …, 4, resp.

• Five features encode the sum of contributions to an essay of topics annotated with a number ≥ 1, ≥ 2, …, ≥ 5, resp.

• These features should give the regressor a better idea of how much of an essay is composed of prompt-related vs. prompt-unrelated discussions
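A minimal sketch of how these 10 features could be computed for one essay, assuming we already have the essay's topic distribution under the prompt's 100-topic LDA model and the hand-assigned 0–5 adherence labels (the function name is hypothetical):

```python
import numpy as np

def annotated_lda_features(topic_probs, topic_labels):
    """10 features from the hand-labeled LDA topics for one essay.

    topic_probs  -- length-100 array: topic t's contribution to the essay
    topic_labels -- length-100 array: hand-assigned labels in {0, ..., 5}
    """
    probs = np.asarray(topic_probs)
    labels = np.asarray(topic_labels)
    # Five features: total contribution of topics labeled exactly 0, 1, ..., 4.
    exact = [probs[labels == k].sum() for k in range(5)]
    # Five features: total contribution of topics labeled >= 1, >= 2, ..., >= 5.
    at_least = [probs[labels >= k].sum() for k in range(1, 6)]
    return exact + at_least
```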

73

5. Predicted Thesis Clarity Errors

• In previous work on thesis clarity essay scoring (Persing & Ng, 2013), we

– score an essay wrt the clarity of its thesis

– determine which type(s) of errors an essay contains that detract from the clarity of its thesis

74–75

5 Types of Thesis Clarity Errors

• Confusing Phrasing

• Missing Details

• Writer Position

• Incomplete Prompt Response

• Relevance to Prompt

76–78

Features based on Error Types

• Introduced features for prompt adherence scoring that encode the error types an essay contains

– Though each essay was manually annotated with the errors it contains, in a realistic setting we won't have access to these manual annotations

• Predict which of the 5 error types an essay contains

– Recast as a multi-label classification task

79

Creating Predicted "Error Type" Features

• Add a binary feature indicating the presence or absence of each error type
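One plausible realization of this multi-label prediction step, assuming an independent binary one-vs-rest classifier per error type (scikit-learn is shown for concreteness; the data arrays are stand-ins):

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Stand-in data: essay feature vectors and 0/1 annotations for the
# 5 thesis clarity error types (one column per error type).
X_train = np.random.rand(100, 20)
Y_train = np.random.randint(0, 2, size=(100, 5))
X_new = np.random.rand(10, 20)

# Multi-label classification: one binary classifier per error type.
clf = OneVsRestClassifier(LinearSVC()).fit(X_train, Y_train)

# Each column of the prediction matrix becomes one binary
# "predicted error type" feature for prompt adherence scoring.
error_type_features = clf.predict(X_new)   # shape (10, 5), entries 0/1
```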

80

6. N-gram features

• Can capture useful words and phrases related to a prompt

• 10K lemmatized unigrams, bigrams, and trigrams

– selected according to information gain
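A sketch of the selection step, using mutual information with the discrete score label as the information-gain criterion; the corpus variables are placeholders:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import mutual_info_classif

# Placeholder corpus: lemmatized essay texts and their adherence scores,
# treated as discrete class labels (7 possible values).
essays = ["lemmatized essay text ...", "another lemmatized essay ..."]
labels = ["3.0", "2.5"]

# Count unigrams, bigrams, and trigrams over the lemmatized text.
vec = CountVectorizer(ngram_range=(1, 3))
X = vec.fit_transform(essays)

# Rank n-grams by information gain and keep the top 10K.
gain = mutual_info_classif(X, labels, discrete_features=True)
top = np.argsort(gain)[::-1][:10000]
selected_ngrams = np.array(vec.get_feature_names_out())[top]
```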

81

Summary of Features

• Baseline features

– Random indexing features

• Six types of features

– Lemmatized n-grams

– Thesis clarity keyword features

– Prompt adherence keyword features

– LDA topics

– Manually annotated LDA topics

– Predicted thesis clarity errors

82

Plan for the Talk

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

83

Evaluation

• Goal: evaluate our system for prompt adherence scoring

• 5-fold cross validation

84–93

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei:

– S1 (probability of error): probability that a system predicts the wrong score out of 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)

– S2 (average absolute error): distinguishes near misses from far misses

– S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores

– PC: Pearson's correlation coefficient between Ai and Ei

• S1, S2, S3: error metrics, so a smaller value is better

• PC: a correlation coefficient, so a larger value is better
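The slides present these metrics as formulas. Written out, with Ai the annotated score and Ei the estimated score of essay i, Ei' the estimate rounded to the nearest of the 7 possible scores, and N essays, a formulation consistent with the descriptions above is:

```latex
S_1 = \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}\!\left[A_i \neq E_i'\right], \qquad
S_2 = \frac{1}{N}\sum_{i=1}^{N} \left|A_i - E_i\right|, \qquad
S_3 = \frac{1}{N}\sum_{i=1}^{N} \left(A_i - E_i\right)^2
```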

94

Regressor Training

• SVM regressors are trained to maximize performance wrt each scoring metric by tuning the regularization parameter on held-out development data
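A minimal sketch of this tuning loop, assuming scikit-learn's SVR as the regressor and a small grid over the regularization parameter C; metric_fn stands for one of S1, S2, S3, or PC:

```python
from sklearn.svm import SVR

def tune_regressor(X_tr, y_tr, X_dev, y_dev, metric_fn, smaller_is_better=True):
    """Pick the C that optimizes one scoring metric on held-out dev data."""
    best = None
    for C in [0.01, 0.1, 1.0, 10.0, 100.0]:
        model = SVR(kernel="linear", C=C).fit(X_tr, y_tr)
        score = metric_fn(y_dev, model.predict(X_dev))
        # Update when the dev score improves in the metric's direction.
        if best is None or (score < best[0]) == smaller_is_better:
            best = (score, C, model)
    return best  # (dev score, chosen C, trained model)
```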

95–103

Results

System       S1     S2     S3     PC
Baseline    .517   .368   .234   .233
Our system  .488   .348   .197   .360

(Baseline = RI features; S1 = probability of error, S2 = average absolute error, S3 = average squared error, PC = Pearson's correlation coefficient)

• Improvements wrt all four scoring metrics

• Significant differences

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance wrt each scoring metric

– Train a regressor on all but one type of features
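A sketch of the ablation loop under the same assumptions as the training snippet above (feature matrices grouped by type in dicts; the names and helpers are hypothetical):

```python
import numpy as np
from sklearn.svm import SVR

def ablation(train_blocks, y_train, dev_blocks, y_dev, metric_fn):
    """Leave-one-feature-type-out ablation.

    train_blocks / dev_blocks -- dicts mapping a feature-type name
    (e.g. 'RI', 'N-grams', 'PA keywords') to its feature matrix.
    """
    results = {}
    for held_out in train_blocks:
        X_tr = np.hstack([m for n, m in train_blocks.items() if n != held_out])
        X_dev = np.hstack([m for n, m in dev_blocks.items() if n != held_out])
        model = SVR(kernel="linear").fit(X_tr, y_train)
        # The bigger the performance drop without this feature type,
        # the more important the feature type is.
        results[held_out] = metric_fn(y_dev, model.predict(X_dev))
    return results
```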

105–109

Feature Ablation Results

[Ablation chart: for each scoring metric (S1, S2, S3, PC), the seven feature types (PA keywords, TC keywords, N-grams, annotated LDA topics, unlabeled LDA topics, predicted TC errors, RI) ordered from most to least important]

110–111

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But… there are features that tend to be more important than the others in the presence of other features

112–118

• most important: TC keywords, n-grams, annotated LDA topics

• of middling importance: RI, unlabeled LDA topics

• least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays

– feature-rich approach

• Released the annotations

– prompt adherence scores

– prompt adherence keywords

– manually annotated LDA topics

Page 69: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

69

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 70: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

70

How to create Manually Annotated

LDA Features

1 For each set of essays written for a given prompt build an

LDA of 100 topics

2 For each topic inspect its top 10 words and hand-annotate

it with a number from 0 to 5 representing how likely it is

that the topic is adherent to the prompt

bull higher score more adherent

3 For each essay create 10 features from the labeled topics

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 71: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

71

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

[Bar charts, one per scoring metric (S1, S2, S3, PC), ranking the
feature types – PA keywords, TC keywords, N-grams, Annotated LDA
topics, Unlabeled LDA topics, RI, Predicted TC errors – from most
important to least important]

• Relative importance of features does not always remain consistent
  if we measure performance in different ways
• But… there are features that tend to be more important than the
  others in the presence of other features
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 72: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

72

10 Features from the labeled topics

bull Five features encode the sum of contributions to an essay

of topics annotated with the number 0 1 hellip 4 resp

bull Five features encode the sum of contributions to an essay

of topics annotated with a number ge 1 ge 2 hellip ge 5 resp

bull These features should give the regressor a better idea of

how much of an essay is composed of prompt-related vs

prompt-unrelated discussions

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 73: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

73

5 Predicted Thesis Clarity Errors

bull In previous work on thesis clarity essay scoring

(Persing amp Ng 2013) we

ndash score an essay wrt the clarity of its thesis

ndash determine which type(s) of errors an essay contains that

detract from the clarity of its thesis

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 74: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

74

5 Types of Thesis Clarity Errors

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 75: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

75

bull Confusing Phrasing

bull Missing Details

bull Writer Position

bull Incomplete Prompt Response

bull Relevance to Prompt

5 Types of Thesis Clarity Errors

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

Feature types ordered from most important (left) to least important (right) for each metric:

S1: TC keywords, unlabeled LDA topics, RI, predicted TC errors, N-grams, annotated LDA topics, PA keywords
S2: N-grams, annotated LDA topics, TC keywords, RI, PA keywords, unlabeled LDA topics, predicted TC errors
S3: N-grams, annotated LDA topics, RI, unlabeled LDA topics, PA keywords, predicted TC errors, TC keywords
PC: annotated LDA topics, TC keywords, N-grams, RI, predicted TC errors, unlabeled LDA topics, PA keywords

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But… there are feature types that tend to be more important than the others in the presence of other features

  – most important: TC keywords, n-grams, annotated LDA topics

  – of middling importance: RI, unlabeled LDA topics

  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays

  – feature-rich approach

• Released the annotations

  – prompt adherence scores

  – prompt adherence keywords

  – manually annotated LDA topics

Page 76: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

76

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 77: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

77

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 78: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

78

Features based on Error Types

bull Introduced features for prompt adherence scoring

that encode the error types an essay contains

ndash Though each essay was manually annotated with the

errors it contains in a realistic setting we wonrsquot have

access to these manual annotations

bull Predict which of the 5 error types an essay contains

ndash Recast as a multi-label classification task

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 79: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

79

Creating Predicted ldquoError Typerdquo Features

bull Add a binary feature indicating the presence or

absence of each error type

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

[Chart: for each scoring metric (S1, S2, S3, PC), the seven feature types — TC keywords, PA keywords, N-grams, unlabeled LDA topics, annotated LDA topics, RI, and predicted TC errors — ranked from most important to least important; the ranking differs from metric to metric.]

• The relative importance of the features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
  – most important: TC keywords, n-grams, annotated LDA topics
  – middling importance: RI, unlabeled LDA topics
  – least important: predicted TC errors, PA keywords

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 80: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

80

6 N-gram features

bull Can capture useful words and phrases related to a

prompt

bull 10K lemmatized unigrams bigrams and trigrams

ndash selected according to information gain

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 81: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

81

Summary of Features

bull Baseline features

ndash Random indexing features

bull Six types of features

ndash Lemmatized n-grams

ndash Thesis clarity keyword features

ndash Prompt adherence keyword features

ndash LDA topics

ndash Manually annotated LDA topics

ndash Predicted thesis clarity errors

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 82: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

82

Corpus and Annotations

Model for scoring prompt adherence

Evaluation

Plan for the Talk

82

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 83: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

83

Evaluation

bull Goal evaluate our system for prompt

adherence scoring

bull 5-fold cross validation

84

Scoring Metrics

bull Define 4 evaluation metrics

84

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 84: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

84

Scoring Metrics

• Define 4 evaluation metrics over the annotated scores Ai and the estimated scores Ei:

  – S1 (probability of error): the probability that a system predicts the wrong score out of the 7 possible scores (1, 1.5, 2, 2.5, 3, 3.5, 4)

  – S2 (average absolute error): distinguishes near misses from far misses

  – S3 (average squared error): prefers systems whose estimations are not too often far away from the correct scores

  – PC: Pearson's correlation coefficient between Ai and Ei

• S1, S2, S3 are error metrics: a smaller value is better

• PC is a correlation coefficient: a larger value is better
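
Written out, one formalization consistent with the glosses above (Ai = annotated score, Ei = estimated score, N = number of essays; rounding Ei to the nearest of the seven valid scores for S1 is an assumption about how the regressor's real-valued output is discretized):

```latex
\begin{align*}
S1 &= \frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\!\left[A_i \neq \operatorname{round}(E_i)\right]
      && \text{(probability of error)}\\
S2 &= \frac{1}{N}\sum_{i=1}^{N}\lvert A_i - E_i\rvert
      && \text{(average absolute error)}\\
S3 &= \frac{1}{N}\sum_{i=1}^{N}\left(A_i - E_i\right)^2
      && \text{(average squared error)}\\
PC &= \frac{\sum_{i}(A_i-\bar{A})(E_i-\bar{E})}
           {\sqrt{\sum_{i}(A_i-\bar{A})^2}\,\sqrt{\sum_{i}(E_i-\bar{E})^2}}
      && \text{(Pearson's correlation)}
\end{align*}
```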

94

Regressor Training

• SVM regressors are trained to maximize performance w.r.t. each scoring metric by tuning the regularization parameter on held-out development data
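
The talk does not name the SVM toolkit or the candidate parameter values, so the following is a minimal sketch of this tuning loop, assuming scikit-learn's SVR and a hypothetical grid over the regularization parameter C:

```python
from sklearn.svm import SVR

def tune_regularization(X_train, y_train, X_dev, y_dev, metric):
    """Keep the SVM regressor whose regularization parameter C gives the
    best held-out performance w.r.t. `metric`.  `metric(gold, pred)` must
    return a value where smaller is better; negate Pearson's correlation
    when tuning for PC."""
    best_model, best_err = None, float("inf")
    for C in (0.01, 0.1, 1.0, 10.0, 100.0):  # hypothetical grid
        model = SVR(kernel="linear", C=C).fit(X_train, y_train)
        err = metric(y_dev, model.predict(X_dev))
        if err < best_err:
            best_model, best_err = model, err
    return best_model
```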

95

Results

System                       S1     S2     S3     PC
Baseline                    .517   .368   .234   .233
+ Misspelling feature       .654   .515   .402
+ Keyword features          .663   .490   .369
+ Aggregated word n-grams   .651   .484   .374
+ Aggregated POS n-grams    .671   .483   .377
+ Aggregated frames         .672   .486   .382
Our system                  .488   .348   .197   .360

(Column callouts: S1 = probability of error, S2 = average absolute error, S3 = average squared error, PC = Pearson's correlation coefficient. Baseline = RI features.)

• Improvements w.r.t. all four scoring metrics

• Significant differences
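
A small self-contained helper (assuming NumPy and SciPy) that computes the four metrics from annotated and estimated scores; snapping each estimate to the nearest of the seven valid scores before computing S1 follows the gloss given with the metric definitions:

```python
import numpy as np
from scipy.stats import pearsonr

VALID_SCORES = np.arange(1.0, 4.5, 0.5)  # the 7 scores: 1, 1.5, ..., 4

def scoring_metrics(gold, pred):
    """Return (S1, S2, S3, PC) for annotated scores `gold` and estimated
    scores `pred`, both 1-D arrays of equal length."""
    gold = np.asarray(gold, dtype=float)
    pred = np.asarray(pred, dtype=float)
    # Snap each real-valued estimate to the nearest valid score for S1.
    rounded = VALID_SCORES[np.abs(pred[:, None] - VALID_SCORES).argmin(axis=1)]
    s1 = float(np.mean(rounded != gold))       # probability of error
    s2 = float(np.mean(np.abs(gold - pred)))   # average absolute error
    s3 = float(np.mean((gold - pred) ** 2))    # average squared error
    pc = float(pearsonr(gold, pred)[0])        # Pearson's correlation
    return s1, s2, s3, pc
```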

104

Feature Ablation

• Goal: examine how much impact each of the feature types has on our system's performance w.r.t. each scoring metric

  – Train a regressor on all but one type of features (see the sketch below)
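
A sketch of that ablation loop, reusing the hypothetical tune_regularization helper above; the feature-block dictionary layout and the printed deltas are illustrative assumptions, not the authors' code:

```python
import numpy as np

def ablation_study(feature_blocks, y_train, y_dev, metric):
    """`feature_blocks` maps a feature-type name to a pair
    (train matrix, dev matrix).  Each feature type is ablated in turn:
    train on all *other* types and measure the held-out metric; the
    bigger the degradation relative to the full model, the more
    important the ablated feature type."""
    def stacked(excluded=None):
        names = [n for n in feature_blocks if n != excluded]
        train = np.hstack([feature_blocks[n][0] for n in names])
        dev = np.hstack([feature_blocks[n][1] for n in names])
        return train, dev

    X_train, X_dev = stacked()
    full = tune_regularization(X_train, y_train, X_dev, y_dev, metric)
    full_err = metric(y_dev, full.predict(X_dev))
    for name in feature_blocks:
        X_train, X_dev = stacked(excluded=name)
        model = tune_regularization(X_train, y_train, X_dev, y_dev, metric)
        err = metric(y_dev, model.predict(X_dev))
        print(f"without {name}: {err:.3f} (delta {err - full_err:+.3f})")
```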

105

Feature Ablation Results

[Figure, slides 105–118: bar charts ranking the feature types from most important to least important under each scoring metric (S1, S2, S3, PC). Feature types shown: TC keywords, N-grams, annotated LDA topics, RI, unlabeled LDA topics, predicted TC errors, PA keywords; the ranking differs from metric to metric.]

• Relative importance of features does not always remain consistent if we measure performance in different ways

• But… there are features that tend to be more important than the others in the presence of other features:

  – most important: TC keywords, n-grams, annotated LDA topics

  – middling importance: RI, unlabeled LDA topics

  – least important: predicted TC errors, PA keywords

119

Summary

• Examined the problem of prompt adherence scoring in student essays

  – feature-rich approach

• Released the annotations

  – prompt adherence scores

  – prompt adherence keywords

  – manually annotated LDA topics

Page 85: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

85

bull Define 4 evaluation metrics

85

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)

Scoring Metrics

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 86: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

86

bull Define 4 evaluation metrics

86

probability that a system

predicts the wrong score out

of 7 possible scores (1 15 2

25 3 35 4)annotated scores

estimated scores

Scoring Metrics

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 87: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

87

Scoring Metrics

bull Define 4 evaluation metrics

87

(probability of error)

average absolute error

annotated scores

estimated scores

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 88: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

88

Scoring Metrics

bull Define 4 evaluation metrics

88

(probability of error)

distinguishes near misses from

far misses

annotated scores

estimated scores

89

Scoring Metrics

bull Define 4 evaluation metrics

89

(probability of error)

(average absolute error)

average squared error

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI


116

• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI, unlabeled LDA topics


117

• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI, unlabeled LDA topics
• least important: predicted TC errors


118

• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI, unlabeled LDA topics
• least important: predicted TC errors, PA keywords
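
The rankings above come from the ablation protocol on the Feature Ablation slide: train a regressor on all but one type of features, then score it under each metric. A rough sketch of that loop follows, under stated assumptions: scikit-learn's SVR stands in for the talk's SVM regressor (no implementation is named), the features argument is a hypothetical dict mapping each feature type to a pair of dense (train, dev) matrices, score_metrics is the helper sketched earlier, and the per-metric search over the regularization parameter C follows the Regressor Training slide's tuning on held-out development data.

import numpy as np
from sklearn.svm import SVR  # assumption: stand-in for the talk's SVM regressor

FEATURE_TYPES = ["PA keywords", "TC keywords", "annotated LDA topics",
                 "unlabeled LDA topics", "N-grams", "RI", "predicted TC errors"]

def ablate(features, y_train, y_dev, Cs=(0.01, 0.1, 1.0, 10.0)):
    # features: {feature type: (X_train_block, X_dev_block)} -- hypothetical layout.
    results = {}
    for held_out in FEATURE_TYPES:
        kept = [t for t in FEATURE_TYPES if t != held_out]
        X_train = np.hstack([features[t][0] for t in kept])
        X_dev = np.hstack([features[t][1] for t in kept])
        # Tune C separately for each metric on development data, keeping
        # the best value per metric (S1-S3 are minimized, PC is maximized).
        best = {"S1": np.inf, "S2": np.inf, "S3": np.inf, "PC": -np.inf}
        for C in Cs:
            E = SVR(kernel="linear", C=C).fit(X_train, y_train).predict(X_dev)
            S1, S2, S3, PC = score_metrics(y_dev, E)
            best["S1"], best["S2"] = min(best["S1"], S1), min(best["S2"], S2)
            best["S3"], best["PC"] = min(best["S3"], S3), max(best["PC"], PC)
        results[held_out] = best
    # The more a metric degrades relative to the full feature set, the more
    # important the held-out feature type is with respect to that metric.
    return results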


119

Summary
• Examined the problem of prompt adherence scoring in student essays
– feature-rich approach
• Released the annotations
– prompt adherence scores
– prompt adherence keywords
– manually annotated LDA topics


Page 90: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

90

Scoring Metrics

bull Define 4 evaluation metrics

90

(probability of error)

(average absolute error)

prefer systems whose

estimations are not too often

far away from correct scores

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 91: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

91

Scoring Metrics

bull Define 4 evaluation metrics

91

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 92: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

92

Scoring Metrics

bull Define 4 evaluation metrics

92

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 93: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

93

Scoring Metrics

bull Define 4 evaluation metrics

93

(probability of error)

(average absolute error)

(average squared error)

PC Pearsonrsquos correlation coefficient between Ai and Ei

S1 S2 S3

bull error metrics

bull smaller value is better

PC

bull correlation coefficient

bull larger value is better

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features


118

• most important: TC keywords, n-grams, annotated LDA topics
• middling importance: RI, unlabeled LDA topics
• least important: predicted TC errors, PA keywords
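These rankings come from the ablation setup (train a regressor on all but one type of features, then compare against the full system). A hedged sketch of that loop, reusing score_metrics from above; the feature-group dictionaries and scikit-learn's SVR are illustrative assumptions, and the fixed C stands in for the per-metric tuning of the regularization parameter on held-out development data:

import numpy as np
from sklearn.svm import SVR

FEATURE_TYPES = ["PA keywords", "TC keywords", "N-grams", "RI",
                 "annotated LDA topics", "unlabeled LDA topics",
                 "predicted TC errors"]

def ablate(train_groups, y_train, dev_groups, y_dev, C=1.0):
    # train_groups / dev_groups: hypothetical dicts mapping a feature-type
    # name to its (n_essays x n_features) matrix.
    results = {}
    for held_out in [None] + FEATURE_TYPES:
        keep = [t for t in FEATURE_TYPES if t != held_out]
        X_tr = np.hstack([train_groups[t] for t in keep])
        X_dv = np.hstack([dev_groups[t] for t in keep])
        pred = SVR(C=C).fit(X_tr, y_train).predict(X_dv)
        results[held_out or "full system"] = score_metrics(y_dev, pred)
    # A feature type matters more the more S1/S2/S3 rise (and PC falls)
    # relative to the full system when that type is held out.
    return results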


119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 94: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

94

Regressor Training

bull SVM regressors are trained to maximize performance

wrt each scoring metric by tuning the regularization

parameter on held-out development data

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 95: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

95

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

95

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 96: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

96

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

96

RI features

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 97: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

97

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

97

probability of error

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 98: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

98

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

98

probability of error

average absolute

error

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 99: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

99

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

99

probability of error

average absolute

error

average squared

error

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110–118

(the ablation figure repeats on each of these slides)

• Relative importance of features does not always remain consistent if we measure performance in different ways
• But… there are features that tend to be more important than the others in the presence of other features:
• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI, unlabeled LDA topics
• least important: predicted TC errors, PA keywords
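The per-metric rankings shown in the figure can be derived from the ablation numbers: the more a metric degrades when one feature type is held out, the more important that type is. A minimal sketch under the same assumptions as above (rank_feature_types is a hypothetical helper; for the error metrics S1, S2, S3 lower is better, while for PC higher is better).

def rank_feature_types(full_scores, ablation_scores, metric, higher_is_better=False):
    # full_scores: metrics of the full system, e.g. {"S1": 0.488, ...}
    # ablation_scores: output of ablation(), {held_out_type: {metric: value}}
    sign = -1.0 if higher_is_better else 1.0
    deltas = {t: sign * (scores[metric] - full_scores[metric])
              for t, scores in ablation_scores.items()}
    # largest degradation first = most important feature type
    return sorted(deltas, key=deltas.get, reverse=True)

# Example: rank by Pearson correlation (PC), where higher is better
# ranking = rank_feature_types(full, ablated, "PC", higher_is_better=True)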

119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics

Page 100: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

100

Results

233

PCSystem S1 S2 S3

Baseline 517 368 234

+ Misspelling

feature

654 515 402

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

100

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 101: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

101

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

101

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 102: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

102

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

102

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

bull Improvements wrt all four scoring metrics

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 103: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

103

bull Improvements wrt all four scoring metricsbull Improvements wrt all four scoring metrics

Results

360

233

PCSystem S1 S2 S3

Baseline 517 368 234

Our system 488 348 197

+ Keyword

features

663 490 369

+ Aggregated

word n-grams

651 484 374

+ Aggregated

POS n-grams

671 483 377

+ Aggregated

frames

672 486 382

103

probability of error

average absolute

error

average squared

error

Pearsonrsquos

τ

Significant differences

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 104: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

104

Feature Ablation

bull Goal examine how much impact each of the feature

types has on our systemrsquos performance wrt each

scoring metric

ndash Train a regressor on all but one type of features

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 105: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

105

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

114

bull most important TC keywords n-grams annotated LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

115

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

116

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

117

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

118

bull most important TC keywords n-grams annotated LDA topics

bull middling important RI unlabeled LDA topics

bull least important predicted TC errors PA keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

119

Summary

bull Examined the problem of prompt adherence

scoring in student essays

ndash feature-rich approach

bull Released the annotations

ndash prompt adherence scores

ndash prompt adherence keywords

ndash manually annotated LDA topics

Page 106: Modeling Prompt Adherence in Student Essays - HLTRIvince/talks/acl14-essay.pdf · Modeling Prompt Adherence in Student Essays Isaac Persing and Vincent Ng Human Language Technology

106

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-gramsS2

107

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

S2

S3

108

Feature Ablation Results

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

109

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

110

bull Relative importance of features does not always remain

consistent if we measure performance in different ways

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

111

bull Buthellip there are feature that tend to be more important than

the others in the presence of other features

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

112

bull most important TC keywords

PA

keywords

Annotated

LDA topicsN-grams

Predicted

TC errorsRI

Unlabeled

LDA topics

TC

keywordsS1

Predicted

TC errors

Unlabeled

LDA topics

PA

keywordsRI

TC

keywords

Annotated

LDA topicsN-grams

TC

keywords

Predicted

TC errors

PA

keywords

Unlabeled

LDA topicsRI

Annotated

LDA topicsN-grams

PA

keywords

Unlabeled

LDA topics

Predicted

TC errorsRIN-grams

TC

keywords

Annotated

LDA topics

S2

S3

PC

Most important

Least important

113

bull most important TC keywords n-grams


114

• most important: TC keywords, n-grams, annotated LDA topics
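For the LDA-topic features, the natural feature vector is each essay's topic distribution; "annotated" topics additionally carry a manual judgment of how related each topic is to the prompt. A minimal gensim sketch; the toy corpus and topic count are placeholders, not the talk's setup:

from gensim import corpora, models

essays = [["smoking", "ban", "restaurants", "health"],
          ["university", "degrees", "theoretical", "knowledge"],
          ["smoking", "public", "places", "ban"]]

dictionary = corpora.Dictionary(essays)
bows = [dictionary.doc2bow(e) for e in essays]

# Unlabeled LDA topics: induce topics over the essays and use each
# essay's topic distribution directly as features.
lda = models.LdaModel(bows, num_topics=2, id2word=dictionary,
                      random_state=0, passes=10)
features = [lda.get_document_topics(bow, minimum_probability=0.0)
            for bow in bows]
print(features[0])  # [(topic_id, probability), ...] for the first essay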


115

• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI


116

• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI, unlabeled LDA topics


117

• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI, unlabeled LDA topics
• least important: predicted TC errors


118

• most important: TC keywords, n-grams, annotated LDA topics
• middling important: RI, unlabeled LDA topics
• least important: predicted TC errors, PA keywords


119

Summary

• Examined the problem of prompt adherence scoring in student essays
  – feature-rich approach
• Released the annotations
  – prompt adherence scores
  – prompt adherence keywords
  – manually annotated LDA topics
