Mining from Open Answers in Questionnaire Data

Upload: feiwin

Post on 04-Jul-2015

Page 1: Mining from Open Answers in Questionnaire Data

Mining from Open Answers in Questionnaire Data

KDD 01, San Francisco, CA, USA

Copyright ACM 2001

Hang Li*, NEC Corporation

[email protected]

Kenji Yamanishi, NEC Corporation

[email protected]

Page 2: Mining from Open Answers in Questionnaire Data

Agenda

• Analysis of open answers
• Rule Analysis
  – Classification Rules
  – Association Rules
  – Algorithm
• Correspondence Analysis
• Mining Result

Page 3: Mining from Open Answers in Questionnaire Data

Presented by Joyce Chen

Analysis of open answers

• Automatically summarize open answers.
• Automatically mine useful information from open answers.
• Survey Analyzer (SA): a system for analyzing open answers.
• Two statistical learning methods:
  – Rule learning (rule analysis)
  – Correspondence analysis

Page 4: Mining from Open Answers in Questionnaire Data

Rule Analysis – Classification Rules

• Given: a number of categories, each containing a number of texts.
• Automatically acquire rules from the categorized texts.
• Classify new texts on the basis of the acquired rules.
• In SA:
  – View each analysis target as a category.
  – View the open answers associated with the target as texts.
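As a rough illustration of this setup, the Python sketch below treats each analysis target as a category and applies word-based rules to new open answers. The rule format (a word paired with a target, tried in order, with a default), the function name `classify`, and the sample rules and answers are all hypothetical; the slides do not say how SA represents its rules.

```python
from typing import List, Tuple

# Hypothetical rule format: (word, target). Rules are tried in order.
Rule = Tuple[str, str]

def classify(answer: str, rules: List[Rule], default: str) -> str:
    """Return the target of the first rule whose word occurs in the answer."""
    words = set(answer.lower().split())
    for word, target in rules:
        if word in words:
            return target
    return default

# Invented example rules and answers, for illustration only.
rules = [("quiet", "car_A"), ("sporty", "car_B")]
print(classify("The cabin is very quiet", rules, default="other"))
print(classify("Looks sporty and fast", rules, default="other"))
print(classify("No particular opinion", rules, default="other"))
```

An ordered rule list with a default is only one possible design; it keeps the classification of a new answer easy to trace back to a single rule.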

Page 5: Mining from Open Answers in Questionnaire Data

Rule Analysis (Cont.)

Page 6: Mining from Open Answers in Questionnaire Data

Rule Analysis – Algorithm (SC)

SA learns classification rules or association rules using Stochastic Complexity (SC).

• Based on the MDL (Minimum Description Length) principle.
• Rectangles (in the figure): 10 open answers
• Analysis target: T
• Some answers contain a specific word: W
• If ΔSC > 0, the model that takes W into account is considered the one most likely to have given rise to the data.
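The ΔSC criterion can be sketched in Python as follows. This is a minimal sketch that assumes the standard asymptotic approximation of stochastic complexity for a Bernoulli model, SC ≈ m·H(m⁺/m) + ½·log₂(mπ/2), where m is the number of answers and m⁺ the number associated with target T. Whether SA uses exactly this form is an assumption, and the function names and counts are invented for illustration.

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable; 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def stochastic_complexity(m: int, m_pos: int) -> float:
    """Approximate SC (bits) of m binary labels, m_pos of them positive."""
    if m == 0:
        return 0.0
    return m * binary_entropy(m_pos / m) + 0.5 * math.log2(m * math.pi / 2.0)

def delta_sc(n: int, n_t: int, n_w: int, n_w_t: int) -> float:
    """Per-answer reduction in SC from splitting the answers on word W.

    n     : total number of answers
    n_t   : answers associated with target T
    n_w   : answers containing word W
    n_w_t : answers containing W and associated with T
    """
    whole = stochastic_complexity(n, n_t)
    split = (stochastic_complexity(n_w, n_w_t)
             + stochastic_complexity(n - n_w, n_t - n_w_t))
    return (whole - split) / n

# Invented counts: 10 answers, 5 of them for target T, 4 containing W.
print(delta_sc(n=10, n_t=5, n_w=4, n_w_t=4))  # W occurs only with T
print(delta_sc(n=10, n_t=5, n_w=4, n_w_t=2))  # W is unrelated to T
```

With these counts the first call gives a positive value and the second a negative one: splitting on W shortens the description of the data only when W actually helps predict T.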

Page 7: Mining from Open Answers in Questionnaire Data

Correspondence Analysis

Page 8: Mining from Open Answers in Questionnaire Data

Relationship between Rule analysis and Correspondence analysis

Rule analysis
• Employs a conditional probability model: P(Y|X)
• Provides the facts in detail (Tables 2 and 3).

Correspondence analysis
• Employs a joint probability model: P(Y, X)
• Yields the entire structure (position map).

Y: analysis target
X: words
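The difference between the two models can be made concrete with a small Python sketch that derives both P(Y, X) and P(Y|X) from the same table of co-occurrence counts. The targets, words, and counts are invented for illustration. The sketch only contrasts the two probability tables; it does not perform correspondence analysis itself, which goes on to derive a low-dimensional position map.

```python
from collections import defaultdict

# Invented counts of (target Y, word X) co-occurrences.
counts = {
    ("car_A", "quiet"): 6, ("car_A", "sporty"): 1,
    ("car_B", "quiet"): 2, ("car_B", "sporty"): 7,
}

# Joint model P(Y, X): normalize by the grand total.
total = sum(counts.values())
joint = {(y, x): c / total for (y, x), c in counts.items()}

# Conditional model P(Y | X): normalize within each word X.
word_totals = defaultdict(int)
for (y, x), c in counts.items():
    word_totals[x] += c
conditional = {(y, x): c / word_totals[x] for (y, x), c in counts.items()}

print(joint[("car_A", "quiet")])        # 6 / 16
print(conditional[("car_A", "quiet")])  # 6 / 8
```

The joint table sums to 1 over all (Y, X) pairs and so describes the whole data set at once, while the conditional table sums to 1 for each word and answers the narrower question of which target a given word points to.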

Page 9: Mining from Open Answers in Questionnaire Data

Mining result

• With Car Data
• With Eye-drop Data
• With Beverage Data

Page 10: Mining from Open Answers in Questionnaire Data

Advantage of the mining system

• It is a much faster and less costly way to summarize or mine from open answers.
• SA is the first system that can perform rule analysis and correspondence analysis.
• It uses a new statistical learning methodology based on Stochastic Complexity.
• SA has been used successfully in mining various types of questionnaire data.