mining from open answers in questionnaire data

Post on 04-Jul-2015


Mining from Open Answers in Questionnaire Data

KDD 01 San Francisco CA USA

Copyright ACM 2001

Hang Li*
NEC Corporation

lihang@ccm.cl.nec.co.jp

Kenji Yamanishi
NEC Corporation

k-yamanishi@cw.jp.nec.com

Agenda

• Analysis of open answers
• Rule Analysis
  – Classification Rules
  – Association Rules
  – Algorithm
• Correspondence Analysis
• Mining Result

Presented by Joyce Chen

Analysis of open answers

• Automatically summarize open answers.
• Automatically mine useful information from open answers.
• Survey Analyzer (SA): a system to analyze open answers.

Two statistical learning methods:
• Rule learning (Rule analysis)
• Correspondence Analysis


Rule Analysis – Classification Rules

• A number of categories, each containing a number of texts.
• Automatically acquire rules from the categorized texts.
• Classify new texts on the basis of the acquired rules.

In SA:
• View each analysis target as a category.
• View the open answers associated with the target as texts.
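The acquire-then-classify loop above can be sketched in a few lines of Python. This is only an illustration, not SA's method: it selects word rules with a simple precision threshold instead of Stochastic Complexity, and the names `learn_rules`, `classify`, `min_precision`, `min_count` and the sample answers are all hypothetical.

```python
from collections import Counter

def learn_rules(texts_by_target, min_precision=0.8, min_count=2):
    """Learn (word, target, precision) rules; most precise rules come first.

    Assumption: a precision threshold stands in for SA's actual
    rule-selection criterion.
    """
    word_total = Counter()
    word_by_target = {t: Counter() for t in texts_by_target}
    for target, texts in texts_by_target.items():
        for text in texts:
            # Count each word at most once per answer.
            for w in set(text.lower().split()):
                word_total[w] += 1
                word_by_target[target][w] += 1
    rules = []
    for target, counts in word_by_target.items():
        for w, c in counts.items():
            precision = c / word_total[w]
            if c >= min_count and precision >= min_precision:
                rules.append((w, target, precision))
    rules.sort(key=lambda r: (-r[2], r[0]))
    return rules

def classify(text, rules, default=None):
    """Return the target of the first rule whose word occurs in the text."""
    words = set(text.lower().split())
    for w, target, _ in rules:
        if w in words:
            return target
    return default

# Hypothetical open answers, grouped by analysis target (category).
answers = {
    "car_A": ["quiet engine and smooth ride", "smooth ride on the highway", "nice design"],
    "car_B": ["cheap price", "price is low and design is nice", "low price"],
}
rules = learn_rules(answers)
predicted = classify("very smooth ride", rules)
```

Words that appear in both categories (such as "design") fail the precision threshold and yield no rule, so a new answer containing only such words falls through to `default`.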


Rule Analysis (Cont.)


Rule Analysis – Algorithm (SC)

SA learns classification rules or association rules using Stochastic Complexity (SC), following the MDL (Minimum Description Length) principle.

Example:
• Rectangles: 10 open answers
• Analysis target: T
• Some answers contain a specific word: W
• If ΔSC > 0 (positive), the model that uses W is the one most likely to have given rise to the data.
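To make the ΔSC test concrete, here is a minimal sketch for a 10-answer example like the one on the slide. It assumes a commonly used approximation of stochastic complexity for binary data, SC(n, k) ≈ n·H(k/n) + ½·log₂(nπ/2), and defines ΔSC as the per-answer reduction in code length obtained by splitting the answers on the word W. Both the formula and the data are assumptions for illustration; the slides do not spell them out.

```python
import math

def entropy(p):
    """Binary entropy in bits; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def stochastic_complexity(n, k):
    """Approximate code length (bits) of n binary labels with k positives.

    Assumed approximation: n * H(k/n) + 0.5 * log2(n * pi / 2).
    """
    if n == 0:
        return 0.0
    return n * entropy(k / n) + 0.5 * math.log2(n * math.pi / 2)

def delta_sc(labels, has_word):
    """Per-answer saving in code length from splitting the answers on a word.

    labels[i]   = 1 if answer i belongs to the analysis target T, else 0
    has_word[i] = 1 if answer i contains the word W, else 0
    """
    m = len(labels)
    with_w = [y for y, w in zip(labels, has_word) if w]
    without_w = [y for y, w in zip(labels, has_word) if not w]
    whole = stochastic_complexity(m, sum(labels))
    split = (stochastic_complexity(len(with_w), sum(with_w))
             + stochastic_complexity(len(without_w), sum(without_w)))
    return (whole - split) / m

# Hypothetical data: 10 open answers, the first 5 about target T.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
informative = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]    # W occurs only in answers about T
uninformative = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]  # W is spread across both groups

gain_informative = delta_sc(labels, informative)
gain_uninformative = delta_sc(labels, uninformative)
```

Under these assumptions the informative word gives a positive ΔSC, while the evenly spread word gives a negative one: the extra cost of describing two groups outweighs the small gain in fit.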


Correspondence Analysis


Relationship between Rule analysis and Correspondence analysis

Rule analysis
• Employs a conditional probability model: P(Y|X)
• Provides the facts in detail (Tables 2, 3).

Correspondence analysis
• Employs a joint probability model: P(Y, X)
• Yields the entire structure (position map).

Y: analysis target
X: words
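The difference between the two models can be seen by estimating both from the same (target, word) co-occurrence counts. The sketch below uses plain relative frequencies; the function name `joint_and_conditional` and the sample pairs are hypothetical, and it shows only the two probability tables, not correspondence analysis itself.

```python
from collections import Counter

def joint_and_conditional(pairs):
    """Estimate P(Y, X) and P(Y | X) from a list of (target, word) pairs."""
    n = len(pairs)
    pair_counts = Counter(pairs)
    word_counts = Counter(x for _, x in pairs)
    # Joint model: each (target, word) cell relative to all observations.
    joint = {yx: c / n for yx, c in pair_counts.items()}
    # Conditional model: each cell relative to the observations of that word.
    conditional = {(y, x): c / word_counts[x] for (y, x), c in pair_counts.items()}
    return joint, conditional

# Hypothetical co-occurrences of analysis targets (Y) and words (X).
pairs = ([("car_A", "smooth")] * 3
         + [("car_B", "smooth")] * 1
         + [("car_B", "cheap")] * 4)
joint, conditional = joint_and_conditional(pairs)
```

Here `conditional[("car_A", "smooth")]` answers a detailed question about one word (how strongly "smooth" points to car_A), while `joint` describes the whole table at once and sums to 1 over all cells.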


Mining result

With Car Data

With Eye-drop Data

With Beverage Data


Advantage of the mining system

• It is a much faster and less costly way to summarize or mine from open questions.
• SA is the first system that can perform both rule analysis and correspondence analysis.
• It uses a new statistical learning methodology based on Stochastic Complexity.
• SA has been used successfully in the mining of various types of questionnaire data.
