Fuzzy Final Homework: System Implementation

Selected paper: Fuzzy integration of structure adaptive SOMs for web content mining, Fuzzy Sets and Systems 148 (2004) 43–60

Lecturer: Prof. Hahn-Ming Lee

Student: Ching-Hao Mao

[email protected]

Post on 21-Dec-2015



Outline

Introduction

Proposed method in selected paper

Implementation

Conclusion

References

Introduction

In this report, we implement the method from Kim and Cho's paper, which appeared in Fuzzy Sets and Systems in 2004 [1]

A user profile represents different aspects of a user's characteristics

The authors proposed an ensemble of classifiers that estimates a user's preference using web content labeled by the user as "like" or "dislike"

Introduction - Previous Studies [2]

Feature Selection Method Properties

Feature selection methods such as information gain, TFIDF, and odds ratio have different properties

TFIDF does not consider the class values of documents when calculating the relevance of features, while information gain uses the class labels of documents

Odds ratio also uses the class labels of documents, but it finds features that are useful for classifying only one specific class
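The contrast between the three measures can be made concrete with a small sketch. The Java class below is illustrative only: the class name `FeatureScoresSketch`, the method signatures, and the 0.5 smoothing constant in the odds ratio are assumptions, and the formulas are common textbook variants, not necessarily the exact ones used in [1] or [2]. Counts follow the convention n11 = documents of the target class containing the term, n10 = documents of the other class containing it, n01 and n00 = the same for documents without the term.

```java
final class FeatureScoresSketch {

    // TFIDF: uses only term and document frequencies, never the class labels.
    static double tfidf(int termFreq, int numDocs, int docFreq) {
        return termFreq * Math.log((double) numDocs / docFreq);
    }

    // Binary entropy (in bits) of a class split given raw counts.
    static double entropy(double pos, double neg) {
        double total = pos + neg;
        if (total == 0.0) {
            return 0.0;
        }
        double h = 0.0;
        for (double c : new double[] {pos, neg}) {
            if (c > 0.0) {
                double p = c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    // Information gain: class entropy minus the entropy remaining after
    // splitting on term presence. Symmetric in the two classes.
    static double informationGain(int n11, int n10, int n01, int n00) {
        double n = n11 + n10 + n01 + n00;
        double withTerm = n11 + n10;
        double withoutTerm = n01 + n00;
        return entropy(n11 + n01, n10 + n00)
                - (withTerm / n) * entropy(n11, n10)
                - (withoutTerm / n) * entropy(n01, n00);
    }

    // Log odds ratio with respect to the target class. The 0.5 smoothing
    // (an assumption) avoids division by zero. Positive values favor the
    // target class only, which is why the measure is one-sided.
    static double oddsRatio(int n11, int n10, int n01, int n00) {
        double pPos = (n11 + 0.5) / (n11 + n01 + 1.0);
        double pNeg = (n10 + 0.5) / (n10 + n00 + 1.0);
        return Math.log((pPos * (1.0 - pNeg)) / ((1.0 - pPos) * pNeg));
    }
}
```

A term that appears only in "hot" pages gets a high information gain and a positive odds ratio; a term that appears only in "cold" pages gets the same information gain but a negative odds ratio, so an odds-ratio ranking would not select it.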

Overview of the proposed method in [1]

[Figure: overview diagram. Feature selection (TFIDF, Information Gain, Odds Ratio) -> training SASOMs using different feature sets -> classification by each Structure Adaptive SOM -> Fuzzy Integral -> Hot or Cold]

Data Set Description

UCI Syskill & Webert data (http://kdd.ics.uci.edu)

Contains the HTML source of web pages plus the ratings of a single user on these web pages

The web pages are on four separate subjects:

Bands - recording artists (implemented in this report)

Goats (implemented in this report)

Sheep

BioMedical

Implementation

Java (J2SE 1.5) programs were written for preprocessing, feature selection (TFIDF and odds ratio), and the fuzzy integral mechanism

Weka was used for feature selection (information gain) and classification

This report did not succeed in implementing SASOM…

Implementation - Preprocessing

UCI Syskill & Webert data

-> ExtractHTMLContent.java: pure text without anchor text (Bands.txt)

-> Stopword removal and Porter stemming (Bands_Stopword.txt, Bands_Porter.txt)
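The first two preprocessing stages can be sketched as follows. This is not the contents of ExtractHTMLContent.java, which are not shown in the slides; the class name `PreprocessSketch`, the regular expressions, and the tiny stopword list are all assumptions made for illustration. Porter stemming is omitted because a faithful stemmer is too long for a short example, and HTML entities are not decoded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PreprocessSketch {

    // Tiny illustrative stopword list (assumption); a real list is much longer.
    static final Set<String> STOPWORDS = new HashSet<String>(
            Arrays.asList("a", "an", "and", "the", "of", "is", "to", "in"));

    // Strips scripts, styles, anchor elements (including their text), and
    // all remaining tags, then collapses whitespace.
    static String extractText(String html) {
        String s = html;
        s = s.replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ");
        s = s.replaceAll("(?is)<a\\b[^>]*>.*?</a>", " ");
        s = s.replaceAll("(?s)<[^>]*>", " ");
        return s.replaceAll("\\s+", " ").trim();
    }

    // Lowercases, splits on non-letters, and drops stopwords.
    static List<String> removeStopwords(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String tok : text.toLowerCase().split("[^a-z]+")) {
            if (!tok.isEmpty() && !STOPWORDS.contains(tok)) {
                tokens.add(tok);
            }
        }
        return tokens;
    }
}
```

Regular expressions are a fragile way to handle real-world HTML; for anything beyond a homework-sized data set, a proper HTML parser would be the safer choice.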

Implementation - Feature Selection

In Bands (61 documents), for example, the number of attributes is reduced from 5436 to 32

Information Gain                     TFIDF           Odds Ratio
Score    Index   Term                Term            Term
0.1435   1411    mother              sea             oss
0.1435   4109    writes              acid            wild
0.1054   49      places              programming     cultures
0.1054   855     letter              innovative      vehemently
0.1054   3883    movement            letter          smoking
0.1054   1464    stories             method          define
0.1054   3856    synthesizer         members         book
0.1054   2568    songwriters         bleed           charge
0.0962   4643    singer              concentrated    library
0.0937   50      america             mother          hand

Implementation - Fuzzy Integral

The fuzzy measures of the classifiers are determined subjectively [1]

Bayes classifier outputs (b1, b2, b3) -> FuzzyIntegral.java -> result

Example: b1=0, b2=1, b3=0 gives 0.99

(g1,g2,g3) = (0.99,0.99,0.99)      (g1,g2,g3) = (0.01,0.01,0.99)
(b1,b2,b3)   Result                (b1,b2,b3)   Result
(0,1,0)      0.99                  (0,0,1)      0.01
(1,1,1)      0.99                  (0,1,1)      0.01
(0,0,0)      0.99                  (0,0,0)      0.01
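For readers unfamiliar with the mechanism, the following is a minimal sketch of one common formulation: the Sugeno fuzzy integral over a Sugeno lambda-measure built from the densities (g1, g2, g3). It is not the contents of FuzzyIntegral.java, which are not shown in the slides, and it is not claimed to reproduce every value in the table above. The class name `SugenoIntegralSketch`, the bisection solver, and the requirement that densities lie strictly between 0 and 1 are assumptions of this sketch.

```java
import java.util.Arrays;

final class SugenoIntegralSketch {

    // f(lambda) = prod(1 + lambda * g_i) - (1 + lambda).
    // Its nonzero root greater than -1 defines the lambda-measure.
    static double f(double lambda, double[] g) {
        double prod = 1.0;
        for (double gi : g) {
            prod *= 1.0 + lambda * gi;
        }
        return prod - (1.0 + lambda);
    }

    // Finds lambda by bisection. Assumes at least two densities, each
    // strictly between 0 and 1 (a restriction of this sketch).
    static double solveLambda(double[] g) {
        if (g.length < 2) {
            throw new IllegalArgumentException("need at least two densities");
        }
        double sum = 0.0;
        for (double gi : g) {
            if (gi <= 0.0 || gi >= 1.0) {
                throw new IllegalArgumentException("densities must lie strictly between 0 and 1");
            }
            sum += gi;
        }
        if (Math.abs(sum - 1.0) < 1e-9) {
            return 0.0; // densities already additive
        }
        double lo;
        double hi;
        if (sum > 1.0) {        // root lies in (-1, 0)
            lo = -1.0;
            hi = -1e-9;
        } else {                // root lies in (0, infinity)
            lo = 1e-9;
            hi = 1.0;
            while (f(hi, g) < 0.0) {
                hi *= 2.0;
            }
        }
        boolean loPositive = f(lo, g) > 0.0;
        for (int i = 0; i < 200; i++) {
            double mid = 0.5 * (lo + hi);
            if ((f(mid, g) > 0.0) == loPositive) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    // Sugeno integral: sort classifier outputs h in descending order,
    // grow the measure of the covered set, and take max of min(h, measure).
    static double integral(final double[] h, double[] g) {
        if (h.length != g.length) {
            throw new IllegalArgumentException("h and g must have the same length");
        }
        double lambda = solveLambda(g);
        Integer[] order = new Integer[h.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(h[b], h[a]));
        double measure = 0.0;
        double best = 0.0;
        for (int k = 0; k < order.length; k++) {
            int i = order[k];
            measure = g[i] + measure + lambda * g[i] * measure;
            best = Math.max(best, Math.min(h[i], measure));
        }
        return best;
    }
}
```

Under this formulation, `SugenoIntegralSketch.integral(new double[] {0, 1, 0}, new double[] {0.99, 0.99, 0.99})` evaluates to 0.99, because the only nonzero output comes from a classifier whose density is 0.99.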

Conclusion

The fuzzy integral provides a way to express the importance of each classifier subjectively, which is especially useful in semi-supervised learning

The method based on the fuzzy integral can be applied effectively to web content mining, predicting a user's preferences as a user profile

The fuzzy integral may also be applicable to my research area as a way to integrate expert or user knowledge

References

1. Kyung-Joong Kim, Sung-Bae Cho, Fuzzy integration of structure adaptive SOMs for web content mining, Fuzzy Sets and Systems 148 (2004) 43–60

2. M. Pazzani, D. Billsus, Learning and revising user profiles: The identification of interesting web sites, Machine Learning 27 (1997) 313–331

3. http://kdd.ics.uci.edu/databases/SyskillWebert/SyskillWebert.data.html