Post on 21-Dec-2015
Fuzzy Final Homework: System Implementation
Selected paper: Fuzzy integration of structure adaptive SOMs for web content mining, Fuzzy Sets and Systems 148 (2004) 43–60
Lecturer: Prof. Hahn-Ming Lee
Student: Ching-Hao Mao
Introduction
In this report, we implement the method from Kim and Cho’s paper, which appeared in Fuzzy Sets and Systems in 2004 [1]
A user profile represents different aspects of a user’s characteristics
The authors proposed an ensemble of classifiers that estimates a user’s preference from web content labeled by the user as “like” or “dislike”
Feature Selection Method Properties
Feature selection methods such as information gain, TFIDF, and odds ratio have different properties
TFIDF does not consider the class labels of documents when calculating the relevance of features, whereas information gain does
Odds ratio also uses the class labels of documents, but it finds features that are useful for classifying only one specific class
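The differences between the three scores can be made concrete with a minimal Java sketch. It uses common textbook forms of each score for a binary "like"/"dislike" task; the exact variants used in the paper and in this report's code may differ. The class name FeatureScores, the method names, and the add-one smoothing in the odds ratio are assumptions made for illustration.

```java
final class FeatureScores {
    // Entropy (in bits) of a two-class distribution given by counts a and b.
    static double entropy(int a, int b) {
        int n = a + b;
        if (n == 0) return 0.0;
        double h = 0.0;
        for (int c : new int[] {a, b}) {
            if (c > 0) {
                double p = (double) c / n;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    // Information gain of a term: uses class labels of ALL documents.
    // posWith = liked documents containing the term, posWithout = liked
    // documents without it; negWith / negWithout likewise for disliked ones.
    static double informationGain(int posWith, int posWithout, int negWith, int negWithout) {
        int n = posWith + posWithout + negWith + negWithout;
        int with = posWith + negWith;
        int without = posWithout + negWithout;
        double prior = entropy(posWith + posWithout, negWith + negWithout);
        double conditional = ((double) with / n) * entropy(posWith, negWith)
                + ((double) without / n) * entropy(posWithout, negWithout);
        return prior - conditional;
    }

    // Log odds ratio: large only for terms typical of the positive class.
    // Add-one smoothing (an assumption here) avoids division by zero.
    static double oddsRatio(int posWith, int posWithout, int negWith, int negWithout) {
        double pPos = (posWith + 1.0) / (posWith + posWithout + 2.0);
        double pNeg = (negWith + 1.0) / (negWith + negWithout + 2.0);
        return Math.log((pPos * (1.0 - pNeg)) / ((1.0 - pPos) * pNeg));
    }

    // TFIDF: no class labels are involved at all.
    static double tfidf(int termFreq, int docFreq, int numDocs) {
        return termFreq * Math.log((double) numDocs / docFreq);
    }
}
```

A term that occurs only in disliked pages gets a high information gain but a negative odds ratio, which illustrates why odds ratio favors features of one specific class.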
Data Set Description
UCI Syskill & Webert data (http://kdd.ics.uci.edu)
Contains the HTML source of web pages plus the ratings of a single user on these web pages
The web pages are on four separate subjects:
Bands: recording artists (implemented in this report)
Goats (implemented in this report)
Sheep
BioMedical
Implementation
Java (J2SE 1.5) programs were written for preprocessing, feature selection (TFIDF and odds ratio), and the fuzzy integral mechanism
Weka was used for feature selection (information gain) and classification
SASOM was not successfully implemented in this report…
Implementation- Preprocessing
UCI Syskill & Webert data -> ExtractHTMLContent.java -> pure text without anchor text (Bands.txt)
After stopword removal and Porter stemming: Bands_Stopword.txt, Bands_Porter.txt
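The preprocessing steps above can be sketched roughly as follows. This is not the report's ExtractHTMLContent.java, whose source is not shown; the class name PreprocessSketch, the regular expressions, and the tiny stopword list are assumptions for illustration. Porter stemming is left out to keep the sketch short, and HTML entities are not decoded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

final class PreprocessSketch {
    // Matches a whole <a ...>...</a> element, so the anchor text is dropped too.
    private static final Pattern ANCHOR = Pattern.compile("(?is)<a\\b[^>]*>.*?</a>");
    // Matches any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");
    // Illustrative stopword list only; a real list is much longer.
    private static final Set<String> STOPWORDS = new HashSet<String>(
            Arrays.asList("a", "an", "the", "and", "of", "is", "in", "to"));

    // Step 1: HTML source -> plain text without anchor text.
    static String extractText(String html) {
        String noAnchors = ANCHOR.matcher(html).replaceAll(" ");
        return TAG.matcher(noAnchors).replaceAll(" ");
    }

    // Step 2: lowercase, split on non-letters, drop stopwords.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<String>();
        for (String tok : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!tok.isEmpty() && !STOPWORDS.contains(tok)) {
                out.add(tok);
            }
        }
        return out;
    }
}
```

Regular expressions are a fragile way to handle real-world HTML; they are used here only because the sketch has to stay self-contained.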
Implementation- Feature Selection
In Bands (61 documents), e.g., the number of attributes is reduced: 5436 -> 32
Top-ranked features per method:
Information Gain (score, index, term)    TFIDF           ODDS Ratio
0.1435  1411  mother                     sea             oss
0.1435  4109  writes                     acid            wild
0.1054    49  places                     programming     cultures
0.1054   855  letter                     innovative      vehemently
0.1054  3883  movement                   letter          smoking
0.1054  1464  stories                    method          define
0.1054  3856  synthesizer                members         book
0.1054  2568  songwriters                bleed           charge
0.0962  4643  singer                     concentrated    library
0.0937    50  america                    mother          hand
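Reducing the attribute set to a fixed size amounts to ranking terms by score and keeping the best k. A minimal sketch follows, assuming the scores are already available in a map; the class name TopKFeatures and the alphabetical tie-break are hypothetical choices, not taken from the report's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class TopKFeatures {
    // Returns the k highest-scoring terms, best first.
    static List<String> topK(Map<String, Double> scores, int k) {
        List<Map.Entry<String, Double>> entries =
                new ArrayList<Map.Entry<String, Double>>(scores.entrySet());
        // Sort by descending score; break ties alphabetically so the
        // result does not depend on the map's iteration order.
        entries.sort((a, b) -> {
            int c = Double.compare(b.getValue(), a.getValue());
            return c != 0 ? c : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<String>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```

With k = 32 and one score per term, this would perform the same kind of reduction as the 5436 -> 32 step reported for Bands.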
Implementation- Fuzzy Integral
The fuzzy measures of the classifiers are determined subjectively [1]
Bayes classifier outputs (b1,b2,b3) -> FuzzyIntegral.java -> result
Example: b1=0, b2=1, b3=0 gives 0.99
(g1,g2,g3) = (0.99,0.99,0.99)      (g1,g2,g3) = (0.01,0.01,0.99)
(b1,b2,b3)  Result                 (b1,b2,b3)  Result
(0,1,0)     0.99                   (0,0,1)     0.01
(1,1,1)     0.99                   (0,1,1)     0.01
(0,0,0)     0.99                   (0,0,0)     0.01
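For reference, here is a minimal sketch of a Sugeno fuzzy integral with a λ-fuzzy measure built from the densities (g1,g2,g3). This is the standard textbook definition and is assumed, not confirmed, to correspond to what FuzzyIntegral.java computes. The slides do not say how the classifier outputs are turned into evidence values, so the sketch is not expected to reproduce the Result column above. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class SugenoIntegral {
    // f(lambda) = prod(1 + lambda * g_i) - (1 + lambda); the nonzero root
    // greater than -1 defines the lambda-fuzzy measure.
    private static double f(double[] g, double lambda) {
        double prod = 1.0;
        for (double gi : g) prod *= 1.0 + lambda * gi;
        return prod - (1.0 + lambda);
    }

    // Solves for lambda by bisection. Densities must lie strictly in (0, 1).
    static double solveLambda(double[] g) {
        if (g.length < 2) throw new IllegalArgumentException("need at least two densities");
        double sum = 0.0;
        for (double gi : g) {
            if (gi <= 0.0 || gi >= 1.0) throw new IllegalArgumentException("densities must lie in (0, 1)");
            sum += gi;
        }
        if (Math.abs(sum - 1.0) < 1e-12) return 0.0; // additive measure
        boolean negativeRoot = sum > 1.0;            // root in (-1, 0), else in (0, inf)
        double lo = negativeRoot ? -1.0 : 0.0;
        double hi = negativeRoot ? 0.0 : 1.0;
        if (!negativeRoot) {
            while (f(g, hi) < 0.0) hi *= 2.0;        // expand until the root is bracketed
        }
        for (int i = 0; i < 200; i++) {
            double mid = 0.5 * (lo + hi);
            boolean positive = f(g, mid) > 0.0;
            // f < 0 between zero and the root, f > 0 beyond the root.
            if (positive == negativeRoot) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    // h[i] is the evidence from classifier i, g[i] its subjective importance.
    static double integrate(double[] h, double[] g) {
        if (h.length != g.length) throw new IllegalArgumentException("length mismatch");
        double lambda = solveLambda(g);
        Integer[] order = new Integer[h.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(h[b], h[a])); // descending evidence
        double measure = 0.0;
        double best = 0.0;
        for (int idx : order) {
            // g(A_i) = g_i + g(A_{i-1}) + lambda * g_i * g(A_{i-1})
            measure = g[idx] + measure + lambda * g[idx] * measure;
            best = Math.max(best, Math.min(h[idx], measure));
        }
        return best;
    }
}
```

The densities play the role of the subjectively chosen classifier importances: raising one density lets that classifier's evidence dominate the fused result.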
Conclusion
The fuzzy integral provides a way to weight classifiers by subjectively assigned importance, which is especially useful in semi-supervised learning
A method based on the fuzzy integral can be effectively applied to web content mining to predict a user’s preference as a user profile
The fuzzy integral may also be applicable to my own research area, as a way to integrate expert or user knowledge
References
1. Kyung-Joong Kim, Sung-Bae Cho, Fuzzy integration of structure adaptive SOMs for web content mining, Fuzzy Sets and Systems 148 (2004) 43–60
2. Michael Pazzani, Daniel Billsus, Learning and revising user profiles: The identification of interesting web sites, Machine Learning 27 (1997) 313–331
3. http://kdd.ics.uci.edu/databases/SyskillWebert/SyskillWebert.data.html