![Page 1: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/1.jpg)
Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization
---Lei Tang, Jianping Zhang and Huan Liu
![Page 2: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/2.jpg)
Taxonomies and Hierarchical Models
Web pages can be organized as a tree-structured taxonomy (Yahoo!, Google directory)
Parental control: Web filters block children’s access to undesirable web sites.
Parents want accurate content categorization at different granularities.
Service providers appreciate seeing the decision path by which a blocking/non-blocking decision is made, for fine-tuning.
Hierarchical Model: Exploit the taxonomy in the classification strategy or in the loss function
![Page 3: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/3.jpg)
Quality of Taxonomy
Most hierarchical models use a predefined taxonomy, typically semantically sound.
A librarian is often employed to construct the semantic taxonomy.
Is a semantically sound taxonomy always good?
Subjectivity can result in different taxonomies.
Semantics change for specific data.
![Page 4: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/4.jpg)
A Motivating Example
[Figure: where the topic “Hurricane” sits relative to “Geography” and “Politics” (Federal Emergency Management Agency), normally versus during Katrina.]
![Page 5: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/5.jpg)
A “Bayesian” View
Prior knowledge: the stagnant nature of the predefined taxonomy.
Data: the dynamic change of semantics reflected in the data.
The two can be inconsistent.
Result: a data-driven taxonomy.
![Page 6: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/6.jpg)
“Start from Scratch” - Clustering
Throw away the predefined taxonomy information and cluster based on the labeled data.
Two categories: divisive or agglomerative.
Usually requires human experts to specify parameters such as the maximum height of the tree, the number of nodes in each branch, etc.
Difficult to specify these parameters without looking at the data.
![Page 7: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/7.jpg)
Optimal Hierarchy
Optimal hierarchy: how to estimate the likelihood?
Hierarchical model’s performance and the likelihood are positively related.
Use hierarchical models’ performance statistics on a validation set to gauge the likelihood.
A brute-force approach that enumerates all taxonomies is infeasible.
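In symbols, one way to state the idea on this slide is the following. The notation is an assumption made here for illustration: $D$ is the labeled data, $H$ ranges over candidate hierarchies, and $\mathrm{Perf}_V(H)$ is the validation performance of a hierarchical model built on $H$.

```latex
H_{\mathrm{opt}} = \arg\max_{H} \; p(D \mid H),
\qquad
p(D \mid H_a) \ge p(D \mid H_b)
\;\text{ is gauged by }\;
\mathrm{Perf}_V(H_a) \ge \mathrm{Perf}_V(H_b)
```

The second relation only expresses the stated working assumption that performance and likelihood are positively related; it is a proxy, not an identity.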
![Page 8: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/8.jpg)
Constrained Optimal Hierarchy
Predefined taxonomy can help.
Assumption: the optimal hierarchy is in the neighborhood of the predefined taxonomy H0.
Constrained optimal hierarchy H’ for H0: H’ results from a series of elementary operations that adjust H0 until no likelihood increase is observed.
![Page 9: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/9.jpg)
Elementary Operations
[Figure: example trees (H1) to (H4) over numbered nodes, illustrating the ‘Promote’, ‘Merge’ and ‘Demote’ operations.]
(All the leaf nodes remain unchanged)
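A minimal sketch of the three operations, under stated assumptions: a hierarchy is stored as a child-to-parent map; ‘promote’ moves a node up to its grandparent, ‘demote’ moves a node under one of its siblings, and ‘merge’ places two siblings under a new super node. These readings, the guards that keep the leaf set unchanged, and the node names `Maps` and `Elections` are illustrative assumptions, not definitions taken from the slides.

```python
from typing import Dict, Optional, Set

# Assumed representation: child -> parent; the root maps to None.
Hierarchy = Dict[str, Optional[str]]


def children(h: Hierarchy, node: str) -> Set[str]:
    """Return the direct children of `node`."""
    return {c for c, p in h.items() if p == node}


def leaves(h: Hierarchy) -> Set[str]:
    """Return all nodes that have no children."""
    return {n for n in h if not children(h, n)}


def promote(h: Hierarchy, node: str) -> Hierarchy:
    """Move `node` one level up, making it a child of its grandparent."""
    parent = h[node]
    if parent is None or h[parent] is None:
        raise ValueError("node has no grandparent")
    if children(h, parent) == {node}:
        # Guard (assumption): do not turn the old parent into a new leaf.
        raise ValueError("promoting an only child would change the leaf set")
    new_h = dict(h)
    new_h[node] = h[parent]
    return new_h


def demote(h: Hierarchy, node: str, sibling: str) -> Hierarchy:
    """Move `node` one level down, making it a child of `sibling`."""
    if node == sibling or h[node] != h[sibling]:
        raise ValueError("demote needs two distinct siblings")
    if not children(h, sibling):
        # Guard (assumption): a leaf must stay a leaf.
        raise ValueError("cannot demote under a leaf")
    new_h = dict(h)
    new_h[node] = sibling
    return new_h


def merge(h: Hierarchy, a: str, b: str, super_name: str) -> Hierarchy:
    """Put siblings `a` and `b` under a new super node named `super_name`."""
    if a == b or h[a] != h[b] or h[a] is None or super_name in h:
        raise ValueError("merge needs two distinct siblings and a fresh name")
    new_h = dict(h)
    new_h[super_name] = h[a]
    new_h[a] = super_name
    new_h[b] = super_name
    return new_h


# Toy hierarchy; "Maps" and "Elections" are made-up node names.
example: Hierarchy = {
    "root": None,
    "Geography": "root", "Politics": "root",
    "Hurricane": "Geography", "Maps": "Geography",
    "Elections": "Politics", "FEMA": "Politics",
}
promoted = promote(example, "Hurricane")           # Hurricane now under root
moved = demote(promoted, "Hurricane", "Politics")  # then under Politics
```

Each function returns a new map rather than mutating its input, so a search procedure can score several candidate hierarchies derived from the same starting point.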
![Page 10: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/10.jpg)
Search in Hierarchy Space
Given a predefined taxonomy, find its best constrained optimal hierarchy.
Search in the hierarchy space.
[Figure: hierarchy search space starting from H0, with candidate hierarchies H01 to H04, H11 to H13, H21 to H24 and H31 to H33.]
![Page 11: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/11.jpg)
Finding Best COH
Greedy search: follow the track with the largest likelihood increase at each step to search for the best hierarchy.
[Figure: hierarchy search space starting from H0, with candidate hierarchies H01 to H04, H11 to H13, H21 to H24 and H31 to H33.]
![Page 12: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/12.jpg)
Framework (a wrapper approach)
Given: H0, Training Data T, Validation Data V
1. Generate neighbor hierarchies for H0.
2. For each neighbor hierarchy, train hierarchical classification models on T.
3. Evaluate the hierarchical classifiers on V.
4. Pick the best neighbor hierarchy as the new H0.
5. Repeat from step 1 until no improvement is observed.
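The five steps above can be sketched as a generic greedy loop. This is a sketch under assumptions, not the authors' implementation: `neighbors` and `score` are hypothetical callables supplied by the caller, where `score(h)` stands for "train hierarchical classifiers with hierarchy `h` on T and return their performance on V".

```python
from typing import Callable, Iterable, Tuple, TypeVar

H = TypeVar("H")


def greedy_hierarchy_search(
    h0: H,
    neighbors: Callable[[H], Iterable[H]],
    score: Callable[[H], float],
) -> Tuple[H, float]:
    """Greedy wrapper search: move to the best-scoring neighbor until none improves."""
    best_h, best_score = h0, score(h0)
    while True:
        # Steps 1-3: generate the neighbors and evaluate each of them.
        candidates = [(score(h), h) for h in neighbors(best_h)]
        if not candidates:
            return best_h, best_score
        top_score, top_h = max(candidates, key=lambda pair: pair[0])
        # Step 5: stop as soon as the best neighbor is no better.
        if top_score <= best_score:
            return best_h, best_score
        # Step 4: the best neighbor becomes the new starting hierarchy.
        best_h, best_score = top_h, top_score


# Toy stand-ins: integers play the role of hierarchies.
def toy_neighbors(h: int):
    return [h - 1, h + 1]


def toy_score(h: int) -> float:
    return -float((h - 3) ** 2)  # peaks at h == 3


best, best_value = greedy_hierarchy_search(0, toy_neighbors, toy_score)
```

Because every call to `score` hides a full train-and-evaluate cycle, the cost of this loop is dominated by how many neighbors are generated per step.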
![Page 13: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/13.jpg)
Hierarchy Neighbors
Elementary operations can be applied to any node in the tree.
The number of neighbors of a hierarchy could be huge.
Most operations are repeated for evaluation.
[Figure: two hierarchies H1 and H2 over nodes 1, 2 and 3, with node 2 appearing as 2’ in H2.]
![Page 14: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/14.jpg)
Finding Neighbors
Check nodes one by one rather than all the nodes at the same time in each search step.
‘Merge’ and ‘Demote’ only consider the node most similar to the current one.
Nodes at higher levels affect classification more.
Top-down traversal: generate neighbors by performing all possible elementary operations on the shallowest node first.
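A small sketch of the two heuristics on this slide: visiting nodes shallowest-first, and restricting ‘merge’ and ‘demote’ to the most similar sibling. The parent-to-children map, the helper names, and the `similarity` callable are assumptions for illustration; the slides do not say which similarity measure is used.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Sequence


def top_down_order(children: Dict[str, List[str]], root: str) -> List[str]:
    """Return all non-root nodes in breadth-first (shallowest-first) order."""
    order: List[str] = []
    queue = deque(children.get(root, []))
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children.get(node, []))
    return order


def most_similar_sibling(
    node: str,
    siblings: Sequence[str],
    similarity: Callable[[str, str], float],
) -> Optional[str]:
    """Pick the sibling with the highest similarity to `node`, or None if there is none."""
    others = [s for s in siblings if s != node]
    if not others:
        return None
    return max(others, key=lambda s: similarity(node, s))


# Toy tree; "Maps" and "Elections" are made-up node names.
tree = {
    "root": ["Geography", "Politics"],
    "Geography": ["Hurricane", "Maps"],
    "Politics": ["Elections"],
}
visit_order = top_down_order(tree, "root")
```

Checking one node at a time in this order means the candidate set per step grows with the number of operations on a single node rather than with the size of the whole tree.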
![Page 15: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/15.jpg)
Further consideration
2 types of top-down traversal:
1. ‘Promote’ operation only to generate neighbors
2. ‘Demote’ and ‘Merge’ operations only to generate neighbors
Repeat the 2-traversal procedure until no improvement is observed.
[Figure: example tree with Root, Geography, Hurricane and Politics.]
If a node is improperly placed under a parent, we need to ‘promote’ it first.
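The alternation between the two traversal types can be sketched as follows. All names here are hypothetical: `promote_pass` and `demote_merge_pass` stand for one full top-down traversal of each type, and `score` stands for the performance estimate used to compare hierarchies.

```python
from typing import Callable, Optional, Tuple, TypeVar

H = TypeVar("H")


def two_traversal_search(
    h0: H,
    score: Callable[[H], float],
    promote_pass: Callable[[H], H],
    demote_merge_pass: Callable[[H], H],
    max_rounds: Optional[int] = None,
) -> Tuple[H, float]:
    """Alternate a promote-only pass and a demote/merge pass until the score stops improving."""
    best_h, best_score = h0, score(h0)
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        # Promote first, so misplaced nodes can leave their current parent...
        candidate = promote_pass(best_h)
        # ...then demote/merge, so they can settle under a better one.
        candidate = demote_merge_pass(candidate)
        candidate_score = score(candidate)
        rounds += 1
        if candidate_score <= best_score:
            break
        best_h, best_score = candidate, candidate_score
    return best_h, best_score


# Toy stand-ins: integers play the role of hierarchies.
def toy_promote(h: int) -> int:
    return h + 1 if h < 5 else h


def toy_demote_merge(h: int) -> int:
    return h


result = two_traversal_search(0, float, toy_promote, toy_demote_merge)
single = two_traversal_search(0, float, toy_promote, toy_demote_merge, max_rounds=1)
```

Setting `max_rounds=1` runs the pair of traversals exactly once instead of iterating to convergence.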
![Page 16: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/16.jpg)
Experiment Setting
10-fold cross validation
Naïve Bayes Classifier (Multinomial)
Use information gain to select features
Due to the scarcity of documents in each class, we use the training data to validate the likelihood of a hierarchy.
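For reference, a minimal sketch of information gain for a single term, using the standard presence/absence definition (class entropy minus the expected class entropy after splitting on whether the term occurs). The slides do not specify which variant was used, so treat this as an assumption; the toy documents and labels are made up.

```python
import math
from collections import Counter
from typing import List, Sequence, Set


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs: List[Set[str]], labels: List[str], term: str) -> float:
    """Reduction in class entropy obtained by knowing whether `term` occurs in a document."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    return (
        entropy(labels)
        - (len(with_term) / n) * entropy(with_term)
        - (len(without_term) / n) * entropy(without_term)
    )


# Toy corpus: each document is a set of terms.
toy_docs = [{"storm", "wind"}, {"storm", "rain"}, {"vote", "senate"}, {"vote", "wind"}]
toy_labels = ["geo", "geo", "pol", "pol"]
ig_storm = information_gain(toy_docs, toy_labels, "storm")
ig_wind = information_gain(toy_docs, toy_labels, "wind")
```

Terms would then be ranked by this value and the top-scoring ones kept as features.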
![Page 17: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/17.jpg)
Data Sets
Data: Soc and Kids
Human-labeled web pages with a predefined taxonomy

| | Soc | Kids |
|---|---|---|
| Classes | 69 | 244 |
| Nodes | 83 | 299 |
| Height | 4 | 5 |
| Instances | 5248 | 15795 |
| Vocabulary | 34003 | 48115 |
![Page 18: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/18.jpg)
Results on Soc
![Page 19: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/19.jpg)
Results on Kids
![Page 20: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/20.jpg)
Over-fitting?
As we optimize the hierarchy just based on training data, it’s possible to over-fit the data.
[Figure: macro recall (y-axis, 0.3 to 0.55) versus iteration number (x-axis, 1 to 7), one curve per fold for Fold 1 to Fold 10.]
![Page 21: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/21.jpg)
Robust Method
Instead of multiple traversals (iterations), just do the 2 traversals once.
![Page 22: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/22.jpg)
Conclusions
A semantically sound taxonomy does not necessarily lead to the intended good classification performance.
Given a predefined taxonomy, we can adapt it into a data-driven taxonomy for more accurate classification.
The taxonomy generated by our method outperforms both the human-constructed taxonomy and the taxonomy generated “starting from scratch”.
![Page 23: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/23.jpg)
Future work
This is initial work on combining “noisy” prior knowledge with data.
How to implement an efficient filter model that can find a good taxonomy by exploiting the predefined taxonomy?
Feature selection could alleviate the difference between taxonomies.
How to use the taxonomy information for feature selection?
![Page 24: Acclimatizing Taxonomic Semantics for Hierarchical Content Categorization](https://reader035.vdocument.in/reader035/viewer/2022062721/56813846550346895d9ff33a/html5/thumbnails/24.jpg)