Post on 16-Dec-2015
Cyborg Categorization The Basics
Tom Reamy
Knowledge Architect
Intranet Consultant
Categorization Explosion
Autonomy, Semio, Verity, Inxight, Topical Net, Mohomine, Simile, H5 Technologies, YellowBrix,
GammaSite, MetaTagger, Applied Semantics, Sageware, SmartLogik, Quiver, Stratify, Vivisimo
Other - Tacit
Categorization: Why Now?
Search Stinks
Professionals spend more time looking for information than using it.
Solution: Browse and Search
Buy Search to get Categorization
Need a Taxonomy
Taxonomy: How
Old Answer: Manual
– Hire a bunch of librarians and IAs
– Costly, difficult to maintain
New Answer:
– Automatic Categorization
A Better Answer:
– Cyborg Categorization
– Integrate Content Management, Search, Taxonomy
– Integrate central IAs and local authors
Auto-Categorization: the How
Automatic Methods
Catalog by Example
– Training Sets (5-500)
– Bag of Words or language and concepts
Statistical Clustering
– Set of Documents & Taxonomy Level
Semi-Automatic: Rules
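The "Catalog by Example" idea above can be sketched in a few lines: each category gets a small training set, every document is reduced to a bag of words, and a new document goes to the category whose combined bag it most resembles. This is only a minimal illustration, not any vendor's method. The category names, the example documents, and the choice of cosine similarity are all assumptions made for the sketch.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count bags."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(training_sets):
    """Merge each category's example documents into one bag of words."""
    profiles = {}
    for category, examples in training_sets.items():
        profile = Counter()
        for doc in examples:
            profile.update(bag_of_words(doc))
        profiles[category] = profile
    return profiles


def categorize(text, profiles):
    """Return (category, score) for the most similar category profile."""
    bag = bag_of_words(text)
    scores = {cat: cosine(bag, prof) for cat, prof in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical training sets; a real one would hold 5-500 documents per category.
TRAINING_SETS = {
    "Human Resources": [
        "vacation policy and employee benefits enrollment",
        "payroll schedule and benefits for new employee hires",
    ],
    "Engineering": [
        "server deployment checklist and build pipeline",
        "code review guidelines for the build and release pipeline",
    ],
}

profiles = train(TRAINING_SETS)
category, score = categorize("How do I change my benefits enrollment?", profiles)
print(category, round(score, 2))
```

A pure bag of words ignores word order and meaning, which is why the slide contrasts it with "language and concepts".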
Auto-Categorization: the How
Next Generation
– Support Vector Machines
– Machine Learning
– World Knowledge
Incremental Improvement
– From 75% to 85%
Critical Issue: Integration
Automatic vs. Humanatic
Humans are better, but not as consistent
– General bin, understandable mistakes
– Bring outside contexts to the document: purpose, similar documents, common sense
Automatic is faster and cheaper.
– Faster yes, cheaper?
– Cost of poorer quality categorization
Intranet: 20,000 users taking 60 seconds longer = $20,000 a week
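The intranet figure can be checked with a short calculation. The slide gives only the user count, the extra 60 seconds, and the weekly total. The sketch below assumes each user loses those 60 seconds once per week and that an hour of staff time costs $60. Both values are assumptions chosen because they reproduce the slide's number, not figures stated in the deck.

```python
def weekly_cost(users, extra_seconds_per_user, hourly_rate):
    """Cost of the time lost in one week, given seconds lost per user per week."""
    lost_hours = users * extra_seconds_per_user / 3600
    return lost_hours * hourly_rate


USERS = 20_000        # from the slide
EXTRA_SECONDS = 60    # from the slide; assumed to occur once per user per week
HOURLY_RATE = 60.0    # assumption: dollars per hour of staff time

cost = weekly_cost(USERS, EXTRA_SECONDS, HOURLY_RATE)
print(f"${cost:,.0f} per week")
```

If the extra minute were lost on every search rather than once a week, the same function shows the cost scaling linearly with the number of searches.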
Automatic vs. Humanatic: News Feeds to Corporate Intranets
News Feeds and Content providers
– Uniform content, size and structure
– Professional writers
– Simple or standard vocabulary
Corporate intranet
– Wildly varied content
– Mix of good, bad, and ugly writers
– Tower of Babel: Acronyms, special meanings
The Answer is Cyborg
No one software has best of automatic
Automatic Categorization is not
Integration not Assimilation
Human and Computer Integration
Cyborg Integration and Content Management, Search
Human - Computer Integration
Humans
– Create top level taxonomy
– Create rules, select training sets
– Final Quality Control
Automatic
– Provisional Categorization and Meta Data
– Automatic Summarization
Combination
– Integration of Rules, World Knowledge
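One way to picture this division of labor is a small pipeline: human-written rules are applied first, an automatic step proposes a provisional category with a confidence score, and anything below a cut-off is flagged for human quality control. The sketch below is a hypothetical illustration only. The rules, the topic vocabularies, the 0.6 threshold, and the stand-in scoring function are all assumptions, and a real deployment would plug in an actual categorization engine.

```python
from dataclasses import dataclass

# Hypothetical human-authored rules: a keyword that forces a category.
RULES = {"invoice": "Finance", "payroll": "Human Resources"}

# Hypothetical vocabularies used by the stand-in automatic categorizer.
TOPIC_WORDS = {
    "Engineering": {"server", "deploy", "build"},
    "Marketing": {"campaign", "brand", "launch"},
}

REVIEW_THRESHOLD = 0.6  # assumed cut-off below which a human checks the result


@dataclass
class Result:
    category: str
    source: str         # "rule" or "automatic"
    needs_review: bool  # True when a human should confirm the category


def provisional_category(text):
    """Stand-in for an automatic engine: returns (category, confidence)."""
    words = text.lower().split()
    if not words:
        return "Uncategorized", 0.0
    scores = {
        cat: sum(word in vocab for word in words) / len(words)
        for cat, vocab in TOPIC_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


def categorize(text):
    """Apply human rules first, then fall back to the automatic proposal."""
    lowered = text.lower()
    for keyword, category in RULES.items():
        if keyword in lowered:
            return Result(category, "rule", needs_review=False)
    category, confidence = provisional_category(text)
    return Result(category, "automatic", needs_review=confidence < REVIEW_THRESHOLD)


for doc in ["Payroll calendar for March",
            "deploy build server",
            "notes from the campaign meeting last week"]:
    print(doc, "->", categorize(doc))
```

Routing only low-confidence items to reviewers is a design choice made for this sketch. The slide's "Final Quality Control" could equally mean that humans sample or check every provisional result.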
Content Management & Search
Content Management
– Distributed Work Flow: Central IA & local authors
– Collaborative Categorization
– Taxonomic Publishing Model
Search
– Support Browse and Search
– Real time clustering, categorizing
– Collaborative filtering - by category
Lessons Learned
Out of the Box, Out of Your Mind
Play well with others
Brain surgery is fun!
World revolves around you
Quality counts and size matters
Let a Hundred flowers Bloom
The End
The END
Really.