IEEE Knowledge Media Networking KMN’02
Keynote Address, CRL, Kyoto Japan, July 11, 2002

Concept Switching in the Interspace:
Networking Infrastructure for Community Knowledge

Bruce Schatz
CANIS Laboratory
Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
Graduate School of Informatics, Kyoto University
[email protected], www.canis.uiuc.edu



THE THIRD WAVE OF NET EVOLUTION

PACKETS

OBJECTS

CONCEPTS

from Objects to Concepts
from Syntax to Semantics
Infrastructure is Interaction with Abstraction

Internet is packet transmission across computers

Interspace is concept navigation across repositories

CONCEPT SPACES

[Figure labels: Technology, Engineering, Electrical; FORMAL (manual), INFORMAL (automatic); IEEE; communities, groups, individuals]

LEVELS OF INDEXES

THE DISTRIBUTED WORLD

Community Repositories in the Interspace
Peer to Peer Networking Infrastructure
Every Person performs Every Role

USER request
LIBRARIAN reference
INDEXER classify
PUBLISHER quality
AUTHOR generate

Meta Data

How to Represent the Community Knowledge

Automatic and Interactive Representation Techniques
for Capturing the Fundamental Structure

Meta Maps

How to Locate the Community Knowledge

Automatic and Interactive Location Techniques
for Capturing the Fundamental Landscape

CONCEPTS ACROSS THE INTERSPACE

SCALABLE SEMANTICS

Automatic indexing
Domain-Independent indexing
Statistical clustering

Compute Context of
  concepts within documents
  documents within repositories

CROSS-OVERS IN SEMANTIC INDEXING

[Timeline figure: 1992, 1993, 1995, 1996, 1998]

COMPUTING CONCEPTS

’92: 4,000 (molecular biology)
’93: 40,000 (molecular biology)
’95: 400,000 (electrical engineering)
’96: 4,000,000 (engineering)
’98: 40,000,000 (medicine)

SIMULATING A NEW WORLD

Obtain discipline-scale collection
  MEDLINE from NLM, 10M bibliographic abstracts
  human classification: Medical Subject Headings (MeSH)

Partition discipline into Community Repositories
  4 core terms per abstract for MeSH classification
  32K nodes with core terms (classification tree)

Community is all abstracts classified by core term
  40M abstracts containing 280M concepts
  concept spaces took 2 days on NCSA Origin 2000

Simulating World of Medical Communities
  10K repositories with > 1K abstracts (1K w/ > 10K)
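The partitioning step can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the MEDLINE simulation: the function name, the record layout, and the sample abstracts are all invented here. It shows why the totals grow: an abstract with several core terms is filed in several community repositories, which would be consistent with 10M abstracts at 4 core terms each giving 40M repository entries.

```python
from collections import defaultdict


def partition_into_communities(abstracts):
    """Group abstract ids under each core term they were classified by.

    `abstracts` maps an abstract id to its list of core terms
    (toy stand-ins for MeSH headings, not real MEDLINE records).
    """
    repositories = defaultdict(set)
    for abstract_id, core_terms in abstracts.items():
        for term in core_terms:
            repositories[term].add(abstract_id)
    return dict(repositories)


# Invented sample data, for illustration only.
abstracts = {
    "a1": ["Arthritis, Rheumatoid", "Analgesics"],
    "a2": ["Analgesics", "Gastrointestinal Hemorrhage"],
    "a3": ["Arthritis, Rheumatoid"],
}

repos = partition_into_communities(abstracts)
memberships = sum(len(ids) for ids in repos.values())

print(sorted(repos["Analgesics"]))  # ids filed under the "Analgesics" community
print(memberships)                  # one membership per (abstract, core term) pair
```

Each key of `repos` plays the role of one community repository; a concept space would then be computed per repository.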

COMMUNITY PROCESSING

Semantic Indexing

Extracting Concepts (AI)
  Canonical noun phrases
  Generic statistical parser

Computing Context (IR)
  Co-occurrence frequency, in collection
  Useful interactively, not strict ordering
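The "computing context" step can be illustrated with a small co-occurrence count. The sketch below assumes the noun phrases have already been extracted (the parser is not shown), and it uses raw document-level co-occurrence counts as a simplification; the function names and sample phrases are invented for this example and are not the Interspace implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_concept_space(documents):
    """Count, for every pair of concepts, how many documents contain both."""
    cooccur = defaultdict(Counter)
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            cooccur[a][b] += 1
            cooccur[b][a] += 1
    return cooccur


def related_concepts(cooccur, concept, limit=5):
    """Return co-occurring concepts, most frequent first (ties alphabetical)."""
    ranked = sorted(cooccur[concept].items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


# Each document is represented by its (already extracted) noun phrases.
docs = [
    ["rheumatoid arthritis", "analgesic", "gastrointestinal bleeding"],
    ["rheumatoid arthritis", "analgesic"],
    ["analgesic", "gastrointestinal bleeding"],
    ["rheumatoid arthritis", "joint pain"],
]

space = build_concept_space(docs)
print(related_concepts(space, "rheumatoid arthritis"))
```

The ranked list is meant as a set of suggestions for interactive browsing, in line with "useful interactively, not strict ordering".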

System Side Infrastructure

Classification Technologies for Multimedia Documents

Phrases (multi-word nouns)
Concepts (generic phrases)
Types (identified concepts)
Clusters (grouped types)
Structures (semantic universals)

INTERSPACE NAVIGATION

Semantic Indexes for Community Repositories

Navigating Abstractions within Repository
  concept space & category map

Interactive browsing by Community experts

*www.canis.uiuc.edu/interspace-prototype

Interspace Remote Access Client

Navigation in MEDSPACE

For a patient with Rheumatoid Arthritis:
Find a drug that reduces the pain (analgesic) but does not cause stomach (gastrointestinal) bleeding

Choose Domain

Concept Search

Concept Navigation

Retrieve Document

Navigate Document

Retrieve Document

Category Map

Category Navigation

Concept Navigation

User Side Infrastructure

Navigation Technologies for Search Interfaces

Exact Match (noun phrases)
Relationship List (concept suggestions)
Cluster Comparison (groups to groups)
Spreading Activation (group intersections)
Artificial Landscapes (semantic distances)
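Of the techniques listed above, spreading activation is the easiest to show in miniature. The sketch below is a generic textbook-style version under stated assumptions: the graph, the link weights, the `decay` factor, and the function name are all invented for illustration. Concepts reachable from more than one seed accumulate activation, which is one way to read "group intersections".

```python
def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Propagate activation from seed concepts along weighted links."""
    activation = {concept: 1.0 for concept in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for concept, energy in frontier.items():
            for neighbor, weight in graph.get(concept, {}).items():
                passed = energy * weight * decay
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        # Add what arrived in this step to the running totals.
        for concept, energy in next_frontier.items():
            activation[concept] = activation.get(concept, 0.0) + energy
        frontier = next_frontier
    return activation


# Toy concept graph with invented link weights.
graph = {
    "signal processing": {"filtering": 0.8, "speech recognition": 0.6},
    "neural networks": {"pattern recognition": 0.9, "speech recognition": 0.6},
}
seeds = ["signal processing", "neural networks"]

result = spread_activation(graph, seeds)
suggestions = sorted(
    ((c, a) for c, a in result.items() if c not in seeds),
    key=lambda kv: (-kv[1], kv[0]),
)
print(suggestions[0][0])  # the concept linked to both seeds ranks first
```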

SWITCHING

In the Interspace…

each Community maintains its own repository

Switching is navigating across repositories

use your vocabulary to search another specialty

Medicine Session

Categories and Concepts

Concept Switching

Document Retrieval

CONCEPT SWITCHING

“Concept” versus “Term”
  set of “semantically” equivalent terms

Concept switching
  region to region (set to set) match

[Figure labels: term, Semantic region]
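A region-to-region match can be sketched as set overlap. The example below is an assumption-laden illustration: it treats a term's semantic region as the term plus its related concepts, and scores regions with Jaccard similarity, which is a stand-in chosen here and not necessarily the measure used in the Interspace prototype. All names and sample vocabularies are invented.

```python
def jaccard(a, b):
    """Overlap of two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def switch_concept(term, source_space, target_space, limit=3):
    """Map a term's semantic region in one space to the closest regions in another."""
    region = {term} | set(source_space.get(term, ()))
    scored = []
    for candidate, neighbors in target_space.items():
        target_region = {candidate} | set(neighbors)
        score = jaccard(region, target_region)
        if score > 0:
            scored.append((score, candidate))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(candidate, round(score, 2)) for score, candidate in scored[:limit]]


# Toy concept spaces: each term maps to its related concepts.
medicine = {
    "magnetic resonance imaging": ["image reconstruction", "signal noise", "contrast agent"],
}
engineering = {
    "tomography": ["image reconstruction", "signal noise", "inverse problem"],
    "antenna design": ["signal noise", "impedance matching"],
    "compilers": ["parsing", "code generation"],
}

matches = switch_concept("magnetic resonance imaging", medicine, engineering)
print(matches)
```

The query term itself never appears in the engineering space; the match comes from the overlap of the surrounding regions, which is the point of switching set to set instead of term to term.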

Concept Space

ENGINEERING SESSION

Engineering Categories & Concepts

Further Concept Navigation

Searching via Concept Suggestion

Switching Across Repositories

Future Technologies

Concept Switching
  Spreading activation, type tagging

Dynamic Indexing
  On-the-fly collections, during session

Path Matching
  Aggregating indexes, many repositories

Semantic Analysis of Multimedia

Collections of Objects containing Units

Text: community repository (topic proximity)
  document abstracts containing noun phrases

Image: aerial photograph (spatial proximity)
  feature regions containing texture tiles

Units -- media-dependent (statistical parsers)
Indexes -- media-independent (statistical clusters)
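The split between media-dependent units and media-independent indexes can be sketched as one indexing function that accepts any unit extractor. Everything below is a simplified assumption: the two extractors are trivial stand-ins for a noun-phrase parser and a texture classifier, and the index is a plain co-occurrence count, not a full clustering.

```python
from collections import Counter
from itertools import combinations


def text_units(abstract):
    """Media-dependent step for text: stand-in for a noun-phrase parser."""
    return [phrase.strip().lower() for phrase in abstract.split(";") if phrase.strip()]


def image_units(tiles):
    """Media-dependent step for images: stand-in for a texture classifier."""
    return [f"texture:{label}" for label in tiles]


def cooccurrence_index(objects, extract_units):
    """Media-independent step: count unit pairs that share an object."""
    index = Counter()
    for obj in objects:
        units = sorted(set(extract_units(obj)))
        index.update(combinations(units, 2))
    return index


# Invented sample objects: phrase lists for abstracts, tile labels for photos.
abstracts = ["sandstone; porosity; reservoir", "porosity; reservoir"]
photos = [["farmland", "road"], ["farmland", "road", "river"]]

text_index = cooccurrence_index(abstracts, text_units)
image_index = cooccurrence_index(photos, image_units)

print(text_index[("porosity", "reservoir")])
print(image_index[("texture:farmland", "texture:road")])
```

Only the extractor changes between media; `cooccurrence_index` is shared, which mirrors the units versus indexes distinction on the slide.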

Media Interoperability Model

text concept space & category map (geoscience)
  1M phrases in 500K abstracts from GeoRef and Petroleum Abstracts

image concept & category maps in aerial photos
  visual thesaurus maps for 200K regions in 800 images (6M tiles)

geographic map (where) v. semantic map (what)
  spatial gazetteer as bridge

image<=>text<=>number

Text and Number Interoperability

Text and AVHRR Query: Show me information about the Santa Barbara area with mild temperature and high vegetation density.

Integrated Result: Within the bounding geographic location, 2 documents and 88 AVHRR records related to the integrated query are retrieved.
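An integrated query of this kind can be sketched with a gazetteer that turns a place name into a bounding box, which then filters both documents and numeric records. The sketch is hypothetical throughout: the record fields, the bounding box coordinates, the thresholds for "mild" and "high", and the sample data are assumptions made for illustration, and documents are matched by location only.

```python
from dataclasses import dataclass


@dataclass
class GriddedRecord:
    """One numeric observation; field names are hypothetical."""
    lat: float
    lon: float
    temperature_c: float
    vegetation_index: float


def in_box(lat, lon, box):
    """True if the point lies inside (south, west, north, east)."""
    south, west, north, east = box
    return south <= lat <= north and west <= lon <= east


def integrated_query(place, gazetteer, documents, records,
                     temperature_range, min_vegetation):
    """Use the gazetteer box to select documents and numeric records together."""
    box = gazetteer[place]
    low, high = temperature_range
    docs = [d["title"] for d in documents if in_box(d["lat"], d["lon"], box)]
    hits = [r for r in records
            if in_box(r.lat, r.lon, box)
            and low <= r.temperature_c <= high
            and r.vegetation_index >= min_vegetation]
    return docs, hits


gazetteer = {"Santa Barbara": (34.3, -120.0, 34.6, -119.5)}  # rough, illustrative box
documents = [
    {"title": "Coastal vegetation survey", "lat": 34.42, "lon": -119.70},
    {"title": "Desert soil report", "lat": 35.50, "lon": -117.00},
]
records = [
    GriddedRecord(34.45, -119.80, 18.0, 0.7),  # inside box, mild, dense
    GriddedRecord(34.40, -119.60, 31.0, 0.8),  # inside box, too hot
    GriddedRecord(34.50, -119.90, 17.0, 0.2),  # inside box, sparse
    GriddedRecord(36.00, -118.00, 18.0, 0.9),  # outside box
]

docs, hits = integrated_query("Santa Barbara", gazetteer, documents, records,
                              temperature_range=(10.0, 25.0), min_vegetation=0.5)
print(docs)
print(len(hits))
```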

Image Concept Switching

Image Query: By browsing a texture (tile) catalog, show me information about residential and farm land areas.

Result: A set of related images is retrieved and shown in the Results Frame. The full-size image #368 is displayed with its place names and tile locations.

INFORMATION SPACEFLIGHT

Landscape as category map visualization

Valleys are semantic clusters
Hills are semantic distances

Traversal across multiple levels of abstraction

Category Maps

SELF-ORGANIZING MAPS (SOMs)

INFORMATION SPACEFLIGHT

Flying through Cyberspace

THE NET OF THE 21st CENTURY

Beyond Objects to Concepts
Beyond Search to Analysis
Problem Solving via Cross-Correlating Multimedia Information across the Net

Every community has its own special library
Every community does semantic indexing

The Interspace is true Cyberspace

Subject Assignment

Improved Search by Identifying Subjects

Human Indexers classify Documents
From Subject Thesaurus and Knowledge

Interactive Support for Community Curators
(Subject Experts but Classification Amateurs)

Use Concept Spaces to Suggest Subjects
From Related Documents in the Collection

See Best Paper Nominee at ACM DL 98
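Suggesting subjects from related documents can be sketched as a simple vote. This is an illustrative assumption, not the method from the cited paper: related documents are taken to be those sharing concepts with the new document, and each contributes its subjects weighted by the size of the overlap. The function name, record layout, and sample data are invented.

```python
from collections import Counter


def suggest_subjects(new_concepts, collection, limit=3):
    """Vote for subjects of already-classified documents that share concepts."""
    votes = Counter()
    new_concepts = set(new_concepts)
    for doc in collection:
        overlap = len(new_concepts & set(doc["concepts"]))
        if overlap:
            for subject in doc["subjects"]:
                votes[subject] += overlap
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Toy collection of documents that a curator has already classified.
collection = [
    {"concepts": ["concept space", "semantic indexing"],
     "subjects": ["Information Retrieval"]},
    {"concepts": ["semantic indexing", "noun phrase", "parser"],
     "subjects": ["Natural Language Processing", "Information Retrieval"]},
    {"concepts": ["texture", "aerial photograph"],
     "subjects": ["Image Processing"]},
]

suggested = suggest_subjects(["semantic indexing", "noun phrase"], collection)
print(suggested)
```

The ranked subjects are candidates for the curator to accept or reject, which fits the "interactive support" framing: the subject expert makes the final call.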

Structure Assignment

Improved Search by Identifying Structures

Human Indexers classify Clusters
From Generic Structures beyond Subjects

Universal Structures Cross-Cultural

Interactive Support for Community Curators
(Subject Experts but Classification Amateurs)

Necessary for Peer-Peer Infrastructure
When Ordinary Persons form Communities

The Structures of Everyday Life

Bodies (individuals)
  Food and Clothes

Buildings (groups)
  Houses and Cities

Transportation (physical interactions)
  Rails (trains) and Roads (cars)

Communication (logical interactions)
  Phones (talking) and Computers (retrieving)

Navigating Universal Structures

A planet for every kid’s local environment
Federating the planets into a universe
Ordering all planets from kid’s Point Of View
Flying through the Kids Universe
Finding similar kids from different POVs
Connecting historically through museums