@
On Boosting Semantic Web Data Access
Li Ding
Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County
Advisor: Tim Finin
Date: Jan 19, 2005
2
@
Outline
Introduction
  Thesis statement
  Contributions to computer science
Research description
Research plan
  Preliminary and planned work
    WOB-CORE: modeling the Semantic Web with its context
    Swoogle: digesting and searching the Semantic Web
    WOB: evaluating semantic web data quality
Summary
  Thesis schedule
@
1. Introduction
The Semantic Web in the Web
Motivation
Thesis statement
4
@
The Web
The Semantic Web in the Web
Semantic Web Data Access
wrapper service
database(Web) document
Static RDF document
RDF/XML, N3, N-Triple, OWL/XML…
RDF Graph
Agent &Web Service
HTTP HTTP, SOAP FIPA, SOAP,…
Agent World Inference Translation
Application
5
@
The growing semantic web data
More data (from Swoogle Today, Jan 16, 2005)
  335,858 RDF documents (vs. 8,058,044,651 for Google)
  156,504 ontological terms (classes or properties)
  46,987,876 triples
Well populated ontologies (organization adoption)
  Blog, news feed (e.g. rss)
  Personal homepage and social networking (e.g. foaf, bio)
  Digital library (e.g. dc, dcTerms)
  Copyright: Creative Commons (cc)
  Software configuration (trustix)
  Dictionary (e.g. wordnet)
  Scientific data (e.g. CRISISCat - California Invasive Species Information Catalog)
Potential semantic web data
  Bibliography
  CIA world fact book
6
@
Three challenges before utilizing semantic web data
[Figure: an agent (Joe) asks the Semantic Web “Where does George live?” and meets three challenges]
1. Ontology dictionary (web scale semantic web vocabulary): “Which `live’?” “I mean ex:livesIn”
2. Data access service (web scale semantic web data access): “Get it!” for the query foo:George ex:livesIn ?x
3. Quality of RDF graph: “Which to believe?” (Rank? Trust?) when two sources give different answers
  foo:George ex:livesIn ex:TheWhiteHouse
  foo:George ex:livesIn ex:Texas
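The query foo:George ex:livesIn ?x from this slide can be made concrete with a small sketch. It is only an illustration under stated assumptions: triples are plain Python tuples, terms starting with "?" are treated as variables, and the helper name match is hypothetical rather than part of any tool named in the slides.

```python
def match(triples, pattern):
    """Yield one variable binding per triple that fits the pattern.

    Terms starting with '?' are variables (an assumption of this sketch);
    every other term must equal the triple's term exactly.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


# The two conflicting statements shown on the slide.
TRIPLES = [
    ("foo:George", "ex:livesIn", "ex:TheWhiteHouse"),
    ("foo:George", "ex:livesIn", "ex:Texas"),
]

answers = [b["?x"] for b in match(TRIPLES, ("foo:George", "ex:livesIn", "?x"))]
```

Both statements match the pattern, which is exactly why the third challenge (which answer to believe) arises once the data has been found.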
7
@
Motivation
The utility of semantic web data access depends on three factors
  Availability: how much semantic web data is available in the Web
  Accessibility: how easily and effectively users can obtain the data they want
  Quality: how well semantic web data can satisfy users’ requirements
UtilitySWDA = f (Availability, Accessibility, Quality)
Applications
  Spire: sharing scientific information using the Semantic Web
  SemDis: discovering and evaluating semantic associations in the Semantic Web
8
@
Spire is a distributed, interdisciplinary research project exploring the use of semantic web technologies in support of science in general and the field of ecoinformatics in particular.
Ecological Networks
California Invasive Species Information Catalog
UMBC Tree Survey
NBII-CAIN
Pacific Ecoinformatics and Computational Ecology Lab
Darwin Core
ebiquity@UMBC
MindSwap@UMCP
SF Tree Survey
who@where
How can these data be searched and used?
publisher
creator
creator
creator
creator
Sharing semantic web data published by different sources throughout the Web
http://spire.umbc.edu/
9
@
[Figure: example RDF graph linking Mr.X to Osama Bin Laden. Recoverable node and source labels: Mr.X, Mr. Y, Company A, Organization B, Terrorist Group, Al-Qaeda, Osama Bin Laden, Afghanistan, Kabul, US, NASDAQ, CIA World Fact Book, CIA Agent W, Department of State, FOO News, Agent K. Edge labels: isPresidentOf, listedIn, invests, memberOf, locatedIn, ownedBy, relatedTo, basedIn.]
Discover complex semantic associations in the Semantic Web. Evaluate the trustworthiness of discovered associations.
Step 1: Collect semantic web data from multiple sources and merge it into one big RDF graph.
Step 2: Discover paths from Mr.X to Osama Bin Laden in the big RDF graph.
Step 3: Evaluate the trustworthiness of a discovered path with provenance and trust data.
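The three steps can be sketched in a few lines. This is a minimal illustration, not the SemDis implementation: the edges, the trust values and the function names are hypothetical, the search is a plain breadth-first search, and path trust is taken as the product of per-source trust, which assumes the sources are independent.

```python
from collections import deque


def merge(sources):
    """Step 1: merge per-source triples into one graph, keeping provenance."""
    graph = {}
    for source, triples in sources.items():
        for s, p, o in triples:
            graph.setdefault(s, []).append((p, o, source))
    return graph


def find_path(graph, start, goal):
    """Step 2: breadth-first search for a shortest directed path."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for p, o, source in graph.get(node, []):
            if o not in seen:
                seen.add(o)
                queue.append((o, path + [(node, p, o, source)]))
    return None


def path_trust(path, source_trust):
    """Step 3: multiply the trust placed in the source of each edge."""
    score = 1.0
    for _, _, _, source in path:
        score *= source_trust[source]
    return score


# Hypothetical data: which source asserts which edge is made up for the sketch.
SOURCES = {
    "NASDAQ": [("Mr.X", "isPresidentOf", "Company A")],
    "FOO News": [("Company A", "invests", "Organization B")],
    "Agent K": [("Organization B", "ownedBy", "Mr. Y")],
}
SOURCE_TRUST = {"NASDAQ": 0.9, "FOO News": 0.5, "Agent K": 0.8}

graph = merge(SOURCES)
path = find_path(graph, "Mr.X", "Mr. Y")
trust = path_trust(path, SOURCE_TRUST)
```

A single weakly trusted source (here FOO News) pulls down the score of the whole path, which is the behaviour one would want when an association depends on every edge being true.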
http://semdis.umbc.edu/
10
@
Research overview
Semantic web vocabulary Semantic web data access service
Quality of RDF graph
2. Swoogle: Digesting and Searching the Semantic Web
1. WOB-CORE: Modeling the Semantic Web and its context
3. WOB: Evaluating semantic web data quality
quality
accessibility
Consistency Importance Trustworthiness
Identify dimensions Rank importance Evaluate trustworthiness
Discover SW Digest SW metadata Search & navigation
Search URIrefs Map URIrefs
Search Ontologies Search RDF documents Semantic web “hyperlink”
Utility
Concepts Associations
11
@
Thesis Statement
Finding and evaluating information in the large-scale Semantic Web is critical to user adoption, but this need has not yet been met. We developed the Web of Belief (WOB) ontology, the Swoogle data access service, and data quality evaluation mechanisms to address these issues. These tools are shown to be effective in building semantic web metadata and boosting web-scale semantic web data access in applications like SemDis and Spire.
12
@
Contributions to computer science
WOB is the first ontology that captures and collects the metadata of the Semantic Web and its context
  RDF graph reference language
  Finer provenance model
Swoogle is one of the first data access services that digest and search the web-scale semantic web
  Adaptive semantic web discovery agent
  Semantic web metadata
    RDF graph abstract
    Ontology dictionary
    Recognized more relations among resources and documents
  Semantic web search and navigation model and service
One of the first works that investigate semantic web data quality
  Ranking the Semantic Web. We identified multiple navigation models for ranking.
  Evaluate RDF graph's trustworthiness.
@
2. Research Description
Modeling the Semantic Web with its context
Digesting and searching the Semantic Web
Evaluating semantic web data quality
14
@
Agent World
The Semantic Web and its context
The RDF Graph World
The Web
RDF Document
serializes
RDF graph
Agent
creates
believes
trusts
RDF resource
uses
Ontology
defines
trust
provenance
subClassOf
legends
Person
subClassOf
Document
subClassOf
15
@
Modeling the Semantic Web and its context
Goals
  Identify concepts and associations
  Build an ontology in OWL semantics, especially
    RDF graph reference language
    Finer provenance
  Populate this ontology by rule-based translation
Principles
  Build simple, clear and minimal ontology
  Reuse existing ontology
  Show entity identity
  Be aware of inference tractability
Evaluation
  Analytical comparison with other existing ontologies
  Satisfy the requirements of applications (Swoogle, SemDis)
16
@
Related works
WOB-core Ontology
  Meta-ontologies: RDF, OWL
  Popular ontologies: FOAF, DC
RDF graph reference
  Naïve approach: RDF test, OWL test
  RDF reification: RDF specification
  Named graphs (Carroll et al. 2004)
Provenance
  Digital library (e.g. Dublin Core)
  Database:
    data provenance (Buneman, Khanna, & Tan 2001)
    view maintenance (Cui, Widom, & Wiener 2000)
  AI:
    knowledge provenance (da Silva, McGuinness, & McCool 2003; Fox & Huang 2003)
    proof tracing, PML (da Silva, McGuinness, & Fikes 2004); TRELLIS (Gil & Ratnakar 2002)
17
@
Web-scale semantic web data access model
[Figure: interaction between three parties: agent, data access service, the Web]
Discover RDF Docs
ask (term)
Compose query
ask (query)
inform (term URIrefs)
Fetch docs
Compose local RDF graph
Query local RDF graph
Digest RDF Docs & Terms
inform (doc URLs)
Search Terms
Search RDF Docs
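The message labels in this model suggest a simple round trip between the agent and the service. The sketch below walks through one plausible ordering of that exchange; the class DataAccessService, its method names, the in-memory WEB dictionary and the example.org URLs are all hypothetical stand-ins, not Swoogle's actual interface.

```python
class DataAccessService:
    """Hypothetical stand-in for a digest and search service."""

    def __init__(self, web):
        self.term_index = {}  # lowercased local name -> set of term names
        self.doc_index = {}   # term name -> set of document URLs
        # "Discover RDF Docs" and "Digest RDF Docs & Terms" in one pass.
        for url, triples in web.items():
            for triple in triples:
                for term in triple:
                    local_name = term.split(":")[-1].lower()
                    self.term_index.setdefault(local_name, set()).add(term)
                    self.doc_index.setdefault(term, set()).add(url)

    def search_terms(self, keyword):
        """Answer ask(term) with inform(term URIrefs)."""
        keyword = keyword.lower()
        found = set()
        for local_name, terms in self.term_index.items():
            if keyword in local_name:
                found |= terms
        return sorted(found)

    def search_docs(self, term):
        """Answer ask(query) with inform(doc URLs)."""
        return sorted(self.doc_index.get(term, ()))


# A toy "Web": document URL -> triples it serializes (URLs are illustrative).
WEB = {
    "http://foo.com/ex1.rdf": [("foo:George", "ex:livesIn", "ex:Texas")],
    "http://example.org/ex2.rdf": [("foo:George", "ex:livesIn", "ex:TheWhiteHouse")],
    "http://example.org/foaf.rdf": [("foo:Joe", "foaf:knows", "foo:George")],
}

service = DataAccessService(WEB)
terms = service.search_terms("live")      # ask (term)
predicate = terms[0]                      # compose query with the chosen term
urls = service.search_docs(predicate)     # ask (query)
local_graph = [t for url in urls for t in WEB[url]]  # fetch docs, compose local graph
homes = [o for s, p, o in local_graph if s == "foo:George" and p == predicate]
```

The agent only downloads the two documents that use ex:livesIn and then answers its question against the small local graph, which is the point of separating search from querying.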
18
@
Digesting and searching the Semantic Web
Goals
  Web-scale semantic web data access model
  Data access service
    Adaptive RDF document discovery
    Digest semantic web metadata
    Semantic web search and navigation model and service
Principles
  Scalable design
  Real world application
Evaluation
  Statistical report on collected metadata, web service usage
  Precision and recall of search result
  Users’ satisfaction on search and navigation model
19
@
Related works: SW vs. Web IR vs. DB
SW vs. Web IR: vocabulary, data model, query
SW vs. DB: implicit data, query scale, vocabulary
20
@
Related works (cont’d)
Ontology based annotation systems
  Annotate web documents
    SHOE (UMCP, 1997), Ontobroker (AIFB, Karlsruhe, 1998), WebKB (Martin & Eklund, 1999), QuizRDF (BT, 2002)
  Annotate proper reference & relations
    CREAM (AIFB, 2003)
Ontology repositories
  DAML ontology library
  Schema Web
  Semantic web central
Semantic web ontology browsers
  W3C’s Ontaria (2004)
Semantic web instance databases
Semantic web search
Swoogle
  Discovery
    Meta-crawler
    focused crawler
    sw-crawler
  Digest
    DC
    W3C’s Annotea
    OWL & RDFS
  Search & Navigation
    Web IR (TFIDF)
    RDF database query (e.g. RDQL, SPARQL)
    Term navigation (e.g. Ontaria, Hyperdaml)
21
@
Evaluating semantic web data quality
Goals
  Investigate dimensions of semantic web data quality
  Evaluate semantic web data quality
    Ranking RDF resources and RDF documents
    Evaluating RDF graph trustworthiness
  Trust and provenance based semantic web navigation model
Principles
  Semantic web data quality dimensions vary for different granularity and/or background knowledge
Evaluation
  Analysis and proofs over navigation models and trust propagation models
  Simulation (Ding et al. 2004b) for quantifying convergence & effectiveness
  Application (Spire, SemDis) users’ feedback
22
@
Related works
Data quality dimensions
  Information science (Wang, Storey, & Firth 1995) categorizes data quality dimensions by domain interests
    Integrity (Database)
    User-satisfaction (Psychology)
    Statistics (auditing methods)
    Ontological world-modeling (Wand & Wang 1996)
Imperfect information
  Taxonomy: (Smithson 1989) (Smets 1991) (Parsons 1996)
  Computational models (Parsons & Hunter 1998)
    probabilistic theory, possibility theory, evidence (Dempster-Shafer) theory
23
@
Related works (cont’d)
Ranking
  Complex network analysis (Newman 2003)
  Text document ranking
  Web page ranking:
    PageRank (Page et al. 1998; Haveliwala 1999), HITS (Kleinberg 1998)
  Semantic ranking:
    Ranking RDF resources: (H.Zhuge & Zheng 2003)
    Ranking RDF document: Swoogle (contributed by Tim Finin, Rong Pan, 2004)
  Social network analysis
Trustworthiness
  Content analysis: RDF graph difference (Berners-Lee & Connolly 2004)
  Context analysis: semantic web trust layer
    Information security (Hyvonen 2002)
    Trust network (Golbeck, Parsia, & Hendler 2003; Richardson, Agrawal, & Domingos 2003; R.Guha et al. 2004; Ding et al. 2004)
    Semantic web publishing (Carroll & Bizer 2004)
    SWAD-Europe’s trust ontology (Arenas et al. 2004)
@
3. Research Plan
Research objectives and status
25
@
Research objectives and status
Phase  Objectives        Artifacts to produce
1      WOB               WOB-core ontology (with provenance)
                         RDF graph reference language
                         Provenance
                         translated WOB-core instances
2      Swoogle           adaptive discovery agent
                         semantic web metadata *
                         search and navigation services *
                         Swoogle statistics *
3      SW data quality   WOB-quality extension
                         navigation and ranking model
                         trust inference algorithms
                         trust based navigation model
4      Finalize          Dissertation
Spiral research model: 1. Prototype; 2. Complete & revise prototype; 3. Evaluation & Justification
* This is a joint work with others.
@
Preliminary and planned work: Web Of Belief (WOB)
WOB-Core ontology
RDF graph reference
Provenance
Status and next step
27
@
Agent World
WOB-core ontology
The RDF Graph World
The Web
wob:RDFDocument
wob:RDFgraphRef
foaf:Agent
rdfs:Resource
owl:Ontology
foaf:Person
foaf:Document
Association
wob:source
wob:Association
wob:connective
rdfs:domain
rdfs:subClassOf
rdfs:subClassOf
rdfs:subClassOf
wob:isdefinedby
dc:source
wob:creator
wob:sourceDocument
rdfs:subPropertyOf
rdfs:subPropertyOf
28
@
RDF graph reference
Reference entire RDF graph
  Reference the RDF graph from a document
  Reference the RDF graph defined by usePattern
Reference partial RDF graph
  Accept a set of triples
  Reject a set of triples
  Special cases
    Referencing class instance
    Wildcard: “John hasChild _:x”
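One way to read the accept, reject and wildcard options is as pattern filters over a document's triples. The sketch below follows that reading; it is an assumption about the intended semantics, not the WOB reference language itself. The wildcard _:x is represented by None, and the function names and example triples are hypothetical.

```python
def matches(pattern, triple):
    """A pattern term of None is a wildcard; any other term must be equal."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def resolve(graph, accept=None, reject=()):
    """Return the triples a reference denotes.

    accept=None means "the entire RDF graph"; otherwise a triple must match
    at least one accept pattern. Triples matching a reject pattern are dropped.
    """
    selected = []
    for triple in graph:
        accepted = accept is None or any(matches(p, triple) for p in accept)
        rejected = any(matches(p, triple) for p in reject)
        if accepted and not rejected:
            selected.append(triple)
    return selected


# Illustrative document content.
GRAPH = [
    ("ex:John", "ex:hasChild", "ex:Mary"),
    ("ex:John", "ex:hasChild", "ex:Tom"),
    ("ex:John", "foaf:mbox", "mailto:john@example.org"),
]

# Wildcard reference: "John hasChild _:x".
children = resolve(GRAPH, accept=[("ex:John", "ex:hasChild", None)])
# Partial reference by rejection: everything except the mailbox triple.
without_mbox = resolve(GRAPH, reject=[("ex:John", "foaf:mbox", None)])
```

In this toy graph both references select the same two triples, which shows that accept and reject lists are two routes to the same partial graph.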
29
@
RDF graph reference: an example
wob:RDFGraphRef
wob:RDFDocument
http://foo.com/ex1.rdf
wob:SimpleTriple
foo:George
ex:livesIn
ex:Texas
rdf:type
rdf:type
rdf:type
wob:sourceDocument
wob:usePattern
wob:subject
wob:predicate
wob:object
foo:George
foaf:mbox
ex:livesIn
ex:Texas
http://foo.com/ex1.rdf
30
@
Provenance in the Semantic Web
              Where      Whom        Why  Definition
RDF resource  dc:source  dc:creator       rdfs:isDefinedBy
RDF graph
RDF document  dc:source  dc:creator
We differentiate the rdfs:range of provenance relation
The scope of provenance property
  Minimum semantic element: the semantics will not be complete when any triple is removed
  Complete: the entire sub-tree
  URI-complete: minimal sub-tree ends without blank nodes
dc:creator semantics
  Class instance
  Class/property definition
  Document
31
@
Provenance of RDF graph
Bob (said so)
http://foo.com/example.owl
“A is sub class of B”
whom where
why
implies“A is sub class of C”“C is sub class of B”
“Transitive rule”owl:Class
ex:A ex:Brdfs:subClassOf
rdf:type rdf:type
supports “x is instance of both A and B”
whom
32
@
Provenance of RDF resource and RDF document
Bob (said so)
http://foo.com/example.owl
“A is sub class of B”
foo.com
Whom(dc:creator)
where(dc:source)
owl:Class
ex:A ex:Brdfs:subClassOf
rdf:type rdf:type
Definition(rdfs:isDefinedBy)
Whom(dc:publisher)
where(dc:source)
Whom(dc:creator)
why
33
@
Proof
WOB-provenance
Agent RDF Document
RDF Graph
Website
wob:sourceDocument
TBD
• wob:creator• dc:creator
rdfs:subClassOf
RDF Resource• wob:sourceDocument• wob:isDefinedBy• rdfs:isDefinedBy• dc:source
• wob:sourceDocument • dc:source
TBD
• wob:creator• dc:creator
• wob:creator• dc:creator
34
@
Status and next step
We have
  Constructed the WOB conceptualization
  Proposed a preliminary RDF graph reference language
  Classified provenance in the Semantic Web
We will
  Refine and evaluate the WOB-core ontology
  Complete the RDF graph reference language
  Add why-provenance
  Populate WOB-core instances using rule based translation
  Evaluate the WOB-core ontology
@
Preliminary and planned Work: Swoogle
Discovery Digest Search and navigation Status and next step
36
@
The role of Swoogle in the Semantic Web
Semantic WebServices
Semantic web data
Software Agents, Applications
SW data service
database(Web) document
RDF document
usesuses
Directory/Digest Service
Service Finder
digestsdigests
searches
Data Finder Swoogle
37
@
Discovery - research
Crawlers
  Google-crawler
  Focused-crawler
  Semantic-Web-crawler, e.g. scutter
RDF document word indicator
  Keywords (positive list and negative list)
    filetype: 10 positive, over 100 negative
    url-pattern
    content-pattern
  Google cat-words (to refine Google query)
Revisiting URLs
  The would-be RDF document
  The out-of-date RDF document: changed, deleted
  The redirected RDF document
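The filetype indicator with positive and negative lists can be illustrated with a small URL classifier. The extension lists below are invented for the sketch (the slides only give their sizes, 10 positive and over 100 negative), and classify_url is a hypothetical helper, not the crawler's code.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical lists; the actual keyword lists are not given in the slides.
POSITIVE_FILETYPES = {".rdf", ".owl", ".daml", ".n3", ".nt", ".rss"}
NEGATIVE_FILETYPES = {".jpg", ".gif", ".pdf", ".zip", ".html", ".css"}


def classify_url(url):
    """Guess from the URL suffix whether a document is worth fetching as RDF."""
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    if extension in POSITIVE_FILETYPES:
        return "candidate"   # likely an RDF document
    if extension in NEGATIVE_FILETYPES:
        return "skip"        # almost certainly not RDF
    return "undecided"       # needs url-pattern or content-pattern checks
```

For example, classify_url("http://foo.com/ex1.rdf") returns "candidate", while a URL with no recognizable suffix stays "undecided" and would fall through to the url-pattern and content-pattern indicators.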
38
@
Discovery – current status
Crawler performance
  Google crawler is the best
  Focused crawler needs to be improved
About 1/3 of the URLs are verified pure RDF documents
Embedded RDF graph.
                 RDF docs         Non-RDF docs     Undecided   TOTAL
Focused Crawler    1,465 ( 7%)     10,580 (52%)      8,292       20,337
google crawler   273,023 (36%)    369,371 (49%)    110,794      753,188
SW_crawler        61,870 (15%)    285,506 (70%)     57,709      405,085
TOTAL            336,358          665,457          176,795    1,178,610
Source: Swoogle (2005-Jan-05)
SELECT `discovered_by`, SUM(isRDF), SUM(1-isRDF), COUNT(*) FROM `digest_url` WHERE 1 GROUP BY `discovered_by`
39
@
Digest -- research
RDF document annotation (joint work)
RDF graph abstract
Ontological term definition
Relations (joint work)
  Document-term relation
  Document-document relation
  Term-term relation
40
@
RDF document annotation (joint work)
Document
  filetype (suffix of URL)
  When/how discovered
  Last modified time
  Document hash
  Crawling info
RDF/OWL level
  RDF Syntax
  SW language
  OWL species
  Provenance (creator, publisher)
Ontology
  Label
  Version
  Comment
41
@
RDF graph abstract
Possible models
  Bag-of-words: literal, local name of resource
  Bag-of-URI: URIrefs of non-blank RDF node
  Triple: swangled triple digest (Mayfield & Finin 2003)
  Ontological term: defined/referenced/populated class/property
  Namespace: used/defined namespace
  Identity: identity of class instance
Possible methods
  Document vector
  Bloom filter (Bloom 1970)
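To show how a Bloom filter could serve as a compact bag-of-URI abstract, here is a minimal sketch. The sizes, the use of SHA-256 as the hash family and the class itself are choices made for illustration; the slides only name the technique.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit set that answers 'possibly present' or 'definitely absent'."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes positions by salting one cryptographic hash.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for position in self._positions(item):
            self.bits |= 1 << position

    def __contains__(self, item):
        return all((self.bits >> position) & 1 for position in self._positions(item))


# Abstract of one document: the URIrefs of its non-blank nodes.
abstract = BloomFilter()
for uriref in ("http://xmlns.com/foaf/0.1/Person", "http://xmlns.com/foaf/0.1/name"):
    abstract.add(uriref)

uses_person = "http://xmlns.com/foaf/0.1/Person" in abstract
```

A membership test never misses a URIref that was added, but it can report a false positive for one that was not, so such an abstract suits pre-filtering candidate documents rather than exact lookup.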
42
@
Ontological term definition
foaf:name
rdf:type owl:Classrdf:type
“Person”rdfs:label
foaf:name
“Tim Finin”
“Tim’s FOAF File”dc:title
foaf:mbox
rdfs:domain
foaf:Agent
rdfs:subClassOf
Term Definition• rdfs:subClassOf -- foaf:Agent• rdfs:label – “Person”
Empirical C-P bond• foaf:name• dc:title
Ontological C-P bond• foaf:mbox• foaf:name
rdfs:domain
file1
file3file2
foaf:Person
43
@
Relations: doc-term; doc-doc; term-term
rdfs:Resource
wob:RDFDocument
owl:Ontology
rdfs:subClassOf
swoogle:isUsedBy
swoogle:sameNamespaceswoogle:sameLocalnameC-P bond, P-C bondany RDF triple
swoogle:uses
swoogle:defines
wob:isDefinedBy
swoogle:populatesClassswoogle:populatesPropertyswoogle:refersClassswoogle:refersProperty
swoogle:definesClassswoogle:definesProperty
foaf:Documentrdfs:seeAlsordfs:isDefinedBy
swoogle:officialOntoswoogle:extensionOnto
owl:importsowl:priorVersionowl:backwardCompatibleWithowl:imcompatiableWith
44
@
Search & Navigation -- research
The Semantic Web is not simply the Web
Search service
  Document search – RDF document is not free text
  Term search – URIref contains compound local name
Navigation service
  The RDF graph – Typed links
  The web of RDF documents – Few hyperlinks
  The social network of agents – trust & provenance
45
@
URL
URIref
Semantic web search/navigation model
Resource
RDF Document
uses definesisDefinedBy
officialOntoextensionOnto
OntologyPropertyrdfs:seeAlsordfs:isDefinedBy
Ontology
isUsedBy
rdfs:subClassOf
sameNamespacesameLocalnameANY RDF PROPERTY
Term Search
Document Search
1
2 34
5
6 7
• Keywords+ Filters
• Keywords+ Filters• SPARQL• RDF graph
46
@
Status and next step
We have
  Built an automatic semantic web discovery agent
  Digested part of semantic web metadata
    RDF document annotation
    Relations: res-res; res-doc; doc-doc
  Proposed semantic web search/navigation model with prototype implementation
We will
  Make the agent adaptive
  Explore efficient RDF graph abstract
  Provide a complete search/navigation service, esp.
    Swoogle search with SPARQL search support
    Ontology dictionary with user-friendly navigation interface
  Complete Swoogle web service
  Complete Swoogle statistics for quantitative evaluation
@
Preliminary and planned work:
Semantic Web Data Quality
Dimensions of semantic web data quality Ranking RDF resources and RDF documents Evaluate RDF graph trustworthiness Trust based navigation Status and next step
48
@
Dimensions of semantic web data qualityRDF graph RDF graph +RDFS/OWL SW metadata SW metadata
+trust
weighted directed graph
RDF graph SW + Web SW + Web + agents
Term Importance centrality betweenness
rel-vagueness Importance Importance
RDF
Document
Importance
RDF
graph
graph structure
definition closeness semantic consistency rel-completeness
credibility credibility
Agent credibility
More to consider term correlation (C-P bond, P-C bond)
49
@
Ranking RDF documents and RDF resources
PageRank like navigation model
  Background knowledge decides w(p) – how credits are distributed along semantic paths from one node
Different context
  RDF graph as weighted directed graph
  RDF graph + RDFS/OWL
  RDF graph + RDFS/OWL + WOB (semantic web metadata)
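A PageRank like model in which w(p) decides how a node's credit is split can be sketched as a weighted power iteration. The damping factor, the iteration count, the even spreading from nodes without out-links and the toy weights are all assumptions of this sketch; the slides do not specify them.

```python
def rank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    weights[p][q] is the share of p's credit passed to q; each row is
    expected to sum to 1. Nodes without outgoing weights spread evenly.
    """
    nodes = set(weights) | {q for row in weights.values() for q in row}
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_score = {node: (1.0 - damping) / n for node in nodes}
        for p in nodes:
            row = weights.get(p)
            if row:
                for q, w in row.items():
                    new_score[q] += damping * score[p] * w
            else:
                for q in nodes:
                    new_score[q] += damping * score[p] / n
        score = new_score
    return score


# Toy graph: node "a" gives three quarters of its credit to "b".
WEIGHTS = {"a": {"b": 0.75, "c": 0.25}, "b": {"a": 1.0}, "c": {"a": 1.0}}
scores = rank(WEIGHTS)
```

Changing the background knowledge only changes the WEIGHTS table, so the same iteration can be reused across the three contexts listed on the slide.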
50
@
Navigation model 1: RDF graph
RDF node
Named edge
Let wg(e) be the frequency of named edges in the given RDF graph
Given a node p, each edge e from p is assigned weight wg(e), and w(p) is the normalized value
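Read literally, the definition counts how often each edge name occurs in the graph and then normalizes those counts over the edges leaving a node. The sketch below implements that literal reading; whether frequent edge names should receive more or less credit is not stated in the slides, so treat the direction of the weighting as an assumption, along with the function name and example triples.

```python
from collections import Counter, defaultdict


def edge_weights(triples):
    """Return w[p][q]: the normalized share of p's credit that flows to q."""
    # wg(e): frequency of each edge name (predicate) in the whole graph.
    frequency = Counter(predicate for _, predicate, _ in triples)
    raw = defaultdict(lambda: defaultdict(float))
    for subject, predicate, obj in triples:
        raw[subject][obj] += frequency[predicate]
    # Normalize per source node so that each row sums to 1.
    return {
        subject: {obj: w / sum(row.values()) for obj, w in row.items()}
        for subject, row in raw.items()
    }


# Illustrative graph: foaf:knows occurs three times, foaf:mbox once.
TRIPLES = [
    ("ex:a", "foaf:knows", "ex:b"),
    ("ex:a", "foaf:mbox", "mailto:a@example.org"),
    ("ex:b", "foaf:knows", "ex:c"),
    ("ex:c", "foaf:knows", "ex:a"),
]
w = edge_weights(TRIPLES)
```

Here node ex:a passes 0.75 of its credit along foaf:knows and 0.25 along foaf:mbox, because the first edge name is three times as frequent in the graph.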
51
@
Navigation model 2: RDF graph +RDFS/OWL
Individual
Class
Meta Class
Property
typetype
typeLiteral /
Resource
type*
Individual => Property is made by reading triple
type* is valid in OWL-FULL semantics
Literals and non-instance resources are ignored
  Except owl:InverseFunctionalProperty is considered (OWL-FULL)
InverseFunctionalProperty
52
@
a2http://foo.com/ex.owl
wob:sourceDocumentwob:RDFDocument
rdf:type
foaf:Document
rdfs:subClassOf
rdfs:Class
rdf:type rdf:type
rdfs:Property
wob:source
rdfs:label
rdfs:range rdfs:subPropertyOf
rdf:type
dc:title
An example
53
@
Navigation model 3: RDF graph +RDFS/OWL+WOB
Individual
Class
Meta Class
Property
typetype
type
RDF Document
Ontology
We assume Swoogle search/navigation services are used.
Rank RDF resources and RDF documents together
54
@
Evaluating trustworthiness
[Definition] A philosophical and context dependent concept. Common interpretations are reliance, faith, and confidence.
Examples
  “Is the triple (foo:George ex:livesIn foo:WhiteHouse) credible?”
  “Does foo:George (an instance of foaf:Person) always tell the truth?”
Related terms
  Belief: Trustworthiness of an RDF graph (by individual agent)
  Trust: Trustworthiness of an agent’s beliefs (by individual agent)
    [KR] An agent’s belief (assertion)
    [ML] A hypothesis of the other agents’ belief quality
    [SNA] A context dependent inter-agent relation
  Reputation: Social trustworthiness of an agent (by the public)
55
@
How a statement is justified as trustworthy
Statement: Foo is a good restaurant
  inductive: I’ve been to Foo many times, and the food was always good!
  deductive: I believe that “Restaurants with good outlook are good” and “Foo has good outlook”
  conclusive (mimic): My friends (who have similar taste to mine) said so.
  prima facie (at first view): No better alternative
  abductive: I believe that “Good restaurants have good outlook” and “Foo has good outlook”
56
@
Trust propagation in justification
Deductive – trustworthiness propagates from the premise w.r.t. the inference rule P -> Q
  tv(Q) = tv(P) * tv(P->Q)
Abductive – trustworthiness propagates from the consequence w.r.t. trustworthiness of reversing inference rule P -> Q
  tv(P) = tv(Q) * f( tv(P->Q) )
  Bayes
Inductive – trustworthiness is derived from past experiences
  Argumentation – logic coherence
  Knowledge similarity – statistic coherence
Conclusive – trustworthiness propagates from the other agents through social trust relation Trust(A,B)
  tv(S,A) = tv(trust(A,B)) * tv(S,B)
  Recommendation
Prima facie – blind trust
  tv(S) = constant (normal reputation)
  Largest take all
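The multiplicative rules above translate directly into small functions. This is a sketch of the formulas as written on the slide: the function names and the numeric values are hypothetical, the prima facie constant of 0.5 is an arbitrary default, and the abductive function f is left as a caller-supplied parameter because the slide does not define it beyond the hint "Bayes".

```python
def deductive(tv_premise, tv_rule):
    """tv(Q) = tv(P) * tv(P->Q): trust flows from premise to consequence."""
    return tv_premise * tv_rule


def abductive(tv_consequence, tv_rule, reverse):
    """tv(P) = tv(Q) * f(tv(P->Q)); 'reverse' plays the role of f."""
    return tv_consequence * reverse(tv_rule)


def conclusive(tv_trust_a_in_b, tv_statement_by_b):
    """tv(S,A) = tv(trust(A,B)) * tv(S,B): adopt B's view, scaled by trust in B."""
    return tv_trust_a_in_b * tv_statement_by_b


def prima_facie(constant=0.5):
    """tv(S) = constant: blind trust with a fixed default (value assumed here)."""
    return constant


# Illustrative numbers only.
tv_q = deductive(0.9, 0.8)                        # premise 0.9, rule 0.8
tv_s_for_a = conclusive(0.5, 0.8)                 # A half-trusts B, B rates S at 0.8
tv_p = abductive(0.8, 0.5, reverse=lambda t: t)   # identity f, for illustration
```

Because every rule multiplies values in [0, 1], trust can only decay along a chain of justifications, so long derivations end up with low trust values unless each step is highly trusted.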
57
@
Agents
The given RDF Graph
RDF graph(w ontology)
Evaluate RDF graph trustworthiness
S1S2
Foaf:person rdf:type owl:ClassS3
Foaf:person rdf:type rdfs:Class
foaf:knows
Foaf:Person
rdfs:Classowl:Classrdfs:subClassOf
(social network) Joe Mike
trusts
believesdisbelieves(Conflict belief)
1
2
3
4
Remove independence assumption by using more data
58
@
Trust and provenance aware navigation
Mechanism
  Only pursue highly trusted
  Shortest distance principle
  Derive trustworthiness using weighted consensus
  No delegation
Complexity control
  Search branch – trust filter
  Search depth
    small world
    Initiator’s control
[Figure: agents a through h arranged by distance from the initiator (distance=0, distance=1, distance=2); edge types: domain-refer, refer-refer]
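The mechanism and its complexity controls can be sketched as a breadth-first exploration with a trust filter on branches and a cap on depth, followed by a weighted consensus. The threshold, the depth limit, the trust values and the function names are assumptions made for the example, and the slide's "No delegation" rule is not modeled.

```python
def explore(trust, initiator, min_trust=0.6, max_depth=2):
    """Return {agent: distance} for agents reachable via highly trusted links.

    Breadth-first order gives each agent its shortest distance; links below
    min_trust are pruned (trust filter) and max_depth caps the search depth.
    """
    distance = {initiator: 0}
    frontier = [initiator]
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for agent in frontier:
            for peer, value in trust.get(agent, {}).items():
                if value >= min_trust and peer not in distance:
                    distance[peer] = depth
                    next_frontier.append(peer)
        frontier = next_frontier
    return distance


def consensus(opinions, weights):
    """Weighted average of agents' opinions about one statement."""
    total = sum(weights[agent] for agent in opinions)
    return sum(weights[agent] * opinions[agent] for agent in opinions) / total


# Hypothetical trust network around initiator "a".
TRUST = {"a": {"b": 0.9, "c": 0.3}, "b": {"d": 0.8}, "d": {"e": 0.9}}
reached = explore(TRUST, "a")
belief = consensus({"b": 1.0, "d": 0.0}, {"b": 0.75, "d": 0.25})
```

Agent c is dropped by the trust filter and agent e by the depth limit, so the initiator consults only b and d, whose opinions are then combined with weights chosen by the initiator.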
59
@
Status and next step
We have
  Revealed some dimensions of semantic web data quality
  Proposed some ranking mechanisms based on different navigation models and background knowledge
  Proposed some trust evaluation mechanisms based on different background knowledge
  Proposed a trust based navigation model
We will
  Consolidate semantic web data quality dimensions with more formal description
  Evaluate, justify and improve ranking and trust evaluation mechanisms
@
Summary
[R] Thesis Statement [R] Contributions to computer science Research time table Planned milestones
61
@
A tentative research time table
Phase  Objectives   Artifacts to produce              Status (%)  Time (months)  Phase time (months)
1      WOB          WOB-core ontology                 60          0.5            3
                    RDF graph reference language      30          1
                    Provenance                        50          0.5
                    translated WOB-core instances     0           1
2      Swoogle      adaptive discovery agent          50          1              5
                    semantic web metadata *           50          1
                    search and navigation services *  30          2
                    Swoogle statistics *              30          1
3      SW Quality   WOB-quality extension             20          1              6
                    navigation and ranking model      40          2
                    trust inference algorithms        50          2
                    trust based navigation model      80          1
4      Finalize     Dissertation                                  4              4
TOTAL                                                                            18
* This is a joint work with others.
64
@
Planned milestones
WOB-core ontology
  It covers all required meta-concepts in Spire and SemDis.
Swoogle
  It indexes all semantic web data needed by Spire and SemDis. We expect millions of RDF documents to be indexed.
  It performs better than Google or other semantic web portals in searching ontologies and URIrefs throughout the Web. We also look forward to searching class-instance data.
Semantic web data quality
  RDF documents and RDF resources can be ranked reasonably using semantic web metadata in WOB. We expect users to be satisfied with Swoogle search precision.
  RDF graph trustworthiness can be evaluated reasonably by using trust and provenance information in WOB.
65
@
Publications
Refereed Publications
  Li Ding et al., "On Homeland Security and the Semantic Web: A Provenance and Trust Aware Inference Framework", in Proceedings of the AAAI Spring Symposium on AI Technologies for Homeland Security, March 2005.
  Li Ding et al., "How the Semantic Web is Being Used: An Analysis of FOAF", in Proceedings of the 38th International Conference on System Sciences, January 2005.
  Li Ding et al., "Analyzing Social Networks on the Semantic Web", IEEE Intelligent Systems, January 2005.
  Li Ding et al., "Swoogle: A Search and Metadata Engine for the Semantic Web", in Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management, November 2004.
  Li Ding et al., "Modeling and Evaluating Trust Network Inference", in Seventh International Workshop on Trust in Agent Societies at AAMAS 2004, July 2004.
  Li Ding et al., "Trust Based Knowledge Outsourcing for Semantic Web Agents", in Proceedings of the 2003 IEEE/WIC International Conference on Web Intelligence, October 2003.
  Youyong Zou et al., "Using Semantic web technology in Multi-Agent systems: a case study in the TAGA Trading agent environment", in Proceedings of the 5th International Conference on Electronic Commerce, September 2003.
Non-Refereed Publications
  Li Ding et al., "Weaving the Web of Belief into the Semantic Web", submitted to WWW2004, May 2004.
66
@
Selected references
Berners-Lee, T., and Connolly, D. 2004. Delta: an ontology for the distribution of differences between rdf graphs. http://www.w3.org/DesignIssues/Diff.
Bloom, B. H. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13(7):422–426.
Carroll, J. J.; Bizer, C.; Hayes, P.; and Stickler, P. 2004. Named graphs, provenance and trust. Technical Report HPL-2004-57, HP Lab.
Cui, Y.; Widom, J.; and Wiener, J. L. 2000. Tracing the lineage of view data in a warehousing environment. ACM Trans. on Database Systems 25(2):179–227.
da Silva, P. P.; McGuinness, D. L.; and Fikes, R. 2004. A proof markup language for semantic web services. Technical Report KSL-04-01, Stanford.
da Silva, P. P.; McGuinness, D. L.; and McCool, R. 2003. Knowledge provenance infrastructure. Data Engineering Bulletin 26(4):26–32.
Fox, M., and Huang, J. 2003. Knowledge provenance: An approach to modeling and maintaining the evolution and validity of knowledge. Technical report, University of Toronto.
Gil, Y., and Ratnakar, V. 2002. Trusting information sources one citizen at a time. In Proceedings of International Semantic Web Conference 2002, 162–176.
Golbeck, J.; Parsia, B.; and Hendler, J. 2003. Trust networks on the semantic web. In Proceedings of Cooperative Intelligent Agents.
Grandison, T., and Sloman, M. 2000. A survey of trust in internet application. IEEE Communications Surveys Tutorials (Fourth Quarter) 3(4).
Hunter, A., and Parsons, S., eds. 1998. Applications of Uncertainty Formalisms. Springer.
Hyvonen, E. 2002. The semantic web – the new internet of meanings. In Semantic Web Kick-Off in Finland: Vision, Technologies, Research, and Applications.
H.Zhuge, and Zheng, P. 2003. Ranking semantic-linked network. In WWW 2003.
Josang, A. 1997. Prospectives for modelling trust in information security. In Proceedings of Australasian Conference on Information Security and Privacy.
67
@
Selected references (cont’d)
Kahn, B. K.; Strong, D. M.; and Wang, R. Y. 2002. Information quality benchmarks: Product and service performance. Communications of the ACM 45(4):184–192.
Kleinberg, J. 1998. Authoritative sources in a hyperlinked environment. In Proceedings of ACM-SIAM Symposium on Discrete Algorithms.
Mayfield, J., and Finin, T. 2003. Information retrieval on the semantic web: Integrating inference and retrieval. In Proceedings of the SIGIR 2003 Semantic Web Workshop.
McDermott, D. 2001. Why rdf’s reification doesn’t work. http://lists.w3.org/Archives/Public/wwwrdf-logic/2001Apr/0066.
McKnight, D. H., and Chervany, N. L. 1996. The meanings of trust. MISRC Working Paper Series.
Newman, M. E. J. 2003. The structure and function of complex networks. SIAM Review 167–256.
Page, L.; Brin, S.; Motwani, R.; and Winograd, T. 1998. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project.
Parsons, S., and Hunter, A. 1998. A review of uncertainty handling formalisms. In Applications of Uncertainty Formalisms.
Parsons, S. 1996. Current approaches to handling imperfect information in data and knowledge bases. Knowledge and Data Engineering 8(3).
R.Guha; Kumar, R.; Raghavan, P.; and Tomkins, A. 2004. Propagation of trust and distrust. In Proceedings of the 1st Workshop on Friend of a Friend, Social Networking and the Semantic Web.
Richardson, M.; Agrawal, R.; and Domingos, P. 2003. Trust management for the semantic web. In Proceedings of the Second International Semantic Web Conference.
Smets, P. 1998. Probability, possibility, belief: Which and where. Quantified Representation of Uncertainty and Imprecision 1:1–24.
Smithson, M. J., ed. 1989. Ignorance and Uncertainty: Emerging Paradigms. Springer Verlag.
Wand, Y., and Wang, R. Y. 1996. Anchoring data quality dimensions in ontological foundations. Communications of the ACM 39(11):86–95.
Wang, R.; Storey, V.; and Firth, C. 1995. A framework for analysis of data quality research. IEEE Transactions on Knowledge and Data Engineering 7(4):623–639.
68
@
Some ontologies and their QNames
QName    Name                                        URL
rdf      Resource Description Framework              http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs     Resource Description Framework schema       http://www.w3.org/2000/01/rdf-schema#
owl      Web Ontology Language                       http://www.w3.org/2002/07/owl#
rss      RDF site summary                            http://purl.org/rss/1.0/
foaf     Friend Of A Friend                          http://xmlns.com/foaf/0.1/
dc       Dublin Core Elements                        http://purl.org/dc/elements/1.1/
bio      A vocabulary for biographical information   http://vocab.org/bio/0.1/
cc       creative commons                            http://web.resource.org/cc/
trustix  (used but not publicly defined)             http://www.trustix.net/schema/rdf/spi-0.0.1#
wordnet  Wordnet (Princeton U.)                      http://xmlns.com/wordnet/1.6/