The Semantic Web
Mr. Mubashir Ali, Lecturer (Dept. of Computer Science)
Lecture 18
Outline
1. The Problem
2. Introduction to Semantic Computing
3. Web
– Today’s Web
– Semantic Web
4. Applications
5. Reading
Issue?
Data Explosion: Huge, Complex, Fast
Semantic Computing?
• Computing with “meanings”
• Try to extract implicit relations and meanings
• Interpret large volumes of data by machines instead of humans (hopefully in an intelligent way)
• The WWW is an example of a large volume of data
• “Google’s mission is to organize the world’s information and make it universally accessible and useful”
Issues
2.5 quintillion (18 zeros) bytes of data are created every day
Cry?
Provide me useful information
Let us see the state of the art
Who provides us useful information?
Most of the data is free … Free Asset
Google is earning 20 Billion Dollars a Quarter
Let us start some research…
Domain Name Servers
MAC Address
Network Topology
Network Devices
| Sr. No | Query on GS | Returned results | Years needed to finish the reading (five papers per day) | Citations of top 10 results | Years needed to finish reading of citations (five papers per day) |
|---|---|---|---|---|---|
| 1 | Digital Libraries | 1.8 million | 936 years | 3411 | 1.8 years |
| 2 | Ontology Evaluation | 343,000 | 188 years | 1073 | Half year |
| 3 | Schema integration | 427,000 | 234 years | 3389 | 1.8 years |
| 4 | Requirement Engineering | 1.4 million | 761 years | 341 | 0.2 years |
| 5 | Turing Machines | 154,000 | 84 years | 3957 | 2.2 years |
| 6 | Distributed computing models | 2.2 million | 1238 years | 5469 | 3 years |
| 7 | Fuzzy Logic | 1 million | 507 years | 27154 | 15 years |
| 8 | Hypermedia | 176,000 | 96 years | 12359 | 6.7 years |
| 9 | Virtual Reality | 1.9 million | 1030 years | 17098 | 9.3 years |
| 10 | Fault Tolerance | 601,000 | 329 years | 7246 | 4 years |
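The reading times in the table appear to follow from dividing the number of results by the reading rate of five papers per day. The sketch below reproduces that arithmetic, assuming a 365-day year. The function name `years_to_read` is made up for illustration. Several result counts in the table are rounded (for example "1.8 million"), so only the rows with exact counts are used here.

```python
def years_to_read(papers: int, papers_per_day: int = 5, days_per_year: int = 365) -> float:
    """Years needed to read `papers` items at a constant daily rate (assumed model)."""
    return papers / (papers_per_day * days_per_year)


# Rows of the table whose result counts are given exactly.
exact_rows = [
    ("Ontology Evaluation", 343_000),
    ("Schema integration", 427_000),
    ("Fault Tolerance", 601_000),
]

for query, results in exact_rows:
    print(f"{query}: about {round(years_to_read(results))} years")
```

The same function applies to the citation columns, for instance `years_to_read(7246)` for the Fault Tolerance row.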
Why?
• Web content is currently formatted for human readers rather than programs
• HTML is the predominant language in which Web pages are written (directly or using tools)
• Vocabulary describes presentation
HTML?
<HTML><BODY>
<H2 align=center>Nonmonotonic Reasoning: Context-Dependent Reasoning</H2>
<P align=center>
<I>by <B>V. Marek</B> and <B>M. Truszczynski</B></I>
<BR>Springer 1993<BR>ISBN 0387976892</P>
</BODY></HTML>
HTML?
• Inability to cover any content aspects – HTML only describes the appearance of documents and cannot cover any content-related aspects. It is therefore unsuitable for explicit queries.
• Inability for semantic markup – Individual elements on a page cannot be marked up semantically.
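To make the two limitations concrete, the sketch below tries to answer "who are the authors?" in two ways. From the HTML shown on the earlier slide, a program can only guess that bold text means author. From a hypothetical semantic markup, it can ask for the authors directly. The `<book>`, `<title>`, `<author>` tag names and both helper functions are assumptions made for illustration, not a standard vocabulary.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

PRESENTATIONAL = """<HTML><BODY>
<H2 align=center>Nonmonotonic Reasoning: Context-Dependent Reasoning</H2>
<P align=center><I>by <B>V. Marek</B> and <B>M. Truszczynski</B></I>
<BR>Springer 1993<BR>ISBN 0387976892</P>
</BODY></HTML>"""

# Hypothetical semantic markup; the tag names are invented for this example.
SEMANTIC = """<book>
  <title>Nonmonotonic Reasoning: Context-Dependent Reasoning</title>
  <author>V. Marek</author>
  <author>M. Truszczynski</author>
  <publisher>Springer</publisher>
  <year>1993</year>
</book>"""


class BoldCollector(HTMLParser):
    """Collects the text found inside <B> elements (a layout-based guess)."""

    def __init__(self):
        super().__init__()
        self.in_bold = False
        self.bold_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":  # HTMLParser reports tag names in lowercase
            self.in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self.in_bold = False

    def handle_data(self, data):
        if self.in_bold:
            self.bold_texts.append(data.strip())


def authors_from_html(html: str) -> list:
    """Heuristic: assume that every bold run is an author name."""
    parser = BoldCollector()
    parser.feed(html)
    return parser.bold_texts


def authors_from_semantic(xml_text: str) -> list:
    """Explicit query: the markup itself says which elements are authors."""
    root = ET.fromstring(xml_text)
    return [author.text for author in root.findall("author")]


print(authors_from_html(PRESENTATIONAL))
print(authors_from_semantic(SEMANTIC))
```

Both calls return the same names here, but the first one works only by coincidence of layout: if the page also set the publisher in bold, the heuristic would report it as an author.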
Why does this happen?
Web content is not machine-accessible because it lacks semantics:
Not in a proper structure
Not in a machine-understandable manner
Hence the reliance on keyword-based search engines (e.g. Google, AltaVista, Yahoo)
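The weakness of purely keyword-based search can be shown with a toy example. The sketch below is a deliberately naive matcher, not how any real search engine works. The function `keyword_search` and the four documents are invented for illustration.

```python
def keyword_search(query: str, documents: dict) -> list:
    """Return ids of documents that contain every query word (case-insensitive)."""
    words = query.lower().split()
    return [
        doc_id
        for doc_id, text in documents.items()
        if all(word in text.lower().split() for word in words)
    ]


# Hypothetical mini-corpus, invented for this example.
docs = {
    "d1": "cheap car rental downtown",
    "d2": "affordable automobile hire downtown",
    "d3": "jaguar habitat and diet",
    "d4": "jaguar car dealership",
}

print(keyword_search("car", docs))     # finds d1 and d4, misses the synonym in d2
print(keyword_search("jaguar", docs))  # mixes the animal (d3) with the car brand (d4)
```

The first query loses a relevant document because the words differ while the meaning is the same. The second returns an irrelevant one because the word is the same while the meaning differs.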
How to overcome these limitations?
• The current situation can be improved by adopting the following two strategies:
– Use the content as it is represented today, and develop techniques based on artificial intelligence and computational linguistics.
• This approach has been followed for some time now, but despite the advances that have been made, the task still appears too ambitious.
• Represent Web content in a form that is more easily machine-processable
– Then use intelligent techniques to take advantage of these representations (Semantic Web).
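One way to picture the second strategy is to store statements about a resource as explicit (subject, predicate, object) facts instead of formatted text. The sketch below assumes that representation. The identifiers (`book1`, `hasAuthor`, and so on) and the `match` helper are hypothetical and do not come from any standard vocabulary.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts about the book from the HTML slide.
FACTS: List[Triple] = [
    ("book1", "hasTitle", "Nonmonotonic Reasoning: Context-Dependent Reasoning"),
    ("book1", "hasAuthor", "V. Marek"),
    ("book1", "hasAuthor", "M. Truszczynski"),
    ("book1", "publishedBy", "Springer"),
    ("book1", "publishedIn", "1993"),
]


def match(facts: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the facts that fit the pattern; None acts as a wildcard."""
    return [
        (fs, fp, fo)
        for fs, fp, fo in facts
        if s in (None, fs) and p in (None, fp) and o in (None, fo)
    ]


authors = [o for _, _, o in match(FACTS, s="book1", p="hasAuthor")]
print(authors)
```

Once content is in this form, a query such as "everything published by Springer" becomes `match(FACTS, p="publishedBy", o="Springer")`, with no guessing about layout.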
What should we do?
• Simple lesson learnt from basic computer science
– Provide structure to the content
What is required?
• Organize the data as per some proper structure
• Classify as per need on the fly
Data from multiple sources
• How to organize the data meaningfully?
• Integrate Humans into Information Systems
• Vaguely defined concept: “Concepts for fostering human collaboration to solve complex problems.”
• HITs - Human Intelligence Tasks
Two Problems That Need to Be Addressed
• Nowadays the focus is much more on creating data than on analyzing and learning from it.
• Biological data is among the most complex available, but
• so far computers only create and aggregate data, so that biologists can manually analyse and curate it and feed it back into the system for the next cycle.
Continued..
• Another problem of huge data is the lack of well-defined standards.
• Every database or institute has its own unique identifiers, indexing scheme, data format and tools to deal with them all.
• There is already a big effort to integrate this data, but the amount of work to be done with current data models is huge, and because data models change quite often, the maintenance of the code base is just not feasible.
Continued..
• One way of solving this problem is to centralize all data in a single database (such as EMBL, UniProtKB, etc.),
• but for that we still need all other groups to produce the data in the same format (which does not fit all models),
• and we end up with either a lack of information (all using the same fixed format) or an integration nightmare (different formats).
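Both sides of this dilemma can be seen in a small sketch. It assumes two sources that describe the same kind of entity with different identifiers and field names. All records, field names and adapter functions below are invented for illustration and do not reflect the real EMBL or UniProtKB formats.

```python
# Two hypothetical source records; field names and identifiers are invented.
record_a = {"acc": "A-0001", "protein_name": "Example protein", "len": 120,
            "lab_notes": "batch 7"}
record_b = {"id": "XB:42", "name": "Example protein", "length_aa": 120}

# The fixed common format that every source must be squeezed into.
COMMON_FIELDS = ("source_id", "name", "length")


def from_source_a(rec: dict) -> dict:
    """Adapter for source A's format (one such function is needed per format)."""
    return {"source_id": rec["acc"], "name": rec["protein_name"], "length": rec["len"]}


def from_source_b(rec: dict) -> dict:
    """Adapter for source B's format."""
    return {"source_id": rec["id"], "name": rec["name"], "length": rec["length_aa"]}


unified = [from_source_a(record_a), from_source_b(record_b)]

# Fields that the fixed common format cannot hold are silently lost.
dropped = set(record_a) - {"acc", "protein_name", "len"}

print(unified)
print("lost from source A:", dropped)
```

Every new source needs its own adapter, and each adapter must be updated whenever that source changes its data model, which is the maintenance burden described above. Whatever does not fit the common format, such as `lab_notes` here, is the lack of information.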
Continued..
Finally, even when we reach the stage where all data is automatically integrated in real time, there is still the need to understand this data and to do what’s most important in Bioinformatics: reasoning.
Bioinformatics & Semantic Web
• Bioinformatics is an ideal field for testing Semantic Web technologies for three reasons:
– First, Web-based systems and Web databases were applied very early in Bioinformatics,
– second, the dramatic increase of data produced in the field calls for novel processing methods,
– third, the high heterogeneity of Bioinformatics data requires semantic-based integration methods.
Continued..
• During information foraging, the scientist constantly uses literature databases to read relevant articles.
• Despite the tremendous growth of 8000 articles a week, how does our researcher manage to quickly find the relevant articles?
Semantic Web - Example
• Michael had just had a minor car accident and was feeling some pain in the neck. His GP suggested a series of physical therapy sessions. Michael asked his Semantic Web agent to work out some possibilities.
• The agent retrieved details of the recommended therapy from the doctor’s agent,
• and looked up the list of therapists maintained by Michael’s health insurance company.
Continued..
• The agent checked for those located within a radius of 10 km from Michael’s office or home, and looked up their reputation according to trusted rating services.
• Then it tried to match available appointment times against Michael’s diary.
• One therapist had offered appointments in two weeks’ time; for the other, Michael would have to drive during rush hour.
• Therefore Michael decided to set stricter time constraints and asked the agent to try again.
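The filtering step in this part of the story can be sketched as a constraint check over candidate therapists. This is only an illustration of the idea, assuming the agent already holds distance, rating and availability for each candidate. The `Therapist` class, the `shortlist` function and all data values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Therapist:
    name: str
    distance_km: float  # distance from Michael's home or office
    rating: float       # score from a trusted rating service (0 to 5, assumed scale)
    first_free: date    # earliest available appointment


def shortlist(therapists, max_km, min_rating, latest_start):
    """Keep therapists that satisfy every constraint, best-rated first."""
    acceptable = [
        t for t in therapists
        if t.distance_km <= max_km
        and t.rating >= min_rating
        and t.first_free <= latest_start
    ]
    return sorted(acceptable, key=lambda t: t.rating, reverse=True)


# Hypothetical candidates; all names and numbers are invented.
candidates = [
    Therapist("A", 4.0, 4.5, date(2024, 1, 15)),
    Therapist("B", 9.5, 4.8, date(2024, 1, 3)),
    Therapist("C", 14.0, 4.9, date(2024, 1, 2)),
]

loose = shortlist(candidates, max_km=10, min_rating=4.0, latest_start=date(2024, 1, 31))
strict = shortlist(candidates, max_km=10, min_rating=4.0, latest_start=date(2024, 1, 7))

print([t.name for t in loose])   # A and B pass the loose time limit
print([t.name for t in strict])  # only B remains after tightening it
```

Setting "stricter time constraints" corresponds to calling `shortlist` again with an earlier `latest_start`. What makes the scenario a Semantic Web one is that the inputs come from different parties in a form the agent can interpret, which this sketch takes for granted.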
Continued..
• A few minutes later the agent came back with an alternative: a therapist with an excellent reputation who had free appointments starting in two days.
• However, there were a few minor problems:
• A few of Michael’s less important work appointments would have to be rescheduled. The agent offered to make arrangements if this solution was adopted.
Continued..
• The therapist was not listed on the insurer’s site because he charged more than the insurer’s maximum coverage.
• The agent had found his name in an independent list of therapists, and had already checked that Michael was entitled to the insurer’s maximum coverage, according to the insurer’s policy.
• It had also negotiated a special discount with the therapist’s agent.
• The therapist had only recently decided to charge more than average, and was keen to find new patients.
Continued..
Michael was happy with the recommendation, since he would have to pay only a few dollars extra. However, because he had installed the Semantic Web agent only a few days ago, he asked it for explanations of some of its assertions:
Continued..
• How was the therapist’s reputation established? Why was it necessary for Michael to reschedule some of his work appointments? How was the price negotiation conducted? The agent provided appropriate information.
• Michael was satisfied.
Semantic Computing
• Understanding the (possibly naturally expressed) intentions (semantics) of the user and expressing them in a machine-processable language
• Semantic Computing addresses technologies that facilitate the derivation of semantics from content and the connection of semantics into knowledge
– “content” may be anything such as video, audio, text, conversation, process, program, device, behavior, etc.
Applications
• Semantic Computing encompasses all aspects of computing where data is encoded, processed, stored or transferred using techniques that communicate the meaning of the data in addition to the data itself.
– Search engines and question answering
– Semantic web services
– Content-based multimedia retrieval and editing
– Context-aware networks of sensors, devices and applications
– Machine translation
– Creative art description
– Medicine and biology
– Semantic programming languages and software engineering
– System design and synthesis
Reading – Home Work
Data Semantics: what, where and how?
Chapter 1 of: Semantic Web Primer
Tim Berners-Lee paper: The Semantic Web
Summary
The Problem
Introduction to Semantic Computing
Web
Today’s Web
Semantic Web
Applications
Reading