The Cooperative Web A Step towards Web Intelligence Daniel Gayo Avello University of Oviedo

Post on 22-Feb-2016

Page 1: The Cooperative Web A Step towards Web Intelligence

The Cooperative Web
A Step towards Web Intelligence
Daniel Gayo Avello
University of Oviedo

Page 2: The Cooperative Web A Step towards Web Intelligence

Web Intelligence?
• Multidisciplinary effort
– Artificial Intelligence
– Information Retrieval
– Software Agents
– ...
• Early stages
• Goal: The Wisdom Web
– New web
– More useful
– Truly “intelligent”

Page 3: The Cooperative Web A Step towards Web Intelligence

The Semantic Web (in a nutshell)
• Standardized conventions (ontologies)
– objects
– attributes
– relations
• Semantic tags
– Document authors add the markup
– Software agents perform (basic) reasoning

Page 4: The Cooperative Web A Step towards Web Intelligence

So...
• Semantic Web ~ Web Intelligence Approach
• Cooperative Web ~ Web Intelligence Approach

Page 5: The Cooperative Web A Step towards Web Intelligence

Is the Cooperative Web just-another-proposal?
• Not really...
• Semantic Web
– beginning...
– human made (ontologies - at this moment)
– time to reach the whole Web (5-10 years?)
• “I know what I want and I want it now!”
• The Web ~ Legacy System
• Something...
– fully automatic
– simple
– built on top of the current web (legacy)
– between the current web (legacy) and The Wisdom Web (future)
• ...wouldn’t it be nice?

Page 6: The Cooperative Web A Step towards Web Intelligence

Cooperative Web proposal (in a nutshell)
• Simple, cheap, automatic
• Intermediate: Web ¿? Wisdom Web
• “Squeeze out” the current Web a little more...
• Main ideas:
– Concept extraction
– Automatic document taxonomies
– Computational biology

Page 7: The Cooperative Web A Step towards Web Intelligence

Concepts
• Let’s study these samples...

...Betelgeuse, a red supergiant star about 600 light years distant, is seen in this Hubble Space Telescope image - the first direct picture of the surface of a star other than the Sun...

...Designer Jim Wallace, who is developing the PlayStation 2 fighting title "Rise to Honor" with martial-arts star Jet Li, said celebrity involvement boosts the reputation of gaming in general...

...His reputation as one of America's greatest actors secured, Hoffman proceeded to star in a series of films that disappointed at the box office...

...The actor Arnold Schwarzenegger has signed for a record-setting $30 million to star in "Terminator 3"...

Page 8: The Cooperative Web A Step towards Web Intelligence

Concepts
• They’re results from the Google query “star”...

Page 9: The Cooperative Web A Step towards Web Intelligence

Concepts
• But they talk about different kinds of “stars”...

Page 10: The Cooperative Web A Step towards Web Intelligence

Concepts
• From those (and other) documents we could extract something like these “word bags”...
0: {red supergiant, star, Sun, ...}
1: {actor, actors, celebrity, films, star, ...}
• Plenty of techniques to obtain these “word bags” or “concepts”, for instance:
– Latent Semantic Indexing (Foltz, 1990)
– Concept Indexing (Karypis and Han, 2000)

Page 11: The Cooperative Web A Step towards Web Intelligence

Conceptually related documents
• Documents shown before...

Page 12: The Cooperative Web A Step towards Web Intelligence

Conceptually related documents
• ...could be transformed by dropping the “stop words”...

Page 13: The Cooperative Web A Step towards Web Intelligence

Conceptually related documents
• And then into this...

?00???????00????????1?1??????1??11??1???1?

• The last three documents are closely related, while the first one has nothing to do with them...
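The transformation on these slides (drop the stop words, then replace each remaining word by the concept it belongs to) could be sketched roughly as follows. This is only an illustration under loud assumptions: the stop-word list, the single-word "word bags", the `encode` function and the rule for ambiguous words such as "star" (take the concept with the most unambiguous hits in the same text) are all hypothetical and not taken from the talk.

```python
import re
from collections import Counter

# Hypothetical stop-word list (assumption, not from the slides).
STOP_WORDS = {"a", "about", "as", "in", "is", "of", "the", "this", "to"}

# Hypothetical single-word "word bags", loosely echoing the two bags shown earlier.
CONCEPTS = {
    "0": {"red", "supergiant", "star", "sun"},
    "1": {"actor", "actors", "celebrity", "films", "star"},
}

def encode(text):
    """Return one character per non-stop word: a concept id, or '?' if unknown."""
    tokens = [t for t in re.findall(r"[a-z0-9]+", text.lower())
              if t not in STOP_WORDS]
    # For each token, list the concepts whose bag contains it.
    candidates = [sorted(cid for cid, bag in CONCEPTS.items() if t in bag)
                  for t in tokens]
    # Unambiguous tokens "vote" for their concept.
    votes = Counter(c[0] for c in candidates if len(c) == 1)
    encoded = []
    for c in candidates:
        if not c:
            encoded.append("?")          # word belongs to no known concept
        elif len(c) == 1:
            encoded.append(c[0])         # word belongs to exactly one concept
        else:
            # Ambiguous word: pick the concept with strictly more votes, else '?'.
            ranked = sorted(c, key=lambda cid: votes[cid], reverse=True)
            best, runner_up = ranked[0], ranked[1]
            encoded.append(best if votes[best] > votes[runner_up] else "?")
    return "".join(encoded)

print(encode("Betelgeuse, a red supergiant star"))   # ?000
print(encode("The actor signed to star in films"))   # 1?11
```

Strings like these are what the following slides compare with a string distance.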

Page 14: The Cooperative Web A Step towards Web Intelligence

Text strings...
• This way of representing free text...

?00???????00????????1?1??????1??11??1???1?

• ...could be well-suited to determine the distance between documents.

• Let’s see a simpler technique to get the distance between text strings...

Page 15: The Cooperative Web A Step towards Web Intelligence

Text strings...
• Three simple strings:
– BENJI
– DANI
– HENRY
• How closely are they related?
• Let’s define the distance between two strings as the number of letters to delete + the number of letters to change + the number of letters to insert...
• ...to transform one string into the other.

Page 16: The Cooperative Web A Step towards Web Intelligence

Text strings...
• Distance between BENJI and DANI: 3
BENJI → DENJI (1), DENJI → DANJI (2), DANJI → DANI (3)
• Distance between DANI and HENRY: 4
DANI → HANI (1), HANI → HENI (2), HENI → HENRI (3), HENRI → HENRY (4)
• Distance between BENJI and HENRY: 3
BENJI → HENJI (1), HENJI → HENRI (2), HENRI → HENRY (3)
• This is known as the Levenshtein distance, and it will help us understand the next step...
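The three distances above can be checked with a few lines of code. What follows is a minimal sketch of the usual dynamic-programming formulation of the Levenshtein distance; the function name `levenshtein` is just a label chosen here, and each delete, insert or change is assumed to cost 1, as in the definition on the previous slide.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of deletions, insertions and changes turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b)  # change (free if equal)
            ))
        previous = current
    return previous[-1]

print(levenshtein("BENJI", "DANI"))   # 3
print(levenshtein("DANI", "HENRY"))   # 4
print(levenshtein("BENJI", "HENRY"))  # 3
```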

Page 17: The Cooperative Web A Step towards Web Intelligence

Someone’s in the kitchen with DNA
• DNA: a highly complex molecule made from only 4 different kinds of components:
– Adenine - A
– Cytosine - C
– Guanine - G
– Thymine - T
• So, DNA molecules ~ simple (but huge) text strings
– CCAAGGA...
– CCAAGGAAACTCACTA...
– GATTACA...

Page 18: The Cooperative Web A Step towards Web Intelligence

Someone’s in the kitchen with DNA

• If DNA ~ text string then distances between two or more strings can be easily computed...

(Ursing and Arnason, 1998)
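As a rough illustration of computing distances among several strings at once, the sketch below builds a table of pairwise distances over three toy fragments and picks the closest pair. Everything in it is an assumption made for the example: the fragments are invented (they are not real sequences and do not complete the truncated ones above), the names `edit_distance`, `distance_matrix` and `closest_pair` are hypothetical, and the insert/delete/change distance is used only as a stand-in. It is not the method of the cited paper.

```python
from itertools import combinations

def edit_distance(a, b):
    """Insert/delete/change distance, computed with a full DP table."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i              # delete every letter of a[:i]
    for j in range(cols):
        table[0][j] = j              # insert every letter of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            change = table[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            table[i][j] = min(table[i - 1][j] + 1, table[i][j - 1] + 1, change)
    return table[-1][-1]

# Invented toy fragments (assumption): not real DNA sequences.
FRAGMENTS = {"seq_a": "GATTACA", "seq_b": "GATTAGA", "seq_c": "CCAAGGA"}

def distance_matrix(sequences):
    """Map each unordered pair of sequence names to its distance."""
    return {(x, y): edit_distance(sequences[x], sequences[y])
            for x, y in combinations(sorted(sequences), 2)}

def closest_pair(sequences):
    """Return the pair of names with the smallest distance, and that distance."""
    matrix = distance_matrix(sequences)
    pair = min(matrix, key=matrix.get)
    return pair, matrix[pair]

print(closest_pair(FRAGMENTS))  # (('seq_a', 'seq_b'), 1)
```

Repeatedly merging the closest pair is one common way to grow a tree from such a table, which is the kind of structure a document taxonomy would need.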

Page 19: The Cooperative Web A Step towards Web Intelligence

What if...

Could it be possible to adapt computational biology algorithms to distill semantics from the web in an automatic fashion?

Page 20: The Cooperative Web A Step towards Web Intelligence

Cooperative Web architecture

[Diagram with numbered steps; labels: User, Software agent, Browsing history, Document taxonomy]

Page 21: The Cooperative Web A Step towards Web Intelligence

So, the Cooperative Web would be...

A layer over the Web to provide semantics in an automatic fashion, “inspired” by computational biology

Page 22: The Cooperative Web A Step towards Web Intelligence

Work in progress...

• The Cooperative Web is just a proposal (at this moment)

• Some prototypes soon (I hope...)

Page 23: The Cooperative Web A Step towards Web Intelligence

The Cooperative Web
A Step towards Web Intelligence

Thank you! Any questions?

Page 24: The Cooperative Web A Step towards Web Intelligence

References
• Foltz, P.W. (1990), "Using Latent Semantic Indexing for Information Filtering", Proceedings of the ACM Conference on Office Information Systems, Boston, USA, pp. 40-47.

• Karypis, G., and Han, E. (2000), "Concept indexing: A fast dimensionality reduction algorithm with applications to document retrieval and categorization", Technical Report TR-00-0016, University of Minnesota.

• Ursing, B.M., and Arnason, U. (1998), "Analyses of mitochondrial genomes strongly support a hippopotamus-whale clade", Proceedings of the Royal Society of London. Series B, Biological Sciences, 265:2251-2255.