from research to business: the web of linked data

From research to business: the Web of linked data Enterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 - © CEFRIEL 2009 From research to business: the Web of linked data Irene Celino – Semantic Web Practice CEFRIEL – ICT Institute, Politecnico di Milano email: [email protected] – web: http://swa.cefriel.it

DESCRIPTION

invited talk at the Enterprise X.0/Econom Workshops @ BIS 2009 (April 29th, 2009)

TRANSCRIPT

Page 1: From research to business: the Web of linked data

From research to business: the Web of linked data
Enterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 – © CEFRIEL 2009

Irene Celino – Semantic Web Practice
CEFRIEL – ICT Institute, Politecnico di Milano
email: [email protected] – web: http://swa.cefriel.it

Page 2: From research to business: the Web of linked data

Agenda

The problem of integration
• Web as a platform
• Linked data

How do we produce linked data today?
• The case of Service-Finder

How do we manage linked data today?
• The case of Urban Computing in LarKC

What’s next?
• What’s already going on
• Business view
• Scientific & technical view

Page 3: From research to business: the Web of linked data

The problem of integration

When do we have an integration problem?
• Very large amounts of data that grow and evolve continuously: problem of scale
• Numerous and different data typologies (documents, media, email, Web results, contacts, etc.): problem of data heterogeneity
• Numerous and different information systems (DB, legacy systems, ERP, etc.): problem of system heterogeneity

Page 4: From research to business: the Web of linked data

When 1 + 1 > 2?

Data integration always gives an added value
• Getting a global high-level view
• Sharing knowledge
• Business opportunities
• Business Intelligence

Still, there is the technological problem: how to reconcile data heterogeneity?
• Who took advantage of integration?
• Can the (Semantic) Web be of help?

Page 5: From research to business: the Web of linked data

Lesson learned from Web 2.0

Participation politics and “wisdom of the crowds”

Great success of mash-ups
• Mash-ups: applications made up of light integration of artifacts provided by third parties (often API or REST services)
• New integration paradigm for application development

Publication and access via Web
• Storing our information on the Web is becoming easier and easier
• Accessing our information on the Web (e.g. by retrieving it with search engines) is becoming more and more frequent

Page 6: From research to business: the Web of linked data

The Web as integration platform

What if we integrate on the Web?
• Web as a platform
• Data prosumer (producer + consumer)

“Web of Data”
• From the current “Web of Documents” to a Web of data
• Not only information retrieval, but also data retrieval

Exposing your data on the Web
• Converting/translating to a suitable format
• “Wrapping” the data source

Tools shown: D2R, Virtuoso, SquirrelRDF, SPASQL, Relational.OWL, DartGrid, SPOON, Triplify, R2O, Talis
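As a rough illustration of what “wrapping” a data source involves, the sketch below turns one relational row into N-Triples statements. It does not reproduce how any of the tools named on this slide work internally. The `http://example.org/` namespace, the function names and the sample `person` table are assumptions made only for this example.

```python
from urllib.parse import quote

# Hypothetical namespace for the example; a real deployment would mint its own URIs.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape a value for use inside a plain N-Triples string literal."""
    text = str(value)
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def row_to_ntriples(table, key, row):
    """Map one row (a dict) to N-Triples lines: one subject per row, one predicate per column."""
    subject = f"<{BASE}{quote(table)}/{quote(str(row[key]))}>"
    lines = []
    for column, value in row.items():
        # The key already appears in the subject URI; NULL values yield no triple.
        if column == key or value is None:
            continue
        predicate = f"<{BASE}{quote(table)}#{quote(column)}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


if __name__ == "__main__":
    sample = {"id": 7, "name": "Ann", "city": "Milano"}
    for line in row_to_ntriples("person", "id", sample):
        print(line)
```

Every value is emitted as a plain string literal here; a fuller mapping would also turn foreign keys into links between resources, which is where the "linked" part of linked data comes from.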

Page 7: From research to business: the Web of linked data

Linked data and data cloud

Linked Data
• The realization of the “Web of Data” (and of the Semantic Web)
• Tim Berners-Lee: http://www.w3.org/DesignIssues/LinkedData

Linking Open Data Initiative
• A community publishing and linking data on the Web
• http://linkeddata.org/

Data cloud
• Today everybody talks about cloud computing
• However, often it is not only a computation or storage issue, but also a matter of data and knowledge management

Page 8: From research to business: the Web of linked data

Challenges for linked data

Automatic linked data creation and linkage
• Automatic generation of linked data and smart mechanisms to identify “contact points” between different data sources and to seamlessly link them

Distributed querying
• Querying distributed data over different Web sources, regardless of the “physical position” of the data, and getting aggregated results

Distributed reasoning
• Applying inference techniques to distributed data, preserving consistency and correctness of the reasoning
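The distributed querying challenge can be pictured with a toy example: the same triple pattern is sent to several sources and the answers are merged into one result. This is only a sketch under strong assumptions. The sources are in-memory lists rather than remote Web endpoints, and the names `matches`, `federated_query` and the sample data are hypothetical.

```python
def matches(triple, pattern):
    """A pattern is a (subject, predicate, object) tuple in which None acts as a wildcard."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


def federated_query(sources, pattern):
    """Ask every source for matching triples and merge the answers, dropping duplicates."""
    seen = set()
    merged = []
    for triples in sources.values():
        for triple in triples:
            if matches(triple, pattern) and triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


if __name__ == "__main__":
    # Two hypothetical sources that partly overlap.
    sources = {
        "events": [("concert", "locatedIn", "Milano"), ("fair", "locatedIn", "Varese")],
        "places": [("concert", "locatedIn", "Milano"), ("museum", "locatedIn", "Milano")],
    }
    for triple in federated_query(sources, (None, "locatedIn", "Milano")):
        print(triple)
```

The caller never names a source in the pattern, which is the point of querying regardless of where the data physically sits. Real systems also have to cope with latency, partial failures and differing vocabularies, none of which is modelled here.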

Page 9: From research to business: the Web of linked data

Service-Finder
http://demo.service-finder.eu

There’s a lot of information already on the Web: how can we turn it into linked data?

Page 10: From research to business: the Web of linked data

Context: SOA onto the Web

Service Oriented Architectures (SOAs), along with Web Services technologies, are widely seen as the most promising foundation for realizing service interchange in business-to-business settings.
However, it is envisioned that SOAs and Web Services will increasingly move out of these settings and out onto the Web.

Web size
• Google: 1.000.000.000.000 URIs (08/2008)
• NetCraft: 62.000.000 active hosts

Service Web size
• Google: filetype:asmx inurl:wsdl (818)
• Service-Finder: > 25.000

[ http://aws.amazon.com/ ]

[ http://developer.ebay.com/ ]

Page 11: From research to business: the Web of linked data

The rise and fall of public UDDI registries

• One of the essential building blocks for creating applications that use the vast quantities of services available on the Web is making it easier to discover and select the right services
• UDDI was initially proposed as a component of the Web Services usage process, enabling the registration and discovery of services, but in the end UDDI did not reach its expected potential
• The critical problem in this new Web-oriented environment is one of scale, because services appear, disappear and change at a rate much higher than in business-to-business settings

UDDI Business Registry Shutdown. "With the approval of UDDI v3.02 as an OASIS Standard in 2005, and the momentum UDDI has achieved in market adoption, IBM, Microsoft and SAP have evaluated the status of the UDDI Business Registry and determined that the goals for the project have been achieved. Given this, the UDDI Business Registry will be discontinued as of 12 January 2006." [from “Registering for UDDI” 2005-12-17] [see http://xml.coverpages.org/uddi.html ]

Page 12: From research to business: the Web of linked data

Pitfalls of public UDDI registries

1. UDDI is centered around programmatic access to the registry, and only a few, mostly technically focused, user interfaces are available.

2. The information in public UDDI registries was often outdated. The value of a service entry in a public UDDI registry is minimal if the service itself does not exist anymore.

3. There are no means for community feedback. In practice, the only way to provide feedback is to contact the provider at the email address listed in the service description.

4. A WSDL definition and a short description are not sufficient for a service consumer to select a service. To decide whether a service is applicable, service consumers need to become familiar with pricing, terms and conditions, and service level agreements, to name just a few.

Page 13: From research to business: the Web of linked data

Overcoming UDDI limitations

1. Easy-to-use GUI – It is important that early adopters of Web Services technology, who learn about it for the first time, are able to start exploring it in a few simple steps

2. Search-engine style – The Web is unpredictable and services can appear and disappear (the same as websites), but one can put in place a mechanism (periodic crawling and availability checks) to eliminate the services that are no longer available

3. Architecture of participation – Learn from Web 2.0 (e.g., wikis, blogs, etc.) in enabling community contribution

4. More useful info – Include all the information a user requires to decide whether a service is applicable, e.g. pricing, terms and conditions, service level agreements, etc.
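The periodic availability check mentioned in point 2 could look roughly like the sketch below. It is not Service-Finder's actual crawler. The registry layout (URL mapped to a count of consecutive failures), the `max_failures` threshold and the function names are assumptions chosen for illustration.

```python
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if the URL can be opened; urlopen raises on HTTP 4xx/5xx and network errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError, ValueError):
        return False


def prune_unavailable(registry, probe=http_probe, max_failures=3):
    """Run one availability pass.

    registry maps a service URL to its number of consecutive failed checks.
    A service is dropped once it has failed max_failures checks in a row;
    a successful check resets its counter.
    """
    kept = {}
    for url, failures in registry.items():
        failures = 0 if probe(url) else failures + 1
        if failures < max_failures:
            kept[url] = failures
    return kept


if __name__ == "__main__":
    # A fake probe keeps the example offline: only the first service "answers".
    registry = {"http://example.org/sms?wsdl": 0, "http://example.org/fax?wsdl": 2}
    print(prune_unavailable(registry, probe=lambda url: "sms" in url))
```

Requiring several consecutive failures before removal is a design choice of this sketch: it avoids discarding a service because of one temporary outage.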

Page 14: From research to business: the Web of linked data

Project idea

Service-Finder aims at developing a platform for service discovery in which Web Services are embedded in a Web 2.0 environment

Realizing Web Service Discovery at Web Scale

Combining smart-machine and smart-data
• Semantics: Knowledge Representation & Reasoning
• Web Services: as a basic tool to implement a Service Oriented Architecture
• Semantic Web Services: as a means to realize Service Oriented Architecture
• Web 2.0: User clustering, User-Resource correlation
• Semantic Search: Conceptual Indexing, Semantic Matching
• Automatic Semantic Annotation

http://demo.service-finder.eu

Page 15: From research to business: the Web of linked data

Key objectives

Create a Semantic Search Engine for Web Services
• Aggregates information from heterogeneous sources: WSDL, wikis, blogs and also users’ feedback and behaviour
• Create a Web Service Crawler to identify Web Services and their relevant information
• Automatically generate Semantic Service Descriptions by analyzing heterogeneous sources
• Allow efficient and effective search of collected and generated data

Provide a Web 2.0 portal
• To support users in searching and browsing for Web Services
• To give recommendations to users
• To track user behaviour for improving the accuracy of service search and user recommendations

Page 16: From research to business: the Web of linked data

Realizing Service-Finder

[Project timeline: Jan 2008, June 2008, Dec 2008, Today, Dec 2009]

Page 17: From research to business: the Web of linked data

Use cases for Service-Finder

To gather requirements we imagined several use cases
• A system administrator at a bank who is looking for an SMS messaging service that sends him an SMS whenever the bank’s on-line payment system fails
• A business and technology consultant working on an e-health project who needs to make it possible for general practitioners to send and receive faxes directly from their patient record application using an on-line service
• A web developer who, after using a service listed on Service-Finder, decides to edit the information on the portal in order to improve it for other community users

Page 18: From research to business: the Web of linked data

Requirements for Service-Finder

Within those use cases we identified more than 60 requirements, and we grouped similar requirements together into three main categories:

• Search related: search for text, search for tag, search for concept, disambiguation, facet-browsing, ranking, sorting, comparing, etc.
• Web Service information related:
  • Service details: interface, how the service can be used, its payment modalities, its terms and clauses, user-added information such as ratings, comments and tags, measured values of service levels such as availability (uptime) or performance (response time), and the service level declared by the provider
  • Provider info: name of the provider and its references, user-added information such as ratings, comments and tags
• User Community related: rating, commenting, tagging, editing, writing wiki entries, registration, recommendations

Page 19: From research to business: the Web of linked data

Architecture and Components

Page 20: From research to business: the Web of linked data

Key innovations of Service-Finder

Research Activities

Automatic Service Annotation
To automatically create Web Service descriptions by analyzing WSDL and related information
• coping with contradictions
• using a community process to verify results

User and Service Clustering
To investigate and implement techniques for:
• clustering users according to their behaviour
• clustering services according to their usage by users belonging to the same clusters

Research and Engineering Activities

Conceptual Indexing and Matching
To apply semantic technologies in the Web Service discovery domain
To adapt them to the new forms of input descriptions:
• automatic annotations, clusters, contexts

Integration Activities

Service-Finder Portal
To provide a Web 2.0 portal
• demonstrating the developed technologies
• fostering community participation

Page 21: From research to business: the Web of linked data

Beyond state of the art

Feature | State of the art | Improvement
Architecture for lightweight semantic service discovery | Approaches based on a registration process or an editorial team | Enables service discovery to scale with the upcoming increase of publicly available services
Largest and most accurate set of publicly available services | Specialized portals containing only a subset of services | Focused crawler able to identify services and related information
Automatic metadata creation for Web Services | Innovative; under-researched | Metadata generation from Web 2.0 data and services
Integration of formal and informal (textual) knowledge | Indexed textual descriptions | Hybrid match-making algorithm
Automatic creation of both user and service clusters | Only general-purpose clustering techniques exist | Specialized clustering algorithms that jointly cluster users and services
Innovative interface that combines Web 2.0 features and service-related features | Current Web 2.0 portals do not include semantic metadata | Techniques that enable handling of semantic metadata in Web 2.0 portals

Page 22: From research to business: the Web of linked data

Expected Impacts

Service-Finder provides core mechanisms to cope with changing environments:
• It uses Web principles such as openness and robustness;
• It uses explicit and implicit user interaction for the construction, improvement and validation of rich service descriptions; and
• It exploits Semantic Web technologies as a means to organize internally the data on available services.

It simplifies the service publishing process by removing the burden of any registration, and brings service discovery even to non-technical persons.
• Publishers increase their productivity by being able to provide complex services without the need to register them explicitly.
• Creators become able to design more communicative forms of content by integrating third-party services.
• Organizations can automate their processes by quickly finding adequate services.

Page 23: From research to business: the Web of linked data

Exploitation Prospects

The results of the Service-Finder project have the potential to revolutionize this market and to outperform existing solutions

Using Service-Finder for public services
• Unique chance: the market for public services is growing (xignite, cdyne, …)
• Missing alternatives
  • UDDI (shut down in 2006)
  • Google (no reliable filter / no additional information)
  • Portals (rely on an editorial process, <= 400 services)

Service-Finder can also be applied within organizations
• The number of services in organizations is increasing
• As on the Internet, repositories in big companies can quickly become outdated
• IT managers like minimally invasive technology

Page 24: From research to business: the Web of linked data

So what? Service-Finder and linked data

Even if I didn’t explicitly talk about linked data, that is exactly the result of Service-Finder
• We take information about services from the Web, we translate it into structured information describing services with respect to domain-specific ontologies, and we give this information back to the community, which can further enrich it

Is this linked data? Not yet, but:
• RDFa annotation in SF portal pages coming soon
• Services to query the knowledge base coming soon
• Possibly a “dump” of the SF knowledge base could easily be published on the Web as linked data

Page 25: From research to business: the Web of linked data

Urban Computing in LarKC
http://wiki.larkc.eu/UrbanComputing

There are lots of data sources about cities on the Web: how can we query and reason on them?

Page 26: From research to business: the Web of linked data

Context: Cities are alive

Cities come to life, grow and evolve like living beings
The state of a city changes continuously, influenced by a lot of factors
• human factors: people moving in the city or extending it
• natural factors: precipitation or climate changes

[source http://www.citysense.com]

Page 27: From research to business: the Web of linked data

Today Cities’ Challenges

Our cities face many challenges
• How can we redevelop existing neighbourhoods and business districts to improve the quality of life?
• How can we create more choices in housing, accommodating diverse lifestyles and all income levels?
• How can we reduce traffic congestion yet stay connected?
• How can we include citizens in planning their communities rather than limiting input to only those affected by the next project?
• How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security?

[ source http://www.uli.org/]

Page 28: From research to business: the Web of linked data

Urban Computing to address challenges

Page 29: From research to business: the Web of linked data

Urban Computing

A definition:
The integration of computing, sensing, and actuation technologies into everyday urban settings and lifestyles.

• Urban settings include, for example, streets, squares, pubs, shops, buses, and cafés: any space in the semipublic realms of our towns and cities
• Only in the last few years have researchers paid much attention to technologies in these spaces
• Pervasive computing has largely been applied
  • either in relatively homogeneous rural areas, where researchers have added sensors in places such as forests, vineyards, and glaciers
  • or, on the other hand, in small-scale, well-defined patches of the built environment such as smart houses or rooms
• Urban settings are challenging for experimentation and deployment, and they remain little explored

[source IEEE Pervasive Computing, July-September 2007 (Vol. 6, No. 3)]

Page 30: From research to business: the Web of linked data

Availability of Data

Some years ago, due to the lack of data, solving Urban Computing problems with ICT looked like a Sci-Fi idea
Nowadays, a large amount of the required information can be made available on the Web at almost no cost. We are running a survey and we have collected more than 50 sources of data:
• maps with streets and paths (Google Maps, Yahoo! Maps…)
• events scheduled (EVDB, Upcoming…)
• multimedia data with information about location (Flickr…)
• relevant places (schools, bus stops, airports...)
• traffic information (accidents, problems of public transportation...)
• city life (job ads, pollution, health care...)

We are running a survey (please contribute), see
http://wiki.larkc.eu/UrbanComputing/ShowUsABetterWay
http://wiki.larkc.eu/UrbanComputing/OtherDataSources

Page 31: From research to business: the Web of linked data

Are Data Mashups the solution?

IBM Lotus Mashups [source: http://www-01.ibm.com/software/lotus/products/mashups/ ]

[source: http://editor.googlemashups.com ]

[source: http://pipes.yahoo.com/pipes/ ]

[source: http://www.popfly.com/ ]

[source: http://openkapow.com/ ]

Page 32: From research to business: the Web of linked data

Data Mashups offer powerful visualizations

Google Charts API
http://code.google.com/apis/chart/

http://maps.google.it/

http://maps.yahoo.com/

MIT Simile Timeline & Timeplot
http://simile.mit.edu/timeline/
http://simile.mit.edu/timeplot/

Page 33: From research to business: the Web of linked data

Data Mashups offer simple programming abstractions

Page 34: From research to business: the Web of linked data

Not everything boils down to plumbing

Page 35: From research to business: the Web of linked data

The LarKC project

[Source: Fensel, D., van Harmelen, F.: Unifying reasoning and search to web scale. IEEE Internet Computing 11(2) (2007)]

Visit http://www.larkc.eu !

Page 36: From research to business: the Web of linked data

Sustainable mobility as an example

Urban Computing raises a set of different issues, from technological to social ones. Our experience in the field makes us believe that sustainable mobility is an exemplary case from which we can elicit generalizable requirements.
Mobility demand has been growing steadily for decades and it will continue to grow in the future.
For many years, the primary way of dealing with this increasing demand has been to increase the roadway network capacity, by building new roads or adding new lanes to existing ones. However, financial and ecological considerations are posing increasingly severe constraints on this process. Hence, there is a need for additional intelligent approaches designed to meet the demand while more efficiently utilizing the existing infrastructure and resources.

• How can we redevelop existing neighbourhoods and business districts to improve the quality of life?
• How can we create more choices in housing, accommodating diverse lifestyles and all income levels?
• How can we reduce traffic congestion yet stay connected?
• How can we include citizens in planning their communities rather than limiting input to only those affected by the next project?
• How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security?

Page 37: From research to business: the Web of linked data

A Challenging Use Case 1/2 (planning)

Actors:
• Carlo: a citizen living in Varese. The next day, he has to be at the Lombardy Region premises in Milano at 11.00.
• UCS: a fictitious Urban Computing System of the Milano area

Ways to Milano
• Private car
• FS railways
• Le Nord railways

[Map: Varese and Milano. ©2009 Google – Map Data @2009 Teleatlas – Terms of Usage]

Page 38: From research to business: the Web of linked data

A Challenging Use Case 2/2 (traveling)

Actors:
• Carlo: a citizen living in Varese. The next day, he has to be at the Lombardy Region premises in Milano at 11.00.
• UCS: a fictitious Urban Computing System of the Milano area

Ways to Milano
• Private car
• FS railways
• Le Nord railways

[Map: Varese and Milano, with two “M” markers. ©2009 Google – Map Data @2009 Teleatlas – Terms of Usage]

Page 39: From research to business: the Web of linked data

Requirements for LarKC

• Urban Computing (and Mobility Management) encompasses sensing, actuation and computing requirements.
• Much previous work in the area of Pervasive and Ubiquitous Computing investigated requirements in sensing, actuation, and several aspects of computation (from hardware to software, from networks to devices)
• In this work we focus on reasoning requirements for LarKC, which are also of general interest for the entire community working on the complex relationship of the Internet with space, places, people and content.
• Hereafter we exemplify coping with
  • representational, reasoning, and defaults heterogeneity
  • scale
  • time-dependency
  • noisy, uncertain and inconsistent data

Page 40: From research to business: the Web of linked data

Coping with representational heterogeneity

It is an obvious requirement
• data always come in different formats (syntactic and structural heterogeneity)
• legacy data not in semantic formats will always exist!
• the problem of merging and aligning ontologies is a structural problem of knowledge engineering, and it must always be considered when developing an application of semantic technologies.

Page 41: From research to business: the Web of linked data

Coping with reasoning heterogeneity

It means that the system allows for multiple reasoning paradigms, e.g.
• precise and consistent inference, for telling that at a given junction all vehicles except public transportation ones must go straight
• approximate reasoning, when calculating the probability of a traffic jam given the current traffic conditions and the past history

[ source http://senseable.mit.edu/ ]

Page 42: From research to business: the Web of linked data

Coping with defaults heterogeneity 1/2

Open World Assumption vs. Closed World Assumption
• While for an entire city we cannot assume complete knowledge, for the timetable of a bus station we can

[source: http://gizmodo.com/photogallery/trafficsky/1003143552 ]
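The difference between the two assumptions can be made concrete with a small sketch: the same missing fact is answered "false" under a closed world and "unknown" under an open world. The `ask` function, the `Answer` type and the sample facts are hypothetical and only illustrate the idea; they are not part of LarKC.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


def ask(facts, fact, closed_world):
    """Answer whether a fact holds, given a set of known facts.

    Closed world: anything not stated is false (e.g. a complete bus timetable).
    Open world: anything not stated is merely unknown (e.g. knowledge about a whole city).
    """
    if fact in facts:
        return Answer.TRUE
    return Answer.FALSE if closed_world else Answer.UNKNOWN


if __name__ == "__main__":
    timetable = {("bus 90", "departsAt", "10:15")}
    city = {("Piazza Duomo", "hasEvent", "concert")}
    # No 10:45 departure is listed, so under a closed world there is none.
    print(ask(timetable, ("bus 90", "departsAt", "10:45"), closed_world=True))
    # No market is listed, but the city data is incomplete, so we cannot tell.
    print(ask(city, ("Piazza Duomo", "hasEvent", "market"), closed_world=False))
```

A system that mixes both kinds of source therefore needs to know, per data source, which assumption applies.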

Page 43: From research to business: the Web of linked data

Coping with defaults heterogeneity 2/2

Unique Name Assumption
• A square with several stations for buses and subway can be considered a single point for multimodal travel planning, but not when the problem is giving directions within that square to a pedestrian

[Map and imagery of the square. ©2009 Google – Map Data @2009 Teleatlas – Terms of Usage ©2009 Google – Imagery @2009 Teleatlas – Terms of Usage]

Page 44: From research to business: the Web of linked data

Coping with scale

The advent of Pervasive Computing and Web 2.0 technologies has led to a constantly growing amount of data about urban environments.
Although we encounter data at a scale that is not manageable as a whole, this does not necessarily mean that we have to deal with all of the data simultaneously.
Usually, only a very limited amount of data is relevant for a single query or processing step in a specific application. For example, when Carlo is driving to Milano:

only part of the Milano map data is relevant.
the local parking information may become active through a prediction based on the known relation between bad weather conditions and destination parking lot re-planning.
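One simple way to keep only the relevant part of the data is a geographic selection around the destination. The sketch below is an assumption-laden illustration: the `Item` type, the `select_relevant` function, the square-box criterion and the coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    lat: float
    lon: float


def select_relevant(items, lat, lon, half_side_deg):
    """Keep only the items inside a square box centred on (lat, lon)."""
    return [
        item
        for item in items
        if abs(item.lat - lat) <= half_side_deg
        and abs(item.lon - lon) <= half_side_deg
    ]


# Hypothetical data; coordinates are rough and for illustration only.
ITEMS = [
    Item("parking near destination", 45.465, 9.190),
    Item("parking on the other side of town", 45.520, 9.100),
    Item("road segment near destination", 45.463, 9.188),
]

nearby = select_relevant(ITEMS, lat=45.464, lon=9.189, half_side_deg=0.01)
print([item.name for item in nearby])
```

Only the selected items would then be passed on to the more expensive reasoning steps.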

Page 45: From research to business: the Web of linked data

Coping with time-dependency

Knowledge and data can change over time.

For instance, in Urban Computing names of streets, landmarks, kind of events, etc. change very slowly, whereas the number of cars that go through a traffic detector in five minutes changes very fast.

This means that the system must have the notion of an "observation period", defined as the period during which the system is subject to querying.
Moreover, within a given observation period, the system must consider the following four types of knowledge and data:

Invariable knowledge
Invariable data
Periodically changing data, which change according to a temporal law
Event-driven changing data, which are updated as a consequence of some external event

Page 46: From research to business: the Web of linked data

Invariable knowledge and data

Invariable knowledge includes:

obvious terminological knowledge, e.g. an address is made up of a street name, a civic number, a city name and a ZIP code

less obvious nomological knowledge, which describes how the world is expected
to be, e.g. given traffic lights are switched off or certain streets are closed during the night
to evolve, e.g. traffic jams appear more often when it rains or when important sport events take place

Invariable data do not change in the observation period, e.g. the names and lengths of the roads.

[image: ©2009 Google, imagery ©2009 Teleatlas]

Page 47: From research to business: the Web of linked data

Changing data

Periodically changing data change according to a temporal law that can be:

a pure periodic law, e.g. every night at 10pm the Milano overpasses close
a probabilistic law, e.g. traffic jams appear on the west side of Milano due to bad weather or when San Siro stadium hosts a soccer match

Event-driven changing data are updated as a consequence of some external event. They can be further characterized by the mean time between changes:

Slow, e.g. roads closed for scheduled works
Medium, e.g. roads closed for accidents or congestion due to traffic
Fast, e.g. the intensity of traffic for each street in a city
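The slow/medium/fast characterization can be sketched as a classification on the mean time between changes. The slides give no numeric boundaries, so the two thresholds below are pure assumptions, as are the `ChangeRate` and `classify` names.

```python
from enum import Enum


class ChangeRate(Enum):
    SLOW = "slow"
    MEDIUM = "medium"
    FAST = "fast"


# Assumed thresholds, in seconds; not taken from the slides.
SLOW_THRESHOLD_S = 24 * 3600  # changes at most about once a day
FAST_THRESHOLD_S = 15 * 60    # changes at least every quarter of an hour


def classify(mean_time_between_changes_s):
    """Classify event-driven data by its mean time between changes."""
    if mean_time_between_changes_s >= SLOW_THRESHOLD_S:
        return ChangeRate.SLOW
    if mean_time_between_changes_s <= FAST_THRESHOLD_S:
        return ChangeRate.FAST
    return ChangeRate.MEDIUM


print(classify(7 * 24 * 3600))  # e.g. scheduled road works
print(classify(3600))           # e.g. closures due to accidents
print(classify(5 * 60))         # e.g. traffic intensity readings
```

Such a label could drive how often a data source is refreshed within the observation period.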

[image: ©2009 Google, imagery ©2009 Teleatlas]

Page 48: From research to business: the Web of linked data

Coping with noisy, uncertain and inconsistent data

Traffic data are a very good example of such data. Different sensors observing the same road area give apparently inconsistent information.

A traffic camera may say that the road is empty, whereas an inductive loop traffic detector may report that 100 vehicles went over it.
The two pieces of information may be consistent if one considers that a traffic camera transmits one image per second with a delay of 15-30 seconds, whereas a traffic detector reports the number of vehicles that went over it in 5 minutes, and the information may arrive 5-10 minutes later.

Moreover, a single reading coming from a sensor at a given moment may have no certain meaning.

An inductive loop traffic detector reports that 0 cars went over it.
Is the road empty? Is the traffic completely stuck? Did somebody park a car above the sensor? Is the sensor broken?

Combining information from multiple sensors in a given time window can be the only reasonable way to reduce the uncertainty.
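A very rough sketch of combining readings in a time window is given below. The `Observation` type, the `road_state` function and its decision rule are illustrative assumptions, not the method used in LarKC. Each reading carries the period it refers to, so that delays and different sampling periods are accounted for before readings are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    sensor: str
    start_s: float      # start of the period the reading refers to
    end_s: float        # end of that period
    vehicles_seen: int


def overlaps(a, b):
    """True if the periods of the two observations intersect."""
    return a.start_s < b.end_s and b.start_s < a.end_s


def road_state(observations, window_start_s, window_end_s):
    """Crude fusion of all the readings that refer to the given window."""
    window = Observation("window", window_start_s, window_end_s, 0)
    relevant = [o for o in observations if overlaps(o, window)]
    if not relevant:
        return "unknown"
    if any(o.vehicles_seen > 0 for o in relevant):
        return "vehicles observed"
    if len({o.sensor for o in relevant}) >= 2:
        # Independent sensors agree on zero vehicles.
        return "probably empty"
    # One zero reading alone: empty road, jam, parked car or broken sensor.
    return "uncertain"


loop_only = [Observation("loop", 0, 300, 0)]
loop_and_camera = loop_only + [Observation("camera", 299, 300, 0)]
print(road_state(loop_only, 0, 300))
print(road_state(loop_and_camera, 0, 300))
```

A single zero count stays "uncertain"; only agreement between different sensors in the same window moves the answer towards "probably empty".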

Page 49: From research to business: the Web of linked data

Towards requirements satisfaction in LarKC

The Large Knowledge Collider: a platform for infinitely scalable reasoning on the data-web

[figure: Pipeline]

Page 50: From research to business: the Web of linked data

The first Data Mashup within _________

http://www.larkc.eu

PROBLEM: Which Milano monuments or events or friends can I quickly get to from here?

[figure labels: LarKC platform; Interface; Mobile Data Mashup Environment; SPARQL query / SPARQL result; REST request / JSON response; Request data / Data; Pipeline Config.; data sources: Traffic, Monuments, Events, People]

Page 51: From research to business: the Web of linked data

A roadmap towards LarKC Urban use case

Data
Known: street topology, monuments/events/friends location, traffic situation (current data stream + historical time series)
Inferred: traffic predictions, residual street capacity

Formulating the query for LarKC
Basic: shortest path from A to B
Extended: shortest path from A to monuments/events/friends
Advanced: considering traffic predictions and residual street capacity

Configuring the pipeline
Basic configuration: combining a SPARQL processor and a Graph Processor, using AllegroGraph GeoExtension as a selector
Extended configuration: DBpedia, EVDB, GoogleLatitude selector
Advanced configuration: traffic predictions based on recurrent neural networks, residual street capacity based on data stream analysis
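The basic and advanced queries can be illustrated with a plain shortest-path search. This sketch uses Dijkstra's algorithm over a tiny hypothetical street graph. The graph, the `with_traffic` helper and the slowdown factors are assumptions for illustration and are not the LarKC Graph Processor.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm. graph: {node: {neighbour: travel_cost}}.

    Returns (cost, path), or (inf, []) if the target is unreachable.
    """
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, edge_cost in graph.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(
                    queue, (cost + edge_cost, neighbour, path + [neighbour])
                )
    return float("inf"), []


def with_traffic(graph, slowdown):
    """Scale edge costs by predicted slowdown factors (default 1.0)."""
    return {
        u: {v: cost * slowdown.get((u, v), 1.0) for v, cost in edges.items()}
        for u, edges in graph.items()
    }


# Hypothetical street graph; costs are travel times in minutes.
STREETS = {
    "A": {"B": 4.0, "C": 1.0},
    "B": {"D": 1.0},
    "C": {"B": 1.0, "D": 5.0},
    "D": {},
}

print(shortest_path(STREETS, "A", "D"))
# Assumed prediction: the street from C to B becomes five times slower.
print(shortest_path(with_traffic(STREETS, {("C", "B"): 5.0}), "A", "D"))
```

The extended query would run the same search towards each monument, event or friend location and keep the cheapest result.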

Page 52: From research to business: the Web of linked data

LarKC Early Adopters Workshop

The public launch of the first open source LarKC platform release will take place at the forthcoming European Semantic Web Conference (ESWC 2009)

Register for the event! More information at: http://earlyadopters.larkc.eu/

We are developing the Urban Baby LarKC as a showcase of the potential of such a platform
Everybody will be invited to run experiments over LarKC

The Large Knowledge Collider: a platform for massive distributed incomplete reasoning
http://www.larkc.eu

Page 53: From research to business: the Web of linked data

The next Web of open, linked data

Just research? What’s going on? Why should I care?

Page 54: From research to business: the Web of linked data

“an open, shared database of the world’s information”

Source: Freebase - http://www.freebase.com (2009)

Freebase

Page 55: From research to business: the Web of linked data

OpenCalais

Source: Thomson Reuters - http://www.opencalais.com/ (2009)

Page 56: From research to business: the Web of linked data

What’s next? Business point of view

Organizations today produce lots of data...
...and they have the problem of managing and making sense of them!

More and more often they turn to Business Intelligence and related technologies to understand and decide
But it also happens that, in order to fully understand what’s going on and to make informed decisions, the data within the organization should be integrated or enhanced with external knowledge

This could definitely be a job for linked data technology!

Page 57: From research to business: the Web of linked data

Linked data seen by the Web inventor

“Stop hugging your data”
Sir Tim Berners-Lee, 2009

Don’t let considerations about security or data ownership represent an obstacle to innovation and opportunities

[photo: www.flickr.com/photos/_-amy-_/3167333250/]

Page 58: From research to business: the Web of linked data

What’s next? Technological point of view

How do Business Intelligence and similar techniques change when their basic assumptions are no longer valid?

Dynamically changing data sources (and data themselves...)
Inconsistency typical of the Web (everything and the opposite of everything)
Partial information
More information than expected or than needed

Linked data pose new challenges for existing technologies!

Page 59: From research to business: the Web of linked data

If I didn’t convince you...

http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html

Page 60: From research to business: the Web of linked data

Thanks for your attention! Any questions?

Contacts: Irene Celino – Semantic Web Practice
CEFRIEL – ICT Institute, Politecnico di Milano
email: [email protected] – web: http://swa.cefriel.it
phone: +39-02-23954266 – fax: +39-02-23954466

Slides available at: http://www.slideshare.net/iricelino