Open Issues on Semantic Web

Daniel W. Gillman
US Bureau of Labor Statistics


The BLS Mission

The Bureau of Labor Statistics (BLS) is the principal fact-finding agency for the Federal Government in the broad field of labor economics and statistics. The BLS collects, processes, analyzes, and disseminates essential statistical data to the public, Congress, Federal agencies, State and local governments, business, and labor.


Outline
  Semantic Web – Description
  Scenario
  Problems
  Semantic Web Technologies
  Semantic Web and Metadata Management

Analysis
  Identify problems / use scenario
  Discovery, Judgment, Meaning

Not Semantic Web criticism / Stimulus for debate

METIS, 2010-03-12


Semantic Web - Description

Berners-Lee -- 1999
  “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.”


Semantic Web - Description

Web pages, readable by computer
Instead, now, humans
  Determine height of Mt Everest
  Reserve table at favorite restaurant
  Find best prices for tires for the car
Semantic Web will demand more


Semantic Web - Description

Two new IT artifacts
  Web Services
  Ontologies
Service
  Set of events with a defined interface
Web Service
  Software designed to support interoperable machine-to-machine interaction over a network


Semantic Web - Description

Ontology
  Set of concepts, the relations among them, and a computational description
  Purpose is to be able to reason, i.e., make inferences
Knowledge representation languages
  Bridge between web service and ontology


Scenario

“America’s Safest Cities”
  by Zack O’Malley Greenburg
  26 October 2009
  Forbes Magazine
Rank cities by “livability”
  Workplace fatalities
  Traffic fatalities
  Violent crimes
  Natural disaster risk


Scenario

Base comparison on MSA
  Metropolitan statistical area
Rank MSAs based on
  Numerical ranking for each measure
  Sum of rankings
Questions
  Can we find such data?
  If so, where?
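
The ranking scheme on this slide (rank each measure, then sum the ranks) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the area names and figures are made up, the function names are hypothetical, ties share a rank, and a lower value is assumed to mean a safer area.

```python
def rank_measure(values):
    """Rank 1 = lowest (safest) value; tied values share the same rank."""
    return {
        name: 1 + sum(other < value for other in values.values())
        for name, value in values.items()
    }


def overall_ranking(measures):
    """Sum the per-measure ranks and order areas from lowest sum to highest."""
    totals = {}
    for values in measures.values():
        for name, rank in rank_measure(values).items():
            totals[name] = totals.get(name, 0) + rank
    # Sort by rank sum; break ties alphabetically so the order is stable.
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))


# Hypothetical, made-up rates for three placeholder areas.
measures = {
    "workplace_fatality_rate": {"Area A": 2.0, "Area B": 3.5, "Area C": 1.0},
    "traffic_fatality_rate": {"Area A": 9.0, "Area B": 7.0, "Area C": 8.0},
    "violent_crime_rate": {"Area A": 300.0, "Area B": 450.0, "Area C": 350.0},
}

ranking = overall_ranking(measures)
print(ranking)
```

The arithmetic is the easy part. The sketch assumes every measure is already a rate for the same unit of analysis, which the source data do not guarantee.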


Scenario

Finding data -- Discovery
Workplace fatalities
– Bureau of Labor Statistics
– Data based on MSA
– Data given as number, not rate
Traffic fatalities
– National Highway Traffic Safety Administration
– Data based on city, not MSA
– Based on rates


Scenario

Violent crime
– Federal Bureau of Investigation
– Based on MSA
– Given as rate
Natural disaster risk
– SustainLane.Com
– Not federal site, based on government data
– Data based on city, but only a few
– No data, no rates, just a rank


Scenario

Using data – Judgment
  Unit of analysis = MSA
Questions
  How can we combine this data?
  Can we harmonize the differences?
  City as proxy for MSA?
Decisions are
  Qualitative
  Require human judgment


Scenario

How do we know
  MSA vs. city
  Number vs. rate
  Rank vs. rate?
Understanding – Meaning
  Requires
    Links from data sets to metadata
    Good metadata model for data semantics
  METIS is good at this
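
One way to picture "links from data sets to metadata" is a small record attached to each data set that states its geography and value type. The sketch below is an assumption-laden illustration, not a real metadata model: the class `DatasetMetadata`, its fields, and the function `mismatches` are hypothetical names, and the entries simply restate what the scenario slides say about each source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetMetadata:
    source: str
    geography: str   # "MSA" or "city"
    value_type: str  # "count", "rate", or "rank"


def mismatches(datasets, target_geography="MSA", target_value_type="rate"):
    """Return, per data set, the attributes that differ from the target."""
    problems = {}
    for name, meta in datasets.items():
        issues = []
        if meta.geography != target_geography:
            issues.append(f"geography is {meta.geography}, not {target_geography}")
        if meta.value_type != target_value_type:
            issues.append(f"values are {meta.value_type}, not {target_value_type}")
        if issues:
            problems[name] = issues
    return problems


# Entries restate the scenario slides; the keys are hypothetical labels.
datasets = {
    "workplace_fatalities": DatasetMetadata("Bureau of Labor Statistics", "MSA", "count"),
    "traffic_fatalities": DatasetMetadata(
        "National Highway Traffic Safety Administration", "city", "rate"
    ),
    "violent_crime": DatasetMetadata("Federal Bureau of Investigation", "MSA", "rate"),
    "natural_disaster_risk": DatasetMetadata("SustainLane.Com", "city", "rank"),
}

for name, issues in mismatches(datasets).items():
    print(name, "->", "; ".join(issues))
```

Once the metadata are captured, detecting the differences is mechanical. Deciding what to do about them is not.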


Problems

Meaning
  Easy – needs agency metadata
  Link meanings to data
  – Straightforward
  – Mechanical, once metadata is captured
Discovery
  Harder
  – Difficult search
  – Takes a lot of work
  – Numerous comparisons
  – Not easy to know when to stop


Problems

Judgment
  Very hard
  – Difficult to see how to automate
  – Case by case basis
If proxy OK?
  Need population for MSA
  Again, where?
  – Discovery (Census Bureau)
  – Judgment (Appropriate?)
  – Meaning (Data elements correct?)
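
The reason population is needed: a count becomes comparable with the other measures only after it is turned into a rate. A minimal sketch follows, assuming a per-100,000 scale (the slides do not say which scale the published rates use) and made-up numbers. The function name is hypothetical.

```python
def rate_per_100k(count, population):
    """Convert an event count into a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100_000 * count / population


# Made-up example: 12 events in an area of 600,000 residents.
print(rate_per_100k(12, 600_000))
```

The formula is trivial. Whether the population figure matches the geography and reference period of the count is the judgment call the slide describes.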


Semantic Web Technologies

Web services
  Any action in Semantic Web
  Several kinds
  Operation required? Web service called
Examples based on scenario
  Read data from a data set
  Display data dictionary of data set
  Calculate rates, ranks, and overall rank


Semantic Web Technologies

Ontologies
  Concept systems
  – Set of concepts
  – Relations among them
  Computational description
  – How one makes inferences
  – Logical system
  Means for organizing knowledge
  – Concepts organized for some purpose
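
As a toy illustration of "concepts, relations, and a way to make inferences", the sketch below stores a single relation between concepts ("is a") and infers the broader concepts by following it transitively. The concept hierarchy and the function name `ancestors` are hypothetical and are not drawn from any real ontology. A real ontology would use a logic far richer than this one rule.

```python
def ancestors(concept, is_a):
    """Infer every broader concept reachable through 'is a' links."""
    found = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Hypothetical concept system: each concept maps to its broader concepts.
is_a = {
    "workplace fatality": ["fatality"],
    "traffic fatality": ["fatality"],
    "fatality": ["safety measure"],
    "violent crime": ["safety measure"],
}

# Nothing states directly that a workplace fatality is a safety measure;
# the link is inferred through "fatality".
print(sorted(ancestors("workplace fatality", is_a)))
```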


Semantic Web Technologies

Ontologies
  Logics
  – Predicate calculus
  – Description logic
  – First order logic
  – Others
  Low to high formality


Semantic Web Technologies

Knowledge representation languages
  Bridge between ontology and web service
  Service uses KRL to make inferences
Typical languages
  RDF – Resource Description Framework
  – Based on “triples”
    • Subject – predicate (verb) – object
  – Triples can be linked
    • Object of one is subject of another
  – Creates Directed Graph structure
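
The triple structure can be sketched without any RDF library by treating each triple as a Python tuple and following the links. This is an illustration only: real RDF identifies subjects and predicates with URIs, whereas the plain strings, the example triples, and the function names here are hypothetical.

```python
from collections import defaultdict


def build_graph(triples):
    """Index triples by subject so each subject lists its outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def follow(graph, start):
    """Walk linked triples breadth-first from start; return edges in visit order."""
    edges, seen, queue = [], {start}, [start]
    while queue:
        node = queue.pop(0)
        for predicate, obj in graph.get(node, []):
            edges.append((node, predicate, obj))
            if obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return edges


# Hypothetical triples: the object of one is the subject of the next.
triples = [
    ("BLS", "publishes", "WorkplaceFatalityData"),
    ("WorkplaceFatalityData", "hasGeography", "MSA"),
    ("MSA", "isA", "GeographicArea"),
]

for edge in follow(build_graph(triples), "BLS"):
    print(edge)
```

Because objects can reappear as subjects, the triples form a directed graph that a program can traverse from any starting node.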


Semantic Web Technologies

Typical languages – cont’d
  OWL – Web Ontology Language
  – Comes in 3 main types
    • OWL Lite
      » More powerful than RDF, easiest, a DL
    • OWL DL
      » More powerful than OWL Lite, a DL also
    • OWL Full
      » Extends RDF Schema, almost FOL
      » Most powerful OWL, hard to implement


Semantic Web Technologies

Typical languages – cont’d
  RDF and OWL – W3C specifications
  Common Logic – ISO/IEC 24707
  – Very powerful
  – Full FOL, including some extensions
However
  Using KR ≠> Ontology
  KR languages – Difficult to implement
  – Work to build non-trivial ontology is huge
    • Subject matter experts
    • Terminology experts
    • KR and logic experts


Semantic Web and Metadata Management

Metadata play central role in SW
Linked Data – newer aspect of SW
  Berners-Lee given credit again
  Laid out 4 criteria
  – Use URIs to identify things
  – Use HTTP URIs for dereferencing
  – Provide useful metadata when URI dereferenced
  – Include links to other, related URIs
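
The four criteria can be mimicked in memory to show how they fit together. The sketch below is a simulation under stated assumptions: no network request is made, the `example.org` URIs are placeholders, and `REGISTRY`, `dereference`, and `related_labels` are hypothetical names.

```python
from urllib.parse import urlparse

# Criterion 1: every thing is identified by a URI (placeholder URIs here).
REGISTRY = {
    "http://example.org/dataset/workplace-fatalities": {
        "label": "Workplace fatalities",
        # Criterion 4: links to other, related URIs.
        "links": ["http://example.org/concept/msa"],
    },
    "http://example.org/concept/msa": {
        "label": "Metropolitan statistical area",
        "links": [],
    },
}


def dereference(uri):
    """Simulate an HTTP lookup: return the metadata stored for a URI."""
    # Criterion 2: the URI must be an HTTP URI so it can be looked up.
    if urlparse(uri).scheme not in ("http", "https"):
        raise ValueError("Linked Data expects HTTP URIs")
    # Criterion 3: dereferencing yields useful metadata.
    return REGISTRY[uri]


def related_labels(uri):
    """Follow the links of one resource and collect the linked labels."""
    return [dereference(link)["label"] for link in dereference(uri)["links"]]


print(related_labels("http://example.org/dataset/workplace-fatalities"))
```

The sketch starts from a URI that is already known. How one finds the right URI in the first place is left open.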


Semantic Web and Metadata Management

2 main reactions:
  1) No difference with traditional metadata management
  2) Raises the question
  – How does one FIND the right URI (URL)?
Answer – Ontologies! – See above!
Successful ontology
  Consistent
  Complete
  Useful


Semantic Web and Metadata Management

Consistent & Complete ≠> Useful
Discovery doesn’t need new methods
  Registries are designed for this
    SDMX
    ISO/IEC 11179
    Library card catalog


Semantic Web and Metadata Management

Judgment
  SW offers no help
Meaning
  Metadata management already solves
  METIS members are experts


Conclusion

Verdict
  SW not offering much new
SW descriptions
  Make hard problems seem easy
  Make easy problems seem hard
  – Often the “sexy” stuff


Contact Information

Daniel [email protected]