What’s going on in my Ph.D.? A short report of my first year Andrea Giovanni Nuzzolese [email protected]

Post on 20-Dec-2015


What’s going on in my Ph.D.?

A short report of my first year

Andrea Giovanni Nuzzolese

[email protected]

Summary

• Research questions and objectives

• State of the Art

• Main research activities

• Other research activities

• Attended schools and earned credits

Research Questions

1. How to recognize a possible source of Knowledge Patterns (KPs)?

2. How to mine meaning from data, e.g. Linked Data, to KPs?

3. What are the invariances, if any exist, for extracting KPs from data?

4. What could be a good automatic or semi-automatic method in order to extract KPs?

5. Why is the extraction of KPs useful?

6. How can KR and Ontology Engineering benefit from KP extraction?

Research Objectives

• Recognize invariances from any possible source of KPs

• Validate the cognitive soundness of extracted KPs

– Define a measure for the cognitive soundness of KPs

– Design a benchmark for the cognitive soundness of KPs

• Design a method which allows an efficient extraction of KPs from data

• Provide an implementation of the method

• Apply extracted KPs to specific tasks

– e.g. exploratory search, recommendation systems, explanation systems, etc.

Design Patterns

• In software engineering, design patterns represent typical and recurring schemata of good [Gamma et al., 1994] and bad [Brown et al., 1998] software architectures

• They come from architecture: “Each pattern describes a problem that occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice” [Alexander et al., 1977]

Knowledge Patterns

• In ontology engineering, KPs are general templates denoting recurring theory schemata and their transformation to create specific theories [Clark et al., 2000]

– Ontology engineering is a modeling endeavor

• [Gangemi and Presutti, 2010] presents a cognitive vision of KPs close to the notion of frames as described by [Minsky, 1975] and [Baker et al., 1998]

Knowledge Patterns (cont’d)

• In [Gangemi and Presutti, 2010] a KP is a small unit of meaning

– Task based

– Well grounded

– Cognitively sound

Ontology Extraction

• The generation of ontologies from formal and semi-formal data is frequently called Semantic Web Mining [Stumme, 2006] or Ontology Mining [d’Amato et al., 2010]

• [Völker and Niepert, 2011] presents a statistical approach to the induction of expressive schemata from large RDF repositories (Statistical Schema Induction)

• In the field of ontology learning from natural language text, [Cimiano et al., 2004] presents a method for inducing taxonomies by means of hierarchical clustering of context vectors

• [Jäschke et al., 2008] presents a method for discovering ontologies from folksonomies

Knowledge Pattern Extraction

• During my first year I focused on finding possible approaches to the problem of KP extraction

• So far two main directions for KP extraction have been analyzed

– Top-down: the extraction of KPs from foundational ontologies (e.g. Dolce), frames (e.g. FrameNet), thesauri and any other formal or semi-formal structure

– Bottom-up: the extraction of KPs directly from data, e.g. RDBs, Linked Data

My articles

• A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Semion: a smart triplication tool. In O. Corcho and J. Völker, editors, Demo Poster of the 17th Conference on Knowledge Engineering and Knowledge Management, pp. 166-167. CEUR Workshop Proceedings, Lisbon, Portugal, 2010.

• A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Fine-tuning triplication with Semion. In V. Presutti, V. Svatek, and F. Scharffe, editors, EKAW workshop on Knowledge Injection into and Extraction from Linked Data (KIELD2010), pp. 2-14. CEUR Workshop Proceedings, Lisbon, Portugal, 2010.

• A. G. Nuzzolese, A. Gangemi, and V. Presutti. Gathering Lexical Linked Data and Knowledge Patterns from FrameNet. In Proc. of the 6th International Conference on Knowledge Capture (K-CAP), pp. 41-48. ACM, Banff, Alberta, Canada, 2011.

• A. G. Nuzzolese, A. Gangemi, V. Presutti and P. Ciancarini. Encyclopedic knowledge patterns from Wikipedia links. In L. Aroyo, N. Noy and C. Welty, editors, Proceedings of the 10th International Semantic Web Conference (ISWC2011), pp. 520-536. Springer, 2011.

• A. Musetti, A. G. Nuzzolese, F. Draicchio, V. Presutti, E. Blomqvist, A. Gangemi and P. Ciancarini. Aemoo: exploratory search based on knowledge patterns over the Semantic Web. To appear in Semantic Web Challenge 2011.

[Slide labels grouping the articles above: Semion, Top-down, Bottom-up, App]

Semion

• Provides a method which

– allows any data source to be reengineered to RDF triples

– makes no assumptions about the domain semantics other than those customized by the user

• Is based on two main steps

– a syntactic transformation of the data source

– a rule-based refactoring
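The two steps above can be illustrated with a minimal sketch. This is not Semion's code or API: the function names, the `src:` prefix, the row-based input and the dictionary-based rule format are all assumptions made for illustration only. Step 1 mirrors the structure of a tabular source as triples without committing to any domain semantics; step 2 applies user-supplied rewrite rules to map that structural vocabulary onto a domain vocabulary.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def syntactic_transformation(table: str, rows: List[Dict[str, str]]) -> List[Triple]:
    """Step 1 (hypothetical): mirror the source structure as triples.

    Every row becomes a subject, every column a predicate. No domain
    semantics is assumed; the vocabulary only reflects the source layout.
    """
    triples: List[Triple] = []
    for index, row in enumerate(rows):
        subject = f"src:{table}/row{index}"
        triples.append((subject, "rdf:type", f"src:{table}"))
        for column, value in row.items():
            triples.append((subject, f"src:{table}#{column}", value))
    return triples


def refactor(triples: List[Triple], rules: Dict[str, str]) -> List[Triple]:
    """Step 2 (hypothetical): rule-based refactoring.

    `rules` maps structural terms to domain terms chosen by the user.
    Predicates are always rewritten; objects only when they denote a class.
    Terms without a matching rule are left unchanged.
    """
    refactored: List[Triple] = []
    for subject, predicate, obj in triples:
        new_object = rules.get(obj, obj) if predicate == "rdf:type" else obj
        refactored.append((subject, rules.get(predicate, predicate), new_object))
    return refactored


# Usage: a one-row "person" table and two user-defined rules.
rows = [{"name": "Alice", "city": "Bologna"}]
rules = {"src:person": "foaf:Person", "src:person#name": "foaf:name"}
result = refactor(syntactic_transformation("person", rows), rules)
```

Keeping the two steps separate means the first one can stay generic for a given source format, while all domain-specific decisions live in the rules supplied by the user.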

FrameNet LOD

• The contribution of this paper is twofold

– the production and publishing of a LOD dataset for the FrameNet lexical database, and

– the description of a method to produce knowledge patterns out of FrameNet frames

• For both contributions we use Semion

EKPs

• Presents a resource of Encyclopedic KPs that have been discovered by analyzing the Wikipedia page links dataset

• Describes the evaluation of the extracted EKPs with a user study

• Provides a bottom-up approach for extracting EKPs based on the Knowledge Architecture and the concept of Path
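As a rough illustration of a path-based, bottom-up analysis, the sketch below groups page links by the types of the linking and linked resources and counts how many resources of each subject type take part in each group. The slide does not define Path, so the definition used here (a pair of subject type and object type connected by a page link), the ratio that is computed, and all names and sample data are assumptions rather than the method of the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Path = Tuple[str, str]  # (subject type, object type), an assumed definition


def path_statistics(
    links: Iterable[Tuple[str, str]], types: Dict[str, str]
) -> Dict[Path, float]:
    """For each path, return the fraction of resources of the subject type
    that have at least one page link to a resource of the object type."""
    resources_per_type: Dict[str, Set[str]] = defaultdict(set)
    for resource, resource_type in types.items():
        resources_per_type[resource_type].add(resource)

    subjects_per_path: Dict[Path, Set[str]] = defaultdict(set)
    for source, target in links:
        # Links touching untyped resources cannot be assigned to a path.
        if source in types and target in types:
            subjects_per_path[(types[source], types[target])].add(source)

    return {
        path: len(subjects) / len(resources_per_type[path[0]])
        for path, subjects in subjects_per_path.items()
    }


# Usage with made-up data: two cities link to a country, one links to a person.
types = {"Rome": "City", "Milan": "City", "Italy": "Country", "Dante": "Person"}
links = [("Rome", "Italy"), ("Milan", "Italy"), ("Rome", "Dante")]
stats = path_statistics(links, types)
```

With this sample data, the path (City, Country) is followed by both cities and the path (City, Person) by one of the two, so a threshold on the ratio would keep the former as the more typical link for cities.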

Aemoo

• Is a Web application supporting exploratory search over the Semantic Web based on Encyclopedic KPs

• Aggregates knowledge from

– Linked Data

– Wikipedia

– Twitter

– Google News

• Provides an effective summary of knowledge about an entity, including explanations

• Aemoo participated in the Semantic Web Challenge 2011, was selected for the final round and finally ranked 4th

– http://challenge.semanticweb.org/

Relation between articles and RQs

RQ1 RQ2 RQ3 RQ4 RQ5

Semion 1-2 X X

FrameNet LOD X X

EKPs X X X X

Aemoo X

① How to recognize a possible source of Knowledge Patterns?

② How to mine meaning from data, e.g. Linked Data, to KPs?

③ What are the invariances, if any exist, for extracting KPs from data?

④ What could be a good automatic or semi-automatic method in order to extract KPs?

⑤ Why is the extraction of KPs useful?

Other Research Activities

• Interactive Knowledge Stack (IKS)

– IKS is an Integrating Project part-funded by the European Commission

– it will provide an open source technology platform for semantically enhanced CMS

• Apache Stanbol

– is a modular software stack and reusable set of components for semantic content management

– currently, there are more than 200 blogs that run WordLift, a plug-in for WordPress based on the refactor engine derived from Semion

Technical Reports in IKS

• A. Adamou, E. Blomqvist, C. E. Bonafede, P. Ciancarini, E. Daga, A. Musetti, A. G. Nuzzolese, V. Presutti, S. Germesin, M. Romanelli. Knowledge Representation and Reasoning System (KReS) - Beta Version Report. Technical report, IKS Consortium, 2010.

• A. Adamou, E. Blomqvist, C. E. Bonafede, E. Daga, A. G. Nuzzolese and V. Presutti. Knowledge Representation and Reasoning System (KReS) - Alpha Version Report. Technical report, IKS Consortium, 2010.

• A. Adamou, E. Blomqvist, A. Gangemi, A. G. Nuzzolese, V. Presutti, W. Behrendt, D. Violeta and A. Conconi. IKS deliverable 3.2: Ontological requirements for industrial CMS applications. Technical report, IKS Consortium, 2010.

• W. Kasper, J. Steffen, A. G. Nuzzolese and V. Presutti. IKS deliverable 3.3: Requirements for semantic lifting/wrapping components. Technical report, IKS Consortium, 2010.

Lecturer

• Invited lecturer at Jönköping University for the master course in Information Retrieval

• Tutor at UniBO for the master course in Knowledge Management and Data Mining

• Tutor at UniBO for the master course in Computer System Security

Attended courses

• Bertinoro International Spring School

– Information Integration, Prof. Maurizio Lenzerini, Università La Sapienza, Rome (Italy)

– Computational Aspects of Game Theory, Prof. Bruno Codenotti, Consiglio Nazionale delle Ricerche, Pisa (Italy)

– Model Checking: From Finite-state to Infinite-state Systems, Prof. Giorgio Delzanno, Università di Genova (Italy)

– Trust in Anonymity Networks, Prof. Vladimiro Sassone, University of Southampton (United Kingdom)

• Computational Ontologies, Dr. Valentina Presutti and Dr. Aldo Gangemi, CNR-ISTC, Rome (Italy)

• 8th European Summer School on Ontological Engineering and the Semantic Web, Cercedilla (Spain)

• Embedded Real Time Systems, Prof. Fabio Panzieri, Università di Bologna (Italy) and Prof. Tullio Vardanega, Università di Padova (Italy)

Earned credits

Course Credits

Model Checking: From Finite-state to Infinite-state Systems 1

Trust in Anonymity Networks 1

Computational Ontologies 1

Embedded Real Time Systems 1

SSSW 1.5

Information Integration 1

TOTAL 5.5 (6.5)

Thank you