lecture reference architecture_for_semantic_cms_part_i
TRANSCRIPT
Co-funded by the European Union
Semantic CMS Community
Designing Semantic CMS – Part I
Copyright IKS Consortium
Lecturer
Organization
Date of presentation
www.iks-project.eu
Part I: Foundations
(1) Introduction of Content Management
(2) Foundations of Semantic Web Technologies
Part II: Semantic Content Management
(3) Storing and Accessing Semantic Data
(4) Knowledge Interaction and Presentation
(5) Knowledge Representation and Reasoning
(6) Semantic Lifting
Part III: Methodologies
(7) Designing Interactive Ubiquitous IS
(8) Requirements Engineering for Semantic CMS
(9) Designing Semantic CMS
(10) Semantifying your CMS
What is this Lecture about?
We have seen ...
... how requirements for semantic content management are defined in a systematic way.
... a list of industry needs.
What is missing?
An efficient way to design an architecture for a semantic CMS that meets the defined requirements.
Part III: Methodologies
(7) Designing Interactive Ubiquitous IS
(8) Requirements Engineering for Semantic CMS
(9) Designing Semantic CMS (this lecture)
(10) Semantifying your CMS
How to design a semantic CMS?
Part 1: IKS Reference Architecture (Conceptual Reference Architecture)
What does the architecture of a semantic CMS look like?
Part 2: REST Architecture (Technical Architectural Style)
How can a semantic CMS be realized?
Towards Semantic Content Management
Content Management: Content
Semantic Content Management: Content + Knowledge (extract knowledge from content)
How to build a Semantic CMS?
Requirements from industry:
  Easy integration with existing CMS
  Reuse features of existing CMS
  Use RESTful interfaces
  Semantic features as optional components
Functional requirements:
  Automatic extraction of entities from text
  Automatic extraction of relations between entities
  Automatic categorization of content
  Automatic linking of content
  ...
Extend the traditional CMS architecture with the required semantic capabilities.
What are semantic CMS?
A Semantic CMS is a CMS with the capability of
interacting with semantic metadata (Presentation and Interaction Layer),
extracting semantic metadata (Semantic Lifting Layer),
managing semantic metadata (Knowledge Representation and Reasoning Layer),
and storing semantic metadata (Persistence Layer)
about content.
Traditional CMS Architecture for Content
Components: User Interface, Content Access, Content Management, Content Data Model, Content Repository, Content Administration
Layers: Presentation Layer, Business Logic Layer, Data Representation Layer, Persistence Layer
Reference Architecture for Semantic CMS
Components: Semantic User Interaction, Knowledge Access, Knowledge Extraction Pipelines, Reasoning, Knowledge Models, Knowledge Repository, Knowledge Administration
Layers: Presentation & Interaction Layer, Semantic Lifting Layer, Knowledge Representation and Reasoning Layer, Persistence Layer
Semantic User Interaction
Dealing with knowledge in a semantic CMS raises the need for an additional user interface level that allows interaction with content.
Example: "A user writes an article and the SCMS recognizes the brand of a car in that article. The SCMS includes a reference to an object representing that car manufacturer, not only the brand name. The user can interact with the car manufacturer object and see, e.g., the location of its headquarters."
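The example above can be sketched as a span of text that points to an entity object instead of a bare string. This is a minimal illustration, not the IKS implementation: the classes `Entity` and `Annotation`, the function `annotate`, and all data values (including the manufacturer name and headquarters) are hypothetical placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A knowledge object that text can refer to (hypothetical model)."""
    uri: str
    label: str
    properties: tuple = ()  # (name, value) pairs


@dataclass(frozen=True)
class Annotation:
    """Links the character span [start, end) of a text to an entity."""
    start: int
    end: int
    entity: Entity


def annotate(text: str, known_entities: list[Entity]) -> list[Annotation]:
    """Attach an entity reference to every occurrence of an entity label."""
    annotations = []
    for entity in known_entities:
        pos = text.find(entity.label)
        while pos != -1:
            annotations.append(Annotation(pos, pos + len(entity.label), entity))
            pos = text.find(entity.label, pos + 1)
    return sorted(annotations, key=lambda a: a.start)


# Placeholder data, not real facts about any manufacturer.
manufacturer = Entity(
    uri="http://example.org/entity/example-motors",
    label="ExampleMotors",
    properties=(("type", "CarManufacturer"), ("headquarters", "Example City")),
)

article = "The new ExampleMotors coupe was presented today."
annotations = annotate(article, [manufacturer])

# The user interface can follow the reference and show details of the object.
headquarters = dict(annotations[0].entity.properties)["headquarters"]
```

Because the annotation carries the whole entity, a user interface could offer the headquarters or any other property on demand without parsing the article text again.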
Knowledge Access
Access to inferred and extracted knowledge is encapsulated in a Knowledge Access layer.
It provides Semantic User Interaction with access to knowledge.
Knowledge Extraction Pipelines
The main challenge for a semantic CMS is the ability to extract knowledge, in terms of semantic metadata, from the stored content.
A separate layer for Knowledge Extraction Pipelines encapsulates the algorithms for semantic metadata extraction.
Typically, knowledge extraction is a multistage process [FL04] that applies different IE/IR algorithms.
Pipeline Processing - Example
Pipeline stages: Content Extraction, Pre-Processing, Entity Extraction, Relation Extraction
Example sentence: "John Miller has bought a Jaguar car this year."
Annotations: John Miller (Person), has bought (Relation), Jaguar (Car Manufacturer), this year (Time)
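The four pipeline stages can be sketched as functions that each enrich a shared document record. This is a toy illustration under stated assumptions: the gazetteer lookup and the "text between two entities" heuristic stand in for real IE/IR algorithms, and all function names are hypothetical.

```python
import re

# Toy gazetteer; a real system would use trained IE/IR components.
GAZETTEER = {
    "John Miller": "Person",
    "Jaguar": "CarManufacturer",
    "this year": "Time",
}


def extract_content(raw: str) -> dict:
    """Stage 1: strip markup and keep the plain text."""
    return {"text": re.sub(r"<[^>]+>", "", raw).strip()}


def preprocess(doc: dict) -> dict:
    """Stage 2: split the text into word tokens."""
    doc["tokens"] = re.findall(r"\w+", doc["text"])
    return doc


def extract_entities(doc: dict) -> dict:
    """Stage 3: look up known surface forms in the text."""
    found = []
    for surface, kind in GAZETTEER.items():
        pos = doc["text"].find(surface)
        if pos != -1:
            found.append({"text": surface, "type": kind, "start": pos})
    doc["entities"] = sorted(found, key=lambda e: e["start"])
    return doc


def extract_relations(doc: dict) -> dict:
    """Stage 4: relate a Person to a later CarManufacturer via the text between them."""
    doc["relations"] = []
    persons = [e for e in doc["entities"] if e["type"] == "Person"]
    makers = [e for e in doc["entities"] if e["type"] == "CarManufacturer"]
    for p in persons:
        for m in makers:
            if p["start"] < m["start"]:
                between = doc["text"][p["start"] + len(p["text"]):m["start"]]
                doc["relations"].append((p["text"], between.strip(), m["text"]))
    return doc


PIPELINE = [extract_content, preprocess, extract_entities, extract_relations]


def run_pipeline(raw: str) -> dict:
    """Feed the output of each stage into the next one."""
    doc = raw
    for stage in PIPELINE:
        doc = stage(doc)
    return doc


result = run_pipeline("<p>John Miller has bought a Jaguar car this year.</p>")
```

Keeping the stages in a plain list makes it easy to add, remove, or reorder extraction steps, which matches the idea of semantic features as optional components.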
Reasoning
After content has been lifted to a semantic level, the extracted information may be used as input for reasoning techniques in the Reasoning layer.
Logical reasoning is a well-known artificial intelligence technique that uses semantic relations to derive knowledge about the content that was not explicitly known before.
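A minimal sketch of such reasoning is forward chaining over extracted facts. The two rules below (transitive `locatedIn`, class membership inherited via `subClassOf`) and all facts are assumptions chosen for illustration; the slides do not prescribe a particular rule set or reasoner.

```python
def infer(triples: set) -> set:
    """Apply two simple rules until no new triples appear (forward chaining)."""
    known = set(triples)
    while True:
        new = set()
        for (s, p, o) in known:
            for (s2, p2, o2) in known:
                # Rule 1: locatedIn is transitive.
                if p == p2 == "locatedIn" and o == s2:
                    new.add((s, "locatedIn", o2))
                # Rule 2: an instance of a class is also an instance of its superclass.
                if p == "type" and p2 == "subClassOf" and o == s2:
                    new.add((s, "type", o2))
        if new <= known:
            return known
        known |= new


# Placeholder facts, as they might come out of an extraction pipeline.
facts = {
    ("ExampleMotors", "type", "CarManufacturer"),
    ("CarManufacturer", "subClassOf", "Organization"),
    ("ExampleMotors", "locatedIn", "ExampleCity"),
    ("ExampleCity", "locatedIn", "ExampleCountry"),
}

inferred = infer(facts) - facts
```

Here `inferred` holds only the statements that were not explicitly known before, for example that ExampleMotors is an Organization.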
Knowledge Models
Knowledge is expressed using knowledge (representation) models that define the semantic metadata.
Ontologies can be used to define semantic metadata by specifying so-called concepts and their semantic relations.
Knowledge Repository
Knowledge is stored in a Knowledge Repository that defines the fundamental data structure for knowledge.
State-of-the-art knowledge repositories implement a triple store, where a triple is formed by a subject, a predicate, and an object.
A triple can express any relation between a subject and an object.
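The triple idea can be sketched with a small in-memory store that supports pattern queries. This is an illustration only: the class `TripleStore` and its methods are hypothetical, and production repositories use dedicated triple stores with a query language such as SPARQL.

```python
class TripleStore:
    """A tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("John Miller", "type", "Person")
store.add("Jaguar", "type", "CarManufacturer")
store.add("John Miller", "bought", "Jaguar")

# Everything known about John Miller.
about_john = store.match(subject="John Miller")
# All car manufacturers in the store.
manufacturers = store.match(predicate="type", obj="CarManufacturer")
```

Because every statement has the same three-part shape, new kinds of relations can be stored without changing the data structure.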
Knowledge Administration
Knowledge Administration ranges from the management of Semantic User Interaction templates, Knowledge Extraction Pipelines, and Reasoning to the administration of Knowledge Models and Repositories.
Integration
Semantic User Interface
Content column: User Interface, Content Access, Content Management, Content Data Model, Content Repository, Content Administration
Knowledge column: Semantic User Interaction, Knowledge Access, Knowledge Extraction Pipelines, Reasoning, Knowledge Models, Knowledge Repository, Knowledge Administration
Implementation of the Reference Architecture
Reference implementation within the IKS project
IKS: An open source community to bring semantic technologies to CMS platforms
New incubating project at the Apache Software Foundation: http://incubator.apache.org/stanbol
Implementation of the Reference Architecture
One-year student project: Information-Driven Software Engineering
Extract knowledge from unstructured software specification documents
Case study: 10,000-page specification of the German Health Card system
Breathing Life into the Reference Architecture
Reference architecture:
  Semantic User Interface
  Content column: User Interface, Content Access, Content Management, Content Data Model, Content Repository, Content Administration
  Knowledge column: Semantic User Interaction, Knowledge Access, Knowledge Extraction Pipelines, Reasoning, Knowledge Models, Knowledge Repository, Knowledge Administration
Instantiation: Content Management, ID|SE Platform
Problem Statement
[Figure: software development phases Requirements Engineering, Analysis & Design, and Implementation & Test, marked with "?"]
Problem Statement
Documents and artifacts created in the software development process contain implicit information:
  Type of the document (e.g. requirements specification)
  Named entities (e.g. actor "User")
Relations between the different documents are not obvious:
  Thematically similar
  Duplicates
ID|SE Demo
http://idse.cs.upb.de:8082/opencms/opencms/idse
ID|SE-Platform – Architecture
<<OpenCMS>> Content-Management-System: Content-Management, Document-Content-Storage
ID|SE-Service-Platform: IE/IR-Service-Orchestrators, IE/IR-Services, Evaluation-Services, Meta-Data-Search, Meta-Data-Storage, Meta-Data-Model
Mapping with Reference Architecture
ID|SE-Platform: 1. Send Request to the ID|SE Platform
Components: <<OpenCMS>> Content Management System, <<OpenCMS-Module>> GUI, ID|SE-Service Platform, IE/IR-Service Orchestrators, DefaultMetaDataCreator, DefaultMetaDataCreatorWebservice
Interfaces: IDefaultMetaDataCreator, Webservice
ID|SE-Platform: 2. Providing Documents
Components: <<OpenCMS>> Content Management System, Content-Management, Document-Content-Storage, OpenCMSDocumentProviderProxy, ID|SE-Service Platform, IE/IR-Service Orchestrators, DefaultMetaDataCreator, <<component>> DocumentProvider
Interfaces: IProvideDocuments, Webservice
ID|SE-Platform: 3. Generation of Meta-Data
Components: IE/IR-Service Orchestrators, DefaultMetaDataCreator, IE/IR-Services, Evaluation Services, MetaDataStorage, MetaDataModel
IE/IR services: Content-Extraction, Pre-processors, Classifier, Clusterer, Named-Entity-Recognizer, Information-Aggregator
ID|SE-Platform: 4. Providing/Presenting Meta-Data
Components: <<OpenCMS>> Content Management System, <<OpenCMS-Module>> ArtifactSearchGUI, Meta-Data-Search, MetaDataSearchEngine, MetaDataSearchEngineWebservice, IE/IR-Services, MetaDataModel, MetaDataStorage
Interfaces: Webservice
ID|SE Features
Features: Clustering of artefacts, Classification of artefacts, Named entity recognition, Faceted search, Duplicate check
Benefits:
  Efficient way of browsing through content
  "Which artefacts are about 'XYZ'?"
  No redundancy in software specification documents
How can we evaluate our semantic features?
Evaluation Criteria
Recall
Precision
F-Measure
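The three criteria can be computed from the set of items a feature returned and the set of items that are actually correct. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the slides do not state which variant was used, and the function name and sample data are hypothetical.

```python
def precision_recall_f1(predicted: set, relevant: set) -> tuple:
    """Return (precision, recall, F1) for a set of predictions.

    precision = correct predictions / all predictions
    recall    = correct predictions / all relevant items
    F1        = harmonic mean of precision and recall
    """
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Placeholder data: four items returned by a feature, five items in the gold standard.
predicted = {"a", "b", "c", "d"}
relevant = {"a", "b", "c", "e", "f"}

precision, recall, f1 = precision_recall_f1(predicted, relevant)
```

In this example three of four predictions are correct (precision 0.75) and three of five relevant items are found (recall 0.6).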
Evaluation of Semantic Features
[Bar charts: F-Measure, Precision, and Recall per semantic feature]
Entity Recognition: F-Measure 80%, Precision 88%, Recall 74%
Classification: F-Measure 77%, Precision 84%, Recall 72%
Clustering: F-Measure 58%, Precision 64%, Recall 56%
Lessons Learned ...
Now you should know ...
... the architectural requirements for a semantic CMS.
... the integration concept of two loosely coupled columns.
... the components of the reference architecture.
... how the reference architecture model can be used to build a semantic CMS from scratch and how an existing system can be extended.