THE INFORMATION WORKBENCHLINKED DATA AND SEMANTIC WIKIS IN THE ENTERPRISE
Peter Haasefluid Operations AG
SMWCon Fall 2012 Cologne
fluid Operations (fluidOps)
Linked Data & Semantic Technologies Enterprise Cloud Computing
Software company founded Q1/2008 by team of serial entrepreneurs, privately held, VC funded
Headquarters in Walldorf / Germany, SAP Partner Port
Currently 45 employees
Named “Cool Vendor” by Gartner March 2010
Global reseller agreement with EMC focus large enterprise customers Apr 2010
NetApp Advantage Alliance Partner Oct 2010
The RDF Data Model
predicatesubject object
Who am I and Why am I Here?A Linked Data Perspective
presenterAt
Who am I and Why am I Here?A Linked Data Perspective
owl:sam
eAs
affiliation
affiliation develops
develops
student
affilia
tion
develops
develops
affiliation
foaf:friend
foaf:friend
foaf:friend
generalChair
extends
extends
spinoff
affiliation
Who am I and Why am I Here?A Linked Data Perspective
affilia
tion
develops
foaf:friend
foaf:friend
affiliation
president
supervises
hosts
supervises
Wikis, the Web, Data and SemanticsCo
llbor
ation
(on
the
Web
)
Structured Data
The Potential of Linked Data
Linked Data• Set of standards, principles for publishing, sharing
and interrelating structured knowledge• From data silos to a Web of Data• RDF as data model, SPARQL for querying• Ontologies to describe the semantics
Benefits of Linked Data in the Enterprise• Enterprise Data Integration: Semantically integrate and
interlink data scattered among different information systems
• Collaborative Knowledge Management and Analytics: Enable cross-organization analysis, interactive analytics, and reporting, resulting in better business decisions
• Simplified publishing and sharing of data: Increase openness and accessibility of Enterprise Data
• Enrichment and contextualization through interlinking: Value add by linking to Linked Open Data
Information WorkbenchLinked Data and Semantic Wikis in the Enterprise
• Supports the whole process of interacting with Linked Data
• Data integration• Visualization & exploration• Collaborative knowledge management
• Open standards and technologies• Semantic Wiki based frontend
(Using SMW Syntax) • Supporting W3C standards (OWL, RDF, SPARQL)• Community Edition (Open Source) + Enterprise Edition (Commercial)
• Platform for Linked Data Application Development• Base functionality to build applications without programming• SDK for easy extensions• Implementation in Java, very flexible AJAX frontend
10
Information Workbench - Linked Data Platform
Semantic Web Data
Intelligent Data Access and Analytics Flexible self-service UI Visualization, exploration,
dashboarding, and reporting Semantic search
Collaboration and knowledge management
Curation & authoring Collaborative workflows
Semantics- & Linked Data-based integration of private and public data sources based on data providers
Generic and specific providers for various data formats and sources
Supports established mapping frameworks (e.g. R2RML, SILK, …)
Named graphs for managing contexts and provenance
Linked Data Integration Approaches
Centralized Integration•Following a data warehousing approach
•Data providers periodically gather data from sources and lift it to semantic data formats
•Graph-based data format enables pay-as-you-go integration of legacy data sources
•Information Workbench comes with predefined providers for various formats and data sources (Spreadsheets, XML, …)
Virtualized Integration• Autonomous, distributed data sources
linked through a federation layer
• No central integration required
• Data sources can be added ad hoc, on demand
• Federation mediator for query processing (routing sub queries to relevant sources)
Centralized Store
Data Provider
Query
Federation Mediator
Query
Enabling Data Composition & Integration:Federation of Virtualized Data Sources
Application Layer
Virtualization Layer
Data Layer
Data Source Data Source Data Source Data Source
SPARQLEndpoint
SPARQLEndpoint
SPARQLEndpoint
SPARQLEndpoint Metadata
Registry
See also: FedX: Optimization Techniques for Federated Query Processing on Linked Data (ISWC2011)
Self-service Linked Data Frontend driven bySemantic Wiki + Rich Widgets• Ontology-driven template mechanism• Declarative specification of the UI
based on available pool of widgets and declarative wiki-based syntax
• Widgets have direct access to the DB• Ad hoc data exploration, visualization,
analytics, dashboards, ...
Wiki Page in Edit Mode … … and Displayed Result Page
Rich Pool of Available Widgets for Interacting with the Integrated Data
Analytics and ReportingVisualization and Exploration
Mashups with Social MediaAuthoring and Content Creation
All widgets can be integrated into the UIusing an intuitive, Wiki-style declarative syntax.
Widget-based Visualization and Query Construction
Example Templates
Example:Conference Explorer
17
• „Linked-Data-a-Thon“: build an application that makes use of conference metadata and contextualizes data with external data sources in two weeks
• Realized with the Information Workbench
Data Sources• Conference Metadata (Linked Data)• Public bibliographic meta data• Social Networks:
• Twitter• Facebook• LinkedIn
• LinkedGeoData
Features• Conference schedule, timelines,
hot topics• Statistics and reports• Background information about
authors and publications• Link to social network profiles and
statistics
http://conference-explorer.fluidops.net/
Some Notes on Relationship with SMW and WikiData
• “People are scared of Wiki markup”• Semantic links for creating structured data is not something that people use • Need for form-based approaches • Wiki editing at most for unstructured documentation
• “We need to support diversity”• WikiData: Statements that reify claims• Our approach: Named graphs• Actually: In enterprise settings, we try to fight diversity (aka inconsistency,
redundancies, mismatches -> c.f. Semantic Master Data Management)
Information WorkbenchEnterprise Application Areas
Knowledge Management in the Life Sciences
Digital Libraries, Media and Content Management
Intelligent Data Center Management
BBC Web Site – Powered by Semantic Technologies
Open Sport Ontology
Dynamic Semantic Publishing with the BBC
Information Workbench for DSP
• Collaborative authoring and linking of unstructured and structured semantic data
• Ontology and instance data management• DSP editorial workflows• Automation of content creation and
enrichment
Olympics 2012 requirements• A lot of output... Page per Athlete [10,000+], Page per country [200+],
Page per Discipline [400-500], Time coded, metadata annotated, on demand video, 58,000 hours of content
• Almost real time statistics and live event pages with too many web pages for too few journalists
Dynamic Semantic Publishing (DSP) architecture to automate content aggregation
Information Workbench DSP Architecture
Staging Database
LiveDatabase
Data Layer SPARQL/RDF HTTP
Journalist, Data Architect, ...
Web-Frontend(Browser)
Unpublished Data Published Data
Social Netw. Widgets
Collaboration Widgets
Navigation Widgets
Extensible Widget Pool Visualization
Widgets
Interlinking and
Integration
Information Extraction
andEnrichment
Querying and Search
CollaborationAuthoring Visualization Search and Analytics
Publishing Workflows
Data Management Modules Data
Access
User Roles and Editorial Workflow
• JournalistView Instance Data
• SubeditorEdit instance data
Draft Approved
Rejected
PublishedApprove Publish
Reject
Edit
• Media ManagerEdit instance dataApprove/reject instance data edits
• Data ArchitectEdit instance data and ontology data editsPublish instance data
Demo Dynamic Semantic Publishing
Enterprise Clouds Vision
All resources of an adaptive, cloud-enabled IT environment can be set up, monitored, and maintained from a single, unified, and intuitive management console:
Internal and external IT resources accessible across stack without vendor lock-in High degree of automation and IT provisioning at click of button on the level of enterprise
landscapes Internal portal of private/public IT services with e.g. pay-as-you-go cost models
Intelligent Data Center Management
Problem Administration silos: compute infrastructure,
storage, application, … Business data not interlinked with technical
data CXOs struggle to have an integrated view on
the resources employed in the data center
Solution Semantic, resource-centric view on data: link
business data with data center resources and interrogate heterogeneous resources in a unified way
User-defined dashboards, queries, historical data management for analytics and reporting purposes
Integrated View On The Data Center
Integration of different software and hardware components, storage systems, compute infrastructures, applications, CRM systems, ticket systems, project catalogs
Automatic correlation of data retrieved from various systems
Unified view on data and metadata across the border of company units
Exploration, analysis, and actions based on the entire data corpus
Data Center Management
29
• Support collaborative operations management in the data center• Link business data to technical data• Technical Documentation• Analytics and Reporting• Performance and Capacity Monitoring• Responsibility Management• Resource Management• Change Management• Technical Ticketing System
Link Business Data To Data Center Resources E.g.: link your customers to their Virtual Landscapes using semantic annotations; visually
explore the connections between the business information and the data center resources on-demand
Use Case: Root Cause Analysis and Error Handling Identify which customer‘s SAP systems and system landscapes are affected when an error on the
storage level occursDetermine where errors on the application level are coming fromRelate events to each other Document and compare solutions for events allows fast reaction for error handling and ensures SLA enforcement
Semantic Link in Wiki Page Visual Data Exploration
Share Knowledge Within Your Company
Collaborative Acquisition and Augmentation of Knowledge with Semantic Wiki Technology Use Case: Technical Documentation and Responsibility Management
Use Wiki to collaboratively maintain technical documentation and best practicesCategorize documentation Interlink hardware resources with documentation in a central placeAssign responsibilities directly to technical resource
Wiki Page in Edit Mode … … and Displayed Result Page
Collaboration and Workflows
At data management level – collaborative creation of structured and unstructured documentation directly through the Wiki page of each resource (e.g. using edit-form widgets embedded in the wiki)
At process/workflow level – formalize and execute workflows in an automated way
Ontology-based edit form for adding missing information
Example workflow for a state-based ticketing system
Ticketing Support
Tickets are directly linked to the personal profile page
Integrated tickets and statistics built on top
Analytics and Reporting
Embed dynamic, user-defined charts directly into Semantic Wiki pages E.g.: create tabular summaries of your data and existing connections; specify user-
defined charts and dashboards; generate reports based on historical data
Use case: Performance Monitoring and Capacity Planning Monitor the performance & usage of your infrastructure over time using historical data Forecast when new infrastructure resources need to be ordered Analysis of the impact of new hardware options on utilization rates
Use case: Cost and Demand Forecasting Keep track of infrastructure costs for each customer / project Determine what infrastructure resources will be needed when and for which project Compare various infrastructure options in terms of cost
Example: Employed VMs over Time grouped by Power Status
Data Integration with the Information Workbench
Phase 1 – Integration: Data integration in a central
repository via data providers Lift existing data sources to RDF RDF data integration into a central
repository Data alignment using a global
ontology
Phase 2 – Logical Mapping: Bring together entities from
different sources Generate a logical view and map
data items Identify and align equal data across different
data sources A logical mapping layer derives IDs spanning
different data sources
Benefits of the Semantic Master Data Management in the Data Center Domain
Seamless integration creates transparency
Improve data quality Reuse data Discover redundancies and
inconsistencies
Ad hoc analysis Reduce time and
effort for search, query and report generation
Conclusion
Semantic technologies offer great potential to overcome the challenges of today’s enterprise data management challenges
(Semantic) Wikis to support collaboration in the enterprise: knowledge management, publishing, collaborative operations
Semantic Wiki + Rich Widgets for self-service Linked Data frontends Plenty of application areas
Get started with the Information Workbench:http://www.fluidops.com/information-workbench/
CONTACT:fluid OperationsAltrottstr. 31Walldorf, Germany
Email: [email protected]: www.fluidops.comTel.: +49 6227 3846-527