BBC Case Study Agenda · 2014-10-24


Post on 27-Jun-2020


© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. SLIDE: 1

BBC Case Study Agenda
• A brief introduction to MarkLogic
• The BBC Dynamic Semantic Publishing model
• DSP Technology Components
• DSP and the Olympics
• The relevancy of triples, ontologies and open linked data for everyone else
• Embracing the potential


BRIEF INTRODUCTION TO MARKLOGIC


Data Drives the Need for a New Generation Database

Hierarchical Era: "For your application data!"
• Application- and hardware-specific

Relational Era: "For all your structured data!"
• Normalized, tabular model
• Application-independent query
• User control

Any Structure Era: "For all your data!"
• Schema-agnostic
• Massive scale
• Query and search
• Analytics
• Heterogeneous data
• Faster time-to-results


Harnessing Data & Reimagining Applications

• Reduce Risk
• Better Manage Compliance
• Create New Value from Data
• Optimize Operations
• Lower TCO / Better IT Economics
• Better Decision-making


The Only Enterprise NoSQL Database
• Document Store – Search & Query
• Triple Store – Semantic Discovery
• ACID Transactions
• High Availability / Disaster Recovery
• Replication
• Government-grade Security
• Scalability & Elasticity
• On-premise or Cloud Deployment
• Hadoop for Storage & Compute
• Powerful Indexing Paradigm
• Drives Alerting, Geospatial, Analytics and more…

[Diagram labels: Search, Database, Application Services]


THE BBC DSP MODEL


BBC’s Aspiration

"Our aspiration was that just as the Coronation did for TV in 1953, the Olympics would do for digital in 2012"

Phil Fearnley – General Manager, Future Media – News & Knowledge


Some Numbers Behind the Success
• A daily record of 9.5M global browsers
• A daily record of 7.1M UK browsers
• 55M global browsers across the games
• 37M UK browsers across the games
• 106M requests for BBC Olympic video content
• bbc.co.uk traffic for Olympics greater over 24 hours than ALL of World Cup 2010
• 2.8 Petabytes of data in the busiest DAY
• 700 Gbits per second during Bradley Wiggins’ TT Gold
• 9.2M UK browsers from mobile devices
• 34% of all daily browsers from phones
• 2.3M UK browsers from tablets
• 12M requests for mobile video


Introducing Dynamic Semantic Publishing (DSP)

• Uses linked data technology to automate…
    • Aggregation
    • Publishing
    • Re-purposing
  …of interrelated content objects
• Driven by an ontological domain-modeled information architecture


DSP – Key Points
• DSP enables the automated publication of metadata and content-state driven web pages
• Each web page automatically aggregates and renders links to relevant stories and assets
• Minimal journalist involvement: a small number of journalists can author and surface the content with as light a touch as possible
• Underpinned by Ontologies, a Triple Store and a Content Store


DSP TECHNOLOGY COMPONENTS


The Editorial Tool (The Enabler)
• The tagging ontology is kept deliberately simple to protect the journalist from the complexities of the underlying domain model
• A simple set of asset/domain joining predicates, such as "about" and "mentions", drives the annotation tool UI and workflow
• The journalist applies suggested annotations and can also search for triple store-indexed concepts
• All ontology concepts are linked to linked open data (LOD) identifiers (DBPedia, GeoNames etc.)
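
As a rough illustration of the annotation model described above, the sketch below records journalist annotations as (asset, predicate, concept) triples restricted to the "about" and "mentions" predicates. The asset IDs, the `Annotation` type and the `annotate` / `assets_for` helpers are hypothetical names chosen for this example; they are not taken from the BBC tooling.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    asset: str      # hypothetical content-store asset ID
    predicate: str  # "about" or "mentions"
    concept: str    # ontology concept; in practice this would be a LOD identifier


# The deliberately small tagging vocabulary exposed to the journalist.
ALLOWED_PREDICATES = {"about", "mentions"}


def annotate(store: set, asset: str, predicate: str, concept: str) -> None:
    """Add one annotation, rejecting predicates outside the tagging vocabulary."""
    if predicate not in ALLOWED_PREDICATES:
        raise ValueError(f"unsupported predicate: {predicate}")
    store.add(Annotation(asset, predicate, concept))


def assets_for(store: set, concept: str, predicate: str = "about") -> list:
    """Return the sorted asset IDs joined to a concept by the given predicate."""
    return sorted(
        a.asset for a in store
        if a.concept == concept and a.predicate == predicate
    )


store = set()
annotate(store, "story-1", "about", "Frank Lampard")
annotate(store, "story-2", "mentions", "Frank Lampard")
annotate(store, "story-3", "about", "Frank Lampard")
print(assets_for(store, "Frank Lampard"))
```

Keeping the predicate set this small is what lets the annotation UI stay simple while the richer domain model lives elsewhere.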


The Ontology Model (Framework)

The Sport ontology and meta-model that power these automated, annotation-powered aggregations have now been published and can be re-used under a Creative Commons attribution licence.


The Triple Store (Populating the Framework)
• Maps the assets to the Ontology model
• Relies on RDF (Resource Description Framework):
    • Making statements about concepts/resources in the form of subject-predicate-object expressions (triples)
• RDF semantics improve navigation, content re-use and journalist-determined levels of automation ("edited by exception")


Semantics: Organizing and Connecting Data for Meaning

Data is stored in Triples, expressed as: Subject : Predicate : Object
    John Smith : livesIn : London
    London : isIn : England

Rules tell us something about the triples:
    If (A livesIn X) AND (X isIn Y) then (A livesIn Y)

[Diagram: "John Smith" -livesIn-> "London" -isIn-> "England", with an inferred livesIn link from "John Smith" to "England"]
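
The rule above can be sketched as a small forward-chaining loop over a set of triples. This is only an illustrative toy, assuming triples are plain Python tuples; the function name `infer_lives_in` is made up for the example, and a real triple store would apply such rules through its own inference engine.

```python
def infer_lives_in(triples):
    """Apply: if (A livesIn X) and (X isIn Y) then (A livesIn Y).

    The rule is re-applied until no new facts appear, so chains of
    isIn statements are followed all the way up.
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current facts so the set can be extended safely below.
        lives = [(s, o) for s, p, o in facts if p == "livesIn"]
        is_in = [(s, o) for s, p, o in facts if p == "isIn"]
        for person, place in lives:
            for inner, outer in is_in:
                new_fact = (person, "livesIn", outer)
                if inner == place and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


triples = {
    ("John Smith", "livesIn", "London"),
    ("London", "isIn", "England"),
}
inferred = infer_lives_in(triples) - triples
print(inferred)
```

Only the inferred fact (John Smith livesIn England) is printed, because the asserted triples are subtracted from the result.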


The Content Store (The Delivery Engine)

• Ingests all the assets intended for consumption
• Stores those assets (in multiple schemas) in a single place
• Dynamically delivers all assets to all the web pages, systems and devices that need them… precisely when they need them


Making things work…

[Diagram: Ontology (O), Content Store (CS) and Triple Store (TS)]


Making things presentable…


DSP AND THE OLYMPICS


The World Cup 2010 – The First Step

• Featured 700-plus team, group and player pages
• That number of indices is impossible with a static publishing model
• Every page orchestrated by automated annotation-powered aggregations
• The DSP architectural approach enabled the BBC to support much greater breadth and scale


The World Cup Domain Model
• The domain model included concepts and relationships such as:
    • time and location
    • events and competitions
    • groups
    • stages and rounds
    • matches
    • teams, squads and players
    • players within squads
    • teams playing in groups
    • groups within stages
    • …etc


BBC Content…

• A journalist selects and applies the single concept "Frank Lampard"
• The Triple Store both applies and infers links to other ‘concepts’ such as…
    • "England Squad"
    • "Group C" and
    • "FIFA World Cup 2010"
  … as generated triples within the triple store
• The semantics of the ontologies, the factual data, and the content metadata are all taken into account during each query evaluation.
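
To make the "tag once, link to many" idea concrete, here is a minimal sketch that walks outgoing links from a single tagged concept to collect the related concepts named above. The predicate names (`memberOf`, `playsIn`, `partOf`) and the `related_concepts` helper are assumptions made for this example; the slides do not give the actual predicates, and a real triple store would derive these links from ontology semantics rather than a plain graph walk.

```python
from collections import deque

# Hypothetical facts; the predicate names are illustrative only.
FACTS = [
    ("Frank Lampard", "memberOf", "England Squad"),
    ("England Squad", "playsIn", "Group C"),
    ("Group C", "partOf", "FIFA World Cup 2010"),
]


def related_concepts(facts, start):
    """Breadth-first walk along outgoing links, in discovery order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for subject, _predicate, obj in facts:
            if subject == current and obj not in seen:
                seen.add(obj)
                order.append(obj)
                queue.append(obj)
    return order


print(related_concepts(FACTS, "Frank Lampard"))
```

A story tagged only with "Frank Lampard" could then be surfaced on the squad, group and tournament pages without further journalist input.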


Facts into Actions: http://www.bbc.com/sport/

[Diagram: facts linking Lampard, Chelsea, Southampton v Chelsea and Premier League through "plays for" and "plays in" drive the actions below]
• Include a story about Lampard
• Include in the list of recent matches
• Include in the League Table


External Content Feeds…
• Automated XML sports stats feeds from various sources are delivered and processed by the BBC
• Feeds are now also transformed into an RDF representation
• The transformation process maps feed-supplier IDs onto corresponding ontology concepts
• Sports stats for Matches, Teams and Players are aggregated inline and served dynamically from the content store, orchestrated by the triple store
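
The feed transformation described above might look roughly like the following sketch, which maps supplier IDs in an XML feed onto concept URIs and emits N-Triples lines. Everything specific here is an assumption for illustration: the feed layout, the `example.org` URIs, the `goalsScored` predicate and the `feed_to_ntriples` function are invented, and real supplier feeds and BBC identifiers will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed and ID mapping, for illustration only.
FEED_XML = """
<stats>
  <player id="sup-123" goals="3"/>
  <player id="sup-999" goals="1"/>
</stats>
"""
SUPPLIER_TO_CONCEPT = {
    "sup-123": "http://example.org/concept/frank-lampard",
}
GOALS_PREDICATE = "http://example.org/ontology/goalsScored"


def feed_to_ntriples(xml_text, id_map):
    """Map known supplier IDs to concept URIs and emit one N-Triples line each."""
    lines = []
    for player in ET.fromstring(xml_text).iter("player"):
        concept = id_map.get(player.get("id"))
        if concept is None:
            continue  # unmapped supplier IDs are skipped in this sketch
        goals = player.get("goals")
        lines.append(f'<{concept}> <{GOALS_PREDICATE}> "{goals}" .')
    return lines


for line in feed_to_ntriples(FEED_XML, SUPPLIER_TO_CONCEPT):
    print(line)
```

Skipping unmapped IDs is a simplification; a production pipeline would more likely log or queue them for reconciliation.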


The Olympics – DSP at Scale

“The Content Store which currently powers all of the statistics and navigation on the sports site has been scaled to handle ingesting many thousands of content objects per second…

“…whilst concurrently supporting many millions of dynamic page renditions and impressions a day…

“…This high performance content store will allow the BBC Sports site to ingest and render sport statistics including live football scores, live football tables, live Olympics event statistics and results in near real-time whilst rendering this content dynamically using the DSP approach.”


The BBC’s Conclusion?

"The demand and astonishing feedback we've seen from audiences accessing our Olympics content online, whenever they want, on the devices they choose, has exceeded our expectations and helped fulfill this aspiration… a truly digital games"

Phil Fearnley – General Manager, Future Media – News & Knowledge


THE RELEVANCY OF TRIPLES, ONTOLOGIES AND OPEN LINKED DATA FOR EVERYONE ELSE…


Key Themes in Play
• Complex Ecosystem
    • Huge event, huge variety of known actors, venues, events, dependencies, outcomes
• Looking for an accurate, complete representation of that Ecosystem
    • Rich, complete view, broad and deep discovery possibilities, insight!
• Moving from ‘static’ to ‘dynamic’
    • Low economies of scale to high economies of scale
    • ‘Walled Garden’ to ‘Expansive’


Context from the World at Large

Linked Open Data
• Facts that are freely available
• In a form that’s easily consumed

Examples:
• DBPedia
• GeoNames

Machine readable knowledge!

[Figure: “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”]


The Digital Supply Chain Core

[Diagram: three parallel chains, each Ingest, Aggregate, Manipulate, Deliver; labels: NoSQL, RDF]

Incompatibility:
• Formats
• Schema
• Definitions
• Vocabulary
• Meaning


NoSQL & RDF for a New World of Data
• NoSQL
    • Relax schema constraints to enable data efficiency and a ‘single view’
    • Single Store or Metadata Layer
    • Banish silos
• RDF
    • Maximize discovery and insight
    • Associate and Link ALL relevant things together
    • With rules that govern the degree of flex for relevancy
    • Breadth & completeness within a restricted environment
    • Complete freedom and serendipity wherever appropriate


Horizontal Relevancy & Application
• Patient 360
• Single view of Citizen
• Know Your Customer
• Complete view of Assets
• Fraud Detection


Horizontal Relevancy & Application
• What assets and information are relevant for this athlete?
• Which drugs are compatible for treatment of this condition?
• Which products will work together / might you wish to purchase together?
• Which financial products are compatible with this investor’s risk profile?
• What advice is compatible with this patient’s lifestyle and health history?