bbc case study agenda - wild apricot › resources › documents... · 2014-10-24 · a brief...

Post on 27-Jun-2020

TRANSCRIPT

© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. SLIDE: 1

BBC Case Study Agenda

• A brief introduction to MarkLogic
• The BBC Dynamic Semantic Publishing model
• DSP Technology Components
• DSP and the Olympics
• The relevancy of triples, ontologies and open linked data for everyone else
• Embracing the potential

BRIEF INTRODUCTION TO MARKLOGIC

Data Drives the Need for a New Generation Database

Hierarchical Era
“For your application data!”
• Application- and hardware-specific

Relational Era
“For all your structured data!”
• Normalized, tabular model
• Application-independent query
• User control

Any Structure Era
“For all your data!”
• Schema-agnostic
• Massive scale
• Query and search
• Analytics
• Heterogeneous data
• Faster time-to-results

Harnessing Data & Reimagining Applications

• Reduce Risk
• Better Manage Compliance
• Create New Value from Data
• Optimize Operations
• Lower TCO / Better IT Economics
• Better Decision-making

The Only Enterprise NoSQL Database

• Document Store – Search & Query
• Triple Store – Semantic Discovery
• ACID Transactions
• High Availability / Disaster Recovery
• Replication
• Government-grade Security
• Scalability & Elasticity
• On-premise or Cloud Deployment
• Hadoop for Storage & Compute
• Powerful Indexing Paradigm
• Drives Alerting, Geospatial, Analytics and more…

[Diagram labels: Search, Database, Application Services]

THE BBC DSP MODEL

BBC’s Aspiration

"Our aspiration was that just as the Coronation did for TV in 1953, the Olympics would do for digital in 2012"

Phil Fearnley – General Manager. Future Media – News & Knowledge

Some Numbers Behind the Success

• A daily record of 9.5M global browsers
• A daily record of 7.1M UK browsers
• 55M global browsers across the Games
• 37M UK browsers across the Games
• 106M requests for BBC Olympic video content
• bbc.co.uk traffic for the Olympics greater over 24 hours than ALL of World Cup 2010
• 2.8 petabytes of data in the busiest DAY
• 700 Gbits per second during Bradley Wiggins’ TT Gold
• 9.2M UK browsers from mobile devices
• 34% of all daily browsers from phones
• 2.3M UK browsers from tablets
• 12M requests for mobile video

Introducing Dynamic Semantic Publishing (DSP)

• Uses linked data technology to automate…
  • Aggregation
  • Publishing
  • Re-purposing
  …of interrelated content objects
• Driven by an ontological domain-modeled information architecture

DSP – Key Points

• DSP enables the automated publication of metadata and content-state driven web pages
• Each web page automatically aggregates and renders links to relevant stories and assets
• Minimal journalist involvement: a small number of journalists can author and surface the content with as light a touch as possible
• Underpinned by Ontologies, a Triple Store and a Content Store

DSP TECHNOLOGY COMPONENTS

The Editorial Tool (The Enabler)

• The tagging ontology is kept deliberately simple to protect the journalist from the complexities of the underlying domain model
• A simple set of asset/domain joining predicates, such as "about" and "mentions", drives the annotation tool UI and workflow
• The journalist applies suggested annotations as well as searching for triple store-indexed concepts
• All ontology concepts are linked to linked open data (LOD) identifiers (DBPedia, GeoNames etc.)
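The tagging step described above can be pictured as producing one triple per annotation. The following is a minimal sketch, not the BBC's implementation: the `Concept` class, the `annotate` helper and the asset ID `story-123` are hypothetical names introduced for illustration, and the DBpedia URI simply stands in for whichever LOD identifier a concept is linked to.

```python
from dataclasses import dataclass

# The two joining predicates named on the slide.
ALLOWED_PREDICATES = {"about", "mentions"}


@dataclass(frozen=True)
class Concept:
    """A taggable ontology concept (hypothetical structure)."""
    label: str
    lod_uri: str  # linked open data identifier for the concept


def annotate(asset_id, predicate, concept):
    """Return an (asset, predicate, concept) triple for one annotation."""
    if predicate not in ALLOWED_PREDICATES:
        raise ValueError(f"unsupported predicate: {predicate}")
    return (asset_id, predicate, concept.lod_uri)


# Illustrative data only: the asset ID is made up.
london = Concept("London", "http://dbpedia.org/resource/London")
triple = annotate("story-123", "about", london)
```

Restricting the predicate set in this way mirrors the slide's point that the journalist only ever sees a small, simple vocabulary.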

The Ontology Model (Framework)

The Sport ontology and meta model, which power these automated, annotation-powered aggregations, have now been published and can be re-used under a Creative Commons attribution licence.

The Triple Store (Populating the Framework)

• Maps the assets to the Ontology model
• Relies on RDF (Resource Description Framework):
  • Making statements about concepts/resources in the form of subject-predicate-object expressions (triples)
• RDF semantics improve navigation, content re-use and journalist-determined levels of automation ("edited by exception")

Semantics: Organizing and Connecting Data for Meaning

Data is stored in Triples, expressed as: Subject : Predicate : Object

• John Smith : livesIn : London
• London : isIn : England

Rules tell us something about the triples:

If (A livesIn X) AND (X isIn Y) then (A livesIn Y)

[Diagram: "John Smith" livesIn "London"; "London" isIn "England"; inferred: "John Smith" livesIn "England"]
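The rule on this slide can be tried out in a few lines. This is a minimal sketch that stores the slide's two facts as plain tuples and applies the one rule until nothing new is inferred; the function name `apply_lives_in_rule` is an assumption for illustration, and a real triple store would use its own rule engine instead.

```python
# Facts from the slide, stored as (subject, predicate, object) tuples.
facts = {
    ("John Smith", "livesIn", "London"),
    ("London", "isIn", "England"),
}


def apply_lives_in_rule(triples):
    """Apply: if (A livesIn X) and (X isIn Y) then (A livesIn Y).

    Repeats until no new triple is produced (a fixed point).
    """
    inferred = set(triples)
    while True:
        new = {
            (a, "livesIn", y)
            for (a, p1, x) in inferred if p1 == "livesIn"
            for (x2, p2, y) in inferred if p2 == "isIn" and x2 == x
        }
        if new <= inferred:
            return inferred
        inferred |= new


closure = apply_lives_in_rule(facts)
```

After the call, `closure` holds the two original facts plus the inferred triple that John Smith lives in England.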

The Content Store (The Delivery Engine)

• Ingests all the assets intended for consumption
• Stores those assets (in multiple schemas) in a single place
• Dynamically delivers all assets to all the web pages, systems and devices that need them… precisely when they need them

Making things work…

[Diagram: O (Ontology), CS (Content Store), TS (Triple Store)]

Making things presentable…

DSP AND THE OLYMPICS

The World Cup 2010 – The First Step

• Featured 700-plus team, group and player pages
  • That number of indices impossible with a static publishing model
• Every page orchestrated by automated annotation-powered aggregations
• The DSP architectural approach enabled the BBC to support much greater breadth and scale

The World Cup Domain Model

• The domain model included concepts and relationships such as:
  • time and location
  • events and competitions
  • groups
  • stages and rounds
  • matches
  • teams, squads and players
  • players within squads
  • teams playing in groups
  • groups within stages
  • …etc

BBC Content…

• A journalist selects and applies the single concept "Frank Lampard"
• The Triple Store both applies and infers links to other ‘concepts’ such as…
  • "England Squad"
  • "Group C" and
  • "FIFA World Cup 2010"
  … as generated triples within the triple store
• The semantics of the ontologies, the factual data, and the content metadata are all taken into account during each query evaluation.
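One way to picture how a single tag fans out to other concepts is a walk over the domain model's links. The sketch below is an illustration under stated assumptions, not the BBC's query logic: the `domain` mapping flattens the ontology into simple "belongs to" links, and `related_concepts` is a hypothetical helper name.

```python
from collections import deque

# Hypothetical, flattened view of the domain model: each concept
# points at the concepts it belongs to (player -> squad -> group -> event).
domain = {
    "Frank Lampard": ["England Squad"],
    "England Squad": ["Group C"],
    "Group C": ["FIFA World Cup 2010"],
}


def related_concepts(tag, links):
    """Breadth-first walk returning every concept reachable from the tag."""
    found = []
    queue = deque([tag])
    while queue:
        node = queue.popleft()
        for nxt in links.get(node, []):
            if nxt != tag and nxt not in found:
                found.append(nxt)
                queue.append(nxt)
    return found


# A story tagged once can then be surfaced on each related page.
pages = related_concepts("Frank Lampard", domain)
```

Here `pages` lists the squad, group and tournament concepts, so the journalist's single tag is enough to place the story on all three pages.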

Facts into Actions:

[Diagram: nodes "Lampard", "Chelsea", "Southampton v Chelsea" and "Premier League", joined by "plays for" and "plays in" edges; resulting page actions: "Include a story about Lampard", "Include in the list of recent matches", "Include in the League Table"]

http://www.bbc.com/sport/

External Content Feeds…

• Automated XML sports stats feeds from various sources are delivered and processed by the BBC
• Feeds are now also transformed into an RDF representation
• The transformation process maps feed-supplier IDs onto corresponding ontology concepts
• Sports stats for Matches, Teams and Players are aggregated inline and served dynamically from the content store, orchestrated by the triple store
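The feed transformation can be sketched as: parse the supplier's XML, look up each supplier ID in a mapping table, and emit triples that use the ontology's identifiers. Everything specific below is an assumption for illustration: the feed layout, the `sup-…` IDs, the `example.org` URIs and the `hasParticipant` predicate are invented, since the slides do not show the real feed or ontology terms.

```python
import xml.etree.ElementTree as ET

# Hypothetical supplier feed; the real feed format is not shown in the slides.
FEED = """<match id="m-1001">
  <team id="sup-17"/>
  <team id="sup-42"/>
</match>"""

BASE = "http://example.org/sport/"  # placeholder namespace

# Mapping from feed-supplier IDs onto ontology concept URIs (illustrative).
SUPPLIER_TO_CONCEPT = {
    "sup-17": BASE + "team/a",
    "sup-42": BASE + "team/b",
}


def feed_to_triples(xml_text, id_map):
    """Turn one match element into (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    match_uri = BASE + "match/" + root.get("id")
    triples = []
    for team in root.findall("team"):
        concept = id_map[team.get("id")]  # supplier ID -> ontology concept
        triples.append((match_uri, BASE + "hasParticipant", concept))
    return triples


triples = feed_to_triples(FEED, SUPPLIER_TO_CONCEPT)
```

Keeping the ID mapping in its own table means a new feed supplier only needs a new mapping, not a change to the ontology.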

The Olympics – DSP at Scale

“The Content Store which currently powers all of the statistics and navigation on the sports site has been scaled to handle ingesting many thousands of content objects per second…
“…whilst concurrently supporting many millions of dynamic page renditions and impressions a day…
“…This high performance content store will allow the BBC Sports site to ingest and render sport statistics including live football scores, live football tables, live Olympics event statistics and results in near real-time whilst rendering this content dynamically using the DSP approach.”

The BBC’s Conclusion?

"The demand and astonishing feedback we've seen from audiences accessing our Olympics content online, whenever they want, on the devices they choose, has exceeded our expectations and helped fulfill this aspiration… a truly digital games"

Phil Fearnley – General Manager. Future Media – News & Knowledge

THE RELEVANCY OF TRIPLES, ONTOLOGIES AND OPEN LINKED DATA FOR EVERYONE ELSE…

Key Themes in Play

• Complex Ecosystem
  • Huge event, huge variety of known actors, venues, events, dependencies, outcomes
• Looking for an accurate, complete representation of that Ecosystem
  • Rich, complete view, broad and deep discovery possibilities, insight!
• Moving from ‘static’ to ‘dynamic’
  • Low economies of scale to high economies of scale
  • ‘Walled Garden’ to ‘Expansive’

Context from the World at Large

“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”

Linked Open Data

• Facts that are freely available
• In a form that’s easily consumed

Examples:

• DBPedia
• GeoNames

Machine readable knowledge!

The Digital Supply Chain Core

[Diagram: three parallel supply chains, each running Ingest, Aggregate, Manipulate, Deliver, with warning markers between them; labels: NoSQL, RDF]

Incompatibility:

• Formats
• Schema
• Definitions
• Vocabulary
• Meaning

NoSQL & RDF for a New World of Data

• NoSQL
  • Relax schema constraints to enable data efficiency and a ‘single view’
  • Single Store or Metadata Layer
  • Banish silos
• RDF
  • Maximize discovery and insight
  • Associate and Link ALL relevant things together
  • With rules that govern the degree of flex for relevancy
    • Breadth & completeness within a restricted environment
    • Complete freedom and serendipity wherever appropriate

Horizontal Relevancy & Application

• Patient 360
• Single view of Citizen
• Know Your Customer
• Complete view of Assets
• Fraud Detection

Horizontal Relevancy & Application

• What assets and information are relevant for this athlete?
• Which drugs are compatible for treatment of this condition?
• Which products will work together / might you wish to purchase together?
• Which financial products are compatible with this investor’s risk profile?
• What advice is compatible with this patient’s lifestyle and health history?