Leveraging Big Data Opportunities for Growth
Mr Krishna Tewari spoke at the Publishers' Forum 2013 in Berlin about Big Data...
Krishna Tewari
Global Head, Digital Publishing & Retail Solutions
Datamatics Global Services Ltd
Leveraging Big Data Opportunities for Growth
Agenda
1. Challenges for publishers
2. Big Data in publishing industry
3. The technology landscape
4. Use cases for publishing
5. Planning for Big Data
This is ‘The Library of Alexandria’. Here the Egyptians once collected and managed every scroll of information then available in the world.
The classical content
Artist: O. Von Corven, Source: Wikipedia
Newsroom www.telegraph.co.uk
The Publisher’s tilt
The challenge is to bridge the chasm ahead…
Data & content in the publishing world
Structured | Semi-structured | Unstructured
Content
Databases XML Files PDFs Headers Metadata
Image Banks Application Files Adverts Feeds
Info Graphics Audio Video Content sharing Ratings
Readers / Content Consumers
Subscriptions Customer Information CRM Data
Purchase History Demographics Service Logs
Reading Modes, Interest Areas, Buying Patterns, Searches, Emails, Spend Analysis
Likes, Tweets, Shares, Ratings, Reading, Chats
Sales Channels
Geo Spread Publication type Performance
Geographical Performance
Campaign Data Discounts Bundled offers Geo preferences Channel data
Hit counts Events Surveys Marketing copies Test runs
Authors/ Data providers
Author Databases
Contracts Permissions Rights
Market performances Subject expertise Qualifications Affiliations Emails, Payments
Tweets Shares Peer Reviews
80% of the data existing in any enterprise today is unstructured
What Constitutes Big Data?
Big Data Integration
Big Transaction Data
Big Interaction Data
Big Data Processing
Transactional Data: Orders, Invoices, Payments, Plans, Deliverables, Travel records
Analytical Data: Historical Data, Machine Streams, Clickstream data, Log files
Other Interaction Data
Volume, Velocity, Variety, Complexity
Big Data is the confluence of three trends: Big Transaction Data, Big Interaction Data, and Big Data Processing
Big Data Technology Landscape
Use Case: Large Scale Data Archival
Data scattered across disparate platforms and stored in different file formats can be acquired and organized easily using Big Data technologies
Transactional Data
Publishing House Historical Data
• Millions of Images
• Millions of Data Files
• Thousands of articles from hundreds of authors
Contracts Board
Comments
Mails & Tweets
Integrated Data Repository (Powered by Big Data)
Automatically indexed and tagged, and made available to end users through a portal
Case Study : Archiving at RSC
• About Royal Society of Chemistry
  – Europe’s largest society for advancing the chemical sciences
• Business Challenge
  – To organize assets accumulated since the 1840s
  – Content summary:
    • 1 million images
    • Millions of scientific data files
    • Hundreds of thousands of articles from 200,000 authors
    • Recent captures: social media, video and digital assets
• Solution
  – MarkLogic (a NoSQL solution) was used to create a repository accessible to RSC’s online users, entrepreneurs, researchers and educators
  – Content stored as XML documents (using a document-centric model)
• Benefits
  – Allows RSC to publish 3x as many journals and 4x as many articles
Source: http://is.gd/oyEu01
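The document-centric model mentioned in the RSC case can be illustrated with a minimal Python sketch. It is not MarkLogic's API: the element names, the `build_article` and `index_by_keyword` helpers, and the sample records are all assumptions made for illustration. It only shows the idea of keeping each article as one self-contained XML document and deriving a tag index from it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_article(article_id, title, author, year, keywords, body):
    """Return one self-contained XML document for an article (hypothetical schema)."""
    root = ET.Element("article", id=article_id)
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "author").text = author
    ET.SubElement(meta, "year").text = str(year)
    for kw in keywords:
        ET.SubElement(meta, "keyword").text = kw
    ET.SubElement(root, "body").text = body
    return ET.tostring(root, encoding="unicode")


def index_by_keyword(documents):
    """Map each lower-cased keyword to the sorted ids of the documents that carry it."""
    index = defaultdict(set)
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for kw in root.iterfind("./metadata/keyword"):
            index[kw.text.lower()].add(root.get("id"))
    return {kw: sorted(ids) for kw, ids in index.items()}


# Made-up sample records, not RSC content.
docs = [
    build_article("a1", "Sample article one", "Author A", 1901, ["Catalysis", "Kinetics"], "..."),
    build_article("a2", "Sample article two", "Author B", 1955, ["Kinetics"], "..."),
]
keyword_index = index_by_keyword(docs)
print(keyword_index["kinetics"])
```

Because metadata and body travel together in one document, new asset types can be added without changing a relational schema; a production repository would delegate indexing and search to the document database itself.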
Case Study: Converting Large Scale Images in NYT
• About New York Times
  – American daily newspaper, published in New York City since 1851
• Business Challenge
  – NYT decided to make all public domain articles dated 1851-1922 available to readers free of charge
  – 11 million articles available as images were to be converted to PDF format
  – Previously, PDFs were generated dynamically, but as traffic scaled this approach became infeasible
• Solution
  – Pre-generating articles and serving them as static files to readers
    • Amazon S3 as file system
    • Amazon EC2 for web services
    • Hadoop to convert articles into PDF files
• Benefits
  – NYT saved substantial IT investment and was able to deliver over 1.5 TB of data to users instantaneously
Source: http://is.gd/kMqKSe
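The pre-generation approach in the NYT case follows a map and reduce pattern: group the scanned page images by article, then produce one static file per article. The sketch below imitates that pattern in plain Python. It does not call Hadoop, S3 or EC2, it does not render real PDFs, and the article ids, file names and `static/` path are hypothetical.

```python
from collections import defaultdict


def map_page(record):
    """Map step: emit (article_id, (page_number, image_name)) for one scanned page."""
    article_id, page_number, image_name = record
    return article_id, (page_number, image_name)


def reduce_article(article_id, pages):
    """Reduce step: order an article's pages and name its static output file."""
    ordered = [name for _, name in sorted(pages)]
    return {"output": f"static/{article_id}.pdf", "pages": ordered}


def pregenerate(records):
    """Group page scans by article, then build one static-file plan per article."""
    grouped = defaultdict(list)
    for record in records:
        key, value = map_page(record)
        grouped[key].append(value)
    return [reduce_article(key, pages) for key, pages in sorted(grouped.items())]


# Hypothetical scan records: (article id, page number, image file).
scans = [
    ("1851-09-18-001", 2, "p2.tif"),
    ("1851-09-18-001", 1, "p1.tif"),
    ("1900-01-01-007", 1, "p1.tif"),
]
plans = pregenerate(scans)
for plan in plans:
    print(plan["output"], plan["pages"])
```

Each article is processed independently, which is why the work spreads well across many machines, and the output is computed once instead of on every request.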
Use Case: Leveraging Value in Social Media
Goodreads reviews
Facebook page likes and comments
Number of tweets with the hashtag of the book name
Source: Twitter, Facebook, Goodreads pages of Railsea [Author: China Miéville, Publisher: Random House]
Publishing Companies can leverage Big Data to aggregate and track social data in real time
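As a rough illustration of aggregating and tracking social data, the sketch below counts events per source and per hour. The event tuples are invented sample data and `hourly_counts` is a hypothetical helper; a real pipeline would read from each platform's feed and run continuously.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(events):
    """Count events per (source, hour bucket); each event is (source, ISO timestamp)."""
    counts = Counter()
    for source, timestamp in events:
        hour = datetime.fromisoformat(timestamp).replace(minute=0, second=0, microsecond=0)
        counts[(source, hour.isoformat())] += 1
    return counts


# Invented sample events for one book title.
events = [
    ("twitter", "2013-04-22T18:05:00"),
    ("twitter", "2013-04-22T18:40:00"),
    ("facebook", "2013-04-22T18:10:00"),
    ("goodreads", "2013-04-22T19:02:00"),
]
counts = hourly_counts(events)
for (source, hour), n in sorted(counts.items()):
    print(source, hour, n)
```

Bucketing by hour keeps the aggregate small enough to chart per title while still showing when activity peaks.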
Case Study: Personalizing Interactions at De Persgroep
• About De Persgroep
  – Leading publishing and broadcasting network in Belgium and the Netherlands
• Business Challenge
  – Millions of readers and viewers tune into De Persgroep’s print, digital, TV and radio channels
  – With users accessing content through multiple devices (iPad, Kindle, iPhone), consumer data outgrew the bounds of siloed solutions
• Solution
  – De Persgroep used Lily 2.0 (with help from NGData, a customer intelligence management company) to get an intelligent view of how customers use the content generated by the group
    • Creating personalized interactions, messages, and offers based on user preferences and purchase history
    • They realized an increase in Customer Lifetime Value
• Benefits
  – The adoption enabled De Persgroep to understand customers’ viewing and content preferences, and to create and share timely and relevant content along those lines
Source: http://is.gd/M7lVWw
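The move away from siloed, per-device data can be sketched as folding events from every device into one profile per customer. This is not Lily's API: the customer ids, devices, topics and both helper functions are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def build_profiles(events):
    """Fold per-device events into one profile per customer id."""
    profiles = defaultdict(lambda: {"devices": set(), "topics": Counter()})
    for customer_id, device, topic in events:
        profile = profiles[customer_id]
        profile["devices"].add(device)
        profile["topics"][topic] += 1
    return profiles


def top_topic(profile):
    """Return the most frequently read topic, or None for an empty profile."""
    most_common = profile["topics"].most_common(1)
    return most_common[0][0] if most_common else None


# Invented events: (customer id, device, topic read).
events = [
    ("c1", "iPad", "sport"),
    ("c1", "iPhone", "sport"),
    ("c1", "Kindle", "politics"),
    ("c2", "iPad", "culture"),
]
profiles = build_profiles(events)
print(top_topic(profiles["c1"]), sorted(profiles["c1"]["devices"]))
```

With one profile per customer, a preference observed on one device can drive an offer on another, which is the basis for the personalized interactions described above.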
Insights for growing the business
Reader / Content Consumer (Data | Analytics | Actions)

Past Searches
  Analytics:
    • 67% Life Sciences
    • Entomology
    • Coleoptera: 56%
    • Ladybird beetle (72%)
    • Beetles (28%)
  Actions:
    • Ad banners on the website
    • Display ladybird research articles
    • Discount coupons for subject books
    • Customize bundled offers

Demographics
  Analytics:
    • Professor at Humboldt Universität, Berlin
    • Dept of Agricultural Entomology
    • Editor in Chief, life sciences journal
  Actions:
    • Customized bills with focused ads
    • Upcoming publications
    • Discounts

Time of reading
  Analytics:
    • Subject-related searches: 10 am to 4 pm
    • Device reading: 8 pm to 10 pm
    • Device content sharing: 9 pm to 9.30 pm
    • 80% of tweets: 6 pm to 7 pm
  Actions:
    • Customized ad release timings
    • Ad release on devices
    • Do-not-disturb timings
    • Tailored call center action

Spend Analysis
  Analytics:
    • Total monthly spend: euro 350
    • Research articles: euro 250
    • Books: euro 45
    • Journal subscription: euro 55
  Actions:
    • Ads for publications in the price range
    • Bundled savings
    • Spend trend and alerts to sales

Social Media Activity
  Analytics:
    • Very active on social media
    • FB shares: 27% XYZ | 80% ABC
    • Tweets: 18% XYZ | 82% ABC
    • Low share of wallet
  Actions:
    • Watch customer surveys
    • Alert customer Account Manager

Reading Device
  Analytics:
    • 24% online searches: desktop
    • 76% book reading: iPad
  Actions:
    • More focus on iPad alerts for books
    • Offers on ebook versions
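The step from analytics to actions on this slide can be written as a small rule set. The sketch below uses the spend and device figures from the slide, but the dictionary keys, the `suggest_actions` helper and the 0.5 threshold are assumptions chosen for illustration, not rules taken from the talk.

```python
# Figures from the slide; the key names are hypothetical.
monthly_spend_eur = {"research_articles": 250, "books": 45, "journal_subscription": 55}
device_share = {"desktop_search": 0.24, "ipad_book_reading": 0.76}


def spend_shares(spend):
    """Return the total spend and each category's share of it."""
    total = sum(spend.values())
    return total, {item: amount / total for item, amount in spend.items()}


def suggest_actions(spend, devices, threshold=0.5):
    """Turn spend and device analytics into a short list of actions (assumed rules)."""
    _, shares = spend_shares(spend)
    top_item = max(shares, key=shares.get)
    actions = [f"Promote {top_item} offers"]
    if devices.get("ipad_book_reading", 0) > threshold:
        actions.append("Focus book alerts on iPad")
    return actions


total, shares = spend_shares(monthly_spend_eur)
actions = suggest_actions(monthly_spend_eur, device_share)
print(total, round(shares["research_articles"], 2), actions)
```

The three spend categories add up to the stated monthly total of euro 350, and research articles account for about 71% of it, which is why the first suggested action targets them.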
Big data innovation trends
Source: http://www.constellationrg.com
Recommended Steps to Consider for Big Data
• Identify the business problem that you are trying to solve
• Identify the relevant technology that will be able to address the problem
• Break organization silos and form cross-functional teams
• Assign responsibility to a mix of ‘left brain’ analytical and ‘right brain’ depicter-type people
• Start small, with proofs of concept that use existing commodity hardware and free solutions
• Strike a balance between the existing technology infrastructure and the introduction of Big Data technologies
There is new hope with big data…