Applications of Big Data By Prashant Kumar Jadia Department of Computer Science and Engineering Hong Kong University of Science and Technology [email protected]

Post on 12-Apr-2017

What is Big Data

Various Definitions

• McKinsey

"Big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.

• Gartner

Big data in general is defined as high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.

• O'Reilly

Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.

Source: Infosys Blogs

URL: http://www.infosysblogs.com/bigdata/2013/02/what_actually_is_big_data.html

Date: 14-February-15

Four V’s of Big Data

Volume

• 300 hours of video uploaded to YouTube every minute

• 10 billion posts on Facebook every day

• 302 million monthly active users on Twitter

Variety

• 500 million tweets every day

• Millions of wearables and health monitors

• Billions of photos uploaded every day

Velocity

• Spread of sensor networks

• Growth in world connectivity

Veracity

• Different sources have different data formats

• Health care: the same data appears in various forms

Figures are as of May, 2015

The Fifth ‘V’: Value

• DFAS has saved approximately $4 billion in improper vendor payments

• Savings of $100 million in erroneous claims

• eCall is expected to save around 2,500 lives every year

• Estimated savings of $450 billion in US health care if big data is used

Figures are as of May, 2015

History of Big Data

• First documented use of the term "big data"

1997, in a paper from NASA: "Visualization provides an interesting challenge for computer systems: data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data."

• 3Vs first published in 2001

Gartner analyst Doug Laney introduced the 3Vs concept in a 2001 META Group research publication, 3D Data Management: Controlling Data Volume, Velocity, and Variety.

• Rapid growth since 2007

- Better Internet bandwidth

- Cheaper storage

- Increased computing power

History of Big Data – Factors contributing to Growth

Number of “Big Data” papers published per year

Source: An overview of Big Data

Journal: The Next Wave | Vol. 20 | No. 4 | 2014

Computing Cost Performance 1992-2012

Source: From exponential technologies to exponential innovation

URL: http://dupress.com/articles/from-exponential-technologies-to-exponential-innovation/

Date: 4-October-13

Storage Cost Performance 1992-2012

Source: From exponential technologies to exponential innovation

URL: http://dupress.com/articles/from-exponential-technologies-to-exponential-innovation/

Date: 4-October-13

Global Internet Traffic

Figures are as of May, 2015

Gartner Emerging Technologies 2012

Google search for the term “Big Data” – signifying public interest

Figures are as of May, 2015

Big Data in Social Media

Recommendation Systems

Marketing

Electioneering

Influence Marketing

Credit Scoring

Candidature Check

The Conversation Prism

• What is Social Media?

A group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of user-generated content.

• Social media is much more than Facebook and Twitter.

• There are social media platforms for almost every sphere of life.

• How big are these platforms?

Platform    Users          Activity per day
Twitter     302 million    500 million tweets
Facebook    936 million    55 million status updates
LinkedIn    364 million
YouTube     1+ billion     432,000 hours of video

Figures are as of May, 2015
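The two YouTube figures quoted in this deck (300 hours of video uploaded every minute, and 432,000 hours of video per day) are the same number at different time scales. A minimal arithmetic check, with variable names chosen only for illustration:

```python
# Figures taken from the slides; the variable names are illustrative only.
HOURS_UPLOADED_PER_MINUTE = 300
MINUTES_PER_DAY = 24 * 60

# Scale the per-minute upload rate to a full day.
hours_per_day = HOURS_UPLOADED_PER_MINUTE * MINUTES_PER_DAY

# How many years it would take one person to watch a single day's uploads.
years_to_watch_one_day = hours_per_day / (24 * 365)

print(hours_per_day)                   # 432000
print(round(years_to_watch_one_day))   # 49
```

In other words, a single day of uploads at this rate is roughly 49 years of continuous viewing.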

Uses of Social Media Data

• What can be mined out of this ocean of data?

The possibilities are endless.

A UN project showcased an exciting application: discovering an association between food price inflation and tweets about the price of rice.

Social Media – Recommendation Systems

Many types of recommendation systems

• Facebook – Recommended Friends

• LinkedIn – People You May Know

• YouTube – Videos you may like

• Amazon – People also bought

• Pinterest – Board recommendations

So, how do recommendation systems work?

People / Friend Recommendation

- Use known information to predict ties

- Friends of friends are likely to become friends

Algorithm / research areas

- Community detection

- Structural holes
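The "friends of friends" idea can be sketched as counting mutual friends. This is a minimal illustration, not how any particular platform does it; the graph, the names in it, and the function `recommend_friends` are all hypothetical.

```python
from collections import Counter

def recommend_friends(graph, user, top_n=3):
    """Rank non-friends of `user` by the number of mutual friends."""
    friends = graph[user]
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph[friend]:
            # Skip the user and people who are already friends.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Most mutual friends first; ties broken alphabetically.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# Hypothetical undirected friendship graph (adjacency sets).
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan", "eve"},
    "dan": {"bob", "cat"},
    "eve": {"cat"},
}

print(recommend_friends(graph, "ann"))  # [('dan', 2), ('eve', 1)]
```

Here "dan" ranks first because he shares two friends with "ann". Real systems combine this kind of signal with many others, such as the community structure mentioned above.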

Social Media – Electioneering

• What is Electioneering?

- The activity of trying to persuade people to vote for a particular political party.

• What is big data's role in it?

- Determine and target the most persuadable electoral base

- Effectively choose marketing media for maximum reach for every dollar spent

- Influence the influencers

• Maximizing return per dollar

– Match billing records (from the set-top box company) with the current voter list

– Divide the day into 96 time slots (15 minutes each)

– Study the time-slot usage of the target electorate across 60 channels

– Pick the slots with maximum reach per dollar

• User Modelling

– Rate each voter from 0 to 10 on how persuadable they are

– Volunteers then call or visit voters with appropriate content

• Micro-targeting

– Monitor social media (Facebook, Twitter, etc.)

– Micro-target voters by delivering a custom message to each specific subgroup
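The "maximum reach per dollar" step can be sketched as a greedy selection under a budget. Everything below is an assumption for illustration: the channels, reach and cost figures are made up, `pick_slots` is a hypothetical helper, and a greedy pick is a simple heuristic, not necessarily what any campaign actually used.

```python
SLOTS_PER_DAY = 96
MINUTES_PER_SLOT = 24 * 60 // SLOTS_PER_DAY  # 96 slots per day means 15 minutes each

def pick_slots(candidates, budget):
    """Greedily buy ad slots in descending order of reach per dollar.

    candidates: list of (channel, slot_index, target_voters_reached, cost).
    Returns the chosen (channel, slot_index) pairs and the amount spent.
    """
    ranked = sorted(candidates, key=lambda c: c[2] / c[3], reverse=True)
    chosen, spent = [], 0
    for channel, slot, reach, cost in ranked:
        if spent + cost <= budget:
            chosen.append((channel, slot))
            spent += cost
    return chosen, spent

# Made-up example data: (channel, slot index 0-95, reach, cost in dollars).
candidates = [
    ("news", 80, 12000, 600),     # 20 voters per dollar
    ("sports", 76, 9000, 300),    # 30 voters per dollar
    ("movies", 88, 20000, 2000),  # 10 voters per dollar
    ("cooking", 40, 2500, 100),   # 25 voters per dollar
]

chosen, spent = pick_slots(candidates, budget=1000)
print(chosen)  # [('sports', 76), ('cooking', 40), ('news', 80)]
print(spent)   # 1000
```

The slot with the largest raw audience ("movies") is skipped because its reach per dollar is the lowest, which is the point of ranking by efficiency instead of by audience size.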

Social Media – Influence Marketing

What is Social Influence?

- Social influence occurs when one's opinions, emotions, or behaviours are affected by others, intentionally or unintentionally.

What is Influence Marketing?

- Discovering and predicting a user's influence on connected nodes and their ability to spread information.
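One very rough proxy for a user's "ability to spread information" is how many other users a message could reach within a few hops of the follower graph. The sketch below uses a breadth-first search for that. It is an assumed, simplified measure for illustration only, not the scoring method of any real service; the graph and the function `reach` are hypothetical.

```python
from collections import deque

def reach(followers, user, max_hops=2):
    """Count users reachable from `user` within `max_hops` follower links."""
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for follower in followers.get(node, ()):
            if follower not in seen:
                seen.add(follower)
                queue.append((follower, hops + 1))
    return len(seen) - 1  # exclude the user themselves

# Hypothetical graph: followers[x] lists the users who see x's posts.
followers = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan", "eve"],
    "dan": ["fay"],
}

print(reach(followers, "ann"))  # 4
print(reach(followers, "dan"))  # 1
```

With a two-hop limit, "ann" reaches bob, cat, dan and eve, while "fay" is three hops away and is not counted.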

Use Case

- Klout generates a score on a scale of 1-100 for a social media user to represent her/his ability to engage other people and inspire social actions.

- In 2012, Cathay Pacific opened access to its SFO lounge to Klout users

Big Data in Healthcare

Self-aware Medics

Sports and Fitness Tracking

Clinical Trials

Personalized Medicines

Genomics

Electronic Health records

• Data characteristics

- 1.2 billion clinical documents are produced in the U.S. each year; 60% are in unstructured format

- Health trackers

- Genomics

• Savings

- Up to $450 billion can be saved if the healthcare industry uses big data analytics and patients make the right choices.

- US Government recoveries from forfeitures, asset seizures and fines amount to $4.3 billion

Figures are as of May, 2015

Healthcare – Clinical Trials

Before

Big Data Era

Healthcare – Personalized Medicine

Redefining Medicine

Healthcare – Genomics

Success Story

Use big data and genomics to pin down a disease's root cause

Story

- Joshua Osborn (pictured), a 14-year-old, was admitted to hospital with a high fever

- An MRI showed brain swelling. However, all the related tests came back negative.

- Doctors decided to run an experimental DNA sequencing technology

- DNA was extracted from his cerebrospinal fluid

- Within 2 days, three million DNA sequences were extracted

- From the sequences obtained, the team subtracted all known human elements

- Only 0.02 percent was left over, and it belonged to a lethal bacterium called Leptospira

- Treatment for the infection was started, and within weeks Joshua was back home

Underlying Big Data Technology

- SNAP, a sequence aligner from UC Berkeley's AMPLab (the lab behind Spark)
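The "subtract all known human elements" step can be illustrated with a toy filter that drops every read sharing a short substring (k-mer) with a known reference. This is only a sketch of the subtraction idea under simplifying assumptions. It is not SNAP and not the pipeline the doctors used, and the sequences and the helper names are made up.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def subtract_known(reads, reference, k=8):
    """Keep only reads that share no k-mer with the known reference."""
    ref_kmers = kmers(reference, k)
    return [read for read in reads if not (kmers(read, k) & ref_kmers)]

# Made-up stand-in for the known (human) reference sequence.
reference = "ACGTACGTTAGCCGATAGGCTTAACCGGTTAAGGCC"

reads = [
    "ACGTACGTTAGC",    # taken from the reference, so it is subtracted
    "GGCTTAACCGGT",    # taken from the reference, so it is subtracted
    "TTTTGGGGAAAACC",  # matches nothing in the reference, so it is kept
]

leftover = subtract_known(reads, reference)
print(leftover)  # ['TTTTGGGGAAAACC']

# Scale of the real case from the slide: 0.02 percent of three million sequences.
leftover_in_case = 3_000_000 * 2 // 10_000
print(leftover_in_case)  # 600
```

At the scale quoted in the story, 0.02 percent of three million sequences is about 600 reads, which is why the subtraction step turns an unmanageable search into a small one.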

Big Data in Smart Cities

Smart Transport

Traffic Management

Smart Governance

Smart Energy

Smart Economy

Smart Cities – Internet of Things

What is IoT?

The Internet of Things, also called the Internet of Objects, refers to a wireless network between objects, such as household appliances; usually the network will be wireless and self-configuring.

- Wikipedia

Benefits

- Dynamic control of life

- Improved resource utilization

- Automation support systems

- Integration of physical systems with human society

Smart Cities – Smart Transport

Latest Use Case: eCall

- Mandatory for all vehicles to have embedded impact sensors

- Sensors can call emergency services in case of impact.

- Devices are activated only in an accident.

Savings

- Expected to reduce response time by 40-50%

- Time saved = lives saved: 2,500 lives annually

Challenges

- User privacy and concerns over being tracked and monitored

Smart Cities – Smart Energy

Use Case: Time-based energy pricing

- Monitor energy usage using smart meters

- Report usage to both the customer and the energy company in real time.

- Big data is used to predict and calculate pricing based on historical and current utilization.

Savings and benefits

- Customers can better manage their energy usage

- Potential to maximize savings on energy
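Time-based pricing can be sketched as billing each smart-meter reading at a rate that depends on when the energy was used. The sketch below assumes a fixed two-tier tariff, which is a simplification of the predictive pricing described above. The peak window, both rates, the readings, and the function `bill` are all hypothetical.

```python
from datetime import datetime

# Assumed tariff, for illustration only.
PEAK_HOURS = range(17, 21)   # 17:00 to 20:59
PEAK_RATE = 0.30             # dollars per kWh during peak hours
OFF_PEAK_RATE = 0.10         # dollars per kWh at all other times

def bill(readings):
    """Total cost of (timestamp, kWh) smart-meter readings under the tariff."""
    total = 0.0
    for timestamp, kwh in readings:
        rate = PEAK_RATE if timestamp.hour in PEAK_HOURS else OFF_PEAK_RATE
        total += kwh * rate
    return round(total, 2)

# Made-up readings for one day.
readings = [
    (datetime(2015, 5, 1, 8, 0), 1.0),   # off-peak
    (datetime(2015, 5, 1, 18, 0), 2.0),  # peak
    (datetime(2015, 5, 1, 23, 0), 0.5),  # off-peak
]

print(bill(readings))  # 0.75
```

Moving the 2.0 kWh evening load to an off-peak hour would cut its cost from $0.60 to $0.20, which is the kind of saving that real-time usage reports let a customer act on.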

Use Case: IBM HyREF

- Cloud imaging technology can track clouds

- Sensors for wind speed, temperature and direction.

- Can predict weather 1 month in advance at intervals of 15 minutes

Savings and benefits

- Can better manage the variable nature of wind

- Better forecasts of energy generation

- Enables integration of traditional sources of power generation in case of outage

Thank You