

BIG DATA: FROM UTOPIA TO REALITY

Big data and AI: the challenge turned into success stories in the biotechnology and pharmaceutical industry

Tuesday, 6 February 2018

Toni Manzano, R&D Director and co-founder

AGENDA

1 What does big data really mean?

2 How do we extract knowledge from big data?

3 Real use cases in one of the most complex industries

1

What does big data really mean?

Artificial Intelligence

The new world of big data, internet connectivity and analytics is built from many pieces

Internet of Things

Machine Learning

What is big data


90% of the data in the world today has been created in the last 24 months alone

Source: Adeptia

Volume

Every day we create 5 quintillion bytes of data

= 5,000,000,000,000,000,000 characters / day

= 5 exabytes / day
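The unit conversion on this slide can be checked with a few lines of arithmetic. The sketch below takes the slide's figure of 5 quintillion bytes per day as given and assumes decimal (SI) prefixes; reading one byte as one character further assumes a single-byte text encoding. The helper name `to_unit` is illustrative only.

```python
# Figure quoted on the slide: 5 quintillion (5 x 10^18) bytes per day.
BYTES_PER_DAY = 5 * 10**18
SECONDS_PER_DAY = 24 * 60 * 60


def to_unit(n_bytes, power_of_ten):
    """Convert a byte count to a decimal (SI) unit, e.g. 18 for exabytes."""
    return n_bytes / 10**power_of_ten


exabytes_per_day = to_unit(BYTES_PER_DAY, 18)
terabytes_per_second = to_unit(BYTES_PER_DAY / SECONDS_PER_DAY, 12)

print(f"{exabytes_per_day} EB per day")
print(f"{terabytes_per_second:.2f} TB per second")
```

Spread evenly over a day, the same volume is a sustained rate of tens of terabytes per second, which is one way to make the "Volume" dimension concrete.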

Big data and the magic of aggregation and granularity

STATISTIC CHALLENGES

Yet the adoption rate of big data and cloud technologies in pharma lags behind that of other industries

[Bar chart: productivity increase (%) and sales increase ($B) by industry. Industries shown: Retail, Consulting, Transportation, Construction, Food Products, Steel, Automobile, Industrial instruments, Publishing, Telecommunications. Productivity increases range from 17% to 49%; sales increases range from $0.4B to $9.6B.]

Source: Wipro

Companies investing in AI by industry

• Software, information technology services: 32%
• Internet: 8.78%
• Telecom: 4.19%
• Research: 3.37%
• Retail: 2.66%
• Marketing and advertising: 2.55%
• Financial services: 2.35%
• Automotive: 2.15%
• … and more

?

Source: Naimat, Aman “The New Artificial Intelligence Market” O’Reilly

2

How do we extract knowledge from big data?

Examples of emergent knowledge from big data

Square, a platform originally presented as a device plus a backend to process credit card payments, recently announced a focus on a new revenue stream: big data and analytics. Using its massive volume of transaction data, Square offers valuable information to a new range of customers.

UPS, one of the earliest adopters of business analytics, is moving to a new dynamic package routing program that will save the company tens of millions each year in fuel costs. “UPS executives don’t necessarily view Big Data as new,” guest columnist Thomas H. Davenport writes, “but they do view it as providing revolutionary benefits through evolutionary implementation.”

Nest has negotiated deals with multiple energy partners in the U.S. Some utility partners are willing to spend $30 to $60 per year and per thermostat to be able to turn the air conditioner up on a hot day. This way, the utility can level the load on the grid. Partners don’t have direct access to the thermostats; they just sign a deal with Nest, and Nest has access to the thermostats. Moreover, it’s a recurring revenue stream.

Rio Tinto’s Pilbara region mines, railways and ports generate 2.4 terabytes of data a minute, and its new, state-of-the-art processing centre in Brisbane is working towards processing this valuable information. The company recently reported that the centre has already reduced its costs by US$80 million.

GETTING KNOWLEDGE

Kaiser Permanente: office visits reduced by 26.2%, telephone visits increased 8x

Premier: 2,700 members (hospitals and health systems), 90,000 non-acute facilities and 400,000 physicians; saving an estimated 29,000 lives and reducing healthcare spending by nearly $7 billion

Asthmapolis has created a GPS-enabled tracker that monitors inhaler usage by asthmatics.

ginger.io offers a mobile application in which patients (such as those with diabetes) agree, in conjunction with their providers, to be tracked through their mobile phones and assisted with behavioral health therapies.

Johns Hopkins School of Medicine discovered how to predict sudden increases in flu-related emergency room visits from Google Flu Trends.

The analysis of Twitter updates was as accurate as (and two weeks ahead of) official reports at tracking the spread of cholera in Haiti after the January 2010 earthquake

Examples of emergent knowledge in health

Drivers of change in health

Quality in processes and procedures is required

Main magnitudes in the health system in Catalonia

• 46 million primary care visits per year
• 760,000 hospital admissions per year
• 100,000 new public health members per year
• 2.7 million emergency visits per year
• 60 million electronic recipes per year
• 140 million electronic prescriptions per year
• 63 hospitals, 49 mental health
• 369 primary care teams

Source: AQuAS, Observatori de Salut de Catalunya, 2013

Data volumes generated

• 3 GB for each codified genotype
• 30 MB for each X-ray test
• 120 MB for each mammography
• 1 GB for each 3D CT scan
• 150 MB for each magnetic resonance scan
• 80% unstructured data
• 20-40% annual increase
• 665 TB of data generated per hospital

Source: NetApp, The Body as a Source of Big Data, 2013


Do you know how much data is being generated at your site?

Emergence of knowledge

10 TB, 500 TB, 100 PB per year

Volume, Variety, Velocity, Veracity

When statistical results have a huge impact on users

n-dimension

3

Real use cases in one of the most complex industries

$50 billion wasted by pharma manufacturers each year

PHARMA MANUFACTURING CHALLENGES

Source: W. Nicholson Price II, Making Do in Making Drugs: Innovation Policy and Pharmaceutical Manufacturing, 55 B.C.L. Rev. 491

70% of manufacturing data is unused

CHALLENGES

Source: Gartner

Mindset change

DATA SCIENCE CHALLENGES

[Diagram. Data types: CQA, CPP, DS, HVAC, QC, (…). Siloed data sources (70% unused): SAP, ERP, MES, LIMS, legacy IoT, IIoT, CLIMA, users. Added layer: regulated data hub with big data, cloud technologies, artificial intelligence and machine learning. Other labels: new knowledge, process engineering.]

Something is changing in Pharma…

New statistical tools for a new age based on big data

Classical statistics: ANOVA, chi-square, Student's t, p-value, linear regression, Gaussian distribution, representative sample

Big data statistics: decision trees, Naive Bayes classification, logistic regression, neural networks, principal component analysis, least squares regression, NLP, clustering algorithms, Bayesian networks

Data management process

[Diagram. Stages: Data Acquisition, Data Preparation, Data Aggregation, Analysis and Models, Interpretation, Visualization. Other labels: Knowledge generation, Computation, Decision making. Challenges: Heterogeneity, Scalability, Velocity, Privacy, Security.]

USE CASE 1 - Golden Batch

[Diagram: two batches from RM, one with a culture duration of 12 days and one of 14 days; data for each batch comes from ERP, MES, CLIMA and users.]

?

CQA: all within specification

CPP: equivalent results except for culture duration

Neural Networks + Pattern recognition

CQA: pH, temperature, O2, CO2, PID, …

CPP: yield, glucose, flow rates, DOUR, % addition, …

Others: cleaning, BCO, holding times, climate, HVAC, …


USE CASE 2 - OEE

Overall Equipment Effectiveness

Measuring real-time batch activity and transversal operations

CPP, CQA and ambient measures

Downtimes: 26% lower

OEE: 35% higher

After a 6-month POC using bigengine, unexpected downtimes were reduced by 26% and OEE increased by 35%.
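OEE is conventionally computed as availability × performance × quality. The sketch below shows how a 26% cut in downtime feeds into the availability term. All shift figures (planned minutes, downtime, performance and quality ratios) are hypothetical and are not taken from the POC. The example is not meant to reproduce the 35% figure.

```python
def availability(planned_minutes, downtime_minutes):
    """Share of planned production time in which the line actually ran."""
    return (planned_minutes - downtime_minutes) / planned_minutes


def oee(availability_ratio, performance_ratio, quality_ratio):
    """Overall Equipment Effectiveness as the product of its three factors."""
    return availability_ratio * performance_ratio * quality_ratio


# Hypothetical figures for one shift (assumptions, not data from the slides).
PLANNED = 480.0           # planned production minutes
DOWNTIME_BEFORE = 100.0   # unexpected downtime in minutes
DOWNTIME_AFTER = DOWNTIME_BEFORE * (1 - 0.26)  # downtime cut by 26%
PERFORMANCE = 0.90        # actual speed relative to ideal speed
QUALITY = 0.98            # good units relative to total units

oee_before = oee(availability(PLANNED, DOWNTIME_BEFORE), PERFORMANCE, QUALITY)
oee_after = oee(availability(PLANNED, DOWNTIME_AFTER), PERFORMANCE, QUALITY)

print(f"OEE before: {oee_before:.3f}")
print(f"OEE after:  {oee_after:.3f}")
```

Because the three factors multiply, a gain in availability alone lifts OEE only in proportion. Larger OEE gains generally require the performance and quality terms to improve as well.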


USE CASE 3 - Predictive maintenance

Once-a-month data collection ≉ data for predictive maintenance

Classic procedure

• Failures between data collection rounds
• Inconsistent data collection
• Variable operating conditions
• Inaccessible machinery
• Manual analysis

Unattended process

• Wireless connectivity
• Inexpensive sensors
• Big data in real time
• Cloud computing
• Artificial intelligence
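To illustrate why continuously collected sensor data matters here, the following is a minimal sketch of a rolling-window check that flags a reading far outside the recent range. It is a generic technique chosen for illustration, not the method used in the talk. The function name `rolling_anomalies`, the window size, the threshold and the synthetic vibration signal are all assumptions.

```python
from collections import deque
from statistics import mean, stdev


def rolling_anomalies(readings, window=20, threshold=3.0):
    """Return indices of readings that deviate from the mean of the
    previous `window` readings by more than `threshold` standard deviations."""
    history = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(index)
        history.append(value)
    return flagged


# Synthetic vibration signal (hypothetical): stable pattern with one spike.
signal = [1.0 + 0.01 * (i % 5) for i in range(40)]
signal[30] = 2.0

anomalies = rolling_anomalies(signal)
print(anomalies)
```

A check like this needs a dense stream of readings to compare against. With one reading per month, a short-lived spike of this kind would fall between collection rounds and never be recorded.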

USE CASE 4 - VOC, EL & Cooling process optimization

[Diagram: Solvents & Raw Material → Reactor 1, Reactor 2, (…), Reactor n → VOC → Emissions. Other labels: parallel processes, sequential processes.]

Why?

USE CASE 5 - Defects in tablets

toni.manzano@bigfinite.com
