
Page 1: Data Analytics with MATLAB - itpcas.ac.cnlib.itpcas.ac.cn/documents/18/0/Data+Analytics+MATLAB.pdfMATLAB Analytics work with business and engineering data 1. 6 Data Analytics Workflow

© 2017 The MathWorks, Inc.

MATLAB for Data Analytics

Page 2:

Aeronautics

Medical Devices

Off-highway vehicles

Automotive

Oil & Gas

Industrial Automation

Fleet Analytics

Health Monitoring

Asset Analytics

Process Analytics

Prognostics

Condition Monitoring

Clean Energy

Retail Analytics

Mfg Process Analytics

Supply Chain

Operational Analytics

Healthcare Analytics

Risk Analysis

Logistics

Retail

Finance

Healthcare

Management

Internet

Railway Systems

Page 3:

What is Data Analytics?

• What happened? Descriptive

• Why did it happen? Diagnostics

• What will happen? Predictive

• What should be done? Prescriptive

Turn large volumes of complex data into actionable information

Data → Decisions

Page 4:

Data Analytics Workflow

Access and Explore Data

▪ Files

▪ Databases

▪ Sensors

Preprocess Data

▪ Working with Messy Data

▪ Data Reduction/Transformation

▪ Feature Extraction

Develop Predictive Models

▪ Model Creation, e.g. Machine Learning

▪ Model Validation

▪ Parameter Optimization

Integrate Analytics with Systems

▪ Desktop Apps

▪ Enterprise Scale Systems

▪ Embedded Devices and Hardware

Page 5:

Data Analytics Workflow

Access and Explore Data: Files, Databases, Sensors

Preprocess Data: Working with Messy Data, Data Reduction/Transformation, Feature Extraction

1. MATLAB Analytics work with business and engineering data

▪ Point and click tools to access a variety of data sources

▪ High-performance environment for big data

▪ Built-in algorithms for data preprocessing including sensor, image, audio, video and other real-time data

Data types shown: Files, Signals, Databases, Images

Page 6:

Data Analytics Workflow

Preprocess Data: Working with Messy Data, Data Reduction/Transformation, Feature Extraction

Develop Predictive Models: Model Creation (e.g. Machine Learning), Model Validation, Parameter Optimization

2. MATLAB enables domain experts to do Data Science

Apps / Language

▪ Easy to use apps

▪ Wide breadth of tools to facilitate domain specific analysis

▪ Examples/videos to get started

▪ Automatic MATLAB code generation

▪ High speed processing of large data sets

Page 7:

Data Analytics Workflow

Develop Predictive Models: Model Creation (e.g. Machine Learning), Model Validation, Parameter Optimization

Integrate Analytics with Systems: Desktop Apps, Enterprise Scale Systems, Embedded Devices and Hardware

Challenges

▪ End users: operators, analysts, administrative staff, customers, etc.

▪ Different target platforms:

– Cluster or Cloud environment

– Standalone desktop applications

– Server based Web and enterprise systems

– Embedded hardware

▪ Different interfaces: C++, Java, Python, .NET, etc.

▪ Need to translate analytics to production environment

Page 8:

Integrate analytics with systems

3. MATLAB Analytics run anywhere

MATLAB

▪ Embedded Hardware: C, C++, HDL, PLC

▪ Enterprise Systems: Standalone Application, Excel Add-in, Hadoop/Spark, C/C++, Java, .NET, Python, MATLAB Production Server

MATLAB Runtime

Page 9:

Key Takeaways

1. MATLAB Analytics work with business and engineering data

2. MATLAB enables domain experts to do Data Science

3. MATLAB Analytics run anywhere

Page 10:

Machine Learning is Everywhere

▪ Image Recognition

▪ Speech Recognition

▪ Stock Prediction

▪ Medical Diagnosis

▪ Data Analytics

▪ Robotics

▪ and more…


Page 11:

Machine Learning

Machine learning uses data and produces a program to perform a task

Standard Approach (Computer Program)

Hand Written Program:

If X_acc > 0.5 then “SITTING”

If Y_acc < 4 and Z_acc > 5 then “STANDING”

Formula or Equation:

$Y_{\text{activity}} = \beta_1 X_{\text{acc}} + \beta_2 Y_{\text{acc}} + \beta_3 Z_{\text{acc}} + \dots$

Machine Learning Approach (Machine Learning)

model = <Machine Learning Algorithm>(sensor_data, activity)

model: Inputs → Outputs

Task: Human Activity Detection
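The contrast between a hand-written program and a learned model can be sketched in a few lines. The example below is an illustration in Python, not MATLAB code and not the method used in the presentation. The thresholds in `hand_written_program` come from the slide; the sensor readings, the fallback label "UNKNOWN", and the choice of a nearest-centroid classifier are assumptions made for this sketch.

```python
from statistics import mean

# Hypothetical labeled accelerometer readings (x, y, z); values are invented.
training_data = [
    ((0.9, 1.0, 0.2), "SITTING"),
    ((0.8, 1.2, 0.1), "SITTING"),
    ((0.7, 0.9, 0.3), "SITTING"),
    ((0.1, 3.0, 9.5), "STANDING"),
    ((0.2, 2.8, 9.7), "STANDING"),
    ((0.0, 3.1, 9.6), "STANDING"),
]


def hand_written_program(x_acc, y_acc, z_acc):
    """Standard approach: a person picks the thresholds (rules from the slide)."""
    if x_acc > 0.5:
        return "SITTING"
    if y_acc < 4 and z_acc > 5:
        return "STANDING"
    return "UNKNOWN"  # assumption: the slide does not define a fallback


def train_nearest_centroid(samples):
    """Machine learning approach: the decision rule is derived from data."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    # One centroid (per-axis mean) for each activity label.
    centroids = {
        label: tuple(mean(axis) for axis in zip(*rows))
        for label, rows in grouped.items()
    }

    def model(x_acc, y_acc, z_acc):
        point = (x_acc, y_acc, z_acc)

        def squared_distance(label):
            return sum((p - c) ** 2 for p, c in zip(point, centroids[label]))

        # Predict the label whose centroid is closest to the new reading.
        return min(centroids, key=squared_distance)

    return model


model = train_nearest_centroid(training_data)
new_reading = (0.15, 2.9, 9.4)
print(hand_written_program(*new_reading))
print(model(*new_reading))
```

Both functions map inputs to outputs; the difference is that `model` is produced by calling a training routine on `(sensor_data, activity)` pairs rather than being written by hand.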

Page 12:

Example: Human Activity Learning Using Mobile Phone Data

Machine Learning

Data:

➢ 3-axial Accelerometer data

➢ 3-axial Gyroscope data

Page 13:

“essentially, all models are wrong, but some are useful” – George Box

Page 14:

Machine Learning Workflow

Train: Iterate till you find the best model

LOAD DATA → PREPROCESS DATA (SUMMARY STATISTICS, PCA, FILTERS, CLUSTER ANALYSIS) → SUPERVISED LEARNING (CLASSIFICATION, REGRESSION) → MODEL

Predict: Integrate trained models into applications

NEW DATA → PREPROCESS DATA (SUMMARY STATISTICS, PCA, FILTERS, CLUSTER ANALYSIS) → MODEL → PREDICTION
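The two phases of this workflow (train, then predict) can be sketched end to end. This is an illustration in Python rather than MATLAB, under several assumptions: the two-feature data set and the labels "WALKING" and "RUNNING" are invented, standardization stands in for the preprocessing step, and a k-nearest-neighbor classifier stands in for the supervised learner. "Iterate till you find the best model" is reduced here to trying a few values of `k` on a held-out validation set.

```python
from collections import Counter
from statistics import mean, pstdev

# LOAD DATA: hypothetical two-feature samples (values invented).
train_x = [(1.0, 200.0), (1.2, 220.0), (0.9, 210.0), (1.1, 190.0),
           (3.0, 800.0), (3.2, 780.0), (2.9, 820.0), (3.1, 790.0)]
train_y = ["WALKING"] * 4 + ["RUNNING"] * 4
val_x = [(1.05, 205.0), (3.05, 805.0), (1.15, 215.0), (2.95, 795.0)]
val_y = ["WALKING", "RUNNING", "WALKING", "RUNNING"]


def fit_scaler(rows):
    """PREPROCESS: learn per-column mean and standard deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero
    return means, stds


def apply_scaler(rows, scaler):
    """Apply the same preprocessing to training, validation, and new data."""
    means, stds = scaler
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds))
            for row in rows]


def knn_predict(x, y, query, k):
    """SUPERVISED LEARNING (classification): majority vote of k neighbors."""
    nearest = sorted(
        zip(x, y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


scaler = fit_scaler(train_x)
train_scaled = apply_scaler(train_x, scaler)
val_scaled = apply_scaler(val_x, scaler)


def validation_accuracy(k):
    hits = sum(knn_predict(train_scaled, train_y, q, k) == expected
               for q, expected in zip(val_scaled, val_y))
    return hits / len(val_y)


# TRAIN: try several candidate models and keep the best one.
scores = {k: validation_accuracy(k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)

# PREDICT: new data goes through the same preprocessing, then the model.
new_data = [(3.0, 810.0)]
prediction = knn_predict(train_scaled, train_y,
                         apply_scaler(new_data, scaler)[0], best_k)
print(scores, best_k, prediction)
```

The point of the sketch is the shape of the pipeline: the scaler fitted on training data is reused unchanged for validation and new data, which mirrors the repeated PREPROCESS DATA stage in both rows of the diagram.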

Page 15:

Parallel Computing Paradigm: Multicore Desktops

[Diagram: MATLAB Desktop (client) connected to MATLAB workers, one per core of a multicore desktop]

Page 16:

Parallel Computing Paradigm: Cluster Hardware

[Diagram: MATLAB Desktop (client) connected to a cluster of computers, each a multicore machine running MATLAB workers on its cores]

Page 17:

Migrate execution to a cluster environment

Desktop: MATLAB with Parallel Computing Toolbox (multi-core CPU, GPU)

Cluster: MATLAB Distributed Computing Server (multi-core CPU, GPU)

Page 18:

Parallel Computing Paradigm: NVIDIA GPUs

Using NVIDIA GPUs

[Diagram: MATLAB Desktop (client) connected to a GPU with its GPU cores and device memory]

Page 19:

Cluster Computing Paradigm

▪ Prototype on the desktop

▪ Integrate with existing infrastructure

▪ Access directly through MATLAB

[Diagram: User Desktop (MATLAB with Parallel Computing Toolbox) → Headnode → Compute Nodes (MATLAB Distributed Computing Server)]

Page 20:

Parallel Computing with MATLAB – Beyond PARFOR

Well-known features

▪ parallel-enabled toolboxes

▪ parfor

▪ gpuArray

Full spectrum of support

▪ batch submission, jobs and tasks: batch, createJob, createTask

▪ asynchronous queue for feval: parfeval

▪ parallel support for big data: tall, mapreduce

▪ distributed arrays (“global arrays”): distributed, codistributed

▪ message passing: labSend, labReceive

tutorials
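Two of the patterns named on this slide, a parallel loop over independent iterations (as with parfor) and an asynchronous queue of function evaluations (as with parfeval), can be illustrated generically. The sketch below uses Python's standard `concurrent.futures` module; it is not MATLAB code and says nothing about how MATLAB implements these features. The function `simulate` and its inputs are hypothetical, and threads stand in for workers purely to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def simulate(n):
    """Hypothetical independent task: sum of squares of 0..n-1."""
    return sum(i * i for i in range(n))


inputs = [10, 100, 1000, 10000]

# Threads are used only as a stand-in for workers; CPU-bound Python code
# would typically use a process pool to get real parallel speed-up.
with ThreadPoolExecutor(max_workers=4) as pool:
    # Parallel-loop pattern: iterations are independent and the results
    # come back in the same order as the inputs.
    loop_results = list(pool.map(simulate, inputs))

    # Asynchronous-queue pattern: submit every evaluation up front, then
    # collect each result as soon as it finishes, in any order.
    futures = {pool.submit(simulate, n): n for n in inputs}
    async_results = {futures[f]: f.result() for f in as_completed(futures)}

print(loop_results)
print(async_results == dict(zip(inputs, loop_results)))
```

The loop form suits cases where all results are needed before continuing; the queue form suits cases where the client should stay responsive or act on early results while other evaluations are still running.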

Page 21:

Parallel-enabled Toolboxes (MATLAB® Product Family)

Enable parallel computing support by setting a flag or preference

▪ Optimization: Parallel estimation of gradients

▪ Statistics and Machine Learning: Resampling Methods, k-Means clustering, GPU-enabled functions

▪ Neural Networks: Deep Learning, Neural Network training and simulation

▪ Image Processing: Batch Image Processor, Block Processing, GPU-enabled functions

▪ Computer Vision: Parallel-enabled functions in bag-of-words workflow

▪ Signal Processing and Communications: GPU-enabled FFT filtering, cross correlation, BER simulations

▪ Other parallel-enabled Toolboxes

Page 22:

Speed-up MATLAB code with NVIDIA GPUs

➢ Ideal Problems

• Massively Parallel and/or Vectorized operations

• Computationally Intensive

➢ 300+ GPU-enabled MATLAB functions

• Enable existing MATLAB code to run on GPUs

• Support for sparse matrices on GPUs

➢ Additional GPU-enabled Toolboxes

• Neural Networks

• Image Processing

• Signal Processing

• …

Page 23:

Run Same Code on CPU and GPU: Solving 2D Wave Equation

[Figure: time (seconds, 0 to 80) versus grid size (0 to 2048) for CPU and GPU; annotations mark the GPU as 18x, 23x, and 20x faster]

GPU: NVIDIA Tesla K20c, 706 MHz, 2496 cores, memory bandwidth 208 GB/s

CPU: Intel(R) Xeon(R) W3550, 3.06 GHz, 4 cores, memory bandwidth 25.6 GB/s

Page 24:

Big Data capabilities in MATLAB

▪ Datastores

▪ Tall Arrays

▪ Distributed Arrays

▪ Apache Spark™ on Hadoop

[Figure: a numeric matrix split into blocks, illustrating a distributed array]

Page 25:

Big Data capabilities in MATLAB

ACCESS: Access more data and collections of files than fit in memory

▪ Large collections of data files

– datastore

– support for HDFS

PROCESS AND ANALYZE: Adapt traditional processing tools or learn new tools to work with Big Data

▪ Statistical analysis

– tall arrays

– distributed arrays and overloaded functions

▪ Signal processing

– distributed arrays and overloaded functions

▪ Deep Learning

– GPU

SCALE: Scale to compute clusters and Hadoop/Spark

▪ Analysis of large tabular data

– tall arrays

▪ Large simulations of environmental data

– distributed arrays and overloaded functions

▪ Advanced techniques for power users

– MATLAB API for Spark

– mapreduce

– labSend / labReceive
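The idea behind working with more data than fits in memory is to read a bounded chunk at a time and fold each chunk into small running summaries. The sketch below shows that pattern in Python; it is a generic illustration and not the MATLAB datastore, tall array, or mapreduce API. The CSV content, the column names `sensor` and `value`, and the helper `read_in_chunks` are all hypothetical, and an in-memory string stands in for a large file on disk.

```python
import csv
import io

# Hypothetical stand-in for a large file; a real one would be opened from disk.
raw = io.StringIO("sensor,value\na,1.0\nb,2.0\na,5.0\nb,4.0\na,9.0\n")


def read_in_chunks(handle, chunk_size):
    """Yield lists of at most chunk_size rows so memory use stays bounded."""
    reader = csv.DictReader(handle)
    chunk = []
    for row in reader:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # final partial chunk
        yield chunk


# Running (sum, count) per sensor; only these small totals are kept.
totals = {}
for chunk in read_in_chunks(raw, chunk_size=2):
    for row in chunk:
        running_sum, count = totals.get(row["sensor"], (0.0, 0))
        totals[row["sensor"]] = (running_sum + float(row["value"]), count + 1)

# The per-sensor mean is computed once every chunk has been processed.
means = {sensor: s / c for sensor, (s, c) in totals.items()}
print(means)
```

Because each chunk only updates the running totals, the same loop could in principle be split across workers and the partial totals merged afterwards, which is the kind of scaling the SCALE column refers to.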

Page 26:

MathWorks Services

▪ Consulting

– Integration

– Data analysis/visualization

– Unify workflows, models, data

▪ Training

– Classroom, online, on-site

– Data Processing, Visualization, Deployment, Parallel Computing

www.mathworks.com/services/consulting/

www.mathworks.com/services/training/