big data analytics - cleveland state...

9
Big Data Analytics Sunnie Chung Electrical Engineering and Computer Science

Upload: others

Post on 20-May-2020


Page 1: Big Data Analytics - Cleveland State Universitycis.csuohio.edu/~sschung/CIS660/BigDataAnalytics... · Big Data Analytics Research Group Math, Statistics and Databases Big Data Specific

Big Data Analytics

Sunnie Chung
Electrical Engineering and Computer Science


Big Data
How Much Data? In Petabytes!

• Google processes 40 PB a day (2016)
• eBay has 11 PB of user data + 50 TB/day (2015)
• Facebook has 36 PB of user data + 80-90 TB/day (2013)
• CERN’s LHC: 15 PB a year (~2015)
• LSST: 6-10 PB a year (~2015)

How many female WWF fans under the age of 30 visited the Toyota community over the last 4 days and saw a Class A ad?

How are these people similar to those that visited Nissan?

Unstructured Text Stream in PB a day

What Does Your Big Data Stream Look Like?


1. Data Cleaning/Extraction/Transformation

2. Data Staging/Processing

3. Data Mining Strategies: Data Modeling/Validation

4. Data Visualization

Massively Parallel Processing Systems
• Hadoop Based Multi Node Cluster: NoSQL Stack
• Cloud Based Hadoop Cluster (20 – 2000 Nodes)
Software: Automatic Parallel Execution in MapReduce
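The MapReduce execution model named above can be illustrated with a small single-process sketch. This is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are hypothetical, chosen only to show the map, shuffle, and reduce steps that a real cluster would run in parallel across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two tiny "documents".
counts = word_count(["big data analytics", "big data stream"])
```

On a cluster, each call to `map_phase` and `reduce_phase` would be an independent task, which is what allows the framework to parallelize the job automatically.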

Analytic Parallel Data Warehouse Systems

Information Retrieval

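\[
\cos(\vec{q},\vec{d})
= \frac{\vec{q}\cdot\vec{d}}{|\vec{q}|\,|\vec{d}|}
= \frac{\vec{q}}{|\vec{q}|}\cdot\frac{\vec{d}}{|\vec{d}|}
= \frac{\sum_{i=1}^{|V|} q_i d_i}{\sqrt{\sum_{i=1}^{|V|} q_i^2}\,\sqrt{\sum_{i=1}^{|V|} d_i^2}}
\]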
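Cosine similarity between a query vector q and a document vector d, the measure this slide uses for information retrieval, can be sketched in a few lines. The function name `cosine_similarity`, the zero-vector convention (returning 0.0), and the sample vectors are assumptions made for illustration, not part of the original slides.

```python
import math


def cosine_similarity(q, d):
    """Return the cosine of the angle between two equal-length term vectors."""
    if len(q) != len(d):
        raise ValueError("q and d must have the same number of terms")
    dot = sum(qi * di for qi, di in zip(q, d))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    norm_d = math.sqrt(sum(di * di for di in d))
    if norm_q == 0 or norm_d == 0:
        # Assumed convention: an all-zero vector is similar to nothing.
        return 0.0
    return dot / (norm_q * norm_d)


# Hypothetical term-weight vectors over a three-word vocabulary.
query = [1, 1, 0]
doc = [2, 2, 0]
score = cosine_similarity(query, doc)
```

Because the measure normalizes by vector length, a document that repeats the query's terms in the same proportions scores the same as a shorter one.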

Machine Learning: Neural Network, SVM, Classification

Database Research Based Methods: Multi Level Association Rule Mining

Statistics Based Methods: Clustering
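Association rule mining, listed above among the database research based methods, rests on two standard measures: support and confidence. The sketch below computes both for a single level of items. Multi-level mining applies the same measures at different levels of a concept hierarchy, which is not shown here. The function names and the sample baskets are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # The antecedent never occurs, so the rule has no evidence.
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical market baskets.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread"},
    {"milk"},
]
rule_support = support(baskets, {"milk", "bread"})
rule_confidence = confidence(baskets, {"milk"}, {"bread"})
```

A rule such as milk → bread is usually kept only if both values exceed thresholds chosen by the analyst.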


[Chart: axis from 0 to 8000; region labels: Pacific, Paris, London, Eastern…, Amsterdam, Athens, Central…, Jakarta, Greenland, Bangkok, Brasilia, Hawaii, Atlantic…, Arizona, Ljubljana, Beijing, Belgrade, New Delhi, Berlin]

Topics Most Talked About on Nov 22, 2015

Regions Most Tweeted on Nov 22, 2015

Data Extraction/Transformation

What Your Tweet Data Looks Like on Nov 22, 2015


Top Job Titles Recently Listed

Locations of Jobs Listed 1 Day Ago

Profile Headlines with Highest Connections


Tweets Data Stream on Nov 5, 2016

Tweets Topics on Nov 5, 2016

Leads to the Company Stock Fall

Unusual Negative Tweets on the Company

Unusual Cluster on the Company Name
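One simple way to flag an unusual burst of negative tweets about a company is a z-score test on daily counts. The slides do not say which detection method was used, so this is only an illustrative stand-in. The function name `unusual_days`, the threshold of 2.0, and the daily counts are all hypothetical.

```python
from statistics import mean, stdev


def unusual_days(counts, threshold=2.0):
    """Return indexes of days whose count is far above the mean.

    A day is flagged when its z-score (distance from the mean in
    sample standard deviations) exceeds the assumed threshold.
    """
    mu = mean(counts)
    sigma = stdev(counts)
    if sigma == 0:
        return []  # Every day is identical, so nothing stands out.
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > threshold]


# Hypothetical counts of negative tweets mentioning one company per day.
negative_tweets_per_day = [12, 15, 11, 14, 13, 12, 90, 14]
flagged = unusual_days(negative_tweets_per_day)
```

A flagged day would then be inspected alongside other signals, such as the company's stock price on the following days.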


Tweets Data Stream on Nov 13, 2016

Tweets Per Topic on Nov 13, 2016


Database Security on Cloud

Encrypting Database on Cloud for Retrieving the Sensitive Data Without Decrypting

Achieving Cyber Security with Big Data Analytics

Credit Card Fraud Detection

Intrusion Detection in Systems with Sensitive Data

Machine Fault Detection


Annual Big Data Workshop at CSU

Big Data Analytics Curriculum at EECS

Big Data Analytics Research Group

Math, Statistics and Databases
Big Data Specific Processing Techniques
Cloud Computing Massively Parallel Big Data Processing Systems
Data Source Modeling
Data Mining Strategies

Data Driven solutions

President’s Advisory Committee for Center of Excellence
Data Analytics
Cyber Security
Cloud Computing