SE2016 BigData Denis Reznik "Data driven future"

Data-Driven Future: What to Learn and What to Expect? Denis Reznik, Data Architect at Intapp Kyiv, Microsoft Data Platform MVP

Upload: inhacking

Post on 11-Apr-2017

TRANSCRIPT

Page 1: SE2016 BigData Denis Reznik "Data driven future"

Data-Driven Future

What to Learn and What to Expect?

Denis Reznik

Data Architect at Intapp Kyiv

Microsoft Data Platform MVP

Page 2

About me

•Denis Reznik

•Kyiv, Ukraine

•Data Architect at Intapp, Inc.

•Microsoft Data Platform MVP

•Co-Founder of Ukrainian Data Community

Page 3

Agenda

•Data is a new Oil (c)

•Data and Science

•Data in Big Companies

•Data and Application Development

•Data-Driven Future

Page 4

Data is a New Oil

“Data is the new oil. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.”

(c) Clive Humby, UK Mathematician

Page 5

Data and Science

•Thousands of years: Empirical

•Few hundreds of years: Theoretical

•Last fifty years: Computational (“Query the world”)

•Last twenty years: eScience (Data Science) (“Download the world”)

Page 6

Machine Learning

•Supervised Learning: Classification, Regression

•Unsupervised Learning

Page 7

Linear Regression

Training Data -> Learning Algorithm -> h

Ocean Temperature -> h -> Whale Population

h - Hypothesis
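
The flow on this slide can be sketched in a few lines of Python: a learning algorithm takes training data and returns a hypothesis h, which then maps an ocean temperature to a predicted whale population. This is only an illustration. The data points are invented, the function names are hypothetical, and ordinary least squares is just one common choice of learning algorithm, not necessarily the one used in the talk's demo.

```python
# Hypothetical training data, invented for illustration only:
# ocean temperature (degrees C) and observed whale population.
temperatures = [10.0, 12.0, 14.0, 16.0, 18.0]
populations = [520.0, 480.0, 450.0, 400.0, 370.0]


def fit_line(xs, ys):
    """Learning algorithm: ordinary least squares for a single feature.

    Returns (slope, intercept) of the best-fit straight line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def make_hypothesis(slope, intercept):
    """Build the hypothesis h(x) = intercept + slope * x."""
    def h(x):
        return intercept + slope * x
    return h


slope, intercept = fit_line(temperatures, populations)
h = make_hypothesis(slope, intercept)

# Use the hypothesis to predict the population at an unseen temperature.
print(f"h(15.0) = {h(15.0):.1f}")
```

With real measurements the fit would rarely be this clean; the point is only the division of roles between training data, learning algorithm, and hypothesis.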

Page 8

DEMO

Linear Regression

Page 9

Data in Big Companies

Page 10

Parallel Processing

Temperature Sensor Datasets (n Items)

Q: How many times was the temperature above the norm during the last week?

A: 5

Time: 2 sec

Algorithmic Complexity: O(n)

Page 11

Parallel Processing

Temperature Sensor Datasets (k datasets, n/k items in each one)

Q: How many times was the temperature above the norm during the last week?

Partial answers (one per dataset): A: 1, A: 0, A: 3, A: 4

Time: 0.5 sec

Algorithmic Complexity: O(n/k)
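
A minimal sketch of the idea, assuming the readings fit in memory and that "the norm" is the threshold 40 used on the Map-Reduce slide. The readings and the function names are hypothetical. Threads are used here only to show the split into independent pieces; in standard CPython a CPU-bound count like this would need separate processes or machines to actually run faster.

```python
from concurrent.futures import ThreadPoolExecutor

THRESHOLD = 40  # "the norm"; value borrowed from the Map-Reduce slide


def count_above(readings, threshold=THRESHOLD):
    """Count readings above the threshold: one O(len(readings)) pass."""
    return sum(1 for value in readings if value > threshold)


def split(readings, parts):
    """Split the readings into at most `parts` chunks of similar size."""
    size = max(1, -(-len(readings) // parts))  # ceiling division, at least 1
    return [readings[i:i + size] for i in range(0, len(readings), size)]


def parallel_count(readings, parts=4):
    """Count each chunk independently, then add the partial counts."""
    chunks = split(readings, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partial_counts = list(pool.map(count_above, chunks))
    return partial_counts, sum(partial_counts)


# Hypothetical sensor readings, invented for illustration.
readings = [38, 41, 39, 36, 37, 35, 42, 44, 45, 39, 41, 43, 47, 36]

partials, total = parallel_count(readings)
print("partial counts:", partials)
print("total:", total)
```

Each worker only looks at its own chunk, so the work per worker shrinks as the number of chunks grows, while the final addition of partial counts stays cheap.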

Page 12

Map-Reduce

Map -> COUNT(*) WHERE Value > 40

A: 1 A: 0 A: 3 A: 4

Reduce -> COUNT(*)

A: 5
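
The two phases can be sketched with Python's built-in `map` and `functools.reduce`. This is a single-process illustration, not a distributed framework; the datasets and function names are hypothetical. Note one assumption: the reduce step here adds the partial counts together, since that is what yields the overall number of readings above 40.

```python
from functools import reduce

# Hypothetical partitions of sensor readings, invented for illustration.
datasets = [
    [38.5, 41.2, 42.7],
    [36.1, 37.4],
    [42.0, 44.3, 45.1, 39.9],
]


def map_count(dataset, threshold=40):
    """Map step: the equivalent of COUNT(*) WHERE Value > 40 on one dataset."""
    return sum(1 for value in dataset if value > threshold)


def reduce_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


# Map runs independently on every dataset and could run on separate machines.
partial_counts = list(map(map_count, datasets))

# Reduce folds the partial results into the final answer.
total = reduce(reduce_counts, partial_counts, 0)

print("partial counts:", partial_counts)
print("total:", total)
```

Because `reduce_counts` is associative, partial results could also be combined in several intermediate reduce stages before the final one.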

Page 13

DEMO

Map-Reduce

Page 14

Data and Application Development

source: https://www.youtube.com/watch?v=t6kM2EM6so4

Page 15

Index (B-Tree) - Seek

1 .. 1M

1 .. 2K | 2K+1 .. 4K | ... | 1M-2K .. 1M

1 .. 300 | 301 .. 800 | 801 .. 1.5K | 1.5K+1 .. 2K

SELECT * FROM Users WHERE Id = 523
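
A seek descends from the root to a single leaf, choosing one child per level. The sketch below models that descent on a tiny static tree whose leaf ranges follow the slide (1..300, 301..800, 801..1500, 1501..2000). It is a simplified illustration, not how any particular database engine stores its index; the `Node` class, the `user-<id>` rows, and the extra 2001..4000 leaf are assumptions.

```python
from bisect import bisect_left


class Node:
    """Index node. Internal nodes hold the upper bound of each child;
    leaf nodes hold sorted keys with their rows."""

    def __init__(self, keys, children=None, rows=None):
        self.keys = keys
        self.children = children  # None for a leaf
        self.rows = rows          # None for an internal node


def make_leaf(low, high):
    """Build a leaf holding ids low..high with placeholder rows."""
    keys = list(range(low, high + 1))
    return Node(keys, rows=[f"user-{k}" for k in keys])


def seek(node, key):
    """Return (row, levels_visited); row is None when the key is absent."""
    levels = 1
    while node.children is not None:
        idx = bisect_left(node.keys, key)  # first child whose bound >= key
        if idx == len(node.children):
            return None, levels            # key is beyond the largest bound
        node = node.children[idx]
        levels += 1
    pos = bisect_left(node.keys, key)
    if pos < len(node.keys) and node.keys[pos] == key:
        return node.rows[pos], levels
    return None, levels


branch_a = Node([300, 800, 1500, 2000], children=[
    make_leaf(1, 300), make_leaf(301, 800),
    make_leaf(801, 1500), make_leaf(1501, 2000),
])
branch_b = Node([4000], children=[make_leaf(2001, 4000)])
root = Node([2000, 4000], children=[branch_a, branch_b])

row, levels = seek(root, 523)
print(row, "found after visiting", levels, "levels")
```

The number of nodes visited equals the height of the tree, which is why a seek stays fast even when the table holds millions of rows.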

Page 16

Index (B-Tree) - Scan

1 .. 1M

1 .. 2K | 2K+1 .. 4K | ... | 1M-2K .. 1M

1 .. 300 | 301 .. 800 | 801 .. 1.5K | 1.5K+1 .. 2K

SELECT * FROM Users

Page 17

Index (B-Tree) - Range Scan

1 .. 1M

1 .. 2K | 2K+1 .. 4K | ... | 1M-2K .. 1M

1 .. 300 | 301 .. 800 | 801 .. 1.5K | 1.5K+1 .. 2K

SELECT * FROM Users WHERE Id BETWEEN 700 AND 1700
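
A range scan can be thought of as a seek to the first matching leaf followed by a walk along neighbouring leaves until the upper bound is passed. The sketch below models only the leaf level, using the four leaf ranges from the slide; the page layout and function name are assumptions made for illustration.

```python
from bisect import bisect_right

# Leaf pages with the id ranges shown on the slide.
leaf_pages = [
    list(range(1, 301)),
    list(range(301, 801)),
    list(range(801, 1501)),
    list(range(1501, 2001)),
]


def range_scan(pages, low, high):
    """Return (matching_ids, pages_read) for low <= id <= high."""
    first_keys = [page[0] for page in pages]
    # Seek: the last page whose first key is <= low holds the start of the range.
    page_idx = max(bisect_right(first_keys, low) - 1, 0)
    matches, pages_read = [], 0
    # Walk forward through neighbouring pages until one starts past `high`.
    while page_idx < len(pages) and pages[page_idx][0] <= high:
        pages_read += 1
        matches.extend(k for k in pages[page_idx] if low <= k <= high)
        page_idx += 1
    return matches, pages_read


ids, pages_read = range_scan(leaf_pages, 700, 1700)
print(len(ids), "ids from", pages_read, "of", len(leaf_pages), "leaf pages")
```

In this model the query for ids 700 to 1700 skips the first leaf page entirely, whereas a full scan (`range_scan(leaf_pages, 1, 2000)`) has to read all four.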

Page 18

Hashtable

Keys: John Dow, John Snow, Jack Snack

Buckets: 0, 1, 2, 3, 4

Hash Function:

John Dow -> 0

Jack Snack -> 2

John Snow -> 0
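
The slide's picture, with keys passed through a hash function into one of five buckets and two keys landing in the same bucket, can be sketched as a small hash table with chaining. The hash function below is a toy chosen for readability (an assumption, not the one from the talk), so its bucket numbers will not necessarily match the slide; the stored values are hypothetical as well.

```python
NUM_BUCKETS = 5  # the slide shows buckets 0..4


def toy_hash(key, num_buckets=NUM_BUCKETS):
    """Toy hash function: sum of character codes modulo the bucket count."""
    return sum(ord(ch) for ch in key) % num_buckets


class HashTable:
    """Hash table with separate chaining: each bucket is a list of pairs."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[toy_hash(key, len(self.buckets))]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # colliding keys share the bucket

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


table = HashTable()
for name in ["John Dow", "John Snow", "Jack Snack"]:
    table.put(name, {"name": name})

for name in ["John Dow", "John Snow", "Jack Snack"]:
    print(name, "-> bucket", toy_hash(name))
print(table.get("John Snow"))
```

A lookup hashes the key once and then searches only one bucket, which is why hash tables suit equality lookups but, unlike a B-tree, offer no help for range queries.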

Page 19

Data-Driven Future

• The amount of data is growing, and this is cool

• More and more decisions are based on data

• More and more applications are developed

• It is exciting to be a Software Engineer now!

Page 20

Thank you!

Denis Reznik

Twitter: @denisreznik

Email: [email protected]

Blog: http://reznik.uneta.com.ua

Facebook: https://www.facebook.com/denis.reznik.5

LinkedIn: http://ua.linkedin.com/pub/denis-reznik/3/502/234