11-data science and big data

Post on 02-Jun-2018

  • 8/10/2019 11-Data Science and Big Data

    1/38

    Wil van der Aalst & TU/e (use only with permission & acknowledgements)

    Process Mining: Data Science in Action

    Data Science and Big Data

    prof.dr.ir. Wil van der Aalst

    www.processmining.org

  • 2/38

    Data is the new oil!

    In the last 10 minutes we generated more data than from prehistoric times until 2003

  • 3/38

    We are all generating event data!

    taking the train

    refueling your car

    buying a coffee

    adjusting the temperature

    getting a speeding ticket

    sending

    making an a

    making a phone call

    this

  • 4/38

    GPS

    Proximity sensor

    Ambient light sensor

    Accelerometer

    Magnetometer

    Gyroscopic sensor

    Touchscreen

    Camera (front)

    Camera (back)

    Bluetooth

    Finger-print sensor

    M

    GSM/HSDPA/LTE

    14+ sensors

  • 5/38

    Internet of Events

  • 6/38

    Internet of Events: 4 sources of event data

  • 7/38

    Internet of Events: 4 sources of event data

  • 8/38

    Internet of Events: 4 sources of event data

  • 9/38

    Internet of Events: 4 sources of event data

  • 10/38

  • 11/38

  • 12/38

    Moore's law

    Gordon E. Moore. Cramming More Components onto Integrated Circuits. Electronics, 1965.

    Diagram by Wgsimon/CC BY 3.0

    Other e

    Compu

    Capaci

    Bytes p

    2^20 = 1,048,576x in 40 years

  • 13/38

    Question

    40 years ago it took approximately 1.5 hours to go from Eindhoven to Amsterdam.

    How long would it take today if transportation technology had followed Moore's law?

  • 14/38

    Answer

    1.5 x 60 x 60 / 2^20 = 0.00515 seconds

  • 15/38

    Question

    40 years ago it took approximately 7 hours to go from A… to New York.

    How long would it take today if transportation technology had followed Moore's law?

  • 16/38

    Answer

    7 x 60 x 60 / 2^20 = 0.0240 seconds

  • 17/38

    Question

    40 years ago it took approximately 4000 liters of petrol to go around the world.

    How much petrol would it take today if transportation technology had followed Moore's law?

  • 18/38

    Answer

    4000 / 2^20 = 0.0038 liters
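
A minimal Python sketch of the arithmetic behind the three answers above. It assumes Moore's law is read as one doubling every two years, so that 40 years gives 20 doublings and a factor of 2^20. The function and variable names are illustrative and do not come from the slides.

```python
# Sketch of the slides' thought experiment: divide a 40-year-old quantity
# by the Moore's-law improvement factor.
# Assumption: one doubling every 2 years, i.e. 20 doublings in 40 years.

def moore_factor(years: float = 40, doubling_period_years: float = 2) -> float:
    """Improvement factor after `years` with one doubling per period."""
    return 2 ** (years / doubling_period_years)

FACTOR = moore_factor()  # 2**20

train_seconds = 1.5 * 60 * 60 / FACTOR  # Eindhoven to Amsterdam, 1.5 hours
trip_seconds = 7 * 60 * 60 / FACTOR     # trip to New York, 7 hours
petrol_liters = 4000 / FACTOR           # around the world, 4000 liters

print(f"factor: {FACTOR:,.0f}x")
print(f"Eindhoven to Amsterdam: {train_seconds:.5f} s")
print(f"trip to New York: {trip_seconds:.4f} s")
print(f"around the world: {petrol_liters:.4f} liters")
```

Changing `doubling_period_years` to 1.5 shows how sensitive the result is to the assumed doubling period.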

  • 19/38

    Drowning in data

    How to extract value from event data?

  • 20/38

    4 V's of Big Data

    VERACITY, VELOCITY, VARIETY, VOLUME

  • 21/38

    Data does not have to be "Big" to be challenging

    Need for data scien

    Data anaquestion

    everywhe

  • 22/38

    A data scientist is able to … analyze, and interpret data from a variety of sources (social interaction, business processes, cyber-physical systems).

    Turning data into

  • 23/38

    Four generic data science questions

  • 24/38

    #1: What happened?

  • 25/38

    #2: Why did it happen?

  • 26/38

    #3: What will happen?

  • 27/38

    #4: What is the best that can happen?

  • 28/38

    Can we predict waiting times?

    Why do patients have to wait so long?

    How can we reduce costs?

    How much staff is needed tomorrow?

    Do doctors follow the guidelines?

  • 29/38

    Why and when do X-ray machines malfunction?

    Which components should be replaced?

    How are X-ray machines really used?

    Can w

    that the

    will bre

    next

    Wh

    ne

    from the organizational level to the hardware/software level

  • 30/38

    Data science skills needed to answer such questions

  • 31/38

    DSC/e

    http://www.tue.nl/

  • 32/38

    It is the process s

    In the end it is th

    that matters (adata or the

    Not jus

    and

    but e

    p

  • 33/38

    Process-centric view on data science

    Understand and improve "care flows" in a hospital by intelligently using event data scattered over hundreds of database tables with patient data.

    Understand and improve … and performance of X-ray … in the field using terabytes … level event data collected … remote services ne…

  • 34/38

    Focus of this course

  • 35/38

    Process mining use cases

    What is the process that people really follow?

    Where are the bottlenecks in my process?

    Where do people (or machines) deviate from the expected or idealized process?

  • 36/38

    Many more use cases for process mining

    What are the "highways" in my process?

    What factors are influencing a bottleneck?

    Can we predict problems (delay, deviation, etc.) for running cases?

    Can we recommend countermeasures?

    How to redesign the process / organization / machine?
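
The bottleneck question from the use cases above can be made concrete with a small sketch. The Python below is one simple way to look for a bottleneck, not the method taught in the course: it averages the waiting time between directly following activities and reports the slowest hand-over. The event log, activity names, and the function `mean_waiting_minutes` are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical event log: (case id, activity, timestamp). All values are made up.
EVENT_LOG = [
    ("case1", "register", "2019-08-10 09:00"),
    ("case1", "examine", "2019-08-10 09:30"),
    ("case1", "x-ray", "2019-08-10 12:30"),
    ("case1", "discharge", "2019-08-10 13:00"),
    ("case2", "register", "2019-08-10 10:00"),
    ("case2", "examine", "2019-08-10 10:20"),
    ("case2", "x-ray", "2019-08-10 14:20"),
    ("case2", "discharge", "2019-08-10 14:40"),
]


def mean_waiting_minutes(log):
    """Mean minutes between each pair of directly following activities."""
    # Group events per case so each case forms one trace.
    traces = defaultdict(list)
    for case_id, activity, ts in log:
        traces[case_id].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), activity))

    # Collect the time gap for every consecutive pair within a trace.
    gaps = defaultdict(list)
    for events in traces.values():
        events.sort()  # order each case by timestamp
        for (t1, a1), (t2, a2) in zip(events, events[1:]):
            gaps[(a1, a2)].append((t2 - t1).total_seconds() / 60)

    return {pair: mean(values) for pair, values in gaps.items()}


waits = mean_waiting_minutes(EVENT_LOG)
bottleneck = max(waits, key=waits.get)  # hand-over with the longest mean wait
print(bottleneck, waits[bottleneck])
```

In this toy log the hand-over from "examine" to "x-ray" has the longest mean wait, so it is the candidate bottleneck. Counting how often each pair occurs, instead of averaging the gaps, would give a first view of the "highways" in the process.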

  • 37/38

    Process Mining: Data Science in Action

    Part I: Preliminaries

    Part III: Beyond Process Discovery

  • 38/38

    Chapter 2: Process Modeling and Analysis

    Chapter 3: Data Mining

    Part II: From Event Logs to Process Models