Wil van der Aalst, Professor of Information Systems, TU/e
A new profession is emerging, just like computer science in the early 1980s!
Industry and Society need Data Scientists!
EIT-Digital Data Science Major
• 5 universities involved
• 3 entry universities
• 5 exit universities (specializations)
5 specializations
Distributed Systems & Data Mining for Really Big Data at KTH
Multimedia & Web Science for Big Data at UNS
Design, Implementation, and Usage of Data Science Instruments at TUB
Process Mining in High Tech Systems, Healthcare, Visual Analytics, or Big Software at TUE
Internet of Things (IoT) at UPM
Statistics on the first intake
• Total number of students: 46
• EU students: 54%
• Age: 20-29
• Female students: 17%
Data Science Center Eindhoven
http://www.tue.nl/dsce/
DSC/e: Competences and Research Programs
28 groups and 420+ people involved
Context: Why are we using data science, does it have the intended effect, and will people accept it?
Analysis: How to turn data into real value (models, answers/decisions, and visualizations/insights)?
Enabling technologies: How to get the data and deal with computational/infrastructural challenges (big data and hard questions)?
Probability and Statistics
Stochastic Networks
Data Mining
Process Mining
Visualization
Large-Scale Distributed Systems
Data-Intensive Algorithms
Data-Driven Operations Management
Data-Driven Innovation and Business
Human and Social Analytics
Privacy, Security, Ethics, and Governance
Internet of Things
[RP1] Process Analytics: Improving Service While Cutting Costs
[RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior
[RP3] Smart Maintenance & Diagnostics: Safeguarding Availability
[RP4] Quantified Self: Improving Performance and Well-Being
[RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science
[RP6] Smart Cities: Ensuring Safety and Convenience for Citizens
[RP7] Smart Grids: Data Intensive Infrastructures
Data Science Flagship (Philips & DSC/e)
4 strategic topics
• Data Driven Value Propositions
• Healthcare Smart Maintenance
• Optimizing Healthcare Workflows
• Continuous Personal Health
4 TU/e departments
16 PhD students
30 Data science specialists
Many more organizations
• BrandLoyalty
• Vanderlande Industries
• ASML
• SynerScope
• Magnaview
• Fluxicon
• Adversitement
• Rabobank
• ING
• SAP
• IBM
• PwC
• AMC
• …
Process Mining
Example: Process Mining as the Bridge Between Data Science and Process Science
Process Mining: Spreadsheet for behavior
• Input: events (“things that have happened”)
• Mandatory per event:
• case identifier
• activity name
• timestamp/date
• Optional
• resource
• transaction type
• costs
• …
Columns: case identifier, activity name, timestamp, resource (row = event)
208 cases
5987 events
74 activities
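The "spreadsheet" view of an event log can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Event` class, the `summarize` helper, and the sample rows are hypothetical and do not come from the slides; only the column names (case identifier, activity name, timestamp, optional resource) do.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One row of the event log: something that has happened."""
    case_id: str                     # mandatory: case identifier
    activity: str                    # mandatory: activity name
    timestamp: datetime              # mandatory: timestamp/date
    resource: Optional[str] = None   # optional attribute


# Hypothetical sample rows, invented for illustration only.
event_log = [
    Event("case-1", "register", datetime(2015, 1, 5, 9, 0), "Ann"),
    Event("case-1", "check", datetime(2015, 1, 6, 14, 30), "Bob"),
    Event("case-1", "archive", datetime(2015, 1, 9, 11, 0)),
    Event("case-2", "register", datetime(2015, 1, 7, 10, 15), "Ann"),
    Event("case-2", "archive", datetime(2015, 1, 8, 16, 45)),
]


def summarize(log):
    """Count distinct cases, events, and distinct activities in a log."""
    return {
        "cases": len({e.case_id for e in log}),
        "events": len(log),
        "activities": len({e.activity for e in log}),
    }


print(summarize(event_log))
```

Applied to a real log, the same three counts would correspond to figures such as the number of cases, events, and activities reported on the slide.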
Batching for activities “opstellen eindnota” (drawing up the final statement) and “archiveren” (archiving)
Loesje van der Aalst
desire line
Process Discovery
NO modeling needed!
From event data to process model
Discovered models with 74, 11, and 3 activities
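One simple way to see how a model can be derived from event data without manual modeling is a directly-follows graph: count how often one activity is directly followed by another within the same case. The sketch below is an assumption-laden illustration, not the discovery technique used for the slides; the function names, the `min_count` filter, and the sample log are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def directly_follows(log):
    """Build a directly-follows graph from (case_id, activity, timestamp) rows.

    Returns a Counter mapping (activity_a, activity_b) to the number of times
    b directly follows a within a case.
    """
    traces = defaultdict(list)
    for case_id, activity, timestamp in log:
        traces[case_id].append((timestamp, activity))

    dfg = Counter()
    for events in traces.values():
        # Order the events of each case by timestamp to obtain its trace.
        ordered = [activity for _, activity in sorted(events, key=lambda e: e[0])]
        for a, b in zip(ordered, ordered[1:]):
            dfg[(a, b)] += 1
    return dfg


def filter_edges(dfg, min_count):
    """Keep only frequent edges: a crude way to obtain a simpler model."""
    return {edge: n for edge, n in dfg.items() if n >= min_count}


# Hypothetical sample log, invented for illustration only.
log = [
    ("c1", "register", datetime(2015, 1, 1)),
    ("c1", "repair", datetime(2015, 1, 2)),
    ("c1", "archive", datetime(2015, 1, 3)),
    ("c2", "register", datetime(2015, 1, 1)),
    ("c2", "archive", datetime(2015, 1, 4)),
    ("c3", "repair", datetime(2015, 1, 6)),    # rows need not be sorted
    ("c3", "register", datetime(2015, 1, 5)),
    ("c3", "archive", datetime(2015, 1, 7)),
]

dfg = directly_follows(log)
print(dfg)
print(filter_edges(dfg, min_count=2))
```

Raising `min_count` hides infrequent behavior, which is loosely analogous to moving from a detailed model to a coarser one.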
Conformance Checking
desire line
very safe system
• Does the event data conform to the model (discovered or hand-made)?
• Fitness of 93.5%
• Final inspection is skipped 40 times
• Move on model: something should have happened, but did not
• Move on log: something happened that should not have happened
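The two kinds of deviation can be illustrated by aligning an observed trace with the behavior a model allows. The sketch below makes strong simplifying assumptions: the model is a single sequential path (`MODEL`, hypothetical), the alignment comes from Python's `difflib.SequenceMatcher` rather than a dedicated alignment algorithm, and `fitness` is a simplified per-trace score that is not the measure behind the figure on the slide.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical reference model: one strictly sequential path of activities.
MODEL = ["register", "repair", "final inspection", "archive"]


def align(model, trace):
    """Return (moves_on_model, moves_on_log) for one trace.

    Move on model: the model expects an activity that the trace lacks.
    Move on log: the trace contains an activity the model does not allow here.
    """
    moves_on_model, moves_on_log = [], []
    matcher = SequenceMatcher(a=model, b=trace, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            moves_on_model.extend(model[i1:i2])
        if tag in ("insert", "replace"):
            moves_on_log.extend(trace[j1:j2])
    return moves_on_model, moves_on_log


def fitness(model, trace):
    """Simplified score: 1 minus deviations divided by the worst case."""
    on_model, on_log = align(model, trace)
    return 1 - (len(on_model) + len(on_log)) / (len(model) + len(trace))


# Hypothetical traces, invented for illustration only.
traces = [
    ["register", "repair", "final inspection", "archive"],   # fits the model
    ["register", "repair", "archive"],                        # skips inspection
    ["register", "call customer", "repair", "final inspection", "archive"],
]

skipped = Counter()
for trace in traces:
    on_model, on_log = align(MODEL, trace)
    skipped.update(on_model)
    print(on_model, on_log, round(fitness(MODEL, trace), 3))

print(skipped)  # how often each expected activity was skipped
```

Counting the moves on model per activity over a whole log is how a statement such as "an activity is skipped N times" could be obtained.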
Performance Analysis
• Average flow time is 1.92 months
• Bottleneck
• Waiting time of 15.74 days
• NO modeling needed!
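Flow times and bottlenecks follow directly from the timestamps in the event log. The sketch below is a minimal illustration with hypothetical function names and sample data. It assumes one timestamp per event, so the gap between two consecutive events mixes waiting time and service time; separating them would require start and complete timestamps.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def average(durations):
    """Mean of a non-empty collection of timedelta values."""
    durations = list(durations)
    return sum(durations, timedelta()) / len(durations)


def flow_times(log):
    """Time between the first and the last event of each case."""
    bounds = {}
    for case_id, _activity, ts in log:
        start, end = bounds.get(case_id, (ts, ts))
        bounds[case_id] = (min(start, ts), max(end, ts))
    return {case_id: end - start for case_id, (start, end) in bounds.items()}


def average_gaps(log):
    """Average time between each pair of consecutive activities."""
    traces = defaultdict(list)
    for case_id, activity, ts in log:
        traces[case_id].append((ts, activity))
    gaps = defaultdict(list)
    for events in traces.values():
        events.sort(key=lambda e: e[0])
        for (t1, a), (t2, b) in zip(events, events[1:]):
            gaps[(a, b)].append(t2 - t1)
    return {pair: average(ds) for pair, ds in gaps.items()}


# Hypothetical sample log, invented for illustration only.
log = [
    ("c1", "register", datetime(2015, 1, 1)),
    ("c1", "repair", datetime(2015, 1, 11)),
    ("c1", "archive", datetime(2015, 1, 13)),
    ("c2", "register", datetime(2015, 1, 2)),
    ("c2", "repair", datetime(2015, 1, 22)),
    ("c2", "archive", datetime(2015, 1, 24)),
]

gaps = average_gaps(log)
bottleneck = max(gaps, key=gaps.get)  # pair of activities with the longest average gap
print("average flow time:", average(flow_times(log).values()))
print("bottleneck:", bottleneck, gaps[bottleneck])
```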
Animating Reality
• Real cases
• 16 cases are queueing
• NO modeling needed!
Deviations: What? Where? Why?
Time, costs, …
Conclusion
• Need for Data Scientists!
• Wonderful Data Science Master Program with 3 entry points and 5 specializations
• Ask Farideh Heidari ([email protected]) for details!
• Zoomed in on the Data Science ecosystem in Eindhoven: Data Science Center Eindhoven (DSC/e)
• Zoomed in on a particular Data Science topic: Process Mining (linking processes and data)
masterschool.eitdigital.eu
More information?
http://www.masterschool.eitdigital.eu/programmes/dsc/
https://www.coursera.org/course/procmin/
http://www.processmining.org/
http://www.tue.nl/dsce/
http://vdaalst.com/