scientists whoe stare at data - mathworks · scientists who stare at data | dipl.-ing. jan grüner...

19
Scientists who stare at data Dipl.-Ing. Jan Grüner | MathWorks Automotive Conference 2019

Upload: others

Post on 25-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Scientists who stare at data

Dipl.-Ing. Jan Grüner | MathWorks Automotive Conference 2019

Page 2: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

About

TU Berlin // FVB

• Chair of Naturalistic Driving Oberservation for Energetic

Optimisation and Accident Avoidance

• Focus on Electromobility

www.fvb.tu-berlin.de

Scientists who stare at data | Dipl.-Ing. Jan Grüner

2

Myself

• Research Assistant

• Matlab since 2009

• Incl. several years of teaching Matlab

• PHD Thesis: “Usage behavior of hybrid vehicles”

[email protected]

Matlab Projects

• IDCB (Create individual driving cycles from user data)

• Database Toolbox (wrapper functions for MySQL Database

Connector)

• AMPERE (Usage behavior of hybrid vehicles)

• WeatherDB (Weather DB / API to make use of data provided

by the DWD)

Page 3: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

We are living in the information processing age

We record everything & everywhere

• Cameras

• Cars

• Smart-X (Phones, Homes)

• IoT

Scientists who stare at data | Dipl.-Ing. Jan Grüner

3

We can store huge amounts of data

• Fast storage

• Large storage

We can transmit data from A to B

• Worldwide

• Instantly / Fast

• Everything is connected

We have the computational power to analyze it

• Server

• Cluster

• Databases

• Software

Page 4: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

The challenge is no longer getting data, it‘s processing it

Devices, Services,

DataProcessing Result

(and making sense of)

• With endless possibilities

comes complexity

• Complexity creates chaos

• Data formats

• Competing standards

• Methods

• A huge amount of

information

Scientists who stare at data | Dipl.-Ing. Jan Grüner

4

Page 5: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

In tech / data we (blindly) trust

Scientists who stare at data | Dipl.-Ing. Jan Grüner

5

The pessimistic view on hardware /

services

• Sensors are cheap, the device isn‘t

• Updates?

• Security?

• Testing?

• Documentation?

• Support?

• May change at any timeConsecutive datastrem

Wrong unit

Page 6: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Question the data and your work (and everyone else)

The situation

• Assumptions

• Documentation

• Human error

Scientists who stare at data | Dipl.-Ing. Jan Grüner

6

Rules of data processing

• Assume the data is broken

• Mistakes will happen

• Re-check your data / existing processes regularly

"

• The RMS value is inserted into a circular list of 401 entries

• No event is fired until this list is initially filled

• TBD should this list be persisted, or rebuilt after each

reboot?

"

• (Variable-)Names will not change over time

• The documentation is complete

Updates everywhere

Bugs everywhere

➔ Can become false over time

➔ Outdated (best case)

➔ nobody is prefect

Page 7: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Workflow (an example)

Why Matlab

• University (Everything is DIY)

• You want it, you build it

• You built it, you run / manage it

• Maintenance / teaching experience

• Easy to learn / to understand (for students)

• Avoid multiple languages

• Complexity

• Knowledge required

• Development

• Easy to visualize data along the way

• Ready to use toolboxes (or create your own)

Scientists who stare at data | Dipl.-Ing. Jan Grüner

7

Page 8: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Import (a simple) (CSV)

• Column names, units, limits

• Create overallmappingarchitecture

Pre-Analysis

• Remove duplicates

• Data format

Challenges• Read header

• Map data

• Process

Import

• Set datatypes

• Store in database

Storage

Scientists who stare at data | Dipl.-Ing. Jan Grüner

8

GetMD5()

MATLAB to MySQL

➔ MySQL Database Connector

Order, names, units (SI)

• Missing columns

• Changing column names

• No fixed column order

• Changing data units

Page 9: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Validation & Fixes

ValidationValidation

LiteratureLiterature

ValuesValues

MapsMaps

DataData

FormulaFormula

Integration / DifferentiationIntegration /

Differentiation

Literature

• Not always correct either

• Contradicting sources

Data

• Requires Formulas

• Documentation?

• Generally known?

• Signal can be estimated with “basic” math

Scientists who stare at data | Dipl.-Ing. Jan Grüner

9

Page 10: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Literature

Scientists who stare at data | Dipl.-Ing. Jan Grüner

10

Data Literature A

Literature B Estimation

And now?

A: M

AT

TH

É,

R., 2

01

2c. O

pe

l A

mp

era

-E

lectr

ic P

rop

uls

ion

an

d E

ne

rgy S

tora

ge

Syste

m,

Ele

ktr

ik / E

lektr

on

ik in

Hybri

d-

un

d E

lektr

ofa

hrz

eu

ge

n u

nd

ele

ktr

isch

es E

ne

rgie

ma

na

ge

me

nt,

Deu

tsch

lan

d

B: G

RE

BE

, U

.D. u

nd

L.T

. N

ITZ

, 2

01

1.

Vo

lte

c–

Das A

ntr

ieb

ssyste

m f

ür

Che

vro

let V

olt u

nd

Op

el A

mp

era

, M

TZ

-M

oto

rte

ch

nis

ch

e Z

eitsch

rift

, 7

2(5

), 3

42

-35

1. IS

SN

00

24

-85

25

Page 11: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Literature

#Formula Literature

(1)𝑅𝑃𝐴 =

1

𝑥𝑎𝑖

+ ∗ 𝑣𝑖(Bratt und Ericsson, 2000, S. 5, The

European Comission, 2016, S. 12)

(2)𝑅𝑃𝐴 =

1

𝑥න𝑣 ∗ 𝑎+ (Ericsson, 2000, S. 11)

(3)

𝑅𝑃𝐴 = ෑ

𝑘 = 1

𝑒𝑛𝑑

𝑣 𝑘 ∗ 𝑎 𝑘 /𝑚𝑒𝑎𝑛 𝑣(Blanco-Rodriguez, Vagnoni und

Holderbaum, 2016, S. 653)

Scientists who stare at data | Dipl.-Ing. Jan Grüner

11

a : accelerationv : velocityx : distance

Relative Positive Acceleration (RPA)

• Descriptor for the dynamics of a

(driving) cycle

• Assumptions make the difference

• Documentation

(1)

BR

AT

T, H

. u

nd

E. E

RIC

SS

ON

, H

g.,

20

00

. M

ea

su

rin

gve

hic

led

rivin

gp

att

ern

s–

estim

atin

gth

ein

flu

en

ce

ofd

iffe

ren

t m

ea

su

rin

gin

terv

als

. L

un

d, S

ch

we

de

n.

(1)

Co

mis

sio

nR

eg

ula

tio

n (

EU

) 2

01

6/6

46

[o

nlin

e].

In

: O

ffic

ial J

ou

rna

l o

fth

eE

uro

pe

an

Un

ion

[Z

ug

riff

am

: 2

3.

Fe

bru

ar

20

19

]. h

ttp

s:/

/eu

r-le

x.e

uro

pa

.eu

/eli/r

eg

/20

16

/64

6/o

j

(2)

ER

ICS

SO

N,

E.,

Hg

., 2

00

0. D

rivin

gp

att

ern

in u

rba

n a

rea

s-

de

scri

ptive

an

aly

sis

an

d in

itia

l p

red

ictio

nm

od

el. L

un

d, S

ch

we

de

n:

Lu

nd

In

stitu

te o

fT

ech

no

log

y. B

ulle

tin

. 1

85

.

(3)

BL

AN

CO

-RO

DR

IGU

EZ

, D

., G

. V

AG

NO

NI u

nd

B.

HO

LD

ER

BA

UM

, 2

01

6.

EU

6 C

-Se

gm

en

t D

iese

l ve

hic

les, a

ch

alle

ng

ing

se

gm

en

tto

me

et

RD

E a

nd

WL

TP

re

qu

ire

me

nts

. IF

AC

-Pa

pe

rsO

nL

ine,

49

(11

), 6

49

-65

6. IS

SN

24

05

89

63

.

Page 12: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Data

Velocity

• Formula

• 𝑣𝑒𝑙𝑜𝑐𝑖𝑡𝑦𝐸𝑆𝑇 =2∗π∗𝑟𝑤ℎ𝑒𝑒𝑙∗𝑠𝑝𝑒𝑒𝑑𝐸𝑀𝐵

𝑟𝑎𝑡𝑖𝑜𝐴∗𝑟𝑎𝑡𝑖𝑜𝐵

• (expected) Value range

• 0 : 170 km/h

• Conditions

• Validation only possible in a specific electric mode

Scientists who stare at data | Dipl.-Ing. Jan Grüner

12

Page 13: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Data

Acceleration

• Formula

• 𝑎𝑐𝑐𝑒𝑙𝑒𝑟𝑎𝑡𝑖𝑜𝑛𝐸𝑆𝑇 =𝜕𝑣𝑒𝑙𝑜𝑐𝑖𝑡𝑦

𝜕𝑡𝑖𝑚𝑒

• (expected) Value range

• -10 m/s² : +10 m/s²

Scientists who stare at data | Dipl.-Ing. Jan Grüner

13

Page 14: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Data

Scientists who stare at data | Dipl.-Ing. Jan Grüner

14

Page 15: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

(fixed) Data

Steps (so far)

• Look at all data

• RMSE values for identification

• Original vs. Estimate histograms

• Literature (if available)

• Look at individual data

• Plot individual trips

• Correct / Fix

• Your code

• Assumptions

Scientists who stare at data | Dipl.-Ing. Jan Grüner

15

Handling outliers

Page 16: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

(monitor) Data

Scientists who stare at data | Dipl.-Ing. Jan Grüner

16

Bugs

• Sensor stopped working correctly for unknown reasons

• Bug reported in January (still not fixed)

Steps

• ?

Page 17: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Lessons learned

know

validatefix

work

• There will be new issues

• Re-check your (raw-)data & assumptions

• Make sure you are not the problem

• Test your code

• Review your code

• Make proper bug reports

• Provide examples (figures if possible!)

• Don’t expect the bugs to get fixed

• Documentation saves time

• Use it whenever possible

• Read it properly

• Create your own (and keep it up to date)

Scientists who stare at data | Dipl.-Ing. Jan Grüner

17

Page 18: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

Lessons learned

• Nobody is perfect, neither is software

• Nothing is ever really finished

• Expect things to change / go wrong

• Look at your data and understand what you are looking at

• Know what your data is supposed to look like

• Never stop learning

• Decent hardware is a big help

Unexpected MATLAB lessons (over the years)

• read(datastore) doesn‘t always read the whole file

• webread() may not always return a variable (or produce an

error)

• use wget() instead of mget()

• Different behaviour of the same function in different releases

• MATLAB and Linux work best from a terminal in software

mode (best guess: buggy gfx driver)

• Never stop upgrading to the next release

Scientists who stare at data | Dipl.-Ing. Jan Grüner

18

Page 19: Scientists whoe stare at data - MathWorks · Scientists who stare at data | Dipl.-Ing. Jan Grüner 3 We can store huge amounts of data • Fast storage • Large storage We can transmit

End

Scientists who stare at data | Dipl.-Ing. Jan Grüner

19