Data Analysis in Industrial Applications: From Predictive Maintenance to Image Understanding 2016 Taipei Tech Workshop, Technikum Wien, 22.11.2016 DI Matthias Wastian (dwh GmbH), DI Dominik Brunmeir (dwh OG) [email protected], [email protected]

Upload: vienna-data-science-group

Post on 22-Jan-2018



Presentation Outline

• Who We Are

• Some Definitions

– Machine Learning

– Data Mining

– Deep Learning

• Natural Language Processing

– Telecommunication Patent Classification

– Speech Analysis of Austrian Politicians

• Predictive Maintenance

– Server Outage Prediction

• Image Understanding

– Object Detection Using HOG Features

– Deep Inspection

– Automatic Optical Inspection of Humidity Sensors

dwh GmbH

• Founded 2004, GmbH since 2010

• 16 employees

• 17 master theses

• 6 finished dissertations

• 6 current dissertations

• >90 publications

• Bosses:

– Niki Popper

– Michael Landsiedl

Definitions


Machine Learning

• is a field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959).

• The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.

• A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E (Tom Mitchell, 1997).

Data Mining

• is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner (David Hand, 2001).

Definitions

Deep Learning

• is learning using one of a set of algorithms that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations.

• One of the promises of deep learning is replacing handcrafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.

Telecommunication Patent Classification

Example EP1696821B1

• Method and device for automatically detecting mating of animals

• Abstract: The inventive device (110, 210, 310, 510) for automatically detecting the mating of animals is wearable by an animal (100) and comprises means (105, 505) for fixing to an animal, means (140) for detecting an attempt of mating a female animal (120) by said animal, means (145, 180, 345, 580) for identifying an electronic label which is introduced in the body of said female animal and actuated by said detection means and/or by the female animal identification means by processing the image of at least one part of the female animal triggered by said detection means. In the preferred embodiment, means for identifying said other animal comprises means for communicating with the electronic label (130) carried by a female animal conspecific with the animal triggered by said detection means. In one of the embodiments, communications means is embodied in such a way that it reads the electronic label identifier of each female animal which said animal attempts to mate and storing means (160) memorises each displayed identifier. In the other embodiment, communication means is provided with a device for storing representative information on the attempted mating in the random access memory of the electronic label carried by the conspecific female animal.

Telecommunication Patent Classification

• Several thousand classified patents were used to derive a classification model for millions of 3GPP patents from Korea, Japan, China, the US and Europe.

• The data used included the publication number, the abstract, the abstract of the DWPI and the claims of the patent.

Telecommunication Patent Classification

• Language detection

• Bag of words models

• Tf-idf weighting

• Different classification models

– SVM

– Maximum entropy classifier

• Fuzzy key word comparison
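The slides do not say how the fuzzy key word comparison was implemented. As a minimal sketch, assuming a character-level similarity from Python's standard `difflib` module and a hypothetical cut-off `threshold`, a tolerant keyword match could look like this:

```python
from difflib import SequenceMatcher


def fuzzy_keyword_hits(text, keywords, threshold=0.85):
    """Return the keywords that approximately occur among the words of `text`.

    `threshold` is a hypothetical similarity cut-off in [0, 1].
    """
    tokens = [t.strip(".,;:()").lower() for t in text.split()]
    hits = []
    for keyword in keywords:
        kw = keyword.lower()
        # Best character-level similarity between the keyword and any token.
        best = max(
            (SequenceMatcher(None, kw, tok).ratio() for tok in tokens),
            default=0.0,
        )
        if best >= threshold:
            hits.append(keyword)
    return hits


# Illustrative sentence, not taken from a real patent.
claim = "A method for handover signalling between base stations"
hits = fuzzy_keyword_hits(claim, ["signaling", "handover", "antenna"])
print(hits)
```

The spelling variant "signalling" still matches the key word "signaling", which is the kind of tolerance an exact bag-of-words lookup would miss.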

Natural Language Processing

Word Counts

• Measuring similarity: scalar product

• Problem: document length, solution: normalize

Tf-idf

• Common words (stop words: a, the, in, ...) vs rare words (names, technical terms, ...)

• Important words: common locally, rare globally

• Term frequency times $\log \frac{\#\text{docs}}{1 + \#\text{docs using word}}$
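The weighting and the normalized scalar product can be sketched in a few lines of plain Python. The whitespace tokenizer, the `STOP_WORDS` set and the example documents are assumptions made for illustration; only the idf formula is taken from the slide.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "the", "in", "of"}  # illustrative stop-word list


def tokenize(doc):
    return [w for w in doc.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Weight each term by tf * log(#docs / (1 + #docs using the term))."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * math.log(n_docs / (1 + doc_freq[term]))
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Scalar product of two sparse vectors, normalized by their lengths."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = [
    "the antenna transmits the signal",
    "the receiver decodes the signal",
    "the animal wears the device",
    "the antenna of the device",
]
vectors = tfidf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

With this idf variant, a term used in all but one document gets weight zero and a term used in every document gets a slightly negative weight, which is one reason to remove stop words beforehand.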

Speech Analysis of Austrian Politicians

• How rude are Austrian politicians?

• Have they become ruder over time?

Data acquisition via web scraping

Human labelling of selected sentences

Word2vec or similar models

Predictive Maintenance

Server Outage Prediction

• NOBODY likes server outages.

• Is there an exact definition of the term outage?

• Is it measurable?

• Downtime minutes per user

Server Outage Prediction

• Definition (Event): An event shall be defined as an occurrence happening at a determinable time and place with a certain duration. It may be a part of a chain of occurrences as an effect of a preceding occurrence and as the cause of a succeeding occurrence. It is possible that more than one event occurs at the same time and/or place.

• Definition (Abnormal Event): An abnormal event shall be defined as an outlier in a chain of events, an event that deviates so much from the other events as to arouse suspicion that it was caused by something that does not follow the usual behavior of the considered system and that it could change the entire system behavior.

Server Outage Prediction

Server

Server Monitoring

Feature Selection

Prediction

Anomaly Detection

Abnormal Event Basic Model

Assumption: The predictions are accurate if the server status is OK.
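Under this assumption, a large prediction error is itself a sign of an abnormal event. The following is a minimal sketch of that idea, assuming a tolerance band of `k` standard deviations estimated from a period in which the server was known to be OK. The function names, `k` and the numbers are illustrative and not taken from the project.

```python
from statistics import mean, stdev


def fit_error_band(observed_ok, predicted_ok, k=3.0):
    """Estimate a tolerance band from prediction errors of a healthy period."""
    errors = [o - p for o, p in zip(observed_ok, predicted_ok)]
    return mean(errors), k * stdev(errors)


def abnormal_steps(observed, predicted, centre, half_width):
    """Return the time steps whose prediction error leaves the band."""
    return [
        t for t, (o, p) in enumerate(zip(observed, predicted))
        if abs((o - p) - centre) > half_width
    ]


# Healthy reference period (made-up RAM utilisation values in percent).
centre, half_width = fit_error_band(
    observed_ok=[50, 52, 51, 49, 50, 51],
    predicted_ok=[51, 51, 50, 50, 50, 50],
)
# New observations: the third value deviates strongly from its prediction.
flagged = abnormal_steps([50, 51, 70, 52], [50, 50, 51, 51], centre, half_width)
print(flagged)
```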

Server Outage Prediction

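• ANN (artificial neural network)

• SARIMA (seasonal ARIMA)

• OC-SVM (one-class support vector machine)

• ABOD (angle-based outlier detection)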

Server Outage Prediction

Data:

• up to 1439 features per server, sampling rate 1-15 min

• historic data sets, IBM Lotus Domino Server.Load

Preprocessing:

• Reduction of data using expert knowledge

• Differencing cumulative features

• Checking for wrong or missing data

• Normalizing the data (min-max mapping)
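Three of these preprocessing steps are simple enough to sketch directly. The helper names and the sample counter values below are hypothetical; the slides only name the steps.

```python
def find_missing(values):
    """Indices of samples that are None or NaN."""
    return [i for i, v in enumerate(values) if v is None or v != v]


def difference(cumulative):
    """Turn a running total into per-interval increments."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def min_max_normalize(values):
    """Map values linearly onto [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# A cumulative counter, e.g. total requests served since start-up.
requests_total = [100, 130, 150, 190, 200]
increments = difference(requests_total)
normalized = min_max_normalize(increments)
print(increments, normalized)
```

Differencing matters because a counter that only ever grows would otherwise dominate the min-max range and hide the per-interval load.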

Server Outage Prediction

“I have seen the future and it is very much like the present, only longer.” Kehlog Albran, The Profit

Server Outage Prediction

“Prediction is very difficult, especially if it's about the future.” Niels Bohr, Nobel laureate in Physics

SARIMA, ANN

• Univariate

• Seasonality

• Crossvalidation

• Errors
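Cross-validation for a univariate seasonal series is usually done with a rolling origin, so that a forecast never sees future values. The sketch below illustrates only that evaluation scheme. It uses a seasonal-naive forecaster as a stand-in and is not a SARIMA or ANN implementation; the series and the season length are made up.

```python
def seasonal_naive(history, season):
    """Forecast the next value as the value observed one season earlier."""
    return history[-season]


def rolling_origin_errors(series, season, start):
    """One-step-ahead errors, each forecast using only data before step t."""
    if start < season:
        raise ValueError("start must be at least one season")
    errors = []
    for t in range(start, len(series)):
        forecast = seasonal_naive(series[:t], season)
        errors.append(series[t] - forecast)
    return errors


def mean_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)


# Three identical "days" of four samples, with one disturbed value.
series = [10, 20, 30, 40] * 3
series[9] = 25
errors = rolling_origin_errors(series, season=4, start=4)
print(errors, mean_absolute_error(errors))
```

Any other one-step forecaster with the same `(history, season)` interface could be substituted for `seasonal_naive`.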

[Figure: Platform.Memory.RAM.PctUtil (y-axis, 40 to 100) over time steps 0 to 1000; prediction in green]

Prediction Errors

[Figure: Prediction errors for Platform.Memory.RAM.PctUtil; x-axis: time step (0 to 1000), y-axis: error (-14 to 6)]

Outlier Detection

• Threshold

• Angle-based outlier detection

• One-class support vector machine

• Why do we use these methods?

– $1 + x$ classification problem

– Unsupervised

– Range of dimensions
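For angle-based outlier detection, the score of a point is the variance of the angles under which it sees all pairs of other points: a point far outside the data sees everything in roughly the same direction and gets a low score. The following is a brute-force sketch of the angle-based outlier factor (ABOF) in its usual distance-weighted form, on made-up 2D points. It is cubic in the number of points and only meant to show the definition.

```python
import math
from itertools import combinations


def abof(point, others):
    """Angle-based outlier factor of `point` with respect to `others`.

    Weighted variance of <AB, AC> / (|AB|^2 |AC|^2) over all pairs (B, C),
    with weights 1 / (|AB| |AC|). A small value suggests an outlier.
    """
    diffs = [[b - a for a, b in zip(point, other)] for other in others]
    diffs = [d for d in diffs if any(d)]  # skip exact duplicates of `point`
    weight_sum = first_moment = second_moment = 0.0
    for ab, ac in combinations(diffs, 2):
        dot = sum(x * y for x, y in zip(ab, ac))
        norm_ab = math.sqrt(sum(x * x for x in ab))
        norm_ac = math.sqrt(sum(y * y for y in ac))
        value = dot / (norm_ab ** 2 * norm_ac ** 2)
        weight = 1.0 / (norm_ab * norm_ac)
        weight_sum += weight
        first_moment += weight * value
        second_moment += weight * value * value
    mean = first_moment / weight_sum
    return second_moment / weight_sum - mean * mean


def abof_scores(points):
    """ABOF of every point with respect to all remaining points."""
    return [abof(p, points[:i] + points[i + 1:]) for i, p in enumerate(points)]


points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (10, 10)]
scores = abof_scores(points)
print(scores.index(min(scores)))
```

The method needs no labels and no distance threshold.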


Server Outage Prediction

• The outlier detection delivers a score that can be used to calculate a fuzzy value of outageness.

• Thus a partition based on the relevance of outages is possible – traffic light system

• A combination of outageness scores delivered by various anomaly detectors is possible.

• By saving all these scores in a database, a classification of outages is possible (e.g. with an ANN or some clustering method).
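How the scores were mapped to outageness is not stated in the slides. One simple possibility, shown here purely as an assumption, is a linear ramp between two calibration points, a fuzzy OR (maximum) to combine detectors, and two hypothetical cut-offs for the traffic light.

```python
def outageness(score, low, high):
    """Map a raw anomaly score linearly onto a fuzzy value in [0, 1]."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (score - low) / (high - low)))


def combine(values):
    """Fuzzy OR (maximum) of the outageness values of several detectors."""
    return max(values)


def traffic_light(value, yellow=0.4, red=0.8):
    """Partition outageness into three relevance classes."""
    if value >= red:
        return "red"
    if value >= yellow:
        return "yellow"
    return "green"


# Two detectors with differently scaled scores (illustrative numbers).
detector_a = outageness(7.0, low=2.0, high=12.0)
detector_b = outageness(0.95, low=0.5, high=1.0)
print(traffic_light(combine([detector_a, detector_b])))
```

Scores for which a low value signals an outlier, such as ABOF, would have to be inverted before being passed to `outageness`.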

[Figure: ABOF rating of the time points (red: start of load, green: end of load); x-axis: time (0 to 300), y-axis: ABOF factor (0 to 1e+06)]

Image Understanding Applications

• Industrial image analysis

Quality assurance

Labview, Halcon, Cognex Vision Pro

• Medical image analysis

Mostly researchers with medical background

Visualisation support, detection of carcinomas etc.

ITK

• Image analysis and AI

Facebook, Google, Baidu, Microsoft – but still rooted in universities

Face detection, facial expression detection, scene description

OpenCV, dlib, Theano, Keras

HOG: Gold Pad Search Results

The results were extremely good.

HOG: Algorithm Details

• Dalal, Triggs (2005)

• Focus on intensity gradients/edge directions

• Local contrast normalization (invariant to light conditions)

• Orientation detection of a single pixel, overlapping blocks, histogram of oriented gradients

• SVM classifier

• Open source availability (dlib)

• Relatively few training pictures necessary

• Not a lot of parameter tuning

• Few wrong detections
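The core of the descriptor is a histogram of gradient orientations per image cell, followed by contrast normalization. The sketch below shows this for a single cell in plain Python. It is a simplification of Dalal and Triggs' method (central differences, hard binning into 9 unsigned orientation bins, no interpolation between bins, no block layout) and is not the dlib implementation.

```python
import math


def cell_histogram(cell, bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell.

    `cell` is a 2D list of grey values; border pixels are skipped so that
    central differences are always defined.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin with its magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(hist, eps=1e-6):
    """Contrast-normalize a histogram to (almost exactly) unit length."""
    norm = math.sqrt(sum(h * h for h in hist) + eps * eps)
    return [h / norm for h in hist]


# A 6x6 cell containing a vertical dark-to-bright edge.
edge = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
hist = cell_histogram(edge)
print(l2_normalize(hist))
```

Because of the normalization, the same edge at half the brightness yields practically the same descriptor, which is what makes the features robust to lighting conditions.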

HOG: Training Dataset

Deep Inspection

• Automatic optical inspection of sensors

• Sensor generations look similar, but not exactly alike

• Deep Convolutional Networks for better generalization and no extra parameter tuning

• HOG

• Software used: Keras (Python)


Deep Inspection

• Input: pixel grey values

• Resolution processed by Girshick et al. (2014): 227 × 227

• Alternating convolution and max pooling, spatial overlapping

• Hierarchical abstractions

• Sparse connections (non-linear filter)

• Shared weights to gain translation invariance and an improved generalization ability

• MLP classifier

• Dataset augmentation: sliding window, flipping, distortion, ...
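The building blocks named above can be illustrated without a deep learning framework. The following plain-Python sketch is not the Keras model used in the project; the tiny image and the kernel are made up. It shows one shared-weight convolution, a ReLU non-linearity, overlapping max pooling and flipping as an augmentation.

```python
def convolve2d(image, kernel):
    """Valid 2D cross-correlation; the same kernel weights are used everywhere."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def relu(feature_map):
    """Element-wise non-linearity applied after the convolution."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Max pooling; stride < size gives spatially overlapping windows."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[y * stride + i][x * stride + j]
                for i in range(size) for j in range(size))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def flip_horizontal(image):
    """Simple augmentation: mirror the image left to right."""
    return [row[::-1] for row in image]


image = [[0, 0, 1, 1] for _ in range(4)]   # dark-to-bright vertical edge
kernel = [[-1, 1]]                         # responds to dark-to-bright steps
pooled = max_pool(relu(convolve2d(image, kernel)), size=2, stride=1)
flipped = max_pool(relu(convolve2d(flip_horizontal(image), kernel)), size=2, stride=1)
print(pooled, flipped)
```

The mirrored image produces no response for this kernel, which hints at why flipped copies are added to the training set: the network then has to learn filters for both edge polarities.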

Deep Inspection

• Gold pad check (scratches vs errors)

• Ohmic contact

• Active area

AOI

Automatic Optical Inspection of humidity sensors

Automatic error detection for humidity sensors

Multiple Challenges

High quality requirements

Changing specifications

Different kinds of errors

High data volume

Image Acquisition

8" silicon wafers

90,000 sensors per wafer

0.7 µm/pixel resolution

Target scan time of 30 minutes per wafer, i.e. 50 sensors per second

Sensor

Cropping of images based on CAD file

Focus

Intensity of reflection of laser beam

Deep search for highest peak

Keep focus with proportional controller
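The two focus steps can be sketched as a coarse search followed by a control loop. Everything concrete here is assumed for illustration: the intensity sweep, the signed focus error and the gain are hypothetical, since the slides do not describe how the error signal is obtained.

```python
def coarse_peak(intensities):
    """Index of the strongest reflection in a sweep over focus positions."""
    return max(range(len(intensities)), key=intensities.__getitem__)


def p_controller_step(position, measured_error, gain=0.5):
    """One proportional-control update: move against the measured error."""
    return position - gain * measured_error


# Coarse search: reflected laser intensity sampled at six focus positions.
sweep = [3, 8, 120, 380, 90, 5]
start_index = coarse_peak(sweep)

# Fine tracking: the error is simulated as distance to a known target.
target = 10.0
position = 14.0
for _ in range(20):
    position = p_controller_step(position, position - target)
print(start_index, position)
```

With a gain between 0 and 1 the remaining error shrinks by a constant factor in every step, so the stage settles on the focus plane without overshooting.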


Master

A master image is created per wafer

Perfect image

Self-adapting to new sensors

Simplification of image registration

Step 1: Canny edge detection

Proven algorithm for edge detection

Reliable

Easy to implement

Fast

Skeleton (1px edge)

Con: Threshold needed

Step 2: Filtering

Discrepancy norm

Robust against noise

Con: No skeleton

Automatic thresholding
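Assuming the discrepancy norm is meant in the usual sense, $\|x\|_D = \max_{a \le b} \left| \sum_{i=a}^{b} x_i \right|$, it can be computed in a single pass from prefix sums. The sketch below shows only the norm itself, applied to made-up 1D difference signals. How it is embedded in the filtering step is not described in the slides.

```python
from itertools import accumulate


def discrepancy_norm(values):
    """Largest absolute sum over any contiguous run of `values`.

    Equals max(prefix sums) - min(prefix sums), with the empty prefix (0)
    included, so one pass over the data is enough.
    """
    prefix = [0, *accumulate(values)]
    return max(prefix) - min(prefix)


# Alternating pixel noise cancels out, a coherent deviation accumulates.
noise = [1, -1, 1, -1, 1, -1]
defect = [0, 0, 3, 3, 3, 0]
print(discrepancy_norm(noise), discrepancy_norm(defect))
```

This is the sense in which the norm is robust against noise: isolated fluctuations of alternating sign stay small, while a run of same-signed differences adds up.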

Final Step

Combination of these two methods

Comparison with Canny image of master
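One way to compare a sensor's edge image with that of the master is to report every edge pixel that has no master edge nearby. This is a sketch under assumptions: binary edge maps as nested lists, a hypothetical pixel `tolerance` to absorb small registration errors, and made-up 5x5 example data.

```python
def unmatched_edges(edges, master_edges, tolerance=1):
    """Edge pixels of `edges` with no master edge within `tolerance` pixels."""
    height, width = len(master_edges), len(master_edges[0])
    missing = []
    for y in range(height):
        for x in range(width):
            if not edges[y][x]:
                continue
            neighbourhood = (
                master_edges[j][i]
                for j in range(max(0, y - tolerance), min(height, y + tolerance + 1))
                for i in range(max(0, x - tolerance), min(width, x + tolerance + 1))
            )
            if not any(neighbourhood):
                missing.append((y, x))
    return missing


# Master: one vertical edge in column 2.
master = [[0, 0, 1, 0, 0] for _ in range(5)]
# Sensor: the edge is shifted by one pixel in row 1, plus a stray pixel at (2, 0).
sensor = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
print(unmatched_edges(sensor, master))
```

Calling the function with the arguments swapped finds the opposite kind of defect, namely master edges that are missing on the sensor.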

Conclusion

High rate of error detection

No human intervention needed

Fast and robust

Based on proven algorithms