
VLDB 2006, Seoul 1

Mapping a Moving Landscape by Mining Mountains of Logs
Automated Generation of a Dependency Model for HUG’s Clinical System

Mirko Steinle, EPFL and HUG
Karl Aberer, EPFL
Sarunas Girdzijauskas, EPFL
Alexander Lamb, HUG


Overview

• Background – Dependency Models
• Approaches
  – L1: Analyzing general service activity
  – L2: Analyzing user sessions
  – L3: Analyzing textual content
• Evaluation
• Conclusion


Background - A Moving Landscape

• Distributed clinical system of University Hospital Geneva (HUG)
  – 2000 beds, 4500 PCs, 20000 records accessed per day
• Relevant features
  – Communication is web service based
    • Service Directory: about 50 service groups
  – Centralized Logging System with a standard XML format
    • 10 million log messages/day, 1 terabyte/year
  – Quite homogeneous infrastructure
• Severe Availability Requirements (24 x 7 x 365)
➱ Need for automated support for problem diagnosis


Dependency Model

• Service orientation allows for easy reuse and integration, but has resulted in a complex dependency structure
• Dependency model is not clear
  – DM difficult to obtain, impossible to keep up-to-date manually
  – Infrastructure for manual documentation of the dependency structure is available, but not used …


Goal - Automated Dependency Model

• Goal: Automated creation of a model of the system’s dependency structure (DM)
  – Non-intrusive and low-cost
  – Focus on invocation dependencies between high-level objects
• Applications
  – Support for Fault Localization Algorithms
  – Prediction of Impact of Management Operations
  – Support for Architectural Decisions
  – Detection of Abnormal Behavior

• “you don’t want to interrupt a surgery because of DB maintenance”


Possible Approaches

• Static approaches
  – Capture dependencies at “compile time” by scanning configuration files, code etc.
• Dynamic approaches
  – Capture dependencies at runtime
  – Approaches include:
    • Code instrumentation (standards like JMX or ARM exist but are not yet applied broadly)
    • Middleware instrumentation (e.g. request tagging)
    • Active perturbation of system operation
    • Time series analysis of activity measures (network communication, CPU usage, …), e.g. using neural networks [Ensel02]

[Figure axis labels: Generality; Accuracy & Precision]


State of the Art

• Research
  – Focuses on how to exploit a dependency model, little work on how to obtain it
  – No generally applicable solution providing sufficiently correct dependency models seems to exist
• Commercial Products
  – Most focus on low-level objects and visualization
  – (Few) existing dynamic approaches: high configuration effort!


Technique L1: Logs as a General Activity Measure

• Key idea
  – Activity of dependent objects is likely to be correlated in some sense
  – Use logs as an activity measure
• Earlier work
  – Neural networks on CPU usage, traffic volume, … [Ensel02]
  – Drawback: supervised training
• Our approach
  – Statistical approach (no training)
  – Inspired by [LM04] (“Mining Temporal Patterns without Predefined Time Windows”)


Statistical Approach

• Tests for association of spatial point processes
  – Compare two typical distances to the closest timestamp of a log from B: the distance from a random point in time R, and the distance from a timestamp of a log from A
• Approach
  – Obtain distances by sampling from R and A
  – Determine median for distances A-B and R-B
  – If median for A-B lower than for R-B → correlation/dependence
  – Use confidence intervals
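The comparison of A-B and R-B distances can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the sample size and the uniform choice of random time points are assumptions made for the example, and the confidence-interval step is left out.

```python
import bisect
import random
from statistics import median

def nearest_distance(t, sorted_times):
    """Distance from time t to the closest timestamp in a sorted, non-empty list."""
    i = bisect.bisect_left(sorted_times, t)
    candidates = []
    if i < len(sorted_times):
        candidates.append(sorted_times[i] - t)   # closest timestamp at or after t
    if i > 0:
        candidates.append(t - sorted_times[i - 1])  # closest timestamp before t
    return min(candidates)

def median_distances(a_times, b_times, n_samples=200, seed=0):
    """Return (median A-B distance, median R-B distance).

    a_times, b_times: log timestamps of sources A and B (hypothetical inputs).
    R is drawn uniformly over the observed time range (an assumption).
    """
    rng = random.Random(seed)
    b_sorted = sorted(b_times)
    lo = min(min(a_times), b_sorted[0])
    hi = max(max(a_times), b_sorted[-1])
    a_sample = [rng.choice(a_times) for _ in range(n_samples)]
    r_sample = [rng.uniform(lo, hi) for _ in range(n_samples)]
    med_ab = median(nearest_distance(t, b_sorted) for t in a_sample)
    med_rb = median(nearest_distance(t, b_sorted) for t in r_sample)
    return med_ab, med_rb

# Synthetic demo: every log of A is followed by a log of B half a second later.
demo_rng = random.Random(1)
a_logs = [demo_rng.uniform(0, 10_000) for _ in range(300)]
b_logs = [t + 0.5 for t in a_logs]
med_ab, med_rb = median_distances(a_logs, b_logs)
print("median A-B:", med_ab, "median R-B:", med_rb)
```

A clearly smaller A-B median than R-B median suggests that logs of B tend to occur close to logs of A; deciding whether the difference is significant is what the confidence intervals are for.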


Example

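Confidence interval for the median of $x_1, \dots, x_n$ (sorted in ascending order): the median falls into this interval with probability 95%.
Interval $[x_j, x_k]$ s.t. $B_{n,1/2}(k-1) - B_{n,1/2}(j-1) > 0.95$, where $B_{n,1/2}$ is the cumulative distribution function of the binomial distribution with parameters $n$ and $1/2$.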
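A distribution-free interval of this kind can be computed from order statistics alone. The sketch below is one possible reading of the condition on the slide: it starts from the widest interval and shrinks it symmetrically while the binomial coverage stays above the level. The symmetric shrinking rule and the function names are assumptions, not taken from the talk.

```python
from math import comb

def binom_cdf(k, n):
    """P(X <= k) for X ~ Binomial(n, 1/2)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) for i in range(min(k, n) + 1)) / 2 ** n

def median_ci(values, level=0.95):
    """Return (x_(j), x_(k)) such that B(k-1) - B(j-1) > level.

    j and k are 1-based ranks in the sorted sample.
    """
    xs = sorted(values)
    n = len(xs)
    j, k = 1, n
    if binom_cdf(k - 1, n) - binom_cdf(j - 1, n) <= level:
        raise ValueError("too few observations for this confidence level")
    # Shrink from both ends while the narrower interval still has enough coverage.
    while j + 1 < k - 1 and binom_cdf(k - 2, n) - binom_cdf(j, n) > level:
        j, k = j + 1, k - 1
    return xs[j - 1], xs[k - 1]

print(median_ci([3.1, 0.4, 2.2, 5.0, 1.7, 0.9, 4.4, 2.8, 3.6, 1.2]))
```

The cumulative sum is recomputed on every step, which is fine for a sketch but would need caching for large samples.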


Observations for L1

• Observations from preliminary experimental evaluation
  – True dependencies found, but clearly incomplete
  – Few “random” errors
  – However, correlation also if no invocation dependency exists
• Limit analysis to shorter time windows
  – Eliminate common dependency on time

[Figure panels: Transitive dependency; Simultaneous use]


Technique L2: Logs in a User Session

• One main difficulty is heavy parallelism in system

➱ execution sequences get overshadowed

• Reconstruct user sessions

➱ eliminates parallelism due to multiple users

• Then, adapt a procedure from NLP [Evert04]
• Two independent steps
  1. Extraction of consecutive log-source pairs [APPi, APPj] and creation of contingency tables
  2. Statistical test for association on these tables


Construction of Contingency Table

• Session Log
• Bigrams (u, v): (A,B) - (B,C) - (C,B)
• Contingency table for A-B

          u = A   u ≠ A
  v = B     1       1
  v ≠ B     0       1
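The construction can be sketched in a few lines. The session below, a source sequence A, B, C, B, is an assumption chosen so that it yields exactly the three bigrams listed on the slide; the function names are hypothetical.

```python
def bigrams(session):
    """Consecutive pairs (u, v) of log sources within one session."""
    return list(zip(session, session[1:]))

def contingency_table(pairs, a, b):
    """2x2 table for the pair (a, b), laid out as on the slide:
    rows are v = b / v != b, columns are u = a / u != a."""
    both = sum(1 for u, v in pairs if u == a and v == b)
    only_v = sum(1 for u, v in pairs if u != a and v == b)
    only_u = sum(1 for u, v in pairs if u == a and v != b)
    neither = len(pairs) - both - only_v - only_u
    return ((both, only_v), (only_u, neither))

session = ["A", "B", "C", "B"]          # assumed example session
pairs = bigrams(session)                 # (A,B), (B,C), (C,B)
print(contingency_table(pairs, "A", "B"))
```

In practice the counts would be accumulated over the bigrams of all reconstructed sessions before a table is tested.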


Expected vs. Observed Frequencies

• Expected frequencies under the hypothesis that u and v are statistically independent
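The formula itself did not survive on this slide. For a 2x2 contingency table with observed counts $O_{ij}$, the standard expected frequencies under independence are given below; the notation ($R_i$, $C_j$, $N$) is the common one for such tables and is assumed here rather than taken from the slide.

```latex
\[
R_i = O_{i1} + O_{i2}, \qquad
C_j = O_{1j} + O_{2j}, \qquad
N = \sum_{i,j} O_{ij}, \qquad
E_{ij} = \frac{R_i \, C_j}{N}
\]
% Worked example for a table with rows (1, 1) and (0, 1):
% R_1 = 2, R_2 = 1, C_1 = 1, C_2 = 2, N = 3
\[
E_{11} = \tfrac{2}{3}, \quad
E_{12} = \tfrac{4}{3}, \quad
E_{21} = \tfrac{1}{3}, \quad
E_{22} = \tfrac{2}{3}
\]
```

The expected values sum to $N = 3$, as they must.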


Statistical Test for Association

• Log-likelihood test (Dunning)

• Works well for heavily skewed tables (O11 << N)

• For an excellent discussion of statistical tests for correlation see [Evert04]
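For reference, the log-likelihood statistic is commonly written as $G^2 = 2 \sum_{ij} O_{ij} \ln(O_{ij} / E_{ij})$, with $E_{ij}$ the expected frequencies under independence. A minimal sketch follows; the function name and the example tables are illustrative, not from the talk.

```python
from math import log

def log_likelihood_g2(table):
    """G^2 statistic for a 2x2 table ((O11, O12), (O21, O22)) with N > 0."""
    (o11, o12), (o21, o22) = table
    n = o11 + o12 + o21 + o22
    row_sums = (o11 + o12, o21 + o22)
    col_sums = (o11 + o21, o12 + o22)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            if observed > 0:  # a zero cell contributes nothing to the sum
                total += observed * log(observed / expected)
    return 2 * total

print(log_likelihood_g2(((10, 20), (30, 60))))  # independent counts
print(log_likelihood_g2(((50, 0), (0, 50))))    # perfectly associated counts
```

Under independence the statistic is approximately chi-squared distributed with one degree of freedom, so values above roughly 3.84 would be significant at the 5% level. The statistic is unchanged when the table is transposed, so the row/column orientation does not matter.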


Observations for L2

• Observations from preliminary experimental evaluation
  – Many true dependencies found
  – Interestingly, a few similar errors as in L1
    • transitivity and simultaneous use
  – Main problem
    • only a small subset of logs can be assigned to a session, and many interactions can thus not be observed


Technique L3: Exploiting Textual Content in Logs

• Observation
  – Invocation of a remote service is typically logged by the caller
  – One could identify such logs and process log content to find callee
• The other way round
  – Find logs mentioning directory entry contents for a given service
  – Infer a dependency of the log’s source, the caller, on the service
• Example: service s calls notify on server myserver
  ● Possible content of free text in log entry

    Invoke externalService [fct [notify] server [myserver.hguge:9999/myurl]]
    or
    (DPINOTIFICATION) notify ($myparams)
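The “other way round” lookup can be sketched as a plain substring search. Everything in the example is hypothetical: the service name `notification`, the shape of the directory (service name mapped to identifying strings) and the log tuples are assumptions for illustration, not HUG’s actual directory or log format.

```python
def infer_dependencies(logs, directory):
    """Return the set of (caller, service) pairs found in the logs.

    logs: iterable of (source, free_text) tuples.
    directory: dict mapping a service name to strings that identify it,
               e.g. its URL or a function group (assumed structure).
    """
    dependencies = set()
    for source, text in logs:
        lowered = text.lower()
        for service, keys in directory.items():
            if service == source:
                continue  # a service mentioning itself is not a dependency
            if any(key.lower() in lowered for key in keys):
                dependencies.add((source, service))
    return dependencies

directory = {"notification": ["myserver.hguge:9999/myurl", "DPINOTIFICATION"]}
logs = [
    ("s", "Invoke externalService [fct [notify] server [myserver.hguge:9999/myurl]]"),
    ("s", "(DPINOTIFICATION) notify ($myparams)"),
    ("t", "user logged in"),
]
print(infer_dependencies(logs, directory))
```

Each log line is checked once against the directory entries, so the work grows linearly with the number of logs for a fixed directory.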


Experiments on Logs: Setting

• Test data: 56.8 million logs from 1 week
• Reference model (RM)
  – Created with help of more than a dozen system experts and developers
  – 178 dependencies out of 1431 possible dependencies (54 services)
• Strategy
  1. Validate L1, L2 and L3 against static reference model
  2. Validate L1 and L2 against L3 and study influence of load
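The slides do not say how the 1431 possible dependencies were counted, but the figure is consistent with counting unordered pairs of the 54 services:

```latex
\[
\binom{54}{2} = \frac{54 \cdot 53}{2} = 1431,
\qquad
\frac{178}{1431} \approx 0.124
\]
```

Under that reading, about 12% of all service pairs are dependencies in the reference model.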


Experiment: Validation against RM

L1 (0.98 level CI: [0.63, 0.73])
• 30-46 True Positives detected
• Small classification error for L1
  – about 2% in negative case
• False Positives (FP) for L1
  – transitive and simultaneous use (e.g. administrative patient data and laboratory results)

L2 (0.98 level CI: [0.71, 0.78])
• 51-74 True Positives detected
• FP for L2
  – asynchronous communication
• Sessions in L2
  – only 10% of all logs can be assigned to a session

L3 (0.98 level CI: [0.93, 0.96])
• 116-152 True Positives detected
  – 10 False Negatives on the whole week


Experiment: Influence of Load on Detection

• Realizations of dependency relationships computed with L3
• Percentage of False Positives is not influenced by load

CI for linear factors
  L1: [-0.284, -0.215]
  L2: [-0.025, 0.002]


Comparison of Log-based Approaches

[Comparison table; the graphical ratings are not recoverable. Surviving labels:]

Columns: L3. Logs as Text | L2. Logs in Sessions | L1. Logs as Activity Measure

Rows:
– Accuracy and Precision of Result
– Concurrency
– Correlation
– Implementation and Maintenance
– Parametrization
– Performance and security impact
– Required Structure and Content of Logs (Scope)
    L3: Service directory | L2: Session info | L1: Only source and timestamp

• All techniques can be implemented in linear complexity w.r.t. #logs
• Invocation direction functional dependency direction
• Solution for HUG
  – Centralized logging system ➱ little effort for log-based methods
  – L3 is a viable solution


Conclusion

• Three new approaches to use logs for DM generation with a large scope

• All have been shown to discover useful dependency information in real-world environment

• Seems to be first study on use of logs and first real-world experiment for DM generation

• Sniffing
  – Applicable for web service oriented systems

• Simple and efficient solution for HUG
