reverse engineering object-oriented distributed systems

Post on 21-Nov-2014

Reverse Engineering Object-Oriented Distributed Systems

Dan C. Cosma

LOOSE Research Group

“Politehnica” University of Timisoara

Romania

Overview

Understanding object-oriented distributed software applications by reverse engineering the source code, focusing on the distribution-related aspects of the system, using a structural, technology-aware analysis approach

Distributed Software

The distributed aspect is crucial for understanding
- systems are specifically built for distributed problems
- technology dependence: communication infrastructure

Making the distributed aspect central makes the analysis easy
- without ignoring the local functionality concerns

Methodology for understanding object-oriented distributed systems

meta-model

reverse engineering techniques

metrics

visualization

tool

A reverse engineering process

System Representation

Model

Augments an OO meta-model (Memoria): makes the distributed aspect a main concept

distributable feature -- feature directly involved in the distributed functionality, either by providing remote services, or by directly using such services

frontier classes -- act at the frontier between the system and the communication infrastructure (“communication mediator”)

Model Overview

[Figure: model overview. Labels: System, Mediator, Frontier, Frontier Class, Core Class, Acquaintance Class, Class, Distributable Feature Core, Distributable Feature, Local Feature]

The Approach

0: Initial graph of classes

1: Build the dependency graph of distributable features (DGDF)

2: Separate distinct cores of distributable features

3: Capture coarse-grained architecture of distributable features

4: Assess impact of distributable features

5: Support for restructuring

[Figure labels: Mediator, remote call, frontier class, core class, utility class, core of distr. feat., extracted new feature]

vertex: a class

edge: method call / attribute access / inheritance relation
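The graph of classes described above (one vertex per class, edges for method calls, attribute accesses and inheritance relations) could be represented roughly as follows. This is only a sketch: `ClassGraph`, `EdgeKind` and the example class names are hypothetical and do not come from the slides or from the tool.

```python
from collections import defaultdict
from enum import Enum


class EdgeKind(Enum):
    METHOD_CALL = "method call"
    ATTRIBUTE_ACCESS = "attribute access"
    INHERITANCE = "inheritance"


class ClassGraph:
    """Directed graph: vertices are class names, edges are typed relations."""

    def __init__(self):
        # (source, target) -> {EdgeKind: number of occurrences}
        self.edges = defaultdict(lambda: defaultdict(int))
        self.vertices = set()

    def add_relation(self, source, target, kind):
        self.vertices.update((source, target))
        self.edges[(source, target)][kind] += 1

    def weight(self, source, target):
        """Total number of relations from source to target, over all kinds."""
        return sum(self.edges.get((source, target), {}).values())

    def neighbours(self, name):
        """Classes related to `name` in either direction."""
        outgoing = {t for (s, t) in self.edges if s == name}
        incoming = {s for (s, t) in self.edges if t == name}
        return outgoing | incoming


# Hypothetical example system
graph = ClassGraph()
graph.add_relation("OrderClient", "OrderService", EdgeKind.METHOD_CALL)
graph.add_relation("OrderClient", "OrderService", EdgeKind.METHOD_CALL)
graph.add_relation("OrderServiceImpl", "OrderService", EdgeKind.INHERITANCE)
```

Keeping a count per relation kind leaves room for later steps to weight edges differently, which is an assumption about how such a graph might be used rather than something the slides state.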

The case studies

Java / RMI

I. Core analysis

Goals

Find the core entities involved in the distributed functionality

Get an overview of the distributed architecture

Identify the Frontier - technology-dependent rules

Build a Core Graph

Start with the Frontier Classes: the best starting points describing involvement in distribution

Incrementally add new vertices and edges until a configurable depth of search is reached
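A minimal sketch of the two steps above, for the Java / RMI case. The frontier rule used here (a class counts as a frontier class if `java.rmi.Remote` or `java.rmi.server.UnicastRemoteObject` is among its supertypes) is an assumed example of a technology-dependent rule, not the rule set used in the original work. The function names and the sample classes are hypothetical.

```python
from collections import deque

# Assumed Java/RMI rule: these supertypes mark a frontier class.
RMI_MARKERS = {"java.rmi.Remote", "java.rmi.server.UnicastRemoteObject"}


def find_frontier(supertypes):
    """supertypes: class name -> set of all (transitive) supertype names."""
    return {name for name, supers in supertypes.items() if supers & RMI_MARKERS}


def build_core_graph(adjacency, frontier, max_depth):
    """Breadth-first expansion from the frontier, at most max_depth edges away."""
    depth = {name: 0 for name in frontier}
    queue = deque(sorted(frontier))
    edges = set()
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # do not expand beyond the configured depth
        for neighbour in sorted(adjacency.get(current, ())):
            edges.add(frozenset((current, neighbour)))
            if neighbour not in depth:
                depth[neighbour] = depth[current] + 1
                queue.append(neighbour)
    return set(depth), edges


# Hypothetical example system
supertypes = {
    "OrderService": {"java.rmi.Remote"},
    "OrderServiceImpl": {"OrderService", "java.rmi.Remote",
                         "java.rmi.server.UnicastRemoteObject"},
    "OrderStore": set(),
    "Database": set(),
}
adjacency = {  # undirected view of the class graph
    "OrderService": {"OrderServiceImpl"},
    "OrderServiceImpl": {"OrderService", "OrderStore"},
    "OrderStore": {"OrderServiceImpl", "Database"},
    "Database": {"OrderStore"},
}
frontier = find_frontier(supertypes)
vertices, edges = build_core_graph(adjacency, frontier, max_depth=1)
```

With `max_depth=1` the search stops at `OrderStore`, so `Database` stays outside the core graph. Raising the depth pulls in more of the local functionality.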

Identify the Distributable Feature Cores

Detect and remove edges that connect loosely coupled sets of classes - technology-aware and cohesion-based heuristics

The resulting connected components: candidate DF cores

Identify the remote communication channels

The engineer reviews the result

Classes in DF cores: ~10%
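The core identification step can be illustrated by removing weak edges and collecting the connected components that remain. In this sketch a plain weight threshold stands in for the technology-aware and cohesion-based heuristics, which the slides do not spell out. `candidate_cores`, `min_weight` and the class names are hypothetical.

```python
def candidate_cores(weighted_edges, min_weight):
    """weighted_edges: {(a, b): weight}, undirected. Returns a list of sets."""
    # Keep only edges considered strong enough (stand-in for the real heuristics).
    adjacency = {}
    for (a, b), weight in weighted_edges.items():
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Each connected component of the pruned graph is a candidate DF core.
    components, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical core graph: the Queue-Engine edge is the loose connection.
core_graph = {
    ("PeerListener", "Replicator"): 5,
    ("Replicator", "Queue"): 4,
    ("Queue", "Engine"): 1,
    ("Engine", "Runner"): 6,
}
cores = candidate_cores(core_graph, min_weight=2)
```

The result is only a list of candidates; as the slides note, the engineer reviews it.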

Architectural preview: the Distributed Architecture Perspective

II. Impact of distribution

Goals

focus on the rest of the classes in the system (the majority)

evaluate their involvement in providing the Distributable Features

identify the classes that follow the main patterns of involvement

make system-level and class-level characterizations

Class involvement

Set of coupling-based metrics

The collaboration of a class with the entire system: Total bidirectional coupling (TBC)

Involvement in providing a particular DF: Acquaintance with a Distributable Feature (ADF)

Involvement in providing all DFs = involvement in distribution: Total Acquaintance with Distributable Features (TADF)

System-wide “distributed awareness”: Average Total Acquaintance with Distributable Features (Average TADF)
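The slides name these metrics without giving formulas, so the sketch below rests on assumed definitions: TBC as the sum of all collaborations of a class in both directions, ADF as the collaborations between a class and the classes of one DF core, TADF as the sum of ADF over all cores, and Average TADF as the mean over a chosen set of classes. Function names and data are hypothetical.

```python
def tbc(coupling, cls):
    """Assumed TBC: all collaborations of cls, in both directions."""
    return sum(w for (a, b), w in coupling.items() if cls in (a, b))


def adf(coupling, cls, core):
    """Assumed ADF: collaborations between cls and the classes of one DF core."""
    total = 0
    for (a, b), w in coupling.items():
        if a == cls and b in core:
            total += w
        elif b == cls and a in core:
            total += w
    return total


def tadf(coupling, cls, cores):
    """Assumed TADF: ADF summed over every distributable feature."""
    return sum(adf(coupling, cls, core) for core in cores.values())


def average_tadf(coupling, classes, cores):
    """Assumed Average TADF: mean TADF over the given classes."""
    return sum(tadf(coupling, c, cores) for c in classes) / len(classes)


# Hypothetical data: (class, class) -> number of collaborations
coupling = {
    ("Ui", "Engine"): 3,
    ("Ui", "Parser"): 7,
    ("Engine", "Runner"): 4,
    ("Parser", "Replicator"): 2,
}
cores = {"workflow": {"Engine", "Runner"}, "replication": {"Replicator"}}
```

Here `Ui` has a TBC of 10 and a TADF of 3, so about a third of its collaboration is distribution-related under these definitions.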

Visualization

Feature Affiliation Perspective

[Figure: each class is drawn as a box. Labels: Total Collaboration Dispersion, Total Collaboration Intensity (vertical), Dispersion of Feature Acquaintance, Intensity of Feature Acquaintance]

- intensity: no. of collaborations
- dispersion: no. of collaborators

gray: total collaboration

color: distribution-related collaboration

The Feature Affiliation Perspective

Visualization Example

Part of the visualization for EHCACHE

Patterns of Involvement

How does a class participate in providing the DFs?

- The main patterns of involvement were detected (Patterns of acquaintance)

- Define and use a set of detection strategies [Marinescu04] to detect the classes following a certain pattern

- Put the visualization to use: see the interesting classes

Pattern I. Significant Feature Acquaintance (Big Color Box)

Class has significant involvement with the distributed functionality

Total coupling with distributable features is high: TADF ≥ HIGH

AND

Class is mostly coupled with distributable features: TADF / TBC ≥ AVERAGE

Pattern II. Local Feature Contributor (Big Gray)

Class has significant involvement with local (non-distributed) functionality

Class is strongly coupled with the other classes in the system: TBC ≥ HIGH

AND

Class has (almost) no relation with the distributable features: TADF / TBC ≤ LOW

Pattern III. Connector Class (Color-Spotted Gray)

Class connects a local feature with a distributed one

Class has significant coupling with the distributable features: TADF ≥ AVERAGE

AND

Class has significant coupling with other classes in the system: LOW < TADF / TBC ≤ AVERAGE
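The three patterns above can be read as simple threshold rules over TADF and TBC. The sketch below encodes them that way. The numeric thresholds are hypothetical, since the slides only name the levels LOW, AVERAGE and HIGH, and `Thresholds` and `classify` are illustrative names rather than part of the tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; the slides only name LOW, AVERAGE and HIGH.
    tadf_average: float = 5
    tadf_high: float = 10
    tbc_high: float = 30
    ratio_low: float = 0.05
    ratio_average: float = 0.5


def classify(tadf, tbc, t=Thresholds()):
    """Return the pattern of involvement matched by a class, or None."""
    ratio = tadf / tbc if tbc else 0.0
    # Pattern I: TADF >= HIGH and TADF/TBC >= AVERAGE
    if tadf >= t.tadf_high and ratio >= t.ratio_average:
        return "Significant Feature Acquaintance"
    # Pattern II: TBC >= HIGH and TADF/TBC <= LOW
    if tbc >= t.tbc_high and ratio <= t.ratio_low:
        return "Local Feature Contributor"
    # Pattern III: TADF >= AVERAGE and LOW < TADF/TBC <= AVERAGE
    if tadf >= t.tadf_average and t.ratio_low < ratio <= t.ratio_average:
        return "Connector Class"
    return None
```

For instance, a class with TADF = 15 and TADF / TBC = 0.2 (that is, TBC = 75) matches the Connector Class rule under these assumed thresholds. When the ratio sits exactly at the AVERAGE boundary, Pattern I is checked first; that ordering is a choice made for this sketch.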

[Figure: visualizations of the two case studies, EHCACHE and FWS]

System-level characterization

FWS: 2 DF cores, Average TADF = 3, a lot of gray - significant local functionality [80 classes belong to a local tool; the system was initially non-distributed]

EHCACHE: 5 DF cores, Average TADF = 9, more color - more distributed functionality [documentation: system redesigned specifically as distributed]

Class-level characterization

• FWS
- 80 classes -- the local tool for visually editing workflow specifications
- 6 classes -- belonging to other local features

• EHCACHE
- Less than 5 classes
- Cache – highest TBC; heavily used, but local
- ConfigurationHelper – manages configuration files

Local Feature Contributor / Big Gray

Class-level characterization

• FWS
- 5 classes, related to the Workflow Engine
- Small number => the functionality is well localized in the system

• EHCACHE
- 12 classes, related to the Cache Peer Manager
- TADF/TBC close to 1 => classes are dedicated to the distributable feature (e.g. Mutex, ConcurrencyUtil, Sync)

Significant Feature Acquaintance / Big Color Spot

Class-level characterization

• FWS
- 5 classes
- Most interesting case: ProcessDefinition
- TADF=15, TADF/TBC=0.2
- Models/stores the internal representation of workflows in execution
- Links the classes that run the workflow (detected as Significant Feature Acquaintances) with an XML parser that reads the workflow specifications

• EHCACHE
- 6 classes
- Most interesting case: Element
- Represents the data item cached by the system
- The only class that has a noticeable relation with Cache Replicator
- Links the Cache Replicator with the non-distributed feature of the system that actually stores (caches) data

Connector Class / Color-Spotted Gray

III. Support for Restructuring

Goal

Apply concepts and measurements similar to those used in the analysis to help the engineer explore / play with tentative restructuring scenarios

Approach

Visualize (a part of) the graph of classes

Select a set of initial classes

See what happens if they are to be extracted (removed) as a separate unit:
- evaluate the redesign layout: which classes should go with those selected, which should remain in the initial system
- evaluate the cost

Apply such scenarios at will

Helpers

Metrics-based visualization to help select initial classes
- In-group Adequacy (IGA) metric

Compute the forecasted layout
- Acquaintance with Class Group (ACG)
- Configurable threshold value

Computing the extraction cost
- Extraction Cost (EC)
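To make the layout forecast and cost evaluation concrete, here is a sketch under assumed definitions, since the slides do not define the metrics: ACG is taken as the share of a class's coupling that goes to the selected group, classes whose ACG reaches the threshold are added to the group until nothing changes, and EC is taken as the total weight of the relations cut by the extraction. IGA is not covered. All function names and data are hypothetical.

```python
def coupling_with(coupling, cls, group):
    """Collaborations between cls and the classes in group."""
    return sum(w for (a, b), w in coupling.items()
               if (a == cls and b in group) or (b == cls and a in group))


def forecast_layout(coupling, selected, threshold):
    """Grow the selected group with every class whose assumed ACG >= threshold."""
    classes = {c for pair in coupling for c in pair}
    group = set(selected)
    changed = True
    while changed:
        changed = False
        for cls in sorted(classes - group):
            total = coupling_with(coupling, cls, classes - {cls})
            acg = coupling_with(coupling, cls, group) / total if total else 0.0
            if acg >= threshold:
                group.add(cls)
                changed = True
    return group, classes - group


def extraction_cost(coupling, group):
    """Assumed EC: total weight of the relations cut by the extraction."""
    return sum(w for (a, b), w in coupling.items() if (a in group) != (b in group))


# Hypothetical scenario: try extracting the parser as a separate unit.
coupling = {
    ("Parser", "Lexer"): 8,
    ("Parser", "Editor"): 2,
    ("Lexer", "Tokens"): 5,
    ("Editor", "Canvas"): 9,
    ("Tokens", "Canvas"): 1,
}
extracted, remaining = forecast_layout(coupling, {"Parser"}, threshold=0.6)
cost = extraction_cost(coupling, extracted)
```

In this scenario `Lexer` and `Tokens` follow `Parser` into the extracted unit, while `Editor` and `Canvas` stay behind. Lowering the threshold pulls more classes into the extracted unit, so comparing the resulting cost across thresholds is one way to explore alternatives.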

[Figure: panels a), b), c); axes: dispersion, intensity]

Example

Applying successive scenarios can also help improve the system understanding

IV. The Tool

niSiDe

“non-invasive Structural insight on Distributed environments”

Follows all the steps in the methodology, and provides complete support for analysis

Generates all visualizations and support diagrams

Built for extensibility

Integrated in the iPlasma environment

Conclusions

Contributions

• A methodology for understanding object-oriented distributed systems

• A model for object-oriented distributed systems

• The Distributable Features View (visualization)

• Basic restructuring support as a natural extension to the understanding techniques

• Comprehensive tool support
