

OOPSLA’05: Gil, Maman

Micro Patterns in Java code

The Unbelievable Simplicity of Real Object-Oriented Programming

Itay Maman

The Technion – Israel Institute of Technology

Joint work with Yossi Gil

October 18, 2005

OOPSLA’05, San-Diego


Java’s Design Space

• Classes:
  – final, abstract, inner, visibility, super-class, interfaces
• Fields:
  – static, final, visibility, signature
• Methods:
  – static, final, visibility, signature, abstract, overriding
• Constructors, initializers, …


The Patterns Conjecture

90% of the complexity is in 10% of the code

• The 90%-10% principle:
  – Most classes are simple
    • In most classes only a fraction of the design space is used
  – Complex classes are rare
• Rationale:
  – Programmers prefer to employ proven, familiar techniques
  – They reuse the same simple ideas that work over and over


Simple vs. Complex Classes

• Complex classes (10%)
  – Difficult to implement
  – Substantial role
    • Represent real life entities
    • Discovered during design
  – Rich state
• Simple classes (90%)
  – Implementation is almost trivial
  – Solve a local need in an ad-hoc fashion
  – Encapsulate minimal state (if any)
• Our focus: The “boring” 90% of the code

Leading questions:

• Is the 90%-10% principle true?
• What do simple classes look like?
• Why do we need so many simple classes?


File -> New -> Class…


The Methodology

• We set out to explore the nature of Java classes
  – Similar to the way a social-science survey is conducted
• Our empirical approach:
  – Step 1: Experiments -- Observe the world
  – Step 2: Conclusions -- Analyze the results
  – Step 3: Reflection -- Develop an underlying model of the world
• “Classical” computer-science research:
  – Step 1: Given some model, invent a new idea
  – Step 2: Experiment -- Test this new idea
  – Step 3: Conclusions -- Evaluate your work


Experimental Setting

• 14 huge software collections
  – 3,391 packages
  – 71,611 classes
  – 620,595 methods
• Domains
  – Libraries: 7 implementations/versions of the JRE
  – Applications:
    • GUI: JEdit and Poseidon
    • Server: JBoss and Tomcat
    • Compiler/language tools: Ant, Scala, MJC
• Testing system: automatic, static analysis of classfiles + manual inspection of results


μ-Patterns

“Formal condition on the type, structure, names, and attributes of a Java class and its components.”

1. Purposeful: fulfill a useful programming need

2. Limiting: restrict the design space variety

3. Mechanically recognizable: by an automatic checker

4. Simple: human understandable

Similar to Design Patterns, but (i) automatable, and (ii) at a lower level of abstraction.
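
To make the "mechanically recognizable" requirement concrete, here is a minimal sketch of an automatic checker. It is not the tool used in this work, which analyzes classfiles statically; this sketch uses java.lang.reflect instead. The condition it tests (the class declares no instance fields) is a simplified stand-in chosen for illustration, and the names NoInstanceStateChecker, Counter and Greeter are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical checker; an illustration only, not the tool used in the talk. */
final class NoInstanceStateChecker {

    /** Sample input: carries one instance field. */
    static class Counter {
        int count;
    }

    /** Sample input: only a static constant and a method. */
    static class Greeter {
        static final String GREETING = "hello";

        String greet(String name) {
            return GREETING + ", " + name;
        }
    }

    /**
     * Returns true if the class itself declares no non-static fields.
     * Inherited fields are deliberately ignored in this sketch.
     */
    static boolean declaresNoInstanceFields(Class<?> type) {
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            if (!Modifier.isStatic(field.getModifiers())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(declaresNoInstanceFields(Greeter.class)); // true
        System.out.println(declaresNoInstanceFields(Counter.class)); // false
    }
}
```

The point of the sketch is that the condition is a yes/no predicate over a class's structure, so it can be evaluated over an entire code base without human judgment.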


Example: Traits*

package java.lang;

import java.io.Serializable;

public abstract class Number implements Serializable {
    public byte byteValue() { return (byte) intValue(); }
    public short shortValue() { return (short) intValue(); }
    public abstract double doubleValue();
    public abstract float floatValue();
    public abstract int intValue();
    public abstract long longValue();
}

* Scharli, Ducasse, Nierstrasz and Black in ECOOP’03


μ-Patterns Examples

• Immutable: a class whose state never changes
• Sampler: offers customers a collection of pre-made instances
• Sink: does not propagate calls
• Record: has the look and feel of Pascal records
• Stateless: carries no state information
• Designator: an empty interface
• Implementor: gives body to abstract methods, without introducing any new methods

• Not mutually exclusive
• Discovered, not invented
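
A few of these one-line descriptions can be illustrated with small Java classes. The sketch below is only an informal reading of the descriptions on this slide, not the catalog's formal definitions, and every name in it (Auditable, Rgb, Shape, Square, MicroPatternExamples) is hypothetical. Rgb is written to fit two descriptions at once, in line with the remark that the patterns are not mutually exclusive.

```java
/** Designator: an empty interface, used only to tag types. */
interface Auditable {}

/** Reads as both Immutable (final fields, no setters) and Sampler (pre-made instances). */
final class Rgb implements Auditable {
    public static final Rgb BLACK = new Rgb(0, 0, 0);
    public static final Rgb WHITE = new Rgb(255, 255, 255);

    private final int r;
    private final int g;
    private final int b;

    public Rgb(int r, int g, int b) {
        this.r = r;
        this.g = g;
        this.b = b;
    }

    public int brightness() {
        return (r + g + b) / 3;
    }
}

/** A base type with a single abstract method. */
abstract class Shape {
    public abstract double area();
}

/** Implementor: gives a body to the inherited abstract method and adds no new methods. */
final class Square extends Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class MicroPatternExamples {
    public static void main(String[] args) {
        System.out.println(Rgb.WHITE.brightness());  // 255
        System.out.println(new Square(2.0).area());  // 4.0
    }
}
```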


μ-Pattern Classification

• Degeneracy
  – Degenerate State and Behavior: Designator, Taxonomy, Joiner, Pool
  – Degenerate State: Stateless, Common State, Immutable
  – Degenerate Behavior: Function Pointer, Function Object, Cobol Like
  – Controlled Creation: Restricted Creation, Sampler
• Containment
  – Wrappers: Box, Compound Box, Canopy
  – Data Managers: Record, Data Manager, Sink
• Inheritance
  – Inheritors: Implementor, Overrider, Extender
  – Base Classes: Outline, Trait, State Machine, Pure Type, Augmented Type, Pseudo Class

[Figure: map of the patterns by category; the State and Behavior axes each run from "more restricted" to "more general".]


Empirical Findings

• 90%-10% principle
  – Most classes use a mere fraction of Java’s design space
• 45% of all classes are trivial
  – One in ten classes is a wrapper of a single instance field
  – One in seven classes has no instance state
  – One in seven classes has no mutable state
  – One in seven classes is a sink
• Distinction between software collections
  – Patterns prevalence is not a property of the Java language
  – “Similar” collections tend to use patterns similarly


Five Most Popular Patterns

                Jedit   Scala   Shared  Jboss   Poseidon  Tomcat  Sun-14  AVG
Canopy          26.5%   3.9%    5.2%    6.2%    10.3%     4.6%    9.8%    9.5%
PureType        2.5%    20.5%   11.2%   11.3%   11.9%     5.6%    7.7%    10.1%
Overrider       23.1%   4.1%    10.4%   7.0%    16.8%     20.2%   12.4%   13.4%
Sink            9.0%    14.0%   15.0%   12.9%   11.3%     12.1%   20.6%   13.5%
Implementor     37.1%   10.5%   16.8%   23.0%   22.1%     12.7%   26.1%   21.2%
Coverage        70.6%   48.6%   51.8%   54.0%   61.9%     50.6%   61.4%   55.2%
Total Coverage  83.7%   79.4%   65.7%   76.2%   76.9%     67.3%   79.5%   75.5%

• 3 out of 4 classes have at least one pattern
• This supports the validity of the 90%-10% principle


Example of Statistical Inference

                      CS          EE          Total
Full Professors       18 (34%)    20 (42%)    38
Associate Professors  22 (42%)    13 (27%)    35
Assistant Professors  13 (24%)    15 (31%)    28
Total                 53          48          101

Probability = 31%
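
The slide does not name the test behind "Probability = 31%". Assuming it is Pearson's chi-square test of independence (an assumption, not something the slide states), the sketch below computes the statistic for the CS/EE table and the upper-tail probability for 2 degrees of freedom, where it has the closed form exp(-x/2). Under that assumption the result comes out at roughly 31%. The class and method names are hypothetical.

```java
/** Hypothetical helper: Pearson's chi-square test of independence for a contingency table. */
final class ChiSquareIndependence {

    /** Chi-square statistic for an r x c table of observed counts. */
    static double statistic(long[][] observed) {
        int rows = observed.length;
        int cols = observed[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double total = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSum[i] += observed[i][j];
                colSum[j] += observed[i][j];
                total += observed[i][j];
            }
        }
        double chi2 = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                // Expected count if row and column were independent.
                double expected = rowSum[i] * colSum[j] / total;
                double diff = observed[i][j] - expected;
                chi2 += diff * diff / expected;
            }
        }
        return chi2;
    }

    /** Upper-tail probability of the chi-square distribution with exactly 2 degrees of freedom. */
    static double pValueTwoDegreesOfFreedom(double chi2) {
        return Math.exp(-chi2 / 2.0);
    }

    public static void main(String[] args) {
        // Rows: full, associate, assistant professors. Columns: CS, EE.
        long[][] faculty = { {18, 20}, {22, 13}, {13, 15} };
        double chi2 = statistic(faculty);
        // A 3 x 2 table has (3 - 1) * (2 - 1) = 2 degrees of freedom.
        System.out.printf("chi2 = %.3f, p = %.3f%n", chi2, pValueTwoDegreesOfFreedom(chi2));
    }
}
```

A probability this high means the difference in rank distribution between the two departments could easily arise by chance. The same statistic function applies to a 2 x 2 table, but the probability then needs the 1-degree-of-freedom distribution, which this sketch does not implement.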


Distinction Between Collections: The Box pattern

                  Tomcat        Scala         Total
Box classes       124 (9%)      387 (14%)     511
Non-box classes   1310 (91%)    2291 (86%)    3601
Total             1434          2678          4112

Probability = 10^-7


Similarity of Software Collections

                 Coll-1  Coll-2  Coll-3  Coll-4  Coll-5
Box              5%      5%      1%      14%     3%
CompoundBox      5%      6%      6%      5%      10%
Sampler          1%      1%      1%      4%      0%
PseudoClass      1%      1%      0%      2%      0%
Canopy           9%      9%      26%     4%      4%
FunctionPointer  1%      2%      1%      1%      1%
Stateless        9%      10%     6%      15%     6%
#Classes         5,213   8,216   676     2,678   421

Collection names revealed on the slide: Ant, Scala, Jedit, Sun-14, Sun-13

• Sun’s JREs: High degree of similarity
  – E.g.: about every 10th class is Stateless
  – This invariant is kept regardless of the size of the collection


μ-Patterns as a Distinguishing Mark

• “Different” collections => Different prevalence values

• “Similar” collections => Similar prevalence values
  – Progressive versions of the same product
  – Same specification

• Most differences in prevalence are statistically significant


Patterns Multiplicity

[Bar chart: No. Classes (y-axis, 0 to 14,000) by No. Patterns matched (x-axis, 0 to 6)]

No. Patterns   0       1        2        3       4     5    6
No. Classes    9,947   13,039   12,415   5,377   558   89   18
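
The chart's counts can be tied back to the "3 out of 4 classes" coverage figure with a little arithmetic: coverage is the share of classes that match at least one pattern. The sketch below assumes the counts 9,947, 13,039, 12,415, 5,377, 558, 89 and 18 for 0 through 6 patterns; the last three are an assumed reading of the run-together chart labels. The class and method names are hypothetical.

```java
/** Hypothetical helper: coverage computed from a patterns-per-class histogram. */
final class PatternCoverage {

    /**
     * Fraction of classes that match at least one pattern.
     * histogram[k] is the number of classes matching exactly k patterns.
     */
    static double coverage(long[] histogram) {
        long total = 0;
        for (long count : histogram) {
            total += count;
        }
        return (double) (total - histogram[0]) / total;
    }

    public static void main(String[] args) {
        // Counts for 0..6 patterns; the last three values are an assumed reading.
        long[] histogram = {9947, 13039, 12415, 5377, 558, 89, 18};
        System.out.printf("coverage = %.1f%%%n", 100.0 * coverage(histogram));
    }
}
```

With these counts the coverage comes out near 76%, in line with the roughly 75% total coverage reported earlier.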


Leading Questions Revisited

• 90%-10% principle?
  – Yes… (our coverage: 75%)
  – Open question: what about the remaining 25% of the classes?
    • Boring? Interesting?
• What do simple classes look like?
  – See our catalog!
• Why do we need so many simple classes?
  – Well-written programs have many simple classes


The Lego Conjecture

Reusability diminishes with implementation complexity


Summary: Applications/Benefits

• Code generation by IDE/CASE tools
• Documentation tools (JavaDoc)
  – Automatic identification
  – Precise vocabulary
• Refactoring
  – Some patterns tend to go together
• Design
  – A useful collection of different flavors of classes
• Teaching


Summary: Further Research

• JTL: Java Tools Language
  – A formal language for describing Java elements
  – Can be used in:
    • AOP: Pointcut specification
    • Concepts: Type constraints
    • Pre/Post conditions in program transformation
• Language constructs
  – enum types: AugmentedType, Pool patterns
  – Functional programming in Java: FunctionObject pattern
• Nano patterns
  – Method-level patterns
• “Distance from pattern” function
  – Can be used to recommend code-correction steps


Questions?