OOPSLA’05: Gil, Maman
Micro Patterns in Java code
The Unbelievable Simplicity of Real Object-Oriented Programming
Itay Maman
The Technion – Israel Institute of Technology
Joint work with Yossi Gil
October 18, 2005
OOPSLA’05, San Diego
Java’s Design Space
• Classes:
  – final, abstract, inner, visibility, super-class, interfaces
• Fields:
  – static, final, visibility, signature
• Methods:
  – static, final, visibility, signature, abstract, overriding
• Constructors, initializers, …
The Patterns Conjecture
90% of the complexity is in 10% of the code
• The 90%-10% principle:
  – Most classes are simple
    • In most classes only a fraction of the design space is used
  – Complex classes are rare
• Rationale:
  – Programmers prefer to employ proven, familiar techniques
  – They reuse the same simple ideas that work over and over
Simple vs. Complex Classes
• Complex classes (10%)
  – Difficult to implement
  – Substantial role
    • Represent real life entities
    • Discovered during design
  – Rich state
• Simple classes (90%)
  – Implementation is almost trivial
  – Solve a local need in an ad-hoc fashion
  – Encapsulate minimal state (if any)
• Our focus: The “boring” 90% of the code
Leading questions:
• Is the 90%-10% principle true?
• What do simple classes look like?
• Why do we need so many simple classes?
The Methodology
• We set out to explore the nature of Java classes
  – Similar to the way a social-science survey is conducted
• Our empirical approach:
  – Step 1: Experiments -- Observe the world
  – Step 2: Conclusions -- Analyze the results
  – Step 3: Reflection -- Develop an underlying model of the world
• “Classical” computer-science research:
  – Step 1: Given some model, invent a new idea
  – Step 2: Experiment -- Test this new idea
  – Step 3: Conclusions -- Evaluate your work
Experimental Setting
• 14 huge software collections
  – 3,391 packages
  – 71,611 classes
  – 620,595 methods
• Domains
  – Libraries: 7 implementations/versions of the JRE
  – Applications:
    • GUI: JEdit and Poseidon
    • Server: JBoss and Tomcat
    • Compiler/language tools: Ant, Scala, MJC
• Testing system: automatic, static analysis of classfiles + manual inspection of results
μ-Patterns
“Formal condition on the type, structure, names, and attributes of a Java class and its components.”
1. Purposeful: fulfill a useful programming need
2. Limiting: restrict the design space variety
3. Mechanically recognizable: by an automatic checker
4. Simple: human understandable
Similar to Design Patterns, but (i) automatable, and (ii) at a lower level of abstraction.
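To make "mechanically recognizable" concrete, here is a minimal sketch of an automatic checker for two patterns from the catalog, Designator and Stateless. It is not the authors' tool: the talk describes static analysis of classfiles, whereas this sketch uses Java reflection as a stand-in, and the class name `MicroPatternSketch`, its methods, and the sample classes are all hypothetical. The Stateless condition (no instance fields anywhere in the superclass chain) is an approximation of "carries no state information".

```java
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Hypothetical recognizer for two micro patterns; not the authors' tool. */
final class MicroPatternSketch {

    /** Designator: an interface that declares no members at all. */
    static boolean isDesignator(Class<?> c) {
        return c.isInterface()
                && c.getDeclaredMethods().length == 0
                && c.getDeclaredFields().length == 0;
    }

    /** Stateless (approximation): a class with no instance fields, inherited ones included. */
    static boolean isStateless(Class<?> c) {
        if (c.isInterface() || c.isPrimitive() || c.isArray()) {
            return false;
        }
        // Walk up the superclass chain looking for any non-static field.
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            for (Field f : k.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    return false; // found per-instance state
                }
            }
        }
        return true;
    }

    /** Sample class with instance state, used only for the demonstration. */
    static final class Counter {
        int count;
    }

    /** Sample class with a constant but no instance state. */
    static final class Greeter {
        static final String PREFIX = "Hello, ";

        String greet(String name) {
            return PREFIX + name;
        }
    }

    public static void main(String[] args) {
        System.out.println(isDesignator(Serializable.class)); // true
        System.out.println(isDesignator(Runnable.class));     // false
        System.out.println(isStateless(Greeter.class));       // true
        System.out.println(isStateless(Counter.class));       // false
    }
}
```

Each condition is a plain predicate over the declared members of a class, which is what allows a tool to apply it to tens of thousands of classes without human judgment.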
Example: Traits*
package java.lang;

import java.io.Serializable;

public abstract class Number implements Serializable {
    public byte byteValue() { return (byte) intValue(); }
    public short shortValue() { return (short) intValue(); }
    public abstract double doubleValue();
    public abstract float floatValue();
    public abstract int intValue();
    public abstract long longValue();
}

* Schärli, Ducasse, Nierstrasz and Black in ECOOP’03
μ-Patterns Examples
• Immutable: a class whose state never changes
• Sampler: offers customers a collection of pre-made instances
• Sink: does not propagate calls
• Record: has the look and feel of Pascal records
• Stateless: carries no state information
• Designator: an empty interface
• Implementor: gives body to abstract methods, without introducing any new methods
Not mutually exclusive
Discovered, not invented
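A small illustration of three of the patterns listed above may help, and it also shows why they are not mutually exclusive. All names here (`Persistable`, `Rgb`, `RgbDemo`) are hypothetical and chosen only for the example. The Sampler reading assumed here is a class with a public constructor that also exposes a few pre-made instances as static fields.

```java
/** Designator: an empty interface, used only as a tag. */
interface Persistable { }

/**
 * Immutable: every instance field is final and is set once, in the constructor.
 * Sampler: a public constructor plus a few pre-made instances.
 * The same class matches both patterns.
 */
final class Rgb implements Persistable {
    public static final Rgb BLACK = new Rgb(0, 0, 0);
    public static final Rgb WHITE = new Rgb(255, 255, 255);

    private final int red;
    private final int green;
    private final int blue;

    public Rgb(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
    }

    public int red() { return red; }
    public int green() { return green; }
    public int blue() { return blue; }

    /** Returns a new value instead of mutating this one. */
    public Rgb withRed(int newRed) {
        return new Rgb(newRed, green, blue);
    }
}

final class RgbDemo {
    public static void main(String[] args) {
        Rgb darkRed = Rgb.BLACK.withRed(128);
        System.out.println(darkRed.red());   // 128
        System.out.println(Rgb.BLACK.red()); // 0, the pre-made instance is unchanged
    }
}
```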
μ-Pattern Classification
• Degeneracy
  – Degenerate State and Behavior: Designator, Taxonomy, Joiner, Pool
  – Degenerate State: Stateless, Common State, Immutable
  – Degenerate Behavior: Function Pointer, Function Object, Cobol Like
  – Controlled Creation: Restricted Creation, Sampler
• Containment
  – Wrappers: Box, Compound Box, Canopy
  – Data Managers: Record, Data Manager, Sink
• Inheritance
  – Inheritors: Implementor, Overrider, Extender
  – Base Classes: Outline, Trait, State Machine, Pure Type, Augmented Type, Pseudo Class
[Figure: map of the eight categories along two axes, State and Behavior, each running from more restricted to more general]
Empirical Findings
• 90%-10% principle
  – Most classes use a mere fraction of Java’s design space
• 45% of all classes are trivial
  – One in ten classes is a wrapper of a single instance field
  – One in seven classes has no instance state
  – One in seven classes has no mutable state
  – One in seven classes is a sink
• Distinction between software collections
  – Pattern prevalence is not a property of the Java language
  – “Similar” collections tend to use patterns similarly
Five Most Popular Patterns
                Jedit   Scala   Shared  Jboss   Poseidon  Tomcat  Sun-14  AVG
Canopy          26.5%   3.9%    5.2%    6.2%    10.3%     4.6%    9.8%    9.5%
PureType        2.5%    20.5%   11.2%   11.3%   11.9%     5.6%    7.7%    10.1%
Overrider       23.1%   4.1%    10.4%   7.0%    16.8%     20.2%   12.4%   13.4%
Sink            9.0%    14.0%   15.0%   12.9%   11.3%     12.1%   20.6%   13.5%
Implementor     37.1%   10.5%   16.8%   23.0%   22.1%     12.7%   26.1%   21.2%
Coverage        70.6%   48.6%   51.8%   54.0%   61.9%     50.6%   61.4%   55.2%
Total Coverage  83.7%   79.4%   65.7%   76.2%   76.9%     67.3%   79.5%   75.5%
• 3 out of 4 classes have at least one pattern
• This supports the validity of the 90%-10% principle
Example of Statistical Inference
                      CS        EE        Total
Full Professors       18 (34%)  20 (42%)  38
Associate Professors  22 (42%)  13 (27%)  35
Assistant Professors  13 (24%)  15 (31%)  28
Total                 53        48        101
Probability = 31%
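The slide does not name the test behind the 31% figure. As a sketch, assuming it is Pearson's chi-square test of independence, the computation on the faculty table can be written as follows. The class and method names are hypothetical. With a 3×2 table there are 2 degrees of freedom, and in that special case the upper-tail probability has the closed form exp(-χ²/2), so no statistics library is needed.

```java
import java.util.Locale;

/** Hypothetical sketch: chi-square test of independence on a small contingency table. */
final class ChiSquareSketch {

    /** Pearson chi-square statistic for a table of observed counts. */
    static double statistic(long[][] observed) {
        int rows = observed.length;
        int cols = observed[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double total = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSum[i] += observed[i][j];
                colSum[j] += observed[i][j];
                total += observed[i][j];
            }
        }
        double chi2 = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                // Expected count if row and column were independent.
                double expected = rowSum[i] * colSum[j] / total;
                double diff = observed[i][j] - expected;
                chi2 += diff * diff / expected;
            }
        }
        return chi2;
    }

    /** Upper-tail probability for exactly 2 degrees of freedom: exp(-x/2). */
    static double pValueTwoDegrees(double chi2) {
        return Math.exp(-chi2 / 2.0);
    }

    public static void main(String[] args) {
        // Rows: full, associate, assistant professors. Columns: CS, EE.
        long[][] faculty = { {18, 20}, {22, 13}, {13, 15} };
        double chi2 = statistic(faculty);
        System.out.println(String.format(Locale.ROOT,
                "chi2 = %.2f, p = %.2f", chi2, pValueTwoDegrees(chi2)));
        // prints "chi2 = 2.32, p = 0.31"
    }
}
```

Under that assumption the result matches the slide: a difference in rank distribution at least this large would arise by chance about 31% of the time, so the two departments are not distinguishable on this evidence.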
Distinction Between Collections: The Box pattern
                  Tomcat      Scala       Total
Box classes       124 (9%)    387 (14%)   511
Non-box classes   1310 (91%)  2291 (86%)  3601
Total             1434        2678        4112
Probability = 10^-7
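For a 2×2 table such as this one, the chi-square statistic has a well-known shortcut formula, χ² = N(ad − bc)² / ((a+b)(c+d)(a+c)(b+d)). The sketch below applies it to the Box counts, again assuming (the slide does not say) that a chi-square test is what produced the probability. The class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical sketch: chi-square statistic for a 2x2 table via the shortcut formula. */
final class BoxTableSketch {

    /**
     * Table layout:
     *            col 1  col 2
     *   row 1      a      b
     *   row 2      c      d
     * Doubles are used so the large intermediate products cannot overflow.
     */
    static double chiSquare2x2(double a, double b, double c, double d) {
        double n = a + b + c + d;
        double cross = a * d - b * c;
        return n * cross * cross / ((a + b) * (c + d) * (a + c) * (b + d));
    }

    public static void main(String[] args) {
        // Row 1: Box classes (Tomcat, Scala). Row 2: non-box classes (Tomcat, Scala).
        double chi2 = chiSquare2x2(124, 387, 1310, 2291);
        System.out.println(String.format(Locale.ROOT, "chi2 = %.1f", chi2));
        // prints "chi2 = 28.9"
    }
}
```

With 1 degree of freedom, the critical value for a probability of 0.001 is about 10.83. A statistic near 28.9 lies far beyond it, which is consistent with the very small probability shown on the slide.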
Similarity of Software Collections
                 Coll-1  Coll-2  Coll-3  Coll-4  Coll-5
Box              5%      5%      1%      14%     3%
CompoundBox      5%      6%      6%      5%      10%
Sampler          1%      1%      1%      4%      0%
PseudoClass      1%      1%      0%      2%      0%
Canopy           9%      9%      26%     4%      4%
FunctionPointer  1%      2%      1%      1%      1%
Stateless        9%      10%     6%      15%     6%
#Classes         5,213   8,216   676     2,678   421
Collections revealed: Coll-1 = Sun-13, Coll-2 = Sun-14, Coll-3 = Jedit, Coll-4 = Scala, Coll-5 = Ant
• Sun’s JREs: High degree of similarity
  – E.g.: Every 10th class should be Stateless
  – This invariant is kept regardless of the size of the collection
μ-Patterns as a Distinguishing Mark
• “Different” collections => Different prevalence values
• “Similar” collections => Similar prevalence values
  – Progressive versions of the same product
  – Same specification
• Most differences in prevalence are statistically significant
Patterns Multiplicity
[Figure: bar chart of No. Classes (y-axis, 0 to 14,000) against No. Patterns matched per class (x-axis, 0 to 6)]
No. Patterns  No. Classes
0             9,947
1             13,039
2             12,415
3             5,377
4             558
5             89
6             18
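The chart's bar labels can be cross-checked against the coverage figure quoted in the talk. The sketch below assumes the seven labels read, in order, as the class counts for 0 through 6 matched patterns. That is an interpretation of the figure, and the class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical sketch: coverage derived from the pattern-multiplicity histogram. */
final class MultiplicitySketch {

    /** Index = number of patterns a class matches; value = number of such classes (assumed reading). */
    static final long[] CLASSES_BY_PATTERN_COUNT = {9_947, 13_039, 12_415, 5_377, 558, 89, 18};

    static long total(long[] histogram) {
        long sum = 0;
        for (long count : histogram) {
            sum += count;
        }
        return sum;
    }

    /** Fraction of classes that match at least one pattern. */
    static double coverage(long[] histogram) {
        long all = total(histogram);
        return (all - histogram[0]) / (double) all;
    }

    public static void main(String[] args) {
        System.out.println(total(CLASSES_BY_PATTERN_COUNT)); // 41443
        System.out.println(String.format(Locale.ROOT, "%.2f", coverage(CLASSES_BY_PATTERN_COUNT)));
        // prints "0.76"
    }
}
```

Under that reading, about 76% of the classes in the chart match at least one pattern, in line with the "3 out of 4 classes" figure reported elsewhere in the talk.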
Leading Questions Revisited
• 90%-10% principle?
  – Yes… (our coverage: 75%)
  – Open question: what’s with the remaining 25% of the classes?
    • Boring? Interesting?
• What do simple classes look like?
  – See our catalog!
• Why do we need so many simple classes?
  – Well written programs have many simple classes
The Lego Conjecture
Reusability diminishes with implementation complexity
Summary: Applications/Benefits
• Code generation by IDE/CASE tools
• Documentation tools (JavaDoc)
  – Automatic identification
  – Precise vocabulary
• Refactoring
  – Some patterns tend to go together
• Design
  – A useful collection of different flavors of classes
• Teaching
Summary: Further Research
• JTL: Java Tools Language
  – A formal language for describing Java elements
  – Can be used in:
    • AOP: Pointcut specification
    • Concepts: Type constraints
    • Pre/Post condition in program transformation
• Language constructs
  – enum types ↔ AugmentedType, Pool patterns
  – Functional Programming in Java ↔ FunctionObject pattern
• Nano patterns
  – Method level patterns
• “Distance from pattern” function
  – Can be used to recommend code-correction steps