language interaction and quality issues: an exploratory study

Languages interaction and possible effects: an exploratory study Antonio Vetrò - Federico Tomassetti Marco Torchiano - Maurizio Morisio

Upload: marco-torchiano

Post on 13-Jul-2015

Page 1: Language Interaction and Quality Issues: An Exploratory Study

Languages interaction and possible effects: an exploratory study

Antonio Vetrò - Federico Tomassetti - Marco Torchiano - Maurizio Morisio

Page 2: Language Interaction and Quality Issues: An Exploratory Study

No one writes in a single language anymore. Even trivial applications have a general-purpose language, SQL, JavaScript, CSS, and dozens of frameworks, each of which includes an external DSL

Wampler 2010

Page 3: Language Interaction and Quality Issues: An Exploratory Study

How do those languages interact?

Is that interaction problematic?

Page 4: Language Interaction and Quality Issues: An Exploratory Study

Research questions

RQ1 How much interaction is there between the languages used in a project?

RQ2 Which language pairs interact more?

RQ3 Are Cross Language Modules more defect-prone than Intra Language Modules?

Page 5: Language Interaction and Quality Issues: An Exploratory Study

Plan

• Define a measure for the level of interaction among languages

• Investigate interaction vs. defect proneness

• Perform a case study

Page 6: Language Interaction and Quality Issues: An Exploratory Study

The Case Study

Apache Hadoop, a software framework that supports distributed data storage and processing.

Used in many real applications (e.g., Yahoo, Facebook).

Page 7: Language Interaction and Quality Issues: An Exploratory Study

Commit types

[Figure: commits touching files of Language A (.extA) and Language B (.extB), labeled as Cross-Language Commit (CLC) or Intra-Language Commit (ILC)]
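The distinction between the two commit types can be sketched in a few lines of Python. This is only a minimal illustration, assuming that a commit is represented as a list of changed file paths and that a file's language is approximated by its extension; `classify_commit` and the sample paths are hypothetical names, not part of the study's tooling.

```python
import os

def extension(path):
    """Return the lowercase file extension without the dot ('' if none)."""
    return os.path.splitext(path)[1].lstrip(".").lower()

def classify_commit(paths):
    """Label a commit 'CLC' if it touches more than one extension, else 'ILC'.

    Assumption: language is approximated by file extension.
    """
    extensions = {extension(p) for p in paths}
    return "CLC" if len(extensions) > 1 else "ILC"

# Hypothetical commits, for illustration only.
print(classify_commit(["src/Job.java", "conf/core-site.xml"]))  # CLC
print(classify_commit(["src/Job.java", "src/Task.java"]))       # ILC
```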

Page 8: Language Interaction and Quality Issues: An Exploratory Study

RQ1 How much interaction is there between the languages present in a project?

All (RQ1.1)  Bug   Improvement  New Feature  Sub task  Task  Test
0.53         0.12  0.26         0.30         0.45      0.26  0.05

Metric: Percentage of Cross-Language Commits

• All types of commits (RQ1.1)

• Commits divided by activity type (e.g., improvement, bug fixing, new feature) (RQ1.2)
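As a rough sketch of how this metric could be computed, the snippet below derives the share of cross-language commits overall and per activity type. It assumes each commit has already been labeled with an activity type and a cross-language flag; the function name `clc_share_by_type` and the sample data are hypothetical and do not reproduce the study's numbers.

```python
from collections import defaultdict

def clc_share_by_type(commits):
    """commits: iterable of (activity_type, is_cross_language) pairs.

    Returns {activity_type: fraction of cross-language commits},
    plus an 'All' entry aggregated over every commit passed in.
    """
    totals = defaultdict(int)
    cross = defaultdict(int)
    for activity, is_clc in commits:
        for key in (activity, "All"):
            totals[key] += 1
            cross[key] += int(is_clc)
    return {key: cross[key] / totals[key] for key in totals}

# Hypothetical labeled commits, for illustration only.
sample = [("Bug", False), ("Bug", True), ("Task", True), ("Task", True)]
shares = clc_share_by_type(sample)
print(shares)
```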

Page 9: Language Interaction and Quality Issues: An Exploratory Study

Cross Language Ratio

[Figure: module m and the commits that involve it, with files in Language A (.extA), Language B (.extB), and Language C (.extC)]

3 out of 4 commits involving m are Cross-Language

Cross Language Ratio of module m: CLRm = 0.75
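The example on this slide (3 cross-language commits out of 4) can be reproduced with a small sketch. It assumes commits are lists of file paths and that a commit is cross-language for m when it also touches a file with a different extension; `cross_language_ratio` and the file names are hypothetical.

```python
import os

def ext(path):
    """Lowercase extension of a path, including the leading dot."""
    return os.path.splitext(path)[1].lower()

def cross_language_ratio(module, commits):
    """Fraction of the commits touching `module` that also touch
    at least one file with a different extension."""
    involving = [c for c in commits if module in c]
    if not involving:
        return None  # module never committed: ratio undefined
    cross = [c for c in involving if any(ext(p) != ext(module) for p in c)]
    return len(cross) / len(involving)

# Hypothetical history: four commits involve m, three are cross-language.
commits = [
    ["m.extA", "x.extB"],
    ["m.extA", "y.extC"],
    ["m.extA", "x.extB", "z.extA"],
    ["m.extA", "z.extA"],
    ["x.extB"],
]
clr_m = cross_language_ratio("m.extA", commits)
print(clr_m)  # 0.75
```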

Page 10: Language Interaction and Quality Issues: An Exploratory Study

Interaction level of a language

• Cross language ratio of an extension (language)

Page 11: Language Interaction and Quality Issues: An Exploratory Study

RQ2 Which extensions interact more?

CLRext  Nr files  Extension
0.96    49        c
0.87    114       sh
0.72    75        properties
0.71    320       xml
0.59    4328      java

Metric: CLRext

Considering one extension versus all the other extensions (RQ2.1)

Page 12: Language Interaction and Quality Issues: An Exploratory Study

Focusing on extension pairs

[Figure: module m and the commits that involve it, with files in Language A (.extA), Language B (.extB), and Language C (.extC)]

2 out of 3 commits involving m together with extA are Cross Language

Cross Language Ratio of module m w.r.t. extA: CLRm,extA = 0.67

Page 13: Language Interaction and Quality Issues: An Exploratory Study

Interaction level of a pair

• Cross language ratio of an extension w.r.t. another extension

– Asymmetrical measure!
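Why a pairwise ratio is asymmetrical can be shown with a small sketch. The slides do not spell out the exact formula, so the definition below is an assumption made purely for illustration: the ratio of extension A with respect to extension B is taken as the share of commits touching A files that also touch B files. The name `pair_ratio` and the sample commits are hypothetical.

```python
import os

def extensions_of(commit):
    """Set of lowercase extensions (without dot) touched by a commit."""
    return {os.path.splitext(p)[1].lstrip(".").lower() for p in commit}

def pair_ratio(ext_a, ext_b, commits):
    """Share of commits touching ext_a files that also touch ext_b files.

    Assumption: illustrative commit-level definition, not necessarily
    the formula used in the study.
    """
    with_a = [c for c in commits if ext_a in extensions_of(c)]
    if not with_a:
        return None
    with_both = [c for c in with_a if ext_b in extensions_of(c)]
    return len(with_both) / len(with_a)

# Hypothetical history: C rarely changes alone, Java usually does.
commits = [
    ["native.c", "Wrapper.java"],
    ["Job.java"],
    ["Task.java"],
    ["Node.java"],
]
print(pair_ratio("c", "java", commits))  # 1.0
print(pair_ratio("java", "c", commits))  # 0.25
```

The denominators differ (commits touching C versus commits touching Java), so the two directions generally give different values.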

Page 14: Language Interaction and Quality Issues: An Exploratory Study

RQ2 Which extensions interact more?

extA/extB   C     Java  Properties  Sh
C           -     0.51  0.10        0.50
Java        0.01  -     0.28        0.04
Properties  0     0.54  -           0.36
Sh          0.09  0.22  0.24        -
Xml         0.04  0.52  0.43        0.24

Considering the most interacting ordered pairs of extensions (RQ2.2).

Metric: CLRextA,extB

Page 15: Language Interaction and Quality Issues: An Exploratory Study

Cross vs. Intra Lang Modules

Cross Language Module (CLM): CLR is ≥ t%

Intra Language Module (ILM): CLR is < t%

t = 50%
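The threshold rule is simple enough to state as code. In this sketch CLR is expressed as a fraction in [0, 1], so t = 50% becomes 0.5; `classify_module` is a hypothetical helper name.

```python
def classify_module(clr, threshold=0.5):
    """Return 'CLM' when CLR >= threshold, otherwise 'ILM'.

    `clr` and `threshold` are fractions in [0, 1] (t = 50% -> 0.5).
    """
    return "CLM" if clr >= threshold else "ILM"

print(classify_module(0.75))  # CLM
print(classify_module(0.50))  # CLM (the boundary counts as cross-language)
print(classify_module(0.49))  # ILM
```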

Page 16: Language Interaction and Quality Issues: An Exploratory Study

RQ3 Are Cross Language Modules more defect-prone?

Extension   ILM no def.  ILM def.  CLM no def.  CLM def.  p-value  OR
all         1891         225       2875         89        <0.001   0.26
c           2            0         46           1         1.000    Inf
java        1692         201       2239         25        <0.001   0.09
properties  19           1         45           7         0.429    2.92
sh          10           5         64           13        0.162    0.41
xml         96           11        184          24        0.851    1.14

Metric: Odds ratio of CLMs with/without defects vs. ILMs with/without defects

- all modules regardless of extension (RQ3.1)

- by extension (RQ3.2)
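The odds ratios in the table can be approximated from the four counts in each row. The sketch below uses the plain sample odds ratio, (CLM def. / CLM no def.) / (ILM def. / ILM no def.); the helper name `odds_ratio` is hypothetical. This estimator gives 0.26, 0.09, 0.41, and 1.14 for the all, java, sh, and xml rows, but about 2.96 rather than 2.92 for properties, so the study presumably used a slightly different estimator (for example the conditional estimate that accompanies Fisher's exact test; this is an assumption, the slides do not say).

```python
import math

def odds_ratio(ilm_no_def, ilm_def, clm_no_def, clm_def):
    """Sample odds ratio: odds of a defect among CLMs divided by
    odds of a defect among ILMs."""
    numerator = clm_def * ilm_no_def
    denominator = clm_no_def * ilm_def
    if denominator == 0:
        # Division by zero: infinite odds ratio, or undefined for 0/0.
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator

# Counts taken from the table: ILM no def., ILM def., CLM no def., CLM def.
rows = {
    "all": (1891, 225, 2875, 89),
    "c": (2, 0, 46, 1),
    "java": (1692, 201, 2239, 25),
    "sh": (10, 5, 64, 13),
    "xml": (96, 11, 184, 24),
}
for name, counts in rows.items():
    print(name, round(odds_ratio(*counts), 2))
```

A value below 1 means CLMs were less likely than ILMs to contain defects in that group; a value above 1 means the opposite.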

Page 17: Language Interaction and Quality Issues: An Exploratory Study

RQ3 Are Cross Language Modules more defect-prone?

extA/extB   C     Java  Properties  sh     XML
C           -     Inf   0           0      Inf
Java        2.79  -     0.32        0.43   0.96
Properties  Inf   1     -           12.08  0.94
Sh          3.55  4.45  17.17       -      7.44
Xml         3.83  0.95  3.22        4.73   -

Considering interaction between specific ordered pairs of extensions (RQ3.3).

Significant values are in bold.

Metric: Odds ratio of CLMs with/without defects vs. ILMs with/without defects

Page 18: Language Interaction and Quality Issues: An Exploratory Study

Threats

• Confounding factors: age and size of modules

• Use of a proxy for interaction between artifacts

• Apache Hadoop representativeness

• Renaming of modules

Page 19: Language Interaction and Quality Issues: An Exploratory Study

Conclusions

Page 20: Language Interaction and Quality Issues: An Exploratory Study

Language interaction depends on the type of activity

Page 21: Language Interaction and Quality Issues: An Exploratory Study

Frequent interactions are generally not symmetric

Many of them involve XML

Page 22: Language Interaction and Quality Issues: An Exploratory Study

Several language pairs have CLMs significantly more defect-prone than ILMs (see C)

In general, however, language interaction is not related to higher defect proneness (see Java)

Page 23: Language Interaction and Quality Issues: An Exploratory Study

Antonio Vetrò - Federico Tomassetti - Marco Torchiano - Maurizio Morisio

Languages interaction and possible effects: an exploratory study

Questions?