CSCI 6960- Research Methods

- 1 -HO 3

© Houman Younessi 2007

Lecture 3

Measurement and Metrics

Why do we need to measure anything?

Because we seek confirmation of our experiences and of our “theories”.

Measurement has become a basic tenet of our rational approach to the expansion of human knowledge.

The scientific method has as its basis the measurement of phenomena of interest to us in order to develop quantitative descriptions of these.

Quantification is therefore the very basis of modern science.

Measurement has also had a correspondingly profound impact on all fields of engineering. In fact it can be safely asserted that modern engineering is defined in terms of its scientific basis, its quantification of relationships and its measurement based approaches.

For software engineering to qualify as a true engineering discipline, it too must adopt (more correctly, develop) an empirical, measurement based foundation.

Unfortunately, thus far much of computer science, and particularly “software engineering” research work, has been of an “advocacy” nature.

Advocacy can be useful, in fact crucial, as a necessary precursor to the development of any field into a scientifically based discipline.

However, there comes a time when the field has to go beyond being based on the untested (or at least inadequately tested) recommendations of authority figures. History has many examples.

Biology

Medicine?

Psychology

Anthropology

...

Physics

History

We are at a crossroads.

We are sitting under the apple tree.

As pioneers and practitioners of a young discipline trying to make the transition, we must be vigilant. There are still many techniques, even “metrics”, that are proposed on the basis of inadequate experience, theory or confirmation. These are mere proclamations. Although they are sometimes useful, and often even necessary for the support and development of the practice, they must be treated with extreme caution.

So what is a measure?

An empirical objective assignment of a number or symbol to an entity to characterize a specific attribute. (Fenton, 1991)

Dimension or quantity reckoned by some standard. (Webster’s Dictionary).

This means that a measure is not just a number but characterizes a mapping between the manifestation of an aspect of interest in an element or entity within our universe of discourse and a mathematical or symbolic system of ranking and comparison. In so doing, the aspect of interest is called the “attribute”, and the mathematical system of ranking and comparison into which these attributes are mapped is called a “scale”. The action of producing the said mapping is termed “measurement”.

So a direct (or atomic) measure is a quantification of a directly observed aspect of a phenomenon, obtained by mapping that aspect to a numerical or symbolic value on a scale.

Scales:

The Nominal Scale: A simple mapping into a number of disjoint sets without regard to any other relationships. This is a naming scale.

The Ordinal Scale: A mapping based on rank value. This creates an ordered category.

The Interval Scale: A mapping in which both the ordering and the distance between the values of attributes can be deduced.

The Ratio Scale: A mapping from the real world onto the set of real numbers.

Only certain transformations are allowed, or are indeed meaningful, in relation to any of the aforementioned scales. For example:

Given that there are 7 males and 9 females in a room, what is the “average” gender of the individuals in this room?

Given that Ada, C, C++ and SQL are the programming languages used in project A and C and SQL the ones used in project B, what is the minimum language used in each project?

It is therefore important to know what operations are permissible when dealing with measured quantities:

Measurement and Permissibility

| | Nominal | Ordinal | Interval | Ratio |
|---|---|---|---|---|
| Properties | Identity | Identity, magnitude | Identity, magnitude, equal intervals | Identity, magnitude, equal intervals, zero |
| Relations | Equivalence | Equivalence, less than | Equivalence, less than, ratio of intervals | Equivalence, less than, ratio of intervals, ratio of values |
| Operations | None | Rank order | Add, subtract | All |
| Type of Data | Names, labels | Ordered data | Score | Absolute score |
| Statistics | Mode, frequency | Median, percentile | Mean, std. dev., Pearson correlation | All, e.g. geometric mean, coefficient of variation |
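The permissibility table can be turned into a small guard that refuses a statistic when the scale of the data is too weak for it. The following is only a sketch: the names `Scale`, `MINIMUM_SCALE` and `summarize` are illustrative assumptions, and only three statistics from the table are covered.

```python
from enum import IntEnum
from statistics import mean, median, mode


class Scale(IntEnum):
    """Scale types ordered by increasing power (left to right in the table)."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Weakest scale on which each statistic is meaningful, as read from the table.
MINIMUM_SCALE = {
    "mode": Scale.NOMINAL,
    "median": Scale.ORDINAL,
    "mean": Scale.INTERVAL,
}

STATISTIC_FUNCTIONS = {"mode": mode, "median": median, "mean": mean}


def summarize(values, scale, statistic):
    """Compute a statistic only if the scale of the data permits it."""
    if scale < MINIMUM_SCALE[statistic]:
        raise ValueError(
            f"{statistic} is not meaningful on the {scale.name.lower()} scale"
        )
    return STATISTIC_FUNCTIONS[statistic](values)


# The room from the earlier example: 7 males and 9 females (nominal data).
genders = ["male"] * 7 + ["female"] * 9
print(summarize(genders, Scale.NOMINAL, "mode"))  # most frequent label

try:
    summarize(genders, Scale.NOMINAL, "mean")  # the "average gender" question
except ValueError as error:
    print(error)
```

The scale is passed in explicitly because it is a property of how the data were measured, not something that can be inferred from the values themselves.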

An important fact about scales is the power of each scale as a means of measurement. Generally as we go from left to right on the table just presented, the “power” of the scale increases.

Any phenomenon may be measured on any scale, given an understanding of the underlying principles. The aim of science is to measure more and more observable phenomena using as high a scale as possible. For example:

We want to measure the temperature of a number of objects, say objects A, B, C, and D.

On the nominal scale we can say something like: there is category 1 and category 13, and we assign A and D to category 1 and the other two to category 13, based on some arrangement.

Not terribly useful but still a measurement.

In the ordinal scale we say that we have an ordering based on the amount of perceived heat in an object. We devise a three level scale of Cold, Warm, and Hot. We then place A in category Cold, B and D in Category Warm and C in Hot.

A bit more useful, but can we use this scale in a sophisticated scientific laboratory when minute changes in temperature need be measured?

Using the interval scale, we can say that we have a scale divided into n equal parts called, say, degrees. The difference between the temperature of x (which is presumed to always be constant) and y (whose temperature is also presumed to be constant but different from x) is then divided into n equal distances, each called a degree. We still need to devise a means of assessing how far up the scale any one artifact w actually registers (a thermometer). To do so we need deeper knowledge of the universe in relation to the concept of temperature than we did with the previous scales.

What is the drawback of this scale?

Using the ratio scale, we might use the very in-depth knowledge that the temperature of an object relates to the amount of energy per unit of mass possessed by that object, or the level of atomic excitation of the body. We can now say that if a body is in a completely non-excited state, it lacks heat and therefore should rate a zero for its temperature (absolute zero). As the level of excitation increases, we can correspondingly increase the reading for the temperature of the body in question, based on some pre-agreed scale that is homogeneous with the rate of increase of energy (or molecular excitation) in that body. In fact, we could measure the temperature of the body in terms of the size of this excitation.
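The practical difference between the interval and ratio scales can be seen by comparing Celsius (an interval scale with an arbitrary zero) with Kelvin (a ratio scale whose zero is absolute zero). The sketch below uses two made-up temperatures; the variable names are illustrative only.

```python
def celsius_to_kelvin(celsius):
    """Convert from the Celsius (interval) scale to the Kelvin (ratio) scale."""
    return celsius + 273.15


object_a_celsius = 10.0
object_b_celsius = 20.0
object_a_kelvin = celsius_to_kelvin(object_a_celsius)
object_b_kelvin = celsius_to_kelvin(object_b_celsius)

# Differences are meaningful on both scales and agree with each other.
difference_celsius = object_b_celsius - object_a_celsius
difference_kelvin = object_b_kelvin - object_a_kelvin

# Ratios are only meaningful where zero means "no thermal energy at all".
naive_ratio = object_b_celsius / object_a_celsius  # 2.0, yet B is not "twice as hot"
true_ratio = object_b_kelvin / object_a_kelvin     # only slightly above 1

print(difference_celsius, round(difference_kelvin, 2))
print(naive_ratio, round(true_ratio, 3))
```

Subtraction gives the same answer on both scales, while division gives a physically meaningful answer only on the scale with a true zero.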

Question:

A measure of the “quality” of a given process of software development may be given by evaluating that process using the SEI’s Capability Maturity Model (CMM).

What scale is this model on?

What chances do you give the measurement?

Composite Measures and Indirect Scales:

In order to come up with “more powerful” or “higher” scale measures of a phenomenon, we usually resort not to a direct observation and ranking in terms of an atomic measure but an indirect one. Examples:

Temperature (just seen)

Velocity of moving objects.

There are however rules that apply in how we can combine measurements on various scales. This is a challenging discussion not without its difficulties. The general rule however is that:

The scale type for an indirect measure (M) is only as strong as the weakest of the atomic scale types that compose it.

Unfortunately this is one rule that is often broken in software engineering, leading at times to unusable or misleading results.

Example:

In Halstead’s equation for programming effort: e=V/L; V is Program Volume (on a ratio scale) and L is Program Level (on an ordinal scale). Halstead however claims that e represents the number of mental discriminations necessary to implement a program which ought to be represented on a ratio scale (as it is a count measure)!!!
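The weakest-scale rule can be expressed as taking the minimum over the scale types of the components. The following is a minimal sketch applied to the Halstead example, using the scale types stated in the text; `Scale` and `composite_scale` are illustrative names, not part of any library.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Scale types ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def composite_scale(*component_scales):
    """Return the scale of an indirect measure: the weakest component scale."""
    if not component_scales:
        raise ValueError("an indirect measure needs at least one component")
    return min(component_scales)


# Halstead's effort e = V / L, with V on a ratio scale and L on an ordinal scale.
volume_scale = Scale.RATIO
level_scale = Scale.ORDINAL
effort_scale = composite_scale(volume_scale, level_scale)
print(effort_scale.name)  # ORDINAL
```

Under this rule e can be at most ordinal, which is what conflicts with interpreting it as a count on a ratio scale.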

Dimensionality:

Another concept often ignored when “measurement” is used or proposed in software engineering is the concept of dimensionality.

It requires that not only the scales but also the dimensions on the right-hand and left-hand sides of an equation be identical.

Example:

Although both SLOC and No. of Loops are on the ratio scale, and addition is permissible on that scale, adding SLOC and No. of Loops is probably nonsensical.
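One way to guard against this kind of mistake is to carry the dimension along with each value and refuse to add quantities whose dimensions differ. The sketch below assumes a hypothetical `Quantity` class and made-up counts; it only checks addition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A measured value tagged with its unit (its dimension)."""
    value: float
    unit: str

    def __add__(self, other):
        # Addition is only defined between quantities of the same dimension.
        if self.unit != other.unit:
            raise TypeError(f"cannot add {other.unit} to {self.unit}")
        return Quantity(self.value + other.value, self.unit)


module_a = Quantity(1200, "SLOC")
module_b = Quantity(800, "SLOC")
loops = Quantity(35, "loops")

print(module_a + module_b)  # same dimension, so the sum is meaningful

try:
    module_a + loops  # both are ratio-scale counts, but the dimensions differ
except TypeError as error:
    print(error)
```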

Desirable properties of Measurement:

Reliability

Effectiveness of range

Validity

Consonance

Dimensionality

Practicality

Reliability

When any two measures of the same entity, made in the same way and independently, agree, we have measurement reliability.

A measure is reliable if it meets the correlation condition.

If M1(A) is a measure of A obtained through experiment 1 and M2(A) is the measure of the same attribute obtained through experiment 2, and if |M1(A) - M2(A)| ≈ 0, then we have the correlation condition satisfied.
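In practice two measurements rarely agree exactly, so the correlation condition has to be checked against some tolerance. The sketch below reads the condition as "the absolute difference is close to zero"; that reading, the tolerance values and the sample counts are all assumptions made for illustration.

```python
def satisfies_correlation_condition(m1, m2, tolerance):
    """Check |M1(A) - M2(A)| <= tolerance for two measures of the same attribute."""
    return abs(m1 - m2) <= tolerance


# Hypothetical data: two independent counts of the same module's size in SLOC.
first_count = 1204
second_count = 1198

print(satisfies_correlation_condition(first_count, second_count, tolerance=10))
print(satisfies_correlation_condition(first_count, second_count, tolerance=1))
```

How small the tolerance must be depends on the use to which the measure is put, so it is left as an explicit parameter rather than a fixed constant.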

If M1 and M2 are the same type of measure made by the same experimenter at different times, then the reliability is called:

Test-Retest Reliability

If M1 and M2 are the same type of measure made by various experimenters at the same or different times, each blind to the result of the other then the reliability is called:

Inter-Rater Reliability

If M1, N1, ..., P1 are different types of measures made to measure the same phenomenon, and they agree among themselves, then the reliability is called:

Internal Consistency

The scientific tradition has demonstrated that the concept of measurement reliability is of utmost importance. Why?

Because if the measures we obtain are not reliable, then the study cannot yield useful information or relationships.

The factors that contribute to reliability include:

The precision of the operational definition of the construct

The clarity of the operational definition of the construct

The care with which we carry out measures

The number of independent observations

Effectiveness of range

Would you use the bathroom scales to weigh:

• spices for your pie recipe, or

• your RV?

MTTF

Validity

Measurements must be accurate reflections of the “true” behavior or property we perceive in the real world as reflected in the entity measured. This is the:

representation condition

Example:

If in measuring complexity of software using a measure C, program A is “more complex” than program B, then C(A) must be larger than C(B).
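The representation condition can be checked mechanically once an empirical ordering of the entities is available, for example from expert judgment. The following is a sketch under that assumption; the program names, the ranking and the measure values are made up.

```python
from itertools import combinations


def satisfies_representation_condition(empirical_ranking, measure):
    """Check that a measure preserves an empirically observed ordering.

    empirical_ranking lists entities from least to most of the attribute,
    as judged in the real world; measure maps each entity to a number.
    """
    return all(
        measure[lower] < measure[higher]
        for lower, higher in combinations(empirical_ranking, 2)
    )


# Hypothetical judgment: B is the least complex program and C the most complex.
ranking = ["B", "A", "C"]

consistent_measure = {"A": 12, "B": 7, "C": 30}
inconsistent_measure = {"A": 5, "B": 7, "C": 30}  # rates A below B

print(satisfies_representation_condition(ranking, consistent_measure))
print(satisfies_representation_condition(ranking, inconsistent_measure))
```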

Consonance

We must be certain that the measure and the measurement are aligned with our project, process or product goals. In this way, the data will not be open to abuse.

Dimensionality

If a measure is set into a relationship of equality with another, then dividing the RHS by the LHS must result in an entity that is mathematically AND LOGICALLY devoid of dimensions.

Practicality

Collecting data and making measurements should be easy.

However

The requirement of practicality is context dependent.

Bubble chambers and cyclotrons.

In the context of software engineering this usually means automatability.

Relating Measures: Prediction Models

Assessment measures Assessment systems

Predictive measures Prediction systems

To have a prediction system one needs:

A base measure

A target measure

A prediction model

A set of prediction procedures
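The four ingredients of a prediction system can be made concrete with a toy example. The sketch below assumes a linear prediction model fitted by ordinary least squares, with program size as the base measure and effort as the target measure. The choice of model, the function names and the data are all illustrative assumptions, not something the lecture prescribes.

```python
def fit_linear_model(base_values, target_values):
    """Prediction model: fit target = intercept + slope * base by least squares."""
    n = len(base_values)
    mean_base = sum(base_values) / n
    mean_target = sum(target_values) / n
    covariance = sum(
        (b - mean_base) * (t - mean_target)
        for b, t in zip(base_values, target_values)
    )
    variance = sum((b - mean_base) ** 2 for b in base_values)
    slope = covariance / variance
    intercept = mean_target - slope * mean_base
    return intercept, slope


def predict(model, base_value):
    """Prediction procedure: apply the fitted model to a new base measurement."""
    intercept, slope = model
    return intercept + slope * base_value


# Hypothetical history: size in KSLOC (base) and effort in person-months (target).
sizes = [10, 20, 30, 40]
efforts = [25, 45, 65, 85]

model = fit_linear_model(sizes, efforts)
print(predict(model, 50))
```

Fitting the model says nothing yet about whether it is valid; that still has to be established against observations that were not used in the fit.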

The concept of validity becomes very important when we become concerned with prediction systems.

We now need not only measures that are valid in terms of the representation condition, but also a model that is valid in terms of establishing the relationship that exists between them.

The predictive model must be validated.

This validation may be:

Deterministic

Stochastic

“Proof” of validity is only possible on very rare occasions, when there is a correctness-preserving set of mathematical transformations that relates one measure to the other. In all other cases we provide “evidence” for validation.