Mats Björkman, MdH
This page intentionally left blank
Please focus here

Measurement-based Research Methods in Computer Engineering
Mats Björkman, Mälardalens Högskola

Overview
Introduction
Experimental-based research methodology
Statistics
Measurements
Methodology
Examples
Pitfalls
Conclusions

Introduction
Measurement-based research is founded in:
Experimental research methodology
Statistics

Experimental-based research methodology
Overview (repetition)
Comments

Experimental-based research methodology - Overview
Already the old Greeks…
Two main standpoints:
Rational methods: it all comes from the brain, everything can be thought out
Idealistic methods: everything we observe gives us knowledge about the ideal world (e.g. Plato)

Different Methodologies
Rational research meant thinking, thus deductive (logical) methodologies
Idealistic research meant observing and drawing conclusions, thus inductive methodologies

Practice often a mixture
E.g. astronomy, a combination of induction from observations and deduction through e.g. mathematics

Medieval times
Much debate around whether or not God stands above the laws of logic
The question “Why?” important
Research always seen through the glasses of religion

The Scientific Revolution
Bacon, Copernicus, Kepler, Newton etc.
Focus on “How?”
Nature and religion may be treated separately as long as focus is on How
Led to the development of the “traditional” sciences

Modern Science
Karl Popper - important philosopher
Science is a process of testing and refining hypotheses
Induction problem: Can experience be generalized?
Popper says ‘no’, experiments cannot prove general hypotheses

Modern Science
Falsification is the most important feature of science according to Popper
Hypotheses cannot be proven, but they can be falsified by counter-examples
Theories are compared by their expressiveness and by their abilities to withstand falsification

Modern Science
Since hypotheses cannot be generally proven, corroboration and statistics play important roles
Hypothetical-deductive research methods build on these views of science
However, corroboration versus verification can be and is discussed

Hypothetical-deductive methods
Problem formulation
Hypothesis
Deduction to find evaluation criteria
Experiment(s)/observation(s)
Conclusion (corroboration/verification or falsification)

Problem formulation
The research problem is formulated
Typically, a good problem is addressable through empirical studies

Hypothesis
A hypothesis regarding the answer to the research question is formulated
This is the really creative part of research; scientific intuition and a good “educated guess” are important to success
(Popper: The more “risky” the hypothesis, the “better” the result.)

Deduction
From the hypothesis, the criteria to be used to test the hypothesis are deduced

Experiments/observations
The “hard work” part of research
Experiments are set up and/or observations are performed in order to corroborate/verify (or falsify) the hypothesis

Corroboration/verification or falsification
Using the deduced criteria on the results of the experiments/observations leads to either corroboration/verification or falsification of the hypothesis

Then iterate…
Modern scientific research is typically a series of hypothetical-deductive situations; each corroboration/verification or falsification gives input to a new or modified research question, and so on
Through this process, our scientific theories are expanded and refined

What is “the Truth”?
Experimental research is often more quantitative than qualitative
For quantitative results, confidence levels or margins of error are used in attempts to “encircle” the Truth (should it exist)
Experiments are repeated and/or modified until confidence levels or error margins are satisfactory

What is “the Truth”?
For qualitative results, we must also use statistics.
Even if we believe in induction and that the Truth is possible to find, there are always experimental errors and the like that make 100% certainty impossible to reach
Here too, repeated experiments are needed

Conclusions
Experimental research is an iterative process
Potential falsification is important: experiments without risks are not interesting

Conclusions
Examples:
If we know the outcome beforehand, the experiment is of no scientific value.
If there is no way to falsify the hypothesis (e.g. pseudoscience), the experiment is of no scientific value.

Experimental-based research methodology - Comments
100 % does not exist in reality!

Experimental-based research methodology - Comments
In reality, there is always a residual chance/risk that something really weird will happen

Experimental-based research methodology - Comments
Therefore, Popper and his followers are maybe not wrong, but it is kind of irrelevant whether a hypothesis can be “generally proven” or not
(I’m trying to be provocative here…)

Experimental-based research methodology - Comments
If I show that in X percent of all cases, some hypothesis Y holds…
…then according to Popper, this cannot prove the general case…
…but if 100 – X percent (the risk of Y not holding) is smaller than e.g. the risk of the world being ended by a comet…
…then who cares?

Experimental-based research methodology - Comments
In reality, we always take calculated risks
If a hypothesis is true for all practical purposes, then it is an academic question (a philosophical question) whether or not the hypothesis is TRUE

Experimental-based research methodology - Conclusions
Conclusion: The hypothetical-deductive method is the modern methodology in experimental research
However, not everyone agrees that hypotheses cannot be generally proven (and others don’t care…)

An Experimental Example
The work I did for my PhD thesis around the performance of parallel implementations of communication protocols

Parallel TCP and UDP stacks
On a shared-memory multiprocessor we implemented parallel TCP/IP/Ethernet and UDP/IP/Ethernet stacks
The performance behavior of these stacks gave rise to the research question “What factors limit the performance of these parallel implementations?”

Performance limiting factors
In parallel processing, critical resources must be protected from simultaneous access, in our case by using locks
Hence, these critical sections were the main suspects as performance limiting factors

Our research hypothesis
Our hypothesis then was “locking is the main performance limiting factor”
We built a performance model using only locking and processing
If our hypothesis was right, then the model should behave like the real system

Experiment
Our experiment was to run the model with the same input as the real implementation and compare results

Results
For the TCP stack, results were fairly accurate for low numbers of processors (but far from perfect).
Conclusion: locking “is probably” one major factor (but not the only one)

Results
For the UDP stack, results differed widely.
Conclusion: locking is not a major factor here

Results from conclusions
We need to rethink and refine (iterate)
Locking obviously is one factor, but not the only one
Need to think again and formulate a new hypothesis

New hypothesis
Next to contention for shared software resources, contention for shared hardware resources (e.g. buses, memory) is a likely candidate
New hypothesis: Contention for locks and contention for the bus/memory system are the two main factors

New model
We then built a new model that captured the effects of both locking and bus/memory contention
The same evaluation criteria as before: model and reality should agree

New results
For the new model, the TCP results were very good

New results
While not perfect, UDP results also showed that our new model captured the main behavior of the UDP stack

New results
Conclusion: Lock and bus/memory contention “are” the two main performance limiting factors for the observed implementations

Statistics
Statistics are used for many purposes:
Quantify results
Measure confidence
Statistics needed in corroboration process

Statistics – Result quantification
Assume we are measuring some property P that has a certain (but to us unknown) value V
When measuring, we get a measured value V*

Statistics – Result quantification
How are V and V* related?
It depends on our measurement methods
Ideally, our measurement method gives an exact result, i.e. V* = V

Statistics – Result quantification
However, most measurement methods are:
Statistical by their nature
Inexact
Deliberately simplified (model)

Statistics – Measurement methods
Statistical by nature:
Sampling is a typical example: Instead of observing a long and possibly continuous process, we take a number of snapshots
These snapshots are statistically representative of the process

Statistics – Measurement methods
Example: Counting cars (vehicles)
We want to know the number of cars/vehicles passing outside Rosenhill on one day
Instead of counting for 24 hours, we can count during 10 randomly chosen minutes and multiply by 144 (24 hours = 1440 minutes = 144 × 10 minutes).
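
The sampling idea above can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_daily_count` and the synthetic per-minute traffic are assumptions made for the example, not data or code from the talk. Only the principle (count a few random minutes, then scale up to the full day) comes from the slide.

```python
import random

MINUTES_PER_DAY = 24 * 60  # 1440


def estimate_daily_count(per_minute_counts, sample_size=10, rng=None):
    """Estimate the total count from `sample_size` randomly chosen minutes."""
    rng = rng or random.Random()
    sampled = rng.sample(per_minute_counts, sample_size)
    # Scale factor: for a 1440-minute day and 10 samples this is 144.
    scale = len(per_minute_counts) / sample_size
    return sum(sampled) * scale


if __name__ == "__main__":
    # Synthetic (made-up) traffic: a few cars per minute, busier in daytime.
    rng = random.Random(1)
    traffic = [
        rng.randint(0, 3) + (5 if 7 * 60 <= minute < 19 * 60 else 0)
        for minute in range(MINUTES_PER_DAY)
    ]
    print("true total:", sum(traffic))
    print("estimate from 10 minutes:", estimate_daily_count(traffic, rng=rng))
```

Running the demo several times with different seeds shows how far an estimate from only 10 minutes can land from the true total, which is the reason for quantifying confidence and margins of error.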

Statistics – Measurement methods
Inexactness:
If our measurement tools have lower resolution than the property we are measuring, we introduce measurement errors

Statistics – Measurement methods
Example: Using a simple scale to weigh something will only yield approximate values of the weight

Statistics – Measurement methods
Simplification:
We approximate the problem by introducing an inexact model

Statistics – Measurement methods
Example: Counting cars (again)
We put a rubber tube across the street with a counter that ticks every time the tube is run over by a vehicle wheel (pair)

Statistics – Measurement methods
Example: Counting cars (again)
We get one tick per vehicle axle
We can approximate that all vehicles have 2 axles…
…or use some previously determined value, e.g. 2.042 axles

Statistics – Measurement methods
Note that whereas using 2 axles is a coarser simplification, using 2.042 axles introduces dependencies on some earlier measurements and their reliability and exactness
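
As a small worked example of the axle-to-vehicle conversion, here is a hedged Python sketch. The function name `vehicles_from_ticks` and the counter reading of 10 000 ticks are hypothetical; 2 and 2.042 are the two conversion factors mentioned above.

```python
def vehicles_from_ticks(ticks, axles_per_vehicle=2.0):
    """Convert axle ticks from the tube counter into an estimated vehicle count."""
    if axles_per_vehicle <= 0:
        raise ValueError("axles_per_vehicle must be positive")
    return ticks / axles_per_vehicle


if __name__ == "__main__":
    ticks = 10_000  # hypothetical counter reading, not measured data
    coarse = vehicles_from_ticks(ticks)             # assumes exactly 2 axles per vehicle
    calibrated = vehicles_from_ticks(ticks, 2.042)  # uses the example value from the slide
    print(f"2 axles:     {coarse:.0f} vehicles")
    print(f"2.042 axles: {calibrated:.0f} vehicles")
    print(f"relative difference: {(coarse - calibrated) / calibrated:.1%}")
```

The calibrated figure is only as good as the earlier measurement that produced 2.042, which is the dependency the note above warns about.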

Statistics – Result quantification
Depending on the number of measurements we make, the measurement precision etc., we will get a measured value V* where we can quantify the relation between V* and the true value V

Statistics – Result quantification
With enough knowledge about all parts of our measurement process, we can determine that:
With a confidence of q, the true value V lies in the range [V* − ε₁, V* + ε₂] (with some specific values of q, ε₁, ε₂).

Statistics – Result quantification
Two goals then are to:
Have as high confidence (q) as possible in the results, and
have as small interval margins (ε₁, ε₂) as possible.

Statistics – Result quantification
Both confidence levels and margins of error are dependent on the measurement methods we use

Statistics – Result quantification
Therefore, it is a prime task to make measurements as reliable as possible
By removing sources of errors, the results can be very much better

Statistics – Result quantification
Repetition is another important point
By repeated experiments, we can obtain better confidence in our results, as well as reduce the margins of error
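
To make the effect of repetition concrete, the following Python sketch computes a confidence interval around the mean of repeated measurements. It rests on assumptions that are not in the talk: the measurements are independent, the interval is symmetric (the same margin on both sides), and a normal approximation is used (for few samples a Student t quantile would be more appropriate). The function name and the sample values are made up for illustration.

```python
import math
import statistics


def confidence_interval(samples, q=0.95):
    """Return (mean, margin) so that the interval is mean ± margin.

    Uses a normal approximation; needs at least two samples.
    """
    if len(samples) < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    # Two-sided quantile of the standard normal for confidence q.
    z = statistics.NormalDist().inv_cdf(0.5 + q / 2)
    return mean, z * std_err


if __name__ == "__main__":
    few = [10.0, 12.0, 11.0, 9.0, 13.0]  # hypothetical measured values V*
    many = few * 4                       # same spread, four times as many runs
    for label, data in (("5 runs", few), ("20 runs", many)):
        mean, margin = confidence_interval(data)
        print(f"{label}: {mean:.2f} ± {margin:.2f} at 95 % confidence")
```

With the same spread in the data, more repetitions give a smaller margin at the same confidence level; asking for a higher confidence q widens the margin again.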

Statistics – Methodology
In order to reduce errors, measurement experiments need to be as controlled as possible
By controlled we mean that we should reduce and (hopefully) control all error sources

Methodology
If there are factors in the environment that cannot be eliminated, they should be measured and quantified
Example: vehicle axles, 2.042 is a quantification of our “conversion error” from axles to vehicles

Methodology - dilemma
Methodology dilemma: Intrusive measurements
Example: Timestamping a code section
Time for timestamping included in reported times…
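
Timestamping dilemma
[Figure: the code to measure is enclosed between two timestamps. Problem: the time t between the timestamps ≠ the execution time of the code to measure.]

Timestamping: reducing error
[Figure: (time between two timestamps around the code to measure) - (time between two back-to-back timestamps) ≈ execution time of the code to measure.]

Timestamping: reducing error
[Figure: N repetitions (1, 2, …, N) of the code to measure between two timestamps; the time between the timestamps ≅ N ∗ the execution time of the code to measure.]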
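
The two error-reduction ideas for timestamping (subtracting the cost of the timestamps themselves, and repeating the code N times between one pair of timestamps) can be sketched in Python as below. This is an illustration only: the function names are hypothetical, `time.perf_counter_ns` is used as the timestamp source, and taking the minimum back-to-back gap as the overhead estimate is a choice made for this example, not something stated in the talk.

```python
import time


def timestamp_overhead_ns(repeats=1000):
    """Smallest observed gap between two back-to-back timestamps, in ns."""
    best = None
    for _ in range(repeats):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        gap = t1 - t0
        if best is None or gap < best:
            best = gap
    return best


def time_corrected_ns(func, overhead_ns):
    """Time one call of func and subtract the estimated timestamp overhead."""
    t0 = time.perf_counter_ns()
    func()
    t1 = time.perf_counter_ns()
    return max(0, (t1 - t0) - overhead_ns)


def time_amortized_ns(func, n):
    """Time n consecutive calls between one timestamp pair; return ns per call."""
    t0 = time.perf_counter_ns()
    for _ in range(n):
        func()
    t1 = time.perf_counter_ns()
    return (t1 - t0) / n


if __name__ == "__main__":
    def work():
        return sum(range(1000))  # stand-in for the code to measure

    overhead = timestamp_overhead_ns()
    print("timestamp overhead:", overhead, "ns")
    print("single call, corrected:", time_corrected_ns(work, overhead), "ns")
    print("amortized over 10000 calls:", time_amortized_ns(work, 10_000), "ns per call")
```

Note that the repetition variant is itself still intrusive: the loop adds its own small cost per iteration, and repeated runs may benefit from warm caches that a single run would not see.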

Simulations
One way to avoid intrusive measurements
Problem: Simulations must be verified/corroborated against reality

Simulations
Problem: Simulations lend themselves to simplifications
This means that we introduce modeling errors
These errors must be identified and controlled

Pitfalls
There are many pitfalls in experimental research, especially when it involves measurements

Pitfall: Sampling too sparsely
Example from a very early publication
We performed measurements on protocol stacks in a UNIX environment

Pitfall: Sampling too sparsely
Our controlled measurements did not agree with reality
[Chart: Controlled vs. Reality]

Pitfall: Sampling too sparsely
Turned out, our measurement points were not representative
[Chart: Controlled vs. Reality]

Pitfall: Sampling too sparsely
We had chosen measurement points (data sizes) in multiples of 1KB
The UNIX mbuf system puts communication data into 128-byte linked buffers. For 1KB of data, the 8 small buffers are replaced by one large 1KB buffer

Pitfall: Sampling too sparsely
This means:
Sending 1023 bytes means handling of 8 small buffers
Sending 1024 bytes means handling of 1 large buffer
Sending 1025 bytes means handling of 1 large and 1 small buffer
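
The buffer counts above follow from a simple rule, sketched here in Python. This is a simplified model of the behavior described on the slides (128-byte small buffers, one large buffer per full 1KB), not the actual mbuf allocation code of any UNIX system; the function name `buffers_for` is hypothetical.

```python
SMALL_BUFFER = 128   # bytes, as described above
LARGE_BUFFER = 1024  # bytes, replaces 8 small buffers


def buffers_for(size):
    """Return (large, small) buffer counts for `size` bytes under the simplified model."""
    large, rest = divmod(size, LARGE_BUFFER)
    small = (rest + SMALL_BUFFER - 1) // SMALL_BUFFER  # round up to whole buffers
    return large, small


if __name__ == "__main__":
    for size in (1023, 1024, 1025):
        large, small = buffers_for(size)
        print(f"{size} bytes: {large} large, {small} small, {large + small} buffers in total")
```

Measuring only at multiples of 1KB therefore always hits the cheapest case (no small buffers at all), while sizes just below a multiple hit the most expensive one.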

Pitfall: Sampling too sparsely
Hence, our measurement points were extreme points, therefore our results deviated from reality’s mean values
[Chart: Controlled vs. Reality]

Pitfall: Time correlations
When working in a real system, unknown time correlations may occur. Process scheduling is one typical example.

Pitfall: Time correlations
If the arrival time of packets to a host is timestamped in a process, the timestamps will exhibit a pattern correlated to the scheduling of that process.

Pitfall: Clock resolution
Measuring small time quantities can be hard, and clocks may not be trustworthy on small time scales

Pitfall: Clock resolution
Example: uniqtime in UNIX systems
Uniqtime keeps track of clock reads. If the clock has not ticked since the last read, uniqtime will add a small fraction to the time to avoid “time standing still”.
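
The behavior described above can be imitated with a small Python model. This is a hypothetical sketch of the idea only, not the actual uniqtime implementation: the class name `UniqueClock`, the injected `read_clock` function and the `increment` value are all assumptions made for the example.

```python
class UniqueClock:
    """Wraps a coarse clock so that successive reads never return the same time."""

    def __init__(self, read_clock, increment=1e-6):
        self._read_clock = read_clock  # function returning the coarse time
        self._increment = increment    # artificial step added when the clock has not ticked
        self._last = None

    def now(self):
        t = self._read_clock()
        if self._last is not None and t <= self._last:
            # The underlying clock has not advanced: fake a small step forward.
            t = self._last + self._increment
        self._last = t
        return t


if __name__ == "__main__":
    def coarse():
        return 5.0  # a clock that is "stuck" between two ticks

    clock = UniqueClock(coarse, increment=0.001)
    start = clock.now()
    end = clock.now()
    # The difference reflects the artificial increment, not real elapsed time.
    print("apparent elapsed time:", end - start)
```

Any interval shorter than one clock tick that is measured this way reports the artificial increments rather than the time that actually passed.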

Pitfall: Clock resolution
This means that times of the same order as the clock tick cannot be measured accurately

Pitfall: Trusting authorities
Authorities can make errors, too.
Example: ns (and ns-2) are standard simulators for communication.
ns comes with a large set of standard protocols

Pitfall: Trusting authorities
We were interested in investigating the backoff mechanism in Ethernet.
We found out that the Ethernet implementation in ns was broken (and had been so for a long time).
For several years, many people around the world had used a broken protocol in their simulations

Conclusions
Measurement-based research is a prime example of experimental research methodology
Measurements are tricky, but fun!
If you are interested in a career in measurement-based research, study statistics!

The End
That’s all, Folks!