June 2003 Vol. 16, No. 6
ACCESS TO THE EXPERTS
The Journal of Information Technology Management
IT Metrics and Benchmarking
Use Benchmark Data to Substantiate IT Resource Requirements
Benchmarking enables you to assess IT productivity and demonstrate potential resource savings to executive-level management. The data will make your assertions more compelling and defendable.
Don’t Waste Valuable IT Resources on Benchmark Data
Benchmark data looks at where others have been in years past and has no relevance to what you are doing today. The data can easily be manipulated to serve anyone’s particular interests.
“Many organizations today are attemptingto realize the benefits of IT metrics andbenchmarking — some successfully andsome not so successfully”
— David Garmus, Guest Editor
Opening Statement
David Garmus 2

The Big Picture: Software Measurements in Large Corporations
Capers Jones 6

Extracting Real Value from Process Improvement
Thomas M. Cagley, Jr. 13

Hitting the Sweet Spot: Metrics Success at AT&T
John Cirone, Patricia Hinerman, and Patrick Rhodes 20

From Important to Vital: The Evolution of a Metrics Program from Internal to Outsourced Applications
Barbara Beech 28

Benchmarking for the Rest of Us
Jim Brosseau 33

The Practical Collection, Acquisition, and Application of Software Metrics
Peter R. Hill 38
Opening Statement
by David Garmus

Many organizations today are
attempting to realize the benefits
of IT metrics and benchmarking —
some successfully and some not
so successfully. IT has become a
direct contributor to bottom-line
business value. The need to build
higher-quality systems faster and
more cheaply is increasingly signifi-
cant to each organization. Toward
that end, IT is constantly seeking
ways to improve as it comes under
greater scrutiny from senior execu-
tives with regard to cost and return
on their investment.
Organizations require accurate,
detailed data to manage their soft-
ware business. They require cost
information that permits informed
decisions with regard to technology
strategies, effective implementa-
tion of architectures, and cost-
efficient resource management.
They need to understand their
capacity to deliver with regard to
their utilization of methods and
tools, effective deployment of train-
ing programs, and potential out-
sourcing opportunities. Simply
stated, they require business meas-
ures based on facts.
Evidently, many people consider
this to be a critical topic. Seventeen
individuals submitted ideas for this
issue. Since we could not print
them all this month, we will have
another issue on this topic later in
the year, with additional infor-
mation in the areas of practical
software measurement and the
Capability Maturity Model (CMM)/
CMM-Integration (CMMI).
Capers Jones starts us out with
the first of six articles in this issue.
Who can dispute Capers? He was
my first mentor in this area and is
well known to all. His article is an
excellent introduction to some of
the standard IT metrics. While the
article focuses primarily on meas-
urement in very large companies
that employ more than 10,000 soft-
ware personnel, Capers briefly
characterizes the measurement
practices of a broad range of
organizations:
• Companies with fewer than 100 software personnel may have fairly simple measurement programs that capture software defect data during testing, customer-reported defects, and possibly basic productivity measures such as cost per function point or work-hours per function point.

• Companies with more than 100 software personnel begin to pay serious attention to software issues. Such companies may commission software process assessments and benchmark studies to ascertain their performance against industry norms. They very likely have software productivity and quality measures, business measures such as market share, and customer satisfaction measures.

• Companies that employ more than 10,000 software personnel are often very sophisticated in terms of software measurement and tend to measure everything of importance to software: productivity, costs, quality, schedules, sizes of applications, process assessments, benchmarks, baselines, staff demographics, staff morale, staff turnover rates, customer satisfaction, and market share information. Many sources of data are used and need to be reviewed and validated.
My own experience with very
large companies does not validate
Capers’ opinion of their sophis-
tication, but I do concur with his
categorization of software meas-
urement into quality, productivity
and schedule, process assessment,
and business and corporate meas-
ures. Capers gives excellent defini-
tions of the types of data that can
be collected, but he doesn’t say
much about the CMM.
The next article by Tom Cagley,
however, discusses how real value
can be extracted from process
improvement and the CMM. Tom
states that serious metrics-based
process improvement programs
begin with a quantitative baseline
of current organizational data that
enables a comparison of changes
within an organization. He goes
on to claim that data analysis
requires combining the strengths,
weaknesses, and recommenda-
tions generated in a CMM appraisal
with quantitative findings from
the metrics-based productivity
assessments. He found a client
organization willing to sponsor a
joint assessment, and the results of
this assessment provided the client
with integrated recommendations
to support effective allocation of
scarce resources.
Tom relates that the individual CMM
and productivity results, as well as
the joint recommendations, proved
useful to the organization by focus-
ing future process improvement
efforts. Can you be convinced that a
CMM or CMMI assessment would
help your IT organization? Tom lays
out an interesting scenario. The
software industry has become
increasingly aware of the need to
measure itself. As organizations
pursue better software manage-
ment, they recognize the urgency
for process improvement strategies
and the quantification of business
value. Each organization must pick
and choose its own road — not
easy, is it?
Back to Capers for a moment. In his
article, he recommends that we
visit companies such as AT&T and
others to find out firsthand what
kinds of measurements occur —
and that’s just what we have done
in our next two articles. We went
to two separate IT organizations
within AT&T to learn from their
experiences with software meas-
urement programs. Most IT shops
have common goals to deliver soft-
ware projects on time, within bud-
get, and with high quality; however,
the execution and implementation
of measurement programs vary
greatly. In my opinion, any organi-
zation that has established an effec-
tive measurement process — a
process that enables the quantita-
tive and qualitative assessment of
the value and quality of the prod-
ucts and services it produces — is
exceptional. This is especially true
when that organization uses met-
rics to identify opportunities for
improving development and sup-
porting productivity and quality.
To use Capers’ term, AT&T is
sophisticated in terms of software
measurement.
John Cirone leads the IT organi-
zation responsible for all of the
finance and human resource
systems at AT&T, and Patrick
Rhodes and Patricia Hinerman are
members of his staff. They collabo-
rated to write our third article, and
they are the principal individuals
responsible for the management,
development, implementation,
and maintenance of the metrics
program they discuss.
I believe that this article presents
one of the best examples of a
measurement program that was
clearly defined and planned before-
hand and well managed in the
implementation and maintenance
phases. True, they used function
points, but their program would
work with any sizing metric. They
argue that what is key to the value
of a metrics program for manage-
ment and financial sponsors is a
direct relationship in the form of
quantitative information to man-
age the business. The data they
have collected and analyzed has
become very useful not just for
application-specific decisions but
for decisions that impact the entire
development shop.
Much has been written about the
benefits of software measurement
and the failure of software meas-
urement programs. If software
measurement is beneficial and
relatively easy, why don’t more
companies incorporate measure-
ment into their development and
maintenance practices? How does
an organization start?
Barbara Beech, district manager
in the Consumer CIO Vendor
Management Division, contributes
our fourth article, in which she
discusses how she initiated a
measurement program elsewhere
within AT&T. Barbara also had the
support of top-level management,
one of the key factors in ensuring a
successful startup. As she points
out, though, “... top-level support
will only go so far. You need to get
down to the details to make metrics
collection and reporting a reality.”
She identifies the critical steps her
group took and the quarterly targets
and scorecard they developed,
which were used by management
to monitor progress. Barbara
addresses the challenges faced in
collecting metrics data and the
changes necessary when a
decision is made to outsource
development work.
The level of interest and the need
for industry data within IT have
increased dramatically over the
past several years. Two main forces
are whetting this increased appetite
for information on IT performance:
competitive positioning and out-
sourcing. IT organizations need
to benchmark their progress and
compare their rate of improvement
to an industry standard.
The focus on improved productivity
and cost reduction has driven many
companies to outsource their IT
activities when faced with the real-
ization that their performance lev-
els were below par. For IT groups
that have chosen to outsource their
applications development and
maintenance functions, industry
benchmark data is invaluable. As
an outsourcing deal is being devel-
oped, benchmark data can be used
to properly set service levels and
define improvement goals. As the
outsourcing deal matures, periodic
checks on industry trends can be of
great value.
When an organization researches
the available sources of industry
data, an overriding question has
to be whether the data obtained
is valid. Our fifth author, Jim
Brosseau, identifies the short-
comings of some of the pub-
lished data and enumerates the
approaches organizations can use
to effectively generate meaningful
benchmarking information. While
Jim asserts that “using data from
reputable sources will help you to
back up your assertions and can
make your arguments much more
compelling and defendable,” he
continues with this show-stopper:
“The allure of benchmarking data
comes from its external sterility.
The data provided is based on other
people’s performance, and it may
provide a sanitized look at what
the industry is doing. For some
organizations, it can become a
game to blithely quote industry
performance figures while avoiding
internal measurement, knowing
that the truth can be a bitter pill
to swallow.”
Jim also hits the SEI and the CMM
for their extremely small sample
size, considering the number of
software development organiza-
tions worldwide. He concludes that
“industry benchmark data defi-
nitely has its place in your arsenal
of information for making strategic
business decisions. Still, it has limi-
tations that must be overcome with
a deep understanding of why you
are measuring and balanced with
data gathered internally with rea-
sonable approaches.”
In our final article, Peter Hill, exec-
utive director of the International
Software Benchmarking Standards
Group (ISBSG), informs us that
real benchmark delivery rates
are available for your industry,
technology, platform, and software
type. The ISBSG is a not-for-profit
organization that maintains an
extensive database of metrics on
development projects and mainte-
nance support applications that it
sells worldwide. Peter observes
that “the commercial consulting
companies that offer benchmark-
ing services tend not to let you look
at the data used in their bench-
mark reports.” In contrast, ISBSG’s
project-level data is available to
anyone who wishes to purchase
a copy. (Company names are
changed to protect the innocent,
of course.) Peter suggests that you
ask a number of questions before
buying benchmark data:
• Is the collection instrument well thought out and proven?

• Has the data been rated?

• How old is the data?

• Can I use the data to compare “apples with apples”?

• What is the possibility of data manipulation?
Remember, though, Jim Brosseau’s
caution about using benchmark
data as a driver for direction in your
organization: “If you are looking at
how much your organization
should be spending, historical
benchmarking data will tell you
where the industry has been, but it
will not help you resolve how to
best address your organizational
needs in the future.”
So where are we on that road map?
As you read these articles, keep in
mind that IT metrics do not appear
to be well defined, nor do they
follow a standardized process.
That may be the next step in the
maturation process of software
measurement.
NEXT ISSUE

EA Governance: From Platitudes to Progress
Guest Editor: George Westerman

So you want to build an enterprise architecture? Congratulations on having the vision to improve the IT organization’s efficiency and flexibility!

Building an enterprise architecture? There’s no way this IT organization is going to spend zillions on something that provides no benefit to the company!

Which reaction will you get from your boss? In next month’s Cutter IT Journal, Guest Editor and Cutter Consortium Senior Consultant George Westerman will examine a key aspect of enterprise architecture performance or failure — namely, governance. In the issue, you’ll read insightful analyses of successes and failures that help move the discussion of EA governance beyond simple platitudes. The sage advice of our EA veterans will help prepare you for whatever reaction you may get.
The Big Picture: Software Measurements in Large Corporations
by Capers Jones

INTRODUCTION
As company size grows, the range
of software measurement pro-
grams encountered also expands.
There are two main reasons for
this. First, in large corporations,
software costs are among the
largest identifiable expense ele-
ments. Second, large corporations
need large software applications.
Large software applications are
very likely to fail or run out of con-
trol. Therefore, the top executives
in large corporations have strong
business reasons for wanting cor-
porate software activities to be
under top management regulation.
Measurement programs are very
effective in bringing software proj-
ects under executive control.
Very small companies, with fewer
than 10 software personnel, usually
have no formal software measure-
ment programs. About the only
kind of measurement data they
collect is the number of customer-
reported defects.
Small companies, with fewer than
100 software personnel, may have
fairly simple measurement pro-
grams that capture software
defect data during testing, as well
as defects reported by users in
deployed software applications.
A few small companies have basic
productivity measures such as cost
per function point or work-hours
per function point.
Midsized companies, with between
100 and 1,000 software personnel,
begin to pay serious attention to
software issues. Such companies
may commission software process
assessments and benchmark stud-
ies to ascertain their performance
against industry norms. Companies
in this size range are very likely to
have both software productivity
and quality measures in place. Of
course, business measures such as
market share and customer satis-
faction are also common in this size
range. Business measures often
use a combination of data collected
by inhouse personnel plus data
acquired from consulting groups.
Large companies, with between
1,000 and 10,000 soft-
ware personnel, usually have fairly
good software measurement pro-
grams that include productivity,
quality, and customer satisfaction
measures. Such companies also
tend to have business measures
such as market shares, and they
may have personnel and demo-
graphic measures.
At the top of the spectrum are very
large companies that employ more
than 10,000 software personnel,
such as IBM, Microsoft, Electronic
Data Systems, Siemens Nixdorf,
and the like. Some of these may
top 50,000 software personnel, and
they are often very sophisticated in
terms of software measurements.
These very large companies tend to
measure everything of importance
to software: productivity, costs,
quality, schedules, sizes of appli-
cations, process assessments,
benchmarks, baselines, staff demo-
graphics, staff morale, staff turnover
rates, customer satisfaction, market
shares, and competitive informa-
tion. Many sources of data are used
and need to be reviewed and vali-
dated. Therefore, measurement
programs in large companies are
usually carried out by full-time
measurement personnel.
These observations are gener-
ally true, but there are exceptions.
There are some small companies
with excellent measurement pro-
grams. Conversely, there are some
large companies with minimal
measurement programs. However,
company size and measurement
sophistication do correlate very
strongly, for solid business reasons.
Measurement is not the only factor
that leads to software excellence.
Measurement is only one part of a
whole spectrum of issues, including:
• Good management

• Good technical staff

• Good development processes

• Effective and complete tool suites

• Good organization structures

• Specialized staff skills

• Continuing on-the-job training

• Good personnel policies

• Good working environments

• Good communications
However, measurement is the tech-
nology that allows companies to
make visible progress in improving
the other factors. Without measure-
ment, progress is slow and some-
times negative. Companies that
don’t measure tend to waste scarce
investment dollars in “silver bullet”
approaches that consume time and
energy but generate little forward
progress. In fact, good quality and
productivity measurement pro-
grams provide one of the best
returns on investment of any
known software technology.
WHAT CAN BE MEASURED?
The best way for a company to
decide what to measure is to
find out what major companies
measure and do the same things.
In the next section, I discuss the
kinds of measurements used by
large companies that are at the top
of their markets and are generally
succeeding in global competition.
If possible, try to visit companies
such as Microsoft, IBM, AT&T, or HP
and find out firsthand what kinds of
measurements tend to occur.
SOFTWARE QUALITY MEASURES
Major companies measure soft-
ware quality. Quality is the most
important topic of software meas-
urement, and here are the most
important quality measures:1
Customer Satisfaction
Large companies perform annual
or semiannual customer satisfac-
tion surveys to find out what their
clients think about their products.
There is also sophisticated defect
reporting and customer support
information available via the Web.
Many large companies have active
user groups and forums. These
groups often produce independent
surveys on quality and satisfaction
topics that are quite helpful.
Defect Quantities and Origins
Large companies usually keep
accurate records of the bugs or
defects found in all major deliver-
ables, and they tend to start early
during requirements or design. At
least five categories of defects are
measured: requirements defects,
design defects, code defects, docu-
mentation defects, and bad fixes
(i.e., secondary bugs introduced
accidentally while fixing another
bug). This form of measurement is
one of the oldest software meas-
ures on record, and companies
such as IBM began defect measure-
ments as early as the late 1950s.
Some leading companies perform
root cause analysis on software
defects in order to find and elimi-
nate common sources of error.
Defect Removal Efficiency
The phrase “defect removal effi-
ciency” originated in IBM in the
early 1970s. It refers to the percent-
age of bugs or defects removed
before software is delivered to
customers. This is an important
aspect of software development,
but it is not universally measured.
According to my observations
among major corporations, about
a third of large companies measure
defect removal efficiency.
It is useful to measure the average
and maximum efficiency of every
major kind of review, inspection,
and test stage. This allows compa-
nies to select an optimal series of
1. A useful summary of almost all known software quality measures can be found in Stephen Kan’s Metrics and Models in Software Quality Engineering [6].
removal steps for projects of vari-
ous kinds and sizes. Testing alone is
not very efficient. A combination of
reviews and inspections and mul-
tiple test stages is most efficient.
Leading companies remove from
95% to more than 99% of all defects
prior to delivery of software to cus-
tomers. Laggards seldom exceed
80% in terms of defect removal effi-
ciency and may drop below 50%.
The US average is about 85% [5].
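For illustration, here is a minimal sketch in Python of the defect removal efficiency calculation described above; the defect counts are invented placeholders, and a real program would draw them from a defect-tracking system.

    # Sketch: computing defect removal efficiency (DRE) from defect counts.
    # The counts below are invented for illustration only.

    def defect_removal_efficiency(found_before_release: int,
                                  found_after_release: int) -> float:
        """Percentage of total defects removed before delivery."""
        total = found_before_release + found_after_release
        if total == 0:
            return 100.0  # no defects found anywhere
        return 100.0 * found_before_release / total

    # Defects logged during reviews, inspections, and all test stages:
    internal = 425
    # Defects reported by customers after delivery:
    field = 75

    print(f"DRE = {defect_removal_efficiency(internal, field):.1f}%")
    # 425 / (425 + 75) yields 85.0%, matching the US average cited above.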
Delivered Defects by Application
It is common among large compa-
nies to accumulate statistics on
errors reported by users as soon
as software is delivered. Monthly
reports showing the defect trends
against all applications are pre-
pared and given to executives, and
they are also summarized on an
annual basis. These reports may
include supplemental statistics
such as defect reports by country,
state, industry, client, and so on.
Defect Severity Levels
All major companies use some kind
of a severity scale for evaluating
incoming bugs or defects reported
from the field. The number of
plateaus varies from one to five. In
general, “Severity 1” defects are
problems that cause the system to
fail completely; the severity scale
then descends in seriousness. A
few companies are using the newer
“orthogonal defect classification”
developed by IBM [1]. In addition
to severity levels, this method
captures information about the
business importance of various
kinds of bugs or defects.
Complexity of Software
It has been known for many years
that complex code is difficult to
maintain and has higher-than-
average defect rates. A variety of
complexity analysis tools are com-
mercially available that support
standard complexity measures such
as cyclomatic and essential com-
plexity, the two most widely used
complexity measures. Both meas-
ures were developed by the com-
plexity pioneer, Tom McCabe [8].
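As a rough illustration of the idea behind McCabe’s measure, the following Python sketch approximates cyclomatic complexity by counting decision points (complexity = decisions + 1). Commercial analyzers are far more thorough; the standard library’s ast module is used here only as a convenient way to walk the code.

    # Sketch: approximate McCabe cyclomatic complexity for Python source
    # by counting branch points. Illustrative, not a production analyzer.
    import ast

    # Node types that create an additional path through the code.
    DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)

    def cyclomatic_complexity(source: str) -> int:
        """One plus the number of decision points in the source."""
        tree = ast.parse(source)
        return 1 + sum(isinstance(node, DECISION_NODES)
                       for node in ast.walk(tree))

    SAMPLE = """
    def classify(n):
        if n < 0:
            return "negative"
        for d in range(2, n):
            if n % d == 0:
                return "composite"
        return "prime or small"
    """

    print(cyclomatic_complexity(SAMPLE))  # 1 + if + for + if = 4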
Test Case Coverage
Software testing may or may not
cover every branch and pathway
through applications. A variety of
commercial tools are available that
monitor the results of software test-
ing and help to identify portions of
applications where testing was
sparse or did not occur.
Cost of Quality Control and Defect Repairs
One significant aspect of quality
measurement is to keep accurate
records of the costs and resources
associated with various forms of
defect prevention and removal. For
software, these measures include:
• The costs of software assessments

• The costs of quality baseline studies

• The costs of reviews, inspections, and testing

• The costs of warranty repairs and post-release maintenance

• The costs of quality tools

• The costs of quality education

• The costs of your software quality assurance organization

• The costs of user satisfaction surveys

• The costs of any litigation involving poor quality or customer losses attributed to poor quality
About 50% of large companies
quantify the “cost of quality” [2].
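As a sketch of how those categories roll up, the Python fragment below totals an invented set of cost-of-quality figures; the amounts are placeholders, not benchmarks.

    # Invented annual cost-of-quality figures, by the categories above.
    cost_of_quality = {
        "software assessments": 120_000,
        "quality baseline studies": 80_000,
        "reviews, inspections, and testing": 2_400_000,
        "warranty repairs and post-release maintenance": 1_900_000,
        "quality tools": 250_000,
        "quality education": 150_000,
        "software quality assurance organization": 900_000,
        "user satisfaction surveys": 60_000,
        "quality-related litigation": 0,
    }

    total = sum(cost_of_quality.values())
    print(f"Total cost of quality: ${total:,}")
    # Show each category's share of the total, largest first.
    for item, cost in sorted(cost_of_quality.items(),
                             key=lambda kv: kv[1], reverse=True):
        print(f"  {item:<46} {100 * cost / total:5.1f}%")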
SOFTWARE PRODUCTIVITY AND SCHEDULE MEASURES
The measurement of software
schedules, software effort, and soft-
ware costs is an important topic in
major corporations. As of 2003,
many major corporations have
adopted function point metrics
rather than the older and inade-
quate “lines of code” metric [4].
However, large corporations in the
defense industry (and a few others)
attempt to measure productivity
using the obsolete lines of code
measure.
The topic of software productivity
tends to have more benchmark
studies than almost any other.
About 65% of the large corporations
I’ve visited have commissioned
various benchmark comparisons
of their software performance.
Normally these studies are per-
formed by external consulting
groups. Here are some of the key
productivity measures examined
by large software producers:
Size Measures
Because costs and schedules of
software projects are directly
related to the size of the applica-
tion, this is an important topic.
Industry leaders measure the size
of the major deliverables associ-
ated with software projects. Size
data is kept in two ways. One
method is to record the size of
actual deliverables such as pages of
specifications, pages of user manu-
als, screens, test cases, and source
code. The second way is to normal-
ize the data for comparative pur-
poses. Here the function point
metric is now the most common
and the most useful. Examples of
normalized data would be pages of
specifications produced per func-
tion point, source code produced
per function point, and test cases
produced per function point. The
function point metric defined by the
International Function Point Users
Group (IFPUG) is now the major
metric used for software data
collection.2
Schedule Measures
Many large companies measure
overall project schedules from
start to finish. However, about 25%
of leading large companies meas-
ure the schedules of every activity
and how those activities overlap
or are carried out in parallel.
Overall schedule measurements
without any details are inadequate
for any kind of serious process
improvement.
Cost Measures
Almost all large companies meas-
ure the costs of software projects.
About 25% of the leaders measure
the effort for every activity, starting
with requirements and continuing
through maintenance. These meas-
ures include all major activities,
such as technical documentation,
integration, quality assurance, and
so on. Leading large companies
tend to have rather complete charts
of accounts, with no serious gaps or
omissions. Three kinds of normal-
ized data are typically created for
development productivity studies:
1. Work hours per function point by activity and in total

2. Function points produced per staff-month by activity and in total

3. Cost per function point by activity and in total
Cost benchmarking is very com-
mon. Cost benchmarks can be
either fairly high-level, such as total
software expenses, or fairly granu-
lar, such as benchmarks of specific
projects. Some large corporations
use both forms of benchmarking.
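To make the three normalized figures listed above concrete, here is a small Python sketch for an invented 500-function-point project; the hours-per-staff-month constant is an assumption that varies by company.

    # Invented totals for a single 500-function-point project.
    function_points = 500
    work_hours = 4_000        # effort across all activities
    cost_dollars = 400_000    # fully burdened project cost
    HOURS_PER_STAFF_MONTH = 132  # assumed working hours per staff-month

    staff_months = work_hours / HOURS_PER_STAFF_MONTH

    print(f"Work hours per function point:   {work_hours / function_points:.1f}")
    print(f"Function points per staff-month: {function_points / staff_months:.1f}")
    print(f"Cost per function point:         ${cost_dollars / function_points:,.2f}")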
Maintenance Measures
Because maintenance and
enhancement of aging software are
now the dominant activities of the
software world, most companies
also measure maintenance pro-
ductivity. An interesting metric for
maintenance is “maintenance
assignment scope.” This is defined
as the number of function points of
software that one programmer can
support during a calendar year.
Other maintenance measures
include number of customers sup-
ported per staff member, number
of defects repaired per time period,
and rate of growth of applications
over time.
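A worked example of maintenance assignment scope, with assumed numbers: if one programmer can support 1,500 function points per year, a 300,000-function-point portfolio implies a support staff of 200.

    # Both values below are illustrative assumptions.
    portfolio_size_fp = 300_000   # function points under maintenance
    assignment_scope_fp = 1_500   # FP one programmer can support per year

    maintenance_staff = portfolio_size_fp / assignment_scope_fp
    print(f"Estimated maintenance staff: {maintenance_staff:.0f}")  # 200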
Indirect Cost Measures
About 15% of the large companies
I’ve visited measure costs of indi-
rect software activities. Some of the
indirect activities — such as travel,
meeting costs, training and edu-
cation, moving and living, legal
expenses, and the like — are so
expensive that they cannot be
overlooked.
Rates of Requirements Change
The advent of function point
metrics has allowed direct meas-
urement of the rate at which soft-
ware requirements change. The
observed rate of change in the US is
about 2% per calendar month. The
rate of change is derived from two
measurement points: the function
point total of an application when
the requirements are first defined
and the function point total when
2. The IFPUG Web site (www.ifpug.org) is a source of information on function point publications and uses of function points. See also IT Measurement: Practical Advice from the Experts [3].
the software is delivered to actual
customers. About 20% of large
companies measure requirements
change. It is significant that when
outsourced projects go bad and
end up in court for breach of con-
tract, almost every case includes
claims of excessive requirements
change.
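The derivation described above reduces to simple arithmetic. In this sketch the function point totals and elapsed time are invented, and the compound monthly rate they imply comes out near the 2% US figure cited above.

    # Invented measurement points for one application.
    initial_fp = 1_000     # function point total when requirements defined
    delivered_fp = 1_268   # function point total at delivery
    months = 12            # elapsed calendar months between the two counts

    # Compound monthly growth rate implied by the two totals.
    monthly_rate = (delivered_fp / initial_fp) ** (1 / months) - 1
    print(f"Requirements change: {100 * monthly_rate:.1f}% per calendar month")
    # Prints roughly 2.0% with these numbers.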
PROCESS ASSESSMENT, OR “SOFT FACTOR” MEASURES
Even accurate quality and produc-
tivity data is of no value unless one
can explain why some proj-
ects are visibly better or worse than
others. More than half of the large
corporations I’ve visited have com-
missioned one or more assessment
studies. Assessments are far more
common among companies that
produce systems software, embed-
ded software, or military software
than they are among companies
that produce normal information
systems.
In general, software process
assessments, which are usually
performed by consulting organiza-
tions, cover the following topics:
Software Processes
This topic deals with the entire
suite of activities that are per-
formed from early requirements
through deployment. How the
project is designed, what quality
assurance steps are used, and how
configuration control is managed
are some of the topics included.
This information is recorded in
order to guide future process
improvement activities. If historical
development methods are not
recorded, there is no statistical way
of separating ineffective methods
from effective ones.
Software Tool Suites
There are more than 2,500 software
development tools on the commer-
cial market and at least the same
number of proprietary tools that
companies have built for their own
use. It is very important to explore
the usefulness of the available
tools, and that means that each
project must record the tools uti-
lized. Thoughtful companies iden-
tify gaps and missing features and
use this kind of data for planning
improvements.
Software Infrastructure
The number, size, and kinds of
departments within large organiza-
tions are an important topic, as are
the ways of communicating across
organizational boundaries. Other
factors that exert a significant
impact on results include whether
a project uses matrix or hierarchi-
cal management and whether a
project involves a single location
or multiple cities or countries.
Software Team Skills and Experience
Large corporations can have
more than 100 different occupa-
tion groups within their software
domains. Some of these specialists
include quality assurance, techni-
cal writing, testing, integration
and configuration control, network
specialists, and many more. Since
large software projects do better
with specialists than with general-
ists, it is important to record the
occupation groups used.
Staff and Management Training
Software personnel, like medical
doctors and attorneys, need con-
tinuing education to stay current.
Leading companies tend to provide
10-15 days of education per year,
for both technical staff members
and software management.
Assessments explore this topic.
Normally, training takes place
between assignments and is not a
factor on specific projects, unless
activities such as formal inspec-
tions or joint application design
are being used for the first time.
BUSINESS AND CORPORATE MEASURES
Thus far, measurement has been
discussed at the level of software
projects. However, software sup-
ports other business operations.
Therefore, all large companies with
thousands of software personnel
perform many kinds of business
measurements. About 70% of
the large corporations I’ve visited
use some form of the “balanced
scorecard” approach for business
measures. This method, developed
by David Norton and Robert Kaplan
[7], combines financial perfor-
mance measures, customer meas-
ures, and business goals and can
be customized with other factors
such as productivity and quality.
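As a sketch of that customization, a scorecard record might group software measures under the four Kaplan-Norton perspectives; the entries below are invented placeholders, not recommended targets.

    # Invented scorecard entries grouped by the four classic perspectives.
    scorecard = {
        "financial": {"software budget variance": "-3%"},
        "customer": {"satisfaction survey score": "4.2 of 5"},
        "internal process": {"defect removal efficiency": "92%"},
        "learning and growth": {"training days per staff per year": "12"},
    }

    for perspective, measures in scorecard.items():
        for name, value in measures.items():
            print(f"{perspective:>19}: {name} = {value}")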
Below are just a few samples of
corporate measures to illustrate
the topics of concern.
Portfolio Measures
Major corporations can own from
250,000 to more than 1,000,000 func-
tion points of software, apportioned
across thousands of programs and
dozens to hundreds of systems.
Many large enterprises know the
sizes of their portfolios, their growth
rate, replacement cost, quality
levels, and many other factors.
Market Share Measures
Most large companies know quite
a lot about their markets, market
shares, and competitors. For exam-
ple, industry leaders in the com-
mercial software domain tend to
know how every one of their prod-
ucts is selling in every country and
how well competitive products are
selling in every country. Some com-
panies carry out market share stud-
ies with their own personnel. Other
companies depend upon outside
consulting groups. Still other com-
panies use both internal and exter-
nal sources of data. Much of this
kind of information is available
from various industry sources such
as Dun & Bradstreet, Mead Data
Central, Fortune magazine, and
other journals.
SUMMARY AND CONCLUSIONS
The software industry is struggling
to overcome a very bad reputation
for poor quality and long schedules.
The companies that have been
most successful in improving qual-
ity and shortening schedules have
also been the ones with the best
measurements.
Since large companies have the
greatest costs for software and
build most of the world’s major
software systems, it is natural for
large companies to have the most
complete and sophisticated meas-
urement programs. Within the
set of large software companies,
those with the best measurement
programs have the highest suc-
cess rate in building software
applications. Good measurement
programs and good software prac-
tices are almost always found
together.
REFERENCES
1. Chillarege, Ram. “ODC Basics
for ODC Process Measurement,
Analysis and Control.” Proceedings
of the Fourth International
Conference on Software Quality.
ASQC Software Division, 1994.
2. Crosby, Philip B. Quality Is Free:
The Art of Making Quality Certain.
New American Library, 1979.
3. International Function Point
Users Group. IT Measurement:
Practical Advice from the Experts.
Addison-Wesley, 2002.
4. Jones, Capers. “Sizing Up
Software.” Scientific American
(December 1998).
5. Jones, Capers. Software Quality:
Analysis and Guidelines for
Success. International Thomson
Computer Press, 1997.
6. Kan, Stephen H. Metrics and
Models in Software Quality
Engineering, 2nd edition.
Addison-Wesley, 2003.
7. Kaplan, Robert S., and David P.
Norton. “Using the Balanced
Scorecard as a Strategic
Management System.” Harvard
Business Review (January-
February 1996).
8. McCabe, Tom. “A Complexity
Measure.” IEEE Transactions on
Software Engineering, Vol. 2,
No. 4 (1976).
ADDITIONAL READING
The literature on software meas-
urement and metrics is expanding
rapidly. Following are a few of the
more significant titles to illustrate
the topics that are available.
Boehm, Barry W. Software
Engineering Economics. Prentice
Hall, 1982.
Garmus, David, and David Herron.
Function Point Analysis. Addison-
Wesley, 2001.
Garmus, David, and David Herron.
Measuring the Software Process:
A Practical Guide to Functional
Measurement. Prentice Hall, 1996.
Grady, Robert B., and Deborah L.
Caswell. Software Metrics:
Establishing a Company-Wide
Program. Prentice Hall, 1987.
Howard, Alan, ed. Software Metrics
and Project Management Tools.
Applied Computer Research, 1997.
Jones, Capers. Applied Software
Measurement, 2nd ed. McGraw-
Hill, 1996.
Jones, Capers. Software
Assessments, Benchmarks, and
Best Practices. Addison-Wesley,
2000.
Miller, Sharon E., and George T.
Tucker. “Software Development
Process Benchmarking.”
Proceedings of the IEEE Global
Telecommunications Conference.
IEEE Communications Society,
1991.
Putnam, Lawrence H., and Ware
Myers. Measures for Excellence:
Reliable Software on Time, Within
Budget. Yourdon Press Computing
Series, Pearson Education POD,
1992.
Putnam, Lawrence H., and Ware
Myers. Industrial Strength Software:
Effective Management Using
Measurement. IEEE Press, 1997.
Capers Jones is Founder of Software
Productivity Research (SPR). After SPR
was acquired by Artemis Management
Systems, he became Chief Scientist
Emeritus of Artemis Management
Systems. Mr. Jones is an author and
speaker on software productivity, quality,
project management, and measurement
and the developer of the SPQR models
(Software Productivity, Quality, and
Reliability estimators). He was formerly
Assistant Director of Measurements with
ITT Programming Technology Center.
Prior to this, Mr. Jones was a project
team leader for the software process
improvement group at IBM’s Systems
Development Division. The team was
chartered to improve the quality and pro-
ductivity of IBM’s commercial software
systems. Mr. Jones also served as team
leader of software process assessments
at Nolan, Norton, and Company. He was
a programmer/analyst for the Office of
the Surgeon General in Washington, DC,
and also for Crane Company in Chicago.
Mr. Jones is a graduate of the University of
Florida. He is a member of IEEE and the
International Function Point Users Group
(IFPUG). Mr. Jones was awarded a life-
time membership in IFPUG for his work
in software measurement and metrics
analysis.
Mr. Jones can be reached at
Extracting Real Value from Process Improvement
by Thomas M. Cagley, Jr.

Despite the hopes of process
improvement advocates, there is
limited information in the literature
as to the quantifiable benefits of
process improvement. This is due
to several factors:
• No standard for accounting benefits

• Inconsistency in applying cost accounting standards

• Failure to recognize natural evolution or improvement

• Lack of formal quantification of productivity or efficiency improvements
These factors suggest the need for
a different approach to assessing
process improvement costs and
benefits in terms of quantitatively
matching process capability/
maturity to “faster, better, and
cheaper” project performance.1
In response to the growing need to
demonstrate the quantitative value
of improvement efforts, in quality and
dollars, this author has developed
a joint model and productivity
assessment methodology. The
methodology generates specific
recommendations that match an
organization’s business goals with
specific process changes. In most
cases, the recommendations focus
on specific tasks and activities
rather than on the generic goal
of attaining a Capability Maturity
Model (CMM) level.2 Recommen-
dations include targeted quantita-
tive productivity improvements and
reductions in time to market, deliv-
ered defects, total defects, and
maintenance effort. The intent is
to help organizations demonstrate
value early in the process improve-
ment cycle through many small
changes rather than a “big bang.”
The resulting recommendations
provide a road map that will enable
the organization to better leverage
effort and cost and thus derive true
benefit from process improvement.
INPUT #1: CMM-BASED APPRAISALS
In 1986, the Software Engineering
Institute (SEI) began developing a
process maturity framework to help
organizations improve their soft-
ware process and to provide a
means of assessing software devel-
opment capability. After more than
four years’ experience with the
framework, the SEI evolved it into
the CMM for Software. Version 1.1
of the Software CMM (SW-CMM),
published in 1993, was based on
actual practice, included practices
that were believed representative
of the industry’s best practices, and
provided a framework to meet the
needs of individuals who perform
process improvement and process
appraisal activities. The SW-CMM
was designed to focus on a soft-
ware organization’s capability for
producing high-quality products
consistently and predictably. In
this case, capability was defined as
the inherent ability of a software
process (i.e., in-use activities, meth-
ods, and practices) to produce
planned results.
The staged structure of the SW-
CMM includes five maturity levels
and is based on product quality
principles. The SEI adapted these
principles into the maturity frame-
work, establishing a project
management and engineering
foundation for quantitative control
of the software process. Each level
comprises a set of process goals
that, when satisfied, stabilize
process components. The model
therefore reflects prioritized
improvement actions for those
1. It has been noted that the pay rate variances between countries and regions within countries cause additional complications when comparing costs. Cost variances require normalization for comparisons.

2. The SW-CMM was used in the assessments described in this article. However, the processes have been mapped to the CMM Integration (CMMI) and tested with similar results.
processes that are of value across
an organization.
At the next level of detail, the model
provides heightened management
insight (or at least the capability for
it) into project methods, progress,
and results. This additional insight,
when coupled with effective and
timely corrective action by manage-
ment, is the primary driver for
improved project results.
From a higher-level perspective, the
SW-CMM provides an organized
framework for producing software
faster, better, and cheaper. It stands
to reason, for example, that sound
estimates produce more realistic
plans that, when effectively
tracked, better identify risks that
require proper management so
they don’t become cost or schedule
impacts. However, what has been
missing until now is a quantitative
correlation between the SW-CMM
practices, their stabilization, and
the resulting impact as to just how
much faster, how much better,
and how much cheaper an orga-
nization can produce a given set
of products.
A CMM-based appraisal includes
15 discrete tasks partitioned across
four phases: planning, preparation,
conducting, and reporting. The con-
ducting phase activities include
data collection and consolidation
tasks performed at the organiza-
tion’s site. These appraisals include
an expert evaluation of an organiza-
tion’s process capability against the
SW-CMM or other reference mod-
els, including appraisals that result
in maturity level ratings.
INPUT #2: PRODUCTIVITY ASSESSMENTS
Serious metrics-based process
improvement programs begin with
a quantitative baseline of current
organizational data. The organiza-
tion needs a baseline for cost justi-
fying the investments required for
advancement. A baseline provides
a platform for comparing changes
within an organization.
A “baseline” is a point-in-time
inventory of the number and sizes
of a relevant group of software
applications and/or projects.
Statistical sampling techniques are
the best means to identify the rele-
vant group of projects or applica-
tions (although studies of whole
portfolio populations offer the most
predictive data). Software baselines
also include the other “hard” data
(schedules, costs, effort, defects,
user satisfaction) associated with
the sample. The combination of size
and other hard data allows a com-
parison of related projects and
applications against either industry
average results or best-in-class
results. A comparison of this type is
referred to as a “benchmark.” On-
site data collection provides the
most reliable results for baselines
and benchmarks.
The overall process for baselining
productivity is as follows: develop a
sample, gather quantitative data,
and then develop recommenda-
tions based on the data and the
data collection process (see
Figure 1). The organization’s size
and process maturity, along with
the level of metrics integration, are
indicators of complexity. All these
factors drive the effort required to
perform the assessment process.
Sample Selection
The first phase of baselining is to
determine a sample size. The
benchmarkers need to generate
samples that will allow them to
forecast the impact of process
improvements and to generate sta-
tistical confidence in the overall
results. The sample should be both
representative of the organization
and statistically sound in order for
the data to be used for forecasting
improvements.
[Figure 1 — Productivity baseline approach: sample selection, then data collection (project size, project attributes, and other quantitative metrics), then data analysis and recommendation development.]
The benchmarkers forecast
improvements by leveraging mod-
els constructed in tools such as JMP
or Excel, as well as estimation tools
such as SMART Predictor or SEER-
SEM. Less mature organizations or
organizations that have not devel-
oped historical metrics typically
generate a sample that is represen-
tative of perceived demographics
rather than based on statistical
techniques. Samples should
include a balanced set of environ-
mental attributes, such as platform,
language, project size, and degree
of vendor functionality present
in the development portfolio. I
strongly suggest the use of function
points as the sizing metric for both
applications and projects. Function
points are an industry standard
measure of software size that can
be applied regardless of technology
or language.3
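One common way to make a sample “statistically sound” in the sense used above is to size it for a target margin of error. The sketch below uses the standard normal approximation; the variability figure is an assumption a benchmarker would take from pilot data.

    # Sketch: sample size for estimating mean productivity to within a
    # chosen margin of error, via the classic n = (z * sigma / E)^2 formula.
    import math

    Z_95 = 1.96               # z-score for a 95% confidence level
    sigma_hours_per_fp = 4.0  # assumed std. deviation of hours per FP
    margin_of_error = 1.0     # acceptable error in hours per FP

    n = math.ceil((Z_95 * sigma_hours_per_fp / margin_of_error) ** 2)
    print(f"Projects needed in the sample: {n}")

Coincidentally, these assumed inputs yield a sample of 62 projects, the same count as the sample described later in this article.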
Data Collection
The data collection phase of a
baseline focuses on collecting
basic metrics for each of the proj-
ects or applications in the sample.
This phase can include developing
or reviewing the metrics based on
the organization’s process and met-
rics maturity. The metrics included
in a quantitative baseline typically
include size, effort, defect, and
schedule data. Quantitative data
describes what has occurred but
not why. Project attribute data that
describes project classification and
organizational characteristics is
needed to complete the picture.
Data Analysis and Recommendation Development
The third phase of the baseline
process is to analyze the data and
develop recommendations. Data
analysis requires combining the
strengths, weaknesses, and recom-
mendations generated in the CMM
appraisal and the quantitative find-
ings according to the mapped rela-
tionships between the two models.
The combined data allows the
benchmarkers to derive compar-
isons to industry data at any rele-
vant levels of granularity. They use
the results of the assessment and
comparison to industry data to per-
form sensitivity and “what-if” analy-
ses while developing improvement
recommendations.
AND THERE THE TWAIN SHALL MEET: A COMBINED ASSESSMENT APPROACH
CMM-based appraisals and
metrics-based productivity assess-
ments have evolved as separate
processes. However, each assess-
ment technique is focused on
increasing organization effective-
ness and efficiency.
In developing the initial combined
assessment process, I reviewed
both models and assessment meth-
ods to determine the feasibility of
a combined assessment. The
results of this evaluation identified
the following:
• There was sufficient overlap and similarity between the two models and the assessment methods (data collection, data consolidation, and reporting) to determine that a combined assessment was feasible.

• Overlap between the CMM and productivity assessment models provided both commonality and robustness.

• Overlap between assessment methods provided opportunities to leverage assessment activities.

• Synergies between both models and assessment methods would increase the value of combined results and joint recommendations.
These findings validated the fea-
sibility of a joint CMM (capability-
focused) and productivity
(effectiveness-focused) assessment.
After the initial review, I performed
further analysis on the two models
and assessment methods to
gauge the amount of model over-
lap and determine how the joint
assessment could be tailored.
Figure 2 shows the overlap
between the two models. The goals
of each assessment drove tailoring
decisions, which had to include
the following:
[Figure 2 — Model overlap and synergies: the productivity assessment (effectiveness synergy) and the CMM assessment (capability synergy) overlap in an area of shared assessment synergy.]
3. See www.ifpug.org for more information on function points.
• The results of the combined assessment had to contribute to process improvements.

• The combined assessment and results, including recommendations, had to optimize value to the sponsor, including supporting the business, optimizing cost, and minimizing disruption.

• The appraisal process had to be reliable in that the combined event would create a repeatable process, standardize the conduct of the combined appraisal, yield predictable results, and allow for use by both inhouse and outside consultants.
My analysis illustrated that a mod-
erate degree of overlap existed
between components of the
CMM and productivity assessment
models. I then used the overlap to
calibrate the SW-CMM to the quan-
titative productivity assessment to
create a model used to predict the
impact of organizational process
changes.
PUTTING IT TO THE TEST
In the fall of 1995, the combined
SW-CMM and productivity
assessment was implemented
at a major US IT organization. The
underlying business consider-
ations that supported a combined
appraisal included the overlap and
synergies mentioned earlier, as
well as the very real need to reduce
appraisal impacts. When the two
types of assessments are com-
bined, the overall cost and impact
are lower than if the assessments
are done separately. The assess-
ment team proposed the combined
assessment because it strongly
believed that the productivity and
quality gains that are thought to
result from improved process
maturity could be quantified. The
overall appraisal flow is illustrated
in Figure 3.
In the combined assessment, at a
conceptual level, data would be
gathered via common team activi-
ties, processed using common
rules, and then parsed to the
respective model/method rules
associated with each model. This
overall assessment flow is based
essentially on the major phases of
a CMM Appraisal Framework–
compliant appraisal: the planning
and preparation phase, the con-
ducting (on-site appraisal activities)
phase, and the reporting phase.
Assessment tailoring decisions
were guided by the following
considerations:
• The joint assessment had to be designed in such a way as to produce the same result that standalone assessments would.

• The outcome of the tailored assessment had to be considered legitimate by the respective communities associated with each model.

• The combined assessment had to have the rigor of individual assessments while reducing the duplication of information collected during the assessment.
At a practical level, the integrated
assessment involved the following
activities:
• Performing joint planning and preparation

• Conducting joint data gathering sessions where feasible

• Performing separate rating and reporting tasks consistent with each method as it applied to each model

• Performing joint “results analysis” to create an integrated report and recommendations
Organizational coverage was
another significant issue. As a
starting point, an assessment team
consisting of some inhouse person-
nel and myself selected a set of 62
development projects for the pro-
ductivity assessment. We chose
[Figure 3 — Overall appraisal flow: common planning and preparation feeds common data collection, which splits into CMM assessment data collection and productivity data collection, followed by rating and reporting.]
these particular projects in order to
provide a statistically significant
representation of the organization’s
development environment.4 Our
selection criteria included: plat-
form, programming language
mix, and the presence of vendor-
supplied functionality. From these
62 projects, we then selected eight
for the CMM appraisal. The project
selection criteria for the CMM
appraisal included:
• Software-intensive projects either completed or in the latter stages of development and testing

• Projects that, as a group, represented the site’s software work

• Projects that, as a group, had work for which the CMM key process areas (KPAs) could be evaluated

• Projects for which personnel were available to participate during the site visits

• Projects that were consistent with (a subset of) the projects selected for the process analysis
Data collection, data consolidation,
and rating activities were per-
formed independently by the
respective development teams,5
and the assessment team sepa-
rately developed and reported the
results, including conclusions and
recommendations.6 I then com-
bined, analyzed, and presented
these results and recommenda-
tions for subsequent process
improvement.
WHAT THE JOINT ASSESSMENT REVEALED
From a SW-CMM Perspective
The joint assessment found that the
organization met most goals of the
CMM Level 2 KPAs but not those
relating to the software quality
assurance and software subcon-
tract management KPAs. At CMM
Level 3, goals for the organization
process focus, integrated software
management, and intergroup coor-
dination KPAs were fully satisfied,
but just one of the two goals in the
organization process definition and
software product engineering KPAs
was satisfied. The organization was
therefore rated at Level 1 against
the SW-CMM, version 1.1. Major
recommendations for achieving
CMM Levels 2 and 3 in a future
appraisal included the following:
� Enhancing the collection,
analysis, and planning for
delivery of training
� Formalizing subcontract
management procedures
already in place
� Developing and deploying
a process for formal peer
reviews of work products
� Institutionalizing SQA
reviews/audits
From a Productivity Assessment Perspective
Data was collected from all of the
projects in the sample. The assess-
ment team used the collected
data to generate comparisons for
six basic software metrics (see
Table 1). The comparisons for
each project were made to projects
of the same type, size, and com-
plexity.7 For example, we would
compare a 500-function-point
client-server project to a statistical
subset of the ISBSG (International
Software Benchmarking Standards
Group) database or other industry
data made up of 500-function-point
client-server projects. Therefore,
deviation was measured from a
relative zero point.

Table 1 — Productivity Metrics at the Portfolio Level

Time to market: Slower than average
Productivity: More productive than average
Project staffing: Projects use more staff than average
Defect removal efficiency: Below average
Delivered defects: Higher than average
Project documentation: Less documentation produced than average

4 The initial combination assessment was done for a client with three development locations (two major and one minor).
5 Data collection was coordinated through daily joint team meetings.
6 Further refinement of the methodology will reduce the amount of independent data collection.
7 Comparisons are made based on available data, which typically includes industry data and internal organizational data, if available.
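To make the peer-group comparison concrete, here is a minimal sketch of selecting an industry subset of comparable projects and expressing a project's productivity as a deviation from that relative zero point. This is an illustration, not the assessment team's actual tooling; the field names and sample records are hypothetical:

```python
# Hypothetical peer-group comparison: select industry projects of the
# same platform and similar size, then express the subject project's
# productivity as a deviation from the peer average (the relative zero).

def peer_subset(industry_data, size_fp, platform, tolerance=0.25):
    """Industry projects on the same platform within +/-25% of size_fp."""
    low, high = size_fp * (1 - tolerance), size_fp * (1 + tolerance)
    return [p for p in industry_data
            if p["platform"] == platform and low <= p["size_fp"] <= high]

def deviation_from_peers(project, industry_data):
    """Productivity deviation from the peer-group average."""
    peers = peer_subset(industry_data, project["size_fp"], project["platform"])
    if not peers:
        return None  # no comparable projects; fall back to broader data
    peer_avg = sum(p["fp_per_month"] for p in peers) / len(peers)
    return (project["fp_per_month"] - peer_avg) / peer_avg

industry = [
    {"platform": "client-server", "size_fp": 480, "fp_per_month": 9.0},
    {"platform": "client-server", "size_fp": 520, "fp_per_month": 11.0},
]
subject = {"platform": "client-server", "size_fp": 500, "fp_per_month": 11.5}
print(f"{deviation_from_peers(subject, industry):+.1%}")  # +15.0%
```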
The portfolio-level metrics profile
reflects the results gathered
against the SW-CMM. Organiza-
tions that actively manage with
the assistance of metrics require
the process discipline engendered
in the project management areas
(i.e., the software project plan-
ning and project tracking and
oversight KPAs) to attain maximum
effectiveness from the metrics.8
Some observers have noted that
there is a propensity to trade off
between time to market, project
staffing profiles, and productivity. It
appears that if an organization that
is beginning to develop process dis-
cipline isolates its focus on any one
of these variables, it takes its eye off
the others. Organizations at Level 1
or 2 tend not to have the process arti-
facts and discipline required for
optimizing all three variables.
JOINT ANALYSIS AND RECOMMENDATIONS
The primary goal for the joint
assessment was to develop and
prioritize improvement goals based
on both the SW-CMM and produc-
tivity assessment frameworks. The
prioritization was based on metrics
deemed important by the sponsor
and the organization’s manage-
ment. Process improvement rec-
ommendations were formulated
after joint analysis of all of the
assessment results.
Using the mapped relationship
between the SW-CMM and the
productivity assessment frame-
work, I projected the impact of the
improvement recommendations.
Table 2 shows an example of one
of the specific recommendations.
Each recommendation included
the forecasted quantitative benefits
of its implementation.
Each of the specific recommendations
identified the CMM KPAs that would
be strengthened and provided a
quantified estimate of the impact
on productivity, time to
market, delivered defects, total
defects, and maintenance. The
factors used to show change were
selected based on feedback from
the sponsor and the goals of the
client organization.
The recommendations that came
out of the joint assessment ranged
from specific changes in the client’s
development methodology all the
way up to and including KPA-level
process implementation (e.g., peer
reviews). The steps required to
achieve CMM Levels 2 and 3 would
yield large increases in productivity
and reductions in time to market. The
integrated recommendations gave
the client critical information for
supporting effective allocation of
scarce resources for future
improvement activities.

Table 2 — Recommendation Example

Improvement: Methodical review of deliverables by author and peers
• Define peer review process
• Identify deliverables to be reviewed
• Collect and use defect data
• Review and audit process

Forecasted impact:
Productivity improvement: 8%-11%
Time-to-market reduction: 3%-5%
Delivered defect reduction: 12%-16%
Total defects reduction: 7%-9%
Maintenance reduction: 6%-8%

Key process areas impacted:
• Peer reviews
• Software product engineering
• Organization process definition

8 It should be noted that if an organization waits until it has attained the proper level of discipline before using metrics for management, it will typically be too late. I recommend measuring early and letting the resulting demand for the metrics pull the process forward.
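As a quick illustration of how such forecasts can be applied, the percentage ranges from Table 2 can be turned into a projected range for any baseline metric. This is a sketch with hypothetical numbers, not the method used to produce the table:

```python
# Apply an improvement range such as Table 2's 8%-11% to a baseline
# metric, returning the forecast interval. The baseline is made up.

def apply_improvement(baseline, pct_range, lower_is_better=False):
    """Forecast (low, high) after an improvement of pct_range = (lo, hi)."""
    lo, hi = pct_range
    if lower_is_better:  # e.g., delivered defects or time to market
        return baseline * (1 - hi), baseline * (1 - lo)
    return baseline * (1 + lo), baseline * (1 + hi)

# Hypothetical baseline of 10.0 FP per staff-month, 8%-11% improvement:
low, high = apply_improvement(10.0, (0.08, 0.11))
print(f"forecast productivity: {low:.1f} to {high:.1f} FP/staff-month")
# forecast productivity: 10.8 to 11.1 FP/staff-month
```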
CONCLUSION
The joint CMM and productivity
assessment was successful both in
its intended design and in its out-
comes. In measuring the capability
and productivity of the organiza-
tion, the assessments found that
the organization was at CMM Level
1 with overall above-average pro-
ductivity. The individual CMM and
productivity results, as well as the
joint recommendations, proved
useful to the organization by focus-
ing future process improvement
efforts.
This assessment represented a con-
tinued evolution coupling process
maturity and productivity assess-
ments. The combined assessment
significantly reduced the time and
effort impact on the client organiza-
tion that would have resulted from
separate events. One impact reduc-
tion was the ability to combine
assessment preparations. Also,
combining the models and meth-
ods reduced the assessment team’s
time on-site from three weeks
to two weeks. Finally, and most
importantly, the combined assess-
ment allowed the organization to
develop integrated recommenda-
tions based on joint coverage,
which would have been difficult
to develop from separate events.
The following improvements to the
methodology and its implementa-
tion are under development:
� Continued refinement of the
mapping of organizational
effectiveness attributes and
their “levels” (leading edge,
above average, average,
below average, deficient) to
the SW-CMM key practices
(and CMMI process areas)
in order to provide a stronger
basis for performing joint data
collection and consolidation
� A more effective correlation
between CMM techniques
and productivity data
collection processes and
additional work aids (e.g.,
scripted questions, detailed
interview planning, and well-
defined requirements for the
organizational documenta-
tion to be reviewed) to
enhance the ability to per-
form joint data collection
� Deepening the level of team
member expertise in both
CMM and productivity assess-
ment methods so as to per-
mit a single team to perform
the combined assessment
� Introduction of interim
reviews to facilitate early
discussion of findings and
support identification of
trends or global issues to
improve data capture and
contribute to more efficient
changes in data collection
strategies, schedule, and
focus during the combined
assessment
Thomas Cagley is a Managing Senior
Consultant for The David Consulting
Group. He is an authority in guiding
organizations through the process of
integrating software measurement with
model-based assessments to yield effec-
tive and efficient process improvement
programs. Mr. Cagley is a recognized
industry expert in the measurement and
estimation of software projects. His areas
of expertise encompass management
experience in methods and metrics, qual-
ity integration, quality assurance, and the
application of the Software Engineering
Institute’s Capability Maturity Model to
achieve process improvements.
Mr. Cagley has managed many types
of projects within the IT field, including
large-scale software development, con-
version, and organizational transforma-
tion projects. Based on his expertise,
Mr. Cagley managed the development
of an internal project management certifi-
cation program for software project man-
agers for a major bank holding company.
He has also managed and performed
quality assurance (technical and process)
for large IT organizations. He is a fre-
quent speaker at metrics, quality, and
project management conferences.
Mr. Cagley can be reached at E-mail:
Hitting the Sweet Spot: Metrics Success at AT&T
by John Cirone, Patricia Hinerman, and Patrick Rhodes
There is ample literature supporting
the value of software measurement
— and the relative immaturity of
the software development industry,
which implies equally immature
software measurement models. However, the
attention given to understanding
software productivity and quality is
encouraging. A software measure-
ment program can provide a useful
and proactive management tool
that combines quantitative and
qualitative components [1]. In addi-
tion, not only does a metrics pro-
gram provide a great deal of insight
into the software development and
maintenance process, but as
Raymond J. Offen and Ross Jeffery
observe, it reveals the intersection
of technical and market impera-
tives and the tensions between
them [6].
Shari Lawrence Pfleeger’s analysis
suggests that two out of three met-
rics initiatives do not last beyond a
second year [7]. Karl Wiegers fur-
ther notes that up to 80% of soft-
ware metrics initiatives fail [8]. He
suggests the following reasons:
� Lack of management
commitment
� Measuring too much,
too soon
� Measuring too little, too late
� Measuring the wrong things
� Having imprecise metrics
definitions
� Using metrics data to
evaluate individuals
� Using metrics to motivate
rather than to understand
� Collecting unused data
� Lacking training and
communication
� Misinterpreting the
metrics data
Creating a reasonable, phased
project with clear definitions and
goals can prevent many of these
problems. However, if there is no
management commitment, or mis-
information is communicated, it
will be difficult to establish a suc-
cessful program.
IT performance measures, which
have been used in the past to
show productivity and quality
performance results, must now
evolve into more sophisticated,
business-oriented measures,
and CIOs will be required to
demonstrate their contribution to
the business using this new form of
measurement [1]. Traditional met-
rics programs measure individual
projects, a factor that is integral in a
development manager’s project
planning and execution. However,
Anandasivam Gopal and his
coauthors suggest that a program’s
success depends on senior man-
agement [2]. As we explain the
value of a metrics program to man-
agement and financial sponsors, it
is difficult to ask for support without
offering a direct incentive in the form
of quantitative information for
managing the business.
BUILDING A SOLID FOUNDATION AT AT&T
Looking back to its inception in
April 1999, our organization’s meas-
urement program was based on
some key factors:
� Incremental implementation
� Developer participation
� Separate and dedicated
program managers
� Regularly scheduled feed-
back (quarterly scorecard
reviews)
� Automated data collection
� Training
� Support of the project
champion’s goals
Case studies agree that the key fac-
tors found in our approach are vital
for success [3]. Wiegers would also
say that the way to create a suc-
cessful measurement culture is to
start small, explain why, share the
data, define the data clearly, and
understand data trends.
Choosing Function Points
One of the fundamental decisions
that we made early in the program
was to use function points to meas-
ure the size of our applications in
order to derive productivity, cost,
cycle time, and quality measures.
Although there are many measure-
ment models and methodologies to
quantify software size, we chose
function points as our unit of meas-
ure and applied them consistently
across the organization.
Function point counting measures
software development from the
customer’s point of view, quantify-
ing functionality provided to the
user based primarily on logical
design. This definition is in agree-
ment with the International
Function Point Users Group’s
Function Point Counting Practices
Manual, Release 4.1 [4]. Our deci-
sion to use function points is sup-
ported by metrics experts David
Garmus, David Herron, and Capers
Jones, who contend that function
points, although not perfect, are the
best choice for studying software
economics and quality [1, 5].
Getting Outside Help
The second fundamental decision
was to use an outside vendor to
count our function points. This
enabled us to share some of the
risk for program success, alleviate
bias in counting practices, and have
an experienced skill set readily
available. The main objectives of
our program were significant, and
we realized early that it would take
time and practice to create a pro-
gram that would produce useful
and accurate measures for charting
the company’s direction. Our
objectives were to:
� Create and maintain a man-
agement tool to measure
software development and
maintenance independent
of the technology used for
implementation
� Measure functionality that the
user requests and receives
� Develop repeatable and
measurable performance
profiles
� Improve the metrics data
collection process
� Establish a set of measures
that support improvement
initiatives (productivity
and quality)
� Identify measures that sup-
port business strategies
� Expand to help estimate and
choose projects as well as
manage contracts
� Align with supporting pro-
grams such as release plan-
ning, project management,
our standard development
framework, and software
quality assurance
Doing Time Reporting Right
We learned quickly that a well-
defined and reasonable time-
reporting process was essential
once we had accurate function
point counts. We found that if we
did not have a time-reporting
process in place or maturing in
parallel with our function point
counting program, function points
would be less valuable.
In order to understand our produc-
tivity and costs, we needed to
categorize work effort to apply to
the function points, the unit in
which effort was being measured.
We spent time defining the work
activity, which we categorized as
new development and mainte-
nance. Next, we created projects,
which grouped like applications
by customer and category
(warehouse, transactional inter-
nally developed, transactional
vendor developed) and limited the
number of projects so individuals
would more likely report to the cor-
rect project. Each project had the
agreed-upon task structure below
it for time-reporting purposes. We
tracked releases (enhancement
projects) by quarter and built quar-
terly development tracking into the
task structure.
Taking a Phased Approach
We took a phased approach and
assigned a central project man-
agement group to track process,
communicate definitions, provide
status, and support 100% time
reporting. This central group priori-
tized applications, completed a
baseline count for 42% of the appli-
cations identified as core applica-
tions, and performed estimated
counts for the remainder. We chose
not to use backfiring (a computed
function point value based on the
total number of lines of code and
the complexity level of the pro-
gramming language) because of
its reported inaccuracy [1].
Each quarter, we counted releases
for the baselined applications and
replaced all estimated counts with
actual counts by the end of 2000.
We set goals to replace estimates
with actual counts and prioritized
releases (counted medium and
large). In an effort to manage to
a budget, we combined small
releases over quarters and counted
them collectively while merging the
effort expended across quarters to
create them.
Benchmarking Our Performance
Once we felt that we were counting
function points accurately and
reporting time appropriately, we became
comfortable with the results that
were being reported on our quar-
terly scorecards. Our next question
was how we were doing as com-
pared to other development shops.
We began to look for benchmark
data, which helped serve two
important purposes. It facilitated
inspection of our measures for
accuracy, and it gave us the infor-
mation we needed to set goals for
achieving above our industry’s
benchmark.
One of the most tedious tasks was
to align our data definitions with
those of the benchmark data so
that we would actually have com-
parative data. The benchmark data
was organized by the following cat-
egories: warehouse, transactional
internally developed, and transac-
tional vendor developed; and func-
tion point ranges: S, M, L, and XL.
We used other variables as well —
operating system, programming
languages, and so on — to get the
best comparisons. Our benchmark
data (specific to our portfolio) iden-
tified a “sweet spot” to aim for. For
us, this translated to quarterly
releases for cycle time and release
sizes of 76-175 function points.
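Once the definitions are aligned, the bucketing itself is mechanical. The sketch below is illustrative only: the category names come from the text, but the S/M/L/XL boundaries are hypothetical (chosen so that the 76-175 sweet spot falls inside one band):

```python
# Map a release onto the benchmark's category and size range.
# Bucket boundaries are hypothetical; categories follow the article.

SIZE_BUCKETS = [(75, "S"), (175, "M"), (500, "L"), (float("inf"), "XL")]

def size_bucket(function_points):
    for upper_bound, label in SIZE_BUCKETS:
        if function_points <= upper_bound:
            return label

def benchmark_key(release):
    # category: "warehouse", "transactional internal", or
    # "transactional vendor"
    return (release["category"], size_bucket(release["function_points"]))

print(benchmark_key({"category": "warehouse", "function_points": 120}))
# ('warehouse', 'M')
```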
Keeping It Simple
We tried to keep the program as
simple and nonintrusive to the
development and maintenance
process as possible. We used data
that already existed, and where we
needed to collect new data, we
automated and simplified collec-
tion as much as we could.
Gopal et al.’s study supports the
need to (1) keep the technical envi-
ronment “user-friendly” and (2)
engage stakeholders (i.e., develop-
ers, managers, and senior leader-
ship) in order to have a successful
metrics program. Their study shows
a positive and significant inter-
dependence between these two
factors, which contributed to met-
rics program performance. They
further state that a successful met-
rics program is one that results in an
increase of metrics use in decision-
making and has a positive impact
on organizational performance [2].
Leveraging the Data
We began to see that the data we
collected and analyzed became
very useful not just for application-
specific decisions but for decisions
that would impact the entire devel-
opment shop. However, the latter
required us to arrive at an aggre-
gated view of the work being done
while still being able to drill down
into the details to explain trends
and anomalies. As we experi-
mented to come up with an aggre-
gated view and then agreed upon a
method, we soon found that the
data helped us plan and manage
human resources more effectively.
We reviewed our time reporting
on a regular basis and began to
understand our development-to-
maintenance ratio as compared
to our plan. As we started new
projects, this data helped us see
how to distribute resources more
effectively across the entire
development shop.
Data analysis was also influential in
creating processes to support hit-
ting the “sweet spot” (for both
release size and cycle time) where
we can be most competitive. We
have created recommendations for
a quarterly release schedule, fine-
tuned our time reporting and esti-
mating in a centralized manner
across the development shop,
and aligned our tools with the
processes we have developed.
Finally, we are now able to explain
the value we provide in a measur-
able way. One of the most mar-
ketable outputs from this program
is the ability to articulate our value
and to react quickly to information
requests, with supporting details.
WAYS TO MEASURE PRODUCTIVITY — AND WAYS NOT TO
For this article, we will focus on
the labor productivity metric for
new software development. The
measure is a quantification of the
number of function points of new
software functionality that one soft-
ware developer can develop in one
month’s time. This productivity
metric is called “function points per
staff-month” and is calculated as
the number of function points
developed for a release divided
by the person-months of labor
expended. The same approach can
be used for the development cycle
time and software maintenance
metrics as well as new software
development productivity.
Table 1 contains sample data for
two successive quarters. Each row
in the quarter represents newly
developed software that was insti-
tuted into the production system in
the quarter (although some of the
labor associated with the release
may have been expended in prior
quarters). For the purposes of this
article, we assume all of the sys-
tems in this universe can be cate-
gorized as either a transactional
system or a data warehouse system
(indicated in column B). Column C
is the size of the release stated in
function points. Column D is the
staff-months of labor expended to
develop each release. Column E is
the productivity metric “function
points per staff-month,” calculated
by dividing column C by column D.
An aggregate productivity total can
also be correctly calculated by
dividing the total in column C by
the total in column D.
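The arithmetic is easy to verify. The following snippet is illustrative only, with the figures copied from Table 1; it reproduces column E for each release and the aggregate for the quarter:

```python
# Reproduce Table 1, quarter #1: per-release productivity (column E)
# and the aggregate (total FP divided by total staff-months).

quarter_1 = [  # (type of application, release size in FP, staff-months)
    ("Transactional", 10, 1.2),
    ("Transactional", 175, 17),
    ("Data warehouse", 15, 3),
    ("Data warehouse", 75, 9),
    ("Transactional", 100, 7.5),
]

for app_type, fp, labor in quarter_1:
    print(f"{app_type:<15} {fp:>4} FP / {labor:>4} SM = {fp / labor:5.2f}")

total_fp = sum(fp for _, fp, _ in quarter_1)
total_labor = sum(labor for _, _, labor in quarter_1)
print(f"Aggregate: {total_fp}/{total_labor:.1f} = {total_fp / total_labor:.2f}")
# Aggregate: 375/37.7 = 9.95, matching Table 1
```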
Table 2 contains sample external
benchmark data, which was
derived from a software bench-
marking firm after a study of the
applications being benchmarked.
In this case, the benchmark for
a given release is determined
through a combination of the type
of application (column A) and the
release size in function points
(column B).
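That lookup can be encoded directly from Table 2 as a small banded map (an illustrative sketch; the "> 150" rows become open-ended bands):

```python
# Table 2 as a lookup: benchmark productivity by application type and
# release size band (FP per staff-month).

BENCHMARKS = {
    "Transactional": [((1, 50), 10), ((51, 150), 12.5),
                      ((151, float("inf")), 11)],
    "Data warehouse": [((1, 50), 6), ((51, 150), 7.5),
                       ((151, float("inf")), 7)],
}

def benchmark_productivity(app_type, release_size_fp):
    for (low, high), bench in BENCHMARKS[app_type]:
        if low <= release_size_fp <= high:
            return bench
    raise ValueError("no benchmark band for this release size")

print(benchmark_productivity("Transactional", 175))  # 11 (the >150 band)
print(benchmark_productivity("Data warehouse", 75))  # 7.5
```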
Pairing the quarterly release data
in Table 1 with the benchmark data
in Table 2 yields the benchmark
comparison in Table 3. Column C
is our achieved productivity calcu-
lated in Table 1. Column D is the
benchmark productivity determined
by looking up the application type
and release size in the benchmark
data table (shown in Table 2).
This comparison is great for meas-
uring individual releases. Detailed
examination and root cause analy-
sis of releases that exceeded or
missed the benchmark can lead
to great process improvements,
which ultimately yield consistently
higher productivity in future
releases. This is the traditional
value of the function point metric,
and, in our opinion, it continues to
be its greatest value.

Table 1 — Quarterly Software Release Data
(A = release #, B = type of application, C = release size in function points,
D = labor in staff-months, E = productivity in FP per staff-month)

Quarter #1
A  B               C    D     E
1  Transactional   10   1.2   8.33
2  Transactional   175  17    10.29
3  Data warehouse  15   3     5.00
4  Data warehouse  75   9     8.33
5  Transactional   100  7.5   13.33
   Total           375  37.7  9.95

Quarter #2
A  B               C    D     E
1  Data warehouse  90   14    6.43
2  Data warehouse  100  12    8.33
3  Transactional   80   8     10.00
4  Data warehouse  75   9     8.33
5  Data warehouse  10   2     5.00
6  Transactional   60   4.5   13.33
   Total           415  49.5  8.38

Table 2 — Function Point Development Productivity Benchmark Data
(A = type of application, B = release size in function points,
C = benchmark productivity*)

A               B       C
Transactional   1-50    10
                51-150  12.5
                > 150   11
Data warehouse  1-50    6
                51-150  7.5
                > 150   7

*Productivity expressed as new function points developed per staff-month of labor
The problem lies in using this data
to calculate some sort of overall
benchmarked productivity meas-
ure that can be compared from
quarter to quarter. This would be
useful for measuring overall trends
in development productivity and
for executive-level benchmarking
presentations. Below we examine
(and discount) some simple
approaches to creating such a
measure.
Trend the Actual Productivity
This approach would not use any
external benchmark data; we
would simply trend our actual
aggregate productivity. In our
sample data, using the quarterly
productivity totals from column E in
Table 1, we would say that produc-
tivity in aggregate declined from
9.95 in quarter #1 to 8.38 in quarter
#2, a drop of 15.8%.
This is a flawed approach because
the benchmark data tells us that
enhancements on decision support
applications are inherently less pro-
ductive than enhancements for
transactional applications. Also,
productivity varies by the size of the
release. This approach ignores
those factors entirely. In an even
modestly complicated develop-
ment shop, quarterly productivity
measures can vary widely from
quarter to quarter depending
on the type of applications
enhanced as well as the size
of the enhancement.
Count the Winners and Losers
This approach expresses the num-
ber of releases exceeding the
benchmark as a percentage of the
total number of releases. In our
sample data, for quarter #1, two
of five releases (40%) exceeded the
benchmark. In quarter #2, three
of six releases (50%) exceeded the
benchmark, an improvement of ten percentage points.
This is a valid metric and one we
use every quarter. It is useful for
measuring how often a develop-
ment group is achieving bench-
mark. Its use is limited, however,
because it weights all releases
equally, regardless of how big
the release was.
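For completeness, the calculation looks like this (illustrative, using the Table 3 figures):

```python
# "Winners and losers": share of releases whose actual productivity
# exceeded the benchmark.

def pct_exceeding_benchmark(releases):
    """releases: list of (actual_productivity, benchmark_productivity)."""
    winners = sum(1 for actual, bench in releases if actual > bench)
    return winners / len(releases)

q1 = [(8.33, 10), (10.29, 11), (5.00, 6), (8.33, 7.5), (13.33, 12.5)]
q2 = [(6.43, 7), (8.33, 7.5), (10.00, 11), (8.33, 7.5), (5.00, 6),
      (13.33, 12.5)]
print(f"Q1: {pct_exceeding_benchmark(q1):.0%}")  # Q1: 40%
print(f"Q2: {pct_exceeding_benchmark(q2):.0%}")  # Q2: 50%
```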
Table 3 — Actual Versus Benchmark Productivity
(B = type of application, C = actual productivity, D = benchmark
productivity, E = comparison to benchmark)

Quarter #1
B               C      D     E
Transactional   8.33   10    Below
Transactional   10.29  11    Below
Data warehouse  5.00   6     Below
Data warehouse  8.33   7.5   Above
Transactional   13.33  12.5  Above
Total           9.95   9.40

Quarter #2
B               C      D     E
Data warehouse  6.43   7     Below
Data warehouse  8.33   7.5   Above
Transactional   10.00  11    Below
Data warehouse  8.33   7.5   Above
Data warehouse  5.00   6     Below
Transactional   13.33  12.5  Above
Total           8.38   8.58  Below

Table 4 — Averaging the Benchmark Data

            Actual        Benchmark     Actual as %
            Productivity  Productivity  of Benchmark
Quarter #1  9.95          9.40          105.9%
Quarter #2  8.38          8.58          97.7%
Average the Benchmark Data
In this approach we would com-
pare our aggregate productivity
to the average of the benchmark
productivity (see Table 4). In our
example, we would average the
benchmark data in column D of
Table 3, giving us 9.40 for quarter
#1 and 8.58 for quarter #2. Using
this method, we could create quar-
ter over quarter comparability by
expressing actual productivity as a
percentage of average benchmark
productivity.
This method would tell us that our
aggregate productivity dropped
from 105.9% of benchmark to 97.7%
quarter over quarter, a 7.7% drop in
productivity.
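In code, the simple-averaging method is one line of arithmetic. This sketch is illustrative; small differences against Table 4 appear when the averages are carried at full precision instead of the rounded 9.40 and 8.58:

```python
# Compare aggregate actual productivity to the unweighted average of
# the per-release benchmarks (the Table 4 method).

def actual_vs_avg_benchmark(actual_aggregate, benchmarks):
    avg_benchmark = sum(benchmarks) / len(benchmarks)
    return actual_aggregate / avg_benchmark

q1 = actual_vs_avg_benchmark(9.95, [10, 11, 6, 7.5, 12.5])      # avg 9.40
q2 = actual_vs_avg_benchmark(8.38, [7, 7.5, 11, 7.5, 6, 12.5])  # avg ~8.58
print(f"Q1: {q1:.1%}  Q2: {q2:.1%}")  # ~105.9% and ~97.7% of benchmark
```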
This approach is better, but the sim-
ple averaging of the benchmark for
each enhancement again weights
them all the same, regardless of
how big they were. Consider a
quarter with two releases, a huge
one (1,000+ function points) that
beat the benchmark by a large mar-
gin and a tiny one (1 function point)
that missed the benchmark by an
equally large margin. This simple
average calculation would indicate
that we hit the benchmark exactly,
when intuitively it felt like a very
productive quarter.
OUR APPROACH
Over time we have refined the
benchmark averaging method
described above to weight the
aggregate benchmark average
by the size of the releases.
Conceptually, we calculate the
labor it would have taken if we
performed exactly at benchmark
and then calculate aggregate
benchmark productivity from that
number. This is explained in more
detail below.
The formula for our benchmark
is Productivity = Release Size/
Labor. Using simple algebra,
we infer that Labor = Release
Size/Productivity. If we want to
know the labor required to
perform exactly at benchmark,
we can further refine the formula
as Benchmark Labor = Release
Size/Benchmark Productivity. In
Table 5, we calculated this theoreti-
cal benchmark labor as shown in
column E by dividing column C by
column D. We can total column E
to get a grand total of the labor it
would have taken if every enhance-
ment had been performed exactly
at benchmark.
To get the weighted average
aggregate benchmark, we divide
the total function points in all
enhancements by the total labor
at benchmark. In Table 5, this is
the total in column C divided by
the total in column E:
Quarter #1 Weighted Average Aggregate Benchmark = 375/37.4 = 10.02
Quarter #2 Weighted Average Aggregate Benchmark = 415/49.9 = 8.32
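The same calculation in code (an illustrative sketch of the formulas above, fed the Table 5 inputs):

```python
# Weighted average aggregate benchmark: total FP divided by the labor
# that would have been needed at exactly benchmark productivity.

def weighted_aggregate_benchmark(releases):
    """releases: list of (release_size_fp, benchmark_productivity)."""
    total_fp = sum(size for size, _ in releases)
    benchmark_labor = sum(size / bench for size, bench in releases)
    return total_fp / benchmark_labor

q1 = [(10, 10), (175, 11), (15, 6), (75, 7.5), (100, 12.5)]
q2 = [(90, 7), (100, 7.5), (80, 11), (75, 7.5), (10, 6), (60, 12.5)]
print(round(weighted_aggregate_benchmark(q1), 2))  # 10.02 (375/37.4)
print(round(weighted_aggregate_benchmark(q2), 2))  # 8.31 (the text's
                                                   # rounded 415/49.9 = 8.32)
```

Fed the two-release example from the previous section (a huge release that beat the benchmark and a tiny one that missed it), this weighting lets the large release dominate the aggregate, which is exactly what the simple average failed to do.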
Table 5 — Weighted Average Aggregate Benchmark
(A = release #, B = type of application, C = release size, D = benchmark
productivity, E = labor required to achieve benchmark)

Quarter #1
A  B               C    D     E
1  Transactional   10   10    1.0
2  Transactional   175  11    15.9
3  Data warehouse  15   6     2.5
4  Data warehouse  75   7.5   10.0
5  Transactional   100  12.5  8.0
   Total           375        37.4

Quarter #2
A  B               C    D     E
1  Data warehouse  90   7     12.9
2  Data warehouse  100  7.5   13.3
3  Transactional   80   11    7.3
4  Data warehouse  75   7.5   10.0
5  Data warehouse  10   6     1.7
6  Transactional   60   12.5  4.8
   Total           415        49.9
Now we can express our aggregate
productivity as a percentage of the
weighted average aggregate pro-
ductivity to get a ratio we can com-
pare (see Table 6).
Using this method, we actually
trended our productivity upward by
1.4 percentage points (relative to
benchmark) quarter over quarter. Note that
the weighted average indicates a
trend that is increasing, while the
simple average indicated that pro-
ductivity had dropped (an outcome
that further illustrates the frailties of
the simple average).
When developing the aggregate
benchmark, this method takes into
account release size as well as
application type and weights them
appropriately. A trend plotted from
this data is an extremely valuable
tool for communicating overall pro-
ductivity trends. We currently have
a 12-quarter trend line that high-
lights steady improvements over
time. This has proven to be a
valuable addition to the metrics
program as a high-level commu-
nication tool.
CONTINUOUS PROCESS IMPROVEMENT
During the early stages of imple-
menting a measurement program,
an organization should focus on
validating the process at the funda-
mental level to ensure a sound
base for the program. Key require-
ments in the initial phase of the pro-
gram are consistent definitions for
system applications, system docu-
mentation, and development and
maintenance task code structures.
We implemented various process
improvements to enhance the level
of accuracy for time-reporting data,
clearly defined development and
maintenance activities, and for-
mally captured software release
data for utilization in the develop-
ment release sizing. Comparison to
industry benchmark data was com-
pleted at the detail level to ensure
consistent application of develop-
ment and maintenance task levels.
We compiled quarterly trend analy-
sis for both development and main-
tenance productivity metrics over a
sustained time period to assist in
validating the accuracy of the met-
rics process. This gave us the
opportunity to expand the analysis
and reporting at an aggregate level
for IT and executive management
reporting.
Once the measurement program
matured at the detailed level, the
focus of process improvements
shifted from validation of the data
to internal analysis and aggregate
comparisons to industry data. Our
next phase is to expand the metrics
process further back into the devel-
opment lifecycle. Currently, all
analysis addresses software pro-
ductivity following the completion
of the development lifecycle. We
have investigated performing
function point analysis of potential
software releases prior to the
system development phase based
on detailed user and system
requirements.
Projecting the function point count
of software releases at the require-
ments stage will assist in resource
planning, improve the overall cost
estimates, more accurately deter-
mine software delivery schedules,
and highlight additional functional
changes to requirements that are
inherent in, but not often measured
within, the development cycle
(scope creep). This measure can
be accomplished utilizing the same
requirements documentation avail-
able at the end of the project and
would not require the developers
to compile additional documents.
This data analysis would also high-
light the potential stability or volatil-
ity of the business requirements
presented in the initial develop-
ment phase and would be an
invaluable tool in managing user
and executive management expec-
tations for software delivery.
Our organization has also planned
another key quality process
improvement to more closely
align defect tracking with the
economic measurement process.
Going forward, the development
shop intends to track the relation-
ship between release sizes,
associated productivity measures,
cycle time, and software defects
encountered in production as part
of its overall analysis.

Table 6 — Aggregate Productivity as a Percentage of the Weighted Average Aggregate Productivity

            Actual        Weighted Benchmark  Actual as Percentage
            Productivity  Productivity        of Benchmark
Quarter #1  9.95          10.02               99.3%
Quarter #2  8.38          8.32                100.7%
At the detailed level, we have uti-
lized industry benchmark data to
compare software delivery for
individual application software
releases and have tracked aggre-
gate productivity against the aggre-
gate benchmark data for the total
development shop. This aggregate
benchmark represents an industry
average, and while this has served
us well, we have matured past
comparing ourselves to the aver-
age. Our next goal is to determine a
benchmark indicating first quartile
(top 25%) and our performance
in comparison to external high-
performing IT organizations.
SUMMARY
Our program utilizes the precision
of a traditional software measure-
ment program to track diverse
application-level productivity
through function points, and it sum-
marizes the information to produce
an aggregate to measure the overall
effectiveness of the development
shop against an aggregate compos-
ite of industry benchmarks. The
comparison is an effective tool for
demonstrating IT productivity and
value to both IT and executive-level
management.
In developing the program, we
viewed the implementation as a
phased approach and focused our
initial investment on the accuracy
and validity of the captured data. As
with any long-term IT initiative in
the current corporate climate,
demonstrated short-term success
is significant in gaining continued
commitment from the organization.
For the program to be sustainable
over time, it must have commit-
ment from the developers, be flexi-
ble enough at the detail level to
adapt to inevitable organizational
changes, and continue to mature
through process improvements
and continual leadership vision
and sponsorship.
REFERENCES
1. Garmus, D., and D. Herron.
Function Point Analysis.
Addison-Wesley, 2000.
2. Gopal, A., M.S. Krishnan,
T. Mukhopadhyay, and D.R.
Goldenson. “Measurement
Programs in Software
Development: Determinants of
Success.” IEEE Transactions on
Software Engineering, Vol. 28, No. 9
(September 2002), pp. 863-875.
3. Hall, T., and N. Fenton.
“Implementing Effective Software
Metrics Programs.” IEEE Software
(March/April 1997), pp. 55-64.
4. International Function Point
Users Group. Function Point
Counting Practices Manual, Release
4.1. IFPUG Standards, 1999.
5. Jones, C. Applied Software
Measurement. McGraw-Hill, 1996.
6. Offen, R.J., and R. Jeffery.
“Establishing Software
Measurement Programs.” IEEE
Software (March/April 1997),
pp. 45-53.
7. Pfleeger, S.L. “Lessons Learned
in Building a Corporate Metrics
Program.” IEEE Software (May
1993), pp. 67-74.
8. Wiegers, K. “10 Traps to Avoid.”
Software Development, Vol. 5,
No. 10 (October 1997), pp. 49-53.
John Cirone received his BS in computer
science from Kean University and an
MS in technology management from
Stevens Institute of Technology. He cur-
rently leads the IT organization respon-
sible for all of the finance and human
resource systems at AT&T. The metrics
program discussed in this article was
instituted for Mr. Cirone’s organization.
Mr. Cirone can be reached at AT&T, One
AT&T Way, Room 2B121, Bedminster, NJ
07921-0752, USA. E-mail:
Patricia Hinerman received her BS in
neuroscience from the University of
Scranton and an MBA from Rutgers
University. She is currently a member
of AT&T’s technical staff and has seven
years of experience managing require-
ments and developing software appli-
cations. Ms. Hinerman was one of the
principals in the development and main-
tenance of the metrics program discussed
in the article.
Ms. Hinerman can be reached at AT&T,
30 Knightsbridge Road, Room 52G19,
Piscataway, NJ 08854-3913, USA. E-mail:
Patrick Rhodes received his BA in political
science and English and MBA from
Rutgers University. He is currently a
member of AT&T’s technical staff and
has 20 years of experience supporting all
phases of system development within the
AT&T financial systems. Mr. Rhodes was
one of the principals responsible for the
development and implementation of the
metrics program discussed in the article.
Mr. Rhodes can be reached at AT&T,
30 Knightsbridge Road, Room 52A254,
Piscataway, NJ 08854-3913, USA. E-mail:
From Important to Vital: The Evolution of a Metrics Program from Internal to Outsourced Applications
by Barbara Beech
My journey in metrics began more
than six years ago, when I was
asked to put together metrics for a
group within our IT development
division. The group was composed
of about 40 systems and 600 IT staff,
and it collected very little, if any,
metrics data. There was a mandate
across the total development divi-
sion to provide metrics data for
review with our CIO.
I knew collecting metrics data was
going to be a hard sell, especially
with the technical staff. Technical
staff typically feel that collecting all
this data just gets in the way of “real
work.” Luckily, I had come from a
similar organization and had faced
this same task. This is the story of
how I got started, what happened,
and how this program evolved
once we began to outsource our
work. The adage “you can’t man-
age what you don’t measure” is
really true. Unless a CIO has some
measures in place for demonstrat-
ing both the cost and quality of sys-
tems, it is impossible to tell how
well the organization is doing com-
pared to the industry average and
best in class.
Since I had the top-level support I
needed, my first task was to assess
the existing situation to see where
the organization was. An enterprise
metrics handbook and enterprise
metrics repository that I had previ-
ously helped to develop would
serve as the standards for what
metrics we would collect and
how we would report them.
Most articles I read say top-level
support is critical to making a met-
rics program happen. This certainly
is true, as it helps to motivate
groups to begin the process of
collecting metrics data. However,
top-level support will only go so far.
You need to get down to the details
to make metrics collection and
reporting a reality. Here are the
steps we took:
� We developed a metrics
program plan for the organi-
zation. This consisted of the
metrics I wanted to collect
and the process for collection.
� We developed an action plan
to begin the collection of
data. It is important to get
upper management to realize
that data collection does not
happen overnight!
� We developed the software
quality assurance (SQA) role
within the organization
and ensured that the SQAs
worked for a central, inde-
pendent group.
� We gave the SQAs responsi-
bility for collecting metrics
data for their projects.
� We established a metrics
coordinator who was respon-
sible for both scheduling
function point counts across
the applications and validat-
ing data.
� We coupled metrics collec-
tion with improvements the
organization was making to
reach Level 2 of the Software
Engineering Institute’s (SEI)
Capability Maturity Model
(CMM). This was very effec-
tive, since achieving Level 2
requires you to show that you
are collecting metrics data.
� I began communicating the
value of metrics at various
division meetings. Sell, sell,
and sell! This involved
explaining how the data
collected would help the
groups identify how they
were doing and help improve
processes. I made sure the
groups understood that the
data would not be used
against them.
� We identified the metrics that
would be easiest to collect
first. I felt it was important to
start reporting something
soon, even if it wasn’t all the
metrics that were planned.
� I worked with the groups that
were the most receptive,
which put peer pressure on
the other groups to follow.
� We developed a scorecard at
various levels (application,
division, and all systems) and
reviewed the scorecard at
monthly meetings with the
various groups.
� We began the process of
using the collected metrics
data for estimation purposes.
This became a harder task
than it seemed, because
we couldn’t use the metrics
data directly without a profile
of the project staff. Still, it
showed a lot of promise
and value for the CIO and
the development division.
Figure 1 shows the action plan that
we developed. It is important to
note that it would take at least six
months to get a good sample of
data from across the organization,
and it took longer to obtain a good
base of data for some metrics than
for others.
Figure 2 shows an example of the
scorecard we developed. There
were different versions within the
organization at various levels, such
as the division and application lev-
els. There are some important
points to note here. We measured
quarterly progress as well as year-
to-date progress. We also estab-
lished objectives after reviewing
our baseline data and compared
our quarterly results to the monthly
objectives. Our yearly objectives
were based on percentage
improvements from our baseline
results. I also compared our results
to best-in-class data in addition to
our yearly objective.
THE CHALLENGES WE FACED
The Perception That Metrics Were More Work
To tackle this, I tried to make the
collection of metrics data as trans-
parent to the technical staff as pos-
sible. The SQAs were charged with
the task of collecting metrics data
for the projects they worked on.
This worked quite well. Since the
SQAs worked with the project
groups, it was easier for them to
collect the data and report it.
Using Function Points to Size the Work
There is an endless debate on the
use of function points, one that
needs to be put to bed early in a
metrics program. I challenged the
skeptics to give me a better size
measure to use. To date, no one
has provided anything better than
function points. I know they have
their limitations, but they are the
best method the IT community has
for sizing work. So we hired an
external consultant to baseline and
count enhancement projects. We
tried to take up the least amount of
the project teams’ time to do this.
Having an external resource do the
function point counting on an as-
needed basis was very effective
from a time and cost perspective.

Figure 1 — Our metrics action plan. For each metric, the quarterly targets give the percentage of system releases providing metrics data (1Q through 4Q, then 1Q99 and 2Q99), followed by the action plans to attain the targets:

Productivity: 29%, 33%, 50%, 75%, 100%. Schedule function point counting and establish metrics program and metrics collection process.
Defect removal efficiency: 29%, 33%, 35%, 50%, 75%, 100%. Increase number of inspections and training. Monitor inspection reports.
Delivered defect density: 29%, 33%, 50%, 60%, 75%, 100%. Establish metrics program and metrics collection process. Automate defect collection.
Overall defect density: 29%, 33%, 50%, 60%, 75%, 100%. Increase number of inspections. Publish inspection results. Automate defect collection.
Labor cost/FP: 29%, 33%, 50%, 75%, 100%. Schedule function point counting and establish metrics program and metrics collection process.
Development cycle time: 29%, 33%, 50%, 75%, 100%. Establish metrics program and metrics collection process.
Cost variance: 29%, 33%, 50%, 75%, 100%. Establish metrics program and metrics collection process. Further implement automated tool for estimating.
Collecting Accurate Time Data
Most metrics require accurate time
data to be collected on projects.
You need to know the number of
hours spent on specific projects to
determine a project’s cost per func-
tion point. Although we had an
internal time-tracking system,
ensuring that data was entered cor-
rectly and charged to the right proj-
ect was quite a challenge. Our
SQAs would review the time data
for all projects to ensure it was cor-
rect. However, with an internal
organization, time reporting is
never going to be exact. As long as
the organization was evaluated on
the overall cost of all projects,
ensuring that the time for all proj-
ects was correct was a difficult if
not impossible task. Tracking over-
time hours was also a challenge.
So we did the best we could with
the data while acknowledging its
limitations.
Collecting Defect Data
If you want to know the quality of
your systems, you need to collect
the defects associated with your
projects. This is probably the hard-
est data to begin to gather, and it
takes the longest. If there is no
defect data currently being col-
lected, then it could take months to
begin to collect this information.
You need a tool to gather and
record the defects associated
with all application releases and
their severity levels. It is best to
standardize on a tool, but that
depends on the environment you
are working in: client-server or
mainframe. So you might have sev-
eral tools across your organization.
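The arithmetic of the quality measure discussed next, delivered defect density, is simple once production defects are tied to a specific release. This sketch assumes a per-100-function-point normalization, which is one common convention; the article does not prescribe one:

```python
# Delivered defect density: production defects normalized by release
# size. The per-100-FP convention and the sample numbers are
# illustrative assumptions.

def delivered_defect_density(production_defects, release_size_fp, per=100):
    """Defects found in production per `per` function points delivered."""
    return production_defects / release_size_fp * per

print(f"{delivered_defect_density(4, 160):.1f} defects per 100 FP")  # 2.5
```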
Furthermore, collecting the defects
is not enough; you need to ensure
that they are associated with a spe-
cific application release. The best
measure to use here is delivered
defect density. Looking at total
defects from requirements to pro-
duction is great, but very few orga-
nizations track defects well enough
in the requirements stage to make
this a very valid measure. Most
organizations seem to collect data
only on testing defects. To collect
defects on up-front work (i.e.,
requirements/coding), you need to
have a good inspection process in
place. So even though we collected
data on overall defect density, I
doubt we were actually capturing
all the defects throughout the entire
software development process.

Figure 2 — Our metrics scorecard. The scorecard showed each end-to-end metric with its 1Q99 and 2Q99 results, YTD progress, 1999 objective, and best-in-class figure, plus a status comparing results to the 1999 objective:

Productivity (FP/effort-months): RED
Cost/function point: GREEN
Cost variance (actual: estimated): GREEN
Defect removal efficiency: YELLOW
Delivered defect density: RED
Overall defect density: GREEN
Project cycle time: GREEN
Process compliance: GREEN

Status key:
Red: Major improvement needed compared with 1999 objective
Yellow: Some improvement needed compared with 1999 objective (within 20%) or additional data needed
Green: As good as (within 2%) or better than 1999 objective
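The status rule in the scorecard's legend can be read as a simple threshold function. The sketch below is one interpretation (treating "within 20%" as a shortfall of at most 20% against the objective), not the exact rule behind the scorecard:

```python
# Red/yellow/green status versus the yearly objective, following the
# legend: green within 2% (or better), yellow within 20% or when data
# is incomplete, red otherwise. Thresholds as interpreted, not exact.

def scorecard_status(actual, objective, higher_is_better=True,
                     have_data=True):
    if not have_data:
        return "YELLOW"  # "additional data needed"
    gap = ((objective - actual) / objective if higher_is_better
           else (actual - objective) / objective)
    if gap <= 0.02:
        return "GREEN"
    if gap <= 0.20:
        return "YELLOW"
    return "RED"

print(scorecard_status(9.8, 10.0))  # GREEN  (within 2%)
print(scorecard_status(8.5, 10.0))  # YELLOW (within 20%)
print(scorecard_status(6.0, 10.0))  # RED
```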
MAKING YOUR METRICS DATA USEFUL TO OTHERS
The following are some ways to
make the metrics data you collect
of value to others:
� Communicate the value of
the data you are collecting
and show that it can be used
to improve processes. This is
a tough sell to some groups,
but it is easier with those
groups that want to achieve
SEI CMM Level 2 and
improve their processes.
� Post data and send the met-
rics results out to the applica-
tion development teams
frequently.
� Try to tie metrics results to
objectives. This is difficult,
since many times getting a
project out is more important
than anything else and
becomes the overwhelming
objective.
� Couple metrics data with an
estimating process. Use the
data you collect to help you
estimate your work more
accurately.
� Translate metrics results into
a real public relations oppor-
tunity for the CIO. If IT devel-
opment costs and quality are
within industry standards —
to say nothing of best in class
— this can go a long way
toward demonstrating the
effectiveness and efficiency
of IT development within the
company.
HOW OUTSOURCING AFFECTED OUR METRICS PROGRAMS
Two years into my work on the
metrics program, senior manage-
ment decided to outsource all the
development work. Obviously, this
changed things a lot. Suddenly,
defining the right service levels for
the outsource contract became
critical, and metrics collection
became very important!
What we found out:
� Outsourcing automatically
engages all your systems in
the collection of service-level
data. What was a difficult sell
prior to outsourcing becomes
part of the contract, and all
groups must participate
whether they like it or not.
� We didn’t feel that our inter-
nal data was robust enough
to use as service-level agree-
ment (SLA) targets, so we
decided to rebaseline the
data with the vendor.
� Time tracking becomes
easier, since this is how the
vendor is paid.
� Application development
teams that weren’t tracking
metrics data before now need
to get on the bandwagon.
� Quality measures are critical,
and so is tracking defect
data. Tracking defect data
minimizes risk to the busi-
ness when new releases
are deployed.
� You need to measure not
only delivered defect density
but also residual or latent
defect density to get the best
picture of the application
quality.
� Different measures need to
be added besides cost and
quality. We also need to
measure our vendor on
responsiveness to requests
and timely delivery of
enhancements.
� The vendor now owns all
application work, which was
previously split among vari-
ous internal IT groups with
competing interests. This
means that measuring proj-
ects end to end has become
easier.
� Validation of data is critical,
since a vendor can be penal-
ized if it misses service-level
targets.
� Benchmarking has become
very important, since we
need to ensure that our
outsource cost and quality
are at least meeting industry
averages.
� Establishing the right service
levels at the beginning of the
contract is vital.
What works better:
� The vendor is now responsi-
ble for the collection of all
data and needs to provide
information specified in the
contract. Therefore, there is
less fighting with internal
organizations just to collect
the data.
� Time tracking is more exact
than it was within our
internal organization.
� Defect tracking is initiated for
all applications and standard
tool sets are applied.
What is harder:
� There is less flexibility to
change or add new metrics.
� Validation of data that the
vendor is providing is a diffi-
cult task. This is more critical
now since we need to ensure
that the vendor is reporting
correct data.
� Root cause analysis
processes for service-level
misses need to be developed
with much rigor in a multi-
vendor environment.
Something we did not do
before as an internal organi-
zation now needs a lot of
focus since we need to know
the reason for any service-
level failures and what is
being done to correct them.
The chart in Figure 3 lists the met-
rics we have begun collecting from
our vendors in the following areas:
� Cost: Critical financial drivers
� Quality: Measures directly
affecting internal AT&T
customers
� Responsiveness: Metrics
ensuring that AT&T meets
commitments to business
partners
� Customer Satisfaction:
Qualitative measurement
of the success of the
partnership
WHAT WE HAVE LEARNED
� Metrics are essential to any
software development proj-
ect or program, whether it is
inhouse or outsourced.
� Even though I sometimes see
metrics programs disappear,
they always come back in
some shape or form.
� You can’t know where you
are as an IT development
organization if you don’t
measure quality and cost.
� Standard definition of metrics
is important.
� Collection of metrics data is
hard and takes time.
� Knowing where you stand
in relation to the rest of the
industry is important for set-
ting improvement goals.
� There is no point in getting
hung up on function points.
They are just a size measure,
and the best one I have
found to date across various
platforms.
� Metrics collection should
be coupled with process
improvement activities.
� The data must be believable.
If no one believes that the
data you are reporting is
accurate, they will not take
the metrics you report seri-
ously, and your metrics pro-
gram will soon lose steam.
Barbara Beech is a District Manager
at AT&T in the Consumer CIO Vendor
Management Division. She has worked
at AT&T for 19 years in the area of soft-
ware development. During that time, she
was involved in the development of new
systems supporting both business and
consumer services. For the past seven
years, her focus has been on process and
metrics. She has worked to establish a
balanced scorecard, helped application
teams achieve CMM Level 2, and sup-
ported the definition of service levels for
outsourcing initiatives.
Ms. Beech can be reached at AT&T,
30 Knightsbridge Road, Room 53C338,
Piscataway, NJ 08854, USA. Tel: +1 732
457 3715; E-mail:
Figure 3 — Metrics we collect from our vendors.

Cost:
• Cost per function point for enhancements
• Estimate accuracy (committed versus delivered)
• Estimate accuracy (original versus committed)

Responsiveness:
• Project estimates within commitment time frames
• Enhancement within commitment time frames
• Ad hoc requests within commitment time frames
• Enhancement cycle time

Quality:
• Delivered defect density
• Residual defect density
• System availability (IUMs)
• Number of production application defects
• Production application defects closed within commitment time frames
• Business process metrics
• End-to-end user response time for critical systems

Customer Satisfaction:
• Customer satisfaction survey

Key Development Initiatives:
• Critical deliverables
• Key deliverables
Key development initiatives must be delivered with the specified functionality, schedule, and quality (production defects).
Benchmarking for the Rest of Us
by Jim Brosseau
To be effective, businesses of all
sizes need to understand their own
performance. While large, estab-
lished organizations will typically
have a solid infrastructure and a
constant finger on their pulse,
smaller, growing companies often
struggle with fundamental issues.
This is highly prevalent in technol-
ogy organizations, where an entre-
preneur with the Next Great Idea
suddenly finds that organizational
issues are consuming more and
more of his or her precious time. It
is wise for any organization to have
a clear understanding of how well it
is performing, either as an impetus
to improve or as a basis for under-
standing how much work it should
reasonably take on in the future.
What is the best structure for our
organization? How productive
should we expect to be? What
should we be paying our staff?
These and many other questions
need to be resolved for small, grow-
ing companies to get beyond their
nontechnical hurdles to success.
With all these questions to answer
and so little time, there is often a
rush to quickly find “the solution,”
whether a general solution really
exists for the industry as a whole
or not. Among the frequently asked
questions are the following:
What is the appropriate ratio of
software testers to developers?
Companies want to use this num-
ber to mold the structure of the
development organization, but
there is no right answer here. I have
worked on significant projects in
which the developers successfully
performed the bulk of the testing of
the system, and I have worked with
teams where dedicated testers out-
numbered developers almost 2:1
and still could not keep up with the
issues that were cropping up.
How productive should I expect
my team to be (given a variety of
factors)? There is clear bench-
marking data available indicating
average ratios of function points to
lines of code and productivity in
terms of lines of code per day, given
the type and criticality of the appli-
cation being developed. There are
a great deal of data points behind
the scenes used to make the infor-
mation statistically relevant, usually
with extremely wide variations.
This variability rarely makes it to
the surface of the data presented,
but it is a strong indicator that your
mileage may vary — and probably
by a large amount, even within
your team.
How should I compensate my
staff? Industry salary reports have
dropped from their staggering
heights of a few years ago to reflect
the changing times. Still, there is
significant geographical variation
to consider.
For a variety of reasons, many com-
panies turn to externally generated
benchmarking data to provide the
answers they need. Unfortunately,
there is a dark side to the quick and
sometimes blind use of external
information. Use of benchmarking
data needs to be carefully tem-
pered if it is to provide value for
organizational improvement.
BENCHMARKING DATA IS ALLURING
For better or worse, most organiza-
tions refer to industry benchmark-
ing data as a means of gauging their
performance. Like most people in
the industry, I’ve done my fair share
of quoting statistics from the
Standish Group’s 1994 CHAOS
Report [6] and used the quarterly
reports from the Software
Engineering Institute (SEI) to
describe industry performance in
discussions with clients. A number
of people and organizations have
collected and disseminated a great
deal of benchmarking data over the
years, including the guest editor of
this issue, David Garmus.
Collected benchmarking data is
relatively easy to obtain as, for the
most part, it is readily available,
albeit for a price. It generally comes
from well-established, reputable
sources, either published in books
or trade journals or available for
purchase from a number of organi-
zations worldwide. It is usually well
organized and indexed in a manner
that will allow you to quickly arrive
at the information you are looking
for. Using data from reputable
sources will help you to back up
your assertions and can make your
arguments much more compelling
and defendable. It can be an indi-
cation that you have “done your
homework.”
At times, however, the allure of
benchmarking data comes from its
external sterility. The data provided
is based on other people’s perfor-
mance, and it may provide a sani-
tized look at what the industry is
doing. For some organizations, it
can become a game to blithely
quote industry performance figures
while avoiding internal measure-
ment, knowing that the truth can
be a bitter pill to swallow.
BENCHMARKING DATA: CAVEAT EMPTOR
He uses statistics as a
drunken man uses lamp-
posts — for support rather
than illumination.
— Andrew Lang
Imagine a situation in which you
decide when it is best to leave for
work in the morning by observing
your neighbor’s departure patterns.
Over the course of a month, his
average departure time is 7:15,
give or take five minutes or so.
That’s pretty consistent, so you
decide that 7:15 must be appropri-
ate for you as well. Unfortunately,
your neighbor works about a mile
away, while you have a cross-town
trek. Worse yet, you may be on the
afternoon shift, or you may work
from home. Is that benchmarking
data worthwhile?
There are a number of problems
associated with using the industry
data that we have all turned to on
occasion. We need to be extremely
careful to drill down past the super-
ficial presentation of the data —
usually a simplified table or graph
— to determine if it is applicable to
our situation at all. Quite often, the
data will be presented in a form
that may be visually compelling
while obfuscating some important
elements of information that would
otherwise be helpful. With the gen-
eral availability of spreadsheets and
graphics packages, we often find
ourselves interested in the super-
ficial presentation rather than the
intrinsic information.
Wide Variability, Hidden Bias
Beyond the simple data points
presented in benchmark data, it
is important to recognize that the
underlying data may have potential
hidden biases or wide variations
within the sample space. These
attributes, if not clearly understood,
can lead one to rely more heavily
on the benchmark data than is
reasonable.
Parametric estimation models, for
example, are essentially the result
of curve-fitting exercises based on a
broad sample space of thousands
of completed software projects,
which can make the models com-
pelling to use for early, whole-
project estimates. The SLIM
parametric estimation model is
based on a large number of proj-
ects, divided into roughly a dozen
different industry types [3], with the
intent to provide a sample space
that is relevant to your situation. As
you drill deeper, though, you find
that the variation within each of
these industry types is very wide
and that your performance may
actually be closer to the median
performance of an industry type
that does not appear to be close
to yours.
The COCOMO II parametric model
[1] introduces a bias of another
form. While the sample space is
much smaller, it is important to
note that many of the projects that
have been used for curve-fitting the
model are primarily in the defense
and aerospace realm, where prac-
tices are such that there is a very
low correlation with commercial
software development or other
development types.
Both the SLIM and COCOMO II
models have been fit primarily
with projects that are fairly large in
terms of effort and scope. It would
be erroneous to assume that the
models could be extrapolated
down for use on small projects. To
blindly use these models “out of the
box” for small projects or projects
that have not been calibrated
appropriately would be to generate
estimates that are falsely defend-
able. While the data behind the
models has been validated, that
does not mean that it cleanly maps
to your situation.1
1 Beyond the selection of a specific parametric model for estimation, there is the question of which estimation procedure to use. Many organizations will try to take a published procedure (such as that used by the NASA Software Engineering Laboratory [2]) and its embedded information (such as uncertainty, phases, and approaches) and call it their own. While there are industry-wide principles that an estimation procedure should embrace, there is not a one-size-fits-all solution.
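To see why extrapolation is risky, it helps to look at the general shape these models share: effort grows nonlinearly with size, with constants fitted to a repository of completed projects. The sketch below is illustrative only; the constants and the calibration range are assumptions, not the published SLIM or COCOMO II calibrations.

# A minimal parametric effort model of the general COCOMO II form:
#   effort_pm = A * size_ksloc ** E
# where A and E are fitted to completed projects. The constants below
# are illustrative assumptions, not a published calibration.

FITTED_A = 2.9                       # productivity constant (illustrative)
FITTED_E = 1.10                      # exponent > 1 models diseconomy of scale (illustrative)
CALIBRATION_RANGE = (20.0, 1000.0)   # assumed KSLOC range of the fitting data

def estimate_effort_pm(size_ksloc: float) -> float:
    """Estimated effort in person-months for a project of the given size."""
    low, high = CALIBRATION_RANGE
    if not (low <= size_ksloc <= high):
        # Outside the range of the fitting data, the curve is pure
        # extrapolation; its apparent precision is false.
        print(f"warning: {size_ksloc} KSLOC is outside the calibration range")
    return FITTED_A * size_ksloc ** FITTED_E

# A 2-KSLOC project falls far below the fitting data, so the number
# returned here should not be treated as defensible.
print(round(estimate_effort_pm(2.0), 1))

Calibrating such a model against your own completed projects, rather than using it out of the box, is what turns the fitted curve into something defensible for your context.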
Sparse, Slanted SEI Data
The SEI’s quarterly Process Maturity
Profile of the Software Community
may suffer from bias problems of its
own. According to the August 2002
release [5], the report shows that
19.3% of reporting organizations
are performing at Level 1 (the
Initial level) of the SEI’s Capability
Maturity Model (CMM) scale, which
is quite a strong positive indicator
for the industry as a whole. The
fine print, however, indicates that
this figure is “based on the most
recent assessment, since 1998, of
1,124 organizations.”
There are a couple of points to
note here. This sample space is
extremely small considering the
number of software development
organizations worldwide. In addi-
tion, it is biased not only toward
organizations that are aware of the
SEI, the CMM, and the suggested
best practices they promote, but
also toward organizations that have
reported results to the SEI from for-
mal assessments.
In most organizations that I have
worked with in the past four years,
the majority of the people were not
aware of the SEI, and their prac-
tices and performance clearly
placed them in the Initial level of
the CMM. Among those organiza-
tions that claimed to have attained
a higher level of maturity on the
SEI’s scale (i.e., Levels 3-5), most,
in my experience, were not able to
perform in accordance with even
those goals attributed to the
Repeatable level (Level 2).
(Ir)Relevance of Annual Data
Often, benchmarking data is pro-
vided on an annual basis, which
allows you to subscribe and remain
current. One must be careful, how-
ever, to determine whether time-
based trends would be relevant for
the information provided. Clearly,
annualized reports showing the
equivalent of the average results of
1,000 coin tosses will yield limited
additional insight. While some
benchmarking data will benefit
from annualized updates (such
as new data sectors or evolving
trends), there are other classes
of data that do not benefit to the
same degree.
It’s Not a Divining Rod
There is danger in using bench-
marking data to determine direc-
tion for your organization. Industry
averages in IT spending, for exam-
ple, can be extremely revealing if
you are on the receiving end of that
spending trend. They can also be
used as one of the drivers for fore-
casting, especially if historical
spending trends have tracked well
with your performance in the past.
If you are looking at how much
your organization should be spend-
ing, historical benchmarking data
will tell you where the industry has
been, but it will not help you
resolve how to best address your
organizational needs in the future.
Budgeting for future spending
based on industry trends fails to
address what is important for you.
BENCHMARKING DATA’S SILVER LINING
All models are wrong.
Some models are useful.
— George Box
All this is not to say that you should
never use externally generated
benchmarking data within your
organization. There is a great deal
of consideration and industry
research that has gone into much
of the available benchmarking
data, and it is important to under-
stand how and whether the data
appropriately applies to your
situation.
Benchmarking data that is used as
a basis for or result of certifications
or qualifications — such as ISO
quality standards, SEI maturity lev-
els, or the Project Management
Institute’s Project Management
Professional (PMP) designation —
provides an indication that the
organization or individual has
passed a baseline level of perfor-
mance or understanding. ISO-
certified organizations have clearly
identified their quality practices
and demonstrated that they “prac-
tice what they preach” (although
this is not a guarantee that their
next project will be a success), and
the certification can reasonably be
used as part of the criteria in an
acquisition process. Individuals
with the PMP designation have
been assessed to have knowl-
edge of a base set of commonly
accepted project management best
practices and have performed a
prescribed amount of work in the
project management arena (but
this is not a guarantee that they are
effective project managers).
For much of the benchmarking data
that is available, the underlying
assumptions, variability of the data,
and inherent biases can usually be
identified with some digging. The
information may be published
along with the primary information
that has been distilled, available
from the provided reference
information, or obtained through
deeper discussion with the data
provider if one is so inclined (and
diligent).
THE PERSONALIZED SOLUTION: ANSWER YOUR OWN QUESTIONS FIRST
A reasonable approach to the
use of metrics data is to balance
external benchmarking data with
internally derived data to help
you understand whether or not
you are achieving your organiza-
tional goals. As Peter Senge noted
in The Dance of Change, we need
to measure to learn rather than to
merely report [4].
With an understanding that the first
step is to identify our goals in the
measurement process, we can lean
on Vic Basili’s Goal-Question-Metric
(GQM) approach or
extend and elaborate on that prac-
tice using techniques such as the
balanced scorecard. Identifying
these goals and the model we will
use to validate the goals allows us
to remove biases from the response
and resist the temptation to use
data simply because it is readily
available. Our quests become
tightly coupled with our culture
and organizational needs.
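As a small illustration of goal-driven measurement, the sketch below records a goal, the questions that probe it, and only those metrics that answer a question. The goal, questions, and metrics named here are hypothetical examples, not a prescribed set.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Metric:
    name: str
    source: str  # where the raw data will come from

@dataclass
class Question:
    text: str
    metrics: List[Metric] = field(default_factory=list)

@dataclass
class Goal:
    statement: str
    questions: List[Question] = field(default_factory=list)

# Hypothetical example: the goal drives the questions, and only metrics
# that answer a question get collected, not whatever data is handy.
goal = Goal(
    statement="Improve the delivered quality of our releases",
    questions=[
        Question(
            text="How many defects reach production per unit of delivered work?",
            metrics=[Metric("delivered defects per function point", "defect tracker")],
        ),
        Question(
            text="Is delivered quality trending up or down across releases?",
            metrics=[Metric("defect density by release", "defect tracker plus size counts")],
        ),
    ],
)

for q in goal.questions:
    print(q.text, "->", [m.name for m in q.metrics])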
Using this approach, we can then
perform our own internal meas-
urements and make comparisons
against industry benchmarks
where it makes sense. With the
added internal diligence, we will
have a more valuable understand-
ing of the biases that are inherent in
the data (and a comfort that the
biases are more likely to be work-
ing for us than against us) and of
the uncertainty or variability in the
data set.
It is important to recognize the dis-
tinction between statistical variabil-
ity across industry benchmarking
data and individual performance
variability that will arise in your own
measured data. The former is an
indication of the relative applicabil-
ity of the information to your situa-
tion, while the latter is an expected
artifact of the measurement
approach that needs to be fully
appreciated. You need to accept
individual variation as a fact of life.
Even if you use the information to
cull the low-performing individuals
(not a recommended practice),
you will continue to have variabil-
ity; by definition, 50% of the people
will always fall below the median of
your data set. It is dangerous to fall
into the trap of using measures for
segregating the team rather than for
improving the organization.
Some of your greatest insight will
come as you track your own meas-
urements over time and observe
the variation and trends that are
revealed. This information is not
something that can be gained from
industry benchmarking data, but
you can see whether you are
tracking toward or away from the
industry data, which will provide a
greater indication of the applicabil-
ity of the benchmarking informa-
tion to your situation.
For this tracking to be effective,
you need to be consistent in your
measurement approaches within
your organization over time. One
commonly hears concerns in the
industry about inconsistency of
measurement, whether in the his-
togram categories used to collect
data or in the measure itself (such
as the highly variable lines-of-code
measure). The bottom line
here is that you should select a spe-
cific approach, identify that it is the
standard, and stick with it in order
to ensure that you are indeed mak-
ing apples-to-apples comparisons.
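A minimal sketch of such tracking appears below. The release figures and the industry median are invented for illustration; the point is that effort and size are counted under one fixed standard for every release, so both the internal trend and the drift against the external benchmark are meaningful.

# Internal productivity per release, measured the same way every time
# (effort hours and function points counted under one fixed standard).
releases = {  # release: (effort_hours, function_points) -- invented figures
    "2001-R1": (15400, 1100),
    "2001-R2": (14100, 1050),
    "2002-R1": (13900, 1150),
    "2002-R2": (12600, 1120),
}
INDUSTRY_MEDIAN_HPFP = 14.0  # assumed external benchmark, hours per function point

for release, (hours, fp) in releases.items():
    rate = hours / fp
    drift = rate - INDUSTRY_MEDIAN_HPFP
    print(f"{release}: {rate:5.1f} h/FP ({drift:+.1f} vs. industry median)")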
Industry benchmarking data defi-
nitely has its place in your arsenal of
information for making strategic
business decisions. Still, it has limi-
tations that must be overcome with
a deep understanding of why you
are measuring and balanced with
data gathered internally with rea-
sonable approaches. Taken with a
grain of salt, benchmarking infor-
mation can give us the perspective
we need to better understand what
our internal information is telling us.
REFERENCES
1. Boehm, Barry, Bradford Clark,
Ellis Horowitz, Ray Madachy,
Richard Selby, and Chris
Westland. “Cost Models for Future
Software Life Cycle Processes:
COCOMO 2.0.” Annals of Software
Engineering (1995) (http://sunset.
usc.edu/research/COCOMOII/
index.html).
2. National Aeronautics and
Space Administration. The
Manager’s Handbook for
Software Development, Revision 1
(Software Engineering Laboratory
Series SEL-84-101). NASA, 1990
(http://sel.gsfc.nasa.gov/website/
documents/online-doc/84-101.pdf).
3. Putnam, Lawrence, and Ware
Myers. Measures for Excellence:
Reliable Software on Time, Within
Budget. Yourdon Press Computing
Series, Pearson Education POD,
1992.
4. Senge, Peter, Art Kleiner,
Charlotte Roberts, George Roth,
Rick Ross, and Bryan Smith. The
Dance of Change: The Challenges
to Sustaining Momentum in
Learning Organizations.
Currency/Doubleday, 1999.
5. Software Engineering Institute.
Process Maturity Profile of the
Software Community 2002 Mid-Year
Update. SEI, August 2002.
6. The Standish Group. The
CHAOS Report. The Standish
Group, 1994 (www.standishgroup.
com/sample_research/chaos_1994
_1.php).
Jim Brosseau has 20 years’ experience
in the software industry in a wide variety
of roles, application platforms, and
domains. A common thread through his
experience has been a drive to find a less
painful approach to software develop-
ment. Mr. Brosseau has worked in qual-
ity assurance at Canadian Marconi and
was involved in the development and
management of the test infrastructure
used to support the Canadian Automated
Air Traffic System. He is Principal of the
Clarrus Consulting Group in Vancouver,
Canada, and in the past four years, he
has consulted with numerous organiza-
tions throughout North America, specifi-
cally to improve their development
practices.
Mr. Brosseau publishes the Clarrus
Compendium, a free weekly newsletter
with a unique perspective on the soft-
ware industry (www.clarrus.com/
resources.htm). He has been published
in PM Network magazine, the PMI
GovSIG magazine, and the SEA Software
Journal, and he has made presentations
at Comdex West, PSQT North, the New
Brunswick SPIN group, and several local
associations.
Mr. Brosseau can be reached at Clarrus
Consulting Group Inc., 7770 Elford
Avenue, Burnaby, BC Canada V3N 4B7.
Tel: +1 604 540 6718; Fax: +1 604 648
9534; E-mail: [email protected].
The Practical Collection, Acquisition, and Application of Software Metrics
by Peter R. Hill

Before we consider software
measurement and the collection
of software metrics, we need to
ask why we want to put ourselves
through the pain. The short answer
is that the organization’s return on
investment for IT has come under
increased scrutiny from senior
business executives and directors.
Consequently, IT now has to
operate like other parts of the
organization, being aware of its per-
formance, its contribution to the
organization’s success, and oppor-
tunities for improvement. How can
IT executives achieve this without
performance data? Flying blind is
not an option.
So what is it that managers need to
know? Here are some of the ques-
tions that executives in the banking
industry raised with me during
recent discussions about the use
of metrics in that sector:
• How do I know if my internal
IT operation is performing
satisfactorily?
• How do I decide whether I
should outsource some or all
of my IT operations?
• How do I know if my out-
sourcer is performing?
• What are the risk factors
I should consider in an
IT project?
• What questions should I ask
to ensure that an IT project
proposal is realistic?
• How do I know if a project is
healthy? What should I be
worrying about?
• What are the infrastructure
trends for software develop-
ment (languages, platforms,
tools, etc.)?
And the list goes on. Furthermore,
none of these questions can be
answered without sound data.
COLLECTING DATA
Organization-Level Versus Industry-Level Data
Having established that we need
data, how do we go about collect-
ing it? It would seem from the
questions listed above that there
are two levels at which we need
data: initially at the organization
level and then at the industry level
(with the ability to look at subsets
of industry data — by industry sec-
tor, for example). If an organization
collects data about its own IT proj-
ects and builds a repository from
this data, it can use it for macro-
estimation of future projects, do
internal benchmarking, track per-
formance improvement, analyze
what seems to work and not work
in its operations, and so on. This is
a very good start, but many of the
questions being raised by manage-
ment go beyond the organization
itself. Once there is a need to
benchmark against the world out-
side, estimate a project type that
the organization has never done
before, or analyze the performance
of other languages and tools, then
we have to find industry data.
In order for data to be useful, we
have to be able to compare “apples
with apples.” Thus it is important
that the data collected at an orga-
nization level can be compared to
data collected at the industry level.
The obvious question is, “What
should I collect?” Now this is the
fun bit! If you ask your people what
data they think you need to collect,
the list will be almost endless. If you
then produce a questionnaire to
collect all the data requested, the
same people will wail: “I can’t col-
lect and enter all that. I haven’t got
the time or patience!” It’s a lot like
system functional requirements —
“nice to have” is fine, as long as you
can afford it.
ISBSG Questionnaire
The International Software
Benchmarking Standards Group
(ISBSG) established its initial data
collection standard more than 10
years ago. ISBSG constantly moni-
tors the use of its data collection
package and reviews the package
content. It has endeavored to reach
a balance between what data is
good to have and what is practical
to collect. Rather than reinvent the
wheel, any organization can use
the ISBSG data collection question-
naire, in total or in part, for its own
use. The questionnaire is available
free from the ISBSG Web site
(www.isbsg.org), with no obliga-
tion to submit data to the group. But
whatever data collection mecha-
nism you end up with, ensure that
the only data you are collecting is
data that will be used and useful.
If you employ a questionnaire
approach to data collection, you
should give some thought to devel-
oping a set of questions that pro-
vide a degree of cross-checking.
Such an approach will allow you to
assess the collected data and rate it
for completeness and integrity. You
can then consider the ratings when
selecting a data set for analysis. As
a guide, the ISBSG employs the four
rating levels shown in Table 1.
Even D-rated projects may be worth
retaining, as they may contain some
data, perhaps qualitative, that could
be useful for a specific analysis.
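The sketch below shows one way such ratings might be assigned mechanically from completeness and cross-check results. The thresholds are hypothetical and do not reproduce ISBSG’s actual assessment rules.

def rate_project_submission(fraction_complete: float, failed_cross_checks: int) -> str:
    """Assign an A-D data rating in the spirit of Table 1.

    fraction_complete: share of questionnaire fields answered (0.0-1.0).
    failed_cross_checks: count of internal consistency checks that failed,
    e.g. phase efforts that do not sum to the reported total effort.
    Thresholds here are hypothetical, not ISBSG's actual rules.
    """
    if failed_cross_checks >= 3:
        return "D"  # a combination of factors undermines credibility
    if fraction_complete < 0.6:
        return "C"  # significant data missing; integrity cannot be assessed
    if failed_cross_checks > 0 or fraction_complete < 0.9:
        return "B"  # sound, but some factors could affect credibility
    return "A"      # sound, nothing identified that might affect integrity

print(rate_project_submission(0.95, 0))  # -> A
print(rate_project_submission(0.75, 1))  # -> B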
Automated Collection
Manual data collection will always
be painful, and collection after the
fact may increase the likelihood of
error. The best collection approach
must surely be one that is auto-
matic, occurs throughout the
complete lifecycle of a project, and
goes virtually unnoticed.
project management tools collect
data as part of the natural planning
and management process, from ini-
tial estimation through to mainte-
nance and support. Such a system
removes the pain of collection and
increases the integrity of the data.
ACQUIRING INDUSTRY DATA
Once you decide to obtain and use
industry data, more questions arise.
Where do I get industry data? How
do I know that it is sound data?
How can I be sure that the data has
not been manipulated to suit some-
one’s specific agenda? How do I
know it’s not biased?
Where Do I Get It?
Despite the need for industry data,
there seem to be few sources. The
commercial consulting companies
that offer benchmarking services
tend not to let you look at the data
used in their benchmark reports.
Industry groups will sometimes
arrange to collect data from a num-
ber of organizations that agree to
participate, but with everyone seek-
ing a competitive advantage, gain-
ing cooperation can be difficult.
Governments have been known to
encourage industry benchmarking.
In Finland, for example, the govern-
ment supported the establishment
of a national software repository to
which the major Finnish organiza-
tions contributed metrics data so
that they could benchmark them-
selves and improve their perfor-
mance. For its part, the ISBSG
repository is “open”; the data is
available to anyone who wishes
to purchase a copy.
A: The data provided was assessed as being sound, with nothing being identified that might affect its integrity.
B: While the data was assessed as being sound, there are some factors that could affect the credibility of the data.
C: Because significant data was not provided, it was not possible to assess the integrity of the data.
D: Because of one factor or a combination of factors, little credibility should be given to the data.

Table 1 — ISBSG’s Data Rating Levels
How Do I Know It’s Sound?
Establishing whether the data that
you are buying is sound involves
seeking answers to the following
questions. Is the collection instru-
ment well thought out and proven?
Has the data been rated? How old is
the data? Can I use the data to com-
pare “apples with apples”?
Has the Data Been Manipulated?
The possibility of data manipulation
is sometimes raised. Would an
organization or other entity supply
false data to a repository? The
answer could be “yes” if there is
an opportunity to profit from such
deceit. However, if the anonymity
of the submitter is maintained and
certain types of reporting on the
data are avoided, then there is no
point in submitting false data.
Where anonymity is ensured, only
those who submitted a project can
identify that project in the reposi-
tory. Why would they want to fool
themselves?
Although this approach removes
the likelihood of data manipulation,
it also removes the possibility of
comparing one organization to
another. Projects from a specific
industry sector may be identifi-
able, but projects from a specific
organization (other than your own),
will not be. So if your organization
is a bank and you want to bench-
mark your bank against other
specific banks, you can only do it
with their cooperation — and then
their honesty.
Is the Data Biased?
Data quality extends beyond the
integrity of the individual entries in
the repository that you are propos-
ing to use. Is the data representative
of the industry as a whole? At this
stage of the IT industry’s maturity,
that answer would surely be “no”
in all cases.
Any organization that has software
metrics data, or has hired consul-
tants to gather such data, or has
contributed data to a repository is
displaying a certain level of maturity
that is likely to put it at the upper
end of the scale. Normally the data
collected is from completed proj-
ects. Sadly, that in itself excludes a
lot of IT projects! Human nature
also plays a role, and despite
anonymity, the temptation may be
to submit only the better projects.
Consequently, if you have satisfied
yourself about the factual integrity
of the data, then it is highly likely
to be representative of the top 25%
of the IT industry. As we will see in
the following examples, knowing
this is useful when you come to
use the data.
APPLYING INDUSTRY DATA TO GOOD USE
So now that we have got some data
and we know what we have got,
what will we do with it? Let’s
answer a couple of the banking
executives’ questions.
Should I Outsource My IT Operations?
Answering this question is a practi-
cal application of benchmarking.
If you have data about the perfor-
mance of your internal IT organiza-
tion, then you can compare it to
industry data. Such a comparison
might reveal that the internal group
is doing a good job or that only
certain activities should be out-
sourced. If outsourcing is being
considered, such an exercise will
also provide the basis for estab-
lishing outsourcer performance
requirements. Obviously it is impor-
tant to ensure that you compare
like with like. There is no point, for
example, in comparing the perfor-
mance factors of projects devel-
oped on a PC with those developed
on a mainframe.
As a very simple example, you
might use industry data1 to gauge
your IT development organization’s
performance on the basis of the
number of hours it takes to deliver a
function point of functionality. From
the industry data, you could select
a data set based on projects with
characteristics similar to yours:
banking sector, new developments,
mainframe, COBOL. Figure 1
shows one possible result.
1 Industry figures used in the examples are from the ISBSG repository.
In this simple example, the produc-
tivity of your IT organization, as
measured in the number of hours it
takes to produce a function point of
functionality, looks pretty good; it
takes fewer hours than the industry
median of 14. Similar benchmark
reports could be produced for
speed of delivery or for a number
of different project characteristic
sets to cover the bank’s portfolio
of projects. Where “Your
Organization” appears on the
resulting graphs could influence
an outsourcing discussion.
[Figure 1 — Comparing development performance: a bar chart of hours per function point (scale 0 to 25) for Your Organization against the industry 25th percentile, median, and 75th percentile.]
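The comparison behind a chart like Figure 1 reduces to a few lines of code. In the sketch below, the industry sample and the organization’s figures are invented; a real analysis would draw the sample from a repository such as ISBSG’s, filtered by sector, platform, language, and development type.

import statistics

# Hours per function point for comparable industry projects (invented
# sample standing in for a filtered repository extract).
industry_hours_per_fp = [8.9, 10.2, 11.5, 12.8, 14.0, 15.5, 17.3, 19.8, 22.4]

# statistics.quantiles with n=4 returns the three quartile cut points.
q1, median, q3 = statistics.quantiles(industry_hours_per_fp, n=4)

your_hours, your_fp = 9500, 950  # your organization's figures (invented)
your_rate = your_hours / your_fp

print(f"industry 25%/median/75%: {q1:.1f} / {median:.1f} / {q3:.1f} h/FP")
print(f"your organization: {your_rate:.1f} h/FP")
if your_rate < q1:
    print("fewer hours per function point than three-quarters of comparable projects")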
How Can I Ensure an IT Project Proposal Is Realistic?
If a software development project
is being submitted for funding
approval, how will the decision-
makers know whether or not the
proposal is realistic? If the proposal
was for the construction of a build-
ing, a quantity surveyor would have
already estimated the project
within 5% of its likely cost and
would have similar calculations for
speed of delivery and total dura-
tion. The questions that the deci-
sionmakers need to ask about the
proposed IT project are, “How does
the proposed project compare
against industry data for similar
projects?” and “Is the proposal
realistic?” These comparisons can
be made at a number of levels:
• Project component break-
down percentages (files,
reports, inquiries, etc.)2 exist
to ensure that nothing has
been missed and that if the
project does differ signifi-
cantly from the industry
norms, the reasons are
known and verifiable.
• The project lifecycle phase
breakdown (plan, specify,
design, build, test, imple-
ment) exists, again, to ensure
that nothing has been missed
and that if the project does
differ significantly from the
industry norms, the reasons
are known and verifiable.
• Project delivery rate, work
effort, speed of delivery, and
duration are also important.
The industry data for compa-
rable projects will quickly
reveal whether the project
being proposed is realistic.
If it does vary greatly from
the industry norms, partic-
ularly if it looks optimistic,
then the reasons should be
known and verifiable.
2 There are stable industry ratios for project components and project lifecycle phases. See the ISBSG Software Metrics Compendium (www.isbsg.org).
For example, if an organization is
proposing a banking project of 500
function points, with the same
characteristics as the one shown
in Figure 1, the industry figures in
Table 2 will provide a reality check.
Given that we believe the industry
data comes from “better” projects,
then any proposal that provides fig-
ures that are closer to optimistic
than likely should be questioned
with vigor. Of course we could
add other project characteristics,
such as application type, to further
define our project. As long as the
resulting sample data set has a
reasonable number of projects in it,
then each additional characteristic
will improve the comparison
between our proposed project
and the selected group of industry
projects.
                 Project Delivery Rate        Project Work      Speed of Delivery             Duration
                 (hours per function point)   Effort (hours)    (function points per month)   (months)
Optimistic       9.3                          4,672             46.6                          10.7
Likely           14.5                         7,270             30.5                          16.4
Conservative     21.5                         10,733            19.8                          25.2

Table 2 — Estimated Metrics for a Hypothetical Banking Project (500 Function Points)
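As a rough illustration, the sketch below runs this reality check in code against the Table 2 figures. The proposal’s numbers are invented, and the flagging rule is a simplified version of the text’s rule of thumb: question anything more optimistic than the likely figures.

# Industry reality-check bands for a 500-function-point banking project,
# taken from Table 2 (delivery rate in hours per function point, speed
# in function points per month).
BANDS = {
    "optimistic":   {"delivery_rate": 9.3,  "speed": 46.6},
    "likely":       {"delivery_rate": 14.5, "speed": 30.5},
    "conservative": {"delivery_rate": 21.5, "speed": 19.8},
}

def reality_check(size_fp: float, proposed_hours: float, proposed_months: float) -> None:
    rate = proposed_hours / size_fp     # hours per function point
    speed = size_fp / proposed_months   # function points per month
    print(f"proposed: {rate:.1f} h/FP, {speed:.1f} FP/month")
    # Simplified rule of thumb from the text: figures more optimistic
    # than the likely band deserve vigorous questioning.
    if rate < BANDS["likely"]["delivery_rate"] or speed > BANDS["likely"]["speed"]:
        print("more optimistic than the likely industry figures: question with vigor")

# A proposal promising 500 FP in 5,000 hours over 11 months (invented figures).
reality_check(500, 5000, 11)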
There is no doubt that individual
organizations need data not only at
their own IT activity level but also at
the industry level. Use of good soft-
ware metrics data has extended
beyond internal IT benchmarking
and project estimation to the
broader areas of IT and business
management, including:
• Outsource performance
management
• Development scope
management
• Development infrastructure
planning
• Business case reality
checking
There are mature data collection
packages and tools available that
provide guidance on what data to
collect and how to make collection
easier. There is a growing industry
body of knowledge that can be
used to help IT make its contribu-
tion to organizational strategy, com-
petitive advantage, and profitability.
Peter Hill is the Executive Director of the
International Software Benchmarking
Standards Group (ISBSG), a not-for-profit
organization. ISBSG members are the
software metrics organizations of 11
countries. The group has built, grows,
maintains, and exploits repositories of IT
industry data. Mr. Hill has compiled and
edited four books for the ISBSG: Software
Project Estimation, The Benchmark
Release 6, Practical Project Estimation,
and The Software Metrics Compendium.
Over many years, Mr. Hill has written
articles and delivered papers at IS-
and business-oriented conferences in
Australia, New Zealand, Finland, Spain,
and Malaysia.
Mr. Hill can be reached at Tel: +61 3
9844 0560; Fax: +61 3 9844 0561; E-mail:
Cutter IT Journal Topic Index
David Garmus, Guest Editor
David Garmus is a Principal in The David Consulting Group and an acknowledged authority in the sizing, measurement, and estimation of software application development and maintenance. He has helped numerous CIOs and CFOs successfully manage expectations in software development projects, using function point analysis to enable effective IT cost management and achieve a realistic return on investment. He is the coauthor of Function Point Analysis: Measurement Practices for Successful Software Projects and Measuring the Software Process: A Practical Guide to Functional Measurements. He has served as the President of the International Function Point Users Group (IFPUG).
Mr. Garmus has more than 30 years of experience managing software development and maintenance and has taught college-level courses in computing- and finance-related subjects. He is a member of Project Management Institute, the Quality Assurance Institute, and the IEEE Computer Society, and he holds a BS from the University of California–Los Angeles and an MBA from the Harvard Business School. He can be reached at [email protected].
Upcoming Issue Themes
Enterprise Architecture Governance
The New CIO Agenda
Usability
Patterns
Killing IT Projects
June 2003 IT Metrics and Benchmarking
May 2003 Is Open Source Ready for Prime Time?
April 2003 Project Portfolio Management: Blueprint for Efficiency or Formula for Boondoggle?
March 2003 Critical Chain Project Management: Coming to a Radar Screen Near You!
February 2003 XP and Culture Change: Part II
January 2003 Ending “Garbage In, Garbage Out”: IT’s Role in Improving Data Quality
December 2002 Preventing IT Burnout
November 2002 Globalization: Boon or Bane?
October 2002 Whither Wireless?
September 2002 XP and Culture Change
August 2002 Plotting a Testing Course in the IT Universe
July 2002 Confronting Complexity: Contemporary Software Testing
June 2002 B2B Collaboration: Where to Start?