An Empirical Analysis of GNU Make
in Open Source Projects
by
Douglas Martin
A thesis submitted to the
School of Computing
in conformity with the requirements for
the degree of Doctor of Philosophy
Queen’s University
Kingston, Ontario, Canada
April 2017
Copyright © Douglas Martin, 2017
Abstract
Build systems, the tools responsible for compiling, testing, and packaging software
systems, play a vital role in the software development process. Make is one of the
oldest build technologies and is still widely used today, whether by manually writing
Makefiles, or by generating them using tools like Autotools and CMake. Despite its
conceptual simplicity, modern Make implementations such as GNU Make have become
very complex languages, featuring functions, macros, lazy variable assignments and
more.
This thesis is an exploration of Make-based open source build systems in two
parts. First, our feature analysis looks at the popularity of features and the difference
between hand-written Makefiles and those generated using various tools. We find that
generated Makefiles use only a core set of features and that more advanced features
(such as function calls) are used very little, and almost exclusively in hand-written
Makefiles. Second, our complexity analysis introduces indirection complexity – a
simple metric for measuring maintenance complexity in Makefiles using the same
feature data compiled in the first analysis. We show how this new metric can provide
a better way to measure which Makefiles will require more cognitive overhead to
understand than traditional metrics.
Both analyses utilize our framework, built with the TXL source transformation
language, to obtain a detailed parse of Makefiles in our corpus. This corpus consists
of almost 20,000 Makefiles, comprising over 8.4 million lines, from 271 different open
source projects.
Through these analyses, we aim to gain a better understanding of how the Make
language is used in the open source community (some of the most advanced users of
Make).
Co-Authorship
All papers resulting from this thesis were co-authored by my supervisor, James R.
Cordy. The work presented in Chapter 4 was co-authored by Bram Adams and Giulio
Antoniol from Ecole Polytechnique de Montreal. In each case, I was the primary
author.
Chapter 4 was published in the proceedings of the 23rd IEEE International Conference
on Program Comprehension (ICPC ’15) with James R. Cordy, Bram Adams,
and Giulio Antoniol [28]. Chapter 5 was published in the proceedings of the 7th International
Workshop on Emerging Trends in Software Metrics (WETSoM ’16) with
James R. Cordy [27].
Acknowledgments
I would like to dedicate this thesis to my grandmother, Marilyn Martin, who sadly
passed away in January of 2014 after a battle with Alzheimer’s disease. I know she
wouldn’t understand a word of this thesis, but she’d be proud anyway.
I would like to thank my supervisor, Jim Cordy, for his endless patience throughout
my doctorate. He is the reason this thesis is finally complete.
And thanks to my many lab mates over the years: Scott Grant for his valuable
wisdom and whose thesis served as an example for me while writing this; Eric Rapos,
as annoying as he could be sometimes; the lab love-birds, Paul Geesaman and Karolina
Zurowska (and her silly Polish accent); Matthew Stephan and Andrew Steveson,
who both began their PhD as I began my MSc; Amal Khalil, for suffering with me;
Juergen’s MSc trio Suchita Ganesh, Nondini Das Misti, and Leo Jweda; and finally
Nafiseh Kahani, Ashiqur Rahman, Charu Prashar Sharma, Elizabeth Antony, and
Gehan Selim, among the many others I am surely forgetting.
Statement of Originality
I hereby certify that all of the work described within this thesis is the original work
of the author. The research was conducted under the supervision of Dr. James R.
Cordy. Any published (or unpublished) ideas and/or techniques of others are fully
acknowledged in accordance with the standard referencing practices.
Douglas Martin
April 2017
Contents
Abstract
Co-Authorship
Acknowledgments
Statement of Originality
Contents
List of Tables
List of Figures
Chapter 1: Introduction
    1.1 Goal of the Thesis
    1.2 Contributions
    1.3 Summary of Thesis
Chapter 2: Background
    2.1 Overview of Technologies
        2.1.1 Make
        2.1.2 Ant
        2.1.3 Generators
    2.2 Maintenance
    2.3 Evolution
    2.4 Migration
    2.5 Modelling
    2.6 Conclusion
Chapter 3: A Corpus of Open Source Makefiles and a Framework for Analyzing Them
    3.1 Corpus
        3.1.1 Selection
        3.1.2 Classification
        3.1.3 Characteristics
    3.2 Framework
        3.2.1 Parsing
        3.2.2 Extracting Features
    3.3 Conclusion
Chapter 4: Feature Analysis of Open Source Makefiles
    4.1 Features of the GNU Make Language
        4.1.1 Readability
        4.1.2 Rules
        4.1.3 Variables
        4.1.4 Statements
        4.1.5 Functions
    4.2 Analysis
    4.3 Discussion
    4.4 Conclusion
Chapter 5: Maintenance Complexity of Open Source Makefiles
    5.1 Complexity of Software Maintenance
    5.2 Indirect Features of GNU Make
        5.2.1 Dependencies
        5.2.2 vpath, directory change (cd), paths
        5.2.3 Includes
        5.2.4 Conditionals (ifdef/ifeq)
        5.2.5 Variable References
        5.2.6 Function Calls
        5.2.7 Recursive Make
    5.3 Analysis
    5.4 Conclusion
Chapter 6: Conclusion
    6.1 Contributions
    6.2 Threats to Validity
    6.3 Future Work
Bibliography
Appendix A: Make Feature Use Summary
    A.1 All
        A.1.1 General
        A.1.2 Assignments
        A.1.3 Rules
        A.1.4 Targets
        A.1.5 Dependencies
        A.1.6 Recipes
        A.1.7 Function Calls
    A.2 Automake
        A.2.1 General
        A.2.2 Assignments
        A.2.3 Rules
        A.2.4 Targets
        A.2.5 Dependencies
        A.2.6 Recipes
        A.2.7 Function Calls
    A.3 CMake
        A.3.1 General
        A.3.2 Assignments
        A.3.3 Rules
        A.3.4 Targets
        A.3.5 Dependencies
        A.3.6 Recipes
        A.3.7 Function Calls
    A.4 QMake
        A.4.1 General
        A.4.2 Assignments
        A.4.3 Rules
        A.4.4 Targets
        A.4.5 Dependencies
        A.4.6 Recipes
        A.4.7 Function Calls
    A.5 Hand-written
        A.5.1 General
        A.5.2 Assignments
        A.5.3 Rules
        A.5.4 Targets
        A.5.5 Dependencies
        A.5.6 Recipes
        A.5.7 Function Calls
List of Tables
3.1 Overview of corpus.
4.1 Automatic Variables
4.2 Obsolete Features in Makefiles.
4.3 Recursion in Makefiles.
List of Figures
1.1 An example Makefile
1.2 Our framework parses an input Makefile, which then allows features to
    be extracted and counted for analysis.
2.1 The CMake GUI
2.2 Evolution of the Linux build system on a logarithmic scale (from [9]).
2.3 Standardized size of four Java systems and their Ant build systems
    in terms of number of lines (SLOC for source code, SBLOC for build
    scripts). Anomalies are shown in red and investigated further in
    [30]. From [30].
2.4 Complexity measures of packages in The Berkeley Automounter: a)
    1000 Lines of Code, b) Number of CPP Conditionals, and c) Number
    of CPP conditionals per 1000 lines of code. From [51].
2.5 Build-time architecture of GCC. (From [49]).
2.6 Example of a Symbolic Dependency Graph. (From [46]).
2.7 The MAKAO architecture (a) and user interface with visualization (b).
    (From [10] and [5]).
3.1 An example Makefile.
3.2 Tamrawi et al.’s Makefile grammar (from [46])
3.3 A cross section of the TXL grammar for Makefiles showing relevant
    definitions for a) variables, and b) rules.
3.4 Example XML Markup of Makefile Parse (Note: Lower level markup
    detail not shown for readability)
3.5 An excerpt of TXL rules to extract, count, and display the number of
    function calls in various locations. We use function calls as an example
    because they can be placed almost anywhere in the Makefile.
3.6 An excerpt showing how to find variable references within targets.
3.7 Sample output from the extractor, showing the result of running it on
    the example in Figure 3.1.
4.1 An example Makefile
4.2 Feature Use (% of Makefiles)
4.3 Assignment Use (% of Makefiles)
4.4 Automatic variables (% of Makefiles)
4.5 Function calls (% of Makefiles)
4.6 Locations of a) variable references, b) automatic variable references,
    and c) function calls as a percentage of the total.
4.7 Feature Use By Generator (% of Makefiles)
5.1 These examples illustrate how current metrics used to measure complexity
    in Makefiles, such as number of lines or dependencies, can be
    misleading. Example (a) has more dependencies and lines, but example
    (b) hides its complexity in another Makefile that it calls recursively.
5.2 An example Makefile.
5.3 The indirection complexity of a) our entire corpus and b) Makefiles
    with < 10 thousand lines.
5.4 The indirection complexity of our corpus, coloured by generator.
5.5 The indirection complexity of hand-written and generated Makefiles.
5.6 Examples from our corpus that illustrate the advantages of indirection
    complexity over traditional metrics such as number of lines or dependencies.
    Makefile (a) on the left has more dependencies than Makefile
    (b), but is arguably much easier to understand.
Chapter 1
Introduction
Build automation systems, or simply build systems, are the backbone of every software
development project, touching every stage of the process in some way. Developers
use the build system multiple times a day to compile and link their code. To save time, the build is often
optimized to compile and link only those artifacts that have been modified since the last
build. Testers can use the build system to automate testing of a set of inputs or build
a special version of the software for debugging. Finally, project managers can use it to
deploy the various versions in a software product line or highly customizable software
(e.g. the Linux kernel), which require different combinations of files or subsystems to
be compiled based on some configuration.
Maintenance of these systems has become a growing concern over the years because
of how critical they are to the liveliness of development. This maintenance has been
shown to impose a 12% overhead on development [25]. Furthermore, McIntosh et al.
have found that up to 27% of source code changes require some form of modification
of build code, which itself can make up as much as 31% of the entire code base [33].
Surveys of developers at Microsoft and Google, among others, show that very few
developers know anything about the build system, and those that do, usually a small
group of “build engineers” responsible for maintaining it, can get frustrated trying to
debug it [37, 25, 38, 40]. Such small build teams can have a negative effect on the
so-called truck factor (also known as the bus factor), which, as defined by Bowler [15], is
the number of people that would have to be hit by a truck or bus to put the project
in serious jeopardy. If build artifacts have to be constantly updated, and there is no
build engineer available to update them (perhaps they are on vacation), then the project
could be delayed for days. It is important to gain a better understanding of these
systems and how they are used so that decisions can be made about how to maintain
them.
There are many build automation tools, but Make, and specifically GNU Make
[4], is one of the most widely used. However, despite its conceptual simplicity, it is often criticized
for being difficult to understand and for its general lack of debugging facilities [36, 19]. The
language itself is declarative, specifying rules to build target files, the files on which
they depend (dependencies), and how to update them when they are out of date
(recipes). As an example, consider the small Makefile in Figure 1.1. Line 22 shows
a rule that states that foo.o depends on files foo.c and foo.h. If either foo.c or foo.h
has a newer timestamp than foo.o (or if foo.o does not exist) the recipe command on
line 23 is invoked to update (or create) foo.o. We will discuss the other parts of this
Makefile in later chapters.
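A rule of the kind just described might look like the following minimal sketch. The compiler invocation and flags are assumptions chosen for illustration and are not taken from Figure 1.1.

```makefile
# Hypothetical sketch of a single rule.
# Target: foo.o. Dependencies: foo.c and foo.h.
# The recipe runs when foo.o is missing or older than either dependency.
# Recipe lines must begin with a tab character.
foo.o: foo.c foo.h
	$(CC) $(CFLAGS) -c foo.c -o foo.o
```

Touching foo.h and running make again would re-run the recipe, because foo.h would then carry a newer timestamp than foo.o.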
1.1 Goal of the Thesis
We recognize that build systems are important to the health of a software project
and that they are not well understood, both in terms of developer comprehension
and in the state of the research. Our main goal is to gain a better understanding of
how they are used in open source projects. We narrow our focus to Make-based build
systems because Make is one of the oldest build technologies and is still widely used.
Figure 1.1: An example Makefile
Thesis statement: By using static analysis to understand how the Make language
is used and maintained in practice, we can inform tool developers’ evolution of the
language and assist users in creating Makefiles that are easier to maintain.
The core framework that enables the work in this thesis is a Make parser im-
plemented using the source transformation engine TXL. It allows us to dissect and
manipulate every piece of a Makefile for analysis. Primarily, we use it to parse and
analyze the use of Make features for the purpose of tabulating feature use and pre-
dicting how difficult a given Makefile will be to understand and, therefore, maintain.
By analyzing the features that are being used in practice, we hope to guide developers
who make decisions about the evolution of Make implementations and tools. And,
by identifying Makefiles that could be difficult to understand, we hope to help Make
users find problem Makefiles that can be refactored to improve maintainability.
1.2 Contributions
While other research on build systems tends to focus on build graphs and how com-
putationally complex they are to execute, the work presented in this thesis focuses
specifically on the Make language and how it is used in practice. We are more
interested in finding patterns in how it is used and maintained than in how long a build
takes to execute or how it can be optimized.
We make the following contributions:
1. A corpus of open source Makefiles and a general framework for analyzing Make-
files based on a robust, precise parser.
2. A taxonomy of Make features and an inventory of their use in 271 open source
projects, covering Makefiles produced by three different generators as well as hand-written ones.
3. A new metric for measuring the maintenance complexity of Makefiles and an
analysis of the complexity of open source projects using it.
1.3 Summary of Thesis
This thesis is an exploration in two parts of open source Makefiles and how the Make
language is used in open source software projects.
Background
We start, in Chapter 2, with some basic background information and the state-of-the-art
research in the field. Here we will give a brief overview of the build technologies
mentioned throughout the thesis, including GNU Make itself as well as tools that
can be used to generate Makefiles. We will then delve into research on build system
maintenance, evolution, and the migration between tools. We will finish with an
overview of build models, including a model of the Make language that we used as a guide for
our own Makefile parser.
A Corpus of Open Source Makefiles and a Framework for Analyzing Them
Our analyses required a large set of open source projects to analyze. We compiled a
corpus consisting of almost 20k Makefiles across 271 projects — from the hand-written
Makefiles of the Linux kernel to generated Makefiles from CMake and QMake.
In addition to open source Makefiles, we required a means to read and extract
parts of a Makefile (e.g. rules, targets, functions etc.). TXL provided the perfect
facility to do this. We were able to create a Makefile grammar for TXL based on one
published by Tamrawi et al. [46] and use it to statically parse the Makefiles in our
corpus, then extract and count elements (i.e. features) of the language (Figure 1.2).
Chapter 3 gives a more detailed breakdown of the corpus and the TXL framework
that made this thesis possible.
Figure 1.2: Our framework parses an input Makefile, which then allows features to be extracted and counted for analysis.
Feature Analysis of Open Source Makefiles
The first thing we explored was which features were being used. Over the years,
GNU Make has added syntax and functions to the language, presumably to make
Makefiles easier for developers to write. Conversely, some features have been removed in
favour of others. We wanted to know whether these features were being used or just ignored,
and to what extent bad practices, such as the use of deprecated features, still
persisted. We also wanted to know how generated Makefiles differ from hand-written ones
in the features they use.
Chapter 4 contains the answers to these questions and our full analysis of feature
use.
Maintenance Complexity of Open Source Makefiles
Our feature analysis yielded a large set of feature counts, which gave us an idea for a
new complexity measure. This new measure, which we call indirection complexity, is based
on the way that human developers read and understand code. Experts follow links
in the code to understand what it is doing. The more links that get followed, the
more things the developer must keep track of and remember and the more difficult it
becomes to do so.
We define indirect features as any feature that may require a developer to follow
a link in some artifact (in our case a Makefile) and identify these indirect features
in Make. We calculate the indirection complexity by counting the total number of
indirect features used — data we had already collected in our feature analysis.
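Read as a formula, the count described above might be sketched as follows. The symbols are assumptions introduced here for illustration only: m is a Makefile, F is the set of indirect feature kinds identified in Chapter 5 (for example includes, variable references, and function calls), and n_f(m) is the number of occurrences of feature f in m.

```latex
\[
  \mathit{IC}(m) \;=\; \sum_{f \in F} n_f(m)
\]
```

Under this reading, a Makefile with 3 includes, 10 variable references, 2 function calls, and no other indirect features would score 3 + 10 + 2 = 15.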
Chapter 5 contains more information about indirection complexity, including ex-
amples, as well as a look at the complexity of our corpus.
Summary and Conclusions
Finally, we finish the thesis with a discussion of the results presented throughout and
some potential areas of future work.
Chapter 2
Background
A build system, as the name implies, is an automated system of tools for building
software from source artifacts. A typical build process involves a number of steps or
layers, each addressing a particular responsibility.
The configuration layer is responsible for allowing users of the build system to
select features and options that will affect the result of the build. Depending on the
tool, this may be done by outputting configuration files with the chosen features, or
a custom build script to be used by the construction layer.
The construction layer is responsible for compiling the raw software artifacts into
an executable application. Unlike the other layers, this layer is essential to the build
system because it contains the recipe for building the entire system. Other layers can
be omitted, and in many smaller projects they are.
The certification layer is responsible for testing the deployed software. Testers
write test suites that the build system can then run automatically every time
the software changes, alerting the developers to any failures.
The packaging layer is responsible for gathering the deployed (and tested) software,
as well as anything else needed by an end user of the system (e.g. documentation,
libraries, etc.), and bundling it all into an installable package. This aspect of the
build system would be used by a project manager to deploy the different releases of
the software.
In this thesis, we focus on the construction layer; however, it is sometimes difficult
to separate layers. The remainder of this chapter will discuss build technologies and
the state-of-the-art research in the field.
2.1 Overview of Technologies
In the early days of software development, developers were either forced to compile
code manually, or rely on shell scripts to automate it. But as software projects grew
larger, these solutions became less feasible and so special tools were created to make
it easier. This section will discuss the most popular of these tools.
2.1.1 Make
One of the first build tools, and the most popular to this day, was Feldman’s Make
[17]. Feldman observed that the then-current approach to building software was inefficient
and error-prone. It can be difficult to keep track of which modules depend on each
other, and which files were modified after a long coding session. The only way to
ensure that the application is compiled correctly is to recompile everything, which
wastes time and resources. Feldman addressed this problem with Make by creating a
declarative language that used rules to determine file dependencies and timestamps
to determine which artifacts were out of date and thus needed to be recompiled.
Over the years, there have been many implementations and improvements of Make
[35]. GNU Make [4] is the most widely distributed implementation, because it is open
source and included in many Linux distributions. Of course, there are countless other
Make clones (such as Microsoft’s NMake for Visual Studio [6]), each with their own
unique features and improvements [35, 26]. From here on, when we refer to
Make, we are referring to GNU Make because it is the most open and widely used
implementation available.
Make processes what are known as Makefiles, which contain rules that dictate
how the system is to be compiled and linked. Rules consist of one or more targets, a
set of dependencies, and some commands, in the following form:
target1 [target2 ...] :[:] [dependency1 ...] [;commands] [#...]
[(tab) commands] [#...]
...
Each target may refer to a file or, alternatively, may simply be an identifier such
as “all,” “clean,” or “debug.” When a target is an identifier like this, it is called a
phony target, and may be referred to elsewhere in the Makefile as a dependency of
some other target. Targets are followed by an optional set of dependencies – other
targets or files on which the target depends. If any of the dependencies has been
updated since the target was last made, or if the target does not exist, then it must
be remade. If there are no dependencies, the target is never out of date and, therefore,
only made when called explicitly. Following the target and dependencies is a list of
commands, called a recipe, to update the target. The commands may be simple calls
to a compiler, other shell commands, or external scripts.
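As a minimal sketch of these pieces together, consider the following Makefile. The file names and commands are hypothetical, and the .PHONY declaration is a GNU Make feature used here to mark targets that are identifiers and not files.

```makefile
# Hypothetical example; recipe lines must begin with a tab character.

# Declare the identifiers that do not name files.
.PHONY: all clean

# Phony target whose only job is to depend on the real program.
all: hello

# File target: rebuilt when hello is missing or older than hello.c.
hello: hello.c
	$(CC) -o hello hello.c

# Phony target with no dependencies: runs only when asked for.
clean:
	rm -f hello
```

Here `make clean` runs the rm command every time it is requested, while `make all` does nothing once hello is newer than hello.c.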
By default, Make will pick the first target in the Makefile to make automatically,
if it is not explicitly given one. This target becomes the default goal. To make
the default goal, Make must first resolve its dependencies by searching for the rule
associated with each. If the dependency is up-to-date, then it can proceed; but if it
is not, it must resolve this new target first. It then proceeds in this manner until the
default goal can be built.
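The resolution order can be followed on a small, hypothetical Makefile such as the sketch below; the target and file names are assumptions made for illustration.

```makefile
# Hypothetical three-rule Makefile. 'prog' appears first, so running
# `make` with no arguments selects it as the default goal.
prog: a.o b.o
	$(CC) -o prog a.o b.o

# Make resolves these two rules before it can decide about prog.
a.o: a.c
	$(CC) -c a.c -o a.o

b.o: b.c
	$(CC) -c b.c -o b.o
```

With this file, a bare `make` first looks up the rules for a.o and b.o, rebuilds whichever is missing or older than its .c file, and only then links prog. Running `make b.o` names b.o as the goal instead, so a.o and prog are not considered.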
Make also provides implicit rules — common rules that Make defines by default
that can be used when no rule is specified for a target. For example, there is an
implicit rule to compile a .c file into a .o file:
%.o: %.c
# commands to execute (built-in):
$(COMPILE.c) $(OUTPUT_OPTION) $<
This rule says that a .o file depends on a .c file of the same name, and when it is
out-of-date, it can be updated using the default C compiler and flags (the $< refers
to the name of the first dependency, here the .c file).
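A Makefile can rely on this implicit rule simply by not writing a rule for its object files, as in the following sketch. The program name and the values of CC and CFLAGS are assumptions chosen for illustration.

```makefile
# Hypothetical example. No rule for hello.o is written here, so GNU Make
# falls back on its built-in %.o: %.c rule to build it from hello.c.
# CC and CFLAGS are picked up by the built-in $(COMPILE.c) variable.
CC = gcc
CFLAGS = -Wall -O2

# Only the link step is spelled out; the recipe line begins with a tab.
hello: hello.o
	$(CC) -o hello hello.o
```

In a directory containing hello.c, running make would compile hello.o through the implicit rule and then run the explicit link recipe.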
Of course, Make provides a whole set of features, such as variables and functions,
but we will defer discussing those to later sections.
2.1.2 Ant
Ant [13] was developed by James Duncan Davidson as part of Tomcat when it was
donated to Apache from Sun Microsystems. It initially had no other purpose than to
build Tomcat, but in January 2000 it was separated into its own standalone product
and officially released as Apache Ant version 1.1 in July. Davidson created Ant as an
alternative to Make to address some of his grievances with the popular build tool. A
primary goal was to create a portable build system that could be run on any operating
system.
Apache Ant is written in Java, therefore requiring Java to run, with built-in tasks
specifically for compiling Java programs. However, it utilizes a plugin architecture
that allows custom tasks to be written, which makes it able to compile programs in
virtually any language. For example, the Ant-Contrib project includes a cpptasks
package [39] for C/C++ compilers.
Ant scripts work similarly to Makefiles in that a build engineer creates a list of
targets with dependencies and how to create or compile them. It uses an XML-based
syntax of the form:
<project name="" default="" basedir="">
<target name="target1" depends="targetx">
<task1 />
[<task2 />]
...
[<taskn />]
</target>
...
</project>
Each target is represented as a list of tasks enclosed in <target> tags. Tasks are
an abstraction of shell commands, such as creating a directory, deleting files, and
compiling source code. Abstracting them in this way allows Ant scripts to be run on
any platform, and for new tasks to be written.
2.1.3 Generators
When Feldman introduced Make, he acknowledged that it was best suited for medium-sized
projects. Tools have since been developed to generate Makefiles for larger projects with
more dependencies. In this section, we give a brief overview of three such generators
that we will see later in our corpus.
Automake
Automake [1], as the name suggests, is a tool to automatically generate Makefiles by
taking a Makefile.am file and converting it to a Makefile.in file. It is often used in
conjunction with Autoconf, which is used to generate configuration files. Automake
and Autoconf are part of Autotools, a suite of tools for automating the creation of
underlying build artifacts of the GNU build system. These tools also attempt to
address Make’s portability problem by generating scripts that can be run on a variety
of platforms.
Automake reads a Makefile.am file, which contains variable definitions that dictate
how the software is to be built. Variables with special suffixes, such as _PROGRAMS and
_LIBRARIES, are called primary variables and are used to specify various properties
of the system that Automake needs to build the software correctly. For example, a
variable with a _SOURCES suffix can then be used to specify the source files of a target.
Consider the following:
bin_PROGRAMS = foo
foo_SOURCES = src/foo.h src/foo.cc src/main.cc \
src/bar/bar.h src/bar/bar.cc
This tells Automake that there is a program called foo, and that foo consists of the
source files assigned to foo_SOURCES. From this, Automake is able to write the rules
necessary to compile and link the program foo.
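The relationship between the two variables can be sketched in Python as follows. This is only a minimal illustration of how a program name is paired with its source list; the function names are hypothetical, and the sketch ignores the name canonicalization and default values that Automake itself applies.

```python
def parse_assignments(text):
    """Parse 'NAME = value' lines, joining backslash continuations first."""
    joined = text.replace("\\\n", " ")
    variables = {}
    for line in joined.splitlines():
        if "=" in line:
            name, value = line.split("=", 1)
            variables[name.strip()] = value.split()
    return variables

def program_sources(variables):
    """Pair each program listed in a *_PROGRAMS variable with its sources."""
    result = {}
    for name, values in variables.items():
        if name.endswith("_PROGRAMS"):
            for program in values:
                # Assumption: the program name needs no canonicalization.
                result[program] = variables.get(program + "_SOURCES", [])
    return result

MAKEFILE_AM = """bin_PROGRAMS = foo
foo_SOURCES = src/foo.h src/foo.cc src/main.cc \\
              src/bar/bar.h src/bar/bar.cc
"""
sources = program_sources(parse_assignments(MAKEFILE_AM))
```

For the example above, the resulting mapping associates foo with the five files listed in foo_SOURCES.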
Automake also copies the contents of Makefile.am verbatim into the Makefile.in
that it generates. This allows the developer to declare their own variables and rules
that they want to be included in the generated Makefile. Of course, all of this only
scratches the surface of what Automake and Autotools can do. There are many other
primary variables that affect the build scripts that Automake generates in various
ways. However, we will not go into any more detail here.
CMake
Like Automake, CMake (Cross-platform Make) [3] is another tool that can be used
to generate Makefiles. However, unlike Automake, CMake can also generate build
scripts for many other platforms and build technologies, such as Microsoft’s Visual
Studio or Apple’s Xcode.
Instead of Automake’s primary variables, CMake reads a CMakeLists.txt file and
interprets a series of commands of the form:
COMMAND_NAME(Arg1 Arg2 ...)
While it does have the notion of variables, they are set using the SET command.
Consider the following simple example CMakeLists.txt :
PROJECT(ProjectName C)
ADD_LIBRARY(LibraryName STATIC lib_file.c)
SET(SRCS file1.c file2.c file3.c )
ADD_EXECUTABLE(ProjectName ${SRCS})
TARGET_LINK_LIBRARIES(ProjectName LibraryName)
Figure 2.1: The CMake GUI
In this example, a C project is created with a static library. It uses the SET command
to list the source files in a variable named SRCS, which is referenced later to add them
to the project.
CMake is also a configuration tool with its own graphical interface. Options can
be defined using the OPTION command, with simple if-statements to test the value.
For example:
OPTION(OPTION_NAME "Description of option" DEFAULT)
IF(OPTION_NAME)
...
ELSE()
...
ENDIF (OPTION_NAME)
Then, when CMake is run, the user has the ability to change the values, either in the
GUI shown in Figure 2.1 or through the command line.
QMake
The Qt project, a cross-platform development framework, has its own generator to
create Makefiles as well as build scripts for Microsoft’s Visual Studio and Apple’s
Xcode. Qmake [7], as it’s called, is similar to Automake in that it uses special
variables to specify important properties about the system and its artifacts. But
because it must also generate other kinds of build scripts, Qmake scripts cannot contain
Make syntax the way Automake scripts can. For example, a Qmake script for a simple C++ program
may look something like this:
CONFIG += qt
HEADERS += foo.h
SOURCES = foo.cpp \
bar.cpp
Conditional statements can be added based on the values of certain built-in vari-
ables and functions. For example, the following shows how to apply statements only
on Windows platforms:
win32 {
SOURCES += winfoo.cpp
}
2.2 Maintenance
The effort put into creating and maintaining the build system for a particular soft-
ware project is often overlooked. Kumfert and Epperly [25] surveyed 34 scientific
software developers, within labs of the U.S. Department of Energy, to identify the
hidden overhead associated with build systems. The survey questions asked develop-
ers about how they perceived the build system of a project they were working on, as
well as quantitative questions like how many build tools, languages, and third party
libraries they use. The perceived overhead of the build ranged from 0-37.71% with an
average of 11.91% and a median of 10%. They took note of the high average number
of languages (3.85) and average number of third party libraries (4.21) per project
as contributing factors in the high overhead. Next, they attempted to objectively
quantify the overhead using CVS repositories. Recognizing that the size of a change
may not effectively reflect the amount of time spent on it, they decided to count each
change equally as one time unit. Handwritten (as opposed to auto-generated) build
files accounted for 13.7% of the number of lines of code in the system and 27.5% of
the changes, suggesting that the build system accounts for a significant amount of
overhead in development.
McIntosh et al. found similar results [33]. In their investigation of ten large
software systems – written in C and Java, and using a variety of build technologies
(most of which were described in the previous section) – they were interested in
discovering the churn rate (i.e. the rate of change) of build artifacts, as well as build
ownership (i.e. which developers modify the build artifacts), and how this relates
back to the source code. First, they classified all of the build artifacts as build (e.g.
Makefile, configuration files, etc.), production (i.e. source code), or test (e.g. unit
test code). Then they grouped revisions from the version control repository, such
that they could identify which developer was primarily involved in each. They also
marked revisions as a change to the build artifacts, production artifacts, or both, and
performed statistical analyses to determine the coupling of revisions of each type.
That is, they calculated the likelihood that changes to the source code accompany
changes to the build system.
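The idea of coupling can be illustrated with a simple conditional proportion. The sketch below is not the statistical method used by McIntosh et al., whose details are not reproduced here; it only assumes that each revision is summarized by the set of artifact categories it touches, and the revision history shown is hypothetical.

```python
def coupling(revisions):
    """Fraction of source-changing revisions that also change build files."""
    source_revs = [r for r in revisions if "source" in r]
    if not source_revs:
        return 0.0
    both = [r for r in source_revs if "build" in r]
    return len(both) / len(source_revs)

# Hypothetical history: each revision is the set of artifact
# categories (build, source, test) that it touched.
history = [
    {"source"},
    {"source", "build"},
    {"build"},
    {"source"},
    {"source", "test"},
]
ratio = coupling(history)
```

In this made-up history, one of the four revisions that touch source code also touches the build system.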
They found that the build system could make up anywhere from 1% to 31% of
the artifacts in the system, with a median of 9%. With respect to the churn rate,
they found that it was comparable to that of the source code itself, but that changes
to the build artifacts result in more relative churn. As for coupling, they found that
up to 27% of revisions to a C project required a revision in the build system. Finally,
they identified two different styles concerning build ownership: a concentrated style
where a small team of specialized build engineers are responsible; and a dispersed
style where most developers of the system are responsible for some aspect. They
recognize that organizational and budget constraints can make one style more practical
than another (e.g. an open source project with many casual developers would benefit
most from a concentrated style, while a small development team may not have the
budget to hire a separate team of build engineers).
Adams et al. also observed the effort that went into maintaining the Linux build
system [9]. They identified a number of times an explicit effort was made to refactor
the build to account for the growing complexity. First, the developers introduced a
new recursion scheme that reduced the number of dependencies by 14%. Later, a
more substantial effort was made to use general rules with specialized list variables,
which resulted in 40% fewer dependencies.
Figure 2.2: Evolution of the Linux build system on a logarithmic scale (from [9]).
There was also a massive restructuring that took place leading up to version 2.6.
First, a previous phase, that extracted dependencies, was absorbed by the phases
responsible for building the kernel and selected modules. There was also an effort to
eliminate recursion, which was found to be harmful [34]. Finally, a new configuration
system, Kconfig, was introduced with a standard language. All of this contributed
to the evolution of the Linux build system over time, which we will see in the next
section.
2.3 Evolution
Software evolution is an area focused on studying how software ages or evolves over
time. The laws of software evolution were first presented by Belady and Lehman [14],
who stated that software continues to grow in size and complexity over time. There
has been work that suggests that build systems also follow these laws.
First, Adams et al. investigated evolution in the Linux build system [9] from
version 0.01 to 2.6.21.5 (including most pre-1.0 releases and then major releases after
that). Figure 2.2 shows the size of various artifacts of the system in source lines
of code (on a logarithmic scale). The source code evolution is represented by the
blue line and the various build system artifacts – Makefiles, configuration files, and
helper scripts – are represented by the red, yellow, and green lines, respectively. This
shows how the build system indeed evolved, and along the same sort of curve that
the source code did. Adams et al. also assert that the complexity increases, based
on the increasing number of targets, and thus the number of tasks, to be built.
Next, McIntosh et al. [30, 31] extended this analysis to show that Java systems
using Ant evolved in the same way. They studied four small to large Java projects,
including Tomcat and Eclipse, using static and dynamic metrics. For static mea-
surements, they counted the number of files, lines (SBLOC), targets, and tasks, in
addition to modified Halstead complexity metrics – measurements for estimating how
much information someone reading the source code would have to absorb, and how
much mental effort would be required. The Halstead metrics, which usually utilize
operators and operands, were modified to use tasks and targets (as operators) and the
parameters passed to them (as operands). For dynamic measurements, they used the
generated build graph to find the length (total number of executed tasks/targets) and
depth (the maximum number of tasks to create one target). They also calculated the
percentage of targets and the number of lines of build code used by default or clean
builds.
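To make the adapted Halstead idea concrete, the sketch below computes the standard Halstead volume, V = N log2(n), where N is the total number of operator and operand occurrences and n is the number of distinct ones. Which Halstead measures the authors actually computed is not stated here, so volume is used only as a representative example, and the token lists are hypothetical.

```python
import math

def halstead_volume(operators, operands):
    """Halstead volume V = N * log2(n) for lists of operators and operands."""
    vocabulary = len(set(operators)) + len(set(operands))  # n
    length = len(operators) + len(operands)                # N
    return length * math.log2(vocabulary) if vocabulary else 0.0

# Hypothetical build script: targets and tasks act as operators,
# the parameters passed to them act as operands.
operators = ["target", "mkdir", "target", "javac"]
operands = ["init", "build", "compile", "src", "build"]
volume = halstead_volume(operators, operands)
```

With three distinct operators, four distinct operands, and nine occurrences in total, the volume is 9 log2(7).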
With respect to their static measurements, they found a general growth in the
number of lines of code in the build scripts, which followed the same trend as the
Figure 2.3: Standardized size of four Java systems and their Ant build systems in terms of number of lines (SLOC for source code, SBLOC for build scripts). Anomalies are shown in red and investigated further in [30]. From [30].
number of lines of source code, when standardized by weighting the values with
the distance from the average (i.e. [n − µ]/σ). These results are shown in Figure
2.3. With this high correlation and some manual inspection, they concluded that
restructurings in the build system are mostly caused by a restructuring of the source
code. In addition, they found a high correlation between the Halstead complexity and
the SBLOC, which allowed them to conclude that the number of lines was a good
approximation of the complexity of an Ant build system.
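The standardization step can be sketched as follows. The line counts are hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption, since the text only gives the form [n − µ]/σ.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return (n - mu) / sigma for each value n in the series."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(n - mu) / sigma for n in values]

# Hypothetical line counts per release, for illustration only.
sloc = [10000, 12000, 15000, 19000]   # source code
sbloc = [200, 240, 300, 380]          # build scripts
z_sloc = standardize(sloc)
z_sbloc = standardize(sbloc)
```

Because the two hypothetical series differ only by a constant factor, their standardized curves coincide, which is the kind of shared trend that standardizing makes visible despite the very different absolute sizes. A series with no variation would need special handling, since σ would be zero.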
As for the dynamic metrics, they found two patterns concerning the depth of
the build graph: one where the depth remains mostly unchanged and one where the
depth increases over time. Further analysis showed that the projects where the depth
increased used a recursive design, whereas the others did not.
Zadok proposes a different approach for measuring build system complexity (as
it relates to portability) in his analysis of The Berkeley Automounter’s switch to
Autotools [51]. After plotting the number of lines of code and the number of CPP
conditionals (e.g. if, ifdef, etc.), he notes their similarity (Figures 2.4a and 2.4b).
The packages with the fewest lines have the fewest CPP conditionals, and vice versa.
He takes this to mean that neither is a good measure of complexity, and instead
proposes the number of CPP conditionals per 1000 lines of code (Figure 2.4c). He
notes that the differences are not as large in this new measure, and that the standard
deviation is smaller. This implies that a package may have a native complexity that
is unlikely to change over time. Digging further, he found that the biggest factor in
complexity was related to the number of operating system calls and the portable C
code that comes with it.
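A density measure of this kind can be sketched in a few lines of Python. Exactly which directives Zadok counted is not specified beyond "if, ifdef, etc.", so the sketch assumes that a conditional is any line opening with #if, #ifdef, or #ifndef, and that every line of the file counts towards the total.

```python
import re

# Assumption: a conditional is any line starting with #if, #ifdef, or #ifndef.
CONDITIONAL = re.compile(r"^\s*#\s*(if|ifdef|ifndef)\b")

def conditionals_per_kloc(source):
    """Number of CPP conditionals per 1000 lines of the given source text."""
    lines = source.splitlines()
    if not lines:
        return 0.0
    count = sum(1 for line in lines if CONDITIONAL.match(line))
    return 1000.0 * count / len(lines)

# Hypothetical ten-line C file containing two conditionals.
SAMPLE = """#include <stdio.h>
#ifdef __linux__
int os = 1;
#else
int os = 0;
#endif
#if defined(DEBUG)
int debug = 1;
#endif
int main(void) { return os; }
"""
density = conditionals_per_kloc(SAMPLE)
```

Normalizing by size in this way is what allows packages of very different lengths to be compared on the same scale.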
2.4 Migration
In the previous section, we saw how build systems grow in size and complexity over
time, prompting refactoring in many cases. But what happens when a build system
gets too complex? In the case of the Linux kernel, the build system was refactored
(e.g. removing recursion). However, in the case of KDE — a free and open source
software project and community — they decided to switch build system technologies
altogether.
Neundorf chronicled the switch [37]. At the time of writing, KDE had over 1200
developers around the world committing new code around 300 times a day to a repos-
itory with more than 4 million lines of code. They were using Autotools to build and
configure their products, but developers were not fond of its complicated architecture
Figure 2.4: Complexity measures of packages in The Berkeley Automounter: a) 1000 Lines of Code, b) Number of CPP Conditionals, and c) Number of CPP conditionals per 1000 lines of code. From [51].
and chain of build commands, referring to it as “auto-hell.” Neundorf estimates that
only 10 people on the team actually understood it. KDE was not opposed to large
changes, having just switched from a CVS (Concurrent Versions System) reposi-
tory to SVN (Subversion). So at 2005’s aKademy (KDE’s annual conference), they
decided to make a change, and SCons came out as the favourite partly because some
developers had already written scripts to build some KDE libraries on Linux.
Work began on writing SCons scripts to build key components for KDE4. However,
they soon started to hit obstacles. There were problems getting it to build on
non-Linux platforms, and the developers did not think SCons had a mature enough
configuration system. On top of this, they felt like they did not have enough sup-
port from the SCons developers to help them bring it up to task. If they were to
proceed, they would have to fork SCons and add features, which would mean they
would have to maintain it as well. At this point, they decided to cut their losses and
try something else.
Neundorf suggested CMake based on his past experience using it on smaller-
scale projects, citing its simplicity and portability, as well as an existing script that
automatically does 90% of the conversion from Autotools to CMake. The CMake
developers were also fully on board, offering to implement the features they needed
to build KDE. With that, they set off writing CMake code to compile KDE libraries,
and soon had it building on Mac OS X and Windows. As of the release of KDE4,
CMake became the official build tool of KDE.
Suvorov et al. [44] studied KDE’s move to CMake as well as the major transfor-
mations of the Linux kernel’s build system and identified commonalities among them
by manually reading messages from official mailing lists. They define a migration as
a move from one (often older) build technology to another, or a major refactoring to
a different architecture within the same technology (as was the case with the Linux
kernel). Each migration they looked at resembled the spiral model, where each small
part of the build system was migrated with a working prototype one at a time. They
outline 3 phases that were used for the spiral model: planning and risk analysis,
where requirements, as well as possible implementations and risks, are gathered for
the next prototype; development, where the new prototype is developed based on the
requirements; and evaluation, where the prototype is reviewed, and either accepted
or rejected.
They observed a set of common challenges that arose in both cases. The biggest
problem was ineffective planning, specifically when gathering requirements. Had KDE
developers properly understood their requirements (e.g. with respect to the configura-
tion system), they may never have wasted months pursuing SCons as a viable option.
Improper requirements can also lead to ineffective evaluation of a prototype, another
issue identified by Suvorov et al. The stakeholders need to agree on a set of success
criteria before they are able to reject or accept a prototype. They have to be able to
effectively communicate their knowledge and concerns with each other too. Build sys-
tem experts were sometimes reluctant to communicate with other developers, which
could have been a contributing factor to the other problems. Finally, Suvorov et al.
identify a trade-off between complexity and performance when migrating. Improving
the performance of a build system can mean resorting to special cases and hacks that
make the code hard to read. Stakeholders must discuss which of these attributes is
more important to them.
2.5 Modelling
Tu and Godfrey observed that certain build-time activities were not being properly
represented in any of the current modeling views (e.g. process view, logical view,
scenarios, etc.) [49]. They proposed an additional view they call the build-time
architecture view. Their model focuses on representing the interactions of the entities
involved in the build (e.g. code artifacts, build scripts, configuration files, compilers,
linkers, etc.). Figure 2.5 shows the build-time view of the GNU Compiler Collection
(GCC). Here we can see it uses the C Compiler to compile some source code artifacts
into object files, generates header files using those object files and a description of the
Figure 2.5: Build-time architecture of GCC. (From [49]).
SPARC chip (depending on environment parameters), and then compiles and links
those with other source files using the C Compiler.
Jørgensen [22] defined a formal notation to create a semantic model for Make and
prove that its incremental build is safe (i.e. that the result of rebuilding only the
modified parts of a system is equivalent to building it all by brute force). He shows
that a Makefile is safe if and only if each of its rules declares all of the files on which
its target(s) depend (completeness), creates or updates its own target(s) (soundness),
and creates or updates only its own target(s) (fairness).
To prove this, Jørgensen defines the syntax of Makefiles in terms of sets such
as Name, which includes all target identifiers, and Command, which includes all
commands in a recipe. A Rule set is then defined as a set of rules of the form:
Ts : Ds; C
Figure 2.6: Example of a Symbolic Dependency Graph. (From [46]).
where Ts, Ds ∈ Name, and C ∈ Command. Files are defined as a pair of values
consisting of a timestamp, and the file’s contents. A state, s ∈ State, is a mapping
from a Name to a File, and an exec function executes a command, c ∈ Command,
which transforms one state to another. Files are said to be equivalent if their contents
are the same, regardless of their timestamps.
Jørgensen uses this notation to define build rules — rules that are complete, sound,
and fair, with respect to a given state. A Makefile is then derivable, with respect to
a given state, if it contains only these build rules. This allows him to define partial
safeness, which, in turn, allows him to prove the safeness of incremental building.
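The basic building blocks of this model can be mirrored in a short Python sketch. The File pair, the State mapping, and file equivalence follow the description above; the states_equivalent helper and the touch command are hypothetical additions for illustration and are not taken from Jørgensen's notation.

```python
from typing import Callable, Dict, NamedTuple

class File(NamedTuple):
    timestamp: int
    contents: str

State = Dict[str, File]  # a state maps a Name to a File

def equivalent(a: File, b: File) -> bool:
    """Files are equivalent if their contents match; timestamps are ignored."""
    return a.contents == b.contents

def states_equivalent(s1: State, s2: State) -> bool:
    """Assumed extension: same names, each mapped to equivalent files."""
    return s1.keys() == s2.keys() and all(equivalent(s1[n], s2[n]) for n in s1)

def touch(name: str, now: int) -> Callable[[State], State]:
    """Hypothetical command: updates only the timestamp of one file."""
    def exec_command(state: State) -> State:
        new_state = dict(state)
        new_state[name] = File(now, state[name].contents)
        return new_state
    return exec_command

before = {"foo.c": File(0, "src"), "foo.o": File(1, "obj")}
after = touch("foo.o", 5)(before)
```

Executing the command yields a new state in which foo.o has a newer timestamp but unchanged contents, so the two states are still equivalent in the sense used above.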
Tamrawi et al. [46, 47] created a tool called SYMake that uses Symbolic De-
pendency Graphs (SDGs) to automatically detect errors and smells, as well as allow
refactoring (e.g. rule removal, target renaming). Figure 2.6 shows an example SDG.
Here we can see targets represented as rectangles, with the initial target — install —
at the far left. Recipes are represented as rounded rectangles, and diamond-shaped
Select nodes represent a choice between alternatives. Each recipe is associated with
a V-model, representing a string value.
Adams et al. recognized that the build system is used by almost all project stake-
holders at some point, and that build systems suffer from understandability and maintainability
issues [10, 8]. With this in mind, they set out to build a tool, for Make-based build
systems, that would fulfill 5 requirements:
• Visualization – The tool should provide a graphical representation that would
provide a complete view of the whole build system, and allow the user to zoom
and explore.
• Querying – The tool should provide a querying engine for finding specific infor-
mation about a target, or selecting targets that meet some criteria.
• Filtering – There should be a method of removing redundant or uninteresting
information, and defining different views of the system.
• Refactoring – There should be facilities for making broad changes to the build
system to help maintain and evolve it over time.
• Validation – The tool should be able to find simple defects like dead code or
circular dependencies.
They called their tool MAKAO (Makefile Architecture Kernel featuring Aspect
Orientation) [5]. Built on top of the GUESS graph exploration tool, MAKAO extracts
a directed acyclic graph (DAG) of dependencies, where nodes are targets, using a
dynamic trace of the Makefiles as well as the static rules (Figure 2.7a) and presents
it in the interface shown in Figure 2.7b. By using GUESS’s Gython engine and
their own Prolog implementation, they were able to fulfill the querying and filtering
Figure 2.7: The MAKAO architecture (a) and user interface with visualization (b). (From [10] and [5]).
requirements they set out to accomplish. For example, to get a list of all targets or
rules that failed to execute, a developer may simply execute the Gython query:
(error==0).visible=0.
Prolog rules can also be used to filter out unwanted nodes. As for the final two
requirements, they demonstrated how the same queries and rules can be used to
manipulate the graph and verify that there are no errors.
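The validation requirement, for instance finding circular dependencies, can be illustrated independently of MAKAO. The sketch below is not MAKAO's implementation, which relies on Gython queries and Prolog rules; it is a plain depth-first search in Python over a hypothetical dependency graph given as a dictionary from each target to its dependencies.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of targets, or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for dep in graph.get(node, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: a cycle is closed.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# Hypothetical graphs: target -> list of dependencies.
acyclic = {"all": ["foo", "bar"], "foo": ["foo.o"], "bar": ["foo.o"], "foo.o": []}
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}
```

A graph that passes this check is a DAG, which is the form MAKAO's extracted dependency graph is expected to take.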
2.6 Conclusion
Now that we have reviewed the most popular build automation tools and surveyed the
research that has already been conducted on them, we may begin our own analyses.
We start, in the next chapter, with an overview of our corpus of Makefiles and our
framework for parsing and extracting Make features from it. Then we will present
two exploratory studies of the dataset in Chapters 4 and 5.
Chapter 3
A Corpus of Open Source Makefiles and a
Framework for Analyzing Them
This thesis presents two explorations of the Make language and how it is used in the
open source community. Both require two components: a large set of open source
projects from which to extract Makefiles, and the means to automatically tabulate
instances of each feature used in them. This chapter gives a summary of each of those
components, how the corpus was assembled, and how features were extracted.
3.1 Corpus
To get an accurate view of the Makefiles in the open source community, we required
a corpus of Makefiles representing a large and diverse set of projects. This section
describes the projects in our corpus — how we chose them, how we identified the
Makefiles and classified them by generator, and what it looks like overall.
3.1.1 Selection
When selecting projects for our corpus, we used the following criteria:
• Recency – Our aim was to select projects that were recently updated so as
to capture current usage. With this in mind, we only included projects that
were updated since 2010 (5 years before the time of compilation in October
2015). This window allows enough time after new features were introduced
for them to be adopted. That being said, 2013 saw a major
release of GNU Make 4.0 that introduced several new features, including the
guile and file functions, which we make note of in the analysis.
• Size – We sought out projects that were made by teams of more than one de-
veloper and included more than one Makefile. Our reasoning was that larger
projects would be more likely to use more obscure features and be more repre-
sentative of the most advanced users. This was not a strict requirement, but it
did guide our selection process nonetheless.
Hand-written Makefiles were more interesting to us because they contain
more variety in the features used and are maintained by developers directly.
However, many projects today use tools to automatically generate Makefiles. Given
the popularity of these tools, we also investigated how they use the language, which may
inform the developers of those tools about how to evolve them in future releases.
For hand-written Makefiles, we began with the latest Linux kernel, which has been
the subject of many other studies and has used Make since its inception. We also
included the Android project, which contains advanced Makefiles that make use of
a meta-programming technique that automatically generates rules using Make itself.
Both projects are known to maintain hand-written Makefiles. We also included GNU
projects from the GNU FTP repository (http://ftp.gnu.org/gnu/) because we considered GNU developers to
be some of the most advanced users of Make. Not all projects in the GNU repository
use hand-written Makefiles. A large portion use Automake to generate Makefiles in
some capacity (i.e. because Automake uses a template architecture, hand-written
Make statements can be mixed in with automated portions).
In addition to this, we generated Makefiles using CMake and QMake in order to
compare how those differ from Makefiles written manually. For CMake, we used the
KDE library of applications (among others) and for QMake, we used the Ruby Qt
bindings.
Once each project was configured and, if necessary, the files were generated, the
Makefiles were taken out of the project’s source folder. Because Makefiles can be
named anything at all, we had to use known naming conventions to identify them. We
began by searching for files with the traditional “Makefile” name in them, removing
the Automake “Makefile.am” files, which are included but are not valid Makefiles. Another common
convention is to use the “.make” or “.mk” file extension, so we included those as
well. Finally, we found that some projects put commonly used rules in a file called
“Makerules” or something to that effect, so we also included any file with the word
“rule” in the name.
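These naming heuristics can be summarized in a small Python predicate. The function name is hypothetical, and matching names case-insensitively is an assumption, since the conventions above do not say whether case was taken into account.

```python
import os

def looks_like_makefile(path):
    """Apply the naming heuristics described above to a file path."""
    lower = os.path.basename(path).lower()  # assumption: ignore case
    if lower == "makefile.am":
        return False  # Automake input, not a valid Makefile
    if "makefile" in lower:
        return True   # traditional name, e.g. Makefile or GNUmakefile
    if lower.endswith((".make", ".mk")):
        return True   # common file extensions
    return "rule" in lower  # e.g. Makerules
```

Because the test is based on names alone, it can select files that are not Makefiles at all, so the result is best treated as a list of candidates.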
3.1.2 Classification
In order to compare Makefiles from different generators, we needed to classify them.
For most Makefiles we could use pre-existing knowledge. For example, we had to
generate CMake and QMake Makefiles ourselves, so we knew which generator had
created them.
With GNU projects, the answer was not so simple. Most projects seemed to use
Autotools, including Automake, to generate Makefiles. But even then we could not be
certain that all of the Makefiles within the same project were created using it because
we didn’t generate them ourselves. They could have used a mixture of Automake
and hand-written Makefiles. For these cases, we relied on searching for comments of
the form “# Generated with Automake...” to classify Automake Makefiles. If a
Makefile did not have this comment in it, then it was assumed to be hand-written.
This includes Makefiles that were generated with Autoconf, a part of Autotools, which
only replaces special temporary variables within a Makefile.in file and therefore could
still be hand-written.
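The classification rule can be sketched as a short Python function. The exact wording of the marker comment varies, so the pattern below accepts both "generated with" and "generated by" as an assumption; the function and parameter names are hypothetical.

```python
import re

# Assumption: the marker comment may read "generated with" or "generated by".
AUTOMAKE_MARKER = re.compile(
    r"^#.*generated (with|by) automake", re.IGNORECASE | re.MULTILINE
)

def classify(makefile_text, known_generator=None):
    """Label a Makefile by the generator that produced it."""
    if known_generator:
        # e.g. "CMake" or "QMake", known because we ran the generator ourselves
        return known_generator
    if AUTOMAKE_MARKER.search(makefile_text):
        return "Automake"
    # No marker comment found: assumed to be hand-written.
    return "Hand-written"
```

As in the procedure above, anything without a recognizable marker falls through to the hand-written class, including files that were only processed by Autoconf.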
We also performed a search for other comments with the word “generated” in them
in case there were generators we had missed. This showed that there were projects
that used Autogen, a generic tool for generating repetitive text, such as Makefiles,
HTML, and XML. We left these as hand-written for two reasons. First, Autogen itself
does not contain any knowledge of Make, so the choice of how the generated Makefiles
are written is still up to the human developer. Second, there were not enough — only
3 projects in total — to draw any conclusions.
3.1.3 Characteristics
Table 3.1 gives a breakdown of the corpus. It totals 271 projects, of which 147 contain
Makefiles generated with Automake, 80 contain Makefiles generated by CMake, 2
contain Makefiles generated by QMake, and 130 contain hand-written Makefiles. Note
that most Automake projects also contain hand-written Makefiles, so the number of
projects does not add up to 271.
In terms of size of the overall corpus, it contains just under 20,000 Makefiles with
Table 3.1: Overview of corpus.
Generator      # Projects   # Makefiles   Total Lines   Average Lines
Automake              147          1585       1761487            1111
CMake                  80          8672       1167300             135
QMake                   2          2493       4996408            2005
Hand-written          130          6939        353338              51
All                   271         19689       8278533             420
a total of over 8 million lines. In general, generated Makefiles are longer (over a
thousand lines on average in the case of Automake and QMake) than hand-written
ones (about 50 lines on average). While hand-written Makefiles make up the second
largest group in our corpus in terms of the number of files, they are actually the smallest
in terms of the total number of lines. The average number of lines of CMake Makefiles
is lower than the other generators because it generates a large number of small files
that contain only a few variable assignments for compiler flags, build progress or, in
the case of depend.make files, a one- or two-line comment to be replaced when the
Makefiles are run and scan for dependencies.
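The totals and rankings discussed here can be recomputed directly from the rows of Table 3.1. The following Python sketch uses only the figures from the table; the variable names are our own.

```python
# Rows of Table 3.1: generator -> (projects, makefiles, total lines).
CORPUS = {
    "Automake": (147, 1585, 1761487),
    "CMake": (80, 8672, 1167300),
    "QMake": (2, 2493, 4996408),
    "Hand-written": (130, 6939, 353338),
}

total_files = sum(files for _, files, _ in CORPUS.values())
total_lines = sum(lines for _, _, lines in CORPUS.values())
overall_average = total_lines / total_files  # lines per Makefile

# Generators ranked from largest to smallest by file count and by line count.
by_files = sorted(CORPUS, key=lambda g: CORPUS[g][1], reverse=True)
by_lines = sorted(CORPUS, key=lambda g: CORPUS[g][2], reverse=True)
```

The sums reproduce the "All" row of the table, and the two rankings show hand-written Makefiles in second place by number of files but in last place by number of lines.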
3.2 Framework
The experiments carried out in this thesis were enabled by a framework written in
the TXL source transformation language. The centre of this is a Makefile parser
that is capable of constructing a detailed parse tree of most Makefiles. We use this
parse tree to extract parts of the Makefile in which we are interested. This means
extracting subtrees that represent the features or aspects of the Makefile that we wish
to count. The remainder of this section discusses how we parse Makefiles and extract
their features. Throughout it, we will illustrate the process with the example Makefile
shown in Figure 3.1.
Figure 3.1: An example Makefile.
3.2.1 Parsing
There exists no formal specification of the Make language syntax, so we based our
parser on the high-level grammar published by Tamrawi et al. [46] (shown in Figure
3.2) for their work building symbolic dependency models to perform refactoring tasks
and detect code smells. Implementing a TXL parser for Makefiles based on this gram-
mar was our starting point. This grammar, however, abstracted a number of details
that we intended to measure. For example, the definition of Make rules contains no
mention of targets or dependencies (14 in Figure 3.2), and it defines the foreach
function separately from other functions (20 in Figure 3.2).
Figure 3.2: Tamrawi et al.’s Makefile grammar (from [46])
Figure 3.3 shows a portion of our TXL grammar definitions for Make variable
assignments (Figure 3.3a), as well as Make rules (Figure 3.3b). This excerpt shows one
of the challenges of parsing Makefiles. Since the language places so much importance
on whitespace (spaces and tabs), we must explicitly parse each space, as seen in the
Assignment, AssignmentContinuation, Rule, and Target definitions where [WS]
denotes a tab or space.
Parsing Makefiles is a difficult challenge, mainly because a Makefile is an unstructured
mixture of three different languages. First, there is a preprocessing language, which
contains conditional structures and variable references, to insert Make statements and
rules into the Makefile before it is interpreted. Second, there is what we might call
define Assignment
    [WS] [PrivateExportOverride?] [WS] [Id] [WS] [AssignmentOp] [WS] [BSWS?] [Expr?]
    [AssignmentContinuation*] [EOL?]
end define

define AssignmentContinuation
    [EOL] [tabspace] [WS] [Expr] % continued assignment
end define

define PrivateExportOverride
    'private | 'export | 'override | 'unexport
end define

define AssignmentOp
    '+= | ':= | '?= | '=
end define
(a) The TXL definition of Make variable assignments.
define Rule
    [space?] [Targets] ': [': ?] [Patterns] [TargetAssignments] [Dependencies] [Recipe]
end define

define Targets
    [TargetBSWS+]
end define

define TargetBSWS
    [BSWS] % continuation of targets
    | [Target]
end define

define Target
    [Id] [WS]
end define
...
(b) The TXL definition of a Make rule. (The definitions for dependencies are almost identical to those for targets, so we have excluded them for space.)
Figure 3.3: A cross section of the TXL grammar for Makefiles showing relevant definitions for a) variables, and b) rules.
the language itself, which contains the facilities to define dependencies and call
functions. Lastly, there is the shell scripting language (commonly bash, but dependent
on the system running it), which is used to define recipes to build targets. Our parser
must parse all of these languages at once. Even GNU Make (the tool, as opposed to
the language) interprets these as needed in a multi-pass system. For this reason, a
Makefile could contain a syntax error that goes unnoticed for a long period of time
as long as that rule or statement is never run.
Another challenge is Make’s use of white space to denote important distinctions.
Spaces are used to delineate items in a list (e.g. a dependency list). Tabs are used
to denote the beginning of a recipe command. Because of this, our grammar had to
parse at the character level and explicitly read tabs and spaces.
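To make the whitespace distinction concrete, the following is a minimal Python sketch, not the TXL grammar used in this work. It shows how a leading tab separates recipe commands from other lines and how spaces separate list items. The function names are hypothetical, and the sketch assumes no line continuations and the default recipe prefix (a tab).

```python
def classify_line(line: str) -> str:
    """Classify a Makefile line using only its leading whitespace (a simplification)."""
    if line.startswith("\t"):
        return "recipe"   # a leading tab marks a recipe command
    if not line.strip():
        return "blank"
    return "other"        # assignment, rule header, directive, ...


def split_list(text: str) -> list[str]:
    """Split a space-delimited list such as a dependency list."""
    return text.split()


header = "foo.o : foo.c foo.h"
target, _, deps = header.partition(":")
print(classify_line(header), target.strip(), split_list(deps))
print(classify_line("\tcc -c foo.c"))
```

A real parser cannot decide this line by line, which is why the grammar reads tabs and spaces explicitly.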
Because of these challenges, it should be noted, our TXL-based parser cannot
parse all Makefiles. It should also be noted that the numbers in Table 3.1 apply only
to the Makefiles we were able to parse. The number of Makefiles we were unable to
parse is only 369 out of 20059, or about 1.8% of the total number of files identified
as Makefiles. Of course, these are only the files we suspect to be Makefiles based on
their names, and it is likely the case that some we identified were not in fact Makefiles
and therefore could not be parsed. It is also possible that some of these Makefiles
contain actual syntax errors. In fact, we discovered one of these in the set of Android
Makefiles.
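The failure rate quoted above follows directly from the two counts. As a quick check, using only the figures given in the text (the variable names are ours):

```python
unparsed = 369     # Makefiles the parser could not handle
total = 20059      # files identified as Makefiles by name

parsed = total - unparsed
failure_rate = unparsed / total * 100

print(f"parsed: {parsed}")
print(f"failure rate: {failure_rate:.1f}%")
```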
Our parser was tested against our entire corpus and its accuracy was validated
by manually checking a random sample of 100 Makefiles. In terms of completeness,
the TXL error messages relating to syntax that the parser was unable to parse were
used to fix incomplete parts of the grammar. In addition, as we proceeded with our
<Statement><Comment>#This is an example of a Makefile</Comment></Statement>
<Statement><Assignment>VPATH=usr/lib/</Assignment></Statement>
<Statement><Directive>vpath *c usr/src/</Directive></Statement>
<Statement><Assignment>DEBUG=yes</Assignment></Statement>
<Statement><Assignment>
OBJS:= \bin/main.o \bin/foo.o \bin/bar.o
</Assignment></Statement>
<Statement><IfStatement>
ifeq( <Evaluation>$(DEBUG)</Evaluation> , yes )
<Statement><Directive>include build/flags_debug.mk</Directive></Statement>
else
<Statement><Directive>include build/flags.mk</Directive></Statement>
endif
</IfStatement></Statement>
<Rule>
<Target>.PHONY</Target> : <Dependency>all</Dependency>
<Recipe></Recipe>
</Rule>
<Rule>
<Target>all</Target> :
<Dependency><Evaluation>$(OBJS)</Evaluation></Dependency>
<Recipe>cc -o <Evaluation>$(OBJS)</Evaluation></Recipe>
</Rule>
<Rule>
<Target>foo.o</Target> :
<Dependency>foo.c</Dependency>
<Dependency>foo.h</Dependency>
<Recipe>cc -c foo.c</Recipe>
</Rule>
<Rule>
<Target>%.o</Target> : <Dependency>%.c</Dependency>
<Recipe>
+echo "Compiling... "
<FunctionCall>$(basename
<AutoEval>$@</AutoEval>)
</FunctionCall>
<Evaluation>$(CC)</Evaluation> -c
<Evaluation>$(CFLAGS)</Evaluation>
<Evaluation>$(CPPFLAGS)</Evaluation>
<AutoEval>$<</AutoEval> -o
<AutoEval>$@</AutoEval>
</Recipe>
</Rule>
<Rule>
<Target>.PHONY</Target> : <Dependency>clean</Dependency>
<Recipe></Recipe>
</Rule>
<Rule>
<Target>clean</Target> :
<Recipe>rm bin/*</Recipe>
</Rule>
Figure 3.4: Example XML Markup of Makefile Parse (Note: Lower level markup detail not shown for readability)
analyses and began to read more into the corpus, we discovered new features that we
wanted to measure and went back to add those to the grammar.
Several iterations of adapting, tuning, and refining resulted in a reliable parser
that yields an accurate, robust, and highly detailed parse of Makefiles. Figure 3.4
shows XML markup representing a simplified parse tree of our example Makefile
(Figure 3.1). From it, we can see that this Makefile contains six rules as well as a
number of statements.
3.2.2 Extracting Features
The second part of our analysis framework is a set of TXL rules to analyze Makefiles
by identifying, extracting, and counting a feature or property of the code. At the
very basic level, the code matches a Makefile and then extracts the desired elements
corresponding to each feature in the parse tree using the TXL extract (^) operator.
The length of (i.e. number of elements in) each of these lists of extracted elements
then gets printed as the output count for the feature or property.
Since our grammar is so detailed, it also allows us to measure features in context.
So, not only can we count the number of instances of a feature in a Makefile, but
we can also tell where it is used. Function calls are a good example we shall use
to illustrate. Function calls can occur almost anywhere in a Makefile — inside vari-
able assignments, targets, dependencies, recipe commands, and so on. Our extractor
counts function calls in all of these locations, as shown in Figure 3.5. We can see, for
example, that we extract the list of function calls ([FunctionCall*]) from the list of
assignments (Assigns) that we extracted earlier to count the number of assignments.
This gives us only the function calls that occur in assignment statements.
function main
match [program] Makefile [program]
...
% Assignment function calls
construct AssignFuncs [FunctionCall*]
_ [^ Assigns]
construct NAssignFuncs [number]
_ [length AssignFuncs] [putp "assignment function calls: %"]
...
% Target function calls
construct TargetFuncs [FunctionCall*]
_ [^ Targets]
construct NTargetFuncs [number]
_ [length TargetFuncs] [putp "rule target function calls: %"]
...
% Dependency function calls
construct DependencyFuncs [FunctionCall*]
_ [^ Dependencies]
construct NDependencyFuncs [number]
_ [length DependencyFuncs] [putp "rule dependency function calls: %"]
...
% Recipe function calls
construct RecipeFuncs [FunctionCall*]
_ [^ RecipeCommands]
construct NRecipeFuncs [number]
_ [length RecipeFuncs] [putp "rule recipe function calls: %"]
...
%total function calls
construct AllFuncCalls [FunctionCall*]
_ [^ Makefile]
construct _ [number]
_ [length AllFuncCalls] [putp "total function calls: %"]
...
Figure 3.5: An excerpt of TXL rules to extract, count, and display the number of function calls in various locations. We use function calls as an example because they can be placed almost anywhere in the Makefile.
Function calls can also be embedded within other function calls. Consider the
following example:
$(function1...$(function2a...$(function3...)...)...$(function2b...)...)
This example shows four function calls. function1 contains two function calls to function2a
and function2b, and function2a contains a function call to function3. One of the fea-
tures we measure in our analysis is the number of functions that contain other functions. In
the above example, there are two such function calls — function1 and function2a. But
counting instances of these is not as simple as in previous examples.
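The counts claimed for this example can be checked with a small Python sketch. This is only an illustration of the property being measured, not the TXL approach used in our framework. It assumes that every "$(" opens a function call and that arguments contain no other parentheses; the function name is hypothetical.

```python
def count_calls_with_embeds(expr: str) -> tuple[int, int]:
    """Return (total calls, calls that contain at least one embedded call)."""
    stack = []        # one flag per open "$(": has an embedded call been seen?
    total = 0
    with_embeds = 0
    i = 0
    while i < len(expr):
        if expr.startswith("$(", i):
            if stack:
                stack[-1] = True      # the enclosing call now contains a call
            stack.append(False)
            total += 1
            i += 2
        elif expr[i] == ")" and stack:
            if stack.pop():
                with_embeds += 1
            i += 1
        else:
            i += 1
    return total, with_embeds


example = "$(function1...$(function2a...$(function3...)...)...$(function2b...)...)"
print(count_calls_with_embeds(example))
```

On the example string this yields four calls in total, two of which contain other calls.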
Detecting this feature is more complex than simply extracting elements from the parse
tree (seen previously in Figure 3.5). For this, we made use of TXL’s pattern matching
capabilities. The whole process is shown in Figure 3.6. First, we use a variable called
TotalFunctionsWithEmbeds to keep track of the number of functions that have embedded
calls (in TXL, we have to import and export the variable between rules). In the main
extraction rule shown in Figure 3.6a, we initialize the count to 0, and apply two rules
to the Makefile shown in Figure 3.6b. The two rules are the same, but match the two
different function variants. Both of these rules scan the Makefile for a function call that
matches their respective variant. Then, they take the argument list of the call and use
the extract operator (^) to extract any function calls embedded inside. If this list is
non-empty (i.e. the length is not 0), then the function call does have an embedded call, so the
TotalFunctionsWithEmbeds variable is incremented and the rule proceeds. Back in the
main rule in Figure 3.6a, the variable gets imported and displayed as part of the output.
Once the analysis has been run on every Makefile in the corpus, we are left with a file
filled with output comparable to that in Figure 3.7. This Figure shows a representative
sample of the output for the example Makefile in Figure 3.1. From it we can see that it
correctly counts three assignments (lines 3, 6, and 7), six rules with six targets and six
dependencies total (lines 18, 19, 22, 25, 29, and 30), and five recipe lines (lines 20, 23, 26,
27, and 31), as well as a number of other features we will describe in the next chapter.
Using a simple Ruby script, we converted this output to a CSV (Comma Separated
Values) file with rows representing each Makefile and columns representing each feature
(124 in all). From there, we used Microsoft Excel to summarize, analyze, and graph the
data for each analysis.
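The conversion step can be sketched as follows. This is a Python stand-in for the Ruby script, which is not shown here. The function names, and the choice to fill features missing from a report with 0, are assumptions made for illustration.

```python
import csv
import io


def parse_report(text: str) -> dict[str, str]:
    """Turn 'feature: count' lines from the extractor into one row."""
    row = {}
    for line in text.splitlines():
        line = line.strip()
        key, sep, value = line.rpartition(": ")
        if sep:                      # skip blank lines and "..." placeholders
            row[key] = value
    return row


def to_csv(rows: list[dict[str, str]]) -> str:
    """One CSV row per Makefile, one column per feature."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="0",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = "Makefile: ./MakeEg\nlines: 31\n...\nrules (all): 6\n"
print(to_csv([parse_report(sample)]))
```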
export TotalFunctionsWithEmbeds [number]
0
construct _ [program]
Makefile [countFunctionsWithEmbeddedCalls]
construct _ [program]
Makefile [countErrorFunctionsWithEmbeddedCalls]
import TotalFunctionsWithEmbeds [number]
construct _ [number]
TotalFunctionsWithEmbeds [putp "total function calls with embedded calls: %"]
(a) An excerpt of the main extraction rule: the count is initialized to 0, the two counting rules are applied, and the result is printed.
rule countFunctionsWithEmbeddedCalls
replace $ [FunctionCall]
Func [FunctionCall]
skipping [FunctionCall]
deconstruct Func
’$ ’( _ [FunctionName] _ [tabspace] Args [list FunctionArg+] _ [WS] ’) _ [WS]
construct EmbeddedCalls [FunctionCall*]
_ [^ Args]
construct NEmbeddedCalls [number]
_ [length EmbeddedCalls]
deconstruct not NEmbeddedCalls
0
import TotalFunctionsWithEmbeds [number]
export TotalFunctionsWithEmbeds [number]
TotalFunctionsWithEmbeds [+ 1]
by
Func
end rule
rule countErrorFunctionsWithEmbeddedCalls
...
deconstruct Func
’$ ’( _ [ErrorName] _ [tabspace] Args [ErrorExpr*] _ [WS] ’) _ [WS]
...
end rule
(b) The counting rules applied to the Makefile, one for each function variant.
Figure 3.6: An excerpt showing how to count function calls that contain embedded function calls.
Makefile: ./MakeEg
lines: 31
continuations: 3
comments: 1
...
vpath directives: 1
includes: 2
ifdef/ifeq directives: 1
assignments: 3
VPATH variable assignments: 1
assignment variable references: 0
unique assignment variable references: 0
assignment function calls: 0
assignment $@ references: 0
...
assignment total obsolete autoeval references: 0
assignment total autoeval references: 0
assignment lazy assignments: 2
assignment strict assignments: 1
assignment iterative assignments: 0
assignment conditional assignments: 0
assignment paths: 2
rules (all): 6
rules (single colon): 6
rules (double colon): 0
pattern rules: 1
static pattern rules: 0
rule targets: 6
rule target special targets: 2
rule target suffix targets: 0
rule target variable references: 0
unique target variable references: 0
rule target function calls: 0
rule target paths: 0
rule dependencies: 6
...
rule recipe commands: 5
rule @ recipe commands: 0
rule - recipe commands: 0
rule + recipe commands: 1
...
total variable references: 6
total function calls: 1
total embedded function calls: 0
total unique variable references: 5
...
Figure 3.7: Sample output from the extractor, showing the result of running it on theexample in Figure 3.1.
3.3 Conclusion
Once we acquired a corpus of Makefiles and built the framework capable of extracting and
counting features, we had everything we needed to perform our analysis. The following
chapter presents an in-depth analysis of the features used in Makefiles.
Chapter 4
Feature Analysis of Open Source Makefiles
Make is often criticized for its difficulty to understand and general lack of debugging facilities
[45, 18]. Although this difficulty has been studied by measuring the size of Makefiles [9],
coupling of Makefile changes to source code changes [33] and analysis of the kinds of changes
made to build files [31], Make has not been studied yet from the program comprehension
point of view. Which language features are used the most? Is there a common set of features
that GNU Make can be reduced to without losing functionality? By knowing how Makefiles
are used, we can help make decisions about future versions and implementations, such as
what features to add, remove, or just make easier to use.
In this analysis, we present an extensive and detailed inventory of the features we
extracted from our corpus of open source Makefiles (described in the previous chapter),
addressing the following research questions:
RQ1. How frequently are Makefile features used? What are the features that are absolutely
essential to Makefiles? What are the least used or unused features? Are there features that
could be removed from Make to make it less bloated?
RQ2. Where are Makefile features used? Some features can be embedded inside other
features (e.g. variables can be referenced in targets, dependencies, etc.), so in which features
are these more commonly found?
RQ3. Are features used differently in generated Makefiles? How do Makefiles generated by
Automake, CMake and QMake differ from those written by hand? Are generators using
more advanced features?
RQ4. To what extent are bad practices, specifically obsolete features and recursion, still in
use? The GNU Make manual specifies a number of still supported, but obsolete features.
How often are they still used? Calling Make within a Makefile is considered harmful. How
many Makefiles do this?
4.1 Features of the GNU Make Language
Chapter 2 gives a basic overview of the Make language, but to fully understand feature use,
we must first know all of the features it offers. There have been many variations of Make,
each with its own set of features, but we will be focussing on the most popular variant,
GNU Make.
Make reads Makefiles, which contain a set of instructions that describe how to build a
particular software project. This is done using rules, which specify targets (files) and how
they should be built or rebuilt, as well as on which other files they depend. Make will build
only what is necessary, which means only the target files whose dependencies have changed
will be rebuilt.
Make is run by using the make command in the directory with a Makefile. With no
arguments, Make will look for a file named GNUmakefile, makefile, or Makefile (in that
order) and execute the first rule. By convention, that will be a rule to build the whole
system. Of course, there are many command line arguments that can be used to customize
the execution. For example, make foo will look for a rule to build foo, and make -f
build/Rules.mk will execute the given Makefile instead of the default Makefile.
Figure 4.1: An example Makefile
The remainder of this section presents a catalogue of Make features that we will be
measuring and discussing in this analysis. We group the features based on the syntax of
the grammar and their intention. Figure 4.1 shows an example Makefile that will be used
for illustration.
4.1.1 Readability
Comments
Comments in Make are denoted by a “#” and can occur on their own line or at the end of
a line (e.g., line 1 of Figure 4.1). There are no multi-line comments in Make.
Continuations
Continuations provide a way of breaking up long lines of Makefile text by placing a “\” at
the end of a line to denote that it continues on the following line. For example, lines 7-10 of
Figure 4.1 use continuations to break up the declaration of the OBJS variable onto multiple
lines.
4.1.2 Rules
Rules are the heart of the Makefile. They tell Make what should be made (targets), when
they should be made (dependencies), and how to make them (recipes). They take the form:
targets : dependencies
[TAB] recipe
We will discuss each part in further detail, as well as some other aspects of Make rules,
in the following sections.
Targets
Targets represent the artifacts to be built. Most of the time, that is the name of a file;
however, a phony target, containing a string of characters not associated with a file, can be
used to execute commands on request. Phony targets can be used to represent subsystems
with one root “all” target, typically representing the build of the whole system (lines
19-20 of Figure 4.1), or to represent a common build task, such as a “clean” target to remove
all previously built files (lines 30-31).
Dependencies
Dependencies (or prerequisites, as they are sometimes called) are other targets or names
of files that are required to build the target(s) of the rule in which they are specified. The
dependency list of a target T serves 2 purposes: First, the rule for each dependency of T
is found and the corresponding recipe is executed if the dependency is out-of-date; second,
if any of those dependencies was found to be out-of-date, the recipe of T itself is executed
to update the target(s). In certain cases, a target may not need to be updated if one of
the dependencies was found to be out-of-date. In such cases, these dependencies – called
order-only dependencies – can be listed at the end of the dependency list after a pipe symbol
(“|”).
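The rebuild decision described above can be sketched in Python. This is a simplified model in which modification times are plain numbers and a missing target is represented by None. It is not how GNU Make is implemented, and the function name is hypothetical.

```python
def needs_rebuild(target_mtime, dep_mtimes, order_only_mtimes=()):
    """Decide whether a target's recipe should run.

    target_mtime: modification time of the target, or None if it does not exist.
    dep_mtimes: modification times of the normal dependencies.
    order_only_mtimes: times of order-only dependencies; these never force
        a rebuild on their own, so they are ignored here.
    """
    if target_mtime is None:
        return True
    return any(mtime > target_mtime for mtime in dep_mtimes)


print(needs_rebuild(10, [5, 12]))                        # a dependency is newer
print(needs_rebuild(10, [5], order_only_mtimes=[99]))    # only order-only is newer
```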
Recipe
A recipe is a list of commands to be executed. These commands, in theory, build the rule
target(s) using the dependency files and should only update the target(s) and nothing else,
but this is not guaranteed. For example, compiler commands or scripts can be invoked from
inside a recipe, as in lines 20, 23, and 27 of Figure 4.1.
Special Targets
Make also defines a set of special targets that have special meanings. They all begin with
a period (“.”) and are conventionally written in all caps. For example, the Makefile in
Figure 4.1 contains 2 rules with the .PHONY target (line 18 and line 29). A rule with the
.PHONY target means that the dependencies should not be considered as filenames, but
rather as subsystems or build phases. Without these rules, if there are files named “all”
or “clean” in the same directory, Make would assume that those files are up-to-date and do
not need to be rebuilt using a rule. In such a case, the “clean” target would therefore never
be executed.
Recipe Flags
There are 3 prefixes that may be used in front of recipe commands. A minus sign (“-”)
indicates that Make should ignore any errors that the command may yield. A plus sign
(“+”) indicates that Make should execute this command even during a dry run, when the
user has specified that it should not execute any commands (useful for debugging). And
an at sign (“@”) indicates that Make should not print the command itself to the standard
output, only its programmed output. In the example in Figure 4.1, the recipe command on
line 26 begins with a “+” to indicate that the echo command should always print the name
of the file to be compiled, even when the user has specified otherwise.
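As an illustration, the prefixes can be separated from a recipe command with a few lines of Python. This is a minimal sketch with a hypothetical function name. It assumes the prefixes appear directly at the start of the command, with no whitespace between them.

```python
def split_recipe_flags(command: str) -> tuple[set[str], str]:
    """Separate the leading '@', '-', and '+' prefixes from a recipe command."""
    flags = set()
    i = 0
    while i < len(command) and command[i] in "@-+":
        flags.add(command[i])
        i += 1
    return flags, command[i:].lstrip()


print(split_recipe_flags('+echo "Compiling... "'))
print(split_recipe_flags("@-rm -f bin/main.o"))
```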
Single vs. Double Colon
Ordinarily, only one rule should give a recipe for a given target. If more than one does, Make
will execute the recipe of the last rule and print an error message. However, there are some circumstances where a
different recipe should be executed depending on which dependency has changed. In this
case, a double-colon is used in each rule rather than a single colon.
Pattern Rules
Pattern rules are used to define implicit rules in Make. Implicit rules describe a recipe for
making certain types of files that are all built the same way. Make includes many predefined
implicit pattern rules, such as the rule to compile .c files into .o files. A pattern rule looks
the same as an ordinary rule, but it contains a target with a percent symbol (“%”), which
matches any non-empty string. The “%” can then be used in the dependencies list to match
the same string. For example, lines 25-27 in Figure 4.1 show a pattern rule with target “%.o”
to match all object files, while its dependency (“%.c”) matches the corresponding C file.
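The way “%” binds a stem in the target and reuses it in the dependency can be sketched in Python. This is a simplified model with hypothetical function names. It assumes a single “%” per pattern and ignores how Make treats directory parts of file names.

```python
def match_stem(pattern: str, name: str):
    """Return the stem matched by '%' in pattern, or None if name does not match."""
    prefix, percent, suffix = pattern.partition("%")
    if not percent:
        return None
    if len(name) <= len(prefix) + len(suffix):
        return None                   # the stem must be non-empty
    if name.startswith(prefix) and name.endswith(suffix):
        return name[len(prefix):len(name) - len(suffix)]
    return None


def expand_pattern(pattern: str, stem: str) -> str:
    """Substitute the stem for '%' in a dependency pattern."""
    return pattern.replace("%", stem, 1)


stem = match_stem("%.o", "foo.o")
print(stem, expand_pattern("%.c", stem))
```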
Static Pattern Rules
Static pattern rules are the same as normal pattern rules, but operate on a static list of
targets that come before the target pattern. Instead of searching the file system for targets
matching the pattern, Make applies the pattern to the list of specified targets to extract the
stem (“%”) and find the dependencies. If, for example, we only wanted the pattern rule on
line 25 of Figure 4.1 to apply to foo.o and bar.o, we could add “foo.o bar.o:” before “%.o :
...”.
Suffix Rules
Suffix rules are the older, now obsolete, way of defining implicit rules. They contain no
dependencies, and only a single target specifying one or two file suffixes. A single suffix rule
contains one suffix, the source suffix: it can match any target name, and the dependency is
that name with the suffix appended. A double suffix rule contains two suffixes together and
applies to all targets that match the second
suffix, with dependencies that match the first suffix. For example, the pattern rule on line
25 of Figure 4.1 could be turned into an equivalent suffix rule by changing “%.o: %.c” to
“.c.o:”.
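The correspondence between a double suffix rule and a pattern rule can be sketched in Python. The function name is hypothetical. The sketch handles only double suffix targets and assumes each suffix contains exactly one period; real Make instead consults its list of known suffixes.

```python
def suffix_to_pattern(rule_target: str) -> str:
    """Rewrite a double suffix target such as '.c.o' as a pattern rule header."""
    second_dot = rule_target.rfind(".")
    if not rule_target.startswith(".") or second_dot <= 0:
        raise ValueError("expected a double suffix target such as '.c.o'")
    source_suffix = rule_target[:second_dot]   # dependencies match this suffix
    target_suffix = rule_target[second_dot:]   # targets match this suffix
    return f"%{target_suffix}: %{source_suffix}"


print(suffix_to_pattern(".c.o"))
```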
Recursive Make
A common but discouraged technique of creating a build system with Make is to split
the system into multiple Makefiles, each responsible for building their own subsystem, that
invoke each other as separate Make processes. These Makefiles are usually put into separate
folders along with the source code on which they operate and a root Makefile at the top
that calls these Makefiles (in recipe commands) to build the whole system. Invoking another
Makefile within a Makefile like this is referred to as recursive Make, and it can have unwanted
effects [34].
The problem with this is that, because separate Make processes are used, each Makefile
has no knowledge of what is happening in the other Makefiles, and therefore may not
rebuild all of the targets that require it. For example, if two of the Makefiles refer to
the same physical file, it is possible that the second one rebuilds that file, making a new
compiled version, while the first one has already made decisions based on the previously
compiled version of that file, thus not rebuilding the targets on which it depends.
There are a number of solutions to fix this. The easy solution is to invoke the root Make
more than once, thus giving the first Makefile the opportunity to notice changes made by
the second. However, in general the number of times you would have to invoke Make to
ensure everything was up-to-date varies depending on how many subsystems there are, not
to mention the overhead involved (which can be huge for a large system). The best solution
is to write one large Makefile or, to break it up, to use the include directive (described
later) to put the subsystem Makefiles directly in the root Makefile. This method ensures
that Make can find all of the required dependencies and rules it needs to build a complete
and up-to-date system.
4.1.3 Variables
Like many languages, Make provides variables that can be referenced in target/dependency
lists, recipes, or even other variables and functions. There are two flavours of variables:
recursively expanded, which are evaluated as they are needed (i.e., when a reference is
read), and simply expanded, which are evaluated when they are defined.
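The difference between the two flavours can be simulated in Python. This is a toy model rather than Make's implementation: variables live in a dictionary, only references of the form $(NAME) are recognized, and undefined names expand to the empty string. The function name is hypothetical, and the toy does not guard against a variable that refers to itself.

```python
import re

REFERENCE = re.compile(r"\$\((\w+)\)")


def expand_refs(text: str, variables: dict[str, str]) -> str:
    """Replace every $(NAME) with its fully expanded value."""
    return REFERENCE.sub(
        lambda m: expand_refs(variables.get(m.group(1), ""), variables), text
    )


variables = {"CC": "gcc"}
variables["LAZY"] = "$(CC) -c"                           # recursively expanded: stored as text
variables["EAGER"] = expand_refs("$(CC) -c", variables)  # simply expanded: evaluated now
variables["CC"] = "clang"                                # later change to CC

print(expand_refs("$(LAZY)", variables))    # sees the later value of CC
print(expand_refs("$(EAGER)", variables))   # keeps the value CC had at definition
```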
Assignments
Recursively expanded variables are defined using the “=” operator, or the define directive
(see Multi-line Variables). Simply expanded variables are assigned with the “:=” or “::=”
operators. There are also three special types of assignment operators: the “?=” operator
will only set the variable if it has not already been set; the “+=” operator appends the
value to the end of the variable; and the “!=” operator is used to assign a variable with the
result of a shell command (e.g., var != ls *.c). In the example in Figure 4.1, the simply
expanded OBJS variable is assigned on lines 7-10.
Variable References
Variables can be referenced anywhere in the Makefile (targets, dependencies, recipes,
variable assignments, etc.) by putting the variable name inside parentheses (or braces) with a “$” sign before
it (e.g., $(VAR) or ${VAR}). In the example in Figure 4.1, the OBJS variable is referenced in the
dependency list and the recipe of the rule on lines 19-20.
Multi-line Variables (Macros)
Normal variables cannot contain newlines, but sometimes they are needed (for example,
to define a common recipe segment). To do this, the define directive is used. Multi-line
variables are defined in much the same way as regular variables but with the “define” keyword before
the variable name, and “endef” after the value. For example, the definition below could be
used to define the recipe on lines 26-27 of the example in Figure 4.1.
define recipe :=
+echo "Compiling... " $(basename $@)
$(CC) -c $(CFLAGS) $(CPPFLAGS) $< -o $@
endef
Automatic Variables
When using patterns, function calls, or variable references in rule headers, it can be impos-
sible to tell which target or dependency is being evaluated. For this reason, Make provides
a set of Automatic Variables to refer to them inside a recipe (or dependency list). Table 4.1
lists the automatic variables offered by Make. In the example in Figure 4.1, the $< and $@
variables are used in the rule on lines 25-27 to get the name of the dependency and target,
respectively, since they are not known until runtime.
Table 4.1: Automatic Variables
$@   the filename of the target (in the case of multiple targets, the target that forced the rule to run)
$%   the member name of the target if the file is part of an archive
$<   the first dependency
$?   all dependencies that are newer than the target
$^   all dependencies, with duplicates removed
$+   all dependencies, with duplicates
$|   the order-only dependencies
$*   the stem (“%”) of the pattern matched in a pattern rule
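As an illustration of four of these variables, the following Python sketch computes $@, $<, $^, and $+ for a single target and its dependency list. The function name is hypothetical, and the remaining automatic variables are left out because they need information not modelled here.

```python
def automatic_variables(target: str, deps: list[str]) -> dict[str, str]:
    """Compute a subset of the automatic variables for one target."""
    unique = list(dict.fromkeys(deps))      # drop duplicates, keep first-seen order
    return {
        "$@": target,
        "$<": deps[0] if deps else "",
        "$^": " ".join(unique),
        "$+": " ".join(deps),
    }


values = automatic_variables("foo.o", ["foo.c", "foo.h", "foo.c"])
for name, value in values.items():
    print(name, "=", value)
```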
4.1.4 Statements
In addition to rules and variable assignments, there are a few special statements that can
be used in a Makefile. We will discuss these in this section.
VPATH Variable and vpath Directive
By default, Make will look for target and dependency files in the current working directory;
if the file is not there, an error has occurred. The VPATH variable and vpath directive allow
the build engineer to tell Make where to look for files that cannot be found in the current
directory. The vpath directive additionally allows a directory to be specified for a particular
pattern, so that the directory is searched only if the file name matches that pattern (e.g., “*.c”).
The example in Figure 4.1
contains both the variable and directive. Line 3 assigns “/usr/lib/” to the VPATH variable,
meaning that Make should look there for any file that it cannot find in the current directory.
Line 4 uses the vpath directive to tell Make to look in “/usr/src/” for .c files that it
cannot find in the current directory.
Includes
The include directive is used to tell Make to suspend reading the current Makefile and read
some other specified Makefile(s). Once Make has finished reading the other Makefile(s),
it continues reading after the include directive. This is used to separate the build into
multiple subsystems, a safer method than using recursive Make [34]. The example in Figure
4.1 includes 2 external Makefiles called flags.mk (line 15) and flags_debug.mk (line 13),
depending on whether the DEBUG variable is set to “yes” or not (more on conditionals in
the next section). These Makefiles, as their names suggest, contain variable definitions for
the CFLAGS and CPPFLAGS variables, used by the recipe command on line 27.
Conditionals
Like most languages, Make provides simple conditional if-statements that allow the build
engineer to include parts of the Makefile only if certain conditions are met. Make evaluates
conditionals, and replaces them with the text corresponding to the conditions that evaluate
to true. The text can be rules or statements that could occur inside the Makefile, or
recipe commands within a rule. For example, in Figure 4.1, lines 12-16 show a conditional
statement that checks to see if the DEBUG variable is set to “yes” and includes an external
Makefile if so.
4.1.5 Functions
Make includes a number of built-in functions for a variety of uses. Functions are called in
much the same way as variables are referenced, using commas (and no spaces) to separate
parameters, as shown below.
$(function param1,param2,param3)
The example in Figure 4.1 uses the basename function on recipe line 26 to print the name
(minus the .o) of the target file matching the rule. The remainder of this section briefly
describes the other functions provided by GNU Make, categorized based on the manual [4].
String Functions
String functions operate on strings or lists (strings separated by spaces). These include:
findstring, subst, and patsubst for finding and replacing substrings; strip for removing
leading and trailing whitespace; and filter, filter-out, sort, word, wordlist, words, firstword,
and lastword for operating on lists.
Filename Functions
Filename functions operate specifically on filenames and directories, or lists of filenames
and directories. These include: dir and notdir for extracting the directory and non-directory parts of a path; basename and
suffix for stripping and retrieving file extensions (like .c), respectively; addprefix, and
addsuffix for adding strings to the beginning and end of a path, respectively; join for
concatenating lists; wildcard, for searching for files (with patterns); and realpath and
abspath for returning non-relative paths (no “.” or “..”).
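To illustrate what four of these functions return, the following Python sketch mirrors dir, notdir, suffix, and basename for a single file name. The Python function names are ours. Each handles one name only, whereas the Make functions operate on whitespace-separated lists.

```python
def make_dir(name: str) -> str:
    """Directory part, up to and including the last slash ('./' if there is none)."""
    head, slash, _ = name.rpartition("/")
    return head + slash if slash else "./"


def make_notdir(name: str) -> str:
    """Everything after the last slash."""
    return name.rpartition("/")[2]


def make_suffix(name: str) -> str:
    """Extension of the last path component, including the period."""
    last = make_notdir(name)
    dot = last.rfind(".")
    return last[dot:] if dot != -1 else ""


def make_basename(name: str) -> str:
    """The name with its suffix removed; the directory part is kept."""
    suffix = make_suffix(name)
    return name[:len(name) - len(suffix)] if suffix else name


for path in ["src/foo.c", "foo.o", "src-1.0/bar"]:
    print(path, make_dir(path), make_notdir(path),
          repr(make_suffix(path)), make_basename(path))
```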
Conditional Functions
There are 3 conditional functions: if, and, and or. The if function differs from the if
directive (“ifdef”/“ifeq”) because it can be used in rule targets or dependencies. The and
and or functions are typical logical operators applied to a series of conditions (in Make, an
empty string counts as false and any non-empty string as true). They are normally used in conjunction with the if function.
Control Functions
Control functions can alter the way Makefiles are run. The error function will print a given
error message and exit. The warning function will throw a warning with the given error
message, but continue running. The info function simply prints a given message.
Variable Functions
The value, flavor, and origin functions are useful for inspecting how a variable was defined.
The value function will return the current value of a variable, without expanding it (i.e., as
text with variable names). The flavor function will return the flavour of a given variable
(i.e., simple or recursive), while the origin function returns where a variable was defined
(e.g., command line, automatic variable, environment variable, etc).
Other Functions
Make offers some other useful specialized functions. The shell function can be used to
call a shell command. The guile function is used to run GNU Guile scripts. The foreach
function iterates through a list of words and performs some other function using it. The
file function allows a Makefile to write to a file by overwriting it or appending to it.
Custom Functions (The Call Function)
If Make doesn’t provide a function, or a particular sequence of function calls becomes too
difficult to read, Make allows developers to write custom functions and invoke them using
the call function. Invoking the call function is much like any other function, where the first
parameter is the name of the function. Subsequent parameters are passed to the custom
function and referred to using $(1), $(2), and so on.
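The parameter substitution performed by call can be sketched in Python. This is a toy model with a hypothetical function name. It substitutes only numbered parameters from $(1) upward, treats missing parameters as empty, and does not expand any other references in the body.

```python
import re


def call(body: str, *args: str) -> str:
    """Substitute $(1), $(2), ... in the body of a custom function."""
    def substitute(match):
        index = int(match.group(1))
        return args[index - 1] if 1 <= index <= len(args) else ""
    return re.sub(r"\$\((\d+)\)", substitute, body)


reverse = "$(2) $(1)"            # body of a custom function that swaps two words
print(call(reverse, "a", "b"))
```

In a Makefile, the same custom function would be invoked as $(call reverse,a,b).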
4.2 Analysis
To help answer our research questions, we plotted several graphs illustrating feature use
across the Makefiles in our corpus. Each graph shows the proportion of the corpus that
uses the feature listed on the Y-axis at least once.
Figure 4.2: Feature Use (% of Makefiles)
RQ1. How frequently are Makefile features used?
Figure 4.2 shows feature use of the whole corpus for the features listed. We group
the features using the breakdown of Section 4.1 to simplify the chart. As we can see,
Makefiles are frequently commented, with comments appearing in more than 80% of files.
Assignments and variable references are the most used language features, at over 85% and
65% (respectively) of all Makefiles. Rules, targets, dependencies, and recipes account for
just under 50%. This indicates that Makefiles are split into two categories: Makefiles with
rules, and Makefiles with only assignments that are included in the former. This is further
supported by the number of Makefiles that use includes (just over 25%) and by browsing
the Makefile source.
Figure 4.3: Assignment Use (% of Makefiles)
There are some features mentioned in the GNU Make manual that we did not find in our
corpus. To see this, we have to unpack some of the categories of features we used. First,
there are 6 different kinds of assignment operators: lazy (“=”), strict (“:=” and “::=”),
iterative (“+=”), conditional (“?=”), and shell (“!=”). From the graph in Figure 4.3, we
can see that the shell operator and the alternative strict operator (introduced to conform
to the POSIX standard) are never used, which is not surprising considering that they were
only introduced in GNU Make v4.0 (released October 2013). Lazy assignments are used
the most, but what is interesting is the difference between hand-written and generated
Makefiles. CMake and QMake use only lazy assignments, with Automake also using strict,
iterative and conditional. It is a good time to note that, since it uses templates, Automake
can have hand-written Make code inserted into the Makefiles it generates. Because of this,
we see some similar features popping up in the feature composition of generated and hand-
written Makefiles. For example, we can see that Automake uses other assignment operators
in the same descending order as hand-written Makefiles.
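The difference between the assignment operators can be seen in a short sketch. All variable names are hypothetical, and the “!=” line assumes GNU Make 4.0 or later and a host that provides uname.

```make
COUNT = one
# "=" (lazy): the right-hand side is expanded each time LAZY is referenced.
LAZY = $(COUNT)
# ":=" (strict): the right-hand side is expanded once, at this point.
STRICT := $(COUNT)
COUNT = two

# "+=" (iterative) appends to the existing value.
LIST = a
LIST += b
# "?=" (conditional) assigns only if the variable is not already defined.
COUNT ?= three
# "!=" (shell) stores the output of a shell command.
HOST != uname -s

# LAZY expands to "two", STRICT to "one", LIST to "a b", COUNT to "two".
all: ; @echo "LAZY=$(LAZY) STRICT=$(STRICT) LIST=$(LIST) COUNT=$(COUNT) HOST=$(HOST)"
```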
In Figure 4.2, we can see that automatic variables are used in about 13% of Makefiles,
but some are more common than others. Looking closer (Figure 4.4), we see that $@ is the
most common of these variables, followed by $?, $<, $*, and $^. $| and $+ are also used to
some extent, although they are not visible on the graph (0.01%). $% is never used.
Automake is by far the biggest user of automatic variables, with almost every Makefile it
generates using at least one, but they are also used in hand-written Makefiles.
Figure 4.4: Automatic variables (% of Makefiles)
Functions are another feature that is almost exclusively used in hand-written and
Automake-generated Makefiles. Figure 4.2 indicates that they are only used in about 5%
of Makefiles. Figure 4.5 shows the use of all the built-in functions listed in the Make man-
ual. The call and wildcard functions are the most common with 1.8% and 1.7% use
respectively, with the next highest function at 1.4%. On the other end of the list, there
are a number of functions that are never used in our corpus. The lastword, suffix, file,
flavor, and guile functions all have 0% usage. The guile and file functions were only
recently added to GNU Make, which likely accounts for their absence.
Figure 4.5: Function calls (% of Makefiles)
RQ2. Where are Makefile features used?
Some Make features can be embedded in others. For example, a dependency may contain
a variable reference, an automatic variable reference, or a function call. As we discussed
in Chapter 3, our Makefile grammar is precise enough that we can detect combinations of
features by extracting them from other features. Specifically, variable references (including
automatic variables) and function calls can be placed in a number of places in a Makefile.
Figure 4.6: Locations of (a) variable references, (b) automatic variable references, and (c) function calls as a percentage of the total.
Figure 4.6 shows the number of instances in each possible location in our corpus as a
percentage of the total number of instances. We can see that variable references (Figure
4.6a) occur mostly in rule recipes followed closely by variable assignments, with targets and
dependencies making up the remainder.
Automatic variables are used to refer to targets and dependencies when they are un-
known. Unsurprisingly, the overwhelming majority of them occur inside rule recipes (Figure
4.6b) because compiler commands need to know which targets and dependencies are being
considered by Make when they are invoked. For example, pattern rules (e.g. the rule on
line 25 of Figure 4.1) can be matched to a wide array of targets and dependencies and must
have recipes that are written without knowing what those patterns end up matching. So
automatic variables are used to let Make fill in the blanks when it applies the rule. The second
most used location of automatic variables is in assignments. Assignments are often used
to store recipe commands and lists of dependencies that are reused in various places and are
written to be general enough to be used in a variety of situations, which is why they use
automatic variables. Finally, there is a small number of automatic variables located inside
dependency lists (14 in total). Of these 14, we see usage of three automatic variables: $@,
$<, and $*. These occur in both Automake and hand-written Makefiles, although hand-written
Makefiles were the only source of $* instances. Of course, no instances were found
inside targets. It does not make sense to use them in a target because targets do not
need to refer to themselves and, since Make reads targets first, there are no dependencies
to which to refer.
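To make the role of automatic variables in pattern rules concrete, here is a minimal sketch. The file names are hypothetical, and main.c and util.c are assumed to exist in the current directory.

```make
# Hypothetical object list; main.c and util.c are assumed to exist.
OBJS := main.o util.o

# $@ is the target (app) and $^ is the full list of prerequisites.
app: $(OBJS) ; $(CC) -o $@ $^

# The recipe is written without knowing which file the pattern will match:
# $< is the matched .c file and $@ the corresponding .o file.
%.o: %.c ; $(CC) $(CFLAGS) -c $< -o $@
```

The same recipe text serves every object file, which is only possible because Make substitutes the matched names at the moment the rule is applied.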
We have seen that function calls occur mostly in hand-written Makefiles (or generated
Makefiles that can contain hand-written portions). In Figure 4.6c we can see that most of
them occur within assignments and recipes. However, we have observed some that are used
to generate target and dependency lists in some way.
RQ3. Are features used differently in generated Makefiles?
We have already seen how generators use assignment operators, automatic variables,
and function calls differently, but Figure 4.7 shows the breakdown of feature use by all
categories. A set of core Makefile features emerges that is used by all generated files,
mainly variables (assignments, references) and single colon rules (targets, dependencies,
recipes). Other features such as special targets, continuations, and recipe command flags
(“@”, “-”, and “+”) are used by most. On the other hand, there are some features that
seem designed specifically to make writing Makefiles by hand easier. Conditionals,
functions, vpaths, double colon rules, pattern rules, and macros are all used only by
hand-written or Automake Makefiles.
Figure 4.7: Feature Use By Generator (% of Makefiles)
Table 4.2: Obsolete Features in Makefiles.
                All      Automake  CMake    QMake    Hand
.SILENT         19.12%   0.00%     43.38%   0.00%    0.03%
.IGNORE         0.00%    0.00%     0.00%    0.00%    0.00%
suffix rules    19.43%   97.41%    0.00%    82.47%   3.26%
old auto vars   0.16%    0.06%     0.00%    0.00%    0.45%
The graph also shows a set of features that are used very little in our corpus (i.e.,
<10%), and only by Makefiles with hand-written Make code (i.e., Automake-generated and
hand-written Makefiles). Exceptions are pattern and double-colon rules, which are also
used by a small number of QMake-generated Makefiles (i.e., <0.5%). Vpaths, functions,
conditionals, macros, pattern rules, double colon rules, and – to some extent – automatic
variables are what we consider to be the more advanced features of Make because they allow
more general, less verbose Makefiles to be written.
RQ4. To what extent are bad practices, specifically obsolete features and recursion, still in
use?
The GNU Make manual gives 4 features that are “semi-obsolete” meaning that, while
still supported, they are replaced by other features. Table 4.2 shows the percentage of
Makefiles that use obsolete features.
The first two of these features are the .SILENT and .IGNORE special targets, which are
obsolete because the “@” and “-” recipe command flags (respectively) offer a more fine-grained
implementation. We can see from our data that .IGNORE is never used. We
can also see that .SILENT is used in only two hand-written Makefiles, but in
more than 40% of CMake-generated Makefiles.
Another obsolete feature is suffix rules, the old way of expressing implicit rules, which
were replaced by pattern rules. Our results show that these are still used often across most
categories, appearing in almost every Automake-generated Makefile in our corpus, and in
the majority of QMake-generated files. On the other hand, CMake doesn’t use them at all.
Interestingly, there are 100 Makefiles in our set – across multiple generators – that use a
mixture of both suffix and pattern rules.
Table 4.3: Recursion in Makefiles.
            All      Automake  CMake    QMake    Hand
make        32.46%   99.12%    28.29%   81.71%   4.76%
automake    7.84%    97.16%    0.00%    0.00%    0.04%
cmake       19.11%   0.00%     43.38%   0.00%    0.00%
qmake       12.49%   0.00%     0.00%    98.68%   0.00%
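For readers unfamiliar with the two notations, the sketch below writes the same implicit rule both ways. It is an illustration only and is not taken from any Makefile in the corpus; in practice one of the two forms would be used.

```make
# Old-style suffix rule: how to make a .o file from a .c file.
# Both suffixes must be known to Make, hence the .SUFFIXES line.
.SUFFIXES: .c .o
.c.o: ; $(CC) $(CFLAGS) -c $< -o $@

# The pattern-rule form of the same rule.
%.o: %.c ; $(CC) $(CFLAGS) -c $< -o $@
```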
The last obsolete feature is a variation on automatic variables. Each automatic variable
returns a file or list of files, which is in the form of a path. These paths can be reduced to
filenames or directories only by appending an “F” or a “D” (respectively) to the automatic
variable (e.g., “$(@F)” or “$(%D)”). This form is considered obsolete because the dir and
notdir functions provide a similar output. When we look at the results, we can see this
form used in less than 0.2% of all Makefiles (25 total), appearing only in Automake or
hand-written Makefiles (the only Makefiles that use automatic variables).
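The two forms can be compared side by side in a one-rule sketch. The target path is hypothetical and is used only to show the expansions.

```make
# Hypothetical target path; the file is not created, so the recipe always runs.
# $(@D) gives "out/bin" (no trailing slash), while $(dir $@) gives "out/bin/".
# $(@F) and $(notdir $@) both give "tool".
out/bin/tool: ; @echo "D=$(@D) F=$(@F) dir=$(dir $@) notdir=$(notdir $@)"
```

The outputs are similar rather than identical: the D variant drops the trailing slash that the dir function keeps.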
As discussed in Section 4.1, Make can be run recursively by calling itself within a
recipe. This can be dangerous and can lead to unexpected results if not done carefully [34].
Generators can also be recursively invoked from a Makefile recipe, to report build progress
back to the generator or to compile a report. As such, these are likely less harmful, but we
measure them nonetheless.
Table 4.3 shows the use of recursive calls across all categories of Makefiles. What we see
is that generators rely heavily on recursive calls. Most Automake Makefiles use recursive
Make and Automake calls (about 99% and 97%, respectively). The same goes for CMake and QMake: 43% of
CMake files call CMake, nearly all QMake files (about 99%) call QMake, and both use recursive Make calls
(28% and 82%, respectively). Interestingly, only about 5% of the hand-written Makefiles
use recursion. A tiny number of hand-written Makefiles seem to call Automake, possibly
an indicator of misclassification for some Makefiles.
4.3 Discussion
From our analysis, it appears that there is a core set of features that are essential to
any Make-based build system, roughly corresponding to Feldman’s original Make features.
Assignments and variable references are by far the most common, appearing in almost every
Makefile. This makes sense, considering that some Makefiles consist only of assignments
that are then included in another Makefile (as a configuration technique). For example, the
Linux kernel adopted this pattern to improve the life of driver developers, who no longer
need to write their own rules but just assign the names of their source code and object files
to the right variable. The Linux build system then automatically reads these variables and
processes them using generic build rules.
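The assignment-only style can be sketched as follows. This is a simplified illustration of the pattern, not actual kernel build code; the option and file names are hypothetical.

```make
# Hypothetical configuration option; "y" means "build into the kernel".
CONFIG_EXAMPLE_DRIVER ?= y

# A driver author only appends object names to the right variable.
obj-y :=
obj-y += core.o
# When the option expands to "y", this line appends to obj-y.
obj-$(CONFIG_EXAMPLE_DRIVER) += example_driver.o

# Stand-in for the generic rules that would consume the list.
all: ; @echo "objects to build: $(obj-y)"
```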
Rules and their associated parts (targets, dependencies, and recipe commands) form the
other essential Make features. What is surprising is how many hand-written Makefiles con-
tain no rules. It is possible that these files have been incorrectly classified as hand-written,
when in fact they may have been generated by Autotools (and Automake). Autotools often
produces smaller Makefiles containing only variable assignments, but doesn’t produce any
comment to say they were generated, so it is difficult to know for certain.
It may seem like include statements are not used that often, but it makes sense when
one realizes that not every file is going to include another file. Some Makefiles will be
written for the purpose of being included in other Makefiles, and therefore probably won’t
include others.
Special targets are used a lot by generators, but not often in hand-written Makefiles. We
saw earlier that more than 40% of CMake-generated Makefiles use the obsolete .SILENT target. This is because
all CMake-generated Makefiles containing rules use a rule whose target name contains a reference to a variable called VERBOSE.
By default, VERBOSE is null and therefore an empty rule with only the .SILENT
target is produced, which Make interprets as an instruction not to echo recipe commands as it runs them. If
VERBOSE is set to any non-null value, the target ceases to be the special .SILENT target
and the rule is meaningless. This is a simple, elegant way of giving the user the option of
suppressing command echo. Make has since discouraged this feature in favour of recipe
flags, which offer more fine-grained control of which command output to suppress. However,
this may be one update that the CMake developers have deferred until Make stops
supporting it because the worst case is that the user is presented with unwanted output
while their system is built.
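The idiom described above can be reproduced in a few lines. This is a reduced sketch rather than a verbatim excerpt of CMake output; the all target and its recipe are hypothetical.

```make
# Placed first so that it remains the default goal whatever VERBOSE is.
all: ; echo "building"

# With VERBOSE unset this line reads ".SILENT:" and no commands are echoed.
# With `make VERBOSE=1` it reads "1.SILENT:", an ordinary target that nothing
# depends on, so commands are echoed as usual.
$(VERBOSE).SILENT:
```

Running make prints only the output of the command, while make VERBOSE=1 also prints the command line itself.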
One serious problem that should be addressed by Automake and QMake developers is
the use of suffix rules. Used in nearly every file produced by these tools, the disappearance
of suffix rules could have catastrophic effects (i.e. not being able to compile the project) on
users if GNU Make stops supporting them.
The lack of pattern rules in automatically generated Makefiles can be explained by the
fact that generators specialize their Makefiles after scanning for the names of all source
code files and their header file dependencies. Hence, they know all the files by name during
generation, and no generic rules are necessary. Apart from being faster, the use of only
essential Make features makes these Makefiles more portable to other implementations such
as BSD Make. Alternatively, it could be because they are still using the aforementioned
obsolete suffix rules instead.
All CMake-generated Makefiles that contain rules use recursive Make calls, which the
developers argue is necessary in order to automatically scan implicit dependencies [2]. This
works using 3 levels of Makefiles that call each other. The top-level Makefile is the one that
is called from the command line. It then calls the second-level Makefile (Makefile2) with the
appropriate target. The second-level Makefile calls the third-level Makefiles (build.make)
that are within the sub-directories of the system. These third-level Makefiles get called
twice: the first time with the “depend” target, which scans the system for implicit depen-
dencies; and again with the “build” target, which actually builds the system. This is how
it avoids the pitfalls of recursive Make described by Miller [34].
4.4 Conclusion
By looking at the frequency of use of GNU Make features, we observed two things. First,
there is a core set of features that are used in more than a third of all Makefiles. Second,
there is a set of more advanced features that seem to be more suited to writing Makefiles
by hand, and that are used almost exclusively in that context. We also found that recursive
Make is used a lot – especially by generators – even though it has been considered bad
practice for almost two decades, and that suffix rules are still used extensively despite being
considered semi-obsolete. While we have only presented a summary of our feature use
analysis in this chapter, a full breakdown of the features used in our corpus can be seen in
Appendix A.
We have only begun to scratch the surface of what can be analyzed in Makefiles. In
the next chapter, we will show how we can use this detailed inventory of feature use to
measure the complexity of a given Makefile from a maintenance and program comprehension
perspective.
Chapter 5
Maintenance Complexity of Open Source Makefiles
Build automation tools, such as Make, are the backbone of every software project. They
compile the source code submitted by developers into an executable program. They run
tests to assure that the code is correct. They compile documentation for manuals. And
they package everything into a distributable application. Given the vital role these systems
play in the development process, maintaining them is a high priority, because without them
the whole project grinds to a halt. Moreover, even minor errors in the build process can
lead to major difficulties in a software release. They must also constantly evolve and adapt to
changes and additions to the source code, tests and target platform of the software.
As the size and complexity of build systems grow, the overhead of build system maintenance
becomes an increasingly large part of the overall software maintenance task. A
recent study by McIntosh et al. [33] shows that the build system can account for up to 31%
of code files in a project, and that up to 27% of source code changes in a C project require
corresponding updates to build artifacts. Adams et al. studied the Linux build system
through its many refactorings since inception [9], showing that the build system has become
a large and complex software system of its own that represents a large
part of the Linux code base. Thus the maintenance of build systems is an increasingly
important and difficult part of the overall software effort [10, 33].
Build languages such as Make [17] and Ant [13] have also been shown to be notoriously
difficult to understand and modify [34, 46]. This is partly because they were designed
for much smaller systems [17], and partly because of their lack of higher level abstraction
features appropriate to the million-line software systems they have grown to be [33].
In this chapter, we seek to characterize this complexity and the maintenance overhead
associated with it. To address the declarative nature of build systems, we propose a new
theory of software complexity based on program understanding effort, and a metric to char-
acterize it based on indirect references. Unlike other software metrics, such as Halstead
complexity and cyclomatic complexity, indirection complexity is based on exactly how de-
velopers are known to maintain code — by following links to understand what the code is
doing. We use this knowledge, as well as our knowledge of human memory, to estimate how
difficult that understanding will be, which can then be used to predict where most of the
maintenance difficulty will lie. We hope developers will use this to make decisions about
how or if to refactor their Makefiles.
5.1 Complexity of Software Maintenance
Existing measures of code complexity, such as McCabe’s Cyclomatic Complexity [29], Hal-
stead’s metrics [20], and Function Point Analysis [12], are primarily aimed at development
effort and at traditional algorithmic code. While they have been used to estimate predicted
maintenance effort [11], they are not designed for that purpose and do not apply well to
declarative languages such as those used in build systems. When they have been modified to
apply to build systems, they have been found to be indistinguishable from simply counting
the number of lines [30]. As a result, efforts to estimate complexity of build systems have
relied primarily on simple metrics such as number of files, lines, targets, or dependencies,
and historical metrics such as build file churn [33, 9, 30].
objs/comprul.o : UNIX/cinterface comprul.c
	cc -c -O comprul.c; mv comprul.o objs/comprul.o
objs/compdef.o : UNIX/cinterface compdef.c
	cc -c -O compdef.c; mv compdef.o objs/compdef.o
objs/boot.o : UNIX/cinterface boot.c
	cc -c -O boot.c; mv boot.o objs/boot.o
objs/loadstor.o : UNIX/cinterface loadstor.c
	cc -c -O loadstor.c; mv loadstor.o objs/loadstor.o
objs/ident.o : UNIX/cinterface ident.c
	cc -c -O ident.c; mv ident.o objs/ident.o
objs/scan.o : UNIX/cinterface scan.c
	cc -c -O scan.c; mv scan.o objs/scan.o
objs/parse.o : UNIX/cinterface parse.c
	cc -c -O parse.c; mv parse.o objs/parse.o
objs/parsepf.o : UNIX/cinterface parsepf.c
	cc -c -O parsepf.c; mv parsepf.o objs/parsepf.o
objs/trees.o : UNIX/cinterface trees.c
	cc -c -O trees.c; mv trees.o objs/trees.o
objs/xform.o : UNIX/cinterface xform.c
	cc -c -O xform.c; mv xform.o objs/xform.o
objs/xformpf.o : UNIX/cinterface xformpf.c
	cc -c -O xformpf.c; mv xformpf.o objs/xformpf.o
(...)
(a)
all:
	cp -r ../generic/* .
	cp -r machdep/* .
	make -f Makefile_Generic SYS=$(SYS) AS="cc -c -m32" \
		CC="cc -m32" LD="cc -m32"
clean:
	make -f Makefile_Generic clean
	rm -r -f TL* Makefile_Generic main.ch
(b)
Figure 5.1: These examples illustrate how current metrics used to measure complexity in Makefiles, such as number of lines or dependencies, can be misleading. Example (a) has more dependencies and lines, but example (b) hides its complexity in another Makefile that it calls recursively.
We propose a new approach, designed to predict maintenance effort more directly. Stud-
ies of software maintenance indicate that the majority of maintenance effort is spent trying
to understand the system [43]. While novices tend to read code locally and linearly, experts
are more likely to follow links and related parts [21], and both kinds of maintainers spend
most of their time navigating the software and searching for related items [42, 41, 24]. Many
tools and interfaces have been designed to assist in these tasks, perhaps most successfully
Hipikat [16] and Mylar [23], which has been adopted into Eclipse as Mylyn [48].
The bottom line is that software maintenance is all about the effort of code exploration
and understanding, not so much about size or change. A very large Makefile can be easy
to understand (Figure 5.1a), and a very small one can be very difficult (Figure 5.1b). The
difference between these two files is that the first is repetitive and self-contained – the
programmer need look nowhere else to understand what is being built and under what
conditions. By contrast, the build script in Figure 5.1b is dependent on a great deal of
external information – in order to understand what is being built and when, the programmer
must look into two different parts of the file system, must look to see how the Makefile is
invoked elsewhere and with what parameters, must trace a recursive Make invocation, and
must find and substitute overridden variables in the recursive call.
Each of these actions requires the programmer to look elsewhere to understand the
meaning of the section of code at which they are looking. Every such reference disturbs the
process and increases the difficulty of understanding when the script is being maintained.
We call these references indirections, and theorize that as the number of them increases in a
build script, the difficulty of understanding increases, and thus maintenance effort increases.
More generally, every time a maintainer must look somewhere that is not where they are
currently focussed in order to understand what they are doing, the cognitive overhead of
understanding is increased.
Based on these observations, we propose a new complexity measure we call indirection
complexity, which is calculated as the sum of all instances of such indirections. In a main-
tained software artifact, such as a build script, this can be approximated by identifying and
counting occurrences of language features that result in indirection (that is, that require
the programmer to look elsewhere in order to understand what is in front of them).
To put it formally, indirection complexity IC can be computed as:
$$IC = \sum_{x=1}^{n} w_x i_x$$
for n indirect features, where $i_x$ is the number of instances of indirect feature x and $w_x$ is
a weight associated with indirect feature x. For this introductory work, we have kept all
weights at 1. Further analysis and empirical evidence will be necessary to find the ideal
weights.
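As a worked illustration with all weights set to 1, suppose a Makefile is scored over seven feature groups (dependencies, path-related features, includes, conditionals, variable references, function calls, and recursive Make calls). Both this grouping and the counts below are hypothetical values chosen for the arithmetic, not measurements from the corpus.

```latex
% Hypothetical counts: 12 dependencies, 2 path-related features, 2 includes,
% 1 conditional, 9 variable references, 3 function calls, 1 recursive Make call.
\[
IC = \sum_{x=1}^{7} w_x i_x
   = 1\cdot 12 + 1\cdot 2 + 1\cdot 2 + 1\cdot 1 + 1\cdot 9 + 1\cdot 3 + 1\cdot 1
   = 30
\]
```

With unit weights the score is simply the total number of indirection instances. Different weights would change the relative contribution of each feature group.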
This measure has several advantages: it can be applied to any kind of programming
artifact, including requirements, design documents, build systems, and source code; it is
independent of programming language and paradigm; and it is focussed directly on esti-
mating software maintenance effort rather than logical content. In the remainder of this
chapter, we explore the application of the indirection complexity measure to Makefiles.
5.2 Indirect Features of GNU Make
Before we can calculate the indirection complexity of Makefiles, we must first determine
which features should be considered indirections. This section describes the features we
consider to be indirections and how we arrived at that conclusion. We will use the example
Makefile in Figure 5.2 to illustrate.
Figure 5.2: An example Makefile.
5.2.1 Dependencies
The number of dependencies is a common metric for measuring build complexity, and we
include it in our metric as well. This is because each dependency specified in a rule represents
another rule or file that must be found and, therefore, an indirection. Since we include this
in our metric, in some cases it can dominate the complexity score and appear to be no
different than simply counting the dependencies. However, as we will see later, our metric
provides more nuance, especially for Makefiles with no dependencies at all.
5.2.2 vpath, directory change (cd), paths
Makefiles depend on knowing the state of the filesystem, or at least the portions of the
filesystem relevant to the software project being built, in order to know whether files exist
or if they are out of date (i.e. their timestamp is older than that of the files on which they
depend). When a user is reading a Makefile and trying to understand it, or why it does not
work, they must be aware of where Make is looking for these target files.
Make’s vpath directive (and the less versatile VPATH variable) allows the Makefile author
to select directories where Make should look for target files or dependencies. For example,
they may specify where third party library files can be found on the user’s system, or they
may specify a folder in the project directory with common files so as to avoid having to use
paths. In any case, the reader must be aware of these directories and redirect their attention
to them when they are referenced. The example in Figure 5.2 uses both the VPATH variable
(on line 3) and the vpath directive (on line 4).
A similar argument can be made for directory changes and paths in filenames. When
the working directory is changed, the reader must be aware of the files in the new directory.
And when a path is specified with a target or dependency, the reader must redirect to this
directory to check the state of the file. We also count file paths in variable assignments,
because they are often referenced in target and dependency lists. The example in Figure
5.2 does not change directories in any of its recipes, but it does use paths in the definition
of the OBJS variable on line 7. This means the reader must add the “bin” directory to the
list of places that must be monitored in addition to the “lib” and “src” directories specified
in the VPATH variable and vpath directive.
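The following sketch shows the two search-path mechanisms together. It is not the Makefile of Figure 5.2; the directory and file names are hypothetical.

```make
# VPATH: search "src" for any prerequisite not found in the current directory.
VPATH = src
# vpath: search "lib" only for files matching the pattern %.h.
vpath %.h lib

# main.c may be found as src/main.c and defs.h as lib/defs.h;
# $< then expands to the path where main.c was actually found.
main.o: main.c defs.h ; $(CC) -c $< -o $@
```

A reader of the rule alone cannot tell where main.c and defs.h live without consulting the two search-path lines, which is the indirection being counted.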
5.2.3 Includes
Include statements provide a way for the author to break a Makefile into logical units (or
illogical ones, depending on the author) and include them in other Makefiles, essentially
inserting them at the point of the include statement. This creates a level of indirection where
the reader must switch their attention to the included file. At worst, they must switch
back and forth between files when searching for rules that update an outdated target. The
example in Figure 5.2 contains two include statements on lines 13 and 15.
5.2.4 Conditionals (ifdef/ifeq)
Like most programming languages, Make provides a conditional construct that causes a portion
of the code to be included or ignored based on some criteria (e.g. if a variable is defined,
or if a variable has a particular value). Conditional statements are evaluated during a
pre-processing step, which simplifies them to an extent. However, when reading them,
the reader may still have to redirect their attention to somewhere else in the Makefile to
continue reading. This could be the next line, or it could be halfway down the file. The
example in Figure 5.2 contains a condition on line 12 that checks if the value of $DEBUG is
“yes” and includes a different external Makefile depending on the outcome.
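A conditional that selects between two included files might look like the sketch below. The file names debug.mk and release.mk are hypothetical, and the “-include” form is used here only so that the sketch runs even when those files are absent.

```make
# Try `make` and `make DEBUG=yes`.
ifeq ($(DEBUG),yes)
# "-include" behaves like "include" but ignores a missing file.
-include debug.mk
else
-include release.mk
endif

all: ; @echo "DEBUG=$(DEBUG)"
```

To understand the build, the reader has to evaluate the condition and then open whichever file was selected, so both the conditional and the include count as indirections.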
5.2.5 Variable References
In large software systems, files tend to have a large number of dependencies that need
to be explicitly defined in the Makefile. One way for an author to manage this is to use
variables that list common dependencies and reference those. This is a classic case of
indirection because a reader must search for where the referenced variable was last assigned
to decipher the meaning of the rule in which it appears. The rule on line 19 of the example
in Figure 5.2 depends on a list of files defined in $OBJS. When reading or debugging that
rule, the reader must either have remembered the list from earlier, or go back and read it
again. Either way this adds complexity to understanding the build.
We observed that a common practice was to reference a variable holding a path, or
partial path, to a directory when listing files. For example, we could have chosen to define
a variable containing “bin/” in our example Makefile in Figure 5.2 and referenced that
variable three times in the definition of $OBJS. We recognized that a reader would likely
not have to look up a variable’s value when it is used repeatedly in a list such as this. To
address this, we decided to add rules to our extractor to only count a variable reference
once if it is used again within the same assignment, target/dependency list, or recipe.
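The repeated-reference case can be sketched as follows. The variable BIN and the object file names are hypothetical.

```make
BIN := bin/

# $(BIN) appears three times in this single assignment; under the counting
# rule described above it would be counted as one variable reference.
OBJS := $(BIN)main.o $(BIN)util.o $(BIN)parse.o

# $(OBJS) here is a separate reference in a dependency list.
app: $(OBJS) ; $(CC) -o $@ $^
```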
Make also includes a set of built-in automatically assigned variables, or simply automatic
variables. These variables change based on the context in which they are used. For example,
the $@ variable always refers to the filename of the target currently being built, the $<
variable always refers to the name of the first dependency, and the $? variable refers to the
set of dependencies that are newer than the current target. An example of some of these
can be seen in the rule on line 25 of Figure 5.2. These variables are different in that they
are not assigned explicitly, and therefore do not require the reader to search for the line
where they were assigned, but we still count them because they may require the reader to
look back to the rule header to remind them of what is currently being considered.
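A one-rule sketch shows why the reader may need to look back at the rule header. The file names are hypothetical, and a.txt and b.txt are assumed to exist.

```make
# $? lists only the prerequisites newer than the target; $@ is the target.
# To know what either expands to, the reader must look at the header line.
stamp: a.txt b.txt ; @echo "changed: $?" && touch $@
```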
5.2.6 Function Calls
Function calls are another classic case of indirection. When a reader encounters a function
call, they will likely have to search for the location of the function definition to continue
tracing the code. While Make allows the author to write simple custom functions, most of
the common operations have their own built-in function. For example, string functions (e.g.
findstring, patsubst, filter) provide facilities for manipulating strings (including lists),
and filename functions (e.g. dir, addsuffix, wildcard) provide facilities to manipulate
filenames or query the file system. The example in Figure 5.2 contains a call to the basename
function on line 26, which returns the name of the file it is given, without any extension.
While it is unlikely that the reader will need to consult the definition of any of these
built-in functions to obtain the result, their attention may still be redirected. For example,
the result of a string function like strip, which removes whitespace from a string, is trivial
and does not affect the meaning of the Makefile. However, a function like wildcard, which
can be used to search a directory for a list of files that match a pattern, can be unpredictable
and require the reader to consult the file system. Despite this, we count all function calls
as indirection in our complexity measure because it is difficult to determine whether or
not they cause an indirection in every context. In any case, as we showed in Chapter 4,
functions are used in relatively few Makefiles.
5.2.7 Recursive Make
One common, but discouraged, practice when writing Makefiles is to invoke Make on another
(or possibly the same) Makefile in the recipe of a rule. This allows the author to split
the system into multiple smaller Makefiles for each subsystem and call them individually.
Similarly to include statements, this requires the reader to redirect their attention to a
different file in order to understand the build. As we saw in Figure 5.1, recursive calls can
also be used to hide complexity.
5.3 Analysis
To see how our new complexity metric performs on Makefiles, we used the same analysis
framework and corpus we described in Chapter 3. Recall that after we parse the input
Makefile, we extract and count the desired features, and then collect them in a spreadsheet.
From there we found the sum of the count of each feature we considered to cause indirection
(listed in the previous section) to yield the complexity score. We present the results in this
section.
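As a rough sketch of this summation, the following Python fragment adds up per-file feature counts. The feature names and the example counts are hypothetical, and the tuple of indirection features is only a partial stand-in for the full list given in the previous section.

```python
# Assumed (partial) list of features treated as indirection; the names
# are illustrative labels, not the column names of our spreadsheet.
INDIRECTION_FEATURES = (
    "includes",
    "variable_references",
    "automatic_variables",
    "function_calls",
    "recursive_make",
)


def indirection_complexity(counts):
    """Sum the counts of every indirection feature found in one Makefile."""
    return sum(counts.get(feature, 0) for feature in INDIRECTION_FEATURES)


# Hypothetical feature counts for a single Makefile.
counts = {
    "lines": 40,                # not an indirection feature, so ignored
    "includes": 1,
    "variable_references": 12,
    "function_calls": 3,
}
score = indirection_complexity(counts)
print(score)
```

Features that are not in the list, such as the number of lines, do not contribute to the score.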
Validating indirection complexity is a difficult task because, despite the appearance of
a quantitative measure, it is actually qualitative in nature. Evaluating it properly would
require Make experts to volunteer their time to do a user study that would evaluate what
they are thinking when they read and modify Makefiles during various maintenance tasks.
Given the relatively small number of build experts, this is not an easy task. Surveying
novice users (e.g. students) would require simple artificial Makefiles to be written and runs
the risk of not being representative of actual Makefiles or maintenance.
One may think we could use bug reports or change logs to verify our prediction of which
Makefiles may have a higher maintenance cost. However, it is important to remember that
what we are attempting to predict is not which Makefiles will require more changes or have
more bugs, but rather which Makefiles will be more difficult to understand when perform-
ing these tasks, which is not necessarily the same thing. Just because a Makefile is left
unchanged does not mean that it was not read while maintaining the system. Understand-
ing it may still have been essential to the maintenance task, but we have no way of knowing
this from available information.
What we do have available to us is our corpus of Makefiles and previous knowledge from
the research that has already been done on the complexity of build artifacts that shows that
complexity is highly correlated with the number of lines [30, 9]. Armed with this knowledge,
we began by graphing the indirection complexity of each Makefile in our corpus against its number of lines. If indirection complexity is
any different from these other metrics, it stands to reason that it should not be as highly
correlated with size. This also gives us a way to normalize the data and lets us compare
the complexity of files of different sizes (i.e. by dividing the total number of indirections by
the number of lines, we get the average complexity per line, which we can see visually by
plotting complexity versus the number of lines).
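The normalization described above amounts to a single division. The sketch below uses made-up counts and a hypothetical function name to show that two files of very different sizes can have the same average complexity per line.

```python
def complexity_per_line(indirections, lines):
    """Average number of indirections per line of a Makefile."""
    if lines <= 0:
        raise ValueError("lines must be positive")
    return indirections / lines


# Hypothetical Makefiles: one is ten times larger than the other,
# yet both have the same density of indirections.
small = complexity_per_line(30, 60)
large = complexity_per_line(300, 600)
print(small, large)
```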
Figure 5.3a shows the graph we get by doing so. It showed that our corpus was being
dominated by a number of outliers, mostly from QMake-generated Makefiles, some as large
as 172k lines. This gave the impression that the data was highly linear, with an R-squared
of 0.95. However, when we zoomed in and examined only Makefiles with fewer than 10k lines
(Figure 5.3b), we saw more of a spread of data points.
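The effect of a few very large files on R-squared can be reproduced on toy data. The sketch below computes R-squared for a simple linear fit as the squared Pearson correlation; the data points are invented for illustration and are not drawn from our corpus.

```python
def r_squared(xs, ys):
    """R-squared of a simple linear regression of ys on xs.

    For a single predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Invented data: five small files with no clear trend ...
lines = [10, 20, 30, 40, 50]
complexity = [8, 3, 12, 5, 9]
r2_small = r_squared(lines, complexity)

# ... and the same files plus one extremely large file.
r2_with_outlier = r_squared(lines + [10000], complexity + [5000])
print(r2_small, r2_with_outlier)
```

A single extreme point stretches the axes so far that the remaining points collapse into one cluster, which makes the fit look almost perfectly linear even though the small files show no such relationship.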
Figure 5.3: The indirection complexity of (a) our entire corpus and (b) Makefiles with fewer than 10 thousand lines.
Figure 5.4: The indirection complexity of our corpus, coloured by generator.
From this point on, when performing linear regression on the data, we were careful to
remove extreme outliers that can obscure the relationship. We identified them as
Makefiles whose number of lines exceeds the third quartile by more than three times the
interquartile range (i.e. # of Lines > Q3 + 3*IQR), a method for identifying extreme outliers proposed by Tukey [50]. We
did this for both the corpus as a whole and for each category of generator. So, for example,
a Makefile that is removed from the hand-written set is not necessarily removed from the
whole corpus. When we did this for the whole corpus, we obtained a new R-squared value
of 0.76.
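The outlier rule can be sketched in a few lines of Python. Quartile definitions differ slightly between tools; this sketch assumes the default method of `statistics.quantiles`, which may not match the spreadsheet calculation we used, and the line counts are invented.

```python
import statistics


def tukey_extreme_fence(values):
    """Upper fence for extreme outliers: Q3 + 3 * IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 + 3 * (q3 - q1)


def drop_extreme_outliers(line_counts):
    """Keep only the files whose line count does not exceed the fence."""
    fence = tukey_extreme_fence(line_counts)
    return [n for n in line_counts if n <= fence]


# Invented line counts: eight ordinary files and one very large one.
line_counts = [10, 20, 30, 40, 50, 60, 70, 80, 5000]
fence = tukey_extreme_fence(line_counts)
kept = drop_extreme_outliers(line_counts)
print(fence, kept)
```

Because the fence is computed from the set it is applied to, running the same function separately on each generator category yields a different fence for each, which is why a file dropped from one category may survive in the whole corpus.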
It is important to remember at this point that our corpus contains a large number of
generated Makefiles. Figure 5.4 shows the indirection complexity of our corpus coloured
based on the generator used to create them. From this we can see that not all generators
are equal in terms of complexity, at least according to our measure.
Figure 5.5: The indirection complexity of hand-written and generated Makefiles: (a) Hand, (b) Automake, (c) CMake, (d) QMake.
In this plot (and separately in Figure 5.5), we can see that CMake (orange) and QMake
(grey) Makefiles tend to have a higher indirection complexity, while Automake (blue) and
hand-written (yellow) Makefiles tend to be less complex. This observation is interesting
when considering that we previously found that CMake and QMake use only the core
features of Make (rules and variables) in our feature study. Only Makefiles written by hand
or with Automake used some of Make’s more advanced, complex features.
CMake and QMake also seem to be more highly linearly correlated with the number of
lines (R-squared of 0.80 and 0.99, respectively) than hand-written and Automake-generated
Makefiles (with R-squared of 0.37 and 0.23, respectively), which makes sense since these
generators use a limited set of templates that are duplicated as the size of the system grows.
When we look more closely, we can see bands of data points that roughly correspond to the
3 levels of Makefiles that CMake (red) generates to recursively call one another [2]. QMake
(grey) is less documented, but a similar design is likely responsible for its bands as well.
Generators like CMake and QMake are less interesting because the Makefiles they create
are never meant to be read by developers. So the fact that their complexity can be accurately
predicted by the number of lines is less meaningful. Our metric is better utilized on hand-
written Makefiles that need to be debugged directly, as we can see when we graph their
indirection complexity separately in Figure 5.5 where the spread of data points is much
greater and less predictable.
But the real utility of indirection complexity is best seen through examples. Consider
the Makefiles in Figure 5.6. Both are roughly the same size, but Figure 5.6b is more than twice
as complex (53 vs 23) according to our indirection complexity. And this is what we would
expect when we read them. Figure 5.6a is a straight-forward test harness that runs a series
of tests (i.e. $PROGS), while Figure 5.6b is a Makefile that is included in another Makefile
several times to iterate through a list of items and output some variable assignments and
rules. Figure 5.6a is quite easily understood. But, were it not for the comments at the
beginning, it would likely take a reader much longer to fully understand Figure 5.6b due
to all the references to variables that are not even defined in the same file. Note, however,
that Figure 5.6a has more targets and dependencies, which makes it more complex under
these measures.
noarg:
	$(MAKE) -C ../../

# The EBB handler is 64-bit code and everything
# links against it
CFLAGS += -m64

PROGS := reg_access_test event_attributes_test \
	cycles_test cycles_with_freeze_test \
	pmc56_overflow_test ebb_vs_cpu_event_test \
	cpu_event_vs_ebb_test \
	cpu_event_pinned_vs_ebb_test \
	task_event_vs_ebb_test \
	task_event_pinned_vs_ebb_test \
	multi_ebb_procs_test multi_counter_test \
	pmae_handling_test close_clears_pmcc_test \
	instruction_count_test fork_cleanup_test \
	ebb_on_child_test ebb_on_willing_child_test \
	back_to_back_ebbs_test lost_exception_test \
	no_handler_test cycles_with_mmcr2_test

all: $(PROGS)

$(PROGS): ../../harness.c ../event.c ../lib.c \
	ebb.c ebb_handler.S trace.c \
	busy_loop.S

instruction_count_test: ../loop.S

lost_exception_test: ../lib.c

run_tests: all
	-for PROG in $(PROGS); do \
		./$$PROG; \
	done;

clean:
	rm -f $(PROGS)
(a)
# This file is included several times in a row,
# once for each element of $(iter-items).
# On each inclusion, we advance $o to the next element.
# $(iter-labels) and $(iter-from) and
# $(iter-to) are also advanced.

o := $(firstword $(iter-items))
iter-items := $(filter-out $o,$(iter-items))

$o-label := $(firstword $(iter-labels))
iter-labels := $(wordlist 2, \
	$(words $(iter-labels)),$(iter-labels))

$o-from := $(firstword $(iter-from))
iter-from := $(wordlist 2,$(words $(iter-from)),$(iter-from))

$o-to := $(firstword $(iter-to))
iter-to := $(wordlist 2,$(words $(iter-to)),$(iter-to))

ifeq ($($o-from),$($o-to))
$o-opt := -D$($o-from)_MODE
else
$o-opt := -DFROM_$($o-from) -DTO_$($o-to)
endif

#$(info $o$(objext): -DL$($o-label) $($o-opt))

ifneq ($o,$(filter $o,$(LIB2FUNCS_EXCLUDE)))
$o$(objext): %$(objext): $(srcdir)/fixed-bit.c
	$(gcc_compile) -DL$($*-label) $($*-opt) -c \
		$(srcdir)/fixed-bit.c $(vis_hide)

ifeq ($(enable_shared),yes)
$(o)_s$(objext): %_s$(objext): $(srcdir)/fixed-bit.c
	$(gcc_s_compile) -DL$($*-label) $($*-opt) \
		-c $(srcdir)/fixed-bit.c
endif
endif
(b)
Figure 5.6: Examples from our corpus that illustrate the advantages of indirection complexity over traditional metrics such as number of lines or dependencies. Makefile (a) has more dependencies than Makefile (b), but is arguably much easier to understand.
Figure 5.6a also illustrates a potential weakness and threat to the validity of our ap-
proach. Because we use a static parse to count features, we would find that this Makefile
contains 11 dependencies. However, looking at the rule to build all, you can see that it lists
${PROGS} as a dependency and ${PROGS} expands to include 22 files. Thus the Makefile
actually contains 31 dependencies, not 11. In this way, we find that variables can be used
to hide complexity from our current analysis.
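The gap between a static count and an expanded count can be shown on a deliberately tiny, hypothetical Makefile (it is not the one in Figure 5.6a). The sketch assumes rules and variables have already been parsed into dictionaries and performs only a single level of expansion, with no functions, nested references, or conditional assignments.

```python
import re

# A dependency token that consists solely of one variable reference.
VAR_REF = re.compile(r"^\$[({]([A-Za-z0-9_.\-]+)[)}]$")


def static_dependency_count(rules):
    """Count dependency tokens as written; a variable reference counts once."""
    return sum(len(deps) for deps in rules.values())


def expanded_dependency_count(rules, variables):
    """Count dependencies after one level of variable expansion."""
    total = 0
    for deps in rules.values():
        for dep in deps:
            match = VAR_REF.match(dep)
            if match and match.group(1) in variables:
                total += len(variables[match.group(1)].split())
            else:
                total += 1
    return total


# Hypothetical parsed Makefile:
#   PROGS := a_test b_test c_test
#   all: $(PROGS)
#   a_test: common.c
variables = {"PROGS": "a_test b_test c_test"}
rules = {"all": ["$(PROGS)"], "a_test": ["common.c"]}

static_count = static_dependency_count(rules)
expanded_count = expanded_dependency_count(rules, variables)
print(static_count, expanded_count)
```

Even this simplified expansion needs the variable's value to be known, which hints at why a full treatment is hard when variables are defined in other files or assigned conditionally.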
Another example of this can be seen in Figure 5.5a in the horizontal line of data points
with a complexity of just over 200 and lines ranging from about 500 to 750. These Make-
files appear in a number of different GNU projects as part of building GNU gettext –
a translation toolset for localizing software. Each of these Makefiles is configured from a
seemingly hand-written template that adds continuations (i.e. additional lines) to some
variable assignments. These assignments specify object files that become dependencies of
the rules, but the build logic stays the same, as does our measure of complexity.
It is unclear what the appropriate course of action is in this situation. One possibility is
to weigh any dependencies containing variables against the length of the variable assignment
or its complexity, thus making the analysis more dynamic. This creates a whole new set of
challenges around resolving variable references, which can be difficult or impossible without
actually running the build. They may be assigned in multiple places, where they are
overwritten, conditionally assigned, or appended. Additionally, there is the problem that
arises from variables that are assigned in other Makefiles, as is the case in Figure 5.6b.
5.4 Conclusion
Software maintenance is all about understanding code. Expert developers do this by fol-
lowing the indirections, which adds to the amount of information that they must keep track
of and makes the maintenance task more complex. We have attempted to estimate this
difficulty in understanding Makefiles by calculating indirection complexity based on fea-
ture frequency, and have demonstrated the potential advantages it offers over other metrics
such as the number of lines or dependencies. While validation is difficult, we believe that
the empirical reasoning behind the metric provides a strong foundation on which we can
discuss maintenance complexity in terms of how difficult it is to understand, and we have
attempted to demonstrate that here. Further studies are needed to determine the validity
of the metric in actual maintenance tasks.
Our analysis is entirely static, and a dynamic approach would be better able to evaluate
variables, functions, and other features of Make that might give a more accurate count of
indirections. It would also allow entire systems of Makefiles to be evaluated as a whole,
because it could resolve include statements that link them together. Another possibility
to be explored is to assign weights to each indirection feature based on an estimate of its
cognitive overhead. For example, a feature that requires the reader to look in a separate
file may be weighted more than a feature that causes the reader to look somewhere else in
the same file. We have already explored this possibility to some extent, but further work is
needed to find an ideal weighting scheme.
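One way such a weighting might look is sketched below. The weights are placeholders chosen only to express "cross-file indirections cost more than same-file ones"; they are not values we have evaluated, and the feature names are illustrative.

```python
# Placeholder weights (assumption): features that send the reader to
# another file weigh 2, features resolved within the same file weigh 1.
WEIGHTS = {
    "includes": 2,
    "recursive_make": 2,
    "variable_references": 1,
    "automatic_variables": 1,
    "function_calls": 1,
}


def weighted_indirection_complexity(counts, weights=WEIGHTS):
    """Weighted sum of indirection feature counts for one Makefile."""
    return sum(weight * counts.get(feature, 0)
               for feature, weight in weights.items())


# Hypothetical feature counts for a single Makefile.
counts = {"includes": 2, "variable_references": 10, "function_calls": 3}
weighted = weighted_indirection_complexity(counts)
unweighted = sum(counts.values())
print(unweighted, weighted)
```

Setting every weight to 1 recovers the unweighted metric used in this chapter, so a weighting scheme can be explored without changing the extraction step.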
One of the potential pitfalls of our new metric is that it can be interpreted as advice
to avoid features such as abstractions, includes and variables associated with indirection
altogether. This is not our intention. Rather, our goal is to persuade developers to be
aware of the potential indirections in their code and to weigh the development benefits
against the possibly increased maintenance effort.
We believe indirection complexity can be applied to other languages, but this remains to
be seen. Even for Makefiles, empirical and user studies are needed to determine whether our
theory of indirection can actually predict maintenance effort or even perceived complexity.
We hope that indirection complexity can spark discussion about what it means to be
complex. We believe it provides a benefit over other complexity measures that have been
used on Makefiles in that it allows us to talk about complexity of Makefiles that do not
contain rules or dependencies and a way of accounting for other features of the language,
such as functions, that can contribute to complexity. With better methods of estimating
complexity, developers should be able to make better decisions about how or if to refactor
their Makefiles or, if necessary, consider a different technology.
Chapter 6
Conclusion
In this thesis, we have presented two empirical analyses of Makefiles in open source software
projects in an effort to understand how Make is used in real development environments.
We began in Chapter 1 with an overview of the thesis and what we wished to accomplish.
Then, in Chapter 2, we reviewed the most popular build automation tools, including Make,
Ant, Automake, and CMake. This chapter also surveyed related research on build systems
and their maintenance and evolution.
Chapter 3 described the methodology and corpus used throughout this work. We showed
how we compiled our corpus of almost 20,000 Makefiles from 271 open source projects,
and how we used TXL to precisely parse and extract features from it. Using this, we
performed an in-depth feature analysis in Chapter 4 showing which features of the language
actually get used and how Makefiles generated by tools, such as Automake and CMake,
differ from those manually written by hand. Finally, in Chapter 5, we introduced a new
complexity metric called indirection complexity and performed an analysis on the corpus to
demonstrate how it can yield a more accurate estimate of complexity than simple metrics
like number of lines.
6.1 Contributions
A corpus of Makefiles and a framework for analyzing them: We have compiled
a corpus of Makefiles from 271 open source projects, including Makefiles generated from
three different tools. Using the TXL source transformation language, we have created a
framework capable of parsing Makefiles precisely, as well as a set of rules to extract basic
features of Makefiles. Previous analysis tools of Make projects rely on executing Make
to dynamically extract features. This requires the entire project source code, whereas a
static analysis is able to parse individual Makefiles even when the rest of the source is not
available.
A taxonomy of Make features and an inventory of their use: Using the Make
manual and a sampling of Makefiles, we identified and organized features of the Make
language. Then we extracted these features from a corpus of almost 20,000 Makefiles
over 271 open source projects and presented our findings. We found that generated Makefiles
use only the oldest core set of Make features, while hand-written ones use Make’s more
advanced features such as functions. We also identified some obsolete features that are still
in use by some projects, in addition to recursive Make calls, which are still used extensively
even though they are considered bad practice.
A new metric for measuring the maintenance complexity of Makefiles: We
defined indirection complexity as the sum of all instances of indirect features (features that
require the reader to divert their attention somewhere else). We demonstrated how this
allows us to describe complexity more accurately than the number of lines or dependencies,
especially for Makefiles that don’t contain rules.
Our corpus, framework, and the full results of the analyses presented in this thesis are
available at: http://txl.ca and http://www.cs.queensu.ca/~doug/
6.2 Threats to Validity
While our corpus is quite large, and we made an effort to choose projects that represented
the most advanced users of Make, it may not be as diverse as we originally thought. It
wasn’t until late in the work that we discovered that the Android operating system used
hand-written Makefiles in a way that we had not seen before. This technique was
a type of metaprogramming wherein functions are used to generate dependencies and write
rules. So it is possible that our corpus is not as good a representation of current practice
as we originally thought. On the other hand, perhaps we should have included projects
from less advanced users, such as small projects on GitHub. It is possible that some of
the features we found no usage of were added to Make for the benefit of novices. Further
analysis of these smaller projects could prove to be beneficial.
6.3 Future Work
Our framework has countless other applications that we have yet to explore. Some of these
include:
• Slicing: One interesting technique that has yet to be explored with Makefiles is the
concept of code slicing. TXL rules could be written to extract slices (pieces of the
code) that pertain to a particular target, variable, etc.
• Optimization: If bad code smells or patterns can be detected, our framework is
capable of transforming them into better patterns. Some bad practices (such as those
discussed in Chapter 4) could be automatically recognized and fixed, perhaps even
some recursive Make invocations.
• Refactoring: Similar to the above proposals, our framework is capable of other
kinds of refactoring. For example, one might be able to extract and combine slices of
Makefiles to create new ones.
• Migration: Something TXL is capable of doing, but that we have not seen in this thesis,
is translating from one language to another. We have seen a number of projects move
from generator to generator, or from writing Makefiles by hand to using one of these
generators. Using TXL and our Makefile grammar, developers looking to migrate
their existing Makefiles to a generator can detect patterns and transform them into
the corresponding statements of the generator’s language. This does, however, require
a new grammar for the generator language.
• Debugging: While some debugging tools for Makefiles already exist, our framework
is also capable of adding debugging statements to Makefiles. For example, one could
add breakpoints to recipes to query the value of variables.
• Clone Detection: While McIntosh et al. have performed clone detection on build
artifacts of Ant, Maven, Autotools, and CMake (the CMakeLists.txt files, not the
build artifacts it generates) [32], no one to the best of our knowledge has performed
it on Makefiles. Furthermore, because our parser is so precise, we can detect clones
at different granularities. For example, instead of searching for clones within a set
of whole Makefiles, we could also extract rules and perform a clone analysis at that
granularity, thus allowing us to find clones even within the same file. We have already
begun work on this, but further study is required.
Our complexity analysis focussed on demonstrating how indirection complexity can
provide a more accurate estimation of complexity over the current standard of counting
lines or dependencies. What we did not do is show that there is a statistically significant
difference between these measures. Part of the reason was that our analysis was static and
other studies that measure dependencies use dynamic analysis. The difference is that a
static analysis counts a variable reference as one dependency when it could in fact expand
to many more. A future experiment designed to use both of these analysis techniques could
allow us to better compare the two metrics to determine if there is a significant difference.
Another extension of our complexity analysis that would be valuable is a user study
conducted with Makefile experts to determine if indirection complexity accurately predicts
how they perceive complexity. One of the reasons that was not attempted in this thesis
is the lack of access to the relatively few experts there are. Perhaps some online tool that
compares two random Makefiles and asks which is more difficult to understand could be
implemented and sent to developers familiar with Make.
Indirection complexity also has possible applications outside of Makefiles. We believe
it can be applied to more traditional languages, and the concept is general enough to apply
to any type of readable artifact, such as requirements or design documents. We have not
explored this possibility in anything other than Makefiles, but it warrants further study.
Bibliography
[1] Automake. [Online] Available: https://www.gnu.org/software/automake/manual/
automake.html. [Accessed March 12, 2017].
[2] CMake FAQ. [Online] Available: http://www.cmake.org/Wiki/CMake_FAQ. [Accessed
March 12, 2017].
[3] CMake reference documentation. [Online] Available: https://cmake.org/cmake/
help/v3.6/. [Accessed March 12, 2017].
[4] GNU make. [Online] Available: https://www.gnu.org/software/make/manual/
make.html. [Accessed March 12, 2017].
[5] MAKAO, re(verse)-engineering build systems. [Online] Available: http://mcis.
polymtl.ca/makao.html. [Accessed March 12, 2017].
[6] NMAKE reference. [Online] Available: https://msdn.microsoft.com/en-us/
library/dd9y37ha.aspx. [Accessed March 12, 2017].
[7] QMake manual. [Online] Available: http://doc.qt.io/qt-4.8/qmake-manual.html.
[Accessed March 12, 2017].
[8] B. Adams. Co-evolution of Source Code and the Build System: Impact on the
Introduction of AOSD in Legacy Systems. PhD dissertation, Ghent University, Belgium, May
2008.
[9] B. Adams, K. De Schutter, H. Tromp, and W. De Meuter. The evolution of the linux
build system. Electronic Communications of the EASST, 8(0), October 2008.
[10] B. Adams, H. Tromp, K. De Schutter, and W. De Meuter. Design recovery and main-
tenance of build systems. In IEEE International Conference on Software Maintenance,
2007. ICSM 2007, pages 114–123, 2007.
[11] Y. Ahn, J. Suh, S. Kim, and H. Kim. The software maintenance project effort estima-
tion model based on function points. Journal of Software Maintenance and Evolution:
Research and Practice, 15(2):71–85, 2003.
[12] A.J. Albrecht. Measuring application development productivity. In Proceedings of the
Joint SHARE/GUIDE/IBM Application Development Symposium, volume 10, pages
83–92, 1979.
[13] Apache Software Foundation. Apache ant. [Online] Available: http://ant.apache.
org, 2000. [Accessed March 12, 2017].
[14] L.A. Belady and M.M. Lehman. A model of large program development. IBM Systems
Journal, 15(3):225–252, 1976.
[15] M. Bowler. Truck factor. Agile Advice, 2005.
[16] D. Cubranic, G.C. Murphy, J. Singer, and K.S. Booth. Hipikat: A project memory for
software development. IEEE Trans. Software Eng., 31(6):446–465, 2005.
[17] S.I. Feldman. Make a program for maintaining computer programs. Software: Practice
and Experience, 9(4):255–265, 1979.
[18] J. Graham-Cumming. Debugging makefiles. [Online] Available: http://www.drdobbs.
com/tools/debugging-makefiles/197003338. [Accessed March 12, 2017].
[19] J. Graham-Cumming. Debugging makefiles, 2007.
[20] M.H. Halstead. Elements of Software Science (Operating and Programming Systems
Series). Elsevier Science Inc., 1977.
[21] K. Iio, T. Furuyama, and Y. Arai. Experimental analysis of the cognitive processes of
program maintainers during software maintenance. 2013 IEEE International Confer-
ence on Software Maintenance, 0:242, 1997.
[22] Niels Jørgensen. Safeness of make-based incremental recompilation. In International
Symposium of Formal Methods Europe, pages 126–145. Springer, 2002.
[23] M. Kersten and G.C. Murphy. Mylar: A degree-of-interest model for IDEs. In Pro-
ceedings of the 4th International Conference on Aspect-Oriented Software Development,
AOSD 2005, Chicago, Illinois, USA, March 14-18, 2005, pages 159–168, 2005.
[24] A.J. Ko, B.A. Myers, M.J. Coblenz, and H.H. Aung. An exploratory study of how
developers seek, relate, and collect relevant information during software maintenance
tasks. IEEE Trans. Software Eng., 32(12):971–987, 2006.
[25] G. K. Kumfert and T. G. W. Epperly. Software in the DOE: The hidden overhead
of the build. Lawrence Livermore National Laboratory, CA, USA, Technical Report,
2002.
[26] Noel Llopis. The quest for the perfect build system. [Online] Available: http:
//gamesfromwithin.com/the-quest-for-the-perfect-build-system. [Accessed
March 12, 2017].
[27] D.H. Martin and J.R. Cordy. On the maintenance complexity of makefiles. In Pro-
ceedings of the 7th International Workshop on Emerging Trends in Software Metrics,
WETSoM ’16, pages 50–56, New York, NY, USA, 2016. ACM.
[28] D.H. Martin, J.R. Cordy, B. Adams, and G. Antoniol. Make it simple - an empirical
analysis of gnu make feature use in open source projects. In Program Comprehension
(ICPC), 2015 IEEE 23rd International Conference on, pages 207–217, May 2015.
[29] T.J. McCabe. A complexity measure. Software Engineering, IEEE Transactions on,
SE-2(4):308–320, Dec 1976.
[30] S. McIntosh, B. Adams, and A.E. Hassan. The evolution of ANT build systems. In
2010 7th IEEE Working Conference on Mining Software Repositories (MSR), pages
42–51, 2010.
[31] S. McIntosh, B. Adams, and A.E. Hassan. The evolution of java build systems. Em-
pirical Software Engineering, 17(4-5):578–608, August 2012.
[32] S. McIntosh, M. Poehlmann, E. Juergens, A. Mockus, B. Adams, A.E. Hassan,
B. Haupt, and C. Wagner. Collecting and leveraging a benchmark of build system
clones to aid in quality assessments. In Companion Proceedings of the 36th Interna-
tional Conference on Software Engineering, ICSE Companion 2014, pages 145–154,
New York, NY, USA, 2014. ACM.
[33] Shane McIntosh, Bram Adams, Thanh H.D. Nguyen, Yasutaka Kamei, and Ahmed E.
Hassan. An empirical study of build maintenance effort. In Proceedings of the 33rd In-
ternational Conference on Software Engineering, ICSE ’11, pages 141–150, New York,
NY, USA, 2011. ACM.
[34] P. Miller. Recursive make considered harmful. AUUGN Journal of AUUG Inc,
19(1):14–25, 1998.
[35] A. Neagu. Make alternatives. [Online] Available: http://freecode.com/articles/
make-alternatives, 2005. [Accessed March 12, 2017].
[36] A. Neagu. What is wrong with make? [Online] Available: http://freecode.com/
articles/what-is-wrong-with-make, 2005. [Accessed March 12, 2017].
[37] A. Neundorf. Why the KDE project switched to CMake – and how. [Online] Available:
http://lwn.net/Articles/188693/. [Accessed March 12, 2017].
[38] S. Phillips, T. Zimmermann, and C. Bird. Understanding and improving software build
teams. In Proceedings of the 36th International Conference on Software Engineering,
pages 735–744. ACM, 2014.
[39] Ant-Contrib Project. Ant-contrib tasks. [Online] Available: http://ant-contrib.
sourceforge.net/, 2003. [Accessed March 12, 2017].
[40] H. Seo, C. Sadowski, S. Elbaum, E. Aftandilian, and R. Bowdidge. Programmers’ build
errors: a case study (at google). In Proceedings of the 36th International Conference
on Software Engineering, pages 724–734. ACM, 2014.
[41] J. Singer. Practices of software maintenance. In 1998 International Conference on
Software Maintenance, ICSM 1998, Bethesda, Maryland, USA, November 16-19, 1998,
pages 139–145, 1998.
[42] J. Singer, T.C. Lethbridge, N.G. Vinson, and N. Anquetil. An examination of software
engineering work practices. In Proceedings of the 1997 conference of the Centre for
Advanced Studies on Collaborative Research, November 10-13, 1997, Toronto, Ontario,
Canada, page 21, 1997.
[43] Z. Soh, F. Khomh, Y.-G. Gueheneuc, and G. Antoniol. Towards understanding how
developers spend their effort during maintenance activities. In Reverse Engineering
(WCRE), 2013 20th Working Conference on, pages 152–161, Oct 2013.
[44] R. Suvorov, M. Nagappan, A.E. Hassan, Y. Zou, and B. Adams. An empirical study
of build system migrations in practice: Case studies on KDE and the linux kernel. In
Software Maintenance (ICSM), 2012 28th IEEE International Conference on, pages
160–169. IEEE, 2012.
[45] Conifer Systems. Whats wrong with GNU make? [Online] Available: http://www.
conifersystems.com/whitepapers/gnu-make/. [Accessed March 12, 2017].
[46] A. Tamrawi, H.A. Nguyen, H.V. Nguyen, and T.N. Nguyen. Build code analysis with
symbolic evaluation. In 2012 34th International Conference on Software Engineering
(ICSE), pages 650–660, 2012.
[47] A. Tamrawi, H.A. Nguyen, H.V. Nguyen, and T.N. Nguyen. SYMake: a build code
analysis and refactoring tool for makefiles. In 27th IEEE/ACM International Confer-
ence on Automated Software Engineering, ASE 2012, pages 366–369, New York, NY,
USA, 2012.
[48] Tasktop Technologies Inc. Mylyn. [Online] Available: http://www.tasktop.com/
mylyn, 2007. [Accessed March 12, 2017].
[49] Q. Tu and M.W. Godfrey. The build-time software architecture view. In Proceedings
of the IEEE International Conference on Software Maintenance (ICSM’01), page 398.
IEEE Computer Society, 2001.
[50] John W. Tukey. Exploratory Data Analysis. Pearson, 1st edition, 1977.
[51] E. Zadok. Overhauling AMD for the ’00s: A case study of GNU autotools. In FREENIX
Track: 2002 USENIX Annual Technical Conference, pages 287–297, Berkeley, CA,
USA, 2002.
Appendix A
Make Feature Use Summary
A.1 All
A.1.1 General
Feature                       Total     Average   Max       # of files with feature   % of files with feature
lines                         8278533   420.5     172197    -                         -
continuations                 4757886   241.7     168654    6988                      35.5%
comments                      449325    22.8      55979     16066                     81.6%
directory changes (cd)        137333    7.0       2771      6400                      32.5%
paths                         5941053   301.7     176433    12903                     65.5%
vpath directives              45        0.0       8         21                        0.1%
VPATH variable assignments    1095      0.1       1         1095                      5.6%
includes                      22388     1.1       530       5315                      27.0%
ifdef/ifeq directives         5488      0.3       142       1070                      5.4%
macros                        406       0.0       5         370                       1.9%
assignments                   954626    48.5      5458      16844                     85.6%
variable references           1079121   54.8      46238     13299                     67.5%
unique variable references    299650    15.2      21915     13299                     67.5%
autoevals                     44950     2.3       1639      2528                      12.8%
$@ references                 32292     1.6       1553      2441                      12.4%
$< references                 4859      0.2       194       1454                      7.4%
$? references                 1784      0.1       38        1581                      8.0%
$^ references                 540       0.0       222       187                       0.9%
$+ references                 7         0.0       7         1                         0.0%
$* references                 5385      0.3       82        740                       3.8%
$% references                 0         0.0       0         0                         0.0%
$| references                 1         0.0       1         1                         0.0%
obsolete autoevals            82        0.0       13        32                        0.2%
function calls                9925      0.5       483       1200                      6.1%
embedded function calls       2116      0.1       259       328                       1.7%
rules                         951299    48.3      14034     9502                      48.3%
targets                       954370    48.5      14034     9502                      48.3%
dependencies                  4805559   244.1     168707    9204                      46.7%
recipe commands               551279    28.0      7617      9736                      49.4%
A.1.2 Assignments
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  370612  18.8  24327  8018  40.7%
unique variable references  245246  12.5  24033  8018  40.7%
autoeval references  4793  0.2  248  1417  7.2%
$@ references  4555  0.2  247  1398  7.1%
$< references  159  0.0  14  91  0.5%
$? references  1  0.0  1  1  0.0%
$^ references  31  0.0  3  21  0.1%
$+ references  0  0.0  0  0  0.0%
$* references  23  0.0  5  15  0.1%
$% references  0  0.0  0  0  0.0%
$| references  1  0.0  1  1  0.0%
obsolete autoeval references  23  0.0  6  10  0.1%
lazy assignments  909998  46.2  5458  13121  66.6%
strict assignments  16435  0.8  376  3683  18.7%
iterative assignments  26831  1.4  833  3263  16.6%
conditional assignments  2587  0.1  54  539  2.7%
function calls  5184  0.3  169  863  4.4%
paths  179973  9.1  1044  12826  65.1%
A.1.3 Rules
Feature  Total  Average  Max  # of files with feature  % of files with feature
single colon rules  950135  48.3  14034  9391  47.7%
double colon rules  1164  0.1  318  220  1.1%
pattern rules  800  0.0  77  315  1.6%
static pattern rules  246  0.0  19  105  0.5%
A.1.4 Targets
Feature  Total  Average  Max  # of files with feature  % of files with feature
special targets  167750  8.5  5729  7833  39.8%
.SILENT targets  3764  0.2  1  3764  19.1%
.IGNORE targets  0  0.0  0  0  0.0%
suffix targets  32977  1.7  1084  3826  19.4%
target-specific assignments  951840  48.3  14034  9502  48.3%
variable references  25782  1.3  1119  8212  41.7%
variable references  25573  1.3  1119  8212  41.7%
function calls  103  0.0  11  43  0.2%
paths  424093  21.5  7162  8266  42.0%
A.1.5 Dependencies
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  77728  3.9  2818  4682  23.8%
unique variable references  71832  3.6  2291  4682  23.8%
function calls  256  0.0  33  74  0.4%
total autoeval references  14  0.0  4  6  0.0%
$@ references  8  0.0  4  4  0.0%
$< references  4  0.0  2  2  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  2  0.0  2  1  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
paths  4281465  217.5  168664  8401  42.7%
A.1.6 Recipes
Feature  Total  Average  Max  # of files with feature  % of files with feature
command flags  256355  13.0  2769  7396  37.6%
@ command flags  182656  9.3  2769  7280  37.0%
- command flags  73652  3.7  1642  4238  21.5%
+ command flags  47  0.0  12  12  0.1%
recursive makes  136240  6.9  3808  6391  32.5%
recursive automakes  1691  0.1  3  1543  7.8%
recursive cmakes  76840  3.9  3980  3762  19.1%
recursive qmakes  40709  2.1  1262  2460  12.5%
variable references  583353  29.6  17444  8042  40.8%
unique variable references  450002  22.9  14095  8010  40.7%
autoeval references  40143  2.0  1553  2444  12.4%
$@ references  27729  1.4  1553  2356  12.0%
$< references  4696  0.2  194  1389  7.1%
$? references  1783  0.1  38  1580  8.0%
$^ references  509  0.0  222  166  0.8%
$+ references  7  0.0  7  1  0.0%
$* references  5360  0.3  81  737  3.7%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  59  0.0  13  22  0.1%
function calls  1805  0.1  352  306  1.6%
paths  398782  20.3  7332  8857  45.0%
A.1.7 Function Calls
Feature  Total  Average  Max  # of files with feature  % of files with feature
subst function calls  601  0.0  18  197  1.0%
patsubst function calls  697  0.0  18  242  1.2%
strip function calls  398  0.0  83  119  0.6%
findstring function calls  295  0.0  84  78  0.4%
filter function calls  394  0.0  55  132  0.7%
filter-out function calls  470  0.0  23  246  1.2%
sort function calls  118  0.0  7  81  0.4%
word function calls  84  0.0  7  43  0.2%
wordlist function calls  55  0.0  6  42  0.2%
words function calls  53  0.0  18  27  0.1%
firstword function calls  76  0.0  4  50  0.3%
lastword function calls  7  0.0  2  6  0.0%
dir function calls  171  0.0  12  67  0.3%
notdir function calls  503  0.0  176  155  0.8%
suffix function calls  8  0.0  3  4  0.0%
basename function calls  238  0.0  176  53  0.3%
addsuffix function calls  100  0.0  9  47  0.2%
addprefix function calls  573  0.0  31  215  1.1%
join function calls  167  0.0  166  2  0.0%
wildcard function calls  612  0.0  25  291  1.5%
realpath function calls  5  0.0  2  3  0.0%
abspath function calls  31  0.0  7  18  0.1%
error function calls  323  0.0  13  122  0.6%
warning function calls  48  0.0  9  28  0.1%
info function calls  59  0.0  13  18  0.1%
file function calls  0  0.0  0  0  0.0%
call function calls  1783  0.1  205  409  2.1%
value function calls  4  0.0  2  2  0.0%
eval function calls  192  0.0  43  35  0.2%
origin function calls  33  0.0  2  31  0.2%
flavor function calls  0  0.0  0  0  0.0%
shell function calls  820  0.0  34  306  1.6%
guile function calls  0  0.0  0  0  0.0%
foreach function calls  380  0.0  84  94  0.5%
if function calls  533  0.0  37  247  1.3%
or function calls  57  0.0  2  32  0.2%
and function calls  26  0.0  1  26  0.1%
A.2 Automake
A.2.1 General
Feature  Total  Average  Max  # of files with feature  % of files with feature
lines  1761487  1111.3  81286  -  -
continuations  469662  296.3  9088  1585  100.0%
comments  162404  102.5  55979  1585  100.0%
directory changes (cd)  10505  6.6  82  1570  99.1%
paths  258027  162.8  13034  1583  99.9%
vpath directives  0  0.0  0  0  0.0%
VPATH variable assignments  15  0.0  1  15  0.9%
includes  12040  7.6  530  646  40.8%
ifdef/ifeq directives  205  0.1  8  33  2.1%
macros  27  0.0  1  27  1.7%
assignments  755671  476.8  5458  1584  99.9%
variable references  601029  379.2  46238  1582  99.8%
unique variable references  175404  110.7  21915  1582  99.8%
autoevals  36003  22.7  1639  1578  99.6%
$@ references  26163  16.5  1553  1578  99.6%
$< references  2908  1.8  194  891  56.2%
$? references  1658  1.0  5  1540  97.2%
$^ references  244  0.2  222  17  1.1%
$+ references  0  0.0  0  0  0.0%
$* references  5029  3.2  27  631  39.8%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoevals  1  0.0  1  1  0.1%
function calls  2043  1.3  97  152  9.6%
embedded function calls  538  0.3  26  91  5.7%
rules  145100  91.5  2418  1580  99.7%
targets  146259  92.3  2419  1580  99.7%
dependencies  231524  146.1  3540  1580  99.7%
recipe commands  95222  60.1  4381  1580  99.7%
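The derived columns in these tables appear to follow directly from the raw counts: Average looks like Total divided by the number of Makefiles in the category, and the percentage looks like the number of files with the feature divided by that same count. The sketch below checks this reading against two Automake rows. The file count of 1585 is an assumption inferred from the rows that report 100.0%, not a figure stated in the table, and the helper name derived_columns is hypothetical.

```python
# Hypothetical helper: recompute the Average and "% of files" columns
# of one table row from its raw counts.
def derived_columns(total, files_with_feature, files_in_category):
    average = round(total / files_in_category, 1)
    percent = round(100.0 * files_with_feature / files_in_category, 1)
    return average, percent

# Assumption: inferred from Automake rows reporting 100.0% (1585 files).
AUTOMAKE_FILES = 1585

# (Total, # of files with feature) copied from the Automake General table.
rows = {
    "directory changes (cd)": (10505, 1570),
    "includes": (12040, 646),
}

for name, (total, n_files) in rows.items():
    average, percent = derived_columns(total, n_files, AUTOMAKE_FILES)
    print(f"{name}: average={average}, files={percent}%")
```

Under this assumption the recomputed values match the printed Average and percentage for both rows, which suggests that averages are taken over all files in the category rather than only over files that use the feature.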
A.2.2 Assignments
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  314557  198.5  24327  1581  99.7%
unique variable references  197154  124.4  24033  1581  99.7%
autoeval references  4044  2.6  248  1255  79.2%
$@ references  4036  2.5  247  1255  79.2%
$< references  4  0.0  1  4  0.3%
$? references  0  0.0  0  0  0.0%
$^ references  3  0.0  2  2  0.1%
$+ references  0  0.0  0  0  0.0%
$* references  1  0.0  1  1  0.1%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
lazy assignments  752864  475.0  5458  1583  99.9%
strict assignments  299  0.2  8  84  5.3%
iterative assignments  1959  1.2  833  38  2.4%
conditional assignments  1425  0.9  54  33  2.1%
function calls  1842  1.2  63  150  9.5%
paths  116314  73.4  1044  1584  99.9%
A.2.3 Rules
Feature  Total  Average  Max  # of files with feature  % of files with feature
single colon rules  144592  91.2  2418  1580  99.7%
double colon rules  508  0.3  318  4  0.3%
pattern rules  206  0.1  7  82  5.2%
static pattern rules  74  0.0  11  34  2.1%
A.2.4 Targets
Feature  Total  Average  Max  # of files with feature  % of files with feature
special targets  7618  4.8  17  1574  99.3%
.SILENT targets  0  0.0  0  0  0.0%
.IGNORE targets  0  0.0  0  0  0.0%
suffix targets  4421  2.8  16  1544  97.4%
target-specific assignments  145181  91.6  2418  1580  99.7%
variable references  13985  8.8  1119  1572  99.2%
variable references  13914  8.8  1119  1572  99.2%
function calls  1  0.0  1  1  0.1%
paths  12071  7.6  1805  1543  97.4%
A.2.5 Dependencies
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  53513  33.8  2818  1578  99.6%
unique variable references  50769  32.0  2291  1578  99.6%
function calls  17  0.0  15  2  0.1%
total autoeval references  8  0.0  4  2  0.1%
$@ references  6  0.0  4  2  0.1%
$< references  2  0.0  2  1  0.1%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
paths  19122  12.1  2127  1551  97.9%
A.2.6 Recipes
Feature  Total  Average  Max  # of files with feature  % of files with feature
command flags  51950  32.8  1688  1578  99.6%
@ command flags  32786  20.7  1282  1578  99.6%
- command flags  19164  12.1  770  1544  97.4%
+ command flags  0  0.0  0  0  0.0%
recursive makes  17525  11.1  48  1571  99.1%
recursive automakes  1688  1.1  3  1540  97.2%
recursive cmakes  0  0.0  0  0  0.0%
recursive qmakes  0  0.0  0  0  0.0%
variable references  206725  130.4  17444  1580  99.7%
unique variable references  170931  107.8  14095  1580  99.7%
autoeval references  31951  20.2  1553  1578  99.6%
$@ references  22121  14.0  1553  1578  99.6%
$< references  2902  1.8  194  890  56.2%
$? references  1658  1.0  5  1540  97.2%
$^ references  241  0.2  222  15  0.9%
$+ references  0  0.0  0  0  0.0%
$* references  5028  3.2  27  631  39.8%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  1  0.0  1  1  0.1%
function calls  94  0.1  19  33  2.1%
paths  39140  24.7  3354  1579  99.6%
A.2.7 Function Calls
Feature  Total  Average  Max  # of files with feature  % of files with feature
subst function calls  197  0.1  8  28  1.8%
patsubst function calls  283  0.2  11  83  5.2%
strip function calls  25  0.0  2  24  1.5%
findstring function calls  104  0.1  5  28  1.8%
filter function calls  7  0.0  6  2  0.1%
filter-out function calls  132  0.1  2  86  5.4%
sort function calls  27  0.0  1  27  1.7%
word function calls  48  0.0  2  24  1.5%
wordlist function calls  0  0.0  0  0  0.0%
words function calls  2  0.0  2  1  0.1%
firstword function calls  0  0.0  0  0  0.0%
lastword function calls  0  0.0  0  0  0.0%
dir function calls  7  0.0  7  1  0.1%
notdir function calls  91  0.1  14  78  4.9%
suffix function calls  0  0.0  0  0  0.0%
basename function calls  27  0.0  1  27  1.7%
addsuffix function calls  0  0.0  0  0  0.0%
addprefix function calls  110  0.1  8  52  3.3%
join function calls  0  0.0  0  0  0.0%
wildcard function calls  182  0.1  4  116  7.3%
realpath function calls  0  0.0  0  0  0.0%
abspath function calls  0  0.0  0  0  0.0%
error function calls  92  0.1  4  28  1.8%
warning function calls  0  0.0  0  0  0.0%
info function calls  0  0.0  0  0  0.0%
file function calls  0  0.0  0  0  0.0%
call function calls  55  0.0  4  27  1.7%
value function calls  0  0.0  0  0  0.0%
eval function calls  2  0.0  2  1  0.1%
origin function calls  28  0.0  1  28  1.8%
flavor function calls  0  0.0  0  0  0.0%
shell function calls  369  0.2  8  104  6.6%
guile function calls  0  0.0  0  0  0.0%
foreach function calls  21  0.0  19  2  0.1%
if function calls  162  0.1  37  77  4.9%
or function calls  48  0.0  2  24  1.5%
and function calls  24  0.0  1  24  1.5%
A.3 CMake
A.3.1 General
Feature  Total  Average  Max  # of files with feature  % of files with feature
lines  1167300  134.6  36482  -  -
continuations  11152  1.3  780  1199  13.8%
comments  190961  22.0  7970  6162  71.1%
directory changes (cd)  83485  9.6  2771  3602  41.5%
paths  872500  100.6  27086  4958  57.2%
vpath directives  0  0.0  0  0  0.0%
VPATH variable assignments  0  0.0  0  0  0.0%
includes  4909  0.6  3  2509  28.9%
ifdef/ifeq directives  0  0.0  0  0  0.0%
macros  0  0.0  0  0  0.0%
assignments  47956  5.5  1208  6326  72.9%
variable references  94201  10.9  5889  3762  43.4%
unique variable references  22677  2.6  1213  3762  43.4%
autoevals  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoevals  0  0.0  0  0  0.0%
function calls  0  0.0  0  0  0.0%
embedded function calls  0  0.0  0  0  0.0%
rules  396721  45.7  14034  3762  43.4%
targets  396721  45.7  14034  3762  43.4%
dependencies  312576  36.0  12429  3762  43.4%
recipe commands  247136  28.5  7617  3762  43.4%
A.3.2 Assignments
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  0  0.0  0  0  0.0%
unique variable references  0  0.0  0  0  0.0%
autoeval references  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
lazy assignments  47956  5.5  1208  6326  72.9%
strict assignments  0  0.0  0  0  0.0%
iterative assignments  0  0.0  0  0  0.0%
conditional assignments  0  0.0  0  0  0.0%
function calls  0  0.0  0  0  0.0%
paths  24986  2.9  7  4959  57.2%
A.3.3 Rules
Feature  Total  Average  Max  # of files with feature  % of files with feature
single colon rules  396721  45.7  14034  3762  43.4%
double colon rules  0  0.0  0  0  0.0%
pattern rules  0  0.0  0  0  0.0%
static pattern rules  0  0.0  0  0  0.0%
A.3.4 Targets
Feature  Total  Average  Max  # of files with feature  % of files with feature
special targets  151790  17.5  5729  3762  43.4%
.SILENT targets  3762  0.4  1  3762  43.4%
.IGNORE targets  0  0.0  0  0  0.0%
suffix targets  0  0.0  0  0  0.0%
target-specific assignments  396721  45.7  14034  3762  43.4%
variable references  3762  0.4  1  3762  43.4%
variable references  3762  0.4  1  3762  43.4%
function calls  0  0.0  0  0  0.0%
paths  187557  21.6  7162  3762  43.4%
A.3.5 Dependencies
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  0  0.0  0  0  0.0%
unique variable references  0  0.0  0  0  0.0%
function calls  0  0.0  0  0  0.0%
total autoeval references  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
paths  205179  23.7  9085  3762  43.4%
A.3.6 Recipes
Feature  Total  Average  Max  # of files with feature  % of files with feature
command flags  100014  11.5  2769  2616  30.2%
@ command flags  100014  11.5  2769  2616  30.2%
- command flags  0  0.0  0  0  0.0%
+ command flags  0  0.0  0  0  0.0%
recursive makes  68788  7.9  3808  2453  28.3%
recursive automakes  0  0.0  0  0  0.0%
recursive cmakes  76840  8.9  3980  3762  43.4%
recursive qmakes  0  0.0  0  0  0.0%
variable references  90439  10.4  5888  2616  30.2%
unique variable references  90439  10.4  5888  2616  30.2%
autoeval references  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
function calls  0  0.0  0  0  0.0%
paths  199167  23.0  7332  3762  43.4%
A.3.7 Function Calls
Feature  Total  Average  Max  # of files with feature  % of files with feature
subst function calls  0  0.0  0  0  0.0%
patsubst function calls  0  0.0  0  0  0.0%
strip function calls  0  0.0  0  0  0.0%
findstring function calls  0  0.0  0  0  0.0%
filter function calls  0  0.0  0  0  0.0%
filter-out function calls  0  0.0  0  0  0.0%
sort function calls  0  0.0  0  0  0.0%
word function calls  0  0.0  0  0  0.0%
wordlist function calls  0  0.0  0  0  0.0%
words function calls  0  0.0  0  0  0.0%
firstword function calls  0  0.0  0  0  0.0%
lastword function calls  0  0.0  0  0  0.0%
dir function calls  0  0.0  0  0  0.0%
notdir function calls  0  0.0  0  0  0.0%
suffix function calls  0  0.0  0  0  0.0%
basename function calls  0  0.0  0  0  0.0%
addsuffix function calls  0  0.0  0  0  0.0%
addprefix function calls  0  0.0  0  0  0.0%
join function calls  0  0.0  0  0  0.0%
wildcard function calls  0  0.0  0  0  0.0%
realpath function calls  0  0.0  0  0  0.0%
abspath function calls  0  0.0  0  0  0.0%
error function calls  0  0.0  0  0  0.0%
warning function calls  0  0.0  0  0  0.0%
info function calls  0  0.0  0  0  0.0%
file function calls  0  0.0  0  0  0.0%
call function calls  0  0.0  0  0  0.0%
value function calls  0  0.0  0  0  0.0%
eval function calls  0  0.0  0  0  0.0%
origin function calls  0  0.0  0  0  0.0%
flavor function calls  0  0.0  0  0  0.0%
shell function calls  0  0.0  0  0  0.0%
guile function calls  0  0.0  0  0  0.0%
foreach function calls  0  0.0  0  0  0.0%
if function calls  0  0.0  0  0  0.0%
or function calls  0  0.0  0  0  0.0%
and function calls  0  0.0  0  0  0.0%
A.4 QMake
A.4.1 General
Feature  Total  Average  Max  # of files with feature  % of files with feature
lines  4996408  2031.1  172197  -  -
continuations  4206088  1709.8  168654  2460  100.0%
comments  35153  14.3  18  2460  100.0%
directory changes (cd)  41132  16.7  1330  913  37.1%
paths  4760241  1935.1  176433  2460  100.0%
vpath directives  0  0.0  0  0  0.0%
VPATH variable assignments  0  0.0  0  0  0.0%
includes  0  0.0  0  0  0.0%
ifdef/ifeq directives  0  0.0  0  0  0.0%
macros  0  0.0  0  0  0.0%
assignments  79812  32.4  41  2460  100.0%
variable references  250308  101.8  4896  2458  99.9%
unique variable references  56728  23.1  36  2458  99.9%
autoevals  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoevals  0  0.0  0  0  0.0%
function calls  0  0.0  0  0  0.0%
embedded function calls  0  0.0  0  0  0.0%
rules  367352  149.3  1430  2460  100.0%
targets  367451  149.4  1430  2460  100.0%
dependencies  4199032  1706.9  168707  2460  100.0%
recipe commands  180124  73.2  2595  2460  100.0%
A.4.2 Assignments
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  9393  3.8  6  2458  99.9%
unique variable references  9393  3.8  6  2458  99.9%
autoeval references  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
lazy assignments  79812  32.4  41  2460  100.0%
strict assignments  0  0.0  0  0  0.0%
iterative assignments  0  0.0  0  0  0.0%
conditional assignments  0  0.0  0  0  0.0%
function calls  0  0.0  0  0  0.0%
paths  16722  6.8  12  2460  100.0%
A.4.3 Rules
Feature  Total  Average  Max  # of files with feature  % of files with feature
single colon rules  367352  149.3  1430  2460  100.0%
double colon rules  0  0.0  0  0  0.0%
pattern rules  11  0.0  1  11  0.4%
static pattern rules  0  0.0  0  0  0.0%
A.4.4 Targets
Feature  Total  Average  Max  # of files with feature  % of files with feature
special targets  2030  0.8  1  2030  82.5%
.SILENT targets  0  0.0  0  0  0.0%
.IGNORE targets  0  0.0  0  0  0.0%
suffix targets  27827  11.3  1084  2054  83.5%
target-specific assignments  367352  149.3  1430  2460  100.0%
variable references  2450  1.0  29  2092  85.0%
variable references  2450  1.0  29  2092  85.0%
function calls  0  0.0  0  0  0.0%
paths  221088  89.9  1118  2459  100.0%
A.4.5 Dependencies
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  8192  3.3  384  2092  85.0%
unique variable references  8192  3.3  384  2092  85.0%
function calls  0  0.0  0  0  0.0%
total autoeval references  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
paths  4046823  1645.1  168664  2459  100.0%
A.4.6 Recipes
Feature  Total  Average  Max  # of files with feature  % of files with feature
command flags  96879  39.4  1655  2460  100.0%
@ command flags  43743  17.8  1261  2460  100.0%
- command flags  53136  21.6  1642  2458  99.9%
+ command flags  0  0.0  0  0  0.0%
recursive makes  46386  18.9  1264  2036  82.8%
recursive automakes  0  0.0  0  0  0.0%
recursive cmakes  0  0.0  0  0  0.0%
recursive qmakes  40709  16.5  1262  2460  100.0%
variable references  230273  93.6  4881  2458  99.9%
unique variable references  148639  60.4  3302  2458  99.9%
autoeval references  0  0.0  0  0  0.0%
$@ references  0  0.0  0  0  0.0%
$< references  0  0.0  0  0  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  0  0.0  0  0  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
function calls  0  0.0  0  0  0.0%
paths  148603  60.4  2590  2443  99.3%
A.4.7 Function Calls
Feature  Total  Average  Max  # of files with feature  % of files with feature
subst function calls  0  0.0  0  0  0.0%
patsubst function calls  0  0.0  0  0  0.0%
strip function calls  0  0.0  0  0  0.0%
findstring function calls  0  0.0  0  0  0.0%
filter function calls  0  0.0  0  0  0.0%
filter-out function calls  0  0.0  0  0  0.0%
sort function calls  0  0.0  0  0  0.0%
word function calls  0  0.0  0  0  0.0%
wordlist function calls  0  0.0  0  0  0.0%
words function calls  0  0.0  0  0  0.0%
firstword function calls  0  0.0  0  0  0.0%
lastword function calls  0  0.0  0  0  0.0%
dir function calls  0  0.0  0  0  0.0%
notdir function calls  0  0.0  0  0  0.0%
suffix function calls  0  0.0  0  0  0.0%
basename function calls  0  0.0  0  0  0.0%
addsuffix function calls  0  0.0  0  0  0.0%
addprefix function calls  0  0.0  0  0  0.0%
join function calls  0  0.0  0  0  0.0%
wildcard function calls  0  0.0  0  0  0.0%
realpath function calls  0  0.0  0  0  0.0%
abspath function calls  0  0.0  0  0  0.0%
error function calls  0  0.0  0  0  0.0%
warning function calls  0  0.0  0  0  0.0%
info function calls  0  0.0  0  0  0.0%
file function calls  0  0.0  0  0  0.0%
call function calls  0  0.0  0  0  0.0%
value function calls  0  0.0  0  0  0.0%
eval function calls  0  0.0  0  0  0.0%
origin function calls  0  0.0  0  0  0.0%
flavor function calls  0  0.0  0  0  0.0%
shell function calls  0  0.0  0  0  0.0%
guile function calls  0  0.0  0  0  0.0%
foreach function calls  0  0.0  0  0  0.0%
if function calls  0  0.0  0  0  0.0%
or function calls  0  0.0  0  0  0.0%
and function calls  0  0.0  0  0  0.0%
A.5 Hand-written
A.5.1 General
Feature  Total  Average  Max  # of files with feature  % of files with feature
lines  353338  50.7  23380  -  -
continuations  70984  10.2  10261  1744  25.0%
comments  60807  8.7  409  5859  84.0%
directory changes (cd)  2211  0.3  336  315  4.5%
paths  50285  7.2  1295  3902  56.0%
vpath directives  45  0.0  8  21  0.3%
VPATH variable assignments  1080  0.2  1  1080  15.5%
includes  5439  0.8  84  2160  31.0%
ifdef/ifeq directives  5283  0.8  142  1037  14.9%
macros  379  0.1  5  343  4.9%
assignments  71187  10.2  891  6474  92.9%
variable references  133583  19.2  5566  5497  78.8%
unique variable references  44841  6.4  410  5497  78.8%
autoevals  8947  1.3  248  950  13.6%
$@ references  6129  0.9  192  863  12.4%
$< references  1951  0.3  115  563  8.1%
$? references  126  0.0  38  41  0.6%
$^ references  296  0.0  8  170  2.4%
$+ references  7  0.0  7  1  0.0%
$* references  356  0.1  82  109  1.6%
$% references  0  0.0  0  0  0.0%
$| references  1  0.0  1  1  0.0%
obsolete autoevals  81  0.0  13  31  0.4%
function calls  7882  1.1  483  1048  15.0%
embedded function calls  1578  0.2  259  237  3.4%
rules  42126  6.0  6975  1700  24.4%
targets  43939  6.3  6976  1700  24.4%
dependencies  62427  9.0  7077  1402  20.1%
recipe commands  28797  4.1  1432  1934  27.7%
A.5.2 Assignments
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  46662  6.7  849  3979  57.1%
unique variable references  38699  5.6  567  3979  57.1%
autoeval references  749  0.1  57  162  2.3%
$@ references  519  0.1  33  143  2.1%
$< references  155  0.0  14  87  1.2%
$? references  1  0.0  1  1  0.0%
$^ references  28  0.0  3  19  0.3%
$+ references  0  0.0  0  0  0.0%
$* references  22  0.0  5  14  0.2%
$% references  0  0.0  0  0  0.0%
$| references  1  0.0  1  1  0.0%
obsolete autoeval references  23  0.0  6  10  0.1%
lazy assignments  29366  4.2  891  2752  39.5%
strict assignments  16136  2.3  376  3599  51.6%
iterative assignments  24872  3.6  435  3225  46.3%
conditional assignments  1162  0.2  43  506  7.3%
function calls  3342  0.5  169  713  10.2%
paths  21951  3.1  423  3823  54.8%
A.5.3 Rules
Feature  Total  Average  Max  # of files with feature  % of files with feature
single colon rules  41470  5.9  6942  1589  22.8%
double colon rules  656  0.1  50  216  3.1%
pattern rules  583  0.1  77  222  3.2%
static pattern rules  172  0.0  19  71  1.0%
A.5.4 Targets
Feature  Total  Average  Max  # of files with feature  % of files with feature
special targets  6312  0.9  1808  467  6.7%
.SILENT targets  2  0.0  1  2  0.0%
.IGNORE targets  0  0.0  0  0  0.0%
suffix targets  729  0.1  10  228  3.3%
target-specific assignments  42586  6.1  6975  1700  24.4%
variable references  5585  0.8  320  786  11.3%
variable references  5447  0.8  319  786  11.3%
function calls  102  0.0  11  42  0.6%
paths  3377  0.5  322  502  7.2%
A.5.5 Dependencies
Feature  Total  Average  Max  # of files with feature  % of files with feature
variable references  16023  2.3  1130  1012  14.5%
unique variable references  12871  1.8  712  1012  14.5%
function calls  239  0.0  33  72  1.0%
total autoeval references  6  0.0  2  4  0.1%
$@ references  2  0.0  1  2  0.0%
$< references  2  0.0  2  1  0.0%
$? references  0  0.0  0  0  0.0%
$^ references  0  0.0  0  0  0.0%
$+ references  0  0.0  0  0  0.0%
$* references  2  0.0  2  1  0.0%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  0  0.0  0  0  0.0%
paths  10341  1.5  1125  629  9.0%
A.5.6 Recipes
Feature  Total  Average  Max  # of files with feature  % of files with feature
command flags  7512  1.1  1379  742  10.6%
@ command flags  6113  0.9  1367  626  9.0%
- command flags  1352  0.2  58  236  3.4%
+ command flags  47  0.0  12  12  0.2%
recursive makes  3541  0.5  1162  331  4.7%
recursive automakes  3  0.0  1  3  0.0%
recursive cmakes  0  0.0  0  0  0.0%
recursive qmakes  0  0.0  0  0  0.0%
variable references  55916  8.0  5327  1388  19.9%
unique variable references  39993  5.7  3747  1356  19.4%
autoeval references  8192  1.2  248  866  12.4%
$@ references  5608  0.8  192  778  11.2%
$< references  1794  0.3  115  499  7.2%
$? references  125  0.0  38  40  0.6%
$^ references  268  0.0  8  151  2.2%
$+ references  7  0.0  7  1  0.0%
$* references  332  0.0  81  106  1.5%
$% references  0  0.0  0  0  0.0%
$| references  0  0.0  0  0  0.0%
obsolete autoeval references  58  0.0  13  21  0.3%
function calls  1711  0.2  352  273  3.9%
paths  11872  1.7  794  1073  15.4%
A.5.7 Function Calls
Feature  Total  Average  Max  # of files with feature  % of files with feature
subst function calls  404  0.1  18  169  2.4%
patsubst function calls  414  0.1  18  159  2.3%
strip function calls  373  0.1  83  95  1.4%
findstring function calls  191  0.0  84  50  0.7%
filter function calls  387  0.1  55  130  1.9%
filter-out function calls  338  0.0  23  160  2.3%
sort function calls  91  0.0  7  54  0.8%
word function calls  36  0.0  7  19  0.3%
wordlist function calls  55  0.0  6  42  0.6%
words function calls  51  0.0  18  26  0.4%
firstword function calls  76  0.0  4  50  0.7%
lastword function calls  7  0.0  2  6  0.1%
dir function calls  164  0.0  12  66  0.9%
notdir function calls  412  0.1  176  77  1.1%
suffix function calls  8  0.0  3  4  0.1%
basename function calls  211  0.0  176  26  0.4%
addsuffix function calls  100  0.0  9  47  0.7%
addprefix function calls  463  0.1  31  163  2.3%
join function calls  167  0.0  166  2  0.0%
wildcard function calls  430  0.1  25  175  2.5%
realpath function calls  5  0.0  2  3  0.0%
abspath function calls  31  0.0  7  18  0.3%
error function calls  231  0.0  13  94  1.3%
warning function calls  48  0.0  9  28  0.4%
info function calls  59  0.0  13  18  0.3%
file function calls  0  0.0  0  0  0.0%
call function calls  1728  0.2  205  382  5.5%
value function calls  4  0.0  2  2  0.0%
eval function calls  190  0.0  43  34  0.5%
origin function calls  5  0.0  2  3  0.0%
flavor function calls  0  0.0  0  0  0.0%
shell function calls  451  0.1  34  202  2.9%
guile function calls  0  0.0  0  0  0.0%
foreach function calls  359  0.1  84  92  1.3%
if function calls  371  0.1  18  170  2.4%
or function calls  9  0.0  2  8  0.1%
and function calls  2  0.0  1  2  0.0%