Kernel Recipes 2013 - Automating source code evolutions using Coccinelle
DESCRIPTION
New APIs are continually added to the Linux kernel to improve functionality, reliability, or performance. Nevertheless, it is a challenge to update old code to use these new APIs. Coccinelle is a program matching and transformation tool for C code that has been extensively applied to the Linux kernel. This talk will introduce Coccinelle and illustrate how it can be used for this kind of code renovation.
TRANSCRIPT
Automating Source Code Evolutions using Coccinelle
Julia Lawall (Inria/LIP6)
Joint work with Gilles Muller, René Rydhof Hansen, Nicolas Palix
September 24, 2013
1
Legacy software: Changing priorities, changing requirements
Linux kernel examples:
* Booleans:
  – Use 0 / 1?
  – Use true / false?
* Managed memory:
  – kmalloc requires kfree, request_irq requires free_irq.
  – Devm interface implicitly cleans up allocated resources.
Issues:
* These changes require pervasive, scattered modifications.
* Nothing forces the changes to be made.
* Developers don’t pick up on new coding strategies.
2
The use of booleans over time
[Figure: occurrences (y-axis, 0 to 30000) of boolean values, boolean variables, and ints in bools, for Linux 2.6.20 through Linux 3.10 (roughly every 6 months, February 2007 - June 2013).]
3
The use of booleans over time - rate of bad code
[Figure: percentage (y-axis, 0 to 20%) of int values in bool variables, for Linux 2.6.20 through Linux 3.10.]
4
Booleans: A concrete example
Desired property: bool variables should be true or false.
Code fragment:
static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {
    struct drbd_peer_request *rs_req;
    bool rv = 0;
    spin_lock_irq(&mdev->tconn->req_lock);
    list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {
        if (overlaps(peer_req->i.sector, peer_req->i.size,
                     rs_req->i.sector, rs_req->i.size)) {
            rv = 1;
            break;
        }
    }
    spin_unlock_irq(&mdev->tconn->req_lock);
    return rv;
}
5
The use of devm functions over time
[Figure: occurrences (y-axis, 0 to 600) of platform kzalloc, platform devm_kzalloc, i2c kzalloc, and i2c devm_kzalloc, for Linux 2.6.20 through Linux 3.10.]
6
Devm functions: A concrete example
Desired property (simpler than introducing devm functions):
* A common pattern is platform_get_resource followed by devm_ioremap_resource.
* devm_ioremap_resource does error handling for the platform_get_resource value.
* Separate error handling is not needed.
Code fragment:
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res)
    return -ENODEV;
ahb->regs = devm_ioremap_resource(&pdev->dev, res);
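The reason the separate NULL test is redundant can be sketched in userspace C. This is only a mock under stated assumptions: every name prefixed with mock_ is hypothetical, and the mapping function merely imitates the documented behaviour that devm_ioremap_resource rejects a missing resource itself and reports the failure through an error pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for a kernel resource; all mock_ names are hypothetical. */
struct mock_resource { unsigned long start, end; };

#define MOCK_MAX_ERRNO 4095

/* Encode a negative errno in a pointer, in the spirit of ERR_PTR(). */
static void *mock_err_ptr(long err) { return (void *)(intptr_t)err; }

/* True when the pointer lies in the reserved error range, like IS_ERR(). */
static int mock_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

/* Decode the errno again, like PTR_ERR(). */
static long mock_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Lookup that may fail: returns NULL when the resource is absent. */
static struct mock_resource *mock_get_resource(struct mock_resource *table,
                                               int present)
{
    return present ? table : NULL;
}

/* Mapping step that validates its own argument, so callers need no NULL test. */
static void *mock_ioremap_resource(struct mock_resource *res)
{
    static char fake_registers[16];

    if (!res)
        return mock_err_ptr(-EINVAL);
    return fake_registers;
}

/* Probe written without the separate "if (!res)" check. */
long mock_probe(int present)
{
    struct mock_resource table = { 0x1000, 0x1fff };
    struct mock_resource *res = mock_get_resource(&table, present);
    void *regs = mock_ioremap_resource(res);

    if (mock_is_err(regs))
        return mock_ptr_err(regs);
    return 0;
}
```

Calling mock_probe(0) still reports an error even though the caller never tests res, which is the property the slide relies on.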
7
How to make these changes automatically and reliably?
Requirements:
* Find relevant code.
  – grep does this...
* Make changes.
  – sed does this...
Problem: Grep and sed don’t know about code structure or semantics.
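The limitation of purely textual tools can be illustrated with a small C sketch. The helper naive_replace is hypothetical and only imitates a plain string substitution of the kind sed would perform; it has no notion of types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src to out, replacing every occurrence of
 * `from` with `to`. Output is truncated if it would not fit in out_size.
 */
void naive_replace(const char *src, const char *from, const char *to,
                   char *out, size_t out_size)
{
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    size_t used = 0;

    if (out_size == 0)
        return;
    while (*src != '\0') {
        if (from_len > 0 && strncmp(src, from, from_len) == 0) {
            if (used + to_len >= out_size)
                break;              /* no room left for the replacement */
            memcpy(out + used, to, to_len);
            used += to_len;
            src += from_len;
        } else {
            if (used + 1 >= out_size)
                break;              /* no room left for another character */
            out[used++] = *src++;
        }
    }
    out[used] = '\0';
}
```

Applied to the text "int count = 1; bool ready = 1;" with "= 1;" replaced by "= true;", this rewrites the int initialization as well as the bool one, because nothing in the text distinguishes the two declarations.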
8
Our approach: Coccinelle
* Static analysis to find patterns in C code.
* Automatic transformation.
* User scriptable, based on patch notation (semantic patches).
http://coccinelle.lip6.fr/
http://coccinellery.org/
9
Boolean example, part 1
@@ -1932,14 +1932,14 @@
 static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {
     struct drbd_peer_request *rs_req;
-    bool rv = 0;
+    bool rv = false;
     spin_lock_irq(&mdev->tconn->req_lock);
     list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {
         if (overlaps(peer_req->i.sector, peer_req->i.size,
                      rs_req->i.sector, rs_req->i.size)) {
-            rv = 1;
+            rv = true;
             break;
         }
     }
     spin_unlock_irq(&mdev->tconn->req_lock);
     return rv;
 }
First step:
* Replace bool b = 0; by bool b = false;
* Replace bool b = 1; by bool b = true;
10
Semantic patch: booleans, part 1
@@
identifier b;
@@
bool b =
- 0
+ false
;

@@
identifier b;
@@
bool b =
- 1
+ true
;
11
Semantic patch application: booleans, part 1
Result:
@@ -1932,14 +1932,14 @@
 static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {
     struct drbd_peer_request *rs_req;
-    bool rv = 0;
+    bool rv = false;
     spin_lock_irq(&mdev->tconn->req_lock);
     list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {
         if (overlaps(peer_req->i.sector, peer_req->i.size,
                      rs_req->i.sector, rs_req->i.size)) {
             rv = 1;
             break;
         }
     }
     spin_unlock_irq(&mdev->tconn->req_lock);
     return rv;
 }
Affects 187 lines, 124 files in Linux 3.10.
* Fixes the declaration, but not the subsequent assignment...
12
Boolean example: part 2
Problem:
* We cannot replace every 0 by false, and 1 by true.
* The assignment must involve a boolean variable.
Solution: Type constraints on metavariables.
@@ bool b; @@
b =
- 0
+ false

@@ bool b; @@
b =
- 1
+ true
13
Semantic patch application: booleans, part 2
Result:
@@ -1932,14 +1932,14 @@
 static bool overlapping_resync_write(struct drbd_conf *mdev, ...) {
     struct drbd_peer_request *rs_req;
-    bool rv = 0;
+    bool rv = false;
     spin_lock_irq(&mdev->tconn->req_lock);
     list_for_each_entry(rs_req, &mdev->sync_ee, w.list) {
         if (overlaps(peer_req->i.sector, peer_req->i.size,
                      rs_req->i.sector, rs_req->i.size)) {
-            rv = 1;
+            rv = true;
             break;
         }
     }
     spin_unlock_irq(&mdev->tconn->req_lock);
     return rv;
 }
Affects 657 lines, 236 files in Linux 3.10.
14
Remaining issues
* Function arguments.
* Function return values.
* Booleans used with non-bool variables.
Beyond the scope of this talk...
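For orientation only, the three remaining cases might look like the following C sketch. All names are hypothetical, and the reading of "booleans used with non-bool variables" as a bool constant stored in an int is an assumption, not something the slides spell out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state flag, used only for illustration. */
static bool link_up;

/* Function argument: the parameter is bool, but callers may pass 0 or 1. */
void set_link_state(bool up)
{
    link_up = up;
}

/* Function return value: a bool function returning integer constants. */
bool is_even(int n)
{
    if (n % 2 == 0)
        return 1;       /* would ideally read: return true; */
    return 0;           /* would ideally read: return false; */
}

/* Boolean constant stored in a non-bool variable (assumed interpretation). */
int count_ready(void)
{
    int ready = true;   /* bool constant held in an int */

    set_link_state(1);  /* integer constant passed for a bool parameter */
    return ready + (link_up ? 1 : 0);
}
```

None of these sites is an assignment to a bool variable, so the two semantic patches shown earlier would leave them untouched.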
15
platform_get_resource example, part 1
mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!mem) {
    dev_err(&pdev->dev, "no mem resource?");
    return -ENODEV;
}
irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
if (!irq) {
    dev_err(&pdev->dev, "no irq resource?");
    return -ENODEV;
}
... // 35 lines of code
dev->base = devm_ioremap_resource(&pdev->dev, mem);
if (IS_ERR(dev->base)) {
    r = PTR_ERR(dev->base);
    goto err_unuse_clocks;
}
16
platform_get_resource example, part 1
- mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- if (!mem) {
-     dev_err(&pdev->dev, "no mem resource?");
-     return -ENODEV;
- }
  ...
+ mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
  dev->base = devm_ioremap_resource(&pdev->dev, mem);
  if (IS_ERR(dev->base)) {
      r = PTR_ERR(dev->base);
      goto err_unuse_clocks;
  }
Problem: Need both platform_get_resource and devm_ioremap_resource.
* Separated by arbitrary code fragments.
17
Semantic patch: platform_get_resource, part 1
Copy the code fragment, from the first line to the last line
@@
expression res, pdev, n, e, x;
@@
mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!mem) {
    dev_err(&pdev->dev, "no mem resource?");
    return -ENODEV;
}
irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
if (!irq) {
    dev_err(&pdev->dev, "no irq resource?");
    return -ENODEV;
}
...
dev->base = devm_ioremap_resource(&pdev->dev, mem);
18
Semantic patch: platform_get_resource, part 1
Drop the irrelevant parts
@@
expression res, pdev, n, e, x;
@@
mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!mem) {
    dev_err(&pdev->dev, "no mem resource?");
    return -ENODEV;
}
...
dev->base = devm_ioremap_resource(&pdev->dev, mem);
19
Semantic patch: platform_get_resource, part 1
Abstract over irrelevant details
@@
expression mem, pdev, n, e;
statement S;
@@
mem = platform_get_resource(pdev, IORESOURCE_MEM, n);
if (!mem) S
...
e = devm_ioremap_resource(&pdev->dev, mem);
20
Semantic patch: platform_get_resource, part 1
Introduce transformations
@@
expression mem, pdev, n, e;
statement S;
@@
- mem = platform_get_resource(pdev, IORESOURCE_MEM, n);
- if (!mem) S
  ...
+ mem = platform_get_resource(pdev, IORESOURCE_MEM, n);
  e = devm_ioremap_resource(&pdev->dev, mem);
21
Semantic patch: platform_get_resource, part 1
Add sanity checks
@@
expression mem, pdev, n, e;
statement S;
@@
- mem = platform_get_resource(pdev, IORESOURCE_MEM, n);
- if (!mem) S
  ... when != mem
+ mem = platform_get_resource(pdev, IORESOURCE_MEM, n);
  e = devm_ioremap_resource(&pdev->dev, mem);
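The purpose of the "when != mem" check can be sketched in userspace C: if mem is used between the lookup and the mapping call, the lookup cannot be moved down and its NULL test cannot be dropped. The mock_ names below are hypothetical stand-ins; the size computation follows the usual end - start + 1 convention.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a kernel resource; all mock_ names are hypothetical. */
struct mock_resource { unsigned long start, end; };

/* Lookup that may fail: returns NULL when the resource is absent. */
static struct mock_resource *mock_get_resource(struct mock_resource *table,
                                               int present)
{
    return present ? table : NULL;
}

/* Size of the region, inclusive of both end points. */
static unsigned long mock_resource_size(const struct mock_resource *res)
{
    return res->end - res->start + 1;
}

/*
 * Probe with an intermediate use of mem. Returns the region size, or 0
 * when the resource is missing. Removing the NULL test here would lead
 * to a NULL dereference in mock_resource_size, so a transformation that
 * deletes the test must first rule out such intermediate uses.
 */
unsigned long mock_probe_with_intermediate_use(int present)
{
    struct mock_resource table = { 0x1000, 0x1fff };
    struct mock_resource *mem = mock_get_resource(&table, present);
    unsigned long size;

    if (!mem)
        return 0;
    size = mock_resource_size(mem);  /* mem is used before any mapping call */
    /* ... the mapping step would follow here ... */
    return size;
}
```

Code of this shape is exactly what the "when != mem" constraint excludes from the match, so it is left unchanged.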
22
Semantic patch application: platform_get_resource, part 1
Result:
- mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- if (!mem) {
-     dev_err(&pdev->dev, "no mem resource?");
-     return -ENODEV;
- }
  irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
  if (!irq) {
      dev_err(&pdev->dev, "no irq resource?");
      return -ENODEV;
  }
  ...
+ mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
  dev->base = devm_ioremap_resource(&pdev->dev, mem);
  if (IS_ERR(dev->base)) {
      r = PTR_ERR(dev->base);
      goto err_unuse_clocks;
  }
Affects 42 files in Linux 3.10.
23
platform_get_resource example, part 2
Problem: Some calls are missed because &pdev->dev is renamed.
Code fragment:
struct device *dev = &pdev->dev;
void __iomem *addr = NULL;
struct stmmac_priv *priv = NULL;
struct plat_stmmacenet_data *plat_dat = NULL;
const char *mac = NULL;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res)
    return -ENODEV;
addr = devm_ioremap_resource(dev, res);
if (IS_ERR(addr))
    return PTR_ERR(addr);
24
Semantic patch: platform_get_resource, part 2
@@
expression res, pdev, n, e, x, e1, e2;
statement S;
@@
  x = &pdev->dev
  ... when != x = e1
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
- if (!res) S
  ... when != res
      when != x = e2
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
  e = devm_ioremap_resource(x, res);
Result: Updates 4 more files in Linux 3.10.
25
Summary
We have seen the need for:
* Search and replace for atomic code fragments.
  – Declaration and initialization of boolean variables.
* Search and replace using type information.
  – Assignments of boolean variables.
* Search and replace for scattered code fragments.
  – platform_get_resource somewhere followed by devm_ioremap_resource.
* Taking renamings into account.
  – &pdev->dev vs dev.
26
Conclusion
Coccinelle:
* Define and perform arbitrary transformations across a code base.
Reasonable performance in most cases.
* Boolean example takes 7 minutes on a 4 core laptop.
Impact:
* Over 1000 patches in Linux based on Coccinelle.
* Over 40 semantic patches in linux/scripts/coccinelle.
http://coccinelle.lip6.fr/
http://coccinellery.org/
27