

BTRFS Fun


Important

BTRFS is still experimental, even with the latest Linux kernels (3.4-rc at the date of writing), so be prepared to lose some data sooner or later, or to hit severe issues, regressions, or "itchy" bugs. Subliminal message: Do not put critical data on BTRFS partitions.

Introduction

BTRFS is an advanced filesystem, mostly contributed by Oracle, whose origins date back to 2007. A good summary is given in [1]. BTRFS aims to provide a modern answer for making storage more flexible and efficient. According to its main contributor, Chris Mason, the goal was "to let Linux scale for the storage that will be available. Scaling is not just about addressing the storage but also means being able to administer and to manage it with a clean interface that lets people see what's being used and makes it more reliable." (Ref. http://en.wikipedia.org/wiki/Btrfs).

Btrfs, often compared to ZFS, offers some interesting features, such as:

- Very little fixed-location metadata, thus allowing an existing ext2/ext3 filesystem to be "upgraded" in-place to BTRFS.
- Operations are transactional
- Online volume defragmentation (online filesystem check is on the radar but is not yet implemented).
- Built-in storage pool capabilities (no need for LVM)
- Built-in RAID capabilities (both for the data and filesystem metadata). RAID-5/6 is planned for 3.5 kernels
- Capabilities to grow/shrink the volume
- Subvolumes and snapshots (extremely powerful, you can "rollback" to a previous filesystem state as if nothing had happened).
- Copy-On-Write
- Usage of B-Trees to store the internal filesystem structures (B-Trees are known to have a logarithmic growth in depth, thus making them more efficient when scanning)

Requirements

A recent Linux kernel is required. The BTRFS metadata format evolves from time to time, and mounting a volume with a recent Linux kernel can make it unreadable with an older kernel revision (e.g. Linux 2.6.31 vs. Linux 2.6.30). You must also use sys-fs/btrfs-progs (0.19 or, better, -9999, which points to the git repository).

BTRFS Fun 03/05/15

http://www.funtoo.org/BTRFS_Fun 1 / 22


Playing with BTRFS storage pool capabilities

Whereas it would be possible to use btrfs just as you are used to under a non-LVM system, it shines when you use its built-in storage pool capabilities. Tired of playing with LVM? :-) Good news: you do not need it anymore with btrfs.

Setting up a storage pool

BTRFS terminology is a bit confusing. If you have already used another 'advanced' filesystem like ZFS or some mechanism like LVM, it's good to know that there are many correlations. In the BTRFS world, the word volume corresponds to a storage pool (ZFS) or a volume group (LVM). Ref. http://www.rkeene.org/projects/info/wiki.cgi/165

The test bench uses disk images through loopback devices. Of course, in a real-world case, you would use local drives or units through a SAN. To start with, 5 devices of 1 GiB are allocated:

# dd if=/dev/zero of=/tmp/btrfs-vol0.img bs=1G count=1
# dd if=/dev/zero of=/tmp/btrfs-vol1.img bs=1G count=1
# dd if=/dev/zero of=/tmp/btrfs-vol2.img bs=1G count=1
# dd if=/dev/zero of=/tmp/btrfs-vol3.img bs=1G count=1
# dd if=/dev/zero of=/tmp/btrfs-vol4.img bs=1G count=1

Then the images are attached to loop devices:

# losetup /dev/loop0 /tmp/btrfs-vol0.img
# losetup /dev/loop1 /tmp/btrfs-vol1.img
# losetup /dev/loop2 /tmp/btrfs-vol2.img
# losetup /dev/loop3 /tmp/btrfs-vol3.img
# losetup /dev/loop4 /tmp/btrfs-vol4.img
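The two steps above (create the backing files, then attach them) are repetitive enough to script. The following is only a minimal sketch and not part of the original test bench: the function name setup_loop_pool and the run dry-run helper are made up for illustration. Commands are printed instead of executed unless DRY_RUN=0 is set (running them for real requires root).

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print commands unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# setup_loop_pool COUNT
# Create COUNT images of 1 GiB in /tmp and attach image N to /dev/loopN,
# using the same dd and losetup invocations as above.
setup_loop_pool() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        img="/tmp/btrfs-vol$i.img"
        run dd if=/dev/zero of="$img" bs=1G count=1
        run losetup "/dev/loop$i" "$img"
        i=$((i + 1))
    done
}

setup_loop_pool 5
```

The sketch assumes /dev/loop0 to /dev/loop4 are free, exactly like the manual commands do.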

Creating the initial volume (pool)

BTRFS uses different strategies to store the data and the filesystem metadata (ref. https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices).

By default the behavior is:

- metadata is replicated on all of the devices. If a single device is used, the metadata is duplicated inside this single device (useful in case of corruption or a bad sector: there is a higher chance that one of the two copies is clean). To tell btrfs to maintain a single copy of the metadata, just use -m single. Remember: dead metadata = dead volume with no chance of recovery.
- data is spread amongst all of the devices (this means no redundancy; any data block left on a defective device will be inaccessible)

To create a BTRFS volume made of multiple devices with default options, use:

# mkfs.btrfs /dev/loop0 /dev/loop1 /dev/loop2

To create a BTRFS volume made of a single device with a single copy of the metadata (dangerous!), use:

# mkfs.btrfs -m single /dev/loop0

To create a BTRFS volume made of multiple devices with metadata spread amongst all of the devices, use:


# mkfs.btrfs -m raid0 /dev/loop0 /dev/loop1 /dev/loop2

To create a BTRFS volume made of multiple devices, with metadata spread amongst all of the devices and data mirrored on all of the devices (you probably don't want this in a real setup), use:

# mkfs.btrfs -m raid0 -d raid1 /dev/loop0 /dev/loop1 /dev/loop2

To create a fully redundant BTRFS volume (data and metadata mirrored amongst all of the devices), use:

# mkfs.btrfs -d raid1 /dev/loop0 /dev/loop1 /dev/loop2

Note

Technically you can use anything as a physical volume: you can have a volume composed of 2 local hard drives, 3 USB keys, 1 loopback device pointing to a file on an NFS share and 3 logical devices accessed through your SAN (you would be an idiot, but you can, nevertheless). Having different physical volume sizes would lead to issues, but it works :-).
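To get a feel for what the -d profile costs in capacity, here is some back-of-the-envelope arithmetic. It is a rough sketch under loud assumptions: all devices have the same size, metadata overhead is ignored, and the function name usable_mib is hypothetical. It relies on the fact that btrfs RAID1 keeps two copies of every block, whereas single and raid0 keep one.

```shell
#!/bin/sh
# usable_mib PROFILE NDEV SIZE_MIB
# Rough usable data capacity in MiB for NDEV devices of SIZE_MIB each.
# Assumes equally sized devices and ignores metadata (illustration only).
usable_mib() {
    profile=$1; ndev=$2; size=$3
    raw=$((ndev * size))
    case $profile in
        single|raid0) echo "$raw" ;;            # one copy of each block
        raid1)        echo $((raw / 2)) ;;      # two copies of each block
        *)            echo "unknown profile: $profile" >&2; return 1 ;;
    esac
}

usable_mib raid0 3 1024   # three 1 GiB loop devices, data striped
usable_mib raid1 3 1024   # same devices, data mirrored
```

With the three 1 GiB loop devices used here, this prints 3072 for raid0 and 1536 for raid1.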

Checking the initial volume

To verify which devices a BTRFS volume is composed of, just use btrfs-show device (old style) or btrfs filesystem show device (new style). You need to specify one of the devices (the metadata has been designed to keep track of which device is linked to which other device). If the initial volume was set up like this:

# mkfs.btrfs /dev/loop0 /dev/loop1 /dev/loop2

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

adding device /dev/loop1 id 2
adding device /dev/loop2 id 3
fs created label (null) on /dev/loop0
        nodesize 4096 leafsize 4096 sectorsize 4096 size 3.00GB
Btrfs Btrfs v0.19

It can be checked with any one of these commands (they are equivalent):

# btrfs filesystem show /dev/loop0
# btrfs filesystem show /dev/loop1
# btrfs filesystem show /dev/loop2

The result is the same for all commands:

Label: none  uuid: 0a774d9c-b250-420e-9484-b8f982818c09
        Total devices 3 FS bytes used 28.00KB
        devid    3 size 1.00GB used 263.94MB path /dev/loop2
        devid    1 size 1.00GB used 275.94MB path /dev/loop0
        devid    2 size 1.00GB used 110.38MB path /dev/loop1

To show all of the volumes that are present:

# btrfs filesystem show
Label: none  uuid: 0a774d9c-b250-420e-9484-b8f982818c09
        Total devices 3 FS bytes used 28.00KB
        devid    3 size 1.00GB used 263.94MB path /dev/loop2
        devid    1 size 1.00GB used 275.94MB path /dev/loop0
        devid    2 size 1.00GB used 110.38MB path /dev/loop1

Label: none  uuid: 1701af39-8ea3-4463-8a77-ec75c59e716a
        Total devices 1 FS bytes used 944.40GB
        devid    1 size 1.42TB used 1.04TB path /dev/sda2

Label: none  uuid: 01178c43-7392-425e-8acf-3ed16ab48813
        Total devices 1 FS bytes used 180.14GB
        devid    1 size 406.02GB used 338.54GB path /dev/sda4

Warning

The BTRFS wiki mentions that btrfs device scan should be performed. What is the consequence of not doing the incantation? Is the volume not seen?
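As far as the btrfs-progs documentation goes, btrfs device scan registers the member devices of btrfs filesystems with the kernel module, which is what lets a multi-device volume be mounted by naming only one of its devices. The sketch below simply runs the scan before mounting; it has not been checked against the v0.19 tools used in this article, and the names scan_and_mount and run are hypothetical. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print commands unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# scan_and_mount DEVICE MOUNTPOINT
# Register btrfs member devices with the kernel, then mount the volume
# through one of its devices.
scan_and_mount() {
    run btrfs device scan &&
    run mount "$1" "$2"
}

scan_and_mount /dev/loop0 /mnt
```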

Mounting the initial volume

BTRFS volumes can be mounted like any other filesystem. The cherry on top of the sundae is that the design of the BTRFS metadata makes it possible to use any of the volume devices. The following commands are equivalent:

# mount /dev/loop0 /mnt
# mount /dev/loop1 /mnt
# mount /dev/loop2 /mnt

Whichever physical device is used for mounting the BTRFS volume, df -h reports the same thing (in all cases a volume size of 3 GiB is reported):

# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop1      3.0G   56K  1.8G   1% /mnt

The following command prints very useful information (like how the BTRFS volume has been created):

# btrfs filesystem df /mnt
Data, RAID0: total=409.50MB, used=0.00
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=204.75MB, used=28.00KB
Metadata: total=8.00MB, used=0.00

By the way, as you can see, for the btrfs command the mount point should be specified, not one of the physical devices.
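If you only care about which RAID profile is in use for each block group type, the btrfs filesystem df output can be boiled down with a bit of awk. This is a small sketch that assumes the output format shown above (newer tools may print it differently); the function name profiles_from_df is made up. It works on a pasted sample here, so it can be tried without a BTRFS volume.

```shell
#!/bin/sh
# profiles_from_df
# Read "btrfs filesystem df" output on stdin and print "<type> <profile>"
# for each line that names a RAID profile (e.g. "Data, RAID0: ...").
profiles_from_df() {
    awk -F'[,:]' 'NF >= 3 && $2 ~ /RAID/ { gsub(/ /, "", $2); print $1, $2 }'
}

# Sample copied from the output above. On a live system one would run:
#   btrfs filesystem df /mnt | profiles_from_df
profiles_from_df <<'EOF'
Data, RAID0: total=409.50MB, used=0.00
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=204.75MB, used=28.00KB
Metadata: total=8.00MB, used=0.00
EOF
```

For the sample, it reports RAID0 for Data and RAID1 for System and Metadata, which matches a mkfs.btrfs run with default options on several devices.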

Shrinking the volume

A common practice in system administration is to leave some headroom instead of using the whole capacity of a storage pool (just in case). With btrfs one can easily shrink volumes. Let's shrink the volume a bit (by 500 MiB, i.e. about 16% of the 3 GiB):

# btrfs filesystem resize -500m /mnt
# df -h
/dev/loop1      2.6G   56K  1.8G   1% /mnt


And yes, it is an on-line resize: there is no need to umount/shrink/mount. So no downtime! :-) However, a BTRFS volume requires a minimal size... if the shrink is too aggressive the volume won't be resized:

# btrfs filesystem resize -1g /mnt
Resize '/mnt' of '-1g'
ERROR: unable to resize '/mnt'

Growing the volume

This is the opposite operation: you can make a BTRFS volume grow by a given amount (e.g. 150 more megabytes):

# btrfs filesystem resize +150m /mnt
Resize '/mnt' of '+150m'
# df -h
/dev/loop1      2.7G   56K  1.8G   1% /mnt

You can also take an "all you can eat" approach via the max option, meaning all of the possible space will be used for the volume:

# btrfs filesystem resize max /mnt
Resize '/mnt' of 'max'
# df -h
/dev/loop1      3.0G   56K  1.8G   1% /mnt
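The three resize forms used above (-500m, +150m, max) can be wrapped in one small function that rejects anything else before calling btrfs. This is a sketch only: resize_volume and run are hypothetical names, the check deliberately accepts just the forms seen in this section (the real command understands more), and commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print commands unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# resize_volume AMOUNT MOUNTPOINT
# AMOUNT is "max", or a number with an optional k/m/g suffix, optionally
# prefixed with + (grow) or - (shrink). Anything else is refused.
resize_volume() {
    amount=$1; mnt=$2
    if ! printf '%s\n' "$amount" | grep -Eq '^(max|[+-]?[0-9]+[kKmMgG]?)$'; then
        echo "unsupported resize amount: $amount" >&2
        return 1
    fi
    run btrfs filesystem resize "$amount" "$mnt" &&
    run df -h "$mnt"
}

resize_volume -500m /mnt   # shrink by 500 MiB
resize_volume +150m /mnt   # grow by 150 MiB
resize_volume max /mnt     # use all of the available space
```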

Adding a new device to the BTRFS volume

To add a new device to the volume:

# btrfs device add /dev/loop4 /mnt
# btrfs filesystem show /dev/loop4
Label: none  uuid: 0a774d9c-b250-420e-9484-b8f982818c09
        Total devices 4 FS bytes used 28.00KB
        devid    3 size 1.00GB used 263.94MB path /dev/loop2
        devid    4 size 1.00GB used 0.00 path /dev/loop4
        devid    1 size 1.00GB used 275.94MB path /dev/loop0
        devid    2 size 1.00GB used 110.38MB path /dev/loop1

Again, there is no need to umount the volume first, as adding a device is an on-line operation (the device has no space used yet, hence the '0.00'). The operation is not finished, as we must tell btrfs to prepare the new device (i.e. rebalance/mirror the metadata and the data between all devices):

# btrfs filesystem balance /mnt
# btrfs filesystem show /dev/loop4
Label: none  uuid: 0a774d9c-b250-420e-9484-b8f982818c09
        Total devices 4 FS bytes used 28.00KB
        devid    3 size 1.00GB used 110.38MB path /dev/loop2
        devid    4 size 1.00GB used 366.38MB path /dev/loop4
        devid    1 size 1.00GB used 378.38MB path /dev/loop0
        devid    2 size 1.00GB used 110.38MB path /dev/loop1

Note

Depending on the sizes and what is in the volume, a balancing operation could take several minutes or hours.
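Since adding a device is always followed by a balance, the two steps can be chained so that the balance is never forgotten. A minimal sketch, with hypothetical names (grow_pool, run); commands are only printed unless DRY_RUN=0 is set. It keeps the btrfs filesystem balance spelling used in this article; newer btrfs-progs releases spell the same operation btrfs balance start.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print commands unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# grow_pool DEVICE MOUNTPOINT
# Add DEVICE to the mounted volume, rebalance data and metadata across all
# devices, then show the resulting per-device usage.
grow_pool() {
    dev=$1; mnt=$2
    run btrfs device add "$dev" "$mnt" &&
    run btrfs filesystem balance "$mnt" &&
    run btrfs filesystem show "$dev"
}

grow_pool /dev/loop4 /mnt
```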

Removing a device from the BTRFS volume

# btrfs device delete /dev/loop2 /mnt
# btrfs filesystem show /dev/loop0
Label: none  uuid: 0a774d9c-b250-420e-9484-b8f982818c09
        Total devices 4 FS bytes used 28.00KB
        devid    4 size 1.00GB used 264.00MB path /dev/loop4
        devid    1 size 1.00GB used 268.00MB path /dev/loop0
        devid    2 size 1.00GB used 0.00 path /dev/loop1
        *** Some devices missing
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop1      3.0G   56K  1.5G   1% /mnt

Here again, removing a device is totally dynamic and can be done as an on-line operation! Note that when a device is removed, its content is transparently redistributed among the other devices.

Obvious points:

- ** DO NOT UNPLUG THE DEVICE BEFORE THE END OF THE OPERATION, DATA LOSS WILL RESULT **
- If you have used raid0 for either the metadata or the data at BTRFS volume creation, you will end up with an unusable volume if one of the devices fails before being properly removed from the volume, as some stripes will be lost.

Once you add a new device to the BTRFS volume as a replacement for a removed one, you can clean up the references to the missing device (like the other btrfs device commands, this one takes the mount point):

# btrfs device delete missing /mnt

Using a BTRFS volume in degraded mode

Warning

It is not possible to use a volume in degraded mode if raid0 has been used for data/metadata and the device had not been properly removed with btrfs device delete (some stripes will be missing). The situation is even worse if RAID0 is used for the metadata: trying to mount a BTRFS volume in read/write mode while not all the devices are accessible will simply kill the remaining metadata, hence making the BTRFS volume totally unusable... you have been warned! :-)

If you use raid1 or raid10 for data AND metadata and you have a usable submirror accessible (consisting of 1 drive in the case of RAID1, or the two drives of the same RAID0 array in the case of RAID10), you can mount the array in degraded mode when some devices are missing (e.g. dead SAN link or dead drive):

# mount -o degraded /dev/loop0 /mnt

If you use RAID0 for the metadata (and one of your drives is inaccessible), or RAID10 but not enough drives are on-line for even a degraded mode to be possible, btrfs will refuse to mount the volume:

# mount /dev/loop0 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

The situation is no better if you have used RAID1 for the metadata and RAID0 for the data: you can mount the volume in degraded mode, but you will encounter problems while accessing your files:

# cp /mnt/test.dat /tmp
cp: reading `/mnt/test.dat': Input/output error
cp: failed to extend `/tmp/test.dat': Input/output error
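The cases above boil down to a small decision table keyed on the metadata and data profiles. The sketch below merely encodes what this section says, assuming exactly one device of the volume is missing; it does not inspect a real volume, and the function name degraded_outlook is hypothetical.

```shell
#!/bin/sh
# degraded_outlook METADATA_PROFILE DATA_PROFILE
# Print what to expect, per the cases described above, when one device of
# the volume is missing. Illustration only: nothing is probed or mounted.
degraded_outlook() {
    case "$1/$2" in
        raid0/*)
            echo "refused: metadata stripes are missing" ;;
        raid1/raid0)
            echo "mounts degraded, but file reads may fail with I/O errors" ;;
        raid1/raid1|raid1/raid10|raid10/raid1|raid10/raid10)
            echo "mounts with -o degraded" ;;
        *)
            echo "not covered by this sketch" ;;
    esac
}

degraded_outlook raid1 raid1
degraded_outlook raid1 raid0
degraded_outlook raid0 raid0
```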

Playing with subvolumes and snapshots

A story of boxes....

When you think about subvolumes in BTRFS, think about boxes. Each one of those can contain items and other smaller boxes ("sub-boxes"), which in turn can also contain items and boxes (sub-sub-boxes) and so on. Each box and each item has a number and a name, except for the top level box, which has only a number (zero). Now imagine that all of the boxes are semi-opaque: you can see what they contain if you are outside the box but you can't see outside when you are inside the box. Thus, depending on the box you are in, you can view either all of the items and sub-boxes (top level box) or only a part of them (any other box but the top level one). To give you a better idea of this somewhat abstract explanation let's illustrate a bit:

(0) --+-> Item A (1)
      |
      +-> Item B (2)
      |
      +-> Sub-box 1 (3) --+-> Item C (4)
      |                   |
      |                   +-> Sub-sub-box 1.1 (5) --+-> Item D (6)
      |                   |                         |
      |                   |                         +-> Item E (7)
      |                   |                         |
      |                   |                         +-> Sub-Sub-sub-box 1.1.1 (8) ---> Item F (9)
      |                   +-> Item F (10)
      |
      +-> Sub-box 2 (11) --> Item G (12)

What you see in the hierarchy depends on where you are (note that the top level box numbered 0 doesn't have a name, you will see why later). So:

- If you are in the node named top box (numbered 0) you see everything, i.e. things numbered 1 to 12
- If you are in "Sub-sub-box 1.1" (numbered 5), you see only things 6 to 9
- If you are in "Sub-box 2" (numbered 11), you only see what is numbered 12

Did you notice? We have two items named 'F' (respectively numbered 9 and 10). This is not a typographic error; it is just to illustrate the fact that every item lives its own peaceful existence in its own box. Although they have the same name, 9 and 10 are two distinct and unrelated objects (of course it is impossible to have two objects named 'F' in the same box, even if they were numbered differently).

... applied to BTRFS! (or, "What is a volume/subvolume?")

BTRFS subvolumes work in the exact same manner, with some nuances:

- First, imagine a frame that surrounds the whole hierarchy (represented in dots below). This is your BTRFS volume. A bit abstract at first glance, but BTRFS volumes have no tangible existence: they are just an aggregation of devices tagged as being clustered together (that fellowship is created when you invoke mkfs.btrfs or btrfs device add).
- Second, the first level of hierarchy contains only a single box numbered zero which can never be destroyed (because everything it contains would also be destroyed).

Where our analogy of a nested boxes structure used the word "box", the real BTRFS world uses the word "subvolume" (box => subvolume). Like in our boxes analogy, all subvolumes hold a unique number greater than zero and a name, with the exception of the root subvolume located at the very first level of the hierarchy, which is always numbered zero and has no name (BTRFS tools destroy subvolumes by their name, not their number, so no name = no possible destruction. This is a totally intentional architectural choice, not a flaw).

Here is a typical hierarchy:

.....BTRFS Volume......................................................................................................................
.
. Root subvolume (0) --+-> Subvolume SV1 (258) ---> Directory D1 --+-> File F1
.                      |                                           |
.                      |                                           +-> File F2
.                      |
.                      +-> Directory D1 --+-> File F1
.                      |                  |
.                      |                  +-> File F2
.                      |                  |
.                      |                  +-> File F3
.                      |                  |
.                      |                  +-> Directory D11 ---> File F4
.                      +-> File F1
.                      |
.                      +-> Subvolume SV2 (259) --+-> Subvolume SV21 (260)
.                                                |
.                                                +-> Subvolume SV22 (261) --+-> Directory D2 ---> File F4
.                                                                           |
.                                                                           +-> Directory D3 --+-> Subvolume SV221 (262) ---> File F5
.                                                                           |                  |
.                                                                           |                  +-> File F6
.                                                                           |                  |
.                                                                           |                  +-> File F7
.                                                                           |
.                                                                           +-> File F8
.......................................................................................................................................

Maybe you have a question: "Okay, what is the difference between a directory and a subvolume? Both can contain something!". To further confuse you, here is what users get if they reproduce the first level of the hierarchy on a real machine:

# ls -l
total 0
drwx------ 1 root root 0 May 23 12:48 SV1
drwxr-xr-x 1 root root 0 May 23 12:48 D1
-rw-r--r-- 1 root root 0 May 23 12:48 F1
drwx------ 1 root root 0 May 23 12:48 SV2

Although subvolumes SV1 and SV2 have been created with special BTRFS commands, they appear just as if they were ordinary directories! A subtle nuance exists, however: think again about the boxes analogy we made before and map the concepts in the following manner:

- a subvolume: the semi-opaque box
- a directory: a sort of item (that can contain something, even another subvolume)
- a file: another sort of item

So, in the internal filesystem metadata SV1 and SV2 are stored in a different manner than D1 (although this is transparently handled for users). You can, however, see SV1 and SV2 for what they are (subvolumes) by running the following command (the subvolume numbered 0 has been mounted on /mnt; the tools report the root subvolume under the ID 5, which is why the output says "top level 5"):

# btrfs subvolume list /mnt
ID 258 top level 5 path SV1
ID 259 top level 5 path SV2

What would we get if we create SV21 and SV22 inside of SV2? Let's try! Before going further you should be aware that a subvolume is created by invoking the magic command btrfs subvolume create:

# cd /mnt/SV2
# btrfs subvolume create SV21
Create subvolume './SV21'
# btrfs subvolume create SV22
Create subvolume './SV22'
# btrfs subvolume list /mnt
ID 258 top level 5 path SV1
ID 259 top level 5 path SV2
ID 260 top level 5 path SV2/SV21
ID 261 top level 5 path SV2/SV22

Again, invoking ls in /mnt/SV2 will report the subvolumes as being directories:

# ls -l
total 0
drwx------ 1 root root 0 May 23 13:15 SV21
drwx------ 1 root root 0 May 23 13:15 SV22
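Since ls cannot tell subvolumes from directories, the subvolume number has to come from btrfs subvolume list. The following sketch extracts the ID for a given path from that listing. It assumes the line layout shown above (ID in the second field, path in the last one) and paths without whitespace; the function name subvol_id is made up. It runs on a pasted sample, so no BTRFS volume is needed to try it.

```shell
#!/bin/sh
# subvol_id PATH
# Read "btrfs subvolume list" output on stdin and print the ID of the
# subvolume whose path is exactly PATH (prints nothing if there is none).
subvol_id() {
    awk -v want="$1" '$1 == "ID" && $NF == want { print $2 }'
}

# Sample copied from the listing above. On a live system one would run:
#   btrfs subvolume list /mnt | subvol_id SV2/SV22
subvol_id SV2/SV22 <<'EOF'
ID 258 top level 5 path SV1
ID 259 top level 5 path SV2
ID 260 top level 5 path SV2/SV21
ID 261 top level 5 path SV2/SV22
EOF
```

For the sample listing this prints 261, the number of SV22.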

Changing the point of view on the subvolumes hierarchy

At some point in our boxes analogy we talked about what we see and what we don't see depending on our location in the hierarchy. Here lies an important point: whereas most BTRFS users mount the root subvolume (subvolume id = 0, we will retain the root subvolume terminology) in their VFS hierarchy, thus making visible the whole hierarchy contained in the BTRFS volume, it is absolutely possible to mount only a subset of it. How is that possible? Simple: just specify the subvolume number when you invoke mount. For example, to mount the hierarchy in the VFS starting at subvolume SV22 (261), do the following:

# mount -o subvolid=261 /dev/loop0 /mnt

Here lies an important notion not disclosed in the previous paragraph: although both directories and subvolumes can act as containers, only subvolumes can be mounted in a VFS hierarchy. It is a fundamental aspect to remember: you cannot mount a sub-part of a subvolume in the VFS; you can only mount the subvolume itself. Considering the hierarchy schema in the previous section, if you want to access the directory D3 you have three possibilities:

1. Mount the non-named subvolume (numbered 0) and access D3 through /mnt/SV2/SV22/D3 if the non-named subvolume is mounted in /mnt

2. Mount the subvolume SV2 (numbered 259) and access D3 through /mnt/SV22/D3 if the subvolume SV2 is mounted in /mnt

3. Mount the subvolume SV22 (numbered 261) and access D3 through /mnt/D3 if the subvolume SV22 is mounted in /mnt

This is accomplished by the following commands, respectively:

# mount -o subvolid=0 /dev/loop0 /mnt
# mount -o subvolid=259 /dev/loop0 /mnt
# mount -o subvolid=261 /dev/loop0 /mnt

Note

When a subvolume is mounted in the VFS, everything located "above" the subvolume is hidden. Concretely, if you mount the subvolume numbered 261 in /mnt, you only see what is under SV22; you won't see what is located above SV22, like SV21, SV2, D1, SV1, etc.
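The three ways of reaching D3, and the "everything above is hidden" rule, amount to stripping the mounted subvolume's path from the front of the target's path. The sketch below does exactly that on path strings; it is purely lexical (nothing is mounted or checked against a real volume) and the function name visible_path is hypothetical.

```shell
#!/bin/sh
# visible_path TARGET MOUNTED_SUBVOL MOUNTPOINT
# TARGET and MOUNTED_SUBVOL are paths relative to the root subvolume; use ""
# for the root subvolume itself. Prints where TARGET shows up in the VFS, or
# fails if TARGET lies outside (i.e. "above") the mounted subvolume.
visible_path() {
    target=$1; subvol=$2; mnt=$3
    if [ -z "$subvol" ]; then
        echo "$mnt/$target"
        return 0
    fi
    case $target in
        "$subvol"/*)
            rest=${target#"$subvol"/}
            echo "$mnt/$rest" ;;
        *)
            echo "$target is hidden when $subvol is mounted" >&2
            return 1 ;;
    esac
}

visible_path SV2/SV22/D3 ""       /mnt   # root subvolume mounted (subvolid=0)
visible_path SV2/SV22/D3 SV2      /mnt   # SV2 mounted (subvolid=259)
visible_path SV2/SV22/D3 SV2/SV22 /mnt   # SV22 mounted (subvolid=261)
```

The three calls print /mnt/SV2/SV22/D3, /mnt/SV22/D3 and /mnt/D3, matching the three possibilities listed above; asking for SV1/D1 while SV2 is mounted fails, since SV1 is hidden in that case.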

The default subvolume

$100 questions: 1. "If I don't put 'subvolid' on the command line, how does the kernel know which one of the subvolumes it has to mount?" 2. "Does omitting 'subvolid' automatically mean 'mount the subvolume numbered 0'?" Answers: 1. BTRFS magic! ;-) 2. No, not necessarily: you can choose something other than the non-named subvolume.

When you create a brand new BTRFS filesystem, the system not only creates the initial root subvolume (numbered 0) but also tags it as being the default subvolume. When you ask the operating system to mount a subvolume contained in a BTRFS volume without specifying a subvolume number, it determines which of the existing subvolumes has been tagged as "default subvolume" and mounts it. If none of the existing subvolumes has the tag "default subvolume" (e.g. because the default subvolume has been deleted), the mount command gives up with a rather cryptic message:

# mount /dev/loop0 /mnt
mount: No such file or directory

It is also possible to change at any time which subvolume contained in a BTRFS volume is considered the default subvolume. This is accomplished with btrfs subvolume set-default. The following tags the subvolume 261 as being the default:

# btrfs subvolume set-default 261 /mnt

After that operation, the following two commands are exactly equivalent:

# mount /dev/loop0 /mnt
# mount -o subvolid=261 /dev/loop0 /mnt

Note

The chosen new default subvolume must be visible in the VFS when you invoke btrfs subvolume set-default.
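When juggling with several subvolumes it is handy to read the tag back right after setting it. The sketch below chains set-default with get-default; it assumes a btrfs-progs release that provides btrfs subvolume get-default (the v0.19 tools used in this article may not have it), and the names make_default and run are hypothetical. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print commands unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# make_default SUBVOL_ID MOUNTPOINT
# Tag subvolume SUBVOL_ID as the default one, then read the tag back.
make_default() {
    run btrfs subvolume set-default "$1" "$2" &&
    run btrfs subvolume get-default "$2"
}

make_default 261 /mnt
```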

Deleting subvolumes

Question: "As subvolumes appear like directories, can I delete a subvolume by doing an rm -rf on it?". Answer: Yes, you can, but that way is not the most elegant, especially when it contains several gigabytes of data scattered over thousands of files, directories and maybe other subvolumes located in the one you want to remove. It isn't elegant because rm -rf could take several minutes (or even hours!) to complete, whereas something else can do the same job in a fraction of a second.

"Huh?" Yes, perfectly possible, and here is the cool goodie for the readers who arrived at this point: when you want to remove a subvolume, use btrfs subvolume delete instead of rm -rf. That btrfs command will remove the subvolume in a fraction of a second, even if it contains several gigabytes of data!

Warning

- You can never remove the root subvolume of a BTRFS volume, as btrfs subvolume delete expects a subvolume name (again: this is not a flaw in the design of BTRFS; removing the subvolume numbered 0 would destroy the entirety of a BTRFS volume... too dangerous).
- If the subvolume you delete was tagged as the default subvolume, you will have to designate another default subvolume or explicitly tell the system which one of the subvolumes has to be mounted.

An example: considering our initial example given above and supposing you have mounted the non-named subvolume numbered 0 in /mnt, you can remove SV22 by doing:

# btrfs subvolume delete /mnt/SV2/SV22

Obviously the BTRFS volume will look like this after the operation:

.....BTRFS Volume.....................................................
.
. (0) --+-> Subvolume SV1 (258) ---> Directory D1 --+-> File F1
.       |                                           |
.       |                                           +-> File F2
.       |
.       +-> Directory D1 --+-> File F1
.       |                  |
.       |                  +-> File F2
.       |                  |
.       |                  +-> File F3
.       |                  |
.       |                  +-> Directory D11 ---> File F4
.       +-> File F1
.       |
.       +-> Subvolume SV2 (259) --+-> Subvolume SV21 (260)
......................................................................

Snapshot and subvolumes

If you have a good comprehension of what a subvolume is, understanding what a snapshot is won't be a problem: a snapshot is a subvolume with some initial contents. "Some initial contents" here means an exact copy.

When you think about snapshots, think about copy-on-write: the data blocks are not duplicated between a mounted subvolume and its snapshot unless you start to make changes to the files (a snapshot can occupy nearly zero extra space on the disk). As time goes on, more and more data blocks will be changed, thus making snapshots "occupy" more and more space on the disk. It is therefore recommended to keep only a minimal set of them and remove unnecessary ones to avoid wasting space on the volume.

The following illustrates how to take a snapshot of the VFS root:

# btrfs subvolume snapshot / /snap-2011-05-23
Create a snapshot of '/' in '//snap-2011-05-23'

Once created, the snapshot will persist in /snap-2011-05-23 as long as you don't delete it. Note that the snapshot contents will remain exactly as they were at the time it was taken (as long as you don't make changes... BTRFS snapshots are writable!). A drawback of having snapshots: if you delete some files in the original filesystem, the snapshot still contains them and the disk blocks can't be reclaimed as free space. Remember to remove unwanted snapshots and keep a bare minimal set of them.

Listing and deleting snapshots

As there is no distinction between a snapshot and a subvolume, snapshots are managed with the exact same commands, especially when the time has come to delete some of them. An interesting feature in BTRFS is that snapshots are writable. You can take a snapshot and make changes in the files/directories it contains. A word of caution: there are no undo capabilities! What has been changed has been changed forever... If you need to do several tests just take several snapshots or, better yet, snapshot your snapshot, then do whatever you need in this copy-of-the-copy :-).
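Keeping only a bare minimal set of snapshots is easy to automate when their names end with an ISO date, as in /snap-2011-05-23: a plain sort then orders them oldest first. The sketch below deletes all but the newest few. The names prune_snapshots and run are hypothetical, snapshot names must not contain whitespace, and commands are only printed unless DRY_RUN=0 is set (do keep it that way until the list looks right: there is no undo).

```shell
#!/bin/sh
# Hypothetical helper (not from the article): print commands unless DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# prune_snapshots KEEP NAME...
# Delete all but the KEEP newest snapshots among NAME... Relies on names
# ending in YYYY-MM-DD so that lexical order equals chronological order.
prune_snapshots() {
    keep=$1; shift
    total=$#
    [ "$total" -gt "$keep" ] || return 0
    printf '%s\n' "$@" | sort | head -n $((total - keep)) |
    while read -r snap; do
        run btrfs subvolume delete "$snap"
    done
}

prune_snapshots 2 /snap-2011-05-23 /snap-2011-05-21 /snap-2011-05-24 /snap-2011-05-22
```

With the four example names and KEEP set to 2, the two oldest snapshots (2011-05-21 and 2011-05-22) are selected for deletion.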

Using snapshots for system recovery (aka Back to the Future)

Here is where BTRFS can literally be a lifeboat. Suppose you want to apply some updates via emerge -uaDN @world but you want to be sure that you can jump back into the past in case something goes seriously wrong after the system update (does libpng14 remind you of anything?!). Here is the "putting-things-together part" of the article!

The following only applies if your VFS root and system directories (/sbin, /bin, /usr, /etc...) are located on a BTRFS volume. To make things simple, the whole structure is assumed to be located in the SAME subvolume of the same BTRFS volume.

To jump back into the past you have at least two options:

1. Fiddle with the default subvolume numbers
2. Use the kernel command line parameters in the bootloader configuration files

In all cases you must take a snapshot of your VFS root *before* updating the system:

# btrfs subvolume snapshot / /before-updating-2011-05-24
Create a snapshot of '/' in '//before-updating-2011-05-24'

Note
Hint: You can create an empty file at the root of your snapshot with the name of your choice to help you easily identify which subvolume is the currently mounted one (e.g. if the snapshot has been named before-updating-2011-05-24, you can use a slightly different name like current-is-before-updating-2011-05-24 => touch /before-updating-2011-05-24/current-is-before-updating-2011-05-24). This is extremely useful if you are dealing with several snapshots.
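The snapshot and its marker file can be created in one go. The following is only a sketch: run and snapshot_with_marker are hypothetical helper names, and the DRY_RUN variable is an assumption introduced here so that the commands can be previewed before anything is touched.

```shell
#!/bin/sh
# Hypothetical helpers: snapshot the VFS root and drop a marker file inside
# the snapshot. With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_with_marker() {
    name="$1"
    # The marker is only created if the snapshot itself succeeded.
    run btrfs subvolume snapshot / "/$name" &&
        run touch "/$name/current-is-$name"
}

# Example (prints the two commands, changes nothing):
#   DRY_RUN=1 snapshot_with_marker before-updating-2011-05-24
```

Because the marker lives inside the snapshot only, it shows up at / solely when the snapshot is the mounted VFS root.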

There is no "better" way; it's just a question of personal preference.

Way #1: Fiddle with the default subvolume number

Hypothesis:

Your "production" VFS root partition resides in the root subvolume (subvolid=0),
Your /boot partition (where the bootloader configuration files are stored) is on another standalone partition

First search for the newly created subvolume number:

# btrfs subvolume list /
ID 256 top level 5 path before-updating-2011-05-24

'256' is the ID to be retained (of course, this ID will differ in your case).
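If you have many subvolumes, picking the ID by eye gets tedious. A minimal sketch, assuming the listing format shown above (first field "ID", second field the number, last field the path) and snapshot names without spaces; subvol_id is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read `btrfs subvolume list` output on stdin and print
# the ID of the subvolume whose path equals $1.
# Assumption: the path is the last whitespace-separated field of each line.
subvol_id() {
    awk -v want="$1" '$1 == "ID" && $NF == want { print $2 }'
}

# Typical use:
#   btrfs subvolume list / | subvol_id before-updating-2011-05-24
```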

Now, change the default subvolume of the BTRFS volume to designate the subvolume (snapshot) before-updating-2011-05-24 instead of the root subvolume, then reboot:

# btrfs subvolume set-default 256 /

Once the system has rebooted, and if you followed the earlier hint that suggests creating an empty marker file named after the snapshot, you should be able to see whether the mounted VFS root is the copy held by the snapshot before-updating-2011-05-24:

# ls -l /
...
-rw-rw-rw- 1 root root 0 May 24 20:33 current-is-before-updating-2011-05-24
...

The correct subvolume has been used for mounting the VFS! Excellent! Now is the time to mount your "production" VFS root (remember that the root subvolume can only be accessed via its identification number, i.e. 0):

# mount -o subvolid=0 /dev/sda2 /mnt
# mount
...
/dev/sda2 on /mnt type btrfs (rw,subvolid=0)

Oh by the way, as the root subvolume is now mounted in /mnt, let's try something, just for the sake of the demonstration:

# ls /mnt
...
drwxr-xr-x 1 root root 0 May 24 20:33 current-is-before-updating-2011-05-24
...
# btrfs subvolume list /mnt
ID 256 top level 5 path before-updating-2011-05-24

No doubt possible :-) Time to rollback! For this rsync will be used in the following way:

# rsync --progress -aHAX --exclude=/proc --exclude=/dev --exclude=/sys --exclude=/mnt / /mnt

Basically we are asking rsync to:

preserve timestamps, hard and symbolic links, owner/group IDs, ACLs and any extended attributes (refer to the rsync manual page for further details on the options used) and report its progress
ignore the mount points where virtual filesystems are mounted (procfs, sysfs...)
avoid re-recursion by not reprocessing /mnt (you can speed up the process by excluding some extra directories if you are sure they don't hold any important changes or any change at all, like /var/tmp/portage for example).
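To make the exclusion list easier to extend, the command line can be assembled from a list of directories. This is just a sketch: rollback_cmd is a hypothetical helper that prints the rsync command rather than running it, so you can inspect it first (rsync itself also offers -n / --dry-run for a trial run).

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the rsync command used for the rollback.
#   $1     : where the "production" root subvolume is mounted (e.g. /mnt)
#   $2...  : optional extra directories to exclude
# Assumption: directory names contain no spaces.
rollback_cmd() {
    target="$1"
    shift
    cmd="rsync --progress -aHAX"
    # Virtual filesystems and the target itself are always excluded.
    for dir in /proc /dev /sys "$target" "$@"; do
        cmd="$cmd --exclude=$dir"
    done
    echo "$cmd / $target"
}

# Example:
#   rollback_cmd /mnt /var/tmp/portage
```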

Be patient! The resync may take several minutes or hours depending on the amount of data to process...

Once finished, you will have to set the default subvolume to be the root subvolume:

# btrfs subvolume set-default 0 /mnt
ID 256 top level 5 path before-updating-2011-05-24

Warning
DO NOT ENTER / instead of /mnt in the above command; it won't work and you will be under the snapshot before-updating-2011-05-24 the next time the machine reboots.

The reason is that the subvolume number must be "visible" from the path given at the end of the btrfs subvolume set-default command line. Again, refer to the boxes analogy: in our context we are in a subbox numbered 256 which is located *inside* the box numbered 0 (so it can neither see nor interfere with it). [TODO: better explain]

Now just reboot and you should be in business again! Once you have rebooted just check if you are really under the right subvolume:

# ls /
...
drwxr-xr-x 1 root root 0 May 24 20:33 current-is-before-updating-2011-05-24
...
# btrfs subvolume list /
ID 256 top level 5 path before-updating-2011-05-24

At the right place? Excellent! You can now delete the snapshot if you wish, or better: keep it as a lifeboat of "last known good system state."
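Besides looking for the marker file, you can ask BTRFS which subvolume is currently the default one with btrfs subvolume get-default. The sketch below only parses its output; default_subvol_id is a hypothetical helper, and it assumes the output starts with "ID <number>" (recent btrfs-progs report the root subvolume as ID 5, for which subvolid=0 is an alias; older releases may differ).

```shell
#!/bin/sh
# Hypothetical helper: read the output of `btrfs subvolume get-default <path>`
# on stdin and print the numeric ID of the default subvolume.
# Assumption: the first line starts with "ID <number>".
default_subvol_id() {
    awk '$1 == "ID" { print $2; exit }'
}

# Typical use:
#   btrfs subvolume get-default / | default_subvol_id
```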

Way #2: Change the kernel command line in the bootloader configuration files

First search for the newly created subvolume number:

# btrfs subvolume list /
ID 256 top level 5 path before-updating-2011-05-24

'256' is the ID to be retained (can differ in your case).

Now with your favourite text editor, edit the appropriate kernel command line in your bootloader configuration (/etc/boot.conf). This file is typically organized in several sections (one per kernel present on the system plus some global settings), like the excerpt below:

set timeout=5
set default=0

# Production kernel
menuentry "Funtoo Linux production kernel (2.6.39-gentoo x86/64)" {
  insmod part_msdos
  insmod ext2
  ...
  set root=(hd0,1)
  linux /kernel-x86_64-2.6.39-gentoo root=/dev/sda2
  initrd /initramfs-x86_64-2.6.39-gentoo
}
...

Find the correct kernel line and add one of the following statements after root=/dev/sdX:

rootflags=subvol=before-updating-2011-05-24
- Or -
rootflags=subvolid=256

Warning
If the kernel you want to use has been generated with Genkernel, you MUST use real_rootflags=subvol=... instead of rootflags=subvol=...; otherwise your rootflags will not be taken into consideration by the kernel on reboot.

Applied to the previous example, you will get the following if you referred to the subvolume by its name:

set timeout=5
set default=0

# Production kernel
menuentry "Funtoo Linux production kernel (2.6.39-gentoo x86/64)" {
  insmod part_msdos
  insmod ext2
  ...
  set root=(hd0,1)
  linux /kernel-x86_64-2.6.39-gentoo root=/dev/sda2 rootflags=subvol=before-updating-2011-05-24
  initrd /initramfs-x86_64-2.6.39-gentoo
}
...

Or you will get the following if you referred to the subvolume by its identification number:

set timeout=5
set default=0

# Production kernel
menuentry "Funtoo Linux production kernel (2.6.39-gentoo x86/64)" {
  insmod part_msdos
  insmod ext2
  ...
  set root=(hd0,1)
  linux /kernel-x86_64-2.6.39-gentoo root=/dev/sda2 rootflags=subvolid=256
  initrd /initramfs-x86_64-2.6.39-gentoo
}
...
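The four possible spellings of the parameter (by name or by ID, Genkernel or not) can be summed up in a few lines. This is only an illustration: rootflags_param is a hypothetical helper, and it treats any all-digit argument as an ID, so it would misread a subvolume whose name consists only of digits.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel parameter to append after root=...
#   $1 : subvolume name or numeric ID
#   $2 : "yes" if the kernel/initramfs was generated with Genkernel
# Assumption: an argument made only of digits is an ID, anything else a name.
rootflags_param() {
    selector="$1"
    genkernel="${2:-no}"
    case "$selector" in
        ''|*[!0-9]*) opt="subvol=$selector" ;;
        *)           opt="subvolid=$selector" ;;
    esac
    if [ "$genkernel" = yes ]; then
        echo "real_rootflags=$opt"
    else
        echo "rootflags=$opt"
    fi
}

# Examples:
#   rootflags_param 256
#   rootflags_param before-updating-2011-05-24 yes
```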

Once the modifications are done, save your changes and take the necessary extra steps to commit the configuration changes on the first sectors of the disk if needed (this mostly applies to the users of LILO; Grub and SILO do not need to be refreshed) and reboot.

Once the system has rebooted, and if you followed the earlier hint that suggests creating an empty marker file named after the snapshot, you should be able to see whether the mounted VFS root is the copy held by the snapshot before-updating-2011-05-24:

# ls -l /
...
-rw-rw-rw- 1 root root 0 May 24 20:33 current-is-before-updating-2011-05-24
...

The correct subvolume has been used for mounting the VFS! Excellent! Now is the time to mount your "production" VFS root (remember that the root subvolume can only be accessed via its identification number, 0):

# mount -o subvolid=0 /dev/sda2 /mnt
# mount
...
/dev/sda2 on /mnt type btrfs (rw,subvolid=0)

Time to rollback! For this rsync will be used in the following way:

# rsync --progress -aHAX --exclude=/proc --exclude=/dev --exclude=/sys --exclude=/mnt / /mnt

Here, please refer to what has been said in Way #1 concerning the rsync options used. Once everything is in place again, edit your bootloader configuration to remove the rootflags/real_rootflags kernel parameter, reboot and check if you are really under the right subvolume:

# ls /
...
drwxr-xr-x 1 root root 0 May 24 20:33 current-is-before-updating-2011-05-24
...
# btrfs subvolume list /
ID 256 top level 5 path current-is-before-updating-2011-05-24

At the right place? Excellent! You can now delete the snapshot if you wish, or better: keep it as a lifeboat of "last known good system state."

Some BTRFS practices / returns of experience / gotchas

Although BTRFS is still evolving, at the date of writing it (still) is an experimental filesystem and should not be used for production systems or for storing critical data (even if the data is non critical, having backups on a partition formatted with a "stable" filesystem like Reiser or ext3/4 is recommended).
From time to time some changes are brought to the metadata (the BTRFS format is not definitive at the date of writing) and a BTRFS partition might not be usable with older Linux kernels (this happened with Linux 2.6.31).
More and more Linux distributions are proposing the filesystem as an alternative to ext4.
Some reported gotchas: https://btrfs.wiki.kernel.org/index.php/Gotchas
Playing around with BTRFS can be a bit tricky, especially when dealing with default volumes and mount points (again: the box analogy).
Using compression (e.g. LZO => mount -o compress=lzo) on the filesystem can improve the throughput performance; however, many files nowadays are already compressed at the application level (music, pictures, videos...).
Using space caching capabilities (mount -o space_cache) seems to bring some slight extra performance improvements.
There is a very interesting discussion on BTRFS design limitations with B-Trees on LKML. We strongly encourage you to read about it.

Deploying a Funtoo instance in a subvolume other than the root subvolume

Some Funtoo core devs have used BTRFS for many months and no major glitches have been reported so far, except for a non-aligned memory access trap on SPARC64 in a checksum calculation routine (the latest kernels may have brought a fix) and one incident a long time ago, which was related to a kernel crash caused by a bug that corrupted some internal data rather than to the filesystem code itself.

The following can simplify your life in case of recovery (not tested):

When you prepare the disk space that will hold the root of your future Funtoo instance (and so will hold /usr, /bin, /sbin, /etc, etc.), don't use the root subvolume but take an extra step to define a subvolume as illustrated below:

# fdisk /dev/sda2
....

# mkfs.btrfs /dev/sda2
# mount /dev/sda2 /mnt/funtoo
# btrfs subvolume create /mnt/funtoo/live-vfs-root-20110523
# chroot /mnt/funtoo/live-vfs-root-20110523 /bin/bash

Then either:

Set the subvolume /live-vfs-root-20110523 as the default subvolume (btrfs subvolume set-default...; remember to look up the subvolume identification number first)
Use rootflags / real_rootflags (always use real_rootflags for kernels generated with Genkernel) on the kernel command line in your bootloader configuration file

Technically speaking, it won't change your life EXCEPT at system recovery: when you want to roll back to a functional VFS root copy because something happened (buggy system package, too aggressive cleanup that removed Python, dead compiling toolchain...) you can avoid a time-consuming rsync, at the cost of a bit of extra overhead when taking a snapshot.

Here again you have two ways to recover the system:

Fiddling with the default subvolume:
1. Mount the unnamed root volume somewhere (e.g. mount -o subvolid=0 /dev/sdX /mnt)
2. Take a snapshot (remember to check its identification number) of your current subvolume and store it under the root volume you have just mounted (btrfs subvolume snapshot / /mnt/before-updating-20110524) -- (Where is the "frontier"? If 0 is mounted, do its contents also appear in the snapshot taken on the same volume?)
3. Update your system or do whatever other "dangerous" operation
4. If you need to return to the last known good system state, just set the default subvolume to the snapshot just taken (btrfs subvolume set-default <snapshotnumber here> /mnt)
5. Reboot
6. Once you have rebooted, just mount the root subvolume again and delete the subvolume that corresponds to the failed system update (btrfs subvolume delete /mnt/<buggy VFS root snapshot name here>)

Fiddling with the kernel command line:
1. Mount the unnamed root volume somewhere (e.g. mount -o subvolid=0 /dev/sdX /mnt)
2. Take a snapshot (remember to check its identification number) of your current subvolume and store it under the root volume you have just mounted (btrfs subvolume snapshot / /mnt/before-updating-20110524) -- (Where is the "frontier"? If 0 is mounted, do its contents also appear in the snapshot taken on the same volume?)
3. Update your system or do whatever other "dangerous" operation
4. If you need to return to the last known good system state, just set the rootflags/real_rootflags as demonstrated in previous paragraphs in your bootloader configuration file
5. Reboot
6. Once you have rebooted, just mount the root subvolume again and delete the subvolume that corresponds to the failed system update (btrfs subvolume delete /mnt/<buggy VFS root snapshot name here>)
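The steps shared by both recovery paths can be written down as a small "plan printer". Like the procedure itself, this is untested on a real system: the two functions are hypothetical, they only print the commands described above, and /mnt as the mount point is taken from the examples in this section.

```shell
#!/bin/sh
# Hypothetical helpers: print (not run) the commands of the procedure above.

# Before the risky operation: mount the root volume and snapshot the live root.
#   $1 : BTRFS device (e.g. /dev/sdX)    $2 : snapshot name
print_snapshot_plan() {
    dev="$1"
    snap="$2"
    cat <<EOF
mount -o subvolid=0 $dev /mnt
btrfs subvolume snapshot / /mnt/$snap
btrfs subvolume list /mnt
EOF
}

# If things went wrong (default-subvolume variant): boot the snapshot instead.
#   $1 : ID of the snapshot, as reported by `btrfs subvolume list /mnt`
print_rollback_plan() {
    id="$1"
    echo "btrfs subvolume set-default $id /mnt"
    echo "reboot"
}

# Example:
#   print_snapshot_plan /dev/sdX before-updating-20110524
#   print_rollback_plan 256
```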

Space recovery / defragmenting the filesystem

Tip
From time to time it is advised to ask for re-optimizing the filesystem structures and data blocks in a subvolume. In BTRFS terminology this is called a defragmentation and it can only be performed when the subvolume is mounted in the VFS (online defragmentation):

# btrfs filesystem defrag /mnt

You can still access the subvolume, even change its contents, while a defragmentation is running.

It is also a good idea to remove the snapshots you don't use anymore, especially if huge files and/or lots of files are changed, because snapshots will still hold some blocks that could otherwise be reused.

SSE 4.2 boost

If your CPU supports hardware calculation of CRC32 (e.g. Intel Nehalem series and later, and AMD Bulldozer series), you are encouraged to enable that support in your kernel since BTRFS makes aggressive use of it. Just check that you have enabled CRC32c INTEL hardware acceleration in the Cryptographic API, either as a module or as a built-in feature.
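A quick way to see whether your CPU advertises SSE 4.2 is to look for the sse4_2 flag in /proc/cpuinfo. The helper name below is hypothetical; the kernel option behind "CRC32c INTEL hardware acceleration" is CONFIG_CRYPTO_CRC32C_INTEL, and the commented zgrep line assumes your kernel exposes its configuration in /proc/config.gz.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the cpuinfo-style text on stdin lists the
# sse4_2 CPU flag (-w avoids matching only part of a longer flag name).
has_sse42() {
    grep -qw 'sse4_2'
}

# Typical use:
#   has_sse42 < /proc/cpuinfo && echo "CPU offers hardware CRC32c"
# Kernel side (only works if /proc/config.gz is available):
#   zgrep CONFIG_CRYPTO_CRC32C_INTEL /proc/config.gz
```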

Recovering an apparently dead BTRFS filesystem

Dealing with filesystem metadata coherence is critical in a filesystem design. Losing some data blocks (i.e. having some corrupted files) is less critical than having a screwed-up and unmountable filesystem, especially if you do backups on a regular basis (the rule with BTRFS is *do backups*, BTRFS has no mature filesystem repair tool and you *will* end up having to re-create your filesystem from scratch sooner or later).

Mounting with recovery option (Linux 3.2 and beyond)

If you are using Linux 3.2 or later (only!), you can use the recovery option to make BTRFS seek a usable copy of the tree root (several copies of it exist on the disk). Just mount your filesystem as:

# mount -o recovery /dev/yourBTRFSvolume /mount/point

btrfs-select-super / btrfs-zero-log

Two other handy tools exist but they are not deployed by default by the sys-fs/btrfs-progs ebuilds (even btrfs-progs-9999) because they only live in the "next" branch of the btrfs-progs Git repository:

btrfs-select-super
btrfs-zero-log

Building the btrfs-progs goodies

The two tools this section is about are not built by default, and the Funtoo ebuilds do not build them either for the moment. So you must build them manually:

# mkdir ~/src
# cd ~/src
# git clone git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
# cd btrfs-progs
# make && make btrfs-select-super && make btrfs-zero-log

Note
In the past, btrfs-select-super and btrfs-zero-log lived in the git-next branch; this is no longer the case and those tools are now available in the master branch.

Fixing dead superblock

In case of a corrupted superblock, start by asking btrfsck to use an alternate copy of the superblock instead of the superblock #0. This is achieved via the -s option followed by the number of the alternate copy you wish to use. In the following example we ask for using the superblock copy #2 of /dev/sda7:

# ./btrfsck -s 2 /dev/sda7

When btrfsck is happy, use btrfs-select-super to restore the default superblock (copy #0) with a clean copy. In the following example we ask for restoring the superblock of /dev/sda7 with its copy #2:

# ./btrfs-select-super -s 2 /dev/sda7

Note that this will overwrite all the other supers on the disk, which means you really only get one shot at it.

If you run btrfs-select-super prior to figuring out which one is good, you've lost your chance to find a good one.
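Since you only get one shot, it is worth checking every alternate copy with btrfsck first and committing with the select tool only afterwards. The sketch below just prints the check commands in order; plan_superblock_checks is a hypothetical helper, and the assumption that copies #1 and #2 are the alternates to try is ours (a copy may be absent on small partitions).

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a btrfsck invocation for each alternate
# superblock copy of the given device, so all of them can be compared before
# anything is overwritten.
# Assumption: the alternates are copies #1 and #2 (copy #0 is the default).
plan_superblock_checks() {
    dev="$1"
    for copy in 1 2; do
        echo "./btrfsck -s $copy $dev"
    done
}

# Example:
#   plan_superblock_checks /dev/sda7
```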

Clearing the BTRFS journal

This will only help with one specific problem!

If you are unable to mount a BTRFS partition after a hard shutdown, crash or power loss, it may be due to faulty log playback in kernels prior to 3.2. The first thing to try is updating your kernel, and mounting. If this isn't possible, an alternate solution lies in truncating the BTRFS journal, but only if you see "replay_one_*" functions in the oops callstack.

To truncate the journal of a BTRFS partition (and thereby lose any changes that only exist in the log!), just pass the filesystem to process to btrfs-zero-log:

# ./btrfs-zero-log /dev/sda7

This is not a generic technique, and works by permanently throwing away a small amount of potentially good data.
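Because the tool only helps in that one situation, it makes sense to check the kernel log for the telltale functions before reaching for it. A minimal sketch; looks_like_log_replay_bug is a hypothetical helper that simply searches its input for "replay_one_":

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel log text on stdin mentions one of
# the replay_one_* functions, i.e. the only case where zeroing the log applies.
looks_like_log_replay_bug() {
    grep -q 'replay_one_'
}

# Typical use after a failed mount attempt:
#   dmesg | looks_like_log_replay_bug && echo "btrfs-zero-log may apply"
```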

Using btrfsck

Warning
Extremely experimental...

If one thing is famous in the BTRFS world it would be the long-awaited fully functional btrfsck. A read-only version of the tool had existed for years; however, at the beginning of 2012, BTRFS developers made a public and very experimental release: the secret jewel lies in the branch dangerdonteveruse of the BTRFS Git repository held by Chris Mason on kernel.org.

# git clone git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
# cd btrfs-progs
# git checkout dangerdonteveruse
# make

So far the tool can:

Fix errors in the extents tree and in blocks groups accounting

Wipe the CRC tree and create a brand new one (you can mount the filesystem with CRC checking disabled)

To repair:

# btrfsck --repair /dev/yourBTRFSvolume

To wipe the CRC tree:

# btrfsck --init-csum-tree /dev/yourBTRFSvolume

Two other options exist in the source code: --super (equivalent of btrfs-select-super ?) and --init-extent-tree (clears out any extent?)

Final words

We have only given the broad outlines here, and BTRFS can be very tricky, especially when several subvolumes coming from several BTRFS volumes are used. And remember: BTRFS is still experimental at date of writing :)

Lessons learned

Very interesting but still lacks some important features present in ZFS like RAID-Z, virtual volumes, management by attributes, filesystem streaming, etc.
Extremely interesting for Gentoo/Funtoo system partitions (snapshot/rollback capabilities). However, not integrated in portage yet.
If possible, use a file monitoring tool like TripWire; this is handy to see which files have been corrupted once the filesystem is recovered or if a bug happens.
It is highly advised not to use the root subvolume when deploying a new Funtoo instance, or to put any kind of data on it in a more general case. Rolling back a data snapshot will be much easier and much less error prone (no copy process, just a matter of 'swapping' the subvolumes).
Backup, backup, backup your data! ;)

Categories: Labs Articles Featured Filesystems

This page was last modified on December 28, 2014, at 09:41.