Discussion: atime and filesystems with snapshots (especially Btrfs)
Alexander Block
2012-05-25 15:35:37 UTC
Hello,

(this is a resend with proper CC for linux-fsdevel and linux-kernel)

I would like to start a discussion on atime in Btrfs (and other
filesystems with snapshot support).

As atime is updated on every access of a file or directory, we get
many changes to the trees in btrfs, which as always trigger CoW
operations. This is no problem as long as the changed tree blocks are
not shared by other subvolumes. Performance is also not a problem,
whether the blocks are shared or not (thanks to relatime, which is
the default).
The problems start when someone starts to use snapshots. If you, for
example, snapshot your root and continue working on it, after some
time big parts of the tree will be CoW'd and unshared. In the worst
case, the whole tree gets unshared and thus takes up double the
space. Normally, a user would expect a tree to take up extra space
only when he changes something.
A worst-case scenario would be someone taking regular snapshots for
backup purposes and later grepping the contents of all snapshots to
find a specific file. This would touch all inodes in all trees and
thus unshare big parts of every tree.

relatime (which is the default) reduces this problem a little, as it
only updates atime once a day. This means that if you want to test
this problem, you have to mount with relatime disabled or change the
system date before trying to trigger atime updates (that's how I
tested it).
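
For example, mounting with strictatime should force an atime update on every access (a sketch; the device and mount point are placeholders):

# mount -o strictatime /dev/sdX /mnt

Alternatively, with relatime you can jump the system clock past the 24-hour window (GNU date):

# date -s "$(date -d '+2 days')"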

As a solution, I would suggest making noatime the default for btrfs.
I'm not sure, however, whether Linux allows different default mount
options for different filesystem types. I know this discussion pops
up every few years (last time it resulted in making relatime the
default), but btrfs is a special case: atime is already bad on other
filesystems, and it's much, much worse in btrfs.

Alex.
Josef Bacik
2012-05-25 15:42:50 UTC
Just mount with -o noatime; there's no chance of turning something like that on
by default, since it would break some applications (notably mutt). Thanks,

Josef
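
To make that suggestion permanent, an fstab entry along these lines should work (a sketch; the UUID is a placeholder):

UUID=<fs-uuid>  /  btrfs  defaults,noatime  0  0

or remount a live filesystem without rebooting:

# mount -o remount,noatime /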
Alexander Block
2012-05-25 15:59:45 UTC
I know about the discussions regarding compatibility with existing
applications. The problem here is that it is not only a compatibility
problem: having atime enabled by default may give you ENOSPC
for reasons that a normal user does not understand or expect.
As a normal user, I would think: if I never change anything, why
does it take up more space just because I read it?
Andreas Dilger
2012-05-25 16:28:16 UTC
Are you talking about the atime for the primary copy, or the atime for the snapshots? IMHO, the atime should not be updated for a snapshot unless it is explicitly mounted r/w, or it isn't really a good snapshot.
Alexander Block
2012-05-25 16:38:54 UTC
Snapshots are r/w by default but can be created r/o explicitly. That
doesn't matter for the normal use case, though, where you snapshot /
and continue working on /. After snapshotting, all metadata is shared
between the two subvolumes, but when a metadata block changes in
either subvolume (no matter which one), that metadata block gets
CoW'd and unshared and uses up more space.
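
A sketch of the two variants at creation time (paths are placeholders):

# btrfs subvolume snapshot / /snapshots/root-rw
# btrfs subvolume snapshot -r / /snapshots/root-ro

The first creates the default writable snapshot; the -r flag makes the snapshot read-only.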
Alexander Block
2012-05-25 16:48:30 UTC
Atime is metadata. Thus, by reading a file, only the metadata block
for that file is CoW'd... not the actual file data blocks. IOW, your
snapshots won't change and suddenly balloon in size from reading
files (metadata blocks are tiny).
And, if they do, then something is horribly wrong with the snapshot
system. Fixing that would be more important than changing the default
mount options. :)
That's true, metadata blocks are tiny. But they still cost space, and
if you run through the whole tree and access all files/directories
(e.g. with grep, rsync, diff, or whatever), a lot of (probably all)
metadata blocks are affected, which can be megabytes or even
gigabytes. All those metadata blocks get CoW'd and unshared, and thus
use up more and more space. If you use snapshots and get to a point
where nearly no space is left, a simple search for files to delete
may already result in no space left. If you use hundreds (or
millions... there is no limit on snapshot count) of snapshots, the
problem gets worse and worse.
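
To put rough numbers on that (taking the metadata size from the test further down purely as an illustration): ~300 MB of shared tree metadata, fully unshared across 100 snapshots, would be on the order of 100 x 300 MB = ~30 GB.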
Alexander Block
2012-05-25 19:10:43 UTC
Just to show some numbers, I did a simple test on a fresh btrfs
filesystem. I copied my host's /usr folder (4 GB) to it and checked
metadata usage with "btrfs fi df /mnt", which was around 300 MB. Then
I created 10 snapshots and checked metadata usage again; it didn't
change much. Then I ran "grep foobar /mnt -R" to update the atime of
all files. After this finished, metadata usage was 2.59 GB. So I lost
2.2 GB just because I searched for something. Someone who already has
nearly no space left probably won't even be able to move some data to
another disk, as the copy itself may hit ENOSPC.

Here is the output of the final "btrfs fi df":

# btrfs fi df /mnt
Data: total=6.01GB, used=4.19GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=2.59GB
Metadata: total=8.00MB, used=0.00
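
A rough sketch of the steps to reproduce this (device and paths are placeholders; strictatime is used so the system date doesn't have to be changed):

# mkfs.btrfs /dev/sdX
# mount -o strictatime /dev/sdX /mnt
# cp -a /usr /mnt/usr
# btrfs fi df /mnt
# for i in $(seq 10); do btrfs subvolume snapshot /mnt /mnt/snap$i; done
# btrfs fi df /mnt
# grep -R foobar /mnt > /dev/null
# btrfs fi df /mnt

The first df shows the baseline metadata usage, the second should barely change (the metadata is still shared with the snapshots), and the third shows the growth caused by the atime updates.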

I don't know much about other filesystems that support snapshots, but
I have the feeling that most of them would have the same problem.
Other filesystems in combination with LVM snapshots may also run into
it (I'm not very familiar with LVM). Filesystem image formats like
qcow, vmdk, vbox and so on may also have problems with atime.
Peter Maloney
2012-05-25 20:27:53 UTC
Did you run the recursive grep after each snapshot (which I would
expect to result in 11 times as many metadata blocks, max 3.3 GB), or
just once after all 10 snapshots (which I think would mean only 2x as
many metadata blocks, max 600 MB)?
Alexander Block
2012-05-25 20:42:21 UTC
I ran it only once, after creating all the snapshots. My expectation
is that the result would be the same in both cases. If all snapshots
have the file /foo/bar, then each individual snapshotted copy of it
gets a different atime and thus its own metadata block. As this
happens with all files, no matter in which order I iterate over them,
nearly all metadata blocks get their own copy.
Alexander Block
2012-05-25 20:48:41 UTC
Hmm, you may have assumed the snapshots were r/o. In my test, the
snapshots were all r/w. In the r/o case, I would have to do the
recursive grep after each snapshot creation to get the same result.
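
To make the difference concrete (paths are placeholders, and strictatime is assumed so every access counts): with r/w snapshots a single pass is enough, because the grep also updates the atime copies inside every snapshot:

# grep -R foobar /mnt > /dev/null

With r/o snapshots, the snapshots themselves should reject atime updates, so the unsharing only accumulates when the writable tree is touched between snapshot creations:

# for i in $(seq 10); do btrfs subvolume snapshot -r /mnt /mnt/snap$i; grep -R foobar /mnt > /dev/null; done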
