Discussion:
[PATCH] DM-CRYPT: Scale to multiple CPUs v3 on 2.6.37-rc* ?
Matt
2010-11-06 22:16:11 UTC
Permalink
Hi guys,

before diving into testing out 2.6.37-rc* kernels I wanted to make
sure that the patch:

[dm-devel] [PATCH] DM-CRYPT: Scale to multiple CPUs v3

is safe to use with e.g. >2.6.37-rc1 kernels

I know that it's not a "fix all" patch but it significantly seems to
speed up my backup jobs (by factor 2-3)
and 2.6.37* has evolved that much that interactivity isn't hurt too much

Thanks !

Matt
Milan Broz
2010-11-07 14:30:18 UTC
Permalink
On 11/06/2010 11:16 PM, Matt wrote:
> before diving into testing out 2.6.37-rc* kernels I wanted to make
> sure that the patch:
>
> [dm-devel] [PATCH] DM-CRYPT: Scale to multiple CPUs v3
>
> is safe to use with e.g. >2.6.37-rc1 kernels
>
> I know that it's not a "fix all" patch but it significantly seems to
> speed up my backup jobs (by factor 2-3)
> and 2.6.37* has evolved that much that interactivity isn't hurt too much

yes, it should work for the simple mappings without problems.

I hope we will fix the patch soon to be ready for upstream.

Milan
Matt
2010-11-07 17:49:09 UTC
Permalink
On Sun, Nov 7, 2010 at 3:30 PM, Milan Broz <***@redhat.com> wrote:
>
> On 11/06/2010 11:16 PM, Matt wrote:
>> before diving into testing out 2.6.37-rc* kernels I wanted to make
>> sure that the patch:
>>
>> [dm-devel] [PATCH] DM-CRYPT: Scale to multiple CPUs v3
>>
>> is safe to use with e.g. >2.6.37-rc1 kernels
>>
>> I know that it's not a "fix all" patch but it significantly seems to
>> speed up my backup jobs (by factor 2-3)
>> and 2.6.37* has evolved that much that interactivity isn't hurt too much
>
> yes, it should work for the simple mappings without problems.
>
> I hope we will fix the patch soon to be ready for upstream.
>
> Milan
>

Hi Milan,

thanks for your answer !

Unfortunately I have to post a "Warning" that it's currently not safe
(at least for me) to use it

a few hours ago before 2.6.37-rc1 was tagged I already had shortly
tested it with the dm-crypt multi-cpu patch and massive "silent" data
corruption or loss occurred:

fortunately I don't/didn't see any data-corruption on my /home
partition (yet) but every time I boot into my system things are screwed
up on the root-partition, e.g.:

where
eselect opengl list would show
>Available OpenGL implementations:
> [1] ati *
> [2] xorg-x11

normally it's
>cat /etc/env.d/03opengl
># Configuration file for eselect
># This file has been automatically generated.
>LDPATH="/usr/lib32/opengl/ati/lib:/usr/lib64/opengl/ati/lib"
>OPENGL_PROFILE="ati"


it currently says:

>eselect opengl list
>Available OpenGL implementations:
> [1] ati
> [2] xorg-x11


>cat /etc/env.d/03opengl
># Configuration file for eselect

and another example was a corrupted /etc/init.d/killprocs

so since this (a corrupted killprocs) already had happened in the past
(last time due to a hardlock with fglrx/amd's catalyst driver) I
thought it was some kind of system problem which could be fixed:
I fired up a rebuild-job (emerge -e system) for my system and (surely)
some other stuff disappeared - after 2-3 reboots I wanted to continue
finishing the rebuild and gcc was gone (!)

I don't have the time to re-test everything since this is my testing &
production machine (I'll play back a system-backup tarball) but this
didn't happen (yet) with 2.6.36 and
the following patches related to multi-cpu dm-crypt:

* [dm-devel] [PATCH] DM-CRYPT: Scale to multiple CPUs v3
* [PATCH] Use generic private pointer in per-cpu struct

so it seems to be safe.

It has to be changes which got introduced with 2.6.37* which broke
stuff. 2.6.36 seems to work perfectly fine with those 2 patches since
several days already

I'll stick with 2.6.36 for some time now

Thanks !

Matt
Matt
2010-11-07 19:32:05 UTC
Permalink
On Sun, Nov 7, 2010 at 6:49 PM, Matt <***@gmail.com> wrote:
> On Sun, Nov 7, 2010 at 3:30 PM, Milan Broz <***@redhat.com> wrote:
>>
>> On 11/06/2010 11:16 PM, Matt wrote:
>>> before diving into testing out 2.6.37-rc* kernels I wanted to make
>>> sure that the patch:
>>>
>>> [dm-devel] [PATCH] DM-CRYPT: Scale to multiple CPUs v3
>>>
>>> is safe to use with e.g. >2.6.37-rc1 kernels
>>>
>>> I know that it's not a "fix all" patch but it significantly seems to
>>> speed up my backup jobs (by factor 2-3)
>>> and 2.6.37* has evolved that much that interactivity isn't hurt too much
>>
>> yes, it should work for the simple mappings without problems.
>>
>> I hope we will fix the patch soon to be ready for upstream.
>>
>> Milan
>>
>
> Hi Milan,
>
> thanks for your answer !
>
> Unfortunately I have to post a "Warning" that it's currently not safe
> (at least for me) to use it
>
> a few hours ago before 2.6.37-rc1 was tagged I already had shortly
> tested it with the dm-crypt multi-cpu patch and massive "silent" data
> corruption or loss occurred:
>
> fortunately I don't/didn't see any data-corruption on my /home
> partition (yet) but every time I boot into my system things are screwed
> up on the root-partition, e.g.:
>
> where
> eselect opengl list would show
>>Available OpenGL implementations:
>>  [1]   ati *
>>  [2]   xorg-x11
>
> normally it's
>>cat /etc/env.d/03opengl
>># Configuration file for eselect
>># This file has been automatically generated.
>>LDPATH="/usr/lib32/opengl/ati/lib:/usr/lib64/opengl/ati/lib"
>>OPENGL_PROFILE="ati"
>
>
> it currently says:
>
>>eselect opengl list
>>Available OpenGL implementations:
>>  [1]   ati
>>  [2]   xorg-x11
>
>
>>cat /etc/env.d/03opengl
>># Configuration file for eselect
>
> and another example was a corrupted /etc/init.d/killprocs
>
> so since this (a corrupted killprocs) already had happened in the past
> (last time due to a hardlock with fglrx/amd's catalyst driver) I
> thought it was some kind of system problem which could be fixed:
> I fired up a rebuild-job (emerge -e system) for my system and (surely)
> some other stuff disappeared - after 2-3 reboots I wanted to continue
> finishing the rebuild and gcc was gone (!)
>
> I don't have the time to re-test everything since this is my testing &
> production machine (I'll play back a system-backup tarball) but this
> didn't happen (yet) with 2.6.36 and
> the following patches related to multi-cpu dm-crypt:
>
> * [dm-devel] [PATCH] DM-CRYPT: Scale to multiple CPUs v3
> * [PATCH] Use generic private pointer in per-cpu struct
>
> so it seems to be safe.
>
> It has to be changes which got introduced with 2.6.37* which broke
> stuff. 2.6.36 seems to work perfectly fine with those 2 patches since
> several days already
>
> I'll stick with 2.6.36 for some time now
>
> Thanks !
>
> Matt
>

sorry - I forgot to include the most important part:

the system-partition is on an LVM/Volume Group that sits on a
cryptsetup partition so:

cryptsetup (luks) -> LVM/Volume Group (2 partitions, one of them
system - the other swap) -> system (ext4)

[cryptsetup -> LVM -> ext4-partition]
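
For reference, a comparable stack is usually assembled roughly as follows -
the device name, volume group name and sizes are only placeholders, not my
actual values:

cryptsetup luksFormat /dev/sdXn            # create the LUKS container
cryptsetup luksOpen /dev/sdXn cryptroot    # map it to /dev/mapper/cryptroot
pvcreate /dev/mapper/cryptroot             # LVM physical volume on the crypt mapping
vgcreate vg0 /dev/mapper/cryptroot
lvcreate -L 20G -n root vg0                # logical volume for the system
lvcreate -L 4G -n swap vg0                 # logical volume for swap
mkfs.ext4 /dev/vg0/root                    # ext4 on the system volume
mkswap /dev/vg0/swap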

the mount-options were/are:

noatime,nodiratime,barrier=1

sometimes also

noatime,nodiratime,barrier=1,commit=600
(when the system runs for several hours to make it consume less
energy/write less)
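
(for reference, switching the already-mounted root to the longer commit
interval is just a remount - assuming / is the ext4 root in question:

mount -o remount,commit=600 /

and remounting with commit=5 restores the default interval)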

the other settings are:

echo "3000" > /proc/sys/vm/dirty_expire_centisecs
echo "1500" > /proc/sys/vm/dirty_writeback_centisecs
echo "15" > /proc/sys/vm/dirty_background_ratio
echo "50" > /proc/sys/vm/dirty_ratio
echo "50" > /proc/sys/vm/vfs_cache_pressure

i/o scheduler: cfq
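
for completeness, the scheduler itself is selected per block device via
sysfs - sda here is just an example device name:

echo cfq > /sys/block/sda/queue/scheduler
cat /sys/block/sda/queue/scheduler    # the active scheduler is shown in brackets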

as already mentioned - this problem didn't appear or wasn't noticeable
(yet) with 2.6.36 or earlier - my system memory should also be
error-free (tested via memtest86), the hard disk too (previously tested
several times via badblocks without errors)

during every shutdown, reboot, hibernate, etc.

I do a manual:

sync && sdparm -C sync /dev/foo

to make sure data gets to the partition

I read about barrier-problems and data getting to the partition when
using dm-crypt and several layers so I don't know if that could be
related

Regards

Matt
Andi Kleen
2010-11-07 19:45:47 UTC
Permalink
> I read about barrier-problems and data getting to the partition when
> using dm-crypt and several layers so I don't know if that could be
> related

Barriers seem to be totally broken on dm-crypt currently.

But that's probably not your problem. I use the scalability patch
on 2.6.36 and it's very stable. Most likely some mistake
in the forward port.

-Andi
Milan Broz
2010-11-07 21:39:23 UTC
Permalink
On 11/07/2010 08:45 PM, Andi Kleen wrote:
>> I read about barrier-problems and data getting to the partition when
>> using dm-crypt and several layers so I don't know if that could be
>> related
>
> Barriers seem to be totally broken on dm-crypt currently.

Can you explain it?

Barriers/flush change should work, if it is broken, it is not only dm-crypt.
(dm-crypt simply relies on dm-core implementation, when barrier/flush
request come to dmcrypt, all previous IO must be already finished).

Milan
Andi Kleen
2010-11-07 23:05:09 UTC
Permalink
On Sun, Nov 07, 2010 at 10:39:23PM +0100, Milan Broz wrote:
> On 11/07/2010 08:45 PM, Andi Kleen wrote:
> >> I read about barrier-problems and data getting to the partition when
> >> using dm-crypt and several layers so I don't know if that could be
> >> related
> >
> > Barriers seem to be totally broken on dm-crypt currently.
>
> Can you explain it?

e.g. the btrfs mailing list is full of corruption reports
on dm-crypt and most of the symptoms point to broken barriers.

> Barriers/flush change should work, if it is broken, it is not only dm-crypt.
> (dm-crypt simply relies on dm-core implementation, when barrier/flush
> request come to dmcrypt, all previous IO must be already finished).

Possibly, at least it doesn't seem to work.

-Andi
--
***@linux.intel.com -- Speaking for myself only.
Alasdair G Kergon
2010-11-08 14:16:36 UTC
Permalink
On Mon, Nov 08, 2010 at 12:05:09AM +0100, Andi Kleen wrote:
> e.g. the btrfs mailing list is full of corruption reports
> on dm-crypt and most of the symptoms point to broken barriers.

linux-btrfs? I'm not subscribed, but the searches I've tried
don't show it to be "full of corruption reports".

Could you post links to the threads concerned so we can investigate?

Are we just talking -rc1 or earlier too?

Thanks,
Alasdair
Mike Snitzer
2010-11-08 14:58:09 UTC
Permalink
On Sun, Nov 07 2010 at 6:05pm -0500,
Andi Kleen <***@firstfloor.org> wrote:

> On Sun, Nov 07, 2010 at 10:39:23PM +0100, Milan Broz wrote:
> > On 11/07/2010 08:45 PM, Andi Kleen wrote:
> > >> I read about barrier-problems and data getting to the partition when
> > >> using dm-crypt and several layers so I don't know if that could be
> > >> related
> > >
> > > Barriers seem to be totally broken on dm-crypt currently.
> >
> > Can you explain it?
>
> e.g. the btrfs mailing list is full of corruption reports
> on dm-crypt and most of the symptoms point to broken barriers.

[cc'ing linux-btrfs, hopefully in the future dm-devel will get cc'd when
concerns about DM come up on linux-btrfs (or other lists)]

I spoke with Josef Bacik and these corruption reports are apparently
against older kernels (e.g. <= 2.6.33). I say <= 2.6.33 because:

https://btrfs.wiki.kernel.org/index.php/Gotchas states:
"btrfs volumes on top of dm-crypt block devices (and possibly LVM)
require write-caching to be turned off on the underlying HDD. Failing to
do so, in the event of a power failure, may result in corruption not yet
handled by btrfs code. (2.6.33)"

But Josef was not aware of any reports with kernels newer than 2.6.32
(F12).
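
(For completeness, the write-caching workaround the wiki refers to is
normally applied with hdparm; /dev/sdX below is only a placeholder for the
underlying disk:

hdparm -W 0 /dev/sdX    # disable the drive's volatile write cache
hdparm -W 1 /dev/sdX    # re-enable it

Disabling the cache costs write performance, so it is a workaround rather
than a fix.)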

Josef also noted that until last week btrfs wouldn't retry another
mirror in the face of some corruption, the fix is here:
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=cb44921a09221

This obviously doesn't fix any source of corruption but it makes btrfs
more resilient when it encounters the corruption.

> > Barriers/flush change should work, if it is broken, it is not only dm-crypt.
> > (dm-crypt simply relies on dm-core implementation, when barrier/flush
> > request come to dmcrypt, all previous IO must be already finished).
>
> Possibly, at least it doesn't seem to work.

Can you please be more specific? What test(s)? What kernel(s)?

Any pointers to previous (and preferably: recent) reports would be
appreciated.

The DM barrier code has seen considerable change recently (via flush+fua
changes in 2.6.37). Those changes have been tested quite a bit
(including ext4 consistency after a crash).

But even prior to those flush+fua changes DM's support for barriers
(Linux >= 2.6.31) was held to be robust. No known (at least no
reported) issues with DM's barrier support.

Mike
Chris Mason
2010-11-08 17:59:45 UTC
Permalink
Excerpts from Mike Snitzer's message of 2010-11-08 09:58:09 -0500:
> On Sun, Nov 07 2010 at 6:05pm -0500,
> Andi Kleen <***@firstfloor.org> wrote:
>
> > On Sun, Nov 07, 2010 at 10:39:23PM +0100, Milan Broz wrote:
> > > On 11/07/2010 08:45 PM, Andi Kleen wrote:
> > > >> I read about barrier-problems and data getting to the partition when
> > > >> using dm-crypt and several layers so I don't know if that could be
> > > >> related
> > > >
> > > > Barriers seem to be totally broken on dm-crypt currently.
> > >
> > > Can you explain it?
> >
> > e.g. the btrfs mailing list is full of corruption reports
> > on dm-crypt and most of the symptoms point to broken barriers.
>
> [cc'ing linux-btrfs, hopefully in the future dm-devel will get cc'd when
> concerns about DM come up on linux-btrfs (or other lists)]
>
> I spoke with Josef Bacik and these corruption reports are apparently
> against older kernels (e.g. <= 2.6.33). I say <= 2.6.33 because:

We've consistently seen reports about corruptions on power hits with
dm-crypt. The logs didn't have any messages about barriers failing, but
the corruptions were still there. The most likely cause is that
barriers just aren't getting through somehow.

>
> https://btrfs.wiki.kernel.org/index.php/Gotchas states:
> "btrfs volumes on top of dm-crypt block devices (and possibly LVM)
> require write-caching to be turned off on the underlying HDD. Failing to
> do so, in the event of a power failure, may result in corruption not yet
> handled by btrfs code. (2.6.33)"
>
> But Josef was not aware of any reports with kernels newer than 2.6.32
> (F12).
>
> Josef also noted that until last week btrfs wouldn't retry another
> mirror in the face of some corruption, the fix is here:
> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=cb44921a09221
>
> This obviously doesn't fix any source of corruption but it makes btrfs
> more resilient when it encounters the corruption.

Right.

>
> > > Barriers/flush change should work, if it is broken, it is not only dm-crypt.
> > > (dm-crypt simply relies on dm-core implementation, when barrier/flush
> > > request come to dmcrypt, all previous IO must be already finished).
> >
> > Possibly, at least it doesn't seem to work.
>
> Can you please be more specific? What test(s)? What kernel(s)?
>
> Any pointers to previous (and preferably: recent) reports would be
> appreciated.
>
> The DM barrier code has seen considerable change recently (via flush+fua
> changes in 2.6.37). Those changes have been tested quite a bit
> (including ext4 consistency after a crash).
>
> But even prior to those flush+fua changes DM's support for barriers
> (Linux >= 2.6.31) was held to be robust. No known (at least no
> reported) issues with DM's barrier support.

I think it would be best to move forward with just hammering on the
dm-crypt barriers:

http://oss.oracle.com/~mason/barrier-test

This script is the best I've found so far to reliably trigger
corruptions with barriers off. I'd start with ext3 + barriers off just
to prove it corrupts things, then move to ext3 + barriers on.
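
To spell out the two configurations: with ext3 the barrier behaviour is a
mount option. The device and mount point below are placeholders, and the
script's invocation is documented in the script itself:

mkfs.ext3 /dev/mapper/test-crypt
mount -o barrier=0 /dev/mapper/test-crypt /mnt/test   # barriers off, expect corruption after power loss
umount /mnt/test
mount -o barrier=1 /dev/mapper/test-crypt /mnt/test   # barriers on, should stay consistent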

-chris
Mike Snitzer
2010-11-14 20:59:26 UTC
Permalink
On Mon, Nov 08 2010 at 12:59pm -0500,
Chris Mason <***@oracle.com> wrote:

> Excerpts from Mike Snitzer's message of 2010-11-08 09:58:09 -0500:
> > On Sun, Nov 07 2010 at 6:05pm -0500,
> > Andi Kleen <***@firstfloor.org> wrote:
> >
> > > On Sun, Nov 07, 2010 at 10:39:23PM +0100, Milan Broz wrote:
> > > > On 11/07/2010 08:45 PM, Andi Kleen wrote:
> > > > >> I read about barrier-problems and data getting to the partition when
> > > > >> using dm-crypt and several layers so I don't know if that could be
> > > > >> related
> > > > >
> > > > > Barriers seem to be totally broken on dm-crypt currently.
> > > >
> > > > Can you explain it?
> > >
> > > e.g. the btrfs mailing list is full of corruption reports
> > > on dm-crypt and most of the symptoms point to broken barriers.
> >
> > [cc'ing linux-btrfs, hopefully in the future dm-devel will get cc'd when
> > concerns about DM come up on linux-btrfs (or other lists)]
> >
> > I spoke with Josef Bacik and these corruption reports are apparently
> > against older kernels (e.g. <= 2.6.33). I say <= 2.6.33 because:
>
> We've consistently seen reports about corruptions on power hits with
> dm-crypt. The logs didn't have any messages about barriers failing, but
> the corruptions were still there. The most likely cause is that
> barriers just aren't getting through somehow.

Can't blame anyone for assuming as much (although it does create FUD)
but in practice (testing dm-crypt with ext4 using your barrier-test
script) I have not been able to see any evidence that dm-crypt's barrier
support is ineffective.

Could be that the barrier-test script isn't able to reproduce the unique
failure case that btrfs hits (on power failure)?

> > > > Barriers/flush change should work, if it is broken, it is not only dm-crypt.
> > > > (dm-crypt simply relies on dm-core implementation, when barrier/flush
> > > > request come to dmcrypt, all previous IO must be already finished).
> > >
> > > Possibly, at least it doesn't seem to work.
> >
> > Can you please be more specific? What test(s)? What kernel(s)?
> >
> > Any pointers to previous (and preferably: recent) reports would be
> > appreciated.

I still think we need specific bug reports that detail workloads and if
possible reproducers.

> > The DM barrier code has seen considerable change recently (via flush+fua
> > changes in 2.6.37). Those changes have been tested quite a bit
> > (including ext4 consistency after a crash).
> >
> > But even prior to those flush+fua changes DM's support for barriers
> > (Linux >= 2.6.31) was held to be robust. No known (at least no
> > reported) issues with DM's barrier support.
>
>