Discussion:
WARNING at fs/nfs/write.c:743 nfs_inode_remove_request with -rc6
Will Deacon
2014-09-23 13:03:52 UTC
Hi all,

I've been running into the following warning on an arm64 system running
3.17-rc6 with 64k pages. I've been unable to reproduce with a smaller page
size (4k).

I don't yet have a concrete reproducer, but I've seen it hit a few times
today just running a machine with an NFS root filesystem and using ssh.
The warning seems to happen in parallel on two CPUs, but I'm pretty
confident that our test_and_clear_bit implementation has the relevant
atomic instructions and memory barriers.
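
For reference, the check that fires looks roughly like this (paraphrased
from fs/nfs/write.c in -rc6 and heavily trimmed, so treat it as a sketch
rather than the exact source):

static void nfs_inode_remove_request(struct nfs_page *req)
{
	/* ... unlink the request from the page and inode under i_lock ... */

	if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags))
		nfs_release_request(req);	/* drop the extra inode reference */
	else
		WARN_ON_ONCE(1);		/* fs/nfs/write.c:743 */
}

So the warning means we reached teardown without PG_INODE_REF set.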

Any ideas?

Will

--->8

------------[ cut here ]------------
WARNING: CPU: 1 PID: 1023 at fs/nfs/write.c:743 nfs_inode_remove_request+0xe4/0xf0()
Modules linked in:
CPU: 1 PID: 1023 Comm: kworker/1:2 Not tainted 3.17.0-rc6 #1
Workqueue: nfsiod rpc_async_release
Call trace:
[<fffffe0000096410>] dump_backtrace+0x0/0x130
[<fffffe0000096550>] show_stack+0x10/0x1c
[<fffffe00004cda94>] dump_stack+0x74/0xbc
[<fffffe00000b4d20>] warn_slowpath_common+0x8c/0xb4
[<fffffe00000b4e0c>] warn_slowpath_null+0x14/0x20
[<fffffe000027a6a8>] nfs_inode_remove_request+0xe0/0xf0
[<fffffe000027b704>] nfs_write_completion+0xb4/0x150
[<fffffe0000276ef4>] nfs_pgio_release+0x34/0x44
[<fffffe00004ac2d0>] rpc_free_task+0x24/0x4c
[<fffffe00004ac5c0>] rpc_async_release+0xc/0x18
[<fffffe00000c89e8>] process_one_work+0x140/0x32c
[<fffffe00000c9338>] worker_thread+0x13c/0x470
[<fffffe00000cd9e4>] kthread+0xd0/0xe8
---[ end trace 6f044efb83f0811b ]---

------------[ cut here ]------------
WARNING: CPU: 0 PID: 621 at fs/nfs/write.c:743 nfs_inode_remove_request+0xe4/0xf0()
CPU: 0 PID: 621 Comm: kworker/0:2 Tainted: G W 3.17.0-rc6 #1
Workqueue: nfsiod rpc_async_release
Call trace:
[<fffffe0000096410>] dump_backtrace+0x0/0x130
[<fffffe0000096550>] show_stack+0x10/0x1c
[<fffffe00004cda94>] dump_stack+0x74/0xbc
[<fffffe00000b4d20>] warn_slowpath_common+0x8c/0xb4
[<fffffe00000b4e0c>] warn_slowpath_null+0x14/0x20
[<fffffe000027a6a8>] nfs_inode_remove_request+0xe0/0xf0
[<fffffe000027b704>] nfs_write_completion+0xb4/0x150
[<fffffe0000276ef4>] nfs_pgio_release+0x34/0x44
[<fffffe00004ac2d0>] rpc_free_task+0x24/0x4c
[<fffffe00004ac5c0>] rpc_async_release+0xc/0x18
[<fffffe00000c89e8>] process_one_work+0x140/0x32c
[<fffffe00000c9338>] worker_thread+0x13c/0x470
[<fffffe00000cd9e4>] kthread+0xd0/0xe8
---[ end trace 6f044efb83f0811c ]---
Weston Andros Adamson
2014-09-23 13:33:06 UTC
Post by Will Deacon
Hi all,

I've been running into the following warning on an arm64 system running
3.17-rc6 with 64k pages. I've been unable to reproduce with a smaller page
size (4k).

I don't yet have a concrete reproducer, but I've seen it hit a few times
today just running a machine with an NFS root filesystem and using ssh.
The warning seems to happen in parallel on two CPUs, but I'm pretty
confident that our test_and_clear_bit implementation has the relevant
atomic instructions and memory barriers.

Any ideas?

Will
So it looks like we're either calling nfs_inode_remove_request twice on a
request, or somehow not grabbing the inode reference for some request that
is in the async write path. It's interesting that these come in pairs -
that has to mean something!
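
For anyone following along, that reference is taken when the request is
first attached to the inode, roughly like this (my paraphrase of the 3.17
code, so details may differ from the real source):

static void nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
{
	/* ... link req into the inode's page list under i_lock ... */

	/* head request of a page group: take an extra reference and
	 * record it, so nfs_inode_remove_request() can find and drop
	 * exactly this reference later - that's what the WARN checks */
	WARN_ON(test_and_set_bit(PG_INODE_REF, &req->wb_flags));
	kref_get(&req->wb_kref);
}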

Any more info on how to reproduce this would be really great. Unfortunately
I don't have access to an arm64 system.

If it's possible, could we get a packet trace around when this happens?
This is pure speculation, but this might have something to do with the
resend path - a commit fails and all the requests on the commit list have
to be resent.

Have you noticed any side effects from this? That WARN_ON_ONCE was added
to sanity test the new page group code and we need to fix this, but I'm
wondering if anything "bad" happens...

-dros
Post by Will Deacon
[backtraces snipped]
Will Deacon
2014-09-23 13:59:38 UTC
Post by Weston Andros Adamson
Post by Will Deacon
[original report snipped]
So it looks like we're either calling nfs_inode_remove_request twice on a
request, or somehow not grabbing the inode reference for some request that
is in the async write path. It's interesting that these come in pairs -
that has to mean something!

Indeed. I have 6 CPUs on this system too, so it's not a per-cpu thing.
Post by Weston Andros Adamson
Any more info on how to reproduce this would be really great. Unfortunately
I don't have access to an arm64 system.
I've not spotted a pattern other than using 64k pages, yet. If I manage to
get a reproducer, I'll let you know.
Post by Weston Andros Adamson
If it's possible, could we get a packet trace around when this happens?
This is pure speculation, but this might have something to do with the
resend path - a commit fails and all the requests on the commit list have
to be resent.
Sure, once I can reproduce it reliably, I'll try to do that.
Post by Weston Andros Adamson
Have you noticed any side effects from this? That WARN_ON_ONCE was added
to sanity test the new page group code and we need to fix this, but I'm
wondering if anything "bad" happens...
I've not noticed anything. In fact, this happened during an LTP run and I
didn't see any regressions in the test results.

Will
Will Deacon
2014-09-23 14:53:56 UTC
On Tue, Sep 23, 2014 at 02:33:06PM +0100, Weston Andros Adamson wrote:
Post by Weston Andros Adamson
Any more info on how to reproduce this would be really great. Unfortunately
I don't have access to an arm64 system.
I've not spotted a pattern other than using 64k pages, yet. If I manage to
get a reproducer, I'll let you know.
Post by Weston Andros Adamson
If it's possible, could we get a packet trace around when this happens?
This is pure speculation, but this might have something to do with the
resend path - a commit fails and all the requests on the commit list have
to be resent.
Sure, once I can reproduce it reliably, I'll try to do that.
Right, a bunch of dd'ing from /dev/zero over an ssh session triggers this
very quickly. I've put a binary tcpdump here:

http://www.willdeacon.ukfsn.org/bitbucket/oopsen/nfs/tcpdump.bin

You may want to filter out the ssh packets. The only `interesting' thing
to me is a retransmission about halfway through, but I don't know what I'm
looking for.

The NFS server is 10.1.203.204 and the client is 10.1.203.24.

Will
Weston Andros Adamson
2014-09-23 14:59:25 UTC
Post by Will Deacon
Right, a bunch of dd'ing from /dev/zero over an ssh session triggers this
very quickly. I've put a binary tcpdump here:

http://www.willdeacon.ukfsn.org/bitbucket/oopsen/nfs/tcpdump.bin

You may want to filter out the ssh packets. The only `interesting' thing
to me is a retransmission about halfway through, but I don't know what I'm
looking for.

The NFS server is 10.1.203.204 and the client is 10.1.203.24.
Thanks! You are using NFSv2, so there are no commits - that rules out my
hunch.

-dros
Weston Andros Adamson
2014-09-23 15:02:26 UTC
Post by Weston Andros Adamson
Thanks! You are using NFSv2, so there are no commits - that rules out my
hunch.

Wait a second - the whole point of this extra reference (that the
WARN_ON_ONCE is related to) is to handle the pass off to commit lists.
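
To illustrate (a simplified sketch of the 3.17 write completion path,
paraphrased rather than the verbatim source): requests that still need a
COMMIT are supposed to keep the extra reference and move to the commit
list, while finished requests drop it via nfs_inode_remove_request():

static void nfs_write_completion(struct nfs_pgio_header *hdr)
{
	struct nfs_commit_info cinfo;	/* set up earlier in the real code */

	while (!list_empty(&hdr->pages)) {
		struct nfs_page *req = nfs_list_entry(hdr->pages.next);

		nfs_list_remove_request(req);
		if (test_bit(NFS_IOHDR_NEED_COMMIT, &hdr->flags)) {
			/* ownership of the extra ref moves to the commit list */
			nfs_mark_request_commit(req, hdr->lseg, &cinfo);
			continue;
		}
		/* data is stable on the server: drop the inode reference */
		nfs_inode_remove_request(req);
	}
}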

Maybe I'm just doing the wrong thing here with NFSv2!

More soon.

-dros
Weston Andros Adamson
2014-09-23 15:08:36 UTC
Post by Weston Andros Adamson
Wait a second - the whole point of this extra reference (that the
WARN_ON_ONCE is related to) is to handle the pass off to commit lists.

Maybe I'm just doing the wrong thing here with NFSv2!

More soon.
Actually, can you see if this is reproducible with NFSv3?

Thanks,

-dros
Will Deacon
2014-09-23 15:25:58 UTC
Post by Weston Andros Adamson
Wait a second - the whole point of this extra reference (that the
WARN_ON_ONCE is related to) is to handle the pass off to commit lists.

Maybe I'm just doing the wrong thing here with NFSv2!

More soon.

Actually, can you see if this is reproducible with NFSv3?
Just tried that, and I'm unable to trigger the problem with NFSv3.

Will
Will Deacon
2014-09-25 17:15:45 UTC
Hi Andros,
On Tue, Sep 23, 2014 at 04:08:36PM +0100, Weston Andros Adamson wrote:
Post by Weston Andros Adamson
Actually, can you see if this is reproducible with NFSv3?
Just tried that, and I'm unable to trigger the problem with NFSv3.
Any more ideas? I can run with NFSv3 instead, but it's a shame to leave
v2 broken.

Will
Weston Andros Adamson
2014-09-25 17:27:35 UTC
Post by Will Deacon
Hi Andros,

On Tue, Sep 23, 2014 at 04:08:36PM +0100, Weston Andros Adamson wrote:
Post by Weston Andros Adamson
Actually, can you see if this is reproducible with NFSv3?
Just tried that, and I'm unable to trigger the problem with NFSv3.
Any more ideas? I can run with NFSv3 instead, but it's a shame to leave
v2 broken.
Hi Will,

I do plan on looking at this more very soon, but since it's v2 only and
just a warning (and not crashing anything) it didn't jump to the top of my
list. It is only printing the warnings, right? That's no reason to switch
things up ;)

It might be interesting to get some rpcdebug output around the time of the
warning, but I'm not sure it will be useful.

Since I cannot reproduce this, my next step is to inspect the code with an
NFSv2 + 64k pages setup in mind.
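
One thing worth keeping in mind here (my own back-of-the-envelope
arithmetic, not something we've confirmed on the wire): NFSv2 caps
transfers at NFS2_MAXDATA (8k), so a single 64k page must be split into
eight sub-requests, while a 4k page is never split at all. That would
explain why only the 64k configuration exercises the page group
subrequest path:

#include <stdio.h>

#define NFS2_MAXDATA	8192	/* NFSv2 maximum transfer size, in bytes */

/* Sub-requests needed to write one page over NFSv2. */
static int subreqs_per_page(int page_size)
{
	return (page_size + NFS2_MAXDATA - 1) / NFS2_MAXDATA;
}

int main(void)
{
	printf("64k pages: %d sub-requests per page\n",
	       subreqs_per_page(64 * 1024));	/* 8 */
	printf(" 4k pages: %d sub-request per page\n",
	       subreqs_per_page(4 * 1024));	/* 1 */
	return 0;
}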

-dros

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Weston Andros Adamson
2014-10-11 13:32:29 UTC
Hey Will,

I've been very busy, but I haven't forgotten about your bug report!

I think the WARN_ON_ONCE is just wrong; there are cases where the
PG_INODE_REF flag is legitimately not set. The flag is set so that sub
requests can mimic the parent request's reference count.
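
For context, this is roughly how the flag propagates from a head request
to its subrequests (paraphrased from nfs_page_group_init() in
fs/nfs/pagelist.c - a sketch, not the verbatim source):

void nfs_page_group_init(struct nfs_page *req, struct nfs_page *prev)
{
	if (!prev) {
		/* head request: it forms its own group */
		req->wb_head = req;
		req->wb_this_page = req;
		return;
	}

	/* subrequest: link into the head's circular list */
	req->wb_head = prev->wb_head;
	req->wb_this_page = prev->wb_this_page;
	prev->wb_this_page = req;

	/* mirror the head's extra inode reference, if it holds one */
	if (test_bit(PG_INODE_REF, &prev->wb_head->wb_flags)) {
		set_bit(PG_INODE_REF, &req->wb_flags);
		kref_get(&req->wb_kref);
	}
}

So a subrequest only carries PG_INODE_REF if the head already had it when
the group was built - which is where I suspect the legitimate "not set"
cases come from.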

Could you please run the reproducer again, unmount, then check the contents
of /proc/fs/nfsfs/servers?

If it's empty like the example output below, then we know that requests are
being properly dereferenced.

# cat /proc/fs/nfsfs/servers
NV SERVER   PORT USE HOSTNAME
#

If there are any servers still listed (I'm assuming this is the only mount
on your client for the test), then we have a problem.

-dros
Post by Weston Andros Adamson
[previous exchange snipped]
Will Deacon
2014-10-20 13:57:37 UTC
Post by Weston Andros Adamson
Hey Will,
Hi again, Andros,
Post by Weston Andros Adamson
I've been very busy, but I haven't forgotten about your bug report!

No problem, I've been busy too. I just checked with -rc1 and I can
reproduce the issue there too.
Post by Weston Andros Adamson
I think the WARN_ON_ONCE is just wrong; there are cases where the
PG_INODE_REF flag is legitimately not set. The flag is set so that sub
requests can mimic the parent request's reference count.
Could you please run the reproducer again, unmount, then check the contents
of /proc/fs/nfsfs/servers?
Sure. I just tried this and that file is empty after unmounting.

Will