2.6.9-rc2-mm1 swsusp bug report.
Kevin Fenzi
2004-09-24 02:19:53 UTC
Was trying to swsusp my 2.6.9-rc2-mm1 laptop tonight. It churned for a
while, but didn't hibernate. Here are the messages.

I do have PREEMPT and HIGHMEM enabled.

Sep 23 16:53:37 voldemort kernel: Stopping tasks: [progress bar elided]
Sep 23 16:53:37 voldemort kernel: Freeing memory... [spinner output elided]
Sep 23 16:53:37 voldemort kernel: [progress dots elided] swsusp: Need to copy 34850 pages
Sep 23 16:53:37 voldemort kernel: hibernate: page allocation failure. order:8, mode:0x120
Sep 23 16:53:37 voldemort kernel: [<c013fc1e>] __alloc_pages+0x21e/0x3e0
Sep 23 16:53:37 voldemort kernel: [<c013fe05>] __get_free_pages+0x25/0x3f
Sep 23 16:53:37 voldemort kernel: [<c01373b5>] alloc_pagedir+0x1f/0x6b
Sep 23 16:53:37 voldemort kernel: [<c01374e3>] swsusp_alloc+0x2c/0x62
Sep 23 16:53:37 voldemort kernel: [<c0137549>] suspend_prepare_image+0x30/0x6e
Sep 23 16:53:37 voldemort kernel: [<c0284fea>] swsusp_arch_suspend+0x2a/0x2c
Sep 23 16:53:37 voldemort kernel: [<c01375d5>] swsusp_suspend+0x24/0x33
Sep 23 16:53:37 voldemort kernel: [<c01379c2>] pm_suspend_disk+0x28/0x7e
Sep 23 16:53:37 voldemort kernel: [<c0135fd0>] enter_state+0x91/0x95
Sep 23 16:53:39 voldemort kernel: [<c013fc30>] __alloc_pages+0x230/0x3e0
Sep 23 16:53:39 voldemort kernel: [<c01360fb>] state_store+0xb1/0xc8
Sep 23 16:53:39 voldemort kernel: [<c0192748>] subsys_attr_store+0x3a/0x3e
Sep 23 16:53:39 voldemort kernel: [<c01929ce>] flush_write_buffer+0x3e/0x4a
Sep 23 16:53:39 voldemort kernel: [<c0192a5c>] sysfs_write_file+0x82/0x98
Sep 23 16:53:39 voldemort kernel: [<c01929da>] sysfs_write_file+0x0/0x98
Sep 23 16:53:39 voldemort kernel: [<c015926d>] vfs_write+0xd0/0x135
Sep 23 16:53:39 voldemort kernel: [<c015882b>] filp_close+0x59/0x86
Sep 23 16:53:39 voldemort kernel: [<c01593a3>] sys_write+0x51/0x80
Sep 23 16:53:39 voldemort kernel: [<c0106019>] sysenter_past_esp+0x52/0x71
Sep 23 16:53:39 voldemort kernel: swsusp: Restoring Highmem
Sep 23 16:53:39 voldemort kernel: ACPI: PCI interrupt 0000:00:1f.1[A] -> GSI 11 (level, low) -> IRQ 11
Sep 23 16:53:39 voldemort kernel: ACPI: PCI interrupt 0000:00:1f.5[B] -> GSI 5 (level, low) -> IRQ 5
Sep 23 16:53:39 voldemort kernel: PCI: Setting latency timer of device 0000:00:1f.5 to 64
Sep 23 16:53:39 voldemort kernel: Restarting tasks... done

kevin
Pavel Machek
2004-09-24 14:37:14 UTC
Hi!
Post by Kevin Fenzi
Was trying to swsusp my 2.6.9-rc2-mm1 laptop tonight. It churned for a
while, but didn't hibernate. Here are the messages.
[...] swsusp: Need to copy 34850 pages
Sep 23 16:53:37 voldemort kernel: hibernate: page allocation failure. order:8, mode:0x120
Out of memory... Try again with less loaded system.
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms
Kevin Fenzi
2004-09-24 21:09:53 UTC
Pavel> Hi!
Post by Kevin Fenzi
Was trying to swsusp my 2.6.9-rc2-mm1 laptop tonight. It churned
for a while, but didn't hibernate. Here are the messages.
[...] swsusp: Need to copy 34850 pages
Sep 23 16:53:37 voldemort kernel: hibernate: page allocation
Pavel> Out of memory... Try again with less loaded system.

The system was no more loaded than usual. I have 1GB of memory and 4GB of
swap defined. I almost never touch swap. It might have been 100MB into
the 4GB of swap when this happened.

What would cause it to be out of memory?
swsusp needs to be reliable... rebooting when you are using your memory
kinda defeats the purpose of swsusp.

Felipe W Damasio <***@terra.com.br> sent me a patch, but I
haven't had a chance to try it yet:

--- linux-2.6.9-rc2-mm2/kernel/power/swsusp.c.orig	2004-09-23 23:46:49.292975768 -0300
+++ linux-2.6.9-rc2-mm2/kernel/power/swsusp.c	2004-09-24 00:07:01.933626368 -0300
@@ -657,6 +657,9 @@
 	int diff = 0;
 	int order = 0;
 
+	order = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages));
+	nr_copy_pages += 1 << order;
+
 	do {
 		diff = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages)) - order;
 		if (diff) {


kevin
Nigel Cunningham
2004-09-24 23:40:17 UTC
Hi.
Pavel> Hi!
Post by Kevin Fenzi
Was trying to swsusp my 2.6.9-rc2-mm1 laptop tonight. It churned
for a while, but didn't hibernate. Here are the messages.
[...] swsusp: Need to copy 34850 pages
Sep 23 16:53:37 voldemort kernel: hibernate: page allocation
Pavel> Out of memory... Try again with less loaded system.
The system was no more loaded than usual. I have 1GB of memory and 4GB of
swap defined. I almost never touch swap. It might have been 100MB into
the 4GB of swap when this happened.
What would cause it to be out of memory?
swsusp needs to be reliable... rebooting when you are using your memory
kinda defeats the purpose of swsusp.
The problem isn't really that you're out of memory. Rather, the memory
is so fragmented that swsusp is unable to get an order 8 allocation in
which to store its metadata. There isn't really anything you can do to
avoid this issue apart from eating memory (which swsusp is doing
anyway).

Regards,

Nigel
Kevin Fenzi
2004-09-25 01:45:42 UTC
Pavel> Hi!
Post by Kevin Fenzi
Was trying to swsusp my 2.6.9-rc2-mm1 laptop tonight. It churned
for a while, but didn't hibernate. Here are the messages.
Post by Kevin Fenzi
[...] swsusp: Need to copy 34850 pages
Sep 23 16:53:37 voldemort kernel: hibernate: page allocation
Pavel> Out of memory... Try again with less loaded system.
The system was no more loaded than usual. I have 1GB of memory and 4GB
of swap defined. I almost never touch swap. It might have been
100MB into the 4GB of swap when this happened.
What would cause it to be out of memory? swsusp needs to be
reliable... rebooting when you are using your memory kinda defeats
the purpose of swsusp.
Nigel> The problem isn't really that you're out of memory. Rather, the
Nigel> memory is so fragmented that swsusp is unable to get an order 8
Nigel> allocation in which to store its metadata. There isn't really
Nigel> anything you can do to avoid this issue apart from eating
Nigel> memory (which swsusp is doing anyway).

Odd. I have never run into this before with either swsusp2 or
swsusp1.

What causes memory to be so fragmented?
Nothing can be done to prevent it?

Nigel> Regards,
Nigel> Nigel

kevin
Nigel Cunningham
2004-09-25 11:53:55 UTC
Hi.
Post by Kevin Fenzi
Nigel> The problem isn't really that you're out of memory. Rather, the
Nigel> memory is so fragmented that swsusp is unable to get an order 8
Nigel> allocation in which to store its metadata. There isn't really
Nigel> anything you can do to avoid this issue apart from eating
Nigel> memory (which swsusp is doing anyway).
Odd. I have never run into this before with either swsusp2 or
swsusp1.
You won't run into it with suspend2 because it doesn't use high order
allocations. There might be one exception, but apart from that, all of
suspend2's data is stored in order zero allocated pages, so
fragmentation is not an issue. This is the real solution to the problem.
I had to do it this way because I aim to have suspend work without
eating any memory.
Post by Kevin Fenzi
What causes memory to be so fragmented?
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
Post by Kevin Fenzi
Nothing can be done to prevent it?
Apart from the above, no, sorry.

Regards,

Nigel
--
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

Many today claim to be tolerant. True tolerance, however, can cope with others
being intolerant.
Nick Piggin
2004-09-25 12:22:22 UTC
Post by Nigel Cunningham
Hi.
Post by Kevin Fenzi
What causes memory to be so fragmented?
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
Well, by this stage it looks like memory is already pretty well shrunk
as much as it is going to be, which means that even a pretty capable
defragmenter won't be able to do anything.
Nigel Cunningham
2004-09-25 12:56:45 UTC
Hi.
Post by Nick Piggin
Post by Nigel Cunningham
Hi.
Post by Kevin Fenzi
What causes memory to be so fragmented?
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
Well, by this stage it looks like memory is already pretty well shrunk
as much as it is going to be, which means that even a pretty capable
defragmenter won't be able to do anything.
Surely it would be able to rearrange pages to get a contiguous megabyte?
Regardless, not using order 8 allocations seems to me to be a better
solution (but then I have a patch to push once I finish my current round
of cleanups :>).

Nigel
Nick Piggin
2004-09-25 13:38:23 UTC
Post by Nigel Cunningham
Hi.
Post by Nick Piggin
Well, by this stage it looks like memory is already pretty well shrunk
as much as it is going to be, which means that even a pretty capable
defragmenter won't be able to do anything.
Surely it would be able to rearrange pages to get a contiguous megabyte?
For lots of stuff it is just infeasible. Just about all kernel memory,
for example.

But yeah, regardless, really the best thing is not to use such large
allocations at all.
William Lee Irwin III
2004-09-25 12:56:06 UTC
Post by Nick Piggin
Post by Nigel Cunningham
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
Well, by this stage it looks like memory is already pretty well shrunk
as much as it is going to be, which means that even a pretty capable
defragmenter won't be able to do anything.
For however useful defragmentation may be to make speculative use of
physically or virtually contiguous memory more probable to succeed, it
can never be made deterministic or even reliable, not even in pageable
kernels (which Linux is not). Fallback to allocations no larger than
the kernel's internal allocation unit, potentially in tandem with
scatter/gather capabilities, is essential.


-- wli
Nigel Cunningham
2004-09-25 13:21:02 UTC
Hi.
Post by William Lee Irwin III
Post by Nick Piggin
Post by Nigel Cunningham
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
Well, by this stage it looks like memory is already pretty well shrunk
as much as it is going to be, which means that even a pretty capable
defragmenter won't be able to do anything.
For however useful defragmentation may be to make speculative use of
physically or virtually contiguous memory more probable to succeed, it
can never be made deterministic or even reliable, not even in pageable
kernels (which Linux is not). Fallback to allocations no larger than
the kernel's internal allocation unit, potentially in tandem with
scatter/gather capabilities, is essential.
I fully agree. That's why I do it :>

Regards,

Nigel
Pavel Machek
2004-09-25 15:45:27 UTC
Hi!
Post by Nick Piggin
Post by Nigel Cunningham
Post by Kevin Fenzi
What causes memory to be so fragmented?
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
Well, by this stage it looks like memory is already pretty well shrunk
as much as it is going to be, which means that even a pretty capable
defragmenter won't be able to do anything.
True, a defragmenter would not help.

Anyway, conversion from order-8 allocation should be pretty easy, but
I've never seen that failure case and this is the first report... So I'm
not doing that work just yet. [There's a big chunk of changes waiting in
-mm that needs to be merged before any other work should be done.]

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
Nigel Cunningham
2004-09-25 22:03:42 UTC
Hi.
Post by Pavel Machek
Hi!
Post by Nick Piggin
Post by Nigel Cunningham
Post by Kevin Fenzi
What causes memory to be so fragmented?
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
Well, by this stage it looks like memory is already pretty well shrunk
as much as it is going to be, which means that even a pretty capable
defragmenter won't be able to do anything.
True, a defragmenter would not help.
Anyway, conversion from order-8 allocation should be pretty easy, but
I've never seen that failure case and this is the first report... So I'm
not doing that work just yet. [There's a big chunk of changes waiting in
-mm that needs to be merged before any other work should be done.]
Are we still planning on having suspend2 replace swsusp eventually? It
was a lot of work to switch from those high order allocations, and if we
are still going to replace swsusp, perhaps it would be a better use of
your time to do other things?

Regards,

Nigel
Pavel Machek
2004-09-26 10:04:42 UTC
Hi!
Post by Nigel Cunningham
Post by Pavel Machek
True, a defragmenter would not help.
Anyway, conversion from order-8 allocation should be pretty easy, but
I've never seen that failure case and this is the first report... So I'm
not doing that work just yet. [There's a big chunk of changes waiting in
-mm that needs to be merged before any other work should be done.]
Are we still planning on having suspend2 replace swsusp eventually? It
was a lot of work to switch from those high order allocations, and if we
are still going to replace swsusp, perhaps it would be a better use of
your time to do other things?
I do not know whether fixing swsusp1 to kill the order-8 allocations
scares me more, or suspend2's two page sets. (The hooks suspend2
needs to stop all page cache activity are scary...)

I certainly want some parts of suspend2 (like the improved freezer, if it
can be made small enough), but I'm no longer sure I want all of it. I
expected many people to complain about highmem problems in swsusp1, and
that just did not happen; SMP support turned out to be reasonably
simple...
Pavel
Nigel Cunningham
2004-09-26 21:59:42 UTC
Hi.
Post by Pavel Machek
Post by Nigel Cunningham
Are we still planning on having suspend2 replace swsusp eventually? It
was a lot of work to switch from those high order allocations, and if we
are still going to replace swsusp, perhaps it would be a better use of
your time to do other things?
I do not know whether fixing swsusp1 to kill the order-8 allocations
scares me more, or suspend2's two page sets. (The hooks suspend2
needs to stop all page cache activity are scary...)
Hooks to stop all page cache activity? I'm not sure what you mean.
Post by Pavel Machek
I certainly want some parts of suspend2 (like the improved freezer, if it
can be made small enough), but I'm no longer sure I want all of it. I
expected many people to complain about highmem problems in swsusp1,
and that just did not happen; SMP support turned out to be reasonably
simple...
Okay. There are other advantages too, of course :>

Nigel
Pavel Machek
2004-09-26 22:43:38 UTC
Hi!
Post by Nigel Cunningham
Post by Pavel Machek
Post by Nigel Cunningham
Are we still planning on having suspend2 replace swsusp eventually? It
was a lot of work to switch from those high order allocations, and if we
are still going to replace swsusp, perhaps it would be a better use of
your time to do other things?
I do not know whether fixing swsusp1 to kill the order-8 allocations
scares me more, or suspend2's two page sets. (The hooks suspend2
needs to stop all page cache activity are scary...)
Hooks to stop all page cache activity? I'm not sure what you mean.
You have a system where you write the image in two parts, and there are
some pretty special rules about what you may not touch when writing the
first part, IIRC. That is what scares me...
Pavel
Nigel Cunningham
2004-09-27 10:12:05 UTC
Hi.
Post by Pavel Machek
You have a system where you write the image in two parts, and there are
some pretty special rules about what you may not touch when writing the
first part, IIRC. That is what scares me...
No special rules. It's just the LRU that shouldn't change, and it won't
because all other activity is stopped and I'm using direct bio submits
to do the reading and writing. I really should get around to finishing
that 'how-it-works' document so I can clear up all the FUD. :>

Nigel
Marcelo Tosatti
2004-09-26 16:18:16 UTC
Post by Nigel Cunningham
Hi.
Post by Kevin Fenzi
Nigel> The problem isn't really that you're out of memory. Rather, the
Nigel> memory is so fragmented that swsusp is unable to get an order 8
Nigel> allocation in which to store its metadata. There isn't really
Nigel> anything you can do to avoid this issue apart from eating
Nigel> memory (which swsusp is doing anyway).
Odd. I have never run into this before with either swsusp2 or
swsusp1.
You won't run into it with suspend2 because it doesn't use high order
allocations. There might be one exception, but apart from that, all of
suspend2's data is stored in order zero allocated pages, so
fragmentation is not an issue. This is the real solution to the problem.
I had to do it this way because I aim to have suspend work without
eating any memory.
Post by Kevin Fenzi
What causes memory to be so fragmented?
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
I think it is possible to have a defragmenter: allocate new page,
invalidate mapped pte's, invalidate radix tree entry (and block radix lookups),
copy data from oldpage to newpage, remap pte's, insert radix tree
entry, free oldpage.

The memory hotplug patches do it - I'm trying to implement a similar version
to free physically nearby pages and form high order pages.
Pavel Machek
2004-09-26 18:39:15 UTC
Hi!
Post by Marcelo Tosatti
Post by Nigel Cunningham
Post by Kevin Fenzi
What causes memory to be so fragmented?
Normal usage; the pattern of pages being freed and allocated inevitably
leads to fragmentation. The buddy allocator does a good job of
minimising it, but what is really needed is a run-time defragmenter. I
saw mention of this recently, but it's probably not that practical to
implement IMHO.
I think it is possible to have a defragmenter: allocate new page,
invalidate mapped pte's, invalidate radix tree entry (and block radix lookups),
copy data from oldpage to newpage, remap pte's, insert radix tree
entry, free oldpage.
The memory hotplug patches do it - I'm trying to implement a similar version
to free physically nearby pages and form high order pages.
Well, swsusp is kind of a special case. If it is possible to swap a
page out or discard it, it has been swapped out/discarded already. What
remains are things like kmalloc() allocations, and you can't move
them...

Anyway, the solution for swsusp is to avoid using such big allocations;
it is less complex than doing a defragmenter.
Pavel
Pavel Machek
2004-09-25 10:15:22 UTC
Hi!
Post by Kevin Fenzi
Post by Kevin Fenzi
Was trying to swsusp my 2.6.9-rc2-mm1 laptop tonight. It churned
for a while, but didn't hibernate. Here are the messages.
[...] swsusp: Need to copy 34850 pages
Sep 23 16:53:37 voldemort kernel: hibernate: page allocation
Pavel> Out of memory... Try again with less loaded system.
The system was no more loaded than usual. I have 1GB of memory and 4GB of
swap defined. I almost never touch swap. It might have been 100MB into
the 4GB of swap when this happened.
What would cause it to be out of memory?
swsusp needs to be reliable... rebooting when you are using your memory
kinda defeats the purpose of swsusp.
Read the FAQ.
Post by Kevin Fenzi
--- linux-2.6.9-rc2-mm2/kernel/power/swsusp.c.orig	2004-09-23 23:46:49.292975768 -0300
+++ linux-2.6.9-rc2-mm2/kernel/power/swsusp.c	2004-09-24 00:07:01.933626368 -0300
@@ -657,6 +657,9 @@
 	int diff = 0;
 	int order = 0;
 
+	order = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages));
+	nr_copy_pages += 1 << order;
+
 	do {
 		diff = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages)) - order;
 		if (diff) {
That does not look like it could help. I do not see why this patch
should be a good thing.
Pavel
Pascal Schmidt
2004-09-25 01:05:40 UTC
Post by Nigel Cunningham
The problem isn't really that you're out of memory. Rather, the memory
is so fragmented that swsusp is unable to get an order 8 allocation in
which to store its metadata. There isn't really anything you can do to
avoid this issue apart from eating memory (which swsusp is doing
anyway).
That's one megabyte, right? Can't we preallocate that on boot, while
there's still a chance to get that much contiguous memory? If the
user has swsusp compiled into his kernel, he probably wants it to
function, so it's not really "wasted".
--
Ciao,
Pascal
Pavel Machek
2004-09-25 10:16:40 UTC
Hi!
Post by Pascal Schmidt
Post by Nigel Cunningham
The problem isn't really that you're out of memory. Rather, the memory
is so fragmented that swsusp is unable to get an order 8 allocation in
which to store its metadata. There isn't really anything you can do to
avoid this issue apart from eating memory (which swsusp is doing
anyway).
That's one megabyte, right? Can't we preallocate that on boot, while
there's still a chance to get that much contiguous memory? If the
user has swsusp compiled into his kernel, he probably wants it to
function, so it's not really "wasted".
You do not know how much you should preallocate, because it depends on
the amount of memory used. You could preallocate the maximum possible
amount...

OTOH this is the first report of this failure. If it fails once in a blue
moon, it is probably better to let it fail than waste memory.

Pavel
Jan Rychter
2004-10-10 18:17:09 UTC
Pavel> Hi!
Post by Nigel Cunningham
The problem isn't really that you're out of memory. Rather, the
memory is so fragmented that swsusp is unable to get an order 8
allocation in which to store its metadata. There isn't really
anything you can do to avoid this issue apart from eating memory
(which swsusp is doing anyway).
Post by Pascal Schmidt
That's one megabyte, right? Can't we preallocate that on boot, while
there's still a chance to get that much contiguous memory? If the user
has swsusp compiled into his kernel, he probably wants it to
function, so it's not really "wasted".
Pavel> You do not know how much you should preallocate, because it
Pavel> depends on the amount of memory used. You could preallocate the
Pavel> maximum possible amount...

Pavel> OTOH this is first report of this failure. If it fails once in a
Pavel> blue moon, it is probably better to let it fail than waste
Pavel> memory.

This is *exactly* why I chose to use swsusp2. There is a marked
difference in the maintainers' approach to these kinds of problems.

The net result is that swsusp2 has worked very well for me: I have been
suspending and resuming happily for several months, with exactly zero
swsusp-caused crashes or failures.

BTW, on a related note, I believe there is too much acceptance of
crashes and failures in the Linux world recently. Take an example: I can
bring down any of my machines (kernels 2.4 or 2.6) in less than 10
minutes just by plugging in and unplugging USB devices. There is
something fundamentally wrong with the USB subsystem if it is possible
to do that.

--J.
Pavel Machek
2004-10-11 13:32:34 UTC
Hi!
Post by Jan Rychter
Pavel> You do not know how much you should preallocate, because it
Pavel> depends on the amount of memory used. You could preallocate the
Pavel> maximum possible amount...
Pavel> OTOH this is first report of this failure. If it fails once in a
Pavel> blue moon, it is probably better to let it fail than waste
Pavel> memory.
This is *exactly* why I chose to use swsusp2. There is a marked
difference in the maintainers' approach to these kinds of problems.
Okay, and do you have something to say, or do you want to start a
flamewar? That is also why swsusp2 is 10 times the code size of swsusp...

Pavel
--
Boycott Kodak -- for their patent abuse against Java.
Jan Rychter
2004-10-11 14:53:29 UTC
Pavel> Hi! You do not know how much you should preallocate, because it
Pavel> depends on the amount of memory used. You could preallocate the
Pavel> maximum possible amount...
Pavel> OTOH this is first report of this failure. If it fails once in a
Pavel> blue moon, it is probably better to let it fail than waste
Pavel> memory.
Post by Jan Rychter
This is *exactly* why I chose to use swsusp2. There is a marked
difference in the maintainers' approach to these kinds of problems.
Pavel> Okay, and do you have something to say, or do you want to start
Pavel> a flamewar? That is also why swsusp2 is 10 times the code size
Pavel> of swsusp...

Sure, flame me if you think this is the right thing to do. But I will
continue to pitch in with a user's opinion sometimes, because I really
believe it is important.

It is easy to lose sight of the user perspective on these things if all
you deal with is kernel development. You probably reboot your machine
dozens of times a day anyway. However, for some users crashes and
reboots are *very* expensive. These people (myself included) consider
sprinkling the code with panics, crashing and failing an unacceptable
thing to do.

I also believe your reply shows how important it is for me to actually
write things like these from time to time (even risking getting
flamed). As a user I don't care whatsoever what the code size
is. Actually, I don't care that much about its performance, either. What
I do care about is that my operating system doesn't crash from under me,
doesn't lose my data, and doesn't fail on me with suspending when I
really need it to suspend now. Give me a userspace USB implementation
that works 10x slower and is 10x larger but doesn't crash my machine and
I'll take it any day.

--J.
Pavel Machek
2004-10-17 19:10:31 UTC
Hi!
Post by Jan Rychter
Sure, flame me if you think this is the right thing to do. But I will
continue to pitch in with a users' opinion sometimes, because I really
believe it is important.
It is easy to lose sight of the user perspective on these things if all
you deal with is kernel development. You probably reboot your machine
dozens of times a day anyway. However, for some users crashes and
reboots are *very* expensive. These people (myself included) consider
sprinkling the code with panics, crashing and failing an unacceptable
thing to do.
You can have code that does not panic, does not crash, does not
corrupt your data, never fails to suspend and is in Linus' tree.

...no, that is too good. It sounds like a fairy tale.

So pick any four.
Pavel

PS: And it is real. We have conflicting goals here, and I consider
"refuses to suspend" the least critical.
Nigel Cunningham
2004-10-17 21:40:18 UTC
Hi.
Post by Pavel Machek
You can have code that does not panic, does not crash, does not
corrupt your data, never fails to suspend and is in Linus' tree.
...no, that is too good. It sounds like a fairy tale.
So pick any four.
Pavel
PS: And it is real. We have conflicting goals here and I consider
"refuses to suspend" least critical.
I'm going for all five! You're probably right, nevertheless. I can't say
suspend2 _never_ fails to suspend. It's just very rare.

Regards,

Nigel
--
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

Many today claim to be tolerant. True tolerance, however, can cope with others
being intolerant.
Stefan Seyfried
2004-10-11 09:56:18 UTC
Permalink
Hi,
Post by Pavel Machek
OTOH this is the first report of this failure. If it fails once in a blue
moon, it is probably better to let it fail than to waste memory.
PM: Attempting to suspend to disk.
PM: snapshotting memory.
swsusp: critical section:
swsusp: Saving Highmem
[nosave pfn 0x3be]<7>[nosave pfn 0x3bf]swsusp: Need to copy 30519 pages
suspend: (pages needed: 30519 + 512 free: 100469)
do_acpi_sleep: page allocation failure. order:7, mode:0x120
[<c013a628>] __alloc_pages+0x3a8/0x3b0
[<c013a648>] __get_free_pages+0x18/0x30
[<c0132c37>] alloc_pagedir+0x17/0x60
[<c0132ddb>] swsusp_alloc+0x4b/0xa0
[<c0132e63>] suspend_prepare_image+0x33/0x80
[<c028beda>] swsusp_arch_suspend+0x2a/0x30
[<c0132f1b>] swsusp_suspend+0x2b/0x40
[<c01332ad>] pm_suspend_disk+0x3d/0xb0
[<c0131765>] enter_state+0x85/0x90
[<c01318b1>] state_store+0xc1/0xc3
[<c01317f0>] state_store+0x0/0xc3
[<c01852e6>] subsys_attr_store+0x26/0x30
[<c018548d>] flush_write_buffer+0x1d/0x30
[<c01854c9>] sysfs_write_file+0x29/0x40
[<c01854a0>] sysfs_write_file+0x0/0x40
[<c0150c7f>] vfs_write+0x9f/0x100
[<c0150d8c>] sys_write+0x3c/0x70
[<c0105c69>] sysenter_past_esp+0x52/0x79
suspend: Allocating pagedir failed.
swsusp: Restoring Highmem

this happened right now, after running fine over the weekend and doing a
successful suspend/resume cycle this morning.
It was a "battery critical" suspend, so this is not nice :-( I had about
2 minutes left until hard powerdown during which i tried to get it to
suspend but failed. Yes, userspace should handle the "failed
battery-critical suspend" case better and probably call "shutdown -h now".

Stefan
Pavel Machek
2004-10-11 14:59:11 UTC
Permalink
Hi!
Post by Stefan Seyfried
Post by Pavel Machek
OTOH this is the first report of this failure. If it fails once in a blue
moon, it is probably better to let it fail than to waste memory.
PM: Attempting to suspend to disk.
PM: snapshotting memory.
swsusp: Saving Highmem
[nosave pfn 0x3be]<7>[nosave pfn 0x3bf]swsusp: Need to copy 30519 pages
suspend: (pages needed: 30519 + 512 free: 100469)
do_acpi_sleep: page allocation failure. order:7, mode:0x120
[<c013a628>] __alloc_pages+0x3a8/0x3b0
[<c013a648>] __get_free_pages+0x18/0x30
[<c0132c37>] alloc_pagedir+0x17/0x60
[<c0132ddb>] swsusp_alloc+0x4b/0xa0
[<c0132e63>] suspend_prepare_image+0x33/0x80
[<c028beda>] swsusp_arch_suspend+0x2a/0x30
[<c0132f1b>] swsusp_suspend+0x2b/0x40
[<c01332ad>] pm_suspend_disk+0x3d/0xb0
[<c0131765>] enter_state+0x85/0x90
[<c01318b1>] state_store+0xc1/0xc3
[<c01317f0>] state_store+0x0/0xc3
[<c01852e6>] subsys_attr_store+0x26/0x30
[<c018548d>] flush_write_buffer+0x1d/0x30
[<c01854c9>] sysfs_write_file+0x29/0x40
[<c01854a0>] sysfs_write_file+0x0/0x40
[<c0150c7f>] vfs_write+0x9f/0x100
[<c0150d8c>] sys_write+0x3c/0x70
[<c0105c69>] sysenter_past_esp+0x52/0x79
suspend: Allocating pagedir failed.
swsusp: Restoring Highmem
this happened right now, after running fine over the weekend and doing a
successful suspend/resume cycle this morning.
It was a "battery critical" suspend, so this is not nice :-( I had about
2 minutes left until hard powerdown during which i tried to get it to
suspend but failed. Yes, userspace should handle the "failed
battery-critical suspend" case better and probably call "shutdown -h now".
Ok... And I guess it is nearly impossible to trigger this on demand,
right?

I do not think I can use vmalloc easily because reallocate_pagedir
depends on it being contiguous. Switching to a linked list is "just a
simple matter of coding", but it is going to be quite a lot of
changes.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
Stefan Seyfried
2004-10-11 17:18:57 UTC
Permalink
Hi,
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on demand,
right?
Of course. I just wanted to say "yes, it does happen". I did not say
fixing it would be easy ;-)

Stefan
Rafael J. Wysocki
2004-10-11 19:58:48 UTC
Permalink
Hi,
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on demand,
right?
I think it is possible. Seemingly, on my box it's only a question of the
number of apps started. I think I can work out a method to trigger it 90% of
the time or so. Please let me know if it's worth doing.

Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
Pavel Machek
2004-10-12 08:55:10 UTC
Permalink
Hi!
Post by Rafael J. Wysocki
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on demand,
right?
I think it is possible. Seemingly, on my box it's only a question of the
number of apps started. I think I can work out a method to trigger it 90% of
the time or so. Please let me know if it's worthy of doing.
Yes, it would certainly help with testing...
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
Rafael J. Wysocki
2004-10-13 17:29:11 UTC
Permalink
Post by Pavel Machek
Hi!
Post by Rafael J. Wysocki
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on demand,
right?
I think it is possible. Seemingly, on my box it's only a question of the
number of apps started. I think I can work out a method to trigger it
90% of the time or so. Please let me know if it's worthy of doing.
Yes, it would certainly help with testing...
So far, the most reliable method seems to be to use the box for a day after a
successful suspend/resume cycle (I've gotten an order-8 allocation failure in
3 out of 3 attempts). Still, I'm working on something that's less
time-consuming. ;-)

Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
Rafael J. Wysocki
2004-10-14 21:47:51 UTC
Permalink
Post by Pavel Machek
Hi!
Post by Rafael J. Wysocki
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on demand,
right?
I think it is possible. Seemingly, on my box it's only a question of the
number of apps started. I think I can work out a method to trigger it
90% of the time or so. Please let me know if it's worthy of doing.
Yes, it would certainly help with testing...
Well, I can do that, it seems, 100% of the time.

The method is to do "init 5" (my default runlevel is 3, because VTs become
unreadable after I start X), log into KDE (as a non-root user), start some X
apps at random (e.g. I run gkrellm, kmail, konqueror, Mozilla FireFox 32-bit
w/ Flash plugin, and konsole with "su -"), and run updatedb (as root, of course).

Apparently, running updatedb is essential. After it finishes, on my box, you
can forget about suspending to disk from under the X+KDE combo, even if the X
apps (i.e. kmail, konqueror, FireFox) are stopped beforehand. However, if
updatedb is not run, the box usually suspends successfully.

Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
Rafael J. Wysocki
2004-10-14 21:54:25 UTC
Permalink
Post by Rafael J. Wysocki
Post by Pavel Machek
Hi!
Post by Rafael J. Wysocki
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on demand,
right?
I think it is possible. Seemingly, on my box it's only a question of
the number of apps started. I think I can work out a method to trigger it
90% of the time or so. Please let me know if it's worth doing.
Yes, it would certainly help with testing...
Well, I can do that, it seems, 100% of the time.
The method is to do "init 5" (my default runlevel is 3, because vts become
unreadable after I start X), log into KDE (as a non-root), start some X apps
at random (eg. I run gkrellm, kmail, konqueror, Mozilla FireFox 32-bit w/
Flash plugin, and konsole with "su -") and run updatedb (as root, of course).
To be precise, the method always leads to a failure, but it seems to be either
an order-8 or an order-9 page allocation failure.

Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
Pavel Machek
2004-10-16 16:43:47 UTC
Permalink
Hi!
Post by Pavel Machek
Post by Rafael J. Wysocki
Post by Pavel Machek
Post by Rafael J. Wysocki
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on demand,
right?
I think it is possible. Seemingly, on my box it's only a question of
the number of apps started. I think I can work out a method to trigger it
90% of the time or so. Please let me know if it's worth doing.
Yes, it would certainly help with testing...
Well, I can do that, it seems, 100% of the time.
The method is to do "init 5" (my default runlevel is 3, because vts become
unreadable after I start X), log into KDE (as a non-root), start some X apps
at random (eg. I run gkrellm, kmail, konqueror, Mozilla FireFox 32-bit w/
Flash plugin, and konsole with "su -") and run updatedb (as root, of course).
To be precise, the method always leads to a failure, but it seems to be either
8-order or 9-order page allocation failure.
Okay, you could probably pre-allocate a 512K block during bootup, then
just use that instead of allocating a new one during suspend.

Unfortunately that's rather ugly. You'd need ~32 bytes per 4K page; that's
almost 1% overhead, which is not nice. A better solution (but more work) is
to switch to linked lists or integrate swsusp2.
Pavel
--
Boycott Kodak -- for their patent abuse against Java.
Rafael J. Wysocki
2004-10-16 19:31:19 UTC
Permalink
Post by Pavel Machek
Hi!
Post by Rafael J. Wysocki
Post by Rafael J. Wysocki
Post by Pavel Machek
Post by Rafael J. Wysocki
Post by Pavel Machek
Ok... And I guess it is nearly impossible to trigger this on
demand, right?
I think it is possible. Seemingly, on my box it's only a question
of the number of apps started. I think I can work out a method
to trigger it 90% of the time or so. Please let me know if it's
worthy of doing.
Yes, it would certainly help with testing...
Well, I can do that, it seems, 100% of the time.
The method is to do "init 5" (my default runlevel is 3, because vts
become unreadable after I start X), log into KDE (as a non-root),
start some X apps at random (eg. I run gkrellm, kmail, konqueror,
Mozilla FireFox 32-bit w/ Flash plugin, and konsole with "su -") and
run updatedb (as root, of course).
To be precise, the method always leads to a failure, but it seems to
be either 8-order or 9-order page allocation failure.
Okay, you could probably pre-allocate 512K block during bootup, then
just use that instead of allocating new one during suspend.
Unfortunately that's rather ugly. You'd ~32 bytes per 4K page, that's
almost 1% overhead, not nice. Better solution (but more work) is to
switch to link-lists or integrate swsusp2.
Well, I wonder if the page allocation failures are really a swsusp problem.
I've just tried it the other way around and ran updatedb _first_, then
started X+KDE (no additional apps) and tried to suspend from under it. Guess
what: an order-9 page allocation failure, here you go.

It seems to me that updatedb leaves a mess in memory, which IMO should not
happen, or at least the kernel should be able to clean it up, but apparently
it cannot. I'd be grateful if someone could explain to me why that is so,
please.

Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
Pavel Machek
2004-10-16 20:40:27 UTC
Permalink
Hi!
Post by Rafael J. Wysocki
Post by Pavel Machek
Unfortunately that's rather ugly. You'd ~32 bytes per 4K page, that's
almost 1% overhead, not nice. Better solution (but more work) is to
switch to link-lists or integrate swsusp2.
Well, I wonder if the page allocation failures are a swsusp problem, really.
Yes, they are. Kernel memory allocation is not designed to do order-8
allocations reliably; swsusp really should not use them.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
Rafael J. Wysocki
2004-10-16 21:05:50 UTC
Permalink
Post by Pavel Machek
Hi!
Post by Rafael J. Wysocki
Post by Pavel Machek
Unfortunately that's rather ugly. You'd ~32 bytes per 4K page, that's
almost 1% overhead, not nice. Better solution (but more work) is to
switch to link-lists or integrate swsusp2.
Well, I wonder if the page allocation failures are a swsusp problem, really.
Yes, they are. Kernel memory allocation is not design to do 8-order
allocations properly. swsusp really should not use them.
Now that's clear, thanks. Could you tell me, please, what I need to know to
understand the swsusp code and where I should start?

Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
Pavel Machek
2004-10-16 21:25:01 UTC
Permalink
Hi!
Post by Rafael J. Wysocki
Post by Pavel Machek
Post by Rafael J. Wysocki
Post by Pavel Machek
Unfortunately that's rather ugly. You'd ~32 bytes per 4K page, that's
almost 1% overhead, not nice. Better solution (but more work) is to
switch to link-lists or integrate swsusp2.
Well, I wonder if the page allocation failures are a swsusp problem, really.
Yes, they are. Kernel memory allocation is not design to do 8-order
allocations properly. swsusp really should not use them.
Now that's clear, thanks. Could you tell me, please, what I need to know to
understand the swsusp code and what I should start with?
On bootup, preallocate, say, an order-9 allocation.

In alloc_pagedir(), do not allocate anything, but check that the data fit
in the preallocated area, and fail if not.

There's a free_pages(pagedir_save, ...) call somewhere; remove that so the
pagedir is never freed and you can resume multiple times.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!