Discussion:
2.6.18-rc3-mm2
Andrew Morton
2006-08-06 10:08:09 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/

- 2.6.18-rc3-mm1 gets mysterious udev timeouts during boot and crashes in
NFS. This kernel reverts the patches which were causing that.



Changes since 2.6.18-rc3-mm1:


+revert-x86_64-mm-i386-remove-lock-section.patch

Revert patch which causes udev timeouts.

-knfsd-make-rpc-threads-pools-numa-aware-fix.patch

Folded into knfsd-make-rpc-threads-pools-numa-aware.patch

+revert-knfsd-make-rpc-threads-pools-numa-aware.patch

Revert patch which causes nfs crashes.



All 1136 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/patch-list


Michal Piotrowski
2006-08-06 11:09:25 UTC
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
I get this error during the build.

kernel/built-in.o: In function `bacct_add_tsk':
/usr/src/linux-mm/kernel/tsacct.c:39: undefined reference to `__divdi3'
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [_all] Error 2

I'll try with CONFIG_TASKSTATS disabled.

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Balbir Singh
2006-08-07 09:52:57 UTC
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
I get this error during the build.
/usr/src/linux-mm/kernel/tsacct.c:39: undefined reference to `__divdi3'
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [_all] Error 2
I'll try with CONFIG_TASKSTATS disabled.
Regards,
Michal
Sounds like we are trying to do a 64-bit division, since timespec_to_ns()
returns a 64-bit value.

Here's a compile-tested patch to fix the problem.

Signed-off-by: Balbir Singh <***@in.ibm.com>
---

kernel/tsacct.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletion(-)

diff -puN kernel/tsacct.c~tsacct-build-fix kernel/tsacct.c
--- linux-2.6.18-rc3/kernel/tsacct.c~tsacct-build-fix 2006-08-07 14:20:58.000000000 +0530
+++ linux-2.6.18-rc3-balbir/kernel/tsacct.c 2006-08-07 14:51:44.000000000 +0530
@@ -36,7 +36,8 @@ void bacct_add_tsk(struct taskstats *sta
do_posix_clock_monotonic_gettime(&uptime);
ts = timespec_sub(uptime, current->group_leader->start_time);
/* rebase elapsed time to usec */
- stats->ac_etime = (timespec_to_ns(&ts))/NSEC_PER_USEC;
+ stats->ac_etime = (ts.tv_sec * USEC_PER_SEC) +
+ (ts.tv_nsec / NSEC_PER_USEC);
stats->ac_btime = xtime.tv_sec - ts.tv_sec;
if (thread_group_leader(tsk)) {
stats->ac_exitcode = tsk->exit_code;
_
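
For readers following the `__divdi3` discussion: on 32-bit x86, gcc lowers a signed 64-bit division to a call to the libgcc helper `__divdi3`, which is not available in this kernel build, hence the undefined reference. The sketch below is a user-space illustration only, not the fix that was eventually applied. It follows the approach of the diff above: seconds and nanoseconds are converted separately, so the only division is on a `long`. The names `ts_example` and `elapsed_usec` are hypothetical, and the widening cast before the multiply is an added assumption, since `tv_sec * USEC_PER_SEC` in a 32-bit `long` would overflow after roughly 2147 seconds.

```c
#include <stdint.h>

#define USEC_PER_SEC  1000000L
#define NSEC_PER_USEC 1000L

/* Hypothetical stand-in for the kernel's struct timespec. */
struct ts_example {
	long tv_sec;
	long tv_nsec;
};

/*
 * Elapsed time in microseconds for a non-negative interval.
 * The only division is long / long, so no 64-bit division
 * helper is needed on a 32-bit target.
 */
static inline uint64_t elapsed_usec(const struct ts_example *ts)
{
	/* Widen before multiplying so the product cannot overflow a 32-bit long. */
	return (uint64_t)ts->tv_sec * USEC_PER_SEC +
	       (uint64_t)(ts->tv_nsec / NSEC_PER_USEC);
}
```

With `tv_sec = 3000` and `tv_nsec = 1500000` this yields 3000001500 microseconds, a value that would not fit in a signed 32-bit `long`.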
--
Regards,
Balbir Singh,
Linux Technology Center,
IBM Software Labs
Michal Piotrowski
2006-08-07 12:16:04 UTC
Hi,
Post by Balbir Singh
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
I get this error during the build.
/usr/src/linux-mm/kernel/tsacct.c:39: undefined reference to `__divdi3'
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [_all] Error 2
I'll try with CONFIG_TASKSTATS disabled.
Regards,
Michal
Sounds like we are trying to do a 64-bit division, since timespec_to_ns()
returns a 64-bit value.
Here's a compile-tested patch to fix the problem.
It doesn't apply
cat patches/tsacct1.patch | patch -p1 --dry-run
patching file kernel/tsacct.c
Hunk #1 FAILED at 36.
1 out of 1 hunk FAILED -- saving rejects to file kernel/tsacct.c.rej

Andrew's csa-basic-accounting-over-taskstats-fix.patch fixes the compilation problem.
Post by Balbir Singh
--
Regards,
Balbir Singh,
Linux Technology Center,
IBM Software Labs
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Balbir Singh
2006-08-07 14:05:09 UTC
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
Post by Michal Piotrowski
Hi,
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
Post by Michal Piotrowski
I get this error during the build.
/usr/src/linux-mm/kernel/tsacct.c:39: undefined reference to `__divdi3'
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [_all] Error 2
I'll try with CONFIG_TASKSTATS disabled.
Regards,
Michal
Sounds like we are trying to do a 64-bit division, since timespec_to_ns()
returns a 64-bit value.
Here's a compile-tested patch to fix the problem.
It doesn't apply
cat patches/tsacct1.patch | patch -p1 --dry-run
patching file kernel/tsacct.c
Hunk #1 FAILED at 36.
1 out of 1 hunk FAILED -- saving rejects to file kernel/tsacct.c.rej
Andrew's csa-basic-accounting-over-taskstats-fix.patch fixes the compilation problem.
Yeah, that's it! I did not see the fix in mm-commits.

Thanks for pointing to the fix.
--
Balbir Singh,
Linux Technology Center,
IBM Software Labs
Mattia Dongili
2006-08-06 13:33:06 UTC
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
There's something more: I got a load of the following while playing with
UML. Full dmesg and config are at:
http://oioio.altervista.org/linux/config-2.6.18-rc3-mm2-1
http://oioio.altervista.org/linux/dmesg-2.6.18-rc3-mm2-1

[ 781.988000] ------------[ cut here ]------------
[ 781.988000] kernel BUG at mm/vmscan.c:383!
[ 781.988000] invalid opcode: 0000 [#1]
[ 781.988000] 4K_STACKS PREEMPT
[ 781.988000] last sysfs file: /devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load
[ 781.988000] Modules linked in: ipv6 nfsd exportfs lockd sunrpc ipt_MASQUERADE iptable_nat ip_nat xt_tcpudp xt_state ip_conntrack iptable_filter ip_tables x_tables jfs aes dm_crypt dm_mod rtc sony_acpi tun psmouse sonypi speedstep_ich speedstep_lib freq_table cpufreq_conservative cpufreq_ondemand cpufreq_powersave sd_mod usb_storage scsi_mod usbhid pcmcia snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer intel_agp agpgart i2c_i801 uhci_hcd usbcore evdev e100 mii yenta_socket rsrc_nonstatic pcmcia_core snd soundcore snd_page_alloc pcspkr
[ 781.988000] CPU: 0
[ 781.988000] EIP: 0060:[<c014c4d8>] Not tainted VLI
[ 781.988000] EFLAGS: 00210203 (2.6.18-rc3-mm2-1 #1)
[ 781.988000] EIP is at remove_mapping+0xe8/0x120
[ 781.988000] eax: c0374120 ebx: c11e2a80 ecx: c0374120 edx: 000000d0
[ 781.988000] esi: c0374120 edi: cfea0f78 ebp: cfea0e04 esp: cfea0df8
[ 781.988000] ds: 007b es: 007b ss: 0068
[ 781.988000] Process kswapd0 (pid: 134, ti=cfea0000 task=cfe9e030 task.ti=cfea0000)
[ 781.988000] Stack: c11e2a80 c11e2a80 c0374120 cfea0f14 c014cbab c0374120 c11e2a80 cfea0f78
[ 781.988000] c0373d60 c0373e2c 00000020 00000020 00000000 00000020 00000000 00000000
[ 781.988000] c0374120 00000001 00000000 c101a860 c0373c20 00000000 00000001 c0463168
[ 781.988000] Call Trace:
[ 781.988000] [<c014cbab>] shrink_inactive_list+0x69b/0x920
[ 781.988000] [<c014cec2>] shrink_zone+0x92/0xe0
[ 781.988000] [<c014d1f1>] kswapd+0x2e1/0x430
[ 781.988000] [<c012ee26>] kthread+0xe6/0xf0
[ 781.988000] [<c0101005>] kernel_thread_helper+0x5/0x10
[ 781.988000] DWARF2 unwinder stuck at kernel_thread_helper+0x5/0x10
[ 781.988000] Leftover inexact backtrace:
[ 781.988000] [<c0103a06>] show_stack_log_lvl+0xb6/0x100
[ 781.988000] [<c0103c2f>] show_registers+0x1df/0x290
[ 781.988000] [<c01041aa>] die+0x13a/0x310
[ 781.988000] [<c01047dd>] do_trap+0x9d/0x100
[ 781.988000] [<c0104c41>] do_invalid_op+0xa1/0xb0
[ 781.988000] [<c031a4a9>] error_code+0x39/0x40
[ 781.988000] [<c014cbab>] shrink_inactive_list+0x69b/0x920
[ 781.988000] [<c014cec2>] shrink_zone+0x92/0xe0
[ 781.988000] [<c014d1f1>] kswapd+0x2e1/0x430
[ 781.988000] [<c012ee26>] kthread+0xe6/0xf0
[ 781.988000] [<c0101005>] kernel_thread_helper+0x5/0x10
[ 781.988000] Code: 89 e0 25 00 f0 ff ff ff 48 14 8b 40 08 31 d2 a8 08 74 bc e8 6b be 1c 00 31 d2 eb b3 8d b4 26 00 00 00 00 8b 53 0c e9 51 ff ff ff <0f> 0b 7f 01 4e 66 33 c0 e9 2c ff ff ff 0f 0b 7e 01 4e 66 33 c0
[ 781.988000] EIP: [<c014c4d8>] remove_mapping+0xe8/0x120 SS:ESP 0068:cfea0df8
[ 781.988000] <0>------------[ cut here ]------------
[ 782.292000] kernel BUG at mm/vmscan.c:383!
...
[ 782.292000] <0>------------[ cut here ]------------
[ 782.564000] kernel BUG at mm/vmscan.c:383!
...
[ 809.588000] ------------[ cut here ]------------
[ 809.588000] kernel BUG at mm/vmscan.c:383!
...
[ 809.588000] <0>------------[ cut here ]------------
[ 811.748000] kernel BUG at mm/vmscan.c:383!
...
[ 811.748000] <0>------------[ cut here ]------------
[ 814.128000] kernel BUG at mm/vmscan.c:383!
...
[ 814.128000] <0>------------[ cut here ]------------
[ 815.272000] kernel BUG at mm/vmscan.c:383!
...
[ 815.272000] <0>------------[ cut here ]------------
[ 816.116000] kernel BUG at mm/vmscan.c:383!
...
[ 816.856000] <0>------------[ cut here ]------------
[ 817.120000] kernel BUG at mm/vmscan.c:383!
--
mattia
:wq!
Hugh Dickins
2006-08-06 14:55:43 UTC
Post by Mattia Dongili
[ 781.988000] kernel BUG at mm/vmscan.c:383!
[ 781.988000] EIP is at remove_mapping+0xe8/0x120
You are so right: the minor fix below is needed.
Post by Mattia Dongili
[ 781.988000] DWARF2 unwinder stuck at kernel_thread_helper+0x5/0x10
Sorry, someone else will have to help with all that nuisance.


remove_mapping() must check against page_mapping(page):
&swapper_space is implicit, never actually stored in page->mapping.

Signed-off-by: Hugh Dickins <***@veritas.com>

--- 2.6.18-rc3-mm2/mm/vmscan.c 2006-08-06 12:25:40.000000000 +0100
+++ linux/mm/vmscan.c 2006-08-06 15:40:34.000000000 +0100
@@ -380,7 +380,7 @@ static pageout_t pageout(struct page *pa
int remove_mapping(struct address_space *mapping, struct page *page)
{
BUG_ON(!PageLocked(page));
- BUG_ON(mapping != page->mapping);
+ BUG_ON(mapping != page_mapping(page));

write_lock_irq(&mapping->tree_lock);
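
To make the reasoning above concrete: a swap-cache page is identified by a page flag rather than by its `mapping` field, so comparing against `page->mapping` directly can never match `&swapper_space`. What follows is a simplified user-space model, not the kernel's definitions. `struct page_model`, `PG_SWAPCACHE_MODEL`, and `page_mapping_model()` are hypothetical names, and the real `page_mapping()` handles further cases (such as anonymous pages) that are left out here.

```c
#include <stddef.h>

struct address_space {
	int dummy;
};

/* Stand-in for the kernel's global swap address space. */
static struct address_space swapper_space;

#define PG_SWAPCACHE_MODEL 0x1UL	/* hypothetical flag bit */

struct page_model {
	unsigned long flags;
	/* For swap-cache pages this field does not hold &swapper_space. */
	struct address_space *mapping;
};

/* Return the address_space a page belongs to, swap cache included. */
static inline struct address_space *
page_mapping_model(const struct page_model *page)
{
	if (page->flags & PG_SWAPCACHE_MODEL)
		return &swapper_space;
	return page->mapping;
}
```

In this model, a check of the form `mapping != page->mapping` fires for every swap-cache page passed in with `&swapper_space`, whereas a check against `page_mapping_model(page)` does not.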
Mattia Dongili
2006-08-06 17:02:05 UTC
Post by Hugh Dickins
Post by Mattia Dongili
[ 781.988000] kernel BUG at mm/vmscan.c:383!
[ 781.988000] EIP is at remove_mapping+0xe8/0x120
You are so right: the minor fix below is needed.
Thanks, now it runs OK (for ~30 minutes so far).
Hot-fix? :)
--
mattia
:wq!
Reuben Farrelly
2006-08-06 14:11:25 UTC
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
- 2.6.18-rc3-mm1 gets mysterious udev timeouts during boot and crashes in
NFS. This kernel reverts the patches which were causing that.
+revert-x86_64-mm-i386-remove-lock-section.patch
Revert patch which causes udev timeouts.
-knfsd-make-rpc-threads-pools-numa-aware-fix.patch
Folded into knfsd-make-rpc-threads-pools-numa-aware.patch
+revert-knfsd-make-rpc-threads-pools-numa-aware.patch
Revert patch which causes nfs crashes.
Seems to work well.

The only outstanding issue I have is with the "Generic ATA support" option, which
I believe should detect and drive my ATA DVD-RW. However, it still gives the
following on boot (it has never worked):

ahci 0000:00:1f.2: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq led clo pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFC2000000E100 ctl 0x0 bmdma 0x0 irq 314
ata2: SATA max UDMA/133 cmd 0xFFFFC2000000E180 ctl 0x0 bmdma 0x0 irq 314
ata3: SATA max UDMA/133 cmd 0xFFFFC2000000E200 ctl 0x0 bmdma 0x0 irq 314
ata4: SATA max UDMA/133 cmd 0xFFFFC2000000E280 ctl 0x0 bmdma 0x0 irq 314
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA/133, 586072368 sectors: LBA48 NCQ (depth 31/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 31/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-7, max UDMA/133, 586072368 sectors: LBA48 NCQ (depth 31/32)
ata3.00: ata3: dev 0 multi count 16
ata3.00: configured for UDMA/133
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
Vendor: ATA Model: ST3300622AS Rev: 3.AA
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: ST380817AS Rev: 3.42
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: ST3300622AS Rev: 3.AA
Type: Direct-Access ANSI SCSI revision: 05
ata_piix 0000:00:1f.1: version 2.00ac6
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1f.1 to 64
ata5: PATA max UDMA/133 cmd 0x1F0 ctl 0x3F6 bmdma 0x30B0 irq 14
scsi4 : ata_piix
ata5.00: ATAPI, max UDMA/66
ata5.00: configured for UDMA/66
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata5.00: (BMDMA stat 0x24)
ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata5: soft resetting port
ata5.00: configured for UDMA/66
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata5.00: (BMDMA stat 0x24)
ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata5: soft resetting port
ata5.00: configured for UDMA/66
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata5.00: (BMDMA stat 0x24)
ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata5: soft resetting port
ata5.00: configured for UDMA/66
ata5: EH complete
ata5.00: limiting speed to UDMA/44
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata5.00: (BMDMA stat 0x24)
ata5.00: tag 0 cmd 0xa0 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata5: soft resetting port
ata5.00: configured for UDMA/44
ata5: EH complete
SCSI device sda: 586072368 512-byte hdwr sectors (300069 MB)

And no DVD-RW :-(

I posted some information about it to LKML on 10/07/06

ATAPI CD-ROM, with removable media
Model Number: PIONEER DVD-RW DVR-111D
Serial Number: FADC005671WL
Firmware Revision: 1.23
+ more

but had no feedback.

Should I continue to ask/report it or should I just disable it for now and try
again in a few months to see if it works?

Reuben
lkml@o2.pl / IMAP
2006-08-06 15:20:13 UTC
Hi,

I have found a dependency error while compiling the 2.6.18-rc3-mm2 kernel into
another directory...


***@amilo /home/place/linux-2.6.18-rc3-mm2> make V=1 O=../linux-2.6.18-rc3-mm2_amilo_obj menuconfig

make -C /home/place/linux-2.6.18-rc3-mm2_amilo_obj \
KBUILD_SRC=/home/place/linux-2.6.18-rc3-mm2 \
KBUILD_EXTMOD="" -f /home/place/linux-2.6.18-rc3-mm2/Makefile menuconfig
make -f /home/place/linux-2.6.18-rc3-mm2/scripts/Makefile.build obj=scripts/basic
/bin/sh /home/place/linux-2.6.18-rc3-mm2/scripts/mkmakefile \
/home/place/linux-2.6.18-rc3-mm2 /home/place/linux-2.6.18-rc3-mm2_amilo_obj 2 6
GEN /home/place/linux-2.6.18-rc3-mm2_amilo_obj/Makefile
mkdir -p include/linux include/config
make -f /home/place/linux-2.6.18-rc3-mm2/scripts/Makefile.build obj=scripts/kconfig menuconfig
gcc -Wp,-MD,scripts/kconfig/lxdialog/.checklist.o.d -Iscripts/kconfig -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -DCURSES_LOC="<ncurses.h>" -DLOCALE -c -o scripts/kconfig/lxdialog/checklist.o /home/place/linux-2.6.18-rc3-mm2/scripts/kconfig/lxdialog/checklist.c
/home/place/linux-2.6.18-rc3-mm2/scripts/kconfig/lxdialog/checklist.c:325: fatal error: opening dependency file scripts/kconfig/lxdialog/.checklist.o.d: No such file or directory
compilation terminated.
make[2]: *** [scripts/kconfig/lxdialog/checklist.o] Error 1
make[1]: *** [menuconfig] Error 2
make: *** [menuconfig] Error 2



Best Regards!

Piotr Jasiukajtis
Andrew Morton
2006-08-06 19:09:01 UTC
On Sun, 6 Aug 2006 17:48:52 +0200
This kernel does not detect my HP laptop Alps touchpad. Also keyboard
seems to be detected but does not work, with the only exception of the
power button (I can use it to perform a clean shutdown).
2.6.18-rc1-mm1 works perfectly.
hum.

-tycho kernel: ata1.00: configured for UDMA/33
+tycho kernel: ata1.00: configured for UDMA/100

That looks nice.

-tycho kernel: ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x18C8 irq 15
-tycho kernel: scsi1 : ata_piix
-tycho kernel: ata2: port disabled. ignoring.
-tycho kernel: ATA: abnormal status 0xFF on port 0x177

So does that.

-tycho kernel: input: PS/2 Mouse as /class/input/input1
-tycho kernel: input: AlpsPS/2 ALPS GlidePoint as /class/input/input2

That's not so good.


Dmitry, do you have anything in there which might have caused that?

Perhaps hdaps-handle-errors-from-input_register_device.patch is triggering
for some reason. Fabio, it'd be useful if you could add this, see if it
triggers:


--- a/drivers/input/input.c~input_register_device-debug
+++ a/drivers/input/input.c
@@ -1007,6 +1007,10 @@ int input_register_device(struct input_d
fail3: sysfs_remove_group(&dev->cdev.kobj, &input_dev_id_attr_group);
fail2: sysfs_remove_group(&dev->cdev.kobj, &input_dev_attr_group);
fail1: class_device_del(&dev->cdev);
+ if (error) {
+ printk(KERN_ERR "%s failed: %d\n", __FUNCTION__, error);
+ dump_stack();
+ }
return error;
}
EXPORT_SYMBOL(input_register_device);
_
Dmitry Torokhov
2006-08-07 02:18:05 UTC
Post by Andrew Morton
-tycho kernel: input: PS/2 Mouse as /class/input/input1
-tycho kernel: input: AlpsPS/2 ALPS GlidePoint as /class/input/input2
That's not so good.
Dmitry, do you have anything in there which might have caused that?
Perhaps hdaps-handle-errors-from-input_register_device.patch is triggering
for some reason.
Hmm, I'd be more concerned with i8042-get-rid-of-polling-timer patch...
Anyway, can I have dmesg from boot with i8042.debug=1, please? Make sure
you have a big log buffer.
--
Dmitry
Fabio Comolli
2006-08-07 18:47:00 UTC
Hi.
Post by Dmitry Torokhov
Post by Andrew Morton
-tycho kernel: input: PS/2 Mouse as /class/input/input1
-tycho kernel: input: AlpsPS/2 ALPS GlidePoint as /class/input/input2
That's not so good.
Dmitry, do you have anything in there which might have caused that?
Perhaps hdaps-handle-errors-from-input_register_device.patch is triggering
for some reason.
Hmm, I'd be more concerned with i8042-get-rid-of-polling-timer patch...
Bingo! Reverting remove-polling-timer-from-i8042-v2.patch did the
trick. Now I'm running 2.6.18-rc3-mm2 + hot-fixes :-)

Still interested in dmesg with i8042.debug=1 ?

Ciao.
Fabio
Post by Dmitry Torokhov
Anyway, can I have dmesg from boot with i8042.debug=1, please? Make sure
you have a big log buffer.
--
Dmitry
Dmitry Torokhov
2006-08-07 19:00:57 UTC
Post by Fabio Comolli
Hi.
Post by Dmitry Torokhov
Post by Andrew Morton
-tycho kernel: input: PS/2 Mouse as /class/input/input1
-tycho kernel: input: AlpsPS/2 ALPS GlidePoint as /class/input/input2
That's not so good.
Dmitry, do you have anything in there which might have caused that?
Perhaps hdaps-handle-errors-from-input_register_device.patch is triggering
for some reason.
Hmm, I'd be more concerned with i8042-get-rid-of-polling-timer patch...
Bingo! Reverting remove-polling-timer-from-i8042-v2.patch did the
trick. Now I'm running 2.6.18-rc3-mm2 + hot-fixes :-)
Still interested in dmesg with i8042.debug=1 ?
Yes, _with_ the i8042 polling patch applied. Do you have PNP support enabled?
--
Dmitry
Rafael J. Wysocki
2006-08-08 14:41:48 UTC
]--snip--[
Post by Dmitry Torokhov
Post by Fabio Comolli
Still interested in dmesg with i8042.debug=1 ?
Yes, _with_ the i8042 polling patch applied.
I've got one for you (attached).

Greetings,
Rafael
Dmitry Torokhov
2006-08-08 17:42:36 UTC
Post by Rafael J. Wysocki
]--snip--[
Post by Dmitry Torokhov
Post by Fabio Comolli
Still interested in dmesg with i8042.debug=1 ?
Yes, _with_ the i8042 polling patch applied.
I've got one for you (attached).
Thank you, I think I see what the problem is. Rafael, could you please
try booting with i8042.nomux and tell me if the mouse starts working?

Fabio, do you have a multiplexing controller as well?
--
Dmitry
Fabio Comolli
2006-08-08 18:16:56 UTC
Hi Dmitry.
Post by Dmitry Torokhov
Fabio, do you have a multiplexing controller as well?
Well, I don't even know what this means :-(
How do I know?

However, it's an HP laptop, model name Pavilion DV4378EA.
Post by Dmitry Torokhov
--
Dmitry
Fabio
Dmitry Torokhov
2006-08-08 18:24:26 UTC
Post by Fabio Comolli
Hi Dmitry.
Post by Dmitry Torokhov
Fabio, do you have a multiplexing controller as well?
Well, I don't even know what this means :-(
How do I know?
However, it's an HP laptop, model name Pavilion DV4378EA.
i8042.c: Detected active multiplexing controller, rev 1.1.
Could you please try booting with i8042.nomux and tell me if it works?

Thanks!
--
Dmitry
Fabio Comolli
2006-08-08 18:36:19 UTC
Hi.
Post by Dmitry Torokhov
Post by Fabio Comolli
Hi Dmitry.
Post by Dmitry Torokhov
Fabio, do you have a multiplexing controller as well?
Well, I don't even know what this means :-(
How do I know?
However, it's an HP laptop, model name Pavilion DV4378EA.
i8042.c: Detected active multiplexing controller, rev 1.1.
Could you please try booting with i8042.nomux and tell me if it works?
Yup, it works.
Post by Dmitry Torokhov
Thanks!
--
Dmitry
Ciao.
Fabio
Dmitry Torokhov
2006-08-09 03:47:22 UTC
Post by Fabio Comolli
Hi.
Post by Dmitry Torokhov
Post by Fabio Comolli
Hi Dmitry.
Post by Dmitry Torokhov
Fabio, do you have a multiplexing controller as well?
Well, I don't even know what this means :-(
How do I know?
However, it's an HP laptop, model name Pavilion DV4378EA.
i8042.c: Detected active multiplexing controller, rev 1.1.
Could you please try booting with i8042.nomux and tell me if it works?
Yup, it works.
Fabio, Rafael,

Could you please try applying the patch below on top of -rc3-mm2 and
see if it works without needing i8042.nomux?

Thank you!
--
Dmitry

Signed-off-by: Dmitry Torokhov <***@mail.ru>
---

drivers/input/serio/i8042.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

Index: work/drivers/input/serio/i8042.c
===================================================================
--- work.orig/drivers/input/serio/i8042.c
+++ work/drivers/input/serio/i8042.c
@@ -435,7 +435,7 @@ static int i8042_enable_mux_ports(void)
i8042_command(&param, I8042_CMD_AUX_ENABLE);
}

- return 0;
+ return i8042_enable_aux_port();
}

/*
Rafael J. Wysocki
2006-08-09 07:11:32 UTC
Post by Dmitry Torokhov
Post by Fabio Comolli
Hi.
Post by Dmitry Torokhov
Post by Fabio Comolli
Hi Dmitry.
Post by Dmitry Torokhov
Fabio, do you have a multiplexing controller as well?
Well, I don't even know what this means :-(
How do I know?
However, it's an HP laptop, model name Pavilion DV4378EA.
i8042.c: Detected active multiplexing controller, rev 1.1.
Could you please try booting with i8042.nomux and tell me if it works?
Yup, it works.
Fabio, Rafael,
Could you please try applying the patch below on top of -rc3-mm2 and
see if it works without needing i8042.nomux?
Yes, it does.

Thanks,
Rafael
Rafael J. Wysocki
2006-08-08 20:32:47 UTC
Post by Dmitry Torokhov
Post by Rafael J. Wysocki
]--snip--[
Post by Dmitry Torokhov
Post by Fabio Comolli
Still interested in dmesg with i8042.debug=1 ?
Yes, _with_ the i8042 polling patch applied.
I've got one for you (attached).
Thank you, I think I see what the problem is. Rafael, could you please
try booting with i8042.nomux and tell me if the mouse starts working?
It's a touchpad, but I guess that doesn't make a difference?

Rafael
Fabio Comolli
2006-08-08 18:14:09 UTC
Hi.
Post by Dmitry Torokhov
Post by Fabio Comolli
Hi.
Post by Dmitry Torokhov
Post by Andrew Morton
-tycho kernel: input: PS/2 Mouse as /class/input/input1
-tycho kernel: input: AlpsPS/2 ALPS GlidePoint as /class/input/input2
That's not so good.
Dmitry, do you have anything in there which might have caused that?
Perhaps hdaps-handle-errors-from-input_register_device.patch is triggering
for some reason.
Hmm, I'd be more concerned with i8042-get-rid-of-polling-timer patch...
Bingo! Reverting remove-polling-timer-from-i8042-v2.patch did the
trick. Now I'm running 2.6.18-rc3-mm2 + hot-fixes :-)
Still interested in dmesg with i8042.debug=1 ?
Yes, _with_ the i8042 polling patch applied. Do you have PNP support enabled?
--
Dmitry
Please find the compressed log attached. And no, I don't have PNP
support enabled.
Hope this helps.

Fabio
Rafael J. Wysocki
2006-08-06 22:42:10 UTC
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.

Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.

Right now I'm doing a binary search for the offending patch.
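
Bisecting an ordered patch series such as -mm amounts to a binary search over the patch list: apply the first N patches, test, and halve the range. A minimal sketch of that search logic follows. `first_bad_patch()`, the `is_broken` callback, and the example predicate `broken_from()` are hypothetical names, and the sketch assumes a single offending patch, a working base kernel with no patches applied, and breakage that persists once the offending patch is in.

```c
#include <stddef.h>

/*
 * is_broken(n, ctx) reports whether the tree with the first n patches
 * applied misbehaves. Assumes is_broken(0) is false, is_broken(total)
 * is true, and the result flips exactly once in between.
 * Returns the 1-based index of the first patch that breaks things.
 */
static inline size_t first_bad_patch(size_t total,
				     int (*is_broken)(size_t n, void *ctx),
				     void *ctx)
{
	size_t good = 0;	/* largest prefix known to work */
	size_t bad = total;	/* smallest prefix known to fail */

	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_broken(mid, ctx))
			bad = mid;
		else
			good = mid;
	}
	return bad;
}

/* Example predicate: the tree is broken once patch number *ctx is applied. */
static inline int broken_from(size_t n, void *ctx)
{
	return n >= *(const size_t *)ctx;
}
```

In practice `is_broken` stands for a manual build-and-boot cycle rather than a function call. For a series of 1136 patches, as in this release, the loop needs at most 11 such cycles.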

Greetings,
Rafael
Andrew Morton
2006-08-06 22:54:54 UTC
On Mon, 7 Aug 2006 00:42:10 +0200
Post by Rafael J. Wysocki
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.
Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.
Right now I'm doing a binary search for the offending patch.
Thanks. I'd zoom in on
hdaps-handle-errors-from-input_register_device.patch and git-input.patch.
Rafael J. Wysocki
2006-08-07 09:15:45 UTC
Post by Andrew Morton
On Mon, 7 Aug 2006 00:42:10 +0200
Post by Rafael J. Wysocki
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.
Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.
Right now I'm doing a binary search for the offending patch.
Thanks. I'd zoom in on
hdaps-handle-errors-from-input_register_device.patch and git-input.patch.
None of these, but close: remove-polling-timer-from-i8042-v2.patch breaks
things here. [FYI, the box is booted with "noapic", because the IRQ sharing
doesn't work otherwise due to a BIOS issue, so it may be related.]

Attached is the dmesg output with i8042.debug=1 for Dmitry. It's from
2.6.18-rc3 with -mm2 partially applied (up to and including
logips2pp-fix-mx300-button-layout.patch). I'll apply the rest tonight, after
I find the patch that broke suspend for me.

BTW, I couldn't test -rc4, because I don't use git and there's no standalone
version so far. I hope it will be available?

[Now, I have an emergency to handle, so I won't be reachable before tonight,
I think.]

Greetings,
Rafael
Rafael J. Wysocki
2006-08-07 20:34:12 UTC
Post by Rafael J. Wysocki
Post by Andrew Morton
On Mon, 7 Aug 2006 00:42:10 +0200
Post by Rafael J. Wysocki
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.
Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.
Right now I'm doing a binary search for the offending patch.
Thanks. I'd zoom in on
hdaps-handle-errors-from-input_register_device.patch and git-input.patch.
None of these, but close: remove-polling-timer-from-i8042-v2.patch breaks
things here. [FYI, the box is booted with "noapic", because the IRQ sharing
doesn't work otherwise due to a BIOS issue, so it may be related.]
Attached is the dmesg output with i8042.debug=1 for Dmitry. It's from
2.6.18-rc3 with -mm2 partially applied (up to and including
logips2pp-fix-mx300-button-layout.patch). I'll apply the rest tonight, after
I find the patch that broke suspend for me.
Unfortunately this one is git-block.patch. I have no idea which part of it
may break the suspend.

It hangs during suspend, right after the memory has been shrunk, when devices
should be suspended. After pressing SysRq-P it shows it's spinning in the
idle thread and then hangs hard.

Greetings,
Rafael
Andrew Morton
2006-08-07 20:55:37 UTC
On Mon, 7 Aug 2006 22:34:12 +0200
Post by Rafael J. Wysocki
Post by Rafael J. Wysocki
Post by Andrew Morton
On Mon, 7 Aug 2006 00:42:10 +0200
Post by Rafael J. Wysocki
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.
Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.
Right now I'm doing a binary search for the offending patch.
Thanks. I'd zoom in on
hdaps-handle-errors-from-input_register_device.patch and git-input.patch.
None of these, but close: remove-polling-timer-from-i8042-v2.patch breaks
things here. [FYI, the box is booted with "noapic", because the IRQ sharing
doesn't work otherwise due to a BIOS issue, so it may be related.]
Attached is the dmesg output with i8042.debug=1 for Dmitry. It's from
2.6.18-rc3 with -mm2 partially applied (up to and including
logips2pp-fix-mx300-button-layout.patch). I'll apply the rest tonight, after
I find the patch that broke suspend for me.
Unfortunately this one is git-block.patch. I have no idea which part of it
may break the suspend.
ow, that tree is pretty huge at present.
Post by Rafael J. Wysocki
It hangs during suspend, right after the memory has been shrunk, when devices
should be suspended. After pressing SysRq-P it shows it's spinning in the
idle thread and then hangs hard.
OK, thanks for doing that. I'll drop git-block until we can get it sorted.
Jens Axboe
2006-08-08 05:21:18 UTC
Post by Andrew Morton
On Mon, 7 Aug 2006 22:34:12 +0200
Post by Rafael J. Wysocki
Post by Rafael J. Wysocki
Post by Andrew Morton
On Mon, 7 Aug 2006 00:42:10 +0200
Post by Rafael J. Wysocki
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.
Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.
Right now I'm doing a binary search for the offending patch.
Thanks. I'd zoom in on
hdaps-handle-errors-from-input_register_device.patch and git-input.patch.
None of these, but close: remove-polling-timer-from-i8042-v2.patch breaks
things here. [FYI, the box is booted with "noapic", because the IRQ sharing
doesn't work otherwise due to a BIOS issue, so it may be related.]
Attached is the dmesg output with i8042.debug=1 for Dmitry. It's from
2.6.18-rc3 with -mm2 partially applied (up to and including
logips2pp-fix-mx300-button-layout.patch). I'll apply the rest tonight, after
I find the patch that broke suspend for me.
Unfortunately this one is git-block.patch. I have no idea which part of it
may break the suspend.
ow, that tree is pretty huge at present.
Post by Rafael J. Wysocki
It hangs during suspend, right after the memory has been shrunk, when devices
should be suspended. After pressing SysRq-P it shows it's spinning in the
idle thread and then hangs hard.
OK, thanks for doing that. I'll drop git-block until we can get it sorted.
I think I know what it is, hang on.
--
Jens Axboe
Dmitry Torokhov
2006-08-07 02:18:58 UTC
Post by Rafael J. Wysocki
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.
Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.
Right now I'm doing a binary search for the offending patch.
Can I please have dmesg with i8042.debug=1?
--
Dmitry
Dmitry Torokhov
2006-08-07 02:20:54 UTC
Permalink
Post by Dmitry Torokhov
Post by Rafael J. Wysocki
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
My box's (Asus L5D, x86_64) keyboard doesn't work on this kernel at all, even
if I boot with init=/bin/bash. On the 2.6.18-rc2-mm1 it worked.
Unfortunately I have no indication what can be wrong, no oopses, no error
messages in dmesg, nothing.
Right now I'm doing a binary search for the offending patch.
Can I please have dmesg with i8042.debug=1?
Btw, does 2.6.18-rc4 work?
--
Dmitry
Grant Coady
2006-08-07 02:07:34 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
Okay here, done some fdisk partition manipulation and didn't lose
any filesystems or any other nasties. ;) Dual boot 'doze, so
stuffing around with NTFS (ro) as well as NFS (rw).

Some odd-looking IRQ reassignments (VIA chipset). I've put up the
-rc3 -> -rc3-mm2 dmesg diff, as well as the dmesg and config, at
<http://bugsplatter.mine.nu/test/linux-2.6/sempro/> if anyone is
curious.

Grant.
Jiri Slaby
2006-08-07 09:28:59 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
I tried it and guess what :)... swsusp doesn't work :@.

This time I was able to dump process states with sysrq-t:
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif

My guess is that ide2/2.0 dies (hpt370 driver), since the last thing the
kernel prints is "suspending device 2.0".

diff of dmesgs:
--- rc2 2006-08-07 11:13:34.000000000 +0200
+++ rc3 2006-08-07 11:13:39.000000000 +0200
@@ -1,4 +1,4 @@
-Linux version 2.6.18-rc2-mm1 (***@bellona) (gcc version 4.1.1 20060721 (Red Hat 4.1.1-13)) #155 SMP Tue Aug 1 01:17:45 CEST 2006
+Linux version 2.6.18-rc3-mm2 (***@bellona) (gcc version 4.1.1 20060802 (Red Hat 4.1.1-14)) #157 SMP Sun Aug 6 19:38:53 CEST 2006
BIOS-provided physical RAM map:
sanitize start
sanitize end
@@ -49,7 +49,7 @@
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
-Detected 2736.278 MHz processor.
+Detected 2736.289 MHz processor.
Built 1 zonelists. Total pages: 262128
Kernel command line: ro root=/dev/hda2 reboot=w vga=1 2
mapped APIC to ffffd000 (fee00000)
@@ -57,14 +57,22 @@
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
-CPU 0 irqstacks, hard=c0505000 soft=c0502000
+CPU 0 irqstacks, hard=c0509000 soft=c0506000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Console: colour VGA+ 80x50
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
-Memory: 1034488k/1048512k available (2514k kernel code, 13456k reserved, 1349k data, 200k init, 131008k highmem)
+Memory: 1034472k/1048512k available (2522k kernel code, 13472k reserved, 1353k data, 204k init, 131008k highmem)
+virtual kernel memory layout:
+ fixmap : 0xfff90000 - 0xfffff000 ( 444 kB)
+ pkmap : 0xff800000 - 0xffc00000 (4096 kB)
+ vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
+ lowmem : 0xc0000000 - 0xf8000000 ( 896 MB)
+ .init : 0xc04ce000 - 0xc0501000 ( 204 kB)
+ .data : 0xc03768d2 - 0xc04c8ff8 (1353 kB)
+ .text : 0xc0100000 - 0xc03768d2 (2522 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
-Calibrating delay using timer specific routine.. 5476.47 BogoMIPS (lpj=10952942)
+Calibrating delay using timer specific routine.. 5476.48 BogoMIPS (lpj=10952969)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
@@ -82,9 +90,9 @@
CPU0: Intel(R) Pentium(R) 4 CPU 2.60GHz stepping 09
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
-CPU 1 irqstacks, hard=c0506000 soft=c0503000
+CPU 1 irqstacks, hard=c050a000 soft=c0507000
Initializing CPU#1
-Calibrating delay using timer specific routine.. 5472.77 BogoMIPS (lpj=10945546)
+Calibrating delay using timer specific routine.. 5472.79 BogoMIPS (lpj=10945581)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
@@ -96,15 +104,15 @@
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 2.60GHz stepping 09
-Total of 2 processors activated (10949.24 BogoMIPS).
+Total of 2 processors activated (10949.27 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
-migration_cost=111
+migration_cost=1
NET: Registered protocol family 16
ACPI: bus type pci registered
-PCI: PCI BIOS revision 2.10 entry at 0xfb670, last bus=2
+PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
@@ -189,7 +197,7 @@
ACPI: PCI Interrupt 0000:02:01.0[A] -> GSI 21 (level, low) -> IRQ 19
HPT370: chipset revision 3
HPT370: no clock data saved by BIOS
-HPT370: DPLL base: 48 MHz, f_CNT: 146, assuming 33 MHz PCI
+HPT370: DPLL base: 48 MHz, f_CNT: 148, assuming 33 MHz PCI
HPT370: using 33 MHz PCI clock
HPT370: 100% native mode on irq 19
ide2: BM-DMA at 0x9000-0x9007, BIOS settings: hde:DMA, hdf:pio
@@ -243,7 +251,7 @@
usb usb1: new device found, idVendor=0000, idProduct=0000
usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: EHCI Host Controller
-usb usb1: Manufacturer: Linux 2.6.18-rc2-mm1 ehci_hcd
+usb usb1: Manufacturer: Linux 2.6.18-rc3-mm2 ehci_hcd
usb usb1: SerialNumber: 0000:00:1d.7
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
@@ -257,7 +265,7 @@
usb usb2: new device found, idVendor=0000, idProduct=0000
usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: UHCI Host Controller
-usb usb2: Manufacturer: Linux 2.6.18-rc2-mm1 uhci_hcd
+usb usb2: Manufacturer: Linux 2.6.18-rc3-mm2 uhci_hcd
usb usb2: SerialNumber: 0000:00:1d.0
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
@@ -270,7 +278,7 @@
usb usb3: new device found, idVendor=0000, idProduct=0000
usb usb3: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb3: Product: UHCI Host Controller
-usb usb3: Manufacturer: Linux 2.6.18-rc2-mm1 uhci_hcd
+usb usb3: Manufacturer: Linux 2.6.18-rc3-mm2 uhci_hcd
usb usb3: SerialNumber: 0000:00:1d.1
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
@@ -283,7 +291,7 @@
usb usb4: new device found, idVendor=0000, idProduct=0000
usb usb4: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb4: Product: UHCI Host Controller
-usb usb4: Manufacturer: Linux 2.6.18-rc2-mm1 uhci_hcd
+usb usb4: Manufacturer: Linux 2.6.18-rc3-mm2 uhci_hcd
usb usb4: SerialNumber: 0000:00:1d.2
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
@@ -296,7 +304,7 @@
usb usb5: new device found, idVendor=0000, idProduct=0000
usb usb5: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb5: Product: UHCI Host Controller
-usb usb5: Manufacturer: Linux 2.6.18-rc2-mm1 uhci_hcd
+usb usb5: Manufacturer: Linux 2.6.18-rc3-mm2 uhci_hcd
usb usb5: SerialNumber: 0000:00:1d.3
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
@@ -312,8 +320,8 @@
input: Wacom Graphire2 4x5 as /class/input/input0
usbcore: registered new interface driver wacom
/l/latest/xxx/drivers/usb/input/wacom.c: v1.45:USB Wacom Graphire and Wacom Intuos tablet driver
-serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
+serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
it87: Found IT8712F chip at 0x290, revision 5
md: raid0 personality registered for level 0
@@ -325,6 +333,7 @@
No soundcards found.
oprofile: using NMI interrupt.
ip_conntrack version 2.4 (8191 buckets, 65528 max) - 208 bytes per conntrack
+input: AT Translated Set 2 keyboard as /class/input/input1
ip_tables: (C) 2000-2006 Netfilter Core Team
TCP bic registered
NET: Registered protocol family 1
@@ -336,13 +345,15 @@
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
+EXT3-fs: INFO: recovery required on readonly filesystem.
+EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
+EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
-Freeing unused kernel memory: 200k freed
-input: AT Translated Set 2 keyboard as /class/input/input1
+Freeing unused kernel memory: 204k freed
ieee1394: Initialized config rom entry `ip1394'
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
hdd: ATAPI 40X DVD-ROM CD-R/RW drive, 2048kB Cache, UDMA(33)
ACPI: PCI Interrupt 0000:02:05.0[A] -> GSI 21 (level, low) -> IRQ 19
@@ -387,3 +398,5 @@
EXT3 FS on md0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 506036k swap on /dev/hda3. Priority:-1 extents:1 across:506036k
+JBD: barrier-based sync failed on hda2 - disabling barriers
+JBD: barrier-based sync failed on md0 - disabling barriers

regards,
--
<a href="http://www.fi.muni.cz/~xslaby/">Jiri Slaby</a>
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jason Lunz
2006-08-07 16:23:24 UTC
Permalink
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel prints is
suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch

That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.

Jason
Rafael J. Wysocki
2006-08-07 20:47:59 UTC
Permalink
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel prints is
suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.

Rafael
Jens Axboe
2006-08-08 08:41:16 UTC
Permalink
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel prints is
suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
Can you apply this on top of -mm and see if that fixes it?

diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index d2339e9..db647a9 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -390,7 +390,7 @@ void ide_end_drive_cmd (ide_drive_t *dri
args[5] = hwif->INB(IDE_HCYL_REG);
args[6] = hwif->INB(IDE_SELECT_REG);
}
- } else if (rq->cmd_type & REQ_TYPE_ATA_TASKFILE) {
+ } else if (rq->cmd_type == REQ_TYPE_ATA_TASKFILE) {
ide_task_t *args = (ide_task_t *) rq->special;
if (rq->errors == 0)
rq->errors = !OK_STAT(stat,READY_STAT,BAD_STAT);
--
Jens Axboe
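
Why the one-character change in the patch above can matter: if cmd_type holds an enumerated value rather than a set of flags, a bitwise AND will also match other types that happen to share bits with the one being tested. A minimal sketch, assuming made-up enum values (these are NOT the kernel's real REQ_TYPE_* numbers, and the helper names are hypothetical):

```c
#include <assert.h>

/* Illustrative values only; not the kernel's actual command types. */
enum demo_cmd_type {
    DEMO_TYPE_FS = 1,
    DEMO_TYPE_ATA_TASKFILE = 2,
    DEMO_TYPE_PM_SUSPEND = 3,
    DEMO_TYPE_MAX
};

/* Old test: treats an enumerated value as if it were a flag word. */
int is_taskfile_bitwise(enum demo_cmd_type t)
{
    assert(t > 0 && t < DEMO_TYPE_MAX);
    return (t & DEMO_TYPE_ATA_TASKFILE) != 0;
}

/* New test: exact comparison against the one intended type. */
int is_taskfile_exact(enum demo_cmd_type t)
{
    assert(t > 0 && t < DEMO_TYPE_MAX);
    return t == DEMO_TYPE_ATA_TASKFILE;
}
```

With these values a PM suspend request (3) passes the bitwise test because 3 & 2 is nonzero, so it would be handled as a taskfile request before the PM branch is ever reached. The exact comparison does not have that problem.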

Jiri Slaby
2006-08-08 09:49:20 UTC
Permalink
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel prints is
suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
Can you apply this on top of -mm and see if that fixes it?
It doesn't solve the problem for me.
Post by Jens Axboe
diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index d2339e9..db647a9 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -390,7 +390,7 @@ void ide_end_drive_cmd (ide_drive_t *dri
args[5] = hwif->INB(IDE_HCYL_REG);
args[6] = hwif->INB(IDE_SELECT_REG);
}
- } else if (rq->cmd_type & REQ_TYPE_ATA_TASKFILE) {
+ } else if (rq->cmd_type == REQ_TYPE_ATA_TASKFILE) {
ide_task_t *args = (ide_task_t *) rq->special;
if (rq->errors == 0)
rq->errors = !OK_STAT(stat,READY_STAT,BAD_STAT);
regards,
--
<a href="http://www.fi.muni.cz/~xslaby/">Jiri Slaby</a>
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Jens Axboe
2006-08-08 10:43:04 UTC
Permalink
Post by Jiri Slaby
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
Can you apply this on top of -mm and see if that fixes it?
It doesn't solve the problem for me.
Ok, thanks for testing, I'll try and reproduce it here.
--
Jens Axboe

Jiri Slaby
2006-08-08 10:08:20 UTC
Permalink
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel prints is
suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect the elevator changes. The wait_for_completion() in ide-io is never
woken by ll_rw_blk. But I don't understand the block layer very well. Where
should blk_end_sync_rq() be called from (and why is it not called at all)?

regards,
--
<a href="http://www.fi.muni.cz/~xslaby/">Jiri Slaby</a>
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Jens Axboe
2006-08-08 10:43:54 UTC
Permalink
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Post by Jiri Slaby
Where the blk_end_sync_rq should be called from (why is not called at
all)?
It's called from ->end_io() in end_that_request_last().
--
Jens Axboe
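
The call path described above (the request's ->end_io() hook being run when the request is finished, which in turn wakes the waiter) can be modelled in userspace roughly like this. It is a simplified sketch: the demo_* types and functions are stand-ins invented here, a plain flag replaces the kernel's struct completion, and none of this is the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct completion: just a flag. */
struct demo_completion {
    int done;
};

/* Stand-in for struct request, reduced to the fields discussed here. */
struct demo_request {
    void (*end_io)(struct demo_request *rq, int error);
    void *end_io_data;
    int errors;
};

/* Plays the role of blk_end_sync_rq(): wake whoever waits on the request. */
void demo_end_sync_rq(struct demo_request *rq, int error)
{
    struct demo_completion *waiting = rq->end_io_data;

    assert(waiting != NULL);
    rq->errors = error;
    rq->end_io_data = NULL;
    waiting->done = 1; /* the kernel would call complete() here */
}

/* Plays the role of end_that_request_last(): run the hook if one is set. */
void demo_end_request_last(struct demo_request *rq, int error)
{
    assert(rq != NULL);
    if (rq->end_io)
        rq->end_io(rq, error);
}
```

If end_io_data does not point at the waiter's completion when the hook runs, the waiter is never woken, which matches the hang described in this thread.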
Rafael J. Wysocki
2006-08-08 10:59:15 UTC
Permalink
Post by Jens Axboe
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Actually I think the commit f74bf2e6b415588e562fdcfdd454d587eb33cd46
(Remove ->waiting member from struct request) is wrong, because
generic_ide_suspend() uses the end_io_data member of rq to pass the PM data
to ide_do_drive_cmd(), where the pointer gets overwritten by &wait (must_wait
is "true", because action == ide_wait). Previously &wait was stored in
rq->waiting, so it didn't overwrite the PM data.

Haven't tested yet, though.

Greetings,
Rafael
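
The aliasing problem described above boils down to two users sharing one pointer slot. A minimal userspace sketch, assuming hypothetical clash_* names and a request reduced to that single field (the real struct request has many more members):

```c
#include <assert.h>
#include <stddef.h>

struct clash_pm_state {
    int pm_step;
};

struct clash_completion {
    int done;
};

struct clash_request {
    void *end_io_data; /* the one slot both users compete for */
};

/* Suspend path: stash the PM state in end_io_data. */
void clash_prepare_suspend(struct clash_request *rq, struct clash_pm_state *pm)
{
    assert(rq != NULL && pm != NULL);
    rq->end_io_data = pm;
}

/* Queueing path with must_wait set: reuses the same slot for the waiter. */
void clash_queue_and_wait(struct clash_request *rq, struct clash_completion *wait)
{
    assert(rq != NULL && wait != NULL);
    rq->end_io_data = wait; /* the PM pointer is lost at this point */
}

/* Nonzero if the PM code would still find its own state in the request. */
int clash_pm_state_intact(const struct clash_request *rq,
                          const struct clash_pm_state *pm)
{
    return rq->end_io_data == (const void *)pm;
}
```

After clash_queue_and_wait() the PM handlers would read a completion as if it were PM state, so the power step machine cannot advance correctly.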
Jens Axboe
2006-08-08 11:04:47 UTC
Permalink
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Actually I think the commit f74bf2e6b415588e562fdcfdd454d587eb33cd46
(Remove ->waiting member from struct request) is wrong, because
generic_ide_suspend() uses the end_of_io member of rq to pass the PM data
to ide_do_drive_cmd() where the pointer gets overwritten by &wait (must_wait
is "true", because action == ide_wait). Previously &wait was stored in
rq->waiting and it didn't overwrite the PM data.
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?

I'll concoct a fix for that breakage.
--
Jens Axboe

Jens Axboe
2006-08-08 11:07:26 UTC
Permalink
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Actually I think the commit f74bf2e6b415588e562fdcfdd454d587eb33cd46
(Remove ->waiting member from struct request) is wrong, because
generic_ide_suspend() uses the end_of_io member of rq to pass the PM data
to ide_do_drive_cmd() where the pointer gets overwritten by &wait (must_wait
is "true", because action == ide_wait). Previously &wait was stored in
rq->waiting and it didn't overwrite the PM data.
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?
I'll concoct a fix for that breakage.
Something like this.

diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index db647a9..38479a2 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -141,7 +141,7 @@ enum {

static void ide_complete_power_step(ide_drive_t *drive, struct request *rq, u8 stat, u8 error)
{
- struct request_pm_state *pm = rq->end_io_data;
+ struct request_pm_state *pm = rq->data;

if (drive->media != ide_disk)
return;
@@ -164,7 +164,7 @@ static void ide_complete_power_step(ide_

static ide_startstop_t ide_start_power_step(ide_drive_t *drive, struct request *rq)
{
- struct request_pm_state *pm = rq->end_io_data;
+ struct request_pm_state *pm = rq->data;
ide_task_t *args = rq->special;

memset(args, 0, sizeof(*args));
@@ -421,7 +421,7 @@ void ide_end_drive_cmd (ide_drive_t *dri
}
}
} else if (blk_pm_request(rq)) {
- struct request_pm_state *pm = rq->end_io_data;
+ struct request_pm_state *pm = rq->data;
#ifdef DEBUG_PM
printk("%s: complete_power_step(step: %d, stat: %x, err: %x)\n",
drive->name, rq->pm->pm_step, stat, err);
@@ -933,7 +933,7 @@ #endif

static void ide_check_pm_state(ide_drive_t *drive, struct request *rq)
{
- struct request_pm_state *pm = rq->end_io_data;
+ struct request_pm_state *pm = rq->data;

if (blk_pm_suspend_request(rq) &&
pm->pm_step == ide_pm_state_start_suspend)
@@ -1018,7 +1018,7 @@ #endif
rq->cmd_type == REQ_TYPE_ATA_TASKFILE)
return execute_drive_cmd(drive, rq);
else if (blk_pm_request(rq)) {
- struct request_pm_state *pm = rq->end_io_data;
+ struct request_pm_state *pm = rq->data;
#ifdef DEBUG_PM
printk("%s: start_power_step(step: %d)\n",
drive->name, rq->pm->pm_step);
diff --git a/drivers/ide/ide.c b/drivers/ide/ide.c
index d7b4499..0fd1e1c 100644
--- a/drivers/ide/ide.c
+++ b/drivers/ide/ide.c
@@ -1219,7 +1219,7 @@ static int generic_ide_suspend(struct de
memset(&args, 0, sizeof(args));
rq.cmd_type = REQ_TYPE_PM_SUSPEND;
rq.special = &args;
- rq.end_io_data = &rqpm;
+ rq.data = &rqpm;
rqpm.pm_step = ide_pm_state_start_suspend;
rqpm.pm_state = state.event;

@@ -1238,7 +1238,7 @@ static int generic_ide_resume(struct dev
memset(&args, 0, sizeof(args));
rq.cmd_type = REQ_TYPE_PM_RESUME;
rq.special = &args;
- rq.end_io_data = &rqpm;
+ rq.data = &rqpm;
rqpm.pm_step = ide_pm_state_start_resume;
rqpm.pm_state = PM_EVENT_ON;
--
Jens Axboe
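
The idea behind the patch above is to give the PM state its own slot (rq->data) so that end_io_data stays free for the completion waiter. A userspace sketch of that layout, with hypothetical fix_* names and a request reduced to the two fields involved:

```c
#include <assert.h>
#include <stddef.h>

struct fix_pm_state {
    int pm_step;
};

struct fix_completion {
    int done;
};

struct fix_request {
    void *data;        /* driver payload: the PM state lives here */
    void *end_io_data; /* left for the completion waiter */
};

/* Suspend path: PM state goes into ->data instead of ->end_io_data. */
void fix_prepare_suspend(struct fix_request *rq, struct fix_pm_state *pm)
{
    assert(rq != NULL && pm != NULL);
    rq->data = pm;
}

/* Queueing path: installing the waiter no longer touches the PM pointer. */
void fix_queue_and_wait(struct fix_request *rq, struct fix_completion *wait)
{
    assert(rq != NULL && wait != NULL);
    rq->end_io_data = wait;
}

/* Advance the PM step counter, reading the state back from ->data. */
int fix_complete_power_step(struct fix_request *rq)
{
    struct fix_pm_state *pm = rq->data;

    assert(pm != NULL);
    return ++pm->pm_step;
}
```

Both pointers now survive side by side, so the power step handlers and the completion path each find what they expect.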

Rafael J. Wysocki
2006-08-08 11:16:00 UTC
Permalink
Post by Jens Axboe
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Actually I think the commit f74bf2e6b415588e562fdcfdd454d587eb33cd46
(Remove ->waiting member from struct request) is wrong, because
generic_ide_suspend() uses the end_of_io member of rq to pass the PM data
to ide_do_drive_cmd() where the pointer gets overwritten by &wait (must_wait
is "true", because action == ide_wait). Previously &wait was stored in
rq->waiting and it didn't overwrite the PM data.
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?
Hm, I'm not sure what you mean ...
Post by Jens Axboe
Post by Jens Axboe
I'll concoct a fix for that breakage.
Something like this.
Looks good, I'll give it a try.

Rafael
Jens Axboe
2006-08-08 11:19:25 UTC
Permalink
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Actually I think the commit f74bf2e6b415588e562fdcfdd454d587eb33cd46
(Remove ->waiting member from struct request) is wrong, because
generic_ide_suspend() uses the end_of_io member of rq to pass the PM data
to ide_do_drive_cmd() where the pointer gets overwritten by &wait (must_wait
is "true", because action == ide_wait). Previously &wait was stored in
rq->waiting and it didn't overwrite the PM data.
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?
Hm, I'm not sure what you mean ...
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)

But perhaps that wasn't you?
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
I'll concoct a fix for that breakage.
Something like this.
Looks good, I'll give it a try.
Thanks!
--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki
2006-08-08 13:50:35 UTC
Permalink
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Actually I think the commit f74bf2e6b415588e562fdcfdd454d587eb33cd46
(Remove ->waiting member from struct request) is wrong, because
generic_ide_suspend() uses the end_of_io member of rq to pass the PM data
to ide_do_drive_cmd() where the pointer gets overwritten by &wait (must_wait
is "true", because action == ide_wait). Previously &wait was stored in
rq->waiting and it didn't overwrite the PM data.
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?
Hm, I'm not sure what you mean ...
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
Ah, that.
Post by Jiri Slaby
But perhaps that wasn't you?
No, that wasn't me. :-)
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
I'll concoct a fix for that breakage.
Something like this.
Looks good, I'll give it a try.
Thanks!
It fixes this particular issue for me, but your first patch (appended) is also
needed to prevent the box from hanging later during the resume (when it
tries to save the image).

Thanks,
Rafael


--
drivers/ide/ide-io.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.18-rc3-mm2/drivers/ide/ide-io.c
===================================================================
--- linux-2.6.18-rc3-mm2.orig/drivers/ide/ide-io.c
+++ linux-2.6.18-rc3-mm2/drivers/ide/ide-io.c
@@ -402,7 +402,7 @@ void ide_end_drive_cmd (ide_drive_t *dri
args[5] = hwif->INB(IDE_HCYL_REG);
args[6] = hwif->INB(IDE_SELECT_REG);
}
- } else if (rq->cmd_type & REQ_TYPE_ATA_TASKFILE) {
+ } else if (rq->cmd_type == REQ_TYPE_ATA_TASKFILE) {
ide_task_t *args = (ide_task_t *) rq->special;
if (rq->errors == 0)
rq->errors = !OK_STAT(stat,READY_STAT,BAD_STAT);
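
A note on why the one-character change above matters. This is a minimal sketch, assuming cmd_type holds one value out of a consecutively numbered enumeration rather than a set of flag bits; the names and numbers below are invented for illustration and are not the kernel's. With such values, a bitwise AND also matches unrelated request types that happen to share a bit with the taskfile value, while an equality test does not.

```c
#include <stdbool.h>

/* Hypothetical request types numbered consecutively, the way an enum is.
 * These values are illustrative only; they are not the kernel's. */
enum demo_cmd_type {
	DEMO_TYPE_FS = 1,
	DEMO_TYPE_PM_SUSPEND = 2,
	DEMO_TYPE_ATA_CMD = 3,
	DEMO_TYPE_ATA_TASKFILE = 4,
	DEMO_TYPE_ATA_PC = 5,
};

/* Buggy test: treats the enum value as if it were a set of flag bits. */
bool is_taskfile_bitwise(enum demo_cmd_type t)
{
	return (t & DEMO_TYPE_ATA_TASKFILE) != 0;
}

/* Fixed test: compares the whole value for equality. */
bool is_taskfile_equal(enum demo_cmd_type t)
{
	return t == DEMO_TYPE_ATA_TASKFILE;
}
```

In this sketch DEMO_TYPE_ATA_PC (5) shares bit 2 with DEMO_TYPE_ATA_TASKFILE (4), so the bitwise test wrongly takes the taskfile branch for it.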
Jens Axboe
2006-08-08 14:06:01 UTC
Permalink
Post by Rafael J. Wysocki
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel
prints is suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
I found that git-block.patch broke the suspend for me. Still have no idea
what's up with it.
I suspect elevator changes. The wait_for_completion is not woken in
ide-io by ll_rw_blk. But I don't understand block layer too much.
The ide changes are far more likely, it's probably missing a completion.
Actually I think the commit f74bf2e6b415588e562fdcfdd454d587eb33cd46
(Remove ->waiting member from struct request) is wrong, because
generic_ide_suspend() uses the end_of_io member of rq to pass the PM data
to ide_do_drive_cmd() where the pointer gets overwritten by &wait (must_wait
is "true", because action == ide_wait). Previously &wait was stored in
rq->waiting and it didn't overwrite the PM data.
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?
Hm, I'm not sure what you mean ...
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
Ah, that.
Post by Jiri Slaby
But perhaps that wasn't you?
No, that wasn't me. :-)
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
I'll concoct a fix for that breakage.
Something like this.
Looks good, I'll give it a try.
Thanks!
It fixes this particular issue for me, but your first patch (appended)
is also needed to prevent the box from hanging later during the resume
(when it tries to save the image).
Yes certainly, that's a separate bug, sorry if I didn't make that clear.
Both fixes are in the block repo now, so next -mm should work fine
again.
--
Jens Axboe

Jiri Slaby
2006-08-08 16:41:17 UTC
Permalink
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
Post by Jens Axboe
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?
Hm, I'm not sure what you mean ...
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
Ah, that.
Post by Jiri Slaby
But perhaps that wasn't you?
No, that wasn't me. :-)
It was me and it's OK.
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
I'll concoct a fix for that breakage.
Something like this.
Looks good, I'll give it a try.
Thanks!
It fixes this particular issue for me, but your first patch (appended)
is also needed to prevent the box from hanging later during the resume
(when it tries to save the image).
Yes certainly, that's a separate bug, sorry if I didn't make that clear.
Both fixes are in the block repo now, so next -mm should work fine
again.
And even this is OK.

I'm just curious, what
@@ -387,3 +398,5 @@
EXT3 FS on md0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 506036k swap on /dev/hda3. Priority:-1 extents:1 across:506036k
+JBD: barrier-based sync failed on hda2 - disabling barriers
+JBD: barrier-based sync failed on md0 - disabling barriers

means. Another bug?

thanks,
--
<a href="http://www.fi.muni.cz/~xslaby/">Jiri Slaby</a>
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Jens Axboe
2006-08-08 17:53:43 UTC
Permalink
Post by Jiri Slaby
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
Post by Jens Axboe
Indeed, that looks broken now. That must be what is screwing it up. With
the former patch applied, did cdrom detection still look funny to you?
Hm, I'm not sure what you mean ...
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
Ah, that.
Post by Jiri Slaby
But perhaps that wasn't you?
No, that wasn't me. :-)
It was me and it's OK.
Post by Jens Axboe
Post by Rafael J. Wysocki
Post by Jiri Slaby
Post by Rafael J. Wysocki
Post by Jens Axboe
Post by Jens Axboe
I'll concoct a fix for that breakage.
Something like this.
Looks good, I'll give it a try.
Thanks!
It fixes this particular issue for me, but your first patch (appended)
is also needed to prevent the box from hanging later during the resume
(when it tries to save the image).
Yes certainly, that's a separate bug, sorry if I didn't make that clear.
Both fixes are in the block repo now, so next -mm should work fine
again.
And even this is OK.
Good.
Post by Jiri Slaby
I'm just curious, what
@@ -387,3 +398,5 @@
EXT3 FS on md0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 506036k swap on /dev/hda3. Priority:-1 extents:1 across:506036k
+JBD: barrier-based sync failed on hda2 - disabling barriers
+JBD: barrier-based sync failed on md0 - disabling barriers
I think that -mm also added barriers on by default for ext3, so I don't
think it's anything to worry about.
--
Jens Axboe

Jiri Slaby
2006-08-07 21:09:59 UTC
Permalink
Post by Jason Lunz
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
http://www.fi.muni.cz/~xslaby/sklad/ide2.gif
My guess is ide2/2.0 dies (hpt370 driver), since last thing kernel prints is
suspending device 2.0
Does it go away if you revert this?
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/broken-out/ide-reprogram-disk-pio-timings-on-resume.patch
No change.
Post by Jason Lunz
That should only affect resume, not suspend, but it does mess around
with ide power management. Is this maybe happening on the *second*
suspend?
Nope, the first one.
Post by Jason Lunz
Post by Jiri Slaby
-hdc: ATAPI 63X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
+hdc: ATAPI CD-ROM drive, 0kB Cache, UDMA(33)
This looks suspicious. -mm does have several ide-fix-hpt3xx patches.
But hdc is not on the hpt3xx controller.

regards,
--
<a href="http://www.fi.muni.cz/~xslaby/">Jiri Slaby</a>
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Andy Whitcroft
2006-08-07 13:40:45 UTC
Permalink
It seems that the command line on x86_64 is being truncated during boot:

Bootdata ok (command line is root=/dev/sda1 ro profile=2 console=tty0
console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1154470592
profile=2)
[...]
Kernel command line: root=/dev/sda1 ro profile=2 console=tty0
console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1154470592 profile=2
[...]
elm3b6:~# cat /proc/cmdline
root=/dev/sda1

This seems to be occurring around the parse_args area.

Will try and track it down.

-apw
Andi Kleen
2006-08-07 14:05:55 UTC
Permalink
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.

-Andi
Andi Kleen
2006-08-07 14:37:31 UTC
Permalink
Post by Andi Kleen
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.
Following up myself ...

Are you sure it's a regression? 2.6.17 does the same
and we always had that 255 character limit (I tried
to increase it once, but it broke some old lilo setups)

i386 should be the same btw.

-Andi
Andy Whitcroft
2006-08-07 14:42:25 UTC
Permalink
Post by Andi Kleen
Post by Andi Kleen
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.
Following up myself ...
Are you sure it's a regression? 2.6.17 does the same
and we always had that 255 character limit (I tried
to increase it once, but it broke some old lilo setups)
i386 should be the same btw.
It's not being truncated at 255 characters, it's being truncated at the
first space. This is coming out of parse_args, which dumps '\0's into
the command_line as it rips it apart. We now only have one copy of the
command line (in x86_64) instead of two, so we now expose this trashed
copy in /proc/cmdline.

-apw
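
A small userspace sketch of the effect described above. The tokenizer is only a stand-in for parse_args(), not the kernel's implementation, and every demo_ name is made up for illustration. It overwrites each separating space with '\0', so a buffer that is both parsed and later printed appears to end at the first space. Parsing a working copy instead leaves the saved string intact.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_COMMAND_LINE_SIZE 256

/* Stand-in for an in-place parser: splits on spaces by overwriting each
 * space with '\0' and returns the number of tokens found. */
size_t demo_parse_in_place(char *line)
{
	size_t tokens = 0;
	char *p = line;

	while (*p != '\0') {
		while (*p == ' ')
			*p++ = '\0';	/* destroy the separator */
		if (*p == '\0')
			break;
		tokens++;
		while (*p != '\0' && *p != ' ')
			p++;		/* skip over the token itself */
	}
	return tokens;
}

/* Single-buffer behaviour: the saved line itself is parsed and damaged.
 * Returns the length a later reader (think /proc/cmdline) would see. */
size_t demo_visible_len_shared(char *saved)
{
	demo_parse_in_place(saved);
	return strlen(saved);
}

/* Dual-buffer behaviour: parse a working copy, keep the saved line intact. */
size_t demo_visible_len_copied(const char *saved)
{
	char working[DEMO_COMMAND_LINE_SIZE];

	strncpy(working, saved, sizeof(working) - 1);
	working[sizeof(working) - 1] = '\0';
	demo_parse_in_place(working);
	return strlen(saved);
}
```

With "root=/dev/sda1 ro profile=2" as input, the shared-buffer variant leaves only "root=/dev/sda1" visible, which matches the /proc/cmdline output reported earlier in the thread.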
Andi Kleen
2006-08-07 14:46:53 UTC
Permalink
Post by Andy Whitcroft
Post by Andi Kleen
Post by Andi Kleen
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.
Following up myself ...
Are you sure it's a regression? 2.6.17 does the same
and we always had that 255 character limit (I tried
to increase it once, but it broke some old lilo setups)
i386 should be the same btw.
It's not being truncated at 255 characters, it's being truncated at the
first space. This is coming out of parse_args, which dumps '\0's into
the command_line as it rips it apart. We now only have one copy of the
command line (in x86_64) instead of two, so we now expose this trashed
copy in /proc/cmdline.
I don't see this in my version; so it's likely fixed already. I did quite
a lot of changes on this patch already.

Please test

ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/early-param

-Andi
Andy Whitcroft
2006-08-07 15:04:49 UTC
Permalink
Post by Andi Kleen
Post by Andy Whitcroft
Post by Andi Kleen
Post by Andi Kleen
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.
Following up myself ...
Are you sure it's a regression? 2.6.17 does the same
and we always had that 255 character limit (I tried
to increase it once, but it broke some old lilo setups)
i386 should be the same btw.
It's not being truncated at 255 characters, it's being truncated at the
first space. This is coming out of parse_args, which dumps '\0's into
the command_line as it rips it apart. We now only have one copy of the
command line (in x86_64) instead of two, so we now expose this trashed
copy in /proc/cmdline.
I don't see this in my version; so it's likely fixed already. I did quite
a lot of changes on this patch already.
Please test
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/early-param
Easier said than done as the original version is unwilling to revert.
Looking at the replacement patch it has the same fix I have been testing
to restore the original dual buffer semantic. So I think it would fix
the problem we're seeing here. I'll follow up to this email with the
incremental patch I tested with 2.6.18-rc2-mm2.

-apw
Andy Whitcroft
2006-08-07 15:12:16 UTC
Permalink
x86_64 dirty fix to restore dual command line store

OK, it seems that the patch below effectively removes the second
copy of the command line. This means that any modification to the
'working' command line (as returned from setup_arch) is incorrectly
visible in userspace via /proc/cmdline.

x86_64-mm-early-param.patch

This patch restores the second copy. It's probably not the right
way to fix this long term.

Signed-off-by: Andy Whitcroft <***@shadowen.org>
---
diff -upN reference/arch/x86_64/kernel/setup.c current/arch/x86_64/kernel/setup.c
--- reference/arch/x86_64/kernel/setup.c
+++ current/arch/x86_64/kernel/setup.c
@@ -378,7 +378,8 @@ void __init setup_arch(char **cmdline_p)
early_identify_cpu(&boot_cpu_data);

parse_early_param();
- *cmdline_p = saved_command_line;
+ memcpy(command_line, saved_command_line, COMMAND_LINE_SIZE);
+ *cmdline_p = command_line;

finish_e820_parsing();
Keith Mannthey
2006-08-07 21:47:05 UTC
Permalink
Post by Andy Whitcroft
x86_64 dirty fix to restore dual command line store
OK, it seems that the patch below effectively removes the second
copy of the command line. This means that any modification to the
'working' command line (as returned from setup_arch) is incorrectly
visible in userspace via /proc/cmdline.
Sorry for the side question but why is setup_arch adding things back
on the cmdline in the first place? What do you see in /proc/cmdline?

Thanks,
Keith
Keith Mannthey
2006-08-07 21:59:51 UTC
Permalink
Post by Keith Mannthey
Post by Andy Whitcroft
x86_64 dirty fix to restore dual command line store
OK, it seems that the patch below effectively removes the second
copy of the command line. This means that any modification to the
'working' command line (as returned from setup_arch) is incorrectly
visible in userspace via /proc/cmdline.
Sorry for the side question but why is setup_arch adding things back
on the cmdline in the first place? What do you see in /proc/cmdline?
Sorry for the ping. I read some more lkml and the context for this
patch was filled in.

Thanks,
Keith
Andy Whitcroft
2006-08-07 14:38:49 UTC
Permalink
Post by Andi Kleen
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.
-Andi
Well, I've narrowed it down to the following patch from Andrew:

x86_64-mm-early-param.patch

Basically, that leads setup_arch to return saved_command_line as _the_
command_line. We then run parse_args() against it which assumes it may
irrevocably change command_line. Previous to this patch
saved_command_line and command_line were separate and this was not an issue.

It feels like we should be following the model in the newly added
parse_early_param() and taking a local copy of the command_line here.

-apw
Andrew Morton
2006-08-07 15:15:19 UTC
Permalink
On Mon, 07 Aug 2006 15:38:49 +0100
Post by Andy Whitcroft
Post by Andi Kleen
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.
-Andi
x86_64-mm-early-param.patch
Not me. My only contribution to that patch was to scrog the changelog ;)
I'll be fixing that sometime.

I think that patch doesn't have a future, although Andi hasn't yet dropped it.
Post by Andy Whitcroft
Basically, that leads setup_arch to return saved_command_line as _the_
command_line. We then run parse_args() against it which assumes it may
irrevocably change command_line. Previous to this patch
saved_command_line and command_line were separate and this was not an issue.
It feels like we should be following the model in the newly added
parse_early_param() and taking a local copy of the command_line here.
Andi Kleen
2006-08-07 15:58:06 UTC
Permalink
Post by Andrew Morton
On Mon, 07 Aug 2006 15:38:49 +0100
Post by Andy Whitcroft
Post by Andi Kleen
in mm right?
Post by Andy Whitcroft
Will try and track it down.
Don't bother, it is likely "early-param" (the patch from
hell). I'll investigate.
-Andi
x86_64-mm-early-param.patch
Not me. My only contribution to that patch was to scrog the changelog ;)
I'll be fixing that sometime.
I think that patch doesn't have a future, although Andi hasn't yet dropped it.
I fixed all known bugs (the fixes haven't reached your tree yet) and right
now it looks good enough not to be dropped.

Of course more testing will tell.

-Andi
Adrian Bunk
2006-08-07 15:49:38 UTC
Permalink
acpi_force can become static.

Signed-off-by: Adrian Bunk <***@stusta.de>

--- linux-2.6.18-rc3-mm2-full/arch/i386/kernel/acpi/boot.c.old 2006-08-07 15:56:19.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/arch/i386/kernel/acpi/boot.c 2006-08-07 15:56:28.000000000 +0200
@@ -37,7 +37,7 @@
#include <asm/io.h>
#include <asm/mpspec.h>

-int __initdata acpi_force = 0;
+static int __initdata acpi_force = 0;

#ifdef CONFIG_ACPI
int acpi_disabled = 0;

-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Andi Kleen
2006-08-07 16:07:16 UTC
Permalink
Post by Adrian Bunk
acpi_force can become static.
Both patches added, thanks.

-Andi
Adrian Bunk
2006-08-07 15:49:47 UTC
Permalink
This patch makes needlessly global code static.

Signed-off-by: Adrian Bunk <***@stusta.de>

---

BTW:
It doesn't seem to be intended that the new
ipv4/fib_rules.c:fib4_rules_cleanup() is completely unused?

include/net/ip6_fib.h | 4 ----
net/ipv4/cipso_ipv4.c | 2 +-
net/ipv4/fib_rules.c | 4 ++--
net/ipv6/fib6_rules.c | 4 ++--
net/ipv6/ip6_fib.c | 6 +++---
net/ipv6/route.c | 6 +++---
net/netlabel/netlabel_domainhash.c | 4 ++--
7 files changed, 13 insertions(+), 17 deletions(-)

--- linux-2.6.18-rc3-mm2-full/net/ipv4/cipso_ipv4.c.old 2006-08-07 16:39:05.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/net/ipv4/cipso_ipv4.c 2006-08-07 16:39:15.000000000 +0200
@@ -60,7 +60,7 @@
* if in practice there are a lot of different DOIs this list should
* probably be turned into a hash table or something similar so we
* can do quick lookups. */
-DEFINE_SPINLOCK(cipso_v4_doi_list_lock);
+static DEFINE_SPINLOCK(cipso_v4_doi_list_lock);
static struct list_head cipso_v4_doi_list = LIST_HEAD_INIT(cipso_v4_doi_list);

/* Label mapping cache */
--- linux-2.6.18-rc3-mm2-full/net/ipv4/fib_rules.c.old 2006-08-07 16:39:33.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/net/ipv4/fib_rules.c 2006-08-07 16:39:51.000000000 +0200
@@ -101,8 +101,8 @@
return err;
}

-int fib4_rule_action(struct fib_rule *rule, struct flowi *flp, int flags,
- struct fib_lookup_arg *arg)
+static int fib4_rule_action(struct fib_rule *rule, struct flowi *flp,
+ int flags, struct fib_lookup_arg *arg)
{
int err = -EAGAIN;
struct fib_table *tbl;
--- linux-2.6.18-rc3-mm2-full/net/ipv6/fib6_rules.c.old 2006-08-07 16:41:07.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/net/ipv6/fib6_rules.c 2006-08-07 16:41:16.000000000 +0200
@@ -66,8 +66,8 @@
return (struct dst_entry *) arg.result;
}

-int fib6_rule_action(struct fib_rule *rule, struct flowi *flp,
- int flags, struct fib_lookup_arg *arg)
+static int fib6_rule_action(struct fib_rule *rule, struct flowi *flp,
+ int flags, struct fib_lookup_arg *arg)
{
struct rt6_info *rt = NULL;
struct fib6_table *table;
--- linux-2.6.18-rc3-mm2-full/include/net/ip6_fib.h.old 2006-08-07 16:41:36.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/include/net/ip6_fib.h 2006-08-07 16:41:43.000000000 +0200
@@ -192,10 +192,6 @@
struct in6_addr *daddr, int dst_len,
struct in6_addr *saddr, int src_len);

-extern void fib6_clean_tree(struct fib6_node *root,
- int (*func)(struct rt6_info *, void *arg),
- int prune, void *arg);
-
extern void fib6_clean_all(int (*func)(struct rt6_info *, void *arg),
int prune, void *arg);

--- linux-2.6.18-rc3-mm2-full/net/ipv6/ip6_fib.c.old 2006-08-07 16:41:51.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/net/ipv6/ip6_fib.c 2006-08-07 16:42:05.000000000 +0200
@@ -1169,9 +1169,9 @@
* ignoring pure split nodes) will be scanned.
*/

-void fib6_clean_tree(struct fib6_node *root,
- int (*func)(struct rt6_info *, void *arg),
- int prune, void *arg)
+static void fib6_clean_tree(struct fib6_node *root,
+ int (*func)(struct rt6_info *, void *arg),
+ int prune, void *arg)
{
struct fib6_cleaner_t c;

--- linux-2.6.18-rc3-mm2-full/net/ipv6/route.c.old 2006-08-07 16:42:24.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/net/ipv6/route.c 2006-08-07 16:43:05.000000000 +0200
@@ -613,8 +613,8 @@
return rt;
}

-struct rt6_info *ip6_pol_route_input(struct fib6_table *table, struct flowi *fl,
- int flags)
+static struct rt6_info *ip6_pol_route_input(struct fib6_table *table,
+ struct flowi *fl, int flags)
{
struct fib6_node *fn;
struct rt6_info *rt, *nrt;
@@ -872,7 +872,7 @@
}

static struct dst_entry *ndisc_dst_gc_list;
-DEFINE_SPINLOCK(ndisc_lock);
+static DEFINE_SPINLOCK(ndisc_lock);

struct dst_entry *ndisc_dst_alloc(struct net_device *dev,
struct neighbour *neigh,
--- linux-2.6.18-rc3-mm2-full/net/netlabel/netlabel_domainhash.c.old 2006-08-07 16:43:27.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/net/netlabel/netlabel_domainhash.c 2006-08-07 16:43:53.000000000 +0200
@@ -50,11 +50,11 @@
/* Domain hash table */
/* XXX - updates should be so rare that having one spinlock for the entire
* hash table should be okay */
-DEFINE_SPINLOCK(netlbl_domhsh_lock);
+static DEFINE_SPINLOCK(netlbl_domhsh_lock);
static struct netlbl_domhsh_tbl *netlbl_domhsh = NULL;

/* Default domain mapping */
-DEFINE_SPINLOCK(netlbl_domhsh_def_lock);
+static DEFINE_SPINLOCK(netlbl_domhsh_def_lock);
static struct netlbl_dom_map *netlbl_domhsh_def = NULL;

/*

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller
2006-08-08 04:51:52 UTC
Permalink
From: Adrian Bunk <***@stusta.de>
Date: Mon, 7 Aug 2006 17:49:47 +0200
Post by Adrian Bunk
This patch makes needlessly global code static.
Looks reasonable, applied.
Post by Adrian Bunk
It doesn't seem to be intended that the new
ipv4/fib_rules.c:fib4_rules_cleanup() is completely unused?
I'll kill it off.

IPv4 can't be built as a module and therefore there is no
relevant exit or module load error path for ipv4 for which
this function should be called.

Thanks.
Adrian Bunk
2006-08-07 15:49:42 UTC
Permalink
enable_local_apic can now become static.

Signed-off-by: Adrian Bunk <***@stusta.de>

---

arch/i386/kernel/apic.c | 13 ++++++++++++-
include/asm-i386/apic.h | 12 ------------
2 files changed, 12 insertions(+), 13 deletions(-)

--- linux-2.6.18-rc3-mm2-full/include/asm-i386/apic.h.old 2006-08-07 16:10:45.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/include/asm-i386/apic.h 2006-08-07 16:12:37.000000000 +0200
@@ -16,20 +16,8 @@
#define APIC_VERBOSE 1
#define APIC_DEBUG 2

-extern int enable_local_apic;
extern int apic_verbosity;

-static inline void lapic_disable(void)
-{
- enable_local_apic = -1;
- clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
-}
-
-static inline void lapic_enable(void)
-{
- enable_local_apic = 1;
-}
-
/*
* Define the default level of output to be very little
* This can be turned up by using apic=verbose for more
--- linux-2.6.18-rc3-mm2-full/arch/i386/kernel/apic.c.old 2006-08-07 16:11:08.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/arch/i386/kernel/apic.c 2006-08-07 16:12:57.000000000 +0200
@@ -52,7 +52,18 @@
/*
* Knob to control our willingness to enable the local APIC.
*/
-int enable_local_apic __initdata = 0; /* -1=force-disable, +1=force-enable */
+static int enable_local_apic __initdata = 0; /* -1=force-disable, +1=force-enable */
+
+static inline void lapic_disable(void)
+{
+ enable_local_apic = -1;
+ clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
+}
+
+static inline void lapic_enable(void)
+{
+ enable_local_apic = 1;
+}

/*
* Debug level
Adrian Bunk
2006-08-07 15:50:05 UTC
Permalink
This patch contains the following cleanups:
- make needlessly global code static
- use C99 struct initializers

Signed-off-by: Adrian Bunk <***@stusta.de>

---

The {cia,geode_aes}_{setkey,encrypt,decrypt} prototype confusion that both
sparse and gcc are giving warnings about should also be fixed.

drivers/crypto/geode-aes.c | 12 ++++++------
drivers/crypto/geode-aes.h | 2 --
2 files changed, 6 insertions(+), 8 deletions(-)

--- linux-2.6.18-rc3-mm2-full/drivers/crypto/geode-aes.h.old 2006-08-07 16:23:25.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/drivers/crypto/geode-aes.h 2006-08-07 16:23:51.000000000 +0200
@@ -37,6 +37,4 @@
u8 iv[AES_IV_LENGTH];
};

-unsigned int geode_aes_crypt(struct geode_aes_op *);
-
#endif
--- linux-2.6.18-rc3-mm2-full/drivers/crypto/geode-aes.c.old 2006-08-07 16:24:03.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/drivers/crypto/geode-aes.c 2006-08-07 16:50:41.000000000 +0200
@@ -114,7 +114,7 @@
AWRITE((status & 0xFF) | AES_INTRA_PENDING, AES_INTR_REG);
}

-unsigned int
+static unsigned int
geode_aes_crypt(struct geode_aes_op *op)
{
u32 flags = 0;
@@ -361,7 +361,7 @@
return ret;
}

-struct pci_device_id geode_aes_tbl[] = {
+static struct pci_device_id geode_aes_tbl[] = {
{ PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_LX_AES, PCI_ANY_ID, PCI_ANY_ID} ,
{ 0, }
};
@@ -369,10 +369,10 @@
MODULE_DEVICE_TABLE(pci, geode_aes_tbl);

static struct pci_driver geode_aes_driver = {
- name: "Geode LX AES",
- id_table: geode_aes_tbl,
- probe: geode_aes_probe,
- remove: __devexit_p(geode_aes_remove)
+ .name = "Geode LX AES",
+ .id_table = geode_aes_tbl,
+ .probe = geode_aes_probe,
+ .remove = __devexit_p(geode_aes_remove)
};

static int __devinit
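
For readers unfamiliar with the second cleanup: the "name:" form removed above is an old GCC extension, while ".name =" is the standard C99 designated initializer. A minimal sketch with a made-up struct (demo_driver is hypothetical and is not struct pci_driver), showing that designated initializers do not depend on member order and that members left out are zero-initialized:

```c
#include <stddef.h>

/* Hypothetical driver descriptor, for illustration only. */
struct demo_driver {
	const char *name;
	int (*probe)(void);
	void (*remove)(void);
	int flags;
};

int demo_probe(void)
{
	return 0;
}

/* C99 designated initializers: the order is free, and omitted members
 * (.remove and .flags here) are implicitly NULL / zero. */
struct demo_driver demo_drv = {
	.probe = demo_probe,
	.name  = "demo",
};
```

Because each member is named explicitly, reordering or adding members in the struct definition does not silently shift the initializer values.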
Mattia Dongili
2006-08-07 19:38:36 UTC
Permalink
Hello,

after resuming from RAM (tested in single-user mode), I can type commands for
a few seconds (the time varies), then the processes get stuck in io_schedule.
Poor man's screenshots are here:
http://oioio.altervista.org/linux/dsc03448.jpg
http://oioio.altervista.org/linux/dsc03449.jpg

.config:
http://oioio.altervista.org/linux/config-2.6.18-rc3-mm2-1

Anything useful I could add?
--
mattia
:wq!
Andrew Morton
2006-08-07 20:02:08 UTC
Permalink
On Mon, 7 Aug 2006 21:38:36 +0200
Post by Mattia Dongili
after resume from ram (tested in single user), I can type commands for a
few seconds (time is variable), the processes get stuck in io_schedule.
http://oioio.altervista.org/linux/dsc03448.jpg
http://oioio.altervista.org/linux/dsc03449.jpg
That probably means that the device or device driver has got itself into a
sick state and IO completions aren't occurring.

Which storage device (and which device driver) is being used here?
Mattia Dongili
2006-08-07 20:57:08 UTC
Permalink
Post by Andrew Morton
On Mon, 7 Aug 2006 21:38:36 +0200
Post by Mattia Dongili
after resume from ram (tested in single user), I can type commands for a
few seconds (time is variable), the processes get stuck in io_schedule.
http://oioio.altervista.org/linux/dsc03448.jpg
http://oioio.altervista.org/linux/dsc03449.jpg
That probably means that the device or device driver has got itself into a
sick state and IO completions aren't occurring.
BTW: I tried to reverse ide-reprogram-disk-pio-timings-on-resume.patch
with no luck.
Post by Andrew Morton
Which storage device (and which device driver) is being used here?
A dmesg is available here (apart from the already resolved BUGs the boot
process is meaningful):
http://oioio.altervista.org/linux/dmesg-2.6.18-rc3-mm2-1
[ 3.168000] ICH3M: chipset revision 1
[ 3.168000] ICH3M: not 100% native mode: will probe irqs later
[ 3.168000] ide0: BM-DMA at 0x1860-0x1867, BIOS settings: hda:DMA, hdb:pio
[ 3.168000] ide1: BM-DMA at 0x1868-0x186f, BIOS settings: hdc:pio, hdd:pio
[ 3.168000] Probing IDE interface ide0...
[ 3.460000] hda: FUJITSU MHV2080AH, ATA DISK drive
[ 4.132000] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[ 4.136000] Probing IDE interface ide1...
[ 4.704000] Probing IDE interface ide1...
[ 5.272000] hda: max request size: 128KiB
[ 5.344000] hda: 156301488 sectors (80026 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(100)
[ 5.348000] hda: cache flushes supported
[ 5.352000] hda: hda1 hda2 hda3 hda4 < hda5 hda6 >

lspci reports:
00:1f.1 IDE interface: Intel Corporation 82801CAM IDE U100 (rev 01) (prog-if 8a [Master SecP PriP])
Subsystem: Sony Corporation VAIO PCG-GR214EP/GR214MP/GR215MP/GR314MP/GR315MP
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin A routed to IRQ 255
Region 0: I/O ports at <ignored>
Region 1: I/O ports at <ignored>
Region 2: I/O ports at <ignored>
Region 3: I/O ports at <ignored>
Region 4: I/O ports at 1860 [size=16]
Region 5: Memory at d0000000 (32-bit, non-prefetchable) [size=1K]
--
mattia
:wq!
Mattia Dongili
2006-08-07 22:09:27 UTC
Permalink
Post by Mattia Dongili
Post by Andrew Morton
On Mon, 7 Aug 2006 21:38:36 +0200
Post by Mattia Dongili
after resume from ram (tested in single user), I can type commands for a
few seconds (time is variable), the processes get stuck in io_schedule.
http://oioio.altervista.org/linux/dsc03448.jpg
http://oioio.altervista.org/linux/dsc03449.jpg
That probably means that the device or device driver has got itself into a
sick state and IO completions aren't occurring.
BTW: I tried to reverse ide-reprogram-disk-pio-timings-on-resume.patch
with no luck.
reverting git-block.patch (plus a couple more to make the thing build)
let me resume correctly (2 cycles already).

Suggestion taken from the "swsusp regression" sub-thread.
--
mattia
:wq!
Adrian Bunk
2006-08-07 21:04:15 UTC
Permalink
This patch removes three no longer used functions (that are even
generating gcc warnings).

This patch doesn't look right, but it is the result of
58e5528ee464d38040b9489e10033c9387a10d56 in git-netdev...

Signed-off-by: Adrian Bunk <***@stusta.de>

---

drivers/net/wireless/bcm43xx/bcm43xx_main.c | 33 --------------------
1 file changed, 33 deletions(-)

--- linux-2.6.18-rc3-mm2-full/drivers/net/wireless/bcm43xx/bcm43xx_main.c.old 2006-08-07 18:21:31.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/drivers/net/wireless/bcm43xx/bcm43xx_main.c 2006-08-07 18:23:36.000000000 +0200
@@ -3194,39 +3194,6 @@
bcm43xx_clear_keys(bcm);
}

-static int bcm43xx_rng_read(struct hwrng *rng, u32 *data)
-{
- struct bcm43xx_private *bcm = (struct bcm43xx_private *)rng->priv;
- unsigned long flags;
-
- spin_lock_irqsave(&(bcm)->irq_lock, flags);
- *data = bcm43xx_read16(bcm, BCM43xx_MMIO_RNG);
- spin_unlock_irqrestore(&(bcm)->irq_lock, flags);
-
- return (sizeof(u16));
-}
-
-static void bcm43xx_rng_exit(struct bcm43xx_private *bcm)
-{
- hwrng_unregister(&bcm->rng);
-}
-
-static int bcm43xx_rng_init(struct bcm43xx_private *bcm)
-{
- int err;
-
- snprintf(bcm->rng_name, ARRAY_SIZE(bcm->rng_name),
- "%s_%s", KBUILD_MODNAME, bcm->net_dev->name);
- bcm->rng.name = bcm->rng_name;
- bcm->rng.data_read = bcm43xx_rng_read;
- bcm->rng.priv = (unsigned long)bcm;
- err = hwrng_register(&bcm->rng);
- if (err)
- printk(KERN_ERR PFX "RNG init failed (%d)\n", err);
-
- return err;
-}
-
static int bcm43xx_shutdown_all_wireless_cores(struct bcm43xx_private *bcm)
{
int ret = 0;

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Michael Buesch
2006-08-08 18:32:37 UTC
Permalink
Post by Adrian Bunk
This patch removes three no longer used functions (that are even
generating gcc warnings).
This patch doesn't look right, but it is the result of
58e5528ee464d38040b9489e10033c9387a10d56 in git-netdev...
Hm, can't find that commit in a tree.
I looked at linus', netdev-2.6.

But one thing is for sure. This patch is _wrong_. ;)
NACK.
Post by Adrian Bunk
drivers/net/wireless/bcm43xx/bcm43xx_main.c | 33 --------------------
1 file changed, 33 deletions(-)
--- linux-2.6.18-rc3-mm2-full/drivers/net/wireless/bcm43xx/bcm43xx_main.c.old 2006-08-07 18:21:31.000000000 +0200
+++ linux-2.6.18-rc3-mm2-full/drivers/net/wireless/bcm43xx/bcm43xx_main.c 2006-08-07 18:23:36.000000000 +0200
@@ -3194,39 +3194,6 @@
bcm43xx_clear_keys(bcm);
}
-static int bcm43xx_rng_read(struct hwrng *rng, u32 *data)
-{
- struct bcm43xx_private *bcm = (struct bcm43xx_private *)rng->priv;
- unsigned long flags;
-
- spin_lock_irqsave(&(bcm)->irq_lock, flags);
- *data = bcm43xx_read16(bcm, BCM43xx_MMIO_RNG);
- spin_unlock_irqrestore(&(bcm)->irq_lock, flags);
-
- return (sizeof(u16));
-}
-
-static void bcm43xx_rng_exit(struct bcm43xx_private *bcm)
-{
- hwrng_unregister(&bcm->rng);
-}
-
-static int bcm43xx_rng_init(struct bcm43xx_private *bcm)
-{
- int err;
-
- snprintf(bcm->rng_name, ARRAY_SIZE(bcm->rng_name),
- "%s_%s", KBUILD_MODNAME, bcm->net_dev->name);
- bcm->rng.name = bcm->rng_name;
- bcm->rng.data_read = bcm43xx_rng_read;
- bcm->rng.priv = (unsigned long)bcm;
- err = hwrng_register(&bcm->rng);
- if (err)
- printk(KERN_ERR PFX "RNG init failed (%d)\n", err);
-
- return err;
-}
-
static int bcm43xx_shutdown_all_wireless_cores(struct bcm43xx_private *bcm)
{
--
Greetings Michael.
Adrian Bunk
2006-08-08 19:42:31 UTC
Permalink
Post by Michael Buesch
Post by Adrian Bunk
This patch removes three no longer used functions (that are even
generating gcc warnings).
This patch doesn't look right, but it is the result of
58e5528ee464d38040b9489e10033c9387a10d56 in git-netdev...
Hm, can't find that commit in a tree.
I looked at linus', netdev-2.6.
It's in netdev-2.6.git#ALL that gets included in -mm.
Post by Michael Buesch
But one thing is for sure. This patch is _wrong_. ;)
...
And it seems to be your fault. ;-)


commit 58e5528ee464d38040b9489e10033c9387a10d56
Author: Michael Buesch <***@bu3sch.de>
Date: Sat Jul 8 22:02:18 2006 +0200

[PATCH] bcm43xx: init routine rewrite

Rewrite of the bcm43xx initialization routines.
This fixes several issues:
* up-down-up-down-up... stale data issue
(May fix some DHCP issues)
* Fix the init vs IRQ handler race (and remove the workaround)
* Fix init for cards with multiple cores (APHY)
As softmac has no internal PHY handling (unlike dscape),
this adds the file "phymode" to sysfs.
The active PHY can be selected by writing either a, b or g
to this file. Current PHY can be determined by reading from it.
* Fix the controller restart code.
Controller restart can now also be triggered through
echo 1 > /debug/bcm43xx/ethX/restart
Post by Michael Buesch
Greetings Michael.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Michael Buesch
2006-08-09 04:47:28 UTC
Permalink
Post by Adrian Bunk
And it seems to be your fault. ;-)
Uh, oh. I'm trapped.
Post by Adrian Bunk
commit 58e5528ee464d38040b9489e10033c9387a10d56
Date: Sat Jul 8 22:02:18 2006 +0200
[PATCH] bcm43xx: init routine rewrite
Ah, I guessed it.
This was caused by some merge-race ;)
Will send a fix for this, soon.
--
Greetings Michael.
Jeff Garzik
2006-08-08 22:14:01 UTC
Permalink
Post by Michael Buesch
Post by Adrian Bunk
This patch removes three no longer used functions (that are even
generating gcc warnings).
This patch doesn't look right, but it is the result of
58e5528ee464d38040b9489e10033c9387a10d56 in git-netdev...
Hm, can't find that commit in a tree.
I looked at linus', netdev-2.6.
It's clearly in netdev-2.6.git#upstream:

commit 58e5528ee464d38040b9489e10033c9387a10d56
Author: Michael Buesch <***@bu3sch.de>
Date: Sat Jul 8 22:02:18 2006 +0200

[PATCH] bcm43xx: init routine rewrite

Rewrite of the bcm43xx initialization routines.
This fixes several issues:
* up-down-up-down-up... stale data issue
(May fix some DHCP issues)
* Fix the init vs IRQ handler race (and remove the workaround)
* Fix init for cards with multiple cores (APHY)
As softmac has no internal PHY handling (unlike dscape),
this adds the file "phymode" to sysfs.
The active PHY can be selected by writing either a, b or g
to this file. Current PHY can be determined by reading from it.
* Fix the controller restart code.
Controller restart can now also be triggered through
echo 1 > /debug/bcm43xx/ethX/restart

Signed-off-by: Michael Buesch <***@bu3sch.de>
Signed-off-by: John W. Linville <***@tuxdriver.com>

Rafael J. Wysocki
2006-08-08 14:39:38 UTC
Permalink
Hi,

I get something like the appended on every attempt to unmount the reiserfs
filesystem mounted on /tmp. The other reiserfs filesystems don't have such
problems, and this one didn't have them either with 2.6.18-rc2-mm1.


BUG: Dentry ffff810037c573e8{i=3,n=.reiserfs_priv} still in use (1) [unmount of reiserfs hdc7]
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/dcache.c:611
invalid opcode: 0000 [1] PREEMPT
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 0
Modules linked in: ide_cd cdrom xt_pkttype ipt_LOG xt_limit usbserial asus_acpi thermal processor fan button battery ac snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device af_packet bcm43xx ieee80211softmac ieee80211 ieee80211_crypt pcmcia firmware_class ohci1394 ieee1394 skge yenta_socket rsrc_nonstatic pcmcia_core usbhid ff_memless ip6t_REJECT xt_tcpudp ipt_REJECT xt_state snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd iptable_mangle soundcore iptable_nat ip_nat iptable_filter snd_page_alloc ip6table_mangle ehci_hcd ip_conntrack i2c_nforce2 i2c_core ip_tables ohci_hcd ip6table_filter ip6_tables x_tables ipv6 parport_pc lp parport dm_mod
Pid: 9478, comm: umount Not tainted 2.6.18-rc3-mm2 #7
RIP: 0010:[<ffffffff802a6eb7>] [<ffffffff802a6eb7>] shrink_dcache_for_umount_subtree+0x1d7/0x2b0
RSP: 0018:ffff810059291da8 EFLAGS: 00010296
RAX: 0000000000000062 RBX: ffff810037c573e8 RCX: 0000000000000003
RDX: 0000000000000008 RSI: ffff810037c627d8 RDI: 0000000000000001
RBP: ffff810059291dc8 R08: 0000000000000002 R09: ffffffff8022de59
R10: 0000000000000000 R11: 0000000000000001 R12: ffff810037c573e8
R13: ffff81005ddc4800 R14: ffff81005f539250 R15: ffff81005f0a8688
FS: 00002afc38e00b00(0000) GS:ffffffff808c2000(0000) knlGS:00000000558b4d00
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ac8a49a1d40 CR3: 000000005f546000 CR4: 00000000000006e0
Process umount (pid: 9478, threadinfo ffff810059290000, task ffff810037c62080)
Stack: ffff81005f0a8b10 ffff81005f0a8688 ffffffff80577a20 ffff810059291ea8
ffff810059291de8 ffffffff802a6fc4 ffff81005f0a8688 ffffffff80577a20
ffff810059291e18 ffffffff80293bb4 ffff81005f539250 ffff81005e09d140
Call Trace:
[<ffffffff802a6fc4>] shrink_dcache_for_umount+0x34/0x70
[<ffffffff80293bb4>] generic_shutdown_super+0x24/0x110
[<ffffffff80293cd0>] kill_block_super+0x30/0x50
[<ffffffff80293f81>] deactivate_super+0x81/0xa0
[<ffffffff802ac008>] mntput_no_expire+0x58/0xa0
[<ffffffff8029b83d>] path_release_on_umount+0x1d/0x30
[<ffffffff802ad3f4>] sys_umount+0x274/0x290
[<ffffffff80209d0e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:


Code: 0f 0b 68 41 33 4a 80 c2 63 02 49 8b 5c 24 68 49 39 dc 75 05
RIP [<ffffffff802a6eb7>] shrink_dcache_for_umount_subtree+0x1d7/0x2b0
RSP <ffff810059291da8>
Andrew Morton
2006-08-08 15:12:00 UTC
Permalink
On Tue, 8 Aug 2006 16:39:38 +0200
Post by Rafael J. Wysocki
Hi,
I get something like the appended on every attempt to unmount the reiserfs
filesystem mounted on /tmp. The other reiserfs filesystems don't have such
problems, and this one didn't have them either with 2.6.18-rc2-mm1.
BUG: Dentry ffff810037c573e8{i=3,n=.reiserfs_priv} still in use (1) [unmount of reiserfs hdc7]
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/dcache.c:611
invalid opcode: 0000 [1] PREEMPT
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 0
Modules linked in: ide_cd cdrom xt_pkttype ipt_LOG xt_limit usbserial asus_acpi thermal processor fan button battery ac snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device af_packet bcm43xx ieee80211softmac ieee80211 ieee80211_crypt pcmcia firmware_class ohci1394 ieee1394 skge yenta_socket rsrc_nonstatic pcmcia_core usbhid ff_memless ip6t_REJECT xt_tcpudp ipt_REJECT xt_state snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd iptable_mangle soundcore iptable_nat ip_nat iptable_filter snd_page_alloc ip6table_mangle ehci_hcd ip_conntrack i2c_nforce2 i2c_core ip_tables ohci_hcd ip6table_filter ip6_tables x_tables ipv6 parport_pc lp parport dm_mod
Pid: 9478, comm: umount Not tainted 2.6.18-rc3-mm2 #7
RIP: 0010:[<ffffffff802a6eb7>] [<ffffffff802a6eb7>] shrink_dcache_for_umount_subtree+0x1d7/0x2b0
Thanks, Rafael.
vfs-destroy-the-dentries-contributed-by-a-superblock-on-unmounting.patch
added that BUG_ON().
Post by Rafael J. Wysocki
RSP: 0018:ffff810059291da8 EFLAGS: 00010296
RAX: 0000000000000062 RBX: ffff810037c573e8 RCX: 0000000000000003
RDX: 0000000000000008 RSI: ffff810037c627d8 RDI: 0000000000000001
RBP: ffff810059291dc8 R08: 0000000000000002 R09: ffffffff8022de59
R10: 0000000000000000 R11: 0000000000000001 R12: ffff810037c573e8
R13: ffff81005ddc4800 R14: ffff81005f539250 R15: ffff81005f0a8688
FS: 00002afc38e00b00(0000) GS:ffffffff808c2000(0000) knlGS:00000000558b4d00
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ac8a49a1d40 CR3: 000000005f546000 CR4: 00000000000006e0
Process umount (pid: 9478, threadinfo ffff810059290000, task ffff810037c62080)
Stack: ffff81005f0a8b10 ffff81005f0a8688 ffffffff80577a20 ffff810059291ea8
ffff810059291de8 ffffffff802a6fc4 ffff81005f0a8688 ffffffff80577a20
ffff810059291e18 ffffffff80293bb4 ffff81005f539250 ffff81005e09d140
[<ffffffff802a6fc4>] shrink_dcache_for_umount+0x34/0x70
[<ffffffff80293bb4>] generic_shutdown_super+0x24/0x110
[<ffffffff80293cd0>] kill_block_super+0x30/0x50
[<ffffffff80293f81>] deactivate_super+0x81/0xa0
[<ffffffff802ac008>] mntput_no_expire+0x58/0xa0
[<ffffffff8029b83d>] path_release_on_umount+0x1d/0x30
[<ffffffff802ad3f4>] sys_umount+0x274/0x290
[<ffffffff80209d0e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Code: 0f 0b 68 41 33 4a 80 c2 63 02 49 8b 5c 24 68 49 39 dc 75 05
RIP [<ffffffff802a6eb7>] shrink_dcache_for_umount_subtree+0x1d7/0x2b0
RSP <ffff810059291da8>
David Howells
2006-08-08 17:23:56 UTC
Permalink
Make sure all dentry refs are released before calling kill_block_super(), so
that generic_shutdown_super() can rely on its assumption that no external
references remain and that it can completely destroy the dentry tree.

What was being done in the put_super() superblock op, is now done in the
kill_sb() filesystem op instead, prior to calling kill_block_super().

This prevents the BUG_ON() in the reduced-locking dcache destroyer patch from
barking at reiserfs.

I've tested this patch by creating a ReiserFS partition, mounting and
unmounting it a few times, and doing things to its contents whilst it is
mounted.

Signed-Off-By: David Howells <***@redhat.com>
---

fs/reiserfs/super.c | 19 +++++++++++++------
1 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 5567328..69eefe2 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -430,22 +430,29 @@ int remove_save_link(struct inode *inode
return journal_end(&th, inode->i_sb, JOURNAL_PER_BALANCE_CNT);
}

-static void reiserfs_put_super(struct super_block *s)
+static void reiserfs_kill_sb(struct super_block *s)
{
- int i;
- struct reiserfs_transaction_handle th;
- th.t_trans_id = 0;
-
if (REISERFS_SB(s)->xattr_root) {
d_invalidate(REISERFS_SB(s)->xattr_root);
dput(REISERFS_SB(s)->xattr_root);
+ REISERFS_SB(s)->xattr_root = NULL;
}

if (REISERFS_SB(s)->priv_root) {
d_invalidate(REISERFS_SB(s)->priv_root);
dput(REISERFS_SB(s)->priv_root);
+ REISERFS_SB(s)->priv_root = NULL;
}

+ kill_block_super(s);
+}
+
+static void reiserfs_put_super(struct super_block *s)
+{
+ int i;
+ struct reiserfs_transaction_handle th;
+ th.t_trans_id = 0;
+
/* change file system state to current state if it was mounted with read-write permissions */
if (!(s->s_flags & MS_RDONLY)) {
if (!journal_begin(&th, s, 10)) {
@@ -2300,7 +2307,7 @@ struct file_system_type reiserfs_fs_type
.owner = THIS_MODULE,
.name = "reiserfs",
.get_sb = get_super_block,
- .kill_sb = kill_block_super,
+ .kill_sb = reiserfs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
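
The ordering argument in the description above can be modelled in user space.
This is only a sketch under stated assumptions: every toy_* name is
hypothetical, and the "dcache teardown first, put_super afterwards" order
inside generic_shutdown_super() is inferred from the call trace earlier in the
thread, not copied from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a dentry: only the reference count matters here. */
struct toy_dentry {
    int count;
};

/* Hypothetical model of a superblock whose filesystem caches a reference
 * to one of its own dentries (as reiserfs does with .reiserfs_priv). */
struct toy_sb {
    struct toy_dentry priv;        /* the dentry itself */
    struct toy_dentry *priv_root;  /* extra reference held by the fs */
};

static void toy_mount(struct toy_sb *s)
{
    s->priv.count = 1;             /* reference taken at mount time */
    s->priv_root = &s->priv;
}

/* Models the d_invalidate() + dput() + "= NULL" sequence from the patch. */
static void toy_release_roots(struct toy_sb *s)
{
    if (s->priv_root) {
        s->priv_root->count--;
        s->priv_root = NULL;
    }
}

/* Models generic_shutdown_super(): the dentry tree is torn down first and
 * put_super runs afterwards.  Returns false where the kernel would BUG. */
static bool toy_generic_shutdown(struct toy_sb *s, bool release_in_put_super)
{
    if (s->priv.count != 0)
        return false;              /* "Dentry ... still in use" */
    if (release_in_put_super)
        toy_release_roots(s);      /* too late to help the check above */
    return true;
}

/* Old scheme: the reference is only dropped from put_super. */
static bool toy_umount_old(struct toy_sb *s)
{
    return toy_generic_shutdown(s, true);
}

/* New scheme: kill_sb drops the reference before the generic shutdown. */
static bool toy_umount_new(struct toy_sb *s)
{
    toy_release_roots(s);
    return toy_generic_shutdown(s, false);
}
```

With a freshly mounted toy_sb, toy_umount_old() reports failure because the
cached reference is still held when the tree is checked, while
toy_umount_new() succeeds.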
Rafael J. Wysocki
2006-08-08 23:16:38 UTC
Permalink
Post by David Howells
Make sure all dentry refs are released before calling kill_block_super(), so
that generic_shutdown_super() can rely on its assumption that no external
references remain and that it can completely destroy the dentry tree.
What was being done in the put_super() superblock op, is now done in the
kill_sb() filesystem op instead, prior to calling kill_block_super().
This prevents the BUG_ON() in the reduced-locking dcache destroyer patch from
barking at reiserfs.
I've tested this patch by creating a ReiserFS partition, mounting and
unmounting it a few times, and doing things to its contents whilst it is
mounted.
It didn't apply cleanly to -rc3-mm2 for me and produces the appended oops
every time at the kernel startup (on x86_64).

Greetings,
Rafael


input: SynPS/2 Synaptics TouchPad as /class/input/input2
RAMDISK: ext2 filesystem found at block 0
RAMDISK: Loading 2000KiB [1 disk] into ram disk... done.
Unable to handle kernel NULL pointer dereference at 0000000000000510 RIP:
[<ffffffff802edc73>] reiserfs_kill_sb+0x13/0xa0
PGD 0
Oops: 0000 [1] PREEMPT
last sysfs file: /block/hdc/range
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-rc3-mm2 #10
RIP: 0010:[<ffffffff802edc73>] [<ffffffff802edc73>] reiserfs_kill_sb+0x13/0xa0
RSP: 0000:ffff81005ff27a98 EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff810037c3ad20 RCX: 0000000000000003
RDX: 0000000000000008 RSI: ffff81005ff08798 RDI: ffff810037c3ad20
RBP: ffff81005ff27aa8 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff805778c0
R13: 0000000000000001 R14: ffff810037c23080 R15: ffff810037df8168
FS: 0000000000000000(0000) GS:ffffffff808c2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000510 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff81005ff26000, task ffff81005ff08040)
Stack: ffff810037c3ad20 ffff810037c3ad20 ffff81005ff27ac8 ffffffff80294671
00000000ffffffea ffff810037c3ad20 ffff81005ff27b38 ffffffff8029539f
ffffffff802ee260 0000000000000000 ffffff00306d6172 ffff810037df8168
Call Trace:
[<ffffffff80294671>] deactivate_super+0x81/0xa0
[<ffffffff8029539f>] get_sb_bdev+0x12f/0x180
[<ffffffff802ec653>] get_super_block+0x13/0x20
[<ffffffff80294746>] vfs_kern_mount+0xb6/0x160
[<ffffffff8029485a>] do_kern_mount+0x4a/0x70
[<ffffffff802ae370>] do_mount+0x720/0x790
[<ffffffff802ae474>] sys_mount+0x94/0xe0
[<ffffffff808d5b75>] mount_block_root+0xf5/0x2a0
[<ffffffff808d84d2>] initrd_load+0xc2/0x330
[<ffffffff808d5e43>] prepare_namespace+0xc3/0x140
[<ffffffff8020723c>] init+0x1dc/0x2c0
[<ffffffff8020a706>] child_rip+0x8/0x12
DWARF2 unwinder stuck at child_rip+0x8/0x12
Leftover inexact backtrace:
[<ffffffff80471edb>] _spin_unlock_irq+0x2b/0x60
[<ffffffff8020a2c0>] restore_args+0x0/0x30
[<ffffffff80207060>] init+0x0/0x2c0
[<ffffffff8020a6fe>] child_rip+0x0/0x12


Code: 48 8b b8 10 05 00 00 48 85 ff 74 31 e8 9c a3 fb ff 48 8b 83
RIP [<ffffffff802edc73>] reiserfs_kill_sb+0x13/0xa0
RSP <ffff81005ff27a98>
CR2: 0000000000000510
<0>Kernel panic - not syncing: Attempted to kill init!
David Howells
2006-08-09 10:14:27 UTC
Permalink
Post by Rafael J. Wysocki
It didn't apply cleanly to -rc3-mm2 for me and produces the appended oops
every time at the kernel startup (on x86_64).
Can you send me your modified patch?

David
Rafael J. Wysocki
2006-08-09 10:23:04 UTC
Permalink
Post by David Howells
Post by Rafael J. Wysocki
It didn't apply cleanly to -rc3-mm2 for me and produces the appended oops
every time at the kernel startup (on x86_64).
Can you send me your modified patch?
Index: linux-2.6.18-rc3-mm2/fs/reiserfs/super.c
===================================================================
--- linux-2.6.18-rc3-mm2.orig/fs/reiserfs/super.c
+++ linux-2.6.18-rc3-mm2/fs/reiserfs/super.c
@@ -430,21 +430,29 @@ int remove_save_link(struct inode *inode
return journal_end(&th, inode->i_sb, JOURNAL_PER_BALANCE_CNT);
}

-static void reiserfs_put_super(struct super_block *s)
+static void reiserfs_kill_sb(struct super_block *s)
{
- struct reiserfs_transaction_handle th;
- th.t_trans_id = 0;
-
if (REISERFS_SB(s)->xattr_root) {
d_invalidate(REISERFS_SB(s)->xattr_root);
dput(REISERFS_SB(s)->xattr_root);
+ REISERFS_SB(s)->xattr_root = NULL;
}

if (REISERFS_SB(s)->priv_root) {
d_invalidate(REISERFS_SB(s)->priv_root);
dput(REISERFS_SB(s)->priv_root);
+ REISERFS_SB(s)->priv_root = NULL;
}

+ kill_block_super(s);
+}
+
+static void reiserfs_put_super(struct super_block *s)
+{
+ int i;
+ struct reiserfs_transaction_handle th;
+ th.t_trans_id = 0;
+
/* change file system state to current state if it was mounted with read-write permissions */
if (!(s->s_flags & MS_RDONLY)) {
if (!journal_begin(&th, s, 10)) {
@@ -2155,7 +2163,7 @@ struct file_system_type reiserfs_fs_type
.owner = THIS_MODULE,
.name = "reiserfs",
.get_sb = get_super_block,
- .kill_sb = kill_block_super,
+ .kill_sb = reiserfs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
David Howells
2006-08-09 11:00:18 UTC
Permalink
Post by Rafael J. Wysocki
It didn't apply cleanly to -rc3-mm2 for me and produces the appended oops
every time at the kernel startup (on x86_64).
Hmmm... It works okay for me, but then I'm testing it on i686, not x86_64.
Should I draw any meaning from you saying "(on x86_64)"?

Also, can you do:

gdb vmlinux

And then at the prompt, can you disassemble the reiserfs_kill_sb() function:

disas reiserfs_kill_sb

And send me the disassembly?

If I had to guess, I'd say that REISERFS_SB() returned a NULL pointer, and
that sb->s_root is NULL. In which case generic_shutdown_super() will not
invoke reiserfs_put_super().

Something that you can try is to modify reiserfs_kill_sb() to be:

static void reiserfs_kill_sb(struct super_block *s)
{
if (REISERFS_SB(s)) {
if (REISERFS_SB(s)->xattr_root) {
d_invalidate(REISERFS_SB(s)->xattr_root);
dput(REISERFS_SB(s)->xattr_root);
REISERFS_SB(s)->xattr_root = NULL;
}

if (REISERFS_SB(s)->priv_root) {
d_invalidate(REISERFS_SB(s)->priv_root);
dput(REISERFS_SB(s)->priv_root);
REISERFS_SB(s)->priv_root = NULL;
}
}

kill_block_super(s);
}

That way the function will be able to kill a superblock that isn't fully
initialised.

David
David Howells
2006-08-10 10:16:55 UTC
Permalink
This one works on my box just fine.
Excellent, thanks.

David
V***@vt.edu
2006-08-10 03:32:56 UTC
Permalink
Usually this means that there's an IO request in flight and it got lost
somewhere. Device driver bug, IO scheduler bug, etc. Conceivably a
lost interrupt (hardware bug, PCI setup bug, etc).
Aug 9 14:30:24 turing-police kernel: [ 3535.720000] end_request: I/O error, dev fd0, sector 0
Red herring. yum just wedged again, this time with no reference to floppy drive.
Same traceback. Anybody have anything to suggest before I start playing
hunt-the-wumpus with a -mm bisection?
Jiri Slaby
2006-08-10 11:40:11 UTC
Permalink
Post by V***@vt.edu
Usually this means that there's an IO request in flight and it got lost
somewhere. Device driver bug, IO scheduler bug, etc. Conceivably a
lost interrupt (hardware bug, PCI setup bug, etc).
Aug 9 14:30:24 turing-police kernel: [ 3535.720000] end_request: I/O error, dev fd0, sector 0
Red herring. yum just wedged again, this time with no reference to floppy drive.
Same traceback. Anybody have anything to suggest before I start playing
hunt-the-wumpus with a -mm bisection?
Hmm, I have exactly the same problem...
yum + CFQ + BLK_DEV_PIIX + nothing odd in dmesg

[ 3438.574864] yum D 00000000 0 21659 3838
(NOTLB)
[ 3438.575098] e5c09d24 00000001 c180f5a8 00000000 e5c09ce0 c01683e8
fe37c0bc 000002c4
[ 3438.575388] 00001000 00000001 c18fbbd0 0023001f 00000007 f26cc560
c1913560 fe4166d5
[ 3438.575713] 000002c4 0009a619 00000001 f26cc66c c180ec40 c04ff140
e5c09d14 c01fad44
[ 3438.576039] Call Trace:
[ 3438.576113] [<c0373d3b>] io_schedule+0x26/0x30
[ 3438.576187] [<c014653c>] sync_page+0x39/0x45
[ 3438.576260] [<c0374401>] __wait_on_bit_lock+0x41/0x64
[ 3438.576333] [<c01464ef>] __lock_page+0x57/0x5f
[ 3438.576405] [<c014f5f2>] truncate_inode_pages_range+0x1b6/0x304
[ 3438.576480] [<c014f76f>] truncate_inode_pages+0x2f/0x40
[ 3438.576553] [<c01a7bc4>] ext3_delete_inode+0x29/0xf7
[ 3438.576627] [<c017f26b>] generic_delete_inode+0x65/0xe7
[ 3438.576701] [<c017f3aa>] generic_drop_inode+0xbd/0x173
[ 3438.576774] [<c017ed25>] iput+0x6b/0x7b
[ 3438.576846] [<c017cc57>] dentry_iput+0x68/0xb3
[ 3438.576919] [<c017d99e>] dput+0x4f/0x19f
[ 3438.576990] [<c0176164>] sys_renameat+0x1e0/0x212
[ 3438.577063] [<c01761be>] sys_rename+0x28/0x2a
[ 3438.577135] [<c01030fb>] syscall_call+0x7/0xb

regards,
--
<a href="http://www.fi.muni.cz/~xslaby/">Jiri Slaby</a>
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
Andrew Morton
2006-08-10 15:27:49 UTC
Permalink
On Thu, 10 Aug 2006 13:39:11 +0159
Post by Jiri Slaby
Post by V***@vt.edu
Usually this means that there's an IO request in flight and it got lost
somewhere. Device driver bug, IO scheduler bug, etc. Conceivably a
lost interrupt (hardware bug, PCI setup bug, etc).
Aug 9 14:30:24 turing-police kernel: [ 3535.720000] end_request: I/O error, dev fd0, sector 0
Red herring. yum just wedged again, this time with no reference to floppy drive.
Same traceback. Anybody have anything to suggest before I start playing
hunt-the-wumpus with a -mm bisection?
Hmm, I have exactly the same problem...
yum + CFQ + BLK_DEV_PIIX + nothing odd in dmesg
[ 3438.574864] yum D 00000000 0 21659 3838
(NOTLB)
[ 3438.575098] e5c09d24 00000001 c180f5a8 00000000 e5c09ce0 c01683e8
fe37c0bc 000002c4
[ 3438.575388] 00001000 00000001 c18fbbd0 0023001f 00000007 f26cc560
c1913560 fe4166d5
[ 3438.575713] 000002c4 0009a619 00000001 f26cc66c c180ec40 c04ff140
e5c09d14 c01fad44
[ 3438.576113] [<c0373d3b>] io_schedule+0x26/0x30
[ 3438.576187] [<c014653c>] sync_page+0x39/0x45
[ 3438.576260] [<c0374401>] __wait_on_bit_lock+0x41/0x64
[ 3438.576333] [<c01464ef>] __lock_page+0x57/0x5f
[ 3438.576405] [<c014f5f2>] truncate_inode_pages_range+0x1b6/0x304
[ 3438.576480] [<c014f76f>] truncate_inode_pages+0x2f/0x40
[ 3438.576553] [<c01a7bc4>] ext3_delete_inode+0x29/0xf7
[ 3438.576627] [<c017f26b>] generic_delete_inode+0x65/0xe7
[ 3438.576701] [<c017f3aa>] generic_drop_inode+0xbd/0x173
[ 3438.576774] [<c017ed25>] iput+0x6b/0x7b
[ 3438.576846] [<c017cc57>] dentry_iput+0x68/0xb3
[ 3438.576919] [<c017d99e>] dput+0x4f/0x19f
[ 3438.576990] [<c0176164>] sys_renameat+0x1e0/0x212
[ 3438.577063] [<c01761be>] sys_rename+0x28/0x2a
[ 3438.577135] [<c01030fb>] syscall_call+0x7/0xb
Is yum the only process which was stuck in D state?

If so, I'd still be expecting a device driver/iosched bug.

If not, it's probably a vfs/fs deadlock.
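
One way to answer that question is to read the state field of every
/proc/<pid>/stat entry and count the tasks reporting 'D' (uninterruptible
sleep, which is where io_schedule() leaves a task). The helpers below are a
minimal sketch with hypothetical names; they only parse a single stat line,
and walking /proc is left to the caller.

```c
#include <string.h>

/* Return the state letter from one /proc/<pid>/stat line, or '?' if the
 * line is malformed.  The comm field is wrapped in parentheses and may
 * itself contain spaces or ')', so look for the last ')' rather than
 * splitting on whitespace. */
static char stat_state(const char *line)
{
    const char *close = strrchr(line, ')');

    if (!close || close[1] != ' ' || close[2] == '\0')
        return '?';
    return close[2];
}

/* True when the task described by the stat line is in uninterruptible sleep. */
static int is_uninterruptible(const char *line)
{
    return stat_state(line) == 'D';
}
```

If only one task reports 'D', that matches the first case above; several
unrelated tasks stuck in 'D' would point towards the second.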
Mattia Dongili
2006-08-10 17:33:13 UTC
Permalink
Post by Andrew Morton
On Thu, 10 Aug 2006 13:39:11 +0159
Post by Jiri Slaby
Post by V***@vt.edu
Usually this means that there's an IO request in flight and it got lost
somewhere. Device driver bug, IO scheduler bug, etc. Conceivably a
lost interrupt (hardware bug, PCI setup bug, etc).
Aug 9 14:30:24 turing-police kernel: [ 3535.720000] end_request: I/O error, dev fd0, sector 0
Red herring. yum just wedged again, this time with no reference to floppy drive.
Same traceback. Anybody have anything to suggest before I start playing
hunt-the-wumpus with a -mm bisection?
Hmm, I have exactly the same problem...
yum + CFQ + BLK_DEV_PIIX + nothing odd in dmesg
oooh, same setup and same trace here, but no yum, see some screenshots
here:
http://oioio.altervista.org/linux/dsc03448.jpg
http://oioio.altervista.org/linux/dsc03449.jpg

The use case for me was simply:
- boot (in single user for the 2 shots)
- suspend
- resume
- wait some seconds and do anything that accesses the disk

[...]
Post by Andrew Morton
Is yum the only process which was stuck in D state?
in my case anything accessing the disk, leading to lockup shortly
Post by Andrew Morton
If so, I'd still be expecting a device driver/iosched bug.
If not, it's probably a vfs/fs deadlock.
I reverted the full git-block.patch and have been using rc3-mm2 since
then, suspending to RAM and to disk and using my laptop for daily stuff:

reboot system boot 2.6.18-rc3-mm2-1 Tue Aug 8 00:02 - 19:30 (2+19:27)

PS: my previous posts are here: http://lkml.org/lkml/2006/8/7/264
probably an unfortunate Cc list :)
--
mattia
:wq!
Jiri Slaby
2006-08-10 17:43:42 UTC
Permalink
Post by Mattia Dongili
Post by Andrew Morton
On Thu, 10 Aug 2006 13:39:11 +0159
Post by Jiri Slaby
Post by V***@vt.edu
Usually this means that there's an IO request in flight and it got lost
somewhere. Device driver bug, IO scheduler bug, etc. Conceivably a
lost interrupt (hardware bug, PCI setup bug, etc).
Aug 9 14:30:24 turing-police kernel: [ 3535.720000] end_request: I/O error, dev fd0, sector 0
Red herring. yum just wedged again, this time with no reference to floppy drive.
Same traceback. Anybody have anything to suggest before I start playing
hunt-the-wumpus with a -mm bisection?
Hmm, I have exactly the same problem...
yum + CFQ + BLK_DEV_PIIX + nothing odd in dmesg
oooh, same setup and same trace here, but no yum, see some screenshots
http://oioio.altervista.org/linux/dsc03448.jpg
http://oioio.altervista.org/linux/dsc03449.jpg
This is reiser ^^?!, so we can exclude fs? I have this behaviour on ext3.

regards,
--
<a href="http://www.fi.muni.cz/~xslaby/">Jiri Slaby</a>
faculty of informatics, masaryk university, brno, cz
e-mail: jirislaby gmail com, gpg pubkey fingerprint:
B674 9967 0407 CE62 ACC8 22A0 32CC 55C3 39D4 7A7E
V***@vt.edu
2006-08-10 17:44:33 UTC
Permalink
Post by Mattia Dongili
oooh, same setup and same trace here, but no yum, see some screenshots
http://oioio.altervista.org/linux/dsc03448.jpg
http://oioio.altervista.org/linux/dsc03449.jpg
Not quite the same trace - the first few lines are the same, but your call to
__lock_page() comes in via do_generic_mapping_read(), while Jiri and I are
seeing the call to __lock_page() coming from truncate_inode_pages_range()....
Laurent Riffard
2006-08-10 09:04:36 UTC
Permalink
[this is a resend, as the original message may be too big to reach the list...]
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/

Hello,

On my system, a cron job runs every day to check the integrity of the
installed RPMs. It runs "rpm -V" on each package, which computes the
MD5 hash of each installed file and compares it, along with the file
size and modification time, against the values stored in the RPM database.

This is the workload. Since 2.6.18-rc3-mm2, this process eats
all the memory and triggers the OOM killer.

On my system, "free -t" output normally looks like this ("cached" value
is about half of RAM):
# free -t
             total       used       free     shared    buffers     cached
Mem:        515032     508512       6520          0      22992     256032
-/+ buffers/cache:     229488     285544
Swap:      1116428        324    1116104
Total:     1631460     508836    1122624

After the rpm database check, "free -t" says:
             total       used       free     shared    buffers     cached
Mem:        515032     507124       7908          0       8132     398296
-/+ buffers/cache:     100696     414336
Swap:      1116428      34896    1081532
Total:     1631460     542020    1089440

And the value of "cached" won't decrease.
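
As a side note on reading these figures: the numbers above are consistent with the "-/+ buffers/cache" row being derived from the "Mem:" row, by subtracting buffers and cached from "used" and adding them to "free". A minimal sketch in C under that assumption (the struct and function names are made up for illustration and are not part of free(1)):

```c
#include <assert.h>
#include <stddef.h>

/* One "Mem:" row of free(1) output; all values in kB. */
struct free_mem_row {
    long total;
    long used;
    long free_kb;
    long buffers;
    long cached;
};

/* "used" column of the "-/+ buffers/cache" row:
 * memory in use once buffers and page cache are excluded. */
static long used_minus_cache(const struct free_mem_row *r)
{
    assert(r != NULL);
    return r->used - r->buffers - r->cached;
}

/* "free" column of the "-/+ buffers/cache" row:
 * free memory plus buffers and page cache. */
static long free_plus_cache(const struct free_mem_row *r)
{
    assert(r != NULL);
    return r->free_kb + r->buffers + r->cached;
}
```

With the two "Mem:" rows above, this gives 229488 / 285544 kB before the check and 100696 / 414336 kB after it, which matches the reported rows.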


This evening, this process triggered the OOM killer. Here is its first report:

syslogd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
[show_trace+13/16] show_trace+0xd/0x10
[<c0104c18>] show_trace+0xd/0x10
[dump_stack+25/29] dump_stack+0x19/0x1d
[<c0104c34>] dump_stack+0x19/0x1d
[out_of_memory+93/422] out_of_memory+0x5d/0x1a6
[<c013be03>] out_of_memory+0x5d/0x1a6
[__alloc_pages+505/633] __alloc_pages+0x1f9/0x279
[<c013d25f>] __alloc_pages+0x1f9/0x279
[__do_page_cache_readahead+165/495] __do_page_cache_readahead+0xa5/0x1ef
[<c013e71b>] __do_page_cache_readahead+0xa5/0x1ef
[do_page_cache_readahead+66/80] do_page_cache_readahead+0x42/0x50
[<c013ec64>] do_page_cache_readahead+0x42/0x50
[filemap_nopage+412/882] filemap_nopage+0x19c/0x372
[<c013afbe>] filemap_nopage+0x19c/0x372
[__handle_mm_fault+540/1772] __handle_mm_fault+0x21c/0x6ec
[<c014435d>] __handle_mm_fault+0x21c/0x6ec
[do_page_fault+397/1158] do_page_fault+0x18d/0x486
[<c0111e1f>] do_page_fault+0x18d/0x486
[error_code+57/64] error_code+0x39/0x40
[<c0293079>] error_code+0x39/0x40
Mem-info:
DMA per-cpu:
cpu 0 hot: high 0, batch 1 used:0
cpu 0 cold: high 0, batch 1 used:0
Normal per-cpu:
cpu 0 hot: high 186, batch 31 used:63
cpu 0 cold: high 62, batch 15 used:61
Active:1621 inactive:97987 dirty:0 writeback:33 unstable:0 free:1215 slab:23388 mapped:3 pagetables:446
DMA free:2068kB min:88kB low:108kB high:132kB active:0kB inactive:7432kB present:16384kB pages_scanned:11284 all_unreclaimable? yes
lowmem_reserve[]: 0 495
Normal free:2792kB min:2804kB low:3504kB high:4204kB active:6484kB inactive:384516kB present:507824kB pages_scanned:670357 all_unreclaimable? yes
lowmem_reserve[]: 0 0
DMA: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2068kB
Normal: 0*4kB 1*8kB 6*16kB 2*32kB 1*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 2792kB
Swap cache: add 109576, delete 109542, find 12933/22258, race 0+8
Free swap = 936452kB
Total swap = 1116428kB
Free swap: 936452kB
131052 pages of RAM
0 pages of HIGHMEM
2358 reserved pages
2668 pages shared
34 pages swap cached
0 pages dirty
33 pages writeback
3 pages mapped
23388 pages slab
446 pages pagetables
Out of Memory: Kill process 23392 (seamonkey-bin) score 48523 and children.
Out of memory: Killed process 23392 (seamonkey-bin).
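
The per-zone free lists in the report can be cross-checked against the "free:" figures: each "N*SkB" term is N free blocks of S kB. A small sketch in C, assuming 4 kB base pages and the 11 block sizes printed above (the function name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PAGE_KB 4 /* assumption: 4 kB pages, as in the "1*4kB" column */

/* Sum one free-list line: counts[i] free blocks of (4 << i) kB each. */
static long buddy_free_kb(const unsigned int *counts, size_t norders)
{
    long total = 0;
    size_t order;

    assert(counts != NULL);
    for (order = 0; order < norders; order++)
        total += (long)counts[order] * ((long)BASE_PAGE_KB << order);
    return total;
}
```

Fed the counts from the DMA and Normal lines above, it returns 2068 and 2792, matching the reported free figures. Note that Normal free (2792kB) is below that zone's min watermark (2804kB).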


I gathered some data before the rpm database check and near the end of it:
- /proc/slabinfo
- /proc/slab_allocators
- /proc/meminfo
- free -t

Please look in http://laurent.riffard.free.fr/2.6.18-rc3-mm2. You'll
find dmesg and .config too.

For information:

/proc/sys/vm/block_dump:0
/proc/sys/vm/dirty_background_ratio:10
/proc/sys/vm/dirty_expire_centisecs:3000
/proc/sys/vm/dirty_ratio:40
/proc/sys/vm/dirty_writeback_centisecs:500
/proc/sys/vm/drop_caches:0
/proc/sys/vm/laptop_mode:0
/proc/sys/vm/legacy_va_layout:0
/proc/sys/vm/lowmem_reserve_ratio:256
/proc/sys/vm/max_map_count:65536
/proc/sys/vm/min_free_kbytes:2896
/proc/sys/vm/nr_pdflush_threads:2
/proc/sys/vm/overcommit_memory:0
/proc/sys/vm/overcommit_ratio:50
/proc/sys/vm/page-cluster:3
/proc/sys/vm/panic_on_oom:0
/proc/sys/vm/percpu_pagelist_fraction:0
/proc/sys/vm/readahead_hit_rate:1
/proc/sys/vm/readahead_ratio:50
/proc/sys/vm/swappiness:60
/proc/sys/vm/swap_prefetch:1
/proc/sys/vm/swap_token_timeout:300
/proc/sys/vm/vdso_enabled:1
/proc/sys/vm/vfs_cache_pressure:100

# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev /dev tmpfs rw 0 0
/dev/vglinux1/lvroot / ext3 rw,data=ordered 0 0
/proc /proc proc rw 0 0
/sys /sys sysfs rw 0 0
none /dev/pts devpts rw 0 0
none /dev/shm tmpfs rw 0 0
none /proc/bus/usb usbfs rw 0 0
/dev/hda2 /boot ext2 rw 0 0
/dev/vglinux1/lvhome /home reiserfs rw 0 0
/dev/vglinux1/lvusr /usr reiserfs ro 0 0
/dev/vglinux1/lvvar /var ext3 rw,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
automount(pid1949) /vol autofs rw,fd=4,pgrp=1949,timeout=5,minproto=2,maxproto=4,indirect 0 0

~~
laurent
Andrew Morton
2006-08-10 09:19:57 UTC
Permalink
On Thu, 10 Aug 2006 11:04:36 +0200
Post by Laurent Riffard
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
Hello,
On my system, a cron job runs every day to check the integrity of
installed RPMs. It runs "rpm -V" on each package, which computes the
MD5 hash of each installed file and compares it, along with the file
size and modification time, with the values stored in the RPM database.
This is the workload. Since 2.6.18-rc3-mm2, this process eats
all the memory and triggers OOM.
On my system, "free -t" output normally looks like this ("cached" value
is about half of RAM):
# free -t
             total       used       free     shared    buffers     cached
Mem:        515032     508512       6520          0      22992     256032
-/+ buffers/cache:     229488     285544
Swap:      1116428        324    1116104
Total:     1631460     508836    1122624
After the rpm database check, "free -t" says:
             total       used       free     shared    buffers     cached
Mem:        515032     507124       7908          0       8132     398296
-/+ buffers/cache:     100696     414336
Swap:      1116428      34896    1081532
Total:     1631460     542020    1089440
And the value of "cached" won't decrease.
Yes, I was just trying to reproduce this. No luck so far. Will try your
.config tomorrow.

It would be interesting to try disabling CONFIG_ADAPTIVE_READAHEAD -
perhaps that got broken.

Also, are you able to determine whether the problem is specific to `rpm
-V'? Are you able to make the leak trigger using other filesystem
workloads?

If it's specific to `rpm -V' then perhaps direct-io is somehow causing
pagecache leakage. That would be a bit odd.



btw, it's not necessary to go all the way to oom to work out if the
pagecache leak is happening. After booting, do

echo 3 > /proc/sys/vm/drop_caches

and record the `Cached' figure in /proc/meminfo. After running some test,
run `echo 3 > /proc/sys/vm/drop_caches' again and check
/proc/meminfo:Cached. If it didn't go down to a similarly low figure,
we're leaking pagecache.
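
The before/after comparison can be scripted. A rough sketch in C that pulls one field out of /proc/meminfo-style text; meminfo_field_kb() is a made-up helper, and the text is assumed to have already been read into a NUL-terminated buffer:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the value, in kB, of field `key` (for example "Cached") from
 * meminfo-format text, or -1 if the field is absent or unparsable.
 * The key must match at the start of a line and be followed by ':',
 * so "Cached" does not match the "SwapCached:" line.
 */
static long meminfo_field_kb(const char *text, const char *key)
{
    const char *line = text;
    size_t klen;

    assert(text != NULL && key != NULL);
    klen = strlen(key);

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;

            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}
```

Calling it with "Cached" on a snapshot taken after each `echo 3` gives the two figures to compare; if the second is far above the first, pagecache is probably not being released.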

btw2: please use /proc/meminfo output rather than free(1), because free(1)
shows less info, and it does mysterious mangling of the info which it does
read in ways which confuse me.

Thanks.
Frederik Deweerdt
2006-08-10 12:13:36 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
Hi Andrew,

This patch aims at removing two implementations (spotted by Masatake YAMATO) of
pseudo-rwlocks using a spinlock_t and an atomic_t. One in net/socket.c
and another in net/bluetooth/af_bluetooth.c. I think that both could be
converted to rwsems, saving some lines of code.

Regards,
Frederik


Signed-off-by: Frederik Deweerdt <***@gmail.com>

net/dccp/ccid.c | 63 ++++++++++++------------------------------------------------
net/socket.c | 58 +++++++------------------------------------------------
2 files changed, 21 insertions(+), 100 deletions(-)

diff --git a/net/dccp/ccid.c b/net/dccp/ccid.c
--- a/net/dccp/ccid.c
+++ b/net/dccp/ccid.c
@@ -12,48 +12,11 @@
*/

#include "ccid.h"
+#include <linux/rwsem.h>

static struct ccid_operations *ccids[CCID_MAX];
-#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
-static atomic_t ccids_lockct = ATOMIC_INIT(0);
-static DEFINE_SPINLOCK(ccids_lock);
+static DECLARE_RWSEM(ccids_sem);

-/*
- * The strategy is: modifications ccids vector are short, do not sleep and
- * veeery rare, but read access should be free of any exclusive locks.
- */
-static void ccids_write_lock(void)
-{
- spin_lock(&ccids_lock);
- while (atomic_read(&ccids_lockct) != 0) {
- spin_unlock(&ccids_lock);
- yield();
- spin_lock(&ccids_lock);
- }
-}
-
-static inline void ccids_write_unlock(void)
-{
- spin_unlock(&ccids_lock);
-}
-
-static inline void ccids_read_lock(void)
-{
- atomic_inc(&ccids_lockct);
- spin_unlock_wait(&ccids_lock);
-}
-
-static inline void ccids_read_unlock(void)
-{
- atomic_dec(&ccids_lockct);
-}
-
-#else
-#define ccids_write_lock() do { } while(0)
-#define ccids_write_unlock() do { } while(0)
-#define ccids_read_lock() do { } while(0)
-#define ccids_read_unlock() do { } while(0)
-#endif

static kmem_cache_t *ccid_kmem_cache_create(int obj_size, const char *fmt,...)
{
@@ -103,13 +66,13 @@ int ccid_register(struct ccid_operations
if (ccid_ops->ccid_hc_tx_slab == NULL)
goto out_free_rx_slab;

- ccids_write_lock();
+ down_write(&ccids_sem);
err = -EEXIST;
if (ccids[ccid_ops->ccid_id] == NULL) {
ccids[ccid_ops->ccid_id] = ccid_ops;
err = 0;
}
- ccids_write_unlock();
+ up_write(&ccids_sem);
if (err != 0)
goto out_free_tx_slab;

@@ -131,9 +94,9 @@ EXPORT_SYMBOL_GPL(ccid_register);

int ccid_unregister(struct ccid_operations *ccid_ops)
{
- ccids_write_lock();
+ down_write(&ccids_sem);
ccids[ccid_ops->ccid_id] = NULL;
- ccids_write_unlock();
+ up_write(&ccids_sem);

ccid_kmem_cache_destroy(ccid_ops->ccid_hc_tx_slab);
ccid_ops->ccid_hc_tx_slab = NULL;
@@ -152,15 +115,15 @@ struct ccid *ccid_new(unsigned char id,
struct ccid_operations *ccid_ops;
struct ccid *ccid = NULL;

- ccids_read_lock();
+ down_read(&ccids_sem);
#ifdef CONFIG_KMOD
if (ccids[id] == NULL) {
/* We only try to load if in process context */
- ccids_read_unlock();
+ up_read(&ccids_sem);
if (gfp & GFP_ATOMIC)
goto out;
request_module("net-dccp-ccid-%d", id);
- ccids_read_lock();
+ down_read(&ccids_sem);
}
#endif
ccid_ops = ccids[id];
@@ -170,7 +133,7 @@ #endif
if (!try_module_get(ccid_ops->ccid_owner))
goto out_unlock;

- ccids_read_unlock();
+ up_read(&ccids_sem);

ccid = kmem_cache_alloc(rx ? ccid_ops->ccid_hc_rx_slab :
ccid_ops->ccid_hc_tx_slab, gfp);
@@ -191,7 +154,7 @@ #endif
out:
return ccid;
out_unlock:
- ccids_read_unlock();
+ up_read(&ccids_sem);
goto out;
out_free_ccid:
kmem_cache_free(rx ? ccid_ops->ccid_hc_rx_slab :
@@ -235,10 +198,10 @@ static void ccid_delete(struct ccid *cci
ccid_ops->ccid_hc_tx_exit(sk);
kmem_cache_free(ccid_ops->ccid_hc_tx_slab, ccid);
}
- ccids_read_lock();
+ down_read(&ccids_sem);
if (ccids[ccid_ops->ccid_id] != NULL)
module_put(ccid_ops->ccid_owner);
- ccids_read_unlock();
+ up_read(&ccids_sem);
}

void ccid_hc_rx_delete(struct ccid *ccid, struct sock *sk)
diff --git a/net/socket.c b/net/socket.c
index 53cb85b..bc52aeb 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -85,6 +85,7 @@ #include <linux/compat.h>
#include <linux/kmod.h>
#include <linux/audit.h>
#include <linux/wireless.h>
+#include <linux/rwsem.h>

#include <asm/uaccess.h>
#include <asm/unistd.h>
@@ -143,50 +144,7 @@ #endif

static struct net_proto_family *net_families[NPROTO];

-#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
-static atomic_t net_family_lockct = ATOMIC_INIT(0);
-static DEFINE_SPINLOCK(net_family_lock);
-
-/* The strategy is: modifications net_family vector are short, do not
- sleep and veeery rare, but read access should be free of any exclusive
- locks.
- */
-
-static void net_family_write_lock(void)
-{
- spin_lock(&net_family_lock);
- while (atomic_read(&net_family_lockct) != 0) {
- spin_unlock(&net_family_lock);
-
- yield();
-
- spin_lock(&net_family_lock);
- }
-}
-
-static __inline__ void net_family_write_unlock(void)
-{
- spin_unlock(&net_family_lock);
-}
-
-static __inline__ void net_family_read_lock(void)
-{
- atomic_inc(&net_family_lockct);
- spin_unlock_wait(&net_family_lock);
-}
-
-static __inline__ void net_family_read_unlock(void)
-{
- atomic_dec(&net_family_lockct);
-}
-
-#else
-#define net_family_write_lock() do { } while(0)
-#define net_family_write_unlock() do { } while(0)
-#define net_family_read_lock() do { } while(0)
-#define net_family_read_unlock() do { } while(0)
-#endif
-
+static DECLARE_RWSEM(net_family_sem);

/*
* Statistics counters of the socket lists
@@ -1132,7 +1090,7 @@ #if defined(CONFIG_KMOD)
}
#endif

- net_family_read_lock();
+ down_read(&net_family_sem);
if (net_families[family] == NULL) {
err = -EAFNOSUPPORT;
goto out;
@@ -1185,7 +1143,7 @@ #endif
goto out_release;

out:
- net_family_read_unlock();
+ up_read(&net_family_sem);
return err;
out_module_put:
module_put(net_families[family]->owner);
@@ -2034,13 +1992,13 @@ int sock_register(struct net_proto_famil
printk(KERN_CRIT "protocol %d >= NPROTO(%d)\n", ops->family, NPROTO);
return -ENOBUFS;
}
- net_family_write_lock();
+ down_write(&net_family_sem);
err = -EEXIST;
if (net_families[ops->family] == NULL) {
net_families[ops->family]=ops;
err = 0;
}
- net_family_write_unlock();
+ up_write(&net_family_sem);
printk(KERN_INFO "NET: Registered protocol family %d\n",
ops->family);
return err;
@@ -2057,9 +2015,9 @@ int sock_unregister(int family)
if (family < 0 || family >= NPROTO)
return -1;

- net_family_write_lock();
+ down_write(&net_family_sem);
net_families[family]=NULL;
- net_family_write_unlock();
+ up_write(&net_family_sem);
printk(KERN_INFO "NET: Unregistered protocol family %d\n",
family);
return 0;
David Miller
2006-08-10 12:57:11 UTC
Permalink
From: Frederik Deweerdt <***@free.fr>
Date: Thu, 10 Aug 2006 14:13:36 +0200
Post by Frederik Deweerdt
This patch aims at removing two implementations (spotted by Masatake YAMATO) of
pseudo-rwlocks using a spinlock_t and an atomic_t. One in net/socket.c
and another in net/bluetooth/af_bluetooth.c. I think that both could be
converted to rwsems, saving some lines of code.
The net/socket.c one has been converted to RCU by Stephen
Hemminger already.

If the bluetooth case is in an important code path it should
use RCU as well.
Frederik Deweerdt
2006-08-10 13:19:16 UTC
Permalink
Post by David Miller
Date: Thu, 10 Aug 2006 14:13:36 +0200
Post by Frederik Deweerdt
This patch aims at removing two implementations (spotted by Masatake YAMATO) of
pseudo-rwlocks using a spinlock_t and an atomic_t. One in net/socket.c
and another in net/bluetooth/af_bluetooth.c. I think that both could be
converted to rwsems, saving some lines of code.
The net/socket.c one has been converted to RCU by Stephen
Hemminger already.
If the bluetooth case is in an important code path it should
use RCU as well.
Sorry, I made a mistake there: net/bluetooth/af_bluetooth.c should read
net/dccp/ccid.c. Does your comment regarding af_bluetooth.c apply to
ccid.c as well?
Also, is there a place where I can find Stephen Hemminger's work?
- Note, this is pure curiosity, it can wait a kernel release or two :) -

Thanks,
Frederik
Reuben Farrelly
2006-08-10 13:43:53 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
- 2.6.18-rc3-mm1 gets mysterious udev timeouts during boot and crashes in
NFS. This kernel reverts the patches which were causing that.
Just hit this one upon shutdown (no traces logged before then):

INIT: Sending processes the TERM signal
INITStopping clamd: [FAILED]
Starting killall: Stopping clamd: [FAILED]
[ OK ]
Sending all processes the TERM signal...
Sending all processes the KILL signal...
Saving random seed:
Syncing hardware clock to system time
Turning off swap:
Unmounting file systems: umount2: Device or resource busy
umount: /var/www/html: device is busy
umount2: Device or resource busy
umount: /var/www/html: device is busy
BUG: Dentry ffff81003d0f34f0{i=3,n=.reiserfs_priv} still in use (1) [unmount of
reiserfs sdc8]
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/dcache.c:611
invalid opcode: 0000 [1] SMP
last sysfs file:
/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.2/2-1.2:1.0/bInterfaceProtocol
CPU 0
Modules linked in: ipv6 ip_gre binfmt_misc i2c_i801 iTCO_wdt serio_raw
Pid: 22715, comm: umount Not tainted 2.6.18-rc3-mm2 #1
RIP: 0010:[<ffffffff802ce943>] [<ffffffff802ce943>]
shrink_dcache_for_umount_subtree+0x1a3/0x2a7
RSP: 0018:ffff81002ec6fd98 EFLAGS: 00010292
RAX: 0000000000000062 RBX: ffff81003d0f34f0 RCX: 0000000000000003
RDX: 0000000000000008 RSI: ffff810035224740 RDI: ffff810035224040
RBP: ffff81002ec6fdb8 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff80216800 R11: 0000000000000000 R12: ffff81003d0f34f0
R13: ffff8100025b2ce8 R14: ffff81002f936d30 R15: 0000000000000000
FS: 00002b532ecdd4b0(0000) GS:ffffffff808b5000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b532ecd0000 CR3: 000000003273e000 CR4: 00000000000006e0
Process umount (pid: 22715, threadinfo ffff81002ec6e000, task ffff810035224040)
Stack: ffff81003d29c980 ffff81003d29c588 ffffffff80595640 ffff81002ec6fea8
ffff81002ec6fdd8 ffffffff802ceea9 ffffffff805955e0 ffff81003d29c588
ffff81002ec6fe08 ffffffff802c6944 ffff81002f936d30 ffff81003e99e2c0
Call Trace:
[<ffffffff802ceea9>] shrink_dcache_for_umount+0x37/0x6e
[<ffffffff802c6944>] generic_shutdown_super+0x24/0x151
[<ffffffff802c6a97>] kill_block_super+0x26/0x3b
[<ffffffff802c6b65>] deactivate_super+0x4c/0x67
[<ffffffff8022d061>] mntput_no_expire+0x58/0x92
[<ffffffff80232562>] path_release_on_umount+0x1d/0x2b
[<ffffffff802d1182>] sys_umount+0x252/0x29b
[<ffffffff8025f45e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:


Code: 0f 0b 68 c9 47 4c 80 c2 63 02 4c 8b 63 50 49 39 dc 75 05 45
RIP [<ffffffff802ce943>] shrink_dcache_for_umount_subtree+0x1a3/0x2a7
RSP <ffff81002ec6fd98>
/etc/rc6.d/S01reboot: line 14: 22715 Segmentation fault "$@"

/var/www/html: c
/var: mcm
Unmounting file systems (retry):<3>BUG: Dentry
ffff81003ef61e80{i=3,n=.reiserfs_priv} still in use (1) [unmount of reiserfs sda8]
umount2: Devic----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/dcache.c:611
invalid opcode: 0000 [2] SMP
last sysfs file:
/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.2/2-1.2:1.0/bInterfaceProtocol
CPU 1
Modules linked in: ipv6 ip_gre binfmt_misc i2c_i801 iTCO_wdt serio_raw
Pid: 22722, comm: umount Not tainted 2.6.18-rc3-mm2 #1
RIP: 0010:[<ffffffff802ce943>] e or resource bu [<ffffffff802ce943>]
shrink_dcache_for_umount_subtree+0x1a3/0x2a7
RSP: 0018:ffff810027e1dd98 EFLAGS: 00010292
RAX: 0000000000000062 RBX: ffff81003ef61e80 RCX: 0000000000000000
sy
umount: /varRDX: ffff810015f99140 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff810027e1ddb8 R08: 0000000000000002 R09: 0000000000000001
R10: ffffffff80216800 R11: 0000000000000001 R12: ffff81003ef61e80
/www/html: devicR13: ffff8100131f3648 R14: ffff81002f936e18 R15: 0000000000000000
FS: 00002b52520af4b0(0000) GS:ffff81003f6eb430(0000) knlGS:0000000000000000
e is busy
umounCS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff58a02ea0 CR3: 000000002ec49000 CR4: 00000000000006e0
t2: Device or reProcess umount (pid: 22722, threadinfo ffff810027e1c000, task
ffff810015f99140)
Stack: ffff81003d29d198 ffff81003d29cda0 ffffffff80595640 ffff810027e1dea8
ffff810027e1ddd8 ffffffff802ceea9 ffffffff805955e0 ffff81003d29cda0
ffff810027e1de08 ffffffff802c6944 ffff81002f936e18 ffff81003ebaa938
Call Trace:
source busy
umo [<ffffffff802ceea9>] shrink_dcache_for_umount+0x37/0x6e
unt: /var/www/ht [<ffffffff802c6944>] generic_shutdown_super+0x24/0x151
ml: device is bu [<ffffffff802c6a97>] kill_block_super+0x26/0x3b
sy
[<ffffffff802c6b65>] deactivate_super+0x4c/0x67
[<ffffffff8022d061>] mntput_no_expire+0x58/0x92
[<ffffffff80232562>] path_release_on_umount+0x1d/0x2b
[<ffffffff802d1182>] sys_umount+0x252/0x29b
[<ffffffff8025f45e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:


Code: 0f 0b 68 c9 47 4c 80 c2 63 02 4c 8b 63 50 49 39 dc 75 05 45
RIP [<ffffffff802ce943>] shrink_dcache_for_umount_subtree+0x1a3/0x2a7
RSP <ffff810027e1dd98>
/etc/rc6.d/S01reboot: line 14: 22722 Segmentation fault "$@"

/var/www/html: c
/var: mcm

Yes, there are bits of the shutdown mixed in which doesn't really help readability.

The reason I shut the box down was due to yum hanging and becoming a 'D' process
which was unkillable.

What is strange is that /var/www/html should not be busy as there are no mounts
underneath it. It's just a standard ext3 partition.

[***@tornado ~]# mount
/dev/md0 on / type ext3 (rw)
none on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
none on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext3 (rw)
/dev/md1 on /home type ext3 (rw)
/dev/md2 on /var type ext3 (rw)
/dev/md3 on /var/www/html type ext3 (rw)
/dev/md4 on /var/www/cgi-bin type ext3 (rw)
/dev/md5 on /store type ext3 (rw)
/dev/sda8 on /var/spool/squid-1 type reiserfs (rw,noatime,notail)
/dev/sdc8 on /var/spool/squid-2 type reiserfs (rw,noatime,notail)
/dev/sda9 on /tmp type ext3 (rw)
/dev/shm on /var/spool/amavisd/tmp type tmpfs (rw,size=25m,mode=700,uid=101,gid=511)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
[***@tornado ~]#

Looks identical to
http://www.uwsg.iu.edu/hypermail/linux/kernel/0606.3/2802.html which hasn't
appeared since then. I remember it was reproducible at the time, but it
disappeared for a while and has only just come back.

Reuben
Andrew Morton
2006-08-10 15:38:06 UTC
Permalink
On Fri, 11 Aug 2006 01:43:53 +1200
Post by Reuben Farrelly
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
- 2.6.18-rc3-mm1 gets mysterious udev timeouts during boot and crashes in
NFS. This kernel reverts the patches which were causing that.
INIT: Sending processes the TERM signal
INITStopping clamd: [FAILED]
Starting killall: Stopping clamd: [FAILED]
[ OK ]
Sending all processes the TERM signal...
Sending all processes the KILL signal...
Syncing hardware clock to system time
Unmounting file systems: umount2: Device or resource busy
umount: /var/www/html: device is busy
umount2: Device or resource busy
umount: /var/www/html: device is busy
BUG: Dentry ffff81003d0f34f0{i=3,n=.reiserfs_priv} still in use (1) [unmount of
reiserfs sdc8]
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/dcache.c:611
invalid opcode: 0000 [1] SMP
/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.2/2-1.2:1.0/bInterfaceProtocol
CPU 0
Modules linked in: ipv6 ip_gre binfmt_misc i2c_i801 iTCO_wdt serio_raw
Pid: 22715, comm: umount Not tainted 2.6.18-rc3-mm2 #1
RIP: 0010:[<ffffffff802ce943>] [<ffffffff802ce943>]
shrink_dcache_for_umount_subtree+0x1a3/0x2a7
RSP: 0018:ffff81002ec6fd98 EFLAGS: 00010292
RAX: 0000000000000062 RBX: ffff81003d0f34f0 RCX: 0000000000000003
RDX: 0000000000000008 RSI: ffff810035224740 RDI: ffff810035224040
RBP: ffff81002ec6fdb8 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff80216800 R11: 0000000000000000 R12: ffff81003d0f34f0
R13: ffff8100025b2ce8 R14: ffff81002f936d30 R15: 0000000000000000
FS: 00002b532ecdd4b0(0000) GS:ffffffff808b5000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b532ecd0000 CR3: 000000003273e000 CR4: 00000000000006e0
Process umount (pid: 22715, threadinfo ffff81002ec6e000, task ffff810035224040)
Stack: ffff81003d29c980 ffff81003d29c588 ffffffff80595640 ffff81002ec6fea8
ffff81002ec6fdd8 ffffffff802ceea9 ffffffff805955e0 ffff81003d29c588
ffff81002ec6fe08 ffffffff802c6944 ffff81002f936d30 ffff81003e99e2c0
[<ffffffff802ceea9>] shrink_dcache_for_umount+0x37/0x6e
[<ffffffff802c6944>] generic_shutdown_super+0x24/0x151
[<ffffffff802c6a97>] kill_block_super+0x26/0x3b
[<ffffffff802c6b65>] deactivate_super+0x4c/0x67
[<ffffffff8022d061>] mntput_no_expire+0x58/0x92
[<ffffffff80232562>] path_release_on_umount+0x1d/0x2b
[<ffffffff802d1182>] sys_umount+0x252/0x29b
[<ffffffff8025f45e>] system_call+0x7e/0x83
yup, thanks. We're expecting that
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/hot-fixes/reiserfs-make-sure-all-dentries-refs-are-released-before-calling-kill_block_super-try-2.patch
will fix this.
V***@vt.edu
2006-08-10 17:38:59 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
Building a kernel with IPV6_MULTIPLE_TABLES=y breaks my IPv6 connectivity
quite badly. It basically totally refuses to answer an IPv6 Neighbor Solicit
packet or IPv6 Echo Request packet. I run a 'tcpdump -n ipv6', and I see the
requests come in, and no packets leaving. Interestingly enough, if I try to
ping6 *out* of the box, it's totally willing to send a Neighbor Solicit outbound
(although it appears to totally ignore the Neighbor Advert packet that comes
back). Of course, things don't work very well at all with busticated Neighbor
Solicit.

A kernel built with IPV6_MULTIPLE_TABLES=n works just fine.

The relevant ifconfig (eth3 is a 100mbit port, eth5 is a wireless card):

eth3 Link encap:Ethernet HWaddr 00:06:5B:EA:8E:4E
inet addr:128.173.14.107 Bcast:128.173.15.255 Mask:255.255.252.0
inet6 addr: 2001:468:c80:2103:206:5bff:feea:8e4e/64 Scope:Global
inet6 addr: fe80::206:5bff:feea:8e4e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15529 errors:0 dropped:0 overruns:1 frame:0
TX packets:2073 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2333290 (2.2 MiB) TX bytes:228862 (223.4 KiB)
Interrupt:11 Base address:0x6800

eth5 Link encap:Ethernet HWaddr 00:02:2D:5C:11:48
inet addr:198.82.168.129 Bcast:198.82.168.255 Mask:255.255.255.0
inet6 addr: 2001:468:c80:2181:202:2dff:fe5c:1148/64 Scope:Global
inet6 addr: fe80::202:2dff:fe5c:1148/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2096 errors:0 dropped:0 overruns:0 frame:0
TX packets:144 errors:1 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:280919 (274.3 KiB) TX bytes:22184 (21.6 KiB)
Interrupt:11 Base address:0xe100

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1583 errors:0 dropped:0 overruns:0 frame:0
TX packets:1583 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:642598 (627.5 KiB) TX bytes:642598 (627.5 KiB)

A working routing table:

netstat -r -n -A inet6
Kernel IPv6 routing table
Destination Next Hop Flags Metric Ref Use Iface
::1/128 :: U 0 12 1 lo
2001:468:c80:2103:206:5bff:feea:8e4e/128 :: U 0 4 1 lo
2001:468:c80:2103::/64 :: UA 256 113 0 eth3
2001:468:c80:2181:202:2dff:fe5c:1148/128 :: U 0 0 1 lo
2001:468:c80:2181::/64 :: UA 256 11 0 eth5
fe80::202:2dff:fe5c:1148/128 :: U 0 0 1 lo
fe80::206:5bff:feea:8e4e/128 :: U 0 2 1 lo
fe80::/64 :: U 256 0 0 eth3
fe80::/64 :: U 256 0 0 eth5
ff02::1/128 ff02::1 UC 0 113 0 eth3
ff02::1/128 ff02::1 UC 0 1 0 eth5
ff00::/8 :: U 256 0 0 eth3
ff00::/8 :: U 256 0 0 eth5
::/0 fe80::20f:35ff:fe3e:d41a UGDA 1024 1 0 eth3
::/0 fe80::20f:35ff:fe3e:d41a UGDA 1024 1 0 eth5
Patrick McHardy
2006-08-10 20:02:03 UTC
Permalink
Post by V***@vt.edu
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc3/2.6.18-rc3-mm2/
Building a kernel with IPV6_MULTIPLE_TABLES=y breaks my IPv6 connectivity
quite badly. It basically totally refuses to answer an IPv6 Neighbor Solicit
packet or IPv6 Echo Request packet. I run a 'tcpdump -n ipv6', and I see the
requests come in, and no packets leaving. Interestingly enough, if I try to
ping6 *out* of the box, it's totally willing to send a Neighbor Solicit outbound
(although it appears to totally ignore the Neighbor Advert packet that comes
back). Of course, things don't work very well at all with busticated Neighbor
Solicit.
A kernel built with IPV6_MULTIPLE_TABLES=n works just fine.
It should be fixed by this patch (already contained in net-2.6.19).