Discussion:
2.6.9-rc2-mm1
Diego Calleja
2004-09-16 13:10:52 UTC
On Thu, 16 Sep 2004 02:40:20 -0700, Andrew Morton <***@osdl.org> wrote:

[...]
-tty_io-hangup-locking.patch
+tty-locking-for-269rc2.patch
+tty-locking-for-269rc2-fixes.patch
Alan's tty locking rework
I couldn't connect to the internet; the connection didn't get beyond...
Sep 16 15:01:51 estel chat[1235]: ATDTSOMETHING^M^M
Sep 16 15:01:51 estel chat[1235]: CONNECT
Sep 16 15:01:51 estel chat[1235]: -- got it
Sep 16 15:01:51 estel chat[1235]: send (\d)
Sep 16 15:01:52 estel pppd[1233]: Serial connection established.
^^^
here (the rest is from a working kernel)
Sep 16 15:01:52 estel pppd[1233]: using channel 1
Sep 16 15:01:52 estel pppd[1233]: Using interface ppp0
Sep 16 15:01:52 estel pppd[1233]: Connect: ppp0 <--> /dev/ttyS1

Reverting the two patches quoted above made it work again. The box
is an SMP dual Pentium III; the sysrq-t output for pppd was:
Sep 16 14:44:45 estel kernel: pppd S 00000000 0 1365 1 1323 (NOTLB)
Sep 16 14:44:45 estel kernel: db495e9c 00000086 c0116680 00000000 00000000 bffff978 ffff037f 43e740f2
Sep 16 14:44:45 estel kernel: 0000001b db8f2170 c0116680 de36b2a0 07a2d1a5 00000000 c1409f60 db8f22cc
Sep 16 14:44:45 estel kernel: db494000 d919f000 db495edc 00000003 c01b9739 00000000 db4221a8 c016308b
Sep 16 14:44:45 estel kernel: Call Trace:
Sep 16 14:44:45 estel kernel: [default_wake_function+0/16] default_wake_function+0x0/0x10
Sep 16 14:44:45 estel kernel: [default_wake_function+0/16] default_wake_function+0x0/0x10
Sep 16 14:44:45 estel kernel: [tty_set_ldisc+457/1232] tty_set_ldisc+0x1c9/0x4d0
Sep 16 14:44:45 estel kernel: [locks_delete_lock+91/256] locks_delete_lock+0x5b/0x100
Sep 16 14:44:45 estel kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50
Sep 16 14:44:45 estel kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50
Sep 16 14:44:45 estel kernel: [serial8250_tx_empty+52/80] serial8250_tx_empty+0x34/0x50
Sep 16 14:44:45 estel kernel: [remove_wait_queue+23/112] remove_wait_queue+0x17/0x70
Sep 16 14:44:45 estel kernel: [tty_wait_until_sent+230/256] tty_wait_until_sent+0xe6/0x100
Sep 16 14:44:45 estel kernel: [default_wake_function+0/16] default_wake_function+0x0/0x10
Sep 16 14:44:45 estel kernel: [n_tty_open+0/160] n_tty_open+0x0/0xa0
Sep 16 14:44:45 estel kernel: [n_tty_close+0/48] n_tty_close+0x0/0x30
Sep 16 14:44:45 estel kernel: [n_tty_flush_buffer+0/96] n_tty_flush_buffer+0x0/0x60
Sep 16 14:44:45 estel kernel: [n_tty_chars_in_buffer+0/128] n_tty_chars_in_buffer+0x0/0x80
Sep 16 14:44:45 estel kernel: [read_chan+0/2000] read_chan+0x0/0x7d0
Sep 16 14:44:45 estel kernel: [write_chan+0/560] write_chan+0x0/0x230
Sep 16 14:44:45 estel kernel: [n_tty_ioctl+0/896] n_tty_ioctl+0x0/0x380
Sep 16 14:44:45 estel kernel: [n_tty_set_termios+0/480] n_tty_set_termios+0x0/0x1e0
Sep 16 14:44:45 estel kernel: [normal_poll+0/331] normal_poll+0x0/0x14b
Sep 16 14:44:45 estel kernel: [n_tty_receive_buf+0/3728] n_tty_receive_buf+0x0/0xe90
Sep 16 14:44:45 estel kernel: [n_tty_receive_room+0/64] n_tty_receive_room+0x0/0x40
Sep 16 14:44:45 estel kernel: [n_tty_write_wakeup+0/48] n_tty_write_wakeup+0x0/0x30
Sep 16 14:44:45 estel kernel: [sys_ioctl+255/624] sys_ioctl+0xff/0x270
Sep 16 14:44:45 estel kernel: [sysenter_past_esp+82/113] sysenter_past_esp+0x52/0x71
Alan Cox
2004-09-16 12:27:38 UTC
Yeah this is the bug Paul nailed down, new patch later today that should
fix the ldisc change hang. Thanks for the report.
Ryan Cumming
2004-09-16 13:34:47 UTC
+sched-fix-scheduling-latencies-for-preempt-kernels.patch
This patch got a rather unfortunate name, as it actually only fixes latencies
for non-preempt kernels.

-Ryan
William Lee Irwin III
2004-09-16 14:20:18 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
- Added lots of Ingo's low-latency patches
- Lockmeter doesn't compile. Don't enable CONFIG_LOCKMETER.
- Several architecture updates
Please remove include/asm-sh64/smp_lock.h; they missed the smp_lock.h
consolidation while sitting out-of-tree and/or in the process of
forward porting to 2.6


-- wli
Norberto Bensa
2004-09-16 16:45:55 UTC
+tune-vmalloc-size.patch
This one of course breaks nvidia's binary driver; so nvidia users should do a
"patch -Rp1" to revert it.

Regards,
Norberto
Arjan van de Ven
2004-09-16 17:16:02 UTC
Post by Norberto Bensa
+tune-vmalloc-size.patch
This one of course breaks nvidia's binary driver; so nvidia users should do a
"patch -Rp1" to revert it.
eh why how ?? what evil stuff is nvidia doing this time ?
Norberto Bensa
2004-09-16 17:28:10 UTC
Post by Arjan van de Ven
Post by Norberto Bensa
+tune-vmalloc-size.patch
This one of course breaks nvidia's binary driver; so nvidia users should
do a "patch -Rp1" to revert it.
eh why how ?? what evil stuff is nvidia doing this time ?
On modprobe it says: "__VMALLOC_RESERVE undefined symbol". I'm almost sure it is
an #include thing, but since I know next to nothing about the kernel
internals, I prefer to revert the patch.

Now I got this problem with vmware but I need a reboot to tell you the exact
message; if you are interested, I'll report later.

Best regards,
Norberto
Arjan van de Ven
2004-09-16 17:30:22 UTC
Post by Norberto Bensa
Post by Arjan van de Ven
Post by Norberto Bensa
+tune-vmalloc-size.patch
This one of course breaks nvidia's binary driver; so nvidia users should
do a "patch -Rp1" to revert it.
eh why how ?? what evil stuff is nvidia doing this time ?
On modprobe it says: "__VMALLOC_RESERVE undefined symbol". I'm almost sure it is
an #include thing, but since I know next to nothing about the kernel
internals, I prefer to revert the patch.
I would consider it REALLY weird for a module to use detailed vmalloc
knowledge like this. Does anyone know what they are doing?????
V***@vt.edu
2004-09-17 19:29:51 UTC
Post by Arjan van de Ven
I would consider it REALLY weird for a module to use detailed vmalloc
knowledge like this. Does anyone know what they are doing?????
Here's the problematic code:

// start off by tracking down which page within this allocation
// we're looking at. do this by searching for the physical address
// in our page table.
for (i = 0; i < at->num_pages; i++)
{
    if ((address == (unsigned long) at->page_table[i].phys_addr))
    {
        unsigned long retaddr = (unsigned long) at->page_table[i].phys_addr;

        // if we've allocated via vmalloc on a highmem system, the
        // physical address may not be accessible via PAGE_OFFSET,
        // that's ok, we have a simple linear pointer already.
        if (at->flags & NV_ALLOC_TYPE_VMALLOC)
        {
            return (void *)((unsigned char *) at->key_mapping + (i << PAGE_SHIFT) + offset);
        }

        if (retaddr <= MAXMEM)
        {
            return __va((retaddr + offset));
        }

        // ?? this may be a contiguous allocation, fall through
        // to below? or should I just check at->flag here?
    }
}


It gets hung up because MAXMEM is:

#define MAXMEM (-__PAGE_OFFSET-__VMALLOC_RESERVE)

and in arch/i386/mm/init.c we have:

unsigned int __VMALLOC_RESERVE = 128 << 20;

but alas without an EXPORT_SYMBOL() so it's not visible to modules. The old
definition was:

#define __VMALLOC_RESERVE (128 << 20)

Change was introduced with the 'tune-vmalloc-size' patch in -rc2-mm1 that added
the boot-time 'vmalloc=' parameter.
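
To make the arithmetic concrete, here is a small user-space sketch (not kernel code) of what MAXMEM evaluates to. It assumes the usual i386 3G/1G split, i.e. __PAGE_OFFSET = 0xC0000000, and 32-bit unsigned arithmetic; the names SKETCH_PAGE_OFFSET and maxmem_for() are made up for illustration.

```c
#include <stdint.h>

/* Assumed value: the common i386 3G/1G split. Other configs differ. */
#define SKETCH_PAGE_OFFSET 0xC0000000u

/*
 * User-space model of
 *   #define MAXMEM (-__PAGE_OFFSET-__VMALLOC_RESERVE)
 * evaluated in 32-bit unsigned arithmetic, as on i386.
 * maxmem_for() is a hypothetical helper, not a kernel function.
 */
static uint32_t maxmem_for(uint32_t vmalloc_reserve)
{
    return (uint32_t)(0u - SKETCH_PAGE_OFFSET - vmalloc_reserve);
}
```

With the old fixed reserve of 128 << 20 this gives 0x38000000 (896 MB). Once the reserve is a variable, MAXMEM is no longer a compile-time constant, so any module that uses it needs __VMALLOC_RESERVE resolved at load time.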

I admit not knowing the memory manager or the NVidia well enough to know what
they *should* be doing instead.....
Terence Ripperda
2004-09-20 17:41:30 UTC
Post by V***@vt.edu
I admit not knowing the memory manager or the NVidia well enough to know what
they *should* be doing instead.....
this is some ugly code. we're doing a lookup on a physical address to
see if this is memory we previously allocated and returning a kernel
pointer to the page.

the particular snippet in question (that uses MAXMEM) is an ugly attempt
to verify the address is a real physical address, before using __va()
on something like an i/o region. A better approach than comparing
MAXMEM would probably be to convert the address to a mapnr and compare
to max_mapnr.
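
A rough user-space sketch of that kind of check, assuming 4 KB pages (a page shift of 12). phys_addr_is_ram() and its max_mapnr parameter are stand-ins for illustration only; the real driver fix may look quite different.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumed: 4 KB pages, as on i386 */

/*
 * Hypothetical helper: convert a physical address to a page frame
 * number ("mapnr") and check it against the number of page frames
 * the kernel tracks, instead of comparing the address to MAXMEM.
 */
static bool phys_addr_is_ram(uint64_t phys_addr, uint64_t max_mapnr)
{
    uint64_t mapnr = phys_addr >> SKETCH_PAGE_SHIFT;

    return mapnr < max_mapnr;
}
```

Note that this only says the address is backed by a page frame; whether __va() is usable for it on a highmem system is a separate question.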

I'll clean up this code and post a patch in the next couple of days.
in the meantime, the current patches out there should be good.

Thanks,
Terence
Norberto Bensa
2004-09-16 17:57:59 UTC
Post by Norberto Bensa
+tune-vmalloc-size.patch
This one of course breaks nvidia's binary driver; so nvidia users should
do a "patch -Rp1" to revert it.
http://00f.net/blogs/index.php/2004/09/16/nvidia_kernel_module_and_linux_2_6_9_rc2
Thanks. Actually, that was going to be my next fix, but at the moment I'm trying
to get some work done for my CS class.

Best regards,
Norberto
Jedi/Sector One
2004-09-16 17:37:10 UTC
Post by Norberto Bensa
+tune-vmalloc-size.patch
This one of course breaks nvidia's binary driver; so nvidia users should do a
"patch -Rp1" to revert it.
http://00f.net/blogs/index.php/2004/09/16/nvidia_kernel_module_and_linux_2_6_9_rc2
Jesse Barnes
2004-09-16 17:14:59 UTC
bk-acpi.patch
Looks like some changes in this patch break sn2. In particular, this hunk in
acpi_pci_irq_enable():

-       if (dev->irq && (dev->irq <= 0xF)) {
+       if (dev->irq >= 0 && (dev->irq <= 0xF)) {
                printk(" - using IRQ %d\n", dev->irq);
                return_VALUE(dev->irq);
        }
        else {
                printk("\n");
-               return_VALUE(0);
+               return_VALUE(-EINVAL);
        }

Now instead of returning 0, we'll get -EINVAL when a driver calls
pci_enable_device. This is arguably correct since there's no _PRT entry (and
in fact no ACPI namespace on sn2), but shouldn't the code above be looking at
the 'pin' value instead of dev->irq? The sn2 specific PCI code sets up each
dev->irq long before this with the correct values...
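
The effect of the hunk can be modelled in user space as below. This is only a sketch of the quoted fallback branch, not the real function: the names fallback_before() and fallback_after() are invented, irq is modelled as a plain int (an assumption), and the printk calls are left out.

```c
#include <errno.h>

/*
 * User-space model of the fallback branch in the quoted hunk only.
 * The real function does much more before reaching this point.
 */
static int fallback_before(int irq)
{
    if (irq && (irq <= 0xF))
        return irq;      /* "using IRQ %d" */
    return 0;            /* old behaviour: report 0 */
}

static int fallback_after(int irq)
{
    if (irq >= 0 && (irq <= 0xF))
        return irq;      /* IRQ 0 is now accepted too */
    return -EINVAL;      /* new behaviour: report an error */
}
```

For an IRQ above 0xF that platform code assigned earlier (say 55, picked purely as an example), the old branch returned 0 while the new one returns -EINVAL, which is what a driver then sees from pci_enable_device.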

Thanks,
Jesse
Bjorn Helgaas
2004-09-16 17:38:25 UTC
Post by Jesse Barnes
bk-acpi.patch
Looks like some changes in this patch break sn2. In particular, this hunk in
acpi_pci_irq_enable():
- if (dev->irq && (dev->irq <= 0xF)) {
+ if (dev->irq >= 0 && (dev->irq <= 0xF)) {
printk(" - using IRQ %d\n", dev->irq);
return_VALUE(dev->irq);
}
else {
printk("\n");
- return_VALUE(0);
+ return_VALUE(-EINVAL);
}
Now instead of returning 0, we'll get -EINVAL when a driver calls
pci_enable_device. This is arguably correct since there's no _PRT entry (and
in fact no ACPI namespace on sn2), but shouldn't the code above be looking at
the 'pin' value instead of dev->irq? The sn2 specific PCI code sets up each
dev->irq long before this with the correct values...
I think the change above is actually from
incorrect-pci-interrupt-assignment-on-es7000-for-pin-zero.patch

of which I am officially ignorant :-)
Norberto Bensa
2004-09-17 03:00:07 UTC
Hello list,
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
I mentioned before that this kernel was giving me problems with nvidia (fixed
thanks to "jedi/sector one") and vmware. Now I'm at home so I booted
2.6.9-rc2-mm1 and this is what vmware is saying:

"Could not mmap 139264 bytes of memory from file offset 0 at (nil):
Operation not permitted. Failed to allocate shared memory."

Does anyone know what's going on here?

Many thanks in advance,
Norberto
Protasevich, Natalie
2004-09-17 05:18:59 UTC
Post by Bjorn Helgaas
bk-acpi.patch
Looks like some changes in this patch break sn2. In particular, this hunk in
acpi_pci_irq_enable():
- if (dev->irq && (dev->irq <= 0xF)) {
+ if (dev->irq >= 0 && (dev->irq <= 0xF)) {
printk(" - using IRQ %d\n", dev->irq);
return_VALUE(dev->irq);
}
else {
printk("\n");
- return_VALUE(0);
+ return_VALUE(-EINVAL);
}
Now instead of returning 0, we'll get -EINVAL when a driver calls
pci_enable_device. This is arguably correct since there's no _PRT
entry (and in fact no ACPI namespace on sn2), but shouldn't the code
above be looking at the 'pin' value instead of dev->irq? The sn2
specific PCI code sets up each
dev->irq long before this with the correct values...
I think the change above is actually from
incorrect-pci-interrupt-assignment-on-es7000-for-pin-zero.patch
of which I am officially ignorant :-)
I realize now that this is a very involved piece of code, and a lot was
built around the assumption that IRQ0 is a timer interrupt (pin 0 is for
PCI on ES7000), and the assumption that everyone honors this assumption :)
However, it seems wrong that we are not able to read literally what ACPI
says, such as irq 0 for INTA. Maybe it would be better if the code
above returned an error code rather than an irq, which is returned in dev
anyway. There should be some creative way to resolve this issue... I think
the idea in the comment above by Jesse Barnes has good potential.

Thanks,
--Natalie
Len Brown
2004-09-17 06:50:11 UTC
Post by Protasevich, Natalie
Post by Bjorn Helgaas
bk-acpi.patch
Looks like some changes in this patch break sn2. In particular, this hunk in
acpi_pci_irq_enable():
- if (dev->irq && (dev->irq <= 0xF)) {
+ if (dev->irq >= 0 && (dev->irq <= 0xF)) {
printk(" - using IRQ %d\n", dev->irq);
return_VALUE(dev->irq);
}
else {
printk("\n");
- return_VALUE(0);
+ return_VALUE(-EINVAL);
}
Now instead of returning 0, we'll get -EINVAL when a driver calls
pci_enable_device. This is arguably correct since there's no _PRT
entry (and in fact no ACPI namespace on sn2), but shouldn't the code
above be looking at the 'pin' value instead of dev->irq? The sn2
specific PCI code sets up each
dev->irq long before this with the correct values...
No, in this context, the variable "pin" selects PCI INTA/B/C/D, not
an interrupt controller pin.

If SN2 is using its pre-determined interrupt configuration, then it is
probably a bug that it calls down into this code at all, since SN2 wants
this code to be a NOP, yes?
Post by Protasevich, Natalie
Post by Bjorn Helgaas
I think the change above is actually from
incorrect-pci-interrupt-assignment-on-es7000-for-pin-zero.patch
of which I am officially ignorant :-)
yeah, probably these IRQ patches should come through me, but haven't b/c
my patch throughput has been very low the last couple of weeks. At
least if I ship no new patches I don't get blamed for breaking stuff;-)
Post by Protasevich, Natalie
I realize now that this is a very involved piece of code, and a lot was
built around the assumption that IRQ0 is a timer interrupt (pin 0 is for
PCI on ES7000), and the assumption that everyone honors this assumption :)
However, it seems wrong that we are not able to read literally what ACPI
says, such as irq 0 for INTA. Maybe it would be better if the code
above returned an error code rather than an irq, which is returned in dev
anyway. There should be some creative way to resolve this issue... I think
the idea in the comment above by Jesse Barnes has good potential.
Yes, there are lots of places where IRQ0 is an error condition. I
expect that this was laziness based on the fact that the timer code is
hard-wired to use IRQ0, so it is always taken on IA32. My suggestion
to un-hard-wire the timer code was not greeted with enthusiasm, so this
is how things sit.

But IRQ0 should not be a problem on the ES7000, as there is an override
to supply IRQ0 from pin 0:20. pin 0:0, on the other hand is a PCI
interrupt on the ES7000, completely valid to be assigned from the _PRT
and to be assigned by the es7000-specific code to any arbitrary IRQ#.

I'm not sure exactly what the patch above was trying to fix. Looks like
it is time to examine the latest ES7000 changes in detail. But I think
the patch has pointed out that this routine really should be returning 0
for success and non-zero for failure; and returning dev->irq was
probably a latent bug all along.

-Len
Jesse Barnes
2004-09-17 15:57:13 UTC
Post by Len Brown
I'm not sure exactly what the patch above was trying to fix. Looks like
it is time to examine the latest ES7000 changes in detail. But I think
the patch has pointed out that this routine really should be returning 0
for success and non-zero for failure; and returning dev->irq was
probably a latent bug all along.
Right, and I'd like success to be defined a little more broadly. If dev->irq
is already valid, we should just return 0 right away. That would take care
of platforms that already have the dev in question set up properly.
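
A minimal user-space sketch of that convention, under stated assumptions: struct sketch_dev, its irq_preset flag, and irq_enable_sketch() are all hypothetical, and how the kernel would actually decide that dev->irq is "already valid" is exactly the open question in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few fields this sketch needs. */
struct sketch_dev {
    int irq;
    bool irq_preset;  /* assumed flag: platform code already set irq */
};

/*
 * Return 0 on success and a negative errno on failure, leaving the
 * IRQ number in dev->irq. prt_irq < 0 models "no _PRT entry found".
 */
static int irq_enable_sketch(struct sketch_dev *dev, int prt_irq)
{
    if (dev->irq_preset)
        return 0;            /* nothing to do, already set up */
    if (prt_irq < 0)
        return -EINVAL;      /* no routing information at all */
    dev->irq = prt_irq;      /* IRQ 0 is a legal value here */
    return 0;
}
```

Because the return value no longer carries the IRQ number, IRQ 0 stops being ambiguous with "failure", and a platform that set dev->irq earlier is left alone.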

Thanks,
Jesse
Len Brown
2004-11-09 08:47:26 UTC
bk-acpi.patch
Looks like some changes in this patch break sn2. In particular,
this
- return_VALUE(0);
+ return_VALUE(-EINVAL);
I think the patch has pointed out that this routine really should be returning 0
for success and non zero for failure; and returning dev->irq was
probably a latent bug all along.
Jesse,
I think this one is correct, please let me know if you have any trouble
with it.

thanks,
-Len

Dominik Karall
2004-09-17 18:09:44 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
I get a lot of USB error messages on boot; here is the relevant dmesg output.
I have also attached lsusb -v output, which may be useful. If you need more
information, let me know!

mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
input: PC Speaker
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
IPv4 over IPv4 tunneling driver
GRE over IPv4 tunneling driver
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S0 S3 S4 S5)
ACPI wakeup devices:
FUTS PCI0 USB0 USB1 USB2 USB3 MAC0 AMR0 UAR1 PS2M PS2K
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 136k freed
Adding 514040k swap on /dev/hda9. Priority:-1 extents:1
EXT3 FS on hda10, internal journal
warning: process `update' used the obsolete bdflush system call
Fix your initscripts?
ohci1394: $Rev: 1226 $ Ben Collins <***@debian.org>
ACPI: PCI interrupt 0000:00:02.3[B] -> GSI 17 (level, low) -> IRQ 17
ohci1394: fw-host0: Unexpected PCI resource length of 1000!
ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[17] MMIO=[e2427000-e24277ff] Max Packet=[2048]
snd: Unknown parameter `device_gid'
ACPI: PCI interrupt 0000:00:02.7[C] -> GSI 18 (level, low) -> IRQ 18
intel8x0_measure_ac97_clock: measured 49665 usecs
intel8x0: clocking to 48000
usbcore: registered new driver usbfs
usbcore: registered new driver hub
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI interrupt 0000:00:03.0[A] -> GSI 20 (level, low) -> IRQ 20
ohci_hcd 0000:00:03.0: Silicon Integrated Systems [SiS] USB 1.0 Controller
ohci_hcd 0000:00:03.0: irq 20, pci mem 0xe2420000
ohci_hcd 0000:00:03.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:03.0: SiS init quirk
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:03.1[B] -> GSI 21 (level, low) -> IRQ 21
ohci_hcd 0000:00:03.1: Silicon Integrated Systems [SiS] USB 1.0 Controller (#2)
ohci_hcd 0000:00:03.1: irq 21, pci mem 0xe2421000
ohci_hcd 0000:00:03.1: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:03.1: SiS init quirk
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:03.2[C] -> GSI 22 (level, low) -> IRQ 22
ohci_hcd 0000:00:03.2: Silicon Integrated Systems [SiS] USB 1.0 Controller (#3)
ieee1394: Host added: ID:BUS[0-00:1023] GUID[000010dc001ee09a]
ohci_hcd 0000:00:03.2: irq 22, pci mem 0xe2422000
ohci_hcd 0000:00:03.2: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:03.2: SiS init quirk
usb 2-1: new full speed USB device using address 2
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:03.3[D] -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:03.3: Silicon Integrated Systems [SiS] USB 2.0 Controller
ehci_hcd 0000:00:03.3: irq 23, pci mem 0xe2423000
ehci_hcd 0000:00:03.3: new USB bus registered, assigned bus number 4
PCI: cache line size of 128 is not supported by device 0000:00:03.3
ehci_hcd 0000:00:03.3: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
usb 2-1: USB disconnect, address 2
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 6 ports detected
ohci_hcd 0000:00:03.1: wakeup
sis900.c: v1.08.07 11/02/2003
ACPI: PCI interrupt 0000:00:04.0[A] -> GSI 19 (level, low) -> IRQ 19
eth0: Unknown PHY transceiver found at address 0.
eth0: Realtek RTL8201 PHY transceiver found at address 1.
eth0: Unknown PHY transceiver found at address 2.
eth0: Unknown PHY transceiver found at address 3.
eth0: Unknown PHY transceiver found at address 4.
eth0: Unknown PHY transceiver found at address 5.
eth0: Unknown PHY transceiver found at address 6.
eth0: Unknown PHY transceiver found at address 7.
eth0: Unknown PHY transceiver found at address 8.
eth0: Unknown PHY transceiver found at address 9.
eth0: Unknown PHY transceiver found at address 10.
eth0: Unknown PHY transceiver found at address 11.
eth0: Unknown PHY transceiver found at address 12.
eth0: Unknown PHY transceiver found at address 13.
eth0: Unknown PHY transceiver found at address 14.
eth0: Unknown PHY transceiver found at address 15.
eth0: Unknown PHY transceiver found at address 16.
eth0: Unknown PHY transceiver found at address 17.
eth0: Unknown PHY transceiver found at address 18.
eth0: Unknown PHY transceiver found at address 19.
eth0: Unknown PHY transceiver found at address 20.
eth0: Unknown PHY transceiver found at address 21.
eth0: Unknown PHY transceiver found at address 22.
eth0: Unknown PHY transceiver found at address 23.
eth0: Unknown PHY transceiver found at address 24.
eth0: Unknown PHY transceiver found at address 25.
usb 2-1: new full speed USB device using address 3
eth0: Unknown PHY transceiver found at address 26.
eth0: Unknown PHY transceiver found at address 27.
eth0: Unknown PHY transceiver found at address 28.
eth0: Unknown PHY transceiver found at address 29.
eth0: Unknown PHY transceiver found at address 30.
eth0: Unknown PHY transceiver found at address 31.
eth0: Using transceiver found at address 1 as default
eth0: SiS 900 PCI Fast Ethernet at 0xdc00, IRQ 19, 00:10:fa:3f:29:16.
8139too Fast Ethernet driver 0.9.27
ACPI: PCI interrupt 0000:00:07.0[A] -> GSI 18 (level, low) -> IRQ 18
eth1: RealTek RTL8139 at 0xd134a000, 00:50:fc:2d:9a:5c, IRQ 18
eth1: Identified 8139 chip type 'RTL-8139C'
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb 2-1: control timeout on ep0in
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
eth0: Media Link On 100mbps full-duplex
eth0: Media Link On 100mbps full-duplex
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb usb3: string descriptor 0 read error: -113
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb 2-1: string descriptor 0 read error: -110
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113


lsusb -v:

Bus 004 Device 001: ID 0000:0000
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 9 Hub
bDeviceSubClass 0 Unused
bDeviceProtocol 1 Single TT
bMaxPacketSize0 8
idVendor 0x0000
idProduct 0x0000
bcdDevice 2.06
iManufacturer 3 Linux 2.6.8.1-ck8 ehci_hcd
iProduct 2 Silicon Integrated Systems [SiS] USB 2.0 Controller
iSerial 1 0000:00:03.3
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 25
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xe0
Self Powered
Remote Wakeup
MaxPower 0mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 9 Hub
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0002 bytes 2 twice
bInterval 12

Bus 003 Device 001: ID 0000:0000
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 9 Hub
bDeviceSubClass 0 Unused
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x0000
idProduct 0x0000
bcdDevice 2.06
iManufacturer 3 Linux 2.6.8.1-ck8 ohci_hcd
iProduct 2 Silicon Integrated Systems [SiS] USB 1.0 Controller(#3)
iSerial 1 0000:00:03.2
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 25
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xc0
Self Powered
MaxPower 0mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 9 Hub
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0002 bytes 2 twice
bInterval 255

Bus 002 Device 003: ID 0db0:6982 Micro Star International Medion Flash XL V2.7A Card Reader
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 0 (Defined at Interface level)
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x0db0 Micro Star International
idProduct 0x6982 Medion Flash XL V2.7A Card Reader
bcdDevice 2.6d
iManufacturer 1
iProduct 2
iSerial 3
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 39
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 4
bmAttributes 0x80
MaxPower 100mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 3
bInterfaceClass 8 Mass Storage
bInterfaceSubClass 6 SCSI
bInterfaceProtocol 80 Bulk (Zip)
iInterface 5
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 bytes 64 once
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x02 EP 2 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 bytes 64 once
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x83 EP 3 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0008 bytes 8 once
bInterval 10

Bus 002 Device 001: ID 0000:0000
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 9 Hub
bDeviceSubClass 0 Unused
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x0000
idProduct 0x0000
bcdDevice 2.06
iManufacturer 3 Linux 2.6.8.1-ck8 ohci_hcd
iProduct 2 Silicon Integrated Systems [SiS] USB 1.0 Controller(#2)
iSerial 1 0000:00:03.1
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 25
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xc0
Self Powered
MaxPower 0mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 9 Hub
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0002 bytes 2 twice
bInterval 255

Bus 001 Device 001: ID 0000:0000
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 9 Hub
bDeviceSubClass 0 Unused
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x0000
idProduct 0x0000
bcdDevice 2.06
iManufacturer 3 Linux 2.6.8.1-ck8 ohci_hcd
iProduct 2 Silicon Integrated Systems [SiS] USB 1.0 Controller
iSerial 1 0000:00:03.0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 25
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xc0
Self Powered
MaxPower 0mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 9 Hub
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0002 bytes 2 twice
bInterval 255
Sean Neakums
2004-09-18 01:10:28 UTC
Noticed this in my dmesg after booting 2.6.9-rc2-mm1:

enable_irq(17) unbalanced from c02a1ce5

$ grep ^c02a1c /boot/System.map-`uname -r`
c02a1c10 t e100_up
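
The lookup done by hand above can be sketched in user-space C as follows. It assumes a symbol table sorted by address, as System.map is; struct sym and resolve() are illustrative names, not an existing API.

```c
#include <stddef.h>
#include <stdint.h>

struct sym {
    uint32_t addr;
    const char *name;
};

/*
 * Find the last symbol whose address is <= addr in a table sorted by
 * address and report the offset into it. Returns NULL if addr lies
 * before the first symbol; *offset is only written on success.
 */
static const struct sym *resolve(const struct sym *tab, size_t n,
                                 uint32_t addr, uint32_t *offset)
{
    const struct sym *best = NULL;

    for (size_t i = 0; i < n && tab[i].addr <= addr; i++)
        best = &tab[i];
    if (best)
        *offset = addr - best->addr;
    return best;
}
```

For the address in the message, 0xc02a1ce5 - 0xc02a1c10 = 0xd5, so the caller would be e100_up+0xd5, provided no other symbol starts between those two addresses (the grep output suggests none does).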
William Lee Irwin III
2004-09-18 06:01:34 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
- Added lots of Ingo's low-latency patches
- Lockmeter doesn't compile. Don't enable CONFIG_LOCKMETER.
- Several architecture updates
Tested this on my laptop, which is a shoddy testing environment because
it lacks serial devices... no, not all of my boxen are UltraEnterprise,
AlphaServer, and Altix systems (the Altix isn't even mine, it's werk's).
But anyway, I got some kind of backtrace in yenta_interrupt, that said
"stack pointer is garbage, not dumping" or some such, followed by an
interrupts-off deadlock later in some unclear location (looks like ICH
scanning or some such; while legible, I couldn't make heads or tails of
it). ISTR PCMCIA IRQ/etc. stack consumption issues; this may be related.

Russell, I didn't know whom to cc:; if you could redirect this in the
proper direction (e.g. PCMCIA maintainer) I'd be much obliged.


-- wli
Russell King
2004-09-18 08:18:07 UTC
Post by William Lee Irwin III
Tested this on my laptop, which is a shoddy testing environment because
it lacks serial devices... no, not all of my boxen are UltraEnterprise,
AlphaServer, and Altix systems (the Altix isn't even mine, it's werk's).
But anyway, I got some kind of backtrace in yenta_interrupt, that said
"stack pointer is garbage, not dumping" or some such, followed by an
As far as "stack pointer is garbage" goes, ISTR that I've seen that in some
bug reports, and it looked like the test for that was wrong - ESP
seemed to be within 4K or 8K (I don't remember which) of the reported
stack base.

I think the first step is for someone to check whether the stack
pointer validation code is correct or not.
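
One way such a test could go wrong, sketched under the assumption that it is a simple range check against a fixed stack size (the real kernel test may differ, and sp_in_stack() is a made-up name):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bounds check: a stack pointer is plausible if it lies
 * inside [base, base + size).
 */
static bool sp_in_stack(uintptr_t sp, uintptr_t base, uintptr_t size)
{
    return sp >= base && sp - base < size;
}
```

If the size used by the check does not match the configured stack size, for example 4096 where the stack is really 8192 bytes, a pointer 0x1e9c bytes above the base is rejected even though it is perfectly valid.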
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core
William Lee Irwin III
2004-09-20 01:12:31 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
- Added lots of Ingo's low-latency patches
- Lockmeter doesn't compile. Don't enable CONFIG_LOCKMETER.
- Several architecture updates
Fails to boot on my Altix. I'm guessing this is some known issue with
some patch, so hopefully one member of the cc: list has heard of it as
I'm a bit late to the game with this release and ia64. Relevant bootlog
diff between 2.6.9-rc1-mm4 and 2.6.9-rc2-mm1 indicates some kind of PCI,
qla1280, ia64/Altix IO support, or ACPI IRQ bogon.

Full diff between 2.6.9-rc2-mm1 and 2.6.9-rc1-mm4 bootlogs and complete
bootlog with 2.6.9-rc2-mm1 included as MIME attachments.


-- wli


--- altix.log.54 2004-09-19 17:39:12.000000000 -0700
+++ altix.log.53 2004-09-19 17:33:25.000000000 -0700
@@ -440,111 +524,21 @@
Loading kernel/drivers/scsi/sd_mod.ko
Loading kernel/drivers/ide/pci/sgiioc4.ko
ACPI: PCI interrupt 0000:01:01.0[A]: no GSI
-SGIIOC4: IDE controller at PCI slot 0000:01:01.0, revision 79
-ide0: BM-DMA at 0xc00000080c200140-0xc00000080c200163
-Probing IDE interface ide0...
-hda: MATSHITADVD-ROM SR-8588, ATAPI CD/DVD-ROM drive
-Using deadline io scheduler
-ide0 at 0xc00000080c200100-0xc00000080c200107,0xc00000080c200120 on irq 55
+Failed to enable device SGIIOC4 at slot 0000:01:01.0
Loading kernel/drivers/scsi/qla1280.ko
qla1280: QLA12160 found on PCI bus 1, dev 3
ACPI: PCI interrupt 0000:01:03.0[A]: no GSI
-scsi(0): Enabling SN2 PCI DMA dual channel lockup workaround
-scsi(0): Enabling SN2 PCI DMA workaround
-scsi(0:0): Resetting SCSI BUS
-scsi(0:1): Resetting SCSI BUS
-scsi0 : QLogic QLA12160 PCI to SCSI Host Adapter
- Firmware version: 10.04.32, Driver version 3.24.4
- Vendor: SGI Model: ST336753LC Rev: 2741
- Type: Direct-Access ANSI SCSI revision: 03
-scsi(0:0:1:0): Sync: period 9, offset 14, Wide, DT, Tagged queuing: depth 255
-SCSI device sda: 71687372 512-byte hdwr sectors (36704 MB)
-SCSI device sda: drive cache: write through
- sda: sda1 sda2 sda3
-Attached scsi disk sda at scsi0, channel 0, id 1, lun 0
- Vendor: SGI Model: ST336753LC Rev: 2741
- Type: Direct-Access ANSI SCSI revision: 03
-scsi(0:0:2:0): Sync: period 9, offset 14, Wide, DT, Tagged queuing: depth 255
-SCSI device sdb: 71687372 512-byte hdwr sectors (36704 MB)
-SCSI device sdb: drive cache: write through
- sdb: sdb1 sdb2 sdb3 sdb4 sdb5
-Attached scsi disk sdb at scsi0, channel 0, id 2, lun 0
+qla1280: Failed to enabled pci device, aborting.
Loading kernel/fs/jbd/jbd.ko
Loading kernel/fs/ext3/ext3.ko
-Waiting for device /dev/sda3 to appear: ok
-rootfs: major=8 minor=3 devn=2051
-warning: can't open /etc/mtab: No such file or directory
-EXT2-fs warning (device sda3): ext2_fill_super: mounting ext3 filesystem as ext2
-
-VFS: Mounted root (ext2 filesystem) readonly.
-Trying to move old root to /initrd ... failed
-Unmounting old root
-Trying to free ramdisk memory ... okay
-Freeing unused kernel memory: 368kB freed
-INIT: version 2.85 booting
-System Boot Control: Running /etc/init.d/boot
-Mounting /proc filesystemdone
-Mounting sysfs on /sysdone
-Mounting /dev/ptsdone
-Configuring serial ports...
-/dev/ttySG0 at 0x0000 (irq = 233) is a 16550A
-Configured serial ports
-doneMounting shared memory FS on /dev/shmdone
-Activating swap-devices in /etc/fstab...
-Adding 1052144k swap on /dev/sda2. Priority:42 extents:1
-Adding 1052160k swap on /dev/sdb2. Priority:42 extents:1
-doneshowconsole: Warning: the ioctl TIOCGDEV is not known by the kernel
-Checking root file system...
-fsck 1.34 (25-Jul-2003)
-[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda3
-/dev/sda3: clean, 473518/4325376 files, 6590955/8644436 blocks
-donemd: Autodetecting RAID arrays.
-md: autorun ...
-md: ... autorun DONE.
-Activating device mapper...
-device-mapper: 4.1.0-ioctl (2003-12-10) initialised: ***@uk.sistina.com
-Creating /dev/mapper/control character device with major:10 minor:63.
-done
-Scanning for LVM volume groups...
- Reading all physical volumes. This may take a while...
- No volume groups found
-Activating LVM volume groups...
- No volume groups found
-skipped
-showconsole: Warning: the ioctl TIOCGDEV is not known by the kernel
-Checking file systems...
-fsck 1.34 (25-Jul-2003)
-Checking all file systems.
-[/sbin/fsck.xfs (1) -- /oracle] fsck.xfs -a /dev/sdb3
-doneSetting updone
-Mounting local file systems...
-proc on /proc type proc (rw)
-tmpfs on /dev/shm type tmpfs (rw)
-devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
-/dev/sda1 on /boot/efi type vfat (rw)
-mount: fs type subfs not supported by kernel
-SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
-XFS mounting filesystem sdb3
-Ending clean XFS mount for filesystem: sdb3
-/dev/sdb3 on /oracle type xfs (rw)
-none on /tmp type tmpfs (rw)
-failedLoading required kernel modules
-doneActivating remaining swap-devices in /etc/fstab...
-doneRestore device permissionsdone
-Setting up the CMOS clockdone
-Setting up timezone datadone
-Setting scheduling timeslices unused
-Setting up hostname 'ca-test21'done
-Setting up loopback interface lo
- lo IP address: 127.0.0.1/8
-done
-Enabling syn flood protectiondone
-Disabling IP forwardingdone
-done
-Creating /var/log/boot.msg
-
+Waiting for device /dev/sda3 to appear: .....not found -- device nodes:
+console fb0 loop0 loop1 loop2 loop3 loop4 loop5 loop6 loop7 md0 null ram ram0 ram1 ram10 ram11 ram12 ram13 ram14 ram15 ram2 ram3 ram4 ram5 ram6 ram7 ram8 ram9 ramdisk tty1 tty2 zero warning: can't open /etc/mtab: No such file or directory
+VFS: Cannot open root device "sda3" or unknown-block(0,0)
+Please append a correct "root=" boot option
+Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
+
telnet> close
Connection closed.
$

-Script done on Sun Sep 19 17:40:18 2004
+Script done on Sun Sep 19 17:34:31 2004
Paul Jackson
2004-09-20 04:37:06 UTC
Post by William Lee Irwin III
Fails to boot on my Altix.
See a couple of patches on this linux-scsi thread, mostly between Jesse
Barnes and Andrew Vasquez:

SCSI QLA not working on latest *-mm SN2
http://marc.theaimsgroup.com/?l=linux-scsi&m=109537406715003&w=2

Or I got it working (I think - memory fuzzy now) without this patch, by
(1) disabling the CONFIG_SCSI_QLA2[123]?? options, and (2) applying the
following workaround patch:

--- 2.6.9-rc2-mm1.orig/arch/ia64/pci/pci.c 2004-09-16 07:45:58.000000000 -0700
+++ 2.6.9-rc2-mm1/arch/ia64/pci/pci.c 2004-09-16 12:02:34.000000000 -0700
@@ -445,7 +445,7 @@ pcibios_enable_device (struct pci_dev *d
if (ret < 0)
return ret;

- return acpi_pci_irq_enable(dev);
+ return ia64_platform_is("sn2") ? 0 : acpi_pci_irq_enable(dev);
}

void

===

Jesse - could you post on lkml the definitive guide to getting
2.6.9-rc2-mm1 working on Altix?
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <***@sgi.com> 1.650.933.1373
Jesse Barnes
2004-09-20 23:27:56 UTC
Post by Paul Jackson
Post by William Lee Irwin III
Fails to boot on my Altix.
See a couple of patches on this linux-scsi thread, mostly between Jesse
SCSI QLA not working on latest *-mm SN2
http://marc.theaimsgroup.com/?l=linux-scsi&m=109537406715003&w=2
Or I got it working (I think - memory fuzzy now) without this patch, by
(1) disabling the CONFIG_SCSI_QLA2[123]?? options, and (2) applying the
You'll need the patch you posted (or its equivalent, see my earlier reply to
Andrew's announcement) to boot. You'll need the qla patches I posted to use
qla2xxx adapters.

Jesse
William Lee Irwin III
2004-09-20 02:34:52 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
- Added lots of Ingo's low-latency patches
- Lockmeter doesn't compile. Don't enable CONFIG_LOCKMETER.
- Several architecture updates
top(1) shows no tasks on sparc64. Large negative inode numbers appear
to be showing up for /proc/stat and other /proc/ special files on
64-bit irrespective of endianness, and all processes appear to have the
same inode number once again irrespective of endianness. It's unclear
why top(1) enumerates tasks on x86-64 and does not do so on sparc64,
unless 2.6.9-rc2-mm1 shows some behavior procps-3.2.3 is sensitive to
that 3.2.1 is not, or some numbers are overflowing on 32-bit apps but
not 64-bit ones (top(1) is 64-bit on x86-64 but 32-bit on sparc64)
that userspace barfs on and not the kernel (no error returns from
syscalls are visible in strace). ls and cat appear to work where top(1)
does not.

acahalan cc:'d as he last touched fs/proc/.

$ stat /proc/stat
File: `/proc/stat'
Size: 0 Blocks: 0 IO Block: 1024 regular empty file
Device: 3h/3d Inode: -268435443 Links: 1
Access: (0444/-r--r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2004-09-19 18:46:38.246034917 -0700
Modify: 2004-09-19 18:46:38.246034917 -0700
Change: 2004-09-19 18:46:38.246034917 -0700
$ stat /proc/[0-9]*|grep Inode|sort -u -k 4,4
Device: 3h/3d Inode: 2 Links: 3

(the same on x86-64 and sparc64).
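
A note on the number itself: -268435443 is what 0xF000000D looks like when printed as a signed 32-bit value, so the low 32 bits of the /proc/stat inode number have their top bit set. The sketch below only does that conversion; the helper names are made up for illustration and say nothing about how stat(1) formats the field internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: low 32 bits of a value that was printed as signed. */
static uint32_t low32_of_signed(int64_t printed)
{
    /* Conversion to uint32_t is defined modulo 2^32 for any input. */
    return (uint32_t)printed;
}

/* The value reported for /proc/stat above corresponds to 0xF000000D. */
static void check_proc_stat_ino(void)
{
    assert(low32_of_signed(-268435443) == 0xF000000Du);
}
```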


-- wli
Albert Cahalan
2004-09-20 04:18:45 UTC
Post by William Lee Irwin III
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
- Added lots of Ingo's low-latency patches
- Lockmeter doesn't compile. Don't enable CONFIG_LOCKMETER.
- Several architecture updates
top(1) shows no tasks on sparc64.
It would be nice if I had such a box. I can't even
find a user account on one. I have 32-bit ppc, plus
non-root accounts on alpha, i386, and x86_64 boxes
with obsolete kernels.
Post by William Lee Irwin III
Large negative inode numbers appear
to be showing up for /proc/stat and other /proc/ special files on
64-bit irrespective of endianness, and all processes appear to have the
same inode number once again irrespective of endianness.
The inode numbering patch looks sane enough...
Post by William Lee Irwin III
It's unclear
why top(1) enumerates tasks on x86-64 and does not do so on sparc64,
unless 2.6.9-rc2-mm1 shows some behavior procps-3.2.3 is sensitive to
that 3.2.1 is not, or some numbers are overflowing on 32-bit apps but
not 64-bit ones (top(1) is 64-bit on x86-64 but 32-bit on sparc64)
In no place does procps itself care about ino_t.

Perhaps your 32-bit glibc chokes on 64-bit inode numbers.
If so, yuck. It's really sad that we have a zillion
versions of stat(), many with oversize dev_t, and still
we use 32-bit ino_t in many places.

Whether or not that's the problem...

1. install a 64-bit or bi-arch gcc
2. install a 64-bit libc
3. install a 64-bit ncurses
4. install a 64-bit procps

(suggestion: keep going until /bin is done)

That's pretty much it. The procps package goes to
great lengths to compile itself 64-bit, even passing
the -m64 option and installing to /lib64 as needed.
If you've broken this, you get to keep the pieces.

In other words: seriously unsupported

I see no reason why 32-bit SPARC users should have
to suffer the pain of running code bloated up to
handle 64-bit SPARC. The 32-bit SPARC hardware is
slow enough already. Just try to look a 32-bit SPARC
user in the eye and tell him "Your system should run
even slower now, so that my hot new hardware can keep
running old 32-bit executables meant for you"
William Lee Irwin III
2004-09-20 07:47:31 UTC
Post by Albert Cahalan
Post by William Lee Irwin III
top(1) shows no tasks on sparc64.
It would be nice if I had such a box. I can't even
find a user account on one. I have 32-bit ppc, plus
non-root accounts on alpha, i386, and x86_64 boxes
with obsolete kernels.
Post by William Lee Irwin III
Large negative inode numbers appear
to be showing up for /proc/stat and other /proc/ special files on
64-bit irrespective of endianness, and all processes appear to have the
same inode number once again irrespective of endianness.
The inode numbering patch looks sane enough...
It may be userspace. stat(1) on SuSE/x86-64 reports large negative
inode numbers and identical inode numbers for all processes similarly
to that of Debian, yet it is compiled as a 64-bit application. So even
with a 64-bit userspace something has gone wrong.
Post by Albert Cahalan
Post by William Lee Irwin III
It's unclear
why top(1) enumerates tasks on x86-64 and does not do so on sparc64,
unless 2.6.9-rc2-mm1 shows some behavior procps-3.2.3 is sensitive to
that 3.2.1 is not, or some numbers are overflowing on 32-bit apps but
not 64-bit ones (top(1) is 64-bit on x86-64 but 32-bit on sparc64)
In no place does procps itself care about ino_t.
Perhaps your 32-bit glibc chokes on 64-bit inode numbers.
If so, yuck. It's really sad that we have a zillion
versions of stat(), many with oversize dev_t, and still
we use 32-bit ino_t in many places.
Whether or not that's the problem...
1. install a 64-bit or bi-arch gcc
2. install a 64-bit libc
3. install a 64-bit ncurses
4. install a 64-bit procps
(suggestion: keep going until /bin is done)
That's pretty much it. The procps package goes to
great lengths to compile itself 64-bit, even passing
the -m64 option and installing to /lib64 as needed.
If you've broken this, you get to keep the pieces.
I didn't touch this. Also, procps FTBFS on sparc64; I see no evidence
of passing -m64 or whatever to either the compilation or linking phase
in virgin procps. It gets better, though. On SuSE's x86-64 userspace,
procps and stat(1) are compiled 64-bit yet the inode numbers overflow,
as all /proc/$PID entries have identical inode numbers as reported.

So you may be getting bitten by -EOVERFLOW from 32-bit emulated stat(2)
or otherwise glibc's struct stat sans O_LARGEFILE, even though I
couldn't see it with strace(1). Perhaps silent truncation is going on
for the case of inode numbers that would overflow 32-bit integers. IIRC
stat(2) will be replaced by stat64(2) if O_LARGEFILE is passed, which
should probably be done for this case. Having made this a requirement
for 32-bit apps on 64-bit systems may be considered undesirable.
Perhaps a method of generating inode numbers different from assigning
fixed numerical ranges to tasks of a given pid would be more
appropriate, particularly as you've observed that numbers so large in
absolute terms have undesirable side effects. It's likewise readily
observable that these inode number space reservations are ridiculous
for systems running well under 100 tasks.

x86-64 and ia64 are highly unusual in that 64-bit compilation is
useful for basic apps, and alpha an exception in that it has no 32-bit
ABI; for most 64-bit architectures making such large fractions of
userspace 64-bit applications is undesirable.

Furthermore, this bit of userspace policy isn't my decision and I don't
care to override it, particularly as it would break automated upgrades.
I just don't dork around much with userspace. Maintainers of the Debian
procps package and sparc64 Debian kernel package cc:'d here, as they do
care somewhat more and make more of the decisions. I'm not sure who the
SuSE counterparts are, but they surely have similar concerns as their
userspace is likewise affected.
Post by Albert Cahalan
In other words: seriously unsupported
I see no reason why 32-bit SPARC users should have
to suffer the pain of running code bloated up to
handle 64-bit SPARC. The 32-bit SPARC hardware is
slow enough already. Just try to look a 32-bit SPARC
user in the eye and tell him "Your system should run
even slower now, so that my hot new hardware can keep
running old 32-bit executables meant for you"
It's UltraSPARC. 32-bit userspace is used there. Recompiling top(1) as
a 64-bit app produces nonempty process lists.

And telling sparc32 users that has already been done for far more
severe slowdowns, in particular udiv emulation. Proper use of
O_LARGEFILE etc. is actually unlikely to hurt sparc32 in any
significant way, as it has a decent number of registers, unlike the
32-bit counterparts of some architectures for whose sake O_LARGEFILE is
omitted where considered feasible. On the contrary, compiling it 64-bit
would be a minor (though I can't imagine it being significant) slowdown
for users of UltraSPARC (the 64-bit cpus). O_LARGEFILE for 32-bit apps
is far more likely to hurt x86 biarch userspace than SPARC or other
standard 64-bit architectures' biarch userspace, though in that case it
would still be unusual, as its policy is generally 64-bit by default.


-- wli
Albert Cahalan
2004-09-20 15:00:47 UTC
Post by William Lee Irwin III
Post by Albert Cahalan
Post by William Lee Irwin III
top(1) shows no tasks on sparc64.
It would be nice if I had such a box. I can't even
find a user account on one. I have 32-bit ppc, plus
non-root accounts on alpha, i386, and x86_64 boxes
with obsolete kernels.
In no place does procps itself care about ino_t.
Perhaps your 32-bit glibc chokes on 64-bit inode numbers.
If so, yuck. It's really sad that we have a zillion
versions of stat(), many with oversize dev_t, and still
we use 32-bit ino_t in many places.
Whether or not that's the problem...
1. install a 64-bit or bi-arch gcc
2. install a 64-bit libc
3. install a 64-bit ncurses
4. install a 64-bit procps
(suggestion: keep going until /bin is done)
That's pretty much it. The procps package goes to
great lengths to compile itself 64-bit, even passing
the -m64 option and installing to /lib64 as needed.
If you've broken this, you get to keep the pieces.
I didn't touch this. Also, procps FTBFS on sparc64; I see no evidence
of passing -m64 or whatever to either the compilation or linking phase
in virgin procps.
Debian compiles sparc and sparc64 on the same box,
and fails to distinguish them. Therefore, Debian
disables -m64 during the build. The package itself
will disable -m64 if it is unable to successfully
link a dummy.c file with ncurses when using it.

If you compile from the *.tar.gz file and have a
working 64-bit ncurses, you should get a 64-bit
executable. If you don't, please tell me what I
need to do to make it work.
Post by William Lee Irwin III
It gets better, though. On SuSE's x86-64 userspace,
procps and stat(1) are compiled 64-bit yet the inode numbers overflow,
as all /proc/$PID entries have identical inode numbers as reported.
Hmmm... who cares about the inode numbers?

If the answer is "glibc readdir", then the numbers
should be taken from distinct number spaces for the
/proc, /proc/*/task, and /proc/*/fd directories.
Re-use within a directory may be the most trouble.

If it's not even that, then the fake_ino() macro
shouldn't pretend that the numbers matter. Simply
make it be a constant.
Post by William Lee Irwin III
x86-64 and ia64 are highly unusual in that 64-bit compilation is
useful for basic apps, and alpha an exception in that it has no 32-bit
ABI; for most 64-bit architectures making such large fractions of
userspace 64-bit applications is undesirable.
MIPS at least has N32, which is a distinct ABI for
the ILP32 model on 64-bit hardware. This offers the
best performance while not making 32-bit users suffer.
Post by William Lee Irwin III
Post by Albert Cahalan
In other words: seriously unsupported
I see no reason why 32-bit SPARC users should have
to suffer the pain of running code bloated up to
handle 64-bit SPARC. The 32-bit SPARC hardware is
slow enough already. Just try to look a 32-bit SPARC
user in the eye and tell him "Your system should run
even slower now, so that my hot new hardware can keep
running old 32-bit executables meant for you"
It's UltraSPARC. 32-bit userspace is used there. Recompiling top(1) as
a 64-bit app produces nonempty process lists.
And telling sparc32 users that has already been done for far more
severe slowdowns, in particular udiv emulation.
You mean it traps instead of directly calling
a routine in libgcc?

Admittedly on x86, profiling has shown that handling
64-bit values with a 32-bit ABI is truly horrible.
So both sparc and sparc64 are being hurt if 32-bit
sparc binaries have to support 64-bit kernels.

If 32-bit is so good, you'll want to get a 32-bit
kernel on your hardware ASAP.
Post by William Lee Irwin III
Proper use of
O_LARGEFILE etc. is actually unlikely to hurt sparc32 in any
significant way, as it has a decent number of registers,
More than O_LARGEFILE would be involved.

All the /proc parsing code would have to be 64-bit.
The overhead of strtoull() and the libgcc division
functions is really bad. It can show up at the top
of a profile.

Here's just a small amount of 64-bit usage:

Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ns/call ns/call name
16.26 0.20 0.20 __udivdi3

That was merely an accidental 64-bit usage. It gets
far, far worse when trying to support a 64-bit kernel
from a 32-bit app.
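
To make the cost concrete, code of roughly this shape is what ends up in the libgcc helper when built 32-bit. This is a sketch only: `field_to_seconds` and its arguments are invented for illustration and are not taken from procps. On 32-bit x86, gcc typically compiles the 64-bit division below into a call to `__udivdi3` rather than a single divide instruction, which is how that symbol can appear in a profile.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal tick count as a 64-bit value and
 * convert it to whole seconds. Both the strtoull() call and the division
 * operate on 64-bit quantities even in a 32-bit build.
 */
static uint64_t field_to_seconds(const char *field, uint64_t hz)
{
    assert(hz != 0);
    return strtoull(field, NULL, 10) / hz;
}
```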
William Lee Irwin III
2004-09-20 21:01:59 UTC
Post by Albert Cahalan
Post by William Lee Irwin III
I didn't touch this. Also, procps FTBFS on sparc64; I see no evidence
of passing -m64 or whatever to either the compilation or linking phase
in virgin procps.
Debian compiles sparc and sparc64 on the same box,
and fails to distinguish them. Therefore, Debian
disables -m64 during the build. The package itself
will disable -m64 if it is unable to successfully
link a dummy.c file with ncurses when using it.
If you compile from the *.tar.gz file and have a
working 64-bit ncurses, you should get a 64-bit
executable. If you don't, please tell me what I
need to do to make it work.
Reread the above.

I used make CC="gcc -m64" to make it work on the pristine sources.
It did not work on pristine sources otherwise.
Post by Albert Cahalan
Post by William Lee Irwin III
It gets better, though. On SuSE's x86-64 userspace,
procps and stat(1) are compiled 64-bit yet the inode numbers overflow,
as all /proc/$PID entries have identical inode numbers as reported.
Hmmm... who cares about the inode numbers?
If the answer is "glibc readdir", then the numbers
should be taken from distinct number spaces for the
/proc, /proc/*/task, and /proc/*/fd directories.
Re-use within a directory may be the most trouble.
If it's not even that, then the fake_ino() macro
shouldn't pretend that the numbers matter. Simply
make it be a constant.
I don't need to answer this beyond "stock userspace pukes on it".
The patch writer bears the burden of proof.
Post by Albert Cahalan
Post by William Lee Irwin III
x86-64 and ia64 are highly unusual in that 64-bit compilation is
useful for basic apps, and alpha an exception in that it has no 32-bit
ABI; for most 64-bit architectures making such large fractions of
userspace 64-bit applications is undesirable.
MIPS at least has N32, which is a distinct ABI for
the ILP32 model on 64-bit hardware. This offers the
best performance while not making 32-bit users suffer.
For whatever merits the n32 ABI may have, biarch userspace generally
involves reusing most of the 32-bit userspace.
Post by Albert Cahalan
Post by William Lee Irwin III
It's UltraSPARC. 32-bit userspace is used there. Recompiling top(1) as
a 64-bit app produces nonempty process lists.
And telling sparc32 users that has already been done for far more
severe slowdowns, in particular udiv emulation.
You mean it traps instead of directly calling
a routine in libgcc?
Yes. There is some instruction that's emulated (udiv) instead of having
library calls generated.
Post by Albert Cahalan
Admittedly on x86, profiling has shown that handling
64-bit values with a 32-bit ABI is truly horrible.
So both sparc and sparc64 are being hurt if 32-bit
sparc binaries have to support 64-bit kernels.
sparc32 is legacy enough that the true 32-bit systems are
irrelevant as performance considerations. I also very explicitly
discussed the ia32 case being the sole exception among all emulated
32-bit ABI's; it is so grossly inferior the 64-bit ABI dominates it in
performance as it does nowhere else.
Post by Albert Cahalan
If 32-bit is so good, you'll want to get a 32-bit
kernel on your hardware ASAP.
This may be one reason why non-Linux kernels have chosen to use 32-bit
kernels on 64-bit cpus in various instances. But in general this
assertion is pure gibberish. That's not how biarch userspace is done;
if you care to change that, petition the distribution maintainers.

As for e.g. UltraSPARC userspace, it doesn't come any other way but
biarch and mostly 32-bit. If you would like to petition distributions
to ship procps as 64-bit those Debian project members I added to the
cc: list here may be relevant.
Post by Albert Cahalan
Post by William Lee Irwin III
Proper use of
O_LARGEFILE etc. is actually unlikely to hurt sparc32 in any
significant way, as it has a decent number of registers,
More than O_LARGEFILE would be involved.
All the /proc parsing code would have to be 64-bit.
The overhead of strtoull() and the libgcc division
functions is really bad. It can show up at the top
of a profile.
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ns/call ns/call name
16.26 0.20 0.20 __udivdi3
That was merely an accidental 64-bit usage. It gets
far, far worse when trying to support a 64-bit kernel
from a 32-bit app.
This is undoubtedly architecture-specific. I am thoroughly unimpressed
by ia32 results. The effect of double-precision arithmetic on inferior,
register-starved ISA's/ABI's is very predictable and not applicable to
any other architecture. Also, your concerns have been expressed before
and are motivators of the canonical -D_FILE_OFFSET_BITS=64 arrangement,
where apps do not explicitly use any 64-bit types or 64-bit aware
library routines, but are rather redirected to such depending on whether
-D_FILE_OFFSET_BITS=64 is passed as a compilation flag or not.
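
A minimal sketch of that arrangement, assuming glibc: the source only ever says `off_t`, and the macro decides how wide it is. Here the macro is defined in the file for the sake of a self-contained example; in a real build it would normally come from the compiler command line. The function names are invented for illustration.

```c
/* Must precede every system header; usually passed as -D_FILE_OFFSET_BITS=64. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Width of off_t in bits as seen by this translation unit. */
static unsigned off_t_bits(void)
{
    return (unsigned)(sizeof(off_t) * CHAR_BIT);
}

/* Abort early if the flag did not take effect. */
static void require_large_off_t(void)
{
    assert(off_t_bits() == 64);
}
```

Without the define, a 32-bit glibc build would see a 32-bit `off_t` from the same source. As far as I know, glibc widens `ino_t` under the same macro, which is the part relevant to the inode numbers discussed here.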

I strongly suggest the following:
(a) Stop fiddling with the compilation flags since it doesn't work
anyway and is undone by packagers just to get it to build.
Furthermore, they also find your attempt to force 64-bit
compilation undesirable. As a reminder, the compiletest
demonstrating build failure from prior posts was on pristine
sources, not Debian-altered sources.
(b) Make O_LARGEFILE a compile-time option; merely use off_t etc. instead
of unsigned long and the types and fs syscalls will be switched
between single and double precision depending on whether
-D_FILE_OFFSET_BITS=64 is used. No overhead will be created
for 32-bit ia32 unless someone is so foolish as to e.g. use
-D_FILE_OFFSET_BITS=64 on an inferior, register-starved ISA/ABI.
(c) The compiler spews more warnings than are reasonable, and some of
them are likely bugs. Cleaning those up may help.
(d) If your patch is triggering some bug in glibc, diagnose it and send
a bugreport to glibc maintainers.
(e) This should have been rather easily anticipated given that you're
shifting (pid+1) << BITS_PER_LONG/2. I expect the maintainers
of most/all arches with 32-bit emulation (essentially all
64-bit except alpha) to have conniption fits if this gets
anywhere near mainline. Such shifts are tantamount to 32-bit
emulated stat(2) always returning -EOVERFLOW in lieu of results.
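
To illustrate (e): assuming the macro has roughly the shape ((pid + 1) << (BITS_PER_LONG / 2)) | ino, which is a reading of the description above and not the text of fake_ino-fixes.patch, every value it produces on a 64-bit kernel needs more than 32 bits, and truncating to 32 bits throws the pid away. All names below are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_LONG_64 64  /* assumption: a 64-bit kernel */

/* Assumed shape of the inode number; not copied from the patch. */
static uint64_t fake_ino_sketch(uint32_t pid, uint32_t ino)
{
    return (((uint64_t)pid + 1) << (BITS_PER_LONG_64 / 2)) | ino;
}

/* What a 32-bit stat(2) has to decide: does the value fit a 32-bit ino_t? */
static bool fits_ino32(uint64_t ino)
{
    return ino <= UINT32_MAX;
}

/* Every pid yields a value too wide for 32 bits, with the same low half. */
static void check_sketch(void)
{
    assert(!fits_ino32(fake_ino_sketch(1, 2)));
    assert((uint32_t)fake_ino_sketch(1, 2) == (uint32_t)fake_ino_sketch(4321, 2));
}
```

Under that assumption the two symptoms reported earlier line up: a 32-bit caller either gets an overflow error or, if the value is silently truncated, sees the same inode number for every /proc/$PID directory.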


-- wli
William Lee Irwin III
2004-09-20 22:20:26 UTC
Post by William Lee Irwin III
(e) This should have been rather easily anticipated given that you're
shifting (pid+1) << BITS_PER_LONG/2. I expect the maintainers
of most/all arches with 32-bit emulation (essentially all
64-bit except alpha) to have conniption fits if this gets
anywhere near mainline. Such shifts are tantamount to 32-bit
emulated stat(2) always returning -EOVERFLOW in lieu of results.
I've confirmed that backing out fake_ino-fixes.patch repairs 32-bit
emulated userspace.

akpm, please back out fake_ino-fixes.patch.


-- wli
Andrew Morton
2004-09-20 17:58:36 UTC
I'm getting
*** Warning: "afs_file_page_mkwrite" [fs/afs/kafs.ko] undefined!
when I build the kernel.
If I remove make-afs-use-cachefs.patch it works just fine.
Like this, I guess.

--- 25/fs/afs/file.c~afs-cachefs-dependency-fix 2004-09-20 10:56:28.714441752 -0700
+++ 25-akpm/fs/afs/file.c 2004-09-20 10:57:39.770639568 -0700
@@ -33,7 +33,9 @@ static int afs_file_releasepage(struct p

static ssize_t afs_file_write(struct file *file, const char __user *buf,
size_t size, loff_t *off);
+#ifdef CONFIG_AFS_CACHEFS
static int afs_file_page_mkwrite(struct page *page);
+#endif

struct inode_operations afs_file_inode_operations = {
.getattr = afs_inode_getattr,
@@ -56,7 +58,9 @@ struct address_space_operations afs_fs_a
.set_page_dirty = __set_page_dirty_nobuffers,
.releasepage = afs_file_releasepage,
.invalidatepage = afs_file_invalidatepage,
+#ifdef CONFIG_AFS_CACHEFS
.page_mkwrite = afs_file_page_mkwrite,
+#endif
};

/*****************************************************************************/
_
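
The pattern in isolation, as a userspace sketch: when a function is only built under a config option, its declaration and the ops-table entry pointing at it need the same guard, otherwise the table references a symbol nobody defines. `CONFIG_DEMO_CACHE`, `demo_aops` and the `demo_*` functions are invented names, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

struct demo_ops {
    int (*releasepage)(void *page);
    int (*page_mkwrite)(void *page);   /* optional hook */
};

static int demo_releasepage(void *page) { (void)page; return 0; }

#ifdef CONFIG_DEMO_CACHE
static int demo_page_mkwrite(void *page) { (void)page; return 0; }
#endif

static const struct demo_ops demo_aops = {
    .releasepage = demo_releasepage,
#ifdef CONFIG_DEMO_CACHE
    .page_mkwrite = demo_page_mkwrite,  /* only referenced when it is built */
#endif
};

/* Callers treat the hook as optional: a NULL entry means "nothing to do". */
static int demo_mkwrite(void *page)
{
    assert(page != NULL);
    return demo_aops.page_mkwrite ? demo_aops.page_mkwrite(page) : 0;
}
```

Leaving the member out of the initializer makes it NULL, so the build succeeds with or without the option as long as callers check the pointer before using it.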
Magnus Määttä
2004-09-20 20:15:04 UTC
Hi Andrew,

Sent first mail with wrong mail address..
Post by Andrew Morton
I'm getting
*** Warning: "afs_file_page_mkwrite" [fs/afs/kafs.ko] undefined!
when I build the kernel.
If I remove make-afs-use-cachefs.patch it works just fine.
Like this, I guess.
--- 25/fs/afs/file.c~afs-cachefs-dependency-fix 2004-09-20 10:56:28.714441752 -0700
+++ 25-akpm/fs/afs/file.c 2004-09-20 10:57:39.770639568 -0700
@@ -33,7 +33,9 @@ static int afs_file_releasepage(struct p
static ssize_t afs_file_write(struct file *file, const char __user *buf,
size_t size, loff_t *off);
+#ifdef CONFIG_AFS_CACHEFS
static int afs_file_page_mkwrite(struct page *page);
+#endif
struct inode_operations afs_file_inode_operations = {
.getattr = afs_inode_getattr,
@@ -56,7 +58,9 @@ struct address_space_operations afs_fs_a
.set_page_dirty = __set_page_dirty_nobuffers,
.releasepage = afs_file_releasepage,
.invalidatepage = afs_file_invalidatepage,
+#ifdef CONFIG_AFS_CACHEFS
.page_mkwrite = afs_file_page_mkwrite,
+#endif
};

/*****************************************************************************/

That fixed it, thanks!


/Magnus
David Howells
2004-09-20 20:45:11 UTC
Hi Andrew,

That looks okay. Thanks.

David
Andi Kleen
2004-09-20 19:30:07 UTC
Post by Terence Ripperda
this is some ugly code. we're doing a lookup on a physical address to
see if this is memory we previously allocated and returning a kernel
pointer to the page.
the particular snippet in question (that uses MAXMEM) is an ugly attempt
to verify the address is a real physical address, before using __va()
on something like an i/o region. A better approach than comparing
MAXMEM would probably be to convert the address to a mapnr and compare
to max_mapnr.
pfn_valid() is intended for that. However it cannot work
when you have more than 4GB memory and IO memory holes below 4GB.
Testing the reserved bit of the struct page * may work in
that case, although it can give false positives when you try
this with random memory (some people set reserved on real memory
for other reasons).

-Andi
William Lee Irwin III
2004-09-24 05:30:31 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
- Added lots of Ingo's low-latency patches
- Lockmeter doesn't compile. Don't enable CONFIG_LOCKMETER.
- Several architecture updates
Sorry to bother you again. I appear to get this after a couple days of
uptime:

# ----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at cfq_iosched:1395
invalid operand: 0000 [1] SMP
CPU 0
Modules linked in: st sr_mod floppy usbserial parport_pc lp parport snd_seq_oss snd_seq_device snd_seq_midi_event snd_seq thermal snd_pcm_oss snd_mixer_oss snd_ioctl32 processor fan button battery snd_intel8x0 snd_ac97_codec snd_pcm ipv6 snd_timer ac snd soundcore snd_page_alloc af_packet joydev usbhid uhci_hcd e1000 usbcore hw_random evdev dm_mod ext3 jbd aic79xx ata_piix libata sd_mod scsi_mod
Pid: 0, comm: swapper Not tainted 2.6.9-rc2-mm1
RIP: 0010:[<ffffffff802909cb>] <ffffffff802909cb>{cfq_put_request+139}
RSP: 0018:ffffffff804c9848 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 000001017e1d0040 RCX: 000001017e23a0c0
RDX: 0000000000000001 RSI: 0000010132292990 RDI: 000001017e4b2ef8
RBP: 000001017e9e6968 R08: 0000000000000002 R09: 0000000000800110
R10: 0000000000000001 R11: 0000010006466560 R12: 000001017ff68c08
R13: 0000010132292990 R14: 000001017ffdf100 R15: 0000000000000200
FS: 0000000000000000(0000) GS:ffffffff8054d900(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000002a9b7f4000 CR3: 0000000000101000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff80550000, task ffffffff8041dc80)
Stack: 0000010132292990 000001017e1d0040 0000000000000001 0000000000000001
000001017e1d0040 ffffffff802850cf 000001015e4bfc80 ffffffff80287a4b
0000010132292990 0000010037c1c800
Call Trace:<IRQ> <ffffffff802850cf>{elv_put_request+15} <ffffffff80287a4b>{__blk_put_request+139}
<ffffffff80287b83>{end_that_request_last+243} <ffffffffa0006178>{:scsi_mod:scsi_end_request+200}
<ffffffffa00063f0>{:scsi_mod:scsi_io_completion+576}
<ffffffffa0000506>{:scsi_mod:scsi_finish_command+214}
<ffffffffa0000e4a>{:scsi_mod:scsi_softirq+234} <ffffffff8013da51>{__do_softirq+113}
<ffffffff8013db05>{do_softirq+53} <ffffffff80113f1f>{do_IRQ+335}
<ffffffff80110d27>{ret_from_intr+0} <EOI> <ffffffff8010f5a6>{mwait_idle+86}
<ffffffff8010f9fd>{cpu_idle+29} <ffffffff8055371a>{start_kernel+490}
<ffffffff805531e0>{_sinittext+480}

Code: 0f 0b e7 9d 38 80 ff ff ff ff 73 05 ff c8 48 89 ef 41 89 44
RIP <ffffffff802909cb>{cfq_put_request+139} RSP <ffffffff804c9848>
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
Jens Axboe
2004-09-24 07:11:23 UTC
Post by William Lee Irwin III
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/
- Added lots of Ingo's low-latency patches
- Lockmeter doesn't compile. Don't enable CONFIG_LOCKMETER.
- Several architecture updates
Sorry to bother you again. I appear to get this after a couple days of
# ----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at cfq_iosched:1395
is it the !allocated[rw] test again?
--
Jens Axboe
William Lee Irwin III
2004-09-24 07:24:05 UTC
Post by Jens Axboe
Post by William Lee Irwin III
Sorry to bother you again. I appear to get this after a couple days of
# ----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at cfq_iosched:1395
is it the !allocated[rw] test again?
I am unfortunately completely oblivious to bdev handling code. In
2.6.9-rc2-mm1 this corresponds to (whitespace not preserved):

1390 BUG_ON(!hlist_unhashed(&crq->hash));
1391
1392 if (crq->io_context)
1393 put_io_context(crq->io_context->ioc);
1394
1395 BUG_ON(!cfqq->allocated[crq->is_write]);
1396 cfqq->allocated[crq->is_write]--;
1397
1398 mempool_free(crq, cfqd->crq_pool);
1399 rq->elevator_private = NULL;


-- wli
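
For readers following along: the check at line 1395 fires when a request is released while the per-direction counter for its queue is already zero. A minimal userspace sketch of that accounting follows. The names struct model_queue, model_get_request() and model_put_request() are hypothetical, and the increment side is an assumption, since only the release path is quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-queue state; index 0 = read, 1 = write. */
struct model_queue {
    int allocated[2];
};

/* Assumed allocation side: count one outstanding request in this direction. */
void model_get_request(struct model_queue *q, int is_write)
{
    assert(is_write == 0 || is_write == 1);
    q->allocated[is_write]++;
}

/*
 * Release side, mirroring lines 1395-1396 of the quoted code.
 * Returns false in the case where the kernel code would hit BUG_ON,
 * i.e. a put with no outstanding request counted in that direction.
 */
bool model_put_request(struct model_queue *q, int is_write)
{
    assert(is_write == 0 || is_write == 1);
    if (!q->allocated[is_write])
        return false;
    q->allocated[is_write]--;
    return true;
}
```

In this model a second put after a single get, or a put in the other direction, returns false. Whether either of those is what actually happens in the oops above is not established by the trace alone.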