Discussion:
2.6.17-rc5-mm1
Andrew Morton
2006-05-30 09:29:25 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/


- The git-cfq tree is causing oopses and has been dropped.

- New reiser4 code drop.

- Merged the generic-IRQ handling code.

- Merged the runtime locking validator. If you enable this your machine
will run slowly.

- The build is broken on ia64 and probably on everything apart from x86,
x86_64 and powerpc. Check out the hot-fixes directory, as it won't be
broken for long.

- Dropped the git-viro-bid-* git trees - they're getting many rejects
against other things in -mm.

- Merged the new readahead code.

- Merged the generic statistics infrastructure patches.




Changes since 2.6.17-rc4-mm3:


origin.patch
git-acpi.patch
git-agpgart.patch
git-alsa.patch
git-audit-master.patch
git-block.patch
git-cifs.patch
git-dvb.patch
git-gfs2.patch
git-ia64.patch
git-infiniband.patch
git-intelfb.patch
git-klibc.patch
git-hdrcleanup.patch
git-hdrinstall.patch
git-libata-all.patch
git-mips.patch
git-mtd.patch
git-mtd-fixup.patch
git-mtd-cs553x_nand-build-fix.patch
git-mtd-ya-build-fix.patch
git-netdev-all.patch
git-net.patch
git-nfs.patch
git-powerpc.patch
git-rbtree.patch
git-sas.patch
git-pcmcia.patch
git-scsi-rc-fixes.patch
git-scsi-target.patch
git-supertrak.patch
git-watchdog.patch
git-cryptodev.patch

git trees

-sys_sync_file_range-move-exported-flags-outside-kernel.patch
-knfsd-fix-two-problems-that-can-cause-rmmod-nfsd-to-die.patch
-md-fix-possible-oops-when-starting-a-raid0-array.patch
-md-make-sure-bi_max_vecs-is-set-properly-in-bio_split.patch
-git-audit-master-build-fix.patch
-audit-build-fix.patch
-git-klibc-build-hacks.patch
-git-klibc-stdint-build-fix.patch
-git-klibc-stdint-build-fix-2.patch
-e1000-endian-fixes.patch
-forcedeth-suggested-cleanups.patch
-forcedeth-add-support-for-flow-control.patch
-forcedeth-add-support-for-configuration.patch
-drivers-net-s2ioc-make-bus_speed-static.patch
-qla2xxx-lock-ordering-fix.patch
-qla2xxx-lock-ordering-fix-warning-fix.patch
-orinoco-possible-null-pointer-dereference-in-orinoco_rx_monitor.patch
-x86_64-mm-i386-numa-summit-check-fix.patch
-x86-64-calgary-iommu-introduce-iommu_detected.patch
-x86-64-calgary-iommu-calgary-specific-bits.patch
-x86-64-calgary-iommu-hook-it-in.patch
-x86-64-check-for-valid-dma-data-direction-in-the-dma-api.patch
-fix-unlikely-memory-leak-in-dac960-driver.patch
-sunsu-license-fix.patch
-intelfb-use-firmware-edid-for-mode-database.patch
-intelfb-use-firmware-edid-for-mode-database-fix.patch

Merged into mainline or a subsystem tree

+ext3-resize-fix-double-unlock_super.patch
+fbcon-fix-scrollback-with-logo-issue-immediately-after-boot.patch
+spanned_pages-is-not-updated-at-a-case-of-memory-hot-add.patch
+tpm-bios-log-parsing-fixes.patch
+tpm-more-bios-log-parsing-fixes.patch
+tpm-more-bios-log-parsing-fixes-tidy.patch
+ipmi-reserve-i-o-ports-separately.patch
+revert-swsusp-add-check-for-suspension-of-x-controlled-devices.patch
+hrtimer-export-symbols.patch
+drivers-usb-core-devioc-dereference-userspace-pointer.patch
+scsi-properly-count-the-number-of-pages-in-scsi_req_map_sg.patch
+x86_64-fix-stack-mmap-randomization-for-compat.patch
+x86_64-fix-no-iommu-warning-in-pci-gart-driver.patch
+i386-apic=-command-line-option-should-always-be.patch
+x86_64-fix-last_tsc-calculation-of-pm-timer.patch
+x86_64-handle-empty-node-zero.patch
+x86_64-fix-off-by-one-in-bad_addr-checking-in.patch
+x86_64-dont-do-syscall-exit-tracing-twice.patch
+powerpc-fix-boot-on-emac.patch
+au1100fb-fix-compilation.patch
+maxinefb-fix-compilation-error.patch
+sgiioc4-use-mmio-ops-instead-of-port-io.patch
+md-fix-badness-in-sysfs_notify-caused-by-md_new_event.patch

2.6.17 queue

+acpi-atlas-acpi-driver.patch
+acpi-atlas-acpi-driver-v2-tidy.patch
+remove-acpi_os_create_lock-acpi_os_delete_lock.patch

ACPI updates

+firmware_class-s-semaphores-mutexes.patch

mutex conversion.

+trivial-videodev2h-patch.patch

cleanup

+fix-broken-suspend-resume-in-ohci1394-was-acpi-suspend.patch
+ieee1394_core-switch-to-kthread-api.patch
+ieee1394_core-switch-to-kthread-api-fix.patch

ieee1394 updates

-input-move-fixp-arithh-to-drivers-input.patch
-input-fix-accuracy-of-fixp-arithh.patch
-input-new-force-feedback-interface.patch
-input-adapt-hid-force-feedback-drivers-for-the-new-interface.patch
-input-adapt-uinput-for-the-new-force-feedback-interface.patch
-input-adapt-iforce-driver-for-the-new-force-feedback-interface.patch
-input-force-feedback-driver-for-pid-devices.patch
-input-force-feedback-driver-for-zeroplus-devices.patch
-input-update-documentation-of-force-feedback.patch
-input-drop-the-remains-of-the-old-ff-interface.patch
-input-drop-the-old-pid-driver.patch

Dropped - these are being redone.

+input-powermac-cleanup-of-mac_hid-and-support-for-ctrlclick-and-commandclick-update.patch

Fix input-powermac-cleanup-of-mac_hid-and-support-for-ctrlclick-and-commandclick.patch

+mm-constify-drivers-char-keyboardc.patch
+input-logitech-trackman-trackball-support.patch

Input driver updates

+git-mtd-cs553x_nand-build-fix.patch
+git-mtd-ya-build-fix.patch

Fix git-mtd.patch

+git-netdev-all-fixup.patch

Fix reject due to git-netdev-all.patch

+git-net-git-klibc-fixup.patch

Fix reject.

+eliminate-unused-proc-sys-net-ethernet.patch

Kill empty /proc directory.

+irda-missing-allocation-result-check-in-irlap_change_speed.patch

IRDA fix

+nfs-really-return-status-from-decode_recall_args.patch

NFS fixlet.

+64-bit-resources-arch-powerpc-changes-update.patch

ppc build fix

+allow-msi-to-work-on-kexec-kernel.patch
+pci-disable-msi-mode-in-pci_disable_device.patch

PCI fixes

+pcmcia-missing-pcmcia_get_socket-result-check.patch

pcmcia fixlet.

+qla1280-fix-section-mismatch-warnings.patch
+bogus-disk-geometry-on-large-disks.patch
+bogus-disk-geometry-on-large-disks-warning-fix.patch
+megaraid_sas-switch-fw_outstanding-to-an-atomic_t.patch
+megaraid_sas-add-support-for-zcr-controller.patch
+megaraid_sas-add-support-for-zcr-controller-fix.patch

SCSI driver updates

+usb-gadget-update-inodec-to-support-full-speed-only.patch
+usb-gadget-update-pxa2xx_udcc-and-arch-dependent-files.patch
+usb-gadget-update-pxa2xx_udcc-driver-to-fully-support.patch
+usb-gadget-clean-udch.patch
+usb-gadget-dont-build-small-version-if-usbgadgetfs.patch
+driver-for-apple-cinema-display.patch
+driver-for-apple-cinema-display-tweaks.patch
+usb-wifi-zd1201-cleanups.patch

USB updates

-x86_64-mm-iommu-warning.patch
-x86_64-mm-i386-apic-overwrite.patch
-x86_64-mm-profile-pc-fp.patch
-x86_64-mm-fix-last_tsc-calculation-of-pm-timer.patch
-x86_64-mm-empty-node0.patch
-x86_64-mm-disable-apic-initdata.patch
+x86_64-mm-iommu-clarification.patch
+x86_64-mm-reliable-stack-trace-support.patch
+x86_64-mm-reliable-stack-trace-support-x86-64.patch
+x86_64-mm-reliable-stack-trace-support-x86-64-irq-stack.patch
+x86_64-mm-reliable-stack-trace-support-x86-64-syscall.patch
+x86_64-mm-reliable-stack-trace-support-i386.patch
+x86_64-mm-reliable-stack-trace-support-i386-entrys.patch
+x86_64-mm-consoldidate-boot-compressed.patch
+x86_64-mm-remove-pud_offset_k.patch
+x86_64-mm-use-halt-instead-of-raw-inline-assembly.patch
+x86_64-mm-change-assembly-to-use-regular-cpuid_count-macro.patch
+x86_64-mm-iommu-detected.patch
+x86_64-mm-valid-dma-direction.patch
+x86_64-mm-iommu-abstraction.patch
+x86_64-mm-calgary-iommu.patch
+x86_64-mm-moving-phys_proc_id-and-cpu_core_id-to-cpuinfo_x86.patch
+x86_64-mm-add-nmi-watchdog-support-for-new-intel-cpus.patch
+x86_64-mm-rdtscp-macros.patch
+x86_64-mm-time-constants.patch
+x86_64-mm-rename-force-hpet.patch
+x86_64-mm-rdtscp-feature.patch
+x86_64-mm-remove-hpet-hack.patch
+x86_64-mm-use-time-constants.patch
+x86_64-mm-init-rdtscp.patch
+x86_64-mm-explain-double-hpet-init.patch
+x86_64-mm-update-copyright.patch
+x86_64-mm-getcpu-vsyscall.patch
+x86_64-mm-time-init-gtod-prototype.patch
+x86_64-mm-x86-clean-up-nmi-panic-messages.patch

x86_64 tree updates

-revert-x86_64-mm-profile-pc-fp.patch

Dropped.

+fix-x86_64-mm-reliable-stack-trace-support-i386-entrys.patch
+x86_64-mm-reliable-stack-trace-support-non-x86-fix.patch
+x86_64-mm-reliable-stack-trace-support-non-x86-fix-fix.patch
+x86_64-mm-moving-phys_proc_id-and-cpu_core_id-to-cpuinfo_x86-warning-fix.patch

Fix x86_64 tree.

+lock-validator-lockdep-small-xfs-init_rwsem-cleanup.patch

XFS cleanup.

-zone-init-check-and-report-unaligned-zone-boundaries-fix-v2.patch

Folded into zone-init-check-and-report-unaligned-zone-boundaries.patch

-zone-allow-unaligned-zone-boundaries-spelling-fix.patch

Folded into zone-allow-unaligned-zone-boundaries.patch

+zone-allow-unaligned-zone-boundaries-x86-add-zone-alignment-qualifier.patch

Implement it on x86.

-unify-pxm_to_node-and-node_to_pxm-update.patch

Folded into unify-pxm_to_node-and-node_to_pxm.patch

-pgdat-allocation-for-new-node-add-specify-node-id-powerpc-fix.patch
-pgdat-allocation-for-new-node-add-specify-node-id-tidy.patch
-pgdat-allocation-for-new-node-add-specify-node-id-fix-3.patch
-pgdat-allocation-for-new-node-add-specify-node-id-build-fixes.patch
-pgdat-allocation-for-new-node-add-specify-node-id-tidy-cleanup.patch

Folded into pgdat-allocation-for-new-node-add-specify-node-id.patch

-pgdat-allocation-for-new-node-add-get-node-id-by-acpi-tidy.patch

Folded into pgdat-allocation-for-new-node-add-get-node-id-by-acpi.patch

-pgdat-allocation-for-new-node-add-generic-alloc-node_data-tidy.patch

Folded into pgdat-allocation-for-new-node-add-generic-alloc-node_data.patch

-pgdat-allocation-for-new-node-add-refresh-node_data-fix.patch

Folded into pgdat-allocation-for-new-node-add-refresh-node_data.patch

-pgdat-allocation-for-new-node-add-export-kswapd-start-func-tidy.patch

Folded into pgdat-allocation-for-new-node-add-export-kswapd-start-func.patch

-catch-valid-mem-range-at-onlining-memory-tidy.patch
-catch-valid-mem-range-at-onlining-memory-fix.patch

Folded into catch-valid-mem-range-at-onlining-memory.patch

-register-sysfs-file-for-hotpluged-new-node-fix.patch

Folded into register-sysfs-file-for-hotpluged-new-node.patch

-mm-introduce-remap_vmalloc_range-tidy.patch
-mm-introduce-remap_vmalloc_range-fix.patch

Folded into mm-introduce-remap_vmalloc_range.patch

-change-gen_pool-allocator-to-not-touch-managed-memory-update.patch
-change-gen_pool-allocator-to-not-touch-managed-memory-update-2.patch

Folded into change-gen_pool-allocator-to-not-touch-managed-memory.patch

-page-migration-cleanup-extract-try_to_unmap-from-migration-functions-update-comments-7.patch

Folded into page-migration-cleanup-extract-try_to_unmap-from-migration-functions.patch

-page-migration-cleanup-move-fallback-handling-into-special-function-update-comments-9.patch

Folded into page-migration-cleanup-move-fallback-handling-into-special-function.patch

-swapless-pm-add-r-w-migration-entries-fix.patch
-swapless-pm-add-r-w-migration-entries-ifdefs.patch
-swapless-pm-add-r-w-migration-entries-update-comments.patch
-swapless-pm-add-r-w-migration-entries-update-comments-4.patch
-swapless-pm-add-r-w-migration-entries-update-comments-6.patch

Folded into swapless-pm-add-r-w-migration-entries.patch

+swapless-pm-add-r-w-migration-entries-fix-2.patch

Fix it again.

-swapless-page-migration-modify-core-logic-remove-useless-mapping-checks.patch

Folded into swapless-page-migration-modify-core-logic.patch

-more-page-migration-use-migration-entries-for-file-pages-fix.patch
-more-page-migration-use-migration-entries-for-file-pages-update-comments-5.patch
-more-page-migration-use-migration-entries-for-file-pages-update-comments-8.patch
-more-page-migration-use-migration-entries-for-file-pages-remove_migration_ptes.patch
-more-page-migration-use-migration-entries-for-file-pages-replace-call-to-pageout-with-writepage-2.patch

Folded into more-page-migration-use-migration-entries-for-file-pages.patch

-tracking-dirty-pages-in-shared-mappings-v4.patch
-tracking-dirty-pages-in-shared-mappings-v4-fix2.patch
-tracking-dirty-pages-in-shared-mappings-v4-fix3.patch
-throttle-writers-of-shared-mappings.patch
-throttle-writers-of-shared-mappings-tidy.patch
-optimize-follow_pages.patch

Dropped, being redone.

+node-hotplug-register-cpu-remove-node-struct.patch
+node-hotplug-fixes-callres-of-register_cpu.patch
+node-hotplug-fixes-callres-of-register_cpu-powerpc-warning-fix.patch
+node-hotplug-register_node-fix.patch

NUMA node hotplugging updates

+add-page_mkwrite-vm_operations-method.patch
+mm-remove-vm_locked-before-remap_pfn_range-and-drop-vm_shm.patch
+swapoff-atomic_inc_not_zero-on-mm_users.patch
+remove-unused-o_flags-from-do_shmat.patch
+fix-update_mmu_cache-in-fremapc.patch
+fix-update_mmu_cache-in-fremapc-fix.patch

Memory management updates

+page-migration-support-moving-of-individual-pages-fixes.patch
+page-migration-support-moving-of-individual-pages-x86_64-support.patch
+page-migration-support-moving-of-individual-pages-x86-support.patch
+page-migration-support-moving-of-individual-pages-x86-support-fix.patch
+allow-migration-of-mlocked-pages.patch

Page migration updates

+au1550-1200-add-missing-psc-defines-make-oss-driver-use.patch

MIPS fix

+x86-re-enable-generic-numa.patch
+x86-make-using_apic_timer-__read_mostly.patch
+x86-cyrix-code-config_pci-fix--add-__initdata.patch
+x86-constify-some-parts-of-arch-i386-kernel-cpu.patch
+x86-make-i387-mxcsr_feature_mask-__read_mostly.patch
+x86-make-acpi-errata-__read_mostly.patch
+x86-constify-arch-i386-pci-irqc.patch
+x86-use-proper-defines-for-i8259a-i-o.patch
+i386-moving-phys_proc_id-and-cpu_core_id-to-cpuinfo_x86.patch
+i386-moving-phys_proc_id-and-cpu_core_id-to-cpuinfo_x86-warning-fix.patch
+i386-fix-get_segment_eip-with-vm86.patch

x86 updates

-x86-move-vsyscall-page-out-of-fixmap-above-stack.patch
-x86-move-vsyscall-page-out-of-fixmap-above-stack-tidy.patch
+vdso-randomize-the-i386-vdso-by-moving-it-into-a-vma.patch
+vdso-randomize-the-i386-vdso-by-moving-it-into-a-vma-tidy.patch
+vdso-randomize-the-i386-vdso-by-moving-it-into-a-vma-arch_vma_name-fix.patch
+vdso-randomize-the-i386-vdso-by-moving-it-into-a-vma-vs-x86_64-mm-reliable-stack-trace-support-i386.patch
+vdso-randomize-the-i386-vdso-by-moving-it-into-a-vma-vs-x86_64-mm-reliable-stack-trace-support-i386-2.patch

Updated x86 VDSO randomisation patches

-swsusp-add-architecture-special-saveable-pages-fix.patch

Folded into swsusp-add-architecture-special-saveable-pages-support.patch

-swsusp-i386-mark-special-saveable-unsaveable-pages-fix.patch

Folded into swsusp-i386-mark-special-saveable-unsaveable-pages.patch

-swsusp-x86_64-mark-special-saveable-unsaveable-pages-fix.patch

Folded into swsusp-x86_64-mark-special-saveable-unsaveable-pages.patch

-dont-use-flush_tlb_all-in-suspend-time-tidy.patch

Folded into dont-use-flush_tlb_all-in-suspend-time.patch

-swsusp-fix-typo-in-cr0-handling.patch

Folded into swsusp-documentation-updates.patch

+m68k-completely-initialize-hw_regs_t-in-ide_setup_ports.patch
+m68k-atyfb_base-compile-fix-for-config_pci=n.patch
+m68k-cleanup-unistdh.patch
+m68k-remove-some-unused-definitions-in-zorroh.patch
+m68k-use-c99-initializer.patch
+m68k-print-correct-stack-trace.patch
+m68k-restore-amikbd-compatibility-with-24.patch
+m68k-extra-delay.patch
+m68k-use-proper-defines-for-zone-initialization.patch
+m68k-adjust-to-changed-hardirq_mask.patch
+m68k-m68k-mac-via2-fixes-and-cleanups.patch

m68k updates

+xtensa-remove-verify_area-macros.patch
+xtensa-remove-verify_area-macros-fix.patch

Xtensa updates

-s390-statistics-infrastructure.patch

Dropped.

-per-cpufy-net-proto-structures-add-percpu_counter_modbh.patch
-percpu-counters-add-percpu_counter_exceeds.patch
-per-cpufy-net-proto-structures-protomemory_allocated.patch
-per-cpufy-net-proto-structures-sockets_allocated.patch
-per-cpufy-net-proto-structures-protoinuse.patch

Dropped.

-percpu-counter-data-type-changes-to-suppport-fix.patch
-percpu-counter-data-type-changes-to-suppport-fix-fix.patch
-percpu-counter-data-type-changes-to-suppport-fix-fix-fix.patch

Folded into percpu-counter-data-type-changes-to-suppport.patch

-jbd-split-checkpoint-lists-tidy.patch

Folded into jbd-split-checkpoint-lists.patch

-mark-address_space_operations-const-fix.patch
-mark-address_space_operations-const-fix-2.patch

Folded into mark-address_space_operations-const.patch

-hptiop-highpoint-rocketraid-3xxx-controller-driver-list-locking.patch
-hptiop-highpoint-rocketraid-3xxx-controller-driver-list-locking-updates.patch
-hptiop-highpoint-rocketraid-3xxx-controller-driver-list-locking-updates-updates-2.patch
-hptiop-highpoint-rocketraid-3xxx-controller-driver-redone.patch

Folded into hptiop-highpoint-rocketraid-3xxx-controller-driver.patch

-ufs-right-block-allocation-fixes.patch

Folded into ufs-right-block-allocation.patch

-ufs-change-block-number-on-the-fly-tweaks.patch

Folded into ufs-change-block-number-on-the-fly.patch

+ufs-wrong-type-cast.patch
+ufs-not-usual-amounts-of-fragments-per-block.patch
+ufs-unmark-config_ufs_fs_write-as-broken-mm-tree.patch

More UFS fixes.

-add-driver-for-arm-amba-pl031-rtc-tidy.patch

Folded into add-driver-for-arm-amba-pl031-rtc.patch

-add-a-sysfs-file-to-determine-if-a-kexec-kernel-is-loaded-tidy.patch

Folded into add-a-sysfs-file-to-determine-if-a-kexec-kernel-is-loaded.patch

+avoid-disk-sector_t-overflow-for-2tb-ext3-filesystem.patch
+cleanup-dead-code-from-ext2-mount-code.patch
+fix-memory-leak-when-the-ext3s-journal-file-is-corrupted.patch
+remove-inconsistent-space-before-exclamation-point-in-ext3s-mount-code.patch
+moxa-remove-pointless-casts.patch
+moxa-remove-pointless-check-of-tty-argument-vs-null.patch
+moxa-partial-codingstyle-cleanup-spelling-fixes.patch
+updated-kdump-documentation.patch
+cpuset-remove-extra-cpuset_zone_allowed-check-in-__alloc_pages.patch
+spin-rwlock-init-cleanups.patch
+make-debug_mutex_on-__read_mostly.patch
+constify-parts-of-kernel-power.patch
+constify-libcrc32c-table.patch
+apple-motion-sensor-driver.patch
+prepare-for-__copy_from_user_inatomic-to-not-zero-missed-bytes.patch
+make-copy_from_user_inatomic-not-zero-the-tail-on-i386.patch
+remove-unecessary-null-check-in-kernel-acctc.patch
+ax88796-parallel-port-driver.patch
+ax88796-parallel-port-driver-build-fix.patch
+wd7000-fix-section-mismatch-warnings.patch
+megaraid_mbox-fix-section-mismatch-warnings.patch
+keys-fix-race-between-two-instantiators-of-a-key.patch
+keys-fix-race-between-two-instantiators-of-a-key-tidy.patch
+ext3_fsblk_t-filesystem-group-blocks-and-bug-fixes.patch
+ext3_fsblk_t-the-rest-of-in-kernel-filesystem-blocks.patch
+inotify-kernel-api.patch
+inotify-kernel-api-fix.patch
+kernel-doc-mm-readhead-fixup.patch
+make-procfs-obligatory-except-under-config_embedded.patch
+lock-validator-introduce-warn_on_oncecond.patch
+make-sysctl-obligatory-except-under-config_embedded.patch
+lock-validator-sound-oss-emu10k1-midic-cleanup.patch
+for_each_cpu_mask-warning-fix.patch

Misc.

-use-list_add_tail-instead-of-list_add-fix.patch

Folded into use-list_add_tail-instead-of-list_add.patch

+add-new-generic-hw-rng-core-hw_random-core-rewrite-chrdev-read-method-hw_random-core-block-read-if-o_nonblock.patch

Hardware random number generator update.

+time-fix-time-going-backward-w-clock=pit.patch

x86 time handling fix

pi-futex-futex-code-cleanups.patch
-pi-futex-futex-code-cleanups-fix.patch
+pi-futex-robust-futex-docs-fix.patch
pi-futex-introduce-debug_check_no_locks_freed.patch
+pi-futex-introduce-warn_on_smp.patch
pi-futex-add-plist-implementation.patch
pi-futex-scheduler-support-for-pi.patch
pi-futex-rt-mutex-core.patch
-pi-futex-rt-mutex-core-fix-timeout-race.patch
pi-futex-rt-mutex-docs.patch
+pi-futex-rt-mutex-docs-update.patch
pi-futex-rt-mutex-debug.patch
pi-futex-rt-mutex-tester.patch
pi-futex-rt-mutex-futex-api.patch
pi-futex-futex_lock_pi-futex_unlock_pi-support.patch
-pi-futex-v2.patch
-pi-futex-v3.patch
-pi-futex-patchset-v4.patch
-pi-futex-patchset-v4-update.patch
-pi-futex-patchset-v4-fix.patch
-rtmutex-remove-buggy-bug_on-in-pi-boosting-code.patch
-futex-pi-enforce-waiter-bit-when-owner-died-is-detected.patch
-rtmutex-debug-printk-correct-task-information.patch
-futex-pi-make-use-of-restart_block-when-interrupted.patch
-document-futex-pi-design.patch
-document-futex-pi-design-fix.patch
-document-futex-pi-design-fix-fix.patch

Updated pi-futex patch series

+ecryptfs-fs-makefile-and-fs-kconfig-remove-ecrypt_debug-from-fs-kconfig.patch
+ecryptfs-main-module-functions-uint16_t-u16.patch
+ecryptfs-header-declarations-update.patch
+ecryptfs-header-declarations-update-convert-signed-data-types-to-unsigned-data-types.patch
+ecryptfs-header-declarations-remove-unnecessary-ifndefs.patch
+ecryptfs-file-operations-remove-null-==-syntax.patch
+ecryptfs-file-operations-remove-extraneous-read-of-inode-size-from-header.patch
+ecryptfs-convert-assert-to-bug_on.patch
+ecryptfs-remove-unnecessary-null-checks.patch
+ecryptfs-rewrite-ecryptfs_fsync.patch
+ecryptfs-overhaul-file-locking.patch

ecryptfs updates

+proc-sysctl-add-_proc_do_string-helper.patch

/proc helper function.

+namespaces-utsname-switch-to-using-uts-namespaces-cleanup.patch

Folded into namespaces-utsname-switch-to-using-uts-namespaces-alpha-fix.patch

+namespaces-utsname-sysctl-hack-cleanup.patch
+namespaces-utsname-sysctl-hack-cleanup-2.patch

Folded into namespaces-utsname-sysctl-hack.patch

+uts-copy-nsproxy-only-when-needed.patch

utsname virtualisation update

+readahead-kconfig-options.patch
+radixtree-introduce-radix_tree_scan_hole.patch
+mm-introduce-probe_page.patch
+mm-introduce-pg_readahead.patch
+readahead-add-look-ahead-support-to-__do_page_cache_readahead.patch
+readahead-delay-page-release-in-do_generic_mapping_read.patch
+readahead-insert-cond_resched-calls.patch
+readahead-minmax_ra_pages.patch
+readahead-events-accounting.patch
+readahead-rescue_pages.patch
+readahead-sysctl-parameters.patch
+readahead-sysctl-parameters-fix.patch
+readahead-min-max-sizes.patch
+readahead-state-based-method-aging-accounting.patch
+readahead-state-based-method-routines.patch
+readahead-state-based-method.patch
+readahead-context-based-method.patch
+readahead-initial-method-guiding-sizes.patch
+readahead-initial-method-thrashing-guard-size.patch
+readahead-initial-method-expected-read-size.patch
+readahead-initial-method-user-recommended-size.patch
+readahead-initial-method.patch
+readahead-backward-prefetching-method.patch
+readahead-seeking-reads-method.patch
+readahead-thrashing-recovery-method.patch
+readahead-call-scheme.patch
+readahead-laptop-mode.patch
+readahead-loop-case.patch
+readahead-nfsd-case.patch
+readahead-turn-on-by-default.patch
+readahead-debug-radix-tree-new-functions.patch
+readahead-debug-traces-showing-accessed-file-names.patch
+readahead-debug-traces-showing-read-patterns.patch

readahead rework

+make-copy_from_user_inatomic-not-zero-the-tail-on-i386-vs-reiser4.patch
-reiser4-fix-incorrect-assertions.patch
-reiser4-add-missing-txn_restart-before-get_nonexclusive_access-calls.patch
-reiser4-check-radix-tree-emptiness-properly.patch
-reiser4-check-radix-tree-emptiness-properly-2.patch
-fs-reiser4-misc-cleanups.patch
-reiser4-releasepage-fix.patch
-reiser4fs-use-list_move.patch
-make-address_space_operations-invalidatepage-return-void-reiser4.patch
-reiser4-have-get_exclusive_access-restart-transaction.patch
-reiser4-writeback-fix-range-handling.patch
-reiser4-gfp_t-annotations.patch
+reiser4-run-truncate_inode_pages-in-reiser4_delete_inode.patch

Reiser4 updates

+hpt3xx-switch-to-using-pci_get_slot.patch
+hpt3xx-cache-channels-mcr-address.patch

IDE updates

+fbdev-remove-unused-exports.patch
+s3c2410fb-fix-resume.patch
+backlight-fix-kconfig-dependency.patch
+au1100fb-add-power-management-support.patch
+au1100fb-add-power-management-support-tidy.patch

fbdev updates

+dm-snapshot-unify-chunk_size.patch
+lib-add-idr_replace.patch
+lib-add-idr_replace-tidy.patch
+dm-fix-idr-minor-allocation.patch
+dm-move-idr_pre_get.patch
+dm-change-minor_lock-to-spinlock.patch
+dm-add-dmf_freeing.patch
+dm-fix-mapped-device-ref-counting.patch
+dm-add-module-ref-counting.patch
+dm-fix-block-device-initialisation.patch
+dm-mirror-sector-offset-fix.patch

Device mapper updates

+statistics-infrastructure-prerequisite-list.patch
+statistics-infrastructure-prerequisite-parser.patch
+statistics-infrastructure-prerequisite-timestamp.patch
+statistics-infrastructure-documentation.patch
+statistics-infrastructure.patch
+statistics-infrastructure-update-1.patch
+statistics-infrastructure-exploitation-zfcp.patch

Generic statistics infrastructure, use it on an s390 driver.

+genirq-rename-desc-handler-to-desc-chip-ia64-fix-2.patch

Fix genirq-rename-desc-handler-to-desc-chip.patch some more.

+genirq-sem2mutex-probe_sem-probing_active.patch
+genirq-cleanup-merge-irq_affinity-into-irq_desc.patch
+genirq-cleanup-remove-irq_descp.patch
+genirq-cleanup-remove-fastcall.patch
+genirq-cleanup-misc-code-cleanups.patch
+genirq-cleanup-reduce-irq_desc_t-use-mark-it-obsolete.patch
+genirq-cleanup-include-linux-irqh.patch
+genirq-cleanup-merge-irq_dir-smp_affinity_entry-into-irq_desc.patch
+genirq-cleanup-merge-pending_irq_cpumask-into-irq_desc.patch
+genirq-cleanup-turn-arch_has_irq_per_cpu-into-config_irq_per_cpu.patch
+genirq-debug-better-debug-printout-in-enable_irq.patch
+genirq-add-retrigger-irq-op-to-consolidate-hw_irq_resend.patch
+genirq-doc-comment-include-linux-irqh-structures.patch
+genirq-doc-handle_irq_event-and-__do_irq-comments.patch
+genirq-cleanup-no_irq_type-cleanups.patch
+genirq-doc-add-design-documentation.patch
+genirq-add-genirq-sw-irq-retrigger.patch
+genirq-add-irq_noprobe-support.patch
+genirq-add-irq_norequest-support.patch
+genirq-add-irq_noautoen-support.patch
+genirq-update-copyrights.patch
+genirq-core.patch
+genirq-add-irq-chip-support.patch
+genirq-add-handle_bad_irq.patch
+genirq-add-irq-wake-power-management-support.patch
+genirq-add-sa_trigger-support.patch
+genirq-cleanup-no_irq_type-no_irq_chip-rename.patch
+genirq-convert-the-x86_64-architecture-to-irq-chips.patch
+genirq-convert-the-i386-architecture-to-irq-chips.patch
+genirq-convert-the-i386-architecture-to-irq-chips-fix-2.patch
+genirq-more-verbose-debugging-on-unexpected-irq-vectors.patch

Generic IRQ handling layer

+lock-validator-floppyc-irq-release-fix.patch
+lock-validator-forcedethc-fix.patch
+lock-validator-mutex-section-binutils-workaround.patch
+lock-validator-add-__module_address-method.patch
+lock-validator-better-lock-debugging.patch
+lock-validator-locking-api-self-tests.patch
+lock-validator-locking-init-debugging-improvement.patch
+lock-validator-beautify-x86_64-stacktraces.patch
+lock-validator-beautify-x86_64-stacktraces-fix.patch
+lock-validator-x86_64-document-stack-frame-internals.patch
+lock-validator-stacktrace.patch
+lock-validator-stacktrace-build-fix.patch
+lock-validator-stacktrace-warning-fix.patch
+lock-validator-fown-locking-workaround.patch
+lock-validator-sk_callback_lock-workaround.patch
+lock-validator-irqtrace-core.patch
+lock-validator-irqtrace-core-powerpc-fix-1.patch
+lock-validator-irqtrace-core-non-x86-fix.patch
+lock-validator-irqtrace-core-non-x86-fix-2.patch
+lock-validator-irqtrace-core-non-x86-fix-3.patch
+lock-validator-irqtrace-cleanup-include-asm-i386-irqflagsh.patch
+lock-validator-irqtrace-cleanup-include-asm-x86_64-irqflagsh.patch
+lock-validator-lockdep-add-local_irq_enable_in_hardirq-api.patch
+lock-validator-add-per_cpu_offset.patch
+lock-validator-add-per_cpu_offset-fix.patch
+lock-validator-core.patch
+lock-validator-procfs.patch
+lock-validator-core-multichar-fix.patch
+lock-validator-core-count_matching_names-fix.patch
+lock-validator-design-docs.patch
+lock-validator-prove-rwsem-locking-correctness.patch
+lock-validator-prove-rwsem-locking-correctness-fix.patch
+lock-validator-prove-rwsem-locking-correctness-powerpc-fix.patch
+lock-validator-prove-spinlock-rwlock-locking-correctness.patch
+lock-validator-prove-mutex-locking-correctness.patch
+lock-validator-print-all-lock-types-on-sysrq-d.patch
+lock-validator-x86_64-early-init.patch
+lock-validator-smp-alternatives-workaround.patch
+lock-validator-do-not-recurse-in-printk.patch
+lock-validator-disable-nmi-watchdog-if-config_lockdep.patch
+lock-validator-special-locking-bdev.patch
+lock-validator-special-locking-direct-io.patch
+lock-validator-special-locking-serial.patch
+lock-validator-special-locking-serial-fix.patch
+lock-validator-special-locking-dcache.patch
+lock-validator-special-locking-i_mutex.patch
+lock-validator-special-locking-s_lock.patch
+lock-validator-special-locking-futex.patch
+lock-validator-special-locking-genirq.patch
+lock-validator-special-locking-completions.patch
+lock-validator-special-locking-waitqueues.patch
+lock-validator-special-locking-mm.patch
+lock-validator-special-locking-slab.patch
+lock-validator-special-locking-skb_queue_head_init.patch
+lock-validator-special-locking-timerc.patch
+lock-validator-special-locking-schedc.patch
+lock-validator-special-locking-hrtimerc.patch
+lock-validator-special-locking-sock_lock_init.patch
+lock-validator-special-locking-af_unix.patch
+lock-validator-special-locking-bh_lock_sock.patch
+lock-validator-special-locking-mmap_sem.patch
+lock-validator-special-locking-sb-s_umount.patch
+lock-validator-special-locking-sb-s_umount-fix.patch
+lock-validator-special-locking-sb-s_umount-2.patch
+lock-validator-special-locking-jbd.patch
+lock-validator-special-locking-posix-timers.patch
+lock-validator-special-locking-sch_genericc.patch
+lock-validator-special-locking-xfrm.patch
+lock-validator-special-locking-sound-core-seq-seq_portsc.patch
+lock-validator-enable-lock-validator-in-kconfig.patch
+lock-validator-enable-lock-validator-in-kconfig-x86-only.patch
+lock-validator-enable-lock-validator-in-kconfig-not-yet.patch
+lockdep-one-stacktrace-column-if-config_lockdep=y.patch
+lockdep-further-improve-stacktrace-output.patch
+lock-validator-special-locking-kgdb.patch

Runtime locking validation.

-add-print_fatal_signals-support.patch
+vdso-print-fatal-signals.patch
+vdso-improve-print_fatal_signals-support-by-adding-memory-maps.patch

Updated



ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/patch-list

Ingo Molnar
2006-05-30 09:42:19 UTC
Post by Andrew Morton
> - Merged the runtime locking validator. If you enable this your
>   machine will run slowly.

If you disable CONFIG_DEBUG_LOCKDEP it should be quite OK. (If debugging
is disabled then the lockless "chain cache" is fully utilized and we
should rarely go into the more complex portions of kernel/lockdep.c.)

Ingo
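
To illustrate the "chain cache" idea from the reply above: the sketch below is a user-space toy, not the code in kernel/lockdep.c. It assumes the validator folds the sequence of held lock classes into a single key and skips the expensive checks whenever that key has been validated before. Every name here (check_chain, chain_key_extend, validate_chain_slow, CACHE_SIZE) and the FNV-1a style hash are hypothetical choices made for illustration, and unlike the cache described above this version is single-threaded rather than lockless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE 1024 /* hypothetical size; must be a power of two */

static uint64_t seen_chains[CACHE_SIZE]; /* keys of chains already validated */
static bool slot_used[CACHE_SIZE];       /* which slots hold a valid key */
static unsigned long slow_path_runs;     /* counts expensive validations */

/* Fold one more lock class id into the running chain key (FNV-1a style). */
static uint64_t chain_key_extend(uint64_t key, uint32_t class_id)
{
	key ^= class_id;
	key *= 1099511628211ULL;
	return key;
}

/* Stand-in for the expensive dependency checks; here it only counts calls. */
static void validate_chain_slow(const uint32_t *classes, size_t n)
{
	(void)classes;
	(void)n;
	slow_path_runs++;
}

/*
 * Check a chain of held lock classes.
 * Returns true when the chain was found in the cache (fast path) and
 * false when the slow validation had to run.
 */
static bool check_chain(const uint32_t *classes, size_t n)
{
	uint64_t key = 14695981039346656037ULL; /* FNV offset basis */
	size_t i, probe;

	assert(classes != NULL || n == 0);

	for (i = 0; i < n; i++)
		key = chain_key_extend(key, classes[i]);

	/* Linear probing, starting at the slot selected by the low key bits. */
	for (probe = 0; probe < CACHE_SIZE; probe++) {
		size_t s = (size_t)((key + probe) & (CACHE_SIZE - 1));

		if (!slot_used[s]) {
			/* First time this chain is seen: validate, then remember it. */
			validate_chain_slow(classes, n);
			slot_used[s] = true;
			seen_chains[s] = key;
			return false;
		}
		if (seen_chains[s] == key)
			return true; /* already validated, nothing more to do */
	}

	/* Cache full: fall back to validating every time. */
	validate_chain_slow(classes, n);
	return false;
}
```

In this toy, calling check_chain() twice with the same sequence of class ids runs the slow validation only once; a sequence with the same ids in a different order produces a different key and is validated separately.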
Ingo Molnar
2006-05-30 10:05:26 UTC
Subject: genirq: ia64 build fix
From: Ingo Molnar <***@elte.hu>

fix missed handler -> chip rename.

Signed-off-by: Ingo Molnar <***@elte.hu>
---
arch/ia64/hp/sim/hpsim_irq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/ia64/hp/sim/hpsim_irq.c
===================================================================
--- linux.orig/arch/ia64/hp/sim/hpsim_irq.c
+++ linux/arch/ia64/hp/sim/hpsim_irq.c
@@ -45,7 +45,7 @@ hpsim_irq_init (void)

 	for (i = 0; i < NR_IRQS; ++i) {
 		idesc = irq_desc + i;
-		if (idesc->handler == &no_irq_type)
-			idesc->handler = &irq_type_hp_sim;
+		if (idesc->chip == &no_irq_type)
+			idesc->chip = &irq_type_hp_sim;
 	}
 }
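
For readers without an ia64 tree at hand, the loop touched by the patch above can be mimicked in user space. This is only a sketch under stated assumptions: struct irq_chip_mock, struct irq_desc_mock, irq_desc_reset(), claim_unassigned_irqs() and the NR_IRQS value are hypothetical stand-ins, not kernel definitions. Only the field name chip (formerly handler) and the symbols no_irq_type and irq_type_hp_sim are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_IRQS 16 /* hypothetical value, for illustration only */

/* Hypothetical stand-in for the kernel's interrupt controller type. */
struct irq_chip_mock {
	const char *name;
};

/* Hypothetical stand-in for an IRQ descriptor. */
struct irq_desc_mock {
	struct irq_chip_mock *chip; /* this field was called 'handler' before the rename */
};

static struct irq_chip_mock no_irq_type = { "none" };
static struct irq_chip_mock irq_type_hp_sim = { "hpsim" };
static struct irq_desc_mock irq_desc[NR_IRQS];

/* Put every descriptor back on the placeholder controller. */
static void irq_desc_reset(void)
{
	size_t i;

	for (i = 0; i < NR_IRQS; ++i)
		irq_desc[i].chip = &no_irq_type;
}

/*
 * Same shape as the loop in the patch: every descriptor that still points
 * at the placeholder is handed to 'chip'. Returns how many were claimed.
 */
static size_t claim_unassigned_irqs(struct irq_chip_mock *chip)
{
	size_t i, claimed = 0;

	assert(chip != NULL);

	for (i = 0; i < NR_IRQS; ++i) {
		struct irq_desc_mock *idesc = irq_desc + i;

		if (idesc->chip == &no_irq_type) {
			idesc->chip = chip;
			claimed++;
		}
	}
	return claimed;
}
```

Calling irq_desc_reset() followed by claim_unassigned_irqs(&irq_type_hp_sim) claims every descriptor that no other controller has taken; descriptors already assigned elsewhere are left alone.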
Ingo Molnar
2006-05-30 10:09:22 UTC
Subject: lock validator, irqtrace: support non-x86 architectures
From: Ingo Molnar <***@elte.hu>

add TRACE_IRQFLAGS_SUPPORT method for architectures to signal
whether they have irq-flags tracing infrastructure.

Signed-off-by: Ingo Molnar <***@elte.hu>
---
arch/i386/Kconfig.debug | 4 ++++
arch/x86_64/Kconfig.debug | 4 ++++
include/linux/trace_irqflags.h | 30 +++++++++++++++---------------
lib/Kconfig.debug | 3 +++
4 files changed, 26 insertions(+), 15 deletions(-)

Index: linux/arch/i386/Kconfig.debug
===================================================================
--- linux.orig/arch/i386/Kconfig.debug
+++ linux/arch/i386/Kconfig.debug
@@ -1,5 +1,9 @@
menu "Kernel hacking"

+config TRACE_IRQFLAGS_SUPPORT
+ bool
+ default y
+
source "lib/Kconfig.debug"

config EARLY_PRINTK
Index: linux/arch/x86_64/Kconfig.debug
===================================================================
--- linux.orig/arch/x86_64/Kconfig.debug
+++ linux/arch/x86_64/Kconfig.debug
@@ -1,5 +1,9 @@
menu "Kernel hacking"

+config TRACE_IRQFLAGS_SUPPORT
+ bool
+ default y
+
source "lib/Kconfig.debug"

config DEBUG_RODATA
Index: linux/include/linux/trace_irqflags.h
===================================================================
--- linux.orig/include/linux/trace_irqflags.h
+++ linux/include/linux/trace_irqflags.h
@@ -11,12 +11,6 @@
#ifndef _LINUX_TRACE_IRQFLAGS_H
#define _LINUX_TRACE_IRQFLAGS_H

-#include <asm/irqflags.h>
-
-/*
- * The local_irq_*() APIs are equal to the raw_local_irq*()
- * if !TRACE_IRQFLAGS.
- */
#ifdef CONFIG_TRACE_IRQFLAGS
extern void trace_hardirqs_on(void);
extern void trace_hardirqs_off(void);
@@ -31,7 +25,6 @@
# define trace_softirq_enter() do { current->softirq_context++; } while (0)
# define trace_softirq_exit() do { current->softirq_context--; } while (0)
# define INIT_TRACE_IRQFLAGS .softirqs_enabled = 1,
-
#else
# define trace_hardirqs_on() do { } while (0)
# define trace_hardirqs_off() do { } while (0)
@@ -48,7 +41,10 @@
# define INIT_TRACE_IRQFLAGS
#endif

-#ifdef CONFIG_X86
+#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT
+
+#include <asm/irqflags.h>
+
#define local_irq_enable() \
do { trace_hardirqs_on(); raw_local_irq_enable(); } while (0)
#define local_irq_disable() \
@@ -66,12 +62,16 @@
raw_local_irq_restore(flags); \
} \
} while (0)
-#else
-#define raw_local_irq_disable() local_irq_disable()
-#define raw_local_irq_enable() local_irq_enable()
-#define raw_local_irq_save(flags) local_irq_save(flags)
-#define raw_local_irq_restore(flags) local_irq_restore(flags)
-#endif /* CONFIG_X86 */
+#else /* !CONFIG_TRACE_IRQFLAGS_SUPPORT */
+/*
+ * The local_irq_*() APIs are equal to the raw_local_irq*()
+ * if !TRACE_IRQFLAGS.
+ */
+# define raw_local_irq_disable() local_irq_disable()
+# define raw_local_irq_enable() local_irq_enable()
+# define raw_local_irq_save(flags) local_irq_save(flags)
+# define raw_local_irq_restore(flags) local_irq_restore(flags)
+#endif /* CONFIG_TRACE_IRQFLAGS_SUPPORT */

/*
* On lockdep we dont want to enable hardirqs in hardirq
@@ -86,7 +86,7 @@
# define local_irq_enable_in_hardirq() local_irq_enable()
#endif

-#ifdef CONFIG_X86
+#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT
#define safe_halt() \
do { \
trace_hardirqs_on(); \
Index: linux/lib/Kconfig.debug
===================================================================
--- linux.orig/lib/Kconfig.debug
+++ linux/lib/Kconfig.debug
@@ -123,6 +123,7 @@ config DEBUG_PREEMPT
bool "Debug preemptible kernel"
depends on DEBUG_KERNEL && PREEMPT
default y
+ depends on TRACE_IRQFLAGS_SUPPORT
help
If you say Y here then the kernel will use a debug variant of the
commonly used smp_processor_id() function and will print warnings
@@ -347,6 +348,7 @@ config DEBUG_LOCKDEP
bool "Lock dependency engine debugging"
depends on LOCKDEP
default y
+ depends on TRACE_IRQFLAGS_SUPPORT
help
If you say Y here, the lock dependency engine will do
additional runtime checks to debug itself, at the price
@@ -355,6 +357,7 @@ config DEBUG_LOCKDEP
config TRACE_IRQFLAGS
bool
default y
+ depends on TRACE_IRQFLAGS_SUPPORT
depends on PROVE_SPIN_LOCKING || PROVE_RW_LOCKING

config DEBUG_SPINLOCK_SLEEP
Ingo Molnar
2006-05-30 10:11:23 UTC
Permalink
Subject: lock validator: rwsem build fix for non-x86 architectures
From: Ingo Molnar <***@elte.hu>

rwsem build fix for non-x86 architectures which use their own
asm/rwsem.h and have no __init_rwsem method yet.

Signed-off-by: Ingo Molnar <***@elte.hu>
Signed-off-by: Arjan van de Ven <***@linux.intel.com>
---
include/linux/rwsem.h | 4 ++++
1 file changed, 4 insertions(+)

Index: linux/include/linux/rwsem.h
===================================================================
--- linux.orig/include/linux/rwsem.h
+++ linux/include/linux/rwsem.h
@@ -30,8 +30,12 @@ struct rw_semaphore;
* Lockdep: type splitting can also be done for dynamic locks, if for
* example there are per-CPU dynamically allocated locks:
*/
+#ifdef CONFIG_PROVE_RWSEM_LOCKING
#define init_rwsem_key(sem, key) \
__init_rwsem((sem), #sem, key)
+#else
+# define init_rwsem_key(sem, key) init_rwsem(sem)
+#endif

#ifndef rwsemtrace
#if RWSEM_DEBUG
Ingo Molnar
2006-05-30 10:12:36 UTC
Permalink
Post by Andrew Morton
- The build is broken on ia64 and probably on everything apart from
x86, x86_64 and powerpc. Check out the hot-fixes directory, as it
won't be broken for long.
with the following patches i just sent:

patches/genirq-ia64-build-fix.patch
patches/irqtrace-support-nonx86.patch
patches/lockdep-rwsem-fix.patch

ia64 defconfig builds fine now. I'd expect other non-x86 architectures
to build fine too (with the lock validator disabled).

Ingo
Jiri Slaby
2006-05-30 10:48:19 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
====================================
[ BUG: possible deadlock detected! ]
------------------------------------
idle/1 is trying to acquire lock:
(&ops->reg_mutex){--..}, at: [<c03ca763>] mutex_lock+0x8/0xa

but task is already holding lock:
(&ops->reg_mutex){--..}, at: [<c03ca763>] mutex_lock+0x8/0xa

which could potentially lead to deadlocks!

other info that might help us debug this:
1 locks held by idle/1:
#0: (&ops->reg_mutex){--..}, at: [<c03ca763>] mutex_lock+0x8/0xa

stack backtrace:
[<c01042ac>] show_trace+0x1b/0x1d
[<c01049f2>] dump_stack+0x26/0x28
[<c01422fa>] __lockdep_acquire+0xa58/0xd8e
[<c0142b97>] lockdep_acquire+0x73/0x88
[<c03ca378>] __mutex_lock_slowpath+0xb3/0x496
[<c03ca763>] mutex_lock+0x8/0xa
[<c0333aa0>] snd_seq_device_new+0x96/0x111
[<c0358260>] snd_emux_init_seq_oss+0x35/0x9c
[<c0353f50>] snd_emux_register+0x10d/0x13f
[<c0352c39>] snd_emu10k1_synth_new_device+0xe7/0x14e
[<c0333537>] init_device+0x2c/0x94
[<c0333d04>] snd_seq_device_register_driver+0x8f/0xeb
[<c05911e0>] alsa_emu10k1_synth_init+0x22/0x24
[<c01003cb>] init+0x12b/0x2f5
[<c0101005>] kernel_thread_helper+0x5/0xb

If more info needed, feel free to ask.

regards,
--
Jiri Slaby www.fi.muni.cz/~xslaby
\_.-^-._ ***@gmail.com _.-^-._/
B67499670407CE62ACC8 22A032CC55C339D47A7E
Arjan van de Ven
2006-05-30 11:06:28 UTC
Permalink
On Tue, 2006-05-30 at 12:47 +0159, Jiri Slaby wrote:

(I've turned your backtrace upside down to show it in chronological order)

[<c05911e0>] alsa_emu10k1_synth_init+0x22/0x24
[<c0333d04>] snd_seq_device_register_driver+0x8f/0xeb

this one does:

mutex_lock(&ops->reg_mutex);
...
list_for_each(head, &ops->dev_list) {
struct snd_seq_device *dev = list_entry(head, struct snd_seq_device, list);
init_device(dev, ops);
}
mutex_unlock(&ops->reg_mutex);

[<c0333537>] init_device+0x2c/0x94
which calls into the driver
[<c0352c39>] snd_emu10k1_synth_new_device+0xe7/0x14e
[<c0353f50>] snd_emux_register+0x10d/0x13f
[<c0358260>] snd_emux_init_seq_oss+0x35/0x9c
[<c0333aa0>] snd_seq_device_new+0x96/0x111

and this one does
mutex_lock(&ops->reg_mutex);
list_add_tail(&dev->list, &ops->dev_list);
ops->num_devices++;
mutex_unlock(&ops->reg_mutex);


so... at first sight this looks like a real deadlock,
unless the ALSA folks can tell me why "ops" is always different,
and what the lock ordering rules between those are...
Takashi Iwai
2006-05-30 12:44:05 UTC
Permalink
At Tue, 30 May 2006 13:06:28 +0200,
Post by Arjan van de Ven
(I've turned your backtrace upside down to show it in chronological order)
[<c05911e0>] alsa_emu10k1_synth_init+0x22/0x24
[<c0333d04>] snd_seq_device_register_driver+0x8f/0xeb
mutex_lock(&ops->reg_mutex);
...
list_for_each(head, &ops->dev_list) {
struct snd_seq_device *dev = list_entry(head, struct snd_seq_device, list);
init_device(dev, ops);
}
mutex_unlock(&ops->reg_mutex);
[<c0333537>] init_device+0x2c/0x94
which calls into the driver
[<c0352c39>] snd_emu10k1_synth_new_device+0xe7/0x14e
[<c0353f50>] snd_emux_register+0x10d/0x13f
[<c0358260>] snd_emux_init_seq_oss+0x35/0x9c
[<c0333aa0>] snd_seq_device_new+0x96/0x111
and this one does
mutex_lock(&ops->reg_mutex);
list_add_tail(&dev->list, &ops->dev_list);
ops->num_devices++;
mutex_unlock(&ops->reg_mutex);
so... at first sight this looks like a real deadlock,
unless the ALSA folks can tell me why "ops" is always different,
and what the lock ordering rules between those are...
This ops is a unique object assigned to a different "id" string.

The first snd_seq_device_register_driver() called from emu10k1_synth.c
is the registration for the id "snd-synth-emu10k1".
Then in init_device(), the corresponding devices are initialized, and
one callback registers again another device for OSS sequencer with a
different id "snd-seq-oss" via snd_seq_device_new() inside the lock.
Now it hits the lock-detector but the lock should belong to a
different ops object in practice.

This nested lock may happen only in two drivers, emu10k1-synth and
opl3, and only together with OSS emulation. Since the OSS emulation
layer doesn't do active registration by itself, no deadlock should
happen (in theory -- I may be overlooking something :)


Takashi
Arjan van de Ven
2006-05-30 12:59:06 UTC
Permalink
Post by Takashi Iwai
This ops is a unique object assigned to a different "id" string.
The first snd_seq_device_register_driver() called from emu10k1_synth.c
is the registration for the id "snd-synth-emu10k1".
Then in init_device(), the corresponding devices are initialized, and
one callback registers again another device for OSS sequencer with a
different id "snd-seq-oss" via snd_seq_device_new() inside the lock.
Now it hits the lock-detector but the lock should belong to a
different ops object in practice.
This nested lock may happen only in two drivers, emu10k1-synth and
opl3, and only together with OSS emulation. Since the OSS emulation
layer doesn't do active registration by itself, no deadlock should
happen (in theory -- I may be overlooking something :)
ok fair enough

Jiri, can you test the patch below? (I don't have this hardware)

The ops structure has complex locking rules: not all ops are equal, and
for some complex sound cards some ops are subordinate to others. For
lockdep checking, this requires that each individual reg_mutex be
considered separately for its locking rules.

Signed-off-by: Arjan van de Ven <***@linux.intel.com>

---
sound/core/seq/seq_device.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux-2.6.17-rc4-mm3-lockdep/sound/core/seq/seq_device.c
===================================================================
--- linux-2.6.17-rc4-mm3-lockdep.orig/sound/core/seq/seq_device.c
+++ linux-2.6.17-rc4-mm3-lockdep/sound/core/seq/seq_device.c
@@ -46,6 +46,7 @@
#include <linux/kmod.h>
#include <linux/slab.h>
#include <linux/mutex.h>
+#include <linux/lockdep.h>

MODULE_AUTHOR("Takashi Iwai <***@suse.de>");
MODULE_DESCRIPTION("ALSA sequencer device management");
@@ -73,6 +74,8 @@ struct ops_list {
struct mutex reg_mutex;

struct list_head list; /* next driver */
+
+ struct lockdep_type_key reg_mutex_key;
};


@@ -379,7 +382,7 @@ static struct ops_list * create_driver(c

/* set up driver entry */
strlcpy(ops->id, id, sizeof(ops->id));
- mutex_init(&ops->reg_mutex);
+ mutex_init_key(&ops->reg_mutex, id, &ops->reg_mutex_key);
ops->driver = DRIVER_EMPTY;
INIT_LIST_HEAD(&ops->dev_list);
/* lock this instance */
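
To illustrate why a per-instance key silences the report, here is a toy userspace model. It is only a sketch: every toy_* name is hypothetical and it is not the kernel's lockdep code. It assumes a validator that tracks held locks by type key rather than by address, so two different ops objects sharing one key look like recursive locking, while ops objects carrying their own key do not.

```c
#include <stddef.h>

#define TOY_MAX_HELD 8

/* Hypothetical stand-in for a lock type key; only its address matters. */
struct toy_key { char unused; };

struct toy_lock {
	const struct toy_key *key;	/* which lock type this lock belongs to */
};

/* Keys of the locks the (single) toy task currently holds. */
static const struct toy_key *held[TOY_MAX_HELD];
static size_t nr_held;

/*
 * Returns 0 on success, -1 if a lock of the same type is already held
 * (what the toy validator reports as a possible deadlock), and -2 if
 * the held-lock table is full.
 */
int toy_acquire(const struct toy_lock *l)
{
	size_t i;

	for (i = 0; i < nr_held; i++)
		if (held[i] == l->key)
			return -1;
	if (nr_held == TOY_MAX_HELD)
		return -2;
	held[nr_held++] = l->key;
	return 0;
}

/* Drops the most recently recorded entry that matches the lock's key. */
void toy_release(const struct toy_lock *l)
{
	size_t i;

	for (i = nr_held; i-- > 0; ) {
		if (held[i] == l->key) {
			held[i] = held[--nr_held];
			return;
		}
	}
}

struct toy_ops {
	struct toy_lock reg_mutex;
	struct toy_key reg_mutex_key;	/* per-instance key, as in the patch */
};

/* One key for every ops object: the situation before the patch. */
static struct toy_key shared_key;

void toy_ops_init_shared(struct toy_ops *ops)
{
	ops->reg_mutex.key = &shared_key;
}

/* Each ops object is its own lock type: the situation after the patch. */
void toy_ops_init_own(struct toy_ops *ops)
{
	ops->reg_mutex.key = &ops->reg_mutex_key;
}
```

With toy_ops_init_shared(), taking the reg_mutex of a second ops object while the first is held returns -1; with toy_ops_init_own() the same nesting is accepted. Whether the nesting is actually safe is still something the code has to guarantee, as discussed above.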
Jiri Slaby
2006-05-30 13:09:58 UTC
Permalink
Post by Arjan van de Ven
Post by Takashi Iwai
This ops is a unique object assigned to a different "id" string.
The first snd_seq_device_register_driver() called from emu10k1_synth.c
is the registration for the id "snd-synth-emu10k1".
Then in init_device(), the corresponding devices are initialized, and
one callback registers again another device for OSS sequencer with a
different id "snd-seq-oss" via snd_seq_device_new() inside the lock.
Now it hits the lock-detector but the lock should belong to a
different ops object in practice.
This nested lock may happen only in two drivers, emu10k1-synth and
opl3, and only together with OSS emulation. Since the OSS emulation
layer doesn't do active registration by itself, no deadlock should
happen (in theory -- I may be overlooking something :)
ok fair enough
Jiri, can you test the patch below? (I don't have this hardware)
Sure, but not until the day after tomorrow; I am leaving that machine now.
Post by Arjan van de Ven
The ops structure has complex locking rules: not all ops are equal, and
for some complex sound cards some ops are subordinate to others. For
lockdep checking, this requires that each individual reg_mutex be
considered separately for its locking rules.
---
sound/core/seq/seq_device.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Index: linux-2.6.17-rc4-mm3-lockdep/sound/core/seq/seq_device.c
===================================================================
--- linux-2.6.17-rc4-mm3-lockdep.orig/sound/core/seq/seq_device.c
+++ linux-2.6.17-rc4-mm3-lockdep/sound/core/seq/seq_device.c
@@ -46,6 +46,7 @@
#include <linux/kmod.h>
#include <linux/slab.h>
#include <linux/mutex.h>
+#include <linux/lockdep.h>
MODULE_DESCRIPTION("ALSA sequencer device management");
@@ -73,6 +74,8 @@ struct ops_list {
struct mutex reg_mutex;
struct list_head list; /* next driver */
+
+ struct lockdep_type_key reg_mutex_key;
};
@@ -379,7 +382,7 @@ static struct ops_list * create_driver(c
/* set up driver entry */
strlcpy(ops->id, id, sizeof(ops->id));
- mutex_init(&ops->reg_mutex);
+ mutex_init_key(&ops->reg_mutex, id, &ops->reg_mutex_key);
ops->driver = DRIVER_EMPTY;
INIT_LIST_HEAD(&ops->dev_list);
/* lock this instance */
--
Jiri Slaby www.fi.muni.cz/~xslaby
\_.-^-._ ***@gmail.com _.-^-._/
B67499670407CE62ACC8 22A032CC55C339D47A7E
Jiri Slaby
2006-05-30 11:02:27 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
BUG: warning at /l/latest/xxx/kernel/softirq.c:86/local_bh_disable()
[<c0103e66>] show_trace+0x1b/0x1d
[<c01045a4>] dump_stack+0x26/0x28
[<c012708f>] local_bh_disable+0x53/0x55
[<c0399fd6>] _write_lock_bh+0x10/0x15
[<c034e314>] netlink_table_grab+0x12/0xe9
[<c034e6f6>] netlink_insert+0x2a/0x156
[<c034fa46>] netlink_kernel_create+0xad/0x143
[<c051f869>] rtnetlink_init+0x70/0xc7
[<c051fb9f>] netlink_proto_init+0x187/0x192
[<c01003cb>] init+0x12b/0x2f1
[<c0101005>] kernel_thread_helper+0x5/0xb

If more info needed, feel free to ask.

regards,
--
Jiri Slaby www.fi.muni.cz/~xslaby
\_.-^-._ ***@gmail.com _.-^-._/
B67499670407CE62ACC8 22A032CC55C339D47A7E
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Ingo Molnar
2006-05-30 11:55:45 UTC
Permalink
Post by Jiri Slaby
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
BUG: warning at /l/latest/xxx/kernel/softirq.c:86/local_bh_disable()
ok, that WARN_ON is over-eager. Fix is below:

--------------
Subject: lock validator: remove softirq.c WARN_ON
From: Ingo Molnar <***@elte.hu>

there is nothing wrong with calling local_bh_disable() in an irqs-off
section (as long as the local_bh_enable() isn't done with irqs off),
so remove this over-eager WARN_ON().

Signed-off-by: Ingo Molnar <***@elte.hu>
Signed-off-by: Arjan van de Ven <***@linux.intel.com>
---
kernel/softirq.c | 1 -
1 file changed, 1 deletion(-)

Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -83,7 +83,6 @@ static void __local_bh_disable(unsigned

void local_bh_disable(void)
{
- WARN_ON_ONCE(irqs_disabled());
__local_bh_disable((unsigned long)__builtin_return_address(0));
}

Alexey Kuznetsov
2006-05-30 16:00:24 UTC
Permalink
Hello!
Nevertheless, I cannot figure out what's happening here.

This local_bh_disable() is called right after schedule().
No way irqs can be disabled there. What is wrong?


static void netlink_table_grab(void)
{
write_lock_bh(&nl_table_lock);

if (atomic_read(&nl_table_users)) {
DECLARE_WAITQUEUE(wait, current);

add_wait_queue_exclusive(&nl_table_wait, &wait);
for(;;) {
set_current_state(TASK_UNINTERRUPTIBLE);
if (atomic_read(&nl_table_users) == 0)
break;
write_unlock_bh(&nl_table_lock);
schedule();
write_lock_bh(&nl_table_lock);
}

__set_current_state(TASK_RUNNING);
remove_wait_queue(&nl_table_wait, &wait);
}
}


Alexey
Arjan van de Ven
2006-05-30 16:05:30 UTC
Permalink
Post by Alexey Kuznetsov
Hello!
Nevertheless, I cannot figure out what's happening here.
This local_bh_disable() is called right after schedule().
No way irqs can be disabled there. What is wrong?
static void netlink_table_grab(void)
{
write_lock_bh(&nl_table_lock);
well it could be this one as well...
Post by Alexey Kuznetsov
if (atomic_read(&nl_table_users)) {
Alexey Kuznetsov
2006-05-30 16:15:30 UTC
Permalink
Hello!
Post by Arjan van de Ven
Post by Alexey Kuznetsov
static void netlink_table_grab(void)
{
write_lock_bh(&nl_table_lock);
well it could be this one as well...
Indeed.

But it still looks very strange.
There are some GFP_KERNEL allocations on the way to this function.
Ingo Molnar
2006-05-30 11:11:38 UTC
Permalink
Subject: lock validator, fix NULL type->name bug
From: Ingo Molnar <***@elte.hu>

this should fix the bug reported by Mike Galbraith: pass in a non-NULL
mutex name string even if DEBUG_MUTEXES is turned off.

Signed-off-by: Ingo Molnar <***@elte.hu>
---
include/linux/mutex.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/include/linux/mutex.h
===================================================================
--- linux.orig/include/linux/mutex.h
+++ linux/include/linux/mutex.h
@@ -80,7 +80,7 @@ struct mutex_waiter {
do { \
static struct lockdep_type_key __key; \
\
- __mutex_init((mutex), NULL, &__key); \
+ __mutex_init((mutex), #mutex, &__key); \
} while (0)
# define mutex_destroy(mutex) do { } while (0)
#endif
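
The macro shape in this fix can be tried out in userspace. The sketch below is an assumption-laden toy, not the kernel header: the toy_* types and functions are hypothetical stand-ins. It shows the two things the macro relies on: the preprocessor's `#` operator turns the argument expression into a string, and a `static` object declared inside the macro body gives one key per expansion site.

```c
/* Hypothetical stand-in for the lock type key; only its address matters. */
struct toy_key { char unused; };

struct toy_mutex {
	const char *name;		/* name a validator could print in reports */
	const struct toy_key *key;	/* identifies the lock type */
};

void toy_mutex_init_named(struct toy_mutex *m, const char *name,
			  const struct toy_key *key)
{
	m->name = name;
	m->key = key;
}

/*
 * One static key per expansion site, and the argument expression
 * stringified with '#', so the name is never NULL.
 */
#define toy_mutex_init(m)					\
do {								\
	static struct toy_key site_key;				\
								\
	toy_mutex_init_named((m), #m, &site_key);		\
} while (0)

struct toy_inode {
	struct toy_mutex lock;
};

/* Every inode initialised here shares one key, hence one lock type. */
void toy_inode_init(struct toy_inode *ni)
{
	toy_mutex_init(&ni->lock);
}
```

Two inodes initialised through toy_inode_init() end up with the same key and the name "&ni->lock", which is the form the names take in the reports in this thread; a mutex initialised at a different site gets a different key.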
Mike Galbraith
2006-05-30 11:58:46 UTC
Permalink
Post by Ingo Molnar
Subject: lock validator, fix NULL type->name bug
this should fix the bug reported by Mike Galbraith: pass in a non-NULL
mutex name string even if DEBUG_MUTEXES is turned off.
Well, yes and no. It cures the oops, and it almost boots. It passes
all tests, and gets to where we start mounting things...

kjournald starting. Commit interval 5 seconds
EXT3 FS on hdc1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.

=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
mount/2545 is trying to acquire lock:
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa

...and deadlocks.

I'll try to find out what it hates.

-Mike
Ingo Molnar
2006-05-30 12:02:01 UTC
Permalink
Post by Mike Galbraith
Post by Ingo Molnar
Subject: lock validator, fix NULL type->name bug
this should fix the bug reported by Mike Galbraith: pass in a non-NULL
mutex name string even if DEBUG_MUTEXES is turned off.
Well, yes and no. It cures the oops, and it almost boots. It passes
all tests, and gets to where we start mounting things...
kjournald starting. Commit interval 5 seconds
EXT3 FS on hdc1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa
...and deadlocks.
hm, and no other messages? Are you using serial logging?

Ingo
Mike Galbraith
2006-05-30 12:06:17 UTC
Permalink
Post by Ingo Molnar
Post by Mike Galbraith
Post by Ingo Molnar
Subject: lock validator, fix NULL type->name bug
this should fix the bug reported by Mike Galbraith: pass in a non-NULL
mutex name string even if DEBUG_MUTEXES is turned off.
Well, yes and no. It cures the oops, and it almost boots. It passes
all tests, and gets to where we start mounting things...
kjournald starting. Commit interval 5 seconds
EXT3 FS on hdc1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa
...and deadlocks.
hm, and no other messages? Are you using serial logging?
nada. Yes, serial console.

-Mike
Mike Galbraith
2006-05-30 12:05:25 UTC
Permalink
Post by Ingo Molnar
=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa
...and deadlocks.
I'll try to find out what it hates.
It hates NTFS.

-Mike
Ingo Molnar
2006-05-30 12:06:41 UTC
Permalink
Post by Mike Galbraith
Post by Ingo Molnar
=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa
...and deadlocks.
I'll try to find out what it hates.
It hates NTFS.
i'd still love to figure out what's going on here.

hmm ... do you have the NMI watchdog enabled? Could you try with
nmi_watchdog=0?

Ingo
Mike Galbraith
2006-05-30 12:17:02 UTC
Permalink
Post by Ingo Molnar
Post by Mike Galbraith
Post by Ingo Molnar
=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa
...and deadlocks.
I'll try to find out what it hates.
It hates NTFS.
i'd still love to figure out what's going on here.
hmm ... do you have the NMI watchdog enabled? Could you try with
nmi_watchdog=0?
I have nmi_watchdog=1. I'll reboot with 0 and see if it'll trigger.

I found a warning.

ip6_tables: (C) 2000-2006 Netfilter Core Team
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
ip_conntrack version 2.4 (8191 buckets, 65528 max) - 228 bytes per conntrack
BUG: warning at kernel/lockdep.c:2398/check_flags()
<b1003dd2> show_trace+0xd/0xf <b10044c0> dump_stack+0x17/0x19
<b103ae46> check_flags+0x26e/0x273 <b103da2c> lockdep_release+0x1e/0x3e6
<b13d2d10> _spin_unlock_irq+0x16/0x3b <b13d21aa> rwsem_down_read_failed+0x64/0x1d0
<b10a76fa> .text.lock.task_mmu+0x3d/0x63 <b10a8b6b> proc_pid_follow_link+0x2b/0x3a
<b1081a9b> __link_path_walk+0xc1d/0xe98 <b1081d5c> link_path_walk+0x46/0xc6
<b1082057> do_path_lookup+0x10f/0x281 <b108298a> __user_walk_fd+0x32/0x45
<b107b81a> vfs_stat_fd+0x1b/0x41 <b107b8cd> vfs_stat+0x11/0x13
<b107b8e3> sys_stat64+0x14/0x28 <b13d3043> syscall_call+0x7/0xb
irq event stamp: 3101
hardirqs last enabled at (3101): [<b13d3081>] restore_nocheck+0x8/0xb
hardirqs last disabled at (3100): [<b13d27b5>] _spin_lock_irq+0xf/0x48
softirqs last enabled at (3094): [<b1028ae7>] __do_softirq+0xe4/0xf5
softirqs last disabled at (3085): [<b10056f4>] do_softirq+0x5a/0xca
bt878: AUDIO driver version 0.0.0 loaded
bt878: Bt878 AUDIO function found (0).
ACPI: PCI Interrupt 0000:02:02.1[A] -> GSI 18 (level, low) -> IRQ 17
bt878_probe: card id=[0x1c11bd],[ Pinnacle PCTV Sat ] has DVB functions.
bt878(0): Bt878 (rev 17) at 02:02.1, <6>input: i2c IR (Hauppauge) as /class/input/input2
ir-kbd-i2c: i2c IR (Hauppauge) detected at i2c-0/0-001a/ir0 [bt878 #0 [hw]]
irq: 17, latency: 32, memory: 0xea101000
saa7130/34: v4l2 driver version 0.2.14 loaded
Ingo Molnar
2006-05-30 12:19:52 UTC
Permalink
Post by Mike Galbraith
I have nmi_watchdog=1. I'll reboot with 0 and see if it'll trigger.
I found a warning.
BUG: warning at kernel/lockdep.c:2398/check_flags()
this one could be related to NMI. We are already disabling NMI on
x86_64, but i thought i had it fixed up for i386 - apparently not.

Ingo
Mike Galbraith
2006-05-30 12:28:18 UTC
Permalink
Post by Ingo Molnar
Post by Mike Galbraith
I have nmi_watchdog=1. I'll reboot with 0 and see if it'll trigger.
I found a warning.
BUG: warning at kernel/lockdep.c:2398/check_flags()
this one could be related to NMI. We are already disabling NMI on
x86_64, but i thought i had it fixed up for i386 - apparently not.
Booted with nmi_watchdog=0, no warning and no deadlock. It produced
fruit for NTFS.

=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
mount/2545 is trying to acquire lock:
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa

but task is already holding lock:
(&rl->lock){----}, at: [<b1165306>] ntfs_map_runlist+0x14/0xa7

which lock already depends on the new lock,
which could lead to circular deadlocks!

the existing dependency chain (in reverse order) is:

-> #1 (&rl->lock){----}:
[<b103d9f8>] lockdep_acquire+0x61/0x77
[<b11613ae>] ntfs_readpage+0x92c/0xb53
[<b10540c8>] read_cache_page+0x95/0x15a
[<b1174b0e>] map_mft_record+0xda/0x28a
[<b117187f>] ntfs_read_locked_inode+0x5d/0x1559
[<b1174212>] ntfs_read_inode_mount+0x572/0xb30
[<b1183f8c>] ntfs_fill_super+0xc9e/0x1467
[<b1078ac2>] get_sb_bdev+0xee/0x141
[<b117eff5>] ntfs_get_sb+0x1a/0x20
[<b107880c>] vfs_kern_mount+0x9a/0x166
[<b1078920>] do_kern_mount+0x30/0x43
[<b108ea7f>] do_mount+0x464/0x7ba
[<b108ee44>] sys_mount+0x6f/0xa4
[<b13d3043>] syscall_call+0x7/0xb

-> #0 (&ni->mrec_lock){--..}:
[<b103d9f8>] lockdep_acquire+0x61/0x77
[<b13d14a5>] __mutex_lock_slowpath+0x49/0xff
[<b13d1563>] mutex_lock+0x8/0xa
[<b1174a51>] map_mft_record+0x1d/0x28a
[<b1164b77>] ntfs_map_runlist_nolock+0x378/0x4a6
[<b1165360>] ntfs_map_runlist+0x6e/0xa7
[<b1161375>] ntfs_readpage+0x8f3/0xb53
[<b10540c8>] read_cache_page+0x95/0x15a
[<b11806e5>] load_system_files+0x1e3/0x1e5c
[<b1183fec>] ntfs_fill_super+0xcfe/0x1467
[<b1078ac2>] get_sb_bdev+0xee/0x141
[<b117eff5>] ntfs_get_sb+0x1a/0x20
[<b107880c>] vfs_kern_mount+0x9a/0x166
[<b1078920>] do_kern_mount+0x30/0x43
[<b108ea7f>] do_mount+0x464/0x7ba
[<b108ee44>] sys_mount+0x6f/0xa4
[<b13d3043>] syscall_call+0x7/0xb

other info that might help us debug this:

2 locks held by mount/2545:
#0: (&s->s_umount){----}, at: [<b10782db>] sget+0x1d9/0x3bd
#1: (&rl->lock){----}, at: [<b1165306>] ntfs_map_runlist+0x14/0xa7

stack backtrace:
<b1003dd2> show_trace+0xd/0xf <b10044c0> dump_stack+0x17/0x19
<b103c9ca> print_circular_bug_tail+0x5d/0x66 <b103d145> __lockdep_acquire+0x772/0xc32
<b103d9f8> lockdep_acquire+0x61/0x77 <b13d14a5> __mutex_lock_slowpath+0x49/0xff
<b13d1563> mutex_lock+0x8/0xa <b1174a51> map_mft_record+0x1d/0x28a
<b1164b77> ntfs_map_runlist_nolock+0x378/0x4a6 <b1165360> ntfs_map_runlist+0x6e/0xa7
<b1161375> ntfs_readpage+0x8f3/0xb53 <b10540c8> read_cache_page+0x95/0x15a
<b11806e5> load_system_files+0x1e3/0x1e5c <b1183fec> ntfs_fill_super+0xcfe/0x1467
<b1078ac2> get_sb_bdev+0xee/0x141 <b117eff5> ntfs_get_sb+0x1a/0x20
<b107880c> vfs_kern_mount+0x9a/0x166 <b1078920> do_kern_mount+0x30/0x43
<b108ea7f> do_mount+0x464/0x7ba <b108ee44> sys_mount+0x6f/0xa4
<b13d3043> syscall_call+0x7/0xb
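
As the report above reads, chain #1 records &rl->lock being taken inside map_mft_record() (which takes &ni->mrec_lock first), and chain #0 is the attempt to take &ni->mrec_lock while &rl->lock is held. The sketch below is a toy model of that kind of check, not the algorithm in kernel/lockdep.c; the toy_* names and the fixed-size table are assumptions made for illustration. It records "B taken while A held" edges between lock types and refuses an edge that would close a cycle.

```c
#include <stdbool.h>

#define TOY_NR_TYPES 4

/* Lock types from the report, used as indices into the toy graph. */
enum { TOY_MREC_LOCK = 0, TOY_RL_LOCK = 1 };

/* dep[a][b] is true once type b has been taken while type a was held. */
static bool dep[TOY_NR_TYPES][TOY_NR_TYPES];

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool toy_reachable(int from, int to, bool *seen)
{
	int next;

	if (from == to)
		return true;
	seen[from] = true;
	for (next = 0; next < TOY_NR_TYPES; next++)
		if (dep[from][next] && !seen[next] &&
		    toy_reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "new_type acquired while held_type is held".
 * Returns false, and records nothing, if held_type is already reachable
 * from new_type, i.e. if the new edge would close a cycle.
 */
bool toy_add_dependency(int held_type, int new_type)
{
	bool seen[TOY_NR_TYPES] = { false };

	if (toy_reachable(new_type, held_type, seen))
		return false;
	dep[held_type][new_type] = true;
	return true;
}
```

Recording the mrec_lock -> rl->lock edge first and then attempting rl->lock -> mrec_lock makes the second call return false, which corresponds to the "possible circular locking" condition in the report. Like the validator, the toy flags the ordering as soon as both orders have been observed, without an actual deadlock having to occur.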
Ingo Molnar
2006-05-30 12:29:50 UTC
Permalink
Post by Mike Galbraith
Post by Ingo Molnar
Post by Mike Galbraith
BUG: warning at kernel/lockdep.c:2398/check_flags()
this one could be related to NMI. We are already disabling NMI on
x86_64, but i thought i had it fixed up for i386 - apparently not.
Booted with nmi_watchdog=0, no warning and no deadlock.
ok, great. The patch below turns off NMI on i386 automatically.

-------------------
Subject: lock validator: disable NMI watchdog if CONFIG_LOCKDEP, i386
From: Ingo Molnar <***@elte.hu>

The NMI watchdog uses spinlocks (notifier chains, etc.),
so it's not lockdep-safe at the moment.

Signed-off-by: Ingo Molnar <***@elte.hu>
---
arch/i386/kernel/nmi.c | 11 +++++++++++
1 file changed, 11 insertions(+)

Index: linux/arch/i386/kernel/nmi.c
===================================================================
--- linux.orig/arch/i386/kernel/nmi.c
+++ linux/arch/i386/kernel/nmi.c
@@ -741,6 +741,17 @@ static void stop_intel_arch_watchdog(voi

void setup_apic_nmi_watchdog (void *unused)
{
+#ifdef CONFIG_LOCKDEP
+ /*
+ * The NMI watchdog uses spinlocks (notifier chains, etc.),
+ * so it's not lockdep-safe:
+ */
+ nmi_watchdog = NMI_NONE;
+ printk("lockdep: disabled NMI watchdog.\n");
+
+ return;
+#endif
+
/* only support LOCAL and IO APICs for now */
if ((nmi_watchdog != NMI_LOCAL_APIC) &&
(nmi_watchdog != NMI_IO_APIC))
Ingo Molnar
2006-05-30 12:34:15 UTC
Permalink
and NMI disabling wasn't perfect on x86_64 either. (we did it too late,
which allowed a few NMI ticks to still occur.)

---------------
Subject: lock validator: fix NMI-disabling on x86_64
From: Ingo Molnar <***@elte.hu>

this does the NMI-watchdog disabling at the right place on x86_64.

should probably be folded into:

lock-validator-disable-nmi-watchdog-if-config_lockdep.patch

Signed-off-by: Ingo Molnar <***@elte.hu>
---
arch/x86_64/kernel/nmi.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)

Index: linux/arch/x86_64/kernel/nmi.c
===================================================================
--- linux.orig/arch/x86_64/kernel/nmi.c
+++ linux/arch/x86_64/kernel/nmi.c
@@ -215,18 +215,6 @@ int __init check_nmi_watchdog (void)
int *counts;
int cpu;

-#ifdef CONFIG_LOCKDEP
- /*
- * The NMI watchdog uses spinlocks (notifier chains, etc.),
- * so it's not lockdep-safe:
- */
- nmi_watchdog = 0;
- for_each_online_cpu(cpu)
- per_cpu(nmi_watchdog_ctlblk.enabled, cpu) = 0;
-
- printk("lockdep: disabled NMI watchdog.\n");
- return 0;
-#endif
if ((nmi_watchdog == NMI_NONE) || (nmi_watchdog == NMI_DEFAULT))
return 0;

@@ -680,6 +668,17 @@ static void stop_intel_arch_watchdog(voi

void setup_apic_nmi_watchdog(void *unused)
{
+#ifdef CONFIG_LOCKDEP
+ /*
+ * The NMI watchdog uses spinlocks (notifier chains, etc.),
+ * so it's not lockdep-safe:
+ */
+ nmi_watchdog = NMI_NONE;
+ printk("lockdep: disabled NMI watchdog.\n");
+
+ return;
+#endif
+
/* only support LOCAL and IO APICs for now */
if ((nmi_watchdog != NMI_LOCAL_APIC) &&
(nmi_watchdog != NMI_IO_APIC))
Mike Galbraith
2006-05-30 12:44:24 UTC
Permalink
Post by Ingo Molnar
Post by Mike Galbraith
Post by Ingo Molnar
Post by Mike Galbraith
BUG: warning at kernel/lockdep.c:2398/check_flags()
this one could be related to NMI. We are already disabling NMI on
x86_64, but i thought i had it fixed up for i386 - apparently not.
Booted with nmi_watchdog=0, no warning and no deadlock.
ok, great. The patch below turns off NMI on i386 automatically.
All is well. Back to nmi_watchdog=1, no warning, no lock.

-Mike
Andi Kleen
2006-05-30 19:14:54 UTC
Permalink
Post by Ingo Molnar
The NMI watchdog uses spinlocks (notifier chains, etc.),
so it's not lockdep-safe at the moment.
That's totally unsafe even without lockdep and should be fixed
instead. I guess someone bungled the notifier chain conversion.
The NMI notifiers need to be lockless.

-Andi
Ingo Molnar
2006-05-30 19:47:48 UTC
Permalink
Post by Andi Kleen
Post by Ingo Molnar
The NMI watchdog uses spinlocks (notifier chains, etc.),
so it's not lockdep-safe at the moment.
That's totally unsafe even without lockdep and should be fixed
instead. I guess someone bungled the notifier chain conversion. The
NMI notifiers need to be lockless.
yeah, totally agreed, they need to be raw notifiers. Haven't had time to
investigate it in detail yet - i went for the easier hack of disabling
NMIs while lockdep is enabled.

Here's the kernel trace of it happening on x86_64:

<...>-417 0D... 2983us : __lockdep_acquire (ffffffff81a5cb18 0 0)
<...>-417 0D... 2983us : __lockdep_acquire (0 0 0)
<...>-417 0D... 2984us : do_nmi (nmi)
<...>-417 0D.h. 2985us : default_do_nmi (do_nmi)
<...>-417 0D.h. 2986us : atomic_notifier_call_chain (default_do_nmi)
<...>-417 0D.h. 2986us : notifier_call_chain (atomic_notifier_call_chain)
<...>-417 0D.h. 2987us : nmi_watchdog_tick (default_do_nmi)
<...>-417 0D.h. 2987us : atomic_notifier_call_chain (nmi_watchdog_tick)
<...>-417 0D.h. 2987us : notifier_call_chain (atomic_notifier_call_chain)
<...>-417 0D... 2988us : trace_hardirqs_off (trace_hardirqs_off_thunk)
<...>-417 0D... 2989us : __lockdep_acquire (1 1 0)
<...>-417 0D... 2989us : mark_lock (__lockdep_acquire)
<...>-417 0D... 2989us : mark_lock (__lockdep_acquire)
<...>-417 0D... 2989us+: mark_lock (__lockdep_acquire)
<...>-417 0D... 2991us : check_chain_key (__lockdep_acquire)
<...>-417 0.... 2992us : _raw_spin_lock (_spin_lock)
<...>-417 0.... 2992us : _spin_lock (dput)

that shouldn't be an atomic_notifier but a raw_notifier.

Ingo
Ingo Molnar
2006-05-30 20:05:09 UTC
Permalink
Post by Ingo Molnar
yeah, totally agreed, they need to be raw notifiers. Haven't had time
to investigate it in detail yet - i went for the easier hack of
disabling NMIs while lockdep is enabled.
hm ... atomic_notifier_call_chain ought to be fine - it uses
rcu_read_lock(), which uses preempt_disable(), which is NMI-safe.

so i think this NMI problem might be lockdep-specific. I think it might
be the NMI iret that confuses lockdep. (and irqflags-trace in
particular)

Ingo
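
As an illustration of why a lock-free read side is safe to run from NMI context, here is a minimal userspace sketch in C11. It is an analogy only, not the kernel's RCU-based atomic notifier code: `struct notifier`, `chain_register()`, `chain_call()` and `count_event()` are hypothetical names, and writers are assumed to be serialised by some external means.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace analogue of a notifier block (not the kernel struct). */
struct notifier {
    int (*call)(unsigned long event, void *data);
    _Atomic(struct notifier *) next;
};

static _Atomic(struct notifier *) chain_head;

/* Writer side: publish a fully initialised node at the head of the chain.
 * Assumption: callers serialise registration themselves. */
static void chain_register(struct notifier *n)
{
    struct notifier *old =
        atomic_load_explicit(&chain_head, memory_order_relaxed);

    atomic_store_explicit(&n->next, old, memory_order_relaxed);
    /* Release store: the node's fields are visible before the node is. */
    atomic_store_explicit(&chain_head, n, memory_order_release);
}

/* Reader side: takes no lock, so it cannot deadlock against code it has
 * interrupted. Returns the number of callbacks invoked. */
static int chain_call(unsigned long event, void *data)
{
    int calls = 0;
    struct notifier *n =
        atomic_load_explicit(&chain_head, memory_order_acquire);

    while (n) {
        n->call(event, data);
        calls++;
        n = atomic_load_explicit(&n->next, memory_order_acquire);
    }
    return calls;
}

/* Example callback: counts how many times it was notified. */
static int count_event(unsigned long event, void *data)
{
    (void)event;
    (*(int *)data)++;
    return 0;
}
```

The reader only performs pointer loads, so an NMI arriving in the middle of `chain_register()` sees either the old head or the new one, never a half-built node. Unregistration is left out because it needs a grace-period mechanism, which is what RCU provides in the kernel.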
Andrew Morton
2006-05-30 19:54:47 UTC
Permalink
On 30 May 2006 21:14:54 +0200
Post by Andi Kleen
Post by Ingo Molnar
The NMI watchdog uses spinlocks (notifier chains, etc.),
so it's not lockdep-safe at the moment.
That's totally unsafe even without lockdep and should be fixed
instead. I guess someone bungled the notifier chain conversion.
The NMI notifiers need to be lockless.
Confused. NMI uses notify_die(), which doesn't take locks?

We'll probably accidentally take locks when actually reporting an NMI
watchdog timeout, but that doesn't seem terribly important.
Keith Owens
2006-05-31 04:34:10 UTC
Permalink
Post by Ingo Molnar
Post by Mike Galbraith
Post by Ingo Molnar
Post by Mike Galbraith
BUG: warning at kernel/lockdep.c:2398/check_flags()
this one could be related to NMI. We are already disabling NMI on
x86_64, but i thought i had it fixed up for i386 - apparently not.
Booted with nmi_watchdog=0, no warning and no deadlock.
ok, great. The patch below turns off NMI on i386 automatically.
-------------------
Subject: lock validator: disable NMI watchdog if CONFIG_LOCKDEP, i386
The NMI watchdog uses spinlocks (notifier chains, etc.),
so it's not lockdep-safe at the moment.
Where? Since 2.6.17-rc1 the notify_die() callback uses RCU, not
spinlocks.
Arjan van de Ven
2006-05-30 12:14:16 UTC
Permalink
Post by Mike Galbraith
Post by Ingo Molnar
=====================================================
[ BUG: possible circular locking deadlock detected! ]
-----------------------------------------------------
(&ni->mrec_lock){--..}, at: [<b13d1563>] mutex_lock+0x8/0xa
...and deadlocks.
I'll try to find out what it hates.
It hates NTFS.
hummm. NTFS does really really weird things with mutexes...
can you try to enable the mutex debugging config option to see if that
triggers anything?
Michal Piotrowski
2006-05-30 12:46:33 UTC
Permalink
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
udevd/415 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&base->lock#2){++..}, at: [<c012900a>] run_timer_softirq+0x3d/0x164
{in-hardirq-W} state was registered at:
[<c0139a56>] lockdep_acquire+0x69/0x82
[<c02f23ac>] _spin_lock_irqsave+0x2a/0x3a
[<c0129a24>] lock_timer_base+0x29/0x55
[<c0129e48>] del_timer+0x19/0x4c
[<c025925d>] ide_intr+0x13b/0x1a9
[<c014c524>] handle_IRQ_event+0x20/0x50
[<c014d48c>] handle_edge_irq+0x10a/0x14f
[<c010579c>] do_IRQ+0xa1/0xc9
irq event stamp: 351479
hardirqs last enabled at (351478): [<c02f274c>] _spin_unlock_irq+0x22/0x53
hardirqs last disabled at (351479): [<c02f2324>] _spin_lock_irq+0xc/0x35
softirqs last enabled at (351434): [<c0125873>] __do_softirq+0xea/0xf0
softirqs last disabled at (351475): [<c0105689>] do_softirq+0x59/0xcb

other info that might help us debug this:
3 locks held by udevd/415:
#0: (&inode->i_mutex/1){--..}, at: [<c018150e>] lookup_create+0x1e/0x77
#1: (inode_lock){--..}, at: [<c018bf13>] new_inode+0x27/0x8e
#2: (&base->lock#2){++..}, at: [<c012900a>] run_timer_softirq+0x3d/0x164

stack backtrace:
[<c0103e52>] show_trace_log_lvl+0x4b/0xf4
[<c01044b3>] show_trace+0xd/0x10
[<c010457b>] dump_stack+0x19/0x1b
[<c0137d63>] print_usage_bug+0x1a1/0x1ab
[<c0138458>] mark_lock+0x2d7/0x514
[<c01386dc>] mark_held_locks+0x47/0x65
[<c0139745>] trace_hardirqs_on+0x12b/0x16f
[<c02f2b91>] restore_nocheck+0x8/0xb

Here is dmesg
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-dmesg
Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Arjan van de Ven
2006-05-30 19:13:43 UTC
Permalink
Post by Ingo Molnar
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
(&base->lock#2){++..}, at: [<c012900a>] run_timer_softirq+0x3d/0x164
hhmmm curious.. you don't happen to have nmi watchdog enabled??
Michal Piotrowski
2006-05-30 15:59:11 UTC
Permalink
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
It looks like a network stack problem.

May 30 17:50:34 ltg01-fedora init: Switching to runlevel: 6
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Got SIGTERM, quitting.
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Leaving mDNS
multicast group on interface eth0.IPv4 with address 192.168.0.
14.
May 30 17:50:35 ltg01-fedora kernel:
May 30 17:50:35 ltg01-fedora kernel: ======================================
May 30 17:50:35 ltg01-fedora kernel: [ BUG: bad unlock ordering detected! ]
May 30 17:50:35 ltg01-fedora kernel: --------------------------------------
May 30 17:50:35 ltg01-fedora kernel: avahi-daemon/1878 is trying to
release lock (&in_dev->mc_list_lock) at:
May 30 17:50:35 ltg01-fedora kernel: [<c02e693b>] ip_mc_del_src+0x5e/0xd5
May 30 17:50:35 ltg01-fedora kernel: but the next lock to release is:
May 30 17:50:35 ltg01-fedora kernel: (&im->lock){-...}, at:
[<c02e6934>] ip_mc_del_src+0x57/0xd5
May 30 17:50:35 ltg01-fedora kernel:
May 30 17:50:35 ltg01-fedora kernel: other info that might help us debug this:
May 30 17:50:35 ltg01-fedora kernel: 2 locks held by avahi-daemon/1878:
May 30 17:50:35 ltg01-fedora kernel: #0: (rtnl_mutex){--..}, at:
[<c02f0b0f>] mutex_lock+0x1c/0x1f
May 30 17:50:35 ltg01-fedora kernel: #1:
(&in_dev->mc_list_lock){-.-?}, at: [<c02e6905>]
ip_mc_del_src+0x28/0xd5
May 30 17:50:35 ltg01-fedora kernel:
May 30 17:50:35 ltg01-fedora kernel: stack backtrace:
May 30 17:50:35 ltg01-fedora kernel: [<c0103e52>] show_trace_log_lvl+0x4b/0xf4
May 30 17:50:35 ltg01-fedora kernel: [<c01044b3>] show_trace+0xd/0x10
May 30 17:50:35 ltg01-fedora kernel: [<c010457b>] dump_stack+0x19/0x1b
May 30 17:50:35 ltg01-fedora kernel: [<c0139bfa>] lockdep_release+0x18b/0x350
May 30 17:50:35 ltg01-fedora kernel: [<c02f2640>] _read_unlock+0x16/0x4d
May 30 17:50:35 ltg01-fedora kernel: [<c02e693b>] ip_mc_del_src+0x5e/0xd5
May 30 17:50:35 ltg01-fedora kernel: [<c02e69de>] ip_mc_leave_src+0x2c/0x6c
May 30 17:50:35 ltg01-fedora kernel: [<c02e6c5b>] ip_mc_leave_group+0x3d/0x97
May 30 17:50:35 ltg01-fedora kernel: [<c02c8a68>] ip_setsockopt+0x4d0/0x9a6
May 30 17:50:35 ltg01-fedora kernel: [<c02def6d>] udp_setsockopt+0x1f/0x9c
May 30 17:50:35 ltg01-fedora kernel: [<c02a7006>]
sock_common_setsockopt+0x13/0x18
May 30 17:50:35 ltg01-fedora kernel: [<c02a5956>] sys_setsockopt+0x73/0xa4
May 30 17:50:35 ltg01-fedora kernel: [<c02a6c53>] sys_socketcall+0x148/0x186
May 30 17:50:35 ltg01-fedora kernel: [<c02f2ad5>] sysenter_past_esp+0x56/0x8d

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Arjan van de Ven
2006-05-30 16:08:49 UTC
Permalink
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
It looks like a network stack problem.
May 30 17:50:34 ltg01-fedora init: Switching to runlevel: 6
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Got SIGTERM, quitting.
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Leaving mDNS
multicast group on interface eth0.IPv4 with address 192.168.0.
14.
May 30 17:50:35 ltg01-fedora kernel: ======================================
May 30 17:50:35 ltg01-fedora kernel: [ BUG: bad unlock ordering detected! ]
May 30 17:50:35 ltg01-fedora kernel: --------------------------------------
May 30 17:50:35 ltg01-fedora kernel: avahi-daemon/1878 is trying to
does this fix it for you?



Mark out of order unlocking in igmp.c as such

Signed-off-by: Arjan van de Ven <***@linux.intel.com>
---
net/ipv4/igmp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.17-rc5-mm1-lockdep/net/ipv4/igmp.c
===================================================================
--- linux-2.6.17-rc5-mm1-lockdep.orig/net/ipv4/igmp.c
+++ linux-2.6.17-rc5-mm1-lockdep/net/ipv4/igmp.c
@@ -1472,7 +1472,7 @@ static int ip_mc_del_src(struct in_devic
return -ESRCH;
}
spin_lock_bh(&pmc->lock);
- read_unlock(&in_dev->mc_list_lock);
+ read_unlock_non_nested(&in_dev->mc_list_lock);
#ifdef CONFIG_IP_MULTICAST
sf_markstate(pmc);
#endif
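
For readers unfamiliar with the "bad unlock ordering" report, the following is a small userspace sketch of the idea, under the assumption that the validator keeps a per-task stack of held locks and expects releases in reverse order of acquisition. `struct held_locks`, `track_lock()` and `track_unlock()` are hypothetical names; the `allow_non_nested` flag stands in for what the annotation in the patch above appears to do.

```c
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical per-task record of currently held locks, newest last. */
struct held_locks {
    const void *lock[MAX_HELD];
    int depth;
};

/* Push a newly acquired lock; returns -1 if the table is full. */
static int track_lock(struct held_locks *h, const void *lock)
{
    if (h->depth >= MAX_HELD)
        return -1;
    h->lock[h->depth++] = lock;
    return 0;
}

/*
 * Returns 0 for an accepted release, 1 if the lock was held but was not
 * the most recently acquired one (bad unlock ordering), and -1 if the
 * lock was not held at all. With allow_non_nested set, an out-of-order
 * release is accepted without complaint.
 */
static int track_unlock(struct held_locks *h, const void *lock,
                        int allow_non_nested)
{
    int i, out_of_order;

    for (i = h->depth - 1; i >= 0; i--)
        if (h->lock[i] == lock)
            break;
    if (i < 0)
        return -1;

    out_of_order = (i != h->depth - 1);

    /* Close the gap left by the released entry. */
    for (; i < h->depth - 1; i++)
        h->lock[i] = h->lock[i + 1];
    h->depth--;

    return (out_of_order && !allow_non_nested) ? 1 : 0;
}
```

In ip_mc_del_src() the sequence is: take mc_list_lock, take pmc->lock, release mc_list_lock. With the sketch, the last step returns 1 unless the caller marks it as intentionally non-nested.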

Michal Piotrowski
2006-05-30 18:51:01 UTC
Permalink
Hi Arjan,
Post by Arjan van de Ven
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
It looks like a network stack problem.
May 30 17:50:34 ltg01-fedora init: Switching to runlevel: 6
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Got SIGTERM, quitting.
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Leaving mDNS
multicast group on interface eth0.IPv4 with address 192.168.0.
14.
May 30 17:50:35 ltg01-fedora kernel: ======================================
May 30 17:50:35 ltg01-fedora kernel: [ BUG: bad unlock ordering detected! ]
May 30 17:50:35 ltg01-fedora kernel: --------------------------------------
May 30 17:50:35 ltg01-fedora kernel: avahi-daemon/1878 is trying to
does this fix it for you?
Yes, thanks.
Post by Arjan van de Ven
Mark out of order unlocking in igmp.c as such
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Michal Piotrowski
2006-05-30 16:16:30 UTC
Permalink
Hi Ingo,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
Here is a small lockdep bug

May 30 18:05:08 ltg01-fedora ainit:
May 30 18:05:09 ltg01-fedora kernel: BUG: warning at
/usr/src/linux-mm/kernel/lockdep.c:1853/trace_hardirqs_on()
May 30 18:05:09 ltg01-fedora kernel: [<c0103e52>] show_trace_log_lvl+0x4b/0xf4
May 30 18:05:09 ltg01-fedora kernel: [<c01044b3>] show_trace+0xd/0x10
May 30 18:05:09 ltg01-fedora kernel: [<c010457b>] dump_stack+0x19/0x1b
May 30 18:05:09 ltg01-fedora kernel: [<c0139701>] trace_hardirqs_on+0xe7/0x16f
May 30 18:05:09 ltg01-fedora kernel: [<c02f2b91>] restore_nocheck+0x8/0xb
May 30 18:05:09 ltg01-fedora shutdown[2135]: shutting down for system reboot
May 30 18:05:09 ltg01-fedora init: Switching to runlevel: 6

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Ingo Molnar
2006-05-30 19:28:39 UTC
Permalink
Post by Michal Piotrowski
Hi Ingo,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
Here is a small lockdep bug
May 30 18:05:09 ltg01-fedora kernel: BUG: warning at
/usr/src/linux-mm/kernel/lockdep.c:1853/trace_hardirqs_on()
hm. Do you have the NMI watchdog enabled? [does /proc/interrupts show
any increasing NMI counts?]

Ingo
Michal Piotrowski
2006-05-30 19:48:51 UTC
Permalink
Post by Ingo Molnar
Post by Michal Piotrowski
Hi Ingo,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
Here is a small lockdep bug
May 30 18:05:09 ltg01-fedora kernel: BUG: warning at
/usr/src/linux-mm/kernel/lockdep.c:1853/trace_hardirqs_on()
hm. Do you have the NMI watchdog enabled? [does /proc/interrupts show
any increasing NMI counts?]
No.
Post by Ingo Molnar
Ingo
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Michal Piotrowski
2006-05-30 18:39:48 UTC
Permalink
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
I get this on 2.6.17-rc5-mm1 + hot fixes + Arjan's net/ipv4/igmp.c patch.

May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel:
=====================================================
May 30 20:25:56 ltg01-fedora kernel: [ BUG: possible circular locking
deadlock detected! ]
May 30 20:25:56 ltg01-fedora kernel:
-----------------------------------------------------
May 30 20:25:56 ltg01-fedora kernel: umount/2322 is trying to acquire lock:
May 30 20:25:56 ltg01-fedora kernel: (sb_security_lock){--..}, at:
[<c01d6400>] selinux_sb_free_security+0x17/0x4e
May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel: but task is already holding lock:
May 30 20:25:56 ltg01-fedora kernel: (sb_lock){--..}, at:
[<c0178a89>] put_super+0x10/0x24
May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel: which lock already depends on the new lock,
May 30 20:25:56 ltg01-fedora kernel: which could lead to circular deadlocks!
May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel: the existing dependency chain (in
reverse order) is:
May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel: -> #1 (sb_lock){--..}:
May 30 20:25:56 ltg01-fedora kernel: [<c0139a56>]
lockdep_acquire+0x69/0x82
May 30 20:25:56 ltg01-fedora kernel: [<c02f2171>] _spin_lock+0x21/0x2f
May 30 20:25:56 ltg01-fedora kernel: [<c01d72de>]
selinux_complete_init+0x45/0xda
May 30 20:25:56 ltg01-fedora kernel: [<c01e0a4e>]
security_load_policy+0xb3/0x22d
May 30 20:25:56 ltg01-fedora kernel: [<c01da975>]
sel_write_load+0xa3/0x2a1
May 30 20:25:56 ltg01-fedora kernel: [<c0172e2a>] vfs_write+0xcd/0x179
May 30 20:25:56 ltg01-fedora kernel: [<c01734d3>] sys_write+0x3b/0x71
May 30 20:25:56 ltg01-fedora kernel: [<c02f2aa5>]
sysenter_past_esp+0x56/0x8d
May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel: other info that might help us debug this:
May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel: 1 locks held by umount/2322:
May 30 20:25:56 ltg01-fedora kernel: #0: (sb_lock){--..}, at:
[<c0178a89>] put_super+0x10/0x24
May 30 20:25:56 ltg01-fedora kernel:
May 30 20:25:56 ltg01-fedora kernel: stack backtrace:
May 30 20:25:56 ltg01-fedora kernel: [<c0103e52>] show_trace_log_lvl+0x4b/0xf4
May 30 20:25:56 ltg01-fedora kernel: [<c01044b3>] show_trace+0xd/0x10
May 30 20:25:56 ltg01-fedora kernel: [<c010457b>] dump_stack+0x19/0x1b
May 30 20:25:56 ltg01-fedora kernel: [<c0138bd6>]
print_circular_bug_tail+0x59/0x64
May 30 20:25:56 ltg01-fedora kernel: [<c0139429>] __lockdep_acquire+0x848/0xa39
May 30 20:25:56 ltg01-fedora kernel: [<c0139a56>] lockdep_acquire+0x69/0x82
May 30 20:25:56 ltg01-fedora kernel: [<c02f2171>] _spin_lock+0x21/0x2f
May 30 20:25:56 ltg01-fedora kernel: [<c01d6400>]
selinux_sb_free_security+0x17/0x4e
May 30 20:25:56 ltg01-fedora kernel: [<c0178a68>] __put_super+0x24/0x35
May 30 20:25:56 ltg01-fedora kernel: [<c0178a90>] put_super+0x17/0x24
May 30 20:25:56 ltg01-fedora kernel: [<c01793a3>] deactivate_super+0xa3/0xad
May 30 20:25:56 ltg01-fedora kernel: [<c018e010>] mntput_no_expire+0x52/0x85
May 30 20:25:56 ltg01-fedora kernel: [<c017fcb0>]
path_release_on_umount+0x15/0x18
May 30 20:25:56 ltg01-fedora kernel: [<c018f535>] sys_umount+0x292/0x2aa
May 30 20:25:56 ltg01-fedora kernel: [<c018f55a>] sys_oldumount+0xd/0xf
May 30 20:25:56 ltg01-fedora kernel: [<c02f2aa5>] sysenter_past_esp+0x56/0x8d

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config2

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Arjan van de Ven
2006-05-30 19:04:50 UTC
Permalink
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
I get this on 2.6.17-rc5-mm1 + hot fixes + Arjan's net/ipv4/igmp.c patch.
=====================================================
May 30 20:25:56 ltg01-fedora kernel: [ BUG: possible circular locking
deadlock detected! ]
-----------------------------------------------------
[<c01d6400>] selinux_sb_free_security+0x17/0x4e
ok so selinux_complete_init() does
spin_lock(&sb_security_lock);
next_sb:
if (!list_empty(&superblock_security_head)) {
struct superblock_security_struct *sbsec =
list_entry(superblock_security_head.next,
struct superblock_security_struct,
list);
struct super_block *sb = sbsec->sb;
spin_lock(&sb_lock);
sb->s_count++;
spin_unlock(&sb_lock);
spin_unlock(&sb_security_lock);

nesting sb_lock inside sb_security_lock

while

put_super() takes the sb_lock, then calls __put_super() which calls
selinux_sb_free_security which calls superblock_free_security() which takes sb_security_lock
which means the nesting is opposite.


textbook AB-BA deadlock
Stephen Smalley
2006-05-31 14:56:52 UTC
Permalink
Post by Arjan van de Ven
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
I get this on 2.6.17-rc5-mm1 + hot fixes + Arjan's net/ipv4/igmp.c patch.
=====================================================
May 30 20:25:56 ltg01-fedora kernel: [ BUG: possible circular locking
deadlock detected! ]
-----------------------------------------------------
[<c01d6400>] selinux_sb_free_security+0x17/0x4e
ok so selinux_complete_init() does
spin_lock(&sb_security_lock);
if (!list_empty(&superblock_security_head)) {
struct superblock_security_struct *sbsec =
list_entry(superblock_security_head.next,
struct superblock_security_struct,
list);
struct super_block *sb = sbsec->sb;
spin_lock(&sb_lock);
sb->s_count++;
spin_unlock(&sb_lock);
spin_unlock(&sb_security_lock);
nesting sb_lock inside sb_security_lock
while
put_super() takes the sb_lock, then calls __put_super() which calls
selinux_sb_free_security which calls superblock_free_security() which takes sb_security_lock
which means the nesting is opposite.
textbook AB-BA deadlock
Yes, looks that way, although oddly I don't see this warning myself upon
performing a umount (w/ 2.6.17-rc5-mm1-lockdep). Patch below should
fix.

---

Fix unsafe nesting of sb_lock inside sb_security_lock in selinux_complete_init.
Detected by the kernel locking validator.

Signed-off-by: Stephen Smalley <***@tycho.nsa.gov>

---

security/selinux/hooks.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

--- linux-2.6.17-rc5-mm1/security/selinux/hooks.c 2006-05-30 14:26:11.000000000 -0400
+++ linux-2.6.17-rc5-mm1-x/security/selinux/hooks.c 2006-05-31 07:29:23.000000000 -0400
@@ -4448,6 +4448,7 @@ void selinux_complete_init(void)

/* Set up any superblocks initialized prior to the policy load. */
printk(KERN_INFO "SELinux: Setting up existing superblocks.\n");
+ spin_lock(&sb_lock);
spin_lock(&sb_security_lock);
next_sb:
if (!list_empty(&superblock_security_head)) {
@@ -4456,19 +4457,20 @@ next_sb:
struct superblock_security_struct,
list);
struct super_block *sb = sbsec->sb;
- spin_lock(&sb_lock);
sb->s_count++;
- spin_unlock(&sb_lock);
spin_unlock(&sb_security_lock);
+ spin_unlock(&sb_lock);
down_read(&sb->s_umount);
if (sb->s_root)
superblock_doinit(sb, NULL);
drop_super(sb);
+ spin_lock(&sb_lock);
spin_lock(&sb_security_lock);
list_del_init(&sbsec->list);
goto next_sb;
}
spin_unlock(&sb_security_lock);
+ spin_unlock(&sb_lock);
}

/* SELinux requires early initialization in order to label
--
Stephen Smalley
National Security Agency
Arjan van de Ven
2006-05-30 19:55:46 UTC
Permalink
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
I get this on 2.6.17-rc5-mm1 + hot fixes + Arjan's net/ipv4/igmp.c patch.
since Andrew asked how to read this stuff.....
Post by Michal Piotrowski
=====================================================
May 30 20:25:56 ltg01-fedora kernel: [ BUG: possible circular locking
deadlock detected! ]
this message means basically an AB-BA deadlock is found
Post by Michal Piotrowski
-----------------------------------------------------
[<c01d6400>] selinux_sb_free_security+0x17/0x4e
we're holding "sb_lock" already, and are trying to get sb_security_lock
Post by Michal Piotrowski
[<c0178a89>] put_super+0x10/0x24
May 30 20:25:56 ltg01-fedora kernel: which lock already depends on the new lock,
... but there was an observed code sequence before which was the other
way around ...
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: which could lead to circular deadlocks!
yes.
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: the existing dependency chain (in
now it's going to print the previously observed behavior (backwards),
and give a backtrace of where that was acquired
since it prints backwards, this is the later of the 2 locks taken in
the old situation
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: [<c0139a56>]
lockdep_acquire+0x69/0x82
May 30 20:25:56 ltg01-fedora kernel: [<c02f2171>] _spin_lock+0x21/0x2f
May 30 20:25:56 ltg01-fedora kernel: [<c01d72de>]
selinux_complete_init+0x45/0xda
and it was in selinux_complete_init

for some reason the #0 is not being printed here (it normally is), which
would give a similar backtrace. In this case it was ok,
selinux_complete_init was the sole guilty party.
now it's going to print all the locks we own currently, and where those
were taken; not just the ones that are part of the deadlock (that was
printed before)
Post by Michal Piotrowski
[<c0178a89>] put_super+0x10/0x24
ok so in put_super we took sb_lock. [*]
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: [<c0103e52>] show_trace_log_lvl+0x4b/0xf4
May 30 20:25:56 ltg01-fedora kernel: [<c01044b3>] show_trace+0xd/0x10
May 30 20:25:56 ltg01-fedora kernel: [<c010457b>] dump_stack+0x19/0x1b
May 30 20:25:56 ltg01-fedora kernel: [<c0138bd6>]
print_circular_bug_tail+0x59/0x64
May 30 20:25:56 ltg01-fedora kernel: [<c0139429>] __lockdep_acquire+0x848/0xa39
May 30 20:25:56 ltg01-fedora kernel: [<c0139a56>] lockdep_acquire+0x69/0x82
May 30 20:25:56 ltg01-fedora kernel: [<c02f2171>] _spin_lock+0x21/0x2f
these are just the lockdep printing stuff
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: [<c01d6400>]
selinux_sb_free_security+0x17/0x4e
but here it gets interesting; this is the function that triggered the
final deadlock message (well we knew that already from the first line of
the message), which gets called from
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: [<c0178a68>] __put_super+0x24/0x35
which gets called from
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: [<c0178a90>] put_super+0x17/0x24
... but wait we know this one already from where I put [*], so we're now
done. put_super takes sb_lock, then calls __put_super which calls
selinux_sb_free_security which takes sb_security_lock.
From the old pattern we knew the opposite order in
selinux_complete_init(), and we have our AB-BA deadlock
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: [<c01793a3>] deactivate_super+0xa3/0xad
May 30 20:25:56 ltg01-fedora kernel: [<c018e010>] mntput_no_expire+0x52/0x85
May 30 20:25:56 ltg01-fedora kernel: [<c017fcb0>]
path_release_on_umount+0x15/0x18
May 30 20:25:56 ltg01-fedora kernel: [<c018f535>] sys_umount+0x292/0x2aa
well we also now know that it came from a sys_umount; that might help
chasing stuff down if it's more fuzzy than this example
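
The reading guide above boils down to: remember every "B was taken while A was held" observation, and complain when a new acquisition would close a loop. Below is a minimal userspace sketch of that bookkeeping. It is an assumption-laden illustration, not lockdep's actual data structures: `dep`, `reachable()`, `check_acquire()` and the two class ids are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLASSES 16

/* Hypothetical class ids for the two locks in the report. */
enum { SB_LOCK, SB_SECURITY_LOCK };

/* dep[a][b] is true once lock class b has been taken while a was held. */
static bool dep[MAX_CLASSES][MAX_CLASSES];

/* Depth-first search: can `to` be reached from `from` along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Called when `new_lock` is about to be acquired while held[0..nheld) are
 * held. Returns the held class that would close a cycle, or -1 if the
 * order agrees with everything observed so far (and records it).
 */
static int check_acquire(const int *held, int nheld, int new_lock)
{
    for (int i = 0; i < nheld; i++) {
        bool seen[MAX_CLASSES] = { false };

        /* An existing path new_lock -> held[i] means the opposite
         * order was already observed somewhere else. */
        if (reachable(new_lock, held[i], seen))
            return held[i];
    }
    for (int i = 0; i < nheld; i++)
        dep[held[i]][new_lock] = true;
    return -1;
}
```

Replaying the report: selinux_complete_init() acquires SB_LOCK while holding SB_SECURITY_LOCK, which records the edge. Later, umount acquires SB_SECURITY_LOCK while holding SB_LOCK, the search finds the recorded edge, and the function returns SB_LOCK as the conflicting held lock. No actual deadlock has to happen for the inversion to be caught.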
Dave Jones
2006-05-30 20:20:18 UTC
Permalink
Post by Arjan van de Ven
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: which lock already depends on the new lock,
... but there was an observed code sequence before which was the other
way around ...
That phrase could use some rewording IMO. It sounds more like a question
than a statement.

Dave
--
http://www.codemonkey.org.uk
Arjan van de Ven
2006-05-30 20:32:31 UTC
Permalink
Post by Dave Jones
Post by Arjan van de Ven
Post by Michal Piotrowski
May 30 20:25:56 ltg01-fedora kernel: which lock already depends on the new lock,
... but there was an observed code sequence before which was the other
way around ...
That phrase could use some rewording IMO. It sounds more like a question
than a statement.
if you have suggestions please share them... you're the native United
Kingdomian.... :)
Michal Piotrowski
2006-05-30 18:55:52 UTC
Permalink
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
SCSI or libata problem.
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
init/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&base->lock#2){++..}, at: [<c0129a24>] lock_timer_base+0x29/0x55
{in-hardirq-W} state was registered at:
[<c0139a56>] lockdep_acquire+0x69/0x82
[<c02f237c>] _spin_lock_irqsave+0x2a/0x3a
[<c0129a24>] lock_timer_base+0x29/0x55
[<c0129e48>] del_timer+0x19/0x4c
[<c02651e2>] scsi_delete_timer+0xe/0x1f
[<c0262964>] scsi_done+0xb/0x19
[<c0273ed3>] ata_scsi_qc_complete+0x73/0x7f
[<c027024a>] __ata_qc_complete+0x26c/0x274
[<c02704f0>] ata_qc_complete+0xd5/0xdc
[<c0270c42>] ata_hsm_qc_complete+0x201/0x210
[<c02713e7>] ata_hsm_move+0x796/0x7ac
[<c027314e>] ata_interrupt+0x173/0x1b4
[<c014c4f4>] handle_IRQ_event+0x20/0x50
[<c014d76e>] handle_level_irq+0xa1/0xeb
[<c010579c>] do_IRQ+0xa1/0xc9
irq event stamp: 576924
hardirqs last enabled at (576923): [<c02f26c7>]
_spin_unlock_irqrestore+0x36/0x69
hardirqs last disabled at (576924): [<c02f2361>] _spin_lock_irqsave+0xf/0x3a
softirqs last enabled at (576878): [<c0125873>] __do_softirq+0xea/0xf0
softirqs last disabled at (576869): [<c0105689>] do_softirq+0x59/0xcb

other info that might help us debug this:
1 locks held by init/1:
#0: (&base->lock#2){++..}, at: [<c0129a24>] lock_timer_base+0x29/0x55

stack backtrace:
[<c0103e52>] show_trace_log_lvl+0x4b/0xf4
[<c01044b3>] show_trace+0xd/0x10
[<c010457b>] dump_stack+0x19/0x1b
[<c0137d63>] print_usage_bug+0x1a1/0x1ab
[<c0138458>] mark_lock+0x2d7/0x514
[<c01386dc>] mark_held_locks+0x47/0x65
[<c0139745>] trace_hardirqs_on+0x12b/0x16f
[<c02f2b61>] restore_nocheck+0x8/0xb

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config2

Here is dmesg
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-dmesg2

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Andrew Morton
2006-05-30 19:45:53 UTC
Permalink
On Tue, 30 May 2006 20:55:52 +0200
Post by Michal Piotrowski
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
SCSI or libata problem.
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
(&base->lock#2){++..}, at: [<c0129a24>] lock_timer_base+0x29/0x55
[<c0139a56>] lockdep_acquire+0x69/0x82
[<c02f237c>] _spin_lock_irqsave+0x2a/0x3a
[<c0129a24>] lock_timer_base+0x29/0x55
[<c0129e48>] del_timer+0x19/0x4c
[<c02651e2>] scsi_delete_timer+0xe/0x1f
[<c0262964>] scsi_done+0xb/0x19
[<c0273ed3>] ata_scsi_qc_complete+0x73/0x7f
[<c027024a>] __ata_qc_complete+0x26c/0x274
[<c02704f0>] ata_qc_complete+0xd5/0xdc
[<c0270c42>] ata_hsm_qc_complete+0x201/0x210
[<c02713e7>] ata_hsm_move+0x796/0x7ac
[<c027314e>] ata_interrupt+0x173/0x1b4
[<c014c4f4>] handle_IRQ_event+0x20/0x50
[<c014d76e>] handle_level_irq+0xa1/0xeb
[<c010579c>] do_IRQ+0xa1/0xc9
That's the second report of del_timer-in-hardirq being a bug.

Unfortunately I'm unable to decrypt the lock validator's output. Perhaps
when Arjan and Ingo do the analysis of these reports they could provide a
little guidance into what the traces are actually telling us, so that
others can learn to use them?
Ingo Molnar
2006-05-30 19:42:59 UTC
Permalink
Post by Michal Piotrowski
SCSI or libata problem.
i think SCSI and libata are innocent here.
Post by Michal Piotrowski
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
#0: (&base->lock#2){++..}, at: [<c0129a24>] lock_timer_base+0x29/0x55
[<c0103e52>] show_trace_log_lvl+0x4b/0xf4
[<c01044b3>] show_trace+0xd/0x10
[<c010457b>] dump_stack+0x19/0x1b
[<c0137d63>] print_usage_bug+0x1a1/0x1ab
[<c0138458>] mark_lock+0x2d7/0x514
[<c01386dc>] mark_held_locks+0x47/0x65
[<c0139745>] trace_hardirqs_on+0x12b/0x16f
[<c02f2b61>] restore_nocheck+0x8/0xb
weird. We are holding base->lock#2 [CPU#1's timer base lock], _and_ we
execute restore_nocheck - which is a return-to-userspace thing.

unfortunately the stacktrace provides no clues of how we got here.
For such nasty cases i have a kernel tracing patch prepared, you can get
it from:

http://redhat.com/~mingo/lockdep-patches/latency-tracing-lockdep.patch

just apply it on top of your current tree and accept all the new .config
options as the kernel suggests them to you. Then rebuild and reboot into
the kernel, and reproduce the lockdep bug. Once such a bug is reported,
/proc/latency_trace should have a full kernel trace leading up to the
bug. Please upload that trace to your site and send us the URL.

(the tracer runs nonstop and it saves the current trace if it encounters
a lockdep bug. That way i can see the history of the bug.)

if possible it would be nice to boot with maxcpus=1 as well, to make
sure we have all relevant kernel activity traced. (assuming that booting
with maxcpus=1 does not make the bug go away)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Michal Piotrowski
2006-05-30 21:57:10 UTC
Post by Ingo Molnar
Post by Michal Piotrowski
SCSI or libata problem.
i think SCSI and libata are innocent here.
Post by Michal Piotrowski
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
#0: (&base->lock#2){++..}, at: [<c0129a24>] lock_timer_base+0x29/0x55
[<c0103e52>] show_trace_log_lvl+0x4b/0xf4
[<c01044b3>] show_trace+0xd/0x10
[<c010457b>] dump_stack+0x19/0x1b
[<c0137d63>] print_usage_bug+0x1a1/0x1ab
[<c0138458>] mark_lock+0x2d7/0x514
[<c01386dc>] mark_held_locks+0x47/0x65
[<c0139745>] trace_hardirqs_on+0x12b/0x16f
[<c02f2b61>] restore_nocheck+0x8/0xb
weird. We are holding base->lock#2 [CPU#1's timer base lock], _and_ we
execute restore_nocheck - which is a return-to-userspace thing.
unfortunately the stacktrace provides no clues of how we got here.
For such nasty cases i have a kernel tracing patch prepared, you can get
http://redhat.com/~mingo/lockdep-patches/latency-tracing-lockdep.patch
just apply it on top of your current tree and accept all the new .config
options as the kernel suggests them to you.
I can't boot with that patch. I don't even see "Uncompressing
Linux..." - the machine reboots.
I have 2.6.17-rc5-mm1 +
genirq-cleanup-remove-irq_descp-fix.patch
lock-validator-irqtrace-support-non-x86-architectures.patch
lock-validator-special-locking-sb-s_umount-2-fix.patch
from hot fixes
+
Arjan's net/ipv4/igmp.c patch.

BTW, I got an error when compiling kernel/latency.c, so I changed
if (DEBUG_WARN_ON((val < PREEMPT_MASK) && !(preempt_count() & PREEMPT_MASK))))

to

if (DEBUG_WARN_ON((val < PREEMPT_MASK) && !(preempt_count() & PREEMPT_MASK)))

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config3

Here is "Kernel Bug : The Movie" (4.3 MB)
www.stardust.webpages.pl/files/crap/kbtm.avi

[snip]
Post by Ingo Molnar
Ingo
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Ingo Molnar
2006-05-30 22:09:32 UTC
Post by Michal Piotrowski
Post by Ingo Molnar
http://redhat.com/~mingo/lockdep-patches/latency-tracing-lockdep.patch
just apply it on top of your current tree and accept all the new .config
options as the kernel suggests them to you.
I can't boot with that patch. I don't even see "Uncompressing
Linux..." - the machine reboots.
I have 2.6.17-rc5-mm1 +
genirq-cleanup-remove-irq_descp-fix.patch
lock-validator-irqtrace-support-non-x86-architectures.patch
lock-validator-special-locking-sb-s_umount-2-fix.patch
from hot fixes
+
Arjan's net/ipv4/igmp.c patch.
could you try to 1) disable PREEMPT, 2) apply the -V2 rollup of all
fixes so far to 2.6.17-rc5-mm1:

http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm1.patch

? I'll try your config meanwhile.

Ingo
Ingo Molnar
2006-05-30 22:18:50 UTC
Post by Ingo Molnar
could you try to 1) disable PREEMPT, 2) apply the -V2 rollup of all
http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm1.patch
? I'll try your config meanwhile.
PREEMPT wasn't the problem, but CONFIG_DEBUG_STACKOVERFLOW was (at least).
There's some other debug option that seems incompatible too - i'm still
trying to figure out which one.

Ingo
Ingo Molnar
2006-05-30 22:26:08 UTC
Post by Ingo Molnar
PREEMPT wasn't the problem, but CONFIG_DEBUG_STACKOVERFLOW was (at least).
There's some other debug option that seems incompatible too - i'm
still trying to figure out which one.
narrowed it down to:

--- .config.good01 2006-05-31 00:24:44.000000000 +0200
+++ .config.bad01 2006-05-31 00:22:28.000000000 +0200
@@ -1,7 +1,7 @@
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.17-rc5-mm1-lockdep
-# Wed May 31 00:23:12 2006
+# Wed May 31 00:19:45 2006
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
@@ -1798,7 +1798,7 @@ CONFIG_PROVE_RWSEM_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_TRACE_IRQFLAGS=y
-# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
+CONFIG_DEBUG_SPINLOCK_SLEEP=y
CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
CONFIG_WAKEUP_TIMING=y
# CONFIG_WAKEUP_LATENCY_HIST is not set
@@ -1807,18 +1807,19 @@ CONFIG_LATENCY_TIMING=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
# CONFIG_DEBUG_KOBJECT is not set
-# CONFIG_DEBUG_HIGHMEM is not set
+CONFIG_DEBUG_HIGHMEM=y
CONFIG_DEBUG_BUGVERBOSE=y
-# CONFIG_DEBUG_INFO is not set
-# CONFIG_PAGE_OWNER is not set
+CONFIG_DEBUG_INFO=y
+CONFIG_PAGE_OWNER=y
CONFIG_DEBUG_FS=y
-# CONFIG_DEBUG_VM is not set
+CONFIG_DEBUG_VM=y
CONFIG_FRAME_POINTER=y
-# CONFIG_UNWIND_INFO is not set
+CONFIG_UNWIND_INFO=y
+CONFIG_STACK_UNWIND=y
CONFIG_FORCED_INLINING=y
-# CONFIG_DEBUG_SYNCHRO_TEST is not set
-# CONFIG_RCU_TORTURE_TEST is not set
-# CONFIG_PROFILE_LIKELY is not set
+CONFIG_DEBUG_SYNCHRO_TEST=y
+CONFIG_RCU_TORTURE_TEST=y
+CONFIG_PROFILE_LIKELY=y
# CONFIG_WANT_EXTRA_DEBUG_INFORMATION is not set
# CONFIG_KGDB is not set
CONFIG_EARLY_PRINTK=y

i'm continuing the config-bisect.
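
For readers unfamiliar with the idea, a config-bisect can be sketched roughly as below. This is only an illustration, not the procedure actually used in this thread: config_bisect(), boots_fn and boots_unless_includes() are hypothetical names, and the sketch assumes a single culprit option, whereas the thread ends up blaming two (CONFIG_DEBUG_STACKOVERFLOW and CONFIG_PROFILE_LIKELY).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback: returns nonzero if a kernel built with the
 * first n_enabled differing options switched on still boots.
 */
typedef int (*boots_fn)(size_t n_enabled, void *ctx);

/*
 * Find the index of the option that breaks the boot, assuming that
 * enabling none of the n options boots, enabling all n does not, and
 * exactly one option is responsible.
 */
static size_t config_bisect(size_t n, boots_fn boots, void *ctx)
{
    size_t good = 0;   /* known-good: options [0, good) enabled */
    size_t bad = n;    /* known-bad:  options [0, bad) enabled  */

    assert(n > 0 && boots != NULL);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (boots(mid, ctx))
            good = mid;
        else
            bad = mid;
    }
    return bad - 1;    /* the last option added to the bad set */
}

/* Stand-in for "rebuild and reboot": fails once the culprit is enabled. */
static int boots_unless_includes(size_t n_enabled, void *ctx)
{
    size_t culprit = *(const size_t *)ctx;

    return n_enabled <= culprit;
}
```

With the ten options that differ in the diff above numbered 0 to 9 in order, a culprit at index 9 (CONFIG_PROFILE_LIKELY) is found after four trial boots.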

Ingo
Ingo Molnar
2006-05-30 22:29:54 UTC
Post by Ingo Molnar
PREEMPT wasn't the problem, but CONFIG_DEBUG_STACKOVERFLOW was (at least).
There's some other debug option that seems incompatible too - i'm
still trying to figure out which one.
CONFIG_PROFILE_LIKELY it is, please disable it in your config, along
with CONFIG_DEBUG_STACKOVERFLOW:

--- .config.good02 2006-05-31 00:28:35.000000000 +0200
+++ .config.bad01 2006-05-31 00:22:28.000000000 +0200
@@ -1,7 +1,7 @@
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.17-rc5-mm1-lockdep
-# Wed May 31 00:26:04 2006
+# Wed May 31 00:19:45 2006
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
@@ -1819,7 +1819,7 @@ CONFIG_STACK_UNWIND=y
CONFIG_FORCED_INLINING=y
CONFIG_DEBUG_SYNCHRO_TEST=y
CONFIG_RCU_TORTURE_TEST=y
-# CONFIG_PROFILE_LIKELY is not set
+CONFIG_PROFILE_LIKELY=y
# CONFIG_WANT_EXTRA_DEBUG_INFORMATION is not set
# CONFIG_KGDB is not set
CONFIG_EARLY_PRINTK=y

Ingo
Michal Piotrowski
2006-05-30 22:31:34 UTC
Post by Ingo Molnar
Post by Ingo Molnar
PREEMPT wasn't the problem, but CONFIG_DEBUG_STACKOVERFLOW was (at least).
There's some other debug option that seems incompatible too - i'm
still trying to figure out which one.
CONFIG_PROFILE_LIKELY it is, please disable it in your config, along
Ok, thanks.
Post by Ingo Molnar
Ingo
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Ingo Molnar
2006-05-30 22:32:47 UTC
Post by Ingo Molnar
CONFIG_PROFILE_LIKELY it is, please disable it in your config, along
i've also uploaded an updated tracing patch to:

http://redhat.com/~mingo/lockdep-patches/latency-tracing-lockdep.patch

which forces CONFIG_PROFILE_LIKELY off if LATENCY_TRACE is enabled.

Ingo
Ingo Molnar
2006-05-31 10:56:09 UTC
Post by Ingo Molnar
CONFIG_PROFILE_LIKELY it is, please disable it in your config, along
the tracer fix for PROFILE_LIKELY is below. I have also uploaded an
updated tracing patch to

http://redhat.com/~mingo/lockdep-patches/latency-tracing-lockdep.patch

which allows the enabling of PROFILE_LIKELY && LATENCY_TRACING again.
There's an updated combo patch too:

http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm1.patch

for easy pickup of all current fixes against mm1 baseline.

Ingo

Index: linux/lib/likely_prof.c
===================================================================
--- linux.orig/lib/likely_prof.c
+++ linux/lib/likely_prof.c
@@ -20,7 +20,7 @@

static struct likeliness *likeliness_head;

-int do_check_likely(struct likeliness *likeliness, int ret)
+int notrace do_check_likely(struct likeliness *likeliness, int ret)
{
static unsigned long likely_lock;
Michal Piotrowski
2006-05-30 22:59:32 UTC
Post by Ingo Molnar
Post by Michal Piotrowski
Post by Ingo Molnar
http://redhat.com/~mingo/lockdep-patches/latency-tracing-lockdep.patch
just apply it on top of your current tree and accept all the new .config
options as the kernel suggests them to you.
I can't boot with that patch. I don't even see "Uncompressing
Linux..." - the machine reboots.
I have 2.6.17-rc5-mm1 +
genirq-cleanup-remove-irq_descp-fix.patch
lock-validator-irqtrace-support-non-x86-architectures.patch
lock-validator-special-locking-sb-s_umount-2-fix.patch
from hot fixes
+
Arjan's net/ipv4/igmp.c patch.
could you try to 1) disable PREEMPT, 2) apply the -V2 rollup of all
http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm1.patch
I'll try to reproduce that bug now... but here is a new one :)

BUG: key f7155db0 not in .data!
( modprobe-485 |#0): new 15286092 us user-latency.
stopped custom tracer.
BUG: warning at /usr/src/linux-mm/kernel/lockdep.c:1985/lockdep_init_map()
[<c0104208>] show_trace+0x1b/0x20
[<c01042e6>] dump_stack+0x1f/0x24
[<c0136e26>] lockdep_init_map+0x65/0xb0
[<c0134a62>] __mutex_init+0x46/0x50
[<f98b72a3>] find_driver+0xb7/0x115 [snd_seq_device]
[<f98b776f>] snd_seq_device_register_driver+0x42/0xeb [snd_seq_device]
[<f887012d>] alsa_seq_oss_init+0x12d/0x158 [snd_seq_oss]
[<c013fdad>] sys_init_module+0x96/0x1d4
[<c02eb442>] sysenter_past_esp+0x63/0xa1
---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------

Here is dmesg
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-dmesg3

Here is new config (without some debugging options)
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config4
Post by Ingo Molnar
? I'll try your config meanwhile.
Ingo
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Ingo Molnar
2006-05-30 23:05:12 UTC
Post by Michal Piotrowski
I'll try to reproduce that bug now... but here is a new one :)
BUG: key f7155db0 not in .data!
( modprobe-485 |#0): new 15286092 us user-latency.
stopped custom tracer.
BUG: warning at /usr/src/linux-mm/kernel/lockdep.c:1985/lockdep_init_map()
Arjan's sound patch is wrong: the key must not be in a dynamic variable!

Could you try the patch below? This uses the ID string as the key. (the
ID string seems to be based on static kernel strings most of the time,
so this might as well work)

Ingo

Index: linux/sound/core/seq/seq_device.c
===================================================================
--- linux.orig/sound/core/seq/seq_device.c
+++ linux/sound/core/seq/seq_device.c
@@ -382,7 +382,7 @@ static struct ops_list * create_driver(c

/* set up driver entry */
strlcpy(ops->id, id, sizeof(ops->id));
- mutex_init_key(&ops->reg_mutex, id, &ops->reg_mutex_key);
+ mutex_init_key(&ops->reg_mutex, id, (struct lockdep_type_key)id);
ops->driver = DRIVER_EMPTY;
INIT_LIST_HEAD(&ops->dev_list);
/* lock this instance */
Ingo Molnar
2006-05-30 23:06:20 UTC
Post by Ingo Molnar
Could you try the patch below? This uses the ID string as the key.
(the ID string seems to be based on static kernel strings most of the
time, so this might as well work)
that patch should be:

Index: linux/sound/core/seq/seq_device.c
===================================================================
--- linux.orig/sound/core/seq/seq_device.c
+++ linux/sound/core/seq/seq_device.c
@@ -74,8 +74,6 @@ struct ops_list {
struct mutex reg_mutex;

struct list_head list; /* next driver */
-
- struct lockdep_type_key reg_mutex_key;
};


@@ -382,7 +380,7 @@ static struct ops_list * create_driver(c

/* set up driver entry */
strlcpy(ops->id, id, sizeof(ops->id));
- mutex_init_key(&ops->reg_mutex, id, &ops->reg_mutex_key);
+ mutex_init_key(&ops->reg_mutex, id, (struct lockdep_type_key *)id);
ops->driver = DRIVER_EMPTY;
INIT_LIST_HEAD(&ops->dev_list);
/* lock this instance */

Michal Piotrowski
2006-05-30 23:49:04 UTC
Post by Ingo Molnar
Could you try the patch below? This uses the ID string as the key.
(the ID string seems to be based on static kernel strings most of the
time, so this might as well work)
Thanks, problem solved.

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Steven Rostedt
2006-05-31 03:08:46 UTC
Post by Michal Piotrowski
Post by Ingo Molnar
Could you try the patch below? This uses the ID string as the key.
(the ID string seems to be based on static kernel strings most of the
time, so this might as well work)
Thanks, problem solved.
Had the same problem, and I can also confirm that that patch fixes it.

-- Steve
Ingo Molnar
2006-05-30 19:59:02 UTC
Post by Ingo Molnar
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
sorry - the messages are indeed cryptic, partly because there are lots
of illegal state transitions and the printout is automated, and partly to
keep the already sizable lockdep printouts as compact as possible.

What happened here is that a lock (-type) had the {in-hardirq-W} state
bit set, and lockdep observed an event that also sets the {hardirq-on-W}
state bit: illegal.

here is a rough translation of the usage history state bits:

{in-hardirq-W}: lock was exclusively acquired in hardirq context
{in-hardirq-R}: lock was read-acquired in hardirq context
{in-softirq-W}: lock was exclusively acquired in softirq context
{in-softirq-R}: lock was read-acquired in softirq context
{hardirq-on-W}: lock was held exclusively with hardirqs enabled
{hardirq-on-R}: lock was read-held with hardirqs enabled
{softirq-on-W}: lock was held exclusively with softirqs enabled
{softirq-on-R}: lock was read-held with softirqs enabled

to interpret the lock state at a glance, there's an even shorter
representation of the state bits:

(&base->lock#2){++..}
                ^^^^

'+' : irq-safe [lock was taken in irq context]
'-' : irq-unsafe [lock was taken with irqs enabled]
'.' : unknown [lock has not yet become irq-safe or irq-unsafe]

'?' : read-locked both in hardirq context and with irqs enabled

the first character is for exclusive-locking in hardirqs, the second for
exclusive-locking in softirqs, the third is for read-locking in
hardirqs, the fourth is for read-locking in softirqs.

this means that the "{++..}" sequence shows that this lock is
hardirq-safe and softirq-safe, and was never read-locked. [the latter is
not surprising for a spinlock - but lockdep doesn't know that it's a
spinlock, it deals with all lock types in a unified way]

(more details about the usage history state bits are in
Documentation/lockdep-design.txt and in include/linux/lockdep.h)
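
As a rough illustration of the four-character shorthand described above, here is a small userspace C sketch that expands a string such as "++.." into words. It is not lockdep code: describe_char(), describe_usage() and the slot labels are hypothetical names chosen for this example, and the wording simply follows the legend in this mail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: expand one state character, NULL if unknown. */
static const char *describe_char(char c)
{
    switch (c) {
    case '+': return "irq-safe (taken in irq context)";
    case '-': return "irq-unsafe (taken with irqs enabled)";
    case '.': return "unknown (not yet irq-safe or irq-unsafe)";
    case '?': return "read-locked in irq context and with irqs enabled";
    default:  return NULL;
    }
}

/* Position order as given in the mail: W/hardirq, W/softirq, R/hardirq, R/softirq. */
static const char *const slot_name[4] = {
    "exclusive/hardirq", "exclusive/softirq",
    "read/hardirq", "read/softirq",
};

/* Returns 0 on success, -1 on a malformed string or a too-small buffer. */
static int describe_usage(const char *bits, char *out, size_t len)
{
    size_t used = 0;
    int i;

    assert(bits != NULL && out != NULL && len > 0);
    out[0] = '\0';
    if (strlen(bits) != 4)
        return -1;
    for (i = 0; i < 4; i++) {
        const char *desc = describe_char(bits[i]);
        int n;

        if (desc == NULL)
            return -1;
        n = snprintf(out + used, len - used, "%s: %s\n", slot_name[i], desc);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Passing "++.." and a buffer of a few hundred bytes yields one line per position, the first two reading irq-safe and the last two unknown, which matches the interpretation given above for base->lock.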

Ingo
Michal Piotrowski
2006-05-31 13:51:35 UTC
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
SCSI or libata problem.
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
(&base->lock#2){++..}, at: [<c0129a24>] lock_timer_base+0x29/0x55
[<c0139a56>] lockdep_acquire+0x69/0x82
[<c02f237c>] _spin_lock_irqsave+0x2a/0x3a
[<c0129a24>] lock_timer_base+0x29/0x55
[<c0129e48>] del_timer+0x19/0x4c
[<c02651e2>] scsi_delete_timer+0xe/0x1f
[<c0262964>] scsi_done+0xb/0x19
[<c0273ed3>] ata_scsi_qc_complete+0x73/0x7f
[<c027024a>] __ata_qc_complete+0x26c/0x274
[<c02704f0>] ata_qc_complete+0xd5/0xdc
[<c0270c42>] ata_hsm_qc_complete+0x201/0x210
[<c02713e7>] ata_hsm_move+0x796/0x7ac
[<c027314e>] ata_interrupt+0x173/0x1b4
[<c014c4f4>] handle_IRQ_event+0x20/0x50
[<c014d76e>] handle_level_irq+0xa1/0xeb
[<c010579c>] do_IRQ+0xa1/0xc9
irq event stamp: 576924
hardirqs last enabled at (576923): [<c02f26c7>]
_spin_unlock_irqrestore+0x36/0x69
hardirqs last disabled at (576924): [<c02f2361>] _spin_lock_irqsave+0xf/0x3a
softirqs last enabled at (576878): [<c0125873>] __do_softirq+0xea/0xf0
softirqs last disabled at (576869): [<c0105689>] do_softirq+0x59/0xcb
#0: (&base->lock#2){++..}, at: [<c0129a24>] lock_timer_base+0x29/0x55
[<c0103e52>] show_trace_log_lvl+0x4b/0xf4
[<c01044b3>] show_trace+0xd/0x10
[<c010457b>] dump_stack+0x19/0x1b
[<c0137d63>] print_usage_bug+0x1a1/0x1ab
[<c0138458>] mark_lock+0x2d7/0x514
[<c01386dc>] mark_held_locks+0x47/0x65
[<c0139745>] trace_hardirqs_on+0x12b/0x16f
[<c02f2b61>] restore_nocheck+0x8/0xb
Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config2
Here is dmesg
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-dmesg2
I can't reproduce this bug with current
http://redhat.com/~mingo/lockdep-patches/latency-tracing-lockdep.patch
and updated
http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm1.patch
but these two bugs look similar (both were previously reported). Both
appear while starting the avahi daemon.

http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/dmesg_1
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/dmesg_2

http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/latency_trace_1.bz2
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/latency_trace_2.bz2

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config5

I haven't noticed these bugs with "maxcpus=1" boot param.

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Ingo Molnar
2006-05-31 14:02:02 UTC
Post by Michal Piotrowski
but these two bugs look similar (both were previously reported). Both
appear while starting the avahi daemon.
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/dmesg_1
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/dmesg_2
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/latency_trace_1.bz2
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/latency_trace_2.bz2
thanks - these traces made it really easy to spot the problem! The
problem seems to be caused by a pagefault:

<...>-1 0D..1 10648us : check_chain_key (__lockdep_acquire)
<...>-1 0D..1 10649us+: _raw_spin_lock (_spin_lock_irqsave)
<...>-1 0D..1 10651us : do_page_fault (error_code)
<...>-1 0D..1 10652us : trace_hardirqs_off (ret_from_exception)
<...>-1 0D..1 10653us : trace_hardirqs_on (restore_nocheck)
<...>-1 0D..1 10654us : mark_held_locks (trace_hardirqs_on)
<...>-1 0D..1 10654us : mark_lock (mark_held_locks)
<...>-1 0D..1 10655us : save_trace (mark_lock)
<...>-1 0D..1 10656us : save_stack_trace (save_trace)
<...>-1 0D..1 10658us : print_usage_bug (mark_lock)

i think what happened is that the pagefault happened with irqs disabled,
and the entry.S return-to-exception-site irq-flags tracing code
mistakenly turned on the irq flag - causing the mismatch and lockdep's
confusion.

if it's easy to reproduce it once more, could you apply the patch below?
That will add a trace entry about what address faulted and at what EIP.
Please also upload vmlinux.bz2 because the EIP will be a raw hex number
and i'll have to look it up. (or if it's too big then please disassemble
vmlinux via objdump -d vmlinux and upload a ~100-line portion that is
mentioned in the new trace entry next to the do_page_fault trace entry
near the end of the latency_trace output)

Ingo

Index: linux/arch/i386/mm/fault.c
===================================================================
--- linux.orig/arch/i386/mm/fault.c
+++ linux/arch/i386/mm/fault.c
@@ -337,6 +338,7 @@ fastcall void __kprobes do_page_fault(st

/* get the address */
address = read_cr2();
+ trace_special(regs->eip, address, error_code);

tsk = current;
Ingo Molnar
2006-05-31 14:05:33 UTC
Post by Ingo Molnar
if it's easy to reproduce it once more, could you apply the patch below?
please use the updated patch below - it adds more info so that we can
see whether irqs were really disabled (from the eflags), and we can see
whether it was userspace or kernelspace.

Ingo

Index: linux/arch/i386/mm/fault.c
===================================================================
--- linux.orig/arch/i386/mm/fault.c
+++ linux/arch/i386/mm/fault.c
@@ -337,6 +338,8 @@ fastcall void __kprobes do_page_fault(st

/* get the address */
address = read_cr2();
+ trace_special(regs->eip, address, error_code);
+ trace_special(regs->eflags, regs->xss, regs->esp);

tsk = current;
Ingo Molnar
2006-05-31 14:12:08 UTC
Post by Ingo Molnar
i think what happened is that the pagefault happened with irqs
disabled, and the entry.S return-to-exception-site irq-flags tracing
code mistakenly turned on the irq flag - causing the mismatch and
lockdep's confusion.
here's the fix for the irqs-off iret irqflags-tracing problem. Does this
fix the bug(s) on your box?

Ingo

Index: linux/arch/i386/kernel/entry.S
===================================================================
--- linux.orig/arch/i386/kernel/entry.S
+++ linux/arch/i386/kernel/entry.S
@@ -364,6 +364,8 @@ restore_all:
CFI_REMEMBER_STATE
je ldt_ss # returning to user-space with LDT SS
restore_nocheck:
+ testl $IF_MASK,EFLAGS(%esp) # interrupts off (exception path) ?
+ jz restore_nocheck_notrace
TRACE_IRQS_ON
restore_nocheck_notrace:
RESTORE_REGS
@@ -404,7 +406,10 @@ ldt_ss:
* and a switch16 pointer on top of the current frame. */
call setup_x86_bogus_stack
CFI_ADJUST_CFA_OFFSET -8 # frame has moved
+ testl $IF_MASK,EFLAGS(%esp) # interrupts off (exception path) ?
+ jz restore_nocheck_notrace2
TRACE_IRQS_ON
+restore_nocheck_notrace2:
RESTORE_REGS
lss 20+4(%esp), %esp # switch to 16bit stack
1: iret
Roland Dreier
2006-05-30 19:43:39 UTC
Building 2.6.17-rc5-mm1, I get this:

net/built-in.o: In function `ip_rt_init':
(.init.text+0xb04): undefined reference to `__you_cannot_kmalloc_that_much'

This seems to be coming from:

rt_hash_locks = kmalloc(sizeof(spinlock_t) * RT_HASH_LOCK_SZ, GFP_KERNEL);

I have CONFIG_NR_CPUS=32, so RT_HASH_LOCK_SZ ends up as 2048. Also, I
have both CONFIG_DEBUG_SPINLOCK=y and CONFIG_PROVE_SPIN_LOCKING=y so
spinlock_t is bloated up quite big:

typedef struct {
raw_spinlock_t raw_lock;
#if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP)
unsigned int break_lock;
#endif
#ifdef CONFIG_DEBUG_SPINLOCK
unsigned int magic, owner_cpu;
void *owner;
#endif
#ifdef CONFIG_PROVE_SPIN_LOCKING
struct lockdep_map dep_map;
#endif
} spinlock_t;

I only have 8 CPUs in the box, so updating my config from the x86_64
defconfig fixes things for me.

No patch because I don't really know how to fix this properly...
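
To make the size arithmetic concrete, here is a minimal sketch of the check that the build failure corresponds to. Both numbers are assumptions for illustration: ASSUMED_KMALLOC_MAX stands in for whatever the largest kmalloc size is in a given configuration, and the per-lock sizes in the usage note are made up, since the real sizeof(spinlock_t) depends on the config options listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed allocation limit, for illustration only. */
#define ASSUMED_KMALLOC_MAX (128UL * 1024)

/*
 * True if a table of nr_elems elements of elem_size bytes fits within
 * limit bytes. Written as a division so the product cannot overflow.
 */
static bool table_fits(size_t elem_size, size_t nr_elems, size_t limit)
{
    assert(elem_size > 0);
    return nr_elems <= limit / elem_size;
}
```

Under these assumptions, 2048 locks of 4 bytes each (8 KB) fit easily, while 2048 locks of a hypothetical 72 bytes each (144 KB) do not; shrinking the table to 256 entries brings the debug case back under the limit.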

- R.
Ingo Molnar
2006-05-30 20:26:54 UTC
Post by Roland Dreier
(.init.text+0xb04): undefined reference to `__you_cannot_kmalloc_that_much'
could you try the patch below and set NR_CPUS back to 32?

-----------
Subject: lock validator: fix RT_HASH_LOCK_SZ
From: Ingo Molnar <***@elte.hu>

on lockdep we have a quite big spinlock_t, so keep the size down.

Signed-off-by: Ingo Molnar <***@elte.hu>
Signed-off-by: Arjan van de Ven <***@linux.intel.com>
---
net/ipv4/route.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)

Index: linux/net/ipv4/route.c
===================================================================
--- linux.orig/net/ipv4/route.c
+++ linux/net/ipv4/route.c
@@ -212,17 +212,22 @@ struct rt_hash_bucket {
/*
* Instead of using one spinlock for each rt_hash_bucket, we use a table of spinlocks
* The size of this table is a power of two and depends on the number of CPUS.
+ * (on lockdep we have a quite big spinlock_t, so keep the size down there)
*/
-#if NR_CPUS >= 32
-#define RT_HASH_LOCK_SZ 4096
-#elif NR_CPUS >= 16
-#define RT_HASH_LOCK_SZ 2048
-#elif NR_CPUS >= 8
-#define RT_HASH_LOCK_SZ 1024
-#elif NR_CPUS >= 4
-#define RT_HASH_LOCK_SZ 512
+#ifdef CONFIG_LOCKDEP
+# define RT_HASH_LOCK_SZ 256
#else
-#define RT_HASH_LOCK_SZ 256
+# if NR_CPUS >= 32
+# define RT_HASH_LOCK_SZ 4096
+# elif NR_CPUS >= 16
+# define RT_HASH_LOCK_SZ 2048
+# elif NR_CPUS >= 8
+# define RT_HASH_LOCK_SZ 1024
+# elif NR_CPUS >= 4
+# define RT_HASH_LOCK_SZ 512
+# else
+# define RT_HASH_LOCK_SZ 256
+# endif
#endif

static spinlock_t *rt_hash_locks;
Roland Dreier
2006-05-30 20:43:40 UTC
Post by Ingo Molnar
on lockdep we have a quite big spinlock_t, so keep the size down.
Yes, that builds fine.

However the kernel won't boot for me... it oopses early on in
save_stack_trace(). I'm attaching a bootlog, plus another try booting
with nmi_watchdog=0, plus my config.

Thanks,
Roland
Ingo Molnar
2006-05-30 20:49:01 UTC
Post by Roland Dreier
Post by Ingo Molnar
on lockdep we have a quite big spinlock_t, so keep the size down.
Yes, that builds fine.
However the kernel won't boot for me... it oopses early on in
save_stack_trace(). I'm attaching a bootlog, plus another try booting
with nmi_watchdog=0, plus my config.
there's some bad interaction between the new dwarf2 unwind info
stackframe walker code in mm1 and lockdep's stacktrace code on x86_64.
I'm investigating this currently, meanwhile you can try the quick hack
below.

Ingo

Index: linux/arch/x86_64/kernel/stacktrace.c
===================================================================
--- linux.orig/arch/x86_64/kernel/stacktrace.c
+++ linux/arch/x86_64/kernel/stacktrace.c
@@ -127,7 +127,8 @@ save_context_stack(struct stack_trace *t
skip--;
if (trace->nr_entries >= trace->max_entries)
break;
- if (!addr)
+#warning fixme
+// if (!addr)
return 0;
/*
* Stack frames must go forwards (otherwise a loop could
Roland Dreier
2006-05-30 20:58:43 UTC
Thanks, that boots.

During boot I see this, apparently while mounting NFS filesystems:

[ 83.114812] ====================================
[ 83.133079] [ BUG: possible deadlock detected! ]
[ 83.146881] ------------------------------------
[ 83.160683] mount/3531 is trying to acquire lock:
[ 83.174745] (&inode->i_mutex){--..}, at: [<ffffffff804396df>] mutex_lock+0x22/0x27
[ 83.197835]
[ 83.197836] but task is already holding lock:
[ 83.215295] (&inode->i_mutex){--..}, at: [<ffffffff804396df>] mutex_lock+0x22/0x27
[ 83.238386]
[ 83.238387] which could potentially lead to deadlocks!
[ 83.258207]
[ 83.258207] other info that might help us debug this:
[ 83.277769] 2 locks held by mount/3531:
[ 83.289235] #0: (&s->s_umount#16){--..}, at: [<ffffffff8028c0b3>] sget+0x1a0/0x407
[ 83.312612] #1: (&inode->i_mutex){--..}, at: [<ffffffff804396df>] mutex_lock+0x22/0x27
[ 83.337025]
[ 83.337026] stack backtrace:
[ 83.350101]
[ 83.350101] Call Trace:
[ 83.361890] [<ffffffff80247b4e>] __lockdep_acquire+0x18a/0xad2
[ 83.379629] [<ffffffff804396df>] mutex_lock+0x22/0x27
[ 83.395038] [<ffffffff8024887d>] lockdep_acquire+0x82/0xa3
[ 83.411748] [<ffffffff80439450>] __mutex_lock_slowpath+0xfd/0x36a
[ 83.430273] [<ffffffff804396df>] mutex_lock+0x22/0x27
[ 83.445703] [<ffffffff880ce635>] :sunrpc:rpc_populate+0x43/0x141
[ 83.463934] [<ffffffff880cedb8>] :sunrpc:rpc_mkdir+0xb6/0x172
[ 83.481383] [<ffffffff802a1862>] mntput_no_expire+0x1b/0xb9
[ 83.498348] [<ffffffff802a989c>] simple_pin_fs+0xc3/0xd3
[ 83.514548] [<ffffffff880bf8c1>] :sunrpc:rpc_new_client+0x226/0x348
[ 83.533592] [<ffffffff880c06e0>] :sunrpc:rpc_create_client+0xc/0x3e
[ 83.552644] [<ffffffff88105e0c>] :nfs:nfs_get_sb+0x559/0x6e8
[ 83.569853] [<ffffffff8028b827>] vfs_kern_mount+0x8b/0x196
[ 83.586560] [<ffffffff8028b980>] do_kern_mount+0x3c/0x57
[ 83.602724] [<ffffffff802a3596>] do_mount+0x7dd/0x851
[ 83.618108] [<ffffffff80247240>] mark_lock+0x3b/0x4fc
[ 83.633520] [<ffffffff80262a2a>] get_page_from_freelist+0x34e/0x4cc
[ 83.652560] [<ffffffff802479a0>] trace_hardirqs_on+0x165/0x189
[ 83.670281] [<ffffffff80262a99>] get_page_from_freelist+0x3bd/0x4cc
[ 83.689324] [<ffffffff8043b461>] _spin_unlock_irqrestore+0x3f/0x47
[ 83.708109] [<ffffffff80262c2a>] __alloc_pages+0x82/0x33d
[ 83.724558] [<ffffffff802787ef>] alloc_pages_current+0xa0/0xa9
[ 83.742301] [<ffffffff803361fc>] _raw_spin_lock+0xc7/0x15d
[ 83.759012] [<ffffffff802a36a7>] sys_mount+0x9d/0xe9
[ 83.774160] [<ffffffff8043ab11>] trace_hardirqs_on_thunk+0x35/0x37
[ 83.792919] [<ffffffff80209652>] system_call+0x7e/0x83

- R.
Arjan van de Ven
2006-05-30 21:01:20 UTC
Post by Roland Dreier
Thanks, that boots.
do you have KALLSYMS_ALL enabled? This looks like a thing we already
fixed as well... but it also looks a bit odd ..
Roland Dreier
2006-05-30 21:03:38 UTC
Arjan> do you have KALLSYMS_ALL enabled? This looks like a thing
Arjan> we already fixed as well... but it also looks a bit odd ..

Nope, sorry. Will rebuild and resend.

- R.
Roland Dreier
2006-05-30 21:14:30 UTC
Here it is with KALLSYMS_ALL:

[ 80.587694] ====================================
[ 80.605928] [ BUG: possible deadlock detected! ]
[ 80.619729] ------------------------------------
[ 80.633532] mount/3534 is trying to acquire lock:
[ 80.647593] (&inode->i_mutex){--..}, at: [<ffffffff804396af>] mutex_lock+0x22/0x27
[ 80.670683]
[ 80.670684] but task is already holding lock:
[ 80.688170] (&inode->i_mutex){--..}, at: [<ffffffff804396af>] mutex_lock+0x22/0x27
[ 80.711260]
[ 80.711261] which could potentially lead to deadlocks!
[ 80.731083]
[ 80.731083] other info that might help us debug this:
[ 80.750618] 2 locks held by mount/3534:
[ 80.762085] #0: (&s->s_umount#16){--..}, at: [<ffffffff8028c07b>] sget+0x1a0/0x407
[ 80.785513] #1: (&inode->i_mutex){--..}, at: [<ffffffff804396af>] mutex_lock+0x22/0x27
[ 80.809952]
[ 80.809952] stack backtrace:
[ 80.823003]
[ 80.823003] Call Trace:
[ 80.834790] [<ffffffff80247b4e>] __lockdep_acquire+0x18a/0xad2
[ 80.852503] [<ffffffff804396af>] mutex_lock+0x22/0x27
[ 80.867887] [<ffffffff8024887d>] lockdep_acquire+0x82/0xa3
[ 80.884596] [<ffffffff80439420>] __mutex_lock_slowpath+0xfd/0x36a
[ 80.903095] [<ffffffff804396af>] mutex_lock+0x22/0x27
[ 80.918499] [<ffffffff880ce635>] :sunrpc:rpc_populate+0x43/0x141
[ 80.936759] [<ffffffff880cedb8>] :sunrpc:rpc_mkdir+0xb6/0x172
[ 80.954206] [<ffffffff802a182a>] mntput_no_expire+0x1b/0xb9
[ 80.971173] [<ffffffff802a9864>] simple_pin_fs+0xc3/0xd3
[ 80.987371] [<ffffffff880bf8c1>] :sunrpc:rpc_new_client+0x226/0x348
[ 81.006390] [<ffffffff880c06e0>] :sunrpc:rpc_create_client+0xc/0x3e
[ 81.025415] [<ffffffff88105e0c>] :nfs:nfs_get_sb+0x559/0x6e8
[ 81.042598] [<ffffffff8028b7ef>] vfs_kern_mount+0x8b/0x196
[ 81.059305] [<ffffffff8028b948>] do_kern_mount+0x3c/0x57
[ 81.075495] [<ffffffff802a355e>] do_mount+0x7dd/0x851
[ 81.090906] [<ffffffff80247240>] mark_lock+0x3b/0x4fc
[ 81.106289] [<ffffffff802629f2>] get_page_from_freelist+0x34e/0x4cc
[ 81.125331] [<ffffffff802479a0>] trace_hardirqs_on+0x165/0x189
[ 81.143050] [<ffffffff80262a61>] get_page_from_freelist+0x3bd/0x4cc
[ 81.162094] [<ffffffff8043b431>] _spin_unlock_irqrestore+0x3f/0x47
[ 81.180880] [<ffffffff80262bf2>] __alloc_pages+0x82/0x33d
[ 81.197332] [<ffffffff802787b7>] alloc_pages_current+0xa0/0xa9
[ 81.215074] [<ffffffff803361cc>] _raw_spin_lock+0xc7/0x15d
[ 81.231781] [<ffffffff802a366f>] sys_mount+0x9d/0xe9
[ 81.246905] [<ffffffff8043aae1>] trace_hardirqs_on_thunk+0x35/0x37
[ 81.265690] [<ffffffff80209652>] system_call+0x7e/0x83
Arjan van de Ven
2006-05-30 21:55:56 UTC
Permalink
ok this ought to do it


rpc_populate is creating a child inode in a directory, and the
parent already has its mutex locked. Similar to the VFS code,
this needs the I_MUTEX_CHILD nesting annotation.

Signed-off-by: Arjan van de Ven <***@linux.intel.com>
---
net/sunrpc/rpc_pipe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.17-rc5-mm1-lockdep/net/sunrpc/rpc_pipe.c
===================================================================
--- linux-2.6.17-rc5-mm1-lockdep.orig/net/sunrpc/rpc_pipe.c
+++ linux-2.6.17-rc5-mm1-lockdep/net/sunrpc/rpc_pipe.c
@@ -557,7 +557,7 @@ rpc_populate(struct dentry *parent,
 	struct dentry *dentry;
 	int mode, i;
 
-	mutex_lock(&dir->i_mutex);
+	mutex_lock_nested(&dir->i_mutex, I_MUTEX_CHILD);
 	for (i = start; i < eof; i++) {
 		dentry = d_alloc_name(parent, files[i].name);
 		if (!dentry)
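
[Editor's sketch] The reasoning behind the nesting annotation can be illustrated with a small userspace model. This is not the kernel's lock validator; every name below (toy_acquire, toy_task, TOY_CHILD and so on) is hypothetical and chosen only for illustration. The assumption is that a validator treats two locks of the same class as a possible self-deadlock unless the caller gives the inner one a different subclass, which is what mutex_lock_nested() does in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: a lock "class" plus a nesting subclass. */
enum toy_subclass { TOY_NORMAL = 0, TOY_PARENT = 1, TOY_CHILD = 2 };

struct toy_held {
    const void *class_key;      /* identifies the lock class */
    enum toy_subclass subclass; /* nesting level supplied by the caller */
};

#define TOY_MAX_HELD 8

struct toy_task {
    struct toy_held held[TOY_MAX_HELD];
    size_t depth;               /* number of locks currently held */
};

/*
 * Record an acquisition. Returns false (and records nothing) when the
 * task already holds a lock with the same class and subclass, which is
 * the pattern a validator would report as a possible deadlock, or when
 * the table of held locks is full.
 */
static bool toy_acquire(struct toy_task *t, const void *class_key,
                        enum toy_subclass subclass)
{
    for (size_t i = 0; i < t->depth; i++) {
        if (t->held[i].class_key == class_key &&
            t->held[i].subclass == subclass)
            return false;
    }
    if (t->depth == TOY_MAX_HELD)
        return false;
    t->held[t->depth].class_key = class_key;
    t->held[t->depth].subclass = subclass;
    t->depth++;
    return true;
}

/* Drop the most recently acquired lock. */
static void toy_release(struct toy_task *t)
{
    if (t->depth > 0)
        t->depth--;
}
```

In this model, taking the parent with TOY_NORMAL and then the child with TOY_NORMAL is rejected, while taking the child with TOY_CHILD is accepted, mirroring the effect of the one-line change.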
Ingo Molnar
2006-05-30 21:19:30 UTC
Permalink
i've uploaded lock validator -V2, a rollup of all current fixes (against
-rc5-mm1) to:

http://redhat.com/~mingo/lockdep-patches/lockdep-combo-2.6.17-rc5-mm1.patch

(Andrew got all these fixes as individual patches already)

Ingo
Laurent Riffard
2006-05-30 21:07:58 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
...
Runtime locking validation.
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {hardirq-on-W} -> {in-hardirq-W} usage.
events/0/4 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&list->lock){+...}, at: [<c0247689>] skb_dequeue+0x12/0x43
{hardirq-on-W} state was registered at:
[<c012d2a1>] lockdep_acquire+0x56/0x6f
[<c029595b>] _spin_lock_bh+0x1c/0x29
[<c02922e0>] unix_stream_connect+0x2d8/0x3a7
[<c0243fb4>] sys_connect+0x54/0x71
[<c0244c5c>] sys_socketcall+0x6f/0x166
[<c0295afd>] sysenter_past_esp+0x56/0x8d
irq event stamp: 1886
hardirqs last enabled at (1885): [<c0295a2b>] _spin_unlock_irqrestore+0x35/0x3b
hardirqs last disabled at (1886): [<c01032fb>] common_interrupt+0x1b/0x2c
softirqs last enabled at (0): [<c0114af0>] copy_process+0x265/0x11dc
softirqs last disabled at (0): [<00000000>] init+0x3feffde0/0x1da

other info that might help us debug this:
no locks held by events/0/4.

stack backtrace:
[<c0103810>] show_trace_log_lvl+0x4b/0xf4
[<c0103e11>] show_trace+0xd/0x10
[<c0103e58>] dump_stack+0x19/0x1b
[<c012b8be>] print_usage_bug+0x1a4/0x1ae
[<c012c3c6>] mark_lock+0x8a/0x411
[<c012cc55>] __lockdep_acquire+0x302/0x8f8
[<c012d2a1>] lockdep_acquire+0x56/0x6f
[<c0295906>] _spin_lock_irqsave+0x20/0x2f
[<c0247689>] skb_dequeue+0x12/0x43
[<e0bdb7ac>] hpsb_bus_reset+0x55/0xa2 [ieee1394]
[<e0853e72>] ohci_irq_handler+0x322/0x6c9 [ohci1394]
[<c0138d8c>] handle_IRQ_event+0x20/0x50
[<c0139d38>] handle_level_irq+0x71/0xb9
[<c0104ded>] do_IRQ+0x81/0xa8
[<c0103305>] common_interrupt+0x25/0x2c
ieee1394: Host added: ID:BUS[0-00:1023] GUID[00308d0120e085ca]

.config and dmesg attached.

regards
~~
laurent
Arjan van de Ven
2006-05-30 21:24:47 UTC
Permalink
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
...
Runtime locking validation.
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {hardirq-on-W} -> {in-hardirq-W} usage.
(&list->lock){+...}, at: [<c0247689>] skb_dequeue+0x12/0x43
hmmm skb_dequeue is called in a hard irq...
[<c012d2a1>] lockdep_acquire+0x56/0x6f
[<c029595b>] _spin_lock_bh+0x1c/0x29
[<c02922e0>] unix_stream_connect+0x2d8/0x3a7
.. yet it was taken only with spin_lock_bh() in unix_stream_connect,
leaving interrupts enabled (and thus not allowing use inside a hard irq)
[<c0243fb4>] sys_connect+0x54/0x71
[<c0244c5c>] sys_socketcall+0x6f/0x166
[<c0295afd>] sysenter_past_esp+0x56/0x8d
irq event stamp: 1886
hardirqs last enabled at (1885): [<c0295a2b>] _spin_unlock_irqrestore+0x35/0x3b
hardirqs last disabled at (1886): [<c01032fb>] common_interrupt+0x1b/0x2c
softirqs last enabled at (0): [<c0114af0>] copy_process+0x265/0x11dc
softirqs last disabled at (0): [<00000000>] init+0x3feffde0/0x1da
no locks held by events/0/4.
[<c0103810>] show_trace_log_lvl+0x4b/0xf4
[<c0103e11>] show_trace+0xd/0x10
[<c0103e58>] dump_stack+0x19/0x1b
[<c012b8be>] print_usage_bug+0x1a4/0x1ae
[<c012c3c6>] mark_lock+0x8a/0x411
[<c012cc55>] __lockdep_acquire+0x302/0x8f8
[<c012d2a1>] lockdep_acquire+0x56/0x6f
[<c0295906>] _spin_lock_irqsave+0x20/0x2f
[<c0247689>] skb_dequeue+0x12/0x43
[<e0bdb7ac>] hpsb_bus_reset+0x55/0xa2 [ieee1394]
yet hpsb_bus_reset() calls skb_dequeue (indirectly, via the inlined
abort_requests() function) in a hard irq.
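
[Editor's sketch] The {hardirq-on-W} -> {in-hardirq-W} complaint analysed above can be modelled in a few lines of userspace C. This is only an illustration under stated assumptions, not the validator's actual implementation: the names toy_lock_class and toy_mark_usage are hypothetical, and the model keeps just two usage bits per lock class. A class that has been taken both with hard interrupts enabled and from hard interrupt context is flagged, because an interrupt could arrive on the CPU that already holds the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy usage bits, recorded per lock class (hypothetical names). */
enum toy_usage {
    TOY_USED_HARDIRQ_ON = 1 << 0, /* taken while hard irqs were enabled */
    TOY_USED_IN_HARDIRQ = 1 << 1  /* taken from hard irq context */
};

struct toy_lock_class {
    unsigned int usage;
};

/*
 * Record one acquisition of the class and report whether its usage
 * history is still consistent. Returns false once the class has been
 * seen both with hard irqs enabled and inside a hard irq handler.
 */
static bool toy_mark_usage(struct toy_lock_class *c, bool in_hardirq,
                           bool hardirqs_enabled)
{
    const unsigned int both = TOY_USED_HARDIRQ_ON | TOY_USED_IN_HARDIRQ;

    if (in_hardirq)
        c->usage |= TOY_USED_IN_HARDIRQ;
    else if (hardirqs_enabled)
        c->usage |= TOY_USED_HARDIRQ_ON;

    return (c->usage & both) != both;
}
```

Applied to the report: the spin_lock_bh() call in unix_stream_connect corresponds to toy_mark_usage(&c, false, true), and the skb_dequeue call from the interrupt handler corresponds to toy_mark_usage(&c, true, false), at which point the model returns false. A class that is only ever taken with interrupts disabled (for example via spin_lock_irqsave) never trips the check.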
Mel Gorman
2006-05-30 21:43:00 UTC
Permalink
Post by Arjan van de Ven
Post by Ingo Molnar
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
...
Runtime locking validation.
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {hardirq-on-W} -> {in-hardirq-W} usage.
(&list->lock){+...}, at: [<c0247689>] skb_dequeue+0x12/0x43
hmmm skb_dequeue is called in a hard irq...
Post by Ingo Molnar
[<c012d2a1>] lockdep_acquire+0x56/0x6f
[<c029595b>] _spin_lock_bh+0x1c/0x29
[<c02922e0>] unix_stream_connect+0x2d8/0x3a7
.. yet it was taken only with spin_lock_bh() in unix_stream_connect,
leaving interrupts enabled (and thus not allowing use inside a hard irq)
Post by Ingo Molnar
[<c0243fb4>] sys_connect+0x54/0x71
[<c0244c5c>] sys_socketcall+0x6f/0x166
[<c0295afd>] sysenter_past_esp+0x56/0x8d
irq event stamp: 1886
hardirqs last enabled at (1885): [<c0295a2b>] _spin_unlock_irqrestore+0x35/0x3b
hardirqs last disabled at (1886): [<c01032fb>] common_interrupt+0x1b/0x2c
softirqs last enabled at (0): [<c0114af0>] copy_process+0x265/0x11dc
softirqs last disabled at (0): [<00000000>] init+0x3feffde0/0x1da
no locks held by events/0/4.
[<c0103810>] show_trace_log_lvl+0x4b/0xf4
[<c0103e11>] show_trace+0xd/0x10
[<c0103e58>] dump_stack+0x19/0x1b
[<c012b8be>] print_usage_bug+0x1a4/0x1ae
[<c012c3c6>] mark_lock+0x8a/0x411
[<c012cc55>] __lockdep_acquire+0x302/0x8f8
[<c012d2a1>] lockdep_acquire+0x56/0x6f
[<c0295906>] _spin_lock_irqsave+0x20/0x2f
[<c0247689>] skb_dequeue+0x12/0x43
[<e0bdb7ac>] hpsb_bus_reset+0x55/0xa2 [ieee1394]
yet hpsb_bus_reset() calls skb_dequeue (indirectly, via the inlined
abort_requests() function) in a hard irq.
On x86_64, I'm seeing what may be flakiness related to skb_dequeue. I
haven't had a chance to look too closely, but the serial excerpt I have is
below. The real BUG of interest is near the end with

BUG: sleeping function called from invalid context at include/linux/rwsem.h:49

At the time of failure, a kernel compile was taking place. I've also seen
one ppc machine (not ppc64) lock up. There was no output to console so it
may or may not be related.

INIT: version 2.86 booting
Welcome to Fedora Core
Press 'I' to enter interactive startup.
Setting clock (localtime): Tue May 30 11:04:10 CDT 2006 [ OK ]
Starting udev: [ OK ]
Setting hostname bl6-13.ltc.austin.ibm.com: [ OK ]
Setting up Logical Volume Management: 2 logical volume(s) in volume group "VolGroup00" now active
[ OK ]
Checking filesystems
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/VolGroup00/LogVol00
/dev/VolGroup00/LogVol00: clean, 275453/7929856 files, 2546251/7929856 blocks
[/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
/boot: clean, 62/512512 files, 43374/512064 blocks
[ OK ]
Remounting root filesystem in read-write mode: [ OK ]
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling swap space: [ OK ]
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting readahead_early: Starting background readahead: [ OK ]
[ OK ]
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.17-rc5-mm1-autokern1/kernel/arch/x86_64/kernel/cpufreq/acpi-cpufreq.ko): No such device
Bringing up loopback interface: [ OK ]
Bringing up interface eth1: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
Starting irqbalance: [ OK ]
Starting portmap: [ OK ]
Starting NFS statd: [ OK ]
Starting RPC idmapd: FATAL: Module sunrpc not found.
FATAL: Error running install command for sunrpc
Starting system message bus: [ OK ]
Starting Bluetooth services:[ OK ][ OK ]
Mounting other filesystems: [ OK ]
Starting hidd: [ OK ]
Starting automount: [ OK ]
Starting smartd: [ OK ]
Starting acpi daemon: [ OK ]
Starting hpiod: [ OK ]
Starting hpssd: [ OK ]
Starting cups: [ OK ]
Starting sshd: [ OK ]
Starting sendmail: [ OK ]
Starting sm-client: [ OK ]
Starting console mouse services: [ OK ]
Starting crond: [ OK ]
Starting xfs: [ OK ]
Starting anacron: [ OK ]
Starting atd: [ OK ]
Starting Avahi daemon: [ OK ]
Starting cups-config-daemon: [ OK ]
Starting HAL daemon: [ OK ]

Fedora Core release 5 (Bordeaux)
Kernel 2.6.17-rc5-mm1-autokern1 on an x86_64

bl6-13.ltc.austin.ibm.com login: -- 0:conmux-control -- time-stamp -- May/30/06 9:04:37 --
-- 0:conmux-control -- time-stamp -- May/30/06 9:08:26 --
NMI Watchdog detected LOCKUP on CPU 2
CPU 2
Modules linked in: ipv6 ppdev hidp rfcomm l2cap bluetooth video sony_acpi button battery asus_acpi ac lp parport_pc parport nvram
Pid: 25254, comm: cc1 Not tainted 2.6.17-rc5-mm1-autokern1 #1
RIP: 0010:[<ffffffff810814de>] [<ffffffff810814de>] cache_alloc_refill+0x16a/0x200
RSP: 0018:ffff81001d9edb88 EFLAGS: 00000097
RAX: 00000000ffffffff RBX: 000000000000000f RCX: ffff81001d9edcd4
RDX: ffff8100016df440 RSI: ffff81003c0cc000 RDI: ffff810037fd9400
RBP: ffff81003c0cc000 R08: ffff810037fda000 R09: ffff810037fd7000
R10: ffff81001d9ec000 R11: 0000000000000246 R12: ffff8100016df440
R13: ffff810037fda000 R14: 000000000000002c R15: ffff810037fd9400
FS: 00002b18bc9bcd30(0000) GS:ffff810037e09bc0(0063) knlGS:00000000f7f9e6b0
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000f665c000 CR3: 000000001af2d000 CR4: 00000000000006e0
Process cc1 (pid: 25254, threadinfo ffff81001d9ec000, task ffff81003e048050)
Stack: 000000d000000246 0000000000000246 00000000000000d0 ffff810037fd9400
ffff810037fd9400 ffff81001d9edd68 0000000000000000 ffffffff81082a00
ffff81003b7b14c0 00000000000000d0
Call Trace:
[<ffffffff81082a00>] kmem_cache_alloc+0x7c/0x86
[<ffffffff8122e081>] __alloc_skb+0x30/0x11d
[<ffffffff8122c6e8>] sock_alloc_send_skb+0x6d/0x1ea
[<ffffffff8102c0d5>] __wake_up+0x36/0x4d
[<ffffffff8128949d>] unix_stream_sendmsg+0x14d/0x2ff
[<ffffffff81229fad>] do_sock_write+0xc7/0xd2
[<ffffffff8122a0fd>] sock_aio_write+0x4f/0x5e
[<ffffffff81086f28>] do_sync_write+0xc9/0x106
[<ffffffff812924f2>] do_page_fault+0x46f/0x7b0
[<ffffffff81046b6c>] autoremove_wake_function+0x0/0x2e
[<ffffffff810731f4>] do_mmap_pgoff+0x673/0x774
[<ffffffff8108704c>] vfs_write+0xe7/0x175
[<ffffffff8108718d>] sys_write+0x45/0x6e
[<ffffffff81022994>] cstar_do_call+0x1b/0x65


Code: 49 8b 04 24 48 89 68 08 48 89 45 00 4c 89 65 08 49 89 2c 24
console shuts up ...
NMI Watchdog detected LOCKUP on CPU 1
CPU 1
Modules linked in: ipv6 ppdev hidp rfcomm l2cap bluetooth video sony_acpi button battery asus_acpi ac lp parport_pc parport nvram
Pid: 15, comm: events/1 Not tainted 2.6.17-rc5-mm1-autokern1 #1
RIP: 0010:[<ffffffff81135167>] [<ffffffff81135167>] __delay+0xa/0x10
RSP: 0018:ffff810037f43d80 EFLAGS: 00000012
RAX: 0000000000000008 RBX: ffff8100016df480 RCX: 00000000a155bf6d
RDX: 0000000000000101 RSI: ffff8100016df440 RDI: 0000000000000001
RBP: 000000002e69b72b R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff8100016dfc40 R12: 0000000000000001
R13: 0000000000000000 R14: ffff8100016df440 R15: ffff810037fd9400
FS: 00002b99ea9c72d0(0000) GS:ffff81003efb98c0(0000) knlGS:00000000f7fdb6b0
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000009afb30 CR3: 000000003ebc6000 CR4: 00000000000006e0
Process events/1 (pid: 15, threadinfo ffff810037f42000, task ffff810037fef050)
Stack: ffffffff81140cfc 0000000000000000 ffff8100016df440 ffff810037fda400
ffffffff81081c7a ffff8100016df440 0000000000000000 ffff8100016df440
ffff81003ebd7880 ffff810037fd9400
Call Trace:
[<ffffffff81140cfc>] _raw_spin_lock+0x8b/0xf1
[<ffffffff81081c7a>] drain_array+0x51/0xd3
[<ffffffff810837af>] cache_reap+0x0/0x2ce
[<ffffffff8108389b>] cache_reap+0xec/0x2ce
[<ffffffff810837af>] cache_reap+0x0/0x2ce
[<ffffffff81043354>] run_workqueue+0xa1/0xeb
[<ffffffff8104339e>] worker_thread+0x0/0x137
[<ffffffff810434a3>] worker_thread+0x105/0x137
[<ffffffff8102c02e>] default_wake_function+0x0/0xe
[<ffffffff8102c02e>] default_wake_function+0x0/0xe
[<ffffffff81046652>] kthread+0x107/0x133
[<ffffffff8104339e>] worker_thread+0x0/0x137
[<ffffffff8100a146>] child_rip+0x8/0x12
[<ffffffff8104339e>] worker_thread+0x0/0x137
[<ffffffff8104654b>] kthread+0x0/0x133
[<ffffffff8100a13e>] child_rip+0x0/0x12


Code: 48 39 f8 72 f5 c3 65 8b 04 25 24 00 00 00 48 98 48 69 c0 c0
console shuts up ...
<1>Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
[<ffffffff8122f9aa>] skb_dequeue+0x2c/0x50
PGD 330cf067 PUD 0
Oops: 0002 [1] SMP
last sysfs file: /block/sda/sda1/size
CPU 0
Modules linked in: ipv6 ppdev hidp rfcomm l2cap bluetooth video sony_acpi button battery asus_acpi ac lp parport_pc parport nvram
Pid: 1871, comm: sshd Not tainted 2.6.17-rc5-mm1-autokern1 #1
RIP: 0010:[<ffffffff8122f9aa>] [<ffffffff8122f9aa>] skb_dequeue+0x2c/0x50
RSP: 0018:ffff810032c81c28 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff810032a96510 RCX: 000000000000003f
RDX: 000000000000001f RSI: 0000000000000246 RDI: ffff810032a96528
RBP: ffff81003c0cc2c0 R08: 0000000100000000 R09: 0000000000000246
R10: 0000000000000246 R11: 000000000000001a R12: ffff810032a96528
R13: ffff810032c81da0 R14: ffff810032c81d68 R15: 0000000000000000
FS: 00002ac772097be0(0000) GS:ffffffff8146e000(0000) knlGS:00000000f7fad6b0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000003d28b000 CR4: 00000000000006e0
Process sshd (pid: 1871, threadinfo ffff810032c80000, task ffff810037f4c7d0)
Stack: ffff810032a96510 ffff81003c0cc2c0 ffff810032a96440 ffffffff812899a6
ffffffa100000001 0000001a00000001 0000000000000000 00000040000000d0
0000000000003fe6 ffff81003280c180
Call Trace:
[<ffffffff812899a6>] unix_stream_recvmsg+0x101/0x4bf
[<ffffffff8122c946>] release_sock+0x10/0xae
[<ffffffff81258520>] tcp_sendmsg+0x9b0/0xa82
[<ffffffff81229d87>] do_sock_read+0xc6/0xd1
[<ffffffff81229ed7>] sock_aio_read+0x4f/0x5e
[<ffffffff81086cb0>] do_sync_read+0xc9/0x106
[<ffffffff81046b6c>] autoremove_wake_function+0x0/0x2e
[<ffffffff812905e8>] _spin_unlock_irq+0x6/0xa
[<ffffffff8128e4e5>] thread_return+0x64/0xec
[<ffffffff81086dd1>] vfs_read+0xe4/0x172
[<ffffffff8108711f>] sys_read+0x45/0x6e
[<ffffffff810092be>] system_call+0x7e/0x83


Code: 48 89 58 08 48 c7 45 00 00 00 00 00 48 c7 45 08 00 00 00 00
RIP [<ffffffff8122f9aa>] skb_dequeue+0x2c/0x50 RSP <ffff810032c81c28>
CR2: 0000000000000008
<3>BUG: sleeping function called from invalid context at include/linux/rwsem.h:49
in_atomic():0, irqs_disabled():1

Call Trace:
[<ffffffff81029ba0>] __might_sleep+0xc0/0xc2
[<ffffffff810403ed>] blocking_notifier_call_chain+0x1f/0x4e
[<ffffffff81035a8a>] do_exit+0x22/0x8ce
[<ffffffff81184817>] do_unblank_screen+0x29/0x121
[<ffffffff812927c5>] do_page_fault+0x742/0x7b0
[<ffffffff81029f49>] activate_task+0x4b/0x99
[<ffffffff81098aad>] __pollwait+0x0/0xdd
[<ffffffff81009f8d>] error_exit+0x0/0x84
[<ffffffff8122f9aa>] skb_dequeue+0x2c/0x50
[<ffffffff8122f993>] skb_dequeue+0x15/0x50
[<ffffffff812899a6>] unix_stream_recvmsg+0x101/0x4bf
[<ffffffff8122c946>] release_sock+0x10/0xae
[<ffffffff81258520>] tcp_sendmsg+0x9b0/0xa82
[<ffffffff81229d87>] do_sock_read+0xc6/0xd1
[<ffffffff81229ed7>] sock_aio_read+0x4f/0x5e
[<ffffffff81086cb0>] do_sync_read+0xc9/0x106
[<ffffffff81046b6c>] autoremove_wake_function+0x0/0x2e
[<ffffffff812905e8>] _spin_unlock_irq+0x6/0xa
[<ffffffff8128e4e5>] thread_return+0x64/0xec
[<ffffffff81086dd1>] vfs_read+0xe4/0x172
[<ffffffff8108711f>] sys_read+0x45/0x6e
[<ffffffff810092be>] system_call+0x7e/0x83

NMI Watchdog detected LOCKUP on CPU 3
CPU 3
Modules linked in: ipv6 ppdev hidp rfcomm l2cap bluetooth video sony_acpi button battery asus_acpi ac lp parport_pc parport nvram
Pid: 1710, comm: sshd Not tainted 2.6.17-rc5-mm1-autokern1 #1
RIP: 0010:[<ffffffff81140cfc>] [<ffffffff81140cfc>] _raw_spin_lock+0x8b/0xf1
RSP: 0018:ffff81003319db68 EFLAGS: 00000002
RAX: 0000000000000008 RBX: ffff8100016df480 RCX: 00000000b3e3ec80
RDX: 0000000000000105 RSI: 00000000000000d0 RDI: 0000000000000001
RBP: 000000001713db5d R08: ffff81003c0d0440 R09: ffff81003d177340
R10: 00005555556bf377 R11: 0000000000000246 R12: 0000000000000001
R13: ffff810037fd8c00 R14: 000000000000003c R15: ffff810037fd9400
FS: 00002b2015099be0(0000) GS:ffff8100016dfec0(0000) knlGS:00000000f7fb26b0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00005555556bf168 CR3: 00000000348c8000 CR4: 00000000000006e0
Process sshd (pid: 1710, threadinfo ffff81003319c000, task ffff810037f57810)
Stack: 0000000000000246 00000000000000d0 ffff8100016df440 ffffffff810813ef
000000d000000246 0000000000000246 00000000000000d0 ffff810037fd9400
ffff810037fd9400 ffff81003319dd68
Call Trace:
[<ffffffff810813ef>] cache_alloc_refill+0x7b/0x200
[<ffffffff81082a00>] kmem_cache_alloc+0x7c/0x86
[<ffffffff8122e081>] __alloc_skb+0x30/0x11d
[<ffffffff8122c6e8>] sock_alloc_send_skb+0x6d/0x1ea
[<ffffffff810381e0>] current_fs_time+0x4d/0x52
[<ffffffff8128949d>] unix_stream_sendmsg+0x14d/0x2ff
[<ffffffff81229fad>] do_sock_write+0xc7/0xd2
[<ffffffff8122a0fd>] sock_aio_write+0x4f/0x5e
[<ffffffff8106f6e8>] do_wp_page+0x38e/0x3c1
[<ffffffff81086f28>] do_sync_write+0xc9/0x106
[<ffffffff812924f2>] do_page_fault+0x46f/0x7b0
[<ffffffff81046b6c>] autoremove_wake_function+0x0/0x2e
[<ffffffff812905e8>] _spin_unlock_irq+0x6/0xa
[<ffffffff8128e4e5>] thread_return+0x64/0xec
[<ffffffff81032628>] do_fork+0x138/0x1b0
[<ffffffff8108704c>] vfs_write+0xe7/0x175
(bot:conmon-payload) disconnected
[<ffffffff8108718d>] sys_write+0x45/0x6e
[<ffffffff810092be>] system_call+0x7e/0x83


Code: eb d9 45 85 e4 74 d2 45 31 e4 65 48 8b 04 25 00 00 00 00 65
console shuts up ...
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
Brice Goglin
2006-05-30 21:07:19 UTC
Permalink
Post by Andrew Morton
+node-hotplug-register-cpu-remove-node-struct.patch
+node-hotplug-fixes-callres-of-register_cpu.patch
+node-hotplug-fixes-callres-of-register_cpu-powerpc-warning-fix.patch
+node-hotplug-register_node-fix.patch
NUMA node hotplugging updates
Hi Andrew,

I had to apply the following patch to build this -mm on alpha.

Signed-off-by: Brice Goglin <***@ens-lyon.org>

Brice

Index: linux-mm/arch/alpha/kernel/setup.c
===================================================================
--- linux-mm.orig/arch/alpha/kernel/setup.c 2006-05-30 22:53:54.000000000 +0200
+++ linux-mm/arch/alpha/kernel/setup.c 2006-05-30 22:55:30.000000000 +0200
@@ -481,7 +481,7 @@
 		struct cpu *p = kzalloc(sizeof(*p), GFP_KERNEL);
 		if (!p)
 			return -ENOMEM;
-		register_cpu(p, i, NULL);
+		register_cpu(p, i);
 	}
 	return 0;
 }
Roland Dreier
2006-05-30 21:24:03 UTC
Permalink
I'm seeing problems with MSI-X interrupts on 2.6.17-rc5-mm1. I'll try
to debug the MSI patches in -mm further in the next day or so, but for
now I'll post the symptoms.

When I load the ib_mthca driver with MSI-X interrupts enabled, I get
the following crash as soon as the first interrupt is generated.

[ 329.979089] Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[ 329.995487] [<0000000000000000>]
<8>[ 330.012818] PGD 119477067 PUD 119b48067 PMD 0
[ 330.027009] Oops: 0010 [1] SMP
[ 330.036503] last sysfs file: /class/net/ib2/address
<8>[ 330.051084] CPU 0
<8>[ 330.057932] Modules linked in: ib_mthca ib_srp ib_cm ib_ipoib ib_sa ib_mad ib_core nfs lockd nfs_acl sunrpc ipv6 thermal fan button processor ac battery dm_mod ide_generic ide_disk evdev usbhid ide_cd cdrom amd74xx psmouse serio_raw e1000 pcspkr generic ohci_hcd ehci_hcd ide_core
<8>[ 330.134158] Pid: 0, comm: idle Not tainted 2.6.17-rc5-mm1 #7
<8>[ 330.151851] RIP: 0010:[<0000000000000000>] [<0000000000000000>]
<8>[ 330.170116] RSP: 0000:ffffffff805d4f98 EFLAGS: 00010016
<8>[ 330.187344] RAX: 0000000000005200 RBX: ffffffff80873eb8 RCX: 0000000000000000
<8>[ 330.209448] RDX: ffffffff80873eb8 RSI: ffffffff80863e80 RDI: 0000000000000052
<8>[ 330.231552] RBP: ffffffff805d4fb0 R08: 0000000000000001 R09: ffffffff804380f7
<8>[ 330.253656] R10: ffff81007adc6000 R11: 0000000000000000 R12: 0000000000000052
<8>[ 330.275762] R13: 0000000000090000 R14: 0000000000000000 R15: 0000000000000000
<8>[ 330.297867] FS: 00002b9e555966d0(0000) GS:ffffffff8085c000(0000) knlGS:0000000000000000
<8>[ 330.322823] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<8>[ 330.340777] CR2: 0000000000000000 CR3: 0000000119bd7000 CR4: 00000000000006e0
<8>[ 330.362882] Process idle (pid: 0, threadinfo ffffffff80872000, task ffffffff804baa00)
<8>[ 330.387061] Stack: ffffffff8020c693 ffffffff80207c93 0000000000000100 ffffffff80873ee0
<8>[ 330.411423] ffffffff80209b89 <EOI> ff6500230f54e8fa 65c900000020250c 00000010250c8b48
<8>[ 330.438222] f700001fd8e98148 7400000003582444
<8>[ 330.454231] Call Trace:
<8>[ 330.462870] <IRQ> [<ffffffff8020c693>] do_IRQ+0x5e/0x6f
<8>[ 330.479631] [<ffffffff80207c93>] default_idle+0x0/0x9b
<8>[ 330.496080] [<ffffffff80209b89>] ret_from_intr+0x0/0xf
<8>[ 330.512526] <EOI>Unable to handle kernel paging request at ffffffff82800000 RIP:
[ 332.136320] [<ffffffff8020ad6e>] show_trace+0x145/0x195
<8>[ 332.159591] PGD 203027 PUD 205027 PMD 0
[ 332.172226] Oops: 0000 [2] SMP
[ 332.181720] last sysfs file: /class/net/ib2/address
<
Andrew Morton
2006-05-30 22:45:21 UTC
Permalink
On Tue, 30 May 2006 14:24:03 -0700
Post by Roland Dreier
I'm seeing problems with MSI-X interrupts on 2.6.17-rc5-mm1. I'll try
to debug the MSI patches in -mm further in the next day or so, but for
now I'll post the symptoms.
When I load the ib_mthca driver with MSI-X interrupts enabled, I get
the following crash as soon as the first interrupt is generated.
do_IRQ() did a jump-to-zero. So there's no handler installed.
Post by Roland Dreier
[ 329.995487] [<0000000000000000>]
<8>[ 330.012818] PGD 119477067 PUD 119b48067 PMD 0
[ 330.027009] Oops: 0010 [1] SMP
[ 330.036503] last sysfs file: /class/net/ib2/address
<8>[ 330.051084] CPU 0
<8>[ 330.057932] Modules linked in: ib_mthca ib_srp ib_cm ib_ipoib ib_sa ib_mad ib_core nfs lockd nfs_acl sunrpc ipv6 thermal fan button processor ac battery dm_mod ide_generic ide_disk evdev usbhid ide_cd cdrom amd74xx psmouse serio_raw e1000 pcspkr generic ohci_hcd ehci_hcd ide_core
<8>[ 330.134158] Pid: 0, comm: idle Not tainted 2.6.17-rc5-mm1 #7
<8>[ 330.151851] RIP: 0010:[<0000000000000000>] [<0000000000000000>]
<8>[ 330.170116] RSP: 0000:ffffffff805d4f98 EFLAGS: 00010016
<8>[ 330.187344] RAX: 0000000000005200 RBX: ffffffff80873eb8 RCX: 0000000000000000
<8>[ 330.209448] RDX: ffffffff80873eb8 RSI: ffffffff80863e80 RDI: 0000000000000052
<8>[ 330.231552] RBP: ffffffff805d4fb0 R08: 0000000000000001 R09: ffffffff804380f7
<8>[ 330.253656] R10: ffff81007adc6000 R11: 0000000000000000 R12: 0000000000000052
<8>[ 330.275762] R13: 0000000000090000 R14: 0000000000000000 R15: 0000000000000000
<8>[ 330.297867] FS: 00002b9e555966d0(0000) GS:ffffffff8085c000(0000) knlGS:0000000000000000
<8>[ 330.322823] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<8>[ 330.340777] CR2: 0000000000000000 CR3: 0000000119bd7000 CR4: 00000000000006e0
<8>[ 330.362882] Process idle (pid: 0, threadinfo ffffffff80872000, task ffffffff804baa00)
<8>[ 330.387061] Stack: ffffffff8020c693 ffffffff80207c93 0000000000000100 ffffffff80873ee0
<8>[ 330.411423] ffffffff80209b89 <EOI> ff6500230f54e8fa 65c900000020250c 00000010250c8b48
<8>[ 330.438222] f700001fd8e98148 7400000003582444
<8>[ 330.462870] <IRQ> [<ffffffff8020c693>] do_IRQ+0x5e/0x6f
<8>[ 330.479631] [<ffffffff80207c93>] default_idle+0x0/0x9b
<8>[ 330.496080] [<ffffffff80209b89>] ret_from_intr+0x0/0xf
[ 332.136320] [<ffffffff8020ad6e>] show_trace+0x145/0x195
<8>[ 332.159591] PGD 203027 PUD 205027 PMD 0
[ 332.172226] Oops: 0000 [2] SMP
[ 332.181720] last sysfs file: /class/net/ib2/address
<
The possibly-relevant patches are:

box:/usr/src/25> grep msi series
gregkh-pci-pci-msi-abstractions-and-support-for-altix.patch
gregkh-pci-pci-altix-msi-support.patch
allow-msi-to-work-on-kexec-kernel.patch
pci-disable-msi-mode-in-pci_disable_device.patch
x86_64-msi-apic-build-fix.patch

But this bug seems to be at a higher level - I'd be more suspecting the
genirq patches forgot to install a handler somehow.
Ingo Molnar
2006-05-30 22:49:55 UTC
Permalink
Post by Andrew Morton
On Tue, 30 May 2006 14:24:03 -0700
Post by Roland Dreier
I'm seeing problems with MSI-X interrupts on 2.6.17-rc5-mm1. I'll try
to debug the MSI patches in -mm further in the next day or so, but for
now I'll post the symptoms.
When I load the ib_mthca driver with MSI-X interrupts enabled, I get
the following crash as soon as the first interrupt is generated.
do_IRQ() did a jump-to-zero. So there's no handler installed.
yep. No desc->irq_handler. That should be 'impossible' on x86_64,
because the irq_desc[] array is initialized with handle_bad_irq, and
from that point on x86_64 only uses set_irq_chip_and_handler(), which at
most can set it to another (non-NULL) handle_irq function. Weird.

does MSI muck with the irq_desc[] separately perhaps, clearing
handle_irq in the process?

Ingo
Ingo Molnar
2006-05-30 22:52:54 UTC
Permalink
Post by Ingo Molnar
Post by Andrew Morton
do_IRQ() did a jump-to-zero. So there's no handler installed.
yep. No desc->irq_handler. That should be 'impossible' on x86_64,
because the irq_desc[] array is initialized with handle_bad_irq, and
from that point on x86_64 only uses set_irq_chip_and_handler(), which
at most can set it to another (non-NULL) handle_irq function. Weird.
does MSI muck with the irq_desc[] separately perhaps, clearing
handle_irq in the process?
aha - drivers/pci/msi.c sets msix_irq_type, which has no handle_irq
entry. This needs to be converted to irqchips.

Ingo
Ingo Molnar
2006-05-30 22:58:08 UTC
Permalink
Post by Ingo Molnar
Post by Ingo Molnar
Post by Andrew Morton
do_IRQ() did a jump-to-zero. So there's no handler installed.
yep. No desc->irq_handler. That should be 'impossible' on x86_64,
because the irq_desc[] array is initialized with handle_bad_irq, and
from that point on x86_64 only uses set_irq_chip_and_handler(), which
at most can set it to another (non-NULL) handle_irq function. Weird.
does MSI muck with the irq_desc[] separately perhaps, clearing
handle_irq in the process?
aha - drivers/pci/msi.c sets msix_irq_type, which has no handle_irq
entry. This needs to be converted to irqchips.
still ... that doesn't explain how the irq_desc[].irq_handler got NULL.

Ingo
Thomas Gleixner
2006-05-30 23:05:30 UTC
Permalink
CC'ed Ben, who is hacking on msi, IIRC
Post by Ingo Molnar
Post by Ingo Molnar
Post by Ingo Molnar
does MSI muck with the irq_desc[] separately perhaps, clearing
handle_irq in the process?
aha - drivers/pci/msi.c sets msix_irq_type, which has no handle_irq
entry. This needs to be converted to irqchips.
still ... that doesn't explain how the irq_desc[].irq_handler got NULL.
It has its own irq_desc array

static struct msi_desc* msi_desc[NR_IRQS] = { [0 ... NR_IRQS-1] = NULL };

Too tired right now. I'll look into this tomorrow.

tglx
Ingo Molnar
2006-05-30 23:14:46 UTC
Permalink
Post by Thomas Gleixner
CC'ed Ben, who is hacking on msi, IIRC
Post by Ingo Molnar
Post by Ingo Molnar
Post by Ingo Molnar
does MSI muck with the irq_desc[] separately perhaps, clearing
handle_irq in the process?
aha - drivers/pci/msi.c sets msix_irq_type, which has no handle_irq
entry. This needs to be converted to irqchips.
still ... that doesn't explain how the irq_desc[].irq_handler got NULL.
It has its own irq_desc array
static struct msi_desc* msi_desc[NR_IRQS] = { [0 ... NR_IRQS-1] = NULL };
ah ...

then i guess a quick solution would be to do:

	if (!irq_desc[irq].irq_handler)
		__do_IRQ(irq, regs);
	else
		generic_handle_irq(irq, regs);

in arch/x86_64/kernel/irq.c [and in arch/i386/kernel/irq.c], and
__do_IRQ() should handle the old-style irq-type MSI code just fine.

Ingo
Roland Dreier
2006-05-30 23:32:13 UTC
Permalink
Post by Ingo Molnar
	if (!irq_desc[irq].irq_handler)
		__do_IRQ(irq, regs);
	else
		generic_handle_irq(irq, regs);
in arch/x86_64/kernel/irq.c [and in arch/i386/kernel/irq.c], and
__do_IRQ() should handle the old-style irq-type MSI code just fine.
Indeed (fixing ".irq_handler" to be ".handle_irq"), with that change
MSI-X works fine with ib_mthca on my system. The only slightly funny
quirk is that the IRQ type is shown as "PCI-MSI-X-<NULL>", I guess
because it's printing handle_irq_name(NULL).

However (as BenH pointed out) there's definitely some work to do to
untangle MSI...

- R.
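
[Editor's sketch] The fallback discussed above, using the handle_irq field name from the correction in this message, can be tried out as a self-contained userspace model. Nothing here is kernel code: toy_irq_desc, toy_dispatch, toy_legacy_do_irq and toy_flow_handler are hypothetical stand-ins. The idea shown is only the dispatch decision: if a descriptor has no new-style flow handler installed, fall back to the old-style path instead of jumping through a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy descriptor: only the field relevant to the dispatch decision. */
struct toy_irq_desc {
    void (*handle_irq)(unsigned int irq); /* flow handler, may be NULL */
};

#define TOY_NR_IRQS 4

static struct toy_irq_desc toy_descs[TOY_NR_IRQS];
static unsigned int toy_legacy_calls; /* counts old-style dispatches */
static unsigned int toy_flow_calls;   /* counts new-style dispatches */

/* Stand-in for the old-style path. */
static void toy_legacy_do_irq(unsigned int irq)
{
    (void)irq;
    toy_legacy_calls++;
}

/* Stand-in for a new-style flow handler. */
static void toy_flow_handler(unsigned int irq)
{
    (void)irq;
    toy_flow_calls++;
}

/*
 * Dispatch one interrupt. Returns -1 for an out-of-range number,
 * 0 otherwise. A NULL handle_irq selects the legacy path rather than
 * being called directly.
 */
static int toy_dispatch(unsigned int irq)
{
    if (irq >= TOY_NR_IRQS)
        return -1;
    if (!toy_descs[irq].handle_irq)
        toy_legacy_do_irq(irq);
    else
        toy_descs[irq].handle_irq(irq);
    return 0;
}
```

With toy_descs[1].handle_irq set to toy_flow_handler, toy_dispatch(1) takes the new-style path, while toy_dispatch(0) falls back to the legacy path because its handler was never installed.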
Benjamin Herrenschmidt
2006-05-30 23:15:49 UTC
Permalink
Post by Thomas Gleixner
CC'ed Ben, who is hacking on msi, IIRC
Post by Ingo Molnar
Post by Ingo Molnar
Post by Ingo Molnar
does MSI muck with the irq_desc[] separately perhaps, clearing
handle_irq in the process?
aha - drivers/pci/msi.c sets msix_irq_type, which has no handle_irq
entry. This needs to be converted to irqchips.
still ... that doesn't explain how the irq_desc[].irq_handler got NULL.
It has its own irq_desc array
static struct msi_desc* msi_desc[NR_IRQS] = { [0 ... NR_IRQS-1] = NULL };
Too tired right now. I'll look into this tomorrow.
The only way to fix drivers/pci/msi.c is to delete it.

Honest, there is nothing salvageable in that code. I've been looking at
the issues involved in supporting MSIs on various powerpc platforms and
I came to the conclusion that there isn't a single re-useable line of
code in there. Not only is it totally specific to a given set of Intel
chipsets, but it's also broken beyond imagination. I wonder how that
code got in there in the first place, especially masqueraded as
"generic" code. Greg must have been drunk.

At this point, the only solution for us (powerpc) is to allow the arch
to have its own implementation of the toplevel MSI API (pci_enable_msi()
etc...). From there, depending on what we come up with, we'll look into
moving that back into generic code, but we are under some pressure for
time (stupid 2 weeks merge window thing is a pain sometimes).

Ben.
Greg KH
2006-05-30 23:53:18 UTC
Permalink
Post by Benjamin Herrenschmidt
Post by Thomas Gleixner
CC'ed Ben, who is hacking on msi, IIRC
Post by Ingo Molnar
Post by Ingo Molnar
Post by Ingo Molnar
does MSI muck with the irq_desc[] separately perhaps, clearing
handle_irq in the process?
aha - drivers/pci/msi.c sets msix_irq_type, which has no handle_irq
entry. This needs to be converted to irqchips.
still ... that doesn't explain how the irq_desc[].irq_handler got NULL.
It has its own irq_desc array
static struct msi_desc* msi_desc[NR_IRQS] = { [0 ... NR_IRQS-1] = NULL };
Too tired right now. I'll look into this tomorrow.
The only way to fix drivers/pci/msi.c is to delete it.
Honest, there is nothing salvageable in that code. I've been looking at
the issues involved in supporting MSIs on various powerpc platforms and
I came to the conclusion that there isn't a single reusable line of
code in there. Not only is it totally specific to a given set of Intel
chipsets, but it's also broken beyond imagination. I wonder how that
code got in there in the first place, especially masquerading as
"generic" code. Greg must have been drunk.
No, not drunk, just that no other arch offered up any potential help to
get this working. So, we have one arch that has it working well, and
finally, many years later when PPC64 catches up with the rest of the
world, we have issues :)

Feel free to help untangle it. ia64 just did a big chunk of work on
this, and there are further patches in the -mm tree that help get it
working there. ppc64 and other arch support are also welcome. You are
the ones who know how your arch handles this stuff the best.
Post by Benjamin Herrenschmidt
At this point, the only solution for us (powerpc) is to allow the arch
to have its own implementation of the top-level MSI API (pci_enable_msi()
etc...). From there, depending on what we come up with, we'll look into
moving that back into generic code, but we are under some pressure for
time (the stupid 2-week merge window thing is a pain sometimes).
2-week merge window? Come on, just give it to me and let it sit in -mm
for a while. I keep some stuff in there for over 2 months before it
goes to Linus to make sure that it's safe. You can do the same for your
trees.

I gladly accept patches...

thanks,

greg k-h
Michal Piotrowski
2006-05-30 23:53:22 UTC
Permalink
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
I'm resending this one; it contains additional debug info.

============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
events/1/9 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&base->lock#2){++..}, at: [<c0128115>] run_timer_softirq+0x42/0x17a
{in-hardirq-W} state was registered at:
[<c013869a>] lockdep_acquire+0x50/0x68
[<c02eae04>] _spin_lock_irqsave+0x2d/0x3c
[<c0128b42>] lock_timer_base+0x1f/0x3a
[<c0128bfd>] __mod_timer+0x29/0xaa
[<c0128f48>] mod_timer+0x32/0x36
[<c02903da>] i8042_interrupt+0x21/0x1fb
[<c014c0c8>] handle_IRQ_event+0x1d/0x52
[<c014d007>] handle_edge_irq+0xc7/0x10c
[<c01054ca>] do_IRQ+0x86/0xac
irq event stamp: 44459
hardirqs last enabled at (44458): [<c02eb137>] _spin_unlock_irq+0x24/0x47
hardirqs last disabled at (44459): [<c02ead78>] _spin_lock_irq+0x11/0x38
softirqs last enabled at (44446): [<c012492d>] __do_softirq+0xf0/0xf8
softirqs last disabled at (44453): [<c01053e5>] do_softirq+0x5e/0xbd

other info that might help us debug this:
1 locks held by events/1/9:
#0: (&base->lock#2){++..}, at: [<c0128115>] run_timer_softirq+0x42/0x17a

stack backtrace:
[<c0104208>] show_trace+0x1b/0x20
[<c01042e6>] dump_stack+0x1f/0x24
[<c0136cbc>] print_usage_bug+0x1a5/0x1b1
[<c01372ef>] mark_lock+0x21d/0x3e7
[<c01374fc>] mark_held_locks+0x43/0x63
[<c0138421>] trace_hardirqs_on+0xc4/0x10b
[<c02eb512>] restore_nocheck+0x8/0xb
=======================
---------------------------
| preempt count: 00000003 ]
| 3-level deep critical section nesting:
----------------------------------------
.. [<c02ead7f>] .... _spin_lock_irq+0x18/0x38
.....[<c0128115>] .. ( <= run_timer_softirq+0x42/0x17a)
.. [<c011fe13>] .... vprintk+0x15/0x2e0
.....[<c01200f8>] .. ( <= printk+0x1a/0x1c)
.. [<c02eadf2>] .... _spin_lock_irqsave+0x1b/0x3c
.....[<c011f923>] .. ( <= release_console_sem+0x1f/0x1e5)

Here is dmesg
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-dmesg4

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config4

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)
Steven Rostedt
2006-05-31 03:17:28 UTC
Permalink
Oh look what I found. It seems that little driver of Andrew's has come
back to haunt me :) And I think it has to do with that cute little
disable_irq in vortex_timer again.


============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
idle/0 [HC0[0]:SC1[2]:HE1:SE0] takes:
(&vp->lock){++..}, at: [<f8895b44>] vortex_timer+0x414/0x490 [3c59x]
{in-hardirq-W} state was registered at:
[<c013c759>] lockdep_acquire+0x59/0x70
[<c031a11d>] _spin_lock+0x3d/0x50
[<f8898983>] boomerang_interrupt+0x33/0x470 [3c59x]
[<c0147a11>] handle_IRQ_event+0x31/0x70
[<c0149304>] handle_level_irq+0xa4/0x100
[<c0105658>] do_IRQ+0x58/0x90
[<c010348d>] common_interrupt+0x25/0x2c
[<c010162d>] cpu_idle+0x4d/0xb0
[<c01002e5>] rest_init+0x45/0x50
[<c03f881a>] start_kernel+0x32a/0x460
[<c0100210>] 0xc0100210
irq event stamp: 220672
hardirqs last enabled at (220672): [<c031aa45>] _spin_unlock_irqrestore+0x65/00
hardirqs last disabled at (220671): [<c031a5d6>] _spin_lock_irqsave+0x16/0x60
softirqs last enabled at (220658): [<c0124453>] __do_softirq+0xf3/0x110
softirqs last disabled at (220667): [<c01244e5>] do_softirq+0x75/0x80

other info that might help us debug this:
no locks held by idle/0.

stack backtrace:
<c01052bb> show_trace+0x1b/0x20 <c01052e6> dump_stack+0x26/0x30
<c013af5d> print_usage_bug+0x22d/0x240 <c013b591> mark_lock+0x621/0x6c0
<c013b6ff> __lockdep_acquire+0xcf/0xd30 <c013c759> lockdep_acquire+0x59/0x70
<c031a172> _spin_lock_bh+0x42/0x50 <f8895b44> vortex_timer+0x414/0x490 [3c59x]
<c0128cf9> run_timer_softirq+0xc9/0x1a0 <c01243e7> __do_softirq+0x87/0x110
<c01244e5> do_softirq+0x75/0x80 <c0124650> irq_exit+0x50/0x60
<c01102a3> smp_apic_timer_interrupt+0x73/0x80 <c010354e> apic_timer_interrupt0
<c010162d> cpu_idle+0x4d/0xb0 <c010f2a5> start_secondary+0x455/0x500
<00000000> 0x0 <f7f85fb4> 0xf7f85fb4


-- Steve
Andrew Morton
2006-05-31 04:14:42 UTC
Permalink
On Tue, 30 May 2006 23:17:28 -0400
Post by Steven Rostedt
Oh look what I found. It seems that little driver of Andrew's has come
back to haunt me :) And I think it has to do with that cute little
disable_irq in vortex_timer again.
============================
[ BUG: illegal lock usage! ]
----------------------------
illegal {in-hardirq-W} -> {hardirq-on-W} usage.
(&vp->lock){++..}, at: [<f8895b44>] vortex_timer+0x414/0x490 [3c59x]
[<c013c759>] lockdep_acquire+0x59/0x70
[<c031a11d>] _spin_lock+0x3d/0x50
[<f8898983>] boomerang_interrupt+0x33/0x470 [3c59x]
[<c0147a11>] handle_IRQ_event+0x31/0x70
[<c0149304>] handle_level_irq+0xa4/0x100
[<c0105658>] do_IRQ+0x58/0x90
[<c010348d>] common_interrupt+0x25/0x2c
[<c010162d>] cpu_idle+0x4d/0xb0
[<c01002e5>] rest_init+0x45/0x50
[<c03f881a>] start_kernel+0x32a/0x460
[<c0100210>] 0xc0100210
irq event stamp: 220672
hardirqs last enabled at (220672): [<c031aa45>] _spin_unlock_irqrestore+0x65/00
hardirqs last disabled at (220671): [<c031a5d6>] _spin_lock_irqsave+0x16/0x60
softirqs last enabled at (220658): [<c0124453>] __do_softirq+0xf3/0x110
softirqs last disabled at (220667): [<c01244e5>] do_softirq+0x75/0x80
no locks held by idle/0.
<c01052bb> show_trace+0x1b/0x20 <c01052e6> dump_stack+0x26/0x30
<c013af5d> print_usage_bug+0x22d/0x240 <c013b591> mark_lock+0x621/0x6c0
<c013b6ff> __lockdep_acquire+0xcf/0xd30 <c013c759> lockdep_acquire+0x59/0x70
<c031a172> _spin_lock_bh+0x42/0x50 <f8895b44> vortex_timer+0x414/0x490 [3c59x]
<c0128cf9> run_timer_softirq+0xc9/0x1a0 <c01243e7> __do_softirq+0x87/0x110
<c01244e5> do_softirq+0x75/0x80 <c0124650> irq_exit+0x50/0x60
<c01102a3> smp_apic_timer_interrupt+0x73/0x80 <c010354e> apic_timer_interrupt0
<c010162d> cpu_idle+0x4d/0xb0 <c010f2a5> start_secondary+0x455/0x500
<00000000> 0x0 <f7f85fb4> 0xf7f85fb4
Without having looked at it very hard, I'd venture that this is a false
positive - that driver uses disable_irq() to prevent reentry onto that
lock.

It does that because it knows it's about to spend a long time talking with
the mii registers and it doesn't want to do that with interrupts disabled.
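
For readers following along, here is a rough userspace model of why the validator flags this pattern. It is only a sketch: the struct and function names (lock_usage, cpu_ctx, record_acquire) are made up for illustration and are not lockdep's real data structures, and real lockdep tracks far more state. The point it tries to show is that the check only sees CPU-level hardirq state at each acquire, so a disable_irq() on one IRQ line is invisible to it.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified per-lock history (hypothetical; real lockdep keeps more bits). */
struct lock_usage {
    bool used_in_hardirq;   /* taken at least once inside a hardirq handler */
    bool used_hardirqs_on;  /* taken at least once with hardirqs enabled    */
};

/* CPU state at the moment a lock is acquired. */
struct cpu_ctx {
    bool in_hardirq;        /* currently running a hardirq handler */
    bool hardirqs_enabled;  /* CPU-level interrupt flag            */
};

/*
 * Record one acquire and return true if the combined history is the
 * {in-hardirq} -> {hardirq-on} combination from the report above.
 * Note there is no input describing disable_irq() on a single IRQ line.
 */
static bool record_acquire(struct lock_usage *u, const struct cpu_ctx *c)
{
    if (c->in_hardirq)
        u->used_in_hardirq = true;
    else if (c->hardirqs_enabled)
        u->used_hardirqs_on = true;
    return u->used_in_hardirq && u->used_hardirqs_on;
}

/* Replays the two acquires seen in the 3c59x trace. */
static bool vortex_like_pattern_is_flagged(void)
{
    struct lock_usage vp_lock = { false, false };
    const struct cpu_ctx in_irq_handler = { .in_hardirq = true,  .hardirqs_enabled = false };
    const struct cpu_ctx in_timer       = { .in_hardirq = false, .hardirqs_enabled = true  };

    /* boomerang_interrupt(): spin_lock() from hardirq context */
    bool flagged = record_acquire(&vp_lock, &in_irq_handler);
    assert(!flagged);

    /* vortex_timer(): spin_lock_bh() with hardirqs still enabled */
    return record_acquire(&vp_lock, &in_timer);
}
```

In this model the second acquire is reported even if the driver had masked its own IRQ line first, which matches the "false positive" reading above.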
Ingo Molnar
2006-05-31 06:31:03 UTC
Permalink
Post by Andrew Morton
Without having looked at it very hard, I'd venture that this is a
false positive - that driver uses disable_irq() to prevent reentry
onto that lock.
correct.
Post by Andrew Morton
It does that because it knows it's about to spend a long time talking
with the mii registers and it doesn't want to do that with interrupts
disabled.
i still consider it a 'quirky' locking construct, because disabling
interrupts for a long time also disables all other devices sharing the
same IRQ line - not nice.
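
A toy userspace model of the shared-line problem, in case it helps. All names here (irq_line, line_disable, line_raise and so on) are invented for the sketch and are not kernel APIs; it also simply drops interrupts raised while the line is masked, whereas real hardware would normally keep them pending.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SHARERS 4

typedef void (*irq_handler_fn)(void *dev);

/* One IRQ line shared by several devices (hypothetical model). */
struct irq_line {
    int disable_depth;                  /* > 0 means the whole line is masked */
    size_t nr_handlers;
    irq_handler_fn handler[MAX_SHARERS];
    void *dev[MAX_SHARERS];
};

static int line_add_handler(struct irq_line *l, irq_handler_fn fn, void *dev)
{
    if (l->nr_handlers >= MAX_SHARERS)
        return -1;
    l->handler[l->nr_handlers] = fn;
    l->dev[l->nr_handlers] = dev;
    l->nr_handlers++;
    return 0;
}

/* Models disable_irq(): masks the line for every device on it. */
static void line_disable(struct irq_line *l)
{
    l->disable_depth++;
}

/* Models enable_irq(): undoes one earlier line_disable(). */
static void line_enable(struct irq_line *l)
{
    assert(l->disable_depth > 0);
    l->disable_depth--;
}

/* Raise the line once; returns how many handlers ran (0 while masked). */
static size_t line_raise(struct irq_line *l)
{
    if (l->disable_depth > 0)
        return 0;
    for (size_t i = 0; i < l->nr_handlers; i++)
        l->handler[i](l->dev[i]);
    return l->nr_handlers;
}

/* Trivial handler: counts how often the device was serviced. */
static void count_irq(void *dev)
{
    ++*(int *)dev;
}
```

With a NIC and a second device registered on the same line, a line_disable() issued on behalf of the NIC makes line_raise() return 0 for both, which is the cost being complained about.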

Also, this is a really hard case for lockdep to detect automatically.
(fortunately it's also relatively rare)

OTOH, the straightforward lockdep workaround would be to take the
spinlock and thus disable all local interrupts - not too nice either.

Albeit in some ways it's still a bit nicer conceptually than disabling
the irq line, because other CPUs are still operational, and under
certain locking designs [preempt-rt] spin_lock_irq() does not disable
local interrupts.

Steve, can you think of any better solution? I don't have this card.

Ingo
Steven Rostedt
2006-05-31 11:50:43 UTC
Permalink
Post by Ingo Molnar
Post by Andrew Morton
Without having looked at it very hard, I'd venture that this is a
false positive - that driver uses disable_irq() to prevent reentry
onto that lock.
correct.
Post by Andrew Morton
It does that because it knows it's about to spend a long time talking
with the mii registers and it doesn't want to do that with interrupts
disabled.
i still consider it a 'quirky' locking construct, because disabling
interrupts for a long time also disables all other devices sharing the
same IRQ line - not nice.
Also, this is a really hard case for lockdep to detect automatically.
(fortunately it's also relatively rare)
What's the standard way to teach lockdep about this?
Post by Ingo Molnar
OTOH, the straightforward lockdep workaround would be to take the
spinlock and thus disable all local interrupts - not too nice either.
Albeit in some ways it's still a bit nicer conceptually than disabling
the irq line, because other CPUs are still operational, and under
certain locking designs [preempt-rt] spin_lock_irq() does not disable
local interrupts.
Steve, can you think of any better solution? I don't have this card.
Until this popped up, I didn't know I had this card either ;)
(the last time we dealt with this card was to help someone else)

Anyway, I'll look into the way this card works and start to play with it
when I get some time.

Andrew, do you have any docs that I can read to understand the card a
little better?

Thanks,

-- Steve
Ingo Molnar
2006-05-31 11:55:45 UTC
Permalink
Post by Steven Rostedt
Post by Ingo Molnar
Post by Andrew Morton
It does that because it knows it's about to spend a long time talking
with the mii registers and it doesn't want to do that with interrupts
disabled.
i still consider it a 'quirky' locking construct, because disabling
interrupts for a long time also disables all other devices sharing the
same IRQ line - not nice.
Also, this is a really hard case for lockdep to detect
automatically. (fortunately it's also relatively rare)
What's the standard way to teach lockdep about this?
There isn't one yet. One possibility would be to use existing locks and to get rid
of the disable_irq(). One technique could be to disable the IRQ on the
card (i think the code already does this), and then call
synchronize_irq() instead of disable_irq().
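
A minimal userspace sketch of just the ordering, under heavy assumptions: nic_model and every helper below are hypothetical stand-ins, wait_for_handler() only models what synchronize_irq() does (wait for a running handler on that IRQ to finish), and nothing here shows that this is sufficient for 3c59x, for example when the line is shared with another device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the NIC state that matters for the ordering. */
struct nic_model {
    bool card_irq_enabled;   /* interrupt generation enabled on the card   */
    bool handler_in_flight;  /* the card's handler is running on some CPU  */
    int  mii_reads;          /* number of slow MII polls performed         */
};

/* Mask interrupts at the card itself, not at the IRQ line. */
static void card_irq_off(struct nic_model *n)
{
    n->card_irq_enabled = false;
}

static void card_irq_on(struct nic_model *n)
{
    n->card_irq_enabled = true;
}

/*
 * Stand-in for synchronize_irq(): the real call waits until handlers
 * already running for that IRQ have returned. The model just records
 * that the wait happened.
 */
static void wait_for_handler(struct nic_model *n)
{
    n->handler_in_flight = false;
}

/* The slow MII access; must not race with the card's own handler. */
static void slow_mii_poll(struct nic_model *n)
{
    assert(!n->card_irq_enabled);
    assert(!n->handler_in_flight);
    n->mii_reads++;
}

/* Timer path: mask on the card, wait, do the slow work, unmask. */
static void timer_tick(struct nic_model *n)
{
    card_irq_off(n);
    wait_for_handler(n);
    slow_mii_poll(n);
    card_irq_on(n);
}
```

The difference from disable_irq() is that the IRQ line itself stays enabled the whole time, so other devices on it keep getting serviced.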

Ingo
Arjan van de Ven
2006-05-31 06:39:02 UTC
Permalink
Post by Andrew Morton
On Tue, 30 May 2006 23:17:28 -0400
Without having looked at it very hard, I'd venture that this is a false
positive - that driver uses disable_irq() to prevent reentry onto that
lock.
It does that because it knows it's about to spend a long time talking with
the mii registers and it doesn't want to do that with interrupts disabled.
the SCSI controller that shares that irq with your NIC just *enjoys* long
disable_irq() periods.. it can be nice and lazy about it ;)