Discussion:
perf: 3.17 another perf_fuzzer lockup
Vince Weaver
2014-10-06 15:36:02 UTC
Permalink
I reported this on the other thread but it seems to be unrelated to the
other bug, and this time instead of simply wedging the process I got a
backtrace and the machine locked solid.

This is reproducible, on 3.17 on a core2.

Trying to get something useful out of ftrace but no luck so far.

Vince

[ 843.700042] general protection fault: 0000 [#1] SMP
[ 843.704001] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg mcs7830 evdev usbnet coretemp ohci_pci pcspkr ohci_hcd psmouse serio_raw video wmi i2c_nforce2 button acpi_cpufreq processor thermal_sys sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 843.704001] CPU: 1 PID: 7273 Comm: perf_fuzzer Not tainted 3.17.0+ #86
[ 843.704001] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 843.704001] task: ffff88011a874000 ti: ffff8800bc0ec000 task.ti: ffff8800bc0ec000
[ 843.704001] RIP: 0010:[<ffffffff810cd913>] [<ffffffff810cd913>] perf_event_context_sched_in.isra.75+0x1f/0x90
[ 843.704001] RSP: 0018:ffff8800bc0efd50 EFLAGS: 00010087
[ 843.704001] RAX: ffffea0002ba2d68 RBX: ffff1001e742c000 RCX: 000000000000038f
[ 843.704001] RDX: ffff88011fc95b30 RSI: ffff880037d0eb00 RDI: ffff880037d0e700
[ 843.704001] RBP: ffff8800bc0efd60 R08: ffff8800bc0ec000 R09: 000000000000baff
[ 843.704001] R10: 0000000000000006 R11: 00000000000009bc R12: ffff880037d0e700
[ 843.704001] R13: ffff8800c944f400 R14: 0000000000000001 R15: ffff88011b340800
[ 843.704001] FS: 00007ffc76c17700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 843.704001] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 843.704001] DR0: 0000000001c7b000 DR1: 0000000000000000 DR2: 0000000001c7b000
[ 843.704001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 843.704001] Stack:
[ 843.704001] ffff88011a874000 ffff88011b340800 ffff8800bc0efd90 ffffffff810cd9bb
[ 843.704001] ffff88011a8744e8 ffff88011b340800 ffff88011fc929c0 ffff8800c944f400
[ 843.704001] ffff8800bc0efdc0 ffffffff8105ae62 ffff88011fc929c0 ffff8800c944f400
[ 843.704001] Call Trace:
[ 843.704001] [<ffffffff810cd9bb>] __perf_event_task_sched_in+0x37/0xf4
[ 843.704001] [<ffffffff8105ae62>] finish_task_switch+0x9b/0xa6
[ 843.704001] [<ffffffff815209c1>] __schedule+0x309/0x4a5
[ 843.704001] [<ffffffff81520df3>] _cond_resched+0x28/0x3b
[ 843.704001] [<ffffffff815217b5>] mutex_lock+0x12/0x2f
[ 843.704001] [<ffffffff810cb9bb>] find_get_context+0xfc/0x170
[ 843.704001] [<ffffffff810cff77>] SYSC_perf_event_open+0x47b/0x7f5
[ 843.704001] [<ffffffff810d0651>] SyS_perf_event_open+0xe/0x10
[ 843.704001] [<ffffffff81523ad1>] tracesys+0xd4/0xd9
[ 843.704001] Code: 89 e7 e8 65 fe ff ff 5b 41 5c 5d c3 e8 c7 7e 45 00 55 48 89 e5 41 54 49 89 fc 53 48 8b 07 48 8b 58 40 65 48 03 1c 25 08 ce 00 00 <48> 39 bb d8 00 00 00 74 63 48 89 fe 48 89 df e8 f0 b4 ff ff 49
[ 843.704001] RIP [<ffffffff810cd913>] perf_event_context_sched_in.isra.75+0x1f/0x90
[ 843.704001] RSP <ffff8800bc0efd50>
[ 843.704001] ---[ end trace 24a30c0f4ce68693 ]---
[ 843.960002] ------------[ cut here ]------------
[ 843.960002] WARNING: CPU: 0 PID: 2018 at kernel/watchdog.c:267 watchdog_overflow_callback+0x9b/0xa7()
[ 843.960002] Watchdog detected hard LOCKUP on cpu 0
[ 843.960002] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg mcs7830 evdev usbnet coretemp ohci_pci pcspkr ohci_hcd psmouse serio_raw video wmi i2c_nforce2 button acpi_cpufreq processor thermal_sys sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 843.960002] CPU: 0 PID: 2018 Comm: rsyslogd Tainted: G D 3.17.0+ #86
[ 843.960002] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 843.960002] 0000000000000000 ffff88011fc06c20 ffffffff8151e1be ffff88011fc06c68
[ 843.960002] ffff88011fc06c58 ffffffff8103ee43 ffffffff810a6bd6 ffff88011b349800
[ 843.960002] 0000000000000000 ffff88011fc06d70 ffff88011fc06ef8 ffff88011fc06cb8
[ 843.960002] Call Trace:
[ 843.960002] <NMI> [<ffffffff8151e1be>] dump_stack+0x45/0x56
[ 843.960002] [<ffffffff8103ee43>] warn_slowpath_common+0x7f/0x98
[ 843.960002] [<ffffffff810a6bd6>] ? watchdog_overflow_callback+0x9b/0xa7
[ 843.960002] [<ffffffff8103eea8>] warn_slowpath_fmt+0x4c/0x4e
[ 843.960002] [<ffffffff81525821>] ? ftrace_call+0x5/0x2f
[ 843.960002] [<ffffffff81525821>] ? ftrace_call+0x5/0x2f
[ 843.960002] [<ffffffff810a6bd6>] watchdog_overflow_callback+0x9b/0xa7
[ 843.960002] [<ffffffff810cec50>] __perf_event_overflow+0x139/0x1bd
[ 843.960002] [<ffffffff810cf14b>] perf_event_overflow+0x19/0x1b
[ 843.960002] [<ffffffff8101981a>] intel_pmu_handle_irq+0x29d/0x323
[ 843.960002] [<ffffffff81525821>] ? ftrace_call+0x5/0x2f
[ 843.960002] [<ffffffff81012da8>] perf_event_nmi_handler+0x25/0x3e
[ 843.960002] [<ffffffff81005e88>] nmi_handle+0x60/0xf3
[ 843.960002] [<ffffffff8102b34d>] ? cpumask_clear_cpu.constprop.3+0x11/0x11
[ 843.960002] [<ffffffff810060df>] default_do_nmi+0x50/0xdc
[ 843.960002] [<ffffffff810061cf>] do_nmi+0x64/0xa6
[ 843.960002] [<ffffffff8152572a>] end_repeat_nmi+0x1e/0x2e
[ 843.960002] [<ffffffff8106a531>] ? do_raw_spin_lock+0x1c/0x27
[ 843.960002] [<ffffffff8106a531>] ? do_raw_spin_lock+0x1c/0x27
[ 843.960002] [<ffffffff8106a531>] ? do_raw_spin_lock+0x1c/0x27
[ 843.960002] <<EOE>> <IRQ> [<ffffffff8152350f>] _raw_spin_lock+0xe/0x10
[ 843.960002] [<ffffffff8106326d>] load_balance+0x2bf/0x63d
[ 843.960002] [<ffffffff8105e0cf>] ? try_to_wake_up+0x1c4/0x1d4
[ 843.960002] [<ffffffff81063a5b>] rebalance_domains+0x144/0x205
[ 843.960002] [<ffffffff81063b59>] run_rebalance_domains+0x3d/0x15c
[ 843.960002] [<ffffffff81041bd4>] __do_softirq+0xde/0x238
[ 843.960002] [<ffffffff81041ea3>] irq_exit+0x3f/0x86
[ 843.960002] [<ffffffff8102a4e1>] smp_apic_timer_interrupt+0x35/0x41
[ 843.960002] [<ffffffff8152480a>] apic_timer_interrupt+0x6a/0x70
[ 843.960002] <EOI>
[ 843.960002] ---[ end trace 24a30c0f4ce68694 ]---
[ 843.957990] ------------[ cut here ]------------
[ 843.957990] WARNING: CPU: 1 PID: 7273 at kernel/watchdog.c:267 watchdog_overflow_callback+0x9b/0xa7()
[ 843.957990] Watchdog detected hard LOCKUP on cpu 1
[ 843.957990] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg mcs7830 evdev usbnet coretemp ohci_pci pcspkr ohci_hcd psmouse serio_raw video wmi i2c_nforce2 button acpi_cpufreq processor thermal_sys sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 843.957990] CPU: 1 PID: 7273 Comm: perf_fuzzer Tainted: G D W 3.17.0+ #86
[ 843.957990] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 843.957990] 0000000000000000 ffff88011fc86c20 ffffffff8151e1be ffff88011fc86c68
[ 843.957990] ffff88011fc86c58 ffffffff8103ee43 ffffffff810a6bd6 ffff88011ac2ac00
[ 843.957990] 0000000000000001 ffff88011fc86d70 ffff88011fc86ef8 ffff88011fc86cb8
[ 843.957990] Call Trace:
[ 843.957990] <NMI> [<ffffffff8151e1be>] dump_stack+0x45/0x56
[ 843.957990] [<ffffffff8103ee43>] warn_slowpath_common+0x7f/0x98
[ 843.957990] [<ffffffff810a6bd6>] ? watchdog_overflow_callback+0x9b/0xa7
[ 843.957990] [<ffffffff8103eea8>] warn_slowpath_fmt+0x4c/0x4e
[ 843.957990] [<ffffffff81525821>] ? ftrace_call+0x5/0x2f
[ 843.957990] [<ffffffff810a6bd6>] watchdog_overflow_callback+0x9b/0xa7
[ 843.957990] [<ffffffff810cec50>] __perf_event_overflow+0x139/0x1bd
[ 843.957990] [<ffffffff810cf14b>] perf_event_overflow+0x19/0x1b
[ 843.957990] [<ffffffff8101981a>] intel_pmu_handle_irq+0x29d/0x323
[ 843.957990] [<ffffffff81525821>] ? ftrace_call+0x5/0x2f
[ 843.957990] [<ffffffff81012da8>] perf_event_nmi_handler+0x25/0x3e
[ 843.957990] [<ffffffff81005e88>] nmi_handle+0x60/0xf3
[ 843.957990] [<ffffffff8102b34d>] ? cpumask_clear_cpu.constprop.3+0x11/0x11
[ 843.957990] [<ffffffff8108d3f0>] ? tick_sched_do_timer+0x2f/0x2f
[ 843.957990] [<ffffffff810060df>] default_do_nmi+0x50/0xdc
[ 843.957990] [<ffffffff810061cf>] do_nmi+0x64/0xa6
[ 843.957990] [<ffffffff8152572a>] end_repeat_nmi+0x1e/0x2e
[ 843.957990] [<ffffffff8108d3f0>] ? tick_sched_do_timer+0x2f/0x2f
[ 843.957990] [<ffffffff8106a538>] ? do_raw_spin_lock+0x23/0x27
[ 843.957990] [<ffffffff8106a538>] ? do_raw_spin_lock+0x23/0x27
[ 843.957990] [<ffffffff8106a538>] ? do_raw_spin_lock+0x23/0x27
[ 843.957990] <<EOE>> <IRQ> [<ffffffff8152350f>] _raw_spin_lock+0xe/0x10
[ 843.957990] [<ffffffff8105d52c>] scheduler_tick+0x3d/0x85
[ 843.957990] [<ffffffff8108145a>] update_process_times+0x56/0x65
[ 843.957990] [<ffffffff8108cf19>] tick_sched_handle+0x4a/0x59
[ 843.957990] [<ffffffff8108d42a>] tick_sched_timer+0x3a/0x58
[ 843.957990] [<ffffffff810818c7>] __run_hrtimer+0x92/0x14b
[ 843.957990] [<ffffffff810820a7>] hrtimer_interrupt+0xe8/0x1d3
[ 843.957990] [<ffffffff8102a189>] local_apic_timer_interrupt+0x54/0x57
[ 843.957990] [<ffffffff8102a4dc>] smp_apic_timer_interrupt+0x30/0x41
[ 843.957990] [<ffffffff8152480a>] apic_timer_interrupt+0x6a/0x70
[ 843.957990] <EOI> [<ffffffff8107439d>] ? console_unlock+0x30d/0x32b
[ 843.957990] [<ffffffff810975d5>] ? acct_collect+0x164/0x16e
[ 843.957990] [<ffffffff8103fed7>] do_exit+0x21c/0x8fe
[ 843.957990] [<ffffffff8151c536>] ? printk+0x54/0x56
[ 843.957990] [<ffffffff81005685>] oops_end+0xa6/0xad
[ 843.957990] [<ffffffff810059cf>] die+0x5a/0x63
[ 843.957990] [<ffffffff81002e7f>] do_general_protection+0x96/0x142
[ 843.957990] [<ffffffff815253c2>] general_protection+0x22/0x30
[ 843.957990] [<ffffffff810cd913>] ? perf_event_context_sched_in.isra.75+0x1f/0x90
[ 843.957990] [<ffffffff810cd9bb>] __perf_event_task_sched_in+0x37/0xf4
[ 843.957990] [<ffffffff8105ae62>] finish_task_switch+0x9b/0xa6
[ 843.957990] [<ffffffff815209c1>] __schedule+0x309/0x4a5
[ 843.957990] [<ffffffff81520df3>] _cond_resched+0x28/0x3b
[ 843.957990] [<ffffffff815217b5>] mutex_lock+0x12/0x2f
[ 843.957990] [<ffffffff810cb9bb>] find_get_context+0xfc/0x170
[ 843.957990] [<ffffffff810cff77>] SYSC_perf_event_open+0x47b/0x7f5
[ 843.957990] [<ffffffff810d0651>] SyS_perf_event_open+0xe/0x10
[ 843.957990] [<ffffffff81523ad1>] tracesys+0xd4/0xd9
[ 843.957990] ---[ end trace 24a30c0f4ce68695 ]---
Vince Weaver
2014-10-06 15:55:11 UTC
Permalink
Post by Vince Weaver
[ 843.700042] general protection fault: 0000 [#1] SMP
...
[ 843.704001] task: ffff88011a874000 ti: ffff8800bc0ec000 task.ti: ffff8800bc0ec000
[ 843.704001] RIP: 0010:[<ffffffff810cd913>] [<ffffffff810cd913>] perf_event_context_sched_in.isra.75+0x1f/0x90
For what it's worth, this is

kernel/events/core.c:2646

if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
perf_cgroup_sched_in(prev, task);


ffffffff810cd902: 53 push %rbx
ffffffff810cd903: 48 8b 07 mov (%rdi),%rax
ffffffff810cd906: 48 8b 58 40 mov 0x40(%rax),%rbx
ffffffff810cd90a: 65 48 03 1c 25 08 ce add %gs:0xce08,%rbx
ffffffff810cd911: 00 00
ffffffff810cd913: 48 39 bb d8 00 00 00 cmp %rdi,0xd8(%rbx)
ffffffff810cd91a: 74 63 je ffffffff810cd97f <perf_event_context_sched_in.isra.75+0x8b>
Post by Vince Weaver
[ 843.704001] RSP: 0018:ffff8800bc0efd50 EFLAGS: 00010087
[ 843.704001] RAX: ffffea0002ba2d68 RBX: ffff1001e742c000 RCX: 000000000000038f
[ 843.704001] RDX: ffff88011fc95b30 RSI: ffff880037d0eb00 RDI: ffff880037d0e700
[ 843.704001] RBP: ffff8800bc0efd60 R08: ffff8800bc0ec000 R09: 000000000000baff
[ 843.704001] R10: 0000000000000006 R11: 00000000000009bc R12: ffff880037d0e700
[ 843.704001] R13: ffff8800c944f400 R14: 0000000000000001 R15: ffff88011b340800
[ 843.704001] FS: 00007ffc76c17700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 843.704001] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 843.704001] DR0: 0000000001c7b000 DR1: 0000000000000000 DR2: 0000000001c7b000
[ 843.704001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 843.704001] ffff88011a874000 ffff88011b340800 ffff8800bc0efd90 ffffffff810cd9bb
[ 843.704001] ffff88011a8744e8 ffff88011b340800 ffff88011fc929c0 ffff8800c944f400
[ 843.704001] ffff8800bc0efdc0 ffffffff8105ae62 ffff88011fc929c0 ffff8800c944f400
[ 843.704001] [<ffffffff810cd9bb>] __perf_event_task_sched_in+0x37/0xf4
[ 843.704001] [<ffffffff8105ae62>] finish_task_switch+0x9b/0xa6
[ 843.704001] [<ffffffff815209c1>] __schedule+0x309/0x4a5
[ 843.704001] [<ffffffff81520df3>] _cond_resched+0x28/0x3b
[ 843.704001] [<ffffffff815217b5>] mutex_lock+0x12/0x2f
[ 843.704001] [<ffffffff810cb9bb>] find_get_context+0xfc/0x170
[ 843.704001] [<ffffffff810cff77>] SYSC_perf_event_open+0x47b/0x7f5
[ 843.704001] [<ffffffff810d0651>] SyS_perf_event_open+0xe/0x10
[ 843.704001] [<ffffffff81523ad1>] tracesys+0xd4/0xd9
[ 843.704001] Code: 89 e7 e8 65 fe ff ff 5b 41 5c 5d c3 e8 c7 7e 45 00 55 48 89 e5 41 54 49 89 fc 53 48 8b 07 48 8b 58 40 65 48 03 1c 25 08 ce 00 00 <48> 39 bb d8 00 00 00 74 63 48 89 fe 48 89 df e8 f0 b4 ff ff 49
[ 843.704001] RIP [<ffffffff810cd913>] perf_event_context_sched_in.isra.75+0x1f/0x90
[ 843.704001] RSP <ffff8800bc0efd50>
Chuck Ebbert
2014-10-07 00:52:38 UTC
Permalink
On Mon, 6 Oct 2014 11:55:11 -0400 (EDT)
Post by Vince Weaver
Post by Vince Weaver
[ 843.700042] general protection fault: 0000 [#1] SMP
...
[ 843.704001] task: ffff88011a874000 ti: ffff8800bc0ec000 task.ti: ffff8800bc0ec000
[ 843.704001] RIP: 0010:[<ffffffff810cd913>] [<ffffffff810cd913>] perf_event_context_sched_in.isra.75+0x1f/0x90
For what it's worth, this is
kernel/events/core.c:2646
if (atomic_read(&__get_cpu_var(perf_cgroup_events)))
perf_cgroup_sched_in(prev, task);
ffffffff810cd902: 53 push %rbx
ffffffff810cd903: 48 8b 07 mov (%rdi),%rax
ffffffff810cd906: 48 8b 58 40 mov 0x40(%rax),%rbx
ffffffff810cd90a: 65 48 03 1c 25 08 ce add %gs:0xce08,%rbx
ffffffff810cd911: 00 00
ffffffff810cd913: 48 39 bb d8 00 00 00 cmp %rdi,0xd8(%rbx)
ffffffff810cd91a: 74 63 je ffffffff810cd97f <perf_event_context_sched_in.isra.75+0x8b>
Actually it's:

static void perf_event_context_sched_in(struct perf_event_context *ctx,
struct task_struct *task)
{
struct perf_cpu_context *cpuctx;

cpuctx = __get_cpu_context(ctx);
if (cpuctx->task_ctx == ctx) <======= oops
return;

cpuctx is in %rbx (=ffff1001e742c000) and that's not even a legal
address, which is what caused the general protection fault
Post by Vince Weaver
Post by Vince Weaver
[ 843.704001] RSP: 0018:ffff8800bc0efd50 EFLAGS: 00010087
[ 843.704001] RAX: ffffea0002ba2d68 RBX: ffff1001e742c000 RCX: 000000000000038f
[ 843.704001] RDX: ffff88011fc95b30 RSI: ffff880037d0eb00 RDI: ffff880037d0e700
[ 843.704001] RBP: ffff8800bc0efd60 R08: ffff8800bc0ec000 R09: 000000000000baff
[ 843.704001] R10: 0000000000000006 R11: 00000000000009bc R12: ffff880037d0e700
[ 843.704001] R13: ffff8800c944f400 R14: 0000000000000001 R15: ffff88011b340800
[ 843.704001] FS: 00007ffc76c17700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 843.704001] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 843.704001] DR0: 0000000001c7b000 DR1: 0000000000000000 DR2: 0000000001c7b000
[ 843.704001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 843.704001] ffff88011a874000 ffff88011b340800 ffff8800bc0efd90 ffffffff810cd9bb
[ 843.704001] ffff88011a8744e8 ffff88011b340800 ffff88011fc929c0 ffff8800c944f400
[ 843.704001] ffff8800bc0efdc0 ffffffff8105ae62 ffff88011fc929c0 ffff8800c944f400
[ 843.704001] [<ffffffff810cd9bb>] __perf_event_task_sched_in+0x37/0xf4
[ 843.704001] [<ffffffff8105ae62>] finish_task_switch+0x9b/0xa6
[ 843.704001] [<ffffffff815209c1>] __schedule+0x309/0x4a5
[ 843.704001] [<ffffffff81520df3>] _cond_resched+0x28/0x3b
[ 843.704001] [<ffffffff815217b5>] mutex_lock+0x12/0x2f
[ 843.704001] [<ffffffff810cb9bb>] find_get_context+0xfc/0x170
[ 843.704001] [<ffffffff810cff77>] SYSC_perf_event_open+0x47b/0x7f5
[ 843.704001] [<ffffffff810d0651>] SyS_perf_event_open+0xe/0x10
[ 843.704001] [<ffffffff81523ad1>] tracesys+0xd4/0xd9
[ 843.704001] Code: 89 e7 e8 65 fe ff ff 5b 41 5c 5d c3 e8 c7 7e 45 00 55 48 89 e5 41 54 49 89 fc 53 48 8b 07 48 8b 58 40 65 48 03 1c 25 08 ce 00 00 <48> 39 bb d8 00 00 00 74 63 48 89 fe 48 89 df e8 f0 b4 ff ff 49
[ 843.704001] RIP [<ffffffff810cd913>] perf_event_context_sched_in.isra.75+0x1f/0x90
[ 843.704001] RSP <ffff8800bc0efd50>
Vince Weaver
2014-10-15 18:34:10 UTC
Permalink
OK, so it turns out that the oops I saw with memory corruption wasn't the
bug I was tracking, but something that comes up sometimes when trying to
run ftrace at the same time as fuzzing. So we'll leave that for another
day.

The 3.17+ lockup I am tracking still reproduces as of git from yesterday
(even after the 3.18-rc perf_event merges).

I can use sysrq to get the stack trace, the one CPU is stuck in a call
to find_get_context().

An example backtrace:

[88200.300003] <EOI>
[88200.300003] [<ffffffff81114869>] ? ____cache_alloc+0x130/0x25b
[88200.300003] [<ffffffff8107fb05>] ? __call_rcu.constprop.63+0x1bf/0x1cb
[88200.300003] [<ffffffff8107fb2b>] kfree_call_rcu+0x1a/0x1c
[88200.300003] [<ffffffff810cf84f>] put_ctx+0x51/0x55
[88200.300003] [<ffffffff810d1840>] find_get_context+0x166/0x195
[88200.300003] [<ffffffff810d5856>] SYSC_perf_event_open+0x47b/0x7f5
[88200.300003] [<ffffffff810d5f55>] SyS_perf_event_open+0xe/0x10
[88200.300003] [<ffffffff815362d6>] system_call_fastpath+0x16/0x1b

It looks like the
else if (task->perf_event_ctxp[ctxn])
err = -EAGAIN;

case is triggering non-stop in the retry path of
find_get_context() and so the kernel gets stuck forever retrying.

I can drop some printks in if it will help debug. I've tried running
ftrace, but for whatever reason if I enable ftrace the bug won't trigger.

Vince
Vince Weaver
2014-10-16 21:25:54 UTC
Permalink
Post by Vince Weaver
I can use sysrq to get the stack trace, the one CPU is stuck in a call
to find_get_context().
[88200.300003] <EOI>
[88200.300003] [<ffffffff81114869>] ? ____cache_alloc+0x130/0x25b
[88200.300003] [<ffffffff8107fb05>] ? __call_rcu.constprop.63+0x1bf/0x1cb
[88200.300003] [<ffffffff8107fb2b>] kfree_call_rcu+0x1a/0x1c
[88200.300003] [<ffffffff810cf84f>] put_ctx+0x51/0x55
[88200.300003] [<ffffffff810d1840>] find_get_context+0x166/0x195
[88200.300003] [<ffffffff810d5856>] SYSC_perf_event_open+0x47b/0x7f5
[88200.300003] [<ffffffff810d5f55>] SyS_perf_event_open+0xe/0x10
[88200.300003] [<ffffffff815362d6>] system_call_fastpath+0x16/0x1b
It looks like the
else if (task->perf_event_ctxp[ctxn])
err = -EAGAIN;
It is indeed stuck there, waiting for task->perf_event_ctxp[1] to get
set to zero, which never happens.

As far as I can tell it's when a Software event is being opened.

Still struggling through the code trying to figure out what's going on.

[ 7071.252607] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.259439] VMW: type=1 config=8
[ 7071.262713] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.269506] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.276299] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.283087] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.289879] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.296671] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.303457] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.310248] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7071.317035] VMW: task->perf_event_ctxp[1]=ffff8800cb12ec00, EAGAIN, ref=1
[ 7076.256032] find_get_context: 7246310 callbacks suppressed

Vince
Vince Weaver
2014-10-17 14:19:50 UTC
Permalink
Post by Vince Weaver
Post by Vince Weaver
It looks like the
else if (task->perf_event_ctxp[ctxn])
err = -EAGAIN;
It is indeed stuck there, waiting for task->perf_event_ctxp[1] to get
set to zero, which never happens.
OK, so with some more printk()s, it looks like somehow the parent thread
is trying to open a software event on itself.

task->perf_event_ctxp[1]

has a valid pointer, but the ctx it points to has a ctx->lock of 0.

So perf_lock_task_context() always returns NULL.

So in find_get_context()
we get stuck in an infinite retry loop, waiting forever for
either ctx->lock to go positive or for task->perf_event_ctxp[1]
to go NULL, neither of which is going to happen.

Now to find out why this could happen. Probably something to do with
crazy RCU magic :(

Vince
Vince Weaver
2014-10-17 15:21:41 UTC
Permalink
Post by Vince Weaver
Now to find out why this could happen. Probably something to do with
crazy RCU magic :(
it looks like there's an unbalanced get_ctx() / put_ctx() here, as the
software event context on the main process should not get decremented
to 0 unless that process is exiting, yet it happens.

Maybe this is bisectable. Hmmm.

[ 106.781177] VMW: using pid 2941
[ 127.216558] ------------[ cut here ]------------

And here's where ctx->refcount gets decremented to 0.

[ 127.221237] WARNING: CPU: 0 PID: 2941 at kernel/events/core.c:905 put_ctx+0x57/0x8e()
[ 127.256799] CPU: 0 PID: 2941 Comm: perf_fuzzer Not tainted 3.17.0+ #97
[ 127.263372] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 127.272289] 0000000000000009 ffff8800cb107d98 ffffffff81530f3c 000000000000249e
[ 127.279954] 0000000000000000 ffff8800cb107dd8 ffffffff8104005d ffff8800cae4b750
[ 127.287621] ffffffff810cf819 ffff8800cbb26400 ffff8800cae4b000 ffff8800cbb26410
[ 127.295285] Call Trace:
[ 127.297789] [<ffffffff81530f3c>] dump_stack+0x46/0x58
[ 127.302980] [<ffffffff8104005d>] warn_slowpath_common+0x81/0x9b
[ 127.309036] [<ffffffff810cf819>] ? put_ctx+0x57/0x8e
[ 127.314134] [<ffffffff8104011a>] warn_slowpath_null+0x1a/0x1c
[ 127.320022] [<ffffffff810cf819>] put_ctx+0x57/0x8e
[ 127.324957] [<ffffffff810cf898>] __free_event+0x48/0x71
[ 127.330326] [<ffffffff8112bb01>] ? __d_free_external+0x29/0x4f
[ 127.336298] [<ffffffff810d1311>] _free_event+0xd6/0xdb
[ 127.341585] [<ffffffff810d13ee>] put_event+0xd8/0xe1
[ 127.346693] [<ffffffff810d141e>] perf_release+0x15/0x19
[ 127.352062] [<ffffffff8111cd7d>] __fput+0xf1/0x1a6
[ 127.356994] [<ffffffff8111ce6a>] ____fput+0xe/0x10
[ 127.361931] [<ffffffff81055402>] task_work_run+0x83/0x9a
[ 127.367389] [<ffffffff810029ca>] do_notify_resume+0x5a/0x61
[ 127.373106] [<ffffffff81536720>] int_signal+0x12/0x17
[ 127.378300] ---[ end trace 8508b4f6a48d2f87 ]---

and here a little later is when we try to add a new software event
but it gets infinitely stuck.

[ 127.385717] VMW: task->perf_event_ctxp[1]=ffff8800cbb26400, EAGAIN, ref=1
[ 127.392566] VMW: pmu->type=1 type=1 config=8 pid=2941


Vince
Vince Weaver
2014-10-20 04:48:43 UTC
Permalink
Post by Vince Weaver
Post by Vince Weaver
Now to find out why this could happen. Probably something to do with
crazy RCU magic :(
it looks like there's an unbalanced get_ctx() / put_ctx() here, as the
software event context on the main process should not get decremented
to 0 unless that process is exiting, yet it happens.
Maybe this is bisectable. Hmmm.
nope. Wasted 3 days trying to bisect this. Something during the 3.17
merge window changed enough to make the perf_fuzzer results
non-deterministic, and that makes it nearly impossibe to reproduce the
issue properly. Ugh. I might be reduced to sprinkling printks around
unless anyone chimes in with better advice.

Vince
Vince Weaver
2014-10-20 14:48:21 UTC
Permalink
Post by Vince Weaver
nope. Wasted 3 days trying to bisect this. Something during the 3.17
merge window changed enough to make the perf_fuzzer results
non-deterministic, and that makes it nearly impossibe to reproduce the
issue properly. Ugh. I might be reduced to sprinkling printks around
unless anyone chimes in with better advice.
yes, so 3 different bisects from 3 different starting points and it
bisects each time to:

commit 29b88e23a9212136d39b0161a39afe587d0170a5
Merge: 2521129 4e3a25b
Author: Linus Torvalds <***@linux-foundation.org>
Date: Mon Aug 4 18:34:04 2014 -0700

Merge tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/dri

Pull driver core updates from Greg KH:
"Here's the big driver-core pull request for 3.17-rc1.

which makes no sense at all :(

Vince

Loading...