Discussion:
INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=10002 jiffies)
Fengguang Wu
2012-09-25 11:19:38 UTC
Permalink
Hi Paul,

I've just bisected down one RCU stall problem:

[ 12.035785] pktgen: Packet Generator for packet performance testing. Version: 2.74
[ 12.435439] atkbd: probe of serio0 rejects match -19
[ 111.700160] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=10002 jiffies)
[ 111.700171] Pid: 0, comm: swapper/0 Not tainted 3.6.0-rc5-00004-gda10491 #1
[ 111.700178] Call Trace:
[ 111.700475] [<c10c3c84>] rcu_check_callbacks+0x544/0x570
[ 111.700538] [<c1075e86>] update_process_times+0x36/0x70
[ 111.700547] [<c10a6267>] tick_sched_timer+0x57/0xc0
[ 111.700552] [<c108758a>] __run_hrtimer.isra.31+0x4a/0xc0
[ 111.700557] [<c10a6210>] ? tick_nohz_handler+0xf0/0xf0
[ 111.700559] [<c1088155>] hrtimer_interrupt+0xf5/0x290
[ 111.700562] [<c1091cb8>] ? sched_clock_idle_wakeup_event+0x18/0x20
[ 111.700565] [<c10a6399>] ? tick_nohz_stop_idle+0x39/0x40
[ 111.700572] [<c104f56f>] smp_apic_timer_interrupt+0x4f/0x80
[ 111.700587] [<c1753636>] apic_timer_interrupt+0x2a/0x30
[ 111.700593] [<c10565b5>] ? native_safe_halt+0x5/0x10
[ 111.700599] [<c1039f89>] default_idle+0x29/0x50
[ 111.700601] [<c103a958>] cpu_idle+0x68/0xb0
[ 111.700609] [<c16f2ff7>] rest_init+0x67/0x70
[ 111.700627] [<c1af7929>] start_kernel+0x2ea/0x2f0
[ 111.700629] [<c1af7474>] ? repair_env_string+0x51/0x51
[ 111.700631] [<c1af72a2>] i386_start_kernel+0x78/0x7d
[ 127.040302] bus: 'serio': driver_probe_device: matched device serio0 with driver atkbd
[ 127.041308] CPA self-test:

to this commit:

commit 06ae115a1d551cd952d80df06eaf8b5153351875
Author: Paul E. McKenney <***@linux.vnet.ibm.com>
Date: Sun Aug 14 15:56:54 2011 -0700

rcu: Avoid having just-onlined CPU resched itself when RCU is idle

Thanks,
Fengguang
Paul E. McKenney
2012-09-25 15:07:01 UTC
Permalink
Post by Fengguang Wu
Hi Paul,
[ 12.035785] pktgen: Packet Generator for packet performance testing. Version: 2.74
[ 12.435439] atkbd: probe of serio0 rejects match -19
[ 111.700160] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=10002 jiffies)
[ 111.700171] Pid: 0, comm: swapper/0 Not tainted 3.6.0-rc5-00004-gda10491 #1
[ . . . ]
[ 127.040302] bus: 'serio': driver_probe_device: matched device serio0 with driver atkbd
commit 06ae115a1d551cd952d80df06eaf8b5153351875
Date: Sun Aug 14 15:56:54 2011 -0700
rcu: Avoid having just-onlined CPU resched itself when RCU is idle
Interesting. Of course the stack is from the CPU that detected the
problem rather than the problematic CPU. ;-)

Could you please try the following patch?

Thanx, Paul
------------------------------------------------------------------------

rcu: Fix day-one dyntick-idle stall-warning bug

Each grace period is supposed to have at least one callback waiting
for that grace period to complete. However, if CONFIG_NO_HZ=n, an
extra callback-free grace period is no big problem -- it will chew up
a tiny bit of CPU time, but it will complete normally. In contrast,
CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
sleep indefinitely, in turn indefinitely delaying completion of the
callback-free grace period. Given that nothing is waiting on this grace
period, this is also not a problem.

That is, unless RCU CPU stall warnings are also enabled, as they are
in recent kernels. In this case, if a CPU wakes up after at least one
minute of inactivity, an RCU CPU stall warning will result. The reason
that no one noticed until quite recently is that most systems have enough
OS noise that they will never remain absolutely idle for a full minute.
But there are some embedded systems with cut-down userspace configurations
that consistently get into this situation.

All this raises the question of exactly how a callback-free grace period
gets started in the first place. This can happen due to the fact that
CPUs do not necessarily agree on which grace period is in progress.
If a CPU believes that the grace period that just completed is still
ongoing, it will think that its callbacks need to wait for another
grace period, never mind the fact that the grace period they were
waiting for has just completed. This CPU can therefore erroneously
decide to start a new grace period. Note that this can happen in
TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system: Deadlock
considerations mean that the CPU that detected the end of the grace
period is not necessarily officially informed of this fact for some time.

Once this CPU notices that the earlier grace period completed, it will
invoke its callbacks. It then won't have any callbacks left. If no
other CPU has any callbacks, we now have a callback-free grace period.

This commit therefore makes CPUs check more carefully before starting a
new grace period. This new check relies on an array of tail pointers
into each CPU's list of callbacks. If the CPU is up to date on which
grace periods have completed, it checks to see if any callbacks follow
the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
follow the RCU_WAIT_TAIL segment. The reason that this works is that
the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
as soon as the CPU is officially notified that the old grace period
has ended.

This change is to cpu_needs_another_gp(), which is called in a number
of places. The only one that really matters is in rcu_start_gp(), where
the root rcu_node structure's ->lock is held, which prevents any
other CPU from starting or completing a grace period, so that the
comparison that determines whether the CPU is missing the completion
of a grace period is stable.

Reported-by: Becky Bruce <***@gmail.com>
Reported-by: Subodh Nijsure <***@grid-net.com>
Reported-by: Paul Walmsley <***@pwsan.com>
Signed-off-by: Paul E. McKenney <***@linaro.org>
Signed-off-by: Paul E. McKenney <***@linux.vnet.ibm.com>
Tested-by: Paul Walmsley <***@pwsan.com> # OMAP3730, OMAP4430
Cc: ***@vger.kernel.org

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f280e54..f7bcd9e 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -305,7 +305,9 @@ cpu_has_callbacks_ready_to_invoke(struct rcu_data *rdp)
 static int
 cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
 {
-	return *rdp->nxttail[RCU_DONE_TAIL] && !rcu_gp_in_progress(rsp);
+	return *rdp->nxttail[RCU_DONE_TAIL +
+			     ACCESS_ONCE(rsp->completed) != rdp->completed] &&
+	       !rcu_gp_in_progress(rsp);
 }

 /*
Avi Kivity
2012-09-30 11:10:55 UTC
Permalink
[ . . . ]
But could you also please send your .config file and a description of
the workload you are running?
.config attached.
It's basically the below commands. The exact initrd is not relevant in
this case because it's a boot-time warning, before user space is
started. The stalls happen roughly once in every 10 boots.
Yow!!!
You have severe cross-CPU time-synchronization problems. See for
example the first dmesg, with the relevant part extracted right here.
One CPU believes that it is about 37 seconds past boot, and the other
CPU believes that it is about 137 seconds past boot. Given that large
a time difference, an RCU CPU stall warning is expected behavior.
Good spot! Yeah, I noticed that huge timestamp gap, but didn't take
it seriously enough.
Get your two CPUs in agreement about what time it is, and I bet that
the CPU stall warnings will go away.
Possibly KVM related? Because the warnings show up in many test boxes
running KVM and so is not likely some hardware specific issue.
I vaguely recall seeing something recently. But let's ask the KVM and
timekeeping guys.
From the logs it looks like hpet (why not kvmclock?) is used for the
clock, it should not generate such drifts since it is a global clock.
Can you verify current_clocksource on a boot that actually failed (in
case the clocksource is switched during runtime)?
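For reference, the clocksource actually in use can be read at runtime via sysfs; a small sketch (standard Linux sysfs path assumed, with a fallback note if it is not mounted):

```shell
# Print the clocksource in use and the alternatives (hpet, tsc,
# kvm-clock, ...) for the first clocksource device.
d=/sys/devices/system/clocksource/clocksource0
for f in current_clocksource available_clocksource; do
    if [ -r "$d/$f" ]; then
        printf '%s: %s\n' "$f" "$(cat "$d/$f")"
    else
        printf '%s: <not available>\n' "$f"
    fi
done
```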
--
error compiling committee.c: too many arguments to function
Fengguang Wu
2012-09-30 11:23:08 UTC
Permalink
Post by Avi Kivity
[ . . . ]
From the logs it looks like hpet (why not kvmclock?) is used for the
clock, it should not generate such drifts since it is a global clock.
Can you verify current_clocksource on a boot that actually failed (in
case the clocksource is switched during runtime)?
I've checked the dmesg cited by Paul (attached). Yes, it contains the
lines

[ 4.970051] Switching to clocksource hpet

and then

[ 7.250353] Switching to clocksource tsc

And there are no kvm-clock lines. Oh well, for this particular kernel:

# CONFIG_KVM_CLOCK is not set

I'm not sure how this happened; maybe some kconfig option that
CONFIG_KVM_CLOCK depends on was randconfig'ed to off.

Thanks,
Fengguang
Avi Kivity
2012-09-30 11:32:46 UTC
Permalink
Post by Fengguang Wu
Post by Avi Kivity
[ . . . ]
Can you verify current_clocksource on a boot that actually failed (in
case the clocksource is switched during runtime)?
I've checked out the dmesg that's cited by Paul, attached. Yes it
contains lines
[ 4.970051] Switching to clocksource hpet
and then
[ 7.250353] Switching to clocksource tsc
Ah, tsc will certainly break on kvm if the hardware doesn't provide a
constant tsc source. I'm surprised the guest kernel didn't detect it
and switch back to hpet though.
--
error compiling committee.c: too many arguments to function
Fengguang Wu
2012-09-30 11:59:37 UTC
Permalink
Post by Avi Kivity
Post by Fengguang Wu
Post by Avi Kivity
[ . . . ]
I've checked out the dmesg that's cited by Paul, attached. Yes it
contains lines
[ 4.970051] Switching to clocksource hpet
and then
[ 7.250353] Switching to clocksource tsc
Ah, tsc will certainly break on kvm if the hardware doesn't provide a
constant tsc source. I'm surprised the guest kernel didn't detect it
and switch back to hpet though.
Thanks, it's good to know the root cause. All the dmesgs show the same
hpet+tsc switching pattern (and never switching back):

$ grep Switching dmesg-kvm_bisect2-inn-*21
dmesg-kvm_bisect2-inn-41931-2012-09-27-10-37-51-3.6.0-rc7-bisect2-00078-g593d100-21:[ 4.111415] Switching to clocksource hpet
dmesg-kvm_bisect2-inn-41931-2012-09-27-10-37-51-3.6.0-rc7-bisect2-00078-g593d100-21:[ 6.550098] Switching to clocksource tsc
dmesg-kvm_bisect2-inn-41931-2012-09-27-10-41-48-3.6.0-rc7-bisect2-00078-g593d100-21:[ 3.927716] Switching to clocksource hpet
dmesg-kvm_bisect2-inn-41931-2012-09-27-10-41-48-3.6.0-rc7-bisect2-00078-g593d100-21:[ 6.030117] Switching to clocksource tsc
dmesg-kvm_bisect2-inn-42322-2012-09-27-10-35-17-3.6.0-rc7-bisect2-00078-g593d100-21:[ 3.587408] Switching to clocksource hpet
dmesg-kvm_bisect2-inn-42322-2012-09-27-10-35-17-3.6.0-rc7-bisect2-00078-g593d100-21:[ 5.812400] Switching to clocksource tsc
dmesg-kvm_bisect2-inn-42322-2012-09-27-10-43-56-3.6.0-rc7-bisect2-00078-g593d100-21:[ 4.294842] Switching to clocksource hpet
dmesg-kvm_bisect2-inn-42322-2012-09-27-10-43-56-3.6.0-rc7-bisect2-00078-g593d100-21:[ 6.491696] Switching to clocksource tsc
dmesg-kvm_bisect2-inn-42322-2012-09-27-10-47-03-3.6.0-rc7-bisect2-00078-g593d100-21:[ 4.745749] Switching to clocksource hpet
dmesg-kvm_bisect2-inn-42322-2012-09-27-10-47-03-3.6.0-rc7-bisect2-00078-g593d100-21:[ 7.193121] Switching to clocksource tsc
dmesg-kvm_bisect2-inn-42527-2012-09-27-10-38-38-3.6.0-rc7-bisect2-00078-g593d100-21:[ 4.970051] Switching to clocksource hpet
dmesg-kvm_bisect2-inn-42527-2012-09-27-10-38-38-3.6.0-rc7-bisect2-00078-g593d100-21:[ 7.250353] Switching to clocksource tsc

And these are the stall times:

$ grep -hC1 stalls dmesg-kvm_bisect2-inn-*21
[ 36.122624] bus: 'platform': really_probe: probing driver i8042 with device i8042
[ 36.106893] INFO: rcu_preempt self-detected stall on CPU[ 210.200388] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=17413 jiffies)
[ 210.200417] Pid: 0, comm: swapper/0 Not tainted 3.6.0-rc7-bisect2-00078-g593d100 #21
--
[ 35.403888] bus: 'platform': really_probe: probing driver i8042 with device i8042
[ 212.130131] INFO: rcu_preempt detected stalls on CPUs/tasks:[ 212.131029] rcu-torture: rtc: c1a5e988 ver: 228 tfle: 0 rta: 228 rtaf: 162 rtf: 206 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 988 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=100) barrier: 0/0:0
[ 212.131029] rcu-torture: Reader Pipe: 16284405 36 0 0 0 0 0 0 0 0 0
--
[ 35.458755] bus: 'platform': really_probe: probing driver i8042 with device i8042
[ 35.448369] INFO: rcu_preempt self-detected stall on CPU { 1} [ 210.260157] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=17485 jiffies)
[ 210.260179] Pid: 0, comm: swapper/0 Not tainted 3.6.0-rc7-bisect2-00078-g593d100 #21
--
[ 36.079480] INFO: rcu_preempt self-detected stall on CPU { 1
[ 223.710182] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=18767 jiffies)
[ 223.710201] Pid: 0, comm: swapper/0 Not tainted 3.6.0-rc7-bisect2-00078-g593d100 #21
--
[ 37.831174] kobject: ':t-0000168' (c793a24c): fill_kobj_path: path = '/kernel/slab/:t-0000168'
[ 136.910092] INFO: rcu_preempt detected stalls on CPUs/tasks: {} (detected by 1, t=10002 jiffies)
[ 136.910092] INFO: Stall ended before state dump start
--
[ 37.672159] kobject: 'iscsi_session' (cc2c0c0c): fill_kobj_path: path = '/class/iscsi_session'
[ 37.672791] INFO: rcu_preempt self-detected stall on CPU { 1[ 137.900041] INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=10028 jiffies)
[ 137.900041] Pid: 41, comm: rcu_torture_rea Not tainted 3.6.0-rc7-bisect2-00078-g593d100 #21
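The timestamp jumps above can also be flagged mechanically; a hypothetical helper (not something used in the thread) that reports consecutive printk timestamps more than 60 seconds apart, with demo input inline (point the here-doc at a real dmesg instead):

```shell
# Split each line on '[' and ']' so $2 is the printk timestamp,
# then report jumps larger than 60 s between consecutive lines.
awk -F'[][]' '/^\[/ {
    t = $2 + 0
    if (prev != "" && t - prev > 60)
        printf "gap of %.0f s before: %s\n", t - prev, $0
    prev = t
}' <<'EOF'
[   36.122624] bus: probing driver i8042
[  210.200388] INFO: rcu_preempt detected stalls on CPUs/tasks
EOF
```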

Thanks,
Fengguang
John Stultz
2012-10-08 23:40:06 UTC
Permalink
Post by Fengguang Wu
Post by Avi Kivity
Post by Fengguang Wu
Post by Avi Kivity
[ . . . ]
I've checked out the dmesg that's cited by Paul, attached. Yes it
contains lines
[ 4.970051] Switching to clocksource hpet
and then
[ 7.250353] Switching to clocksource tsc
Ah, tsc will certainly break on kvm if the hardware doesn't provide a
constant tsc source. I'm surprised the guest kernel didn't detect it
and switch back to hpet though.
Thanks, it's good to know the root cause. All the dmesgs show the same hpet+tsc
$ grep Switching dmesg-kvm_bisect2-inn-*21
dmesg-kvm_bisect2-inn-41931-2012-09-27-10-37-51-3.6.0-rc7-bisect2-00078-g593d100-21:[ 4.111415] Switching to clocksource hpet
dmesg-kvm_bisect2-inn-41931-2012-09-27-10-37-51-3.6.0-rc7-bisect2-00078-g593d100-21:[ 6.550098] Switching to clocksource tsc
Is this still an open issue? Fengguang's mail sounds like it's resolved,
but I'm not sure it is.

The switching from HPET -> TSC I believe is expected, as the refined
calibration will delay the TSC from being registered for a few seconds.
However, it's unclear why the TSC, if it is faulty, isn't being caught
and demoted by the clocksource watchdog.
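When the watchdog does demote a clocksource, the kernel logs a "Clocksource ... unstable" message (from kernel/time/clocksource.c), so the saved boot logs can be grepped for it next to the switch lines; the file glob below is illustrative, matching the dmesg names quoted earlier in the thread:

```shell
# Look for watchdog demotions alongside the clocksource switches in
# the saved boot logs; no 'unstable' hit would confirm the watchdog
# never fired before the stall.  (Glob is illustrative.)
grep -EHi 'clocksource.*unstable|switching to clocksource' \
    dmesg-kvm_bisect2-inn-*21 2>/dev/null || echo 'no matches'
```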

I'm also curious why this originally bisected down to
06ae115a1d551cd952d8 (when using the kvm clock) if it was more of a
hardware issue. And in those logs, I don't see the printk time-stamp
inconsistencies that were alluded to in this thread.

Fengguang: Is this still reproducible? Do you have any details (dmesg)
about the host system as well?

thanks
-john

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Fengguang Wu
2012-09-30 11:18:12 UTC
Permalink
Post by Avi Kivity
[ . . . ]
From the logs it looks like hpet (why not kvmclock?) is used for the
Hi Avi! Thanks for looking into this. It seems you already have the
full logs, attached in my previous email?

FYI, I've enabled CONFIG_KVM_CLOCK/CONFIG_KVM_GUEST for all bootable
kernels and here is the related boot message:

[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 0:1b7ec81, boot clock
Post by Avi Kivity
clock, it should not generate such drifts since it is a global clock.
Can you verify current_clocksource on a boot that actually failed (in
case the clocksource is switched during runtime)?
I see a line

[ 2.011710] Switching to clocksource kvm-clock

w/o any indication of errors.

Thanks,
Fengguang
Avi Kivity
2012-09-30 11:31:42 UTC
Permalink
Post by Fengguang Wu
Post by Avi Kivity
[ . . . ]
From the logs it looks like hpet (why not kvmclock?) is used for the
Hi Avi! Thanks for looking into this. It seems you have the full logs
attached in my previous email?
FYI, I've enabled CONFIG_KVM_CLOCK/CONFIG_KVM_GUEST for all bootable
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 0:1b7ec81, boot clock
Post by Avi Kivity
clock, it should not generate such drifts since it is a global clock.
Can you verify current_clocksource on a boot that actually failed (in
case the clocksource is switched during runtime)?
I see a line
[ 2.011710] Switching to clocksource kvm-clock
w/o any indication of errors.
So, with kvmclock it works and with hpet it fails? Strange; hpet is
simple while kvmclock is more complicated, so I'd have expected the
opposite behaviour.
--
error compiling committee.c: too many arguments to function