Discussion:
kfree(NULL)
Daniel Walker
2006-04-21 07:03:31 UTC
Permalink
I included a patch, not that it's needed. Recently I've been
evaluating likely/unlikely branch prediction. One thing that I found
is that the kfree function is often called with a NULL "objp". In fact,
it's so frequent that the "unlikely" branch predictor should be inverted!
Or at least on my configuration.

Here are some examples of the warnings that I observed:

printk: 66 messages suppressed.
BUG: warning at mm/slab.c:3384/kfree()
<c01043d3> show_trace+0x13/0x20 <c01043fe> dump_stack+0x1e/0x20
<c015e334> kfree+0xa4/0xc0 <c032584b> make_request+0x36b/0x670
<c0210305> generic_make_request+0x175/0x240 <c02107d7> submit_bio+0x57/0x100
<c01648f6> submit_bh+0x106/0x160 <c0165a62> __block_write_full_page+0x222/0x3f0
<c0165d28> block_write_full_page+0xf8/0x100 <c016a4b1> blkdev_writepage+0x21/0x30
<c0188c1e> mpage_writepages+0x1ae/0x3d0 <c016a46e> generic_writepages+0x1e/0x20
<c0148f9d> do_writepages+0x2d/0x50 <c0186b70> __writeback_single_inode+0xa0/0x400
<c018716b> sync_sb_inodes+0x1bb/0x2a0 <c01877bf> writeback_inodes+0xaf/0xe5
<c0148d53> wb_kupdate+0x83/0x100 <c0149ab2> pdflush+0x102/0x1c0
<c0131fa4> kthread+0xc4/0xf0 <c0100ed5> kernel_thread_helper+0x5/0x10
printk: 157 messages suppressed.
BUG: warning at mm/slab.c:3384/kfree()
<c01043d3> show_trace+0x13/0x20 <c01043fe> dump_stack+0x1e/0x20
<c015e334> kfree+0xa4/0xc0 <c0140405> audit_syscall_exit+0x405/0x430
<c0106a56> do_syscall_trace+0x1d6/0x245 <c010320a> syscall_exit_work+0x16/0x1c


Daniel

Index: linux-2.6.16/mm/slab.c
===================================================================
--- linux-2.6.16.orig/mm/slab.c
+++ linux-2.6.16/mm/slab.c
@@ -3380,8 +3380,10 @@ void kfree(const void *objp)
struct kmem_cache *c;
unsigned long flags;

- if (unlikely(!objp))
+ if (unlikely(!objp)){
+ WARN_ON(printk_ratelimit());
return;
+ }
local_irq_save(flags);
kfree_debugcheck(objp);
c = virt_to_cache(objp);
James Morris
2006-04-21 07:22:54 UTC
Permalink
Post by Daniel Walker
I included a patch, not that it's needed. Recently I've been
evaluating likely/unlikely branch prediction. One thing that I found
is that the kfree function is often called with a NULL "objp". In fact,
it's so frequent that the "unlikely" branch predictor should be inverted!
Or at least on my configuration.
It would be helpful to collect some stats on this so we can look at the
ratio.


- James
--
James Morris
<***@namei.org>
Andrew Morton
2006-04-21 08:54:12 UTC
Permalink
Post by James Morris
Post by Daniel Walker
I included a patch, not that it's needed. Recently I've been
evaluating likely/unlikely branch prediction. One thing that I found
is that the kfree function is often called with a NULL "objp". In fact,
it's so frequent that the "unlikely" branch predictor should be inverted!
Or at least on my configuration.
It would be helpful to collect some stats on this so we can look at the
ratio.
Yes, kfree(NULL) is supposed to be uncommon. If someone's doing it a lot
then we should fix up the callers.
Vernon Mauery
2006-04-21 13:56:39 UTC
Permalink
Post by Andrew Morton
Post by James Morris
Post by Daniel Walker
I included a patch, not that it's needed. Recently I've been
evaluating likely/unlikely branch prediction. One thing that I found
is that the kfree function is often called with a NULL "objp". In fact,
it's so frequent that the "unlikely" branch predictor should be
inverted! Or at least on my configuration.
It would be helpful to collect some stats on this so we can look at the
ratio.
Yes, kfree(NULL) is supposed to be uncommon. If someone's doing it a lot
then we should fix up the callers.
Part of the reason it gets done a lot is that some developers on this list
have told others NOT to check for NULL before calling kfree, because kfree
does that check internally. I was told that; I can't remember who told me, though.

Maybe kfree should really be a wrapper around __kfree, which does the real
work. Then kfree could be an inline function or a #define that does the NULL
pointer check. Something like the patch below. This has the advantages of
both worlds. If you know your pointer is not NULL, call __kfree. If you are not
sure and would check anyway, the inline version of kfree checks for you and
then calls __kfree.

Signed-off-by: Vernon Mauery <***@us.ibm.com>

--- a/include/linux/slab.h 2006-03-19 21:53:29.000000000 -0800
+++ b/include/linux/slab.h 2006-04-21 07:47:32.000000000 -0700
@@ -123,7 +123,14 @@ static inline void *kcalloc(size_t n, si
return kzalloc(n * size, flags);
}

-extern void kfree(const void *);
+extern void __kfree(const void *);
+static inline void kfree(const void *obj)
+{
+ if (!obj)
+ return;
+ __kfree(obj);
+}
+
extern unsigned int ksize(const void *);

#ifdef CONFIG_NUMA
--- a/mm/slab.c 2006-04-21 07:49:42.000000000 -0700
+++ b/mm/slab.c 2006-04-21 07:49:56.000000000 -0700
@@ -3275,13 +3275,11 @@ EXPORT_SYMBOL(kmem_cache_free);
* Don't free memory not originally allocated by kmalloc()
* or you will run into trouble.
*/
-void kfree(const void *objp)
+void __kfree(const void *objp)
{
struct kmem_cache *c;
unsigned long flags;

- if (unlikely(!objp))
- return;
local_irq_save(flags);
kfree_debugcheck(objp);
c = virt_to_cache(objp);
@@ -3289,7 +3287,7 @@ void kfree(const void *objp)
__cache_free(c, (void *)objp);
local_irq_restore(flags);
}
-EXPORT_SYMBOL(kfree);
+EXPORT_SYMBOL(__kfree);

#ifdef CONFIG_SMP
/**
Dmitry Fedorov
2006-04-21 14:07:17 UTC
Permalink
Post by Vernon Mauery
Maybe kfree should really be a wrapper around __kfree, which does the real
work. Then kfree could be an inline function or a #define that does the NULL
pointer check.
The NULL pointer check in kfree saves a lot of small code fragments in callers.
It is one of the reasons why kfree does it.
Making kfree an inline wrapper eliminates this saving.
Jan Engelhardt
2006-04-21 15:07:45 UTC
Permalink
Post by Dmitry Fedorov
Post by Vernon Mauery
Maybe kfree should really be a wrapper around __kfree, which does the real
work. Then kfree could be an inline function or a #define that does the NULL
pointer check.
The NULL pointer check in kfree saves a lot of small code fragments in callers.
It is one of the reasons why kfree does it.
Making kfree an inline wrapper eliminates this saving.
How about

slab.h:
#ifndef CONFIG_OPTIMIZING_FOR_SIZE
static inline void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#else
extern void kfree(const void *);
#endif

slab.c:
#ifdef CONFIG_OPTIMIZING_FOR_SIZE
void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#endif

That way, you get your time saving with -O2 and your space saving with -Os.


Jan Engelhardt
--
Adrian Bunk
2006-04-21 19:22:17 UTC
Permalink
Post by Jan Engelhardt
Post by Dmitry Fedorov
Post by Vernon Mauery
Maybe kfree should really be a wrapper around __kfree, which does the real
work. Then kfree could be an inline function or a #define that does the NULL
pointer check.
The NULL pointer check in kfree saves a lot of small code fragments in callers.
It is one of the reasons why kfree does it.
Making kfree an inline wrapper eliminates this saving.
How about
#ifndef CONFIG_OPTIMIZING_FOR_SIZE
static inline void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#else
extern void kfree(const void *);
#endif
#ifdef CONFIG_OPTIMIZING_FOR_SIZE
void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#endif
That way, you get your time saving with -O2 and your space saving with -Os.
What makes you confident that the static inline version gives a time
saving?
Post by Jan Engelhardt
Jan Engelhardt
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Vernon Mauery
2006-04-21 20:30:30 UTC
Permalink
Post by Adrian Bunk
Post by Jan Engelhardt
Post by Dmitry Fedorov
Post by Vernon Mauery
Maybe kfree should really be a wrapper around __kfree, which does the
real work. Then kfree could be an inline function or a #define that
does the NULL pointer check.
The NULL pointer check in kfree saves a lot of small code fragments in
callers. It is one of the reasons why kfree does it.
Making kfree an inline wrapper eliminates this saving.
How about
#ifndef CONFIG_OPTIMIZING_FOR_SIZE
static inline void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#else
extern void kfree(const void *);
#endif
#ifdef CONFIG_OPTIMIZING_FOR_SIZE
void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#endif
That way, you get your time saving with -O2 and your space saving with -Os.
What makes you confident that the static inline version gives a time
saving?
A static inline wrapper would mean that it wouldn't have to make a function
call just to check if the pointer is NULL. A simple NULL check is faster
than a function call and then a simple NULL check. In other words, there
would be no pushing and popping the stack. In almost all cases, replacing an
inline function with a non-inline function means a trade-off between speed
and size.

--Vernon
Steven Rostedt
2006-04-21 20:54:13 UTC
Permalink
Post by Vernon Mauery
Post by Adrian Bunk
What makes you confident that the static inline version gives a time
saving?
A static inline wrapper would mean that it wouldn't have to make a function
call just to check if the pointer is NULL. A simple NULL check is faster
than a function call and then a simple NULL check. In other words, there
would be no pushing and popping the stack. In almost all cases, replacing an
inline function with a non-inline function means a trade-off between speed
and size.
Andrew Morton just submitted a patch to -mm that fixes the two problem
places that call kfree(NULL) more than they call kfree(non-NULL).

Besides the places that are now fixed, the inline doesn't save much.
In most cases kfree(non-NULL) is called, so NULL really is the
unlikely case. Thus you would just increase the size of the kernel for
virtually no speed savings.

-- Steve
Adrian Bunk
2006-04-21 21:38:46 UTC
Permalink
Post by Vernon Mauery
Post by Adrian Bunk
Post by Jan Engelhardt
Post by Dmitry Fedorov
Post by Vernon Mauery
Maybe kfree should really be a wrapper around __kfree, which does the
real work. Then kfree could be an inline function or a #define that
does the NULL pointer check.
The NULL pointer check in kfree saves a lot of small code fragments in
callers. It is one of the reasons why kfree does it.
Making kfree an inline wrapper eliminates this saving.
How about
#ifndef CONFIG_OPTIMIZING_FOR_SIZE
static inline void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#else
extern void kfree(const void *);
#endif
#ifdef CONFIG_OPTIMIZING_FOR_SIZE
void kfree(const void *p) {
if(p != NULL)
__kfree(p);
}
#endif
That way, you get your time saving with -O2 and your space saving with -Os.
What makes you confident that the static inline version gives a time
saving?
A static inline wrapper would mean that it wouldn't have to make a function
call just to check if the pointer is NULL. A simple NULL check is faster
than a function call and then a simple NULL check. In other words, there
would be no pushing and popping the stack. In almost all cases, replacing an
inline function with a non-inline function means a trade-off between speed
and size.
It's not that simple - inlines make the code bigger, causing more cache
misses and therefore also having a negative impact on performance.

This is the reason why e.g. the gcc -O3 option usually results in
_worse_ performance than the -O2 option.
Post by Vernon Mauery
--Vernon
cu
Adrian

BTW: Don't strip the Cc when replying to linux-kernel.
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Jörn Engel
2006-04-22 11:56:01 UTC
Permalink
Post by Vernon Mauery
A simple NULL check is faster
than a function call and then a simple NULL check.
I am not sure whether this is still true for any non-ancient CPUs.
The cost of a branch misprediction is in the order of _many_
predictable instructions. That makes conditionals more expensive than
they used to be. And the cost of branch mispredictions keeps
increasing.

Jörn

--
You can take my soul, but not my lack of enthusiasm.
-- Wally
Paul Mackerras
2006-04-21 23:55:08 UTC
Permalink
Post by Andrew Morton
Yes, kfree(NULL) is supposed to be uncommon. If someone's doing it a lot
then we should fix up the callers.
Well, we'd have to start by fixing up the janitors that run around
taking out the if statements in the callers. :)

Paul.
Paul Mackerras
2006-04-22 08:48:26 UTC
Permalink
Post by Pekka Enberg
No, it's not the janitors' fault that we have paths doing lots of
kfree(NULL) calls. NULL check removal didn't create the problem, but
it definitely makes it more visible.
There is a judgement to be made at each call site of kfree (and
similar functions) about whether the argument is rarely NULL, or could
often be NULL. If the janitors have been making this judgement, I
apologise, but I haven't seen them doing that.

Paul.
Pekka Enberg
2006-04-22 15:02:29 UTC
Permalink
Post by Paul Mackerras
There is a judgement to be made at each call site of kfree (and
similar functions) about whether the argument is rarely NULL, or could
often be NULL. If the janitors have been making this judgement, I
apologise, but I haven't seen them doing that.
I don't think anyone should be calling kfree with a NULL pointer often in
the first place. Keeping the extra check in place is masking the real
problem. So yeah, we should be looking at the NULL checks more carefully
to see if they require more fundamental fixes, but no, I don't see why
janitors can't continue to remove the extra checks.

Pekka
Hua Zhong
2006-04-22 18:57:23 UTC
Permalink
Post by Paul Mackerras
There is a judgement to be made at each call site of kfree
(and similar functions) about whether the argument is rarely
NULL, or could often be NULL. If the janitors have been
making this judgement, I apologise, but I haven't seen them
doing that.
Paul.
Even if the caller passes NULL most of the time, the check should be removed.

It's just crazy talk to say "you should not check NULL before calling kfree, as long as you make sure it's not NULL most of the
time".

Hua
Nick Piggin
2006-04-22 19:05:41 UTC
Permalink
Post by Hua Zhong
Post by Paul Mackerras
There is a judgement to be made at each call site of kfree
(and similar functions) about whether the argument is rarely
NULL, or could often be NULL. If the janitors have been
making this judgement, I apologise, but I haven't seen them
doing that.
Paul.
Even if the caller passes NULL most of the time, the check should be removed.
It's just crazy talk to say "you should not check NULL before calling kfree, as long as you make sure it's not NULL most of the
time".
It can reduce readability of the code [unless it is used in error path
simplification, kfree(something) usually suggests kfree-an-object].

If the caller passes NULL most of the time, it could be in need of
redesign.

I don't actually like kfree(NULL) any time except error paths. It is
subjective, not crazy talk.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
Hua Zhong
2006-04-22 19:22:03 UTC
Permalink
Post by Nick Piggin
It can reduce readability of the code [unless it is used in
error path simplification, kfree(something) usually suggests
kfree-an-object].
Consistency in coding style improves readability. Redundancy reduces readability.

The interface is simple and clear, and has been documented for decades, that is kfree (and free) accepts NULL. There is no ambiguity
here.

If you think "if (obj) kfree (obj);" is more readable than "kfree(obj);", fix the API to enforce it.

But if the kernel tree is full of "some caller checks NULL while others not", I hardly see it as readable. It'd just be confusing.
Post by Nick Piggin
I don't actually like kfree(NULL) any time except error
paths. It is subjective, not crazy talk.
Documented interface is not subjective.
Nick Piggin
2006-04-22 19:25:41 UTC
Permalink
Post by Hua Zhong
Post by Nick Piggin
It can reduce readability of the code [unless it is used in
error path simplification, kfree(something) usually suggests
kfree-an-object].
Consistency in coding style improves readability. Redundancy reduces readability.
The interface is simple and clear, and has been documented for decades, that is kfree (and free) accepts NULL. There is no ambiguity
here.
If you think "if (obj) kfree (obj);" is more readable than "kfree(obj);", fix the API to enforce it.
But if the kernel tree is full of "some caller checks NULL while others not", I hardly see it as readable. It'd just be confusing.
Post by Nick Piggin
I don't actually like kfree(NULL) any time except error
paths. It is subjective, not crazy talk.
Documented interface is not subjective.
That's great. I don't know quite how to reply, or even if I should
if you don't read what I write.
--
SUSE Labs, Novell Inc.
Jan Engelhardt
2006-04-22 20:18:36 UTC
Permalink
Post by Nick Piggin
Post by Hua Zhong
Post by Nick Piggin
I don't actually like kfree(NULL) any time except error paths. It is
subjective, not crazy talk.
Documented interface is not subjective.
That's great. I don't know quite how to reply, or even if I should
if you don't read what I write.
Where's the problem? If a developer does not know whether an object is NULL
or not, he may call kfree(). If, on the other hand, he is sure it is NULL,
there is no need to free it at all, and if he is sure it is non-NULL, he can
directly call __kfree(). So the readability can never suffer - you never
need an if(x==NULL) check anymore.


Jan Engelhardt
--
Steven Rostedt
2006-04-23 16:50:23 UTC
Permalink
Post by Nick Piggin
Post by Hua Zhong
Post by Paul Mackerras
There is a judgement to be made at each call site of kfree
(and similar functions) about whether the argument is rarely
NULL, or could often be NULL. If the janitors have been
making this judgement, I apologise, but I haven't seen them
doing that.
Paul.
Even if the caller passes NULL most of the time, the check should be removed.
It's just crazy talk to say "you should not check NULL before calling kfree, as long as you make sure it's not NULL most of the
time".
It can reduce readability of the code [unless it is used in error path
simplification, kfree(something) usually suggests kfree-an-object].
If the caller passes NULL most of the time, it could be in need of
redesign.
I don't actually like kfree(NULL) any time except error paths. It is
subjective, not crazy talk.
I wrote a little hack that detects up to 1000 callers of kfree(NULL) and
outputs what it finds via sysrq-l.

http://marc.theaimsgroup.com/?l=linux-kernel&m=114564257500757&w=2

It found right away two functions in transaction.c, from the jbd code,
that were freeing an object that only sometimes gets allocated. Andrew
Morton has already submitted a patch to the -mm tree to fix it:

- kfree(new_transaction);
+ if (unlikely(new_transaction)) /* It's usually NULL */
+ kfree(new_transaction);
return ret;
}

@@ -724,7 +725,8 @@ done:
journal_cancel_revoke(handle, jh);

out:
- kfree(frozen_buffer);
+ if (unlikely(frozen_buffer)) /* It's usually NULL */
+ kfree(frozen_buffer);

Where he uses unlikely and nicely documents that it is usually NULL (of
course the "unlikely" sort of says that already ;)

I've been running this patched kernel for a couple of days on a mostly
idle machine (I don't need it right now, so I just let it run), and it
has shown some more problem areas, which probably occurred when updatedb
kicked off.

Here's the dump:

SysRq : Show stats on kfree
Total number of NULL frees: 1589709
Total number of non NULL frees: 69448
Callers of NULL frees:
[ 27] c0154bcd - do_tune_cpucache+0x13d/0x230
[ 631] c025b9dd - class_device_add+0xcd/0x300
[ 30] c019523c - sysfs_d_iput+0x3c/0x8e
[ 44] c0193750 - sysfs_hash_and_remove+0xd0/0x110
[ 1] c01f4787 - kobject_cleanup+0x37/0x90
[ 1] c025bf73 - class_dev_release+0x23/0x90
[ 14] c021b615 - tty_write+0x105/0x220
[ 20] c025b5ff - class_device_del+0x16f/0x190
[ 6] c021cd34 - release_mem+0x174/0x2a0
[ 79] c011e804 - do_sysctl+0x94/0x250
[ 352161] c01aafc4 - start_this_handle+0x234/0x4b0
[ 430089] c01aba66 - do_get_write_access+0x2e6/0x5a0
[ 16730] c01abdf0 - journal_get_undo_access+0xd0/0x120
[ 788641] c01a3c9f - ext3_clear_inode+0x2f/0x40
[ 3] c0194a0c - sysfs_dir_close+0x6c/0x90
[ 252] c0304e1d - inet_sock_destruct+0xad/0x1f0
[ 1] c030a698 - ip_rt_ioctl+0xe8/0x130
[ 968] c02e2669 - ip_push_pending_frames+0x2d9/0x400
[ 6] c02d69b0 - netlink_release+0x1c0/0x300
[ 5] c02ba79b - sock_fasync+0x13b/0x150

start_this_handle and do_get_write_access have already been fixed, but
now it's looking like journal_get_undo_access and ext3_clear_inode are
problem children too.

-- Steve

Pekka Enberg
2006-04-22 07:43:22 UTC
Permalink
Post by Paul Mackerras
Post by Andrew Morton
Yes, kfree(NULL) is supposed to be uncommon. If someone's doing it a lot
then we should fix up the callers.
Well, we'd have to start by fixing up the janitors that run around
taking out the if statements in the callers. :)
No, it's not the janitors' fault that we have paths doing lots of
kfree(NULL) calls. NULL check removal didn't create the problem, but
it definitely makes it more visible.

Pekka
Jesper Juhl
2006-04-22 11:34:04 UTC
Permalink
Post by Paul Mackerras
Post by Andrew Morton
Yes, kfree(NULL) is supposed to be uncommon. If someone's doing it a lot
then we should fix up the callers.
Well, we'd have to start by fixing up the janitors that run around
taking out the if statements in the callers. :)
I think there was pretty good agreement, when we started doing that,
that taking out the if statements in the callers was a good idea.
If it turns out to have been a net loss that's not good, but I don't
think it's been a wasted effort - there were a *lot* of places that
checked for NULL before calling [kv]free, and now that we've gotten
rid of them we can consider adding them back where it makes sense, not
just all over the place.
We could also consider changing the
if (unlikely(!obj))
return;
in kfree into simply
if (!obj)
return;

--
Jesper Juhl <***@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Daniel Walker
2006-04-21 14:06:58 UTC
Permalink
Post by James Morris
Post by Daniel Walker
I included a patch, not that it's needed. Recently I've been
evaluating likely/unlikely branch prediction. One thing that I found
is that the kfree function is often called with a NULL "objp". In fact,
it's so frequent that the "unlikely" branch predictor should be inverted!
Or at least on my configuration.
It would be helpful to collect some stats on this so we can look at the
ratio.
On my system it was roughly 31 million kfree(NULL) calls, to 4 million
calls with other values. That was over 4 hours of run time.

Daniel
Tilman Schmidt
2006-04-21 15:25:49 UTC
Permalink
Post by Andrew Morton
Post by James Morris
Post by Daniel Walker
I included a patch, not that it's needed. Recently I've been
evaluating likely/unlikely branch prediction. One thing that I found
is that the kfree function is often called with a NULL "objp". In fact,
it's so frequent that the "unlikely" branch predictor should be inverted!
Or at least on my configuration.
It would be helpful to collect some stats on this so we can look at the
ratio.
Yes, kfree(NULL) is supposed to be uncommon.
Not anymore, after the recent campaign to eliminate explicit NULL
checks before calls to kfree().
Post by Andrew Morton
If someone's doing it a lot then we should fix up the callers.
If that fixup amounts to re-adding the NULL check just eliminated,
then that's no improvement. It would be better to drop the assumption
that kfree() calls with a NULL argument are uncommon, and consequently
remove the unlikely() predictor. Adding likely() instead may or may
not be a good idea.
--
Tilman Schmidt E-Mail: ***@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse)
Daniel Walker
2006-04-21 16:03:23 UTC
Permalink
Post by Tilman Schmidt
Post by Andrew Morton
Post by James Morris
Post by Daniel Walker
I included a patch, not that it's needed. Recently I've been
evaluating likely/unlikely branch prediction. One thing that I found
is that the kfree function is often called with a NULL "objp". In fact,
it's so frequent that the "unlikely" branch predictor should be inverted!
Or at least on my configuration.
It would be helpful to collect some stats on this so we can look at the
ratio.
Yes, kfree(NULL) is supposed to be uncommon.
Not anymore, after the recent campaign to eliminate explicit NULL
checks before calls to kfree().
Post by Andrew Morton
If someone's doing it a lot then we should fix up the callers.
If that fixup amounts to re-adding the NULL check just eliminated,
then that's no improvement. It would be better to drop the assumption
that kfree() calls with a NULL argument are uncommon, and consequently
remove the unlikely() predictor. Adding likely() instead may or may
not be a good idea.
After reviewing some of the callers of kfree(NULL), they appear to be
errors by the callers, where some false assumptions are being made
during looping or repeated calls to the same function.

I agree with Andrew; I think the calls should be investigated while
retaining the unlikely() predictor.

Daniel
Jörn Engel
2006-04-21 17:48:08 UTC
Permalink
Post by Daniel Walker
After reviewing some of the callers of kfree(NULL), they appear to be
errors by the callers, where some false assumptions are being made
during looping or repeated calls to the same function.
I agree with Andrew; I think the calls should be investigated while
retaining the unlikely() predictor.
Those calls that frequently call kfree(NULL). ;)

Jörn

--
Linux [...] existed just for discussion between people who wanted
to show off how geeky they were.
-- Rob Enderle
Steven Rostedt
2006-04-21 18:00:59 UTC
Permalink
Post by Daniel Walker
After reviewing some of the callers of kfree(NULL), they appear to be
errors by the callers, where some false assumptions are being made
during looping or repeated calls to the same function.
I agree with Andrew; I think the calls should be investigated while
retaining the unlikely() predictor.
Below is a quick patch to do statistics on kfree. It records up to 1000
locations of those that use kfree(NULL). Yes, I know that adding to and
searching the list is O(n), but it's only for statistics, and it only
happens in the NULL case.

Then it allows you to see what it found through sysrq-l.

Anyway, after booting with this patch I got the following output:

SysRq : Show stats on kfree
Total number of NULL frees: 16129
Total number of non NULL frees: 18119
Callers of NULL frees:
[ 27] c0154bcd - do_tune_cpucache+0x13d/0x230
[ 631] c025b9dd - class_device_add+0xcd/0x300
[ 30] c019523c - sysfs_d_iput+0x3c/0x8e
[ 44] c0193750 - sysfs_hash_and_remove+0xd0/0x110
[ 1] c01f4787 - kobject_cleanup+0x37/0x90
[ 1] c025bf73 - class_dev_release+0x23/0x90
[ 14] c021b615 - tty_write+0x105/0x220
[ 20] c025b5ff - class_device_del+0x16f/0x190
[ 6] c021cd34 - release_mem+0x174/0x2a0
[ 36] c011e804 - do_sysctl+0x94/0x250
[ 6491] c01aafc4 - start_this_handle+0x234/0x4b0
[ 8404] c01aba66 - do_get_write_access+0x2e6/0x5a0
[ 263] c01abdf0 - journal_get_undo_access+0xd0/0x120
[ 28] c01a3c9f - ext3_clear_inode+0x2f/0x40
[ 3] c0194a0c - sysfs_dir_close+0x6c/0x90
[ 59] c0304e1d - inet_sock_destruct+0xad/0x1f0
[ 1] c030a698 - ip_rt_ioctl+0xe8/0x130
[ 64] c02e2669 - ip_push_pending_frames+0x2d9/0x400
[ 6] c02d69b0 - netlink_release+0x1c0/0x300

The numbers in the []'s are the number of times they call kfree with
NULL.

This was right after a boot, but I can play on the system and see what
else is doing a heavy kfree(NULL). Maybe Jan Engelhardt's idea of doing
the inline unless CONFIG_CC_OPTIMIZE_FOR_SIZE is set is the way to go.

Anyway, if you want to play, or make this a better patch, here it is.

-- Steve

Index: linux-2.6.17-rc2/drivers/char/sysrq.c
===================================================================
--- linux-2.6.17-rc2.orig/drivers/char/sysrq.c 2006-04-21 13:13:28.000000000 -0400
+++ linux-2.6.17-rc2/drivers/char/sysrq.c 2006-04-21 13:39:43.000000000 -0400
@@ -271,6 +271,19 @@ static struct sysrq_key_op sysrq_unrt_op
.enable_mask = SYSRQ_ENABLE_RTNICE,
};

+extern void kfree_show_stats(void);
+static void sysrq_handle_kfree_stat(int key, struct pt_regs *pt_regs,
+ struct tty_struct *tty)
+{
+ kfree_show_stats();
+}
+static struct sysrq_key_op sysrq_kfree_op = {
+ .handler = sysrq_handle_kfree_stat,
+ .help_msg = "KfreeStats",
+ .action_msg = "Show stats on kfree",
+ .enable_mask = SYSRQ_ENABLE_RTNICE,
+};
+
/* Key Operations table and lock */
static DEFINE_SPINLOCK(sysrq_key_table_lock);

@@ -301,7 +314,7 @@ static struct sysrq_key_op *sysrq_key_ta
&sysrq_kill_op, /* i */
NULL, /* j */
&sysrq_SAK_op, /* k */
- NULL, /* l */
+ &sysrq_kfree_op, /* l */
&sysrq_showmem_op, /* m */
&sysrq_unrt_op, /* n */
/* This will often be registered as 'Off' at init time */
Index: linux-2.6.17-rc2/mm/slab.c
===================================================================
--- linux-2.6.17-rc2.orig/mm/slab.c 2006-04-21 10:11:41.000000000 -0400
+++ linux-2.6.17-rc2/mm/slab.c 2006-04-21 13:47:51.000000000 -0400
@@ -3366,6 +3366,69 @@ void kmem_cache_free(struct kmem_cache *
}
EXPORT_SYMBOL(kmem_cache_free);

+#define KFREE_STAT_SZ 1000
+struct kfree_struct {
+ void *addr;
+ unsigned long counter;
+} kfree_stats[KFREE_STAT_SZ];
+int nr_kfree_stats;
+unsigned long kfree_nulls;
+atomic_t kfree_non_nulls = ATOMIC_INIT(0);
+spinlock_t kfree_stat_lock = SPIN_LOCK_UNLOCKED;
+
+void kfree_show_stats(void)
+{
+ unsigned long flags;
+ unsigned long i;
+
+ spin_lock_irqsave(&kfree_stat_lock, flags);
+ printk("Total number of NULL frees: %ld\n",
+ kfree_nulls);
+ printk("Total number of non NULL frees: %d\n",
+ atomic_read(&kfree_non_nulls));
+ printk("Callers of NULL frees:\n");
+ for (i=0; i < nr_kfree_stats; i++) {
+ void *addr = kfree_stats[i].addr;
+ unsigned long cnt = kfree_stats[i].counter;
+ printk("[%9ld] %p - ", cnt, addr);
+ print_symbol("%s\n", (unsigned long)addr);
+ }
+ spin_unlock_irqrestore(&kfree_stat_lock, flags);
+}
+
+/*
+ * add_kfree_null - this adds the calling address to
+ * the list of callers that called kfree with NULL.
+ * Yes this is very inefficient, but it _should_ be the
+ * slow path (called when NULL) and is only used for stats.
+ */
+static inline void add_kfree_null(void *addr)
+{
+ unsigned long flags;
+ unsigned long i;
+
+ spin_lock_irqsave(&kfree_stat_lock, flags);
+
+ kfree_nulls++; /* protected by kfree_stat_lock */
+
+ if (nr_kfree_stats == KFREE_STAT_SZ)
+ goto out;
+
+ /* ARGH! linear O(n) search! */
+ for (i=0; i < nr_kfree_stats; i++)
+ if (kfree_stats[i].addr == addr) {
+ kfree_stats[i].counter++;
+ goto out;
+ }
+
+ nr_kfree_stats++;
+ kfree_stats[i].addr = addr;
+ kfree_stats[i].counter = 1;
+
+out:
+ spin_unlock_irqrestore(&kfree_stat_lock, flags);
+}
+
/**
* kfree - free previously allocated memory
* @objp: pointer returned by kmalloc.
@@ -3380,9 +3443,12 @@ void kfree(const void *objp)
struct kmem_cache *c;
unsigned long flags;

- if (unlikely(!objp))
+ if (unlikely(!objp)) {
+ add_kfree_null(__builtin_return_address(0));
return;
+ }
local_irq_save(flags);
+ atomic_inc(&kfree_non_nulls);
kfree_debugcheck(objp);
c = virt_to_cache(objp);
mutex_debug_check_no_locks_freed(objp, obj_size(c));
Daniel Walker
2006-04-21 18:42:56 UTC
Permalink
Post by Steven Rostedt
[ 6491] c01aafc4 - start_this_handle+0x234/0x4b0
[ 8404] c01aba66 - do_get_write_access+0x2e6/0x5a0
IMO, these two are overloaded with goto's; it makes it hard to know
what's going on.

Daniel
Steven Rostedt
2006-04-21 18:56:38 UTC
Permalink
Post by Daniel Walker
Post by Steven Rostedt
[ 6491] c01aafc4 - start_this_handle+0x234/0x4b0
[ 8404] c01aba66 - do_get_write_access+0x2e6/0x5a0
IMO, these two are overloaded with gotos; it makes it hard to know
what's going on.
(I've added Stephen Tweedie to the CC list since these two functions are
his)

Perhaps this patch is something that could be useful. It can tell us where
we need to either fix the logic so that a kfree(NULL) doesn't happen, or
it can tell us where we should have an "if (obj) kfree(obj)", since that is
quicker than doing the function call.

Right now, these two locations seem to be good candidates for putting in
the NULL check before calling kfree.

-- Steve
Andrew Morton
2006-04-21 19:26:52 UTC
Permalink
Post by Steven Rostedt
Total number of NULL frees: 16129
Total number of non NULL frees: 18119
....
[ 6491] c01aafc4 - start_this_handle+0x234/0x4b0
[ 8404] c01aba66 - do_get_write_access+0x2e6/0x5a0
eh. I'll fix those up.
Hua Zhong
2006-04-21 21:02:39 UTC
Permalink
Maybe something like this might be a useful debug option to detect unwise likely()/unlikely() usage?

This is a quick hack (not submitted for inclusion), but it works on my box and detected certain points after boot. Does anyone like this idea?

The reason I had to add printk_debug_likely() is that using printk directly gives a compile error. It seems not to like asmlinkage for some reason... I guess likely/unlikely are too fundamental macros and the dependency gets too messy. Or maybe it could be fixed in a way I've not found.

It increases kernel size by about 10%, but hey, it's a debug option. :-)

Sample print:

possible (un)likely misuse at include/asm/mmu_context.h:32/switch_mm()
<c011b644> printk_debug_likely+0x25/0x31 <c02b3092> schedule+0xb04/0xefa
<c02b457b> __mutex_lock_slowpath+0x35a/0x89e <c016c0fd> real_lookup+0x1f/0xb3
<c016c0fd> real_lookup+0x1f/0xb3 <c016c512> do_lookup+0x50/0xe8
<c016d14b> __link_path_walk+0xba1/0x125f <c016d8a6> link_path_walk+0x9d/0x166
<c014b911> __handle_mm_fault+0x2fd/0x35b <c01ca29e> _atomic_dec_and_lock+0x22/0x2c
<c016d5c2> __link_path_walk+0x1018/0x125f <c016d8a6> link_path_walk+0x9d/0x166
<c014b911> __handle_mm_fault+0x2fd/0x35b <c01ce9fe> strncpy_from_user+0x94/0xb3
<c016dee7> do_path_lookup+0x35c/0x4d2 <c016e385> __user_walk_fd+0x84/0x9b
<c0167ae1> vfs_stat_fd+0x15/0x3c <c014b911> __handle_mm_fault+0x2fd/0x35b
<c0168091> sys_stat64+0xf/0x23 <c0113fea> do_page_fault+0x30d/0x753
<c01ceefd> copy_to_user+0x113/0x11c <c0126ee6> sys_rt_sigprocmask+0xae/0xc1
<c01035cb> sysenter_past_esp+0x54/0x75
possible (un)likely misuse at kernel/sched.c:1638/context_switch()
<c011b644> printk_debug_likely+0x25/0x31 <c02b3025> schedule+0xa97/0xefa
<c011ddcd> do_exit+0x75b/0x765 <c011dec4> sys_exit_group+0x0/0xd
<c01035cb> sysenter_past_esp+0x54/0x75
possible (un)likely misuse at kernel/softlockup.c:59/softlockup_tick()
<c011b644> printk_debug_likely+0x25/0x31 <c0138f78> softlockup_tick+0x88/0xf5
<c0123dcd> update_process_times+0x35/0x57 <c0106133> timer_interrupt+0x3d/0x60
<c013919c> handle_IRQ_event+0x21/0x4a <c01392f7> __do_IRQ+0x132/0x1e7
<c0104d86> do_IRQ+0x9c/0xab <c01037fe> common_interrupt+0x1a/0x20
possible (un)likely misuse at kernel/sched.c:1645/context_switch()
<c011b644> printk_debug_likely+0x25/0x31 <c02b3236> schedule+0xca8/0xefa
<c0130d33> enqueue_hrtimer+0x5b/0x80 <c02b57f5> do_nanosleep+0x3b/0xc0
<c0131092> hrtimer_nanosleep+0x45/0xe6 <c01680b4> sys_lstat64+0xf/0x23
<c013102a> hrtimer_wakeup+0x0/0x18 <c01cf021> copy_from_user+0x11b/0x142
<c0131179> sys_nanosleep+0x46/0x4e <c01035cb> sysenter_past_esp+0x54/0x75

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index f23d3c6..a6df5f7 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -58,9 +58,32 @@ #endif
* build go below this comment. Actual compiler/compiler version
* specific implementations come from the above header files
*/
+#define CONFIG_DEBUG_LIKELY
+#ifdef CONFIG_DEBUG_LIKELY
+extern int printk_debug_likely(const char *fmt, ...);
+extern int debug_likely_count_min_thresh;
+extern int debug_likely_print_max_thresh;
+#define __check_likely(x, v, uv) \
+ ({ static int _ckl_print_nr = 0; \
+ static unsigned int _ckl_s[2]; \
+ int _ckl_r = !!(x); \
+ _ckl_s[_ckl_r]++; \
+ if ((_ckl_s[v] == _ckl_s[uv]) && (_ckl_s[v] > debug_likely_count_min_thresh) \
+ && (_ckl_print_nr < debug_likely_print_max_thresh)) { \
+ _ckl_print_nr++; \
+ printk_debug_likely("possible (un)likely misuse at %s:%d/%s()\n", \
+ __FILE__, __LINE__, __FUNCTION__); \
+ } \
+ _ckl_r; })
+#else
+#define __check_likely(x, v, uv) __builtin_expect(!!(x), v)
+#endif
+
+#define likely(x) __check_likely(x, 1, 0)
+#define unlikely(x) __check_likely(x, 0, 1)

-#define likely(x) __builtin_expect(!!(x), 1)
-#define unlikely(x) __builtin_expect(!!(x), 0)
+//#define likely(x) __builtin_expect(!!(x), 1)
+//#define unlikely(x) __builtin_expect(!!(x), 0)

/* Optimization barrier */
#ifndef barrier
diff --git a/kernel/printk.c b/kernel/printk.c
index c056f33..cc09dca 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -602,6 +602,30 @@ out:
EXPORT_SYMBOL(printk);
EXPORT_SYMBOL(vprintk);

+#ifdef CONFIG_DEBUG_LIKELY
+int debug_likely_count_min_thresh = 100;
+int debug_likely_print_max_thresh = 1;
+int printk_debug_likely(const char *fmt, ...)
+{
+ int r = 0;
+ static atomic_t recurse = ATOMIC_INIT(1); /* as a mutex */
+ if (atomic_dec_and_test(&recurse)) {
+ va_list args;
+
+ va_start(args, fmt);
+ r = vprintk(fmt, args);
+ va_end(args);
+ dump_stack();
+ }
+ atomic_inc(&recurse);
+
+ return r;
+}
+EXPORT_SYMBOL(debug_likely_count_min_thresh);
+EXPORT_SYMBOL(debug_likely_print_max_thresh);
+EXPORT_SYMBOL(printk_debug_likely);
+#endif
+
#else

asmlinkage long sys_syslog(int type, char __user *buf, int len)
Daniel Walker
2006-04-21 21:11:08 UTC
Permalink
I've been working on some code that records all the data for output to a
proc entry. It's pretty lightweight... I'll send it once it's
finished.

Daniel
Post by Hua Zhong
Maybe something like this might be a useful debug option to detect unwise likely()/unlikely() usage?
[quoted message and patch snipped]
Michael Buesch
2006-04-21 21:36:19 UTC
Permalink
Post by Daniel Walker
I've been working on some code that records all the data for output to a
proc entry. It's pretty lightweight... I'll send it once it's
finished.
What about debugfs instead of proc? It has a much cleaner and easier-to-use
interface. And debugging things like this are exactly what
debugfs is for.
--
Greetings Michael.
Andrew Morton
2006-04-21 21:42:33 UTC
Permalink
Post by Hua Zhong
+#define __check_likely(x, v, uv) \
those single-char identifiers are not nice.
Post by Hua Zhong
+ ({ static int _ckl_print_nr = 0; \
+ static unsigned int _ckl_s[2]; \
+ int _ckl_r = !!(x); \
+ _ckl_s[_ckl_r]++; \
+ if ((_ckl_s[v] == _ckl_s[uv]) && (_ckl_s[v] > debug_likely_count_min_thresh) \
+ && (_ckl_print_nr < debug_likely_print_max_thresh)) { \
+ _ckl_print_nr++; \
+ printk_debug_likely("possible (un)likely misuse at %s:%d/%s()\n", \
+ __FILE__, __LINE__, __FUNCTION__); \
+ } \
+ _ckl_r; })
Kinda. It would be better to put all the counters into a linked list:

struct likeliness {
char *file;
int line;
atomic_t true_count;
atomic_t false_count;
struct likeliness *next;
};

then

#define __check_likely(expr, value)				\
({								\
	static struct likeliness likeliness;			\
	do_check_likely(__FILE__, __LINE__, &likeliness, !!(expr)); \
})

...

static struct likeliness *likeliness_list;

void do_check_likely(char *file, int line, struct likeliness *likeliness,
int value)
{
static DEFINE_SPINLOCK(lock);
unsigned long flags;

if (!likeliness->next) {
if (!spin_trylock_irqsave(&lock, flags))
return; /* Can be called from NMI */
if (!likeliness->next) {
likeliness->next = likeliness_list;
likeliness_list = likeliness;
likeliness->file = file;
likeliness->line = line;
}
spin_unlock_irqrestore(&lock, flags);
}

if (value)
atomic_inc(&likeliness->true_count);
else
atomic_inc(&likeliness->false_count);
}
EXPORT_SYMBOL(do_check_likely);

then write a thingy to dump the linked list (sysrq&printk if it's a hack,
/proc handler if not).

It'll crash if a module gets registered then unloaded.
CONFIG_MODULE_UNLOAD=n would be required.
Andrew Morton
2006-04-21 21:48:02 UTC
Permalink
Post by Andrew Morton
#define __check_likely(expr, value)				\
({								\
	static struct likeliness likeliness;			\
	do_check_likely(__FILE__, __LINE__, &likeliness, !!(expr)); \
})
#define __check_likely(expr, value)				\
({								\
	static struct likeliness likeliness = {			\
		.file = __FILE__,				\
		.line = __LINE__,				\
	};							\
	do_check_likely(&likeliness, !!(expr));			\
})
Hua Zhong
2006-04-21 22:53:24 UTC
Permalink
Post by Andrew Morton
struct likeliness {
char *file;
int line;
atomic_t true_count;
atomic_t false_count;
struct likeliness *next;
};
It seems including atomic.h inside compiler.h is a bit tricky (there might be an interdependency).

Can we just live with int instead of atomic_t? We don't really care about losing a count occasionally.
Post by Andrew Morton
It'll crash if a module gets registered then unloaded.
CONFIG_MODULE_UNLOAD=n would be required.
What about init data that is thrown away after boot?

Hua
Daniel Walker
2006-04-21 22:58:38 UTC
Permalink
Post by Hua Zhong
Post by Andrew Morton
struct likeliness {
char *file;
int line;
atomic_t true_count;
atomic_t false_count;
struct likeliness *next;
};
It seems including atomic.h inside compiler.h is a bit tricky (might have interdependency).
Can we just live with int instead of atomic_t? We don't really care about losing a count occasionally..
It's nice so you don't have to fool around with locking. The atomic_t
structure is pretty simple though; I think it boils down to just an
int anyway.

Daniel
Hua Zhong
2006-04-21 23:03:57 UTC
Permalink
Post by Daniel Walker
Post by Hua Zhong
It seems including atomic.h inside compiler.h is a bit
tricky (might have interdependency).
Can we just live with int instead of atomic_t? We don't
really care about losing a count occasionally..
It's nice so you don't have to fool around with locking.
The atomic_t structure is pretty simple though; I think it
boils down to just an int anyway.
We could move atomic_t into a separate atomic_type.h? I just want to make sure before I mess with the file structure.

Hua
Andrew Morton
2006-04-21 23:25:01 UTC
Permalink
Post by Hua Zhong
Post by Andrew Morton
struct likeliness {
char *file;
int line;
atomic_t true_count;
atomic_t false_count;
struct likeliness *next;
};
It seems including atomic.h inside compiler.h is a bit tricky (might have interdependency).
Can we just live with int instead of atomic_t? We don't really care about losing a count occasionally..
Sure, int or long would work OK.
Post by Hua Zhong
Post by Andrew Morton
It'll crash if a module gets registered then unloaded.
CONFIG_MODULE_UNLOAD=n would be required.
What about init data that is thrown away after boot?
Good point. Putting #ifndef CONFIG_foo around init/main.c:init()'s call to
free_initmem would fix that.
Andrew Morton
2006-04-21 23:27:20 UTC
Permalink
btw, I don't think a patch like this would be appropriate to mainline -
it's the sort of thing which just one or two people would use once or twice
per year.

So I'd be inclined to host it permanently in -mm, along with the other 37
mm-only debugging patches.

Consequently it doesn't have to be an especially gorgeous piece of code ;)
Jan Engelhardt
2006-04-22 11:18:58 UTC
Permalink
Post by Andrew Morton
Kinda. It would be better to put all the counters into a linked list
I'd prefer a binary tree which sorts on the caller address and maps these
addrs to a struct someinfo.
Post by Andrew Morton
struct likeliness {
char *file;
int line;
atomic_t true_count;
atomic_t false_count;
struct likeliness *next;
};
Oh, and if it is going to stay a linked list, why not use struct list_head?

Since we are at it, non-NULL counting could also be done, which could give
an overview of who unnecessarily calls kfree too often :>


Jan Engelhardt
--
l***@horizon.com
2006-04-22 12:05:51 UTC
Permalink
Post by Paul Mackerras
Well, we'd have to start by fixing up the janitors that run around
taking out the if statements in the callers. :)
You just need to document those two as special. Probably the
simplest is to tell the programmer and the compiler in one
fell swoop:

if (unlikely(p))
kfree(p);

Or that could be wrapped up in a macro:

#define kfree_likely_null(p) do { if (unlikely(p)) kfree(p); } while (0)

Or just mention it to the programmer. A few possible one-line comments:

/* Testing before calling is faster if often NULL, as here. */
/* It's worth the (redundant) test for NULL if it often succeeds */
/* This test saves the call often enough to be worth it. */
/* Test for NULL not necessary, but worth it here */
/* Don't delete NULL test; speed trumps code size here */
/* Very often NULL, so avoid call overhead if possible */
/* kfree(NULL) is legal, but probabilities favor testing here */