Discussion:
[PATCH] checkpatch: Check for quoted strings broken across lines
Josh Triplett
2012-02-02 20:06:21 UTC
Permalink
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.

Test case:

void context(void)
{
pr_err("this string"
" should produce a warning\n");
pr_err("this multi-line string\n"
"should not produce a warning\n");
asm ("this asm\n\t"
"should not produce a warning");
}

Results of checkpatch on that test case:

WARNING: quoted string split across lines
+ " should produce a warning\n");

total: 0 errors, 1 warnings, 9 lines checked

Signed-off-by: Josh Triplett <***@joshtriplett.org>
---
scripts/checkpatch.pl | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e3bfcbe..ce4d740 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1737,6 +1737,17 @@ sub process {
"line over 80 characters\n" . $herecurr);
}

+# Check for strings broken across lines (breaks greppability). Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
+ if ($line =~ /^\+\s*"/ &&
+ $prevline =~ /"\s*$/ &&
+ $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+ WARN("SPLIT_STRING",
+ "quoted string split across lines\n" . $herecurr);
+ }
+
# check for spaces before a quoted newline
if ($rawline =~ /^.*\".*\s\\n/) {
WARN("QUOTED_WHITESPACE_BEFORE_NEWLINE",
--
1.7.9
Nick Bowler
2012-02-02 20:22:07 UTC
Permalink
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[...]
Post by Josh Triplett
+# Check for strings broken across lines (breaks greppability). Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
There are tons of strings in the kernel that this makes checkpatch warn
about where it probably shouldn't. For example, this one (from
kernel/auditsc.c:1476):

audit_log_format(ab,
"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
"mq_msgsize=%ld mq_curmsgs=%ld",

WARNING: quoted string split across lines
#1478: FILE: auditsc.c:1478:
+ "mq_msgsize=%ld mq_curmsgs=%ld",

Breaking "greppability" of this string is a non-issue, because this sort
of string is not really greppable to begin with (and would certainly not
be any easier to grep for if it were all on one line).

Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
Josh Triplett
2012-02-02 21:16:38 UTC
Permalink
Post by Nick Bowler
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[...]
Post by Josh Triplett
+# Check for strings broken across lines (breaks greppability). Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
There are tons of strings in the kernel that this makes checkpatch warn
about where it probably shouldn't. For example, this one (from
audit_log_format(ab,
"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
"mq_msgsize=%ld mq_curmsgs=%ld",
WARNING: quoted string split across lines
+ "mq_msgsize=%ld mq_curmsgs=%ld",
Breaking "greppability" of this string is a non-issue, because this sort
of string is not really greppable to begin with (and would certainly not
be any easier to grep for if it were all on one line).
While I agree that in that particular case (heavy on the %formats and
light on the text) you can't easily grep to begin with, the guideline
from CodingStyle still applies, as does the standard guideline about
checkpatch (namely "don't go globally fixing everything it says, but fix
it in new or changed code").

I certainly don't think joining those lines would *hurt*. Making that
change blindly across the entire kernel doesn't seem worth it, but
changing it on new code seems like a good idea.

In theory checkpatch could try to heuristically ignore cases where the
split in the string occurs immediately before or after a %format, but I
don't fancy writing a regex to match valid printf format specifiers. :)

I still think this patch provides a net win, and I don't think the
example you mentioned represents a false positive.

- Josh Triplett
Nick Bowler
2012-02-02 22:08:53 UTC
Permalink
Post by Josh Triplett
Post by Nick Bowler
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[...]
Post by Josh Triplett
+# Check for strings broken across lines (breaks greppability). Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
There are tons of strings in the kernel that this makes checkpatch warn
about where it probably shouldn't. For example, this one (from
audit_log_format(ab,
"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
"mq_msgsize=%ld mq_curmsgs=%ld",
WARNING: quoted string split across lines
+ "mq_msgsize=%ld mq_curmsgs=%ld",
Breaking "greppability" of this string is a non-issue, because this sort
of string is not really greppable to begin with (and would certainly not
be any easier to grep for if it were all on one line).
While I agree that in that particular case (heavy on the %formats and
light on the text) you can't easily grep to begin with, the guideline
from CodingStyle still applies,
The guideline from CodingStyle talks about "user-visible strings". I
don't think this string counts, because it contains code that is not
displayed to the user (hence, the string is not user-visible). That is
the reason why it's not greppable in the first place.

There are hundreds, if not thousands of similar strings in the kernel.
Post by Josh Triplett
as does the standard guideline about checkpatch (namely "don't go
globally fixing everything it says, but fix it in new or changed
code").
I certainly don't think joining those lines would *hurt*.
I disagree. Joining those lines will push the conversion specifiers way
off to the right (possibly off the screen), and these specifiers are
vital to understanding the rest of the function parameters. In other
words, the specifiers should be treated the same as actual *code*.
Post by Josh Triplett
Making that change blindly across the entire kernel doesn't seem worth
it, but changing it on new code seems like a good idea.
For reasons that I've already outlined, I don't think it's a good idea
to change it in new code, either.
Post by Josh Triplett
In theory checkpatch could try to heuristically ignore cases where the
split in the string occurs immediately before or after a %format, but I
don't fancy writing a regex to match valid printf format specifiers. :)
It doesn't hurt to be conservative, erring on the side of not warning
(the status quo) rather than warning. Regardless, writing a regex for
printf conversion specifiers should be straightforward, because the
format is so rigid.

Here's a perl regex which should match all the valid possibilities
according to ISO C99 (untested!). Catching the Linux extensions is
left as an exercise to the reader.

%[-+ #0]*(hh|ll|[ljztL])?[*[:digit:]]*(\.[*[:digit:]]*)?[diouxXfFeEgGaAcspn%]
Post by Josh Triplett
I still think this patch provides a net win, and I don't think the
example you mentioned represents a false positive.
Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
Josh Triplett
2012-02-03 01:31:08 UTC
Permalink
Post by Nick Bowler
Post by Josh Triplett
Post by Nick Bowler
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[...]
Post by Josh Triplett
+# Check for strings broken across lines (breaks greppability). Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
There are tons of strings in the kernel that this makes checkpatch warn
about where it probably shouldn't. For example, this one (from
audit_log_format(ab,
"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
"mq_msgsize=%ld mq_curmsgs=%ld",
WARNING: quoted string split across lines
+ "mq_msgsize=%ld mq_curmsgs=%ld",
Breaking "greppability" of this string is a non-issue, because this sort
of string is not really greppable to begin with (and would certainly not
be any easier to grep for if it were all on one line).
While I agree that in that particular case (heavy on the %formats and
light on the text) you can't easily grep to begin with, the guideline
from CodingStyle still applies,
The guideline from CodingStyle talks about "user-visible strings". I
don't think this string counts, because it contains code that is not
displayed to the user (hence, the string is not user-visible). That is
the reason why it's not greppable in the first place.
There are hundreds, if not thousands of similar strings in the kernel.
OK, with a combination of arcane grep/sed commands and hand checking, I
just scanned the *entire* kernel for quoted strings split across lines,
and filtered out all user-visible strings, leaving only the split
non-user-visible strings.

An excerpt from that pile of ugly, suggesting just how many ways kernel
code has to spell "user-visible":

\(print[a-z_]*\|pr_[a-z]*\|pr_[a-z]*_ratelimited\|pr_debug[0-9]\|pr_info_once\|dev_[a-z]*\|msg\|warn\|warn_once\|dbg\|debug\|debugp\|err\|error\|errdev\|errx\|warning\|DBG[A-Z0-9_]*\|DEBUG[A-Z_]*\|trace[a-z0-9_]*\|LOG_KMS\|exception\|FAIL\|fatal\|assert\|panic_vm\|trace\|panic\|puts\|MODULE_[A-Z_]*\|asm\|__asm__\|volatile\|__volatile__\|message\|log\|info\|DAC960_[A-Za-z_]*\|notify\|ufsd\|crit\|debugf[0-9]\|DP\|die\|fyi\|notice\|dout\|snoop\|throtl_log_tg\|ERROR_INODE\|RFALSE\|DMESGE\|SAY\|JOM\|SAM\|JOT\|DBF_EVENT\|DE_ACT\|s390_handle_damage\|CIO_MSG_EVENT\|CIO_CRW_EVENT\|DBF_EVENT_DEVID\|fat_fs_error_ratelimit\|mark_tsc_unstable\|ata_ehi_push_desc\|DEB[0-9]\|IUCV_DBF_TEXT_\|LOG_PARSE\|esp_log_[a-z]*\|check_warn_return\|D_ISR\|D_CALIB\|D_RATE\|D_TX_REPLY\|D_TXPOWER\|D_EEPROM\|D_POWER\|D_SC
AN\|D_ASSOC\|D_HC\|D_HC_DUMP\|IP_VS_ERR_BUF\|DMWARN_LIMIT\|ide_dump_status\|ppfinit\|mca_recovered\|die_if_kernel\|die_if_no_fixup\|gdbstub_proto\|ite_pr\|nvt_pr\|deb_i2c\|deb_irq\|deb_ts\|deb_rc\|mxl_i2c\|MLX4_EN_PARM_INT\|DI\|d_fnstart\|d_fnend\|lbs_deb_[a-z_]*\|LPFC_[A-Z_]*\|stop\|mconsole_reply_v0\|CH_ALERT\|D1\|ADD\|lbtf_deb_usbd\|IWL_DEBUG_MAC80211\|WL_CONN\|CX18_INFO_DEV\|CX18_ERR_DEV\|OPT_BOOLEAN\|asc_prt_line\|gdbstub_msg_write\|DCCP_BUG\|fappend\)

I found that a single change to my checkpatch test (only checking
strings passed as function arguments) limits the false positives in the
entire kernel to a grand total of 12:

arch/powerpc/platforms/iseries/mf.c: memcpy(src, "\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00"
arch/um/os-Linux/process.c: if (sscanf(buf, "%*d " COMM_SCANF " %*c %*d %*d %*d %*d %*d %*d %*d "
arch/x86/platform/olpc/olpc_dt.c: olpc_dt_interpret("\" /***@0\" find-device"
arch/x86/platform/olpc/olpc_dt.c: olpc_dt_interpret("\" /pci/***@1\" find-device"
arch/x86/platform/olpc/olpc_dt.c: olpc_dt_interpret("\" /pci/***@1,1\" find-device"
drivers/block/rbd.c: "%" __stringify(RBD_MAX_OBJ_NAME_LEN) "s"
drivers/pcmcia/ds.c: if (add_uevent_var(env, "MODALIAS=pcmcia:m%04Xc%04Xf%02Xfn%02Xpfn%02X"
drivers/tty/tty_audit.c: audit_log_format(ab, "%s pid=%u uid=%u auid=%u ses=%u "
fs/ecryptfs/crypto.c:static unsigned char *portable_filename_chars = ("-.0123456789ABCD"
kernel/auditsc.c: audit_log_format(ab, " inode=%lu"
kernel/auditsc.c: audit_log_format(ab, "login pid=%d uid=%u "
security/selinux/ss/services.c: audit_log_format(ab, "op=security_compute_av reason=%s "

Not "hundreds, if not thousands", but 12, including 4 calls to
audit_log_format. :)

Meanwhile, this checkpatch test correctly flags several thousand
instances of user-visible strings that shouldn't get split. That seems
like a pretty reasonable ratio to me; what do you think?

I'll send PATCHv2, which limits the check to function parameters,
momentarily.

- Josh Triplett
Jesper Juhl
2012-02-02 21:29:45 UTC
Permalink
Post by Nick Bowler
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[...]
Post by Josh Triplett
+# Check for strings broken across lines (breaks greppability). Make an
s/greppability/grepability/ ?
Post by Nick Bowler
Post by Josh Triplett
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
There are tons of strings in the kernel that this makes checkpatch warn
about where it probably shouldn't. For example, this one (from
audit_log_format(ab,
"oflag=0x%x mode=%#ho mq_flags=0x%lx mq_maxmsg=%ld "
"mq_msgsize=%ld mq_curmsgs=%ld",
WARNING: quoted string split across lines
+ "mq_msgsize=%ld mq_curmsgs=%ld",
Breaking "greppability" of this string is a non-issue, because this sort
of string is not really greppable to begin with (and would certainly not
be any easier to grep for if it were all on one line).
Agreed. checkpatch needs to not warn about strings like this.
But generally a good idea to warn about "easily grepable strings broken
into multiple lines" if possible, IMHO...
--
Jesper Juhl <***@chaosbits.net> http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.
Joe Perches
2012-02-02 21:34:16 UTC
Permalink
Post by Jesper Juhl
But generally a good idea to warn about "easily grepable strings broken
into multiple lines" if possible, IMHO...
Maybe count the %'s? Ignore if more than 2?
Josh Triplett
2012-02-02 22:02:22 UTC
Permalink
Post by Joe Perches
Post by Jesper Juhl
But generally a good idea to warn about "easily grepable strings broken
into multiple lines" if possible, IMHO...
Maybe count the %'s? Ignore if more than 2?
Numerous greppable strings include several formats, but you can
generally look at a message and know which non-parameterized bits to
grep for. Even in the case of strings with numerous parameters, it
often works better to grep for the whole string and replace the
parameters with '.*', and make the string incrementally less specific
until I find it. For example, from dmesg on my system:

[ 0.000000] last_pfn = 0xdb000 max_arch_pfn = 0x400000000

Grepping for "last_pfn" or even "last_pfn = " produces a pile of
results, but grepping for 'last_pfn = .* max_arch_pfn' produces exactly
one result:

arch/x86/kernel/e820.c: printk(KERN_INFO "last_pfn = %#lx max_arch_pfn = %#lx\n",

That wouldn't work if the string broke across lines. So, I still think
it seems reasonable for checkpatch to encourage keeping such strings
un-split.

- Josh Triplett
Joe Perches
2012-02-02 20:36:29 UTC
Permalink
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[]
Post by Josh Triplett
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
Post by Josh Triplett
@@ -1737,6 +1737,17 @@ sub process {
[]
Post by Josh Triplett
+# Check for strings broken across lines (breaks greppability). Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
+ if ($line =~ /^\+\s*"/ &&
+ $prevline =~ /"\s*$/ &&
+ $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+ WARN("SPLIT_STRING",
+ "quoted string split across lines\n" . $herecurr);
Seems sensible but there are also asm uses like:

arch/x86/include/asm/pvclock.h: "xor %5,%5 ; "
"add %4,%%eax ; "
Josh Triplett
2012-02-02 21:28:41 UTC
Permalink
Post by Joe Perches
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[]
Post by Josh Triplett
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
Post by Josh Triplett
@@ -1737,6 +1737,17 @@ sub process {
[]
Post by Josh Triplett
+# Check for strings broken across lines (breaks greppability). Make an
+# exception when the previous string ends in a newline (multiple lines in one
+# string constant) or \n\t (common in inline assembly to indent the instruction
+# on the following line).
+ if ($line =~ /^\+\s*"/ &&
+ $prevline =~ /"\s*$/ &&
+ $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+ WARN("SPLIT_STRING",
+ "quoted string split across lines\n" . $herecurr);
arch/x86/include/asm/pvclock.h: "xor %5,%5 ; "
"add %4,%%eax ; "
I did see those, yes. However, a quick grep through the kernel shows
that those occur quite rarely compared to \n or \n\t; the latter looks
like the standard style. How about I provide a patch for
Documentation/CodingStyle adding a chapter on inline assembly, with that
and other style guidelines? :)

- Josh Triplett
Joe Perches
2012-02-02 21:32:07 UTC
Permalink
Post by Josh Triplett
Post by Joe Perches
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[]
Post by Josh Triplett
Post by Joe Perches
arch/x86/include/asm/pvclock.h: "xor %5,%5 ; "
"add %4,%%eax ; "
I did see those, yes. However, a quick grep through the kernel shows
that those occur quite rarely compared to \n or \n\t; the latter looks
like the standard style. How about I provide a patch for
Documentation/CodingStyle adding a chapter on inline assembly, with that
and other style guidelines? :)
Sounds fine to me.

Another option might be to ignore all
strings in any asm block.
Josh Triplett
2012-02-02 22:36:02 UTC
Permalink
Post by Joe Perches
Post by Josh Triplett
Post by Joe Perches
Post by Josh Triplett
Documentation/CodingStyle recommends not splitting quoted strings across
lines, because it breaks the ability to grep for the string. checkpatch
already makes an exception to the 80-column rule for quoted strings to
allow this. Rather than just allowing it, actively warn about quoted
strings split across lines.
[]
Post by Josh Triplett
Post by Joe Perches
arch/x86/include/asm/pvclock.h: "xor %5,%5 ; "
"add %4,%%eax ; "
I did see those, yes. However, a quick grep through the kernel shows
that those occur quite rarely compared to \n or \n\t; the latter looks
like the standard style. How about I provide a patch for
Documentation/CodingStyle adding a chapter on inline assembly, with that
and other style guidelines? :)
Sounds fine to me.
Done, with message-id <***@leaf> and subject
"[PATCH] Documentation/CodingStyle: Add guidelines for inline assembly".

- Josh triplett
Josh Triplett
2012-02-03 05:27:28 UTC
Permalink
checkpatch already makes an exception to the 80-column rule for quoted
strings, and Documentation/CodingStyle recommends not splitting quoted
strings across lines, because it breaks the ability to grep for the
string. Rather than just permitting this, actively warn about quoted
strings split across lines.

Test case:

void context(void)
{
struct { unsigned magic; const char *strdata; } foo[] = {
{ 42, "these strings"
"do not produce warnings" },
{ 256, "though perhaps"
"they should" },
};
pr_err("this string"
" should produce a warning\n");
pr_err("this multi-line string\n"
"should not produce a warning\n");
asm ("this asm\n\t"
"should not produce a warning");
}

Results of checkpatch on that test case:

WARNING: quoted string split across lines
#10: FILE: test.c:10:
+ " should produce a warning\n");

total: 0 errors, 1 warnings, 15 lines checked

Signed-off-by: Josh Triplett <***@joshtriplett.org>
---

v2: Add the "after open parenthesis" heuristic which almost completely
eliminates non-user-visible strings, leaving only 12 in the entire
kernel, versus thousands of correct matches. Change "breaks
greppability" to "breaks the ability to grep for the string".

scripts/checkpatch.pl | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e3bfcbe..b898df4 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1737,6 +1737,21 @@ sub process {
"line over 80 characters\n" . $herecurr);
}

+# Check for user-visible strings broken across lines, which breaks the ability
+# to grep for the string. Limited to strings used as parameters (those
+# following an open parenthesis), which almost completely eliminates false
+# positives, as well as warning only once per parameter rather than once per
+# line of the string. Make an exception when the previous string ends in a
+# newline (multiple lines in one string constant) or \n\t (common in inline
+# assembly to indent the instruction on the following line).
+ if ($line =~ /^\+\s*"/ &&
+ $prevline =~ /"\s*$/ &&
+ $prevline =~ /\(/ &&
+ $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+ WARN("SPLIT_STRING",
+ "quoted string split across lines\n" . $herecurr);
+ }
+
# check for spaces before a quoted newline
if ($rawline =~ /^.*\".*\s\\n/) {
WARN("QUOTED_WHITESPACE_BEFORE_NEWLINE",
--
1.7.9
Joe Perches
2012-02-03 05:38:35 UTC
Permalink
Post by Josh Triplett
checkpatch already makes an exception to the 80-column rule for quoted
strings, and Documentation/CodingStyle recommends not splitting quoted
strings across lines, because it breaks the ability to grep for the
string. Rather than just permitting this, actively warn about quoted
strings split across lines.
[]
Post by Josh Triplett
+ WARN("SPLIT_STRING",
+ "quoted string split across lines\n" . $herecurr);
I think the output would be better as:

WARN("SPLIT_STRING",
"quoted string split across lines\n" . $hereprev);
Josh Triplett
2012-02-03 08:55:16 UTC
Permalink
Post by Josh Triplett
Post by Josh Triplett
checkpatch already makes an exception to the 80-column rule for quoted
strings, and Documentation/CodingStyle recommends not splitting quoted
strings across lines, because it breaks the ability to grep for the
string. Rather than just permitting this, actively warn about quoted
strings split across lines.
[]
Post by Josh Triplett
+ WARN("SPLIT_STRING",
+ "quoted string split across lines\n" . $herecurr);
WARN("SPLIT_STRING",
"quoted string split across lines\n" . $hereprev);
Awesome; I'd thought that it would look better to show the previous
line, but I didn't know about $hereprev. PATCHv3 momentarily.

- Josh Triplett
Josh Triplett
2012-02-03 06:34:36 UTC
Permalink
Post by Josh Triplett
checkpatch already makes an exception to the 80-column rule for quoted
strings, and Documentation/CodingStyle recommends not splitting quoted
strings across lines, because it breaks the ability to grep for the
string. Rather than just permitting this, actively warn about quoted
strings split across lines.
[...]
Post by Josh Triplett
v2: Add the "after open parenthesis" heuristic which almost completely
eliminates non-user-visible strings, leaving only 12 in the entire
kernel, versus thousands of correct matches. Change "breaks
greppability" to "breaks the ability to grep for the string".
Sorry, forgot to change "PATCH" to "PATCHv2" in the subject.

- Josh Triplett
Josh Triplett
2012-02-03 09:30:20 UTC
Permalink
checkpatch already makes an exception to the 80-column rule for quoted
strings, and Documentation/CodingStyle recommends not splitting quoted
strings across lines, because it breaks the ability to grep for the
string. Rather than just permitting this, actively warn about quoted
strings split across lines.

Test case:

void context(void)
{
struct { unsigned magic; const char *strdata; } foo[] = {
{ 42, "these strings"
"do not produce warnings" },
{ 256, "though perhaps"
"they should" },
};
pr_err("this string"
" should produce a warning\n");
pr_err("this multi-line string\n"
"should not produce a warning\n");
asm ("this asm\n\t"
"should not produce a warning");
}

Results of checkpatch on that test case:

WARNING: quoted string split across lines
#10: FILE: test.c:10:
+ pr_err("this string"
+ " should produce a warning\n");

total: 0 errors, 1 warnings, 15 lines checked

Signed-off-by: Josh Triplett <***@joshtriplett.org>
---

v3: Switched from $herecurr to $hereprev to also show the line where the
string begins. Thanks to Joe Perches for the suggestion.

v2: Add the "after open parenthesis" heuristic which almost completely
eliminates non-user-visible strings, leaving only 12 in the entire
kernel, versus thousands of correct matches. Change "breaks
greppability" to "breaks the ability to grep for the string".

scripts/checkpatch.pl | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index e3bfcbe..01a5e4b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1737,6 +1737,21 @@ sub process {
"line over 80 characters\n" . $herecurr);
}

+# Check for user-visible strings broken across lines, which breaks the ability
+# to grep for the string. Limited to strings used as parameters (those
+# following an open parenthesis), which almost completely eliminates false
+# positives, as well as warning only once per parameter rather than once per
+# line of the string. Make an exception when the previous string ends in a
+# newline (multiple lines in one string constant) or \n\t (common in inline
+# assembly to indent the instruction on the following line).
+ if ($line =~ /^\+\s*"/ &&
+ $prevline =~ /"\s*$/ &&
+ $prevline =~ /\(/ &&
+ $prevrawline !~ /\\n(?:\\t)*"\s*$/) {
+ WARN("SPLIT_STRING",
+ "quoted string split across lines\n" . $hereprev);
+ }
+
# check for spaces before a quoted newline
if ($rawline =~ /^.*\".*\s\\n/) {
WARN("QUOTED_WHITESPACE_BEFORE_NEWLINE",
--
1.7.9
Loading...