[PATCH -tip v14 00/12] tracing: kprobe-based event tracer and x86 instruction decoder
Masami Hiramatsu
2009-08-13 20:34:04 UTC
Hi,

Here is version 14 of the kprobe-based event tracer patches for x86,
which lets you probe various kernel events through the ftrace interface.
The tracer supports per-probe filtering: you can set a filter on each
probe and view the format of each probe event.

This version includes the following fixes:
- Define remove_subsystem_dir() always (patch 6/12)
- Modify syscall_tracer because of ftrace_event_call change (patch 6/12)
- Support 'sa' argument for stack address (patch 8/12)
- Use call->data instead of container_of() macro. (patch 8/12)
- Assign new event id for each event. (patch 11/12)

Lai, this version still cannot be applied on top of your patch ('use defined
fields to print formats'), since I couldn't update your patch to
the latest -tip tree.

This patchset also includes an x86(-64) instruction decoder, which
supports non-SSE/FP opcodes and comes with an x86 opcode map. The decoder
is used to find instruction boundaries when inserting new
kprobes. I think it will be possible to share this opcode map
with KVM's decoder.
The decoder is tested at kernel build time: right after vmlinux is built,
the test compares the decoder's results against objdump's output.
You can enable that test with CONFIG_X86_DECODER_SELFTEST=y.

This series can be applied on the latest linux-2.6.31-rc5-tip.

This supports only x86(-32/-64), but porting it to other architectures
just needs kprobes/kretprobes plus register and stack access APIs.

I also made two tools for this tracer.
- A kprobe stress test script, which tests kprobes on all kernel symbols to
find symbols that should be blacklisted.
- A converter from C expressions to the kprobes event format, which helps you
define kprobes events by C source line number or function name, and local
variable name.

Enhancement ideas, to be added after merging:
- .init function tracing support.
- Support primitive types (long, ulong, int, uint, etc.) for args.


Kprobe-based Event Tracer
=========================

Overview
--------
This tracer is similar to the events tracer, which is based on the Tracepoint
infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe
and kretprobe). It can probe anywhere kprobes can probe (that is, all
function bodies except those of __kprobes functions).

Unlike the function tracer, this tracer can probe instructions inside of
kernel functions. It allows you to check which instruction has been executed.

Unlike the Tracepoint based events tracer, this tracer can add new probe points
on the fly.

Like the events tracer, this tracer doesn't need to be activated via
current_tracer. Instead, just set probe points via
/sys/kernel/debug/tracing/kprobe_events. You can also set filters on each
probe event via /sys/kernel/debug/tracing/events/kprobes/<EVENT>/filter.


Synopsis of kprobe_events
-------------------------
p[:EVENT] SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS] : Set a probe
r[:EVENT] SYMBOL[+0] [FETCHARGS] : Set a return probe

EVENT : Event name. If omitted, the event name is generated
based on SYMBOL+offs or MEMADDR.
SYMBOL[+offs|-offs] : Symbol+offset where the probe is inserted.
MEMADDR : Address where the probe is inserted.

FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
sN : Fetch Nth entry of stack (N >= 0)
sa : Fetch stack address.
@ADDR : Fetch memory at ADDR (ADDR should be in kernel)
@SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol)
aN : Fetch function argument. (N >= 0)(*)
rv : Fetch return value.(**)
ra : Fetch return address.(**)
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(***)

(*) aN may not be correct for asmlinkage functions or in the middle of a
function body.
(**) Only for return probes.
(***) This is useful for fetching a field of a data structure.
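
As a rough illustration of the synopsis above, the following sketch builds
one definition string in user space. The helper name format_probe_def() and
its parameters are hypothetical and are not part of this patch series; the
caller is assumed to pass an already formatted SYMBOL[+offs|-offs] or
MEMADDR string.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch series): build one
 * kprobe_events definition following the synopsis above.
 *   is_return : 0 for "p" (probe), non-zero for "r" (return probe)
 *   event     : event name, or NULL to let the tracer generate one
 *   place     : SYMBOL[+offs|-offs] or MEMADDR, already formatted
 *   fetchargs : space-separated FETCHARGS, or NULL for none
 * Returns the string length, or -1 if buf is too small.
 */
static int format_probe_def(char *buf, size_t size, int is_return,
			    const char *event, const char *place,
			    const char *fetchargs)
{
	int n = snprintf(buf, size, "%c%s%s %s%s%s",
			 is_return ? 'r' : 'p',
			 event ? ":" : "", event ? event : "",
			 place,
			 fetchargs ? " " : "", fetchargs ? fetchargs : "");

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The resulting string is what would be written to
/sys/kernel/debug/tracing/kprobe_events; the helper itself does no
validation of the symbol or the fetch arguments.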


Per-Probe Event Filtering
-------------------------
The per-probe event filtering feature allows you to set a different filter on
each probe and shows which arguments will be recorded in the trace buffer. If
an event name is specified right after 'p:' or 'r:' in kprobe_events, the
tracer adds an event under tracing/events/kprobes/<EVENT>. In that directory
you can see 'id', 'enabled', 'format' and 'filter'.

enabled:
You can enable/disable the probe by writing 1 or 0 to it.

format:
This shows the format of this probe event, along with the aliases of the
arguments you specified in kprobe_events.

filter:
You can write filtering rules for this event. You can use both alias
names and field names when describing filters.


Event Profiling
---------------
You can check the total number of probe hits and probe miss-hits via
/sys/kernel/debug/tracing/kprobe_profile.
The first column is event name, the second is the number of probe hits,
the third is the number of probe miss-hits.
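
A minimal sketch of reading one line of that file from user space follows.
It assumes only what is stated above (three columns: event name, hits,
miss-hits) and additionally assumes the columns are separated by
whitespace; the struct and function names are hypothetical.

```c
#include <stdio.h>

/* One row of kprobe_profile: event name, hits, miss-hits. */
struct probe_profile {
	char event[64];
	unsigned long hits;
	unsigned long misses;
};

/*
 * Hypothetical helper: parse one whitespace-separated profile line.
 * Returns 0 on success, -1 if the line does not have three columns.
 */
static int parse_profile_line(const char *line, struct probe_profile *p)
{
	if (sscanf(line, "%63s %lu %lu", p->event, &p->hits, &p->misses) != 3)
		return -1;
	return 0;
}
```

A caller would typically feed it each line returned by fgets() on the
profile file and skip lines for which it returns -1.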


Usage examples
--------------
To add a probe as a new event, write a new definition to kprobe_events
as below.

echo p:myprobe do_sys_open a0 a1 a2 a3 > /sys/kernel/debug/tracing/kprobe_events

This sets a kprobe at the top of the do_sys_open() function, recording
its 1st to 4th arguments as the "myprobe" event.

echo r:myretprobe do_sys_open rv ra >> /sys/kernel/debug/tracing/kprobe_events

This sets a kretprobe on the return point of the do_sys_open() function,
recording the return value and return address as the "myretprobe" event.
You can see the format of these events via
/sys/kernel/debug/tracing/events/kprobes/<EVENT>/format.

cat /sys/kernel/debug/tracing/events/kprobes/myprobe/format
name: myprobe
ID: 23
format:
field:unsigned short common_type; offset:0; size:2;
field:unsigned char common_flags; offset:2; size:1;
field:unsigned char common_preempt_count; offset:3; size:1;
field:int common_pid; offset:4; size:4;
field:int common_tgid; offset:8; size:4;

field: unsigned long ip; offset:16;tsize:8;
field: int nargs; offset:24;tsize:4;
field: unsigned long arg0; offset:32;tsize:8;
field: unsigned long arg1; offset:40;tsize:8;
field: unsigned long arg2; offset:48;tsize:8;
field: unsigned long arg3; offset:56;tsize:8;

alias: a0; original: arg0;
alias: a1; original: arg1;
alias: a2; original: arg2;
alias: a3; original: arg3;

print fmt: "%lx: 0x%lx 0x%lx 0x%lx 0x%lx", ip, arg0, arg1, arg2, arg3


You can see that the event has 4 arguments, each with a corresponding
alias expression.

echo > /sys/kernel/debug/tracing/kprobe_events

This clears all probe points. You can see the traced information via
/sys/kernel/debug/tracing/trace.

cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<...>-1447 [001] 1038282.286875: do_sys_open+0x0/0xd6: 0x3 0x7fffd1ec4440 0x8000 0x0
<...>-1447 [001] 1038282.286878: sys_openat+0xc/0xe <- do_sys_open: 0xfffffffffffffffe 0xffffffff81367a3a
<...>-1447 [001] 1038282.286885: do_sys_open+0x0/0xd6: 0xffffff9c 0x40413c 0x8000 0x1b6
<...>-1447 [001] 1038282.286915: sys_open+0x1b/0x1d <- do_sys_open: 0x3 0xffffffff81367a3a
<...>-1447 [001] 1038282.286969: do_sys_open+0x0/0xd6: 0xffffff9c 0x4041c6 0x98800 0x10
<...>-1447 [001] 1038282.286976: sys_open+0x1b/0x1d <- do_sys_open: 0x3 0xffffffff81367a3a


Each line shows when the kernel hits a probe, and <- SYMBOL means the kernel
returns from SYMBOL (e.g. "sys_open+0x1b/0x1d <- do_sys_open" means the kernel
returns from do_sys_open to sys_open+0x1b).


Thank you,

---

Masami Hiramatsu (12):
tracing: Add kprobes event profiling interface
tracing: Kprobe tracer assigns new event ids for each event
tracing: Generate names for each kprobe event automatically
tracing: Kprobe-tracer supports more than 6 arguments
tracing: add kprobe-based event tracer
tracing: Introduce TRACE_FIELD_ZERO() macro
tracing: ftrace dynamic ftrace_event_call support
x86: add pt_regs register and stack access APIs
kprobes: cleanup fix_riprel() using insn decoder on x86
kprobes: checks probe address is instruction boudary on x86
x86: x86 instruction decoder build-time selftest
x86: instruction decoder API


Documentation/trace/kprobetrace.txt | 148 ++++
arch/x86/Kconfig.debug | 9
arch/x86/Makefile | 3
arch/x86/include/asm/inat.h | 188 +++++
arch/x86/include/asm/inat_types.h | 29 +
arch/x86/include/asm/insn.h | 143 ++++
arch/x86/include/asm/ptrace.h | 62 ++
arch/x86/kernel/kprobes.c | 197 +++--
arch/x86/kernel/ptrace.c | 112 +++
arch/x86/lib/Makefile | 13
arch/x86/lib/inat.c | 78 ++
arch/x86/lib/insn.c | 464 +++++++++++++
arch/x86/lib/x86-opcode-map.txt | 719 ++++++++++++++++++++
arch/x86/tools/Makefile | 15
arch/x86/tools/distill.awk | 42 +
arch/x86/tools/gen-insn-attr-x86.awk | 314 +++++++++
arch/x86/tools/test_get_len.c | 113 +++
include/linux/ftrace_event.h | 14
include/linux/syscalls.h | 4
include/trace/ftrace.h | 19 -
include/trace/syscall.h | 8
kernel/trace/Kconfig | 12
kernel/trace/Makefile | 1
kernel/trace/trace.h | 23 +
kernel/trace/trace_event_types.h | 4
kernel/trace/trace_events.c | 119 ++-
kernel/trace/trace_export.c | 39 +
kernel/trace/trace_kprobe.c | 1234 ++++++++++++++++++++++++++++++++++
kernel/trace/trace_syscalls.c | 16
29 files changed, 3949 insertions(+), 193 deletions(-)
create mode 100644 Documentation/trace/kprobetrace.txt
create mode 100644 arch/x86/include/asm/inat.h
create mode 100644 arch/x86/include/asm/inat_types.h
create mode 100644 arch/x86/include/asm/insn.h
create mode 100644 arch/x86/lib/inat.c
create mode 100644 arch/x86/lib/insn.c
create mode 100644 arch/x86/lib/x86-opcode-map.txt
create mode 100644 arch/x86/tools/Makefile
create mode 100644 arch/x86/tools/distill.awk
create mode 100644 arch/x86/tools/gen-insn-attr-x86.awk
create mode 100644 arch/x86/tools/test_get_len.c
create mode 100644 kernel/trace/trace_kprobe.c
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-13 20:34:21 UTC
Add a user-space selftest of the x86 instruction decoder at kernel build time.
When CONFIG_X86_DECODER_SELFTEST=y, Kbuild builds a test harness for the x86
instruction decoder and runs it after building vmlinux.
The test compares the results of objdump with those of the x86 instruction
decoder and checks that there are no differences.
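
The objdump side of that comparison boils down to counting the hex bytes
printed for each instruction. The sketch below is modelled on the parsing
loop in test_get_len.c but is not the code from this patch: the function
name is hypothetical, and it uses sscanf()'s %n to advance instead of a
fixed three-character step.

```c
#include <stdio.h>

/*
 * Hypothetical helper modelled on the loop in test_get_len.c: convert
 * objdump's space-separated hex column ("8b 44 24 08") into raw bytes.
 * Returns the number of bytes stored, which is the instruction length
 * according to objdump.
 */
static int hex_column_to_bytes(const char *s, unsigned char *out, int max)
{
	int nb = 0, used;
	unsigned int b;

	/* %2x reads at most two hex digits; %n records how far we got. */
	while (nb < max && sscanf(s, "%2x%n", &b, &used) == 1) {
		out[nb++] = (unsigned char)b;
		s += used;
	}
	return nb;
}
```

The returned count is the value the selftest would compare against the
length reported by insn_get_length().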

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Signed-off-by: Jim Keniston <***@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

arch/x86/Kconfig.debug | 9 +++
arch/x86/Makefile | 3 +
arch/x86/tools/Makefile | 15 +++++
arch/x86/tools/distill.awk | 42 +++++++++++++++
arch/x86/tools/test_get_len.c | 113 +++++++++++++++++++++++++++++++++++++++++
5 files changed, 182 insertions(+), 0 deletions(-)
create mode 100644 arch/x86/tools/Makefile
create mode 100644 arch/x86/tools/distill.awk
create mode 100644 arch/x86/tools/test_get_len.c

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index d105f29..7d0b681 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -186,6 +186,15 @@ config X86_DS_SELFTEST
config HAVE_MMIOTRACE_SUPPORT
def_bool y

+config X86_DECODER_SELFTEST
+ bool "x86 instruction decoder selftest"
+ depends on DEBUG_KERNEL
+ ---help---
+ Perform x86 instruction decoder selftests at build time.
+ This option is useful for checking the sanity of x86 instruction
+ decoder code.
+ If unsure, say "N".
+
#
# IO delay types:
#
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 1f3851a..f79580c 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -154,6 +154,9 @@ all: bzImage
KBUILD_IMAGE := $(boot)/bzImage

bzImage: vmlinux
+ifeq ($(CONFIG_X86_DECODER_SELFTEST),y)
+ $(Q)$(MAKE) $(build)=arch/x86/tools posttest
+endif
$(Q)$(MAKE) $(build)=$(boot) $(KBUILD_IMAGE)
$(Q)mkdir -p $(objtree)/arch/$(UTS_MACHINE)/boot
$(Q)ln -fsn ../../x86/boot/bzImage $(objtree)/arch/$(UTS_MACHINE)/boot/$@
diff --git a/arch/x86/tools/Makefile b/arch/x86/tools/Makefile
new file mode 100644
index 0000000..3dd626b
--- /dev/null
+++ b/arch/x86/tools/Makefile
@@ -0,0 +1,15 @@
+PHONY += posttest
+quiet_cmd_posttest = TEST $@
+ cmd_posttest = $(OBJDUMP) -d $(objtree)/vmlinux | awk -f $(srctree)/arch/x86/tools/distill.awk | $(obj)/test_get_len
+
+posttest: $(obj)/test_get_len vmlinux
+ $(call cmd,posttest)
+
+hostprogs-y := test_get_len
+
+# -I needed for generated C source and C source which in the kernel tree.
+HOSTCFLAGS_test_get_len.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/
+
+# Dependancies are also needed.
+$(obj)/test_get_len.o: $(srctree)/arch/x86/lib/insn.c $(srctree)/arch/x86/lib/inat.c $(srctree)/arch/x86/include/asm/inat_types.h $(srctree)/arch/x86/include/asm/inat.h $(srctree)/arch/x86/include/asm/insn.h $(objtree)/arch/x86/lib/inat-tables.c
+
diff --git a/arch/x86/tools/distill.awk b/arch/x86/tools/distill.awk
new file mode 100644
index 0000000..d433619
--- /dev/null
+++ b/arch/x86/tools/distill.awk
@@ -0,0 +1,42 @@
+#!/bin/awk -f
+# Usage: objdump -d a.out | awk -f distill.awk | ./test_get_len
+# Distills the disassembly as follows:
+# - Removes all lines except the disassembled instructions.
+# - For instructions that exceed 1 line (7 bytes), crams all the hex bytes
+# into a single line.
+# - Remove bad(or prefix only) instructions
+
+BEGIN {
+ prev_addr = ""
+ prev_hex = ""
+ prev_mnemonic = ""
+ bad_expr = "(\\(bad\\)|^rex|^.byte|^rep(z|nz)$|^lock$|^es$|^cs$|^ss$|^ds$|^fs$|^gs$|^data(16|32)$|^addr(16|32|64))"
+ fwait_expr = "^9b "
+ fwait_str="9b\tfwait"
+}
+
+/^ *[0-9a-f]+:/ {
+ if (split($0, field, "\t") < 3) {
+ # This is a continuation of the same insn.
+ prev_hex = prev_hex field[2]
+ } else {
+ # Skip bad instructions
+ if (match(prev_mnemonic, bad_expr))
+ prev_addr = ""
+ # Split fwait from other f* instructions
+ if (match(prev_hex, fwait_expr) && prev_mnemonic != "fwait") {
+ printf "%s\t%s\n", prev_addr, fwait_str
+ sub(fwait_expr, "", prev_hex)
+ }
+ if (prev_addr != "")
+ printf "%s\t%s\t%s\n", prev_addr, prev_hex, prev_mnemonic
+ prev_addr = field[1]
+ prev_hex = field[2]
+ prev_mnemonic = field[3]
+ }
+}
+
+END {
+ if (prev_addr != "")
+ printf "%s\t%s\t%s\n", prev_addr, prev_hex, prev_mnemonic
+}
diff --git a/arch/x86/tools/test_get_len.c b/arch/x86/tools/test_get_len.c
new file mode 100644
index 0000000..1e81adb
--- /dev/null
+++ b/arch/x86/tools/test_get_len.c
@@ -0,0 +1,113 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2009
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <assert.h>
+
+#ifdef __x86_64__
+#define CONFIG_X86_64
+#else
+#define CONFIG_X86_32
+#endif
+#define unlikely(cond) (cond)
+
+#include <asm/insn.h>
+#include <inat.c>
+#include <insn.c>
+
+/*
+ * Test of instruction analysis in general and insn_get_length() in
+ * particular. See if insn_get_length() and the disassembler agree
+ * on the length of each instruction in an elf disassembly.
+ *
+ * Usage: objdump -d a.out | awk -f distill.awk | ./test_get_len
+ */
+
+const char *prog;
+
+static void usage(void)
+{
+ fprintf(stderr, "Usage: objdump -d a.out | awk -f distill.awk |"
+ " ./test_get_len\n");
+ exit(1);
+}
+
+static void malformed_line(const char *line, int line_nr)
+{
+ fprintf(stderr, "%s: malformed line %d:\n%s", prog, line_nr, line);
+ exit(3);
+}
+
+#define BUFSIZE 256
+
+int main(int argc, char **argv)
+{
+ char line[BUFSIZE];
+ unsigned char insn_buf[16];
+ struct insn insn;
+ int insns = 0;
+
+ prog = argv[0];
+ if (argc > 1)
+ usage();
+
+ while (fgets(line, BUFSIZE, stdin)) {
+ char copy[BUFSIZE], *s, *tab1, *tab2;
+ int nb = 0;
+ unsigned int b;
+
+ insns++;
+ memset(insn_buf, 0, 16);
+ strcpy(copy, line);
+ tab1 = strchr(copy, '\t');
+ if (!tab1)
+ malformed_line(line, insns);
+ s = tab1 + 1;
+ s += strspn(s, " ");
+ tab2 = strchr(s, '\t');
+ if (!tab2)
+ malformed_line(line, insns);
+ *tab2 = '\0'; /* Characters beyond tab2 aren't examined */
+ while (s < tab2) {
+ if (sscanf(s, "%x", &b) == 1) {
+ insn_buf[nb++] = (unsigned char) b;
+ s += 3;
+ } else
+ break;
+ }
+ /* Decode an instruction */
+#ifdef __x86_64__
+ insn_init(&insn, insn_buf, 1);
+#else
+ insn_init(&insn, insn_buf, 0);
+#endif
+ insn_get_length(&insn);
+ if (insn.length != nb) {
+ fprintf(stderr, "Error: %s", line);
+ fprintf(stderr, "Error: objdump says %d bytes, but "
+ "insn_get_length() says %d (attr:%x)\n", nb,
+ insn.length, insn.attr);
+ exit(2);
+ }
+ }
+ fprintf(stderr, "Succeed: decoded and checked %d instructions\n",
+ insns);
+ return 0;
+}
Masami Hiramatsu
2009-08-13 20:34:13 UTC
Add an x86 instruction decoder to the arch-specific libraries. This decoder
can decode the x86 instructions used in the kernel into prefix, opcode, ModRM,
SIB, displacement and immediate fields. It can also report the length of
each instruction.
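
To illustrate how the ModRM and SIB fields determine part of an
instruction's length, here is a minimal sketch. It reuses the
X86_MODRM_*/X86_SIB_* extractors defined in this patch's asm/insn.h, but
modrm_tail_len() itself is a hypothetical helper, not the decoder's code,
and it assumes 32- or 64-bit addressing only (no 16-bit addressing forms).

```c
/* Field extractors as defined in the patch's asm/insn.h. */
#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
#define X86_MODRM_RM(modrm) ((modrm) & 0x07)
#define X86_SIB_BASE(sib) ((sib) & 0x07)

/*
 * Hypothetical helper: number of SIB and displacement bytes that follow
 * a ModRM byte, assuming 32- or 64-bit addressing (no 16-bit forms).
 * The sib argument only matters when the ModRM byte selects a SIB.
 */
static int modrm_tail_len(unsigned char modrm, unsigned char sib)
{
	int mod = X86_MODRM_MOD(modrm);
	int rm = X86_MODRM_RM(modrm);
	int has_sib = (mod != 3 && rm == 4);
	int len = has_sib ? 1 : 0;

	if (mod == 1)
		len += 1;		/* disp8 */
	else if (mod == 2)
		len += 4;		/* disp32 */
	else if (mod == 0 && (rm == 5 || (has_sib && X86_SIB_BASE(sib) == 5)))
		len += 4;		/* disp32 without a base register */
	return len;
}
```

For the byte sequence 8b 44 24 08 (mov 0x8(%rsp),%eax), the ModRM byte is
0x44 and the SIB byte is 0x24, so the helper reports one SIB byte plus a
one-byte displacement; together with the opcode and ModRM bytes that gives
the 4-byte total.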

This version introduces instruction attributes for decoding instructions.
The instruction attribute tables are generated from the opcode map file
(x86-opcode-map.txt) by the generator script (gen-insn-attr-x86.awk).
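
As a sketch of what one such attribute looks like, the snippet below copies
a subset of the bit-layout macros from this patch's asm/inat.h (with macro
arguments parenthesised) and packs a single illustrative attribute. The
example_attr() function and its choice of "group 1 with a one-byte
immediate" are assumptions made for illustration, not an entry taken from
the generated tables.

```c
typedef unsigned int insn_attr_t;

/* Bit layout copied from the patch's asm/inat.h (subset). */
#define INAT_PFX_OFFS	0
#define INAT_PFX_BITS	4
#define INAT_ESC_OFFS	(INAT_PFX_OFFS + INAT_PFX_BITS)
#define INAT_ESC_BITS	2
#define INAT_GRP_OFFS	(INAT_ESC_OFFS + INAT_ESC_BITS)
#define INAT_GRP_BITS	5
#define INAT_GRP_MASK	(((1 << INAT_GRP_BITS) - 1) << INAT_GRP_OFFS)
#define INAT_IMM_OFFS	(INAT_GRP_OFFS + INAT_GRP_BITS)
#define INAT_IMM_BITS	3
#define INAT_IMM_MASK	(((1 << INAT_IMM_BITS) - 1) << INAT_IMM_OFFS)
#define INAT_FLAG_OFFS	(INAT_IMM_OFFS + INAT_IMM_BITS)
#define INAT_MODRM	(1 << (INAT_FLAG_OFFS + 1))
#define INAT_IMM_BYTE	1

#define INAT_MAKE_GROUP(grp)	(((grp) << INAT_GRP_OFFS) | INAT_MODRM)
#define INAT_MAKE_IMM(imm)	((imm) << INAT_IMM_OFFS)

/* Checking functions, same logic as the inline helpers in inat.h. */
static int inat_group_id(insn_attr_t attr)
{
	return (attr & INAT_GRP_MASK) >> INAT_GRP_OFFS;
}

static int inat_immediate_size(insn_attr_t attr)
{
	return (attr & INAT_IMM_MASK) >> INAT_IMM_OFFS;
}

static int inat_has_modrm(insn_attr_t attr)
{
	return attr & INAT_MODRM;
}

/* Illustrative attribute: group 1, takes ModRM, one-byte immediate. */
static insn_attr_t example_attr(void)
{
	return INAT_MAKE_GROUP(1) | INAT_MAKE_IMM(INAT_IMM_BYTE);
}
```

With this layout the group id occupies bits 6-10, the immediate size bits
11-13, and the ModRM flag bit 15, so callers should go through the
checking functions rather than test the bits directly.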

Currently, the opcode maps are based on the opcode maps in the Intel(R) 64 and
IA-32 Architectures Software Developer's Manual Vol.2: Appendix A,
and consist of the following two types of opcode tables.

1-byte/2-byte/3-byte opcode tables, which have 256 elements each, are
written as below:

Table: table-name
Referrer: escaped-name
opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
(or)
opcode: escape # escaped-name
EndTable

Group opcode tables, which have 8 elements each, are written as below:

GrpTable: GrpXXX
reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...] [| 2nd-mnemonic ...]
EndTable
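
The real generator is the awk script gen-insn-attr-x86.awk; purely to
illustrate the line syntax above, here is a small C sketch that splits one
"opcode: mnemonic ..." line. The function name and return codes are
hypothetical, operands and extras are ignored, and a "reg:" line of a group
table is parsed the same way as an opcode line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: split one "opcode: mnemonic ..." line of an
 * opcode table into its opcode byte and first mnemonic.
 * Returns 1 for an ordinary entry, 2 for an "escape" entry, and 0 for
 * anything else (Table:/Referrer:/EndTable lines, comments, blanks).
 */
static int parse_map_line(const char *line, unsigned int *opcode,
			  char mnemonic[32])
{
	/* Up to two hex digits, a colon, then the first word. */
	if (sscanf(line, "%2x: %31s", opcode, mnemonic) != 2)
		return 0;
	return strcmp(mnemonic, "escape") == 0 ? 2 : 1;
}
```

A fuller parser would also track the current Table/GrpTable name and the
operand and extra fields, which is what the awk generator needs in order to
emit the attribute tables.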

These opcode maps include a few SSE and FP opcodes (for setup), because
those opcodes are used in the kernel.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Signed-off-by: Jim Keniston <***@us.ibm.com>
Acked-by: H. Peter Anvin <***@zytor.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

arch/x86/include/asm/inat.h | 188 +++++++++
arch/x86/include/asm/inat_types.h | 29 +
arch/x86/include/asm/insn.h | 143 +++++++
arch/x86/lib/Makefile | 13 +
arch/x86/lib/inat.c | 78 ++++
arch/x86/lib/insn.c | 464 ++++++++++++++++++++++
arch/x86/lib/x86-opcode-map.txt | 719 ++++++++++++++++++++++++++++++++++
arch/x86/tools/gen-insn-attr-x86.awk | 314 +++++++++++++++
8 files changed, 1948 insertions(+), 0 deletions(-)
create mode 100644 arch/x86/include/asm/inat.h
create mode 100644 arch/x86/include/asm/inat_types.h
create mode 100644 arch/x86/include/asm/insn.h
create mode 100644 arch/x86/lib/inat.c
create mode 100644 arch/x86/lib/insn.c
create mode 100644 arch/x86/lib/x86-opcode-map.txt
create mode 100644 arch/x86/tools/gen-insn-attr-x86.awk

diff --git a/arch/x86/include/asm/inat.h b/arch/x86/include/asm/inat.h
new file mode 100644
index 0000000..2866fdd
--- /dev/null
+++ b/arch/x86/include/asm/inat.h
@@ -0,0 +1,188 @@
+#ifndef _ASM_X86_INAT_H
+#define _ASM_X86_INAT_H
+/*
+ * x86 instruction attributes
+ *
+ * Written by Masami Hiramatsu <***@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+#include <asm/inat_types.h>
+
+/*
+ * Internal bits. Don't use bitmasks directly, because these bits are
+ * unstable. You should use checking functions.
+ */
+
+#define INAT_OPCODE_TABLE_SIZE 256
+#define INAT_GROUP_TABLE_SIZE 8
+
+/* Legacy instruction prefixes */
+#define INAT_PFX_OPNDSZ 1 /* 0x66 */ /* LPFX1 */
+#define INAT_PFX_REPNE 2 /* 0xF2 */ /* LPFX2 */
+#define INAT_PFX_REPE 3 /* 0xF3 */ /* LPFX3 */
+#define INAT_PFX_LOCK 4 /* 0xF0 */
+#define INAT_PFX_CS 5 /* 0x2E */
+#define INAT_PFX_DS 6 /* 0x3E */
+#define INAT_PFX_ES 7 /* 0x26 */
+#define INAT_PFX_FS 8 /* 0x64 */
+#define INAT_PFX_GS 9 /* 0x65 */
+#define INAT_PFX_SS 10 /* 0x36 */
+#define INAT_PFX_ADDRSZ 11 /* 0x67 */
+
+#define INAT_LPREFIX_MAX 3
+
+/* Immediate size */
+#define INAT_IMM_BYTE 1
+#define INAT_IMM_WORD 2
+#define INAT_IMM_DWORD 3
+#define INAT_IMM_QWORD 4
+#define INAT_IMM_PTR 5
+#define INAT_IMM_VWORD32 6
+#define INAT_IMM_VWORD 7
+
+/* Legacy prefix */
+#define INAT_PFX_OFFS 0
+#define INAT_PFX_BITS 4
+#define INAT_PFX_MAX ((1 << INAT_PFX_BITS) - 1)
+#define INAT_PFX_MASK (INAT_PFX_MAX << INAT_PFX_OFFS)
+/* Escape opcodes */
+#define INAT_ESC_OFFS (INAT_PFX_OFFS + INAT_PFX_BITS)
+#define INAT_ESC_BITS 2
+#define INAT_ESC_MAX ((1 << INAT_ESC_BITS) - 1)
+#define INAT_ESC_MASK (INAT_ESC_MAX << INAT_ESC_OFFS)
+/* Group opcodes (1-16) */
+#define INAT_GRP_OFFS (INAT_ESC_OFFS + INAT_ESC_BITS)
+#define INAT_GRP_BITS 5
+#define INAT_GRP_MAX ((1 << INAT_GRP_BITS) - 1)
+#define INAT_GRP_MASK (INAT_GRP_MAX << INAT_GRP_OFFS)
+/* Immediates */
+#define INAT_IMM_OFFS (INAT_GRP_OFFS + INAT_GRP_BITS)
+#define INAT_IMM_BITS 3
+#define INAT_IMM_MASK (((1 << INAT_IMM_BITS) - 1) << INAT_IMM_OFFS)
+/* Flags */
+#define INAT_FLAG_OFFS (INAT_IMM_OFFS + INAT_IMM_BITS)
+#define INAT_REXPFX (1 << INAT_FLAG_OFFS)
+#define INAT_MODRM (1 << (INAT_FLAG_OFFS + 1))
+#define INAT_FORCE64 (1 << (INAT_FLAG_OFFS + 2))
+#define INAT_SCNDIMM (1 << (INAT_FLAG_OFFS + 3))
+#define INAT_MOFFSET (1 << (INAT_FLAG_OFFS + 4))
+#define INAT_VARIANT (1 << (INAT_FLAG_OFFS + 5))
+/* Attribute making macros for attribute tables */
+#define INAT_MAKE_PREFIX(pfx) (pfx << INAT_PFX_OFFS)
+#define INAT_MAKE_ESCAPE(esc) (esc << INAT_ESC_OFFS)
+#define INAT_MAKE_GROUP(grp) ((grp << INAT_GRP_OFFS) | INAT_MODRM)
+#define INAT_MAKE_IMM(imm) (imm << INAT_IMM_OFFS)
+
+/* Attribute search APIs */
+extern insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode);
+extern insn_attr_t inat_get_escape_attribute(insn_byte_t opcode,
+ insn_byte_t last_pfx,
+ insn_attr_t esc_attr);
+extern insn_attr_t inat_get_group_attribute(insn_byte_t modrm,
+ insn_byte_t last_pfx,
+ insn_attr_t esc_attr);
+
+/* Attribute checking functions */
+static inline int inat_is_prefix(insn_attr_t attr)
+{
+ return attr & INAT_PFX_MASK;
+}
+
+static inline int inat_is_address_size_prefix(insn_attr_t attr)
+{
+ return (attr & INAT_PFX_MASK) == INAT_PFX_ADDRSZ;
+}
+
+static inline int inat_is_operand_size_prefix(insn_attr_t attr)
+{
+ return (attr & INAT_PFX_MASK) == INAT_PFX_OPNDSZ;
+}
+
+static inline int inat_last_prefix_id(insn_attr_t attr)
+{
+ if ((attr & INAT_PFX_MASK) > INAT_LPREFIX_MAX)
+ return 0;
+ else
+ return attr & INAT_PFX_MASK;
+}
+
+static inline int inat_is_escape(insn_attr_t attr)
+{
+ return attr & INAT_ESC_MASK;
+}
+
+static inline int inat_escape_id(insn_attr_t attr)
+{
+ return (attr & INAT_ESC_MASK) >> INAT_ESC_OFFS;
+}
+
+static inline int inat_is_group(insn_attr_t attr)
+{
+ return attr & INAT_GRP_MASK;
+}
+
+static inline int inat_group_id(insn_attr_t attr)
+{
+ return (attr & INAT_GRP_MASK) >> INAT_GRP_OFFS;
+}
+
+static inline int inat_group_common_attribute(insn_attr_t attr)
+{
+ return attr & ~INAT_GRP_MASK;
+}
+
+static inline int inat_has_immediate(insn_attr_t attr)
+{
+ return attr & INAT_IMM_MASK;
+}
+
+static inline int inat_immediate_size(insn_attr_t attr)
+{
+ return (attr & INAT_IMM_MASK) >> INAT_IMM_OFFS;
+}
+
+static inline int inat_is_rex_prefix(insn_attr_t attr)
+{
+ return attr & INAT_REXPFX;
+}
+
+static inline int inat_has_modrm(insn_attr_t attr)
+{
+ return attr & INAT_MODRM;
+}
+
+static inline int inat_is_force64(insn_attr_t attr)
+{
+ return attr & INAT_FORCE64;
+}
+
+static inline int inat_has_second_immediate(insn_attr_t attr)
+{
+ return attr & INAT_SCNDIMM;
+}
+
+static inline int inat_has_moffset(insn_attr_t attr)
+{
+ return attr & INAT_MOFFSET;
+}
+
+static inline int inat_has_variant(insn_attr_t attr)
+{
+ return attr & INAT_VARIANT;
+}
+
+#endif
diff --git a/arch/x86/include/asm/inat_types.h b/arch/x86/include/asm/inat_types.h
new file mode 100644
index 0000000..cb3c20c
--- /dev/null
+++ b/arch/x86/include/asm/inat_types.h
@@ -0,0 +1,29 @@
+#ifndef _ASM_X86_INAT_TYPES_H
+#define _ASM_X86_INAT_TYPES_H
+/*
+ * x86 instruction attributes
+ *
+ * Written by Masami Hiramatsu <***@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+
+/* Instruction attributes */
+typedef unsigned int insn_attr_t;
+typedef unsigned char insn_byte_t;
+typedef signed int insn_value_t;
+
+#endif
diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
new file mode 100644
index 0000000..12b4e37
--- /dev/null
+++ b/arch/x86/include/asm/insn.h
@@ -0,0 +1,143 @@
+#ifndef _ASM_X86_INSN_H
+#define _ASM_X86_INSN_H
+/*
+ * x86 instruction analysis
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2009
+ */
+
+/* insn_attr_t is defined in inat.h */
+#include <asm/inat.h>
+
+struct insn_field {
+ union {
+ insn_value_t value;
+ insn_byte_t bytes[4];
+ };
+ /* !0 if we've run insn_get_xxx() for this field */
+ unsigned char got;
+ unsigned char nbytes;
+};
+
+struct insn {
+ struct insn_field prefixes; /*
+ * Prefixes
+ * prefixes.bytes[3]: last prefix
+ */
+ struct insn_field rex_prefix; /* REX prefix */
+ struct insn_field opcode; /*
+ * opcode.bytes[0]: opcode1
+ * opcode.bytes[1]: opcode2
+ * opcode.bytes[2]: opcode3
+ */
+ struct insn_field modrm;
+ struct insn_field sib;
+ struct insn_field displacement;
+ union {
+ struct insn_field immediate;
+ struct insn_field moffset1; /* for 64bit MOV */
+ struct insn_field immediate1; /* for 64bit imm or off16/32 */
+ };
+ union {
+ struct insn_field moffset2; /* for 64bit MOV */
+ struct insn_field immediate2; /* for 64bit imm or seg16 */
+ };
+
+ insn_attr_t attr;
+ unsigned char opnd_bytes;
+ unsigned char addr_bytes;
+ unsigned char length;
+ unsigned char x86_64;
+
+ const insn_byte_t *kaddr; /* kernel address of insn to analyze */
+ const insn_byte_t *next_byte;
+};
+
+#define X86_MODRM_MOD(modrm) (((modrm) & 0xc0) >> 6)
+#define X86_MODRM_REG(modrm) (((modrm) & 0x38) >> 3)
+#define X86_MODRM_RM(modrm) ((modrm) & 0x07)
+
+#define X86_SIB_SCALE(sib) (((sib) & 0xc0) >> 6)
+#define X86_SIB_INDEX(sib) (((sib) & 0x38) >> 3)
+#define X86_SIB_BASE(sib) ((sib) & 0x07)
+
+#define X86_REX_W(rex) ((rex) & 8)
+#define X86_REX_R(rex) ((rex) & 4)
+#define X86_REX_X(rex) ((rex) & 2)
+#define X86_REX_B(rex) ((rex) & 1)
+
+/* The last prefix is needed for two-byte and three-byte opcodes */
+static inline insn_byte_t insn_last_prefix(struct insn *insn)
+{
+ return insn->prefixes.bytes[3];
+}
+
+extern void insn_init(struct insn *insn, const void *kaddr, int x86_64);
+extern void insn_get_prefixes(struct insn *insn);
+extern void insn_get_opcode(struct insn *insn);
+extern void insn_get_modrm(struct insn *insn);
+extern void insn_get_sib(struct insn *insn);
+extern void insn_get_displacement(struct insn *insn);
+extern void insn_get_immediate(struct insn *insn);
+extern void insn_get_length(struct insn *insn);
+
+/* Attribute will be determined after getting ModRM (for opcode groups) */
+static inline void insn_get_attribute(struct insn *insn)
+{
+ insn_get_modrm(insn);
+}
+
+/* Instruction uses RIP-relative addressing */
+extern int insn_rip_relative(struct insn *insn);
+
+/* Init insn for kernel text */
+static inline void kernel_insn_init(struct insn *insn, const void *kaddr)
+{
+#ifdef CONFIG_X86_64
+ insn_init(insn, kaddr, 1);
+#else /* CONFIG_X86_32 */
+ insn_init(insn, kaddr, 0);
+#endif
+}
+
+/* Offset of each field from kaddr */
+static inline int insn_offset_rex_prefix(struct insn *insn)
+{
+ return insn->prefixes.nbytes;
+}
+static inline int insn_offset_opcode(struct insn *insn)
+{
+ return insn_offset_rex_prefix(insn) + insn->rex_prefix.nbytes;
+}
+static inline int insn_offset_modrm(struct insn *insn)
+{
+ return insn_offset_opcode(insn) + insn->opcode.nbytes;
+}
+static inline int insn_offset_sib(struct insn *insn)
+{
+ return insn_offset_modrm(insn) + insn->modrm.nbytes;
+}
+static inline int insn_offset_displacement(struct insn *insn)
+{
+ return insn_offset_sib(insn) + insn->sib.nbytes;
+}
+static inline int insn_offset_immediate(struct insn *insn)
+{
+ return insn_offset_displacement(insn) + insn->displacement.nbytes;
+}
+
+#endif /* _ASM_X86_INSN_H */
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 07c3189..c77f8a7 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -2,12 +2,25 @@
# Makefile for x86 specific library files.
#

+inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk
+inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt
+quiet_cmd_inat_tables = GEN $@
+ cmd_inat_tables = $(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@
+
+$(obj)/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
+ $(call cmd,inat_tables)
+
+$(obj)/inat.o: $(obj)/inat-tables.c
+
+clean-files := inat-tables.c
+
obj-$(CONFIG_SMP) := msr.o

lib-y := delay.o
lib-y += thunk_$(BITS).o
lib-y += usercopy_$(BITS).o getuser.o putuser.o
lib-y += memcpy_$(BITS).o
+lib-y += insn.o inat.o

ifeq ($(CONFIG_X86_32),y)
obj-y += atomic64_32.o
diff --git a/arch/x86/lib/inat.c b/arch/x86/lib/inat.c
new file mode 100644
index 0000000..054656a
--- /dev/null
+++ b/arch/x86/lib/inat.c
@@ -0,0 +1,78 @@
+/*
+ * x86 instruction attribute tables
+ *
+ * Written by Masami Hiramatsu <***@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+#include <asm/insn.h>
+
+/* Attribute tables are generated from opcode map */
+#include "inat-tables.c"
+
+/* Attribute search APIs */
+insn_attr_t inat_get_opcode_attribute(insn_byte_t opcode)
+{
+ return inat_primary_table[opcode];
+}
+
+insn_attr_t inat_get_escape_attribute(insn_byte_t opcode, insn_byte_t last_pfx,
+ insn_attr_t esc_attr)
+{
+ const insn_attr_t *table;
+ insn_attr_t lpfx_attr;
+ int n, m = 0;
+
+ n = inat_escape_id(esc_attr);
+ if (last_pfx) {
+ lpfx_attr = inat_get_opcode_attribute(last_pfx);
+ m = inat_last_prefix_id(lpfx_attr);
+ }
+ table = inat_escape_tables[n][0];
+ if (!table)
+ return 0;
+ if (inat_has_variant(table[opcode]) && m) {
+ table = inat_escape_tables[n][m];
+ if (!table)
+ return 0;
+ }
+ return table[opcode];
+}
+
+insn_attr_t inat_get_group_attribute(insn_byte_t modrm, insn_byte_t last_pfx,
+ insn_attr_t grp_attr)
+{
+ const insn_attr_t *table;
+ insn_attr_t lpfx_attr;
+ int n, m = 0;
+
+ n = inat_group_id(grp_attr);
+ if (last_pfx) {
+ lpfx_attr = inat_get_opcode_attribute(last_pfx);
+ m = inat_last_prefix_id(lpfx_attr);
+ }
+ table = inat_group_tables[n][0];
+ if (!table)
+ return inat_group_common_attribute(grp_attr);
+ if (inat_has_variant(table[X86_MODRM_REG(modrm)]) && m) {
+ table = inat_group_tables[n][m];
+ if (!table)
+ return inat_group_common_attribute(grp_attr);
+ }
+ return table[X86_MODRM_REG(modrm)] |
+ inat_group_common_attribute(grp_attr);
+}
+
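A note on the group lookup in inat.c above: the group attribute is selected by the reg field of the ModRM byte (bits 5:3). The following is only a minimal sketch of that indexing, not code from the patch. The names modrm_reg(), grp5_names and grp5_lookup() are hypothetical, and the string table merely stands in for a generated attribute table, using the Grp5 mnemonics from the opcode map in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* ModRM layout: mod = bits 7:6, reg = bits 5:3, r/m = bits 2:0 */
static unsigned int modrm_reg(uint8_t modrm)
{
	return (modrm & 0x38) >> 3;
}

/*
 * Hypothetical stand-in for a generated group table: one slot per
 * reg value, here filled with the Grp5 (opcode 0xff) mnemonics.
 * Slot 7 is undefined in the opcode map, so it stays NULL.
 */
static const char *const grp5_names[8] = {
	"INC", "DEC", "CALLN", "CALLF", "JMPN", "JMPF", "PUSH", NULL,
};

/* Pick the group member selected by the ModRM byte. */
static const char *grp5_lookup(uint8_t modrm)
{
	return grp5_names[modrm_reg(modrm)];
}
```

For example, the byte sequence ff d0 has ModRM 0xd0 (mod = 11, reg = 010), so the sketch selects the CALLN entry.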
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
new file mode 100644
index 0000000..dfd56a3
--- /dev/null
+++ b/arch/x86/lib/insn.c
@@ -0,0 +1,464 @@
+/*
+ * x86 instruction analysis
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2002, 2004, 2009
+ */
+
+#include <linux/string.h>
+#include <asm/inat.h>
+#include <asm/insn.h>
+
+#define get_next(t, insn) \
+ ({t r; r = *(t*)insn->next_byte; insn->next_byte += sizeof(t); r; })
+
+#define peek_next(t, insn) \
+ ({t r; r = *(t*)insn->next_byte; r; })
+
+/**
+ * insn_init() - initialize struct insn
+ * @insn: &struct insn to be initialized
+ * @kaddr: address (in kernel memory) of instruction (or copy thereof)
+ * @x86_64: !0 for 64-bit kernel or 64-bit app
+ */
+void insn_init(struct insn *insn, const void *kaddr, int x86_64)
+{
+ memset(insn, 0, sizeof(*insn));
+ insn->kaddr = kaddr;
+ insn->next_byte = kaddr;
+ insn->x86_64 = x86_64 ? 1 : 0;
+ insn->opnd_bytes = 4;
+ if (x86_64)
+ insn->addr_bytes = 8;
+ else
+ insn->addr_bytes = 4;
+}
+
+/**
+ * insn_get_prefixes - scan x86 instruction prefix bytes
+ * @insn: &struct insn containing instruction
+ *
+ * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
+ * to point to the (first) opcode. No effect if @insn->prefixes.got
+ * is already set.
+ */
+void insn_get_prefixes(struct insn *insn)
+{
+ struct insn_field *prefixes = &insn->prefixes;
+ insn_attr_t attr;
+ insn_byte_t b, lb;
+ int i, nb;
+
+ if (prefixes->got)
+ return;
+
+ nb = 0;
+ lb = 0;
+ b = peek_next(insn_byte_t, insn);
+ attr = inat_get_opcode_attribute(b);
+ while (inat_is_prefix(attr)) {
+ /* Skip if same prefix */
+ for (i = 0; i < nb; i++)
+ if (prefixes->bytes[i] == b)
+ goto found;
+ if (nb == 4)
+ /* Invalid instruction */
+ break;
+ prefixes->bytes[nb++] = b;
+ if (inat_is_address_size_prefix(attr)) {
+ /* address size switches 2/4 or 4/8 */
+ if (insn->x86_64)
+ insn->addr_bytes ^= 12;
+ else
+ insn->addr_bytes ^= 6;
+ } else if (inat_is_operand_size_prefix(attr)) {
+ /* operand size switches 2/4 */
+ insn->opnd_bytes ^= 6;
+ }
+found:
+ prefixes->nbytes++;
+ insn->next_byte++;
+ lb = b;
+ b = peek_next(insn_byte_t, insn);
+ attr = inat_get_opcode_attribute(b);
+ }
+ /* Set the last prefix */
+ if (lb && lb != insn->prefixes.bytes[3]) {
+ if (unlikely(insn->prefixes.bytes[3])) {
+ /* Swap the last prefix */
+ b = insn->prefixes.bytes[3];
+ for (i = 0; i < nb; i++)
+ if (prefixes->bytes[i] == lb)
+ prefixes->bytes[i] = b;
+ }
+ insn->prefixes.bytes[3] = lb;
+ }
+
+ if (insn->x86_64) {
+ b = peek_next(insn_byte_t, insn);
+ attr = inat_get_opcode_attribute(b);
+ if (inat_is_rex_prefix(attr)) {
+ insn->rex_prefix.value = b;
+ insn->rex_prefix.nbytes = 1;
+ insn->next_byte++;
+ if (X86_REX_W(b))
+ /* REX.W overrides opnd_size */
+ insn->opnd_bytes = 8;
+ }
+ }
+ insn->rex_prefix.got = 1;
+ prefixes->got = 1;
+ return;
+}
+
+/**
+ * insn_get_opcode - collect opcode(s)
+ * @insn: &struct insn containing instruction
+ *
+ * Populates @insn->opcode, updates @insn->next_byte to point past the
+ * opcode byte(s), and sets @insn->attr (except for groups).
+ * If necessary, first collects any preceding (prefix) bytes.
+ * Sets @insn->opcode.value = opcode1. No effect if @insn->opcode.got
+ * is already 1.
+ */
+void insn_get_opcode(struct insn *insn)
+{
+ struct insn_field *opcode = &insn->opcode;
+ insn_byte_t op, pfx;
+ if (opcode->got)
+ return;
+ if (!insn->prefixes.got)
+ insn_get_prefixes(insn);
+
+ /* Get first opcode */
+ op = get_next(insn_byte_t, insn);
+ opcode->bytes[0] = op;
+ opcode->nbytes = 1;
+ insn->attr = inat_get_opcode_attribute(op);
+ while (inat_is_escape(insn->attr)) {
+ /* Get escaped opcode */
+ op = get_next(insn_byte_t, insn);
+ opcode->bytes[opcode->nbytes++] = op;
+ pfx = insn_last_prefix(insn);
+ insn->attr = inat_get_escape_attribute(op, pfx, insn->attr);
+ }
+ opcode->got = 1;
+}
+
+/**
+ * insn_get_modrm - collect ModRM byte, if any
+ * @insn: &struct insn containing instruction
+ *
+ * Populates @insn->modrm and updates @insn->next_byte to point past the
+ * ModRM byte, if any. If necessary, first collects the preceding bytes
+ * (prefixes and opcode(s)). No effect if @insn->modrm.got is already 1.
+ */
+void insn_get_modrm(struct insn *insn)
+{
+ struct insn_field *modrm = &insn->modrm;
+ insn_byte_t pfx, mod;
+ if (modrm->got)
+ return;
+ if (!insn->opcode.got)
+ insn_get_opcode(insn);
+
+ if (inat_has_modrm(insn->attr)) {
+ mod = get_next(insn_byte_t, insn);
+ modrm->value = mod;
+ modrm->nbytes = 1;
+ if (inat_is_group(insn->attr)) {
+ pfx = insn_last_prefix(insn);
+ insn->attr = inat_get_group_attribute(mod, pfx,
+ insn->attr);
+ }
+ }
+
+ if (insn->x86_64 && inat_is_force64(insn->attr))
+ insn->opnd_bytes = 8;
+ modrm->got = 1;
+}
+
+
+/**
+ * insn_rip_relative() - Does instruction use RIP-relative addressing mode?
+ * @insn: &struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * ModRM byte. No effect if @insn->x86_64 is 0.
+ */
+int insn_rip_relative(struct insn *insn)
+{
+ struct insn_field *modrm = &insn->modrm;
+
+ if (!insn->x86_64)
+ return 0;
+ if (!modrm->got)
+ insn_get_modrm(insn);
+ /*
+ * For rip-relative instructions, the mod field (top 2 bits)
+ * is zero and the r/m field (bottom 3 bits) is 0x5.
+ */
+ return (modrm->nbytes && (modrm->value & 0xc7) == 0x5);
+}
+
+/**
+ * insn_get_sib() - Get the SIB byte of instruction
+ * @insn: &struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * ModRM byte.
+ */
+void insn_get_sib(struct insn *insn)
+{
+ insn_byte_t modrm;
+
+ if (insn->sib.got)
+ return;
+ if (!insn->modrm.got)
+ insn_get_modrm(insn);
+ if (insn->modrm.nbytes) {
+ modrm = (insn_byte_t)insn->modrm.value;
+ if (insn->addr_bytes != 2 &&
+ X86_MODRM_MOD(modrm) != 3 && X86_MODRM_RM(modrm) == 4) {
+ insn->sib.value = get_next(insn_byte_t, insn);
+ insn->sib.nbytes = 1;
+ }
+ }
+ insn->sib.got = 1;
+}
+
+
+/**
+ * insn_get_displacement() - Get the displacement of instruction
+ * @insn: &struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * SIB byte.
+ * Displacement value is sign-extended.
+ */
+void insn_get_displacement(struct insn *insn)
+{
+ insn_byte_t mod, rm, base;
+
+ if (insn->displacement.got)
+ return;
+ if (!insn->sib.got)
+ insn_get_sib(insn);
+ if (insn->modrm.nbytes) {
+ /*
+ * Interpreting the modrm byte:
+ * mod = 00 - no displacement fields (exceptions below)
+ * mod = 01 - 1-byte displacement field
+ * mod = 10 - displacement field is 4 bytes, or 2 bytes if
+ * address size = 2 (0x67 prefix in 32-bit mode)
+ * mod = 11 - no memory operand
+ *
+ * If address size = 2...
+ * mod = 00, r/m = 110 - displacement field is 2 bytes
+ *
+ * If address size != 2...
+ * mod != 11, r/m = 100 - SIB byte exists
+ * mod = 00, SIB base = 101 - displacement field is 4 bytes
+ * mod = 00, r/m = 101 - rip-relative addressing, displacement
+ * field is 4 bytes
+ */
+ mod = X86_MODRM_MOD(insn->modrm.value);
+ rm = X86_MODRM_RM(insn->modrm.value);
+ base = X86_SIB_BASE(insn->sib.value);
+ if (mod == 3)
+ goto out;
+ if (mod == 1) {
+ insn->displacement.value = get_next(char, insn);
+ insn->displacement.nbytes = 1;
+ } else if (insn->addr_bytes == 2) {
+ if ((mod == 0 && rm == 6) || mod == 2) {
+ insn->displacement.value =
+ get_next(short, insn);
+ insn->displacement.nbytes = 2;
+ }
+ } else {
+ if ((mod == 0 && rm == 5) || mod == 2 ||
+ (mod == 0 && base == 5)) {
+ insn->displacement.value = get_next(int, insn);
+ insn->displacement.nbytes = 4;
+ }
+ }
+ }
+out:
+ insn->displacement.got = 1;
+}
+
+/* Decode moffset16/32/64 */
+static void __get_moffset(struct insn *insn)
+{
+ switch (insn->addr_bytes) {
+ case 2:
+ insn->moffset1.value = get_next(short, insn);
+ insn->moffset1.nbytes = 2;
+ break;
+ case 4:
+ insn->moffset1.value = get_next(int, insn);
+ insn->moffset1.nbytes = 4;
+ break;
+ case 8:
+ insn->moffset1.value = get_next(int, insn);
+ insn->moffset1.nbytes = 4;
+ insn->moffset2.value = get_next(int, insn);
+ insn->moffset2.nbytes = 4;
+ break;
+ }
+ insn->moffset1.got = insn->moffset2.got = 1;
+}
+
+/* Decode imm v32(Iz) */
+static void __get_immv32(struct insn *insn)
+{
+ switch (insn->opnd_bytes) {
+ case 2:
+ insn->immediate.value = get_next(short, insn);
+ insn->immediate.nbytes = 2;
+ break;
+ case 4:
+ case 8:
+ insn->immediate.value = get_next(int, insn);
+ insn->immediate.nbytes = 4;
+ break;
+ }
+}
+
+/* Decode imm v64(Iv/Ov) */
+static void __get_immv(struct insn *insn)
+{
+ switch (insn->opnd_bytes) {
+ case 2:
+ insn->immediate1.value = get_next(short, insn);
+ insn->immediate1.nbytes = 2;
+ break;
+ case 4:
+ insn->immediate1.value = get_next(int, insn);
+ insn->immediate1.nbytes = 4;
+ break;
+ case 8:
+ insn->immediate1.value = get_next(int, insn);
+ insn->immediate1.nbytes = 4;
+ insn->immediate2.value = get_next(int, insn);
+ insn->immediate2.nbytes = 4;
+ break;
+ }
+ insn->immediate1.got = insn->immediate2.got = 1;
+}
+
+/* Decode ptr16:16/32(Ap) */
+static void __get_immptr(struct insn *insn)
+{
+ switch (insn->opnd_bytes) {
+ case 2:
+ insn->immediate1.value = get_next(short, insn);
+ insn->immediate1.nbytes = 2;
+ break;
+ case 4:
+ insn->immediate1.value = get_next(int, insn);
+ insn->immediate1.nbytes = 4;
+ break;
+ case 8:
+ /* ptr16:64 does not exist (no segment) */
+ return;
+ }
+ insn->immediate2.value = get_next(unsigned short, insn);
+ insn->immediate2.nbytes = 2;
+ insn->immediate1.got = insn->immediate2.got = 1;
+}
+
+/**
+ * insn_get_immediate() - Get the immediates of instruction
+ * @insn: &struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * displacement bytes.
+ * Most immediates are sign-extended. The unsigned value can be
+ * obtained by masking with ((1 << (nbytes * 8)) - 1).
+ */
+void insn_get_immediate(struct insn *insn)
+{
+ if (insn->immediate.got)
+ return;
+ if (!insn->displacement.got)
+ insn_get_displacement(insn);
+
+ if (inat_has_moffset(insn->attr)) {
+ __get_moffset(insn);
+ goto done;
+ }
+
+ if (!inat_has_immediate(insn->attr))
+ /* no immediates */
+ goto done;
+
+ switch (inat_immediate_size(insn->attr)) {
+ case INAT_IMM_BYTE:
+ insn->immediate.value = get_next(char, insn);
+ insn->immediate.nbytes = 1;
+ break;
+ case INAT_IMM_WORD:
+ insn->immediate.value = get_next(short, insn);
+ insn->immediate.nbytes = 2;
+ break;
+ case INAT_IMM_DWORD:
+ insn->immediate.value = get_next(int, insn);
+ insn->immediate.nbytes = 4;
+ break;
+ case INAT_IMM_QWORD:
+ insn->immediate1.value = get_next(int, insn);
+ insn->immediate1.nbytes = 4;
+ insn->immediate2.value = get_next(int, insn);
+ insn->immediate2.nbytes = 4;
+ break;
+ case INAT_IMM_PTR:
+ __get_immptr(insn);
+ break;
+ case INAT_IMM_VWORD32:
+ __get_immv32(insn);
+ break;
+ case INAT_IMM_VWORD:
+ __get_immv(insn);
+ break;
+ default:
+ break;
+ }
+ if (inat_has_second_immediate(insn->attr)) {
+ insn->immediate2.value = get_next(char, insn);
+ insn->immediate2.nbytes = 1;
+ }
+done:
+ insn->immediate.got = 1;
+}
+
+/**
+ * insn_get_length() - Get the length of instruction
+ * @insn: &struct insn containing instruction
+ *
+ * If necessary, first collects the instruction up to and including the
+ * immediate bytes.
+ */
+void insn_get_length(struct insn *insn)
+{
+ if (insn->length)
+ return;
+ if (!insn->immediate.got)
+ insn_get_immediate(insn);
+ insn->length = (unsigned char)((unsigned long)insn->next_byte
+ - (unsigned long)insn->kaddr);
+}
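For reviewers, here is a small standalone sketch of three rules that insn.c above relies on: the XOR toggle for the address-size prefix, the RIP-relative test, and the displacement-length table from the comment in insn_get_displacement(). It is an illustration under stated assumptions, not code from the patch. The helper names toggle_addr_bytes(), is_rip_relative() and disp_nbytes() are hypothetical. Unlike the patch, the sketch only looks at the SIB base when a SIB byte is actually present.

```c
#include <stdint.h>

/*
 * Address-size prefix (0x67): 4 ^ 12 == 8 and 8 ^ 12 == 4 in 64-bit
 * mode; 4 ^ 6 == 2 and 2 ^ 6 == 4 in 32-bit mode. One XOR toggles
 * between the two legal sizes.
 */
static int toggle_addr_bytes(int addr_bytes, int x86_64)
{
	return addr_bytes ^ (x86_64 ? 12 : 6);
}

/* RIP-relative (64-bit mode only): mod == 00 and r/m == 101 */
static int is_rip_relative(uint8_t modrm)
{
	return (modrm & 0xc7) == 0x05;
}

/* Displacement length in bytes implied by ModRM (and SIB, if any). */
static int disp_nbytes(uint8_t modrm, uint8_t sib, int addr_bytes)
{
	int mod = (modrm & 0xc0) >> 6;
	int rm = modrm & 0x07;
	/* A SIB byte exists only for non-16-bit addressing, r/m == 100 */
	int has_sib = addr_bytes != 2 && mod != 3 && rm == 4;
	int base = has_sib ? (sib & 0x07) : -1;

	if (mod == 3)		/* register operand, no memory access */
		return 0;
	if (mod == 1)		/* disp8 */
		return 1;
	if (addr_bytes == 2)	/* 16-bit addressing: disp16 cases */
		return ((mod == 0 && rm == 6) || mod == 2) ? 2 : 0;
	if (mod == 2 || (mod == 0 && rm == 5) || (mod == 0 && base == 5))
		return 4;	/* disp32 */
	return 0;
}
```

For example, ModRM 0x05 with 8-byte addressing is RIP-relative and carries a 4-byte displacement, while ModRM 0x45 (mod = 01) carries a 1-byte displacement.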
diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
new file mode 100644
index 0000000..083dd59
--- /dev/null
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -0,0 +1,719 @@
+# x86 Opcode Maps
+#
+#<Opcode maps>
+# Table: table-name
+# Referrer: escaped-name
+# opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
+# (or)
+# opcode: escape # escaped-name
+# EndTable
+#
+#<group maps>
+# GrpTable: GrpXXX
+# reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
+# EndTable
+#
+
+Table: one byte opcode
+Referrer:
+# 0x00 - 0x0f
+00: ADD Eb,Gb
+01: ADD Ev,Gv
+02: ADD Gb,Eb
+03: ADD Gv,Ev
+04: ADD AL,Ib
+05: ADD rAX,Iz
+06: PUSH ES (i64)
+07: POP ES (i64)
+08: OR Eb,Gb
+09: OR Ev,Gv
+0a: OR Gb,Eb
+0b: OR Gv,Ev
+0c: OR AL,Ib
+0d: OR rAX,Iz
+0e: PUSH CS (i64)
+0f: escape # 2-byte escape
+# 0x10 - 0x1f
+10: ADC Eb,Gb
+11: ADC Ev,Gv
+12: ADC Gb,Eb
+13: ADC Gv,Ev
+14: ADC AL,Ib
+15: ADC rAX,Iz
+16: PUSH SS (i64)
+17: POP SS (i64)
+18: SBB Eb,Gb
+19: SBB Ev,Gv
+1a: SBB Gb,Eb
+1b: SBB Gv,Ev
+1c: SBB AL,Ib
+1d: SBB rAX,Iz
+1e: PUSH DS (i64)
+1f: POP DS (i64)
+# 0x20 - 0x2f
+20: AND Eb,Gb
+21: AND Ev,Gv
+22: AND Gb,Eb
+23: AND Gv,Ev
+24: AND AL,Ib
+25: AND rAX,Iz
+26: SEG=ES (Prefix)
+27: DAA (i64)
+28: SUB Eb,Gb
+29: SUB Ev,Gv
+2a: SUB Gb,Eb
+2b: SUB Gv,Ev
+2c: SUB AL,Ib
+2d: SUB rAX,Iz
+2e: SEG=CS (Prefix)
+2f: DAS (i64)
+# 0x30 - 0x3f
+30: XOR Eb,Gb
+31: XOR Ev,Gv
+32: XOR Gb,Eb
+33: XOR Gv,Ev
+34: XOR AL,Ib
+35: XOR rAX,Iz
+36: SEG=SS (Prefix)
+37: AAA (i64)
+38: CMP Eb,Gb
+39: CMP Ev,Gv
+3a: CMP Gb,Eb
+3b: CMP Gv,Ev
+3c: CMP AL,Ib
+3d: CMP rAX,Iz
+3e: SEG=DS (Prefix)
+3f: AAS (i64)
+# 0x40 - 0x4f
+40: INC eAX (i64) | REX (o64)
+41: INC eCX (i64) | REX.B (o64)
+42: INC eDX (i64) | REX.X (o64)
+43: INC eBX (i64) | REX.XB (o64)
+44: INC eSP (i64) | REX.R (o64)
+45: INC eBP (i64) | REX.RB (o64)
+46: INC eSI (i64) | REX.RX (o64)
+47: INC eDI (i64) | REX.RXB (o64)
+48: DEC eAX (i64) | REX.W (o64)
+49: DEC eCX (i64) | REX.WB (o64)
+4a: DEC eDX (i64) | REX.WX (o64)
+4b: DEC eBX (i64) | REX.WXB (o64)
+4c: DEC eSP (i64) | REX.WR (o64)
+4d: DEC eBP (i64) | REX.WRB (o64)
+4e: DEC eSI (i64) | REX.WRX (o64)
+4f: DEC eDI (i64) | REX.WRXB (o64)
+# 0x50 - 0x5f
+50: PUSH rAX/r8 (d64)
+51: PUSH rCX/r9 (d64)
+52: PUSH rDX/r10 (d64)
+53: PUSH rBX/r11 (d64)
+54: PUSH rSP/r12 (d64)
+55: PUSH rBP/r13 (d64)
+56: PUSH rSI/r14 (d64)
+57: PUSH rDI/r15 (d64)
+58: POP rAX/r8 (d64)
+59: POP rCX/r9 (d64)
+5a: POP rDX/r10 (d64)
+5b: POP rBX/r11 (d64)
+5c: POP rSP/r12 (d64)
+5d: POP rBP/r13 (d64)
+5e: POP rSI/r14 (d64)
+5f: POP rDI/r15 (d64)
+# 0x60 - 0x6f
+60: PUSHA/PUSHAD (i64)
+61: POPA/POPAD (i64)
+62: BOUND Gv,Ma (i64)
+63: ARPL Ew,Gw (i64) | MOVSXD Gv,Ev (o64)
+64: SEG=FS (Prefix)
+65: SEG=GS (Prefix)
+66: Operand-Size (Prefix)
+67: Address-Size (Prefix)
+68: PUSH Iz (d64)
+69: IMUL Gv,Ev,Iz
+6a: PUSH Ib (d64)
+6b: IMUL Gv,Ev,Ib
+6c: INS/INSB Yb,DX
+6d: INS/INSW/INSD Yz,DX
+6e: OUTS/OUTSB DX,Xb
+6f: OUTS/OUTSW/OUTSD DX,Xz
+# 0x70 - 0x7f
+70: JO Jb
+71: JNO Jb
+72: JB/JNAE/JC Jb
+73: JNB/JAE/JNC Jb
+74: JZ/JE Jb
+75: JNZ/JNE Jb
+76: JBE/JNA Jb
+77: JNBE/JA Jb
+78: JS Jb
+79: JNS Jb
+7a: JP/JPE Jb
+7b: JNP/JPO Jb
+7c: JL/JNGE Jb
+7d: JNL/JGE Jb
+7e: JLE/JNG Jb
+7f: JNLE/JG Jb
+# 0x80 - 0x8f
+80: Grp1 Eb,Ib (1A)
+81: Grp1 Ev,Iz (1A)
+82: Grp1 Eb,Ib (1A),(i64)
+83: Grp1 Ev,Ib (1A)
+84: TEST Eb,Gb
+85: TEST Ev,Gv
+86: XCHG Eb,Gb
+87: XCHG Ev,Gv
+88: MOV Eb,Gb
+89: MOV Ev,Gv
+8a: MOV Gb,Eb
+8b: MOV Gv,Ev
+8c: MOV Ev,Sw
+8d: LEA Gv,M
+8e: MOV Sw,Ew
+8f: Grp1A (1A) | POP Ev (d64)
+# 0x90 - 0x9f
+90: NOP | PAUSE (F3) | XCHG r8,rAX
+91: XCHG rCX/r9,rAX
+92: XCHG rDX/r10,rAX
+93: XCHG rBX/r11,rAX
+94: XCHG rSP/r12,rAX
+95: XCHG rBP/r13,rAX
+96: XCHG rSI/r14,rAX
+97: XCHG rDI/r15,rAX
+98: CBW/CWDE/CDQE
+99: CWD/CDQ/CQO
+9a: CALLF Ap (i64)
+9b: FWAIT/WAIT
+9c: PUSHF/D/Q Fv (d64)
+9d: POPF/D/Q Fv (d64)
+9e: SAHF
+9f: LAHF
+# 0xa0 - 0xaf
+a0: MOV AL,Ob
+a1: MOV rAX,Ov
+a2: MOV Ob,AL
+a3: MOV Ov,rAX
+a4: MOVS/B Xb,Yb
+a5: MOVS/W/D/Q Xv,Yv
+a6: CMPS/B Xb,Yb
+a7: CMPS/W/D Xv,Yv
+a8: TEST AL,Ib
+a9: TEST rAX,Iz
+aa: STOS/B Yb,AL
+ab: STOS/W/D/Q Yv,rAX
+ac: LODS/B AL,Xb
+ad: LODS/W/D/Q rAX,Xv
+ae: SCAS/B AL,Yb
+af: SCAS/W/D/Q rAX,Yv
+# 0xb0 - 0xbf
+b0: MOV AL/R8L,Ib
+b1: MOV CL/R9L,Ib
+b2: MOV DL/R10L,Ib
+b3: MOV BL/R11L,Ib
+b4: MOV AH/R12L,Ib
+b5: MOV CH/R13L,Ib
+b6: MOV DH/R14L,Ib
+b7: MOV BH/R15L,Ib
+b8: MOV rAX/r8,Iv
+b9: MOV rCX/r9,Iv
+ba: MOV rDX/r10,Iv
+bb: MOV rBX/r11,Iv
+bc: MOV rSP/r12,Iv
+bd: MOV rBP/r13,Iv
+be: MOV rSI/r14,Iv
+bf: MOV rDI/r15,Iv
+# 0xc0 - 0xcf
+c0: Grp2 Eb,Ib (1A)
+c1: Grp2 Ev,Ib (1A)
+c2: RETN Iw (f64)
+c3: RETN
+c4: LES Gz,Mp (i64)
+c5: LDS Gz,Mp (i64)
+c6: Grp11 Eb,Ib (1A)
+c7: Grp11 Ev,Iz (1A)
+c8: ENTER Iw,Ib
+c9: LEAVE (d64)
+ca: RETF Iw
+cb: RETF
+cc: INT3
+cd: INT Ib
+ce: INTO (i64)
+cf: IRET/D/Q
+# 0xd0 - 0xdf
+d0: Grp2 Eb,1 (1A)
+d1: Grp2 Ev,1 (1A)
+d2: Grp2 Eb,CL (1A)
+d3: Grp2 Ev,CL (1A)
+d4: AAM Ib (i64)
+d5: AAD Ib (i64)
+d6:
+d7: XLAT/XLATB
+d8: ESC
+d9: ESC
+da: ESC
+db: ESC
+dc: ESC
+dd: ESC
+de: ESC
+df: ESC
+# 0xe0 - 0xef
+e0: LOOPNE/LOOPNZ Jb (f64)
+e1: LOOPE/LOOPZ Jb (f64)
+e2: LOOP Jb (f64)
+e3: JrCXZ Jb (f64)
+e4: IN AL,Ib
+e5: IN eAX,Ib
+e6: OUT Ib,AL
+e7: OUT Ib,eAX
+e8: CALL Jz (f64)
+e9: JMP-near Jz (f64)
+ea: JMP-far Ap (i64)
+eb: JMP-short Jb (f64)
+ec: IN AL,DX
+ed: IN eAX,DX
+ee: OUT DX,AL
+ef: OUT DX,eAX
+# 0xf0 - 0xff
+f0: LOCK (Prefix)
+f1:
+f2: REPNE (Prefix)
+f3: REP/REPE (Prefix)
+f4: HLT
+f5: CMC
+f6: Grp3_1 Eb (1A)
+f7: Grp3_2 Ev (1A)
+f8: CLC
+f9: STC
+fa: CLI
+fb: STI
+fc: CLD
+fd: STD
+fe: Grp4 (1A)
+ff: Grp5 (1A)
+EndTable
+
+Table: 2-byte opcode # First Byte is 0x0f
+Referrer: 2-byte escape
+# 0x0f 0x00-0x0f
+00: Grp6 (1A)
+01: Grp7 (1A)
+02: LAR Gv,Ew
+03: LSL Gv,Ew
+04:
+05: SYSCALL (o64)
+06: CLTS
+07: SYSRET (o64)
+08: INVD
+09: WBINVD
+0a:
+0b: UD2 (1B)
+0c:
+0d: NOP Ev
+0e:
+0f:
+# 0x0f 0x10-0x1f
+10:
+11:
+12:
+13:
+14:
+15:
+16:
+17:
+18: Grp16 (1A)
+19:
+1a:
+1b:
+1c:
+1d:
+1e:
+1f: NOP Ev
+# 0x0f 0x20-0x2f
+20: MOV Rd,Cd
+21: MOV Rd,Dd
+22: MOV Cd,Rd
+23: MOV Dd,Rd
+24:
+25:
+26:
+27:
+28: movaps Vps,Wps | movapd Vpd,Wpd (66)
+29: movaps Wps,Vps | movapd Wpd,Vpd (66)
+2a:
+2b:
+2c:
+2d:
+2e:
+2f:
+# 0x0f 0x30-0x3f
+30: WRMSR
+31: RDTSC
+32: RDMSR
+33: RDPMC
+34: SYSENTER
+35: SYSEXIT
+36:
+37: GETSEC
+38: escape # 3-byte escape 1
+39:
+3a: escape # 3-byte escape 2
+3b:
+3c:
+3d:
+3e:
+3f:
+# 0x0f 0x40-0x4f
+40: CMOVO Gv,Ev
+41: CMOVNO Gv,Ev
+42: CMOVB/C/NAE Gv,Ev
+43: CMOVAE/NB/NC Gv,Ev
+44: CMOVE/Z Gv,Ev
+45: CMOVNE/NZ Gv,Ev
+46: CMOVBE/NA Gv,Ev
+47: CMOVA/NBE Gv,Ev
+48: CMOVS Gv,Ev
+49: CMOVNS Gv,Ev
+4a: CMOVP/PE Gv,Ev
+4b: CMOVNP/PO Gv,Ev
+4c: CMOVL/NGE Gv,Ev
+4d: CMOVNL/GE Gv,Ev
+4e: CMOVLE/NG Gv,Ev
+4f: CMOVNLE/G Gv,Ev
+# 0x0f 0x50-0x5f
+50:
+51:
+52:
+53:
+54:
+55:
+56:
+57:
+58:
+59:
+5a:
+5b:
+5c:
+5d:
+5e:
+5f:
+# 0x0f 0x60-0x6f
+60:
+61:
+62:
+63:
+64:
+65:
+66:
+67:
+68:
+69:
+6a:
+6b:
+6c:
+6d:
+6e:
+6f:
+# 0x0f 0x70-0x7f
+70:
+71: Grp12 (1A)
+72: Grp13 (1A)
+73: Grp14 (1A)
+74:
+75:
+76:
+77:
+78: VMREAD Ed/q,Gd/q
+79: VMWRITE Gd/q,Ed/q
+7a:
+7b:
+7c:
+7d:
+7e:
+7f:
+# 0x0f 0x80-0x8f
+80: JO Jz (f64)
+81: JNO Jz (f64)
+82: JB/JNAE/JC Jz (f64)
+83: JNB/JAE/JNC Jz (f64)
+84: JZ/JE Jz (f64)
+85: JNZ/JNE Jz (f64)
+86: JBE/JNA Jz (f64)
+87: JNBE/JA Jz (f64)
+88: JS Jz (f64)
+89: JNS Jz (f64)
+8a: JP/JPE Jz (f64)
+8b: JNP/JPO Jz (f64)
+8c: JL/JNGE Jz (f64)
+8d: JNL/JGE Jz (f64)
+8e: JLE/JNG Jz (f64)
+8f: JNLE/JG Jz (f64)
+# 0x0f 0x90-0x9f
+90: SETO Eb
+91: SETNO Eb
+92: SETB/C/NAE Eb
+93: SETAE/NB/NC Eb
+94: SETE/Z Eb
+95: SETNE/NZ Eb
+96: SETBE/NA Eb
+97: SETA/NBE Eb
+98: SETS Eb
+99: SETNS Eb
+9a: SETP/PE Eb
+9b: SETNP/PO Eb
+9c: SETL/NGE Eb
+9d: SETNL/GE Eb
+9e: SETLE/NG Eb
+9f: SETNLE/G Eb
+# 0x0f 0xa0-0xaf
+a0: PUSH FS (d64)
+a1: POP FS (d64)
+a2: CPUID
+a3: BT Ev,Gv
+a4: SHLD Ev,Gv,Ib
+a5: SHLD Ev,Gv,CL
+a6:
+a7: GrpRNG
+a8: PUSH GS (d64)
+a9: POP GS (d64)
+aa: RSM
+ab: BTS Ev,Gv
+ac: SHRD Ev,Gv,Ib
+ad: SHRD Ev,Gv,CL
+ae: Grp15 (1A),(1C)
+af: IMUL Gv,Ev
+# 0x0f 0xb0-0xbf
+b0: CMPXCHG Eb,Gb
+b1: CMPXCHG Ev,Gv
+b2: LSS Gv,Mp
+b3: BTR Ev,Gv
+b4: LFS Gv,Mp
+b5: LGS Gv,Mp
+b6: MOVZX Gv,Eb
+b7: MOVZX Gv,Ew
+b8: JMPE | POPCNT Gv,Ev (F3)
+b9: Grp10 (1A)
+ba: Grp8 Ev,Ib (1A)
+bb: BTC Ev,Gv
+bc: BSF Gv,Ev
+bd: BSR Gv,Ev
+be: MOVSX Gv,Eb
+bf: MOVSX Gv,Ew
+# 0x0f 0xc0-0xcf
+c0: XADD Eb,Gb
+c1: XADD Ev,Gv
+c2:
+c3: movnti Md/q,Gd/q
+c4:
+c5:
+c6:
+c7: Grp9 (1A)
+c8: BSWAP RAX/EAX/R8/R8D
+c9: BSWAP RCX/ECX/R9/R9D
+ca: BSWAP RDX/EDX/R10/R10D
+cb: BSWAP RBX/EBX/R11/R11D
+cc: BSWAP RSP/ESP/R12/R12D
+cd: BSWAP RBP/EBP/R13/R13D
+ce: BSWAP RSI/ESI/R14/R14D
+cf: BSWAP RDI/EDI/R15/R15D
+# 0x0f 0xd0-0xdf
+d0:
+d1:
+d2:
+d3:
+d4:
+d5:
+d6:
+d7:
+d8:
+d9:
+da:
+db:
+dc:
+dd:
+de:
+df:
+# 0x0f 0xe0-0xef
+e0:
+e1:
+e2:
+e3:
+e4:
+e5:
+e6:
+e7:
+e8:
+e9:
+ea:
+eb:
+ec:
+ed:
+ee:
+ef:
+# 0x0f 0xf0-0xff
+f0:
+f1:
+f2:
+f3:
+f4:
+f5:
+f6:
+f7:
+f8:
+f9:
+fa:
+fb:
+fc:
+fd:
+fe:
+ff:
+EndTable
+
+Table: 3-byte opcode 1
+Referrer: 3-byte escape 1
+80: INVEPT Gd/q,Mdq (66)
+81: INVPID Gd/q,Mdq (66)
+f0: MOVBE Gv,Mv | CRC32 Gd,Eb (F2)
+f1: MOVBE Mv,Gv | CRC32 Gd,Ev (F2)
+EndTable
+
+Table: 3-byte opcode 2
+Referrer: 3-byte escape 2
+# all opcodes are for SSE
+EndTable
+
+GrpTable: Grp1
+0: ADD
+1: OR
+2: ADC
+3: SBB
+4: AND
+5: SUB
+6: XOR
+7: CMP
+EndTable
+
+GrpTable: Grp1A
+0: POP
+EndTable
+
+GrpTable: Grp2
+0: ROL
+1: ROR
+2: RCL
+3: RCR
+4: SHL/SAL
+5: SHR
+6:
+7: SAR
+EndTable
+
+GrpTable: Grp3_1
+0: TEST Eb,Ib
+1:
+2: NOT Eb
+3: NEG Eb
+4: MUL AL,Eb
+5: IMUL AL,Eb
+6: DIV AL,Eb
+7: IDIV AL,Eb
+EndTable
+
+GrpTable: Grp3_2
+0: TEST Ev,Iz
+1:
+2: NOT Ev
+3: NEG Ev
+4: MUL rAX,Ev
+5: IMUL rAX,Ev
+6: DIV rAX,Ev
+7: IDIV rAX,Ev
+EndTable
+
+GrpTable: Grp4
+0: INC Eb
+1: DEC Eb
+EndTable
+
+GrpTable: Grp5
+0: INC Ev
+1: DEC Ev
+2: CALLN Ev (f64)
+3: CALLF Ep
+4: JMPN Ev (f64)
+5: JMPF Ep
+6: PUSH Ev (d64)
+7:
+EndTable
+
+GrpTable: Grp6
+0: SLDT Rv/Mw
+1: STR Rv/Mw
+2: LLDT Ew
+3: LTR Ew
+4: VERR Ew
+5: VERW Ew
+EndTable
+
+GrpTable: Grp7
+0: SGDT Ms | VMCALL (001),(11B) | VMLAUNCH (010),(11B) | VMRESUME (011),(11B) | VMXOFF (100),(11B)
+1: SIDT Ms | MONITOR (000),(11B) | MWAIT (001)
+2: LGDT Ms | XGETBV (000),(11B) | XSETBV (001),(11B)
+3: LIDT Ms
+4: SMSW Mw/Rv
+5:
+6: LMSW Ew
+7: INVLPG Mb | SWAPGS (o64),(000),(11B) | RDTSCP (001),(11B)
+EndTable
+
+GrpTable: Grp8
+4: BT
+5: BTS
+6: BTR
+7: BTC
+EndTable
+
+GrpTable: Grp9
+1: CMPXCHG8B/16B Mq/Mdq
+6: VMPTRLD Mq | VMCLEAR Mq (66) | VMXON Mq (F3)
+7: VMPTRST Mq
+EndTable
+
+GrpTable: Grp10
+EndTable
+
+GrpTable: Grp11
+0: MOV
+EndTable
+
+GrpTable: Grp12
+EndTable
+
+GrpTable: Grp13
+EndTable
+
+GrpTable: Grp14
+EndTable
+
+GrpTable: Grp15
+0: fxsave
+1: fxrstor
+2: ldmxcsr
+3: stmxcsr
+4: XSAVE
+5: XRSTOR | lfence (11B)
+6: mfence (11B)
+7: clflush | sfence (11B)
+EndTable
+
+GrpTable: Grp16
+0: prefetch NTA
+1: prefetch T0
+2: prefetch T1
+3: prefetch T2
+EndTable
+
+GrpTable: GrpRNG
+0: xstore-rng
+1: xcrypt-ecb
+2: xcrypt-cbc
+4: xcrypt-cfb
+5: xcrypt-ofb
+EndTable
diff --git a/arch/x86/tools/gen-insn-attr-x86.awk b/arch/x86/tools/gen-insn-attr-x86.awk
new file mode 100644
index 0000000..93b62c9
--- /dev/null
+++ b/arch/x86/tools/gen-insn-attr-x86.awk
@@ -0,0 +1,314 @@
+#!/bin/awk -f
+# gen-insn-attr-x86.awk: Instruction attribute table generator
+# Written by Masami Hiramatsu <***@redhat.com>
+#
+# Usage: awk -f gen-insn-attr-x86.awk x86-opcode-map.txt > inat-tables.c
+
+BEGIN {
+ print "/* x86 opcode map generated from x86-opcode-map.txt */"
+ print "/* Do not change this code. */"
+ ggid = 1
+ geid = 1
+
+ opnd_expr = "^[[:alpha:]]"
+ ext_expr = "^\\("
+ sep_expr = "^\\|$"
+ group_expr = "^Grp[[:alnum:]]+"
+
+ imm_expr = "^[IJAO][[:lower:]]"
+ imm_flag["Ib"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)"
+ imm_flag["Jb"] = "INAT_MAKE_IMM(INAT_IMM_BYTE)"
+ imm_flag["Iw"] = "INAT_MAKE_IMM(INAT_IMM_WORD)"
+ imm_flag["Id"] = "INAT_MAKE_IMM(INAT_IMM_DWORD)"
+ imm_flag["Iq"] = "INAT_MAKE_IMM(INAT_IMM_QWORD)"
+ imm_flag["Ap"] = "INAT_MAKE_IMM(INAT_IMM_PTR)"
+ imm_flag["Iz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)"
+ imm_flag["Jz"] = "INAT_MAKE_IMM(INAT_IMM_VWORD32)"
+ imm_flag["Iv"] = "INAT_MAKE_IMM(INAT_IMM_VWORD)"
+ imm_flag["Ob"] = "INAT_MOFFSET"
+ imm_flag["Ov"] = "INAT_MOFFSET"
+
+ modrm_expr = "^([CDEGMNPQRSUVW][[:lower:]]+|NTA|T[012])"
+ force64_expr = "\\([df]64\\)"
+ rex_expr = "^REX(\\.[XRWB]+)*"
+ fpu_expr = "^ESC" # TODO
+
+ lprefix1_expr = "\\(66\\)"
+ delete lptable1
+ lprefix2_expr = "\\(F2\\)"
+ delete lptable2
+ lprefix3_expr = "\\(F3\\)"
+ delete lptable3
+ max_lprefix = 4
+
+ prefix_expr = "\\(Prefix\\)"
+ prefix_num["Operand-Size"] = "INAT_PFX_OPNDSZ"
+ prefix_num["REPNE"] = "INAT_PFX_REPNE"
+ prefix_num["REP/REPE"] = "INAT_PFX_REPE"
+ prefix_num["LOCK"] = "INAT_PFX_LOCK"
+ prefix_num["SEG=CS"] = "INAT_PFX_CS"
+ prefix_num["SEG=DS"] = "INAT_PFX_DS"
+ prefix_num["SEG=ES"] = "INAT_PFX_ES"
+ prefix_num["SEG=FS"] = "INAT_PFX_FS"
+ prefix_num["SEG=GS"] = "INAT_PFX_GS"
+ prefix_num["SEG=SS"] = "INAT_PFX_SS"
+ prefix_num["Address-Size"] = "INAT_PFX_ADDRSZ"
+
+ delete table
+ delete etable
+ delete gtable
+ eid = -1
+ gid = -1
+}
+
+function semantic_error(msg) {
+ print "Semantic error at " NR ": " msg > "/dev/stderr"
+ exit 1
+}
+
+function debug(msg) {
+ print "DEBUG: " msg
+}
+
+function array_size(arr, i,c) {
+ c = 0
+ for (i in arr)
+ c++
+ return c
+}
+
+/^Table:/ {
+ print "/* " $0 " */"
+}
+
+/^Referrer:/ {
+ if (NF == 1) {
+ # primary opcode table
+ tname = "inat_primary_table"
+ eid = -1
+ } else {
+ # escape opcode table
+ ref = ""
+ for (i = 2; i <= NF; i++)
+ ref = ref $i
+ eid = escape[ref]
+ tname = sprintf("inat_escape_table_%d", eid)
+ }
+}
+
+/^GrpTable:/ {
+ print "/* " $0 " */"
+ if (!($2 in group))
+ semantic_error("No group: " $2 )
+ gid = group[$2]
+ tname = "inat_group_table_" gid
+}
+
+function print_table(tbl,name,fmt,n)
+{
+ print "const insn_attr_t " name " = {"
+ for (i = 0; i < n; i++) {
+ id = sprintf(fmt, i)
+ if (tbl[id])
+ print " [" id "] = " tbl[id] ","
+ }
+ print "};"
+}
+
+/^EndTable/ {
+ if (gid != -1) {
+ # print group tables
+ if (array_size(table) != 0) {
+ print_table(table, tname "[INAT_GROUP_TABLE_SIZE]",
+ "0x%x", 8)
+ gtable[gid,0] = tname
+ }
+ if (array_size(lptable1) != 0) {
+ print_table(lptable1, tname "_1[INAT_GROUP_TABLE_SIZE]",
+ "0x%x", 8)
+ gtable[gid,1] = tname "_1"
+ }
+ if (array_size(lptable2) != 0) {
+ print_table(lptable2, tname "_2[INAT_GROUP_TABLE_SIZE]",
+ "0x%x", 8)
+ gtable[gid,2] = tname "_2"
+ }
+ if (array_size(lptable3) != 0) {
+ print_table(lptable3, tname "_3[INAT_GROUP_TABLE_SIZE]",
+ "0x%x", 8)
+ gtable[gid,3] = tname "_3"
+ }
+ } else {
+ # print primary/escaped tables
+ if (array_size(table) != 0) {
+ print_table(table, tname "[INAT_OPCODE_TABLE_SIZE]",
+ "0x%02x", 256)
+ etable[eid,0] = tname
+ }
+ if (array_size(lptable1) != 0) {
+ print_table(lptable1,tname "_1[INAT_OPCODE_TABLE_SIZE]",
+ "0x%02x", 256)
+ etable[eid,1] = tname "_1"
+ }
+ if (array_size(lptable2) != 0) {
+ print_table(lptable2,tname "_2[INAT_OPCODE_TABLE_SIZE]",
+ "0x%02x", 256)
+ etable[eid,2] = tname "_2"
+ }
+ if (array_size(lptable3) != 0) {
+ print_table(lptable3,tname "_3[INAT_OPCODE_TABLE_SIZE]",
+ "0x%02x", 256)
+ etable[eid,3] = tname "_3"
+ }
+ }
+ print ""
+ delete table
+ delete lptable1
+ delete lptable2
+ delete lptable3
+ gid = -1
+ eid = -1
+}
+
+function add_flags(old,new) {
+ if (old && new)
+ return old " | " new
+ else if (old)
+ return old
+ else
+ return new
+}
+
+# convert operands to flags.
+function convert_operands(opnd, i,imm,mod)
+{
+ imm = null
+ mod = null
+ for (i in opnd) {
+ i = opnd[i]
+ if (match(i, imm_expr) == 1) {
+ if (!imm_flag[i])
+ semantic_error("Unknown imm opnd: " i)
+ if (imm) {
+ if (i != "Ib")
+ semantic_error("Second IMM error")
+ imm = add_flags(imm, "INAT_SCNDIMM")
+ } else
+ imm = imm_flag[i]
+ } else if (match(i, modrm_expr))
+ mod = "INAT_MODRM"
+ }
+ return add_flags(imm, mod)
+}
+
+/^[0-9a-f]+\:/ {
+ if (NR == 1)
+ next
+ # get index
+ idx = "0x" substr($1, 1, index($1,":") - 1)
+ if (idx in table)
+ semantic_error("Redefine " idx " in " tname)
+
+ # check if escaped opcode
+ if ("escape" == $2) {
+ if ($3 != "#")
+ semantic_error("No escaped name")
+ ref = ""
+ for (i = 4; i <= NF; i++)
+ ref = ref $i
+ if (ref in escape)
+ semantic_error("Redefine escape (" ref ")")
+ escape[ref] = geid
+ geid++
+ table[idx] = "INAT_MAKE_ESCAPE(" escape[ref] ")"
+ next
+ }
+
+ variant = null
+ # converts
+ i = 2
+ while (i <= NF) {
+ opcode = $(i++)
+ delete opnds
+ ext = null
+ flags = null
+ opnd = null
+ # parse one opcode
+ if (match($i, opnd_expr)) {
+ opnd = $i
+ split($(i++), opnds, ",")
+ flags = convert_operands(opnds)
+ }
+ if (match($i, ext_expr))
+ ext = $(i++)
+ if (match($i, sep_expr))
+ i++
+ else if (i < NF)
+ semantic_error($i " is not a separator")
+
+ # check if group opcode
+ if (match(opcode, group_expr)) {
+ if (!(opcode in group)) {
+ group[opcode] = ggid
+ ggid++
+ }
+ flags = add_flags(flags, "INAT_MAKE_GROUP(" group[opcode] ")")
+ }
+ # check force(or default) 64bit
+ if (match(ext, force64_expr))
+ flags = add_flags(flags, "INAT_FORCE64")
+
+ # check REX prefix
+ if (match(opcode, rex_expr))
+ flags = add_flags(flags, "INAT_REXPFX")
+
+ # check coprocessor escape : TODO
+ if (match(opcode, fpu_expr))
+ flags = add_flags(flags, "INAT_MODRM")
+
+ # check prefixes
+ if (match(ext, prefix_expr)) {
+ if (!prefix_num[opcode])
+ semantic_error("Unknown prefix: " opcode)
+ flags = add_flags(flags, "INAT_MAKE_PREFIX(" prefix_num[opcode] ")")
+ }
+ if (length(flags) == 0)
+ continue
+ # check if last prefix
+ if (match(ext, lprefix1_expr)) {
+ lptable1[idx] = add_flags(lptable1[idx],flags)
+ variant = "INAT_VARIANT"
+ } else if (match(ext, lprefix2_expr)) {
+ lptable2[idx] = add_flags(lptable2[idx],flags)
+ variant = "INAT_VARIANT"
+ } else if (match(ext, lprefix3_expr)) {
+ lptable3[idx] = add_flags(lptable3[idx],flags)
+ variant = "INAT_VARIANT"
+ } else {
+ table[idx] = add_flags(table[idx],flags)
+ }
+ }
+ if (variant)
+ table[idx] = add_flags(table[idx],variant)
+}
+
+END {
+ # print escape opcode map's array
+ print "/* Escape opcode map array */"
+ print "const insn_attr_t const *inat_escape_tables[INAT_ESC_MAX + 1]" \
+ "[INAT_LPREFIX_MAX + 1] = {"
+ for (i = 0; i < geid; i++)
+ for (j = 0; j < max_lprefix; j++)
+ if (etable[i,j])
+ print " ["i"]["j"] = "etable[i,j]","
+ print "};\n"
+ # print group opcode map's array
+ print "/* Group opcode map array */"
+ print "const insn_attr_t const *inat_group_tables[INAT_GRP_MAX + 1]"\
+ "[INAT_LPREFIX_MAX + 1] = {"
+ for (i = 0; i < ggid; i++)
+ for (j = 0; j < max_lprefix; j++)
+ if (gtable[i,j])
+ print " ["i"]["j"] = "gtable[i,j]","
+ print "};"
+}
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-20 14:42:05 UTC
Permalink
Post by Masami Hiramatsu
Add x86 instruction decoder to arch-specific libraries. This decoder
can decode x86 instructions used in the kernel into prefix, opcode, modrm,
sib, displacement and immediates. It can also report the length of
instructions.
This version introduces instruction attributes for decoding instructions.
The instruction attribute tables are generated from the opcode map file
(x86-opcode-map.txt) by the generator script (gen-insn-attr-x86.awk).
Currently, the opcode maps are based on the opcode maps in Intel(R) 64 and
IA-32 Architectures Software Developer's Manual Vol.2: Appendix A,
and consist of the two types of opcode tables below.
1-byte/2-byte/3-byte opcode tables, which have 256 elements each, are
written as below;
 Table: table-name
 Referrer: escaped-name
 opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
  (or)
 opcode: escape # escaped-name
 EndTable
Group opcode tables, which have 8 elements each, are written as below;
 GrpTable: GrpXXX
 reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
 EndTable
These opcode maps include a few SSE and FP opcodes (for setup), because
those opcodes are used in the kernel.
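As a rough illustration of the map format described above, the sketch below feeds a few sample map lines to awk and prints the table index derived from each opcode line, using the same idx computation as the generator script. The parse_map function name and the sample lines are assumptions made for this example; it is not the real generator and it ignores operands, extras and groups.

```shell
#!/bin/sh
# Sketch: read a tiny opcode map and show the index extracted per line.
# parse_map is a hypothetical helper; the map lines are illustrative samples.
AWK=${AWK:-awk}

parse_map() {
	"$AWK" '
	/^Table:/ { sub(/^Table:[ \t]*/, ""); print "table: " $0; next }
	/^EndTable/ { print "end of table"; next }
	/^[0-9a-f]+:/ {
		# Same index computation as the generator script.
		idx = "0x" substr($1, 1, index($1, ":") - 1)
		if ($2 == "escape")
			print idx " -> escape to another table"
		else
			print idx " -> " $2 " operands=" $3
	}'
}

parse_map <<'EOF'
Table: one byte opcode
00: ADD Eb,Gb
0f: escape # 2-byte escape
EndTable
EOF
```

Each opcode line becomes one string-keyed table entry, which is why the generator later has to rebuild exactly the same key string when it prints the table.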
arch/x86/lib/inat.c:29: error: ‘inat_primary_table’ undeclared (first use in this function)
arch/x86/lib/inat.c:29: error: (Each undeclared identifier is reported only once
arch/x86/lib/inat.c:29: error: for each function it appears in.)
Thanks for reporting!
Hmm, it seems that inat-tables.c is not correctly generated.
Could you tell me which awk you used and send the inat-tables.c?

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-20 14:46:10 UTC
Permalink
Post by Masami Hiramatsu
Thanks for reporting!
Hmm, it seems that inat-tables.c is not correctly generated.
Could you tell me which awk you used and send the inat-tables.c?
Thank you,
Sure:

$ awk -Wv
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan

compiled limits:
max NF 32767
sprintf buffer 2040

And I've sent you the contents of inat-tables.c in my other reply :)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Masami Hiramatsu
2009-08-20 15:03:40 UTC
Permalink
Post by Frederic Weisbecker
arch/x86/lib/inat.c:29: error: ‘inat_primary_table’ undeclared (first use in this function)
arch/x86/lib/inat.c:29: error: (Each undeclared identifier is reported only once
arch/x86/lib/inat.c:29: error: for each function it appears in.)
I've attached my config. I haven't such problem on a dual x86-64 box.
Actually I have the same problem in x86-64
/* x86 opcode map generated from x86-opcode-map.txt */
/* Do not change this code. */
/* Table: one byte opcode */
/* Escape opcode map array */
const insn_attr_t const *inat_escape_tables[INAT_ESC_MAX + 1][INAT_LPREFIX_MAX + 1] = {
};
/* Group opcode map array */
const insn_attr_t const *inat_group_tables[INAT_GRP_MAX + 1][INAT_LPREFIX_MAX + 1] = {
};
I guess there is a problem with the generation of this file.
Aah, you may use mawk on Ubuntu 9.04, right?
If so, unfortunately, mawk is still under development.

http://invisible-island.net/mawk/CHANGES
20090727
add check/fix to prevent gsub from recurring to modify on a substring
of the current line when the regular expression is anchored to the
beginning of the line; fixes gawk's anchgsub testcase.
add check for implicit concatenation mistaken for exponent; fixes
gawk's hex testcase.
add character-classes to built-in regular expressions.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Look, this means we can't use char-class expressions like
[:lower:] until this version...

And I've found another bug in mawk-1.3.3-20090728 (the latest one).
It almost works, but:

$ mawk 'BEGIN {printf("0x%x\n", 0)}'
0x1
$ gawk 'BEGIN {printf("0x%x\n", 0)}'
0x0

This bug skips an array element index 0x0 in inat-tables.c :(
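
To illustrate why a wrong printf result makes an entry disappear, here is a minimal sketch. It assumes the generator stores table entries under string keys such as "0x00" and rebuilds each key with sprintf when printing; emit_table and demo_table are hypothetical stand-ins for this example, not the generator's own code.

```shell
#!/bin/sh
# Sketch: table entries are stored under string keys and looked up again
# through sprintf. emit_table is a hypothetical stand-in for the generator.
AWK=${AWK:-awk}

emit_table() {
	"$AWK" '
	BEGIN {
		# Keys are strings, built the way the opcode rule builds idx.
		table["0x00"] = "INAT_MODRM"
		table["0x01"] = "INAT_MODRM"

		print "const insn_attr_t demo_table[4] = {"
		for (i = 0; i < 4; i++) {
			# The lookup key is re-created from the number i.
			id = sprintf("0x%02x", i)
			if (id in table)
				print "\t[" id "] = " table[id] ","
		}
		print "};"
	}'
}

emit_table
```

With an awk that formats 0 correctly, both entries are printed. If sprintf returns any other string for 0, the entry stored under "0x00" is never looked up, so it is missing from the generated table.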

So I recommend installing gawk instead of mawk until mawk
supports all POSIX awk features, since I don't think it is a
good idea to work around all those bugs, which depend on the
implementation (not the specification).


Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-20 15:25:51 UTC
Permalink
Post by Masami Hiramatsu
Aah, you may use mawk on Ubuntu 9.04, right?
If so, unfortunately, mawk is still under development.
http://invisible-island.net/mawk/CHANGES
Aargh...
Post by Masami Hiramatsu
20090727
add check/fix to prevent gsub from recurring to modify on a substring
of the current line when the regular expression is anchored to the
beginning of the line; fixes gawk's anchgsub testcase.
add check for implicit concatenation mistaken for exponent; fixes
gawk's hex testcase.
add character-classes to built-in regular expressions.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Look, this means we can't use char-class expressions like
[:lower:] until this version...
And I've found another bug in mawk-1.3.3-20090728 (the latest one).
It almost works, but:
$ mawk 'BEGIN {printf("0x%x\n", 0)}'
0x1
Ouch, indeed.
Post by Masami Hiramatsu
$ gawk 'BEGIN {printf("0x%x\n", 0)}'
0x0
This bug skips an array element index 0x0 in inat-tables.c :(
So I recommend installing gawk instead of mawk until mawk
supports all POSIX awk features, since I don't think it is a
good idea to work around all those bugs, which depend on the
implementation (not the specification).
Thank you,
Yeah, indeed. Maybe add a warning (or build error) in case the user uses
mawk?

Anyway that works fine now with gawk, thanks!
All your patches build well :-)
Masami Hiramatsu
2009-08-20 16:16:05 UTC
Permalink
Post by Frederic Weisbecker
Yeah, indeed. Maybe add a warning (or build error) in case the user uses
mawk?
Hmm, it is possible that mawk will fix those bugs and catch up soon,
so I don't think checking for mawk is a good idea.
(And since there will be other awk implementations, it wouldn't be fair.)

I think all I can do now is report the bugs to the
mawk and Ubuntu people. :-)
Post by Frederic Weisbecker
Anyway that works fine now with gawk, thanks!
All your patches build well :-)
Thank you for testing!
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-20 18:07:38 UTC
Permalink
Post by Masami Hiramatsu
Hmm, it is possible that mawk will fix those bugs and catch up soon,
so I don't think checking for mawk is a good idea.
(And since there will be other awk implementations, it wouldn't be fair.)
I think all I can do now is report the bugs to the
mawk and Ubuntu people. :-)
Yeah, but without your tip I wouldn't have been able to find the cause
for quite some time.
And the kernel wouldn't build anyway.

At least we should do something with this version of mawk.
Masami Hiramatsu
2009-08-20 19:01:25 UTC
Permalink
Post by Frederic Weisbecker
Yeah, but without your tip I wouldn't have been able to find the cause
for quite some time.
And the kernel wouldn't build anyway.
At least we should do something with this version of mawk.
Hm, indeed.
Maybe we can run an additional sanity-check script before using
awk, like this:

---
res=`echo a | $AWK '/[[:lower:]]+/{print "OK"}'`
[ "$res" != "OK" ] && exit 1

res=`$AWK 'BEGIN {printf("%x", 0)}'`
[ "$res" != "0" ] && exit 1

exit 0
---
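
The same two checks could also be wrapped in a function that reports which one failed. The following is only a sketch: check_awk and its messages are hypothetical names chosen for this example, not part of the patch.

```shell
#!/bin/sh
# Sketch of an awk sanity check; check_awk is a hypothetical helper name.
# Usage: check_awk /path/to/awk   (returns 0 if both checks pass)
check_awk() {
	awk_bin=$1

	# POSIX character classes must work in regular expressions.
	res=$(echo a | "$awk_bin" '/[[:lower:]]+/{print "OK"}')
	if [ "$res" != "OK" ]; then
		echo "$awk_bin: character classes like [[:lower:]] are not supported" >&2
		return 1
	fi

	# printf("%x", 0) must print 0.
	res=$("$awk_bin" 'BEGIN {printf("%x", 0)}')
	if [ "$res" != "0" ]; then
		echo "$awk_bin: printf(\"%x\", 0) printed '$res' instead of 0" >&2
		return 1
	fi

	return 0
}

if check_awk "${AWK:-awk}"; then
	echo "awk passed the sanity checks"
else
	echo "awk failed the sanity checks; try gawk"
fi
```

A build script could call this once before running the generator, with AWK set to the interpreter the build will actually use.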

Thanks,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-20 20:14:34 UTC
Permalink
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
On Thu, Aug 20, 2009 at 01:42:31AM +0200, Frederic Weisbecker wr=
On Thu, Aug 13, 2009 at 04:34:13PM -0400, Masami Hiramatsu wrot=
Add x86 instruction decoder to arch-specific libraries. This d=
ecoder
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
can decode x86 instructions used in kernel into prefix, opcode=
, modrm,
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
sib, displacement and immediates. This can also show the lengt=
h of
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
instructions.
This version introduces instruction attributes for decoding in=
structions.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
The instruction attribute tables are generated from the opcode=
map file
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
(x86-opcode-map.txt) by the generator script(gen-insn-attr-x86=
=2Eawk).
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Currently, the opcode maps are based on opcode maps in Intel(R=
) 64 and
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
IA-32 Architectures Software Developers Manual Vol.2: Appendix=
=2EA,
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
and consist of below two types of opcode tables.
1-byte/2-bytes/3-bytes opcodes, which has 256 elements, are
written as below;
Table: table-name
Referrer: escaped-name
opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)=
[,(extra2)...] [| 2nd-mnemonic ...]
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
(or)
opcode: escape # escaped-name
EndTable
Group opcodes, which has 8 elements, are written as below;
GrpTable: GrpXXX
reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2=
)...] [| 2nd-mnemonic ...]
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
EndTable
These opcode maps include a few SSE and FP opcodes (for setup)=
, because
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
those opcodes are used in the kernel.
arch/x86/lib/inat.c: In function =E2=80=98inat_get_opcode_attri=
arch/x86/lib/inat.c:29: erreur: =E2=80=98inat_primary_table=E2=80=
=99 undeclared (first use in this function)
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
arch/x86/lib/inat.c:29: erreur: (Each undeclared identifier is =
reported only once
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
arch/x86/lib/inat.c:29: erreur: for each function it appears in=
=2E)
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
I've attached my config. I haven't such problem on a dual x86-6=
4 box.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Actually I have the same problem in x86-64
/* x86 opcode map generated from x86-opcode-map.txt */
/* Do not change this code. */
/* Table: one byte opcode */
/* Escape opcode map array */
const insn_attr_t const *inat_escape_tables[INAT_ESC_MAX + 1][IN=
AT_LPREFIX_MAX + 1] =3D {
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
};
/* Group opcode map array */
const insn_attr_t const *inat_group_tables[INAT_GRP_MAX + 1][INA=
T_LPREFIX_MAX + 1] =3D {
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Frederic Weisbecker
Post by Masami Hiramatsu
};
I guess there is a problem with the generation of this file.
Aah, you may use mawk on Ubuntu 9.04, right?
If so, unfortunately, mawk is still under development.
http://invisible-island.net/mawk/CHANGES
Aargh...
Post by Masami Hiramatsu
20090727
add check/fix to prevent gsub from recurring to modify on a substring
of the current line when the regular expression is anchored to the
beginning of the line; fixes gawk's anchgsub testcase.
add check for implicit concatenation mistaken for exponent; fixes
gawk's hex testcase.
add character-classes to built-in regular expressions.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Look, this means we can't use char-class expressions like
[:lower:] until this version...
And I've found another bug in mawk-1.3.3-20090728 (the latest one).
it almost works, but;
$ mawk 'BEGIN {printf("0x%x\n", 0)}'
0x1
Ouch, indeed.
Post by Masami Hiramatsu
$ gawk 'BEGIN {printf("0x%x\n", 0)}'
0x0
This bug skips an array element index 0x0 in inat-tables.c :(
So, I recommend installing gawk instead of mawk until mawk
supports all POSIX awk features, since I don't think it is a
good idea to work around all those bugs, which depend on the
implementation (not the specification).
Thank you,
Yeah, indeed. Maybe add a warning (or build error) in case the user uses
mawk?
Hmm, it is possible that mawk will fix those bugs and catch up soon,
so, I think checking mawk is not a good idea.
(and since there will be other awk implementations, it's not fair.)
I think all I can do now is report the bugs to the
mawk and Ubuntu people. :-)
Yeah, but without your tip I wouldn't have been able to find the origin
for some time.
And the kernel couldn't build anyway.
At least we should do something with this version of mawk.
Hm, indeed.
Maybe, we can run additional sanity check script before using
awk, like this;
---
res=`echo a | $AWK '/[[:lower:]]+/{print "OK"}'`
[ "$res" != "OK" ] && exit 1
res=`$AWK 'BEGIN {printf("%x", 0)}'`
[ "$res" != "0" ] && exit 1
exit 0
---
Thanks,
Yeah, that looks good.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Masami Hiramatsu
2009-08-13 20:34:28 UTC
Ensure safety of inserting kprobes by checking whether the specified
address is at the first byte of an instruction on x86.
This is done by decoding the probed function from its head to the probe point.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

arch/x86/kernel/kprobes.c | 69 +++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index b5b1848..80d493f 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -48,6 +48,7 @@
#include <linux/preempt.h>
#include <linux/module.h>
#include <linux/kdebug.h>
+#include <linux/kallsyms.h>

#include <asm/cacheflush.h>
#include <asm/desc.h>
@@ -55,6 +56,7 @@
#include <asm/uaccess.h>
#include <asm/alternative.h>
#include <asm/debugreg.h>
+#include <asm/insn.h>

void jprobe_return_end(void);

@@ -245,6 +247,71 @@ retry:
}
}

+/* Recover the probed instruction at addr for further analysis. */
+static int recover_probed_instruction(kprobe_opcode_t *buf, unsigned long addr)
+{
+ struct kprobe *kp;
+ kp = get_kprobe((void *)addr);
+ if (!kp)
+ return -EINVAL;
+
+ /*
+ * Basically, kp->ainsn.insn holds a copy of the original instruction.
+ * However, a RIP-relative instruction cannot be single-stepped at a
+ * different place, so fix_riprel() tweaks the displacement of that
+ * instruction. In that case, we can't recover the original
+ * instruction from kp->ainsn.insn.
+ *
+ * On the other hand, kp->opcode has a copy of the first byte of
+ * the probed instruction, which is overwritten by int3. Since
+ * the instruction at kp->addr is not modified by kprobes except
+ * for the first byte, we can recover the original instruction
+ * from it and kp->opcode.
+ */
+ memcpy(buf, kp->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
+ buf[0] = kp->opcode;
+ return 0;
+}
+
+/* Dummy buffers for kallsyms_lookup */
+static char __dummy_buf[KSYM_NAME_LEN];
+
+/* Check if paddr is at an instruction boundary */
+static int __kprobes can_probe(unsigned long paddr)
+{
+ int ret;
+ unsigned long addr, offset = 0;
+ struct insn insn;
+ kprobe_opcode_t buf[MAX_INSN_SIZE];
+
+ if (!kallsyms_lookup(paddr, NULL, &offset, NULL, __dummy_buf))
+ return 0;
+
+ /* Decode instructions */
+ addr = paddr - offset;
+ while (addr < paddr) {
+ kernel_insn_init(&insn, (void *)addr);
+ insn_get_opcode(&insn);
+
+ /* Check if the instruction has been modified. */
+ if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
+ ret = recover_probed_instruction(buf, addr);
+ if (ret)
+ /*
+ * Another debugging subsystem might insert
+ * this breakpoint. In that case, we can't
+ * recover it.
+ */
+ return 0;
+ kernel_insn_init(&insn, buf);
+ }
+ insn_get_length(&insn);
+ addr += insn.length;
+ }
+
+ return (addr == paddr);
+}
+
/*
* Returns non-zero if opcode modifies the interrupt flag.
*/
@@ -360,6 +427,8 @@ static void __kprobes arch_copy_kprobe(struct kprobe *p)

int __kprobes arch_prepare_kprobe(struct kprobe *p)
{
+ if (!can_probe((unsigned long)p->addr))
+ return -EILSEQ;
/* insn: must be on special executable page on x86. */
p->ainsn.insn = get_insn_slot();
if (!p->ainsn.insn)
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-18 23:03:43 UTC
Ensure safety of inserting kprobes by checking whether the specified
address is at the first byte of an instruction on x86.
This is done by decoding the probed function from its head to the probe point.

---

arch/x86/kernel/kprobes.c | 69 +++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index b5b1848..80d493f 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -48,6 +48,7 @@
#include <linux/preempt.h>
#include <linux/module.h>
#include <linux/kdebug.h>
+#include <linux/kallsyms.h>
#include <asm/cacheflush.h>
#include <asm/desc.h>
@@ -55,6 +56,7 @@
#include <asm/uaccess.h>
#include <asm/alternative.h>
#include <asm/debugreg.h>
+#include <asm/insn.h>
void jprobe_return_end(void);
}
}
+/* Recover the probed instruction at addr for further analysis. */
+static int recover_probed_instruction(kprobe_opcode_t *buf, unsigned long addr)
+{
+ struct kprobe *kp;
+ kp = get_kprobe((void *)addr);
+ if (!kp)
+ return -EINVAL;
+
+ /*
+ * Basically, kp->ainsn.insn has an original instruction.
+ * However, RIP-relative instruction can not do single-stepping
+ * at different place, fix_riprel() tweaks the displacement of
+ * that instruction. In that case, we can't recover the instruction
+ * from the kp->ainsn.insn.
+ *
+ * On the other hand, kp->opcode has a copy of the first byte of
+ * the probed instruction, which is overwritten by int3. And
+ * the instruction at kp->addr is not modified by kprobes except
+ * for the first byte, we can recover the original instruction
+ * from it and kp->opcode.
+ */
+ memcpy(buf, kp->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
+ buf[0] = kp->opcode;
+ return 0;
+}
+
+/* Dummy buffers for kallsyms_lookup */
+static char __dummy_buf[KSYM_NAME_LEN];
+
+/* Check if paddr is at an instruction boundary */
+static int __kprobes can_probe(unsigned long paddr)
+{
+ int ret;
+ unsigned long addr, offset = 0;
+ struct insn insn;
+ kprobe_opcode_t buf[MAX_INSN_SIZE];
+
+ if (!kallsyms_lookup(paddr, NULL, &offset, NULL, __dummy_buf))
+ return 0;
+
+ /* Decode instructions */
+ addr = paddr - offset;
+ while (addr < paddr) {
+ kernel_insn_init(&insn, (void *)addr);
+ insn_get_opcode(&insn);
+
+ /* Check if the instruction has been modified. */
+ if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
+ ret = recover_probed_instruction(buf, addr);
I'm confused about the reason of this recovering. Is it to remove
kprobes behind the current setting one in the current function?

If such cleanup is needed for whatever reason, I wonder what happens
to the corresponding kprobe structure, why isn't it using the arch_disarm_
helper to patch back?

(Questions that may prove my solid misunderstanding of the kprobes code ;-)

Frederic.

Masami Hiramatsu
2009-08-18 23:17:39 UTC
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Ensure safety of inserting kprobes by checking whether the specified
address is at the first byte of an instruction on x86.
This is done by decoding the probed function from its head to the probe point.
---
arch/x86/kernel/kprobes.c | 69 +++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 69 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index b5b1848..80d493f 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -48,6 +48,7 @@
#include <linux/preempt.h>
#include <linux/module.h>
#include <linux/kdebug.h>
+#include <linux/kallsyms.h>
#include <asm/cacheflush.h>
#include <asm/desc.h>
@@ -55,6 +56,7 @@
#include <asm/uaccess.h>
#include <asm/alternative.h>
#include <asm/debugreg.h>
+#include <asm/insn.h>
void jprobe_return_end(void);
}
}
+/* Recover the probed instruction at addr for further analysis. */
+static int recover_probed_instruction(kprobe_opcode_t *buf, unsigned long addr)
+{
+ struct kprobe *kp;
+ kp = get_kprobe((void *)addr);
+ if (!kp)
+ return -EINVAL;
+
+ /*
+ * Basically, kp->ainsn.insn has an original instruction.
+ * However, RIP-relative instruction can not do single-stepping
+ * at different place, fix_riprel() tweaks the displacement of
+ * that instruction. In that case, we can't recover the instruction
+ * from the kp->ainsn.insn.
+ *
+ * On the other hand, kp->opcode has a copy of the first byte of
+ * the probed instruction, which is overwritten by int3. And
+ * the instruction at kp->addr is not modified by kprobes except
+ * for the first byte, we can recover the original instruction
+ * from it and kp->opcode.
+ */
+ memcpy(buf, kp->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
+ buf[0] = kp->opcode;
+ return 0;
+}
+
+/* Dummy buffers for kallsyms_lookup */
+static char __dummy_buf[KSYM_NAME_LEN];
+
+/* Check if paddr is at an instruction boundary */
+static int __kprobes can_probe(unsigned long paddr)
+{
+ int ret;
+ unsigned long addr, offset = 0;
+ struct insn insn;
+ kprobe_opcode_t buf[MAX_INSN_SIZE];
+
+ if (!kallsyms_lookup(paddr, NULL, &offset, NULL, __dummy_buf))
+ return 0;
+
+ /* Decode instructions */
+ addr = paddr - offset;
+ while (addr < paddr) {
+ kernel_insn_init(&insn, (void *)addr);
+ insn_get_opcode(&insn);
+
+ /* Check if the instruction has been modified. */
+ if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
+ ret = recover_probed_instruction(buf, addr);
I'm confused about the reason of this recovering. Is it to remove
kprobes behind the current setting one in the current function?
No, it recovers just an instruction which is probed by a kprobe,
because we need to know the first byte of this instruction for
decoding it.
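
The boundary check discussed here can be illustrated in user space. The following is only a minimal sketch under stated assumptions: toy_insn_length() and toy_can_probe() are hypothetical names, and the length table knows just a few one-byte opcodes; it is not the kernel's instruction decoder. It shows why decoding must start at the function head: an offset is accepted only if the walk lands on it exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy length decoder (hypothetical, NOT the kernel's insn decoder):
 * it only knows a handful of one-byte opcodes.
 * Returns the instruction length, or 0 if the opcode is unknown.
 */
static size_t toy_insn_length(const uint8_t *p)
{
	switch (p[0]) {
	case 0x90: return 1;	/* nop */
	case 0x55: return 1;	/* push %rbp */
	case 0xc3: return 1;	/* ret */
	case 0xeb: return 2;	/* jmp rel8 */
	case 0xb8: return 5;	/* mov $imm32, %eax */
	case 0xe8: return 5;	/* call rel32 */
	default:   return 0;
	}
}

/*
 * Walk the function body from its first byte and report whether
 * 'offset' lands exactly on an instruction boundary.
 */
static int toy_can_probe(const uint8_t *func, size_t func_len, size_t offset)
{
	size_t pos = 0;

	if (offset >= func_len)
		return 0;
	while (pos < offset) {
		size_t len = toy_insn_length(func + pos);

		if (len == 0 || pos + len > func_len)
			return 0;	/* cannot decode: refuse to probe */
		pos += len;
	}
	return pos == offset;
}
```

With the byte sequence 55 b8 01 00 00 00 90 c3, offsets 0, 1, 6 and 7 are boundaries, while offset 2 falls inside the immediate of the mov and is rejected. An int3 left by another probe would hide the real first byte, which is why the patch restores it in a buffer before asking for the length.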

Perhaps we'd better have a more generic interface (text_peek?)
for it, because another subsystem (e.g. kgdb) may want to insert int3...

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-18 23:43:42 UTC
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
+ while (addr < paddr) {
+ kernel_insn_init(&insn, (void *)addr);
+ insn_get_opcode(&insn);
+
+ /* Check if the instruction has been modified. */
+ if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
+ ret = recover_probed_instruction(buf, addr);
I'm confused about the reason of this recovering. Is it to remove
kprobes behind the current setting one in the current function?
No, it recovers just an instruction which is probed by a kprobe,
because we need to know the first byte of this instruction for
decoding it.
Perhaps we'd better to have more generic interface (text_peek?)
for it because another subsystem (e.g. kgdb) may want to insert int3...
Thank you,
Aah, I see now, it's to keep a sane check of the instructions
boundaries without int 3 artifacts in the middle.

But in that case, you should re-arm the breakpoint after your
check, right?

Or maybe you could do the check without repatching?
Maybe by doing a copy of insn.opcode.bytes and replacing bytes[0]
with what a random kprobe has stolen?

Masami Hiramatsu
2009-08-19 00:19:33 UTC
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
+ while (addr < paddr) {
+ kernel_insn_init(&insn, (void *)addr);
+ insn_get_opcode(&insn);
+
+ /* Check if the instruction has been modified. */
+ if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
+ ret = recover_probed_instruction(buf, addr);
I'm confused about the reason of this recovering. Is it to remove
kprobes behind the current setting one in the current function?
No, it recovers just an instruction which is probed by a kprobe,
because we need to know the first byte of this instruction for
decoding it.
Ah, sorry, that was not accurate. The function recovers the instruction
in the buffer (buf), not in the real kernel text. :)
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Perhaps we'd better to have more generic interface (text_peek?)
for it because another subsystem (e.g. kgdb) may want to insert int3...
Thank you,
Aah, I see now, it's to keep a sane check of the instructions
boundaries without int 3 artifacts in the middle.
But in that case, you should re-arm the breakpoint after your
check, right?
Or may be you could do the check without repatching?
Yes, it doesn't modify kernel text; it just recovers the original
instruction into a buffer from the kernel text and the backup byte.
Post by Frederic Weisbecker
May be by doing a copy of insn.opcode.bytes and replacing bytes[0]
with what a random kprobe has stolen?
Hm, no, this function is protected from other kprobes by kprobe_mutex.
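
The "recover into a buffer" idea can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: struct toy_probe, toy_recover_insn() and TOY_MAX_INSN_SIZE are hypothetical stand-ins for struct kprobe, recover_probed_instruction() and MAX_INSN_SIZE, and locking (kprobe_mutex in the real code) is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_INSN_SIZE 16	/* assumed buffer size for this sketch */
#define TOY_INT3          0xcc	/* x86 breakpoint opcode */

/* Hypothetical stand-in for the parts of struct kprobe used here. */
struct toy_probe {
	const uint8_t *addr;	/* probed location inside the text image */
	uint8_t opcode;		/* first byte saved before int3 was written */
};

/*
 * Rebuild the original instruction bytes in 'buf'.
 * The text is only read through a const pointer, so the armed
 * breakpoint stays in place. The caller must guarantee that
 * TOY_MAX_INSN_SIZE bytes are readable at p->addr.
 */
static void toy_recover_insn(uint8_t *buf, const struct toy_probe *p)
{
	memcpy(buf, p->addr, TOY_MAX_INSN_SIZE);
	buf[0] = p->opcode;	/* undo the int3 in the copy only */
}
```

After the call, buf holds the original instruction and can be handed to a decoder, while the text image still starts with 0xcc, so no re-arming step is needed.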

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-19 00:46:29 UTC
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
+ while (addr < paddr) {
+ kernel_insn_init(&insn, (void *)addr);
+ insn_get_opcode(&insn);
+
+ /* Check if the instruction has been modified. */
+ if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION) {
+ ret = recover_probed_instruction(buf, addr);
I'm confused about the reason of this recovering. Is it to remove
kprobes behind the current setting one in the current function?
No, it recovers just an instruction which is probed by a kprobe,
because we need to know the first byte of this instruction for
decoding it.
Ah, sorry, it was not accurate. the function recovers an instruction
on the buffer(buf), not on the real kernel text. :)
Ah ok. I'll just add a small comment about that then, and apply
it.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Perhaps we'd better to have more generic interface (text_peek?)
for it because another subsystem (e.g. kgdb) may want to insert int3...
Thank you,
Aah, I see now, it's to keep a sane check of the instructions
boundaries without int 3 artifacts in the middle.
But in that case, you should re-arm the breakpoint after your
check, right?
Or may be you could do the check without repatching?
Yes, it doesn't modify kernel text, just recover an original
instruction from kernel text and backup byte on a buffer.
Ok.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
May be by doing a copy of insn.opcode.bytes and replacing bytes[0]
with what a random kprobe has stolen?
Hm, no, this function is protected from other kprobes by kprobe_mutex.
Thank you,
Right, thanks!
Post by Masami Hiramatsu
--
Masami Hiramatsu
Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division
Masami Hiramatsu
2009-08-13 20:34:36 UTC
Clean up fix_riprel() in arch/x86/kernel/kprobes.c by using the x86 instruction
decoder.
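
For readers who want the arithmetic behind the displacement fixup spelled out, here is a small user-space sketch. toy_fix_disp() is a hypothetical helper, not part of the patch, and it returns an error where the kernel code uses BUG_ON(); addresses are assumed to be plain 64-bit values on a two's-complement machine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the displacement the copied instruction needs so that it
 * still refers to the same target as the original.
 *
 *   target = orig_addr + insn_len + old_disp
 *          = copy_addr + insn_len + new_disp
 *   so new_disp = orig_addr + old_disp - copy_addr.
 *
 * Returns 0 and stores the result in *new_disp on success, or -1 if
 * the result does not fit in a signed 32-bit displacement.
 */
static int toy_fix_disp(uint64_t orig_addr, uint64_t copy_addr,
			int32_t old_disp, int32_t *new_disp)
{
	int64_t d = (int64_t)(orig_addr - copy_addr) + (int64_t)old_disp;

	if (d < INT32_MIN || d > INT32_MAX)
		return -1;
	*new_disp = (int32_t)d;
	return 0;
}
```

The instruction length cancels out because the copy has the same length as the original. The range check mirrors the sanity check in fix_riprel(): the copy must sit within about 2 GB of the original target, otherwise the 32-bit field cannot express the new displacement.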

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

arch/x86/kernel/kprobes.c | 128 ++++++++-------------------------------------
1 files changed, 23 insertions(+), 105 deletions(-)

diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 80d493f..98f48d0 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -109,50 +109,6 @@ static const u32 twobyte_is_boostable[256 / 32] = {
/* ----------------------------------------------- */
/* 0 1 2 3 4 5 6 7 8 9 a b c d e f */
};
-static const u32 onebyte_has_modrm[256 / 32] = {
- /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */
- /* ----------------------------------------------- */
- W(0x00, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* 00 */
- W(0x10, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) , /* 10 */
- W(0x20, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) | /* 20 */
- W(0x30, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0) , /* 30 */
- W(0x40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 40 */
- W(0x50, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 50 */
- W(0x60, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0) | /* 60 */
- W(0x70, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 70 */
- W(0x80, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 80 */
- W(0x90, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 90 */
- W(0xa0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* a0 */
- W(0xb0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* b0 */
- W(0xc0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0) | /* c0 */
- W(0xd0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1) , /* d0 */
- W(0xe0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* e0 */
- W(0xf0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1) /* f0 */
- /* ----------------------------------------------- */
- /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */
-};
-static const u32 twobyte_has_modrm[256 / 32] = {
- /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */
- /* ----------------------------------------------- */
- W(0x00, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1) | /* 0f */
- W(0x10, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0) , /* 1f */
- W(0x20, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1) | /* 2f */
- W(0x30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) , /* 3f */
- W(0x40, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 4f */
- W(0x50, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 5f */
- W(0x60, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* 6f */
- W(0x70, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1) , /* 7f */
- W(0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) | /* 8f */
- W(0x90, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* 9f */
- W(0xa0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1) | /* af */
- W(0xb0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1) , /* bf */
- W(0xc0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0) | /* cf */
- W(0xd0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) , /* df */
- W(0xe0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1) | /* ef */
- W(0xf0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0) /* ff */
- /* ----------------------------------------------- */
- /* 0 1 2 3 4 5 6 7 8 9 a b c d e f */
-};
#undef W

struct kretprobe_blackpoint kretprobe_blacklist[] = {
@@ -345,68 +301,30 @@ static int __kprobes is_IF_modifier(kprobe_opcode_t *insn)
static void __kprobes fix_riprel(struct kprobe *p)
{
#ifdef CONFIG_X86_64
- u8 *insn = p->ainsn.insn;
- s64 disp;
- int need_modrm;
-
- /* Skip legacy instruction prefixes. */
- while (1) {
- switch (*insn) {
- case 0x66:
- case 0x67:
- case 0x2e:
- case 0x3e:
- case 0x26:
- case 0x64:
- case 0x65:
- case 0x36:
- case 0xf0:
- case 0xf3:
- case 0xf2:
- ++insn;
- continue;
- }
- break;
- }
+ struct insn insn;
+ kernel_insn_init(&insn, p->ainsn.insn);

- /* Skip REX instruction prefix. */
- if (is_REX_prefix(insn))
- ++insn;
-
- if (*insn == 0x0f) {
- /* Two-byte opcode. */
- ++insn;
- need_modrm = test_bit(*insn,
- (unsigned long *)twobyte_has_modrm);
- } else
- /* One-byte opcode. */
- need_modrm = test_bit(*insn,
- (unsigned long *)onebyte_has_modrm);
-
- if (need_modrm) {
- u8 modrm = *++insn;
- if ((modrm & 0xc7) == 0x05) {
- /* %rip+disp32 addressing mode */
- /* Displacement follows ModRM byte. */
- ++insn;
- /*
- * The copied instruction uses the %rip-relative
- * addressing mode. Adjust the displacement for the
- * difference between the original location of this
- * instruction and the location of the copy that will
- * actually be run. The tricky bit here is making sure
- * that the sign extension happens correctly in this
- * calculation, since we need a signed 32-bit result to
- * be sign-extended to 64 bits when it's added to the
- * %rip value and yield the same 64-bit result that the
- * sign-extension of the original signed 32-bit
- * displacement would have given.
- */
- disp = (u8 *) p->addr + *((s32 *) insn) -
- (u8 *) p->ainsn.insn;
- BUG_ON((s64) (s32) disp != disp); /* Sanity check. */
- *(s32 *)insn = (s32) disp;
- }
+ if (insn_rip_relative(&insn)) {
+ s64 newdisp;
+ u8 *disp;
+ insn_get_displacement(&insn);
+ /*
+ * The copied instruction uses the %rip-relative addressing
+ * mode. Adjust the displacement for the difference between
+ * the original location of this instruction and the location
+ * of the copy that will actually be run. The tricky bit here
+ * is making sure that the sign extension happens correctly in
+ * this calculation, since we need a signed 32-bit result to
+ * be sign-extended to 64 bits when it's added to the %rip
+ * value and yield the same 64-bit result that the sign-
+ * extension of the original signed 32-bit displacement would
+ * have given.
+ */
+ newdisp = (u8 *) p->addr + (s64) insn.displacement.value -
+ (u8 *) p->ainsn.insn;
+ BUG_ON((s64) (s32) newdisp != newdisp); /* Sanity check. */
+ disp = (u8 *) p->ainsn.insn + insn_offset_displacement(&insn);
+ *(s32 *) disp = (s32) newdisp;
}
#endif
}
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-13 20:34:44 UTC
Add the following APIs for accessing registers and stack entries from pt_regs.
These APIs are required by the kprobes-based event tracer on ftrace.
Some other debugging tools might be able to use them too.

- regs_query_register_offset(const char *name)
Query the offset of "name" register.

- regs_query_register_name(unsigned int offset)
Query the name of register by its offset.

- regs_get_register(struct pt_regs *regs, unsigned int offset)
Get the value of a register by its offset.

- regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
Check whether the address is in the kernel stack.

- regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n)
Get Nth entry of the kernel stack. (N >= 0)

- regs_get_argument_nth(struct pt_regs *regs, unsigned int n)
Get Nth argument at function call. (N >= 0)
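
The lookup scheme behind these APIs can be tried out in user space. What follows is a rough sketch, assuming a shortened register frame: struct toy_regs, the toy_* helpers and TOY_THREAD_SIZE are hypothetical and only mimic the shape of the real interfaces. Note that offsets are byte offsets, so the read goes through a char pointer rather than indexing an unsigned long array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened register frame; the real struct pt_regs is larger. */
struct toy_regs {
	unsigned long bx;
	unsigned long cx;
	unsigned long ip;
	unsigned long sp;
};

struct toy_reg_offset {
	const char *name;
	int offset;
};

#define TOY_REG(r) { .name = #r, .offset = (int)offsetof(struct toy_regs, r) }

static const struct toy_reg_offset toy_table[] = {
	TOY_REG(bx), TOY_REG(cx), TOY_REG(ip), TOY_REG(sp),
	{ .name = NULL, .offset = 0 },	/* end marker */
};

#define TOY_MAX_REG_OFFSET offsetof(struct toy_regs, sp)
#define TOY_THREAD_SIZE 8192UL	/* assumed stack size, must be a power of two */

/* Name -> byte offset, or -1 if the name is unknown. */
static int toy_query_offset(const char *name)
{
	const struct toy_reg_offset *r;

	for (r = toy_table; r->name != NULL; r++)
		if (strcmp(r->name, name) == 0)
			return r->offset;
	return -1;
}

/* Read a register by byte offset; returns 0 for an out-of-range offset. */
static unsigned long toy_get_register(const struct toy_regs *regs,
				      unsigned int offset)
{
	if (offset > TOY_MAX_REG_OFFSET)
		return 0;
	return *(const unsigned long *)((const char *)regs + offset);
}

/* True if addr lies in the same TOY_THREAD_SIZE-aligned block as regs->sp. */
static int toy_within_stack(const struct toy_regs *regs, unsigned long addr)
{
	return (addr & ~(TOY_THREAD_SIZE - 1)) ==
	       (regs->sp & ~(TOY_THREAD_SIZE - 1));
}
```

A caller would typically resolve a name once with toy_query_offset("cx") and then pass the returned offset to toy_get_register() on every hit, which keeps string comparison out of the hot path. The sketch only guards the upper bound of the offset; callers are assumed to pass offsets that came from the table.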


Signed-off-by: Masami Hiramatsu <***@redhat.com>
Reviewed-by: Frederic Weisbecker <***@gmail.com>
Cc: linux-***@vger.kernel.org
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

arch/x86/include/asm/ptrace.h | 62 +++++++++++++++++++++++
arch/x86/kernel/ptrace.c | 112 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 174 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index 0f0d908..a3d49dd 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -7,6 +7,7 @@

#ifdef __KERNEL__
#include <asm/segment.h>
+#include <asm/page_types.h>
#endif

#ifndef __ASSEMBLY__
@@ -216,6 +217,67 @@ static inline unsigned long user_stack_pointer(struct pt_regs *regs)
return regs->sp;
}

+/* Query offset/name of register from its name/offset */
+extern int regs_query_register_offset(const char *name);
+extern const char *regs_query_register_name(unsigned int offset);
+#define MAX_REG_OFFSET (offsetof(struct pt_regs, ss))
+
+/**
+ * regs_get_register() - get register value from its offset
+ * @regs: pt_regs from which register value is gotten.
+ * @offset: offset number of the register.
+ *
+ * regs_get_register returns the value of a register whose offset from @regs
+ * is @offset. The @offset is the offset of the register in struct pt_regs.
+ * If @offset is bigger than MAX_REG_OFFSET, this returns 0.
+ */
+static inline unsigned long regs_get_register(struct pt_regs *regs,
+ unsigned int offset)
+{
+ if (unlikely(offset > MAX_REG_OFFSET))
+ return 0;
+ return *(unsigned long *)((unsigned long)regs + offset);
+}
+
+/**
+ * regs_within_kernel_stack() - check the address in the stack
+ * @regs: pt_regs which contains kernel stack pointer.
+ * @addr: address which is checked.
+ *
+ * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
+ * If @addr is within the kernel stack, it returns true. If not, returns false.
+ */
+static inline int regs_within_kernel_stack(struct pt_regs *regs,
+ unsigned long addr)
+{
+ return ((addr & ~(THREAD_SIZE - 1)) ==
+ (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
+}
+
+/**
+ * regs_get_kernel_stack_nth() - get Nth entry of the stack
+ * @regs: pt_regs which contains kernel stack pointer.
+ * @n: stack entry number.
+ *
+ * regs_get_kernel_stack_nth() returns @n th entry of the kernel stack which
+ * is specified by @regs. If the @n th entry is NOT in the kernel stack,
+ * this returns 0.
+ */
+static inline unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
+ unsigned int n)
+{
+ unsigned long *addr = (unsigned long *)kernel_stack_pointer(regs);
+ addr += n;
+ if (regs_within_kernel_stack(regs, (unsigned long)addr))
+ return *addr;
+ else
+ return 0;
+}
+
+/* Get Nth argument at function call */
+extern unsigned long regs_get_argument_nth(struct pt_regs *regs,
+ unsigned int n);
+
/*
* These are defined as per linux/ptrace.h, which see.
*/
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 916082d..d270530 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -52,6 +52,118 @@ enum x86_regset {
REGSET_IOPERM32,
};

+struct pt_regs_offset {
+ const char *name;
+ int offset;
+};
+
+#define REG_OFFSET_NAME(r) {.name = #r, .offset = offsetof(struct pt_regs, r)}
+#define REG_OFFSET_END {.name = NULL, .offset = 0}
+
+static const struct pt_regs_offset regoffset_table[] = {
+#ifdef CONFIG_X86_64
+ REG_OFFSET_NAME(r15),
+ REG_OFFSET_NAME(r14),
+ REG_OFFSET_NAME(r13),
+ REG_OFFSET_NAME(r12),
+ REG_OFFSET_NAME(r11),
+ REG_OFFSET_NAME(r10),
+ REG_OFFSET_NAME(r9),
+ REG_OFFSET_NAME(r8),
+#endif
+ REG_OFFSET_NAME(bx),
+ REG_OFFSET_NAME(cx),
+ REG_OFFSET_NAME(dx),
+ REG_OFFSET_NAME(si),
+ REG_OFFSET_NAME(di),
+ REG_OFFSET_NAME(bp),
+ REG_OFFSET_NAME(ax),
+#ifdef CONFIG_X86_32
+ REG_OFFSET_NAME(ds),
+ REG_OFFSET_NAME(es),
+ REG_OFFSET_NAME(fs),
+ REG_OFFSET_NAME(gs),
+#endif
+ REG_OFFSET_NAME(orig_ax),
+ REG_OFFSET_NAME(ip),
+ REG_OFFSET_NAME(cs),
+ REG_OFFSET_NAME(flags),
+ REG_OFFSET_NAME(sp),
+ REG_OFFSET_NAME(ss),
+ REG_OFFSET_END,
+};
+
+/**
+ * regs_query_register_offset() - query register offset from its name
+ * @name: the name of a register
+ *
+ * regs_query_register_offset() returns the offset of a register in struct
+ * pt_regs from its name. If the name is invalid, this returns -EINVAL;
+ */
+int regs_query_register_offset(const char *name)
+{
+ const struct pt_regs_offset *roff;
+ for (roff = regoffset_table; roff->name != NULL; roff++)
+ if (!strcmp(roff->name, name))
+ return roff->offset;
+ return -EINVAL;
+}
+
+/**
+ * regs_query_register_name() - query register name from its offset
+ * @offset: the offset of a register in struct pt_regs.
+ *
+ * regs_query_register_name() returns the name of a register from its
+ * offset in struct pt_regs. If the @offset is invalid, this returns NULL;
+ */
+const char *regs_query_register_name(unsigned int offset)
+{
+ const struct pt_regs_offset *roff;
+ for (roff = regoffset_table; roff->name != NULL; roff++)
+ if (roff->offset == offset)
+ return roff->name;
+ return NULL;
+}
+
+static const int arg_offs_table[] = {
+#ifdef CONFIG_X86_32
+ [0] = offsetof(struct pt_regs, ax),
+ [1] = offsetof(struct pt_regs, dx),
+ [2] = offsetof(struct pt_regs, cx)
+#else /* CONFIG_X86_64 */
+ [0] = offsetof(struct pt_regs, di),
+ [1] = offsetof(struct pt_regs, si),
+ [2] = offsetof(struct pt_regs, dx),
+ [3] = offsetof(struct pt_regs, cx),
+ [4] = offsetof(struct pt_regs, r8),
+ [5] = offsetof(struct pt_regs, r9)
+#endif
+};
+
+/**
+ * regs_get_argument_nth() - get Nth argument at function call
+ * @regs: pt_regs which contains registers at function entry.
+ * @n: argument number.
+ *
+ * regs_get_argument_nth() returns @n th argument of a function call.
+ * Since usually the kernel stack will be changed right after function entry,
+ * you must use this at function entry. If the @n th entry is NOT in the
+ * kernel stack or pt_regs, this returns 0.
+ */
+unsigned long regs_get_argument_nth(struct pt_regs *regs, unsigned int n)
+{
+ if (n < ARRAY_SIZE(arg_offs_table))
+ return regs_get_register(regs, arg_offs_table[n]);
+ else {
+ /*
+ * The typical case: arg n is on the stack.
+ * (Note: stack[0] = return address, so skip it)
+ */
+ n -= ARRAY_SIZE(arg_offs_table);
+ return regs_get_kernel_stack_nth(regs, 1 + n);
+ }
+}
+
/*
* does not yet catch signals sent when the child dies.
* in exit.c or in signal.c.
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-13 20:35:18 UTC
Permalink
Support up to 128 arguments for each kprobes event.
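
The change relies on a C99 flexible array member whose length is chosen at allocation time. A minimal user-space sketch of that pattern follows, assuming calloc() in place of kzalloc(). The names fetch_sketch, probe_sketch, SIZEOF_PROBE_SKETCH and alloc_probe_sketch are illustrative and are not part of the patch; unlike the patch, the sketch also stores nr_args at allocation time.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct fetch_func; illustrative only. */
struct fetch_sketch {
	unsigned long (*func)(void *data);
	void *data;
};

/* The variable-length array must be the last member of the structure. */
struct probe_sketch {
	unsigned int nr_args;
	struct fetch_sketch args[];
};

/* Header size plus room for n trailing array entries. */
#define SIZEOF_PROBE_SKETCH(n) \
	(offsetof(struct probe_sketch, args) + \
	 sizeof(struct fetch_sketch) * (n))

/* Allocate a zeroed probe with space for nargs fetch entries. */
static struct probe_sketch *alloc_probe_sketch(unsigned int nargs)
{
	struct probe_sketch *p = calloc(1, SIZEOF_PROBE_SKETCH(nargs));

	if (p)
		p->nr_args = nargs;
	return p;
}
```

Because the array is sized per probe, a probe with two arguments no longer pays for the maximum, which is what makes raising the limit to 128 cheap.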

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

Documentation/trace/kprobetrace.txt | 2 +-
kernel/trace/trace_kprobe.c | 21 +++++++++++++--------
2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
index efff6eb..c9c09b4 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -32,7 +32,7 @@ Synopsis of kprobe_events
SYMBOL[+offs|-offs] : Symbol+offset where the probe is inserted.
MEMADDR : Address where the probe is inserted.

- FETCHARGS : Arguments.
+ FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
sN : Fetch Nth entry of stack (N >= 0)
sa : Fetch stack address.
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index d92877a..4704e40 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -32,7 +32,7 @@
#include "trace.h"
#include "trace_output.h"

-#define TRACE_KPROBE_ARGS 6
+#define MAX_TRACE_ARGS 128
#define MAX_ARGSTR_LEN 63

/* currently, trace_kprobe only supports X86. */
@@ -184,11 +184,15 @@ struct trace_probe {
struct kretprobe rp;
};
const char *symbol; /* symbol name */
- unsigned int nr_args;
- struct fetch_func args[TRACE_KPROBE_ARGS];
struct ftrace_event_call call;
+ unsigned int nr_args;
+ struct fetch_func args[];
};

+#define SIZEOF_TRACE_PROBE(n) \
+ (offsetof(struct trace_probe, args) + \
+ (sizeof(struct fetch_func) * (n)))
+
static int kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs);
static int kretprobe_trace_func(struct kretprobe_instance *ri,
struct pt_regs *regs);
@@ -263,11 +267,11 @@ static DEFINE_MUTEX(probe_lock);
static LIST_HEAD(probe_list);

static struct trace_probe *alloc_trace_probe(const char *symbol,
- const char *event)
+ const char *event, int nargs)
{
struct trace_probe *tp;

- tp = kzalloc(sizeof(struct trace_probe), GFP_KERNEL);
+ tp = kzalloc(SIZEOF_TRACE_PROBE(nargs), GFP_KERNEL);
if (!tp)
return ERR_PTR(-ENOMEM);

@@ -573,9 +577,10 @@ static int create_trace_probe(int argc, char **argv)
if (offset && is_return)
return -EINVAL;
}
+ argc -= 2; argv += 2;

/* setup a probe */
- tp = alloc_trace_probe(symbol, event);
+ tp = alloc_trace_probe(symbol, event, argc);
if (IS_ERR(tp))
return PTR_ERR(tp);

@@ -594,8 +599,8 @@ static int create_trace_probe(int argc, char **argv)
kp->addr = addr;

/* parse arguments */
- argc -= 2; argv += 2; ret = 0;
- for (i = 0; i < argc && i < TRACE_KPROBE_ARGS; i++) {
+ ret = 0;
+ for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
if (strlen(argv[i]) > MAX_ARGSTR_LEN) {
pr_info("Argument%d(%s) is too long.\n", i, argv[i]);
ret = -ENOSPC;
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-13 20:35:11 UTC
Permalink
Add kprobes-based event tracer on ftrace.

This tracer is similar to the events tracer, which is based on the Tracepoint
infrastructure. Instead of Tracepoints, this tracer is based on kprobes
(kprobe and kretprobe). It can probe anywhere kprobes can probe (that is,
anywhere in a function body except in __kprobes functions).

Like the events tracer, this tracer does not need to be activated via
current_tracer. Instead, just set probe points via
/sys/kernel/debug/tracing/kprobe_events. You can also set a filter on each
probe event via /sys/kernel/debug/tracing/events/kprobes/<EVENT>/filter.
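
As an illustration of the definition lines written to kprobe_events, here is a small user-space sketch that assembles one line of the form "p[:EVENT] SYMBOL [FETCHARGS]" or "r[:EVENT] SYMBOL [FETCHARGS]". The helper name format_probe_def and its return convention are assumptions made for this example; it only builds the string and does not touch debugfs.

```c
#include <stdio.h>

/*
 * Build one kprobe_events definition line into buf.
 * event and fetchargs may be NULL; symbol must not be.
 * Returns 0 on success, -1 if the line does not fit in buf.
 */
static int format_probe_def(char *buf, size_t size, int is_return,
			    const char *event, const char *symbol,
			    const char *fetchargs)
{
	int len = snprintf(buf, size, "%c%s%s %s%s%s",
			   is_return ? 'r' : 'p',
			   event ? ":" : "", event ? event : "",
			   symbol,
			   fetchargs ? " " : "", fetchargs ? fetchargs : "");

	return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

The resulting line would then be written (or appended) to /sys/kernel/debug/tracing/kprobe_events, in the same way as the echo commands in the documentation.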

This tracer supports the following probe arguments for each probe.

%REG : Fetch register REG
sN : Fetch Nth entry of stack (N >= 0)
sa : Fetch stack address.
@ADDR : Fetch memory at ADDR (ADDR should be in kernel)
@SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol)
aN : Fetch function argument. (N >= 0)
rv : Fetch return value.
ra : Fetch return address.
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.

See Documentation/trace/kprobetrace.txt for details.
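
The SYM[+|-offs] forms above need the symbol name separated from its signed offset. A rough user-space sketch of that split is shown below, assuming strtol() in place of the kernel's strict_strtol(). The function name split_symbol_offset_sketch is illustrative. Unlike the patch, which looks for '+' first and then '-', the sketch stops at the first sign character of either kind.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "SYM+offs" or "SYM-offs" in place: terminate the symbol name at
 * the sign and store the signed offset. Returns 0 or -EINVAL.
 */
static int split_symbol_offset_sketch(char *symbol, long *offset)
{
	char *sign, *end;
	long val;

	*offset = 0;
	sign = strpbrk(symbol, "+-");
	if (!sign)
		return 0;	/* plain symbol, offset stays 0 */

	errno = 0;
	val = strtol(sign + 1, &end, 0);	/* base 0: decimal, hex or octal */
	if (errno || end == sign + 1 || *end != '\0')
		return -EINVAL;

	*offset = (*sign == '-') ? -val : val;
	*sign = '\0';
	return 0;
}
```

On failure the input string is left unmodified, so the caller can still report the original argument in an error message.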

Changes from v13:
- Support 'sa' for stack address.
- Use call->data instead of container_of() macro.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

Documentation/trace/kprobetrace.txt | 139 ++++
kernel/trace/Kconfig | 12
kernel/trace/Makefile | 1
kernel/trace/trace.h | 29 +
kernel/trace/trace_event_types.h | 18 +
kernel/trace/trace_kprobe.c | 1205 +++++++++++++++++++++++++++++++++++
6 files changed, 1404 insertions(+), 0 deletions(-)
create mode 100644 Documentation/trace/kprobetrace.txt
create mode 100644 kernel/trace/trace_kprobe.c

diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
new file mode 100644
index 0000000..efff6eb
--- /dev/null
+++ b/Documentation/trace/kprobetrace.txt
@@ -0,0 +1,139 @@
+ Kprobe-based Event Tracer
+ =========================
+
+ Documentation is written by Masami Hiramatsu
+
+
+Overview
+--------
+This tracer is similar to the events tracer, which is based on the Tracepoint
+infrastructure. Instead of Tracepoints, this tracer is based on kprobes (kprobe
+and kretprobe). It can probe anywhere kprobes can probe (that is, anywhere in
+a function body except in __kprobes functions).
+
+Unlike the function tracer, this tracer can probe instructions inside of
+kernel functions. It allows you to check which instruction has been executed.
+
+Unlike the Tracepoint based events tracer, this tracer can add and remove
+probe points on the fly.
+
+Like the events tracer, this tracer does not need to be activated via
+current_tracer. Instead, just set probe points via
+/sys/kernel/debug/tracing/kprobe_events. You can also set a filter on each
+probe event via /sys/kernel/debug/tracing/events/kprobes/<EVENT>/filter.
+
+
+Synopsis of kprobe_events
+-------------------------
+ p[:EVENT] SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS] : Set a probe
+ r[:EVENT] SYMBOL[+0] [FETCHARGS] : Set a return probe
+
+ EVENT : Event name.
+ SYMBOL[+offs|-offs] : Symbol+offset where the probe is inserted.
+ MEMADDR : Address where the probe is inserted.
+
+ FETCHARGS : Arguments.
+ %REG : Fetch register REG
+ sN : Fetch Nth entry of stack (N >= 0)
+ sa : Fetch stack address.
+ @ADDR : Fetch memory at ADDR (ADDR should be in kernel)
+ @SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol)
+ aN : Fetch function argument. (N >= 0)(*)
+ rv : Fetch return value.(**)
+ ra : Fetch return address.(**)
+ +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(***)
+
+ (*) aN may not be correct for asmlinkage functions or in the middle of a
+ function body.
+ (**) only for return probe.
+ (***) this is useful for fetching a field of data structures.
+
+
+Per-Probe Event Filtering
+-------------------------
+ The per-probe event filtering feature allows you to set a different filter on
+each probe and to see which arguments will be shown in the trace buffer. If an
+event name is specified right after 'p:' or 'r:' in kprobe_events, the tracer
+adds an event under tracing/events/kprobes/<EVENT>. In that directory you can
+see 'id', 'enabled', 'format' and 'filter'.
+
+enabled:
+ You can enable/disable the probe by writing 1 or 0 to it.
+
+format:
+ It shows the format of this probe event. It also shows the aliases of the
+ arguments which you specified in kprobe_events.
+
+filter:
+ You can write filtering rules for this event. You can use both alias
+ names and field names to describe filters.
+
+
+Usage examples
+--------------
+To add a probe as a new event, write a new definition to kprobe_events
+as below.
+
+ echo p:myprobe do_sys_open a0 a1 a2 a3 > /sys/kernel/debug/tracing/kprobe_events
+
+ This sets a kprobe at the top of the do_sys_open() function and records its
+1st to 4th arguments as the "myprobe" event.
+
+ echo r:myretprobe do_sys_open rv ra >> /sys/kernel/debug/tracing/kprobe_events
+
+ This sets a kretprobe on the return point of the do_sys_open() function and
+records the return value and return address as the "myretprobe" event.
+ You can see the format of these events via
+/sys/kernel/debug/tracing/events/kprobes/<EVENT>/format.
+
+ cat /sys/kernel/debug/tracing/events/kprobes/myprobe/format
+name: myprobe
+ID: 23
+format:
+ field:unsigned short common_type; offset:0; size:2;
+ field:unsigned char common_flags; offset:2; size:1;
+ field:unsigned char common_preempt_count; offset:3; size:1;
+ field:int common_pid; offset:4; size:4;
+ field:int common_tgid; offset:8; size:4;
+
+ field: unsigned long ip; offset:16; size:8;
+ field: int nargs; offset:24; size:4;
+ field: unsigned long arg0; offset:32; size:8;
+ field: unsigned long arg1; offset:40; size:8;
+ field: unsigned long arg2; offset:48; size:8;
+ field: unsigned long arg3; offset:56; size:8;
+
+ alias: a0; original: arg0;
+ alias: a1; original: arg1;
+ alias: a2; original: arg2;
+ alias: a3; original: arg3;
+
+print fmt: "%lx: 0x%lx 0x%lx 0x%lx 0x%lx", ip, arg0, arg1, arg2, arg3
+
+
+ You can see that the event has 4 arguments and the alias expressions
+corresponding to them.
+
+ echo > /sys/kernel/debug/tracing/kprobe_events
+
+ This clears all probe points. You can see the traced information via
+/sys/kernel/debug/tracing/trace.
+
+ cat /sys/kernel/debug/tracing/trace
+# tracer: nop
+#
+# TASK-PID CPU# TIMESTAMP FUNCTION
+# | | | | |
+ <...>-1447 [001] 1038282.286875: do_sys_open+0x0/0xd6: 0x3 0x7fffd1ec4440 0x8000 0x0
+ <...>-1447 [001] 1038282.286878: sys_openat+0xc/0xe <- do_sys_open: 0xfffffffffffffffe 0xffffffff81367a3a
+ <...>-1447 [001] 1038282.286885: do_sys_open+0x0/0xd6: 0xffffff9c 0x40413c 0x8000 0x1b6
+ <...>-1447 [001] 1038282.286915: sys_open+0x1b/0x1d <- do_sys_open: 0x3 0xffffffff81367a3a
+ <...>-1447 [001] 1038282.286969: do_sys_open+0x0/0xd6: 0xffffff9c 0x4041c6 0x98800 0x10
+ <...>-1447 [001] 1038282.286976: sys_open+0x1b/0x1d <- do_sys_open: 0x3 0xffffffff81367a3a
+
+
+ Each line shows when the kernel hits a probe, and <- SYMBOL means the kernel
+returns from SYMBOL (e.g. "sys_open+0x1b/0x1d <- do_sys_open" means the kernel
+returns from do_sys_open to sys_open+0x1b).
+
+
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 860c712..60f3401 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -445,6 +445,18 @@ config BLK_DEV_IO_TRACE

If unsure, say N.

+config KPROBE_TRACER
+ depends on KPROBES
+ depends on X86
+ bool "Trace kprobes"
+ select TRACING
+ select GENERIC_TRACER
+ help
+ This tracer probes anywhere kprobes can probe, and records the
+ registers and memory specified by the user. It also allows you to
+ trace kprobe probe points as dynamically defined events, and it
+ provides a per-probe event filtering interface.
+
config DYNAMIC_FTRACE
bool "enable/disable ftrace tracepoints dynamically"
depends on FUNCTION_TRACER
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index ce3b1cd..8e6884d 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -55,5 +55,6 @@ obj-$(CONFIG_FTRACE_SYSCALLS) += trace_syscalls.o
obj-$(CONFIG_EVENT_PROFILE) += trace_event_profile.o
obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
obj-$(CONFIG_KSYM_TRACER) += trace_ksym.o
+obj-$(CONFIG_KPROBE_TRACER) += trace_kprobe.o

libftrace-y := ftrace.o
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 606073c..4ce4525 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -43,6 +43,8 @@ enum trace_type {
TRACE_POWER,
TRACE_BLK,
TRACE_KSYM,
+ TRACE_KPROBE,
+ TRACE_KRETPROBE,

__TRACE_LAST_TYPE,
};
@@ -221,6 +223,29 @@ struct ksym_trace_entry {
char cmd[TASK_COMM_LEN];
};

+struct kprobe_trace_entry {
+ struct trace_entry ent;
+ unsigned long ip;
+ int nargs;
+ unsigned long args[];
+};
+
+#define SIZEOF_KPROBE_TRACE_ENTRY(n) \
+ (offsetof(struct kprobe_trace_entry, args) + \
+ (sizeof(unsigned long) * (n)))
+
+struct kretprobe_trace_entry {
+ struct trace_entry ent;
+ unsigned long func;
+ unsigned long ret_ip;
+ int nargs;
+ unsigned long args[];
+};
+
+#define SIZEOF_KRETPROBE_TRACE_ENTRY(n) \
+ (offsetof(struct kretprobe_trace_entry, args) + \
+ (sizeof(unsigned long) * (n)))
+
/*
* trace_flag_type is an enumeration that holds different
* states when a trace occurs. These are:
@@ -333,6 +358,10 @@ extern void __ftrace_bad_type(void);
IF_ASSIGN(var, ent, struct kmemtrace_free_entry, \
TRACE_KMEM_FREE); \
IF_ASSIGN(var, ent, struct ksym_trace_entry, TRACE_KSYM);\
+ IF_ASSIGN(var, ent, struct kprobe_trace_entry, \
+ TRACE_KPROBE); \
+ IF_ASSIGN(var, ent, struct kretprobe_trace_entry, \
+ TRACE_KRETPROBE); \
__ftrace_bad_type(); \
} while (0)

diff --git a/kernel/trace/trace_event_types.h b/kernel/trace/trace_event_types.h
index e74f090..186b598 100644
--- a/kernel/trace/trace_event_types.h
+++ b/kernel/trace/trace_event_types.h
@@ -175,4 +175,22 @@ TRACE_EVENT_FORMAT(kmem_free, TRACE_KMEM_FREE, kmemtrace_free_entry, ignore,
TP_RAW_FMT("type:%u call_site:%lx ptr:%p")
);

+TRACE_EVENT_FORMAT(kprobe, TRACE_KPROBE, kprobe_trace_entry, ignore,
+ TRACE_STRUCT(
+ TRACE_FIELD(unsigned long, ip, ip)
+ TRACE_FIELD(int, nargs, nargs)
+ TRACE_FIELD_ZERO(unsigned long, args)
+ ),
+ TP_RAW_FMT("%08lx: args:0x%lx ...")
+);
+
+TRACE_EVENT_FORMAT(kretprobe, TRACE_KRETPROBE, kretprobe_trace_entry, ignore,
+ TRACE_STRUCT(
+ TRACE_FIELD(unsigned long, func, func)
+ TRACE_FIELD(unsigned long, ret_ip, ret_ip)
+ TRACE_FIELD(int, nargs, nargs)
+ TRACE_FIELD_ZERO(unsigned long, args)
+ ),
+ TP_RAW_FMT("%08lx <- %08lx: args:0x%lx ...")
+);
#undef TRACE_SYSTEM
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
new file mode 100644
index 0000000..d92877a
--- /dev/null
+++ b/kernel/trace/trace_kprobe.c
@@ -0,0 +1,1205 @@
+/*
+ * kprobe based kernel tracer
+ *
+ * Created by Masami Hiramatsu <***@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/uaccess.h>
+#include <linux/kprobes.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/debugfs.h>
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/ctype.h>
+#include <linux/ptrace.h>
+
+#include "trace.h"
+#include "trace_output.h"
+
+#define TRACE_KPROBE_ARGS 6
+#define MAX_ARGSTR_LEN 63
+
+/* currently, trace_kprobe only supports X86. */
+
+struct fetch_func {
+ unsigned long (*func)(struct pt_regs *, void *);
+ void *data;
+};
+
+static __kprobes unsigned long call_fetch(struct fetch_func *f,
+ struct pt_regs *regs)
+{
+ return f->func(regs, f->data);
+}
+
+/* fetch handlers */
+static __kprobes unsigned long fetch_register(struct pt_regs *regs,
+ void *offset)
+{
+ return regs_get_register(regs, (unsigned int)((unsigned long)offset));
+}
+
+static __kprobes unsigned long fetch_stack(struct pt_regs *regs,
+ void *num)
+{
+ return regs_get_kernel_stack_nth(regs,
+ (unsigned int)((unsigned long)num));
+}
+
+static __kprobes unsigned long fetch_memory(struct pt_regs *regs, void *addr)
+{
+ unsigned long retval;
+
+ if (probe_kernel_address(addr, retval))
+ return 0;
+ return retval;
+}
+
+static __kprobes unsigned long fetch_argument(struct pt_regs *regs, void *num)
+{
+ return regs_get_argument_nth(regs, (unsigned int)((unsigned long)num));
+}
+
+static __kprobes unsigned long fetch_retvalue(struct pt_regs *regs,
+ void *dummy)
+{
+ return regs_return_value(regs);
+}
+
+static __kprobes unsigned long fetch_ip(struct pt_regs *regs, void *dummy)
+{
+ return instruction_pointer(regs);
+}
+
+static __kprobes unsigned long fetch_stack_address(struct pt_regs *regs,
+ void *dummy)
+{
+ return kernel_stack_pointer(regs);
+}
+
+/* Memory fetching by symbol */
+struct symbol_cache {
+ char *symbol;
+ long offset;
+ unsigned long addr;
+};
+
+static unsigned long update_symbol_cache(struct symbol_cache *sc)
+{
+ sc->addr = (unsigned long)kallsyms_lookup_name(sc->symbol);
+ if (sc->addr)
+ sc->addr += sc->offset;
+ return sc->addr;
+}
+
+static void free_symbol_cache(struct symbol_cache *sc)
+{
+ kfree(sc->symbol);
+ kfree(sc);
+}
+
+static struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
+{
+ struct symbol_cache *sc;
+
+ if (!sym || strlen(sym) == 0)
+ return NULL;
+ sc = kzalloc(sizeof(struct symbol_cache), GFP_KERNEL);
+ if (!sc)
+ return NULL;
+
+ sc->symbol = kstrdup(sym, GFP_KERNEL);
+ if (!sc->symbol) {
+ kfree(sc);
+ return NULL;
+ }
+ sc->offset = offset;
+
+ update_symbol_cache(sc);
+ return sc;
+}
+
+static __kprobes unsigned long fetch_symbol(struct pt_regs *regs, void *data)
+{
+ struct symbol_cache *sc = data;
+
+ if (sc->addr)
+ return fetch_memory(regs, (void *)sc->addr);
+ else
+ return 0;
+}
+
+/* Special indirect memory access interface */
+struct indirect_fetch_data {
+ struct fetch_func orig;
+ long offset;
+};
+
+static __kprobes unsigned long fetch_indirect(struct pt_regs *regs, void *data)
+{
+ struct indirect_fetch_data *ind = data;
+ unsigned long addr;
+
+ addr = call_fetch(&ind->orig, regs);
+ if (addr) {
+ addr += ind->offset;
+ return fetch_memory(regs, (void *)addr);
+ } else
+ return 0;
+}
+
+static __kprobes void free_indirect_fetch_data(struct indirect_fetch_data *data)
+{
+ if (data->orig.func == fetch_indirect)
+ free_indirect_fetch_data(data->orig.data);
+ else if (data->orig.func == fetch_symbol)
+ free_symbol_cache(data->orig.data);
+ kfree(data);
+}
+
+/**
+ * kprobe_trace_core
+ */
+
+struct trace_probe {
+ struct list_head list;
+ union {
+ struct kprobe kp;
+ struct kretprobe rp;
+ };
+ const char *symbol; /* symbol name */
+ unsigned int nr_args;
+ struct fetch_func args[TRACE_KPROBE_ARGS];
+ struct ftrace_event_call call;
+};
+
+static int kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs);
+static int kretprobe_trace_func(struct kretprobe_instance *ri,
+ struct pt_regs *regs);
+
+static __kprobes int probe_is_return(struct trace_probe *tp)
+{
+ return (tp->rp.handler == kretprobe_trace_func);
+}
+
+static __kprobes const char *probe_symbol(struct trace_probe *tp)
+{
+ return tp->symbol ? tp->symbol : "unknown";
+}
+
+static __kprobes long probe_offset(struct trace_probe *tp)
+{
+ return (probe_is_return(tp)) ? tp->rp.kp.offset : tp->kp.offset;
+}
+
+static __kprobes void *probe_address(struct trace_probe *tp)
+{
+ return (probe_is_return(tp)) ? tp->rp.kp.addr : tp->kp.addr;
+}
+
+static int trace_arg_string(char *buf, size_t n, struct fetch_func *ff)
+{
+ int ret = -EINVAL;
+
+ if (ff->func == fetch_argument)
+ ret = snprintf(buf, n, "a%lu", (unsigned long)ff->data);
+ else if (ff->func == fetch_register) {
+ const char *name;
+ name = regs_query_register_name((unsigned int)((long)ff->data));
+ ret = snprintf(buf, n, "%%%s", name);
+ } else if (ff->func == fetch_stack)
+ ret = snprintf(buf, n, "s%lu", (unsigned long)ff->data);
+ else if (ff->func == fetch_memory)
+ ret = snprintf(buf, n, "@0x%p", ff->data);
+ else if (ff->func == fetch_symbol) {
+ struct symbol_cache *sc = ff->data;
+ ret = snprintf(buf, n, "@%s%+ld", sc->symbol, sc->offset);
+ } else if (ff->func == fetch_retvalue)
+ ret = snprintf(buf, n, "rv");
+ else if (ff->func == fetch_ip)
+ ret = snprintf(buf, n, "ra");
+ else if (ff->func == fetch_stack_address)
+ ret = snprintf(buf, n, "sa");
+ else if (ff->func == fetch_indirect) {
+ struct indirect_fetch_data *id = ff->data;
+ size_t l = 0;
+ ret = snprintf(buf, n, "%+ld(", id->offset);
+ if (ret >= n)
+ goto end;
+ l += ret;
+ ret = trace_arg_string(buf + l, n - l, &id->orig);
+ if (ret < 0)
+ goto end;
+ l += ret;
+ ret = snprintf(buf + l, n - l, ")");
+ ret += l;
+ }
+end:
+ if (ret >= 0 && (size_t)ret >= n)
+ return -ENOSPC;
+ return ret;
+}
+
+static int register_probe_event(struct trace_probe *tp);
+static void unregister_probe_event(struct trace_probe *tp);
+
+static DEFINE_MUTEX(probe_lock);
+static LIST_HEAD(probe_list);
+
+static struct trace_probe *alloc_trace_probe(const char *symbol,
+ const char *event)
+{
+ struct trace_probe *tp;
+
+ tp = kzalloc(sizeof(struct trace_probe), GFP_KERNEL);
+ if (!tp)
+ return ERR_PTR(-ENOMEM);
+
+ if (symbol) {
+ tp->symbol = kstrdup(symbol, GFP_KERNEL);
+ if (!tp->symbol)
+ goto error;
+ }
+ if (event) {
+ tp->call.name = kstrdup(event, GFP_KERNEL);
+ if (!tp->call.name)
+ goto error;
+ }
+
+ INIT_LIST_HEAD(&tp->list);
+ return tp;
+error:
+ kfree(tp->symbol);
+ kfree(tp);
+ return ERR_PTR(-ENOMEM);
+}
+
+static void free_trace_probe(struct trace_probe *tp)
+{
+ int i;
+
+ for (i = 0; i < tp->nr_args; i++)
+ if (tp->args[i].func == fetch_symbol)
+ free_symbol_cache(tp->args[i].data);
+ else if (tp->args[i].func == fetch_indirect)
+ free_indirect_fetch_data(tp->args[i].data);
+
+ kfree(tp->call.name);
+ kfree(tp->symbol);
+ kfree(tp);
+}
+
+static struct trace_probe *find_probe_event(const char *event)
+{
+ struct trace_probe *tp;
+
+ list_for_each_entry(tp, &probe_list, list)
+ if (tp->call.name && !strcmp(tp->call.name, event))
+ return tp;
+ return NULL;
+}
+
+static void __unregister_trace_probe(struct trace_probe *tp)
+{
+ if (probe_is_return(tp))
+ unregister_kretprobe(&tp->rp);
+ else
+ unregister_kprobe(&tp->kp);
+}
+
+/* Unregister a trace_probe and probe_event: call with locking probe_lock */
+static void unregister_trace_probe(struct trace_probe *tp)
+{
+ if (tp->call.name)
+ unregister_probe_event(tp);
+ __unregister_trace_probe(tp);
+ list_del(&tp->list);
+}
+
+/* Register a trace_probe and probe_event */
+static int register_trace_probe(struct trace_probe *tp)
+{
+ struct trace_probe *old_tp;
+ int ret;
+
+ mutex_lock(&probe_lock);
+
+ if (probe_is_return(tp))
+ ret = register_kretprobe(&tp->rp);
+ else
+ ret = register_kprobe(&tp->kp);
+
+ if (ret) {
+ pr_warning("Could not insert probe(%d)\n", ret);
+ if (ret == -EILSEQ) {
+ pr_warning("Probing address(0x%p) is not an "
+ "instruction boundary.\n",
+ probe_address(tp));
+ ret = -EINVAL;
+ }
+ goto end;
+ }
+ /* register as an event */
+ if (tp->call.name) {
+ old_tp = find_probe_event(tp->call.name);
+ if (old_tp) { /* delete old event */
+ unregister_trace_probe(old_tp);
+ free_trace_probe(old_tp);
+ }
+ ret = register_probe_event(tp);
+ if (ret) {
+ pr_warning("Failed to register probe event(%d)\n", ret);
+ __unregister_trace_probe(tp);
+ goto end; /* do not list a probe the caller will free */
+ }
+ }
+ list_add_tail(&tp->list, &probe_list);
+end:
+ mutex_unlock(&probe_lock);
+ return ret;
+}
+
+/* Split symbol and offset. */
+static int split_symbol_offset(char *symbol, long *offset)
+{
+ char *tmp;
+ int ret;
+
+ if (!offset)
+ return -EINVAL;
+
+ tmp = strchr(symbol, '+');
+ if (!tmp)
+ tmp = strchr(symbol, '-');
+
+ if (tmp) {
+ /* skip sign because strict_strtol doesn't accept '+' */
+ ret = strict_strtol(tmp + 1, 0, offset);
+ if (ret)
+ return ret;
+ if (*tmp == '-')
+ *offset = -(*offset);
+ *tmp = '\0';
+ } else
+ *offset = 0;
+ return 0;
+}
+
+#define PARAM_MAX_ARGS 16
+#define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))
+
+static int parse_trace_arg(char *arg, struct fetch_func *ff, int is_return)
+{
+ int ret = 0;
+ unsigned long param;
+ long offset;
+ char *tmp;
+
+ switch (arg[0]) {
+ case 'a': /* argument */
+ ret = strict_strtoul(arg + 1, 10, &param);
+ if (ret || param > PARAM_MAX_ARGS)
+ ret = -EINVAL;
+ else {
+ ff->func = fetch_argument;
+ ff->data = (void *)param;
+ }
+ break;
+ case 'r': /* retval or retaddr */
+ if (is_return && arg[1] == 'v') {
+ ff->func = fetch_retvalue;
+ ff->data = NULL;
+ } else if (is_return && arg[1] == 'a') {
+ ff->func = fetch_ip;
+ ff->data = NULL;
+ } else
+ ret = -EINVAL;
+ break;
+ case '%': /* named register */
+ ret = regs_query_register_offset(arg + 1);
+ if (ret >= 0) {
+ ff->func = fetch_register;
+ ff->data = (void *)(unsigned long)ret;
+ ret = 0;
+ }
+ break;
+ case 's': /* stack */
+ if (arg[1] == 'a') {
+ ff->func = fetch_stack_address;
+ ff->data = NULL;
+ } else {
+ ret = strict_strtoul(arg + 1, 10, &param);
+ if (ret || param > PARAM_MAX_STACK)
+ ret = -EINVAL;
+ else {
+ ff->func = fetch_stack;
+ ff->data = (void *)param;
+ }
+ }
+ break;
+ case '@': /* memory or symbol */
+ if (isdigit(arg[1])) {
+ ret = strict_strtoul(arg + 1, 0, &param);
+ if (ret)
+ break;
+ ff->func = fetch_memory;
+ ff->data = (void *)param;
+ } else {
+ ret = split_symbol_offset(arg + 1, &offset);
+ if (ret)
+ break;
+ ff->data = alloc_symbol_cache(arg + 1,
+ offset);
+ if (ff->data)
+ ff->func = fetch_symbol;
+ else
+ ret = -EINVAL;
+ }
+ break;
+ case '+': /* indirect memory */
+ case '-':
+ tmp = strchr(arg, '(');
+ if (!tmp) {
+ ret = -EINVAL;
+ break;
+ }
+ *tmp = '\0';
+ ret = strict_strtol(arg + 1, 0, &offset);
+ if (ret)
+ break;
+ if (arg[0] == '-')
+ offset = -offset;
+ arg = tmp + 1;
+ tmp = strrchr(arg, ')');
+ if (tmp) {
+ struct indirect_fetch_data *id;
+ *tmp = '\0';
+ id = kzalloc(sizeof(struct indirect_fetch_data),
+ GFP_KERNEL);
+ if (!id)
+ return -ENOMEM;
+ id->offset = offset;
+ ret = parse_trace_arg(arg, &id->orig, is_return);
+ if (ret)
+ kfree(id);
+ else {
+ ff->func = fetch_indirect;
+ ff->data = (void *)id;
+ }
+ } else
+ ret = -EINVAL;
+ break;
+ default:
+ /* TODO: support custom handler */
+ ret = -EINVAL;
+ }
+ return ret;
+}
+
+static int create_trace_probe(int argc, char **argv)
+{
+ /*
+ * Argument syntax:
+ * - Add kprobe: p[:EVENT] SYMBOL[+OFFS|-OFFS]|ADDRESS [FETCHARGS]
+ * - Add kretprobe: r[:EVENT] SYMBOL[+0] [FETCHARGS]
+ * Fetch args:
+ * aN : fetch Nth of function argument. (N:0-)
+ * rv : fetch return value
+ * ra : fetch return address
+ * sa : fetch stack address
+ * sN : fetch Nth of stack (N:0-)
+ * @ADDR : fetch memory at ADDR (ADDR should be in kernel)
+ * @SYM[+|-offs] : fetch memory at SYM +|- offs (SYM is a data symbol)
+ * %REG : fetch register REG
+ * Indirect memory fetch:
+ * +|-offs(ARG) : fetch memory at ARG +|- offs address.
+ */
+ struct trace_probe *tp;
+ struct kprobe *kp;
+ int i, ret = 0;
+ int is_return = 0;
+ char *symbol = NULL, *event = NULL;
+ long offset = 0;
+ void *addr = NULL;
+
+ if (argc < 2)
+ return -EINVAL;
+
+ if (argv[0][0] == 'p')
+ is_return = 0;
+ else if (argv[0][0] == 'r')
+ is_return = 1;
+ else
+ return -EINVAL;
+
+ if (argv[0][1] == ':') {
+ event = &argv[0][2];
+ if (strlen(event) == 0) {
+ pr_info("Event name is not specified\n");
+ return -EINVAL;
+ }
+ }
+
+ if (isdigit(argv[1][0])) {
+ if (is_return)
+ return -EINVAL;
+ /* an address specified */
+ ret = strict_strtoul(&argv[1][0], 0, (unsigned long *)&addr);
+ if (ret)
+ return ret;
+ } else {
+ /* a symbol specified */
+ symbol = argv[1];
+ /* TODO: support .init module functions */
+ ret = split_symbol_offset(symbol, &offset);
+ if (ret)
+ return ret;
+ if (offset && is_return)
+ return -EINVAL;
+ }
+
+ /* setup a probe */
+ tp = alloc_trace_probe(symbol, event);
+ if (IS_ERR(tp))
+ return PTR_ERR(tp);
+
+ if (is_return) {
+ kp = &tp->rp.kp;
+ tp->rp.handler = kretprobe_trace_func;
+ } else {
+ kp = &tp->kp;
+ tp->kp.pre_handler = kprobe_trace_func;
+ }
+
+ if (tp->symbol) {
+ kp->symbol_name = tp->symbol;
+ kp->offset = offset;
+ } else
+ kp->addr = addr;
+
+ /* parse arguments */
+ argc -= 2; argv += 2; ret = 0;
+ for (i = 0; i < argc && i < TRACE_KPROBE_ARGS; i++) {
+ if (strlen(argv[i]) > MAX_ARGSTR_LEN) {
+ pr_info("Argument%d(%s) is too long.\n", i, argv[i]);
+ ret = -ENOSPC;
+ goto error;
+ }
+ ret = parse_trace_arg(argv[i], &tp->args[i], is_return);
+ if (ret)
+ goto error;
+ }
+ tp->nr_args = i;
+
+ ret = register_trace_probe(tp);
+ if (ret)
+ goto error;
+ return 0;
+
+error:
+ free_trace_probe(tp);
+ return ret;
+}
+
+static void cleanup_all_probes(void)
+{
+ struct trace_probe *tp;
+
+ mutex_lock(&probe_lock);
+ /* TODO: Use batch unregistration */
+ while (!list_empty(&probe_list)) {
+ tp = list_entry(probe_list.next, struct trace_probe, list);
+ unregister_trace_probe(tp);
+ free_trace_probe(tp);
+ }
+ mutex_unlock(&probe_lock);
+}
+
+
+/* Probes listing interfaces */
+static void *probes_seq_start(struct seq_file *m, loff_t *pos)
+{
+ mutex_lock(&probe_lock);
+ return seq_list_start(&probe_list, *pos);
+}
+
+static void *probes_seq_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ return seq_list_next(v, &probe_list, pos);
+}
+
+static void probes_seq_stop(struct seq_file *m, void *v)
+{
+ mutex_unlock(&probe_lock);
+}
+
+static int probes_seq_show(struct seq_file *m, void *v)
+{
+ struct trace_probe *tp = v;
+ int i, ret;
+ char buf[MAX_ARGSTR_LEN + 1];
+
+ seq_printf(m, "%c", probe_is_return(tp) ? 'r' : 'p');
+ if (tp->call.name)
+ seq_printf(m, ":%s", tp->call.name);
+
+ if (tp->symbol)
+ seq_printf(m, " %s%+ld", probe_symbol(tp), probe_offset(tp));
+ else
+ seq_printf(m, " 0x%p", probe_address(tp));
+
+ for (i = 0; i < tp->nr_args; i++) {
+ ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ if (ret < 0) {
+ pr_warning("Argument%d decoding error(%d).\n", i, ret);
+ return ret;
+ }
+ seq_printf(m, " %s", buf);
+ }
+ seq_printf(m, "\n");
+ return 0;
+}
+
+static const struct seq_operations probes_seq_op = {
+ .start = probes_seq_start,
+ .next = probes_seq_next,
+ .stop = probes_seq_stop,
+ .show = probes_seq_show
+};
+
+static int probes_open(struct inode *inode, struct file *file)
+{
+ if ((file->f_mode & FMODE_WRITE) &&
+ (file->f_flags & O_TRUNC))
+ cleanup_all_probes();
+
+ return seq_open(file, &probes_seq_op);
+}
+
+static int command_trace_probe(const char *buf)
+{
+ char **argv;
+ int argc = 0, ret = 0;
+
+ argv = argv_split(GFP_KERNEL, buf, &argc);
+ if (!argv)
+ return -ENOMEM;
+
+ if (argc)
+ ret = create_trace_probe(argc, argv);
+
+ argv_free(argv);
+ return ret;
+}
+
+#define WRITE_BUFSIZE 128
+
+static ssize_t probes_write(struct file *file, const char __user *buffer,
+ size_t count, loff_t *ppos)
+{
+ char *kbuf, *tmp;
+ int ret;
+ size_t done;
+ size_t size;
+
+ kbuf = kmalloc(WRITE_BUFSIZE, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ ret = done = 0;
+ while (done < count) {
+ size = count - done;
+ if (size >= WRITE_BUFSIZE)
+ size = WRITE_BUFSIZE - 1;
+ if (copy_from_user(kbuf, buffer + done, size)) {
+ ret = -EFAULT;
+ goto out;
+ }
+ kbuf[size] = '\0';
+ tmp = strchr(kbuf, '\n');
+ if (tmp) {
+ *tmp = '\0';
+ size = tmp - kbuf + 1;
+ } else if (done + size < count) {
+ pr_warning("Line length is too long: "
+ "Should be less than %d.", WRITE_BUFSIZE);
+ ret = -EINVAL;
+ goto out;
+ }
+ done += size;
+ /* Remove comments */
+ tmp = strchr(kbuf, '#');
+ if (tmp)
+ *tmp = '\0';
+
+ ret = command_trace_probe(kbuf);
+ if (ret)
+ goto out;
+ }
+ ret = done;
+out:
+ kfree(kbuf);
+ return ret;
+}
+
+static const struct file_operations kprobe_events_ops = {
+ .owner = THIS_MODULE,
+ .open = probes_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release,
+ .write = probes_write,
+};
+
+/* Kprobe handler */
+static __kprobes int kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)
+{
+ struct trace_probe *tp = container_of(kp, struct trace_probe, kp);
+ struct kprobe_trace_entry *entry;
+ struct ring_buffer_event *event;
+ int size, i, pc;
+ unsigned long irq_flags;
+ struct ftrace_event_call *call = &event_kprobe;
+
+ if (&tp->call.name)
+ call = &tp->call;
+
+ local_save_flags(irq_flags);
+ pc = preempt_count();
+
+ size = SIZEOF_KPROBE_TRACE_ENTRY(tp->nr_args);
+
+ event = trace_current_buffer_lock_reserve(TRACE_KPROBE, size,
+ irq_flags, pc);
+ if (!event)
+ return 0;
+
+ entry = ring_buffer_event_data(event);
+ entry->nargs = tp->nr_args;
+ entry->ip = (unsigned long)kp->addr;
+ for (i = 0; i < tp->nr_args; i++)
+ entry->args[i] = call_fetch(&tp->args[i], regs);
+
+ if (!filter_current_check_discard(call, entry, event))
+ trace_nowake_buffer_unlock_commit(event, irq_flags, pc);
+ return 0;
+}
+
+/* Kretprobe handler */
+static __kprobes int kretprobe_trace_func(struct kretprobe_instance *ri,
+ struct pt_regs *regs)
+{
+ struct trace_probe *tp = container_of(ri->rp, struct trace_probe, rp);
+ struct kretprobe_trace_entry *entry;
+ struct ring_buffer_event *event;
+ int size, i, pc;
+ unsigned long irq_flags;
+ struct ftrace_event_call *call = &event_kretprobe;
+
+ if (&tp->call.name)
+ call = &tp->call;
+
+ local_save_flags(irq_flags);
+ pc = preempt_count();
+
+ size = SIZEOF_KRETPROBE_TRACE_ENTRY(tp->nr_args);
+
+ event = trace_current_buffer_lock_reserve(TRACE_KRETPROBE, size,
+ irq_flags, pc);
+ if (!event)
+ return 0;
+
+ entry = ring_buffer_event_data(event);
+ entry->nargs = tp->nr_args;
+ entry->func = (unsigned long)probe_address(tp);
+ entry->ret_ip = (unsigned long)ri->ret_addr;
+ for (i = 0; i < tp->nr_args; i++)
+ entry->args[i] = call_fetch(&tp->args[i], regs);
+
+ if (!filter_current_check_discard(call, entry, event))
+ trace_nowake_buffer_unlock_commit(event, irq_flags, pc);
+
+ return 0;
+}
+
+/* Event entry printers */
+enum print_line_t
+print_kprobe_event(struct trace_iterator *iter, int flags)
+{
+ struct kprobe_trace_entry *field;
+ struct trace_seq *s = &iter->seq;
+ int i;
+
+ trace_assign_type(field, iter->ent);
+
+ if (!seq_print_ip_sym(s, field->ip, flags | TRACE_ITER_SYM_OFFSET))
+ goto partial;
+
+ if (!trace_seq_puts(s, ":"))
+ goto partial;
+
+ for (i = 0; i < field->nargs; i++)
+ if (!trace_seq_printf(s, " 0x%lx", field->args[i]))
+ goto partial;
+
+ if (!trace_seq_puts(s, "\n"))
+ goto partial;
+
+ return TRACE_TYPE_HANDLED;
+partial:
+ return TRACE_TYPE_PARTIAL_LINE;
+}
+
+enum print_line_t
+print_kretprobe_event(struct trace_iterator *iter, int flags)
+{
+ struct kretprobe_trace_entry *field;
+ struct trace_seq *s = &iter->seq;
+ int i;
+
+ trace_assign_type(field, iter->ent);
+
+ if (!seq_print_ip_sym(s, field->ret_ip, flags | TRACE_ITER_SYM_OFFSET))
+ goto partial;
+
+ if (!trace_seq_puts(s, " <- "))
+ goto partial;
+
+ if (!seq_print_ip_sym(s, field->func, flags & ~TRACE_ITER_SYM_OFFSET))
+ goto partial;
+
+ if (!trace_seq_puts(s, ":"))
+ goto partial;
+
+ for (i = 0; i < field->nargs; i++)
+ if (!trace_seq_printf(s, " 0x%lx", field->args[i]))
+ goto partial;
+
+ if (!trace_seq_puts(s, "\n"))
+ goto partial;
+
+ return TRACE_TYPE_HANDLED;
+partial:
+ return TRACE_TYPE_PARTIAL_LINE;
+}
+
+static struct trace_event kprobe_trace_event = {
+ .type = TRACE_KPROBE,
+ .trace = print_kprobe_event,
+};
+
+static struct trace_event kretprobe_trace_event = {
+ .type = TRACE_KRETPROBE,
+ .trace = print_kretprobe_event,
+};
+
+static int probe_event_enable(struct ftrace_event_call *call)
+{
+ struct trace_probe *tp = (struct trace_probe *)call->data;
+
+ if (probe_is_return(tp))
+ return enable_kretprobe(&tp->rp);
+ else
+ return enable_kprobe(&tp->kp);
+}
+
+static void probe_event_disable(struct ftrace_event_call *call)
+{
+ struct trace_probe *tp = (struct trace_probe *)call->data;
+
+ if (probe_is_return(tp))
+ disable_kretprobe(&tp->rp);
+ else
+ disable_kprobe(&tp->kp);
+}
+
+static int probe_event_raw_init(struct ftrace_event_call *event_call)
+{
+ INIT_LIST_HEAD(&event_call->fields);
+ init_preds(event_call);
+ return 0;
+}
+
+#undef DEFINE_FIELD
+#define DEFINE_FIELD(type, item, name, is_signed) \
+ do { \
+ ret = trace_define_field(event_call, #type, name, \
+ offsetof(typeof(field), item), \
+ sizeof(field.item), is_signed);\
+ if (ret) \
+ return ret; \
+ } while (0)
+
+static int kprobe_event_define_fields(struct ftrace_event_call *event_call)
+{
+ int ret, i;
+ struct kprobe_trace_entry field;
+ char buf[MAX_ARGSTR_LEN + 1];
+ struct trace_probe *tp = (struct trace_probe *)event_call->data;
+
+ __common_field(int, type, 1);
+ __common_field(unsigned char, flags, 0);
+ __common_field(unsigned char, preempt_count, 0);
+ __common_field(int, pid, 1);
+ __common_field(int, tgid, 1);
+
+ DEFINE_FIELD(unsigned long, ip, "ip", 0);
+ DEFINE_FIELD(int, nargs, "nargs", 1);
+ for (i = 0; i < tp->nr_args; i++) {
+ /* Set argN as a field */
+ sprintf(buf, "arg%d", i);
+ DEFINE_FIELD(unsigned long, args[i], buf, 0);
+ /* Set argument string as an alias field */
+ ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ if (ret < 0)
+ return ret;
+ DEFINE_FIELD(unsigned long, args[i], buf, 0);
+ }
+ return 0;
+}
+
+static int kretprobe_event_define_fields(struct ftrace_event_call *event_call)
+{
+ int ret, i;
+ struct kretprobe_trace_entry field;
+ char buf[MAX_ARGSTR_LEN + 1];
+ struct trace_probe *tp = (struct trace_probe *)event_call->data;
+
+ __common_field(int, type, 1);
+ __common_field(unsigned char, flags, 0);
+ __common_field(unsigned char, preempt_count, 0);
+ __common_field(int, pid, 1);
+ __common_field(int, tgid, 1);
+
+ DEFINE_FIELD(unsigned long, func, "func", 0);
+ DEFINE_FIELD(unsigned long, ret_ip, "ret_ip", 0);
+ DEFINE_FIELD(int, nargs, "nargs", 1);
+ for (i = 0; i < tp->nr_args; i++) {
+ /* Set argN as a field */
+ sprintf(buf, "arg%d", i);
+ DEFINE_FIELD(unsigned long, args[i], buf, 0);
+ /* Set argument string as an alias field */
+ ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ if (ret < 0)
+ return ret;
+ DEFINE_FIELD(unsigned long, args[i], buf, 0);
+ }
+ return 0;
+}
+
+static int __probe_event_show_format(struct trace_seq *s,
+ struct trace_probe *tp, const char *fmt,
+ const char *arg)
+{
+ int i, ret;
+ char buf[MAX_ARGSTR_LEN + 1];
+
+ /* Show aliases */
+ for (i = 0; i < tp->nr_args; i++) {
+ ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ if (ret < 0)
+ return ret;
+ if (!trace_seq_printf(s, "\talias: %s;\toriginal: arg%d;\n",
+ buf, i))
+ return 0;
+ }
+ /* Show format */
+ if (!trace_seq_printf(s, "\nprint fmt: \"%s", fmt))
+ return 0;
+
+ for (i = 0; i < tp->nr_args; i++)
+ if (!trace_seq_puts(s, " 0x%lx"))
+ return 0;
+
+ if (!trace_seq_printf(s, "\", %s", arg))
+ return 0;
+
+ for (i = 0; i < tp->nr_args; i++)
+ if (!trace_seq_printf(s, ", arg%d", i))
+ return 0;
+
+ return trace_seq_puts(s, "\n");
+}
+
+#undef SHOW_FIELD
+#define SHOW_FIELD(type, item, name) \
+ do { \
+ ret = trace_seq_printf(s, "\tfield: " #type " %s;\t" \
+ "offset:%u;tsize:%u;\n", name, \
+ (unsigned int)offsetof(typeof(field), item),\
+ (unsigned int)sizeof(type)); \
+ if (!ret) \
+ return 0; \
+ } while (0)
+
+static int kprobe_event_show_format(struct ftrace_event_call *call,
+ struct trace_seq *s)
+{
+ struct kprobe_trace_entry field __attribute__((unused));
+ int ret, i;
+ char buf[8];
+ struct trace_probe *tp = (struct trace_probe *)call->data;
+
+ SHOW_FIELD(unsigned long, ip, "ip");
+ SHOW_FIELD(int, nargs, "nargs");
+
+ /* Show fields */
+ for (i = 0; i < tp->nr_args; i++) {
+ sprintf(buf, "arg%d", i);
+ SHOW_FIELD(unsigned long, args[i], buf);
+ }
+ trace_seq_puts(s, "\n");
+
+ return __probe_event_show_format(s, tp, "%lx:", "ip");
+}
+
+static int kretprobe_event_show_format(struct ftrace_event_call *call,
+ struct trace_seq *s)
+{
+ struct kretprobe_trace_entry field __attribute__((unused));
+ int ret, i;
+ char buf[8];
+ struct trace_probe *tp = (struct trace_probe *)call->data;
+
+ SHOW_FIELD(unsigned long, func, "func");
+ SHOW_FIELD(unsigned long, ret_ip, "ret_ip");
+ SHOW_FIELD(int, nargs, "nargs");
+
+ /* Show fields */
+ for (i = 0; i < tp->nr_args; i++) {
+ sprintf(buf, "arg%d", i);
+ SHOW_FIELD(unsigned long, args[i], buf);
+ }
+ trace_seq_puts(s, "\n");
+
+ return __probe_event_show_format(s, tp, "%lx <- %lx:",
+ "func, ret_ip");
+}
+
+static int register_probe_event(struct trace_probe *tp)
+{
+ struct ftrace_event_call *call = &tp->call;
+ int ret;
+
+ /* Initialize ftrace_event_call */
+ call->system = "kprobes";
+ if (probe_is_return(tp)) {
+ call->event = &kretprobe_trace_event;
+ call->id = TRACE_KRETPROBE;
+ call->raw_init = probe_event_raw_init;
+ call->show_format = kretprobe_event_show_format;
+ call->define_fields = kretprobe_event_define_fields;
+ } else {
+ call->event = &kprobe_trace_event;
+ call->id = TRACE_KPROBE;
+ call->raw_init = probe_event_raw_init;
+ call->show_format = kprobe_event_show_format;
+ call->define_fields = kprobe_event_define_fields;
+ }
+ call->enabled = 1;
+ call->regfunc = probe_event_enable;
+ call->unregfunc = probe_event_disable;
+ call->data = tp;
+ ret = trace_add_event_call(call);
+ if (ret)
+ pr_info("Failed to register kprobe event: %s\n", call->name);
+ return ret;
+}
+
+static void unregister_probe_event(struct trace_probe *tp)
+{
+ /*
+ * Prevent to unregister event itself because the event is shared
+ * among other probes.
+ */
+ tp->call.event = NULL;
+ trace_remove_event_call(&tp->call);
+}
+
+/* Make a debugfs interface for controlling probe points */
+static __init int init_kprobe_trace(void)
+{
+ struct dentry *d_tracer;
+ struct dentry *entry;
+ int ret;
+
+ ret = register_ftrace_event(&kprobe_trace_event);
+ if (!ret) {
+ pr_warning("Could not register kprobe_trace_event type.\n");
+ return 0;
+ }
+ ret = register_ftrace_event(&kretprobe_trace_event);
+ if (!ret) {
+ pr_warning("Could not register kretprobe_trace_event type.\n");
+ return 0;
+ }
+
+ d_tracer = tracing_init_dentry();
+ if (!d_tracer)
+ return 0;
+
+ entry = debugfs_create_file("kprobe_events", 0644, d_tracer,
+ NULL, &kprobe_events_ops);
+
+ if (!entry)
+ pr_warning("Could not create debugfs "
+ "'kprobe_events' entry\n");
+ return 0;
+}
+fs_initcall(init_kprobe_trace);
+
+
+#ifdef CONFIG_FTRACE_STARTUP_TEST
+
+static int kprobe_trace_selftest_target(int a1, int a2, int a3,
+ int a4, int a5, int a6)
+{
+ return a1 + a2 + a3 + a4 + a5 + a6;
+}
+
+static __init int kprobe_trace_self_tests_init(void)
+{
+ int ret;
+ int (*target)(int, int, int, int, int, int);
+
+ target = kprobe_trace_selftest_target;
+
+ pr_info("Testing kprobe tracing: ");
+
+ ret = command_trace_probe("p:testprobe kprobe_trace_selftest_target "
+ "a1 a2 a3 a4 a5 a6");
+ if (WARN_ON_ONCE(ret))
+ pr_warning("error enabling function entry\n");
+
+ ret = command_trace_probe("r:testprobe2 kprobe_trace_selftest_target "
+ "ra rv");
+ if (WARN_ON_ONCE(ret))
+ pr_warning("error enabling function return\n");
+
+ ret = target(1, 2, 3, 4, 5, 6);
+
+ cleanup_all_probes();
+
+ pr_cont("OK\n");
+ return 0;
+}
+
+late_initcall(kprobe_trace_self_tests_init);
+
+#endif
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
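
For readers following the write path of kprobe_events in the patch above: probes_write() copies the user buffer in chunks of at most WRITE_BUFSIZE - 1 bytes, cuts each chunk at the first newline, strips anything after '#', and hands the resulting line to command_trace_probe(). The following is only a user-space sketch of that loop, not the kernel code. The names feed_lines(), line_handler, struct line_stats and count_line() are made up for illustration, and memcpy() stands in for copy_from_user().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_BUFSIZE 128	/* mirrors WRITE_BUFSIZE in the patch */

/* Hypothetical callback: receives one command line, returns 0 on success. */
typedef int (*line_handler)(const char *line, void *ctx);

/*
 * Split `count` bytes of `input` into lines, strip '#' comments and pass
 * each line (possibly empty) to `handler`.
 * Returns the number of bytes consumed, -1 if a line does not fit into
 * the buffer, or the handler's non-zero return value.
 */
static long feed_lines(const char *input, size_t count,
		       line_handler handler, void *ctx)
{
	char buf[LINE_BUFSIZE];
	size_t done = 0;

	assert(input != NULL && handler != NULL);

	while (done < count) {
		size_t size = count - done;
		char *p;
		int ret;

		if (size >= LINE_BUFSIZE)
			size = LINE_BUFSIZE - 1;
		memcpy(buf, input + done, size);	/* copy_from_user() in the kernel */
		buf[size] = '\0';

		p = strchr(buf, '\n');
		if (p) {
			/* Consume up to and including the newline. */
			*p = '\0';
			size = (size_t)(p - buf) + 1;
		} else if (done + size < count) {
			/* No newline in a full chunk: the line is too long. */
			return -1;
		}
		done += size;

		/* Remove comments */
		p = strchr(buf, '#');
		if (p)
			*p = '\0';

		ret = handler(buf, ctx);
		if (ret)
			return ret;
	}
	return (long)done;
}

/* Example handler: counts non-empty lines and remembers the last one. */
struct line_stats {
	int nonempty;
	char last[LINE_BUFSIZE];
};

static int count_line(const char *line, void *ctx)
{
	struct line_stats *st = ctx;

	if (line[0] != '\0') {
		st->nonempty++;
		strcpy(st->last, line);	/* safe: line is shorter than LINE_BUFSIZE */
	}
	return 0;
}
```

Calling feed_lines() with count_line() on a buffer holding two probe definitions separated by a blank line should report two non-empty lines. Empty lines still reach the handler, which matches the patch, where command_trace_probe() simply does nothing when argc is 0.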
Frederic Weisbecker
2009-08-19 01:23:08 UTC
Post by Masami Hiramatsu
Add kprobes-based event tracer on ftrace.

This tracer is similar to the events tracer which is based on Tracepoint
infrastructure. Instead of Tracepoint, this tracer is based on kprobes
(kprobe and kretprobe). It probes anywhere where kprobes can probe(this
means, all functions body except for __kprobes functions).

Similar to the events tracer, this tracer doesn't need to be activated via
current_tracer, instead of that, just set probe points via
/sys/kernel/debug/tracing/kprobe_events. And you can set filters on each
probe events via /sys/kernel/debug/tracing/events/kprobes/<EVENT>/filter.

This tracer supports following probe arguments for each probe.

%REG : Fetch register REG
sN : Fetch Nth entry of stack (N >= 0)
sa : Fetch stack address.
@ADDR : Fetch memory at ADDR (ADDR should be in kernel)
@SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol)
aN : Fetch function argument. (N >= 0)
rv : Fetch return value.
ra : Fetch return address.
+|-offs(FETCHARG) : fetch memory at FETCHARG +|- offs address.

See Documentation/trace/kprobetrace.txt for details.

- Support 'sa' for stack address.
- Use call->data instead of container_of() macro.

---

Documentation/trace/kprobetrace.txt | 139 ++++

I'll probably split this commit to have the first version of the
documentation as a separate patch in order to lighten this.

Frederic.

Masami Hiramatsu
2009-08-13 20:35:26 UTC
Generate names for each kprobe event based on the probe point,
and remove generic k*probe event types because there is no user
of those types.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

Documentation/trace/kprobetrace.txt | 3 +-
kernel/trace/trace_event_types.h | 18 ----------
kernel/trace/trace_kprobe.c | 64 ++++++++++++++++++-----------------
3 files changed, 35 insertions(+), 50 deletions(-)

diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
index c9c09b4..5e59e85 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -28,7 +28,8 @@ Synopsis of kprobe_events
p[:EVENT] SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS] : Set a probe
r[:EVENT] SYMBOL[+0] [FETCHARGS] : Set a return probe

- EVENT : Event name.
+ EVENT : Event name. If omitted, the event name is generated
+ based on SYMBOL+offs or MEMADDR.
SYMBOL[+offs|-offs] : Symbol+offset where the probe is inserted.
MEMADDR : Address where the probe is inserted.

diff --git a/kernel/trace/trace_event_types.h b/kernel/trace/trace_event_types.h
index 186b598..e74f090 100644
--- a/kernel/trace/trace_event_types.h
+++ b/kernel/trace/trace_event_types.h
@@ -175,22 +175,4 @@ TRACE_EVENT_FORMAT(kmem_free, TRACE_KMEM_FREE, kmemtrace_free_entry, ignore,
TP_RAW_FMT("type:%u call_site:%lx ptr:%p")
);

-TRACE_EVENT_FORMAT(kprobe, TRACE_KPROBE, kprobe_trace_entry, ignore,
- TRACE_STRUCT(
- TRACE_FIELD(unsigned long, ip, ip)
- TRACE_FIELD(int, nargs, nargs)
- TRACE_FIELD_ZERO(unsigned long, args)
- ),
- TP_RAW_FMT("%08lx: args:0x%lx ...")
-);
-
-TRACE_EVENT_FORMAT(kretprobe, TRACE_KRETPROBE, kretprobe_trace_entry, ignore,
- TRACE_STRUCT(
- TRACE_FIELD(unsigned long, func, func)
- TRACE_FIELD(unsigned long, ret_ip, ret_ip)
- TRACE_FIELD(int, nargs, nargs)
- TRACE_FIELD_ZERO(unsigned long, args)
- ),
- TP_RAW_FMT("%08lx <- %08lx: args:0x%lx ...")
-);
#undef TRACE_SYSTEM
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 4704e40..ec137ed 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -34,6 +34,7 @@

#define MAX_TRACE_ARGS 128
#define MAX_ARGSTR_LEN 63
+#define MAX_EVENT_NAME_LEN 64

/* currently, trace_kprobe only supports X86. */

@@ -280,11 +281,11 @@ static struct trace_probe *alloc_trace_probe(const char *symbol,
if (!tp->symbol)
goto error;
}
- if (event) {
- tp->call.name = kstrdup(event, GFP_KERNEL);
- if (!tp->call.name)
- goto error;
- }
+ if (!event)
+ goto error;
+ tp->call.name = kstrdup(event, GFP_KERNEL);
+ if (!tp->call.name)
+ goto error;

INIT_LIST_HEAD(&tp->list);
return tp;
@@ -314,7 +315,7 @@ static struct trace_probe *find_probe_event(const char *event)
struct trace_probe *tp;

list_for_each_entry(tp, &probe_list, list)
- if (tp->call.name && !strcmp(tp->call.name, event))
+ if (!strcmp(tp->call.name, event))
return tp;
return NULL;
}
@@ -330,8 +331,7 @@ static void __unregister_trace_probe(struct trace_probe *tp)
/* Unregister a trace_probe and probe_event: call with locking probe_lock */
static void unregister_trace_probe(struct trace_probe *tp)
{
- if (tp->call.name)
- unregister_probe_event(tp);
+ unregister_probe_event(tp);
__unregister_trace_probe(tp);
list_del(&tp->list);
}
@@ -360,18 +360,16 @@ static int register_trace_probe(struct trace_probe *tp)
goto end;
}
/* register as an event */
- if (tp->call.name) {
- old_tp = find_probe_event(tp->call.name);
- if (old_tp) {
- /* delete old event */
- unregister_trace_probe(old_tp);
- free_trace_probe(old_tp);
- }
- ret = register_probe_event(tp);
- if (ret) {
- pr_warning("Faild to register probe event(%d)\n", ret);
- __unregister_trace_probe(tp);
- }
+ old_tp = find_probe_event(tp->call.name);
+ if (old_tp) {
+ /* delete old event */
+ unregister_trace_probe(old_tp);
+ free_trace_probe(old_tp);
+ }
+ ret = register_probe_event(tp);
+ if (ret) {
+ pr_warning("Faild to register probe event(%d)\n", ret);
+ __unregister_trace_probe(tp);
}
list_add_tail(&tp->list, &probe_list);
end:
@@ -580,7 +578,18 @@ static int create_trace_probe(int argc, char **argv)
argc -= 2; argv += 2;

/* setup a probe */
- tp = alloc_trace_probe(symbol, event, argc);
+ if (!event) {
+ /* Make a new event name */
+ char buf[MAX_EVENT_NAME_LEN];
+ if (symbol)
+ snprintf(buf, MAX_EVENT_NAME_LEN, "%c@%s%+ld",
+ is_return ? 'r' : 'p', symbol, offset);
+ else
+ snprintf(buf, MAX_EVENT_NAME_LEN, "%c@0x%p",
+ is_return ? 'r' : 'p', addr);
+ tp = alloc_trace_probe(symbol, buf, argc);
+ } else
+ tp = alloc_trace_probe(symbol, event, argc);
if (IS_ERR(tp))
return PTR_ERR(tp);

@@ -661,8 +670,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
char buf[MAX_ARGSTR_LEN + 1];

seq_printf(m, "%c", probe_is_return(tp) ? 'r' : 'p');
- if (tp->call.name)
- seq_printf(m, ":%s", tp->call.name);
+ seq_printf(m, ":%s", tp->call.name);

if (tp->symbol)
seq_printf(m, " %s%+ld", probe_symbol(tp), probe_offset(tp));
@@ -780,10 +788,7 @@ static __kprobes int kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)
struct ring_buffer_event *event;
int size, i, pc;
unsigned long irq_flags;
- struct ftrace_event_call *call = &event_kprobe;
-
- if (&tp->call.name)
- call = &tp->call;
+ struct ftrace_event_call *call = &tp->call;

local_save_flags(irq_flags);
pc = preempt_count();
@@ -815,10 +820,7 @@ static __kprobes int kretprobe_trace_func(struct kretprobe_instance *ri,
struct ring_buffer_event *event;
int size, i, pc;
unsigned long irq_flags;
- struct ftrace_event_call *call = &event_kretprobe;
-
- if (&tp->call.name)
- call = &tp->call;
+ struct ftrace_event_call *call = &tp->call;

local_save_flags(irq_flags);
pc = preempt_count();
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
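
To make the naming rule of the patch above concrete: when EVENT is omitted, the name is built as "p@SYMBOL+offs" or "r@SYMBOL+offs", or from the raw address when no symbol is given. Below is a small user-space sketch of that rule, under the assumption that the address can be printed with %lx instead of the kernel's %p. make_event_name() and same_event_name() are illustrative names, not functions from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_EVENT_NAME_LEN 64	/* same limit as in the patch */

/*
 * Build a default event name into `buf`.
 * Returns what snprintf() returns: the length the full name needs, so a
 * value >= len means the name was truncated.
 */
static int make_event_name(char *buf, size_t len, int is_return,
			   const char *symbol, long offset,
			   unsigned long addr)
{
	char type = is_return ? 'r' : 'p';

	assert(buf != NULL && len > 0);
	if (symbol)
		/* %+ld always prints a sign, so offset 0 becomes "+0". */
		return snprintf(buf, len, "%c@%s%+ld", type, symbol, offset);
	/* Assumption: %lx replaces the kernel's %p for this sketch. */
	return snprintf(buf, len, "%c@0x%lx", type, addr);
}

/*
 * Events are looked up by name (find_probe_event() uses strcmp()), so two
 * unnamed probes on the same point get the same name and the newer one
 * replaces the older one.
 */
static int same_event_name(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

For example, an unnamed entry probe on do_sys_open with offset 0 would be named "p@do_sys_open+0", and a return probe on the same symbol "r@do_sys_open+0". Since the two names differ in the first character, an entry probe and a return probe on the same function do not replace each other.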
Masami Hiramatsu
2009-08-13 20:35:34 UTC
Assign a new event id to each kprobe event. The ring buffer is not cleared
when a kprobe event is unregistered, so if 'Unknown event' messages bother
you, clear the buffer manually after changing kprobe events.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

kernel/trace/trace.h | 6 -----
kernel/trace/trace_kprobe.c | 51 +++++++++++++------------------------------
2 files changed, 15 insertions(+), 42 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 4ce4525..0b78d76 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -43,8 +43,6 @@ enum trace_type {
TRACE_POWER,
TRACE_BLK,
TRACE_KSYM,
- TRACE_KPROBE,
- TRACE_KRETPROBE,

__TRACE_LAST_TYPE,
};
@@ -358,10 +356,6 @@ extern void __ftrace_bad_type(void);
IF_ASSIGN(var, ent, struct kmemtrace_free_entry, \
TRACE_KMEM_FREE); \
IF_ASSIGN(var, ent, struct ksym_trace_entry, TRACE_KSYM);\
- IF_ASSIGN(var, ent, struct kprobe_trace_entry, \
- TRACE_KPROBE); \
- IF_ASSIGN(var, ent, struct kretprobe_trace_entry, \
- TRACE_KRETPROBE); \
__ftrace_bad_type(); \
} while (0)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index ec137ed..0e8498e 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -186,6 +186,7 @@ struct trace_probe {
};
const char *symbol; /* symbol name */
struct ftrace_event_call call;
+ struct trace_event event;
unsigned int nr_args;
struct fetch_func args[];
};
@@ -795,7 +796,7 @@ static __kprobes int kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)

size = SIZEOF_KPROBE_TRACE_ENTRY(tp->nr_args);

- event = trace_current_buffer_lock_reserve(TRACE_KPROBE, size,
+ event = trace_current_buffer_lock_reserve(call->id, size,
irq_flags, pc);
if (!event)
return 0;
@@ -827,7 +828,7 @@ static __kprobes int kretprobe_trace_func(struct kretprobe_instance *ri,

size = SIZEOF_KRETPROBE_TRACE_ENTRY(tp->nr_args);

- event = trace_current_buffer_lock_reserve(TRACE_KRETPROBE, size,
+ event = trace_current_buffer_lock_reserve(call->id, size,
irq_flags, pc);
if (!event)
return 0;
@@ -853,7 +854,7 @@ print_kprobe_event(struct trace_iterator *iter, int flags)
struct trace_seq *s = &iter->seq;
int i;

- trace_assign_type(field, iter->ent);
+ field = (struct kprobe_trace_entry *)iter->ent;

if (!seq_print_ip_sym(s, field->ip, flags | TRACE_ITER_SYM_OFFSET))
goto partial;
@@ -880,7 +881,7 @@ print_kretprobe_event(struct trace_iterator *iter, int flags)
struct trace_seq *s = &iter->seq;
int i;

- trace_assign_type(field, iter->ent);
+ field = (struct kretprobe_trace_entry *)iter->ent;

if (!seq_print_ip_sym(s, field->ret_ip, flags | TRACE_ITER_SYM_OFFSET))
goto partial;
@@ -906,16 +907,6 @@ partial:
return TRACE_TYPE_PARTIAL_LINE;
}

-static struct trace_event kprobe_trace_event = {
- .type = TRACE_KPROBE,
- .trace = print_kprobe_event,
-};
-
-static struct trace_event kretprobe_trace_event = {
- .type = TRACE_KRETPROBE,
- .trace = print_kretprobe_event,
-};
-
static int probe_event_enable(struct ftrace_event_call *call)
{
struct trace_probe *tp = (struct trace_probe *)call->data;
@@ -1107,35 +1098,35 @@ static int register_probe_event(struct trace_probe *tp)
/* Initialize ftrace_event_call */
call->system = "kprobes";
if (probe_is_return(tp)) {
- call->event = &kretprobe_trace_event;
- call->id = TRACE_KRETPROBE;
+ tp->event.trace = print_kretprobe_event;
call->raw_init = probe_event_raw_init;
call->show_format = kretprobe_event_show_format;
call->define_fields = kretprobe_event_define_fields;
} else {
- call->event = &kprobe_trace_event;
- call->id = TRACE_KPROBE;
+ tp->event.trace = print_kprobe_event;
call->raw_init = probe_event_raw_init;
call->show_format = kprobe_event_show_format;
call->define_fields = kprobe_event_define_fields;
}
+ call->event = &tp->event;
+ call->id = register_ftrace_event(&tp->event);
+ if (!call->id)
+ return -ENODEV;
call->enabled = 1;
call->regfunc = probe_event_enable;
call->unregfunc = probe_event_disable;
call->data = tp;
ret = trace_add_event_call(call);
- if (ret)
+ if (ret) {
pr_info("Failed to register kprobe event: %s\n", call->name);
+ unregister_ftrace_event(&tp->event);
+ }
return ret;
}

static void unregister_probe_event(struct trace_probe *tp)
{
- /*
- * Prevent to unregister event itself because the event is shared
- * among other probes.
- */
- tp->call.event = NULL;
+ /* tp->event is unregistered in trace_remove_event_call() */
trace_remove_event_call(&tp->call);
}

@@ -1144,18 +1135,6 @@ static __init int init_kprobe_trace(void)
{
struct dentry *d_tracer;
struct dentry *entry;
- int ret;
-
- ret = register_ftrace_event(&kprobe_trace_event);
- if (!ret) {
- pr_warning("Could not register kprobe_trace_event type.\n");
- return 0;
- }
- ret = register_ftrace_event(&kretprobe_trace_event);
- if (!ret) {
- pr_warning("Could not register kretprobe_trace_event type.\n");
- return 0;
- }

d_tracer = tracing_init_dentry();
if (!d_tracer)
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
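
The 'Unknown event' remark in the changelog above follows from how per-event ids work: a record in the ring buffer only carries the id, and the printer is found by looking that id up among the currently registered events. The toy model below illustrates this in user space. It is not the ftrace implementation; the toy_* names, the fixed-size table and FIRST_DYNAMIC_ID are assumptions made for the sketch. Only the "returns 0 on failure" convention is taken from the patch, which checks `if (!call->id)`.

```c
#include <assert.h>
#include <stddef.h>

#define FIRST_DYNAMIC_ID 100	/* hypothetical: first id after the static types */
#define MAX_EVENTS 8		/* hypothetical table size */

struct toy_event {
	int id;			/* 0 means "not registered" */
	const char *name;
};

static struct toy_event *registry[MAX_EVENTS];
static int next_id = FIRST_DYNAMIC_ID;

/* Returns the newly assigned id, or 0 on failure. */
static int toy_register_event(struct toy_event *ev)
{
	size_t i;

	assert(ev != NULL);
	for (i = 0; i < MAX_EVENTS; i++) {
		if (!registry[i]) {
			ev->id = next_id++;	/* ids are never reused in this model */
			registry[i] = ev;
			return ev->id;
		}
	}
	return 0;
}

static void toy_unregister_event(struct toy_event *ev)
{
	size_t i;

	assert(ev != NULL);
	for (i = 0; i < MAX_EVENTS; i++)
		if (registry[i] == ev)
			registry[i] = NULL;
	ev->id = 0;
}

/* Find the event for a record's id; NULL models the "Unknown event" case. */
static const char *toy_event_name(int id)
{
	size_t i;

	for (i = 0; i < MAX_EVENTS; i++)
		if (registry[i] && registry[i]->id == id)
			return registry[i]->name;
	return NULL;
}
```

In this model, a record written with an event's id can no longer be resolved once that event is unregistered, while records of the other events are unaffected. That is the situation in which clearing the buffer by hand helps.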
Masami Hiramatsu
2009-08-13 20:35:42 UTC
Add profiling interfaces for each kprobes event. This interface reports
how many times each probe was hit or missed.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

Documentation/trace/kprobetrace.txt | 8 +++++++
kernel/trace/trace_kprobe.c | 43 +++++++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt
index 5e59e85..3de7517 100644
--- a/Documentation/trace/kprobetrace.txt
+++ b/Documentation/trace/kprobetrace.txt
@@ -70,6 +70,14 @@ filter:
names and field names for describing filters.


+Event Profiling
+---------------
+ You can check the total number of probe hits and probe miss-hits via
+/sys/kernel/debug/tracing/kprobe_profile.
+ The first column is event name, the second is the number of probe hits,
+the third is the number of probe miss-hits.
+
+
Usage examples
--------------
To add a probe as a new event, write a new definition to kprobe_events
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 0e8498e..0f5d0a6 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -184,6 +184,7 @@ struct trace_probe {
struct kprobe kp;
struct kretprobe rp;
};
+ unsigned long nhit;
const char *symbol; /* symbol name */
struct ftrace_event_call call;
struct trace_event event;
@@ -781,6 +782,37 @@ static const struct file_operations kprobe_events_ops = {
.write = probes_write,
};

+/* Probes profiling interfaces */
+static int probes_profile_seq_show(struct seq_file *m, void *v)
+{
+ struct trace_probe *tp = v;
+
+ seq_printf(m, " %-44s %15lu %15lu\n", tp->call.name, tp->nhit,
+ probe_is_return(tp) ? tp->rp.kp.nmissed : tp->kp.nmissed);
+
+ return 0;
+}
+
+static const struct seq_operations profile_seq_op = {
+ .start = probes_seq_start,
+ .next = probes_seq_next,
+ .stop = probes_seq_stop,
+ .show = probes_profile_seq_show
+};
+
+static int profile_open(struct inode *inode, struct file *file)
+{
+ return seq_open(file, &profile_seq_op);
+}
+
+static const struct file_operations kprobe_profile_ops = {
+ .owner = THIS_MODULE,
+ .open = profile_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release,
+};
+
/* Kprobe handler */
static __kprobes int kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)
{
@@ -791,6 +823,8 @@ static __kprobes int kprobe_trace_func(struct kprobe *kp, struct pt_regs *regs)
unsigned long irq_flags;
struct ftrace_event_call *call = &tp->call;

+ tp->nhit++;
+
local_save_flags(irq_flags);
pc = preempt_count();

@@ -1143,9 +1177,18 @@ static __init int init_kprobe_trace(void)
entry = debugfs_create_file("kprobe_events", 0644, d_tracer,
NULL, &kprobe_events_ops);

+ /* Event list interface */
if (!entry)
pr_warning("Could not create debugfs "
"'kprobe_events' entry\n");
+
+ /* Profile interface */
+ entry = debugfs_create_file("kprobe_profile", 0444, d_tracer,
+ NULL, &kprobe_profile_ops);
+
+ if (!entry)
+ pr_warning("Could not create debugfs "
+ "'kprobe_profile' entry\n");
return 0;
}
fs_initcall(init_kprobe_trace);
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
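
As a rough illustration of the kprobe_profile output described in the patch above (event name, number of hits, number of miss-hits), here is a user-space sketch that formats one line with the same format string as probes_profile_seq_show() and parses it back. struct toy_probe, format_profile_line() and parse_profile_line() are names invented for this example, and the parser assumes event names contain no whitespace and fit in 63 characters.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the counters kept per trace_probe. */
struct toy_probe {
	const char *name;
	unsigned long nhit;
	unsigned long nmissed;
};

/* Format one profile line; returns the snprintf() result. */
static int format_profile_line(char *buf, size_t len,
			       const struct toy_probe *tp)
{
	assert(buf != NULL && tp != NULL);
	return snprintf(buf, len, " %-44s %15lu %15lu\n",
			tp->name, tp->nhit, tp->nmissed);
}

/*
 * Parse one profile line back into name and counters.
 * `name` must have room for 64 bytes. Returns 0 on success, -1 otherwise.
 */
static int parse_profile_line(const char *line, char *name,
			      unsigned long *nhit, unsigned long *nmissed)
{
	assert(line != NULL && name != NULL && nhit != NULL && nmissed != NULL);
	if (sscanf(line, "%63s %lu %lu", name, nhit, nmissed) != 3)
		return -1;
	return 0;
}
```

Because the name column is padded to 44 characters and each counter to 15, a line for a short event name is 78 bytes long including the newline, so a tool reading the file can rely on whitespace splitting rather than fixed column offsets if longer names are expected.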
Masami Hiramatsu
2009-08-13 20:34:53 UTC
Add dynamic ftrace_event_call support to ftrace. Trace engines can add a new
ftrace_event_call to ftrace on the fly. Each operator function of the call
takes a ftrace_event_call data structure as an argument, because these
functions may be shared among several ftrace_event_calls.

Changes from v13:
- Define remove_subsystem_dir() always (revert a2ca5e03), because
trace_remove_event_call() uses it.
- Modify syscall tracer because of ftrace_event_call change.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Acked-by: Frederic Weisbecker <***@gmail.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

include/linux/ftrace_event.h | 14 +++--
include/linux/syscalls.h | 4 +
include/trace/ftrace.h | 19 +++----
include/trace/syscall.h | 8 +--
kernel/trace/trace_events.c | 119 +++++++++++++++++++++++++++++------------
kernel/trace/trace_export.c | 23 ++++----
kernel/trace/trace_syscalls.c | 16 +++---
7 files changed, 125 insertions(+), 78 deletions(-)

diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.=
h
index 189806b..9af68ce 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -112,13 +112,13 @@ struct ftrace_event_call {
struct dentry *dir;
struct trace_event *event;
int enabled;
- int (*regfunc)(void *);
- void (*unregfunc)(void *);
+ int (*regfunc)(struct ftrace_event_call *);
+ void (*unregfunc)(struct ftrace_event_call *);
int id;
- int (*raw_init)(void);
- int (*show_format)(struct ftrace_event_call *call,
- struct trace_seq *s);
- int (*define_fields)(void);
+ int (*raw_init)(struct ftrace_event_call *);
+ int (*show_format)(struct ftrace_event_call *,
+ struct trace_seq *);
+ int (*define_fields)(struct ftrace_event_call *);
struct list_head fields;
int filter_active;
struct event_filter *filter;
@@ -142,6 +142,8 @@ extern int filter_current_check_discard(struct ftrace_event_call *call,

extern int trace_define_field(struct ftrace_event_call *call, char *type,
char *name, int offset, int size, int is_signed);
+extern int trace_add_event_call(struct ftrace_event_call *call);
+extern void trace_remove_event_call(struct ftrace_event_call *call);

#define is_signed_type(type) (((type)(-1)) < 0)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 87d06c1..be59d22 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -165,7 +165,7 @@ static void prof_sysexit_disable_##sname(struct ftrace_event_call *event_call) \
struct trace_event enter_syscall_print_##sname = { \
.trace = print_syscall_enter, \
}; \
- static int init_enter_##sname(void) \
+ static int init_enter_##sname(struct ftrace_event_call *call) \
{ \
int num, id; \
num = syscall_name_to_nr("sys"#sname); \
@@ -201,7 +201,7 @@ static void prof_sysexit_disable_##sname(struct ftrace_event_call *event_call) \
struct trace_event exit_syscall_print_##sname = { \
.trace = print_syscall_exit, \
}; \
- static int init_exit_##sname(void) \
+ static int init_exit_##sname(struct ftrace_event_call *call) \
{ \
int num, id; \
num = syscall_name_to_nr("sys"#sname); \
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index b250b06..6c1b5b1 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -294,10 +294,9 @@ ftrace_raw_output_##call(struct trace_iterator *iter, int flags) \
#undef TRACE_EVENT
#define TRACE_EVENT(call, proto, args, tstruct, func, print) \
int \
-ftrace_define_fields_##call(void) \
+ftrace_define_fields_##call(struct ftrace_event_call *event_call) \
{ \
struct ftrace_raw_##call field; \
- struct ftrace_event_call *event_call = &event_##call; \
int ret; \
\
__common_field(int, type, 1); \
@@ -411,7 +410,7 @@ static void ftrace_profile_disable_##call(struct ftrace_event_call *event_call)\
* event_trace_printk(_RET_IP_, "<call>: " <fmt>);
* }
*
- * static int ftrace_reg_event_<call>(void)
+ * static int ftrace_reg_event_<call>(struct ftrace_event_call *unused)
* {
* int ret;
*
@@ -422,7 +421,7 @@ static void ftrace_profile_disable_##call(struct ftrace_event_call *event_call)\
* return ret;
* }
*
- * static void ftrace_unreg_event_<call>(void)
+ * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused)
* {
* unregister_trace_<call>(ftrace_event_<call>);
* }
@@ -455,7 +454,7 @@ static void ftrace_profile_disable_##call(struct ftrace_event_call *event_call)\
* trace_current_buffer_unlock_commit(event, irq_flags, pc);
* }
*
- * static int ftrace_raw_reg_event_<call>(void)
+ * static int ftrace_raw_reg_event_<call>(struct ftrace_event_call *unused)
* {
* int ret;
*
@@ -466,7 +465,7 @@ static void ftrace_profile_disable_##call(struct ftrace_event_call *event_call)\
* return ret;
* }
*
- * static void ftrace_unreg_event_<call>(void)
+ * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused)
* {
* unregister_trace_<call>(ftrace_raw_event_<call>);
* }
@@ -475,7 +474,7 @@ static void ftrace_profile_disable_##call(struct ftrace_event_call *event_call)\
* .trace = ftrace_raw_output_<call>, <-- stage 2
* };
*
- * static int ftrace_raw_init_event_<call>(void)
+ * static int ftrace_raw_init_event_<call>(struct ftrace_event_call *unused)
* {
* int id;
*
@@ -569,7 +568,7 @@ static void ftrace_raw_event_##call(proto) \
trace_nowake_buffer_unlock_commit(event, irq_flags, pc); \
} \
\
-static int ftrace_raw_reg_event_##call(void *ptr) \
+static int ftrace_raw_reg_event_##call(struct ftrace_event_call *unused)\
{ \
int ret; \
\
@@ -580,7 +579,7 @@ static int ftrace_raw_reg_event_##call(void *ptr) \
return ret; \
} \
\
-static void ftrace_raw_unreg_event_##call(void *ptr) \
+static void ftrace_raw_unreg_event_##call(struct ftrace_event_call *unused)\
{ \
unregister_trace_##call(ftrace_raw_event_##call); \
} \
@@ -589,7 +588,7 @@ static struct trace_event ftrace_event_type_##call = { \
.trace = ftrace_raw_output_##call, \
}; \
\
-static int ftrace_raw_init_event_##call(void) \
+static int ftrace_raw_init_event_##call(struct ftrace_event_call *unused)\
{ \
int id; \
\
diff --git a/include/trace/syscall.h b/include/trace/syscall.h
index 0cb0362..848b4ae 100644
--- a/include/trace/syscall.h
+++ b/include/trace/syscall.h
@@ -51,10 +51,10 @@ void set_syscall_enter_id(int num, int id);
void set_syscall_exit_id(int num, int id);
extern struct trace_event event_syscall_enter;
extern struct trace_event event_syscall_exit;
-extern int reg_event_syscall_enter(void *ptr);
-extern void unreg_event_syscall_enter(void *ptr);
-extern int reg_event_syscall_exit(void *ptr);
-extern void unreg_event_syscall_exit(void *ptr);
+extern int reg_event_syscall_enter(struct ftrace_event_call *call);
+extern void unreg_event_syscall_enter(struct ftrace_event_call *call);
+extern int reg_event_syscall_exit(struct ftrace_event_call *call);
+extern void unreg_event_syscall_exit(struct ftrace_event_call *call);
extern int
ftrace_format_syscall(struct ftrace_event_call *call, struct trace_seq *s);
enum print_line_t print_syscall_enter(struct trace_iterator *iter, int flags);
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index b568ade..be701d1 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -62,9 +62,7 @@ err:
}
EXPORT_SYMBOL_GPL(trace_define_field);

-#ifdef CONFIG_MODULES
-
-static void trace_destroy_fields(struct ftrace_event_call *call)
+void trace_destroy_fields(struct ftrace_event_call *call)
{
struct ftrace_event_field *field, *next;

@@ -76,8 +74,6 @@ static void trace_destroy_fields(struct ftrace_event_call *call)
}
}

-#endif /* CONFIG_MODULES */
-
static void ftrace_event_enable_disable(struct ftrace_event_call *call,
int enable)
{
@@ -86,14 +82,14 @@ static void ftrace_event_enable_disable(struct ftrace_event_call *call,
if (call->enabled) {
call->enabled = 0;
tracing_stop_cmdline_record();
- call->unregfunc(call->data);
+ call->unregfunc(call);
}
break;
case 1:
if (!call->enabled) {
call->enabled = 1;
tracing_start_cmdline_record();
- call->regfunc(call->data);
+ call->regfunc(call);
}
break;
}
@@ -941,7 +937,7 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
id);

if (call->define_fields) {
- ret = call->define_fields();
+ ret = call->define_fields(call);
if (ret < 0) {
pr_warning("Could not initialize trace point"
" events/%s\n", call->name);
@@ -961,27 +957,43 @@ event_create_dir(struct ftrace_event_call *call, struct dentry *d_events,
return 0;
}

-#define for_each_event(event, start, end) \
- for (event = start; \
- (unsigned long)event < (unsigned long)end; \
- event++)
+static int __trace_add_event_call(struct ftrace_event_call *call)
+{
+ struct dentry *d_events;
+ int ret;

-#ifdef CONFIG_MODULES
+ if (!call->name)
+ return -EINVAL;

-static LIST_HEAD(ftrace_module_file_list);
+ if (call->raw_init) {
+ ret = call->raw_init(call);
+ if (ret < 0) {
+ if (ret != -ENOSYS)
+ pr_warning("Could not initialize trace "
+ "events/%s\n", call->name);
+ return ret;
+ }
+ }

-/*
- * Modules must own their file_operations to keep up with
- * reference counting.
- */
-struct ftrace_module_file_ops {
- struct list_head list;
- struct module *mod;
- struct file_operations id;
- struct file_operations enable;
- struct file_operations format;
- struct file_operations filter;
-};
+ d_events = event_trace_events_dir();
+ if (!d_events)
+ return -ENOENT;
+
+ list_add(&call->list, &ftrace_events);
+ return event_create_dir(call, d_events, &ftrace_event_id_fops,
+ &ftrace_enable_fops, &ftrace_event_filter_fops,
+ &ftrace_event_format_fops);
+}
+
+/* Add an additional event_call dynamically */
+int trace_add_event_call(struct ftrace_event_call *call)
+{
+ int ret;
+ mutex_lock(&event_mutex);
+ ret = __trace_add_event_call(call);
+ mutex_unlock(&event_mutex);
+ return ret;
+}

static void remove_subsystem_dir(const char *name)
{
@@ -1009,6 +1021,48 @@ static void remove_subsystem_dir(const char *name)
}
}

+static void __trace_remove_event_call(struct ftrace_event_call *call)
+{
+ ftrace_event_enable_disable(call, 0);
+ if (call->event)
+ __unregister_ftrace_event(call->event);
+ debugfs_remove_recursive(call->dir);
+ list_del(&call->list);
+ trace_destroy_fields(call);
+ destroy_preds(call);
+ remove_subsystem_dir(call->system);
+}
+
+/* Remove an event_call */
+void trace_remove_event_call(struct ftrace_event_call *call)
+{
+ mutex_lock(&event_mutex);
+ __trace_remove_event_call(call);
+ mutex_unlock(&event_mutex);
+}
+
+#define for_each_event(event, start, end) \
+ for (event = start; \
+ (unsigned long)event < (unsigned long)end; \
+ event++)
+
+#ifdef CONFIG_MODULES
+
+static LIST_HEAD(ftrace_module_file_list);
+
+/*
+ * Modules must own their file_operations to keep up with
+ * reference counting.
+ */
+struct ftrace_module_file_ops {
+ struct list_head list;
+ struct module *mod;
+ struct file_operations id;
+ struct file_operations enable;
+ struct file_operations format;
+ struct file_operations filter;
+};
+
static struct ftrace_module_file_ops *
trace_create_file_ops(struct module *mod)
{
@@ -1066,7 +1120,7 @@ static void trace_module_add_events(struct module *mod)
if (!call->name)
continue;
if (call->raw_init) {
- ret = call->raw_init();
+ ret = call->raw_init(call);
if (ret < 0) {
if (ret != -ENOSYS)
pr_warning("Could not initialize trace "
@@ -1101,14 +1155,7 @@ static void trace_module_remove_events(struct module *mod)
list_for_each_entry_safe(call, p, &ftrace_events, list) {
if (call->mod == mod) {
found = true;
- ftrace_event_enable_disable(call, 0);
- if (call->event)
- __unregister_ftrace_event(call->event);
- debugfs_remove_recursive(call->dir);
- list_del(&call->list);
- trace_destroy_fields(call);
- destroy_preds(call);
- remove_subsystem_dir(call->system);
+ __trace_remove_event_call(call);
}
}

@@ -1226,7 +1273,7 @@ static __init int event_trace_init(void)
if (!call->name)
continue;
if (call->raw_init) {
- ret = call->raw_init();
+ ret = call->raw_init(call);
if (ret < 0) {
if (ret != -ENOSYS)
pr_warning("Could not initialize trace "
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index 956d4bc..71c8d7f 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -117,10 +117,16 @@ ftrace_format_##call(struct ftrace_event_call *unused, \
#define TRACE_FIELD_SPECIAL(type_item, item, len, cmd) \
cmd;

+static int ftrace_raw_init_event(struct ftrace_event_call *event_call)
+{
+ INIT_LIST_HEAD(&event_call->fields);
+ init_preds(event_call);
+ return 0;
+}
+
#undef TRACE_EVENT_FORMAT
#define TRACE_EVENT_FORMAT(call, proto, args, fmt, tstruct, tpfmt) \
-int ftrace_define_fields_##call(void); \
-static int ftrace_raw_init_event_##call(void); \
+int ftrace_define_fields_##call(struct ftrace_event_call *c); \
\
struct ftrace_event_call __used \
__attribute__((__aligned__(4))) \
@@ -128,16 +134,10 @@ __attribute__((section("_ftrace_events"))) event_##call = { \
.name = #call, \
.id = proto, \
.system = __stringify(TRACE_SYSTEM), \
- .raw_init = ftrace_raw_init_event_##call, \
+ .raw_init = ftrace_raw_init_event, \
.show_format = ftrace_format_##call, \
.define_fields = ftrace_define_fields_##call, \
-}; \
-static int ftrace_raw_init_event_##call(void) \
-{ \
- INIT_LIST_HEAD(&event_##call.fields); \
- init_preds(&event_##call); \
- return 0; \
-} \
+};

#undef TRACE_EVENT_FORMAT_NOFILTER
#define TRACE_EVENT_FORMAT_NOFILTER(call, proto, args, fmt, tstruct, \
@@ -184,9 +184,8 @@ __attribute__((section("_ftrace_events"))) event_##call = { \
#undef TRACE_EVENT_FORMAT
#define TRACE_EVENT_FORMAT(call, proto, args, fmt, tstruct, tpfmt) \
int \
-ftrace_define_fields_##call(void) \
+ftrace_define_fields_##call(struct ftrace_event_call *event_call) \
{ \
- struct ftrace_event_call *event_call = &event_##call; \
struct args field; \
int ret; \
\
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index f837ccc..3451621 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -210,13 +210,13 @@ void ftrace_syscall_exit(struct pt_regs *regs, long ret)
trace_wake_up();
}

-int reg_event_syscall_enter(void *ptr)
+int reg_event_syscall_enter(struct ftrace_event_call *call)
{
int ret = 0;
int num;
char *name;

- name = (char *)ptr;
+ name = (char *)call->data;
num = syscall_name_to_nr(name);
if (num < 0 || num >= FTRACE_SYSCALL_MAX)
return -ENOSYS;
@@ -234,12 +234,12 @@ int reg_event_syscall_enter(void *ptr)
return ret;
}

-void unreg_event_syscall_enter(void *ptr)
+void unreg_event_syscall_enter(struct ftrace_event_call *call)
{
int num;
char *name;

- name = (char *)ptr;
+ name = (char *)call->data;
num = syscall_name_to_nr(name);
if (num < 0 || num >= FTRACE_SYSCALL_MAX)
return;
@@ -251,13 +251,13 @@ void unreg_event_syscall_enter(void *ptr)
mutex_unlock(&syscall_trace_lock);
}

-int reg_event_syscall_exit(void *ptr)
+int reg_event_syscall_exit(struct ftrace_event_call *call)
{
int ret = 0;
int num;
char *name;

- name = (char *)ptr;
+ name = (char *)call->data;
num = syscall_name_to_nr(name);
if (num < 0 || num >= FTRACE_SYSCALL_MAX)
return -ENOSYS;
@@ -275,12 +275,12 @@ int reg_event_syscall_exit(void *ptr)
return ret;
}

-void unreg_event_syscall_exit(void *ptr)
+void unreg_event_syscall_exit(struct ftrace_event_call *call)
{
int num;
char *name;

- name = (char *)ptr;
+ name = (char *)call->data;
num = syscall_name_to_nr(name);
if (num < 0 || num >= FTRACE_SYSCALL_MAX)
return;


--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Masami Hiramatsu
2009-08-13 20:35:01 UTC
Permalink
Use TRACE_FIELD_ZERO(type, item) instead of TRACE_FIELD_ZERO_CHAR(item).
This also includes a fix for the TRACE_ZERO_CHAR() macro.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

kernel/trace/trace_event_types.h | 4 ++--
kernel/trace/trace_export.c | 16 ++++++++--------
2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/trace_event_types.h b/kernel/trace/trace_event_types.h
index 6db005e..e74f090 100644
--- a/kernel/trace/trace_event_types.h
+++ b/kernel/trace/trace_event_types.h
@@ -109,7 +109,7 @@ TRACE_EVENT_FORMAT(bprint, TRACE_BPRINT, bprint_entry, ignore,
TRACE_STRUCT(
TRACE_FIELD(unsigned long, ip, ip)
TRACE_FIELD(char *, fmt, fmt)
- TRACE_FIELD_ZERO_CHAR(buf)
+ TRACE_FIELD_ZERO(char, buf)
),
TP_RAW_FMT("%08lx (%d) fmt:%p %s")
);
@@ -117,7 +117,7 @@ TRACE_EVENT_FORMAT(bprint, TRACE_BPRINT, bprint_entry, ignore,
TRACE_EVENT_FORMAT(print, TRACE_PRINT, print_entry, ignore,
TRACE_STRUCT(
TRACE_FIELD(unsigned long, ip, ip)
- TRACE_FIELD_ZERO_CHAR(buf)
+ TRACE_FIELD_ZERO(char, buf)
),
TP_RAW_FMT("%08lx (%d) fmt:%p %s")
);
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index 71c8d7f..b0ac92c 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -42,9 +42,9 @@ extern void __bad_type_size(void);
if (!ret) \
return 0;

-#undef TRACE_FIELD_ZERO_CHAR
-#define TRACE_FIELD_ZERO_CHAR(item) \
- ret = trace_seq_printf(s, "\tfield:char " #item ";\t" \
+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item) \
+ ret = trace_seq_printf(s, "\tfield:" #type " " #item ";\t" \
"offset:%u;\tsize:0;\n", \
(unsigned int)offsetof(typeof(field), item)); \
if (!ret) \
@@ -92,9 +92,6 @@ ftrace_format_##call(struct ftrace_event_call *unused, \

#include "trace_event_types.h"

-#undef TRACE_ZERO_CHAR
-#define TRACE_ZERO_CHAR(arg)
-
#undef TRACE_FIELD
#define TRACE_FIELD(type, item, assign)\
entry->item = assign;
@@ -107,6 +104,9 @@ ftrace_format_##call(struct ftrace_event_call *unused, \
#define TRACE_FIELD_SIGN(type, item, assign, is_signed) \
TRACE_FIELD(type, item, assign)

+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item)
+
#undef TP_CMD
#define TP_CMD(cmd...) cmd

@@ -178,8 +178,8 @@ __attribute__((section("_ftrace_events"))) event_##call = { \
if (ret) \
return ret;

-#undef TRACE_FIELD_ZERO_CHAR
-#define TRACE_FIELD_ZERO_CHAR(item)
+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item)

#undef TRACE_EVENT_FORMAT
#define TRACE_EVENT_FORMAT(call, proto, args, fmt, tstruct, tpfmt) \


--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-19 01:09:05 UTC
Permalink
Use TRACE_FIELD_ZERO(type, item) instead of TRACE_FIELD_ZERO_CHAR(item).
This also includes a fix of TRACE_ZERO_CHAR() macro.
I can't find what the fix is about (see below).

---

kernel/trace/trace_event_types.h | 4 ++--
kernel/trace/trace_export.c | 16 ++++++++--------
2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/trace_event_types.h b/kernel/trace/trace_event_types.h
index 6db005e..e74f090 100644
--- a/kernel/trace/trace_event_types.h
+++ b/kernel/trace/trace_event_types.h
@@ -109,7 +109,7 @@ TRACE_EVENT_FORMAT(bprint, TRACE_BPRINT, bprint_entry, ignore,
TRACE_STRUCT(
TRACE_FIELD(unsigned long, ip, ip)
TRACE_FIELD(char *, fmt, fmt)
- TRACE_FIELD_ZERO_CHAR(buf)
+ TRACE_FIELD_ZERO(char, buf)
),
TP_RAW_FMT("%08lx (%d) fmt:%p %s")
);
@@ -117,7 +117,7 @@ TRACE_EVENT_FORMAT(bprint, TRACE_BPRINT, bprint_entry, ignore,
TRACE_EVENT_FORMAT(print, TRACE_PRINT, print_entry, ignore,
TRACE_STRUCT(
TRACE_FIELD(unsigned long, ip, ip)
- TRACE_FIELD_ZERO_CHAR(buf)
+ TRACE_FIELD_ZERO(char, buf)
),
TP_RAW_FMT("%08lx (%d) fmt:%p %s")
);
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index 71c8d7f..b0ac92c 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -42,9 +42,9 @@ extern void __bad_type_size(void);
if (!ret) \
return 0;

-#undef TRACE_FIELD_ZERO_CHAR
-#define TRACE_FIELD_ZERO_CHAR(item) \
- ret = trace_seq_printf(s, "\tfield:char " #item ";\t" \
+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item) \
+ ret = trace_seq_printf(s, "\tfield:" #type " " #item ";\t" \
"offset:%u;\tsize:0;\n", \
(unsigned int)offsetof(typeof(field), item)); \
if (!ret) \
@@ -92,9 +92,6 @@ ftrace_format_##call(struct ftrace_event_call *unused, \

#include "trace_event_types.h"

-#undef TRACE_ZERO_CHAR
-#define TRACE_ZERO_CHAR(arg)
-
#undef TRACE_FIELD
#define TRACE_FIELD(type, item, assign)\
entry->item = assign;
@@ -107,6 +104,9 @@ ftrace_format_##call(struct ftrace_event_call *unused, \
#define TRACE_FIELD_SIGN(type, item, assign, is_signed) \
TRACE_FIELD(type, item, assign)

+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item)
+
Is it about the move above?
If so, could you just say so, so that I can add something about
it to the changelog?

Thanks.
Frederic.
#undef TP_CMD
#define TP_CMD(cmd...) cmd

@@ -178,8 +178,8 @@ __attribute__((section("_ftrace_events"))) event_##call = { \
if (ret) \
return ret;

-#undef TRACE_FIELD_ZERO_CHAR
-#define TRACE_FIELD_ZERO_CHAR(item)
+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item)

#undef TRACE_EVENT_FORMAT
#define TRACE_EVENT_FORMAT(call, proto, args, fmt, tstruct, tpfmt) \


--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division
Masami Hiramatsu
2009-08-19 02:20:11 UTC
Permalink
Post by Frederic Weisbecker
Use TRACE_FIELD_ZERO(type, item) instead of TRACE_FIELD_ZERO_CHAR(item).
This also includes a fix of TRACE_ZERO_CHAR() macro.
I can't find what the fix is about (see below)
Ah, OK. This patch actually consists of two parts.

One introduces TRACE_FIELD_ZERO, which I think is more generic than
TRACE_FIELD_ZERO_CHAR.

The other fixes a typo, TRACE_ZERO_CHAR.
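
To make the "more generic" point concrete, here is a small user-space sketch, not the kernel macros themselves. `FIELD_ZERO_CHAR`, `FIELD_ZERO`, `struct demo_entry` and `same_format_line()` are hypothetical names, and `snprintf()` stands in for `trace_seq_printf()`; the only idea taken from the patch is that the element type is stringified from a macro argument instead of being hard-coded as "char".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical entry with a zero-length trailing buffer. */
struct demo_entry {
	unsigned long ip;
	char buf[];
};

/* Old form: the element type is fixed to "char". */
#define FIELD_ZERO_CHAR(out, len, stype, item)				\
	snprintf(out, len, "\tfield:char " #item ";\toffset:%u;\tsize:0;\n", \
		 (unsigned int)offsetof(stype, item))

/* Generic form: the element type is stringified from the 'type' argument. */
#define FIELD_ZERO(out, len, stype, type, item)				\
	snprintf(out, len, "\tfield:" #type " " #item ";\toffset:%u;\tsize:0;\n", \
		 (unsigned int)offsetof(stype, item))

/* Returns 1 when both macros emit the same format line for a char buffer. */
static int same_format_line(void)
{
	char old_line[128], new_line[128];
	int n;

	n = FIELD_ZERO_CHAR(old_line, sizeof(old_line), struct demo_entry, buf);
	assert(n > 0 && (size_t)n < sizeof(old_line));
	n = FIELD_ZERO(new_line, sizeof(new_line), struct demo_entry, char, buf);
	assert(n > 0 && (size_t)n < sizeof(new_line));

	return strcmp(old_line, new_line) == 0;
}
```

For a `char` buffer the two forms produce identical output, so existing users see no format change, while the generic form can also describe zero-length arrays of other element types.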
Post by Frederic Weisbecker
---
kernel/trace/trace_event_types.h | 4 ++--
kernel/trace/trace_export.c | 16 ++++++++--------
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/trace_event_types.h b/kernel/trace/trace_event_types.h
index 6db005e..e74f090 100644
--- a/kernel/trace/trace_event_types.h
+++ b/kernel/trace/trace_event_types.h
@@ -109,7 +109,7 @@ TRACE_EVENT_FORMAT(bprint, TRACE_BPRINT, bprint_entry, ignore,
TRACE_STRUCT(
TRACE_FIELD(unsigned long, ip, ip)
TRACE_FIELD(char *, fmt, fmt)
- TRACE_FIELD_ZERO_CHAR(buf)
+ TRACE_FIELD_ZERO(char, buf)
),
TP_RAW_FMT("%08lx (%d) fmt:%p %s")
);
@@ -117,7 +117,7 @@ TRACE_EVENT_FORMAT(bprint, TRACE_BPRINT, bprint_entry, ignore,
TRACE_EVENT_FORMAT(print, TRACE_PRINT, print_entry, ignore,
TRACE_STRUCT(
TRACE_FIELD(unsigned long, ip, ip)
- TRACE_FIELD_ZERO_CHAR(buf)
+ TRACE_FIELD_ZERO(char, buf)
),
TP_RAW_FMT("%08lx (%d) fmt:%p %s")
);
diff --git a/kernel/trace/trace_export.c b/kernel/trace/trace_export.c
index 71c8d7f..b0ac92c 100644
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -42,9 +42,9 @@ extern void __bad_type_size(void);
if (!ret) \
return 0;
-#undef TRACE_FIELD_ZERO_CHAR
-#define TRACE_FIELD_ZERO_CHAR(item) \
- ret = trace_seq_printf(s, "\tfield:char " #item ";\t" \
+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item) \
+ ret = trace_seq_printf(s, "\tfield:" #type " " #item ";\t" \
"offset:%u;\tsize:0;\n", \
(unsigned int)offsetof(typeof(field), item)); \
if (!ret) \
@@ -92,9 +92,6 @@ ftrace_format_##call(struct ftrace_event_call *unused, \
#include "trace_event_types.h"
-#undef TRACE_ZERO_CHAR
-#define TRACE_ZERO_CHAR(arg)
-
#undef TRACE_FIELD
#define TRACE_FIELD(type, item, assign)\
entry->item = assign;
@@ -107,6 +104,9 @@ ftrace_format_##call(struct ftrace_event_call *unused, \
#define TRACE_FIELD_SIGN(type, item, assign, is_signed) \
TRACE_FIELD(type, item, assign)
+#undef TRACE_FIELD_ZERO
+#define TRACE_FIELD_ZERO(type, item)
+
Is it about the move above?
If so, could you just say so, so that I can add something about
it to the changelog?
No, I assume that TRACE_ZERO_CHAR is just a typo for TRACE_FIELD_ZERO_CHAR
(because I couldn't find any other use of TRACE_ZERO_CHAR).

BTW, this patch may not be needed after applying patch 10/12, since
that patch removes the ftrace event definitions of TRACE_KPROBE/KRETPROBE.

Perhaps I had better merge and re-split those additional patches (and
drop this change)?
(That could also make the incremental review harder, though...)

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-13 20:57:20 UTC
Permalink
This script stress-tests kprobes by probing every symbol in the kernel and
finds the symbols which must be blacklisted.


Usage
-----
kprobestest [-s SYMLIST] [-b BLACKLIST] [-w WHITELIST]
Run stress test. If SYMLIST file is specified, use it as
an initial symbol list (This is useful for verifying white list
after diagnosing all symbols).

kprobestest cleanup
Clean up all lists.


How It Works
------------
This tool lists all symbols in the kernel via /proc/kallsyms and sorts
them into groups (64 symbols each by default). It then tests each group
using the kprobe tracer. If a kernel crash occurs, that group is moved
into the 'failed' dir. If the group passes the test, the script moves it
into the 'passed' dir and saves kprobe_profile into 'passed/profiles/'.
After all groups have been tested, the 'failed' groups are merged and
re-sorted into smaller groups (divided by 4 by default), and those are
tested again. This loop repeats until every group contains just 1 symbol.

Finally, the script sorts all 'passed' symbols into 'tested', 'untested',
and 'missed' based on profiles.
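
The narrowing loop described above can be sketched in user space as follows. This is only a simulation under stated assumptions: symbols are plain integers, `demo_crashes()` is a hypothetical predicate standing in for "probing this symbol crashes the kernel", "divided by 4" is read as dividing the group size by 4, and the reboot that a real crash forces between runs is ignored. `find_blacklist()` is an illustrative name, not part of the script.

```c
#include <assert.h>
#include <stddef.h>

#define INITIAL_GROUP	64	/* symbols per group on the first pass */
#define DIVISOR		4	/* group size shrinks by this factor each pass */

/* Hypothetical stand-in for "probing this symbol crashes the kernel". */
typedef int (*crash_fn)(int sym);

static int demo_crashes(int sym)
{
	return sym == 5 || sym == 130;
}

/*
 * Narrow 'syms' down to the symbols that fail on their own.
 * 'syms' is reused as scratch space; the result is copied to 'bad'.
 * Returns the number of symbols written to 'bad'.
 */
static size_t find_blacklist(int *syms, size_t n, crash_fn crashes, int *bad)
{
	size_t group = INITIAL_GROUP;

	assert(syms && bad && crashes);

	for (;;) {
		size_t failed = 0;

		for (size_t start = 0; start < n; start += group) {
			size_t end = start + group < n ? start + group : n;
			int group_failed = 0;

			/* A group fails if any symbol in it crashes. */
			for (size_t i = start; i < end; i++)
				if (crashes(syms[i]))
					group_failed = 1;

			/* Keep failed groups for the next, finer pass. */
			if (group_failed)
				for (size_t i = start; i < end; i++)
					syms[failed++] = syms[i];
		}

		n = failed;
		if (group == 1 || n == 0)
			break;
		group = group / DIVISOR ? group / DIVISOR : 1;
	}

	for (size_t i = 0; i < n; i++)
		bad[i] = syms[i];
	return n;
}
```

With the defaults the group size goes 64, 16, 4, 1, so each failing symbol is isolated after four passes. As the Note below says, a symbol that only fails in combination with another one would not be caught by this kind of per-group narrowing.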


Note
----
- This script only gives us clues about which functions should be
blacklisted. In some cases a combination of probe points causes a
problem even though none of them causes it alone.

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-20 18:43:34 UTC
Permalink
Post by Masami Hiramatsu
This script tests kprobes to probe on all symbols in the kernel and finds
symbols which must be blacklisted.
Usage
-----
kprobestest [-s SYMLIST] [-b BLACKLIST] [-w WHITELIST]
Run stress test. If SYMLIST file is specified, use it as
an initial symbol list (This is useful for verifying white list
after diagnosing all symbols).
kprobestest cleanup
Cleanup all lists
How to Work
-----------
This tool list up all symbols in the kernel via /proc/kallsyms, and sorts
it into groups (each of them including 64 symbols in default). And then,
it tests each group by using kprobe-tracer. If a kernel crash occurred,
that group is moved into 'failed' dir. If the group passed the test, this
script moves it into 'passed' dir and saves kprobe_profile into
'passed/profiles/'.
After testing all groups, all 'failed' groups are merged and sorted into
smaller groups (divided by 4, in default). And those are tested again.
This loop will be repeated until all group has just 1 symbol.
Finally, the script sorts all 'passed' symbols into 'tested', 'untested',
and 'missed' based on profiles.
Note
----
- This script just gives us some clues to the blacklisted functions.
In some cases, a combination of probe points will cause a problem, but
each of them doesn't cause the problem alone.
Thank you,
This script easily makes my x86-64 dual core lock up hard
on the first batch of symbols to test.
I have one symbol list in the failed and unset directories:

int_very_careful
int_signal
int_restore_rest
stub_clone
stub_fork
stub_vfork
stub_sigaltstack
stub_iopl
ptregscall_common
stub_execve
stub_rt_sigreturn
irq_entries_start
common_interrupt
ret_from_intr
exit_intr
retint_with_reschedule
retint_check
retint_swapgs
retint_restore_args
restore_args
irq_return
retint_careful
retint_signal
retint_kernel
irq_move_cleanup_interrupt
reboot_interrupt
apic_timer_interrupt
generic_interrupt
invalidate_interrupt0
invalidate_interrupt1
invalidate_interrupt2
invalidate_interrupt3
invalidate_interrupt4
invalidate_interrupt5
invalidate_interrupt6
invalidate_interrupt7
threshold_interrupt
thermal_interrupt
mce_self_interrupt
call_function_single_interrupt
call_function_interrupt
reschedule_interrupt
error_interrupt
spurious_interrupt
perf_pending_interrupt
divide_error
overflow
bounds
invalid_op
device_not_available
double_fault
coprocessor_segment_overrun
invalid_TSS
segment_not_present
spurious_interrupt_bug
coprocessor_error
alignment_check
simd_coprocessor_error
native_load_gs_index
gs_change
kernel_thread
child_rip
kernel_execve
call_softirq


I don't have a crash log because I was running under X.
It also happened with other batches of symbols.

The problem is that this box has no serial line, so I can't
capture a crash log.
My K7 testbox also died in my arms this afternoon.

But I still have two other testboxes (one P2 and one P3) to which
I can connect a serial line; hopefully I can reproduce the
problem on them.

I've pushed your patches in the following git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
tracing/kprobes

So you can send patches on top of this one.

Config in attachment:

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.31-rc5
# Thu Aug 20 19:35:39 2009
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_FAST_CMPXCHG_LOCAL=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
# CONFIG_RWSEM_XCHGADD_ALGORITHM is not set
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_HAVE_DYNAMIC_PER_CPU_AREA=y
CONFIG_HAVE_CPUMASK_OF_CPU_MAP=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_TRAMPOLINE=y
# CONFIG_KTIME_SCALAR is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
# CONFIG_TASK_DELAY_ACCT is not set
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_TREE=y

#
# RCU Subsystem
#
# CONFIG_CLASSIC_RCU is not set
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=64
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_PREEMPT_RCU_TRACE is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_USER_SCHED=y
# CONFIG_CGROUP_SCHED is not set
# CONFIG_CGROUPS is not set
# CONFIG_SYSFS_DEPRECATED_V2 is not set
CONFIG_RELAY=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_HAVE_PERF_COUNTERS=y

#
# Performance Counters
#
CONFIG_PERF_COUNTERS=y
CONFIG_EVENT_PROFILE=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_COMPAT_BRK is not set
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
# CONFIG_PROFILING is not set
CONFIG_TRACEPOINTS=y
CONFIG_MARKERS=y
CONFIG_HAVE_OPROFILE=y
CONFIG_KPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_KRETPROBES=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_ATTRS=y
CONFIG_HAVE_DMA_API_DEBUG=y

#
# GCOV-based kernel profiling
#
# CONFIG_GCOV_KERNEL is not set
# CONFIG_SLOW_WORK is not set
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
# CONFIG_MODULE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_BSG is not set
CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLOCK_COMPAT=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
# CONFIG_SPARSE_IRQ is not set
CONFIG_X86_MPPARSE=y
# CONFIG_X86_EXTENDED_PLATFORM is not set
CONFIG_SCHED_OMIT_FRAME_POINTER=y
# CONFIG_PARAVIRT_GUEST is not set
# CONFIG_MEMTEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_MPSC is not set
CONFIG_MCORE2=y
# CONFIG_GENERIC_CPU is not set
CONFIG_X86_CPU=y
CONFIG_X86_L1_CACHE_BYTES=64
CONFIG_X86_INTERNODE_CACHE_BYTES=64
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_P6_NOP=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=64
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
# CONFIG_X86_DS is not set
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_DMI=y
CONFIG_GART_IOMMU=y
# CONFIG_CALGARY_IOMMU is not set
# CONFIG_AMD_IOMMU is not set
CONFIG_SWIOTLB=y
CONFIG_IOMMU_HELPER=y
# CONFIG_IOMMU_API is not set
# CONFIG_MAXSMP is not set
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set
CONFIG_X86_MCE=y
CONFIG_X86_NEW_MCE=y
CONFIG_X86_MCE_INTEL=y
# CONFIG_X86_MCE_AMD is not set
CONFIG_X86_MCE_THRESHOLD=y
# CONFIG_X86_MCE_INJECT is not set
CONFIG_X86_THERMAL_VECTOR=y
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_MICROCODE_INTEL=y
# CONFIG_MICROCODE_AMD is not set
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
# CONFIG_X86_CPU_DEBUG is not set
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_DIRECT_GBPAGES=y
# CONFIG_NUMA is not set
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
# CONFIG_DISCONTIGMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y

#
# Memory hotplug is currently incompatible with Software Suspend
#
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_HAVE_MLOCK=y
CONFIG_HAVE_MLOCKED_PAGE_BIT=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_X86_RESERVE_LOW_64K=y
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_EFI=y
CONFIG_SECCOMP=y
# CONFIG_CC_STACKPROTECTOR is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_SCHED_HRTICK=y
# CONFIG_KEXEC is not set
CONFIG_CRASH_DUMP=y
CONFIG_PHYSICAL_START=0x200000
# CONFIG_RELOCATABLE is not set
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
CONFIG_COMPAT_VDSO=y
# CONFIG_CMDLINE_BOOL is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management and ACPI options
#
CONFIG_ARCH_HIBERNATION_HEADER=y
CONFIG_PM=y
CONFIG_PM_DEBUG=y
# CONFIG_PM_VERBOSE is not set
CONFIG_CAN_PM_TRACE=y
CONFIG_PM_TRACE=y
CONFIG_PM_TRACE_RTC=y
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM_SLEEP=y
CONFIG_SUSPEND=y
CONFIG_PM_TEST_SUSPEND=y
CONFIG_SUSPEND_FREEZER=y
CONFIG_HIBERNATION_NVS=y
CONFIG_HIBERNATION=y
CONFIG_PM_STD_PARTITION=""
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_PROCFS=y
CONFIG_ACPI_PROCFS_POWER=y
CONFIG_ACPI_SYSFS_POWER=y
CONFIG_ACPI_PROC_EVENT=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=""
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_PCI_SLOT=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=y
CONFIG_ACPI_SBS=y

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=y
# CONFIG_CPU_FREQ_STAT_DETAILS is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y

#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=y
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_P4_CLOCKMOD=y

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=y
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_GOV_LADDER=y
CONFIG_CPU_IDLE_GOV_MENU=y

#
# Memory power savings
#
CONFIG_I7300_IDLE_IOAT_CHANNEL=y
CONFIG_I7300_IDLE=y

#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_DOMAINS=y
# CONFIG_DMAR is not set
# CONFIG_INTR_REMAP is not set
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_PCIEAER=y
# CONFIG_PCIE_ECRC is not set
# CONFIG_PCIEAER_INJECT is not set
# CONFIG_PCIEASPM is not set
CONFIG_ARCH_SUPPORTS_MSI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY=y
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
CONFIG_HT_IRQ=y
# CONFIG_PCI_IOV is not set
CONFIG_ISA_DMA_API=y
CONFIG_K8_NB=y
CONFIG_PCCARD=y
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_PCMCIA=y
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_PCMCIA_IOCTL=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=y
CONFIG_YENTA_O2=y
CONFIG_YENTA_RICOH=y
CONFIG_YENTA_TI=y
CONFIG_YENTA_ENE_TUNE=y
CONFIG_YENTA_TOSHIBA=y
CONFIG_PD6729=y
CONFIG_I82092=y
CONFIG_PCCARD_NONSTATIC=y
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_FAKE=y
CONFIG_HOTPLUG_PCI_ACPI=y
CONFIG_HOTPLUG_PCI_ACPI_IBM=y
CONFIG_HOTPLUG_PCI_CPCI=y
CONFIG_HOTPLUG_PCI_CPCI_ZT5550=y
CONFIG_HOTPLUG_PCI_CPCI_GENERIC=y
CONFIG_HOTPLUG_PCI_SHPC=y

#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
# CONFIG_HAVE_AOUT is not set
CONFIG_BINFMT_MISC=y
CONFIG_IA32_EMULATION=y
CONFIG_IA32_AOUT=y
CONFIG_COMPAT=y
CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=y
# CONFIG_XFRM_SUB_POLICY is not set
# CONFIG_XFRM_MIGRATE is not set
# CONFIG_XFRM_STATISTICS is not set
CONFIG_XFRM_IPCOMP=y
CONFIG_NET_KEY=y
# CONFIG_NET_KEY_MIGRATE is not set
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
# CONFIG_IP_PNP_BOOTP is not set
# CONFIG_IP_PNP_RARP is not set
CONFIG_NET_IPIP=y
CONFIG_NET_IPGRE=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=y
CONFIG_INET_ESP=y
CONFIG_INET_IPCOMP=y
CONFIG_INET_XFRM_TUNNEL=y
CONFIG_INET_TUNNEL=y
CONFIG_INET_XFRM_MODE_TRANSPORT=y
CONFIG_INET_XFRM_MODE_TUNNEL=y
CONFIG_INET_XFRM_MODE_BEET=y
CONFIG_INET_LRO=y
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
CONFIG_TCP_CONG_ADVANCED=y
CONFIG_TCP_CONG_BIC=y
CONFIG_TCP_CONG_CUBIC=y
CONFIG_TCP_CONG_WESTWOOD=y
CONFIG_TCP_CONG_HTCP=y
CONFIG_TCP_CONG_HSTCP=y
CONFIG_TCP_CONG_HYBLA=y
CONFIG_TCP_CONG_VEGAS=y
CONFIG_TCP_CONG_SCALABLE=y
CONFIG_TCP_CONG_LP=y
CONFIG_TCP_CONG_VENO=y
CONFIG_TCP_CONG_YEAH=y
CONFIG_TCP_CONG_ILLINOIS=y
# CONFIG_DEFAULT_BIC is not set
CONFIG_DEFAULT_CUBIC=y
# CONFIG_DEFAULT_HTCP is not set
# CONFIG_DEFAULT_VEGAS is not set
# CONFIG_DEFAULT_WESTWOOD is not set
# CONFIG_DEFAULT_RENO is not set
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
# CONFIG_IPV6 is not set
CONFIG_NETLABEL=y
CONFIG_NETWORK_SECMARK=y
# CONFIG_NETFILTER is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_PHONET is not set
# CONFIG_IEEE802154 is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_TCPPROBE is not set
# CONFIG_NET_DROP_MONITOR is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_AF_RXRPC=y
# CONFIG_AF_RXRPC_DEBUG is not set
# CONFIG_RXKAD is not set
CONFIG_WIRELESS=y
CONFIG_CFG80211=y
# CONFIG_CFG80211_REG_DEBUG is not set
# CONFIG_CFG80211_DEBUGFS is not set
CONFIG_WIRELESS_OLD_REGULATORY=y
CONFIG_WIRELESS_EXT=y
CONFIG_WIRELESS_EXT_SYSFS=y
CONFIG_LIB80211=y
CONFIG_LIB80211_CRYPT_WEP=y
CONFIG_LIB80211_CRYPT_CCMP=y
CONFIG_LIB80211_CRYPT_TKIP=y
CONFIG_LIB80211_DEBUG=y
CONFIG_MAC80211=y
CONFIG_MAC80211_DEFAULT_PS=y
CONFIG_MAC80211_DEFAULT_PS_VALUE=1

#
# Rate control algorithm selection
#
CONFIG_MAC80211_RC_MINSTREL=y
# CONFIG_MAC80211_RC_DEFAULT_PID is not set
CONFIG_MAC80211_RC_DEFAULT_MINSTREL=y
CONFIG_MAC80211_RC_DEFAULT="minstrel"
CONFIG_MAC80211_LEDS=y
# CONFIG_MAC80211_DEBUGFS is not set
# CONFIG_MAC80211_DEBUG_MENU is not set
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
# CONFIG_STANDALONE is not set
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
# CONFIG_MTD is not set
# CONFIG_PARPORT is not set
CONFIG_PNP=y
CONFIG_PNP_DEBUG_MESSAGES=y

#
# Protocols
#
CONFIG_PNPACPI=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=y
# CONFIG_BLK_DEV_NBD is not set
CONFIG_BLK_DEV_SX8=y
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=65536
# CONFIG_BLK_DEV_XIP is not set
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
# CONFIG_BLK_DEV_HD is not set
# CONFIG_MISC_DEVICES is not set
# CONFIG_DELL_LAPTOP is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_TGT=y
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=y
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=y
# CONFIG_CHR_DEV_SCH is not set
# CONFIG_SCSI_MULTI_LUN is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_SCAN_ASYNC is not set
CONFIG_SCSI_WAIT_SCAN=m

#
# SCSI Transports
#
CONFIG_SCSI_SPI_ATTRS=y
CONFIG_SCSI_FC_ATTRS=y
CONFIG_SCSI_FC_TGT_ATTRS=y
CONFIG_SCSI_ISCSI_ATTRS=y
CONFIG_SCSI_SAS_ATTRS=y
CONFIG_SCSI_SAS_LIBSAS=y
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_SAS_HOST_SMP=y
# CONFIG_SCSI_SAS_LIBSAS_DEBUG is not set
CONFIG_SCSI_SRP_ATTRS=y
CONFIG_SCSI_SRP_TGT_ATTRS=y
CONFIG_SCSI_LOWLEVEL=y
# CONFIG_ISCSI_TCP is not set
# CONFIG_SCSI_BNX2_ISCSI is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_AIC94XX is not set
# CONFIG_SCSI_MVSAS is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_ARCMSR is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_MPT2SAS is not set
# CONFIG_SCSI_HPTIOP is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_LIBFC is not set
# CONFIG_LIBFCOE is not set
# CONFIG_FCOE is not set
# CONFIG_FCOE_FNIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_STEX is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_QLA_ISCSI is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_DEBUG is not set
# CONFIG_SCSI_SRP is not set
# CONFIG_SCSI_LOWLEVEL_PCMCIA is not set
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_ACPI=y
CONFIG_SATA_PMP=y
CONFIG_SATA_AHCI=y
CONFIG_SATA_SIL24=y
CONFIG_ATA_SFF=y
# CONFIG_SATA_SVW is not set
CONFIG_ATA_PIIX=y
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_PATA_ACPI is not set
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PCMCIA is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RZ1000 is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
# CONFIG_PATA_SCH is not set
# CONFIG_MD is not set
# CONFIG_FUSION is not set

#
# IEEE 1394 (FireWire) support
#

#
# You can enable one or both FireWire driver stacks.
#

#
# See the help texts for more information.
#
# CONFIG_FIREWIRE is not set
CONFIG_IEEE1394=y
CONFIG_IEEE1394_OHCI1394=y
CONFIG_IEEE1394_PCILYNX=y
CONFIG_IEEE1394_SBP2=y
# CONFIG_IEEE1394_SBP2_PHYS_DMA is not set
CONFIG_IEEE1394_ETH1394_ROM_ENTRY=y
CONFIG_IEEE1394_ETH1394=y
CONFIG_IEEE1394_RAWIO=y
CONFIG_IEEE1394_VIDEO1394=y
CONFIG_IEEE1394_DV1394=y
# CONFIG_IEEE1394_VERBOSEDEBUG is not set
# CONFIG_I2O is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=y
# CONFIG_BONDING is not set
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_VETH is not set
# CONFIG_NET_SB1000 is not set
# CONFIG_ARCNET is not set
# CONFIG_NET_ETHERNET is not set
CONFIG_MII=y
CONFIG_NETDEV_1000=y
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_E1000E is not set
# CONFIG_IP1000 is not set
# CONFIG_IGB is not set
# CONFIG_IGBVF is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
CONFIG_R8169=y
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set
# CONFIG_CNIC is not set
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_JME is not set
# CONFIG_NETDEV_10000 is not set
# CONFIG_TR is not set

#
# Wireless LAN
#
# CONFIG_WLAN_PRE80211 is not set
CONFIG_WLAN_80211=y
# CONFIG_PCMCIA_RAYCS is not set
# CONFIG_LIBERTAS is not set
# CONFIG_LIBERTAS_THINFIRM is not set
# CONFIG_AIRO is not set
# CONFIG_ATMEL is not set
# CONFIG_AIRO_CS is not set
# CONFIG_PCMCIA_WL3501 is not set
# CONFIG_PRISM54 is not set
# CONFIG_RTL8180 is not set
# CONFIG_ADM8211 is not set
# CONFIG_MAC80211_HWSIM is not set
# CONFIG_MWL8K is not set
# CONFIG_P54_COMMON is not set
CONFIG_ATH_COMMON=y
CONFIG_ATH5K=y
# CONFIG_ATH5K_DEBUG is not set
# CONFIG_ATH9K is not set
# CONFIG_IPW2100 is not set
# CONFIG_IPW2200 is not set
# CONFIG_IWLWIFI is not set
CONFIG_HOSTAP=y
# CONFIG_HOSTAP_FIRMWARE is not set
# CONFIG_HOSTAP_PLX is not set
# CONFIG_HOSTAP_PCI is not set
# CONFIG_HOSTAP_CS is not set
# CONFIG_B43 is not set
# CONFIG_B43LEGACY is not set
# CONFIG_RT2X00 is not set
# CONFIG_HERMES is not set

#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
# CONFIG_NET_PCMCIA is not set
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_POLLDEV=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_KEYBOARD_LKKBD=y
# CONFIG_KEYBOARD_LM8323 is not set
CONFIG_KEYBOARD_NEWTON=y
CONFIG_KEYBOARD_STOWAWAY=y
CONFIG_KEYBOARD_SUNKBD=y
CONFIG_KEYBOARD_XTKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_PS2_ALPS=y
CONFIG_MOUSE_PS2_LOGIPS2PP=y
CONFIG_MOUSE_PS2_SYNAPTICS=y
CONFIG_MOUSE_PS2_LIFEBOOK=y
CONFIG_MOUSE_PS2_TRACKPOINT=y
# CONFIG_MOUSE_PS2_ELANTECH is not set
# CONFIG_MOUSE_PS2_TOUCHKIT is not set
CONFIG_MOUSE_SERIAL=y
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_MOUSE_SYNAPTICS_I2C is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
CONFIG_SERIO_CT82C710=y
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=y
# CONFIG_GAMEPORT is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_DEVKMEM=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
# CONFIG_ROCKETPORT is not set
# CONFIG_CYCLADES is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_ISI is not set
# CONFIG_SYNCLINK is not set
# CONFIG_SYNCLINKMP is not set
# CONFIG_SYNCLINK_GT is not set
# CONFIG_N_HDLC is not set
# CONFIG_RISCOM8 is not set
# CONFIG_SPECIALIX is not set
# CONFIG_SX is not set
# CONFIG_RIO is not set
# CONFIG_STALDRV is not set
# CONFIG_NOZOMI is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_CONSOLE is not set
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
# CONFIG_SERIAL_8250_CS is not set
CONFIG_SERIAL_8250_NR_UARTS=48
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_IPMI_HANDLER is not set
CONFIG_HW_RANDOM=y
# CONFIG_HW_RANDOM_TIMERIOMEM is not set
CONFIG_HW_RANDOM_INTEL=y
# CONFIG_HW_RANDOM_AMD is not set
CONFIG_HW_RANDOM_VIA=y
CONFIG_NVRAM=y
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set

#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
# CONFIG_CARDMAN_4000 is not set
# CONFIG_CARDMAN_4040 is not set
# CONFIG_IPWIRELESS is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
CONFIG_RAW_DRIVER=y
CONFIG_MAX_RAW_DEVS=256
CONFIG_HPET=y
# CONFIG_HPET_MMAP is not set
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
CONFIG_I2C=y
CONFIG_I2C_BOARDINFO=y
# CONFIG_I2C_CHARDEV is not set
# CONFIG_I2C_HELPER_AUTO is not set

#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=y
CONFIG_I2C_ALGOPCF=y
CONFIG_I2C_ALGOPCA=y

#
# I2C Hardware Bus support
#

#
# PC SMBus host controller drivers
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
CONFIG_I2C_I801=y
# CONFIG_I2C_ISCH is not set
CONFIG_I2C_PIIX4=y
CONFIG_I2C_NFORCE2=y
# CONFIG_I2C_NFORCE2_S4985 is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set

#
# I2C system bus drivers (mostly embedded / system-on-chip)
#
CONFIG_I2C_OCORES=y
# CONFIG_I2C_SIMTEC is not set

#
# External I2C/SMBus adapter drivers
#
# CONFIG_I2C_PARPORT_LIGHT is not set
# CONFIG_I2C_TAOS_EVM is not set

#
# Graphics adapter I2C/DDC channel drivers
#
# CONFIG_I2C_VOODOO3 is not set

#
# Other I2C/SMBus bus drivers
#
# CONFIG_I2C_PCA_PLATFORM is not set
# CONFIG_I2C_STUB is not set

#
# Miscellaneous I2C Chip support
#
# CONFIG_DS1682 is not set
# CONFIG_SENSORS_PCF8574 is not set
# CONFIG_PCF8575 is not set
# CONFIG_SENSORS_PCA9539 is not set
# CONFIG_SENSORS_TSL2550 is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
# CONFIG_SPI is not set

#
# PPS support
#
# CONFIG_PPS is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
# CONFIG_GPIOLIB is not set
CONFIG_W1=y
# CONFIG_W1_CON is not set

#
# 1-wire Bus Masters
#
# CONFIG_W1_MASTER_MATROX is not set
# CONFIG_W1_MASTER_DS2482 is not set

#
# 1-wire Slaves
#
# CONFIG_W1_SLAVE_THERM is not set
# CONFIG_W1_SLAVE_SMEM is not set
# CONFIG_W1_SLAVE_DS2431 is not set
# CONFIG_W1_SLAVE_DS2433 is not set
CONFIG_W1_SLAVE_DS2760=y
# CONFIG_W1_SLAVE_BQ27000 is not set
CONFIG_POWER_SUPPLY=y
# CONFIG_POWER_SUPPLY_DEBUG is not set
# CONFIG_PDA_POWER is not set
CONFIG_BATTERY_DS2760=y
# CONFIG_BATTERY_DS2782 is not set
CONFIG_BATTERY_BQ27x00=y
# CONFIG_BATTERY_MAX17040 is not set
# CONFIG_HWMON is not set
CONFIG_THERMAL=y
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
# CONFIG_SOFT_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
# CONFIG_ALIM1535_WDT is not set
# CONFIG_ALIM7101_WDT is not set
# CONFIG_SC520_WDT is not set
# CONFIG_EUROTECH_WDT is not set
# CONFIG_IB700_WDT is not set
# CONFIG_IBMASR is not set
# CONFIG_WAFER_WDT is not set
CONFIG_I6300ESB_WDT=y
CONFIG_ITCO_WDT=y
CONFIG_ITCO_VENDOR_SUPPORT=y
# CONFIG_IT8712F_WDT is not set
# CONFIG_IT87_WDT is not set
# CONFIG_HP_WATCHDOG is not set
# CONFIG_SC1200_WDT is not set
# CONFIG_PC87413_WDT is not set
# CONFIG_60XX_WDT is not set
# CONFIG_SBC8360_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_SMSC_SCH311X_WDT is not set
# CONFIG_SMSC37B787_WDT is not set
# CONFIG_W83627HF_WDT is not set
# CONFIG_W83697HF_WDT is not set
# CONFIG_W83697UG_WDT is not set
# CONFIG_W83877F_WDT is not set
# CONFIG_W83977F_WDT is not set
# CONFIG_MACHZ_WDT is not set
# CONFIG_SBC_EPX_C3_WATCHDOG is not set

#
# PCI-based Watchdog Cards
#
# CONFIG_PCIPCWATCHDOG is not set
# CONFIG_WDTPCI is not set
CONFIG_SSB_POSSIBLE=y

#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set

#
# Multifunction device drivers
#
# CONFIG_MFD_CORE is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_TWL4030_CORE is not set
# CONFIG_MFD_TMIO is not set
# CONFIG_PMIC_DA903X is not set
# CONFIG_MFD_WM8400 is not set
# CONFIG_MFD_WM8350_I2C is not set
# CONFIG_MFD_PCF50633 is not set
# CONFIG_AB3100_CORE is not set
# CONFIG_REGULATOR is not set
# CONFIG_MEDIA_SUPPORT is not set

#
# Graphics support
#
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_VIA is not set
CONFIG_DRM=y
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_R128 is not set
# CONFIG_DRM_RADEON is not set
CONFIG_DRM_I810=y
CONFIG_DRM_I830=y
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
# CONFIG_DRM_SAVAGE is not set
CONFIG_VGASTATE=y
CONFIG_VIDEO_OUTPUT_CONTROL=y
CONFIG_FB=y
CONFIG_FIRMWARE_EDID=y
# CONFIG_FB_DDC is not set
# CONFIG_FB_BOOT_VESA_SUPPORT is not set
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set
# CONFIG_FB_SYS_FILLRECT is not set
# CONFIG_FB_SYS_COPYAREA is not set
# CONFIG_FB_SYS_IMAGEBLIT is not set
# CONFIG_FB_FOREIGN_ENDIAN is not set
# CONFIG_FB_SYS_FOPS is not set
# CONFIG_FB_SVGALIB is not set
# CONFIG_FB_MACMODES is not set
# CONFIG_FB_BACKLIGHT is not set
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y

#
# Frame buffer hardware drivers
#
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=y
CONFIG_FB_UVESA=y
# CONFIG_FB_VESA is not set
# CONFIG_FB_EFI is not set
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
CONFIG_FB_LE80578=y
CONFIG_FB_CARILLO_RANCH=y
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_VIA is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_GEODE is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
# CONFIG_FB_BROADSHEET is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_LCD_CLASS_DEVICE=y
# CONFIG_LCD_ILI9320 is not set
CONFIG_LCD_PLATFORM=y
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_BACKLIGHT_GENERIC=y
CONFIG_BACKLIGHT_PROGEAR=y
CONFIG_BACKLIGHT_CARILLO_RANCH=y
CONFIG_BACKLIGHT_MBP_NVIDIA=y
# CONFIG_BACKLIGHT_SAHARA is not set

#
# Display device support
#
CONFIG_DISPLAY_SUPPORT=y

#
# Display hardware drivers
#

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
CONFIG_FONTS=y
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_FONT_6x11 is not set
# CONFIG_FONT_7x14 is not set
# CONFIG_FONT_PEARL_8x8 is not set
# CONFIG_FONT_ACORN_8x8 is not set
# CONFIG_FONT_MINI_4x6 is not set
# CONFIG_FONT_SUN8x16 is not set
# CONFIG_FONT_SUN12x22 is not set
CONFIG_FONT_10x18=y
CONFIG_LOGO=y
# CONFIG_LOGO_LINUX_MONO is not set
CONFIG_LOGO_LINUX_VGA16=y
# CONFIG_LOGO_LINUX_CLUT224 is not set
CONFIG_SOUND=y
CONFIG_SOUND_OSS_CORE=y
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_HWDEP=y
CONFIG_SND_SEQUENCER=y
# CONFIG_SND_SEQ_DUMMY is not set
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=y
CONFIG_SND_PCM_OSS=y
CONFIG_SND_PCM_OSS_PLUGINS=y
CONFIG_SND_SEQUENCER_OSS=y
# CONFIG_SND_HRTIMER is not set
# CONFIG_SND_DYNAMIC_MINORS is not set
CONFIG_SND_SUPPORT_OLD_API=y
CONFIG_SND_VERBOSE_PROCFS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_VMASTER=y
# CONFIG_SND_RAWMIDI_SEQ is not set
# CONFIG_SND_OPL3_LIB_SEQ is not set
# CONFIG_SND_OPL4_LIB_SEQ is not set
# CONFIG_SND_SBAWE_SEQ is not set
# CONFIG_SND_EMU10K1_SEQ is not set
CONFIG_SND_DRIVERS=y
# CONFIG_SND_PCSP is not set
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
CONFIG_SND_PCI=y
# CONFIG_SND_AD1889 is not set
# CONFIG_SND_ALS300 is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AW2 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CA0106 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_OXYGEN is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS5530 is not set
# CONFIG_SND_CTXFI is not set
# CONFIG_SND_DARLA20 is not set
# CONFIG_SND_GINA20 is not set
# CONFIG_SND_LAYLA20 is not set
# CONFIG_SND_DARLA24 is not set
# CONFIG_SND_GINA24 is not set
# CONFIG_SND_LAYLA24 is not set
# CONFIG_SND_MONA is not set
# CONFIG_SND_MIA is not set
# CONFIG_SND_ECHO3G is not set
# CONFIG_SND_INDIGO is not set
# CONFIG_SND_INDIGOIO is not set
# CONFIG_SND_INDIGODJ is not set
# CONFIG_SND_INDIGOIOX is not set
# CONFIG_SND_INDIGODJX is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_EMU10K1X is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_FM801 is not set
CONFIG_SND_HDA_INTEL=y
CONFIG_SND_HDA_HWDEP=y
# CONFIG_SND_HDA_RECONFIG is not set
# CONFIG_SND_HDA_INPUT_BEEP is not set
# CONFIG_SND_HDA_INPUT_JACK is not set
CONFIG_SND_HDA_CODEC_REALTEK=y
CONFIG_SND_HDA_CODEC_ANALOG=y
CONFIG_SND_HDA_CODEC_SIGMATEL=y
CONFIG_SND_HDA_CODEC_VIA=y
CONFIG_SND_HDA_CODEC_ATIHDMI=y
CONFIG_SND_HDA_CODEC_NVHDMI=y
CONFIG_SND_HDA_CODEC_INTELHDMI=y
CONFIG_SND_HDA_ELD=y
CONFIG_SND_HDA_CODEC_CONEXANT=y
CONFIG_SND_HDA_CODEC_CA0110=y
CONFIG_SND_HDA_CODEC_CMEDIA=y
CONFIG_SND_HDA_CODEC_SI3054=y
CONFIG_SND_HDA_GENERIC=y
# CONFIG_SND_HDA_POWER_SAVE is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_HDSPM is not set
# CONFIG_SND_HIFIER is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_LX6464ES is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_PCXHR is not set
# CONFIG_SND_RIPTIDE is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VIA82XX_MODEM is not set
# CONFIG_SND_VIRTUOSO is not set
# CONFIG_SND_VX222 is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_PCMCIA is not set
# CONFIG_SND_SOC is not set
CONFIG_SOUND_PRIME=y
CONFIG_SOUND_OSS=y
# CONFIG_SOUND_TRACEINIT is not set
# CONFIG_SOUND_DMAP is not set
# CONFIG_SOUND_SSCAPE is not set
# CONFIG_SOUND_VMIDI is not set
# CONFIG_SOUND_TRIX is not set
# CONFIG_SOUND_MSS is not set
# CONFIG_SOUND_MPU401 is not set
# CONFIG_SOUND_PAS is not set
# CONFIG_SOUND_PSS is not set
# CONFIG_SOUND_SB is not set
# CONFIG_SOUND_YM3812 is not set
# CONFIG_SOUND_UART6850 is not set
# CONFIG_SOUND_AEDSP16 is not set
# CONFIG_HID_SUPPORT is not set
# CONFIG_USB_SUPPORT is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y

#
# LED drivers
#
# CONFIG_LEDS_ALIX2 is not set
# CONFIG_LEDS_PCA9532 is not set
# CONFIG_LEDS_LP3944 is not set
# CONFIG_LEDS_CLEVO_MAIL is not set
# CONFIG_LEDS_PCA955X is not set
# CONFIG_LEDS_BD2802 is not set

#
# LED Triggers
#
CONFIG_LEDS_TRIGGERS=y
# CONFIG_LEDS_TRIGGER_TIMER is not set
# CONFIG_LEDS_TRIGGER_HEARTBEAT is not set
# CONFIG_LEDS_TRIGGER_BACKLIGHT is not set
# CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set

#
# iptables trigger is under Netfilter config (LED target)
#
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
CONFIG_RTC_LIB=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
# CONFIG_RTC_DEBUG is not set

#
# RTC interfaces
#
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
CONFIG_RTC_INTF_DEV_UIE_EMUL=y
CONFIG_RTC_DRV_TEST=y

#
# I2C RTC drivers
#
CONFIG_RTC_DRV_DS1307=y
CONFIG_RTC_DRV_DS1374=y
CONFIG_RTC_DRV_DS1672=y
# CONFIG_RTC_DRV_MAX6900 is not set
CONFIG_RTC_DRV_RS5C372=y
CONFIG_RTC_DRV_ISL1208=y
CONFIG_RTC_DRV_X1205=y
CONFIG_RTC_DRV_PCF8563=y
CONFIG_RTC_DRV_PCF8583=y
CONFIG_RTC_DRV_M41T80=y
CONFIG_RTC_DRV_M41T80_WDT=y
# CONFIG_RTC_DRV_S35390A is not set
CONFIG_RTC_DRV_FM3130=y
# CONFIG_RTC_DRV_RX8581 is not set
# CONFIG_RTC_DRV_RX8025 is not set

#
# SPI RTC drivers
#

#
# Platform RTC drivers
#
CONFIG_RTC_DRV_CMOS=y
# CONFIG_RTC_DRV_DS1286 is not set
# CONFIG_RTC_DRV_DS1511 is not set
CONFIG_RTC_DRV_DS1553=y
CONFIG_RTC_DRV_DS1742=y
CONFIG_RTC_DRV_STK17TA8=y
CONFIG_RTC_DRV_M48T86=y
# CONFIG_RTC_DRV_M48T35 is not set
CONFIG_RTC_DRV_M48T59=y
# CONFIG_RTC_DRV_BQ4802 is not set
CONFIG_RTC_DRV_V3020=y

#
# on-CPU RTC drivers
#
CONFIG_DMADEVICES=y

#
# DMA Devices
#
CONFIG_INTEL_IOATDMA=y
CONFIG_DMA_ENGINE=y

#
# DMA Clients
#
CONFIG_NET_DMA=y
# CONFIG_ASYNC_TX_DMA is not set
# CONFIG_DMATEST is not set
CONFIG_DCA=y
# CONFIG_AUXDISPLAY is not set
CONFIG_UIO=y
CONFIG_UIO_CIF=y
CONFIG_UIO_PDRV=y
CONFIG_UIO_PDRV_GENIRQ=y
CONFIG_UIO_SMX=y
# CONFIG_UIO_AEC is not set
# CONFIG_UIO_SERCOS3 is not set

#
# TI VLYNQ
#
# CONFIG_STAGING is not set
CONFIG_X86_PLATFORM_DEVICES=y
CONFIG_ACER_WMI=y
# CONFIG_ASUS_LAPTOP is not set
# CONFIG_DELL_WMI is not set
CONFIG_FUJITSU_LAPTOP=y
# CONFIG_FUJITSU_LAPTOP_DEBUG is not set
# CONFIG_HP_WMI is not set
# CONFIG_MSI_LAPTOP is not set
# CONFIG_PANASONIC_LAPTOP is not set
# CONFIG_COMPAL_LAPTOP is not set
# CONFIG_THINKPAD_ACPI is not set
# CONFIG_INTEL_MENLOW is not set
# CONFIG_EEEPC_LAPTOP is not set
CONFIG_ACPI_WMI=y
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set

#
# Firmware Drivers
#
CONFIG_EDD=y
CONFIG_EDD_OFF=y
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_EFI_VARS=y
CONFIG_DELL_RBU=y
CONFIG_DCDBAS=y
CONFIG_DMIID=y
CONFIG_ISCSI_IBFT_FIND=y
CONFIG_ISCSI_IBFT=y

#
# File systems
#
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
# CONFIG_EXT4_FS is not set
CONFIG_REISERFS_FS=y
CONFIG_REISERFS_CHECK=y
# CONFIG_REISERFS_PROC_INFO is not set
CONFIG_REISERFS_FS_XATTR=y
CONFIG_REISERFS_FS_POSIX_ACL=y
CONFIG_REISERFS_FS_SECURITY=y
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_BTRFS_FS is not set
CONFIG_FILE_LOCKING=y
CONFIG_FSNOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
# CONFIG_QUOTA is not set
CONFIG_AUTOFS_FS=y
CONFIG_AUTOFS4_FS=y
# CONFIG_FUSE_FS is not set
CONFIG_GENERIC_ACL=y

#
# Caches
#
# CONFIG_FSCACHE is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
# CONFIG_MSDOS_FS is not set
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=y
# CONFIG_NTFS_DEBUG is not set
CONFIG_NTFS_RW=y

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_VMCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=y
CONFIG_MISC_FILESYSTEMS=y
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_ECRYPT_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=y
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
CONFIG_ROMFS_FS=y
CONFIG_ROMFS_BACKED_BY_BLOCK=y
# CONFIG_ROMFS_BACKED_BY_MTD is not set
# CONFIG_ROMFS_BACKED_BY_BOTH is not set
CONFIG_ROMFS_ON_BLOCK=y
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_NETWORK_FILESYSTEMS is not set

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
CONFIG_ACORN_PARTITION=y
# CONFIG_ACORN_PARTITION_CUMANA is not set
# CONFIG_ACORN_PARTITION_EESOX is not set
CONFIG_ACORN_PARTITION_ICS=y
# CONFIG_ACORN_PARTITION_ADFS is not set
# CONFIG_ACORN_PARTITION_POWERTEC is not set
CONFIG_ACORN_PARTITION_RISCIX=y
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
# CONFIG_MINIX_SUBPARTITION is not set
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
CONFIG_LDM_PARTITION=y
# CONFIG_LDM_DEBUG is not set
CONFIG_SGI_PARTITION=y
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
CONFIG_EFI_PARTITION=y
# CONFIG_SYSV68_PARTITION is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="cp437"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
CONFIG_NLS_CODEPAGE_850=y
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=y
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
CONFIG_NLS_UTF8=y
# CONFIG_DLM is not set

#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_PRINTK_TIME=y
# CONFIG_ENABLE_WARN_DEPRECATED is not set
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_FRAME_WARN=2048
# CONFIG_MAGIC_SYSRQ is not set
CONFIG_UNUSED_SYMBOLS=y
CONFIG_DEBUG_FS=y
# CONFIG_HEADERS_CHECK is not set
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SHIRQ is not set
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
# CONFIG_SCHED_DEBUG is not set
# CONFIG_SCHEDSTATS is not set
# CONFIG_TIMER_STATS is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_KMEMLEAK is not set
CONFIG_DEBUG_PREEMPT=y
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_LOCKDEP is not set
CONFIG_TRACE_IRQFLAGS=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
# CONFIG_DEBUG_WRITECOUNT is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
# CONFIG_KPROBES_SANITY_TEST is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_LKDTM is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
# CONFIG_SYSCTL_SYSCALL_CHECK is not set
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FTRACE_NMI_ENTER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_FTRACE_SYSCALLS=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_RING_BUFFER=y
CONFIG_FTRACE_NMI_ENTER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_IRQSOFF_TRACER=y
CONFIG_PREEMPT_TRACER=y
# CONFIG_SYSPROF_TRACER is not set
# CONFIG_SCHED_TRACER is not set
CONFIG_FTRACE_SYSCALLS=y
# CONFIG_BOOT_TRACER is not set
CONFIG_BRANCH_PROFILE_NONE=y
# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
# CONFIG_PROFILE_ALL_BRANCHES is not set
# CONFIG_POWER_TRACER is not set
# CONFIG_STACK_TRACER is not set
# CONFIG_KMEMTRACE is not set
# CONFIG_WORKQUEUE_TRACER is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
CONFIG_KPROBE_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
# CONFIG_FUNCTION_PROFILER is not set
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE_SELFTEST=y
CONFIG_FTRACE_STARTUP_TEST=y
# CONFIG_MMIOTRACE is not set
# CONFIG_RING_BUFFER_BENCHMARK is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_HAVE_ARCH_KMEMCHECK=y
# CONFIG_STRICT_DEVMEM is not set
# CONFIG_X86_VERBOSE_BOOTUP is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PER_CPU_MAPS is not set
# CONFIG_X86_PTDUMP is not set
CONFIG_DEBUG_RODATA=y
CONFIG_DEBUG_RODATA_TEST=y
# CONFIG_DEBUG_NX_TEST is not set
# CONFIG_IOMMU_DEBUG is not set
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_X86_DECODER_SELFTEST=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
# CONFIG_IO_DELAY_0X80 is not set
CONFIG_IO_DELAY_0XED=y
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=1
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
# CONFIG_OPTIMIZE_INLINING is not set

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_DEBUG_PROC_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITYFS is not set
CONFIG_SECURITY_NETWORK=y
# CONFIG_SECURITY_NETWORK_XFRM is not set
# CONFIG_SECURITY_PATH is not set
CONFIG_SECURITY_FILE_CAPABILITIES=y
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=0
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1
# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
CONFIG_SECURITY_SMACK=y
# CONFIG_SECURITY_TOMOYO is not set
# CONFIG_IMA is not set
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
# CONFIG_CRYPTO_FIPS is not set
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_GF128MUL=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_WORKQUEUE=y
CONFIG_CRYPTO_CRYPTD=y
CONFIG_CRYPTO_AUTHENC=y
# CONFIG_CRYPTO_TEST is not set

#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=y
CONFIG_CRYPTO_SEQIV=y

#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=y
CONFIG_CRYPTO_CTS=y
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=y
CONFIG_CRYPTO_PCBC=y
CONFIG_CRYPTO_XTS=y

#
# Hash modes
#
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=y

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32C_INTEL is not set
CONFIG_CRYPTO_MD4=y
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=y
CONFIG_CRYPTO_RMD128=y
CONFIG_CRYPTO_RMD160=y
CONFIG_CRYPTO_RMD256=y
CONFIG_CRYPTO_RMD320=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
CONFIG_CRYPTO_TGR192=y
CONFIG_CRYPTO_WP512=y

#
# Ciphers
#
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_X86_64=y
# CONFIG_CRYPTO_AES_NI_INTEL is not set
CONFIG_CRYPTO_ANUBIS=y
CONFIG_CRYPTO_ARC4=y
CONFIG_CRYPTO_BLOWFISH=y
CONFIG_CRYPTO_CAMELLIA=y
CONFIG_CRYPTO_CAST5=y
CONFIG_CRYPTO_CAST6=y
CONFIG_CRYPTO_DES=y
CONFIG_CRYPTO_FCRYPT=y
CONFIG_CRYPTO_KHAZAD=y
CONFIG_CRYPTO_SALSA20=y
CONFIG_CRYPTO_SALSA20_X86_64=y
CONFIG_CRYPTO_SEED=y
CONFIG_CRYPTO_SERPENT=y
CONFIG_CRYPTO_TEA=y
CONFIG_CRYPTO_TWOFISH=y
CONFIG_CRYPTO_TWOFISH_COMMON=y
CONFIG_CRYPTO_TWOFISH_X86_64=y

#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
# CONFIG_CRYPTO_ZLIB is not set
CONFIG_CRYPTO_LZO=y

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
CONFIG_CRYPTO_HW=y
# CONFIG_CRYPTO_DEV_PADLOCK is not set
CONFIG_CRYPTO_DEV_HIFN_795X=y
CONFIG_CRYPTO_DEV_HIFN_795X_RNG=y
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
# CONFIG_VIRTUALIZATION is not set
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_FIND_LAST_BIT=y
CONFIG_CRC_CCITT=y
CONFIG_CRC16=y
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
CONFIG_CRC7=y
CONFIG_LIBCRC32C=y
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Masami Hiramatsu
2009-08-20 19:45:18 UTC
Post by Frederic Weisbecker
Post by Masami Hiramatsu
This script tests kprobes to probe on all symbols in the kernel and finds
symbols which must be blacklisted.
Usage
-----
kprobestest [-s SYMLIST] [-b BLACKLIST] [-w WHITELIST]
Run stress test. If SYMLIST file is specified, use it as
an initial symbol list (This is useful for verifying white list
after diagnosing all symbols).
kprobestest cleanup
Cleanup all lists
How It Works
------------
This tool lists all symbols in the kernel via /proc/kallsyms and sorts
them into groups (64 symbols each by default). It then tests each group
using kprobe-tracer. If the kernel crashes, that group is moved into the
'failed' dir. If the group passes the test, the script moves it into the
'passed' dir and saves kprobe_profile into 'passed/profiles/'.
After all groups have been tested, the 'failed' groups are merged and
re-sorted into smaller groups (group size divided by 4, by default), which
are then tested again. This loop repeats until every group has just 1 symbol.
Finally, the script sorts all 'passed' symbols into 'tested', 'untested',
and 'missed' based on the profiles.
Note
----
- This script only gives us clues about which functions should be blacklisted.
In some cases a combination of probe points causes a problem even though
none of them causes it alone.
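The grouping-and-retry loop described above can be sketched roughly as follows. This is not the real kprobestest script: the function names (split_groups, find_culprits) and the test_group predicate are hypothetical, and a failing return status stands in for a kernel crash, so the reboot handling and the 'passed'/'failed' directories are left out.

```sh
#!/bin/sh
# Hypothetical sketch of the group/regroup loop; not the real kprobestest.

# split_groups SIZE: read one symbol per line on stdin and print one
# space-separated group of at most SIZE symbols per output line.
split_groups() {
    awk -v n="$1" '
        { printf "%s%s", $0, ((NR % n == 0) ? "\n" : " ") }
        END { if (NR % n) printf "\n" }'
}

# find_culprits SIZE: read symbols on stdin and print those that still
# fail when tested alone.  test_group is supplied by the caller (an
# assumption of this sketch) and must return non-zero when probing the
# given symbols "crashes".
find_culprits() {
    size=$1
    syms=$(cat)
    while :; do
        failed=""
        while read -r group; do
            [ -n "$group" ] || continue
            # $group is left unquoted on purpose: one argument per symbol
            test_group $group || failed="$failed $group"
        done <<EOF
$(printf '%s\n' $syms | split_groups "$size")
EOF
        # Stop when nothing failed or groups are already single symbols
        if [ -z "$failed" ] || [ "$size" -le 1 ]; then
            break
        fi
        size=$(( size / 4 ))        # shrink the groups by 4, as described
        [ "$size" -ge 1 ] || size=1
        syms=$failed
    done
    if [ -n "$failed" ]; then
        printf '%s\n' $failed
    fi
}
```

With a test_group that fails whenever a group contains a bad symbol, piping a symbol list into `find_culprits 64` narrows the list down to the symbols that fail on their own. As the note above says, a problem caused only by a combination of probes would not be found this way.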
Thank you,
This script easily hard-locks my x86-64 dual core
on the 1st batch of symbols tested.
int_very_careful
int_signal
int_restore_rest
stub_clone
stub_fork
stub_vfork
stub_sigaltstack
stub_iopl
ptregscall_common
stub_execve
stub_rt_sigreturn
irq_entries_start
common_interrupt
ret_from_intr
exit_intr
retint_with_reschedule
retint_check
retint_swapgs
retint_restore_args
restore_args
irq_return
retint_careful
retint_signal
retint_kernel
irq_move_cleanup_interrupt
reboot_interrupt
apic_timer_interrupt
generic_interrupt
invalidate_interrupt0
invalidate_interrupt1
invalidate_interrupt2
invalidate_interrupt3
invalidate_interrupt4
invalidate_interrupt5
invalidate_interrupt6
invalidate_interrupt7
threshold_interrupt
thermal_interrupt
mce_self_interrupt
call_function_single_interrupt
call_function_interrupt
reschedule_interrupt
error_interrupt
spurious_interrupt
perf_pending_interrupt
divide_error
overflow
bounds
invalid_op
device_not_available
double_fault
coprocessor_segment_overrun
invalid_TSS
segment_not_present
spurious_interrupt_bug
coprocessor_error
alignment_check
simd_coprocessor_error
native_load_gs_index
gs_change
kernel_thread
child_rip
kernel_execve
call_softirq
I don't have a crash log because I was running X.
But it also happened with other batches of symbols.
Thank you for reporting. I also have a result from
testing on ***@x86-64.

native_read_tscp
native_read_msr_safe
native_read_msr_amd_safe
native_write_msr_safe
vmalloc_fault
spurious_fault
search_exception_tables
notify_die
trace_hardirqs_off_caller
ident_complete
lock_acquire
lock_release
bad_address
secondary_startup_64
stack_start
bad_address
restore_args
irq_return
restore
trace_hardirqs_off_thunk
init_level4_pgt
level3_ident_pgt
level3_kernel_pgt
level2_fixmap_pgt
_text
startup_64
level1_fixmap_pgt
level2_ident_pgt
level2_kernel_pgt
level2_spare_pgt
native_get_debugreg
native_set_debugreg
native_set_iopl_mask
native_load_sp0
debug_show_all_locks
debug_check_no_locks_held
valid_state
mark_lock
mark_held_locks
lockdep_trace_alloc
trace_softirqs_on
trace_hardirqs_on_caller
__down_write
__down_read
trace_hardirqs_on_thunk
lockdep_sys_exit_thunk

Most of them can be fixed just by adding __kprobes.
For those that are already in another section, kprobes
should check whether the symbol is in that section.
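As a rough user-space illustration of such a section check, the sketch below tests whether a symbol's address lies inside the .kprobes.text range. It assumes that /proc/kallsyms exports the __kprobes_text_start and __kprobes_text_end markers and prints real, fixed-width lowercase hex addresses; the function name in_kprobes_text is hypothetical, and this is not the check kprobes itself performs in the kernel.

```sh
#!/bin/sh
# in_kprobes_text SYMBOL [KALLSYMS_FILE]
# Exit status: 0 if SYMBOL lies in [__kprobes_text_start, __kprobes_text_end),
#              1 if it lies outside, 2 if a needed symbol was not found.
in_kprobes_text() {
    awk -v sym="$1" '
        $3 == "__kprobes_text_start" { lo = $1 }
        $3 == "__kprobes_text_end"   { hi = $1 }
        $3 == sym                    { addr = $1 }
        END {
            if (lo == "" || hi == "" || addr == "")
                exit 2
            # Appending "" forces string comparison; equal-width
            # lowercase hex strings sort in the same order as the numbers.
            exit !((addr "") >= (lo "") && (addr "") < (hi ""))
        }' "${2:-/proc/kallsyms}"
}
```

For example, `in_kprobes_text lock_acquire && echo "already in .kprobes.text"` prints the message only when the symbol falls inside that range.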
Post by Frederic Weisbecker
The problem is that I don't have a serial line on this
box, so I can't catch any crash log.
My K7 testbox also died in my arms this afternoon.
But I still have two other testboxes (one P2 and one P3);
hopefully I can reproduce the problem on these boxes,
to which I can connect a serial line.
Thank you for helping me to find it!
Post by Frederic Weisbecker
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
tracing/kprobes
So you can send patches on top of this one.
Great! I've found some other trivial bugs, so I'll fix those on top of it.

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-21 00:01:12 UTC
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Most of them can be fixed just by adding __kprobes.
For those that are already in another section, kprobes
should check whether the symbol is in that section.
You mean the blacklist?

I also fear that putting bad kprobed functions into the kprobe
section or into the blacklist may hide some kprobe internal bugs.

Doing so is indeed mandatory for functions that trigger tracing
recursion or things like that, but what if kprobes has an internal
bug that only triggers while probing a certain class of functions?

I.e., it would be nice to identify the reason for the crash for
each culprit in these lists.

That may even help to find the others in advance.

Also, kprobes seems to be a very fragile feature (that's what
this selftest unearths, at least for me).
It really needs recursion detection that stops all kprobing
once a given threshold of recursion is reached, something
that would dump the stack and the failing kprobe structure.

That would avoid such hard lockups and also help to identify
the dangerous symbols to probe.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
The problem is that I don't have a serial line on this
box, so I can't catch any crash log.
My K7 testbox also died in my arms this afternoon.
But I still have two other testboxes (one P2 and one P3);
hopefully I can reproduce the problem on these boxes,
to which I can connect a serial line.
Thank you for helping me to find it!
Post by Frederic Weisbecker
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
tracing/kprobes
So you can send patches on top of this one.
Great! I've found some other trivial bugs, so I'll fix those on top of it.
Cool :)

Btw, here is the result of your stress test on a PIII (attaching the log
and the config).

Thanks.
Masami Hiramatsu
2009-08-21 01:00:07 UTC
Post by Frederic Weisbecker
Post by Masami Hiramatsu
Most of them can be fixed just by adding __kprobes.
For those that are already in another section, kprobes
should check whether the symbol is in that section.
You mean the blacklist?
I also fear that putting bad kprobed functions into the kprobe
section or into the blacklist may hide some kprobe internal bugs.
Doing so is indeed mandatory for functions that trigger tracing
recursion or things like that, but what if kprobes has an internal
bug that only triggers while probing a certain class of functions?
I.e., it would be nice to identify the reason for the crash for
each culprit in these lists.
That may even help to find the others in advance.
Indeed, I actually found some bugs with this stress test while
making the jump-optimization patches.
But some of them are obviously cases where we just forgot to add __kprobes,
since they are called from the kprobes int3 handling functions.

Also, a lot of lock-related code has been changed. I think
kprobes should use raw_*_lock, or prohibit probing lock-monitoring
functions like lockdep, because probing them causes recursive calls.
Post by Frederic Weisbecker
Also, kprobes seems to be a very fragile feature (that's what
this selftest unearths, at least for me).
It really needs recursion detection that stops all kprobing
once a given threshold of recursion is reached, something
that would dump the stack and the failing kprobe structure.
Hmm, kprobes already has recursion detection (kp->nmissed), so
maybe we can check that.
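One way to look at that counter from user space is through the tracer's kprobe_profile file. The sketch below assumes the three-column layout (event name, hits, miss-hits) and the usual debugfs mount point; the function name missed_probes is hypothetical. It only reports probes that were skipped, for example because they were hit while another probe was being handled, and it does not stop or limit recursion by itself.

```sh
#!/bin/sh
# missed_probes [FILE]: print "event miss-count" for every probe event
# whose miss-hit counter (third column) is non-zero.
# The default path assumes debugfs is mounted at /sys/kernel/debug.
missed_probes() {
    awk '$3 > 0 { print $1, $3 }' \
        "${1:-/sys/kernel/debug/tracing/kprobe_profile}"
}
```

Running `missed_probes` after a test group has passed gives a quick list of probe points that were skipped at least once and may deserve a closer look.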
Post by Frederic Weisbecker
That would avoid such hard lockups and also help to identify
the dangerous symbols to probe.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
The problem is that I don't have a serial line on this
box, so I can't catch any crash log.
My K7 testbox also died in my arms this afternoon.
But I still have two other testboxes (one P2 and one P3);
hopefully I can reproduce the problem on these boxes,
to which I can connect a serial line.
Thank you for helping me to find it!
Post by Frederic Weisbecker
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
tracing/kprobes
So you can send patches on top of this one.
Great! I've found some other trivial bugs, so I'll fix those on top of it.
Cool :)
Btw, here is the result of your stress test on a PIII (attaching the log
and the config).
Thanks, I'll check that.

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-21 19:43:07 UTC
Fix x86 instruction decoder selftest to check only .text because other
sections (e.g. .notes) will have random bytes which don't need to be checked.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

arch/x86/tools/Makefile | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/tools/Makefile b/arch/x86/tools/Makefile
index 3dd626b..95e9cc4 100644
--- a/arch/x86/tools/Makefile
+++ b/arch/x86/tools/Makefile
@@ -1,6 +1,6 @@
PHONY += posttest
quiet_cmd_posttest = TEST $@
- cmd_posttest = $(OBJDUMP) -d $(objtree)/vmlinux | awk -f $(srctree)/arch/x86/tools/distill.awk | $(obj)/test_get_len
+ cmd_posttest = $(OBJDUMP) -d -j .text $(objtree)/vmlinux | awk -f $(srctree)/arch/x86/tools/distill.awk | $(obj)/test_get_len

posttest: $(obj)/test_get_len vmlinux
$(call cmd,posttest)
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-23 19:34:07 UTC
Post by Masami Hiramatsu
Fix x86 instruction decoder selftest to check only .text because other
sections (e.g. .notes) will have random bytes which don't need to be checked.
Applied these 4 patches in

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git \
tracing/kprobes

Thanks!

Masami Hiramatsu
2009-08-21 19:43:16 UTC
Permalink
Check some awk features which old mawk doesn't support.
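
For anyone who wants to run the same two checks from a shell before
invoking the generator, here is a minimal sketch. The function name
check_awk and its argument are made up for illustration; only the two
awk tests themselves are taken from the patch below.

```shell
# Hypothetical helper (not part of the patch): run the patch's two
# sanity checks against a given awk binary.
check_awk() {
    awk_bin=${1:-awk}
    if ! "$awk_bin" 'BEGIN {
        # character-class support, missing in old mawk
        if (!match("abc", "[[:lower:]]+")) exit 1
        # printf-format handling
        if (sprintf("%x", 0) != "0") exit 1
    }' </dev/null; then
        echo "Error: $awk_bin failed the sanity check. Please try gawk." >&2
        return 1
    fi
}
```

Calling `check_awk mawk` or `check_awk gawk` returns 0 when both tests
pass and 1 otherwise.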

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

arch/x86/tools/gen-insn-attr-x86.awk | 20 ++++++++++++++++++++
1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/arch/x86/tools/gen-insn-attr-x86.awk b/arch/x86/tools/gen-insn-attr-x86.awk
index 93b62c9..19ba096 100644
--- a/arch/x86/tools/gen-insn-attr-x86.awk
+++ b/arch/x86/tools/gen-insn-attr-x86.awk
@@ -4,7 +4,25 @@
#
# Usage: awk -f gen-insn-attr-x86.awk x86-opcode-map.txt > inat-tables.c

+# Awk implementation sanity check
+function check_awk_implement() {
+ if (!match("abc", "[[:lower:]]+"))
+ return "Your awk doesn't support charactor-class."
+ if (sprintf("%x", 0) != "0")
+ return "Your awk has a printf-format problem."
+ return ""
+}
+
BEGIN {
+ # Implementation error checking
+ awkchecked = check_awk_implement()
+ if (awkchecked != "") {
+ print "Error: " awkchecked > "/dev/stderr"
+ print "Please try to use gawk." > "/dev/stderr"
+ exit 1
+ }
+
+ # Setup generating tables
print "/* x86 opcode map generated from x86-opcode-map.txt */"
print "/* Do not change this code. */"
ggid = 1
@@ -293,6 +311,8 @@ function convert_operands(opnd, i,imm,mod)
}

END {
+ if (awkchecked != "")
+ exit 1
# print escape opcode map's array
print "/* Escape opcode map array */"
print "const insn_attr_t const *inat_escape_tables[INAT_ESC_MAX + 1]" \
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-21 19:43:43 UTC
Permalink
Fix a format typo in kprobe-tracer.

Currently, it shows 'tsize' in format;

$ cat /debug/tracing/events/kprobes/event/format
...
field: unsigned long ip; offset:16;tsize:8;
field: int nargs; offset:24;tsize:4;
...

This should be '\tsize';

$ cat /debug/tracing/events/kprobes/event/format
...
field: unsigned long ip; offset:16; size:8;
field: int nargs; offset:24; size:4;
...
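
The effect of the missing backslash can be reproduced with the shell's
printf, which handles "\t" the same way here. This is only a sketch:
the function names are invented, and the format strings are copied from
the SHOW_FIELD() macro in the diff below.

```shell
# Hypothetical helpers: print one format line with the old and the
# fixed format string.
show_field_broken() {
    # "tsize" is literal text because the backslash before 't' is missing
    printf '\tfield: %s;\toffset:%u;tsize:%u;\n' "$1" "$2" "$3"
}

show_field_fixed() {
    # "\t" is a tab, followed by the intended "size" key
    printf '\tfield: %s;\toffset:%u;\tsize:%u;\n' "$1" "$2" "$3"
}

show_field_broken "unsigned long ip" 16 8
show_field_fixed "unsigned long ip" 16 8
```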

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

kernel/trace/trace_kprobe.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 7cd726e..22e91c0 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1069,7 +1069,7 @@ static int __probe_event_show_format(struct trace_seq *s,
#define SHOW_FIELD(type, item, name) \
do { \
ret = trace_seq_printf(s, "\tfield: " #type " %s;\t" \
- "offset:%u;tsize:%u;\n", name, \
+ "offset:%u;\tsize:%u;\n", name, \
(unsigned int)offsetof(typeof(field), item),\
(unsigned int)sizeof(type)); \
if (!ret) \
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-21 19:43:51 UTC
Permalink
Change trace_arg_string() and parse_trace_arg() to probe_arg_string()
and parse_probe_arg(), since those are kprobe-tracer local functions.

Signed-off-by: Masami Hiramatsu <***@redhat.com>
Cc: Jim Keniston <***@us.ibm.com>
Cc: H. Peter Anvin <***@zytor.com>
Cc: Ananth N Mavinakayanahalli <***@in.ibm.com>
Cc: Avi Kivity <***@redhat.com>
Cc: Andi Kleen <***@linux.intel.com>
Cc: Christoph Hellwig <***@infradead.org>
Cc: Frank Ch. Eigler <***@redhat.com>
Cc: Frederic Weisbecker <***@gmail.com>
Cc: Ingo Molnar <***@elte.hu>
Cc: Jason Baron <***@redhat.com>
Cc: K.Prasad <***@linux.vnet.ibm.com>
Cc: Lai Jiangshan <***@cn.fujitsu.com>
Cc: Li Zefan <***@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <***@pawelczyk.it>
Cc: Roland McGrath <***@redhat.com>
Cc: Sam Ravnborg <***@ravnborg.org>
Cc: Srikar Dronamraju <***@linux.vnet.ibm.com>
Cc: Steven Rostedt <***@goodmis.org>
Cc: Tom Zanussi <***@gmail.com>
Cc: Vegard Nossum <***@gmail.com>
---

kernel/trace/trace_kprobe.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 22e91c0..783d2db 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -220,7 +220,7 @@ static __kprobes void *probe_address(struct trace_probe *tp)
return (probe_is_return(tp)) ? tp->rp.kp.addr : tp->kp.addr;
}

-static int trace_arg_string(char *buf, size_t n, struct fetch_func *ff)
+static int probe_arg_string(char *buf, size_t n, struct fetch_func *ff)
{
int ret = -EINVAL;

@@ -250,7 +250,7 @@ static int trace_arg_string(char *buf, size_t n, struct fetch_func *ff)
if (ret >= n)
goto end;
l += ret;
- ret = trace_arg_string(buf + l, n - l, &id->orig);
+ ret = probe_arg_string(buf + l, n - l, &id->orig);
if (ret < 0)
goto end;
l += ret;
@@ -408,7 +408,7 @@ static int split_symbol_offset(char *symbol, long *offset)
#define PARAM_MAX_ARGS 16
#define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))

-static int parse_trace_arg(char *arg, struct fetch_func *ff, int is_return)
+static int parse_probe_arg(char *arg, struct fetch_func *ff, int is_return)
{
int ret = 0;
unsigned long param;
@@ -499,7 +499,7 @@ static int parse_trace_arg(char *arg, struct fetch_func *ff, int is_return)
if (!id)
return -ENOMEM;
id->offset = offset;
- ret = parse_trace_arg(arg, &id->orig, is_return);
+ ret = parse_probe_arg(arg, &id->orig, is_return);
if (ret)
kfree(id);
else {
@@ -617,7 +617,7 @@ static int create_trace_probe(int argc, char **argv)
ret = -ENOSPC;
goto error;
}
- ret = parse_trace_arg(argv[i], &tp->args[i], is_return);
+ ret = parse_probe_arg(argv[i], &tp->args[i], is_return);
if (ret)
goto error;
}
@@ -680,7 +680,7 @@ static int probes_seq_show(struct seq_file *m, void *v)
seq_printf(m, " 0x%p", probe_address(tp));

for (i = 0; i < tp->nr_args; i++) {
- ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ ret = probe_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
if (ret < 0) {
pr_warning("Argument%d decoding error(%d).\n", i, ret);
return ret;
@@ -996,7 +996,7 @@ static int kprobe_event_define_fields(struct ftrace_event_call *event_call)
sprintf(buf, "arg%d", i);
DEFINE_FIELD(unsigned long, args[i], buf, 0);
/* Set argument string as an alias field */
- ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ ret = probe_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
if (ret < 0)
return ret;
DEFINE_FIELD(unsigned long, args[i], buf, 0);
@@ -1023,7 +1023,7 @@ static int kretprobe_event_define_fields(struct ftrace_event_call *event_call)
sprintf(buf, "arg%d", i);
DEFINE_FIELD(unsigned long, args[i], buf, 0);
/* Set argument string as an alias field */
- ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ ret = probe_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
if (ret < 0)
return ret;
DEFINE_FIELD(unsigned long, args[i], buf, 0);
@@ -1040,7 +1040,7 @@ static int __probe_event_show_format(struct trace_seq *s,

/* Show aliases */
for (i = 0; i < tp->nr_args; i++) {
- ret = trace_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
+ ret = probe_arg_string(buf, MAX_ARGSTR_LEN, &tp->args[i]);
if (ret < 0)
return ret;
if (!trace_seq_printf(s, "\talias: %s;\toriginal: arg%d;\n",
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Masami Hiramatsu
2009-08-13 20:59:19 UTC
Permalink
This program converts a probe point given as a C expression into the
kprobe event format used by the kprobe-based event tracer. It helps you
define kprobe events by C source line number or function name, plus
local variable names. Currently, it supports only x86 (32/64) kernels.


Compile
--------
Before compilation, please install libelf and libdwarf development
packages.
(e.g. elfutils-libelf-devel and libdwarf-devel on Fedora)

$ gcc -Wall -lelf -ldwarf c2kpe.c -o c2kpe


Synopsis
--------
$ c2kpe [options] FUNCTION[+OFFS][@SRC] [VAR [VAR ...]]
or
$ c2kpe [options] @SRC:LINE [VAR [VAR ...]]

FUNCTION: Probing function name.
OFFS: Offset in bytes.
SRC: Source file path.
LINE: Line number
VAR: Local variable name.
options:
-r KREL Kernel release version (e.g. 2.6.31-rc5)
-m DEBUGINFO Dwarf-format binary file (vmlinux or kmodule)
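
To make the two probe-point forms concrete, here is a rough sketch of
how such a spec could be split into its parts with plain shell
parameter expansion. This is not c2kpe's own parser; the function name
and the output layout are assumptions made for illustration only.

```shell
# Hypothetical helper: split FUNCTION[+OFFS][@SRC] or @SRC:LINE.
parse_probe_point() {
    spec=$1
    func=""; offs=""; src=""; line=""
    case $spec in
        @*:*)   # @SRC:LINE form
            src=${spec#@}
            line=${src##*:}
            src=${src%:*}
            ;;
        *)      # FUNCTION[+OFFS][@SRC] form
            case $spec in *@*) src=${spec#*@}; spec=${spec%%@*} ;; esac
            case $spec in *+*) offs=${spec#*+}; spec=${spec%%+*} ;; esac
            func=$spec
            ;;
    esac
    printf 'func=%s offs=%s src=%s line=%s\n' "$func" "$offs" "$src" "$line"
}

parse_probe_point 'sys_read'
parse_probe_point '@mm/filemap.c:339'
```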


Example
-------
$ c2kpe sys_read fd buf count
sys_read+0 %di %si %dx

$ c2kpe @mm/filemap.c:339 inode pos
sync_page_range+125 -48(%bp) %r14


Example with kprobe-tracer
--------------------------
Since one C expression may be converted into multiple results, I
recommend reading the output line by line, as below.

$ c2kpe sys_read fd buf count | while read i; do \
echo "p $i" > $DEBUGFS/tracing/kprobe_events ;\
done
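
The same loop can be tried as a dry run, without debugfs, by printing
the probe definitions instead of writing them. A minimal sketch, where
the function name to_probe_defs is made up and the input line is the
sys_read result from the Example section above:

```shell
# Hypothetical helper: turn each c2kpe result line into a "p ..." probe
# definition, one per line, skipping empty lines.
to_probe_defs() {
    while read -r line; do
        if [ -n "$line" ]; then
            printf 'p %s\n' "$line"
        fi
    done
}

# Feed it a sample c2kpe result instead of running c2kpe itself.
printf '%s\n' 'sys_read+0 %di %si %dx' | to_probe_defs
```

Each printed line is what the loop above would echo into
$DEBUGFS/tracing/kprobe_events.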


Note
----
- This requires a kernel compiled with CONFIG_DEBUG_INFO.
- Specifying @SRC speeds up c2kpe, because we can skip CUs which don't
include specified SRC file.
- c2kpe doesn't check whether the byte offset falls on an instruction
boundary. I recommend using the @SRC:LINE form when tracing inside a
function body.
- This tool doesn't search for kernel module files. To probe a module,
you need to specify its file with -m.


TODO
----
- Fix bugs.
- Support multiple probepoints from stdin.
- Better kmodule support.
- Use elfutils-libdw?
- Merge into trace-cmd or perf-tools?
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Christoph Hellwig
2009-08-13 21:05:37 UTC
Permalink
You rock, this is awesome! I'm a bit busy right now, but I'll play
around with it ASAP and will see how well it works for me.

Frederic Weisbecker
2009-08-30 19:50:43 UTC
Permalink
Post by Masami Hiramatsu
This program converts probe point in C expression to kprobe event
format for kprobe-based event tracer. This helps to define kprobes
events by C source line number or function name, and local variable
name. Currently, this supports only x86(32/64) kernels.
Compile
--------
Before compilation, please install libelf and libdwarf development
packages.
(e.g. elfutils-libelf-devel and libdwarf-devel on Fedora)
This may probably need a specific libdwarf version?

c2kpe.c: In function ‘die_get_entrypc’:
c2kpe.c:422: error: ‘Dwarf_Ranges’ undeclared (first use in this function)
c2kpe.c:422: error: (Each undeclared identifier is reported only once
c2kpe.c:422: error: for each function it appears in.)
c2kpe.c:422: error: ‘ranges’ undeclared (first use in this function)
c2kpe.c:447: warning: implicit declaration of function ‘dwarf_get_ranges’
c2kpe.c:451: warning: implicit declaration of function ‘dwarf_ranges_dealloc’
Post by Masami Hiramatsu
TODO
----
- Fix bugs.
- Support multiple probepoints from stdin.
- Better kmodule support.
- Use elfutils-libdw?
- Merge into trace-cmd or perf-tools?
Yeah definitely, that would be a veeery interesting thing to have.
I've played with kprobe ftrace to debug something this evening.

It's very cool to be able to put dynamic tracepoints in desired places.

But...
I firstly needed to put random trace_printk() in some places to
observe some variables values. And then I thought about the kprobes
tracer and realized I could do that without the need of rebuilding
my kernel. Then I've played with it and indeed it works well and
it's useful, but at the cost of reading objdump based assembly
code to find the places where I could find my variables values.
And after two or three probes in such conditions, I've become
tired of that, then I wanted to try this tool.


While I cannot yet because of this build error, I can imagine
the power of such facility from perf.

We could have a perf probe that creates a kprobe event in debugfs
(default enable = 0) and which then rely on perf record for the actual
recording.

Then we could analyse it through perf trace.
Let's imagine a simple example:

int foo(int arg1, int arg2)
{
	int var1;

	var1 = arg1;
	var1 *= arg2;
	var1 -= arg1;

	------> insert a probe here (file bar.c : line 60)

	var1 ^= ...

	return var1;
}

./perf kprobe --file bar.c:60 --action "arg1=%d","arg2=%d","var1=%d" -- ls -R /
./perf trace
arg1=1 arg2=1 var1=0
arg1=2 arg2=2 var1=2
etc..

You may want to sort by field:

./perf trace -s arg1 --order desc
arg1=1
|
------- arg2=1 var=1
|
------- arg2=2 var=1

arg1=2
|
------- arg2=1 var=0
|
------- [...]

./perf trace -s arg1,arg2 --order asc
arg1=1
|
------- arg2=1
|
--------- var1=0
|
--------- var1=....
arg2=...
|
|

Ok the latter is a bad example because var1 will always have only one
value for a given arg1 and arg2. But I guess you see the point.

You won't have to care about the perf trace part, it's already
implemented and I'll soon handle the sorting part.

All we need is the perf kprobes that translate a C level
probing expression to a /debug/tracing/kprobe_events compliant
thing. And then just call perf record with the new created
event as an argument.

Frederic.

Masami Hiramatsu
2009-08-31 04:14:34 UTC
Permalink
Post by Frederic Weisbecker
Post by Masami Hiramatsu
This program converts probe point in C expression to kprobe event
format for kprobe-based event tracer. This helps to define kprobes
events by C source line number or function name, and local variable
name. Currently, this supports only x86(32/64) kernels.
Compile
--------
Before compilation, please install libelf and libdwarf development
packages.
(e.g. elfutils-libelf-devel and libdwarf-devel on Fedora)
This may probably need a specific libdwarf version?
c2kpe.c:422: error: ‘Dwarf_Ranges’ undeclared (first use in this function)
c2kpe.c:422: error: (Each undeclared identifier is reported only once
c2kpe.c:422: error: for each function it appears in.)
c2kpe.c:422: error: ‘ranges’ undeclared (first use in this function)
c2kpe.c:447: warning: implicit declaration of function ‘dwarf_get_ranges’
c2kpe.c:451: warning: implicit declaration of function ‘dwarf_ranges_dealloc’
Aah, sure, it should be compiled with libdwarf newer than 20090324.
You can find it in http://reality.sgiweb.org/davea/dwarf.html

BTW, libdwarf and libdw (yet another implementation of a DWARF
library) are both still under development; e.g. libdwarf doesn't
support gcc-4.4.1 (very new) and only the latest libdw (0.142)
supports it. So perhaps I had better port it to libdw, even though
that is less documented... :(
Post by Frederic Weisbecker
Post by Masami Hiramatsu
TODO
----
- Fix bugs.
- Support multiple probepoints from stdin.
- Better kmodule support.
- Use elfutils-libdw?
- Merge into trace-cmd or perf-tools?
Yeah definitely, that would be a veeery interesting thing to have.
I've played with kprobe ftrace to debug something this evening.
It's very cool to be able to put dynamic tracepoints in desired places.
But...
I firstly needed to put random trace_printk() in some places to
observe some variables values. And then I thought about the kprobes
tracer and realized I could do that without the need of rebuilding
my kernel. Then I've played with it and indeed it works well and
it's useful, but at the cost of reading objdump based assembly
code to find the places where I could find my variables values.
And after two or three probes in such conditions, I've become
tired of that, then I wanted to try this tool.
While I cannot yet because of this build error, I can imagine
the power of such facility from perf.
We could have a perf probe that creates a kprobe event in debugfs
(default enable = 0) and which then rely on perf record for the actual
recording.
Then we could analyse it through perf trace.
int foo(int arg1, int arg2)
{
int var1;
var1 = arg1;
var1 *= arg2;
var1 -= arg1;
------> insert a probe here (file bar.c : line 60)
var1 ^= ...
return var1;
}
./perf kprobe --file bar.c:60 --action "arg1=%d","arg2=%d","var1=%d" -- ls -R /
I recommend separating it from record, like below:

# set new event
./perf kprobe --add kprobe:event1 --file bar.c:60 --action "arg1=%d","arg2=%d","var1=%d"
# record new event
./perf record -e kprobe:event1 -a -R -- ls -R /

This will allow us to focus on one thing: converting C to kprobe-tracer
events. Also, the new event can be listed just like tracepoint events.
Post by Frederic Weisbecker
./perf trace
arg1=1 arg2=1 var1=0
arg1=2 arg2=2 var1=2
etc..
./perf trace -s arg1 --order desc
arg1=1
|
------- arg2=1 var=1
|
------- arg2=2 var=1
arg1=2
|
------- arg2=1 var=0
|
------- [...]
./perf trace -s arg1,arg2 --order asc
arg1=1
|
------- arg2=1
|
--------- var1=0
|
--------- var1=....
arg2=...
|
Ok the latter is a bad example because var1 will always have only one
value for a given arg1 and arg2. But I guess you see the point.
You won't have to care about the perf trace part, it's already
implemented and I'll soon handle the sorting part.
All we need is the perf kprobes that translate a C level
probing expression to a /debug/tracing/kprobe_events compliant
thing. And then just call perf record with the new created
event as an argument.
Indeed, that's what I imagine.

Thank you,
--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: ***@redhat.com
Frederic Weisbecker
2009-08-31 22:14:30 UTC
Permalink
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
This program converts probe point in C expression to kprobe event
format for kprobe-based event tracer. This helps to define kprobes
events by C source line number or function name, and local variable
name. Currently, this supports only x86(32/64) kernels.
Compile
--------
Before compilation, please install libelf and libdwarf development
packages.
(e.g. elfutils-libelf-devel and libdwarf-devel on Fedora)
This may probably need a specific libdwarf version?
c2kpe.c:422: error: ‘Dwarf_Ranges’ undeclared (first use in this function)
c2kpe.c:422: error: (Each undeclared identifier is reported only once
c2kpe.c:422: error: for each function it appears in.)
c2kpe.c:422: error: ‘ranges’ undeclared (first use in this function)
c2kpe.c:447: warning: implicit declaration of function ‘dwarf_get_ranges’
c2kpe.c:451: warning: implicit declaration of function ‘dwarf_ranges_dealloc’
Aah, sure, it should be compiled with libdwarf newer than 20090324.
You can find it in http://reality.sgiweb.org/davea/dwarf.html
Ah ok.

Post by Masami Hiramatsu
BTW, libdwarf and libdw (yet another implementation of a DWARF
library) are both still under development; e.g. libdwarf doesn't
support gcc-4.4.1 (very new) and only the latest libdw (0.142)
supports it. So perhaps I had better port it to libdw, even though
that is less documented... :(
Maybe let's continue with libdwarf for now, and we'll see later
whether support for libdw is required.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
Post by Masami Hiramatsu
TODO
----
- Fix bugs.
- Support multiple probepoints from stdin.
- Better kmodule support.
- Use elfutils-libdw?
- Merge into trace-cmd or perf-tools?
Yeah definitely, that would be a veeery interesting thing to have.
I've played with kprobe ftrace to debug something this evening.
It's very cool to be able to put dynamic tracepoints in desired places.
But...
I firstly needed to put random trace_printk() in some places to
observe some variables values. And then I thought about the kprobes
tracer and realized I could do that without the need of rebuilding
my kernel. Then I've played with it and indeed it works well and
it's useful, but at the cost of reading objdump based assembly
code to find the places where I could find my variables values.
And after two or three probes in such conditions, I've become
tired of that, then I wanted to try this tool.
While I cannot yet because of this build error, I can imagine
the power of such facility from perf.
We could have a perf probe that creates a kprobe event in debugfs
(default enable = 0) and which then rely on perf record for the actual
recording.
Then we could analyse it through perf trace.
int foo(int arg1, int arg2)
{
int var1;
var1 = arg1;
var1 *= arg2;
var1 -= arg1;
------> insert a probe here (file bar.c : line 60)
var1 ^= ...
return var1;
}
./perf kprobe --file bar.c:60 --action "arg1=%d","arg2=%d","var1=%d" -- ls -R /
# set new event
./perf kprobe --add kprobe:event1 --file bar.c:60 --action "arg1=%d","arg2=%d","var1=%d"
# record new event
./perf record -e kprobe:event1 -a -R -- ls -R /
That indeed solves the command line overkill, but that also
breaks a bit the workflow :)

Well, I guess we can start simple in the beginning and follow the above
mockup which is indeed better decoupled.
And if something more intuitive comes to mind later, then we can still
change it.
Post by Masami Hiramatsu
This will allow us to focus on one thing: converting C to kprobe-tracer
events. Also, the new event can be listed just like tracepoint events.
Yeah.
Post by Masami Hiramatsu
Post by Frederic Weisbecker
./perf trace
arg1=1 arg2=1 var1=0
arg1=2 arg2=2 var1=2
etc..
./perf trace -s arg1 --order desc
arg1=1
|
------- arg2=1 var=1
|
------- arg2=2 var=1
arg1=2
|
------- arg2=1 var=0
|
------- [...]
./perf trace -s arg1,arg2 --order asc
arg1=1
|
------- arg2=1
|
--------- var1=0
|
--------- var1=....
arg2=...
|
Ok the latter is a bad example because var1 will always have only one
value for a given arg1 and arg2. But I guess you see the point.
You won't have to care about the perf trace part, it's already
implemented and I'll soon handle the sorting part.
All we need is the perf kprobes that translate a C level
probing expression to a /debug/tracing/kprobe_events compliant
thing. And then just call perf record with the new created
event as an argument.
Indeed, that's what I imagine.
Cool, thanks!
Post by Masami Hiramatsu
Thank you,
--
Masami Hiramatsu
Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division