<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/tools/perf/arch, branch v5.2-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>treewide: Add SPDX license identifier - Makefile/Kconfig</title>
<updated>2019-05-21T08:50:46+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-19T12:07:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ec8f24b7faaf3d4799a7c3f4c1b87f6b02778ad1'/>
<id>ec8f24b7faaf3d4799a7c3f4c1b87f6b02778ad1</id>
<content type='text'>
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf regs x86: Add X86 specific arch__intr_reg_mask()</title>
<updated>2019-05-16T17:17:23+00:00</updated>
<author>
<name>Kan Liang</name>
<email>kan.liang@linux.intel.com</email>
</author>
<published>2019-05-14T20:19:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6466ec14aaf44ff14a05369dcf0929d0f01171c6'/>
<id>6466ec14aaf44ff14a05369dcf0929d0f01171c6</id>
<content type='text'>
XMM registers can be collected on Icelake and later platforms.

Add specific arch__intr_reg_mask(), which creating an event to check if
the kernel and hardware can collect XMM registers.

Test on Skylake which doesn't support XMM registers collection. There is
nothing changed.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
   R10 R11 R12 R13 R14 R15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
    or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

    -I, --intr-regs[=&lt;any register&gt;]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xff0fff

Test on Icelake which support XMM registers collection.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
   R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
   XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
    or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

    -I, --intr-regs[=&lt;any register&gt;]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff

Committer notes:

Don't set attr.sample_period as a named struct init, as it is part of an
unnamed union in 'struct perf_event_attr', and doing so breaks the build
on older gcc versions, such as:

  gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)
  gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC)

  arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask':
  arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer
  cc1: warnings being treated as errors
  arch/x86/util/perf_regs.c:279: warning: missing braces around initializer
  arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.&lt;anonymous&gt;')

Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
[ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ]
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
XMM registers can be collected on Icelake and later platforms.

Add specific arch__intr_reg_mask(), which creating an event to check if
the kernel and hardware can collect XMM registers.

Test on Skylake which doesn't support XMM registers collection. There is
nothing changed.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
   R10 R11 R12 R13 R14 R15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
    or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

    -I, --intr-regs[=&lt;any register&gt;]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xff0fff

Test on Icelake which support XMM registers collection.

   #perf record -I?
   available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
   R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
   XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
    or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

    -I, --intr-regs[=&lt;any register&gt;]
                          sample selected machine registers on
   interrupt, use '-I?' to list register names

   #perf record -I
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]

   #perf evlist -v
   cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
   IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
   inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
   sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
   1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff

Committer notes:

Don't set attr.sample_period as a named struct init, as it is part of an
unnamed union in 'struct perf_event_attr', and doing so breaks the build
on older gcc versions, such as:

  gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)
  gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC)

  arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask':
  arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer
  cc1: warnings being treated as errors
  arch/x86/util/perf_regs.c:279: warning: missing braces around initializer
  arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.&lt;anonymous&gt;')

Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
[ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ]
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tools x86: Add support for recording and printing XMM registers</title>
<updated>2019-05-15T19:36:47+00:00</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2019-05-06T14:19:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ca138a7aabc68bf727918bb40ce08157cd5ec0a5'/>
<id>ca138a7aabc68bf727918bb40ce08157cd5ec0a5</id>
<content type='text'>
Icelake and later platforms support collecting XMM registers with PEBS
event.

Add support for 'perf script' to dump them, and support for the register
parser in 'perf record -I=' ... to configure them.

For now they are just printed in hex, we could potentially later add
other formats too.

Committer testing:

Before:

  # perf record -IXMM0
  Warning:
  unknown register XMM0, check man page or run 'perf record -I?'

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

  #
  # perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]
  #

After:

  # perf record -IXMM0
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
  /bin/dmesg | grep -i perf may provide additional information.

  #
  # perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

      -I, --intr-regs[=&lt;any register&gt;]
                            sample selected machine registers on interrupt, use -I ? to list register names
  #

More work is needed to, when faced with such error, warn the user that
that register is not available on the running platform.

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20190506141926.13659-1-kan.liang@linux.intel.com
Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Icelake and later platforms support collecting XMM registers with PEBS
event.

Add support for 'perf script' to dump them, and support for the register
parser in 'perf record -I=' ... to configure them.

For now they are just printed in hex, we could potentially later add
other formats too.

Committer testing:

Before:

  # perf record -IXMM0
  Warning:
  unknown register XMM0, check man page or run 'perf record -I?'

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

  #
  # perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]
  #

After:

  # perf record -IXMM0
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
  /bin/dmesg | grep -i perf may provide additional information.

  #
  # perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

      -I, --intr-regs[=&lt;any register&gt;]
                            sample selected machine registers on interrupt, use -I ? to list register names
  #

More work is needed to, when faced with such error, warn the user that
that register is not available on the running platform.

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20190506141926.13659-1-kan.liang@linux.intel.com
Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>csky: Add support for libdw</title>
<updated>2019-05-15T19:36:46+00:00</updated>
<author>
<name>Mao Han</name>
<email>han_mao@c-sky.com</email>
</author>
<published>2019-04-21T15:33:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b399ec215b8488468b947c34dfc097003408fe34'/>
<id>b399ec215b8488468b947c34dfc097003408fe34</id>
<content type='text'>
This patch add support for DWARF register mappings and libdw registers
initialization, which is used by perf callchain analyzing when
--call-graph=dwarf is given.

Here is the elfutils csky backend patch set:

https://sourceware.org/ml/elfutils-devel/2019-q2/msg00007.html

Signed-off-by: Mao Han &lt;han_mao@c-sky.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: linux-arch@vger.kernel.org
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1555860794-10572-1-git-send-email-guoren@kernel.org
Signed-off-by: Guo Ren &lt;ren_guo@c-sky.com&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch add support for DWARF register mappings and libdw registers
initialization, which is used by perf callchain analyzing when
--call-graph=dwarf is given.

Here is the elfutils csky backend patch set:

https://sourceware.org/ml/elfutils-devel/2019-q2/msg00007.html

Signed-off-by: Mao Han &lt;han_mao@c-sky.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: linux-arch@vger.kernel.org
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1555860794-10572-1-git-send-email-guoren@kernel.org
Signed-off-by: Guo Ren &lt;ren_guo@c-sky.com&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools headers: Update x86's syscall_64.tbl and uapi/asm-generic/unistd</title>
<updated>2019-03-28T17:41:11+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-03-25T14:34:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8142bd82a59e452fefea7b21113101d6a87d9fa8'/>
<id>8142bd82a59e452fefea7b21113101d6a87d9fa8</id>
<content type='text'>
To pick up the changes introduced in the following csets:

  2b188cc1bb85 ("Add io_uring IO interface")
  edafccee56ff ("io_uring: add support for pre-mapped user IO buffers")
  3eb39f47934f ("signal: add pidfd_send_signal() syscall")

This makes 'perf trace' to become aware of these new syscalls, so that
one can use them like 'perf trace -e ui_uring*,*signal' to do a system
wide strace-like session looking at those syscalls, for instance.

For example:

  # perf trace -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla

   Summary of events:

   io_uring-cp (383), 1208866 events, 100.0%

     syscall         calls   total    min     avg     max   stddev
                             (msec) (msec)  (msec)  (msec)     (%)
     -------------- ------ -------- ------ ------- -------  ------
     io_uring_enter 605780 2955.615  0.000   0.005  33.804   1.94%
     openat              4  459.446  0.004 114.861 459.435 100.00%
     munmap              4    0.073  0.009   0.018   0.042  44.03%
     mmap               10    0.054  0.002   0.005   0.026  43.24%
     brk                28    0.038  0.001   0.001   0.003   7.51%
     io_uring_setup      1    0.030  0.030   0.030   0.030   0.00%
     mprotect            4    0.014  0.002   0.004   0.005  14.32%
     close               5    0.012  0.001   0.002   0.004  28.87%
     fstat               3    0.006  0.001   0.002   0.003  35.83%
     read                4    0.004  0.001   0.001   0.002  13.58%
     access              1    0.003  0.003   0.003   0.003   0.00%
     lseek               3    0.002  0.001   0.001   0.001   9.00%
     arch_prctl          2    0.002  0.001   0.001   0.001   0.69%
     execve              1    0.000  0.000   0.000   0.000   0.00%
  #
  # perf trace -e io_uring* -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla

   Summary of events:

   io_uring-cp (390), 1191250 events, 100.0%

     syscall         calls   total    min    avg    max  stddev
                             (msec) (msec) (msec) (msec)    (%)
     -------------- ------ -------- ------ ------ ------ ------
     io_uring_enter 597093 2706.060  0.001  0.005 14.761  1.10%
     io_uring_setup      1    0.038  0.038  0.038  0.038  0.00%
  #

More work needed to make the tools/perf/examples/bpf/augmented_raw_syscalls.c
BPF program to copy the 'struct io_uring_params' arguments to perf's ring
buffer so that 'perf trace' can use the BTF info put in place by pahole's
conversion of the kernel DWARF and then auto-beautify those arguments.

This patch produces the expected change in the generated syscalls table
for x86_64:

  --- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before	2019-03-26 13:37:46.679057774 -0300
  +++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c	2019-03-26 13:38:12.755990383 -0300
  @@ -334,5 +334,9 @@ static const char *syscalltbl_x86_64[] =
   	[332] = "statx",
   	[333] = "io_pgetevents",
   	[334] = "rseq",
  +	[424] = "pidfd_send_signal",
  +	[425] = "io_uring_setup",
  +	[426] = "io_uring_enter",
  +	[427] = "io_uring_register",
   };
  -#define SYSCALLTBL_x86_64_MAX_ID 334
  +#define SYSCALLTBL_x86_64_MAX_ID 427

This silences these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: Christian Brauner &lt;christian@brauner.io&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lkml.kernel.org/n/tip-p0ars3otuc52x5iznf21shhw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To pick up the changes introduced in the following csets:

  2b188cc1bb85 ("Add io_uring IO interface")
  edafccee56ff ("io_uring: add support for pre-mapped user IO buffers")
  3eb39f47934f ("signal: add pidfd_send_signal() syscall")

This makes 'perf trace' to become aware of these new syscalls, so that
one can use them like 'perf trace -e ui_uring*,*signal' to do a system
wide strace-like session looking at those syscalls, for instance.

For example:

  # perf trace -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla

   Summary of events:

   io_uring-cp (383), 1208866 events, 100.0%

     syscall         calls   total    min     avg     max   stddev
                             (msec) (msec)  (msec)  (msec)     (%)
     -------------- ------ -------- ------ ------- -------  ------
     io_uring_enter 605780 2955.615  0.000   0.005  33.804   1.94%
     openat              4  459.446  0.004 114.861 459.435 100.00%
     munmap              4    0.073  0.009   0.018   0.042  44.03%
     mmap               10    0.054  0.002   0.005   0.026  43.24%
     brk                28    0.038  0.001   0.001   0.003   7.51%
     io_uring_setup      1    0.030  0.030   0.030   0.030   0.00%
     mprotect            4    0.014  0.002   0.004   0.005  14.32%
     close               5    0.012  0.001   0.002   0.004  28.87%
     fstat               3    0.006  0.001   0.002   0.003  35.83%
     read                4    0.004  0.001   0.001   0.002  13.58%
     access              1    0.003  0.003   0.003   0.003   0.00%
     lseek               3    0.002  0.001   0.001   0.001   9.00%
     arch_prctl          2    0.002  0.001   0.001   0.001   0.69%
     execve              1    0.000  0.000   0.000   0.000   0.00%
  #
  # perf trace -e io_uring* -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla

   Summary of events:

   io_uring-cp (390), 1191250 events, 100.0%

     syscall         calls   total    min    avg    max  stddev
                             (msec) (msec) (msec) (msec)    (%)
     -------------- ------ -------- ------ ------ ------ ------
     io_uring_enter 597093 2706.060  0.001  0.005 14.761  1.10%
     io_uring_setup      1    0.038  0.038  0.038  0.038  0.00%
  #

More work needed to make the tools/perf/examples/bpf/augmented_raw_syscalls.c
BPF program to copy the 'struct io_uring_params' arguments to perf's ring
buffer so that 'perf trace' can use the BTF info put in place by pahole's
conversion of the kernel DWARF and then auto-beautify those arguments.

This patch produces the expected change in the generated syscalls table
for x86_64:

  --- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before	2019-03-26 13:37:46.679057774 -0300
  +++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c	2019-03-26 13:38:12.755990383 -0300
  @@ -334,5 +334,9 @@ static const char *syscalltbl_x86_64[] =
   	[332] = "statx",
   	[333] = "io_pgetevents",
   	[334] = "rseq",
  +	[424] = "pidfd_send_signal",
  +	[425] = "io_uring_setup",
  +	[426] = "io_uring_enter",
  +	[427] = "io_uring_register",
   };
  -#define SYSCALLTBL_x86_64_MAX_ID 334
  +#define SYSCALLTBL_x86_64_MAX_ID 427

This silences these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Andrii Nakryiko &lt;andrii.nakryiko@gmail.com&gt;
Cc: Christian Brauner &lt;christian@brauner.io&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lkml.kernel.org/n/tip-p0ars3otuc52x5iznf21shhw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tools: Update x86's syscall_64.tbl, no change in tools/perf behaviour</title>
<updated>2019-03-11T19:13:04+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-03-11T16:20:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=df94bb44b518b1d0c9f4b2e5127441cec13ab75c'/>
<id>df94bb44b518b1d0c9f4b2e5127441cec13ab75c</id>
<content type='text'>
To pick the changes in 7948450d4556 ("x86/x32: use time64 versions of
sigtimedwait and recvmmsg"), that doesn't cause any change in behaviour
in tools/perf/ as it deals just with the x32 entries.

This silences this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-mqpvshayeqidlulx5qpioa59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To pick the changes in 7948450d4556 ("x86/x32: use time64 versions of
sigtimedwait and recvmmsg"), that doesn't cause any change in behaviour
in tools/perf/ as it deals just with the x32 entries.

This silences this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-mqpvshayeqidlulx5qpioa59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf script: Support insn output for normal samples</title>
<updated>2019-03-11T14:56:02+00:00</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2019-03-05T14:47:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3ab481a1cfe1511b94e142b648e2c5ade9175ed3'/>
<id>3ab481a1cfe1511b94e142b648e2c5ade9175ed3</id>
<content type='text'>
perf script -F +insn was only working for PT traces because the PT
instruction decoder was filling in the insn/insn_len sample attributes.
Support it for non PT samples too on x86 using the existing x86
instruction decoder.

This adds some extra checking to ensure that we don't try to decode
instructions when using perf.data from a different architecture.

  % perf record -a sleep 1
  % perf script -F ip,sym,insn --xed
   ffffffff811704c9 remote_function               movl  %eax, 0x18(%rbx)
   ffffffff8100bb50 intel_bts_enable_local                retq
   ffffffff81048612 native_apic_mem_write                 movl  %esi, -0xa04000(%rdi)
   ffffffff81048612 native_apic_mem_write                 movl  %esi, -0xa04000(%rdi)
   ffffffff81048612 native_apic_mem_write                 movl  %esi, -0xa04000(%rdi)
   ffffffff810f1f79 generic_exec_single           xor %eax, %eax
   ffffffff811704c9 remote_function               movl  %eax, 0x18(%rbx)
   ffffffff8100bb34 intel_bts_enable_local                movl  0x2000(%rax), %edx
   ffffffff81048610 native_apic_mem_write                 mov %edi, %edi
  ...

Committer testing:

Before:

  # perf script -F ip,sym,insn --xed | head -5
   ffffffffa4068804 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068804 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068804 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068806 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068806 native_write_msr 		addb  %al, (%rax)
  # perf script -F ip,sym,insn --xed | grep -v "addb  %al, (%rax)"
  #

After:

  # perf script -F ip,sym,insn --xed | head -5
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
  # perf script -F ip,sym,insn --xed | grep -v "addb  %al, (%rax)" | head -5
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
  #

More examples:

  # perf script -F ip,sym,insn --xed | grep -v native_write_msr | head
   ffffffffa416b90e tick_check_broadcast_expired 		btq  %rax, 0x1a5f42a(%rip)
   ffffffffa4956bd0 nmi_cpu_backtrace 		pushq  %r13
   ffffffffa415b95e __hrtimer_next_event_base 		movq  0x18(%rax), %rdx
   ffffffffa4956bf3 nmi_cpu_backtrace 		popq  %r12
   ffffffffa4171d5c smp_call_function_single 		pause
   ffffffffa4956bdd nmi_cpu_backtrace 		mov %ebp, %r12d
   ffffffffa4797e4d menu_select 		cmp $0x190, %rax
   ffffffffa4171d5c smp_call_function_single 		pause
   ffffffffa405a7d8 nmi_cpu_backtrace_handler 		callq  0xffffffffa4956bd0
   ffffffffa4797f7a menu_select 		shr $0x3, %rax
  #

Which matches the annotate output modulo resolving callqs:

  # perf annotate --stdio2 nmi_cpu_backtrace_handler
  Samples: 4  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period]
  nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux
  Percent
              Disassembly of section .text:

              ffffffff8105a7d0 &lt;nmi_cpu_backtrace_handler&gt;:
              nmi_cpu_backtrace_handler():
                      nmi_trigger_cpumask_backtrace(mask, exclude_self,
                                                    nmi_raise_cpu_backtrace);
              }

              static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
              {
   24.45      → callq  __fentry__
                      if (nmi_cpu_backtrace(regs))
                mov    %rsi,%rdi
   75.55      → callq  nmi_cpu_backtrace
                              return NMI_HANDLED;
                movzbl %al,%eax

                      return NMI_DONE;
              }
              ← retq
    #

  # perf annotate --stdio2 __hrtimer_next_event_base
  Samples: 4  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period]
  __hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux
  Percent
              Disassembly of section .text:

              ffffffff8115b910 &lt;__hrtimer_next_event_base&gt;:
              __hrtimer_next_event_base():

              static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
                                                       const struct hrtimer *exclude,
                                                       unsigned int active,
                                                       ktime_t expires_next)
              {
              → callq  __fentry__
&lt;SNIP&gt;
          4a:   add    $0x1,%r14
   77.31        mov    0x18(%rax),%rdx
                shl    $0x6,%r14
                sub    0x38(%rbx,%r14,1),%rdx
                              if (expires &lt; expires_next) {
                cmp    %r12,%rdx
              ↓ jge    68
&lt;SNIP&gt;

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org
[ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ]
[ archinsn.c needs the instruction decoder that is only build when CONFIG_AUXTRACE=y, fix that ]
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf script -F +insn was only working for PT traces because the PT
instruction decoder was filling in the insn/insn_len sample attributes.
Support it for non PT samples too on x86 using the existing x86
instruction decoder.

This adds some extra checking to ensure that we don't try to decode
instructions when using perf.data from a different architecture.

  % perf record -a sleep 1
  % perf script -F ip,sym,insn --xed
   ffffffff811704c9 remote_function               movl  %eax, 0x18(%rbx)
   ffffffff8100bb50 intel_bts_enable_local                retq
   ffffffff81048612 native_apic_mem_write                 movl  %esi, -0xa04000(%rdi)
   ffffffff81048612 native_apic_mem_write                 movl  %esi, -0xa04000(%rdi)
   ffffffff81048612 native_apic_mem_write                 movl  %esi, -0xa04000(%rdi)
   ffffffff810f1f79 generic_exec_single           xor %eax, %eax
   ffffffff811704c9 remote_function               movl  %eax, 0x18(%rbx)
   ffffffff8100bb34 intel_bts_enable_local                movl  0x2000(%rax), %edx
   ffffffff81048610 native_apic_mem_write                 mov %edi, %edi
  ...

Committer testing:

Before:

  # perf script -F ip,sym,insn --xed | head -5
   ffffffffa4068804 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068804 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068804 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068806 native_write_msr 		addb  %al, (%rax)
   ffffffffa4068806 native_write_msr 		addb  %al, (%rax)
  # perf script -F ip,sym,insn --xed | grep -v "addb  %al, (%rax)"
  #

After:

  # perf script -F ip,sym,insn --xed | head -5
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
  # perf script -F ip,sym,insn --xed | grep -v "addb  %al, (%rax)" | head -5
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068804 native_write_msr 		wrmsr
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
   ffffffffa4068806 native_write_msr 		nopl  %eax, (%rax,%rax,1)
  #

More examples:

  # perf script -F ip,sym,insn --xed | grep -v native_write_msr | head
   ffffffffa416b90e tick_check_broadcast_expired 		btq  %rax, 0x1a5f42a(%rip)
   ffffffffa4956bd0 nmi_cpu_backtrace 		pushq  %r13
   ffffffffa415b95e __hrtimer_next_event_base 		movq  0x18(%rax), %rdx
   ffffffffa4956bf3 nmi_cpu_backtrace 		popq  %r12
   ffffffffa4171d5c smp_call_function_single 		pause
   ffffffffa4956bdd nmi_cpu_backtrace 		mov %ebp, %r12d
   ffffffffa4797e4d menu_select 		cmp $0x190, %rax
   ffffffffa4171d5c smp_call_function_single 		pause
   ffffffffa405a7d8 nmi_cpu_backtrace_handler 		callq  0xffffffffa4956bd0
   ffffffffa4797f7a menu_select 		shr $0x3, %rax
  #

Which matches the annotate output modulo resolving callqs:

  # perf annotate --stdio2 nmi_cpu_backtrace_handler
  Samples: 4  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period]
  nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux
  Percent
              Disassembly of section .text:

              ffffffff8105a7d0 &lt;nmi_cpu_backtrace_handler&gt;:
              nmi_cpu_backtrace_handler():
                      nmi_trigger_cpumask_backtrace(mask, exclude_self,
                                                    nmi_raise_cpu_backtrace);
              }

              static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
              {
   24.45      → callq  __fentry__
                      if (nmi_cpu_backtrace(regs))
                mov    %rsi,%rdi
   75.55      → callq  nmi_cpu_backtrace
                              return NMI_HANDLED;
                movzbl %al,%eax

                      return NMI_DONE;
              }
              ← retq
    #

  # perf annotate --stdio2 __hrtimer_next_event_base
  Samples: 4  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period]
  __hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux
  Percent
              Disassembly of section .text:

              ffffffff8115b910 &lt;__hrtimer_next_event_base&gt;:
              __hrtimer_next_event_base():

              static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
                                                       const struct hrtimer *exclude,
                                                       unsigned int active,
                                                       ktime_t expires_next)
              {
              → callq  __fentry__
&lt;SNIP&gt;
          4a:   add    $0x1,%r14
   77.31        mov    0x18(%rax),%rdx
                shl    $0x6,%r14
                sub    0x38(%rbx,%r14,1),%rdx
                              if (expires &lt; expires_next) {
                cmp    %r12,%rdx
              ↓ jge    68
&lt;SNIP&gt;

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org
[ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ]
[ archinsn.c needs the instruction decoder that is only build when CONFIG_AUXTRACE=y, fix that ]
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf annotate: Calculate the max instruction name, align column to that</title>
<updated>2019-03-06T19:40:15+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-03-06T19:40:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bc3bb795345891509b4a3cbff824cbef8c130f20'/>
<id>bc3bb795345891509b4a3cbff824cbef8c130f20</id>
<content type='text'>
We were hardcoding '6' as the max instruction name, and we have lots
that are longer than that, see the diff from two 'P' printed TUI
annotations for a libc function that uses instructions with long names,
such as 'vpmovmskb' with its 9 chars:

  --- __strcmp_avx2.annotation.before	2019-03-06 16:31:39.368020425 -0300
  +++ __strcmp_avx2.annotation	2019-03-06 16:32:12.079450508 -0300
  @@ -2,284 +2,284 @@
   Event: cycles:ppp

   Percent        endbr64
  -  0.10         mov    %edi,%eax
  +  0.10         mov        %edi,%eax
  -               xor    %edx,%edx
  +               xor        %edx,%edx
  -  3.54         vpxor  %ymm7,%ymm7,%ymm7
  +  3.54         vpxor      %ymm7,%ymm7,%ymm7
  -               or     %esi,%eax
  +               or         %esi,%eax
  -               and    $0xfff,%eax
  +               and        $0xfff,%eax
  -               cmp    $0xf80,%eax
  +               cmp        $0xf80,%eax
  -             ↓ jg     370
  +             ↓ jg         370
  - 27.07         vmovdqu (%rdi),%ymm1
  + 27.07         vmovdqu    (%rdi),%ymm1
  -  7.97         vpcmpeqb (%rsi),%ymm1,%ymm0
  +  7.97         vpcmpeqb   (%rsi),%ymm1,%ymm0
  -  2.15         vpminub %ymm1,%ymm0,%ymm0
  +  2.15         vpminub    %ymm1,%ymm0,%ymm0
  -  4.09         vpcmpeqb %ymm7,%ymm0,%ymm0
  +  4.09         vpcmpeqb   %ymm7,%ymm0,%ymm0
  -  0.43         vpmovmskb %ymm0,%ecx
  +  0.43         vpmovmskb  %ymm0,%ecx
  -  1.53         test   %ecx,%ecx
  +  1.53         test       %ecx,%ecx
  -             ↓ je     b0
  +             ↓ je         b0
  -  5.26         tzcnt  %ecx,%edx
  +  5.26         tzcnt      %ecx,%edx
  - 18.40         movzbl (%rdi,%rdx,1),%eax
  + 18.40         movzbl     (%rdi,%rdx,1),%eax
  -  7.09         movzbl (%rsi,%rdx,1),%edx
  +  7.09         movzbl     (%rsi,%rdx,1),%edx
  -  3.34         sub    %edx,%eax
  +  3.34         sub        %edx,%eax
     2.37         vzeroupper
                ← retq
                  nop
  -         50:   tzcnt  %ecx,%edx
  +         50:   tzcnt      %ecx,%edx
  -               movzbl 0x20(%rdi,%rdx,1),%eax
  +               movzbl     0x20(%rdi,%rdx,1),%eax
  -               movzbl 0x20(%rsi,%rdx,1),%edx
  +               movzbl     0x20(%rsi,%rdx,1),%edx
  -               sub    %edx,%eax
  +               sub        %edx,%eax
                  vzeroupper
                ← retq
  -               data16 nopw %cs:0x0(%rax,%rax,1)
  +               data16     nopw %cs:0x0(%rax,%rax,1)

Reported-by: Travis Downs &lt;travis.downs@gmail.com&gt;
LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-89wsdd9h9g6bvq52sgp6d0u4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We were hardcoding '6' as the max instruction name, and we have lots
that are longer than that, see the diff from two 'P' printed TUI
annotations for a libc function that uses instructions with long names,
such as 'vpmovmskb' with its 9 chars:

  --- __strcmp_avx2.annotation.before	2019-03-06 16:31:39.368020425 -0300
  +++ __strcmp_avx2.annotation	2019-03-06 16:32:12.079450508 -0300
  @@ -2,284 +2,284 @@
   Event: cycles:ppp

   Percent        endbr64
  -  0.10         mov    %edi,%eax
  +  0.10         mov        %edi,%eax
  -               xor    %edx,%edx
  +               xor        %edx,%edx
  -  3.54         vpxor  %ymm7,%ymm7,%ymm7
  +  3.54         vpxor      %ymm7,%ymm7,%ymm7
  -               or     %esi,%eax
  +               or         %esi,%eax
  -               and    $0xfff,%eax
  +               and        $0xfff,%eax
  -               cmp    $0xf80,%eax
  +               cmp        $0xf80,%eax
  -             ↓ jg     370
  +             ↓ jg         370
  - 27.07         vmovdqu (%rdi),%ymm1
  + 27.07         vmovdqu    (%rdi),%ymm1
  -  7.97         vpcmpeqb (%rsi),%ymm1,%ymm0
  +  7.97         vpcmpeqb   (%rsi),%ymm1,%ymm0
  -  2.15         vpminub %ymm1,%ymm0,%ymm0
  +  2.15         vpminub    %ymm1,%ymm0,%ymm0
  -  4.09         vpcmpeqb %ymm7,%ymm0,%ymm0
  +  4.09         vpcmpeqb   %ymm7,%ymm0,%ymm0
  -  0.43         vpmovmskb %ymm0,%ecx
  +  0.43         vpmovmskb  %ymm0,%ecx
  -  1.53         test   %ecx,%ecx
  +  1.53         test       %ecx,%ecx
  -             ↓ je     b0
  +             ↓ je         b0
  -  5.26         tzcnt  %ecx,%edx
  +  5.26         tzcnt      %ecx,%edx
  - 18.40         movzbl (%rdi,%rdx,1),%eax
  + 18.40         movzbl     (%rdi,%rdx,1),%eax
  -  7.09         movzbl (%rsi,%rdx,1),%edx
  +  7.09         movzbl     (%rsi,%rdx,1),%edx
  -  3.34         sub    %edx,%eax
  +  3.34         sub        %edx,%eax
     2.37         vzeroupper
                ← retq
                  nop
  -         50:   tzcnt  %ecx,%edx
  +         50:   tzcnt      %ecx,%edx
  -               movzbl 0x20(%rdi,%rdx,1),%eax
  +               movzbl     0x20(%rdi,%rdx,1),%eax
  -               movzbl 0x20(%rsi,%rdx,1),%edx
  +               movzbl     0x20(%rsi,%rdx,1),%edx
  -               sub    %edx,%eax
  +               sub        %edx,%eax
                  vzeroupper
                ← retq
  -               data16 nopw %cs:0x0(%rax,%rax,1)
  +               data16     nopw %cs:0x0(%rax,%rax,1)

Reported-by: Travis Downs &lt;travis.downs@gmail.com&gt;
LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-89wsdd9h9g6bvq52sgp6d0u4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tools: Rename build libperf to perf</title>
<updated>2019-02-14T18:18:08+00:00</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@kernel.org</email>
</author>
<published>2019-02-13T12:32:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ff328836dfde0cef9f28c8b8791a90a36d7a183'/>
<id>5ff328836dfde0cef9f28c8b8791a90a36d7a183</id>
<content type='text'>
Rename build libperf to perf, because it's used to build perf.

The libperf build object name will be used for libperf library.

Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20190213123246.4015-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename build libperf to perf, because it's used to build perf.

The libperf build object name will be used for libperf library.

Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20190213123246.4015-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'perf-core-for-mingo-5.1-20190206' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core</title>
<updated>2019-02-09T12:16:01+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2019-02-09T12:16:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6854daa07a29b0e27f7017fe23a24e595d1f7617'/>
<id>6854daa07a29b0e27f7017fe23a24e595d1f7617</id>
<content type='text'>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

Hardware tracing:

  Adrian Hunter:

  - Handle calls optimized into jumps to a different symbol
    in the thread stack routines used to process hardware traces (Adrian Hunter)

Intel PT:

  Adrian Hunter:

  - Fix overlap calculation for padding.

  - Fix CYC timestamp calculation after OVF.

  - Packet splitting can only happen in 32-bit.

  - Add timestamp to auxtrace errors.

ARM CoreSight:

  Leo Yan:

  - Add last instruction information in packet

  - Set sample flags for instruction range, exception and
    return packets and for a trace discontinuity.

  - Add exception number in exception packet

  - Change tuple from traceID-CPU# to traceID-metadata

  - Add traceID in packet

  Mathieu Poirier:

  - Add "sinks" group to PMU directory

  - Use event attributes to send sink information to kernel

  - Remove set_drv_config() API, no longer used.

perf annotate:

  Jiri Olsa:

  - Delay symbol annotation to the resort phase, speeding up 'perf report'
    startup.

perf record:

  Alexey Budankov:

  - Allow binding userspace buffers to NUMA nodes.

Symbols:

  Adrian Hunter:

  - Fix calculating of symbol sizes when splitting kallsyms into
    maps for kcore processing.

Vendor events:

  William Cohen:

  - Intel: Fix Load_Miss_Real_Latency on CLX

Misc:

  Arnaldo Carvalho de Melo:

  - Streamline headers, removing includes when all that is needed are
    just forward declarations, fixup the fallout for cases where headers
    should have been explicitely included but were instead obtained
    indirectly, by sheer luck.

  - Add fallback versions for CPU_{OR,EQUAL}(), so that code using it
    continue to build on older systems where those were not yet introduced
    or in systems using some other libc than the GNU one where those
    helpers aren't present.

Documentation:

  Changbin Du:

  - Add documentation for BPF event selection.

Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

Hardware tracing:

  Adrian Hunter:

  - Handle calls optimized into jumps to a different symbol
    in the thread stack routines used to process hardware traces (Adrian Hunter)

Intel PT:

  Adrian Hunter:

  - Fix overlap calculation for padding.

  - Fix CYC timestamp calculation after OVF.

  - Packet splitting can only happen in 32-bit.

  - Add timestamp to auxtrace errors.

ARM CoreSight:

  Leo Yan:

  - Add last instruction information in packet

  - Set sample flags for instruction range, exception and
    return packets and for a trace discontinuity.

  - Add exception number in exception packet

  - Change tuple from traceID-CPU# to traceID-metadata

  - Add traceID in packet

  Mathieu Poirier:

  - Add "sinks" group to PMU directory

  - Use event attributes to send sink information to kernel

  - Remove set_drv_config() API, no longer used.

perf annotate:

  Jiri Olsa:

  - Delay symbol annotation to the resort phase, speeding up 'perf report'
    startup.

perf record:

  Alexey Budankov:

  - Allow binding userspace buffers to NUMA nodes.

Symbols:

  Adrian Hunter:

  - Fix calculating of symbol sizes when splitting kallsyms into
    maps for kcore processing.

Vendor events:

  William Cohen:

  - Intel: Fix Load_Miss_Real_Latency on CLX

Misc:

  Arnaldo Carvalho de Melo:

  - Streamline headers, removing includes when all that is needed are
    just forward declarations, fixup the fallout for cases where headers
    should have been explicitely included but were instead obtained
    indirectly, by sheer luck.

  - Add fallback versions for CPU_{OR,EQUAL}(), so that code using it
    continue to build on older systems where those were not yet introduced
    or in systems using some other libc than the GNU one where those
    helpers aren't present.

Documentation:

  Changbin Du:

  - Add documentation for BPF event selection.

Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
