<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/tools/perf/util/session.c, branch v5.11</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>perf intel-pt: Fix 'CPU too large' error</title>
<updated>2021-01-15T20:28:27+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2021-01-07T17:41:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5501e9229a80d95a1ea68609f44c447a75d23ed5'/>
<id>5501e9229a80d95a1ea68609f44c447a75d23ed5</id>
<content type='text'>
In some cases, the number of cpus (nr_cpus_online) is confused with the
maximum cpu number (nr_cpus_avail), which results in the error in the
example below:

Example on system with 8 cpus:

 Before:
   # echo 0 &gt; /sys/devices/system/cpu/cpu2/online
   # ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.147 MB perf.data ]
   # ./perf script --itrace=e
   Requested CPU 7 too large. Consider raising MAX_NR_CPUS
   0x25908 [0x8]: failed to process type: 68 [Invalid argument]

 After:
   # ./perf script --itrace=e
   #

Fixes: 8c7274691f0d ("perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Fixes: 7df4e36a4785 ("perf session: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Tested-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210107174159.24897-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In some cases, the number of cpus (nr_cpus_online) is confused with the
maximum cpu number (nr_cpus_avail), which results in the error in the
example below:

Example on system with 8 cpus:

 Before:
   # echo 0 &gt; /sys/devices/system/cpu/cpu2/online
   # ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.147 MB perf.data ]
   # ./perf script --itrace=e
   Requested CPU 7 too large. Consider raising MAX_NR_CPUS
   0x25908 [0x8]: failed to process type: 68 [Invalid argument]

 After:
   # ./perf script --itrace=e
   #

Fixes: 8c7274691f0d ("perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Fixes: 7df4e36a4785 ("perf session: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Tested-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210107174159.24897-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf script: Support data page size</title>
<updated>2020-12-19T20:04:39+00:00</updated>
<author>
<name>Kan Liang</name>
<email>kan.liang@linux.intel.com</email>
</author>
<published>2020-12-16T18:57:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6b9bae63de4fe24365ad0c2d23e77ae06f8c58e4'/>
<id>6b9bae63de4fe24365ad0c2d23e77ae06f8c58e4</id>
<content type='text'>
Display the data page size if it is available and asked by the user:

Can be configured by the user, for example:

  perf script --fields comm,event,phys_addr,data_page_size
            dtlb mem-loads:uP:        3fec82ea8 4K
            dtlb mem-loads:uP:        3fec82e90 4K
            dtlb mem-loads:uP:        3e23700a4 4K
            dtlb mem-loads:uP:        3fec82f20 4K
            dtlb mem-loads:uP:        3e23700a4 4K
            dtlb mem-loads:uP:        3b4211bec 4K
            dtlb mem-loads:uP:        382205dc0 2M
            dtlb mem-loads:uP:        36fa082c0 2M
            dtlb mem-loads:uP:        377607340 2M
            dtlb mem-loads:uP:        330010180 2M
            dtlb mem-loads:uP:        33200fd80 2M
            dtlb mem-loads:uP:        31b012b80 2M

Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20201216185805.9981-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Display the data page size if it is available and asked by the user:

Can be configured by the user, for example:

  perf script --fields comm,event,phys_addr,data_page_size
            dtlb mem-loads:uP:        3fec82ea8 4K
            dtlb mem-loads:uP:        3fec82e90 4K
            dtlb mem-loads:uP:        3e23700a4 4K
            dtlb mem-loads:uP:        3fec82f20 4K
            dtlb mem-loads:uP:        3e23700a4 4K
            dtlb mem-loads:uP:        3b4211bec 4K
            dtlb mem-loads:uP:        382205dc0 2M
            dtlb mem-loads:uP:        36fa082c0 2M
            dtlb mem-loads:uP:        377607340 2M
            dtlb mem-loads:uP:        330010180 2M
            dtlb mem-loads:uP:        33200fd80 2M
            dtlb mem-loads:uP:        31b012b80 2M

Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20201216185805.9981-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf evlist: Use the right prefix for 'struct evlist' deliver event method</title>
<updated>2020-11-30T18:16:29+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2020-11-30T18:16:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=515ea461c26e19ebca4351266480306979a113fc'/>
<id>515ea461c26e19ebca4351266480306979a113fc</id>
<content type='text'>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf evlist: Use the right prefix for 'struct evlist' header methods</title>
<updated>2020-11-30T18:15:30+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2020-11-30T18:13:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1420ba2f6250270c4143d96af86f654f9f4d9997'/>
<id>1420ba2f6250270c4143d96af86f654f9f4d9997</id>
<content type='text'>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf evlist: Use the right prefix for 'struct evlist' raw samples methods</title>
<updated>2020-11-30T18:15:30+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2020-11-30T18:11:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=44d2a5573665ab5dfb72572e43184388d15d695e'/>
<id>44d2a5573665ab5dfb72572e43184388d15d695e</id>
<content type='text'>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf evlist: Use the right prefix for 'struct evlist' event attribute config methods</title>
<updated>2020-11-30T18:15:27+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2020-11-30T18:07:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=78e1bc25786656c490befc6d44d265f263cb8861'/>
<id>78e1bc25786656c490befc6d44d265f263cb8861</id>
<content type='text'>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf evlist: Use the right prefix for 'struct evlist' print methods</title>
<updated>2020-11-30T17:55:12+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2020-11-30T17:55:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=712737241980476a277a4108e3121240a29de968'/>
<id>712737241980476a277a4108e3121240a29de968</id>
<content type='text'>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf evlist: Use the right prefix for 'struct evlist' sample id lookup  methods</title>
<updated>2020-11-30T17:17:57+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2020-11-30T17:17:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3ccf8a7b66b6bff69a7be62f2d5a2a61328ebe91'/>
<id>3ccf8a7b66b6bff69a7be62f2d5a2a61328ebe91</id>
<content type='text'>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf evlist: Use the right prefix for 'struct evlist' sample parsing methods</title>
<updated>2020-11-30T12:43:07+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2020-11-30T12:43:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2a6599cd5e093b3c607a39288f14a618c03a0e24'/>
<id>2a6599cd5e093b3c607a39288f14a618c03a0e24</id>
<content type='text'>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf data: Allow to use stdio functions for pipe mode</title>
<updated>2020-11-16T16:37:28+00:00</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2020-10-30T05:47:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=601366678c93618f37a685332c0ba07e5556798c'/>
<id>601366678c93618f37a685332c0ba07e5556798c</id>
<content type='text'>
When perf data is in a pipe, it reads each event separately using
read(2) syscall.  This is a huge performance bottleneck when
processing large data like in perf inject.  Also perf inject needs to
use write(2) syscall for the output.

So convert it to use buffer I/O functions in stdio library for pipe
data.  This makes inject-build-id bench time drops from 20ms to 8ms.

  $ perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.074 msec (+- 0.013 msec)
    Average time per event: 0.792 usec (+- 0.001 usec)
    Average memory usage: 8328 KB (+- 0 KB)
    Average build-id-all injection took: 5.490 msec (+- 0.008 msec)
    Average time per event: 0.538 usec (+- 0.001 usec)
    Average memory usage: 7563 KB (+- 0 KB)

This patch enables it just for perf inject when used with pipe (it's a
default behavior).  Maybe we could do it for perf record and/or report
later..

Committer testing:

Before:

  $ perf stat -r 5 perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.605 msec (+- 0.064 msec)
    Average time per event: 1.334 usec (+- 0.006 usec)
    Average memory usage: 12220 KB (+- 7 KB)
    Average build-id-all injection took: 11.458 msec (+- 0.058 msec)
    Average time per event: 1.123 usec (+- 0.006 usec)
    Average memory usage: 11546 KB (+- 8 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.673 msec (+- 0.057 msec)
    Average time per event: 1.341 usec (+- 0.006 usec)
    Average memory usage: 12508 KB (+- 8 KB)
    Average build-id-all injection took: 11.437 msec (+- 0.046 msec)
    Average time per event: 1.121 usec (+- 0.004 usec)
    Average memory usage: 11812 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.641 msec (+- 0.069 msec)
    Average time per event: 1.337 usec (+- 0.007 usec)
    Average memory usage: 12302 KB (+- 8 KB)
    Average build-id-all injection took: 10.820 msec (+- 0.106 msec)
    Average time per event: 1.061 usec (+- 0.010 usec)
    Average memory usage: 11616 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.379 msec (+- 0.074 msec)
    Average time per event: 1.312 usec (+- 0.007 usec)
    Average memory usage: 12334 KB (+- 8 KB)
    Average build-id-all injection took: 11.288 msec (+- 0.071 msec)
    Average time per event: 1.107 usec (+- 0.007 usec)
    Average memory usage: 11657 KB (+- 8 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.534 msec (+- 0.058 msec)
    Average time per event: 1.327 usec (+- 0.006 usec)
    Average memory usage: 12264 KB (+- 8 KB)
    Average build-id-all injection took: 11.557 msec (+- 0.076 msec)
    Average time per event: 1.133 usec (+- 0.007 usec)
    Average memory usage: 11593 KB (+- 8 KB)

   Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

            4,060.05 msec task-clock:u              #    1.566 CPUs utilized            ( +-  0.65% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             101,888      page-faults:u             #    0.025 M/sec                    ( +-  0.12% )
       3,745,833,163      cycles:u                  #    0.923 GHz                      ( +-  0.10% )  (83.22%)
         194,346,613      stalled-cycles-frontend:u #    5.19% frontend cycles idle     ( +-  0.57% )  (83.30%)
         708,495,034      stalled-cycles-backend:u  #   18.91% backend cycles idle      ( +-  0.48% )  (83.48%)
       5,629,328,628      instructions:u            #    1.50  insn per cycle
                                                    #    0.13  stalled cycles per insn  ( +-  0.21% )  (83.57%)
       1,236,697,927      branches:u                #  304.602 M/sec                    ( +-  0.16% )  (83.44%)
          17,564,877      branch-misses:u           #    1.42% of all branches          ( +-  0.23% )  (82.99%)

              2.5934 +- 0.0128 seconds time elapsed  ( +-  0.49% )

  $

After:

  $ perf stat -r 5 perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.560 msec (+- 0.125 msec)
    Average time per event: 0.839 usec (+- 0.012 usec)
    Average memory usage: 12520 KB (+- 8 KB)
    Average build-id-all injection took: 5.789 msec (+- 0.054 msec)
    Average time per event: 0.568 usec (+- 0.005 usec)
    Average memory usage: 11919 KB (+- 9 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.639 msec (+- 0.111 msec)
    Average time per event: 0.847 usec (+- 0.011 usec)
    Average memory usage: 12732 KB (+- 8 KB)
    Average build-id-all injection took: 5.647 msec (+- 0.069 msec)
    Average time per event: 0.554 usec (+- 0.007 usec)
    Average memory usage: 12093 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.551 msec (+- 0.096 msec)
    Average time per event: 0.838 usec (+- 0.009 usec)
    Average memory usage: 12739 KB (+- 8 KB)
    Average build-id-all injection took: 5.617 msec (+- 0.061 msec)
    Average time per event: 0.551 usec (+- 0.006 usec)
    Average memory usage: 12105 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.403 msec (+- 0.097 msec)
    Average time per event: 0.824 usec (+- 0.010 usec)
    Average memory usage: 12770 KB (+- 8 KB)
    Average build-id-all injection took: 5.611 msec (+- 0.085 msec)
    Average time per event: 0.550 usec (+- 0.008 usec)
    Average memory usage: 12134 KB (+- 8 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.518 msec (+- 0.102 msec)
    Average time per event: 0.835 usec (+- 0.010 usec)
    Average memory usage: 12518 KB (+- 10 KB)
    Average build-id-all injection took: 5.503 msec (+- 0.073 msec)
    Average time per event: 0.540 usec (+- 0.007 usec)
    Average memory usage: 11882 KB (+- 8 KB)

   Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

            2,394.88 msec task-clock:u              #    1.577 CPUs utilized            ( +-  0.83% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             103,181      page-faults:u             #    0.043 M/sec                    ( +-  0.11% )
       3,548,172,030      cycles:u                  #    1.482 GHz                      ( +-  0.30% )  (83.26%)
          81,537,700      stalled-cycles-frontend:u #    2.30% frontend cycles idle     ( +-  1.54% )  (83.24%)
         876,631,544      stalled-cycles-backend:u  #   24.71% backend cycles idle      ( +-  1.14% )  (83.45%)
       5,960,361,707      instructions:u            #    1.68  insn per cycle
                                                    #    0.15  stalled cycles per insn  ( +-  0.27% )  (83.26%)
       1,269,413,491      branches:u                #  530.054 M/sec                    ( +-  0.10% )  (83.48%)
          11,372,453      branch-misses:u           #    0.90% of all branches          ( +-  0.52% )  (83.31%)

             1.51874 +- 0.00642 seconds time elapsed  ( +-  0.42% )

  $

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: http://lore.kernel.org/lkml/20201030054742.87740-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When perf data is in a pipe, it reads each event separately using
read(2) syscall.  This is a huge performance bottleneck when
processing large data like in perf inject.  Also perf inject needs to
use write(2) syscall for the output.

So convert it to use buffer I/O functions in stdio library for pipe
data.  This makes inject-build-id bench time drops from 20ms to 8ms.

  $ perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.074 msec (+- 0.013 msec)
    Average time per event: 0.792 usec (+- 0.001 usec)
    Average memory usage: 8328 KB (+- 0 KB)
    Average build-id-all injection took: 5.490 msec (+- 0.008 msec)
    Average time per event: 0.538 usec (+- 0.001 usec)
    Average memory usage: 7563 KB (+- 0 KB)

This patch enables it just for perf inject when used with pipe (it's a
default behavior).  Maybe we could do it for perf record and/or report
later..

Committer testing:

Before:

  $ perf stat -r 5 perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.605 msec (+- 0.064 msec)
    Average time per event: 1.334 usec (+- 0.006 usec)
    Average memory usage: 12220 KB (+- 7 KB)
    Average build-id-all injection took: 11.458 msec (+- 0.058 msec)
    Average time per event: 1.123 usec (+- 0.006 usec)
    Average memory usage: 11546 KB (+- 8 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.673 msec (+- 0.057 msec)
    Average time per event: 1.341 usec (+- 0.006 usec)
    Average memory usage: 12508 KB (+- 8 KB)
    Average build-id-all injection took: 11.437 msec (+- 0.046 msec)
    Average time per event: 1.121 usec (+- 0.004 usec)
    Average memory usage: 11812 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.641 msec (+- 0.069 msec)
    Average time per event: 1.337 usec (+- 0.007 usec)
    Average memory usage: 12302 KB (+- 8 KB)
    Average build-id-all injection took: 10.820 msec (+- 0.106 msec)
    Average time per event: 1.061 usec (+- 0.010 usec)
    Average memory usage: 11616 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.379 msec (+- 0.074 msec)
    Average time per event: 1.312 usec (+- 0.007 usec)
    Average memory usage: 12334 KB (+- 8 KB)
    Average build-id-all injection took: 11.288 msec (+- 0.071 msec)
    Average time per event: 1.107 usec (+- 0.007 usec)
    Average memory usage: 11657 KB (+- 8 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 13.534 msec (+- 0.058 msec)
    Average time per event: 1.327 usec (+- 0.006 usec)
    Average memory usage: 12264 KB (+- 8 KB)
    Average build-id-all injection took: 11.557 msec (+- 0.076 msec)
    Average time per event: 1.133 usec (+- 0.007 usec)
    Average memory usage: 11593 KB (+- 8 KB)

   Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

            4,060.05 msec task-clock:u              #    1.566 CPUs utilized            ( +-  0.65% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             101,888      page-faults:u             #    0.025 M/sec                    ( +-  0.12% )
       3,745,833,163      cycles:u                  #    0.923 GHz                      ( +-  0.10% )  (83.22%)
         194,346,613      stalled-cycles-frontend:u #    5.19% frontend cycles idle     ( +-  0.57% )  (83.30%)
         708,495,034      stalled-cycles-backend:u  #   18.91% backend cycles idle      ( +-  0.48% )  (83.48%)
       5,629,328,628      instructions:u            #    1.50  insn per cycle
                                                    #    0.13  stalled cycles per insn  ( +-  0.21% )  (83.57%)
       1,236,697,927      branches:u                #  304.602 M/sec                    ( +-  0.16% )  (83.44%)
          17,564,877      branch-misses:u           #    1.42% of all branches          ( +-  0.23% )  (82.99%)

              2.5934 +- 0.0128 seconds time elapsed  ( +-  0.49% )

  $

After:

  $ perf stat -r 5 perf bench internals inject-build-id
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.560 msec (+- 0.125 msec)
    Average time per event: 0.839 usec (+- 0.012 usec)
    Average memory usage: 12520 KB (+- 8 KB)
    Average build-id-all injection took: 5.789 msec (+- 0.054 msec)
    Average time per event: 0.568 usec (+- 0.005 usec)
    Average memory usage: 11919 KB (+- 9 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.639 msec (+- 0.111 msec)
    Average time per event: 0.847 usec (+- 0.011 usec)
    Average memory usage: 12732 KB (+- 8 KB)
    Average build-id-all injection took: 5.647 msec (+- 0.069 msec)
    Average time per event: 0.554 usec (+- 0.007 usec)
    Average memory usage: 12093 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.551 msec (+- 0.096 msec)
    Average time per event: 0.838 usec (+- 0.009 usec)
    Average memory usage: 12739 KB (+- 8 KB)
    Average build-id-all injection took: 5.617 msec (+- 0.061 msec)
    Average time per event: 0.551 usec (+- 0.006 usec)
    Average memory usage: 12105 KB (+- 7 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.403 msec (+- 0.097 msec)
    Average time per event: 0.824 usec (+- 0.010 usec)
    Average memory usage: 12770 KB (+- 8 KB)
    Average build-id-all injection took: 5.611 msec (+- 0.085 msec)
    Average time per event: 0.550 usec (+- 0.008 usec)
    Average memory usage: 12134 KB (+- 8 KB)
  # Running 'internals/inject-build-id' benchmark:
    Average build-id injection took: 8.518 msec (+- 0.102 msec)
    Average time per event: 0.835 usec (+- 0.010 usec)
    Average memory usage: 12518 KB (+- 10 KB)
    Average build-id-all injection took: 5.503 msec (+- 0.073 msec)
    Average time per event: 0.540 usec (+- 0.007 usec)
    Average memory usage: 11882 KB (+- 8 KB)

   Performance counter stats for 'perf bench internals inject-build-id' (5 runs):

            2,394.88 msec task-clock:u              #    1.577 CPUs utilized            ( +-  0.83% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             103,181      page-faults:u             #    0.043 M/sec                    ( +-  0.11% )
       3,548,172,030      cycles:u                  #    1.482 GHz                      ( +-  0.30% )  (83.26%)
          81,537,700      stalled-cycles-frontend:u #    2.30% frontend cycles idle     ( +-  1.54% )  (83.24%)
         876,631,544      stalled-cycles-backend:u  #   24.71% backend cycles idle      ( +-  1.14% )  (83.45%)
       5,960,361,707      instructions:u            #    1.68  insn per cycle
                                                    #    0.15  stalled cycles per insn  ( +-  0.27% )  (83.26%)
       1,269,413,491      branches:u                #  530.054 M/sec                    ( +-  0.10% )  (83.48%)
          11,372,453      branch-misses:u           #    0.90% of all branches          ( +-  0.52% )  (83.31%)

             1.51874 +- 0.00642 seconds time elapsed  ( +-  0.42% )

  $

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: http://lore.kernel.org/lkml/20201030054742.87740-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
