<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/tools/perf/Documentation, branch v5.5</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>perf kvm: Clarify the 'perf kvm' -i and -o command line options</title>
<updated>2019-12-02T18:38:59+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-12-02T18:38:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9974406884459c9301597c2c9f7def6c38099ab4'/>
<id>9974406884459c9301597c2c9f7def6c38099ab4</id>
<content type='text'>
The 'perf kvm' subcommand has options that it in turn passes to other
perf subcommands such as 'report' and 'record', particularly -i and -o
end up setting the same variable that will then be used for 'record's -o
and report '-i', which ends up being confusing, leading some to think
that both -i and -o can be used with 'report'.

Improve the man page to state that -i is used with the post-processing
subcommands while -o is used just with 'record' and that to save the
output of 'report' one should simply redirect its output to a file.

Noticed while reading the https://www.linux-kvm.org/page/Perf_events
page.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Steve Dickson &lt;steved@redhat.com&gt;
Cc: William Cohen &lt;wcohen@redhat.com&gt;
Link: https://lkml.kernel.org/n/tip-tclbttvmgtm525fvmh85f7d9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The 'perf kvm' subcommand has options that it in turn passes to other
perf subcommands such as 'report' and 'record', particularly -i and -o
end up setting the same variable that will then be used for 'record's -o
and report '-i', which ends up being confusing, leading some to think
that both -i and -o can be used with 'report'.

Improve the man page to state that -i is used with the post-processing
subcommands while -o is used just with 'record' and that to save the
output of 'report' one should simply redirect its output to a file.

Noticed while reading the https://www.linux-kvm.org/page/Perf_events
page.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Steve Dickson &lt;steved@redhat.com&gt;
Cc: William Cohen &lt;wcohen@redhat.com&gt;
Link: https://lkml.kernel.org/n/tip-tclbttvmgtm525fvmh85f7d9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf intel-pt: Add support for recording AUX area samples</title>
<updated>2019-11-22T13:48:13+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2019-11-15T12:42:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c4ab2f0f763da64d88cec6f20fd664f2347eca60'/>
<id>c4ab2f0f763da64d88cec6f20fd664f2347eca60</id>
<content type='text'>
Set up the default number of mmap pages, default sample size and default
psb_period for AUX area sampling. Add documentation also.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191115124225.5247-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Set up the default number of mmap pages, default sample size and default
psb_period for AUX area sampling. Add documentation also.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191115124225.5247-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf record: Add aux-sample-size config term</title>
<updated>2019-11-22T13:48:13+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2019-11-15T12:42:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=eb7a52d46c6ac95df563f867d526b3d46616b10b'/>
<id>eb7a52d46c6ac95df563f867d526b3d46616b10b</id>
<content type='text'>
To allow individual events to be selected for AUX area sampling, add
aux-sample-size config term. attr.aux_sample_size is updated by
auxtrace_parse_sample_options() so that the existing validation will see
the value. Any event that has a non-zero aux_sample_size will cause AUX
area sampling to be configured, irrespective of the --aux-sample option.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191115124225.5247-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To allow individual events to be selected for AUX area sampling, add
aux-sample-size config term. attr.aux_sample_size is updated by
auxtrace_parse_sample_options() so that the existing validation will see
the value. Any event that has a non-zero aux_sample_size will cause AUX
area sampling to be configured, irrespective of the --aux-sample option.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191115124225.5247-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf record: Add support for AUX area sampling</title>
<updated>2019-11-22T13:48:13+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2019-11-15T12:42:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c0a6de06c446f8d173ef53fba361acedd5880b20'/>
<id>c0a6de06c446f8d173ef53fba361acedd5880b20</id>
<content type='text'>
Add a 'perf record' option '--aux-sample' to request AUX area sampling.
AUX area sampling uses an overwriting buffer much like snapshot mode, so
adjust the AUX buffer mmapping accordingly. To make it easy to queue
samples for decoding, synthesize an ID index.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191115124225.5247-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a 'perf record' option '--aux-sample' to request AUX area sampling.
AUX area sampling uses an overwriting buffer much like snapshot mode, so
adjust the AUX buffer mmapping accordingly. To make it easy to queue
samples for decoding, synthesize an ID index.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191115124225.5247-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf tool: Provide an option to print perf_event_open args and return value</title>
<updated>2019-11-12T11:32:27+00:00</updated>
<author>
<name>Ravi Bangoria</name>
<email>ravi.bangoria@linux.ibm.com</email>
</author>
<published>2019-11-08T09:41:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ccd26741f5e6bdf2c18ea38b32e6a347c5648a97'/>
<id>ccd26741f5e6bdf2c18ea38b32e6a347c5648a97</id>
<content type='text'>
Perf record with verbose=2 already prints this information along with
whole lot of other traces which requires lot of scrolling. Introduce
an option to print only perf_event_open() arguments and return value.

Sample o/p:

  $ perf --debug perf-event-open=1 record -- ls &gt; /dev/null
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  sys_perf_event_open: pid 4308  cpu 0  group_fd -1  flags 0x8 = 4
  sys_perf_event_open: pid 4308  cpu 1  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 4308  cpu 2  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 4308  cpu 3  group_fd -1  flags 0x8 = 8
  sys_perf_event_open: pid 4308  cpu 4  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 4308  cpu 5  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 4308  cpu 6  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 4308  cpu 7  group_fd -1  flags 0x8 = 12
  ------------------------------------------------------------
  perf_event_attr:
    type                             1
    size                             112
    config                           0x9
    watermark                        1
    sample_id_all                    1
    bpf_event                        1
    { wakeup_events, wakeup_watermark } 1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -13
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.002 MB perf.data (9 samples) ]

Committer notes:

Just like the 'verbose' variable this new 'debug_peo_args' needs to be
added to util/python.c, since we don't link the debug.o file in the
python binding, which ended up making 'perf test python' fail with:

  # perf test -v python
  18: 'import perf' in python                               :
  --- start ---
  test child forked, pid 19237
  Traceback (most recent call last):
    File "&lt;stdin&gt;", line 1, in &lt;module&gt;
  ImportError: /tmp/build/perf/python/perf.so: undefined symbol: debug_peo_args
  test child finished with -1
  ---- end ----
  'import perf' in python: FAILED!
  #

After adding that new variable to util/python.c:

  # perf test -v python
  18: 'import perf' in python                               :
  --- start ---
  test child forked, pid 22364
  test child finished with 0
  ---- end ----
  'import perf' in python: Ok
  #

Signed-off-by: Ravi Bangoria &lt;ravi.bangoria@linux.ibm.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191108094128.28769-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Perf record with verbose=2 already prints this information along with
whole lot of other traces which requires lot of scrolling. Introduce
an option to print only perf_event_open() arguments and return value.

Sample o/p:

  $ perf --debug perf-event-open=1 record -- ls &gt; /dev/null
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  sys_perf_event_open: pid 4308  cpu 0  group_fd -1  flags 0x8 = 4
  sys_perf_event_open: pid 4308  cpu 1  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 4308  cpu 2  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 4308  cpu 3  group_fd -1  flags 0x8 = 8
  sys_perf_event_open: pid 4308  cpu 4  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 4308  cpu 5  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 4308  cpu 6  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 4308  cpu 7  group_fd -1  flags 0x8 = 12
  ------------------------------------------------------------
  perf_event_attr:
    type                             1
    size                             112
    config                           0x9
    watermark                        1
    sample_id_all                    1
    bpf_event                        1
    { wakeup_events, wakeup_watermark } 1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -13
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.002 MB perf.data (9 samples) ]

Committer notes:

Just like the 'verbose' variable this new 'debug_peo_args' needs to be
added to util/python.c, since we don't link the debug.o file in the
python binding, which ended up making 'perf test python' fail with:

  # perf test -v python
  18: 'import perf' in python                               :
  --- start ---
  test child forked, pid 19237
  Traceback (most recent call last):
    File "&lt;stdin&gt;", line 1, in &lt;module&gt;
  ImportError: /tmp/build/perf/python/perf.so: undefined symbol: debug_peo_args
  test child finished with -1
  ---- end ----
  'import perf' in python: FAILED!
  #

After adding that new variable to util/python.c:

  # perf test -v python
  18: 'import perf' in python                               :
  --- start ---
  test child forked, pid 22364
  test child finished with 0
  ---- end ----
  'import perf' in python: Ok
  #

Signed-off-by: Ravi Bangoria &lt;ravi.bangoria@linux.ibm.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191108094128.28769-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf report: Sort by sampled cycles percent per block for stdio</title>
<updated>2019-11-07T13:14:48+00:00</updated>
<author>
<name>Jin Yao</name>
<email>yao.jin@linux.intel.com</email>
</author>
<published>2019-11-07T07:47:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6f7164fa231a5f360e576593c547bea7dc56ddbc'/>
<id>6f7164fa231a5f360e576593c547bea7dc56ddbc</id>
<content type='text'>
It would be useful to support sorting for all blocks by the sampled
cycles percent per block. This is useful to concentrate on the globally
hottest blocks.

This patch implements a new option "--total-cycles" which sorts all
blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the percent:

 percent = block sampled cycles aggregation / total sampled cycles

Note that, this patch only supports "--stdio" mode.

For example,

  # perf record -b ./div
  # perf report --total-cycles --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 2M of event 'cycles'
  # Event count (approx.): 2753248
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                             [Program Block Range]      Shared Object
  # ...............  ..............  ...........  ..........  ................................................  .................
  #
             26.04%            2.8M        0.40%          18                            [div.c:42 -&gt; div.c:39]                div
             15.17%            1.2M        0.16%           7                [random_r.c:357 -&gt; random_r.c:380]       libc-2.27.so
              5.11%          402.0K        0.04%           2                            [div.c:27 -&gt; div.c:28]                div
              4.87%          381.6K        0.04%           2                    [random.c:288 -&gt; random.c:291]       libc-2.27.so
              4.53%          381.0K        0.04%           2                            [div.c:40 -&gt; div.c:40]                div
              3.85%          300.9K        0.02%           1                            [div.c:22 -&gt; div.c:25]                div
              3.08%          241.1K        0.02%           1                          [rand.c:26 -&gt; rand.c:27]       libc-2.27.so
              3.06%          240.0K        0.02%           1                    [random.c:291 -&gt; random.c:291]       libc-2.27.so
              2.78%          215.7K        0.02%           1                    [random.c:298 -&gt; random.c:298]       libc-2.27.so
              2.52%          198.3K        0.02%           1                    [random.c:293 -&gt; random.c:293]       libc-2.27.so
              2.36%          184.8K        0.02%           1                          [rand.c:28 -&gt; rand.c:28]       libc-2.27.so
              2.33%          180.5K        0.02%           1                    [random.c:295 -&gt; random.c:295]       libc-2.27.so
              2.28%          176.7K        0.02%           1                    [random.c:295 -&gt; random.c:295]       libc-2.27.so
              2.20%          168.8K        0.02%           1                        [rand@plt+0 -&gt; rand@plt+0]                div
              1.98%          158.2K        0.02%           1                [random_r.c:388 -&gt; random_r.c:388]       libc-2.27.so
              1.57%          123.3K        0.02%           1                            [div.c:42 -&gt; div.c:44]                div
              1.44%          116.0K        0.42%          19                [random_r.c:357 -&gt; random_r.c:394]       libc-2.27.so
              0.25%          182.5K        0.02%           1                [random_r.c:388 -&gt; random_r.c:391]       libc-2.27.so
              0.00%              48        1.07%          48        [x86_pmu_enable+284 -&gt; x86_pmu_enable+298]  [kernel.kallsyms]
              0.00%              74        1.64%          74             [vm_mmap_pgoff+0 -&gt; vm_mmap_pgoff+92]  [kernel.kallsyms]
              0.00%              73        1.62%          73                         [vm_mmap+0 -&gt; vm_mmap+48]  [kernel.kallsyms]
              0.00%              63        0.69%          31                       [up_write+0 -&gt; up_write+34]  [kernel.kallsyms]
              0.00%              13        0.29%          13      [setup_arg_pages+396 -&gt; setup_arg_pages+413]  [kernel.kallsyms]
              0.00%               3        0.07%           3      [setup_arg_pages+418 -&gt; setup_arg_pages+450]  [kernel.kallsyms]
              0.00%             616        6.84%         308   [security_mmap_file+0 -&gt; security_mmap_file+72]  [kernel.kallsyms]
              0.00%              23        0.51%          23  [security_mmap_file+77 -&gt; security_mmap_file+87]  [kernel.kallsyms]
              0.00%               4        0.02%           1                  [sched_clock+0 -&gt; sched_clock+4]  [kernel.kallsyms]
              0.00%               4        0.02%           1                 [sched_clock+9 -&gt; sched_clock+12]  [kernel.kallsyms]
              0.00%               1        0.02%           1                [rcu_nmi_exit+0 -&gt; rcu_nmi_exit+9]  [kernel.kallsyms]

Committer testing:

This should provide material for hours of endless joy, both from looking
for suspicious things in the implementation of this patch, such as the
top one:

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
              2.17%            1.7M        0.08%         607   [compiler.h:199 -&gt; common.c:221]              [kernel.vmlinux]

As well from things that look legit:

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
              0.16%          123.0K        0.60%        4.7K   [nospec-branch.h:265 -&gt; nospec-branch.h:278]  [kernel.vmlinux]

:-)

Very short system wide taken branches session:

  # perf record -h -b

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

      -b, --branch-any      sample any taken branches

  #
  # perf record -b
  ^C[ perf record: Woken up 595 times to write data ]
  [ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]

  #
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
  #
  # perf report --total-cycles --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 6M of event 'cycles'
  # Event count (approx.): 6299936
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
  # ...............  ..............  ...........  ..........  ......................................................................  ....................
  #
              2.17%            1.7M        0.08%         607                                        [compiler.h:199 -&gt; common.c:221]      [kernel.vmlinux]
              1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -&gt; memset-vec-unaligned-erms.S:151]          libc-2.29.so
              0.72%          544.5K        0.03%         230                                      [entry_64.S:657 -&gt; entry_64.S:662]      [kernel.vmlinux]
              0.56%          541.8K        0.09%         672                                        [compiler.h:199 -&gt; common.c:300]      [kernel.vmlinux]
              0.39%          293.2K        0.01%         104                                    [list_debug.c:43 -&gt; list_debug.c:61]      [kernel.vmlinux]
              0.36%          278.6K        0.03%         272                                    [entry_64.S:1289 -&gt; entry_64.S:1308]      [kernel.vmlinux]
              0.30%          260.8K        0.07%         564                              [clear_page_64.S:47 -&gt; clear_page_64.S:50]      [kernel.vmlinux]
              0.28%          215.3K        0.05%         369                                            [traps.c:623 -&gt; traps.c:628]      [kernel.vmlinux]
              0.23%          178.1K        0.04%         278                                      [entry_64.S:271 -&gt; entry_64.S:275]      [kernel.vmlinux]
              0.20%          152.6K        0.09%         706                                      [paravirt.c:177 -&gt; paravirt.c:179]      [kernel.vmlinux]
              0.20%          155.8K        0.05%         373                                      [entry_64.S:153 -&gt; entry_64.S:175]      [kernel.vmlinux]
              0.18%          136.6K        0.03%         222                                                [msr.h:105 -&gt; msr.h:166]      [kernel.vmlinux]
              0.16%          123.0K        0.60%        4.7K                            [nospec-branch.h:265 -&gt; nospec-branch.h:278]      [kernel.vmlinux]
              0.16%          118.3K        0.01%          44                                      [entry_64.S:632 -&gt; entry_64.S:657]      [kernel.vmlinux]
              0.14%          104.5K        0.00%          28                                          [rwsem.c:1541 -&gt; rwsem.c:1544]      [kernel.vmlinux]
              0.13%           99.2K        0.01%          53                                      [spinlock.c:150 -&gt; spinlock.c:152]      [kernel.vmlinux]
              0.13%           95.5K        0.00%          35                                              [swap.c:456 -&gt; swap.c:471]      [kernel.vmlinux]
              0.12%           96.2K        0.05%         407                              [copy_user_64.S:175 -&gt; copy_user_64.S:209]      [kernel.vmlinux]
              0.11%           85.9K        0.00%          31                                        [swap.c:400 -&gt; page-flags.h:188]      [kernel.vmlinux]
              0.10%           73.0K        0.01%          52                                          [paravirt.h:763 -&gt; list.h:131]      [kernel.vmlinux]
              0.07%           56.2K        0.03%         214                                      [filemap.c:1524 -&gt; filemap.c:1557]      [kernel.vmlinux]
              0.07%           54.2K        0.02%         145                                        [memory.c:1032 -&gt; memory.c:1049]      [kernel.vmlinux]
              0.07%           50.3K        0.00%          39                                            [mmzone.c:49 -&gt; mmzone.c:69]      [kernel.vmlinux]
              0.06%           48.3K        0.01%          40                                   [paravirt.h:768 -&gt; page_alloc.c:3304]      [kernel.vmlinux]
              0.06%           46.7K        0.02%         155                                        [memory.c:1032 -&gt; memory.c:1056]      [kernel.vmlinux]
              0.06%           46.9K        0.01%         103                                              [swap.c:867 -&gt; swap.c:902]      [kernel.vmlinux]
              0.06%           47.8K        0.00%          34                                    [entry_64.S:1201 -&gt; entry_64.S:1202]      [kernel.vmlinux]

 -----------------------------------------------------------

 v7:
 ---
 Use use_browser in report__browse_block_hists for supporting
 stdio and potential tui mode.

 v6:
 ---
 Create report__browse_block_hists in block-info.c (codes are
 moved from builtin-report.c). It's called from
 perf_evlist__tty_browse_hists.

 v5:
 ---
 1. Move all block functions to block-info.c

 2. Move the code of setting ms in block hist_entry to
    other patch.

 v4:
 ---
 1. Use new option '--total-cycles' to replace
    '-s total_cycles' in v3.

 2. Move block info collection out of block info
    printing.

 v3:
 ---
 1. Use common function block_info__process_sym to
    process the blocks per symbol.

 2. Remove the nasty hack for skipping calculation
    of column length

 3. Some minor cleanup

Signed-off-by: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jin Yao &lt;yao.jin@intel.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20191107074719.26139-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It would be useful to support sorting for all blocks by the sampled
cycles percent per block. This is useful to concentrate on the globally
hottest blocks.

This patch implements a new option "--total-cycles" which sorts all
blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the percent:

 percent = block sampled cycles aggregation / total sampled cycles

Note that, this patch only supports "--stdio" mode.

For example,

  # perf record -b ./div
  # perf report --total-cycles --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 2M of event 'cycles'
  # Event count (approx.): 2753248
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                             [Program Block Range]      Shared Object
  # ...............  ..............  ...........  ..........  ................................................  .................
  #
             26.04%            2.8M        0.40%          18                            [div.c:42 -&gt; div.c:39]                div
             15.17%            1.2M        0.16%           7                [random_r.c:357 -&gt; random_r.c:380]       libc-2.27.so
              5.11%          402.0K        0.04%           2                            [div.c:27 -&gt; div.c:28]                div
              4.87%          381.6K        0.04%           2                    [random.c:288 -&gt; random.c:291]       libc-2.27.so
              4.53%          381.0K        0.04%           2                            [div.c:40 -&gt; div.c:40]                div
              3.85%          300.9K        0.02%           1                            [div.c:22 -&gt; div.c:25]                div
              3.08%          241.1K        0.02%           1                          [rand.c:26 -&gt; rand.c:27]       libc-2.27.so
              3.06%          240.0K        0.02%           1                    [random.c:291 -&gt; random.c:291]       libc-2.27.so
              2.78%          215.7K        0.02%           1                    [random.c:298 -&gt; random.c:298]       libc-2.27.so
              2.52%          198.3K        0.02%           1                    [random.c:293 -&gt; random.c:293]       libc-2.27.so
              2.36%          184.8K        0.02%           1                          [rand.c:28 -&gt; rand.c:28]       libc-2.27.so
              2.33%          180.5K        0.02%           1                    [random.c:295 -&gt; random.c:295]       libc-2.27.so
              2.28%          176.7K        0.02%           1                    [random.c:295 -&gt; random.c:295]       libc-2.27.so
              2.20%          168.8K        0.02%           1                        [rand@plt+0 -&gt; rand@plt+0]                div
              1.98%          158.2K        0.02%           1                [random_r.c:388 -&gt; random_r.c:388]       libc-2.27.so
              1.57%          123.3K        0.02%           1                            [div.c:42 -&gt; div.c:44]                div
              1.44%          116.0K        0.42%          19                [random_r.c:357 -&gt; random_r.c:394]       libc-2.27.so
              0.25%          182.5K        0.02%           1                [random_r.c:388 -&gt; random_r.c:391]       libc-2.27.so
              0.00%              48        1.07%          48        [x86_pmu_enable+284 -&gt; x86_pmu_enable+298]  [kernel.kallsyms]
              0.00%              74        1.64%          74             [vm_mmap_pgoff+0 -&gt; vm_mmap_pgoff+92]  [kernel.kallsyms]
              0.00%              73        1.62%          73                         [vm_mmap+0 -&gt; vm_mmap+48]  [kernel.kallsyms]
              0.00%              63        0.69%          31                       [up_write+0 -&gt; up_write+34]  [kernel.kallsyms]
              0.00%              13        0.29%          13      [setup_arg_pages+396 -&gt; setup_arg_pages+413]  [kernel.kallsyms]
              0.00%               3        0.07%           3      [setup_arg_pages+418 -&gt; setup_arg_pages+450]  [kernel.kallsyms]
              0.00%             616        6.84%         308   [security_mmap_file+0 -&gt; security_mmap_file+72]  [kernel.kallsyms]
              0.00%              23        0.51%          23  [security_mmap_file+77 -&gt; security_mmap_file+87]  [kernel.kallsyms]
              0.00%               4        0.02%           1                  [sched_clock+0 -&gt; sched_clock+4]  [kernel.kallsyms]
              0.00%               4        0.02%           1                 [sched_clock+9 -&gt; sched_clock+12]  [kernel.kallsyms]
              0.00%               1        0.02%           1                [rcu_nmi_exit+0 -&gt; rcu_nmi_exit+9]  [kernel.kallsyms]

Committer testing:

This should provide material for hours of endless joy, both from looking
for suspicious things in the implementation of this patch, such as the
top one:

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
              2.17%            1.7M        0.08%         607   [compiler.h:199 -&gt; common.c:221]              [kernel.vmlinux]

As well from things that look legit:

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
              0.16%          123.0K        0.60%        4.7K   [nospec-branch.h:265 -&gt; nospec-branch.h:278]  [kernel.vmlinux]

:-)

Very short system wide taken branches session:

  # perf record -h -b

   Usage: perf record [&lt;options&gt;] [&lt;command&gt;]
      or: perf record [&lt;options&gt;] -- &lt;command&gt; [&lt;options&gt;]

      -b, --branch-any      sample any taken branches

  #
  # perf record -b
  ^C[ perf record: Woken up 595 times to write data ]
  [ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]

  #
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
  #
  # perf report --total-cycles --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 6M of event 'cycles'
  # Event count (approx.): 6299936
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
  # ...............  ..............  ...........  ..........  ......................................................................  ....................
  #
              2.17%            1.7M        0.08%         607                                        [compiler.h:199 -&gt; common.c:221]      [kernel.vmlinux]
              1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -&gt; memset-vec-unaligned-erms.S:151]          libc-2.29.so
              0.72%          544.5K        0.03%         230                                      [entry_64.S:657 -&gt; entry_64.S:662]      [kernel.vmlinux]
              0.56%          541.8K        0.09%         672                                        [compiler.h:199 -&gt; common.c:300]      [kernel.vmlinux]
              0.39%          293.2K        0.01%         104                                    [list_debug.c:43 -&gt; list_debug.c:61]      [kernel.vmlinux]
              0.36%          278.6K        0.03%         272                                    [entry_64.S:1289 -&gt; entry_64.S:1308]      [kernel.vmlinux]
              0.30%          260.8K        0.07%         564                              [clear_page_64.S:47 -&gt; clear_page_64.S:50]      [kernel.vmlinux]
              0.28%          215.3K        0.05%         369                                            [traps.c:623 -&gt; traps.c:628]      [kernel.vmlinux]
              0.23%          178.1K        0.04%         278                                      [entry_64.S:271 -&gt; entry_64.S:275]      [kernel.vmlinux]
              0.20%          152.6K        0.09%         706                                      [paravirt.c:177 -&gt; paravirt.c:179]      [kernel.vmlinux]
              0.20%          155.8K        0.05%         373                                      [entry_64.S:153 -&gt; entry_64.S:175]      [kernel.vmlinux]
              0.18%          136.6K        0.03%         222                                                [msr.h:105 -&gt; msr.h:166]      [kernel.vmlinux]
              0.16%          123.0K        0.60%        4.7K                            [nospec-branch.h:265 -&gt; nospec-branch.h:278]      [kernel.vmlinux]
              0.16%          118.3K        0.01%          44                                      [entry_64.S:632 -&gt; entry_64.S:657]      [kernel.vmlinux]
              0.14%          104.5K        0.00%          28                                          [rwsem.c:1541 -&gt; rwsem.c:1544]      [kernel.vmlinux]
              0.13%           99.2K        0.01%          53                                      [spinlock.c:150 -&gt; spinlock.c:152]      [kernel.vmlinux]
              0.13%           95.5K        0.00%          35                                              [swap.c:456 -&gt; swap.c:471]      [kernel.vmlinux]
              0.12%           96.2K        0.05%         407                              [copy_user_64.S:175 -&gt; copy_user_64.S:209]      [kernel.vmlinux]
              0.11%           85.9K        0.00%          31                                        [swap.c:400 -&gt; page-flags.h:188]      [kernel.vmlinux]
              0.10%           73.0K        0.01%          52                                          [paravirt.h:763 -&gt; list.h:131]      [kernel.vmlinux]
              0.07%           56.2K        0.03%         214                                      [filemap.c:1524 -&gt; filemap.c:1557]      [kernel.vmlinux]
              0.07%           54.2K        0.02%         145                                        [memory.c:1032 -&gt; memory.c:1049]      [kernel.vmlinux]
              0.07%           50.3K        0.00%          39                                            [mmzone.c:49 -&gt; mmzone.c:69]      [kernel.vmlinux]
              0.06%           48.3K        0.01%          40                                   [paravirt.h:768 -&gt; page_alloc.c:3304]      [kernel.vmlinux]
              0.06%           46.7K        0.02%         155                                        [memory.c:1032 -&gt; memory.c:1056]      [kernel.vmlinux]
              0.06%           46.9K        0.01%         103                                              [swap.c:867 -&gt; swap.c:902]      [kernel.vmlinux]
              0.06%           47.8K        0.00%          34                                    [entry_64.S:1201 -&gt; entry_64.S:1202]      [kernel.vmlinux]

 -----------------------------------------------------------

 v7:
 ---
 Use use_browser in report__browse_block_hists for supporting
 stdio and potential tui mode.

 v6:
 ---
 Create report__browse_block_hists in block-info.c (codes are
 moved from builtin-report.c). It's called from
 perf_evlist__tty_browse_hists.

 v5:
 ---
 1. Move all block functions to block-info.c

 2. Move the code of setting ms in block hist_entry to
    other patch.

 v4:
 ---
 1. Use new option '--total-cycles' to replace
    '-s total_cycles' in v3.

 2. Move block info collection out of block info
    printing.

 v3:
 ---
 1. Use common function block_info__process_sym to
    process the blocks per symbol.

 2. Remove the nasty hack for skipping calculation
    of column length

 3. Some minor cleanup

Signed-off-by: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jin Yao &lt;yao.jin@intel.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20191107074719.26139-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf record: Add support for limit perf output file size</title>
<updated>2019-11-07T11:30:19+00:00</updated>
<author>
<name>Jiwei Sun</name>
<email>jiwei.sun@windriver.com</email>
</author>
<published>2019-10-22T08:09:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6d57581659f7229903d141455c7308e309056e89'/>
<id>6d57581659f7229903d141455c7308e309056e89</id>
<content type='text'>
The patch adds a new option to limit the output file size, then based on
it, we can create a wrapper of the perf command that uses the option to
avoid exhausting the disk space by the unconscious user.

In order to make the perf.data parsable, we just limit the sample data
size, since the perf.data consists of many headers and sample data and
other data, the actual size of the recorded file will bigger than the
setting value.

Testing it:

  # ./perf record -a -g --max-size=10M
  Couldn't synthesize bpf events.
  [ perf record: perf size limit reached (10249 KB), stopping session ]
  [ perf record: Woken up 32 times to write data ]
  [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ]

  # ls -lh perf.data
  -rw------- 1 root root 11M Oct 22 14:32 perf.data

  # ./perf record -a -g --max-size=10K
  [ perf record: perf size limit reached (10 KB), stopping session ]
  Couldn't synthesize bpf events.
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ]

  # ls -l perf.data
  -rw------- 1 root root 1626952 Oct 22 14:36 perf.data

Committer notes:

Fixed the build in multiple distros by using PRIu64 to print u64 struct
members, fixing this:

  builtin-record.c: In function 'record__write':
  builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
       rec-&gt;bytes_written &gt;&gt; 10);
       ^
    CC       /tmp/build/pe

Signed-off-by: Jiwei Sun &lt;jiwei.sun@windriver.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Richard Danter &lt;richard.danter@windriver.com&gt;
Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The patch adds a new option to limit the output file size, then based on
it, we can create a wrapper of the perf command that uses the option to
avoid exhausting the disk space by the unconscious user.

In order to make the perf.data parsable, we just limit the sample data
size, since the perf.data consists of many headers and sample data and
other data, the actual size of the recorded file will bigger than the
setting value.

Testing it:

  # ./perf record -a -g --max-size=10M
  Couldn't synthesize bpf events.
  [ perf record: perf size limit reached (10249 KB), stopping session ]
  [ perf record: Woken up 32 times to write data ]
  [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ]

  # ls -lh perf.data
  -rw------- 1 root root 11M Oct 22 14:32 perf.data

  # ./perf record -a -g --max-size=10K
  [ perf record: perf size limit reached (10 KB), stopping session ]
  Couldn't synthesize bpf events.
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ]

  # ls -l perf.data
  -rw------- 1 root root 1626952 Oct 22 14:36 perf.data

Committer notes:

Fixed the build in multiple distros by using PRIu64 to print u64 struct
members, fixing this:

  builtin-record.c: In function 'record__write':
  builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
       rec-&gt;bytes_written &gt;&gt; 10);
       ^
    CC       /tmp/build/pe

Signed-off-by: Jiwei Sun &lt;jiwei.sun@windriver.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Richard Danter &lt;richard.danter@windriver.com&gt;
Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf stat: Add --per-node agregation support</title>
<updated>2019-11-06T18:49:39+00:00</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@kernel.org</email>
</author>
<published>2019-08-28T08:17:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=86895b480a2f10c7c6659fc5312f84b93011ce2d'/>
<id>86895b480a2f10c7c6659fc5312f84b93011ce2d</id>
<content type='text'>
Adding new --per-node option to aggregate counts per NUMA
nodes for system-wide mode measurements.

You can specify --per-node in live mode:

  # perf stat  -a -I 1000 -e cycles --per-node
  #           time node   cpus             counts unit events
       1.000542550 N0       20          6,202,097      cycles
       1.000542550 N1       20            639,559      cycles
       2.002040063 N0       20          7,412,495      cycles
       2.002040063 N1       20          2,185,577      cycles
       3.003451699 N0       20          6,508,917      cycles
       3.003451699 N1       20            765,607      cycles
  ...

Or in the record/report stat session:

  # perf stat record -a -I 1000 -e cycles
  #           time             counts unit events
       1.000536937         10,008,468      cycles
       2.002090152          9,578,539      cycles
       3.003625233          7,647,869      cycles
       4.005135036          7,032,086      cycles
  ^C     4.340902364          3,923,893      cycles

  # perf stat report --per-node
  #           time node   cpus             counts unit events
       1.000536937 N0       20          9,355,086      cycles
       1.000536937 N1       20            653,382      cycles
       2.002090152 N0       20          7,712,838      cycles
       2.002090152 N1       20          1,865,701      cycles
       3.003625233 N0       20          6,604,441      cycles
       3.003625233 N1       20          1,043,428      cycles
       4.005135036 N0       20          6,350,522      cycles
       4.005135036 N1       20            681,564      cycles
       4.340902364 N0       20          3,403,188      cycles
       4.340902364 N1       20            520,705      cycles

Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Alexey Budankov &lt;alexey.budankov@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Joe Mario &lt;jmario@redhat.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20190904073415.723-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adding new --per-node option to aggregate counts per NUMA
nodes for system-wide mode measurements.

You can specify --per-node in live mode:

  # perf stat  -a -I 1000 -e cycles --per-node
  #           time node   cpus             counts unit events
       1.000542550 N0       20          6,202,097      cycles
       1.000542550 N1       20            639,559      cycles
       2.002040063 N0       20          7,412,495      cycles
       2.002040063 N1       20          2,185,577      cycles
       3.003451699 N0       20          6,508,917      cycles
       3.003451699 N1       20            765,607      cycles
  ...

Or in the record/report stat session:

  # perf stat record -a -I 1000 -e cycles
  #           time             counts unit events
       1.000536937         10,008,468      cycles
       2.002090152          9,578,539      cycles
       3.003625233          7,647,869      cycles
       4.005135036          7,032,086      cycles
  ^C     4.340902364          3,923,893      cycles

  # perf stat report --per-node
  #           time node   cpus             counts unit events
       1.000536937 N0       20          9,355,086      cycles
       1.000536937 N1       20            653,382      cycles
       2.002090152 N0       20          7,712,838      cycles
       2.002090152 N1       20          1,865,701      cycles
       3.003625233 N0       20          6,604,441      cycles
       3.003625233 N1       20          1,043,428      cycles
       4.005135036 N0       20          6,350,522      cycles
       4.005135036 N1       20            681,564      cycles
       4.340902364 N0       20          3,403,188      cycles
       4.340902364 N1       20            520,705      cycles

Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Alexey Budankov &lt;alexey.budankov@linux.intel.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Joe Mario &lt;jmario@redhat.com&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20190904073415.723-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf record: Put a copy of kcore into the perf.data directory</title>
<updated>2019-11-06T18:43:05+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2019-10-04T08:31:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=eeb399b531a1576e36016f8a7f0c50d10194e190'/>
<id>eeb399b531a1576e36016f8a7f0c50d10194e190</id>
<content type='text'>
Add a new 'perf record' option '--kcore' which will put a copy of
/proc/kcore, kallsyms and modules into a perf.data directory. Note, that
without the --kcore option, output goes to a file as previously.  The
tools' -o and -i options work with either a file name or directory name.

Example:

  $ sudo perf record --kcore uname

  $ sudo tree perf.data
  perf.data
  ├── kcore_dir
  │   ├── kallsyms
  │   ├── kcore
  │   └── modules
  └── data

  $ sudo perf script -v
  build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de
  build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541
  Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field.
  Using CPUID GenuineIntel-6-8E-A
  Using perf.data/kcore_dir/kcore for kernel data
  Using perf.data/kcore_dir/kallsyms for symbols
             perf 19058 506778.423729:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423733:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423734:          7 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423736:        117 cycles:  ffffffffa2caa54a native_write_msr+0xa (vmlinux)
             perf 19058 506778.423738:       2092 cycles:  ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux)
             perf 19058 506778.423740:      37380 cycles:  ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux)
            uname 19058 506778.423751:     582673 cycles:  ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux)
            uname 19058 506778.423892:    2241841 cycles:  ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux)
            uname 19058 506778.424430:    2457397 cycles:  ffffffffa3019232 check_memory_region+0x52 (vmlinux)

Committer testing:

  # rm -rf perf.data*
  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  # ls -l perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data
  # perf record --kcore uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  ls[root@quaco ~]# ls -lad perf.data*
  drwx------. 3 root root  4096 Oct 21 11:08 perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  # perf evlist -v -i perf.data/data
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new 'perf record' option '--kcore' which will put a copy of
/proc/kcore, kallsyms and modules into a perf.data directory. Note, that
without the --kcore option, output goes to a file as previously.  The
tools' -o and -i options work with either a file name or directory name.

Example:

  $ sudo perf record --kcore uname

  $ sudo tree perf.data
  perf.data
  ├── kcore_dir
  │   ├── kallsyms
  │   ├── kcore
  │   └── modules
  └── data

  $ sudo perf script -v
  build id event received for vmlinux: 1eaa285996affce2d74d8e66dcea09a80c9941de
  build id event received for [vdso]: 8bbaf5dc62a9b644b4d4e4539737e104e4a84541
  Samples for 'cycles' event do not have CPU attribute set. Skipping 'cpu' field.
  Using CPUID GenuineIntel-6-8E-A
  Using perf.data/kcore_dir/kcore for kernel data
  Using perf.data/kcore_dir/kallsyms for symbols
             perf 19058 506778.423729:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423733:          1 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423734:          7 cycles:  ffffffffa2caa548 native_write_msr+0x8 (vmlinux)
             perf 19058 506778.423736:        117 cycles:  ffffffffa2caa54a native_write_msr+0xa (vmlinux)
             perf 19058 506778.423738:       2092 cycles:  ffffffffa2c9b7b0 native_apic_msr_write+0x0 (vmlinux)
             perf 19058 506778.423740:      37380 cycles:  ffffffffa2f121d0 perf_event_addr_filters_exec+0x0 (vmlinux)
            uname 19058 506778.423751:     582673 cycles:  ffffffffa303a407 propagate_protected_usage+0x147 (vmlinux)
            uname 19058 506778.423892:    2241841 cycles:  ffffffffa2cae0c9 unwind_next_frame.part.5+0x79 (vmlinux)
            uname 19058 506778.424430:    2457397 cycles:  ffffffffa3019232 check_memory_region+0x52 (vmlinux)

Committer testing:

  # rm -rf perf.data*
  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  # ls -l perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data
  # perf record --kcore uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.024 MB perf.data (7 samples) ]
  ls[root@quaco ~]# ls -lad perf.data*
  drwx------. 3 root root  4096 Oct 21 11:08 perf.data
  -rw-------. 1 root root 34772 Oct 21 11:08 perf.data.old
  # perf evlist -v
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  # perf evlist -v -i perf.data/data
  cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf data: Support single perf.data file directory</title>
<updated>2019-11-06T18:43:05+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2019-10-04T08:31:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=46e201efa15b70ec39df5237116fddebb4f5057c'/>
<id>46e201efa15b70ec39df5237116fddebb4f5057c</id>
<content type='text'>
Support directory output that contains a regular perf.data file, named
"data". By default the directory is named perf.data i.e.
	perf.data
	└── data

Most of the infrastructure to support a directory is already there. This
patch makes the changes needed to support the format above.

Presently there is no 'perf record' option to output a directory.

This is preparation for adding support for putting a copy of /proc/kcore in
the directory.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support directory output that contains a regular perf.data file, named
"data". By default the directory is named perf.data i.e.
	perf.data
	└── data

Most of the infrastructure to support a directory is already there. This
patch makes the changes needed to support the format above.

Presently there is no 'perf record' option to output a directory.

This is preparation for adding support for putting a copy of /proc/kcore in
the directory.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20191004083121.12182-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
