<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/arc/kernel/perf_event.c, branch v5.6</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>ARC: perf: Accommodate big-endian CPU</title>
<updated>2019-10-22T16:59:43+00:00</updated>
<author>
<name>Alexey Brodkin</name>
<email>Alexey.Brodkin@synopsys.com</email>
</author>
<published>2019-10-22T14:04:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5effc09c4907901f0e71e68e5f2e14211d9a203f'/>
<id>5effc09c4907901f0e71e68e5f2e14211d9a203f</id>
<content type='text'>
8-letter strings representing ARC perf events are stores in two
32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc.

And the same order of bytes in the word is used regardless CPU endianness.

Which means in case of big-endian CPU core we need to swap bytes to get
the same order as if it was on little-endian CPU.

Otherwise we're seeing the following error message on boot:
-------------------------&gt;8----------------------
ARC perf        : 8 counters (32 bits), 40 conditions, [overflow IRQ support]
sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  sysfs_warn_dup+0x46/0x58
  sysfs_add_file_mode_ns+0xb2/0x168
  create_files+0x70/0x2a0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0
Failed to register pmu: arc_pct, reason -17
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  __warn+0x9c/0xd4
  warn_slowpath_fmt+0x22/0x2c
  perf_event_sysfs_init+0x70/0xa0
---[ end trace a75fb9a9837bd1ec ]---
-------------------------&gt;8----------------------

What happens here we're trying to register more than one raw perf event
with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters
and encoded into two 32-bit words. In this particular case we deal with 2
events:
 * "IJMP____" which counts all jump &amp; branch instructions
 * "IJMPC___" which counts only conditional jumps &amp; branches

Those strings are split in two 32-bit words this way "IJMP" + "____" &amp;
"IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core
being big-endian then we read "PMJI" + "____" &amp; "PMJI" + "___C".

And since we interpret read array of ASCII letters as a null-terminated string
on big-endian CPU we end up with 2 events of the same name "PMJI".

Signed-off-by: Alexey Brodkin &lt;abrodkin@synopsys.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
8-letter strings representing ARC perf events are stores in two
32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc.

And the same order of bytes in the word is used regardless CPU endianness.

Which means in case of big-endian CPU core we need to swap bytes to get
the same order as if it was on little-endian CPU.

Otherwise we're seeing the following error message on boot:
-------------------------&gt;8----------------------
ARC perf        : 8 counters (32 bits), 40 conditions, [overflow IRQ support]
sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  sysfs_warn_dup+0x46/0x58
  sysfs_add_file_mode_ns+0xb2/0x168
  create_files+0x70/0x2a0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0
Failed to register pmu: arc_pct, reason -17
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
  arc_unwind_core+0xd4/0xfc
  dump_stack+0x64/0x80
  __warn+0x9c/0xd4
  warn_slowpath_fmt+0x22/0x2c
  perf_event_sysfs_init+0x70/0xa0
---[ end trace a75fb9a9837bd1ec ]---
-------------------------&gt;8----------------------

What happens here we're trying to register more than one raw perf event
with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters
and encoded into two 32-bit words. In this particular case we deal with 2
events:
 * "IJMP____" which counts all jump &amp; branch instructions
 * "IJMPC___" which counts only conditional jumps &amp; branches

Those strings are split in two 32-bit words this way "IJMP" + "____" &amp;
"IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core
being big-endian then we read "PMJI" + "____" &amp; "PMJI" + "___C".

And since we interpret read array of ASCII letters as a null-terminated string
on big-endian CPU we end up with 2 events of the same name "PMJI".

Signed-off-by: Alexey Brodkin &lt;abrodkin@synopsys.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: perf: avoid kernel killing where it is possible</title>
<updated>2019-01-17T22:38:00+00:00</updated>
<author>
<name>Eugeniy Paltsev</name>
<email>Eugeniy.Paltsev@synopsys.com</email>
</author>
<published>2018-12-13T16:56:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=29133260f7c2e4ce50b4da6bf0674331bc0a4da5'/>
<id>29133260f7c2e4ce50b4da6bf0674331bc0a4da5</id>
<content type='text'>
No, not gonna die tonight.

Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No, not gonna die tonight.

Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: perf: move HW events mapping to separate function</title>
<updated>2019-01-17T22:38:00+00:00</updated>
<author>
<name>Eugeniy Paltsev</name>
<email>Eugeniy.Paltsev@synopsys.com</email>
</author>
<published>2018-12-13T16:56:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=baf9cc85ba01f32cf2ee79042a64b874a58cfb92'/>
<id>baf9cc85ba01f32cf2ee79042a64b874a58cfb92</id>
<content type='text'>
Move HW events mapping to separate function to make code more readable.

Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move HW events mapping to separate function to make code more readable.

Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: perf: introduce Kernel PMU events support</title>
<updated>2019-01-17T22:38:00+00:00</updated>
<author>
<name>Eugeniy Paltsev</name>
<email>Eugeniy.Paltsev@synopsys.com</email>
</author>
<published>2018-12-13T16:56:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0e956150fe09fa4430c42a9bbe48a72967fa0012'/>
<id>0e956150fe09fa4430c42a9bbe48a72967fa0012</id>
<content type='text'>
Export all available ARC architected hardware events as
kernel PMU events to make non-generic events accessible.

ARC PMU HW allow us to read the list of all available
events names. So we generate kernel PMU event list
dynamically in arc_pmu_device_probe() using
human-readable events names we got from HW instead of
using pre-defined events list.

--------------------------&gt;8--------------------------
$ perf list
  [snip]
  arc_pmu/bdata64/                  [Kernel PMU event]
  arc_pmu/bdcstall/                 [Kernel PMU event]
  arc_pmu/bdslot/                   [Kernel PMU event]
  arc_pmu/bfbmp/                    [Kernel PMU event]
  arc_pmu/bfirqex/                  [Kernel PMU event]
  arc_pmu/bflgstal/                 [Kernel PMU event]
  arc_pmu/bflush/                   [Kernel PMU event]
--------------------------&gt;8--------------------------

Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Export all available ARC architected hardware events as
kernel PMU events to make non-generic events accessible.

ARC PMU HW allow us to read the list of all available
events names. So we generate kernel PMU event list
dynamically in arc_pmu_device_probe() using
human-readable events names we got from HW instead of
using pre-defined events list.

--------------------------&gt;8--------------------------
$ perf list
  [snip]
  arc_pmu/bdata64/                  [Kernel PMU event]
  arc_pmu/bdcstall/                 [Kernel PMU event]
  arc_pmu/bdslot/                   [Kernel PMU event]
  arc_pmu/bfbmp/                    [Kernel PMU event]
  arc_pmu/bfirqex/                  [Kernel PMU event]
  arc_pmu/bflgstal/                 [Kernel PMU event]
  arc_pmu/bflush/                   [Kernel PMU event]
--------------------------&gt;8--------------------------

Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARC: perf: trivial code cleanup</title>
<updated>2019-01-17T22:38:00+00:00</updated>
<author>
<name>Eugeniy Paltsev</name>
<email>Eugeniy.Paltsev@synopsys.com</email>
</author>
<published>2018-12-13T16:56:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=14f81a91ad29a57d6c3cf123ad9489f2ca9133fb'/>
<id>14f81a91ad29a57d6c3cf123ad9489f2ca9133fb</id>
<content type='text'>
* Use BIT(), lower_32_bits(), upper_32_bits() macroses,
  fix code style violations.
* Use u32, u64, s64 instead of uint32_t, uint64_t, int64_t
* Fix description comment as this code doesn't belong only to
  ARC700 anymore.
* Use SPDX License Identifier.
* Remove useless ifdefs. ifdef around 'arc_pmu_match' structure
  declaration is useless as we refer to 'arc_pmu_match' in
  several places which aren't guarded with ifdef. Nevertheless
  'ARC' option selects 'OF' unconditionally so we can simply
  get rid of this ifdef.

Acked-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Use BIT(), lower_32_bits(), upper_32_bits() macroses,
  fix code style violations.
* Use u32, u64, s64 instead of uint32_t, uint64_t, int64_t
* Fix description comment as this code doesn't belong only to
  ARC700 anymore.
* Use SPDX License Identifier.
* Remove useless ifdefs. ifdef around 'arc_pmu_match' structure
  declaration is useless as we refer to 'arc_pmu_match' in
  several places which aren't guarded with ifdef. Nevertheless
  'ARC' option selects 'OF' unconditionally so we can simply
  get rid of this ifdef.

Acked-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Signed-off-by: Eugeniy Paltsev &lt;Eugeniy.Paltsev@synopsys.com&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARCv2: perf: optimize given that num counters &lt;= 32</title>
<updated>2017-11-21T23:20:55+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2015-10-08T16:47:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5b9027d6d044d4917992119d184ab0bb616489cc'/>
<id>5b9027d6d044d4917992119d184ab0bb616489cc</id>
<content type='text'>
use ffz primitive which maps to ARCv2 instruction, vs. non atomic
__test_and_set_bit

It is unlikely if we will even have more than 32 counters, but still add
a BUILD_BUG to catch that

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
use ffz primitive which maps to ARCv2 instruction, vs. non atomic
__test_and_set_bit

It is unlikely if we will even have more than 32 counters, but still add
a BUILD_BUG to catch that

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARCv2: perf: tweak overflow interrupt</title>
<updated>2017-11-21T23:20:31+00:00</updated>
<author>
<name>Vineet Gupta</name>
<email>vgupta@synopsys.com</email>
</author>
<published>2015-05-09T12:57:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4d431290402c8d867af7ba45ee75407d68748c4a'/>
<id>4d431290402c8d867af7ba45ee75407d68748c4a</id>
<content type='text'>
Current perf ISR loops thru all 32 counters, checking for each if it
caused the interrupt. Instead only loop thru counters which actually
interrupted (typically 1).

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current perf ISR loops thru all 32 counters, checking for each if it
caused the interrupt. Instead only loop thru counters which actually
interrupted (typically 1).

Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arc: perf: Enable generic "cache-references" and "cache-misses" events</title>
<updated>2016-09-30T21:48:18+00:00</updated>
<author>
<name>Alexey Brodkin</name>
<email>abrodkin@synopsys.com</email>
</author>
<published>2016-08-25T11:47:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e0d5321faca1133cbb34a3a780d62a3a0814b6dc'/>
<id>e0d5321faca1133cbb34a3a780d62a3a0814b6dc</id>
<content type='text'>
We used to live with PERF_COUNT_HW_CACHE_REFERENCES and
PERF_COUNT_HW_CACHE_REFERENCES not specified on ARC.

Those events are actually aliases to 2 cache events that we do support
and so this change sets "cache-reference" and "cache-misses" events
in the same way as "L1-dcache-loads" and L1-dcache-load-misses.

And while at it adding debug info for cache events as well as doing a
subtle fix in HW events debug info - config value is much better
represented by hex so we may see not only event index but as well other
control bits set (if they exist).

Signed-off-by: Alexey Brodkin &lt;abrodkin@synopsys.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We used to live with PERF_COUNT_HW_CACHE_REFERENCES and
PERF_COUNT_HW_CACHE_REFERENCES not specified on ARC.

Those events are actually aliases to 2 cache events that we do support
and so this change sets "cache-reference" and "cache-misses" events
in the same way as "L1-dcache-loads" and L1-dcache-load-misses.

And while at it adding debug info for cache events as well as doing a
subtle fix in HW events debug info - config value is much better
represented by hex so we may see not only event index but as well other
control bits set (if they exist).

Signed-off-by: Alexey Brodkin &lt;abrodkin@synopsys.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix typos</title>
<updated>2016-05-30T04:37:32+00:00</updated>
<author>
<name>Andrea Gelmini</name>
<email>andrea.gelmini@gelma.net</email>
</author>
<published>2016-05-21T11:45:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2547476a5e4061f6addb88d5fc837d3a950f54c4'/>
<id>2547476a5e4061f6addb88d5fc837d3a950f54c4</id>
<content type='text'>
Signed-off-by: Andrea Gelmini &lt;andrea.gelmini@gelma.net&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Andrea Gelmini &lt;andrea.gelmini@gelma.net&gt;
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf core: Pass max stack as a perf_callchain_entry context</title>
<updated>2016-05-17T02:11:50+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2016-04-28T15:30:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cfbcf468454ab4b20f0b4b62da51920b99fdb19e'/>
<id>cfbcf468454ab4b20f0b4b62da51920b99fdb19e</id>
<content type='text'>
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Brendan Gregg &lt;brendan.d.gregg@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: He Kuang &lt;hekuang@huawei.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Milian Wolff &lt;milian.wolff@kdab.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: Wang Nan &lt;wangnan0@huawei.com&gt;
Cc: Zefan Li &lt;lizefan@huawei.com&gt;
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Brendan Gregg &lt;brendan.d.gregg@gmail.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: He Kuang &lt;hekuang@huawei.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Milian Wolff &lt;milian.wolff@kdab.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: Wang Nan &lt;wangnan0@huawei.com&gt;
Cc: Zefan Li &lt;lizefan@huawei.com&gt;
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
