<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/powerpc/perf, branch v5.19</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>powerpc/perf: Fix the threshold compare group constraint for power9</title>
<updated>2022-05-22T05:58:30+00:00</updated>
<author>
<name>Kajol Jain</name>
<email>kjain@linux.ibm.com</email>
</author>
<published>2022-05-06T06:10:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ab0cc6bbf0c812731c703ec757fcc3fc3a457a34'/>
<id>ab0cc6bbf0c812731c703ec757fcc3fc3a457a34</id>
<content type='text'>
Thresh compare bits for a event is used to program thresh compare
field in Monitor Mode Control Register A (MMCRA: 9-18 bits for power9).
When scheduling events as a group, all events in that group should
match value in threshold bits (like thresh compare, thresh control,
thresh select). Otherwise event open for the sibling events should fail.
But in the current code, incase thresh compare bits are not valid,
we are not failing in group_constraint function which can result
in invalid group schduling.

Fix the issue by returning -1 incase event is threshold and threshold
compare value is not valid.

Thresh control bits in the event code is used to program thresh_ctl
field in Monitor Mode Control Register A (MMCRA: 48-55). In below example,
the scheduling of group events PM_MRK_INST_CMPL (873534401e0) and
PM_THRESH_MET (8734340101ec) is expected to fail as both event
request different thresh control bits and invalid thresh compare value.

Result before the patch changes:

[command]# perf stat -e "{r8735340401e0,r8734340101ec}" sleep 1

 Performance counter stats for 'sleep 1':

            11,048      r8735340401e0
             1,967      r8734340101ec

       1.001354036 seconds time elapsed

       0.001421000 seconds user
       0.000000000 seconds sys

Result after the patch changes:

[command]# perf stat -e "{r8735340401e0,r8734340101ec}" sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (r8735340401e0).
/bin/dmesg | grep -i perf may provide additional information.

Fixes: 78a16d9fc1206 ("powerpc/perf: Avoid FAB_*_MATCH checks for power9")
Signed-off-by: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Reviewed-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220506061015.43916-2-kjain@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Thresh compare bits for a event is used to program thresh compare
field in Monitor Mode Control Register A (MMCRA: 9-18 bits for power9).
When scheduling events as a group, all events in that group should
match value in threshold bits (like thresh compare, thresh control,
thresh select). Otherwise event open for the sibling events should fail.
But in the current code, incase thresh compare bits are not valid,
we are not failing in group_constraint function which can result
in invalid group schduling.

Fix the issue by returning -1 incase event is threshold and threshold
compare value is not valid.

Thresh control bits in the event code is used to program thresh_ctl
field in Monitor Mode Control Register A (MMCRA: 48-55). In below example,
the scheduling of group events PM_MRK_INST_CMPL (873534401e0) and
PM_THRESH_MET (8734340101ec) is expected to fail as both event
request different thresh control bits and invalid thresh compare value.

Result before the patch changes:

[command]# perf stat -e "{r8735340401e0,r8734340101ec}" sleep 1

 Performance counter stats for 'sleep 1':

            11,048      r8735340401e0
             1,967      r8734340101ec

       1.001354036 seconds time elapsed

       0.001421000 seconds user
       0.000000000 seconds sys

Result after the patch changes:

[command]# perf stat -e "{r8735340401e0,r8734340101ec}" sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (r8735340401e0).
/bin/dmesg | grep -i perf may provide additional information.

Fixes: 78a16d9fc1206 ("powerpc/perf: Avoid FAB_*_MATCH checks for power9")
Signed-off-by: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Reviewed-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220506061015.43916-2-kjain@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/perf: Fix the threshold compare group constraint for power10</title>
<updated>2022-05-22T05:58:29+00:00</updated>
<author>
<name>Kajol Jain</name>
<email>kjain@linux.ibm.com</email>
</author>
<published>2022-05-06T06:10:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=505d31650ba96d6032313480fdb566d289a4698c'/>
<id>505d31650ba96d6032313480fdb566d289a4698c</id>
<content type='text'>
Thresh compare bits for a event is used to program thresh compare
field in Monitor Mode Control Register A (MMCRA: 8-18 bits for power10).
When scheduling events as a group, all events in that group should
match value in threshold bits. Otherwise event open for the sibling
events should fail. But in the current code, incase thresh compare bits are
not valid, we are not failing in group_constraint function which can result
in invalid group schduling.

Fix the issue by returning -1 incase event is threshold and threshold
compare value is not valid in group_constraint function.

Patch also fixes the p10_thresh_cmp_val function to return -1,
incase threshold bits are not valid and changes corresponding check in
is_thresh_cmp_valid function to return false only when the thresh_cmp
value is less then 0.

Thresh control bits in the event code is used to program thresh_ctl
field in Monitor Mode Control Register A (MMCRA: 48-55). In below example,
the scheduling of group events PM_MRK_INST_CMPL (3534401e0) and
PM_THRESH_MET (34340101ec) is expected to fail as both event
request different thresh control bits.

Result before the patch changes:

[command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1

 Performance counter stats for 'sleep 1':

             8,482      r35340401e0
                 0      r34340101ec

       1.001474838 seconds time elapsed

       0.001145000 seconds user
       0.000000000 seconds sys

Result after the patch changes:

[command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1

 Performance counter stats for 'sleep 1':

     &lt;not counted&gt;      r35340401e0
   &lt;not supported&gt;      r34340101ec

       1.001499607 seconds time elapsed

       0.000204000 seconds user
       0.000760000 seconds sys

Fixes: 82d2c16b350f7 ("powerpc/perf: Adds support for programming of Thresholding in P10")
Signed-off-by: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Reviewed-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220506061015.43916-1-kjain@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Thresh compare bits for a event is used to program thresh compare
field in Monitor Mode Control Register A (MMCRA: 8-18 bits for power10).
When scheduling events as a group, all events in that group should
match value in threshold bits. Otherwise event open for the sibling
events should fail. But in the current code, incase thresh compare bits are
not valid, we are not failing in group_constraint function which can result
in invalid group schduling.

Fix the issue by returning -1 incase event is threshold and threshold
compare value is not valid in group_constraint function.

Patch also fixes the p10_thresh_cmp_val function to return -1,
incase threshold bits are not valid and changes corresponding check in
is_thresh_cmp_valid function to return false only when the thresh_cmp
value is less then 0.

Thresh control bits in the event code is used to program thresh_ctl
field in Monitor Mode Control Register A (MMCRA: 48-55). In below example,
the scheduling of group events PM_MRK_INST_CMPL (3534401e0) and
PM_THRESH_MET (34340101ec) is expected to fail as both event
request different thresh control bits.

Result before the patch changes:

[command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1

 Performance counter stats for 'sleep 1':

             8,482      r35340401e0
                 0      r34340101ec

       1.001474838 seconds time elapsed

       0.001145000 seconds user
       0.000000000 seconds sys

Result after the patch changes:

[command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1

 Performance counter stats for 'sleep 1':

     &lt;not counted&gt;      r35340401e0
   &lt;not supported&gt;      r34340101ec

       1.001499607 seconds time elapsed

       0.000204000 seconds user
       0.000760000 seconds sys

Fixes: 82d2c16b350f7 ("powerpc/perf: Adds support for programming of Thresholding in P10")
Signed-off-by: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Reviewed-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220506061015.43916-1-kjain@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Add missing headers</title>
<updated>2022-05-08T12:15:40+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2022-03-08T19:20:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e6f6390ab7b9d649c13de2c8a591bce61a10ec3b'/>
<id>e6f6390ab7b9d649c13de2c8a591bce61a10ec3b</id>
<content type='text'>
Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h,
asm/pci.h etc...

Include the needed headers, and remove asm/prom.h when it was
needed exclusively for pulling necessary headers.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h,
asm/pci.h etc...

Include the needed headers, and remove asm/prom.h when it was
needed exclusively for pulling necessary headers.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: fix typos in comments</title>
<updated>2022-05-05T12:12:44+00:00</updated>
<author>
<name>Julia Lawall</name>
<email>Julia.Lawall@inria.fr</email>
</author>
<published>2022-04-30T18:56:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1fd02f6605b855b4af2883f29a2abc88bdf17857'/>
<id>1fd02f6605b855b4af2883f29a2abc88bdf17857</id>
<content type='text'>
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall &lt;Julia.Lawall@inria.fr&gt;
Reviewed-by: Joel Stanley &lt;joel@jms.id.au&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall &lt;Julia.Lawall@inria.fr&gt;
Reviewed-by: Joel Stanley &lt;joel@jms.id.au&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220430185654.5855-1-Julia.Lawall@inria.fr

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'v5.18-rc4' into next</title>
<updated>2022-05-05T12:09:35+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2022-05-05T12:09:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d0a31acc34dc2921ad32a67a7534704114e64f82'/>
<id>d0a31acc34dc2921ad32a67a7534704114e64f82</id>
<content type='text'>
Merge master into next, to bring in commit 5f24d5a579d1 ("mm, hugetlb:
allow for "high" userspace addresses"), which is needed as a
prerequisite for the series converting powerpc to the generic mmap
logic.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge master into next, to bring in commit 5f24d5a579d1 ("mm, hugetlb:
allow for "high" userspace addresses"), which is needed as a
prerequisite for the series converting powerpc to the generic mmap
logic.
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/perf/24x7: use 'unsigned int' instead of 'unsigned'</title>
<updated>2022-05-04T09:37:44+00:00</updated>
<author>
<name>Jason Wang</name>
<email>wangborong@cdjrlc.com</email>
</author>
<published>2021-07-29T11:52:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=76c452b494b8be58a1965ab93b7002f295f7705a'/>
<id>76c452b494b8be58a1965ab93b7002f295f7705a</id>
<content type='text'>
Replace the 'unsigned' with 'unsigned int' which is more accurate.

Signed-off-by: Jason Wang &lt;wangborong@cdjrlc.com&gt;
[chleroy: Fixed parenthesis alignment]
Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20210729115252.40895-1-wangborong@cdjrlc.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace the 'unsigned' with 'unsigned int' which is more accurate.

Signed-off-by: Jason Wang &lt;wangborong@cdjrlc.com&gt;
[chleroy: Fixed parenthesis alignment]
Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20210729115252.40895-1-wangborong@cdjrlc.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/perf: Fix symbol undeclared warning</title>
<updated>2022-05-02T13:02:15+00:00</updated>
<author>
<name>Wang Wensheng</name>
<email>wangwensheng4@huawei.com</email>
</author>
<published>2020-09-23T07:14:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cacaeb0c23979d20652d862e8ff974d9d7244e2d'/>
<id>cacaeb0c23979d20652d862e8ff974d9d7244e2d</id>
<content type='text'>
Build kernel with `C=2`:
arch/powerpc/perf/isa207-common.c:24:18: warning: symbol
'isa207_pmu_format_attr' was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:101:5: warning: symbol 'p9_dd21_bl_ev'
was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:115:5: warning: symbol 'p9_dd22_bl_ev'
was not declared. Should it be static?

Those symbols are used only in the files that define them so we declare
them as static to fix the warnings.

Signed-off-by: Wang Wensheng &lt;wangwensheng4@huawei.com&gt;
Reviewed-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200923071453.2540-1-wangwensheng4@huawei.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Build kernel with `C=2`:
arch/powerpc/perf/isa207-common.c:24:18: warning: symbol
'isa207_pmu_format_attr' was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:101:5: warning: symbol 'p9_dd21_bl_ev'
was not declared. Should it be static?
arch/powerpc/perf/power9-pmu.c:115:5: warning: symbol 'p9_dd22_bl_ev'
was not declared. Should it be static?

Those symbols are used only in the files that define them so we declare
them as static to fix the warnings.

Signed-off-by: Wang Wensheng &lt;wangwensheng4@huawei.com&gt;
Reviewed-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200923071453.2540-1-wangwensheng4@huawei.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/perf: Fix 32bit compile</title>
<updated>2022-04-21T13:26:47+00:00</updated>
<author>
<name>Alexey Kardashevskiy</name>
<email>aik@ozlabs.ru</email>
</author>
<published>2022-04-21T02:57:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bb82c574691daf8f7fa9a160264d15c5804cb769'/>
<id>bb82c574691daf8f7fa9a160264d15c5804cb769</id>
<content type='text'>
The "read_bhrb" global symbol is only called under CONFIG_PPC64 of
arch/powerpc/perf/core-book3s.c but it is compiled for both 32 and 64 bit
anyway (and LLVM fails to link this on 32bit).

This fixes it by moving bhrb.o to obj64 targets.

Signed-off-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220421025756.571995-1-aik@ozlabs.ru

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The "read_bhrb" global symbol is only called under CONFIG_PPC64 of
arch/powerpc/perf/core-book3s.c but it is compiled for both 32 and 64 bit
anyway (and LLVM fails to link this on 32bit).

This fixes it by moving bhrb.o to obj64 targets.

Signed-off-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220421025756.571995-1-aik@ozlabs.ru

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/perf: Fix power10 event alternatives</title>
<updated>2022-04-21T13:25:33+00:00</updated>
<author>
<name>Athira Rajeev</name>
<email>atrajeev@linux.vnet.ibm.com</email>
</author>
<published>2022-04-19T11:48:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6cc9a852f123301d5271f1484df8e961b2b64f1'/>
<id>c6cc9a852f123301d5271f1484df8e961b2b64f1</id>
<content type='text'>
When scheduling a group of events, there are constraint checks done to
make sure all events can go in a group. Example, one of the criteria is
that events in a group cannot use the same PMC. But platform specific
PMU supports alternative event for some of the event codes. During
perf_event_open(), if any event group doesn't match constraint check
criteria, further lookup is done to find alternative event.

By current design, the array of alternatives events in PMU code is
expected to be sorted by column 0. This is because in
find_alternative() the return criteria is based on event code
comparison. ie. "event &lt; ev_alt[i][0])". This optimisation is there
since find_alternative() can be called multiple times. In power10 PMU
code, the alternative event array is not sorted properly and hence there
is breakage in finding alternative event.

To work with existing logic, fix the alternative event array to be
sorted by column 0 for power10-pmu.c

Results:

In case where an alternative event is not chosen when we could, events
will be multiplexed. ie, time sliced where it could actually run
concurrently.

Example, in power10 PM_INST_CMPL_ALT(0x00002) has alternative event,
PM_INST_CMPL(0x500fa). Without the fix, if a group of events with PMC1
to PMC4 is used along with PM_INST_CMPL_ALT, it will be time sliced
since all programmable PMC's are consumed already. But with the fix,
when it picks alternative event on PMC5, all events will run
concurrently.

Before:

 # perf stat -e r00002,r100fc,r200fa,r300fc,r400fc

 Performance counter stats for 'system wide':

         328668935      r00002               (79.94%)
          56501024      r100fc               (79.95%)
          49564238      r200fa               (79.95%)
               376      r300fc               (80.19%)
               660      r400fc               (79.97%)

       4.039150522 seconds time elapsed

With the fix, since alternative event is chosen to run on PMC6, events
will be run concurrently.

After:

 # perf stat -e r00002,r100fc,r200fa,r300fc,r400fc

 Performance counter stats for 'system wide':

          23596607      r00002
           4907738      r100fc
           2283608      r200fa
               135      r300fc
               248      r400fc

       1.664671390 seconds time elapsed

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Reviewed-by: Madhavan Srinivasan &lt;maddy@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220419114828.89843-2-atrajeev@linux.vnet.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When scheduling a group of events, there are constraint checks done to
make sure all events can go in a group. Example, one of the criteria is
that events in a group cannot use the same PMC. But platform specific
PMU supports alternative event for some of the event codes. During
perf_event_open(), if any event group doesn't match constraint check
criteria, further lookup is done to find alternative event.

By current design, the array of alternatives events in PMU code is
expected to be sorted by column 0. This is because in
find_alternative() the return criteria is based on event code
comparison. ie. "event &lt; ev_alt[i][0])". This optimisation is there
since find_alternative() can be called multiple times. In power10 PMU
code, the alternative event array is not sorted properly and hence there
is breakage in finding alternative event.

To work with existing logic, fix the alternative event array to be
sorted by column 0 for power10-pmu.c

Results:

In case where an alternative event is not chosen when we could, events
will be multiplexed. ie, time sliced where it could actually run
concurrently.

Example, in power10 PM_INST_CMPL_ALT(0x00002) has alternative event,
PM_INST_CMPL(0x500fa). Without the fix, if a group of events with PMC1
to PMC4 is used along with PM_INST_CMPL_ALT, it will be time sliced
since all programmable PMC's are consumed already. But with the fix,
when it picks alternative event on PMC5, all events will run
concurrently.

Before:

 # perf stat -e r00002,r100fc,r200fa,r300fc,r400fc

 Performance counter stats for 'system wide':

         328668935      r00002               (79.94%)
          56501024      r100fc               (79.95%)
          49564238      r200fa               (79.95%)
               376      r300fc               (80.19%)
               660      r400fc               (79.97%)

       4.039150522 seconds time elapsed

With the fix, since alternative event is chosen to run on PMC6, events
will be run concurrently.

After:

 # perf stat -e r00002,r100fc,r200fa,r300fc,r400fc

 Performance counter stats for 'system wide':

          23596607      r00002
           4907738      r100fc
           2283608      r200fa
               135      r300fc
               248      r400fc

       1.664671390 seconds time elapsed

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Reviewed-by: Madhavan Srinivasan &lt;maddy@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220419114828.89843-2-atrajeev@linux.vnet.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/perf: Fix power9 event alternatives</title>
<updated>2022-04-21T13:25:22+00:00</updated>
<author>
<name>Athira Rajeev</name>
<email>atrajeev@linux.vnet.ibm.com</email>
</author>
<published>2022-04-19T11:48:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0dcad700bb2776e3886fe0a645a4bf13b1e747cd'/>
<id>0dcad700bb2776e3886fe0a645a4bf13b1e747cd</id>
<content type='text'>
When scheduling a group of events, there are constraint checks done to
make sure all events can go in a group. Example, one of the criteria is
that events in a group cannot use the same PMC. But platform specific
PMU supports alternative event for some of the event codes. During
perf_event_open(), if any event group doesn't match constraint check
criteria, further lookup is done to find alternative event.

By current design, the array of alternatives events in PMU code is
expected to be sorted by column 0. This is because in
find_alternative() the return criteria is based on event code
comparison. ie. "event &lt; ev_alt[i][0])". This optimisation is there
since find_alternative() can be called multiple times. In power9 PMU
code, the alternative event array is not sorted properly and hence there
is breakage in finding alternative events.

To work with existing logic, fix the alternative event array to be
sorted by column 0 for power9-pmu.c

Results:

With alternative events, multiplexing can be avoided. That is, for
example, in power9 PM_LD_MISS_L1 (0x3e054) has alternative event,
PM_LD_MISS_L1_ALT (0x400f0). This is an identical event which can be
programmed in a different PMC.

Before:

 # perf stat -e r3e054,r300fc

 Performance counter stats for 'system wide':

           1057860      r3e054              (50.21%)
               379      r300fc              (49.79%)

       0.944329741 seconds time elapsed

Since both the events are using PMC3 in this case, they are
multiplexed here.

After:

 # perf stat -e r3e054,r300fc

 Performance counter stats for 'system wide':

           1006948      r3e054
               182      r300fc

Fixes: 91e0bd1e6251 ("powerpc/perf: Add PM_LD_MISS_L1 and PM_BR_2PATH to power9 event list")
Signed-off-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Reviewed-by: Madhavan Srinivasan &lt;maddy@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220419114828.89843-1-atrajeev@linux.vnet.ibm.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When scheduling a group of events, there are constraint checks done to
make sure all events can go in a group. Example, one of the criteria is
that events in a group cannot use the same PMC. But platform specific
PMU supports alternative event for some of the event codes. During
perf_event_open(), if any event group doesn't match constraint check
criteria, further lookup is done to find alternative event.

By current design, the array of alternatives events in PMU code is
expected to be sorted by column 0. This is because in
find_alternative() the return criteria is based on event code
comparison. ie. "event &lt; ev_alt[i][0])". This optimisation is there
since find_alternative() can be called multiple times. In power9 PMU
code, the alternative event array is not sorted properly and hence there
is breakage in finding alternative events.

To work with existing logic, fix the alternative event array to be
sorted by column 0 for power9-pmu.c

Results:

With alternative events, multiplexing can be avoided. That is, for
example, in power9 PM_LD_MISS_L1 (0x3e054) has alternative event,
PM_LD_MISS_L1_ALT (0x400f0). This is an identical event which can be
programmed in a different PMC.

Before:

 # perf stat -e r3e054,r300fc

 Performance counter stats for 'system wide':

           1057860      r3e054              (50.21%)
               379      r300fc              (49.79%)

       0.944329741 seconds time elapsed

Since both the events are using PMC3 in this case, they are
multiplexed here.

After:

 # perf stat -e r3e054,r300fc

 Performance counter stats for 'system wide':

           1006948      r3e054
               182      r300fc

Fixes: 91e0bd1e6251 ("powerpc/perf: Add PM_LD_MISS_L1 and PM_BR_2PATH to power9 event list")
Signed-off-by: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Reviewed-by: Madhavan Srinivasan &lt;maddy@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220419114828.89843-1-atrajeev@linux.vnet.ibm.com
</pre>
</div>
</content>
</entry>
</feed>
