<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/tools/arch/arm64/include/asm, branch v6.5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>tools headers arm64: Sync arm64's cputype.h with the kernel sources</title>
<updated>2023-07-14T13:36:16+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2023-07-14T13:36:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=963293ff058cef54718fd225a542c73778257b3c'/>
<id>963293ff058cef54718fd225a542c73778257b3c</id>
<content type='text'>
To get the changes in:

  e910baa9c1efdf76 ("KVM: arm64: vgic: Add Apple M2 PRO/MAX cpus to the list of broken SEIS implementations")

That makes this perf source code to be rebuilt:

  CC      /tmp/build/perf-tools/util/arm-spe.o

The changes in the above patch don't affect things that are used in
arm-spe.c (things like MIDR_NEOVERSE_N1, etc). Unsure if Apple M2 has
SPE (Statistical Profiling Extension) :-)

That addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Marc Zyngier &lt;maz@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To get the changes in:

  e910baa9c1efdf76 ("KVM: arm64: vgic: Add Apple M2 PRO/MAX cpus to the list of broken SEIS implementations")

That makes this perf source code to be rebuilt:

  CC      /tmp/build/perf-tools/util/arm-spe.o

The changes in the above patch don't affect things that are used in
arm-spe.c (things like MIDR_NEOVERSE_N1, etc). Unsure if Apple M2 has
SPE (Statistical Profiling Extension) :-)

That addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Marc Zyngier &lt;maz@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools headers arm64: Sync arm64's cputype.h with the kernel sources</title>
<updated>2023-01-18T13:07:42+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2023-01-18T12:38:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8c51e8f4e9ec6b404a10c84055b8454eba9f42fb'/>
<id>8c51e8f4e9ec6b404a10c84055b8454eba9f42fb</id>
<content type='text'>
To get the changes in:

  decb17aeb8fa2148 ("KVM: arm64: vgic: Add Apple M2 cpus to the list of broken SEIS implementations")
  07e39e60bbf0ccd5 ("arm64: Add Cortex-715 CPU part definition")
  8ec8490a1950efec ("arm64: Fix bit-shifting UB in the MIDR_CPU_MODEL() macro")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: D Scott Phillips &lt;scott@os.amperecomputing.com&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Marc Zyngier &lt;maz@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lore.kernel.org/lkml/Y8fvEGCGn+227qW0@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To get the changes in:

  decb17aeb8fa2148 ("KVM: arm64: vgic: Add Apple M2 cpus to the list of broken SEIS implementations")
  07e39e60bbf0ccd5 ("arm64: Add Cortex-715 CPU part definition")
  8ec8490a1950efec ("arm64: Fix bit-shifting UB in the MIDR_CPU_MODEL() macro")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: D Scott Phillips &lt;scott@os.amperecomputing.com&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Marc Zyngier &lt;maz@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lore.kernel.org/lkml/Y8fvEGCGn+227qW0@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools headers arm64: Sync arm64's cputype.h with the kernel sources</title>
<updated>2022-10-25T20:40:48+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2022-04-09T14:48:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ffc1df3dc97ee2aad6d2a94e4615c2a96cf291ad'/>
<id>ffc1df3dc97ee2aad6d2a94e4615c2a96cf291ad</id>
<content type='text'>
To get the changes in:

  0e5d5ae837c8ce04 ("arm64: Add AMPERE1 to the Spectre-BHB affected list")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: D Scott Phillips &lt;scott@os.amperecomputing.com&gt;
https://lore.kernel.org/lkml/Y1fy5GD7ZYvkeufv@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To get the changes in:

  0e5d5ae837c8ce04 ("arm64: Add AMPERE1 to the Spectre-BHB affected list")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: D Scott Phillips &lt;scott@os.amperecomputing.com&gt;
https://lore.kernel.org/lkml/Y1fy5GD7ZYvkeufv@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools headers arm64: Sync arm64's cputype.h with the kernel sources</title>
<updated>2022-06-19T14:23:04+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2022-04-09T14:48:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=37402d5d061ba914a12d16ee8dda6d6964b4819d'/>
<id>37402d5d061ba914a12d16ee8dda6d6964b4819d</id>
<content type='text'>
To get the changes in:

  cae889302ebf5a9b ("KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Marc Zyngier &lt;maz@kernel.org&gt;
Link: https://lore.kernel.org/lkml/Yq8w7p4omYKNwOij@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To get the changes in:

  cae889302ebf5a9b ("KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Marc Zyngier &lt;maz@kernel.org&gt;
Link: https://lore.kernel.org/lkml/Yq8w7p4omYKNwOij@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools headers arm64: Sync arm64's cputype.h with the kernel sources</title>
<updated>2022-04-09T15:34:29+00:00</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2022-04-09T14:48:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=278aaba2c555a54e62aec40e04defaa9fffcc1c9'/>
<id>278aaba2c555a54e62aec40e04defaa9fffcc1c9</id>
<content type='text'>
To get the changes in:

  83bea32ac7ed37bb ("arm64: Add part number for Arm Cortex-A78AE")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Andrew Kilroy &lt;andrew.kilroy@arm.com&gt;
Cc: Chanho Park &lt;chanho61.park@samsung.com&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To get the changes in:

  83bea32ac7ed37bb ("arm64: Add part number for Arm Cortex-A78AE")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Ali Saidi &lt;alisaidi@amazon.com&gt;
Cc: Andrew Kilroy &lt;andrew.kilroy@arm.com&gt;
Cc: Chanho Park &lt;chanho61.park@samsung.com&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools arm64: Import cputype.h</title>
<updated>2022-03-26T13:53:45+00:00</updated>
<author>
<name>Ali Saidi</name>
<email>alisaidi@amazon.com</email>
</author>
<published>2022-03-24T18:33:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1314376d495f2d79cc58753ff3034ccc503c43c9'/>
<id>1314376d495f2d79cc58753ff3034ccc503c43c9</id>
<content type='text'>
Bring-in the kernel's arch/arm64/include/asm/cputype.h into tools/
for arm64 to make use of all the core-type definitions in perf.

Replace sysreg.h with the version already imported into tools/.

Committer notes:

Added an entry to tools/perf/check-headers.sh, so that we get notified
when the original file in the kernel sources gets modified.

Tester notes:

LGTM. I did the testing on both my x86 and Arm64 platforms, thanks for
the fixing up.

Signed-off-by: Ali Saidi &lt;alisaidi@amazon.com&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andrew Kilroy &lt;andrew.kilroy@arm.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Poirier &lt;mathieu.poirier@linaro.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nick.Forrington@arm.com
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220324183323.31414-2-alisaidi@amazon.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Bring-in the kernel's arch/arm64/include/asm/cputype.h into tools/
for arm64 to make use of all the core-type definitions in perf.

Replace sysreg.h with the version already imported into tools/.

Committer notes:

Added an entry to tools/perf/check-headers.sh, so that we get notified
when the original file in the kernel sources gets modified.

Tester notes:

LGTM. I did the testing on both my x86 and Arm64 platforms, thanks for
the fixing up.

Signed-off-by: Ali Saidi &lt;alisaidi@amazon.com&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Andrew Kilroy &lt;andrew.kilroy@arm.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: German Gomez &lt;german.gomez@arm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Li Huafei &lt;lihuafei1@huawei.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Mathieu Poirier &lt;mathieu.poirier@linaro.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nick.Forrington@arm.com
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220324183323.31414-2-alisaidi@amazon.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools: arm64: Import sysreg.h</title>
<updated>2021-10-17T10:15:51+00:00</updated>
<author>
<name>Raghavendra Rao Ananta</name>
<email>rananta@google.com</email>
</author>
<published>2021-10-07T23:34:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=272a067df3c89f6f2176a350f88661625a2c8b3b'/>
<id>272a067df3c89f6f2176a350f88661625a2c8b3b</id>
<content type='text'>
Bring-in the kernel's arch/arm64/include/asm/sysreg.h
into tools/ for arm64 to make use of all the standard
register definitions in consistence with the kernel.

Make use of the register read/write definitions from
sysreg.h, instead of the existing definitions. A syntax
correction is needed for the files that use write_sysreg()
to make it compliant with the new (kernel's) syntax.

Reviewed-by: Andrew Jones &lt;drjones@redhat.com&gt;
Reviewed-by: Oliver Upton &lt;oupton@google.com&gt;
Signed-off-by: Raghavendra Rao Ananta &lt;rananta@google.com&gt;
[maz: squashed two commits in order to keep the series bisectable]
Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Link: https://lore.kernel.org/r/20211007233439.1826892-3-rananta@google.com
Link: https://lore.kernel.org/r/20211007233439.1826892-4-rananta@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Bring-in the kernel's arch/arm64/include/asm/sysreg.h
into tools/ for arm64 to make use of all the standard
register definitions in consistence with the kernel.

Make use of the register read/write definitions from
sysreg.h, instead of the existing definitions. A syntax
correction is needed for the files that use write_sysreg()
to make it compliant with the new (kernel's) syntax.

Reviewed-by: Andrew Jones &lt;drjones@redhat.com&gt;
Reviewed-by: Oliver Upton &lt;oupton@google.com&gt;
Signed-off-by: Raghavendra Rao Ananta &lt;rananta@google.com&gt;
[maz: squashed two commits in order to keep the series bisectable]
Signed-off-by: Marc Zyngier &lt;maz@kernel.org&gt;
Link: https://lore.kernel.org/r/20211007233439.1826892-3-rananta@google.com
Link: https://lore.kernel.org/r/20211007233439.1826892-4-rananta@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>tools: add smp_* barrier variants to include infrastructure</title>
<updated>2019-04-11T21:45:50+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2019-04-09T09:44:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6b7a21140fca461c6d8d5c65a3746e7da50a409e'/>
<id>6b7a21140fca461c6d8d5c65a3746e7da50a409e</id>
<content type='text'>
Add the definition for smp_rmb(), smp_wmb(), and smp_mb() to the
tools include infrastructure: this patch adds the implementation
for x86-64 and arm64, and have it fall back as currently is for
other archs which do not have it implemented at this point. The
x86-64 one uses lock + add combination for smp_mb() with address
below red zone.

This is on top of 09d62154f613 ("tools, perf: add and use optimized
ring_buffer_{read_head, write_tail} helpers"), which didn't touch
smp_* barrier implementations. Magnus recently rightfully reported
however that the latter on x86-64 still wrongly falls back to sfence,
lfence and mfence respectively, thus fix that for applications under
tools making use of these to avoid such ugly surprises. The main
header under tools (include/asm/barrier.h) will in that case not
select the fallback implementation.

Reported-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add the definition for smp_rmb(), smp_wmb(), and smp_mb() to the
tools include infrastructure: this patch adds the implementation
for x86-64 and arm64, and have it fall back as currently is for
other archs which do not have it implemented at this point. The
x86-64 one uses lock + add combination for smp_mb() with address
below red zone.

This is on top of 09d62154f613 ("tools, perf: add and use optimized
ring_buffer_{read_head, write_tail} helpers"), which didn't touch
smp_* barrier implementations. Magnus recently rightfully reported
however that the latter on x86-64 still wrongly falls back to sfence,
lfence and mfence respectively, thus fix that for applications under
tools making use of these to avoid such ugly surprises. The main
header under tools (include/asm/barrier.h) will in that case not
select the fallback implementation.

Reported-by: Magnus Karlsson &lt;magnus.karlsson@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools headers barrier: Fix arm64 tools build failure wrt smp_load_{acquire,release}</title>
<updated>2018-11-01T13:07:43+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2018-10-31T17:44:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=51f5fd2e4615dcdc25cd7f9d19b7b27eb9ecdac7'/>
<id>51f5fd2e4615dcdc25cd7f9d19b7b27eb9ecdac7</id>
<content type='text'>
Cheers for reporting this. I managed to reproduce the build failure with
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1).

The code in question is the arm64 versions of smp_load_acquire() and
smp_store_release(). Unlike other architectures, these are not built
around READ_ONCE() and WRITE_ONCE() since we have instructions we can
use instead of fences. Bringing our macros up-to-date with those (i.e.
tweaking the union initialisation and using the special "uXX_alias_t"
types) appears to fix the issue for me.

Committer notes:

Testing it in the systems previously failing:

  # time dm android-ndk:r12b-arm \
         android-ndk:r15c-arm \
         debian:experimental-x-arm64 \
         ubuntu:14.04.4-x-linaro-arm64 \
         ubuntu:16.04-x-arm \
         ubuntu:16.04-x-arm64 \
         ubuntu:18.04-x-arm \
         ubuntu:18.04-x-arm64
    1 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
    2 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
    3 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
    4 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
    5 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
    6 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
    7 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
    8 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0

Reported-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Tested-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20181031174408.GA27871@arm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cheers for reporting this. I managed to reproduce the build failure with
gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1).

The code in question is the arm64 versions of smp_load_acquire() and
smp_store_release(). Unlike other architectures, these are not built
around READ_ONCE() and WRITE_ONCE() since we have instructions we can
use instead of fences. Bringing our macros up-to-date with those (i.e.
tweaking the union initialisation and using the special "uXX_alias_t"
types) appears to fix the issue for me.

Committer notes:

Testing it in the systems previously failing:

  # time dm android-ndk:r12b-arm \
         android-ndk:r15c-arm \
         debian:experimental-x-arm64 \
         ubuntu:14.04.4-x-linaro-arm64 \
         ubuntu:16.04-x-arm \
         ubuntu:16.04-x-arm64 \
         ubuntu:18.04-x-arm \
         ubuntu:18.04-x-arm64
    1 android-ndk:r12b-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
    2 android-ndk:r15c-arm          : Ok   arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
    3 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
    4 ubuntu:14.04.4-x-linaro-arm64 : Ok   aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
    5 ubuntu:16.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
    6 ubuntu:16.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
    7 ubuntu:18.04-x-arm            : Ok   arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
    8 ubuntu:18.04-x-arm64          : Ok   aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0

Reported-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Tested-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20181031174408.GA27871@arm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpers</title>
<updated>2018-10-19T20:43:08+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-10-19T13:51:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=09d62154f61316f7e97eae3f31ef8770c7e4b386'/>
<id>09d62154f61316f7e97eae3f31ef8770c7e4b386</id>
<content type='text'>
Currently, on x86-64, perf uses LFENCE and MFENCE (rmb() and mb(),
respectively) when processing events from the perf ring buffer which
is unnecessarily expensive as we can do more lightweight in particular
given this is critical fast-path in perf.

According to Peter rmb()/mb() were added back then via a94d342b9cb0
("tools/perf: Add required memory barriers") at a time where kernel
still supported chips that needed it, but nowadays support for these
has been ditched completely, therefore we can fix them up as well.

While for x86-64, replacing rmb() and mb() with smp_*() variants would
result in just a compiler barrier for the former and LOCK + ADD for
the latter (__sync_synchronize() uses slower MFENCE by the way), Peter
suggested we can use smp_{load_acquire,store_release}() instead for
architectures where its implementation doesn't resolve in slower smp_mb().
Thus, e.g. in x86-64 we would be able to avoid CPU barrier entirely due
to TSO. For architectures where the latter needs to use smp_mb() e.g.
on arm, we stick to cheaper smp_rmb() variant for fetching the head.

This work adds helpers ring_buffer_read_head() and ring_buffer_write_tail()
for tools infrastructure that either switches to smp_load_acquire() for
architectures where it is cheaper or uses READ_ONCE() + smp_rmb() barrier
for those where it's not in order to fetch the data_head from the perf
control page, and it uses smp_store_release() to write the data_tail.
Latter is smp_mb() + WRITE_ONCE() combination or a cheaper variant if
architecture allows for it. Those that rely on smp_rmb() and smp_mb() can
further improve performance in a follow up step by implementing the two
under tools/arch/*/include/asm/barrier.h such that they don't have to
fallback to rmb() and mb() in tools/include/asm/barrier.h.

Switch perf to use ring_buffer_read_head() and ring_buffer_write_tail()
so it can make use of the optimizations. Later, we convert libbpf as
well to use the same helpers.

Side note [0]: the topic has been raised of whether one could simply use
the C11 gcc builtins [1] for the smp_load_acquire() and smp_store_release()
instead:

  __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
  __atomic_store_n(ptr, val, __ATOMIC_RELEASE);

Kernel and (presumably) tooling shipped along with the kernel has a
minimum requirement of being able to build with gcc-4.6 and the latter
does not have C11 builtins. While generally the C11 memory models don't
align with the kernel's, the C11 load-acquire and store-release alone
/could/ suffice, however. Issue is that this is implementation dependent
on how the load-acquire and store-release is done by the compiler and
the mapping of supported compilers must align to be compatible with the
kernel's implementation, and thus needs to be verified/tracked on a
case by case basis whether they match (unless an architecture uses them
also from kernel side). The implementations for smp_load_acquire() and
smp_store_release() in this patch have been adapted from the kernel side
ones to have a concrete and compatible mapping in place.

  [0] http://patchwork.ozlabs.org/patch/985422/
  [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, on x86-64, perf uses LFENCE and MFENCE (rmb() and mb(),
respectively) when processing events from the perf ring buffer which
is unnecessarily expensive as we can do more lightweight in particular
given this is critical fast-path in perf.

According to Peter rmb()/mb() were added back then via a94d342b9cb0
("tools/perf: Add required memory barriers") at a time where kernel
still supported chips that needed it, but nowadays support for these
has been ditched completely, therefore we can fix them up as well.

While for x86-64, replacing rmb() and mb() with smp_*() variants would
result in just a compiler barrier for the former and LOCK + ADD for
the latter (__sync_synchronize() uses slower MFENCE by the way), Peter
suggested we can use smp_{load_acquire,store_release}() instead for
architectures where its implementation doesn't resolve in slower smp_mb().
Thus, e.g. in x86-64 we would be able to avoid CPU barrier entirely due
to TSO. For architectures where the latter needs to use smp_mb() e.g.
on arm, we stick to cheaper smp_rmb() variant for fetching the head.

This work adds helpers ring_buffer_read_head() and ring_buffer_write_tail()
for tools infrastructure that either switches to smp_load_acquire() for
architectures where it is cheaper or uses READ_ONCE() + smp_rmb() barrier
for those where it's not in order to fetch the data_head from the perf
control page, and it uses smp_store_release() to write the data_tail.
Latter is smp_mb() + WRITE_ONCE() combination or a cheaper variant if
architecture allows for it. Those that rely on smp_rmb() and smp_mb() can
further improve performance in a follow up step by implementing the two
under tools/arch/*/include/asm/barrier.h such that they don't have to
fallback to rmb() and mb() in tools/include/asm/barrier.h.

Switch perf to use ring_buffer_read_head() and ring_buffer_write_tail()
so it can make use of the optimizations. Later, we convert libbpf as
well to use the same helpers.

Side note [0]: the topic has been raised of whether one could simply use
the C11 gcc builtins [1] for the smp_load_acquire() and smp_store_release()
instead:

  __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
  __atomic_store_n(ptr, val, __ATOMIC_RELEASE);

Kernel and (presumably) tooling shipped along with the kernel has a
minimum requirement of being able to build with gcc-4.6 and the latter
does not have C11 builtins. While generally the C11 memory models don't
align with the kernel's, the C11 load-acquire and store-release alone
/could/ suffice, however. Issue is that this is implementation dependent
on how the load-acquire and store-release is done by the compiler and
the mapping of supported compilers must align to be compatible with the
kernel's implementation, and thus needs to be verified/tracked on a
case by case basis whether they match (unless an architecture uses them
also from kernel side). The implementations for smp_load_acquire() and
smp_store_release() in this patch have been adapted from the kernel side
ones to have a concrete and compatible mapping in place.

  [0] http://patchwork.ozlabs.org/patch/985422/
  [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
