<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/powerpc/kernel/module_64.c, branch v6.17</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>powerpc64/ftrace: fix module loading without patchable function entries</title>
<updated>2025-04-15T06:10:54+00:00</updated>
<author>
<name>Anthony Iliopoulos</name>
<email>ailiop@suse.com</email>
</author>
<published>2025-02-04T23:18:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=534f5a8ba27863141e29766467a3e1f61bcb47ac'/>
<id>534f5a8ba27863141e29766467a3e1f61bcb47ac</id>
<content type='text'>
get_stubs_size assumes that there must always be at least one patchable
function entry, which is not always the case (modules that export data
but no code), otherwise it returns -ENOEXEC and thus the section header
sh_size is set to that value. During module_memory_alloc() the size is
passed to execmem_alloc() after being page-aligned and thus set to zero
which will cause it to fail the allocation (and thus module loading) as
__vmalloc_node_range() checks for zero-sized allocs and returns null:

[  115.466896] module_64: cast_common: doesn't contain __patchable_function_entries.
[  115.469189] ------------[ cut here ]------------
[  115.469496] WARNING: CPU: 0 PID: 274 at mm/vmalloc.c:3778 __vmalloc_node_range_noprof+0x8b4/0x8f0
...
[  115.478574] ---[ end trace 0000000000000000 ]---
[  115.479545] execmem: unable to allocate memory

Fix this by removing the check completely, since it is anyway not
helpful to propagate this as an error upwards.

Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Anthony Iliopoulos &lt;ailiop@suse.com&gt;
Acked-by: Naveen N Rao (AMD) &lt;naveen@kernel.org&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20250204231821.39140-1-ailiop@suse.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
get_stubs_size assumes that there must always be at least one patchable
function entry, which is not always the case (modules that export data
but no code), otherwise it returns -ENOEXEC and thus the section header
sh_size is set to that value. During module_memory_alloc() the size is
passed to execmem_alloc() after being page-aligned and thus set to zero
which will cause it to fail the allocation (and thus module loading) as
__vmalloc_node_range() checks for zero-sized allocs and returns null:

[  115.466896] module_64: cast_common: doesn't contain __patchable_function_entries.
[  115.469189] ------------[ cut here ]------------
[  115.469496] WARNING: CPU: 0 PID: 274 at mm/vmalloc.c:3778 __vmalloc_node_range_noprof+0x8b4/0x8f0
...
[  115.478574] ---[ end trace 0000000000000000 ]---
[  115.479545] execmem: unable to allocate memory

Fix this by removing the check completely, since it is anyway not
helpful to propagate this as an error upwards.

Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Anthony Iliopoulos &lt;ailiop@suse.com&gt;
Acked-by: Naveen N Rao (AMD) &lt;naveen@kernel.org&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/20250204231821.39140-1-ailiop@suse.com

</pre>
</div>
</content>
</entry>
<entry>
<title>modules: Support extended MODVERSIONS info</title>
<updated>2025-01-10T16:25:26+00:00</updated>
<author>
<name>Matthew Maurer</name>
<email>mmaurer@google.com</email>
</author>
<published>2025-01-03T17:37:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=54ac1ac8edeb74ff87fc880d1ee58785bdcbe323'/>
<id>54ac1ac8edeb74ff87fc880d1ee58785bdcbe323</id>
<content type='text'>
Adds a new format for MODVERSIONS which stores each field in a separate
ELF section. This initially adds support for variable length names, but
could later be used to add additional fields to MODVERSIONS in a
backwards compatible way if needed. Any new fields will be ignored by
old user tooling, unlike the current format where user tooling cannot
tolerate adjustments to the format (for example making the name field
longer).

Since PPC munges its version records to strip leading dots, we reproduce
the munging for the new format. Other architectures do not appear to
have architecture-specific usage of this information.

Reviewed-by: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Signed-off-by: Matthew Maurer &lt;mmaurer@google.com&gt;
Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds a new format for MODVERSIONS which stores each field in a separate
ELF section. This initially adds support for variable length names, but
could later be used to add additional fields to MODVERSIONS in a
backwards compatible way if needed. Any new fields will be ignored by
old user tooling, unlike the current format where user tooling cannot
tolerate adjustments to the format (for example making the name field
longer).

Since PPC munges its version records to strip leading dots, we reproduce
the munging for the new format. Other architectures do not appear to
have architecture-specific usage of this information.

Reviewed-by: Sami Tolvanen &lt;samitolvanen@google.com&gt;
Signed-off-by: Matthew Maurer &lt;mmaurer@google.com&gt;
Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux</title>
<updated>2024-11-23T18:44:31+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-23T18:44:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=42d9e8b7ccddee75989283cf7477305cfe3776ff'/>
<id>42d9e8b7ccddee75989283cf7477305cfe3776ff</id>
<content type='text'>
Pull powerpc updates from Michael Ellerman:

 - Rework kfence support for the HPT MMU to work on systems with &gt;= 16TB
   of RAM.

 - Remove the powerpc "maple" platform, used by the "Yellow Dog
   Powerstation".

 - Add support for DYNAMIC_FTRACE_WITH_CALL_OPS,
   DYNAMIC_FTRACE_WITH_DIRECT_CALLS &amp; BPF Trampolines.

 - Add support for running KVM nested guests on Power11.

 - Other small features, cleanups and fixes.

Thanks to Amit Machhiwal, Arnd Bergmann, Christophe Leroy, Costa
Shulyupin, David Hunter, David Wang, Disha Goel, Gautam Menghani, Geert
Uytterhoeven, Hari Bathini, Julia Lawall, Kajol Jain, Keith Packard,
Lukas Bulwahn, Madhavan Srinivasan, Markus Elfring, Michal Suchanek,
Ming Lei, Mukesh Kumar Chaurasiya, Nathan Chancellor, Naveen N Rao,
Nicholas Piggin, Nysal Jan K.A, Paulo Miguel Almeida, Pavithra Prakash,
Ritesh Harjani (IBM), Rob Herring (Arm), Sachin P Bappalige, Shen
Lichuan, Simon Horman, Sourabh Jain, Thomas Weißschuh, Thorsten Blum,
Thorsten Leemhuis, Venkat Rao Bagalkote, Zhang Zekun, and zhang jiao.

* tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (89 commits)
  EDAC/powerpc: Remove PPC_MAPLE drivers
  powerpc/perf: Add per-task/process monitoring to vpa_pmu driver
  powerpc/kvm: Add vpa latency counters to kvm_vcpu_arch
  docs: ABI: sysfs-bus-event_source-devices-vpa-pmu: Document sysfs event format entries for vpa_pmu
  powerpc/perf: Add perf interface to expose vpa counters
  MAINTAINERS: powerpc: Mark Maddy as "M"
  powerpc/Makefile: Allow overriding CPP
  powerpc-km82xx.c: replace of_node_put() with __free
  ps3: Correct some typos in comments
  powerpc/kexec: Fix return of uninitialized variable
  macintosh: Use common error handling code in via_pmu_led_init()
  powerpc/powermac: Use of_property_match_string() in pmac_has_backlight_type()
  powerpc: remove dead config options for MPC85xx platform support
  powerpc/xive: Use cpumask_intersects()
  selftests/powerpc: Remove the path after initialization.
  powerpc/xmon: symbol lookup length fixed
  powerpc/ep8248e: Use %pa to format resource_size_t
  powerpc/ps3: Reorganize kerneldoc parameter names
  KVM: PPC: Book3S HV: Fix kmv -&gt; kvm typo
  powerpc/sstep: make emulate_vsx_load and emulate_vsx_store static
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull powerpc updates from Michael Ellerman:

 - Rework kfence support for the HPT MMU to work on systems with &gt;= 16TB
   of RAM.

 - Remove the powerpc "maple" platform, used by the "Yellow Dog
   Powerstation".

 - Add support for DYNAMIC_FTRACE_WITH_CALL_OPS,
   DYNAMIC_FTRACE_WITH_DIRECT_CALLS &amp; BPF Trampolines.

 - Add support for running KVM nested guests on Power11.

 - Other small features, cleanups and fixes.

Thanks to Amit Machhiwal, Arnd Bergmann, Christophe Leroy, Costa
Shulyupin, David Hunter, David Wang, Disha Goel, Gautam Menghani, Geert
Uytterhoeven, Hari Bathini, Julia Lawall, Kajol Jain, Keith Packard,
Lukas Bulwahn, Madhavan Srinivasan, Markus Elfring, Michal Suchanek,
Ming Lei, Mukesh Kumar Chaurasiya, Nathan Chancellor, Naveen N Rao,
Nicholas Piggin, Nysal Jan K.A, Paulo Miguel Almeida, Pavithra Prakash,
Ritesh Harjani (IBM), Rob Herring (Arm), Sachin P Bappalige, Shen
Lichuan, Simon Horman, Sourabh Jain, Thomas Weißschuh, Thorsten Blum,
Thorsten Leemhuis, Venkat Rao Bagalkote, Zhang Zekun, and zhang jiao.

* tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (89 commits)
  EDAC/powerpc: Remove PPC_MAPLE drivers
  powerpc/perf: Add per-task/process monitoring to vpa_pmu driver
  powerpc/kvm: Add vpa latency counters to kvm_vcpu_arch
  docs: ABI: sysfs-bus-event_source-devices-vpa-pmu: Document sysfs event format entries for vpa_pmu
  powerpc/perf: Add perf interface to expose vpa counters
  MAINTAINERS: powerpc: Mark Maddy as "M"
  powerpc/Makefile: Allow overriding CPP
  powerpc-km82xx.c: replace of_node_put() with __free
  ps3: Correct some typos in comments
  powerpc/kexec: Fix return of uninitialized variable
  macintosh: Use common error handling code in via_pmu_led_init()
  powerpc/powermac: Use of_property_match_string() in pmac_has_backlight_type()
  powerpc: remove dead config options for MPC85xx platform support
  powerpc/xive: Use cpumask_intersects()
  selftests/powerpc: Remove the path after initialization.
  powerpc/xmon: symbol lookup length fixed
  powerpc/ep8248e: Use %pa to format resource_size_t
  powerpc/ps3: Reorganize kerneldoc parameter names
  KVM: PPC: Book3S HV: Fix kmv -&gt; kvm typo
  powerpc/sstep: make emulate_vsx_load and emulate_vsx_store static
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>asm-generic: introduce text-patching.h</title>
<updated>2024-11-07T22:25:15+00:00</updated>
<author>
<name>Mike Rapoport (Microsoft)</name>
<email>rppt@kernel.org</email>
</author>
<published>2024-10-23T16:27:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0c3beacf681ec897e0b36685a9b49d01f5cb2dfb'/>
<id>0c3beacf681ec897e0b36685a9b49d01f5cb2dfb</id>
<content type='text'>
Several architectures support text patching, but they name the header
files that declare patching functions differently.

Make all such headers consistently named text-patching.h and add an empty
header in asm-generic for architectures that do not support text patching.

Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt; # m68k
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Tested-by: kdevops &lt;kdevops@lists.linux.dev&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: Brian Cain &lt;bcain@quicinc.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dinh Nguyen &lt;dinguyen@kernel.org&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Cc: John Paul Adrian Glaubitz &lt;glaubitz@physik.fu-berlin.de&gt;
Cc: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Cc: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Cc: Vineet Gupta &lt;vgupta@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Several architectures support text patching, but they name the header
files that declare patching functions differently.

Make all such headers consistently named text-patching.h and add an empty
header in asm-generic for architectures that do not support text patching.

Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt; # m68k
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Tested-by: kdevops &lt;kdevops@lists.linux.dev&gt;
Cc: Andreas Larsson &lt;andreas@gaisler.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: Brian Cain &lt;bcain@quicinc.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dinh Nguyen &lt;dinguyen@kernel.org&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Huacai Chen &lt;chenhuacai@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Cc: John Paul Adrian Glaubitz &lt;glaubitz@physik.fu-berlin.de&gt;
Cc: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Cc: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Cc: Vineet Gupta &lt;vgupta@kernel.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc64/ftrace: Move ftrace sequence out of line</title>
<updated>2024-10-31T00:00:54+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2024-10-30T07:08:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=eec37961a56aa4f3fe1c33ffd48eec7d1bb0c009'/>
<id>eec37961a56aa4f3fe1c33ffd48eec7d1bb0c009</id>
<content type='text'>
Function profile sequence on powerpc includes two instructions at the
beginning of each function:
	mflr	r0
	bl	ftrace_caller

The call to ftrace_caller() gets nop'ed out during kernel boot and is
patched in when ftrace is enabled.

Given the sequence, we cannot return from ftrace_caller with 'blr' as we
need to keep LR and r0 intact. This results in link stack (return
address predictor) imbalance when ftrace is enabled. To address that, we
would like to use a three instruction sequence:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0

Further more, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to
reserve two instruction slots before the function. This results in a
total of five instruction slots to be reserved for ftrace use on each
function that is traced.

Move the function profile sequence out-of-line to minimize its impact.
To do this, we reserve a single nop at function entry using
-fpatchable-function-entry=1 and add a pass on vmlinux.o to determine
the total number of functions that can be traced. This is then used to
generate a .S file reserving the appropriate amount of space for use as
ftrace stubs, which is built and linked into vmlinux.

On bootup, the stub space is split into separate stubs per function and
populated with the proper instruction sequence. A pointer to the
associated stub is maintained in dyn_arch_ftrace.

For modules, space for ftrace stubs is reserved from the generic module
stub space.

This is restricted to and enabled by default only on 64-bit powerpc,
though there are some changes to accommodate 32-bit powerpc. This is
done so that 32-bit powerpc could choose to opt into this based on
further tests and benchmarks.

As an example, after this patch, kernel functions will have a single nop
at function entry:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	nop
	mfocrf	r11,8
	...

When ftrace is enabled, the nop is converted to an unconditional branch
to the stub associated with that function:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	b	ftrace_ool_stub_text_end+0x11b28
	mfocrf	r11,8
	...

The associated stub:
&lt;ftrace_ool_stub_text_end+0x11b28&gt;:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0
	b	kernel_clone+0xc
	...

This change showed an improvement of ~10% in null_syscall benchmark on a
Power 10 system with ftrace enabled.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-13-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Function profile sequence on powerpc includes two instructions at the
beginning of each function:
	mflr	r0
	bl	ftrace_caller

The call to ftrace_caller() gets nop'ed out during kernel boot and is
patched in when ftrace is enabled.

Given the sequence, we cannot return from ftrace_caller with 'blr' as we
need to keep LR and r0 intact. This results in link stack (return
address predictor) imbalance when ftrace is enabled. To address that, we
would like to use a three instruction sequence:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0

Further more, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to
reserve two instruction slots before the function. This results in a
total of five instruction slots to be reserved for ftrace use on each
function that is traced.

Move the function profile sequence out-of-line to minimize its impact.
To do this, we reserve a single nop at function entry using
-fpatchable-function-entry=1 and add a pass on vmlinux.o to determine
the total number of functions that can be traced. This is then used to
generate a .S file reserving the appropriate amount of space for use as
ftrace stubs, which is built and linked into vmlinux.

On bootup, the stub space is split into separate stubs per function and
populated with the proper instruction sequence. A pointer to the
associated stub is maintained in dyn_arch_ftrace.

For modules, space for ftrace stubs is reserved from the generic module
stub space.

This is restricted to and enabled by default only on 64-bit powerpc,
though there are some changes to accommodate 32-bit powerpc. This is
done so that 32-bit powerpc could choose to opt into this based on
further tests and benchmarks.

As an example, after this patch, kernel functions will have a single nop
at function entry:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	nop
	mfocrf	r11,8
	...

When ftrace is enabled, the nop is converted to an unconditional branch
to the stub associated with that function:
&lt;kernel_clone&gt;:
	addis	r2,r12,467
	addi	r2,r2,-16028
	b	ftrace_ool_stub_text_end+0x11b28
	mfocrf	r11,8
	...

The associated stub:
&lt;ftrace_ool_stub_text_end+0x11b28&gt;:
	mflr	r0
	bl	ftrace_caller
	mtlr	r0
	b	kernel_clone+0xc
	...

This change showed an improvement of ~10% in null_syscall benchmark on a
Power 10 system with ftrace enabled.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-13-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/module_64: Convert #ifdef to IS_ENABLED()</title>
<updated>2024-10-31T00:00:53+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2024-10-30T07:08:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c12cfe9dee077763708e0a5cf3aca02a85b1e8ba'/>
<id>c12cfe9dee077763708e0a5cf3aca02a85b1e8ba</id>
<content type='text'>
Minor refactor for converting #ifdef to IS_ENABLED().

Reviewed-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-6-hbathini@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Minor refactor for converting #ifdef to IS_ENABLED().

Reviewed-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://patch.msgid.link/20241030070850.1361304-6-hbathini@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64: Convert patch_instruction() to patch_u32()</title>
<updated>2024-08-21T10:15:13+00:00</updated>
<author>
<name>Benjamin Gray</name>
<email>bgray@linux.ibm.com</email>
</author>
<published>2024-05-15T02:44:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=90d4fed5b273155c378b1d37595f2209f0a92bed'/>
<id>90d4fed5b273155c378b1d37595f2209f0a92bed</id>
<content type='text'>
This use of patch_instruction() is working on 32 bit data, and can fail
if the data looks like a prefixed instruction and the extra write
crosses a page boundary. Use patch_u32() to fix the write size.

Fixes: 8734b41b3efe ("powerpc/module_64: Fix livepatching for RO modules")
Link: https://lore.kernel.org/all/20230203004649.1f59dbd4@yea/
Signed-off-by: Benjamin Gray &lt;bgray@linux.ibm.com&gt;
Tested-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Acked-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20240515024445.236364-4-bgray@linux.ibm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This use of patch_instruction() is working on 32 bit data, and can fail
if the data looks like a prefixed instruction and the extra write
crosses a page boundary. Use patch_u32() to fix the write size.

Fixes: 8734b41b3efe ("powerpc/module_64: Fix livepatching for RO modules")
Link: https://lore.kernel.org/all/20230203004649.1f59dbd4@yea/
Signed-off-by: Benjamin Gray &lt;bgray@linux.ibm.com&gt;
Tested-by: Hari Bathini &lt;hbathini@linux.ibm.com&gt;
Acked-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20240515024445.236364-4-bgray@linux.ibm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/ftrace: Add support for -fpatchable-function-entry</title>
<updated>2023-08-21T14:09:06+00:00</updated>
<author>
<name>Naveen N Rao</name>
<email>naveen@kernel.org</email>
</author>
<published>2023-06-19T09:47:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0f71dcfb4aef6043da6cc509e7a7f6a3ae87c12d'/>
<id>0f71dcfb4aef6043da6cc509e7a7f6a3ae87c12d</id>
<content type='text'>
GCC v13.1 updated support for -fpatchable-function-entry on ppc64le to
emit nops after the local entry point, rather than before it. This
allows us to use this in the kernel for ftrace purposes. A new script is
added under arch/powerpc/tools/ to help detect if nops are emitted after
the function local entry point, or before the global entry point.

With -fpatchable-function-entry, we no longer have the profiling
instructions generated at function entry, so we only need to validate
the presence of two nops at the ftrace location in ftrace_init_nop(). We
patch the preceding instruction with 'mflr r0' to match the
-mprofile-kernel ABI for subsequent ftrace use.

This changes the profiling instructions used on ppc32. The default -pg
option emits an additional 'stw' instruction after 'mflr r0' and before
the branch to _mcount 'bl _mcount'. This is very similar to the original
-mprofile-kernel implementation on ppc64le, where an additional 'std'
instruction was used to save LR to its save location in the caller's
stackframe. Subsequently, this additional store was removed in later
compiler versions for performance reasons. The same reasons apply for
ppc32 so we only patch in a 'mflr r0'.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Reviewed-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/68586d22981a2c3bb45f27a2b621173d10a7d092.1687166935.git.naveen@kernel.org

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GCC v13.1 updated support for -fpatchable-function-entry on ppc64le to
emit nops after the local entry point, rather than before it. This
allows us to use this in the kernel for ftrace purposes. A new script is
added under arch/powerpc/tools/ to help detect if nops are emitted after
the function local entry point, or before the global entry point.

With -fpatchable-function-entry, we no longer have the profiling
instructions generated at function entry, so we only need to validate
the presence of two nops at the ftrace location in ftrace_init_nop(). We
patch the preceding instruction with 'mflr r0' to match the
-mprofile-kernel ABI for subsequent ftrace use.

This changes the profiling instructions used on ppc32. The default -pg
option emits an additional 'stw' instruction after 'mflr r0' and before
the branch to _mcount 'bl _mcount'. This is very similar to the original
-mprofile-kernel implementation on ppc64le, where an additional 'std'
instruction was used to save LR to its save location in the caller's
stackframe. Subsequently, this additional store was removed in later
compiler versions for performance reasons. The same reasons apply for
ppc32 so we only patch in a 'mflr r0'.

Signed-off-by: Naveen N Rao &lt;naveen@kernel.org&gt;
Reviewed-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/68586d22981a2c3bb45f27a2b621173d10a7d092.1687166935.git.naveen@kernel.org

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64: modules support building with PCREL addresing</title>
<updated>2023-04-20T03:21:42+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-04-08T02:17:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=77e69ee7ce0715c39b9a0cde68ff44fe467435ef'/>
<id>77e69ee7ce0715c39b9a0cde68ff44fe467435ef</id>
<content type='text'>
Build modules using PCREL addressing when CONFIG_PPC_KERNEL_PCREL=y.

- The module loader must handle several new relocation types:

  * R_PPC64_REL24_NOTOC is a function call handled like R_PPC_REL24, but
    does not restore r2 upon return. The external function call stub is
    changed to use pcrel addressing to load the function pointer rather
    than based on the module TOC.

  * R_PPC64_GOT_PCREL34 is a reference to external data. A GOT table
    must be built by hand, because the linker adds this during the final
    link (which is not done for kernel modules). The GOT table is built
    similarly to the way the external function call stub table is. This
    section is called .mygot because .got has a special meaning for the
    linker and can become upset.

  * R_PPC64_PCREL34 is used for local data addressing, but there is a
    special case where the percpu section is moved at load-time to the
    percpu area which is out of range of this relocation. This requires
    the PCREL34 relocations are converted to use GOT_PCREL34 addressing.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
[mpe: Some coding style &amp; formatting fixups]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230408021752.862660-7-npiggin@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Build modules using PCREL addressing when CONFIG_PPC_KERNEL_PCREL=y.

- The module loader must handle several new relocation types:

  * R_PPC64_REL24_NOTOC is a function call handled like R_PPC_REL24, but
    does not restore r2 upon return. The external function call stub is
    changed to use pcrel addressing to load the function pointer rather
    than based on the module TOC.

  * R_PPC64_GOT_PCREL34 is a reference to external data. A GOT table
    must be built by hand, because the linker adds this during the final
    link (which is not done for kernel modules). The GOT table is built
    similarly to the way the external function call stub table is. This
    section is called .mygot because .got has a special meaning for the
    linker and can become upset.

  * R_PPC64_PCREL34 is used for local data addressing, but there is a
    special case where the percpu section is moved at load-time to the
    percpu area which is out of range of this relocation. This requires
    the PCREL34 relocations are converted to use GOT_PCREL34 addressing.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
[mpe: Some coding style &amp; formatting fixups]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230408021752.862660-7-npiggin@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64: vmlinux support building with PCREL addresing</title>
<updated>2023-04-20T02:59:21+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2023-04-08T02:17:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7e3a68be42e10f5fa5890e97afc0afd992355bc3'/>
<id>7e3a68be42e10f5fa5890e97afc0afd992355bc3</id>
<content type='text'>
PC-Relative or PCREL addressing is an extension to the ELF ABI which
uses Power ISA v3.1 PC-relative instructions to calculate addresses,
rather than the traditional TOC scheme.

Add an option to build vmlinux using pcrel addressing. Modules continue
to use TOC addressing.

- TOC address helpers and r2 are poisoned with -1 when running vmlinux.
  r2 could be used for something useful once things are ironed out.

- Assembly must call C functions with @notoc annotation, or the linker
  complains aobut a missing nop after the call. This is done with the
  CFUNC macro introduced earlier.

- Boot: with the exception of prom_init, the execution branches to the
  kernel virtual address early in boot, before any addresses are
  generated, which ensures 34-bit pcrel addressing does not miss the
  high PAGE_OFFSET bits. TOC relative addressing has a similar
  requirement. prom_init does not go to the virtual address and its
  addresses should not carry over to the post-prom kernel.

- Ftrace trampolines are converted from TOC addressing to pcrel
  addressing, including module ftrace trampolines that currently use the
  kernel TOC to find ftrace target functions.

- BPF function prologue and function calling generation are converted
  from TOC to pcrel.

- copypage_64.S has an interesting problem, prefixed instructions have
  alignment restrictions so the linker can add padding, which makes the
  assembler treat the difference between two local labels as
  non-constant even if alignment is arranged so padding is not required.
  This may need toolchain help to solve nicely, for now move the prefix
  instruction out of the alternate patch section to work around it.

This reduces kernel text size by about 6%.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230408021752.862660-6-npiggin@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PC-Relative or PCREL addressing is an extension to the ELF ABI which
uses Power ISA v3.1 PC-relative instructions to calculate addresses,
rather than the traditional TOC scheme.

Add an option to build vmlinux using pcrel addressing. Modules continue
to use TOC addressing.

- TOC address helpers and r2 are poisoned with -1 when running vmlinux.
  r2 could be used for something useful once things are ironed out.

- Assembly must call C functions with @notoc annotation, or the linker
  complains aobut a missing nop after the call. This is done with the
  CFUNC macro introduced earlier.

- Boot: with the exception of prom_init, the execution branches to the
  kernel virtual address early in boot, before any addresses are
  generated, which ensures 34-bit pcrel addressing does not miss the
  high PAGE_OFFSET bits. TOC relative addressing has a similar
  requirement. prom_init does not go to the virtual address and its
  addresses should not carry over to the post-prom kernel.

- Ftrace trampolines are converted from TOC addressing to pcrel
  addressing, including module ftrace trampolines that currently use the
  kernel TOC to find ftrace target functions.

- BPF function prologue and function calling generation are converted
  from TOC to pcrel.

- copypage_64.S has an interesting problem, prefixed instructions have
  alignment restrictions so the linker can add padding, which makes the
  assembler treat the difference between two local labels as
  non-constant even if alignment is arranged so padding is not required.
  This may need toolchain help to solve nicely, for now move the prefix
  instruction out of the alternate patch section to work around it.

This reduces kernel text size by about 6%.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230408021752.862660-6-npiggin@gmail.com
</pre>
</div>
</content>
</entry>
</feed>
