<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/riscv/kernel/module-sections.c, branch v6.16</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>riscv: module: Optimize PLT/GOT entry counting</title>
<updated>2025-06-05T18:09:43+00:00</updated>
<author>
<name>Samuel Holland</name>
<email>samuel.holland@sifive.com</email>
</author>
<published>2025-04-09T17:14:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=be17c0df67959fe4f88dac75dc26ed9252d4b133'/>
<id>be17c0df67959fe4f88dac75dc26ed9252d4b133</id>
<content type='text'>
perf reports that 99.63% of the cycles from `modprobe amdgpu` are spent
inside module_frob_arch_sections(). This is because amdgpu.ko contains
about 300000 relocations in its .rela.text section, and the algorithm in
count_max_entries() takes quadratic time.

Apply two optimizations from the arm64 code, which together reduce the
total execution time by 99.58%. First, sort the relocations so duplicate
entries are adjacent. Second, reduce the number of relocations that must
be sorted by filtering to only relocations that need PLT/GOT entries, as
done in commit d4e0340919fb ("arm64/module: Optimize module load time by
optimizing PLT counting").

Unlike the arm64 code, here the filtering and sorting is done in a
scratch buffer, because the HI20 relocation search optimization in
apply_relocate_add() depends on the original order of the relocations.
This allows accumulating PLT/GOT relocations across sections so sorting
and counting is only done once per module.

Signed-off-by: Samuel Holland &lt;samuel.holland@sifive.com&gt;
Reviewed-by: Andrew Jones &lt;ajones@ventanamicro.com&gt;
Link: https://lore.kernel.org/r/20250409171526.862481-3-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
perf reports that 99.63% of the cycles from `modprobe amdgpu` are spent
inside module_frob_arch_sections(). This is because amdgpu.ko contains
about 300000 relocations in its .rela.text section, and the algorithm in
count_max_entries() takes quadratic time.

Apply two optimizations from the arm64 code, which together reduce the
total execution time by 99.58%. First, sort the relocations so duplicate
entries are adjacent. Second, reduce the number of relocations that must
be sorted by filtering to only relocations that need PLT/GOT entries, as
done in commit d4e0340919fb ("arm64/module: Optimize module load time by
optimizing PLT counting").

Unlike the arm64 code, here the filtering and sorting is done in a
scratch buffer, because the HI20 relocation search optimization in
apply_relocate_add() depends on the original order of the relocations.
This allows accumulating PLT/GOT relocations across sections so sorting
and counting is only done once per module.

Signed-off-by: Samuel Holland &lt;samuel.holland@sifive.com&gt;
Reviewed-by: Andrew Jones &lt;ajones@ventanamicro.com&gt;
Link: https://lore.kernel.org/r/20250409171526.862481-3-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: module: Allocate PLT entries for R_RISCV_PLT32</title>
<updated>2025-04-14T13:07:07+00:00</updated>
<author>
<name>Samuel Holland</name>
<email>samuel.holland@sifive.com</email>
</author>
<published>2025-04-09T17:14:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1ee1313f4722e6d67c6e9447ee81d24d6e3ff4ad'/>
<id>1ee1313f4722e6d67c6e9447ee81d24d6e3ff4ad</id>
<content type='text'>
apply_r_riscv_plt32_rela() may need to emit a PLT entry for the
referenced symbol, so there must be space allocated in the PLT.

Fixes: 8fd6c5142395 ("riscv: Add remaining module relocations")
Signed-off-by: Samuel Holland &lt;samuel.holland@sifive.com&gt;
Reviewed-by: Andrew Jones &lt;ajones@ventanamicro.com&gt;
Link: https://lore.kernel.org/r/20250409171526.862481-2-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
apply_r_riscv_plt32_rela() may need to emit a PLT entry for the
referenced symbol, so there must be space allocated in the PLT.

Fixes: 8fd6c5142395 ("riscv: Add remaining module relocations")
Signed-off-by: Samuel Holland &lt;samuel.holland@sifive.com&gt;
Reviewed-by: Andrew Jones &lt;ajones@ventanamicro.com&gt;
Link: https://lore.kernel.org/r/20250409171526.862481-2-samuel.holland@sifive.com
Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv: add missing header file includes</title>
<updated>2019-10-28T07:46:01+00:00</updated>
<author>
<name>Paul Walmsley</name>
<email>paul.walmsley@sifive.com</email>
</author>
<published>2019-10-17T22:21:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ed881bc3afc40d7a23c2211ead1aeb4980dda20'/>
<id>5ed881bc3afc40d7a23c2211ead1aeb4980dda20</id>
<content type='text'>
sparse identifies several missing prototypes caused by missing
preprocessor include directives:

arch/riscv/kernel/cpufeature.c:16:6: warning: symbol 'has_fpu' was not declared. Should it be static?
arch/riscv/kernel/process.c:26:6: warning: symbol 'arch_cpu_idle' was not declared. Should it be static?
arch/riscv/kernel/reset.c:15:6: warning: symbol 'pm_power_off' was not declared. Should it be static?
arch/riscv/kernel/syscall_table.c:15:6: warning: symbol 'sys_call_table' was not declared. Should it be static?
arch/riscv/kernel/traps.c:149:13: warning: symbol 'trap_init' was not declared. Should it be static?
arch/riscv/kernel/vdso.c:54:5: warning: symbol 'arch_setup_additional_pages' was not declared. Should it be static?
arch/riscv/kernel/smp.c:64:6: warning: symbol 'arch_match_cpu_phys_id' was not declared. Should it be static?
arch/riscv/kernel/module-sections.c:89:5: warning: symbol 'module_frob_arch_sections' was not declared. Should it be static?
arch/riscv/mm/context.c:42:6: warning: symbol 'switch_mm' was not declared. Should it be static?

Fix by including the appropriate header files in the appropriate
source files.

This patch should have no functional impact.

Signed-off-by: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sparse identifies several missing prototypes caused by missing
preprocessor include directives:

arch/riscv/kernel/cpufeature.c:16:6: warning: symbol 'has_fpu' was not declared. Should it be static?
arch/riscv/kernel/process.c:26:6: warning: symbol 'arch_cpu_idle' was not declared. Should it be static?
arch/riscv/kernel/reset.c:15:6: warning: symbol 'pm_power_off' was not declared. Should it be static?
arch/riscv/kernel/syscall_table.c:15:6: warning: symbol 'sys_call_table' was not declared. Should it be static?
arch/riscv/kernel/traps.c:149:13: warning: symbol 'trap_init' was not declared. Should it be static?
arch/riscv/kernel/vdso.c:54:5: warning: symbol 'arch_setup_additional_pages' was not declared. Should it be static?
arch/riscv/kernel/smp.c:64:6: warning: symbol 'arch_match_cpu_phys_id' was not declared. Should it be static?
arch/riscv/kernel/module-sections.c:89:5: warning: symbol 'module_frob_arch_sections' was not declared. Should it be static?
arch/riscv/mm/context.c:42:6: warning: symbol 'switch_mm' was not declared. Should it be static?

Fix by including the appropriate header files in the appropriate
source files.

This patch should have no functional impact.

Signed-off-by: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RISC-V: Support MODULE_SECTIONS mechanism on RV32</title>
<updated>2019-01-07T16:19:20+00:00</updated>
<author>
<name>Zong Li</name>
<email>zongbox@gmail.com</email>
</author>
<published>2018-12-07T09:02:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2cffc9569050a8dbc0c4a6ee7186c0919487c3ec'/>
<id>2cffc9569050a8dbc0c4a6ee7186c0919487c3ec</id>
<content type='text'>
This patch supports dynamic generate got and plt sections mechanism on
rv32. It contains the modification as follows:
 - Always enable MODULE_SECTIONS (both rv64 and rv32)
 - Change the fixed size type.

This patch had been tested by following modules:

btrfs 6795991 0 - Live 0xa544b000
test_static_keys 17304 0 - Live 0xa28be000
zstd_compress 1198986 1 btrfs, Live 0xa2a25000
zstd_decompress 608112 1 btrfs, Live 0xa24e7000
lzo 8787 0 - Live 0xa2049000
xor 27461 1 btrfs, Live 0xa2041000
zram 78849 0 - Live 0xa2276000
netdevsim 55909 0 - Live 0xa202d000
tun 211534 0 - Live 0xa21b5000
fuse 566049 0 - Live 0xa25fb000
nfs_layout_flexfiles 192597 0 - Live 0xa229b000
ramoops 74895 0 - Live 0xa2019000
xfs 3973221 0 - Live 0xa507f000
libcrc32c 3053 2 btrfs,xfs, Live 0xa34af000
lzo_compress 17302 2 btrfs,lzo, Live 0xa347d000
lzo_decompress 7178 2 btrfs,lzo, Live 0xa3451000
raid6_pq 142086 1 btrfs, Live 0xa33a4000
reed_solomon 31022 1 ramoops, Live 0xa31eb000
test_bitmap 3734 0 - Live 0xa31af000
test_bpf 1588736 0 - Live 0xa2c11000
test_kmod 41161 0 - Live 0xa29f8000
test_module 1356 0 - Live 0xa299e000
test_printf 6024 0 [permanent], Live 0xa2971000
test_static_key_base 5797 1 test_static_keys, Live 0xa2931000
test_user_copy 4382 0 - Live 0xa28c9000
xxhash 70501 2 zstd_compress,zstd_decompress, Live 0xa2055000

Signed-off-by: Zong Li &lt;zong@andestech.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@sifive.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch supports dynamic generate got and plt sections mechanism on
rv32. It contains the modification as follows:
 - Always enable MODULE_SECTIONS (both rv64 and rv32)
 - Change the fixed size type.

This patch had been tested by following modules:

btrfs 6795991 0 - Live 0xa544b000
test_static_keys 17304 0 - Live 0xa28be000
zstd_compress 1198986 1 btrfs, Live 0xa2a25000
zstd_decompress 608112 1 btrfs, Live 0xa24e7000
lzo 8787 0 - Live 0xa2049000
xor 27461 1 btrfs, Live 0xa2041000
zram 78849 0 - Live 0xa2276000
netdevsim 55909 0 - Live 0xa202d000
tun 211534 0 - Live 0xa21b5000
fuse 566049 0 - Live 0xa25fb000
nfs_layout_flexfiles 192597 0 - Live 0xa229b000
ramoops 74895 0 - Live 0xa2019000
xfs 3973221 0 - Live 0xa507f000
libcrc32c 3053 2 btrfs,xfs, Live 0xa34af000
lzo_compress 17302 2 btrfs,lzo, Live 0xa347d000
lzo_decompress 7178 2 btrfs,lzo, Live 0xa3451000
raid6_pq 142086 1 btrfs, Live 0xa33a4000
reed_solomon 31022 1 ramoops, Live 0xa31eb000
test_bitmap 3734 0 - Live 0xa31af000
test_bpf 1588736 0 - Live 0xa2c11000
test_kmod 41161 0 - Live 0xa29f8000
test_module 1356 0 - Live 0xa299e000
test_printf 6024 0 [permanent], Live 0xa2971000
test_static_key_base 5797 1 test_static_keys, Live 0xa2931000
test_user_copy 4382 0 - Live 0xa28c9000
xxhash 70501 2 zstd_compress,zstd_decompress, Live 0xa2055000

Signed-off-by: Zong Li &lt;zong@andestech.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@sifive.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RISC-V: Add section of GOT.PLT for kernel module</title>
<updated>2018-04-03T03:00:54+00:00</updated>
<author>
<name>Zong Li</name>
<email>zong@andestech.com</email>
</author>
<published>2018-03-15T08:50:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b8bde0ef12bd43f013d879973a1900930bfb95ee'/>
<id>b8bde0ef12bd43f013d879973a1900930bfb95ee</id>
<content type='text'>
Separate the function symbol address from .plt to .got.plt section.

The original plt entry has trampoline code with symbol address,
there is a 32-bit padding bwtween jar instruction and symbol address.

Extract the symbol address to .got.plt to reduce the module size.

Signed-off-by: Zong Li &lt;zong@andestech.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@sifive.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Separate the function symbol address from .plt to .got.plt section.

The original plt entry has trampoline code with symbol address,
there is a 32-bit padding bwtween jar instruction and symbol address.

Extract the symbol address to .got.plt to reduce the module size.

Signed-off-by: Zong Li &lt;zong@andestech.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@sifive.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RISC-V: Add sections of PLT and GOT for kernel module</title>
<updated>2018-04-03T03:00:54+00:00</updated>
<author>
<name>Zong Li</name>
<email>zong@andestech.com</email>
</author>
<published>2018-03-15T08:50:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ab1ef68e54019937cf859f2c86c9ead6f3e62f19'/>
<id>ab1ef68e54019937cf859f2c86c9ead6f3e62f19</id>
<content type='text'>
The address of external symbols will locate more than 32-bit offset
in 64-bit kernel with sv39 or sv48 virtual addressing.

Module loader emits the GOT and PLT entries for data symbols and
function symbols respectively.

The PLT entry is a trampoline code for jumping to the 64-bit
real address. The GOT entry is just the data symbol address.

Signed-off-by: Zong Li &lt;zong@andestech.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@sifive.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The address of external symbols will locate more than 32-bit offset
in 64-bit kernel with sv39 or sv48 virtual addressing.

Module loader emits the GOT and PLT entries for data symbols and
function symbols respectively.

The PLT entry is a trampoline code for jumping to the 64-bit
real address. The GOT entry is just the data symbol address.

Signed-off-by: Zong Li &lt;zong@andestech.com&gt;
Signed-off-by: Palmer Dabbelt &lt;palmer@sifive.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
