<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/powerpc/kernel/misc_32.S, branch v6.14</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>powerpc/32: Replace mulhdu() by mul_u64_u64_shr()</title>
<updated>2024-12-10T02:45:30+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2024-12-07T10:09:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2a17a5bebc9aa7f59e99676350866adc41577c03'/>
<id>2a17a5bebc9aa7f59e99676350866adc41577c03</id>
<content type='text'>
Using mul_u64_u64_shr() provides similar calculation as mulhdu()
assembly function, but enables inlining by the compiler.

The home-made assembly function had special handling for when one of
the arguments is not a fully populated u64 but time functions use it
to multiply timebase by a calculated scale which is constructed to
have most significant bit set.

On mpc8xx sched_clock() runs 3% faster. On mpc83xx it is 2%.

As you can see below, sched_clock() is not much bigger than before:

	c000cf68 &lt;sched_clock&gt;:
	c000cf68:	7d 2d 42 a6 	mftbu   r9
	c000cf6c:	7d 0c 42 a6 	mftb    r8
	c000cf70:	7d 4d 42 a6 	mftbu   r10
	c000cf74:	7c 09 50 40 	cmplw   r9,r10
	c000cf78:	40 82 ff f0 	bne     c000cf68 &lt;sched_clock&gt;
	c000cf7c:	3d 40 c1 37 	lis     r10,-16073
	c000cf80:	38 8a b3 30 	addi    r4,r10,-19664
	c000cf84:	80 ea b3 30 	lwz     r7,-19664(r10)
	c000cf88:	80 64 00 14 	lwz     r3,20(r4)
	c000cf8c:	39 40 00 00 	li      r10,0
	c000cf90:	80 a4 00 04 	lwz     r5,4(r4)
	c000cf94:	80 c4 00 10 	lwz     r6,16(r4)
	c000cf98:	7c 63 40 10 	subfc   r3,r3,r8
	c000cf9c:	80 84 00 08 	lwz     r4,8(r4)
	c000cfa0:	7d 06 49 10 	subfe   r8,r6,r9
	c000cfa4:	7c c7 19 d6 	mullw   r6,r7,r3
	c000cfa8:	7d 25 18 16 	mulhwu  r9,r5,r3
	c000cfac:	7c 08 29 d6 	mullw   r0,r8,r5
	c000cfb0:	7c 67 18 16 	mulhwu  r3,r7,r3
	c000cfb4:	7d 29 30 14 	addc    r9,r9,r6
	c000cfb8:	7c a8 28 16 	mulhwu  r5,r8,r5
	c000cfbc:	7c ca 51 14 	adde    r6,r10,r10
	c000cfc0:	7d 67 41 d6 	mullw   r11,r7,r8
	c000cfc4:	7d 29 00 14 	addc    r9,r9,r0
	c000cfc8:	7c c6 01 94 	addze   r6,r6
	c000cfcc:	7c 63 28 14 	addc    r3,r3,r5
	c000cfd0:	7d 4a 51 14 	adde    r10,r10,r10
	c000cfd4:	7c e7 40 16 	mulhwu  r7,r7,r8
	c000cfd8:	7c 63 58 14 	addc    r3,r3,r11
	c000cfdc:	7d 4a 01 94 	addze   r10,r10
	c000cfe0:	7c 63 30 14 	addc    r3,r3,r6
	c000cfe4:	7d 4a 39 14 	adde    r10,r10,r7
	c000cfe8:	35 24 ff e0 	addic.  r9,r4,-32
	c000cfec:	41 80 00 10 	blt     c000cffc &lt;sched_clock+0x94&gt;
	c000cff0:	7c 63 48 30 	slw     r3,r3,r9
	c000cff4:	38 80 00 00 	li      r4,0
	c000cff8:	4e 80 00 20 	blr
	c000cffc:	21 04 00 1f 	subfic  r8,r4,31
	c000d000:	54 69 f8 7e 	srwi    r9,r3,1
	c000d004:	7d 4a 20 30 	slw     r10,r10,r4
	c000d008:	7d 29 44 30 	srw     r9,r9,r8
	c000d00c:	7c 64 20 30 	slw     r4,r3,r4
	c000d010:	7d 23 53 78 	or      r3,r9,r10
	c000d014:	4e 80 00 20 	blr

Before this change:

	c000d0bc &lt;sched_clock&gt;:
	c000d0bc:	94 21 ff f0 	stwu    r1,-16(r1)
	c000d0c0:	7c 08 02 a6 	mflr    r0
	c000d0c4:	90 01 00 14 	stw     r0,20(r1)
	c000d0c8:	93 e1 00 0c 	stw     r31,12(r1)
	c000d0cc:	7d 2d 42 a6 	mftbu   r9
	c000d0d0:	7d 0c 42 a6 	mftb    r8
	c000d0d4:	7d 4d 42 a6 	mftbu   r10
	c000d0d8:	7c 09 50 40 	cmplw   r9,r10
	c000d0dc:	40 82 ff f0 	bne     c000d0cc &lt;sched_clock+0x10&gt;
	c000d0e0:	3f e0 c1 37 	lis     r31,-16073
	c000d0e4:	3b ff b3 30 	addi    r31,r31,-19664
	c000d0e8:	80 9f 00 14 	lwz     r4,20(r31)
	c000d0ec:	80 7f 00 10 	lwz     r3,16(r31)
	c000d0f0:	7c 84 40 10 	subfc   r4,r4,r8
	c000d0f4:	80 bf 00 00 	lwz     r5,0(r31)
	c000d0f8:	80 df 00 04 	lwz     r6,4(r31)
	c000d0fc:	7c 63 49 10 	subfe   r3,r3,r9
	c000d100:	48 00 37 85 	bl      c0010884 &lt;mulhdu&gt;
	c000d104:	81 3f 00 08 	lwz     r9,8(r31)
	c000d108:	35 49 ff e0 	addic.  r10,r9,-32
	c000d10c:	41 80 00 20 	blt     c000d12c &lt;sched_clock+0x70&gt;
	c000d110:	80 01 00 14 	lwz     r0,20(r1)
	c000d114:	7c 83 50 30 	slw     r3,r4,r10
	c000d118:	83 e1 00 0c 	lwz     r31,12(r1)
	c000d11c:	38 80 00 00 	li      r4,0
	c000d120:	7c 08 03 a6 	mtlr    r0
	c000d124:	38 21 00 10 	addi    r1,r1,16
	c000d128:	4e 80 00 20 	blr
	c000d12c:	80 01 00 14 	lwz     r0,20(r1)
	c000d130:	54 8a f8 7e 	srwi    r10,r4,1
	c000d134:	21 09 00 1f 	subfic  r8,r9,31
	c000d138:	83 e1 00 0c 	lwz     r31,12(r1)
	c000d13c:	7c 63 48 30 	slw     r3,r3,r9
	c000d140:	7d 4a 44 30 	srw     r10,r10,r8
	c000d144:	7c 84 48 30 	slw     r4,r4,r9
	c000d148:	7d 43 1b 78 	or      r3,r10,r3
	c000d14c:	7c 08 03 a6 	mtlr    r0
	c000d150:	38 21 00 10 	addi    r1,r1,16
	c000d154:	4e 80 00 20 	blr

	c0010884 &lt;mulhdu&gt;:
	c0010884:	2c 06 00 00 	cmpwi   r6,0
	c0010888:	2c 83 00 00 	cmpwi   cr1,r3,0
	c001088c:	7c 8a 23 78 	mr      r10,r4
	c0010890:	7c 84 28 16 	mulhwu  r4,r4,r5
	c0010894:	41 82 00 14 	beq     c00108a8 &lt;mulhdu+0x24&gt;
	c0010898:	7c 0a 30 16 	mulhwu  r0,r10,r6
	c001089c:	7c ea 29 d6 	mullw   r7,r10,r5
	c00108a0:	7c e0 38 14 	addc    r7,r0,r7
	c00108a4:	7c 84 01 94 	addze   r4,r4
	c00108a8:	4d 86 00 20 	beqlr   cr1
	c00108ac:	7d 23 29 d6 	mullw   r9,r3,r5
	c00108b0:	7d 43 28 16 	mulhwu  r10,r3,r5
	c00108b4:	41 82 00 18 	beq     c00108cc &lt;mulhdu+0x48&gt;
	c00108b8:	7c 03 31 d6 	mullw   r0,r3,r6
	c00108bc:	7d 03 30 16 	mulhwu  r8,r3,r6
	c00108c0:	7c e0 38 14 	addc    r7,r0,r7
	c00108c4:	7c 84 41 14 	adde    r4,r4,r8
	c00108c8:	7d 4a 01 94 	addze   r10,r10
	c00108cc:	7c 84 48 14 	addc    r4,r4,r9
	c00108d0:	7c 6a 01 94 	addze   r3,r10
	c00108d4:	4e 80 00 20 	blr

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/f29e473c193c87bdbd36b209dfdee99d2f0c60dc.1733566130.git.christophe.leroy@csgroup.eu

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using mul_u64_u64_shr() provides similar calculation as mulhdu()
assembly function, but enables inlining by the compiler.

The home-made assembly function had special handling for when one of
the arguments is not a fully populated u64 but time functions use it
to multiply timebase by a calculated scale which is constructed to
have most significant bit set.

On mpc8xx sched_clock() runs 3% faster. On mpc83xx it is 2%.

As you can see below, sched_clock() is not much bigger than before:

	c000cf68 &lt;sched_clock&gt;:
	c000cf68:	7d 2d 42 a6 	mftbu   r9
	c000cf6c:	7d 0c 42 a6 	mftb    r8
	c000cf70:	7d 4d 42 a6 	mftbu   r10
	c000cf74:	7c 09 50 40 	cmplw   r9,r10
	c000cf78:	40 82 ff f0 	bne     c000cf68 &lt;sched_clock&gt;
	c000cf7c:	3d 40 c1 37 	lis     r10,-16073
	c000cf80:	38 8a b3 30 	addi    r4,r10,-19664
	c000cf84:	80 ea b3 30 	lwz     r7,-19664(r10)
	c000cf88:	80 64 00 14 	lwz     r3,20(r4)
	c000cf8c:	39 40 00 00 	li      r10,0
	c000cf90:	80 a4 00 04 	lwz     r5,4(r4)
	c000cf94:	80 c4 00 10 	lwz     r6,16(r4)
	c000cf98:	7c 63 40 10 	subfc   r3,r3,r8
	c000cf9c:	80 84 00 08 	lwz     r4,8(r4)
	c000cfa0:	7d 06 49 10 	subfe   r8,r6,r9
	c000cfa4:	7c c7 19 d6 	mullw   r6,r7,r3
	c000cfa8:	7d 25 18 16 	mulhwu  r9,r5,r3
	c000cfac:	7c 08 29 d6 	mullw   r0,r8,r5
	c000cfb0:	7c 67 18 16 	mulhwu  r3,r7,r3
	c000cfb4:	7d 29 30 14 	addc    r9,r9,r6
	c000cfb8:	7c a8 28 16 	mulhwu  r5,r8,r5
	c000cfbc:	7c ca 51 14 	adde    r6,r10,r10
	c000cfc0:	7d 67 41 d6 	mullw   r11,r7,r8
	c000cfc4:	7d 29 00 14 	addc    r9,r9,r0
	c000cfc8:	7c c6 01 94 	addze   r6,r6
	c000cfcc:	7c 63 28 14 	addc    r3,r3,r5
	c000cfd0:	7d 4a 51 14 	adde    r10,r10,r10
	c000cfd4:	7c e7 40 16 	mulhwu  r7,r7,r8
	c000cfd8:	7c 63 58 14 	addc    r3,r3,r11
	c000cfdc:	7d 4a 01 94 	addze   r10,r10
	c000cfe0:	7c 63 30 14 	addc    r3,r3,r6
	c000cfe4:	7d 4a 39 14 	adde    r10,r10,r7
	c000cfe8:	35 24 ff e0 	addic.  r9,r4,-32
	c000cfec:	41 80 00 10 	blt     c000cffc &lt;sched_clock+0x94&gt;
	c000cff0:	7c 63 48 30 	slw     r3,r3,r9
	c000cff4:	38 80 00 00 	li      r4,0
	c000cff8:	4e 80 00 20 	blr
	c000cffc:	21 04 00 1f 	subfic  r8,r4,31
	c000d000:	54 69 f8 7e 	srwi    r9,r3,1
	c000d004:	7d 4a 20 30 	slw     r10,r10,r4
	c000d008:	7d 29 44 30 	srw     r9,r9,r8
	c000d00c:	7c 64 20 30 	slw     r4,r3,r4
	c000d010:	7d 23 53 78 	or      r3,r9,r10
	c000d014:	4e 80 00 20 	blr

Before this change:

	c000d0bc &lt;sched_clock&gt;:
	c000d0bc:	94 21 ff f0 	stwu    r1,-16(r1)
	c000d0c0:	7c 08 02 a6 	mflr    r0
	c000d0c4:	90 01 00 14 	stw     r0,20(r1)
	c000d0c8:	93 e1 00 0c 	stw     r31,12(r1)
	c000d0cc:	7d 2d 42 a6 	mftbu   r9
	c000d0d0:	7d 0c 42 a6 	mftb    r8
	c000d0d4:	7d 4d 42 a6 	mftbu   r10
	c000d0d8:	7c 09 50 40 	cmplw   r9,r10
	c000d0dc:	40 82 ff f0 	bne     c000d0cc &lt;sched_clock+0x10&gt;
	c000d0e0:	3f e0 c1 37 	lis     r31,-16073
	c000d0e4:	3b ff b3 30 	addi    r31,r31,-19664
	c000d0e8:	80 9f 00 14 	lwz     r4,20(r31)
	c000d0ec:	80 7f 00 10 	lwz     r3,16(r31)
	c000d0f0:	7c 84 40 10 	subfc   r4,r4,r8
	c000d0f4:	80 bf 00 00 	lwz     r5,0(r31)
	c000d0f8:	80 df 00 04 	lwz     r6,4(r31)
	c000d0fc:	7c 63 49 10 	subfe   r3,r3,r9
	c000d100:	48 00 37 85 	bl      c0010884 &lt;mulhdu&gt;
	c000d104:	81 3f 00 08 	lwz     r9,8(r31)
	c000d108:	35 49 ff e0 	addic.  r10,r9,-32
	c000d10c:	41 80 00 20 	blt     c000d12c &lt;sched_clock+0x70&gt;
	c000d110:	80 01 00 14 	lwz     r0,20(r1)
	c000d114:	7c 83 50 30 	slw     r3,r4,r10
	c000d118:	83 e1 00 0c 	lwz     r31,12(r1)
	c000d11c:	38 80 00 00 	li      r4,0
	c000d120:	7c 08 03 a6 	mtlr    r0
	c000d124:	38 21 00 10 	addi    r1,r1,16
	c000d128:	4e 80 00 20 	blr
	c000d12c:	80 01 00 14 	lwz     r0,20(r1)
	c000d130:	54 8a f8 7e 	srwi    r10,r4,1
	c000d134:	21 09 00 1f 	subfic  r8,r9,31
	c000d138:	83 e1 00 0c 	lwz     r31,12(r1)
	c000d13c:	7c 63 48 30 	slw     r3,r3,r9
	c000d140:	7d 4a 44 30 	srw     r10,r10,r8
	c000d144:	7c 84 48 30 	slw     r4,r4,r9
	c000d148:	7d 43 1b 78 	or      r3,r10,r3
	c000d14c:	7c 08 03 a6 	mtlr    r0
	c000d150:	38 21 00 10 	addi    r1,r1,16
	c000d154:	4e 80 00 20 	blr

	c0010884 &lt;mulhdu&gt;:
	c0010884:	2c 06 00 00 	cmpwi   r6,0
	c0010888:	2c 83 00 00 	cmpwi   cr1,r3,0
	c001088c:	7c 8a 23 78 	mr      r10,r4
	c0010890:	7c 84 28 16 	mulhwu  r4,r4,r5
	c0010894:	41 82 00 14 	beq     c00108a8 &lt;mulhdu+0x24&gt;
	c0010898:	7c 0a 30 16 	mulhwu  r0,r10,r6
	c001089c:	7c ea 29 d6 	mullw   r7,r10,r5
	c00108a0:	7c e0 38 14 	addc    r7,r0,r7
	c00108a4:	7c 84 01 94 	addze   r4,r4
	c00108a8:	4d 86 00 20 	beqlr   cr1
	c00108ac:	7d 23 29 d6 	mullw   r9,r3,r5
	c00108b0:	7d 43 28 16 	mulhwu  r10,r3,r5
	c00108b4:	41 82 00 18 	beq     c00108cc &lt;mulhdu+0x48&gt;
	c00108b8:	7c 03 31 d6 	mullw   r0,r3,r6
	c00108bc:	7d 03 30 16 	mulhwu  r8,r3,r6
	c00108c0:	7c e0 38 14 	addc    r7,r0,r7
	c00108c4:	7c 84 41 14 	adde    r4,r4,r8
	c00108c8:	7d 4a 01 94 	addze   r10,r10
	c00108cc:	7c 84 48 14 	addc    r4,r4,r9
	c00108d0:	7c 6a 01 94 	addze   r3,r10
	c00108d4:	4e 80 00 20 	blr

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Madhavan Srinivasan &lt;maddy@linux.ibm.com&gt;
Link: https://patch.msgid.link/f29e473c193c87bdbd36b209dfdee99d2f0c60dc.1733566130.git.christophe.leroy@csgroup.eu

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Remove core support for 40x</title>
<updated>2024-06-28T12:28:47+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2024-06-28T12:11:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=732b32daef80567a7ef5be3d87ae79b6bfd9d82d'/>
<id>732b32daef80567a7ef5be3d87ae79b6bfd9d82d</id>
<content type='text'>
Now that 40x platforms have gone, remove support
for 40x in the core of powerpc arch.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20240628121201.130802-4-mpe@ellerman.id.au

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that 40x platforms have gone, remove support
for 40x in the core of powerpc arch.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20240628121201.130802-4-mpe@ellerman.id.au

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: replace #include &lt;asm/export.h&gt; with #include &lt;linux/export.h&gt;</title>
<updated>2023-08-16T13:54:48+00:00</updated>
<author>
<name>Masahiro Yamada</name>
<email>masahiroy@kernel.org</email>
</author>
<published>2023-08-06T15:09:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=393261828740c3ed95fc810c3f4c1018b86af7e5'/>
<id>393261828740c3ed95fc810c3f4c1018b86af7e5</id>
<content type='text'>
Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated &lt;asm/export.h&gt;, which is now a wrapper of &lt;linux/export.h&gt;.

Replace #include &lt;asm/export.h&gt; with #include &lt;linux/export.h&gt;.

After all the &lt;asm/export.h&gt; lines are converted, &lt;asm/export.h&gt; and
&lt;asm-generic/export.h&gt; will be removed.

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated &lt;asm/export.h&gt;, which is now a wrapper of &lt;linux/export.h&gt;.

Replace #include &lt;asm/export.h&gt; with #include &lt;linux/export.h&gt;.

After all the &lt;asm/export.h&gt; lines are converted, &lt;asm/export.h&gt; and
&lt;asm-generic/export.h&gt; will be removed.

Signed-off-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: allow minimum sized kernel stack frames</title>
<updated>2022-12-02T06:54:09+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2022-11-27T12:49:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=90f1b43196c5e79f6c986a359011a19857984c27'/>
<id>90f1b43196c5e79f6c986a359011a19857984c27</id>
<content type='text'>
This affects only 64-bit ELFv2 kernels, and reduces the minimum
asm-created stack frame size from 112 to 32 byte on those kernels.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20221127124942.1665522-16-npiggin@gmail.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This affects only 64-bit ELFv2 kernels, and reduces the minimum
asm-created stack frame size from 112 to 32 byte on those kernels.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20221127124942.1665522-16-npiggin@gmail.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Avoid link stack corruption in misc asm functions</title>
<updated>2021-08-25T03:35:47+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2021-08-24T07:56:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=33e1402435cb9f3021439a15935ea2dc69ec1844'/>
<id>33e1402435cb9f3021439a15935ea2dc69ec1844</id>
<content type='text'>
bl;mflr is used at several places to get code position.

Use bcl 20,31,+4 instead of bl in order to preserve link stack.

See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
in __get_datapage()") for details.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/c6eabb4fb6c156f75d56dcbcc6f243e5ac0fba42.1629791763.git.christophe.leroy@csgroup.eu

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bl;mflr is used at several places to get code position.

Use bcl 20,31,+4 instead of bl in order to preserve link stack.

See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
in __get_datapage()") for details.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/c6eabb4fb6c156f75d56dcbcc6f243e5ac0fba42.1629791763.git.christophe.leroy@csgroup.eu

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto</title>
<updated>2021-08-15T03:49:24+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2021-04-13T16:38:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1e688dd2a3d6759d416616ff07afc4bb836c4213'/>
<id>1e688dd2a3d6759d416616ff07afc4bb836c4213</id>
<content type='text'>
Using asm goto in __WARN_FLAGS() and WARN_ON() allows more
flexibility to GCC.

For that add an entry to the exception table so that
program_check_exception() knowns where to resume execution
after a WARNING.

Here are two exemples. The first one is done on PPC32 (which
benefits from the previous patch), the second is on PPC64.

	unsigned long test(struct pt_regs *regs)
	{
		int ret;

		WARN_ON(regs-&gt;msr &amp; MSR_PR);

		return regs-&gt;gpr[3];
	}

	unsigned long test9w(unsigned long a, unsigned long b)
	{
		if (WARN_ON(!b))
			return 0;
		return a / b;
	}

Before the patch:

	000003a8 &lt;test&gt;:
	 3a8:	81 23 00 84 	lwz     r9,132(r3)
	 3ac:	71 29 40 00 	andi.   r9,r9,16384
	 3b0:	40 82 00 0c 	bne     3bc &lt;test+0x14&gt;
	 3b4:	80 63 00 0c 	lwz     r3,12(r3)
	 3b8:	4e 80 00 20 	blr

	 3bc:	0f e0 00 00 	twui    r0,0
	 3c0:	80 63 00 0c 	lwz     r3,12(r3)
	 3c4:	4e 80 00 20 	blr

	0000000000000bf0 &lt;.test9w&gt;:
	 bf0:	7c 89 00 74 	cntlzd  r9,r4
	 bf4:	79 29 d1 82 	rldicl  r9,r9,58,6
	 bf8:	0b 09 00 00 	tdnei   r9,0
	 bfc:	2c 24 00 00 	cmpdi   r4,0
	 c00:	41 82 00 0c 	beq     c0c &lt;.test9w+0x1c&gt;
	 c04:	7c 63 23 92 	divdu   r3,r3,r4
	 c08:	4e 80 00 20 	blr

	 c0c:	38 60 00 00 	li      r3,0
	 c10:	4e 80 00 20 	blr

After the patch:

	000003a8 &lt;test&gt;:
	 3a8:	81 23 00 84 	lwz     r9,132(r3)
	 3ac:	71 29 40 00 	andi.   r9,r9,16384
	 3b0:	40 82 00 0c 	bne     3bc &lt;test+0x14&gt;
	 3b4:	80 63 00 0c 	lwz     r3,12(r3)
	 3b8:	4e 80 00 20 	blr

	 3bc:	0f e0 00 00 	twui    r0,0

	0000000000000c50 &lt;.test9w&gt;:
	 c50:	7c 89 00 74 	cntlzd  r9,r4
	 c54:	79 29 d1 82 	rldicl  r9,r9,58,6
	 c58:	0b 09 00 00 	tdnei   r9,0
	 c5c:	7c 63 23 92 	divdu   r3,r3,r4
	 c60:	4e 80 00 20 	blr

	 c70:	38 60 00 00 	li      r3,0
	 c74:	4e 80 00 20 	blr

In the first exemple, we see GCC doesn't need to duplicate what
happens after the trap.

In the second exemple, we see that GCC doesn't need to emit a test
and a branch in the likely path in addition to the trap.

We've got some WARN_ON() in .softirqentry.text section so it needs
to be added in the OTHER_TEXT_SECTIONS in modpost.c

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/389962b1b702e3c78d169e59bcfac56282889173.1618331882.git.christophe.leroy@csgroup.eu
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using asm goto in __WARN_FLAGS() and WARN_ON() allows more
flexibility to GCC.

For that add an entry to the exception table so that
program_check_exception() knowns where to resume execution
after a WARNING.

Here are two exemples. The first one is done on PPC32 (which
benefits from the previous patch), the second is on PPC64.

	unsigned long test(struct pt_regs *regs)
	{
		int ret;

		WARN_ON(regs-&gt;msr &amp; MSR_PR);

		return regs-&gt;gpr[3];
	}

	unsigned long test9w(unsigned long a, unsigned long b)
	{
		if (WARN_ON(!b))
			return 0;
		return a / b;
	}

Before the patch:

	000003a8 &lt;test&gt;:
	 3a8:	81 23 00 84 	lwz     r9,132(r3)
	 3ac:	71 29 40 00 	andi.   r9,r9,16384
	 3b0:	40 82 00 0c 	bne     3bc &lt;test+0x14&gt;
	 3b4:	80 63 00 0c 	lwz     r3,12(r3)
	 3b8:	4e 80 00 20 	blr

	 3bc:	0f e0 00 00 	twui    r0,0
	 3c0:	80 63 00 0c 	lwz     r3,12(r3)
	 3c4:	4e 80 00 20 	blr

	0000000000000bf0 &lt;.test9w&gt;:
	 bf0:	7c 89 00 74 	cntlzd  r9,r4
	 bf4:	79 29 d1 82 	rldicl  r9,r9,58,6
	 bf8:	0b 09 00 00 	tdnei   r9,0
	 bfc:	2c 24 00 00 	cmpdi   r4,0
	 c00:	41 82 00 0c 	beq     c0c &lt;.test9w+0x1c&gt;
	 c04:	7c 63 23 92 	divdu   r3,r3,r4
	 c08:	4e 80 00 20 	blr

	 c0c:	38 60 00 00 	li      r3,0
	 c10:	4e 80 00 20 	blr

After the patch:

	000003a8 &lt;test&gt;:
	 3a8:	81 23 00 84 	lwz     r9,132(r3)
	 3ac:	71 29 40 00 	andi.   r9,r9,16384
	 3b0:	40 82 00 0c 	bne     3bc &lt;test+0x14&gt;
	 3b4:	80 63 00 0c 	lwz     r3,12(r3)
	 3b8:	4e 80 00 20 	blr

	 3bc:	0f e0 00 00 	twui    r0,0

	0000000000000c50 &lt;.test9w&gt;:
	 c50:	7c 89 00 74 	cntlzd  r9,r4
	 c54:	79 29 d1 82 	rldicl  r9,r9,58,6
	 c58:	0b 09 00 00 	tdnei   r9,0
	 c5c:	7c 63 23 92 	divdu   r3,r3,r4
	 c60:	4e 80 00 20 	blr

	 c70:	38 60 00 00 	li      r3,0
	 c74:	4e 80 00 20 	blr

In the first exemple, we see GCC doesn't need to duplicate what
happens after the trap.

In the second exemple, we see that GCC doesn't need to emit a test
and a branch in the likely path in addition to the trap.

We've got some WARN_ON() in .softirqentry.text section so it needs
to be added in the OTHER_TEXT_SECTIONS in modpost.c

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/389962b1b702e3c78d169e59bcfac56282889173.1618331882.git.christophe.leroy@csgroup.eu
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/32: Remove __main()</title>
<updated>2021-06-16T14:09:10+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2021-06-08T17:22:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4696cfdb1380238dca2bda6199428d7e50c4ea38'/>
<id>4696cfdb1380238dca2bda6199428d7e50c4ea38</id>
<content type='text'>
Comment says that __main() is there to make GCC happy.

It's been there since the implementation of ppc arch in Linux 1.3.45.

ppc32 is the only architecture having that. Even ppc64 doesn't have it.

Seems like GCC is still happy without it.

Drop it for good.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/d01028f8166b98584eec536b52f14c5e3f98ff6b.1623172922.git.christophe.leroy@csgroup.eu

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Comment says that __main() is there to make GCC happy.

It's been there since the implementation of ppc arch in Linux 1.3.45.

ppc32 is the only architecture having that. Even ppc64 doesn't have it.

Seems like GCC is still happy without it.

Drop it for good.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/d01028f8166b98584eec536b52f14c5e3f98ff6b.1623172922.git.christophe.leroy@csgroup.eu

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/irq: Inline call_do_irq() and call_do_softirq()</title>
<updated>2021-03-29T02:22:17+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2021-03-19T06:34:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=48cf12d88969bd4238b8769767eb476970319d93'/>
<id>48cf12d88969bd4238b8769767eb476970319d93</id>
<content type='text'>
call_do_irq() and call_do_softirq() are simple enough to be
worth inlining.

Inlining them avoids an mflr/mtlr pair plus a save/reload on stack.

This is inspired from S390 arch. Several other arches do more or
less the same. The way sparc arch does seems odd thought.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20210320122227.345427-1-mpe@ellerman.id.au
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
call_do_irq() and call_do_softirq() are simple enough to be
worth inlining.

Inlining them avoids an mflr/mtlr pair plus a save/reload on stack.

This is inspired from S390 arch. Several other arches do more or
less the same. The way sparc arch does seems odd thought.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20210320122227.345427-1-mpe@ellerman.id.au
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/32: Remove ksp_limit</title>
<updated>2021-03-29T02:22:05+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2021-03-12T12:50:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5747230645562921b5bc19f6409f7af08fe17c6d'/>
<id>5747230645562921b5bc19f6409f7af08fe17c6d</id>
<content type='text'>
ksp_limit is there to help detect stack overflows.
That is specific to ppc32 as it was removed from ppc64 in
commit cbc9565ee826 ("powerpc: Remove ksp_limit on ppc64").

There are other means for detecting stack overflows.

As ppc64 has proven to not need it, ppc32 should be able to do
without it too.

Lets remove it and simplify exception handling.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/d789c3385b22e07bedc997613c0d26074cb513e7.1615552866.git.christophe.leroy@csgroup.eu
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ksp_limit is there to help detect stack overflows.
That is specific to ppc32 as it was removed from ppc64 in
commit cbc9565ee826 ("powerpc: Remove ksp_limit on ppc64").

There are other means for detecting stack overflows.

As ppc64 has proven to not need it, ppc32 should be able to do
without it too.

Lets remove it and simplify exception handling.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/d789c3385b22e07bedc997613c0d26074cb513e7.1615552866.git.christophe.leroy@csgroup.eu
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Rewrite FSL_BOOKE flush_cache_instruction() in C</title>
<updated>2020-09-02T01:00:21+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@csgroup.eu</email>
</author>
<published>2020-08-14T05:56:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=704dfe931df951895dea98bd1d9cacbb601b6451'/>
<id>704dfe931df951895dea98bd1d9cacbb601b6451</id>
<content type='text'>
Nothing prevents flush_cache_instruction() from being writen in C.

Do it to improve readability and maintainability.

This function is only use by low level callers, it is not
intended to be used by module. Don't export it.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/f989eff8296800c427622c0985384148404e4f0b.1597384512.git.christophe.leroy@csgroup.eu
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Nothing prevents flush_cache_instruction() from being writen in C.

Do it to improve readability and maintainability.

This function is only use by low level callers, it is not
intended to be used by module. Don't export it.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/f989eff8296800c427622c0985384148404e4f0b.1597384512.git.christophe.leroy@csgroup.eu
</pre>
</div>
</content>
</entry>
</feed>
