<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/s390/kernel/fpu.c, branch linux-rolling-stable</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>s390: Explicitly include &lt;linux/export.h&gt;</title>
<updated>2025-06-17T16:18:02+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2025-06-12T11:47:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=65c9a9f92502442157f7eb98e8cd8ad255676330'/>
<id>65c9a9f92502442157f7eb98e8cd8ad255676330</id>
<content type='text'>
Explicitly include &lt;linux/export.h&gt; in files which contain an
EXPORT_SYMBOL().

See commit a934a57a42f6 ("scripts/misc-check: check missing #include
&lt;linux/export.h&gt; when W=1") for more details.

Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Explicitly include &lt;linux/export.h&gt; in files which contain an
EXPORT_SYMBOL().

See commit a934a57a42f6 ("scripts/misc-check: check missing #include
&lt;linux/export.h&gt; when W=1") for more details.

Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: Re-add exception handling in load_fpu_state()</title>
<updated>2024-07-31T14:30:20+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-07-25T09:31:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4734406c39238cbeafe66f0060084caa3247ff53'/>
<id>4734406c39238cbeafe66f0060084caa3247ff53</id>
<content type='text'>
With the recent rewrite of the fpu code exception handling for the
lfpc instruction within load_fpu_state() was erroneously removed.

Add it again to prevent that loading invalid floating point register
values cause an unhandled specification exception.

Fixes: 8c09871a950a ("s390/fpu: limit save and restore to used registers")
Cc: stable@vger.kernel.org
Reported-by: Aristeu Rozanski &lt;aris@redhat.com&gt;
Tested-by: Aristeu Rozanski &lt;aris@redhat.com&gt;
Reviewed-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With the recent rewrite of the fpu code exception handling for the
lfpc instruction within load_fpu_state() was erroneously removed.

Add it again to prevent that loading invalid floating point register
values cause an unhandled specification exception.

Fixes: 8c09871a950a ("s390/fpu: limit save and restore to used registers")
Cc: stable@vger.kernel.org
Reported-by: Aristeu Rozanski &lt;aris@redhat.com&gt;
Tested-by: Aristeu Rozanski &lt;aris@redhat.com&gt;
Reviewed-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: limit save and restore to used registers</title>
<updated>2024-02-16T13:30:16+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8c09871a950a3fe686e0e27fd4193179c5f74f37'/>
<id>8c09871a950a3fe686e0e27fd4193179c5f74f37</id>
<content type='text'>
The first invocation of kernel_fpu_begin() after switching from user to
kernel context will save all vector registers, even if only parts of the
vector registers are used within the kernel fpu context. Given that save
and restore of all vector registers is quite expensive change the current
approach in several ways:

- Instead of saving and restoring all user registers limit this to those
  registers which are actually used within an kernel fpu context.

- On context switch save all remaining user fpu registers, so they can be
  restored when the task is rescheduled.

- Saving user registers within kernel_fpu_begin() is done without disabling
  and enabling interrupts - which also slightly reduces runtime. In worst
  case (e.g. interrupt context uses the same registers) this may lead to
  the situation that registers are saved several times, however the
  assumption is that this will not happen frequently, so that the new
  method is faster in nearly all cases.

- save_user_fpu_regs() can still be called from all contexts and saves all
  (or all remaining) user registers to a tasks ufpu user fpu save area.

Overall this reduces the time required to save and restore the user fpu
context for nearly all cases.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The first invocation of kernel_fpu_begin() after switching from user to
kernel context will save all vector registers, even if only parts of the
vector registers are used within the kernel fpu context. Given that save
and restore of all vector registers is quite expensive change the current
approach in several ways:

- Instead of saving and restoring all user registers limit this to those
  registers which are actually used within an kernel fpu context.

- On context switch save all remaining user fpu registers, so they can be
  restored when the task is rescheduled.

- Saving user registers within kernel_fpu_begin() is done without disabling
  and enabling interrupts - which also slightly reduces runtime. In worst
  case (e.g. interrupt context uses the same registers) this may lead to
  the situation that registers are saved several times, however the
  assumption is that this will not happen frequently, so that the new
  method is faster in nearly all cases.

- save_user_fpu_regs() can still be called from all contexts and saves all
  (or all remaining) user registers to a tasks ufpu user fpu save area.

Overall this reduces the time required to save and restore the user fpu
context for nearly all cases.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: decrease stack usage for some cases</title>
<updated>2024-02-16T13:30:16+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=066c40918bb495de8f2e45bd7eec06737a142712'/>
<id>066c40918bb495de8f2e45bd7eec06737a142712</id>
<content type='text'>
The kernel_fpu structure has a quite large size of 520 bytes. In order to
reduce stack footprint introduce several kernel fpu structures with
different and also smaller sizes. This way every kernel fpu user must use
the correct variant. A compile time check verifies that the correct variant
is used.

There are several users which use only 16 instead of all 32 vector
registers. For those users the new kernel_fpu_16 structure with a size of
only 266 bytes can be used.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel_fpu structure has a quite large size of 520 bytes. In order to
reduce stack footprint introduce several kernel fpu structures with
different and also smaller sizes. This way every kernel fpu user must use
the correct variant. A compile time check verifies that the correct variant
is used.

There are several users which use only 16 instead of all 32 vector
registers. For those users the new kernel_fpu_16 structure with a size of
only 266 bytes can be used.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: remove anonymous union from struct fpu</title>
<updated>2024-02-16T13:30:16+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bdbd3acb33f5b09b99d75b0f0edeb7db98a05c89'/>
<id>bdbd3acb33f5b09b99d75b0f0edeb7db98a05c89</id>
<content type='text'>
The anonymous union within struct fpu contains a floating point register
array and a vector register array. Given that the vector register is always
present remove the floating point register array. For configurations
without vector registers save the floating point register contents within
their corresponding vector register location.

This allows to remove the union, and also to simplify ptrace and perf code.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The anonymous union within struct fpu contains a floating point register
array and a vector register array. Given that the vector register is always
present remove the floating point register array. For configurations
without vector registers save the floating point register contents within
their corresponding vector register location.

This allows to remove the union, and also to simplify ptrace and perf code.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: remove regs member from struct fpu</title>
<updated>2024-02-16T13:30:16+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9cbff7f2214d16af5c10f1f55ac72d4c1a8bd787'/>
<id>9cbff7f2214d16af5c10f1f55ac72d4c1a8bd787</id>
<content type='text'>
KVM was the only user which modified the regs pointer in struct fpu. Remove
the pointer and convert the rest of the core fpu code to directly access
the save area embedded within struct fpu.

Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
KVM was the only user which modified the regs pointer in struct fpu. Remove
the pointer and convert the rest of the core fpu code to directly access
the save area embedded within struct fpu.

Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: change type of fpu mask from u32 to int</title>
<updated>2024-02-16T13:30:15+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c038b984a9af1010555986d6fe32d4da9db9fc3d'/>
<id>c038b984a9af1010555986d6fe32d4da9db9fc3d</id>
<content type='text'>
Change type of fpu mask consistently from u32 to int. This is a
prerequisite to make the kernel fpu usage preemptible. Upcoming code
uses __atomic* ops which work with int pointers.

Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change type of fpu mask consistently from u32 to int. This is a
prerequisite to make the kernel fpu usage preemptible. Upcoming code
uses __atomic* ops which work with int pointers.

Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: rename save_fpu_regs() to save_user_fpu_regs(), etc</title>
<updated>2024-02-16T13:30:15+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=87c5c70036813d26e6e7e4393747a4fdc63cf193'/>
<id>87c5c70036813d26e6e7e4393747a4fdc63cf193</id>
<content type='text'>
Rename save_fpu_regs(), load_fpu_regs(), and struct thread_struct's fpu
member to save_user_fpu_regs(), load_user_fpu_regs(), and ufpu. This way
the function and variable names reflect for which context they are supposed
to be used.

This large and trivial conversion is a prerequisite for making the kernel
fpu usage preemptible.

Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename save_fpu_regs(), load_fpu_regs(), and struct thread_struct's fpu
member to save_user_fpu_regs(), load_user_fpu_regs(), and ufpu. This way
the function and variable names reflect for which context they are supposed
to be used.

This large and trivial conversion is a prerequisite for making the kernel
fpu usage preemptible.

Reviewed-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: convert FPU CIF flag to regular TIF flag</title>
<updated>2024-02-16T13:30:15+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=419abc4d3828813b58d047da146f519eedaa395b'/>
<id>419abc4d3828813b58d047da146f519eedaa395b</id>
<content type='text'>
The FPU state, as represented by the CIF_FPU flag reflects the FPU state of
a task, not the CPU it is running on. Therefore convert the flag to a
regular TIF flag.

This removes the magic in switch_to() where a save_fpu_regs() call for the
currently (previous) running task sets the per-cpu CIF_FPU flag, which is
required to restore FPU register contents of the next task, when it returns
to user space.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The FPU state, as represented by the CIF_FPU flag reflects the FPU state of
a task, not the CPU it is running on. Therefore convert the flag to a
regular TIF flag.

This removes the magic in switch_to() where a save_fpu_regs() call for the
currently (previous) running task sets the per-cpu CIF_FPU flag, which is
required to restore FPU register contents of the next task, when it returns
to user space.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/fpu: convert __kernel_fpu_begin()/__kernel_fpu_end() to C</title>
<updated>2024-02-16T13:30:15+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2024-02-03T10:45:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=918c7cad66509c2170e38a088550fb4a525e0878'/>
<id>918c7cad66509c2170e38a088550fb4a525e0878</id>
<content type='text'>
Convert the rather large __kernel_fpu_begin()/__kernel_fpu_end() inline
assemblies to C. The C variant is much more readable, and this also allows
to get rid of the non-obvious usage of KERNEL_VXR_* constants within the
inline assemblies. E.g. "tmll %[m],6" correlates with the two bits set in
KERNEL_VXR_LOW. If the corresponding defines would be changed, the inline
assembles would break in a subtle way.

Therefore convert to C, use the proper defines, and allow the compiler to
generate code using the (hopefully) most efficient instructions.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert the rather large __kernel_fpu_begin()/__kernel_fpu_end() inline
assemblies to C. The C variant is much more readable, and this also allows
to get rid of the non-obvious usage of KERNEL_VXR_* constants within the
inline assemblies. E.g. "tmll %[m],6" correlates with the two bits set in
KERNEL_VXR_LOW. If the corresponding defines would be changed, the inline
assembles would break in a subtle way.

Therefore convert to C, use the proper defines, and allow the compiler to
generate code using the (hopefully) most efficient instructions.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
