<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/vhost/vhost.c, branch v6.4</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>vhost: support PACKED when setting-getting vring_base</title>
<updated>2023-06-09T16:07:52+00:00</updated>
<author>
<name>Shannon Nelson</name>
<email>shannon.nelson@amd.com</email>
</author>
<published>2023-04-24T22:50:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=55d8122f5cd62d5aaa225d7167dcd14a44c850b9'/>
<id>55d8122f5cd62d5aaa225d7167dcd14a44c850b9</id>
<content type='text'>
Use the right structs for PACKED or split vqs when setting and
getting the vring base.

Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend")
Signed-off-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Message-Id: &lt;20230424225031.18947-3-shannon.nelson@amd.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the right structs for PACKED or split vqs when setting and
getting the vring base.

Fixes: 4c8cf31885f6 ("vhost: introduce vDPA-based backend")
Signed-off-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Message-Id: &lt;20230424225031.18947-3-shannon.nelson@amd.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: Fix worker hangs due to missed wake up calls</title>
<updated>2023-06-08T19:43:09+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-06-07T19:23:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4b13cbef797048fbb525f8c635a5279e9d209d93'/>
<id>4b13cbef797048fbb525f8c635a5279e9d209d93</id>
<content type='text'>
We can race where we have added work to the work_list, but
vhost_task_fn has passed that check but not yet set us into
TASK_INTERRUPTIBLE. wake_up_process will see us in TASK_RUNNING and
just return.

This bug was intoduced in commit f9010dbdce91 ("fork, vhost: Use
CLONE_THREAD to fix freezer/ps regression") when I moved the setting
of TASK_INTERRUPTIBLE to simplfy the code and avoid get_signal from
logging warnings about being in the wrong state. This moves the setting
of TASK_INTERRUPTIBLE back to before we test if we need to stop the
task to avoid a possible race there as well. We then have vhost_worker
set TASK_RUNNING if it finds work similar to before.

Fixes: f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230607192338.6041-3-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We can race where we have added work to the work_list, but
vhost_task_fn has passed that check but not yet set us into
TASK_INTERRUPTIBLE. wake_up_process will see us in TASK_RUNNING and
just return.

This bug was intoduced in commit f9010dbdce91 ("fork, vhost: Use
CLONE_THREAD to fix freezer/ps regression") when I moved the setting
of TASK_INTERRUPTIBLE to simplfy the code and avoid get_signal from
logging warnings about being in the wrong state. This moves the setting
of TASK_INTERRUPTIBLE back to before we test if we need to stop the
task to avoid a possible race there as well. We then have vhost_worker
set TASK_RUNNING if it finds work similar to before.

Fixes: f9010dbdce91 ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230607192338.6041-3-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: Fix crash during early vhost_transport_send_pkt calls</title>
<updated>2023-06-08T19:43:09+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-06-07T19:23:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a284f09effea1908db7985dbfb5458c9100038d8'/>
<id>a284f09effea1908db7985dbfb5458c9100038d8</id>
<content type='text'>
If userspace does VHOST_VSOCK_SET_GUEST_CID before VHOST_SET_OWNER we
can race where:
1. thread0 calls vhost_transport_send_pkt -&gt; vhost_work_queue
2. thread1 does VHOST_SET_OWNER which calls vhost_worker_create.
3. vhost_worker_create will set the dev-&gt;worker pointer before setting
the worker-&gt;vtsk pointer.
4. thread0's vhost_work_queue will see the dev-&gt;worker pointer is
set and try to call vhost_task_wake using not yet set worker-&gt;vtsk
pointer.
5. We then crash since vtsk is NULL.

Before commit 6e890c5d5021 ("vhost: use vhost_tasks for worker
threads"), we only had the worker pointer so we could just check it to
see if VHOST_SET_OWNER has been done. After that commit we have the
vhost_worker and vhost_task pointer, so we can now hit the bug above.

This patch embeds the vhost_worker in the vhost_dev and moves the work
list initialization back to vhost_dev_init, so we can just check the
worker.vtsk pointer to check if VHOST_SET_OWNER has been done like
before.

Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230607192338.6041-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reported-by: syzbot+d0d442c22fa8db45ff0e@syzkaller.appspotmail.com
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If userspace does VHOST_VSOCK_SET_GUEST_CID before VHOST_SET_OWNER we
can race where:
1. thread0 calls vhost_transport_send_pkt -&gt; vhost_work_queue
2. thread1 does VHOST_SET_OWNER which calls vhost_worker_create.
3. vhost_worker_create will set the dev-&gt;worker pointer before setting
the worker-&gt;vtsk pointer.
4. thread0's vhost_work_queue will see the dev-&gt;worker pointer is
set and try to call vhost_task_wake using not yet set worker-&gt;vtsk
pointer.
5. We then crash since vtsk is NULL.

Before commit 6e890c5d5021 ("vhost: use vhost_tasks for worker
threads"), we only had the worker pointer so we could just check it to
see if VHOST_SET_OWNER has been done. After that commit we have the
vhost_worker and vhost_task pointer, so we can now hit the bug above.

This patch embeds the vhost_worker in the vhost_dev and moves the work
list initialization back to vhost_dev_init, so we can just check the
worker.vtsk pointer to check if VHOST_SET_OWNER has been done like
before.

Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230607192338.6041-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reported-by: syzbot+d0d442c22fa8db45ff0e@syzkaller.appspotmail.com
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: use kzalloc() instead of kmalloc() followed by memset()</title>
<updated>2023-06-07T20:22:30+00:00</updated>
<author>
<name>Prathu Baronia</name>
<email>prathubaronia2011@gmail.com</email>
</author>
<published>2023-05-22T08:50:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4d8df0f5f79f747d75a7d356d9b9ea40a4e4c8a9'/>
<id>4d8df0f5f79f747d75a7d356d9b9ea40a4e4c8a9</id>
<content type='text'>
Use kzalloc() to allocate new zeroed out msg node instead of
memsetting a node allocated with kmalloc().

Signed-off-by: Prathu Baronia &lt;prathubaronia2011@gmail.com&gt;
Message-Id: &lt;20230522085019.42914-1-prathubaronia2011@gmail.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use kzalloc() to allocate new zeroed out msg node instead of
memsetting a node allocated with kmalloc().

Signed-off-by: Prathu Baronia &lt;prathubaronia2011@gmail.com&gt;
Message-Id: &lt;20230522085019.42914-1-prathubaronia2011@gmail.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fork, vhost: Use CLONE_THREAD to fix freezer/ps regression</title>
<updated>2023-06-01T21:15:33+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-06-01T18:32:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f9010dbdce911ee1f1af1398a24b1f9f992e0080'/>
<id>f9010dbdce911ee1f1af1398a24b1f9f992e0080</id>
<content type='text'>
When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps or ps a would not incorrectly detect the vhost task as another
process.  2. kthreads disabled freeze by setting PF_NOFREEZE, but
vhost tasks's didn't disable or add support for them.

To fix both bugs, this switches the vhost task to be thread in the
process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
SIGKILL/STOP support is required because CLONE_THREAD requires
CLONE_SIGHAND which requires those 2 signals to be supported.

This is a modified version of the patch written by Mike Christie
&lt;michael.christie@oracle.com&gt; which was a modified version of patch
originally written by Linus.

Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
Including ignoring signals, setting up the register state, and having
get_signal return instead of calling do_group_exit.

Tidied up the vhost_task abstraction so that the definition of
vhost_task only needs to be visible inside of vhost_task.c.  Making
it easier to review the code and tell what needs to be done where.
As part of this the main loop has been moved from vhost_worker into
vhost_task_fn.  vhost_worker now returns true if work was done.

The main loop has been updated to call get_signal which handles
SIGSTOP, freezing, and collects the message that tells the thread to
exit as part of process exit.  This collection clears
__fatal_signal_pending.  This collection is not guaranteed to
clear signal_pending() so clear that explicitly so the schedule()
sleeps.

For now the vhost thread continues to exist and run work until the
last file descriptor is closed and the release function is called as
part of freeing struct file.  To avoid hangs in the coredump
rendezvous and when killing threads in a multi-threaded exec.  The
coredump code and de_thread have been modified to ignore vhost threads.

Remvoing the special case for exec appears to require teaching
vhost_dev_flush how to directly complete transactions in case
the vhost thread is no longer running.

Removing the special case for coredump rendezvous requires either the
above fix needed for exec or moving the coredump rendezvous into
get_signal.

Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Co-developed-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps or ps a would not incorrectly detect the vhost task as another
process.  2. kthreads disabled freeze by setting PF_NOFREEZE, but
vhost tasks's didn't disable or add support for them.

To fix both bugs, this switches the vhost task to be thread in the
process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
SIGKILL/STOP support is required because CLONE_THREAD requires
CLONE_SIGHAND which requires those 2 signals to be supported.

This is a modified version of the patch written by Mike Christie
&lt;michael.christie@oracle.com&gt; which was a modified version of patch
originally written by Linus.

Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
Including ignoring signals, setting up the register state, and having
get_signal return instead of calling do_group_exit.

Tidied up the vhost_task abstraction so that the definition of
vhost_task only needs to be visible inside of vhost_task.c.  Making
it easier to review the code and tell what needs to be done where.
As part of this the main loop has been moved from vhost_worker into
vhost_task_fn.  vhost_worker now returns true if work was done.

The main loop has been updated to call get_signal which handles
SIGSTOP, freezing, and collects the message that tells the thread to
exit as part of process exit.  This collection clears
__fatal_signal_pending.  This collection is not guaranteed to
clear signal_pending() so clear that explicitly so the schedule()
sleeps.

For now the vhost thread continues to exist and run work until the
last file descriptor is closed and the release function is called as
part of freeing struct file.  To avoid hangs in the coredump
rendezvous and when killing threads in a multi-threaded exec.  The
coredump code and de_thread have been modified to ignore vhost threads.

Remvoing the special case for exec appears to require teaching
vhost_dev_flush how to directly complete transactions in case
the vhost thread is no longer running.

Removing the special case for coredump rendezvous requires either the
above fix needed for exec or moving the coredump rendezvous into
get_signal.

Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Co-developed-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sched-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-04-28T21:53:30+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-28T21:53:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=586b222d748e91c619d68e9239654ebc7fed9b0c'/>
<id>586b222d748e91c619d68e9239654ebc7fed9b0c</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:

 - Allow unprivileged PSI poll()ing

 - Fix performance regression introduced by mm_cid

 - Improve livepatch stalls by adding livepatch task switching to
   cond_resched(). This resolves livepatching busy-loop stalls with
   certain CPU-bound kthreads

 - Improve sched_move_task() performance on autogroup configs

 - On core-scheduling CPUs, avoid selecting throttled tasks to run

 - Misc cleanups, fixes and improvements

* tag 'sched-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/clock: Fix local_clock() before sched_clock_init()
  sched/rt: Fix bad task migration for rt tasks
  sched: Fix performance regression introduced by mm_cid
  sched/core: Make sched_dynamic_mutex static
  sched/psi: Allow unprivileged polling of N*2s period
  sched/psi: Extract update_triggers side effect
  sched/psi: Rename existing poll members in preparation
  sched/psi: Rearrange polling code in preparation
  sched/fair: Fix inaccurate tally of ttwu_move_affine
  vhost: Fix livepatch timeouts in vhost_worker()
  livepatch,sched: Add livepatch task switching to cond_resched()
  livepatch: Skip task_call_func() for current task
  livepatch: Convert stack entries array to percpu
  sched: Interleave cfs bandwidth timers for improved single thread performance at low utilization
  sched/core: Reduce cost of sched_move_task when config autogroup
  sched/core: Avoid selecting the task that is throttled to run when core-sched enable
  sched/topology: Make sched_energy_mutex,update static
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull scheduler updates from Ingo Molnar:

 - Allow unprivileged PSI poll()ing

 - Fix performance regression introduced by mm_cid

 - Improve livepatch stalls by adding livepatch task switching to
   cond_resched(). This resolves livepatching busy-loop stalls with
   certain CPU-bound kthreads

 - Improve sched_move_task() performance on autogroup configs

 - On core-scheduling CPUs, avoid selecting throttled tasks to run

 - Misc cleanups, fixes and improvements

* tag 'sched-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/clock: Fix local_clock() before sched_clock_init()
  sched/rt: Fix bad task migration for rt tasks
  sched: Fix performance regression introduced by mm_cid
  sched/core: Make sched_dynamic_mutex static
  sched/psi: Allow unprivileged polling of N*2s period
  sched/psi: Extract update_triggers side effect
  sched/psi: Rename existing poll members in preparation
  sched/psi: Rearrange polling code in preparation
  sched/fair: Fix inaccurate tally of ttwu_move_affine
  vhost: Fix livepatch timeouts in vhost_worker()
  livepatch,sched: Add livepatch task switching to cond_resched()
  livepatch: Skip task_call_func() for current task
  livepatch: Convert stack entries array to percpu
  sched: Interleave cfs bandwidth timers for improved single thread performance at low utilization
  sched/core: Reduce cost of sched_move_task when config autogroup
  sched/core: Avoid selecting the task that is throttled to run when core-sched enable
  sched/topology: Make sched_energy_mutex,update static
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost</title>
<updated>2023-04-28T00:05:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-28T00:05:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8ccd54fe45713cd458015b5b08d6098545e70543'/>
<id>8ccd54fe45713cd458015b5b08d6098545e70543</id>
<content type='text'>
Pull virtio updates from Michael Tsirkin:
 "virtio,vhost,vdpa: features, fixes, and cleanups:

   - reduction in interrupt rate in virtio

   - perf improvement for VDUSE

   - scalability for vhost-scsi

   - non power of 2 ring support for packed rings

   - better management for mlx5 vdpa

   - suspend for snet

   - VIRTIO_F_NOTIFICATION_DATA

   - shared backend with vdpa-sim-blk

   - user VA support in vdpa-sim

   - better struct packing for virtio

  and fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (52 commits)
  vhost_vdpa: fix unmap process in no-batch mode
  MAINTAINERS: make me a reviewer of VIRTIO CORE AND NET DRIVERS
  tools/virtio: fix build caused by virtio_ring changes
  virtio_ring: add a struct device forward declaration
  vdpa_sim_blk: support shared backend
  vdpa_sim: move buffer allocation in the devices
  vdpa/snet: use likely/unlikely macros in hot functions
  vdpa/snet: implement kick_vq_with_data callback
  virtio-vdpa: add VIRTIO_F_NOTIFICATION_DATA feature support
  virtio: add VIRTIO_F_NOTIFICATION_DATA feature support
  vdpa/snet: support the suspend vDPA callback
  vdpa/snet: support getting and setting VQ state
  MAINTAINERS: add vringh.h to Virtio Core and Net Drivers
  vringh: address kdoc warnings
  vdpa: address kdoc warnings
  virtio_ring: don't update event idx on get_buf
  vdpa_sim: add support for user VA
  vdpa_sim: replace the spinlock with a mutex to protect the state
  vdpa_sim: use kthread worker
  vdpa_sim: make devices agnostic for work management
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull virtio updates from Michael Tsirkin:
 "virtio,vhost,vdpa: features, fixes, and cleanups:

   - reduction in interrupt rate in virtio

   - perf improvement for VDUSE

   - scalability for vhost-scsi

   - non power of 2 ring support for packed rings

   - better management for mlx5 vdpa

   - suspend for snet

   - VIRTIO_F_NOTIFICATION_DATA

   - shared backend with vdpa-sim-blk

   - user VA support in vdpa-sim

   - better struct packing for virtio

  and fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (52 commits)
  vhost_vdpa: fix unmap process in no-batch mode
  MAINTAINERS: make me a reviewer of VIRTIO CORE AND NET DRIVERS
  tools/virtio: fix build caused by virtio_ring changes
  virtio_ring: add a struct device forward declaration
  vdpa_sim_blk: support shared backend
  vdpa_sim: move buffer allocation in the devices
  vdpa/snet: use likely/unlikely macros in hot functions
  vdpa/snet: implement kick_vq_with_data callback
  virtio-vdpa: add VIRTIO_F_NOTIFICATION_DATA feature support
  virtio: add VIRTIO_F_NOTIFICATION_DATA feature support
  vdpa/snet: support the suspend vDPA callback
  vdpa/snet: support getting and setting VQ state
  MAINTAINERS: add vringh.h to Virtio Core and Net Drivers
  vringh: address kdoc warnings
  vdpa: address kdoc warnings
  virtio_ring: don't update event idx on get_buf
  vdpa_sim: add support for user VA
  vdpa_sim: replace the spinlock with a mutex to protect the state
  vdpa_sim: use kthread worker
  vdpa_sim: make devices agnostic for work management
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'v6.4/kernel.user_worker' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux</title>
<updated>2023-04-24T19:52:35+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-24T19:52:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3323ddce085cdb33331c2c1bb7a88233023566a9'/>
<id>3323ddce085cdb33331c2c1bb7a88233023566a9</id>
<content type='text'>
Pull user work thread updates from Christian Brauner:
 "This contains the work generalizing the ability to create a kernel
  worker from a userspace process.

  Such user workers will run with the same credentials as the userspace
  process they were created from providing stronger security and
  accounting guarantees than the traditional override_creds() approach
  ever could've hoped for.

  The original work was heavily based and optimzed for the needs of
  io_uring which was the first user. However, as it quickly turned out
  the ability to create user workers inherting properties from a
  userspace process is generally useful.

  The vhost subsystem currently creates workers using the kthread api.
  The consequences of using the kthread api are that RLIMITs don't work
  correctly as they are inherited from khtreadd. This leads to bugs
  where more workers are created than would be allowed by the RLIMITs of
  the userspace process in lieu of which workers are created.

  Problems like this disappear with user workers created from the
  userspace processes for which they perform the work. In addition,
  providing this api allows vhost to remove additional complexity. For
  example, cgroup and mm sharing will just work out of the box with user
  workers based on the relevant userspace process instead of manually
  ensuring the correct cgroup and mm contexts are used.

  So the vhost subsystem should simply be made to use the same mechanism
  as io_uring. To this end the original mechanism used for
  create_io_thread() is generalized into user workers:

   - Introduce PF_USER_WORKER as a generic indicator that a given task
     is a user worker, i.e., a kernel task that was created from a
     userspace process. Now a PF_IO_WORKER thread is just a specialized
     version of PF_USER_WORKER. So io_uring io workers raise both flags.

   - Make copy_process() available to core kernel code

   - Extend struct kernel_clone_args with the following bitfields
     allowing to indicate to copy_process():
       - to create a user worker (raise PF_USER_WORKER)
       - to not inherit any files from the userspace process
       - to ignore signals

  After all generic changes are in place the vhost subsystem implements
  a new dedicated vhost api based on user workers. Finally, vhost is
  switched to rely on the new api moving it off of kthreads.

  Thanks to Mike for sticking it out and making it through this rather
  arduous journey"

* tag 'v6.4/kernel.user_worker' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  vhost: use vhost_tasks for worker threads
  vhost: move worker thread fields to new struct
  vhost_task: Allow vhost layer to use copy_process
  fork: allow kernel code to call copy_process
  fork: Add kernel_clone_args flag to ignore signals
  fork: add kernel_clone_args flag to not dup/clone files
  fork/vm: Move common PF_IO_WORKER behavior to new flag
  kernel: Make io_thread and kthread bit fields
  kthread: Pass in the thread's name during creation
  kernel: Allow a kernel thread's name to be set in copy_process
  csky: Remove kernel_thread declaration
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull user work thread updates from Christian Brauner:
 "This contains the work generalizing the ability to create a kernel
  worker from a userspace process.

  Such user workers will run with the same credentials as the userspace
  process they were created from providing stronger security and
  accounting guarantees than the traditional override_creds() approach
  ever could've hoped for.

  The original work was heavily based and optimzed for the needs of
  io_uring which was the first user. However, as it quickly turned out
  the ability to create user workers inherting properties from a
  userspace process is generally useful.

  The vhost subsystem currently creates workers using the kthread api.
  The consequences of using the kthread api are that RLIMITs don't work
  correctly as they are inherited from khtreadd. This leads to bugs
  where more workers are created than would be allowed by the RLIMITs of
  the userspace process in lieu of which workers are created.

  Problems like this disappear with user workers created from the
  userspace processes for which they perform the work. In addition,
  providing this api allows vhost to remove additional complexity. For
  example, cgroup and mm sharing will just work out of the box with user
  workers based on the relevant userspace process instead of manually
  ensuring the correct cgroup and mm contexts are used.

  So the vhost subsystem should simply be made to use the same mechanism
  as io_uring. To this end the original mechanism used for
  create_io_thread() is generalized into user workers:

   - Introduce PF_USER_WORKER as a generic indicator that a given task
     is a user worker, i.e., a kernel task that was created from a
     userspace process. Now a PF_IO_WORKER thread is just a specialized
     version of PF_USER_WORKER. So io_uring io workers raise both flags.

   - Make copy_process() available to core kernel code

   - Extend struct kernel_clone_args with the following bitfields
     allowing to indicate to copy_process():
       - to create a user worker (raise PF_USER_WORKER)
       - to not inherit any files from the userspace process
       - to ignore signals

  After all generic changes are in place the vhost subsystem implements
  a new dedicated vhost api based on user workers. Finally, vhost is
  switched to rely on the new api moving it off of kthreads.

  Thanks to Mike for sticking it out and making it through this rather
  arduous journey"

* tag 'v6.4/kernel.user_worker' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  vhost: use vhost_tasks for worker threads
  vhost: move worker thread fields to new struct
  vhost_task: Allow vhost layer to use copy_process
  fork: allow kernel code to call copy_process
  fork: Add kernel_clone_args flag to ignore signals
  fork: add kernel_clone_args flag to not dup/clone files
  fork/vm: Move common PF_IO_WORKER behavior to new flag
  kernel: Make io_thread and kthread bit fields
  kthread: Pass in the thread's name during creation
  kernel: Allow a kernel thread's name to be set in copy_process
  csky: Remove kernel_thread declaration
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: use struct_size and size_add to compute flex array sizes</title>
<updated>2023-04-21T07:02:29+00:00</updated>
<author>
<name>Jacob Keller</name>
<email>jacob.e.keller@intel.com</email>
</author>
<published>2023-02-27T21:41:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e4be66e5f36b8cd2a052ba9e2ba063e6c37f5453'/>
<id>e4be66e5f36b8cd2a052ba9e2ba063e6c37f5453</id>
<content type='text'>
The vhost_get_avail_size and vhost_get_used_size functions compute the size
of structures with flexible array members with an additional 2 bytes if the
VIRTIO_RING_F_EVENT_IDX feature flag is set. Convert these functions to use
struct_size() and size_add() instead of coding the calculation by hand.

This ensures that the calculations will saturate at SIZE_MAX rather than
overflowing.

Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Message-Id: &lt;20230227214127.3678392-1-jacob.e.keller@intel.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The vhost_get_avail_size and vhost_get_used_size functions compute the size
of structures with flexible array members with an additional 2 bytes if the
VIRTIO_RING_F_EVENT_IDX feature flag is set. Convert these functions to use
struct_size() and size_add() instead of coding the calculation by hand.

This ensures that the calculations will saturate at SIZE_MAX rather than
overflowing.

Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Message-Id: &lt;20230227214127.3678392-1-jacob.e.keller@intel.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>docs: move x86 documentation into Documentation/arch/</title>
<updated>2023-03-30T18:58:51+00:00</updated>
<author>
<name>Jonathan Corbet</name>
<email>corbet@lwn.net</email>
</author>
<published>2023-03-14T23:06:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ff61f0791ce969d2db6c9f3b71d74ceec0a2e958'/>
<id>ff61f0791ce969d2db6c9f3b71d74ceec0a2e958</id>
<content type='text'>
Move the x86 documentation under Documentation/arch/ as a way of cleaning
up the top-level directory and making the structure of our docs more
closely match the structure of the source directories it describes.

All in-kernel references to the old paths have been updated.

Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the x86 documentation under Documentation/arch/ as a way of cleaning
up the top-level directory and making the structure of our docs more
closely match the structure of the source directories it describes.

All in-kernel references to the old paths have been updated.

Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
