<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/vhost/scsi.c, branch v6.10</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>vhost_scsi: Handle vhost_vq_work_queue failures for TMFs</title>
<updated>2024-05-22T12:31:15+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2024-03-16T00:47:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0352c961cb3542b1c03415259d7b85d99457acff'/>
<id>0352c961cb3542b1c03415259d7b85d99457acff</id>
<content type='text'>
vhost_vq_work_queue will never fail when queueing the TMF's response
handling because a guest can only send us TMFs when the device is fully
setup so there is always a worker at that time. In the next patches we
will modify the worker code so it handles SIGKILL by exiting before
outstanding commands/TMFs have sent their responses. In that case
vhost_vq_work_queue can fail when we try to send a response.

This has us just free the TMF's resources since at this time the guest
won't be able to get a response even if we could send it.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-6-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vhost_vq_work_queue will never fail when queueing the TMF's response
handling because a guest can only send us TMFs when the device is fully
setup so there is always a worker at that time. In the next patches we
will modify the worker code so it handles SIGKILL by exiting before
outstanding commands/TMFs have sent their responses. In that case
vhost_vq_work_queue can fail when we try to send a response.

This has us just free the TMF's resources since at this time the guest
won't be able to get a response even if we could send it.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-6-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost-scsi: Use system wq to flush dev for TMFs</title>
<updated>2024-05-22T12:31:15+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2024-03-16T00:47:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=59b701b99e2d01245b951325a8c3c12faa98d3ea'/>
<id>59b701b99e2d01245b951325a8c3c12faa98d3ea</id>
<content type='text'>
We flush all the workers that are not also used by the ctl vq to make
sure that responses queued by LIO before the TMF response are sent
before the TMF response. This requires a special vhost_vq_flush
function which, in the next patches where we handle SIGKILL killing
workers while in use, will require extra locking/complexity. To avoid
that, this patch has us flush the entire device from the system work
queue, then queue up sending the response from there.

This is a little less optimal since we now flush all workers but this
will be ok since commands have already timed out and perf is not a
concern.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-4-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We flush all the workers that are not also used by the ctl vq to make
sure that responses queued by LIO before the TMF response are sent
before the TMF response. This requires a special vhost_vq_flush
function which, in the next patches where we handle SIGKILL killing
workers while in use, will require extra locking/complexity. To avoid
that, this patch has us flush the entire device from the system work
queue, then queue up sending the response from there.

This is a little less optimal since we now flush all workers but this
will be ok since commands have already timed out and perf is not a
concern.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-4-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost-scsi: Handle vhost_vq_work_queue failures for cmds</title>
<updated>2024-05-22T12:31:15+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2024-03-16T00:47:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1eceddeeb6a9beb7ef2a22c176c03c5ff18c6386'/>
<id>1eceddeeb6a9beb7ef2a22c176c03c5ff18c6386</id>
<content type='text'>
In the next patches we will support the vhost_task being killed while in
use. The problem for vhost-scsi is that we can't free some structs until
we get responses for commands we have submitted to the target layer and
we currently process the responses from the vhost_task.

This has just drop the responses and free the command's resources. When
all commands have completed then operations like flush will be woken up
and we can complete device release and endpoint cleanup.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-3-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the next patches we will support the vhost_task being killed while in
use. The problem for vhost-scsi is that we can't free some structs until
we get responses for commands we have submitted to the target layer and
we currently process the responses from the vhost_task.

This has just drop the responses and free the command's resources. When
all commands have completed then operations like flush will be woken up
and we can complete device release and endpoint cleanup.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-3-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost-scsi: Handle vhost_vq_work_queue failures for events</title>
<updated>2024-05-22T12:31:15+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2024-03-16T00:46:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b1b2ce58ed23c5d56e0ab299a5271ac01f95b75c'/>
<id>b1b2ce58ed23c5d56e0ab299a5271ac01f95b75c</id>
<content type='text'>
Currently, we can try to queue an event's work before the vhost_task is
created. When this happens we just drop it in vhost_scsi_do_plug before
even calling vhost_vq_work_queue. During a device shutdown we do the
same thing after vhost_scsi_clear_endpoint has cleared the backends.

In the next patches we will be able to kill the vhost_task before we
have cleared the endpoint. In that case, vhost_vq_work_queue can fail
and we will leak the event's memory. This has handle the failure by
just freeing the event. This is safe to do, because
vhost_vq_work_queue will only return failure for us when the vhost_task
is killed and so userspace will not be able to handle events if we
sent them.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, we can try to queue an event's work before the vhost_task is
created. When this happens we just drop it in vhost_scsi_do_plug before
even calling vhost_vq_work_queue. During a device shutdown we do the
same thing after vhost_scsi_clear_endpoint has cleared the backends.

In the next patches we will be able to kill the vhost_task before we
have cleared the endpoint. In that case, vhost_vq_work_queue can fail
and we will leak the event's memory. This has handle the failure by
just freeing the event. This is safe to do, because
vhost_vq_work_queue will only return failure for us when the vhost_task
is killed and so userspace will not be able to handle events if we
sent them.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost</title>
<updated>2023-11-05T19:02:32+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-05T19:02:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=77fa2fbe87fc605c4bfa87dff87be9bfded0e9a3'/>
<id>77fa2fbe87fc605c4bfa87dff87be9bfded0e9a3</id>
<content type='text'>
Pull virtio updates from Michael Tsirkin:
 "vhost,virtio,vdpa: features, fixes, cleanups.

  vdpa/mlx5:
   - VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
   - new maintainer

  vdpa:
   - support for vq descriptor mappings
   - decouple reset of iotlb mapping from device reset

  and fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (34 commits)
  vdpa_sim: implement .reset_map support
  vdpa/mlx5: implement .reset_map driver op
  vhost-vdpa: clean iotlb map during reset for older userspace
  vdpa: introduce .compat_reset operation callback
  vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
  vhost-vdpa: reset vendor specific mapping to initial state in .release
  vdpa: introduce .reset_map operation callback
  virtio_pci: add check for common cfg size
  virtio-blk: fix implicit overflow on virtio_max_dma_size
  virtio_pci: add build offset check for the new common cfg items
  virtio: add definition of VIRTIO_F_NOTIF_CONFIG_DATA feature bit
  vduse: make vduse_class constant
  vhost-scsi: Spelling s/preceeding/preceding/g
  virtio: kdoc for struct virtio_pci_modern_device
  vdpa: Update sysfs ABI documentation
  MAINTAINERS: Add myself as mlx5_vdpa driver
  virtio-balloon: correct the comment of virtballoon_migratepage()
  mlx5_vdpa: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
  vdpa/mlx5: Update cvq iotlb mapping on ASID change
  vdpa/mlx5: Make iotlb helper functions more generic
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull virtio updates from Michael Tsirkin:
 "vhost,virtio,vdpa: features, fixes, cleanups.

  vdpa/mlx5:
   - VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
   - new maintainer

  vdpa:
   - support for vq descriptor mappings
   - decouple reset of iotlb mapping from device reset

  and fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (34 commits)
  vdpa_sim: implement .reset_map support
  vdpa/mlx5: implement .reset_map driver op
  vhost-vdpa: clean iotlb map during reset for older userspace
  vdpa: introduce .compat_reset operation callback
  vhost-vdpa: introduce IOTLB_PERSIST backend feature bit
  vhost-vdpa: reset vendor specific mapping to initial state in .release
  vdpa: introduce .reset_map operation callback
  virtio_pci: add check for common cfg size
  virtio-blk: fix implicit overflow on virtio_max_dma_size
  virtio_pci: add build offset check for the new common cfg items
  virtio: add definition of VIRTIO_F_NOTIF_CONFIG_DATA feature bit
  vduse: make vduse_class constant
  vhost-scsi: Spelling s/preceeding/preceding/g
  virtio: kdoc for struct virtio_pci_modern_device
  vdpa: Update sysfs ABI documentation
  MAINTAINERS: Add myself as mlx5_vdpa driver
  virtio-balloon: correct the comment of virtballoon_migratepage()
  mlx5_vdpa: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
  vdpa/mlx5: Update cvq iotlb mapping on ASID change
  vdpa/mlx5: Make iotlb helper functions more generic
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost-scsi: Spelling s/preceeding/preceding/g</title>
<updated>2023-11-01T13:19:59+00:00</updated>
<author>
<name>Geert Uytterhoeven</name>
<email>geert+renesas@glider.be</email>
</author>
<published>2023-09-28T12:18:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ff1f51eefb1cd2c20c3f49538a64c09a1cc98be'/>
<id>5ff1f51eefb1cd2c20c3f49538a64c09a1cc98be</id>
<content type='text'>
Fix a misspelling of "preceding".

Signed-off-by: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Message-Id: &lt;b57b882675809f1f9dacbf42cf6b920b2bea9cba.1695903476.git.geert+renesas@glider.be&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix a misspelling of "preceding".

Signed-off-by: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Message-Id: &lt;b57b882675809f1f9dacbf42cf6b920b2bea9cba.1695903476.git.geert+renesas@glider.be&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>scsi: target: Allow userspace to request direct submissions</title>
<updated>2023-10-13T19:53:58+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-09-28T02:09:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e2f4ea40138e16d1dfd768f2dead8f3f75a85673'/>
<id>e2f4ea40138e16d1dfd768f2dead8f3f75a85673</id>
<content type='text'>
This allows userspace to request the fabric drivers do direct submissions
if they support it. With the new device file, submit_type, users can
write 0 - 2 to control how commands are submitted to the backend:

 0 - TARGET_FABRIC_DEFAULT_SUBMIT - LIO will use the fabric's default
     submission type. This is the default for compat.

 1 - TARGET_DIRECT_SUBMIT - LIO will submit the cmd to the backend from the
     calling context if the fabric the cmd was received on supports it,
     else it will use the fabric's default type.

 2 - TARGET_QUEUE_SUBMIT - LIO will queue the cmd to the LIO submission
     workqueue which will pass it to the backend.

When using an NVMe drive and vhost-scsi with direct submission we see
around a 20% improvement in 4K I/Os:

fio jobs        1       2       4       8       10
--------------------------------------------------
defer           94K     190K    394K    770K    890K
direct          128K    252K    488K    950K    -

And when using the queueing mode, we now no longer see issues like where
the iSCSI tx thread is blocked in the block layer waiting on a tag so it
can't respond to a nop or perform I/Os for other LUs.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Link: https://lore.kernel.org/r/20230928020907.5730-6-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This allows userspace to request the fabric drivers do direct submissions
if they support it. With the new device file, submit_type, users can
write 0 - 2 to control how commands are submitted to the backend:

 0 - TARGET_FABRIC_DEFAULT_SUBMIT - LIO will use the fabric's default
     submission type. This is the default for compat.

 1 - TARGET_DIRECT_SUBMIT - LIO will submit the cmd to the backend from the
     calling context if the fabric the cmd was received on supports it,
     else it will use the fabric's default type.

 2 - TARGET_QUEUE_SUBMIT - LIO will queue the cmd to the LIO submission
     workqueue which will pass it to the backend.

When using an NVMe drive and vhost-scsi with direct submission we see
around a 20% improvement in 4K I/Os:

fio jobs        1       2       4       8       10
--------------------------------------------------
defer           94K     190K    394K    770K    890K
direct          128K    252K    488K    950K    -

And when using the queueing mode, we now no longer see issues like where
the iSCSI tx thread is blocked in the block layer waiting on a tag so it
can't respond to a nop or perform I/Os for other LUs.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Link: https://lore.kernel.org/r/20230928020907.5730-6-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>scsi: target: Have drivers report if they support direct submissions</title>
<updated>2023-10-13T19:53:57+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-09-28T02:09:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=194605d45dcb511983caca699d81855693b25fe0'/>
<id>194605d45dcb511983caca699d81855693b25fe0</id>
<content type='text'>
In some cases, like with multiple LUN targets or where the target has to
respond to transport level requests from the receiving context it can be
better to defer cmd submission to a helper thread. If the backend driver
blocks on something like request/tag allocation it can block the entire
target submission path and other LUs and transport IO on that session.

In other cases like single LUN targets with storage that can support all
the commands that the target can queue, then it's best to submit the cmd
to the backend from the target's cmd receiving context.

Subsequent commits will allow the user to config what they prefer, but
drivers like loop can't directly submit because they can be called from a
context that can't sleep. And, drivers like vhost-scsi can support direct
submission, but need to keep their default behavior of deferring execution
to avoid possible regressions where the backend can block.

Make the drivers tell LIO core if they support direct submissions and their
current default, so we can prevent users from misconfiguring the system and
initialize devices correctly.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Link: https://lore.kernel.org/r/20230928020907.5730-2-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In some cases, like with multiple LUN targets or where the target has to
respond to transport level requests from the receiving context it can be
better to defer cmd submission to a helper thread. If the backend driver
blocks on something like request/tag allocation it can block the entire
target submission path and other LUs and transport IO on that session.

In other cases like single LUN targets with storage that can support all
the commands that the target can queue, then it's best to submit the cmd
to the backend from the target's cmd receiving context.

Subsequent commits will allow the user to config what they prefer, but
drivers like loop can't directly submit because they can be called from a
context that can't sleep. And, drivers like vhost-scsi can support direct
submission, but need to keep their default behavior of deferring execution
to avoid possible regressions where the backend can block.

Make the drivers tell LIO core if they support direct submissions and their
current default, so we can prevent users from misconfiguring the system and
initialize devices correctly.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Link: https://lore.kernel.org/r/20230928020907.5730-2-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost-scsi: Rename vhost_scsi_iov_to_sgl</title>
<updated>2023-08-10T19:24:28+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-07-09T20:28:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c5ace19efb0ac884a9a417e2a1499ce9849bdaa5'/>
<id>c5ace19efb0ac884a9a417e2a1499ce9849bdaa5</id>
<content type='text'>
Rename vhost_scsi_iov_to_sgl to vhost_scsi_map_iov_to_sgl so it matches
matches the naming style used for vhost_scsi_copy_iov_to_sgl.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230709202859.138387-3-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename vhost_scsi_iov_to_sgl to vhost_scsi_map_iov_to_sgl so it matches
matches the naming style used for vhost_scsi_copy_iov_to_sgl.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230709202859.138387-3-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost-scsi: Fix alignment handling with windows</title>
<updated>2023-08-10T19:24:27+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-07-09T20:28:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ced58bfa132c8ba0f9c893eb621595a84cfee12'/>
<id>5ced58bfa132c8ba0f9c893eb621595a84cfee12</id>
<content type='text'>
The linux block layer requires bios/requests to have lengths with a 512
byte alignment. Some drivers/layers like dm-crypt and the directi IO code
will test for it and just fail. Other drivers like SCSI just assume the
requirement is met and will end up in infinte retry loops. The problem
for drivers like SCSI is that it uses functions like blk_rq_cur_sectors
and blk_rq_sectors which divide the request's length by 512. If there's
lefovers then it just gets dropped. But other code in the block/scsi
layer may use blk_rq_bytes/blk_rq_cur_bytes and end up thinking there is
still data left and try to retry the cmd. We can then end up getting
stuck in retry loops where part of the block/scsi thinks there is data
left, but other parts think we want to do IOs of zero length.

Linux will always check for alignment, but windows will not. When
vhost-scsi then translates the iovec it gets from a windows guest to a
scatterlist, we can end up with sg items where the sg-&gt;length is not
divisible by 512 due to the misaligned offset:

sg[0].offset = 255;
sg[0].length = 3841;
sg...
sg[N].offset = 0;
sg[N].length = 255;

When the lio backends then convert the SG to bios or other iovecs, we
end up sending them with the same misaligned values and can hit the
issues above.

This just has us drop down to allocating a temp page and copying the data
when we detect a misaligned buffer and the IO is large enough that it
will get split into multiple bad IOs.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230709202859.138387-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The linux block layer requires bios/requests to have lengths with a 512
byte alignment. Some drivers/layers like dm-crypt and the directi IO code
will test for it and just fail. Other drivers like SCSI just assume the
requirement is met and will end up in infinte retry loops. The problem
for drivers like SCSI is that it uses functions like blk_rq_cur_sectors
and blk_rq_sectors which divide the request's length by 512. If there's
lefovers then it just gets dropped. But other code in the block/scsi
layer may use blk_rq_bytes/blk_rq_cur_bytes and end up thinking there is
still data left and try to retry the cmd. We can then end up getting
stuck in retry loops where part of the block/scsi thinks there is data
left, but other parts think we want to do IOs of zero length.

Linux will always check for alignment, but windows will not. When
vhost-scsi then translates the iovec it gets from a windows guest to a
scatterlist, we can end up with sg items where the sg-&gt;length is not
divisible by 512 due to the misaligned offset:

sg[0].offset = 255;
sg[0].length = 3841;
sg...
sg[N].offset = 0;
sg[N].length = 255;

When the lio backends then convert the SG to bios or other iovecs, we
end up sending them with the same misaligned values and can hit the
issues above.

This just has us drop down to allocating a temp page and copying the data
when we detect a misaligned buffer and the IO is large enough that it
will get split into multiple bad IOs.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230709202859.138387-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
