<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/crypto/virtio, branch linux-6.3.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Merge tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6</title>
<updated>2023-02-22T02:10:50+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-02-22T02:10:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=36289a03bcd3aabdf66de75cb6d1b4ee15726438'/>
<id>36289a03bcd3aabdf66de75cb6d1b4ee15726438</id>
<content type='text'>
Pull crypto update from Herbert Xu:
 "API:
   - Use kmap_local instead of kmap_atomic
   - Change request callback to take void pointer
   - Print FIPS status in /proc/crypto (when enabled)

  Algorithms:
   - Add rfc4106/gcm support on arm64
   - Add ARIA AVX2/512 support on x86

  Drivers:
   - Add TRNG driver for StarFive SoC
   - Delete ux500/hash driver (subsumed by stm32/hash)
   - Add zlib support in qat
   - Add RSA support in aspeed"

* tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (156 commits)
  crypto: x86/aria-avx - Do not use avx2 instructions
  crypto: aspeed - Fix modular aspeed-acry
  crypto: hisilicon/qm - fix coding style issues
  crypto: hisilicon/qm - update comments to match function
  crypto: hisilicon/qm - change function names
  crypto: hisilicon/qm - use min() instead of min_t()
  crypto: hisilicon/qm - remove some unused defines
  crypto: proc - Print fips status
  crypto: crypto4xx - Call dma_unmap_page when done
  crypto: octeontx2 - Fix objects shared between several modules
  crypto: nx - Fix sparse warnings
  crypto: ecc - Silence sparse warning
  tls: Pass rec instead of aead_req into tls_encrypt_done
  crypto: api - Remove completion function scaffolding
  tls: Remove completion function scaffolding
  tipc: Remove completion function scaffolding
  net: ipv6: Remove completion function scaffolding
  net: ipv4: Remove completion function scaffolding
  net: macsec: Remove completion function scaffolding
  dm: Remove completion function scaffolding
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull crypto update from Herbert Xu:
 "API:
   - Use kmap_local instead of kmap_atomic
   - Change request callback to take void pointer
   - Print FIPS status in /proc/crypto (when enabled)

  Algorithms:
   - Add rfc4106/gcm support on arm64
   - Add ARIA AVX2/512 support on x86

  Drivers:
   - Add TRNG driver for StarFive SoC
   - Delete ux500/hash driver (subsumed by stm32/hash)
   - Add zlib support in qat
   - Add RSA support in aspeed"

* tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (156 commits)
  crypto: x86/aria-avx - Do not use avx2 instructions
  crypto: aspeed - Fix modular aspeed-acry
  crypto: hisilicon/qm - fix coding style issues
  crypto: hisilicon/qm - update comments to match function
  crypto: hisilicon/qm - change function names
  crypto: hisilicon/qm - use min() instead of min_t()
  crypto: hisilicon/qm - remove some unused defines
  crypto: proc - Print fips status
  crypto: crypto4xx - Call dma_unmap_page when done
  crypto: octeontx2 - Fix objects shared between several modules
  crypto: nx - Fix sparse warnings
  crypto: ecc - Silence sparse warning
  tls: Pass rec instead of aead_req into tls_encrypt_done
  crypto: api - Remove completion function scaffolding
  tls: Remove completion function scaffolding
  tipc: Remove completion function scaffolding
  net: ipv6: Remove completion function scaffolding
  net: ipv4: Remove completion function scaffolding
  net: macsec: Remove completion function scaffolding
  dm: Remove completion function scaffolding
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: virtio/akcipher - Do not use GFP_ATOMIC when not needed</title>
<updated>2023-02-10T09:20:19+00:00</updated>
<author>
<name>Christophe JAILLET</name>
<email>christophe.jaillet@wanadoo.fr</email>
</author>
<published>2023-02-04T20:54:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4409c08d806721f0be80bf1c6537a983289272ed'/>
<id>4409c08d806721f0be80bf1c6537a983289272ed</id>
<content type='text'>
There is no need to use GFP_ATOMIC here. GFP_KERNEL is already used for
another memory allocation just the line after.

Signed-off-by: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no need to use GFP_ATOMIC here. GFP_KERNEL is already used for
another memory allocation just the line after.

Signed-off-by: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio-crypto: fix memory leak in virtio_crypto_alg_skcipher_close_session()</title>
<updated>2022-12-28T10:28:10+00:00</updated>
<author>
<name>Wei Yongjun</name>
<email>weiyongjun1@huawei.com</email>
</author>
<published>2022-11-14T11:07:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b1d65f717cd6305a396a8738e022c6f7c65cfbe8'/>
<id>b1d65f717cd6305a396a8738e022c6f7c65cfbe8</id>
<content type='text'>
'vc_ctrl_req' is alloced in virtio_crypto_alg_skcipher_close_session(),
and should be freed in the invalid ctrl_status-&gt;status error handling
case. Otherwise there is a memory leak.

Fixes: 0756ad15b1fe ("virtio-crypto: use private buffer for control request")
Signed-off-by: Wei Yongjun &lt;weiyongjun1@huawei.com&gt;
Message-Id: &lt;20221114110740.537276-1-weiyongjun@huaweicloud.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Acked-by: zhenwei pi&lt;pizhenwei@bytedance.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'vc_ctrl_req' is alloced in virtio_crypto_alg_skcipher_close_session(),
and should be freed in the invalid ctrl_status-&gt;status error handling
case. Otherwise there is a memory leak.

Fixes: 0756ad15b1fe ("virtio-crypto: use private buffer for control request")
Signed-off-by: Wei Yongjun &lt;weiyongjun1@huawei.com&gt;
Message-Id: &lt;20221114110740.537276-1-weiyongjun@huaweicloud.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Acked-by: zhenwei pi&lt;pizhenwei@bytedance.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: virtio - Use helper to set reqsize</title>
<updated>2022-12-02T10:12:39+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2022-11-22T09:42:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=93c446cd36a410b31519af7a2dd32e899cc03d06'/>
<id>93c446cd36a410b31519af7a2dd32e899cc03d06</id>
<content type='text'>
The value of reqsize must only be changed through the helper.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Acked-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The value of reqsize must only be changed through the helper.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Acked-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio-crypto: fix memory-leak</title>
<updated>2022-09-27T22:30:48+00:00</updated>
<author>
<name>lei he</name>
<email>helei.sig11@bytedance.com</email>
</author>
<published>2022-09-19T07:51:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1bedcf22c081a6e9943f09937b2da8d3ef52d20d'/>
<id>1bedcf22c081a6e9943f09937b2da8d3ef52d20d</id>
<content type='text'>
Fix memory-leak for virtio-crypto akcipher request, this problem is
introduced by 59ca6c93387d3(virtio-crypto: implement RSA algorithm).
The leak can be reproduced and tested with the following script
inside virtual machine:

#!/bin/bash

LOOP_TIMES=10000

# required module: pkcs8_key_parser, virtio_crypto
modprobe pkcs8_key_parser # if CONFIG_PKCS8_PRIVATE_KEY_PARSER=m
modprobe virtio_crypto # if CONFIG_CRYPTO_DEV_VIRTIO=m
rm -rf /tmp/data
dd if=/dev/random of=/tmp/data count=1 bs=230

# generate private key and self-signed cert
openssl req -nodes -x509 -newkey rsa:2048 -keyout key.pem \
		-outform der -out cert.der  \
		-subj "/C=CN/ST=GD/L=SZ/O=vihoo/OU=dev/CN=always.com/emailAddress=yy@always.com"
# convert private key from pem to der
openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER -out key.der

# add key
PRIV_KEY_ID=`cat key.der | keyctl padd asymmetric test_priv_key @s`
echo "priv key id = "$PRIV_KEY_ID
PUB_KEY_ID=`cat cert.der | keyctl padd asymmetric test_pub_key @s`
echo "pub key id = "$PUB_KEY_ID

# query key
keyctl pkey_query $PRIV_KEY_ID 0
keyctl pkey_query $PUB_KEY_ID 0

# here we only run pkey_encrypt becasuse it is the fastest interface
function bench_pub() {
	keyctl pkey_encrypt $PUB_KEY_ID 0 /tmp/data enc=pkcs1 &gt;/tmp/enc.pub
}

# do bench_pub in loop to obtain the memory leak
for (( i = 0; i &lt; ${LOOP_TIMES}; ++i )); do
	bench_pub
done

Signed-off-by: lei he &lt;helei.sig11@bytedance.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Message-Id: &lt;20220919075158.3625-1-helei.sig11@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix memory-leak for virtio-crypto akcipher request, this problem is
introduced by 59ca6c93387d3(virtio-crypto: implement RSA algorithm).
The leak can be reproduced and tested with the following script
inside virtual machine:

#!/bin/bash

LOOP_TIMES=10000

# required module: pkcs8_key_parser, virtio_crypto
modprobe pkcs8_key_parser # if CONFIG_PKCS8_PRIVATE_KEY_PARSER=m
modprobe virtio_crypto # if CONFIG_CRYPTO_DEV_VIRTIO=m
rm -rf /tmp/data
dd if=/dev/random of=/tmp/data count=1 bs=230

# generate private key and self-signed cert
openssl req -nodes -x509 -newkey rsa:2048 -keyout key.pem \
		-outform der -out cert.der  \
		-subj "/C=CN/ST=GD/L=SZ/O=vihoo/OU=dev/CN=always.com/emailAddress=yy@always.com"
# convert private key from pem to der
openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER -out key.der

# add key
PRIV_KEY_ID=`cat key.der | keyctl padd asymmetric test_priv_key @s`
echo "priv key id = "$PRIV_KEY_ID
PUB_KEY_ID=`cat cert.der | keyctl padd asymmetric test_pub_key @s`
echo "pub key id = "$PUB_KEY_ID

# query key
keyctl pkey_query $PRIV_KEY_ID 0
keyctl pkey_query $PUB_KEY_ID 0

# here we only run pkey_encrypt becasuse it is the fastest interface
function bench_pub() {
	keyctl pkey_encrypt $PUB_KEY_ID 0 /tmp/data enc=pkcs1 &gt;/tmp/enc.pub
}

# do bench_pub in loop to obtain the memory leak
for (( i = 0; i &lt; ${LOOP_TIMES}; ++i )); do
	bench_pub
done

Signed-off-by: lei he &lt;helei.sig11@bytedance.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Message-Id: &lt;20220919075158.3625-1-helei.sig11@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio-crypto: enable retry for virtio-crypto-dev</title>
<updated>2022-05-31T16:45:09+00:00</updated>
<author>
<name>lei he</name>
<email>helei.sig11@bytedance.com</email>
</author>
<published>2022-05-06T13:16:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4e0d352af04cf4e019d3e45229eaaff9e8ffb33d'/>
<id>4e0d352af04cf4e019d3e45229eaaff9e8ffb33d</id>
<content type='text'>
Enable retry for virtio-crypto-dev, so that crypto-engine
can process cipher-requests parallelly.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: lei he &lt;helei.sig11@bytedance.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-6-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Enable retry for virtio-crypto-dev, so that crypto-engine
can process cipher-requests parallelly.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: lei he &lt;helei.sig11@bytedance.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-6-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio-crypto: adjust dst_len at ops callback</title>
<updated>2022-05-31T16:45:09+00:00</updated>
<author>
<name>lei he</name>
<email>helei.sig11@bytedance.com</email>
</author>
<published>2022-05-06T13:16:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a36bd0ad9fbf69d0d711b1c105954ce8d6cc144a'/>
<id>a36bd0ad9fbf69d0d711b1c105954ce8d6cc144a</id>
<content type='text'>
For some akcipher operations(eg, decryption of pkcs1pad(rsa)),
the length of returned result maybe less than akcipher_req-&gt;dst_len,
we need to recalculate the actual dst_len through the virt-queue
protocol.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: lei he &lt;helei.sig11@bytedance.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-5-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For some akcipher operations(eg, decryption of pkcs1pad(rsa)),
the length of returned result maybe less than akcipher_req-&gt;dst_len,
we need to recalculate the actual dst_len through the virt-queue
protocol.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: lei he &lt;helei.sig11@bytedance.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-5-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio-crypto: wait ctrl queue instead of busy polling</title>
<updated>2022-05-31T16:45:09+00:00</updated>
<author>
<name>zhenwei pi</name>
<email>pizhenwei@bytedance.com</email>
</author>
<published>2022-05-06T13:16:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=977231e8d45657871a86fe3c7bed94921d04e447'/>
<id>977231e8d45657871a86fe3c7bed94921d04e447</id>
<content type='text'>
Originally, after submitting request into virtio crypto control
queue, the guest side polls the result from the virt queue. This
works like following:
    CPU0   CPU1               ...             CPUx  CPUy
     |      |                                  |     |
     \      \                                  /     /
      \--------spin_lock(&amp;vcrypto-&gt;ctrl_lock)-------/
                           |
                 virtqueue add &amp; kick
                           |
                  busy poll virtqueue
                           |
              spin_unlock(&amp;vcrypto-&gt;ctrl_lock)
                          ...

There are two problems:
1, The queue depth is always 1, the performance of a virtio crypto
   device gets limited. Multi user processes share a single control
   queue, and hit spin lock race from control queue. Test on Intel
   Platinum 8260, a single worker gets ~35K/s create/close session
   operations, and 8 workers get ~40K/s operations with 800% CPU
   utilization.
2, The control request is supposed to get handled immediately, but
   in the current implementation of QEMU(v6.2), the vCPU thread kicks
   another thread to do this work, the latency also gets unstable.
   Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s:
        usecs               : count     distribution
         0 -&gt; 1          : 0        |                        |
         2 -&gt; 3          : 7        |                        |
         4 -&gt; 7          : 72       |                        |
         8 -&gt; 15         : 186485   |************************|
        16 -&gt; 31         : 687      |                        |
        32 -&gt; 63         : 5        |                        |
        64 -&gt; 127        : 3        |                        |
       128 -&gt; 255        : 1        |                        |
       256 -&gt; 511        : 0        |                        |
       512 -&gt; 1023       : 0        |                        |
      1024 -&gt; 2047       : 0        |                        |
      2048 -&gt; 4095       : 0        |                        |
      4096 -&gt; 8191       : 0        |                        |
      8192 -&gt; 16383      : 2        |                        |
This means that a CPU may hold vcrypto-&gt;ctrl_lock as long as 8192~16383us.

To improve the performance of control queue, a request on control queue
waits completion instead of busy polling to reduce lock racing, and gets
completed by control queue callback.
    CPU0   CPU1               ...             CPUx  CPUy
     |      |                                  |     |
     \      \                                  /     /
      \--------spin_lock(&amp;vcrypto-&gt;ctrl_lock)-------/
                           |
                 virtqueue add &amp; kick
                           |
      ---------spin_unlock(&amp;vcrypto-&gt;ctrl_lock)------
     /      /                                  \     \
     |      |                                  |     |
    wait   wait                               wait  wait

Test this patch, the guest side get ~200K/s operations with 300% CPU
utilization.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-4-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Originally, after submitting request into virtio crypto control
queue, the guest side polls the result from the virt queue. This
works like following:
    CPU0   CPU1               ...             CPUx  CPUy
     |      |                                  |     |
     \      \                                  /     /
      \--------spin_lock(&amp;vcrypto-&gt;ctrl_lock)-------/
                           |
                 virtqueue add &amp; kick
                           |
                  busy poll virtqueue
                           |
              spin_unlock(&amp;vcrypto-&gt;ctrl_lock)
                          ...

There are two problems:
1, The queue depth is always 1, the performance of a virtio crypto
   device gets limited. Multi user processes share a single control
   queue, and hit spin lock race from control queue. Test on Intel
   Platinum 8260, a single worker gets ~35K/s create/close session
   operations, and 8 workers get ~40K/s operations with 800% CPU
   utilization.
2, The control request is supposed to get handled immediately, but
   in the current implementation of QEMU(v6.2), the vCPU thread kicks
   another thread to do this work, the latency also gets unstable.
   Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s:
        usecs               : count     distribution
         0 -&gt; 1          : 0        |                        |
         2 -&gt; 3          : 7        |                        |
         4 -&gt; 7          : 72       |                        |
         8 -&gt; 15         : 186485   |************************|
        16 -&gt; 31         : 687      |                        |
        32 -&gt; 63         : 5        |                        |
        64 -&gt; 127        : 3        |                        |
       128 -&gt; 255        : 1        |                        |
       256 -&gt; 511        : 0        |                        |
       512 -&gt; 1023       : 0        |                        |
      1024 -&gt; 2047       : 0        |                        |
      2048 -&gt; 4095       : 0        |                        |
      4096 -&gt; 8191       : 0        |                        |
      8192 -&gt; 16383      : 2        |                        |
This means that a CPU may hold vcrypto-&gt;ctrl_lock as long as 8192~16383us.

To improve the performance of control queue, a request on control queue
waits completion instead of busy polling to reduce lock racing, and gets
completed by control queue callback.
    CPU0   CPU1               ...             CPUx  CPUy
     |      |                                  |     |
     \      \                                  /     /
      \--------spin_lock(&amp;vcrypto-&gt;ctrl_lock)-------/
                           |
                 virtqueue add &amp; kick
                           |
      ---------spin_unlock(&amp;vcrypto-&gt;ctrl_lock)------
     /      /                                  \     \
     |      |                                  |     |
    wait   wait                               wait  wait

Test this patch, the guest side get ~200K/s operations with 300% CPU
utilization.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-4-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio-crypto: use private buffer for control request</title>
<updated>2022-05-31T16:45:09+00:00</updated>
<author>
<name>zhenwei pi</name>
<email>pizhenwei@bytedance.com</email>
</author>
<published>2022-05-06T13:16:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0756ad15b1fef287d4d8fa11bc36ea77a5c42e4a'/>
<id>0756ad15b1fef287d4d8fa11bc36ea77a5c42e4a</id>
<content type='text'>
Originally, all of the control requests share a single buffer(
ctrl &amp; input &amp; ctrl_status fields in struct virtio_crypto), this
allows queue depth 1 only, the performance of control queue gets
limited by this design.

In this patch, each request allocates request buffer dynamically, and
free buffer after request, so the scope protected by ctrl_lock also
get optimized here.
It's possible to optimize control queue depth in the next step.

A necessary comment is already in code, still describe it again:
/*
 * Note: there are padding fields in request, clear them to zero before
 * sending to host to avoid to divulge any information.
 * Ex, virtio_crypto_ctrl_request::ctrl::u::destroy_session::padding[48]
 */
So use kzalloc to allocate buffer of struct virtio_crypto_ctrl_request.

Potentially dereferencing uninitialized variables:
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-3-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Originally, all of the control requests share a single buffer(
ctrl &amp; input &amp; ctrl_status fields in struct virtio_crypto), this
allows queue depth 1 only, the performance of control queue gets
limited by this design.

In this patch, each request allocates request buffer dynamically, and
free buffer after request, so the scope protected by ctrl_lock also
get optimized here.
It's possible to optimize control queue depth in the next step.

A necessary comment is already in code, still describe it again:
/*
 * Note: there are padding fields in request, clear them to zero before
 * sending to host to avoid to divulge any information.
 * Ex, virtio_crypto_ctrl_request::ctrl::u::destroy_session::padding[48]
 */
So use kzalloc to allocate buffer of struct virtio_crypto_ctrl_request.

Potentially dereferencing uninitialized variables:
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-3-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>virtio-crypto: change code style</title>
<updated>2022-05-31T16:45:08+00:00</updated>
<author>
<name>zhenwei pi</name>
<email>pizhenwei@bytedance.com</email>
</author>
<published>2022-05-06T13:16:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6fd763d155860eb7ea3a93c8b3bf926940ffa3fb'/>
<id>6fd763d155860eb7ea3a93c8b3bf926940ffa3fb</id>
<content type='text'>
Use temporary variable to make code easy to read and maintain.
	/* Pad cipher's parameters */
        vcrypto-&gt;ctrl.u.sym_create_session.op_type =
                cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
        vcrypto-&gt;ctrl.u.sym_create_session.u.cipher.para.algo =
                vcrypto-&gt;ctrl.header.algo;
        vcrypto-&gt;ctrl.u.sym_create_session.u.cipher.para.keylen =
                cpu_to_le32(keylen);
        vcrypto-&gt;ctrl.u.sym_create_session.u.cipher.para.op =
                cpu_to_le32(op);
--&gt;
	sym_create_session = &amp;ctrl-&gt;u.sym_create_session;
	sym_create_session-&gt;op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
	sym_create_session-&gt;u.cipher.para.algo = ctrl-&gt;header.algo;
	sym_create_session-&gt;u.cipher.para.keylen = cpu_to_le32(keylen);
	sym_create_session-&gt;u.cipher.para.op = cpu_to_le32(op);

The new style shows more obviously:
- the variable we want to operate.
- an assignment statement in a single line.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-2-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use temporary variable to make code easy to read and maintain.
	/* Pad cipher's parameters */
        vcrypto-&gt;ctrl.u.sym_create_session.op_type =
                cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
        vcrypto-&gt;ctrl.u.sym_create_session.u.cipher.para.algo =
                vcrypto-&gt;ctrl.header.algo;
        vcrypto-&gt;ctrl.u.sym_create_session.u.cipher.para.keylen =
                cpu_to_le32(keylen);
        vcrypto-&gt;ctrl.u.sym_create_session.u.cipher.para.op =
                cpu_to_le32(op);
--&gt;
	sym_create_session = &amp;ctrl-&gt;u.sym_create_session;
	sym_create_session-&gt;op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
	sym_create_session-&gt;u.cipher.para.algo = ctrl-&gt;header.algo;
	sym_create_session-&gt;u.cipher.para.keylen = cpu_to_le32(keylen);
	sym_create_session-&gt;u.cipher.para.op = cpu_to_le32(op);

The new style shows more obviously:
- the variable we want to operate.
- an assignment statement in a single line.

Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Gonglei &lt;arei.gonglei@huawei.com&gt;
Reviewed-by: Gonglei &lt;arei.gonglei@huawei.com&gt;
Signed-off-by: zhenwei pi &lt;pizhenwei@bytedance.com&gt;
Message-Id: &lt;20220506131627.180784-2-pizhenwei@bytedance.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
