<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/ceph, branch v5.17</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>libceph: optionally use bounce buffer on recv path in crc mode</title>
<updated>2022-02-02T17:50:36+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2021-12-30T14:13:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=038b8d1d1ab1cce11a158d30bf080ff41a2cfd15'/>
<id>038b8d1d1ab1cce11a158d30bf080ff41a2cfd15</id>
<content type='text'>
Both msgr1 and msgr2 in crc mode are zero copy in the sense that
message data is read from the socket directly into the destination
buffer.  We assume that the destination buffer is stable (i.e. remains
unchanged while it is being read to) though.  Otherwise, CRC errors
ensue:

  libceph: read_partial_message 0000000048edf8ad data crc 1063286393 != exp. 228122706
  libceph: osd1 (1)192.168.122.1:6843 bad crc/signature

  libceph: bad data crc, calculated 57958023, expected 1805382778
  libceph: osd2 (2)192.168.122.1:6876 integrity error, bad crc

Introduce rxbounce option to enable use of a bounce buffer when
receiving message data.  In particular this is needed if a mapped
image is a Windows VM disk, passed to QEMU.  Windows has a system-wide
"dummy" page that may be mapped into the destination buffer (potentially
more than once into the same buffer) by the Windows Memory Manager in
an effort to generate a single large I/O [1][2].  QEMU makes a point of
preserving overlap relationships when cloning I/O vectors, so krbd gets
exposed to this behaviour.

[1] "What Is Really in That MDL?"
    https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn614012(v=vs.85)
[2] https://blogs.msmvps.com/kernelmustard/2005/05/04/dummy-pages/

URL: https://bugzilla.redhat.com/show_bug.cgi?id=1973317
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Both msgr1 and msgr2 in crc mode are zero copy in the sense that
message data is read from the socket directly into the destination
buffer.  We assume that the destination buffer is stable (i.e. remains
unchanged while it is being read to) though.  Otherwise, CRC errors
ensue:

  libceph: read_partial_message 0000000048edf8ad data crc 1063286393 != exp. 228122706
  libceph: osd1 (1)192.168.122.1:6843 bad crc/signature

  libceph: bad data crc, calculated 57958023, expected 1805382778
  libceph: osd2 (2)192.168.122.1:6876 integrity error, bad crc

Introduce rxbounce option to enable use of a bounce buffer when
receiving message data.  In particular this is needed if a mapped
image is a Windows VM disk, passed to QEMU.  Windows has a system-wide
"dummy" page that may be mapped into the destination buffer (potentially
more than once into the same buffer) by the Windows Memory Manager in
an effort to generate a single large I/O [1][2].  QEMU makes a point of
preserving overlap relationships when cloning I/O vectors, so krbd gets
exposed to this behaviour.

[1] "What Is Really in That MDL?"
    https://docs.microsoft.com/en-us/previous-versions/windows/hardware/design/dn614012(v=vs.85)
[2] https://blogs.msmvps.com/kernelmustard/2005/05/04/dummy-pages/

URL: https://bugzilla.redhat.com/show_bug.cgi?id=1973317
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: make recv path in secure mode work the same as send path</title>
<updated>2022-02-02T17:50:36+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2022-01-23T16:27:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2ea88716369ac9a7486a8cb309d6bf1239ea156c'/>
<id>2ea88716369ac9a7486a8cb309d6bf1239ea156c</id>
<content type='text'>
The recv path of secure mode is intertwined with that of crc mode.
While it's slightly more efficient that way (the ciphertext is read
into the destination buffer and decrypted in place, thus avoiding
two potentially heavy memory allocations for the bounce buffer and
the corresponding sg array), it isn't really amenable to changes.
Sacrifice that edge and align with the send path which always uses
a full-sized bounce buffer (currently there is no other way -- if
the kernel crypto API ever grows support for streaming (piecewise)
en/decryption for GCM [1], we would be able to easily take advantage
of that on both sides).

[1] https://lore.kernel.org/all/20141225202830.GA18794@gondor.apana.org.au/

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The recv path of secure mode is intertwined with that of crc mode.
While it's slightly more efficient that way (the ciphertext is read
into the destination buffer and decrypted in place, thus avoiding
two potentially heavy memory allocations for the bounce buffer and
the corresponding sg array), it isn't really amenable to changes.
Sacrifice that edge and align with the send path which always uses
a full-sized bounce buffer (currently there is no other way -- if
the kernel crypto API ever grows support for streaming (piecewise)
en/decryption for GCM [1], we would be able to easily take advantage
of that on both sides).

[1] https://lore.kernel.org/all/20141225202830.GA18794@gondor.apana.org.au/

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client</title>
<updated>2022-01-20T11:46:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-01-20T11:46:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=64f29d8856a9e0d1fcdc5344f76e70c364b941cb'/>
<id>64f29d8856a9e0d1fcdc5344f76e70c364b941cb</id>
<content type='text'>
Pull ceph updates from Ilya Dryomov:
 "The highlight is the new mount "device" string syntax implemented by
  Venky Shankar. It solves some long-standing issues with using
  different auth entities and/or mounting different CephFS filesystems
  from the same cluster, remounting and also misleading /proc/mounts
  contents. The existing syntax of course remains to be maintained.

  On top of that, there is a couple of fixes for edge cases in quota and
  a new mount option for turning on unbuffered I/O mode globally instead
  of on a per-file basis with ioctl(CEPH_IOC_SYNCIO)"

* tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client:
  ceph: move CEPH_SUPER_MAGIC definition to magic.h
  ceph: remove redundant Lsx caps check
  ceph: add new "nopagecache" option
  ceph: don't check for quotas on MDS stray dirs
  ceph: drop send metrics debug message
  rbd: make const pointer spaces a static const array
  ceph: Fix incorrect statfs report for small quota
  ceph: mount syntax module parameter
  doc: document new CephFS mount device syntax
  ceph: record updated mon_addr on remount
  ceph: new device mount syntax
  libceph: rename parse_fsid() to ceph_parse_fsid() and export
  libceph: generalize addr/ip parsing based on delimiter
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull ceph updates from Ilya Dryomov:
 "The highlight is the new mount "device" string syntax implemented by
  Venky Shankar. It solves some long-standing issues with using
  different auth entities and/or mounting different CephFS filesystems
  from the same cluster, remounting and also misleading /proc/mounts
  contents. The existing syntax of course remains to be maintained.

  On top of that, there is a couple of fixes for edge cases in quota and
  a new mount option for turning on unbuffered I/O mode globally instead
  of on a per-file basis with ioctl(CEPH_IOC_SYNCIO)"

* tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client:
  ceph: move CEPH_SUPER_MAGIC definition to magic.h
  ceph: remove redundant Lsx caps check
  ceph: add new "nopagecache" option
  ceph: don't check for quotas on MDS stray dirs
  ceph: drop send metrics debug message
  rbd: make const pointer spaces a static const array
  ceph: Fix incorrect statfs report for small quota
  ceph: mount syntax module parameter
  doc: document new CephFS mount device syntax
  ceph: record updated mon_addr on remount
  ceph: new device mount syntax
  libceph: rename parse_fsid() to ceph_parse_fsid() and export
  libceph: generalize addr/ip parsing based on delimiter
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: allow !GFP_KERNEL allocations for kvmalloc</title>
<updated>2022-01-15T14:30:29+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2022-01-14T22:07:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a421ef303008b0ceee2cfc625c3246fa7654b0ca'/>
<id>a421ef303008b0ceee2cfc625c3246fa7654b0ca</id>
<content type='text'>
Support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented by
previous patches so we can allow the support for kvmalloc.  This will
allow some external users to simplify or completely remove their
helpers.

GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
explicitly documented so let's add a note about that.

ceph_kvmalloc is the first helper to be dropped and changed to kvmalloc.

Link: https://lkml.kernel.org/r/20211122153233.9924-5-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Support for GFP_NO{FS,IO} and __GFP_NOFAIL has been implemented by
previous patches so we can allow the support for kvmalloc.  This will
allow some external users to simplify or completely remove their
helpers.

GFP_NOWAIT semantic hasn't been supported so far but it hasn't been
explicitly documented so let's add a note about that.

ceph_kvmalloc is the first helper to be dropped and changed to kvmalloc.

Link: https://lkml.kernel.org/r/20211122153233.9924-5-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Cc: Jeff Layton &lt;jlayton@kernel.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: rename parse_fsid() to ceph_parse_fsid() and export</title>
<updated>2022-01-13T12:40:06+00:00</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2021-07-14T10:05:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4153c7fc937a2afa077dbdb9fe3189b9981f423c'/>
<id>4153c7fc937a2afa077dbdb9fe3189b9981f423c</id>
<content type='text'>
... as it is too generic. also, use __func__ when logging
rather than hardcoding the function name.

Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... as it is too generic. also, use __func__ when logging
rather than hardcoding the function name.

Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: generalize addr/ip parsing based on delimiter</title>
<updated>2022-01-13T12:40:05+00:00</updated>
<author>
<name>Venky Shankar</name>
<email>vshankar@redhat.com</email>
</author>
<published>2021-07-14T10:05:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2d7c86a8f9cdce1408c4f3c69d94d007eff2f179'/>
<id>2d7c86a8f9cdce1408c4f3c69d94d007eff2f179</id>
<content type='text'>
... and remove hardcoded function name in ceph_parse_ips().

[ idryomov: delim parameter, drop CEPH_ADDR_PARSE_DEFAULT_DELIM ]

Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... and remove hardcoded function name in ceph_parse_ips().

[ idryomov: delim parameter, drop CEPH_ADDR_PARSE_DEFAULT_DELIM ]

Signed-off-by: Venky Shankar &lt;vshankar@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph, ceph: move ceph_osdc_copy_from() into cephfs code</title>
<updated>2021-11-08T02:29:52+00:00</updated>
<author>
<name>Luís Henriques</name>
<email>lhenriques@suse.de</email>
</author>
<published>2021-11-04T12:31:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=aca39d9e86f3edeaac5d2c467f5fd31e0b0df606'/>
<id>aca39d9e86f3edeaac5d2c467f5fd31e0b0df606</id>
<content type='text'>
This patch moves ceph_osdc_copy_from() function out of libceph code into
cephfs.  There are no other users for this function, and there is the need
(in another patch) to access internal ceph_osd_request struct members.

Signed-off-by: Luís Henriques &lt;lhenriques@suse.de&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch moves ceph_osdc_copy_from() function out of libceph code into
cephfs.  There are no other users for this function, and there is the need
(in another patch) to access internal ceph_osd_request struct members.

Signed-off-by: Luís Henriques &lt;lhenriques@suse.de&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: don't rely on error_string to validate blocklisted session.</title>
<updated>2021-11-08T02:29:51+00:00</updated>
<author>
<name>Kotresh HR</name>
<email>khiremat@redhat.com</email>
</author>
<published>2021-09-27T13:52:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e1c9788cb39777e81ebfbf31ae80b4ec14eb6f6d'/>
<id>e1c9788cb39777e81ebfbf31ae80b4ec14eb6f6d</id>
<content type='text'>
The "error_string" in the metadata of MClientSession is being
parsed by kclient to validate whether the session is blocklisted.
The "error_string" is for humans and shouldn't be relied on it.
Hence added the flag to MClientsession to indicate the session
is blocklisted.

[ jlayton: minor formatting cleanup ]

URL: https://tracker.ceph.com/issues/47450
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The "error_string" in the metadata of MClientSession is being
parsed by kclient to validate whether the session is blocklisted.
The "error_string" is for humans and shouldn't be relied on it.
Hence added the flag to MClientsession to indicate the session
is blocklisted.

[ jlayton: minor formatting cleanup ]

URL: https://tracker.ceph.com/issues/47450
Signed-off-by: Kotresh HR &lt;khiremat@redhat.com&gt;
Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: flush mdlog before umounting</title>
<updated>2021-09-02T20:49:16+00:00</updated>
<author>
<name>Xiubo Li</name>
<email>xiubli@redhat.com</email>
</author>
<published>2021-07-05T01:22:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d095559ce4100f0c02aea229705230deac329c97'/>
<id>d095559ce4100f0c02aea229705230deac329c97</id>
<content type='text'>
Signed-off-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Xiubo Li &lt;xiubli@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: set global_id as soon as we get an auth ticket</title>
<updated>2021-06-24T19:03:17+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2021-06-21T10:17:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=03af4c7bad8ca59143bca488b90b3775d10d7f94'/>
<id>03af4c7bad8ca59143bca488b90b3775d10d7f94</id>
<content type='text'>
Commit 61ca49a9105f ("libceph: don't set global_id until we get an
auth ticket") delayed the setting of global_id too much.  It is set
only after all tickets are received, but in pre-nautilus clusters an
auth ticket and the service tickets are obtained in separate steps
(for a total of three MAuth replies).  When the service tickets are
requested, global_id is used to build an authorizer; if global_id is
still 0 we never get them and fail to establish the session.

Moving the setting of global_id into protocol implementations.  This
way global_id can be set exactly when an auth ticket is received, not
sooner nor later.

Fixes: 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket")
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 61ca49a9105f ("libceph: don't set global_id until we get an
auth ticket") delayed the setting of global_id too much.  It is set
only after all tickets are received, but in pre-nautilus clusters an
auth ticket and the service tickets are obtained in separate steps
(for a total of three MAuth replies).  When the service tickets are
requested, global_id is used to build an authorizer; if global_id is
still 0 we never get them and fail to establish the session.

Moving the setting of global_id into protocol implementations.  This
way global_id can be set exactly when an auth ticket is received, not
sooner nor later.

Fixes: 61ca49a9105f ("libceph: don't set global_id until we get an auth ticket")
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
