<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ceph/osdmap.c, branch linux-4.3.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>libceph: set 'exists' flag for newly up osd</title>
<updated>2015-09-08T20:14:29+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-08-28T09:59:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6dd74e44dc1df85f125982a8d6591bc4a76c9f5d'/>
<id>6dd74e44dc1df85f125982a8d6591bc4a76c9f5d</id>
<content type='text'>
Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crush: fix a bug in tree bucket decode</title>
<updated>2015-06-30T21:46:35+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-06-29T16:30:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=82cd003a77173c91b9acad8033fb7931dac8d751'/>
<id>82cd003a77173c91b9acad8033fb7931dac8d751</id>
<content type='text'>
struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe()
should be used.  -Wconversion catches this, but I guess it went
unnoticed in all the noise it spews.  The actual problem (at least for
common crushmaps) isn't the u32 -&gt; u8 truncation though - it's the
advancement by 4 bytes instead of 1 in the crushmap buffer.

Fixes: http://tracker.ceph.com/issues/2759

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe()
should be used.  -Wconversion catches this, but I guess it went
unnoticed in all the noise it spews.  The actual problem (at least for
common crushmaps) isn't the u32 -&gt; u8 truncation though - it's the
advancement by 4 bytes instead of 1 in the crushmap buffer.

Fixes: http://tracker.ceph.com/issues/2759

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crush: straw2 bucket type with an efficient 64-bit crush_ln()</title>
<updated>2015-04-22T15:33:43+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-04-14T13:54:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=958a27658d94cf212caeb0ffb04ee0b0bc89cc40'/>
<id>958a27658d94cf212caeb0ffb04ee0b0bc89cc40</id>
<content type='text'>
This is an improved straw bucket that correctly avoids any data movement
between items A and B when neither A nor B's weights are changed.  Said
differently, if we adjust the weight of item C (including adding it anew
or removing it completely), we will only see inputs move to or from C,
never between other items in the bucket.

Notably, there is not intermediate scaling factor that needs to be
calculated.  The mapping function is a simple function of the item weights.

The below commits were squashed together into this one (mostly to avoid
adding and then yanking a ~6000 lines worth of crush_ln_table):

- crush: add a straw2 bucket type
- crush: add crush_ln to calculate nature log efficently
- crush: improve straw2 adjustment slightly
- crush: change crush_ln to provide 32 more digits
- crush: fix crush_get_bucket_item_weight and bucket destroy for straw2
- crush/mapper: fix divide-by-0 in straw2
  (with div64_s64() for draw = ln / w and INT64_MIN -&gt; S64_MIN - need
   to create a proper compat.h in ceph.git)

Reflects ceph.git commits 242293c908e923d474910f2b8203fa3b41eb5a53,
                          32a1ead92efcd351822d22a5fc37d159c65c1338,
                          6289912418c4a3597a11778bcf29ed5415117ad9,
                          35fcb04e2945717cf5cfe150b9fa89cb3d2303a1,
                          6445d9ee7290938de1e4ee9563912a6ab6d8ee5f,
                          b5921d55d16796e12d66ad2c4add7305f9ce2353.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is an improved straw bucket that correctly avoids any data movement
between items A and B when neither A nor B's weights are changed.  Said
differently, if we adjust the weight of item C (including adding it anew
or removing it completely), we will only see inputs move to or from C,
never between other items in the bucket.

Notably, there is not intermediate scaling factor that needs to be
calculated.  The mapping function is a simple function of the item weights.

The below commits were squashed together into this one (mostly to avoid
adding and then yanking a ~6000 lines worth of crush_ln_table):

- crush: add a straw2 bucket type
- crush: add crush_ln to calculate nature log efficently
- crush: improve straw2 adjustment slightly
- crush: change crush_ln to provide 32 more digits
- crush: fix crush_get_bucket_item_weight and bucket destroy for straw2
- crush/mapper: fix divide-by-0 in straw2
  (with div64_s64() for draw = ln / w and INT64_MIN -&gt; S64_MIN - need
   to create a proper compat.h in ceph.git)

Reflects ceph.git commits 242293c908e923d474910f2b8203fa3b41eb5a53,
                          32a1ead92efcd351822d22a5fc37d159c65c1338,
                          6289912418c4a3597a11778bcf29ed5415117ad9,
                          35fcb04e2945717cf5cfe150b9fa89cb3d2303a1,
                          6445d9ee7290938de1e4ee9563912a6ab6d8ee5f,
                          b5921d55d16796e12d66ad2c4add7305f9ce2353.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: Convert pr_warning to pr_warn</title>
<updated>2014-10-14T17:03:23+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2014-09-10T04:17:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b9a678994b4a64b1106ab2cf7cfe7cbc10bb6f40'/>
<id>b9a678994b4a64b1106ab2cf7cfe7cbc10bb6f40</id>
<content type='text'>
Use the more common pr_warn.

Other miscellanea:

o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the more common pr_warn.

Other miscellanea:

o Coalesce formats
o Realign arguments

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: fix a use after free issue in osdmap_set_max_osd</title>
<updated>2014-10-14T17:03:21+00:00</updated>
<author>
<name>Li RongQing</name>
<email>roy.qing.li@gmail.com</email>
</author>
<published>2014-09-07T10:10:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=589506f1e7f135943bcd34903bcdcf1fdaf00549'/>
<id>589506f1e7f135943bcd34903bcdcf1fdaf00549</id>
<content type='text'>
If the state variable is krealloced successfully, map-&gt;osd_state will be
freed, once following two reallocation failed, and exit the function
without resetting map-&gt;osd_state, map-&gt;osd_state become a wild pointer.

fix it by resetting them after krealloc successfully.

Signed-off-by: Li RongQing &lt;roy.qing.li@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the state variable is krealloced successfully, map-&gt;osd_state will be
freed, once following two reallocation failed, and exit the function
without resetting map-&gt;osd_state, map-&gt;osd_state become a wild pointer.

fix it by resetting them after krealloc successfully.

Signed-off-by: Li RongQing &lt;roy.qing.li@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crush: decode and initialize chooseleaf_vary_r</title>
<updated>2014-05-16T17:29:55+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-05-09T14:27:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f140662f35a7332b5c3188ee667856323783ed5a'/>
<id>f140662f35a7332b5c3188ee667856323783ed5a</id>
<content type='text'>
Commit e2b149cc4ba0 ("crush: add chooseleaf_vary_r tunable") added the
crush_map::chooseleaf_vary_r field but missed the decode part.  This
lead to misdirected requests caused by incorrect raw crush mapping
sets.

Fixes: http://tracker.ceph.com/issues/8226

Reported-and-Tested-by: Dmitry Smirnov &lt;onlyjob@member.fsf.org&gt;
Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit e2b149cc4ba0 ("crush: add chooseleaf_vary_r tunable") added the
crush_map::chooseleaf_vary_r field but missed the decode part.  This
lead to misdirected requests caused by incorrect raw crush mapping
sets.

Fixes: http://tracker.ceph.com/issues/8226

Reported-and-Tested-by: Dmitry Smirnov &lt;onlyjob@member.fsf.org&gt;
Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: fix non-default values check in apply_primary_affinity()</title>
<updated>2014-04-28T19:54:10+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-04-10T14:09:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=92b2e75158f6b8316b5a567c73dcf5b3d8f6bbce'/>
<id>92b2e75158f6b8316b5a567c73dcf5b3d8f6bbce</id>
<content type='text'>
osd_primary_affinity array is indexed into incorrectly when checking
for non-default primary-affinity values.  This nullifies the impact of
the rest of the apply_primary_affinity() and results in misdirected
requests.

                if (osds[i] != CRUSH_ITEM_NONE &amp;&amp;
                    osdmap-&gt;osd_primary_affinity[i] !=
                                                ^^^
                                        CEPH_OSD_DEFAULT_PRIMARY_AFFINITY) {

For a pool with size 2, this always ends up checking osd0 and osd1
primary_affinity values, instead of the values that correspond to the
osds in question.  E.g., given a [2,3] up set and a [max,max,0,max]
primary affinity vector, requests are still sent to osd2, because both
osd0 and osd1 happen to have max primary_affinity values and therefore
we return from apply_primary_affinity() early on the premise that all
osds in the given set have max (default) values.  Fix it.

Fixes: http://tracker.ceph.com/issues/7954

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
osd_primary_affinity array is indexed into incorrectly when checking
for non-default primary-affinity values.  This nullifies the impact of
the rest of the apply_primary_affinity() and results in misdirected
requests.

                if (osds[i] != CRUSH_ITEM_NONE &amp;&amp;
                    osdmap-&gt;osd_primary_affinity[i] !=
                                                ^^^
                                        CEPH_OSD_DEFAULT_PRIMARY_AFFINITY) {

For a pool with size 2, this always ends up checking osd0 and osd1
primary_affinity values, instead of the values that correspond to the
osds in question.  E.g., given a [2,3] up set and a [max,max,0,max]
primary affinity vector, requests are still sent to osd2, because both
osd0 and osd1 happen to have max primary_affinity values and therefore
we return from apply_primary_affinity() early on the premise that all
osds in the given set have max (default) values.  Fix it.

Fixes: http://tracker.ceph.com/issues/7954

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Sage Weil &lt;sage@inktank.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: output primary affinity values on osdmap updates</title>
<updated>2014-04-05T04:08:28+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-04-02T16:34:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f31da0f3e12e57f21d73315e06c48fb9860fe07d'/>
<id>f31da0f3e12e57f21d73315e06c48fb9860fe07d</id>
<content type='text'>
Similar to osd weights, output primary affinity values on incremental
osdmap updates.

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to osd weights, output primary affinity values on incremental
osdmap updates.

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: redo ceph_calc_pg_primary() in terms of ceph_calc_pg_acting()</title>
<updated>2014-04-05T04:08:19+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-03-24T15:12:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c4c1228525a2cbf9e06657b01135391700c1ec14'/>
<id>c4c1228525a2cbf9e06657b01135391700c1ec14</id>
<content type='text'>
Reimplement ceph_calc_pg_primary() in terms of ceph_calc_pg_acting()
and get rid of the now unused calc_pg_raw().

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reimplement ceph_calc_pg_primary() in terms of ceph_calc_pg_acting()
and get rid of the now unused calc_pg_raw().

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: add support for osd primary affinity</title>
<updated>2014-04-05T04:08:17+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-03-24T15:12:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=47ec1f3cc46dde00deb34922dbffdeda254ad76d'/>
<id>47ec1f3cc46dde00deb34922dbffdeda254ad76d</id>
<content type='text'>
Respond to non-default primary_affinity values accordingly.  (Primary
affinity allows the admin to shift 'primary responsibility' away from
specific osds, effectively shifting around the read side of the
workload and whatever overhead is incurred by peering and writes by
virtue of being the primary).

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Respond to non-default primary_affinity values accordingly.  (Primary
affinity allows the admin to shift 'primary responsibility' away from
specific osds, effectively shifting around the read side of the
workload and whatever overhead is incurred by peering and writes by
virtue of being the primary).

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
