<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ceph, branch v3.12.64</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>libceph: apply new_state before new_up_client on incrementals</title>
<updated>2016-08-19T07:50:45+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2016-07-24T16:32:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bbc3aa6b0e6050b2b2e04a08dd4d6423d576b196'/>
<id>bbc3aa6b0e6050b2b2e04a08dd4d6423d576b196</id>
<content type='text'>
commit 930c532869774ebf8af9efe9484c597f896a7d46 upstream.

Currently, osd_weight and osd_state fields are updated in the encoding
order.  This is wrong, because an incremental map may look like e.g.

    new_up_client: { osd=6, addr=... } # set osd_state and addr
    new_state: { osd=6, xorstate=EXISTS } # clear osd_state

Suppose osd6's current osd_state is EXISTS (i.e. osd6 is down).  After
applying new_up_client, osd_state is changed to EXISTS | UP.  Carrying
on with the new_state update, we flip EXISTS and leave osd6 in a weird
"!EXISTS but UP" state.  A non-existent OSD is considered down by the
mapping code

2087    for (i = 0; i &lt; pg-&gt;pg_temp.len; i++) {
2088            if (ceph_osd_is_down(osdmap, pg-&gt;pg_temp.osds[i])) {
2089                    if (ceph_can_shift_osds(pi))
2090                            continue;
2091
2092                    temp-&gt;osds[temp-&gt;size++] = CRUSH_ITEM_NONE;

and so requests get directed to the second OSD in the set instead of
the first, resulting in OSD-side errors like:

[WRN] : client.4239 192.168.122.21:0/2444980242 misdirected client.4239.1:2827 pg 2.5df899f2 to osd.4 not [1,4,6] in e680/680

and hung rbds on the client:

[  493.566367] rbd: rbd0: write 400000 at 11cc00000 (0)
[  493.566805] rbd: rbd0:   result -6 xferred 400000
[  493.567011] blk_update_request: I/O error, dev rbd0, sector 9330688

The fix is to decouple application from the decoding and:
- apply new_weight first
- apply new_state before new_up_client
- twiddle osd_state flags if marking in
- clear out some of the state if osd is destroyed

Fixes: http://tracker.ceph.com/issues/14901

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
[idryomov@gmail.com: backport to 3.10-3.14: strip primary-affinity]
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 930c532869774ebf8af9efe9484c597f896a7d46 upstream.

Currently, osd_weight and osd_state fields are updated in the encoding
order.  This is wrong, because an incremental map may look like e.g.

    new_up_client: { osd=6, addr=... } # set osd_state and addr
    new_state: { osd=6, xorstate=EXISTS } # clear osd_state

Suppose osd6's current osd_state is EXISTS (i.e. osd6 is down).  After
applying new_up_client, osd_state is changed to EXISTS | UP.  Carrying
on with the new_state update, we flip EXISTS and leave osd6 in a weird
"!EXISTS but UP" state.  A non-existent OSD is considered down by the
mapping code

2087    for (i = 0; i &lt; pg-&gt;pg_temp.len; i++) {
2088            if (ceph_osd_is_down(osdmap, pg-&gt;pg_temp.osds[i])) {
2089                    if (ceph_can_shift_osds(pi))
2090                            continue;
2091
2092                    temp-&gt;osds[temp-&gt;size++] = CRUSH_ITEM_NONE;

and so requests get directed to the second OSD in the set instead of
the first, resulting in OSD-side errors like:

[WRN] : client.4239 192.168.122.21:0/2444980242 misdirected client.4239.1:2827 pg 2.5df899f2 to osd.4 not [1,4,6] in e680/680

and hung rbds on the client:

[  493.566367] rbd: rbd0: write 400000 at 11cc00000 (0)
[  493.566805] rbd: rbd0:   result -6 xferred 400000
[  493.567011] blk_update_request: I/O error, dev rbd0, sector 9330688

The fix is to decouple application from the decoding and:
- apply new_weight first
- apply new_state before new_up_client
- twiddle osd_state flags if marking in
- clear out some of the state if osd is destroyed

Fixes: http://tracker.ceph.com/issues/14901

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
[idryomov@gmail.com: backport to 3.10-3.14: strip primary-affinity]
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: set 'exists' flag for newly up osd</title>
<updated>2016-08-19T07:50:44+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2015-08-28T09:59:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2ffb16a7ea55f7cd827d8ed95a0490ae8bdb60d7'/>
<id>2ffb16a7ea55f7cd827d8ed95a0490ae8bdb60d7</id>
<content type='text'>
commit 6dd74e44dc1df85f125982a8d6591bc4a76c9f5d upstream.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6dd74e44dc1df85f125982a8d6591bc4a76c9f5d upstream.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: don't bail early from try_read() when skipping a message</title>
<updated>2016-03-03T11:46:05+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2016-02-17T19:04:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=715a01aa261cd312c2aaa8614eadd6b28fd3a018'/>
<id>715a01aa261cd312c2aaa8614eadd6b28fd3a018</id>
<content type='text'>
commit e7a88e82fe380459b864e05b372638aeacb0f52d upstream.

The contract between try_read() and try_write() is that when called
each processes as much data as possible.  When instructed by osd_client
to skip a message, try_read() is violating this contract by returning
after receiving and discarding a single message instead of checking for
more.  try_write() then gets a chance to write out more requests,
generating more replies/skips for try_read() to handle, forcing the
messenger into a starvation loop.

Reported-by: Varada Kari &lt;Varada.Kari@sandisk.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Tested-by: Varada Kari &lt;Varada.Kari@sandisk.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e7a88e82fe380459b864e05b372638aeacb0f52d upstream.

The contract between try_read() and try_write() is that when called
each processes as much data as possible.  When instructed by osd_client
to skip a message, try_read() is violating this contract by returning
after receiving and discarding a single message instead of checking for
more.  try_write() then gets a chance to write out more requests,
generating more replies/skips for try_read() to handle, forcing the
messenger into a starvation loop.

Reported-by: Varada Kari &lt;Varada.Kari@sandisk.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Tested-by: Varada Kari &lt;Varada.Kari@sandisk.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>crush: fix a bug in tree bucket decode</title>
<updated>2015-08-04T14:52:26+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-06-29T16:30:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=85a5064b5c4fa42a2bd73feba46a5f96f56edabe'/>
<id>85a5064b5c4fa42a2bd73feba46a5f96f56edabe</id>
<content type='text'>
commit 82cd003a77173c91b9acad8033fb7931dac8d751 upstream.

struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe()
should be used.  -Wconversion catches this, but I guess it went
unnoticed in all the noise it spews.  The actual problem (at least for
common crushmaps) isn't the u32 -&gt; u8 truncation though - it's the
advancement by 4 bytes instead of 1 in the crushmap buffer.

Fixes: http://tracker.ceph.com/issues/2759

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 82cd003a77173c91b9acad8033fb7931dac8d751 upstream.

struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe()
should be used.  -Wconversion catches this, but I guess it went
unnoticed in all the noise it spews.  The actual problem (at least for
common crushmaps) isn't the u32 -&gt; u8 truncation though - it's the
advancement by 4 bytes instead of 1 in the crushmap buffer.

Fixes: http://tracker.ceph.com/issues/2759

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: request a new osdmap if lingering request maps to no osd</title>
<updated>2015-06-03T09:33:07+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-05-11T14:53:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=74b65e99d99c940a7fa731a317d13bc0b3b34911'/>
<id>74b65e99d99c940a7fa731a317d13bc0b3b34911</id>
<content type='text'>
commit b0494532214bdfbf241e94fabab5dd46f7b82631 upstream.

This commit does two things.  First, if there are any homeless
lingering requests, we now request a new osdmap even if the osdmap that
is being processed brought no changes, i.e. if a given lingering
request turned homeless in one of the previous epochs and remained
homeless in the current epoch.  Not doing so leaves us with a stale
osdmap and as a result we may miss our window for reestablishing the
watch and lose notifies.

MON=1 OSD=1:

    # cat linger-needmap.sh
    #!/bin/bash
    rbd create --size 1 test
    DEV=$(rbd map test)
    ceph osd out 0
    rbd map dne/dne # obtain a new osdmap as a side effect (!)
    sleep 1
    ceph osd in 0
    rbd resize --size 2 test
    # rbd info test | grep size -&gt; 2M
    # blockdev --getsize $DEV -&gt; 1M

N.B.: Not obtaining a new osdmap in between "osd out" and "osd in"
above is enough to make it miss that resize notify, but that is a
bug^Wlimitation of ceph watch/notify v1.

Second, homeless lingering requests are now kicked just like those
lingering requests whose mapping has changed.  This is mainly to
recognize that a homeless lingering request makes no sense and to
preserve the invariant that a registered lingering request is not
sitting on any of r_req_lru_item lists.  This spares us a WARN_ON,
which commit ba9d114ec557 ("libceph: clear r_req_lru_item in
__unregister_linger_request()") tried to fix the _wrong_ way.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b0494532214bdfbf241e94fabab5dd46f7b82631 upstream.

This commit does two things.  First, if there are any homeless
lingering requests, we now request a new osdmap even if the osdmap that
is being processed brought no changes, i.e. if a given lingering
request turned homeless in one of the previous epochs and remained
homeless in the current epoch.  Not doing so leaves us with a stale
osdmap and as a result we may miss our window for reestablishing the
watch and lose notifies.

MON=1 OSD=1:

    # cat linger-needmap.sh
    #!/bin/bash
    rbd create --size 1 test
    DEV=$(rbd map test)
    ceph osd out 0
    rbd map dne/dne # obtain a new osdmap as a side effect (!)
    sleep 1
    ceph osd in 0
    rbd resize --size 2 test
    # rbd info test | grep size -&gt; 2M
    # blockdev --getsize $DEV -&gt; 1M

N.B.: Not obtaining a new osdmap in between "osd out" and "osd in"
above is enough to make it miss that resize notify, but that is a
bug^Wlimitation of ceph watch/notify v1.

Second, homeless lingering requests are now kicked just like those
lingering requests whose mapping has changed.  This is mainly to
recognize that a homeless lingering request makes no sense and to
preserve the invariant that a registered lingering request is not
sitting on any of r_req_lru_item lists.  This spares us a WARN_ON,
which commit ba9d114ec557 ("libceph: clear r_req_lru_item in
__unregister_linger_request()") tried to fix the _wrong_ way.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: fix double __remove_osd() problem</title>
<updated>2015-03-05T14:36:59+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-02-17T16:37:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8ef0cf0e2fede8b5607986c197128ee2f21adf94'/>
<id>8ef0cf0e2fede8b5607986c197128ee2f21adf94</id>
<content type='text'>
commit 7eb71e0351fbb1b242ae70abb7bb17107fe2f792 upstream.

It turns out it's possible to get __remove_osd() called twice on the
same OSD.  That doesn't sit well with rb_erase() - depending on the
shape of the tree we can get a NULL dereference, a soft lockup or
a random crash at some point in the future as we end up touching freed
memory.  One scenario that I was able to reproduce is as follows:

            &lt;osd3 is idle, on the osd lru list&gt;
&lt;con reset - osd3&gt;
con_fault_finish()
  osd_reset()
                              &lt;osdmap - osd3 down&gt;
                              ceph_osdc_handle_map()
                                &lt;takes map_sem&gt;
                                kick_requests()
                                  &lt;takes request_mutex&gt;
                                  reset_changed_osds()
                                    __reset_osd()
                                      __remove_osd()
                                  &lt;releases request_mutex&gt;
                                &lt;releases map_sem&gt;
    &lt;takes map_sem&gt;
    &lt;takes request_mutex&gt;
    __kick_osd_requests()
      __reset_osd()
        __remove_osd() &lt;-- !!!

A case can be made that osd refcounting is imperfect and reworking it
would be a proper resolution, but for now Sage and I decided to fix
this by adding a safe guard around __remove_osd().

Fixes: http://tracker.ceph.com/issues/8087

Cc: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7eb71e0351fbb1b242ae70abb7bb17107fe2f792 upstream.

It turns out it's possible to get __remove_osd() called twice on the
same OSD.  That doesn't sit well with rb_erase() - depending on the
shape of the tree we can get a NULL dereference, a soft lockup or
a random crash at some point in the future as we end up touching freed
memory.  One scenario that I was able to reproduce is as follows:

            &lt;osd3 is idle, on the osd lru list&gt;
&lt;con reset - osd3&gt;
con_fault_finish()
  osd_reset()
                              &lt;osdmap - osd3 down&gt;
                              ceph_osdc_handle_map()
                                &lt;takes map_sem&gt;
                                kick_requests()
                                  &lt;takes request_mutex&gt;
                                  reset_changed_osds()
                                    __reset_osd()
                                      __remove_osd()
                                  &lt;releases request_mutex&gt;
                                &lt;releases map_sem&gt;
    &lt;takes map_sem&gt;
    &lt;takes request_mutex&gt;
    __kick_osd_requests()
      __reset_osd()
        __remove_osd() &lt;-- !!!

A case can be made that osd refcounting is imperfect and reworking it
would be a proper resolution, but for now Sage and I decided to fix
this by adding a safe guard around __remove_osd().

Fixes: http://tracker.ceph.com/issues/8087

Cc: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: change from BUG to WARN for __remove_osd() asserts</title>
<updated>2015-03-05T14:36:58+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@redhat.com</email>
</author>
<published>2014-11-05T16:33:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=27f9c1807c7e33efec5c02ef8a101774839214b8'/>
<id>27f9c1807c7e33efec5c02ef8a101774839214b8</id>
<content type='text'>
commit cc9f1f518cec079289d11d732efa490306b1ddad upstream.

No reason to use BUG_ON for osd request list assertions.

Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit cc9f1f518cec079289d11d732efa490306b1ddad upstream.

No reason to use BUG_ON for osd request list assertions.

Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: assert both regular and lingering lists in __remove_osd()</title>
<updated>2015-03-05T14:36:57+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>ilya.dryomov@inktank.com</email>
</author>
<published>2014-06-18T09:02:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=66d37e928df71d6aef8b2dfe8c6dca785653cf1f'/>
<id>66d37e928df71d6aef8b2dfe8c6dca785653cf1f</id>
<content type='text'>
commit 7c6e6fc53e7335570ed82f77656cedce1502744e upstream.

It is important that both regular and lingering requests lists are
empty when the OSD is removed.

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7c6e6fc53e7335570ed82f77656cedce1502744e upstream.

It is important that both regular and lingering requests lists are
empty when the OSD is removed.

Signed-off-by: Ilya Dryomov &lt;ilya.dryomov@inktank.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: do not crash on large auth tickets</title>
<updated>2014-11-19T17:38:17+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@redhat.com</email>
</author>
<published>2014-10-22T20:25:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8ad766627efd7296267b8ddb231d990f84c90bdd'/>
<id>8ad766627efd7296267b8ddb231d990f84c90bdd</id>
<content type='text'>
commit aaef31703a0cf6a733e651885bfb49edc3ac6774 upstream.

Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
tickets will have their buffers vmalloc'ed, which leads to the
following crash in crypto:

[   28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
[   28.686032] IP: [&lt;ffffffff81392b42&gt;] scatterwalk_pagedone+0x22/0x80
[   28.686032] PGD 0
[   28.688088] Oops: 0000 [#1] PREEMPT SMP
[   28.688088] Modules linked in:
[   28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
[   28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   28.688088] Workqueue: ceph-msgr con_work
[   28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
[   28.688088] RIP: 0010:[&lt;ffffffff81392b42&gt;]  [&lt;ffffffff81392b42&gt;] scatterwalk_pagedone+0x22/0x80
[   28.688088] RSP: 0018:ffff8800d903f688  EFLAGS: 00010286
[   28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
[   28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
[   28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
[   28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
[   28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
[   28.688088] FS:  00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[   28.688088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
[   28.688088] Stack:
[   28.688088]  ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
[   28.688088]  ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
[   28.688088]  ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
[   28.688088] Call Trace:
[   28.688088]  [&lt;ffffffff81392ca8&gt;] scatterwalk_done+0x38/0x40
[   28.688088]  [&lt;ffffffff81392ca8&gt;] scatterwalk_done+0x38/0x40
[   28.688088]  [&lt;ffffffff81395d32&gt;] blkcipher_walk_done+0x182/0x220
[   28.688088]  [&lt;ffffffff813990bf&gt;] crypto_cbc_encrypt+0x15f/0x180
[   28.688088]  [&lt;ffffffff81399780&gt;] ? crypto_aes_set_key+0x30/0x30
[   28.688088]  [&lt;ffffffff8156c40c&gt;] ceph_aes_encrypt2+0x29c/0x2e0
[   28.688088]  [&lt;ffffffff8156d2a3&gt;] ceph_encrypt2+0x93/0xb0
[   28.688088]  [&lt;ffffffff8156d7da&gt;] ceph_x_encrypt+0x4a/0x60
[   28.688088]  [&lt;ffffffff8155b39d&gt;] ? ceph_buffer_new+0x5d/0xf0
[   28.688088]  [&lt;ffffffff8156e837&gt;] ceph_x_build_authorizer.isra.6+0x297/0x360
[   28.688088]  [&lt;ffffffff8112089b&gt;] ? kmem_cache_alloc_trace+0x11b/0x1c0
[   28.688088]  [&lt;ffffffff8156b496&gt;] ? ceph_auth_create_authorizer+0x36/0x80
[   28.688088]  [&lt;ffffffff8156ed83&gt;] ceph_x_create_authorizer+0x63/0xd0
[   28.688088]  [&lt;ffffffff8156b4b4&gt;] ceph_auth_create_authorizer+0x54/0x80
[   28.688088]  [&lt;ffffffff8155f7c0&gt;] get_authorizer+0x80/0xd0
[   28.688088]  [&lt;ffffffff81555a8b&gt;] prepare_write_connect+0x18b/0x2b0
[   28.688088]  [&lt;ffffffff81559289&gt;] try_read+0x1e59/0x1f10

This is because we set up crypto scatterlists as if all buffers were
kmalloc'ed.  Fix it.

Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit aaef31703a0cf6a733e651885bfb49edc3ac6774 upstream.

Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
tickets will have their buffers vmalloc'ed, which leads to the
following crash in crypto:

[   28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
[   28.686032] IP: [&lt;ffffffff81392b42&gt;] scatterwalk_pagedone+0x22/0x80
[   28.686032] PGD 0
[   28.688088] Oops: 0000 [#1] PREEMPT SMP
[   28.688088] Modules linked in:
[   28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
[   28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   28.688088] Workqueue: ceph-msgr con_work
[   28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
[   28.688088] RIP: 0010:[&lt;ffffffff81392b42&gt;]  [&lt;ffffffff81392b42&gt;] scatterwalk_pagedone+0x22/0x80
[   28.688088] RSP: 0018:ffff8800d903f688  EFLAGS: 00010286
[   28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
[   28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
[   28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
[   28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
[   28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
[   28.688088] FS:  00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[   28.688088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
[   28.688088] Stack:
[   28.688088]  ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
[   28.688088]  ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
[   28.688088]  ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
[   28.688088] Call Trace:
[   28.688088]  [&lt;ffffffff81392ca8&gt;] scatterwalk_done+0x38/0x40
[   28.688088]  [&lt;ffffffff81392ca8&gt;] scatterwalk_done+0x38/0x40
[   28.688088]  [&lt;ffffffff81395d32&gt;] blkcipher_walk_done+0x182/0x220
[   28.688088]  [&lt;ffffffff813990bf&gt;] crypto_cbc_encrypt+0x15f/0x180
[   28.688088]  [&lt;ffffffff81399780&gt;] ? crypto_aes_set_key+0x30/0x30
[   28.688088]  [&lt;ffffffff8156c40c&gt;] ceph_aes_encrypt2+0x29c/0x2e0
[   28.688088]  [&lt;ffffffff8156d2a3&gt;] ceph_encrypt2+0x93/0xb0
[   28.688088]  [&lt;ffffffff8156d7da&gt;] ceph_x_encrypt+0x4a/0x60
[   28.688088]  [&lt;ffffffff8155b39d&gt;] ? ceph_buffer_new+0x5d/0xf0
[   28.688088]  [&lt;ffffffff8156e837&gt;] ceph_x_build_authorizer.isra.6+0x297/0x360
[   28.688088]  [&lt;ffffffff8112089b&gt;] ? kmem_cache_alloc_trace+0x11b/0x1c0
[   28.688088]  [&lt;ffffffff8156b496&gt;] ? ceph_auth_create_authorizer+0x36/0x80
[   28.688088]  [&lt;ffffffff8156ed83&gt;] ceph_x_create_authorizer+0x63/0xd0
[   28.688088]  [&lt;ffffffff8156b4b4&gt;] ceph_auth_create_authorizer+0x54/0x80
[   28.688088]  [&lt;ffffffff8155f7c0&gt;] get_authorizer+0x80/0xd0
[   28.688088]  [&lt;ffffffff81555a8b&gt;] prepare_write_connect+0x18b/0x2b0
[   28.688088]  [&lt;ffffffff81559289&gt;] try_read+0x1e59/0x1f10

This is because we set up crypto scatterlists as if all buffers were
kmalloc'ed.  Fix it.

Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: ceph-msgr workqueue needs a resque worker</title>
<updated>2014-10-31T14:11:32+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@redhat.com</email>
</author>
<published>2014-10-10T12:39:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=292b588419a9fdd4950ae7944bed6efafa7571ae'/>
<id>292b588419a9fdd4950ae7944bed6efafa7571ae</id>
<content type='text'>
commit f9865f06f7f18c6661c88d0511f05c48612319cc upstream.

Commit f363e45fd118 ("net/ceph: make ceph_msgr_wq non-reentrant")
effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq.  This is
wrong - libceph is very much a memory reclaim path, so restore it.

Cc: stable@vger.kernel.org # needs backporting for &lt; 3.12
Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
Tested-by: Micha Krause &lt;micha@krausam.de&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f9865f06f7f18c6661c88d0511f05c48612319cc upstream.

Commit f363e45fd118 ("net/ceph: make ceph_msgr_wq non-reentrant")
effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq.  This is
wrong - libceph is very much a memory reclaim path, so restore it.

Cc: stable@vger.kernel.org # needs backporting for &lt; 3.12
Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
Tested-by: Micha Krause &lt;micha@krausam.de&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
</feed>
