<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/raid1.c, branch v4.1.41</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>md:raid1: fix a dead loop when read from a WriteMostly disk</title>
<updated>2017-06-13T13:29:18+00:00</updated>
<author>
<name>Wei Fang</name>
<email>fangwei1@huawei.com</email>
</author>
<published>2016-03-21T11:18:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d0388c0e2556ea797bc433618f54238f0c7b9fcf'/>
<id>d0388c0e2556ea797bc433618f54238f0c7b9fcf</id>
<content type='text'>
[ Upstream commit 816b0acf3deb6d6be5d0519b286fdd4bafade905 ]

If first_bad == this_sector when we get the WriteMostly disk
in read_balance(), valid disk will be returned with zero
max_sectors. It'll lead to a dead loop in make_request(), and
OOM will happen because of endless allocation of struct bio.

Since we can't get data from this disk in this case, so
continue for another disk.

Signed-off-by: Wei Fang &lt;fangwei1@huawei.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 816b0acf3deb6d6be5d0519b286fdd4bafade905 ]

If first_bad == this_sector when we get the WriteMostly disk
in read_balance(), valid disk will be returned with zero
max_sectors. It'll lead to a dead loop in make_request(), and
OOM will happen because of endless allocation of struct bio.

Since we can't get data from this disk in this case, so
continue for another disk.

Signed-off-by: Wei Fang &lt;fangwei1@huawei.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: submit_bio_wait() returns 0 on success</title>
<updated>2015-11-09T22:33:38+00:00</updated>
<author>
<name>Jes Sorensen</name>
<email>Jes.Sorensen@redhat.com</email>
</author>
<published>2015-10-20T16:09:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=26b70a8981b22f03f677dd569b5c8191ade40848'/>
<id>26b70a8981b22f03f677dd569b5c8191ade40848</id>
<content type='text'>
commit 203d27b0226a05202438ddb39ef0ef1acb14a759 upstream.

This was introduced with 9e882242c6193ae6f416f2d8d8db0d9126bd996b
which changed the return value of submit_bio_wait() to return != 0 on
error, but didn't update the caller accordingly.

Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md")
Reported-by: Bill Kuzeja &lt;William.Kuzeja@stratus.com&gt;
Signed-off-by: Jes Sorensen &lt;Jes.Sorensen@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 203d27b0226a05202438ddb39ef0ef1acb14a759 upstream.

This was introduced with 9e882242c6193ae6f416f2d8d8db0d9126bd996b
which changed the return value of submit_bio_wait() to return != 0 on
error, but didn't update the caller accordingly.

Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md")
Reported-by: Bill Kuzeja &lt;William.Kuzeja@stratus.com&gt;
Signed-off-by: Jes Sorensen &lt;Jes.Sorensen@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies</title>
<updated>2015-08-17T03:52:23+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-27T01:48:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=24a9fa79d5b9f3cf555c24336f56c0fcbbdd98bf'/>
<id>24a9fa79d5b9f3cf555c24336f56c0fcbbdd98bf</id>
<content type='text'>
commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream.

raid1_end_read_request() assumes that the In_sync bits are consistent
with the -&gt;degaded count.
raid1_spare_active updates the In_sync bit before the -&gt;degraded count
and so exposes an inconsistency, as does error()
So extend the spinlock in raid1_spare_active() and error() to hide those
inconsistencies.

This should probably be part of
  Commit: 34cab6f42003 ("md/raid1: fix test for 'was read error from
  last working device'.")
as it addresses the same issue.  It fixes the same bug and should go
to -stable for same reasons.

Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream.

raid1_end_read_request() assumes that the In_sync bits are consistent
with the -&gt;degaded count.
raid1_spare_active updates the In_sync bit before the -&gt;degraded count
and so exposes an inconsistency, as does error()
So extend the spinlock in raid1_spare_active() and error() to hide those
inconsistencies.

This should probably be part of
  Commit: 34cab6f42003 ("md/raid1: fix test for 'was read error from
  last working device'.")
as it addresses the same issue.  It fixes the same bug and should go
to -stable for same reasons.

Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: fix test for 'was read error from last working device'.</title>
<updated>2015-08-10T19:21:55+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-23T23:22:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4731b65de6b30b7a8e771c51e4fb4a53846f158e'/>
<id>4731b65de6b30b7a8e771c51e4fb4a53846f158e</id>
<content type='text'>
commit 34cab6f42003cb06f48f86a86652984dec338ae9 upstream.

When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that a
non-faulty device is that device.  However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync".  If -&gt;degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 34cab6f42003cb06f48f86a86652984dec338ae9 upstream.

When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that a
non-faulty device is that device.  However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync".  If -&gt;degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>md: remove 'go_faster' option from -&gt;sync_request()</title>
<updated>2015-04-21T22:00:40+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-02-19T05:04:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=09314799e4f0589e52bafcd0ca3556c60468bc0e'/>
<id>09314799e4f0589e52bafcd0ca3556c60468bc0e</id>
<content type='text'>
This option is not well justified and testing suggests that
it hardly ever makes any difference.

The comment suggests there might be a need to wait for non-resync
activity indicated by -&gt;nr_waiting, however raise_barrier()
already waits for all of that.

So just remove it to simplify reasoning about speed limiting.

This allows us to remove a 'FIXME' comment from raid5.c as that
never used the flag.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This option is not well justified and testing suggests that
it hardly ever makes any difference.

The comment suggests there might be a need to wait for non-resync
activity indicated by -&gt;nr_waiting, however raise_barrier()
already waits for all of that.

So just remove it to simplify reasoning about speed limiting.

This allows us to remove a 'FIXME' comment from raid5.c as that
never used the flag.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'cluster' into for-next</title>
<updated>2015-04-21T22:00:20+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-04-21T22:00:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d51e4fe6d68098d4361a6b6d41d8da727b1f1af4'/>
<id>d51e4fe6d68098d4361a6b6d41d8da727b1f1af4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: fix read balance when a drive is write-mostly.</title>
<updated>2015-02-25T00:37:02+00:00</updated>
<author>
<name>Tomáš Hodek</name>
<email>tomas.hodek@volny.cz</email>
</author>
<published>2015-02-23T00:00:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d1901ef099c38afd11add4cfb3312c02ef21ec4a'/>
<id>d1901ef099c38afd11add4cfb3312c02ef21ec4a</id>
<content type='text'>
When a drive is marked write-mostly it should only be the
target of reads if there is no other option.

This behaviour was broken by

commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
    md/raid1: read balance chooses idlest disk for SSD

which causes a write-mostly device to be *preferred* is some cases.

Restore correct behaviour by checking and setting
best_dist_disk and best_pending_disk rather than best_disk.

We only need to test one of these as they are both changed
from -1 or &gt;=0 at the same time.

As we leave min_pending and best_dist unchanged, any non-write-mostly
device will appear better than the write-mostly device.

Reported-by: Tomáš Hodek &lt;tomas.hodek@volny.cz&gt;
Reported-by: Dark Penguin &lt;darkpenguin@yandex.ru&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Link: http://marc.info/?l=linux-raid&amp;m=135982797322422
Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
Cc: stable@vger.kernel.org (3.6+)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a drive is marked write-mostly it should only be the
target of reads if there is no other option.

This behaviour was broken by

commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
    md/raid1: read balance chooses idlest disk for SSD

which causes a write-mostly device to be *preferred* is some cases.

Restore correct behaviour by checking and setting
best_dist_disk and best_pending_disk rather than best_disk.

We only need to test one of these as they are both changed
from -1 or &gt;=0 at the same time.

As we leave min_pending and best_dist unchanged, any non-write-mostly
device will appear better than the write-mostly device.

Reported-by: Tomáš Hodek &lt;tomas.hodek@volny.cz&gt;
Reported-by: Dark Penguin &lt;darkpenguin@yandex.ru&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Link: http://marc.info/?l=linux-raid&amp;m=135982797322422
Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
Cc: stable@vger.kernel.org (3.6+)
</pre>
</div>
</content>
</entry>
<entry>
<title>Add new disk to clustered array</title>
<updated>2015-02-23T15:59:07+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2014-10-29T23:51:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1aee41f637694d4bbf91c24195f2b63e3f6badd2'/>
<id>1aee41f637694d4bbf91c24195f2b63e3f6badd2</id>
<content type='text'>
Algorithm:
1. Node 1 issues mdadm --manage /dev/mdX --add /dev/sdYY which issues
   ioctl(ADD_NEW_DISC with disc.state set to MD_DISK_CLUSTER_ADD)
2. Node 1 sends NEWDISK with uuid and slot number
3. Other nodes issue kobject_uevent_env with uuid and slot number
(Steps 4,5 could be a udev rule)
4. In userspace, the node searches for the disk, perhaps
   using blkid -t SUB_UUID=""
5. Other nodes issue either of the following depending on whether the disk
   was found:
   ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CANDIDATE and
	 disc.number set to slot number)
   ioctl(CLUSTERED_DISK_NACK)
6. Other nodes drop lock on no-new-devs (CR) if device is found
7. Node 1 attempts EX lock on no-new-devs
8. If node 1 gets the lock, it sends METADATA_UPDATED after unmarking the disk
   as SpareLocal
9. If not (get no-new-dev lock), it fails the operation and sends METADATA_UPDATED
10. Other nodes understand if the device is added or not by reading the superblock again after receiving the METADATA_UPDATED message.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Algorithm:
1. Node 1 issues mdadm --manage /dev/mdX --add /dev/sdYY which issues
   ioctl(ADD_NEW_DISC with disc.state set to MD_DISK_CLUSTER_ADD)
2. Node 1 sends NEWDISK with uuid and slot number
3. Other nodes issue kobject_uevent_env with uuid and slot number
(Steps 4,5 could be a udev rule)
4. In userspace, the node searches for the disk, perhaps
   using blkid -t SUB_UUID=""
5. Other nodes issue either of the following depending on whether the disk
   was found:
   ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CANDIDATE and
	 disc.number set to slot number)
   ioctl(CLUSTERED_DISK_NACK)
6. Other nodes drop lock on no-new-devs (CR) if device is found
7. Node 1 attempts EX lock on no-new-devs
8. If node 1 gets the lock, it sends METADATA_UPDATED after unmarking the disk
   as SpareLocal
9. If not (get no-new-dev lock), it fails the operation and sends METADATA_UPDATED
10. Other nodes understand if the device is added or not by reading the superblock again after receiving the METADATA_UPDATED message.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Read from the first device when an area is resyncing</title>
<updated>2015-02-23T15:59:07+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2014-08-12T15:13:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7d49ffcfa3cc08aa2301bf3fdb1e423a3fd33ee7'/>
<id>7d49ffcfa3cc08aa2301bf3fdb1e423a3fd33ee7</id>
<content type='text'>
set choose_first true for cluster read in read balance when the area
is resyncing.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
set choose_first true for cluster read in read balance when the area
is resyncing.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Suspend writes in RAID1 if within range</title>
<updated>2015-02-23T15:59:07+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2014-06-07T07:39:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=589a1c491621ab81a1955d17d634636522c1b4c1'/>
<id>589a1c491621ab81a1955d17d634636522c1b4c1</id>
<content type='text'>
If there is a resync going on, all nodes must suspend writes to the
range. This is recorded in the suspend_info/suspend_list.

If there is an I/O within the ranges of any of the suspend_info,
should_suspend will return 1.

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If there is a resync going on, all nodes must suspend writes to the
range. This is recorded in the suspend_info/suspend_list.

If there is an I/O within the ranges of any of the suspend_info,
should_suspend will return 1.

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
