<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/raid1.c, branch v4.1</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>md: remove 'go_faster' option from -&gt;sync_request()</title>
<updated>2015-04-21T22:00:40+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-02-19T05:04:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=09314799e4f0589e52bafcd0ca3556c60468bc0e'/>
<id>09314799e4f0589e52bafcd0ca3556c60468bc0e</id>
<content type='text'>
This option is not well justified and testing suggests that
it hardly ever makes any difference.

The comment suggests there might be a need to wait for non-resync
activity indicated by -&gt;nr_waiting, however raise_barrier()
already waits for all of that.

So just remove it to simplify reasoning about speed limiting.

This allows us to remove a 'FIXME' comment from raid5.c as that
never used the flag.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This option is not well justified and testing suggests that
it hardly ever makes any difference.

The comment suggests there might be a need to wait for non-resync
activity indicated by -&gt;nr_waiting, however raise_barrier()
already waits for all of that.

So just remove it to simplify reasoning about speed limiting.

This allows us to remove a 'FIXME' comment from raid5.c as that
never used the flag.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'cluster' into for-next</title>
<updated>2015-04-21T22:00:20+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-04-21T22:00:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d51e4fe6d68098d4361a6b6d41d8da727b1f1af4'/>
<id>d51e4fe6d68098d4361a6b6d41d8da727b1f1af4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: fix read balance when a drive is write-mostly.</title>
<updated>2015-02-25T00:37:02+00:00</updated>
<author>
<name>Tomáš Hodek</name>
<email>tomas.hodek@volny.cz</email>
</author>
<published>2015-02-23T00:00:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d1901ef099c38afd11add4cfb3312c02ef21ec4a'/>
<id>d1901ef099c38afd11add4cfb3312c02ef21ec4a</id>
<content type='text'>
When a drive is marked write-mostly it should only be the
target of reads if there is no other option.

This behaviour was broken by

commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
    md/raid1: read balance chooses idlest disk for SSD

which causes a write-mostly device to be *preferred* in some cases.

Restore correct behaviour by checking and setting
best_dist_disk and best_pending_disk rather than best_disk.

We only need to test one of these as they are both changed
from -1 to &gt;=0 at the same time.

As we leave min_pending and best_dist unchanged, any non-write-mostly
device will appear better than the write-mostly device.

Reported-by: Tomáš Hodek &lt;tomas.hodek@volny.cz&gt;
Reported-by: Dark Penguin &lt;darkpenguin@yandex.ru&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Link: http://marc.info/?l=linux-raid&amp;m=135982797322422
Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
Cc: stable@vger.kernel.org (3.6+)
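
The selection rule described above can be sketched as follows. This is a
simplified, hypothetical model and not the kernel code: the Disk type and
choose_read_disk() are illustrative names, and only the head-distance
criterion (best_dist / best_dist_disk) is modelled.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    head_position: int          # last sector the disk head was at
    write_mostly: bool = False  # True if reads should avoid this disk


def choose_read_disk(disks, sector):
    """Return the index of the disk to read from, or -1 if there is none."""
    best_dist_disk = -1
    best_dist = None  # None means no distance has been recorded yet
    for index, disk in enumerate(disks):
        if disk.write_mostly:
            # Remember the first write-mostly disk as a fallback only.
            # best_dist is left unchanged, so any normal disk seen later
            # still looks better and replaces the fallback.
            if best_dist_disk == -1:
                best_dist_disk = index
            continue
        dist = abs(sector - disk.head_position)
        if best_dist is None or best_dist > dist:
            best_dist = dist
            best_dist_disk = index
    return best_dist_disk
```

With one write-mostly disk and one normal disk, the normal disk is chosen
even when its head is further away; the write-mostly disk is returned only
when every candidate is write-mostly.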
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a drive is marked write-mostly it should only be the
target of reads if there is no other option.

This behaviour was broken by

commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
    md/raid1: read balance chooses idlest disk for SSD

which causes a write-mostly device to be *preferred* in some cases.

Restore correct behaviour by checking and setting
best_dist_disk and best_pending_disk rather than best_disk.

We only need to test one of these as they are both changed
from -1 to &gt;=0 at the same time.

As we leave min_pending and best_dist unchanged, any non-write-mostly
device will appear better than the write-mostly device.

Reported-by: Tomáš Hodek &lt;tomas.hodek@volny.cz&gt;
Reported-by: Dark Penguin &lt;darkpenguin@yandex.ru&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Link: http://marc.info/?l=linux-raid&amp;m=135982797322422
Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
Cc: stable@vger.kernel.org (3.6+)
</pre>
</div>
</content>
</entry>
<entry>
<title>Add new disk to clustered array</title>
<updated>2015-02-23T15:59:07+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2014-10-29T23:51:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1aee41f637694d4bbf91c24195f2b63e3f6badd2'/>
<id>1aee41f637694d4bbf91c24195f2b63e3f6badd2</id>
<content type='text'>
Algorithm:
1. Node 1 issues mdadm --manage /dev/mdX --add /dev/sdYY which issues
   ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CLUSTER_ADD)
2. Node 1 sends NEWDISK with uuid and slot number
3. Other nodes issue kobject_uevent_env with uuid and slot number
(Steps 4,5 could be a udev rule)
4. In userspace, the node searches for the disk, perhaps
   using blkid -t SUB_UUID=""
5. Other nodes issue either of the following depending on whether the disk
   was found:
   ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CANDIDATE and
	 disc.number set to slot number)
   ioctl(CLUSTERED_DISK_NACK)
6. Other nodes drop lock on no-new-devs (CR) if device is found
7. Node 1 attempts EX lock on no-new-devs
8. If node 1 gets the lock, it sends METADATA_UPDATED after unmarking the disk
   as SpareLocal
9. If node 1 does not get the no-new-devs lock, it fails the operation and
   sends METADATA_UPDATED
10. Other nodes determine whether the device was added by re-reading the
    superblock after receiving the METADATA_UPDATED message.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Algorithm:
1. Node 1 issues mdadm --manage /dev/mdX --add /dev/sdYY which issues
   ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CLUSTER_ADD)
2. Node 1 sends NEWDISK with uuid and slot number
3. Other nodes issue kobject_uevent_env with uuid and slot number
(Steps 4,5 could be a udev rule)
4. In userspace, the node searches for the disk, perhaps
   using blkid -t SUB_UUID=""
5. Other nodes issue either of the following depending on whether the disk
   was found:
   ioctl(ADD_NEW_DISK with disc.state set to MD_DISK_CANDIDATE and
	 disc.number set to slot number)
   ioctl(CLUSTERED_DISK_NACK)
6. Other nodes drop lock on no-new-devs (CR) if device is found
7. Node 1 attempts EX lock on no-new-devs
8. If node 1 gets the lock, it sends METADATA_UPDATED after unmarking the disk
   as SpareLocal
9. If node 1 does not get the no-new-devs lock, it fails the operation and
   sends METADATA_UPDATED
10. Other nodes determine whether the device was added by re-reading the
    superblock after receiving the METADATA_UPDATED message.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Read from the first device when an area is resyncing</title>
<updated>2015-02-23T15:59:07+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2014-08-12T15:13:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7d49ffcfa3cc08aa2301bf3fdb1e423a3fd33ee7'/>
<id>7d49ffcfa3cc08aa2301bf3fdb1e423a3fd33ee7</id>
<content type='text'>
Set choose_first to true for cluster reads in read balance when the area
is resyncing.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Set choose_first to true for cluster reads in read balance when the area
is resyncing.

Signed-off-by: Lidong Zhong &lt;lzhong@suse.com&gt;
Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Suspend writes in RAID1 if within range</title>
<updated>2015-02-23T15:59:07+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2014-06-07T07:39:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=589a1c491621ab81a1955d17d634636522c1b4c1'/>
<id>589a1c491621ab81a1955d17d634636522c1b4c1</id>
<content type='text'>
If there is a resync going on, all nodes must suspend writes to the
range. This is recorded in the suspend_info/suspend_list.

If there is an I/O within the ranges of any of the suspend_info,
should_suspend will return 1.

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
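
The range check described above can be sketched as follows. This is a
minimal illustration, not the kernel implementation: it assumes that each
suspend_info entry is a half-open (lo, hi) sector range and that the I/O
covers the half-open range from start to end. Only the name should_suspend
and its return value of 1 come from the message; the list representation is
hypothetical.

```python
def should_suspend(suspend_list, start, end):
    """Return 1 if the I/O overlaps any suspended range, else 0."""
    for lo, hi in suspend_list:
        # Two half-open ranges overlap when each one starts before the
        # other one ends.
        if hi > start and end > lo:
            return 1
    return 0
```

For example, with a single suspended range (100, 200), a write to sectors
150..160 would be held back, while a write starting at sector 200 would
proceed.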
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If there is a resync going on, all nodes must suspend writes to the
range. This is recorded in the suspend_info/suspend_list.

If there is an I/O within the ranges of any of the suspend_info,
should_suspend will return 1.

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: round up to bdev_logical_block_size in narrow_write_error</title>
<updated>2015-02-16T03:49:26+00:00</updated>
<author>
<name>Nate Dailey</name>
<email>nate.dailey@stratus.com</email>
</author>
<published>2015-02-12T17:02:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ab713cdc6f70da62c254c4acf77a0cfcda87b7f5'/>
<id>ab713cdc6f70da62c254c4acf77a0cfcda87b7f5</id>
<content type='text'>
This modifies raid1's narrow_write_error to round up block_sectors to the
device's logical block size.

This prevents sd complaining about "Bad block number requested" for non-512-byte
sector disks.

Signed-off-by: Nate Dailey &lt;nate.dailey@stratus.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
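
The rounding described above can be illustrated with a small sketch. It
assumes block_sectors is counted in 512-byte sectors and that the logical
block size is a multiple of 512 bytes; roundup() and narrow_block_sectors()
are illustrative helpers, not the kernel functions.

```python
SECTOR_SIZE = 512  # bytes per sector, as block_sectors is assumed to count


def roundup(value, multiple):
    """Round value up to the nearest multiple of 'multiple'."""
    return ((value + multiple - 1) // multiple) * multiple


def narrow_block_sectors(block_sectors, logical_block_size):
    """Round block_sectors up to a whole number of logical blocks."""
    sectors_per_block = logical_block_size // SECTOR_SIZE
    return roundup(block_sectors, sectors_per_block)
```

On a disk with 4096-byte logical blocks, a retry size of 1 sector becomes
8 sectors, so every retried write covers whole logical blocks. On a
512-byte-sector disk the value is unchanged.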
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This modifies raid1's narrow_write_error to round up block_sectors to the
device's logical block size.

This prevents sd complaining about "Bad block number requested" for non-512-byte
sector disks.

Signed-off-by: Nate Dailey &lt;nate.dailey@stratus.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: rename -&gt;stop to -&gt;free</title>
<updated>2015-02-03T21:35:52+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-12-15T01:56:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=afa0f557cb15176570a18fb2a093e348a793afd4'/>
<id>afa0f557cb15176570a18fb2a093e348a793afd4</id>
<content type='text'>
Now that the -&gt;stop function only frees the private data,
rename it accordingly.

Also pass in the private pointer as an arg rather than using
mddev-&gt;private.  This flexibility will be useful in level_store().

Finally, don't clear -&gt;private.  It doesn't make sense to clear
it, seeing that it isn't what we free, and it is no longer necessary
to clear -&gt;private (it was some time ago, before -&gt;to_remove was
introduced).

Setting -&gt;to_remove in -&gt;free() is a bit of a wart, but not a
big problem at the moment.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the -&gt;stop function only frees the private data,
rename it accordingly.

Also pass in the private pointer as an arg rather than using
mddev-&gt;private.  This flexibility will be useful in level_store().

Finally, don't clear -&gt;private.  It doesn't make sense to clear
it, seeing that it isn't what we free, and it is no longer necessary
to clear -&gt;private (it was some time ago, before -&gt;to_remove was
introduced).

Setting -&gt;to_remove in -&gt;free() is a bit of a wart, but not a
big problem at the moment.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: split detach operation out from -&gt;stop.</title>
<updated>2015-02-03T21:35:52+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-12-15T01:56:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5aa61f427e4979be733e4847b9199ff9cc48a47e'/>
<id>5aa61f427e4979be733e4847b9199ff9cc48a47e</id>
<content type='text'>
Each md personality has a 'stop' operation which does two
things:
 1/ it finalizes some aspects of the array to ensure nothing
    is accessing the -&gt;private data
 2/ it frees the -&gt;private data.

All the steps in '1' can apply to all arrays and so can be
performed in common code.

This is useful as in the case where we change the personality which
manages an array (in level_store()), it would be helpful to do
step 1 early, and step 2 later.

So split the 'step 1' functionality out into a new mddev_detach().

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Each md personality has a 'stop' operation which does two
things:
 1/ it finalizes some aspects of the array to ensure nothing
    is accessing the -&gt;private data
 2/ it frees the -&gt;private data.

All the steps in '1' can apply to all arrays and so can be
performed in common code.

This is useful as in the case where we change the personality which
manages an array (in level_store()), it would be helpful to do
step 1 early, and step 2 later.

So split the 'step 1' functionality out into a new mddev_detach().

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: make merge_bvec_fn more robust in face of personality changes.</title>
<updated>2015-02-03T21:35:52+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-12-15T01:56:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=64590f45ddc7147fa1968147a1f5b5c436b728fe'/>
<id>64590f45ddc7147fa1968147a1f5b5c436b728fe</id>
<content type='text'>
There is no locking around calls to merge_bvec_fn(), so
it is possible that calls which coincide with a level (or personality)
change could go wrong.

So create a central dispatch point for these functions and use
rcu_read_lock().
If the array is suspended, reject any merge that can be rejected.
If not, we know it is safe to call the function.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no locking around calls to merge_bvec_fn(), so
it is possible that calls which coincide with a level (or personality)
change could go wrong.

So create a central dispatch point for these functions and use
rcu_read_lock().
If the array is suspended, reject any merge that can be rejected.
If not, we know it is safe to call the function.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
