<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md, branch v4.2-rc4</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'md/4.2-fixes' of git://neil.brown.name/md</title>
<updated>2015-07-25T18:24:58+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-07-25T18:24:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=aca105a697bf08208909225a66198277e51b4f65'/>
<id>aca105a697bf08208909225a66198277e51b4f65</id>
<content type='text'>
Pull md fixes from Neil Brown:
 "Some md fixes for 4.2

  Several are tagged for -stable.
  A few aren't because they are not very, serious or because they are in
  the 'experimental' cluster code"

* tag 'md/4.2-fixes' of git://neil.brown.name/md:
  md/raid5: clear R5_NeedReplace when no longer needed.
  Fix read-balancing during node failure
  md-cluster: fix bitmap sub-offset in bitmap_read_sb
  md: Return error if request_module fails and returns positive value
  md: Skip cluster setup in case of error while reading bitmap
  md/raid1: fix test for 'was read error from last working device'.
  md: Skip cluster setup for dm-raid
  md: flush -&gt;event_work before stopping array.
  md/raid10: always set reshape_safe when initializing reshape_position.
  md/raid5: avoid races when changing cache size.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull md fixes from Neil Brown:
 "Some md fixes for 4.2

  Several are tagged for -stable.
  A few aren't because they are not very, serious or because they are in
  the 'experimental' cluster code"

* tag 'md/4.2-fixes' of git://neil.brown.name/md:
  md/raid5: clear R5_NeedReplace when no longer needed.
  Fix read-balancing during node failure
  md-cluster: fix bitmap sub-offset in bitmap_read_sb
  md: Return error if request_module fails and returns positive value
  md: Skip cluster setup in case of error while reading bitmap
  md/raid1: fix test for 'was read error from last working device'.
  md: Skip cluster setup for dm-raid
  md: flush -&gt;event_work before stopping array.
  md/raid10: always set reshape_safe when initializing reshape_position.
  md/raid5: avoid races when changing cache size.
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5: clear R5_NeedReplace when no longer needed.</title>
<updated>2015-07-24T03:38:04+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-17T03:26:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e6030cb06c40e4ab4e8c712f13f494a09638ed2c'/>
<id>e6030cb06c40e4ab4e8c712f13f494a09638ed2c</id>
<content type='text'>
This flag is currently never cleared, which can in rare cases
trigger a warn-on if it is still set but the block isn't
InSync.

So clear it when it isn't need, which includes if the replacement
device has failed.

Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This flag is currently never cleared, which can in rare cases
trigger a warn-on if it is still set but the block isn't
InSync.

So clear it when it isn't need, which includes if the replacement
device has failed.

Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix read-balancing during node failure</title>
<updated>2015-07-24T03:37:59+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2015-06-24T14:30:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=90382ed9afeafd42ef193f0eadc6b2a252d6c24d'/>
<id>90382ed9afeafd42ef193f0eadc6b2a252d6c24d</id>
<content type='text'>
During a node failure, We need to suspend read balancing so that the
reads are directed to the first device and stale data is not read.
Suspending writes is not required because these would be recorded and
synced eventually.

A new flag MD_CLUSTER_SUSPEND_READ_BALANCING is set in recover_prep().
area_resyncing() will respond true for the entire devices if this
flag is set and the request type is READ. The flag is cleared
in recover_done().

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Reported-By: David Teigland &lt;teigland@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During a node failure, We need to suspend read balancing so that the
reads are directed to the first device and stale data is not read.
Suspending writes is not required because these would be recorded and
synced eventually.

A new flag MD_CLUSTER_SUSPEND_READ_BALANCING is set in recover_prep().
area_resyncing() will respond true for the entire devices if this
flag is set and the request type is READ. The flag is cleared
in recover_done().

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Reported-By: David Teigland &lt;teigland@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md-cluster: fix bitmap sub-offset in bitmap_read_sb</title>
<updated>2015-07-24T03:37:55+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.de</email>
</author>
<published>2015-07-01T02:19:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=33e38ac6887d975fe2635c7fcaefb6d5495cb2e1'/>
<id>33e38ac6887d975fe2635c7fcaefb6d5495cb2e1</id>
<content type='text'>
bitmap_read_sb is modifying mddev-&gt;bitmap_info.offset. This works for
the first bitmap read. However, when multiple bitmaps need to be opened
by the same node, it ends up corrupting the offset. Fix it by using a
local variable.

Also, bitmap_read_sb is not required in bitmap_copy_from_slot since
it is called in bitmap_create. Remove bitmap_read_sb().

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bitmap_read_sb is modifying mddev-&gt;bitmap_info.offset. This works for
the first bitmap read. However, when multiple bitmaps need to be opened
by the same node, it ends up corrupting the offset. Fix it by using a
local variable.

Also, bitmap_read_sb is not required in bitmap_copy_from_slot since
it is called in bitmap_create. Remove bitmap_read_sb().

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Return error if request_module fails and returns positive value</title>
<updated>2015-07-24T03:37:51+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2015-07-22T17:09:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b0c26a79d6993b280931f8e2b406ca4b220bb58f'/>
<id>b0c26a79d6993b280931f8e2b406ca4b220bb58f</id>
<content type='text'>
request_module() can return 256 (process exited) in some cases,
which is not as specified in the documentation before the
request_module() definition. Convert the error to -ENOENT.

The positive error number results in bitmap_create() returning
a value that is meant to be an error but doesn't look like one,
so it is dereferenced as a point and causes a crash.

(not needed for stable as this is "experimental" code)
Fixes: edb39c9deda8 ("Introduce md_cluster_operations to handle cluster functions")
Signed-off-By: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
request_module() can return 256 (process exited) in some cases,
which is not as specified in the documentation before the
request_module() definition. Convert the error to -ENOENT.

The positive error number results in bitmap_create() returning
a value that is meant to be an error but doesn't look like one,
so it is dereferenced as a point and causes a crash.

(not needed for stable as this is "experimental" code)
Fixes: edb39c9deda8 ("Introduce md_cluster_operations to handle cluster functions")
Signed-off-By: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Skip cluster setup in case of error while reading bitmap</title>
<updated>2015-07-24T03:37:48+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2015-07-22T17:09:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f7357273198adc86fe11c2a7be8a0816f44103bb'/>
<id>f7357273198adc86fe11c2a7be8a0816f44103bb</id>
<content type='text'>
If the bitmap read fails, the error code set is -EINVAL. However,
we don't check for errors and go ahead with cluster_setup.
Skip the cluster setup in case of error.

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the bitmap read fails, the error code set is -EINVAL. However,
we don't check for errors and go ahead with cluster_setup.
Skip the cluster setup in case of error.

Signed-off-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: fix test for 'was read error from last working device'.</title>
<updated>2015-07-24T03:37:21+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-23T23:22:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=34cab6f42003cb06f48f86a86652984dec338ae9'/>
<id>34cab6f42003cb06f48f86a86652984dec338ae9</id>
<content type='text'>
When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that a
non-faulty device is that device.  However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync".  If -&gt;degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Cc: stable@vger.kernel.org (v3.0+)
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that a
non-faulty device is that device.  However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync".  If -&gt;degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Cc: stable@vger.kernel.org (v3.0+)
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Skip cluster setup for dm-raid</title>
<updated>2015-07-22T23:22:00+00:00</updated>
<author>
<name>Goldwyn Rodrigues</name>
<email>rgoldwyn@suse.com</email>
</author>
<published>2015-07-22T17:09:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d3b178adb3a3adf54ecf77758138b654c3ee7f09'/>
<id>d3b178adb3a3adf54ecf77758138b654c3ee7f09</id>
<content type='text'>
There is a bug that the bitmap superblock isn't initialised properly for
dm-raid, so a new field can have garbage in new fields.
(dm-raid does initialisation in the kernel - md initialised the
 superblock in mdadm).

This means that for dm-raid we cannot currently trust the new -&gt;nodes
field. So:
 - use __GFP_ZERO to initialise the superblock properly for all new
    arrays
 - initialise all fields in bitmap_info in bitmap_new_disk_sb
 - ignore -&gt;nodes for dm arrays (yes, this is a hack)

This bug exposes dm-raid to bug in the (still experimental) md-cluster
code, so it is suitable for -stable.  It does cause crashes.

References: https://bugzilla.kernel.org/show_bug.cgi?id=100491
Cc: stable@vger.kernel.org (v4.1)
Signed-off-By: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a bug that the bitmap superblock isn't initialised properly for
dm-raid, so a new field can have garbage in new fields.
(dm-raid does initialisation in the kernel - md initialised the
 superblock in mdadm).

This means that for dm-raid we cannot currently trust the new -&gt;nodes
field. So:
 - use __GFP_ZERO to initialise the superblock properly for all new
    arrays
 - initialise all fields in bitmap_info in bitmap_new_disk_sb
 - ignore -&gt;nodes for dm arrays (yes, this is a hack)

This bug exposes dm-raid to bug in the (still experimental) md-cluster
code, so it is suitable for -stable.  It does cause crashes.

References: https://bugzilla.kernel.org/show_bug.cgi?id=100491
Cc: stable@vger.kernel.org (v4.1)
Signed-off-By: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: flush -&gt;event_work before stopping array.</title>
<updated>2015-07-22T04:09:29+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-22T00:20:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ee5d004fd0591536a061451eba2b187092e9127c'/>
<id>ee5d004fd0591536a061451eba2b187092e9127c</id>
<content type='text'>
The 'event_work' worker used by dm-raid may still be running
when the array is stopped.  This can result in an oops.

So flush the workqueue on which it is run after detaching
and before destroying the device.

Reported-by: Heinz Mauelshagen &lt;heinzm@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Cc: stable@vger.kernel.org (2.6.38+ please delay 2 weeks after -final release)
Fixes: 9d09e663d550 ("dm: raid456 basic support")
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The 'event_work' worker used by dm-raid may still be running
when the array is stopped.  This can result in an oops.

So flush the workqueue on which it is run after detaching
and before destroying the device.

Reported-by: Heinz Mauelshagen &lt;heinzm@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Cc: stable@vger.kernel.org (2.6.38+ please delay 2 weeks after -final release)
Fixes: 9d09e663d550 ("dm: raid456 basic support")
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid10: always set reshape_safe when initializing reshape_position.</title>
<updated>2015-07-22T04:08:24+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2015-07-06T07:37:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=299b0685e31c9f3dcc2d58ee3beca761a40b44b3'/>
<id>299b0685e31c9f3dcc2d58ee3beca761a40b44b3</id>
<content type='text'>
'reshape_position' tracks where in the reshape we have reached.
'reshape_safe' tracks where in the reshape we have safely recorded
in the metadata.

These are compared to determine when to update the metadata.
So it is important that reshape_safe is initialised properly.
Currently it isn't.  When starting a reshape from the beginning
it usually has the correct value by luck.  But when reducing the
number of devices in a RAID10, it has the wrong value and this leads
to the metadata not being updated correctly.
This can lead to corruption if the reshape is not allowed to complete.

This patch is suitable for any -stable kernel which supports RAID10
reshape, which is 3.5 and later.

Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support")
Cc: stable@vger.kernel.org (v3.5+ please wait for -final to be out for 2 weeks)
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'reshape_position' tracks where in the reshape we have reached.
'reshape_safe' tracks where in the reshape we have safely recorded
in the metadata.

These are compared to determine when to update the metadata.
So it is important that reshape_safe is initialised properly.
Currently it isn't.  When starting a reshape from the beginning
it usually has the correct value by luck.  But when reducing the
number of devices in a RAID10, it has the wrong value and this leads
to the metadata not being updated correctly.
This can lead to corruption if the reshape is not allowed to complete.

This patch is suitable for any -stable kernel which supports RAID10
reshape, which is 3.5 and later.

Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support")
Cc: stable@vger.kernel.org (v3.5+ please wait for -final to be out for 2 weeks)
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
