<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/gfs2, branch v5.8</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>gfs2: Rework read and page fault locking</title>
<updated>2020-07-07T21:40:12+00:00</updated>
<author>
<name>Andreas Gruenbacher</name>
<email>agruenba@redhat.com</email>
</author>
<published>2020-07-01T17:25:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=20f829999c38b18e3d17f9e40dea3a28f721fac4'/>
<id>20f829999c38b18e3d17f9e40dea3a28f721fac4</id>
<content type='text'>
So far, gfs2 has taken the inode glocks inside the -&gt;readpage and
-&gt;readahead address space operations.  Since commit d4388340ae0b ("fs:
convert mpage_readpages to mpage_readahead"), gfs2_readahead is passed
the pages to read ahead locked.  With that, the current holder of the
inode glock may be trying to lock one of those pages while
gfs2_readahead is trying to take the inode glock, resulting in a
deadlock.

Fix that by moving the lock taking to the higher-level -&gt;read_iter file
and -&gt;fault vm operations.  This also gets rid of an ugly lock inversion
workaround in gfs2_readpage.

The cache consistency model of filesystems like gfs2 is such that if
data is found in the page cache, the data is up to date and can be used
without taking any filesystem locks.  If a page is not cached,
filesystem locks must be taken before populating the page cache.

To avoid taking the inode glock when the data is already cached,
gfs2_file_read_iter first tries to read the data with the IOCB_NOIO flag
set.  If that fails, the inode glock is taken and the operation is
retried with the IOCB_NOIO flag cleared.

Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So far, gfs2 has taken the inode glocks inside the -&gt;readpage and
-&gt;readahead address space operations.  Since commit d4388340ae0b ("fs:
convert mpage_readpages to mpage_readahead"), gfs2_readahead is passed
the pages to read ahead locked.  With that, the current holder of the
inode glock may be trying to lock one of those pages while
gfs2_readahead is trying to take the inode glock, resulting in a
deadlock.

Fix that by moving the lock taking to the higher-level -&gt;read_iter file
and -&gt;fault vm operations.  This also gets rid of an ugly lock inversion
workaround in gfs2_readpage.

The cache consistency model of filesystems like gfs2 is such that if
data is found in the page cache, the data is up to date and can be used
without taking any filesystem locks.  If a page is not cached,
filesystem locks must be taken before populating the page cache.

To avoid taking the inode glock when the data is already cached,
gfs2_file_read_iter first tries to read the data with the IOCB_NOIO flag
set.  If that fails, the inode glock is taken and the operation is
retried with the IOCB_NOIO flag cleared.

Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: The freeze glock should never be frozen</title>
<updated>2020-07-03T10:05:35+00:00</updated>
<author>
<name>Bob Peterson</name>
<email>rpeterso@redhat.com</email>
</author>
<published>2020-06-25T19:42:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c860f8ffbea8924de05a281b937128773d30a77c'/>
<id>c860f8ffbea8924de05a281b937128773d30a77c</id>
<content type='text'>
Before this patch, some gfs2 code locked the freeze glock with LM_FLAG_NOEXP
(Do not freeze) flag, and some did not. We never want to freeze the freeze
glock, so this patch makes it consistently use LM_FLAG_NOEXP always.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before this patch, some gfs2 code locked the freeze glock with LM_FLAG_NOEXP
(Do not freeze) flag, and some did not. We never want to freeze the freeze
glock, so this patch makes it consistently use LM_FLAG_NOEXP always.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: When freezing gfs2, use GL_EXACT and not GL_NOCACHE</title>
<updated>2020-07-03T10:05:35+00:00</updated>
<author>
<name>Bob Peterson</name>
<email>rpeterso@redhat.com</email>
</author>
<published>2020-06-25T18:30:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=623ba664b74a20f22a2ef7ebd71e171d2d7c626f'/>
<id>623ba664b74a20f22a2ef7ebd71e171d2d7c626f</id>
<content type='text'>
Before this patch, the freeze code in gfs2 specified GL_NOCACHE in
several places. That's wrong because we always want to know the state
of whether the file system is frozen.

There was also a problem with freeze/thaw transitioning the glock from
frozen (EX) to thawed (SH) because gfs2 will normally grant glocks in EX
to processes that request it in SH mode, unless GL_EXACT is specified.
Therefore, the freeze/thaw code, which tried to reacquire the glock in
SH mode would get the glock in EX mode, and miss the transition from EX
to SH. That made it think the thaw had completed normally, but since the
glock was still cached in EX, other nodes could not freeze again.

This patch removes the GL_NOCACHE flag to allow the freeze glock to be
cached. It also adds the GL_EXACT flag so the glock is fully transitioned
from EX to SH, thereby allowing future freeze operations.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before this patch, the freeze code in gfs2 specified GL_NOCACHE in
several places. That's wrong because we always want to know the state
of whether the file system is frozen.

There was also a problem with freeze/thaw transitioning the glock from
frozen (EX) to thawed (SH) because gfs2 will normally grant glocks in EX
to processes that request it in SH mode, unless GL_EXACT is specified.
Therefore, the freeze/thaw code, which tried to reacquire the glock in
SH mode would get the glock in EX mode, and miss the transition from EX
to SH. That made it think the thaw had completed normally, but since the
glock was still cached in EX, other nodes could not freeze again.

This patch removes the GL_NOCACHE flag to allow the freeze glock to be
cached. It also adds the GL_EXACT flag so the glock is fully transitioned
from EX to SH, thereby allowing future freeze operations.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: read-only mounts should grab the sd_freeze_gl glock</title>
<updated>2020-07-03T10:05:35+00:00</updated>
<author>
<name>Bob Peterson</name>
<email>rpeterso@redhat.com</email>
</author>
<published>2020-06-25T18:30:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b780cc615ba4795a7ef0e93b19424828a5ad456a'/>
<id>b780cc615ba4795a7ef0e93b19424828a5ad456a</id>
<content type='text'>
Before this patch, only read-write mounts would grab the freeze
glock in read-only mode, as part of gfs2_make_fs_rw. So the freeze
glock was never initialized. That meant requests to freeze, which
request the glock in EX, were granted without any state transition.
That meant you could mount a gfs2 file system, which is currently
frozen on a different cluster node, in read-only mode.

This patch makes read-only mounts lock the freeze glock in SH mode,
which will block for file systems that are frozen on another node.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before this patch, only read-write mounts would grab the freeze
glock in read-only mode, as part of gfs2_make_fs_rw. So the freeze
glock was never initialized. That meant requests to freeze, which
request the glock in EX, were granted without any state transition.
That meant you could mount a gfs2 file system, which is currently
frozen on a different cluster node, in read-only mode.

This patch makes read-only mounts lock the freeze glock in SH mode,
which will block for file systems that are frozen on another node.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: freeze should work on read-only mounts</title>
<updated>2020-07-03T10:05:35+00:00</updated>
<author>
<name>Bob Peterson</name>
<email>rpeterso@redhat.com</email>
</author>
<published>2020-06-25T18:29:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=541656d3a5136ae830d604e237f29f406d42c592'/>
<id>541656d3a5136ae830d604e237f29f406d42c592</id>
<content type='text'>
Before this patch, function freeze_go_sync, called when promoting
the freeze glock, was testing for the SDF_JOURNAL_LIVE superblock flag.
That's only set for read-write mounts. Read-only mounts don't use a
journal, so the bit is never set, so the freeze never happened.

This patch removes the check for SDF_JOURNAL_LIVE for freeze requests
but still checks it when deciding whether to flush a journal.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before this patch, function freeze_go_sync, called when promoting
the freeze glock, was testing for the SDF_JOURNAL_LIVE superblock flag.
That's only set for read-write mounts. Read-only mounts don't use a
journal, so the bit is never set, so the freeze never happened.

This patch removes the check for SDF_JOURNAL_LIVE for freeze requests
but still checks it when deciding whether to flush a journal.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: eliminate GIF_ORDERED in favor of list_empty</title>
<updated>2020-07-03T10:05:34+00:00</updated>
<author>
<name>Bob Peterson</name>
<email>rpeterso@redhat.com</email>
</author>
<published>2020-06-17T12:47:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7542486b89b2e321ffe0de82163b425d6a38bc72'/>
<id>7542486b89b2e321ffe0de82163b425d6a38bc72</id>
<content type='text'>
In several places, we used the GIF_ORDERED inode flag to determine
if an inode was on the ordered writes list. However, since we always
held the sd_ordered_lock spin_lock during the manipulation, we can
just as easily check list_empty(&amp;ip-&gt;i_ordered) instead.
This allows us to keep more than one ordered writes list to make
journal writing improvements.

This patch eliminates GIF_ORDERED in favor of checking list_empty.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In several places, we used the GIF_ORDERED inode flag to determine
if an inode was on the ordered writes list. However, since we always
held the sd_ordered_lock spin_lock during the manipulation, we can
just as easily check list_empty(&amp;ip-&gt;i_ordered) instead.
This allows us to keep more than one ordered writes list to make
journal writing improvements.

This patch eliminates GIF_ORDERED in favor of checking list_empty.

Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: Don't sleep during glock hash walk</title>
<updated>2020-06-30T11:04:45+00:00</updated>
<author>
<name>Andreas Gruenbacher</name>
<email>agruenba@redhat.com</email>
</author>
<published>2020-06-10T16:31:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=34244d711dea568f4a42c5b0d6b3d620f8cb6971'/>
<id>34244d711dea568f4a42c5b0d6b3d620f8cb6971</id>
<content type='text'>
In flush_delete_work, instead of flushing each individual pending
delayed work item, cancel and re-queue them for immediate execution.
The waiting isn't needed here because we're already waiting for all
queued work items to complete in gfs2_flush_delete_work.  This makes the
code more efficient, but more importantly, it avoids sleeping during a
rhashtable walk, inside rcu_read_lock().

Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In flush_delete_work, instead of flushing each individual pending
delayed work item, cancel and re-queue them for immediate execution.
The waiting isn't needed here because we're already waiting for all
queued work items to complete in gfs2_flush_delete_work.  This makes the
code more efficient, but more importantly, it avoids sleeping during a
rhashtable walk, inside rcu_read_lock().

Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: fix trans slab error when withdraw occurs inside log_flush</title>
<updated>2020-06-30T11:04:45+00:00</updated>
<author>
<name>Bob Peterson</name>
<email>rpeterso@redhat.com</email>
</author>
<published>2020-06-09T13:55:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=58e08e8d83ab03a1ca25d53420bd0b87f2dfe458'/>
<id>58e08e8d83ab03a1ca25d53420bd0b87f2dfe458</id>
<content type='text'>
Log flush operations (gfs2_log_flush()) can target a specific transaction.
But if the function encounters errors (e.g. io errors) and withdraws,
the transaction was only freed it if was queued to one of the ail lists.
If the withdraw occurred before the transaction was queued to the ail1
list, function ail_drain never freed it. The result was:

BUG gfs2_trans: Objects remaining in gfs2_trans on __kmem_cache_shutdown()

This patch makes log_flush() add the targeted transaction to the ail1
list so that function ail_drain() will find and free it properly.

Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Log flush operations (gfs2_log_flush()) can target a specific transaction.
But if the function encounters errors (e.g. io errors) and withdraws,
the transaction was only freed it if was queued to one of the ail lists.
If the withdraw occurred before the transaction was queued to the ail1
list, function ail_drain never freed it. The result was:

BUG gfs2_trans: Objects remaining in gfs2_trans on __kmem_cache_shutdown()

This patch makes log_flush() add the targeted transaction to the ail1
list so that function ail_drain() will find and free it properly.

Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Bob Peterson &lt;rpeterso@redhat.com&gt;
Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gfs2: Don't return NULL from gfs2_inode_lookup</title>
<updated>2020-06-30T11:04:45+00:00</updated>
<author>
<name>Andreas Gruenbacher</name>
<email>agruenba@redhat.com</email>
</author>
<published>2020-06-09T12:33:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5902f4dd6e666c4d160b2f5c4505f7e58642d2bf'/>
<id>5902f4dd6e666c4d160b2f5c4505f7e58642d2bf</id>
<content type='text'>
Callers expect gfs2_inode_lookup to return an inode pointer or ERR_PTR(error).
Commit b66648ad6dcf caused it to return NULL instead of ERR_PTR(-ESTALE) in
some cases.  Fix that.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Fixes: b66648ad6dcf ("gfs2: Move inode generation number check into gfs2_inode_lookup")
Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Callers expect gfs2_inode_lookup to return an inode pointer or ERR_PTR(error).
Commit b66648ad6dcf caused it to return NULL instead of ERR_PTR(-ESTALE) in
some cases.  Fix that.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Fixes: b66648ad6dcf ("gfs2: Move inode generation number check into gfs2_inode_lookup")
Signed-off-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2</title>
<updated>2020-06-08T19:47:09+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-08T19:47:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ca687877e05ad1bf5b4cefd9cdd091044626deac'/>
<id>ca687877e05ad1bf5b4cefd9cdd091044626deac</id>
<content type='text'>
Pull gfs2 updates from Andreas Gruenbacher:

 - An iopen glock locking scheme rework that speeds up deletes of inodes
   accessed from multiple nodes

 - Various bug fixes and debugging improvements

 - Convert gfs2-glocks.txt to ReST

* tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: fix use-after-free on transaction ail lists
  gfs2: new slab for transactions
  gfs2: initialize transaction tr_ailX_lists earlier
  gfs2: Smarter iopen glock waiting
  gfs2: Wake up when setting GLF_DEMOTE
  gfs2: Check inode generation number in delete_work_func
  gfs2: Move inode generation number check into gfs2_inode_lookup
  gfs2: Minor gfs2_lookup_by_inum cleanup
  gfs2: Try harder to delete inodes locally
  gfs2: Give up the iopen glock on contention
  gfs2: Turn gl_delete into a delayed work
  gfs2: Keep track of deleted inode generations in LVBs
  gfs2: Allow ASPACE glocks to also have an lvb
  gfs2: instrumentation wrt log_flush stuck
  gfs2: introduce new gfs2_glock_assert_withdraw
  gfs2: print mapping-&gt;nrpages in glock dump for address space glocks
  gfs2: Only do glock put in gfs2_create_inode for free inodes
  gfs2: Allow lock_nolock mount to specify jid=X
  gfs2: Don't ignore inode write errors during inode_go_sync
  docs: filesystems: convert gfs2-glocks.txt to ReST
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull gfs2 updates from Andreas Gruenbacher:

 - An iopen glock locking scheme rework that speeds up deletes of inodes
   accessed from multiple nodes

 - Various bug fixes and debugging improvements

 - Convert gfs2-glocks.txt to ReST

* tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: fix use-after-free on transaction ail lists
  gfs2: new slab for transactions
  gfs2: initialize transaction tr_ailX_lists earlier
  gfs2: Smarter iopen glock waiting
  gfs2: Wake up when setting GLF_DEMOTE
  gfs2: Check inode generation number in delete_work_func
  gfs2: Move inode generation number check into gfs2_inode_lookup
  gfs2: Minor gfs2_lookup_by_inum cleanup
  gfs2: Try harder to delete inodes locally
  gfs2: Give up the iopen glock on contention
  gfs2: Turn gl_delete into a delayed work
  gfs2: Keep track of deleted inode generations in LVBs
  gfs2: Allow ASPACE glocks to also have an lvb
  gfs2: instrumentation wrt log_flush stuck
  gfs2: introduce new gfs2_glock_assert_withdraw
  gfs2: print mapping-&gt;nrpages in glock dump for address space glocks
  gfs2: Only do glock put in gfs2_create_inode for free inodes
  gfs2: Allow lock_nolock mount to specify jid=X
  gfs2: Don't ignore inode write errors during inode_go_sync
  docs: filesystems: convert gfs2-glocks.txt to ReST
</pre>
</div>
</content>
</entry>
</feed>
