<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/btrfs/async-thread.c, branch v5.1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>btrfs: simplify workqueue name when allocating</title>
<updated>2019-02-25T13:13:24+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.com</email>
</author>
<published>2019-01-17T16:15:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ce3ded1061c8c610c304ac797c3b4680e5b8a5f1'/>
<id>ce3ded1061c8c610c304ac797c3b4680e5b8a5f1</id>
<content type='text'>
The workqueue name is constructed from a format string but the prefix
does not need to be set by %s.

Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The workqueue name is constructed from a format string but the prefix
does not need to be set by %s.

Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: replace GPL boilerplate by SPDX -- sources</title>
<updated>2018-04-12T14:29:51+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.com</email>
</author>
<published>2018-04-03T17:23:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c1d7c514f745628eb096c5cbb10737855879ae25'/>
<id>c1d7c514f745628eb096c5cbb10737855879ae25</id>
<content type='text'>
Remove GPL boilerplate text (long, short, one-line) and keep the rest,
ie. personal, company or original source copyright statements. Add the
SPDX header.

Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove GPL boilerplate text (long, short, one-line) and keep the rest,
ie. personal, company or original source copyright statements. Add the
SPDX header.

Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: fix confusing worker helper info in stacktrace</title>
<updated>2017-10-30T11:27:57+00:00</updated>
<author>
<name>Liu Bo</name>
<email>bo.li.liu@oracle.com</email>
</author>
<published>2017-09-13T18:09:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6939f667247ed792535c0697a99f600d6770f127'/>
<id>6939f667247ed792535c0697a99f600d6770f127</id>
<content type='text'>
We've seen the following backtrace stack in ftrace or dmesg log,

  kworker/u16:10-4244  [000] 241942.480955: function:             btrfs_put_ordered_extent
  kworker/u16:10-4244  [000] 241942.480956: kernel_stack:         &lt;stack trace&gt;
=&gt; finish_ordered_fn (ffffffffa0384475)
=&gt; btrfs_scrubparity_helper (ffffffffa03ca577)        &lt;-----"incorrect"
=&gt; btrfs_freespace_write_helper (ffffffffa03ca98e)    &lt;-----"correct"
=&gt; process_one_work (ffffffff81117b2f)
=&gt; worker_thread (ffffffff81118c2a)
=&gt; kthread (ffffffff81121de0)
=&gt; ret_from_fork (ffffffff81d7087a)

btrfs_freespace_write_helper is actually calling normal_worker_helper
instead of btrfs_scrubparity_helper, so somehow kernel has parsed the
incorrect function address while unwinding the stack,
btrfs_scrubparity_helper really shouldn't be shown up.

It's caused by compiler doing inline for our helper function, adding a
noinline tag can fix that.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
[ use noinline_for_stack ]
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We've seen the following backtrace stack in ftrace or dmesg log,

  kworker/u16:10-4244  [000] 241942.480955: function:             btrfs_put_ordered_extent
  kworker/u16:10-4244  [000] 241942.480956: kernel_stack:         &lt;stack trace&gt;
=&gt; finish_ordered_fn (ffffffffa0384475)
=&gt; btrfs_scrubparity_helper (ffffffffa03ca577)        &lt;-----"incorrect"
=&gt; btrfs_freespace_write_helper (ffffffffa03ca98e)    &lt;-----"correct"
=&gt; process_one_work (ffffffff81117b2f)
=&gt; worker_thread (ffffffff81118c2a)
=&gt; kthread (ffffffff81121de0)
=&gt; ret_from_fork (ffffffff81d7087a)

btrfs_freespace_write_helper is actually calling normal_worker_helper
instead of btrfs_scrubparity_helper, so somehow kernel has parsed the
incorrect function address while unwinding the stack,
btrfs_scrubparity_helper really shouldn't be shown up.

It's caused by compiler doing inline for our helper function, adding a
noinline tag can fix that.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
[ use noinline_for_stack ]
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: constify tracepoint arguments</title>
<updated>2017-08-16T12:19:53+00:00</updated>
<author>
<name>Jeff Mahoney</name>
<email>jeffm@suse.com</email>
</author>
<published>2017-06-29T03:56:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9a35b63728ceb8602c111260044451dd64952500'/>
<id>9a35b63728ceb8602c111260044451dd64952500</id>
<content type='text'>
Tracepoint arguments are all read-only.  If we mark the arguments
as const, we're able to keep or convert those arguments to const
where appropriate.

Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Tracepoint arguments are all read-only.  If we mark the arguments
as const, we're able to keep or convert those arguments to const
where appropriate.

Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: fix crash when tracepoint arguments are freed by wq callbacks</title>
<updated>2017-01-09T10:24:50+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.com</email>
</author>
<published>2017-01-06T13:12:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ac0c7cf8be00f269f82964cf7b144ca3edc5dbc4'/>
<id>ac0c7cf8be00f269f82964cf7b144ca3edc5dbc4</id>
<content type='text'>
Enabling btrfs tracepoints leads to instant crash, as reported. The wq
callbacks could free the memory and the tracepoints started to
dereference the members to get to fs_info.

The proposed fix https://marc.info/?l=linux-btrfs&amp;m=148172436722606&amp;w=2
removed the tracepoints but we could preserve them by passing only the
required data in a safe way.

Fixes: bc074524e123 ("btrfs: prefix fsid to all trace events")
CC: stable@vger.kernel.org # 4.8+
Reported-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Enabling btrfs tracepoints leads to instant crash, as reported. The wq
callbacks could free the memory and the tracepoints started to
dereference the members to get to fs_info.

The proposed fix https://marc.info/?l=linux-btrfs&amp;m=148172436722606&amp;w=2
removed the tracepoints but we could preserve them by passing only the
required data in a safe way.

Fixes: bc074524e123 ("btrfs: prefix fsid to all trace events")
CC: stable@vger.kernel.org # 4.8+
Reported-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Reviewed-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: limit async_work allocation and worker func duration</title>
<updated>2016-12-13T19:01:30+00:00</updated>
<author>
<name>Maxim Patlasov</name>
<email>mpatlasov@virtuozzo.com</email>
</author>
<published>2016-12-12T22:32:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2939e1a86f758b55cdba73e29397dd3d94df13bc'/>
<id>2939e1a86f758b55cdba73e29397dd3d94df13bc</id>
<content type='text'>
Problem statement: unprivileged user who has read-write access to more than
one btrfs subvolume may easily consume all kernel memory (eventually
triggering oom-killer).

Reproducer (./mkrmdir below essentially loops over mkdir/rmdir):

[root@kteam1 ~]# cat prep.sh

DEV=/dev/sdb
mkfs.btrfs -f $DEV
mount $DEV /mnt
for i in `seq 1 16`
do
	mkdir /mnt/$i
	btrfs subvolume create /mnt/SV_$i
	ID=`btrfs subvolume list /mnt |grep "SV_$i$" |cut -d ' ' -f 2`
	mount -t btrfs -o subvolid=$ID $DEV /mnt/$i
	chmod a+rwx /mnt/$i
done

[root@kteam1 ~]# sh prep.sh

[maxim@kteam1 ~]$ for i in `seq 1 16`; do ./mkrmdir /mnt/$i 2000 2000 &amp; done

[root@kteam1 ~]# for i in `seq 1 4`; do grep "kmalloc-128" /proc/slabinfo | grep -v dma; sleep 60; done
kmalloc-128        10144  10144    128   32    1 : tunables    0    0    0 : slabdata    317    317      0
kmalloc-128       9992352 9992352    128   32    1 : tunables    0    0    0 : slabdata 312261 312261      0
kmalloc-128       24226752 24226752    128   32    1 : tunables    0    0    0 : slabdata 757086 757086      0
kmalloc-128       42754240 42754240    128   32    1 : tunables    0    0    0 : slabdata 1336070 1336070      0

The huge numbers above come from insane number of async_work-s allocated
and queued by btrfs_wq_run_delayed_node.

The problem is caused by btrfs_wq_run_delayed_node() queuing more and more
works if the number of delayed items is above BTRFS_DELAYED_BACKGROUND. The
worker func (btrfs_async_run_delayed_root) processes at least
BTRFS_DELAYED_BATCH items (if they are present in the list). So, the machinery
works as expected while the list is almost empty. As soon as it is getting
bigger, worker func starts to process more than one item at a time, it takes
longer, and the chances to have async_works queued more than needed is getting
higher.

The problem above is worsened by another flaw of delayed-inode implementation:
if async_work was queued in a throttling branch (number of items &gt;=
BTRFS_DELAYED_WRITEBACK), corresponding worker func won't quit until
the number of items &lt; BTRFS_DELAYED_BACKGROUND / 2. So, it is possible that
the func occupies CPU infinitely (up to 30sec in my experiments): while the
func is trying to drain the list, the user activity may add more and more
items to the list.

The patch fixes both problems in straightforward way: refuse queuing too
many works in btrfs_wq_run_delayed_node and bail out of worker func if
at least BTRFS_DELAYED_WRITEBACK items are processed.

Changed in v2: remove support of thresh == NO_THRESHOLD.

Signed-off-by: Maxim Patlasov &lt;mpatlasov@virtuozzo.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
Cc: stable@vger.kernel.org # v3.15+
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem statement: unprivileged user who has read-write access to more than
one btrfs subvolume may easily consume all kernel memory (eventually
triggering oom-killer).

Reproducer (./mkrmdir below essentially loops over mkdir/rmdir):

[root@kteam1 ~]# cat prep.sh

DEV=/dev/sdb
mkfs.btrfs -f $DEV
mount $DEV /mnt
for i in `seq 1 16`
do
	mkdir /mnt/$i
	btrfs subvolume create /mnt/SV_$i
	ID=`btrfs subvolume list /mnt |grep "SV_$i$" |cut -d ' ' -f 2`
	mount -t btrfs -o subvolid=$ID $DEV /mnt/$i
	chmod a+rwx /mnt/$i
done

[root@kteam1 ~]# sh prep.sh

[maxim@kteam1 ~]$ for i in `seq 1 16`; do ./mkrmdir /mnt/$i 2000 2000 &amp; done

[root@kteam1 ~]# for i in `seq 1 4`; do grep "kmalloc-128" /proc/slabinfo | grep -v dma; sleep 60; done
kmalloc-128        10144  10144    128   32    1 : tunables    0    0    0 : slabdata    317    317      0
kmalloc-128       9992352 9992352    128   32    1 : tunables    0    0    0 : slabdata 312261 312261      0
kmalloc-128       24226752 24226752    128   32    1 : tunables    0    0    0 : slabdata 757086 757086      0
kmalloc-128       42754240 42754240    128   32    1 : tunables    0    0    0 : slabdata 1336070 1336070      0

The huge numbers above come from insane number of async_work-s allocated
and queued by btrfs_wq_run_delayed_node.

The problem is caused by btrfs_wq_run_delayed_node() queuing more and more
works if the number of delayed items is above BTRFS_DELAYED_BACKGROUND. The
worker func (btrfs_async_run_delayed_root) processes at least
BTRFS_DELAYED_BATCH items (if they are present in the list). So, the machinery
works as expected while the list is almost empty. As soon as it is getting
bigger, worker func starts to process more than one item at a time, it takes
longer, and the chances to have async_works queued more than needed is getting
higher.

The problem above is worsened by another flaw of delayed-inode implementation:
if async_work was queued in a throttling branch (number of items &gt;=
BTRFS_DELAYED_WRITEBACK), corresponding worker func won't quit until
the number of items &lt; BTRFS_DELAYED_BACKGROUND / 2. So, it is possible that
the func occupies CPU infinitely (up to 30sec in my experiments): while the
func is trying to drain the list, the user activity may add more and more
items to the list.

The patch fixes both problems in straightforward way: refuse queuing too
many works in btrfs_wq_run_delayed_node and bail out of worker func if
at least BTRFS_DELAYED_WRITEBACK items are processed.

Changed in v2: remove support of thresh == NO_THRESHOLD.

Signed-off-by: Maxim Patlasov &lt;mpatlasov@virtuozzo.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
Cc: stable@vger.kernel.org # v3.15+
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: plumb fs_info into btrfs_work</title>
<updated>2016-07-26T11:53:15+00:00</updated>
<author>
<name>Jeff Mahoney</name>
<email>jeffm@suse.com</email>
</author>
<published>2016-06-09T20:22:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cb001095ca705dcd95f57fe98867e38a4889916d'/>
<id>cb001095ca705dcd95f57fe98867e38a4889916d</id>
<content type='text'>
In order to provide an fsid for trace events, we'll need a btrfs_fs_info
pointer.  The most lightweight way to do that for btrfs_work structures
is to associate it with the __btrfs_workqueue structure.  Each queued
btrfs_work structure has a workqueue associated with it, so that's
a natural fit.  It's a privately defined structures, so we add accessors
to retrieve the fs_info pointer.

Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to provide an fsid for trace events, we'll need a btrfs_fs_info
pointer.  The most lightweight way to do that for btrfs_work structures
is to associate it with the __btrfs_workqueue structure.  Each queued
btrfs_work structure has a workqueue associated with it, so that's
a natural fit.  It's a privately defined structures, so we add accessors
to retrieve the fs_info pointer.

Signed-off-by: Jeff Mahoney &lt;jeffm@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: async-thread: Fix a use-after-free error for trace</title>
<updated>2016-01-26T00:50:26+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2016-01-22T01:28:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0a95b851370b84a4b9d92ee6d1fa0926901d0454'/>
<id>0a95b851370b84a4b9d92ee6d1fa0926901d0454</id>
<content type='text'>
Parameter of trace_btrfs_work_queued() can be freed in its workqueue.
So no one use use that pointer after queue_work().

Fix the user-after-free bug by move the trace line before queue_work().

Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Parameter of trace_btrfs_work_queued() can be freed in its workqueue.
So no one use use that pointer after queue_work().

Fix the user-after-free bug by move the trace line before queue_work().

Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: use GFP_KERNEL for allocations of workqueues</title>
<updated>2015-12-03T14:03:43+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.com</email>
</author>
<published>2015-12-01T17:04:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=61dd5ae65be6dfaeadb0e841ea6639351f0e04ce'/>
<id>61dd5ae65be6dfaeadb0e841ea6639351f0e04ce</id>
<content type='text'>
We don't have to use GFP_NOFS to allocate workqueue structures, this is
done from mount context or potentially scrub start context, safe to fail
in both cases.

Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We don't have to use GFP_NOFS to allocate workqueue structures, this is
done from mount context or potentially scrub start context, safe to fail
in both cases.

Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: async_thread: Fix workqueue 'max_active' value when initializing</title>
<updated>2015-08-31T18:46:40+00:00</updated>
<author>
<name>Qu Wenruo</name>
<email>quwenruo@cn.fujitsu.com</email>
</author>
<published>2015-08-20T01:30:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6dd6ea55758cf403bdc07a51a06c2a1d474f906'/>
<id>c6dd6ea55758cf403bdc07a51a06c2a1d474f906</id>
<content type='text'>
At initializing time, for threshold-able workqueue, it's max_active
of kernel workqueue should be 1 and grow if it hits threshold.

But due to the bad naming, there is both 'max_active' for kernel
workqueue and btrfs workqueue.
So wrong value is given at workqueue initialization.

This patch fixes it, and to avoid further misunderstanding, change the
member name of btrfs_workqueue to 'current_active' and 'limit_active'.

Also corresponding comment is added for readability.

Reported-by: Alex Lyakas &lt;alex.btrfs@zadarastorage.com&gt;
Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At initializing time, for threshold-able workqueue, it's max_active
of kernel workqueue should be 1 and grow if it hits threshold.

But due to the bad naming, there is both 'max_active' for kernel
workqueue and btrfs workqueue.
So wrong value is given at workqueue initialization.

This patch fixes it, and to avoid further misunderstanding, change the
member name of btrfs_workqueue to 'current_active' and 'limit_active'.

Also corresponding comment is added for readability.

Reported-by: Alex Lyakas &lt;alex.btrfs@zadarastorage.com&gt;
Signed-off-by: Qu Wenruo &lt;quwenruo@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;clm@fb.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
