<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/mm/backing-dev.c, branch v3.1.7</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>backing-dev: ensure wakeup_timer is deleted</title>
<updated>2011-11-21T22:35:28+00:00</updated>
<author>
<name>Rabin Vincent</name>
<email>rabin.vincent@stericsson.com</email>
</author>
<published>2011-11-11T12:29:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5831e11b2e21372bf384ecc9ee3398fe5b8eaf6a'/>
<id>5831e11b2e21372bf384ecc9ee3398fe5b8eaf6a</id>
<content type='text'>
commit 7a401a972df8e184b3d1a3fc958c0a4ddee8d312 upstream.

bdi_prune_sb() in bdi_unregister() attempts to remove the bdi links
from all super_blocks and then del_timer_sync() the writeback timer.

However, this can race with __mark_inode_dirty(), leading to
bdi_wakeup_thread_delayed() rearming the writeback timer on the bdi
we're unregistering, after we've called del_timer_sync().

This can end up with the bdi being freed with an active timer inside it,
as in the case of the following dump after the removal of an SD card.

Fix this by redoing the del_timer_sync() in bdi_destroy().
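
The ordering described above can be modeled with a small, deterministic
Python sketch. It is a toy model, not kernel code: the Timer and Bdi
classes and every method name are hypothetical stand-ins, and the race is
replayed in a fixed order rather than with real threads.

```python
class Timer:
    """Toy stand-in for a kernel timer: it is either armed or not."""

    def __init__(self):
        self.armed = False

    def arm(self):
        self.armed = True

    def delete_sync(self):
        self.armed = False


class Bdi:
    """Toy backing device holding one wakeup timer (hypothetical model)."""

    def __init__(self):
        self.wakeup_timer = Timer()

    def wakeup_thread_delayed(self):
        # Models a late __mark_inode_dirty() path rearming the timer.
        self.wakeup_timer.arm()

    def unregister(self):
        # Models the del_timer_sync() done during unregistration.
        self.wakeup_timer.delete_sync()

    def destroy(self, redo_delete):
        if redo_delete:
            # The fix: delete the timer again right before the bdi is freed.
            self.wakeup_timer.delete_sync()
        # True means the object would be freed with an active timer inside.
        return self.wakeup_timer.armed


def freed_with_active_timer(redo_delete):
    """Replay the racy ordering and report whether the timer is still armed."""
    bdi = Bdi()
    bdi.unregister()
    bdi.wakeup_thread_delayed()  # the racing rearm lands after unregister
    return bdi.destroy(redo_delete)


print("without fix:", freed_with_active_timer(False))
print("with fix:   ", freed_with_active_timer(True))
```

In this model the second delete in destroy() is what keeps the object from
being released with the timer still armed.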

 ------------[ cut here ]------------
 WARNING: at /home/rabin/kernel/arm/lib/debugobjects.c:262 debug_print_object+0x9c/0xc8()
 ODEBUG: free active (active state 0) object type: timer_list hint: wakeup_timer_fn+0x0/0x180
 Modules linked in:
 Backtrace:
 [&lt;c00109dc&gt;] (dump_backtrace+0x0/0x110) from [&lt;c0236e4c&gt;] (dump_stack+0x18/0x1c)
  r6:c02bc638 r5:00000106 r4:c79f5d18 r3:00000000
 [&lt;c0236e34&gt;] (dump_stack+0x0/0x1c) from [&lt;c0025e6c&gt;] (warn_slowpath_common+0x54/0x6c)
 [&lt;c0025e18&gt;] (warn_slowpath_common+0x0/0x6c) from [&lt;c0025f28&gt;] (warn_slowpath_fmt+0x38/0x40)
  r8:20000013 r7:c780c6f0 r6:c031613c r5:c780c6f0 r4:c02b1b29
 r3:00000009
 [&lt;c0025ef0&gt;] (warn_slowpath_fmt+0x0/0x40) from [&lt;c015eb4c&gt;] (debug_print_object+0x9c/0xc8)
  r3:c02b1b29 r2:c02bc662
 [&lt;c015eab0&gt;] (debug_print_object+0x0/0xc8) from [&lt;c015f574&gt;] (debug_check_no_obj_freed+0xac/0x1dc)
  r6:c7964000 r5:00000001 r4:c7964000
 [&lt;c015f4c8&gt;] (debug_check_no_obj_freed+0x0/0x1dc) from [&lt;c00a9e38&gt;] (kmem_cache_free+0x88/0x1f8)
 [&lt;c00a9db0&gt;] (kmem_cache_free+0x0/0x1f8) from [&lt;c014286c&gt;] (blk_release_queue+0x70/0x78)
 [&lt;c01427fc&gt;] (blk_release_queue+0x0/0x78) from [&lt;c015290c&gt;] (kobject_release+0x70/0x84)
  r5:c79641f0 r4:c796420c
 [&lt;c015289c&gt;] (kobject_release+0x0/0x84) from [&lt;c0153ce4&gt;] (kref_put+0x68/0x80)
  r7:00000083 r6:c74083d0 r5:c015289c r4:c796420c
 [&lt;c0153c7c&gt;] (kref_put+0x0/0x80) from [&lt;c01527d0&gt;] (kobject_put+0x48/0x5c)
  r5:c79643b4 r4:c79641f0
 [&lt;c0152788&gt;] (kobject_put+0x0/0x5c) from [&lt;c013ddd8&gt;] (blk_cleanup_queue+0x68/0x74)
  r4:c7964000
 [&lt;c013dd70&gt;] (blk_cleanup_queue+0x0/0x74) from [&lt;c01a6370&gt;] (mmc_blk_put+0x78/0xe8)
  r5:00000000 r4:c794c400
 [&lt;c01a62f8&gt;] (mmc_blk_put+0x0/0xe8) from [&lt;c01a64b4&gt;] (mmc_blk_release+0x24/0x38)
  r5:c794c400 r4:c0322824
 [&lt;c01a6490&gt;] (mmc_blk_release+0x0/0x38) from [&lt;c00de11c&gt;] (__blkdev_put+0xe8/0x170)
  r5:c78d5e00 r4:c74083c0
 [&lt;c00de034&gt;] (__blkdev_put+0x0/0x170) from [&lt;c00de2c0&gt;] (blkdev_put+0x11c/0x12c)
  r8:c79f5f70 r7:00000001 r6:c74083d0 r5:00000083 r4:c74083c0
 r3:00000000
 [&lt;c00de1a4&gt;] (blkdev_put+0x0/0x12c) from [&lt;c00b0724&gt;] (kill_block_super+0x60/0x6c)
  r7:c7942300 r6:c79f4000 r5:00000083 r4:c74083c0
 [&lt;c00b06c4&gt;] (kill_block_super+0x0/0x6c) from [&lt;c00b0a94&gt;] (deactivate_locked_super+0x44/0x70)
  r6:c79f4000 r5:c031af64 r4:c794dc00 r3:c00b06c4
 [&lt;c00b0a50&gt;] (deactivate_locked_super+0x0/0x70) from [&lt;c00b1358&gt;] (deactivate_super+0x6c/0x70)
  r5:c794dc00 r4:c794dc00
 [&lt;c00b12ec&gt;] (deactivate_super+0x0/0x70) from [&lt;c00c88b0&gt;] (mntput_no_expire+0x188/0x194)
  r5:c794dc00 r4:c7942300
 [&lt;c00c8728&gt;] (mntput_no_expire+0x0/0x194) from [&lt;c00c95e0&gt;] (sys_umount+0x2e4/0x310)
  r6:c7942300 r5:00000000 r4:00000000 r3:00000000
 [&lt;c00c92fc&gt;] (sys_umount+0x0/0x310) from [&lt;c000d940&gt;] (ret_fast_syscall+0x0/0x30)
 ---[ end trace e5c83c92ada51c76 ]---

Signed-off-by: Rabin Vincent &lt;rabin.vincent@stericsson.com&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7a401a972df8e184b3d1a3fc958c0a4ddee8d312 upstream.

bdi_prune_sb() in bdi_unregister() attempts to remove the bdi links
from all super_blocks and then del_timer_sync() the writeback timer.

However, this can race with __mark_inode_dirty(), leading to
bdi_wakeup_thread_delayed() rearming the writeback timer on the bdi
we're unregistering, after we've called del_timer_sync().

This can end up with the bdi being freed with an active timer inside it,
as in the case of the following dump after the removal of an SD card.

Fix this by redoing the del_timer_sync() in bdi_destroy().

 ------------[ cut here ]------------
 WARNING: at /home/rabin/kernel/arm/lib/debugobjects.c:262 debug_print_object+0x9c/0xc8()
 ODEBUG: free active (active state 0) object type: timer_list hint: wakeup_timer_fn+0x0/0x180
 Modules linked in:
 Backtrace:
 [&lt;c00109dc&gt;] (dump_backtrace+0x0/0x110) from [&lt;c0236e4c&gt;] (dump_stack+0x18/0x1c)
  r6:c02bc638 r5:00000106 r4:c79f5d18 r3:00000000
 [&lt;c0236e34&gt;] (dump_stack+0x0/0x1c) from [&lt;c0025e6c&gt;] (warn_slowpath_common+0x54/0x6c)
 [&lt;c0025e18&gt;] (warn_slowpath_common+0x0/0x6c) from [&lt;c0025f28&gt;] (warn_slowpath_fmt+0x38/0x40)
  r8:20000013 r7:c780c6f0 r6:c031613c r5:c780c6f0 r4:c02b1b29
 r3:00000009
 [&lt;c0025ef0&gt;] (warn_slowpath_fmt+0x0/0x40) from [&lt;c015eb4c&gt;] (debug_print_object+0x9c/0xc8)
  r3:c02b1b29 r2:c02bc662
 [&lt;c015eab0&gt;] (debug_print_object+0x0/0xc8) from [&lt;c015f574&gt;] (debug_check_no_obj_freed+0xac/0x1dc)
  r6:c7964000 r5:00000001 r4:c7964000
 [&lt;c015f4c8&gt;] (debug_check_no_obj_freed+0x0/0x1dc) from [&lt;c00a9e38&gt;] (kmem_cache_free+0x88/0x1f8)
 [&lt;c00a9db0&gt;] (kmem_cache_free+0x0/0x1f8) from [&lt;c014286c&gt;] (blk_release_queue+0x70/0x78)
 [&lt;c01427fc&gt;] (blk_release_queue+0x0/0x78) from [&lt;c015290c&gt;] (kobject_release+0x70/0x84)
  r5:c79641f0 r4:c796420c
 [&lt;c015289c&gt;] (kobject_release+0x0/0x84) from [&lt;c0153ce4&gt;] (kref_put+0x68/0x80)
  r7:00000083 r6:c74083d0 r5:c015289c r4:c796420c
 [&lt;c0153c7c&gt;] (kref_put+0x0/0x80) from [&lt;c01527d0&gt;] (kobject_put+0x48/0x5c)
  r5:c79643b4 r4:c79641f0
 [&lt;c0152788&gt;] (kobject_put+0x0/0x5c) from [&lt;c013ddd8&gt;] (blk_cleanup_queue+0x68/0x74)
  r4:c7964000
 [&lt;c013dd70&gt;] (blk_cleanup_queue+0x0/0x74) from [&lt;c01a6370&gt;] (mmc_blk_put+0x78/0xe8)
  r5:00000000 r4:c794c400
 [&lt;c01a62f8&gt;] (mmc_blk_put+0x0/0xe8) from [&lt;c01a64b4&gt;] (mmc_blk_release+0x24/0x38)
  r5:c794c400 r4:c0322824
 [&lt;c01a6490&gt;] (mmc_blk_release+0x0/0x38) from [&lt;c00de11c&gt;] (__blkdev_put+0xe8/0x170)
  r5:c78d5e00 r4:c74083c0
 [&lt;c00de034&gt;] (__blkdev_put+0x0/0x170) from [&lt;c00de2c0&gt;] (blkdev_put+0x11c/0x12c)
  r8:c79f5f70 r7:00000001 r6:c74083d0 r5:00000083 r4:c74083c0
 r3:00000000
 [&lt;c00de1a4&gt;] (blkdev_put+0x0/0x12c) from [&lt;c00b0724&gt;] (kill_block_super+0x60/0x6c)
  r7:c7942300 r6:c79f4000 r5:00000083 r4:c74083c0
 [&lt;c00b06c4&gt;] (kill_block_super+0x0/0x6c) from [&lt;c00b0a94&gt;] (deactivate_locked_super+0x44/0x70)
  r6:c79f4000 r5:c031af64 r4:c794dc00 r3:c00b06c4
 [&lt;c00b0a50&gt;] (deactivate_locked_super+0x0/0x70) from [&lt;c00b1358&gt;] (deactivate_super+0x6c/0x70)
  r5:c794dc00 r4:c794dc00
 [&lt;c00b12ec&gt;] (deactivate_super+0x0/0x70) from [&lt;c00c88b0&gt;] (mntput_no_expire+0x188/0x194)
  r5:c794dc00 r4:c7942300
 [&lt;c00c8728&gt;] (mntput_no_expire+0x0/0x194) from [&lt;c00c95e0&gt;] (sys_umount+0x2e4/0x310)
  r6:c7942300 r5:00000000 r4:00000000 r3:00000000
 [&lt;c00c92fc&gt;] (sys_umount+0x0/0x310) from [&lt;c000d940&gt;] (ret_fast_syscall+0x0/0x30)
 ---[ end trace e5c83c92ada51c76 ]---

Signed-off-by: Rabin Vincent &lt;rabin.vincent@stericsson.com&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mm: Add comment explaining task state setting in bdi_forker_thread()</title>
<updated>2011-09-02T23:17:02+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2011-09-02T23:04:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=09f40f98bfa2ac22a332a713629a2f8f92896834'/>
<id>09f40f98bfa2ac22a332a713629a2f8f92896834</id>
<content type='text'>
CC: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
CC: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: Cleanup clearing of BDI_pending bit in bdi_forker_thread()</title>
<updated>2011-09-02T23:17:02+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2011-09-02T23:04:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5a042aa4b8e994a15d2c2ee750219971f0ab3905'/>
<id>5a042aa4b8e994a15d2c2ee750219971f0ab3905</id>
<content type='text'>
bdi_forker_thread() clears the BDI_pending bit at the end of the main loop.
However, the bit must not be cleared in some cases, which is handled by
calling 'continue' from within the switch statement. That is an unusual
construct with no good reason behind it, so restructure the function into
a more intuitive code flow.

CC: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bdi_forker_thread() clears the BDI_pending bit at the end of the main loop.
However, the bit must not be cleared in some cases, which is handled by
calling 'continue' from within the switch statement. That is an unusual
construct with no good reason behind it, so restructure the function into
a more intuitive code flow.

CC: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback</title>
<updated>2011-07-26T17:39:54+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-07-26T17:39:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f01ef569cddb1a8627b1c6b3a134998ad1cf4b22'/>
<id>f01ef569cddb1a8627b1c6b3a134998ad1cf4b22</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback: (27 commits)
  mm: properly reflect task dirty limits in dirty_exceeded logic
  writeback: don't busy retry writeback on new/freeing inodes
  writeback: scale IO chunk size up to half device bandwidth
  writeback: trace global_dirty_state
  writeback: introduce max-pause and pass-good dirty limits
  writeback: introduce smoothed global dirty limit
  writeback: consolidate variable names in balance_dirty_pages()
  writeback: show bdi write bandwidth in debugfs
  writeback: bdi write bandwidth estimation
  writeback: account per-bdi accumulated written pages
  writeback: make writeback_control.nr_to_write straight
  writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()
  writeback: trace event writeback_queue_io
  writeback: trace event writeback_single_inode
  writeback: remove .nonblocking and .encountered_congestion
  writeback: remove writeback_control.more_io
  writeback: skip balance_dirty_pages() for in-memory fs
  writeback: add bdi_dirty_limit() kernel-doc
  writeback: avoid extra sync work at enqueue time
  writeback: elevate queue_io() into wb_writeback()
  ...

Fix up trivial conflicts in fs/fs-writeback.c and mm/filemap.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback: (27 commits)
  mm: properly reflect task dirty limits in dirty_exceeded logic
  writeback: don't busy retry writeback on new/freeing inodes
  writeback: scale IO chunk size up to half device bandwidth
  writeback: trace global_dirty_state
  writeback: introduce max-pause and pass-good dirty limits
  writeback: introduce smoothed global dirty limit
  writeback: consolidate variable names in balance_dirty_pages()
  writeback: show bdi write bandwidth in debugfs
  writeback: bdi write bandwidth estimation
  writeback: account per-bdi accumulated written pages
  writeback: make writeback_control.nr_to_write straight
  writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()
  writeback: trace event writeback_queue_io
  writeback: trace event writeback_single_inode
  writeback: remove .nonblocking and .encountered_congestion
  writeback: remove writeback_control.more_io
  writeback: skip balance_dirty_pages() for in-memory fs
  writeback: add bdi_dirty_limit() kernel-doc
  writeback: avoid extra sync work at enqueue time
  writeback: elevate queue_io() into wb_writeback()
  ...

Fix up trivial conflicts in fs/fs-writeback.c and mm/filemap.c
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge 'akpm' patch series</title>
<updated>2011-07-26T04:00:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-07-26T04:00:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=45b583b10a8b438b970e95a7d1d4db22c9e35004'/>
<id>45b583b10a8b438b970e95a7d1d4db22c9e35004</id>
<content type='text'>
* Merge akpm patch series: (122 commits)
  drivers/connector/cn_proc.c: remove unused local
  Documentation/SubmitChecklist: add RCU debug config options
  reiserfs: use hweight_long()
  reiserfs: use proper little-endian bitops
  pnpacpi: register disabled resources
  drivers/rtc/rtc-tegra.c: properly initialize spinlock
  drivers/rtc/rtc-twl.c: check return value of twl_rtc_write_u8() in twl_rtc_set_time()
  drivers/rtc: add support for Qualcomm PMIC8xxx RTC
  drivers/rtc/rtc-s3c.c: support clock gating
  drivers/rtc/rtc-mpc5121.c: add support for RTC on MPC5200
  init: skip calibration delay if previously done
  misc/eeprom: add eeprom access driver for digsy_mtc board
  misc/eeprom: add driver for microwire 93xx46 EEPROMs
  checkpatch.pl: update $logFunctions
  checkpatch: make utf-8 test --strict
  checkpatch.pl: add ability to ignore various messages
  checkpatch: add a "prefer __aligned" check
  checkpatch: validate signature styles and To: and Cc: lines
  checkpatch: add __rcu as a sparse modifier
  checkpatch: suggest using min_t or max_t
  ...

Did this as a merge because of (trivial) conflicts in
 - Documentation/feature-removal-schedule.txt
 - arch/xtensa/include/asm/uaccess.h
that were just easier to fix up in the merge than in the patch series.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Merge akpm patch series: (122 commits)
  drivers/connector/cn_proc.c: remove unused local
  Documentation/SubmitChecklist: add RCU debug config options
  reiserfs: use hweight_long()
  reiserfs: use proper little-endian bitops
  pnpacpi: register disabled resources
  drivers/rtc/rtc-tegra.c: properly initialize spinlock
  drivers/rtc/rtc-twl.c: check return value of twl_rtc_write_u8() in twl_rtc_set_time()
  drivers/rtc: add support for Qualcomm PMIC8xxx RTC
  drivers/rtc/rtc-s3c.c: support clock gating
  drivers/rtc/rtc-mpc5121.c: add support for RTC on MPC5200
  init: skip calibration delay if previously done
  misc/eeprom: add eeprom access driver for digsy_mtc board
  misc/eeprom: add driver for microwire 93xx46 EEPROMs
  checkpatch.pl: update $logFunctions
  checkpatch: make utf-8 test --strict
  checkpatch.pl: add ability to ignore various messages
  checkpatch: add a "prefer __aligned" check
  checkpatch: validate signature styles and To: and Cc: lines
  checkpatch: add __rcu as a sparse modifier
  checkpatch: suggest using min_t or max_t
  ...

Did this as a merge because of (trivial) conflicts in
 - Documentation/feature-removal-schedule.txt
 - arch/xtensa/include/asm/uaccess.h
that were just easier to fix up in the merge than in the patch series.
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/backing-dev.c: reset bdi min_ratio in bdi_unregister()</title>
<updated>2011-07-26T03:57:07+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2011-07-26T00:11:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ccb6108f5b0b541d3eb332c3a73e645c0f84278e'/>
<id>ccb6108f5b0b541d3eb332c3a73e645c0f84278e</id>
<content type='text'>
Vito said:

: The system has many usb disks coming and going day to day, with their
: respective bdi's having min_ratio set to 1 when inserted.  It works for
: some time until eventually min_ratio can no longer be set, even when the
: active set of bdi's seen in /sys/class/bdi/*/min_ratio doesn't add up to
: anywhere near 100.
:
: This then leads to an unrelated starvation problem caused by write-heavy
: fuse mounts being used atop the usb disks, a problem the min_ratio setting
: at the underlying devices bdi effectively prevents.

Fix this leakage by resetting the bdi min_ratio when unregistering the
BDI.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Reported-by: Vito Caputo &lt;lkml@pengaru.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Vito said:

: The system has many usb disks coming and going day to day, with their
: respective bdi's having min_ratio set to 1 when inserted.  It works for
: some time until eventually min_ratio can no longer be set, even when the
: active set of bdi's seen in /sys/class/bdi/*/min_ratio doesn't add up to
: anywhere near 100.
:
: This then leads to an unrelated starvation problem caused by write-heavy
: fuse mounts being used atop the usb disks, a problem the min_ratio setting
: at the underlying devices bdi effectively prevents.

Fix this leakage by resetting the bdi min_ratio when unregistering the
BDI.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Reported-by: Vito Caputo &lt;lkml@pengaru.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu</title>
<updated>2011-07-23T18:44:24+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2011-07-23T18:44:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ef3230880abd36553ab442363d3c9a0661f00769'/>
<id>ef3230880abd36553ab442363d3c9a0661f00769</id>
<content type='text'>
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu

synchronize_rcu sleeps for several timer ticks. synchronize_rcu_expedited
is much faster.

With a 100Hz timer frequency, removing 10000 block devices with the
"dmsetup remove_all" command takes 27 minutes. With this patch,
removing 10000 block devices takes only 15 seconds.
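
As a rough check of the figures above, the per-device cost can be worked
out directly. This is plain arithmetic on the quoted numbers, not a new
measurement, and the variable names are illustrative.

```python
HZ = 100              # timer frequency quoted above
DEVICES = 10_000      # block devices removed by "dmsetup remove_all"
BEFORE_S = 27 * 60    # 27 minutes with synchronize_rcu
AFTER_S = 15          # 15 seconds with synchronize_rcu_expedited

tick_ms = 1000 / HZ                      # one timer tick in milliseconds
before_ms = BEFORE_S * 1000 / DEVICES    # average cost per device, before
after_ms = AFTER_S * 1000 / DEVICES      # average cost per device, after

print(f"before: {before_ms:.1f} ms per device, about {before_ms / tick_ms:.0f} ticks")
print(f"after:  {after_ms:.1f} ms per device")
print(f"speedup: {BEFORE_S / AFTER_S:.0f}x")
```

The per-device figure covers everything done for one removal, not only the
RCU wait, but it is consistent with a wait of several ticks per device.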

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu

synchronize_rcu sleeps for several timer ticks. synchronize_rcu_expedited
is much faster.

With a 100Hz timer frequency, removing 10000 block devices with the
"dmsetup remove_all" command takes 27 minutes. With this patch,
removing 10000 block devices takes only 15 seconds.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>writeback: show bdi write bandwidth in debugfs</title>
<updated>2011-07-10T05:09:02+00:00</updated>
<author>
<name>Wu Fengguang</name>
<email>fengguang.wu@intel.com</email>
</author>
<published>2010-08-29T17:28:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=00821b002df7da867bb2c15b4f83f3706371383f'/>
<id>00821b002df7da867bb2c15b4f83f3706371383f</id>
<content type='text'>
Add a "BdiWriteBandwidth" entry and indent others in /debug/bdi/*/stats.

Also increase the numeric field width to 10, to keep the possibly
huge BdiWritten number aligned, at least on desktop systems.

Impact: this could break user space tools if they are dumb enough to
depend on the number of white spaces.

CC: Theodore Ts'o &lt;tytso@mit.edu&gt;
CC: Jan Kara &lt;jack@suse.cz&gt;
CC: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a "BdiWriteBandwidth" entry and indent others in /debug/bdi/*/stats.

Also increase the numeric field width to 10, to keep the possibly
huge BdiWritten number aligned, at least on desktop systems.

Impact: this could break user space tools if they are dumb enough to
depend on the number of white spaces.

CC: Theodore Ts'o &lt;tytso@mit.edu&gt;
CC: Jan Kara &lt;jack@suse.cz&gt;
CC: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>writeback: bdi write bandwidth estimation</title>
<updated>2011-07-10T05:09:01+00:00</updated>
<author>
<name>Wu Fengguang</name>
<email>fengguang.wu@intel.com</email>
</author>
<published>2010-08-29T17:22:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e98be2d599207c6b31e9bb340d52a231b2f3662d'/>
<id>e98be2d599207c6b31e9bb340d52a231b2f3662d</id>
<content type='text'>
The estimate starts at 100MB/s and adapts to the real bandwidth within
seconds.

The bandwidth is updated only when the disk is fully utilized. Any
inactive period of more than one second is skipped.

The estimated bandwidth reflects how fast the device can write out when
_fully utilized_, and won't drop to 0 when it goes idle. The value
remains constant while the disk is idle. Under busy writes, ignoring
fluctuations, it also remains high unless it is knocked down by
concurrent reads that compete with async writes for disk time and
bandwidth.

The estimation is not done purely in the flusher because there is no
guarantee that write_cache_pages() returns in time to update the
bandwidth.

The bdi-&gt;avg_write_bandwidth smoothing is very effective at filtering
out sudden spikes, but may be slightly biased in the long term.

The overhead is low because the bdi bandwidth update only occurs at
200ms intervals.

The 200ms update interval is suitable because, due to large
fluctuations, it is not possible to get the real instantaneous bandwidth
at all.

NFS commits can be as large as seconds' worth of data. One XFS
completion may be as large as half a second's worth of data if we
increase the write chunk to half a second's worth of data. In ext4,
fluctuations with a period of around 5 seconds are observed. There is
also another pattern of irregular periods of up to 20 seconds in SSD
tests.

That's why we not only do the estimation at 200ms intervals, but also
average the estimates over a period of 3 seconds and then apply another
level of smoothing in avg_write_bandwidth.
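
The two smoothing levels described above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not the kernel
implementation: the function names are hypothetical, the first level is
modeled as a time-weighted average over the 3 second period, and the
one-eighth step used for the second level is an assumed value.

```python
PERIOD_S = 3.0          # averaging period named in the commit message
INTERVAL_S = 0.2        # update interval named in the commit message
INITIAL_MBPS = 100.0    # starting estimate named in the commit message


def update_bandwidth(bw, avg, written_mb, elapsed_s):
    """Return the new (bw, avg) pair after one update interval."""
    # First level: weight the new sample by elapsed time, capped at the period.
    weight = min(elapsed_s, PERIOD_S) / PERIOD_S
    sample = written_mb / elapsed_s
    bw = bw * (1.0 - weight) + sample * weight
    # Second level: move avg one eighth of the way toward bw.
    # The 1/8 step is an assumption chosen for illustration.
    avg = avg + (bw - avg) / 8.0
    return bw, avg


def simulate(rate_mbps, seconds):
    """Feed a steady write rate and return the final (bw, avg)."""
    bw = avg = INITIAL_MBPS
    for _ in range(round(seconds / INTERVAL_S)):
        bw, avg = update_bandwidth(bw, avg, rate_mbps * INTERVAL_S, INTERVAL_S)
    return bw, avg


bw, avg = simulate(rate_mbps=50.0, seconds=60.0)
print(f"bw={bw:.2f} MB/s avg={avg:.2f} MB/s")
```

With a steady 50MB/s input, both values in this sketch drift from the
100MB/s starting point toward 50MB/s, with avg trailing bw.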

CC: Li Shaohua &lt;shaohua.li@intel.com&gt;
CC: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The estimate starts at 100MB/s and adapts to the real bandwidth within
seconds.

The bandwidth is updated only when the disk is fully utilized. Any
inactive period of more than one second is skipped.

The estimated bandwidth reflects how fast the device can write out when
_fully utilized_, and won't drop to 0 when it goes idle. The value
remains constant while the disk is idle. Under busy writes, ignoring
fluctuations, it also remains high unless it is knocked down by
concurrent reads that compete with async writes for disk time and
bandwidth.

The estimation is not done purely in the flusher because there is no
guarantee that write_cache_pages() returns in time to update the
bandwidth.

The bdi-&gt;avg_write_bandwidth smoothing is very effective at filtering
out sudden spikes, but may be slightly biased in the long term.

The overhead is low because the bdi bandwidth update only occurs at
200ms intervals.

The 200ms update interval is suitable because, due to large
fluctuations, it is not possible to get the real instantaneous bandwidth
at all.

NFS commits can be as large as seconds' worth of data. One XFS
completion may be as large as half a second's worth of data if we
increase the write chunk to half a second's worth of data. In ext4,
fluctuations with a period of around 5 seconds are observed. There is
also another pattern of irregular periods of up to 20 seconds in SSD
tests.

That's why we not only do the estimation at 200ms intervals, but also
average the estimates over a period of 3 seconds and then apply another
level of smoothing in avg_write_bandwidth.

CC: Li Shaohua &lt;shaohua.li@intel.com&gt;
CC: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>writeback: account per-bdi accumulated written pages</title>
<updated>2011-07-10T05:09:01+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2010-12-09T04:44:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f7d2b1ecd0c714adefc7d3a942ef87beb828a763'/>
<id>f7d2b1ecd0c714adefc7d3a942ef87beb828a763</id>
<content type='text'>
Introduce the BDI_WRITTEN counter. It will be used for estimating the
bdi's write bandwidth.

Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;:
Move BDI_WRITTEN accounting into __bdi_writeout_inc().
This will cover and fix fuse, which only calls bdi_writeout_inc().

CC: Michael Rubin &lt;mrubin@google.com&gt;
Reviewed-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce the BDI_WRITTEN counter. It will be used for estimating the
bdi's write bandwidth.

Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;:
Move BDI_WRITTEN accounting into __bdi_writeout_inc().
This will cover and fix fuse, which only calls bdi_writeout_inc().

CC: Michael Rubin &lt;mrubin@google.com&gt;
Reviewed-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
