<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/block, branch v4.4-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>block: protect rw_page against device teardown</title>
<updated>2015-11-19T21:47:10+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2015-11-19T21:29:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2e6edc95382cc36423aff18a237173ad62d5ab52'/>
<id>2e6edc95382cc36423aff18a237173ad62d5ab52</id>
<content type='text'>
Fix use after free crashes like the following:

 general protection fault: 0000 [#1] SMP
 Call Trace:
  [&lt;ffffffffa0050216&gt;] ? pmem_do_bvec.isra.12+0xa6/0xf0 [nd_pmem]
  [&lt;ffffffffa0050ba2&gt;] pmem_rw_page+0x42/0x80 [nd_pmem]
  [&lt;ffffffff8128fd90&gt;] bdev_read_page+0x50/0x60
  [&lt;ffffffff812972f0&gt;] do_mpage_readpage+0x510/0x770
  [&lt;ffffffff8128fd20&gt;] ? I_BDEV+0x20/0x20
  [&lt;ffffffff811d86dc&gt;] ? lru_cache_add+0x1c/0x50
  [&lt;ffffffff81297657&gt;] mpage_readpages+0x107/0x170
  [&lt;ffffffff8128fd20&gt;] ? I_BDEV+0x20/0x20
  [&lt;ffffffff8128fd20&gt;] ? I_BDEV+0x20/0x20
  [&lt;ffffffff8129058d&gt;] blkdev_readpages+0x1d/0x20
  [&lt;ffffffff811d615f&gt;] __do_page_cache_readahead+0x28f/0x310
  [&lt;ffffffff811d6039&gt;] ? __do_page_cache_readahead+0x169/0x310
  [&lt;ffffffff811c5abd&gt;] ? pagecache_get_page+0x2d/0x1d0
  [&lt;ffffffff811c76f6&gt;] filemap_fault+0x396/0x530
  [&lt;ffffffff811f816e&gt;] __do_fault+0x4e/0xf0
  [&lt;ffffffff811fce7d&gt;] handle_mm_fault+0x11bd/0x1b50

Cc: &lt;stable@vger.kernel.org&gt;
Cc: Jens Axboe &lt;axboe@fb.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Acked-by: Matthew Wilcox &lt;willy@linux.intel.com&gt;
[willy: symmetry fixups]
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix use after free crashes like the following:

 general protection fault: 0000 [#1] SMP
 Call Trace:
  [&lt;ffffffffa0050216&gt;] ? pmem_do_bvec.isra.12+0xa6/0xf0 [nd_pmem]
  [&lt;ffffffffa0050ba2&gt;] pmem_rw_page+0x42/0x80 [nd_pmem]
  [&lt;ffffffff8128fd90&gt;] bdev_read_page+0x50/0x60
  [&lt;ffffffff812972f0&gt;] do_mpage_readpage+0x510/0x770
  [&lt;ffffffff8128fd20&gt;] ? I_BDEV+0x20/0x20
  [&lt;ffffffff811d86dc&gt;] ? lru_cache_add+0x1c/0x50
  [&lt;ffffffff81297657&gt;] mpage_readpages+0x107/0x170
  [&lt;ffffffff8128fd20&gt;] ? I_BDEV+0x20/0x20
  [&lt;ffffffff8128fd20&gt;] ? I_BDEV+0x20/0x20
  [&lt;ffffffff8129058d&gt;] blkdev_readpages+0x1d/0x20
  [&lt;ffffffff811d615f&gt;] __do_page_cache_readahead+0x28f/0x310
  [&lt;ffffffff811d6039&gt;] ? __do_page_cache_readahead+0x169/0x310
  [&lt;ffffffff811c5abd&gt;] ? pagecache_get_page+0x2d/0x1d0
  [&lt;ffffffff811c76f6&gt;] filemap_fault+0x396/0x530
  [&lt;ffffffff811f816e&gt;] __do_fault+0x4e/0xf0
  [&lt;ffffffff811fce7d&gt;] handle_mm_fault+0x11bd/0x1b50

Cc: &lt;stable@vger.kernel.org&gt;
Cc: Jens Axboe &lt;axboe@fb.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Acked-by: Matthew Wilcox &lt;willy@linux.intel.com&gt;
[willy: symmetry fixups]
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: fix blk-core.c kernel-doc warning</title>
<updated>2015-11-11T16:36:57+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@infradead.org</email>
</author>
<published>2015-10-31T01:36:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ccc2600b8a28f3eb0c126cd00312baba1c22cccb'/>
<id>ccc2600b8a28f3eb0c126cd00312baba1c22cccb</id>
<content type='text'>
Fix kernel-doc warning in blk-core.c:

Warning(..//block/blk-core.c:1549): No description found for parameter 'same_queue_rq'

Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reviewed-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix kernel-doc warning in blk-core.c:

Warning(..//block/blk-core.c:1549): No description found for parameter 'same_queue_rq'

Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reviewed-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: mark __blk_mq_complete_request() static</title>
<updated>2015-11-11T16:36:56+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-11-05T21:32:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1fa8cc52f46c14fb1afc20c220855c40a5d28fcd'/>
<id>1fa8cc52f46c14fb1afc20c220855c40a5d28fcd</id>
<content type='text'>
It's no longer used outside of blk-mq core.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's no longer used outside of blk-mq core.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.4/io-poll' of git://git.kernel.dk/linux-block</title>
<updated>2015-11-11T01:23:49+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-11-11T01:23:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3419b45039c6b799c974a8019361c045e7ca232c'/>
<id>3419b45039c6b799c974a8019361c045e7ca232c</id>
<content type='text'>
Pull block IO poll support from Jens Axboe:
 "Various groups have been doing experimentation around IO polling for
  (really) fast devices.  The code has been reviewed and has been
  sitting on the side for a few releases, but this is now good enough
  for coordinated benchmarking and further experimentation.

  Currently O_DIRECT sync read/write are supported.  A framework is in
  the works that allows scalable stats tracking so we can auto-tune
  this.  And we'll add libaio support as well soon.  Fow now, it's an
  opt-in feature for test purposes"

* 'for-4.4/io-poll' of git://git.kernel.dk/linux-block:
  direct-io: be sure to assign dio-&gt;bio_bdev for both paths
  directio: add block polling support
  NVMe: add blk polling support
  block: add block polling support
  blk-mq: return tag/queue combo in the make_request_fn handlers
  block: change -&gt;make_request_fn() and users to return a queue cookie
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull block IO poll support from Jens Axboe:
 "Various groups have been doing experimentation around IO polling for
  (really) fast devices.  The code has been reviewed and has been
  sitting on the side for a few releases, but this is now good enough
  for coordinated benchmarking and further experimentation.

  Currently O_DIRECT sync read/write are supported.  A framework is in
  the works that allows scalable stats tracking so we can auto-tune
  this.  And we'll add libaio support as well soon.  Fow now, it's an
  opt-in feature for test purposes"

* 'for-4.4/io-poll' of git://git.kernel.dk/linux-block:
  direct-io: be sure to assign dio-&gt;bio_bdev for both paths
  directio: add block polling support
  NVMe: add blk polling support
  block: add block polling support
  blk-mq: return tag/queue combo in the make_request_fn handlers
  block: change -&gt;make_request_fn() and users to return a queue cookie
</pre>
</div>
</content>
</entry>
<entry>
<title>block: add block polling support</title>
<updated>2015-11-07T17:40:47+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-11-05T17:44:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=05229beeddf7e75e2e616ddaad4b70e7fca9528d'/>
<id>05229beeddf7e75e2e616ddaad4b70e7fca9528d</id>
<content type='text'>
Add basic support for polling for specific IO to complete. This uses
the cookie that blk-mq passes back, which enables the block layer
to pass this cookie to the driver to spin for a specific request.

This will be combined with request latency tracking, so we can make
qualified decisions about when to poll and when not to. For now, for
benchmark purposes, we add a sysfs file that controls whether polling
is enabled or not.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add basic support for polling for specific IO to complete. This uses
the cookie that blk-mq passes back, which enables the block layer
to pass this cookie to the driver to spin for a specific request.

This will be combined with request latency tracking, so we can make
qualified decisions about when to poll and when not to. For now, for
benchmark purposes, we add a sysfs file that controls whether polling
is enabled or not.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: return tag/queue combo in the make_request_fn handlers</title>
<updated>2015-11-07T17:40:47+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-11-05T17:41:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7b371636fb6d187873d9d2730c2b1febc48a9b47'/>
<id>7b371636fb6d187873d9d2730c2b1febc48a9b47</id>
<content type='text'>
Return a cookie, blk_qc_t, from the blk-mq make request functions, that
allows a later caller to uniquely identify a specific IO. The cookie
doesn't mean anything to the caller, but the caller can use it to later
pass back to the block layer. The block layer can then identify the
hardware queue and request from that cookie.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Return a cookie, blk_qc_t, from the blk-mq make request functions, that
allows a later caller to uniquely identify a specific IO. The cookie
doesn't mean anything to the caller, but the caller can use it to later
pass back to the block layer. The block layer can then identify the
hardware queue and request from that cookie.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: change -&gt;make_request_fn() and users to return a queue cookie</title>
<updated>2015-11-07T17:40:46+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-11-05T17:41:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dece16353ef47d8d33f5302bc158072a9d65e26f'/>
<id>dece16353ef47d8d33f5302bc158072a9d65e26f</id>
<content type='text'>
No functional changes in this patch, but it prepares us for returning
a more useful cookie related to the IO that was queued up.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No functional changes in this patch, but it prepares us for returning
a more useful cookie related to the IO that was queued up.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode</title>
<updated>2015-11-07T01:50:42+00:00</updated>
<author>
<name>Ben Segall</name>
<email>bsegall@google.com</email>
</author>
<published>2015-11-07T00:32:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8639b46139b0e4ea3b1ab1c274e410ee327f1d89'/>
<id>8639b46139b0e4ea3b1ab1c274e410ee327f1d89</id>
<content type='text'>
setpriority(PRIO_USER, 0, x) will change the priority of tasks outside of
the current pid namespace.  This is in contrast to both the other modes of
setpriority and the example of kill(-1).  Fix this.  getpriority and
ioprio have the same failure mode, fix them too.

Eric said:

: After some more thinking about it this patch sounds justifiable.
:
: My goal with namespaces is not to build perfect isolation mechanisms
: as that can get into ill defined territory, but to build well defined
: mechanisms.  And to handle the corner cases so you can use only
: a single namespace with well defined results.
:
: In this case you have found the two interfaces I am aware of that
: identify processes by uid instead of by pid.  Which quite frankly is
: weird.  Unfortunately the weird unexpected cases are hard to handle
: in the usual way.
:
: I was hoping for a little more information.  Changes like this one we
: have to be careful of because someone might be depending on the current
: behavior.  I don't think they are and I do think this make sense as part
: of the pid namespace.

Signed-off-by: Ben Segall &lt;bsegall@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Ambrose Feinstein &lt;ambrose@google.com&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
setpriority(PRIO_USER, 0, x) will change the priority of tasks outside of
the current pid namespace.  This is in contrast to both the other modes of
setpriority and the example of kill(-1).  Fix this.  getpriority and
ioprio have the same failure mode, fix them too.

Eric said:

: After some more thinking about it this patch sounds justifiable.
:
: My goal with namespaces is not to build perfect isolation mechanisms
: as that can get into ill defined territory, but to build well defined
: mechanisms.  And to handle the corner cases so you can use only
: a single namespace with well defined results.
:
: In this case you have found the two interfaces I am aware of that
: identify processes by uid instead of by pid.  Which quite frankly is
: weird.  Unfortunately the weird unexpected cases are hard to handle
: in the usual way.
:
: I was hoping for a little more information.  Changes like this one we
: have to be careful of because someone might be depending on the current
: behavior.  I don't think they are and I do think this make sense as part
: of the pid namespace.

Signed-off-by: Ben Segall &lt;bsegall@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Ambrose Feinstein &lt;ambrose@google.com&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM</title>
<updated>2015-11-07T01:50:42+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2015-11-07T00:28:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=71baba4b92dc1fa1bc461742c6ab1942ec6034e9'/>
<id>71baba4b92dc1fa1bc461742c6ab1942ec6034e9</id>
<content type='text'>
__GFP_WAIT was used to signal that the caller was in atomic context and
could not sleep.  Now it is possible to distinguish between true atomic
context and callers that are not willing to sleep.  The latter should
clear __GFP_DIRECT_RECLAIM so kswapd will still wake.  As clearing
__GFP_WAIT behaves differently, there is a risk that people will clear the
wrong flags.  This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
indicate what it does -- setting it allows all reclaim activity, clearing
them prevents it.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vitaly Wool &lt;vitalywool@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__GFP_WAIT was used to signal that the caller was in atomic context and
could not sleep.  Now it is possible to distinguish between true atomic
context and callers that are not willing to sleep.  The latter should
clear __GFP_DIRECT_RECLAIM so kswapd will still wake.  As clearing
__GFP_WAIT behaves differently, there is a risk that people will clear the
wrong flags.  This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
indicate what it does -- setting it allows all reclaim activity, clearing
them prevents it.

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vitaly Wool &lt;vitalywool@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd</title>
<updated>2015-11-07T01:50:42+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2015-11-07T00:28:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d0164adc89f6bb374d304ffcc375c6d2652fe67d'/>
<id>d0164adc89f6bb374d304ffcc375c6d2652fe67d</id>
<content type='text'>
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vitaly Wool &lt;vitalywool@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Vitaly Wool &lt;vitalywool@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
