<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git, branch v3.2.94</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Linux 3.2.94</title>
<updated>2017-10-12T14:27:23+00:00</updated>
<author>
<name>Ben Hutchings</name>
<email>ben@decadent.org.uk</email>
</author>
<published>2017-10-12T14:27:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fb71fde4183b062e2487b7f795d3d66554182046'/>
<id>fb71fde4183b062e2487b7f795d3d66554182046</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags</title>
<updated>2017-10-12T14:27:22+00:00</updated>
<author>
<name>Zefan Li</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-09-25T01:41:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=eb33a02fddad907cea8b2c4ec3b49835b12d0107'/>
<id>eb33a02fddad907cea8b2c4ec3b49835b12d0107</id>
<content type='text'>
commit 2ad654bc5e2b211e92f66da1d819e47d79a866f0 upstream.

When we change cpuset.memory_spread_{page,slab}, cpuset will flip
PF_SPREAD_{PAGE,SLAB} bit of tsk-&gt;flags for each task in that cpuset.
This should be done using atomic bitops, but currently we don't,
which is broken.

Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
when one thread tried to clear PF_USED_MATH while at the same time another
thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
the same task.

Here's the full report:
https://lkml.org/lkml/2014/9/19/230

To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags.

v4:
- updated mm/slab.c. (Fengguang Wu)
- updated Documentation.

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time")
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[lizf: Backported to 3.4:
 - adjust context
 - check current-&gt;flags &amp; PF_MEMPOLICY rather than current-&gt;mempolicy]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2ad654bc5e2b211e92f66da1d819e47d79a866f0 upstream.

When we change cpuset.memory_spread_{page,slab}, cpuset will flip
PF_SPREAD_{PAGE,SLAB} bit of tsk-&gt;flags for each task in that cpuset.
This should be done using atomic bitops, but currently we don't,
which is broken.

Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
when one thread tried to clear PF_USED_MATH while at the same time another
thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
the same task.

Here's the full report:
https://lkml.org/lkml/2014/9/19/230

To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags.

v4:
- updated mm/slab.c. (Fengguang Wu)
- updated Documentation.

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time")
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[lizf: Backported to 3.4:
 - adjust context
 - check current-&gt;flags &amp; PF_MEMPOLICY rather than current-&gt;mempolicy]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: add macros to define bitops for task atomic flags</title>
<updated>2017-10-12T14:27:22+00:00</updated>
<author>
<name>Zefan Li</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-09-25T01:40:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4fce5e64991fc0a02e47e525ce72f33def17b7ef'/>
<id>4fce5e64991fc0a02e47e525ce72f33def17b7ef</id>
<content type='text'>
commit e0e5070b20e01f0321f97db4e4e174f3f6b49e50 upstream.

This will simplify code when we add new flags.

v3:
- Kees pointed out that no_new_privs should never be cleared, so we
shouldn't define task_clear_no_new_privs(). we define 3 macros instead
of a single one.

v2:
- updated scripts/tags.sh, suggested by Peter

Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[lizf: Backported to 3.4:
 - adjust context
 - remove no_new_priv code
 - add atomic_flags to struct task_struct]
[bwh: Backported to 3.2:
 - Drop changes in scripts/tags.sh
 - Adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e0e5070b20e01f0321f97db4e4e174f3f6b49e50 upstream.

This will simplify code when we add new flags.

v3:
- Kees pointed out that no_new_privs should never be cleared, so we
shouldn't define task_clear_no_new_privs(). we define 3 macros instead
of a single one.

v2:
- updated scripts/tags.sh, suggested by Peter

Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[lizf: Backported to 3.4:
 - adjust context
 - remove no_new_priv code
 - add atomic_flags to struct task_struct]
[bwh: Backported to 3.2:
 - Drop changes in scripts/tags.sh
 - Adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net sched filters: fix notification of filter delete with proper handle</title>
<updated>2017-10-12T14:27:21+00:00</updated>
<author>
<name>Jamal Hadi Salim</name>
<email>jhs@mojatatu.com</email>
</author>
<published>2016-10-25T00:18:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7b73b4b97fc6d0c37c76baee966ea9b9a6323a72'/>
<id>7b73b4b97fc6d0c37c76baee966ea9b9a6323a72</id>
<content type='text'>
[ Upstream commit 9ee7837449b3d6f0fcf9132c6b5e5aaa58cc67d4 ]

Daniel says:

While trying out [1][2], I noticed that tc monitor doesn't show the
correct handle on delete:

$ tc monitor
qdisc clsact ffff: dev eno1 parent ffff:fff1
filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80

some context to explain the above:
The user identity of any tc filter is represented by a 32-bit
identifier encoded in tcm-&gt;tcm_handle. Example 0x2a in the bpf filter
above. A user wishing to delete, get or even modify a specific filter
uses this handle to reference it.
Every classifier is free to provide its own semantics for the 32 bit handle.
Example: classifiers like u32 use schemes like 800:1:801 to describe
the semantics of their filters represented as hash table, bucket and
node ids etc.
Classifiers also have internal per-filter representation which is different
from this externally visible identity. Most classifiers set this
internal representation to be a pointer address (which allows fast retrieval
of said filters in their implementations). This internal representation
is referenced with the "fh" variable in the kernel control code.

When a user successfuly deletes a specific filter, by specifying the correct
tcm-&gt;tcm_handle, an event is generated to user space which indicates
which specific filter was deleted.

Before this patch, the "fh" value was sent to user space as the identity.
As an example what is shown in the sample bpf filter delete event above
is 0xf3be0c80. This is infact a 32-bit truncation of 0xffff8807f3be0c80
which happens to be a 64-bit memory address of the internal filter
representation (address of the corresponding filter's struct cls_bpf_prog);

After this patch the appropriate user identifiable handle as encoded
in the originating request tcm-&gt;tcm_handle is generated in the event.
One of the cardinal rules of netlink rules is to be able to take an
event (such as a delete in this case) and reflect it back to the
kernel and successfully delete the filter. This patch achieves that.

Note, this issue has existed since the original TC action
infrastructure code patch back in 2004 as found in:
https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/

[1] http://patchwork.ozlabs.org/patch/682828/
[2] http://patchwork.ozlabs.org/patch/682829/

Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.")
Reported-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9ee7837449b3d6f0fcf9132c6b5e5aaa58cc67d4 ]

Daniel says:

While trying out [1][2], I noticed that tc monitor doesn't show the
correct handle on delete:

$ tc monitor
qdisc clsact ffff: dev eno1 parent ffff:fff1
filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80

some context to explain the above:
The user identity of any tc filter is represented by a 32-bit
identifier encoded in tcm-&gt;tcm_handle. Example 0x2a in the bpf filter
above. A user wishing to delete, get or even modify a specific filter
uses this handle to reference it.
Every classifier is free to provide its own semantics for the 32 bit handle.
Example: classifiers like u32 use schemes like 800:1:801 to describe
the semantics of their filters represented as hash table, bucket and
node ids etc.
Classifiers also have internal per-filter representation which is different
from this externally visible identity. Most classifiers set this
internal representation to be a pointer address (which allows fast retrieval
of said filters in their implementations). This internal representation
is referenced with the "fh" variable in the kernel control code.

When a user successfuly deletes a specific filter, by specifying the correct
tcm-&gt;tcm_handle, an event is generated to user space which indicates
which specific filter was deleted.

Before this patch, the "fh" value was sent to user space as the identity.
As an example what is shown in the sample bpf filter delete event above
is 0xf3be0c80. This is infact a 32-bit truncation of 0xffff8807f3be0c80
which happens to be a 64-bit memory address of the internal filter
representation (address of the corresponding filter's struct cls_bpf_prog);

After this patch the appropriate user identifiable handle as encoded
in the originating request tcm-&gt;tcm_handle is generated in the event.
One of the cardinal rules of netlink rules is to be able to take an
event (such as a delete in this case) and reflect it back to the
kernel and successfully delete the filter. This patch achieves that.

Note, this issue has existed since the original TC action
infrastructure code patch back in 2004 as found in:
https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/

[1] http://patchwork.ozlabs.org/patch/682828/
[2] http://patchwork.ozlabs.org/patch/682829/

Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.")
Reported-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>m32r: add io*_rep helpers</title>
<updated>2017-10-12T14:27:21+00:00</updated>
<author>
<name>Sudip Mukherjee</name>
<email>sudipm.mukherjee@gmail.com</email>
</author>
<published>2015-12-29T22:54:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cefa3724f1f806bb63402962bab92687437e145a'/>
<id>cefa3724f1f806bb63402962bab92687437e145a</id>
<content type='text'>
commit 92a8ed4c7643809123ef0a65424569eaacc5c6b0 upstream.

m32r allmodconfig was failing with the error:

  error: implicit declaration of function 'read'

On checking io.h it turned out that 'read' is not defined but 'readb' is
defined and 'ioread8' will then obviously mean 'readb'.

At the same time some of the helper functions ioreadN_rep() and
iowriteN_rep() were missing which also led to the build failure.

Signed-off-by: Sudip Mukherjee &lt;sudip@vectorindia.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 92a8ed4c7643809123ef0a65424569eaacc5c6b0 upstream.

m32r allmodconfig was failing with the error:

  error: implicit declaration of function 'read'

On checking io.h it turned out that 'read' is not defined but 'readb' is
defined and 'ioread8' will then obviously mean 'readb'.

At the same time some of the helper functions ioreadN_rep() and
iowriteN_rep() were missing which also led to the build failure.

Signed-off-by: Sudip Mukherjee &lt;sudip@vectorindia.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>m32r: add definition of ioremap_wc to io.h</title>
<updated>2017-10-12T14:27:21+00:00</updated>
<author>
<name>Abhilash Kesavan</name>
<email>a.kesavan@samsung.com</email>
</author>
<published>2015-02-06T13:45:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=aa17149de8631218f22f0a23f3239b03f458f6e5'/>
<id>aa17149de8631218f22f0a23f3239b03f458f6e5</id>
<content type='text'>
commit 71a49d16f06de2ccdf52ca247d496a2bb1ca36fe upstream.

Before adding a resource managed ioremap_wc function, we need
to have ioremap_wc defined for m32r to prevent build errors.

Signed-off-by: Abhilash Kesavan &lt;a.kesavan@samsung.com&gt;
Acked-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Sudip Mukherjee &lt;sudipm.mukherjee@gmail.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 71a49d16f06de2ccdf52ca247d496a2bb1ca36fe upstream.

Before adding a resource managed ioremap_wc function, we need
to have ioremap_wc defined for m32r to prevent build errors.

Signed-off-by: Abhilash Kesavan &lt;a.kesavan@samsung.com&gt;
Acked-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Sudip Mukherjee &lt;sudipm.mukherjee@gmail.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>perf/x86: Check if user fp is valid</title>
<updated>2017-10-12T14:27:21+00:00</updated>
<author>
<name>Arun Sharma</name>
<email>asharma@fb.com</email>
</author>
<published>2012-04-20T22:41:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b6eb7a21211e409aba13e14f941d512dd8b5de33'/>
<id>b6eb7a21211e409aba13e14f941d512dd8b5de33</id>
<content type='text'>
commit bc6ca7b342d5ae15c3ba3081fd40271b8039fb25 upstream.

Signed-off-by: Arun Sharma &lt;asharma@fb.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1334961696-19580-4-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: also add user_addr_max() macro]
Cc: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bc6ca7b342d5ae15c3ba3081fd40271b8039fb25 upstream.

Signed-off-by: Arun Sharma &lt;asharma@fb.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1334961696-19580-4-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: also add user_addr_max() macro]
Cc: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>MIPS: Refactor 'clear_page' and 'copy_page' functions.</title>
<updated>2017-10-12T14:27:21+00:00</updated>
<author>
<name>Steven J. Hill</name>
<email>sjhill@mips.com</email>
</author>
<published>2012-07-06T19:56:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a0960eda141683b532608071cc3a808a7f8559ef'/>
<id>a0960eda141683b532608071cc3a808a7f8559ef</id>
<content type='text'>
commit c022630633624a75b3b58f43dd3c6cc896a56cff upstream.

Remove usage of the '__attribute__((alias("...")))' hack that aliased
to integer arrays containing micro-assembled instructions. This hack
breaks when building a microMIPS kernel. It also makes the code much
easier to understand.

[ralf@linux-mips.org: Added back export of the clear_page and copy_page
symbols so certain modules will work again.  Also fixed build with
CONFIG_SIBYTE_DMA_PAGEOPS enabled.]

Signed-off-by: Steven J. Hill &lt;sjhill@mips.com&gt;
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/3866/
Acked-by: David Daney &lt;david.daney@cavium.com&gt;
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
[bwh: Backported to 3.2: adjust context]
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c022630633624a75b3b58f43dd3c6cc896a56cff upstream.

Remove usage of the '__attribute__((alias("...")))' hack that aliased
to integer arrays containing micro-assembled instructions. This hack
breaks when building a microMIPS kernel. It also makes the code much
easier to understand.

[ralf@linux-mips.org: Added back export of the clear_page and copy_page
symbols so certain modules will work again.  Also fixed build with
CONFIG_SIBYTE_DMA_PAGEOPS enabled.]

Signed-off-by: Steven J. Hill &lt;sjhill@mips.com&gt;
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/3866/
Acked-by: David Daney &lt;david.daney@cavium.com&gt;
Signed-off-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
[bwh: Backported to 3.2: adjust context]
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get</title>
<updated>2017-10-12T14:27:21+00:00</updated>
<author>
<name>Andrey Vagin</name>
<email>avagin@openvz.org</email>
</author>
<published>2014-01-29T18:34:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cf33a744b388cbf5fdd211a3109cc17e420ef8b4'/>
<id>cf33a744b388cbf5fdd211a3109cc17e420ef8b4</id>
<content type='text'>
commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.

Lets look at destroy_conntrack:

hlist_nulls_del_rcu(&amp;ct-&gt;tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net-&gt;ct.nf_conntrack_cachep, ct);

net-&gt;ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested to check the confirm bit too. I think it's
right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&amp;ct-&gt;tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.

&lt;2&gt;[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
&lt;4&gt;[46267.083951] RIP: 0010:[&lt;ffffffffa01e00a4&gt;]  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
&lt;4&gt;[46267.085549] Call Trace:
&lt;4&gt;[46267.085622]  [&lt;ffffffffa023421b&gt;] alloc_null_binding+0x5b/0xa0 [iptable_nat]
&lt;4&gt;[46267.085697]  [&lt;ffffffffa02342bc&gt;] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
&lt;4&gt;[46267.085770]  [&lt;ffffffffa0234521&gt;] nf_nat_fn+0x111/0x260 [iptable_nat]
&lt;4&gt;[46267.085843]  [&lt;ffffffffa0234798&gt;] nf_nat_out+0x48/0xd0 [iptable_nat]
&lt;4&gt;[46267.085919]  [&lt;ffffffff814841b9&gt;] nf_iterate+0x69/0xb0
&lt;4&gt;[46267.085991]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086063]  [&lt;ffffffff81484374&gt;] nf_hook_slow+0x74/0x110
&lt;4&gt;[46267.086133]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086207]  [&lt;ffffffff814b5890&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[46267.086277]  [&lt;ffffffff81495204&gt;] ip_output+0xa4/0xc0
&lt;4&gt;[46267.086346]  [&lt;ffffffff814b65a4&gt;] raw_sendmsg+0x8b4/0x910
&lt;4&gt;[46267.086419]  [&lt;ffffffff814c10fa&gt;] inet_sendmsg+0x4a/0xb0
&lt;4&gt;[46267.086491]  [&lt;ffffffff814459aa&gt;] ? sock_update_classid+0x3a/0x50
&lt;4&gt;[46267.086562]  [&lt;ffffffff81444d67&gt;] sock_sendmsg+0x117/0x140
&lt;4&gt;[46267.086638]  [&lt;ffffffff8151997b&gt;] ? _spin_unlock_bh+0x1b/0x20
&lt;4&gt;[46267.086712]  [&lt;ffffffff8109d370&gt;] ? autoremove_wake_function+0x0/0x40
&lt;4&gt;[46267.086785]  [&lt;ffffffff81495e80&gt;] ? do_ip_setsockopt+0x90/0xd80
&lt;4&gt;[46267.086858]  [&lt;ffffffff8100be0e&gt;] ? call_function_interrupt+0xe/0x20
&lt;4&gt;[46267.086936]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087006]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087081]  [&lt;ffffffff8118f2e8&gt;] ? kmem_cache_alloc+0xd8/0x1e0
&lt;4&gt;[46267.087151]  [&lt;ffffffff81445599&gt;] sys_sendto+0x139/0x190
&lt;4&gt;[46267.087229]  [&lt;ffffffff81448c0d&gt;] ? sock_setsockopt+0x16d/0x6f0
&lt;4&gt;[46267.087303]  [&lt;ffffffff810efa47&gt;] ? audit_syscall_entry+0x1d7/0x200
&lt;4&gt;[46267.087378]  [&lt;ffffffff810ef795&gt;] ? __audit_syscall_exit+0x265/0x290
&lt;4&gt;[46267.087454]  [&lt;ffffffff81474885&gt;] ? compat_sys_setsockopt+0x75/0x210
&lt;4&gt;[46267.087531]  [&lt;ffffffff81474b5f&gt;] compat_sys_socketcall+0x13f/0x210
&lt;4&gt;[46267.087607]  [&lt;ffffffff8104dea3&gt;] ia32_sysret+0x0/0x5
&lt;4&gt;[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 &lt;0f&gt; 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
&lt;1&gt;[46267.088023] RIP  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Cc: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Signed-off-by: Andrey Vagin &lt;avagin@openvz.org&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.

Lets look at destroy_conntrack:

hlist_nulls_del_rcu(&amp;ct-&gt;tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net-&gt;ct.nf_conntrack_cachep, ct);

net-&gt;ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested to check the confirm bit too. I think it's
right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&amp;ct-&gt;tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.

&lt;2&gt;[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
&lt;4&gt;[46267.083951] RIP: 0010:[&lt;ffffffffa01e00a4&gt;]  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
&lt;4&gt;[46267.085549] Call Trace:
&lt;4&gt;[46267.085622]  [&lt;ffffffffa023421b&gt;] alloc_null_binding+0x5b/0xa0 [iptable_nat]
&lt;4&gt;[46267.085697]  [&lt;ffffffffa02342bc&gt;] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
&lt;4&gt;[46267.085770]  [&lt;ffffffffa0234521&gt;] nf_nat_fn+0x111/0x260 [iptable_nat]
&lt;4&gt;[46267.085843]  [&lt;ffffffffa0234798&gt;] nf_nat_out+0x48/0xd0 [iptable_nat]
&lt;4&gt;[46267.085919]  [&lt;ffffffff814841b9&gt;] nf_iterate+0x69/0xb0
&lt;4&gt;[46267.085991]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086063]  [&lt;ffffffff81484374&gt;] nf_hook_slow+0x74/0x110
&lt;4&gt;[46267.086133]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086207]  [&lt;ffffffff814b5890&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[46267.086277]  [&lt;ffffffff81495204&gt;] ip_output+0xa4/0xc0
&lt;4&gt;[46267.086346]  [&lt;ffffffff814b65a4&gt;] raw_sendmsg+0x8b4/0x910
&lt;4&gt;[46267.086419]  [&lt;ffffffff814c10fa&gt;] inet_sendmsg+0x4a/0xb0
&lt;4&gt;[46267.086491]  [&lt;ffffffff814459aa&gt;] ? sock_update_classid+0x3a/0x50
&lt;4&gt;[46267.086562]  [&lt;ffffffff81444d67&gt;] sock_sendmsg+0x117/0x140
&lt;4&gt;[46267.086638]  [&lt;ffffffff8151997b&gt;] ? _spin_unlock_bh+0x1b/0x20
&lt;4&gt;[46267.086712]  [&lt;ffffffff8109d370&gt;] ? autoremove_wake_function+0x0/0x40
&lt;4&gt;[46267.086785]  [&lt;ffffffff81495e80&gt;] ? do_ip_setsockopt+0x90/0xd80
&lt;4&gt;[46267.086858]  [&lt;ffffffff8100be0e&gt;] ? call_function_interrupt+0xe/0x20
&lt;4&gt;[46267.086936]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087006]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087081]  [&lt;ffffffff8118f2e8&gt;] ? kmem_cache_alloc+0xd8/0x1e0
&lt;4&gt;[46267.087151]  [&lt;ffffffff81445599&gt;] sys_sendto+0x139/0x190
&lt;4&gt;[46267.087229]  [&lt;ffffffff81448c0d&gt;] ? sock_setsockopt+0x16d/0x6f0
&lt;4&gt;[46267.087303]  [&lt;ffffffff810efa47&gt;] ? audit_syscall_entry+0x1d7/0x200
&lt;4&gt;[46267.087378]  [&lt;ffffffff810ef795&gt;] ? __audit_syscall_exit+0x265/0x290
&lt;4&gt;[46267.087454]  [&lt;ffffffff81474885&gt;] ? compat_sys_setsockopt+0x75/0x210
&lt;4&gt;[46267.087531]  [&lt;ffffffff81474b5f&gt;] compat_sys_socketcall+0x13f/0x210
&lt;4&gt;[46267.087607]  [&lt;ffffffff8104dea3&gt;] ia32_sysret+0x0/0x5
&lt;4&gt;[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 &lt;0f&gt; 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
&lt;1&gt;[46267.088023] RIP  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Cc: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Signed-off-by: Andrey Vagin &lt;avagin@openvz.org&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv</title>
<updated>2017-10-12T14:27:20+00:00</updated>
<author>
<name>Paul Hüber</name>
<email>phueber@kernsp.in</email>
</author>
<published>2017-02-26T16:58:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3a45ed9afc5d3f009e8358ae01a8360c70ae6e32'/>
<id>3a45ed9afc5d3f009e8358ae01a8360c70ae6e32</id>
<content type='text'>
commit 51fb60eb162ab84c5edf2ae9c63cf0b878e5547e upstream.

l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.

Signed-off-by: Paul Hüber &lt;phueber@kernsp.in&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 51fb60eb162ab84c5edf2ae9c63cf0b878e5547e upstream.

l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.

Signed-off-by: Paul Hüber &lt;phueber@kernsp.in&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
