<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/netfilter, branch v3.12.52</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>netfilter: xt_NFQUEUE: fix --queue-bypass regression</title>
<updated>2015-11-14T16:39:26+00:00</updated>
<author>
<name>Holger Eitzenberger</name>
<email>holger@eitzenberger.org</email>
</author>
<published>2013-10-28T13:42:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=09f041d02b9d5407d544d6fd5a9026072c44956f'/>
<id>09f041d02b9d5407d544d6fd5a9026072c44956f</id>
<content type='text'>
commit d954777324ffcba0b2f8119c102237426c654eeb upstream.

V3 of the NFQUEUE target ignores the --queue-bypass flag,
causing packets to be dropped when the userspace listener
isn't running.

Regression is in since 8746ddcf12bb26 ("netfilter: xt_NFQUEUE:
introduce CPU fanout").

Reported-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Holger Eitzenberger &lt;holger@eitzenberger.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d954777324ffcba0b2f8119c102237426c654eeb upstream.

V3 of the NFQUEUE target ignores the --queue-bypass flag,
causing packets to be dropped when the userspace listener
isn't running.

Regression is in since 8746ddcf12bb26 ("netfilter: xt_NFQUEUE:
introduce CPU fanout").

Reported-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Holger Eitzenberger &lt;holger@eitzenberger.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: fix crash with sync protocol v0 and FTP</title>
<updated>2015-10-28T15:37:59+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2015-07-08T05:31:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=29bfe66a2dd06e00d758054de519f67ff82b77dd'/>
<id>29bfe66a2dd06e00d758054de519f67ff82b77dd</id>
<content type='text'>
commit 56184858d1fc95c46723436b455cb7261cd8be6f upstream.

Fix crash in 3.5+ if FTP is used after switching
sync_version to 0.

Fixes: 749c42b620a9 ("ipvs: reduce sync rate with time thresholds")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 56184858d1fc95c46723436b455cb7261cd8be6f upstream.

Fix crash in 3.5+ if FTP is used after switching
sync_version to 0.

Fixes: 749c42b620a9 ("ipvs: reduce sync rate with time thresholds")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: do not use random local source address for tunnels</title>
<updated>2015-10-28T15:37:59+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2015-06-27T11:39:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2f0c79dd1e9d55a279b0a8e363717b7a896fe7b4'/>
<id>2f0c79dd1e9d55a279b0a8e363717b7a896fe7b4</id>
<content type='text'>
commit 4754957f04f5f368792a0eb7dab0ae89fb93dcfd upstream.

Michael Vallaly reports about wrong source address used
in rare cases for tunneled traffic. Looks like
__ip_vs_get_out_rt in 3.10+ is providing uninitialized
dest_dst-&gt;dst_saddr.ip because ip_vs_dest_dst_alloc uses
kmalloc. While we retry after seeing EINVAL from routing
for data that does not look like valid local address, it
still succeeded when this memory was previously used from
other dests and with different local addresses. As result,
we can use valid local address that is not suitable for
our real server.

Fix it by providing 0.0.0.0 every time our cache is refreshed.
By this way we will get preferred source address from routing.

Reported-by: Michael Vallaly &lt;lvs@nolatency.com&gt;
Fixes: 026ace060dfe ("ipvs: optimize dst usage for real server")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4754957f04f5f368792a0eb7dab0ae89fb93dcfd upstream.

Michael Vallaly reports about wrong source address used
in rare cases for tunneled traffic. Looks like
__ip_vs_get_out_rt in 3.10+ is providing uninitialized
dest_dst-&gt;dst_saddr.ip because ip_vs_dest_dst_alloc uses
kmalloc. While we retry after seeing EINVAL from routing
for data that does not look like valid local address, it
still succeeded when this memory was previously used from
other dests and with different local addresses. As result,
we can use valid local address that is not suitable for
our real server.

Fix it by providing 0.0.0.0 every time our cache is refreshed.
By this way we will get preferred source address from routing.

Reported-by: Michael Vallaly &lt;lvs@nolatency.com&gt;
Fixes: 026ace060dfe ("ipvs: optimize dst usage for real server")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ctnetlink: put back references to master ct and expect objects</title>
<updated>2015-10-28T15:37:52+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2015-07-09T20:56:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=89c632d16bbfd4a8ee8ed0224e65eb4561d63dbb'/>
<id>89c632d16bbfd4a8ee8ed0224e65eb4561d63dbb</id>
<content type='text'>
commit 95dd8653de658143770cb0e55a58d2aab97c79d2 upstream.

We have to put back the references to the master conntrack and the expectation
that we just created, otherwise we'll leak them.

Fixes: 0ef71ee1a5b9 ("netfilter: ctnetlink: refactor ctnetlink_create_expect")
Reported-by: Tim Wiess &lt;Tim.Wiess@watchguard.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 95dd8653de658143770cb0e55a58d2aab97c79d2 upstream.

We have to put back the references to the master conntrack and the expectation
that we just created, otherwise we'll leak them.

Fixes: 0ef71ee1a5b9 ("netfilter: ctnetlink: refactor ctnetlink_create_expect")
Reported-by: Tim Wiess &lt;Tim.Wiess@watchguard.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: Support expectations in different zones</title>
<updated>2015-10-28T15:37:51+00:00</updated>
<author>
<name>Joe Stringer</name>
<email>joestringer@nicira.com</email>
</author>
<published>2015-07-22T04:37:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3ae90ef744defaf22e68ca307abe03451a543194'/>
<id>3ae90ef744defaf22e68ca307abe03451a543194</id>
<content type='text'>
commit 4b31814d20cbe5cd4ccf18089751e77a04afe4f2 upstream.

When zones were originally introduced, the expectation functions were
all extended to perform lookup using the zone. However, insertion was
not modified to check the zone. This means that two expectations which
are intended to apply for different connections that have the same tuple
but exist in different zones cannot both be tracked.

Fixes: 5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones")
Signed-off-by: Joe Stringer &lt;joestringer@nicira.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4b31814d20cbe5cd4ccf18089751e77a04afe4f2 upstream.

When zones were originally introduced, the expectation functions were
all extended to perform lookup using the zone. However, insertion was
not modified to check the zone. This means that two expectations which
are intended to apply for different connections that have the same tuple
but exist in different zones cannot both be tracked.

Fixes: 5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones")
Signed-off-by: Joe Stringer &lt;joestringer@nicira.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt</title>
<updated>2015-09-14T14:28:42+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2015-09-11T12:26:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a7775d15b11a277f8af0dc4df69ae420b266e3dd'/>
<id>a7775d15b11a277f8af0dc4df69ae420b266e3dd</id>
<content type='text'>
[ Upstream commit e53376bef2cd97d3e3f61fdc677fb8da7d03d0da ]

With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the list, so we fulfill
Eric's golden rule which is that all released objects always have a
refcount that equals zero.

Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().

A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.

CPU1                                    CPU2
____nf_conntrack_find()
                                        nf_ct_put()
                                         destroy_conntrack()
                                        ...
                                        init_conntrack
                                         __nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&amp;ct-&gt;use) (use = 2)
                                         if (!l4proto-&gt;new(ct, skb, dataoff, timeouts))
                                          nf_conntrack_free(ct); (use = 2 !!!)
                                        ...
                                        __nf_conntrack_alloc (set use = 1)
 if (!nf_ct_key_equal(h, tuple, zone))
  nf_ct_put(ct); (use = 0)
   destroy_conntrack()
                                        /* continue to work with CT */

After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():

&lt;4&gt;[67096.759334] ------------[ cut here ]------------
&lt;2&gt;[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
&lt;4&gt;[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
&lt;4&gt;[67096.759932] RIP: 0010:[&lt;ffffffffa03d99ac&gt;]  [&lt;ffffffffa03d99ac&gt;] destroy_conntrack+0x15c/0x190 [nf_conntrack]
&lt;4&gt;[67096.760255] Call Trace:
&lt;4&gt;[67096.760255]  [&lt;ffffffff814844a7&gt;] nf_conntrack_destroy+0x17/0x30
&lt;4&gt;[67096.760255]  [&lt;ffffffffa03d9bb5&gt;] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
&lt;4&gt;[67096.760255]  [&lt;ffffffffa03d9fb2&gt;] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
&lt;4&gt;[67096.760255]  [&lt;ffffffffa048c771&gt;] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
&lt;4&gt;[67096.760255]  [&lt;ffffffff81484419&gt;] nf_iterate+0x69/0xb0
&lt;4&gt;[67096.760255]  [&lt;ffffffff814b5b00&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814845d4&gt;] nf_hook_slow+0x74/0x110
&lt;4&gt;[67096.760255]  [&lt;ffffffff814b5b00&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814b66d5&gt;] raw_sendmsg+0x775/0x910
&lt;4&gt;[67096.760255]  [&lt;ffffffff8104c5a8&gt;] ? flush_tlb_others_ipi+0x128/0x130
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814c136a&gt;] inet_sendmsg+0x4a/0xb0
&lt;4&gt;[67096.760255]  [&lt;ffffffff81444e93&gt;] ? sock_sendmsg+0x13/0x140
&lt;4&gt;[67096.760255]  [&lt;ffffffff81444f97&gt;] sock_sendmsg+0x117/0x140
&lt;4&gt;[67096.760255]  [&lt;ffffffff8102e299&gt;] ? native_smp_send_reschedule+0x49/0x60
&lt;4&gt;[67096.760255]  [&lt;ffffffff81519beb&gt;] ? _spin_unlock_bh+0x1b/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff8109d930&gt;] ? autoremove_wake_function+0x0/0x40
&lt;4&gt;[67096.760255]  [&lt;ffffffff814960f0&gt;] ? do_ip_setsockopt+0x90/0xd80
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814457c9&gt;] sys_sendto+0x139/0x190
&lt;4&gt;[67096.760255]  [&lt;ffffffff810efa77&gt;] ? audit_syscall_entry+0x1d7/0x200
&lt;4&gt;[67096.760255]  [&lt;ffffffff810ef7c5&gt;] ? __audit_syscall_exit+0x265/0x290
&lt;4&gt;[67096.760255]  [&lt;ffffffff81474daf&gt;] compat_sys_socketcall+0x13f/0x210
&lt;4&gt;[67096.760255]  [&lt;ffffffff8104dea3&gt;] ia32_sysret+0x0/0x5

I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.

Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Andrew Vagin &lt;avagin@parallels.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Reported-by: Andrew Vagin &lt;avagin@parallels.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Andrew Vagin &lt;avagin@parallels.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e53376bef2cd97d3e3f61fdc677fb8da7d03d0da ]

With this patch, the conntrack refcount is initially set to zero and
it is bumped once it is added to any of the list, so we fulfill
Eric's golden rule which is that all released objects always have a
refcount that equals zero.

Andrey Vagin reports that nf_conntrack_free can't be called for a
conntrack with non-zero ref-counter, because it can race with
nf_conntrack_find_get().

A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero
ref-counter says that this conntrack is used. So when we release
a conntrack with non-zero counter, we break this assumption.

CPU1                                    CPU2
____nf_conntrack_find()
                                        nf_ct_put()
                                         destroy_conntrack()
                                        ...
                                        init_conntrack
                                         __nf_conntrack_alloc (set use = 1)
atomic_inc_not_zero(&amp;ct-&gt;use) (use = 2)
                                         if (!l4proto-&gt;new(ct, skb, dataoff, timeouts))
                                          nf_conntrack_free(ct); (use = 2 !!!)
                                        ...
                                        __nf_conntrack_alloc (set use = 1)
 if (!nf_ct_key_equal(h, tuple, zone))
  nf_ct_put(ct); (use = 0)
   destroy_conntrack()
                                        /* continue to work with CT */

After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU
race in nf_conntrack_find_get" another bug was triggered in
destroy_conntrack():

&lt;4&gt;[67096.759334] ------------[ cut here ]------------
&lt;2&gt;[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211!
...
&lt;4&gt;[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G         C ---------------    2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB
&lt;4&gt;[67096.759932] RIP: 0010:[&lt;ffffffffa03d99ac&gt;]  [&lt;ffffffffa03d99ac&gt;] destroy_conntrack+0x15c/0x190 [nf_conntrack]
&lt;4&gt;[67096.760255] Call Trace:
&lt;4&gt;[67096.760255]  [&lt;ffffffff814844a7&gt;] nf_conntrack_destroy+0x17/0x30
&lt;4&gt;[67096.760255]  [&lt;ffffffffa03d9bb5&gt;] nf_conntrack_find_get+0x85/0x130 [nf_conntrack]
&lt;4&gt;[67096.760255]  [&lt;ffffffffa03d9fb2&gt;] nf_conntrack_in+0x352/0xb60 [nf_conntrack]
&lt;4&gt;[67096.760255]  [&lt;ffffffffa048c771&gt;] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4]
&lt;4&gt;[67096.760255]  [&lt;ffffffff81484419&gt;] nf_iterate+0x69/0xb0
&lt;4&gt;[67096.760255]  [&lt;ffffffff814b5b00&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814845d4&gt;] nf_hook_slow+0x74/0x110
&lt;4&gt;[67096.760255]  [&lt;ffffffff814b5b00&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814b66d5&gt;] raw_sendmsg+0x775/0x910
&lt;4&gt;[67096.760255]  [&lt;ffffffff8104c5a8&gt;] ? flush_tlb_others_ipi+0x128/0x130
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814c136a&gt;] inet_sendmsg+0x4a/0xb0
&lt;4&gt;[67096.760255]  [&lt;ffffffff81444e93&gt;] ? sock_sendmsg+0x13/0x140
&lt;4&gt;[67096.760255]  [&lt;ffffffff81444f97&gt;] sock_sendmsg+0x117/0x140
&lt;4&gt;[67096.760255]  [&lt;ffffffff8102e299&gt;] ? native_smp_send_reschedule+0x49/0x60
&lt;4&gt;[67096.760255]  [&lt;ffffffff81519beb&gt;] ? _spin_unlock_bh+0x1b/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff8109d930&gt;] ? autoremove_wake_function+0x0/0x40
&lt;4&gt;[67096.760255]  [&lt;ffffffff814960f0&gt;] ? do_ip_setsockopt+0x90/0xd80
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff8100bc4e&gt;] ? apic_timer_interrupt+0xe/0x20
&lt;4&gt;[67096.760255]  [&lt;ffffffff814457c9&gt;] sys_sendto+0x139/0x190
&lt;4&gt;[67096.760255]  [&lt;ffffffff810efa77&gt;] ? audit_syscall_entry+0x1d7/0x200
&lt;4&gt;[67096.760255]  [&lt;ffffffff810ef7c5&gt;] ? __audit_syscall_exit+0x265/0x290
&lt;4&gt;[67096.760255]  [&lt;ffffffff81474daf&gt;] compat_sys_socketcall+0x13f/0x210
&lt;4&gt;[67096.760255]  [&lt;ffffffff8104dea3&gt;] ia32_sysret+0x0/0x5

I have reused the original title for the RFC patch that Andrey posted and
most of the original patch description.

Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Andrew Vagin &lt;avagin@parallels.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Reported-by: Andrew Vagin &lt;avagin@parallels.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Andrew Vagin &lt;avagin@parallels.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get</title>
<updated>2015-09-14T14:28:41+00:00</updated>
<author>
<name>Andrey Vagin</name>
<email>avagin@openvz.org</email>
</author>
<published>2015-09-11T12:26:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=336323b7bd2cd60417a56703666d4b05c85e148a'/>
<id>336323b7bd2cd60417a56703666d4b05c85e148a</id>
<content type='text'>
[ Upstream commit c6825c0976fa7893692e0e43b09740b419b23c09 ]

Lets look at destroy_conntrack:

hlist_nulls_del_rcu(&amp;ct-&gt;tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net-&gt;ct.nf_conntrack_cachep, ct);

net-&gt;ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested to check the confirm bit too. I think it's
right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&amp;ct-&gt;tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.

&lt;2&gt;[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
&lt;4&gt;[46267.083951] RIP: 0010:[&lt;ffffffffa01e00a4&gt;]  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
&lt;4&gt;[46267.085549] Call Trace:
&lt;4&gt;[46267.085622]  [&lt;ffffffffa023421b&gt;] alloc_null_binding+0x5b/0xa0 [iptable_nat]
&lt;4&gt;[46267.085697]  [&lt;ffffffffa02342bc&gt;] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
&lt;4&gt;[46267.085770]  [&lt;ffffffffa0234521&gt;] nf_nat_fn+0x111/0x260 [iptable_nat]
&lt;4&gt;[46267.085843]  [&lt;ffffffffa0234798&gt;] nf_nat_out+0x48/0xd0 [iptable_nat]
&lt;4&gt;[46267.085919]  [&lt;ffffffff814841b9&gt;] nf_iterate+0x69/0xb0
&lt;4&gt;[46267.085991]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086063]  [&lt;ffffffff81484374&gt;] nf_hook_slow+0x74/0x110
&lt;4&gt;[46267.086133]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086207]  [&lt;ffffffff814b5890&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[46267.086277]  [&lt;ffffffff81495204&gt;] ip_output+0xa4/0xc0
&lt;4&gt;[46267.086346]  [&lt;ffffffff814b65a4&gt;] raw_sendmsg+0x8b4/0x910
&lt;4&gt;[46267.086419]  [&lt;ffffffff814c10fa&gt;] inet_sendmsg+0x4a/0xb0
&lt;4&gt;[46267.086491]  [&lt;ffffffff814459aa&gt;] ? sock_update_classid+0x3a/0x50
&lt;4&gt;[46267.086562]  [&lt;ffffffff81444d67&gt;] sock_sendmsg+0x117/0x140
&lt;4&gt;[46267.086638]  [&lt;ffffffff8151997b&gt;] ? _spin_unlock_bh+0x1b/0x20
&lt;4&gt;[46267.086712]  [&lt;ffffffff8109d370&gt;] ? autoremove_wake_function+0x0/0x40
&lt;4&gt;[46267.086785]  [&lt;ffffffff81495e80&gt;] ? do_ip_setsockopt+0x90/0xd80
&lt;4&gt;[46267.086858]  [&lt;ffffffff8100be0e&gt;] ? call_function_interrupt+0xe/0x20
&lt;4&gt;[46267.086936]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087006]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087081]  [&lt;ffffffff8118f2e8&gt;] ? kmem_cache_alloc+0xd8/0x1e0
&lt;4&gt;[46267.087151]  [&lt;ffffffff81445599&gt;] sys_sendto+0x139/0x190
&lt;4&gt;[46267.087229]  [&lt;ffffffff81448c0d&gt;] ? sock_setsockopt+0x16d/0x6f0
&lt;4&gt;[46267.087303]  [&lt;ffffffff810efa47&gt;] ? audit_syscall_entry+0x1d7/0x200
&lt;4&gt;[46267.087378]  [&lt;ffffffff810ef795&gt;] ? __audit_syscall_exit+0x265/0x290
&lt;4&gt;[46267.087454]  [&lt;ffffffff81474885&gt;] ? compat_sys_setsockopt+0x75/0x210
&lt;4&gt;[46267.087531]  [&lt;ffffffff81474b5f&gt;] compat_sys_socketcall+0x13f/0x210
&lt;4&gt;[46267.087607]  [&lt;ffffffff8104dea3&gt;] ia32_sysret+0x0/0x5
&lt;4&gt;[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 &lt;0f&gt; 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
&lt;1&gt;[46267.088023] RIP  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Cc: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Signed-off-by: Andrey Vagin &lt;avagin@openvz.org&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c6825c0976fa7893692e0e43b09740b419b23c09 ]

Lets look at destroy_conntrack:

hlist_nulls_del_rcu(&amp;ct-&gt;tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net-&gt;ct.nf_conntrack_cachep, ct);

net-&gt;ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested to check the confirm bit too. I think it's
right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&amp;ct-&gt;tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.

&lt;2&gt;[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
&lt;4&gt;[46267.083951] RIP: 0010:[&lt;ffffffffa01e00a4&gt;]  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
&lt;4&gt;[46267.085549] Call Trace:
&lt;4&gt;[46267.085622]  [&lt;ffffffffa023421b&gt;] alloc_null_binding+0x5b/0xa0 [iptable_nat]
&lt;4&gt;[46267.085697]  [&lt;ffffffffa02342bc&gt;] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
&lt;4&gt;[46267.085770]  [&lt;ffffffffa0234521&gt;] nf_nat_fn+0x111/0x260 [iptable_nat]
&lt;4&gt;[46267.085843]  [&lt;ffffffffa0234798&gt;] nf_nat_out+0x48/0xd0 [iptable_nat]
&lt;4&gt;[46267.085919]  [&lt;ffffffff814841b9&gt;] nf_iterate+0x69/0xb0
&lt;4&gt;[46267.085991]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086063]  [&lt;ffffffff81484374&gt;] nf_hook_slow+0x74/0x110
&lt;4&gt;[46267.086133]  [&lt;ffffffff81494e70&gt;] ? ip_finish_output+0x0/0x2f0
&lt;4&gt;[46267.086207]  [&lt;ffffffff814b5890&gt;] ? dst_output+0x0/0x20
&lt;4&gt;[46267.086277]  [&lt;ffffffff81495204&gt;] ip_output+0xa4/0xc0
&lt;4&gt;[46267.086346]  [&lt;ffffffff814b65a4&gt;] raw_sendmsg+0x8b4/0x910
&lt;4&gt;[46267.086419]  [&lt;ffffffff814c10fa&gt;] inet_sendmsg+0x4a/0xb0
&lt;4&gt;[46267.086491]  [&lt;ffffffff814459aa&gt;] ? sock_update_classid+0x3a/0x50
&lt;4&gt;[46267.086562]  [&lt;ffffffff81444d67&gt;] sock_sendmsg+0x117/0x140
&lt;4&gt;[46267.086638]  [&lt;ffffffff8151997b&gt;] ? _spin_unlock_bh+0x1b/0x20
&lt;4&gt;[46267.086712]  [&lt;ffffffff8109d370&gt;] ? autoremove_wake_function+0x0/0x40
&lt;4&gt;[46267.086785]  [&lt;ffffffff81495e80&gt;] ? do_ip_setsockopt+0x90/0xd80
&lt;4&gt;[46267.086858]  [&lt;ffffffff8100be0e&gt;] ? call_function_interrupt+0xe/0x20
&lt;4&gt;[46267.086936]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087006]  [&lt;ffffffff8118cb10&gt;] ? ub_slab_ptr+0x20/0x90
&lt;4&gt;[46267.087081]  [&lt;ffffffff8118f2e8&gt;] ? kmem_cache_alloc+0xd8/0x1e0
&lt;4&gt;[46267.087151]  [&lt;ffffffff81445599&gt;] sys_sendto+0x139/0x190
&lt;4&gt;[46267.087229]  [&lt;ffffffff81448c0d&gt;] ? sock_setsockopt+0x16d/0x6f0
&lt;4&gt;[46267.087303]  [&lt;ffffffff810efa47&gt;] ? audit_syscall_entry+0x1d7/0x200
&lt;4&gt;[46267.087378]  [&lt;ffffffff810ef795&gt;] ? __audit_syscall_exit+0x265/0x290
&lt;4&gt;[46267.087454]  [&lt;ffffffff81474885&gt;] ? compat_sys_setsockopt+0x75/0x210
&lt;4&gt;[46267.087531]  [&lt;ffffffff81474b5f&gt;] compat_sys_socketcall+0x13f/0x210
&lt;4&gt;[46267.087607]  [&lt;ffffffff8104dea3&gt;] ia32_sysret+0x0/0x5
&lt;4&gt;[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 &lt;0f&gt; 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
&lt;1&gt;[46267.088023] RIP  [&lt;ffffffffa01e00a4&gt;] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Cc: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Signed-off-by: Andrey Vagin &lt;avagin@openvz.org&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nfnetlink_cthelper: Remove 'const' and '&amp;' to avoid warnings</title>
<updated>2015-07-30T11:21:24+00:00</updated>
<author>
<name>Chen Gang</name>
<email>gang.chen.5i5j@gmail.com</email>
</author>
<published>2014-12-24T15:04:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=26c68a08f55169b2b103ad3d9a8b86191d486ded'/>
<id>26c68a08f55169b2b103ad3d9a8b86191d486ded</id>
<content type='text'>
commit b18c5d15e8714336365d9d51782d5b53afa0443c upstream.

The related code can be simplified, and also can avoid related warnings
(with allmodconfig under parisc):

    CC [M]  net/netfilter/nfnetlink_cthelper.o
  net/netfilter/nfnetlink_cthelper.c: In function ‘nfnl_cthelper_from_nlattr’:
  net/netfilter/nfnetlink_cthelper.c:97:9: warning: passing argument 1 o ‘memcpy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers]
    memcpy(&amp;help-&gt;data, nla_data(attr), help-&gt;helper-&gt;data_len);
           ^
  In file included from include/linux/string.h:17:0,
                   from include/uapi/linux/uuid.h:25,
                   from include/linux/uuid.h:23,
                   from include/linux/mod_devicetable.h:12,
                   from ./arch/parisc/include/asm/hardware.h:4,
                   from ./arch/parisc/include/asm/processor.h:15,
                   from ./arch/parisc/include/asm/spinlock.h:6,
                   from ./arch/parisc/include/asm/atomic.h:21,
                   from include/linux/atomic.h:4,
                   from ./arch/parisc/include/asm/bitops.h:12,
                   from include/linux/bitops.h:36,
                   from include/linux/kernel.h:10,
                   from include/linux/list.h:8,
                   from include/linux/module.h:9,
                   from net/netfilter/nfnetlink_cthelper.c:11:
  ./arch/parisc/include/asm/string.h:8:8: note: expected ‘void *’ but argument is of type ‘const char (*)[]’
   void * memcpy(void * dest,const void *src,size_t count);
          ^

Signed-off-by: Chen Gang &lt;gang.chen.5i5j@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@soleta.eu&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b18c5d15e8714336365d9d51782d5b53afa0443c upstream.

The related code can be simplified, and also can avoid related warnings
(with allmodconfig under parisc):

    CC [M]  net/netfilter/nfnetlink_cthelper.o
  net/netfilter/nfnetlink_cthelper.c: In function ‘nfnl_cthelper_from_nlattr’:
  net/netfilter/nfnetlink_cthelper.c:97:9: warning: passing argument 1 o ‘memcpy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers]
    memcpy(&amp;help-&gt;data, nla_data(attr), help-&gt;helper-&gt;data_len);
           ^
  In file included from include/linux/string.h:17:0,
                   from include/uapi/linux/uuid.h:25,
                   from include/linux/uuid.h:23,
                   from include/linux/mod_devicetable.h:12,
                   from ./arch/parisc/include/asm/hardware.h:4,
                   from ./arch/parisc/include/asm/processor.h:15,
                   from ./arch/parisc/include/asm/spinlock.h:6,
                   from ./arch/parisc/include/asm/atomic.h:21,
                   from include/linux/atomic.h:4,
                   from ./arch/parisc/include/asm/bitops.h:12,
                   from include/linux/bitops.h:36,
                   from include/linux/kernel.h:10,
                   from include/linux/list.h:8,
                   from include/linux/module.h:9,
                   from net/netfilter/nfnetlink_cthelper.c:11:
  ./arch/parisc/include/asm/string.h:8:8: note: expected ‘void *’ but argument is of type ‘const char (*)[]’
   void * memcpy(void * dest,const void *src,size_t count);
          ^

Signed-off-by: Chen Gang &lt;gang.chen.5i5j@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@soleta.eu&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()</title>
<updated>2015-05-20T09:15:47+00:00</updated>
<author>
<name>Ian Wilson</name>
<email>iwilson@brocade.com</email>
</author>
<published>2015-05-16T18:50:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=93091169a673f49c2574cddf1ef858cf0704f646'/>
<id>93091169a673f49c2574cddf1ef858cf0704f646</id>
<content type='text'>
[ upstream commit 78146572b9cd20452da47951812f35b1ad4906be ]

nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(),
nfnl_cthelper_get() and nfnl_cthelper_del().  In each case they pass
a pointer to an nf_conntrack_tuple data structure local variable:

    struct nf_conntrack_tuple tuple;
    ...
    ret = nfnl_cthelper_parse_tuple(&amp;tuple, tb[NFCTH_TUPLE]);

The problem is that this local variable is not initialized, and
nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and
dst.protonum.  This leaves all other fields with undefined values
based on whatever is on the stack:

    tuple-&gt;src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM]));
    tuple-&gt;dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]);

The symptom observed was that when the rpc and tns helpers were added
then traffic to port 1536 was being sent to user-space.

Signed-off-by: Ian Wilson &lt;iwilson@brocade.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ upstream commit 78146572b9cd20452da47951812f35b1ad4906be ]

nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(),
nfnl_cthelper_get() and nfnl_cthelper_del().  In each case they pass
a pointer to an nf_conntrack_tuple data structure local variable:

    struct nf_conntrack_tuple tuple;
    ...
    ret = nfnl_cthelper_parse_tuple(&amp;tuple, tb[NFCTH_TUPLE]);

The problem is that this local variable is not initialized, and
nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and
dst.protonum.  This leaves all other fields with undefined values
based on whatever is on the stack:

    tuple-&gt;src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM]));
    tuple-&gt;dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]);

The symptom observed was that when the rpc and tns helpers were added
then traffic to port 1536 was being sent to user-space.

Signed-off-by: Ian Wilson &lt;iwilson@brocade.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: conntrack: disable generic tracking for known protocols</title>
<updated>2015-04-27T18:25:29+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2014-09-26T09:35:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2fb11da9d9016f6c0a4fcb99b8ebd63495c79005'/>
<id>2fb11da9d9016f6c0a4fcb99b8ebd63495c79005</id>
<content type='text'>
commit db29a9508a9246e77087c5531e45b2c88ec6988b upstream.

Given following iptables ruleset:

-P FORWARD DROP
-A FORWARD -m sctp --dport 9 -j ACCEPT
-A FORWARD -p tcp --dport 80 -j ACCEPT
-A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT

One would assume that this allows SCTP on port 9 and TCP on port 80.
Unfortunately, if the SCTP conntrack module is not loaded, this allows
*all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
which we think is a security issue.

This is because on the first SCTP packet on port 9, we create a dummy
"generic l4" conntrack entry without any port information (since
conntrack doesn't know how to extract this information).

All subsequent packets that are unknown will then be in established
state since they will fallback to proto_generic and will match the
'generic' entry.

Our originally proposed version [1] completely disabled generic protocol
tracking, but Jozsef suggests to not track protocols for which a more
suitable helper is available, hence we now mitigate the issue for in
tree known ct protocol helpers only, so that at least NAT and direction
information will still be preserved for others.

 [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html

Joint work with Daniel Borkmann.

Fixes CVE-2014-8160.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Acked-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Zhiqiang Zhang &lt;zhangzhiqiang.zhang@huawei.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit db29a9508a9246e77087c5531e45b2c88ec6988b upstream.

Given following iptables ruleset:

-P FORWARD DROP
-A FORWARD -m sctp --dport 9 -j ACCEPT
-A FORWARD -p tcp --dport 80 -j ACCEPT
-A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT

One would assume that this allows SCTP on port 9 and TCP on port 80.
Unfortunately, if the SCTP conntrack module is not loaded, this allows
*all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
which we think is a security issue.

This is because on the first SCTP packet on port 9, we create a dummy
"generic l4" conntrack entry without any port information (since
conntrack doesn't know how to extract this information).

All subsequent packets that are unknown will then be in established
state since they will fallback to proto_generic and will match the
'generic' entry.

Our originally proposed version [1] completely disabled generic protocol
tracking, but Jozsef suggests to not track protocols for which a more
suitable helper is available, hence we now mitigate the issue for in
tree known ct protocol helpers only, so that at least NAT and direction
information will still be preserved for others.

 [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html

Joint work with Daniel Borkmann.

Fixes CVE-2014-8160.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Acked-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Zhiqiang Zhang &lt;zhangzhiqiang.zhang@huawei.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</pre>
</div>
</content>
</entry>
</feed>
