<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/netfilter, branch v6.8.4</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>netfilter: nf_tables: reject constant set with timeout</title>
<updated>2024-04-03T13:32:31+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2024-03-01T00:04:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=075be399f2df7a365155c5f3061649df34aced79'/>
<id>075be399f2df7a365155c5f3061649df34aced79</id>
<content type='text'>
commit 5f4fc4bd5cddb4770ab120ce44f02695c4505562 upstream.

This set combination is weird: it allows for elements to be
added/deleted, but once bound to the rule it cannot be updated anymore.
Eventually, all elements expire, leading to an empty set which cannot
be updated anymore. Reject this flags combination.

Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 5f4fc4bd5cddb4770ab120ce44f02695c4505562 upstream.

This set combination is weird: it allows for elements to be
added/deleted, but once bound to the rule it cannot be updated anymore.
Eventually, all elements expire, leading to an empty set which cannot
be updated anymore. Reject this flags combination.

Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: Fix a memory leak in nf_tables_updchain</title>
<updated>2024-03-26T22:17:38+00:00</updated>
<author>
<name>Quan Tian</name>
<email>tianquan23@gmail.com</email>
</author>
<published>2024-03-06T17:24:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e77a6b53a3a547b6dedfc40c37cee4f310701090'/>
<id>e77a6b53a3a547b6dedfc40c37cee4f310701090</id>
<content type='text'>
[ Upstream commit 7eaf837a4eb5f74561e2486972e7f5184b613f6e ]

If nft_netdev_register_hooks() fails, the memory associated with
nft_stats is not freed, causing a memory leak.

This patch fixes it by moving nft_stats_alloc() down after
nft_netdev_register_hooks() succeeds.

Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
Signed-off-by: Quan Tian &lt;tianquan23@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7eaf837a4eb5f74561e2486972e7f5184b613f6e ]

If nft_netdev_register_hooks() fails, the memory associated with
nft_stats is not freed, causing a memory leak.

This patch fixes it by moving nft_stats_alloc() down after
nft_netdev_register_hooks() succeeds.

Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
Signed-off-by: Quan Tian &lt;tianquan23@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: do not compare internal table flags on updates</title>
<updated>2024-03-26T22:17:37+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2024-03-14T17:51:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=df257c435e51651c43b86326d112ddadda76350e'/>
<id>df257c435e51651c43b86326d112ddadda76350e</id>
<content type='text'>
[ Upstream commit 4a0e7f2decbf9bd72461226f1f5f7dcc4b08f139 ]

Restore skipping transaction if table update does not modify flags.

Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 4a0e7f2decbf9bd72461226f1f5f7dcc4b08f139 ]

Restore skipping transaction if table update does not modify flags.

Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nft_set_pipapo: release elements in clone only from destroy path</title>
<updated>2024-03-26T22:17:37+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2024-03-10T09:02:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9384b4d85c46ce839f51af01374062ce6318b2f2'/>
<id>9384b4d85c46ce839f51af01374062ce6318b2f2</id>
<content type='text'>
[ Upstream commit b0e256f3dd2ba6532f37c5c22e07cb07a36031ee ]

Clone already always provides a current view of the lookup table, use it
to destroy the set, otherwise it is possible to destroy elements twice.

This fix requires:

 212ed75dc5fb ("netfilter: nf_tables: integrate pipapo into commit protocol")

which came after:

 9827a0e6e23b ("netfilter: nft_set_pipapo: release elements in clone from abort path").

Fixes: 9827a0e6e23b ("netfilter: nft_set_pipapo: release elements in clone from abort path")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b0e256f3dd2ba6532f37c5c22e07cb07a36031ee ]

Clone already always provides a current view of the lookup table, use it
to destroy the set, otherwise it is possible to destroy elements twice.

This fix requires:

 212ed75dc5fb ("netfilter: nf_tables: integrate pipapo into commit protocol")

which came after:

 9827a0e6e23b ("netfilter: nft_set_pipapo: release elements in clone from abort path").

Fixes: 9827a0e6e23b ("netfilter: nft_set_pipapo: release elements in clone from abort path")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack_h323: Add protection for bmp length out of range</title>
<updated>2024-03-07T02:10:35+00:00</updated>
<author>
<name>Lena Wang</name>
<email>lena.wang@mediatek.com</email>
</author>
<published>2024-03-05T11:38:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=767146637efc528b5e3d31297df115e85a2fd362'/>
<id>767146637efc528b5e3d31297df115e85a2fd362</id>
<content type='text'>
UBSAN load reports an exception of BRK#5515 SHIFT_ISSUE:Bitwise shifts
that are out of bounds for their data type.

vmlinux   get_bitmap(b=75) + 712
&lt;net/netfilter/nf_conntrack_h323_asn1.c:0&gt;
vmlinux   decode_seq(bs=0xFFFFFFD008037000, f=0xFFFFFFD008037018, level=134443100) + 1956
&lt;net/netfilter/nf_conntrack_h323_asn1.c:592&gt;
vmlinux   decode_choice(base=0xFFFFFFD0080370F0, level=23843636) + 1216
&lt;net/netfilter/nf_conntrack_h323_asn1.c:814&gt;
vmlinux   decode_seq(f=0xFFFFFFD0080371A8, level=134443500) + 812
&lt;net/netfilter/nf_conntrack_h323_asn1.c:576&gt;
vmlinux   decode_choice(base=0xFFFFFFD008037280, level=0) + 1216
&lt;net/netfilter/nf_conntrack_h323_asn1.c:814&gt;
vmlinux   DecodeRasMessage() + 304
&lt;net/netfilter/nf_conntrack_h323_asn1.c:833&gt;
vmlinux   ras_help() + 684
&lt;net/netfilter/nf_conntrack_h323_main.c:1728&gt;
vmlinux   nf_confirm() + 188
&lt;net/netfilter/nf_conntrack_proto.c:137&gt;

Due to abnormal data in skb-&gt;data, the extension bitmap length
exceeds 32 when decoding ras message then uses the length to make
a shift operation. It will change into negative after several loop.
UBSAN load could detect a negative shift as an undefined behaviour
and reports exception.
So we add the protection to avoid the length exceeding 32. Or else
it will return out of range error and stop decoding.

Fixes: 5e35941d9901 ("[NETFILTER]: Add H.323 conntrack/NAT helper")
Signed-off-by: Lena Wang &lt;lena.wang@mediatek.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
UBSAN load reports an exception of BRK#5515 SHIFT_ISSUE:Bitwise shifts
that are out of bounds for their data type.

vmlinux   get_bitmap(b=75) + 712
&lt;net/netfilter/nf_conntrack_h323_asn1.c:0&gt;
vmlinux   decode_seq(bs=0xFFFFFFD008037000, f=0xFFFFFFD008037018, level=134443100) + 1956
&lt;net/netfilter/nf_conntrack_h323_asn1.c:592&gt;
vmlinux   decode_choice(base=0xFFFFFFD0080370F0, level=23843636) + 1216
&lt;net/netfilter/nf_conntrack_h323_asn1.c:814&gt;
vmlinux   decode_seq(f=0xFFFFFFD0080371A8, level=134443500) + 812
&lt;net/netfilter/nf_conntrack_h323_asn1.c:576&gt;
vmlinux   decode_choice(base=0xFFFFFFD008037280, level=0) + 1216
&lt;net/netfilter/nf_conntrack_h323_asn1.c:814&gt;
vmlinux   DecodeRasMessage() + 304
&lt;net/netfilter/nf_conntrack_h323_asn1.c:833&gt;
vmlinux   ras_help() + 684
&lt;net/netfilter/nf_conntrack_h323_main.c:1728&gt;
vmlinux   nf_confirm() + 188
&lt;net/netfilter/nf_conntrack_proto.c:137&gt;

Due to abnormal data in skb-&gt;data, the extension bitmap length
exceeds 32 when decoding ras message then uses the length to make
a shift operation. It will change into negative after several loop.
UBSAN load could detect a negative shift as an undefined behaviour
and reports exception.
So we add the protection to avoid the length exceeding 32. Or else
it will return out of range error and stop decoding.

Fixes: 5e35941d9901 ("[NETFILTER]: Add H.323 conntrack/NAT helper")
Signed-off-by: Lena Wang &lt;lena.wang@mediatek.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout</title>
<updated>2024-03-07T02:08:26+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2024-03-04T13:22:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=552705a3650bbf46a22b1adedc1b04181490fc36'/>
<id>552705a3650bbf46a22b1adedc1b04181490fc36</id>
<content type='text'>
While the rhashtable set gc runs asynchronously, a race allows it to
collect elements from anonymous sets with timeouts while it is being
released from the commit path.

Mingi Cho originally reported this issue in a different path in 6.1.x
with a pipapo set with low timeouts which is not possible upstream since
7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set
element timeout").

Fix this by setting on the dead flag for anonymous sets to skip async gc
in this case.

According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on
transaction abort"), Florian plans to accelerate abort path by releasing
objects via workqueue, therefore, this sets on the dead flag for abort
path too.

Cc: stable@vger.kernel.org
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Mingi Cho &lt;mgcho.minic@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While the rhashtable set gc runs asynchronously, a race allows it to
collect elements from anonymous sets with timeouts while it is being
released from the commit path.

Mingi Cho originally reported this issue in a different path in 6.1.x
with a pipapo set with low timeouts which is not possible upstream since
7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set
element timeout").

Fix this by setting on the dead flag for anonymous sets to skip async gc
in this case.

According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on
transaction abort"), Florian plans to accelerate abort path by releasing
objects via workqueue, therefore, this sets on the dead flag for abort
path too.

Cc: stable@vger.kernel.org
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Mingi Cho &lt;mgcho.minic@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nft_ct: fix l3num expectations with inet pseudo family</title>
<updated>2024-03-06T23:12:34+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2024-03-01T12:38:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=99993789966a6eb4f1295193dc543686899892d3'/>
<id>99993789966a6eb4f1295193dc543686899892d3</id>
<content type='text'>
Following is rejected but should be allowed:

table inet t {
        ct expectation exp1 {
                [..]
                l3proto ip

Valid combos are:
table ip t, l3proto ip
table ip6 t, l3proto ip6
table inet t, l3proto ip OR l3proto ip6

Disallow inet pseudeo family, the l3num must be a on-wire protocol known
to conntrack.

Retain NFPROTO_INET case to make it clear its rejected
intentionally rather as oversight.

Fixes: 8059918a1377 ("netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Following is rejected but should be allowed:

table inet t {
        ct expectation exp1 {
                [..]
                l3proto ip

Valid combos are:
table ip t, l3proto ip
table ip6 t, l3proto ip6
table inet t, l3proto ip OR l3proto ip6

Disallow inet pseudeo family, the l3num must be a on-wire protocol known
to conntrack.

Retain NFPROTO_INET case to make it clear its rejected
intentionally rather as oversight.

Fixes: 8059918a1377 ("netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: reject constant set with timeout</title>
<updated>2024-03-06T23:02:42+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2024-03-01T00:04:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5f4fc4bd5cddb4770ab120ce44f02695c4505562'/>
<id>5f4fc4bd5cddb4770ab120ce44f02695c4505562</id>
<content type='text'>
This set combination is weird: it allows for elements to be
added/deleted, but once bound to the rule it cannot be updated anymore.
Eventually, all elements expire, leading to an empty set which cannot
be updated anymore. Reject this flags combination.

Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This set combination is weird: it allows for elements to be
added/deleted, but once bound to the rule it cannot be updated anymore.
Eventually, all elements expire, leading to an empty set which cannot
be updated anymore. Reject this flags combination.

Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: disallow anonymous set with timeout flag</title>
<updated>2024-03-06T23:02:42+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2024-02-29T23:11:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=16603605b667b70da974bea8216c93e7db043bf1'/>
<id>16603605b667b70da974bea8216c93e7db043bf1</id>
<content type='text'>
Anonymous sets are never used with timeout from userspace, reject this.
Exception to this rule is NFT_SET_EVAL to ensure legacy meters still work.

Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Reported-by: lonial con &lt;kongln9170@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Anonymous sets are never used with timeout from userspace, reject this.
Exception to this rule is NFT_SET_EVAL to ensure legacy meters still work.

Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Reported-by: lonial con &lt;kongln9170@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: bridge: confirm multicast packets before passing them up the stack</title>
<updated>2024-02-28T23:22:44+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2024-02-27T15:17:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=62e7151ae3eb465e0ab52a20c941ff33bb6332e9'/>
<id>62e7151ae3eb465e0ab52a20c941ff33bb6332e9</id>
<content type='text'>
conntrack nf_confirm logic cannot handle cloned skbs referencing
the same nf_conn entry, which will happen for multicast (broadcast)
frames on bridges.

 Example:
    macvlan0
       |
      br0
     /  \
  ethX    ethY

 ethX (or Y) receives a L2 multicast or broadcast packet containing
 an IP packet, flow is not yet in conntrack table.

 1. skb passes through bridge and fake-ip (br_netfilter)Prerouting.
    -&gt; skb-&gt;_nfct now references a unconfirmed entry
 2. skb is broad/mcast packet. bridge now passes clones out on each bridge
    interface.
 3. skb gets passed up the stack.
 4. In macvlan case, macvlan driver retains clone(s) of the mcast skb
    and schedules a work queue to send them out on the lower devices.

    The clone skb-&gt;_nfct is not a copy, it is the same entry as the
    original skb.  The macvlan rx handler then returns RX_HANDLER_PASS.
 5. Normal conntrack hooks (in NF_INET_LOCAL_IN) confirm the orig skb.

The Macvlan broadcast worker and normal confirm path will race.

This race will not happen if step 2 already confirmed a clone. In that
case later steps perform skb_clone() with skb-&gt;_nfct already confirmed (in
hash table).  This works fine.

But such confirmation won't happen when eb/ip/nftables rules dropped the
packets before they reached the nf_confirm step in postrouting.

Pablo points out that nf_conntrack_bridge doesn't allow use of stateful
nat, so we can safely discard the nf_conn entry and let inet call
conntrack again.

This doesn't work for bridge netfilter: skb could have a nat
transformation. Also bridge nf prevents re-invocation of inet prerouting
via 'sabotage_in' hook.

Work around this problem by explicit confirmation of the entry at LOCAL_IN
time, before upper layer has a chance to clone the unconfirmed entry.

The downside is that this disables NAT and conntrack helpers.

Alternative fix would be to add locking to all code parts that deal with
unconfirmed packets, but even if that could be done in a sane way this
opens up other problems, for example:

-m physdev --physdev-out eth0 -j SNAT --snat-to 1.2.3.4
-m physdev --physdev-out eth1 -j SNAT --snat-to 1.2.3.5

For multicast case, only one of such conflicting mappings will be
created, conntrack only handles 1:1 NAT mappings.

Users should set create a setup that explicitly marks such traffic
NOTRACK (conntrack bypass) to avoid this, but we cannot auto-bypass
them, ruleset might have accept rules for untracked traffic already,
so user-visible behaviour would change.

Suggested-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217777
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
conntrack nf_confirm logic cannot handle cloned skbs referencing
the same nf_conn entry, which will happen for multicast (broadcast)
frames on bridges.

 Example:
    macvlan0
       |
      br0
     /  \
  ethX    ethY

 ethX (or Y) receives a L2 multicast or broadcast packet containing
 an IP packet, flow is not yet in conntrack table.

 1. skb passes through bridge and fake-ip (br_netfilter)Prerouting.
    -&gt; skb-&gt;_nfct now references a unconfirmed entry
 2. skb is broad/mcast packet. bridge now passes clones out on each bridge
    interface.
 3. skb gets passed up the stack.
 4. In macvlan case, macvlan driver retains clone(s) of the mcast skb
    and schedules a work queue to send them out on the lower devices.

    The clone skb-&gt;_nfct is not a copy, it is the same entry as the
    original skb.  The macvlan rx handler then returns RX_HANDLER_PASS.
 5. Normal conntrack hooks (in NF_INET_LOCAL_IN) confirm the orig skb.

The Macvlan broadcast worker and normal confirm path will race.

This race will not happen if step 2 already confirmed a clone. In that
case later steps perform skb_clone() with skb-&gt;_nfct already confirmed (in
hash table).  This works fine.

But such confirmation won't happen when eb/ip/nftables rules dropped the
packets before they reached the nf_confirm step in postrouting.

Pablo points out that nf_conntrack_bridge doesn't allow use of stateful
nat, so we can safely discard the nf_conn entry and let inet call
conntrack again.

This doesn't work for bridge netfilter: skb could have a nat
transformation. Also bridge nf prevents re-invocation of inet prerouting
via 'sabotage_in' hook.

Work around this problem by explicit confirmation of the entry at LOCAL_IN
time, before upper layer has a chance to clone the unconfirmed entry.

The downside is that this disables NAT and conntrack helpers.

Alternative fix would be to add locking to all code parts that deal with
unconfirmed packets, but even if that could be done in a sane way this
opens up other problems, for example:

-m physdev --physdev-out eth0 -j SNAT --snat-to 1.2.3.4
-m physdev --physdev-out eth1 -j SNAT --snat-to 1.2.3.5

For multicast case, only one of such conflicting mappings will be
created, conntrack only handles 1:1 NAT mappings.

Users should set create a setup that explicitly marks such traffic
NOTRACK (conntrack bypass) to avoid this, but we cannot auto-bypass
them, ruleset might have accept rules for untracked traffic already,
so user-visible behaviour would change.

Suggested-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217777
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
