<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/bridge, branch v5.0.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Revert "bridge: do not add port to router list when receives query with source 0.0.0.0"</title>
<updated>2019-02-24T02:36:06+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2019-02-22T13:22:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=278e2148c07559dd4ad8602f22366d61eb2ee7b7'/>
<id>278e2148c07559dd4ad8602f22366d61eb2ee7b7</id>
<content type='text'>
This reverts commit 5a2de63fd1a5 ("bridge: do not add port to router list
when receives query with source 0.0.0.0") and commit 0fe5119e267f ("net:
bridge: remove ipv6 zero address check in mcast queries")

The reason is RFC 4541 is not a standard but suggestive. Currently we
will elect 0.0.0.0 as Querier if there is no ip address configured on
bridge. If we do not add the port which recives query with source
0.0.0.0 to router list, the IGMP reports will not be about to forward
to Querier, IGMP data will also not be able to forward to dest.

As Nikolay suggested, revert this change first and add a boolopt api
to disable none-zero election in future if needed.

Reported-by: Linus Lüssing &lt;linus.luessing@c0d3.blue&gt;
Reported-by: Sebastian Gottschall &lt;s.gottschall@newmedia-net.de&gt;
Fixes: 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0")
Fixes: 0fe5119e267f ("net: bridge: remove ipv6 zero address check in mcast queries")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 5a2de63fd1a5 ("bridge: do not add port to router list
when receives query with source 0.0.0.0") and commit 0fe5119e267f ("net:
bridge: remove ipv6 zero address check in mcast queries")

The reason is RFC 4541 is not a standard but suggestive. Currently we
will elect 0.0.0.0 as Querier if there is no ip address configured on
bridge. If we do not add the port which recives query with source
0.0.0.0 to router list, the IGMP reports will not be about to forward
to Querier, IGMP data will also not be able to forward to dest.

As Nikolay suggested, revert this change first and add a boolopt api
to disable none-zero election in future if needed.

Reported-by: Linus Lüssing &lt;linus.luessing@c0d3.blue&gt;
Reported-by: Sebastian Gottschall &lt;s.gottschall@newmedia-net.de&gt;
Fixes: 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0")
Fixes: 0fe5119e267f ("net: bridge: remove ipv6 zero address check in mcast queries")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf</title>
<updated>2019-01-28T18:51:51+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-01-28T18:51:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ff44a8373c882221c1ff30b87c42fba4f6938119'/>
<id>ff44a8373c882221c1ff30b87c42fba4f6938119</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree:

1) The nftnl mutex is now per-netns, therefore use reference counter
   for matches and targets to deal with concurrent updates from netns.
   Moreover, place extensions in a pernet list. Patches from Florian Westphal.

2) Bail out with EINVAL in case of negative timeouts via setsockopt()
   through ip_vs_set_timeout(), from ZhangXiaoxu.

3) Spurious EINVAL on ebtables 32bit binary with 64bit kernel, also
   from Florian.

4) Reset TCP option header parser in case of fingerprint mismatch,
   otherwise follow up overlapping fingerprint definitions including
   TCP options do not work, from Fernando Fernandez Mancera.

5) Compilation warning in ipt_CLUSTER with CONFIG_PROC_FS unset.
   From Anders Roxell.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree:

1) The nftnl mutex is now per-netns, therefore use reference counter
   for matches and targets to deal with concurrent updates from netns.
   Moreover, place extensions in a pernet list. Patches from Florian Westphal.

2) Bail out with EINVAL in case of negative timeouts via setsockopt()
   through ip_vs_set_timeout(), from ZhangXiaoxu.

3) Spurious EINVAL on ebtables 32bit binary with 64bit kernel, also
   from Florian.

4) Reset TCP option header parser in case of fingerprint mismatch,
   otherwise follow up overlapping fingerprint definitions including
   TCP options do not work, from Fernando Fernandez Mancera.

5) Compilation warning in ipt_CLUSTER with CONFIG_PROC_FS unset.
   From Anders Roxell.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present</title>
<updated>2019-01-28T09:49:43+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2019-01-21T20:54:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2035f3ff8eaa29cfb5c8e2160b0f6e85eeb21a95'/>
<id>2035f3ff8eaa29cfb5c8e2160b0f6e85eeb21a95</id>
<content type='text'>
Unlike ip(6)tables ebtables only counts user-defined chains.

The effect is that a 32bit ebtables binary on a 64bit kernel can do
'ebtables -N FOO' only after adding at least one rule, else the request
fails with -EINVAL.

This is a similar fix as done in
3f1e53abff84 ("netfilter: ebtables: don't attempt to allocate 0-sized compat array").

Fixes: 7d7d7e02111e9 ("netfilter: compat: reject huge allocation requests")
Reported-by: Francesco Ruggeri &lt;fruggeri@arista.com&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unlike ip(6)tables ebtables only counts user-defined chains.

The effect is that a 32bit ebtables binary on a 64bit kernel can do
'ebtables -N FOO' only after adding at least one rule, else the request
fails with -EINVAL.

This is a similar fix as done in
3f1e53abff84 ("netfilter: ebtables: don't attempt to allocate 0-sized compat array").

Fixes: 7d7d7e02111e9 ("netfilter: compat: reject huge allocation requests")
Reported-by: Francesco Ruggeri &lt;fruggeri@arista.com&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bridge: Mark FDB entries that were added by user as such</title>
<updated>2019-01-18T23:12:16+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@mellanox.com</email>
</author>
<published>2019-01-18T15:58:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=710ae72877378e7cde611efd30fe90502a6e5b30'/>
<id>710ae72877378e7cde611efd30fe90502a6e5b30</id>
<content type='text'>
Externally learned entries can be added by a user or by a switch driver
that is notifying the bridge driver about entries that were learned in
hardware.

In the first case, the entries are not marked with the 'added_by_user'
flag, which causes switch drivers to ignore them and not offload them.

The 'added_by_user' flag can be set on externally learned FDB entries
based on the 'swdev_notify' parameter in br_fdb_external_learn_add(),
which effectively means if the created / updated FDB entry was added by
a user or not.

Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications")
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Reported-by: Alexander Petrovskiy &lt;alexpe@mellanox.com&gt;
Reviewed-by: Petr Machata &lt;petrm@mellanox.com&gt;
Cc: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Cc: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Cc: bridge@lists.linux-foundation.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Externally learned entries can be added by a user or by a switch driver
that is notifying the bridge driver about entries that were learned in
hardware.

In the first case, the entries are not marked with the 'added_by_user'
flag, which causes switch drivers to ignore them and not offload them.

The 'added_by_user' flag can be set on externally learned FDB entries
based on the 'swdev_notify' parameter in br_fdb_external_learn_add(),
which effectively means if the created / updated FDB entry was added by
a user or not.

Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications")
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Reported-by: Alexander Petrovskiy &lt;alexpe@mellanox.com&gt;
Reviewed-by: Petr Machata &lt;petrm@mellanox.com&gt;
Cc: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Cc: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Cc: bridge@lists.linux-foundation.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Fix usage of pskb_trim_rcsum</title>
<updated>2019-01-18T22:05:14+00:00</updated>
<author>
<name>Ross Lagerwall</name>
<email>ross.lagerwall@citrix.com</email>
</author>
<published>2019-01-17T15:34:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6c57f0458022298e4da1729c67bd33ce41c14e7a'/>
<id>6c57f0458022298e4da1729c67bd33ce41c14e7a</id>
<content type='text'>
In certain cases, pskb_trim_rcsum() may change skb pointers.
Reinitialize header pointers afterwards to avoid potential
use-after-frees. Add a note in the documentation of
pskb_trim_rcsum(). Found by KASAN.

Signed-off-by: Ross Lagerwall &lt;ross.lagerwall@citrix.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In certain cases, pskb_trim_rcsum() may change skb pointers.
Reinitialize header pointers afterwards to avoid potential
use-after-frees. Add a note in the documentation of
pskb_trim_rcsum(). Found by KASAN.

Signed-off-by: Ross Lagerwall &lt;ross.lagerwall@citrix.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bridge: Fix ethernet header pointer before check skb forwardable</title>
<updated>2019-01-18T05:55:15+00:00</updated>
<author>
<name>Yunjian Wang</name>
<email>wangyunjian@huawei.com</email>
</author>
<published>2019-01-17T01:46:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=28c1382fa28f2e2d9d0d6f25ae879b5af2ecbd03'/>
<id>28c1382fa28f2e2d9d0d6f25ae879b5af2ecbd03</id>
<content type='text'>
The skb header should be set to ethernet header before using
is_skb_forwardable. Because the ethernet header length has been
considered in is_skb_forwardable(including dev-&gt;hard_header_len
length).

To reproduce the issue:
1, add 2 ports on linux bridge br using following commands:
$ brctl addbr br
$ brctl addif br eth0
$ brctl addif br eth1
2, the MTU of eth0 and eth1 is 1500
3, send a packet(Data 1480, UDP 8, IP 20, Ethernet 14, VLAN 4)
from eth0 to eth1

So the expect result is packet larger than 1500 cannot pass through
eth0 and eth1. But currently, the packet passes through success, it
means eth1's MTU limit doesn't take effect.

Fixes: f6367b4660dd ("bridge: use is_skb_forwardable in forward path")
Cc: bridge@lists.linux-foundation.org
Cc: Nkolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Cc: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Cc: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Signed-off-by: Yunjian Wang &lt;wangyunjian@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The skb header should be set to ethernet header before using
is_skb_forwardable. Because the ethernet header length has been
considered in is_skb_forwardable(including dev-&gt;hard_header_len
length).

To reproduce the issue:
1, add 2 ports on linux bridge br using following commands:
$ brctl addbr br
$ brctl addif br eth0
$ brctl addif br eth1
2, the MTU of eth0 and eth1 is 1500
3, send a packet(Data 1480, UDP 8, IP 20, Ethernet 14, VLAN 4)
from eth0 to eth1

So the expect result is packet larger than 1500 cannot pass through
eth0 and eth1. But currently, the packet passes through success, it
means eth1's MTU limit doesn't take effect.

Fixes: f6367b4660dd ("bridge: use is_skb_forwardable in forward path")
Cc: bridge@lists.linux-foundation.org
Cc: Nkolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Cc: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Cc: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Signed-off-by: Yunjian Wang &lt;wangyunjian@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf</title>
<updated>2019-01-15T21:31:46+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-01-15T21:31:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=72f6d4d14c2e23c8ad416ccbe5cfc744ba703d0a'/>
<id>72f6d4d14c2e23c8ad416ccbe5cfc744ba703d0a</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

This is the first batch of Netfilter fixes for your net tree:

1) Fix endless loop in nf_tables rules netlink dump, from Phil Sutter.

2) Reference counter leak in object from the error path, from Taehee Yoo.

3) Selective rule dump requires table and chain.

4) Fix DNAT with nft_flow_offload reverse route lookup, from wenxu.

5) Use GFP_KERNEL_ACCOUNT in vmalloc allocation from ebtables, from
   Shakeel Butt.

6) Set ifindex from route to fix interaction with VRF slave device,
   also from wenxu.

7) Use nfct_help() to check for conntrack helper, IPS_HELPER status
   flag is only set from explicit helpers via -j CT, from Henry Yen.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

This is the first batch of Netfilter fixes for your net tree:

1) Fix endless loop in nf_tables rules netlink dump, from Phil Sutter.

2) Reference counter leak in object from the error path, from Taehee Yoo.

3) Selective rule dump requires table and chain.

4) Fix DNAT with nft_flow_offload reverse route lookup, from wenxu.

5) Use GFP_KERNEL_ACCOUNT in vmalloc allocation from ebtables, from
   Shakeel Butt.

6) Set ifindex from route to fix interaction with VRF slave device,
   also from wenxu.

7) Use nfct_help() to check for conntrack helper, IPS_HELPER status
   flag is only set from explicit helpers via -j CT, from Henry Yen.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: clear skb-&gt;tstamp in bridge forwarding path</title>
<updated>2019-01-12T02:26:01+00:00</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2019-01-08T17:45:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=41d1c8839e5f8cb781cc635f12791decee8271b7'/>
<id>41d1c8839e5f8cb781cc635f12791decee8271b7</id>
<content type='text'>
Matteo reported forwarding issues inside the linux bridge,
if the enslaved interfaces use the fq qdisc.

Similar to commit 8203e2d844d3 ("net: clear skb-&gt;tstamp in
forwarding paths"), we need to clear the tstamp field in
the bridge forwarding path.

Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Reported-and-tested-by: Matteo Croce &lt;mcroce@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Acked-by: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Matteo reported forwarding issues inside the linux bridge,
if the enslaved interfaces use the fq qdisc.

Similar to commit 8203e2d844d3 ("net: clear skb-&gt;tstamp in
forwarding paths"), we need to clear the tstamp field in
the bridge forwarding path.

Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Reported-and-tested-by: Matteo Croce &lt;mcroce@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Acked-by: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ebtables: account ebt_table_info to kmemcg</title>
<updated>2019-01-10T23:55:36+00:00</updated>
<author>
<name>Shakeel Butt</name>
<email>shakeelb@google.com</email>
</author>
<published>2019-01-03T03:14:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e2c8d550a973bb34fc28bc8d0ec996f84562fb8a'/>
<id>e2c8d550a973bb34fc28bc8d0ec996f84562fb8a</id>
<content type='text'>
The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
memory is already accounted to kmemcg. Do the same for ebtables. The
syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
whole system from a restricted memcg, a potential DoS.

By accounting the ebt_table_info, the memory used for ebt_table_info can
be contained within the memcg of the allocating process. However the
lifetime of ebt_table_info is independent of the allocating process and
is tied to the network namespace. So, the oom-killer will not be able to
relieve the memory pressure due to ebt_table_info memory. The memory for
ebt_table_info is allocated through vmalloc. Currently vmalloc does not
handle the oom-killed allocating process correctly and one large
allocation can bypass memcg limit enforcement. So, with this patch,
at least the small allocations will be contained. For large allocations,
we need to fix vmalloc.

Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reviewed-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
memory is already accounted to kmemcg. Do the same for ebtables. The
syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
whole system from a restricted memcg, a potential DoS.

By accounting the ebt_table_info, the memory used for ebt_table_info can
be contained within the memcg of the allocating process. However the
lifetime of ebt_table_info is independent of the allocating process and
is tied to the network namespace. So, the oom-killer will not be able to
relieve the memory pressure due to ebt_table_info memory. The memory for
ebt_table_info is allocated through vmalloc. Currently vmalloc does not
handle the oom-killed allocating process correctly and one large
allocation can bypass memcg limit enforcement. So, with this patch,
at least the small allocations will be contained. For large allocations,
we need to fix vmalloc.

Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reviewed-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bridge: Fix VLANs memory leak</title>
<updated>2019-01-08T21:53:54+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@mellanox.com</email>
</author>
<published>2019-01-08T16:48:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=279737939a8194f02fa352ab4476a1b241f44ef4'/>
<id>279737939a8194f02fa352ab4476a1b241f44ef4</id>
<content type='text'>
When adding / deleting VLANs to / from a bridge port, the bridge driver
first tries to propagate the information via switchdev and falls back to
the 8021q driver in case the underlying driver does not support
switchdev. This can result in a memory leak [1] when VXLAN and mlxsw
ports are enslaved to the bridge:

$ ip link set dev vxlan0 master br0
# No mlxsw ports are enslaved to 'br0', so mlxsw ignores the switchdev
# notification and the bridge driver adds the VLAN on 'vxlan0' via the
# 8021q driver
$ bridge vlan add vid 10 dev vxlan0 pvid untagged
# mlxsw port is enslaved to the bridge
$ ip link set dev swp1 master br0
# mlxsw processes the switchdev notification and the 8021q driver is
# skipped
$ bridge vlan del vid 10 dev vxlan0

This results in 'struct vlan_info' and 'struct vlan_vid_info' being
leaked, as they were allocated by the 8021q driver during VLAN addition,
but never freed as the 8021q driver was skipped during deletion.

Fix this by introducing a new VLAN private flag that indicates whether
the VLAN was added on the port by switchdev or the 8021q driver. If the
VLAN was added by the 8021q driver, then we make sure to delete it via
the 8021q driver as well.

[1]
unreferenced object 0xffff88822d20b1e8 (size 256):
  comm "bridge", pid 2532, jiffies 4295216998 (age 1188.830s)
  hex dump (first 32 bytes):
    e0 42 97 ce 81 88 ff ff 00 00 00 00 00 00 00 00  .B..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;00000000f82d851d&gt;] kmem_cache_alloc_trace+0x1be/0x330
    [&lt;00000000e0178b02&gt;] vlan_vid_add+0x661/0x920
    [&lt;00000000218ebd5f&gt;] __vlan_add+0x1be9/0x3a00
    [&lt;000000006eafa1ca&gt;] nbp_vlan_add+0x8b3/0xd90
    [&lt;000000003535392c&gt;] br_vlan_info+0x132/0x410
    [&lt;00000000aedaa9dc&gt;] br_afspec+0x75c/0x870
    [&lt;00000000f5716133&gt;] br_setlink+0x3dc/0x6d0
    [&lt;00000000aceca5e2&gt;] rtnl_bridge_setlink+0x615/0xb30
    [&lt;00000000a2f2d23e&gt;] rtnetlink_rcv_msg+0x3a3/0xa80
    [&lt;0000000064097e69&gt;] netlink_rcv_skb+0x152/0x3c0
    [&lt;000000008be8d614&gt;] rtnetlink_rcv+0x21/0x30
    [&lt;000000009ab2ca25&gt;] netlink_unicast+0x52f/0x740
    [&lt;00000000e7d9ac96&gt;] netlink_sendmsg+0x9c7/0xf50
    [&lt;000000005d1e2050&gt;] sock_sendmsg+0xbe/0x120
    [&lt;00000000d51426bc&gt;] ___sys_sendmsg+0x778/0x8f0
    [&lt;00000000b9d7b2cc&gt;] __sys_sendmsg+0x112/0x270
unreferenced object 0xffff888227454308 (size 32):
  comm "bridge", pid 2532, jiffies 4295216998 (age 1188.882s)
  hex dump (first 32 bytes):
    88 b2 20 2d 82 88 ff ff 88 b2 20 2d 82 88 ff ff  .. -...... -....
    81 00 0a 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;00000000f82d851d&gt;] kmem_cache_alloc_trace+0x1be/0x330
    [&lt;0000000018050631&gt;] vlan_vid_add+0x3e6/0x920
    [&lt;00000000218ebd5f&gt;] __vlan_add+0x1be9/0x3a00
    [&lt;000000006eafa1ca&gt;] nbp_vlan_add+0x8b3/0xd90
    [&lt;000000003535392c&gt;] br_vlan_info+0x132/0x410
    [&lt;00000000aedaa9dc&gt;] br_afspec+0x75c/0x870
    [&lt;00000000f5716133&gt;] br_setlink+0x3dc/0x6d0
    [&lt;00000000aceca5e2&gt;] rtnl_bridge_setlink+0x615/0xb30
    [&lt;00000000a2f2d23e&gt;] rtnetlink_rcv_msg+0x3a3/0xa80
    [&lt;0000000064097e69&gt;] netlink_rcv_skb+0x152/0x3c0
    [&lt;000000008be8d614&gt;] rtnetlink_rcv+0x21/0x30
    [&lt;000000009ab2ca25&gt;] netlink_unicast+0x52f/0x740
    [&lt;00000000e7d9ac96&gt;] netlink_sendmsg+0x9c7/0xf50
    [&lt;000000005d1e2050&gt;] sock_sendmsg+0xbe/0x120
    [&lt;00000000d51426bc&gt;] ___sys_sendmsg+0x778/0x8f0
    [&lt;00000000b9d7b2cc&gt;] __sys_sendmsg+0x112/0x270

Fixes: d70e42b22dd4 ("mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges")
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Reviewed-by: Petr Machata &lt;petrm@mellanox.com&gt;
Cc: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Cc: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Cc: bridge@lists.linux-foundation.org
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When adding / deleting VLANs to / from a bridge port, the bridge driver
first tries to propagate the information via switchdev and falls back to
the 8021q driver in case the underlying driver does not support
switchdev. This can result in a memory leak [1] when VXLAN and mlxsw
ports are enslaved to the bridge:

$ ip link set dev vxlan0 master br0
# No mlxsw ports are enslaved to 'br0', so mlxsw ignores the switchdev
# notification and the bridge driver adds the VLAN on 'vxlan0' via the
# 8021q driver
$ bridge vlan add vid 10 dev vxlan0 pvid untagged
# mlxsw port is enslaved to the bridge
$ ip link set dev swp1 master br0
# mlxsw processes the switchdev notification and the 8021q driver is
# skipped
$ bridge vlan del vid 10 dev vxlan0

This results in 'struct vlan_info' and 'struct vlan_vid_info' being
leaked, as they were allocated by the 8021q driver during VLAN addition,
but never freed as the 8021q driver was skipped during deletion.

Fix this by introducing a new VLAN private flag that indicates whether
the VLAN was added on the port by switchdev or the 8021q driver. If the
VLAN was added by the 8021q driver, then we make sure to delete it via
the 8021q driver as well.

[1]
unreferenced object 0xffff88822d20b1e8 (size 256):
  comm "bridge", pid 2532, jiffies 4295216998 (age 1188.830s)
  hex dump (first 32 bytes):
    e0 42 97 ce 81 88 ff ff 00 00 00 00 00 00 00 00  .B..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;00000000f82d851d&gt;] kmem_cache_alloc_trace+0x1be/0x330
    [&lt;00000000e0178b02&gt;] vlan_vid_add+0x661/0x920
    [&lt;00000000218ebd5f&gt;] __vlan_add+0x1be9/0x3a00
    [&lt;000000006eafa1ca&gt;] nbp_vlan_add+0x8b3/0xd90
    [&lt;000000003535392c&gt;] br_vlan_info+0x132/0x410
    [&lt;00000000aedaa9dc&gt;] br_afspec+0x75c/0x870
    [&lt;00000000f5716133&gt;] br_setlink+0x3dc/0x6d0
    [&lt;00000000aceca5e2&gt;] rtnl_bridge_setlink+0x615/0xb30
    [&lt;00000000a2f2d23e&gt;] rtnetlink_rcv_msg+0x3a3/0xa80
    [&lt;0000000064097e69&gt;] netlink_rcv_skb+0x152/0x3c0
    [&lt;000000008be8d614&gt;] rtnetlink_rcv+0x21/0x30
    [&lt;000000009ab2ca25&gt;] netlink_unicast+0x52f/0x740
    [&lt;00000000e7d9ac96&gt;] netlink_sendmsg+0x9c7/0xf50
    [&lt;000000005d1e2050&gt;] sock_sendmsg+0xbe/0x120
    [&lt;00000000d51426bc&gt;] ___sys_sendmsg+0x778/0x8f0
    [&lt;00000000b9d7b2cc&gt;] __sys_sendmsg+0x112/0x270
unreferenced object 0xffff888227454308 (size 32):
  comm "bridge", pid 2532, jiffies 4295216998 (age 1188.882s)
  hex dump (first 32 bytes):
    88 b2 20 2d 82 88 ff ff 88 b2 20 2d 82 88 ff ff  .. -...... -....
    81 00 0a 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;00000000f82d851d&gt;] kmem_cache_alloc_trace+0x1be/0x330
    [&lt;0000000018050631&gt;] vlan_vid_add+0x3e6/0x920
    [&lt;00000000218ebd5f&gt;] __vlan_add+0x1be9/0x3a00
    [&lt;000000006eafa1ca&gt;] nbp_vlan_add+0x8b3/0xd90
    [&lt;000000003535392c&gt;] br_vlan_info+0x132/0x410
    [&lt;00000000aedaa9dc&gt;] br_afspec+0x75c/0x870
    [&lt;00000000f5716133&gt;] br_setlink+0x3dc/0x6d0
    [&lt;00000000aceca5e2&gt;] rtnl_bridge_setlink+0x615/0xb30
    [&lt;00000000a2f2d23e&gt;] rtnetlink_rcv_msg+0x3a3/0xa80
    [&lt;0000000064097e69&gt;] netlink_rcv_skb+0x152/0x3c0
    [&lt;000000008be8d614&gt;] rtnetlink_rcv+0x21/0x30
    [&lt;000000009ab2ca25&gt;] netlink_unicast+0x52f/0x740
    [&lt;00000000e7d9ac96&gt;] netlink_sendmsg+0x9c7/0xf50
    [&lt;000000005d1e2050&gt;] sock_sendmsg+0xbe/0x120
    [&lt;00000000d51426bc&gt;] ___sys_sendmsg+0x778/0x8f0
    [&lt;00000000b9d7b2cc&gt;] __sys_sendmsg+0x112/0x270

Fixes: d70e42b22dd4 ("mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges")
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Reviewed-by: Petr Machata &lt;petrm@mellanox.com&gt;
Cc: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Cc: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Cc: bridge@lists.linux-foundation.org
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
