<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/net/bonding, branch v6.5.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>bonding: fix macvlan over alb bond support</title>
<updated>2023-08-24T08:07:13+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2023-08-23T07:19:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e74216b8def3803e98ae536de78733e9d7f3b109'/>
<id>e74216b8def3803e98ae536de78733e9d7f3b109</id>
<content type='text'>
The commit 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode
bonds") aims to enable the use of macvlans on top of rlb bond mode. However,
the current rlb bond mode only handles ARP packets to update remote neighbor
entries. This causes an issue when a macvlan is on top of the bond, and
remote devices send packets to the macvlan using the bond's MAC address
as the destination. After delivering the packets to the macvlan, the macvlan
will rejects them as the MAC address is incorrect. Consequently, this commit
makes macvlan over bond non-functional.

To address this problem, one potential solution is to check for the presence
of a macvlan port on the bond device using netif_is_macvlan_port(bond-&gt;dev)
and return NULL in the rlb_arp_xmit() function. However, this approach
doesn't fully resolve the situation when a VLAN exists between the bond and
macvlan.

So let's just do a partial revert for commit 14af9963ba1e in rlb_arp_xmit().
As the comment said, Don't modify or load balance ARPs that do not originate
locally.

Fixes: 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds")
Reported-by: susan.zheng@veritas.com
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2117816
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The commit 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode
bonds") aims to enable the use of macvlans on top of rlb bond mode. However,
the current rlb bond mode only handles ARP packets to update remote neighbor
entries. This causes an issue when a macvlan is on top of the bond, and
remote devices send packets to the macvlan using the bond's MAC address
as the destination. After delivering the packets to the macvlan, the macvlan
will rejects them as the MAC address is incorrect. Consequently, this commit
makes macvlan over bond non-functional.

To address this problem, one potential solution is to check for the presence
of a macvlan port on the bond device using netif_is_macvlan_port(bond-&gt;dev)
and return NULL in the rlb_arp_xmit() function. However, this approach
doesn't fully resolve the situation when a VLAN exists between the bond and
macvlan.

So let's just do a partial revert for commit 14af9963ba1e in rlb_arp_xmit().
As the comment said, Don't modify or load balance ARPs that do not originate
locally.

Fixes: 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds")
Reported-by: susan.zheng@veritas.com
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2117816
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves</title>
<updated>2023-08-07T19:19:16+00:00</updated>
<author>
<name>Ziyang Xuan</name>
<email>william.xuanziyang@huawei.com</email>
</author>
<published>2023-08-02T11:43:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=01f4fd27087078c90a0e22860d1dfa2cd0510791'/>
<id>01f4fd27087078c90a0e22860d1dfa2cd0510791</id>
<content type='text'>
BUG_ON(!vlan_info) is triggered in unregister_vlan_dev() with
following testcase:

  # ip netns add ns1
  # ip netns exec ns1 ip link add bond0 type bond mode 0
  # ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
  # ip netns exec ns1 ip link set bond_slave_1 master bond0
  # ip netns exec ns1 ip link add link bond_slave_1 name vlan10 type vlan id 10 protocol 802.1ad
  # ip netns exec ns1 ip link add link bond0 name bond0_vlan10 type vlan id 10 protocol 802.1ad
  # ip netns exec ns1 ip link set bond_slave_1 nomaster
  # ip netns del ns1

The logical analysis of the problem is as follows:

1. create ETH_P_8021AD protocol vlan10 for bond_slave_1:
register_vlan_dev()
  vlan_vid_add()
    vlan_info_alloc()
    __vlan_vid_add() // add [ETH_P_8021AD, 10] vid to bond_slave_1

2. create ETH_P_8021AD protocol bond0_vlan10 for bond0:
register_vlan_dev()
  vlan_vid_add()
    __vlan_vid_add()
      vlan_add_rx_filter_info()
          if (!vlan_hw_filter_capable(dev, proto)) // condition established because bond0 without NETIF_F_HW_VLAN_STAG_FILTER
              return 0;

          if (netif_device_present(dev))
              return dev-&gt;netdev_ops-&gt;ndo_vlan_rx_add_vid(dev, proto, vid); // will be never called
              // The slaves of bond0 will not refer to the [ETH_P_8021AD, 10] vid.

3. detach bond_slave_1 from bond0:
__bond_release_one()
  vlan_vids_del_by_dev()
    list_for_each_entry(vid_info, &amp;vlan_info-&gt;vid_list, list)
        vlan_vid_del(dev, vid_info-&gt;proto, vid_info-&gt;vid);
        // bond_slave_1 [ETH_P_8021AD, 10] vid will be deleted.
        // bond_slave_1-&gt;vlan_info will be assigned NULL.

4. delete vlan10 during delete ns1:
default_device_exit_batch()
  dev-&gt;rtnl_link_ops-&gt;dellink() // unregister_vlan_dev() for vlan10
    vlan_info = rtnl_dereference(real_dev-&gt;vlan_info); // real_dev of vlan10 is bond_slave_1
	BUG_ON(!vlan_info); // bond_slave_1-&gt;vlan_info is NULL now, bug is triggered!!!

Add S-VLAN tag related features support to bond driver. So the bond driver
will always propagate the VLAN info to its slaves.

Fixes: 8ad227ff89a7 ("net: vlan: add 802.1ad support")
Suggested-by: Ido Schimmel &lt;idosch@idosch.org&gt;
Signed-off-by: Ziyang Xuan &lt;william.xuanziyang@huawei.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Link: https://lore.kernel.org/r/20230802114320.4156068-1-william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
BUG_ON(!vlan_info) is triggered in unregister_vlan_dev() with
following testcase:

  # ip netns add ns1
  # ip netns exec ns1 ip link add bond0 type bond mode 0
  # ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
  # ip netns exec ns1 ip link set bond_slave_1 master bond0
  # ip netns exec ns1 ip link add link bond_slave_1 name vlan10 type vlan id 10 protocol 802.1ad
  # ip netns exec ns1 ip link add link bond0 name bond0_vlan10 type vlan id 10 protocol 802.1ad
  # ip netns exec ns1 ip link set bond_slave_1 nomaster
  # ip netns del ns1

The logical analysis of the problem is as follows:

1. create ETH_P_8021AD protocol vlan10 for bond_slave_1:
register_vlan_dev()
  vlan_vid_add()
    vlan_info_alloc()
    __vlan_vid_add() // add [ETH_P_8021AD, 10] vid to bond_slave_1

2. create ETH_P_8021AD protocol bond0_vlan10 for bond0:
register_vlan_dev()
  vlan_vid_add()
    __vlan_vid_add()
      vlan_add_rx_filter_info()
          if (!vlan_hw_filter_capable(dev, proto)) // condition established because bond0 without NETIF_F_HW_VLAN_STAG_FILTER
              return 0;

          if (netif_device_present(dev))
              return dev-&gt;netdev_ops-&gt;ndo_vlan_rx_add_vid(dev, proto, vid); // will be never called
              // The slaves of bond0 will not refer to the [ETH_P_8021AD, 10] vid.

3. detach bond_slave_1 from bond0:
__bond_release_one()
  vlan_vids_del_by_dev()
    list_for_each_entry(vid_info, &amp;vlan_info-&gt;vid_list, list)
        vlan_vid_del(dev, vid_info-&gt;proto, vid_info-&gt;vid);
        // bond_slave_1 [ETH_P_8021AD, 10] vid will be deleted.
        // bond_slave_1-&gt;vlan_info will be assigned NULL.

4. delete vlan10 during delete ns1:
default_device_exit_batch()
  dev-&gt;rtnl_link_ops-&gt;dellink() // unregister_vlan_dev() for vlan10
    vlan_info = rtnl_dereference(real_dev-&gt;vlan_info); // real_dev of vlan10 is bond_slave_1
	BUG_ON(!vlan_info); // bond_slave_1-&gt;vlan_info is NULL now, bug is triggered!!!

Add S-VLAN tag related features support to bond driver. So the bond driver
will always propagate the VLAN info to its slaves.

Fixes: 8ad227ff89a7 ("net: vlan: add 802.1ad support")
Suggested-by: Ido Schimmel &lt;idosch@idosch.org&gt;
Signed-off-by: Ziyang Xuan &lt;william.xuanziyang@huawei.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Link: https://lore.kernel.org/r/20230802114320.4156068-1-william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: reset bond's flags when down link is P2P device</title>
<updated>2023-07-25T07:16:17+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2023-07-21T04:03:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=da19a2b967cf1e2c426f50d28550d1915214a81d'/>
<id>da19a2b967cf1e2c426f50d28550d1915214a81d</id>
<content type='text'>
When adding a point to point downlink to the bond, we neglected to reset
the bond's flags, which were still using flags like BROADCAST and
MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink
interfaces, such as when adding a GRE device to the bonding.

To address this issue, let's reset the bond's flags for P2P interfaces.

Before fix:
7: gre0@NONE: &lt;POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr 167f:18:f188::
8: bond0: &lt;BROADCAST,MULTICAST,MASTER,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/gre6 2006:70:10::1 brd 2006:70:10::2
    inet6 fe80::200:ff:fe00:0/64 scope link
       valid_lft forever preferred_lft forever

After fix:
7: gre0@NONE: &lt;POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue master bond2 state UNKNOWN group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr c29e:557a:e9d9::
8: bond0: &lt;POINTOPOINT,NOARP,MASTER,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever

Reported-by: Liang Li &lt;liali@redhat.com&gt;
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438
Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When adding a point to point downlink to the bond, we neglected to reset
the bond's flags, which were still using flags like BROADCAST and
MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink
interfaces, such as when adding a GRE device to the bonding.

To address this issue, let's reset the bond's flags for P2P interfaces.

Before fix:
7: gre0@NONE: &lt;POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr 167f:18:f188::
8: bond0: &lt;BROADCAST,MULTICAST,MASTER,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/gre6 2006:70:10::1 brd 2006:70:10::2
    inet6 fe80::200:ff:fe00:0/64 scope link
       valid_lft forever preferred_lft forever

After fix:
7: gre0@NONE: &lt;POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue master bond2 state UNKNOWN group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr c29e:557a:e9d9::
8: bond0: &lt;POINTOPOINT,NOARP,MASTER,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/gre6 2006:70:10::1 peer 2006:70:10::2
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever

Reported-by: Liang Li &lt;liali@redhat.com&gt;
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438
Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2023-06-27T16:45:22+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-06-27T16:45:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3674fbf0451df0395f9fa18df3122927006a3829'/>
<id>3674fbf0451df0395f9fa18df3122927006a3829</id>
<content type='text'>
Merge in late fixes to prepare for the 6.5 net-next PR.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge in late fixes to prepare for the 6.5 net-next PR.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: do not assume skb mac_header is set</title>
<updated>2023-06-24T02:00:31+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2023-06-22T15:23:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6a940abdef3162e5723f1495b8a49859d1708f79'/>
<id>6a940abdef3162e5723f1495b8a49859d1708f79</id>
<content type='text'>
Drivers must not assume in their ndo_start_xmit() that
skbs have their mac_header set. skb-&gt;data is all what is needed.

bonding seems to be one of the last offender as caught by syzbot:

WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 skb_mac_offset include/linux/skbuff.h:2913 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_xmit_hash drivers/net/bonding/bond_main.c:4170 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5149 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_3ad_xor_xmit drivers/net/bonding/bond_main.c:5186 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 __bond_start_xmit drivers/net/bonding/bond_main.c:5442 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_start_xmit+0x14ab/0x19d0 drivers/net/bonding/bond_main.c:5470
Modules linked in:
CPU: 1 PID: 12155 Comm: syz-executor.3 Not tainted 6.1.30-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
RIP: 0010:skb_mac_header include/linux/skbuff.h:2907 [inline]
RIP: 0010:skb_mac_offset include/linux/skbuff.h:2913 [inline]
RIP: 0010:bond_xmit_hash drivers/net/bonding/bond_main.c:4170 [inline]
RIP: 0010:bond_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5149 [inline]
RIP: 0010:bond_3ad_xor_xmit drivers/net/bonding/bond_main.c:5186 [inline]
RIP: 0010:__bond_start_xmit drivers/net/bonding/bond_main.c:5442 [inline]
RIP: 0010:bond_start_xmit+0x14ab/0x19d0 drivers/net/bonding/bond_main.c:5470
Code: 8b 7c 24 30 e8 76 dd 1a 01 48 85 c0 74 0d 48 89 c3 e8 29 67 2e fe e9 15 ef ff ff e8 1f 67 2e fe e9 10 ef ff ff e8 15 67 2e fe &lt;0f&gt; 0b e9 45 f8 ff ff e8 09 67 2e fe e9 dc fa ff ff e8 ff 66 2e fe
RSP: 0018:ffffc90002fff6e0 EFLAGS: 00010283
RAX: ffffffff835874db RBX: 000000000000ffff RCX: 0000000000040000
RDX: ffffc90004dcf000 RSI: 00000000000000b5 RDI: 00000000000000b6
RBP: ffffc90002fff8b8 R08: ffffffff83586d16 R09: ffffffff83586584
R10: 0000000000000007 R11: ffff8881599fc780 R12: ffff88811b6a7b7e
R13: 1ffff110236d4f6f R14: ffff88811b6a7ac0 R15: 1ffff110236d4f76
FS: 00007f2e9eb47700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e421000 CR3: 000000010e6d4000 CR4: 00000000003526e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
&lt;TASK&gt;
[&lt;ffffffff8471a49f&gt;] netdev_start_xmit include/linux/netdevice.h:4925 [inline]
[&lt;ffffffff8471a49f&gt;] __dev_direct_xmit+0x4ef/0x850 net/core/dev.c:4380
[&lt;ffffffff851d845b&gt;] dev_direct_xmit include/linux/netdevice.h:3043 [inline]
[&lt;ffffffff851d845b&gt;] packet_direct_xmit+0x18b/0x300 net/packet/af_packet.c:284
[&lt;ffffffff851c7472&gt;] packet_snd net/packet/af_packet.c:3112 [inline]
[&lt;ffffffff851c7472&gt;] packet_sendmsg+0x4a22/0x64d0 net/packet/af_packet.c:3143
[&lt;ffffffff8467a4b2&gt;] sock_sendmsg_nosec net/socket.c:716 [inline]
[&lt;ffffffff8467a4b2&gt;] sock_sendmsg net/socket.c:736 [inline]
[&lt;ffffffff8467a4b2&gt;] __sys_sendto+0x472/0x5f0 net/socket.c:2139
[&lt;ffffffff8467a715&gt;] __do_sys_sendto net/socket.c:2151 [inline]
[&lt;ffffffff8467a715&gt;] __se_sys_sendto net/socket.c:2147 [inline]
[&lt;ffffffff8467a715&gt;] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147
[&lt;ffffffff8553071f&gt;] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[&lt;ffffffff8553071f&gt;] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
[&lt;ffffffff85600087&gt;] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 7b8fc0103bb5 ("bonding: add a vlan+srcmac tx hashing option")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jarod Wilson &lt;jarod@redhat.com&gt;
Cc: Moshe Tal &lt;moshet@nvidia.com&gt;
Cc: Jussi Maki &lt;joamaki@gmail.com&gt;
Cc: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
Cc: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Cc: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Link: https://lore.kernel.org/r/20230622152304.2137482-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Drivers must not assume in their ndo_start_xmit() that
skbs have their mac_header set. skb-&gt;data is all what is needed.

bonding seems to be one of the last offender as caught by syzbot:

WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 skb_mac_offset include/linux/skbuff.h:2913 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_xmit_hash drivers/net/bonding/bond_main.c:4170 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5149 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_3ad_xor_xmit drivers/net/bonding/bond_main.c:5186 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 __bond_start_xmit drivers/net/bonding/bond_main.c:5442 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_start_xmit+0x14ab/0x19d0 drivers/net/bonding/bond_main.c:5470
Modules linked in:
CPU: 1 PID: 12155 Comm: syz-executor.3 Not tainted 6.1.30-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
RIP: 0010:skb_mac_header include/linux/skbuff.h:2907 [inline]
RIP: 0010:skb_mac_offset include/linux/skbuff.h:2913 [inline]
RIP: 0010:bond_xmit_hash drivers/net/bonding/bond_main.c:4170 [inline]
RIP: 0010:bond_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5149 [inline]
RIP: 0010:bond_3ad_xor_xmit drivers/net/bonding/bond_main.c:5186 [inline]
RIP: 0010:__bond_start_xmit drivers/net/bonding/bond_main.c:5442 [inline]
RIP: 0010:bond_start_xmit+0x14ab/0x19d0 drivers/net/bonding/bond_main.c:5470
Code: 8b 7c 24 30 e8 76 dd 1a 01 48 85 c0 74 0d 48 89 c3 e8 29 67 2e fe e9 15 ef ff ff e8 1f 67 2e fe e9 10 ef ff ff e8 15 67 2e fe &lt;0f&gt; 0b e9 45 f8 ff ff e8 09 67 2e fe e9 dc fa ff ff e8 ff 66 2e fe
RSP: 0018:ffffc90002fff6e0 EFLAGS: 00010283
RAX: ffffffff835874db RBX: 000000000000ffff RCX: 0000000000040000
RDX: ffffc90004dcf000 RSI: 00000000000000b5 RDI: 00000000000000b6
RBP: ffffc90002fff8b8 R08: ffffffff83586d16 R09: ffffffff83586584
R10: 0000000000000007 R11: ffff8881599fc780 R12: ffff88811b6a7b7e
R13: 1ffff110236d4f6f R14: ffff88811b6a7ac0 R15: 1ffff110236d4f76
FS: 00007f2e9eb47700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e421000 CR3: 000000010e6d4000 CR4: 00000000003526e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
&lt;TASK&gt;
[&lt;ffffffff8471a49f&gt;] netdev_start_xmit include/linux/netdevice.h:4925 [inline]
[&lt;ffffffff8471a49f&gt;] __dev_direct_xmit+0x4ef/0x850 net/core/dev.c:4380
[&lt;ffffffff851d845b&gt;] dev_direct_xmit include/linux/netdevice.h:3043 [inline]
[&lt;ffffffff851d845b&gt;] packet_direct_xmit+0x18b/0x300 net/packet/af_packet.c:284
[&lt;ffffffff851c7472&gt;] packet_snd net/packet/af_packet.c:3112 [inline]
[&lt;ffffffff851c7472&gt;] packet_sendmsg+0x4a22/0x64d0 net/packet/af_packet.c:3143
[&lt;ffffffff8467a4b2&gt;] sock_sendmsg_nosec net/socket.c:716 [inline]
[&lt;ffffffff8467a4b2&gt;] sock_sendmsg net/socket.c:736 [inline]
[&lt;ffffffff8467a4b2&gt;] __sys_sendto+0x472/0x5f0 net/socket.c:2139
[&lt;ffffffff8467a715&gt;] __do_sys_sendto net/socket.c:2151 [inline]
[&lt;ffffffff8467a715&gt;] __se_sys_sendto net/socket.c:2147 [inline]
[&lt;ffffffff8467a715&gt;] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147
[&lt;ffffffff8553071f&gt;] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[&lt;ffffffff8553071f&gt;] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
[&lt;ffffffff85600087&gt;] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 7b8fc0103bb5 ("bonding: add a vlan+srcmac tx hashing option")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jarod Wilson &lt;jarod@redhat.com&gt;
Cc: Moshe Tal &lt;moshet@nvidia.com&gt;
Cc: Jussi Maki &lt;joamaki@gmail.com&gt;
Cc: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
Cc: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Cc: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Link: https://lore.kernel.org/r/20230622152304.2137482-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: tls: make the offload check helper take skb not socket</title>
<updated>2023-06-15T08:01:05+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-06-13T20:50:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ed3c9a2fcab3b60b0766eb5d7566fd3b10df9a8e'/>
<id>ed3c9a2fcab3b60b0766eb5d7566fd3b10df9a8e</id>
<content type='text'>
All callers of tls_is_sk_tx_device_offloaded() currently do
an equivalent of:

 if (skb-&gt;sk &amp;&amp; tls_is_skb_tx_device_offloaded(skb-&gt;sk))

Have the helper accept skb and do the skb-&gt;sk check locally.
Two drivers have local static inlines with similar wrappers
already.

While at it change the ifdef condition to TLS_DEVICE.
Only TLS_DEVICE selects SOCK_VALIDATE_XMIT, so the two are
equivalent. This makes removing the duplicated IS_ENABLED()
check in funeth more obviously correct.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Maxim Mikityanskiy &lt;maxtram95@gmail.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@corigine.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Acked-by: Dimitris Michailidis &lt;dmichail@fungible.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All callers of tls_is_sk_tx_device_offloaded() currently do
an equivalent of:

 if (skb-&gt;sk &amp;&amp; tls_is_skb_tx_device_offloaded(skb-&gt;sk))

Have the helper accept skb and do the skb-&gt;sk check locally.
Two drivers have local static inlines with similar wrappers
already.

While at it change the ifdef condition to TLS_DEVICE.
Only TLS_DEVICE selects SOCK_VALIDATE_XMIT, so the two are
equivalent. This makes removing the duplicated IS_ENABLED()
check in funeth more obviously correct.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Acked-by: Maxim Mikityanskiy &lt;maxtram95@gmail.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@corigine.com&gt;
Acked-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Acked-by: Dimitris Michailidis &lt;dmichail@fungible.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2023-05-26T02:57:39+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-05-26T02:56:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d4031ec844bc52fe7f2f844e9c476727fd6b8240'/>
<id>d4031ec844bc52fe7f2f844e9c476727fd6b8240</id>
<content type='text'>
Cross-merge networking fixes after downstream PR.

Conflicts:

net/ipv4/raw.c
  3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
  c85be08fc4fa ("raw: Stop using RTO_ONLINK.")
https://lore.kernel.org/all/20230525110037.2b532b83@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/freescale/fec_main.c
  9025944fddfe ("net: fec: add dma_wmb to ensure correct descriptor values")
  144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cross-merge networking fixes after downstream PR.

Conflicts:

net/ipv4/raw.c
  3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
  c85be08fc4fa ("raw: Stop using RTO_ONLINK.")
https://lore.kernel.org/all/20230525110037.2b532b83@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/freescale/fec_main.c
  9025944fddfe ("net: fec: add dma_wmb to ensure correct descriptor values")
  144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: fix stack overflow when LRO is disabled for virtual interfaces</title>
<updated>2023-05-20T05:46:37+00:00</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2023-05-17T14:30:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ae9b15fbe63447bc1d3bba3769f409d17ca6fdf6'/>
<id>ae9b15fbe63447bc1d3bba3769f409d17ca6fdf6</id>
<content type='text'>
When the virtual interface's feature is updated, it synchronizes the
updated feature for its own lower interface.
This propagation logic should be worked as the iteration, not recursively.
But it works recursively due to the netdev notification unexpectedly.
This problem occurs when it disables LRO only for the team and bonding
interface type.

       team0
         |
  +------+------+-----+-----+
  |      |      |     |     |
team1  team2  team3  ...  team200

If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
event to its own lower interfaces(team1 ~ team200).
It is worked by netdev_sync_lower_features().
So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
work iteratively.
But generated NETDEV_FEAT_CHANGE event is also sent to the upper
interface too.
upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
lower interfaces again.
lower and upper interfaces receive this event and generate this
event again and again.
So, the stack overflow occurs.

But it is not the infinite loop issue.
Because the netdev_sync_lower_features() updates features before
generating the NETDEV_FEAT_CHANGE event.
Already synchronized lower interfaces skip notification logic.
So, it is just the problem that iteration logic is changed to the
recursive unexpectedly due to the notification mechanism.

Reproducer:

ip link add team0 type team
ethtool -K team0 lro on
for i in {1..200}
do
        ip link add team$i master team0 type team
        ethtool -K team$i lro on
done

ethtool -K team0 lro off

In order to fix it, the notifier_ctx member of bonding/team is introduced.

Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/r/20230517143010.3596250-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the virtual interface's feature is updated, it synchronizes the
updated feature for its own lower interface.
This propagation logic should be worked as the iteration, not recursively.
But it works recursively due to the netdev notification unexpectedly.
This problem occurs when it disables LRO only for the team and bonding
interface type.

       team0
         |
  +------+------+-----+-----+
  |      |      |     |     |
team1  team2  team3  ...  team200

If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE
event to its own lower interfaces(team1 ~ team200).
It is worked by netdev_sync_lower_features().
So, the NETDEV_FEAT_CHANGE notification logic of each lower interface
work iteratively.
But generated NETDEV_FEAT_CHANGE event is also sent to the upper
interface too.
upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own
lower interfaces again.
lower and upper interfaces receive this event and generate this
event again and again.
So, the stack overflow occurs.

But it is not the infinite loop issue.
Because the netdev_sync_lower_features() updates features before
generating the NETDEV_FEAT_CHANGE event.
Already synchronized lower interfaces skip notification logic.
So, it is just the problem that iteration logic is changed to the
recursive unexpectedly due to the notification mechanism.

Reproducer:

ip link add team0 type team
ethtool -K team0 lro on
for i in {1..200}
do
        ip link add team$i master team0 type team
        ethtool -K team$i lro on
done

ethtool -K team0 lro off

In order to fix it, the notifier_ctx member of bonding/team is introduced.

Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com
Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/r/20230517143010.3596250-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bonding: Add SPDX identifier to remaining files</title>
<updated>2023-05-16T13:38:06+00:00</updated>
<author>
<name>Bagas Sanjaya</name>
<email>bagasdotme@gmail.com</email>
</author>
<published>2023-05-15T06:07:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=613a014191f584556a96e070d0767761912ccf61'/>
<id>613a014191f584556a96e070d0767761912ccf61</id>
<content type='text'>
Previous batches of SPDX conversion missed bond_main.c and bonding_priv.h
because these files doesn't mention intended GPL version. Add SPDX identifier
to these files, assuming GPL 1.0+.

Cc: Thomas Davis &lt;tadavis@lbl.gov&gt;
Cc: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Cc: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Reviewed-by: Simon Horman &lt;simon.horman@corigine.com&gt;
Signed-off-by: Bagas Sanjaya &lt;bagasdotme@gmail.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previous batches of SPDX conversion missed bond_main.c and bonding_priv.h
because these files doesn't mention intended GPL version. Add SPDX identifier
to these files, assuming GPL 1.0+.

Cc: Thomas Davis &lt;tadavis@lbl.gov&gt;
Cc: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Cc: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Reviewed-by: Simon Horman &lt;simon.horman@corigine.com&gt;
Signed-off-by: Bagas Sanjaya &lt;bagasdotme@gmail.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: Always assign be16 value to vlan_proto</title>
<updated>2023-05-12T08:36:16+00:00</updated>
<author>
<name>Simon Horman</name>
<email>horms@kernel.org</email>
</author>
<published>2023-05-11T15:07:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c1bc7d73c96425db12bb92fbf24c37fd1c9ac329'/>
<id>c1bc7d73c96425db12bb92fbf24c37fd1c9ac329</id>
<content type='text'>
The type of the vlan_proto field is __be16.
And most users of the field use it as such.

In the case of setting or testing the field for the special VLAN_N_VID
value, host byte order is used. Which seems incorrect.

It also seems somewhat odd to store a VLAN ID value in a field that is
otherwise used to store Ether types.

Address this issue by defining BOND_VLAN_PROTO_NONE, a big endian value.
0xffff was chosen somewhat arbitrarily. What is important is that it
doesn't overlap with any valid VLAN Ether types.

I don't believe the problems described above are a bug because
VLAN_N_VID in both little-endian and big-endian byte order does not
conflict with any supported VLAN Ether types in big-endian byte order.

Reported by sparse as:

 .../bond_main.c:2857:26: warning: restricted __be16 degrades to integer
 .../bond_main.c:2863:20: warning: restricted __be16 degrades to integer
 .../bond_main.c:2939:40: warning: incorrect type in assignment (different base types)
 .../bond_main.c:2939:40:    expected restricted __be16 [usertype] vlan_proto
 .../bond_main.c:2939:40:    got int

No functional changes intended.
Compile tested only.

Signed-off-by: Simon Horman &lt;horms@kernel.org&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The type of the vlan_proto field is __be16.
And most users of the field use it as such.

In the case of setting or testing the field for the special VLAN_N_VID
value, host byte order is used. Which seems incorrect.

It also seems somewhat odd to store a VLAN ID value in a field that is
otherwise used to store Ether types.

Address this issue by defining BOND_VLAN_PROTO_NONE, a big endian value.
0xffff was chosen somewhat arbitrarily. What is important is that it
doesn't overlap with any valid VLAN Ether types.

I don't believe the problems described above are a bug because
VLAN_N_VID in both little-endian and big-endian byte order does not
conflict with any supported VLAN Ether types in big-endian byte order.

Reported by sparse as:

 .../bond_main.c:2857:26: warning: restricted __be16 degrades to integer
 .../bond_main.c:2863:20: warning: restricted __be16 degrades to integer
 .../bond_main.c:2939:40: warning: incorrect type in assignment (different base types)
 .../bond_main.c:2939:40:    expected restricted __be16 [usertype] vlan_proto
 .../bond_main.c:2939:40:    got int

No functional changes intended.
Compile tested only.

Signed-off-by: Simon Horman &lt;horms@kernel.org&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
