<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/net/bonding, branch v6.0</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>bonding: fix NULL deref in bond_rr_gen_slave_id</title>
<updated>2022-09-22T13:39:40+00:00</updated>
<author>
<name>Jonathan Toppins</name>
<email>jtoppins@redhat.com</email>
</author>
<published>2022-09-20T17:45:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0e400d602f46360752e4b32ce842dba3808e15e6'/>
<id>0e400d602f46360752e4b32ce842dba3808e15e6</id>
<content type='text'>
Fix a NULL dereference of the struct bonding.rr_tx_counter member because
if a bond is initially created with an initial mode != zero (Round Robin)
the memory required for the counter is never created and when the mode is
changed there is never any attempt to verify the memory is allocated upon
switching modes.

This causes the following Oops on an aarch64 machine:
    [  334.686773] Unable to handle kernel paging request at virtual address ffff2c91ac905000
    [  334.694703] Mem abort info:
    [  334.697486]   ESR = 0x0000000096000004
    [  334.701234]   EC = 0x25: DABT (current EL), IL = 32 bits
    [  334.706536]   SET = 0, FnV = 0
    [  334.709579]   EA = 0, S1PTW = 0
    [  334.712719]   FSC = 0x04: level 0 translation fault
    [  334.717586] Data abort info:
    [  334.720454]   ISV = 0, ISS = 0x00000004
    [  334.724288]   CM = 0, WnR = 0
    [  334.727244] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000008044d662000
    [  334.733944] [ffff2c91ac905000] pgd=0000000000000000, p4d=0000000000000000
    [  334.740734] Internal error: Oops: 96000004 [#1] SMP
    [  334.745602] Modules linked in: bonding tls veth rfkill sunrpc arm_spe_pmu vfat fat acpi_ipmi ipmi_ssif ixgbe igb i40e mdio ipmi_devintf ipmi_msghandler arm_cmn arm_dsu_pmu cppc_cpufreq acpi_tad fuse zram crct10dif_ce ast ghash_ce sbsa_gwdt nvme drm_vram_helper drm_ttm_helper nvme_core ttm xgene_hwmon
    [  334.772217] CPU: 7 PID: 2214 Comm: ping Not tainted 6.0.0-rc4-00133-g64ae13ed4784 #4
    [  334.779950] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F18v (SCP: 1.08.20211002) 12/01/2021
    [  334.789244] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  334.796196] pc : bond_rr_gen_slave_id+0x40/0x124 [bonding]
    [  334.801691] lr : bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
    [  334.807962] sp : ffff8000221733e0
    [  334.811265] x29: ffff8000221733e0 x28: ffffdbac8572d198 x27: ffff80002217357c
    [  334.818392] x26: 000000000000002a x25: ffffdbacb33ee000 x24: ffff07ff980fa000
    [  334.825519] x23: ffffdbacb2e398ba x22: ffff07ff98102000 x21: ffff07ff981029c0
    [  334.832646] x20: 0000000000000001 x19: ffff07ff981029c0 x18: 0000000000000014
    [  334.839773] x17: 0000000000000000 x16: ffffdbacb1004364 x15: 0000aaaabe2f5a62
    [  334.846899] x14: ffff07ff8e55d968 x13: ffff07ff8e55db30 x12: 0000000000000000
    [  334.854026] x11: ffffdbacb21532e8 x10: 0000000000000001 x9 : ffffdbac857178ec
    [  334.861153] x8 : ffff07ff9f6e5a28 x7 : 0000000000000000 x6 : 000000007c2b3742
    [  334.868279] x5 : ffff2c91ac905000 x4 : ffff2c91ac905000 x3 : ffff07ff9f554400
    [  334.875406] x2 : ffff2c91ac905000 x1 : 0000000000000001 x0 : ffff07ff981029c0
    [  334.882532] Call trace:
    [  334.884967]  bond_rr_gen_slave_id+0x40/0x124 [bonding]
    [  334.890109]  bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
    [  334.896033]  __bond_start_xmit+0x128/0x3a0 [bonding]
    [  334.901001]  bond_start_xmit+0x54/0xb0 [bonding]
    [  334.905622]  dev_hard_start_xmit+0xb4/0x220
    [  334.909798]  __dev_queue_xmit+0x1a0/0x720
    [  334.913799]  arp_xmit+0x3c/0xbc
    [  334.916932]  arp_send_dst+0x98/0xd0
    [  334.920410]  arp_solicit+0xe8/0x230
    [  334.923888]  neigh_probe+0x60/0xb0
    [  334.927279]  __neigh_event_send+0x3b0/0x470
    [  334.931453]  neigh_resolve_output+0x70/0x90
    [  334.935626]  ip_finish_output2+0x158/0x514
    [  334.939714]  __ip_finish_output+0xac/0x1a4
    [  334.943800]  ip_finish_output+0x40/0xfc
    [  334.947626]  ip_output+0xf8/0x1a4
    [  334.950931]  ip_send_skb+0x5c/0x100
    [  334.954410]  ip_push_pending_frames+0x3c/0x60
    [  334.958758]  raw_sendmsg+0x458/0x6d0
    [  334.962325]  inet_sendmsg+0x50/0x80
    [  334.965805]  sock_sendmsg+0x60/0x6c
    [  334.969286]  __sys_sendto+0xc8/0x134
    [  334.972853]  __arm64_sys_sendto+0x34/0x4c
    [  334.976854]  invoke_syscall+0x78/0x100
    [  334.980594]  el0_svc_common.constprop.0+0x4c/0xf4
    [  334.985287]  do_el0_svc+0x38/0x4c
    [  334.988591]  el0_svc+0x34/0x10c
    [  334.991724]  el0t_64_sync_handler+0x11c/0x150
    [  334.996072]  el0t_64_sync+0x190/0x194
    [  334.999726] Code: b9001062 f9403c02 d53cd044 8b040042 (b8210040)
    [  335.005810] ---[ end trace 0000000000000000 ]---
    [  335.010416] Kernel panic - not syncing: Oops: Fatal exception in interrupt
    [  335.017279] SMP: stopping secondary CPUs
    [  335.021374] Kernel Offset: 0x5baca8eb0000 from 0xffff800008000000
    [  335.027456] PHYS_OFFSET: 0x80000000
    [  335.030932] CPU features: 0x0000,0085c029,19805c82
    [  335.035713] Memory Limit: none
    [  335.038756] Rebooting in 180 seconds..

The fix is to allocate the memory in bond_open() which is guaranteed
to be called before any packets are processed.

Fixes: 848ca9182a7d ("net: bonding: Use per-cpu rr_tx_counter")
CC: Jussi Maki &lt;joamaki@gmail.com&gt;
Signed-off-by: Jonathan Toppins &lt;jtoppins@redhat.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix a NULL dereference of the struct bonding.rr_tx_counter member because
if a bond is initially created with an initial mode != zero (Round Robin)
the memory required for the counter is never created and when the mode is
changed there is never any attempt to verify the memory is allocated upon
switching modes.

This causes the following Oops on an aarch64 machine:
    [  334.686773] Unable to handle kernel paging request at virtual address ffff2c91ac905000
    [  334.694703] Mem abort info:
    [  334.697486]   ESR = 0x0000000096000004
    [  334.701234]   EC = 0x25: DABT (current EL), IL = 32 bits
    [  334.706536]   SET = 0, FnV = 0
    [  334.709579]   EA = 0, S1PTW = 0
    [  334.712719]   FSC = 0x04: level 0 translation fault
    [  334.717586] Data abort info:
    [  334.720454]   ISV = 0, ISS = 0x00000004
    [  334.724288]   CM = 0, WnR = 0
    [  334.727244] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000008044d662000
    [  334.733944] [ffff2c91ac905000] pgd=0000000000000000, p4d=0000000000000000
    [  334.740734] Internal error: Oops: 96000004 [#1] SMP
    [  334.745602] Modules linked in: bonding tls veth rfkill sunrpc arm_spe_pmu vfat fat acpi_ipmi ipmi_ssif ixgbe igb i40e mdio ipmi_devintf ipmi_msghandler arm_cmn arm_dsu_pmu cppc_cpufreq acpi_tad fuse zram crct10dif_ce ast ghash_ce sbsa_gwdt nvme drm_vram_helper drm_ttm_helper nvme_core ttm xgene_hwmon
    [  334.772217] CPU: 7 PID: 2214 Comm: ping Not tainted 6.0.0-rc4-00133-g64ae13ed4784 #4
    [  334.779950] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F18v (SCP: 1.08.20211002) 12/01/2021
    [  334.789244] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [  334.796196] pc : bond_rr_gen_slave_id+0x40/0x124 [bonding]
    [  334.801691] lr : bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
    [  334.807962] sp : ffff8000221733e0
    [  334.811265] x29: ffff8000221733e0 x28: ffffdbac8572d198 x27: ffff80002217357c
    [  334.818392] x26: 000000000000002a x25: ffffdbacb33ee000 x24: ffff07ff980fa000
    [  334.825519] x23: ffffdbacb2e398ba x22: ffff07ff98102000 x21: ffff07ff981029c0
    [  334.832646] x20: 0000000000000001 x19: ffff07ff981029c0 x18: 0000000000000014
    [  334.839773] x17: 0000000000000000 x16: ffffdbacb1004364 x15: 0000aaaabe2f5a62
    [  334.846899] x14: ffff07ff8e55d968 x13: ffff07ff8e55db30 x12: 0000000000000000
    [  334.854026] x11: ffffdbacb21532e8 x10: 0000000000000001 x9 : ffffdbac857178ec
    [  334.861153] x8 : ffff07ff9f6e5a28 x7 : 0000000000000000 x6 : 000000007c2b3742
    [  334.868279] x5 : ffff2c91ac905000 x4 : ffff2c91ac905000 x3 : ffff07ff9f554400
    [  334.875406] x2 : ffff2c91ac905000 x1 : 0000000000000001 x0 : ffff07ff981029c0
    [  334.882532] Call trace:
    [  334.884967]  bond_rr_gen_slave_id+0x40/0x124 [bonding]
    [  334.890109]  bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
    [  334.896033]  __bond_start_xmit+0x128/0x3a0 [bonding]
    [  334.901001]  bond_start_xmit+0x54/0xb0 [bonding]
    [  334.905622]  dev_hard_start_xmit+0xb4/0x220
    [  334.909798]  __dev_queue_xmit+0x1a0/0x720
    [  334.913799]  arp_xmit+0x3c/0xbc
    [  334.916932]  arp_send_dst+0x98/0xd0
    [  334.920410]  arp_solicit+0xe8/0x230
    [  334.923888]  neigh_probe+0x60/0xb0
    [  334.927279]  __neigh_event_send+0x3b0/0x470
    [  334.931453]  neigh_resolve_output+0x70/0x90
    [  334.935626]  ip_finish_output2+0x158/0x514
    [  334.939714]  __ip_finish_output+0xac/0x1a4
    [  334.943800]  ip_finish_output+0x40/0xfc
    [  334.947626]  ip_output+0xf8/0x1a4
    [  334.950931]  ip_send_skb+0x5c/0x100
    [  334.954410]  ip_push_pending_frames+0x3c/0x60
    [  334.958758]  raw_sendmsg+0x458/0x6d0
    [  334.962325]  inet_sendmsg+0x50/0x80
    [  334.965805]  sock_sendmsg+0x60/0x6c
    [  334.969286]  __sys_sendto+0xc8/0x134
    [  334.972853]  __arm64_sys_sendto+0x34/0x4c
    [  334.976854]  invoke_syscall+0x78/0x100
    [  334.980594]  el0_svc_common.constprop.0+0x4c/0xf4
    [  334.985287]  do_el0_svc+0x38/0x4c
    [  334.988591]  el0_svc+0x34/0x10c
    [  334.991724]  el0t_64_sync_handler+0x11c/0x150
    [  334.996072]  el0t_64_sync+0x190/0x194
    [  334.999726] Code: b9001062 f9403c02 d53cd044 8b040042 (b8210040)
    [  335.005810] ---[ end trace 0000000000000000 ]---
    [  335.010416] Kernel panic - not syncing: Oops: Fatal exception in interrupt
    [  335.017279] SMP: stopping secondary CPUs
    [  335.021374] Kernel Offset: 0x5baca8eb0000 from 0xffff800008000000
    [  335.027456] PHYS_OFFSET: 0x80000000
    [  335.030932] CPU features: 0x0000,0085c029,19805c82
    [  335.035713] Memory Limit: none
    [  335.038756] Rebooting in 180 seconds..

The fix is to allocate the memory in bond_open() which is guaranteed
to be called before any packets are processed.

Fixes: 848ca9182a7d ("net: bonding: Use per-cpu rr_tx_counter")
CC: Jussi Maki &lt;joamaki@gmail.com&gt;
Signed-off-by: Jonathan Toppins &lt;jtoppins@redhat.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bonding: Unsync device addresses on ndo_stop</title>
<updated>2022-09-16T13:34:01+00:00</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@nvidia.com</email>
</author>
<published>2022-09-07T07:56:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=86247aba599e5b07d7e828e6edaaebb0ef2b1158'/>
<id>86247aba599e5b07d7e828e6edaaebb0ef2b1158</id>
<content type='text'>
Netdev drivers are expected to call dev_{uc,mc}_sync() in their
ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
This is mentioned in the kerneldoc for those dev_* functions.

The bonding driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
ndo_stop. This is ineffective because address lists (dev-&gt;{uc,mc}) have
already been emptied in unregister_netdevice_many() before ndo_uninit is
called. This mistake can result in addresses being leftover on former bond
slaves after a bond has been deleted; see test_LAG_cleanup() in the last
patch in this series.

Add unsync calls, via bond_hw_addr_flush(), at their expected location,
bond_close().
Add dev_mc_add() call to bond_open() to match the above change.

v3:
* When adding or deleting a slave, only sync/unsync, add/del addresses if
  the bond is up. In other cases, it is taken care of at the right time by
  ndo_open/ndo_set_rx_mode/ndo_stop.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Benjamin Poirier &lt;bpoirier@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Netdev drivers are expected to call dev_{uc,mc}_sync() in their
ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
This is mentioned in the kerneldoc for those dev_* functions.

The bonding driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
ndo_stop. This is ineffective because address lists (dev-&gt;{uc,mc}) have
already been emptied in unregister_netdevice_many() before ndo_uninit is
called. This mistake can result in addresses being leftover on former bond
slaves after a bond has been deleted; see test_LAG_cleanup() in the last
patch in this series.

Add unsync calls, via bond_hw_addr_flush(), at their expected location,
bond_close().
Add dev_mc_add() call to bond_open() to match the above change.

v3:
* When adding or deleting a slave, only sync/unsync, add/del addresses if
  the bond is up. In other cases, it is taken care of at the right time by
  ndo_open/ndo_set_rx_mode/ndo_stop.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Benjamin Poirier &lt;bpoirier@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bonding: Share lacpdu_mcast_addr definition</title>
<updated>2022-09-16T13:34:01+00:00</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@nvidia.com</email>
</author>
<published>2022-09-07T07:56:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1d9a143ee3408349700f44a9197b7ae0e4faae5d'/>
<id>1d9a143ee3408349700f44a9197b7ae0e4faae5d</id>
<content type='text'>
There are already a few definitions of arrays containing
MULTICAST_LACPDU_ADDR and the next patch will add one more use. These all
contain the same constant data so define one common instance for all
bonding code.

Signed-off-by: Benjamin Poirier &lt;bpoirier@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are already a few definitions of arrays containing
MULTICAST_LACPDU_ADDR and the next patch will add one more use. These all
contain the same constant data so define one common instance for all
bonding code.

Signed-off-by: Benjamin Poirier &lt;bpoirier@nvidia.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: accept unsolicited NA message</title>
<updated>2022-09-05T09:07:05+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2022-08-30T09:37:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=592335a4164c3c41f57967223a1e1efe3a0c6eb3'/>
<id>592335a4164c3c41f57967223a1e1efe3a0c6eb3</id>
<content type='text'>
The unsolicited NA message with all-nodes multicast dest address should
be valid, as this also means the link could reach the target.

Also rename bond_validate_ns() to bond_validate_na().

Reported-by: LiLiang &lt;liali@redhat.com&gt;
Fixes: 5e1eeef69c0f ("bonding: NS target should accept link local address")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The unsolicited NA message with all-nodes multicast dest address should
be valid, as this also means the link could reach the target.

Also rename bond_validate_ns() to bond_validate_na().

Reported-by: LiLiang &lt;liali@redhat.com&gt;
Fixes: 5e1eeef69c0f ("bonding: NS target should accept link local address")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: use unspecified address if no available link local address</title>
<updated>2022-09-05T09:07:05+00:00</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2022-08-30T09:37:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b7f14132bf58256e841774ae07d3ffb7a841c2bc'/>
<id>b7f14132bf58256e841774ae07d3ffb7a841c2bc</id>
<content type='text'>
When ns_ip6_target was set, the ipv6_dev_get_saddr() will be called to get
available source address and send IPv6 neighbor solicit message.

If the target is global address, ipv6_dev_get_saddr() will get any
available src address. But if the target is link local address,
ipv6_dev_get_saddr() will only get available address from our interface,
i.e. the corresponding bond interface.

But before bond interface up, all the address is tentative, while
ipv6_dev_get_saddr() will ignore tentative address. This makes we can't
find available link local src address, then bond_ns_send() will not be
called and no NS message was sent. Finally bond interface will keep in
down state.

Fix this by sending NS with unspecified address if there is no available
source address.

Reported-by: LiLiang &lt;liali@redhat.com&gt;
Fixes: 5e1eeef69c0f ("bonding: NS target should accept link local address")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When ns_ip6_target was set, the ipv6_dev_get_saddr() will be called to get
available source address and send IPv6 neighbor solicit message.

If the target is global address, ipv6_dev_get_saddr() will get any
available src address. But if the target is link local address,
ipv6_dev_get_saddr() will only get available address from our interface,
i.e. the corresponding bond interface.

But before bond interface up, all the address is tentative, while
ipv6_dev_get_saddr() will ignore tentative address. This makes we can't
find available link local src address, then bond_ns_send() will not be
called and no NS message was sent. Finally bond interface will keep in
down state.

Fix this by sending NS with unspecified address if there is no available
source address.

Reported-by: LiLiang &lt;liali@redhat.com&gt;
Fixes: 5e1eeef69c0f ("bonding: NS target should accept link local address")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: 3ad: make ad_ticks_per_sec a const</title>
<updated>2022-08-23T01:30:24+00:00</updated>
<author>
<name>Jonathan Toppins</name>
<email>jtoppins@redhat.com</email>
</author>
<published>2022-08-19T15:15:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f2e44dffa97f2e1c222a959ea5b6e604548b891b'/>
<id>f2e44dffa97f2e1c222a959ea5b6e604548b891b</id>
<content type='text'>
The value is only ever set once in bond_3ad_initialize and only ever
read otherwise. There seems to be no reason to set the variable via
bond_3ad_initialize when setting the global variable will do. Change
ad_ticks_per_sec to a const to enforce its read-only usage.

Signed-off-by: Jonathan Toppins &lt;jtoppins@redhat.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The value is only ever set once in bond_3ad_initialize and only ever
read otherwise. There seems to be no reason to set the variable via
bond_3ad_initialize when setting the global variable will do. Change
ad_ticks_per_sec to a const to enforce its read-only usage.

Signed-off-by: Jonathan Toppins &lt;jtoppins@redhat.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: 802.3ad: fix no transmission of LACPDUs</title>
<updated>2022-08-23T01:30:19+00:00</updated>
<author>
<name>Jonathan Toppins</name>
<email>jtoppins@redhat.com</email>
</author>
<published>2022-08-19T15:15:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d745b5062ad2b5da90a5e728d7ca884fc07315fd'/>
<id>d745b5062ad2b5da90a5e728d7ca884fc07315fd</id>
<content type='text'>
This is caused by the global variable ad_ticks_per_sec being zero as
demonstrated by the reproducer script discussed below. This causes
all timer values in __ad_timer_to_ticks to be zero, resulting
in the periodic timer to never fire.

To reproduce:
Run the script in
`tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh` which
puts bonding into a state where it never transmits LACPDUs.

line 44: ip link add fbond type bond mode 4 miimon 200 \
            xmit_hash_policy 1 ad_actor_sys_prio 65535 lacp_rate fast
setting bond param: ad_actor_sys_prio
given:
    params.ad_actor_system = 0
call stack:
    bond_option_ad_actor_sys_prio()
    -&gt; bond_3ad_update_ad_actor_settings()
       -&gt; set ad.system.sys_priority = bond-&gt;params.ad_actor_sys_prio
       -&gt; ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr; because
            params.ad_actor_system == 0
results:
     ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr

line 48: ip link set fbond address 52:54:00:3B:7C:A6
setting bond MAC addr
call stack:
    bond-&gt;dev-&gt;dev_addr = new_mac

line 52: ip link set fbond type bond ad_actor_sys_prio 65535
setting bond param: ad_actor_sys_prio
given:
    params.ad_actor_system = 0
call stack:
    bond_option_ad_actor_sys_prio()
    -&gt; bond_3ad_update_ad_actor_settings()
       -&gt; set ad.system.sys_priority = bond-&gt;params.ad_actor_sys_prio
       -&gt; ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr; because
            params.ad_actor_system == 0
results:
     ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr

line 60: ip link set veth1-bond down master fbond
given:
    params.ad_actor_system = 0
    params.mode = BOND_MODE_8023AD
    ad.system.sys_mac_addr == bond-&gt;dev-&gt;dev_addr
call stack:
    bond_enslave
    -&gt; bond_3ad_initialize(); because first slave
       -&gt; if ad.system.sys_mac_addr != bond-&gt;dev-&gt;dev_addr
          return
results:
     Nothing is run in bond_3ad_initialize() because dev_addr equals
     sys_mac_addr leaving the global ad_ticks_per_sec zero as it is
     never initialized anywhere else.

The if check around the contents of bond_3ad_initialize() is no longer
needed due to commit 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings
changes immediately") which sets ad.system.sys_mac_addr if any one of
the bonding parameters whos set function calls
bond_3ad_update_ad_actor_settings(). This is because if
ad.system.sys_mac_addr is zero it will be set to the current bond mac
address, this causes the if check to never be true.

Fixes: 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings changes immediately")
Signed-off-by: Jonathan Toppins &lt;jtoppins@redhat.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is caused by the global variable ad_ticks_per_sec being zero as
demonstrated by the reproducer script discussed below. This causes
all timer values in __ad_timer_to_ticks to be zero, resulting
in the periodic timer to never fire.

To reproduce:
Run the script in
`tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh` which
puts bonding into a state where it never transmits LACPDUs.

line 44: ip link add fbond type bond mode 4 miimon 200 \
            xmit_hash_policy 1 ad_actor_sys_prio 65535 lacp_rate fast
setting bond param: ad_actor_sys_prio
given:
    params.ad_actor_system = 0
call stack:
    bond_option_ad_actor_sys_prio()
    -&gt; bond_3ad_update_ad_actor_settings()
       -&gt; set ad.system.sys_priority = bond-&gt;params.ad_actor_sys_prio
       -&gt; ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr; because
            params.ad_actor_system == 0
results:
     ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr

line 48: ip link set fbond address 52:54:00:3B:7C:A6
setting bond MAC addr
call stack:
    bond-&gt;dev-&gt;dev_addr = new_mac

line 52: ip link set fbond type bond ad_actor_sys_prio 65535
setting bond param: ad_actor_sys_prio
given:
    params.ad_actor_system = 0
call stack:
    bond_option_ad_actor_sys_prio()
    -&gt; bond_3ad_update_ad_actor_settings()
       -&gt; set ad.system.sys_priority = bond-&gt;params.ad_actor_sys_prio
       -&gt; ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr; because
            params.ad_actor_system == 0
results:
     ad.system.sys_mac_addr = bond-&gt;dev-&gt;dev_addr

line 60: ip link set veth1-bond down master fbond
given:
    params.ad_actor_system = 0
    params.mode = BOND_MODE_8023AD
    ad.system.sys_mac_addr == bond-&gt;dev-&gt;dev_addr
call stack:
    bond_enslave
    -&gt; bond_3ad_initialize(); because first slave
       -&gt; if ad.system.sys_mac_addr != bond-&gt;dev-&gt;dev_addr
          return
results:
     Nothing is run in bond_3ad_initialize() because dev_addr equals
     sys_mac_addr leaving the global ad_ticks_per_sec zero as it is
     never initialized anywhere else.

The if check around the contents of bond_3ad_initialize() is no longer
needed due to commit 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings
changes immediately") which sets ad.system.sys_mac_addr if any one of
the bonding parameters whos set function calls
bond_3ad_update_ad_actor_settings(). This is because if
ad.system.sys_mac_addr is zero it will be set to the current bond mac
address, this causes the if check to never be true.

Fixes: 5ee14e6d336f ("bonding: 3ad: apply ad_actor settings changes immediately")
Signed-off-by: Jonathan Toppins &lt;jtoppins@redhat.com&gt;
Acked-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bonding: fix reference count leak in balance-alb mode</title>
<updated>2022-08-11T15:51:05+00:00</updated>
<author>
<name>Jay Vosburgh</name>
<email>jay.vosburgh@canonical.com</email>
</author>
<published>2022-08-11T05:06:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4f5d33f4f798b1c6d92b613f0087f639d9836971'/>
<id>4f5d33f4f798b1c6d92b613f0087f639d9836971</id>
<content type='text'>
Commit d5410ac7b0ba ("net:bonding:support balance-alb interface
with vlan to bridge") introduced a reference count leak by not releasing
the reference acquired by ip_dev_find().  Remedy this by insuring the
reference is released.

Fixes: d5410ac7b0ba ("net:bonding:support balance-alb interface with vlan to bridge")
Signed-off-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/r/26758.1660194413@famine
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit d5410ac7b0ba ("net:bonding:support balance-alb interface
with vlan to bridge") introduced a reference count leak by not releasing
the reference acquired by ip_dev_find().  Remedy this by insuring the
reference is released.

Fixes: d5410ac7b0ba ("net:bonding:support balance-alb interface with vlan to bridge")
Signed-off-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Link: https://lore.kernel.org/r/26758.1660194413@famine
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/tls: Use RCU API to access tls_ctx-&gt;netdev</title>
<updated>2022-08-11T05:58:43+00:00</updated>
<author>
<name>Maxim Mikityanskiy</name>
<email>maximmi@nvidia.com</email>
</author>
<published>2022-08-10T08:16:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=94ce3b64c62d4b628cf85cd0d9a370aca8f7e43a'/>
<id>94ce3b64c62d4b628cf85cd0d9a370aca8f7e43a</id>
<content type='text'>
Currently, tls_device_down synchronizes with tls_device_resync_rx using
RCU, however, the pointer to netdev is stored using WRITE_ONCE and
loaded using READ_ONCE.

Although such approach is technically correct (rcu_dereference is
essentially a READ_ONCE, and rcu_assign_pointer uses WRITE_ONCE to store
NULL), using special RCU helpers for pointers is more valid, as it
includes additional checks and might change the implementation
transparently to the callers.

Mark the netdev pointer as __rcu and use the correct RCU helpers to
access it. For non-concurrent access pass the right conditions that
guarantee safe access (locks taken, refcount value). Also use the
correct helper in mlx5e, where even READ_ONCE was missing.

The transition to RCU exposes existing issues, fixed by this commit:

1. bond_tls_device_xmit could read netdev twice, and it could become
NULL the second time, after the NULL check passed.

2. Drivers shouldn't stop processing the last packet if tls_device_down
just set netdev to NULL, before tls_dev_del was called. This prevents a
possible packet drop when transitioning to the fallback software mode.

Fixes: 89df6a810470 ("net/bonding: Implement TLS TX device offload")
Fixes: c55dcdd435aa ("net/tls: Fix use-after-free after the TLS device goes down and up")
Signed-off-by: Maxim Mikityanskiy &lt;maximmi@nvidia.com&gt;
Link: https://lore.kernel.org/r/20220810081602.1435800-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, tls_device_down synchronizes with tls_device_resync_rx using
RCU, however, the pointer to netdev is stored using WRITE_ONCE and
loaded using READ_ONCE.

Although such approach is technically correct (rcu_dereference is
essentially a READ_ONCE, and rcu_assign_pointer uses WRITE_ONCE to store
NULL), using special RCU helpers for pointers is more valid, as it
includes additional checks and might change the implementation
transparently to the callers.

Mark the netdev pointer as __rcu and use the correct RCU helpers to
access it. For non-concurrent access pass the right conditions that
guarantee safe access (locks taken, refcount value). Also use the
correct helper in mlx5e, where even READ_ONCE was missing.

The transition to RCU exposes existing issues, fixed by this commit:

1. bond_tls_device_xmit could read netdev twice, and it could become
NULL the second time, after the NULL check passed.

2. Drivers shouldn't stop processing the last packet if tls_device_down
just set netdev to NULL, before tls_dev_del was called. This prevents a
possible packet drop when transitioning to the fallback software mode.

Fixes: 89df6a810470 ("net/bonding: Implement TLS TX device offload")
Fixes: c55dcdd435aa ("net/tls: Fix use-after-free after the TLS device goes down and up")
Signed-off-by: Maxim Mikityanskiy &lt;maximmi@nvidia.com&gt;
Link: https://lore.kernel.org/r/20220810081602.1435800-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net:bonding:support balance-alb interface with vlan to bridge</title>
<updated>2022-08-10T12:47:00+00:00</updated>
<author>
<name>Sun Shouxin</name>
<email>sunshouxin@chinatelecom.cn</email>
</author>
<published>2022-08-09T06:21:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d5410ac7b0baeca91cf73ff5241d35998ecc8c9e'/>
<id>d5410ac7b0baeca91cf73ff5241d35998ecc8c9e</id>
<content type='text'>
In my test, balance-alb bonding with two slaves eth0 and eth1,
and then Bond0.150 is created with vlan id attached bond0.
After adding bond0.150 into one linux bridge, I noted that Bond0,
bond0.150 and  bridge were assigned to the same MAC as eth0.
Once bond0.150 receives a packet whose dest IP is bridge's
and dest MAC is eth1's, the linux bridge will not match
eth1's MAC entry in FDB, and not handle it as expected.
The patch fix the issue, and diagram as below:

eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
                      |
                   bond0.150(mac:eth0_mac)
                      |
                   bridge(ip:br_ip, mac:eth0_mac)--other port

Suggested-by: Hu Yadi &lt;huyd12@chinatelecom.cn&gt;
Signed-off-by: Sun Shouxin &lt;sunshouxin@chinatelecom.cn&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In my test, balance-alb bonding with two slaves eth0 and eth1,
and then Bond0.150 is created with vlan id attached bond0.
After adding bond0.150 into one linux bridge, I noted that Bond0,
bond0.150 and  bridge were assigned to the same MAC as eth0.
Once bond0.150 receives a packet whose dest IP is bridge's
and dest MAC is eth1's, the linux bridge will not match
eth1's MAC entry in FDB, and not handle it as expected.
The patch fix the issue, and diagram as below:

eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
                      |
                   bond0.150(mac:eth0_mac)
                      |
                   bridge(ip:br_ip, mac:eth0_mac)--other port

Suggested-by: Hu Yadi &lt;huyd12@chinatelecom.cn&gt;
Signed-off-by: Sun Shouxin &lt;sunshouxin@chinatelecom.cn&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
