<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/socket.h, branch v7.2-rc1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'net-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next</title>
<updated>2026-06-17T07:17:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-17T07:17:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b85966adbf5de0668a815c6e3527f87e0c387fb4'/>
<id>b85966adbf5de0668a815c6e3527f87e0c387fb4</id>
<content type='text'>
Pull networking updates from Jakub Kicinski:
 "Core &amp; protocols:

   - Work on removing rtnl_lock protection throughout the stack
     continues. In this chapter:
       - don't use rtnl_lock for IPv6 multicast routing configuration
       - don't take rtnl_lock in ethtool for modern drivers
       - prepare Qdisc dump callbacks for rtnl_lock removal

   - Support dumping just ifindex + name of all interfaces, under RCU.
     It's a common operation for Netlink CLI tools (when translating
     names to ifindexes) and previously required full rtnl_lock.

   - Support dumping qdiscs and page pools for a specific netdev. Even
     though user space usually wants a dump of all netdevs, the OOO
     programming model results in repeating the dump for each netdev,
     which, in the absence of a cache, leads to O(n^2) behavior.

   - Flush nexthops once on multi-nexthop removal (e.g. when device goes
     down), another O(n^2) -&gt; O(n) improvement.

   - Rehash locally generated traffic to a different nexthop on
     retransmit timeout.

   - Honor oif when choosing nexthop for locally generated IPv6 traffic.

   - Convert TCP Auth Option to crypto library, and drop non-RFC algos.

   - Increase subflow limits in MPTCP to 64 and endpoint limit to 256.

   - Support MPTCP signaling of IPv6 address + port (ADD_ADDR). We need
     to selectively skip reporting of the standard TCP Timestamp option,
     because they won't fit into the header space together (12 + 30 &gt;
     40).

   - Support using bridge neighbor suppression, Duplicate Address
     Detection, Gratuitous ARP and unsolicited NA forwarding - in EVPN
     deployments, e.g. VXLAN fabrics (IPv4 and IPv6).

   - Improve link state reporting for upper netdevs (e.g. macvlan) over
     tunnel devices (again, mostly for EVPN deployments).

   - Support binding GENEVE tunnels to a local address.

   - Speed up UDP tunnel destruction (remove one synchronize_rcu()).

   - Support exponential field encoding in multicast (IGMPv3 and MLDv2).

   - Support attaching PSP crypto offload to containers (veth, netkit).

   - Add a new IPSec Netlink message XFRM_MSG_MIGRATE_STATE that allows
     migrating individual IPsec SAs independently of their policies.

     The existing XFRM_MSG_MIGRATE is tightly coupled to policy+SA
     migration, lacks SPI for unique SA identification, and cannot
     express reqid changes or migrate Transport mode selectors.

     The new interface identifies the SA via SPI and mark, supports
     reqid changes, address family changes, encap removal, and uses an
     atomic create+install flow under x-&gt;lock to prevent SN/IV reuse
     during AEAD SA migration.

   - Implement GRO/GSO support for PPPoE.

   - Convert sockopt callbacks in a number of protocols to iov_iter.

  Cross-tree stuff:

   - Remove support for Crypto TFM cloning (unblocked after the TCP Auth
     Option rework). This feature regressed performance for all crypto
     API users, since it changed crypto transformation objects into
     reference-counted objects.

   - Add FCrypt-PCBC implementation to rxrpc and remove it from the
     global crypto API as obsolete and insecure.

  Wireless:

   - Major rework of station bandwidth handling, fixing issues with
     lower capability than AP.

   - Cleanups for EMLSR spec issues (drafts differed).

   - More Neighbor Awareness Networking (Wi-Fi Aware) work (multicast,
     schedule improvements, multi-station etc.)

   - Some Ultra High Reliability (UHR) / IEEE 802.11bn (D1.4) work
     (e.g. non-primary channel access, UHR DBE support).

   - Fine Timing Measurement ranging (i.e. distance measurement) APIs.

  Netfilter:

   - Use per-rule hash initval in nf_conncount. This avoids unnecessary
     lock contention with short keys (e.g. conntrack zones) in different
     namespaces.

   - Various safety improvements, both in packet parsing and object
     lifetimes. Notably add refcounts to conntrack timeout policy.

  Deletions:

   - Remove TLS + sockmap integration. TLS wants to pin user pages to
     avoid a copy, and sockmap wants to write to the input stream. More
     work on this integration is clearly needed, and we can't find any
     users (original author admitted that they never deployed it).

   - Remove support for TLS offload with TCP Offload Engine (the far
     more common opportunistic offload is retained). The locking looks
     unfixable (driver sleeps under TCP spin locks) and people from the
     vendor that added this are AWOL.

   - Remove more ATM code, trying to leave behind only what PPPoATM
     needs, AAL5 and br2684 with permanent circuits.

   - Remove AppleTalk. Let it join hamradio in our out of tree protocol
     graveyard, I mean, repository.

   - Disable 32-bit x_tables compatibility (32bit binaries on 64bit
     kernel) interface in user namespaces. To be deleted completely,
     soon.

   - Remove 5/10 MHz support from cfg80211/mac80211.

  Drivers:

   - Software:
       - Support DEVMEM/DMABUF Tx over NETMEM_TX_NO_DMA devices (netkit)
       - bonding: add knob to strictly follow 802.3ad for link state

   - New drivers:
       - Alibaba Elastic Ethernet Adaptor (cloud vNIC).
       - NXP NETC switch within i.MX94.

   - DPLL:
       - Add operational state to pins (implement in zl3073x).
       - Add generic DPLL type, for daisy-chaining DPLLs (implement in ice).

   - Ethernet high-speed NICs:
       - Huawei (hinic3):
           - enhance tc flow offload support with queue selection,
             tunnels
       - nVidia/Mellanox:
           - avoid over-copying payload to the skb's linear part (up to
             60% win for LRO on slow CPUs like ARM64 V2)
           - expose more per-queue stats over the standard API
           - support additional, unprivileged PFs in the DPU
             configuration
           - support Socket Direct (multi-PF) with switchdev offloads
           - add a pool / frag allocator for DMA mapped buffers for
             control objects, save memory on systems with 64kB page size
           - take advantage of the ability to dynamically change RSS
             table size, even when table is configured by the user
           - increase the max RSS table size for even traffic
             distribution

   - Ethernet NICs:
       - Marvell/Aquantia:
           - AQC113 PTP support
       - Realtek USB (r8152):
           - support 10Gbit Link Speeds and Energy-Efficient Ethernet
             (EEE)
           - support firmware loading (for RTL8157/RTL8159)
           - support for the RTL8159
       - Intel (ixgbe):
           - support Energy-Efficient Ethernet (EEE) on E610 devices

   - Ethernet switches:
       - Airoha:
           - support multiple netdevs on a single GDM block / port
       - Marvell (mv88e6xxx):
           - support SERDES of mv88e6321
       - Microchip (ksz8/9):
           - rework the driver callbacks to remove one indirection layer
       - Motorcomm (yt921x):
           - support port rate policing
           - support TBF qdisc offload
           - support ACL/flower offload
       - nVidia/Mellanox:
           - expose per-PG rx_discards
       - Realtek:
           - rtl8365mb: bridge offloading and VLAN support

   - Ethernet PHYs:
       - Airoha:
           - support Airoha AN8801R Gigabit PHYs.
       - Micrel:
           - implement 3 low-loss cable tunables
       - Realtek:
           - support MDI swapping for RTL8226-CG
           - support MDIO for RTL931x
       - Qualcomm:
           - at803x: Rx and Tx clock management for IPQ5018 PHY
       - Motorcomm:
           - support YT8522 100M RMII PHY
           - set drive strength in YT8531s RGMII
       - TI:
           - dp83822: add optional external PHY clock

   - Bluetooth:
       - hci_sync: add support for HCI_LE_Set_Host_Feature [v2]
       - SMP: use AES-CMAC library API
       - Intel:
           - support Product level reset
           - support smart trigger dump
       - Mediatek:
           - add event filter to filter specific event
       - Realtek:
           - fix RTL8761B/BU broken LE extended scan

   - WiFi:
       - Broadcom (b43):
           - new support for an 11n device
       - MediaTek (mt76):
           - support mt7927
           - mt792x: broken usb transport detection
           - mt7921: regulatory improvements
       - Qualcomm (ath9k):
           - GPIO interface improvements
       - Qualcomm (ath12k):
           - WDS support
           - replace dynamic memory allocation in WMI Rx path
           - thermal throttling/cooling device support
           - 6 GHz incumbent interference detection
           - channel 177 in 5 GHz
       - Realtek (rtw89):
           - RTL8922AU support
           - USB 3 mode switch for performance
           - better monitor radiotap support
           - RTL8922DE preparations"

* tag 'net-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1778 commits)
  ipv4: fib_rule: Move fib4_rules_exit() to -&gt;exit().
  net: serialize netif_running() check in enqueue_to_backlog()
  net: skmsg: preserve sg.copy across SG transforms
  appletalk: move the protocol out of tree
  appletalk: stop storing per-interface state in struct net_device
  selftests/bpf: test that TLS crypto is rejected on a sockmap socket
  selftests/bpf: drop the unused kTLS program from test_sockmap
  selftests/bpf: remove sockmap + ktls tests
  tls: remove dead sockmap (psock) handling from the SW path
  tls: reject the combination of TLS and sockmap
  atm: remove orphaned uAPI for deleted drivers, protocols and SVCs
  atm: remove unused ATM PHY operations
  atm: remove the unused pre_send and send_bh device operations
  atm: remove the unused change_qos device operation
  atm: remove SVC socket support and the signaling daemon interface
  atm: remove the local ATM (NSAP) address registry
  atm: remove dead SONET PHY ioctls
  atm: remove the unused send_oam / push_oam callbacks
  atm: remove AAL3/4 transport support
  net: dsa: sja1105: fix lastused timestamp in flower stats
  ...
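
The header-space arithmetic in the MPTCP ADD_ADDR item above can be
checked with a minimal sketch. The 12 and 30 byte option sizes and the
40 byte limit come from the text; the 20 and 60 byte header sizes are
standard TCP values, and all names are hypothetical.

```python
# Hypothetical names; all sizes are in bytes.
TCP_MAX_HEADER = 60        # largest TCP header (data offset 15 * 4)
TCP_BASE_HEADER = 20       # fixed part of the TCP header
TCP_OPTION_SPACE = TCP_MAX_HEADER - TCP_BASE_HEADER  # 40, as in the text

TIMESTAMP_OPTION = 12      # TCP Timestamp option, padded (from the text)
ADD_ADDR_V6_PORT = 30      # ADD_ADDR with IPv6 address + port (from the text)


def option_overflow(*option_sizes):
    """Return by how many bytes the options exceed the option space (0 if they fit)."""
    return max(0, sum(option_sizes) - TCP_OPTION_SPACE)


both = option_overflow(TIMESTAMP_OPTION, ADD_ADDR_V6_PORT)
alone = option_overflow(ADD_ADDR_V6_PORT)
print(both, alone)
```

Sending both options overshoots the option space, while ADD_ADDR alone
fits, which is why the Timestamp option is skipped selectively.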
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking updates from Jakub Kicinski:
 "Core &amp; protocols:

   - Work on removing rtnl_lock protection throughout the stack
     continues. In this chapter:
       - don't use rtnl_lock for IPv6 multicast routing configuration
       - don't take rtnl_lock in ethtool for modern drivers
       - prepare Qdisc dump callbacks for rtnl_lock removal

   - Support dumping just ifindex + name of all interfaces, under RCU.
     It's a common operation for Netlink CLI tools (when translating
     names to ifindexes) and previously required full rtnl_lock.

   - Support dumping qdiscs and page pools for a specific netdev. Even
     though user space usually wants a dump of all netdevs, the OOO
     programming model results in repeating the dump for each netdev,
     which, in the absence of a cache, leads to O(n^2) behavior.

   - Flush nexthops once on multi-nexthop removal (e.g. when device goes
     down), another O(n^2) -&gt; O(n) improvement.

   - Rehash locally generated traffic to a different nexthop on
     retransmit timeout.

   - Honor oif when choosing nexthop for locally generated IPv6 traffic.

   - Convert TCP Auth Option to crypto library, and drop non-RFC algos.

   - Increase subflow limits in MPTCP to 64 and endpoint limit to 256.

   - Support MPTCP signaling of IPv6 address + port (ADD_ADDR). We need
     to selectively skip reporting of the standard TCP Timestamp option,
     because they won't fit into the header space together (12 + 30 &gt;
     40).

   - Support using bridge neighbor suppression, Duplicate Address
     Detection, Gratuitous ARP and unsolicited NA forwarding - in EVPN
     deployments, e.g. VXLAN fabrics (IPv4 and IPv6).

   - Improve link state reporting for upper netdevs (e.g. macvlan) over
     tunnel devices (again, mostly for EVPN deployments).

   - Support binding GENEVE tunnels to a local address.

   - Speed up UDP tunnel destruction (remove one synchronize_rcu()).

   - Support exponential field encoding in multicast (IGMPv3 and MLDv2).

   - Support attaching PSP crypto offload to containers (veth, netkit).

   - Add a new IPSec Netlink message XFRM_MSG_MIGRATE_STATE that allows
     migrating individual IPsec SAs independently of their policies.

     The existing XFRM_MSG_MIGRATE is tightly coupled to policy+SA
     migration, lacks SPI for unique SA identification, and cannot
     express reqid changes or migrate Transport mode selectors.

     The new interface identifies the SA via SPI and mark, supports
     reqid changes, address family changes, encap removal, and uses an
     atomic create+install flow under x-&gt;lock to prevent SN/IV reuse
     during AEAD SA migration.

   - Implement GRO/GSO support for PPPoE.

   - Convert sockopt callbacks in a number of protocols to iov_iter.

  Cross-tree stuff:

   - Remove support for Crypto TFM cloning (unblocked after the TCP Auth
     Option rework). This feature regressed performance for all crypto
     API users, since it changed crypto transformation objects into
     reference-counted objects.

   - Add FCrypt-PCBC implementation to rxrpc and remove it from the
     global crypto API as obsolete and insecure.

  Wireless:

   - Major rework of station bandwidth handling, fixing issues with
     lower capability than AP.

   - Cleanups for EMLSR spec issues (drafts differed).

   - More Neighbor Awareness Networking (Wi-Fi Aware) work (multicast,
     schedule improvements, multi-station etc.)

   - Some Ultra High Reliability (UHR) / IEEE 802.11bn (D1.4) work
     (e.g. non-primary channel access, UHR DBE support).

   - Fine Timing Measurement ranging (i.e. distance measurement) APIs.

  Netfilter:

   - Use per-rule hash initval in nf_conncount. This avoids unnecessary
     lock contention with short keys (e.g. conntrack zones) in different
     namespaces.

   - Various safety improvements, both in packet parsing and object
     lifetimes. Notably add refcounts to conntrack timeout policy.

  Deletions:

   - Remove TLS + sockmap integration. TLS wants to pin user pages to
     avoid a copy, and sockmap wants to write to the input stream. More
     work on this integration is clearly needed, and we can't find any
     users (original author admitted that they never deployed it).

   - Remove support for TLS offload with TCP Offload Engine (the far
     more common opportunistic offload is retained). The locking looks
     unfixable (driver sleeps under TCP spin locks) and people from the
     vendor that added this are AWOL.

   - Remove more ATM code, trying to leave behind only what PPPoATM
     needs, AAL5 and br2684 with permanent circuits.

   - Remove AppleTalk. Let it join hamradio in our out of tree protocol
     graveyard, I mean, repository.

   - Disable 32-bit x_tables compatibility (32bit binaries on 64bit
     kernel) interface in user namespaces. To be deleted completely,
     soon.

   - Remove 5/10 MHz support from cfg80211/mac80211.

  Drivers:

   - Software:
       - Support DEVMEM/DMABUF Tx over NETMEM_TX_NO_DMA devices (netkit)
       - bonding: add knob to strictly follow 802.3ad for link state

   - New drivers:
       - Alibaba Elastic Ethernet Adaptor (cloud vNIC).
       - NXP NETC switch within i.MX94.

   - DPLL:
       - Add operational state to pins (implement in zl3073x).
       - Add generic DPLL type, for daisy-chaining DPLLs (implement in ice).

   - Ethernet high-speed NICs:
       - Huawei (hinic3):
           - enhance tc flow offload support with queue selection,
             tunnels
       - nVidia/Mellanox:
           - avoid over-copying payload to the skb's linear part (up to
             60% win for LRO on slow CPUs like ARM64 V2)
           - expose more per-queue stats over the standard API
           - support additional, unprivileged PFs in the DPU
             configuration
           - support Socket Direct (multi-PF) with switchdev offloads
           - add a pool / frag allocator for DMA mapped buffers for
             control objects, save memory on systems with 64kB page size
           - take advantage of the ability to dynamically change RSS
             table size, even when table is configured by the user
           - increase the max RSS table size for even traffic
             distribution

   - Ethernet NICs:
       - Marvell/Aquantia:
           - AQC113 PTP support
       - Realtek USB (r8152):
           - support 10Gbit Link Speeds and Energy-Efficient Ethernet
             (EEE)
           - support firmware loading (for RTL8157/RTL8159)
           - support for the RTL8159
       - Intel (ixgbe):
           - support Energy-Efficient Ethernet (EEE) on E610 devices

   - Ethernet switches:
       - Airoha:
           - support multiple netdevs on a single GDM block / port
       - Marvell (mv88e6xxx):
           - support SERDES of mv88e6321
       - Microchip (ksz8/9):
           - rework the driver callbacks to remove one indirection layer
       - Motorcomm (yt921x):
           - support port rate policing
           - support TBF qdisc offload
           - support ACL/flower offload
       - nVidia/Mellanox:
           - expose per-PG rx_discards
       - Realtek:
           - rtl8365mb: bridge offloading and VLAN support

   - Ethernet PHYs:
       - Airoha:
           - support Airoha AN8801R Gigabit PHYs.
       - Micrel:
           - implement 3 low-loss cable tunables
       - Realtek:
           - support MDI swapping for RTL8226-CG
           - support MDIO for RTL931x
       - Qualcomm:
           - at803x: Rx and Tx clock management for IPQ5018 PHY
       - Motorcomm:
           - support YT8522 100M RMII PHY
           - set drive strength in YT8531s RGMII
       - TI:
           - dp83822: add optional external PHY clock

   - Bluetooth:
       - hci_sync: add support for HCI_LE_Set_Host_Feature [v2]
       - SMP: use AES-CMAC library API
       - Intel:
           - support Product level reset
           - support smart trigger dump
       - Mediatek:
           - add event filter to filter specific event
       - Realtek:
           - fix RTL8761B/BU broken LE extended scan

   - WiFi:
       - Broadcom (b43):
           - new support for an 11n device
       - MediaTek (mt76):
           - support mt7927
           - mt792x: broken usb transport detection
           - mt7921: regulatory improvements
       - Qualcomm (ath9k):
           - GPIO interface improvements
       - Qualcomm (ath12k):
           - WDS support
           - replace dynamic memory allocation in WMI Rx path
           - thermal throttling/cooling device support
           - 6 GHz incumbent interference detection
           - channel 177 in 5 GHz
       - Realtek (rtw89):
           - RTL8922AU support
           - USB 3 mode switch for performance
           - better monitor radiotap support
           - RTL8922DE preparations"

* tag 'net-next-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1778 commits)
  ipv4: fib_rule: Move fib4_rules_exit() to -&gt;exit().
  net: serialize netif_running() check in enqueue_to_backlog()
  net: skmsg: preserve sg.copy across SG transforms
  appletalk: move the protocol out of tree
  appletalk: stop storing per-interface state in struct net_device
  selftests/bpf: test that TLS crypto is rejected on a sockmap socket
  selftests/bpf: drop the unused kTLS program from test_sockmap
  selftests/bpf: remove sockmap + ktls tests
  tls: remove dead sockmap (psock) handling from the SW path
  tls: reject the combination of TLS and sockmap
  atm: remove orphaned uAPI for deleted drivers, protocols and SVCs
  atm: remove unused ATM PHY operations
  atm: remove the unused pre_send and send_bh device operations
  atm: remove the unused change_qos device operation
  atm: remove SVC socket support and the signaling daemon interface
  atm: remove the local ATM (NSAP) address registry
  atm: remove dead SONET PHY ioctls
  atm: remove the unused send_oam / push_oam callbacks
  atm: remove AAL3/4 transport support
  net: dsa: sja1105: fix lastused timestamp in flower stats
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Remove support for AIO on sockets</title>
<updated>2026-05-29T06:05:30+00:00</updated>
<author>
<name>Demi Marie Obenour</name>
<email>demiobenour@gmail.com</email>
</author>
<published>2026-05-23T19:43:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fcc77d33a34cf271702e8daafb6c593e4626776d'/>
<id>fcc77d33a34cf271702e8daafb6c593e4626776d</id>
<content type='text'>
The only user of msg-&gt;msg_iocb was AF_ALG, but that's deprecated.
It can be removed entirely at the cost of only supporting synchronous
operations.  This doesn't break userspace, which will silently block
(for a bounded amount of time) in io_submit instead of operating
asynchronously.

This also makes struct msghdr smaller, helping every other caller of
sendmsg().

Signed-off-by: Demi Marie Obenour &lt;demiobenour@gmail.com&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The only user of msg-&gt;msg_iocb was AF_ALG, but that's deprecated.
It can be removed entirely at the cost of only supporting synchronous
operations.  This doesn't break userspace, which will silently block
(for a bounded amount of time) in io_submit instead of operating
asynchronously.

This also makes struct msghdr smaller, helping every other caller of
sendmsg().

Signed-off-by: Demi Marie Obenour &lt;demiobenour@gmail.com&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: block MSG_NO_SHARED_FRAGS in sendmsg()</title>
<updated>2026-05-15T01:00:40+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2026-05-12T14:02:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4987a5763fd5ab72afde7493216d944d976a0b15'/>
<id>4987a5763fd5ab72afde7493216d944d976a0b15</id>
<content type='text'>
This change should cause no difference in behavior; it just cleans up some
hazardous code that could have become a problem in the future.

MSG_NO_SHARED_FRAGS is a kernel-internal flag that cancels the effect of
MSG_SPLICE_PAGES, another kernel-internal flag that influences the
data-sharing semantics of SKBs.

Prevent passing this flag in from userspace via sendmsg() by adding it to
MSG_INTERNAL_SENDMSG_FLAGS.

This is not currently an observable problem because MSG_NO_SHARED_FRAGS
only has an effect if kernel code adds MSG_SPLICE_PAGES to it.
The only codepath that adds MSG_SPLICE_PAGES to user-supplied flags from
which MSG_NO_SHARED_FRAGS hasn't been cleared is the path
tcp_bpf_sendmsg -&gt; tcp_bpf_send_verdict -&gt; tcp_bpf_push, and that is not a
problem because tcp_bpf_sendmsg always intentionally sets
MSG_NO_SHARED_FRAGS anyway.
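
The filtering described above can be pictured with a minimal sketch. It
models the flag word as a set of names instead of bits, and the helper
name and the exact contents of the mask are assumptions for
illustration, not the kernel's definitions. Whether the kernel clears or
rejects such flags is not stated here; the sketch clears them.

```python
# Flags are modelled as names rather than bits; this is a sketch only.
# Assumption: the mask holds at least these two kernel-internal flags.
MSG_INTERNAL_SENDMSG_FLAGS = frozenset({"MSG_SPLICE_PAGES", "MSG_NO_SHARED_FRAGS"})


def sanitize_user_flags(user_flags):
    """Hypothetical helper: drop kernel-internal flags from user-supplied ones."""
    return frozenset(user_flags) - MSG_INTERNAL_SENDMSG_FLAGS


user = {"MSG_DONTWAIT", "MSG_NO_SHARED_FRAGS"}
kept = sanitize_user_flags(user)
print(sorted(kept))
```

With the internal flag in the mask, only the ordinary flag survives, so
kernel code that later adds MSG_SPLICE_PAGES never sees a user-supplied
MSG_NO_SHARED_FRAGS.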

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Link: https://patch.msgid.link/20260512-msg_no_shared_frags-v1-1-55ea46760331@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change should cause no difference in behavior; it just cleans up some
hazardous code that could have become a problem in the future.

MSG_NO_SHARED_FRAGS is a kernel-internal flag that cancels the effect of
MSG_SPLICE_PAGES, another kernel-internal flag that influences the
data-sharing semantics of SKBs.

Prevent passing this flag in from userspace via sendmsg() by adding it to
MSG_INTERNAL_SENDMSG_FLAGS.

This is not currently an observable problem because MSG_NO_SHARED_FRAGS
only has an effect if kernel code adds MSG_SPLICE_PAGES to it.
The only codepath that adds MSG_SPLICE_PAGES to user-supplied flags from
which MSG_NO_SHARED_FRAGS hasn't been cleared is the path
tcp_bpf_sendmsg -&gt; tcp_bpf_send_verdict -&gt; tcp_bpf_push, and that is not a
problem because tcp_bpf_sendmsg always intentionally sets
MSG_NO_SHARED_FRAGS anyway.

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Link: https://patch.msgid.link/20260512-msg_no_shared_frags-v1-1-55ea46760331@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: use ktime_t in struct scm_timestamping_internal</title>
<updated>2026-03-05T01:53:34+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-03-04T01:27:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c66e0f453d1afa82534383c58d503238a43fa76c'/>
<id>c66e0f453d1afa82534383c58d503238a43fa76c</id>
<content type='text'>
Instead of using struct timespec64 in scm_timestamping_internal,
use ktime_t, saving 24 bytes in kernel stack.

This makes tcp_update_recv_tstamps() small enough to be inlined.

The ktime_t -&gt; timespec64 conversions happen after socket lock
has been released in tcp_recvmsg(), and only if the application
requested them.
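
A minimal sketch of the size and conversion reasoning above, assuming a
64-bit build where ktime_t takes 8 bytes and struct timespec64 takes 16,
and assuming three timestamps per structure (which matches the 24 byte
saving). The function name is hypothetical and plain integers stand in
for the kernel types.

```python
NSEC_PER_SEC = 1_000_000_000

KTIME_SIZE = 8         # ktime_t: one signed 64-bit nanosecond count
TIMESPEC64_SIZE = 16   # struct timespec64: 64-bit seconds plus nanoseconds field
N_TIMESTAMPS = 3       # assumption: three timestamps in the structure

saved = N_TIMESTAMPS * (TIMESPEC64_SIZE - KTIME_SIZE)


def ktime_to_timespec64(ns):
    """Split a nanosecond count into (tv_sec, tv_nsec); tv_nsec is never negative."""
    return divmod(ns, NSEC_PER_SEC)


print(saved, ktime_to_timespec64(1_500_000_000))
```

Deferring this split until the application asks for timestamps keeps the
division out of the path that runs under the socket lock.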

$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 0/2 grow/shrink: 5/4 up/down: 146/-277 (-131)
Function                                     old     new   delta
tcp_zerocopy_receive                        2383    2425     +42
mptcp_recvmsg                               1565    1607     +42
tcp_recvmsg_locked                          3797    3823     +26
put_cmsg_scm_timestamping64                  131     149     +18
put_cmsg_scm_timestamping                    131     149     +18
__pfx_tcp_update_recv_tstamps                 16       -     -16
do_tcp_getsockopt                           4024    4006     -18
tcp_recv_timestamp                           474     430     -44
tcp_zc_handle_leftover                       417     371     -46
__sock_recv_timestamp                       1087    1031     -56
tcp_update_recv_tstamps                       97       -     -97
Total: Before=25223788, After=25223657, chg -0.00%

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Link: https://patch.msgid.link/20260304012747.881644-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of using struct timespec64 in scm_timestamping_internal,
use ktime_t, saving 24 bytes in kernel stack.

This makes tcp_update_recv_tstamps() small enough to be inlined.

The ktime_t -&gt; timespec64 conversions happen after socket lock
has been released in tcp_recvmsg(), and only if the application
requested them.

$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 0/2 grow/shrink: 5/4 up/down: 146/-277 (-131)
Function                                     old     new   delta
tcp_zerocopy_receive                        2383    2425     +42
mptcp_recvmsg                               1565    1607     +42
tcp_recvmsg_locked                          3797    3823     +26
put_cmsg_scm_timestamping64                  131     149     +18
put_cmsg_scm_timestamping                    131     149     +18
__pfx_tcp_update_recv_tstamps                 16       -     -16
do_tcp_getsockopt                           4024    4006     -18
tcp_recv_timestamp                           474     430     -44
tcp_zc_handle_leftover                       417     371     -46
__sock_recv_timestamp                       1087    1031     -56
tcp_update_recv_tstamps                       97       -     -97
Total: Before=25223788, After=25223657, chg -0.00%

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Link: https://patch.msgid.link/20260304012747.881644-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux</title>
<updated>2025-12-04T02:58:57+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-04T02:58:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0abcfd8983e3d3d27b8f5f7d01fed4354eb422c4'/>
<id>0abcfd8983e3d3d27b8f5f7d01fed4354eb422c4</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:

 - Unify how task_work cancelations are detected, placing it in the
   task_work running state rather than needing to check the task state

 - Series cleaning up and moving the cancelation code to where it
   belongs, in cancel.c

 - Cleanup of waitid and futex argument handling

 - Add support for mixed sized SQEs. 6.18 added support for mixed sized
   CQEs, improving flexibility and efficiency of workloads that need big
   CQEs. This adds similar support for SQEs, where the occasional need
   for a 128b SQE doesn't necessitate having all SQEs be 128b in size

 - Introduce zcrx and SQ/CQ layout queries. The former returns what zcrx
   features are available. And both return the ring size information to
   help with allocation size calculation for user provided rings like
   IORING_SETUP_NO_MMAP and IORING_MEM_REGION_TYPE_USER

 - Zcrx updates for 6.19. It includes a bunch of small patches,
   IORING_REGISTER_ZCRX_CTRL and RQ flushing and David's work on sharing
   zcrx between multiple io_uring instances

 - Series cleaning up ring initializations, notably deduplicating ring
   size and offset calculations. It also moves most of the checking
   before doing any allocations, making the code simpler

 - Add support for getsockname and getpeername, which is mostly a
   trivial hookup after a bit of refactoring on the networking side

 - Various fixes and cleanups

* tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (68 commits)
  io_uring: Introduce getsockname io_uring cmd
  socket: Split out a getsockname helper for io_uring
  socket: Unify getsockname and getpeername implementation
  io_uring/query: drop unused io_handle_query_entry() ctx arg
  io_uring/kbuf: remove obsolete buf_nr_pages and update comments
  io_uring/register: use correct location for io_rings_layout
  io_uring/zcrx: share an ifq between rings
  io_uring/zcrx: add io_fill_zcrx_offsets()
  io_uring/zcrx: export zcrx via a file
  io_uring/zcrx: move io_zcrx_scrub() and dependencies up
  io_uring/zcrx: count zcrx users
  io_uring/zcrx: add sync refill queue flushing
  io_uring/zcrx: introduce IORING_REGISTER_ZCRX_CTRL
  io_uring/zcrx: elide passing msg flags
  io_uring/zcrx: use folio_nr_pages() instead of shift operation
  io_uring/zcrx: convert to use netmem_desc
  io_uring/query: introduce rings info query
  io_uring/query: introduce zcrx query
  io_uring: move cq/sq user offset init around
  io_uring: pre-calculate scq layout
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull io_uring updates from Jens Axboe:

 - Unify how task_work cancelations are detected, placing it in the
   task_work running state rather than needing to check the task state

 - Series cleaning up and moving the cancelation code to where it
   belongs, in cancel.c

 - Cleanup of waitid and futex argument handling

 - Add support for mixed sized SQEs. 6.18 added support for mixed sized
   CQEs, improving flexibility and efficiency of workloads that need big
   CQEs. This adds similar support for SQEs, where the occasional need
   for a 128b SQE doesn't necessitate having all SQEs be 128b in size

 - Introduce zcrx and SQ/CQ layout queries. The former returns what zcrx
   features are available. And both return the ring size information to
   help with allocation size calculation for user provided rings like
   IORING_SETUP_NO_MMAP and IORING_MEM_REGION_TYPE_USER

 - Zcrx updates for 6.19. It includes a bunch of small patches,
   IORING_REGISTER_ZCRX_CTRL and RQ flushing, and David's work on sharing
   zcrx between multiple io_uring instances

 - Series cleaning up ring initializations, notably deduplicating ring
   size and offset calculations. It also moves most of the checking
   before doing any allocations, making the code simpler

 - Add support for getsockname and getpeername, which is mostly a
   trivial hookup after a bit of refactoring on the networking side

 - Various fixes and cleanups

* tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (68 commits)
  io_uring: Introduce getsockname io_uring cmd
  socket: Split out a getsockname helper for io_uring
  socket: Unify getsockname and getpeername implementation
  io_uring/query: drop unused io_handle_query_entry() ctx arg
  io_uring/kbuf: remove obsolete buf_nr_pages and update comments
  io_uring/register: use correct location for io_rings_layout
  io_uring/zcrx: share an ifq between rings
  io_uring/zcrx: add io_fill_zcrx_offsets()
  io_uring/zcrx: export zcrx via a file
  io_uring/zcrx: move io_zcrx_scrub() and dependencies up
  io_uring/zcrx: count zcrx users
  io_uring/zcrx: add sync refill queue flushing
  io_uring/zcrx: introduce IORING_REGISTER_ZCRX_CTRL
  io_uring/zcrx: elide passing msg flags
  io_uring/zcrx: use folio_nr_pages() instead of shift operation
  io_uring/zcrx: convert to use netmem_desc
  io_uring/query: introduce rings info query
  io_uring/query: introduce zcrx query
  io_uring: move cq/sq user offset init around
  io_uring: pre-calculate scq layout
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>socket: Split out a getsockname helper for io_uring</title>
<updated>2025-11-26T20:45:23+00:00</updated>
<author>
<name>Gabriel Krisman Bertazi</name>
<email>krisman@suse.de</email>
</author>
<published>2025-11-25T21:18:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d73c1677087391379441c0bb444c7fb4238fc6e7'/>
<id>d73c1677087391379441c0bb444c7fb4238fc6e7</id>
<content type='text'>
Similar to getsockopt, split a helper that checks security and issues
the operation out of the main handler, so that io_uring can use it.

Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to getsockopt, split a helper that checks security and issues
the operation out of the main handler, so that io_uring can use it.

Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>socket: Unify getsockname and getpeername implementation</title>
<updated>2025-11-26T20:45:23+00:00</updated>
<author>
<name>Gabriel Krisman Bertazi</name>
<email>krisman@suse.de</email>
</author>
<published>2025-11-25T21:17:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4677e78800bbde62a9edce0eb3b40c775ec55e0d'/>
<id>4677e78800bbde62a9edce0eb3b40c775ec55e0d</id>
<content type='text'>
They are already implemented by the same get_name hook at the protocol
level.  Bring the unification one level up to reduce code duplication
in preparation for supporting these as io_uring operations.

Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
They are already implemented by the same get_name hook at the protocol
level.  Bring the unification one level up to reduce code duplication
in preparation for supporting these as io_uring operations.

Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Convert struct sockaddr to fixed-size "sa_data[14]"</title>
<updated>2025-11-05T03:10:33+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2025-11-04T00:26:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2b5e9f9b7e414c5eeb20dd7a7b80816ff55cf57b'/>
<id>2b5e9f9b7e414c5eeb20dd7a7b80816ff55cf57b</id>
<content type='text'>
Revert struct sockaddr from flexible array to fixed 14-byte "sa_data",
to solve over 36,000 -Wflex-array-member-not-at-end warnings, since
struct sockaddr is embedded within many network structs.

With socket/proto sockaddr-based internal APIs switched to use struct
sockaddr_unsized, there should be no more uses of struct sockaddr that
depend on reading beyond the end of struct sockaddr::sa_data that might
trigger bounds checking.

Comparing an x86_64 "allyesconfig" vmlinux build before and after this
patch showed no new "ud1" instructions from CONFIG_UBSAN_BOUNDS nor any
new "field-spanning" memcpy CONFIG_FORTIFY_SOURCE instrumentations.

Cc: Gustavo A. R. Silva &lt;gustavo@embeddedor.com&gt;
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20251104002617.2752303-8-kees@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Revert struct sockaddr from flexible array to fixed 14-byte "sa_data",
to solve over 36,000 -Wflex-array-member-not-at-end warnings, since
struct sockaddr is embedded within many network structs.

With socket/proto sockaddr-based internal APIs switched to use struct
sockaddr_unsized, there should be no more uses of struct sockaddr that
depend on reading beyond the end of struct sockaddr::sa_data that might
trigger bounds checking.

Comparing an x86_64 "allyesconfig" vmlinux build before and after this
patch showed no new "ud1" instructions from CONFIG_UBSAN_BOUNDS nor any
new "field-spanning" memcpy CONFIG_FORTIFY_SOURCE instrumentations.

Cc: Gustavo A. R. Silva &lt;gustavo@embeddedor.com&gt;
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20251104002617.2752303-8-kees@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add struct sockaddr_unsized for sockaddr of unknown length</title>
<updated>2025-11-05T03:10:32+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2025-11-04T00:26:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bf33247a90d3e85d53a9b55bb276b725456ff0bf'/>
<id>bf33247a90d3e85d53a9b55bb276b725456ff0bf</id>
<content type='text'>
Add flexible sockaddr structure to support addresses longer than the
traditional 14-byte struct sockaddr::sa_data limitation without
requiring the full 128-byte sa_data of struct sockaddr_storage. This
allows the network APIs to pass around a pointer to an object that
isn't lying to the compiler about how big it is, but must be accompanied
by its actual size as an additional parameter.

It's possible we may want to migrate to including the size with the
struct in the future, e.g.:

struct sockaddr_unsized {
	u16 sa_data_len;
	u16 sa_family;
	u8  sa_data[] __counted_by(sa_data_len);
};

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20251104002617.2752303-1-kees@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add flexible sockaddr structure to support addresses longer than the
traditional 14-byte struct sockaddr::sa_data limitation without
requiring the full 128-byte sa_data of struct sockaddr_storage. This
allows the network APIs to pass around a pointer to an object that
isn't lying to the compiler about how big it is, but must be accompanied
by its actual size as an additional parameter.

It's possible we may want to migrate to including the size with the
struct in the future, e.g.:

struct sockaddr_unsized {
	u16 sa_data_len;
	u16 sa_family;
	u8  sa_data[] __counted_by(sa_data_len);
};

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Link: https://patch.msgid.link/20251104002617.2752303-1-kees@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: pass const to msg_data_left()</title>
<updated>2025-04-11T01:34:05+00:00</updated>
<author>
<name>Breno Leitao</name>
<email>leitao@debian.org</email>
</author>
<published>2025-04-08T18:32:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b1e904999542ad6764eafa54545f1c55776006d1'/>
<id>b1e904999542ad6764eafa54545f1c55776006d1</id>
<content type='text'>
The msg_data_left() function doesn't modify the struct msghdr parameter,
so mark it as const. This allows the function to be used with const
references, improving type safety and making the API more flexible.

Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20250408-tcpsendmsg-v3-1-208b87064c28@debian.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The msg_data_left() function doesn't modify the struct msghdr parameter,
so mark it as const. This allows the function to be used with const
references, improving type safety and making the API more flexible.

Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20250408-tcpsendmsg-v3-1-208b87064c28@debian.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
