<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/netfilter/ipvs, branch v6.6.71</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems</title>
<updated>2024-12-27T12:58:49+00:00</updated>
<author>
<name>David Laight</name>
<email>David.Laight@ACULAB.COM</email>
</author>
<published>2024-12-14T17:30:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=592fc96af8274023a6f099a5a0dd3f00b7f912ec'/>
<id>592fc96af8274023a6f099a5a0dd3f00b7f912ec</id>
<content type='text'>
[ Upstream commit cf2c97423a4f89c8b798294d3f34ecfe7e7035c3 ]

The 'max_avail' value is calculated from the system memory
size using order_base_2().
order_base_2(x) is defined as '(x) ? fn(x) : 0'.
The compiler generates two copies of the code that follows
and then expands clamp(max, min, PAGE_SHIFT - 12) (11 on 32bit).
This triggers a compile-time assert since min is 5.

In reality a system would have to have less than 512MB memory
for the bounds passed to clamp to be reversed.

Swap the order of the arguments to clamp() to avoid the warning.

Replace the clamp_val() on the line below with clamp().
clamp_val() is just 'an accident waiting to happen' and not needed here.

Detected by compile time checks added to clamp(), specifically:
minmax.h: use BUILD_BUG_ON_MSG() for the lo &lt; hi test in clamp()

Reported-by: Linux Kernel Functional Testing &lt;lkft@linaro.org&gt;
Closes: https://lore.kernel.org/all/CA+G9fYsT34UkGFKxus63H6UVpYi5GRZkezT9MRLfAbM3f6ke0g@mail.gmail.com/
Fixes: 4f325e26277b ("ipvs: dynamically limit the connection hash table")
Tested-by: Bartosz Golaszewski &lt;bartosz.golaszewski@linaro.org&gt;
Reviewed-by: Bartosz Golaszewski &lt;bartosz.golaszewski@linaro.org&gt;
Signed-off-by: David Laight &lt;david.laight@aculab.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cf2c97423a4f89c8b798294d3f34ecfe7e7035c3 ]

The 'max_avail' value is calculated from the system memory
size using order_base_2().
order_base_2(x) is defined as '(x) ? fn(x) : 0'.
The compiler generates two copies of the code that follows
and then expands clamp(max, min, PAGE_SHIFT - 12) (11 on 32bit).
This triggers a compile-time assert since min is 5.

In reality a system would have to have less than 512MB memory
for the bounds passed to clamp to be reversed.

Swap the order of the arguments to clamp() to avoid the warning.

Replace the clamp_val() on the line below with clamp().
clamp_val() is just 'an accident waiting to happen' and not needed here.

Detected by compile time checks added to clamp(), specifically:
minmax.h: use BUILD_BUG_ON_MSG() for the lo &lt; hi test in clamp()

Reported-by: Linux Kernel Functional Testing &lt;lkft@linaro.org&gt;
Closes: https://lore.kernel.org/all/CA+G9fYsT34UkGFKxus63H6UVpYi5GRZkezT9MRLfAbM3f6ke0g@mail.gmail.com/
Fixes: 4f325e26277b ("ipvs: dynamically limit the connection hash table")
Tested-by: Bartosz Golaszewski &lt;bartosz.golaszewski@linaro.org&gt;
Reviewed-by: Bartosz Golaszewski &lt;bartosz.golaszewski@linaro.org&gt;
Signed-off-by: David Laight &lt;david.laight@aculab.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: introduce dst_rt6_info() helper</title>
<updated>2024-12-14T18:59:35+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2024-04-26T15:19:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=797a4c1f5b63602b963e066bfc425ce174087d0a'/>
<id>797a4c1f5b63602b963e066bfc425ce174087d0a</id>
<content type='text'>
[ Upstream commit e8dfd42c17faf183415323db1ef0c977be0d6489 ]

Instead of (struct rt6_info *)dst casts, we can use :

 #define dst_rt6_info(_ptr) \
         container_of_const(_ptr, struct rt6_info, dst)

Some places needed missing const qualifiers :

ip6_confirm_neigh(), ipv6_anycast_destination(),
ipv6_unicast_destination(), has_gateway()

v2: added missing parts (David Ahern)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Stable-dep-of: 3301ab7d5aeb ("net/ipv6: release expired exception dst cached in socket")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e8dfd42c17faf183415323db1ef0c977be0d6489 ]

Instead of (struct rt6_info *)dst casts, we can use :

 #define dst_rt6_info(_ptr) \
         container_of_const(_ptr, struct rt6_info, dst)

Some places needed missing const qualifiers :

ip6_confirm_neigh(), ipv6_anycast_destination(),
ipv6_unicast_destination(), has_gateway()

v2: added missing parts (David Ahern)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Stable-dep-of: 3301ab7d5aeb ("net/ipv6: release expired exception dst cached in socket")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: fix UB due to uninitialized stack access in ip_vs_protocol_init()</title>
<updated>2024-12-14T18:59:33+00:00</updated>
<author>
<name>Jinghao Jia</name>
<email>jinghao7@illinois.edu</email>
</author>
<published>2024-11-23T09:42:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=124834133b32f9386bb2d8581d9ab92f65e951e4'/>
<id>124834133b32f9386bb2d8581d9ab92f65e951e4</id>
<content type='text'>
[ Upstream commit 146b6f1112eb30a19776d6c323c994e9d67790db ]

Under certain kernel configurations when building with Clang/LLVM, the
compiler does not generate a return or jump as the terminator
instruction for ip_vs_protocol_init(), triggering the following objtool
warning during build time:

  vmlinux.o: warning: objtool: ip_vs_protocol_init() falls through to next function __initstub__kmod_ip_vs_rr__935_123_ip_vs_rr_init6()

At runtime, this either causes an oops when trying to load the ipvs
module or a boot-time panic if ipvs is built-in. This same issue has
been reported by the Intel kernel test robot previously.

Digging deeper into both LLVM and the kernel code reveals this to be a
undefined behavior problem. ip_vs_protocol_init() uses a on-stack buffer
of 64 chars to store the registered protocol names and leaves it
uninitialized after definition. The function calls strnlen() when
concatenating protocol names into the buffer. With CONFIG_FORTIFY_SOURCE
strnlen() performs an extra step to check whether the last byte of the
input char buffer is a null character (commit 3009f891bb9f ("fortify:
Allow strlen() and strnlen() to pass compile-time known lengths")).
This, together with possibly other configurations, cause the following
IR to be generated:

  define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #5 section ".init.text" align 16 !kcfi_type !29 {
    %1 = alloca [64 x i8], align 16
    ...

  14:                                               ; preds = %11
    %15 = getelementptr inbounds i8, ptr %1, i64 63
    %16 = load i8, ptr %15, align 1
    %17 = tail call i1 @llvm.is.constant.i8(i8 %16)
    %18 = icmp eq i8 %16, 0
    %19 = select i1 %17, i1 %18, i1 false
    br i1 %19, label %20, label %23

  20:                                               ; preds = %14
    %21 = call i64 @strlen(ptr noundef nonnull dereferenceable(1) %1) #23
    ...

  23:                                               ; preds = %14, %11, %20
    %24 = call i64 @strnlen(ptr noundef nonnull dereferenceable(1) %1, i64 noundef 64) #24
    ...
  }

The above code calculates the address of the last char in the buffer
(value %15) and then loads from it (value %16). Because the buffer is
never initialized, the LLVM GVN pass marks value %16 as undefined:

  %13 = getelementptr inbounds i8, ptr %1, i64 63
  br i1 undef, label %14, label %17

This gives later passes (SCCP, in particular) more DCE opportunities by
propagating the undef value further, and eventually removes everything
after the load on the uninitialized stack location:

  define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #0 section ".init.text" align 16 !kcfi_type !11 {
    %1 = alloca [64 x i8], align 16
    ...

  12:                                               ; preds = %11
    %13 = getelementptr inbounds i8, ptr %1, i64 63
    unreachable
  }

In this way, the generated native code will just fall through to the
next function, as LLVM does not generate any code for the unreachable IR
instruction and leaves the function without a terminator.

Zero the on-stack buffer to avoid this possible UB.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202402100205.PWXIz1ZK-lkp@intel.com/
Co-developed-by: Ruowen Qin &lt;ruqin@redhat.com&gt;
Signed-off-by: Ruowen Qin &lt;ruqin@redhat.com&gt;
Signed-off-by: Jinghao Jia &lt;jinghao7@illinois.edu&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 146b6f1112eb30a19776d6c323c994e9d67790db ]

Under certain kernel configurations when building with Clang/LLVM, the
compiler does not generate a return or jump as the terminator
instruction for ip_vs_protocol_init(), triggering the following objtool
warning during build time:

  vmlinux.o: warning: objtool: ip_vs_protocol_init() falls through to next function __initstub__kmod_ip_vs_rr__935_123_ip_vs_rr_init6()

At runtime, this either causes an oops when trying to load the ipvs
module or a boot-time panic if ipvs is built-in. This same issue has
been reported by the Intel kernel test robot previously.

Digging deeper into both LLVM and the kernel code reveals this to be a
undefined behavior problem. ip_vs_protocol_init() uses a on-stack buffer
of 64 chars to store the registered protocol names and leaves it
uninitialized after definition. The function calls strnlen() when
concatenating protocol names into the buffer. With CONFIG_FORTIFY_SOURCE
strnlen() performs an extra step to check whether the last byte of the
input char buffer is a null character (commit 3009f891bb9f ("fortify:
Allow strlen() and strnlen() to pass compile-time known lengths")).
This, together with possibly other configurations, cause the following
IR to be generated:

  define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #5 section ".init.text" align 16 !kcfi_type !29 {
    %1 = alloca [64 x i8], align 16
    ...

  14:                                               ; preds = %11
    %15 = getelementptr inbounds i8, ptr %1, i64 63
    %16 = load i8, ptr %15, align 1
    %17 = tail call i1 @llvm.is.constant.i8(i8 %16)
    %18 = icmp eq i8 %16, 0
    %19 = select i1 %17, i1 %18, i1 false
    br i1 %19, label %20, label %23

  20:                                               ; preds = %14
    %21 = call i64 @strlen(ptr noundef nonnull dereferenceable(1) %1) #23
    ...

  23:                                               ; preds = %14, %11, %20
    %24 = call i64 @strnlen(ptr noundef nonnull dereferenceable(1) %1, i64 noundef 64) #24
    ...
  }

The above code calculates the address of the last char in the buffer
(value %15) and then loads from it (value %16). Because the buffer is
never initialized, the LLVM GVN pass marks value %16 as undefined:

  %13 = getelementptr inbounds i8, ptr %1, i64 63
  br i1 undef, label %14, label %17

This gives later passes (SCCP, in particular) more DCE opportunities by
propagating the undef value further, and eventually removes everything
after the load on the uninitialized stack location:

  define hidden i32 @ip_vs_protocol_init() local_unnamed_addr #0 section ".init.text" align 16 !kcfi_type !11 {
    %1 = alloca [64 x i8], align 16
    ...

  12:                                               ; preds = %11
    %13 = getelementptr inbounds i8, ptr %1, i64 63
    unreachable
  }

In this way, the generated native code will just fall through to the
next function, as LLVM does not generate any code for the unreachable IR
instruction and leaves the function without a terminator.

Zero the on-stack buffer to avoid this possible UB.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202402100205.PWXIz1ZK-lkp@intel.com/
Co-developed-by: Ruowen Qin &lt;ruqin@redhat.com&gt;
Signed-off-by: Ruowen Qin &lt;ruqin@redhat.com&gt;
Signed-off-by: Jinghao Jia &lt;jinghao7@illinois.edu&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: properly dereference pe in ip_vs_add_service</title>
<updated>2024-08-03T06:54:04+00:00</updated>
<author>
<name>Chen Hanxiao</name>
<email>chenhx.fnst@fujitsu.com</email>
</author>
<published>2024-06-27T06:15:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3dd428039e06e1967ce294e2cd6342825aaaad77'/>
<id>3dd428039e06e1967ce294e2cd6342825aaaad77</id>
<content type='text'>
[ Upstream commit cbd070a4ae62f119058973f6d2c984e325bce6e7 ]

Use pe directly to resolve sparse warning:

  net/netfilter/ipvs/ip_vs_ctl.c:1471:27: warning: dereference of noderef expression

Fixes: 39b972231536 ("ipvs: handle connections started by real-servers")
Signed-off-by: Chen Hanxiao &lt;chenhx.fnst@fujitsu.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cbd070a4ae62f119058973f6d2c984e325bce6e7 ]

Use pe directly to resolve sparse warning:

  net/netfilter/ipvs/ip_vs_ctl.c:1471:27: warning: dereference of noderef expression

Fixes: 39b972231536 ("ipvs: handle connections started by real-servers")
Signed-off-by: Chen Hanxiao &lt;chenhx.fnst@fujitsu.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: Avoid unnecessary calls to skb_is_gso_sctp</title>
<updated>2024-08-03T06:53:40+00:00</updated>
<author>
<name>Ismael Luceno</name>
<email>iluceno@suse.de</email>
</author>
<published>2024-05-23T16:54:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fed3317b2d3f55eda9c3d9af141ba33685e816d7'/>
<id>fed3317b2d3f55eda9c3d9af141ba33685e816d7</id>
<content type='text'>
[ Upstream commit 53796b03295cf7ab1fc8600016fa6dfbf4a494a0 ]

In the context of the SCTP SNAT/DNAT handler, these calls can only
return true.

Fixes: e10d3ba4d434 ("ipvs: Fix checksumming on GSO of SCTP packets")
Signed-off-by: Ismael Luceno &lt;iluceno@suse.de&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 53796b03295cf7ab1fc8600016fa6dfbf4a494a0 ]

In the context of the SCTP SNAT/DNAT handler, these calls can only
return true.

Fixes: e10d3ba4d434 ("ipvs: Fix checksumming on GSO of SCTP packets")
Signed-off-by: Ismael Luceno &lt;iluceno@suse.de&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: Fix checksumming on GSO of SCTP packets</title>
<updated>2024-05-02T14:32:37+00:00</updated>
<author>
<name>Ismael Luceno</name>
<email>iluceno@suse.de</email>
</author>
<published>2024-04-21T14:22:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=36054132dfb0b05e0bf5628b6a805827822c4441'/>
<id>36054132dfb0b05e0bf5628b6a805827822c4441</id>
<content type='text'>
[ Upstream commit e10d3ba4d434ed172914617ed8d74bd411421193 ]

It was observed in the wild that pairs of consecutive packets would leave
the IPVS with the same wrong checksum, and the issue only went away when
disabling GSO.

IPVS needs to avoid computing the SCTP checksum when using GSO.

Fixes: 90017accff61 ("sctp: Add GSO support")
Co-developed-by: Firo Yang &lt;firo.yang@suse.com&gt;
Signed-off-by: Ismael Luceno &lt;iluceno@suse.de&gt;
Tested-by: Andreas Taschner &lt;andreas.taschner@suse.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e10d3ba4d434ed172914617ed8d74bd411421193 ]

It was observed in the wild that pairs of consecutive packets would leave
the IPVS with the same wrong checksum, and the issue only went away when
disabling GSO.

IPVS needs to avoid computing the SCTP checksum when using GSO.

Fixes: 90017accff61 ("sctp: Add GSO support")
Co-developed-by: Firo Yang &lt;firo.yang@suse.com&gt;
Signed-off-by: Ismael Luceno &lt;iluceno@suse.de&gt;
Tested-by: Andreas Taschner &lt;andreas.taschner@suse.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: avoid stat macros calls from preemptible context</title>
<updated>2024-01-25T23:35:59+00:00</updated>
<author>
<name>Fedor Pchelkin</name>
<email>pchelkin@ispras.ru</email>
</author>
<published>2024-01-15T14:39:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b019406e5ad933676258af7f6b13ced24df78d56'/>
<id>b019406e5ad933676258af7f6b13ced24df78d56</id>
<content type='text'>
[ Upstream commit d6938c1c76c64f42363d0d1f051e1b4641c2ad40 ]

Inside decrement_ttl() upon discovering that the packet ttl has exceeded,
__IP_INC_STATS and __IP6_INC_STATS macros can be called from preemptible
context having the following backtrace:

check_preemption_disabled: 48 callbacks suppressed
BUG: using __this_cpu_add() in preemptible [00000000] code: curl/1177
caller is decrement_ttl+0x217/0x830
CPU: 5 PID: 1177 Comm: curl Not tainted 6.7.0+ #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 04/01/2014
Call Trace:
 &lt;TASK&gt;
 dump_stack_lvl+0xbd/0xe0
 check_preemption_disabled+0xd1/0xe0
 decrement_ttl+0x217/0x830
 __ip_vs_get_out_rt+0x4e0/0x1ef0
 ip_vs_nat_xmit+0x205/0xcd0
 ip_vs_in_hook+0x9b1/0x26a0
 nf_hook_slow+0xc2/0x210
 nf_hook+0x1fb/0x770
 __ip_local_out+0x33b/0x640
 ip_local_out+0x2a/0x490
 __ip_queue_xmit+0x990/0x1d10
 __tcp_transmit_skb+0x288b/0x3d10
 tcp_connect+0x3466/0x5180
 tcp_v4_connect+0x1535/0x1bb0
 __inet_stream_connect+0x40d/0x1040
 inet_stream_connect+0x57/0xa0
 __sys_connect_file+0x162/0x1a0
 __sys_connect+0x137/0x160
 __x64_sys_connect+0x72/0xb0
 do_syscall_64+0x6f/0x140
 entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7fe6dbbc34e0

Use the corresponding preemption-aware variants: IP_INC_STATS and
IP6_INC_STATS.

Found by Linux Verification Center (linuxtesting.org).

Fixes: 8d8e20e2d7bb ("ipvs: Decrement ttl")
Signed-off-by: Fedor Pchelkin &lt;pchelkin@ispras.ru&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d6938c1c76c64f42363d0d1f051e1b4641c2ad40 ]

Inside decrement_ttl() upon discovering that the packet ttl has exceeded,
__IP_INC_STATS and __IP6_INC_STATS macros can be called from preemptible
context having the following backtrace:

check_preemption_disabled: 48 callbacks suppressed
BUG: using __this_cpu_add() in preemptible [00000000] code: curl/1177
caller is decrement_ttl+0x217/0x830
CPU: 5 PID: 1177 Comm: curl Not tainted 6.7.0+ #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 04/01/2014
Call Trace:
 &lt;TASK&gt;
 dump_stack_lvl+0xbd/0xe0
 check_preemption_disabled+0xd1/0xe0
 decrement_ttl+0x217/0x830
 __ip_vs_get_out_rt+0x4e0/0x1ef0
 ip_vs_nat_xmit+0x205/0xcd0
 ip_vs_in_hook+0x9b1/0x26a0
 nf_hook_slow+0xc2/0x210
 nf_hook+0x1fb/0x770
 __ip_local_out+0x33b/0x640
 ip_local_out+0x2a/0x490
 __ip_queue_xmit+0x990/0x1d10
 __tcp_transmit_skb+0x288b/0x3d10
 tcp_connect+0x3466/0x5180
 tcp_v4_connect+0x1535/0x1bb0
 __inet_stream_connect+0x40d/0x1040
 inet_stream_connect+0x57/0xa0
 __sys_connect_file+0x162/0x1a0
 __sys_connect+0x137/0x160
 __x64_sys_connect+0x72/0xb0
 do_syscall_64+0x6f/0x140
 entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7fe6dbbc34e0

Use the corresponding preemption-aware variants: IP_INC_STATS and
IP6_INC_STATS.

Found by Linux Verification Center (linuxtesting.org).

Fixes: 8d8e20e2d7bb ("ipvs: Decrement ttl")
Signed-off-by: Fedor Pchelkin &lt;pchelkin@ispras.ru&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: prevent address rewrite in kernel_bind()</title>
<updated>2023-10-01T18:31:29+00:00</updated>
<author>
<name>Jordan Rife</name>
<email>jrife@google.com</email>
</author>
<published>2023-09-21T23:46:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c889a99a21bf124c3db08d09df919f0eccc5ea4c'/>
<id>c889a99a21bf124c3db08d09df919f0eccc5ea4c</id>
<content type='text'>
Similar to the change in commit 0bdf399342c5("net: Avoid address
overwrite in kernel_connect"), BPF hooks run on bind may rewrite the
address passed to kernel_bind(). This change

1) Makes a copy of the bind address in kernel_bind() to insulate
   callers.
2) Replaces direct calls to sock-&gt;ops-&gt;bind() in net with kernel_bind()

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind")
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: Jordan Rife &lt;jrife@google.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to the change in commit 0bdf399342c5("net: Avoid address
overwrite in kernel_connect"), BPF hooks run on bind may rewrite the
address passed to kernel_bind(). This change

1) Makes a copy of the bind address in kernel_bind() to insulate
   callers.
2) Replaces direct calls to sock-&gt;ops-&gt;bind() in net with kernel_bind()

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind")
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: Jordan Rife &lt;jrife@google.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: replace calls to sock-&gt;ops-&gt;connect() with kernel_connect()</title>
<updated>2023-10-01T18:31:29+00:00</updated>
<author>
<name>Jordan Rife</name>
<email>jrife@google.com</email>
</author>
<published>2023-09-21T23:46:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=26297b4ce1ce4ea40bc9a48ec99f45da3f64d2e2'/>
<id>26297b4ce1ce4ea40bc9a48ec99f45da3f64d2e2</id>
<content type='text'>
commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect")
ensured that kernel_connect() will not overwrite the address parameter
in cases where BPF connect hooks perform an address rewrite. This change
replaces direct calls to sock-&gt;ops-&gt;connect() in net with kernel_connect()
to make these call safe.

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect")
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: Jordan Rife &lt;jrife@google.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect")
ensured that kernel_connect() will not overwrite the address parameter
in cases where BPF connect hooks perform an address rewrite. This change
replaces direct calls to sock-&gt;ops-&gt;connect() in net with kernel_connect()
to make these call safe.

Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect")
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: Jordan Rife &lt;jrife@google.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sysctl-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux</title>
<updated>2023-08-30T00:39:15+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-08-30T00:39:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=adfd671676c922bada16477eb68b5eb5f065addc'/>
<id>adfd671676c922bada16477eb68b5eb5f065addc</id>
<content type='text'>
Pull sysctl updates from Luis Chamberlain:
 "Long ago we set out to remove the kitchen sink on kernel/sysctl.c
  arrays and placings sysctls to their own sybsystem or file to help
  avoid merge conflicts. Matthew Wilcox pointed out though that if we're
  going to do that we might as well also *save* space while at it and
  try to remove the extra last sysctl entry added at the end of each
  array, a sentintel, instead of bloating the kernel by adding a new
  sentinel with each array moved.

  Doing that was not so trivial, and has required slowing down the moves
  of kernel/sysctl.c arrays and measuring the impact on size by each new
  move.

  The complex part of the effort to help reduce the size of each sysctl
  is being done by the patient work of el señor Don Joel Granados. A lot
  of this is truly painful code refactoring and testing and then trying
  to measure the savings of each move and removing the sentinels.
  Although Joel already has code which does most of this work,
  experience with sysctl moves in the past shows is we need to be
  careful due to the slew of odd build failures that are possible due to
  the amount of random Kconfig options sysctls use.

  To that end Joel's work is split by first addressing the major
  housekeeping needed to remove the sentinels, which is part of this
  merge request. The rest of the work to actually remove the sentinels
  will be done later in future kernel releases.

  The preliminary math is showing this will all help reduce the overall
  build time size of the kernel and run time memory consumed by the
  kernel by about ~64 bytes per array where we are able to remove each
  sentinel in the future. That also means there is no more bloating the
  kernel with the extra ~64 bytes per array moved as no new sentinels
  are created"

* tag 'sysctl-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: Use ctl_table_size as stopping criteria for list macro
  sysctl: SIZE_MAX-&gt;ARRAY_SIZE in register_net_sysctl
  vrf: Update to register_net_sysctl_sz
  networking: Update to register_net_sysctl_sz
  netfilter: Update to register_net_sysctl_sz
  ax.25: Update to register_net_sysctl_sz
  sysctl: Add size to register_net_sysctl function
  sysctl: Add size arg to __register_sysctl_init
  sysctl: Add size to register_sysctl
  sysctl: Add a size arg to __register_sysctl_table
  sysctl: Add size argument to init_header
  sysctl: Add ctl_table_size to ctl_table_header
  sysctl: Use ctl_table_header in list_for_each_table_entry
  sysctl: Prefer ctl_table_header in proc_sysctl
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull sysctl updates from Luis Chamberlain:
 "Long ago we set out to remove the kitchen sink on kernel/sysctl.c
  arrays and placings sysctls to their own sybsystem or file to help
  avoid merge conflicts. Matthew Wilcox pointed out though that if we're
  going to do that we might as well also *save* space while at it and
  try to remove the extra last sysctl entry added at the end of each
  array, a sentintel, instead of bloating the kernel by adding a new
  sentinel with each array moved.

  Doing that was not so trivial, and has required slowing down the moves
  of kernel/sysctl.c arrays and measuring the impact on size by each new
  move.

  The complex part of the effort to help reduce the size of each sysctl
  is being done by the patient work of el señor Don Joel Granados. A lot
  of this is truly painful code refactoring and testing and then trying
  to measure the savings of each move and removing the sentinels.
  Although Joel already has code which does most of this work,
  experience with sysctl moves in the past shows is we need to be
  careful due to the slew of odd build failures that are possible due to
  the amount of random Kconfig options sysctls use.

  To that end Joel's work is split by first addressing the major
  housekeeping needed to remove the sentinels, which is part of this
  merge request. The rest of the work to actually remove the sentinels
  will be done later in future kernel releases.

  The preliminary math is showing this will all help reduce the overall
  build time size of the kernel and run time memory consumed by the
  kernel by about ~64 bytes per array where we are able to remove each
  sentinel in the future. That also means there is no more bloating the
  kernel with the extra ~64 bytes per array moved as no new sentinels
  are created"

* tag 'sysctl-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: Use ctl_table_size as stopping criteria for list macro
  sysctl: SIZE_MAX-&gt;ARRAY_SIZE in register_net_sysctl
  vrf: Update to register_net_sysctl_sz
  networking: Update to register_net_sysctl_sz
  netfilter: Update to register_net_sysctl_sz
  ax.25: Update to register_net_sysctl_sz
  sysctl: Add size to register_net_sysctl function
  sysctl: Add size arg to __register_sysctl_init
  sysctl: Add size to register_sysctl
  sysctl: Add a size arg to __register_sysctl_table
  sysctl: Add size argument to init_header
  sysctl: Add ctl_table_size to ctl_table_header
  sysctl: Use ctl_table_header in list_for_each_table_entry
  sysctl: Prefer ctl_table_header in proc_sysctl
</pre>
</div>
</content>
</entry>
</feed>
