<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/netfilter/ipvs, branch v4.19.78</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>ipvs: fix tinfo memory leak in start_sync_thread</title>
<updated>2019-07-26T07:14:11+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2019-06-18T20:07:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fe2ceeb4cffc43c4f64196b13c33f3a52390b114'/>
<id>fe2ceeb4cffc43c4f64196b13c33f3a52390b114</id>
<content type='text'>
[ Upstream commit 5db7c8b9f9fc2aeec671ae3ca6375752c162e0e7 ]

syzkaller reports for memory leak in start_sync_thread [1]

As Eric points out, kthread may start and stop before the
threadfn function is called, so there is no chance the
data (tinfo in our case) to be released in thread.

Fix this by releasing tinfo in the controlling code instead.

[1]
BUG: memory leak
unreferenced object 0xffff8881206bf700 (size 32):
 comm "syz-executor761", pid 7268, jiffies 4294943441 (age 20.470s)
 hex dump (first 32 bytes):
   00 40 7c 09 81 88 ff ff 80 45 b8 21 81 88 ff ff  .@|......E.!....
   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 backtrace:
   [&lt;0000000057619e23&gt;] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
   [&lt;0000000057619e23&gt;] slab_post_alloc_hook mm/slab.h:439 [inline]
   [&lt;0000000057619e23&gt;] slab_alloc mm/slab.c:3326 [inline]
   [&lt;0000000057619e23&gt;] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
   [&lt;0000000086ce5479&gt;] kmalloc include/linux/slab.h:547 [inline]
   [&lt;0000000086ce5479&gt;] start_sync_thread+0x5d2/0xe10 net/netfilter/ipvs/ip_vs_sync.c:1862
   [&lt;000000001a9229cc&gt;] do_ip_vs_set_ctl+0x4c5/0x780 net/netfilter/ipvs/ip_vs_ctl.c:2402
   [&lt;00000000ece457c8&gt;] nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
   [&lt;00000000ece457c8&gt;] nf_setsockopt+0x4c/0x80 net/netfilter/nf_sockopt.c:115
   [&lt;00000000942f62d4&gt;] ip_setsockopt net/ipv4/ip_sockglue.c:1258 [inline]
   [&lt;00000000942f62d4&gt;] ip_setsockopt+0x9b/0xb0 net/ipv4/ip_sockglue.c:1238
   [&lt;00000000a56a8ffd&gt;] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
   [&lt;00000000fa895401&gt;] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130
   [&lt;0000000095eef4cf&gt;] __sys_setsockopt+0x98/0x120 net/socket.c:2078
   [&lt;000000009747cf88&gt;] __do_sys_setsockopt net/socket.c:2089 [inline]
   [&lt;000000009747cf88&gt;] __se_sys_setsockopt net/socket.c:2086 [inline]
   [&lt;000000009747cf88&gt;] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
   [&lt;00000000ded8ba80&gt;] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
   [&lt;00000000893b4ac8&gt;] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+7e2e50c8adfccd2e5041@syzkaller.appspotmail.com
Suggested-by: Eric Biggers &lt;ebiggers@kernel.org&gt;
Fixes: 998e7a76804b ("ipvs: Use kthread_run() instead of doing a double-fork via kernel_thread()")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5db7c8b9f9fc2aeec671ae3ca6375752c162e0e7 ]

syzkaller reports for memory leak in start_sync_thread [1]

As Eric points out, kthread may start and stop before the
threadfn function is called, so there is no chance the
data (tinfo in our case) to be released in thread.

Fix this by releasing tinfo in the controlling code instead.

[1]
BUG: memory leak
unreferenced object 0xffff8881206bf700 (size 32):
 comm "syz-executor761", pid 7268, jiffies 4294943441 (age 20.470s)
 hex dump (first 32 bytes):
   00 40 7c 09 81 88 ff ff 80 45 b8 21 81 88 ff ff  .@|......E.!....
   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 backtrace:
   [&lt;0000000057619e23&gt;] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
   [&lt;0000000057619e23&gt;] slab_post_alloc_hook mm/slab.h:439 [inline]
   [&lt;0000000057619e23&gt;] slab_alloc mm/slab.c:3326 [inline]
   [&lt;0000000057619e23&gt;] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
   [&lt;0000000086ce5479&gt;] kmalloc include/linux/slab.h:547 [inline]
   [&lt;0000000086ce5479&gt;] start_sync_thread+0x5d2/0xe10 net/netfilter/ipvs/ip_vs_sync.c:1862
   [&lt;000000001a9229cc&gt;] do_ip_vs_set_ctl+0x4c5/0x780 net/netfilter/ipvs/ip_vs_ctl.c:2402
   [&lt;00000000ece457c8&gt;] nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
   [&lt;00000000ece457c8&gt;] nf_setsockopt+0x4c/0x80 net/netfilter/nf_sockopt.c:115
   [&lt;00000000942f62d4&gt;] ip_setsockopt net/ipv4/ip_sockglue.c:1258 [inline]
   [&lt;00000000942f62d4&gt;] ip_setsockopt+0x9b/0xb0 net/ipv4/ip_sockglue.c:1238
   [&lt;00000000a56a8ffd&gt;] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
   [&lt;00000000fa895401&gt;] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3130
   [&lt;0000000095eef4cf&gt;] __sys_setsockopt+0x98/0x120 net/socket.c:2078
   [&lt;000000009747cf88&gt;] __do_sys_setsockopt net/socket.c:2089 [inline]
   [&lt;000000009747cf88&gt;] __se_sys_setsockopt net/socket.c:2086 [inline]
   [&lt;000000009747cf88&gt;] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
   [&lt;00000000ded8ba80&gt;] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
   [&lt;00000000893b4ac8&gt;] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+7e2e50c8adfccd2e5041@syzkaller.appspotmail.com
Suggested-by: Eric Biggers &lt;ebiggers@kernel.org&gt;
Fixes: 998e7a76804b ("ipvs: Use kthread_run() instead of doing a double-fork via kernel_thread()")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: defer hook registration to avoid leaks</title>
<updated>2019-07-26T07:14:10+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2019-06-04T18:56:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=57de3c78f0b7e1d3ba2d0a793e40ae5c49397930'/>
<id>57de3c78f0b7e1d3ba2d0a793e40ae5c49397930</id>
<content type='text'>
[ Upstream commit cf47a0b882a4e5f6b34c7949d7b293e9287f1972 ]

syzkaller reports for memory leak when registering hooks [1]

As we moved the nf_unregister_net_hooks() call into
__ip_vs_dev_cleanup(), defer the nf_register_net_hooks()
call, so that hooks are allocated and freed from same
pernet_operations (ipvs_core_dev_ops).

[1]
BUG: memory leak
unreferenced object 0xffff88810acd8a80 (size 96):
 comm "syz-executor073", pid 7254, jiffies 4294950560 (age 22.250s)
 hex dump (first 32 bytes):
   02 00 00 00 00 00 00 00 50 8b bb 82 ff ff ff ff  ........P.......
   00 00 00 00 00 00 00 00 00 77 bb 82 ff ff ff ff  .........w......
 backtrace:
   [&lt;0000000013db61f1&gt;] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
   [&lt;0000000013db61f1&gt;] slab_post_alloc_hook mm/slab.h:439 [inline]
   [&lt;0000000013db61f1&gt;] slab_alloc_node mm/slab.c:3269 [inline]
   [&lt;0000000013db61f1&gt;] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597
   [&lt;000000001a27307d&gt;] __do_kmalloc_node mm/slab.c:3619 [inline]
   [&lt;000000001a27307d&gt;] __kmalloc_node+0x38/0x50 mm/slab.c:3627
   [&lt;0000000025054add&gt;] kmalloc_node include/linux/slab.h:590 [inline]
   [&lt;0000000025054add&gt;] kvmalloc_node+0x4a/0xd0 mm/util.c:431
   [&lt;0000000050d1bc00&gt;] kvmalloc include/linux/mm.h:637 [inline]
   [&lt;0000000050d1bc00&gt;] kvzalloc include/linux/mm.h:645 [inline]
   [&lt;0000000050d1bc00&gt;] allocate_hook_entries_size+0x3b/0x60 net/netfilter/core.c:61
   [&lt;00000000e8abe142&gt;] nf_hook_entries_grow+0xae/0x270 net/netfilter/core.c:128
   [&lt;000000004b94797c&gt;] __nf_register_net_hook+0x9a/0x170 net/netfilter/core.c:337
   [&lt;00000000d1545cbc&gt;] nf_register_net_hook+0x34/0xc0 net/netfilter/core.c:464
   [&lt;00000000876c9b55&gt;] nf_register_net_hooks+0x53/0xc0 net/netfilter/core.c:480
   [&lt;000000002ea868e0&gt;] __ip_vs_init+0xe8/0x170 net/netfilter/ipvs/ip_vs_core.c:2280
   [&lt;000000002eb2d451&gt;] ops_init+0x4c/0x140 net/core/net_namespace.c:130
   [&lt;000000000284ec48&gt;] setup_net+0xde/0x230 net/core/net_namespace.c:316
   [&lt;00000000a70600fa&gt;] copy_net_ns+0xf0/0x1e0 net/core/net_namespace.c:439
   [&lt;00000000ff26c15e&gt;] create_new_namespaces+0x141/0x2a0 kernel/nsproxy.c:107
   [&lt;00000000b103dc79&gt;] copy_namespaces+0xa1/0xe0 kernel/nsproxy.c:165
   [&lt;000000007cc008a2&gt;] copy_process.part.0+0x11fd/0x2150 kernel/fork.c:2035
   [&lt;00000000c344af7c&gt;] copy_process kernel/fork.c:1800 [inline]
   [&lt;00000000c344af7c&gt;] _do_fork+0x121/0x4f0 kernel/fork.c:2369

Reported-by: syzbot+722da59ccb264bc19910@syzkaller.appspotmail.com
Fixes: 719c7d563c17 ("ipvs: Fix use-after-free in ip_vs_in")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cf47a0b882a4e5f6b34c7949d7b293e9287f1972 ]

syzkaller reports for memory leak when registering hooks [1]

As we moved the nf_unregister_net_hooks() call into
__ip_vs_dev_cleanup(), defer the nf_register_net_hooks()
call, so that hooks are allocated and freed from same
pernet_operations (ipvs_core_dev_ops).

[1]
BUG: memory leak
unreferenced object 0xffff88810acd8a80 (size 96):
 comm "syz-executor073", pid 7254, jiffies 4294950560 (age 22.250s)
 hex dump (first 32 bytes):
   02 00 00 00 00 00 00 00 50 8b bb 82 ff ff ff ff  ........P.......
   00 00 00 00 00 00 00 00 00 77 bb 82 ff ff ff ff  .........w......
 backtrace:
   [&lt;0000000013db61f1&gt;] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
   [&lt;0000000013db61f1&gt;] slab_post_alloc_hook mm/slab.h:439 [inline]
   [&lt;0000000013db61f1&gt;] slab_alloc_node mm/slab.c:3269 [inline]
   [&lt;0000000013db61f1&gt;] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597
   [&lt;000000001a27307d&gt;] __do_kmalloc_node mm/slab.c:3619 [inline]
   [&lt;000000001a27307d&gt;] __kmalloc_node+0x38/0x50 mm/slab.c:3627
   [&lt;0000000025054add&gt;] kmalloc_node include/linux/slab.h:590 [inline]
   [&lt;0000000025054add&gt;] kvmalloc_node+0x4a/0xd0 mm/util.c:431
   [&lt;0000000050d1bc00&gt;] kvmalloc include/linux/mm.h:637 [inline]
   [&lt;0000000050d1bc00&gt;] kvzalloc include/linux/mm.h:645 [inline]
   [&lt;0000000050d1bc00&gt;] allocate_hook_entries_size+0x3b/0x60 net/netfilter/core.c:61
   [&lt;00000000e8abe142&gt;] nf_hook_entries_grow+0xae/0x270 net/netfilter/core.c:128
   [&lt;000000004b94797c&gt;] __nf_register_net_hook+0x9a/0x170 net/netfilter/core.c:337
   [&lt;00000000d1545cbc&gt;] nf_register_net_hook+0x34/0xc0 net/netfilter/core.c:464
   [&lt;00000000876c9b55&gt;] nf_register_net_hooks+0x53/0xc0 net/netfilter/core.c:480
   [&lt;000000002ea868e0&gt;] __ip_vs_init+0xe8/0x170 net/netfilter/ipvs/ip_vs_core.c:2280
   [&lt;000000002eb2d451&gt;] ops_init+0x4c/0x140 net/core/net_namespace.c:130
   [&lt;000000000284ec48&gt;] setup_net+0xde/0x230 net/core/net_namespace.c:316
   [&lt;00000000a70600fa&gt;] copy_net_ns+0xf0/0x1e0 net/core/net_namespace.c:439
   [&lt;00000000ff26c15e&gt;] create_new_namespaces+0x141/0x2a0 kernel/nsproxy.c:107
   [&lt;00000000b103dc79&gt;] copy_namespaces+0xa1/0xe0 kernel/nsproxy.c:165
   [&lt;000000007cc008a2&gt;] copy_process.part.0+0x11fd/0x2150 kernel/fork.c:2035
   [&lt;00000000c344af7c&gt;] copy_process kernel/fork.c:1800 [inline]
   [&lt;00000000c344af7c&gt;] _do_fork+0x121/0x4f0 kernel/fork.c:2369

Reported-by: syzbot+722da59ccb264bc19910@syzkaller.appspotmail.com
Fixes: 719c7d563c17 ("ipvs: Fix use-after-free in ip_vs_in")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: Fix use-after-free in ip_vs_in</title>
<updated>2019-06-22T06:15:15+00:00</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2019-05-17T14:31:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=61c83de6e622cb89c5c85721893c30a7c7cd6e76'/>
<id>61c83de6e622cb89c5c85721893c30a7c7cd6e76</id>
<content type='text'>
[ Upstream commit 719c7d563c17b150877cee03a4b812a424989dfa ]

BUG: KASAN: use-after-free in ip_vs_in.part.29+0xe8/0xd20 [ip_vs]
Read of size 4 at addr ffff8881e9b26e2c by task sshd/5603

CPU: 0 PID: 5603 Comm: sshd Not tainted 4.19.39+ #30
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
 dump_stack+0x71/0xab
 print_address_description+0x6a/0x270
 kasan_report+0x179/0x2c0
 ip_vs_in.part.29+0xe8/0xd20 [ip_vs]
 ip_vs_in+0xd8/0x170 [ip_vs]
 nf_hook_slow+0x5f/0xe0
 __ip_local_out+0x1d5/0x250
 ip_local_out+0x19/0x60
 __tcp_transmit_skb+0xba1/0x14f0
 tcp_write_xmit+0x41f/0x1ed0
 ? _copy_from_iter_full+0xca/0x340
 __tcp_push_pending_frames+0x52/0x140
 tcp_sendmsg_locked+0x787/0x1600
 ? tcp_sendpage+0x60/0x60
 ? inet_sk_set_state+0xb0/0xb0
 tcp_sendmsg+0x27/0x40
 sock_sendmsg+0x6d/0x80
 sock_write_iter+0x121/0x1c0
 ? sock_sendmsg+0x80/0x80
 __vfs_write+0x23e/0x370
 vfs_write+0xe7/0x230
 ksys_write+0xa1/0x120
 ? __ia32_sys_read+0x50/0x50
 ? __audit_syscall_exit+0x3ce/0x450
 do_syscall_64+0x73/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff6f6147c60
Code: 73 01 c3 48 8b 0d 28 12 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 5d 73 2d 00 00 75 10 b8 01 00 00 00 0f 05 &lt;48&gt; 3d 01 f0 ff ff 73 31 c3 48 83
RSP: 002b:00007ffd772ead18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000034 RCX: 00007ff6f6147c60
RDX: 0000000000000034 RSI: 000055df30a31270 RDI: 0000000000000003
RBP: 000055df30a31270 R08: 0000000000000000 R09: 0000000000000000
R10: 00007ffd772ead70 R11: 0000000000000246 R12: 00007ffd772ead74
R13: 00007ffd772eae20 R14: 00007ffd772eae24 R15: 000055df2f12ddc0

Allocated by task 6052:
 kasan_kmalloc+0xa0/0xd0
 __kmalloc+0x10a/0x220
 ops_init+0x97/0x190
 register_pernet_operations+0x1ac/0x360
 register_pernet_subsys+0x24/0x40
 0xffffffffc0ea016d
 do_one_initcall+0x8b/0x253
 do_init_module+0xe3/0x335
 load_module+0x2fc0/0x3890
 __do_sys_finit_module+0x192/0x1c0
 do_syscall_64+0x73/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6067:
 __kasan_slab_free+0x130/0x180
 kfree+0x90/0x1a0
 ops_free_list.part.7+0xa6/0xc0
 unregister_pernet_operations+0x18b/0x1f0
 unregister_pernet_subsys+0x1d/0x30
 ip_vs_cleanup+0x1d/0xd2f [ip_vs]
 __x64_sys_delete_module+0x20c/0x300
 do_syscall_64+0x73/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8881e9b26600 which belongs to the cache kmalloc-4096 of size 4096
The buggy address is located 2092 bytes inside of 4096-byte region [ffff8881e9b26600, ffff8881e9b27600)
The buggy address belongs to the page:
page:ffffea0007a6c800 count:1 mapcount:0 mapping:ffff888107c0e600 index:0x0 compound_mapcount: 0
flags: 0x17ffffc0008100(slab|head)
raw: 0017ffffc0008100 dead000000000100 dead000000000200 ffff888107c0e600
raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

while unregistering ipvs module, ops_free_list calls
__ip_vs_cleanup, then nf_unregister_net_hooks be called to
do remove nf hook entries. It need a RCU period to finish,
however net-&gt;ipvs is set to NULL immediately, which will
trigger NULL pointer dereference when a packet is hooked
and handled by ip_vs_in where net-&gt;ipvs is dereferenced.

Another scene is ops_free_list call ops_free to free the
net_generic directly while __ip_vs_cleanup finished, then
calling ip_vs_in will triggers use-after-free.

This patch moves nf_unregister_net_hooks from __ip_vs_cleanup()
to __ip_vs_dev_cleanup(),  where rcu_barrier() is called by
unregister_pernet_device -&gt; unregister_pernet_operations,
that will do the needed grace period.

Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Fixes: efe41606184e ("ipvs: convert to use pernet nf_hook api")
Suggested-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 719c7d563c17b150877cee03a4b812a424989dfa ]

BUG: KASAN: use-after-free in ip_vs_in.part.29+0xe8/0xd20 [ip_vs]
Read of size 4 at addr ffff8881e9b26e2c by task sshd/5603

CPU: 0 PID: 5603 Comm: sshd Not tainted 4.19.39+ #30
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
 dump_stack+0x71/0xab
 print_address_description+0x6a/0x270
 kasan_report+0x179/0x2c0
 ip_vs_in.part.29+0xe8/0xd20 [ip_vs]
 ip_vs_in+0xd8/0x170 [ip_vs]
 nf_hook_slow+0x5f/0xe0
 __ip_local_out+0x1d5/0x250
 ip_local_out+0x19/0x60
 __tcp_transmit_skb+0xba1/0x14f0
 tcp_write_xmit+0x41f/0x1ed0
 ? _copy_from_iter_full+0xca/0x340
 __tcp_push_pending_frames+0x52/0x140
 tcp_sendmsg_locked+0x787/0x1600
 ? tcp_sendpage+0x60/0x60
 ? inet_sk_set_state+0xb0/0xb0
 tcp_sendmsg+0x27/0x40
 sock_sendmsg+0x6d/0x80
 sock_write_iter+0x121/0x1c0
 ? sock_sendmsg+0x80/0x80
 __vfs_write+0x23e/0x370
 vfs_write+0xe7/0x230
 ksys_write+0xa1/0x120
 ? __ia32_sys_read+0x50/0x50
 ? __audit_syscall_exit+0x3ce/0x450
 do_syscall_64+0x73/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff6f6147c60
Code: 73 01 c3 48 8b 0d 28 12 2d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 5d 73 2d 00 00 75 10 b8 01 00 00 00 0f 05 &lt;48&gt; 3d 01 f0 ff ff 73 31 c3 48 83
RSP: 002b:00007ffd772ead18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000034 RCX: 00007ff6f6147c60
RDX: 0000000000000034 RSI: 000055df30a31270 RDI: 0000000000000003
RBP: 000055df30a31270 R08: 0000000000000000 R09: 0000000000000000
R10: 00007ffd772ead70 R11: 0000000000000246 R12: 00007ffd772ead74
R13: 00007ffd772eae20 R14: 00007ffd772eae24 R15: 000055df2f12ddc0

Allocated by task 6052:
 kasan_kmalloc+0xa0/0xd0
 __kmalloc+0x10a/0x220
 ops_init+0x97/0x190
 register_pernet_operations+0x1ac/0x360
 register_pernet_subsys+0x24/0x40
 0xffffffffc0ea016d
 do_one_initcall+0x8b/0x253
 do_init_module+0xe3/0x335
 load_module+0x2fc0/0x3890
 __do_sys_finit_module+0x192/0x1c0
 do_syscall_64+0x73/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6067:
 __kasan_slab_free+0x130/0x180
 kfree+0x90/0x1a0
 ops_free_list.part.7+0xa6/0xc0
 unregister_pernet_operations+0x18b/0x1f0
 unregister_pernet_subsys+0x1d/0x30
 ip_vs_cleanup+0x1d/0xd2f [ip_vs]
 __x64_sys_delete_module+0x20c/0x300
 do_syscall_64+0x73/0x200
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8881e9b26600 which belongs to the cache kmalloc-4096 of size 4096
The buggy address is located 2092 bytes inside of 4096-byte region [ffff8881e9b26600, ffff8881e9b27600)
The buggy address belongs to the page:
page:ffffea0007a6c800 count:1 mapcount:0 mapping:ffff888107c0e600 index:0x0 compound_mapcount: 0
flags: 0x17ffffc0008100(slab|head)
raw: 0017ffffc0008100 dead000000000100 dead000000000200 ffff888107c0e600
raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

while unregistering ipvs module, ops_free_list calls
__ip_vs_cleanup, then nf_unregister_net_hooks be called to
do remove nf hook entries. It need a RCU period to finish,
however net-&gt;ipvs is set to NULL immediately, which will
trigger NULL pointer dereference when a packet is hooked
and handled by ip_vs_in where net-&gt;ipvs is dereferenced.

Another scene is ops_free_list call ops_free to free the
net_generic directly while __ip_vs_cleanup finished, then
calling ip_vs_in will triggers use-after-free.

This patch moves nf_unregister_net_hooks from __ip_vs_cleanup()
to __ip_vs_dev_cleanup(),  where rcu_barrier() is called by
unregister_pernet_device -&gt; unregister_pernet_operations,
that will do the needed grace period.

Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Fixes: efe41606184e ("ipvs: convert to use pernet nf_hook api")
Suggested-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: do not schedule icmp errors from tunnels</title>
<updated>2019-05-16T17:41:23+00:00</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2019-03-31T10:24:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4e1994ef63657da08709640eada7b4c727818a1a'/>
<id>4e1994ef63657da08709640eada7b4c727818a1a</id>
<content type='text'>
[ Upstream commit 0261ea1bd1eb0da5c0792a9119b8655cf33c80a3 ]

We can receive ICMP errors from client or from
tunneling real server. While the former can be
scheduled to real server, the latter should
not be scheduled, they are decapsulated only when
existing connection is found.

Fixes: 6044eeffafbe ("ipvs: attempt to schedule icmp packets")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0261ea1bd1eb0da5c0792a9119b8655cf33c80a3 ]

We can receive ICMP errors from client or from
tunneling real server. While the former can be
scheduled to real server, the latter should
not be scheduled, they are decapsulated only when
existing connection is found.

Fixes: 6044eeffafbe ("ipvs: attempt to schedule icmp packets")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: fix warning on unused variable</title>
<updated>2019-05-02T07:58:52+00:00</updated>
<author>
<name>Andrea Claudi</name>
<email>aclaudi@redhat.com</email>
</author>
<published>2019-02-15T16:51:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ae5e0c773ca699ff2ae91bce7bd25e92b402c0a1'/>
<id>ae5e0c773ca699ff2ae91bce7bd25e92b402c0a1</id>
<content type='text'>
[ Upstream commit c93a49b9769e435990c82297aa0baa31e1538790 ]

When CONFIG_IP_VS_IPV6 is not defined, build produced this warning:

net/netfilter/ipvs/ip_vs_ctl.c:899:6: warning: unused variable ‘ret’ [-Wunused-variable]
  int ret = 0;
      ^~~

Fix this by moving the declaration of 'ret' in the CONFIG_IP_VS_IPV6
section in the same function.

While at it, drop its unneeded initialisation.

Fixes: 098e13f5b21d ("ipvs: fix dependency on nf_defrag_ipv6")
Reported-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: Andrea Claudi &lt;aclaudi@redhat.com&gt;
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c93a49b9769e435990c82297aa0baa31e1538790 ]

When CONFIG_IP_VS_IPV6 is not defined, build produced this warning:

net/netfilter/ipvs/ip_vs_ctl.c:899:6: warning: unused variable ‘ret’ [-Wunused-variable]
  int ret = 0;
      ^~~

Fix this by moving the declaration of 'ret' in the CONFIG_IP_VS_IPV6
section in the same function.

While at it, drop its unneeded initialisation.

Fixes: 098e13f5b21d ("ipvs: fix dependency on nf_defrag_ipv6")
Reported-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: Andrea Claudi &lt;aclaudi@redhat.com&gt;
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: fix dependency on nf_defrag_ipv6</title>
<updated>2019-03-23T19:09:45+00:00</updated>
<author>
<name>Andrea Claudi</name>
<email>aclaudi@redhat.com</email>
</author>
<published>2019-02-11T15:14:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5ca2ef674d742b4283908f5df8ba523d38a58d91'/>
<id>5ca2ef674d742b4283908f5df8ba523d38a58d91</id>
<content type='text'>
[ Upstream commit 098e13f5b21d3398065fce8780f07a3ef62f4812 ]

ipvs relies on nf_defrag_ipv6 module to manage IPv6 fragmentation,
but lacks proper Kconfig dependencies and does not explicitly
request defrag features.

As a result, if netfilter hooks are not loaded, when IPv6 fragmented
packet are handled by ipvs only the first fragment makes through.

Fix it properly declaring the dependency on Kconfig and registering
netfilter hooks on ip_vs_add_service() and ip_vs_new_dest().

Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Signed-off-by: Andrea Claudi &lt;aclaudi@redhat.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 098e13f5b21d3398065fce8780f07a3ef62f4812 ]

ipvs relies on nf_defrag_ipv6 module to manage IPv6 fragmentation,
but lacks proper Kconfig dependencies and does not explicitly
request defrag features.

As a result, if netfilter hooks are not loaded, when IPv6 fragmented
packet are handled by ipvs only the first fragment makes through.

Fix it properly declaring the dependency on Kconfig and registering
netfilter hooks on ip_vs_add_service() and ip_vs_new_dest().

Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Signed-off-by: Andrea Claudi &lt;aclaudi@redhat.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: Fix signed integer overflow when setsockopt timeout</title>
<updated>2019-03-13T21:02:27+00:00</updated>
<author>
<name>ZhangXiaoxu</name>
<email>zhangxiaoxu5@huawei.com</email>
</author>
<published>2019-01-10T08:39:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e0b03a6bad1a7c3f89d0a2d4aa0584376ebe3ec3'/>
<id>e0b03a6bad1a7c3f89d0a2d4aa0584376ebe3ec3</id>
<content type='text'>
[ Upstream commit 53ab60baa1ac4f20b080a22c13b77b6373922fd7 ]

There is a UBSAN bug report as below:
UBSAN: Undefined behaviour in net/netfilter/ipvs/ip_vs_ctl.c:2227:21
signed integer overflow:
-2147483647 * 1000 cannot be represented in type 'int'

Reproduce program:
	#include &lt;stdio.h&gt;
	#include &lt;sys/types.h&gt;
	#include &lt;sys/socket.h&gt;

	#define IPPROTO_IP 0
	#define IPPROTO_RAW 255

	#define IP_VS_BASE_CTL		(64+1024+64)
	#define IP_VS_SO_SET_TIMEOUT	(IP_VS_BASE_CTL+10)

	/* The argument to IP_VS_SO_GET_TIMEOUT */
	struct ipvs_timeout_t {
		int tcp_timeout;
		int tcp_fin_timeout;
		int udp_timeout;
	};

	int main() {
		int ret = -1;
		int sockfd = -1;
		struct ipvs_timeout_t to;

		sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
		if (sockfd == -1) {
			printf("socket init error\n");
			return -1;
		}

		to.tcp_timeout = -2147483647;
		to.tcp_fin_timeout = -2147483647;
		to.udp_timeout = -2147483647;

		ret = setsockopt(sockfd,
				 IPPROTO_IP,
				 IP_VS_SO_SET_TIMEOUT,
				 (char *)(&amp;to),
				 sizeof(to));

		printf("setsockopt return %d\n", ret);
		return ret;
	}

Return -EINVAL if the timeout value is negative or max than 'INT_MAX / HZ'.

Signed-off-by: ZhangXiaoxu &lt;zhangxiaoxu5@huawei.com&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 53ab60baa1ac4f20b080a22c13b77b6373922fd7 ]

There is a UBSAN bug report as below:
UBSAN: Undefined behaviour in net/netfilter/ipvs/ip_vs_ctl.c:2227:21
signed integer overflow:
-2147483647 * 1000 cannot be represented in type 'int'

Reproduce program:
	#include &lt;stdio.h&gt;
	#include &lt;sys/types.h&gt;
	#include &lt;sys/socket.h&gt;

	#define IPPROTO_IP 0
	#define IPPROTO_RAW 255

	#define IP_VS_BASE_CTL		(64+1024+64)
	#define IP_VS_SO_SET_TIMEOUT	(IP_VS_BASE_CTL+10)

	/* The argument to IP_VS_SO_GET_TIMEOUT */
	struct ipvs_timeout_t {
		int tcp_timeout;
		int tcp_fin_timeout;
		int udp_timeout;
	};

	int main() {
		int ret = -1;
		int sockfd = -1;
		struct ipvs_timeout_t to;

		sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
		if (sockfd == -1) {
			printf("socket init error\n");
			return -1;
		}

		to.tcp_timeout = -2147483647;
		to.tcp_fin_timeout = -2147483647;
		to.udp_timeout = -2147483647;

		ret = setsockopt(sockfd,
				 IPPROTO_IP,
				 IP_VS_SO_SET_TIMEOUT,
				 (char *)(&amp;to),
				 sizeof(to));

		printf("setsockopt return %d\n", ret);
		return ret;
	}

Return -EINVAL if the timeout value is negative or max than 'INT_MAX / HZ'.

Signed-off-by: ZhangXiaoxu &lt;zhangxiaoxu5@huawei.com&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf</title>
<updated>2018-12-17T08:24:35+00:00</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2018-11-15T07:14:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=28ad9091e186786f735d2dc63d80aafd6949b337'/>
<id>28ad9091e186786f735d2dc63d80aafd6949b337</id>
<content type='text'>
[ Upstream commit 2a31e4bd9ad255ee40809b5c798c4b1c2b09703b ]

ip_vs_dst_event is supposed to clean up all dst used in ipvs'
destinations when a net dev is going down. But it works only
when the dst's dev is the same as the dev from the event.

Now with the same priority but late registration,
ip_vs_dst_notifier is always called later than ipv6_dev_notf
where the dst's dev is set to lo for NETDEV_DOWN event.

As the dst's dev lo is not the same as the dev from the event
in ip_vs_dst_event, ip_vs_dst_notifier doesn't actually work.
Also as these dst have to wait for dest_trash_timer to clean
them up. It would cause some non-permanent kernel warnings:

  unregister_netdevice: waiting for br0 to become free. Usage count = 3

To fix it, call ip_vs_dst_notifier earlier than ipv6_dev_notf
by increasing its priority to ADDRCONF_NOTIFY_PRIORITY + 5.

Note that for ipv4 route fib_netdev_notifier doesn't set dst's
dev to lo in NETDEV_DOWN event, so this fix is only needed when
IP_VS_IPV6 is defined.

Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2a31e4bd9ad255ee40809b5c798c4b1c2b09703b ]

ip_vs_dst_event is supposed to clean up all dst used in ipvs'
destinations when a net dev is going down. But it works only
when the dst's dev is the same as the dev from the event.

Now with the same priority but late registration,
ip_vs_dst_notifier is always called later than ipv6_dev_notf
where the dst's dev is set to lo for NETDEV_DOWN event.

As the dst's dev lo is not the same as the dev from the event
in ip_vs_dst_event, ip_vs_dst_notifier doesn't actually work.
Also as these dst have to wait for dest_trash_timer to clean
them up. It would cause some non-permanent kernel warnings:

  unregister_netdevice: waiting for br0 to become free. Usage count = 3

To fix it, call ip_vs_dst_notifier earlier than ipv6_dev_notf
by increasing its priority to ADDRCONF_NOTIFY_PRIORITY + 5.

Note that for ipv4 route fib_netdev_notifier doesn't set dst's
dev to lo in NETDEV_DOWN event, so this fix is only needed when
IP_VS_IPV6 is defined.

Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: convert ISO_8859-1 text comments to utf-8</title>
<updated>2018-08-24T01:48:43+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2018-08-24T00:01:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3723c63247854c97fe044c12a40e29043e9bbc1b'/>
<id>3723c63247854c97fe044c12a40e29043e9bbc1b</id>
<content type='text'>
Almost all files in the kernel are either plain text or UTF-8 encoded.  A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.

This converts them all to UTF-8 for consistency.

Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;			[IPVS portion]
Acked-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;	[IIO]
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;			[powerpc]
Acked-by: Rob Herring &lt;robh@kernel.org&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Samuel Ortiz &lt;sameo@linux.intel.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Rob Herring &lt;robh+dt@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Almost all files in the kernel are either plain text or UTF-8 encoded.  A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.

This converts them all to UTF-8 for consistency.

Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;			[IPVS portion]
Acked-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;	[IIO]
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;			[powerpc]
Acked-by: Rob Herring &lt;robh@kernel.org&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Samuel Ortiz &lt;sameo@linux.intel.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Rob Herring &lt;robh+dt@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: don't show negative times in ip_vs_conn</title>
<updated>2018-08-16T17:36:57+00:00</updated>
<author>
<name>Matteo Croce</name>
<email>mcroce@redhat.com</email>
</author>
<published>2018-07-31T16:03:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b71ed54dc2a1dbb4328f7d7c98d712747aee92ba'/>
<id>b71ed54dc2a1dbb4328f7d7c98d712747aee92ba</id>
<content type='text'>
Since commit 500462a9de65 ("timers: Switch to a non-cascading wheel"),
timers duration can last even 12.5% more than the scheduled interval.

IPVS has two handlers, /proc/net/ip_vs_conn and /proc/net/ip_vs_conn_sync,
which shows the remaining time before that a connection expires.
The default expire time for a connection is 60 seconds, and the
expiration timer can fire even 4 seconds later than the scheduled time.
The expiration time is calculated subtracting jiffies to the scheduled
expiration time, and it's shown as a huge number when the timer fires late,
since both values are unsigned.

This can confuse script and tools which relies on it, like ipvsadm:

    root@mcroce-redhat:~# while ipvsadm -lc |grep SYN_RECV; do sleep 1 ; done
    TCP 00:05  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:04  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:03  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:02  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:01  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:00  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:44 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:43 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:42 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:41 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:40 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:39 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000

Signed-off-by: Matteo Croce &lt;mcroce@redhat.com&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit 500462a9de65 ("timers: Switch to a non-cascading wheel"),
timers duration can last even 12.5% more than the scheduled interval.

IPVS has two handlers, /proc/net/ip_vs_conn and /proc/net/ip_vs_conn_sync,
which shows the remaining time before that a connection expires.
The default expire time for a connection is 60 seconds, and the
expiration timer can fire even 4 seconds later than the scheduled time.
The expiration time is calculated subtracting jiffies to the scheduled
expiration time, and it's shown as a huge number when the timer fires late,
since both values are unsigned.

This can confuse script and tools which relies on it, like ipvsadm:

    root@mcroce-redhat:~# while ipvsadm -lc |grep SYN_RECV; do sleep 1 ; done
    TCP 00:05  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:04  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:03  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:02  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:01  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:00  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:44 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:43 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:42 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:41 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:40 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:39 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000

Signed-off-by: Matteo Croce &lt;mcroce@redhat.com&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
