<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv4/ipmr.c, branch linux-3.18.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route</title>
<updated>2017-04-18T05:55:46+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@cumulusnetworks.com</email>
</author>
<published>2016-09-25T21:08:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ad7db1290185be22dfcf13261ffcedd4409abfdb'/>
<id>ad7db1290185be22dfcf13261ffcedd4409abfdb</id>
<content type='text'>
[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]

Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid which was copied from in_skb's portid.
Since the skb is new the portid is 0 at that point so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl two times.
Also since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&amp;mrt-&gt;ipmr_expire_timer))){+.-...}, at: [&lt;ffffffff810fbbf5&gt;] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [&lt;ffffffff8161e005&gt;] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412]  &lt;IRQ&gt;  [&lt;ffffffff813a7804&gt;] dump_stack+0x85/0xc1
[ 7858.215662]  [&lt;ffffffff810a4a72&gt;] ___might_sleep+0x192/0x250
[ 7858.215868]  [&lt;ffffffff810a4b9f&gt;] __might_sleep+0x6f/0x100
[ 7858.216072]  [&lt;ffffffff8165bea3&gt;] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [&lt;ffffffff815a7a5f&gt;] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [&lt;ffffffff8157474b&gt;] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [&lt;ffffffff815a9a0c&gt;] netlink_unicast+0x19c/0x260
[ 7858.216900]  [&lt;ffffffff81573c70&gt;] rtnl_unicast+0x20/0x30
[ 7858.217128]  [&lt;ffffffff8161cd39&gt;] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [&lt;ffffffff8161e06f&gt;] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [&lt;ffffffff810fbc95&gt;] call_timer_fn+0xa5/0x350
[ 7858.218192]  [&lt;ffffffff810fbbf5&gt;] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [&lt;ffffffff810fde10&gt;] run_timer_softirq+0x260/0x640
[ 7858.218865]  [&lt;ffffffff8166379b&gt;] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [&lt;ffffffff816637c8&gt;] __do_softirq+0xe8/0x54f
[ 7858.219269]  [&lt;ffffffff8107a948&gt;] irq_exit+0xb8/0xc0
[ 7858.219463]  [&lt;ffffffff81663452&gt;] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [&lt;ffffffff816625bc&gt;] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897]  &lt;EOI&gt;  [&lt;ffffffff81055f16&gt;] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [&lt;ffffffff810d64dd&gt;] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [&lt;ffffffff810298e3&gt;] default_idle+0x23/0x190
[ 7858.220574]  [&lt;ffffffff8102a20f&gt;] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [&lt;ffffffff810c9f8c&gt;] default_idle_call+0x4c/0x60
[ 7858.221016]  [&lt;ffffffff810ca33b&gt;] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [&lt;ffffffff8164f995&gt;] rest_init+0x135/0x140
[ 7858.221469]  [&lt;ffffffff81f83014&gt;] start_kernel+0x50e/0x51b
[ 7858.221670]  [&lt;ffffffff81f82120&gt;] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [&lt;ffffffff81f8243f&gt;] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [&lt;ffffffff81f8257c&gt;] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]

Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid which was copied from in_skb's portid.
Since the skb is new the portid is 0 at that point so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl two times.
Also since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&amp;mrt-&gt;ipmr_expire_timer))){+.-...}, at: [&lt;ffffffff810fbbf5&gt;] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [&lt;ffffffff8161e005&gt;] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412]  &lt;IRQ&gt;  [&lt;ffffffff813a7804&gt;] dump_stack+0x85/0xc1
[ 7858.215662]  [&lt;ffffffff810a4a72&gt;] ___might_sleep+0x192/0x250
[ 7858.215868]  [&lt;ffffffff810a4b9f&gt;] __might_sleep+0x6f/0x100
[ 7858.216072]  [&lt;ffffffff8165bea3&gt;] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [&lt;ffffffff815a7a5f&gt;] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [&lt;ffffffff8157474b&gt;] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [&lt;ffffffff815a9a0c&gt;] netlink_unicast+0x19c/0x260
[ 7858.216900]  [&lt;ffffffff81573c70&gt;] rtnl_unicast+0x20/0x30
[ 7858.217128]  [&lt;ffffffff8161cd39&gt;] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [&lt;ffffffff8161e06f&gt;] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [&lt;ffffffff810fbc95&gt;] call_timer_fn+0xa5/0x350
[ 7858.218192]  [&lt;ffffffff810fbbf5&gt;] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [&lt;ffffffff810fde10&gt;] run_timer_softirq+0x260/0x640
[ 7858.218865]  [&lt;ffffffff8166379b&gt;] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [&lt;ffffffff816637c8&gt;] __do_softirq+0xe8/0x54f
[ 7858.219269]  [&lt;ffffffff8107a948&gt;] irq_exit+0xb8/0xc0
[ 7858.219463]  [&lt;ffffffff81663452&gt;] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [&lt;ffffffff816625bc&gt;] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897]  &lt;EOI&gt;  [&lt;ffffffff81055f16&gt;] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [&lt;ffffffff810d64dd&gt;] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [&lt;ffffffff810298e3&gt;] default_idle+0x23/0x190
[ 7858.220574]  [&lt;ffffffff8102a20f&gt;] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [&lt;ffffffff810c9f8c&gt;] default_idle_call+0x4c/0x60
[ 7858.221016]  [&lt;ffffffff810ca33b&gt;] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [&lt;ffffffff8164f995&gt;] rest_init+0x135/0x140
[ 7858.221469]  [&lt;ffffffff81f83014&gt;] start_kernel+0x50e/0x51b
[ 7858.221670]  [&lt;ffffffff81f82120&gt;] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [&lt;ffffffff81f8243f&gt;] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [&lt;ffffffff81f8257c&gt;] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ipmr: fix static mfc/dev leaks on table destruction</title>
<updated>2015-12-14T17:19:25+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@cumulusnetworks.com</email>
</author>
<published>2015-11-20T12:54:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=691e3dcb40a7e7a83f368cfea1952ffa6aff6cc0'/>
<id>691e3dcb40a7e7a83f368cfea1952ffa6aff6cc0</id>
<content type='text'>
[ Upstream commit 0e615e9601a15efeeb8942cf7cd4dadba0c8c5a7 ]

When destroying an mrt table the static mfc entries and the static
devices are kept, which leads to devices that can never be destroyed
(because of refcnt taken) and leaked memory, for example:
unreferenced object 0xffff880034c144c0 (size 192):
  comm "mfc-broken", pid 4777, jiffies 4320349055 (age 46001.964s)
  hex dump (first 32 bytes):
    98 53 f0 34 00 88 ff ff 98 53 f0 34 00 88 ff ff  .S.4.....S.4....
    ef 0a 0a 14 01 02 03 04 00 00 00 00 01 00 00 00  ................
  backtrace:
    [&lt;ffffffff815c1b9e&gt;] kmemleak_alloc+0x4e/0xb0
    [&lt;ffffffff811ea6e0&gt;] kmem_cache_alloc+0x190/0x300
    [&lt;ffffffff815931cb&gt;] ip_mroute_setsockopt+0x5cb/0x910
    [&lt;ffffffff8153d575&gt;] do_ip_setsockopt.isra.11+0x105/0xff0
    [&lt;ffffffff8153e490&gt;] ip_setsockopt+0x30/0xa0
    [&lt;ffffffff81564e13&gt;] raw_setsockopt+0x33/0x90
    [&lt;ffffffff814d1e14&gt;] sock_common_setsockopt+0x14/0x20
    [&lt;ffffffff814d0b51&gt;] SyS_setsockopt+0x71/0xc0
    [&lt;ffffffff815cdbf6&gt;] entry_SYSCALL_64_fastpath+0x16/0x7a
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Make sure that everything is cleaned on netns destruction.

Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Reviewed-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0e615e9601a15efeeb8942cf7cd4dadba0c8c5a7 ]

When destroying an mrt table the static mfc entries and the static
devices are kept, which leads to devices that can never be destroyed
(because of refcnt taken) and leaked memory, for example:
unreferenced object 0xffff880034c144c0 (size 192):
  comm "mfc-broken", pid 4777, jiffies 4320349055 (age 46001.964s)
  hex dump (first 32 bytes):
    98 53 f0 34 00 88 ff ff 98 53 f0 34 00 88 ff ff  .S.4.....S.4....
    ef 0a 0a 14 01 02 03 04 00 00 00 00 01 00 00 00  ................
  backtrace:
    [&lt;ffffffff815c1b9e&gt;] kmemleak_alloc+0x4e/0xb0
    [&lt;ffffffff811ea6e0&gt;] kmem_cache_alloc+0x190/0x300
    [&lt;ffffffff815931cb&gt;] ip_mroute_setsockopt+0x5cb/0x910
    [&lt;ffffffff8153d575&gt;] do_ip_setsockopt.isra.11+0x105/0xff0
    [&lt;ffffffff8153e490&gt;] ip_setsockopt+0x30/0xa0
    [&lt;ffffffff81564e13&gt;] raw_setsockopt+0x33/0x90
    [&lt;ffffffff814d1e14&gt;] sock_common_setsockopt+0x14/0x20
    [&lt;ffffffff814d0b51&gt;] SyS_setsockopt+0x71/0xc0
    [&lt;ffffffff815cdbf6&gt;] entry_SYSCALL_64_fastpath+0x16/0x7a
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Make sure that everything is cleaned on netns destruction.

Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Reviewed-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context.</title>
<updated>2015-12-03T15:12:17+00:00</updated>
<author>
<name>Ani Sinha</name>
<email>ani@arista.com</email>
</author>
<published>2015-10-30T23:54:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=37d4671cc4ee4dec336752740d591adb0d3f0768'/>
<id>37d4671cc4ee4dec336752740d591adb0d3f0768</id>
<content type='text'>
[ Upstream commit 44f49dd8b5a606870a1f21101522a0f9c4414784 ]

Fixes the following kernel BUG :

BUG: using __this_cpu_add() in preemptible [00000000] code: bash/2758
caller is __this_cpu_preempt_check+0x13/0x15
CPU: 0 PID: 2758 Comm: bash Tainted: P           O   3.18.19 #2
 ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
 0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
 ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
Call Trace:
[&lt;ffffffff81482b2a&gt;] dump_stack+0x52/0x80
[&lt;ffffffff812010ae&gt;] check_preemption_disabled+0xce/0xe1
[&lt;ffffffff812010d4&gt;] __this_cpu_preempt_check+0x13/0x15
[&lt;ffffffff81419d60&gt;] ipmr_queue_xmit+0x647/0x70c
[&lt;ffffffff8141a154&gt;] ip_mr_forward+0x32f/0x34e
[&lt;ffffffff8141af76&gt;] ip_mroute_setsockopt+0xe03/0x108c
[&lt;ffffffff810553fc&gt;] ? get_parent_ip+0x11/0x42
[&lt;ffffffff810e6974&gt;] ? pollwake+0x4d/0x51
[&lt;ffffffff81058ac0&gt;] ? default_wake_function+0x0/0xf
[&lt;ffffffff810553fc&gt;] ? get_parent_ip+0x11/0x42
[&lt;ffffffff810613d9&gt;] ? __wake_up_common+0x45/0x77
[&lt;ffffffff81486ea9&gt;] ? _raw_spin_unlock_irqrestore+0x1d/0x32
[&lt;ffffffff810618bc&gt;] ? __wake_up_sync_key+0x4a/0x53
[&lt;ffffffff8139a519&gt;] ? sock_def_readable+0x71/0x75
[&lt;ffffffff813dd226&gt;] do_ip_setsockopt+0x9d/0xb55
[&lt;ffffffff81429818&gt;] ? unix_seqpacket_sendmsg+0x3f/0x41
[&lt;ffffffff813963fe&gt;] ? sock_sendmsg+0x6d/0x86
[&lt;ffffffff813959d4&gt;] ? sockfd_lookup_light+0x12/0x5d
[&lt;ffffffff8139650a&gt;] ? SyS_sendto+0xf3/0x11b
[&lt;ffffffff810d5738&gt;] ? new_sync_read+0x82/0xaa
[&lt;ffffffff813ddd19&gt;] compat_ip_setsockopt+0x3b/0x99
[&lt;ffffffff813fb24a&gt;] compat_raw_setsockopt+0x11/0x32
[&lt;ffffffff81399052&gt;] compat_sock_common_setsockopt+0x18/0x1f
[&lt;ffffffff813c4d05&gt;] compat_SyS_setsockopt+0x1a9/0x1cf
[&lt;ffffffff813c4149&gt;] compat_SyS_socketcall+0x180/0x1e3
[&lt;ffffffff81488ea1&gt;] cstar_dispatch+0x7/0x1e

Signed-off-by: Ani Sinha &lt;ani@arista.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 44f49dd8b5a606870a1f21101522a0f9c4414784 ]

Fixes the following kernel BUG :

BUG: using __this_cpu_add() in preemptible [00000000] code: bash/2758
caller is __this_cpu_preempt_check+0x13/0x15
CPU: 0 PID: 2758 Comm: bash Tainted: P           O   3.18.19 #2
 ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
 0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
 ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
Call Trace:
[&lt;ffffffff81482b2a&gt;] dump_stack+0x52/0x80
[&lt;ffffffff812010ae&gt;] check_preemption_disabled+0xce/0xe1
[&lt;ffffffff812010d4&gt;] __this_cpu_preempt_check+0x13/0x15
[&lt;ffffffff81419d60&gt;] ipmr_queue_xmit+0x647/0x70c
[&lt;ffffffff8141a154&gt;] ip_mr_forward+0x32f/0x34e
[&lt;ffffffff8141af76&gt;] ip_mroute_setsockopt+0xe03/0x108c
[&lt;ffffffff810553fc&gt;] ? get_parent_ip+0x11/0x42
[&lt;ffffffff810e6974&gt;] ? pollwake+0x4d/0x51
[&lt;ffffffff81058ac0&gt;] ? default_wake_function+0x0/0xf
[&lt;ffffffff810553fc&gt;] ? get_parent_ip+0x11/0x42
[&lt;ffffffff810613d9&gt;] ? __wake_up_common+0x45/0x77
[&lt;ffffffff81486ea9&gt;] ? _raw_spin_unlock_irqrestore+0x1d/0x32
[&lt;ffffffff810618bc&gt;] ? __wake_up_sync_key+0x4a/0x53
[&lt;ffffffff8139a519&gt;] ? sock_def_readable+0x71/0x75
[&lt;ffffffff813dd226&gt;] do_ip_setsockopt+0x9d/0xb55
[&lt;ffffffff81429818&gt;] ? unix_seqpacket_sendmsg+0x3f/0x41
[&lt;ffffffff813963fe&gt;] ? sock_sendmsg+0x6d/0x86
[&lt;ffffffff813959d4&gt;] ? sockfd_lookup_light+0x12/0x5d
[&lt;ffffffff8139650a&gt;] ? SyS_sendto+0xf3/0x11b
[&lt;ffffffff810d5738&gt;] ? new_sync_read+0x82/0xaa
[&lt;ffffffff813ddd19&gt;] compat_ip_setsockopt+0x3b/0x99
[&lt;ffffffff813fb24a&gt;] compat_raw_setsockopt+0x11/0x32
[&lt;ffffffff81399052&gt;] compat_sock_common_setsockopt+0x18/0x1f
[&lt;ffffffff813c4d05&gt;] compat_SyS_setsockopt+0x1a9/0x1cf
[&lt;ffffffff813c4149&gt;] compat_SyS_socketcall+0x180/0x1e3
[&lt;ffffffff81488ea1&gt;] cstar_dispatch+0x7/0x1e

Signed-off-by: Ani Sinha &lt;ani@arista.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: set name_assign_type in alloc_netdev()</title>
<updated>2014-07-15T23:12:48+00:00</updated>
<author>
<name>Tom Gundersen</name>
<email>teg@jklm.no</email>
</author>
<published>2014-07-14T14:37:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c835a677331495cf137a7f8a023463afd9f032f8'/>
<id>c835a677331495cf137a7f8a023463afd9f032f8</id>
<content type='text'>
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.

Coccinelle patch:

@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@

(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)

v9: move comments here from the wrong commit

Signed-off-by: Tom Gundersen &lt;teg@jklm.no&gt;
Reviewed-by: David Herrmann &lt;dh.herrmann@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.

Coccinelle patch:

@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@

(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)

v9: move comments here from the wrong commit

Signed-off-by: Tom Gundersen &lt;teg@jklm.no&gt;
Reviewed-by: David Herrmann &lt;dh.herrmann@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: get rid of ip_id_count</title>
<updated>2014-06-02T18:00:41+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-02T12:26:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=73f156a6e8c1074ac6327e0abd1169e95eb66463'/>
<id>73f156a6e8c1074ac6327e0abd1169e95eb66463</id>
<content type='text'>
Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipmr: Replace comma with semicolon</title>
<updated>2014-05-31T00:48:57+00:00</updated>
<author>
<name>Himangi Saraogi</name>
<email>himangi774@gmail.com</email>
</author>
<published>2014-05-30T15:40:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=70cb4a4526d4d3ad901a979a5f26917149408f8d'/>
<id>70cb4a4526d4d3ad901a979a5f26917149408f8d</id>
<content type='text'>
This patch replaces a comma between expression statements by a semicolon.

A simplified version of the semantic patch that performs this
transformation is as follows:

// &lt;smpl&gt;
@r@
expression e1,e2,e;
type T;
identifier i;
@@

 e1
-,
+;
 e2;
// &lt;/smpl&gt;

Signed-off-by: Himangi Saraogi &lt;himangi774@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch replaces a comma between expression statements by a semicolon.

A simplified version of the semantic patch that performs this
transformation is as follows:

// &lt;smpl&gt;
@r@
expression e1,e2,e;
type T;
identifier i;
@@

 e1
-,
+;
 e2;
// &lt;/smpl&gt;

Signed-off-by: Himangi Saraogi &lt;himangi774@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif</title>
<updated>2014-04-16T19:05:11+00:00</updated>
<author>
<name>Cong Wang</name>
<email>cwang@twopensource.com</email>
</author>
<published>2014-04-15T23:25:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6a662719c9868b3d6c7d26b3a085f0cd3cc15e64'/>
<id>6a662719c9868b3d6c7d26b3a085f0cd3cc15e64</id>
<content type='text'>
As suggested by Julian:

	Simply, flowi4_iif must not contain 0, it does not
	look logical to ignore all ip rules with specified iif.

because in fib_rule_match() we do:

        if (rule-&gt;iifindex &amp;&amp; (rule-&gt;iifindex != fl-&gt;flowi_iif))
                goto out;

flowi4_iif should be LOOPBACK_IFINDEX by default.

We need to move LOOPBACK_IFINDEX to include/net/flow.h:

1) It is mostly used by flowi_iif

2) Fix the following compile error if we use it in flow.h
by the patches latter:

In file included from include/linux/netfilter.h:277:0,
                 from include/net/netns/netfilter.h:5,
                 from include/net/net_namespace.h:21,
                 from include/linux/netdevice.h:43,
                 from include/linux/icmpv6.h:12,
                 from include/linux/ipv6.h:61,
                 from include/net/ipv6.h:16,
                 from include/linux/sunrpc/clnt.h:27,
                 from include/linux/nfs_fs.h:30,
                 from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)

Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As suggested by Julian:

	Simply, flowi4_iif must not contain 0, it does not
	look logical to ignore all ip rules with specified iif.

because in fib_rule_match() we do:

        if (rule-&gt;iifindex &amp;&amp; (rule-&gt;iifindex != fl-&gt;flowi_iif))
                goto out;

flowi4_iif should be LOOPBACK_IFINDEX by default.

We need to move LOOPBACK_IFINDEX to include/net/flow.h:

1) It is mostly used by flowi_iif

2) Fix the following compile error if we use it in flow.h
by the patches latter:

In file included from include/linux/netfilter.h:277:0,
                 from include/net/netns/netfilter.h:5,
                 from include/net/net_namespace.h:21,
                 from include/linux/netdevice.h:43,
                 from include/linux/icmpv6.h:12,
                 from include/linux/ipv6.h:61,
                 from include/net/ipv6.h:16,
                 from include/linux/sunrpc/clnt.h:27,
                 from include/linux/nfs_fs.h:30,
                 from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)

Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Cong Wang &lt;cwang@twopensource.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipmr: fix mfc notification flags</title>
<updated>2014-03-20T20:24:28+00:00</updated>
<author>
<name>Nicolas Dichtel</name>
<email>nicolas.dichtel@6wind.com</email>
</author>
<published>2014-03-19T16:47:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=65886f439ab0fdc2dff20d1fa87afb98c6717472'/>
<id>65886f439ab0fdc2dff20d1fa87afb98c6717472</id>
<content type='text'>
Commit 8cd3ac9f9b7b ("ipmr: advertise new mfc entries via rtnl") reuses the
function ipmr_fill_mroute() to notify mfc events.
But this function was used only for dump and thus was always setting the
flag NLM_F_MULTI, which is wrong in case of a single notification.

Libraries like libnl will wait forever for NLMSG_DONE.

CC: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Acked-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 8cd3ac9f9b7b ("ipmr: advertise new mfc entries via rtnl") reuses the
function ipmr_fill_mroute() to notify mfc events.
But this function was used only for dump and thus was always setting the
flag NLM_F_MULTI, which is wrong in case of a single notification.

Libraries like libnl will wait forever for NLMSG_DONE.

CC: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Acked-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2014-01-18T08:55:41+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2014-01-18T08:55:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=41804420586ab41049a14ab7ef04eaa2280b8647'/>
<id>41804420586ab41049a14ab7ef04eaa2280b8647</id>
<content type='text'>
Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
	net/ipv4/tcp_metrics.c

Overlapping changes between the "don't create two tcp metrics objects
with the same key" race fix in net and the addition of the destination
address in the lookup key in net-next.

Minor overlapping changes in bnx2x driver.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
	net/ipv4/tcp_metrics.c

Overlapping changes between the "don't create two tcp metrics objects
with the same key" race fix in net and the addition of the destination
address in the lookup key in net-next.

Minor overlapping changes in bnx2x driver.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: avoid reference counter overflows on fib_rules in multicast forwarding</title>
<updated>2014-01-15T01:37:25+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2014-01-13T01:45:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=95f4a45de1a0f172b35451fc52283290adb21f6e'/>
<id>95f4a45de1a0f172b35451fc52283290adb21f6e</id>
<content type='text'>
Bob Falken reported that after 4G packets, multicast forwarding stopped
working. This was because of a rule reference counter overflow which
freed the rule as soon as the overflow happend.

This patch solves this by adding the FIB_LOOKUP_NOREF flag to
fib_rules_lookup calls. This is safe even from non-rcu locked sections
as in this case the flag only implies not taking a reference to the rule,
which we don't need at all.

Rules only hold references to the namespace, which are guaranteed to be
available during the call of the non-rcu protected function reg_vif_xmit
because of the interface reference which itself holds a reference to
the net namespace.

Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
Reported-by: Bob Falken &lt;NetFestivalHaveFun@gmx.com&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Bob Falken reported that after 4G packets, multicast forwarding stopped
working. This was because of a rule reference counter overflow which
freed the rule as soon as the overflow happend.

This patch solves this by adding the FIB_LOOKUP_NOREF flag to
fib_rules_lookup calls. This is safe even from non-rcu locked sections
as in this case the flag only implies not taking a reference to the rule,
which we don't need at all.

Rules only hold references to the namespace, which are guaranteed to be
available during the call of the non-rcu protected function reg_vif_xmit
because of the interface reference which itself holds a reference to
the net namespace.

Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
Reported-by: Bob Falken &lt;NetFestivalHaveFun@gmx.com&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
