<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv4/arp.c, branch linux-3.8.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>arp: fix possible crash in arp_rcv()</title>
<updated>2013-02-11T01:39:39+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-02-08T18:48:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=044453b3efdc90bdd5feffe74b99d95dec70ac43'/>
<id>044453b3efdc90bdd5feffe74b99d95dec70ac43</id>
<content type='text'>
We should call skb_share_check() before pskb_may_pull(), or we
can crash in pskb_expand_head().

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We should call skb_share_check() before pskb_may_pull(), or we
can crash in pskb_expand_head().

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arp: fix a regression in arp_solicit()</title>
<updated>2012-12-25T02:42:58+00:00</updated>
<author>
<name>Cong Wang</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2012-12-23T15:23:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cf0be88057baceae033a82d669128b282308c742'/>
<id>cf0be88057baceae033a82d669128b282308c742</id>
<content type='text'>
Sedat reported the following commit caused a regression:

commit 9650388b5c56578fdccc79c57a8c82fb92b8e7f1
Author: Eric Dumazet &lt;edumazet@google.com&gt;
Date:   Fri Dec 21 07:32:10 2012 +0000

    ipv4: arp: fix a lockdep splat in arp_solicit

This is because the 6th parameter of arp_send() needs to be NULL
for the broadcast case; the above commit changed it to an all-zero
array by mistake.
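
A minimal sketch of the distinction, written in Python rather than kernel C.
It assumes the sender falls back to the device broadcast address only when
no destination hardware address is passed at all. The names pick_dest_hw,
BROADCAST and ZERO_HA are hypothetical and exist only for illustration:

```python
BROADCAST = bytes([0xFF] * 6)  # assumed Ethernet broadcast address of the device
ZERO_HA = bytes(6)             # all-zero array, as passed by mistake


def pick_dest_hw(dest_hw, dev_broadcast=BROADCAST):
    """Return the link-layer destination for an ARP request.

    Only a missing address (None, standing in for NULL) selects broadcast.
    An all-zero array is a real value and is used as-is.
    """
    if dest_hw is None:
        return dev_broadcast
    return dest_hw


def fmt(ha):
    # Render a hardware address as colon-separated hex.
    return ":".join("%02x" % b for b in ha)


print(fmt(pick_dest_hw(None)))     # ff:ff:ff:ff:ff:ff (broadcast case)
print(fmt(pick_dest_hw(ZERO_HA)))  # 00:00:00:00:00:00 (regression case)
```

Under this model an all-zero array is not treated as "no address", so the
request is addressed to 00:00:00:00:00:00 instead of being broadcast.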

Reported-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sedat reported the following commit caused a regression:

commit 9650388b5c56578fdccc79c57a8c82fb92b8e7f1
Author: Eric Dumazet &lt;edumazet@google.com&gt;
Date:   Fri Dec 21 07:32:10 2012 +0000

    ipv4: arp: fix a lockdep splat in arp_solicit

This is because the 6th parameter of arp_send() needs to be NULL
for the broadcast case; the above commit changed it to an all-zero
array by mistake.

Reported-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: arp: fix a lockdep splat in arp_solicit()</title>
<updated>2012-12-21T21:14:07+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-12-21T07:32:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9650388b5c56578fdccc79c57a8c82fb92b8e7f1'/>
<id>9650388b5c56578fdccc79c57a8c82fb92b8e7f1</id>
<content type='text'>
Yan Burman reported the following lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.7.0+ #24 Not tainted
---------------------------------------------
swapper/1/0 is trying to acquire lock:
  (&amp;n-&gt;lock){++--..}, at: [&lt;ffffffff8139f56e&gt;] __neigh_event_send+0x2e/0x2f0

but task is already holding lock:
  (&amp;n-&gt;lock){++--..}, at: [&lt;ffffffff813f63f4&gt;] arp_solicit+0x1d4/0x280

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&amp;n-&gt;lock);
   lock(&amp;n-&gt;lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

4 locks held by swapper/1/0:
  #0:  (((&amp;n-&gt;timer))){+.-...}, at: [&lt;ffffffff8104b350&gt;] call_timer_fn+0x0/0x1c0
  #1:  (&amp;n-&gt;lock){++--..}, at: [&lt;ffffffff813f63f4&gt;] arp_solicit+0x1d4/0x280
  #2:  (rcu_read_lock_bh){.+....}, at: [&lt;ffffffff81395400&gt;] dev_queue_xmit+0x0/0x5d0
  #3:  (rcu_read_lock_bh){.+....}, at: [&lt;ffffffff813cb41e&gt;] ip_finish_output+0x13e/0x640

stack backtrace:
Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
Call Trace:
  &lt;IRQ&gt;  [&lt;ffffffff8108c7ac&gt;] validate_chain+0xdcc/0x11f0
  [&lt;ffffffff8108d570&gt;] ? __lock_acquire+0x440/0xc30
  [&lt;ffffffff81120565&gt;] ? kmem_cache_free+0xe5/0x1c0
  [&lt;ffffffff8108d570&gt;] __lock_acquire+0x440/0xc30
  [&lt;ffffffff813c3570&gt;] ? inet_getpeer+0x40/0x600
  [&lt;ffffffff8108d570&gt;] ? __lock_acquire+0x440/0xc30
  [&lt;ffffffff8139f56e&gt;] ? __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8108ddf5&gt;] lock_acquire+0x95/0x140
  [&lt;ffffffff8139f56e&gt;] ? __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8108d570&gt;] ? __lock_acquire+0x440/0xc30
  [&lt;ffffffff81448d4b&gt;] _raw_write_lock_bh+0x3b/0x50
  [&lt;ffffffff8139f56e&gt;] ? __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8139f56e&gt;] __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8139f99b&gt;] neigh_resolve_output+0x16b/0x270
  [&lt;ffffffff813cb62d&gt;] ip_finish_output+0x34d/0x640
  [&lt;ffffffff813cb41e&gt;] ? ip_finish_output+0x13e/0x640
  [&lt;ffffffffa046f146&gt;] ? vxlan_xmit+0x556/0xbec [vxlan]
  [&lt;ffffffff813cb9a0&gt;] ip_output+0x80/0xf0
  [&lt;ffffffff813ca368&gt;] ip_local_out+0x28/0x80
  [&lt;ffffffffa046f25a&gt;] vxlan_xmit+0x66a/0xbec [vxlan]
  [&lt;ffffffffa046f146&gt;] ? vxlan_xmit+0x556/0xbec [vxlan]
  [&lt;ffffffff81394a50&gt;] ? skb_gso_segment+0x2b0/0x2b0
  [&lt;ffffffff81449355&gt;] ? _raw_spin_unlock_irqrestore+0x65/0x80
  [&lt;ffffffff81394c57&gt;] ? dev_queue_xmit_nit+0x207/0x270
  [&lt;ffffffff813950c8&gt;] dev_hard_start_xmit+0x298/0x5d0
  [&lt;ffffffff813956f3&gt;] dev_queue_xmit+0x2f3/0x5d0
  [&lt;ffffffff81395400&gt;] ? dev_hard_start_xmit+0x5d0/0x5d0
  [&lt;ffffffff813f5788&gt;] arp_xmit+0x58/0x60
  [&lt;ffffffff813f59db&gt;] arp_send+0x3b/0x40
  [&lt;ffffffff813f6424&gt;] arp_solicit+0x204/0x280
  [&lt;ffffffff813a1a70&gt;] ? neigh_add+0x310/0x310
  [&lt;ffffffff8139f515&gt;] neigh_probe+0x45/0x70
  [&lt;ffffffff813a1c10&gt;] neigh_timer_handler+0x1a0/0x2a0
  [&lt;ffffffff8104b3cf&gt;] call_timer_fn+0x7f/0x1c0
  [&lt;ffffffff8104b350&gt;] ? detach_if_pending+0x120/0x120
  [&lt;ffffffff8104b748&gt;] run_timer_softirq+0x238/0x2b0
  [&lt;ffffffff813a1a70&gt;] ? neigh_add+0x310/0x310
  [&lt;ffffffff81043e51&gt;] __do_softirq+0x101/0x280
  [&lt;ffffffff814518cc&gt;] call_softirq+0x1c/0x30
  [&lt;ffffffff81003b65&gt;] do_softirq+0x85/0xc0
  [&lt;ffffffff81043a7e&gt;] irq_exit+0x9e/0xc0
  [&lt;ffffffff810264f8&gt;] smp_apic_timer_interrupt+0x68/0xa0
  [&lt;ffffffff8145122f&gt;] apic_timer_interrupt+0x6f/0x80
  &lt;EOI&gt;  [&lt;ffffffff8100a054&gt;] ? mwait_idle+0xa4/0x1c0
  [&lt;ffffffff8100a04b&gt;] ? mwait_idle+0x9b/0x1c0
  [&lt;ffffffff8100a6a9&gt;] cpu_idle+0x89/0xe0
  [&lt;ffffffff81441127&gt;] start_secondary+0x1b2/0x1b6

The bug is in arp_solicit(), which releases the neigh lock only after
arp_send(). In the vxlan case, we eventually need to write-lock a neigh
lock later.

It's a false positive, but we can get rid of it without lockdep
annotations.

We can instead use the neigh_ha_snapshot() helper.

Reported-by: Yan Burman &lt;yanb@mellanox.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Yan Burman reported the following lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.7.0+ #24 Not tainted
---------------------------------------------
swapper/1/0 is trying to acquire lock:
  (&amp;n-&gt;lock){++--..}, at: [&lt;ffffffff8139f56e&gt;] __neigh_event_send+0x2e/0x2f0

but task is already holding lock:
  (&amp;n-&gt;lock){++--..}, at: [&lt;ffffffff813f63f4&gt;] arp_solicit+0x1d4/0x280

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&amp;n-&gt;lock);
   lock(&amp;n-&gt;lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

4 locks held by swapper/1/0:
  #0:  (((&amp;n-&gt;timer))){+.-...}, at: [&lt;ffffffff8104b350&gt;] call_timer_fn+0x0/0x1c0
  #1:  (&amp;n-&gt;lock){++--..}, at: [&lt;ffffffff813f63f4&gt;] arp_solicit+0x1d4/0x280
  #2:  (rcu_read_lock_bh){.+....}, at: [&lt;ffffffff81395400&gt;] dev_queue_xmit+0x0/0x5d0
  #3:  (rcu_read_lock_bh){.+....}, at: [&lt;ffffffff813cb41e&gt;] ip_finish_output+0x13e/0x640

stack backtrace:
Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
Call Trace:
  &lt;IRQ&gt;  [&lt;ffffffff8108c7ac&gt;] validate_chain+0xdcc/0x11f0
  [&lt;ffffffff8108d570&gt;] ? __lock_acquire+0x440/0xc30
  [&lt;ffffffff81120565&gt;] ? kmem_cache_free+0xe5/0x1c0
  [&lt;ffffffff8108d570&gt;] __lock_acquire+0x440/0xc30
  [&lt;ffffffff813c3570&gt;] ? inet_getpeer+0x40/0x600
  [&lt;ffffffff8108d570&gt;] ? __lock_acquire+0x440/0xc30
  [&lt;ffffffff8139f56e&gt;] ? __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8108ddf5&gt;] lock_acquire+0x95/0x140
  [&lt;ffffffff8139f56e&gt;] ? __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8108d570&gt;] ? __lock_acquire+0x440/0xc30
  [&lt;ffffffff81448d4b&gt;] _raw_write_lock_bh+0x3b/0x50
  [&lt;ffffffff8139f56e&gt;] ? __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8139f56e&gt;] __neigh_event_send+0x2e/0x2f0
  [&lt;ffffffff8139f99b&gt;] neigh_resolve_output+0x16b/0x270
  [&lt;ffffffff813cb62d&gt;] ip_finish_output+0x34d/0x640
  [&lt;ffffffff813cb41e&gt;] ? ip_finish_output+0x13e/0x640
  [&lt;ffffffffa046f146&gt;] ? vxlan_xmit+0x556/0xbec [vxlan]
  [&lt;ffffffff813cb9a0&gt;] ip_output+0x80/0xf0
  [&lt;ffffffff813ca368&gt;] ip_local_out+0x28/0x80
  [&lt;ffffffffa046f25a&gt;] vxlan_xmit+0x66a/0xbec [vxlan]
  [&lt;ffffffffa046f146&gt;] ? vxlan_xmit+0x556/0xbec [vxlan]
  [&lt;ffffffff81394a50&gt;] ? skb_gso_segment+0x2b0/0x2b0
  [&lt;ffffffff81449355&gt;] ? _raw_spin_unlock_irqrestore+0x65/0x80
  [&lt;ffffffff81394c57&gt;] ? dev_queue_xmit_nit+0x207/0x270
  [&lt;ffffffff813950c8&gt;] dev_hard_start_xmit+0x298/0x5d0
  [&lt;ffffffff813956f3&gt;] dev_queue_xmit+0x2f3/0x5d0
  [&lt;ffffffff81395400&gt;] ? dev_hard_start_xmit+0x5d0/0x5d0
  [&lt;ffffffff813f5788&gt;] arp_xmit+0x58/0x60
  [&lt;ffffffff813f59db&gt;] arp_send+0x3b/0x40
  [&lt;ffffffff813f6424&gt;] arp_solicit+0x204/0x280
  [&lt;ffffffff813a1a70&gt;] ? neigh_add+0x310/0x310
  [&lt;ffffffff8139f515&gt;] neigh_probe+0x45/0x70
  [&lt;ffffffff813a1c10&gt;] neigh_timer_handler+0x1a0/0x2a0
  [&lt;ffffffff8104b3cf&gt;] call_timer_fn+0x7f/0x1c0
  [&lt;ffffffff8104b350&gt;] ? detach_if_pending+0x120/0x120
  [&lt;ffffffff8104b748&gt;] run_timer_softirq+0x238/0x2b0
  [&lt;ffffffff813a1a70&gt;] ? neigh_add+0x310/0x310
  [&lt;ffffffff81043e51&gt;] __do_softirq+0x101/0x280
  [&lt;ffffffff814518cc&gt;] call_softirq+0x1c/0x30
  [&lt;ffffffff81003b65&gt;] do_softirq+0x85/0xc0
  [&lt;ffffffff81043a7e&gt;] irq_exit+0x9e/0xc0
  [&lt;ffffffff810264f8&gt;] smp_apic_timer_interrupt+0x68/0xa0
  [&lt;ffffffff8145122f&gt;] apic_timer_interrupt+0x6f/0x80
  &lt;EOI&gt;  [&lt;ffffffff8100a054&gt;] ? mwait_idle+0xa4/0x1c0
  [&lt;ffffffff8100a04b&gt;] ? mwait_idle+0x9b/0x1c0
  [&lt;ffffffff8100a6a9&gt;] cpu_idle+0x89/0xe0
  [&lt;ffffffff81441127&gt;] start_secondary+0x1b2/0x1b6

The bug is in arp_solicit(), which releases the neigh lock only after
arp_send(). In the vxlan case, we eventually need to write-lock a neigh
lock later.

It's a false positive, but we can get rid of it without lockdep
annotations.

We can instead use the neigh_ha_snapshot() helper.

Reported-by: Yan Burman &lt;yanb@mellanox.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Allow userns root to control ipv4</title>
<updated>2012-11-19T01:32:45+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2012-11-16T03:03:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=52e804c6dfaa5df1e4b0e290357b82ad4e4cda2c'/>
<id>52e804c6dfaa5df1e4b0e290357b82ad4e4cda2c</id>
<content type='text'>
Allow an unprivileged user who has created a user namespace, and then
created a network namespace, to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to ns_capable(net-&gt;user_ns,
CAP_NET_ADMIN) or ns_capable(net-&gt;user_ns, CAP_NET_RAW) calls.

Settings that merely control a single network device are allowed.
Either the network device is a logical network device where
restrictions make no difference, or the network device is a hardware NIC
that has been explicitly moved from the initial network namespace.

In general policy and network stack state changes are allowed
while resource control is left unchanged.

Allow creating raw sockets.
Allow the SIOCSARP ioctl to control the arp cache.
Allow the SIOCSIFFLAGS ioctl for setting network device flags.
Allow the SIOCSIFADDR ioctl for setting a netdevice ipv4 address.
Allow the SIOCSIFBRDADDR ioctl for setting a netdevice ipv4 broadcast address.
Allow the SIOCSIFDSTADDR ioctl for setting a netdevice ipv4 destination address.
Allow the SIOCSIFNETMASK ioctl for setting a netdevice ipv4 netmask.
Allow the SIOCADDRT and SIOCDELRT ioctls for adding and deleting ipv4 routes.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting gre tunnels.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipip tunnels.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipsec virtual tunnel interfaces.

Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC,
MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing
sockets.

Allow setting and receiving IPOPT_CIPSO, IPOPT_SEC, IPOPT_SID and
arbitrary ip options.

Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option.
Allow setting the IP_TRANSPARENT ipv4 socket option.
Allow setting the TCP_REPAIR socket option.
Allow setting the TCP_CONGESTION socket option.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow an unprivileged user who has created a user namespace, and then
created a network namespace, to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to ns_capable(net-&gt;user_ns,
CAP_NET_ADMIN) or ns_capable(net-&gt;user_ns, CAP_NET_RAW) calls.

Settings that merely control a single network device are allowed.
Either the network device is a logical network device where
restrictions make no difference, or the network device is a hardware NIC
that has been explicitly moved from the initial network namespace.

In general policy and network stack state changes are allowed
while resource control is left unchanged.

Allow creating raw sockets.
Allow the SIOCSARP ioctl to control the arp cache.
Allow the SIOCSIFFLAGS ioctl for setting network device flags.
Allow the SIOCSIFADDR ioctl for setting a netdevice ipv4 address.
Allow the SIOCSIFBRDADDR ioctl for setting a netdevice ipv4 broadcast address.
Allow the SIOCSIFDSTADDR ioctl for setting a netdevice ipv4 destination address.
Allow the SIOCSIFNETMASK ioctl for setting a netdevice ipv4 netmask.
Allow the SIOCADDRT and SIOCDELRT ioctls for adding and deleting ipv4 routes.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting gre tunnels.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipip tunnels.

Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipsec virtual tunnel interfaces.

Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC,
MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing
sockets.

Allow setting and receiving IPOPT_CIPSO, IPOPT_SEC, IPOPT_SID and
arbitrary ip options.

Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option.
Allow setting the IP_TRANSPARENT ipv4 socket option.
Allow setting the TCP_REPAIR socket option.
Allow setting the TCP_CONGESTION socket option.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4/route: arg delay is useless in rt_cache_flush()</title>
<updated>2012-09-18T19:44:34+00:00</updated>
<author>
<name>Nicolas Dichtel</name>
<email>nicolas.dichtel@6wind.com</email>
</author>
<published>2012-09-07T00:45:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bafa6d9d89072c1a18853afe9ee5de05c491c13a'/>
<id>bafa6d9d89072c1a18853afe9ee5de05c491c13a</id>
<content type='text'>
Since the route cache deletion (89aef8921bfbac22f), the delay argument
is no longer used. Remove it.

Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since the route cache deletion (89aef8921bfbac22f), the delay argument
is no longer used. Remove it.

Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: Fix input route performance regression.</title>
<updated>2012-07-26T22:50:39+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-07-26T11:14:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c6cffba4ffa26a8ffacd0bb9f3144e34f20da7de'/>
<id>c6cffba4ffa26a8ffacd0bb9f3144e34f20da7de</id>
<content type='text'>
With the routing cache removal we lost the "noref" code paths on
input, and this can kill some routing workloads.

Reinstate the noref path when we hit a cached route in the FIB
nexthops.

With help from Eric Dumazet.

Reported-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With the routing cache removal we lost the "noref" code paths on
input, and this can kill some routing workloads.

Reinstate the noref path when we hit a cached route in the FIB
nexthops.

With help from Eric Dumazet.

Reported-by: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: Adjust semantics of rt-&gt;rt_gateway.</title>
<updated>2012-07-20T20:31:20+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-07-13T12:03:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f8126f1d5136be1ca1a3536d43ad7a710b5620f8'/>
<id>f8126f1d5136be1ca1a3536d43ad7a710b5620f8</id>
<content type='text'>
In order to allow prefixed routes, we have to adjust how rt_gateway
is set and interpreted.

The new interpretation is:

1) rt_gateway == 0, destination is on-link, nexthop is iph-&gt;daddr

2) rt_gateway != 0, destination requires a nexthop gateway

Abstract the fetching of the proper nexthop value using a new
inline helper, rt_nexthop(), as suggested by Joe Perches.
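
The two cases above can be sketched as follows. This is an illustrative
Python model, not the kernel's C helper: addresses are plain integers, the
function takes the gateway and the packet's destination address (daddr) as
separate arguments, and the ip() helper is hypothetical:

```python
import ipaddress


def rt_nexthop(rt_gateway, daddr):
    """Return the next hop as an integer IPv4 address.

    rt_gateway == 0 means the destination is on-link, so the next hop is
    the packet's own destination address; otherwise it is the gateway.
    """
    if rt_gateway != 0:
        return rt_gateway
    return daddr


def ip(text):
    # Hypothetical helper: dotted-quad string to integer.
    return int(ipaddress.IPv4Address(text))


daddr = ip("192.0.2.10")
on_link = rt_nexthop(0, daddr)
via_gw = rt_nexthop(ip("198.51.100.1"), daddr)
print(ipaddress.IPv4Address(on_link))  # 192.0.2.10
print(ipaddress.IPv4Address(via_gw))   # 198.51.100.1
```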

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Tested-by: Vijay Subramanian &lt;subramanian.vijay@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to allow prefixed routes, we have to adjust how rt_gateway
is set and interpreted.

The new interpretation is:

1) rt_gateway == 0, destination is on-link, nexthop is iph-&gt;daddr

2) rt_gateway != 0, destination requires a nexthop gateway

Abstract the fetching of the proper nexthop value using a new
inline helper, rt_nexthop(), as suggested by Joe Perches.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Tested-by: Vijay Subramanian &lt;subramanian.vijay@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: Kill ip_route_input_noref().</title>
<updated>2012-07-20T20:30:59+00:00</updated>
<author>
<name>David Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-07-01T02:02:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=38a424e4657462fe9f8b76f01a0e879abde99ab4'/>
<id>38a424e4657462fe9f8b76f01a0e879abde99ab4</id>
<content type='text'>
The "noref" argument to ip_route_input_common() is now always ignored
because we do not cache routes, and in that case we must always grab
a reference to the resulting 'dst'.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The "noref" argument to ip_route_input_common() is now always ignored
because we do not cache routes, and in that case we must always grab
a reference to the resulting 'dst'.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "ipv4: tcp: dont cache unconfirmed intput dst"</title>
<updated>2012-06-28T00:05:06+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-06-28T00:05:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c10237e077cef50e925f052e49f3b4fead9d71f9'/>
<id>c10237e077cef50e925f052e49f3b4fead9d71f9</id>
<content type='text'>
This reverts commit c074da2810c118b3812f32d6754bd9ead2f169e7.

This change has several unwanted side effects:

1) Sockets will cache the DST_NOCACHE route in sk-&gt;sk_rx_dst and we'll
   thus never create a real cached route.

2) All TCP traffic will use DST_NOCACHE and never use the routing
   cache at all.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit c074da2810c118b3812f32d6754bd9ead2f169e7.

This change has several unwanted side effects:

1) Sockets will cache the DST_NOCACHE route in sk-&gt;sk_rx_dst and we'll
   thus never create a real cached route.

2) All TCP traffic will use DST_NOCACHE and never use the routing
   cache at all.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: tcp: dont cache unconfirmed intput dst</title>
<updated>2012-06-27T22:34:24+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-06-26T23:14:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c074da2810c118b3812f32d6754bd9ead2f169e7'/>
<id>c074da2810c118b3812f32d6754bd9ead2f169e7</id>
<content type='text'>
DDOS synflood attacks hit the IP route cache badly.

On typical machines, this cache is allowed to hold up to 8 million dst
entries, 256 bytes for each, for a total of 2GB of memory.
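
As a quick check of that figure, here is the arithmetic in Python. Reading
"8 million" as 8 * 2**20 entries is an assumption; the decimal reading is
shown alongside it, and both come out at roughly 2GB:

```python
ENTRY_SIZE = 256  # bytes per dst entry, as stated above

binary_entries = 8 * 2**20    # assumption: 8 "binary" million entries
decimal_entries = 8 * 10**6   # alternative: 8 decimal million entries

binary_total = binary_entries * ENTRY_SIZE
decimal_total = decimal_entries * ENTRY_SIZE

print(binary_total / 2**30)    # 2.0 (GiB)
print(decimal_total / 10**9)   # 2.048 (GB)
```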

rt_garbage_collect() triggers and tries to clean things up.

Eventually the route cache is disabled, but the machine is under fire
and might OOM and crash.

This patch exploits the new TCP early demux to set a nocache
boolean when an incoming TCP frame is for a socket that is not yet
ESTABLISHED or TIMEWAIT.

This 'nocache' boolean is then used when the dst entry is not found in
the route cache, to create an unhashed dst entry (DST_NOCACHE).

Sending a SYN-cookie ACK uses a similar mechanism (ipv4: tcp: dont cache
output dst for syncookies), so after this patch, a machine is able to
absorb a DDOS synflood attack without polluting its IP route cache.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Hans Schillstrom &lt;hans.schillstrom@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DDOS synflood attacks hit the IP route cache badly.

On typical machines, this cache is allowed to hold up to 8 million dst
entries, 256 bytes for each, for a total of 2GB of memory.

rt_garbage_collect() triggers and tries to clean things up.

Eventually the route cache is disabled, but the machine is under fire
and might OOM and crash.

This patch exploits the new TCP early demux to set a nocache
boolean when an incoming TCP frame is for a socket that is not yet
ESTABLISHED or TIMEWAIT.

This 'nocache' boolean is then used when the dst entry is not found in
the route cache, to create an unhashed dst entry (DST_NOCACHE).

Sending a SYN-cookie ACK uses a similar mechanism (ipv4: tcp: dont cache
output dst for syncookies), so after this patch, a machine is able to
absorb a DDOS synflood attack without polluting its IP route cache.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Hans Schillstrom &lt;hans.schillstrom@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
