<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/ipv4, branch v3.16-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup</title>
<updated>2014-06-13T22:39:24+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-12T23:13:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=63c6f81cdde58c41da62a8d8a209592e42a0203e'/>
<id>63c6f81cdde58c41da62a8d8a209592e42a0203e</id>
<content type='text'>
It's too easy to add thousands of UDP sockets to a particular bucket,
and slow down an innocent multicast receiver.

Early demux is supposed to be an optimization; we should avoid spending
too much time in it.

It is interesting to note that __udp4_lib_demux_lookup() only tries to
match the first socket in the chain.

10 is the threshold we already have in __udp4_lib_lookup() to switch
to the secondary hash.

Fixes: 421b3885bf6d5 ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: David Held &lt;drheld@google.com&gt;
Cc: Shawn Bohrer &lt;sbohrer@rgmadvisors.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
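
The bound described above can be sketched in Python as follows. This is
only an illustration under stated assumptions, not the kernel code: the
names MAX_CHAIN_SCAN and mcast_demux_lookup and the (group, port, name)
tuples are hypothetical, and a hash bucket is modelled as a plain list.

```python
# Hypothetical limit, mirroring the threshold of 10 mentioned above.
MAX_CHAIN_SCAN = 10


def mcast_demux_lookup(chain, group, port):
    """Return the first socket in chain bound to (group, port), or None.

    chain is a list of (group, port, name) tuples standing in for one
    hash bucket. Buckets longer than MAX_CHAIN_SCAN are not scanned at
    all, so the caller falls back to the regular (slower) lookup path.
    """
    # len(chain) exceeds the limit exactly when it is outside 0..MAX_CHAIN_SCAN.
    if len(chain) not in range(MAX_CHAIN_SCAN + 1):
        return None
    for sock_group, sock_port, name in chain:
        if (sock_group, sock_port) == (group, port):
            return name
    return None


short_chain = [("239.0.0.1", 5000, "rx0"), ("239.0.0.2", 5000, "rx1")]
crowded_chain = [("10.0.0.%d" % i, 9, "noise%d" % i) for i in range(1000)]
crowded_chain.append(("239.0.0.1", 5000, "rx0"))

print(mcast_demux_lookup(short_chain, "239.0.0.1", 5000))
print(mcast_demux_lookup(crowded_chain, "239.0.0.1", 5000))
```

In this sketch the crowded bucket is rejected before any entry is
compared, so the cost of the early lookup stays bounded no matter how
many unrelated sockets share the bucket.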
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's too easy to add thousands of UDP sockets to a particular bucket,
and slow down an innocent multicast receiver.

Early demux is supposed to be an optimization; we should avoid spending
too much time in it.

It is interesting to note that __udp4_lib_demux_lookup() only tries to
match the first socket in the chain.

10 is the threshold we already have in __udp4_lib_lookup() to switch
to the secondary hash.

Fixes: 421b3885bf6d5 ("udp: ipv4: Add udp early demux")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: David Held &lt;drheld@google.com&gt;
Cc: Shawn Bohrer &lt;sbohrer@rgmadvisors.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next</title>
<updated>2014-06-12T21:27:40+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-06-12T21:27:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f9da455b93f6ba076935b4ef4589f61e529ae046'/>
<id>f9da455b93f6ba076935b4ef4589f61e529ae046</id>
<content type='text'>
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 &lt; v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 &lt; v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: fixing TLP's FIN recovery</title>
<updated>2014-06-12T18:05:51+00:00</updated>
<author>
<name>Per Hurtig</name>
<email>per.hurtig@kau.se</email>
</author>
<published>2014-06-12T15:08:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bef1909ee3ed1ca39231b260a8d3b4544ecd0c8f'/>
<id>bef1909ee3ed1ca39231b260a8d3b4544ecd0c8f</id>
<content type='text'>
Fix to a problem observed when losing a FIN segment that does not
contain data.  In such situations, TLP is unable to recover from
*any* tail loss and instead adds at least PTO ms to the
retransmission process, i.e., RTO = RTO + PTO.

Signed-off-by: Per Hurtig &lt;per.hurtig@kau.se&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Nandita Dukkipati &lt;nanditad@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix to a problem observed when losing a FIN segment that does not
contain data.  In such situations, TLP is unable to recover from
*any* tail loss and instead adds at least PTO ms to the
retransmission process, i.e., RTO = RTO + PTO.

Signed-off-by: Per Hurtig &lt;per.hurtig@kau.se&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Nandita Dukkipati &lt;nanditad@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2014-06-11T23:02:55+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2014-06-11T23:02:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=902455e00720018d1dbd38327c3fd5bda6d844ee'/>
<id>902455e00720018d1dbd38327c3fd5bda6d844ee</id>
<content type='text'>
Conflicts:
	net/core/rtnetlink.c
	net/core/skbuff.c

Both conflicts were very simple overlapping changes.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	net/core/rtnetlink.c
	net/core/skbuff.c

Both conflicts were very simple overlapping changes.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add skb_gro_postpull_rcsum to udp and vxlan</title>
<updated>2014-06-11T22:46:13+00:00</updated>
<author>
<name>Tom Herbert</name>
<email>therbert@google.com</email>
</author>
<published>2014-06-11T01:54:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6bae1d4cc395ad46613e40c9e865ee171dc9de5c'/>
<id>6bae1d4cc395ad46613e40c9e865ee171dc9de5c</id>
<content type='text'>
Need to call skb_gro_postpull_rcsum for GRO to work with checksum complete.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Need to call skb_gro_postpull_rcsum for GRO to work with checksum complete.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Save software checksum complete</title>
<updated>2014-06-11T22:46:13+00:00</updated>
<author>
<name>Tom Herbert</name>
<email>therbert@google.com</email>
</author>
<published>2014-06-11T01:54:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7e3cead5172927732f51fde77fef6f521e22f209'/>
<id>7e3cead5172927732f51fde77fef6f521e22f209</id>
<content type='text'>
In skb_checksum complete, if we need to compute the checksum for the
packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
Subsequent checksum verification can use this.

Also, add a csum_complete_sw flag to distinguish between software- and
hardware-generated checksum complete; we should always be able to trust
the software computation.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In skb_checksum complete, if we need to compute the checksum for the
packet (via skb_checksum) save the result as CHECKSUM_COMPLETE.
Subsequent checksum verification can use this.

Also, add a csum_complete_sw flag to distinguish between software- and
hardware-generated checksum complete; we should always be able to trust
the software computation.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: fix a race in ip4_datagram_release_cb()</title>
<updated>2014-06-11T22:39:18+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-10T13:43:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9709674e68646cee5a24e3000b3558d25412203a'/>
<id>9709674e68646cee5a24e3000b3558d25412203a</id>
<content type='text'>
Alexey gave an AddressSanitizer[1] report that finally gave a good hint
at the origin of various problems already reported by Dormando
in the past [2].

The problem comes from the fact that UDP can have a lockless TX path, and
concurrent threads can manipulate sk_dst_cache while another thread
is holding the socket lock and calls __sk_dst_set() in
ip4_datagram_release_cb() (this was added in linux-3.8).

It seems that all we need to do is to use sk_dst_check() and
sk_dst_set() so that all the writers hold the same spinlock
(sk-&gt;sk_dst_lock) to prevent corruption.

The TCP stack does not need this protection, as all sk_dst_cache writers hold
the socket lock.

[1]
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel

AddressSanitizer: heap-use-after-free in ipv4_dst_check
Read of size 2 by thread T15453:
 [&lt;ffffffff817daa3a&gt;] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116
 [&lt;ffffffff8175b789&gt;] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531
 [&lt;ffffffff81830a36&gt;] ip4_datagram_release_cb+0x46/0x390 ??:0
 [&lt;ffffffff8175eaea&gt;] release_sock+0x17a/0x230 ./net/core/sock.c:2413
 [&lt;ffffffff81830882&gt;] ip4_datagram_connect+0x462/0x5d0 ??:0
 [&lt;ffffffff81846d06&gt;] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [&lt;ffffffff817580ac&gt;] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [&lt;ffffffff817596ce&gt;] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [&lt;ffffffff818b0a29&gt;] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Freed by thread T15455:
 [&lt;ffffffff8178d9b8&gt;] dst_destroy+0xa8/0x160 ./net/core/dst.c:251
 [&lt;ffffffff8178de25&gt;] dst_release+0x45/0x80 ./net/core/dst.c:280
 [&lt;ffffffff818304c1&gt;] ip4_datagram_connect+0xa1/0x5d0 ??:0
 [&lt;ffffffff81846d06&gt;] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [&lt;ffffffff817580ac&gt;] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [&lt;ffffffff817596ce&gt;] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [&lt;ffffffff818b0a29&gt;] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Allocated by thread T15453:
 [&lt;ffffffff8178d291&gt;] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171
 [&lt;ffffffff817db3b7&gt;] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406
 [&lt;     inlined    &gt;] __ip_route_output_key+0x3e8/0xf70
__mkroute_output ./net/ipv4/route.c:1939
 [&lt;ffffffff817dde08&gt;] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161
 [&lt;ffffffff817deb34&gt;] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249
 [&lt;ffffffff81830737&gt;] ip4_datagram_connect+0x317/0x5d0 ??:0
 [&lt;ffffffff81846d06&gt;] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [&lt;ffffffff817580ac&gt;] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [&lt;ffffffff817596ce&gt;] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [&lt;ffffffff818b0a29&gt;] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

[2]
&lt;4&gt;[196727.311203] general protection fault: 0000 [#1] SMP
&lt;4&gt;[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
&lt;4&gt;[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
&lt;4&gt;[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
&lt;4&gt;[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000
&lt;4&gt;[196727.311377] RIP: 0010:[&lt;ffffffff815f8c7f&gt;]  [&lt;ffffffff815f8c7f&gt;] ipv4_dst_destroy+0x4f/0x80
&lt;4&gt;[196727.311399] RSP: 0018:ffff885effd23a70  EFLAGS: 00010282
&lt;4&gt;[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040
&lt;4&gt;[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200
&lt;4&gt;[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800
&lt;4&gt;[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
&lt;4&gt;[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce
&lt;4&gt;[196727.311510] FS:  0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000
&lt;4&gt;[196727.311554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
&lt;4&gt;[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0
&lt;4&gt;[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
&lt;4&gt;[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
&lt;4&gt;[196727.311713] Stack:
&lt;4&gt;[196727.311733]  ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42
&lt;4&gt;[196727.311784]  ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0
&lt;4&gt;[196727.311834]  ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0
&lt;4&gt;[196727.311885] Call Trace:
&lt;4&gt;[196727.311907]  &lt;IRQ&gt;
&lt;4&gt;[196727.311912]  [&lt;ffffffff815b7f42&gt;] dst_destroy+0x32/0xe0
&lt;4&gt;[196727.311959]  [&lt;ffffffff815b86c6&gt;] dst_release+0x56/0x80
&lt;4&gt;[196727.311986]  [&lt;ffffffff81620bd5&gt;] tcp_v4_do_rcv+0x2a5/0x4a0
&lt;4&gt;[196727.312013]  [&lt;ffffffff81622b5a&gt;] tcp_v4_rcv+0x7da/0x820
&lt;4&gt;[196727.312041]  [&lt;ffffffff815fd9e0&gt;] ? ip_rcv_finish+0x360/0x360
&lt;4&gt;[196727.312070]  [&lt;ffffffff815de02d&gt;] ? nf_hook_slow+0x7d/0x150
&lt;4&gt;[196727.312097]  [&lt;ffffffff815fd9e0&gt;] ? ip_rcv_finish+0x360/0x360
&lt;4&gt;[196727.312125]  [&lt;ffffffff815fda92&gt;] ip_local_deliver_finish+0xb2/0x230
&lt;4&gt;[196727.312154]  [&lt;ffffffff815fdd9a&gt;] ip_local_deliver+0x4a/0x90
&lt;4&gt;[196727.312183]  [&lt;ffffffff815fd799&gt;] ip_rcv_finish+0x119/0x360
&lt;4&gt;[196727.312212]  [&lt;ffffffff815fe00b&gt;] ip_rcv+0x22b/0x340
&lt;4&gt;[196727.312242]  [&lt;ffffffffa0339680&gt;] ? macvlan_broadcast+0x160/0x160 [macvlan]
&lt;4&gt;[196727.312275]  [&lt;ffffffff815b0c62&gt;] __netif_receive_skb_core+0x512/0x640
&lt;4&gt;[196727.312308]  [&lt;ffffffff811427fb&gt;] ? kmem_cache_alloc+0x13b/0x150
&lt;4&gt;[196727.312338]  [&lt;ffffffff815b0db1&gt;] __netif_receive_skb+0x21/0x70
&lt;4&gt;[196727.312368]  [&lt;ffffffff815b0fa1&gt;] netif_receive_skb+0x31/0xa0
&lt;4&gt;[196727.312397]  [&lt;ffffffff815b1ae8&gt;] napi_gro_receive+0xe8/0x140
&lt;4&gt;[196727.312433]  [&lt;ffffffffa00274f1&gt;] ixgbe_poll+0x551/0x11f0 [ixgbe]
&lt;4&gt;[196727.312463]  [&lt;ffffffff815fe00b&gt;] ? ip_rcv+0x22b/0x340
&lt;4&gt;[196727.312491]  [&lt;ffffffff815b1691&gt;] net_rx_action+0x111/0x210
&lt;4&gt;[196727.312521]  [&lt;ffffffff815b0db1&gt;] ? __netif_receive_skb+0x21/0x70
&lt;4&gt;[196727.312552]  [&lt;ffffffff810519d0&gt;] __do_softirq+0xd0/0x270
&lt;4&gt;[196727.312583]  [&lt;ffffffff816cef3c&gt;] call_softirq+0x1c/0x30
&lt;4&gt;[196727.312613]  [&lt;ffffffff81004205&gt;] do_softirq+0x55/0x90
&lt;4&gt;[196727.312640]  [&lt;ffffffff81051c85&gt;] irq_exit+0x55/0x60
&lt;4&gt;[196727.312668]  [&lt;ffffffff816cf5c3&gt;] do_IRQ+0x63/0xe0
&lt;4&gt;[196727.312696]  [&lt;ffffffff816c5aaa&gt;] common_interrupt+0x6a/0x6a
&lt;4&gt;[196727.312722]  &lt;EOI&gt;
&lt;1&gt;[196727.313071] RIP  [&lt;ffffffff815f8c7f&gt;] ipv4_dst_destroy+0x4f/0x80
&lt;4&gt;[196727.313100]  RSP &lt;ffff885effd23a70&gt;
&lt;4&gt;[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
&lt;0&gt;[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt

Reported-by: Alexey Preobrazhensky &lt;preobr@google.com&gt;
Reported-by: dormando &lt;dormando@rydia.ne&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Fixes: 8141ed9fcedb2 ("ipv4: Add a socket release callback for datagram sockets")
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
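
The locking rule described above (every writer of the cached route takes
the same lock) can be sketched in Python. This is an analogy under stated
assumptions, not kernel code: SockSketch, dst_set, and dst_check are
hypothetical names, threading.Lock stands in for sk_dst_lock, and a dict
with a "valid" flag stands in for a dst entry.

```python
import threading


class SockSketch:
    """Toy socket holding a cached route, guarded by one writer lock."""

    def __init__(self):
        self.dst_lock = threading.Lock()  # stands in for sk_dst_lock
        self.dst_cache = None
        self.released = []  # routes dropped from the cache, for inspection

    def dst_set(self, dst):
        # Every writer swaps the pointer under the same lock, so two
        # concurrent writers can never both release the same old entry.
        with self.dst_lock:
            old, self.dst_cache = self.dst_cache, dst
        if old is not None:
            self.released.append(old)

    def dst_check(self):
        # Return the cached route if still valid; otherwise clear it
        # through dst_set() so the reset also goes through the lock.
        dst = self.dst_cache
        if dst is not None and not dst["valid"]:
            self.dst_set(None)
            return None
        return dst


sk = SockSketch()
routes = [{"id": i, "valid": True} for i in range(200)]
threads = [threading.Thread(target=sk.dst_set, args=(r,)) for r in routes]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every route except the one still cached should be released exactly once.
released_ids = sorted(r["id"] for r in sk.released)
print(len(released_ids) == len(set(released_ids)) == len(routes) - 1)
```

If some writer bypassed the lock, as the unlocked setter did in the race
described above, two threads could observe the same old entry and both
release it, which is the kind of double free the reports show.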
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Alexey gave an AddressSanitizer[1] report that finally gave a good hint
at the origin of various problems already reported by Dormando
in the past [2].

The problem comes from the fact that UDP can have a lockless TX path, and
concurrent threads can manipulate sk_dst_cache while another thread
is holding the socket lock and calls __sk_dst_set() in
ip4_datagram_release_cb() (this was added in linux-3.8).

It seems that all we need to do is to use sk_dst_check() and
sk_dst_set() so that all the writers hold the same spinlock
(sk-&gt;sk_dst_lock) to prevent corruption.

The TCP stack does not need this protection, as all sk_dst_cache writers hold
the socket lock.

[1]
https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel

AddressSanitizer: heap-use-after-free in ipv4_dst_check
Read of size 2 by thread T15453:
 [&lt;ffffffff817daa3a&gt;] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116
 [&lt;ffffffff8175b789&gt;] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531
 [&lt;ffffffff81830a36&gt;] ip4_datagram_release_cb+0x46/0x390 ??:0
 [&lt;ffffffff8175eaea&gt;] release_sock+0x17a/0x230 ./net/core/sock.c:2413
 [&lt;ffffffff81830882&gt;] ip4_datagram_connect+0x462/0x5d0 ??:0
 [&lt;ffffffff81846d06&gt;] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [&lt;ffffffff817580ac&gt;] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [&lt;ffffffff817596ce&gt;] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [&lt;ffffffff818b0a29&gt;] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Freed by thread T15455:
 [&lt;ffffffff8178d9b8&gt;] dst_destroy+0xa8/0x160 ./net/core/dst.c:251
 [&lt;ffffffff8178de25&gt;] dst_release+0x45/0x80 ./net/core/dst.c:280
 [&lt;ffffffff818304c1&gt;] ip4_datagram_connect+0xa1/0x5d0 ??:0
 [&lt;ffffffff81846d06&gt;] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [&lt;ffffffff817580ac&gt;] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [&lt;ffffffff817596ce&gt;] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [&lt;ffffffff818b0a29&gt;] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

Allocated by thread T15453:
 [&lt;ffffffff8178d291&gt;] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171
 [&lt;ffffffff817db3b7&gt;] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406
 [&lt;     inlined    &gt;] __ip_route_output_key+0x3e8/0xf70
__mkroute_output ./net/ipv4/route.c:1939
 [&lt;ffffffff817dde08&gt;] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161
 [&lt;ffffffff817deb34&gt;] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249
 [&lt;ffffffff81830737&gt;] ip4_datagram_connect+0x317/0x5d0 ??:0
 [&lt;ffffffff81846d06&gt;] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
 [&lt;ffffffff817580ac&gt;] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
 [&lt;ffffffff817596ce&gt;] SyS_connect+0xe/0x10 ./net/socket.c:1682
 [&lt;ffffffff818b0a29&gt;] system_call_fastpath+0x16/0x1b
./arch/x86/kernel/entry_64.S:629

[2]
&lt;4&gt;[196727.311203] general protection fault: 0000 [#1] SMP
&lt;4&gt;[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
&lt;4&gt;[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
&lt;4&gt;[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
&lt;4&gt;[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000
&lt;4&gt;[196727.311377] RIP: 0010:[&lt;ffffffff815f8c7f&gt;]  [&lt;ffffffff815f8c7f&gt;] ipv4_dst_destroy+0x4f/0x80
&lt;4&gt;[196727.311399] RSP: 0018:ffff885effd23a70  EFLAGS: 00010282
&lt;4&gt;[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040
&lt;4&gt;[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200
&lt;4&gt;[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800
&lt;4&gt;[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
&lt;4&gt;[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce
&lt;4&gt;[196727.311510] FS:  0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000
&lt;4&gt;[196727.311554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
&lt;4&gt;[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0
&lt;4&gt;[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
&lt;4&gt;[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
&lt;4&gt;[196727.311713] Stack:
&lt;4&gt;[196727.311733]  ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42
&lt;4&gt;[196727.311784]  ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0
&lt;4&gt;[196727.311834]  ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0
&lt;4&gt;[196727.311885] Call Trace:
&lt;4&gt;[196727.311907]  &lt;IRQ&gt;
&lt;4&gt;[196727.311912]  [&lt;ffffffff815b7f42&gt;] dst_destroy+0x32/0xe0
&lt;4&gt;[196727.311959]  [&lt;ffffffff815b86c6&gt;] dst_release+0x56/0x80
&lt;4&gt;[196727.311986]  [&lt;ffffffff81620bd5&gt;] tcp_v4_do_rcv+0x2a5/0x4a0
&lt;4&gt;[196727.312013]  [&lt;ffffffff81622b5a&gt;] tcp_v4_rcv+0x7da/0x820
&lt;4&gt;[196727.312041]  [&lt;ffffffff815fd9e0&gt;] ? ip_rcv_finish+0x360/0x360
&lt;4&gt;[196727.312070]  [&lt;ffffffff815de02d&gt;] ? nf_hook_slow+0x7d/0x150
&lt;4&gt;[196727.312097]  [&lt;ffffffff815fd9e0&gt;] ? ip_rcv_finish+0x360/0x360
&lt;4&gt;[196727.312125]  [&lt;ffffffff815fda92&gt;] ip_local_deliver_finish+0xb2/0x230
&lt;4&gt;[196727.312154]  [&lt;ffffffff815fdd9a&gt;] ip_local_deliver+0x4a/0x90
&lt;4&gt;[196727.312183]  [&lt;ffffffff815fd799&gt;] ip_rcv_finish+0x119/0x360
&lt;4&gt;[196727.312212]  [&lt;ffffffff815fe00b&gt;] ip_rcv+0x22b/0x340
&lt;4&gt;[196727.312242]  [&lt;ffffffffa0339680&gt;] ? macvlan_broadcast+0x160/0x160 [macvlan]
&lt;4&gt;[196727.312275]  [&lt;ffffffff815b0c62&gt;] __netif_receive_skb_core+0x512/0x640
&lt;4&gt;[196727.312308]  [&lt;ffffffff811427fb&gt;] ? kmem_cache_alloc+0x13b/0x150
&lt;4&gt;[196727.312338]  [&lt;ffffffff815b0db1&gt;] __netif_receive_skb+0x21/0x70
&lt;4&gt;[196727.312368]  [&lt;ffffffff815b0fa1&gt;] netif_receive_skb+0x31/0xa0
&lt;4&gt;[196727.312397]  [&lt;ffffffff815b1ae8&gt;] napi_gro_receive+0xe8/0x140
&lt;4&gt;[196727.312433]  [&lt;ffffffffa00274f1&gt;] ixgbe_poll+0x551/0x11f0 [ixgbe]
&lt;4&gt;[196727.312463]  [&lt;ffffffff815fe00b&gt;] ? ip_rcv+0x22b/0x340
&lt;4&gt;[196727.312491]  [&lt;ffffffff815b1691&gt;] net_rx_action+0x111/0x210
&lt;4&gt;[196727.312521]  [&lt;ffffffff815b0db1&gt;] ? __netif_receive_skb+0x21/0x70
&lt;4&gt;[196727.312552]  [&lt;ffffffff810519d0&gt;] __do_softirq+0xd0/0x270
&lt;4&gt;[196727.312583]  [&lt;ffffffff816cef3c&gt;] call_softirq+0x1c/0x30
&lt;4&gt;[196727.312613]  [&lt;ffffffff81004205&gt;] do_softirq+0x55/0x90
&lt;4&gt;[196727.312640]  [&lt;ffffffff81051c85&gt;] irq_exit+0x55/0x60
&lt;4&gt;[196727.312668]  [&lt;ffffffff816cf5c3&gt;] do_IRQ+0x63/0xe0
&lt;4&gt;[196727.312696]  [&lt;ffffffff816c5aaa&gt;] common_interrupt+0x6a/0x6a
&lt;4&gt;[196727.312722]  &lt;EOI&gt;
&lt;1&gt;[196727.313071] RIP  [&lt;ffffffff815f8c7f&gt;] ipv4_dst_destroy+0x4f/0x80
&lt;4&gt;[196727.313100]  RSP &lt;ffff885effd23a70&gt;
&lt;4&gt;[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
&lt;0&gt;[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt

Reported-by: Alexey Preobrazhensky &lt;preobr@google.com&gt;
Reported-by: dormando &lt;dormando@rydia.ne&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Fixes: 8141ed9fcedb2 ("ipv4: Add a socket release callback for datagram sockets")
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ip_tunnel: fix i_key matching in ip_tunnel_find</title>
<updated>2014-06-11T07:43:37+00:00</updated>
<author>
<name>Dmitry Popov</name>
<email>ixaphire@qrator.net</email>
</author>
<published>2014-06-07T23:03:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ce54af1fc9d2718d46c9fd92a161379fb197266'/>
<id>5ce54af1fc9d2718d46c9fd92a161379fb197266</id>
<content type='text'>
Some tunnels (though only vti for now) can use i_key just for internal use:
for example, vti uses it for fwmark'ing incoming packets. So the raw i_key value
shouldn't be treated as a distinguisher for them. ip_tunnel_key_match exists for
cases when we want to compare two ip_tunnel_parms' i_keys.

Example bug:
ip link add type vti ikey 1 local 1.0.0.1 remote 2.0.0.2
ip link add type vti ikey 2 local 1.0.0.1 remote 2.0.0.2
spawned two tunnels, although it doesn't make sense.

Signed-off-by: Dmitry Popov &lt;ixaphire@qrator.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
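
The matching rule described above can be sketched in Python. This is an
illustration under stated assumptions: key_match and find_tunnel are
hypothetical names, the has_key booleans stand in for the TUNNEL_KEY bit
in the flags, and tunnels are plain dicts. The point is that i_key is
compared only when the existing tunnel is actually keyed, so vti's
internal use of i_key does not make two otherwise identical tunnels
distinct.

```python
def key_match(tunnel, has_key, key):
    """Compare keys only when the existing tunnel is keyed."""
    if tunnel["has_key"]:
        return has_key and key == tunnel["i_key"]
    # Unkeyed tunnel: it matches only an unkeyed request, whatever i_key holds.
    return not has_key


def find_tunnel(tunnels, local, remote, has_key, key):
    """Return the first tunnel with the same endpoints and a matching key."""
    for t in tunnels:
        same_endpoints = (t["local"], t["remote"]) == (local, remote)
        if same_endpoints and key_match(t, has_key, key):
            return t
    return None


# A vti tunnel stores i_key (used as a mark) without being a keyed tunnel.
vti1 = {"local": "1.0.0.1", "remote": "2.0.0.2", "has_key": False, "i_key": 1}
tunnels = [vti1]

# A second vti request with ikey 2 finds the existing tunnel instead of
# being treated as a new, distinct one.
print(find_tunnel(tunnels, "1.0.0.1", "2.0.0.2", False, 2) is vti1)
```

Comparing the raw i_key values instead would report no match for the
second request, which corresponds to the duplicate tunnel in the example
bug.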
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some tunnels (though only vti for now) can use i_key just for internal use:
for example, vti uses it for fwmark'ing incoming packets. So the raw i_key value
shouldn't be treated as a distinguisher for them. ip_tunnel_key_match exists for
cases when we want to compare two ip_tunnel_parms' i_keys.

Example bug:
ip link add type vti ikey 1 local 1.0.0.1 remote 2.0.0.2
ip link add type vti ikey 2 local 1.0.0.1 remote 2.0.0.2
spawned two tunnels, although it doesn't make sense.

Signed-off-by: Dmitry Popov &lt;ixaphire@qrator.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ip_vti: Fix 'ip tunnel add' with 'key' parameters</title>
<updated>2014-06-11T07:30:52+00:00</updated>
<author>
<name>Dmitry Popov</name>
<email>ixaphire@qrator.net</email>
</author>
<published>2014-06-07T22:06:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7c8e6b9c2811fd37702a9043eabea3545022011e'/>
<id>7c8e6b9c2811fd37702a9043eabea3545022011e</id>
<content type='text'>
ip tunnel add remote 10.2.2.1 local 10.2.2.2 mode vti ikey 1 okey 2
translates to p-&gt;i_flags = VTI_ISVTI|GRE_KEY and p-&gt;i_key = 1, but GRE_KEY !=
TUNNEL_KEY, so ip_tunnel_ioctl would set i_key to 0 (same story with o_key),
making us unable to create vti tunnels with [io]key via ip tunnel.

We cannot simply translate GRE_KEY to TUNNEL_KEY (as the GRE module does) because
vti tunnels with the same local/remote addresses but different ikeys would then be
treated as different. So, imo the best option here is to move the p-&gt;i_flags &amp; *_KEY
check for vti tunnels from ip_tunnel.c to ip_vti.c and to think about an [io]_mark
field for ip_tunnel_parm in the future.

Signed-off-by: Dmitry Popov &lt;ixaphire@qrator.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ip tunnel add remote 10.2.2.1 local 10.2.2.2 mode vti ikey 1 okey 2
translates to p-&gt;i_flags = VTI_ISVTI|GRE_KEY and p-&gt;i_key = 1, but GRE_KEY !=
TUNNEL_KEY, so ip_tunnel_ioctl would set i_key to 0 (same story with o_key),
making us unable to create vti tunnels with [io]key via ip tunnel.

We cannot simply translate GRE_KEY to TUNNEL_KEY (as the GRE module does) because
vti tunnels with the same local/remote addresses but different ikeys would then be
treated as different. So, imo the best option here is to move the p-&gt;i_flags &amp; *_KEY
check for vti tunnels from ip_tunnel.c to ip_vti.c and to think about an [io]_mark
field for ip_tunnel_parm in the future.

Signed-off-by: Dmitry Popov &lt;ixaphire@qrator.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipip, sit: fix ipv4_{update_pmtu,redirect} calls</title>
<updated>2014-06-11T06:35:52+00:00</updated>
<author>
<name>Dmitry Popov</name>
<email>ixaphire@qrator.net</email>
</author>
<published>2014-06-06T19:19:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2346829e641b804ece9ac9298136b56d9567c278'/>
<id>2346829e641b804ece9ac9298136b56d9567c278</id>
<content type='text'>
ipv4_{update_pmtu,redirect} were called with the tunnel's ifindex (t-&gt;dev is a
tunnel netdevice). This caused a wrong route lookup and failure of the pmtu update or
redirect. We should use the same ifindex that we use in ip_route_output_* in the
*tunnel_xmit code, namely t-&gt;parms.link.

Signed-off-by: Dmitry Popov &lt;ixaphire@qrator.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipv4_{update_pmtu,redirect} were called with the tunnel's ifindex (t-&gt;dev is a
tunnel netdevice). This caused a wrong route lookup and failure of the pmtu update or
redirect. We should use the same ifindex that we use in ip_route_output_* in the
*tunnel_xmit code, namely t-&gt;parms.link.

Signed-off-by: Dmitry Popov &lt;ixaphire@qrator.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
