<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/net/vxlan.c, branch v3.14</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>vxlan: fix nonfunctional neigh_reduce()</title>
<updated>2014-03-24T19:35:10+00:00</updated>
<author>
<name>David Stevens</name>
<email>dlstevens@us.ibm.com</email>
</author>
<published>2014-03-24T14:39:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4b29dba9c085a4fb79058fb1c45a2f6257ca3dfa'/>
<id>4b29dba9c085a4fb79058fb1c45a2f6257ca3dfa</id>
<content type='text'>
The VXLAN neigh_reduce() code is completely non-functional since
check-in. Specific errors:

1) The original code drops all packets with a multicast destination address,
	even though neighbor solicitations are sent to the solicited-node
	address, a multicast address. The code after this check was never run.
2) The neighbor table lookup used the IPv6 header destination, which is the
	solicited node address, rather than the target address from the
	neighbor solicitation. So neighbor lookups would always fail if it
	got this far. Also for L3MISSes.
3) The code calls ndisc_send_na(), which does a send on the tunnel device.
	The context for neigh_reduce() is the transmit path, vxlan_xmit(),
	where the host or a bridge-attached neighbor is trying to transmit
	a neighbor solicitation. To respond to it, the tunnel endpoint needs
	to do a *receive* of the appropriate neighbor advertisement. Doing a
	send, would only try to send the advertisement, encapsulated, to the
	remote destinations in the fdb -- hosts that definitely did not do the
	corresponding solicitation.
4) The code uses the tunnel endpoint IPv6 forwarding flag to determine the
	isrouter flag in the advertisement. This has nothing to do with whether
	or not the target is a router, and generally won't be set since the
	tunnel endpoint is bridging, not routing, traffic.

	The patch below creates a proxy neighbor advertisement to respond to
neighbor solicitions as intended, providing proper IPv6 support for neighbor
reduction.

Signed-off-by: David L Stevens &lt;dlstevens@us.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The VXLAN neigh_reduce() code is completely non-functional since
check-in. Specific errors:

1) The original code drops all packets with a multicast destination address,
	even though neighbor solicitations are sent to the solicited-node
	address, a multicast address. The code after this check was never run.
2) The neighbor table lookup used the IPv6 header destination, which is the
	solicited node address, rather than the target address from the
	neighbor solicitation. So neighbor lookups would always fail if it
	got this far. Also for L3MISSes.
3) The code calls ndisc_send_na(), which does a send on the tunnel device.
	The context for neigh_reduce() is the transmit path, vxlan_xmit(),
	where the host or a bridge-attached neighbor is trying to transmit
	a neighbor solicitation. To respond to it, the tunnel endpoint needs
	to do a *receive* of the appropriate neighbor advertisement. Doing a
	send, would only try to send the advertisement, encapsulated, to the
	remote destinations in the fdb -- hosts that definitely did not do the
	corresponding solicitation.
4) The code uses the tunnel endpoint IPv6 forwarding flag to determine the
	isrouter flag in the advertisement. This has nothing to do with whether
	or not the target is a router, and generally won't be set since the
	tunnel endpoint is bridging, not routing, traffic.

	The patch below creates a proxy neighbor advertisement to respond to
neighbor solicitions as intended, providing proper IPv6 support for neighbor
reduction.

Signed-off-by: David L Stevens &lt;dlstevens@us.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vxlan: fix potential NULL dereference in arp_reduce()</title>
<updated>2014-03-18T20:09:34+00:00</updated>
<author>
<name>David Stevens</name>
<email>dlstevens@us.ibm.com</email>
</author>
<published>2014-03-18T16:32:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7346135dcd3f9b57f30a5512094848c678d7143e'/>
<id>7346135dcd3f9b57f30a5512094848c678d7143e</id>
<content type='text'>
This patch fixes a NULL pointer dereference in the event of an
skb allocation failure in arp_reduce().

Signed-Off-By: David L Stevens &lt;dlstevens@us.ibm.com&gt;
Acked-by: Cong Wang &lt;cwang@twopensource.com&gt;

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch fixes a NULL pointer dereference in the event of an
skb allocation failure in arp_reduce().

Signed-Off-By: David L Stevens &lt;dlstevens@us.ibm.com&gt;
Acked-by: Cong Wang &lt;cwang@twopensource.com&gt;

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vxlan: remove extra newline after function definition</title>
<updated>2014-02-02T00:58:02+00:00</updated>
<author>
<name>Daniel Baluta</name>
<email>dbaluta@ixiacom.com</email>
</author>
<published>2014-01-31T07:50:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4fe46b9a4d0b5eef96867e6d5134159e5a65d2a5'/>
<id>4fe46b9a4d0b5eef96867e6d5134159e5a65d2a5</id>
<content type='text'>
Signed-off-by: Daniel Baluta &lt;dbaluta@ixiacom.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Daniel Baluta &lt;dbaluta@ixiacom.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/vxlan: Go over all candidate streams for GRO matching</title>
<updated>2014-01-31T00:25:49+00:00</updated>
<author>
<name>Or Gerlitz</name>
<email>ogerlitz@mellanox.com</email>
</author>
<published>2014-01-29T16:10:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=920a0fde5a32fd40b4d3c94ad72f7fc7039db8e3'/>
<id>920a0fde5a32fd40b4d3c94ad72f7fc7039db8e3</id>
<content type='text'>
The loop in vxlan_gro_receive() over the current set of candidates for
coalescing was wrongly aborted once a match was found. In rare cases,
this can cause a false-positives matching in the next layer GRO checks.

Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The loop in vxlan_gro_receive() over the current set of candidates for
coalescing was wrongly aborted once a match was found. In rare cases,
this can cause a false-positives matching in the next layer GRO checks.

Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/vxlan: Share RX skb de-marking and checksum checks with ovs</title>
<updated>2014-01-23T21:30:03+00:00</updated>
<author>
<name>Or Gerlitz</name>
<email>ogerlitz@mellanox.com</email>
</author>
<published>2014-01-23T09:28:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d0bc65557ad09a57b4db176e9e3ccddb26971453'/>
<id>d0bc65557ad09a57b4db176e9e3ccddb26971453</id>
<content type='text'>
Make sure the practice set by commit 0afb166 "vxlan: Add capability
of Rx checksum offload for inner packet" is applied when the skb
goes through the portion of the RX code which is shared between
vxlan netdevices and ovs vxlan port instances.

Cc: Joseph Gasparakis &lt;joseph.gasparakis@intel.com&gt;
Cc: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make sure the practice set by commit 0afb166 "vxlan: Add capability
of Rx checksum offload for inner packet" is applied when the skb
goes through the portion of the RX code which is shared between
vxlan netdevices and ovs vxlan port instances.

Cc: Joseph Gasparakis &lt;joseph.gasparakis@intel.com&gt;
Cc: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: vxlan: convert to act as a pernet subsystem</title>
<updated>2014-01-23T01:43:38+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-01-22T20:07:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=783c14633571462a5537ee628e1df1ecb715a3a1'/>
<id>783c14633571462a5537ee628e1df1ecb715a3a1</id>
<content type='text'>
As per suggestion from Eric W. Biederman, vxlan should be using
{un,}register_pernet_subsys() instead of {un,}register_pernet_device()
to ensure the vxlan_net structure is initialized before and cleaned
up after all network devices in a given network namespace i.e. when
dealing with network notifiers. This is similarly handeled already in
commit 91e2ff3528ac ("net: Teach vlans to cleanup as a pernet subsystem")
and, thus, improves upon fd27e0d44a89 ("net: vxlan: do not use vxlan_net
before checking event type"). Just as in 91e2ff3528ac, we do not need
to explicitly handle deletion of vxlan devices as network namespace
exit calls dellink on all remaining virtual devices, and
rtnl_link_unregister() calls dellink on all outstanding devices in that
network namespace, so we can entirely drop the pernet exit operation
as well. Moreover, on vxlan module exit, rcu_barrier() is called by
netns since commit 3a765edadb28 ("netns: Add an explicit rcu_barrier
to unregister_pernet_{device|subsys}"), so this may be omitted. Tested
with various scenarios and works well on my side.

Suggested-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As per suggestion from Eric W. Biederman, vxlan should be using
{un,}register_pernet_subsys() instead of {un,}register_pernet_device()
to ensure the vxlan_net structure is initialized before and cleaned
up after all network devices in a given network namespace i.e. when
dealing with network notifiers. This is similarly handeled already in
commit 91e2ff3528ac ("net: Teach vlans to cleanup as a pernet subsystem")
and, thus, improves upon fd27e0d44a89 ("net: vxlan: do not use vxlan_net
before checking event type"). Just as in 91e2ff3528ac, we do not need
to explicitly handle deletion of vxlan devices as network namespace
exit calls dellink on all remaining virtual devices, and
rtnl_link_unregister() calls dellink on all outstanding devices in that
network namespace, so we can entirely drop the pernet exit operation
as well. Moreover, on vxlan module exit, rcu_barrier() is called by
netns since commit 3a765edadb28 ("netns: Add an explicit rcu_barrier
to unregister_pernet_{device|subsys}"), so this may be omitted. Tested
with various scenarios and works well on my side.

Suggested-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add GRO support for vxlan traffic</title>
<updated>2014-01-22T02:05:04+00:00</updated>
<author>
<name>Or Gerlitz</name>
<email>ogerlitz@mellanox.com</email>
</author>
<published>2014-01-20T11:59:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dc01e7d3447793fd9e4090aa9d50c549848b5a18'/>
<id>dc01e7d3447793fd9e4090aa9d50c549848b5a18</id>
<content type='text'>
Add GRO handlers for vxlann, by using the UDP GRO infrastructure.

For single TCP session that goes through vxlan tunneling I got nice
improvement from 6.8Gbs to 11.5Gbs

--&gt; UDP/VXLAN GRO disabled
$ netperf  -H 192.168.52.147 -c -C

$ netperf -t TCP_STREAM -H 192.168.52.147 -c -C
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  65536  65536    10.00      6799.75   12.54    24.79    0.604   1.195

--&gt; UDP/VXLAN GRO enabled

$ netperf -t TCP_STREAM -H 192.168.52.147 -c -C
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  65536  65536    10.00      11562.72   24.90    20.34    0.706   0.577

Signed-off-by: Shlomo Pongratz &lt;shlomop@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add GRO handlers for vxlann, by using the UDP GRO infrastructure.

For single TCP session that goes through vxlan tunneling I got nice
improvement from 6.8Gbs to 11.5Gbs

--&gt; UDP/VXLAN GRO disabled
$ netperf  -H 192.168.52.147 -c -C

$ netperf -t TCP_STREAM -H 192.168.52.147 -c -C
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  65536  65536    10.00      6799.75   12.54    24.79    0.604   1.195

--&gt; UDP/VXLAN GRO enabled

$ netperf -t TCP_STREAM -H 192.168.52.147 -c -C
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  65536  65536    10.00      11562.72   24.90    20.34    0.706   0.577

Signed-off-by: Shlomo Pongratz &lt;shlomop@mellanox.com&gt;
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add vxlan description</title>
<updated>2014-01-22T00:28:07+00:00</updated>
<author>
<name>Jesse Brandeburg</name>
<email>jesse.brandeburg@intel.com</email>
</author>
<published>2014-01-17T19:00:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ead5139aa10d7df970f66a291bf450966ae7b748'/>
<id>ead5139aa10d7df970f66a291bf450966ae7b748</id>
<content type='text'>
Add a description to the vxlan module, helping save the world
from the minions of destruction and confusion.

Signed-off-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
CC: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a description to the vxlan module, helping save the world
from the minions of destruction and confusion.

Signed-off-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
CC: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: vxlan: do not use vxlan_net before checking event type</title>
<updated>2014-01-18T02:49:18+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-01-17T11:55:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fd27e0d44a893b45832df3cb8e021bd1d773a73f'/>
<id>fd27e0d44a893b45832df3cb8e021bd1d773a73f</id>
<content type='text'>
Jesse Brandeburg reported that commit acaf4e70997f caused a panic
when adding a network namespace while vxlan module was present in
the system:

[&lt;ffffffff814d0865&gt;] vxlan_lowerdev_event+0xf5/0x100
[&lt;ffffffff816e9e5d&gt;] notifier_call_chain+0x4d/0x70
[&lt;ffffffff810912be&gt;] __raw_notifier_call_chain+0xe/0x10
[&lt;ffffffff810912d6&gt;] raw_notifier_call_chain+0x16/0x20
[&lt;ffffffff815d9610&gt;] call_netdevice_notifiers_info+0x40/0x70
[&lt;ffffffff815d9656&gt;] call_netdevice_notifiers+0x16/0x20
[&lt;ffffffff815e1bce&gt;] register_netdevice+0x1be/0x3a0
[&lt;ffffffff815e1dce&gt;] register_netdev+0x1e/0x30
[&lt;ffffffff814cb94a&gt;] loopback_net_init+0x4a/0xb0
[&lt;ffffffffa016ed6e&gt;] ? lockd_init_net+0x6e/0xb0 [lockd]
[&lt;ffffffff815d6bac&gt;] ops_init+0x4c/0x150
[&lt;ffffffff815d6d23&gt;] setup_net+0x73/0x110
[&lt;ffffffff815d725b&gt;] copy_net_ns+0x7b/0x100
[&lt;ffffffff81090e11&gt;] create_new_namespaces+0x101/0x1b0
[&lt;ffffffff81090f45&gt;] copy_namespaces+0x85/0xb0
[&lt;ffffffff810693d5&gt;] copy_process.part.26+0x935/0x1500
[&lt;ffffffff811d5186&gt;] ? mntput+0x26/0x40
[&lt;ffffffff8106a15c&gt;] do_fork+0xbc/0x2e0
[&lt;ffffffff811b7f2e&gt;] ? ____fput+0xe/0x10
[&lt;ffffffff81089c5c&gt;] ? task_work_run+0xac/0xe0
[&lt;ffffffff8106a406&gt;] SyS_clone+0x16/0x20
[&lt;ffffffff816ee689&gt;] stub_clone+0x69/0x90
[&lt;ffffffff816ee329&gt;] ? system_call_fastpath+0x16/0x1b

Apparently loopback device is being registered first and thus we
receive an event notification when vxlan_net is not ready. Hence,
when we call net_generic() and request vxlan_net_id, we seem to
access garbage at that point in time. In setup_net() where we set
up a newly allocated network namespace, we traverse the list of
pernet ops ...

list_for_each_entry(ops, &amp;pernet_list, list) {
	error = ops_init(ops, net);
	if (error &lt; 0)
		goto out_undo;
}

... and loopback_net_init() is invoked first here, so in the middle
of setup_net() we get this notification in vxlan. As currently we
only care about devices that unregister, move access through
net_generic() there. Fix is based on Cong Wang's proposal, but
only changes what is needed here. It sucks a bit as we only work
around the actual cure: right now it seems the only way to check if
a netns actually finished traversing all init ops would be to check
if it's part of net_namespace_list. But that I find quite expensive
each time we go through a notifier callback. Anyway, did a couple
of tests and it seems good for now.

Fixes: acaf4e70997f ("net: vxlan: when lower dev unregisters remove vxlan dev as well")
Reported-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Tested-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Jesse Brandeburg reported that commit acaf4e70997f caused a panic
when adding a network namespace while vxlan module was present in
the system:

[&lt;ffffffff814d0865&gt;] vxlan_lowerdev_event+0xf5/0x100
[&lt;ffffffff816e9e5d&gt;] notifier_call_chain+0x4d/0x70
[&lt;ffffffff810912be&gt;] __raw_notifier_call_chain+0xe/0x10
[&lt;ffffffff810912d6&gt;] raw_notifier_call_chain+0x16/0x20
[&lt;ffffffff815d9610&gt;] call_netdevice_notifiers_info+0x40/0x70
[&lt;ffffffff815d9656&gt;] call_netdevice_notifiers+0x16/0x20
[&lt;ffffffff815e1bce&gt;] register_netdevice+0x1be/0x3a0
[&lt;ffffffff815e1dce&gt;] register_netdev+0x1e/0x30
[&lt;ffffffff814cb94a&gt;] loopback_net_init+0x4a/0xb0
[&lt;ffffffffa016ed6e&gt;] ? lockd_init_net+0x6e/0xb0 [lockd]
[&lt;ffffffff815d6bac&gt;] ops_init+0x4c/0x150
[&lt;ffffffff815d6d23&gt;] setup_net+0x73/0x110
[&lt;ffffffff815d725b&gt;] copy_net_ns+0x7b/0x100
[&lt;ffffffff81090e11&gt;] create_new_namespaces+0x101/0x1b0
[&lt;ffffffff81090f45&gt;] copy_namespaces+0x85/0xb0
[&lt;ffffffff810693d5&gt;] copy_process.part.26+0x935/0x1500
[&lt;ffffffff811d5186&gt;] ? mntput+0x26/0x40
[&lt;ffffffff8106a15c&gt;] do_fork+0xbc/0x2e0
[&lt;ffffffff811b7f2e&gt;] ? ____fput+0xe/0x10
[&lt;ffffffff81089c5c&gt;] ? task_work_run+0xac/0xe0
[&lt;ffffffff8106a406&gt;] SyS_clone+0x16/0x20
[&lt;ffffffff816ee689&gt;] stub_clone+0x69/0x90
[&lt;ffffffff816ee329&gt;] ? system_call_fastpath+0x16/0x1b

Apparently loopback device is being registered first and thus we
receive an event notification when vxlan_net is not ready. Hence,
when we call net_generic() and request vxlan_net_id, we seem to
access garbage at that point in time. In setup_net() where we set
up a newly allocated network namespace, we traverse the list of
pernet ops ...

list_for_each_entry(ops, &amp;pernet_list, list) {
	error = ops_init(ops, net);
	if (error &lt; 0)
		goto out_undo;
}

... and loopback_net_init() is invoked first here, so in the middle
of setup_net() we get this notification in vxlan. As currently we
only care about devices that unregister, move access through
net_generic() there. Fix is based on Cong Wang's proposal, but
only changes what is needed here. It sucks a bit as we only work
around the actual cure: right now it seems the only way to check if
a netns actually finished traversing all init ops would be to check
if it's part of net_namespace_list. But that I find quite expensive
each time we go through a notifier callback. Anyway, did a couple
of tests and it seems good for now.

Fixes: acaf4e70997f ("net: vxlan: when lower dev unregisters remove vxlan dev as well")
Reported-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Tested-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: vxlan: properly cleanup devs on module unload</title>
<updated>2014-01-15T07:38:39+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-01-13T17:41:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8425783c0f4292ca5be35097a467e1240735c257'/>
<id>8425783c0f4292ca5be35097a467e1240735c257</id>
<content type='text'>
We should use vxlan_dellink() handler in vxlan_exit_net(), since
i) we're not in fast-path and we should be consistent in dismantle
just as we would remove a device through rtnl ops, and more
importantly, ii) in case future code will kfree() memory in
vxlan_dellink(), we would leak it right here unnoticed. Therefore,
do not only half of the cleanup work, but make it properly.

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We should use vxlan_dellink() handler in vxlan_exit_net(), since
i) we're not in fast-path and we should be consistent in dismantle
just as we would remove a device through rtnl ops, and more
importantly, ii) in case future code will kfree() memory in
vxlan_dellink(), we would leak it right here unnoticed. Therefore,
do not only half of the cleanup work, but make it properly.

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
