<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/openvswitch/vport-internal_dev.c, branch v4.13</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>net: Fix inconsistent teardown and release of private netdev state.</title>
<updated>2017-06-07T19:53:24+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-05-08T16:52:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cf124db566e6b036b8bcbe8decbed740bdfac8c6'/>
<id>cf124db566e6b036b8bcbe8decbed740bdfac8c6</id>
<content type='text'>
Network devices can allocate reasources and private memory using
netdev_ops-&gt;ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops-&gt;ndo_uninit() or netdev-&gt;destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops-&gt;ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev-&gt;destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev-&gt;destructor()
almost universally does also a free_netdev().

This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops-&gt;ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops-&gt;ndo_uninit().  But
it is not able to invoke netdev-&gt;destructor().

This is because netdev-&gt;destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev-&gt;destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev-&gt;destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev-&gt;priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev-&gt;destructor(), except for
free_netdev().

netdev-&gt;needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
ndo_ops-&gt;ndo_init() succeeds, by invoking both ndo_ops-&gt;ndo_uninit()
and netdev-&gt;priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev-&gt;priv_destructor() and optionally call free_netdev().

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Network devices can allocate reasources and private memory using
netdev_ops-&gt;ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops-&gt;ndo_uninit() or netdev-&gt;destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops-&gt;ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev-&gt;destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev-&gt;destructor()
almost universally does also a free_netdev().

This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops-&gt;ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops-&gt;ndo_uninit().  But
it is not able to invoke netdev-&gt;destructor().

This is because netdev-&gt;destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev-&gt;destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev-&gt;destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev-&gt;priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev-&gt;destructor(), except for
free_netdev().

netdev-&gt;needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
ndo_ops-&gt;ndo_init() succeeds, by invoking both ndo_ops-&gt;ndo_uninit()
and netdev-&gt;priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev-&gt;priv_destructor() and optionally call free_netdev().

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: Set internal device max mtu to ETH_MAX_MTU.</title>
<updated>2017-02-15T17:40:27+00:00</updated>
<author>
<name>Jarno Rajahalme</name>
<email>jarno@ovn.org</email>
</author>
<published>2017-02-15T05:16:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=425df17ce3a26d98f76e2b6b0af2acf4aeb0b026'/>
<id>425df17ce3a26d98f76e2b6b0af2acf4aeb0b026</id>
<content type='text'>
Commit 91572088e3fd ("net: use core MTU range checking in core net
infra") changed the openvswitch internal device to use the core net
infra for controlling the MTU range, but failed to actually set the
max_mtu as described in the commit message, which now defaults to
ETH_DATA_LEN.

This patch fixes this by setting max_mtu to ETH_MAX_MTU after
ether_setup() call.

Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Jarno Rajahalme &lt;jarno@ovn.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 91572088e3fd ("net: use core MTU range checking in core net
infra") changed the openvswitch internal device to use the core net
infra for controlling the MTU range, but failed to actually set the
max_mtu as described in the commit message, which now defaults to
ETH_DATA_LEN.

This patch fixes this by setting max_mtu to ETH_MAX_MTU after
ether_setup() call.

Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Jarno Rajahalme &lt;jarno@ovn.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: make ndo_get_stats64 a void function</title>
<updated>2017-01-08T22:51:44+00:00</updated>
<author>
<name>stephen hemminger</name>
<email>stephen@networkplumber.org</email>
</author>
<published>2017-01-07T03:12:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bc1f44709cf27fb2a5766cadafe7e2ad5e9cb221'/>
<id>bc1f44709cf27fb2a5766cadafe7e2ad5e9cb221</id>
<content type='text'>
The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.

Signed-off-by: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.

Signed-off-by: Stephen Hemminger &lt;sthemmin@microsoft.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: use core MTU range checking in core net infra</title>
<updated>2016-10-20T18:51:09+00:00</updated>
<author>
<name>Jarod Wilson</name>
<email>jarod@redhat.com</email>
</author>
<published>2016-10-20T17:55:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=91572088e3fdbf4fe31cf397926d8b890fdb3237'/>
<id>91572088e3fdbf4fe31cf397926d8b890fdb3237</id>
<content type='text'>
geneve:
- Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu
- This one isn't quite as straight-forward as others, could use some
  closer inspection and testing

macvlan:
- set min/max_mtu

tun:
- set min/max_mtu, remove tun_net_change_mtu

vxlan:
- Merge __vxlan_change_mtu back into vxlan_change_mtu
- Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function
- This one is also not as straight-forward and could use closer inspection
  and testing from vxlan folks

bridge:
- set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function

openvswitch:
- set min/max_mtu, remove internal_dev_change_mtu
- note: max_mtu wasn't checked previously, it's been set to 65535, which
  is the largest possible size supported

sch_teql:
- set min/max_mtu (note: max_mtu previously unchecked, used max of 65535)

macsec:
- min_mtu = 0, max_mtu = 65535

macvlan:
- min_mtu = 0, max_mtu = 65535

ntb_netdev:
- min_mtu = 0, max_mtu = 65535

veth:
- min_mtu = 68, max_mtu = 65535

8021q:
- min_mtu = 0, max_mtu = 65535

CC: netdev@vger.kernel.org
CC: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
CC: Tom Herbert &lt;tom@herbertland.com&gt;
CC: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
CC: Alexander Duyck &lt;alexander.h.duyck@intel.com&gt;
CC: Paolo Abeni &lt;pabeni@redhat.com&gt;
CC: Jiri Benc &lt;jbenc@redhat.com&gt;
CC: WANG Cong &lt;xiyou.wangcong@gmail.com&gt;
CC: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
CC: Pravin B Shelar &lt;pshelar@ovn.org&gt;
CC: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
CC: Patrick McHardy &lt;kaber@trash.net&gt;
CC: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
CC: Pravin Shelar &lt;pshelar@nicira.com&gt;
CC: Maxim Krasnyansky &lt;maxk@qti.qualcomm.com&gt;
Signed-off-by: Jarod Wilson &lt;jarod@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
geneve:
- Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu
- This one isn't quite as straight-forward as others, could use some
  closer inspection and testing

macvlan:
- set min/max_mtu

tun:
- set min/max_mtu, remove tun_net_change_mtu

vxlan:
- Merge __vxlan_change_mtu back into vxlan_change_mtu
- Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function
- This one is also not as straight-forward and could use closer inspection
  and testing from vxlan folks

bridge:
- set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function

openvswitch:
- set min/max_mtu, remove internal_dev_change_mtu
- note: max_mtu wasn't checked previously, it's been set to 65535, which
  is the largest possible size supported

sch_teql:
- set min/max_mtu (note: max_mtu previously unchecked, used max of 65535)

macsec:
- min_mtu = 0, max_mtu = 65535

macvlan:
- min_mtu = 0, max_mtu = 65535

ntb_netdev:
- min_mtu = 0, max_mtu = 65535

veth:
- min_mtu = 68, max_mtu = 65535

8021q:
- min_mtu = 0, max_mtu = 65535

CC: netdev@vger.kernel.org
CC: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
CC: Tom Herbert &lt;tom@herbertland.com&gt;
CC: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
CC: Alexander Duyck &lt;alexander.h.duyck@intel.com&gt;
CC: Paolo Abeni &lt;pabeni@redhat.com&gt;
CC: Jiri Benc &lt;jbenc@redhat.com&gt;
CC: WANG Cong &lt;xiyou.wangcong@gmail.com&gt;
CC: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
CC: Pravin B Shelar &lt;pshelar@ovn.org&gt;
CC: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
CC: Patrick McHardy &lt;kaber@trash.net&gt;
CC: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
CC: Pravin Shelar &lt;pshelar@nicira.com&gt;
CC: Maxim Krasnyansky &lt;maxk@qti.qualcomm.com&gt;
Signed-off-by: Jarod Wilson &lt;jarod@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev</title>
<updated>2016-10-13T14:03:23+00:00</updated>
<author>
<name>Jiri Benc</name>
<email>jbenc@redhat.com</email>
</author>
<published>2016-10-10T15:02:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3145c037e74926dea9241a3f68ada6f294b0119a'/>
<id>3145c037e74926dea9241a3f68ada6f294b0119a</id>
<content type='text'>
The internal device does support 802.1AD offloading since 018c1dda5ff1
("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink
attributes").

Signed-off-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Acked-by: Pravin B Shelar &lt;pshelar@ovn.org&gt;
Acked-by: Eric Garver &lt;e@erig.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The internal device does support 802.1AD offloading since 018c1dda5ff1
("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink
attributes").

Signed-off-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Acked-by: Pravin B Shelar &lt;pshelar@ovn.org&gt;
Acked-by: Eric Garver &lt;e@erig.me&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>OVS: Ignore negative headroom value</title>
<updated>2016-08-06T04:06:11+00:00</updated>
<author>
<name>Ian Wienand</name>
<email>iwienand@redhat.com</email>
</author>
<published>2016-08-03T05:44:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ef9f289c4e698054e5687edb54f0da3cdc9173a'/>
<id>5ef9f289c4e698054e5687edb54f0da3cdc9173a</id>
<content type='text'>
net_device-&gt;ndo_set_rx_headroom (introduced in
871b642adebe300be2e50aa5f65a418510f636ec) says

  "Setting a negtaive value reset the rx headroom
   to the default value".

It seems that the OVS implementation in
3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7 overlooked this and sets
dev-&gt;needed_headroom unconditionally.

This doesn't have an immediate effect, but can mess up later
LL_RESERVED_SPACE calculations, such as done in
net/ipv6/mcast.c:mld_newpack.  For reference, this issue was found
from a skb_panic raised there after the length calculations had given
the wrong result.

Note the other current users of this interface
(drivers/net/tun.c:tun_set_headroom and
drivers/net/veth.c:veth_set_rx_headroom) are both checking this
correctly thus need no modification.

Thanks to Ben for some pointers from the crash dumps!

Cc: Benjamin Poirier &lt;bpoirier@suse.com&gt;
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1361414
Signed-off-by: Ian Wienand &lt;iwienand@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
net_device-&gt;ndo_set_rx_headroom (introduced in
871b642adebe300be2e50aa5f65a418510f636ec) says

  "Setting a negtaive value reset the rx headroom
   to the default value".

It seems that the OVS implementation in
3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7 overlooked this and sets
dev-&gt;needed_headroom unconditionally.

This doesn't have an immediate effect, but can mess up later
LL_RESERVED_SPACE calculations, such as done in
net/ipv6/mcast.c:mld_newpack.  For reference, this issue was found
from a skb_panic raised there after the length calculations had given
the wrong result.

Note the other current users of this interface
(drivers/net/tun.c:tun_set_headroom and
drivers/net/veth.c:veth_set_rx_headroom) are both checking this
correctly thus need no modification.

Thanks to Ben for some pointers from the crash dumps!

Cc: Benjamin Poirier &lt;bpoirier@suse.com&gt;
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1361414
Signed-off-by: Ian Wienand &lt;iwienand@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ovs: set name assign type of internal port</title>
<updated>2016-06-02T22:05:47+00:00</updated>
<author>
<name>Zhang Shengju</name>
<email>zhangshengju@cmss.chinamobile.com</email>
</author>
<published>2016-05-31T13:41:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=684ff4ef5edd758c47929b852b4ea79be56f8bc0'/>
<id>684ff4ef5edd758c47929b852b4ea79be56f8bc0</id>
<content type='text'>
Set name_assign_type of internal port to NET_NAME_USER.

Signed-off-by: Zhang Shengju &lt;zhangshengju@cmss.chinamobile.com&gt;
Acked-by: Pravin B Shelar &lt;pshelar@ovn.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Set name_assign_type of internal port to NET_NAME_USER.

Signed-off-by: Zhang Shengju &lt;zhangshengju@cmss.chinamobile.com&gt;
Acked-by: Pravin B Shelar &lt;pshelar@ovn.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: Convert to using IFF_NO_QUEUE</title>
<updated>2016-04-17T02:02:14+00:00</updated>
<author>
<name>Phil Sutter</name>
<email>phil@nwl.cc</email>
</author>
<published>2016-04-15T17:14:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4272cc51a6dcf2c086863372fd593809ffced7d5'/>
<id>4272cc51a6dcf2c086863372fd593809ffced7d5</id>
<content type='text'>
Cc: Pravin Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cc: Pravin Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ovs: internal_set_rx_headroom() can be static</title>
<updated>2016-03-18T21:50:36+00:00</updated>
<author>
<name>Wu Fengguang</name>
<email>fengguang.wu@intel.com</email>
</author>
<published>2016-03-18T16:54:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e014e8468552236f0f0cb64c7c341c1dce506070'/>
<id>e014e8468552236f0f0cb64c7c341c1dce506070</id>
<content type='text'>
Signed-off-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ovs: propagate per dp max headroom to all vports</title>
<updated>2016-03-01T20:54:30+00:00</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2016-02-26T09:45:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7'/>
<id>3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7</id>
<content type='text'>
This patch implements bookkeeping support to compute the maximum
headroom for all the devices in each datapath. When said value
changes, the underlying devs are notified via the
ndo_set_rx_headroom method.

This also increases the internal vports xmit performance.

Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch implements bookkeeping support to compute the maximum
headroom for all the devices in each datapath. When said value
changes, the underlying devs are notified via the
ndo_set_rx_headroom method.

This also increases the internal vports xmit performance.

Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
