<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/net/macvlan.c, branch v2.6.39</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>macvlan: Fix use after free of struct macvlan_port.</title>
<updated>2011-03-22T01:22:22+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@aristanetworks.com</email>
</author>
<published>2011-03-22T01:22:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d5cd92448fded12c91f7574e49747c5f7d975a8d'/>
<id>d5cd92448fded12c91f7574e49747c5f7d975a8d</id>
<content type='text'>
When the macvlan driver was extended to call unregisgter_netdevice_queue
in 23289a37e2b127dfc4de1313fba15bb4c9f0cd5b, a use after free of struct
macvlan_port was introduced.  The code in dellink relied on unregister_netdevice
actually unregistering the net device so it would be safe to free macvlan_port.

Since unregister_netdevice_queue can just queue up the unregister instead of
performing the unregiser immediately we free the macvlan_port too soon and
then the code in macvlan_stop removes the macaddress for the set of macaddress
to listen for and uses memory that has already been freed.

To fix this add a reference count to track when it is safe to free the macvlan_port
and move the call of macvlan_port_destroy into macvlan_uninit which is guaranteed
to be called after the final macvlan_port_close.

Signed-off-by: Eric W. Biederman &lt;ebiederm@aristanetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the macvlan driver was extended to call unregisgter_netdevice_queue
in 23289a37e2b127dfc4de1313fba15bb4c9f0cd5b, a use after free of struct
macvlan_port was introduced.  The code in dellink relied on unregister_netdevice
actually unregistering the net device so it would be safe to free macvlan_port.

Since unregister_netdevice_queue can just queue up the unregister instead of
performing the unregiser immediately we free the macvlan_port too soon and
then the code in macvlan_stop removes the macaddress for the set of macaddress
to listen for and uses memory that has already been freed.

To fix this add a reference count to track when it is safe to free the macvlan_port
and move the call of macvlan_port_destroy into macvlan_uninit which is guaranteed
to be called after the final macvlan_port_close.

Signed-off-by: Eric W. Biederman &lt;ebiederm@aristanetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: introduce rx_handler results and logic around that</title>
<updated>2011-03-16T19:53:54+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jpirko@redhat.com</email>
</author>
<published>2011-03-12T03:14:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8a4eb5734e8d1dc60a8c28576bbbdfdcc643626d'/>
<id>8a4eb5734e8d1dc60a8c28576bbbdfdcc643626d</id>
<content type='text'>
This patch allows rx_handlers to better signalize what to do next to
it's caller. That makes skb-&gt;deliver_no_wcard no longer needed.

kernel-doc for rx_handler_result is taken from Nicolas' patch.

Signed-off-by: Jiri Pirko &lt;jpirko@redhat.com&gt;
Reviewed-by: Nicolas de Pesloüan &lt;nicolas.2p.debian@free.fr&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch allows rx_handlers to better signalize what to do next to
it's caller. That makes skb-&gt;deliver_no_wcard no longer needed.

kernel-doc for rx_handler_result is taken from Nicolas' patch.

Signed-off-by: Jiri Pirko &lt;jpirko@redhat.com&gt;
Reviewed-by: Nicolas de Pesloüan &lt;nicolas.2p.debian@free.fr&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>macvlan : fix checksums error when we are in bridge mode</title>
<updated>2011-03-14T23:54:44+00:00</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@free.fr</email>
</author>
<published>2011-03-14T06:08:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=12a2856b604476c27d85a5f9a57ae1661fc46019'/>
<id>12a2856b604476c27d85a5f9a57ae1661fc46019</id>
<content type='text'>
When the lower device has offloading capabilities, the packets checksums
are not computed. That leads to have any macvlan port in bridge mode to
not work because the packets are dropped due to a bad checksum.

If the macvlan is in bridge mode, the packet is forwarded to another
macvlan port and reach the network stack where it looks for a checksum
but this one was not computed due to the offloading of the lower device.
In this case, we have to set the packet with CHECKSUM_UNNECESSARY
when it is forwarded to a bridged port and restore the previous value of
ip_summed when the packet goes to the lowerdev.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@free.fr&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Andrian Nord &lt;nightnord@gmail.com&gt;
Acked-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the lower device has offloading capabilities, the packets checksums
are not computed. That leads to have any macvlan port in bridge mode to
not work because the packets are dropped due to a bad checksum.

If the macvlan is in bridge mode, the packet is forwarded to another
macvlan port and reach the network stack where it looks for a checksum
but this one was not computed due to the offloading of the lower device.
In this case, we have to set the packet with CHECKSUM_UNNECESSARY
when it is forwarded to a bridged port and restore the previous value of
ip_summed when the packet goes to the lowerdev.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@free.fr&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Andrian Nord &lt;nightnord@gmail.com&gt;
Acked-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>macvlan: Introduce 'passthru' mode to takeover the underlying device</title>
<updated>2010-11-22T16:24:29+00:00</updated>
<author>
<name>Sridhar Samudrala</name>
<email>sri@us.ibm.com</email>
</author>
<published>2010-10-28T13:10:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eb06acdc85585f28864261f28659157848762ee4'/>
<id>eb06acdc85585f28864261f28659157848762ee4</id>
<content type='text'>
With the current default 'vepa' mode, a KVM guest using virtio with
macvtap backend has the following limitations.
- cannot change/add a mac address on the guest virtio-net
- cannot create a vlan device on the guest virtio-net
- cannot enable promiscuous mode on guest virtio-net

To address these limitations, this patch introduces a new mode called
'passthru' when creating a macvlan device which allows takeover of the
underlying device and passing it to a guest using virtio with macvtap
backend.

Only one macvlan device is allowed in passthru mode and it inherits
the mac address from the underlying device and sets it in promiscuous
mode to receive and forward all the packets.

Signed-off-by: Sridhar Samudrala &lt;sri@us.ibm.com&gt;

-------------------------------------------------------------------------
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With the current default 'vepa' mode, a KVM guest using virtio with
macvtap backend has the following limitations.
- cannot change/add a mac address on the guest virtio-net
- cannot create a vlan device on the guest virtio-net
- cannot enable promiscuous mode on guest virtio-net

To address these limitations, this patch introduces a new mode called
'passthru' when creating a macvlan device which allows takeover of the
underlying device and passing it to a guest using virtio with macvtap
backend.

Only one macvlan device is allowed in passthru mode and it inherits
the mac address from the underlying device and sets it in promiscuous
mode to receive and forward all the packets.

Signed-off-by: Sridhar Samudrala &lt;sri@us.ibm.com&gt;

-------------------------------------------------------------------------
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>macvlan: lockless tx path</title>
<updated>2010-11-16T18:58:30+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-11-10T21:14:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8ffab51b3dfc54876f145f15b351c41f3f703195'/>
<id>8ffab51b3dfc54876f145f15b351c41f3f703195</id>
<content type='text'>
macvlan is a stacked device, like tunnels. We should use the lockless
mechanism we are using in tunnels and loopback.

This patch completely removes locking in TX path.

tx stat counters are added into existing percpu stat structure, renamed
from rx_stats to pcpu_stats.

Note : this reverts commit 2c11455321f37 (macvlan: add multiqueue
capability)

Note : rx_errors converted to a 32bit counter, like tx_dropped, since
they dont need 64bit range.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Ben Greear &lt;greearb@candelatech.com&gt;
Cc: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Acked-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
macvlan is a stacked device, like tunnels. We should use the lockless
mechanism we are using in tunnels and loopback.

This patch completely removes locking in TX path.

tx stat counters are added into existing percpu stat structure, renamed
from rx_stats to pcpu_stats.

Note : this reverts commit 2c11455321f37 (macvlan: add multiqueue
capability)

Note : rx_errors converted to a 32bit counter, like tx_dropped, since
they dont need 64bit range.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Ben Greear &lt;greearb@candelatech.com&gt;
Cc: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Acked-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netns: keep vlan slaves on master netns move</title>
<updated>2010-09-17T23:46:04+00:00</updated>
<author>
<name>David Lamparter</name>
<email>equinox@diac24.net</email>
</author>
<published>2010-09-17T03:22:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3b27e105550f7c4a79ecb6d6a9c49c651c59ae9b'/>
<id>3b27e105550f7c4a79ecb6d6a9c49c651c59ae9b</id>
<content type='text'>
previously, if a vlan master device was moved from one network namespace
to another, all 802.1q and macvlan slaves were deleted.

we can use dev-&gt;reg_state to figure out whether dev_change_net_namespace
is happening, since that won't set dev-&gt;reg_state NETREG_UNREGISTERING.
so, this changes 8021q and macvlan to ignore NETDEV_UNREGISTER when
reg_state is not NETREG_UNREGISTERING.

Signed-off-by: David Lamparter &lt;equinox@diac24.net&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Daniel Lezcano &lt;daniel.lezcano@free.fr&gt;
Acked-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
previously, if a vlan master device was moved from one network namespace
to another, all 802.1q and macvlan slaves were deleted.

we can use dev-&gt;reg_state to figure out whether dev_change_net_namespace
is happening, since that won't set dev-&gt;reg_state NETREG_UNREGISTERING.
so, this changes 8021q and macvlan to ignore NETDEV_UNREGISTER when
reg_state is not NETREG_UNREGISTERING.

Signed-off-by: David Lamparter &lt;equinox@diac24.net&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Daniel Lezcano &lt;daniel.lezcano@free.fr&gt;
Acked-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>macvlan: Fix rx counters update in macvlan_handle_frame()</title>
<updated>2010-07-28T04:02:42+00:00</updated>
<author>
<name>Sridhar Samudrala</name>
<email>sri@us.ibm.com</email>
</author>
<published>2010-07-27T09:10:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ba01877f56c3244b21746d3f1537f7647ed97984'/>
<id>ba01877f56c3244b21746d3f1537f7647ed97984</id>
<content type='text'>
Fix macvlan_handle_frame() to update the rx counters based
on the return value of the vlan-&gt;receive call.

Updated the patch to not do any packet count drops when the interface
is down based on Herber'ts comments.

Signed-off-by: Sridhar Samudrala &lt;sri@us.ibm.com&gt;
Acked-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix macvlan_handle_frame() to update the rx counters based
on the return value of the vlan-&gt;receive call.

Updated the patch to not do any packet count drops when the interface
is down based on Herber'ts comments.

Signed-off-by: Sridhar Samudrala &lt;sri@us.ibm.com&gt;
Acked-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2010-07-28T04:01:35+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2010-07-28T04:01:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bb7e95c8fd859922c6cf3ebbb3a8546007df1748'/>
<id>bb7e95c8fd859922c6cf3ebbb3a8546007df1748</id>
<content type='text'>
Conflicts:
	drivers/net/bnx2x_main.c

Merge bnx2x bug fixes in by hand... :-/

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/bnx2x_main.c

Merge bnx2x bug fixes in by hand... :-/

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>macvtap: Limit packet queue length</title>
<updated>2010-07-22T20:08:56+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2010-07-21T21:44:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8a35747a5d13b99e076b0222729e0caa48cb69b6'/>
<id>8a35747a5d13b99e076b0222729e0caa48cb69b6</id>
<content type='text'>
Mark Wagner reported OOM symptoms when sending UDP traffic over
a macvtap link to a kvm receiver.

This appears to be caused by the fact that macvtap packet queues
are unlimited in length.  This means that if the receiver can't
keep up with the rate of flow, then we will hit OOM. Of course
it gets worse if the OOM killer then decides to kill the receiver.

This patch imposes a cap on the packet queue length, in the same
way as the tuntap driver, using the device TX queue length.

Please note that macvtap currently has no way of giving congestion
notification, that means the software device TX queue cannot be
used and packets will always be dropped once the macvtap driver
queue fills up.

This shouldn't be a great problem for the scenario where macvtap
is used to feed a kvm receiver, as the traffic is most likely
external in origin so congestion notification can't be applied
anyway.

Of course, if anybody decides to complain about guest-to-guest
UDP packet loss down the track, then we may have to revisit this.

Incidentally, this patch also fixes a real memory leak when
macvtap_get_queue fails.

Chris Wright noticed that for this patch to work, we need a
non-zero TX queue length.  This patch includes his work to change
the default macvtap TX queue length to 500.

Reported-by: Mark Wagner &lt;mwagner@redhat.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Acked-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mark Wagner reported OOM symptoms when sending UDP traffic over
a macvtap link to a kvm receiver.

This appears to be caused by the fact that macvtap packet queues
are unlimited in length.  This means that if the receiver can't
keep up with the rate of flow, then we will hit OOM. Of course
it gets worse if the OOM killer then decides to kill the receiver.

This patch imposes a cap on the packet queue length, in the same
way as the tuntap driver, using the device TX queue length.

Please note that macvtap currently has no way of giving congestion
notification, that means the software device TX queue cannot be
used and packets will always be dropped once the macvtap driver
queue fills up.

This shouldn't be a great problem for the scenario where macvtap
is used to feed a kvm receiver, as the traffic is most likely
external in origin so congestion notification can't be applied
anyway.

Of course, if anybody decides to complain about guest-to-guest
UDP packet loss down the track, then we may have to revisit this.

Incidentally, this patch also fixes a real memory leak when
macvtap_get_queue fails.

Chris Wright noticed that for this patch to work, we need a
non-zero TX queue length.  This patch includes his work to change
the default macvtap TX queue length to 500.

Reported-by: Mark Wagner &lt;mwagner@redhat.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Acked-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Get rid of rtnl_link_stats64 / net_device_stats union</title>
<updated>2010-07-10T00:41:56+00:00</updated>
<author>
<name>Ben Hutchings</name>
<email>bhutchings@solarflare.com</email>
</author>
<published>2010-07-09T09:11:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3cfde79c6c7c8002375c4a8e5be7f602fbb9675d'/>
<id>3cfde79c6c7c8002375c4a8e5be7f602fbb9675d</id>
<content type='text'>
In commit be1f3c2c027cc5ad735df6a45a542ed1db7ec48b "net: Enable 64-bit
net device statistics on 32-bit architectures" I redefined struct
net_device_stats so that it could be used in a union with struct
rtnl_link_stats64, avoiding the need for explicit copying or
conversion between the two.  However, this is unsafe because there is
no locking required and no lock consistently held around calls to
dev_get_stats() and use of the statistics structure it returns.

In commit 28172739f0a276eb8d6ca917b3974c2edb036da3 "net: fix 64 bit
counters on 32 bit arches" Eric Dumazet dealt with that problem by
requiring callers of dev_get_stats() to provide storage for the
result.  This means that the net_device::stats64 field and the padding
in struct net_device_stats are now redundant, so remove them.

Update the comment on net_device_ops::ndo_get_stats64 to reflect its
new usage.

Change dev_txq_stats_fold() to use struct rtnl_link_stats64, since
that is what all its callers are really using and it is no longer
going to be compatible with struct net_device_stats.

Eric Dumazet suggested the separate function for the structure
conversion.

Signed-off-by: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Acked-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In commit be1f3c2c027cc5ad735df6a45a542ed1db7ec48b "net: Enable 64-bit
net device statistics on 32-bit architectures" I redefined struct
net_device_stats so that it could be used in a union with struct
rtnl_link_stats64, avoiding the need for explicit copying or
conversion between the two.  However, this is unsafe because there is
no locking required and no lock consistently held around calls to
dev_get_stats() and use of the statistics structure it returns.

In commit 28172739f0a276eb8d6ca917b3974c2edb036da3 "net: fix 64 bit
counters on 32 bit arches" Eric Dumazet dealt with that problem by
requiring callers of dev_get_stats() to provide storage for the
result.  This means that the net_device::stats64 field and the padding
in struct net_device_stats are now redundant, so remove them.

Update the comment on net_device_ops::ndo_get_stats64 to reflect its
new usage.

Change dev_txq_stats_fold() to use struct rtnl_link_stats64, since
that is what all its callers are really using and it is no longer
going to be compatible with struct net_device_stats.

Eric Dumazet suggested the separate function for the structure
conversion.

Signed-off-by: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Acked-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
