<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/linux/if_tun.h, branch linux-2.6.30.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>tun: Limit amount of queued packets per device</title>
<updated>2009-02-06T05:25:32+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2009-02-06T05:25:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=33dccbb050bbe35b88ca8cf1228dcf3e4d4b3554'/>
<id>33dccbb050bbe35b88ca8cf1228dcf3e4d4b3554</id>
<content type='text'>
Unlike a normal socket path, the tuntap device send path does
not have any accounting.  This means that the user-space sender
may be able to pin down arbitrary amounts of kernel memory by
continuing to send data to an end-point that is congested.

Even when this isn't an issue because of limited queueing at
most end points, this can also be a problem because its only
response to congestion is packet loss.  That is, when those
local queues at the end-point fills up, the tuntap device will
start wasting system time because it will continue to send
data there which simply gets dropped straight away.

Of course one could argue that everybody should do congestion
control end-to-end, unfortunately there are people in this world
still hooked on UDP, and they don't appear to be going away
anywhere fast.  In fact, we've always helped them by performing
accounting in our UDP code, the sole purpose of which is to
provide congestion feedback other than through packet loss.

This patch attempts to apply the same bandaid to the tuntap device.
It creates a pseudo-socket object which is used to account our
packets just as a normal socket does for UDP.  Of course things
are a little complex because we're actually reinjecting traffic
back into the stack rather than out of the stack.

The stack complexities however should have been resolved by preceding
patches.  So this one can simply start using skb_set_owner_w.

For now the accounting is essentially disabled by default for
backwards compatibility.  In particular, we set the cap to INT_MAX.
This is so that existing applications don't get confused by the
sudden arrival EAGAIN errors.

In future we may wish (or be forced to) do this by default.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unlike a normal socket path, the tuntap device send path does
not have any accounting.  This means that the user-space sender
may be able to pin down arbitrary amounts of kernel memory by
continuing to send data to an end-point that is congested.

Even when this isn't an issue because of limited queueing at
most end points, this can also be a problem because its only
response to congestion is packet loss.  That is, when those
local queues at the end-point fills up, the tuntap device will
start wasting system time because it will continue to send
data there which simply gets dropped straight away.

Of course one could argue that everybody should do congestion
control end-to-end, unfortunately there are people in this world
still hooked on UDP, and they don't appear to be going away
anywhere fast.  In fact, we've always helped them by performing
accounting in our UDP code, the sole purpose of which is to
provide congestion feedback other than through packet loss.

This patch attempts to apply the same bandaid to the tuntap device.
It creates a pseudo-socket object which is used to account our
packets just as a normal socket does for UDP.  Of course things
are a little complex because we're actually reinjecting traffic
back into the stack rather than out of the stack.

The stack complexities however should have been resolved by preceding
patches.  So this one can simply start using skb_set_owner_w.

For now the accounting is essentially disabled by default for
backwards compatibility.  In particular, we set the cap to INT_MAX.
This is so that existing applications don't get confused by the
sudden arrival EAGAIN errors.

In future we may wish (or be forced to) do this by default.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tun: TUNGETIFF interface to query name and flags</title>
<updated>2008-08-16T02:52:19+00:00</updated>
<author>
<name>Mark McLoughlin</name>
<email>markmc@redhat.com</email>
</author>
<published>2008-08-15T22:09:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e3b99556975907530aeb9745e7b3945a0da48f17'/>
<id>e3b99556975907530aeb9745e7b3945a0da48f17</id>
<content type='text'>
Add a TUNGETIFF interface so that userspace can query a
tun/tap descriptor for its name and flags.

This is needed because it is common for one app to create
a tap interface, exec another app and pass it the file
descriptor for the interface. Without TUNGETIFF the spawned
app has no way of detecting wheter the interface has e.g.
IFF_VNET_HDR set.

Signed-off-by: Mark McLoughlin &lt;markmc@redhat.com&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a TUNGETIFF interface so that userspace can query a
tun/tap descriptor for its name and flags.

This is needed because it is common for one app to create
a tap interface, exec another app and pass it the file
descriptor for the interface. Without TUNGETIFF the spawned
app has no way of detecting wheter the interface has e.g.
IFF_VNET_HDR set.

Signed-off-by: Mark McLoughlin &lt;markmc@redhat.com&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tun: Fix/rewrite packet filtering logic</title>
<updated>2008-07-15T05:18:19+00:00</updated>
<author>
<name>Max Krasnyansky</name>
<email>maxk@qualcomm.com</email>
</author>
<published>2008-07-15T05:18:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f271b2cc78f09c93ccd00a2056d3237134bf994c'/>
<id>f271b2cc78f09c93ccd00a2056d3237134bf994c</id>
<content type='text'>
Please see the following thread to get some context on this
	http://marc.info/?l=linux-netdev&amp;m=121564433018903&amp;w=2

Basically the issue is that current multi-cast filtering stuff in
the TUN/TAP driver is seriously broken.
Original patch went in without proper review and ACK. It was broken and
confusing to start with and subsequent patches broke it completely.
To give you an idea of what's broken here are some of the issues:

- Very confusing comments throughout the code that imply that the
character device is a network interface in its own right, and that packets
are passed between the two nics. Which is completely wrong.

- Wrong set of ioctls is used for setting up filters. They look like
shortcuts for manipulating state of the tun/tap network interface but
in reality manipulate the state of the TX filter.

- ioctls that were originally used for setting address of the the TX filter
got "fixed" and now set the address of the network interface itself. Which
made filter totaly useless.

- Filtering is done too late. Instead of filtering early on, to avoid
unnecessary wakeups, filtering is done in the read() call.

The list goes on and on :)

So the patch cleans all that up. It introduces simple and clean interface for
setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does filtering
before enqueuing the packets.

TX filtering is useful in the scenarios where TAP is part of a bridge, in
which case it gets all broadcast, multicast and potentially other packets when
the bridge is learning. So for example Ethernet tunnelling app may want to
setup TX filters to avoid tunnelling multicast traffic. QEMU and other
hypervisors can push RX filtering that is currently done in the guest into the
host context therefore saving wakeups and unnecessary data transfer.

Signed-off-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Please see the following thread to get some context on this
	http://marc.info/?l=linux-netdev&amp;m=121564433018903&amp;w=2

Basically the issue is that current multi-cast filtering stuff in
the TUN/TAP driver is seriously broken.
Original patch went in without proper review and ACK. It was broken and
confusing to start with and subsequent patches broke it completely.
To give you an idea of what's broken here are some of the issues:

- Very confusing comments throughout the code that imply that the
character device is a network interface in its own right, and that packets
are passed between the two nics. Which is completely wrong.

- Wrong set of ioctls is used for setting up filters. They look like
shortcuts for manipulating state of the tun/tap network interface but
in reality manipulate the state of the TX filter.

- ioctls that were originally used for setting address of the the TX filter
got "fixed" and now set the address of the network interface itself. Which
made filter totaly useless.

- Filtering is done too late. Instead of filtering early on, to avoid
unnecessary wakeups, filtering is done in the read() call.

The list goes on and on :)

So the patch cleans all that up. It introduces simple and clean interface for
setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does filtering
before enqueuing the packets.

TX filtering is useful in the scenarios where TAP is part of a bridge, in
which case it gets all broadcast, multicast and potentially other packets when
the bridge is learning. So for example Ethernet tunnelling app may want to
setup TX filters to avoid tunnelling multicast traffic. QEMU and other
hypervisors can push RX filtering that is currently done in the guest into the
host context therefore saving wakeups and unnecessary data transfer.

Signed-off-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tun: Allow GSO using virtio_net_hdr</title>
<updated>2008-07-03T10:48:02+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2008-07-03T10:48:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f43798c27684ab925adde7d8acc34c78c6e50df8'/>
<id>f43798c27684ab925adde7d8acc34c78c6e50df8</id>
<content type='text'>
Add a IFF_VNET_HDR flag.  This uses the same ABI as virtio_net
(ie. prepending struct virtio_net_hdr to packets) to indicate GSO and
checksum information.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a IFF_VNET_HDR flag.  This uses the same ABI as virtio_net
(ie. prepending struct virtio_net_hdr to packets) to indicate GSO and
checksum information.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tun: TUNSETFEATURES to set gso features.</title>
<updated>2008-07-03T10:46:16+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2008-07-03T10:46:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5228ddc98fa49b3cedab4024e269d62410a0d806'/>
<id>5228ddc98fa49b3cedab4024e269d62410a0d806</id>
<content type='text'>
ethtool is useful for setting (some) device fields, but it's
root-only.  Finer feature control is available through a tun-specific
ioctl.

(Includes Mark McLoughlin &lt;markmc@redhat.com&gt;'s fix to hold rtnl sem).

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ethtool is useful for setting (some) device fields, but it's
root-only.  Finer feature control is available through a tun-specific
ioctl.

(Includes Mark McLoughlin &lt;markmc@redhat.com&gt;'s fix to hold rtnl sem).

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tun: Interface to query tun/tap features.</title>
<updated>2008-07-03T10:45:32+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2008-07-03T10:45:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=07240fd0902c872f044f523893364a1a24c9f278'/>
<id>07240fd0902c872f044f523893364a1a24c9f278</id>
<content type='text'>
The problem with introducing checksum offload and gso to tun is they
need to set dev-&gt;features to enable GSO and/or checksumming, which is
supposed to be done before register_netdevice(), ie. as part of
TUNSETIFF.

Unfortunately, TUNSETIFF has always just ignored flags it doesn't
understand, so there's no good way of detecting whether the kernel
supports new IFF_ flags.

This patch implements a TUNGETFEATURES ioctl which returns all the valid IFF
flags.  It could be extended later to include other features.

Here's an example program which uses it:

#include &lt;linux/if_tun.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/ioctl.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;err.h&gt;
#include &lt;stdio.h&gt;

static struct {
	unsigned int flag;
	const char *name;
} known_flags[] = {
	{ IFF_TUN, "TUN" },
	{ IFF_TAP, "TAP" },
	{ IFF_NO_PI, "NO_PI" },
	{ IFF_ONE_QUEUE, "ONE_QUEUE" },
};

int main()
{
	unsigned int features, i;

	int netfd = open("/dev/net/tun", O_RDWR);
	if (netfd &lt; 0)
		err(1, "Opening /dev/net/tun");

	if (ioctl(netfd, TUNGETFEATURES, &amp;features) != 0) {
		printf("Kernel does not support TUNGETFEATURES, guessing\n");
		features = (IFF_TUN|IFF_TAP|IFF_NO_PI|IFF_ONE_QUEUE);
	}
	printf("Available features are: ");
	for (i = 0; i &lt; sizeof(known_flags)/sizeof(known_flags[0]); i++) {
		if (features &amp; known_flags[i].flag) {
			features &amp;= ~known_flags[i].flag;
			printf("%s ", known_flags[i].name);
		}
	}
	if (features)
		printf("(UNKNOWN %#x)", features);
	printf("\n");
	return 0;
}

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The problem with introducing checksum offload and gso to tun is they
need to set dev-&gt;features to enable GSO and/or checksumming, which is
supposed to be done before register_netdevice(), ie. as part of
TUNSETIFF.

Unfortunately, TUNSETIFF has always just ignored flags it doesn't
understand, so there's no good way of detecting whether the kernel
supports new IFF_ flags.

This patch implements a TUNGETFEATURES ioctl which returns all the valid IFF
flags.  It could be extended later to include other features.

Here's an example program which uses it:

#include &lt;linux/if_tun.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/ioctl.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;err.h&gt;
#include &lt;stdio.h&gt;

static struct {
	unsigned int flag;
	const char *name;
} known_flags[] = {
	{ IFF_TUN, "TUN" },
	{ IFF_TAP, "TAP" },
	{ IFF_NO_PI, "NO_PI" },
	{ IFF_ONE_QUEUE, "ONE_QUEUE" },
};

int main()
{
	unsigned int features, i;

	int netfd = open("/dev/net/tun", O_RDWR);
	if (netfd &lt; 0)
		err(1, "Opening /dev/net/tun");

	if (ioctl(netfd, TUNGETFEATURES, &amp;features) != 0) {
		printf("Kernel does not support TUNGETFEATURES, guessing\n");
		features = (IFF_TUN|IFF_TAP|IFF_NO_PI|IFF_ONE_QUEUE);
	}
	printf("Available features are: ");
	for (i = 0; i &lt; sizeof(known_flags)/sizeof(known_flags[0]); i++) {
		if (features &amp; known_flags[i].flag) {
			features &amp;= ~known_flags[i].flag;
			printf("%s ", known_flags[i].name);
		}
	}
	if (features)
		printf("(UNKNOWN %#x)", features);
	printf("\n");
	return 0;
}

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: remove CVS keywords</title>
<updated>2008-06-12T04:00:38+00:00</updated>
<author>
<name>Adrian Bunk</name>
<email>bunk@kernel.org</email>
</author>
<published>2008-06-11T05:46:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0b040829952d84bf2a62526f0e24b624e0699447'/>
<id>0b040829952d84bf2a62526f0e24b624e0699447</id>
<content type='text'>
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: make struct tun_struct private to tun.c</title>
<updated>2008-04-13T01:48:58+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2008-04-13T01:48:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=14daa02139dcb3193b2b0250c0720a23ef610c49'/>
<id>14daa02139dcb3193b2b0250c0720a23ef610c49</id>
<content type='text'>
There's no reason for this to be in the header, and it just hurts
recompile time.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyanskiy &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's no reason for this to be in the header, and it just hurts
recompile time.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Max Krasnyanskiy &lt;maxk@qualcomm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>annotate tun</title>
<updated>2008-01-28T23:07:57+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@ftp.linux.org.uk</email>
</author>
<published>2007-12-22T17:52:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a3edb08311fc559652ffc959e93eb5be9294443f'/>
<id>a3edb08311fc559652ffc959e93eb5be9294443f</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Jeff Garzik &lt;jeff@garzik.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Jeff Garzik &lt;jeff@garzik.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET] drivers/net: statistics cleanup #1 -- save memory and shrink code</title>
<updated>2007-10-10T23:51:16+00:00</updated>
<author>
<name>Jeff Garzik</name>
<email>jeff@garzik.org</email>
</author>
<published>2007-10-04T00:41:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=09f75cd7bf13720738e6a196cc0107ce9a5bd5a0'/>
<id>09f75cd7bf13720738e6a196cc0107ce9a5bd5a0</id>
<content type='text'>
We now have struct net_device_stats embedded in struct net_device,
and the default -&gt;get_stats() hook does the obvious thing for us.

Run through drivers/net/* and remove the driver-local storage of
statistics, and driver-local -&gt;get_stats() hook where applicable.

This was just the low-hanging fruit in drivers/net; plenty more drivers
remain to be updated.

[ Resolved conflicts with napi_struct changes and fix sunqe build
  regression... -DaveM ]

Signed-off-by: Jeff Garzik &lt;jeff@garzik.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We now have struct net_device_stats embedded in struct net_device,
and the default -&gt;get_stats() hook does the obvious thing for us.

Run through drivers/net/* and remove the driver-local storage of
statistics, and driver-local -&gt;get_stats() hook where applicable.

This was just the low-hanging fruit in drivers/net; plenty more drivers
remain to be updated.

[ Resolved conflicts with napi_struct changes and fix sunqe build
  regression... -DaveM ]

Signed-off-by: Jeff Garzik &lt;jeff@garzik.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
