<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/packet, branch v3.15.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net: Use netlink_ns_capable to verify the permisions of netlink messages</title>
<updated>2014-04-24T17:44:54+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2014-04-23T21:29:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=90f62cf30a78721641e08737bda787552428061e'/>
<id>90f62cf30a78721641e08737bda787552428061e</id>
<content type='text'>
It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.

To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.

Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.

To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.

Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Move the permission check in sock_diag_put_filterinfo to packet_diag_dump</title>
<updated>2014-04-24T17:44:53+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2014-04-23T21:26:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a53b72c83a4216f2eb883ed45a0cbce014b8e62d'/>
<id>a53b72c83a4216f2eb883ed45a0cbce014b8e62d</id>
<content type='text'>
The permission check in sock_diag_put_filterinfo is wrong, and it is so removed
from it's sources it is not clear why it is wrong.  Move the computation
into packet_diag_dump and pass a bool of the result into sock_diag_filterinfo.

This does not yet correct the capability check but instead simply moves it to make
it clear what is going on.

Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The permission check in sock_diag_put_filterinfo is wrong, and it is so removed
from it's sources it is not clear why it is wrong.  Move the computation
into packet_diag_dump and pass a bool of the result into sock_diag_filterinfo.

This does not yet correct the capability check but instead simply moves it to make
it clear what is going on.

Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Fix ns_capable check in sock_diag_put_filterinfo</title>
<updated>2014-04-22T16:49:39+00:00</updated>
<author>
<name>Andrew Lutomirski</name>
<email>luto@amacapital.net</email>
</author>
<published>2014-04-17T04:41:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=78541c1dc60b65ecfce5a6a096fc260219d6784e'/>
<id>78541c1dc60b65ecfce5a6a096fc260219d6784e</id>
<content type='text'>
The caller needs capabilities on the namespace being queried, not on
their own namespace.  This is a security bug, although it likely has
only a minor impact.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Acked-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The caller needs capabilities on the namespace being queried, not on
their own namespace.  This is a security bug, although it likely has
only a minor impact.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Acked-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Fix use after free by removing length arg from sk_data_ready callbacks.</title>
<updated>2014-04-11T20:15:36+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2014-04-11T20:15:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=676d23690fb62b5d51ba5d659935e9f7d9da9f8e'/>
<id>676d23690fb62b5d51ba5d659935e9f7d9da9f8e</id>
<content type='text'>
Several spots in the kernel perform a sequence like:

	skb_queue_tail(&amp;sk-&gt;s_receive_queue, skb);
	sk-&gt;sk_data_ready(sk, skb-&gt;len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb-&gt;len access is potentially
to freed up memory.

Furthermore, the skb-&gt;len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Several spots in the kernel perform a sequence like:

	skb_queue_tail(&amp;sk-&gt;s_receive_queue, skb);
	sk-&gt;sk_data_ready(sk, skb-&gt;len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb-&gt;len access is potentially
to freed up memory.

Furthermore, the skb-&gt;len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: fix packet_direct_xmit for BQL enabled drivers</title>
<updated>2014-04-03T18:29:12+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-04-02T18:52:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8e2f1a63f2217365223026422a2f8ba5967051d6'/>
<id>8e2f1a63f2217365223026422a2f8ba5967051d6</id>
<content type='text'>
Currently, in packet_direct_xmit() we test the assigned netdevice queue
for netif_xmit_frozen_or_stopped() before doing an ndo_start_xmit().

This can have the side-effect that BQL enabled drivers which make use
of netdev_tx_sent_queue() internally, set __QUEUE_STATE_STACK_XOFF from
within the stack and would not fully fill the device's TX ring from
packet sockets with PACKET_QDISC_BYPASS enabled.

Instead, use a test without BQL bit so that bursts can be absorbed
into the NICs TX ring. Fix and code suggested by Eric Dumazet, thanks!

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, in packet_direct_xmit() we test the assigned netdevice queue
for netif_xmit_frozen_or_stopped() before doing an ndo_start_xmit().

This can have the side-effect that BQL enabled drivers which make use
of netdev_tx_sent_queue() internally, set __QUEUE_STATE_STACK_XOFF from
within the stack and would not fully fill the device's TX ring from
packet sockets with PACKET_QDISC_BYPASS enabled.

Instead, use a test without BQL bit so that bursts can be absorbed
into the NICs TX ring. Fix and code suggested by Eric Dumazet, thanks!

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: report tx_dropped in packet_direct_xmit</title>
<updated>2014-04-03T18:29:11+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-04-02T18:52:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0f97ede45e65ffb6eab856313e79b14b902bcfaa'/>
<id>0f97ede45e65ffb6eab856313e79b14b902bcfaa</id>
<content type='text'>
Since commit 015f0688f57c ("net: net: add a core netdev-&gt;tx_dropped
counter"), we can now account for TX drops from within the core
stack instead of drivers.

Therefore, fix packet_direct_xmit() and increase drop count when we
encounter a problem before driver's xmit function was called (we do
not want to doubly account for it).

Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit 015f0688f57c ("net: net: add a core netdev-&gt;tx_dropped
counter"), we can now account for TX drops from within the core
stack instead of drivers.

Therefore, fix packet_direct_xmit() and increase drop count when we
encounter a problem before driver's xmit function was called (we do
not want to doubly account for it).

Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: respect devices with LLTX flag in direct xmit</title>
<updated>2014-03-28T20:49:48+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-03-27T15:38:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=43279500decad66ccdddacae7948a1d23be1bef6'/>
<id>43279500decad66ccdddacae7948a1d23be1bef6</id>
<content type='text'>
Quite often it can be useful to test with dummy or similar
devices as a blackhole sink for skbs. Such devices are only
equipped with a single txq, but marked as NETIF_F_LLTX as
they do not require locking their internal queues on xmit
(or implement locking themselves). Therefore, rather use
HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.

trafgen mmap/TX_RING example against dummy device with config
foo: { fill(0xff, 64) } results in the following performance
improvements for such scenarios on an ordinary Core i7/2.80GHz:

Before:

 Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):

   160,975,944,159 instructions:k            #    0.55  insns per cycle          ( +-  0.09% )
   293,319,390,278 cycles:k                  #    0.000 GHz                      ( +-  0.35% )
       192,501,104 branch-misses:k                                               ( +-  1.63% )
               831 context-switches:k                                            ( +-  9.18% )
                 7 cpu-migrations:k                                              ( +-  7.40% )
            69,382 cache-misses:k            #    0.010 % of all cache refs      ( +-  2.18% )
       671,552,021 cache-references:k                                            ( +-  1.29% )

      22.856401569 seconds time elapsed                                          ( +-  0.33% )

After:

 Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):

   133,788,739,692 instructions:k            #    0.92  insns per cycle          ( +-  0.06% )
   145,853,213,256 cycles:k                  #    0.000 GHz                      ( +-  0.17% )
        59,867,100 branch-misses:k                                               ( +-  4.72% )
               384 context-switches:k                                            ( +-  3.76% )
                 6 cpu-migrations:k                                              ( +-  6.28% )
            70,304 cache-misses:k            #    0.077 % of all cache refs      ( +-  1.73% )
        90,879,408 cache-references:k                                            ( +-  1.35% )

      11.719372413 seconds time elapsed                                          ( +-  0.24% )

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Cc: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Quite often it can be useful to test with dummy or similar
devices as a blackhole sink for skbs. Such devices are only
equipped with a single txq, but marked as NETIF_F_LLTX as
they do not require locking their internal queues on xmit
(or implement locking themselves). Therefore, rather use
HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.

trafgen mmap/TX_RING example against dummy device with config
foo: { fill(0xff, 64) } results in the following performance
improvements for such scenarios on an ordinary Core i7/2.80GHz:

Before:

 Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):

   160,975,944,159 instructions:k            #    0.55  insns per cycle          ( +-  0.09% )
   293,319,390,278 cycles:k                  #    0.000 GHz                      ( +-  0.35% )
       192,501,104 branch-misses:k                                               ( +-  1.63% )
               831 context-switches:k                                            ( +-  9.18% )
                 7 cpu-migrations:k                                              ( +-  7.40% )
            69,382 cache-misses:k            #    0.010 % of all cache refs      ( +-  2.18% )
       671,552,021 cache-references:k                                            ( +-  1.29% )

      22.856401569 seconds time elapsed                                          ( +-  0.33% )

After:

 Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):

   133,788,739,692 instructions:k            #    0.92  insns per cycle          ( +-  0.06% )
   145,853,213,256 cycles:k                  #    0.000 GHz                      ( +-  0.17% )
        59,867,100 branch-misses:k                                               ( +-  4.72% )
               384 context-switches:k                                            ( +-  3.76% )
                 6 cpu-migrations:k                                              ( +-  6.28% )
            70,304 cache-misses:k            #    0.077 % of all cache refs      ( +-  1.73% )
        90,879,408 cache-references:k                                            ( +-  1.35% )

      11.719372413 seconds time elapsed                                          ( +-  0.24% )

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Cc: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Rename skb-&gt;rxhash to skb-&gt;hash</title>
<updated>2014-03-26T19:58:20+00:00</updated>
<author>
<name>Tom Herbert</name>
<email>therbert@google.com</email>
</author>
<published>2014-03-24T22:34:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=61b905da33ae25edb6b9d2a5de21e34c3a77efe3'/>
<id>61b905da33ae25edb6b9d2a5de21e34c3a77efe3</id>
<content type='text'>
The packet hash can be considered a property of the packet, not just
on RX path.

This patch changes name of rxhash and l4_rxhash skbuff fields to be
hash and l4_hash respectively. This includes changing uses of the
field in the code which don't call the access functions.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The packet hash can be considered a property of the packet, not just
on RX path.

This patch changes name of rxhash and l4_rxhash skbuff fields to be
hash and l4_hash respectively. This includes changing uses of the
field in the code which don't call the access functions.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Mahesh Bandewar &lt;maheshb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>packet: allow to transmit +4 byte in TX_RING slot for VLAN case</title>
<updated>2014-02-28T21:52:02+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2014-02-28T01:22:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=52f1454f629fafbfb47ad6727e0837250e1f08c0'/>
<id>52f1454f629fafbfb47ad6727e0837250e1f08c0</id>
<content type='text'>
Commit 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes
for VLAN packets.") added the possibility for non-mmaped frames to
send extra 4 byte for VLAN header so the MTU increases from 1500 to
1504 byte, for example.

Commit cbd89acb9eb2 ("af_packet: fix for sending VLAN frames via
packet_mmap") attempted to fix that for the mmap part but was
reverted as it caused regressions while using eth_type_trans()
on output path.

Lets just act analogous to 57f89bfa2140 and add a similar logic
to TX_RING. We presume size_max as overcharged with +4 bytes and
later on after skb has been built by tpacket_fill_skb() check
for ETH_P_8021Q header on packets larger than normal MTU. Can
be easily reproduced with a slightly modified trafgen in mmap(2)
mode, test cases:

 { fill(0xff, 12) const16(0x8100) fill(0xff, &lt;1504|1505&gt;) }
 { fill(0xff, 12) const16(0x0806) fill(0xff, &lt;1500|1501&gt;) }

Note that we need to do the test right after tpacket_fill_skb()
as sockets can have PACKET_LOSS set where we would not fail but
instead just continue to traverse the ring.

Reported-by: Mathias Kretschmer &lt;mathias.kretschmer@fokus.fraunhofer.de&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Cc: Ben Greear &lt;greearb@candelatech.com&gt;
Cc: Phil Sutter &lt;phil@nwl.cc&gt;
Tested-by: Mathias Kretschmer &lt;mathias.kretschmer@fokus.fraunhofer.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes
for VLAN packets.") added the possibility for non-mmaped frames to
send extra 4 byte for VLAN header so the MTU increases from 1500 to
1504 byte, for example.

Commit cbd89acb9eb2 ("af_packet: fix for sending VLAN frames via
packet_mmap") attempted to fix that for the mmap part but was
reverted as it caused regressions while using eth_type_trans()
on output path.

Lets just act analogous to 57f89bfa2140 and add a similar logic
to TX_RING. We presume size_max as overcharged with +4 bytes and
later on after skb has been built by tpacket_fill_skb() check
for ETH_P_8021Q header on packets larger than normal MTU. Can
be easily reproduced with a slightly modified trafgen in mmap(2)
mode, test cases:

 { fill(0xff, 12) const16(0x8100) fill(0xff, &lt;1504|1505&gt;) }
 { fill(0xff, 12) const16(0x0806) fill(0xff, &lt;1500|1501&gt;) }

Note that we need to do the test right after tpacket_fill_skb()
as sockets can have PACKET_LOSS set where we would not fail but
instead just continue to traverse the ring.

Reported-by: Mathias Kretschmer &lt;mathias.kretschmer@fokus.fraunhofer.de&gt;
Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Cc: Ben Greear &lt;greearb@candelatech.com&gt;
Cc: Phil Sutter &lt;phil@nwl.cc&gt;
Tested-by: Mathias Kretschmer &lt;mathias.kretschmer@fokus.fraunhofer.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>af_packet: remove a stray tab in packet_set_ring()</title>
<updated>2014-02-18T23:02:25+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2014-02-18T12:20:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d7cf0c34af067555737193b6c1aa7abaa677f29c'/>
<id>d7cf0c34af067555737193b6c1aa7abaa677f29c</id>
<content type='text'>
At first glance it looks like there is a missing curly brace but
actually the code works the same either way.  I have adjusted the
indenting but left the code the same.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Acked-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At first glance it looks like there is a missing curly brace but
actually the code works the same either way.  I have adjusted the
indenting but left the code the same.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Acked-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
