linux-stable.git/include/linux/skbuff.h, branch v3.19.3

net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb

2014-12-10T18:31:57+00:00

This change pulls the core functionality out of __netdev_alloc_skb and
places it in a new function named __alloc_rx_skb.  The reason for doing
this is to make these bits accessible to a new function __napi_alloc_skb.
In addition __alloc_rx_skb now has a new flags value that is used to
determine which page frag pool to allocate from.  If the SKB_ALLOC_NAPI
flag is set then the NAPI pool is used.  The advantage of this is that we
do not have to use local_irq_save/restore when accessing the NAPI pool from
NAPI context.
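
The flag-based pool selection can be pictured with a small userspace
sketch.  This is only an illustration under stated assumptions, not the
kernel code: SKETCH_ALLOC_NAPI, struct sketch_pool and sketch_alloc_rx are
hypothetical names, and a counter stands in for local_irq_save/restore.

```c
/* Hypothetical flag bit for this sketch; not taken from kernel headers. */
#define SKETCH_ALLOC_NAPI 0x04u

/* Stand-in for a per-CPU page frag pool. */
struct sketch_pool {
	unsigned int allocs;
};

static struct sketch_pool netdev_pool;
static struct sketch_pool napi_pool;

/* Counts how often the simulated IRQ save/restore pair was needed. */
static unsigned int irq_save_restore_pairs;

/* Pick the pool from the flags and account one allocation against it. */
static struct sketch_pool *sketch_alloc_rx(unsigned int flags)
{
	struct sketch_pool *pool;

	if (flags & SKETCH_ALLOC_NAPI) {
		/* Pool is only touched from softirq context: no IRQ masking. */
		pool = &napi_pool;
	} else {
		/* Pool may be touched from any context: mask IRQs around it. */
		irq_save_restore_pairs++;
		pool = &netdev_pool;
	}
	pool->allocs++;
	return pool;
}
```

Calling sketch_alloc_rx(SKETCH_ALLOC_NAPI) leaves irq_save_restore_pairs
untouched, which is the saving the flag is meant to buy.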

In my test setup I saw at least 11ns of savings using the napi_alloc_skb
function versus the netdev_alloc_skb function, most of this being due to
the fact that we didn't have to call local_irq_save/restore.

The main use case for napi_alloc_skb would be for things such as copybreak
or page fragment based receive paths where an skb is allocated after the
data has been received instead of before.

Signed-off-by: Alexander Duyck 
Signed-off-by: David S. Miller

net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag

2014-12-10T18:31:57+00:00

This patch splits the netdev_alloc_frag function up so that it can be used
on one of two page frag pools instead of being fixed on the
netdev_alloc_cache.  By doing this we can add a NAPI specific function
__napi_alloc_frag that accesses a pool that is only used from softirq
context.  The advantage to this is that we do not need to call
local_irq_save/restore which can be a significant savings.

I also took the opportunity to refactor the core bits that were placed in
__alloc_page_frag.  First I updated the allocation to do either a 32K
allocation or an order 0 page.  This is based on the changes in commit
d9b2938aa where it was found that latencies could be reduced in case of
failures.  Then I also rewrote the logic to work from the end of the page to
the start.  By doing this the size value doesn't have to be used unless we
have run out of space for page fragments.  Finally I cleaned up the atomic
bits so that we just do an atomic_sub_and_test and if that returns true then
we set the page->_count via an atomic_set.  This way we can remove the extra
conditional for the atomic_read since it would have led to an atomic_inc in
the case of success anyway.
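
The end-to-start walk can be sketched in userspace C.  This is a minimal
sketch, not __alloc_page_frag itself: struct frag_cache_sketch,
frag_refill and frag_alloc are hypothetical names, a caller-supplied
buffer stands in for the page, and the page reference counting is left
out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page frag cache. */
struct frag_cache_sketch {
	unsigned char *base;	/* start of the backing buffer */
	size_t size;		/* buffer size, written only on refill */
	size_t offset;		/* free bytes left; next fragment ends here */
};

/* Point the cache at a fresh buffer and start handing out from its end. */
static void frag_refill(struct frag_cache_sketch *nc,
			unsigned char *buf, size_t size)
{
	nc->base = buf;
	nc->size = size;
	nc->offset = size;
}

/*
 * Carve fragsz bytes off the end of the free region.  The fast path only
 * compares against offset; size is not consulted.  Returns NULL when the
 * buffer is exhausted and the caller has to refill.
 */
static void *frag_alloc(struct frag_cache_sketch *nc, size_t fragsz)
{
	if (fragsz > nc->offset)
		return NULL;

	nc->offset -= fragsz;
	return nc->base + nc->offset;
}
```

With a 64 byte buffer, two 16 byte requests return buf + 48 and then
buf + 32, so the addresses move toward the start of the buffer.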

Signed-off-by: Alexander Duyck 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller

skb_copy_datagram_iovec() can die

2014-12-09T21:29:11+00:00

no callers other than itself.

Signed-off-by: Al Viro

switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives

2014-12-09T21:29:10+00:00

... making both non-draining.  That means that tcp_recvmsg() becomes
non-draining.  And _that_ would break iscsit_do_rx_data() unless we
	a) make sure tcp_recvmsg() is uniformly non-draining (it is)
	b) make sure it copes with arbitrary (including shifted)
	   iov_iter (it does, all it uses is iov_iter primitives)
	c) make iscsit_do_rx_data() initialize ->msg_iter only once.

Fortunately, (c) is doable with minimal work and we are rid of one of
the two places where kernel send/recvmsg users would be unhappy with
non-draining behaviour.
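
For readers unfamiliar with the term, the difference between draining and
non-draining copies can be sketched in userspace C.  This is a rough
sketch under the assumption that "draining" means the copy consumes the
segment array itself; struct seg, struct seg_iter, copy_draining and
copy_iter are hypothetical names, not the kernel's iovec or iov_iter API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for an iovec segment and an iterator over them. */
struct seg {
	unsigned char *base;
	size_t len;
};

struct seg_iter {
	const struct seg *segs;	/* current segment, array is never modified */
	size_t nsegs;		/* segments left, including the current one */
	size_t seg_off;		/* bytes already used in the current segment */
};

/* Draining: progress is recorded by shrinking the segments themselves. */
static size_t copy_draining(struct seg *segs, size_t nsegs,
			    const char *src, size_t n)
{
	size_t done = 0;

	for (size_t i = 0; i < nsegs && done < n; i++) {
		size_t chunk = segs[i].len < n - done ? segs[i].len : n - done;

		memcpy(segs[i].base, src + done, chunk);
		segs[i].base += chunk;
		segs[i].len -= chunk;
		done += chunk;
	}
	return done;
}

/* Non-draining: only the iterator advances, the segments stay intact. */
static size_t copy_iter(struct seg_iter *it, const char *src, size_t n)
{
	size_t done = 0;

	while (done < n && it->nsegs > 0) {
		size_t avail = it->segs->len - it->seg_off;
		size_t chunk = avail < n - done ? avail : n - done;

		memcpy(it->segs->base + it->seg_off, src + done, chunk);
		done += chunk;
		it->seg_off += chunk;
		if (it->seg_off == it->segs->len) {
			it->segs++;
			it->nsegs--;
			it->seg_off = 0;
		}
	}
	return done;
}
```

A caller that reuses the same segment array across calls sees different
state after each variant, which is why a caller has to set up its
iterator once and let it carry the position.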

Actually, that makes all but one of ->recvmsg() instances iov_iter-clean.
The exception is skcipher_recvmsg() and it also isn't hard to convert
to primitives (iov_iter_get_pages() is needed there).  That'll wait
a bit - there's some interplay with ->sendmsg() path for that one.

Signed-off-by: Al Viro

put iov_iter into msghdr

2014-12-09T21:29:03+00:00

Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro

dst: no need to take reference on DST_NOCACHE dsts

2014-12-09T21:08:17+00:00

Since commit f8864972126899 ("ipv4: fix dst race in sk_dst_get()")
DST_NOCACHE dst_entries get freed by RCU. So there is no need to get a
reference on them when we are in rcu protected sections.

Cc: Eric Dumazet 
Cc: Julian Anastasov 
Signed-off-by: Hannes Frederic Sowa 
Reviewed-by: Julian Anastasov 
Signed-off-by: David S. Miller

net: avoid two atomic operations in fast clones

2014-12-09T18:40:20+00:00

Commit ce1a4ea3f125 ("net: avoid one atomic operation in skb_clone()")
took the wrong way to save one atomic operation.

It is actually possible to avoid two atomic operations, if we
do not change skb->fclone values, and only rely on clone_ref
content to signal if the clone is available or not.

skb_clone() can simply use the fast clone if clone_ref is 1.

kfree_skbmem() can avoid the atomic_dec_and_test() if clone_ref is 1.

Note that because we usually free the clone before the original skb,
this particular attempt is only done for the original skb to have better
branch prediction.

SKB_FCLONE_FREE is removed.
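
The clone_ref rule can be sketched with C11 atomics.  This is a simplified
userspace sketch, not the kernel code: struct fclone_pair_sketch,
try_fast_clone and pair_release are hypothetical names, and the sketch
assumes the caller owns the original, so nobody else can take the clone
slot between the load and the store.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* clone_ref == 1: only the original is live.  2: the clone is in use too. */
struct fclone_pair_sketch {
	atomic_int clone_ref;
};

/* Hand out the companion clone if it is free; no read-modify-write needed. */
static bool try_fast_clone(struct fclone_pair_sketch *p)
{
	if (atomic_load(&p->clone_ref) == 1) {
		atomic_store(&p->clone_ref, 2);
		return true;
	}
	return false;
}

/*
 * Drop one half of the pair.  Returns true when the whole pair can be
 * freed.  The shortcut is only tried for the original, since the clone is
 * usually released first and would rarely see clone_ref == 1.
 */
static bool pair_release(struct fclone_pair_sketch *p, bool is_orig)
{
	if (is_orig && atomic_load(&p->clone_ref) == 1)
		return true;	/* last user: skip the atomic decrement */

	return atomic_fetch_sub(&p->clone_ref, 1) == 1;
}
```

In the common order (clone released first, then the original) the sketch
performs one atomic decrement in total; the original's release takes the
shortcut.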

Signed-off-by: Eric Dumazet 
Cc: Chris Mason 
Cc: Sabrina Dubroca 
Cc: Vijay Subramanian 
Signed-off-by: David S. Miller

net: Add functions for handling padding frame and adding to length

2014-12-09T01:47:41+00:00

This patch adds two new helper functions skb_put_padto and eth_skb_pad.
These functions deviate from the standard skb_pad or skb_padto in that they
will also update the length and tail pointers so that they reflect the
padding added to the frame.

The eth_skb_pad helper is meant to be used with Ethernet devices to update
either Rx or Tx frames so that they report the correct size.  The
skb_put_padto helper is meant to be used primarily in the transmit path for
network devices that need frames to be padded up to some minimum size and
don't wish to simply update the length somewhere external to the frame.
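
The behaviour described above, padding plus a matching length update, can
be sketched on a plain buffer.  This is a hedged userspace illustration,
not the skb_put_padto implementation: struct frame_sketch and
frame_put_padto are hypothetical names, and the tail is implied by
data + len instead of being a separate pointer.

```c
#include <stddef.h>
#include <string.h>

/* 60 bytes is the minimum Ethernet frame length without the FCS. */
#define SKETCH_MIN_FRAME 60u

/* Hypothetical frame: the tail is implicitly data + len. */
struct frame_sketch {
	unsigned char data[128];
	size_t len;
};

/*
 * Zero-pad the frame up to min_len and grow len to match, so length and
 * tail can never disagree.  Returns 0 on success, -1 if the buffer is too
 * small to hold min_len bytes.
 */
static int frame_put_padto(struct frame_sketch *f, size_t min_len)
{
	if (f->len >= min_len)
		return 0;	/* already long enough, nothing to do */
	if (min_len > sizeof(f->data))
		return -1;

	memset(f->data + f->len, 0, min_len - f->len);
	f->len = min_len;
	return 0;
}
```

A driver-style caller would pass SKETCH_MIN_FRAME for a short transmit
frame and then use f->len directly as the length to hand to the hardware.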

The motivation behind this is that there are a number of implementations
throughout the network device drivers that are all doing the same thing,
but each a little bit differently and as a result several implementations
contain bugs such as updating the length without updating the tail offset
and other similar issues.

Signed-off-by: Alexander Duyck 
Signed-off-by: David S. Miller

switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()

2014-11-24T10:16:39+00:00

... and kill skb_copy_datagram_iovec()

Signed-off-by: Al Viro

kill zerocopy_sg_from_iovec()

2014-11-24T10:16:39+00:00

no users left

Signed-off-by: Al Viro