linux-stable.git/net/core/skbuff.c, branch v4.2.7

skbuff: Fix skb checksum partial check.

2015-10-27T00:53:35+00:00

[ Upstream commit 31b33dfb0a144469dd805514c9e63f4993729a48 ]

Earlier patch 6ae459bda tried to detect an invalid checksum-partial
skb by comparing the pull length to the checksum offset. But that does
not work in all cases, since the checksum offset depends on
updates to skb->data.

This patch fixes it by validating the checksum start offset
after the skb->data pointer is updated. A negative checksum
start offset means there is no need to checksum.

Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin 
Signed-off-by: Pravin B Shelar 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman
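
A minimal user-space sketch of the check described in the checksum fix
above. It is not the kernel code: struct toy_skb and the toy_* helpers
are hypothetical stand-ins, and the constant values are arbitrary. It
only shows why the start offset has to be evaluated after the data
pointer has moved.

```c
#include <stddef.h>

enum { CHECKSUM_NONE = 0, CHECKSUM_PARTIAL = 3 };

/* Toy stand-in for struct sk_buff; only the fields used here. */
struct toy_skb {
	unsigned char *head;     /* start of the buffer */
	unsigned char *data;     /* start of the current payload */
	unsigned int csum_start; /* offset from head where checksumming begins */
	int ip_summed;
};

/* Checksum start relative to data; negative once data has moved past it. */
static int toy_checksum_start_offset(const struct toy_skb *skb)
{
	return (int)skb->csum_start - (int)(skb->data - skb->head);
}

/* Pull len bytes, then validate the offset against the updated pointer. */
static void toy_pull_rcsum(struct toy_skb *skb, size_t len)
{
	skb->data += len;
	if (skb->ip_summed == CHECKSUM_PARTIAL &&
	    toy_checksum_start_offset(skb) < 0)
		skb->ip_summed = CHECKSUM_NONE;
}
```

Pulling exactly up to the checksum start leaves the offset at zero and
the skb stays checksum-partial; pulling any further makes the offset
negative and the flag is cleared.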

mm: make page pfmemalloc check more robust

2015-08-21T21:30:10+00:00

Commit c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():

        if (page->pfmemalloc && !page->mapping)
                skb->pfmemalloc = true;

It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set page->mapping
to NULL and leave the page->index value alone.  Because the two fields
share a union, a non-zero page->index will be interpreted as a true
page->pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with an NFS over loopback
setup when such a page is attached to a new skb.  There is no copying
going on in this case, so the page confuses __skb_fill_page_desc, which
interprets the index as the pfmemalloc flag.  The network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping.  The packets are dropped
here, and that leads to hangs when the NFS client waits for a response
from the server which has been dropped and thus never arrives.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index, so it should never hold a value that large.  Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc(), which will hide this
nastiness from unspoiled eyes.

Obviously, the information will be lost if somebody wants to use
page->index, but that was the case before, and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko 
Debugged-by: Vlastimil Babka 
Debugged-by: Jiri Bohac 
Cc: Eric Dumazet 
Cc: David Miller 
Acked-by: Mel Gorman 
Cc: 	[3.6+]
Signed-off-by: Andrew Morton 

Signed-off-by: Linus Torvalds
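
A minimal user-space sketch of the sentinel trick described in the
pfmemalloc fix above. struct toy_page and the toy_* helpers are
hypothetical stand-ins rather than the kernel definitions; only the idea
of reserving -1UL in the index field is taken from the commit message.

```c
#include <stdbool.h>

/* Toy stand-in for struct page; only the shared index field is modelled. */
struct toy_page {
	unsigned long index;
};

/* Only the reserved value -1UL counts as "allocated from the reserves". */
static bool toy_page_is_pfmemalloc(const struct toy_page *page)
{
	return page->index == -1UL;
}

static void toy_set_page_pfmemalloc(struct toy_page *page)
{
	page->index = -1UL;
}

static void toy_clear_page_pfmemalloc(struct toy_page *page)
{
	page->index = 0;
}
```

With this scheme a stale page cache index such as 5 no longer reads as a
set flag, because only the one reserved value does.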

net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code

2015-08-14T00:08:39+00:00

The recent refactoring of the IGMP and MLD parsing code into
ipv6_mc_check_mld() / ip_mc_check_igmp() introduced a potential crash /
BUG() invocation for bridges:

I wrongly assumed that skb_get() could be used as a simple reference
counter for an skb, which is not the case. skb_get() bears additional
semantics: it increases the user count. This leads to a BUG() invocation
in pskb_expand_head() / kernel panic if pskb_may_pull() is called on an
skb with a user count greater than one. Unfortunately, the refactoring
did just that.

Fix this by removing the skb_get() call and changing the API: The
caller of ipv6_mc_check_mld() / ip_mc_check_igmp() now needs to
additionally check whether the returned skb_trimmed is a clone.

Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Brenden Blanco 
Signed-off-by: Linus Lüssing 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller
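
A minimal user-space sketch of the user-count problem described in the
IGMP/MLD fix above. struct toy_skb and the toy_* helpers are hypothetical
stand-ins. Where the kernel calls BUG() for a shared skb, this model
returns -1 so the condition can be observed.

```c
#include <stdbool.h>

/* Toy stand-in for struct sk_buff; only the user count is modelled. */
struct toy_skb {
	int users;
};

/* Like skb_get(): adds a user, so the skb becomes shared. */
static struct toy_skb *toy_skb_get(struct toy_skb *skb)
{
	skb->users++;
	return skb;
}

static bool toy_skb_shared(const struct toy_skb *skb)
{
	return skb->users != 1;
}

/*
 * Stand-in for the head reallocation that pskb_may_pull() can trigger.
 * The kernel calls BUG() for a shared skb; this sketch returns -1 instead.
 */
static int toy_expand_head(struct toy_skb *skb)
{
	if (toy_skb_shared(skb))
		return -1;
	return 0;
}
```

An skb with a single user can have its head reallocated; after
toy_skb_get() the same call fails, which is the situation the refactored
parsing code ran into.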

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

2015-06-14T06:56:52+00:00

net: don't wait for order-3 page allocation

2015-06-12T00:33:44+00:00

We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and adds latency. Commit 5640f7685831e0
introduced the order-3 allocation. According to its changelog, the order-3
allocation is not a must-have; it is only there to improve performance. But
direct memory compaction has high overhead, and the benefit of the order-3
allocation can't compensate for it.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the allocation will still succeed, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction is not triggered and skb_page_frag_refill falls
back to order-0 immediately, so the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is woken up and does the
compaction, so chances are the allocation will succeed next time.

The same applies to alloc_skb_with_frags.

The Mellanox driver does a similar thing. If this is accepted, we must fix
the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet 
Cc: Chris Mason 
Cc: Debabrata Banerjee 
Signed-off-by: Shaohua Li 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller
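
A minimal user-space sketch of the fallback described in the order-3
commit above. struct toy_allocator and the toy_* functions are
hypothetical and only model the behaviour: a high-order request that may
not wait fails instead of compacting, and the caller retries with
order-0.

```c
#include <stdbool.h>

#define TOY_FRAG_ORDER 3

/* Hypothetical allocator state, standing in for the page allocator. */
struct toy_allocator {
	bool fragmented;        /* no contiguous high-order block is free */
	int direct_compactions; /* times a caller stalled to compact memory */
};

/* Returns the order that was satisfied, or -1 on failure. */
static int toy_alloc_pages(struct toy_allocator *a, int order, bool may_wait)
{
	if (order == 0 || !a->fragmented)
		return order;
	if (may_wait) {
		/* The caller pays for compaction before the request succeeds. */
		a->direct_compactions++;
		a->fragmented = false;
		return order;
	}
	return -1; /* atomic request: fail fast instead of compacting */
}

/* Try order-3 without waiting, then fall back to order-0 immediately. */
static int toy_frag_refill(struct toy_allocator *a)
{
	int order = toy_alloc_pages(a, TOY_FRAG_ORDER, false);

	if (order >= 0)
		return order;
	return toy_alloc_pages(a, 0, true);
}
```

On unfragmented memory the refill still gets its order-3 block. On
fragmented memory it gets an order-0 page and the compaction counter
stays at zero.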

net: af_unix: implement splice for stream af_unix sockets

2015-05-25T04:06:59+00:00

unix_stream_recvmsg is refactored to unix_stream_read_generic in this
patch and enhanced to deal with pipe splicing. The refactoring is
negligible; we mostly have to deal with a nonexistent struct msghdr
argument.

Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller

net: make skb_splice_bits more configureable

2015-05-25T04:06:59+00:00

Prepare skb_splice_bits to be able to deal with AF_UNIX sockets.

AF_UNIX sockets don't use lock_sock/release_sock and thus we have to
use a callback to make the locking and unlocking configurable.

Signed-off-by: Hannes Frederic Sowa 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller
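
A minimal user-space sketch of the callback idea described in the
skb_splice_bits commit above. All names (struct toy_sock, toy_splice_cb,
the toy_* functions) are hypothetical and the signature is not the
kernel's. The point is only that the generic splice routine no longer
hard-codes which lock is dropped around the pipe write.

```c
/* Toy stand-in for a socket that may be protected by one of two locks. */
struct toy_sock {
	int sock_lock;         /* 1 while the lock_sock()-style lock is held */
	int private_lock;      /* 1 while a protocol-private lock is held */
	int spliced;           /* bytes handed to the pipe so far */
	int sock_lock_seen;    /* sock_lock state observed during the write */
	int private_lock_seen; /* private_lock state observed during the write */
};

typedef int (*toy_splice_cb)(struct toy_sock *sk, int nbytes);

/* The pipe write itself; records which locks were held while it ran. */
static int toy_write_pipe(struct toy_sock *sk, int nbytes)
{
	sk->sock_lock_seen = sk->sock_lock;
	sk->private_lock_seen = sk->private_lock;
	sk->spliced += nbytes;
	return nbytes;
}

/* Callback for sockets that use the lock_sock()/release_sock() pair. */
static int toy_cb_sock_lock(struct toy_sock *sk, int nbytes)
{
	int ret;

	sk->sock_lock = 0;
	ret = toy_write_pipe(sk, nbytes);
	sk->sock_lock = 1;
	return ret;
}

/* Callback for sockets that protect themselves with a private lock. */
static int toy_cb_private_lock(struct toy_sock *sk, int nbytes)
{
	int ret;

	sk->private_lock = 0;
	ret = toy_write_pipe(sk, nbytes);
	sk->private_lock = 1;
	return ret;
}

/* Generic routine: the locking policy comes from the caller's callback. */
static int toy_splice_bits(struct toy_sock *sk, int nbytes, toy_splice_cb cb)
{
	return cb(sk, nbytes);
}
```

Each caller passes the callback that matches its own locking scheme, so
the generic routine stays free of protocol-specific lock calls.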

net: skbuff: add skb_append_pagefrags and use it

2015-05-25T04:06:58+00:00

Signed-off-by: Hannes Frederic Sowa 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller

net: Reserve skb headroom and set skb->dev even if using __alloc_skb

2015-05-13T22:07:24+00:00

When I inlined __alloc_rx_skb into __netdev_alloc_skb and
__napi_alloc_skb, I overlooked the fact that there was a return in
__alloc_rx_skb.  As a result we weren't reserving headroom or setting
skb->dev in certain cases.  This change corrects that by adding a couple of
labels to jump to, depending on whether __alloc_skb succeeds or fails.

Fixes: 9451980a6646 ("net: Use cached copy of pfmemalloc to avoid accessing page")
Reported-by: Felipe Balbi 
Signed-off-by: Alexander Duyck 
Tested-by: Kevin Hilman 
Signed-off-by: David S. Miller
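
A minimal user-space sketch of the control flow described in the
headroom fix above. The toy_* names, the big/fail parameters and the pad
size are hypothetical. The sketch only shows how two labels let the slow
path share the tail that reserves headroom and sets the device.

```c
#include <stdlib.h>

#define TOY_SKB_PAD 32

struct toy_dev { int id; };

/* Toy stand-in for struct sk_buff; only the fields the tail touches. */
struct toy_skb {
	size_t headroom;
	struct toy_dev *dev;
};

/* Stand-in for the underlying allocation; fail forces a NULL result. */
static struct toy_skb *toy_alloc_skb(int fail)
{
	return fail ? NULL : calloc(1, sizeof(struct toy_skb));
}

static struct toy_skb *toy_netdev_alloc_skb(struct toy_dev *dev, int big,
					    int fail)
{
	struct toy_skb *skb;

	if (big) {
		/* Slow path: an early return here would skip the tail below. */
		skb = toy_alloc_skb(fail);
		if (!skb)
			goto skb_fail;
		goto skb_success;
	}

	/* Fast path (a page-fragment allocation in the real code). */
	skb = toy_alloc_skb(fail);
	if (!skb)
		goto skb_fail;

skb_success:
	skb->headroom = TOY_SKB_PAD; /* reserve headroom */
	skb->dev = dev;              /* attach the device */
skb_fail:
	return skb;
}
```

Both allocation paths now reach skb_success, so the headroom and device
are set regardless of which allocator produced the buffer.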

net: Add skb_free_frag to replace use of put_page in freeing skb->head

2015-05-12T14:39:26+00:00

This change adds a function called skb_free_frag which is meant to
complement the function netdev_alloc_frag.  The general idea is to enable a
more lightweight version of page freeing since we don't actually need all
the overhead of a put_page, and we don't quite fit the model of __free_pages.

Signed-off-by: Alexander Duyck 
Signed-off-by: David S. Miller