<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/core/skbuff.c, branch v2.6.29</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>net: Kill skb_truesize_check(), it only catches false-positives.</title>
<updated>2009-02-18T05:24:05+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2009-02-18T05:24:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=92a0acce186cde8ead56c6915d9479773673ea1a'/>
<id>92a0acce186cde8ead56c6915d9479773673ea1a</id>
<content type='text'>
A long time ago we had bugs, primarily in TCP, where we would modify
skb-&gt;truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.

skb_truesize_check() was added in order to try and catch this error
more systematically.

However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A long time ago we had bugs, primarily in TCP, where we would modify
skb-&gt;truesize (for TSO queue collapsing) in ways which would corrupt
the socket memory accounting.

skb_truesize_check() was added in order to try and catch this error
more systematically.

However this debugging check has morphed into a Frankenstein of sorts
and these days it does nothing other than catch false-positives.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Fix OOPS in skb_seq_read().</title>
<updated>2009-01-30T00:12:42+00:00</updated>
<author>
<name>Shyam Iyer</name>
<email>shyam_iyer@dell.com</email>
</author>
<published>2009-01-30T00:12:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=71b3346d182355f19509fadb8fe45114a35cc499'/>
<id>71b3346d182355f19509fadb8fe45114a35cc499</id>
<content type='text'>
It oopsd for me in skb_seq_read. addr2line said it was
linux-2.6/net/core/skbuff.c:2228, which is this line:


	while (st-&gt;frag_idx &lt; skb_shinfo(st-&gt;cur_skb)-&gt;nr_frags) {


I added some printks in there and it looks like we hit this:

        } else if (st-&gt;root_skb == st-&gt;cur_skb &amp;&amp;
                   skb_shinfo(st-&gt;root_skb)-&gt;frag_list) {
                 st-&gt;cur_skb = skb_shinfo(st-&gt;root_skb)-&gt;frag_list;
                 st-&gt;frag_idx = 0;
                 goto next_skb;
        }



Actually I did some testing and added a few printks and found that the
st-&gt;cur_skb-&gt;data was 0 and hence the ptr used by iscsi_tcp was null.
This caused the kernel panic.

 	if (abs_offset &lt; block_limit) {
-		*data = st-&gt;cur_skb-&gt;data + abs_offset;
+		*data = st-&gt;cur_skb-&gt;data + (abs_offset - st-&gt;stepped_offset);

I enabled the debug_tcp and with a few printks found that the code did
not go to the next_skb label and could find that the sequence being
followed was this -

It hit this if condition -

        if (st-&gt;cur_skb-&gt;next) {
                st-&gt;cur_skb = st-&gt;cur_skb-&gt;next;
                st-&gt;frag_idx = 0;
                goto next_skb;

And so, now the st pointer is shifted to the next skb whereas actually
it should have hit the second else if first since the data is in the
frag_list.

        else if (st-&gt;root_skb == st-&gt;cur_skb &amp;&amp;
                 skb_shinfo(st-&gt;root_skb)-&gt;frag_list) {
                st-&gt;cur_skb = skb_shinfo(st-&gt;root_skb)-&gt;frag_list;
                goto next_skb;
        }

Reversing the two conditions the attached patch fixes the issue for me
on top of Herbert's patches. 

Signed-off-by: Shyam Iyer &lt;shyam_iyer@dell.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It oopsd for me in skb_seq_read. addr2line said it was
linux-2.6/net/core/skbuff.c:2228, which is this line:


	while (st-&gt;frag_idx &lt; skb_shinfo(st-&gt;cur_skb)-&gt;nr_frags) {


I added some printks in there and it looks like we hit this:

        } else if (st-&gt;root_skb == st-&gt;cur_skb &amp;&amp;
                   skb_shinfo(st-&gt;root_skb)-&gt;frag_list) {
                 st-&gt;cur_skb = skb_shinfo(st-&gt;root_skb)-&gt;frag_list;
                 st-&gt;frag_idx = 0;
                 goto next_skb;
        }



Actually I did some testing and added a few printks and found that the
st-&gt;cur_skb-&gt;data was 0 and hence the ptr used by iscsi_tcp was null.
This caused the kernel panic.

 	if (abs_offset &lt; block_limit) {
-		*data = st-&gt;cur_skb-&gt;data + abs_offset;
+		*data = st-&gt;cur_skb-&gt;data + (abs_offset - st-&gt;stepped_offset);

I enabled the debug_tcp and with a few printks found that the code did
not go to the next_skb label and could find that the sequence being
followed was this -

It hit this if condition -

        if (st-&gt;cur_skb-&gt;next) {
                st-&gt;cur_skb = st-&gt;cur_skb-&gt;next;
                st-&gt;frag_idx = 0;
                goto next_skb;

And so, now the st pointer is shifted to the next skb whereas actually
it should have hit the second else if first since the data is in the
frag_list.

        else if (st-&gt;root_skb == st-&gt;cur_skb &amp;&amp;
                 skb_shinfo(st-&gt;root_skb)-&gt;frag_list) {
                st-&gt;cur_skb = skb_shinfo(st-&gt;root_skb)-&gt;frag_list;
                goto next_skb;
        }

Reversing the two conditions the attached patch fixes the issue for me
on top of Herbert's patches. 

Signed-off-by: Shyam Iyer &lt;shyam_iyer@dell.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Fix frag_list handling in skb_seq_read</title>
<updated>2009-01-30T00:07:52+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2009-01-30T00:07:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=95e3b24cfb4ec0479d2c42f7a1780d68063a542a'/>
<id>95e3b24cfb4ec0479d2c42f7a1780d68063a542a</id>
<content type='text'>
The frag_list handling was broken in skb_seq_read:

1) We didn't add the stepped offset when looking at the head
are of fragments other than the first.

2) We didn't take the stepped offset away when setting the data
pointer in the head area.

3) The frag index wasn't reset.

This patch fixes both issues.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The frag_list handling was broken in skb_seq_read:

1) We didn't add the stepped offset when looking at the head
are of fragments other than the first.

2) We didn't take the stepped offset away when setting the data
pointer in the head area.

3) The frag index wasn't reset.

This patch fixes both issues.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gro: Fix merging of paged packets</title>
<updated>2009-01-20T22:44:03+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2009-01-17T19:48:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=37fe4732b978eb02e5433387a40f2b61706cebe3'/>
<id>37fe4732b978eb02e5433387a40f2b61706cebe3</id>
<content type='text'>
The previous fix to paged packets broke the merging because it
reset the skb-&gt;len before we added it to the merged packet.  This
wasn't detected because it simply resulted in the truncation of
the packet while the missing bit is subsequently retransmitted.

The fix is to store skb-&gt;len before we clobber it.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The previous fix to paged packets broke the merging because it
reset the skb-&gt;len before we added it to the merged packet.  This
wasn't detected because it simply resulted in the truncation of
the packet while the missing bit is subsequently retransmitted.

The fix is to store skb-&gt;len before we clobber it.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Fix data corruption when splicing from sockets.</title>
<updated>2009-01-20T01:03:56+00:00</updated>
<author>
<name>Jarek Poplawski</name>
<email>jarkao2@gmail.com</email>
</author>
<published>2009-01-20T01:03:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8b9d3728977760f6bd1317c4420890f73695354e'/>
<id>8b9d3728977760f6bd1317c4420890f73695354e</id>
<content type='text'>
The trick in socket splicing where we try to convert the skb-&gt;data
into a page based reference using virt_to_page() does not work so
well.

The idea is to pass the virt_to_page() reference via the pipe
buffer, and refcount the buffer using a SKB reference.

But if we are splicing from a socket to a socket (via sendpage)
this doesn't work.

The from side processing will grab the page (and SKB) references.
The sendpage() calls will grab page references only, return, and
then the from side processing completes and drops the SKB ref.

The page based reference to skb-&gt;data is not enough to keep the
kmalloc() buffer backing it from being reused.  Yet, that is
all that the socket send side has at this point.

This leads to data corruption if the skb-&gt;data buffer is reused
by SLAB before the send side socket actually gets the TX packet
out to the device.

The fix employed here is to simply allocate a page and copy the
skb-&gt;data bytes into that page.

This will hurt performance, but there is no clear way to fix this
properly without a copy at the present time, and it is important
to get rid of the data corruption.

With fixes from Herbert Xu.

Tested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Foreseen-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
Diagnosed-by: Willy Tarreau &lt;w@1wt.eu&gt;
Reported-by: Willy Tarreau &lt;w@1wt.eu&gt;
Fixed-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
Signed-off-by: Jarek Poplawski &lt;jarkao2@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The trick in socket splicing where we try to convert the skb-&gt;data
into a page based reference using virt_to_page() does not work so
well.

The idea is to pass the virt_to_page() reference via the pipe
buffer, and refcount the buffer using a SKB reference.

But if we are splicing from a socket to a socket (via sendpage)
this doesn't work.

The from side processing will grab the page (and SKB) references.
The sendpage() calls will grab page references only, return, and
then the from side processing completes and drops the SKB ref.

The page based reference to skb-&gt;data is not enough to keep the
kmalloc() buffer backing it from being reused.  Yet, that is
all that the socket send side has at this point.

This leads to data corruption if the skb-&gt;data buffer is reused
by SLAB before the send side socket actually gets the TX packet
out to the device.

The fix employed here is to simply allocate a page and copy the
skb-&gt;data bytes into that page.

This will hurt performance, but there is no clear way to fix this
properly without a copy at the present time, and it is important
to get rid of the data corruption.

With fixes from Herbert Xu.

Tested-by: Willy Tarreau &lt;w@1wt.eu&gt;
Foreseen-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
Diagnosed-by: Willy Tarreau &lt;w@1wt.eu&gt;
Reported-by: Willy Tarreau &lt;w@1wt.eu&gt;
Fixed-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
Signed-off-by: Jarek Poplawski &lt;jarkao2@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gro: Fix page ref count for skbs freed normally</title>
<updated>2009-01-15T04:40:03+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2009-01-15T04:40:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f557206800801410c30e53ce7a27219b2c4cf0ba'/>
<id>f557206800801410c30e53ce7a27219b2c4cf0ba</id>
<content type='text'>
When an skb with page frags is merged into an existing one, we
cannibalise its reference count.  This is OK when the skb is
reused because we set nr_frags to zero in that case.  However,
for the case where the skb is freed through kfree_skb, we didn't
clear nr_frags which causes the page to be freed prematurely.

This is fixed by moving the skb resetting into skb_gro_receive.

Reported-by: Jeff Kirsher &lt;jeffrey.t.kirsher@intel.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When an skb with page frags is merged into an existing one, we
cannibalise its reference count.  This is OK when the skb is
reused because we set nr_frags to zero in that case.  However,
for the case where the skb is freed through kfree_skb, we didn't
clear nr_frags which causes the page to be freed prematurely.

This is fixed by moving the skb resetting into skb_gro_receive.

Reported-by: Jeff Kirsher &lt;jeffrey.t.kirsher@intel.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gro: Add page frag support</title>
<updated>2009-01-05T00:13:40+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2009-01-05T00:13:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5d38a079ce3971f932bbdc0dc5b887806fabd5dc'/>
<id>5d38a079ce3971f932bbdc0dc5b887806fabd5dc</id>
<content type='text'>
This patch allows GRO to merge page frags (skb_shinfo(skb)-&gt;frags)
in one skb, rather than using the less efficient frag_list.

It also adds a new interface, napi_gro_frags to allow drivers
to inject page frags directly into the stack without allocating
an skb.  This is intended to be the GRO equivalent for LRO's
lro_receive_frags interface.

The existing GSO interface can already handle page frags with
or without an appended frag_list so nothing needs to be changed
there.

The merging itself is rather simple.  We store any new frag entries
after the last existing entry, without checking whether the first
new entry can be merged with the last existing entry.  Making this
check would actually be easy but since no existing driver can
produce contiguous frags anyway it would just be mental masturbation.

If the total number of entries would exceed the capacity of a
single skb, we simply resort to using frag_list as we do now.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch allows GRO to merge page frags (skb_shinfo(skb)-&gt;frags)
in one skb, rather than using the less efficient frag_list.

It also adds a new interface, napi_gro_frags to allow drivers
to inject page frags directly into the stack without allocating
an skb.  This is intended to be the GRO equivalent for LRO's
lro_receive_frags interface.

The existing GSO interface can already handle page frags with
or without an appended frag_list so nothing needs to be changed
there.

The merging itself is rather simple.  We store any new frag entries
after the last existing entry, without checking whether the first
new entry can be merged with the last existing entry.  Making this
check would actually be easy but since no existing driver can
produce contiguous frags anyway it would just be mental masturbation.

If the total number of entries would exceed the capacity of a
single skb, we simply resort to using frag_list as we do now.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gro: Use gso_size to store MSS</title>
<updated>2009-01-05T00:13:19+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2009-01-05T00:13:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b530256d2e0f1a75fab31f9821129fff1bb49faa'/>
<id>b530256d2e0f1a75fab31f9821129fff1bb49faa</id>
<content type='text'>
In order to allow GRO packets without frag_list at all, we need to
store the MSS in the packet itself.  The obvious place is gso_size.
The only thing to watch out for is if the packet ends up not being
GRO then we need to clear gso_size before pushing the packet into
the stack.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to allow GRO packets without frag_list at all, we need to
store the MSS in the packet itself.  The obvious place is gso_size.
The only thing to watch out for is if the packet ends up not being
GRO then we need to clear gso_size before pushing the packet into
the stack.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add skb_gro_receive</title>
<updated>2008-12-16T07:42:33+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2008-12-16T07:42:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=71d93b39e52e92aea35f1058d957cf12250d0b75'/>
<id>71d93b39e52e92aea35f1058d957cf12250d0b75</id>
<content type='text'>
This patch adds the helper skb_gro_receive to merge packets for
GRO.  The current method is to allocate a new header skb and then
chain the original packets to its frag_list.  This is done to
make it easier to integrate into the existing GSO framework.

In future as GSO is moved into the drivers, we can undo this and
simply chain the original packets together.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds the helper skb_gro_receive to merge packets for
GRO.  The current method is to allocate a new header skb and then
chain the original packets to its frag_list.  This is done to
make it easier to integrate into the existing GSO framework.

In future as GSO is moved into the drivers, we can undo this and
simply chain the original packets together.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add frag_list support to skb_segment</title>
<updated>2008-12-16T07:26:06+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2008-12-16T07:26:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=89319d3801d1d3ac29c7df1f067038986f267d29'/>
<id>89319d3801d1d3ac29c7df1f067038986f267d29</id>
<content type='text'>
This patch adds limited support for handling frag_list packets in
skb_segment.  The intention is to support GRO (Generic Receive Offload)
packets which will be constructed by chaining normal packets using
frag_list.

As such we require all frag_list members terminate on exact MSS
boundaries.  This is checked using BUG_ON.

As there should only be one producer in the kernel of such packets,
namely GRO, this requirement should not be difficult to maintain.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds limited support for handling frag_list packets in
skb_segment.  The intention is to support GRO (Generic Receive Offload)
packets which will be constructed by chaining normal packets using
frag_list.

As such we require all frag_list members terminate on exact MSS
boundaries.  This is checked using BUG_ON.

As there should only be one producer in the kernel of such packets,
namely GRO, this requirement should not be difficult to maintain.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
