<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/dccp/output.c, branch linux-3.2.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>dccp: combine the functionality of enqeueing and cloning</title>
<updated>2011-07-04T18:36:47+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2011-07-03T15:51:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8695e80193fed35f27c06f462bd5b76132fd5697'/>
<id>8695e80193fed35f27c06f462bd5b76132fd5697</id>
<content type='text'>
Realising the following call pattern,
 * first dccp_entail() is called to enqueue a new skb and
 * then skb_clone() is called to transmit a clone of that skb,
this patch integrates both into the same function.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Realising the following call pattern,
 * first dccp_entail() is called to enqueue a new skb and
 * then skb_clone() is called to transmit a clone of that skb,
this patch integrates both into the same function.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inet: Pass flowi to -&gt;queue_xmit().</title>
<updated>2011-05-08T22:28:28+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-05-07T05:23:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d9d8da805dcb503ef8ee49918a94d49085060f23'/>
<id>d9d8da805dcb503ef8ee49918a94d49085060f23</id>
<content type='text'>
This allows us to acquire the exact route keying information from the
protocol, however that might be managed.

It handles all of the possibilities, from the simplest case of storing
the key in inet-&gt;cork.fl to the more complex setup SCTP has where
individual transports determine the flow.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This allows us to acquire the exact route keying information from the
protocol, however that might be managed.

It handles all of the possibilities, from the simplest case of storing
the key in inet-&gt;cork.fl to the more complex setup SCTP has where
individual transports determine the flow.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix common misspellings</title>
<updated>2011-03-31T14:26:23+00:00</updated>
<author>
<name>Lucas De Marchi</name>
<email>lucas.demarchi@profusion.mobi</email>
</author>
<published>2011-03-31T01:57:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=25985edcedea6396277003854657b5f3cb31a628'/>
<id>25985edcedea6396277003854657b5f3cb31a628</id>
<content type='text'>
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@profusion.mobi&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@profusion.mobi&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dccp: Policy-based packet dequeueing infrastructure</title>
<updated>2010-12-07T12:47:12+00:00</updated>
<author>
<name>Tomasz Grobelny</name>
<email>tomasz@grobelny.oswiecenia.net</email>
</author>
<published>2010-12-04T12:38:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=871a2c16c21b988688b4ab1a78eadd969765c0a3'/>
<id>871a2c16c21b988688b4ab1a78eadd969765c0a3</id>
<content type='text'>
This patch adds a generic infrastructure for policy-based dequeueing of
TX packets and provides two policies:
 * a simple FIFO policy (which is the default) and
 * a priority based policy (set via socket options).
Both policies honour the tx_qlen sysctl for the maximum size of the write
queue (can be overridden via socket options).

The priority policy uses skb-&gt;priority internally to assign an u32 priority
identifier, using the same ranking as SO_PRIORITY. The skb-&gt;priority field
is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
data using cmsg(3), the patch also provides the requisite parsing routines.

Signed-off-by: Tomasz Grobelny &lt;tomasz@grobelny.oswiecenia.net&gt;
Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds a generic infrastructure for policy-based dequeueing of
TX packets and provides two policies:
 * a simple FIFO policy (which is the default) and
 * a priority based policy (set via socket options).
Both policies honour the tx_qlen sysctl for the maximum size of the write
queue (can be overridden via socket options).

The priority policy uses skb-&gt;priority internally to assign an u32 priority
identifier, using the same ranking as SO_PRIORITY. The skb-&gt;priority field
is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
data using cmsg(3), the patch also provides the requisite parsing routines.

Signed-off-by: Tomasz Grobelny &lt;tomasz@grobelny.oswiecenia.net&gt;
Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dccp ccid-2: Schedule Sync as out-of-band mechanism</title>
<updated>2010-11-15T06:12:00+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2010-11-14T16:25:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d83447f0944e73d690218d79c07762ffa4ceb9e4'/>
<id>d83447f0944e73d690218d79c07762ffa4ceb9e4</id>
<content type='text'>
The problem with Ack Vectors is that
  i) their length is variable and can in principle grow quite large,
 ii) it is hard to predict exactly how large they will be.

Due to the second point it seems not a good idea to reduce the MPS; in
particular when on average there is enough room for the Ack Vector and an
increase in length is momentarily due to some burst loss, after which the
Ack Vector returns to its normal/average length.

The solution taken by this patch is to subtract a minimum-expected Ack Vector
length from the MPS, and to defer any larger Ack Vectors onto a separate
Sync - but only if indeed there is no space left on the skb.

This patch provides the infrastructure to schedule Sync-packets for transporting
(urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
it does not need to wait for new application data.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The problem with Ack Vectors is that
  i) their length is variable and can in principle grow quite large,
 ii) it is hard to predict exactly how large they will be.

Due to the second point it seems not a good idea to reduce the MPS; in
particular when on average there is enough room for the Ack Vector and an
increase in length is momentarily due to some burst loss, after which the
Ack Vector returns to its normal/average length.

The solution taken by this patch is to subtract a minimum-expected Ack Vector
length from the MPS, and to defer any larger Ack Vectors onto a separate
Sync - but only if indeed there is no space left on the skb.

This patch provides the infrastructure to schedule Sync-packets for transporting
(urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since
it does not need to wait for new application data.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dccp: Refine the wait-for-ccid mechanism</title>
<updated>2010-10-28T17:27:01+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2010-10-27T19:16:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b1fcf55eea541af9efa5d39f5a0d1aec8ceca55d'/>
<id>b1fcf55eea541af9efa5d39f5a0d1aec8ceca55d</id>
<content type='text'>
This extends the existing wait-for-ccid routine so that it may be used with
different types of CCID, addressing the following problems:

 1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for
    example has a full TX queue and becomes network-limited just as the
    application wants to close, then waiting for CCID-2 to become unblocked
    could lead to an indefinite  delay (i.e., application "hangs").
 2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
    in its sending policy while the queue is being drained. This can lead to
    further delays during which the application will not be able to terminate.
 3) The minimum wait time for CCID-3/4 can be expected to be the queue length
    times the current inter-packet delay. For example if tx_qlen=100 and a delay
    of 15 ms is used for each packet, then the application would have to wait
    for a minimum of 1.5 seconds before being allowed to exit.
 4) There is no way for the user/application to control this behaviour. It would
    be good to use the timeout argument of dccp_close() as an upper bound. Then
    the maximum time that an application is willing to wait for its CCIDs to can
    be set via the SO_LINGER option.

These problems are addressed by giving the CCID a grace period of up to the
`timeout' value.

The wait-for-ccid function is, as before, used when the application
 (a) has read all the data in its receive buffer and
 (b) if SO_LINGER was set with a non-zero linger time, or
 (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
     state (client application closes after receiving CloseReq).

In addition, there is a catch-all case of __skb_queue_purge() after waiting for
the CCID. This is necessary since the write queue may still have data when
 (a) the host has been passively-closed,
 (b) abnormal termination (unread data, zero linger time),
 (c) wait-for-ccid could not finish within the given time limit.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This extends the existing wait-for-ccid routine so that it may be used with
different types of CCID, addressing the following problems:

 1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for
    example has a full TX queue and becomes network-limited just as the
    application wants to close, then waiting for CCID-2 to become unblocked
    could lead to an indefinite  delay (i.e., application "hangs").
 2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
    in its sending policy while the queue is being drained. This can lead to
    further delays during which the application will not be able to terminate.
 3) The minimum wait time for CCID-3/4 can be expected to be the queue length
    times the current inter-packet delay. For example if tx_qlen=100 and a delay
    of 15 ms is used for each packet, then the application would have to wait
    for a minimum of 1.5 seconds before being allowed to exit.
 4) There is no way for the user/application to control this behaviour. It would
    be good to use the timeout argument of dccp_close() as an upper bound. Then
    the maximum time that an application is willing to wait for its CCIDs to can
    be set via the SO_LINGER option.

These problems are addressed by giving the CCID a grace period of up to the
`timeout' value.

The wait-for-ccid function is, as before, used when the application
 (a) has read all the data in its receive buffer and
 (b) if SO_LINGER was set with a non-zero linger time, or
 (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
     state (client application closes after receiving CloseReq).

In addition, there is a catch-all case of __skb_queue_purge() after waiting for
the CCID. This is necessary since the write queue may still have data when
 (a) the host has been passively-closed,
 (b) abnormal termination (unread data, zero linger time),
 (c) wait-for-ccid could not finish within the given time limit.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dccp: Extend CCID packet dequeueing interface</title>
<updated>2010-10-28T17:27:00+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2010-10-27T19:16:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dc841e30eaea9f9f83c9ab1ee0b3ef9e5c95ce8a'/>
<id>dc841e30eaea9f9f83c9ab1ee0b3ef9e5c95ce8a</id>
<content type='text'>
This extends the packet dequeuing interface of dccp_write_xmit() to allow
 1. CCIDs to take care of timing when the next packet may be sent;
 2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).

The main purpose is to take CCID-2 out of its polling mode (when it is network-
limited, it tries every millisecond to send, without interruption).

The mode of operation for (2) is as follows:
 * new packet is enqueued via dccp_sendmsg() =&gt; dccp_write_xmit(),
 * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full),
 * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
 * dccp_write_xmit() returns without further action;
 * after some time the wait-condition for CCID becomes true,
 * that CCID schedules the tasklet,
 * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
 * since the wait-condition is now true, ccid_hc_tx_packet() returns "send now",
 * packet is sent, and possibly more (since dccp_write_xmit() loops).

Code reuse: the taskled function calls dccp_write_xmit(), the timer function
            reduces to a wrapper around the same code.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This extends the packet dequeuing interface of dccp_write_xmit() to allow
 1. CCIDs to take care of timing when the next packet may be sent;
 2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).

The main purpose is to take CCID-2 out of its polling mode (when it is network-
limited, it tries every millisecond to send, without interruption).

The mode of operation for (2) is as follows:
 * new packet is enqueued via dccp_sendmsg() =&gt; dccp_write_xmit(),
 * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full),
 * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
 * dccp_write_xmit() returns without further action;
 * after some time the wait-condition for CCID becomes true,
 * that CCID schedules the tasklet,
 * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
 * since the wait-condition is now true, ccid_hc_tx_packet() returns "send now",
 * packet is sent, and possibly more (since dccp_write_xmit() loops).

Code reuse: the taskled function calls dccp_write_xmit(), the timer function
            reduces to a wrapper around the same code.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dccp: remove unused argument in CCID tx function</title>
<updated>2010-10-12T04:57:41+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2010-10-11T18:37:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=baf9e782e1dc4991edecfa3b8700cf8739c40259'/>
<id>baf9e782e1dc4991edecfa3b8700cf8739c40259</id>
<content type='text'>
This removes the argument `more' from ccid_hc_tx_packet_sent, since it was
nowhere used in the entire code.

(Btw, this argument was not even used in the original KAME code where the
 function initially came from; compare the variable moreToSend in the
 freebsd61-dccp-kame-28.08.2006.patch kept by Emmanuel Lochin.)

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This removes the argument `more' from ccid_hc_tx_packet_sent, since it was
nowhere used in the entire code.

(Btw, this argument was not even used in the original KAME code where the
 function initially came from; compare the variable moreToSend in the
 freebsd61-dccp-kame-28.08.2006.patch kept by Emmanuel Lochin.)

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dccp: merge now-reduced connect_init() function</title>
<updated>2010-10-12T04:57:40+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2010-10-11T18:36:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=93344af44c0f649582bf1e3b5ecc45b3d19e98c2'/>
<id>93344af44c0f649582bf1e3b5ecc45b3d19e98c2</id>
<content type='text'>
After moving the assignment of GAR/ISS from dccp_connect_init() to
dccp_transmit_skb(), the former function becomes very small, so that
a merger with dccp_connect() suggests itself.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After moving the assignment of GAR/ISS from dccp_connect_init() to
dccp_transmit_skb(), the former function becomes very small, so that
a merger with dccp_connect() suggests itself.

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: sock_def_readable() and friends RCU conversion</title>
<updated>2010-05-01T22:00:15+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-04-29T11:01:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=43815482370c510c569fd18edb57afcb0fa8cab6'/>
<id>43815482370c510c569fd18edb57afcb0fa8cab6</id>
<content type='text'>
sk_callback_lock rwlock actually protects sk-&gt;sk_sleep pointer, so we
need two atomic operations (and associated dirtying) per incoming
packet.

RCU conversion is pretty much needed :

1) Add a new structure, called "struct socket_wq" to hold all fields
that will need rcu_read_lock() protection (currently: a
wait_queue_head_t and a struct fasync_struct pointer).

[Future patch will add a list anchor for wakeup coalescing]

2) Attach one of such structure to each "struct socket" created in
sock_alloc_inode().

3) Respect RCU grace period when freeing a "struct socket_wq"

4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
socket_wq"

5) Change sk_sleep() function to use new sk-&gt;sk_wq instead of
sk-&gt;sk_sleep

6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
a rcu_read_lock() section.

7) Change all sk_has_sleeper() callers to :
  - Use rcu_read_lock() instead of read_lock(&amp;sk-&gt;sk_callback_lock)
  - Use wq_has_sleeper() to eventually wakeup tasks.
  - Use rcu_read_unlock() instead of read_unlock(&amp;sk-&gt;sk_callback_lock)

8) sock_wake_async() is modified to use rcu protection as well.

9) Exceptions :
  macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
instead of dynamically allocated ones. They dont need rcu freeing.

Some cleanups or followups are probably needed, (possible
sk_callback_lock conversion to a spinlock for example...).

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sk_callback_lock rwlock actually protects sk-&gt;sk_sleep pointer, so we
need two atomic operations (and associated dirtying) per incoming
packet.

RCU conversion is pretty much needed :

1) Add a new structure, called "struct socket_wq" to hold all fields
that will need rcu_read_lock() protection (currently: a
wait_queue_head_t and a struct fasync_struct pointer).

[Future patch will add a list anchor for wakeup coalescing]

2) Attach one of such structure to each "struct socket" created in
sock_alloc_inode().

3) Respect RCU grace period when freeing a "struct socket_wq"

4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
socket_wq"

5) Change sk_sleep() function to use new sk-&gt;sk_wq instead of
sk-&gt;sk_sleep

6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
a rcu_read_lock() section.

7) Change all sk_has_sleeper() callers to :
  - Use rcu_read_lock() instead of read_lock(&amp;sk-&gt;sk_callback_lock)
  - Use wq_has_sleeper() to eventually wakeup tasks.
  - Use rcu_read_unlock() instead of read_unlock(&amp;sk-&gt;sk_callback_lock)

8) sock_wake_async() is modified to use rcu protection as well.

9) Exceptions :
  macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
instead of dynamically allocated ones. They dont need rcu freeing.

Some cleanups or followups are probably needed, (possible
sk_callback_lock conversion to a spinlock for example...).

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
