linux.git/net, branch v2.6.21

[NETLINK]: Infinite recursion in netlink.

2007-04-25T20:07:28+00:00

Reply to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.

The bug is present in all kernel versions since the feature appeared.

The patch also makes some minimal cleanup:

1. Return something consistent (-ENOENT) when fib table is missing
2. Do not crash when queue is empty (does not happen, but yet)
3. Put result of lookup

Signed-off-by: Alexey Kuznetsov 
Signed-off-by: David S. Miller

IPv6: fix Routing Header Type 0 handling thinko

2007-04-25T02:26:06+00:00

Oops, thinko.  The test for accempting a RH0 was exatly the wrong way
around.

Signed-off-by: YOSHIFUJI Hideaki 
Signed-off-by: Linus Torvalds

[IPV6]: Disallow RH0 by default.

2007-04-24T21:58:30+00:00

A security issue is emerging.  Disallow Routing Header Type 0 by default
as we have been doing for IPv4.
Note: We allow RH2 by default because it is harmless.

Signed-off-by: YOSHIFUJI Hideaki 
Signed-off-by: David S. Miller

[XFRM]: beet: fix pseudo header length value

2007-04-24T05:39:02+00:00

draft-nikander-esp-beet-mode-07.txt is not entirely clear on how the length
value of the pseudo header should be calculated, it states "The Header Length
field contains the length of the pseudo header, IPv4 options, and padding in
8 octets units.", but also states "Length in octets (Header Len + 1) * 8".
draft-nikander-esp-beet-mode-08-pre1.txt [1] clarifies this, the header length
should not include the first 8 byte.

This change affects backwards compatibility, but option encapsulation didn't
work until very recently anyway.

[1] http://users.piuha.net/jmelen/BEET/draft-nikander-esp-beet-mode-08-pre1.txt

Signed-off-by: Patrick McHardy 
Signed-off-by: David S. Miller

[TCP]: Congestion control initialization.

2007-04-24T05:32:11+00:00

Change to defer congestion control initialization.

If setsockopt() was used to change TCP_CONGESTION before
connection is established, then protocols that use sequence numbers
to keep track of one RTT interval (vegas, illinois, ...) get confused.

Change the init hook to be called after handshake.

Signed-off-by: Stephen Hemminger 
Signed-off-by: David S. Miller

RPC: Fix the TCP resend semantics for NFSv4

2007-04-21T05:56:30+00:00

Fix a regression due to the patch "NFS: disconnect before retrying NFSv4
requests over TCP"

The assumption made in xprt_transmit() that the condition
	"req->rq_bytes_sent == 0 and request is on the receive list"
should imply that we're dealing with a retransmission is false.
Firstly, it may simply happen that the socket send queue was full
at the time the request was initially sent through xprt_transmit().
Secondly, doing this for each request that was retransmitted implies
that we disconnect and reconnect for _every_ request that happened to
be retransmitted irrespective of whether or not a disconnection has
already occurred.

Fix is to move this logic into the call_status request timeout handler.

Signed-off-by: Trond Myklebust 
Signed-off-by: Linus Torvalds

[NETLINK]: Don't attach callback to a going-away netlink socket

2007-04-19T00:05:58+00:00

There is a race between netlink_dump_start() and netlink_release()
that can lead to the situation when a netlink socket with non-zero
callback is freed.

Here it is:

CPU1:                           CPU2
netlink_release():              netlink_dump_start():

                                sk = netlink_lookup(); /* OK */

netlink_remove();

spin_lock(&nlk->cb_lock);
if (nlk->cb) { /* false */
  ...
}
spin_unlock(&nlk->cb_lock);

                                spin_lock(&nlk->cb_lock);
                                if (nlk->cb) { /* false */
                                         ...
                                }
                                nlk->cb = cb;
                                spin_unlock(&nlk->cb_lock);
                                ...
sock_orphan(sk);
/*
 * proceed with releasing
 * the socket
 */

The proposal it to make sock_orphan before detaching the callback
in netlink_release() and to check for the sock to be SOCK_DEAD in
netlink_dump_start() before setting a new callback.

Signed-off-by: Denis Lunev 
Signed-off-by: Kirill Korotaev 
Signed-off-by: Pavel Emelianov 
Acked-by: Patrick McHardy 
Signed-off-by: David S. Miller

[IrDA]: Correctly handling socket error

2007-04-18T22:07:22+00:00

This patch fixes an oops first reported in mid 2006 - see
http://lkml.org/lkml/2006/8/29/358 The cause of this bug report is that
when an error is signalled on the socket, irda_recvmsg_stream returns
without removing a local wait_queue variable from the socket's sk_sleep
queue. This causes havoc further down the road.

In response to this problem, a patch was made that invoked sock_orphan on
the socket when receiving a disconnect indication. This is not a good fix,
as this sets sk_sleep to NULL, causing applications sleeping in recvmsg
(and other places) to oops.

This is against the latest net-2.6 and should be considered for -stable
inclusion. 

Signed-off-by: Olaf Kirch 
Signed-off-by: Samuel Ortiz 
Signed-off-by: David S. Miller

[SCTP]: Do not interleave non-fragments when in partial delivery

2007-04-18T21:16:09+00:00

The way partial delivery is currently implemnted, it is possible to
intereleave a message (either from another steram, or unordered) that
is not part of partial delivery process.  The only way to this is for
a message to not be a fragment and be 'in order' or unorderd for a
given stream.  This will result in bypassing the reassembly/ordering
queues where things live duing partial delivery, and the
message will be delivered to the socket in the middle of partial delivery.

This is a two-fold problem, in that:
1.  the app now must check the stream-id and flags which it may not
be doing.
2.  this clearing partial delivery state from the association and results
in ulp hanging.

This patch is a band-aid over a much bigger problem in that we
don't do stream interleave.

Signed-off-by: Vlad Yasevich 
Signed-off-by: David S. Miller

[IPSEC] af_key: Fix thinko in pfkey_xfrm_policy2msg()

2007-04-18T21:16:07+00:00

Make sure to actually assign the determined mode to
rq->sadb_x_ipsecrequest_mode.

Noticed by Joe Perches.

Signed-off-by: David S. Miller