linux.git/include/net/sock.h, branch v4.1-rc2

Merge branch 'iocb' into for-davem

2015-04-09T04:01:38+00:00

trivial conflict in net/socket.c and non-trivial one in crypto -
that one had evaded aio_complete() removal.

Signed-off-by: Al Viro

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

2015-04-07T02:34:15+00:00

Conflicts:
	drivers/net/ethernet/mellanox/mlx4/cmd.c
	net/core/fib_rules.c
	net/ipv4/fib_frontend.c

The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.

The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.

Signed-off-by: David S. Miller

ipv6: protect skb->sk accesses from recursive dereference inside the stack

2015-04-06T20:12:49+00:00

We should not consult skb->sk for output decisions in xmit recursion
levels > 0 in the stack. Otherwise local socket settings could influence
the result of e.g. tunnel encapsulation process.

ipv6 does not conform with this in three places:

1) ip6_fragment: we do consult ipv6_npinfo for frag_size

2) sk_mc_loop in ipv6 uses skb->sk and checks if we should
   loop the packet back to the local socket

3) ip6_skb_dst_mtu could query the settings from the user socket and
   force a wrong MTU

Furthermore:
In sk_mc_loop we could potentially land in WARN_ON(1) if we use a
PF_PACKET socket ontop of an IPv6-backed vxlan device.

Reuse xmit_recursion as we are currently only interested in protecting
tunnel devices.

Cc: Jiri Pirko 
Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller

fs: move struct kiocb to fs.h

2015-03-26T00:28:11+00:00

struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Al Viro

net: increase sk_[max_]ack_backlog

2015-03-20T16:40:25+00:00

sk_ack_backlog & sk_max_ack_backlog were 16bit fields, meaning
listen() backlog was limited to 65535.

It is time to increase the width to allow much bigger backlog,
if admins change /proc/sys/net/core/somaxconn &
/proc/sys/net/ipv4/tcp_max_syn_backlog default values.

Tested:

echo 5000000 >/proc/sys/net/core/somaxconn
echo 5000000 >/proc/sys/net/ipv4/tcp_max_syn_backlog

Ran a SYNFLOOD test against a listener using listen(fd, 5000000)

myhost~# grep request_sock_TCP /proc/slabinfo
request_sock_TCP  4185642 4411940    304   13    1 : tunables   54   27    8 : slabdata 339380 339380      0

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller

net: add sk_fullsock() helper

2015-03-16T19:55:28+00:00

We have many places where we want to check if a socket is
not a timewait or request socket. Use a helper to avoid
hard coding this.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller

inet: prepare sock_edemux() & sock_gen_put() for new SYN_RECV state

2015-03-13T02:58:13+00:00

sock_edemux() & sock_gen_put() should be ready to cope with request socks.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller

net: Introduce possible_net_t

2015-03-12T18:39:40+00:00

Having to say
> #ifdef CONFIG_NET_NS
> 	struct net *net;
> #endif

in structures is a little bit wordy and a little bit error prone.

Instead it is possible to say:
> typedef struct {
> #ifdef CONFIG_NET_NS
>       struct net *net;
> #endif
> } possible_net_t;

And then in a header say:

> 	possible_net_t net;

Which is cleaner and easier to use and easier to test, as the
possible_net_t is always there no matter what the compile options.

Further this allows read_pnet and write_pnet to be functions in all
cases which is better at catching typos.

This change adds possible_net_t, updates the definitions of read_pnet
and write_pnet, updates optional struct net * variables that
write_pnet uses on to have the type possible_net_t, and finally fixes
up the b0rked users of read_pnet and write_pnet.

Signed-off-by: "Eric W. Biederman" 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller

net: Kill hold_net release_net

2015-03-12T18:39:40+00:00

hold_net and release_net were an idea that turned out to be useless.
The code has been disabled since 2008.  Kill the code it is long past due.

Signed-off-by: "Eric W. Biederman" 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller

net: add real socket cookies

2015-03-12T01:55:28+00:00

A long standing problem in netlink socket dumps is the use
of kernel socket addresses as cookies.

1) It is a security concern.

2) Sockets can be reused quite quickly, so there is
   no guarantee a cookie is used once and identify
   a flow.

3) request sock, establish sock, and timewait socks
   for a given flow have different cookies.

Part of our effort to bring better TCP statistics requires
to switch to a different allocator.

In this patch, I chose to use a per network namespace 64bit generator,
and to use it only in the case a socket needs to be dumped to netlink.
(This might be refined later if needed)

Note that I tried to carry cookies from request sock, to establish sock,
then timewait sockets.

Signed-off-by: Eric Dumazet 
Cc: Eric Salo 
Signed-off-by: David S. Miller