linux-stable.git/Documentation/networking, branch v3.12.35

packet: fix send path when running with proto == 0

2014-01-15T23:31:34+00:00

[ Upstream commit 66e56cd46b93ef407c60adcac62cf33b06119d50 ]

Commit e40526cb20b5 introduced a cached dev pointer, that gets
hooked into register_prot_hook(), __unregister_prot_hook() to
update the device used for the send path.

We need to fix this up, as otherwise this will not work with
sockets created with protocol = 0, plus with sll_protocol = 0
passed via sockaddr_ll when doing the bind.

So instead, assign the pointer directly. The compiler can inline
these helper functions automagically.

While at it, also assume the cached dev fast-path as likely(),
and document this variant of socket creation as it seems it is
not widely used (seems not even the author of TX_RING was aware
of that in his reference example [1]). Tested with reproducer
from e40526cb20b5.

 [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example
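
The socket variant described above (protocol = 0 at creation, and
sll_protocol = 0 at bind time) can be sketched from userspace as follows.
This is a minimal illustration, not the reproducer from e40526cb20b5: the
helper name open_tx_only_packet_socket and the default interface "lo" are
assumptions made for the example, and creating the socket needs
CAP_NET_RAW on Linux.

```python
import socket


def open_tx_only_packet_socket(ifname="lo"):
    """Open an AF_PACKET socket with protocol 0 and bind it with protocol 0.

    Hypothetical helper for illustration. Returns the bound socket, or
    None when it cannot be set up (non-Linux platform, missing
    CAP_NET_RAW, unknown interface).
    """
    af_packet = getattr(socket, "AF_PACKET", None)
    if af_packet is None:
        return None  # AF_PACKET is Linux-specific

    try:
        # protocol = 0: the socket is created without a receive protocol.
        sock = socket.socket(af_packet, socket.SOCK_RAW, 0)
    except OSError:
        return None  # typically PermissionError without CAP_NET_RAW

    try:
        # The second tuple element maps to sll_protocol in sockaddr_ll.
        sock.bind((ifname, 0))
    except OSError:
        sock.close()
        return None
    return sock
```

A socket opened this way is meant for transmission only; with the fix
above, the send path should find the bound device even though no protocol
was ever set.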

Fixes: e40526cb20b5 ("packet: fix use after free race in send path when dev is released")
Signed-off-by: Daniel Borkmann 
Tested-by: Salam Noureddine 
Tested-by: Jesper Dangaard Brouer 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

tcp: tsq: restore minimal amount of queueing

2013-12-08T15:29:12+00:00

[ Upstream commit 98e09386c0ef4dfd48af7ba60ff908f0d525cdee ]

After commit c9eeec26e32e ("tcp: TSQ can use a dynamic limit"), several
users reported throughput regressions, notably on mvneta and wifi
adapters.

802.11 AMPDU requires a fair amount of queueing to be effective.

This patch partially reverts the change done in tcp_write_xmit()
so that the minimal amount is sysctl_tcp_limit_output_bytes.

It also removes the use of this sysctl while building skbs stored
in the write queue, as TSO autosizing does the right thing anyway.

Users with well-behaved NICs and a suitable qdisc (like sch_fq)
can then lower the default sysctl_tcp_limit_output_bytes value from
128KB to 8KB.

This new usage of sysctl_tcp_limit_output_bytes lets driver
authors check how their driver performs when/if the value is set
to a minimum of 4KB.

Normally, line rate for a single TCP flow should be possible,
but some drivers rely on timers to perform TX completion, and
excessively long TX completion delays prevent reaching full throughput.
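
The effect of using sysctl_tcp_limit_output_bytes as a floor can be
sketched with a small model. This is a simplified illustration, not the
code in tcp_write_xmit(): the function name tsq_limit_bytes is
hypothetical, and the dynamic term (pacing rate shifted right by 10,
roughly one millisecond of data) is an assumption about the limit
introduced by c9eeec26e32e, which this message does not spell out.

```python
def tsq_limit_bytes(pacing_rate_bytes_per_sec, tcp_limit_output_bytes=131072):
    """Model of the per-socket TSQ limit with the sysctl as a minimum.

    pacing_rate_bytes_per_sec: stand-in for sk_pacing_rate (bytes/second).
    tcp_limit_output_bytes: stand-in for the sysctl (128KB default).
    """
    # Assumed dynamic part: about 1 ms worth of data at the pacing rate.
    dynamic = pacing_rate_bytes_per_sec >> 10
    # The sysctl acts as the minimal amount of queueing allowed.
    return max(tcp_limit_output_bytes, dynamic)
```

In this model a slow flow (1 MB/s) keeps the full 128KB allowance with
the default sysctl, and lowering the sysctl to 8KB shrinks that allowance
to 8KB; a fast flow is governed by the dynamic term instead.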

Fixes: c9eeec26e32e ("tcp: TSQ can use a dynamic limit")
Signed-off-by: Eric Dumazet 
Reported-by: Sujith Manoharan 
Reported-by: Arnaud Ebalard 
Tested-by: Sujith Manoharan 
Cc: Felix Fietkau 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

bonding: Make alb learning packet interval configurable

2013-09-16T02:20:44+00:00

Running bonding in ALB mode requires that learning packets be sent periodically,
so that the switch knows where to send responding traffic.  However, depending
on switch configuration, there may not be any need to send traffic at the
default rate of 3 packets per second, which represents little more than wasted
data.  Make the ALB learning packet interval configurable via sysfs.

Signed-off-by: Neil Horman 
Acked-by: Veaceslav Falico 
CC: Jay Vosburgh 
CC: Andy Gospodarek 
CC: "David S. Miller" 
Signed-off-by: Andy Gospodarek 
Signed-off-by: David S. Miller

i40e: include i40e in kernel proper

2013-09-11T09:28:40+00:00

This patch adds the changes for Kconfig, i40e.txt, MAINTAINERS, Kbuild
and new i40e/Makefile to build i40e with the kernel.

The new driver build option is CONFIG_I40E.

Signed-off-by: Jesse Brandeburg 
Signed-off-by: Shannon Nelson 
CC: PJ Waskiewicz 
CC: e1000-devel@lists.sourceforge.net
Tested-by: Kavindya Deegala 
Signed-off-by: Jeff Kirsher

net: ipv6: mld: document force_mld_version in ip-sysctl.txt

2013-09-04T18:53:21+00:00

Document force_mld_version parameter in ip-sysctl.txt.

Signed-off-by: Daniel Borkmann 
Cc: Hannes Frederic Sowa 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller

driver:net:stmmac: Disable DMA store and forward mode if platform data force_thresh_dma_mode is set.

2013-08-30T21:26:09+00:00

Some Synopsys IP implementations, such as the one in BF60x, don't support
DMA store and forward mode. So, set force_thresh_dma_mode to use DMA
thresholds only. Update documentation and devicetree as well.

Signed-off-by: Sonic Zhang 
Acked-by: Giuseppe Cavallaro 
Signed-off-by: David S. Miller

net: packet: document available fanout policies

2013-08-29T20:43:29+00:00

Update documentation to add fanout policies that are available.

Signed-off-by: Daniel Borkmann 
Signed-off-by: David S. Miller

tcp: TSO packets automatic sizing

2013-08-29T19:50:06+00:00

After hearing many people over the past years complain about TSO being
bursty or even buggy, we are proud to present automatic sizing of TSO
packets.

One part of the problem is that tcp_tso_should_defer() uses a heuristic
relying on upcoming ACKs instead of a timer, but more generally, having
big TSO packets makes little sense for low rates, as it tends to create
micro bursts on the network, and the general consensus is to reduce the
buffering amount.

This patch introduces a per socket sk_pacing_rate, that approximates
the current sending rate, and allows us to size the TSO packets so
that we try to send one packet every ms.

This field could be set by other transports.

Patch has no impact for high speed flows, where having large TSO packets
makes sense to reach line rate.

For other flows, this helps better packet scheduling and ACK clocking.

This patch increases performance of TCP flows in lossy environments.

A new sysctl (tcp_min_tso_segs) is added, to specify the
minimal size of a TSO packet (default being 2).

A follow-up patch will provide a new packet scheduler (FQ), using
sk_pacing_rate as an input to perform optional per flow pacing.

This explains why we chose to set sk_pacing_rate to twice the current
rate, allowing 'slow start' ramp up.

sk_pacing_rate = 2 * cwnd * mss / srtt

v2: Neal Cardwell reported a suspect deferring of the last two segments on
the initial write of 10 MSS. I had to change tcp_tso_should_defer() to take
tp->xmit_size_goal_segs into account.
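
As a worked example of the sizing described above, the sketch below
computes sk_pacing_rate from the formula and derives a TSO size goal of
about one packet per millisecond, bounded below by tcp_min_tso_segs. It
is a simplified model under stated assumptions, not the kernel code: the
function names, the microsecond unit for srtt, the halving that undoes
the factor of 2, and the 65535-byte cap are all choices made for this
illustration.

```python
def pacing_rate(cwnd_segments, mss_bytes, srtt_us):
    """sk_pacing_rate = 2 * cwnd * mss / srtt, in bytes per second.

    srtt_us is the smoothed RTT in microseconds (assumed unit), so the
    computation stays in integer arithmetic.
    """
    return 2 * cwnd_segments * mss_bytes * 1_000_000 // srtt_us


def tso_size_goal(rate, mss_bytes, min_tso_segs=2, gso_max_bytes=65535):
    """Bytes per TSO packet so that roughly one packet is sent per ms.

    Assumption: the rate is halved first, since sk_pacing_rate is set to
    twice the current rate; then 1/1000 of a second's worth is taken.
    """
    per_ms = rate // (2 * 1000)
    # Never go below min_tso_segs full-sized segments (tcp_min_tso_segs).
    goal = max(per_ms, min_tso_segs * mss_bytes)
    # Never exceed the assumed maximum size of a TSO packet.
    return min(goal, gso_max_bytes)
```

For a slow flow (cwnd = 10, mss = 1448, srtt = 100 ms) the rate is
289600 bytes/s and the goal falls back to the two-segment minimum of
2896 bytes. For a fast flow (cwnd = 1000, srtt = 1 ms) the goal hits the
cap, which matches the statement that high speed flows keep large TSO
packets.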

Signed-off-by: Eric Dumazet 
Cc: Neal Cardwell 
Cc: Yuchung Cheng 
Cc: Van Jacobson 
Cc: Tom Herbert 
Acked-by: Yuchung Cheng 
Acked-by: Neal Cardwell 
Signed-off-by: David S. Miller

ipv6: drop fragmented ndisc packets by default (RFC 6980)

2013-08-29T19:32:08+00:00

This patch implements RFC 6980: Drop fragmented ndisc packets by
default. If a fragmented ndisc packet is received, the user is informed
that it is possible to disable the check.

Cc: Fernando Gont 
Cc: YOSHIFUJI Hideaki 
Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch

2013-08-28T02:11:18+00:00

Jesse Gross says:

====================
A number of significant new features and optimizations for net-next/3.12.
Highlights are:
 * "Megaflows", an optimization that allows userspace to specify which
   flow fields were used to compute the results of the flow lookup.
   This allows for a major reduction in flow setups (the major
   performance bottleneck in Open vSwitch) without reducing flexibility.
 * Converting netlink dump operations to use RCU, allowing for
   additional parallelism in userspace.
 * Matching and modifying SCTP protocol fields.
====================

Signed-off-by: David S. Miller