<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/core/dev.c, branch linux-4.6.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net: Disable segmentation if checksumming is not supported</title>
<updated>2016-05-03T20:00:54+00:00</updated>
<author>
<name>Alexander Duyck</name>
<email>aduyck@mirantis.com</email>
</author>
<published>2016-05-02T16:25:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=996e802187889f1cd412e6929c9344b92ccb78c4'/>
<id>996e802187889f1cd412e6929c9344b92ccb78c4</id>
<content type='text'>
In the case of the mlx4 and mlx5 driver they do not support IPv6 checksum
offload for tunnels.  With this being the case we should disable GSO in
addition to the checksum offload features when we find that a device cannot
perform a checksum on a given packet type.

Signed-off-by: Alexander Duyck &lt;aduyck@mirantis.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the case of the mlx4 and mlx5 driver they do not support IPv6 checksum
offload for tunnels.  With this being the case we should disable GSO in
addition to the checksum offload features when we find that a device cannot
perform a checksum on a given packet type.

Signed-off-by: Alexander Duyck &lt;aduyck@mirantis.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU</title>
<updated>2016-04-07T20:56:33+00:00</updated>
<author>
<name>Alexander Duyck</name>
<email>aduyck@mirantis.com</email>
</author>
<published>2016-04-05T16:13:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a0ca153f98db8cf25298565a09e11fe9d82846ad'/>
<id>a0ca153f98db8cf25298565a09e11fe9d82846ad</id>
<content type='text'>
This patch fixes an issue I found in which we were dropping frames if we
had enabled checksums on GRE headers that were encapsulated by either FOU
or GUE.  Without this patch I was barely able to get 1 Gb/s of throughput.
With this patch applied I am now at least getting around 6 Gb/s.

The issue is due to the fact that with FOU or GUE applied we do not provide
a transport offset pointing to the GRE header, nor do we offload it in
software as the GRE header is completely skipped by GSO and treated like a
VXLAN or GENEVE type header.  As such we need to prevent the stack from
generating it and also prevent GRE from generating it via any interface we
create.

Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE")
Signed-off-by: Alexander Duyck &lt;aduyck@mirantis.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch fixes an issue I found in which we were dropping frames if we
had enabled checksums on GRE headers that were encapsulated by either FOU
or GUE.  Without this patch I was barely able to get 1 Gb/s of throughput.
With this patch applied I am now at least getting around 6 Gb/s.

The issue is due to the fact that with FOU or GUE applied we do not provide
a transport offset pointing to the GRE header, nor do we offload it in
software as the GRE header is completely skipped by GSO and treated like a
VXLAN or GENEVE type header.  As such we need to prevent the stack from
generating it and also prevent GRE from generating it via any interface we
create.

Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE")
Signed-off-by: Alexander Duyck &lt;aduyck@mirantis.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add description for len argument of dev_get_phys_port_name</title>
<updated>2016-03-21T17:28:31+00:00</updated>
<author>
<name>Luis de Bethencourt</name>
<email>luisbg@osg.samsung.com</email>
</author>
<published>2016-03-21T16:31:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ed49e650371008b0e00c8004cc2ca93055740f78'/>
<id>ed49e650371008b0e00c8004cc2ca93055740f78</id>
<content type='text'>
When the function dev_get_phys_port_name was added it missed a description
for it's len argument. Adding it.

Fixes: db24a9044ee1 ("net: add support for phys_port_name")
Signed-off-by: Luis de Bethencourt &lt;luisbg@osg.samsung.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the function dev_get_phys_port_name was added it missed a description
for it's len argument. Adding it.

Fixes: db24a9044ee1 ("net: add support for phys_port_name")
Signed-off-by: Luis de Bethencourt &lt;luisbg@osg.samsung.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tunnels: Don't apply GRO to multiple layers of encapsulation.</title>
<updated>2016-03-20T20:33:40+00:00</updated>
<author>
<name>Jesse Gross</name>
<email>jesse@kernel.org</email>
</author>
<published>2016-03-19T16:32:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fac8e0f579695a3ecbc4d3cac369139d7f819971'/>
<id>fac8e0f579695a3ecbc4d3cac369139d7f819971</id>
<content type='text'>
When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross &lt;jesse@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross &lt;jesse@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2016-02-23T05:09:14+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2016-02-23T05:09:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b633353115e352d3c31c12d4c61978c810f05ea1'/>
<id>b633353115e352d3c31c12d4c61978c810f05ea1</id>
<content type='text'>
Conflicts:
	drivers/net/phy/bcm7xxx.c
	drivers/net/phy/marvell.c
	drivers/net/vxlan.c

All three conflicts were cases of simple overlapping changes.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/phy/bcm7xxx.c
	drivers/net/phy/marvell.c
	drivers/net/vxlan.c

All three conflicts were cases of simple overlapping changes.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: make netdev_for_each_lower_dev safe for device removal</title>
<updated>2016-02-19T20:29:26+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@cumulusnetworks.com</email>
</author>
<published>2016-02-17T17:00:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cfdd28beb3205dbd1e91571516807199c8ab84ca'/>
<id>cfdd28beb3205dbd1e91571516807199c8ab84ca</id>
<content type='text'>
When I used netdev_for_each_lower_dev in commit bad531623253 ("vrf:
remove slave queue and private slave struct") I thought that it acts
like netdev_for_each_lower_private and can be used to remove the current
device from the list while walking, but unfortunately it acts more like
netdev_for_each_lower_private_rcu and doesn't allow it. The difference
is where the "iter" points to, right now it points to the current element
and that makes it impossible to remove it. Change the logic to be
similar to netdev_for_each_lower_private and make it point to the "next"
element so we can safely delete the current one. VRF is the only such
user right now, there's no change for the read-only users.

Here's what can happen now:
[98423.249858] general protection fault: 0000 [#1] SMP
[98423.250175] Modules linked in: vrf bridge(O) stp llc nfsd auth_rpcgss
oid_registry nfs_acl nfs lockd grace sunrpc crct10dif_pclmul
crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng
sha256_generic hmac drbg ppdev aesni_intel aes_x86_64 glue_helper lrw
gf128mul ablk_helper cryptd evdev serio_raw pcspkr virtio_balloon
parport_pc parport i2c_piix4 i2c_core virtio_console acpi_cpufreq button
9pnet_virtio 9p 9pnet fscache ipv6 autofs4 ext4 crc16 mbcache jbd2 sg
virtio_blk virtio_net sr_mod cdrom e1000 ata_generic ehci_pci uhci_hcd
ehci_hcd usbcore usb_common virtio_pci ata_piix libata floppy
virtio_ring virtio scsi_mod [last unloaded: bridge]
[98423.255040] CPU: 1 PID: 14173 Comm: ip Tainted: G           O
4.5.0-rc2+ #81
[98423.255386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[98423.255777] task: ffff8800547f5540 ti: ffff88003428c000 task.ti:
ffff88003428c000
[98423.256123] RIP: 0010:[&lt;ffffffff81514f3e&gt;]  [&lt;ffffffff81514f3e&gt;]
netdev_lower_get_next+0x1e/0x30
[98423.256534] RSP: 0018:ffff88003428f940  EFLAGS: 00010207
[98423.256766] RAX: 0002000100000004 RBX: ffff880054ff9000 RCX:
0000000000000000
[98423.257039] RDX: ffff88003428f8b8 RSI: ffff88003428f950 RDI:
ffff880054ff90c0
[98423.257287] RBP: ffff88003428f940 R08: 0000000000000000 R09:
0000000000000000
[98423.257537] R10: 0000000000000001 R11: 0000000000000000 R12:
ffff88003428f9e0
[98423.257802] R13: ffff880054a5fd00 R14: ffff88003428f970 R15:
0000000000000001
[98423.258055] FS:  00007f3d76881700(0000) GS:ffff88005d000000(0000)
knlGS:0000000000000000
[98423.258418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[98423.258650] CR2: 00007ffe5951ffa8 CR3: 0000000052077000 CR4:
00000000000406e0
[98423.258902] Stack:
[98423.259075]  ffff88003428f960 ffffffffa0442636 0002000100000004
ffff880054ff9000
[98423.259647]  ffff88003428f9b0 ffffffff81518205 ffff880054ff9000
ffff88003428f978
[98423.260208]  ffff88003428f978 ffff88003428f9e0 ffff88003428f9e0
ffff880035b35f00
[98423.260739] Call Trace:
[98423.260920]  [&lt;ffffffffa0442636&gt;] vrf_dev_uninit+0x76/0xa0 [vrf]
[98423.261156]  [&lt;ffffffff81518205&gt;]
rollback_registered_many+0x205/0x390
[98423.261401]  [&lt;ffffffff815183ec&gt;] unregister_netdevice_many+0x1c/0x70
[98423.261641]  [&lt;ffffffff8153223c&gt;] rtnl_delete_link+0x3c/0x50
[98423.271557]  [&lt;ffffffff815335bb&gt;] rtnl_dellink+0xcb/0x1d0
[98423.271800]  [&lt;ffffffff811cd7da&gt;] ? __inc_zone_state+0x4a/0x90
[98423.272049]  [&lt;ffffffff815337b4&gt;] rtnetlink_rcv_msg+0x84/0x200
[98423.272279]  [&lt;ffffffff810cfe7d&gt;] ? trace_hardirqs_on+0xd/0x10
[98423.272513]  [&lt;ffffffff8153370b&gt;] ? rtnetlink_rcv+0x1b/0x40
[98423.272755]  [&lt;ffffffff81533730&gt;] ? rtnetlink_rcv+0x40/0x40
[98423.272983]  [&lt;ffffffff8155d6e7&gt;] netlink_rcv_skb+0x97/0xb0
[98423.273209]  [&lt;ffffffff8153371a&gt;] rtnetlink_rcv+0x2a/0x40
[98423.273476]  [&lt;ffffffff8155ce8b&gt;] netlink_unicast+0x11b/0x1a0
[98423.273710]  [&lt;ffffffff8155d2f1&gt;] netlink_sendmsg+0x3e1/0x610
[98423.273947]  [&lt;ffffffff814fbc98&gt;] sock_sendmsg+0x38/0x70
[98423.274175]  [&lt;ffffffff814fc253&gt;] ___sys_sendmsg+0x2e3/0x2f0
[98423.274416]  [&lt;ffffffff810d841e&gt;] ? do_raw_spin_unlock+0xbe/0x140
[98423.274658]  [&lt;ffffffff811e1bec&gt;] ? handle_mm_fault+0x26c/0x2210
[98423.274894]  [&lt;ffffffff811e19cd&gt;] ? handle_mm_fault+0x4d/0x2210
[98423.275130]  [&lt;ffffffff81269611&gt;] ? __fget_light+0x91/0xb0
[98423.275365]  [&lt;ffffffff814fcd42&gt;] __sys_sendmsg+0x42/0x80
[98423.275595]  [&lt;ffffffff814fcd92&gt;] SyS_sendmsg+0x12/0x20
[98423.275827]  [&lt;ffffffff81611bb6&gt;] entry_SYSCALL_64_fastpath+0x16/0x7a
[98423.276073] Code: c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66
90 48 8b 06 55 48 81 c7 c0 00 00 00 48 89 e5 48 8b 00 48 39 f8 74 09 48
89 06 &lt;48&gt; 8b 40 e8 5d c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66
[98423.279639] RIP  [&lt;ffffffff81514f3e&gt;] netdev_lower_get_next+0x1e/0x30
[98423.279920]  RSP &lt;ffff88003428f940&gt;

CC: David Ahern &lt;dsa@cumulusnetworks.com&gt;
CC: David S. Miller &lt;davem@davemloft.net&gt;
CC: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
CC: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Fixes: bad531623253 ("vrf: remove slave queue and private slave struct")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Reviewed-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Tested-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When I used netdev_for_each_lower_dev in commit bad531623253 ("vrf:
remove slave queue and private slave struct") I thought that it acts
like netdev_for_each_lower_private and can be used to remove the current
device from the list while walking, but unfortunately it acts more like
netdev_for_each_lower_private_rcu and doesn't allow it. The difference
is where the "iter" points to, right now it points to the current element
and that makes it impossible to remove it. Change the logic to be
similar to netdev_for_each_lower_private and make it point to the "next"
element so we can safely delete the current one. VRF is the only such
user right now, there's no change for the read-only users.

Here's what can happen now:
[98423.249858] general protection fault: 0000 [#1] SMP
[98423.250175] Modules linked in: vrf bridge(O) stp llc nfsd auth_rpcgss
oid_registry nfs_acl nfs lockd grace sunrpc crct10dif_pclmul
crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng
sha256_generic hmac drbg ppdev aesni_intel aes_x86_64 glue_helper lrw
gf128mul ablk_helper cryptd evdev serio_raw pcspkr virtio_balloon
parport_pc parport i2c_piix4 i2c_core virtio_console acpi_cpufreq button
9pnet_virtio 9p 9pnet fscache ipv6 autofs4 ext4 crc16 mbcache jbd2 sg
virtio_blk virtio_net sr_mod cdrom e1000 ata_generic ehci_pci uhci_hcd
ehci_hcd usbcore usb_common virtio_pci ata_piix libata floppy
virtio_ring virtio scsi_mod [last unloaded: bridge]
[98423.255040] CPU: 1 PID: 14173 Comm: ip Tainted: G           O
4.5.0-rc2+ #81
[98423.255386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[98423.255777] task: ffff8800547f5540 ti: ffff88003428c000 task.ti:
ffff88003428c000
[98423.256123] RIP: 0010:[&lt;ffffffff81514f3e&gt;]  [&lt;ffffffff81514f3e&gt;]
netdev_lower_get_next+0x1e/0x30
[98423.256534] RSP: 0018:ffff88003428f940  EFLAGS: 00010207
[98423.256766] RAX: 0002000100000004 RBX: ffff880054ff9000 RCX:
0000000000000000
[98423.257039] RDX: ffff88003428f8b8 RSI: ffff88003428f950 RDI:
ffff880054ff90c0
[98423.257287] RBP: ffff88003428f940 R08: 0000000000000000 R09:
0000000000000000
[98423.257537] R10: 0000000000000001 R11: 0000000000000000 R12:
ffff88003428f9e0
[98423.257802] R13: ffff880054a5fd00 R14: ffff88003428f970 R15:
0000000000000001
[98423.258055] FS:  00007f3d76881700(0000) GS:ffff88005d000000(0000)
knlGS:0000000000000000
[98423.258418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[98423.258650] CR2: 00007ffe5951ffa8 CR3: 0000000052077000 CR4:
00000000000406e0
[98423.258902] Stack:
[98423.259075]  ffff88003428f960 ffffffffa0442636 0002000100000004
ffff880054ff9000
[98423.259647]  ffff88003428f9b0 ffffffff81518205 ffff880054ff9000
ffff88003428f978
[98423.260208]  ffff88003428f978 ffff88003428f9e0 ffff88003428f9e0
ffff880035b35f00
[98423.260739] Call Trace:
[98423.260920]  [&lt;ffffffffa0442636&gt;] vrf_dev_uninit+0x76/0xa0 [vrf]
[98423.261156]  [&lt;ffffffff81518205&gt;]
rollback_registered_many+0x205/0x390
[98423.261401]  [&lt;ffffffff815183ec&gt;] unregister_netdevice_many+0x1c/0x70
[98423.261641]  [&lt;ffffffff8153223c&gt;] rtnl_delete_link+0x3c/0x50
[98423.271557]  [&lt;ffffffff815335bb&gt;] rtnl_dellink+0xcb/0x1d0
[98423.271800]  [&lt;ffffffff811cd7da&gt;] ? __inc_zone_state+0x4a/0x90
[98423.272049]  [&lt;ffffffff815337b4&gt;] rtnetlink_rcv_msg+0x84/0x200
[98423.272279]  [&lt;ffffffff810cfe7d&gt;] ? trace_hardirqs_on+0xd/0x10
[98423.272513]  [&lt;ffffffff8153370b&gt;] ? rtnetlink_rcv+0x1b/0x40
[98423.272755]  [&lt;ffffffff81533730&gt;] ? rtnetlink_rcv+0x40/0x40
[98423.272983]  [&lt;ffffffff8155d6e7&gt;] netlink_rcv_skb+0x97/0xb0
[98423.273209]  [&lt;ffffffff8153371a&gt;] rtnetlink_rcv+0x2a/0x40
[98423.273476]  [&lt;ffffffff8155ce8b&gt;] netlink_unicast+0x11b/0x1a0
[98423.273710]  [&lt;ffffffff8155d2f1&gt;] netlink_sendmsg+0x3e1/0x610
[98423.273947]  [&lt;ffffffff814fbc98&gt;] sock_sendmsg+0x38/0x70
[98423.274175]  [&lt;ffffffff814fc253&gt;] ___sys_sendmsg+0x2e3/0x2f0
[98423.274416]  [&lt;ffffffff810d841e&gt;] ? do_raw_spin_unlock+0xbe/0x140
[98423.274658]  [&lt;ffffffff811e1bec&gt;] ? handle_mm_fault+0x26c/0x2210
[98423.274894]  [&lt;ffffffff811e19cd&gt;] ? handle_mm_fault+0x4d/0x2210
[98423.275130]  [&lt;ffffffff81269611&gt;] ? __fget_light+0x91/0xb0
[98423.275365]  [&lt;ffffffff814fcd42&gt;] __sys_sendmsg+0x42/0x80
[98423.275595]  [&lt;ffffffff814fcd92&gt;] SyS_sendmsg+0x12/0x20
[98423.275827]  [&lt;ffffffff81611bb6&gt;] entry_SYSCALL_64_fastpath+0x16/0x7a
[98423.276073] Code: c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66 66
90 48 8b 06 55 48 81 c7 c0 00 00 00 48 89 e5 48 8b 00 48 39 f8 74 09 48
89 06 &lt;48&gt; 8b 40 e8 5d c3 31 c0 5d c3 0f 1f 84 00 00 00 00 00 66 66 66
[98423.279639] RIP  [&lt;ffffffff81514f3e&gt;] netdev_lower_get_next+0x1e/0x30
[98423.279920]  RSP &lt;ffff88003428f940&gt;

CC: David Ahern &lt;dsa@cumulusnetworks.com&gt;
CC: David S. Miller &lt;davem@davemloft.net&gt;
CC: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
CC: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Fixes: bad531623253 ("vrf: remove slave queue and private slave struct")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Reviewed-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Tested-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IFF_NO_QUEUE: Fix for drivers not calling ether_setup()</title>
<updated>2016-02-18T19:56:53+00:00</updated>
<author>
<name>Phil Sutter</name>
<email>phil@nwl.cc</email>
</author>
<published>2016-02-17T14:37:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a813104d923339144078939175faf4e66aca19b4'/>
<id>a813104d923339144078939175faf4e66aca19b4</id>
<content type='text'>
My implementation around IFF_NO_QUEUE driver flag assumed that leaving
tx_queue_len untouched (specifically: not setting it to zero) by drivers
would make it possible to assign a regular qdisc to them without having
to worry about setting tx_queue_len to a useful value. This was only
partially true: I overlooked that some drivers don't call ether_setup()
and therefore not initialize tx_queue_len to the default value of 1000.
Consequently, removing the workarounds in place for that case in qdisc
implementations which cared about it (namely, pfifo, bfifo, gred, htb,
plug and sfb) leads to problems with these specific interface types and
qdiscs.

Luckily, there's already a sanitization point for drivers setting
tx_queue_len to zero, which can be reused to assign the fallback value
most qdisc implementations used, which is 1.

Fixes: 348e3435cbefa ("net: sched: drop all special handling of tx_queue_len == 0")
Tested-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
My implementation around IFF_NO_QUEUE driver flag assumed that leaving
tx_queue_len untouched (specifically: not setting it to zero) by drivers
would make it possible to assign a regular qdisc to them without having
to worry about setting tx_queue_len to a useful value. This was only
partially true: I overlooked that some drivers don't call ether_setup()
and therefore not initialize tx_queue_len to the default value of 1000.
Consequently, removing the workarounds in place for that case in qdisc
implementations which cared about it (namely, pfifo, bfifo, gred, htb,
plug and sfb) leads to problems with these specific interface types and
qdiscs.

Luckily, there's already a sanitization point for drivers setting
tx_queue_len to zero, which can be reused to assign the fallback value
most qdisc implementations used, which is 1.

Fixes: 348e3435cbefa ("net: sched: drop all special handling of tx_queue_len == 0")
Tested-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bulk free SKBs that were delay free'ed due to IRQ context</title>
<updated>2016-02-11T16:59:09+00:00</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>brouer@redhat.com</email>
</author>
<published>2016-02-08T12:15:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=15fad714be86eab13e7568fecaf475b2a9730d3e'/>
<id>15fad714be86eab13e7568fecaf475b2a9730d3e</id>
<content type='text'>
The network stack defers SKBs free, in-case free happens in IRQ or
when IRQs are disabled. This happens in __dev_kfree_skb_irq() that
writes SKBs that were free'ed during IRQ to the softirq completion
queue (softnet_data.completion_queue).

These SKBs are naturally delayed, and cleaned up during NET_TX_SOFTIRQ
in function net_tx_action().  Take advantage of this a use the skb
defer and flush API, as we are already in softirq context.

For modern drivers this rarely happens. Although most drivers do call
dev_kfree_skb_any(), which detects the situation and calls
__dev_kfree_skb_irq() when needed.  This due to netpoll can call from
IRQ context.

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@redhat.com&gt;
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The network stack defers SKBs free, in-case free happens in IRQ or
when IRQs are disabled. This happens in __dev_kfree_skb_irq() that
writes SKBs that were free'ed during IRQ to the softirq completion
queue (softnet_data.completion_queue).

These SKBs are naturally delayed, and cleaned up during NET_TX_SOFTIRQ
in function net_tx_action().  Take advantage of this a use the skb
defer and flush API, as we are already in softirq context.

For modern drivers this rarely happens. Although most drivers do call
dev_kfree_skb_any(), which detects the situation and calls
__dev_kfree_skb_irq() when needed.  This due to netpoll can call from
IRQ context.

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@redhat.com&gt;
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bulk free infrastructure for NAPI context, use napi_consume_skb</title>
<updated>2016-02-11T16:59:09+00:00</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>brouer@redhat.com</email>
</author>
<published>2016-02-08T12:14:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=795bb1c00dd338aa0d12f9a7f1f4776fb3160416'/>
<id>795bb1c00dd338aa0d12f9a7f1f4776fb3160416</id>
<content type='text'>
Discovered that network stack were hitting the kmem_cache/SLUB
slowpath when freeing SKBs.  Doing bulk free with kmem_cache_free_bulk
can speedup this slowpath.

NAPI context is a bit special, lets take advantage of that for bulk
free'ing SKBs.

In NAPI context we are running in softirq, which gives us certain
protection.  A softirq can run on several CPUs at once.  BUT the
important part is a softirq will never preempt another softirq running
on the same CPU.  This gives us the opportunity to access per-cpu
variables in softirq context.

Extend napi_alloc_cache (before only contained page_frag_cache) to be
a struct with a small array based stack for holding SKBs.  Introduce a
SKB defer and flush API for accessing this.

Introduce napi_consume_skb() as replacement for e.g. dev_consume_skb_any()
when running in NAPI context.  A small trick to handle/detect if we
are called from netpoll is to see if budget is 0.  In that case, we
need to invoke dev_consume_skb_irq().

Joint work with Alexander Duyck.

Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Discovered that network stack were hitting the kmem_cache/SLUB
slowpath when freeing SKBs.  Doing bulk free with kmem_cache_free_bulk
can speedup this slowpath.

NAPI context is a bit special, lets take advantage of that for bulk
free'ing SKBs.

In NAPI context we are running in softirq, which gives us certain
protection.  A softirq can run on several CPUs at once.  BUT the
important part is a softirq will never preempt another softirq running
on the same CPU.  This gives us the opportunity to access per-cpu
variables in softirq context.

Extend napi_alloc_cache (before only contained page_frag_cache) to be
a struct with a small array based stack for holding SKBs.  Introduce a
SKB defer and flush API for accessing this.

Introduce napi_consume_skb() as replacement for e.g. dev_consume_skb_any()
when running in NAPI context.  A small trick to handle/detect if we
are called from netpoll is to see if budget is 0.  In that case, we
need to invoke dev_consume_skb_irq().

Joint work with Alexander Duyck.

Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add rx_nohandler stat counter</title>
<updated>2016-02-06T07:59:51+00:00</updated>
<author>
<name>Jarod Wilson</name>
<email>jarod@redhat.com</email>
</author>
<published>2016-02-01T23:51:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6e7333d315a768170a59ac771297ee0551bdddbf'/>
<id>6e7333d315a768170a59ac771297ee0551bdddbf</id>
<content type='text'>
This adds an rx_nohandler stat counter, along with a sysfs statistics
node, and copies the counter out via netlink as well.

CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Eric Dumazet &lt;edumazet@google.com&gt;
CC: Jiri Pirko &lt;jiri@mellanox.com&gt;
CC: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
CC: Tom Herbert &lt;tom@herbertland.com&gt;
CC: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
CC: Veaceslav Falico &lt;vfalico@gmail.com&gt;
CC: Andy Gospodarek &lt;gospo@cumulusnetworks.com&gt;
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson &lt;jarod@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds an rx_nohandler stat counter, along with a sysfs statistics
node, and copies the counter out via netlink as well.

CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: Eric Dumazet &lt;edumazet@google.com&gt;
CC: Jiri Pirko &lt;jiri@mellanox.com&gt;
CC: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
CC: Tom Herbert &lt;tom@herbertland.com&gt;
CC: Jay Vosburgh &lt;j.vosburgh@gmail.com&gt;
CC: Veaceslav Falico &lt;vfalico@gmail.com&gt;
CC: Andy Gospodarek &lt;gospo@cumulusnetworks.com&gt;
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson &lt;jarod@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
