<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/net/bareudp.c, branch v5.12</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>udp: call udp_encap_enable for v6 sockets when enabling encap</title>
<updated>2021-02-05T02:37:14+00:00</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2021-02-03T08:54:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a4a600dd301ccde6ea239804ec1f19364a39d643'/>
<id>a4a600dd301ccde6ea239804ec1f19364a39d643</id>
<content type='text'>
When enabling encap for an ipv6 socket without udp_encap_needed_key
increased, UDP GRO won't work for v4 mapped v6 address packets as
sk will be NULL in udp4_gro_receive().

This patch is to enable it by increasing udp_encap_needed_key for
v6 sockets in udp_tunnel_encap_enable(), and correspondingly
decrease udp_encap_needed_key in udpv6_destroy_sock().

v1-&gt;v2:
  - add udp_encap_disable() and export it.
v2-&gt;v3:
  - add the change for rxrpc and bareudp into one patch, as Alex
    suggested.
v3-&gt;v4:
  - move rxrpc part to another patch.

Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When enabling encap for an ipv6 socket without udp_encap_needed_key
increased, UDP GRO won't work for v4 mapped v6 address packets as
sk will be NULL in udp4_gro_receive().

This patch is to enable it by increasing udp_encap_needed_key for
v6 sockets in udp_tunnel_encap_enable(), and correspondingly
decrease udp_encap_needed_key in udpv6_destroy_sock().

v1-&gt;v2:
  - add udp_encap_disable() and export it.
v2-&gt;v3:
  - add the change for rxrpc and bareudp into one patch, as Alex
    suggested.
v3-&gt;v4:
  - move rxrpc part to another patch.

Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bareudp: add NETIF_F_FRAGLIST flag for dev features</title>
<updated>2021-01-19T03:57:02+00:00</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2021-01-15T09:47:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3224dcfd850fccf92e20e55ff74a7bd079458ba8'/>
<id>3224dcfd850fccf92e20e55ff74a7bd079458ba8</id>
<content type='text'>
Like vxlan and geneve, bareudp also needs this dev feature
to support some protocol's HW GSO.

Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Like vxlan and geneve, bareudp also needs this dev feature
to support some protocol's HW GSO.

Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bareudp: simplify error paths calling dellink</title>
<updated>2021-01-12T01:09:23+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2021-01-11T05:29:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1d04ccb916ce81b8313712bae8a2304c62769c1f'/>
<id>1d04ccb916ce81b8313712bae8a2304c62769c1f</id>
<content type='text'>
bareudp_dellink() only needs the device list to hand it to
unregister_netdevice_queue(). We can pass NULL in, and
unregister_netdevice_queue() will do the unregistering.
There is no chance for batching on the error path, anyway.

Suggested-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Reviewed-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Link: https://lore.kernel.org/r/20210111052922.2145003-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bareudp_dellink() only needs the device list to hand it to
unregister_netdevice_queue(). We can pass NULL in, and
unregister_netdevice_queue() will do the unregistering.
There is no chance for batching on the error path, anyway.

Suggested-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Reviewed-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Link: https://lore.kernel.org/r/20210111052922.2145003-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bareudp: add missing error handling for bareudp_link_config()</title>
<updated>2021-01-06T23:56:44+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2021-01-05T19:07:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=94bcfdbff0c210b17b27615f4952cc6ece7d5f5f'/>
<id>94bcfdbff0c210b17b27615f4952cc6ece7d5f5f</id>
<content type='text'>
.dellink does not get called after .newlink fails, so
bareudp_newlink() must undo what bareudp_configure()
has done if bareudp_link_config() fails.

v2: call bareudp_dellink(), like bareudp_dev_create() does

Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Link: https://lore.kernel.org/r/20210105190725.1736246-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
.dellink does not get called after .newlink fails, so
bareudp_newlink() must undo what bareudp_configure()
has done if bareudp_link_config() fails.

v2: call bareudp_dellink(), like bareudp_dev_create() does

Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Link: https://lore.kernel.org/r/20210105190725.1736246-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bareudp: Fix use of incorrect min_headroom size</title>
<updated>2021-01-04T21:13:04+00:00</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2020-12-28T15:21:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=10ad3e998fa0c25315f27cf3002ff8b02dc31c38'/>
<id>10ad3e998fa0c25315f27cf3002ff8b02dc31c38</id>
<content type='text'>
bareudp6_xmit_skb() calculates min_headroom using the size of
struct iphdr, which is too small for an IPv6 header.
The skb can therefore run out of headroom and a panic could occur.
The struct ipv6hdr should be used instead.
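
As a rough illustration of why the header type matters, the sketch
below models the headroom sum in Python. The helper min_headroom()
and the link_reserved value are hypothetical and simplified, not the
driver's code; only the header sizes and the head/data addresses
(copied from the splat below) are real.

```python
# Sizes of the fixed headers in bytes; these are protocol constants.
IPV4_HDR_LEN = 20   # sizeof(struct iphdr), no options
IPV6_HDR_LEN = 40   # sizeof(struct ipv6hdr)
UDP_HDR_LEN = 8     # sizeof(struct udphdr)


def min_headroom(link_reserved, ip_hdr_len, options_len=0):
    """Simplified, hypothetical model of the headroom needed to encapsulate."""
    return link_reserved + UDP_HDR_LEN + options_len + ip_hdr_len


# link_reserved=16 is an illustrative value, not taken from the patch.
buggy = min_headroom(16, IPV4_HDR_LEN)
fixed = min_headroom(16, IPV6_HDR_LEN)
shortfall = fixed - buggy  # bytes under-reserved per IPv6 encapsulation

# Values copied from the skb_under_panic line in the splat.
head = 0xffff88810abb4000
data = 0xffff88810abb3ffa
underrun = head - data  # how far skb_push() moved data below head

print(shortfall, underrun)
```

With nested tunnels, as in the test commands, each IPv6 layer
reserves too little, which is consistent with data ending up below
head when the Ethernet header is pushed.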

Test commands:
    ip netns add A
    ip netns add B
    ip link add veth0 netns A type veth peer name veth1 netns B
    ip netns exec A ip link set veth0 up
    ip netns exec A ip a a 2001:db8:0::1/64 dev veth0
    ip netns exec B ip link set veth1 up
    ip netns exec B ip a a 2001:db8:0::2/64 dev veth1

    for i in {10..1}
    do
            let A=$i-1
            ip netns exec A ip link add bareudp$i type bareudp dstport $i \
		    ethertype 0x86dd
            ip netns exec A ip link set bareudp$i up
            ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev bareudp$i
            ip netns exec A ip -6 r a 2001:db8:$i::2 encap ip6 src \
		    2001:db8:$A::1 dst 2001:db8:$A::2 via 2001:db8:$i::2 \
		    dev bareudp$i

            ip netns exec B ip link add bareudp$i type bareudp dstport $i \
		    ethertype 0x86dd
            ip netns exec B ip link set bareudp$i up
            ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev bareudp$i
            ip netns exec B ip -6 r a 2001:db8:$i::1 encap ip6 src \
		    2001:db8:$A::2 dst 2001:db8:$A::1 via 2001:db8:$i::1 \
		    dev bareudp$i
    done
    ip netns exec A ping 2001:db8:7::2

Splat looks like:
[   66.436679][    C2] skbuff: skb_under_panic: text:ffffffff928614c8 len:454 put:14 head:ffff88810abb4000 data:ffff88810abb3ffa tail:0x1c0 end:0x3ec0 dev:veth0
[   66.441626][    C2] ------------[ cut here ]------------
[   66.443458][    C2] kernel BUG at net/core/skbuff.c:109!
[   66.445313][    C2] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[   66.447606][    C2] CPU: 2 PID: 913 Comm: ping Not tainted 5.10.0+ #819
[   66.450251][    C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[   66.453713][    C2] RIP: 0010:skb_panic+0x15d/0x15f
[   66.455345][    C2] Code: 98 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 60 8b 78 93 41 57 41 56 41 55 48 8b 54 24 20 48 8b 74 24 28 e8 b5 40 f9 ff &lt;0f&gt; 0b 48 8b 6c 24 20 89 34 24 e8 08 c9 98 fe 8b 34 24 48 c7 c1 80
[   66.462314][    C2] RSP: 0018:ffff888119209648 EFLAGS: 00010286
[   66.464281][    C2] RAX: 0000000000000089 RBX: ffff888003159000 RCX: 0000000000000000
[   66.467216][    C2] RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed10232412c0
[   66.469768][    C2] RBP: ffff88810a53d440 R08: ffffed102328018d R09: ffffed102328018d
[   66.472297][    C2] R10: ffff888119400c67 R11: ffffed102328018c R12: 000000000000000e
[   66.474833][    C2] R13: ffff88810abb3ffa R14: 00000000000001c0 R15: 0000000000003ec0
[   66.477361][    C2] FS:  00007f37c0c72f00(0000) GS:ffff888119200000(0000) knlGS:0000000000000000
[   66.480214][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.482296][    C2] CR2: 000055a058808570 CR3: 000000011039e002 CR4: 00000000003706e0
[   66.484811][    C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   66.487793][    C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.490424][    C2] Call Trace:
[   66.491469][    C2]  &lt;IRQ&gt;
[   66.492374][    C2]  ? eth_header+0x28/0x190
[   66.494054][    C2]  ? eth_header+0x28/0x190
[   66.495401][    C2]  skb_push.cold.99+0x22/0x22
[   66.496700][    C2]  eth_header+0x28/0x190
[   66.497867][    C2]  neigh_resolve_output+0x3de/0x720
[   66.499615][    C2]  ? __neigh_update+0x7e8/0x20a0
[   66.501176][    C2]  __neigh_update+0x8bd/0x20a0
[   66.502749][    C2]  ndisc_update+0x34/0xc0
[   66.504010][    C2]  ndisc_recv_na+0x8da/0xb80
[   66.505041][    C2]  ? pndisc_redo+0x20/0x20
[   66.505888][    C2]  ? rcu_read_lock_sched_held+0xc0/0xc0
[   66.506965][    C2]  ndisc_rcv+0x3a0/0x470
[   66.507797][    C2]  icmpv6_rcv+0xad9/0x1b00
[   66.508645][    C2]  ip6_protocol_deliver_rcu+0xcd6/0x1560
[   66.509719][    C2]  ip6_input_finish+0x5b/0xf0
[   66.510615][    C2]  ip6_input+0xcd/0x2d0
[   66.511406][    C2]  ? ip6_input_finish+0xf0/0xf0
[   66.512327][    C2]  ? rcu_read_lock_held+0x91/0xa0
[   66.513279][    C2]  ? ip6_protocol_deliver_rcu+0x1560/0x1560
[   66.514414][    C2]  ipv6_rcv+0xe8/0x300
[ ... ]

Acked-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Link: https://lore.kernel.org/r/20201228152146.24270-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bareudp6_xmit_skb() calculates min_headroom using the size of
struct iphdr, which is too small for an IPv6 header.
The skb can therefore run out of headroom and a panic could occur.
The struct ipv6hdr should be used instead.

Test commands:
    ip netns add A
    ip netns add B
    ip link add veth0 netns A type veth peer name veth1 netns B
    ip netns exec A ip link set veth0 up
    ip netns exec A ip a a 2001:db8:0::1/64 dev veth0
    ip netns exec B ip link set veth1 up
    ip netns exec B ip a a 2001:db8:0::2/64 dev veth1

    for i in {10..1}
    do
            let A=$i-1
            ip netns exec A ip link add bareudp$i type bareudp dstport $i \
		    ethertype 0x86dd
            ip netns exec A ip link set bareudp$i up
            ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev bareudp$i
            ip netns exec A ip -6 r a 2001:db8:$i::2 encap ip6 src \
		    2001:db8:$A::1 dst 2001:db8:$A::2 via 2001:db8:$i::2 \
		    dev bareudp$i

            ip netns exec B ip link add bareudp$i type bareudp dstport $i \
		    ethertype 0x86dd
            ip netns exec B ip link set bareudp$i up
            ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev bareudp$i
            ip netns exec B ip -6 r a 2001:db8:$i::1 encap ip6 src \
		    2001:db8:$A::2 dst 2001:db8:$A::1 via 2001:db8:$i::1 \
		    dev bareudp$i
    done
    ip netns exec A ping 2001:db8:7::2

Splat looks like:
[   66.436679][    C2] skbuff: skb_under_panic: text:ffffffff928614c8 len:454 put:14 head:ffff88810abb4000 data:ffff88810abb3ffa tail:0x1c0 end:0x3ec0 dev:veth0
[   66.441626][    C2] ------------[ cut here ]------------
[   66.443458][    C2] kernel BUG at net/core/skbuff.c:109!
[   66.445313][    C2] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[   66.447606][    C2] CPU: 2 PID: 913 Comm: ping Not tainted 5.10.0+ #819
[   66.450251][    C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[   66.453713][    C2] RIP: 0010:skb_panic+0x15d/0x15f
[   66.455345][    C2] Code: 98 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 60 8b 78 93 41 57 41 56 41 55 48 8b 54 24 20 48 8b 74 24 28 e8 b5 40 f9 ff &lt;0f&gt; 0b 48 8b 6c 24 20 89 34 24 e8 08 c9 98 fe 8b 34 24 48 c7 c1 80
[   66.462314][    C2] RSP: 0018:ffff888119209648 EFLAGS: 00010286
[   66.464281][    C2] RAX: 0000000000000089 RBX: ffff888003159000 RCX: 0000000000000000
[   66.467216][    C2] RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed10232412c0
[   66.469768][    C2] RBP: ffff88810a53d440 R08: ffffed102328018d R09: ffffed102328018d
[   66.472297][    C2] R10: ffff888119400c67 R11: ffffed102328018c R12: 000000000000000e
[   66.474833][    C2] R13: ffff88810abb3ffa R14: 00000000000001c0 R15: 0000000000003ec0
[   66.477361][    C2] FS:  00007f37c0c72f00(0000) GS:ffff888119200000(0000) knlGS:0000000000000000
[   66.480214][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.482296][    C2] CR2: 000055a058808570 CR3: 000000011039e002 CR4: 00000000003706e0
[   66.484811][    C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   66.487793][    C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.490424][    C2] Call Trace:
[   66.491469][    C2]  &lt;IRQ&gt;
[   66.492374][    C2]  ? eth_header+0x28/0x190
[   66.494054][    C2]  ? eth_header+0x28/0x190
[   66.495401][    C2]  skb_push.cold.99+0x22/0x22
[   66.496700][    C2]  eth_header+0x28/0x190
[   66.497867][    C2]  neigh_resolve_output+0x3de/0x720
[   66.499615][    C2]  ? __neigh_update+0x7e8/0x20a0
[   66.501176][    C2]  __neigh_update+0x8bd/0x20a0
[   66.502749][    C2]  ndisc_update+0x34/0xc0
[   66.504010][    C2]  ndisc_recv_na+0x8da/0xb80
[   66.505041][    C2]  ? pndisc_redo+0x20/0x20
[   66.505888][    C2]  ? rcu_read_lock_sched_held+0xc0/0xc0
[   66.506965][    C2]  ndisc_rcv+0x3a0/0x470
[   66.507797][    C2]  icmpv6_rcv+0xad9/0x1b00
[   66.508645][    C2]  ip6_protocol_deliver_rcu+0xcd6/0x1560
[   66.509719][    C2]  ip6_input_finish+0x5b/0xf0
[   66.510615][    C2]  ip6_input+0xcd/0x2d0
[   66.511406][    C2]  ? ip6_input_finish+0xf0/0xf0
[   66.512327][    C2]  ? rcu_read_lock_held+0x91/0xa0
[   66.513279][    C2]  ? ip6_protocol_deliver_rcu+0x1560/0x1560
[   66.514414][    C2]  ipv6_rcv+0xe8/0x300
[ ... ]

Acked-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Link: https://lore.kernel.org/r/20201228152146.24270-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bareudp: set NETIF_F_LLTX flag</title>
<updated>2021-01-04T21:13:03+00:00</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2020-12-28T15:21:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d9e44981739a96f1a468c13bbbd54ace378caf1c'/>
<id>d9e44981739a96f1a468c13bbbd54ace378caf1c</id>
<content type='text'>
Like other tunneling interfaces, the bareudp doesn't need TXLOCK.
So, it is good to set the NETIF_F_LLTX flag to improve performance and
to avoid lockdep's false-positive warning.

Test commands:
    ip netns add A
    ip netns add B
    ip link add veth0 netns A type veth peer name veth1 netns B
    ip netns exec A ip link set veth0 up
    ip netns exec A ip a a 10.0.0.1/24 dev veth0
    ip netns exec B ip link set veth1 up
    ip netns exec B ip a a 10.0.0.2/24 dev veth1

    for i in {2..1}
    do
            let A=$i-1
            ip netns exec A ip link add bareudp$i type bareudp \
		    dstport $i ethertype ip
            ip netns exec A ip link set bareudp$i up
            ip netns exec A ip a a 10.0.$i.1/24 dev bareudp$i
            ip netns exec A ip r a 10.0.$i.2 encap ip src 10.0.$A.1 \
		    dst 10.0.$A.2 via 10.0.$i.2 dev bareudp$i

            ip netns exec B ip link add bareudp$i type bareudp \
		    dstport $i ethertype ip
            ip netns exec B ip link set bareudp$i up
            ip netns exec B ip a a 10.0.$i.2/24 dev bareudp$i
            ip netns exec B ip r a 10.0.$i.1 encap ip src 10.0.$A.2 \
		    dst 10.0.$A.1 via 10.0.$i.1 dev bareudp$i
    done
    ip netns exec A ping 10.0.2.2

Splat looks like:
[   96.992803][  T822] ============================================
[   96.993954][  T822] WARNING: possible recursive locking detected
[   96.995102][  T822] 5.10.0+ #819 Not tainted
[   96.995927][  T822] --------------------------------------------
[   96.997091][  T822] ping/822 is trying to acquire lock:
[   96.998083][  T822] ffff88810f753898 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   96.999813][  T822]
[   96.999813][  T822] but task is already holding lock:
[   97.001192][  T822] ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   97.002908][  T822]
[   97.002908][  T822] other info that might help us debug this:
[   97.004401][  T822]  Possible unsafe locking scenario:
[   97.004401][  T822]
[   97.005784][  T822]        CPU0
[   97.006407][  T822]        ----
[   97.007010][  T822]   lock(_xmit_NONE#2);
[   97.007779][  T822]   lock(_xmit_NONE#2);
[   97.008550][  T822]
[   97.008550][  T822]  *** DEADLOCK ***
[   97.008550][  T822]
[   97.010057][  T822]  May be due to missing lock nesting notation
[   97.010057][  T822]
[   97.011594][  T822] 7 locks held by ping/822:
[   97.012426][  T822]  #0: ffff888109a144f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x12f7/0x2b00
[   97.014191][  T822]  #1: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[   97.016045][  T822]  #2: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[   97.017897][  T822]  #3: ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   97.019684][  T822]  #4: ffffffffbce2f600 (rcu_read_lock){....}-{1:2}, at: bareudp_xmit+0x31b/0x3690 [bareudp]
[   97.021573][  T822]  #5: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[   97.023424][  T822]  #6: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[   97.025259][  T822]
[   97.025259][  T822] stack backtrace:
[   97.026349][  T822] CPU: 3 PID: 822 Comm: ping Not tainted 5.10.0+ #819
[   97.027609][  T822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[   97.029407][  T822] Call Trace:
[   97.030015][  T822]  dump_stack+0x99/0xcb
[   97.030783][  T822]  __lock_acquire.cold.77+0x149/0x3a9
[   97.031773][  T822]  ? stack_trace_save+0x81/0xa0
[   97.032661][  T822]  ? register_lock_class+0x1910/0x1910
[   97.033673][  T822]  ? register_lock_class+0x1910/0x1910
[   97.034679][  T822]  ? rcu_read_lock_sched_held+0x91/0xc0
[   97.035697][  T822]  ? rcu_read_lock_bh_held+0xa0/0xa0
[   97.036690][  T822]  lock_acquire+0x1b2/0x730
[   97.037515][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
[   97.038466][  T822]  ? check_flags+0x50/0x50
[   97.039277][  T822]  ? netif_skb_features+0x296/0x9c0
[   97.040226][  T822]  ? validate_xmit_skb+0x29/0xb10
[   97.041151][  T822]  _raw_spin_lock+0x30/0x70
[   97.041977][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
[   97.042927][  T822]  __dev_queue_xmit+0x1f52/0x2960
[   97.043852][  T822]  ? netdev_core_pick_tx+0x290/0x290
[   97.044824][  T822]  ? mark_held_locks+0xb7/0x120
[   97.045712][  T822]  ? lockdep_hardirqs_on_prepare+0x12c/0x3e0
[   97.046824][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
[   97.047771][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.048710][  T822]  ? trace_hardirqs_on+0x41/0x120
[   97.049626][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.050556][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
[   97.051509][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.052443][  T822]  ? check_chain_key+0x244/0x5f0
[   97.053352][  T822]  ? rcu_read_lock_bh_held+0x56/0xa0
[   97.054317][  T822]  ? ip_finish_output2+0x6ea/0x2020
[   97.055263][  T822]  ? pneigh_lookup+0x410/0x410
[   97.056135][  T822]  ip_finish_output2+0x6ea/0x2020
[ ... ]

Acked-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Link: https://lore.kernel.org/r/20201228152136.24215-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Like other tunneling interfaces, the bareudp doesn't need TXLOCK.
So, it is good to set the NETIF_F_LLTX flag to improve performance and
to avoid lockdep's false-positive warning.

Test commands:
    ip netns add A
    ip netns add B
    ip link add veth0 netns A type veth peer name veth1 netns B
    ip netns exec A ip link set veth0 up
    ip netns exec A ip a a 10.0.0.1/24 dev veth0
    ip netns exec B ip link set veth1 up
    ip netns exec B ip a a 10.0.0.2/24 dev veth1

    for i in {2..1}
    do
            let A=$i-1
            ip netns exec A ip link add bareudp$i type bareudp \
		    dstport $i ethertype ip
            ip netns exec A ip link set bareudp$i up
            ip netns exec A ip a a 10.0.$i.1/24 dev bareudp$i
            ip netns exec A ip r a 10.0.$i.2 encap ip src 10.0.$A.1 \
		    dst 10.0.$A.2 via 10.0.$i.2 dev bareudp$i

            ip netns exec B ip link add bareudp$i type bareudp \
		    dstport $i ethertype ip
            ip netns exec B ip link set bareudp$i up
            ip netns exec B ip a a 10.0.$i.2/24 dev bareudp$i
            ip netns exec B ip r a 10.0.$i.1 encap ip src 10.0.$A.2 \
		    dst 10.0.$A.1 via 10.0.$i.1 dev bareudp$i
    done
    ip netns exec A ping 10.0.2.2

Splat looks like:
[   96.992803][  T822] ============================================
[   96.993954][  T822] WARNING: possible recursive locking detected
[   96.995102][  T822] 5.10.0+ #819 Not tainted
[   96.995927][  T822] --------------------------------------------
[   96.997091][  T822] ping/822 is trying to acquire lock:
[   96.998083][  T822] ffff88810f753898 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   96.999813][  T822]
[   96.999813][  T822] but task is already holding lock:
[   97.001192][  T822] ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   97.002908][  T822]
[   97.002908][  T822] other info that might help us debug this:
[   97.004401][  T822]  Possible unsafe locking scenario:
[   97.004401][  T822]
[   97.005784][  T822]        CPU0
[   97.006407][  T822]        ----
[   97.007010][  T822]   lock(_xmit_NONE#2);
[   97.007779][  T822]   lock(_xmit_NONE#2);
[   97.008550][  T822]
[   97.008550][  T822]  *** DEADLOCK ***
[   97.008550][  T822]
[   97.010057][  T822]  May be due to missing lock nesting notation
[   97.010057][  T822]
[   97.011594][  T822] 7 locks held by ping/822:
[   97.012426][  T822]  #0: ffff888109a144f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x12f7/0x2b00
[   97.014191][  T822]  #1: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[   97.016045][  T822]  #2: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[   97.017897][  T822]  #3: ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[   97.019684][  T822]  #4: ffffffffbce2f600 (rcu_read_lock){....}-{1:2}, at: bareudp_xmit+0x31b/0x3690 [bareudp]
[   97.021573][  T822]  #5: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[   97.023424][  T822]  #6: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[   97.025259][  T822]
[   97.025259][  T822] stack backtrace:
[   97.026349][  T822] CPU: 3 PID: 822 Comm: ping Not tainted 5.10.0+ #819
[   97.027609][  T822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[   97.029407][  T822] Call Trace:
[   97.030015][  T822]  dump_stack+0x99/0xcb
[   97.030783][  T822]  __lock_acquire.cold.77+0x149/0x3a9
[   97.031773][  T822]  ? stack_trace_save+0x81/0xa0
[   97.032661][  T822]  ? register_lock_class+0x1910/0x1910
[   97.033673][  T822]  ? register_lock_class+0x1910/0x1910
[   97.034679][  T822]  ? rcu_read_lock_sched_held+0x91/0xc0
[   97.035697][  T822]  ? rcu_read_lock_bh_held+0xa0/0xa0
[   97.036690][  T822]  lock_acquire+0x1b2/0x730
[   97.037515][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
[   97.038466][  T822]  ? check_flags+0x50/0x50
[   97.039277][  T822]  ? netif_skb_features+0x296/0x9c0
[   97.040226][  T822]  ? validate_xmit_skb+0x29/0xb10
[   97.041151][  T822]  _raw_spin_lock+0x30/0x70
[   97.041977][  T822]  ? __dev_queue_xmit+0x1f52/0x2960
[   97.042927][  T822]  __dev_queue_xmit+0x1f52/0x2960
[   97.043852][  T822]  ? netdev_core_pick_tx+0x290/0x290
[   97.044824][  T822]  ? mark_held_locks+0xb7/0x120
[   97.045712][  T822]  ? lockdep_hardirqs_on_prepare+0x12c/0x3e0
[   97.046824][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
[   97.047771][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.048710][  T822]  ? trace_hardirqs_on+0x41/0x120
[   97.049626][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.050556][  T822]  ? __local_bh_enable_ip+0xa5/0xf0
[   97.051509][  T822]  ? ___neigh_create+0x12a8/0x1eb0
[   97.052443][  T822]  ? check_chain_key+0x244/0x5f0
[   97.053352][  T822]  ? rcu_read_lock_bh_held+0x56/0xa0
[   97.054317][  T822]  ? ip_finish_output2+0x6ea/0x2020
[   97.055263][  T822]  ? pneigh_lookup+0x410/0x410
[   97.056135][  T822]  ip_finish_output2+0x6ea/0x2020
[ ... ]

Acked-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Link: https://lore.kernel.org/r/20201228152136.24215-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bareudp: constify device_type declaration</title>
<updated>2020-12-03T02:00:18+00:00</updated>
<author>
<name>Jonas Bonn</name>
<email>jonas@norrbonn.se</email>
</author>
<published>2020-12-02T12:23:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cec85994c6b4fa6beb5de61dcd03e23001b9deb5'/>
<id>cec85994c6b4fa6beb5de61dcd03e23001b9deb5</id>
<content type='text'>
device_type may be declared as const.

Signed-off-by: Jonas Bonn &lt;jonas@norrbonn.se&gt;
Link: https://lore.kernel.org/r/20201202122324.564918-1-jonas@norrbonn.se
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
device_type may be declared as const.

Signed-off-by: Jonas Bonn &lt;jonas@norrbonn.se&gt;
Link: https://lore.kernel.org/r/20201202122324.564918-1-jonas@norrbonn.se
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: switch to dev_get_tstats64</title>
<updated>2020-11-10T01:50:28+00:00</updated>
<author>
<name>Heiner Kallweit</name>
<email>hkallweit1@gmail.com</email>
</author>
<published>2020-11-07T20:52:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b220a4a79ce576149a412c030fdb703144a07832'/>
<id>b220a4a79ce576149a412c030fdb703144a07832</id>
<content type='text'>
Replace ip_tunnel_get_stats64() with the new identical core function
dev_get_tstats64().

Signed-off-by: Heiner Kallweit &lt;hkallweit1@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace ip_tunnel_get_stats64() with the new identical core function
dev_get_tstats64().

Signed-off-by: Heiner Kallweit &lt;hkallweit1@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bareudp: use dev_sw_netstats_rx_add()</title>
<updated>2020-10-06T13:23:21+00:00</updated>
<author>
<name>Fabian Frederick</name>
<email>fabf@skynet.be</email>
</author>
<published>2020-10-05T20:35:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8fdfffd0b97a0f1e46618afad7e24ebf889952c4'/>
<id>8fdfffd0b97a0f1e46618afad7e24ebf889952c4</id>
<content type='text'>
use new helper for netstats settings

Signed-off-by: Fabian Frederick &lt;fabf@skynet.be&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
use new helper for netstats settings

Signed-off-by: Fabian Frederick &lt;fabf@skynet.be&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tunnels: PMTU discovery support for directly bridged IP packets</title>
<updated>2020-08-04T20:01:45+00:00</updated>
<author>
<name>Stefano Brivio</name>
<email>sbrivio@redhat.com</email>
</author>
<published>2020-08-04T05:53:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4cb47a8644cc9eb8ec81190a50e79e6530d0297f'/>
<id>4cb47a8644cc9eb8ec81190a50e79e6530d0297f</id>
<content type='text'>
It's currently possible to bridge Ethernet tunnels carrying IP
packets directly to external interfaces without assigning them
addresses and routes on the bridged network itself: this is the case
for UDP tunnels bridged with a standard bridge or by Open vSwitch.

PMTU discovery is currently broken with those configurations, because
the encapsulation effectively decreases the MTU of the link, and
while we are able to account for this using PMTU discovery on the
lower layer, we don't have a way to relay ICMP or ICMPv6 messages
needed by the sender, because we don't have valid routes to it.

On the other hand, as a tunnel endpoint, we can't fragment packets
as a general approach: this is for instance clearly forbidden for
VXLAN by RFC 7348, section 4.3:

   VTEPs MUST NOT fragment VXLAN packets.  Intermediate routers may
   fragment encapsulated VXLAN packets due to the larger frame size.
   The destination VTEP MAY silently discard such VXLAN fragments.

The same paragraph recommends that the MTU over the physical network
accommodates encapsulations, but this isn't a practical option for
complex topologies, especially for typical Open vSwitch use cases.

Further, it states that:

   Other techniques like Path MTU discovery (see [RFC1191] and
   [RFC1981]) MAY be used to address this requirement as well.

Now, PMTU discovery already works for routed interfaces, we get
route exceptions created by the encapsulation device as they receive
ICMP Fragmentation Needed and ICMPv6 Packet Too Big messages, and
we already rebuild those messages with the appropriate MTU and route
them back to the sender.

Add the missing bits for bridged cases:

- checks in skb_tunnel_check_pmtu() to understand if it's appropriate
  to trigger a reply according to RFC 1122 section 3.2.2 for ICMP and
  RFC 4443 section 2.4 for ICMPv6. This function is already called by
  UDP tunnels

- a new function generating those ICMP or ICMPv6 replies. We can't
  reuse icmp_send() and icmp6_send() as we don't see the sender as a
  valid destination. This doesn't need to be generic, as we don't
  cover any other type of ICMP errors given that we only provide an
  encapsulation function to the sender

While at it, make the MTU check in skb_tunnel_check_pmtu() accurate:
we might receive GSO buffers here, and the passed headroom already
includes the inner MAC length, so we don't have to account for it
a second time (that would imply three MAC headers on the wire, but
there are just two).

This issue became visible while bridging IPv6 packets with 4500 bytes
of payload over GENEVE using IPv4 with a PMTU of 4000. Given the 50
bytes of encapsulation headroom, we would advertise MTU as 3950, and
we would reject fragmented IPv6 datagrams of 3958 bytes size on the
wire. We're exclusively dealing with network MTU here, though, so we
could get Ethernet frames up to 3964 octets in that case.

v2:
- moved skb_tunnel_check_pmtu() to ip_tunnel_core.c (David Ahern)
- split IPv4/IPv6 functions (David Ahern)

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's currently possible to bridge Ethernet tunnels carrying IP
packets directly to external interfaces without assigning them
addresses and routes on the bridged network itself: this is the case
for UDP tunnels bridged with a standard bridge or by Open vSwitch.

PMTU discovery is currently broken with those configurations, because
the encapsulation effectively decreases the MTU of the link, and
while we are able to account for this using PMTU discovery on the
lower layer, we don't have a way to relay ICMP or ICMPv6 messages
needed by the sender, because we don't have valid routes to it.

On the other hand, as a tunnel endpoint, we can't fragment packets
as a general approach: this is for instance clearly forbidden for
VXLAN by RFC 7348, section 4.3:

   VTEPs MUST NOT fragment VXLAN packets.  Intermediate routers may
   fragment encapsulated VXLAN packets due to the larger frame size.
   The destination VTEP MAY silently discard such VXLAN fragments.

The same paragraph recommends that the MTU over the physical network
accommodates for encapsulations, but this isn't a practical option for
complex topologies, especially for typical Open vSwitch use cases.

Further, it states that:

   Other techniques like Path MTU discovery (see [RFC1191] and
   [RFC1981]) MAY be used to address this requirement as well.

Now, PMTU discovery already works for routed interfaces, we get
route exceptions created by the encapsulation device as they receive
ICMP Fragmentation Needed and ICMPv6 Packet Too Big messages, and
we already rebuild those messages with the appropriate MTU and route
them back to the sender.

Add the missing bits for bridged cases:

- checks in skb_tunnel_check_pmtu() to understand if it's appropriate
  to trigger a reply according to RFC 1122 section 3.2.2 for ICMP and
  RFC 4443 section 2.4 for ICMPv6. This function is already called by
  UDP tunnels

- a new function generating those ICMP or ICMPv6 replies. We can't
  reuse icmp_send() and icmp6_send() as we don't see the sender as a
  valid destination. This doesn't need to be generic, as we don't
  cover any other type of ICMP errors given that we only provide an
  encapsulation function to the sender

While at it, make the MTU check in skb_tunnel_check_pmtu() accurate:
we might receive GSO buffers here, and the passed headroom already
includes the inner MAC length, so we don't have to account for it
a second time (that would imply three MAC headers on the wire, but
there are just two).

This issue became visible while bridging IPv6 packets with 4500 bytes
of payload over GENEVE using IPv4 with a PMTU of 4000. Given the 50
bytes of encapsulation headroom, we would advertise MTU as 3950, and
we would reject fragmented IPv6 datagrams of 3958 bytes size on the
wire. We're exclusively dealing with network MTU here, though, so we
could get Ethernet frames up to 3964 octets in that case.

v2:
- moved skb_tunnel_check_pmtu() to ip_tunnel_core.c (David Ahern)
- split IPv4/IPv6 functions (David Ahern)

Signed-off-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
