<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv6/route.c, branch linux-4.2.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>ipv6: Check rt-&gt;dst.from for the DST_NOCACHE route</title>
<updated>2015-12-15T05:25:35+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-11-11T19:51:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7c803d69b4a8b68553511614a3f034ce6b806532'/>
<id>7c803d69b4a8b68553511614a3f034ce6b806532</id>
<content type='text'>
[ Upstream commit 02bcf4e082e4dc634409a6a6cb7def8806d6e5e6 ]

All DST_NOCACHE rt6_info used to have rt-&gt;dst.from set to
its parent.

After commit 8e3d5be73681 ("ipv6: Avoid double dst_free"),
DST_NOCACHE is also set on rt6_info that does not have
a parent (i.e. rt-&gt;dst.from is NULL).

This patch catches the rt-&gt;dst.from == NULL case.

Fixes: 8e3d5be73681 ("ipv6: Avoid double dst_free")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 02bcf4e082e4dc634409a6a6cb7def8806d6e5e6 ]

All DST_NOCACHE rt6_info used to have rt-&gt;dst.from set to
its parent.

After commit 8e3d5be73681 ("ipv6: Avoid double dst_free"),
DST_NOCACHE is also set on rt6_info that does not have
a parent (i.e. rt-&gt;dst.from is NULL).

This patch catches the rt-&gt;dst.from == NULL case.

Fixes: 8e3d5be73681 ("ipv6: Avoid double dst_free")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Check expire on DST_NOCACHE route</title>
<updated>2015-12-15T05:25:35+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-11-11T19:51:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8b5054279ee51fd3082a0d3ec9b987548086c318'/>
<id>8b5054279ee51fd3082a0d3ec9b987548086c318</id>
<content type='text'>
[ Upstream commit 5973fb1e245086071bf71994c8b54d99526ded03 ]

Since the expires value of a DST_NOCACHE rt can be set during
ip6_rt_update_pmtu(), we also need to consider the expires
value when doing ip6_dst_check().

This patch creates __rt6_check_expired() to check only
the expires value (if one exists) of the current rt.

In rt6_dst_from_check(), it adds __rt6_check_expired() as
one of the condition checks.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5973fb1e245086071bf71994c8b54d99526ded03 ]

Since the expires value of a DST_NOCACHE rt can be set during
ip6_rt_update_pmtu(), we also need to consider the expires
value when doing ip6_dst_check().

This patch creates __rt6_check_expired() to check only
the expires value (if one exists) of the current rt.

In rt6_dst_from_check(), it adds __rt6_check_expired() as
one of the condition checks.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Avoid creating RTF_CACHE from a rt that is not managed by fib6 tree</title>
<updated>2015-12-15T05:25:34+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-11-11T19:51:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=726dd3c9c8b89e804d2a0211f4fbcf48cecd1847'/>
<id>726dd3c9c8b89e804d2a0211f4fbcf48cecd1847</id>
<content type='text'>
[ Upstream commit 0d3f6d297bfb7af24d0508460fdb3d1ec4903fa3 ]

The original bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1272571

The setup has an IPv4 GRE tunnel running in IPsec.  The bug
happens when ndisc starts sending router solicitations on the gre
interface.  The simplified oops stack looks like:

__lock_acquire+0x1b2/0x1c30
lock_acquire+0xb9/0x140
_raw_write_lock_bh+0x3f/0x50
__ip6_ins_rt+0x2e/0x60
ip6_ins_rt+0x49/0x50
~~~~~~~~
__ip6_rt_update_pmtu.part.54+0x145/0x250
ip6_rt_update_pmtu+0x2e/0x40
~~~~~~~~
ip_tunnel_xmit+0x1f1/0xf40
__gre_xmit+0x7a/0x90
ipgre_xmit+0x15a/0x220
dev_hard_start_xmit+0x2bd/0x480
__dev_queue_xmit+0x696/0x730
dev_queue_xmit+0x10/0x20
neigh_direct_output+0x11/0x20
ip6_finish_output2+0x21f/0x770
ip6_finish_output+0xa7/0x1d0
ip6_output+0x56/0x190
~~~~~~~~
ndisc_send_skb+0x1d9/0x400
ndisc_send_rs+0x88/0xc0
~~~~~~~~

The rt passed to ip6_rt_update_pmtu() is created by
icmp6_dst_alloc() and it is not managed by the fib6 tree,
so its rt6i_table == NULL.  When __ip6_rt_update_pmtu() creates
a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL
and it causes the ip6_ins_rt() oops.

During pmtu update, we only want to create a RTF_CACHE clone
from a rt which is currently managed (or owned) by the
fib6 tree.  It means either rt-&gt;rt6i_node != NULL or
rt is a RTF_PCPU clone.

It is worth noting that rt6i_table may not be NULL even if the rt is
not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()).
Hence, rt6i_node is a better check than rt6i_table.
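
The ownership test described above can be sketched as a small
predicate.  This is an illustrative Python model, not the kernel code:
the dict layout, the is_pcpu key, the placeholder table names and the
function name cache_clone_allowed are all assumptions made for the
example.

```python
def cache_clone_allowed(rt):
    """Return True if rt is owned by the fib6 tree or is a per-cpu clone."""
    return rt.get("rt6i_node") is not None or rt.get("is_pcpu", False)


# Hypothetical routes; the string values are placeholders, not kernel objects.
# Route from icmp6_dst_alloc(): not in the tree, no table.
icmp6_rt = {"rt6i_node": None, "rt6i_table": None}
# Route from addrconf_dst_alloc(): has a table but is not (yet) in the tree.
addrconf_rt = {"rt6i_node": None, "rt6i_table": "local"}
# Route already inserted into the tree.
tree_rt = {"rt6i_node": "fn", "rt6i_table": "main"}
# Per-cpu clone, modeled here with an explicit flag.
pcpu_rt = {"rt6i_node": None, "rt6i_table": "main", "is_pcpu": True}

for name, rt in [("icmp6", icmp6_rt), ("addrconf", addrconf_rt),
                 ("tree", tree_rt), ("pcpu", pcpu_rt)]:
    print(name, cache_clone_allowed(rt))
```

Checking rt6i_table instead would wrongly accept the addrconf route in
this model, which is the reason given above for preferring rt6i_node.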

Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Reported-by: Chris Siebenmann &lt;cks-rhbugzilla@cs.toronto.edu&gt;
Cc: Chris Siebenmann &lt;cks-rhbugzilla@cs.toronto.edu&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 0d3f6d297bfb7af24d0508460fdb3d1ec4903fa3 ]

The original bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1272571

The setup has an IPv4 GRE tunnel running in IPsec.  The bug
happens when ndisc starts sending router solicitations on the gre
interface.  The simplified oops stack looks like:

__lock_acquire+0x1b2/0x1c30
lock_acquire+0xb9/0x140
_raw_write_lock_bh+0x3f/0x50
__ip6_ins_rt+0x2e/0x60
ip6_ins_rt+0x49/0x50
~~~~~~~~
__ip6_rt_update_pmtu.part.54+0x145/0x250
ip6_rt_update_pmtu+0x2e/0x40
~~~~~~~~
ip_tunnel_xmit+0x1f1/0xf40
__gre_xmit+0x7a/0x90
ipgre_xmit+0x15a/0x220
dev_hard_start_xmit+0x2bd/0x480
__dev_queue_xmit+0x696/0x730
dev_queue_xmit+0x10/0x20
neigh_direct_output+0x11/0x20
ip6_finish_output2+0x21f/0x770
ip6_finish_output+0xa7/0x1d0
ip6_output+0x56/0x190
~~~~~~~~
ndisc_send_skb+0x1d9/0x400
ndisc_send_rs+0x88/0xc0
~~~~~~~~

The rt passed to ip6_rt_update_pmtu() is created by
icmp6_dst_alloc() and it is not managed by the fib6 tree,
so its rt6i_table == NULL.  When __ip6_rt_update_pmtu() creates
a RTF_CACHE clone, the newly created clone also has rt6i_table == NULL
and it causes the ip6_ins_rt() oops.

During pmtu update, we only want to create a RTF_CACHE clone
from a rt which is currently managed (or owned) by the
fib6 tree.  It means either rt-&gt;rt6i_node != NULL or
rt is a RTF_PCPU clone.

It is worth noting that rt6i_table may not be NULL even if the rt is
not (yet) managed by the fib6 tree (e.g. addrconf_dst_alloc()).
Hence, rt6i_node is a better check than rt6i_table.

Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Reported-by: Chris Siebenmann &lt;cks-rhbugzilla@cs.toronto.edu&gt;
Cc: Chris Siebenmann &lt;cks-rhbugzilla@cs.toronto.edu&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Don't call with rt6_uncached_list_flush_dev</title>
<updated>2015-10-27T00:53:37+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-10-12T16:02:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=58d772c203ee57c45620730198bc2d9ded7a1464'/>
<id>58d772c203ee57c45620730198bc2d9ded7a1464</id>
<content type='text'>
[ Upstream commit e332bc67cf5e5e5b71a1aec9750d0791aac65183 ]

As originally written, rt6_uncached_list_flush_dev makes no sense when
called with dev == NULL, as it attempts to flush all uncached routes
regardless of network namespace, which is simply incorrect behavior.

Furthermore, at the point rt6_ifdown is called with dev == NULL, no more
network devices exist in the network namespace, so even if the code in
rt6_uncached_list_flush_dev were to attempt something sensible it
would be meaningless.

Therefore remove support in rt6_uncached_list_flush_dev for handling
network devices where dev == NULL, and only call rt6_uncached_list_flush_dev
when rt6_ifdown is called with a network device.

Fixes: 8d0b94afdca8 ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Reviewed-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Tested-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e332bc67cf5e5e5b71a1aec9750d0791aac65183 ]

As originally written, rt6_uncached_list_flush_dev makes no sense when
called with dev == NULL, as it attempts to flush all uncached routes
regardless of network namespace, which is simply incorrect behavior.

Furthermore, at the point rt6_ifdown is called with dev == NULL, no more
network devices exist in the network namespace, so even if the code in
rt6_uncached_list_flush_dev were to attempt something sensible it
would be meaningless.

Therefore remove support in rt6_uncached_list_flush_dev for handling
network devices where dev == NULL, and only call rt6_uncached_list_flush_dev
when rt6_ifdown is called with a network device.

Fixes: 8d0b94afdca8 ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Reviewed-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Tested-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: fix multipath route replace error recovery</title>
<updated>2015-10-03T11:51:36+00:00</updated>
<author>
<name>Roopa Prabhu</name>
<email>roopa@cumulusnetworks.com</email>
</author>
<published>2015-09-08T17:53:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e60f4a39c2173ad637d5c6541404b7847acac246'/>
<id>e60f4a39c2173ad637d5c6541404b7847acac246</id>
<content type='text'>
[ Upstream commit 6b9ea5a64ed5eeb3f68f2e6fcce0ed1179801d1e ]

Problem:
The ecmp route replace support for ipv6 in the kernel deletes the
existing ecmp route too early, i.e. when it installs the first nexthop.
If there is an error in installing the subsequent nexthops, it's too late
to recover the already deleted existing route, leaving the fib
in an inconsistent state.

This patch reduces the possibility of this by doing the following:
a) Changes the existing multipath route add code to a two stage process:
  build rt6_infos + insert them
	ip6_route_add rt6_info creation code is moved into
	ip6_route_info_create.
b) This ensures that most errors are caught during building rt6_infos
  and we fail early
c) Separates multipath add and del code. Because add needs the special
  two stage mode in a) and delete essentially does not care.
d) In any event if the code fails during inserting a route again, a
  warning is printed (This should be unlikely)

Before the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists

$ip -6 route show
/* previously added ecmp route 3000:1000:1000:1000::2 disappears from
 * kernel */

After the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists

$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
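
The two-stage idea in a) and b) can be sketched as follows.  This is a
toy Python model, not the kernel implementation: the fib dict, the
RouteError class and the build_nexthops() and replace_multipath()
helpers are hypothetical names used only for illustration.

```python
class RouteError(Exception):
    """Raised when a nexthop list cannot be turned into route entries."""


def build_nexthops(nexthops):
    """Stage 1: validate and build every entry before touching the FIB."""
    built = []
    for nh in nexthops:
        if nh in built:
            # Models the "File exists" failure for a duplicate nexthop.
            raise RouteError("File exists")
        built.append(nh)
    return built


def replace_multipath(fib, prefix, nexthops):
    """Stage 2 runs only if stage 1 succeeded, so errors leave fib untouched."""
    built = build_nexthops(nexthops)
    fib[prefix] = built


PREFIX = "3000:1000:1000:1000::2/128"
fib = {PREFIX: [
    ("fe80::202:ff:fe00:b", "swp49s0"),
    ("fe80::202:ff:fe00:d", "swp49s1"),
    ("fe80::202:ff:fe00:f", "swp49s2"),
]}
before = list(fib[PREFIX])

try:
    replace_multipath(fib, PREFIX, [
        ("fe80::202:ff:fe00:b", "swp49s0"),
        ("fe80::202:ff:fe00:d", "swp49s1"),
        ("fe80::202:ff:fe00:d", "swp49s1"),  # duplicate nexthop
    ])
except RouteError as err:
    print("RTNETLINK answers:", err)

print("route preserved:", fib[PREFIX] == before)
```

Because validation finishes before the existing entry is replaced, a
failed replace leaves the old nexthops in place in this model.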

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Acked-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 6b9ea5a64ed5eeb3f68f2e6fcce0ed1179801d1e ]

Problem:
The ecmp route replace support for ipv6 in the kernel deletes the
existing ecmp route too early, i.e. when it installs the first nexthop.
If there is an error in installing the subsequent nexthops, it's too late
to recover the already deleted existing route, leaving the fib
in an inconsistent state.

This patch reduces the possibility of this by doing the following:
a) Changes the existing multipath route add code to a two stage process:
  build rt6_infos + insert them
	ip6_route_add rt6_info creation code is moved into
	ip6_route_info_create.
b) This ensures that most errors are caught during building rt6_infos
  and we fail early
c) Separates multipath add and del code. Because add needs the special
  two stage mode in a) and delete essentially does not care.
d) In any event if the code fails during inserting a route again, a
  warning is printed (This should be unlikely)

Before the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists

$ip -6 route show
/* previously added ecmp route 3000:1000:1000:1000::2 disappears from
 * kernel */

After the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists

$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Reviewed-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Acked-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Fix a potential deadlock when creating pcpu rt</title>
<updated>2015-08-17T21:28:03+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-08-14T18:05:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9c7370a166b4e157137bfbfe2ad296d57147547c'/>
<id>9c7370a166b4e157137bfbfe2ad296d57147547c</id>
<content type='text'>
rt6_make_pcpu_route() is called under read_lock(&amp;table-&gt;tb6_lock).
rt6_make_pcpu_route() calls ip6_rt_pcpu_alloc(rt) which then
calls dst_alloc().  dst_alloc() _may_ call ip6_dst_gc() which takes
the write_lock(&amp;table-&gt;tb6_lock).  A visualized version:

read_lock(&amp;table-&gt;tb6_lock);
rt6_make_pcpu_route();
=&gt; ip6_rt_pcpu_alloc();
=&gt; dst_alloc();
=&gt; ip6_dst_gc();
=&gt; write_lock(&amp;table-&gt;tb6_lock); /* oops */

The fix is to do a read_unlock first before calling ip6_rt_pcpu_alloc().
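
The lock ordering above can be modeled with a toy, non-blocking
reader/writer lock.  This is a Python sketch, not kernel code:
ToyRWLock and the helper names are hypothetical, and try_write_lock()
returns False at the point where the real write_lock() would spin.

```python
class ToyRWLock:
    """Toy reader/writer lock whose try_* methods fail instead of spinning."""

    def __init__(self):
        self.readers = 0
        self.writer = False

    def try_read_lock(self):
        if self.writer:
            return False
        self.readers += 1
        return True

    def read_unlock(self):
        self.readers -= 1

    def try_write_lock(self):
        if self.writer or self.readers != 0:
            return False
        self.writer = True
        return True

    def write_unlock(self):
        self.writer = False


def alloc_with_gc(lock):
    """Stand-in for an allocation whose GC pass needs the write lock."""
    if not lock.try_write_lock():
        return "would deadlock"
    lock.write_unlock()
    return "allocated"


def make_pcpu_before_fix(lock):
    lock.try_read_lock()
    result = alloc_with_gc(lock)  # read lock is still held here
    lock.read_unlock()
    return result


def make_pcpu_after_fix(lock):
    lock.try_read_lock()
    lock.read_unlock()            # drop the read lock before allocating
    return alloc_with_gc(lock)


print(make_pcpu_before_fix(ToyRWLock()))
print(make_pcpu_after_fix(ToyRWLock()))
```

The model only shows the ordering problem; it does not cover what the
caller must re-check after the read lock has been dropped.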

A reported stack:

[141625.537638] INFO: rcu_sched self-detected stall on CPU { 27}  (t=60000 jiffies g=4159086 c=4159085 q=2139)
[141625.547469] Task dump for CPU 27:
[141625.550881] mtr             R  running task        0 22121  22081 0x00000008
[141625.558069]  0000000000000000 ffff88103f363d98 ffffffff8106e488 000000000000001b
[141625.565641]  ffffffff81684900 ffff88103f363db8 ffffffff810702b0 0000000008000000
[141625.573220]  ffffffff81684900 ffff88103f363de8 ffffffff8108df9f ffff88103f375a00
[141625.580803] Call Trace:
[141625.583345]  &lt;IRQ&gt;  [&lt;ffffffff8106e488&gt;] sched_show_task+0xc1/0xc6
[141625.589650]  [&lt;ffffffff810702b0&gt;] dump_cpu_task+0x35/0x39
[141625.595144]  [&lt;ffffffff8108df9f&gt;] rcu_dump_cpu_stacks+0x6a/0x8c
[141625.601320]  [&lt;ffffffff81090606&gt;] rcu_check_callbacks+0x1f6/0x5d4
[141625.607669]  [&lt;ffffffff810940c8&gt;] update_process_times+0x2a/0x4f
[141625.613925]  [&lt;ffffffff8109fbee&gt;] tick_sched_handle+0x32/0x3e
[141625.619923]  [&lt;ffffffff8109fc2f&gt;] tick_sched_timer+0x35/0x5c
[141625.625830]  [&lt;ffffffff81094a1f&gt;] __hrtimer_run_queues+0x8f/0x18d
[141625.632171]  [&lt;ffffffff81094c9e&gt;] hrtimer_interrupt+0xa0/0x166
[141625.638258]  [&lt;ffffffff8102bf2a&gt;] local_apic_timer_interrupt+0x4e/0x52
[141625.645036]  [&lt;ffffffff8102c36f&gt;] smp_apic_timer_interrupt+0x39/0x4a
[141625.651643]  [&lt;ffffffff8140b9e8&gt;] apic_timer_interrupt+0x68/0x70
[141625.657895]  &lt;EOI&gt;  [&lt;ffffffff81346ee8&gt;] ? dst_destroy+0x7c/0xb5
[141625.664188]  [&lt;ffffffff813d45b5&gt;] ? fib6_flush_trees+0x20/0x20
[141625.670272]  [&lt;ffffffff81082b45&gt;] ? queue_write_lock_slowpath+0x60/0x6f
[141625.677140]  [&lt;ffffffff8140aa33&gt;] _raw_write_lock_bh+0x23/0x25
[141625.683218]  [&lt;ffffffff813d4553&gt;] __fib6_clean_all+0x40/0x82
[141625.689124]  [&lt;ffffffff813d45b5&gt;] ? fib6_flush_trees+0x20/0x20
[141625.695207]  [&lt;ffffffff813d6058&gt;] fib6_clean_all+0xe/0x10
[141625.700854]  [&lt;ffffffff813d60d3&gt;] fib6_run_gc+0x79/0xc8
[141625.706329]  [&lt;ffffffff813d0510&gt;] ip6_dst_gc+0x85/0xf9
[141625.711718]  [&lt;ffffffff81346d68&gt;] dst_alloc+0x55/0x159
[141625.717105]  [&lt;ffffffff813d09b5&gt;] __ip6_dst_alloc.isra.32+0x19/0x63
[141625.723620]  [&lt;ffffffff813d1830&gt;] ip6_pol_route+0x36a/0x3e8
[141625.729441]  [&lt;ffffffff813d18d6&gt;] ip6_pol_route_output+0x11/0x13
[141625.735700]  [&lt;ffffffff813f02c8&gt;] fib6_rule_action+0xa7/0x1bf
[141625.741698]  [&lt;ffffffff813d18c5&gt;] ? ip6_pol_route_input+0x17/0x17
[141625.748043]  [&lt;ffffffff81357c48&gt;] fib_rules_lookup+0xb5/0x12a
[141625.754050]  [&lt;ffffffff81141628&gt;] ? poll_select_copy_remaining+0xf9/0xf9
[141625.761002]  [&lt;ffffffff813f0535&gt;] fib6_rule_lookup+0x37/0x5c
[141625.766914]  [&lt;ffffffff813d18c5&gt;] ? ip6_pol_route_input+0x17/0x17
[141625.773260]  [&lt;ffffffff813d008c&gt;] ip6_route_output+0x7a/0x82
[141625.779177]  [&lt;ffffffff813c44c8&gt;] ip6_dst_lookup_tail+0x53/0x112
[141625.785437]  [&lt;ffffffff813c45c3&gt;] ip6_dst_lookup_flow+0x2a/0x6b
[141625.791604]  [&lt;ffffffff813ddaab&gt;] rawv6_sendmsg+0x407/0x9b6
[141625.797423]  [&lt;ffffffff813d7914&gt;] ? do_ipv6_setsockopt.isra.8+0xd87/0xde2
[141625.804464]  [&lt;ffffffff8139d4b4&gt;] inet_sendmsg+0x57/0x8e
[141625.810028]  [&lt;ffffffff81329ba3&gt;] sock_sendmsg+0x2e/0x3c
[141625.815588]  [&lt;ffffffff8132be57&gt;] SyS_sendto+0xfe/0x143
[141625.821063]  [&lt;ffffffff813dd551&gt;] ? rawv6_setsockopt+0x5e/0x67
[141625.827146]  [&lt;ffffffff8132c9f8&gt;] ? sock_common_setsockopt+0xf/0x11
[141625.833660]  [&lt;ffffffff8132c08c&gt;] ? SyS_setsockopt+0x81/0xa2
[141625.839565]  [&lt;ffffffff8140ac17&gt;] entry_SYSCALL_64_fastpath+0x12/0x6a

Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Reported-by: Steinar H. Gunderson &lt;sgunderson@bigfoot.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
rt6_make_pcpu_route() is called under read_lock(&amp;table-&gt;tb6_lock).
rt6_make_pcpu_route() calls ip6_rt_pcpu_alloc(rt) which then
calls dst_alloc().  dst_alloc() _may_ call ip6_dst_gc() which takes
the write_lock(&amp;table-&gt;tb6_lock).  A visualized version:

read_lock(&amp;table-&gt;tb6_lock);
rt6_make_pcpu_route();
=&gt; ip6_rt_pcpu_alloc();
=&gt; dst_alloc();
=&gt; ip6_dst_gc();
=&gt; write_lock(&amp;table-&gt;tb6_lock); /* oops */

The fix is to do a read_unlock first before calling ip6_rt_pcpu_alloc().

A reported stack:

[141625.537638] INFO: rcu_sched self-detected stall on CPU { 27}  (t=60000 jiffies g=4159086 c=4159085 q=2139)
[141625.547469] Task dump for CPU 27:
[141625.550881] mtr             R  running task        0 22121  22081 0x00000008
[141625.558069]  0000000000000000 ffff88103f363d98 ffffffff8106e488 000000000000001b
[141625.565641]  ffffffff81684900 ffff88103f363db8 ffffffff810702b0 0000000008000000
[141625.573220]  ffffffff81684900 ffff88103f363de8 ffffffff8108df9f ffff88103f375a00
[141625.580803] Call Trace:
[141625.583345]  &lt;IRQ&gt;  [&lt;ffffffff8106e488&gt;] sched_show_task+0xc1/0xc6
[141625.589650]  [&lt;ffffffff810702b0&gt;] dump_cpu_task+0x35/0x39
[141625.595144]  [&lt;ffffffff8108df9f&gt;] rcu_dump_cpu_stacks+0x6a/0x8c
[141625.601320]  [&lt;ffffffff81090606&gt;] rcu_check_callbacks+0x1f6/0x5d4
[141625.607669]  [&lt;ffffffff810940c8&gt;] update_process_times+0x2a/0x4f
[141625.613925]  [&lt;ffffffff8109fbee&gt;] tick_sched_handle+0x32/0x3e
[141625.619923]  [&lt;ffffffff8109fc2f&gt;] tick_sched_timer+0x35/0x5c
[141625.625830]  [&lt;ffffffff81094a1f&gt;] __hrtimer_run_queues+0x8f/0x18d
[141625.632171]  [&lt;ffffffff81094c9e&gt;] hrtimer_interrupt+0xa0/0x166
[141625.638258]  [&lt;ffffffff8102bf2a&gt;] local_apic_timer_interrupt+0x4e/0x52
[141625.645036]  [&lt;ffffffff8102c36f&gt;] smp_apic_timer_interrupt+0x39/0x4a
[141625.651643]  [&lt;ffffffff8140b9e8&gt;] apic_timer_interrupt+0x68/0x70
[141625.657895]  &lt;EOI&gt;  [&lt;ffffffff81346ee8&gt;] ? dst_destroy+0x7c/0xb5
[141625.664188]  [&lt;ffffffff813d45b5&gt;] ? fib6_flush_trees+0x20/0x20
[141625.670272]  [&lt;ffffffff81082b45&gt;] ? queue_write_lock_slowpath+0x60/0x6f
[141625.677140]  [&lt;ffffffff8140aa33&gt;] _raw_write_lock_bh+0x23/0x25
[141625.683218]  [&lt;ffffffff813d4553&gt;] __fib6_clean_all+0x40/0x82
[141625.689124]  [&lt;ffffffff813d45b5&gt;] ? fib6_flush_trees+0x20/0x20
[141625.695207]  [&lt;ffffffff813d6058&gt;] fib6_clean_all+0xe/0x10
[141625.700854]  [&lt;ffffffff813d60d3&gt;] fib6_run_gc+0x79/0xc8
[141625.706329]  [&lt;ffffffff813d0510&gt;] ip6_dst_gc+0x85/0xf9
[141625.711718]  [&lt;ffffffff81346d68&gt;] dst_alloc+0x55/0x159
[141625.717105]  [&lt;ffffffff813d09b5&gt;] __ip6_dst_alloc.isra.32+0x19/0x63
[141625.723620]  [&lt;ffffffff813d1830&gt;] ip6_pol_route+0x36a/0x3e8
[141625.729441]  [&lt;ffffffff813d18d6&gt;] ip6_pol_route_output+0x11/0x13
[141625.735700]  [&lt;ffffffff813f02c8&gt;] fib6_rule_action+0xa7/0x1bf
[141625.741698]  [&lt;ffffffff813d18c5&gt;] ? ip6_pol_route_input+0x17/0x17
[141625.748043]  [&lt;ffffffff81357c48&gt;] fib_rules_lookup+0xb5/0x12a
[141625.754050]  [&lt;ffffffff81141628&gt;] ? poll_select_copy_remaining+0xf9/0xf9
[141625.761002]  [&lt;ffffffff813f0535&gt;] fib6_rule_lookup+0x37/0x5c
[141625.766914]  [&lt;ffffffff813d18c5&gt;] ? ip6_pol_route_input+0x17/0x17
[141625.773260]  [&lt;ffffffff813d008c&gt;] ip6_route_output+0x7a/0x82
[141625.779177]  [&lt;ffffffff813c44c8&gt;] ip6_dst_lookup_tail+0x53/0x112
[141625.785437]  [&lt;ffffffff813c45c3&gt;] ip6_dst_lookup_flow+0x2a/0x6b
[141625.791604]  [&lt;ffffffff813ddaab&gt;] rawv6_sendmsg+0x407/0x9b6
[141625.797423]  [&lt;ffffffff813d7914&gt;] ? do_ipv6_setsockopt.isra.8+0xd87/0xde2
[141625.804464]  [&lt;ffffffff8139d4b4&gt;] inet_sendmsg+0x57/0x8e
[141625.810028]  [&lt;ffffffff81329ba3&gt;] sock_sendmsg+0x2e/0x3c
[141625.815588]  [&lt;ffffffff8132be57&gt;] SyS_sendto+0xfe/0x143
[141625.821063]  [&lt;ffffffff813dd551&gt;] ? rawv6_setsockopt+0x5e/0x67
[141625.827146]  [&lt;ffffffff8132c9f8&gt;] ? sock_common_setsockopt+0xf/0x11
[141625.833660]  [&lt;ffffffff8132c08c&gt;] ? SyS_setsockopt+0x81/0xa2
[141625.839565]  [&lt;ffffffff8140ac17&gt;] entry_SYSCALL_64_fastpath+0x12/0x6a

Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Reported-by: Steinar H. Gunderson &lt;sgunderson@bigfoot.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Add rt6_make_pcpu_route()</title>
<updated>2015-08-17T21:28:03+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-08-14T18:05:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a73e4195636c17f310b8530643a576f42b82385f'/>
<id>a73e4195636c17f310b8530643a576f42b82385f</id>
<content type='text'>
This is prep work for fixing a potential deadlock when creating
a pcpu rt.

The current rt6_get_pcpu_route() will also create a pcpu rt if one does not
exist.  This patch moves the pcpu rt creation logic into another function,
rt6_make_pcpu_route().

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is prep work for fixing a potential deadlock when creating
a pcpu rt.

The current rt6_get_pcpu_route() will also create a pcpu rt if one does not
exist.  This patch moves the pcpu rt creation logic into another function,
rt6_make_pcpu_route().

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Remove un-used argument from ip6_dst_alloc()</title>
<updated>2015-08-17T21:28:03+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-08-14T18:05:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ad706862890171e02df1d7391b05599fb676ec18'/>
<id>ad706862890171e02df1d7391b05599fb676ec18</id>
<content type='text'>
After 4b32b5ad31a6 ("ipv6: Stop rt6_info from using inet_peer's metrics"),
ip6_dst_alloc() does not need the 'table' argument.  This patch
cleans it up.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After 4b32b5ad31a6 ("ipv6: Stop rt6_info from using inet_peer's metrics"),
ip6_dst_alloc() does not need the 'table' argument.  This patch
cleans it up.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: don't reject link-local nexthop on other interface</title>
<updated>2015-08-10T20:29:22+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2015-08-07T08:54:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=330567b71d8716704b189454553c2696e1eceb6c'/>
<id>330567b71d8716704b189454553c2696e1eceb6c</id>
<content type='text'>
48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses") is too
strict; it rejects the following corner case:

ip -6 route add default via fe80::1:2:3 dev eth1

[ where fe80::1:2:3 is assigned to a local interface, but not eth1 ]

Fix this by restricting the search to the given device if the nexthop
is link-local.

Joint work with Hannes Frederic Sowa.

Fixes: 48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses")
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses") is too
strict; it rejects the following corner case:

ip -6 route add default via fe80::1:2:3 dev eth1

[ where fe80::1:2:3 is assigned to a local interface, but not eth1 ]

Fix this by restricting the search to the given device if the nexthop
is link-local.

Joint work with Hannes Frederic Sowa.

Fixes: 48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses")
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net-ipv6: Delete an unnecessary check before the function call "free_percpu"</title>
<updated>2015-07-03T16:27:42+00:00</updated>
<author>
<name>Markus Elfring</name>
<email>elfring@users.sourceforge.net</email>
</author>
<published>2015-07-02T14:30:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=87775312a86bcf213e3b21f6f7c79e2e00d96f7b'/>
<id>87775312a86bcf213e3b21f6f7c79e2e00d96f7b</id>
<content type='text'>
The free_percpu() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The free_percpu() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
