<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/netfilter/ipset, branch v4.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>netfilter: ipset: Fix coding styles reported by checkpatch.pl</title>
<updated>2015-06-14T08:40:18+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T17:45:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ca0f6a5cd99e0c6ba4bb78dc402817f636370f26'/>
<id>ca0f6a5cd99e0c6ba4bb78dc402817f636370f26</id>
<content type='text'>
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Introduce RCU locking in list type</title>
<updated>2015-06-14T08:40:17+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T14:56:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=00590fdd5be0d763631ef10e6a3e2ce8fc2d9ec3'/>
<id>00590fdd5be0d763631ef10e6a3e2ce8fc2d9ec3</id>
<content type='text'>
Standard rculist is used.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Standard rculist is used.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Introduce RCU locking in hash:* types</title>
<updated>2015-06-14T08:40:17+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T15:29:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=18f84d41d34fa35d0d64bbaea01fe664553ecc06'/>
<id>18f84d41d34fa35d0d64bbaea01fe664553ecc06</id>
<content type='text'>
Three types of data need to be protected in the case of the hash types:

a. The hash buckets: standard rcu pointer operations are used.
b. The element blobs in the hash buckets are stored in an array and
   a bitmap is used for book-keeping to tell which elements in the array
   are used or free.
c. Networks per cidr values and the cidr values themselves are stored
   in fix sized arrays and need no protection. The values are modified
   in such an order that in the worst case an element testing is repeated
   once with the same cidr value.

The ipset hash approach uses arrays instead of lists and therefore is
incompatible with rhashtable.

Performance is tested by Jesper Dangaard Brouer:

Simple drop in FORWARD
~~~~~~~~~~~~~~~~~~~~~~

Dropping via simple iptables net-mask match::

 iptables -t raw -N simple || iptables -t raw -F simple
 iptables -t raw -I simple  -s 198.18.0.0/15 -j DROP
 iptables -t raw -D PREROUTING -j simple
 iptables -t raw -I PREROUTING -j simple

Drop performance in "raw": 11.3Mpps

Generator: sending 12.2Mpps (tx:12264083 pps)

Drop via original ipset in RAW table
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create a set with lots of elements::

 sudo ./ipset destroy test
 echo "create test hash:ip hashsize 65536" &gt; test.set
 for x in `seq 0 255`; do
    for y in `seq 0 255`; do
        echo "add test 198.18.$x.$y" &gt;&gt; test.set
    done
 done
 sudo ./ipset restore &lt; test.set

Dropping via ipset::

 iptables -t raw -F
 iptables -t raw -N net198 || iptables -t raw -F net198
 iptables -t raw -I net198 -m set --match-set test src -j DROP
 iptables -t raw -I PREROUTING -j net198

Drop performance in "raw" with ipset: 8Mpps

Perf report numbers ipset drop in "raw"::

 +   24.65%  ksoftirqd/1  [ip_set]           [k] ip_set_test
 -   21.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_lock_bh
    - _raw_read_lock_bh
       + 99.88% ip_set_test
 -   19.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_unlock_bh
    - _raw_read_unlock_bh
       + 99.72% ip_set_test
 +    4.31%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_kadt
 +    2.27%  ksoftirqd/1  [ixgbe]            [k] ixgbe_fetch_rx_buffer
 +    2.18%  ksoftirqd/1  [ip_tables]        [k] ipt_do_table
 +    1.81%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_test
 +    1.61%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
 +    1.44%  ksoftirqd/1  [kernel.kallsyms]  [k] build_skb
 +    1.42%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv
 +    1.36%  ksoftirqd/1  [kernel.kallsyms]  [k] __local_bh_enable_ip
 +    1.16%  ksoftirqd/1  [kernel.kallsyms]  [k] dev_gro_receive
 +    1.09%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_unlock
 +    0.96%  ksoftirqd/1  [ixgbe]            [k] ixgbe_clean_rx_irq
 +    0.95%  ksoftirqd/1  [kernel.kallsyms]  [k] __netdev_alloc_frag
 +    0.88%  ksoftirqd/1  [kernel.kallsyms]  [k] kmem_cache_alloc
 +    0.87%  ksoftirqd/1  [xt_set]           [k] set_match_v3
 +    0.85%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_gro_receive
 +    0.83%  ksoftirqd/1  [kernel.kallsyms]  [k] nf_iterate
 +    0.76%  ksoftirqd/1  [kernel.kallsyms]  [k] put_compound_page
 +    0.75%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_lock

Drop via ipset in RAW table with RCU-locking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With RCU locking, the RW-lock is gone.

Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps

Performance-tested-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Three types of data need to be protected in the case of the hash types:

a. The hash buckets: standard rcu pointer operations are used.
b. The element blobs in the hash buckets are stored in an array and
   a bitmap is used for book-keeping to tell which elements in the array
   are used or free.
c. Networks per cidr values and the cidr values themselves are stored
   in fix sized arrays and need no protection. The values are modified
   in such an order that in the worst case an element testing is repeated
   once with the same cidr value.

The ipset hash approach uses arrays instead of lists and therefore is
incompatible with rhashtable.

Performance is tested by Jesper Dangaard Brouer:

Simple drop in FORWARD
~~~~~~~~~~~~~~~~~~~~~~

Dropping via simple iptables net-mask match::

 iptables -t raw -N simple || iptables -t raw -F simple
 iptables -t raw -I simple  -s 198.18.0.0/15 -j DROP
 iptables -t raw -D PREROUTING -j simple
 iptables -t raw -I PREROUTING -j simple

Drop performance in "raw": 11.3Mpps

Generator: sending 12.2Mpps (tx:12264083 pps)

Drop via original ipset in RAW table
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create a set with lots of elements::

 sudo ./ipset destroy test
 echo "create test hash:ip hashsize 65536" &gt; test.set
 for x in `seq 0 255`; do
    for y in `seq 0 255`; do
        echo "add test 198.18.$x.$y" &gt;&gt; test.set
    done
 done
 sudo ./ipset restore &lt; test.set

Dropping via ipset::

 iptables -t raw -F
 iptables -t raw -N net198 || iptables -t raw -F net198
 iptables -t raw -I net198 -m set --match-set test src -j DROP
 iptables -t raw -I PREROUTING -j net198

Drop performance in "raw" with ipset: 8Mpps

Perf report numbers ipset drop in "raw"::

 +   24.65%  ksoftirqd/1  [ip_set]           [k] ip_set_test
 -   21.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_lock_bh
    - _raw_read_lock_bh
       + 99.88% ip_set_test
 -   19.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_unlock_bh
    - _raw_read_unlock_bh
       + 99.72% ip_set_test
 +    4.31%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_kadt
 +    2.27%  ksoftirqd/1  [ixgbe]            [k] ixgbe_fetch_rx_buffer
 +    2.18%  ksoftirqd/1  [ip_tables]        [k] ipt_do_table
 +    1.81%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_test
 +    1.61%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
 +    1.44%  ksoftirqd/1  [kernel.kallsyms]  [k] build_skb
 +    1.42%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv
 +    1.36%  ksoftirqd/1  [kernel.kallsyms]  [k] __local_bh_enable_ip
 +    1.16%  ksoftirqd/1  [kernel.kallsyms]  [k] dev_gro_receive
 +    1.09%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_unlock
 +    0.96%  ksoftirqd/1  [ixgbe]            [k] ixgbe_clean_rx_irq
 +    0.95%  ksoftirqd/1  [kernel.kallsyms]  [k] __netdev_alloc_frag
 +    0.88%  ksoftirqd/1  [kernel.kallsyms]  [k] kmem_cache_alloc
 +    0.87%  ksoftirqd/1  [xt_set]           [k] set_match_v3
 +    0.85%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_gro_receive
 +    0.83%  ksoftirqd/1  [kernel.kallsyms]  [k] nf_iterate
 +    0.76%  ksoftirqd/1  [kernel.kallsyms]  [k] put_compound_page
 +    0.75%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_lock

Drop via ipset in RAW table with RCU-locking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With RCU locking, the RW-lock is gone.

Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps

Performance-tested-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Introduce RCU locking in bitmap:* types</title>
<updated>2015-06-14T08:40:16+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T12:39:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=96f51428c43de20723630f0d756a7a9a42cbd974'/>
<id>96f51428c43de20723630f0d756a7a9a42cbd974</id>
<content type='text'>
There's nothing much required because the bitmap types use atomic
bit operations. However the logic of adding elements slightly changed:
first the MAC address updated (which is not atomic), then the element
activated (added). The extensions may call kfree_rcu() therefore we
call rcu_barrier() at module removal.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's nothing much required because the bitmap types use atomic
bit operations. However the logic of adding elements slightly changed:
first the MAC address updated (which is not atomic), then the element
activated (added). The extensions may call kfree_rcu() therefore we
call rcu_barrier() at module removal.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Prepare the ipset core to use RCU at set level</title>
<updated>2015-06-14T08:40:16+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T12:22:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b57b2d1fa53fe8563bdfc66a33b844463b9af285'/>
<id>b57b2d1fa53fe8563bdfc66a33b844463b9af285</id>
<content type='text'>
Replace rwlock_t with spinlock_t in "struct ip_set" and change the locking
accordingly. Convert the comment extension into an rcu-avare object. Also,
simplify the timeout routines.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace rwlock_t with spinlock_t in "struct ip_set" and change the locking
accordingly. Convert the comment extension into an rcu-avare object. Also,
simplify the timeout routines.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter:ipset Remove rbtree from hash:net,iface</title>
<updated>2015-06-14T08:40:15+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T12:02:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bd55389cc34b75948c2876c821175a976bbac5b1'/>
<id>bd55389cc34b75948c2876c821175a976bbac5b1</id>
<content type='text'>
Remove rbtree in order to introduce RCU instead of rwlock in ipset

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove rbtree in order to introduce RCU instead of rwlock in ipset

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Make sure listing doesn't grab a set which is just being destroyed.</title>
<updated>2015-06-14T08:40:15+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T11:39:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9c1ba5c809381fb9fb779e2cc22a1c878a269ffb'/>
<id>9c1ba5c809381fb9fb779e2cc22a1c878a269ffb</id>
<content type='text'>
There was a small window when all sets are destroyed and a concurrent
listing of all sets could grab a set which is just being destroyed.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There was a small window when all sets are destroyed and a concurrent
listing of all sets could grab a set which is just being destroyed.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Fix parallel resizing and listing of the same set</title>
<updated>2015-06-14T08:40:15+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-13T09:59:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c4c997839cf92cb1037e43a85cdb4cbf44ed39a5'/>
<id>c4c997839cf92cb1037e43a85cdb4cbf44ed39a5</id>
<content type='text'>
When elements added to a hash:* type of set and resizing triggered,
parallel listing could start to list the original set (before resizing)
and "continue" with listing the new set. Fix it by references and
using the original hash table for listing. Therefore the destroying of
the original hash table may happen from the resizing or listing functions.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When elements added to a hash:* type of set and resizing triggered,
parallel listing could start to list the original set (before resizing)
and "continue" with listing the new set. Fix it by references and
using the original hash table for listing. Therefore the destroying of
the original hash table may happen from the resizing or listing functions.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Fix cidr handling for hash:*net* types</title>
<updated>2015-06-14T08:40:14+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2015-06-12T20:11:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f690cbaed9fe4d77592e24139db7ad790641c4fd'/>
<id>f690cbaed9fe4d77592e24139db7ad790641c4fd</id>
<content type='text'>
Commit "Simplify cidr handling for hash:*net* types" broke the cidr
handling for the hash:*net* types when the sets were used by the SET
target: entries with invalid cidr values were added to the sets.
Reported by Jonathan Johnson.

Testsuite entry is added to verify the fix.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit "Simplify cidr handling for hash:*net* types" broke the cidr
handling for the hash:*net* types when the sets were used by the SET
target: entries with invalid cidr values were added to the sets.
Reported by Jonathan Johnson.

Testsuite entry is added to verify the fix.

Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Check CIDR value only when attribute is given</title>
<updated>2015-06-14T08:40:14+00:00</updated>
<author>
<name>Sergey Popovich</name>
<email>popovich_sergei@mail.ua</email>
</author>
<published>2015-06-12T19:30:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=aff227581ed1ac299e3a50eef4bb1cef944e1404'/>
<id>aff227581ed1ac299e3a50eef4bb1cef944e1404</id>
<content type='text'>
There is no reason to check CIDR value regardless attribute
specifying CIDR is given.

Initialize cidr array in element structure on element structure
declaration to let more freedom to the compiler to optimize
initialization right before element structure is used.

Remove local variables cidr and cidr2 for netnet and netportnet
hashes as we do not use packed cidr value for such set types and
can store value directly in e.cidr[].

Signed-off-by: Sergey Popovich &lt;popovich_sergei@mail.ua&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no reason to check CIDR value regardless attribute
specifying CIDR is given.

Initialize cidr array in element structure on element structure
declaration to let more freedom to the compiler to optimize
initialization right before element structure is used.

Remove local variables cidr and cidr2 for netnet and netportnet
hashes as we do not use packed cidr value for such set types and
can store value directly in e.cidr[].

Signed-off-by: Sergey Popovich &lt;popovich_sergei@mail.ua&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
