<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/splice.c, branch v4.5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE</title>
<updated>2016-01-09T07:55:35+00:00</updated>
<author>
<name>Abhi Das</name>
<email>adas@redhat.com</email>
</author>
<published>2015-12-18T20:11:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=90330e689c32e5105265c461c54af6ecec3373fa'/>
<id>90330e689c32e5105265c461c54af6ecec3373fa</id>
<content type='text'>
During testing, I discovered that __generic_file_splice_read() returns
0 (EOF) when aops-&gt;readpage fails with AOP_TRUNCATED_PAGE on the first
page of a single/multi-page splice read operation. This EOF return code
causes the userspace test to (correctly) report a zero-length read error
when it was expecting otherwise.

The current strategy of returning a partial non-zero read when -&gt;readpage
returns AOP_TRUNCATED_PAGE works only when the failed page is not the
first of the lot being processed.

This patch attempts to retry lookup and call -&gt;readpage again on pages
that had previously failed with AOP_TRUNCATED_PAGE. With this patch, my
tests pass and I haven't noticed any unwanted side effects.

This version removes the thrice-retry loop and instead indefinitely
retries lookups on AOP_TRUNCATED_PAGE errors from -&gt;readpage. This
behavior is now similar to do_generic_file_read().

Signed-off-by: Abhi Das &lt;adas@redhat.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During testing, I discovered that __generic_file_splice_read() returns
0 (EOF) when aops-&gt;readpage fails with AOP_TRUNCATED_PAGE on the first
page of a single/multi-page splice read operation. This EOF return code
causes the userspace test to (correctly) report a zero-length read error
when it was expecting otherwise.

The current strategy of returning a partial non-zero read when -&gt;readpage
returns AOP_TRUNCATED_PAGE works only when the failed page is not the
first of the lot being processed.

This patch attempts to retry lookup and call -&gt;readpage again on pages
that had previously failed with AOP_TRUNCATED_PAGE. With this patch, my
tests pass and I haven't noticed any unwanted side effects.

This version removes the thrice-retry loop and instead indefinitely
retries lookups on AOP_TRUNCATED_PAGE errors from -&gt;readpage. This
behavior is now similar to do_generic_file_read().

Signed-off-by: Abhi Das &lt;adas@redhat.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Bob Peterson &lt;rpeterso@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: Avoid softlockups with sendfile(2)</title>
<updated>2015-11-24T02:15:30+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2015-11-23T12:09:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c2489e07c0a71a56fb2c84bc0ee66cddfca7d068'/>
<id>c2489e07c0a71a56fb2c84bc0ee66cddfca7d068</id>
<content type='text'>
The following test program from Dmitry can cause softlockups or RCU
stalls as it copies 1GB from tmpfs into eventfd and we don't have any
scheduling point at that path in sendfile(2) implementation:

        int r1 = eventfd(0, 0);
        int r2 = memfd_create("", 0);
        unsigned long n = 1&lt;&lt;30;
        fallocate(r2, 0, 0, n);
        sendfile(r1, r2, 0, n);

Add cond_resched() into __splice_from_pipe() to fix the problem.

CC: Dmitry Vyukov &lt;dvyukov@google.com&gt;
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The following test program from Dmitry can cause softlockups or RCU
stalls as it copies 1GB from tmpfs into eventfd and we don't have any
scheduling point at that path in sendfile(2) implementation:

        int r1 = eventfd(0, 0);
        int r2 = memfd_create("", 0);
        unsigned long n = 1&lt;&lt;30;
        fallocate(r2, 0, 0, n);
        sendfile(r1, r2, 0, n);

Add cond_resched() into __splice_from_pipe() to fix the problem.

CC: Dmitry Vyukov &lt;dvyukov@google.com&gt;
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: Make sendfile(2) killable even better</title>
<updated>2015-11-24T02:15:30+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2015-11-23T12:09:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c725bfce7968009756ed2836a8cd7ba4dc163011'/>
<id>c725bfce7968009756ed2836a8cd7ba4dc163011</id>
<content type='text'>
Commit 296291cdd162 (mm: make sendfile(2) killable) fixed an issue where
sendfile(2) was doing a lot of tiny writes into a filesystem and thus
was unkillable for a long time. However sendfile(2) can be (mis)used to
issue lots of writes into arbitrary file descriptor such as evenfd or
similar special file descriptors which never hit the standard filesystem
write path and thus are still unkillable. E.g. the following example
from Dmitry burns CPU for ~16s on my test system without possibility to
be killed:

        int r1 = eventfd(0, 0);
        int r2 = memfd_create("", 0);
        unsigned long n = 1&lt;&lt;30;
        fallocate(r2, 0, 0, n);
        sendfile(r1, r2, 0, n);

There are actually quite a few tests for pending signals in sendfile
code however we data to write is always available none of them seems to
trigger. So fix the problem by adding a test for pending signal into
splice_from_pipe_next() also before the loop waiting for pipe buffers to
be available. This should fix all the lockup issues with sendfile of the
do-ton-of-tiny-writes nature.

CC: stable@vger.kernel.org
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 296291cdd162 (mm: make sendfile(2) killable) fixed an issue where
sendfile(2) was doing a lot of tiny writes into a filesystem and thus
was unkillable for a long time. However sendfile(2) can be (mis)used to
issue lots of writes into arbitrary file descriptor such as evenfd or
similar special file descriptors which never hit the standard filesystem
write path and thus are still unkillable. E.g. the following example
from Dmitry burns CPU for ~16s on my test system without possibility to
be killed:

        int r1 = eventfd(0, 0);
        int r2 = memfd_create("", 0);
        unsigned long n = 1&lt;&lt;30;
        fallocate(r2, 0, 0, n);
        sendfile(r1, r2, 0, n);

There are actually quite a few tests for pending signals in sendfile
code however we data to write is always available none of them seems to
trigger. So fix the problem by adding a test for pending signal into
splice_from_pipe_next() also before the loop waiting for pipe buffers to
be available. This should fix all the lockup issues with sendfile of the
do-ton-of-tiny-writes nature.

CC: stable@vger.kernel.org
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, fs: introduce mapping_gfp_constraint()</title>
<updated>2015-11-07T01:50:42+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2015-11-07T00:28:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c62d25556be6c965dc14288e796a576e8e39a7e9'/>
<id>c62d25556be6c965dc14288e796a576e8e39a7e9</id>
<content type='text'>
There are many places which use mapping_gfp_mask to restrict a more
generic gfp mask which would be used for allocations which are not
directly related to the page cache but they are performed in the same
context.

Let's introduce a helper function which makes the restriction explicit and
easier to track.  This patch doesn't introduce any functional changes.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Suggested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are many places which use mapping_gfp_mask to restrict a more
generic gfp mask which would be used for allocations which are not
directly related to the page cache but they are performed in the same
context.

Let's introduce a helper function which makes the restriction explicit and
easier to track.  This patch doesn't introduce any functional changes.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Suggested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2015-06-25T03:47:21+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-06-25T03:47:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=aefbef10e3ae6e2c6e3c54f906f10b34c73a2c66'/>
<id>aefbef10e3ae6e2c6e3c54f906f10b34c73a2c66</id>
<content type='text'>
Merge first patchbomb from Andrew Morton:

 - a few misc things

 - ocfs2 udpates

 - kernel/watchdog.c feature work (took ages to get right)

 - most of MM.  A few tricky bits are held up and probably won't make 4.2.

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (91 commits)
  mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()
  mm, thp: respect MPOL_PREFERRED policy with non-local node
  tmpfs: truncate prealloc blocks past i_size
  mm/memory hotplug: print the last vmemmap region at the end of hot add memory
  mm/mmap.c: optimization of do_mmap_pgoff function
  mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan
  mm: kmemleak: avoid deadlock on the kmemleak object insertion error path
  mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup()
  mm: kmemleak: fix delete_object_*() race when called on the same memory block
  mm: kmemleak: allow safe memory scanning during kmemleak disabling
  memcg: convert mem_cgroup-&gt;under_oom from atomic_t to int
  memcg: remove unused mem_cgroup-&gt;oom_wakeups
  frontswap: allow multiple backends
  x86, mirror: x86 enabling - find mirrored memory ranges
  mm/memblock: allocate boot time data structures from mirrored memory
  mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute
  mm: do not ignore mapping_gfp_mask in page cache allocation paths
  mm/cma.c: fix typos in comments
  mm/oom_kill.c: print points as unsigned int
  mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge first patchbomb from Andrew Morton:

 - a few misc things

 - ocfs2 udpates

 - kernel/watchdog.c feature work (took ages to get right)

 - most of MM.  A few tricky bits are held up and probably won't make 4.2.

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (91 commits)
  mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()
  mm, thp: respect MPOL_PREFERRED policy with non-local node
  tmpfs: truncate prealloc blocks past i_size
  mm/memory hotplug: print the last vmemmap region at the end of hot add memory
  mm/mmap.c: optimization of do_mmap_pgoff function
  mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan
  mm: kmemleak: avoid deadlock on the kmemleak object insertion error path
  mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup()
  mm: kmemleak: fix delete_object_*() race when called on the same memory block
  mm: kmemleak: allow safe memory scanning during kmemleak disabling
  memcg: convert mem_cgroup-&gt;under_oom from atomic_t to int
  memcg: remove unused mem_cgroup-&gt;oom_wakeups
  frontswap: allow multiple backends
  x86, mirror: x86 enabling - find mirrored memory ranges
  mm/memblock: allocate boot time data structures from mirrored memory
  mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute
  mm: do not ignore mapping_gfp_mask in page cache allocation paths
  mm/cma.c: fix typos in comments
  mm/oom_kill.c: print points as unsigned int
  mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: do not ignore mapping_gfp_mask in page cache allocation paths</title>
<updated>2015-06-25T00:49:44+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2015-06-24T23:58:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6afdb859b71019143b8eecda02b8b29b03185055'/>
<id>6afdb859b71019143b8eecda02b8b29b03185055</id>
<content type='text'>
page_cache_read, do_generic_file_read, __generic_file_splice_read and
__ntfs_grab_cache_pages currently ignore mapping_gfp_mask when calling
add_to_page_cache_lru which might cause recursion into fs down in the
direct reclaim path if the mapping really relies on GFP_NOFS semantic.

This doesn't seem to be the case now because page_cache_read (page fault
path) doesn't seem to suffer from the reclaim recursion issues and
do_generic_file_read and __generic_file_splice_read also shouldn't be
called under fs locks which would deadlock in the reclaim path.  Anyway it
is better to obey mapping gfp mask and prevent from later breakage.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Anton Altaparmakov &lt;anton@tuxera.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
page_cache_read, do_generic_file_read, __generic_file_splice_read and
__ntfs_grab_cache_pages currently ignore mapping_gfp_mask when calling
add_to_page_cache_lru which might cause recursion into fs down in the
direct reclaim path if the mapping really relies on GFP_NOFS semantic.

This doesn't seem to be the case now because page_cache_read (page fault
path) doesn't seem to suffer from the reclaim recursion issues and
do_generic_file_read and __generic_file_splice_read also shouldn't be
called under fs locks which would deadlock in the reclaim path.  Anyway it
is better to obey mapping gfp mask and prevent from later breakage.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Anton Altaparmakov &lt;anton@tuxera.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: af_unix: implement splice for stream af_unix sockets</title>
<updated>2015-05-25T04:06:59+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2015-05-21T15:00:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2b514574f7e88c8498027ee366fd6e7aae5aa4b5'/>
<id>2b514574f7e88c8498027ee366fd6e7aae5aa4b5</id>
<content type='text'>
unix_stream_recvmsg is refactored to unix_stream_read_generic in this
patch and enhanced to deal with pipe splicing. The refactoring is
inneglible, we mostly have to deal with a non-existing struct msghdr
argument.

Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
unix_stream_recvmsg is refactored to unix_stream_read_generic in this
patch and enhanced to deal with pipe splicing. The refactoring is
inneglible, we mostly have to deal with a non-existing struct msghdr
argument.

Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>splice: sendfile() at once fails for big files</title>
<updated>2015-05-06T15:27:41+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-05-06T15:26:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0ff28d9f4674d781e492bcff6f32f0fe48cf0fed'/>
<id>0ff28d9f4674d781e492bcff6f32f0fe48cf0fed</id>
<content type='text'>
Using sendfile with below small program to get MD5 sums of some files,
it appear that big files (over 64kbytes with 4k pages system) get a
wrong MD5 sum while small files get the correct sum.
This program uses sendfile() to send a file to an AF_ALG socket
for hashing.

/* md5sum2.c */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
#include &lt;string.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;linux/if_alg.h&gt;

int main(int argc, char **argv)
{
	int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
	struct stat st;
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "hash",
		.salg_name = "md5",
	};
	int n;

	bind(sk, (struct sockaddr*)&amp;sa, sizeof(sa));

	for (n = 1; n &lt; argc; n++) {
		int size;
		int offset = 0;
		char buf[4096];
		int fd;
		int sko;
		int i;

		fd = open(argv[n], O_RDONLY);
		sko = accept(sk, NULL, 0);
		fstat(fd, &amp;st);
		size = st.st_size;
		sendfile(sko, fd, &amp;offset, size);
		size = read(sko, buf, sizeof(buf));
		for (i = 0; i &lt; size; i++)
			printf("%2.2x", buf[i]);
		printf("  %s\n", argv[n]);
		close(fd);
		close(sko);
	}
	exit(0);
}

Test below is done using official linux patch files. First result is
with a software based md5sum. Second result is with the program above.

root@vgoip:~# ls -l patch-3.6.*
-rw-r--r--    1 root     root         64011 Aug 24 12:01 patch-3.6.2.gz
-rw-r--r--    1 root     root         94131 Aug 24 12:01 patch-3.6.3.gz

root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
5fd77b24e68bb24dcc72d6e57c64790e  patch-3.6.3.gz

After investivation, it appears that sendfile() sends the files by blocks
of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
is reset as if it was the end of the file.

This patch adds SPLICE_F_MORE to the flags when more data is pending.

With the patch applied, we get the correct sums:

root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using sendfile with below small program to get MD5 sums of some files,
it appear that big files (over 64kbytes with 4k pages system) get a
wrong MD5 sum while small files get the correct sum.
This program uses sendfile() to send a file to an AF_ALG socket
for hashing.

/* md5sum2.c */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
#include &lt;string.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;linux/if_alg.h&gt;

int main(int argc, char **argv)
{
	int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
	struct stat st;
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "hash",
		.salg_name = "md5",
	};
	int n;

	bind(sk, (struct sockaddr*)&amp;sa, sizeof(sa));

	for (n = 1; n &lt; argc; n++) {
		int size;
		int offset = 0;
		char buf[4096];
		int fd;
		int sko;
		int i;

		fd = open(argv[n], O_RDONLY);
		sko = accept(sk, NULL, 0);
		fstat(fd, &amp;st);
		size = st.st_size;
		sendfile(sko, fd, &amp;offset, size);
		size = read(sko, buf, sizeof(buf));
		for (i = 0; i &lt; size; i++)
			printf("%2.2x", buf[i]);
		printf("  %s\n", argv[n]);
		close(fd);
		close(sko);
	}
	exit(0);
}

Test below is done using official linux patch files. First result is
with a software based md5sum. Second result is with the program above.

root@vgoip:~# ls -l patch-3.6.*
-rw-r--r--    1 root     root         64011 Aug 24 12:01 patch-3.6.2.gz
-rw-r--r--    1 root     root         94131 Aug 24 12:01 patch-3.6.3.gz

root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
5fd77b24e68bb24dcc72d6e57c64790e  patch-3.6.3.gz

After investivation, it appears that sendfile() sends the files by blocks
of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
is reset as if it was the end of the file.

This patch adds SPLICE_F_MORE to the flags when more data is pending.

With the patch applied, we get the correct sums:

root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dax: unify ext2/4_{dax,}_file_operations</title>
<updated>2015-04-15T23:35:20+00:00</updated>
<author>
<name>Boaz Harrosh</name>
<email>boaz@plexistor.com</email>
</author>
<published>2015-04-15T23:15:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=be64f884bed729b5d127db6a737155a4e514d286'/>
<id>be64f884bed729b5d127db6a737155a4e514d286</id>
<content type='text'>
The original dax patchset split the ext2/4_file_operations because of the
two NULL splice_read/splice_write in the dax case.

In the vfs if splice_read/splice_write are NULL we then call
default_splice_read/write.

What we do here is make generic_file_splice_read aware of IS_DAX() so the
original ext2/4_file_operations can be used as is.

For write it appears that iter_file_splice_write is just fine.  It uses
the regular f_op-&gt;write(file,..) or new_sync_write(file, ...).

Signed-off-by: Boaz Harrosh &lt;boaz@plexistor.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Dave Chinner &lt;dchinner@redhat.com&gt;
Cc: Matthew Wilcox &lt;willy@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The original dax patchset split the ext2/4_file_operations because of the
two NULL splice_read/splice_write in the dax case.

In the vfs if splice_read/splice_write are NULL we then call
default_splice_read/write.

What we do here is make generic_file_splice_read aware of IS_DAX() so the
original ext2/4_file_operations can be used as is.

For write it appears that iter_file_splice_write is just fine.  It uses
the regular f_op-&gt;write(file,..) or new_sync_write(file, ...).

Signed-off-by: Boaz Harrosh &lt;boaz@plexistor.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Dave Chinner &lt;dchinner@redhat.com&gt;
Cc: Matthew Wilcox &lt;willy@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vmsplice_to_user(): switch to import_iovec()</title>
<updated>2015-04-12T02:27:11+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2015-03-21T23:17:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=345995fa48c599210fd339a768eb68dead430b79'/>
<id>345995fa48c599210fd339a768eb68dead430b79</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
