linux-stable.git/include/linux/mm.h, branch linux-2.6.37.y

mm: wrap get_locked_pte() using __cond_lock()

2010-10-26T23:52:09+00:00

get_locked_pte() takes 'ptl' only when it returns non-NULL.  This leads
sparse to complain about context imbalance.  Rename it and wrap it with
__cond_lock() to make sparse happy.

Signed-off-by: Namhyung Kim 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
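
A minimal user-space sketch of the wrapping pattern described in the commit
above. All toy_* names, the flag-based lock and the integer table are
assumptions made for illustration and are not the kernel code. In a normal
build the kernel's __cond_lock(x, c) reduces to the condition c; only under
sparse does it also annotate that x is held when c is true, and the toy macro
models just the normal-build form.

```c
#include <stddef.h>

/* Toy stand-ins (assumptions): a flag-based lock and an int table. */
struct toy_lock { int held; };

/* Normal-build form only: the macro evaluates to its condition. */
#define toy_cond_lock(x, c) (c)

static struct toy_lock table_lock;
static int toy_table[4] = { 10, 0, 30, 0 };	/* 0 means "no entry" */

/*
 * Renamed raw helper: returns the entry with *ptl locked,
 * or NULL with no lock taken.
 */
static int *toy_get_locked_raw(size_t idx, struct toy_lock **ptl)
{
	if (idx >= 4 || toy_table[idx] == 0)
		return NULL;
	table_lock.held = 1;
	*ptl = &table_lock;
	return &toy_table[idx];
}

/* Public wrapper: the conditional acquisition is visible at the call site. */
static inline int *toy_get_locked(size_t idx, struct toy_lock **ptl)
{
	int *p;

	toy_cond_lock(*ptl, p = toy_get_locked_raw(idx, ptl));
	return p;
}
```

Callers keep using the original name and only need to unlock when the result
is non-NULL.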

mm: retry page fault when blocking on disk transfer

2010-10-26T23:52:09+00:00

This change reduces mmap_sem hold times that are caused by waiting for
disk transfers when accessing file mapped VMAs.

It introduces the VM_FAULT_ALLOW_RETRY flag, which indicates that the call
site wants mmap_sem to be released if blocking on a pending disk transfer.
In that case, filemap_fault() returns the VM_FAULT_RETRY status bit and
do_page_fault() will then re-acquire mmap_sem and retry the page fault.

It is expected that the retry will hit the same page which will now be
cached, and thus it will complete with a low mmap_sem hold time.

Tests:

- microbenchmark: thread A mmaps a large file and does random read accesses
  to the mmaped area - achieves about 55 iterations/s. Thread B does
  mmap/munmap in a loop at a separate location - achieves 55 iterations/s
  before, 15000 iterations/s after.

- We are seeing related effects in some applications in house, which show
  significant performance regressions when running without this change.

[akpm@linux-foundation.org: fix warning & crash]
Signed-off-by: Michel Lespinasse 
Acked-by: Rik van Riel 
Acked-by: Linus Torvalds 
Cc: Nick Piggin 
Reviewed-by: Wu Fengguang 
Cc: Ying Han 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Acked-by: "H. Peter Anvin" 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
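
The retry protocol described in the commit above can be sketched in user
space as follows. This is a simplified model, not the kernel implementation:
the toy_* names, the boolean "lock" and the instantly completing "disk
transfer" are all assumptions. Clearing the allow-retry flag before the
second attempt, so that a fault is retried at most once, is also an
assumption of this sketch.

```c
#include <stdbool.h>

#define TOY_FLAG_ALLOW_RETRY	0x1u	/* caller permits dropping the lock */
#define TOY_VM_FAULT_RETRY	0x1u	/* handler dropped the lock, retry */

struct toy_mm {
	bool mmap_sem_held;
	bool page_cached;
	unsigned int sem_drops;	/* times the lock was released early */
};

/* Models the file fault handler for a single page. */
static unsigned int toy_filemap_fault(struct toy_mm *mm, unsigned int flags)
{
	if (!mm->page_cached) {
		if (flags & TOY_FLAG_ALLOW_RETRY) {
			mm->mmap_sem_held = false;	/* release before waiting */
			mm->sem_drops++;
			mm->page_cached = true;		/* "transfer" completes */
			return TOY_VM_FAULT_RETRY;
		}
		mm->page_cached = true;	/* wait with the lock still held */
	}
	return 0;
}

/* Models the arch fault path; returns the number of attempts made. */
static unsigned int toy_do_page_fault(struct toy_mm *mm)
{
	unsigned int flags = TOY_FLAG_ALLOW_RETRY;
	unsigned int attempts = 0;

	mm->mmap_sem_held = true;
	for (;;) {
		attempts++;
		if (!(toy_filemap_fault(mm, flags) & TOY_VM_FAULT_RETRY))
			break;
		flags &= ~TOY_FLAG_ALLOW_RETRY;	/* retry at most once */
		mm->mmap_sem_held = true;	/* re-acquire, then retry */
	}
	mm->mmap_sem_held = false;
	return attempts;
}
```

On a cold page the model makes two attempts and drops the lock once; on an
already cached page it completes in a single attempt without dropping it.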

mm: fix typo in mm.h when NODE_NOT_IN_PAGE_FLAGS

2010-10-26T23:52:07+00:00

NODE_NOT_IN_PAGE_FLAGS is defined in mm.h when the node information is not
stored in the page flags bitmap.

Unfortunately, there's a typo in one of the checks for it.  This patch
fixes it (s/NODE_NOT_IN_PAGEFLAGS/NODE_NOT_IN_PAGE_FLAGS/).  Since this
has been around for ages, I doubt it's been causing any serious problems.

Signed-off-by: Will Deacon 
Cc: Christoph Lameter 
Cc: Mel Gorman 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds
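
A small sketch of why a typo like the one fixed above can go unnoticed for a
long time: the preprocessor treats a misspelled macro name as simply
undefined, so the guarded branch is silently skipped with no diagnostic. The
two helper functions are hypothetical, and the use of #ifdef is an assumption
about the form of the check.

```c
/* The correctly spelled macro is defined ... */
#define NODE_NOT_IN_PAGE_FLAGS 1

/* ... but a check against the misspelled name never fires. */
static int misspelled_check_taken(void)
{
#ifdef NODE_NOT_IN_PAGEFLAGS
	return 1;
#else
	return 0;	/* taken: the misspelled name is not defined */
#endif
}

static int correct_check_taken(void)
{
#ifdef NODE_NOT_IN_PAGE_FLAGS
	return 1;	/* taken: the name matches the definition */
#else
	return 0;
#endif
}
```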

mm: add account_page_writeback()

2010-10-26T23:52:06+00:00

To help developers and applications gain visibility into writeback
behaviour this patch adds two counters to /proc/vmstat.

  # grep nr_dirtied /proc/vmstat
  nr_dirtied 3747
  # grep nr_written /proc/vmstat
  nr_written 3618

These entries allow user apps to understand writeback behaviour over time
and learn how it is impacting their performance.  Currently there is no
way to inspect dirty and writeback speed over time; nr_dirty and
nr_writeback cannot provide this.

These entries are necessary to give visibility into writeback behaviour.
We have /proc/diskstats which lets us understand the io in the block
layer.  We have blktrace for more in depth understanding.  We have
e2fsprogs and debugfs to give insight into the file system's behaviour,
but we don't offer our users the ability to understand what writeback is
doing.  There is no way to know how active it is over the whole system, if
it's falling behind or to quantify its efforts.  With these values
exported users can easily see how much data applications are sending
through writeback and also at what rates writeback is processing this
data.  Comparing the rates of change between the two allows developers to
see when writeback is not able to keep up with incoming traffic and the
rate of dirty memory being sent to the IO back end.  This allows folks to
understand their io workloads and track kernel issues.  Non kernel
engineers at Google often use these counters to solve puzzling performance
problems.
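
As an illustration of the comparison described above, the following sketch
parses two snapshots of /proc/vmstat-style text and reports how much more
was dirtied than written in between. The function names and the idea of
passing the snapshots as strings are assumptions made for this example; a
real tool would read /proc/vmstat at two points in time.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up counter `name` in vmstat-formatted text ("name value" lines).
 * Returns 0 and stores the value on success, -1 if the name is absent.
 */
static int vmstat_lookup(const char *text, const char *name,
			 unsigned long *out)
{
	char key[64];
	unsigned long val;
	const char *line = text;

	while (line && *line) {
		if (sscanf(line, "%63s %lu", key, &val) == 2 &&
		    strcmp(key, name) == 0) {
			*out = val;
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Pages dirtied minus pages written between two snapshots.  A value that
 * keeps growing suggests writeback is not keeping up with incoming traffic.
 */
static int writeback_backlog_growth(const char *before, const char *after,
				    long *growth)
{
	unsigned long d0, w0, d1, w1;

	if (vmstat_lookup(before, "nr_dirtied", &d0) ||
	    vmstat_lookup(before, "nr_written", &w0) ||
	    vmstat_lookup(after, "nr_dirtied", &d1) ||
	    vmstat_lookup(after, "nr_written", &w1))
		return -1;
	*growth = (long)(d1 - d0) - (long)(w1 - w0);
	return 0;
}
```

Dividing the per-counter deltas by the sampling interval gives the dirtying
and writeback rates themselves.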

Patch #4 adds a per-node vmstat file with nr_dirtied and nr_written

Patch #5 adds writeback thresholds to /proc/vmstat

Currently these values are in debugfs. But they should be promoted to
/proc since they are useful for developers who are writing databases
and file servers and are not debugging the kernel.

The output is as below:

 # grep threshold /proc/vmstat
 nr_pages_dirty_threshold 409111
 nr_pages_dirty_background_threshold 818223

This patch:

This allows code outside of the mm core to safely manipulate page
writeback state and not worry about the other accounting.  Not using these
routines means that some code will lose track of the accounting and we get
bugs.

Modify nilfs2 to use the interface.

Signed-off-by: Michael Rubin 
Reviewed-by: KOSAKI Motohiro 
Reviewed-by: Wu Fengguang 
Cc: KONISHI Ryusuke 
Cc: Jiro SEKIBA 
Cc: Dave Chinner 
Cc: Jens Axboe 
Cc: KOSAKI Motohiro 
Cc: Nick Piggin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

Merge branch 'hwpoison-hugepages' into hwpoison

2010-10-22T15:40:48+00:00

Conflicts:
	mm/memory-failure.c

Encode huge page size for VM_FAULT_HWPOISON errors

2010-10-08T07:32:46+00:00

This fixes a problem introduced with the hugetlb hwpoison handling.

The user space SIGBUS signalling wants to know the size of the hugepage
that caused a HWPOISON fault.

Unfortunately the architecture page fault handlers do not have easy
access to the struct page.

Pass the information out in the fault error code instead.

I added a separate VM_FAULT_HWPOISON_LARGE bit for this case and encode
the hpage index in some free upper bits of the fault code. The small
page hwpoison stays with the VM_FAULT_HWPOISON name to minimize
changes.

Also add code to hugetlb.h to convert that index into a page shift.

Will be used in a further patch.

Cc: Naoya Horiguchi 
Cc: fengguang.wu@intel.com
Signed-off-by: Andi Kleen
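
The encoding described above can be sketched like this. The bit positions,
the 4-bit index width, the TOY_* names and the table of huge page orders are
assumptions chosen for illustration; the commit message only states that a
separate "large" bit exists and that the hpage index lives in free upper
bits of the fault code.

```c
/* Assumed layout: two status bits plus a 4-bit index in bits 12..15. */
#define TOY_FAULT_HWPOISON		0x0010u
#define TOY_FAULT_HWPOISON_LARGE	0x0020u
#define TOY_FAULT_HINDEX_SHIFT		12
#define TOY_FAULT_HINDEX_MASK		0xfu

#define TOY_FAULT_SET_HINDEX(i) \
	(((unsigned int)(i) & TOY_FAULT_HINDEX_MASK) << TOY_FAULT_HINDEX_SHIFT)
#define TOY_FAULT_GET_HINDEX(code) \
	(((code) >> TOY_FAULT_HINDEX_SHIFT) & TOY_FAULT_HINDEX_MASK)

#define TOY_PAGE_SHIFT 12	/* 4 KiB base pages (assumption) */

/* Hypothetical table: index 0 is a 2 MiB page, index 1 a 1 GiB page. */
static const unsigned int toy_hstate_order[] = { 9, 18 };

/* Convert an hpage index into a page shift; 0 means "unknown index". */
static unsigned int toy_hindex_to_shift(unsigned int index)
{
	if (index >= sizeof(toy_hstate_order) / sizeof(toy_hstate_order[0]))
		return 0;
	return toy_hstate_order[index] + TOY_PAGE_SHIFT;
}
```

A fault handler without access to the struct page can then recover the size
from the error code alone, for example by passing
TOY_FAULT_GET_HINDEX(code) to toy_hindex_to_shift() when the large bit is
set.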

Merge commit 'v2.6.36-rc7' into core/memblock

2010-10-08T07:15:00+00:00

Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar

mm: Move vma_stack_continue into mm.h

2010-09-09T16:05:06+00:00

So it can be used by all callers that need this check.

Signed-off-by: Stefan Bader 
Signed-off-by: Linus Torvalds

Merge commit 'v2.6.36-rc3' into x86/memblock

2010-08-31T07:45:46+00:00

Conflicts:
	arch/x86/kernel/trampoline.c
	mm/memblock.c

Merge reason: Resolve the conflicts, update to latest upstream.

Signed-off-by: Ingo Molnar

NOMMU: Stub out vm_get_page_prot() if there's no MMU

2010-08-28T21:01:03+00:00

Stub out vm_get_page_prot() if there's no MMU.

This was added by commit 804af2cf6e7a ("[AGPGART] remove private page
protection map") and is used in commit c07fbfd17e61 ("fbmem: VM_IO set,
but not propagated") in the fbmem video driver, but the function doesn't
exist on NOMMU, resulting in an undefined symbol at link time.

Signed-off-by: David Howells 
Reviewed-by: Konrad Rzeszutek Wilk 
Signed-off-by: Linus Torvalds
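
The stubbing pattern named in the commit above can be sketched as follows.
The toy_* names, the TOY_CONFIG_MMU switch and the zero protection value
returned by the stub are assumptions for illustration only. With the switch
undefined, the inline stub satisfies callers at compile time, so no
undefined symbol is left for the linker.

```c
/* Toy protection type (assumption, not the kernel's pgprot_t). */
typedef struct { unsigned long pgprot; } toy_pgprot_t;

#ifdef TOY_CONFIG_MMU
/* With an MMU the real implementation is provided elsewhere. */
toy_pgprot_t toy_vm_get_page_prot(unsigned long vm_flags);
#else
/* Without an MMU, provide an inline stub instead of a missing symbol. */
static inline toy_pgprot_t toy_vm_get_page_prot(unsigned long vm_flags)
{
	(void)vm_flags;
	return (toy_pgprot_t){ 0 };
}
#endif

/* A driver-style caller that builds in both configurations. */
static unsigned long toy_driver_mmap_prot(unsigned long vm_flags)
{
	return toy_vm_get_page_prot(vm_flags).pgprot;
}
```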