linux.git/arch/s390/lib, branch v3.11

s390/uaccess: add "fallthrough" comments

2013-05-02T13:50:19+00:00

Add "fallthrough" comments so nobody wonders if a break statement is missing.

Reported-by: Joe Perches 
Signed-off-by: Heiko Carstens 
Signed-off-by: Martin Schwidefsky

Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS

2013-05-01T00:04:09+00:00

The help text for this config is duplicated across the x86, parisc, and
s390 Kconfig.debug files.  Arnd Bergman noted that the help text was
slightly misleading and should be fixed to state that enabling this
option isn't a problem when using pre 4.4 gcc.

To simplify the rewording, consolidate the text into lib/Kconfig.debug
and modify it there to be more explicit about when you should say N to
this config.

Also, make the text a bit more generic by stating that this option
enables compile time checks so we can cover architectures which emit
warnings vs.  ones which emit errors.  The details of how an
architecture decided to implement the checks isn't as important as the
concept of compile time checking of copy_from_user() calls.

While we're doing this, remove all the copy_from_user_overflow() code
that's duplicated many times and place it into lib/ so that any
architecture supporting this option can get the function for free.

Signed-off-by: Stephen Boyd 
Acked-by: Arnd Bergmann 
Acked-by: Ingo Molnar 
Acked-by: H. Peter Anvin 
Cc: Arjan van de Ven 
Acked-by: Helge Deller 
Cc: Heiko Carstens 
Cc: Stephen Rothwell 
Cc: Chris Metcalf 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

s390/uaccess: fix page table walk

2013-04-02T06:53:08+00:00

When translating user space addresses to kernel addresses the follow_table()
function had two bugs:

- PROT_NONE mappings could be read accessed via the kernel mapping. That is
  e.g. putting a filename into a user page, then protecting the page with
  PROT_NONE and afterwards issuing the "open" syscall with a pointer to
  the filename would incorrectly succeed.

- when walking the page tables it used the pgd/pud/pmd/pte primitives which
  with dynamic page tables give no indication which real level of page tables
  is being walked (region2, region3, segment or page table). So in case of an
  exception the translation exception code passed to __handle_fault() is not
  necessarily correct.
  This is not really an issue since __handle_fault() doesn't evaluate the code.
  Only in case of e.g. a SIGBUS this code gets passed to user space. If user
  space can do something sane with the value is a different question though.

To fix these issues don't use any Linux primitives. Only walk the page tables
like the hardware would do it, however we leave quite some checks away since
we know that we only have full size page tables and each index is within bounds.

In theory this should fix all issues...

Signed-off-by: Heiko Carstens 
Reviewed-by: Gerald Schaefer 
Signed-off-by: Martin Schwidefsky

s390/uaccess: fix clear_user_pt()

2013-03-21T12:35:38+00:00

The page table walker variant of clear_user() is supposed to copy the
contents of the empty zero page to user space.
However since 238ec4ef "[S390] zero page cache synonyms" empty_zero_page
is not anymore the page itself but contains the pointer to the empty zero
pages. Therefore the page table walker variant of clear_user() copied
the address of the first empty zero page and afterwards more or less
random data to user space instead of clearing the given user space range.

Signed-off-by: Heiko Carstens 
Signed-off-by: Martin Schwidefsky

s390/uaccess: fix kernel ds access for page table walk

2013-02-28T08:37:12+00:00

When the kernel resides in home space and the mvcos instruction is not
available uaccesses for kernel ds happen via simple strnlen() or memcpy()
calls.
This however can break badly, since uaccesses in kernel space may fail as
well, especially if CONFIG_DEBUG_PAGEALLOC is turned on.

To fix this implement strnlen_kernel() and copy_in_kernel() functions
which can only be used by the page table uaccess functions. These two
functions detect invalid memory accesses and return the correct length
of processed data.. Both functions are more or less a copy of the std
variants without sacf calls.

Fixes ipl crashes on 31 bit machines as well on 64 bit machines without
mvcos. Caused by changing the default address space of the kernel being
home space.

Signed-off-by: Heiko Carstens 
Signed-off-by: Martin Schwidefsky

s390/uaccess: fix strncpy_from_user string length check

2013-02-28T08:37:11+00:00

The "standard" and page table walk variants of strncpy_from_user() first
check the length of the to be copied string in userspace.
The string is then copied to kernel space and the length returned to the
caller.
However userspace can modify the string at any time while the kernel
checks for the length of the string or copies the string. In result the
returned length of the string is not necessarily correct.
Fix this by copying in a loop which mimics the mvcos variant of
strncpy_from_user(), which handles this correctly.

Signed-off-by: Heiko Carstens 
Signed-off-by: Martin Schwidefsky

s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen case

2013-02-28T08:37:08+00:00

If the maximum length specified for the to be accessed string for
strncpy_from_user() and strnlen_user() is zero the following incorrect
values would be returned or incorrect memory accesses would happen:

strnlen_user_std() and strnlen_user_pt() incorrectly return "1"
strncpy_from_user_pt() would incorrectly access "dst[maxlen - 1]"
strncpy_from_user_mvcos() would incorrectly return "-EFAULT"

Fix all these oddities by adding early checks.

Reviewed-by: Gerald Schaefer 
Signed-off-by: Heiko Carstens 
Signed-off-by: Martin Schwidefsky

s390/uaccess: shorten strncpy_from_user/strnlen_user

2013-02-28T08:37:07+00:00

Always stay within page boundaries when copying from user within
strlen_user_mvcos()/strncpy_from_user_mvcos(). This allows to
shorten the code a bit and may prevent unnecessary faults, since
we copy quite large amounts of memory to kernel space.

Also directly call the mvcos variants of copy_from_user() to
avoid indirect branches.

Signed-off-by: Heiko Carstens 
Signed-off-by: Martin Schwidefsky

s390/mm: implement software dirty bits

2013-02-14T14:55:23+00:00

The s390 architecture is unique in respect to dirty page detection,
it uses the change bit in the per-page storage key to track page
modifications. All other architectures track dirty bits by means
of page table entries. This property of s390 has caused numerous
problems in the past, e.g. see git commit ef5d437f71afdf4a
"mm: fix XFS oops due to dirty pages without buffers on s390".

To avoid future issues in regard to per-page dirty bits convert
s390 to a fault based software dirty bit detection mechanism. All
user page table entries which are marked as clean will be hardware
read-only, even if the pte is supposed to be writable. A write by
the user process will trigger a protection fault which will cause
the user pte to be marked as dirty and the hardware read-only bit
is removed.

With this change the dirty bit in the storage key is irrelevant
for Linux as a host, but the storage key is still required for
KVM guests. The effect is that page_test_and_clear_dirty and the
related code can be removed. The referenced bit in the storage
key is still used by the page_test_and_clear_young primitive to
provide page age information.

For page cache pages of mappings with mapping_cap_account_dirty
there will not be any change in behavior as the dirty bit tracking
already uses read-only ptes to control the amount of dirty pages.
Only for swap cache pages and pages of mappings without
mapping_cap_account_dirty there can be additional protection faults.
To avoid an excessive number of additional faults the mk_pte
primitive checks for PageDirty if the pgprot value allows for writes
and pre-dirties the pte. That avoids all additional faults for
tmpfs and shmem pages until these pages are added to the swap cache.

Signed-off-by: Martin Schwidefsky

s390/time: rename tod clock access functions

2013-02-14T14:55:10+00:00

Fix name clash with some common code device drivers and add "tod"
to all tod clock access function names.

Signed-off-by: Heiko Carstens 
Signed-off-by: Martin Schwidefsky