<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/x86/kernel/kvmclock.c, branch linux-3.9.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>x86: kvmclock: zero initialize pvclock shared memory area</title>
<updated>2013-06-27T17:38:44+00:00</updated>
<author>
<name>Igor Mammedov</name>
<email>imammedo@redhat.com</email>
</author>
<published>2013-06-10T16:31:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=65f918123bd108a9b50bceaf4b32ac173b9e3f3e'/>
<id>65f918123bd108a9b50bceaf4b32ac173b9e3f3e</id>
<content type='text'>
commit 07868fc6aaf57847b0f3a3d53086b7556eb83f4a upstream.

kernel might hung in pvclock_clocksource_read() due to
uninitialized memory might contain odd version value in
following cycle:

        do {
                version = __pvclock_read_cycles(src, &amp;ret, &amp;flags);
        } while ((src-&gt;version &amp; 1) || version != src-&gt;version);

if secondary kvmclock is accessed before it's registered with kvm.

Clear garbage in pvclock shared memory area right after it's
allocated to avoid this issue.

Ref: https://bugzilla.kernel.org/show_bug.cgi?id=59521
Signed-off-by: Igor Mammedov &lt;imammedo@redhat.com&gt;
[See BZ for analysis.  We may want a different fix for 3.11, but
 this is the safest for now - Paolo]
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 07868fc6aaf57847b0f3a3d53086b7556eb83f4a upstream.

kernel might hung in pvclock_clocksource_read() due to
uninitialized memory might contain odd version value in
following cycle:

        do {
                version = __pvclock_read_cycles(src, &amp;ret, &amp;flags);
        } while ((src-&gt;version &amp; 1) || version != src-&gt;version);

if secondary kvmclock is accessed before it's registered with kvm.

Clear garbage in pvclock shared memory area right after it's
allocated to avoid this issue.

Ref: https://bugzilla.kernel.org/show_bug.cgi?id=59521
Signed-off-by: Igor Mammedov &lt;imammedo@redhat.com&gt;
[See BZ for analysis.  We may want a different fix for 3.11, but
 this is the safest for now - Paolo]
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2013-02-24T21:07:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-02-24T21:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=89f883372fa60f604d136924baf3e89ff1870e9e'/>
<id>89f883372fa60f604d136924baf3e89ff1870e9e</id>
<content type='text'>
Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: pvclock kvm: align allocation size to page size</title>
<updated>2013-02-19T02:05:19+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2013-02-19T01:58:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ed55705dd5008b408c48a8459b8b34b01f3de985'/>
<id>ed55705dd5008b408c48a8459b8b34b01f3de985</id>
<content type='text'>
To match whats mapped via vsyscalls to userspace.

Reported-by: Peter Hurley &lt;peter@hurleysoftware.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To match whats mapped via vsyscalls to userspace.

Reported-by: Peter Hurley &lt;peter@hurleysoftware.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86, kvm: Fix kvm's use of __pa() on percpu areas</title>
<updated>2013-01-26T00:34:55+00:00</updated>
<author>
<name>Dave Hansen</name>
<email>dave@linux.vnet.ibm.com</email>
</author>
<published>2013-01-22T21:24:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5dfd486c4750c9278c63fa96e6e85bdd2fb58e9d'/>
<id>5dfd486c4750c9278c63fa96e6e85bdd2fb58e9d</id>
<content type='text'>
In short, it is illegal to call __pa() on an address holding
a percpu variable.  This replaces those __pa() calls with
slow_virt_to_phys().  All of the cases in this patch are
in boot time (or CPU hotplug time at worst) code, so the
slow pagetable walking in slow_virt_to_phys() is not expected
to have a performance impact.

The times when this actually matters are pretty obscure
(certain 32-bit NUMA systems), but it _does_ happen.  It is
important to keep KVM guests working on these systems because
the real hardware is getting harder and harder to find.

This bug manifested first by me seeing a plain hang at boot
after this message:

	CPU 0 irqstacks, hard=f3018000 soft=f301a000

or, sometimes, it would actually make it out to the console:

[    0.000000] BUG: unable to handle kernel paging request at ffffffff

I eventually traced it down to the KVM async pagefault code.
This can be worked around by disabling that code either at
compile-time, or on the kernel command-line.

The kvm async pagefault code was injecting page faults in
to the guest which the guest misinterpreted because its
"reason" was not being properly sent from the host.

The guest passes a physical address of an per-cpu async page
fault structure via an MSR to the host.  Since __pa() is
broken on percpu data, the physical address it sent was
bascially bogus and the host went scribbling on random data.
The guest never saw the real reason for the page fault (it
was injected by the host), assumed that the kernel had taken
a _real_ page fault, and panic()'d.  The behavior varied,
though, depending on what got corrupted by the bad write.

Signed-off-by: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20130122212435.4905663F@kernel.stglabs.ibm.com
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In short, it is illegal to call __pa() on an address holding
a percpu variable.  This replaces those __pa() calls with
slow_virt_to_phys().  All of the cases in this patch are
in boot time (or CPU hotplug time at worst) code, so the
slow pagetable walking in slow_virt_to_phys() is not expected
to have a performance impact.

The times when this actually matters are pretty obscure
(certain 32-bit NUMA systems), but it _does_ happen.  It is
important to keep KVM guests working on these systems because
the real hardware is getting harder and harder to find.

This bug manifested first by me seeing a plain hang at boot
after this message:

	CPU 0 irqstacks, hard=f3018000 soft=f301a000

or, sometimes, it would actually make it out to the console:

[    0.000000] BUG: unable to handle kernel paging request at ffffffff

I eventually traced it down to the KVM async pagefault code.
This can be worked around by disabling that code either at
compile-time, or on the kernel command-line.

The kvm async pagefault code was injecting page faults in
to the guest which the guest misinterpreted because its
"reason" was not being properly sent from the host.

The guest passes a physical address of an per-cpu async page
fault structure via an MSR to the host.  Since __pa() is
broken on percpu data, the physical address it sent was
bascially bogus and the host went scribbling on random data.
The guest never saw the real reason for the page fault (it
was injected by the host), assumed that the kernel had taken
a _real_ page fault, and panic()'d.  The behavior varied,
though, depending on what got corrupted by the bad write.

Signed-off-by: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20130122212435.4905663F@kernel.stglabs.ibm.com
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvm guest: pvclock vsyscall support</title>
<updated>2012-11-28T01:29:10+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2012-11-28T01:28:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3dc4f7cfb7441e5e0fed3a02fc81cdaabd28300a'/>
<id>3dc4f7cfb7441e5e0fed3a02fc81cdaabd28300a</id>
<content type='text'>
Hook into generic pvclock vsyscall code, with the aim to
allow userspace to have visibility into pvclock data.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hook into generic pvclock vsyscall code, with the aim to
allow userspace to have visibility into pvclock data.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvmclock: allocate pvclock shared memory area</title>
<updated>2012-11-28T01:29:05+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2012-11-28T01:28:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7069ed67635b8d574541af426d752cd7fbd465a6'/>
<id>7069ed67635b8d574541af426d752cd7fbd465a6</id>
<content type='text'>
We want to expose the pvclock shared memory areas, which
the hypervisor periodically updates, to userspace.

For a linear mapping from userspace, it is necessary that
entire page sized regions are used for array of pvclock
structures.

There is no such guarantee with per cpu areas, therefore move
to memblock_alloc based allocation.

Acked-by: Glauber Costa &lt;glommer@parallels.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We want to expose the pvclock shared memory areas, which
the hypervisor periodically updates, to userspace.

For a linear mapping from userspace, it is necessary that
entire page sized regions are used for array of pvclock
structures.

There is no such guarantee with per cpu areas, therefore move
to memblock_alloc based allocation.

Acked-by: Glauber Costa &lt;glommer@parallels.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvmclock: remove check_and_clear_guest_paused warning</title>
<updated>2012-06-12T02:18:33+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2012-06-12T02:18:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e32025a56403df4386cd61a741c0a36afe79ae8a'/>
<id>e32025a56403df4386cd61a741c0a36afe79ae8a</id>
<content type='text'>
CPU offline path calls the hrtimer interrupt handler with interrupts
disabled, without touching preempt_count, triggering this warning.

Remove the warning since it is supposed to be used from hrtimer
interrupt context only.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
CPU offline path calls the hrtimer interrupt handler with interrupts
disabled, without touching preempt_count, triggering this warning.

Remove the warning since it is supposed to be used from hrtimer
interrupt context only.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kvmclock: remove unneeded EXPORT macro</title>
<updated>2012-04-08T09:49:54+00:00</updated>
<author>
<name>Eric B Munson</name>
<email>emunson@mgebm.net</email>
</author>
<published>2012-03-15T22:16:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=248997095d652576f1213028a95ca5fff85d089f'/>
<id>248997095d652576f1213028a95ca5fff85d089f</id>
<content type='text'>
check_and_clear_guest_paused does not need to be exported as it isn't used
by any modules, remove the export.

Signed-off-by: Eric B Munson &lt;emunson@mgebm.net&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
check_and_clear_guest_paused does not need to be exported as it isn't used
by any modules, remove the export.

Signed-off-by: Eric B Munson &lt;emunson@mgebm.net&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>kvmclock: Add functions to check if the host has stopped the vm</title>
<updated>2012-04-08T09:48:59+00:00</updated>
<author>
<name>Eric B Munson</name>
<email>emunson@mgebm.net</email>
</author>
<published>2012-03-10T19:37:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3b5d56b9317fa7b5407dff1aa7b115bf6cdbd494'/>
<id>3b5d56b9317fa7b5407dff1aa7b115bf6cdbd494</id>
<content type='text'>
When a host stops or suspends a VM it will set a flag to show this.  The
watchdog will use these functions to determine if a softlockup is real, or the
result of a suspended VM.

Signed-off-by: Eric B Munson &lt;emunson@mgebm.net&gt;
asm-generic changes Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a host stops or suspends a VM it will set a flag to show this.  The
watchdog will use these functions to determine if a softlockup is real, or the
result of a suspended VM.

Signed-off-by: Eric B Munson &lt;emunson@mgebm.net&gt;
asm-generic changes Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvmclock: abstract save/restore sched_clock_state</title>
<updated>2012-03-20T10:37:45+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2012-02-13T13:07:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b74f05d61b73af584d0c39121980171389ecfaaa'/>
<id>b74f05d61b73af584d0c39121980171389ecfaaa</id>
<content type='text'>
Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.

Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.

Also move restore_sched_clock_state before __restore_processor_state,
since the later calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
Thanks to Igor Mammedov for tracking it down.

Fixes suspend-to-disk with kvmclock.

Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.

Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.

Also move restore_sched_clock_state before __restore_processor_state,
since the later calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
Thanks to Igor Mammedov for tracking it down.

Fixes suspend-to-disk with kvmclock.

Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</pre>
</div>
</content>
</entry>
</feed>
