<feed xmlns='http://www.w3.org/2005/Atom'>
<title>src/sys/amd64/include/proc.h, branch releng/11.0</title>
<subtitle>FreeBSD source tree</subtitle>
<id>https://cgit-dev.freebsd.org/src/atom?h=releng%2F11.0</id>
<link rel='self' href='https://cgit-dev.freebsd.org/src/atom?h=releng%2F11.0'/>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/'/>
<updated>2016-05-14T23:35:11Z</updated>
<entry>
<title>Eliminate pvh_global_lock from the amd64 pmap.</title>
<updated>2016-05-14T23:35:11Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2016-05-14T23:35:11Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=56e61f57b0939bd435c3b2f6b60649f94aa44ca5'/>
<id>urn:sha1:56e61f57b0939bd435c3b2f6b60649f94aa44ca5</id>
<content type='text'>
The only remaining purpose of the pvh lock was explained in this message:
On Wed, Jan 09, 2013 at 11:46:13PM -0600, Alan Cox wrote:
&gt; Let me lay out one example for you in detail.  Suppose that we have
&gt; three processors and two of these processors are actively using the same
&gt; pmap.  Now, one of the two processors sharing the pmap performs a
&gt; pmap_remove().  Suppose that one of the removed mappings is to a
&gt; physical page P.  Moreover, suppose that the other processor sharing
&gt; that pmap has this mapping cached with write access in its TLB.  Here's
&gt; where the trouble might begin.  As you might expect, the processor
&gt; performing the pmap_remove() will acquire the fine-grained lock on the
&gt; PV list for page P before destroying the mapping to page P.  Moreover,
&gt; this processor will ensure that the vm_page's dirty field is updated
&gt; before releasing that PV list lock.  However, the TLB shootdown for this
&gt; mapping may not be initiated until after the PV list lock is released.
&gt; The processor performing the pmap_remove() is not problematic, because
&gt; the code being executed by that processor won't presume that the mapping
&gt; is destroyed until the TLB shootdown has completed and pmap_remove() has
&gt; returned.  However, the other processor sharing the pmap could be
&gt; problematic.  Specifically, suppose that the third processor is
&gt; executing the page daemon and concurrently trying to reclaim page P.
&gt; This processor performs a pmap_remove_all() on page P in preparation for
&gt; reclaiming the page.  At this instant, the PV list for page P may
&gt; already be empty but our second processor still has a stale TLB entry
&gt; mapping page P.  So, changes might still occur to the page after the
&gt; page daemon believes that all mappings have been destroyed.  (If the PV
&gt; entry had still existed, then the pmap lock would have ensured that the
&gt; TLB shootdown completed before the pmap_remove_all() finished.)  Note,
&gt; however, the page daemon will know that the page is dirty.  It can't
&gt; possibly mistake a dirty page for a clean one.  However, without the
&gt; current pvh global locking, I don't think anything is stopping the page
&gt; daemon from starting the laundering process before the TLB shootdown has
&gt; completed.
&gt;
&gt; I believe that a similar example could be constructed with a clean page
&gt; P' and a stale read-only TLB entry.  In this case, the page P' could be
&gt; "cached" in the cache/free queues and recycled before the stale TLB
&gt; entry is flushed.

TLBs for addresses with updated PTEs are always flushed before the pmap
lock is released.  On the other hand, the amd64 pmap code does not always
flush TLBs before PV list locks are released when PTEs were previously
cleared and PV entries removed.

To handle the situation where one thread might observe an empty PV list
while another thread still has access to the page because TLB
invalidation has not finished yet, introduce delayed invalidation (DI).
Compared with the pvh_global_lock, DI does not block a thread that has
entered a DI block when pmap_remove_all() or pmap_remove_write() (the
callers of pmap_delayed_invl_wait()) execute in parallel.  But
_invl_wait() callers are blocked until all previously noted DI blocks
have been left, thus ensuring that the necessary TLB invalidations were
performed before returning from pmap_remove_all() or pmap_remove_write().
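
As a rough illustration only, the blocking rule can be modelled with a
table of open DI blocks and the pages each one has noted.  This is a
simplified sketch, not the kernel mechanism: the class and method names
are hypothetical, and wait_would_block() merely reports whether a
pmap_delayed_invl_wait()-style caller would have to sleep.

```python
class DelayedInvl:
    """Toy bookkeeping for delayed invalidation (DI); all names hypothetical."""

    def __init__(self):
        self.open_blocks = {}   # DI block id -> pages whose PV entries it removed
        self.next_id = 0

    def started(self):
        # A thread enters a DI block before it clears PTEs.
        self.next_id += 1
        self.open_blocks[self.next_id] = set()
        return self.next_id

    def note_page(self, block, page):
        # Record that this block removed a PV entry for the page while the
        # TLB invalidation is still pending.
        self.open_blocks[block].add(page)

    def finished(self, block):
        # The thread has completed its TLB invalidations and leaves the block.
        del self.open_blocks[block]

    def wait_would_block(self, page):
        # A waiter must sleep while any open block has noted the page.
        return any(page in pages for pages in self.open_blocks.values())


di = DelayedInvl()
blk = di.started()
di.note_page(blk, "P")
before = di.wait_would_block("P")   # the block that noted P is still open
other = di.wait_would_block("Q")    # Q was never noted by an open block
di.finished(blk)
after = di.wait_would_block("P")    # invalidation done, waiter may proceed
print(before, other, after)
```

In this model, waiters on unrelated pages are never delayed, and threads
inside a DI block never wait at all.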

See the comments for a detailed description of the mechanism, and for
explanations of why several pmap methods, most importantly
pmap_enter(), do not need DI protection.

Reviewed by:	alc, jhb (turnstile KPI usage)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D5747
</content>
</entry>
<entry>
<title>Handle spurious page faults that may occur in no-fault sections of the</title>
<updated>2012-03-22T04:52:51Z</updated>
<author>
<name>Alan Cox</name>
<email>alc@FreeBSD.org</email>
</author>
<published>2012-03-22T04:52:51Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=5730afc9b6405f9325dfbbf7164dd6c818191ff5'/>
<id>urn:sha1:5730afc9b6405f9325dfbbf7164dd6c818191ff5</id>
<content type='text'>
kernel.

When access restrictions are added to a page table entry, we flush the
corresponding virtual address mapping from the TLB.  In contrast, when
access restrictions are removed from a page table entry, we do not
flush the virtual address mapping from the TLB.  This is exactly as
recommended in AMD's documentation.  In effect, when access
restrictions are removed from a page table entry, AMD's MMUs will
transparently refresh a stale TLB entry.  In short, this saves us from
having to perform potentially costly TLB flushes.  In contrast,
Intel's MMUs are allowed to generate a spurious page fault based upon
the stale TLB entry.  Usually, such spurious page faults are handled
by vm_fault() without incident.  However, when we are executing
no-fault sections of the kernel, we are not allowed to execute
vm_fault().  This change introduces special-case handling for spurious
page faults that occur in no-fault sections of the kernel.

In collaboration with:	kib
Tested by:		gibbs (an earlier version)

I would also like to acknowledge Hiroki Sato's assistance in
diagnosing this problem.

MFC after:	1 week
</content>
</entry>
<entry>
<title>Remove unused define.</title>
<updated>2011-10-07T16:09:44Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2011-10-07T16:09:44Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=6bfe4c78c8ff63806fa0899bf46499579c71b8c6'/>
<id>urn:sha1:6bfe4c78c8ff63806fa0899bf46499579c71b8c6</id>
<content type='text'>
MFC after:	1 month
</content>
</entry>
<entry>
<title>Reorganize syscall entry and leave handling.</title>
<updated>2010-05-23T18:32:02Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2010-05-23T18:32:02Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=afe1a68827e6810d86163cc8919247f26a9c45b2'/>
<id>urn:sha1:afe1a68827e6810d86163cc8919247f26a9c45b2</id>
<content type='text'>
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
  usermode into struct syscall_args. The structure is machine-dependent
  (this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
  from the syscall. It is a generalization of
  cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
  return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

syscallenter() fetches the arguments, calls the syscall implementation
from the ABI sysent table, and sets up the return frame. The
end-of-syscall bookkeeping is done by syscallret().
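
The division of labour can be sketched roughly as follows.  This is a
toy model under stated assumptions: the sv_* names come from this
message, but the dictionary used as a trap frame, the method bodies and
the trace list are invented for illustration and do not reflect the
kernel's C interfaces.

```python
import errno


class SysVec:
    """Toy per-ABI vector; field names follow the commit, bodies are invented."""

    def __init__(self, sysent, names):
        self.sysent = sysent              # syscall number -> handler
        self.sv_syscallnames = names      # syscall number -> name

    def sv_fetch_syscall_args(self, frame):
        # Pull the syscall number and arguments out of the (toy) trap frame.
        return frame["code"], frame["args"]

    def sv_set_syscall_retval(self, frame, error, value):
        # Publish the result where user mode expects to find it.
        frame["error"] = error
        frame["retval"] = value


def syscallenter(sv, frame):
    # Common entry path: fetch arguments, dispatch, set up the return value.
    code, args = sv.sv_fetch_syscall_args(frame)
    handler = sv.sysent.get(code)
    if handler is None:
        error, value = errno.ENOSYS, None
    else:
        error, value = 0, handler(*args)
    sv.sv_set_syscall_retval(frame, error, value)
    return error


def syscallret(frame, trace):
    # End-of-syscall bookkeeping; here it only records that the call returned.
    trace.append(("ret", frame["error"]))


sv = SysVec({1: lambda a, b: a + b}, {1: "add"})
frame = {"code": 1, "args": (2, 3)}
trace = []
syscallenter(sv, frame)
syscallret(frame, trace)
print(sv.sv_syscallnames[1], frame["retval"], trace)
```

Because the ABI-specific parts sit behind the two sv_* hooks, the entry
and return functions themselves stay the same for every ABI.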

Take advantage of the single place for MI syscall handling code and
implement the ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at the syscall entry or return point respectively.  The
EXEC flag augments SCX and notifies the debugger that the process address
space was changed by one of the exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by:	jhb, marcel, marius, nwhitehorn, stas
Tested by:	marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
	stas (mips)
MFC after:	1 month
</content>
</entry>
<entry>
<title>Style: use #define&lt;TAB&gt; instead of #define&lt;SPACE&gt;.</title>
<updated>2010-04-27T09:48:43Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2010-04-27T09:48:43Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=8bac98182aa3f4976b0f5c3067a53f8b29b5a526'/>
<id>urn:sha1:8bac98182aa3f4976b0f5c3067a53f8b29b5a526</id>
<content type='text'>
Noted by:	bde, pluknet gmail com
MFC after:	11 days
</content>
</entry>
<entry>
<title>Move the constants specifying the size of struct kinfo_proc into</title>
<updated>2010-04-24T12:49:52Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2010-04-24T12:49:52Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=ed7806879b9fb7bd34a9ec2ae1360760e87d244d'/>
<id>urn:sha1:ed7806879b9fb7bd34a9ec2ae1360760e87d244d</id>
<content type='text'>
machine-specific header files. Add KINFO_PROC32_SIZE for struct
kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add
CTASSERT for the size of struct kinfo_proc32.

Submitted by:	pluknet
Reviewed by:	imp, jhb, nwhitehorn
MFC after:	2 weeks
</content>
</entry>
<entry>
<title>Save and restore segment registers on amd64 when entering and leaving</title>
<updated>2009-04-01T13:09:26Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2009-04-01T13:09:26Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=2c66cccab7ceceb3eed086da3b2dedfc77ce72de'/>
<id>urn:sha1:2c66cccab7ceceb3eed086da3b2dedfc77ce72de</id>
<content type='text'>
the kernel. Fill and read segment registers for mcontext and
signals. Handle traps caused by restoration of the
invalidated selectors.

Implement user-mode creation and manipulation of the process-specific
LDT descriptors for amd64, see sysarch(2).

Implement support for TSS i/o port access permission bitmap for amd64.

Context-switch LDT and TSS. Do not save and restore segment registers on
context switch; that is now handled by the kernel enter/leave
trampolines. Remove segment restore code from the signal trampolines for
freebsd/amd64, freebsd/ia32 and linux/i386 for the same reason.

Implement amd64-specific compat shims for sysarch.

Linuxolator (temporarily?) switched to using gsbase for the thread_area pointer.

TODO:
Currently, gdb is not adapted to show segment registers from struct reg.
Also, no machine-dependent ptrace command has been added to set segment
registers for the debugged process.

In collaboration with:	pho
Discussed with:	peter
Reviewed by:	jhb
Linuxolator tested by:	dchagin
</content>
</entry>
<entry>
<title>Move GET_STACK_USAGE from MI header to i386/amd64 MD ones.</title>
<updated>2008-01-31T08:24:27Z</updated>
<author>
<name>Alexander Motin</name>
<email>mav@FreeBSD.org</email>
</author>
<published>2008-01-31T08:24:27Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=2a57ca33c742c600a5837ba32f9325b79e669f45'/>
<id>urn:sha1:2a57ca33c742c600a5837ba32f9325b79e669f45</id>
<content type='text'>
Somebody who can, please feel free to implement it for other archs
or copy this one if it suits.
</content>
</entry>
<entry>
<title>Divorce critical sections from spinlocks.  Critical sections as denoted by</title>
<updated>2005-04-04T21:53:56Z</updated>
<author>
<name>John Baldwin</name>
<email>jhb@FreeBSD.org</email>
</author>
<published>2005-04-04T21:53:56Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=c6a37e84139a1c73d4ef46ce4fdf8598a0ebbf45'/>
<id>urn:sha1:c6a37e84139a1c73d4ef46ce4fdf8598a0ebbf45</id>
<content type='text'>
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any effect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.
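
To illustrate why the common case is cheap, here is a toy model of a
nesting counter with deferred preemption.  It is a sketch only: the
class, its attribute names and the preemption counter are illustrative
assumptions, not the kernel's data structures.

```python
class Thread:
    """Toy model of a thread's critical-section state (names are illustrative)."""

    def __init__(self):
        self.critnest = 0          # nesting depth of critical sections
        self.owepreempt = False    # a preemption was requested while nested
        self.preemptions = 0       # how many preemptions actually ran

    def critical_enter(self):
        # Plain increment: no lock taken, interrupts left alone.
        self.critnest += 1

    def request_preempt(self):
        # Inside a critical section the preemption is only recorded.
        if self.critnest != 0:
            self.owepreempt = True
        else:
            self.preemptions += 1

    def critical_exit(self):
        assert self.critnest != 0, "critical_exit() without critical_enter()"
        self.critnest -= 1
        # Leaving the outermost section runs any deferred preemption.
        if self.critnest == 0 and self.owepreempt:
            self.owepreempt = False
            self.preemptions += 1


td = Thread()
td.critical_enter()
td.critical_enter()      # sections nest
td.request_preempt()     # deferred, not executed
td.critical_exit()       # still nested: nothing happens
pending = td.preemptions
td.critical_exit()       # outermost exit: the deferred preemption runs
print(pending, td.preemptions)
```

When no preemption is pending, entering and leaving a section costs one
increment and one decrement.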

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fix up the idle threads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
</content>
</entry>
<entry>
<title>Begin all license/copyright comments with /*-</title>
<updated>2005-01-05T20:17:21Z</updated>
<author>
<name>Warner Losh</name>
<email>imp@FreeBSD.org</email>
</author>
<published>2005-01-05T20:17:21Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=46280ae71938465be665fb19cf8f7d1ca48a379a'/>
<id>urn:sha1:46280ae71938465be665fb19cf8f7d1ca48a379a</id>
<content type='text'>
</content>
</entry>
</feed>
