<feed xmlns='http://www.w3.org/2005/Atom'>
<title>src/sys/amd64/include, branch releng/11.0</title>
<subtitle>FreeBSD source tree</subtitle>
<id>https://cgit-dev.freebsd.org/src/atom?h=releng%2F11.0</id>
<link rel='self' href='https://cgit-dev.freebsd.org/src/atom?h=releng%2F11.0'/>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/'/>
<updated>2016-07-15T09:44:48Z</updated>
<entry>
<title>MFC r302635:</title>
<updated>2016-07-15T09:44:48Z</updated>
<author>
<name>Roger Pau Monné</name>
<email>royger@FreeBSD.org</email>
</author>
<published>2016-07-15T09:44:48Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=b9ca940a8818c6328e505342c0d6289cab2734b0'/>
<id>urn:sha1:b9ca940a8818c6328e505342c0d6289cab2734b0</id>
<content type='text'>
xen: automatically disable MSI-X interrupt migration

Approved by:	re (kib)
</content>
</entry>
<entry>
<title>Replace a number of conflations of mp_ncpus and mp_maxid with either</title>
<updated>2016-07-06T14:09:49Z</updated>
<author>
<name>Nathan Whitehorn</name>
<email>nwhitehorn@FreeBSD.org</email>
</author>
<published>2016-07-06T14:09:49Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=96c85efb4b85dbc0bfc4968516a627a727ac7ec5'/>
<id>urn:sha1:96c85efb4b85dbc0bfc4968516a627a727ac7ec5</id>
<content type='text'>
mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in
the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try,
for example, to run tasks on CPUs that did not exist or to allocate too
few buffers on systems with sparse CPU IDs in which there are holes in the
range and mp_maxid &gt; mp_ncpus. Such circumstances generally occur on
systems with SMT, but on which SMT is disabled. This patch restores system
operation at least on POWER8 systems configured in this way.
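
A minimal Python sketch of the failure mode described above, not kernel
code.  The CPU ID layout (every eighth ID present) and the helper
cpu_foreach() are assumptions made for illustration; only mp_ncpus,
mp_maxid and CPU_FOREACH() are names taken from this commit message.

```python
# Hypothetical system with SMT disabled: only every eighth CPU ID exists.
present_cpus = [0, 8, 16, 24]
mp_ncpus = len(present_cpus)   # number of CPUs present
mp_maxid = max(present_cpus)   # highest CPU ID in use


def cpu_foreach(cpus):
    """Yield only the CPU IDs that exist, in the spirit of CPU_FOREACH()."""
    yield from sorted(cpus)


# Buggy pattern: assume CPU IDs are dense in [0, mp_ncpus).
dense_guess = list(range(mp_ncpus))
missing = [cpu for cpu in dense_guess if cpu not in present_cpus]

# Safer pattern: size per-CPU arrays by mp_maxid + 1 and visit only the
# CPUs that are actually present.
buffers = [None] * (mp_maxid + 1)
for cpu in cpu_foreach(present_cpus):
    buffers[cpu] = "buf%d" % cpu
```

In this model the dense loop would target CPUs 1, 2 and 3, which do not
exist, and an array sized by mp_ncpus would be too short to index by CPU 24.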

There are a number of other places in the kernel with potential problems
in these situations, but where sparse CPU IDs are not currently known
to occur, mostly in the ARM machine-dependent code. These will be fixed
in a follow-up commit after the stable/11 branch.

PR:		kern/210106
Reviewed by:	jhb
Approved by:	re (glebius)
</content>
</entry>
<entry>
<title>atomic: Add testandclear on i386/amd64</title>
<updated>2016-05-16T07:19:33Z</updated>
<author>
<name>Sepherosa Ziehau</name>
<email>sephe@FreeBSD.org</email>
</author>
<published>2016-05-16T07:19:33Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=dfdc9a05c6797d244cb733f108719fdcfa0e8379'/>
<id>urn:sha1:dfdc9a05c6797d244cb733f108719fdcfa0e8379</id>
<content type='text'>
Reviewed by:	kib
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6381
</content>
</entry>
<entry>
<title>Eliminate pvh_global_lock from the amd64 pmap.</title>
<updated>2016-05-14T23:35:11Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2016-05-14T23:35:11Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=56e61f57b0939bd435c3b2f6b60649f94aa44ca5'/>
<id>urn:sha1:56e61f57b0939bd435c3b2f6b60649f94aa44ca5</id>
<content type='text'>
The only current purpose of the pvh lock was explained in the following
message.
On Wed, Jan 09, 2013 at 11:46:13PM -0600, Alan Cox wrote:
&gt; Let me lay out one example for you in detail.  Suppose that we have
&gt; three processors and two of these processors are actively using the same
&gt; pmap.  Now, one of the two processors sharing the pmap performs a
&gt; pmap_remove().  Suppose that one of the removed mappings is to a
&gt; physical page P.  Moreover, suppose that the other processor sharing
&gt; that pmap has this mapping cached with write access in its TLB.  Here's
&gt; where the trouble might begin.  As you might expect, the processor
&gt; performing the pmap_remove() will acquire the fine-grained lock on the
&gt; PV list for page P before destroying the mapping to page P.  Moreover,
&gt; this processor will ensure that the vm_page's dirty field is updated
&gt; before releasing that PV list lock.  However, the TLB shootdown for this
&gt; mapping may not be initiated until after the PV list lock is released.
&gt; The processor performing the pmap_remove() is not problematic, because
&gt; the code being executed by that processor won't presume that the mapping
&gt; is destroyed until the TLB shootdown has completed and pmap_remove() has
&gt; returned.  However, the other processor sharing the pmap could be
&gt; problematic.  Specifically, suppose that the third processor is
&gt; executing the page daemon and concurrently trying to reclaim page P.
&gt; This processor performs a pmap_remove_all() on page P in preparation for
&gt; reclaiming the page.  At this instant, the PV list for page P may
&gt; already be empty but our second processor still has a stale TLB entry
&gt; mapping page P.  So, changes might still occur to the page after the
&gt; page daemon believes that all mappings have been destroyed.  (If the PV
&gt; entry had still existed, then the pmap lock would have ensured that the
&gt; TLB shootdown completed before the pmap_remove_all() finished.)  Note,
&gt; however, the page daemon will know that the page is dirty.  It can't
&gt; possibly mistake a dirty page for a clean one.  However, without the
&gt; current pvh global locking, I don't think anything is stopping the page
&gt; daemon from starting the laundering process before the TLB shootdown has
&gt; completed.
&gt;
&gt; I believe that a similar example could be constructed with a clean page
&gt; P' and a stale read-only TLB entry.  In this case, the page P' could be
&gt; "cached" in the cache/free queues and recycled before the stale TLB
&gt; entry is flushed.

TLBs for addresses with updated PTEs are always flushed before the pmap
lock is released.  On the other hand, the amd64 pmap code does not always
flush TLBs before PV list locks are released, if PTEs were previously
cleared and PV entries removed.

To handle the situation where a thread might notice an empty PV list while
a third thread still has access to the page because TLB invalidation
has not finished yet, introduce delayed invalidation (DI).  Compared with
the pvh_global_lock, DI does not block a thread that has entered a DI block
when pmap_remove_all() or pmap_remove_write() (the callers of
pmap_delayed_invl_wait()) are executed in parallel.  But _invl_wait()
callers are blocked until all previously noted DI blocks are left,
thus ensuring that the necessary TLB invalidations were performed before
returning from pmap_remove_all() or pmap_remove_write().

See the comments for a detailed description of the mechanism, and also for
an explanation of why several pmap methods, most importantly
pmap_enter(), do not need DI protection.

Reviewed by:	alc, jhb (turnstile KPI usage)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D5747
</content>
</entry>
<entry>
<title>Add locking annotations to amd64 struct md_page members.</title>
<updated>2016-05-10T09:58:51Z</updated>
<author>
<name>Konstantin Belousov</name>
<email>kib@FreeBSD.org</email>
</author>
<published>2016-05-10T09:58:51Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=aa3ec63e0231b84d10dc4b585ee0adc3519d6db2'/>
<id>urn:sha1:aa3ec63e0231b84d10dc4b585ee0adc3519d6db2</id>
<content type='text'>
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
</content>
</entry>
<entry>
<title>Add a new bus method to fetch device-specific CPU sets.</title>
<updated>2016-05-09T20:50:21Z</updated>
<author>
<name>John Baldwin</name>
<email>jhb@FreeBSD.org</email>
</author>
<published>2016-05-09T20:50:21Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=8d791e5af1e6773782656648cdef47bdc35002fb'/>
<id>urn:sha1:8d791e5af1e6773782656648cdef47bdc35002fb</id>
<content type='text'>
bus_get_cpus() returns a specified set of CPUs for a device.  It accepts
an enum for the second parameter that indicates the type of cpuset to
request.  Currently two values are supported:

 - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
   the device when DEVICE_NUMA is enabled)
 - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL.  INTR_CPUS is mapped to 'all_cpus'
by default.  The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from the
x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled.  They also AND
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.
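
The semantics described above can be modeled roughly as follows.  This is
a Python sketch under stated assumptions, not the kernel implementation:
the 8-CPU topology, the string-valued op argument and the numa_enabled
flag are hypothetical, and the function only borrows the name
bus_get_cpus() from this commit message.

```python
import errno

# Hypothetical topology: 2 packages, 2 cores each, 2 SMT threads per core.
INTR_CPUS_GLOBAL = frozenset({0, 2, 4, 6})  # one SMT thread per core
DOMAIN_CPUS = {
    0: frozenset({0, 1, 2, 3}),
    1: frozenset({4, 5, 6, 7}),
}


def bus_get_cpus(domain, op, numa_enabled=True):
    """Return (error, cpuset) for a device attached to the given domain."""
    have_domain = numa_enabled and domain in DOMAIN_CPUS
    if op == "LOCAL_CPUS":
        # Without NUMA information there is no local set to report.
        if not have_domain:
            return errno.EINVAL, None
        return 0, DOMAIN_CPUS[domain]
    if op == "INTR_CPUS":
        # INTR_CPUS should always return a valid set.
        if not have_domain:
            return 0, INTR_CPUS_GLOBAL
        # AND the global interrupt set with the per-domain set.
        return 0, INTR_CPUS_GLOBAL.intersection(DOMAIN_CPUS[domain])
    return errno.EINVAL, None
```

In this model a device in domain 1 gets CPUs 4 and 6 for INTR_CPUS, while
the same request without NUMA support falls back to the global set.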

Compared to r298933, this version uses 'struct _cpuset' in
&lt;sys/bus.h&gt; instead of 'cpuset_t' to avoid requiring &lt;sys/param.h&gt;
(&lt;sys/_cpuset.h&gt; still requires &lt;sys/param.h&gt; for MAXCPU even though
&lt;sys/_bitset.h&gt; does not after recent changes).
</content>
</entry>
<entry>
<title>sys/amd64: Small spelling fixes.</title>
<updated>2016-05-03T22:13:04Z</updated>
<author>
<name>Pedro F. Giffuni</name>
<email>pfg@FreeBSD.org</email>
</author>
<published>2016-05-03T22:13:04Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=edafb5a327f6d9d3cdcc3ac98d0591665e288f1a'/>
<id>urn:sha1:edafb5a327f6d9d3cdcc3ac98d0591665e288f1a</id>
<content type='text'>
No functional change.
</content>
</entry>
<entry>
<title>Revert bus_get_cpus() for now.</title>
<updated>2016-05-03T01:17:40Z</updated>
<author>
<name>John Baldwin</name>
<email>jhb@FreeBSD.org</email>
</author>
<published>2016-05-03T01:17:40Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=8a08b7d36b7946ba88a40afe0165de8671d4f568'/>
<id>urn:sha1:8a08b7d36b7946ba88a40afe0165de8671d4f568</id>
<content type='text'>
I really thought I had run this through the tinderbox before committing,
but many places need &lt;sys/types.h&gt; -&gt; &lt;sys/param.h&gt; for &lt;sys/bus.h&gt; now.
</content>
</entry>
<entry>
<title>Add a new bus method to fetch device-specific CPU sets.</title>
<updated>2016-05-02T18:00:38Z</updated>
<author>
<name>John Baldwin</name>
<email>jhb@FreeBSD.org</email>
</author>
<published>2016-05-02T18:00:38Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=bc153c692ff962bda50532211fdc0086a58e6c52'/>
<id>urn:sha1:bc153c692ff962bda50532211fdc0086a58e6c52</id>
<content type='text'>
bus_get_cpus() returns a specified set of CPUs for a device.  It accepts
an enum for the second parameter that indicates the type of cpuset to
request.  Currently two values are supported:

 - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
   the device when DEVICE_NUMA is enabled)
 - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL.  INTR_CPUS is mapped to 'all_cpus'
by default.  The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from the
x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled.  They also AND
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.

Reviewed by:	wblock (manpage)
Differential Revision:	https://reviews.freebsd.org/D5519
</content>
</entry>
<entry>
<title>Enable DEVICE_NUMA with up to 8 domains by default on amd64.</title>
<updated>2016-04-12T21:23:44Z</updated>
<author>
<name>John Baldwin</name>
<email>jhb@FreeBSD.org</email>
</author>
<published>2016-04-12T21:23:44Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=7ecf8cab6f4b3c8cb2c2a551e03b3c617d7d628d'/>
<id>urn:sha1:7ecf8cab6f4b3c8cb2c2a551e03b3c617d7d628d</id>
<content type='text'>
8 memory domains should handle a quad-socket board with dual-domain
processors.

Reviewed by:	kib
Relnotes:	maybe?
Differential Revision:	https://reviews.freebsd.org/D5893
</content>
</entry>
</feed>
