| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to efficiently serve web traffic on a NUMA
machine, one must avoid as many NUMA domain crossings as
possible. With SO_REUSEPORT_LB, a number of workers can share a
listen socket. However, even if a worker sets affinity to a core
or set of cores on a NUMA domain, it will receive connections
associated with all NUMA domains in the system. This will lead to
cross-domain traffic when the server writes to the socket or
calls sendfile(), and memory is allocated on the server's local
NUMA node, but transmitted on the NUMA node associated with the
TCP connection. Similarly, when the server reads from the socket,
it will likely be reading memory allocated on the NUMA domain
associated with the TCP connection.
This change provides a new socket option, TCP_REUSPORT_LB_NUMA. A
server can now tell the kernel to filter traffic so that only
incoming connections associated with the desired NUMA domain are
given to the server. (Of course, in the case where there are no
servers sharing the listen socket on some domain, then as a
fallback, traffic will be hashed as normal to all servers sharing
the listen socket regardless of domain). This allows a server to
deal only with traffic that is local to its NUMA domain, and
avoids cross-domain traffic in most cases.
This patch, and a corresponding small patch to nginx to use
TCP_REUSPORT_LB_NUMA, allow us to serve 190Gb/s of kTLS-encrypted
https media content from dual-socket Xeons with only 13% (as
measured by pcm.x) cross-domain traffic on the memory controller.
Reviewed by: jhb, bz (earlier version), bcr (man page)
Tested by: gonzo
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21636
Notes:
svn path=/head/; revision=368819
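The filtering and fallback behaviour described in this commit can be modelled roughly as follows. This is a minimal userland sketch, not the kernel code: struct lb_listener and lb_pick() are hypothetical names, and the use of a simple modulo over a connection hash is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one listener sharing a load-balanced listen socket. */
struct lb_listener {
	int id;		/* listener identifier */
	int domain;	/* NUMA domain the server asked to be filtered on */
};

/*
 * Pick a listener for a connection associated with conn_domain.
 * Listeners registered for that domain are preferred; if there are none,
 * fall back to hashing across every listener regardless of domain.
 * Returns the listener id, or -1 if the group is empty.
 */
static int
lb_pick(const struct lb_listener *grp, size_t n, int conn_domain,
    uint32_t hash)
{
	size_t i, local = 0;

	if (n == 0)
		return (-1);
	for (i = 0; i < n; i++)
		if (grp[i].domain == conn_domain)
			local++;
	if (local == 0)
		return (grp[hash % n].id);	/* fallback: any domain */
	local = hash % local;			/* index among local listeners */
	for (i = 0; i < n; i++)
		if (grp[i].domain == conn_domain && local-- == 0)
			return (grp[i].id);
	return (-1);				/* not reached */
}
```

With listeners on domains 0, 1 and 1, a connection on domain 1 is only ever handed to the second or third listener, while a connection on a domain with no listener is spread across all three.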
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ktls_bind_thread is 2, we pick a ktls worker thread that is
bound to the same domain as the TCP connection associated with
the socket. We use roughly the same code as netinet/tcp_hpts.c to
do this. This allows crypto to run on the same domain as the TCP
connection is associated with. Assuming TCP_REUSPORT_LB_NUMA
(D21636) is in place and in use, this ensures that the crypto source
and destination buffers are local to the same NUMA domain as we're
running crypto on.
This change (when TCP_REUSPORT_LB_NUMA, D21636, is used) reduces
cross-domain traffic from over 37% down to about 13% as measured
by pcm.x on a dual-socket Xeon using nginx and a Netflix workload.
Reviewed by: jhb
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21648
Notes:
svn path=/head/; revision=368818
|
|
|
|
|
|
|
| |
Sponsored by: Mellanox Technologies // NVIDIA Networking
Notes:
svn path=/head/; revision=368801
|
|
|
|
|
|
|
|
|
|
|
| |
reversal.
MFC after: 1 week
Reported by: Mark Millard <marklmi@yahoo.com>
Sponsored by: Mellanox Technologies // NVIDIA Networking
Notes:
svn path=/head/; revision=368799
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This partially lifts a restriction imposed by r191639 ("Prevent a superuser
inside a jail from modifying the dedicated root cpuset of that jail") that's
perhaps beneficial after r192895 ("Add hierarchical jails."). Jails still
cannot modify their own cpuset, but they can modify child jails' roots to
further restrict them or widen them back to the modifying jails' own mask.
As a side effect of this, the system root may once again widen the mask of
jails as long as they're still using a subset of the parent jails' mask.
This was previously prevented by the fact that cpuset_getroot of a root set
will return that root, rather than the root's parent -- cpuset_modify uses
cpuset_getroot since it was introduced in r327895, previously it was just
validating against set->cs_parent which allowed the system root to widen
jail masks.
Reviewed by: jamie
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27352
Notes:
svn path=/head/; revision=368779
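The subset rule described above can be illustrated with a small sketch. It assumes CPU masks fit in a single 64-bit bitmap and uses hypothetical helper names; the real cpuset code works on cpuset_t and has more checks. Rejecting an empty mask is an extra assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: CPU masks modelled as 64-bit bitmaps, one bit per CPU. */
static bool
mask_is_subset(uint64_t child, uint64_t parent)
{
	return ((child & ~parent) == 0);
}

/*
 * A child jail's root may be narrowed or widened, as long as the new
 * mask stays within the mask of the jail doing the modification.
 * Rejecting an empty mask is an assumption made for this sketch.
 */
static bool
child_root_change_ok(uint64_t new_child_mask, uint64_t modifier_mask)
{
	return (new_child_mask != 0 &&
	    mask_is_subset(new_child_mask, modifier_mask));
}
```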
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These Mini-Box LCDs are using Microchip components and sub-licensed product
IDs. Whilst here, update the constant names and descriptions for the products
to use the names listed on the manufacturer's website rather than vague ones.
The picoLCD 4x20 is named that on the manufacturer's website so prefer that
name, even though linux-usb.org lists it with the numbers reversed as one might
expect.
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D27670
Notes:
svn path=/head/; revision=368774
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also centralize and unify checks to enable ASLR stack gap in a new
helper exec_stackgap().
PR: 239873
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=368772
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use traditional explicit leading zero format for hex numbers.
Align P2_ hex values.
Wrap long lines by splitting comments.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=368771
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rtsock code was built around the assumption that each rtentry record
in the system radix tree is a ready-to-use sockaddr. This assumption
turned out to be not quite true:
* masks have their length tweaked, so we have rtsock_fix_netmask() hack
* IPv6 addresses have their scope embedded, so we have another explicit
deembedding hack.
Change the code to decouple rtentry internals from rtsock code using
newly-created rtentry accessors. This will allow us to eventually eliminate
both of the hacks and change the rtentry dst/mask format.
Differential Revision: https://reviews.freebsd.org/D27451
Notes:
svn path=/head/; revision=368769
|
|
|
|
|
|
|
|
|
|
|
| |
This sysctl node can generate very verbose output, so don't trigger it
for sysctl -a or sysctl vm.pmap.
Reviewed by: markj, kib
Differential Revision: https://reviews.freebsd.org/D27504
Notes:
svn path=/head/; revision=368768
|
|
|
|
|
|
|
|
|
|
|
|
| |
These implementation IDs are defined in the SBI spec, so we should print
their name if detected.
Submitted by: Danjel Qyteza <danq1222@gmail.com>
Reviewed by: jhb, kp
Differential Revision: https://reviews.freebsd.org/D27660
Notes:
svn path=/head/; revision=368767
|
|
|
|
|
|
|
|
| |
This clock is used by the watchdog IP and can be controlled only
in the secure world.
Notes:
svn path=/head/; revision=368766
|
|
|
|
|
|
|
|
|
|
|
| |
Prefer these newly-added definitions to bare values.
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Notes:
svn path=/head/; revision=368765
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to the recent patch to arm's gdb stub in r368414, allow GDB to
update the contents of most general purpose registers.
Reviewed by: cem, jhb, markj
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
NetApp PR: 44
Differential Revision: https://reviews.freebsd.org/D27642
Notes:
svn path=/head/; revision=368764
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SRAT may contain multiple distinct entries that together describe a
contiguous region of physical memory. In this case we were not
coalescing the corresponding entries in the memory affinity table, which
led to fragmented phys_avail[] entries. Since r338431 the vm_phys_segs[]
entries derived from phys_avail[] will be coalesced, resulting in a
situation where vm_phys_segs[] entries do not have a covering
phys_avail[] entry. vm_page_startup() will not add such segments to the
physical memory allocator, leaving them unused.
Reported by: Don Morris <dgmorris@earthlink.net>
Reviewed by: kib, vangyzen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27620
Notes:
svn path=/head/; revision=368763
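The coalescing step can be sketched as below. This is an illustration only: struct mem_range and coalesce_ranges() are hypothetical names, and the sketch assumes the entries are already sorted by start address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical affinity-table entry describing [start, end) on one domain. */
struct mem_range {
	uint64_t start;
	uint64_t end;
	int domain;
};

/*
 * Merge, in place, neighbouring entries that are physically contiguous
 * and belong to the same domain. Assumes the array is sorted by start.
 * Returns the number of entries left after merging.
 */
static size_t
coalesce_ranges(struct mem_range *r, size_t n)
{
	size_t i, out = 0;

	if (n == 0)
		return (0);
	for (i = 1; i < n; i++) {
		if (r[i].domain == r[out].domain && r[i].start == r[out].end)
			r[out].end = r[i].end;	/* extend previous entry */
		else
			r[++out] = r[i];	/* start a new entry */
	}
	return (out + 1);
}
```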
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This caused us to write to the low half of the feature word twice, once with
the high bits and once with the low bits. Common legacy device implementations
seem to be fairly lenient about being able to write to the feature bits
multiple times, but Arm's models use a stricter implementation that will ignore
the second write. This fixes using vtnet(4) on those models.
Reported by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Pointy hat: jrtc27
Notes:
svn path=/head/; revision=368761
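A rough model of the intended behaviour: each 32-bit half of the 64-bit feature word is written exactly once, with a selector choosing the half. The device model below is hypothetical (struct feat_dev and its helpers are not driver code), it deliberately ignores a second write to the same half to mimic a strict implementation, and the high-then-low order is an assumption.

```c
#include <stdint.h>

/* Hypothetical device model: a selector picks which 32-bit half is written. */
struct feat_dev {
	uint32_t sel;		/* 0 = low half, 1 = high half */
	uint32_t half[2];	/* accepted feature bits */
	unsigned writes[2];	/* number of writes seen per half */
};

static void
dev_write_sel(struct feat_dev *d, uint32_t sel)
{
	d->sel = sel & 1;
}

static void
dev_write_features(struct feat_dev *d, uint32_t v)
{
	/* Strict model: only the first write to a half is accepted. */
	if (d->writes[d->sel]++ == 0)
		d->half[d->sel] = v;
}

/* Write the high bits to the high half and the low bits to the low half. */
static void
negotiate(struct feat_dev *d, uint64_t features)
{
	dev_write_sel(d, 1);
	dev_write_features(d, (uint32_t)(features >> 32));
	dev_write_sel(d, 0);
	dev_write_features(d, (uint32_t)features);
}
```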
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instead of bailing out with EBUSY if there are any.
If the driver module is unloaded, or the device is forcibly detached from
the driver, there is no other way for the driver to unload correctly,
especially if there are resources dedicated to the VFs which prevent
tearing down other resources.
Reviewed by: jhb
Sponsored by: Mellanox Technologies / NVidia Networking
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27615
Notes:
svn path=/head/; revision=368749
|
|
|
|
|
|
|
|
|
|
|
| |
Reapply r364240 after driver update in r365617.
Reviewed by: lwhsu
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D27561
Notes:
svn path=/head/; revision=368745
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The argument is a void * so there's no need to cast it to caddr_t.
Update documentation to match function declaration.
Reviewed by: freqlabs
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27093
Notes:
svn path=/head/; revision=368744
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: imp, hselasky
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27582
Notes:
svn path=/head/; revision=368741
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to r366897, this uses the .incbin directive to pull in a
firmware file's contents into a .fwo file. The same scheme for
computing symbol names from the filename is used as before to maximize
compatibility and not require rebuilding existing .fwo files for
NO_CLEAN builds. Using ld -b binary requires extra hacks in linkers
to either specify ABI options (e.g. soft- vs hard-float) or to ignore
ABI incompatibilities when linking certain objects (e.g. object files
with only data). Using the compiler driver avoids the need for these
hacks as the compiler driver is able to set all the appropriate ABI
options.
Reviewed by: imp, markj
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27579
Notes:
svn path=/head/; revision=368739
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Use [u]intptr_t casts to convert pointers to integers.
- Change IS_ERR* to return bool instead of long.
Reviewed by: manu
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27577
Notes:
svn path=/head/; revision=368738
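The two bullet points above can be illustrated with a sketch of Linux-style error pointers. The sketch_ prefix marks these as hypothetical stand-ins, not the actual LinuxKPI definitions; the limit of 4095 follows the usual Linux convention.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_MAX_ERRNO	4095

/* True if ptr encodes a negative errno in the top 4095 address values. */
static inline bool
sketch_is_err(const void *ptr)
{
	return ((uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO);
}

/* Encode a negative errno as a pointer, going through intptr_t. */
static inline void *
sketch_err_ptr(long error)
{
	return ((void *)(intptr_t)error);
}

/* Recover the errno from an error pointer, going through intptr_t. */
static inline long
sketch_ptr_err(const void *ptr)
{
	return ((long)(intptr_t)ptr);
}
```

Casting through [u]intptr_t rather than long keeps the conversion valid on architectures where a pointer is wider than long.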
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we do not own the session lock, a parallel killjobc() might
reset s_leader to NULL after we checked it. Read s_leader only once
and ensure that the compiler is not allowed to reload it.
While there, make access to t_session somewhat prettier by using a
local variable.
PR: 251915
Submitted by: Jakub Piecuch <j.piecuch96@gmail.com>
MFC after: 1 week
Notes:
svn path=/head/; revision=368735
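The "read once" pattern can be sketched in portable C11 as below. The struct and function names are hypothetical, and the sketch only shows the single load into a local; it says nothing about how the kernel keeps the leader process alive afterwards.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the process and session structures. */
struct proc_s {
	int pid;
};

struct session_s {
	_Atomic(struct proc_s *) leader;	/* may be reset to NULL concurrently */
};

/*
 * Load the leader pointer exactly once and use only the local copy, so
 * a concurrent reset to NULL between the check and the use cannot
 * cause a NULL dereference. Returns -1 if there is no leader.
 */
static int
session_leader_pid(struct session_s *s)
{
	struct proc_s *p;

	p = atomic_load_explicit(&s->leader, memory_order_relaxed);
	return (p != NULL ? p->pid : -1);
}
```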
|
|
|
|
| |
Notes:
svn path=/head/; revision=368732
|
|
|
|
|
|
|
| |
This in particular avoids spurious lookups on close.
Notes:
svn path=/head/; revision=368731
|
|
|
|
| |
Notes:
svn path=/head/; revision=368730
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is just a minor optimization, but it's sensitive. This gives an improvement of 30-50 kpps.
Reviewed by: kp, markj, glebius, lutz_donnerhacke.de
Approved by: vmaffione (mentor)
Sponsored by: vstack.com
Differential Revision: https://reviews.freebsd.org/D27382
Notes:
svn path=/head/; revision=368727
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the capability for SPIBUS to have a child device with an IRQ.
For example, many ADC chips have a dedicated pin to signal "data ready",
and the host can just wait for an interrupt to go out and read the result.
It is the same code as in r282674 and r282702 for IICBUS by Michal Meloun.
Submitted by: Oskar Holmlund <oskar.holmlund@ohdata.se>
Differential Revision: https://reviews.freebsd.org/D27396
Notes:
svn path=/head/; revision=368725
|
|
|
|
|
|
|
|
|
|
| |
Remove the exynos SoC support. It hasn't been updated in a while,
isn't present in GENERIC, and nobody is motivated to resurrect it.
Differential Revision: https://reviews.freebsd.org/D24444
Notes:
svn path=/head/; revision=368724
|
|
|
|
|
|
|
|
| |
Setting DEBUG_FLAGS results in make installkernel trying to install debug
information that doesn't exist if the kernel was built without it.
Notes:
svn path=/head/; revision=368718
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're looking for file content differences, so ask the question of git
more directly. This helps a lot, saving tens of thousands of fork()s,
when the builder and editor see different stat() results (e.g., UIDs),
as they might with containers.
Submitted by: Nathaniel Wesley Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: bdrewery, emaste, imp
Obtained from: CheriBSD
MFC after: 3 days
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27646
Notes:
svn path=/head/; revision=368709
|
|
|
|
|
|
|
|
|
|
|
| |
refcount_acquire_if_not_zero returns true on saturation.
The case of 0 is handled by looping again, after which the originally
found pointer will no longer be there.
Noted by: kib
Notes:
svn path=/head/; revision=368703
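The semantics relied on here can be sketched with C11 atomics. This is not the sys/refcount.h implementation: the function name is hypothetical, and treating the top bit as the saturation marker is an assumption of the sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: the top bit marks a saturated counter. */
#define	SKETCH_SATURATED(v)	(((v) & 0x80000000u) != 0)

/*
 * Take a reference unless the count is zero. A saturated counter is
 * left untouched and reported as success, so only a zero count makes
 * the caller retry its lookup.
 */
static bool
sketch_acquire_if_not_zero(_Atomic uint32_t *count)
{
	uint32_t old = atomic_load(count);

	for (;;) {
		if (SKETCH_SATURATED(old))
			return (true);
		if (old == 0)
			return (false);
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return (true);
	}
}
```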
|
|
|
|
|
|
|
| |
MFC with: r368698
Notes:
svn path=/head/; revision=368700
|
|
|
|
| |
Notes:
svn path=/head/; revision=368699
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current approach, a hardcoded value plus a heuristic, does not conform
to the PCI(e) specification, and it fails on systems where the MSI-X BAR is
not initialized by BIOS/ACPI (many arm or arm64 systems, for example).
Instead, use the standard PCI(e) capability to determine the
MSI-X table BAR address.
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D27265
Notes:
svn path=/head/; revision=368698
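For reference, decoding the MSI-X Table Offset/BIR register (the dword at offset 0x04 of the MSI-X capability) looks roughly like this. The helper names are hypothetical and the register value is passed in directly, so no config-space access is shown.

```c
#include <stdint.h>

#define	SKETCH_PCIR_BAR0	0x10	/* config offset of the first BAR */
#define	SKETCH_MSIX_BIR_MASK	0x7u	/* low 3 bits: BAR indicator (BIR) */

/* Index of the BAR that holds the MSI-X table. */
static unsigned
msix_table_bir(uint32_t table_reg)
{
	return (table_reg & SKETCH_MSIX_BIR_MASK);
}

/* Byte offset of the MSI-X table inside that BAR (8-byte aligned). */
static uint32_t
msix_table_offset(uint32_t table_reg)
{
	return (table_reg & ~SKETCH_MSIX_BIR_MASK);
}

/* Config-space offset of the BAR register selected by the BIR. */
static unsigned
msix_table_bar_reg(uint32_t table_reg)
{
	return (SKETCH_PCIR_BAR0 + 4 * msix_table_bir(table_reg));
}
```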
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One of the disadvantages of our current busdma code is the fact that
we process the bounced buffer in a page-by-page manner. This means that
the short (subpage) buffer allocated across page boundaries is bounced
to 2 separate pages.
This suboptimal behavior is consistent across all platforms and can be
related to (probably unimplementable or incompatible with bouncing)
BUS_DMA_KEEP_PG_OFFSET flag.
Therefore, allocate one additional page to fully comply with this
requirement.
Discussed with: markj
PR: 251018
Notes:
svn path=/head/; revision=368697
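The arithmetic behind the extra page can be sketched as follows, assuming a 4096-byte page and hypothetical helper names. A buffer shorter than a page still touches two pages when it straddles a page boundary, so the worst case is one page more than the length alone suggests.

```c
#include <stddef.h>
#include <stdint.h>

#define	SKETCH_PAGE_SIZE	4096u	/* assumed page size */

/* Number of pages touched by the buffer [addr, addr + len). */
static size_t
pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return (0);
	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + len - 1) / SKETCH_PAGE_SIZE;
	return ((size_t)(last - first + 1));
}

/* Worst case over all alignments: round the length up, plus one page. */
static size_t
worst_case_pages(size_t len)
{
	return ((len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE + 1);
}
```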
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Use a uintptr_t cast to get the virtual address of a pointer in
USB_P2U() instead of a ptrdiff_t.
- Add offsets to a char * pointer directly without roundtripping the
pointer through a ptrdiff_t in USB_ADD_BYTES().
Reviewed by: imp, hselasky
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27581
Notes:
svn path=/head/; revision=368688
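A sketch of the two conversions described above, using hypothetical function names rather than the actual USB_P2U() and USB_ADD_BYTES() macros:

```c
#include <stddef.h>
#include <stdint.h>

/* Virtual address of a pointer as an integer, via uintptr_t. */
static inline uintptr_t
sketch_p2u(const void *ptr)
{
	return ((uintptr_t)ptr);
}

/*
 * Advance a pointer by a byte offset using char * arithmetic, so the
 * result stays derived from the original pointer instead of being
 * rebuilt from an integer.
 */
static inline void *
sketch_add_bytes(void *ptr, size_t off)
{
	return ((char *)ptr + off);
}
```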
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: imp, gallatin
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27580
Notes:
svn path=/head/; revision=368687
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The sense_ptr thing is quite broken. As near as I can tell, the
driver tries to copyout to a physical address rather than whatever
user address the sense buffer should be copied to. It is not
immediately obvious what user address the sense buffer should be
copied to.
Reviewed by: imp
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27578
Notes:
svn path=/head/; revision=368686
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: imp
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27576
Notes:
svn path=/head/; revision=368685
|
|
|
|
|
|
|
|
|
|
| |
This needs to account for empty NUMA domains or domains which do not
satisfy the requested range.
Discussed with: markj
Notes:
svn path=/head/; revision=368673
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move initialization of num_altsetting under USB_CFG_INIT, else
there will be a page fault when enumerating USB devices.
PR: 251856
MFC after: 1 week
Submitted by: Ma, Horse <Shichun.Ma@dell.com>
Sponsored by: Mellanox Technologies // NVIDIA Networking
Notes:
svn path=/head/; revision=368664
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow setting the alternate interface number to fail when there is only
one alternate setting present, to comply with the USB specification.
Refactor how iface->num_altsetting is computed.
Bump the __FreeBSD_version due to change of core USB structure.
PR: 251856
MFC after: 1 week
Submitted by: Ma, Horse <Shichun.Ma@dell.com>
Sponsored by: Mellanox Technologies // NVIDIA Networking
Notes:
svn path=/head/; revision=368659
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Limit the number of alternate settings to 256.
Else the alternate index variable may wrap around.
PR: 251856
MFC after: 1 week
Submitted by: Ma, Horse <Shichun.Ma@dell.com>
Sponsored by: Mellanox Technologies // NVIDIA Networking
Notes:
svn path=/head/; revision=368658
|
|
|
|
|
|
|
| |
Reported by: mjg
Notes:
svn path=/head/; revision=368651
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When r362031 moved local TLB invalidation after shootdown IPI send, it
moved too much. In particular, PCID-mode clearing of the pm_gen
generation counters must occur before IPIs are sent, which is in fact
described by the comment before seq_cst fence in the invalidation
functions.
Fix it by extracting pm_gen clearing into new helper
pmap_invalidate_preipi(), which is executed before a call to
smp_masked_tlb_shootdown().
The rest of the local invalidation callbacks are simplified as a result,
and become very similar to the remote shootdown handlers (to be merged in
the future).
Move pin of the thread to pmap_invalidate_preipi(), and do unpin in
smp_masked_tlb_shootdown().
Reported and tested by: mjg (previous version)
Reviewed by: alc, cem (previous version), markj
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D227588
Notes:
svn path=/head/; revision=368649
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ability to load-balance traffic over multiple paths is a must-have feature for routers.
It may also be used by servers to balance outgoing traffic over multiple default gateways.
The previous implementation, RADIX_MPATH, stayed in the shadows for too long.
It was not well maintained, which led to a vicious circle: people were using
non-contiguous masks or firewalls to achieve similar goals. As a result, some routing
daemon implementations still don't have multipath support enabled for FreeBSD.
Turning on ROUTE_MPATH by default would fix that. It will help reduce the networking
feature gap with other operating systems. Linux and OpenBSD enabled similar support
at least 5 years ago.
ROUTE_MPATH does not consume memory unless actually used. It enables around 1k LOC.
It does not bring any behaviour changes for userland.
Additionally, the feature is (temporarily) turned off by the net.route.multipath sysctl
defaulting to 0.
Differential Revision: https://reviews.freebsd.org/D27428
Notes:
svn path=/head/; revision=368648
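To make the idea of balancing over multiple gateways concrete, here is a generic weighted selection by flow hash. It is not the ROUTE_MPATH algorithm: struct nhop and select_nhop() are hypothetical, and the hash is taken as given.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical next hop with a relative weight. */
struct nhop {
	int id;
	uint32_t weight;
};

/*
 * Map a flow hash onto one of n next hops in proportion to their
 * weights. The same hash always yields the same next hop, so packets
 * of one flow stay on one path. Returns -1 if nothing is usable.
 */
static int
select_nhop(const struct nhop *nh, size_t n, uint32_t flow_hash)
{
	uint64_t total = 0, slot;
	size_t i;

	for (i = 0; i < n; i++)
		total += nh[i].weight;
	if (total == 0)
		return (-1);
	slot = flow_hash % total;
	for (i = 0; i < n; i++) {
		if (slot < nh[i].weight)
			return (nh[i].id);
		slot -= nh[i].weight;
	}
	return (-1);	/* not reached */
}
```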
|
|
|
|
| |
Notes:
svn path=/head/; revision=368635
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARM PMU may use a single per-core interrupt or may use multiple generic
interrupts, one per core. In this case, special attention must be paid to
the correct identification of the physical location of the core, its order
in the external database (FDT) and the associated cpuid.
Also keep in mind that a SoC can have multiple different PMUs
(usually one per cluster).
Notes:
svn path=/head/; revision=368634
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some older PSCI implementations corrupt (or do not pass) the context_id
argument to newly started secondary cores. Although the ideal solution to this
problem is a U-Boot update, we can find the correct value for the argument (cpuid)
by comparing the core's real mpidr register with the value stored in pcu->mpidr.
MFC after: 2 weeks
Notes:
svn path=/head/; revision=368633
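The lookup can be sketched as a search over per-CPU MPIDR values. The names below are hypothetical, the table stands in for whatever per-CPU data the kernel keeps, and only the affinity fields (Aff0 to Aff3) are compared.

```c
#include <stddef.h>
#include <stdint.h>

/* Affinity fields of MPIDR_EL1: Aff3 [39:32], Aff2..Aff0 [23:0]. */
#define	SKETCH_MPIDR_AFF_MASK	0xff00ffffffULL

/*
 * Find the cpuid whose recorded MPIDR matches the value read on the
 * running core. Non-affinity bits are ignored in the comparison.
 * Returns -1 if no entry matches.
 */
static int
cpuid_from_mpidr(const uint64_t *table, size_t ncpus, uint64_t my_mpidr)
{
	size_t i;

	for (i = 0; i < ncpus; i++) {
		if ((table[i] & SKETCH_MPIDR_AFF_MASK) ==
		    (my_mpidr & SKETCH_MPIDR_AFF_MASK))
			return ((int)i);
	}
	return (-1);
}
```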
|