path: root/sys
Commit message | Author | Age | Files | Lines
* dts: Fix arm dts path | Emmanuel Vadot | 2024-03-21 | 4 | -37/+37
  Linux 6.5 moved to a vendor-based subdirectory for arm DTS, change our Makefiles accordingly.
  Sponsored by: Beckhoff Automation GmbH & Co. KG
* Import device-tree files from Linux 6.5 | Emmanuel Vadot | 2024-03-21 | 4146 | -217071/+253668
  Sponsored by: Beckhoff Automation GmbH & Co. KG
* kassert.h: update MPASS definition commentary | Mitchell Horne | 2024-03-21 | 1 | -2/+8
  We now have a detailed man page describing both MPASS and KASSERT. Give a warning that careless use of MPASS can result in inadequate assertion messages, and point to the MPASS(9) page which describes this.
  While here add a comment above the KASSERT definitions pointing to the man page.
  Suggested by: bz
  Reviewed by: emaste
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D44438
* arm64: Add EL1 hardware breakpoint exceptions | Andrew Turner | 2024-03-21 | 4 | -1/+7
  Reviewed by: jhb
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D44353
* arm64: Use a switch to decide when to enable debug | Andrew Turner | 2024-03-21 | 1 | -2/+8
  Use a switch statement to decide which exceptions we need to call dbg_enable for. This simplifies adding more exceptions to the list in the future.
  Reviewed by: jhb
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D44352
* arm64: Always set the debug control and value regs | Andrew Turner | 2024-03-21 | 1 | -13/+14
  When listing watchpoints we read the raw registers. To ensure we print an accurate list, always set the watchpoint and breakpoint registers.
  Reviewed by: jhb
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D44351
* arm64: Mask non-debug exceptions when single stepping | Andrew Turner | 2024-03-21 | 1 | -0/+12
  When an exception is pending when single stepping we may execute the handler for that exception rather than the single step handler. This could cause the scheduler to fire to run a new thread. This will mean we single step to a new thread causing unexpected results.
  Handle this by masking non-debug exceptions. This will cause issues when stepping over instructions that access the DAIF values so future work is needed to handle these cases, but for most code this now works as expected.
  Reviewed by: jhb
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D44350
* arm64: Split out a savectx version of vfp_save_state | Andrew Turner | 2024-03-21 | 3 | -19/+28
  Rather than try to detect when vfp_save_state is called by savectx use a separate function that sets up the pcb as needed.
  Reviewed by: imp
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D43304
* arm64: Support passing more registers to signals | Andrew Turner | 2024-03-21 | 2 | -7/+94
  To support recent extensions to the Arm architecture we may need to store more or larger registers when sending a signal. To support this create a list of these extra registers. Userspace that needs to access a register in the signal handler can then walk the list to find the correct register struct and read/write its contents.
  Reviewed by: kib, markj (earlier version)
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D43302
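As an illustration of the "walk the list" step described above, here is a minimal sketch in C. The record header layout (`struct ctx_hdr`), the terminator value `CTX_ID_END`, and the function name `ctx_find` are all hypothetical; the commit does not spell out the real structures, so treat this only as one plausible shape for a tagged, length-prefixed register list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTX_ID_END 0u		/* hypothetical list terminator */

/* Hypothetical header that starts every extra-register record. */
struct ctx_hdr {
	uint32_t id;		/* which register set this record holds */
	uint32_t size;		/* record size in bytes, header included */
};

/* Walk the records in buf and return the one tagged 'want', or NULL. */
static struct ctx_hdr *
ctx_find(void *buf, size_t len, uint32_t want)
{
	unsigned char *p = buf;
	size_t off = 0;

	assert(buf != NULL);
	while (len - off >= sizeof(struct ctx_hdr)) {
		struct ctx_hdr *h = (struct ctx_hdr *)(void *)(p + off);

		if (h->id == CTX_ID_END)
			return (NULL);
		/* Reject records that are truncated or overrun the buffer. */
		if (h->size < sizeof(*h) || h->size > len - off)
			return (NULL);
		if (h->id == want)
			return (h);
		off += h->size;
	}
	return (NULL);
}
```

A signal handler would call something like `ctx_find()` on the extra-register area and then cast the returned record to the matching register struct. The size checks keep a malformed list from walking past the end of the buffer.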
* Merge commit bbb8a0df7367 from llvm-project (by Shafik Yaghmour): | Dimitry Andric | 2024-03-20 | 1 | -1/+1
  [Clang] Fix ResolveConstructorOverload to not select a conversion function if we are going use copy elision
  ResolveConstructorOverload needs to check properly if we are going to use copy elision we can't use a conversion function.
  This fixes:
  https://github.com/llvm/llvm-project/issues/39319
  https://github.com/llvm/llvm-project/issues/60182
  https://github.com/llvm/llvm-project/issues/62157
  https://github.com/llvm/llvm-project/issues/64885
  https://github.com/llvm/llvm-project/issues/65568
  Differential Revision: https://reviews.llvm.org/D148474
  This should fix 'Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!")' errors when building devel/boost-libs, specifically libs/url/src/segments_view.cpp.
  Bump __FreeBSD_version so this fix can easily be detected from devel/boost-all/compiled.mk.
  PR: 273335
* cxgbe tom: Handle a race condition when enabling TLS offload | John Baldwin | 2024-03-20 | 2 | -4/+13
  Use a separate state for when a request to set RX_QUIESCE has been sent but the resulting TCB reply has not been received. In particular, this correctly handles the case where data has been received and queued in the receive queue before the quiesce request takes effect.
  Reviewed by: np
  Sponsored by: Chelsio Communications
  Differential Revision: https://reviews.freebsd.org/D44435
* NFS: Request use of TCP_USE_DDP for in-kernel TCP sockets | John Baldwin | 2024-03-20 | 3 | -0/+29
  Since this is an optimization, ignore failures to enable the option.
  For the server side, defer enabling DDP until the first non-NULLPROC RPC is received. This allows TLS handling (which uses NULLPROC RPCs) to enable TLS offload first.
  Reviewed by: rmacklem
  Sponsored by: Chelsio Communications
  Differential Revision: https://reviews.freebsd.org/D44002
* cxgbe: Support TCP_USE_DDP on offloaded TOE connections | John Baldwin | 2024-03-20 | 7 | -59/+853
  When this socket option is enabled, relatively large contiguous buffers are allocated and used to receive data from the remote connection. When data is received a wrapper M_EXT mbuf is queued to the socket's receive buffer. This reduces the length of the linked list of received mbufs and allows consumers to consume receive data in larger chunks.
  To minimize reprogramming the page pods in the adapter, receive buffers for a given connection are recycled. When a buffer has been fully consumed by the receiver and freed, the buffer is placed on a per-connection free buffers list.
  The size of the receive buffers defaults to 256k and can be set via the hw.cxgbe.toe.ddp_rcvbuf_len sysctl. The hw.cxgbe.toe.ddp_rcvbuf_cache sysctl (defaults to 4) determines the maximum number of free buffers cached per connection. Note that this limit does not apply to "in-flight" receive buffers that are associated with mbufs in the socket's receive buffer.
  Co-authored-by: Navdeep Parhar <np@FreeBSD.org>
  Sponsored by: Chelsio Communications
  Differential Revision: https://reviews.freebsd.org/D44001
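The buffer recycling described above can be pictured as a small bounded free list per connection. The sketch below is plain userspace C, not the driver code: `struct rcvbuf`, `struct buf_cache`, `buf_get` and `buf_put` are hypothetical names, and `malloc`/`free` stand in for whatever allocation and page-pod programming the adapter really needs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical receive buffer; the payload memory itself is omitted. */
struct rcvbuf {
	struct rcvbuf *next;
	size_t len;
};

/* Hypothetical per-connection cache of fully consumed buffers. */
struct buf_cache {
	struct rcvbuf *free_list;
	unsigned int ncached;		/* buffers currently on free_list */
	unsigned int max_cached;	/* cap, cf. ddp_rcvbuf_cache */
	size_t buf_len;			/* size, cf. ddp_rcvbuf_len */
};

/* Reuse a cached buffer when one exists, otherwise allocate a new one. */
static struct rcvbuf *
buf_get(struct buf_cache *c)
{
	struct rcvbuf *b = c->free_list;

	if (b != NULL) {
		c->free_list = b->next;
		c->ncached--;
		b->next = NULL;
		return (b);
	}
	b = malloc(sizeof(*b));
	if (b != NULL) {
		b->next = NULL;
		b->len = c->buf_len;
	}
	return (b);
}

/* Return a consumed buffer: cache it if there is room, else free it. */
static void
buf_put(struct buf_cache *c, struct rcvbuf *b)
{
	assert(b != NULL);
	if (c->ncached >= c->max_cached) {
		free(b);
		return;
	}
	b->next = c->free_list;
	c->free_list = b;
	c->ncached++;
}
```

With the defaults quoted in the commit message, a cache would be initialised with `max_cached = 4` and `buf_len = 256 * 1024`. Buffers handed out by `buf_get()` are not counted against the cap, which mirrors the note that the limit does not apply to in-flight buffers.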
* tcp: Add a new kernel-only TCP_USE_DDP socket option | John Baldwin | 2024-03-20 | 1 | -0/+3
  This socket option can be used by in-kernel consumers (like NFS) to request a NIC to use optimized receive of large buffers for a connection. The current use case is to support DDP by the TOE on Chelsio NICs.
  Reviewed by: rscheff, tuexen, glebius
  Sponsored by: Chelsio Communications
  Differential Revision: https://reviews.freebsd.org/D44000
* ddp: Clear active DDP buffer members to NULL to pacify an assertion | John Baldwin | 2024-03-20 | 1 | -1/+8
  Reviewed by: np
  Sponsored by: Chelsio Communications
  Differential Revision: https://reviews.freebsd.org/D43999
* arm/GENERIC: Remove TI DTBs | Emmanuel Vadot | 2024-03-20 | 1 | -2/+0
  We've removed TI support in 3416e102c4e9 ("arm: Remove TI code from GENERIC") so no need to build the DTBs now.
  Sponsored by: Beckhoff Automation GmbH & Co. KG
* ip6_output: Reduce cache misses on pktopts | Andrew Gallatin | 2024-03-20 | 2 | -40/+83
  When profiling an IP6 heavy workload, I noticed that we were getting a lot of cache misses in ip6_output() around ip6_pktopts. This was happening because the TCP stack passes inp->in6p_outputopts even if all options are unused. So in the common case of no options present, pkt_opts is not null, and is checked repeatedly for different options. Since ip6_pktopts is large (4 cachelines), and every field is checked, we take 4 cache misses (2 of which tend to be hidden by the adjacent line prefetcher).
  To fix this common case, I introduced a new flag in ip6_pktopts (ip6po_valid) which tracks which options have been set. In the common case where nothing is set, this causes just a single cache miss to load. It also eliminates a test for some options (`if (opt != NULL && opt->val >= const)` vs `if ((optvalid & flag) != 0)`).
  To keep the struct the same size in 64-bit kernels, and to keep the integer values (like ip6po_hlim, ip6po_tclass, etc) on the same cacheline, I moved them to the top.
  As suggested by zlei, the null check in MAKE_EXTHDR() becomes redundant, and can be removed.
  For our web server workload (with the ip6po_tclass option set), this drops the CPI from 2.9 to 2.4 for ip6_output.
  Differential Revision: https://reviews.freebsd.org/D44204
  Reviewed by: bz, glebius, zlei
  No Objection from: melifaro
  Sponsored by: Netflix Inc.
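The validity-flag idea above can be sketched as follows. This is a simplified illustration rather than the actual ip6_pktopts layout: `struct pkt_opts`, the `OPT_VALID_*` bits and the helper names are hypothetical, and only the general technique (a bitmask in the first cache line that records which options were ever set) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical validity bits, one per option. */
#define OPT_VALID_HLIM		0x01u
#define OPT_VALID_TCLASS	0x02u
#define OPT_VALID_ROUTE		0x04u

/* Hot fields first so they share a cache line with the bitmask. */
struct pkt_opts {
	uint32_t valid;		/* which options below have been set */
	int hlim;
	int tclass;
	void *route;		/* larger, rarely used options follow */
};

static void
opts_set_tclass(struct pkt_opts *o, int tclass)
{
	assert(o != NULL);
	o->tclass = tclass;
	o->valid |= OPT_VALID_TCLASS;
}

/* One flag test replaces a pointer check plus a value check. */
static int
opts_tclass(const struct pkt_opts *o, int dflt)
{
	if (o == NULL || (o->valid & OPT_VALID_TCLASS) == 0)
		return (dflt);
	return (o->tclass);
}

/* Fast path: a single load tells us that no option is set at all. */
static bool
opts_any(const struct pkt_opts *o)
{
	return (o != NULL && o->valid != 0);
}
```

In the common no-options case a caller can test `opts_any()` once and skip every per-option check, touching only the first cache line of the structure.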
* sys/syscallsubr.h: align definition of kern_fcntl_freebsd() on 32bit | Konstantin Belousov | 2024-03-20 | 1 | -1/+1
  Fixes: d0efabdf15d956e9bc0414356ed798ca3c846e08
* sysent: regen | Brooks Davis | 2024-03-19 | 4 | -104/+104
* syscalls.master: use __acl_type_t | Brooks Davis | 2024-03-19 | 1 | -12/+12
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44418
* sys/acl.h: move main typedefs to sys/_types.h | Brooks Davis | 2024-03-19 | 2 | -7/+16
  Make __ prefixed versions available without the pollution of sys/acl.h (and by extension sys/param.h).
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44382
* syscalls.master: make __sys_fcntl take an intptr_t | Brooks Davis | 2024-03-19 | 3 | -7/+7
  The (optional) third argument of fcntl is sometimes a pointer so change the type to intptr_t.
  Update the libc-internal definition (actually used by libthr) to take a fixed intptr_t argument rather than pretending it's a variadic function. (That worked because all supported architectures pass variadic arguments as though the function was declared with those types. In CheriBSD that changes because variadic arguments are passed via a bounded array.)
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44381
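To make the variadic versus fixed-argument distinction above concrete, here is a small sketch of a variadic front end that extracts its optional argument and forwards it to a fixed `intptr_t` entry point. It is not the libc or libthr code: `fcntl_wrapper`, `sys_fcntl_fixed` and the two command values are hypothetical stand-ins chosen so the example is self-contained.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical commands: one takes an int, the other a pointer. */
enum { CMD_TAKES_INT = 1, CMD_TAKES_PTR = 2 };

static intptr_t last_arg;	/* records what the fixed entry point saw */

/* Fixed-signature entry point: the third argument is always intptr_t. */
static int
sys_fcntl_fixed(int fd, int cmd, intptr_t arg)
{
	assert(fd >= 0);
	(void)cmd;
	last_arg = arg;
	return (0);
}

/* Variadic front end: fetch the argument with its real type, then widen. */
static int
fcntl_wrapper(int fd, int cmd, ...)
{
	va_list ap;
	intptr_t arg = 0;

	va_start(ap, cmd);
	switch (cmd) {
	case CMD_TAKES_INT:
		arg = va_arg(ap, int);
		break;
	case CMD_TAKES_PTR:
		arg = (intptr_t)va_arg(ap, void *);
		break;
	}
	va_end(ap);
	return (sys_fcntl_fixed(fd, cmd, arg));
}
```

Only the outer function is variadic; everything below it sees one well-defined `intptr_t`, so the call no longer relies on variadic and fixed arguments being passed the same way.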
* syscalls.master: struct siginfo -> struct __siginfo | Brooks Davis | 2024-03-19 | 1 | -3/+3
  struct siginfo doesn't exist, it's struct __siginfo (and siginfo_t).
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44380
* freebsd32: struct siginfo32 -> struct __siginfo32 | Brooks Davis | 2024-03-19 | 10 | -21/+22
  In the next commit I will update syscalls.master to use struct __siginfo (which actually exists) so this update will be needed to make generated files (from make sysent) align.
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44380
* syscalls.master: align with sigfastblock declaration | Brooks Davis | 2024-03-19 | 1 | -1/+1
  sigfastblock is declared to take a void * argument in the manpage and in headers so declare it that way and use SAL annotations to say it interacts with a 32-bit word.
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44379
* syscall.master: fix aio_suspend signature | Brooks Davis | 2024-03-19 | 1 | -1/+1
  It takes a `const struct aiocb * const *` as its first argument.
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44378
* syscalls.master: fix readv and writev iovp decl | Brooks Davis | 2024-03-19 | 1 | -2/+2
  Both take const struct iovec * and only read the values.
  Reviewed by: olce, kib
  Differential Revision: https://reviews.freebsd.org/D44377
* freebsd32: freebsd32_copyinuio takes const iovp | Brooks Davis | 2024-03-19 | 2 | -2/+2
  We only read the iovp so make it const like in copyinuio.
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44376
* Define stack_t in sys/_sigaltstack.h | Brooks Davis | 2024-03-19 | 2 | -23/+66
  The sigaltstack(2) definition needs this type so make it available without all of sys/signal.h.
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D44383
* arm64: Move curthread setup earlier | Isaac Cilia Attard | 2024-03-19 | 1 | -16/+16
  In 469cfa3c30ee cperciva added TSLOG profiling to link_elf_ireloc. This requires curthread to be read when the kernel linker is invoked, but it hadn't yet been initialized. On amd64 this was harmless since [gs:0] was readable; but on arm64 this broke since [x18] was not readable.
  Move the curthread (and associated PCPU) setup earlier on arm64 in order to allow TSLOG to work there.
  Fixes: 469cfa3c30ee ("tslog: Annotate some early boot functions")
  Differential Revision: https://reviews.freebsd.org/D44317
* carp: check CARP status in in_localip_fib(), in6_localip_fib() | Gleb Smirnoff | 2024-03-19 | 2 | -2/+6
  Don't report a BACKUP CARP address as local. These two functions are used only by source address validation for input packets, controlled by sysctls net.inet.ip.source_address_validation and net.inet6.ip6.source_address_validation. For this purpose we definitely want to treat BACKUP addresses as non local.
  This change is conservative and doesn't modify compat in_localip() and in6_localip(). They are used more widely than the FIB-aware versions. The change would modify the notion of ipfw(4) 'me' keyword. There might be other consequences as in_localip() is used by various tunneling protocols.
  PR: 277349
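The rule above (an address that is configured locally but is currently a CARP BACKUP does not count as local) can be sketched with a toy address table. The types and names here (`enum carp_state`, `struct local_addr`, `addr_is_local`) are hypothetical and much simpler than the kernel's address hash and CARP state handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical CARP status for a configured address. */
enum carp_state { CARP_NONE, CARP_MASTER, CARP_BACKUP };

struct local_addr {
	uint32_t addr;			/* IPv4 address, host byte order */
	enum carp_state carp;
};

/* An address is local only if it is configured and not a CARP BACKUP. */
static bool
addr_is_local(const struct local_addr *tbl, size_t n, uint32_t addr)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].addr == addr && tbl[i].carp != CARP_BACKUP)
			return (true);
	}
	return (false);
}
```

For source address validation this means a packet whose source is a BACKUP address is not rejected as "coming from ourselves", since another host legitimately owns that address at the moment.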
* pf: convert DIOCSETSTATUSIF to netlink | Kristof Provost | 2024-03-19 | 2 | -0/+43
  While here also add a basic test case for it.
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D44368
* pf: fix dummynet + route-to | Kristof Provost | 2024-03-19 | 1 | -5/+21
  Ensure that we pick the correct dummynet pipe (i.e. forward vs. reverse direction) when applying route-to.
  We mark the processing as outbound so that dummynet will re-inject in the correct phase of processing after it's done with the packet, but that will cause us to pick the wrong pipe number. Reverse them so that the incorrect decision ends up picking the correct pipe.
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D44366
* pf: avoid passing through dummynet multiple times | Kristof Provost | 2024-03-19 | 2 | -1/+5
  In some setups we end up with multiple states created for a single packet, which in turn can mean we run the packet through dummynet multiple times. That's not expected or intended.
  Mark each packet when it goes through dummynet, and do not pass packets through dummynet if they're marked as having already passed through.
  See also: https://redmine.pfsense.org/issues/14854
  Sponsored by: Rubicon Communications, LLC ("Netgate")
  Differential Revision: https://reviews.freebsd.org/D44365
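The mark-and-skip approach described above amounts to a one-shot flag on the packet. A minimal sketch, assuming a hypothetical per-packet flag word (`struct pkt`, `PKT_DUMMYNET_DONE`) rather than pf's real packet tags:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PKT_DUMMYNET_DONE 0x01u	/* hypothetical "already shaped" mark */

struct pkt {
	uint32_t flags;
};

/*
 * Return true exactly once per packet: the first caller marks the packet
 * and sends it through dummynet, later callers see the mark and skip it.
 */
static bool
dummynet_should_process(struct pkt *p)
{
	assert(p != NULL);
	if ((p->flags & PKT_DUMMYNET_DONE) != 0)
		return (false);
	p->flags |= PKT_DUMMYNET_DONE;
	return (true);
}
```

Because the mark travels with the packet, it does not matter how many states match it; only the first match triggers shaping.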
* kerneldump: add livedump_start_vnode(9) | Vijeyalakshumi Koteeswaran | 2024-03-18 | 3 | -10/+25
  livedump_start_vnode(9) is introduced such that the live minidump on the system could take a vnode. This interface could be used to extend support for the existing framework in downstream.
  Bump __FreeBSD_version for introducing livedump_start_vnode(9).
  Sponsored by: Juniper Networks, Inc.
  Reviewed by: khng
  Differential Revision: https://reviews.freebsd.org/D43471
* tcp: clear all TCP timers in tcp_timer_stop() when in callout | Gleb Smirnoff | 2024-03-18 | 2 | -4/+5
  When a TCP callout decides to disable itself, e.g. tcp_timer_2msl() calling tcp_close(), we must also clear all other possible timers. Otherwise, upon return, the callout would be scheduled again in tcp_timer_enter().
  Revert 57e27ff07aff, which was a temporary partial revert of otherwise correct 62d47d73b7eb, that exposed the problem being fixed now.
  Add an extra assertion in tcp_timer_enter() to check we aren't arming callout for a closed connection.
  Reviewed by: rscheff
* ktls: catch invalid parameters earlier | Richard Scheffenegger | 2024-03-18 | 1 | -28/+41
  Move safety checks forward from ktls_session_create() to ktls_copyin_tls_enable(). Prevents zero mallocs, and excessively large kernel mallocs.
  Reported-by: syzbot+72022fa9163fa958b66c@syzkaller.appspotmail.com
  Reported-by: syzbot+8992893e13058ce0670a@syzkaller.appspotmail.com
  Sponsored by: NetApp, Inc.
  X-NetApp-PR: #79
  Reviewed By: tuexen
  Differential Revision: https://reviews.freebsd.org/D44364
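The general pattern above, validating caller-supplied lengths before any allocation happens, might look like the following userspace sketch. The limit `MAX_KEY_LEN` and the function `key_copy_checked` are hypothetical; the real checks live in the kernel and cover more fields than a single key length.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_KEY_LEN 64u		/* hypothetical upper bound */

/*
 * Copy a caller-supplied key. The length is validated first, so a zero
 * or oversized request never reaches malloc().
 */
static int
key_copy_checked(const void *src, size_t len, void **out)
{
	assert(out != NULL);
	*out = NULL;
	if (src == NULL || len == 0 || len > MAX_KEY_LEN)
		return (EINVAL);
	*out = malloc(len);
	if (*out == NULL)
		return (ENOMEM);
	memcpy(*out, src, len);
	return (0);
}
```

Doing the check at copy-in time means later stages can assume sane sizes and no longer need their own defensive tests.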
* arm64: Return all registers to gdb when able | Andrew Turner | 2024-03-18 | 1 | -0/+4
  When the kdb thread is the current thread we read the registers from the trap frame. As this contains all general purpose registers we can use it to read these in the gdb stub. This allows us to include the non-callee saved registers, e.g. function arguments.
  Reviewed by: imp
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D44360
* uart: Add uart_cpu_acpi_setup to setup the uart | Andrew Turner | 2024-03-18 | 4 | -9/+15
  In preparation for adding debug port support add a generic function to setup the uart from ACPI tables.
  Reviewed by: imp
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D44358
* uart: Split out initialisation of the acpi devinfo | Andrew Turner | 2024-03-18 | 1 | -37/+49
  Split out the common parts of building the uart devinfo from ACPI tables from the SPCR parser. This will be used when we support the DBG2 table to find the debug uart to be used by the kernel gdb stub.
  Reviewed by: imp
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D44357
* arm64: Rename drop_to_el1 to enter_kernel_el | Andrew Turner | 2024-03-18 | 1 | -6/+6
  In the future we may not drop to EL1, e.g. when we support FEAT_VHE where the kernel runs in EL2.
  Reviewed by: emaste, imp
  Sponsored by: Arm Ltd
  Differential Revision: https://reviews.freebsd.org/D43976
* tcp: remove IS_FASTOPEN() macro | Gleb Smirnoff | 2024-03-18 | 8 | -62/+54
  The macro is more obfuscating than helping as it just checks a single flag of t_flags. All other t_flags bits are checked without a macro.
  A bigger problem was that declaration of the macro in tcp_var.h depended on a kernel option. It is a bad practice to create such definitions in installable headers.
  Reviewed by: rscheff, tuexen, kib
  Differential Revision: https://reviews.freebsd.org/D44362
* sockets: remove unused KPIs to manipulate sockets | Gleb Smirnoff | 2024-03-18 | 5 | -131/+3
  These KPIs were added in dd0e6c383a9f0 and in 15 years have had zero use. They are slightly reminiscent of what IfAPI does for struct ifnet. But IfAPI does that for the sake of a large collection of NIC drivers not being aware of struct ifnet. For the sockets it is unclear what could be a large collection of externally written kernel modules that need to use sockets extensively and not be aware of their internals at the same time. This isolation of a structure knowledge requires a lot of work, and just throwing in a few KPIs isn't helpful.
  Reviewed by: kib, olce, markj
  Differential Revision: https://reviews.freebsd.org/D44311
* inpcb: remove unused KPIs to manipulate inpcbs | Gleb Smirnoff | 2024-03-18 | 3 | -26/+1
  These KPIs were added in 9d29c635daa69 and in 15 years have had zero use. They are slightly reminiscent of what IfAPI does for struct ifnet. But IfAPI does that for the sake of a large collection of NIC drivers not being aware of struct ifnet. For the inpcb it is unclear what could be a large collection of externally written kernel modules that need to use inpcb extensively and not be aware of its internals at the same time. This isolation of a structure knowledge requires a lot of work, and just throwing in a few KPIs isn't helpful.
  Reviewed by: kib, bz, markj
  Differential Revision: https://reviews.freebsd.org/D44310
* Rename VM_LAST to more appropriate VM_GUEST_LAST | Mateusz Guzik | 2024-03-18 | 2 | -2/+2
  NFC
  Sponsored by: Rubicon Communications, LLC ("Netgate")
* nfsd: Add a sysctl to limit NFSv4.2 Copy RPC size | Rick Macklem | 2024-03-16 | 1 | -2/+14
  NFSv4.2 supports a Copy operation, which avoids file data being read to the client and then written back to the server, if both input and output files are on the same NFSv4.2 mount for copy_file_range(2).
  Unfortunately, this Copy operation can take a long time under certain circumstances. If this occurs concurrently with a RPC that requires an exclusive lock on the nfsd such as ExchangeID done for a new mount, the result can be an nfsd "stall" until the Copy completes.
  This patch adds a sysctl that can be set to limit the size of a Copy operation or, if set to 0, disable Copy operations.
  The use of this sysctl and other ways to avoid Copy operations taking too long will be documented in the nfsd.4 man page by a separate commit.
  MFC after: 2 weeks
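The sysctl semantics described above (a byte limit, with 0 meaning Copy is disabled) can be sketched as a small helper. The function name `copy_len_allowed` is hypothetical, and clamping an oversized request down to the limit is an assumption of this sketch; the commit message only says the size is limited, not how the server reports a shortened Copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Apply a size limit to a Copy request.
 * Returns false when Copy is disabled (limit == 0); otherwise stores the
 * number of bytes this request may copy, which is at most 'limit'.
 */
static bool
copy_len_allowed(uint64_t limit, uint64_t requested, uint64_t *allowed)
{
	assert(allowed != NULL);
	if (limit == 0)
		return (false);
	*allowed = requested < limit ? requested : limit;
	return (true);
}
```

Bounding each Copy keeps any single RPC short, so an operation that needs the exclusive nfsd lock waits for at most one bounded chunk rather than for an arbitrarily large copy.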
* vnet: remove unneeded backslash | Gleb Smirnoff | 2024-03-15 | 1 | -1/+1
  Fixes: 430e0e409ce94246bb252cbdddef866fc69dea95
* arm64: Use void pointers for arguments to arm64_get_writable_addr | John Baldwin | 2024-03-15 | 5 | -13/+14
  No functional change, but this reduces diffs with CheriBSD downstream.
  Reviewed by: andrew
  Sponsored by: University of Cambridge, Google, Inc.
  Differential Revision: https://reviews.freebsd.org/D44344
* arm busdma: Fix parameter types to exclusion_bounce_check | John Baldwin | 2024-03-15 | 1 | -1/+1
  These are bus addresses not CPU virtual addresses.
  Reviewed by: andrew
  Sponsored by: University of Cambridge, Google, Inc.
  Differential Revision: https://reviews.freebsd.org/D44343
* arm64: Switch the address argument to cpu_*cache* to a pointer | John Baldwin | 2024-03-15 | 12 | -50/+50
  No functional change, but this reduces diffs with CheriBSD downstream.
  Reviewed by: andrew
  Sponsored by: University of Cambridge, Google, Inc.
  Differential Revision: https://reviews.freebsd.org/D44342