| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
and use the current file offset instead.
Requested by: Vinícius dos Santos Oliveira <vini.ipsmaker@gmail.com>
Reviewed by: jhb
Discussed with: asomers
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43448
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce the allocuio() and freeuio() functions to allocate and
deallocate struct uio. This hides the actual allocator interface, so it
is easier to modify the sub-allocation layout of struct uio and the
corresponding iovec array.
Obtained from: CheriBSD
Reviewed by: kib, markj
MFC after: 2 weeks
Sponsored by: CHaOS, EPSRC grant EP/V000292/1
Differential Revision: https://reviews.freebsd.org/D43711
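The single-allocation layout that allocuio() and freeuio() are said to hide can be sketched in userspace C. This is only an illustration under stated assumptions: struct demo_uio, demo_allocuio(), demo_freeuio() and DEMO_MAXIOV are hypothetical names, the real struct uio has more fields, and the kernel functions use the kernel allocator rather than calloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/uio.h>	/* struct iovec */

#define DEMO_MAXIOV	1024u	/* hypothetical cap used only by this sketch */

/* Hypothetical stand-in for struct uio; only two fields are shown. */
struct demo_uio {
	struct iovec *uio_iov;	/* points at the trailing iovec array */
	int uio_iovcnt;		/* number of entries in that array */
};

/*
 * Allocate the header and its iovec array in one block. Callers never
 * learn how the two pieces are laid out, so the layout can change later
 * without touching them.
 */
static struct demo_uio *
demo_allocuio(unsigned int iovcnt)
{
	struct demo_uio *uio;

	assert(iovcnt <= DEMO_MAXIOV);
	uio = calloc(1, sizeof(*uio) + (size_t)iovcnt * sizeof(struct iovec));
	if (uio == NULL)
		return (NULL);
	uio->uio_iov = (struct iovec *)(uio + 1);
	uio->uio_iovcnt = (int)iovcnt;
	return (uio);
}

/* Release both the header and the array with a single call. */
static void
demo_freeuio(struct demo_uio *uio)
{
	free(uio);
}
```

A caller would pair demo_allocuio(n) with demo_freeuio() and fill uio_iov[0..n-1] in between.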
|
|
|
|
|
|
|
|
|
| |
Bump __FreeBSD_version for ZFS use.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43356
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.
Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
Sponsored by: Netflix
|
|
|
|
| |
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
|
|
|
|
|
|
|
|
| |
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Use atomic_store to set job->error. atomic_set does an or
operation, not assignment.
- Use refcount_* to manage job->nbio.
This ensures proper memory barriers are present so that the last bio
won't see a possibly stale value of job->error.
- Don't re-read job->error after reading it via atomic_load.
Reported by: markj (1)
Reviewed by: mjg, markj
Differential Revision: https://reviews.freebsd.org/D38611
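The difference between an OR-style "set" and a plain store, and the reference-counted hand-off to the last bio, can be sketched with C11 atomics. This is a userspace approximation, not the kernel code: struct demo_job and the demo_* functions are hypothetical, and C11 atomic_fetch_or()/atomic_store() merely stand in for the kernel's atomic(9) and refcount(9) primitives.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-job state named in the message. */
struct demo_job {
	atomic_int error;	/* error recorded by a failing bio */
	atomic_uint nbio;	/* bios still outstanding */
};

/* OR-ing merges bits with whatever was there; it is not an assignment. */
static int
demo_record_error_or(struct demo_job *job, int error)
{
	atomic_fetch_or(&job->error, error);
	return (atomic_load(&job->error));
}

/* A store replaces the old value, which is what an error field needs. */
static int
demo_record_error_store(struct demo_job *job, int error)
{
	atomic_store(&job->error, error);
	return (atomic_load(&job->error));
}

/*
 * Drop one bio reference. Only the caller that drops the last reference
 * gets true; the acquire/release ordering on the decrement makes error
 * values stored before earlier decrements visible to it. The error is
 * loaded once and handed back rather than re-read later.
 */
static bool
demo_bio_done(struct demo_job *job, int *errorp)
{
	unsigned int old;

	old = atomic_fetch_sub_explicit(&job->nbio, 1, memory_order_acq_rel);
	assert(old > 0);
	if (old != 1)
		return (false);
	*errorp = atomic_load(&job->error);
	return (true);
}
```

With an existing error of EIO, OR-ing in ENOMEM yields EIO | ENOMEM, which is neither of the two codes; the store variant yields ENOMEM.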
|
| |
|
|
|
|
|
|
|
|
| |
Use atomic_fetchadd in place of separate atomic_subtract / atomic_load.
Reviewed by: markj
Sponsored by: HPE TidalScale
Differential Revision: https://reviews.freebsd.org/D38559
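One plausible reading of why a single fetch-and-add is preferable to a subtract followed by a load is shown below with C11 atomics. The demo_* names are hypothetical and atomic_fetch_sub() stands in for the kernel's atomic_fetchadd with a negative delta; the commit itself does not spell out its motivation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Two-step pattern: the decrement and the read are separate operations.
 * With a count of 2 and two racing threads, both decrements can land
 * before either load, so both callers may observe 0.
 */
static bool
demo_release_two_step(atomic_uint *count)
{
	atomic_fetch_sub(count, 1);		/* result discarded */
	return (atomic_load(count) == 0);	/* separate, later read */
}

/*
 * Single-step pattern: the value returned by the read-modify-write is
 * the value this caller replaced, so exactly one caller sees 1.
 */
static bool
demo_release_fetchadd(atomic_uint *count)
{
	unsigned int old;

	old = atomic_fetch_sub(count, 1);
	assert(old > 0);
	return (old == 1);
}
```

In single-threaded use both functions agree; they differ only when decrements from several threads interleave.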
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Remove the AIO proc zone. This zone gets one allocation per AIO
daemon process, which isn't enough to warrant a dedicated zone. Plus,
unlike other AIO structures, aiops are small (32 bytes with LP64), so
UMA doesn't provide better space efficiency than malloc(9). Change
one of the malloc types in vfs_aio.c to make it more general.
- Don't set the NOFREE flag on the other AIO zones. This flag means
that memory allocated to the AIO subsystem is never freed back to the
VM, so it's always preferable to avoid using it when possible. NOFREE
was set without explanation when AIO was converted to use UMA 20 years
ago, but it does not appear to be required; all of the structures
allocated from UMA (per-process kaioinfo, kaiocb, and aioliojob) keep
track of references and get freed only when none exist. Plus, these
structures will contain dangling pointers after they're freed (e.g.,
the "cred", "fd_file" and "uiop" fields of struct kaiocb), so
use-after-frees are dangerous even when the structures themselves are
type-stable.
Reviewed by: asomers
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35493
|
|
|
|
|
|
| |
Where appropriate hide sysent.h under proper condition.
MFC after: 2 weeks
|
| |
|
|
|
|
|
|
| |
PR: 258698
Submitted by: sigsys@gmail.com
MFC after: 1 week
|
|
|
|
|
|
|
|
| |
Reported by: tmunro
Reviewed by: jhb, tmunro
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32023
|
|
|
|
|
|
|
| |
Reviewed by: jhb, tmunro
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32023
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With lio_listio(2), the opcode is specified by userspace rather than
being hard-coded by the system call (e.g., aio_readv() -> LIO_READV).
kern_lio_listio() calls aio_aqueue() with an opcode of LIO_NOP, which
gets fixed up when the aiocb is copied in.
When copying in a job request for vectored I/O, we need to dynamically
allocate a uio to wrap an iovec. So aiocb_copyin() needs to get the
opcode from the aiocb and then decide whether an allocation is required.
We failed to do this in the COMPAT_FREEBSD32 case. Fix it.
Reported by: syzbot+27eab6f2c2162f2885ee@syzkaller.appspotmail.com
Reviewed by: kib, asomers
Fixes: f30a1ae8d529 ("lio_listio(2): Allow LIO_READV and LIO_WRITEV.")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31914
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow multiple vector IOs to be started with one system call.
aio_readv() and aio_writev() already used these opcodes under the
covers. This commit makes them available to user space.
Being non-standard extensions, they're only visible if __BSD_VISIBLE is
defined, like the functions.
Reviewed by: asomers, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D31627
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One is allowed to use LIO_NOWAIT without specifying a sigevent. In this
case, lj->lioj_signal is left uninitialized, but several code paths
examine lioj_signal.sigev_notify to figure out which notification to
post. Unconditionally initialize that field to SIGEV_NONE.
Add a dumb test case which triggers the bug.
Reported by: KMSAN+syzkaller
Reviewed by: asomers
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31197
|
|
|
|
|
|
|
|
|
| |
Reviewed by: markj
Tested by: pho
Discussed with: walker.aj325_gmail.com, wulf
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29323
|
|
|
|
|
|
|
|
|
|
|
| |
This allows slightly more efficient opcode testing in-kernel. It is
transparent to userland, except to applications that sneakily submit
aio fsync or aio mlock operations via lio_listio, which has never been
documented, requires the use of deliberately undefined constants
(LIO_SYNC and LIO_MLOCK), and is arguably a bug.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D27942
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we would accept any kind of LIO_* opcode, including ones
that were intended for in-kernel use only like LIO_SYNC (which is not
defined in userland). The situation became more serious with
022ca2fc7fe08d51f33a1d23a9be49e6d132914e. After that revision, setting
aio_lio_opcode to LIO_WRITEV or LIO_READV would trigger an assertion.
Note that POSIX does not specify what should happen if aio_lio_opcode is
invalid.
MFC-with: 022ca2fc7fe08d51f33a1d23a9be49e6d132914e
Reviewed by: jhb, tmunro, 0mp
Differential Revision: https://reviews.freebsd.org/D28078
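The kind of allow-list check described above can be sketched as follows. The enum values and demo_check_user_opcode() are hypothetical and do not match the real LIO_* constants, and returning EINVAL is an assumption, since the message only notes that POSIX leaves the behaviour unspecified.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical opcode set; values do not match the real LIO_* constants. */
enum demo_lio_opcode {
	DEMO_LIO_NOP,
	DEMO_LIO_WRITE,
	DEMO_LIO_READ,
	DEMO_LIO_SYNC,		/* treated as in-kernel only in this sketch */
	DEMO_LIO_WRITEV,	/* not accepted from userspace at this point */
	DEMO_LIO_READV		/* not accepted from userspace at this point */
};

/*
 * Accept only the opcodes userspace is meant to submit; anything else,
 * including values outside the enum, is rejected up front instead of
 * reaching code that asserts on it.
 */
static int
demo_check_user_opcode(int opcode)
{
	switch (opcode) {
	case DEMO_LIO_NOP:
	case DEMO_LIO_WRITE:
	case DEMO_LIO_READ:
		return (0);
	default:
		return (EINVAL);	/* assumed error code */
	}
}
```

Listing the accepted opcodes rather than the rejected ones means a newly added internal opcode is refused by default.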
|
|
|
|
|
|
|
| |
aio_fsync(O_DSYNC, ...) is the asynchronous version of fdatasync(2).
Reviewed by: kib, asomers, jhb
Differential Revision: https://reviews.freebsd.org/D25071
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
POSIX AIO is great, but it lacks vectored I/O functions. This commit
fixes that shortcoming by adding aio_writev and aio_readv. They aren't
part of the standard, but they're an obvious extension. They work just
like their synchronous equivalents pwritev and preadv.
It isn't yet possible to use vectored aiocbs with lio_listio, but that
could be added in the future.
Reviewed by: jhb, kib, bcr
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D27743
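Since the message says the new calls behave like pwritev and preadv, the synchronous pair is a convenient way to see the semantics. The sketch below uses only those synchronous calls; demo_vectored_roundtrip() and the temporary file path are hypothetical, and the asynchronous interface itself is deliberately not shown because its details are not given here.

```c
#define _DEFAULT_SOURCE	/* exposes preadv()/pwritev() on glibc */
#include <sys/types.h>
#include <sys/uio.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Gather "hello" and "world" from two buffers into one write at offset 0,
 * then scatter the same ten bytes back into two other buffers.
 * Returns 0 on success and -1 on any failure.
 */
static int
demo_vectored_roundtrip(char out1[6], char out2[6])
{
	char path[] = "/tmp/demo_aiov.XXXXXX";
	char a[] = "hello", b[] = "world";
	struct iovec wr[2] = {
		{ .iov_base = a, .iov_len = 5 },
		{ .iov_base = b, .iov_len = 5 }
	};
	struct iovec rd[2] = {
		{ .iov_base = out1, .iov_len = 5 },
		{ .iov_base = out2, .iov_len = 5 }
	};
	int fd, error = -1;

	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	/* Both calls take an explicit offset and leave the file offset alone. */
	if (pwritev(fd, wr, 2, 0) == 10 && preadv(fd, rd, 2, 0) == 10) {
		out1[5] = out2[5] = '\0';
		error = 0;
	}
	close(fd);
	unlink(path);
	return (error);
}
```

An asynchronous variant would submit the same iovec array and offset and collect the byte count on completion instead of from the return value.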
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vectored aio will require each aiocb to be associated with multiple
bios, so we can't store a link to the latter from the former. But we
don't really need to. aio_biowakeup already knows the bio it's using,
and the other fields can be stored within the bio and/or buf itself.
Also, remove the unused kaiocb.backend2 field.
Reviewed By: kib
Differential Revision: https://reviews.freebsd.org/D27682
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now, if lio registered zero jobs, the syscall frees the lio job
structure, cleaning up the queued ksi. As a result, the realtime signal is
dequeued and never delivered.
Fix it by allowing sendsig() to copy the ksi when the job count is zero.
PR: 220398
Reported and reviewed by: asomers
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27421
Notes:
svn path=/head/; revision=368265
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly re-wrap conditions to split after binary ops.
Reviewed by: asomers
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27421
Notes:
svn path=/head/; revision=368264
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: asomers
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27421
Notes:
svn path=/head/; revision=368262
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently some architectures, like ppc in its hashed page table
variants, account for mappings made by pmap_qenter() in the response from
pmap_is_page_mapped().
While there, eliminate the useless userp variable.
Noted and reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27409
Notes:
svn path=/head/; revision=368142
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace MAXPHYS with the runtime variable maxphys. It is initialized from
MAXPHYS by default, but can also be adjusted with the tunable kern.maxphys.
Make the b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allows several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save a significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to the desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except the place which initializes maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embedded structures,
get a private MAXPHYS-like constant; their conversion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, were either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
Notes:
svn path=/head/; revision=368124
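The reason pbufs get atop(maxphys) + 1 page slots can be checked with a little arithmetic: a buffer of maxphys bytes that does not start on a page boundary touches one more page than an aligned one. The sketch below assumes a 4096-byte page and uses hypothetical demo_* helpers; the 1 MiB figure in the usage note is an example value, not the kernel default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE	4096u	/* assumed page size for this sketch */

/* Whole pages contained in len bytes, in the spirit of atop(). */
static size_t
demo_atop(size_t len)
{
	return (len / DEMO_PAGE_SIZE);
}

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t
demo_pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	assert(len > 0);
	first = addr / DEMO_PAGE_SIZE;
	last = (addr + len - 1) / DEMO_PAGE_SIZE;
	return ((size_t)(last - first + 1));
}
```

For an example size of 1 MiB, demo_atop() gives 256; a buffer at the aligned address 0x10000 spans 256 pages, while the same size at 0x10200 spans 257, which is the extra slot the +1 provides.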
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is mostly mechanical except for vmspace_exit(). There, use the new
refcount_release_if_last() to avoid switching to vmspace0 unless other
processes are sharing the vmspace. In that case, upon switching to
vmspace0 we can unconditionally release the reference.
Remove the volatile qualifier from vm_refcnt now that accesses are
protected using refcount(9) KPIs.
Reviewed by: alc, kib, mmel
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27057
Notes:
svn path=/head/; revision=367334
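The release-if-last idea can be approximated in userspace with a compare-and-swap from 1 to 0. This is a sketch under assumptions: demo_release_if_last() and demo_exit() are hypothetical, the switched flag merely stands in for switching to vmspace0, and the real refcount(9) routines may differ in detail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Release the reference only if it is the last one. On success the count
 * becomes 0 and true is returned; otherwise the count is left untouched.
 */
static bool
demo_release_if_last(atomic_uint *count)
{
	unsigned int expected = 1;

	return (atomic_compare_exchange_strong(count, &expected, 0));
}

/*
 * Exit path modelled on the description above: a sole owner releases in
 * place; only when the object is shared do we take the more expensive
 * step (recorded in *switched) and then drop our reference
 * unconditionally. Returns true if the caller dropped the last reference.
 */
static bool
demo_exit(atomic_uint *refcnt, bool *switched)
{
	*switched = false;
	if (demo_release_if_last(refcnt))
		return (true);
	*switched = true;	/* stand-in for switching to vmspace0 */
	return (atomic_fetch_sub(refcnt, 1) == 1);
}
```

With a count of 1 the expensive step is skipped entirely; with a count of 2 it is taken and the count drops to 1.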
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A user pointer is not a suitable value for bio_data and the next block
of code always overwrites bio_data anyway. Just use cb->aio_buf
directly in the call to vm_fault_quick_hold_pages().
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 month
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26595
Notes:
svn path=/head/; revision=366296
|
|
|
|
|
|
|
| |
Most consumers pass NULL.
Notes:
svn path=/head/; revision=364372
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
Notes:
svn path=/head/; revision=358333
|
|
|
|
|
|
|
|
|
|
|
| |
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427
Notes:
svn path=/head/; revision=356337
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime to work in
the FreeBSD kernel. It is a useful tool for finding data races between
threads executing on different CPUs.
This can be enabled by enabling KCSAN in the kernel config, or by using the
GENERIC-KCSAN amd64 kernel. It works on amd64 and arm64; however, the latter
needs a compiler change to allow -fsanitize=thread that KCSAN uses.
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22315
Notes:
svn path=/head/; revision=354942
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
o In vm_pager_bufferinit() create pbuf_zone and start accounting for how many
pbufs we are going to have.
In various subsystems that are going to utilize pbufs create private zones
via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(),
and sets a limit on created zone. After startup preallocate pbufs according
to requirements of all pbuf zones.
Subsystems that used to have a private limit with old allocator now have
private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS,
swap, vnode pager.
The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9),
aio(4). They should have their private limits, but changing that is out of
scope of this commit.
o Fetch tunable value of kern.nswbuf from init_param2() and while here move
NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only
this option.
Default values aren't touched by this commit, but they probably should be
reviewed with respect to modern hardware.
This change removes a tight bottleneck from sendfile(2) operation, that
uses pbufs in vnode pager. Other pagers also would benefit from faster
allocation.
Together with: gallatin
Tested by: pho
Notes:
svn path=/head/; revision=343030
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
aio has two paths: an asynchronous "physio" path and a synchronous path.
Confusingly, physio(9) isn't actually used by the "physio" path, and never
has been. In fact, it may even be called by the synchronous path! Rename
the "physio" path to the "bio" path to reflect what it actually does:
directly compose BIOs and send them to character devices.
MFC after: 2 weeks
Notes:
svn path=/head/; revision=340988
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some kevent functions have a boolean "waitok" parameter for use when
calling malloc(9). Replace them with the corresponding malloc() flags:
the desired behaviour is known at compile-time, so this eliminates a
couple of conditional branches, and makes the code easier to read.
No functional change intended.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18318
Notes:
svn path=/head/; revision=340900
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel may register for events on behalf of a userspace process,
in which case it must be careful to zero the kevent struct that will be
copied out to userspace.
Reviewed by: kib
MFC after: 3 days
Security: kernel stack memory disclosure
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18317
Notes:
svn path=/head/; revision=340899
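The zero-before-fill pattern the message calls for can be sketched as below. struct demo_event and demo_fill_event() are hypothetical and much smaller than the real kevent structure; the point is only that clearing the whole object first keeps padding and unset fields from carrying old stack contents when the record is later copied out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical event record. On common LP64 ABIs the compiler inserts
 * padding between 'filter' and 'udata'.
 */
struct demo_event {
	uintptr_t ident;
	short filter;
	void *udata;
};

/*
 * Clear the whole structure before setting individual fields, so bytes
 * that no assignment touches do not retain whatever was on the stack.
 */
static void
demo_fill_event(struct demo_event *ev, uintptr_t ident, short filter,
    void *udata)
{
	memset(ev, 0, sizeof(*ev));
	ev->ident = ident;
	ev->filter = filter;
	ev->udata = udata;
}
```

Assigning the fields one by one without the memset() would produce the same visible field values, which is why this class of leak is easy to miss.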
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add macros to allow preinitialization of cap_rights_t.
- Convert most commonly used code paths to use preinitialized cap_rights_t.
A 3.6% speedup in fstat was measured with this change.
Reported by: mjg
Reviewed by: oshogbo
Approved by: sbruno
MFC after: 1 month
Notes:
svn path=/head/; revision=333425
|
|
|
|
|
|
|
|
|
|
|
| |
This behavior is already documented by the man page, and suggested by POSIX.
Reviewed by: jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D15099
Notes:
svn path=/head/; revision=332631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compatibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.
Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h
is created on all architectures.
Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.
Reviewed by: kib, cem, jhb, jtl
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14941
Notes:
svn path=/head/; revision=332122
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the AIO subsystem would save a snapshot of the currently
configured per-process limits the first time a process used AIO. The
process would continue to use the snapshotted limits ignoring any
changes to the global limits during the rest of its lifetime. This
change removes the snapshotted values and changes the AIO code to
always check the global values which can be toggled at runtime.
This means an administrator can now change the effective limits of
existing processes. This is more consistent with how other limits
configured via sysctl work in FreeBSD.
Reviewed by: asomers, kib
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D13819
Notes:
svn path=/head/; revision=327792
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- If aio_qphysio() returns a non-zero error code, fail the request rather
than queueing it to the AIO kproc pool to be retried via the slow path.
Currently this means that if vm_fault_quick_hold_pages() reports an
error, EFAULT is returned from the fast-path rather than retrying the
request in the slow path where it will still fail with EFAULT.
- If aio_qphysio() wishes to use the fast path for a device that doesn't
support unmapped I/O but there are already the maximum number of
such requests in flight, fail with EAGAIN as we do for other AIO
resource limits rather than queueing the request to the AIO kproc pool.
- Move the opcode check for aio_qphysio() out of the caller and into
aio_qphysio() to simplify some logic and remove two goto's while here.
It also uses a whitelist (only supported for LIO_READ / LIO_WRITE)
rather than a blacklist (skipped for LIO_SYNC).
PR: 217261
Submitted by: jkim (an earlier version)
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Notes:
svn path=/head/; revision=327755
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specifically, in aio_queue_file() the code was doing this:
	if (opcode == LIO_SYNC) {
		...
	}
	switch (opcode) {
	...
	case LIO_SYNC:
		...
	}
This moves the body of the if statement into the LIO_SYNC case of the
switch statement.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Notes:
svn path=/head/; revision=327753
|
|
|
|
|
|
|
|
| |
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Notes:
svn path=/head/; revision=327752
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.
Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
Notes:
svn path=/head/; revision=327173
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mainly focus on files that use the BSD 2-Clause license; however, the tool I
was using misidentified many licenses, so this was mostly a manual, error-prone
task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well-known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
Notes:
svn path=/head/; revision=326271
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An off-by-one error has been present since the system call was first introduced
in 185878. It additionally became a memory corruption bug after change
324941. The failure is actually revealed by our existing AIO tests.
However, apparently nobody's been running those in 32-bit emulation mode.
Reported by: Coverity, cem
CID: 1382114
MFC after: 18 days
X-MFC-With: 324941
Sponsored by: Spectra Logic Corp
Notes:
svn path=/head/; revision=325018
|