path: root/lib/libc/sys
Commit message | Author | Age | Files | Lines
...
* kqueue(2): de-vandalize the random sentence in the middle | Kyle Evans | 2020-04-22 | 1 | -1/+2
  A last minute change appears to have inadvertently vandalized unrelated parts of the manpage with the date. =-(
  Reported by: rpokala
  Notes: svn path=/head/; revision=360183
* kqueue(2): add a note about EV_RECEIPT | Kyle Evans | 2020-04-22 | 1 | -3/+7
  In the below-referenced PR, a case is attached of a simple reproducer that exhibits suboptimal behavior: EVFILT_READ and EVFILT_WRITE being set in the same kevent(2) call will only honor the first one. This is, in fact, how it's supposed to work.
  A read of the manpage leads me to believe we could be more clear about this; right now there's a logical leap to make in the relevant statement: "When passed as input, it forces EV_ERROR to always be returned." -- the logical leap being that this indicates the caller should have allocated space for the change to be returned with EV_ERROR indicated in the events, or subsequent filters will get dropped on the floor.
  Another possible workaround that accomplishes a similar effect without needing space for all events is just setting EV_RECEIPT on the final change being passed in; if any errored before it, the kqueue would not be drained. If we made it to the final change with EV_RECEIPT set, then we would return that one with EV_ERROR and still not drain the kqueue. This would seem to not be all that advisable.
  PR: 229741
  MFC after: 1 week
  Notes: svn path=/head/; revision=360182
* closefrom: clamp lowfd to >= 0; close_range's parameters are unsigned. | Kyle Evans | 2020-04-14 | 1 | -1/+2
  Pointy hat: kevans
  Reported by: CI (lwhsu)
  Notes: svn path=/head/; revision=359943
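The reason for the clamp can be sketched in a few lines of C. This is an illustration only, not the libc source: both function names are hypothetical, and it assumes a negative lowfd is meant to behave like 0. Converting a negative int to unsigned yields a very large value, so an unclamped start would sit near the top of the descriptor range instead of at 0.

```c
#include <limits.h>

/*
 * Hypothetical helper (not the libc implementation): choose the first
 * descriptor to hand to an interface that takes unsigned bounds.
 */
unsigned int
closefrom_first_fd(int lowfd)
{
    /* Assumption: a negative lowfd means "start from descriptor 0". */
    if (lowfd < 0)
        return (0U);
    return ((unsigned int)lowfd);
}

/* What a plain cast without the clamp would produce, for comparison. */
unsigned int
unclamped_first_fd(int lowfd)
{
    return ((unsigned int)lowfd);
}
```

With lowfd set to -1, the clamped helper returns 0 while the plain cast returns UINT_MAX.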
* Mark closefrom(2) COMPAT12, reimplement in libc to wrap close_range | Kyle Evans | 2020-04-14 | 3 | -2/+48
  Include a temporary compatibility shim as well for kernels predating close_range, since closefrom is used in some critical areas.
  Reviewed by: markj (previous version), kib
  Differential Revision: https://reviews.freebsd.org/D24399
  Notes: svn path=/head/; revision=359930
* Make sonewconn() overflow messages have per-socket rate-limits and values. | Jonathan T. Looney | 2020-04-14 | 1 | -1/+8
  sonewconn() emits debug-level messages when a listen socket's queue overflows. Currently, sonewconn() tracks overflows on a global basis. It will only log one message every 60 seconds, regardless of how many sockets experience overflows. And, when it next logs at the end of the 60 seconds, it records a single message referencing a single PCB with the total number of overflows across all sockets.
  This commit changes to per-socket overflow tracking. The code will now log one message every 60 seconds per socket. And, the code will provide per-socket queue length and overflow counts. It also provides a way to change the period between log messages using a sysctl.
  Reviewed by: jhb (previous version), bcr (manpages)
  MFC after: 2 weeks
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D24316
  Notes: svn path=/head/; revision=359923
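The per-socket scheme described above can be pictured with a small userspace sketch. It is not the kernel code: the struct, its fields, and the function name are hypothetical, and the caller supplies the period instead of reading a sysctl. Each socket carries its own counter and last-log time, so one noisy socket no longer suppresses or absorbs messages for another.

```c
#include <time.h>

/* Hypothetical per-socket state; the kernel's real fields may differ. */
struct overflow_state {
    time_t last_log;        /* when this socket last logged */
    unsigned int overflows; /* overflows since that log message */
    int logged_once;        /* nonzero after the first message */
};

/*
 * Record one queue overflow at time now. Returns the number of
 * overflows to report if a message should be logged, or 0 if the
 * message is suppressed because the period has not elapsed.
 */
unsigned int
note_overflow(struct overflow_state *s, time_t now, time_t period)
{
    unsigned int count;

    s->overflows++;
    if (s->logged_once && now - s->last_log < period)
        return (0);
    count = s->overflows;
    s->overflows = 0;
    s->last_log = now;
    s->logged_once = 1;
    return (count);
}
```

A zero-initialized struct overflow_state per socket is enough to start; the first overflow always logs, and later ones accumulate until the period passes.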
* libc: remove shm_open(2)'s compat fallback | Kyle Evans | 2020-04-13 | 1 | -15/+1
  This had been introduced to ease any pain for using slightly older kernels with a newer libc, e.g., for bisecting a kernel across the introduction of shm_open2(2).
  6 months have passed; retire the fallback and let shm_open() unconditionally call shm_open2().
  Stale includes are removed as well.
  Notes: svn path=/head/; revision=359865
* Implement a close_range(2) syscall | Kyle Evans | 2020-04-12 | 3 | -2/+41
  close_range(min, max, flags) allows for a range of descriptors to be closed. The Python folk have indicated that they would much prefer this interface to closefrom(2), as the case may be that they/someone have special fds dup'd to higher in the range and they can't necessarily closefrom(min) because they don't want to hit the upper range, but relocating them to lower isn't necessarily feasible.
  sys_closefrom has been rewritten to use kern_close_range() using ~0U to indicate closing to the end of the range. This was chosen rather than requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the call to kern_close_range for simplicity.
  The flags argument of close_range(2) is currently unused, so setting any flag currently returns EINVAL. It was added to the interface in Linux so that future flags could be added for, e.g., "halt on first error" and things of this nature.
  This patch is based on a syscall of the same design that is expected to be merged into Linux.
  Reviewed by: kib, markj, vangyzen (all slightly earlier revisions)
  Differential Revision: https://reviews.freebsd.org/D21627
  Notes: svn path=/head/; revision=359836
* libc: Fix possible overflow in binuptime(). | Konstantin Belousov | 2020-04-09 | 1 | -2/+16
  This is an application of the kernel overflow fix from r357948 to userspace, based on the algorithm developed by Bruce Evans. To keep the ABI of the vds_timekeep stable, instead of adding the large_delta member, MSB of both multipliers are added to quickly estimate the overflow.
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=359758
* Trim some duplicate EIO descriptions. | John Baldwin | 2020-03-30 | 2 | -7/+1
  While here, drop an extra conjunction from the list of error conditions for the remaining EIO description in symlink(2).
  Discussed with: mckusick (trimming duplicates)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=359467
* Document EINTEGRITY errors for many system calls. | John Baldwin | 2020-03-30 | 42 | -42/+151
  EINTEGRITY was previously documented as a UFS-specific error for mount(2). This documents EINTEGRITY as a filesystem-independent error that may be reported by the backing store of a filesystem.
  While here, document EIO as a filesystem-independent error for both mount(2) and posix_fadvise(2). EIO was previously only documented for UFS for mount(2).
  Reviewed by: mckusick
  Suggested by: mckusick
  MFC after: 2 weeks
  Sponsored by: Chelsio Communications
  Differential Revision: https://reviews.freebsd.org/D24168
  Notes: svn path=/head/; revision=359465
* exec{l,v}{e,p} arrived in 7th Edition research Unix to support the Bourne Shell | Warner Losh | 2020-03-24 | 1 | -1/+1
  which introduced environment variables. Document that here. Verified by consulting the TUHS archive.
  Notes: svn path=/head/; revision=359284
* sendfile() does not currently support SCTP sockets. | Michael Tuexen | 2020-03-13 | 1 | -0/+9
  Therefore, fail the call.
  Reviewed by: markj@
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D24059
  Notes: svn path=/head/; revision=358965
* When mounting a UFS filesystem, return EINTEGRITY rather than EIO | Kirk McKusick | 2020-03-11 | 1 | -0/+4
  when a superblock check-hash error is detected. This change distinguishes a mount that failed due to media hardware failures (EIO) from a mount that failed due to media errors (EINTEGRITY) that can be corrected by running fsck(8).
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=358899
* umtx_op.2: correct typo | Ed Maste | 2020-03-05 | 1 | -1/+1
  PR: 244611
  Submitted by: John F. Carr <jfc@mit.edu>
  MFC after: 3 days
  Notes: svn path=/head/; revision=358675
* thr_self.2: Fix some typos in the thread identifier range | Mateusz Piotrowski | 2020-03-03 | 1 | -2/+2
  Reported by: kaktus
  Approved by: bcr (mentor)
  Differential Revision: https://reviews.freebsd.org/D23936
  Notes: svn path=/head/; revision=358570
* Return ENOTSUP for mmap/mprotect if prot not subset of prot_max | Ed Maste | 2020-02-26 | 2 | -8/+8
  From POSIX:
    [ENOTSUP] The implementation does not support the combination of accesses requested in the prot argument.
  This fits the case that prot contains permissions which are not a subset of prot_max.
  Reviewed by: brooks, cem
  Relnotes: Yes
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D23843
  Notes: svn path=/head/; revision=358355
* Remove sparc64 specific parts of libc. | Warner Losh | 2020-02-26 | 1 | -7/+1
  Also update comments for which architectures use 128 bit long doubles, as appropriate.
  The softfloat specialization routines weren't updated since they appear to be from an upstream source which we may want to update in the future to get a more favorable license.
  Reviewed by: emaste@
  Differential Revision: https://reviews.freebsd.org/D23658
  Notes: svn path=/head/; revision=358348
* mprotect.2: sort errors alphabetically | Ed Maste | 2020-02-26 | 1 | -6/+6
  Reported by: brooks
  MFC after: 3 days
  Notes: svn path=/head/; revision=358344
* truncate(2): extending the file is required by POSIX 2008 | Eric van Gyzen | 2020-02-20 | 1 | -3/+6
  Update the man page to mention that extending a file with truncate(2) is required by POSIX as of 2008.
  Reviewed by: bcr
  MFC after: 2 weeks
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D23354
  Notes: svn path=/head/; revision=358186
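A short C sketch of the behavior the man page now documents: calling truncate(2) with a length larger than the current size extends the file. The helper name is hypothetical; only standard POSIX calls are used, and the extended area is expected to read back as zero bytes.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Set the file at path to len bytes with truncate(2) and return the
 * size that stat(2) reports afterwards, or -1 on error.
 */
off_t
truncate_and_size(const char *path, off_t len)
{
    struct stat sb;

    if (truncate(path, len) == -1)
        return (-1);
    if (stat(path, &sb) == -1)
        return (-1);
    return (sb.st_size);
}
```

Calling truncate_and_size() on an empty file with a length of 4096 should report 4096 on a filesystem that supports extension.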
* Add a way to manage thread signal mask using shared word, instead of syscall. | Konstantin Belousov | 2020-02-09 | 3 | -0/+168
  A new syscall sigfastblock(2) is added which registers a uint32_t variable as containing the count of blocks for signal delivery. Its content is read by the kernel on each syscall entry and on AST processing; a non-zero count of blocks is interpreted the same as the signal mask blocking all signals.
  The biggest downside of the feature that I see is that memory corruption that affects the registered fast sigblock location would cause quite strange application misbehavior. For instance, the process would be immune to ^C (but killable by SIGKILL).
  With consumers (rtld and libthr) added, benchmarks do not show a slow-down of the syscalls in micro-measurements, and macro benchmarks like buildworld do not demonstrate a difference. Part of the reason is that buildworld time is dominated by compiler, and clang already links to libthr. On the other hand, small utilities typically used by shell scripts have the total number of syscalls cut by half.
  The syscall is not exported from the stable libc version namespace on purpose. It is intended to be used only by our C runtime implementation internals.
  Tested by: pho
  Discussed with: cem, emaste, jilles
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D12773
  Notes: svn path=/head/; revision=357693
* Provide O_SEARCH | Kyle Evans | 2020-02-02 | 1 | -3/+20
  O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping permissions checks on the directory itself after the initial open(). This is close to the semantics we've historically applied for O_EXEC on a directory, which is UB according to POSIX. Conveniently, O_SEARCH on a file is also explicitly undefined behavior according to POSIX, so O_EXEC would be a fine choice. The spec goes on to state that O_SEARCH and O_EXEC need not be distinct values, but they're not defined to be the same value.
  This was pointed out as an incompatibility with other systems that had made its way into libarchive, which had assumed that O_EXEC was an alias for O_SEARCH.
  This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a directory is checked in vn_open_vnode already, so for completeness we add a NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not re-check that when descending in namei.
  [0] https://pubs.opengroup.org/onlinepubs/9699919799/
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23247
  Notes: svn path=/head/; revision=357412
* vfs: provide F_ISUNIONSTACK as a kludge for libc | Mateusz Guzik | 2020-01-17 | 1 | -1/+6
  Prior to the introduction of this op, libc's readdir would call fstatfs(2), in effect unnecessarily copying kilobytes of data just to check the fs name and a mount flag.
  Reviewed by: kib (previous version)
  Differential Revision: https://reviews.freebsd.org/D23162
  Notes: svn path=/head/; revision=356830
* getrandom(2): Add Linux GRND_INSECURE API flag | Conrad Meyer | 2020-01-12 | 1 | -5/+17
  Treat it as a synonym for GRND_NONBLOCK. The reasoning is this:
  We have two choices for handling Linux's GRND_INSECURE API flag.
  1. We could ignore it completely (like GRND_RANDOM). However, this might produce the surprising result of GRND_INSECURE requests blocking, when the Linux API does not block.
  2. Alternatively, we could treat GRND_INSECURE requests as requests for GRND_NONBLOCK. Here, the surprising result for Linux programs is that invocations with unseeded random(4) will produce EAGAIN, rather than garbage.
  Honoring the flag in the way Linux does seems fraught. If we actually use the output of a random(4) implementation prior to seeding, we leak some entropy (in an information theory and also practical sense) from what will be the initial seed to attackers (or allow attackers to arbitrarily DoS initial seeding, if we don't leak). This seems unacceptable -- it defeats the purpose of blocking on initial seeding.
  Secondary to that concern, before seeding we may have arbitrarily little entropy collected; producing output from zero or a handful of entropy bits does not seem particularly useful to userspace.
  If userspace can accept garbage, insecure, non-random bytes, they can create their own insecure garbage with srandom(time(NULL)) or similar. Any program which would be satisfied with a 3-bit key CTR stream has no need for CSPRNG bytes. So asking the kernel to produce such an output from the secure getrandom(2) API seems inane.
  For now, we've elected to emulate GRND_INSECURE as an alternative spelling of GRND_NONBLOCK (2). Consider this API not-quite stable for now. We guarantee it will never block. But we will attempt to monitor actual port uptake of this bizarre API and may revise our plans for the unseeded behavior (prior to stable/13 branching).
  Approved by: csprng(markm), manpages(bcr)
  See also: https://lwn.net/ml/linux-kernel/cover.1577088521.git.luto@kernel.org/
  See also: https://lwn.net/ml/linux-kernel/20200107204400.GH3619@mit.edu/
  Differential Revision: https://reviews.freebsd.org/D23130
  Notes: svn path=/head/; revision=356667
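For a caller, the practical consequence of option 2 is that a non-blocking request either succeeds or fails with EAGAIN; it never returns unseeded output. A minimal sketch, assuming a system that provides getrandom(2) in <sys/random.h>; the helper name is hypothetical, and it passes GRND_NONBLOCK, which this commit says GRND_INSECURE is treated as on FreeBSD.

```c
#include <sys/types.h>
#include <sys/random.h>
#include <errno.h>

/*
 * Fill buf with len random bytes without ever blocking.
 * Returns 0 on success, or -1 with errno set; errno is EAGAIN when
 * the random device has not been seeded yet.
 */
int
fill_random_nonblocking(void *buf, size_t len)
{
    unsigned char *p = buf;
    ssize_t n;

    while (len > 0) {
        n = getrandom(p, len, GRND_NONBLOCK);
        if (n == -1) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            return (-1);
        }
        p += n;
        len -= (size_t)n;
    }
    return (0);
}
```

A caller that sees EAGAIN can decide whether to retry later or fall back to a blocking call; the sketch leaves that policy to the caller.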
* posix_fallocate: push vnop implementation into the fileop layer | Kyle Evans | 2020-01-08 | 1 | -2/+3
  This opens the door for other descriptor types to implement posix_fallocate(2) as needed.
  Reviewed by: kib, bcr (manpages)
  Differential Revision: https://reviews.freebsd.org/D23042
  Notes: svn path=/head/; revision=356510
* Only return EPERM from kill(-pid) when no process was signalled. | Konstantin Belousov | 2019-12-07 | 1 | -5/+4
  As mandated by POSIX. Also clarify the kill(2) manpage.
  While there, restructure the code in killpg1() to use a helper which keeps the overall state of the process list iteration in the killpg1_ctx structure, later used to infer the error returned.
  Reported by: amdmi3
  Reviewed by: jilles
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Differential revision: https://reviews.freebsd.org/D22621
  Notes: svn path=/head/; revision=355500
* clock_gettime(2): add a HISTORY section | Alan Somers | 2019-12-07 | 1 | -1/+9
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=355489
* lio_listio(2): add a HISTORY section | Alan Somers | 2019-12-07 | 1 | -1/+6
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=355488
* Regularize my copyright notice | Warner Losh | 2019-12-04 | 9 | -11/+2
  o Remove All Rights Reserved from my notices
  o imp@FreeBSD.org everywhere
  o regularize punctuation, eliminate date ranges
  o Make sure that it's clear that I don't claim All Rights reserved by listing All Rights Reserved on same line as other copyright holders (but not me). Other such holders are also listed last where it's clear.
  Notes: svn path=/head/; revision=355394
* Fix typos in the cpuset_{get,set}domain() man page. | Mark Johnston | 2019-11-22 | 1 | -3/+5
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=355000
* Update the copy_file_range man page to reflect the semantic change | Rick Macklem | 2019-11-10 | 1 | -1/+14
  done by r354574.
  This is a content change.
  Notes: svn path=/head/; revision=354575
* Update the copy_file_range.2 man page to reflect the semantic change | Rick Macklem | 2019-11-08 | 1 | -8/+9
  implemented by r354564.
  This is a content change.
  Notes: svn path=/head/; revision=354565
* memfd_create(3): Don't actually force hugetlb size with MFD_HUGETLB | Kyle Evans | 2019-09-29 | 1 | -3/+0
  The size flags are only required to select a size on systems that support multiple sizes. MFD_HUGETLB by itself is valid.
  Notes: svn path=/head/; revision=352870
* Revert the mode_t -> int changes and add a warning in the BUGS section instead. | Warner Losh | 2019-09-28 | 2 | -4/+16
  While FreeBSD's implementation of these expects an int inside of libc, that's an implementation detail that we can hide from the user as it's the natural promotion of the current mode_t type and before it is used in the kernel, it's converted back to the narrower type that's the current definition of mode_t. As such, documenting int is at best confusing and at worst misleading. Instead add a note that these args are variadic and as such calling conventions may differ from non-variadic arguments.
  Notes: svn path=/head/; revision=352846
* Document variadic args as int, since you can't have short variadic args (they are | Warner Losh | 2019-09-27 | 2 | -2/+2
  promoted to ints).
  - `mode_t` is `uint16_t` (`sys/sys/_types.h`)
  - `openat` takes variadic args
  - variadic args cannot be 16-bit, and indeed the code uses int
  - the manpage currently kinda implies the argument is 16-bit by saying `mode_t`
  Prompted by Rust things: https://github.com/tailhook/openat/issues/21
  Submitted by: Greg V at unrelenting
  Differential Revision: https://reviews.freebsd.org/D21816
  Notes: svn path=/head/; revision=352795
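The promotion rule behind this commit can be shown with a small variadic function. This is an illustrative sketch with a hypothetical function name, not the libc implementation of open or openat: an argument passed through "..." undergoes the default argument promotions, so a 16-bit mode_t arrives as an int and has to be fetched with va_arg(ap, int) before being narrowed again.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <stdarg.h>

/*
 * Hypothetical open()-like function: the optional mode travels
 * through "..." and is only read when O_CREAT is set.
 */
mode_t
mode_from_varargs(int flags, ...)
{
    va_list ap;
    mode_t mode = 0;

    if (flags & O_CREAT) {
        va_start(ap, flags);
        /* Fetch as int (the promoted type), then narrow to mode_t. */
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    return (mode);
}
```

Fetching with va_arg(ap, mode_t) would be incorrect wherever mode_t is narrower than int, which per the commit message is the case on FreeBSD.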
* Further normalize copyright notices | Kyle Evans | 2019-09-26 | 1 | -1/+0
  - s/C/c/ where I've been inconsistent about it
  - +SPDX tags
  - Remove "All rights reserved" where possible
  Requested by: rgrimes (all rights reserved)
  Notes: svn path=/head/; revision=352757
* Correct mistake in MLINKS introduced in r352747 | David Bright | 2019-09-26 | 1 | -1/+1
  Messed up a merge conflict resolution and didn't catch that before commit.
  Sponsored by: Dell EMC Isilon
  Notes: svn path=/head/; revision=352756
* Add an shm_rename syscall | David Bright | 2019-09-26 | 3 | -12/+84
  Add an atomic shm rename operation, similar in spirit to a file rename. Atomically unlink an shm from a source path and link it to a destination path. If an existing shm is linked at the destination path, unlink it as part of the same atomic operation. The caller needs the same permissions as shm_unlink to the shm being renamed, and the same permissions for the shm at the destination which is being unlinked, if it exists. If those fail, EACCES is returned, as with the other shm_* syscalls.
  truss support is included; audit support will come later.
  This commit includes only the implementation; the sysent-generated bits will come in a follow-on commit.
  Submitted by: Matthew Bryan <matthew.bryan@isilon.com>
  Reviewed by: jilles (earlier revision)
  Reviewed by: brueffer (manpages, earlier revision)
  Relnotes: yes
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D21423
  Notes: svn path=/head/; revision=352747
* Add SPDX tags to recently added files | Kyle Evans | 2019-09-25 | 1 | -0/+2
  Reported by: Pawel Biernacki
  Notes: svn path=/head/; revision=352727
* posix_spawn(3): handle potential signal issues with vfork | Kyle Evans | 2019-09-25 | 1 | -0/+3
  Described in [1], signal handlers running in a vfork child have opportunities to corrupt the parent's state. Address this by adding a new rfork(2) flag, RFSPAWN, that has vfork(2) semantics but also resets signal handlers in the child during creation.
  x86 uses rfork_thread(3) instead of a direct rfork(2) because rfork with RFMEM/RFSPAWN cannot work when the return address is stored on the stack -- further information about this problem is described under RFMEM in the rfork(2) man page.
  Addressing this has been identified as a prerequisite to using posix_spawn in subprocess on FreeBSD [2].
  [1] https://ewontfix.com/7/
  [2] https://bugs.python.org/issue35823
  Reviewed by: jilles, kib
  Differential Revision: https://reviews.freebsd.org/D19058
  Notes: svn path=/head/; revision=352712
* rfork(2): add RFSPAWN flag | Kyle Evans | 2019-09-25 | 1 | -2/+12
  When RFSPAWN is passed, rfork exhibits vfork(2) semantics but also resets signal handlers in the child during creation to avoid a point of corruption of parent state from the child.
  This flag will be used by posix_spawn(3) to handle potential signal issues.
  Reviewed by: jilles, kib
  Differential Revision: https://reviews.freebsd.org/D19058
  Notes: svn path=/head/; revision=352711
* sysent: regenerate after r352705 | Kyle Evans | 2019-09-25 | 1 | -4/+0
  This also implements it, fixes kdump, and removes no longer needed bits from lib/libc/sys/shm_open.c for the interim.
  Notes: svn path=/head/; revision=352706
* Add linux-compatible memfd_create | Kyle Evans | 2019-09-25 | 4 | -5/+210
  memfd_create is effectively a SHM_ANON shm_open(2) mapping with optional CLOEXEC and file sealing support. This is used by some mesa parts, some linux libs, and qemu can also take advantage of it and uses the sealing to prevent resizing the region.
  This reimplements shm_open in terms of shm_open2(2) at the same time.
  shm_open(2) will be moved to COMPAT12 shortly.
  Reviewed by: markj, kib
  Differential Revision: https://reviews.freebsd.org/D21393
  Notes: svn path=/head/; revision=352703
* Update fcntl(2) after r352695 | Kyle Evans | 2019-09-25 | 1 | -1/+62
  Notes: svn path=/head/; revision=352696
* Add two options to allow mount to avoid covering up existing mount points. | Sean Eric Fagan | 2019-09-23 | 1 | -1/+15
  The two options are:
  * nocover/cover: Prevent/allow mounting over an existing root mountpoint. E.g., "mount -t ufs -o nocover /dev/sd1a /usr/local" will fail if /usr/local is already a mountpoint.
  * emptydir/noemptydir: Prevent/allow mounting on a non-empty directory. E.g., "mount -t ufs -o emptydir /dev/sd1a /usr" will fail.
  Neither of these options is intended to be a default, for historical and compatibility reasons.
  Reviewed by: allanjude, kib
  Differential Revision: https://reviews.freebsd.org/D21458
  Notes: svn path=/head/; revision=352614
* Return EISDIR when directory is opened with O_CREAT without O_DIRECTORY. | Konstantin Belousov | 2019-09-17 | 1 | -1/+6
  Reviewed by: bcr (man page), emaste (previous version)
  PR: 240452
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D21634
  Notes: svn path=/head/; revision=352455
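The behavior named in the commit title can be probed from userspace with a few lines of C. The helper name is hypothetical, and the sketch assumes a kernel that implements this behavior: opening an existing directory with O_CREAT but without O_DIRECTORY is expected to fail with EISDIR.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open path with O_CREAT but without O_DIRECTORY.
 * Returns 0 if the open succeeded, otherwise the errno value.
 */
int
open_creat_error(const char *path)
{
    int fd;

    fd = open(path, O_RDONLY | O_CREAT, 0644);
    if (fd == -1)
        return (errno);
    (void)close(fd);
    return (0);
}
```

Passing "." as the path exercises the directory case without creating anything.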
* getsockopt.2: clarify that SO_TIMESTAMP is not 100% reliable | Alan Somers | 2019-09-11 | 1 | -2/+3
  When SO_TIMESTAMP is set, the kernel will attempt to attach a timestamp as ancillary data to each IP datagram that is received on the socket. However, it may fail, for example due to insufficient memory. In that case the packet will still be received but no timestamp will be attached.
  Reviewed by: kib
  MFC after: 3 days
  Differential Revision: https://reviews.freebsd.org/D21607
  Notes: svn path=/head/; revision=352231
* Fix cpuwhich_t column width | Mitchell Horne | 2019-09-08 | 1 | -1/+1
  Not bumping .Dd since this is purely a format change.
  Approved by: markj (mentor)
  Notes: svn path=/head/; revision=352048
* Add procctl(PROC_STACKGAP_CTL) | Konstantin Belousov | 2019-09-03 | 1 | -1/+62
  It allows a process to request, retroactively, that the stack gap not be applied to its stacks. It is also possible to control the gaps in the process after exec.
  PR: 239894
  Reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D21352
  Notes: svn path=/head/; revision=351773
* Add sysctlbyname system call | Mateusz Guzik | 2019-09-03 | 1 | -0/+1
  Previously userspace would issue one syscall to resolve the sysctl and then another one to actually use it. Do it all in one trip.
  A fallback is provided in case a newer libc happens to be running on an older kernel.
  Submitted by: Pawel Biernacki
  Reported by: kib, brooks
  Differential Revision: https://reviews.freebsd.org/D17282
  Notes: svn path=/head/; revision=351729
* Add @generated tag to libc syscall asm wrappers | Ed Maste | 2019-08-16 | 1 | -2/+4
  Although libc syscall wrappers do not get checked in, this can aid in finding the source of generated files when spelunking in the objdir.
  Multiple tools use @generated to identify generated files (for example, in a review Phabricator will by default hide diffs in generated files). For consistency use the @generated tag in makesyscalls.sh as we've done for other generated files, even though these wrappers aren't checked in to the tree.
  Notes: svn path=/head/; revision=351122