diff options
Diffstat (limited to 'lib/libsys/shm_open.2')
-rw-r--r-- | lib/libsys/shm_open.2 | 623 |
1 files changed, 623 insertions, 0 deletions
diff --git a/lib/libsys/shm_open.2 b/lib/libsys/shm_open.2 new file mode 100644 index 000000000000..c3196d966e6b --- /dev/null +++ b/lib/libsys/shm_open.2 @@ -0,0 +1,623 @@ +.\" +.\" Copyright 2000 Massachusetts Institute of Technology +.\" +.\" Permission to use, copy, modify, and distribute this software and +.\" its documentation for any purpose and without fee is hereby +.\" granted, provided that both the above copyright notice and this +.\" permission notice appear in all copies, that both the above +.\" copyright notice and this permission notice appear in all +.\" supporting documentation, and that the name of M.I.T. not be used +.\" in advertising or publicity pertaining to distribution of the +.\" software without specific, written prior permission. M.I.T. makes +.\" no representations about the suitability of this software for any +.\" purpose. It is provided "as is" without express or implied +.\" warranty. +.\" +.\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS +.\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE, +.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT +.\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +.\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.Dd August 4, 2025 +.Dt SHM_OPEN 2 +.Os +.Sh NAME +.Nm memfd_create , shm_create_largepage , shm_open , shm_rename, shm_unlink +.Nd "shared memory object operations" +.Sh LIBRARY +.Lb libc +.Sh SYNOPSIS +.In sys/types.h +.In sys/mman.h +.In fcntl.h +.Ft int +.Fn memfd_create "const char *name" "unsigned int flags" +.Ft int +.Fo shm_create_largepage +.Fa "const char *path" +.Fa "int flags" +.Fa "int psind" +.Fa "int alloc_policy" +.Fa "mode_t mode" +.Fc +.Ft int +.Fn shm_open "const char *path" "int flags" "mode_t mode" +.Ft int +.Fn shm_rename "const char *path_from" "const char *path_to" "int flags" +.Ft int +.Fn shm_unlink "const char *path" +.Sh DESCRIPTION +The +.Fn shm_open +function opens (or optionally creates) a +POSIX +shared memory object named +.Fa path . +The +.Fa flags +argument contains a subset of the flags used by +.Xr open 2 . +An access mode of either +.Dv O_RDONLY +or +.Dv O_RDWR +must be included in +.Fa flags . +The optional flags +.Dv O_CREAT , +.Dv O_EXCL , +.Dv O_TRUNC , +and +.Dv O_CLOFORK +may also be specified. +.Pp +If +.Dv O_CREAT +is specified, +then a new shared memory object named +.Fa path +will be created if it does not exist. +In this case, +the shared memory object is created with mode +.Fa mode +subject to the process' umask value. +If both the +.Dv O_CREAT +and +.Dv O_EXCL +flags are specified and a shared memory object named +.Fa path +already exists, +then +.Fn shm_open +will fail with +.Er EEXIST . +.Pp +Newly created objects start off with a size of zero. +If an existing shared memory object is opened with +.Dv O_RDWR +and the +.Dv O_TRUNC +flag is specified, +then the shared memory object will be truncated to a size of zero. +The size of the object can be adjusted via +.Xr ftruncate 2 +and queried via +.Xr fstat 2 . +.Pp +The new descriptor is set to close during +.Xr execve 2 +system calls; +see +.Xr close 2 +and +.Xr fcntl 2 . +.Pp +The constant +.Dv SHM_ANON +may be used for the +.Fa path +argument to +.Fn shm_open . +In this case, an anonymous, unnamed shared memory object is created. +Since the object has no name, +it cannot be removed via a subsequent call to +.Fn shm_unlink , +or moved with a call to +.Fn shm_rename . +Instead, +the shared memory object will be garbage collected when the last reference to +the shared memory object is removed. +The shared memory object may be shared with other processes by sharing the +file descriptor via +.Xr fork 2 +or +.Xr sendmsg 2 . +Attempting to open an anonymous shared memory object with +.Dv O_RDONLY +will fail with +.Er EINVAL . +All other flags are ignored. +.Pp +The +.Fn shm_create_largepage +function behaves similarly to +.Fn shm_open , +except that the +.Dv O_CREAT +flag is implicitly specified, and the returned +.Dq largepage +object is always backed by aligned, physically contiguous chunks of memory. +This ensures that the object can be mapped using so-called +.Dq superpages , +which can improve application performance in some workloads by reducing the +number of translation lookaside buffer (TLB) entries required to access a +mapping of the object, +and by reducing the number of page faults performed when accessing a mapping. +This happens automatically for all largepage objects. +.Pp +An existing largepage object can be opened using the +.Fn shm_open +function. +Largepage shared memory objects behave slightly differently from non-largepage +objects: +.Bl -bullet -offset indent +.It +Memory for a largepage object is allocated when the object is +extended using the +.Xr ftruncate 2 +system call, whereas memory for regular shared memory objects is allocated +lazily and may be paged out to a swap device when not in use. +.It +The size of a mapping of a largepage object must be a multiple of the +underlying large page size. +Most attributes of such a mapping can only be modified at the granularity +of the large page size. +For example, when using +.Xr munmap 2 +to unmap a portion of a largepage object mapping, or when using +.Xr mprotect 2 +to adjust protections of a mapping of a largepage object, the starting address +must be large page size-aligned, and the length of the operation must be a +multiple of the large page size. +If not, the corresponding system call will fail and set +.Va errno +to +.Er EINVAL . +.El +.Pp +The +.Fa psind +argument to +.Fn shm_create_largepage +specifies the size of large pages used to back the object. +This argument is an index into the page sizes array returned by +.Xr getpagesizes 3 . +In particular, all large pages backing a largepage object must be of the +same size. +For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage +object will consist of either 1024 2MB pages, or 2 1GB pages, depending on +the value specified for the +.Fa psind +argument. +The +.Fa alloc_policy +parameter specifies what happens when an attempt to use +.Xr ftruncate 2 +to allocate memory for the object fails. +The following values are accepted: +.Bl -tag -offset indent -width SHM_ +.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT +If the (non-blocking) memory allocation fails because there is insufficient free +contiguous memory, the kernel will attempt to defragment physical memory and +try another allocation. +The subsequent allocation may or may not succeed. +If this subsequent allocation also fails, +.Xr ftruncate 2 +will fail and set +.Va errno +to +.Er ENOMEM . +.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT +If the memory allocation fails, +.Xr ftruncate 2 +will fail and set +.Va errno +to +.Er ENOMEM . +.It Dv SHM_LARGEPAGE_ALLOC_HARD +The kernel will attempt defragmentation until the allocation succeeds, +or an unblocked signal is delivered to the thread. +However, it is possible for physical memory to be fragmented such that the +allocation will never succeed. +.El +.Pp +The +.Dv FIOSSHMLPGCNF +and +.Dv FIOGSHMLPGCNF +.Xr ioctl 2 +commands can be used with a largepage shared memory object to get and set +largepage object parameters. +Both commands operate on the following structure: +.Bd -literal +struct shm_largepage_conf { + int psind; + int alloc_policy; +}; + +.Ed +The +.Dv FIOGSHMLPGCNF +command populates this structure with the current values of these parameters, +while the +.Dv FIOSSHMLPGCNF +command modifies the largepage object. +Currently only the +.Va alloc_policy +parameter may be modified. +Internally, +.Fn shm_create_largepage +works by creating a regular shared memory object using +.Fn shm_open , +and then converting it into a largepage object using the +.Dv FIOSSHMLPGCNF +ioctl command. +.Pp +The +.Fn shm_rename +system call atomically removes a shared memory object named +.Fa path_from +and relinks it at +.Fa path_to . +If another object is already linked at +.Fa path_to , +that object will be unlinked, unless one of the following flags are provided: +.Bl -tag -offset indent -width Er +.It Er SHM_RENAME_EXCHANGE +Atomically exchange the shms at +.Fa path_from +and +.Fa path_to . +.It Er SHM_RENAME_NOREPLACE +Return an error if an shm exists at +.Fa path_to , +rather than unlinking it. +.El +.Pp +The +.Fn shm_unlink +system call removes a shared memory object named +.Fa path . +.Pp +The +.Fn memfd_create +function creates an anonymous shared memory object, identical to that created +by +.Fn shm_open +when +.Dv SHM_ANON +is specified. +Newly created objects start off with a size of zero. +The size of the new object must be adjusted via +.Xr ftruncate 2 . +.Pp +The +.Fa name +argument must not be +.Dv NULL , +but it may be an empty string. +The length of the +.Fa name +argument may not exceed +.Dv NAME_MAX +minus six characters for the prefix +.Dq memfd: , +which will be prepended. +The +.Fa name +argument is intended solely for debugging purposes and will never be used by the +kernel to identify a memfd. +Names are therefore not required to be unique. +.Pp +The following +.Fa flags +may be specified to +.Fn memfd_create : +.Bl -tag -width MFD_ALLOW_SEALING +.It Dv MFD_CLOEXEC +Set +.Dv FD_CLOEXEC +on the resulting file descriptor. +.It Dv MFD_ALLOW_SEALING +Allow adding seals to the resulting file descriptor using the +.Dv F_ADD_SEALS +.Xr fcntl 2 +command. +.It Dv MFD_HUGETLB +This flag is currently unsupported. +.El +.Sh RETURN VALUES +If successful, +.Fn memfd_create +and +.Fn shm_open +both return a non-negative integer, +and +.Fn shm_rename +and +.Fn shm_unlink +return zero. +All functions return -1 on failure, and set +.Va errno +to indicate the error. +.Sh COMPATIBILITY +The +.Fn shm_create_largepage +and +.Fn shm_rename +functions are +.Fx +extensions, as is support for the +.Dv SHM_ANON +value in +.Fn shm_open . +.Pp +The +.Fa path , +.Fa path_from , +and +.Fa path_to +arguments do not necessarily represent a pathname (although they do in +most other implementations). +Two processes opening the same +.Fa path +are guaranteed to access the same shared memory object if and only if +.Fa path +begins with a slash +.Pq Ql \&/ +character. +.Pp +Only the +.Dv O_RDONLY , +.Dv O_RDWR , +.Dv O_CREAT , +.Dv O_EXCL , +and +.Dv O_TRUNC +flags may be used in portable programs. +.Pp +POSIX +specifications state that the result of using +.Xr open 2 , +.Xr read 2 , +or +.Xr write 2 +on a shared memory object, or on the descriptor returned by +.Fn shm_open , +is undefined. +However, the +.Fx +kernel implementation explicitly includes support for +.Xr read 2 +and +.Xr write 2 . +.Pp +.Fx +also supports zero-copy transmission of data from shared memory +objects with +.Xr sendfile 2 . +.Pp +Neither shared memory objects nor their contents persist across reboots. +.Pp +Writes do not extend shared memory objects, so +.Xr ftruncate 2 +must be called before any data can be written. +See +.Sx EXAMPLES . +.Sh EXAMPLES +This example fails without the call to +.Xr ftruncate 2 : +.Bd -literal -compact + + uint8_t buffer[getpagesize()]; + ssize_t len; + int fd; + + fd = shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600); + if (fd < 0) + err(EX_OSERR, "%s: shm_open", __func__); + if (ftruncate(fd, getpagesize()) < 0) + err(EX_IOERR, "%s: ftruncate", __func__); + len = pwrite(fd, buffer, getpagesize(), 0); + if (len < 0) + err(EX_IOERR, "%s: pwrite", __func__); + if (len != getpagesize()) + errx(EX_IOERR, "%s: pwrite length mismatch", __func__); +.Ed +.Sh ERRORS +.Fn memfd_create +fails with these error codes for these conditions: +.Bl -tag -width Er +.It Bq Er EBADF +The +.Fa name +argument was NULL. +.It Bq Er EINVAL +The +.Fa name +argument was too long. +.Pp +An invalid or unsupported flag was included in +.Fa flags . +.It Bq Er EMFILE +The process has already reached its limit for open file descriptors. +.It Bq Er ENFILE +The system file table is full. +.It Bq Er ENOSYS +In +.Fa memfd_create , +.Dv MFD_HUGETLB +was specified in +.Fa flags , +and this system does not support forced hugetlb mappings. +.El +.Pp +.Fn shm_open +fails with these error codes for these conditions: +.Bl -tag -width Er +.It Bq Er EINVAL +A flag other than +.Dv O_RDONLY , +.Dv O_RDWR , +.Dv O_CREAT , +.Dv O_EXCL , +or +.Dv O_TRUNC +was included in +.Fa flags . +.It Bq Er EMFILE +The process has already reached its limit for open file descriptors. +.It Bq Er ENFILE +The system file table is full. +.It Bq Er EINVAL +.Dv O_RDONLY +was specified while creating an anonymous shared memory object via +.Dv SHM_ANON . +.It Bq Er EFAULT +The +.Fa path +argument points outside the process' allocated address space. +.It Bq Er ENAMETOOLONG +The entire pathname exceeds 1023 characters. +.It Bq Er EINVAL +The +.Fa path +does not begin with a slash +.Pq Ql \&/ +character. +.It Bq Er ENOENT +.Dv O_CREAT +is not specified and the named shared memory object does not exist. +.It Bq Er EEXIST +.Dv O_CREAT +and +.Dv O_EXCL +are specified and the named shared memory object does exist. +.It Bq Er EACCES +The required permissions (for reading or reading and writing) are denied. +.It Bq Er ECAPMODE +The process is running in capability mode (see +.Xr capsicum 4 ) +and attempted to create a named shared memory object. +.El +.Pp +.Fn shm_create_largepage +can fail for the reasons listed above. +It also fails with these error codes for the following conditions: +.Bl -tag -width Er +.It Bq Er ENOTTY +The kernel does not support large pages on the current platform. +.El +.Pp +The following errors are defined for +.Fn shm_rename : +.Bl -tag -width Er +.It Bq Er EFAULT +The +.Fa path_from +or +.Fa path_to +argument points outside the process' allocated address space. +.It Bq Er ENAMETOOLONG +The entire pathname exceeds 1023 characters. +.It Bq Er ENOENT +The shared memory object at +.Fa path_from +does not exist. +.It Bq Er EACCES +The required permissions are denied. +.It Bq Er EEXIST +An shm exists at +.Fa path_to , +and the +.Dv SHM_RENAME_NOREPLACE +flag was provided. +.El +.Pp +.Fn shm_unlink +fails with these error codes for these conditions: +.Bl -tag -width Er +.It Bq Er EFAULT +The +.Fa path +argument points outside the process' allocated address space. +.It Bq Er ENAMETOOLONG +The entire pathname exceeds 1023 characters. +.It Bq Er ENOENT +The named shared memory object does not exist. +.It Bq Er EACCES +The required permissions are denied. +.Fn shm_unlink +requires write permission to the shared memory object. +.El +.Sh SEE ALSO +.Xr posixshmcontrol 1 , +.Xr close 2 , +.Xr fstat 2 , +.Xr ftruncate 2 , +.Xr ioctl 2 , +.Xr mmap 2 , +.Xr munmap 2 , +.Xr sendfile 2 +.Sh STANDARDS +The +.Fn memfd_create +function is expected to be compatible with the Linux system call of the same +name. +.Pp +The +.Fn shm_open +and +.Fn shm_unlink +functions are believed to conform to +.St -p1003.1b-93 . +.Sh HISTORY +The +.Fn memfd_create +function appeared in +.Fx 13.0 . +.Pp +The +.Fn shm_open +and +.Fn shm_unlink +functions first appeared in +.Fx 4.3 . +The functions were reimplemented as system calls using shared memory objects +directly rather than files in +.Fx 8.0 . +.Pp +.Fn shm_rename +first appeared in +.Fx 13.0 +as a +.Fx +extension. +.Sh AUTHORS +.An Garrett A. Wollman Aq Mt wollman@FreeBSD.org +(C library support and this manual page) +.Pp +.An Matthew Dillon Aq Mt dillon@FreeBSD.org +.Pq Dv MAP_NOSYNC +.Pp +.An Matthew Bryan Aq Mt matthew.bryan@isilon.com +.Pq Dv shm_rename implementation |