Diffstat (limited to 'lib/libsys/shm_open.2')
-rw-r--r-- lib/libsys/shm_open.2 623
1 files changed, 623 insertions, 0 deletions
diff --git a/lib/libsys/shm_open.2 b/lib/libsys/shm_open.2
new file mode 100644
index 000000000000..c3196d966e6b
--- /dev/null
+++ b/lib/libsys/shm_open.2
@@ -0,0 +1,623 @@
+.\"
+.\" Copyright 2000 Massachusetts Institute of Technology
+.\"
+.\" Permission to use, copy, modify, and distribute this software and
+.\" its documentation for any purpose and without fee is hereby
+.\" granted, provided that both the above copyright notice and this
+.\" permission notice appear in all copies, that both the above
+.\" copyright notice and this permission notice appear in all
+.\" supporting documentation, and that the name of M.I.T. not be used
+.\" in advertising or publicity pertaining to distribution of the
+.\" software without specific, written prior permission. M.I.T. makes
+.\" no representations about the suitability of this software for any
+.\" purpose. It is provided "as is" without express or implied
+.\" warranty.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS
+.\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE,
+.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT
+.\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+.\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.Dd August 4, 2025
+.Dt SHM_OPEN 2
+.Os
+.Sh NAME
+.Nm memfd_create , shm_create_largepage , shm_open , shm_rename , shm_unlink
+.Nd "shared memory object operations"
+.Sh LIBRARY
+.Lb libc
+.Sh SYNOPSIS
+.In sys/types.h
+.In sys/mman.h
+.In fcntl.h
+.Ft int
+.Fn memfd_create "const char *name" "unsigned int flags"
+.Ft int
+.Fo shm_create_largepage
+.Fa "const char *path"
+.Fa "int flags"
+.Fa "int psind"
+.Fa "int alloc_policy"
+.Fa "mode_t mode"
+.Fc
+.Ft int
+.Fn shm_open "const char *path" "int flags" "mode_t mode"
+.Ft int
+.Fn shm_rename "const char *path_from" "const char *path_to" "int flags"
+.Ft int
+.Fn shm_unlink "const char *path"
+.Sh DESCRIPTION
+The
+.Fn shm_open
+function opens (or optionally creates) a
+POSIX
+shared memory object named
+.Fa path .
+The
+.Fa flags
+argument contains a subset of the flags used by
+.Xr open 2 .
+An access mode of either
+.Dv O_RDONLY
+or
+.Dv O_RDWR
+must be included in
+.Fa flags .
+The optional flags
+.Dv O_CREAT ,
+.Dv O_EXCL ,
+.Dv O_TRUNC ,
+and
+.Dv O_CLOFORK
+may also be specified.
+.Pp
+If
+.Dv O_CREAT
+is specified,
+then a new shared memory object named
+.Fa path
+will be created if it does not exist.
+In this case,
+the shared memory object is created with mode
+.Fa mode
+subject to the process' umask value.
+If both the
+.Dv O_CREAT
+and
+.Dv O_EXCL
+flags are specified and a shared memory object named
+.Fa path
+already exists,
+then
+.Fn shm_open
+will fail with
+.Er EEXIST .
+.Pp
+Newly created objects start off with a size of zero.
+If an existing shared memory object is opened with
+.Dv O_RDWR
+and the
+.Dv O_TRUNC
+flag is specified,
+then the shared memory object will be truncated to a size of zero.
+The size of the object can be adjusted via
+.Xr ftruncate 2
+and queried via
+.Xr fstat 2 .
+.Pp
+The new descriptor is set to close during
+.Xr execve 2
+system calls;
+see
+.Xr close 2
+and
+.Xr fcntl 2 .
+.Pp
+The constant
+.Dv SHM_ANON
+may be used for the
+.Fa path
+argument to
+.Fn shm_open .
+In this case, an anonymous, unnamed shared memory object is created.
+Since the object has no name,
+it cannot be removed via a subsequent call to
+.Fn shm_unlink ,
+or moved with a call to
+.Fn shm_rename .
+Instead,
+the shared memory object will be garbage collected when the last reference to
+the shared memory object is removed.
+The shared memory object may be shared with other processes by sharing the
+file descriptor via
+.Xr fork 2
+or
+.Xr sendmsg 2 .
+Attempting to open an anonymous shared memory object with
+.Dv O_RDONLY
+will fail with
+.Er EINVAL .
+All other flags are ignored.
+.Pp
+The
+.Fn shm_create_largepage
+function behaves similarly to
+.Fn shm_open ,
+except that the
+.Dv O_CREAT
+flag is implicitly specified, and the returned
+.Dq largepage
+object is always backed by aligned, physically contiguous chunks of memory.
+This ensures that the object can be mapped using so-called
+.Dq superpages ,
+which can improve application performance in some workloads by reducing the
+number of translation lookaside buffer (TLB) entries required to access a
+mapping of the object,
+and by reducing the number of page faults performed when accessing a mapping.
+This happens automatically for all largepage objects.
+.Pp
+An existing largepage object can be opened using the
+.Fn shm_open
+function.
+Largepage shared memory objects behave slightly differently from non-largepage
+objects:
+.Bl -bullet -offset indent
+.It
+Memory for a largepage object is allocated when the object is
+extended using the
+.Xr ftruncate 2
+system call, whereas memory for regular shared memory objects is allocated
+lazily and may be paged out to a swap device when not in use.
+.It
+The size of a mapping of a largepage object must be a multiple of the
+underlying large page size.
+Most attributes of such a mapping can only be modified at the granularity
+of the large page size.
+For example, when using
+.Xr munmap 2
+to unmap a portion of a largepage object mapping, or when using
+.Xr mprotect 2
+to adjust protections of a mapping of a largepage object, the starting address
+must be large page size-aligned, and the length of the operation must be a
+multiple of the large page size.
+If not, the corresponding system call will fail and set
+.Va errno
+to
+.Er EINVAL .
+.El
+.Pp
+The
+.Fa psind
+argument to
+.Fn shm_create_largepage
+specifies the size of large pages used to back the object.
+This argument is an index into the page sizes array returned by
+.Xr getpagesizes 3 .
+In particular, all large pages backing a largepage object must be of the
+same size.
+For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage
+object will consist of either 1024 2MB pages, or 2 1GB pages, depending on
+the value specified for the
+.Fa psind
+argument.
+The
+.Fa alloc_policy
+parameter specifies what happens when an attempt to use
+.Xr ftruncate 2
+to allocate memory for the object fails.
+The following values are accepted:
+.Bl -tag -offset indent -width SHM_
+.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT
+If the (non-blocking) memory allocation fails because there is insufficient free
+contiguous memory, the kernel will attempt to defragment physical memory and
+try another allocation.
+The subsequent allocation may or may not succeed.
+If this subsequent allocation also fails,
+.Xr ftruncate 2
+will fail and set
+.Va errno
+to
+.Er ENOMEM .
+.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT
+If the memory allocation fails,
+.Xr ftruncate 2
+will fail and set
+.Va errno
+to
+.Er ENOMEM .
+.It Dv SHM_LARGEPAGE_ALLOC_HARD
+The kernel will attempt defragmentation until the allocation succeeds,
+or an unblocked signal is delivered to the thread.
+However, it is possible for physical memory to be fragmented such that the
+allocation will never succeed.
+.El
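+.Pp
+As a minimal sketch, not taken from the system sources, the sequence
+described above might look as follows.
+The function name
+.Fn largepage_create
+and the fixed array size of 8 are hypothetical, and the sketch assumes that
+index 1 of the array filled in by
+.Xr getpagesizes 3
+names the smallest large page size.
+Because
+.Fn shm_create_largepage
+is a
+.Fx
+extension, the body is compiled as a stub on other systems.
+.Bd -literal
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+/*
+ * Create a largepage object at path and allocate one large page for it.
+ * Returns a descriptor, or -1 on failure.
+ */
+int
+largepage_create(const char *path)
+{
+#ifdef __FreeBSD__
+	size_t ps[8];		/* hypothetical upper bound */
+	int fd, n;
+
+	n = getpagesizes(ps, 8);
+	if (n < 2) {		/* no large page size reported */
+		errno = ENOTTY;	/* this sketch's choice of error */
+		return (-1);
+	}
+	fd = shm_create_largepage(path, O_RDWR, 1,
+	    SHM_LARGEPAGE_ALLOC_DEFAULT, 0600);
+	if (fd < 0)
+		return (-1);
+	/* Memory is allocated here, not lazily on first access. */
+	if (ftruncate(fd, (off_t)ps[1]) < 0) {
+		(void)close(fd);
+		(void)shm_unlink(path);
+		return (-1);
+	}
+	return (fd);
+#else
+	(void)path;
+	errno = ENOSYS;
+	return (-1);
+#endif
+}
+.Ed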
+.Pp
+The
+.Dv FIOSSHMLPGCNF
+and
+.Dv FIOGSHMLPGCNF
+.Xr ioctl 2
+commands respectively set and get the parameters of a largepage shared memory
+object.
+Both commands operate on the following structure:
+.Bd -literal
+struct shm_largepage_conf {
+	int psind;
+	int alloc_policy;
+};
+
+.Ed
+The
+.Dv FIOGSHMLPGCNF
+command populates this structure with the current values of these parameters,
+while the
+.Dv FIOSSHMLPGCNF
+command modifies the largepage object.
+Currently only the
+.Va alloc_policy
+parameter may be modified.
+Internally,
+.Fn shm_create_largepage
+works by creating a regular shared memory object using
+.Fn shm_open ,
+and then converting it into a largepage object using the
+.Dv FIOSSHMLPGCNF
+ioctl command.
+.Pp
+The
+.Fn shm_rename
+system call atomically removes a shared memory object named
+.Fa path_from
+and relinks it at
+.Fa path_to .
+If another object is already linked at
+.Fa path_to ,
+that object will be unlinked, unless one of the following flags is provided:
+.Bl -tag -offset indent -width Er
+.It Dv SHM_RENAME_EXCHANGE
+Atomically exchange the shared memory objects at
+.Fa path_from
+and
+.Fa path_to .
+.It Dv SHM_RENAME_NOREPLACE
+Return an error if a shared memory object exists at
+.Fa path_to ,
+rather than unlinking it.
+.El
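+.Pp
+One possible use of
+.Dv SHM_RENAME_NOREPLACE
+is to populate an object under a temporary name and then publish it without
+replacing an existing object.
+The following minimal sketch illustrates this; the function name
+.Fn shm_publish
+is hypothetical, and the body is a stub on systems other than
+.Fx .
+.Bd -literal
+#include <sys/mman.h>
+#include <errno.h>
+
+/*
+ * Move the object at tmp_path to final_path.
+ * Fails rather than unlinking an object already linked at final_path.
+ */
+int
+shm_publish(const char *tmp_path, const char *final_path)
+{
+#ifdef __FreeBSD__
+	return (shm_rename(tmp_path, final_path, SHM_RENAME_NOREPLACE));
+#else
+	(void)tmp_path;
+	(void)final_path;
+	errno = ENOSYS;
+	return (-1);
+#endif
+}
+.Ed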
+.Pp
+The
+.Fn shm_unlink
+system call removes a shared memory object named
+.Fa path .
+.Pp
+The
+.Fn memfd_create
+function creates an anonymous shared memory object, identical to that created
+by
+.Fn shm_open
+when
+.Dv SHM_ANON
+is specified.
+Newly created objects start off with a size of zero.
+The size of the new object must be adjusted via
+.Xr ftruncate 2 .
+.Pp
+The
+.Fa name
+argument must not be
+.Dv NULL ,
+but it may be an empty string.
+The length of the
+.Fa name
+argument may not exceed
+.Dv NAME_MAX
+minus six characters for the prefix
+.Dq memfd: ,
+which will be prepended.
+The
+.Fa name
+argument is intended solely for debugging purposes and will never be used by the
+kernel to identify a memfd.
+Names are therefore not required to be unique.
+.Pp
+The following
+.Fa flags
+may be specified to
+.Fn memfd_create :
+.Bl -tag -width MFD_ALLOW_SEALING
+.It Dv MFD_CLOEXEC
+Set
+.Dv FD_CLOEXEC
+on the resulting file descriptor.
+.It Dv MFD_ALLOW_SEALING
+Allow adding seals to the resulting file descriptor using the
+.Dv F_ADD_SEALS
+.Xr fcntl 2
+command.
+.It Dv MFD_HUGETLB
+This flag is currently unsupported.
+.El
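+.Pp
+As a minimal sketch, the following function creates a memfd that can be
+sealed, sizes it, and then forbids later resizing.
+The function name
+.Fn sealed_memfd
+and the label
+.Dq example
+are hypothetical.
+The
+.Dv F_SEAL_GROW
+and
+.Dv F_SEAL_SHRINK
+seals are described in
+.Xr fcntl 2
+rather than in this page, and the
+.Dv _GNU_SOURCE
+definition is an assumption that only matters when building against glibc.
+.Bd -literal
+#ifndef _GNU_SOURCE
+#define _GNU_SOURCE	/* assumption: needed by glibc only */
+#endif
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+/* Returns a descriptor whose size is fixed at size bytes, or -1. */
+int
+sealed_memfd(off_t size)
+{
+	int fd;
+
+	fd = memfd_create("example", MFD_CLOEXEC | MFD_ALLOW_SEALING);
+	if (fd < 0)
+		return (-1);
+	/* New objects have size zero, so set the size before sealing. */
+	if (ftruncate(fd, size) < 0 ||
+	    fcntl(fd, F_ADD_SEALS, F_SEAL_GROW | F_SEAL_SHRINK) < 0) {
+		(void)close(fd);
+		return (-1);
+	}
+	return (fd);
+}
+.Ed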
+.Sh RETURN VALUES
+If successful,
+.Fn memfd_create ,
+.Fn shm_create_largepage ,
+and
+.Fn shm_open
+each return a non-negative integer,
+and
+.Fn shm_rename
+and
+.Fn shm_unlink
+return zero.
+All functions return -1 on failure, and set
+.Va errno
+to indicate the error.
+.Sh COMPATIBILITY
+The
+.Fn shm_create_largepage
+and
+.Fn shm_rename
+functions are
+.Fx
+extensions, as is support for the
+.Dv SHM_ANON
+value in
+.Fn shm_open .
+.Pp
+The
+.Fa path ,
+.Fa path_from ,
+and
+.Fa path_to
+arguments do not necessarily represent a pathname (although they do in
+most other implementations).
+Two processes opening the same
+.Fa path
+are guaranteed to access the same shared memory object if and only if
+.Fa path
+begins with a slash
+.Pq Ql \&/
+character.
+.Pp
+Only the
+.Dv O_RDONLY ,
+.Dv O_RDWR ,
+.Dv O_CREAT ,
+.Dv O_EXCL ,
+and
+.Dv O_TRUNC
+flags may be used in portable programs.
+.Pp
+POSIX
+specifications state that the result of using
+.Xr open 2 ,
+.Xr read 2 ,
+or
+.Xr write 2
+on a shared memory object, or on the descriptor returned by
+.Fn shm_open ,
+is undefined.
+However, the
+.Fx
+kernel implementation explicitly includes support for
+.Xr read 2
+and
+.Xr write 2 .
+.Pp
+.Fx
+also supports zero-copy transmission of data from shared memory
+objects with
+.Xr sendfile 2 .
+.Pp
+Neither shared memory objects nor their contents persist across reboots.
+.Pp
+Writes do not extend shared memory objects, so
+.Xr ftruncate 2
+must be called before any data can be written.
+See
+.Sx EXAMPLES .
+.Sh EXAMPLES
+This example fails without the call to
+.Xr ftruncate 2 :
+.Bd -literal -compact
+
+	uint8_t buffer[getpagesize()];
+	ssize_t len;
+	int fd;
+
+	fd = shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600);
+	if (fd < 0)
+		err(EX_OSERR, "%s: shm_open", __func__);
+	if (ftruncate(fd, getpagesize()) < 0)
+		err(EX_IOERR, "%s: ftruncate", __func__);
+	len = pwrite(fd, buffer, getpagesize(), 0);
+	if (len < 0)
+		err(EX_IOERR, "%s: pwrite", __func__);
+	if (len != getpagesize())
+		errx(EX_IOERR, "%s: pwrite length mismatch", __func__);
+.Ed
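+.Pp
+The following minimal sketch shows one possible life cycle of a named object:
+create, size, map, and unlink.
+The function name
+.Fn shm_roundtrip
+is hypothetical.
+It also opens
+.Fa path
+a second time to show that both descriptors refer to the same object.
+.Bd -literal
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+#include <string.h>
+#include <unistd.h>
+
+/* Returns 0 if data written through one mapping is seen through another. */
+int
+shm_roundtrip(const char *path)
+{
+	static const char msg[] = "hello";
+	size_t len = (size_t)getpagesize();
+	char *wr = MAP_FAILED, *rd = MAP_FAILED;
+	int rfd = -1, wfd, ret = -1;
+
+	wfd = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
+	if (wfd < 0)
+		return (-1);
+	/* New objects have size zero; extend before mapping. */
+	if (ftruncate(wfd, (off_t)len) < 0)
+		goto out;
+	wr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, wfd, 0);
+	if (wr == MAP_FAILED)
+		goto out;
+	memcpy(wr, msg, sizeof(msg));
+
+	/* A second open of the same path reaches the same object. */
+	rfd = shm_open(path, O_RDONLY, 0);
+	if (rfd < 0)
+		goto out;
+	rd = mmap(NULL, len, PROT_READ, MAP_SHARED, rfd, 0);
+	if (rd != MAP_FAILED && memcmp(rd, msg, sizeof(msg)) == 0)
+		ret = 0;
+out:
+	if (rd != MAP_FAILED)
+		(void)munmap(rd, len);
+	if (wr != MAP_FAILED)
+		(void)munmap(wr, len);
+	if (rfd >= 0)
+		(void)close(rfd);
+	(void)close(wfd);
+	(void)shm_unlink(path);	/* remove the name in every case */
+	return (ret);
+}
+.Ed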
+.Sh ERRORS
+.Fn memfd_create
+fails with these error codes for these conditions:
+.Bl -tag -width Er
+.It Bq Er EBADF
+The
+.Fa name
+argument was NULL.
+.It Bq Er EINVAL
+The
+.Fa name
+argument was too long.
+.Pp
+An invalid or unsupported flag was included in
+.Fa flags .
+.It Bq Er EMFILE
+The process has already reached its limit for open file descriptors.
+.It Bq Er ENFILE
+The system file table is full.
+.It Bq Er ENOSYS
+In
+.Fn memfd_create ,
+.Dv MFD_HUGETLB
+was specified in
+.Fa flags ,
+and this system does not support forced hugetlb mappings.
+.El
+.Pp
+.Fn shm_open
+fails with these error codes for these conditions:
+.Bl -tag -width Er
+.It Bq Er EINVAL
+A flag other than
+.Dv O_RDONLY ,
+.Dv O_RDWR ,
+.Dv O_CREAT ,
+.Dv O_EXCL ,
+.Dv O_TRUNC ,
+or
+.Dv O_CLOFORK
+was included in
+.Fa flags .
+.It Bq Er EMFILE
+The process has already reached its limit for open file descriptors.
+.It Bq Er ENFILE
+The system file table is full.
+.It Bq Er EINVAL
+.Dv O_RDONLY
+was specified while creating an anonymous shared memory object via
+.Dv SHM_ANON .
+.It Bq Er EFAULT
+The
+.Fa path
+argument points outside the process' allocated address space.
+.It Bq Er ENAMETOOLONG
+The entire pathname exceeds 1023 characters.
+.It Bq Er EINVAL
+The
+.Fa path
+does not begin with a slash
+.Pq Ql \&/
+character.
+.It Bq Er ENOENT
+.Dv O_CREAT
+is not specified and the named shared memory object does not exist.
+.It Bq Er EEXIST
+.Dv O_CREAT
+and
+.Dv O_EXCL
+are specified and the named shared memory object does exist.
+.It Bq Er EACCES
+The required permissions (for reading or reading and writing) are denied.
+.It Bq Er ECAPMODE
+The process is running in capability mode (see
+.Xr capsicum 4 )
+and attempted to create a named shared memory object.
+.El
+.Pp
+.Fn shm_create_largepage
+can fail for the reasons listed above.
+It also fails with these error codes for the following conditions:
+.Bl -tag -width Er
+.It Bq Er ENOTTY
+The kernel does not support large pages on the current platform.
+.El
+.Pp
+The following errors are defined for
+.Fn shm_rename :
+.Bl -tag -width Er
+.It Bq Er EFAULT
+The
+.Fa path_from
+or
+.Fa path_to
+argument points outside the process' allocated address space.
+.It Bq Er ENAMETOOLONG
+The entire pathname exceeds 1023 characters.
+.It Bq Er ENOENT
+The shared memory object at
+.Fa path_from
+does not exist.
+.It Bq Er EACCES
+The required permissions are denied.
+.It Bq Er EEXIST
+A shared memory object exists at
+.Fa path_to ,
+and the
+.Dv SHM_RENAME_NOREPLACE
+flag was provided.
+.El
+.Pp
+.Fn shm_unlink
+fails with these error codes for these conditions:
+.Bl -tag -width Er
+.It Bq Er EFAULT
+The
+.Fa path
+argument points outside the process' allocated address space.
+.It Bq Er ENAMETOOLONG
+The entire pathname exceeds 1023 characters.
+.It Bq Er ENOENT
+The named shared memory object does not exist.
+.It Bq Er EACCES
+The required permissions are denied.
+.Fn shm_unlink
+requires write permission to the shared memory object.
+.El
+.Sh SEE ALSO
+.Xr posixshmcontrol 1 ,
+.Xr close 2 ,
+.Xr fstat 2 ,
+.Xr ftruncate 2 ,
+.Xr ioctl 2 ,
+.Xr mmap 2 ,
+.Xr munmap 2 ,
+.Xr sendfile 2
+.Sh STANDARDS
+The
+.Fn memfd_create
+function is expected to be compatible with the Linux system call of the same
+name.
+.Pp
+The
+.Fn shm_open
+and
+.Fn shm_unlink
+functions are believed to conform to
+.St -p1003.1b-93 .
+.Sh HISTORY
+The
+.Fn memfd_create
+function appeared in
+.Fx 13.0 .
+.Pp
+The
+.Fn shm_open
+and
+.Fn shm_unlink
+functions first appeared in
+.Fx 4.3 .
+The functions were reimplemented as system calls using shared memory objects
+directly rather than files in
+.Fx 8.0 .
+.Pp
+.Fn shm_rename
+first appeared in
+.Fx 13.0
+as a
+.Fx
+extension.
+.Sh AUTHORS
+.An Garrett A. Wollman Aq Mt wollman@FreeBSD.org
+(C library support and this manual page)
+.Pp
+.An Matthew Dillon Aq Mt dillon@FreeBSD.org
+.Pq Dv MAP_NOSYNC
+.Pp
+.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
+.Po
+.Fn shm_rename
+implementation
+.Pc