path: root/sys/fs/tmpfs
Commit message | Author | Age | Files | Lines
* tmpfs: reorder struct tmpfs_node to shrink it by 8 bytes | Mateusz Guzik | 2020-11-05 | 1 | -3/+7
  The reduction (232 -> 224 bytes) allows UMA to fit one more item (17 -> 18) per slab as reported in vm.uma.TMPFS_node.keg.ipers.
  Notes: svn path=/head/; revision=367368
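The size reduction described above comes from field ordering alone. The following is a minimal sketch, not the actual `struct tmpfs_node` layout: the two structs and the helper `items_per_slab` are hypothetical, the sizes assume an LP64 platform such as amd64, and the slab is assumed to be a 4096-byte page with no in-slab header (the real UMA layout may differ).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each 4-byte field is followed by 4 bytes of padding. */
struct node_before {
	uint64_t a;
	uint32_t b;	/* padding inserted after this field */
	uint64_t c;
	uint32_t d;	/* tail padding inserted after this field */
};

/* Same fields, reordered so the two 4-byte fields share one 8-byte slot. */
struct node_after {
	uint64_t a;
	uint64_t c;
	uint32_t b;
	uint32_t d;
};

/* Whole items that fit in a slab, assuming no per-slab header (assumption). */
static size_t
items_per_slab(size_t slab_bytes, size_t item_bytes)
{
	assert(item_bytes != 0);
	return (slab_bytes / item_bytes);
}
```

Under these assumptions, 4096 / 232 rounds down to 17 and 4096 / 224 rounds down to 18, which matches the ipers change quoted in the commit message.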
* tmpfs: change tmpfs dirent zone into a malloc type | Mateusz Guzik | 2020-10-30 | 1 | -7/+3
  It is 64 bytes.
  Notes: svn path=/head/; revision=367165
* cache: add cache_vop_mkdir and rename cache_rename to cache_vop_rename | Mateusz Guzik | 2020-10-30 | 1 | -2/+2
  Notes: svn path=/head/; revision=367162
* vfs: drop spurious cache_purge on rmdir | Mateusz Guzik | 2020-10-23 | 1 | -1/+0
  The removed directory gets cache_purged which is sufficient to remove any entries related to the parent.
  Note only tmpfs, ufs and zfs are patched.
  Notes: svn path=/head/; revision=366975
* vm_ooffset_t is now unsigned | Eric van Gyzen | 2020-09-18 | 1 | -5/+7
  vm_ooffset_t is now unsigned. Remove some tests for negative values, or make other adjustments accordingly.
  Reported by: Coverity
  Reviewed by: kib markj
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D26214
  Notes: svn path=/head/; revision=365886
* tmpfs: restore atime updates for reads from page cache. | Konstantin Belousov | 2020-09-16 | 4 | -33/+49
  Split TMPFS_NODE_ACCESSED bit into dedicated byte that can be updated atomically without locks or (locked) atomics.
  tn_update_getattr() change also contains unrelated bug fix.
  Reported by: lwhsu
  PR: 249362
  Reviewed by: markj (previous version)
  Discussed with: mjg
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26451
  Notes: svn path=/head/; revision=365810
* Style. | Konstantin Belousov | 2020-09-16 | 2 | -19/+24
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 days
  Notes: svn path=/head/; revision=365809
* Add tmpfs page cache read support. | Konstantin Belousov | 2020-09-15 | 4 | -10/+84
  Or it could be explained as lockless (for vnode lock) reads. Reads are performed from the node tn_obj object. Tmpfs regular vnode object lifecycle is significantly different from the normal OBJT_VNODE: it is alive as long as ref_count > 0.
  Ensure liveness of the tmpfs VREG node and consequently v_object inside VOP_READ_PGCACHE by referencing tmpfs node in tmpfs_open(). Provide custom tmpfs fo_close() method on file, to ensure that close is paired with open.
  Add tmpfs VOP_READ_PGCACHE that takes advantage of all tmpfs quirks. It is quite cheap in code size sense to support page-ins for read for tmpfs even if we do not own tmpfs vnode lock. Also, we can handle holes in tmpfs node without additional effort, and do not have a limitation on the transfer size.
  Reviewed by: markj
  Discussed with and benchmarked by: mjg (previous version)
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26346
  Notes: svn path=/head/; revision=365787
* Microoptimize tmpfs node ref/unref by using atomics. | Konstantin Belousov | 2020-09-15 | 3 | -22/+18
  Avoid tmpfs mount and node locks when ref count is greater than zero, which is the case until node is being destroyed by unlink or unmount.
  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26346
  Notes: svn path=/head/; revision=365786
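The lock-free reference pattern described above can be sketched in user space with C11 atomics. This is an illustration only: the kernel change uses its own primitives, and `struct node`, `node_ref_if_live` and `node_unref` are hypothetical names, not tmpfs functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical node carrying only a reference count. */
struct node {
	atomic_uint refcount;
};

/*
 * Take a reference without any lock, but only while the count is nonzero.
 * A zero count means the node is being destroyed and the caller must fall
 * back to a locked slow path (not shown).
 */
static bool
node_ref_if_live(struct node *n)
{
	unsigned int old = atomic_load(&n->refcount);

	while (old != 0) {
		/* On failure the CAS reloads 'old' and the loop rechecks it. */
		if (atomic_compare_exchange_weak(&n->refcount, &old, old + 1))
			return (true);
	}
	return (false);
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool
node_unref(struct node *n)
{
	return (atomic_fetch_sub(&n->refcount, 1) == 1);
}
```

The common case (count already above zero) touches a single atomic word, which is the situation the commit message says holds until unlink or unmount.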
* tmpfs: drop spurious cache_purge in tmpfs_reclaim | Mateusz Guzik | 2020-09-04 | 1 | -2/+0
  vgone already performs it.
  Notes: svn path=/head/; revision=365338
* fs: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 2 | -2/+1
  Notes: svn path=/head/; revision=365070
* cache: add cache_rename, a dedicated helper to use for renames | Mateusz Guzik | 2020-08-20 | 1 | -4/+1
  While here make both tmpfs and ufs use it.
  No functional changes.
  Notes: svn path=/head/; revision=364419
* tmpfs: use vget_prep/vget_finish instead of vget + vnode | Mateusz Guzik | 2020-08-16 | 1 | -4/+3
  Notes: svn path=/head/; revision=364272
* vfs: remove the thread argument from vget | Mateusz Guzik | 2020-08-16 | 2 | -3/+2
  It was already asserted to be curthread.
  Semantic patch:
    @@
    expression arg1, arg2, arg3;
    @@
    - vget(arg1, arg2, arg3)
    + vget(arg1, arg2)
  Notes: svn path=/head/; revision=364271
* vfs: clean MNTK_FPLOOKUP if MNT_UNION is set | Mateusz Guzik | 2020-08-10 | 1 | -1/+8
  Elides checking it during lookup.
  Notes: svn path=/head/; revision=364077
* tmpfs: add VOP_STAT handler | Mateusz Guzik | 2020-08-07 | 2 | -0/+50
  Notes: svn path=/head/; revision=364045
* vfs: remove the obsolete privused argument from vaccess | Mateusz Guzik | 2020-08-05 | 1 | -2/+2
  This brings argument count down to 6, which is passable without the stack on amd64.
  Notes: svn path=/head/; revision=363893
* tmpfs: add support for lockless lookup | Mateusz Guzik | 2020-07-25 | 5 | -9/+76
  Reviewed by: kib
  Tested by: pho (in a patchset)
  Differential Revision: https://reviews.freebsd.org/D25580
  Notes: svn path=/head/; revision=363521
* Call swap_pager_freespace() from vm_object_page_remove(). | Mark Johnston | 2020-06-25 | 1 | -4/+1
  All vm_object_page_remove() callers, except linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when removing a range of pages from an object. The LinuxKPI case appears to be an unintentional omission that could result in leaked swap blocks, so unconditionally free swap space in vm_object_page_remove() to protect against similar bugs in the future.
  Reviewed by: alc, kib
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D25329
  Notes: svn path=/head/; revision=362613
* tmpfs: Preserve alignment of struct fid fields | Ryan Moeller | 2020-06-03 | 3 | -18/+36
  On 64-bit platforms, the two short fields in `struct tmpfs_fid` are padded to the 64-bit alignment of the long field. This pushes the offsets of the subsequent fields by 4 bytes and makes `struct tmpfs_fid` bigger than `struct fid`. `tmpfs_vptofh()` casts a `struct fid *` to `struct tmpfs_fid *`, causing 4 bytes of adjacent memory to be overwritten when the struct fields are set. Through several layers of indirection and embedded structs, the adjacent memory for one particular call to `tmpfs_vptofh()` happens to be the stack canary for `nfsrvd_compound()`. Half of the canary ends up being clobbered, going unnoticed until eventually the stack check fails when `nfsrvd_compound()` returns and a panic is triggered.
  Instead of duplicating fields of `struct fid` in `struct tmpfs_fid`, narrow the struct to cover only the unique fields for tmpfs and assert at compile time that the struct fits in the allotted space. This way we don't have to replicate the offsets of `struct fid` fields, we just use them directly.
  Reviewed by: kib, mav, rmacklem
  Approved by: mav (mentor)
  MFC after: 1 week
  Sponsored by: iXsystems, Inc.
  Differential Revision: https://reviews.freebsd.org/D25077
  Notes: svn path=/head/; revision=361748
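The approach described above (keep only the filesystem-specific fields in a narrow struct, assert at compile time that it fits, and never overlay a differently padded struct on the generic one) can be sketched as follows. All names here are hypothetical stand-ins: `struct generic_fid`, `struct fs_fid_data`, `GENERIC_FID_DATA_SIZE`, `fid_pack` and `fid_unpack` are not the FreeBSD definitions, and the 16-byte payload size is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GENERIC_FID_DATA_SIZE 16	/* hypothetical payload size */

/* Stand-in for the generic file handle: two shorts, then opaque bytes. */
struct generic_fid {
	uint16_t len;
	uint16_t pad;
	char data[GENERIC_FID_DATA_SIZE];
};

/* Only the filesystem-specific fields; generic fields are not duplicated. */
struct fs_fid_data {
	uint64_t id;
	uint64_t gen;
};

/* Fails the build, rather than corrupting memory, if the payload grows. */
_Static_assert(sizeof(struct fs_fid_data) <= GENERIC_FID_DATA_SIZE,
    "fs_fid_data does not fit in generic_fid.data");

/* Copy through memcpy: 'data' sits at offset 4 and is not 8-byte aligned. */
static void
fid_pack(struct generic_fid *fid, uint64_t id, uint64_t gen)
{
	struct fs_fid_data d = { .id = id, .gen = gen };

	fid->len = (uint16_t)sizeof(*fid);
	fid->pad = 0;
	memcpy(fid->data, &d, sizeof(d));
}

static struct fs_fid_data
fid_unpack(const struct generic_fid *fid)
{
	struct fs_fid_data d;

	memcpy(&d, fid->data, sizeof(d));
	return (d);
}
```

Because the specific struct is copied into the opaque byte array instead of being cast over the whole handle, its internal padding can no longer shift writes past the end of the generic structure.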
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 1 | -1/+2
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren't properly marked). Use it in preparation for a general review of all nodes.
  This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.
  Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT.
  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* vfs: make write suspension mandatory | Mateusz Guzik | 2020-02-15 | 1 | -10/+0
  At the time opt-in was introduced adding yourself as a writer was serializing across the mount point. Nowadays it is fully per-cpu, the only impact being a small single-threaded hit on top of what's there right now. Vast majority of the overhead stems from the call to VOP_GETWRITEMOUNT which is done regardless.
  Should someone want to microoptimize this single-threaded they can coalesce looking the mount up with adding a write to it.
  Notes: svn path=/head/; revision=357962
* tmpfs: add nomtime mount option, | Konstantin Belousov | 2020-02-04 | 2 | -8/+15
  which disables tracking mtime updates due to writes through the shared mapped areas backed by tmpfs files. This removes periodic scans which downgrade rw mapped pages to ro to note the writes.
  Suggested by: mjg
  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D23432
  Notes: svn path=/head/; revision=357515
* tmpfs_mount update: simplify, cache the value of VFS_TO_TMPFS() calculation. | Konstantin Belousov | 2020-02-04 | 1 | -4/+5
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=357511
* tmpfs: inline tmpfs_update | Mateusz Guzik | 2020-02-03 | 3 | -9/+21
  It was generated to be just a jumping off point to tmpfs_itimes.
  While here provide a dedicated variant for getattr since we normally don't expect to need the update from that caller.
  Notes: svn path=/head/; revision=357451
* Provide O_SEARCH | Kyle Evans | 2020-02-02 | 1 | -1/+1
  O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping permissions checks on the directory itself after the initial open(). This is close to the semantics we've historically applied for O_EXEC on a directory, which is UB according to POSIX. Conveniently, O_SEARCH on a file is also explicitly undefined behavior according to POSIX, so O_EXEC would be a fine choice. The spec goes on to state that O_SEARCH and O_EXEC need not be distinct values, but they're not defined to be the same value.
  This was pointed out as an incompatibility with other systems that had made its way into libarchive, which had assumed that O_EXEC was an alias for O_SEARCH.
  This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a directory is checked in vn_open_vnode already, so for completeness we add a NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not re-check that when descending in namei.
  [0] https://pubs.opengroup.org/onlinepubs/9699919799/
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23247
  Notes: svn path=/head/; revision=357412
* vfs: consistently use size_t for buflen around VOP_VPTOCNP | Mateusz Guzik | 2020-02-01 | 1 | -2/+2
  Notes: svn path=/head/; revision=357383
* Don't hold the object lock while calling getpages. | Jeff Roberson | 2020-01-19 | 1 | -0/+4
  The vnode pager does not want the object lock held. Moving this out allows further object lock scope reduction in callers. While here add some missing paging in progress calls and an assert. The object handle is now protected explicitly with pip.
  Reviewed by: kib, markj
  Differential Revision: https://reviews.freebsd.org/D23033
  Notes: svn path=/head/; revision=356902
* tmpfs: add missing CTLFLAG_MPSAFE annotation | Mateusz Guzik | 2020-01-15 | 1 | -2/+3
  Notes: svn path=/head/; revision=356744
* vfs: rework vnode list management | Mateusz Guzik | 2020-01-13 | 1 | -1/+1
  The current notion of an active vnode is eliminated.
  Vnodes transition between 0<->1 hold counts all the time and the associated traversal between different lists induces significant scalability problems in certain workloads.
  Introduce a global list containing all allocated vnodes. They get unlinked only when UMA reclaims memory and are only requeued when hold count reaches 0.
  Sample result from an incremental make -s -j 104 bzImage on tmpfs:
    stock:   118.55s user 3649.73s system 7479% cpu 50.382 total
    patched: 122.38s user 1780.45s system 6242% cpu 30.480 total
  Reviewed by: jeff
  Tested by: pho (in a larger patch, previous version)
  Differential Revision: https://reviews.freebsd.org/D22997
  Notes: svn path=/head/; revision=356672
* vfs: drop the mostly unused flags argument from VOP_UNLOCK | Mateusz Guzik | 2020-01-03 | 2 | -16/+16
  Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro.
  Reviewed by: kib (previous version)
  Differential Revision: https://reviews.freebsd.org/D21427
  Notes: svn path=/head/; revision=356337
* Remove page locking for queue operations. | Mark Johnston | 2019-12-28 | 1 | -2/+0
  With the previous reviews, the page lock is no longer required in order to perform queue operations on a page. It is also no longer needed in the page queue scans. This change effectively eliminates remaining uses of the page lock and also the false sharing caused by multiple pages sharing a page lock.
  Reviewed by: jeff
  Tested by: pho
  Sponsored by: Netflix, Intel
  Differential Revision: https://reviews.freebsd.org/D22885
  Notes: svn path=/head/; revision=356157
* Including <sys/tmpfs.h> into non-kernel software leads to a | Doug Moore | 2019-12-19 | 1 | -2/+1
  compilation error because, without _KERNEL defined, the macro TMPFS_VALIDATE_DIR is invoked, but never defined. User-level software that includes sys/tmpfs.h must define _KERNEL to make the definition of TMPFS_VALIDATE_DIR visible.
  This change puts all the inline functions that, directly or indirectly, invoke MPASS into the scope of the _KERNEL block, allowing many user-space includers of <sys/tmpfs.h> to stop defining _KERNEL.
  Reviewed by: alc, kib
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D22874
  Notes: svn path=/head/; revision=355913
* vfs: flatten vop vectors | Mateusz Guzik | 2019-12-16 | 2 | -0/+3
  This eliminates the following loop from all VOP calls:
    while(vop != NULL && \
        vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
            vop = vop->vop_default;
  Reviewed by: jeff
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22738
  Notes: svn path=/head/; revision=355790
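The idea of flattening can be illustrated with a toy operation table: instead of walking a chain of default vectors on every call, missing handlers are copied in once up front. This is a simplified sketch, not the kernel implementation; `struct op_vector`, `op_vector_flatten`, and the sample handlers and vectors are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*op_fn)(int);

/* Hypothetical operation table with a fallback chain. */
struct op_vector {
	const struct op_vector *defaults;	/* fallback vector, or NULL */
	op_fn open;
	op_fn read;
};

/*
 * Resolve every missing handler by walking the chain of defaults once,
 * so later calls can use v->open and v->read directly without a loop.
 */
static void
op_vector_flatten(struct op_vector *v)
{
	const struct op_vector *d;

	for (d = v->defaults; d != NULL; d = d->defaults) {
		if (v->open == NULL)
			v->open = d->open;
		if (v->read == NULL)
			v->read = d->read;
	}
}

/* Sample handlers and vectors used only for illustration. */
static int default_open(int x) { return (x + 1); }
static int default_read(int x) { return (x + 2); }
static int fs_read(int x) { return (x * 10); }

static const struct op_vector default_ops = { NULL, default_open, default_read };
static struct op_vector fs_ops = { &default_ops, NULL, fs_read };
```

After `op_vector_flatten(&fs_ops)`, `fs_ops.open` points at `default_open` while `fs_ops.read` keeps the filesystem's own handler, so each call is a single indirect jump.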
* Add a deferred free mechanism for freeing swap space that does not require | Jeff Roberson | 2019-12-15 | 1 | -2/+1
  an exclusive object lock.
  Previously swap space was freed on a best effort basis when a page that had valid swap was dirtied, thus invalidating the swap copy. This may be done inconsistently and requires the object lock which is not always convenient.
  Instead, track when swap space is present. The first dirty is responsible for deleting space or setting PGA_SWAP_FREE which will trigger background scans to free the swap space.
  Simplify the locking in vm_fault_dirty() now that we can reliably identify the first dirty.
  Discussed with: alc, kib, markj
  Differential Revision: https://reviews.freebsd.org/D22654
  Notes: svn path=/head/; revision=355765
* vfs: locking primitives which elide ->v_vnlock and shared locking disablement | Mateusz Guzik | 2019-12-11 | 2 | -1/+4
  Both of these features are not needed by many consumers and result in avoidable reads which in turn puts them on profiles due to cache-line ping ponging. On top of that the current lockmgr entry point is slower than necessary single-threaded. As an attempted clean up preparing for other changes, provide new routines which don't support any of the aforementioned features.
  With these patches in place vop_stdlock and vop_stdunlock disappear from flamegraphs during -j 104 buildkernel.
  Reviewed by: jeff (previous version)
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22665
  Notes: svn path=/head/; revision=355633
* vfs: introduce v_irflag and make v_type smaller | Mateusz Guzik | 2019-12-08 | 2 | -4/+4
  The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time.
  v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED.
  Reviewed by: kib, jeff
  Differential Revision: https://reviews.freebsd.org/D22715
  Notes: svn path=/head/; revision=355537
* Stop using per-mount tmpfs zones. | Konstantin Belousov | 2019-12-05 | 3 | -65/+89
  Requested and reviewed by: jeff
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D22643
  Notes: svn path=/head/; revision=355407
* tmpfs: use proper macros for permission values in tmpfs_access | Mateusz Guzik | 2019-12-01 | 1 | -2/+2
  While here group them in one var to prevent overly long lines. Perhaps a general macro of the same sort should be introduced.
  Requested by: kib
  Notes: svn path=/head/; revision=355255
* tmpfs: add fast path to tmpfs_access for common case lookup | Mateusz Guzik | 2019-11-30 | 1 | -0/+6
  VEXEC makes up the vast majority of all calls and almost all targets have at least 0111.
  Notes: svn path=/head/; revision=355227
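The fast path described above amounts to a cheap check performed before the full permission evaluation. A minimal user-space sketch follows; `ACC_EXEC`, `ACC_WRITE` and `access_fast_path` are hypothetical names rather than the kernel's, while the `S_IX*` macros come from POSIX `<sys/stat.h>`.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical access-request bits, standing in for the kernel's flags. */
#define ACC_EXEC	0x1u
#define ACC_WRITE	0x2u

/*
 * Return true when the request can be granted without looking at the
 * caller's credentials: only execute/search is requested and the target
 * grants it to owner, group and others alike (mode includes 0111).
 * A false result means "take the full check", not "deny".
 */
static bool
access_fast_path(unsigned int mode, unsigned int accmode)
{
	const unsigned int all_x = S_IXUSR | S_IXGRP | S_IXOTH;

	return (accmode == ACC_EXEC && (mode & all_x) == all_x);
}
```

For a directory with mode 0755 and an execute-only request the function returns true, so a path lookup through it never reaches the slower general check.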
* tmpfs: resolve deadlock between rename and unmount. | Konstantin Belousov | 2019-11-24 | 1 | -13/+1
  Top-level kern_renameat() increases the writecount on the mount point, which, together with tmpfs unmount suspending the mount, already ensures that unmount cannot proceed while rename unlocks and relocks all operated vnodes.
  Remove vfs_busy() call from tmpfs_rename() which was done while holding a vnode lock, creating the deadlock. The only intent of the busy operation seems to be the prevention of unmount, which is already ensured.
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=355061
* Simplify anonymous memory handling with an OBJ_ANON flag. This eliminates | Jeff Roberson | 2019-11-19 | 1 | -2/+1
  redundant complicated checks and additional locking required only for anonymous memory. Introduce vm_object_allocate_anon() to create these objects. DEFAULT and SWAP objects now have the correct settings for non-anonymous consumers and so individual consumers need not modify the default flags to create super-pages and avoid ONEMAPPING/NOSPLIT.
  Reviewed by: alc, dougm, kib, markj
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22119
  Notes: svn path=/head/; revision=354869
* Replace OBJ_MIGHTBEDIRTY with a system using atomics. Remove the TMPFS_DIRTY | Jeff Roberson | 2019-10-29 | 3 | -5/+5
  flag and use the same system.
  This enables further fault locking improvements by allowing more faults to proceed with a shared lock.
  Reviewed by: kib
  Tested by: pho
  Differential Revision: https://reviews.freebsd.org/D22116
  Notes: svn path=/head/; revision=354158
* (4/6) Protect page valid with the busy lock. | Jeff Roberson | 2019-10-15 | 1 | -1/+1
  Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments.
  Reviewed by: kib, markj
  Tested by: pho
  Sponsored by: Netflix, Intel
  Differential Revision: https://reviews.freebsd.org/D21594
  Notes: svn path=/head/; revision=353539
* (1/6) Replace busy checks with acquires where it is trivial to do so. | Jeff Roberson | 2019-10-15 | 1 | -4/+2
  This is the first in a series of patches that promotes the page busy field to a first class lock that no longer requires the object lock for consistency.
  Reviewed by: kib, markj
  Tested by: pho
  Sponsored by: Netflix, Intel
  Differential Revision: https://reviews.freebsd.org/D21548
  Notes: svn path=/head/; revision=353535
* tmpfs: use MNTK_NOMSYNC | Mateusz Guzik | 2019-10-13 | 1 | -1/+1
  Reviewed by: kib
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D22009
  Notes: svn path=/head/; revision=353474
* Define macro VM_MAP_ENTRY_FOREACH for enumerating the entries in a vm_map. | Doug Moore | 2019-10-08 | 1 | -2/+1
  In case the implementation ever changes from using a chain of next pointers, then changing the macro definition will be necessary, but changing all the files that iterate over vm_map entries will not.
  Drop a counter in vm_object.c that would have an effect only if the vm_map entry count was wrong.
  Discussed with: alc
  Reviewed by: markj
  Tested by: pho (earlier version)
  Differential Revision: https://reviews.freebsd.org/D21882
  Notes: svn path=/head/; revision=353298
* tmpfs: add root vnode caching | Mateusz Guzik | 2019-10-06 | 1 | -1/+2
  See r353150.
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21646
  Notes: svn path=/head/; revision=353153
* tmpfs_readdir(): unlock the locked node. | Konstantin Belousov | 2019-10-03 | 1 | -5/+7
  During readdir() we guarantee that the tn_dir.tn_parent does not go away, but it might be replaced by a parallel rename. Read tn_parent only once, then use the cached value.
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=353065
* tmpfs_rename: style. | Konstantin Belousov | 2019-10-03 | 1 | -34/+63
  Reformat multi-line comments to follow style. Also fix some typos.
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=353064