path: root/sys/kern/vnode_if.src
Commit message (Author, Age, Files, Lines)
* vfs: prevent avoidable evictions on mkdir of existing directories (Mateusz Guzik, 2020-10-22, 1 file, -0/+1)
  mkdir -p /foo/bar/baz will mkdir each path component and ignore EEXIST. The NOCACHE lookup will make the namecache unnecessarily evict the existing entry, and then fall back to the fs lookup routine, eventually leading namei to return an error as the directory is already there.

  For invocations like mkdir -p /usr/obj/usr/src/sys/GENERIC/modules this triggers fallbacks to the slowpath for concurrently executing lookups.

  Tested by: pho
  Discussed with: kib
  Notes: svn path=/head/; revision=366950
* vfs: drop the de facto curthread argument from VOP_INACTIVE (Mateusz Guzik, 2020-10-20, 1 file, -1/+0)
  Notes: svn path=/head/; revision=366870
* vfs: drop spurious cred argument from VOP_VPTOCNP (Mateusz Guzik, 2020-10-20, 1 file, -1/+0)
  Notes: svn path=/head/; revision=366869
* Convert page cache read to VOP. (Konstantin Belousov, 2020-09-15, 1 file, -0/+11)
  There are several negative side-effects of not calling into VOP layer at all for page cache reads. The biggest is the missed activation of EVFILT_READ knotes. Also, it allows filesystem to make more fine grained decision to refuse read from page cache.

  Keep VIRF_PGREAD flag around, it is still useful for nullfs, and for asserts.

  Reviewed by: markj
  Tested by: pho
  Discussed with: mjg
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26346
  Notes: svn path=/head/; revision=365785
* vfs: remove the always-curthread td argument from VOP_RECLAIM (Mateusz Guzik, 2020-08-19, 1 file, -1/+0)
  Notes: svn path=/head/; revision=364373
* vfs: drop the thread argument from vfs_fplookup_vexec (Mateusz Guzik, 2020-08-10, 1 file, -1/+0)
  It is guaranteed curthread.

  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=364065
* vfs: add VOP_STAT (Mateusz Guzik, 2020-08-07, 1 file, -0/+11)
  The current scheme of calling VOP_GETATTR adds avoidable overhead.

  An example with tmpfs doing fstat (ops/s):
  before: 7488958
  after: 7913833

  Reviewed by: kib (previous version)
  Differential Revision: https://reviews.freebsd.org/D25910
  Notes: svn path=/head/; revision=364044
* vfs: inline vops if there are no pre/post associated calls (Mateusz Guzik, 2020-07-30, 1 file, -8/+8)
  This removes a level of indirection from frequently used methods, most notably VOP_LOCK1 and VOP_UNLOCK1.

  Tested by: pho
  Notes: svn path=/head/; revision=363708
* vfs: add the infrastructure for lockless lookup (Mateusz Guzik, 2020-07-25, 1 file, -0/+11)
  Reviewed by: kib
  Tested by: pho (in a patchset)
  Differential Revision: https://reviews.freebsd.org/D25577
  Notes: svn path=/head/; revision=363518
* vfs: introduce vnode sequence counters (Mateusz Guzik, 2020-07-25, 1 file, -0/+14)
  Modified on each permission change and link/unlink.

  Reviewed by: kib
  Tested by: pho (in a patchset)
  Differential Revision: https://reviews.freebsd.org/D25573
  Notes: svn path=/head/; revision=363517
* vfs: quiet -Wwrite-strings (Ryan Libby, 2020-02-23, 1 file, -1/+1)
  Reviewed by: kib, markj
  Differential Revision: https://reviews.freebsd.org/D23797
  Notes: svn path=/head/; revision=358257
* vfs: remove the now empty vop_unlock_post (Mateusz Guzik, 2020-02-02, 1 file, -1/+0)
  Notes: svn path=/head/; revision=357404
* vfs: consistently use size_t for buflen around VOP_VPTOCNP (Mateusz Guzik, 2020-02-01, 1 file, -1/+1)
  Notes: svn path=/head/; revision=357383
* vfs: replace VOP_MARKATIME with VOP_MMAPPED (Mateusz Guzik, 2020-02-01, 1 file, -2/+2)
  The routine is only provided by ufs and is only used on mmap and exec.

  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D23422
  Notes: svn path=/head/; revision=357361
* vfs: drop the mostly unused flags argument from VOP_UNLOCK (Mateusz Guzik, 2020-01-03, 1 file, -1/+0)
  Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro.

  Reviewed by: kib (previous version)
  Differential Revision: https://reviews.freebsd.org/D21427
  Notes: svn path=/head/; revision=356337
* vfs: add VOP_NEED_INACTIVE (Mateusz Guzik, 2019-08-28, 1 file, -0/+6)
  vnode usecount drops to 0 all the time (e.g. for directories during path lookup). When that happens the kernel would always lock the exclusive lock for the vnode in order to call vinactive(). This blocks other threads who want to use the vnode for lookup.

  vinactive is very rarely needed and can be tested for without the vnode lock held.

  This patch gives filesystems an opportunity to do it. Sample total wait time for tmpfs over 500 minutes of poudriere -j 104:
  before: 557563641706 (lockmgr:tmpfs)
  after: 46309603301 (lockmgr:tmpfs)

  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D21371
  Notes: svn path=/head/; revision=351584
* Change locking requirements for VOP_UNSET_TEXT(). (Konstantin Belousov, 2019-08-18, 1 file, -1/+1)
  Require the vnode to be locked for the VOP_UNSET_TEXT() call. This will be used by the following bug fix for a tmpfs issue.

  Tested by: sbruno, pho (previous version)
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=351194
* Add kernel support for a Linux compatible copy_file_range(2) syscall. (Rick Macklem, 2019-07-25, 1 file, -0/+16)
  This patch adds support to the kernel for a Linux compatible copy_file_range(2) syscall and the related VOP_COPY_FILE_RANGE(9). This syscall/VOP can be used by the NFSv4.2 client to implement the Copy operation against an NFSv4.2 server to do file copies locally on the server. The vn_generic_copy_file_range() function in this patch can be used by the NFSv4.2 server to implement the Copy operation. Fuse may also be able to use the VOP_COPY_FILE_RANGE() method.

  vn_generic_copy_file_range() attempts to maintain holes in the output file in the range to be copied, but may fail to do so if the input and output files are on different file systems with different _PC_MIN_HOLE_SIZE values.

  Separate commits will be done for the generated syscall files and userland changes. A commit for a compat32 syscall will be done later.

  Reviewed by: kib, asomers (plus comments by brooks, jilles)
  Relnotes: yes
  Differential Revision: https://reviews.freebsd.org/D20584
  Notes: svn path=/head/; revision=350315
* Switch to use shared vnode locks for text files during image activation. (Konstantin Belousov, 2019-05-05, 1 file, -11/+3)
  kern_execve() locks text vnode exclusive to be able to set and clear VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0 condition.

  The change removes VV_TEXT, replacing it with the condition v_writecount <= -1, and puts v_writecount under the vnode interlock. Each text reference decrements v_writecount. To clear the text reference when the segment is unmapped, it is recorded in the vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and v_writecount is incremented on the map entry removal.

  The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that v_writecount does not contradict the desired change. vn_writecheck() is now racy and its use was eliminated everywhere except access. Atomic check for writeability and increment of v_writecount is performed by the VOP. vn_truncate() now increments v_writecount around VOP_SETATTR() call, lack of which is arguably a bug on its own.

  nullfs bypasses v_writecount to the lower vnode always, so nullfs vnode has its own v_writecount correct, and lower vnode gets all references, since object->handle is always lower vnode.

  On the text vnode's vm object dealloc, the v_writecount value is reset to zero, and deadfs vop_unset_text short-circuits the operation. Reclamation of lowervp always reclaims all nullfs vnodes referencing lowervp first, so no stray references are left.

  Reviewed by: markj, trasz
  Tested by: mjg, pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 month
  Differential revision: https://reviews.freebsd.org/D19923
  Notes: svn path=/head/; revision=347151
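The sign convention described in this commit message (positive v_writecount for writers, negative for text references, with each operation refusing a contradictory change) can be modelled with a toy structure. The sketch below is an assumption-laden illustration, not kernel code: `struct toy_vnode` and the `toy_*` functions are hypothetical names, the vnode interlock is omitted, and the choice of ETXTBSY as the error value is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for the single vnode field the commit message discusses. */
struct toy_vnode {
	int writecount;	/* > 0: open for write; < 0: text references */
};

/* Model of a set-text operation: take one text reference unless writers exist. */
int
toy_set_text(struct toy_vnode *vp)
{
	if (vp->writecount > 0)
		return (ETXTBSY);	/* assumed error value for this sketch */
	vp->writecount--;		/* each text reference decrements */
	return (0);
}

/* Model of an unset-text operation: drop one text reference if any is held. */
void
toy_unset_text(struct toy_vnode *vp)
{
	if (vp->writecount < 0)
		vp->writecount++;
}

/* Model of an add-writecount operation: refuse while text references exist. */
int
toy_add_writecount(struct toy_vnode *vp, int inc)
{
	if (vp->writecount < 0)
		return (ETXTBSY);
	/* Dropping more write references than are held would be a caller bug. */
	assert(vp->writecount + inc >= 0);
	vp->writecount += inc;
	return (0);
}
```

The point of the single counter is that the check and the update are one step on one field, which is what lets the real change perform them atomically under the vnode interlock instead of an exclusive vnode lock.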
* Make vop_symlink take a const target path. (Brooks Davis, 2018-11-02, 1 file, -1/+1)
  This will enable callers to take const paths as part of syscall declaration improvements.

  Where doing so is easy and non-disruptive carry the const through implementations. In UFS the value is passed to an interface that must take non-const values. In ZFS, const poisoning would touch code shared with upstream and it's not worth adding diffs.

  Bump __FreeBSD_version for external API consumers.

  Reviewed by: kib (prior version)
  Obtained from: CheriBSD
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D17805
  Notes: svn path=/head/; revision=340055
* Use long for the last argument to VOP_PATHCONF rather than a register_t. (John Baldwin, 2018-01-17, 1 file, -1/+1)
  pathconf(2) and fpathconf(2) both return a long. The kern_[f]pathconf() functions now accept a pointer to a long value rather than modifying td_retval directly. Instead, the system calls explicitly store the returned long value in td_retval[0].

  Requested by: bde
  Reviewed by: kib
  Sponsored by: Chelsio Communications
  Notes: svn path=/head/; revision=328099
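From userland, the long return type mentioned here is visible in how pathconf(2) is called. A minimal sketch follows; `name_max_for` and its -2 failure value are hypothetical conventions of this example, chosen only to separate "no limit" from "lookup failed".

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: query the maximum file name length under `path`.
 * pathconf(2) returns a long. A result of -1 with errno left at 0 means
 * the limit is indeterminate; -1 with errno set means the call failed.
 */
long
name_max_for(const char *path)
{
	long v;

	errno = 0;
	v = pathconf(path, _PC_NAME_MAX);
	if (v == -1 && errno != 0)
		return (-2);	/* sketch-only value for "lookup failed" */
	return (v);
}
```

Resetting errno before the call is needed because -1 alone is ambiguous for this interface.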
* For UNIX sockets make vnode point not to the socket, but to the UNIX PCB, (Gleb Smirnoff, 2017-06-02, 1 file, -2/+2)
  since the latter is the thing that links together VFS and sockets.

  While here, make the union in the struct vnode anonymous.

  Notes: svn path=/head/; revision=319502
* Renumber license clauses in sys/kern to avoid skipping #3 (Ed Maste, 2016-09-15, 1 file, -1/+1)
  Notes: svn path=/head/; revision=305832
* Consistently delimit each vnode description block with two blank (Konstantin Belousov, 2016-08-27, 1 file, -0/+15)
  lines.

  Sponsored by: The FreeBSD Foundation
  MFC after: 3 days
  Notes: svn path=/head/; revision=304916
* Add an implementation of fdatasync(2). (Konstantin Belousov, 2016-08-15, 1 file, -0/+8)
  The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing code with fsync(2). For all filesystems, this commit provides the implementation which delegates the work of VOP_FDATASYNC() to VOP_FSYNC(). This is functionally correct but not efficient.

  This is not yet POSIX-compliant implementation, because it does not ensure that queued AIO requests are completed before returning.

  Reviewed by: mckusick
  Discussed with: avg (ZFS), jhb (AIO part)
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Differential revision: https://reviews.freebsd.org/D7471
  Notes: svn path=/head/; revision=304176
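For context, a caller of fdatasync(2) typically writes its data and then asks for the data (but not necessarily all metadata) to reach stable storage. The following is a minimal userland sketch under that assumption; `write_durably` is a hypothetical helper name, not something introduced by the commit.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the whole buffer, retrying short writes and
 * EINTR, then flush file data with fdatasync(2).
 * Returns 0 on success, -1 with errno set on failure.
 */
int
write_durably(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}
	/* Data-only sync; with the commit above this ends up in VOP_FSYNC(). */
	return (fdatasync(fd));
}
```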
* Remove unused "X" vnode lock assertion, somehow missed in r303743. (Edward Tomasz Napierala, 2016-08-12, 1 file, -1/+0)
  MFC after: 1 month
  Notes: svn path=/head/; revision=304024
* Remove unused - never actually implemented - vnode lock types (Edward Tomasz Napierala, 2016-08-04, 1 file, -3/+0)
  from vnode_if.src.

  MFC after: 1 month
  Notes: svn path=/head/; revision=303743
* Add EVFILT_VNODE open, read and close notifications. (Konstantin Belousov, 2016-05-03, 1 file, -0/+4)
  While there, order EVFILT_VNODE notes descriptions alphabetically.

  Based on submission, and tested by: Vladimir Kondratyev <wulf@cicgroup.ru>
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=298982
* sys/kern: spelling fixes in comments. (Pedro F. Giffuni, 2016-04-29, 1 file, -1/+1)
  No functional change.

  Notes: svn path=/head/; revision=298819
* A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). (Gleb Smirnoff, 2015-12-16, 1 file, -2/+4)
  o With new KPI consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With new interface it is easier to implement code protected from race conditions.

    Such arrayed requests for now should be preceded by a call to vm_pager_haspage() to make sure that request is possible. This could be improved later, making vm_pager_haspage() obsolete.

    Strengthening the promises on the business of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq().

  o New KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by pager, and not controlled by the caller.

    This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault.

  Discussed with: kib, alc, jeff, scottl
  Sponsored by: Nginx, Inc.
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=292373
* kevent(2): Note DOOMED vnodes with NOTE_REVOKE (Conrad Meyer, 2015-09-15, 1 file, -0/+1)
  In poll mode, check for and wake VBAD vnodes. (Vnodes that are VBAD at registration will never be woken by the RECLAIM trigger.)

  Add post-VOP_RECLAIM hook to trigger notes on vnode reclamation. (Vnodes that were fine at registration but are vgoned while being monitored should signal waiters.)

  Reviewed by: kib
  Approved by: markj (mentor)
  Sponsored by: EMC / Isilon Storage Division
  Differential Revision: https://reviews.freebsd.org/D3675
  Notes: svn path=/head/; revision=287831
* Catch up on r271387 and remove unused parameter from (Gleb Smirnoff, 2015-03-30, 1 file, -1/+0)
  VOP_GETPAGES_ASYNC().

  Notes: svn path=/head/; revision=280869
* Merge from projects/sendfile: (Gleb Smirnoff, 2014-11-23, 1 file, -0/+13)
  o Provide a new VOP_GETPAGES_ASYNC(), which works like VOP_GETPAGES(), but doesn't sleep. It returns immediately, and will execute the I/O done handler function that must be supplied as argument.
  o Provide VOP_GETPAGES_ASYNC() for the FFS, which uses vnode_pager.
  o Extend pagertab to support pgo_getpages_async method, and implement this method for vnode_pager.

  Reviewed by: kib
  Tested by: pho
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=274914
* Remove unused arguments for VOP_GETPAGES(), VOP_PUTPAGES(). (Gleb Smirnoff, 2014-09-10, 1 file, -2/+0)
  Notes: svn path=/head/; revision=271387
* If filesystem declares that it supports shared locking for writes, use (Konstantin Belousov, 2013-11-09, 1 file, -1/+1)
  shared vnode lock for VOP_PUTPAGES() as well. The only such filesystem in the tree is ZFS, and it uses vnode_pager_generic_putpages(), which performs the pageout with VOP_WRITE().

  Reviewed by: alc
  Discussed with: avg
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=257899
* remove vop_lookup_pre and vop_lookup_post (Andriy Gapon, 2012-11-22, 1 file, -2/+0)
  Suggested by: kib
  MFC after: 5 days
  Notes: svn path=/head/; revision=243400
* vnode_if: fix locking protocol description for lookup and cachedlookup (Andriy Gapon, 2012-11-19, 1 file, -2/+2)
  Also remove the checks from vop_lookup_pre and vop_lookup_post, which are now completely redundant (before this change they were partially redundant).

  Discussed with: kib
  MFC after: 10 days
  Notes: svn path=/head/; revision=243271
* The r241025 fixed the case when a binary, executed from nullfs mount, (Konstantin Belousov, 2012-11-02, 1 file, -0/+14)
  was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay.

  Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks.

  Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph.

  Tested by: pho
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=242476
* Fix the mis-handling of the VV_TEXT on the nullfs vnodes. (Konstantin Belousov, 2012-09-28, 1 file, -0/+18)
  If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write.

  Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode.

  Tested by: pho (previous version)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=241025
* Introduce VOP_UNP_BIND(), VOP_UNP_CONNECT(), and VOP_UNP_DETACH() (Mikolaj Golub, 2012-02-29, 1 file, -0/+20)
  operations for setting and accessing vnode's v_socket field.

  The operations are necessary to implement proper unix socket handling on layered file systems like nullfs(5).

  This change fixes the long-standing issue with nullfs(5) in that unix sockets did not work between lower and upper layers: if we bound to a socket on the lower layer we could connect only to the lower path; if we bound to the upper layer we could connect only to the upper path. The new behavior is that one can connect to both the lower and the upper paths regardless of which layer path one binds to.

  PR: kern/51583, kern/159663
  Suggested by: kib
  Reviewed by: arch
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=232317
* Add 5 spare VOPs as placeholders to avoid breaking the KBI in the future (John Baldwin, 2012-01-06, 1 file, -1/+26)
  when new VOPs are MFC'd to a branch.

  Reviewed by: kib, bz
  MFC after: 3 days
  Notes: svn path=/head/; revision=229728
* Add post-VOP hooks for VOP_DELETEEXTATTR() and VOP_SETEXTATTR() and use (John Baldwin, 2011-12-23, 1 file, -0/+2)
  these to trigger a NOTE_ATTRIB EVFILT_VNODE kevent when the extended attributes of a vnode are changed.

  Note that OS X already implements this behavior.

  Reviewed by: rwatson
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=228849
* Add the posix_fadvise(2) system call. It is somewhat similar to (John Baldwin, 2011-11-04, 1 file, -0/+9)
  madvise(2) except that it operates on a file descriptor instead of a memory region. It is currently only supported on regular files.

  Just as with madvise(2), the advice given to posix_fadvise(2) can be divided into two types. The first type provide hints about data access patterns and are used in the file read and write routines to modify the I/O flags passed down to VOP_READ() and VOP_WRITE(). These modes are thus filesystem independent. Note that to ease implementation (and since this API is only advisory anyway), only a single non-normal range is allowed per file descriptor.

  The second type of hints are used to hint to the OS that data will or will not be used. These hints are implemented via a new VOP_ADVISE(). A default implementation is provided which does nothing for the WILLNEED request and attempts to move any clean pages to the cache page queue for the DONTNEED request. This latter case required two other changes. First, a new V_CLEANONLY flag was added to vinvalbuf(). This requests vinvalbuf() to only flush clean buffers for the vnode from the buffer cache and to not remove any backing pages from the vnode. This is used to ensure clean pages are not wired into the buffer cache before attempting to move them to the cache page queue. The second change adds a new vm_object_page_cache() method. This method is somewhat similar to vm_object_page_remove() except that instead of freeing each page in the specified range, it attempts to move clean pages to the cache queue if possible.

  To preserve the ABI of struct file, the f_cdevpriv pointer is now reused in a union to point to the currently active advice region if one is present for regular files.

  Reviewed by: jilles, kib, arch@
  Approved by: re (kib)
  MFC after: 1 month
  Notes: svn path=/head/; revision=227070
* Correctly use INOUT for the offset/len parameters to vop_allocate. As (Matthew D Fleming, 2011-05-13, 1 file, -2/+2)
  far as I can tell this is for documentation only at the moment.

  Notes: svn path=/head/; revision=221836
* Allow VOP_ALLOCATE to be iterative, and have kern_posix_fallocate(9) (Matthew D Fleming, 2011-04-19, 1 file, -3/+3)
  drive looping and potentially yielding.

  Requested by: kib
  Notes: svn path=/head/; revision=220846
* Add the posix_fallocate(2) syscall. The default implementation in (Matthew D Fleming, 2011-04-18, 1 file, -0/+10)
  vop_stdallocate() is filesystem agnostic and will run as slow as a read/write loop in userspace; however, it serves to correctly implement the functionality for filesystems that do not implement a VOP_ALLOCATE.

  Note that __FreeBSD_version was already bumped today to 900036 for any ports which would like to use this function.

  Also reserve space in the syscall table for posix_fadvise(2).

  Reviewed by: -arch (previous version)
  Notes: svn path=/head/; revision=220791
* Add VOP_ADVLOCKPURGE so that the file system is called when purging (Zachary Loafman, 2010-05-12, 1 file, -0/+7)
  locks (in the case where the VFS impl isn't using lf_*)

  Submitted by: Matthew Fleming <matthew.fleming@isilon.com>
  Reviewed by: zml, dfr
  Notes: svn path=/head/; revision=208003
* Add explicit struct ucred * argument for VOP_VPTOCNP, to be used by (Konstantin Belousov, 2009-06-21, 1 file, -0/+1)
  vn_open_cred in default implementation. Valid struct ucred is needed for audit and MAC, and curthread credentials may be wrong.

  This further requires modifying the interface of vn_fullpath(9), but it is out of scope of this change.

  Reviewed by: rwatson
  Notes: svn path=/head/; revision=194601
* Stop asserting on exclusive locks in fsync since it can now support (Paul Saab, 2009-06-11, 1 file, -1/+1)
  shared vnode locking on ZFS.

  Reviewed by: jhb
  Notes: svn path=/head/; revision=194019
* Simply shared vnode locking and extend it to also include fsync. (Paul Saab, 2009-06-08, 1 file, -1/+1)
  Also, in vop_write, no longer assert for exclusive locks on the vnode.

  Reviewed by: jhb, kmacy, jeffr
  Notes: svn path=/head/; revision=193762