| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an implementation of inotify_init(), inotify_add_watch(),
inotify_rm_watch(), source-compatible with Linux. This provides
functionality similar to kevent(2)'s EVFILT_VNODE, i.e., it lets
applications monitor filesystem files for accesses. Compared to
inotify, however, EVFILT_VNODE has the limitation of requiring the
application to open the file to be monitored. This means that activity
on a newly created file cannot be monitored reliably, and that a file
descriptor per file in the hierarchy is required.
inotify on the other hand allows a directory and its entries to be
monitored at once. It introduces a new file descriptor type to which
"watches" can be attached; a watch is a pseudo-file descriptor
associated with a file or directory and a set of events to watch for.
When a watched vnode is accessed, a description of the event is queued
to the inotify descriptor, readable with read(2). Events for files in a
watched directory include the file name.
A watched vnode has its usecount bumped, so name cache entries
originating from a watched directory are not evicted. Name cache
entries are used to populate inotify events for files with a link in a
watched directory. In particular, if a file is accessed with, say,
read(2), an IN_ACCESS event will be generated for any watched hard link
of the file.
The inotify_add_watch_at() variant is included so that this
functionality is available in capability mode; plain inotify_add_watch()
is disallowed in capability mode.
When a file in a nullfs mount is watched, the watch is attached to the
lower vnode, such that accesses via either layer generate inotify
events.
Many thanks to Gleb Popov for testing this patch and finding lots of
bugs.
PR: 258010, 215011
Reviewed by: kib
Tested by: arrowd
MFC after: 3 months
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D50315
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Suppose a process has its cwd pointing to a nullfs directory, where the
lower directory is also visible in the jail's filesystem namespace.
Suppose that the lower directory vnode is moved out from under the
nullfs mount. The nullfs vnode still shadows the lower vnode, and
dotdot lookups relative to that directory will instantiate new nullfs
vnodes outside of the nullfs mountpoint, effectively shadowing the lower
filesystem.
This phenomenon can be abused to escape a chroot, since the nullfs
vnodes instantiated by these dotdot lookups defeat the root vnode check
in vfs_lookup(), which uses vnode pointer equality to test for the
process root.
Fix this by extending nullfs and unionfs to perform the same check,
exploiting the fact that the passed componentname is embedded in a
nameidata structure to avoid changing the VOP_LOOKUP interface. That
is, add a flag to indicate that containerof can be used to get the full
nameidata structure, and perform the root vnode check on the lower vnode
when performing a dotdot lookup.
PR: 262180
Reviewed by: olce, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D50418
|
|
|
|
|
|
|
| |
Reviewed by: olce
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D50390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of casting a vop_F_args object to vop_generic_args, use a
vop_F_args.a_gen field when calling null_bypass(). This way we don't
hardcode the vop_generic_args data type in the callers of null_bypass().
Before this change, there were 3 null_bypass() calls using
a vop_F_args.a_gen field and 5 null_bypass() calls using a cast to
vop_generic_args. This change makes all null_bypass() calls consistent
and easier to maintain.
Pointed out by: jrtc27
Reviewed by: kib, oshogbo
Accepted by: oshogbo (mentor)
Differential Revision: https://reviews.freebsd.org/D37359
|
|
|
|
|
|
|
|
|
|
|
| |
There must be no callers of VOP_COPY_FILE_RANGE() except
vn_copy_file_range(), which does enough to find the write-vnodes where
to call the VOP.
Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42603
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is similar to VOP_GETWRITEMOUNT(), and for given vnode vp should
return the lower vnode which would actually handle write to vp.
Flags allow to specify FREAD or FWRITE for benefit of possible unionfs
implementation.
Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D42603
|
|
|
|
|
|
|
|
| |
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.
Sponsored by: Netflix
|
|
|
|
| |
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
|
|
|
|
|
| |
- s/examing/examining/
MFC after: 3 days
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
null_nodeget() needs a valid mount point data, otherwise we might
race and dereference NULL.
Using MBF_NOWAIT makes non-forced unmount non-transparent for
vn_fullpath() over nullfs, but we make no guarantee that fullpath
calculation succeeds anyway.
Reported and tested by: pho
Reviewed by: jah
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35477
|
|
|
|
|
|
|
|
|
|
|
| |
fdvp and fvp vnodes are not locked, and race with reclaim cannot be handled
by the generic bypass routine.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31310
|
|
|
|
|
|
|
| |
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31310
|
|
|
|
|
|
|
|
|
|
|
|
| |
Caller of VOP_LOOKUP() passes dvp locked and expect it locked on return.
Relock of lower vnode in any case could leave upper vnode reclaimed and
unlocked.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31310
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The upper vnode reference to the lower vnode is the only reference that
keeps our pointer to the lower vnode alive. If lower vnode is relocked
during the VOP call, upper vnode might become unlocked and reclaimed,
which invalidates our reference.
Add a transient vhold around VOP call.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31310
|
|
|
|
|
|
|
|
|
|
|
|
| |
The advlock VOP takes the vnode unlocked, which makes the normal bypass
function racy. Same as null_pgcache_read(), nullfs implementation needs
to take interlock and reference lower vnode under it.
Reported and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31310
|
|
|
|
|
|
|
| |
Reivewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31310
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise pages are cleaned some time later when the lower fs decides
that it is time to do it. This mostly manifests itself as delayed
mtime update, e.g. breaking make-like programs.
Reported by: mav
Tested by: mav, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VFS conventions is that VOP_LOOKUP() methods do not need to handle
ISDOTDOT lookups for VV_ROOT vnodes (since they cannot, after all). Nullfs
bypasses VOP_LOOKUP() to lower filesystem, and there, due to user actions,
it is possible to get into situation where
- upper vnode does not have VV_ROOT set
- lower vnode is root
- ISDOTDOT is requested
User just needs to nullfs-mount non-root of some filesystem, and then move
some directory under mount, out of mount, using lower filesystem.
In this case, nullfs cannot do much, but we still should and can ensure
internal kernel structures are consistent. Avoid ISDOTDOT lookup forwarding
when VV_ROOT is set on lower dvp, return somewhat arbitrary ENOENT.
PR: 253593
Reported by: Gregor Koscak <elogin41@gmail.com>
Test by: Patrick Sullivan <sulli00777@gmail.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We might own the last use reference, and then vrele() at the end would
need to take the dvp vnode lock to inactivate, which causes deadlock
with vp. We cannot vrele() dvp from start since this might unlock ldvp.
Handle it by holding the vnode and dropping use ref after lowerfs
VOP_VPUT_PAIR() ended. This effectivaly requires unlock of the vp vnode
after VOP_VPUT_PAIR(), so the call is changed to set unlock_vp to true
unconditionally. This opens more opportunities for vp to be reclaimed,
if lvp is still alive we reinstantiate vp with null_nodeget().
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D29178
|
|
|
|
|
|
|
|
|
| |
Generic bypass cannot understand the rules of liveness for the VOP.
Reviewed by: chs, mckusick
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
|
|
|
|
|
| |
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D27793
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normal bypass expects locked vnode, which is not true for
VOP_READ_PGCACHE(). Ensure liveness of the lower vnode by taking the
upper vnode interlock, which is also taked by null_reclaim() when
setting v_data to NULL.
Reported and tested by: pho
Reviewed by: markj, mjg
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27327
Notes:
svn path=/head/; revision=368077
|
|
|
|
| |
Notes:
svn path=/head/; revision=366869
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If lower VOP relocked the lower vnode, it is possible that nullfs
vnode was reclaimed meantime. In this case nullfs vnode no longer
shares lock with lower vnode, which breaks locking protocol.
Check for the condition and acquire nullfs vnode lock if detected.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=366849
|
|
|
|
| |
Notes:
svn path=/head/; revision=365070
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D25968
Notes:
svn path=/head/; revision=364288
|
|
|
|
|
|
|
| |
Tested by: pho
Notes:
svn path=/head/; revision=364063
|
|
|
|
| |
Notes:
svn path=/head/; revision=357403
|
|
|
|
| |
Notes:
svn path=/head/; revision=357287
|
|
|
|
|
|
|
|
|
|
|
| |
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427
Notes:
svn path=/head/; revision=356337
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This eliminates the following loop from all VOP calls:
while(vop != NULL && \
vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
vop = vop->vop_default;
Reviewed by: jeff
Tesetd by: pho
Differential Revision: https://reviews.freebsd.org/D22738
Notes:
svn path=/head/; revision=355790
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.
v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.
Reviewed by: kib, jeff
Differential Revision: https://reviews.freebsd.org/D22715
Notes:
svn path=/head/; revision=355537
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similarly to the other routine stop taking the interlock for the lower
vnode. The interlock for nullfs vnode is still taken to ensure
stability of ->v_data.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21480
Notes:
svn path=/head/; revision=351651
|
|
|
|
|
|
|
|
|
| |
Reviewed by: kib
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
Notes:
svn path=/head/; revision=351617
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vnode usecount drops to 0 all the time (e.g. for directories during path lookup).
When that happens the kernel would always lock the exclusive lock for the vnode
in order to call vinactive(). This blocks other threads who want to use the vnode
for looukp.
vinactive is very rarely needed and can be tested for without the vnode lock held.
This patch gives filesytems an opportunity to do it, sample total wait time for
tmpfs over 500 minutes of poudriere -j 104:
before: 557563641706 (lockmgr:tmpfs)
after: 46309603301 (lockmgr:tmpfs)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21371
Notes:
svn path=/head/; revision=351584
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some places only take the interlock to hold the vnode, which was a requiremnt
before they started being manipulated with atomics. Use the newly introduced
vholdnz to bump the count.
Reviewed by: kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21358
Notes:
svn path=/head/; revision=351472
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
null_nodeget which follows almost always finds the target vnode in the hash,
avoiding insmntque1 altogether. Should it be needed, it already checks if the
lock needs to be upgraded.
Reviewed by: kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20244
Notes:
svn path=/head/; revision=351360
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both filesystems do no use vnode_pager_dealloc() which would handle
this case otherwise. Nullfs because vnode vm_object handle never
points to nullfs vnode. Tmpfs because its vm_object is never vnode
object at all.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=348698
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kern_execve() locks text vnode exclusive to be able to set and clear
VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0
condition.
The change removes VV_TEXT, replacing it with the condition
v_writecount <= -1, and puts v_writecount under the vnode interlock.
Each text reference decrements v_writecount. To clear the text
reference when the segment is unmapped, it is recorded in the
vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and
v_writecount is incremented on the map entry removal
The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that
v_writecount does not contradict the desired change. vn_writecheck()
is now racy and its use was eliminated everywhere except access.
Atomic check for writeability and increment of v_writecount is
performed by the VOP. vn_truncate() now increments v_writecount
around VOP_SETATTR() call, lack of which is arguably a bug on its own.
nullfs bypasses v_writecount to the lower vnode always, so nullfs
vnode has its own v_writecount correct, and lower vnode gets all
references, since object->handle is always lower vnode.
On the text vnode' vm object dealloc, the v_writecount value is reset
to zero, and deadfs vop_unset_text short-circuit the operation.
Reclamation of lowervp always reclaims all nullfs vnodes referencing
lowervp first, so no stray references are left.
Reviewed by: markj, trasz
Tested by: mjg, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D19923
Notes:
svn path=/head/; revision=347151
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vp vnode is unlocked during the execution of the VOP method and
can be reclaimed, zeroing vp->v_data. Caching allows to use the
correct mount point.
Reported and tested by: pho
PR: 235549
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=343899
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
Notes:
svn path=/head/; revision=326023
|
|
|
|
|
|
|
|
|
|
|
|
| |
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
Notes:
svn path=/head/; revision=314436
|
|
|
|
|
|
|
| |
Sponsored by: Dell EMC Isilon
Notes:
svn path=/head/; revision=308457
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lower vnode is already referenced and nodeget is supposed to consume
the reference. Thus the extra vref call was causing a leak.
Reported by: pho
Reviewed by: kib
MFC after: 1 week
Notes:
svn path=/head/; revision=305659
|
|
|
|
|
|
|
|
|
|
|
| |
The previous code was forcing an expensive walk in vop_stdvptocnp,
which was causing performance issues on highly contended zfs.
No objections: kib
MFC after: 2 weeks
Notes:
svn path=/head/; revision=305504
|
|
|
|
|
|
|
| |
No functional change.
Notes:
svn path=/head/; revision=298806
|
|
|
|
|
|
|
|
|
|
|
|
| |
unlinked. Otherwise the vnode stays cached, causing leak. This is
similar to r292961 for regular files.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=295717
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lower vnode. Otherwise, reference to the lower vnode from the upper
one prevents final unlink.
PR: 178238
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=292961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nullfs vnode shares vnode lock with lower vnode, this allows the
reclamation of nullfs directory vnode in null_lookup(). In this
situation, VOP must return ENOENT.
More, since after the reclamation, the locks of nullfs directory vnode
and lower vnode are no longer shared, the relock of the ldvp does not
restore the correct locking state of dvp, and leaks ldvp lock.
Correct this by unlocking ldvp and locking dvp.
Use cached value of dvp->v_mount.
Reported by: bdrewery
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Notes:
svn path=/head/; revision=269708
|
|
|
|
|
|
|
|
|
|
| |
Assert that dotdot lookup on the root vnode is not performed.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Notes:
svn path=/head/; revision=269187
|