path: root/sys/fs
Commit message | Author | Age | Files | Lines
* VFS_QUOTACTL: Remove needless casts of arg | Brooks Davis | 2020-12-17 | 1 | -7/+7
  The argument is a void * so there's no need to cast it to caddr_t.
  Update documentation to match function declaration.
  Reviewed by: freqlabs
  Obtained from: CheriBSD
  MFC after: 1 week
  Sponsored by: DARPA
  Differential Revision: https://reviews.freebsd.org/D27093
  Notes: svn path=/head/; revision=368744
* In ext2fs, BA_CLRBUF is used in ext2_balloc() not UFS_BALLOC(). | Kirk McKusick | 2020-12-08 | 1 | -1/+1
  Noted by: kib
  MFC after: 3 days
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=368425
* Document the BA_CLRBUF flag used in ufs and ext2fs filesystems. | Kirk McKusick | 2020-12-06 | 1 | -0/+7
  Suggested by: kib
  MFC after: 3 days
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=368396
* Make MAXPHYS tunable. Bump MAXPHYS to 1M. | Konstantin Belousov | 2020-11-28 | 5 | -10/+10
  Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can also be adjusted with the tunable kern.maxphys.
  Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save a significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to the desired high value.
  Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get a private MAXPHYS-like constant; their conversion is out of scope for this work.
  Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis were either submitted by, or based on changes by, mav.
  Suggested by: mav (*)
  Reviewed by: imp, mav, mckusick, scottl (intermediate versions)
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D27225
  Notes: svn path=/head/; revision=368124
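The "+ 1" in the pbuf sizing described in the MAXPHYS commit follows from plain page arithmetic: a buffer of maxphys bytes that does not start on a page boundary touches one more page than an aligned buffer of the same size. A minimal sketch in C, assuming a 4 KiB page and a 1 MiB maxphys; `EX_PAGE_SIZE` and `ex_pages_spanned` are hypothetical names for illustration and are not taken from the kernel sources:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration; the real value is per-architecture. */
#define EX_PAGE_SIZE ((uintptr_t)4096)

/* Count the pages touched by the byte range [addr, addr + len). */
static size_t
ex_pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return (0);
	first = addr / EX_PAGE_SIZE;             /* page holding the first byte */
	last = (addr + len - 1) / EX_PAGE_SIZE;  /* page holding the last byte */
	return ((size_t)(last - first + 1));
}
```

With these assumptions, `ex_pages_spanned(0, 1u << 20)` gives 256 pages, while the same length starting at offset 512 gives 257, which is the extra slot the commit reserves.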
* nullfs: provide custom bypass for VOP_READ_PGCACHE(). | Konstantin Belousov | 2020-11-26 | 1 | -0/+23
  Normal bypass expects locked vnode, which is not true for VOP_READ_PGCACHE(). Ensure liveness of the lower vnode by taking the upper vnode interlock, which is also taken by null_reclaim() when setting v_data to NULL.
  Reported and tested by: pho
  Reviewed by: markj, mjg
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D27327
  Notes: svn path=/head/; revision=368077
* msdosfs: suspend around unmount or remount rw->ro. | Konstantin Belousov | 2020-11-20 | 1 | -11/+32
  This also eliminates unsafe use of VFS_SYNC(MNT_WAIT).
  Requested by: mckusick
  Discussed with: imp
  Tested by: pho (previous version)
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Differential revision: https://reviews.freebsd.org/D27269
  Notes: svn path=/head/; revision=367895
* msdosfs: Add trivial support for suspension. | Konstantin Belousov | 2020-11-20 | 2 | -2/+8
  Tested by: pho (previous version)
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D27269
  Notes: svn path=/head/; revision=367890
* msdosfs(5): Fix debug-only format string | Conrad Meyer | 2020-11-18 | 1 | -1/+1
  No functional change; MSDOSFS_DEBUG isn't a real build option, so this isn't covered by LINT kernels.
  Notes: svn path=/head/; revision=367817
* nfs: Mark unused statistics variable as reserved | Alan Somers | 2020-11-18 | 2 | -22/+15
  FreeBSD's NFS exporter has long exported some unused statistics fields. Revision r366992 removed them from nfsstat. This revision renames those fields in the kernel's exported structures to make it clear to other consumers that they are unused.
  Reported by: emaste
  Reviewed by: emaste
  Sponsored by: Axcient
  Differential Revision: https://reviews.freebsd.org/D27258
  Notes: svn path=/head/; revision=367785
* Split out cwd/root/jail, cmask state from filedesc table | Conrad Meyer | 2020-11-17 | 3 | -4/+4
  No functional change intended.
  Tracking these structures separately for each proc enables future work to correctly emulate clone(2) in linux(4).
  __FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.
  Reviewed by: kib
  Discussed with: markj, mjg
  Differential Revision: https://reviews.freebsd.org/D27037
  Notes: svn path=/head/; revision=367777
* Make it possible to mount a fuse filesystem, such as squashfuse, | Edward Tomasz Napierala | 2020-11-09 | 3 | -0/+17
  from a Linux binary. Should come in handy for AppImages.
  Reviewed by: asomers
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D26959
  Notes: svn path=/head/; revision=367517
* tmpfs: reorder struct tmpfs_node to shrink it by 8 bytes | Mateusz Guzik | 2020-11-05 | 1 | -3/+7
  The reduction (232 -> 224 bytes) allows UMA to fit one more item (17 -> 18) per slab as reported in vm.uma.TMPFS_node.keg.ipers.
  Notes: svn path=/head/; revision=367368
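Shrinking a structure by reordering its members works because the compiler pads each member to its alignment. The following is only a generic sketch of that effect; the two structs are hypothetical and do not reproduce the actual struct tmpfs_node layout:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: small fields on either side of a wide one force padding. */
struct ex_node_padded {
	uint8_t  type;   /* followed by padding up to the alignment of size */
	uint64_t size;
	uint8_t  flags;  /* followed by tail padding */
};

/* Same fields, widest first, so the small ones share a single padded tail. */
struct ex_node_reordered {
	uint64_t size;
	uint8_t  type;
	uint8_t  flags;
};

/* Reordering should never make the structure larger. */
static_assert(sizeof(struct ex_node_reordered) <=
    sizeof(struct ex_node_padded), "reordering grew the struct");
```

On a typical LP64 ABI the first layout occupies 24 bytes and the second 16, but the exact sizes depend on the platform's alignment rules.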
* Add sbuf streaming mode to pseudofs(9), use in linprocfs(5) | Conrad Meyer | 2020-11-05 | 2 | -8/+82
  Add a pseudofs node flag 'PFS_AUTODRAIN', which automatically emits sbuf contents to the caller when the sbuf buffer fills. This is only permissible if the corresponding PFS node fill function can sleep whenever it appends to the sbuf.
  linprocfs' /proc/self/maps node happens to meet this requirement. Streaming out the file as it is composed avoids truncating the output and also avoids preallocating a very large buffer.
  Reviewed by: markj; earlier version: emaste, kib, trasz
  Differential Revision: https://reviews.freebsd.org/D27047
  Notes: svn path=/head/; revision=367362
* tmpfs: change tmpfs dirent zone into a malloc type | Mateusz Guzik | 2020-10-30 | 1 | -7/+3
  It is 64 bytes.
  Notes: svn path=/head/; revision=367165
* cache: add cache_vop_mkdir and rename cache_rename to cache_vop_rename | Mateusz Guzik | 2020-10-30 | 1 | -2/+2
  Notes: svn path=/head/; revision=367162
* Make it possible to mount nullfs(5) using plain mount(8) | Edward Tomasz Napierala | 2020-10-29 | 1 | -1/+3
  instead of mount_nullfs(8).
  Obviously you'd need to force mount(8) to not call mount_nullfs(8) to make use of it.
  Reviewed by: kib
  MFC after: 2 weeks
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D26934
  Notes: svn path=/head/; revision=367137
* Drop "All rights reserved" from all my stuff. This includes | Edward Tomasz Napierala | 2020-10-28 | 5 | -5/+0
  Foundation copyrights, approved by emaste@. It does not include files which carry other people's copyrights; if you're one of those people, feel free to make a similar change.
  Reviewed by: emaste, imp, gbe (manpages)
  Differential Revision: https://reviews.freebsd.org/D26980
  Notes: svn path=/head/; revision=367105
* vfs: drop spurious cache_purge on rmdir | Mateusz Guzik | 2020-10-23 | 1 | -1/+0
  The removed directory gets cache_purged which is sufficient to remove any entries related to the parent.
  Note only tmpfs, ufs and zfs are patched.
  Notes: svn path=/head/; revision=366975
* Fix for loading cuse.ko via rc.d . Make sure we declare the cuse(3) | Hans Petter Selasky | 2020-10-23 | 1 | -0/+18
  module by name and not only by the version information, so that "kldstat -q -m cuse" works.
  Found by: Goran Mekic <meka@tilda.center>
  MFC after: 1 week
  Sponsored by: Mellanox Technologies // NVIDIA Networking
  Notes: svn path=/head/; revision=366961
* vfs: drop the de facto curthread argument from VOP_INACTIVE | Mateusz Guzik | 2020-10-20 | 5 | -9/+8
  Notes: svn path=/head/; revision=366870
* vfs: drop spurious cred argument from VOP_VPTOCNP | Mateusz Guzik | 2020-10-20 | 1 | -2/+1
  Notes: svn path=/head/; revision=366869
* nullfs: ensure correct lock is taken after bypass. | Konstantin Belousov | 2020-10-19 | 1 | -0/+18
  If the lower VOP relocked the lower vnode, it is possible that the nullfs vnode was reclaimed in the meantime. In this case the nullfs vnode no longer shares its lock with the lower vnode, which breaks the locking protocol. Check for the condition and acquire the nullfs vnode lock if detected.
  Reported and tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=366849
* Bump pseudofs size limit from 128kB to 1MB. The old limit could result | Edward Tomasz Napierala | 2020-10-16 | 1 | -2/+4
  in process' memory maps being truncated.
  PR: 237883
  Submitted by: dchagin
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D20575
  Notes: svn path=/head/; revision=366748
* cache: fix vexec panic when racing against vgone | Mateusz Guzik | 2020-10-09 | 1 | -0/+1
  Use of dead_vnodeops would result in a panic instead of returning the intended EOPNOTSUPP error.
  While here make sure to abort, not just try to return a partial result. The former allows the regular lookup to restart from scratch, while the latter makes it stuck with an unusable vnode.
  Reported by: kevans
  Notes: svn path=/head/; revision=366582
* ext2fs: minor typo. | Pedro F. Giffuni | 2020-10-06 | 1 | -1/+1
  Obtained from: Dragonfly
  MFC after: 3 days
  Notes: svn path=/head/; revision=366501
* Modify the NFSv4.2 VOP_COPY_FILE_RANGE() client call to return after one | Rick Macklem | 2020-10-01 | 1 | -14/+12
  successful RPC.
  Without this patch, the NFSv4.2 VOP_COPY_FILE_RANGE() client call would loop until the copy "len" was completed. The problem with doing this is that it might take a considerable time to complete for a large "len". By returning after a single successful Copy RPC that copied some of the data, the application that did the copy_file_range(2) syscall will be more responsive to signal delivery for large "len" copies.
  Notes: svn path=/head/; revision=366303
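Because the call may now return after copying only part of "len", a caller that needs the whole range copied has to loop on the returned byte count. The sketch below shows that loop in generic form; `ex_copy_fn`, `ex_copy_all` and `ex_chunked_copy` are hypothetical names, and the simulated copier merely stands in for a real copy_file_range(2) call so the example stays self-contained:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for one copy call: bytes copied, 0 at EOF, -1 on error. */
typedef ssize_t (*ex_copy_fn)(void *ctx, size_t len);

/* Repeat the copy until len bytes are done, EOF is reached, or an error occurs. */
static ssize_t
ex_copy_all(ex_copy_fn copy_once, void *ctx, size_t len)
{
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = copy_once(ctx, len - done);
		if (n < 0)
			return (-1);   /* propagate the error */
		if (n == 0)
			break;         /* source exhausted */
		done += (size_t)n;     /* a short copy is normal; keep going */
	}
	return ((ssize_t)done);
}

/* Simulated copier: moves at most 4 bytes per call from a finite source. */
static ssize_t
ex_chunked_copy(void *ctx, size_t len)
{
	size_t *remaining = ctx;
	size_t n = len < 4 ? len : 4;

	if (n > *remaining)
		n = *remaining;
	*remaining -= n;
	return ((ssize_t)n);
}
```

Returning to the caller between short copies is also what gives a real application the chance to notice pending signals, which is the responsiveness gain the commit message describes.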
* Bjorn reported a problem where the Linux NFSv4.1 client is | Rick Macklem | 2020-09-26 | 1 | -6/+12
  using an open_to_lock_owner4 when that lock_owner4 has already been created by a previous open_to_lock_owner4. This caused the NFS server to reply NFSERR_INVAL.
  For NFSv4.0, this is an error, although the updated NFSv4.0 RFC7530 notes that the correct error reply is NFSERR_BADSEQID (RFC3530 did not specify what error to return).
  For NFSv4.1, it is not obvious whether or not this is allowed by RFC5661, but the NFSv4.1 server can handle this case without error.
  This patch changes the NFSv4.1 (and NFSv4.2) server to handle multiple uses of the same lock_owner in open_to_lock_owner so that it now correctly interoperates with the Linux NFS client. It also changes the error returned for NFSv4.0 to be NFSERR_BADSEQID.
  Thanks go to Bjorn for diagnosing this and testing the patch. He also provided a program that I could use to reproduce the problem.
  Tested by: bj@cebitec.uni-bielefeld.de (Bjorn Fischer)
  PR: 249567
  Reported by: bj@cebitec.uni-bielefeld.de (Bjorn Fischer)
  MFC after: 3 days
  Notes: svn path=/head/; revision=366189
* fusefs: fix mmap'd writes in direct_io mode | Alan Somers | 2020-09-24 | 1 | -8/+8
  If a FUSE server returns FOPEN_DIRECT_IO in response to FUSE_OPEN, that instructs the kernel to bypass the page cache for that file. This feature is also known by libfuse's name: "direct_io".
  However, when accessing a file via mmap, there is no possible way to bypass the cache completely. This change fixes a deadlock that would happen when an mmap'd write tried to invalidate a portion of the cache, wrongly assuming that a write couldn't possibly come from cache if direct_io were set.
  Arguably, we could instead disable mmap for files with FOPEN_DIRECT_IO set. But allowing it is less likely to cause user complaints, and is more in keeping with the spirit of open(2), where O_DIRECT instructs the kernel to "reduce", not "eliminate" cache effects.
  PR: 247276
  Reported by: trapexit@spawn.link
  Reviewed by: cem
  MFC after: 3 days
  Differential Revision: https://reviews.freebsd.org/D26485
  Notes: svn path=/head/; revision=366121
* udf: Validate the full file entry length | Mark Johnston | 2020-09-22 | 1 | -16/+30
  Otherwise a corrupted file entry containing invalid extended attribute lengths or allocation descriptor lengths can trigger an overflow when the file entry is loaded.
  admbug: 965
  PR: 248613
  Reported by: C Turt <ecturt@gmail.com>
  MFC after: 3 days
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=366005
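Validating a "full length" built from several on-disk fields usually means summing them in a type wide enough that the sum itself cannot wrap. The sketch below illustrates that general technique and is not the code from this commit; `ex_fentry_fits` and its parameters are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if a fixed header plus the extended-attribute and
 * allocation-descriptor areas fit in the avail bytes that were read.
 * The sum is done in 64 bits so three 32-bit inputs cannot wrap.
 */
static bool
ex_fentry_fits(uint32_t hdr_len, uint32_t l_ea, uint32_t l_ad, uint32_t avail)
{
	uint64_t total = (uint64_t)hdr_len + l_ea + l_ad;

	return (total <= avail);
}
```

With a 32-bit sum, lengths such as 0xfffffff0 and 0x20 would wrap to a small value and pass a naive check; the widened sum rejects them.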
* Fix a LOR between the NFS server and server side krpc. | Rick Macklem | 2020-09-18 | 1 | -2/+3
  Recent testing of the NFS-over-TLS code found a LOR between the mutex lock used for sessions and the sleep lock used for server side krpc socket structures in nfsrv_checksequence(). This was fixed by r365789. A similar bug exists in nfsrv_bindconnsess(), where SVC_RELEASE() is called while mutexes are held. This patch applies a fix similar to r365789, moving the SVC_RELEASE() call down to after the mutexes are released.
  This patch fixes the problem by moving the SVC_RELEASE() call in nfsrv_checksequence() down a few lines to below where the mutex is released.
  MFC after: 1 week
  Notes: svn path=/head/; revision=365895
* vm_ooffset_t is now unsigned | Eric van Gyzen | 2020-09-18 | 1 | -5/+7
  vm_ooffset_t is now unsigned. Remove some tests for negative values, or make other adjustments accordingly.
  Reported by: Coverity
  Reviewed by: kib markj
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D26214
  Notes: svn path=/head/; revision=365886
* tmpfs: restore atime updates for reads from page cache. | Konstantin Belousov | 2020-09-16 | 4 | -33/+49
  Split TMPFS_NODE_ACCCESSED bit into dedicated byte that can be updated atomically without locks or (locked) atomics.
  tn_update_getattr() change also contains unrelated bug fix.
  Reported by: lwhsu
  PR: 249362
  Reviewed by: markj (previous version)
  Discussed with: mjg
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26451
  Notes: svn path=/head/; revision=365810
* Style. | Konstantin Belousov | 2020-09-16 | 2 | -19/+24
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 days
  Notes: svn path=/head/; revision=365809
* Fix a LOR between the NFS server and server side krpc. | Rick Macklem | 2020-09-16 | 1 | -2/+3
  Recent testing of the NFS-over-TLS code found a LOR between the mutex lock used for sessions and the sleep lock used for server side krpc socket structures. The code in nfsrv_checksequence() would call SVC_RELEASE() with the mutex held. Normally this is ok, since all that happens is SVC_RELEASE() decrements a reference count. However, if the socket has just been shut down, SVC_RELEASE() drops the reference count to 0 and acquires a sleep lock during destruction of the server side krpc structure.
  This patch fixes the problem by moving the SVC_RELEASE() call in nfsrv_checksequence() down a few lines to below where the mutex is released.
  MFC after: 1 week
  Notes: svn path=/head/; revision=365789
* Add tmpfs page cache read support. | Konstantin Belousov | 2020-09-15 | 4 | -10/+84
  Or it could be explained as lockless (for vnode lock) reads. Reads are performed from the node tn_obj object. Tmpfs regular vnode object lifecycle is significantly different from the normal OBJT_VNODE: it is alive as long as ref_count > 0.
  Ensure liveness of the tmpfs VREG node and consequently v_object inside VOP_READ_PGCACHE by referencing tmpfs node in tmpfs_open(). Provide custom tmpfs fo_close() method on file, to ensure that close is paired with open.
  Add tmpfs VOP_READ_PGCACHE that takes advantage of all tmpfs quirks. It is quite cheap in code size sense to support page-ins for read for tmpfs even if we do not own tmpfs vnode lock. Also, we can handle holes in tmpfs node without additional effort, and do not have a limitation on the transfer size.
  Reviewed by: markj
  Discussed with and benchmarked by: mjg (previous version)
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26346
  Notes: svn path=/head/; revision=365787
* Microoptimize tmpfs node ref/unref by using atomics. | Konstantin Belousov | 2020-09-15 | 3 | -22/+18
  Avoid tmpfs mount and node locks when ref count is greater than zero, which is the case until node is being destroyed by unlink or unmount.
  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26346
  Notes: svn path=/head/; revision=365786
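Taking a reference without a lock while the count is known to be positive is commonly done with a compare-and-swap loop. The following is a userspace sketch using C11 atomics, not the kernel's own primitives; `struct ex_node` and `ex_ref_if_live` are hypothetical names:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ex_node {
	atomic_uint refcount;
};

/*
 * Lock-free fast path: bump the count only while it is still non-zero.
 * Returns false when the count is zero, in which case a caller would
 * fall back to a locked slow path (not shown here).
 */
static bool
ex_ref_if_live(struct ex_node *n)
{
	unsigned int old;

	old = atomic_load_explicit(&n->refcount, memory_order_relaxed);
	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit(&n->refcount, &old,
		    old + 1, memory_order_acquire, memory_order_relaxed))
			return (true);
	}
	return (false);
}
```

The loop never resurrects a node whose count already reached zero, which is why the locked path is only needed around destruction.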
* Do not copy vp into f_data for DTYPE_VNODE files. | Konstantin Belousov | 2020-09-15 | 1 | -1/+1
  The pointer to vnode is already stored into f_vnode, so f_data can be reused. Fix all found users of f_data for DTYPE_VNODE.
  Provide finit_vnode() helper to initialize file of DTYPE_VNODE type.
  Reviewed by: markj (previous version)
  Discussed with: freqlabs (openzfs chunk)
  Tested by: pho (previous version)
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D26346
  Notes: svn path=/head/; revision=365783
* Fix a case where the NFSv4.0 server might crash if delegations are enabled. | Rick Macklem | 2020-09-14 | 1 | -1/+7
  asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
    kdb_backtrace
    vpanic
    panic
    nfsrv_docallback
    nfsrv_checkgetattr
    nfsrvd_getattr
    nfsrvd_dorpc
    nfssvc_program
    svc_run_internal
    svc_thread_start
    fork_exit
    fork_trampoline
  where the panic message was "docallb", which indicates that a callback was attempted when the ClientID is unconfirmed. This would not normally occur, but it is possible to have an unconfirmed ClientID structure with delegation structure(s) chained off it if the client were to issue a SetClientID with the same "id" but different "verifier" after acquiring delegations on the previously confirmed ClientID.
  The bug appears to be that nfsrv_checkgetattr() failed to check for this uncommon case of an unconfirmed ClientID with a delegation structure that no longer refers to a delegation the client knows about.
  This patch adds a check for this case, handling it as if no delegation exists, which is the case when the above occurs. Although difficult to reproduce, this change should avoid the panic().
  PR: 249127
  Reported by: asomers
  Reviewed by: asomers
  MFC after: 1 week
  Differential Revision: https://reviews.freebbsd.org/D26342
  Notes: svn path=/head/; revision=365703
* tmpfs: drop spurious cache_purge in tmpfs_reclaim | Mateusz Guzik | 2020-09-04 | 1 | -2/+0
  vgone already performs it.
  Notes: svn path=/head/; revision=365338
* fs: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 74 | -180/+72
  Notes: svn path=/head/; revision=365070
* Add a check to test for the case of the "tls" option being used with "udp". | Rick Macklem | 2020-09-01 | 1 | -1/+3
  The KERN_TLS only supports TCP, so use of the "tls" option with "udp" will not work. This patch adds a test for this case, so that the mount is not attempted when both "tls" and "udp" are specified.
  Notes: svn path=/head/; revision=365019
* Fix nfsrvd_locku memory leak | Eric van Gyzen | 2020-08-31 | 1 | -0/+2
  Coverity detected memory leak fix.
  Submitted by: bret_ketchum@dell.com
  Reported by: Coverity
  Reviewed by: rmacklem
  MFC after: 2 weeks
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D26231
  Notes: svn path=/head/; revision=364992
* Add flags to enable NFS over TLS to the NFS client and server. | Rick Macklem | 2020-08-27 | 10 | -9/+114
  An Internet Draft titled "Towards Remote Procedure Call Encryption By Default" (soon to be an RFC I think) describes how Sun RPC is to use TLS with NFS as a specific application case. Various commits prepared the NFS code to use KERN_TLS, mainly enabling use of ext_pgs mbufs for large RPC messages. r364475 added TLS support to the kernel RPC.
  This commit (which is the final one for kernel changes required to do NFS over TLS) adds support for three export flags:
    MNT_EXTLS - Requires a TLS connection.
    MNT_EXTLSCERT - Requires a TLS connection where the client presents a valid X.509 certificate during TLS handshake.
    MNT_EXTLSCERTUSER - Requires a TLS connection where the client presents a valid X.509 certificate with "user@domain" in the otherName field of the SubjectAltName during TLS handshake.
  Without these export options, clients are permitted, but not required, to use TLS.
  For the client, a new nmount(2) option called "tls" makes the client do a STARTTLS Null RPC and TLS handshake for all TCP connections used for the mount. The CLSET_TLS client control option is used to indicate to the kernel RPC that this should be done.
  Unless the above export flags or "tls" option is used, semantics should not change for the NFS client nor server.
  For NFS over TLS to work, the userspace daemons rpctlscd(8) { for client } or rpctlssd(8) daemon { for server } must be running.
  Notes: svn path=/head/; revision=364896
* fuse: unbreak after r364814 | Mateusz Guzik | 2020-08-26 | 1 | -1/+2
  Reported by: kevans
  Notes: svn path=/head/; revision=364837
* cache: drop the always curthread argument from reverse lookup routines | Mateusz Guzik | 2020-08-24 | 3 | -3/+3
  Note VOP_VPTOCNP keeps getting it as temporary compatibility for zfs.
  Tested by: pho
  Notes: svn path=/head/; revision=364633
* cache: add cache_rename, a dedicated helper to use for renames | Mateusz Guzik | 2020-08-20 | 1 | -4/+1
  While here make both tmpfs and ufs use it.
  No functional changes.
  Notes: svn path=/head/; revision=364419
* extfs: remove redundant little endian conversion. | Pedro F. Giffuni | 2020-08-20 | 1 | -4/+4
  The XTIME_TO_NSEC macro already calls the htole32(), so there is no need to call it twice. This code does nothing on LE platforms and affects only nanosecond and birthtime fields so it's difficult to notice on regular use.
  Hinted by: DragonFlyBSD (git ae503f8f6f4b9a413932ffd68be029f20c38cab4)
  X-MFC with: r361136
  Notes: svn path=/head/; revision=364416
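The reason a doubled conversion goes unnoticed on little-endian hosts yet matters on big-endian ones is that, on a big-endian host, a host-to-little-endian conversion is a byte swap, and a byte swap applied twice cancels itself. A small portable sketch of that property; `ex_bswap32` is a hypothetical helper standing in for what htole32() does on a big-endian machine:

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the four bytes of v, as htole32() would on a big-endian host. */
static uint32_t
ex_bswap32(uint32_t v)
{
	return ((v >> 24) | ((v >> 8) & 0x0000ff00u) |
	    ((v << 8) & 0x00ff0000u) | (v << 24));
}
```

Here `ex_bswap32(0x11223344u)` yields `0x44332211u`, and swapping that result again returns `0x11223344u`, so converting twice leaves the value effectively unconverted.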
* vfs: remove the always-curthread td argument from VOP_RECLAIM | Mateusz Guzik | 2020-08-19 | 5 | -7/+7
  Notes: svn path=/head/; revision=364373
* vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error | Mateusz Guzik | 2020-08-19 | 4 | -4/+4
  Most consumers pass NULL.
  Notes: svn path=/head/; revision=364372
* Delete the unused "use_ext" argument to nfscl_reqstart(). | Rick Macklem | 2020-08-18 | 5 | -72/+52
  This is a partial revert of r363210, since the "use_ext" argument added by that commit is not actually useful.
  This patch should not result in any semantics change.
  Notes: svn path=/head/; revision=364330