path: root/sys/dev/md
Commit message | Author | Age | Files | Lines
* md: Fix a read-after-free in BIO_GETATTR handling | Mark Johnston | 2020-12-23 | 1 | -33/+32
| g_handleattr_int() consumes the bio if the attribute matches, so when we check bp->bio_cmd bp may have been freed. Move GETATTR handling to a separate function to avoid the problem. We do not need to set bio_completed for such bios, g_handleattr_int() will handle it.
|
| Also remove the setting of bio_resid before the devstat_end_transaction_bio() call. All of the md(4) bio handlers set bio_resid already.
|
| Reported by: KASAN
| Reviewed by: kib
| MFC after: 2 weeks
| Sponsored by: The FreeBSD Foundation
| Differential Revision: https://reviews.freebsd.org/D27724
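The ownership rule behind this fix can be sketched in userland C. This is only a model under stated assumptions: `handleattr()` is a hypothetical stand-in for g_handleattr_int() (it frees the bio on a match), and `struct bio`, `bio_new()`, `bio_done()`, `md_getattr()`, and `md_handle()` are illustrative names, not the driver's code. The point is that the command is read before the handler runs, and the bio is never touched once the handler reports that it consumed it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum bio_cmd { BIO_READ, BIO_GETATTR };

/* Illustrative model of a bio; the real struct bio has many more fields. */
struct bio {
	enum bio_cmd bio_cmd;
	const char *bio_attribute;
};

/* Counts completions so a caller can observe who released each bio. */
static int bios_freed;

static struct bio *
bio_new(enum bio_cmd cmd, const char *attr)
{
	struct bio *bp = malloc(sizeof(*bp));

	assert(bp != NULL);
	bp->bio_cmd = cmd;
	bp->bio_attribute = attr;
	return (bp);
}

static void
bio_done(struct bio *bp)
{
	free(bp);
	bios_freed++;
}

/*
 * Hypothetical stand-in for g_handleattr_int(): when the attribute
 * matches, the bio is completed (freed here) and true is returned.
 */
static bool
handleattr(struct bio *bp, const char *name)
{
	if (strcmp(bp->bio_attribute, name) != 0)
		return (false);
	bio_done(bp);
	return (true);
}

/* GETATTR handling lives in its own function, as the commit describes. */
static bool
md_getattr(struct bio *bp)
{
	return (handleattr(bp, "MNT::verified"));
}

/* Returns 0 on success and -1 for an unsupported attribute. */
static int
md_handle(struct bio *bp)
{
	if (bp->bio_cmd == BIO_GETATTR) {	/* read the command first */
		if (md_getattr(bp))
			return (0);	/* bp is gone; do not dereference it */
		bio_done(bp);		/* no match: we still own bp */
		return (-1);
	}
	bio_done(bp);			/* ordinary I/O path, elided */
	return (0);
}
```

In this model each bio is released exactly once, whichever path it takes.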
* Make MAXPHYS tunable. Bump MAXPHYS to 1M. | Konstantin Belousov | 2020-11-28 | 1 | -2/+3
| Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
|
| Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value.
|
| Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except the place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get private MAXPHYS-like constant; their conversion is out of scope for this work.
|
| Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, were either submitted by, or based on changes by mav.
|
| Suggested by: mav (*)
| Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| Differential revision: https://reviews.freebsd.org/D27225
| Notes: svn path=/head/; revision=368124
* Fix a typo in a license comment | Mateusz Piotrowski | 2020-11-12 | 1 | -1/+1
| Approved by: kaktus (src)
| Notes: svn path=/head/; revision=367618
* Use a template assembly file to generate the embedded MFS. | John Baldwin | 2020-10-20 | 1 | -0/+46
| This uses the .incbin directive to pull in the MFS image contents. Using assembly directly ensures that symbols can be defined with the name and properties (such as .size) desired without having to rename symbols, etc. via a second objcopy invocation. Since it is compiled by the C compiler driver, it also avoids the need for all of the EMBEDFS* make variables.
|
| Suggested by: jrtc27
| Reviewed by: kib, markj
| Obtained from: CheriBSD
| MFC after: 2 weeks
| Sponsored by: DARPA
| Differential Revision: https://reviews.freebsd.org/D26781
| Notes: svn path=/head/; revision=366897
* md: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -3/+0
| Notes: svn path=/head/; revision=365211
* Remove some redundant assignments and computations. | Mark Johnston | 2020-06-28 | 1 | -2/+2
| Reported by: alc
| Reviewed by: alc, kib
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
| Differential Revision: https://reviews.freebsd.org/D25400
| Notes: svn path=/head/; revision=362739
* Call swap_pager_freespace() from vm_object_page_remove(). | Mark Johnston | 2020-06-25 | 1 | -2/+0
| All vm_object_page_remove() callers, except linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when removing a range of pages from an object. The LinuxKPI case appears to be an unintentional omission that could result in leaked swap blocks, so unconditionally free swap space in vm_object_page_remove() to protect against similar bugs in the future.
|
| Reviewed by: alc, kib
| Tested by: pho
| Sponsored by: The FreeBSD Foundation
| Differential Revision: https://reviews.freebsd.org/D25329
| Notes: svn path=/head/; revision=362613
* Convert a few trivial consumers to the new unlocked grab API. | Jeff Roberson | 2020-02-28 | 1 | -3/+1
| Reviewed by: kib, markj
| Differential Revision: https://reviews.freebsd.org/D23847
| Notes: svn path=/head/; revision=358447
* Don't hold the object lock while calling getpages. | Jeff Roberson | 2020-01-19 | 1 | -2/+9
| The vnode pager does not want the object lock held. Moving this out allows further object lock scope reduction in callers. While here add some missing paging in progress calls and an assert. The object handle is now protected explicitly with pip.
|
| Reviewed by: kib, markj
| Differential Revision: https://reviews.freebsd.org/D23033
| Notes: svn path=/head/; revision=356902
* vfs: drop the mostly unused flags argument from VOP_UNLOCK | Mateusz Guzik | 2020-01-03 | 1 | -7/+7
| Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro.
|
| Reviewed by: kib (previous version)
| Differential Revision: https://reviews.freebsd.org/D21427
| Notes: svn path=/head/; revision=356337
* Fix a page leak in the md(4) swap I/O path. | Mark Johnston | 2020-01-03 | 1 | -1/+10
| r356157 removed a vm_page_activate() call, but this is required to ensure that pages end up in the page queues in the first place. Restore the pre-r356157 logic. Now, without the page lock, the vm_page_active() check is racy, but this race is harmless.
|
| Reviewed by: alc, kib
| Reported and tested by: pho
| Differential Revision: https://reviews.freebsd.org/D23024
| Notes: svn path=/head/; revision=356326
* Avoid duplicate I/O statistics accounting. | Alexander Motin | 2020-01-03 | 1 | -2/+5
| Like geom_disk, free the provider statistics structure and point GEOM toward local statistics. This saves some CPU time.
|
| MFC after: 2 weeks
| Notes: svn path=/head/; revision=356315
* Use atomic for start_count in devstat_start_transaction(). | Alexander Motin | 2019-12-30 | 1 | -6/+0
| Combined with the earlier nstart/nend removal, this allows removing several locks from the request path of GEOM and a few other places. It would be cool if we had more SMP-friendly statistics, but this helps too.
|
| Sponsored by: iXsystems, Inc.
| Notes: svn path=/head/; revision=356200
* Remove page locking for queue operations. | Mark Johnston | 2019-12-28 | 1 | -6/+1
| With the previous reviews, the page lock is no longer required in order to perform queue operations on a page. It is also no longer needed in the page queue scans. This change effectively eliminates remaining uses of the page lock and also the false sharing caused by multiple pages sharing a page lock.
|
| Reviewed by: jeff
| Tested by: pho
| Sponsored by: Netflix, Intel
| Differential Revision: https://reviews.freebsd.org/D22885
| Notes: svn path=/head/; revision=356157
* Add a deferred free mechanism for freeing swap space that does not require | Jeff Roberson | 2019-12-15 | 1 | -8/+2
| an exclusive object lock.
|
| Previously swap space was freed on a best effort basis when a page that had valid swap was dirtied, thus invalidating the swap copy. This may be done inconsistently and requires the object lock which is not always convenient.
|
| Instead, track when swap space is present. The first dirty is responsible for deleting space or setting PGA_SWAP_FREE which will trigger background scans to free the swap space.
|
| Simplify the locking in vm_fault_dirty() now that we can reliably identify the first dirty.
|
| Discussed with: alc, kib, markj
| Differential Revision: https://reviews.freebsd.org/D22654
| Notes: svn path=/head/; revision=355765
* vfs: introduce v_irflag and make v_type smaller | Mateusz Guzik | 2019-12-08 | 1 | -1/+1
| The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time.
|
| v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED.
|
| Reviewed by: kib, jeff
| Differential Revision: https://reviews.freebsd.org/D22715
| Notes: svn path=/head/; revision=355537
* Fix a few places that free a page from an object without busy held. This is | Jeff Roberson | 2019-12-02 | 1 | -13/+5
| tightening constraints on busy as a precursor to lockless page lookup and should largely be a NOP for these cases.
|
| Reviewed by: alc, kib, markj
| Differential Revision: https://reviews.freebsd.org/D22611
| Notes: svn path=/head/; revision=355314
* (4/6) Protect page valid with the busy lock. | Jeff Roberson | 2019-10-15 | 1 | -5/+5
| Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments.
|
| Reviewed by: kib, markj
| Tested by: pho
| Sponsored by: Netflix, Intel
| Differential Revision: https://reviews.freebsd.org/D21594
| Notes: svn path=/head/; revision=353539
* Change synchronization rules for vm_page reference counting. | Mark Johnston | 2019-09-09 | 1 | -2/+0
| There are several mechanisms by which a vm_page reference is held, preventing the page from being freed back to the page allocator. In particular, holding the page's object lock is sufficient to prevent the page from being freed; holding the busy lock or a wiring is sufficient as well. These references are protected by the page lock, which must therefore be acquired for many per-page operations. This results in false sharing since the page locks are external to the vm_page structures themselves and each lock protects multiple structures.
|
| Transition to using an atomically updated per-page reference counter. The object's reference is counted using a flag bit in the counter. A second flag bit is used to atomically block new references via pmap_extract_and_hold() while removing managed mappings of a page. Thus, the reference count of a page is guaranteed not to increase if the page is unbusied, unmapped, and the object's write lock is held. As a consequence of this, the page lock no longer protects a page's identity; operations which move pages between objects are now synchronized solely by the objects' locks.
|
| The vm_page_wire() and vm_page_unwire() KPIs are changed. The former requires that either the object lock or the busy lock is held. The latter no longer has a return value and may free the page if it releases the last reference to that page. vm_page_unwire_noq() behaves the same as before; the caller is responsible for checking its return value and freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is introduced for use in pmap_extract_and_hold(). It fails if the page is concurrently being unmapped, typically triggering a fallback to the fault handler.
|
| vm_page_wire() no longer requires the page lock and vm_page_unwire() now internally acquires the page lock when releasing the last wiring of a page (since the page lock still protects a page's queue state). In particular, synchronization details are no longer leaked into the caller.
|
| The change excises the page lock from several frequently executed code paths. In particular, vm_object_terminate() no longer bounces between page locks as it releases an object's pages, and direct I/O and sendfile(SF_NOCACHE) completions no longer require the page lock. In these latter cases we now get linear scalability in the common scenario where different threads are operating on different files.
|
| __FreeBSD_version is bumped. The DRM ports have been updated to accommodate the KPI changes.
|
| Reviewed by: jeff (earlier version)
| Tested by: gallatin (earlier version), pho
| Sponsored by: Netflix
| Differential Revision: https://reviews.freebsd.org/D20486
| Notes: svn path=/head/; revision=352110
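The counter scheme described in this commit can be illustrated with C11 atomics. The following is a simplified userland model, not the kernel's implementation: `struct page`, the `REF_*` constants, and the `page_*()` functions are hypothetical names, and the bit positions are assumptions chosen for the sketch. It shows one flag bit standing for the object's reference, a second flag bit that makes a mapped wiring attempt fail, and an unwire that frees the page only when the whole counter word drops to zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout for this sketch. */
#define	REF_BLOCKED	0x40000000u	/* new mapped wirings are refused */
#define	REF_OBJREF	0x80000000u	/* the owning object's reference */
#define	REF_COUNT(c)	((c) & ~(REF_BLOCKED | REF_OBJREF))

struct page {
	_Atomic uint32_t ref_count;
	bool freed;		/* stands in for returning the page */
};

static void
page_init(struct page *m, bool in_object)
{
	atomic_init(&m->ref_count, in_object ? REF_OBJREF : 0);
	m->freed = false;
}

/* Model of a wiring taken via a mapping: fails while the flag is set. */
static bool
page_wire_mapped(struct page *m)
{
	uint32_t old = atomic_load(&m->ref_count);

	do {
		if ((old & REF_BLOCKED) != 0)
			return (false);
	} while (!atomic_compare_exchange_weak(&m->ref_count, &old, old + 1));
	return (true);
}

/* Set the flag that blocks new mapped wirings (e.g. during unmapping). */
static void
page_block(struct page *m)
{
	atomic_fetch_or(&m->ref_count, REF_BLOCKED);
}

/* Model of unwire: no return value; frees on the last reference. */
static void
page_unwire(struct page *m)
{
	uint32_t old = atomic_fetch_sub(&m->ref_count, 1);

	assert(REF_COUNT(old) > 0);
	if (old == 1)		/* no wirings and no flag bits remain */
		m->freed = true;
}
```

A page that still carries the object's reference bit survives its last unwire in this model, while a page with no object reference is released by it.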
* md(4): remove the unused and unusable MDIOCLIST ioctl. | Brooks Davis | 2019-08-16 | 1 | -53/+2
| It is unused, the ABI was broken in r322969, and it is broken by design (more than MDNPAD md devices can exist and there is no way to retrieve them with this interface). mdconfig(8) was converted to use libgeom to obtain this information in r157160 and any other consumers of MDIOCLIST should likewise be converted.
|
| Reviewed by: emaste
| Relnotes: yes
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D18936
| Notes: svn path=/head/; revision=351132
* When using the force option to shut down a memory-disk device, | Kirk McKusick | 2019-03-31 | 1 | -4/+20
| I/O operations already in its queue were not being properly drained. The GEOM framework does the queue draining, but the device driver needs to wait for the draining to happen. The waiting is done by adding a g_md_providergone() function to wait for the I/O operations to finish up.
|
| It is likely that every GEOM provider that implements orphaning attached GEOM consumers needs to use the "providergone" mechanism for this same reason, but some of them do not do so. Apparently Kenneth Merry (ken@) added the drain for just such races, but he missed adding it to some of the device drivers that needed it.
|
| Submitted by: Chuck Silvers
| Reviewed by: imp
| Tested by: Chuck Silvers
| MFC after: 1 week
| Sponsored by: Netflix
| Notes: svn path=/head/; revision=345758
* Allocate pager bufs from UMA instead of 80-ish mutex protected linked list. | Gleb Smirnoff | 2019-01-15 | 1 | -4/+5
| o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many pbufs we are going to have set. In various subsystems that are going to utilize pbufs create private zones via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(), and sets a limit on created zone. After startup preallocate pbufs according to requirements of all pbuf zones.
|
|   Subsystems that used to have a private limit with old allocator now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS, swap, vnode pager.
|
|   The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9), aio(4). They should have their private limits, but changing that is out of scope of this commit.
|
| o Fetch tunable value of kern.nswbuf from init_param2() and while here move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only this option. Default values aren't touched by this commit, but they probably should be reviewed with respect to modern hardware.
|
| This change removes a tight bottleneck from sendfile(2) operation, that uses pbufs in vnode pager. Other pagers also would benefit from faster allocation.
|
| Together with: gallatin
| Tested by: pho
| Notes: svn path=/head/; revision=343030
* Fix devstat on md devices, second attempt. r341765 depends on | Bruce Evans | 2018-12-22 | 1 | -2/+12
| g_io_deliver() finishing initialization of the bio, but g_io_deliver() actually destroys the bio. INVARIANTS makes the bug obvious by overwriting the bio with garbage. Restore the old order for calling devstat (except don't restore not calling it for the error case), and translate to the devstat KPI so that this order works.
|
| Reviewed by: kib
| Notes: svn path=/head/; revision=342375
* Use VOP_ADVISE() with POSIX_FADV_DONTNEED instead of IO_DIRECT to | Bruce Evans | 2018-12-21 | 1 | -7/+13
| implement not double-caching for reads from vnode-backed md devices. Use VOP_ADVISE() similarly instead of !IO_DIRECT unsimilarly for writes. Add a "cache" option to mdconfig to allow changing the default of not caching. This depends on a recent commit to fix VOP_ADVISE().
|
| A previous version had optimizations for sequential i/o's (merge the i/o's and only uncache for discontiguous i/o's and for full blocks), but optimizations and knowledge of block boundaries belong in VOP_ADVISE(). Read-ahead should also be handled better, by supporting it in md and discarding it in VOP_ADVISE().
|
| POSIX_FADV_DONTNEED is ignored by zfs, but so is IO_DIRECT.
|
| POSIX_FADV_DONTNEED works better than IO_DIRECT if it is not ignored, since it only discards from the buffer cache immediately, while IO_DIRECT also discards from the page cache immediately.
|
| IO_DIRECT was not used for writes since it was claimed to be too slow, but most of the slowness for writes is from doing them synchronously by default. Non-synchronous writes still deadlock in many cases.
|
| IO_DIRECT only has a special implementation for ffs reads with DIRECTIO configured. Otherwise, if it is not ignored then it uses the buffer and page caches normally except for discarding everything after each i/o, and then it has much the same overheads as POSIX_FADV_DONTNEED. The overheads for reading with ffs and DIRECTIO were similar in tests of md.
|
| Reviewed by: kib
| Notes: svn path=/head/; revision=342297
* Fix devstat on md devices. | Bruce Evans | 2018-12-09 | 1 | -2/+2
| devstat_end_transaction() was called before the i/o was actually ended (by delivering it to GEOM), so at least the i/o length was messed up. It was always recorded as 0, so the average transaction size and the average transfer rate was always displayed as 0.
|
| devstat_end_transaction() was not called at all for the error case, so there were sometimes multiple starts per end. I didn't observe this in practice and don't know if it did much damage. I think it extended the length of the i/o to the next transaction.
|
| Reviewed by: kib
| Notes: svn path=/head/; revision=341765
* md: use prestaged mfs_root | Breno Leitao | 2018-06-07 | 1 | -0/+8
| On PowerNV systems, the rootfs is passed through kexec, which loads the rootfs into memory and sets two fdt entries to describe where the file is located in the memory.
|
| I need to pass this memory region to the md device as a mfs_root, but the current md driver does not support two things:
|
| * Just getting a pointer from an external (bootloader) memory. If I need to workaround it, I would need to declare a static array and memcopy from this external memory to this static variable.
|
| * The size of the image. The usage of mfs_root_end, which is not a pointer, seems to be not possible for this prestaged scenario.
|
| This patch simply adds a new way to load mfs_root from memory.
|
| Differential Revision: https://reviews.freebsd.org/D15625
| Approved by: kib, jhibbits (mentor)
| Notes: svn path=/head/; revision=334784
* Move most of the contents of opt_compat.h to opt_global.h. | Brooks Davis | 2018-04-06 | 1 | -1/+0
| opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options.
|
| Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures.
|
| Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files.
|
| Reviewed by: kib, cem, jhb, jtl
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D14941
| Notes: svn path=/head/; revision=332122
* Move 32-bit compat for md(4) ioctls into the md code. | Brooks Davis | 2018-03-27 | 1 | -23/+107
| This is more correct in that ioctl commands have no meaning until they hit the handler associated with the file descriptor.
|
| Add support for MDIOCRESIZE_32 which was missed when it was added.
|
| Reviewed by: cem, kib, markj (various versions)
| Obtained from: CheriBSD
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D14714
| Notes: svn path=/head/; revision=331623
* Move uio enums to sys/_uio.h. | Brooks Davis | 2018-03-27 | 1 | -0/+1
| Include _uio.h instead of uio.h in several headers to reduce header pollution.
|
| Fix a few places that relied on header pollution to get the uio.h header.
|
| I have not moved struct uio as many more things that use it rely on header pollution to get other definitions from uio.h.
|
| Reviewed by: cem, kib, markj
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D14811
| Notes: svn path=/head/; revision=331621
* Add a request structure and make the implementation use it. | Brooks Davis | 2018-03-15 | 1 | -115/+157
| This allows compatibility translation to take place on the stack (md_ioctl is too big) and is more suitable as a public interface within the kernel than the kern_ioctl interface.
|
| Except for the initialization of the md_req from the md_ioctl (including detection of kernel md_file pointers) and the updating of the md_ioctl prior to return, this is a mechanical replacement of md_ioctl and mdio with md_req and mdr.
|
| Reviewed by: markj, cem, kib (assorted versions)
| Obtained from: CheriBSD
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D14704
| Notes: svn path=/head/; revision=331030
* Move implementation of ioctls into kern_*() functions. | Brooks Davis | 2018-03-15 | 1 | -148/+254
| Move locks from outside ioctl to the individual implementations.
|
| This is the first step of changing the implementations to act on a kernel-internal request struct rather than on struct md_ioctl and to removing the use of kern_ioctl in mountroot.
|
| Reviewed by: cem, kib, markj (prior version)
| Obtained from: CheriBSD
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D14700
| Notes: svn path=/head/; revision=331014
* Restore the behavior of returning the total number of units by | Brooks Davis | 2018-03-15 | 1 | -1/+2
| unconditionally incrementing i in the loop.
|
| Reported by: cem
| MFC with: r330880
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D14685
| Notes: svn path=/head/; revision=331008
* Don't overflow the kernel struct mdio in the MDIOCLIST ioctl. | Brooks Davis | 2018-03-13 | 1 | -3/+14
| Always terminate the list with -1 and document the ioctl behavior. This preserves existing behavior as seen from userspace with the addition of the unconditional termination which will not be seen by working consumers of MDIOCLIST.
|
| Because this ioctl can only be performed by root (in default configurations) and is not used in the base system this bug is not deemed to warrant either a security advisory or an errata notice.
|
| Reviewed by: kib
| Obtained from: CheriBSD
| Discussed with: security-officer (gordon)
| MFC after: 3 days
| Security: kernel heap buffer overflow
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D14685
| Notes: svn path=/head/; revision=330880
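The bounded, always-terminated listing that this commit describes (together with the follow-up r331008, which keeps returning the total number of units) can be sketched as below. This is a hedged illustration rather than the ioctl's code: `list_units()` and its parameters are hypothetical, and the real layout of the pad array in struct md_ioctl is not reproduced here.

```c
#include <assert.h>

/*
 * Copy at most npad - 1 unit numbers into pad[], always terminate the
 * list with -1, and return the total number of units even when some of
 * them did not fit.  Hypothetical helper; npad must be at least 1.
 */
static int
list_units(const int *units, int nunits, int *pad, int npad)
{
	int i, stored = 0;

	assert(npad >= 1);
	for (i = 0; i < nunits; i++) {	/* i is incremented unconditionally */
		if (stored < npad - 1)
			pad[stored++] = units[i];
	}
	pad[stored] = -1;		/* never written past pad[npad - 1] */
	return (i);
}
```

A caller that sees a return value larger than the number of entries before the -1 terminator knows the list was truncated.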
* Fix backwards MD_VERIFY logic for md devices. | Jonathan T. Looney | 2018-01-10 | 1 | -1/+1
| If the MD_VERIFY flag is set, we should use O_VERIFY. If the MD_VERIFY flag is not set, we should not.
|
| Reviewed by: stevek
| Sponsored by: Netflix
| Differential Revision: https://reviews.freebsd.org/D13814
| Notes: svn path=/head/; revision=327754
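The intended mapping from the md flag to the open flag is small enough to state as code. The sketch below assumes placeholder constants (`MD_VERIFY_FLAG`, `OPEN_VERIFY`, `OPEN_READ`) and a hypothetical helper `md_open_flags()`; the values are not the real MD_VERIFY or O_VERIFY definitions.

```c
/* Placeholder values for illustration only. */
#define	MD_VERIFY_FLAG	0x1	/* stands in for MD_VERIFY */
#define	OPEN_VERIFY	0x2	/* stands in for O_VERIFY */
#define	OPEN_READ	0x4	/* stands in for the base open mode */

/* Add the verify open flag only when the md verify flag is set. */
static int
md_open_flags(unsigned int md_flags, int base)
{
	int flags = base;

	if ((md_flags & MD_VERIFY_FLAG) != 0)
		flags |= OPEN_VERIFY;
	return (flags);
}
```

The backwards form of this test would request verification exactly for the devices that did not ask for it.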
* Add a new kernel config option, MD_ROOT_READONLY, which forces on the | Ian Lepore | 2017-12-20 | 1 | -2/+8
| MD_READONLY flag for the md device automatically instantiated during kernel init for an mdroot filesystem.
|
| Note that there is specifically and by design no tunable or sysctl control over this feature. Without this option, you already have control over whether the mdroot fs is writeable using vfs.root.mountfrom.options from loader(8), the root_rw_mount rcvar, and by using "mount -u[rw] /" or equivalent on the fly. This option is being added to provide a way to make the mdroot fs truly immutable before userland code begins running.
|
| Differential Revision: https://reviews.freebsd.org/D13411
| Notes: svn path=/head/; revision=327032
* SPDX: use the Beerware identifier. | Pedro F. Giffuni | 2017-11-30 | 1 | -2/+2
| Notes: svn path=/head/; revision=326408
* sys/dev: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-20 | 1 | -0/+2
| Mainly focus on files that use BSD 3-Clause license.
|
| The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
|
| Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
| Notes: svn path=/head/; revision=326022
* Make md(4) support GEOM::ident for vnode-backed disks. It's based | Edward Tomasz Napierala | 2017-10-04 | 1 | -0/+11
| on backing file device and inode numbers.
|
| This is useful for gmountver(8) regression tests.
|
| MFC after: 2 weeks
| Sponsored by: DARPA, AFRL
| Differential Revision: https://reviews.freebsd.org/D12230
| Notes: svn path=/head/; revision=324275
* When mdstart_swap() accesses a page that is already in the active queue, | Alan Cox | 2017-10-02 | 1 | -1/+4
| mark the page as referenced rather than calling vm_page_activate(). This allows the page's act_count to grow beyond ACT_INIT and better reflect its usage. (See also r324146, which modified a function used by tmpfs, uiomove_object_page(), to behave in the same way.)
|
| Reviewed by: kib, markj
| MFC after: 2 weeks
| Notes: svn path=/head/; revision=324189
* Add ability to label md(4) devices. | Maxim Sobolev | 2017-08-28 | 1 | -0/+16
| This feature comes from the fact that we rely on memory-backed md(4) in our build process heavily. However, if the build goes haywire the allocated resources (i.e. swap and memory-backed md(4)'s) need to be purged. It is extremely useful to have the ability to attach arbitrary labels to each of the virtual disks so that they can be identified and GC'ed if necessary.
|
| MFC after: 4 weeks
| Differential Revision: https://reviews.freebsd.org/D10457
| Notes: svn path=/head/; revision=322969
* Don't call vm_pager_page_unswapped() when writing or deleting a dirty page. | Mark Johnston | 2017-06-14 | 1 | -6/+10
| The swap space backing a clean page is released when it is first dirtied, so there's no need to attempt to release swap space when the page is already dirty.
|
| Reviewed by: alc
| MFC after: 1 week
| Notes: svn path=/head/; revision=319934
* Free the request page if an I/O error occurs while reading from swap. | Mark Johnston | 2017-06-14 | 1 | -3/+3
| After such a failure, the page is invalid, so there's no point in keeping it around. Moreover, such pages were not being inserted into the active queue, making them unreclaimable until a subsequent write or delete made them valid.
|
| Reported by: alc
| Reviewed by: alc (previous revision)
| MFC after: 1 week
| Notes: svn path=/head/; revision=319933
* Fix handling of subpage BIO_WRITE and BIO_DELETE requests on swap MDs. | Mark Johnston | 2017-06-14 | 1 | -22/+40
| Such requests would previously mark the entire page as valid, which was incorrect since nothing guaranteed that the page's contents had been initialized.
|
| This change also modifies subpage BIO_DELETEs so that the entire page is marked dirty, rather than only a subrange. There is no benefit to creating partially dirty swap pages.
|
| Reviewed by: alc, kib (previous version)
| MFC after: 3 days
| Notes: svn path=/head/; revision=319932
* Add MD_VERIFY option to enable O_VERIFY in open for vnode type. | Stephen J. Kiernan | 2017-05-31 | 1 | -2/+12
| Add -o [no]verify option to mdconfig (and document in man page.)
|
| Implement GEOM attribute MNT::verified to ask md if the backing vnode is verified.
|
| Check for MNT::verified in cd9660 mount to flag the mount as MNT_VERIFIED if the underlying device has been verified.
|
| Reviewed by: rwatson
| Approved by: sjg (mentor)
| Obtained from: Juniper Networks, Inc.
| Differential Revision: https://reviews.freebsd.org/D2902
| Notes: svn path=/head/; revision=319358
* Renumber copyright clause 4 | Warner Losh | 2017-02-28 | 1 | -1/+1
| Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point.
|
| Submitted by: Jan Schaumann <jschauma@stevens.edu>
| Pull Request: https://github.com/freebsd/freebsd/pull/96
| Notes: svn path=/head/; revision=314436
* sys/dev: Replace zero with NULL for pointers. | Pedro F. Giffuni | 2017-02-20 | 1 | -1/+1
| Makes things easier to read, plus architectures may set NULL to something different than zero.
|
| Found with: devel/coccinelle
| MFC after: 3 weeks
| Notes: svn path=/head/; revision=313982
* Fix typo where opening brace was needed. | Stephen J. Kiernan | 2017-02-13 | 1 | -1/+1
| Reported by: Michael Butler
| Reviewed by: sjg
| Approved by: sjg (mentor)
| Notes: svn path=/head/; revision=313703
* For MD_PRELOAD type md(4) devices, if there is a file name in the preloaded | Stephen J. Kiernan | 2017-02-13 | 1 | -3/+8
| meta-data, copy it into the softc structure.
|
| When returning md(4) device details to the caller, include the file name in any MD_PRELOAD type devices if it is set (first character is not NUL.)
|
| In mdconfig, for "preload" type md(4) devices, if there is file config available, print it in the file column of the output.
|
| Reviewed by: brooks
| Approved by: sjg (mentor)
| MFC after: 1 month
| Sponsored by: Juniper Networks, Inc.
| Differential Revision: https://reviews.freebsd.org/D9529
| Notes: svn path=/head/; revision=313701
* For the MD_ROOT option don't inject /dev/md0 as root dev when ROOTDEVNAME | Maxim Sobolev | 2016-03-09 | 1 | -1/+2
| is defined explicitly. It's kinda pointless and results in extra step in boot sequence which is not really needed, i.e.:
|
| md0: Embedded image 1331200 bytes at 0x8038b7b4
| Trying to mount root from ufs:/dev/md0 []...
| Mounting from ufs:/dev/md0 failed with error 22.
| Trying to mount root from ufs:md0.uzip []...
| warning: no time-of-day clock registered, system time will not be set accurately
| start_init: trying /sbin/init
|
| Notes: svn path=/head/; revision=296574
* Fix MFS builds when both MD_ROOT_SIZE and MFS_IMAGE are specified | Adrian Chadd | 2016-02-02 | 1 | -10/+4
| MD_ROOT_SIZE and embed_mfs.sh were basically retired as part of https://reviews.freebsd.org/D2903 . However, when building a kernel with 'options MD_ROOT_SIZE' specified, this results in a non-working MFS, as within sys/dev/md/md.c we fall within the wrong # ifdef.
|
| This patch implements the following:
|
| * Allow kernels to be built without the MD_ROOT_SIZE option, which results in a kernel built as per D2903.
|
| * Allow kernels to be built with the MD_ROOT_SIZE option, which results in a kernel built similarly to the pre-D2903 way, with the following differences:
|
|   * The MFS is now put in a separate section within the kernel (oldmfs, so it differs from the mfs section introduced by D2903).
|
|   * embed_mfs.sh is changed, so it looks up the oldmfs section within the kernel, gets its size and offset, sees if the MFS will fit within the allocated oldmfs section and only if all is well does a dd of the MFS image into the kernel.
|
| Submitted by: Stanislav Galabov <sgalabov@gmail.com>
| Reviewed by: brooks, imp
| Differential Revision: https://reviews.freebsd.org/D5093
| Notes: svn path=/head/; revision=295137