path: root/sys/kern/vfs_cluster.c
Commit message | Author | Age | Files | Lines
* sys: Automated cleanup of cdefs and other formatting | Warner Losh | 2023-11-27 | 1 | -1/+0
  Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row.
  Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
  Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
  Remove /\n+#if.*\n#endif.*\n+/
  Remove /^#if.*\n#endif.*\n/
  Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
  Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
  Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
  Sponsored by: Netflix
* sys: Remove ancient SCCS tags. | Warner Losh | 2023-11-27 | 1 | -2/+0
  Remove ancient SCCS tags from the tree with automated scripting, plus two minor fixups to keep things compiling. All the common forms in the tree were removed with a perl script.
  Sponsored by: Netflix
* sys: Remove $FreeBSD$: one-line .c pattern | Warner Losh | 2023-08-16 | 1 | -2/+0
  Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
* cluster_write(): do not access buffer after it is released | Konstantin Belousov | 2021-09-02 | 1 | -3/+8
  The issue was reported by Alexander Lochmann <alexander.lochmann@tu-dortmund.de>, who found the problem by performing lock analysis using LockDoc, see https://doi.org/10.1145/3302424.3303948.
  Reviewed by: mckusick
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D31780
* Minor style tidy: if( -> if ( | Warner Losh | 2021-04-18 | 1 | -1/+1
  Fix a few 'if(' to be 'if (' in a few places, per style(9) and overwhelming usage in the rest of the kernel / tree.
  MFC After: 3 days
  Sponsored by: Netflix
* vnode: move write cluster support data to inodes. | Konstantin Belousov | 2021-02-21 | 1 | -37/+48
  The data is only needed by filesystems that
  1. use buffer cache
  2. utilize clustering write support.
  Requested by: mjg
  Reviewed by: asomers (previous version), fsu (ext2 parts), mckusick
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D28679
* Delete dead CLUSTERDEBUG config option. | Konstantin Belousov | 2021-02-21 | 1 | -8/+0
  Reviewed by: mckusick
  Tested by: pho
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D28679
* Make MAXPHYS tunable. Bump MAXPHYS to 1M. | Konstantin Belousov | 2020-11-28 | 1 | -0/+2
  Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can also be adjusted with the tunable kern.maxphys.
  Make the b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and size b_pages[] for pbufs to atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, especially when such buffers come from userspace (*). Overall, we save a significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to the desired high value.
  Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except the place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get a private MAXPHYS-like constant; their conversion is out of scope for this work.
  Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis were either submitted by, or based on changes by, mav.
  Suggested by: mav (*)
  Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D27225
  Notes: svn path=/head/; revision=368124
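  The page-count arithmetic behind the "+1" for pbufs can be sketched as follows. This is an illustration only, not code from the commit: the 4 KiB page size and all sketch_* names are assumptions, and sketch_atop() merely mimics what the message calls atop().

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages, as on amd64. */
#define SKETCH_PAGE_SHIFT 12

/* Bytes-to-pages conversion in the spirit of the kernel's atop(). */
static size_t
sketch_atop(size_t bytes)
{
    return (bytes >> SKETCH_PAGE_SHIFT);
}

/* Number of pages touched by a buffer of len bytes starting at addr. */
static size_t
sketch_pages_spanned(uintptr_t addr, size_t len)
{
    uintptr_t first, last;

    if (len == 0)
        return (0);
    first = addr >> SKETCH_PAGE_SHIFT;
    last = (addr + len - 1) >> SKETCH_PAGE_SHIFT;
    return ((size_t)(last - first) + 1);
}
```

  With a 1 MiB maxphys, sketch_atop() gives 256 pages, and a page-aligned 1 MiB buffer spans exactly that many. The same buffer starting partway into a page spans 257, which is why an array sized to atop(maxphys) alone would be one entry short for unaligned userspace buffers.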
* vfs: fix trivial whitespace issues which don't interfere with blame | Mateusz Guzik | 2020-07-10 | 1 | -1/+1
  .. even without the -w switch
  Notes: svn path=/head/; revision=363071
* Remove duplicated empty lines from kern/*.c | Mateusz Guzik | 2020-01-30 | 1 | -1/+0
  No functional changes.
  Notes: svn path=/head/; revision=357312
* Do not use waitable allocation of pbuf when creating cluster for write. | Konstantin Belousov | 2019-12-23 | 1 | -2/+1
  Previously just ensuring that we do not sleep when clustering for md(4) vnode was enough. Now, with the switch of the pbuf allocator to uma and completely broken per-subsystem pbuf limits, it might cause unbounded sleep even for non-md(4) vnodes.
  Reported and tested by: pho
  Reviewed by: markj
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D22899
  Notes: svn path=/head/; revision=356038
* Currently the breadn_flags() and getblkx() interfaces are passed | Kirk McKusick | 2019-12-03 | 1 | -1/+1
  the vnode, logical block number, and size of data block that is being requested. They then use the VOP_BMAP function to calculate the mapping from logical block number to physical block number from which to access the data. This change expands the interface to also pass the physical block number in cases where the VOP_BMAP function may no longer work, for example when a file is being truncated.
  No functional change.
  Reviewed by: kib
  Tested by: Peter Holm
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=355371
* Drop the object lock in vfs_bio and cluster where it is now safe to do so. | Jeff Roberson | 2019-10-29 | 1 | -16/+2
  Recent changes to busy/valid/dirty have enabled page based synchronization and the object lock is no longer required in many cases.
  Reviewed by: kib
  Sponsored by: Netflix, Intel
  Differential Revision: https://reviews.freebsd.org/D21597
  Notes: svn path=/head/; revision=354155
* (4/6) Protect page valid with the busy lock. | Jeff Roberson | 2019-10-15 | 1 | -5/+7
  Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments.
  Reviewed by: kib, markj
  Tested by: pho
  Sponsored by: Netflix, Intel
  Differential Revision: https://reviews.freebsd.org/D21594
  Notes: svn path=/head/; revision=353539
* (1/6) Replace busy checks with acquires where it is trivial to do so. | Jeff Roberson | 2019-10-15 | 1 | -9/+9
  This is the first in a series of patches that promotes the page busy field to a first class lock that no longer requires the object lock for consistency.
  Reviewed by: kib, markj
  Tested by: pho
  Sponsored by: Netflix, Intel
  Differential Revision: https://reviews.freebsd.org/D21548
  Notes: svn path=/head/; revision=353535
* The VFS-level clustering code collects together sequential blocks | Kirk McKusick | 2019-09-17 | 1 | -2/+20
  by issuing delayed-writes (bdwrite()) until a non-sequential block is written or the maximum cluster size is reached. At that point it collects the delayed buffers together (using bread()) to write them in a single operation. The assumption was that since we just looked at them they will still be in memory so there is no need to check for a read error from bread(). Very occasionally (apparently every 10 hours or so when being pounded by Peter Holm's tests) this assumption is wrong.
  The fix is to check for errors from bread() and fail the cluster write, thus falling back to the default individual flushing of any still dirty buffers.
  Reported by: Peter Holm and Chuck Silvers
  Reviewed by: kib
  MFC after: 3 days
  Notes: svn path=/head/; revision=352453
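  The control flow described above (check each re-read, abandon the cluster on error, flush individually instead) can be modelled in userland roughly as below. Everything here is hypothetical: the struct, the sketch_* functions and the immediate per-buffer flush are stand-ins for illustration and do not reproduce the kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy delayed-write buffer; read_fails simulates a failing re-read. */
struct sketch_buf {
    bool dirty;
    bool read_fails;
};

/* Stand-in for re-reading a delayed buffer; returns 0 or an errno value. */
static int
sketch_reread(const struct sketch_buf *bp)
{
    return (bp->read_fails ? EIO : 0);
}

/*
 * Try to write bufs[0..n) as one cluster. If any re-read fails, give up
 * on the cluster and flush each still-dirty buffer on its own.
 * Returns the number of write operations issued.
 */
static int
sketch_flush(struct sketch_buf *bufs, size_t n)
{
    size_t i;
    int ops = 0;

    for (i = 0; i < n; i++) {
        if (sketch_reread(&bufs[i]) != 0)
            break;              /* the check the fix adds */
    }
    if (i == n) {
        /* Every buffer was gathered: one clustered write. */
        for (i = 0; i < n; i++)
            bufs[i].dirty = false;
        return (n > 0 ? 1 : 0);
    }
    /* Cluster failed: fall back to one write per dirty buffer. */
    for (i = 0; i < n; i++) {
        if (bufs[i].dirty) {
            bufs[i].dirty = false;
            ops++;
        }
    }
    return (ops);
}
```

  In this model three clean re-reads cost one write, while a single failing re-read costs three; the point is only that the error is noticed and no buffer stays dirty.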
* Use an atomic reference count for paging in progress so that callers do not | Jeff Roberson | 2019-08-19 | 1 | -1/+2
  require the object lock.
  Reviewed by: markj
  Tested by: pho (as part of a larger branch)
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D21311
  Notes: svn path=/head/; revision=351241
* Allocate pager bufs from UMA instead of 80-ish mutex protected linked list. | Gleb Smirnoff | 2019-01-15 | 1 | -5/+15
  o In vm_pager_bufferinit() create pbuf_zone and start accounting for how many pbufs we are going to have set. In various subsystems that are going to utilize pbufs, create private zones via a call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(), and sets a limit on the created zone. After startup, preallocate pbufs according to the requirements of all pbuf zones.
    Subsystems that used to have a private limit with the old allocator now have private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS, swap, vnode pager.
    The following subsystems use the shared pbuf zone: cam(4), nvme(4), physio(9), aio(4). They should have their private limits, but changing that is out of scope of this commit.
  o Fetch the tunable value of kern.nswbuf from init_param2() and, while here, move NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, which was holding only this option. Default values aren't touched by this commit, but they probably should be reviewed with respect to modern hardware.
  This change removes a tight bottleneck from sendfile(2) operation, which uses pbufs in the vnode pager. Other pagers would also benefit from faster allocation.
  Together with: gallatin
  Tested by: pho
  Notes: svn path=/head/; revision=343030
* ANSIfy sys/kern | Ed Maste | 2018-06-01 | 1 | -2/+1
  Notes: svn path=/head/; revision=334486
* Detect and optimize reads from the hole on UFS. | Konstantin Belousov | 2018-05-13 | 1 | -11/+17
  - Create getblkx(9) variant of getblk(9) which can return error.
  - Add GB_NOSPARSE flag for getblk()/getblkx() which requests that BMAP be performed before the buffer is created, and EJUSTRETURN returned in case the requested block does not exist.
  - Make ffs_read() use GB_NOSPARSE to avoid instantiating the buffer (and allocating the pages for it), copying from zero_region instead.
  The end result is fewer page allocations and less buffer recycling when a hole is read, which is important for some benchmarks.
  Requested and reviewed by: jeff
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 2 weeks
  Differential revision: https://reviews.freebsd.org/D14917
  Notes: svn path=/head/; revision=333576
* sys: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-20 | 1 | -0/+2
  Mainly focus on files that use BSD 3-Clause license.
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
  Notes: svn path=/head/; revision=326023
* Move bogus_page declaration to vm_page.h and initialization to vm_page.c. | Gleb Smirnoff | 2017-01-04 | 1 | -3/+0
  Reviewed by: kib
  Notes: svn path=/head/; revision=311336
* Add BUF_TRACKING and FULL_BUF_TRACKING buffer debugging | Conrad Meyer | 2016-10-31 | 1 | -0/+1
  Upstream the BUF_TRACKING and FULL_BUF_TRACKING buffer debugging code. This can be handy in tracking down what code touched hung bios and bufs last. The full history is especially useful, but adds enough bloat that it shouldn't be enabled in release builds.
  Function names (or arbitrary string constants) are tracked in a fixed-size ring in bufs. Bios gain a pointer to the upper buf for tracking. SCSI CCBs gain a pointer to the upper bio for tracking.
  Reviewed by: markj
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D8366
  Notes: svn path=/head/; revision=308155
* Renumber license clauses in sys/kern to avoid skipping #3 | Ed Maste | 2016-09-15 | 1 | -1/+1
  Notes: svn path=/head/; revision=305832
* Remove b_pin_count from struct buf. | Mark Johnston | 2016-08-11 | 1 | -14/+0
  It was added in r153192 for XFS and doesn't appear to have been used for anything else. XFS was disconnected in r241607 and removed entirely in r247631.
  Reported by: mlaier
  Reviewed by: imp, kib
  Differential Revision: https://reviews.freebsd.org/D7468
  Notes: svn path=/head/; revision=303951
* sys/kern: spelling fixes in comments. | Pedro F. Giffuni | 2016-04-29 | 1 | -1/+1
  No functional change.
  Notes: svn path=/head/; revision=298819
* kern: for pointers replace 0 with NULL. | Pedro F. Giffuni | 2016-04-15 | 1 | -1/+1
  These are mostly cosmetic, no functional change.
  Found with devel/coccinelle.
  Notes: svn path=/head/; revision=298069
* Add four new RCTL resources - readbps, readiops, writebps and writeiops, | Edward Tomasz Napierala | 2016-04-07 | 1 | -0/+15
  for limiting disk (actually filesystem) IO.
  Note that in some cases these limits are not quite precise. It's ok, as long as it's within some reasonable bounds.
  Testing - and review of the code, in particular the VFS and VM parts - is very welcome.
  MFC after: 1 month
  Relnotes: yes
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D5080
  Notes: svn path=/head/; revision=297633
* The bread() function was inconsistent about whether it would return | Kirk McKusick | 2016-01-27 | 1 | -4/+14
  a buffer pointer in the event of an error (for some errors it would return a buffer pointer and for other errors it would not return a buffer pointer). The cluster_read() function was similarly inconsistent.
  Clients of these functions were inconsistent in handling errors. Some would assume that no buffer was returned after an error and would thus lose buffers under certain error conditions. Others would assume that brelse() should always be called after an error and would thus panic the system under certain error conditions.
  To correct both of these problems with minimal code churn, bread() and cluster_read() now always free the buffer when returning an error, thus ensuring that buffers will never be lost. The brelse() routine checks for being passed a NULL buffer pointer and silently returns to avoid panics. Thus both approaches to handling error returns from bread() and cluster_read() will work correctly.
  Future code should be written assuming that bread() and cluster_read() will never return a buffer with an error, so should not attempt to brelse() the buffer when an error is returned.
  Reviewed by: kib
  Notes: svn path=/head/; revision=294954
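  A userland mock of the convention described above, for illustration only: on error the read routine releases the buffer itself and hands back NULL, and the release routine tolerates NULL so that older callers that still release after an error stay harmless. The sketch_* names, the struct and the live-buffer counter are hypothetical and are not the kernel interfaces.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_buf {
    int data;
};

/* Number of buffers currently allocated; lets a caller detect leaks. */
static int sketch_live_bufs;

/* Release a buffer; a NULL pointer is silently ignored. */
static void
sketch_brelse(struct sketch_buf *bp)
{
    if (bp == NULL)
        return;
    free(bp);
    sketch_live_bufs--;
}

/*
 * Read into a freshly allocated buffer. On success *bpp is the buffer;
 * on error the buffer is already released and *bpp is NULL.
 */
static int
sketch_bread(int simulate_io_error, struct sketch_buf **bpp)
{
    struct sketch_buf *bp;

    *bpp = NULL;
    bp = calloc(1, sizeof(*bp));
    if (bp == NULL)
        return (ENOMEM);
    sketch_live_bufs++;
    if (simulate_io_error) {
        sketch_brelse(bp);      /* never hand back a failed buffer */
        return (EIO);
    }
    *bpp = bp;
    return (0);
}
```

  Both caller styles work against this mock: code that returns immediately on error leaks nothing, and code that still calls sketch_brelse() on the NULL result does no harm.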
* Refactor unmapped buffer address handling. | Jeff Roberson | 2015-07-23 | 1 | -7/+3
  - Use pointer assignment rather than a combination of pointers and flags to switch buffers between unmapped and mapped. This eliminates multiple flags and generally simplifies the logic.
  - Eliminate b_saveaddr since it is only used with pager bufs which have their b_data re-initialized on each allocation.
  - Gather up some convenience routines in the buffer cache for manipulating buf space and buf malloc space.
  - Add an inline, buf_mapped(), to standardize checks around unmapped buffers.
  In collaboration with: mlaier
  Reviewed by: kib
  Tested by: pho (many small revisions ago)
  Sponsored by: EMC / Isilon Storage Division
  Notes: svn path=/head/; revision=285819
* Remove several write-only variables, all reported by the gcc 4.9 | Konstantin Belousov | 2015-05-29 | 1 | -2/+0
  buildkernel run. Some of them were write-only under some kernel options, e.g. variables keeping values only used by CTR() macros. It costs nothing to the code readability and correctness to eliminate the warnings in those cases too by removing the local cached values used only for single-access.
  Review: https://reviews.freebsd.org/D2665
  Reviewed by: rodrigc
  Looked at by: bjk
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=283735
* When allocating a pbuf for the cluster write, do not sleep waiting | Konstantin Belousov | 2013-08-27 | 1 | -1/+3
  for the available pbuf when passed vnode is backing md(4). Other i/o directed to the same md device might already hold pbufs, and then we could deadlock since only our progress can free a pbuf needed for wakeup.
  Obtained from: projects/vm6
  Reminded and tested by: pho
  MFC after: 1 week
  Notes: svn path=/head/; revision=254945
* Fix a whitespace. | Jung-uk Kim | 2013-08-23 | 1 | -1/+1
  Notes: svn path=/head/; revision=254717
* Both cluster_rbuild() and cluster_wbuild() sometimes set the pages | Konstantin Belousov | 2013-08-22 | 1 | -9/+26
  shared busy without first draining the hard busy state. Previously it went unnoticed since VPO_BUSY and m->busy fields were distinct, and vm_page_io_start() did not verify that the passed page has the VPO_BUSY flag cleared, but such page state is wrong. The new implementation is stricter and catches this case.
  Drain the busy state as needed, before calling vm_page_sbusy().
  Tested by: pho, jkim
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=254668
* The soft and hard busy mechanisms rely on the vm object lock to work. | Attilio Rao | 2013-08-09 | 1 | -3/+3
  Unify the two concepts into a real, minimal sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it.
  Also:
  - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock.
  - Move the swapping sleep into its own per-object flag
  The KPI is heavily changed; this is why the version is bumped. It is very likely that some VM ports users will need to change their own code.
  Sponsored by: EMC / Isilon storage division
  Discussed with: alc
  Reviewed by: jeff, kib
  Tested by: gavin, bapt (older version)
  Tested by: pho, scottl
  Notes: svn path=/head/; revision=254138
* - Convert the bufobj lock to rwlock. | Jeff Roberson | 2013-05-31 | 1 | -8/+7
  - Use a shared bufobj lock in getblk() and inmem().
  - Convert softdep's lk to rwlock to match the bufobj lock.
  - Move INFREECNT to b_flags and protect it with the buf lock.
  - Remove unnecessary locking around bremfree() and BKGRDINPROG.
  Sponsored by: EMC / Isilon Storage Division
  Discussed with: mckusick, kib, mdf
  Notes: svn path=/head/; revision=251171
* Add a sysctl vfs.read_min to complement the existing vfs.read_max. It | Scott Long | 2013-05-07 | 1 | -0/+12
  defaults to 1, meaning that it's off.
  When read-ahead is enabled on a file, the vfs cluster code deliberately breaks a read into 2 I/O transactions; one to satisfy the actual read, and one to perform read-ahead. This makes sense in low-latency circumstances, but often produces unbalanced i/o transactions that penalize disks. By setting vfs.read_min, we can tell the algorithm to fetch a larger transaction than what we asked for, achieving the same effect as the read-ahead but without the doubled, unbalanced transaction and the slightly lower latency. This significantly helps our workloads with video streaming.
  Submitted by: emax
  Reviewed by: kib
  Obtained from: Netflix
  Notes: svn path=/head/; revision=250327
* Implement the concept of the unmapped VMIO buffers, i.e. buffers which | Konstantin Belousov | 2013-03-19 | 1 | -41/+64
  do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminates the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads.
  The unmapped buffer should be explicitly requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation.
  Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitly specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap.
  The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests.
  Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.
  In the rework, filesystem metadata is not subject to the maxbufspace limit anymore. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, are accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not work, because buffer_map fragmentation does not allow the limit to be reached.
  By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions.
  Sponsored by: The FreeBSD Foundation
  Discussed with: jeff (previous version)
  Tested by: pho, scottl (previous version), jhb, bf
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=248508
* Some style fixes. | Konstantin Belousov | 2013-03-14 | 1 | -1/+1
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=248283
* Add currently unused flag argument to the cluster_read(), | Konstantin Belousov | 2013-03-14 | 1 | -16/+8
  cluster_write() and cluster_wbuild() functions. The flags to be allowed are a subset of the GB_* flags for getblk().
  Sponsored by: The FreeBSD Foundation
  Tested by: pho
  Notes: svn path=/head/; revision=248282
* Switch the vm_object mutex to be a rwlock. This will enable in the | Attilio Rao | 2013-03-09 | 1 | -12/+12
  future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
  The change is mostly mechanical but a few notes are reported:
  * The KPI changes as follows:
    - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
    - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
    - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
    - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
    - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
  * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must also include sys/rwlock.h.
  * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
  The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
  Sponsored by: EMC / Isilon storage division
  Reviewed by: jeff
  Reviewed by: pjd (ZFS specific review)
  Discussed with: alc
  Tested by: pho
  Notes: svn path=/head/; revision=248084
* Add barrier write capability to the VFS buffer interface. A barrier | Kirk McKusick | 2013-02-16 | 1 | -3/+9
  write is a disk write request that tells the disk that the buffer being written must be committed to the media along with any writes that preceded it before any future blocks may be written to the drive.
  Barrier writes are provided by adding the functions bbarrierwrite (bwrite with barrier) and babarrierwrite (bawrite with barrier).
  Following a bbarrierwrite the client knows that the requested buffer is on the media. It does not ensure that buffers written before that buffer are on the media. It only ensures that buffers written before that buffer will get to the media before any buffers written after that buffer. A flush command must be sent to the disk to ensure that all earlier written buffers are on the media.
  Reviewed by: kib
  Tested by: Peter Holm
  Notes: svn path=/head/; revision=246876
* Correct a KASSERT message. | Alan Cox | 2012-08-15 | 1 | -1/+1
  Submitted by: bde
  Notes: svn path=/head/; revision=239315
* Unbreak detection of the async mode for clustered writes after r231075. | Konstantin Belousov | 2012-02-08 | 1 | -1/+1
  Submitted by: bde
  MFC after: 12 days
  Notes: svn path=/head/; revision=231204
* The hardware has caught up; improvements are now observed even at 128, | Ivan Voras | 2011-03-16 | 1 | -1/+1
  but stay conservative and bump read_max to "only" 64 (it will probably be a good idea to increase this to 128 after the next major release).
  Notes: svn path=/head/; revision=219699
* Bumping the read-ahead count once more, to value equivalent to 512 KiB on | Ivan Voras | 2010-08-09 | 1 | -1/+1
  most systems, based on benchmark results on a low-end fibre channel SAN under VMWare:
  vfs.read_max            read performance
  8 (historical default)  83 MB/s
  16 (recent bump)        131 MB/s
  32 (this version)       152 MB/s
  64                      157 MB/s
  (results are +/- 3 MB/s)
  As read-ahead is heuristic, based on past IO requests, it shouldn't be problematic. The new default is still smaller than in other OSes.
  Notes: svn path=/head/; revision=211126
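  The relation between the block count and the byte figures quoted in these messages can be sketched as below. The 16 KiB block size is an assumption inferred from the message's own numbers (32 blocks for 512 KiB "on most systems"); the names are hypothetical.

```c
#include <stddef.h>

/* Assumption: 16 KiB filesystem blocks, inferred from 32 blocks = 512 KiB. */
#define SKETCH_BLOCK_KIB 16

/* Read-ahead window in KiB for a given read_max block count. */
static size_t
sketch_readahead_kib(size_t read_max_blocks)
{
    return (read_max_blocks * SKETCH_BLOCK_KIB);
}
```

  Under that assumption a setting of 8 corresponds to 128 KiB, 16 to 256 KiB and 32 to 512 KiB; a filesystem with a different block size scales the window accordingly.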
* To help with sequential read UFS performance on modern systems, increase | Ivan Voras | 2010-08-07 | 1 | -1/+1
  the vfs.read_max default. For most systems this means going from 128 KiB to 256 KiB, which is still very conservative and lower than what most other operating systems use, but as a sane default should not interfere much with existing systems.
  For systems with RAID volumes and/or virtualization environments, where read performance is very important, increasing this sysctl tunable to 32 or even more will demonstrably yield additional performance benefits.
  If MAXPHYS ever gets bumped up, it will probably be a good idea to slave read_max to it.
  Notes: svn path=/head/; revision=211031
* Remove a stale comment. The very same revision (r85511) that introduced | Alan Cox | 2009-06-30 | 1 | -3/+0
  this comment also implemented the proposed change to the code.
  Approved by: re (kib)
  Notes: svn path=/head/; revision=195209
* Correct a long-standing performance bug in cluster_rbuild(). Specifically, | Alan Cox | 2009-06-27 | 1 | -4/+15
  in the case of a file system with a block size that is less than the page size, cluster_rbuild() looks at too many of the page's valid bits. Consequently, it may terminate prematurely, resulting in poor performance.
  Reported by: bde
  Reviewed by: tegge
  Approved by: re (kib)
  Notes: svn path=/head/; revision=195122
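  To illustrate what "too many valid bits" means, here is a small sketch that assumes one valid bit per 512-byte chunk of a 4 KiB page (eight bits per page). The constants and sketch_* names are assumptions for illustration; this is not the code that the commit changed.

```c
#include <stdint.h>

/* Assumptions: 4 KiB pages tracked in 512-byte chunks, 8 bits per page. */
#define SKETCH_PAGE_SIZE    4096u
#define SKETCH_PAGE_MASK    (SKETCH_PAGE_SIZE - 1)
#define SKETCH_DEV_BSHIFT   9

/*
 * Valid-bit mask covering bytes [base, base + size) of one page.
 * The range must not cross the end of the page.
 */
static uint8_t
sketch_valid_mask(unsigned base, unsigned size)
{
    unsigned first_bit, last_bit;

    if (size == 0)
        return (0);
    first_bit = (base & SKETCH_PAGE_MASK) >> SKETCH_DEV_BSHIFT;
    last_bit = ((base + size - 1) & SKETCH_PAGE_MASK) >> SKETCH_DEV_BSHIFT;
    return ((uint8_t)((2u << last_bit) - (1u << first_bit)));
}

/* Nonzero if every chunk of the given range is marked valid. */
static int
sketch_range_valid(uint8_t page_valid, unsigned base, unsigned size)
{
    uint8_t mask = sketch_valid_mask(base, size);

    return ((page_valid & mask) == mask);
}
```

  For a 1 KiB block at offset 1024 the mask is 0x0c. A page whose valid bits are exactly 0x0c holds that block in full, yet a test against the whole-page mask 0xff reports it as not valid, which is the kind of over-wide check the message describes.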
* Eliminate unnecessary obfuscation when testing a page's valid bits. | Alan Cox | 2009-06-07 | 1 | -4/+2
  Notes: svn path=/head/; revision=193643