path: root/sys/vm/swap_pager.c
Commit message | Author | Age | Files | Lines
* A change to KPI of vm_pager_get_pages() and underlying VOP_GETPAGES(). | Gleb Smirnoff | 2015-12-16 | 1 | -128/+38
  o With the new KPI, consumers can request contiguous ranges of pages, and unlike before, all pages will be kept busied on return, like it was done before with the 'reqpage' only. Now the reqpage goes away. With the new interface it is easier to implement code protected from race conditions.

    Such arrayed requests for now should be preceded by a call to vm_pager_haspage() to make sure that the request is possible. This could be improved later, making vm_pager_haspage() obsolete.

    Strengthening the promises on the busy state of the array of pages allows us to remove such hacks as swp_pager_free_nrpage() and vm_pager_free_nonreq().

  o The new KPI accepts two integer pointers that may optionally point at values for read ahead and read behind, that a pager may do, if it can. These pages are completely owned by the pager, and not controlled by the caller.

    This shifts the UFS-specific readahead logic from vm_fault.c, which should be file system agnostic, into vnode_pager.c. It also removes one VOP_BMAP() request per hard fault.

  Discussed with: kib, alc, jeff, scottl
  Sponsored by: Nginx, Inc.
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=292373
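The calling convention described in this commit can be illustrated with a small userland model. This is only a sketch under stated assumptions: `struct page`, `model_get_pages()` and the `backing_pages` parameter are hypothetical names invented for illustration and are not the kernel's `vm_pager_get_pages()` implementation. It shows a contiguous array of pages that all come back busied, plus two optional integer pointers whose values the pager may reduce to what it can actually do.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct vm_page. */
struct page {
	int index;	/* position of the page within its object */
	int busy;	/* nonzero while the page is busied */
	int valid;	/* nonzero once the page holds data */
};

enum { PAGER_OK = 0, PAGER_FAIL = 1 };

/*
 * Model of an arrayed page-in request.  'm' holds 'count' pages that must
 * be contiguous.  'rbehind' and 'rahead' are optional (may be NULL); when
 * present they carry the caller's hint in and the amount the pager is
 * willing to do out.  'backing_pages' is the size of an imaginary
 * backing store, used only to clamp the hints.
 */
static int
model_get_pages(struct page **m, int count, int *rbehind, int *rahead,
    int backing_pages)
{
	int i, first, last;

	if (count < 1)
		return (PAGER_FAIL);
	first = m[0]->index;
	last = m[count - 1]->index;

	/* The request must describe a contiguous range. */
	for (i = 1; i < count; i++)
		if (m[i]->index != first + i)
			return (PAGER_FAIL);
	if (first < 0 || last >= backing_pages)
		return (PAGER_FAIL);

	/* Clamp the optional hints to what the backing store can supply. */
	if (rbehind != NULL) {
		if (*rbehind < 0)
			*rbehind = 0;
		if (*rbehind > first)
			*rbehind = first;
	}
	if (rahead != NULL) {
		if (*rahead < 0)
			*rahead = 0;
		if (*rahead > backing_pages - 1 - last)
			*rahead = backing_pages - 1 - last;
	}

	/* Every requested page is returned valid and still busied. */
	for (i = 0; i < count; i++) {
		m[i]->valid = 1;
		m[i]->busy = 1;
	}
	return (PAGER_OK);
}
```

In this model a caller asking for pages 2 and 3 of a 6-page store with hints of 5 and 5 gets both hints clamped to 2, and passing NULL for either pointer simply disables that direction.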
* Mark swap_pager_putpages static at its definition. It was already | Warner Losh | 2015-10-05 | 1 | -3/+1
  static at its declaration. Remove needless swapdev_strategy forward declaration.

  MFC After: 3 days
  Notes: svn path=/head/; revision=288901
* The swap pager is compatible with direct dispatch. It does its own | Warner Losh | 2015-09-08 | 1 | -11/+42
  locking and doesn't sleep. Flag the consumer we create as such. In addition, decrement the in flight index when we have an out of memory error after having incremented it previously. This would have prevented swapoff from working if the swap pager ever hit a resource shortage trying to swap out something (the swap in path always waits for a bio, so won't have this issue). Simplify the close logic by abandoning the use of private and initializing the index to 1 and dropping that reference when we previously set private. Also, set sw_id only while sw_dev_mtx is held. This should only affect swapping to a vnode, as opposed to a geom whose close always sets it to NULL with sw_dev_mtx held.

  Differential Review: https://reviews.freebsd.org/D3547
  Notes: svn path=/head/; revision=287567
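The reference-counting idea in this message (start the index at 1, take a reference per in-flight I/O, and undo the increment when an allocation fails) can be sketched in userland C. The names `swdev_model`, `swdev_start_io()` and friends are hypothetical and chosen only for illustration; the real code uses GEOM consumers and locking that this single-threaded model omits.

```c
#include <stdbool.h>

/* Hypothetical model of a swap device's reference index. */
struct swdev_model {
	int refs;	/* 1 for "open" plus one per in-flight I/O */
	bool closed;	/* set when the last reference is dropped */
};

static void
swdev_init(struct swdev_model *sp)
{
	sp->refs = 1;		/* the initial reference, dropped at close */
	sp->closed = false;
}

static void
swdev_rele(struct swdev_model *sp)
{
	if (--sp->refs == 0)
		sp->closed = true;
}

/*
 * Start an I/O.  'alloc_ok' models whether the buffer allocation
 * succeeded.  On failure the increment is undone, so a resource
 * shortage cannot leave a reference that never goes away.
 */
static bool
swdev_start_io(struct swdev_model *sp, bool alloc_ok)
{
	sp->refs++;
	if (!alloc_ok) {
		swdev_rele(sp);
		return (false);
	}
	return (true);
}

/* An I/O completed: drop the reference taken in swdev_start_io(). */
static void
swdev_io_done(struct swdev_model *sp)
{
	swdev_rele(sp);
}

/* Close request: drop the initial reference. */
static void
swdev_close(struct swdev_model *sp)
{
	swdev_rele(sp);
}
```

With this shape, a close that arrives while I/O is outstanding only completes when the last I/O finishes, and a failed start leaves the count unchanged.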
* Eliminate pointless assignments to rtvals[] in swap_pager_putpages(). | Alan Cox | 2015-08-21 | 1 | -12/+12
  Reviewed by: kib
  Sponsored by: EMC / Isilon Storage Division
  Notes: svn path=/head/; revision=287002
* Refactor unmapped buffer address handling. | Jeff Roberson | 2015-07-23 | 1 | -9/+4
  - Use pointer assignment rather than a combination of pointers and flags to switch buffers between unmapped and mapped. This eliminates multiple flags and generally simplifies the logic.
  - Eliminate b_saveaddr since it is only used with pager bufs which have their b_data re-initialized on each allocation.
  - Gather up some convenience routines in the buffer cache for manipulating buf space and buf malloc space.
  - Add an inline, buf_mapped(), to standardize checks around unmapped buffers.

  In collaboration with: mlaier
  Reviewed by: kib
  Tested by: pho (many small revisions ago)
  Sponsored by: EMC / Isilon Storage Division
  Notes: svn path=/head/; revision=285819
* o Un-inline vm_pager_get_pages(), vm_pager_get_pages_async(). | Gleb Smirnoff | 2015-06-17 | 1 | -4/+0
  o Provide an extensive set of assertions for input array of pages.
  o Remove now duplicate assertions from different pagers.

  Sponsored by: Nginx, Inc.
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=284529
* Implement lockless resource limits. | Mateusz Guzik | 2015-06-10 | 1 | -3/+1
  Use the same scheme implemented to manage credentials.

  Code needing to look at process's credentials (as opposed to thread's) is provided with *_proc variants of relevant functions.

  Places which possibly had to take the proc lock anyway still use the proc pointer to access limits.

  Notes: svn path=/head/; revision=284215
* Place VM objects on the object list when created and never remove them. | John Baldwin | 2015-05-08 | 1 | -0/+2
  This is ok since objects come from a NOFREE zone and allows objects to be locked while traversing the object list without triggering a LOR.

  Ensure that objects on the list are marked DEAD while free or stillborn, and that they have a refcount of zero. This required updating most of the pagers to explicitly mark an object as dead when deallocating it. (Only the vnode pager did this previously.)

  Differential Revision: https://reviews.freebsd.org/D2423
  Reviewed by: alc, kib (earlier version)
  MFC after: 2 weeks
  Sponsored by: Norse Corp, Inc.
  Notes: svn path=/head/; revision=282660
* Instead of reading, validating and adjusting value of the vm.swap_async_max | Gleb Smirnoff | 2015-05-02 | 1 | -38/+41
  in the main swapper work cycle, do it in the sysctl handler. This removes extra mutex acquisition from the main cycle and makes the sysctl knob return error on an invalid value, instead of accepting and fixing it.

  Reviewed by: kib
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=282353
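The behavioural change described here, rejecting a bad value at the point where it is set rather than silently correcting it later in the hot path, can be sketched as a plain C setter. `model_set_async_max()`, `model_async_max` and the bound `MODEL_MAX_ASYNC` are hypothetical names and values chosen for illustration; the real handler is a kernel sysctl handler and its exact bounds are not given in the commit message.

```c
#include <errno.h>

#define MODEL_MAX_ASYNC	16	/* hypothetical upper bound */

static int model_async_max = 4;	/* hypothetical current setting */

/*
 * Validate at set time.  An out-of-range request is refused with
 * EINVAL and the old value is kept, so the code that later reads
 * model_async_max never needs to re-check or adjust it.
 */
static int
model_set_async_max(int new_value)
{
	if (new_value < 1 || new_value > MODEL_MAX_ASYNC)
		return (EINVAL);
	model_async_max = new_value;
	return (0);
}
```

The design point is that the reader side becomes a simple load of an already-valid value, which is what lets the commit drop the extra mutex acquisition from the main cycle.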
* Add kern.racct.enable tunable and RACCT_DISABLED config option. | Edward Tomasz Napierala | 2015-04-29 | 1 | -5/+7
  The point of this is to be able to add RACCT (with RACCT_DISABLED) to GENERIC, to avoid having to rebuild the kernel to use rctl(8).

  Differential Revision: https://reviews.freebsd.org/D2369
  Reviewed by: kib@
  MFC after: 1 month
  Relnotes: yes
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=282213
* Remove sleeps from geom_up thread on device destruction. | Alexander Motin | 2015-04-09 | 1 | -7/+5
  MFC after: 3 days.
  Notes: svn path=/head/; revision=281310
* Make swapper release orphaned (lost) GEOM provider. | Alexander Motin | 2015-03-26 | 1 | -14/+50
  Swap device is still reported as enabled, and system still may crash later if some swapped-out kernel pages were lost with the device, but at least GEOM and CAM can now release the lost disk, allowing it to be reconnected.

  MFC after: 2 weeks
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=280702
* \n at end of panicstr is redundant. | Gleb Smirnoff | 2014-11-23 | 1 | -1/+1
  Submitted by: alc
  Notes: svn path=/head/; revision=274923
* Merge from projects/sendfile: | Gleb Smirnoff | 2014-11-23 | 1 | -0/+36
  o Provide a new VOP_GETPAGES_ASYNC(), which works like VOP_GETPAGES(), but doesn't sleep. It returns immediately, and will execute the I/O done handler function that must be supplied as argument.
  o Provide VOP_GETPAGES_ASYNC() for the FFS, which uses vnode_pager.
  o Extend pagertab to support pgo_getpages_async method, and implement this method for vnode_pager.

  Reviewed by: kib
  Tested by: pho
  Sponsored by: Netflix
  Sponsored by: Nginx, Inc.
  Notes: svn path=/head/; revision=274914
* Fix mis-spelling of bits and types names in the | Konstantin Belousov | 2014-11-04 | 1 | -3/+6
  default_pager_putpages() and swap_pager_putpages(). It is the same fix as was done for vnode_pager_putpages() in r271586.

  Reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=274100
* Add sysctl OIDs showing the actual size and capacity of the swap zone. | Dag-Erling Smørgrav | 2014-04-26 | 1 | -3/+11
  MFC after: 1 week
  Notes: svn path=/head/; revision=264966
* Rename global cnt to vm_cnt to avoid shadowing. | Bryan Drewery | 2014-03-22 | 1 | -3/+3
  To reduce the diff, the struct pcpu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI.

  Bump __FreeBSD_version in case some out-of-tree module/code relies on the global cnt variable.

  Exp-run revealed no ports using it directly.

  No objection from: arch@
  Sponsored by: EMC / Isilon Storage Division
  Notes: svn path=/head/; revision=263620
* vm_page_grab() and vm_pager_get_pages() can drop the vm_object lock, | Attilio Rao | 2014-03-19 | 1 | -2/+2
  then threads can sleep on the pip condition. Avoid deadlocking such threads by correctly awakening the sleeping ones after the pip is finished.

  The swapoff side of the bug can likely result in shutdown deadlocks.

  Sponsored by: EMC / Isilon Storage Division
  Reported by: pho, pluknet
  Tested by: pho
  Notes: svn path=/head/; revision=263328
* Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9). | Konstantin Belousov | 2013-08-22 | 1 | -1/+1
  The flag was mandatory since r209792, where vm_page_grab(9) was changed to only support the alloc retry semantic.

  Suggested and reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=254649
* The soft and hard busy mechanism rely on the vm object lock to work. | Attilio Rao | 2013-08-09 | 1 | -18/+22
  Unify the 2 concepts into a real, minimal, sxlock where the shared acquisition represents the soft busy and the exclusive acquisition represents the hard busy. The old VPO_WANTED mechanism becomes the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becomes an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it.

  Also:
  - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock.
  - Move the swapping sleep into its own per-object flag

  The KPI is heavily changed; this is why the version is bumped. It is very likely that some VM ports users will need to change their own code.

  Sponsored by: EMC / Isilon storage division
  Discussed with: alc
  Reviewed by: jeff, kib
  Tested by: gavin, bapt (older version)
  Tested by: pho, scottl
  Notes: svn path=/head/; revision=254138
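The mapping this commit describes (shared acquisition for soft busy, exclusive acquisition for hard busy) can be modelled with a tiny try-lock state machine. This is a single-threaded sketch under stated assumptions: `page_busy` and the `busy_*` functions are hypothetical names, and the sleeping slow path, atomics and the vm_object interlock of the real implementation are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical per-page busy state: many sharers or one exclusive owner. */
struct page_busy {
	int sharers;		/* number of soft-busy (shared) holders */
	bool exclusive;		/* true while hard-busied (exclusive) */
};

/* Soft busy: allowed alongside other sharers, refused if hard-busied. */
static bool
busy_try_shared(struct page_busy *b)
{
	if (b->exclusive)
		return (false);
	b->sharers++;
	return (true);
}

/* Hard busy: allowed only when nobody holds the page in either mode. */
static bool
busy_try_exclusive(struct page_busy *b)
{
	if (b->exclusive || b->sharers > 0)
		return (false);
	b->exclusive = true;
	return (true);
}

static void
busy_release_shared(struct page_busy *b)
{
	b->sharers--;
}

static void
busy_release_exclusive(struct page_busy *b)
{
	b->exclusive = false;
}
```

A failed try in this model corresponds to the case where the real code would take the slow path and sleep until the page is released.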
* When swap pager allocates metadata in the pagedaemon context, allow it | Konstantin Belousov | 2013-07-11 | 1 | -1/+2
  to drain the reserve. This was broken in r243040, causing deadlock. Note that VM_WAIT call in case of uma_zalloc() failure from pagedaemon would only wait for the v_pageout_free_min anyway.

  Reported and tested by: pho
  Reviewed by: alc
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=253221
* Fix typo in comment. | Konstantin Belousov | 2013-07-09 | 1 | -1/+1
  MFC after: 3 days
  Notes: svn path=/head/; revision=253095
* Complete r251452: | Attilio Rao | 2013-06-06 | 1 | -2/+3
  Avoid busying/unbusying a page in cases where there is no need to drop the vm_obj lock, namely when the page is fully valid after vm_page_grab().

  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc
  Notes: svn path=/head/; revision=251471
* o Change the locking scheme for swp_bcount. | Attilio Rao | 2013-05-28 | 1 | -5/+7
    It can now be accessed with a write lock on the object containing it OR with a read lock on the object containing it along with the swhash_mtx.
  o Remove some duplicate assertions for swap_pager_freespace() and swap_pager_unswapped() but keep the object locking references for documentation.

  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc
  Notes: svn path=/head/; revision=251077
* Do not map the swap i/o pbufs if the geom provider for the swap | Konstantin Belousov | 2013-03-19 | 1 | -13/+33
  partition accepts unmapped requests.

  Sponsored by: The FreeBSD Foundation
  Tested by: pho
  Notes: svn path=/head/; revision=248514
* Switch the vm_object mutex to be a rwlock. This will enable in the | Attilio Rao | 2013-03-09 | 1 | -36/+37
  future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages is accessed for reading purposes.

  The change is mostly mechanical but a few notes are reported:
  * The KPI changes as follows:
    - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
    - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
    - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
    - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
    - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
  * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must also include sys/rwlock.h.
  * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.

  The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).

  Sponsored by: EMC / Isilon storage division
  Reviewed by: jeff
  Reviewed by: pjd (ZFS specific review)
  Discussed with: alc
  Tested by: pho
  Notes: svn path=/head/; revision=248084
* Merge from vmc-playground branch: | Attilio Rao | 2013-02-26 | 1 | -2/+1
  Replace the sub-optimal uma_zone_set_obj() primitive with more modern uma_zone_reserve_kva(). The new primitive reserves before hand the necessary KVA space to cater the zone allocations and allocates pages with ALLOC_NOOBJ. More specifically:
  - uma_zone_reserve_kva() does not need an object to cater the backend allocator.
  - uma_zone_reserve_kva() can cater M_WAITOK requests, in order to serve zones which need to do uma_prealloc() too.
  - When possible, uma_zone_reserve_kva() uses directly the direct-mapping by uma_small_alloc() rather than relying on the KVA / offset combination.

  The removal of the object attribute allows 2 further changes:
  1) _vm_object_allocate() becomes static within vm_object.c
  2) VM_OBJECT_LOCK_INIT() is removed. This function is replaced by direct calls to mtx_init() as there is no need to export it anymore and the calls aren't either homogeneous anymore: there are now small differences between arguments passed to mtx_init().

  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc (which also offered almost all the comments)
  Tested by: pho, jhb, davide
  Notes: svn path=/head/; revision=247360
* Wrap the sleeps synchronized by the vm_object lock into the specific | Attilio Rao | 2013-02-26 | 1 | -1/+1
  macro VM_OBJECT_SLEEP(). This hides some implementation details like the usage of the msleep() primitive and the necessity to access to the lock address directly. For this reason VM_OBJECT_MTX() macro is now retired.

  Sponsored by: EMC / Isilon storage division
  Reviewed by: alc
  Tested by: pho
  Notes: svn path=/head/; revision=247323
* - Don't pass geom and provider names as format strings. | Jaakko Heinonen | 2012-11-20 | 1 | -1/+1
  - Add __printflike() attributes.
  - Remove an extra argument for the g_new_geomf() call in swapongeom_ev().

  Reviewed by: pjd
  Notes: svn path=/head/; revision=243333
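The first item of this commit is the general rule that externally supplied names should be passed as arguments to a fixed format, never as the format itself. A minimal userland sketch with standard `snprintf()` follows; `format_name_safely()` is a hypothetical helper name, not a function from the FreeBSD tree.

```c
#include <stdio.h>

/*
 * Copy 'name' into 'buf' using a constant "%s" format.  Any '%'
 * sequences inside 'name' are treated as literal text because the
 * name is data, not the format string.  Returns what snprintf()
 * returns: the length the full output would need.
 */
static int
format_name_safely(char *buf, size_t len, const char *name)
{
	/* Unsafe variant, shown only as a comment: snprintf(buf, len, name); */
	return (snprintf(buf, len, "%s", name));
}
```

A name such as `ada0%s%n` is copied through unchanged by this helper, whereas using it directly as the format would make the formatter look for arguments that were never passed.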
* Whitespace cleanup. | Dag-Erling Smørgrav | 2012-09-05 | 1 | -76/+76
  Notes: svn path=/head/; revision=240134
* No memory barrier is required. This was pointed out by kib@ a while ago, | Dag-Erling Smørgrav | 2012-09-04 | 1 | -2/+2
  but I got distracted by other matters.

  (for real this time)

  Notes: svn path=/head/; revision=240113
* Revert previous commit, which was performed in the wrong tree. | Dag-Erling Smørgrav | 2012-09-04 | 1 | -89/+82
  Notes: svn path=/head/; revision=240105
* No memory barrier is required. This was pointed out by kib@ a while ago, | Dag-Erling Smørgrav | 2012-09-04 | 1 | -82/+89
  but I got distracted by other matters.

  Notes: svn path=/head/; revision=240096
* Typo in previous change: print half the theoretical maximum as maximum | Sergey Kandaurov | 2012-08-27 | 1 | -1/+1
  recommended amount.

  Reported by: <site freebsd at orientalsensation com>
  Reviewed by: des
  Notes: svn path=/head/; revision=239723
* - When running out of swzone, instead of spewing an error message every | Dag-Erling Smørgrav | 2012-08-16 | 1 | -1/+33
    tick until the situation is resolved (if ever), just print a single message when running out and another when space becomes available.
  - When adding more swap, warn if the total amount exceeds half the theoretical maximum we can handle.

  Notes: svn path=/head/; revision=239327
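The first item describes edge-triggered logging: one message on the transition into exhaustion and one on the transition out, with nothing in between. A minimal sketch follows, assuming a single caller and no locking; `zone_exhausted`, `messages_logged` and `note_alloc_result()` are hypothetical names and the message texts are illustrative only.

```c
#include <stdbool.h>
#include <stdio.h>

static bool zone_exhausted = false;	/* last reported state */
static int messages_logged = 0;		/* counts messages, for inspection */

/*
 * Record the outcome of an allocation attempt.  A message is printed
 * only when the state changes, so repeated failures (or repeated
 * successes) stay silent.
 */
static void
note_alloc_result(bool ok)
{
	if (!ok && !zone_exhausted) {
		zone_exhausted = true;
		messages_logged++;
		printf("zone exhausted\n");
	} else if (ok && zone_exhausted) {
		zone_exhausted = false;
		messages_logged++;
		printf("zone ok\n");
	}
}
```

Three failures followed by two successes produce exactly two messages in this model, regardless of how many ticks the shortage lasts.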
* The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap | Alan Cox | 2012-06-16 | 1 | -1/+1
  layer, but it is read directly by the MI VM layer. This change introduces pmap_page_is_write_mapped() in order to completely encapsulate all direct access to PGA_WRITEABLE in the pmap layer.

  Aesthetics aside, I am making this change because amd64 will likely begin using an alternative method to track write mappings, and having pmap_page_is_write_mapped() in place allows me to make such a change without further modification to the MI VM layer.

  As an added bonus, tidy up some nearby comments concerning page flags.

  Reviewed by: kib
  MFC after: 6 weeks
  Notes: svn path=/head/; revision=237168
* Revert r236380 | Eitan Adler | 2012-06-01 | 1 | -15/+0
  PR: kern/166780
  Requested by: many
  Approved by: cperciva (implicit)
  Notes: svn path=/head/; revision=236417
* Add sysctl to query amount of swap space free | Eitan Adler | 2012-06-01 | 1 | -0/+15
  PR: kern/166780
  Submitted by: Radim Kolar <hsn@sendmail.cz>
  Approved by: cperciva
  MFC after: 1 week
  Notes: svn path=/head/; revision=236380
* Remove direct access to si_name. | Ed Schouten | 2012-02-10 | 1 | -3/+3
  Code should just use the devtoname() function to obtain the name of a character device. Also add const keywords to pieces of code that need it to build properly.

  MFC after: 2 weeks
  Notes: svn path=/head/; revision=231378
* Fix NULL dereference panic on attempt to turn off (on system shutdown) | Alexander Motin | 2012-02-01 | 1 | -1/+1
  disconnected swap device.

  This is quick and imperfect solution, as swap device will still be opened and GEOM will not be able to destroy it. Proper solution would be to automatically turn off and close disconnected swap device, but with existing code it will cause panic if there is at least one page on device, even if it is unimportant page of the user-level process. It needs some work.

  Reviewed by: kib@
  MFC after: 1 week
  Notes: svn path=/head/; revision=230877
* Fix printf. | Konstantin Belousov | 2011-12-12 | 1 | -1/+1
  Submitted by: az
  MFC after: 1 week
  Notes: svn path=/head/; revision=228432
* In order to maximize the re-usability of kernel code in user space this | Kip Macy | 2011-09-16 | 1 | -2/+2
  patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls.

  Reviewed by: rwatson
  Approved by: re (bz)
  Notes: svn path=/head/; revision=225617
* Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic | Konstantin Belousov | 2011-09-06 | 1 | -1/+1
  flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags.

  Document the changes to flags field to only require the page lock.

  Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced.

  Reviewed by: alc, attilio
  Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64)
  Approved by: re (bz)
  Notes: svn path=/head/; revision=225418
* Update some comments in swap_pager.c. | Konstantin Belousov | 2011-08-22 | 1 | -30/+17
  Reviewed and most wording by: alc
  MFC after: 1 week
  Approved by: re (bz)
  Notes: svn path=/head/; revision=225089
* Apply the limit to avoid the overflows in the radix tree subr_blist.c | Konstantin Belousov | 2011-08-22 | 1 | -10/+12
  after the conversion of the swap device size to the page size units, not before. That lifts the limit on the usable swap partition size from 32GB to 256GB, that is less depressing for the modern systems.

  Submitted by: Alexander V. Chernikov <melifaro ipfw ru>
  Reviewed by: alc
  Approved by: re (bz)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=225076
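The jump from 32GB to 256GB follows from applying the same block-count cap to a larger unit. The sketch below works through the arithmetic under assumptions that are not stated in the commit message: a cap of 2^26 blocks, 512-byte sectors and 4096-byte pages. These values are chosen because they reproduce the two figures quoted above; the function and macro names are hypothetical.

```c
#include <stdint.h>

#define MODEL_MAX_BLOCKS	(UINT64_C(1) << 26)	/* assumed cap on block count */
#define MODEL_SECTOR_SIZE	UINT64_C(512)		/* assumed device sector size */
#define MODEL_PAGE_SIZE		UINT64_C(4096)		/* assumed page size */

/* Old order: cap the size while it is still counted in sectors. */
static uint64_t
usable_when_capped_in_sectors(uint64_t dev_bytes)
{
	uint64_t sectors = dev_bytes / MODEL_SECTOR_SIZE;

	if (sectors > MODEL_MAX_BLOCKS)
		sectors = MODEL_MAX_BLOCKS;
	return (sectors * MODEL_SECTOR_SIZE);
}

/* New order: convert to pages first, then apply the same cap. */
static uint64_t
usable_when_capped_in_pages(uint64_t dev_bytes)
{
	uint64_t pages = dev_bytes / MODEL_PAGE_SIZE;

	if (pages > MODEL_MAX_BLOCKS)
		pages = MODEL_MAX_BLOCKS;
	return (pages * MODEL_PAGE_SIZE);
}
```

Under these assumptions the cap is 2^26 x 512 bytes = 32 GiB in the old order and 2^26 x 4096 bytes = 256 GiB in the new one, a factor of 8 that equals the ratio of page size to sector size.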
* Implement the linprocfs swaps file, providing information about the | Konstantin Belousov | 2011-08-01 | 1 | -20/+38
  configured swap devices in the Linux-compatible format.

  Based on the submission by: Robert Millan <rmh debian org>
  PR: kern/159281
  Reviewed by: bde
  Approved by: re (kensmith)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=224582
* All the racct_*() calls need to happen with the proc locked. Fixing this | Edward Tomasz Napierala | 2011-07-06 | 1 | -0/+6
  won't happen before 9.0. This commit adds "#ifdef RACCT" around all the "PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order to avoid useless locking/unlocking in kernels built without "options RACCT".

  Notes: svn path=/head/; revision=223825
* Reap old SPL comments. | David E. O'Brien | 2011-04-26 | 1 | -35/+2
  Reviewed by: alc
  Notes: svn path=/head/; revision=221096
* Add accounting for most of the memory-related resources. | Edward Tomasz Napierala | 2011-04-05 | 1 | -0/+19
  Sponsored by: The FreeBSD Foundation
  Reviewed by: kib (earlier version)
  Notes: svn path=/head/; revision=220373
* Change the return type of vmspace_swap_count to a long to match the other | Rebecca Cran | 2011-03-01 | 1 | -2/+2
  vmspace_*_count functions.

  MFC after: 3 days
  Notes: svn path=/head/; revision=219124