path: root/sys/vm/vm_object.c
...
* Eliminate the conditional for releasing the page queues lock in
  vm_page_sleep(). (Alan Cox, 2012-10-13; 1 file changed, -2/+0)
  vm_page_sleep() is no longer called with this lock held. Eliminate
  assertions that the page queues lock is NOT held. These assertions
  won't translate well to having distinct locks on the active and
  inactive page queues, and they really aren't that useful.
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=241512
* Fix the mishandling of the VV_TEXT flag on nullfs vnodes.
  (Konstantin Belousov, 2012-09-28; 1 file changed, -1/+1)
  If you have a binary on a filesystem which is also mounted over by
  nullfs, you could execute the binary from the lower filesystem or from
  the nullfs mount. When executed from the lower filesystem, the lower
  vnode gets the VV_TEXT flag set, and the file cannot be modified while
  the binary is active. But, if executed as the nullfs alias, only the
  nullfs vnode gets VV_TEXT set, and you can still open the lower vnode
  for write. Add a set of VOPs for the VV_TEXT query, set and clear
  operations, which are correctly bypassed to the lower vnode.
  Tested by: pho (previous version)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=241025
* Plug the accounting leak for wired pages when msync(MS_INVALIDATE) is
  performed on a vnode mapping which is wired in another address space.
  (Konstantin Belousov, 2012-09-20; 1 file changed, -2/+7)
  While there, explicitly assert that the page is unwired and zero the
  wire_count instead of subtracting. The condition is rechecked later in
  vm_page_free(_toq) already.
  Reported and tested by: zont
  Reviewed by: alc (previous version)
  MFC after: 1 week
  Notes: svn path=/head/; revision=240741
* Document the object type movements, related to swp_pager_copy(), in
  vm_object_collapse() and vm_object_split().
  (Attilio Rao, 2012-07-11; 1 file changed, -0/+9)
  In collaboration with: alc
  MFC after: 3 days
  Notes: svn path=/head/; revision=238359
* Fix madvise(MADV_WILLNEED) to properly handle individual mappings
  larger than 4GB. (John Baldwin, 2012-03-19; 1 file changed, -3/+3)
  Specifically, the inlined version of 'ptoa' of the 'int' count of
  pages overflowed on 64-bit platforms. While here, change
  vm_object_madvise() to accept two vm_pindex_t parameters (start and
  end) rather than a (start, count) tuple, to match other VM APIs as
  suggested by alc@.
  Notes: svn path=/head/; revision=233191
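  The overflow mode is easy to demonstrate in isolation. Below is a
  hedged user-space sketch; `ptoa_narrow`/`ptoa_wide` are simplified
  stand-ins for the kernel's inlined `ptoa`, not its real definitions:

  ```c
  #include <stdint.h>

  #define DEMO_PAGE_SHIFT 12  /* 4KB pages, as on x86 */

  /* Buggy pattern: shifting a 32-bit page count drops the high bits once
   * the mapping reaches 4GB (2^20 pages * 4KB = 2^32 bytes). */
  #define ptoa_narrow(x) ((uint32_t)(x) << DEMO_PAGE_SHIFT)

  /* Fixed pattern: widen the page count before shifting. */
  #define ptoa_wide(x) ((uint64_t)(x) << DEMO_PAGE_SHIFT)
  ```

  With a 4GB mapping (2^20 pages), the narrow version wraps to zero
  while the widened version yields the correct byte count.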
* In vm_object_page_clean(), do not clear the OBJ_MIGHTBEDIRTY object
  flag if the filesystem performed a short write and we are skipping the
  page due to this. (Konstantin Belousov, 2012-03-17; 1 file changed,
  -18/+38)
  Propagate the write error from the pager back to the callers of
  vm_pageout_flush(). Report the failure to write a page from the
  requested range as the FALSE return value from vm_object_page_clean(),
  and propagate it back to msync(2) to return EIO to usermode.
  While there, convert the clearobjflags variable in
  vm_object_page_clean() and the arguments of the helper functions to
  boolean.
  PR: kern/165927
  Reviewed by: alc
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=233100
* Do not restart the scan in vm_object_page_clean() on the object
  generation change if the requested mode is async.
  (Konstantin Belousov, 2012-01-04; 1 file changed, -4/+12)
  The object generation is only changed when the object is marked as
  OBJ_MIGHTBEDIRTY. For async mode it is enough to write each dirty
  page; there is no need to guarantee that all pages are clean after
  vm_object_page_clean() returns.
  Diagnosed by: truckman
  Tested by: flo
  Reviewed by: alc, truckman
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=229495
* Optimize vm_object_split()'s handling of reservations.
  (Alan Cox, 2011-12-28; 1 file changed, -0/+15)
  Notes: svn path=/head/; revision=228936
* Optimize the common case of msyncing the whole file mapping with the
  MS_SYNC flag. (Konstantin Belousov, 2011-12-23; 1 file changed, -3/+18)
  The system must guarantee that all writes are finished before the
  syscall returns. Schedule the writes in async mode, which is much
  faster and allows the clustering to occur. Wait for the writes using
  VOP_FSYNC(), since we are syncing the whole file mapping.
  Potentially, the restriction could be relaxed by not requiring that
  the mapping cover the whole file, as is done by other OSes.
  Reported and tested by: az
  Reviewed by: alc
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=228838
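  From user space, the case being optimized is the plain whole-mapping
  msync(2). A minimal sketch (hypothetical helper names; error handling
  reduced to early returns):

  ```c
  #include <fcntl.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/mman.h>
  #include <unistd.h>

  /* Write into a file through a whole-file mapping and flush it with
   * MS_SYNC, which must not return until the data has reached the file.
   * Returns 0 on success, -1 on any failure. */
  static int msync_whole_file_demo(void)
  {
      char path[] = "/tmp/msync_demo_XXXXXX";
      int fd = mkstemp(path);
      if (fd < 0)
          return -1;
      if (ftruncate(fd, 4096) != 0) {
          close(fd);
          unlink(path);
          return -1;
      }
      char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      if (p == MAP_FAILED) {
          close(fd);
          unlink(path);
          return -1;
      }
      memcpy(p, "hello", 5);
      int rc = msync(p, 4096, MS_SYNC);  /* the whole-mapping MS_SYNC case */
      munmap(p, 4096);
      close(fd);
      unlink(path);
      return rc;
  }
  ```

  The commit's point is that the kernel can service exactly this call by
  scheduling the writes async and then waiting once via VOP_FSYNC().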
* Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
  (Ed Schouten, 2011-11-07; 1 file changed, -1/+2)
  The SYSCTL_NODE macro defines a list that stores all child-elements of
  that node. If there's no SYSCTL_DECL macro anywhere else, there's no
  reason why it shouldn't be static.
  Notes: svn path=/head/; revision=227309
* Add the posix_fadvise(2) system call.
  (John Baldwin, 2011-11-04; 1 file changed, -0/+54)
  It is somewhat similar to madvise(2) except that it operates on a file
  descriptor instead of a memory region. It is currently only supported
  on regular files.
  Just as with madvise(2), the advice given to posix_fadvise(2) can be
  divided into two types. The first type provides hints about data
  access patterns and is used in the file read and write routines to
  modify the I/O flags passed down to VOP_READ() and VOP_WRITE(). These
  modes are thus filesystem-independent. Note that to ease
  implementation (and since this API is only advisory anyway), only a
  single non-normal range is allowed per file descriptor.
  The second type of hints is used to hint to the OS that data will or
  will not be used. These hints are implemented via a new VOP_ADVISE().
  A default implementation is provided which does nothing for the
  WILLNEED request and attempts to move any clean pages to the cache
  page queue for the DONTNEED request. This latter case required two
  other changes. First, a new V_CLEANONLY flag was added to vinvalbuf().
  This requests vinvalbuf() to only flush clean buffers for the vnode
  from the buffer cache and to not remove any backing pages from the
  vnode. This is used to ensure clean pages are not wired into the
  buffer cache before attempting to move them to the cache page queue.
  The second change adds a new vm_object_page_cache() method. This
  method is somewhat similar to vm_object_page_remove() except that
  instead of freeing each page in the specified range, it attempts to
  move clean pages to the cache queue if possible.
  To preserve the ABI of struct file, the f_cdevpriv pointer is now
  reused in a union to point to the currently active advice region if
  one is present for regular files.
  Reviewed by: jilles, kib, arch@
  Approved by: re (kib)
  MFC after: 1 month
  Notes: svn path=/head/; revision=227070
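  For reference, the user-space view of the syscall: a hedged sketch
  exercising both kinds of advice on a scratch file (the helper name,
  file sizes, and advice choices are arbitrary):

  ```c
  #include <fcntl.h>
  #include <stdlib.h>
  #include <unistd.h>

  /* Apply both kinds of posix_fadvise(2) advice to a temporary regular
   * file: an access-pattern hint (SEQUENTIAL) and the will/won't-need
   * hints that the commit routes to VOP_ADVISE().
   * Returns 0 when every call succeeded. */
  static int fadvise_demo(void)
  {
      char path[] = "/tmp/fadvise_demo_XXXXXX";
      int fd = mkstemp(path);
      if (fd < 0)
          return -1;
      int rc = 0;
      rc |= ftruncate(fd, 65536);
      rc |= posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);   /* pattern hint */
      rc |= posix_fadvise(fd, 0, 65536, POSIX_FADV_WILLNEED); /* will use */
      rc |= posix_fadvise(fd, 0, 65536, POSIX_FADV_DONTNEED); /* done with it */
      close(fd);
      unlink(path);
      return rc;
  }
  ```

  Note that posix_fadvise() returns an error number directly (0 on
  success) rather than setting errno, unlike most syscall wrappers.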
* Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into an atomic
  flags field. (Konstantin Belousov, 2011-09-06; 1 file changed, -3/+1)
  Updates to the atomic flags are performed using the atomic ops on the
  containing word, do not require any vm lock to be held, and are
  non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9)
  functions are provided to modify aflags. Document the changes to the
  flags field to only require the page lock.
  Introduce the vm_page_reference(9) function to provide a stable KPI
  and KBI for filesystems like tmpfs and zfs which need to mark a page
  as referenced.
  Reviewed by: alc, attilio
  Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64)
  Approved by: re (bz)
  Notes: svn path=/head/; revision=225418
* Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the
  flag to VPO_UNMANAGED (and also making the flag protected by the vm
  object lock, instead of the vm page queue lock).
  (Konstantin Belousov, 2011-08-09; 1 file changed, -1/+3)
  Mark the fake pages with both PG_FICTITIOUS (as it is now) and
  VPO_UNMANAGED. As a consequence, pmap code now can use just
  VPO_UNMANAGED to decide whether the page is unmanaged.
  Reviewed by: alc
  Tested by: pho (x86, previous version), marius (sparc64),
  marcel (arm, ia64, powerpc), ray (mips)
  Sponsored by: The FreeBSD Foundation
  Approved by: re (bz)
  Notes: svn path=/head/; revision=224746
* Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove().
  (Alan Cox, 2011-06-29; 1 file changed, -58/+65)
  Passing this option to vm_object_page_remove() asserts that the
  specified range of pages is not mapped, or more precisely that none of
  these pages have any managed mappings. Thus, vm_object_page_remove()
  need not call pmap_remove_all() on the pages.
  This change not only saves time by eliminating pointless calls to
  pmap_remove_all(), but it also eliminates an inconsistency in the use
  of pmap_remove_all() versus related functions, like
  pmap_remove_write(). It eliminates harmless but pointless calls to
  pmap_remove_all() that were being performed on PG_UNMANAGED pages.
  Update all of the existing assertions on pmap_remove_all() to reflect
  this change.
  Reviewed by: kib
  Notes: svn path=/head/; revision=223677
* In the VOP_PUTPAGES() implementations, change the default error from
  VM_PAGER_AGAIN to VM_PAGER_ERROR for the unwritten pages.
  (Konstantin Belousov, 2011-06-01; 1 file changed, -0/+15)
  Return VM_PAGER_AGAIN for the partially written page. Always forward
  at least one page in the loop of vm_object_page_clean().
  VM_PAGER_ERROR causes the page reactivation and does not clear the
  page dirty state, so the write is not lost.
  The change fixes an infinite loop in vm_object_page_clean() when the
  filesystem returns permanent errors for some page writes.
  Reported and tested by: gavin
  Reviewed by: alc, rmacklem
  MFC after: 1 week
  Notes: svn path=/head/; revision=222586
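  The termination argument reduces to one rule for the index update. A
  hedged model of the advance step (a sketch, not kernel code):

  ```c
  #include <stddef.h>

  /* Advance the clean loop's page index. `runlen` is how many pages the
   * pager wrote in this pass; for a page the filesystem will never
   * write, it can be 0. Stepping by max(runlen, 1) guarantees forward
   * progress, so a permanently failing page cannot pin the loop
   * forever -- the "always forward at least one page" rule above. */
  static size_t next_clean_index(size_t cur, size_t runlen)
  {
      return cur + (runlen > 0 ? runlen : 1);
  }
  ```

  The buggy variant advanced by `runlen` alone, which loops forever when
  the same page keeps failing with a zero-length run.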
* Another long-standing vm bug found at Isilon: fix a race between
  vm_object_collapse() and vm_fault().
  (Max Laier, 2011-05-09; 1 file changed, -0/+18)
  Reviewed by: alc@
  MFC after: 3 days
  Notes: svn path=/head/; revision=221714
* Fix two bugs in r218670.
  (Konstantin Belousov, 2011-04-23; 1 file changed, -4/+11)
  Hold the vnode around the region where the object lock is dropped,
  until the vnode lock is acquired. Do not drop the vnode reference for
  the case where the object was deallocated during unlock. Note that in
  this case, VV_TEXT is cleared by vnode_pager_dealloc().
  Reported and tested by: pho
  Reviewed by: alc
  MFC after: 3 days
  Notes: svn path=/head/; revision=220977
* Lock the vnode around clearing of the VV_TEXT flag. Remove the
  mp_fixme() note mentioning that the vnode lock is needed.
  (Konstantin Belousov, 2011-02-13; 1 file changed, -9/+14)
  Reviewed by: alc
  Tested by: pho
  MFC after: 1 week
  Notes: svn path=/head/; revision=218670
* Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and
  nfsvno_fsync() are incorrectly calling vm_object_page_clean().
  (Alan Cox, 2011-02-05; 1 file changed, -9/+7)
  They are passing the length of the range rather than the ending offset
  of the range. Perform the OFF_TO_IDX() conversion in
  vm_object_page_clean() rather than the callers.
  Reviewed by: kib
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=218345
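  The bug class is worth spelling out: for a range that does not start
  at offset zero, the length and the ending offset are different
  numbers. A sketch with a simplified stand-in for `OFF_TO_IDX` (a plain
  page shift, not the kernel macro verbatim):

  ```c
  #include <stdint.h>

  #define DEMO_PAGE_SHIFT 12
  #define OFF_TO_IDX_DEMO(off) ((uint64_t)(off) >> DEMO_PAGE_SHIFT)

  /* Page index one past the end of the byte range [start, start+length).
   * The callee must be handed the *ending offset*; handing it the
   * length instead yields a bogus range whenever start != 0. */
  static uint64_t end_index(uint64_t start, uint64_t length)
  {
      return OFF_TO_IDX_DEMO(start + length);
  }
  ```

  For a 4-page range starting at page 8, the correct page range is
  [8, 12); passing the length where the end offset belongs produces the
  nonsensical range [8, 4).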
* Since the last parameter to vm_object_shadow() is a vm_size_t and not
  a vm_pindex_t, it makes no sense for its callers to perform atop().
  Let vm_object_shadow() do that instead.
  (Alan Cox, 2011-02-04; 1 file changed, -1/+1)
  Notes: svn path=/head/; revision=218304
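  `atop` ("address to page") is the inverse shift of `ptoa`; moving it
  into the callee lets callers keep thinking in bytes (vm_size_t) while
  the callee derives page counts itself. A simplified sketch with
  hypothetical names, not the kernel's definitions:

  ```c
  #include <stdint.h>

  #define DEMO_PAGE_SHIFT 12
  #define atop_demo(x) ((uint64_t)(x) >> DEMO_PAGE_SHIFT)

  /* Hypothetical callee in the style of vm_object_shadow(): it takes a
   * length in bytes and performs the page conversion internally, so no
   * caller has to remember to call atop() first. */
  static uint64_t shadow_size_pages(uint64_t length_bytes)
  {
      return atop_demo(length_bytes);
  }
  ```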
* For consistency, use kernel_object instead of &kernel_object_store
  when initializing the object mutex. Do the same for kmem_object.
  (Konstantin Belousov, 2011-01-15; 1 file changed, -2/+2)
  Discussed with: alc
  MFC after: 1 week
  Notes: svn path=/head/; revision=217463
* Make a couple of refinements to r216799 and r216810. In particular,
  revise a comment and move it to its proper place.
  (Alan Cox, 2011-01-01; 1 file changed, -10/+8)
  Reviewed by: kib
  Notes: svn path=/head/; revision=216874
* Remove the OBJ_CLEANING flag.
  (Konstantin Belousov, 2010-12-29; 1 file changed, -3/+0)
  vfs_setdirty_locked_object() is the only consumer of the flag, and it
  used the flag because OBJ_MIGHTBEDIRTY was cleared early in
  vm_object_page_clean(), before the cleaning pass was done. This is no
  longer true after r216799.
  Moreover, since OBJ_CLEANING is a flag, and not a counter, it could be
  reset prematurely when parallel vm_object_page_clean() calls are
  performed.
  Reviewed by: alc (as a part of the bigger patch)
  MFC after: 1 month (after r216799 is merged)
  Notes: svn path=/head/; revision=216810
* Move the increment of the vm object generation count into
  vm_object_set_writeable_dirty().
  (Konstantin Belousov, 2010-12-29; 1 file changed, -31/+34)
  Fix an issue where a restart of the scan in vm_object_page_clean() did
  not remove write permissions for newly added pages, or for pages whose
  mapping changed to writeable due to a fault after they were already
  scanned.
  Merge the two loops in vm_object_page_clean(), doing the removal of
  write permission and the cleaning in the same loop. The restart of the
  loop then correctly downgrades writeable mappings.
  Fix an issue where a second caller to msync() might actually return
  before the first caller had actually completed flushing the pages.
  Clear the OBJ_MIGHTBEDIRTY flag after the cleaning loop, not before.
  Calls to pmap_is_modified() are not needed after pmap_remove_write()
  there.
  Proposed, reviewed and tested by: alc
  MFC after: 1 week
  Notes: svn path=/head/; revision=216799
* Replace pointer to "struct uidinfo" with pointer to "struct ucred" in
  "struct vm_object".
  (Edward Tomasz Napierala, 2010-12-02; 1 file changed, -16/+16)
  This is required to make it possible to account for per-jail swap
  usage.
  Reviewed by: kib@
  Tested by: pho@
  Sponsored by: FreeBSD Foundation
  Notes: svn path=/head/; revision=216128
* After the sleep caused by encountering a busy page, relookup the page.
  (Konstantin Belousov, 2010-11-24; 1 file changed, -1/+3)
  Submitted and reviewed by: alc
  Reported and tested by: pho
  MFC after: 5 days
  Notes: svn path=/head/; revision=215796
* Eliminate the mab, maf arrays and related variables.
  (Konstantin Belousov, 2010-11-21; 1 file changed, -32/+15)
  The change also fixes an off-by-one error in the calculation of mreq.
  Suggested and reviewed by: alc
  Tested by: pho
  MFC after: 5 days
  Notes: svn path=/head/; revision=215610
* Optimize vm_object_terminate().
  (Alan Cox, 2010-11-20; 1 file changed, -9/+28)
  Reviewed by: kib
  MFC after: 1 week
  Notes: svn path=/head/; revision=215597
* The runlen returned from vm_pageout_flush() might legitimately be
  zero, when the mreq page has status VM_PAGER_AGAIN.
  (Konstantin Belousov, 2010-11-20; 1 file changed, -1/+0)
  MFC after: 5 days
  Notes: svn path=/head/; revision=215574
* vm_pageout_flush() might cache the pages that finished the write to
  the backing storage.
  (Konstantin Belousov, 2010-11-18; 1 file changed, -24/+3)
  Such pages might then be reused, racing with the assert in
  vm_object_page_collect_flush() that verified that dirty pages from the
  run (most likely, pages with VM_PAGER_AGAIN status) are still
  write-protected. In fact, the page indexes for the pages that were
  removed from the object page list should be ignored by
  vm_object_page_clean().
  Return the length of the successfully written run from
  vm_pageout_flush(), that is, the count of pages between the requested
  page and the first page after the requested one with status
  VM_PAGER_AGAIN. Supply the requested page index in the array to
  vm_pageout_flush(). Use the returned run length to forward the index
  of the next page to clean in vm_object_page_clean().
  Reported by: avg
  Reviewed by: alc
  MFC after: 1 week
  Notes: svn path=/head/; revision=215471
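  The returned run length has a simple definition that can be modeled
  directly: count pages from the requested one (mreq) up to the first
  page at or after it whose status is VM_PAGER_AGAIN. A hedged sketch
  with stand-in status codes, not the kernel's types:

  ```c
  #include <stddef.h>

  enum demo_status { DEMO_PAGER_OK, DEMO_PAGER_AGAIN, DEMO_PAGER_ERROR };

  /* Length of the successfully written run: pages from mreq up to (not
   * including) the first page with AGAIN status. Zero is a legitimate
   * result when the mreq page itself came back AGAIN. */
  static size_t written_runlen(const enum demo_status *status, size_t count,
                               size_t mreq)
  {
      size_t i;
      for (i = mreq; i < count; i++)
          if (status[i] == DEMO_PAGER_AGAIN)
              break;
      return i - mreq;
  }
  ```

  The caller then forwards its scan index by this run length (never by
  less than one page, per the r222586 entry above).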
* Only increment the object generation count when inserting the page
  into the object page list.
  (Konstantin Belousov, 2010-11-18; 1 file changed, -7/+0)
  The only use of the object generation count now is a restart of the
  scan in vm_object_page_clean(), which makes sense to do on page
  addition. Page removals do not affect the dirtiness of the object, nor
  do manipulations of the shadow chain.
  Suggested and reviewed by: alc
  MFC after: 1 week
  Notes: svn path=/head/; revision=215469
* Several cleanups for r209686:
  (Konstantin Belousov, 2010-07-04; 1 file changed, -13/+6)
  - remove unused defines;
  - remove the unused curgeneration argument for
    vm_object_page_collect_flush();
  - always assert that vm_object_page_clean() is called for OBJT_VNODE;
  - move vm_page_find_least() into the for() statement's initial clause.
  Submitted by: alc
  Notes: svn path=/head/; revision=209702
* Reimplement vm_object_page_clean(), using the fact that the vm object
  memq is ordered by page index.
  (Konstantin Belousov, 2010-07-04; 1 file changed, -189/+73)
  This greatly simplifies the implementation, since we no longer need to
  mark the pages with VPO_CLEANCHK to denote the progress. It is enough
  to remember the current position by index before dropping the object
  lock.
  Remove VPO_CLEANCHK and VM_PAGER_IGNORE_CLEANCHK as unused.
  Garbage-collect the vm.msync_flush_flags sysctl.
  Suggested and reviewed by: alc
  Tested by: pho
  Notes: svn path=/head/; revision=209686
* Introduce a helper function, vm_page_find_least(). Use it in several
  places which previously inlined the same search.
  (Konstantin Belousov, 2010-07-04; 1 file changed, -14/+2)
  Reviewed by: alc
  Tested by: pho
  MFC after: 1 week
  Notes: svn path=/head/; revision=209685
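  The helper and the position-by-index scan from r209686 can be modeled
  together on a sorted array standing in for the object's memq (a hedged
  single-threaded sketch, not kernel code):

  ```c
  #include <stddef.h>

  /* Analogue of vm_page_find_least(): first slot in a sorted index
   * array whose value is >= start; returns n when no such element
   * exists. */
  static size_t find_least_demo(const int *sorted, size_t n, int start)
  {
      size_t i;
      for (i = 0; i < n; i++)
          if (sorted[i] >= start)
              break;
      return i;
  }

  /* Scan in the r209686 style: process one page, remember its index,
   * and re-search from index+1. This stays correct across a dropped
   * lock because the loop never trusts a saved pointer, only the saved
   * position. Returns the number of pages visited. */
  static int scan_all_demo(const int *sorted, size_t n)
  {
      int visited = 0, pos = 0;
      for (;;) {
          size_t i = find_least_demo(sorted, n, pos);
          if (i == n)
              break;
          visited++;
          pos = sorted[i] + 1;
      }
      return visited;
  }
  ```

  The gaps in the index array play the role of non-resident pages: the
  re-search simply skips them, with no per-page progress flag needed.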
* Roughly half of a typical pmap_mincore() implementation is
  machine-independent code. Move this code into mincore(), and eliminate
  the page queues lock from pmap_mincore().
  (Alan Cox, 2010-05-24; 1 file changed, -29/+6)
  Push down the page queues lock into pmap_clear_modify(),
  pmap_clear_reference(), and pmap_is_modified(). Assert that these
  functions are never passed an unmanaged page.
  Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m:
  contrary to what the comment says, pmap_mincore() is not simply an
  optimization. Without a complete pmap_mincore() implementation,
  mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED
  because only the pmap can provide this information.
  Eliminate the page queues lock from vfs_setdirty_locked_object(),
  vm_pageout_clean(), vm_object_page_collect_flush(), and
  vm_object_page_clean(). Generally speaking, these are all accesses to
  the page's dirty field, which are synchronized by the containing vm
  object's lock.
  Reduce the scope of the page queues lock in vm_object_madvise() and
  vm_page_dontneed().
  Reviewed by: kib (an earlier version)
  Notes: svn path=/head/; revision=208504
* Add a comment about the proper use of vm_object_page_remove().
  (Alan Cox, 2010-05-16; 1 file changed, -1/+2)
  MFC after: 1 week
  Notes: svn path=/head/; revision=208159
* Push down the page queues lock into vm_page_cache(),
  vm_page_try_to_cache(), and vm_page_try_to_free().
  (Alan Cox, 2010-05-08; 1 file changed, -14/+1)
  Consequently, push down the page queues lock into pmap_enter_quick(),
  pmap_page_wired_mappings(), pmap_remove_all(), and
  pmap_remove_write().
  Push down the page queues lock into Xen's pmap_page_is_mapped(). (I
  overlooked the Xen pmap in r207702.)
  Switch to a per-processor counter for the total number of pages
  cached.
  Notes: svn path=/head/; revision=207796
* Eliminate acquisitions of the page queues lock that are no longer
  needed. Switch to a per-processor counter for the number of pages
  freed during process termination.
  (Alan Cox, 2010-05-07; 1 file changed, -9/+2)
  Notes: svn path=/head/; revision=207739
* Eliminate page queues locking around most calls to vm_page_free().
  (Alan Cox, 2010-05-06; 1 file changed, -2/+0)
  Notes: svn path=/head/; revision=207728
* Acquire the page lock around all remaining calls to vm_page_free() on
  managed pages that didn't already have that lock held. (Freeing an
  unmanaged page, such as the various pmaps use, doesn't require the
  page lock.) (Alan Cox, 2010-05-05; 1 file changed, -4/+0)
  This allows a change in vm_page_remove()'s locking requirements. It
  now expects the page lock to be held instead of the page queues lock.
  Consequently, the page queues lock is no longer required at all by
  callers to vm_page_rename().
  Discussed with: kib
  Notes: svn path=/head/; revision=207669
* Correct an error in r207410: remove an unlock of a lock that is no
  longer held. (Alan Cox, 2010-05-02; 1 file changed, -1/+0)
  Notes: svn path=/head/; revision=207531
* Push up dropping of the page queue lock to avoid holding it in
  vm_pageout_flush(). (Kip Macy, 2010-04-30; 1 file changed, -29/+17)
  Notes: svn path=/head/; revision=207452
* Don't call vm_pageout_flush() with the page queue mutex held.
  (Kip Macy, 2010-04-30; 1 file changed, -0/+2)
  Reported by: Michael Butler
  Notes: svn path=/head/; revision=207451
* On Alan's advice, rather than do a wholesale conversion on a single
  architecture from the page queue lock to a hashed array of page locks
  (based on a patch by Jeff Roberson), I've implemented page lock
  support in the MI code and have only moved vm_page's hold_count out
  from under the page queue mutex to the page lock.
  (Kip Macy, 2010-04-30; 1 file changed, -12/+77)
  This changes pmap_extract_and_hold on all pmaps.
  Supported by: Bitgravity Inc.
  Discussed with: alc, jeffr, and kib
  Notes: svn path=/head/; revision=207410
* Change vm_object_madvise() so that it checks whether the page is
  invalid or unmanaged before acquiring the page queues lock.
  (Alan Cox, 2010-04-28; 1 file changed, -10/+6)
  Neither of these tests requires that lock. Moreover, a better way of
  testing whether the page is unmanaged is to test the type of the vm
  object. This avoids a pointless vm_page_lookup().
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=207306
* There is no justification for vm_object_split() setting PG_REFERENCED
  on a page that it is going to sleep on. Eliminate it.
  (Alan Cox, 2010-04-18; 1 file changed, -1/+0)
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=206801
* In vm_object_madvise(), setting PG_REFERENCED on a page before
  sleeping on that page only makes sense if the advice is MADV_WILLNEED.
  (Alan Cox, 2010-04-17; 1 file changed, -2/+9)
  In that case, the intention is to activate the page, so discouraging
  the page daemon from reclaiming the page makes sense. In contrast, in
  the other cases, MADV_DONTNEED and MADV_FREE, it makes no sense
  whatsoever to discourage the page daemon from reclaiming the page by
  setting PG_REFERENCED.
  Wrap a nearby line.
  Discussed with: kib
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=206770
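  The advice values discussed here are the same ones user space passes
  to madvise(2). A hedged demo on an anonymous mapping (the helper name
  is hypothetical; also note that Linux gives MADV_DONTNEED destructive
  semantics, unlike the purely advisory FreeBSD behavior discussed in
  this log):

  ```c
  #include <string.h>
  #include <sys/mman.h>

  /* Exercise MADV_WILLNEED (the one case where activating the pages and
   * discouraging reclaim makes sense) and MADV_DONTNEED (contents
   * expendable) on a private anonymous mapping.
   * Returns 0 when every call succeeded. */
  static int madvise_demo(void)
  {
      size_t len = 16 * 4096;
      char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (p == MAP_FAILED)
          return -1;
      memset(p, 0xab, len);
      int rc = 0;
      rc |= madvise(p, len, MADV_WILLNEED);  /* about to be used */
      rc |= madvise(p, len, MADV_DONTNEED);  /* done with the contents */
      munmap(p, len);
      return rc;
  }
  ```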
* In vm_object_backing_scan(), setting PG_REFERENCED on a page before
  sleeping on that page is nonsensical.
  (Alan Cox, 2010-04-17; 1 file changed, -3/+0)
  Doing so reduces the likelihood that the page daemon will reclaim the
  page before the thread waiting in vm_object_backing_scan() is
  reawakened. However, it does not guarantee that the page is not
  reclaimed, so vm_object_backing_scan() restarts after reawakening.
  More importantly, this muddles the meaning of PG_REFERENCED. There is
  no reason to believe that the caller of vm_object_backing_scan() is
  going to use (i.e., access) the contents of the page. There is
  especially no reason to believe that an access is more likely because
  vm_object_backing_scan() had to sleep on the page.
  Discussed with: kib
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=206768
* The VI_OBJDIRTY vnode flag mirrors the state of the OBJ_MIGHTBEDIRTY
  vm object flag.
  (Konstantin Belousov, 2009-12-21; 1 file changed, -21/+5)
  Besides providing redundant information, the need to update both vnode
  and object flags causes more acquisitions of the vnode interlock.
  OBJ_MIGHTBEDIRTY is only checked for vnode-backed vm objects.
  Remove VI_OBJDIRTY and make sure that OBJ_MIGHTBEDIRTY is set only for
  vnode-backed vm objects.
  Suggested and reviewed by: alc
  Tested by: pho
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=200770
* Add a new type of VM object: OBJT_SG.
  (John Baldwin, 2009-07-24; 1 file changed, -0/+1)
  An OBJT_SG object is very similar to a device pager (OBJT_DEVICE)
  object in that it uses fictitious pages to provide aliases to other
  memory addresses. The primary difference is that it uses an sglist(9)
  to determine the physical addresses for a given offset into the object
  instead of invoking the d_mmap() method in a device driver.
  Reviewed by: alc
  Approved by: re (kensmith)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=195840