path: root/sys/vm
Commit message | Author | Age | Files | Lines
* Revert r368523 which fixed contig allocs waiting forever.Bryan Drewery2020-12-151-16/+4
| | | | | | | | | | This needs to account for empty NUMA domains or domains which do not satisfy the requested range. Discussed with: markj Notes: svn path=/head/; revision=368673
* contig allocs: Don't retry forever on M_WAITOK.Bryan Drewery2020-12-101-4/+16
| | | | | | | | | | | | | | | | | | | | | | This restores behavior from before domain iterators were added in r327895 and r327896. The vm_domainset_iter_policy() will do a vm_wait_doms() and then restart its iterator when M_WAITOK is set. It will also force the containing loop to have M_NOWAIT. So we get an unbounded retry loop rather than the intended bounded retries that kmem_alloc_contig_pages() already handles. This also restores M_WAITOK to the vmem_alloc() call in kmem_alloc_attr_domain() and kmem_alloc_contig_domain(). Reviewed by: markj, kib MFC after: 2 weeks Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D27507 Notes: svn path=/head/; revision=368523
* uma: Make uma_zone_set_maxcache() work better with small limitsMark Johnston2020-12-062-45/+34
| | | | | | | | | | | | | | | | | | | | | | | | | The old implementation chose the largest bucket zone such that if the per-CPU caches are fully populated, the total number of items cached is no larger than the specified limit. If no such zone existed, UMA would not do any caching. We can now use uz_bucket_size_max to set a precise limit on the number of items in a zone's bucket, so the total size of per-CPU caches can be bounded more easily. Implement a new policy in uma_zone_set_maxcache(): choose a bucket size such that up to half of the limit can be cached in per-CPU caches, with the rest going to the full bucket cache. This fixes a problem with the kstack_cache zone: the limit of 4 * mp_ncpus items meant that the zone would not do any caching, defeating the whole purpose of the zone. That's because the smallest bucket size holds up to 2 items and we may cache up to 3 full buckets per CPU, and 2 * 3 * mp_ncpus > 4 * mp_ncpus. Reported by: mjg Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27168 Notes: svn path=/head/; revision=368400
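The kstack_cache arithmetic in the message above can be checked with a small userland model. This is only a sketch of the old policy as the message describes it, not the UMA code: the list of bucket sizes is an assumption (the message states only that the smallest bucket holds 2 items), and the function name is hypothetical.

```python
# Model of the old uma_zone_set_maxcache() policy as described in the commit
# message. ASSUMED_BUCKET_SIZES is an assumption; only the smallest size (2)
# comes from the message.
ASSUMED_BUCKET_SIZES = [2, 4, 8, 16, 32, 64, 128, 256]
FULL_BUCKETS_PER_CPU = 3  # "up to 3 full buckets per CPU" per the message


def old_policy_bucket_size(limit, ncpus):
    """Largest bucket size whose fully populated per-CPU caches fit the limit.

    Returns None when no size fits, meaning the zone would not cache at all.
    """
    fitting = [size for size in ASSUMED_BUCKET_SIZES
               if size * FULL_BUCKETS_PER_CPU * ncpus <= limit]
    return max(fitting) if fitting else None


ncpus = 8
# kstack_cache used a limit of 4 * mp_ncpus: 2 * 3 * ncpus exceeds it.
print(old_policy_bucket_size(4 * ncpus, ncpus))
# A limit of 6 * ncpus is the smallest one that admits the 2-item bucket.
print(old_policy_bucket_size(6 * ncpus, ncpus))
```

With a limit of 4 items per CPU the model finds no usable bucket size, which matches the "would not do any caching" behaviour the commit set out to fix.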
* uma: Enforce the use of uz_bucket_size_max in the free path | Mark Johnston | 2020-12-06 | 2 | -6/+3

  uz_bucket_size_max is the maximum permitted bucket size. When filling a new bucket to satisfy uma_zalloc(), the bucket is populated with at most uz_bucket_size_max items. The maximum number of entries in the bucket may be larger. When freeing items, however, we will fill per-CPU buckets up to their maximum number of entries, potentially exceeding uz_bucket_size_max.

  This makes it difficult to precisely limit the number of items that may be cached in a zone. For example, if one wants to limit buckets to 1 entry for a particular zone, that's not possible since the smallest bucket holds up to 2 entries.

  Try to solve the problem by using uz_bucket_size_max to limit the number of entries in a bucket. Note that the ub_entries field is initialized upon every bucket allocation. Most zones are not affected since they do not impose any specific limit on the maximum bucket size.

  While here, remove the UMA_ZONE_MINBUCKET flag. It was unused and we now have uma_zone_set_maxcache() to control the zone's cache size more precisely.

  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D27167
  Notes: svn path=/head/; revision=368399
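The difference between the two free-path behaviours can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel code: the function and its `enforce_max` switch are hypothetical, and only the names ub_entries and uz_bucket_size_max come from the commit message.

```python
def free_into_bucket(count, ub_entries, uz_bucket_size_max, enforce_max):
    """Return how many of `count` freed items land in one empty bucket.

    ub_entries is the bucket's physical capacity; uz_bucket_size_max is the
    zone's permitted bucket size. enforce_max=False models the behaviour
    described as the problem, enforce_max=True the behaviour after the change.
    """
    if enforce_max:
        capacity = min(ub_entries, uz_bucket_size_max)
    else:
        capacity = ub_entries
    return min(count, capacity)


# A zone that wants 1-entry buckets, backed by the smallest (2-entry) bucket.
print(free_into_bucket(5, ub_entries=2, uz_bucket_size_max=1, enforce_max=False))
print(free_into_bucket(5, ub_entries=2, uz_bucket_size_max=1, enforce_max=True))
```

In the model, capping the capacity at the zone's limit is what makes a 1-entry bucket expressible even though the smallest physical bucket holds 2 entries.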
* uma: Use atomic load for uz_sleepersMark Johnston2020-12-061-1/+1
| | | | | | | | | This field is updated locklessly. Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=368398
* uma: Avoid allocating buckets with the cross-domain lock heldMark Johnston2020-11-301-7/+33
| | | | | | | | | | | | | | | | | | | | Allocation of a bucket can trigger a cross-domain free in the bucket zone, e.g., if the per-CPU alloc bucket is empty, we free it and get migrated to a remote domain. This can lead to deadlocks since a bucket zone may allocate buckets from itself or a pair of bucket zones could be allocating from each other. Fix the problem by dropping the cross-domain lock before allocating a new bucket and handling refill races. Use a list of empty buckets to ensure that we can make forward progress. Reported by: imp, mjg (witness(9) warnings) Discussed with: jeff Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27341 Notes: svn path=/head/; revision=368189
* Make MAXPHYS tunable. Bump MAXPHYS to 1M. | Konstantin Belousov | 2020-11-28 | 7 | -18/+27

  Replace MAXPHYS with the runtime variable maxphys. It is initialized from MAXPHYS by default, but can also be adjusted with the tunable kern.maxphys.

  Make the b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and size b_pages[] for pbufs to atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save a significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to the desired high value.

  Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except the place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get a private MAXPHYS-like constant; their conversion is out of scope for this work.

  Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis were either submitted by, or based on changes by, mav.

  Suggested by: mav (*)
  Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  Differential revision: https://reviews.freebsd.org/D27225
  Notes: svn path=/head/; revision=368124
* Wrap a long line in vm_pqbatch_process_page()Mark Johnston2020-11-191-1/+2
| | | | Notes: svn path=/head/; revision=367845
* Micro-optimize vm_page_pqbatch_submit()Mark Johnston2020-11-191-2/+2
| | | | | | | | | Avoid calling vm_page_domain() twice. Discussed with: alc (in D27207) Notes: svn path=/head/; revision=367844
* vm_phys: Try to clean up NUMA KPIs | Mark Johnston | 2020-11-19 | 9 | -75/+115

  It can be useful for code outside the VM system to look up the NUMA domain of a page backing a virtual or physical address, specifically when creating NUMA-aware data structures. We have _vm_phys_domain() for this, but the leading underscore implies that it's an internal function, and vm_phys.h has dependencies on a number of other headers.

  Rename vm_phys_domain() to vm_page_domain(), and _vm_phys_domain() to vm_phys_domain(). Make the latter an inline function.

  Add _vm_phys.h and define struct vm_phys_seg there so that it's easier to use in other headers. Include it from vm_page.h so that vm_page_domain() can be defined there.

  Include machine/vmparam.h from _vm_phys.h since it depends directly on some constants defined there.

  Reviewed by: alc
  Reviewed by: dougm, kib (earlier versions)
  Differential Revision: https://reviews.freebsd.org/D27207
  Notes: svn path=/head/; revision=367828
* vm_map: Handle kernel map entry allocator recursionMark Johnston2020-11-112-22/+136
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On platforms without a direct map[*], vm_map_insert() may in rare situations need to allocate a kernel map entry in order to allocate kernel map entries. This poses a problem similar to the one solved for vmem boundary tags by vmem_bt_alloc(). In fact the kernel map case is a bit more complicated since we must allocate entries with the kernel map locked, whereas vmem can recurse into itself because boundary tags are allocated up-front. The solution is to add a custom slab allocator for kmapentzone which allocates KVA directly from kernel_map, bypassing the kmem_* layer. This avoids mutual recursion with the vmem btag allocator. Then, when vm_map_insert() allocates a new kernel map entry, it avoids triggering allocation of a new slab with M_NOVM until after the insertion is complete. Instead, vm_map_insert() allocates from the reserve and sets a flag in kernel_map to trigger re-population of the reserve just before the map is unlocked. This places an implicit upper bound on the number of kernel map entries that may be allocated before the kernel map lock is released, but in general a bound of 1 suffices. [*] This also comes up on amd64 with UMA_MD_SMALL_ALLOC undefined, a configuration required by some kernel sanitizers. Discussed with: kib, rlibby Reported by: andrew Tested by: pho (i386 and amd64 with !UMA_MD_SMALL_ALLOC) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26851 Notes: svn path=/head/; revision=367595
* When destroying a UMA zone which has a reserve (set with | Jonathan T. Looney | 2020-11-10 | 1 | -0/+4

  uma_zone_reserve()), messages like the following appear on the console: "Freed UMA keg (Test zone) was not empty (0 items). Lost 528 pages of memory."

  When keg_drain_domain() is draining the zone, it tries to keep the number of items specified in the reservation. However, when we are destroying the UMA zone, we do not need to keep those items. Therefore, when destroying a non-secondary and non-cache zone, we should reset the keg reservation to 0 prior to draining the zone.

  Reviewed by: markj
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D27129
  Notes: svn path=/head/; revision=367573
* Add more per-cpu zones.Mateusz Guzik2020-11-091-0/+3
| | | | | | | | | This covers powers of 2 up to 64. Example pending user is ZFS. Notes: svn path=/head/; revision=367503
* Implement superpages for PowerPC64 (HPT) | Leandro Lupori | 2020-11-06 | 1 | -1/+2

  This change adds support for transparent superpages for PowerPC64 systems using Hashed Page Tables (HPT). All pmap operations are supported.

  The changes were inspired by the RISC-V implementation of superpages by markj (r344106), but heavily adapted to fit the PPC64 HPT architecture and the existing MMU OEA64 code.

  Until these changes are better tested, superpage support is disabled by default. To enable it, use vm.pmap.superpages_enabled=1.

  In this initial implementation, when superpages are disabled, system performance stays at the same level as without these changes. When superpages are enabled, buildworld time increases a bit (~2%). However, for workloads that put heavy pressure on the TLB the performance boost is much bigger (see HPC Challenge and pgbench on D25237).

  Reviewed by: jhibbits
  Sponsored by: Eldorado Research Institute (eldorado.org.br)
  Differential Revision: https://reviews.freebsd.org/D25237
  Notes: svn path=/head/; revision=367417
* Rationalize per-cpu zones.Mateusz Guzik2020-11-051-2/+2
| | | | | | | | | | | | | The 2 provided zones had inconsistent naming between each other ("int" and "64") and other allocator zones (which use bytes). Follow malloc by naming them "pcpu-" + size in bytes. This is a step towards replacing ad-hoc per-cpu zones with general slabs. Notes: svn path=/head/; revision=367384
* vmspace: Convert to refcount(9)Mark Johnston2020-11-043-42/+29
| | | | | | | | | | | | | | | | | | This is mostly mechanical except for vmspace_exit(). There, use the new refcount_release_if_last() to avoid switching to vmspace0 unless other processes are sharing the vmspace. In that case, upon switching to vmspace0 we can unconditionally release the reference. Remove the volatile qualifier from vm_refcnt now that accesses are protected using refcount(9) KPIs. Reviewed by: alc, kib, mmel MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27057 Notes: svn path=/head/; revision=367334
* Conditionally compile struct vm_phys_seg's md_first field. This field is | Alan Cox | 2020-10-23 | 1 | -0/+2

  only used by arm64's pmap.

  Reviewed by: kib, markj, scottph
  Differential Revision: https://reviews.freebsd.org/D26907
  Notes: svn path=/head/; revision=366960
* uma: fix KTR message after r366840Ed Maste2020-10-191-1/+1
| | | | | | | | Reported by: bz Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=366847
* uma: Respect uk_reserve in keg_drain()Mark Johnston2020-10-191-27/+64
| | | | | | | | | | | | | | | | | When a reserve of free items is configured for a zone, the reserve must not be reclaimed under memory pressure. Modify keg_drain() to simply respect the reserved pool. While here remove an always-false uk_freef == NULL check (kegs that shouldn't be drained should set _NOFREE instead), and make sure that the keg_drain() KTR statement does not reference an uninitialized variable. Reviewed by: alc, rlibby Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26772 Notes: svn path=/head/; revision=366840
* uma: Avoid depleting keg reserves when filling a bucketMark Johnston2020-10-191-5/+13
| | | | | | | | | | | | | | | | | | | | | | zone_import() fetches a free or partially free slab from the keg and then uses its items to populate an array, typically filling a bucket. If a single allocation causes the keg to drop below its minimum reserve, the inner loop ends. However, if the bucket is still not full and M_USE_RESERVE is specified, the outer loop will continue to fetch items from the keg. If M_USE_RESERVE is specified and the number of free items is below the reserved limit, we should return only a single item. Otherwise, if the bucket size is larger than the reserve, all of the reserved items may end up in a single per-CPU bucket, invisible to other CPUs. Reviewed by: rlibby MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26771 Notes: svn path=/head/; revision=366839
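The rule described above (stop at the reserve, and hand out only one item when M_USE_RESERVE dips into it) can be modelled in a few lines. This is a simplified sketch, not zone_import() itself: it ignores slabs and domains, the function and parameter names are hypothetical, and treating "at or below the reserve" as the threshold is an assumption.

```python
def zone_import(free_items, reserve, bucket_max, use_reserve):
    """Simulate one bucket fill; return (items_taken, free_items_left).

    free_items:  items currently free in the keg
    reserve:     the keg's configured reserve
    bucket_max:  how many items the caller would like
    use_reserve: models the M_USE_RESERVE flag
    """
    taken = 0
    while taken < bucket_max and free_items > 0:
        if free_items <= reserve:
            # At or below the reserve: only M_USE_RESERVE callers may
            # continue, and then only for a single item.
            if not use_reserve or taken > 0:
                break
        free_items -= 1
        taken += 1
    return taken, free_items


print(zone_import(4, reserve=4, bucket_max=8, use_reserve=True))
print(zone_import(10, reserve=4, bucket_max=8, use_reserve=True))
print(zone_import(4, reserve=4, bucket_max=8, use_reserve=False))
```

The first call shows the point of the change: a caller allowed to use the reserve gets one item rather than draining the whole reserve into a single per-CPU bucket.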
* Avoid dump_avail[] redefinition.Konstantin Belousov2020-10-145-67/+102
| | | | | | | | | | | | | Move dump_avail[] extern declaration and inlines into a new header vm/vm_dumpset.h. This fixes default gcc build for mips. Reviewed by: alc, scottph Tested by: kevans (previous version) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26741 Notes: svn path=/head/; revision=366711
* Use unlocked page lookup for inmem() to avoid object lock contentionBryan Drewery2020-10-092-0/+16
| | | | | | | | | | Reviewed By: kib, markj Submitted by: mlaier Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D26653 Notes: svn path=/head/; revision=366594
* vm_page_dump_index_to_pa(): Add braces to the expression involving + and &. | Konstantin Belousov | 2020-10-08 | 1 | -1/+1

  The precedence of the '&' operator is lower than that of '+'. The added braces change the order of evaluation into what is, in my opinion, the natural one. On the other hand, the value of the expression should not change, since all elements should have page-aligned values.

  This fixes a reported gcc warning.

  Reported by: adrian
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=366552
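The precedence point can be demonstrated outside the kernel. The sketch below uses Python, which, like C, ranks '+' above '&'. The operands a and b are hypothetical and do not reproduce the actual expression in vm_page_dump_index_to_pa(); PAGE_MASK assumes 4 KiB pages.

```python
PAGE_MASK = 0xFFF  # assumes 4 KiB pages


def unparenthesized(a, b):
    # '+' binds tighter than '&', so this evaluates (a + b) & ~PAGE_MASK.
    return a + b & ~PAGE_MASK


def parenthesized(a, b):
    # Explicit grouping: mask b first, then add.
    return a + (b & ~PAGE_MASK)


# When a is page-aligned, both groupings give the same value.
print(hex(unparenthesized(0x3000, 0x5123)), hex(parenthesized(0x3000, 0x5123)))
# When a is not page-aligned, the grouping matters.
print(hex(unparenthesized(0x3800, 0x5900)), hex(parenthesized(0x3800, 0x5900)))
```

This is consistent with the message's remark that the value should not change as long as the operands are page-aligned, while the explicit grouping silences the compiler warning.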
* vm_pageout: Avoid rounding down the inactive scan targetMark Johnston2020-10-021-7/+8
| | | | | | | | | | | | | | | | | | | | | | | With helper page daemon threads, enabled by default in r364786, we divide the inactive target by the number of threads, rounding down, and sum the total number of pages freed by the threads. This sum is compared with the original target, but by rounding down we might lose pages, causing the page daemon control loop to conclude that inactive queue scanning isn't keeping up with demand for free pages. Typically this results in excessive swapping. Fix the problem by accounting for the error in the main pagedaemon thread's target. Note that by default the problem will manifest only in systems with >16 CPUs in a NUMA domain. Reviewed by: cem Discussed with: dougm Reported and tested by: dhw, glebius Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26610 Notes: svn path=/head/; revision=366380
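The rounding loss and its correction can be shown with plain integer arithmetic. This is a minimal sketch of the idea, assuming the main thread simply absorbs the remainder; the function name is hypothetical and the real page daemon code is more involved.

```python
def split_scan_target(target, nthreads):
    """Return per-thread inactive scan targets.

    Index 0 stands for the main page daemon thread, which absorbs the
    remainder lost by rounding down; the rest are helper threads.
    """
    share = target // nthreads
    remainder = target - share * nthreads
    return [share + remainder] + [share] * (nthreads - 1)


target, nthreads = 1000, 3
naive_total = (target // nthreads) * nthreads  # what rounding down yields
print(naive_total)
print(split_scan_target(target, nthreads), sum(split_scan_target(target, nthreads)))
```

With the remainder assigned to one thread, the per-thread targets sum to the original target, so the control loop no longer sees a shortfall caused purely by division.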
* uma: Use the bucket cache for cross-domain allocationsMark Johnston2020-10-021-5/+49
| | | | | | | | | | | | | | | | | | | | | | | | uma_zalloc_domain() allocates from the requested domain instead of following a first-touch policy (the default for most zones). Currently it is only used by malloc_domainset(), and consumers free returned items with free(9) since r363834. Previously uma_zalloc_domain() worked by always going to the keg for an item. As a result, the use of UMA zone caches was unbalanced: we free items to the caches, but always allocate from the keg, skipping the caches. Make some effort to allocate from the UMA caches when performing a cross-domain allocation. This avoids blowing up the caches when something is performing many transient allocations with malloc_domainset(). Reported and tested by: dhw, glebius Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26427 Notes: svn path=/head/; revision=366379
* uma: Use LIFO for non-SMR bucket cachesMark Johnston2020-10-021-1/+9
| | | | | | | | | | | | | | | | When SMR was introduced, zone_put_bucket() was changed to always place full buckets at the end of the queue. However, it is generally preferable to use recently used buckets since their items are more likely to be resident in cache. So, for buckets that have no constraint on item reuse, use a last-in-first-out ordering as we did before. Reviewed by: rlibby Tested by: dhw, glebius Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26426 Notes: svn path=/head/; revision=366378
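The ordering choice can be modelled with a double-ended queue. This is an illustrative sketch only: the function names are hypothetical stand-ins, and the assumption that allocation always takes from the head of the queue is mine, not stated in the message.

```python
from collections import deque


def put_bucket(cache, bucket, smr):
    """Insert a full bucket into the zone's bucket cache (a deque)."""
    if smr:
        # SMR zones constrain item reuse, so new buckets go to the tail.
        cache.append(bucket)
    else:
        # No reuse constraint: put the most recently used bucket at the
        # head so its (likely cache-resident) items are reused first.
        cache.appendleft(bucket)


def get_bucket(cache):
    """Take the bucket at the head of the cache, or None if it is empty."""
    return cache.popleft() if cache else None


lifo, fifo = deque(), deque()
for name in ("old", "new"):
    put_bucket(lifo, name, smr=False)
    put_bucket(fifo, name, smr=True)
print(get_bucket(lifo), get_bucket(fifo))
```

In the model the non-SMR cache returns the bucket that was freed last, while the SMR cache returns the oldest one.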
* uma: Remove newlines from panic messagesMark Johnston2020-10-021-10/+10
| | | | | | | Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=366377
* Implement sparse core dumpsMark Johnston2020-10-022-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we allocate and map zero-filled anonymous pages when dumping core. This can result in lots of needless disk I/O and page allocations. This change tries to make the core dumper more clever and represent unbacked ranges of virtual memory by holes in the core dump file. Add a new page fault type, VM_FAULT_NOFILL, which causes vm_fault() to clean up and return an error when it would otherwise map a zero-filled page. Then, in the core dumper code, prefault all user pages and handle errors by simply extending the size of the core file. This also fixes a bug related to the fact that vn_io_fault1() does not attempt partial I/O in the face of errors from vm_fault_quick_hold_pages(): if a truncated file is mapped into a user process, an attempt to dump beyond the end of the file results in an error, but this means that valid pages immediately preceding the end of the file might not have been dumped either. The change reduces the core dump size of trivial programs by a factor of ten simply by excluding unaccessed libc.so pages. PR: 249067 Reviewed by: kib Tested by: pho MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26590 Notes: svn path=/head/; revision=366368
* Flag vm_reserv and vm_phys sysctls as MPSAFE.Mark Johnston2020-09-232-4/+4
| | | | | | | | | Nothing in these subsystems relies on Giant. MFC after: 1 week Notes: svn path=/head/; revision=366091
* Add a vmparam.h constant indicating pmap support for large pages.Mark Johnston2020-09-231-0/+2
| | | | | | | | | | | Enable SHM_LARGEPAGE support on arm64. Reviewed by: alc, kib Sponsored by: Juniper Networks, Inc., Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26467 Notes: svn path=/head/; revision=366090
* arm64/pmap: Sparsify pv_table | D Scott Phillips | 2020-09-21 | 1 | -0/+1
| | | | | | | | | | | Reviewed by: markj, kib Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26132 Notes: svn path=/head/; revision=365981
* vm_reserv: Sparsify the vm_reserv_array when VM_PHYSSEG_SPARSE | D Scott Phillips | 2020-09-21 | 2 | -16/+58

  On an Ampere Altra system, the physical memory is populated sparsely within the physical address space, with only about 0.4% of physical addresses backed by RAM in the range [0, last_pa].

  This is causing the vm_reserv_array to be over-sized by a few orders of magnitude, wasting roughly 5 GiB on a system with 256 GiB of RAM.

  The sparse allocation of vm_reserv_array is controlled by defining VM_PHYSSEG_SPARSE, with the dense allocation still remaining for platforms with VM_PHYSSEG_DENSE.

  Reviewed by: markj, alc, kib
  Approved by: scottl (implicit)
  MFC after: 1 week
  Sponsored by: Ampere Computing, Inc.
  Differential Revision: https://reviews.freebsd.org/D26130
  Notes: svn path=/head/; revision=365980
* Sparsify the vm_page_dump bitmap | D Scott Phillips | 2020-09-21 | 3 | -7/+53
| | | | | | | | | | | | | | | | | | | On Ampere Altra systems, the sparse population of RAM within the physical address space causes the vm_page_dump bitmap to be much larger than necessary, increasing the size from ~8 Mib to > 2 Gib (and overflowing `int` for the size). Changing the page dump bitmap also changes the minidump file format, so changes are also necessary in libkvm. Reviewed by: jhb Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26131 Notes: svn path=/head/; revision=365978
* Move vm_page_dump bitset array definition to MI code | D Scott Phillips | 2020-09-21 | 2 | -5/+31

  These definitions were repeated by all architectures, with small variations. Consolidate the common definitions in machine independent code and use bitset(9) macros for manipulation. Many opportunities for deduplication remain in the machine dependent minidump logic. The only intended functional change is increasing the bit index type to vm_pindex_t, allowing the indexing of pages with address of 8 TiB and greater.

  Reviewed by: kib, markj
  Approved by: scottl (implicit)
  MFC after: 1 week
  Sponsored by: Ampere Computing, Inc.
  Differential Revision: https://reviews.freebsd.org/D26129
  Notes: svn path=/head/; revision=365977
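The 8 TiB figure is consistent with a simple calculation, sketched below under two assumptions that the message does not state: that the previous bit index was a signed 32-bit int, and that pages are 4 KiB.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages
INT_INDEX_BITS = 31  # assumption: old index was a signed 32-bit int
TIB = 1 << 40


def max_indexable_bytes(index_bits, page_size=PAGE_SIZE):
    """Bytes of physical address space coverable by a page bitmap whose
    bit index has `index_bits` usable bits."""
    return (1 << index_bits) * page_size


print(max_indexable_bytes(INT_INDEX_BITS) // TIB)
```

Under those assumptions a 31-bit page index runs out at exactly 8 TiB, which is why a wider index type is needed for pages at or above that address.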
* vm_ooffset_t is now unsignedEric van Gyzen2020-09-181-3/+0
| | | | | | | | | | | | | vm_ooffset_t is now unsigned. Remove some tests for negative values, or make other adjustments accordingly. Reported by: Coverity Reviewed by: kib markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D26214 Notes: svn path=/head/; revision=365886
* Increase the default vm.max_user_wired value.Mark Johnston2020-09-171-2/+7
| | | | | | | | | | | | | | | | | | | | Since r347532 (merged to stable/12) we only count user-wired pages towards the system limit. However, we now also treat pages wired by hypervisors (bhyve and virtualbox) as user-wired, so starting VMs with large amounts of RAM tends to fail due to the low limit. The purpose of the limit is to provide a seatbelt, not to impose some policy on the use of wired memory. Thus, increase the default limit to allow reasonable VM configurations to work without tuning. Reviewed by: kib Discussed with: dougm MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26424 Notes: svn path=/head/; revision=365841
* Support for userspace non-transparent superpages (largepages). | Konstantin Belousov | 2020-09-09 | 2 | -7/+82

  Created with shm_open2(SHM_LARGEPAGE) and then configured with the FIOSSHMLPGCNF ioctl, largepage POSIX shared memory objects guarantee that all userspace mappings of them are served by non-managed superpage mappings.

  Only amd64 is supported for now. Both 2M and 1G superpages can be requested; the latter requires a CPU feature.

  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D24652
  Notes: svn path=/head/; revision=365522
* vm_map: Add a map entry kind that can only be clipped at specific boundary. | Konstantin Belousov | 2020-09-09 | 2 | -61/+188

  The entries and their clip boundaries must be aligned on supported superpage sizes from pagesizes[]. vm_map operations return the Mach error KERN_INVALID_ARGUMENT, which is usually translated to EINVAL, if they would require clipping an entry somewhere other than at the boundary.

  In other words, such entries force the superpage properties of their virtual addresses to be preserved.

  Reviewed by: markj
  Tested by: pho
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Differential revision: https://reviews.freebsd.org/D24652
  Notes: svn path=/head/; revision=365520
* Add pmap_enter(9) PMAP_ENTER_LARGEPAGE flag and implement it on amd64.Konstantin Belousov2020-09-091-0/+1
| | | | | | | | | | | | | | | | | | The flag requests entry of non-managed superpage mapping of size pagesizes[psind] into the page table. Pmap supports fake wiring of the largepage mappings. Only attributes of the largepage mapping can be changed by calling pmap_enter(9) over existing mapping, physical address of the page must be unchanged. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365518
* Add vm_map_find_aligned(9).Konstantin Belousov2020-09-092-0/+15
| | | | | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365516
* Move MAP_32BIT_MAX_ADDR definition to sys/mman.h.Konstantin Belousov2020-09-091-2/+0
| | | | | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365515
* Prepare to handle non-trivial errors from vm_map_delete().Konstantin Belousov2020-09-093-8/+14
| | | | | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365513
* Allow consumer to customize physical pager.Konstantin Belousov2020-09-094-10/+98
| | | | | | | | | | | | | | | | | | | | | | Add support for user-supplied callbacks into phys pager operations, providing custom getpages(), haspage(), and populate() methods implementations. Pager stores user data ptr/val in the object to provide context. Add phys_pager_allocate() helper that takes user ops table as one of the arguments. Current code for these methods is moved to the 'default' ops table, assigned automatically when vm_pager_alloc() is used. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365488
* Add kern_mmap_racct_check(), a helper to verify limits in vm_mmap*().Konstantin Belousov2020-09-081-28/+37
| | | | | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365486
* Add interruptible variant of vm_wait(9), vm_wait_intr(9).Konstantin Belousov2020-09-086-24/+41
| | | | | | | | | | | | | Also add msleep flags argument to vm_wait_doms(9). Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365484
* vm_object_split(): Handle orig_object type changes.Mark Johnston2020-09-071-3/+17
| | | | | | | | | | | | | | | | | | | | | orig_object->type can change from OBJT_DEFAULT to OBJT_SWAP while vm_object_split() is sleeping. In this case some pages in new_object may be left unbusied, but vm_object_split() attempts to unbusy all of them. Track the beginning of the busied range. Add an assertion to verify that pages are not re-added to the source object while sleeping. Reported by: Olympios Petrakis <olympios.petrakis@netapp.com> Reviewed by: alc, kib Tested by: pho MFC after: 1 week Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26223 Notes: svn path=/head/; revision=365437
* Avoid unnecessary object locking in vm_page_grab_pages_unlocked().Mark Johnston2020-09-021-4/+5
| | | | | | | | | | | | | | | We were needlessly acquiring the object lock to call vm_page_grab_pages() even when all of the requested pages were looked up locklessly. Fix that, stop testing for count == 0 in vm_page_grab_pages(), and add assertions to help catch this kind of mistake. Reported by: cem Reviewed by: alc, cem, dougm, jeff Differential Revision: https://reviews.freebsd.org/D26304 Notes: svn path=/head/; revision=365275
* Include the psind in data returned by mincore(2).Mark Johnston2020-09-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | Currently we use a single bit to indicate whether the virtual page is part of a superpage. To support a forthcoming implementation of non-transparent 1GB superpages, it is useful to provide more detailed information about large page sizes. The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind) values, indicating a mapping of size psind, where psind is an index into the pagesizes array returned by getpagesizes(3), which in turn comes from the hw.pagesizes sysctl. MINCORE_PSIND(1) is equal to the old value of MINCORE_SUPER. For now, two bits are used to record the page size, permitting values of MAXPAGESIZES up to 4. Reviewed by: alc, kib Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26238 Notes: svn path=/head/; revision=365267
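The encoding described above can be sketched as a two-bit field. The numeric values below (a mask of 0x60 and a shift of 5) are assumptions chosen so that MINCORE_PSIND(1) equals 0x20, which I believe was the old MINCORE_SUPER value; check sys/mman.h before relying on them. psind_of() is a hypothetical helper, not part of any API.

```python
MINCORE_INCORE = 0x01  # assumed value of the "page is resident" bit
MINCORE_SUPER = 0x60   # assumed: now a two-bit mask for the page size index
PSIND_SHIFT = 5        # assumed position of the field


def MINCORE_PSIND(psind):
    """Encode a pagesizes[] index into the mincore(2) status byte field."""
    return (psind << PSIND_SHIFT) & MINCORE_SUPER


def psind_of(vec_byte):
    """Hypothetical helper: recover the pagesizes[] index from a status byte."""
    return (vec_byte & MINCORE_SUPER) >> PSIND_SHIFT


status = MINCORE_INCORE | MINCORE_PSIND(2)
print(hex(MINCORE_PSIND(1)), psind_of(status))
```

Two bits give four distinct indices (0 through 3), matching the message's statement that values of MAXPAGESIZES up to 4 are permitted.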
* vm: clean up empty lines in .c and .h filesMateusz Guzik2020-09-0118-32/+6
| | | | Notes: svn path=/head/; revision=365074
* LinuxKPI: Implement ksize() function.Vladimir Kondratyev2020-08-292-0/+8
| | | | | | | | | | | | | | | In Linux, ksize() gets the actual amount of memory allocated for a given object. This commit adds malloc_usable_size() to FreeBSD KPI which does the same. It also maps LinuxKPI ksize() to newly created function. ksize() function is used by drm-kmod. Reviewed by: hselasky, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D26215 Notes: svn path=/head/; revision=364964