path: root/sys/vm/vm_radix.c
Commit message | Author | Age | Files | Lines
* vm: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -1/+0
    Notes: svn path=/head/; revision=365074
* kernel: provide panicky version of __unreachable | Kyle Evans | 2020-05-13 | 1 | -1/+1
    __builtin_unreachable doesn't raise any compile-time warnings/errors on its own, so problems with its usage can't be easily detected. While it would be nice for this situation to change and compilers to at least add a warning for trivial cases where local state means the instruction can't be reached, this isn't the case at the moment and likely will not happen.

    This commit adds an __assert_unreachable, whose intent is incredibly clear: it asserts that this instruction is unreachable. On INVARIANTS builds, it's a panic(), and on non-INVARIANTS it expands to __unreachable().

    Existing users of __unreachable() are converted to __assert_unreachable, to improve debuggability if this assumption is violated.

    Reviewed by: mjg
    Differential Revision: https://reviews.freebsd.org/D23793
    Notes: svn path=/head/; revision=361011
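The behaviour described in this commit can be sketched in user space. This is only an illustration, not the kernel macro: `DEBUG_CHECKS` is a hypothetical stand-in for the INVARIANTS option, `abort()` stands in for panic(), and `ASSERT_UNREACHABLE` and `sign_of` are made-up names.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-ins: DEBUG_CHECKS plays the role of INVARIANTS and
 * abort() plays the role of panic().  Without DEBUG_CHECKS the macro is
 * only a hint to the compiler (GCC/Clang builtin).
 */
#ifdef DEBUG_CHECKS
#define ASSERT_UNREACHABLE() do {                                       \
        fprintf(stderr, "unreachable code reached at %s:%d\n",          \
            __FILE__, __LINE__);                                        \
        abort();                                                        \
} while (0)
#else
#define ASSERT_UNREACHABLE() __builtin_unreachable()
#endif

/* The three tests cover every int, so the last line cannot be reached. */
static int
sign_of(int x)
{
        if (x < 0)
                return (-1);
        if (x == 0)
                return (0);
        if (x > 0)
                return (1);
        ASSERT_UNREACHABLE();
}
```

Building with `-DDEBUG_CHECKS` turns a violated assumption into a loud failure with a file and line, instead of undefined behaviour.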
* Move SMR pointer type definition and access macros to smr_types.h. | Mark Johnston | 2020-03-07 | 1 | -1/+2
    The intent is to provide a header that can be included by other headers without introducing too much pollution. smr.h depends on various headers and will likely grow over time, but is less likely to be required by system headers.

    Rename SMR_TYPE_DECLARE() to SMR_POINTER():
    - One might use SMR to protect more than just pointers; it could be used for resizeable arrays, for example, so TYPE seems too generic.
    - It is useful to be able to define anonymous SMR-protected pointer types and the _DECLARE suffix makes that look wrong.

    Reviewed by: jeff, mjg, rlibby
    Sponsored by: The FreeBSD Foundation
    Differential Revision: https://reviews.freebsd.org/D23988
    Notes: svn path=/head/; revision=358728
* vm_radix: prefer __builtin_unreachable() to an unreachable panic() | Kyle Evans | 2020-02-22 | 1 | -2/+1
    This provides the needed hint to GCC and offers an annotation for readers to observe that it's in fact impossible to hit this point. We'll get hit with a -Wswitch error if the enum applicable to the switch above were to get expanded without the new value(s) being handled.

    Notes: svn path=/head/; revision=358248
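A minimal sketch of the pattern this commit describes: a switch that handles every enumerator, followed by `__builtin_unreachable()`. The enum, its enumerator names and `slot_load()` are hypothetical and are not the identifiers used in vm_radix.c.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical access modes; the names are illustrative only. */
enum access_mode { ACCESS_UNSERIALIZED, ACCESS_LOCKED, ACCESS_ACQUIRE };

/*
 * Load a child pointer with the ordering implied by 'mode'.  When the
 * caller passes a constant, the compiler can fold the switch away.
 */
static inline void *
slot_load(void *_Atomic *slot, enum access_mode mode)
{
        switch (mode) {
        case ACCESS_UNSERIALIZED:
        case ACCESS_LOCKED:
                return (atomic_load_explicit(slot, memory_order_relaxed));
        case ACCESS_ACQUIRE:
                return (atomic_load_explicit(slot, memory_order_acquire));
        }
        /* Every enumerator returns above; this only silences GCC. */
        __builtin_unreachable();
}
```

Because the switch has no `default:` label, adding a fourth enumerator without a matching `case` is reported by `-Wswitch`, which is the safety net the commit message relies on.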
* Silence a gcc warning about no return from a function that handles every | Jeff Roberson | 2020-02-19 | 1 | -0/+2
    possible enum in a switch statement. I verified that this emits nothing as expected on clang. radix relies on constant propagation to eliminate any branching from these access routines.

    Reported by: lwhsu/tinderbox
    Notes: svn path=/head/; revision=358133
* Use SMR to provide a safe unlocked lookup for vm_radix. | Jeff Roberson | 2020-02-19 | 1 | -117/+208
    The tree is kept correct for readers with store barriers and careful ordering. The existing object lock serializes writers. Consumers will be introduced in later commits.

    Reviewed by: markj, kib
    Differential Revision: https://reviews.freebsd.org/D23446
    Notes: svn path=/head/; revision=358130
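The "store barriers and careful ordering" idea can be illustrated with C11 atomics. This sketch is not SMR and is not the vm_radix code: it omits safe memory reclamation entirely and shows only why a writer must finish initializing a node before publishing the pointer. `struct node`, `slot`, `publish()` and `lookup()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
        unsigned long key;
        int value;
};

/* Shared slot inspected by readers that take no lock. */
static struct node *_Atomic slot;

/* Writer side; assumed to be serialized by an external lock. */
static void
publish(struct node *n, unsigned long key, int value)
{
        n->key = key;
        n->value = value;
        /* Release: the field stores above are visible before the pointer. */
        atomic_store_explicit(&slot, n, memory_order_release);
}

/* Reader side: returns 1 and fills *valuep if 'key' is present. */
static int
lookup(unsigned long key, int *valuep)
{
        struct node *n;

        /* Acquire pairs with the release store in publish(). */
        n = atomic_load_explicit(&slot, memory_order_acquire);
        if (n == NULL || n->key != key)
                return (0);
        *valuep = n->value;
        return (1);
}
```

A reader that observes the new pointer is guaranteed to observe the initialized fields as well; deciding when an unlinked node may be freed is the part that SMR adds and that this sketch leaves out.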
* vm: stop passing M_ZERO when allocating radix nodes | Mateusz Guzik | 2018-06-24 | 1 | -2/+12
    Allocation explicitly initializes the 3 leading fields. The rest is an array which is supposed to be NULL-ed prior to deallocation. Delegate zeroing to the infrequently called object initializer.

    This gets rid of one of the most common memset consumers.

    Reviewed by: markj
    Differential Revision: https://reviews.freebsd.org/D15989
    Notes: svn path=/head/; revision=335600
* Fix boot_pages calculation for machines that don't have UMA_MD_SMALL_ALLOC. | Gleb Smirnoff | 2018-02-06 | 1 | -4/+3
    o Call uma_startup1() after initializing kmem, vmem and domains.
    o Include eight VM startup pages into uma_startup_count() calculation.
    o Account for vmem_startup() and vm_map_startup() preallocating pages.
    o Account for extra two allocations done by kmem_init() and vmem_create().
    o Hardcode the place of execution of vm_radix_reserve_kva(). Using SYSINIT allowed several other SYSINITs to sneak in before it, thus bumping requirement for amount of boot pages.

    Notes: svn path=/head/; revision=328952
* sys: general adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-27 | 1 | -1/+3
    Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.

    The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.

    No functional change intended.

    Notes: svn path=/head/; revision=326272
* Replace many instances of VM_WAIT with blocking page allocation flags | Jeff Roberson | 2017-11-08 | 1 | -0/+6
    similar to the kernel memory allocator.

    This simplifies NUMA allocation because the domain will be known at wait time and races between failure and sleeping are eliminated. This also reduces boilerplate code and simplifies callers.

    A wait primitive is supplied for uma zones for similar reasons. This eliminates some non-specific VM_WAIT calls in favor of more explicit sleeps that may be satisfied without new pages.

    Reviewed by: alc, kib, markj
    Tested by: pho
    Sponsored by: Netflix, Dell/EMC Isilon
    Notes: svn path=/head/; revision=325530
* Add pctrie_init() and vm_radix_init() to initialize generic pctrie and | Konstantin Belousov | 2017-07-19 | 1 | -1/+1
    vm_radix trie. Existing vm_radix_init() function is renamed to vm_radix_zinit(). Inlines moved out of the _ headers.

    Reviewed by: alc, markj (previous version)
    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week
    Differential revision: https://reviews.freebsd.org/D11661
    Notes: svn path=/head/; revision=321247
* Previously, vm_radix_remove() would panic if the radix trie didn't | Alan Cox | 2016-12-08 | 1 | -9/+9
    contain a vm_page_t at the specified index. However, with this change, vm_radix_remove() no longer panics. Instead, it returns NULL if there is no vm_page_t at the specified index. Otherwise, it returns the vm_page_t. The motivation for this change is that it simplifies the use of radix tries in the amd64, arm64, and i386 pmap implementations. Instead of performing a lookup before every remove, the pmap can simply perform the remove.

    Reviewed by: kib, markj
    Differential Revision: https://reviews.freebsd.org/D8708
    Notes: svn path=/head/; revision=309703
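The interface change can be sketched with a deliberately simple stand-in container. A fixed array replaces the radix trie here, and `struct table`, `table_remove()` and `drop_page()` are hypothetical names; only the "return NULL instead of panicking" contract mirrors the commit message.

```c
#include <stddef.h>

#define NSLOTS 8

struct page { int id; };

/* Stand-in for the trie: one optional page per index. */
struct table { struct page *slot[NSLOTS]; };

/* Remove and return the page at 'index', or NULL if none is stored. */
static struct page *
table_remove(struct table *t, size_t index)
{
        struct page *p;

        if (index >= NSLOTS)
                return (NULL);
        p = t->slot[index];
        t->slot[index] = NULL;
        return (p);
}

/* Caller: one call replaces the older lookup-then-remove sequence. */
static int
drop_page(struct table *t, size_t index)
{
        struct page *p;

        p = table_remove(t, index);
        if (p == NULL)
                return (0);     /* nothing was mapped at this index */
        /* ... release resources tied to 'p' here ... */
        return (1);
}
```

With a panicking remove, `drop_page()` would first have to look the index up to prove the page exists, walking the structure twice for every removal.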
* Eliminate a stale comment; vm_radix_prealloc() was replaced in r254141. | Alan Cox | 2016-12-02 | 1 | -2/+0
    MFC after: 3 days
    Notes: svn path=/head/; revision=309416
* During vm_page_cache()'s call to vm_radix_insert(), if vm_page_alloc() was | Alan Cox | 2016-12-01 | 1 | -55/+2
    called to allocate a new page of radix trie nodes, there could be a call to vm_radix_remove() on the same trie (of PG_CACHED pages) as the in-progress vm_radix_insert(). With the removal of PG_CACHED pages, we can simplify vm_radix_insert() and vm_radix_remove() by removing the flags on the root of the trie that were used to detect this case and the code for restarting vm_radix_insert() when it happened.

    Reviewed by: kib, markj
    Tested by: pho
    Sponsored by: Dell EMC Isilon
    Differential Revision: https://reviews.freebsd.org/D8664
    Notes: svn path=/head/; revision=309365
* Clean up redundant parentheses from existing howmany()/roundup() macro uses. | Pedro F. Giffuni | 2016-04-22 | 1 | -1/+1
    Notes: svn path=/head/; revision=298482
* Pull in r267961 and r267973 again. Fix for issues reported will follow. | Hans Petter Selasky | 2014-06-28 | 1 | -1/+1
    Notes: svn path=/head/; revision=267992
* Revert r267961, r267973: | Glen Barber | 2014-06-27 | 1 | -1/+1
    These changes prevent sysctl(8) from returning proper output, such as:

    1) no output from sysctl(8)
    2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
       truss: can not get etype: Cannot allocate memory

    Notes: svn path=/head/; revision=267985
* Extend the meaning of the CTLFLAG_TUN flag to automatically check if | Hans Petter Selasky | 2014-06-27 | 1 | -1/+1
    there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable.

    The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, since some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path.

    The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.

    Other changes:
    - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
    - Removed redundant TUNABLE statements throughout the kernel.
    - Some minor code rewrites in connection to removing unneeded TUNABLE statements.
    - Added a missing SYSCTL_DECL().
    - Wrapped two very long lines.
    - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() is not ready when sysctls from the sysctl dataset are registered.
    - Bumped FreeBSD version to indicate SYSCTL API change.

    MFC after: 2 weeks
    Sponsored by: Mellanox Technologies
    Notes: svn path=/head/; revision=267961
* Rename global cnt to vm_cnt to avoid shadowing. | Bryan Drewery | 2014-03-22 | 1 | -1/+1
    To reduce the diff, the struct pcpu.cnt field was not renamed, so PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in kvm(3) and vmstat(8). The goal was to not affect externally used KPI.

    Bump __FreeBSD_version in case some out-of-tree module/code relies on the global cnt variable.

    Exp-run revealed no ports using it directly.

    No objection from: arch@
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=263620
* Eliminate a redundant parameter to vm_radix_replace(). | Alan Cox | 2013-12-08 | 1 | -7/+5
    Improve the wording of the comment describing vm_radix_replace().

    Reviewed by: attilio
    MFC after: 6 weeks
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=259107
* Addendum to r254141: The call to vm_radix_insert() in vm_page_cache() can | Alan Cox | 2013-08-23 | 1 | -0/+15
    reclaim the last preexisting cached page in the object, resulting in a call to vdrop(). Detect this scenario so that the vnode's hold count is correctly maintained. Otherwise, we panic.

    Reported by: scottl
    Tested by: pho
    Discussed with: attilio, jeff, kib
    Notes: svn path=/head/; revision=254719
* On all architectures, avoid preallocating the physical memory | Attilio Rao | 2013-08-09 | 1 | -48/+126
    for nodes used in vm_radix. On architectures supporting direct mapping, also avoid preallocating the KVA for such nodes.

    To do so, allow the operations derived from vm_radix_insert() to fail, and handle all of the resulting failures.

    On the vm_radix side, introduce a new function, vm_radix_replace(), which replaces a leaf node that is already present with a new one, and take into account the possibility that operations on the radix trie can recurse during allocation in vm_radix_insert(). If that happens, vm_radix_insert() starts again from scratch.

    Sponsored by: EMC / Isilon storage division
    Reviewed by: alc (older version)
    Reviewed by: jeff
    Tested by: pho, scottl
    Notes: svn path=/head/; revision=254141
* To reduce the amount of arithmetic performed in the various radix tree | Alan Cox | 2013-05-11 | 1 | -13/+12
    functions, reverse the numbering scheme for the levels. The highest numbered level in the tree now appears near the root instead of the leaves.

    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=250520
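One way such a renumbering can save arithmetic is in computing which slot of a node a key selects. The sketch below is an assumption about the general technique, not a copy of the committed code: `RADIX_WIDTH`, `RADIX_LIMIT` and both function names are hypothetical.

```c
#include <stdint.h>

#define RADIX_WIDTH 4                           /* bits per level (assumed) */
#define RADIX_MASK  ((1u << RADIX_WIDTH) - 1)
#define RADIX_LIMIT (64 / RADIX_WIDTH - 1)      /* highest level number */

/* Level 0 at the root: the shift needs a subtraction on every call. */
static unsigned
slot_root_is_zero(uint64_t index, unsigned level)
{
        return ((unsigned)((index >> ((RADIX_LIMIT - level) * RADIX_WIDTH)) &
            RADIX_MASK));
}

/* Level 0 at the leaves: the shift is the level times the width. */
static unsigned
slot_leaf_is_zero(uint64_t index, unsigned level)
{
        return ((unsigned)((index >> (level * RADIX_WIDTH)) & RADIX_MASK));
}
```

Both functions pick out the same digit of the key for corresponding levels; the second simply drops the `RADIX_LIMIT - level` term from every slot computation.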
* Remove a redundant call to panic() from vm_radix_keydiff(). The assertion | Alan Cox | 2013-05-07 | 1 | -4/+2
    before the loop accomplishes the same thing.

    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=250334
* Optimize vm_radix_lookup_ge() and vm_radix_lookup_le(). Specifically, | Alan Cox | 2013-05-04 | 1 | -103/+75
    change the way that these functions ascend the tree when the search for a matching leaf fails at an interior node. Rather than returning to the root of the tree and repeating the lookup with an updated key, maintain a stack of interior nodes that were visited during the descent and use that stack to resume the lookup at the closest ancestor that might have a matching descendant.

    Sponsored by: EMC / Isilon Storage Division
    Reviewed by: attilio
    Tested by: pho
    Notes: svn path=/head/; revision=250259
* Eliminate an unneeded call to vm_radix_trimkey() from vm_radix_lookup_le(). | Alan Cox | 2013-04-28 | 1 | -1/+0
    This call is clearing bits from the key that will be set again by the next line.

    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=250018
* Avoid some lookup restarts in vm_radix_lookup_{ge,le}(). | Alan Cox | 2013-04-27 | 1 | -22/+24
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249986
* Simplify vm_radix_{add,dec}lev(). | Alan Cox | 2013-04-22 | 1 | -8/+13
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249745
* When calculating the number of reserved nodes, discount the pages that will | Alan Cox | 2013-04-18 | 1 | -2/+9
    be used to store the nodes.

    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249605
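The commit message does not give the formula, so the following is only one plausible way to do the arithmetic, under the assumption that at most one node is needed per page that is not itself holding nodes. `reserved_nodes()` and its parameters are hypothetical names.

```c
#include <stdint.h>

/*
 * If n nodes are reserved, they occupy n * node_size bytes, and the pages
 * left over need at most one node each.  Requiring
 *     n * node_size + n * page_size <= page_count * page_size
 * gives n = page_count * page_size / (page_size + node_size).
 */
static uint64_t
reserved_nodes(uint64_t page_count, uint64_t page_size, uint64_t node_size)
{
        return ((page_count * page_size) / (page_size + node_size));
}
```

For 1000 pages of 4096 bytes and an assumed 144-byte node, this reserves 966 nodes rather than 1000, because some of the 1000 pages are consumed by the nodes themselves.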
* Although we perform path compression to reduce the height of the trie and | Alan Cox | 2013-04-15 | 1 | -26/+32
    the number of interior nodes, we have previously created a level zero interior node at the root of every non-empty trie, even when that node is not strictly necessary, i.e., it has only one child. This change is the second (and final) step in eliminating those unnecessary level zero interior nodes. Specifically, it updates the deletion and insertion functions so that they do not require a level zero interior node at the root of the trie. For a "buildworld" workload, this change results in a 16.8% reduction in the number of interior nodes allocated and a similar reduction in the average execution time for lookup functions. For example, the average execution time for a call to vm_radix_lookup_ge() is reduced by 22.9%.

    Reviewed by: attilio, jeff (an earlier version)
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249502
* Although we perform path compression to reduce the height of the trie and | Alan Cox | 2013-04-12 | 1 | -20/+33
    the number of interior nodes, we always create a level zero interior node at the root of every non-empty trie, even when that node is not strictly necessary, i.e., it has only one child. This change is the first step in eliminating those unnecessary level zero interior nodes. Specifically, it updates all of the lookup functions so that they do not require a level zero interior node at the root.

    Reviewed by: attilio, jeff (an earlier version)
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249427
* Micro-optimize the order of struct vm_radix_node's fields. Specifically, | Alan Cox | 2013-04-07 | 1 | -2/+2
    arrange for all of the fields to start at a short offset from the beginning of the structure.

    Eliminate unnecessary masking of VM_RADIX_FLAGS from the root pointer in vm_radix_getroot().

    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249221
* Simplify vm_radix_keybarr(). | Alan Cox | 2013-04-06 | 1 | -3/+1
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249211
* Simplify vm_radix_insert(). | Alan Cox | 2013-04-06 | 1 | -29/+8
    Reviewed by: attilio
    Tested by: pho
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249182
* Replace the remaining uses of vm_radix_node_page() by vm_radix_isleaf() and | Alan Cox | 2013-04-03 | 1 | -65/+67
    vm_radix_topage(). This transformation eliminates some unnecessary conditional branches from the inner loops of vm_radix_insert(), vm_radix_lookup{,_ge,_le}(), and vm_radix_remove().

    Simplify the control flow of vm_radix_lookup_{ge,le}().

    Reviewed by: attilio (an earlier version)
    Tested by: pho
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=249038
* Introduce vm_radix_isleaf() and use it in a couple places. As compared to | Alan Cox | 2013-03-26 | 1 | -2/+12
    using vm_radix_node_page() == NULL, the compiler is able to generate one less conditional branch when vm_radix_isleaf() is used. More use cases involving the inner loops of vm_radix_insert(), vm_radix_lookup{,_ge,_le}(), and vm_radix_remove() will follow.

    Reviewed by: attilio
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=248728
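A sketch of why a dedicated leaf test can be cheaper, assuming leaves are distinguished from interior nodes by a tag in the low bit of the pointer. `LEAF_FLAG` and all four helpers are hypothetical names chosen for the illustration, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define LEAF_FLAG ((uintptr_t)0x1)      /* hypothetical tag bit */

struct page { int id; };

/* Direct test: a single bit check on the pointer value. */
static inline int
is_leaf(const void *p)
{
        return (((uintptr_t)p & LEAF_FLAG) != 0);
}

/* Tag a page pointer so it can be stored where a child pointer goes. */
static inline void *
to_leaf(struct page *pg)
{
        return ((void *)((uintptr_t)pg | LEAF_FLAG));
}

/* Strip the tag to recover the page pointer. */
static inline struct page *
to_page(const void *p)
{
        return ((struct page *)((uintptr_t)p & ~LEAF_FLAG));
}

/* Older style: decode to a page or NULL, then let the caller test for NULL. */
static inline struct page *
node_page(const void *p)
{
        return (is_leaf(p) ? to_page(p) : NULL);
}
```

A caller written as `if (node_page(p) == NULL)` tests the tag inside the helper and then compares the result against NULL, whereas `if (!is_leaf(p))` needs only the tag test.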
* Micro-optimize the control flow in a few places. Eliminate a panic call | Alan Cox | 2013-03-24 | 1 | -8/+6
    that could never be reached in vm_radix_insert(). (If the pointer being checked by the panic call were ever NULL, the immediately preceding loop would have already crashed on a NULL pointer dereference.)

    Reviewed by: attilio (an earlier version)
    Sponsored by: EMC / Isilon Storage Division
    Notes: svn path=/head/; revision=248684
* Sync back vmcontention branch into HEAD: | Attilio Rao | 2013-03-18 | 1 | -0/+777
    Replace the per-object resident and cached pages splay tree with a path-compressed multi-digit radix trie. Along with this, also switch the x86-specific handling of idle page tables to using the radix trie.

    This change is supposed to do the following:
    - Allow the acquisition of read locks for lookup operations on the resident/cached page collections, as the per-vm_page_t splay iterators are now removed.
    - Increase the scalability of the operations on the page collections.

    The radix trie relies on the consumer's locking to ensure atomicity of its operations. In order to avoid deadlocks, the bisection nodes are pre-allocated in the UMA zone. This can be done safely because the algorithm needs at most one new node per insert, which means the maximum number of nodes required is the number of available physical frames. However, a new bisection node is not needed on every insert.

    The radix trie implements path compression because UFS indirect blocks can lead to several objects with a very sparse trie, increasing the number of levels that usually must be scanned. It also helps with node pre-fetching by introducing the single-node-per-insert property.

    This code is not generalized (yet) because of the possible loss of performance from making many of the sizes in play configurable. However, work to make this code more general and reusable by other consumers may follow.

    The only KPI change is the removal of the function vm_page_splay(), which is now reaped. The only KBI change, instead, is the removal of the left/right iterators from struct vm_page, which are now reaped.

    Further technical notes, broken into smaller pieces, can be retrieved from the svn branch: http://svn.freebsd.org/base/user/attilio/vmcontention/

    Sponsored by: EMC / Isilon storage division
    In collaboration with: alc, jeff
    Tested by: flo, pho, jhb, davide
    Tested by: ian (arm)
    Tested by: andreast (powerpc)
    Notes: svn path=/head/; revision=248449