path: root/sys/kern/kern_sx.c
* SCHEDULER_STOPPED(): Rely on a global variable (Olivier Certner, 2024-01-26; 1 file, -1/+1)

  A commit from 2012 (5d7380f8e34f0083, r228424) introduced 'td_stopsched'
  on the grounds that a global variable would cause all CPUs to have a copy
  of it in their cache, and consequently of all other variables sharing the
  same cache line. This is really a problem only if that cache line sees
  relatively frequent modifications. This was unlikely to be the case back
  then because nearby variables are almost never modified as well.

  In any case, today we have a new tool at our disposal to ensure that this
  variable goes into a read-mostly section containing frequently-accessed
  variables ('__read_frequently'). Most of the cache lines covering this
  section are likely to always be in every CPU cache. This makes the second
  reason stated in the commit message (ensuring the field is in the same
  cache line as some lock-related fields, since these are accessed in close
  proximity) moot, as well as the second-order effect of requiring an
  additional line to be present in the cache (the one containing the new
  'scheduler_stopped' boolean, see below).

  From a purely logical point of view, whether the scheduler is stopped is
  global state and certainly not a per-thread quality. Consequently, remove
  'td_stopsched', which immediately frees a byte in 'struct thread'.
  Currently, the latter's size (and layout) stays unchanged, but some of
  the later re-orderings will probably benefit from this removal. Available
  bytes at the original position of 'td_stopsched' have been made explicit
  with the addition of the '_td_pad0' member.

  Store the global state in the new 'scheduler_stopped' boolean, which is
  annotated with '__read_frequently'. Replace uses of SCHEDULER_STOPPED_TD()
  with SCHEDULER_STOPPED() and remove the former as it is now unnecessary.

  Reviewed by:    markj, kib
  Approved by:    markj (mentor)
  MFC after:      2 weeks
  Sponsored by:   The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D43572
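  A minimal sketch of the resulting arrangement, assuming declarations along
  the lines the message describes (the exact macro body is an assumption):

      /* Global flag in the frequently-read section, hot in every CPU cache. */
      __read_frequently bool scheduler_stopped;

      /* The macro now reads the global instead of td->td_stopsched. */
      #define SCHEDULER_STOPPED() __predict_false(scheduler_stopped)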
* sys: Automated cleanup of cdefs and other formatting (Warner Losh, 2023-11-27; 1 file, -1/+0)

  Apply the following automated changes to try to eliminate no-longer-needed
  sys/cdefs.h includes as well as now-empty blank lines in a row:

  Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
  Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
  Remove /\n+#if.*\n#endif.*\n+/
  Remove /^#if.*\n#endif.*\n/
  Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
  Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
  Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

  Sponsored by:   Netflix
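  As an illustration, a hypothetical file prologue matched by the first
  pattern; all three matched lines are deleted:

      #if defined(LIBC_SCCS) && !defined(lint)
      #endif
      #include <sys/cdefs.h>
      #include <sys/param.h>      /* kept: only the three lines above match */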
* sx: fixup copy pasto in previous (Mateusz Guzik, 2023-10-23; 1 file, -1/+1)

  Spotted by:     glebius
  Sponsored by:   Rubicon Communications, LLC ("Netgate")

* sx: unset td_wantedlock around going to sleep (Mateusz Guzik, 2023-10-23; 1 file, -0/+14)

  Otherwise it can crash in sleepq_wait_sig -> sleepq_catch_signals ->
  sig_ast_checksusp -> thread_suspend_check due to a mutex acquire.

  Reported by:    pho
  Sponsored by:   Rubicon Communications, LLC ("Netgate")
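  A rough sketch of the fix as described; the placement and the surrounding
  code are assumptions, not the actual diff:

      /* Clear td_wantedlock before blocking: the signal-catching path in
         sleepq_wait_sig() may itself acquire a mutex. */
      curthread->td_wantedlock = NULL;
      sleepq_wait_sig(&sx->lock_object, 0);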
* thread: add td_wantedlock (Mateusz Guzik, 2023-10-22; 1 file, -0/+6)

  This enables obtaining information about the lock a thread is actively
  waiting for while sampling. Without the change one would only see a bunch
  of calls to lock_delay(), where the stacktrace often does not reveal what
  the lock might be. Note this is not the same as lock profiling, which
  only produces data for cases which wait for locks.

  struct thread already has a td_lockname field, but I did not use it
  because it has different semantics -- it denotes that the thread is off
  cpu. At the same time it could not be converted to hold a lock_object
  pointer because non-curthread access would no longer be guaranteed to be
  safe -- by the time it reads the pointer the lock might have been taken,
  released, and the object containing it freed.

  Sample usage with dtrace:

  rm /tmp/out.kern_stacks ; dtrace -x stackframes=100 -n 'profile-997 { @[curthread->td_wantedlock != NULL ? stringof(curthread->td_wantedlock->lo_name) : stringof("\n"), stack()] = count(); }' -o /tmp/out.kern_stacks

  This also facilitates addition of lock information to traces produced by
  hwpmc. Note: spinlocks are not supported at the moment.

  Sponsored by:   Rubicon Communications, LLC ("Netgate")
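  A hedged sketch of how the field might be maintained around an sx spin
  loop; 'x', 'tid' and 'lda' are assumed to be set up as in the existing
  slow path, and everything besides td_wantedlock, lock_delay() and lo_name
  is illustrative:

      struct thread *td = curthread;

      td->td_wantedlock = &sx->lock_object;   /* visible to samplers */
      for (;;) {
              if (atomic_fcmpset_acq_ptr(&sx->sx_lock, &x, tid))
                      break;                  /* lock acquired */
              lock_delay(&lda);               /* sampled stacks now carry lo_name */
      }
      td->td_wantedlock = NULL;               /* clear before returning */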
* sys: Remove $FreeBSD$: one-line .c pattern (Warner Losh, 2023-08-16; 1 file, -2/+0)

  Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/

* spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD (Warner Losh, 2023-05-12; 1 file, -1/+1)

  The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
  up to that fact and revert to their recommended match of BSD-2-Clause.

  Discussed with: pfg
  MFC after:      3 days
  Sponsored by:   Netflix

* sx: whack set-but-not-used warn in _sx_slock_hard (Mateusz Guzik, 2023-02-21; 1 file, -1/+1)

  Sponsored by:   Rubicon Communications, LLC ("Netgate")

* lockprof: pass lock type as an argument instead of reading the spin flag (Mateusz Guzik, 2021-05-23; 1 file, -4/+4)
* Minor style cleanup (Warner Losh, 2021-04-18; 1 file, -1/+1)

  We prefer 'while (0)' to 'while(0)' according to grep and style(9)'s
  space-after-keyword rule. Remove a few stragglers of the latter. Many of
  these usages were inconsistent within the file.

  MFC after:      3 days
  Sponsored by:   Netflix

* locks: push lock_delay_arg_init calls down (Mateusz Guzik, 2020-11-24; 1 file, -6/+6)

  Minor cleanup to skip doing them when recursing on locks and so that they
  can act on the found lock value if need be.

  Notes: svn path=/head/; revision=367978

* sx: drop spurious volatile keyword (Mateusz Guzik, 2020-11-24; 1 file, -2/+2)

  Notes: svn path=/head/; revision=367977

* locks: fix a long standing bug for primitives with kdtrace but without spinning (Mateusz Guzik, 2020-07-23; 1 file, -2/+2)

  In such a case the second argument to lock_delay_arg_init was NULL, which
  immediately caused a null pointer dereference. Since the structure is
  only used for the spin count, provide a dedicated routine to initialize
  it.

  Reported by:    andrew
  Notes: svn path=/head/; revision=363451
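  A sketch of such a dedicated initializer; the function name and the exact
  struct lock_delay_arg layout here are assumptions:

      static void
      lock_delay_arg_init_nospin(struct lock_delay_arg *la)
      {
              la->config = NULL;      /* no spin configuration to read */
              la->delay = 0;
              la->spin_cnt = 0;       /* still counted for dtrace probes */
      }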
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) (Pawel Biernacki, 2020-02-26; 1 file, -1/+2)

  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
  still not MPSAFE (or already are but aren't properly marked). Use it in
  preparation for a general review of all nodes.

  This is a non-functional change that adds annotations to SYSCTL_NODE and
  SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all
  obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE
  before are by default marked as NEEDGIANT.

  Approved by:    kib (mentor, blanket)
  Commented by:   kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333

* locks: add default delay struct (Mateusz Guzik, 2020-01-05; 1 file, -0/+6)

  Use it for all primitives. This makes everything fit in 8 bytes.

  Notes: svn path=/head/; revision=356375

* locks: convert delay times to u_short (Mateusz Guzik, 2020-01-05; 1 file, -6/+6)

  int is just a waste of space for this purpose.

  Notes: svn path=/head/; revision=356374

* sleep(9), sleepqueue(9): const'ify wchan pointers (Conrad Meyer, 2019-12-24; 1 file, -1/+1)

  _sleep(9), wakeup(9), sleepqueue(9), et al. do not dereference or modify
  the channel pointers provided in any way; they are merely used as intptrs
  into a dictionary structure to match waiters with wakers. Correctly
  annotate this such that _sleep() and wakeup() may be used on const
  pointers without invoking ugly patterns like __DECONST(). Plumb const
  through all of the underlying sleepqueue bits.

  No functional change.

  Reviewed by:    rlibby
  Discussed with: kib, markj
  Differential Revision: https://reviews.freebsd.org/D22914
  Notes: svn path=/head/; revision=356057
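  The resulting prototypes look roughly like this; argument lists are
  abbreviated and the exact signatures are assumptions:

      void    wakeup(const void *chan);
      void    wakeup_one(const void *chan);
      void    sleepq_add(const void *wchan, struct lock_object *lock,
                  const char *wmesg, int flags, int queue);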
* sx: check for SX_LOCK_SHARED | SX_LOCK_WRITE_SPINNER when exclusive-locking (Mateusz Guzik, 2019-12-05; 1 file, -0/+6)

  First, this removes a spurious difference compared to rw locks. More
  importantly though, this avoids a trip through sleepq code if the lock
  happens to be caught in this state.

  Notes: svn path=/head/; revision=355416

* sx: retire SX_NOADAPTIVE (Mateusz Guzik, 2018-12-05; 1 file, -32/+11)

  The flag has not been used by anything for years, and supporting it
  requires an explicit read from the lock when entering the slow path.
  The flag value is left unused on purpose.

  Sponsored by:   The FreeBSD Foundation
  Notes: svn path=/head/; revision=341593

* Make no assertions about lock state when the scheduler is stopped. (Eric van Gyzen, 2018-11-13; 1 file, -1/+1)

  Change the assert paths in rm, rw, and sx locks to match the lock and
  unlock paths. I did this for mutexes in r306346.

  Reported by:    Travis Lane <tlane@isilon.com>
  MFC after:      2 weeks
  Sponsored by:   Dell EMC Isilon
  Notes: svn path=/head/; revision=340409

* sx: fixup a braino in r334024 (Mateusz Guzik, 2018-05-22; 1 file, -1/+1)

  If a thread waiting on sx dropped Giant, it would not be properly
  reacquired on exit from the routine, later resulting in panics indicating
  Giant is not held (when it should be).

  The bug was not present in the original patch sent to pho; I unwittingly
  added it just prior to the commit and only smoke-tested it.

  Reported by:    pho
  Notes: svn path=/head/; revision=334048

* sx: port over writer starvation prevention measures from rwlock (Mateusz Guzik, 2018-05-22; 1 file, -97/+199)

  A constant stream of readers could completely starve writers, and this is
  not a hypothetical scenario. The 'poll2_threads' test from the
  will-it-scale suite reliably starves writers even with concurrency < 10
  threads. The problem was run into and diagnosed by dillon@backplane.com.

  There was next to no change in the lock contention profile during a
  -j 128 pkg build, despite an sx lock being at the top.

  Tested by:      pho
  Notes: svn path=/head/; revision=334024
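  The core idea, sketched as a bool-returning try-acquire helper on the
  assumption that the mechanism mirrors rwlock; an illustration, not a
  verbatim excerpt from kern_sx.c:

      uintptr_t x = SX_READ_VALUE(sx);
      for (;;) {
              /* A waiting or spinning writer blocks the reader fast path. */
              if ((x & (SX_LOCK_EXCLUSIVE_WAITERS | SX_LOCK_WRITE_SPINNER)) != 0)
                      break;          /* defer to the writer: take the hard path */
              if (atomic_fcmpset_acq_ptr(&sx->sx_lock, &x, x + SX_ONE_SHARER))
                      return (true);  /* reader count bumped, lock acquired */
      }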
* fix uninitialized variable warning in reader locks (Matt Macy, 2018-05-19; 1 file, -3/+3)

  Notes: svn path=/head/; revision=333831

* locks: extend speculative spin waiting for readers to drain (Mateusz Guzik, 2018-04-11; 1 file, -3/+11)

  Now that 10 years have passed since the original limit of 10000 was
  committed, bump it a little bit.

  Spinning waiting for writers is semi-informed in the sense that we always
  know if the owner is running and base the decision to spin on that.
  However, no such information is provided for read-locking. In particular
  this means that it is possible for a write-spinner to completely waste
  cpu time waiting for the lock to be released, while the reader holding it
  was preempted and is now waiting for the spinner to go off cpu.
  Nonetheless, in the majority of cases it is an improvement to spin
  instead of instantly giving up and going to sleep.

  The current approach is pretty simple: snatch the number of current
  readers and perform that many pauses before checking again. The total
  number of pauses to execute is limited to 10k. If the lock is still not
  free by that time, go to sleep.

  Given the previously noted problem of not knowing whether spinning makes
  any sense to begin with, the new limit has to remain rather conservative.
  But at the very least it should also be related to the machine. Waiting
  for writers uses parameters selected based on the number of activated
  hardware threads. The upper limit of pause instructions to be executed
  in-between re-reads of the lock is typically 16384 or 32768. It was
  selected as the limit of total spins. The lower bound is set to the
  already-present 10000 so as to not change it for smaller machines.

  Bumping the limit reduces system time by a few % during benchmarks like
  buildworld, buildkernel and others. Tested on 2 and 4 socket machines
  (Broadwell, Skylake).

  Figuring out how to make a more informed decision while not pessimizing
  the fast path is left as an exercise for the reader.

  Notes: svn path=/head/; revision=332398
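  A sketch of the budgeted reader-drain spin described above; the budget
  variable name is hypothetical:

      u_int spins = 0;

      while (spins < sx_reader_spin_budget) {         /* ~10000..32768 */
              uintptr_t x = SX_READ_VALUE(sx);
              if ((x & SX_LOCK_SHARED) == 0)
                      break;          /* readers drained, or state changed */
              /* one cpu_spinwait() per current reader before re-reading */
              for (u_int i = SX_SHARERS(x); i > 0 && spins < sx_reader_spin_budget; i--) {
                      cpu_spinwait();
                      spins++;
              }
      }
      /* still read-locked after the budget is spent: go to the sleepqueue */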
* locks: slightly depessimize lockstat (Mateusz Guzik, 2018-03-17; 1 file, -20/+33)

  The slow path is always taken when lockstat is enabled. This induces
  rdtsc (or other) calls to get the cycle count even when there was no
  contention. Still go to the slow path to not mess with the fast path,
  but avoid the heavy lifting unless necessary.

  This reduces sys and real time during -j 80 buildkernel:

  before:   3651.84s user 1105.59s system 5394% cpu 1:28.18 total
  after:    3685.99s user  975.74s system 5450% cpu 1:25.53 total
  disabled: 3697.96s user  411.13s system 5261% cpu 1:18.10 total

  So note this is still a significant hit. LOCK_PROFILING results are not
  affected.

  Notes: svn path=/head/; revision=331109
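  A hedged sketch of the shape of the change: stay in the slow path when
  lockstat is compiled in, but read the cycle counter only once the
  uncontested attempt has failed. The probe name and surrounding variables
  are assumptions:

      uintptr_t x = SX_READ_VALUE(sx);
      int64_t spin_time = 0;

      /* Cheap uncontested attempt first: no timestamp needed. */
      if (atomic_fcmpset_acq_ptr(&sx->sx_lock, &x, tid))
              goto out;

      /* Real contention: now pay for rdtsc (or equivalent). */
      if (LOCKSTAT_PROFILE_ENABLED(sx__acquire))
              spin_time -= lockstat_nsecs(&sx->lock_object);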
* sx: don't do an atomic op in upgrade if it cannot succeed (Mateusz Guzik, 2018-03-04; 1 file, -3/+13)

  The code already pays the cost of reading the lock to obtain the waiters
  flag. Checking whether there is more than one reader is not a problem and
  avoids dirtying the line.

  This also fixes a small corner case: if waiters were to show up between
  reading the flag and upgrading the lock, the operation would fail even
  though it should not. No correctness change here though.

  Notes: svn path=/head/; revision=330415
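  A rough rendering of the resulting upgrade logic; an illustration, not
  the literal diff:

      uintptr_t x = SX_READ_VALUE(sx);
      for (;;) {
              if (SX_SHARERS(x) > 1)
                      return (0);     /* cannot succeed: skip the atomic op */
              uintptr_t waiters = x & (SX_LOCK_SHARED_WAITERS |
                  SX_LOCK_EXCLUSIVE_WAITERS);
              /* On failure fcmpset reloads x, so waiters that show up late
                 are picked up on the next iteration instead of failing. */
              if (atomic_fcmpset_ptr(&sx->sx_lock, &x,
                  (uintptr_t)curthread | waiters))
                      return (1);     /* upgraded to exclusive */
      }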
* locks: fix a corner case in r327399 (Mateusz Guzik, 2018-03-04; 1 file, -33/+28)

  If there were exactly rowner_retries/asx_retries (by default: 10)
  transitions between the read and write state and the waiters still did
  not get the lock, the next owner -> reader transition would result in
  the code correctly falling back to turnstile/sleepq, where it would
  incorrectly think it was waiting for a writer and decide to leave
  turnstile/sleepq to loop back. From this point it would take ts/sq trips
  until the lock got released.

  The bug sometimes manifested itself in stalls during -j 128 package
  builds.

  Refactor the code to fix the bug; while here remove some of the
  gratuitous differences between rw and sx locks.

  Notes: svn path=/head/; revision=330414

* sx: fix adaptive spinning broken in r327397 (Mateusz Guzik, 2018-03-02; 1 file, -2/+2)

  The condition was flipped. In particular, heavy multithreaded kernel
  builds on zfs started suffering due to nested sx locks.

  For instance make -s -j 128 buildkernel:

  before: 3326.67s user 1269.62s system 6981% cpu 1:05.84 total
  after:  3365.55s user  911.27s system 6871% cpu 1:02.24 total

  ps. (ASCII-art doodle in the original commit message omitted)

  Notes: svn path=/head/; revision=330294
* Undo LOCK_PROFILING pessimisation after r313454 and r313455 (Mateusz Guzik, 2018-02-17; 1 file, -2/+7)

  With the option used to compile the kernel, both sx and rw shared ops
  would always go to the slow path, which added avoidable overhead even
  when the facility is disabled.

  Furthermore, the increased time spent doing uncontested shared lock
  acquires would be bogusly added to total wait time, somewhat skewing the
  results.

  Restore the old behaviour of going there only when profiling is enabled.
  This change is a no-op for kernels without LOCK_PROFILING (which is the
  default).

  Notes: svn path=/head/; revision=329451

* sx: retry hard shared unlock just like in r327905 for rwlocks (Mateusz Guzik, 2018-01-13; 1 file, -1/+4)

  Notes: svn path=/head/; revision=327914

* locks: adjust loop limit check when waiting for readers (Mateusz Guzik, 2017-12-31; 1 file, -1/+1)

  The check was for the exact value, but since the counter started being
  incremented by the number of readers, it could have jumped over the
  limit.

  Notes: svn path=/head/; revision=327402

* sx: fix up non-smp compilation after r327397 (Mateusz Guzik, 2017-12-31; 1 file, -2/+2)

  Notes: svn path=/head/; revision=327401

* locks: re-check the reason to go to sleep after locking sleepq/turnstile (Mateusz Guzik, 2017-12-31; 1 file, -3/+11)

  In both rw and sx locks we always go to sleep if the lock owner is not
  running. We do spin for some time if the lock is read-locked. However,
  if we decide to go to sleep because the lock owner is off cpu, and after
  sleepq/turnstile gets acquired the lock turns out to be read-locked, we
  should fall back to the aforementioned wait.

  Notes: svn path=/head/; revision=327399
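  Sketched control flow for the re-check; illustrative, not the literal
  kern_sx.c code:

      sleepq_lock(&sx->lock_object);
      x = SX_READ_VALUE(sx);
      if ((x & SX_LOCK_SHARED) != 0 && SX_SHARERS(x) > 0) {
              /* The off-cpu owner is gone and the lock is now read-locked:
                 go back to spin-waiting for the readers to drain. */
              sleepq_release(&sx->lock_object);
              goto spin_on_readers;   /* hypothetical label */
      }
      /* ...otherwise set the waiter bits and go to sleep... */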
* sx: read the SX_NOADAPTIVE flag and Giant ownership only once (Mateusz Guzik, 2017-12-31; 1 file, -71/+87)

  These used to be read multiple times when waiting for the lock to become
  free, which had the potential to issue completely avoidable traffic.

  Notes: svn path=/head/; revision=327397

* sys/kern: adoption of SPDX licensing ID tags. (Pedro F. Giffuni, 2017-11-27; 1 file, -0/+2)

  Mainly focus on files that use the BSD 2-Clause license; however, the
  tool I was using misidentified many licenses, so this was mostly a manual
  - error prone - task.

  The Software Package Data Exchange (SPDX) group provides a specification
  to make it easier for automated tools to detect and summarize well-known
  open source licenses. We are gradually adopting the specification, noting
  that the tags are considered only advisory and do not, in any way,
  supersede or replace the license texts.

  Notes: svn path=/head/; revision=326271

* sx: change sunlock to wake waiters up if it locked sleepq (Mateusz Guzik, 2017-11-25; 1 file, -19/+20)

  sleepq is only locked if curthread is the last reader. By the time the
  lock gets acquired, new readers could have arrived. The previous code
  would unlock and loop back, which resulted in spurious relocking of
  sleepq.

  This is a step towards an xadd-based unlock routine.

  Notes: svn path=/head/; revision=326196

* locks: retry turnstile/sleepq loops on failed cmpset (Mateusz Guzik, 2017-11-25; 1 file, -21/+13)

  In order to go to sleep, threads set waiter flags, but that can
  spuriously fail, e.g. when a new reader arrives. Instead of unlocking
  everything and looping back, re-evaluate the new state while still
  holding the lock necessary to go to sleep.

  Notes: svn path=/head/; revision=326195
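  A sketch of the retry-in-place pattern; a rough rendering of the idea,
  not the literal diff:

      sleepq_lock(&sx->lock_object);
      x = SX_READ_VALUE(sx);
      for (;;) {
              /* Try to set the waiter bit while still holding sleepq lock. */
              if (atomic_fcmpset_ptr(&sx->sx_lock, &x,
                  x | SX_LOCK_EXCLUSIVE_WAITERS))
                      break;
              /* fcmpset reloaded x (e.g. a new reader arrived); loop and
                 re-evaluate the fresh value here instead of dropping the
                 sleepq lock and restarting from scratch. */
      }
      sleepq_add(&sx->lock_object, NULL, "sx", SLEEPQ_SX, SQ_EXCLUSIVE_QUEUE);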
* Have lockstat:::sx-release fire only after the lock state has changed. (Mark Johnston, 2017-11-24; 1 file, -2/+1)

  MFC after:      1 week
  Notes: svn path=/head/; revision=326176

* Add a missing lockstat:::sx-downgrade probe. (Mark Johnston, 2017-11-24; 1 file, -7/+6)

  We were returning without firing the probe when the lock had no shared
  waiters.

  MFC after:      1 week
  Notes: svn path=/head/; revision=326175

* sx: unbreak debug after r326107 (Mateusz Guzik, 2017-11-23; 1 file, -1/+1)

  An assertion was modified to use the found value, but it was not updated
  to handle a race where blocked threads appear after the function is
  entered.

  Move the assertion down to the area protected by the sleepq lock, where
  the lock is read anyway. This does not affect coverage of the assertion
  and is consistent with what rw locks are doing.

  Reported by:    Shawn Webb
  Notes: svn path=/head/; revision=326112

* locks: pass the found lock value to unlock slow path (Mateusz Guzik, 2017-11-22; 1 file, -7/+10)

  This avoids an explicit read later. While here, whack the cheaply
  obtainable 'tid' argument.

  Notes: svn path=/head/; revision=326107

* locks: remove the file + line argument from internal primitives when not used (Mateusz Guzik, 2017-11-22; 1 file, -17/+60)

  The pair is of use only in debug or LOCKPROF kernels, but was passed
  (zeroed) for many locks even in production kernels.

  While here, whack the tid argument from wlock hard and xlock hard.

  There is no KBI change of any sort - "external" primitives still accept
  the pair.

  Notes: svn path=/head/; revision=326106
* locks: fix compilation issues without SMP or KDTRACE_HOOKS (Mateusz Guzik, 2017-11-17; 1 file, -2/+2)

  Notes: svn path=/head/; revision=325963

* sx: perform a minor cleanup of the unlock slowpath (Mateusz Guzik, 2017-11-17; 1 file, -7/+9)

  No functional changes.

  Notes: svn path=/head/; revision=325922

* locks: pull up PMC_SOFT_CALLs out of slow path loops (Mateusz Guzik, 2017-11-17; 1 file, -11/+12)

  Notes: svn path=/head/; revision=325919

* sx: avoid branches in the slow path if lockstat is disabled (Mateusz Guzik, 2017-11-17; 1 file, -12/+41)

  Notes: svn path=/head/; revision=325917

* locks: take the number of readers into account when waiting (Mateusz Guzik, 2017-10-05; 1 file, -3/+4)

  Previous code would always spin once before checking the lock. But a
  lock with e.g. 6 readers is not going to become free in the duration of
  one spin, even if they start draining immediately. Conservatively perform
  one spin for each reader.

  Note that the total number of allowed spins is still extremely small and
  is subject to change later.

  MFC after:      1 week
  Notes: svn path=/head/; revision=324335
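  The heuristic, sketched; the enclosing overall spin budget is omitted and
  the surrounding code is an assumption:

      x = SX_READ_VALUE(sx);
      while ((x & SX_LOCK_SHARED) != 0) {
              /* one pause per current reader instead of a single pause */
              for (u_int i = SX_SHARERS(x); i > 0; i--)
                      cpu_spinwait();
              x = SX_READ_VALUE(sx);
      }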
* locks: partially tidy up waiting on readers (Mateusz Guzik, 2017-10-05; 1 file, -5/+4)

  Spin first instead of instantly re-reading, and don't re-read after
  spinning is finished - the state is already known.

  Note the code is subject to significant changes later.

  MFC after:      1 week
  Notes: svn path=/head/; revision=324314

* Sprinkle __read_frequently on a few obvious places. (Mateusz Guzik, 2017-09-06; 1 file, -3/+3)

  Note that some of the annotated variables should probably change their
  types to something smaller, preferably bit-sized.

  Notes: svn path=/head/; revision=323236

* Fix the !TD_IS_IDLETHREAD(curthread) locking assertions. (Mark Johnston, 2017-06-19; 1 file, -3/+5)

  Most of the lock slowpaths assert that the calling thread isn't an idle
  thread. However, this may not be true if the system has panicked, and in
  some cases the assertion appears before a SCHEDULER_STOPPED() check.

  MFC after:      3 days
  Sponsored by:   Dell EMC Isilon
  Notes: svn path=/head/; revision=320124
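  The adjusted assertions take roughly this shape (a sketch, not the exact
  diff):

      KASSERT(kdb_active != 0 || SCHEDULER_STOPPED() ||
          !TD_IS_IDLETHREAD(curthread),
          ("sx_xlock() by idle thread %p on sx %s @ %s:%d",
          curthread, sx->lock_object.lo_name, file, line));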