path: root/sys/kern/kern_resource.c
Commit message | Author | Age | Files | Lines
* rufetch and calcru sometimes should be called atomically together. | Attilio Rao | 2007-06-09 | 1 | -13/+21
  This patch fixes the places where they should be called atomically, changing their locking requirements (both now assume the per-proc spinlock is held) and introducing rufetchcalc, which wraps both calls so that they are performed atomically.
  Reviewed by: jeff
  Approved by: jeff (mentor)
  Notes: svn path=/head/; revision=170472
* The current rusage code shows peculiar problems: | Attilio Rao | 2007-06-09 | 1 | -6/+3
  - Unsafeness on ruadd() in thread_exit()
  - Unatomicity of thread_exit() in the exit1() operations
  This patch addresses these problems by allocating p_fd as part of the process and modifying the way it is accessed. A small chunk of this patch resolves a race on p_state in kern_wait(), since we have to be sure about the zombifying process.
  Submitted by: jeff
  Approved by: jeff (mentor)
  Notes: svn path=/head/; revision=170466
* Commit 14/14 of sched_lock decomposition. | Jeff Roberson | 2007-06-05 | 1 | -24/+33
  - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization.
  - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
  Notes: svn path=/head/; revision=170307
* - Move rusage from being per-process in struct pstats to per-thread in | Jeff Roberson | 2007-06-01 | 1 | -22/+103
  td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock, which is going away. All modifications to rusage are now done in the context of the owning thread. Reads proceed without locks.
  - Aggregate an exiting thread's rusage in thread_exit() so that the exiting thread's rusage is not lost.
  - Provide a new routine, rufetch(), to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to its exit. The exited process's rusage is still available via p_ru.
  - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits.
  Initial patch by: attilio
  Reviewed by: attilio, bde (some objections), arch (mostly silent)
  Notes: svn path=/head/; revision=170174
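The aggregation described in this entry can be sketched in user space. This is only a model under stated assumptions: `ru_model`, `thread_model`, `proc_model`, `rufetch_model()` and `thread_exit_model()` are hypothetical stand-ins, not the kernel's definitions, and the locking the kernel needs is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ru_model {
	long ru_inblock;	/* block input operations */
	long ru_oublock;	/* block output operations */
	long ru_nvcsw;		/* voluntary context switches */
};

struct thread_model {
	struct ru_model td_ru;	/* written only by the owning thread */
};

struct proc_model {
	struct ru_model p_ru;		/* usage folded in from exited threads */
	struct thread_model *p_threads;	/* live threads */
	size_t p_numthreads;
};

static void
ru_model_add(struct ru_model *dst, const struct ru_model *src)
{
	dst->ru_inblock += src->ru_inblock;
	dst->ru_oublock += src->ru_oublock;
	dst->ru_nvcsw += src->ru_nvcsw;
}

/* Sum the exited-thread totals and every live thread's counters. */
static void
rufetch_model(const struct proc_model *p, struct ru_model *out)
{
	size_t i;

	assert(p != NULL && out != NULL);
	*out = p->p_ru;
	for (i = 0; i < p->p_numthreads; i++)
		ru_model_add(out, &p->p_threads[i].td_ru);
}

/* On exit, fold the thread's counters into p_ru so they are not lost. */
static void
thread_exit_model(struct proc_model *p, size_t idx)
{
	assert(idx < p->p_numthreads);
	ru_model_add(&p->p_ru, &p->p_threads[idx].td_ru);
	p->p_threads[idx] = p->p_threads[p->p_numthreads - 1];
	p->p_numthreads--;
}
```

In this model the process-wide total returned by `rufetch_model()` is the same before and after a thread exits, which is the property the commit message asks for.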
* Universally adopt most conventional spelling of acquire. | Robert Watson | 2007-05-27 | 1 | -1/+1
  Notes: svn path=/head/; revision=170035
* Rework the support for ABIs to override resource limits (used by 32-bit | John Baldwin | 2007-05-14 | 1 | -6/+4
  processes under 64-bit kernels). Previously, each 32-bit process overwrote its resource limits at exec() time. The problem with this approach is that the new limits affect all child processes of the 32-bit process, including when the child process forks and execs a 64-bit process. To fix this, don't overwrite the resource limits during exec(). Instead, sv_fixlimits() is now replaced with a different function, sv_fixlimit(), which asks the ABI to sanitize a single resource limit. We then use this when querying and setting resource limits. Thus, if a 32-bit process sets a limit, then that new limit will be inherited by future children. However, if the 32-bit process doesn't change a limit, then a future 64-bit child will see the "full" 64-bit limit rather than the 32-bit limit.
  MFC is tentative since it will break the ABI of old linux.ko modules (no other modules are affected).
  MFC after: 1 week
  Notes: svn path=/head/; revision=169565
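The idea of sanitizing one limit at query time, rather than overwriting the stored limits at exec(), can be sketched as follows. Everything here is an illustrative assumption: `abi_model`, `abi32_fixlimit()`, `query_limit()` and the 1 GiB cap are hypothetical and do not reproduce the kernel's sv_fixlimit() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rlimit_model {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

/* Hypothetical ABI descriptor with an optional per-limit hook. */
struct abi_model {
	void (*fixlimit)(struct rlimit_model *rl, int which);
};

#define MODEL_RLIMIT_DATA	0
#define MODEL_NLIMITS		1
#define MODEL_ABI32_MAXDSIZ	(UINT64_C(1) << 30)	/* assumed 32-bit cap */

/* Clamp a single limit to what a 32-bit process can use. */
static void
abi32_fixlimit(struct rlimit_model *rl, int which)
{
	if (which != MODEL_RLIMIT_DATA)
		return;
	if (rl->cur > MODEL_ABI32_MAXDSIZ)
		rl->cur = MODEL_ABI32_MAXDSIZ;
	if (rl->max > MODEL_ABI32_MAXDSIZ)
		rl->max = MODEL_ABI32_MAXDSIZ;
}

/* Return a sanitized copy; the stored limits are never modified. */
static struct rlimit_model
query_limit(const struct rlimit_model *stored, const struct abi_model *abi,
    int which)
{
	struct rlimit_model rl;

	assert(which >= 0 && which < MODEL_NLIMITS);
	rl = stored[which];
	if (abi->fixlimit != NULL)
		abi->fixlimit(&rl, which);
	return (rl);
}
```

Because the clamp is applied to a copy, a 64-bit child that inherits the same stored limits still sees the full value, which matches the behaviour the entry describes.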
* Further system call comment cleanup: | Robert Watson | 2007-03-05 | 1 | -3/+0
  - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
  - Remove extra blank lines in some cases.
  - Add extra blank lines in some cases.
  - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name.
  - Add punctuation.
  - Re-wrap some comments.
  Notes: svn path=/head/; revision=167232
* Remove 'MPSAFE' annotations from the comments above most system calls: all | Robert Watson | 2007-03-04 | 1 | -25/+0
  system calls now enter without Giant held, and then in some cases, acquire Giant explicitly.
  Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.
  Notes: svn path=/head/; revision=167211
* Close race conditions between fork() and [sg]etpriority()'s | Xin LI | 2007-02-26 | 1 | -0/+3
  PRIO_USER case, and possibly other places that dereference p_ucred.
  In the past, we inserted a new process into the allproc list right after PID allocation and released the allproc_lock sx. Because most of the content of the new proc structure is not yet initialized, this could lead to undefined results if we do not handle PRS_NEW with care.
  The problem with the PRS_NEW state is that it does not provide fine-grained information about how much initialization has been done for a new process. By definition, after a PRIO_USER setpriority(), all processes that belong to the given user should have their nice value set to the specified value. Therefore, if the p_{start,end}copy section was done for a PRS_NEW process, we cannot safely ignore it because p_nice is in this area. On the other hand, we should be careful with PRS_NEW processes because we do not allow non-root users to lower their nice values, and without a successful copy of the copy section, we can get stale values that are inherited from the uninitialized area of the process structure.
  This commit tries to close the race condition by grabbing the proc mutex *before* we release the allproc_lock xlock, and doing the copy as well as the zero immediately after the allproc_lock xunlock. This guarantees that the new process has its p_copy and p_zero sections, as well as its user credential information, initialized. In the getpriority() case, instead of grabbing PROC_LOCK for a PRS_NEW process, we just skip the process in question, because it does not affect the final result of the call: the p_nice value is copied from its parent, and we will see it during the allproc traversal.
  Other potential solutions are still under evaluation.
  Discussed with: davidxu, jhb, rwatson
  PR: kern/108071
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=167007
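The getpriority() side of this fix, skipping processes that are still in the PRS_NEW state, might look roughly like the model below. The types and names (`gp_proc`, `GP_PRS_NEW`, `lowest_nice_for_user()`) are hypothetical; the real code walks allproc under the appropriate locks, which this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

#define GP_PRIO_MAX	20	/* assumed upper bound on nice values */

enum gp_state { GP_PRS_NEW, GP_PRS_NORMAL };

/* Hypothetical, reduced view of a process entry. */
struct gp_proc {
	enum gp_state state;
	unsigned uid;
	int nice;	/* may be garbage while state == GP_PRS_NEW */
};

/*
 * Return the lowest nice value among the user's processes, or
 * GP_PRIO_MAX + 1 if none was found.  Half-initialized processes are
 * skipped: their nice value is copied from a parent that is visited
 * anyway, so skipping them cannot change the result.
 */
static int
lowest_nice_for_user(const struct gp_proc *procs, size_t n, unsigned uid)
{
	int low = GP_PRIO_MAX + 1;
	size_t i;

	assert(procs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (procs[i].state == GP_PRS_NEW)
			continue;
		if (procs[i].uid == uid && procs[i].nice < low)
			low = procs[i].nice;
	}
	return (low);
}
```

A caller would treat a return value above `GP_PRIO_MAX` as "no matching process".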
* Use priv_check(9) instead of suser(9) for checking the privilege to | Robert Watson | 2007-02-19 | 1 | -1/+1
  set real-time priority on a thread. It looks like this suser(9) call was introduced after my first pass through replacing superuser checks with named privilege checks.
  Notes: svn path=/head/; revision=166828
* Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. | Xin LI | 2007-01-17 | 1 | -1/+1
  Notes: svn path=/head/; revision=166073
* Threading cleanup.. part 2 of several. | Julian Elischer | 2006-12-06 | 1 | -89/+0
  Make part of John Birrell's KSE patch permanent.. Specifically, remove:
  Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated.
  All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
  Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
  The ULE scheduler compiles again but I have no idea if it works.
  The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
  Tested by David Xu, and Dan Eischen using libthr and libpthread.
  Notes: svn path=/head/; revision=164936
* Use scheduler API sched_user_prio() to adjust thread's userland priority, | David Xu | 2006-11-20 | 1 | -12/+15
  use td_base_user_prio to get real userland priority since POSIX priority mutex may adjust td_user_pri which is an effective priority.
  Notes: svn path=/head/; revision=164431
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigning | Robert Watson | 2006-11-06 | 1 | -3/+5
  specific privilege names to a broad range of privileges. These may require some future tweaking.
  Sponsored by: nCircle Network Security, Inc.
  Obtained from: TrustedBSD Project
  Discussed on: arch@
  Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
  Notes: svn path=/head/; revision=164033
* Make KSE a kernel option, turned on by default in all GENERIC | John Birrell | 2006-10-26 | 1 | -0/+86
  kernel configs except sun4v (which doesn't process signals properly with KSE).
  Reviewed by: davidxu@
  Notes: svn path=/head/; revision=163709
* Replace system call thr_getscheduler, thr_setscheduler, thr_setschedparam | David Xu | 2006-09-21 | 1 | -0/+95
  with rtprio_thread. While the rtprio system call is for processes only, the new system call rtprio_thread is responsible for LWPs.
  Notes: svn path=/head/; revision=162497
* Commit the results of the typo hunt by Darren Pilgrim. | Yaroslav Tykhiy | 2006-08-04 | 1 | -1/+1
  This change affects documentation and comments only, no real code involved.
  PR: misc/101245
  Submitted by: Darren Pilgrim <darren pilgrim bitfreak org>
  Tested by: md5(1)
  MFC after: 1 week
  Notes: svn path=/head/; revision=160964
* Go over calcru and friends once more. | Poul-Henning Kamp | 2006-03-11 | 1 | -47/+48
  Reintroduce the monotonicity for the normal case and make the two special cases behave in what is believed to be the most sensible fashion.
  Notes: svn path=/head/; revision=156570
* Add slop to "backwards" cpu accounting messages, 3 usec or 1% whichever | Poul-Henning Kamp | 2006-03-09 | 1 | -1/+5
  triggers.
  This should eliminate all the trivial messages which result from minor increases in cpu_tick frequency.
  Machines which don't do cpu clock fiddling shouldn't issue "backwards" messages now.
  Laptops and other machines where the initial estimate of cputicks may be waaaay off will still issue warnings.
  Notes: svn path=/head/; revision=156484
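One possible reading of "3 usec or 1%, whichever triggers" is that the warning is suppressed when the new total is within 3 microseconds of the previous one, or within 1% of it. The sketch below encodes that reading; the function name `runtime_went_backwards()` and the exact comparisons are assumptions, not the committed code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a "runtime went backwards" warning is warranted.
 * Assumed policy: tolerate a decrease of up to 3 usec (numeric slop
 * for small values) or up to 1% of the previous total (slop for
 * large values).
 */
static bool
runtime_went_backwards(uint64_t prev_usec, uint64_t now_usec)
{
	uint64_t diff;

	if (now_usec >= prev_usec)
		return (false);		/* monotonic, nothing to report */
	diff = prev_usec - now_usec;
	if (diff <= 3)
		return (false);		/* within 3 usec */
	if (diff <= prev_usec / 100)
		return (false);		/* within 1% */
	return (true);
}
```

With this policy a drop from 1000000 to 995000 usec is tolerated, while a drop from 1000 to 900 usec is reported.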
* Various style and comment fixes. | John Baldwin | 2006-02-22 | 1 | -8/+7
  Submitted by: bde
  Notes: svn path=/head/; revision=155916
* Split calcru() back into a calcru1() function shared with calccru() and | John Baldwin | 2006-02-21 | 1 | -10/+33
  a calcru() wrapper that passes a local rusage_ext on the stack that is a snapshot to do the calculations on. Now we can pass p->p_crux to calcru1() in calccru() again, which fixes the issues with "runtime going backwards" messages when dead processes are harvested by init.
  Reviewed by: phk
  Tested by: Stefan Ehmann shoesoft at gmx dot net
  Notes: svn path=/head/; revision=155882
* CPU time accounting speedup (step 2) | Poul-Henning Kamp | 2006-02-11 | 1 | -68/+45
  Keep accounting time (in per-cpu cputicks) and the statistics counts in the thread and summarize into struct proc at context switch.
  Don't reach across CPUs in calcru().
  Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines).
  Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true.
  Use 27MHz counter on i386/Geode.
  Use TSC on amd64 & i386 if present.
  Use tick counter on sparc64
  Notes: svn path=/head/; revision=155534
* Modify the way we account for CPU time spent (step 1) | Poul-Henning Kamp | 2006-02-07 | 1 | -9/+12
  Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers.
  For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.)
  On slower machines, the avoided multiplications to normalize timestamps at every context switch come out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
  Notes: svn path=/head/; revision=155444
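The saving described here comes from accumulating raw tick deltas at each context switch and deferring the conversion to wall-clock units until someone reads the value. A minimal sketch, assuming a fixed tick rate passed in by the caller; `thread_acct`, `account_switch()` and `cputicks_to_usec()` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread accumulator holding raw, unscaled ticks. */
struct thread_acct {
	uint64_t runtime_ticks;
};

/* Context-switch path: one subtraction and one addition, no scaling. */
static void
account_switch(struct thread_acct *acct, uint64_t start_tick,
    uint64_t end_tick)
{
	assert(end_tick >= start_tick);
	acct->runtime_ticks += end_tick - start_tick;
}

/*
 * Inspection path: scale ticks to microseconds only when asked.
 * The whole-second and fractional parts are converted separately so
 * that ticks * 1000000 cannot overflow for realistic tick rates.
 */
static uint64_t
cputicks_to_usec(uint64_t ticks, uint64_t tick_rate_hz)
{
	assert(tick_rate_hz > 0);
	return ((ticks / tick_rate_hz) * 1000000 +
	    (ticks % tick_rate_hz) * 1000000 / tick_rate_hz);
}
```

For example, 13500000 ticks at an assumed 27 MHz rate convert to 500000 microseconds, and that division is paid once per query instead of once per context switch.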
* Back out changes made in rev. 1.151. | Stephan Uphoff | 2006-01-25 | 1 | -1/+1
  They were bogus.
  Cluebat applied by: jhb@
  Notes: svn path=/head/; revision=154793
* Hopefully fix the "calcru: runtime went backwards from ..." problem by | Stephan Uphoff | 2006-01-23 | 1 | -1/+1
  keeping the resource values locked (where needed) while we use them for calculations.
  MFC after: 3 days
  Notes: svn path=/head/; revision=154731
* Calling setrlimit from 32bit apps could potentially increase certain | Paul Saab | 2005-11-02 | 1 | -0/+7
  limits beyond what a 32bit process should be capable of, so we must fix up the limits.
  Reviewed by: jhb
  Notes: svn path=/head/; revision=151980
* Use the reference count API to manage the reference counts for process | John Baldwin | 2005-09-27 | 1 | -11/+4
  limit structures rather than using pool mutexes to protect the reference counts.
  Tested on: i386, alpha, sparc64
  Notes: svn path=/head/; revision=150633
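Replacing a mutex-protected counter with an atomic reference count can be modelled in user space with C11 atomics. This is only a sketch: `plimit_ref`, `lim_alloc_model()`, `lim_hold_model()` and `lim_free_model()` are hypothetical stand-ins, and the memory orderings are this example's choice rather than a statement about the kernel's reference count API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical limit structure carrying only its reference count. */
struct plimit_ref {
	atomic_uint refcnt;
};

/* Allocate a structure that starts with a single reference. */
static struct plimit_ref *
lim_alloc_model(void)
{
	struct plimit_ref *l = malloc(sizeof(*l));

	if (l != NULL)
		atomic_init(&l->refcnt, 1);
	return (l);
}

/* Take an additional reference; no mutex is needed. */
static struct plimit_ref *
lim_hold_model(struct plimit_ref *l)
{
	atomic_fetch_add_explicit(&l->refcnt, 1, memory_order_relaxed);
	return (l);
}

/* Drop a reference; free and return true if it was the last one. */
static bool
lim_free_model(struct plimit_ref *l)
{
	unsigned old;

	old = atomic_fetch_sub_explicit(&l->refcnt, 1, memory_order_acq_rel);
	assert(old > 0);
	if (old == 1) {
		free(l);
		return (true);
	}
	return (false);
}
```

Only the thread that observes the count dropping from 1 to 0 frees the structure, so no lock is required around the counter itself.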
* Giant is no longer required in kern_setrlimit(); remove its acquisition and | Alan Cox | 2005-06-01 | 1 | -2/+0
  release.
  Reviewed by: jhb
  Notes: svn path=/head/; revision=146879
* Stop explicitly touching td_base_pri outside of the scheduler and simply | John Baldwin | 2004-12-30 | 1 | -1/+0
  set a thread's priority via sched_prio() when that is the desired action. The schedulers will start managing td_base_pri internally shortly.
  Notes: svn path=/head/; revision=139451
* Rework how we store process times in the kernel such that we always store | John Baldwin | 2004-10-05 | 1 | -79/+113
  the raw values including for child process statistics and only compute the system and user timevals on demand.
  - Fix the various kern_wait() syscall wrappers to only pass in a rusage pointer if they are going to use the result.
  - Add a kern_getrusage() function for the ABI syscalls to use so that they don't have to play stackgap games to call getrusage().
  - Fix the svr4_sys_times() syscall to just call calcru() to calculate the times it needs rather than calling getrusage() twice with associated stackgap, etc.
  - Add a new rusage_ext structure to store raw time stats such as tick counts for user, system, and interrupt time as well as a bintime of the total runtime. A new p_rux field in struct proc replaces the same inline fields from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux field in struct proc contains the "raw" child time usage statistics. ruadd() has been changed to handle adding the associated rusage_ext structures as well as the values in rusage. Effectively, the values in rusage_ext replace the ru_utime and ru_stime values in struct rusage. These two fields in struct rusage are no longer used in the kernel.
  - calcru() has been split into a static worker function calcru1() that calculates appropriate timevals for user and system time as well as updating the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a copy of the process' p_rux structure to compute the timevals after updating the runtime appropriately if any of the threads in that process are currently executing. It also now only locks sched_lock internally while doing the rux_runtime fixup. calcru() now only requires the caller to hold the proc lock and calcru1() only requires the proc lock internally. calcru() also no longer allows callers to ask for an interrupt timeval since none of them actually did.
  - calcru() now correctly handles threads executing on other CPUs.
  - A new calccru() function computes the child system and user timevals by calling calcru1() on p_crux. Note that this means that any code that wants child times must now call this function rather than reading from p_cru directly. This function also requires the proc lock.
  - This finishes the locking for rusage and friends so some of the Giant locks in exit1() and kern_wait() are now gone.
  - The locking in ttyinfo() has been tweaked so that a shared lock of the proctree lock is used to protect the process group rather than the process group lock. By holding this lock until the end of the function we now ensure that the process/thread that we pick to dump info about will no longer vanish while we are trying to output its info to the console.
  Submitted by: bde (mostly)
  MFC after: 1 month
  Notes: svn path=/head/; revision=136152
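Computing user and system time on demand from raw values amounts to splitting the total runtime in proportion to the sampled tick counts. The sketch below illustrates that split under simplifying assumptions: runtime is held in microseconds rather than a bintime, and `rusage_ext_model`, `cpu_times_model` and `calcru1_model()` are hypothetical names, not the kernel's calcru1().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical raw statistics, stored unscaled. */
struct rusage_ext_model {
	uint64_t rux_runtime_usec;	/* total runtime */
	uint64_t rux_uticks;		/* statclock ticks in user mode */
	uint64_t rux_sticks;		/* ticks in system mode */
	uint64_t rux_iticks;		/* ticks in interrupt context */
};

struct cpu_times_model {
	uint64_t user_usec;
	uint64_t sys_usec;
	uint64_t intr_usec;
};

/* Split the total runtime proportionally to the tick counts. */
static struct cpu_times_model
calcru1_model(const struct rusage_ext_model *rux)
{
	struct cpu_times_model out;
	uint64_t tu = rux->rux_runtime_usec;
	uint64_t ut = rux->rux_uticks;
	uint64_t st = rux->rux_sticks;
	uint64_t tt = ut + st + rux->rux_iticks;

	if (tt == 0) {
		/* No samples yet: avoid dividing by zero, charge system. */
		st = 1;
		tt = 1;
	}
	/* This sketch assumes the products below fit in 64 bits. */
	assert(ut == 0 || tu <= UINT64_MAX / ut);
	assert(st == 0 || tu <= UINT64_MAX / st);
	out.user_usec = tu * ut / tt;
	out.sys_usec = tu * st / tt;
	out.intr_usec = tu - out.user_usec - out.sys_usec;
	return (out);
}
```

Because only the raw counters are stored, the same routine can be applied to a process's own statistics or to the accumulated child statistics, which is the role the entry gives to calcru1().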
* A modest collection of various and sundry style, spelling, and whitespace | John Baldwin | 2004-09-24 | 1 | -38/+33
  fixes.
  Submitted by: bde (mostly)
  Notes: svn path=/head/; revision=135688
* Various small style fixes. | John Baldwin | 2004-09-22 | 1 | -2/+4
  Notes: svn path=/head/; revision=135573
* Push UIDINFO_UNLOCK() slightly earlier in chgsbsize(), as it's not | Robert Watson | 2004-08-06 | 1 | -2/+2
  needed if we print the local variable version of the limit rather than the shared version.
  Notes: svn path=/head/; revision=133233
* Remove spl's from kern_resource.c. | Robert Watson | 2004-08-04 | 1 | -4/+0
  Notes: svn path=/head/; revision=133126
* Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is | Colin Percival | 2004-07-26 | 1 | -1/+1
  somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags.
  The old name is still defined, but will be removed in a few days (unless I hear any complaints...)
  Discussed with: rwatson, scottl
  Requested by: jhb
  Notes: svn path=/head/; revision=132653
* Turned off the "calcru: negative time" warning for certain SMP cases | Bruce Evans | 2004-06-21 | 1 | -12/+34
  where it is known to detect a problem but the problem is not very easy to fix. The warning became very common recently after a call to calcru() was added to fill_kinfo_thread().
  Another (much older) cause of "negative times" (actually non-monotonic times) was fixed in rev.1.237 of kern_exit.c.
  Print separate messages for non-monotonic and negative times.
  Notes: svn path=/head/; revision=130858
* Nice is a property of a process as a whole.. | Julian Elischer | 2004-06-16 | 1 | -34/+10
  I mistakenly moved it to the ksegroup when breaking up the process structure. Put it back in the proc structure.
  Notes: svn path=/head/; revision=130551
* Deorbit COMPAT_SUNOS. | Poul-Henning Kamp | 2004-06-11 | 1 | -2/+2
  We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.
  Notes: svn path=/head/; revision=130344
* Fix rtprio() to do sensible things when called from threaded processes. | Julian Elischer | 2004-05-08 | 1 | -4/+45
  It's not quite correct from a POSIX point of view, but it is a lot better than what was there before. This will be revisited later when we decide what form our priority extensions will take. POSIX doesn't specify how a system scope thread can change its priority, so you need to add non-standard extensions to be able to do it.. For now make this slightly non-standard to allow it to be done.
  Submitted by: Dan Eischen originally, changed by myself.
  Notes: svn path=/head/; revision=129050
* Remove a comment that complains about the lack of %qd, to justify | Maxime Henrion | 2004-04-10 | 1 | -3/+2
  truncating a rlim_t to a long. We have had %qd for some time now. However, the correct format to use here is %jd and a cast to intmax_t, so do this.
  Notes: svn path=/head/; revision=128088
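The `%jd` plus `intmax_t` idiom mentioned here is standard C99 and works for any integer type whose width is not known in advance. A small user-space sketch follows; `format_limit()` is a hypothetical helper written for this example. Note that on platforms where `rlim_t` is unsigned, very large values such as `RLIM_INFINITY` may not be representable in `intmax_t`, so the sketch is only meaningful for values that fit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Format a resource limit without truncating it to long: cast to
 * intmax_t and print with the matching %jd conversion.  Returns the
 * value of snprintf(), i.e. the length the full string needs.
 */
static int
format_limit(char *buf, size_t len, rlim_t limit)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%jd", (intmax_t)limit));
}
```

The cast matters because the width of `rlim_t` differs between platforms, whereas `intmax_t` always matches what `%jd` expects.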
* Remove advertising clause from University of California Regent's license, | Warner Losh | 2004-04-05 | 1 | -4/+0
  per letter dated July 22, 1999.
  Approved by: core
  Notes: svn path=/head/; revision=127911
* Argh! Fix a bogon. lim_cur() was returning the hard (max) limit rather | John Baldwin | 2004-02-11 | 1 | -1/+1
  than the soft (cur) limit.
  Submitted by: bde
  Notes: svn path=/head/; revision=125712
* - Convert the plimit lock to a pool mutex lock. | John Baldwin | 2004-02-06 | 1 | -3/+3
  - Hide struct plimit from userland.
  Submitted by: bde (2)
  Notes: svn path=/head/; revision=125525
* - Correct the translation of old rlimit values to properly handle the old | John Baldwin | 2004-02-06 | 1 | -21/+28
  RLIM_INFINITY case for ogetrlimit().
  - Use %jd and intmax_t to output negative time in usec in calcru().
  - Rework getrusage() to make a copy of the rusage struct into a local variable while holding Giant and then do the copyout from the local variable to avoid having to have the original process rusage struct locked while doing the copyout (which would not be safe). This also includes a few style fixes from Bruce to getrusage().
  Submitted by: bde (1, parts of 3)
  Suggested by: bde (2)
  Notes: svn path=/head/; revision=125524
* A few more style fixes from Bruce including a few I missed last time. | John Baldwin | 2004-02-06 | 1 | -18/+12
  Submitted by: bde
  Notes: svn path=/head/; revision=125523
* - A lot of style and whitespace fixes. | John Baldwin | 2004-02-05 | 1 | -60/+53
  - Update a few comments regarding locking notes.
  Submitted by: bde (1, mostly)
  Notes: svn path=/head/; revision=125495
* Locking for the per-process resource limits structure. | John Baldwin | 2004-02-04 | 1 | -64/+159
  - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock.
  - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it.
  - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything.
  - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that returns either an rlimit, or the current or max individual limit of the specified resource from a process.
  - dosetrlimit() was renamed to kern_setrlimit() to match the existing style of other similar syscall helper functions.
  - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead.
  - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead.
  - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant.
  - The p_rlimit macro no longer exists.
  Submitted by: mtm (mostly, I only did a few cleanups and catchups)
  Tested on: i386
  Compiled on: alpha, amd64
  Notes: svn path=/head/; revision=125454
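The accessor layer described in this entry can be pictured with a small model. It is a sketch only: the `_model` types and functions are hypothetical, they take the limit structure directly rather than a process, and the proc lock that the commit message requires is not modelled.

```c
#include <assert.h>
#include <stdint.h>

#define LIM_MODEL_NLIMITS	2	/* hypothetical: only two resources */

struct rlimit_model {
	uint64_t rlim_cur;	/* soft limit */
	uint64_t rlim_max;	/* hard limit */
};

/* Hypothetical limit structure; treated as read-only once shared. */
struct plimit_model {
	struct rlimit_model pl_rlimit[LIM_MODEL_NLIMITS];
};

/* Return a copy of both values for one resource. */
static struct rlimit_model
lim_rlimit_model(const struct plimit_model *l, int which)
{
	assert(which >= 0 && which < LIM_MODEL_NLIMITS);
	return (l->pl_rlimit[which]);
}

/* Soft limit: must read rlim_cur, not rlim_max. */
static uint64_t
lim_cur_model(const struct plimit_model *l, int which)
{
	return (lim_rlimit_model(l, which).rlim_cur);
}

/* Hard limit. */
static uint64_t
lim_max_model(const struct plimit_model *l, int which)
{
	return (lim_rlimit_model(l, which).rlim_max);
}
```

Funnelling every read through three small functions keeps callers away from the structure layout, so the locking rules only have to be right in one place.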
* - Don't set td_priority directly here, use sched_prio(). | Jeff Roberson | 2003-10-27 | 1 | -1/+1
  Notes: svn path=/head/; revision=121608
* Extend the mutex pool implementation to permit the creation and use of | Don Lewis | 2003-07-13 | 1 | -1/+1
  multiple mutex pools with different options and sizes. Mutex pools can be created with either the default sleep mutexes or with spin mutexes. A dynamically created mutex pool can now be destroyed if it is no longer needed.
  Create two pools by default, one that matches the existing pool that uses the MTX_NOWITNESS option that should be used for building higher level locks, and a new pool with witness checking enabled.
  Modify the users of the existing mutex pool to use the appropriate pool in the new implementation.
  Reviewed by: jhb
  Notes: svn path=/head/; revision=117494
* Use __FBSDID(). | David E. O'Brien | 2003-06-11 | 1 | -1/+3
  Notes: svn path=/head/; revision=116182