path: root/sys/kern/kern_resource.c
Commit message | Author | Age | Files | Lines
...
* Revert r210225 - turns out I was wrong; the "/*-" is not license-only thing; it's also used to indicate that the comment should not be automatically rewrapped.
  Author: Edward Tomasz Napierala | Age: 2010-07-18 | Files: 1 | Lines: -1/+1
  Explained by: cperciva@
  Notes: svn path=/head/; revision=210226
* The "/*-" comment marker is supposed to denote copyrights. Remove non-copyright occurrences from sys/sys/ and sys/kern/.
  Author: Edward Tomasz Napierala | Age: 2010-07-18 | Files: 1 | Lines: -1/+1
  Notes: svn path=/head/; revision=210225
* Remove outdated comment and move part of it into more applicable place.
  Author: Edward Tomasz Napierala | Age: 2010-07-18 | Files: 1 | Lines: -5/+0
  Notes: svn path=/head/; revision=210224
* Use ISO C99 integer types in sys/kern where possible.
  Author: Ed Schouten | Age: 2010-06-21 | Files: 1 | Lines: -1/+1
  There are only about 100 occurrences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
  Notes: svn path=/head/; revision=209390
* Fix the double counting of the last process thread td_incruntime on exit, that is done once in thread_exit() and the second time in proc_reap(), by clearing td_incruntime.
  Author: Konstantin Belousov | Age: 2010-05-24 | Files: 1 | Lines: -3/+3
  Use the opportunity to revert to the pre-RUSAGE_THREAD exporting of ruxagg() instead of ruxagg_locked() and use it from thread_exit().
  Diagnosed and tested by: neel
  MFC after: 3 days
  Notes: svn path=/head/; revision=208488
* Implement RUSAGE_THREAD. Add td_rux to keep extended runtime and ticks information for thread to allow calcru1() (re)use.
  Author: Konstantin Belousov | Age: 2010-05-04 | Files: 1 | Lines: -11/+22
  Rename ruxagg()->ruxagg_locked(), ruxagg_tlock()->ruxagg() [1]. The ruxagg_locked() function no longer clears thread ticks nor td_incruntime.
  Requested by: attilio [1]
  Discussed with: attilio, bde
  Reviewed by: bde
  Based on submission by: Alexander Krizhanovsky <ak natsys-lab com>
  MFC after: 1 week
  X-MFC-Note: td_rux shall be moved to the end of struct thread
  Notes: svn path=/head/; revision=207602
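From userland, the RUSAGE_THREAD selector is reached through getrusage(2). The following is a minimal sketch of a caller, not code from the commit: the helper name thread_cpu_usec() is hypothetical, and the _GNU_SOURCE define is only an assumption for building on glibc, where RUSAGE_THREAD is hidden without it.

```c
#define _GNU_SOURCE	/* assumption: needed on glibc to expose RUSAGE_THREAD */
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of the commit): report the calling
 * thread's user and system CPU time in microseconds.
 * Returns 0 on success, -1 if getrusage() fails (errno is left set).
 */
static int
thread_cpu_usec(long long *user_us, long long *sys_us)
{
	struct rusage ru;

	assert(user_us != NULL && sys_us != NULL);
	if (getrusage(RUSAGE_THREAD, &ru) != 0)
		return (-1);
	*user_us = (long long)ru.ru_utime.tv_sec * 1000000 +
	    ru.ru_utime.tv_usec;
	*sys_us = (long long)ru.ru_stime.tv_sec * 1000000 +
	    ru.ru_stime.tv_usec;
	return (0);
}
```

Calling the helper from different threads should give per-thread figures, whereas RUSAGE_SELF reports totals for the whole process.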
* Extract thread_lock()/ruxagg()/thread_unlock() fragment into utility function ruxagg_tlock(). Convert the definition of kern_getrusage() to ANSI C.
  Author: Konstantin Belousov | Age: 2010-05-01 | Files: 1 | Lines: -13/+14
  Submitted by: Alexander Krizhanovsky <ak natsys-lab com>
  MFC after: 1 week
  Notes: svn path=/head/; revision=207468
* sched_getparam was just plain broke for time-share processes. It did not return an error but instead just let garbage be passed back.
  Author: Randall Stewart | Age: 2010-03-03 | Files: 1 | Lines: -2/+8
  This I fix so it actually properly translates the priority the process is at to a posix's high means more priority. I also fix it so that if the ULE scheduler has bumped it up to a realtime process you get back a sane value i.e. the highest priority (63 for time-share).
  sched_setscheduler() had the setting of the timeshare class priority disabled. With some notes about rejecting the posix high numbers is greater priority and use nice instead. This fix also adjusts that to work, with the caveat that a t-s process may well get bumped up or down i.e. the setscheduler() will NOT change the nice value only the current priority. I think this is reasonable considering if the user wants to play with nice then he can. At least all the posix'ish interfaces now respond sanely.
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=204670
* Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid.
  Author: Konstantin Belousov | Age: 2009-06-23 | Files: 1 | Lines: -0/+6
  The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup.
  The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped.
  The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation is performed by kernel, e.g. md(4).
  Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced.
  In collaboration with: pho
  Reviewed by: alc
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=194766
* Don't rearm callout if the process is exiting, it may leak a callout because callout_drain() only waits for running callout, but not disable it if it is rearmed.
  Author: David Xu | Age: 2008-10-24 | Files: 1 | Lines: -1/+2
  Notes: svn path=/head/; revision=184217
* Retire the MALLOC and FREE macros. They are an abomination unto style(9).
  Author: Dag-Erling Smørgrav | Age: 2008-10-23 | Files: 1 | Lines: -1/+1
  MFC after: 3 months
  Notes: svn path=/head/; revision=184205
* Fix a small typo in a comment in calcru1().
  Author: Ed Schouten | Age: 2008-09-05 | Files: 1 | Lines: -1/+1
  The word "happene" should read "happened".
  Submitted by: Jille Timmermans <jille quis cx>
  Notes: svn path=/head/; revision=182792
* Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
  Author: Ed Schouten | Age: 2008-08-20 | Files: 1 | Lines: -0/+25
  The last half year I've been working on a replacement TTY layer for the FreeBSD kernel. The new TTY layer was designed to improve the following:
  - Improved driver model: The old TTY layer has a driver model that is not abstract enough to make it friendly to use. A good example is the output path, where the device drivers directly access the output buffers. This means that an in-kernel PPP implementation must always convert network buffers into TTY buffers. If a PPP implementation would be built on top of the new TTY layer (still needs a hooks layer, though), it would allow the PPP implementation to directly hand the data to the TTY driver.
  - Improved hotplugging: With the old TTY layer, it isn't entirely safe to destroy TTY's from the system. This implementation has a two-step destructing design, where the driver first abandons the TTY. After all threads have left the TTY, the TTY layer calls a routine in the driver, which can be used to free resources (unit numbers, etc). The pts(4) driver also implements this feature, which means posix_openpt() will now return PTY's that are created on the fly.
  - Improved performance: One of the major improvements is the per-TTY mutex, which is expected to improve scalability when compared to the old Giant locking. Another change is the unbuffered copying to userspace, which is both used on TTY device nodes and PTY masters.
  Upgrading should be quite straightforward. Unlike previous versions, existing kernel configuration files do not need to be changed, except when they reference device drivers that are listed in UPDATING.
  Obtained from: //depot/projects/mpsafetty/...
  Approved by: philip (ex-mentor)
  Discussed: on the lists, at BSDCan, at the DevSummit
  Sponsored by: Snow B.V., the Netherlands
  dcons(4) fixed by: kan
  Notes: svn path=/head/; revision=181905
* Remove extra uihold() call that accidentally sneak in during perforce change @125544.
  Author: Pawel Jakub Dawidek | Age: 2008-03-19 | Files: 1 | Lines: -1/+0
  Notes: svn path=/head/; revision=177377
* - Relax requirements for p_numthreads, p_threads, p_swtick, and p_nice from requiring the per-process spinlock to only requiring the process lock.
  - Reflect these changes in the proc.h documentation and consumers throughout the kernel. This is a substantial reduction in locking cost for these fields and was made possible by recent changes to threading support.
  Author: Jeff Roberson | Age: 2008-03-19 | Files: 1 | Lines: -16/+4
  Notes: svn path=/head/; revision=177368
* Whitespace cleanups.
  Author: Pawel Jakub Dawidek | Age: 2008-03-16 | Files: 1 | Lines: -7/+7
  Notes: svn path=/head/; revision=177278
* - Use wait-free method to manage ui_sbsize and ui_proccnt fields in the uidinfo structure. This entirely removes contention observed on the ui_mtxp mutex (as it is now gone).
  - Convert the uihashtbl_mtx mutex to a rwlock, as most of the time we just need to read-lock it.
  Author: Pawel Jakub Dawidek | Age: 2008-03-16 | Files: 1 | Lines: -58/+48
  Reviewed by: jhb, jeff, kris & others
  Tested by: kris
  Notes: svn path=/head/; revision=177277
* Style fixes.
  Author: Pawel Jakub Dawidek | Age: 2008-03-16 | Files: 1 | Lines: -11/+7
  Notes: svn path=/head/; revision=177264
* Fix information leak. We can find PIDs of running processes from within a jail, etc. by simply calling setpriority(PRIO_PROCESS, <PID>, 0) and checking the return value: 0 means that the process exists and -1 that it doesn't exist.
  Author: Pawel Jakub Dawidek | Age: 2008-03-16 | Files: 1 | Lines: -1/+2
  Reviewed by: rwatson
  MFC after: 1 week
  Notes: svn path=/head/; revision=177263
* Remove kernel support for M:N threading.
  Author: Jeff Roberson | Age: 2008-03-12 | Files: 1 | Lines: -2/+0
  While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.
  Notes: svn path=/head/; revision=177091
* Don't zero td_runtime when billing thread CPU usage to the process; maintain a separate td_incruntime to hold unbilled CPU usage for the thread that has the previous properties of td_runtime.
  Author: Robert Watson | Age: 2008-01-10 | Files: 1 | Lines: -3/+3
  When thread information is requested using the thread monitoring sysctls, export thread td_runtime instead of process rusage runtime in kinfo_proc.
  This restores the display of individual ithread and other kernel thread CPU usage since inception in ps -H and top -SH, as well for libthr user threads, valuable debugging information lost with the move to try kthreads since they are no longer independent processes.
  There is universal agreement that we should rewrite the process and thread export sysctls, but this commit gets things going a bit better in the mean time. Likewise, there are reservations about the continued validity of statclock given the speed of modern processors.
  Reviewed by: attilio, emaste, jhb, julian
  Notes: svn path=/head/; revision=175219
* Fix LOR of thread lock and umtx's priority propagation mutex due to the reworking of scheduler lock.
  Author: David Xu | Age: 2007-12-11 | Files: 1 | Lines: -1/+8
  MFC: after 3 days
  Notes: svn path=/head/; revision=174536
* - Use ruxagg() in calcru() to make sure we have current tick information from all threads.
  Author: Jeff Roberson | Age: 2007-07-17 | Files: 1 | Lines: -0/+8
  Discussed with: bde, attilio
  Approved by: re
  Notes: svn path=/head/; revision=171468
* Fix a couple of issues with the stack limit for 32-bit processes on 64-bit kernels exposed by the recent fixes to resource limits for 32-bit processes on 64-bit kernels:
  - Let ABIs expose their maximum stack size via a new pointer in sysentvec and use that in preference to maxssiz during exec() rather than always using maxssiz for all processes.
  - Apply the ABI's limit fixup to the previous stack size when adjusting RLIMIT_STACK to determine if the existing mapping for the stack needs to be grown or shrunk (as well as how much it should be grown or shrunk).
  Author: John Baldwin | Age: 2007-07-12 | Files: 1 | Lines: -8/+12
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=171410
* Remove the restriction that rtprio(2) cannot be used to set the realtime or idle priority of another process owned by the same user. This means that privilege in rtprio(2) (and rtprio_thread(2)) is required indirectly via p_cansched(9) or directly to set realtime/idle privilege, rather than directly affecting target process authorization.
  Author: Robert Watson | Age: 2007-06-14 | Files: 1 | Lines: -17/+8
  Notes: svn path=/head/; revision=170745
* Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present.
  Author: Robert Watson | Age: 2007-06-12 | Files: 1 | Lines: -2/+1
  Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c.
  We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.
  Reviewed by: csjp
  Obtained from: TrustedBSD Project
  Notes: svn path=/head/; revision=170587
* rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically, changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc, which wraps both calls to be performed in an atomic way.
  Author: Attilio Rao | Age: 2007-06-09 | Files: 1 | Lines: -13/+21
  Reviewed by: jeff
  Approved by: jeff (mentor)
  Notes: svn path=/head/; revision=170472
* The current rusage code shows peculiar problems:
  - Unsafeness on ruadd() in thread_exit()
  - Unatomicity of thread_exit() in the exit1() operations
  Author: Attilio Rao | Age: 2007-06-09 | Files: 1 | Lines: -6/+3
  This patch addresses these problems allocating p_fd as part of the process and modifying the way it is accessed.
  A small chunk of this patch, resolves a race about p_state in kern_wait(), since we have to be sure about the zombif-ing process.
  Submitted by: jeff
  Approved by: jeff (mentor)
  Notes: svn path=/head/; revision=170466
* Commit 14/14 of sched_lock decomposition.
  Author: Jeff Roberson | Age: 2007-06-05 | Files: 1 | Lines: -24/+33
  - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization.
  - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization.
  Tested by: kris, current@
  Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
  Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
  Notes: svn path=/head/; revision=170307
* - Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock which is going away. All modifications to rusage are now done in the context of the owning thread. Reads proceed without locks.
  - Aggregate exiting threads rusage in thread_exit() such that the exiting thread's rusage is not lost.
  - Provide a new routine, rufetch() to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to its exit. The exited process's rusage is still available via p_ru.
  - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits.
  Author: Jeff Roberson | Age: 2007-06-01 | Files: 1 | Lines: -22/+103
  Initial patch by: attilio
  Reviewed by: attilio, bde (some objections), arch (mostly silent)
  Notes: svn path=/head/; revision=170174
* Universally adopt most conventional spelling of acquire.
  Author: Robert Watson | Age: 2007-05-27 | Files: 1 | Lines: -1/+1
  Notes: svn path=/head/; revision=170035
* Rework the support for ABIs to override resource limits (used by 32-bit processes under 64-bit kernels).
  Author: John Baldwin | Age: 2007-05-14 | Files: 1 | Lines: -6/+4
  Previously, each 32-bit process overwrote its resource limits at exec() time. The problem with this approach is that the new limits affect all child processes of the 32-bit process, including if the child process forks and execs a 64-bit process.
  To fix this, don't overwrite the resource limits during exec(). Instead, sv_fixlimits() is now replaced with a different function sv_fixlimit() which asks the ABI to sanitize a single resource limit. We then use this when querying and setting resource limits. Thus, if a 32-bit process sets a limit, then that new limit will be inherited by future children. However, if the 32-bit process doesn't change a limit, then a future 64-bit child will see the "full" 64-bit limit rather than the 32-bit limit.
  MFC is tentative since it will break the ABI of old linux.ko modules (no other modules are affected).
  MFC after: 1 week
  Notes: svn path=/head/; revision=169565
* Further system call comment cleanup:
  - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
  - Remove extra blank lines in some cases.
  - Add extra blank lines in some cases.
  - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name.
  - Add punctuation.
  - Re-wrap some comments.
  Author: Robert Watson | Age: 2007-03-05 | Files: 1 | Lines: -3/+0
  Notes: svn path=/head/; revision=167232
* Remove 'MPSAFE' annotations from the comments above most system calls: all system calls now enter without Giant held, and then in some cases, acquire Giant explicitly.
  Author: Robert Watson | Age: 2007-03-04 | Files: 1 | Lines: -25/+0
  Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.
  Notes: svn path=/head/; revision=167211
* Close race conditions between fork() and [sg]etpriority()'s PRIO_USER case, possibly also other places that dereferences p_ucred.
  Author: Xin LI | Age: 2007-02-26 | Files: 1 | Lines: -0/+3
  In the past, we insert a new process into the allproc list right after PID allocation, and release the allproc_lock sx. Because most content in new proc's structure is not yet initialized, this could lead to undefined result if we do not handle PRS_NEW with care.
  The problem with PRS_NEW state is that it does not provide fine grained information about how much initialization is done for a new process. By definition, after PRIO_USER setpriority(), all processes that belongs to given user should have their nice value set to the specified value. Therefore, if p_{start,end}copy section was done for a PRS_NEW process, we can not safely ignore it because p_nice is in this area. On the other hand, we should be careful on PRS_NEW processes because we do not allow non-root users to lower their nice values, and without a successful copy of the copy section, we can get stale values that is inherited from the uninitialized area of the process structure.
  This commit tries to close the race condition by grabbing proc mutex *before* we release allproc_lock xlock, and do copy as well as zero immediately after the allproc_lock xunlock. This guarantees that the new process would have its p_copy and p_zero sections, as well as user credential information initialized. In getpriority() case, instead of grabbing PROC_LOCK for a PRS_NEW process, we just skip the process in question, because it does not affect the final result of the call, as the p_nice value would be copied from its parent, and we will see it during allproc traverse.
  Other potential solutions are still under evaluation.
  Discussed with: davidxu, jhb, rwatson
  PR: kern/108071
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=167007
* Use priv_check(9) instead of suser(9) for checking the privilege to set real-time priority on a thread. It looks like this suser(9) call was introduced after my first pass through replacing superuser checks with named privilege checks.
  Author: Robert Watson | Age: 2007-02-19 | Files: 1 | Lines: -1/+1
  Notes: svn path=/head/; revision=166828
* Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.
  Author: Xin LI | Age: 2007-01-17 | Files: 1 | Lines: -1/+1
  Notes: svn path=/head/; revision=166073
* Threading cleanup.. part 2 of several.
  Author: Julian Elischer | Age: 2006-12-06 | Files: 1 | Lines: -89/+0
  Make part of John Birrell's KSE patch permanent.. Specifically, remove:
  Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated.
  All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it.
  Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable.
  The ULE scheduler compiles again but I have no idea if it works.
  The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit.
  Tested by David Xu, and Dan Eischen using libthr and libpthread.
  Notes: svn path=/head/; revision=164936
* Use scheduler API sched_user_prio() to adjust thread's userland priority, use td_base_user_prio to get real userland priority since POSIX priority mutex may adjust td_user_pri which is an effective priority.
  Author: David Xu | Age: 2006-11-20 | Files: 1 | Lines: -12/+15
  Notes: svn path=/head/; revision=164431
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
  Author: Robert Watson | Age: 2006-11-06 | Files: 1 | Lines: -3/+5
  Sponsored by: nCircle Network Security, Inc.
  Obtained from: TrustedBSD Project
  Discussed on: arch@
  Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
  Notes: svn path=/head/; revision=164033
* Make KSE a kernel option, turned on by default in all GENERIC kernel configs except sun4v (which doesn't process signals properly with KSE).
  Author: John Birrell | Age: 2006-10-26 | Files: 1 | Lines: -0/+86
  Reviewed by: davidxu@
  Notes: svn path=/head/; revision=163709
* Replace system call thr_getscheduler, thr_setscheduler, thr_setschedparam with rtprio_thread, while rtprio system call is for process only, the new system call rtprio_thread is responsible for LWP.
  Author: David Xu | Age: 2006-09-21 | Files: 1 | Lines: -0/+95
  Notes: svn path=/head/; revision=162497
* Commit the results of the typo hunt by Darren Pilgrim. This change affects documentation and comments only, no real code involved.
  Author: Yaroslav Tykhiy | Age: 2006-08-04 | Files: 1 | Lines: -1/+1
  PR: misc/101245
  Submitted by: Darren Pilgrim <darren pilgrim bitfreak org>
  Tested by: md5(1)
  MFC after: 1 week
  Notes: svn path=/head/; revision=160964
* Go over calcru and friends once more.
  Author: Poul-Henning Kamp | Age: 2006-03-11 | Files: 1 | Lines: -47/+48
  Reintroduce the monotonicity for the normal case and make the two special cases behave in what is believed to be the most sensible fashion.
  Notes: svn path=/head/; revision=156570
* Add slop to "backwards" cpu accounting messages, 3 usec or 1% whichever triggers.
  Author: Poul-Henning Kamp | Age: 2006-03-09 | Files: 1 | Lines: -1/+5
  This should eliminate all the trivial messages which result from minor increases in cpu_tick frequency.
  Machines which don't do cpu clock fiddling shouldn't issue "backwards" messages now.
  Laptops and other machines where the initial estimate of cputicks may be waaaay off will still issue warnings.
  Notes: svn path=/head/; revision=156484
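One way to read the "3 usec or 1%" slop rule is sketched below. This is an illustration under stated assumptions, not the code from r156484: the function name runtime_went_backwards() and the exact comparison forms are hypothetical, and the sketch assumes that either tolerance on its own is enough to suppress the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a drop in accumulated runtime from
 * old_us to new_us (both in microseconds) is large enough to report.
 * Assumption: a drop is tolerated if it is under 3 usec, or if it is no
 * more than 1% of the old value.
 */
static bool
runtime_went_backwards(uint64_t old_us, uint64_t new_us)
{
	uint64_t drop;

	if (new_us >= old_us)
		return (false);		/* monotonic, nothing to report */
	drop = old_us - new_us;
	if (drop < 3)
		return (false);		/* within the 3 usec slop */
	if (drop <= old_us / 100)
		return (false);		/* within the 1% slop */
	return (true);			/* a real step backwards */
}
```

Under this reading a 5000 usec drop from one second of accumulated runtime stays silent, while a drop from 100 usec to 90 usec is still reported.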
* Various style and comment fixes.
  Author: John Baldwin | Age: 2006-02-22 | Files: 1 | Lines: -8/+7
  Submitted by: bde
  Notes: svn path=/head/; revision=155916
* Split calcru() back into a calcru1() function shared with calccru() and a calcru() wrapper that passes a local rusage_ext on the stack that is a snapshot to do the calculations on. Now we can pass p->p_crux to calcru1() in calccru() again which fixes the issues with runtime going backwards messages when dead processes are harvested by init.
  Author: John Baldwin | Age: 2006-02-21 | Files: 1 | Lines: -10/+33
  Reviewed by: phk
  Tested by: Stefan Ehmann shoesoft at gmx dot net
  Notes: svn path=/head/; revision=155882
* CPU time accounting speedup (step 2)
  Author: Poul-Henning Kamp | Age: 2006-02-11 | Files: 1 | Lines: -68/+45
  Keep accounting time (in per-cpu) cputicks and the statistics counts in the thread and summarize into struct proc when at context switch.
  Don't reach across CPUs in calcru().
  Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines).
  Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true.
  Use 27MHz counter on i386/Geode.
  Use TSC on amd64 & i386 if present.
  Use tick counter on sparc64
  Notes: svn path=/head/; revision=155534
* Modify the way we account for CPU time spent (step 1)
  Author: Poul-Henning Kamp | Age: 2006-02-07 | Files: 1 | Lines: -9/+12
  Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers.
  For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.)
  On slower machines, the avoided multiplications to normalize timestamps at every context switch, comes out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
  Notes: svn path=/head/; revision=155444
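The idea of accumulating raw ticks and scaling only on inspection can be sketched as follows. This is a minimal illustration, not the kernel's conversion routine: the name ticks_to_usec() and its parameters are hypothetical, and the sketch assumes a tick rate well below about 1.8e13 Hz so that the remainder term cannot overflow 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a raw tick count to microseconds, given
 * the tick rate in ticks per second. Intended to run only when someone
 * inspects the numbers, not at every context switch.
 */
static uint64_t
ticks_to_usec(uint64_t ticks, uint64_t tick_rate)
{
	assert(tick_rate != 0);
	/*
	 * Convert whole seconds first, then the sub-second remainder, so
	 * that ticks * 1000000 is never computed on the full tick count.
	 */
	return ((ticks / tick_rate) * 1000000 +
	    (ticks % tick_rate) * 1000000 / tick_rate);
}
```

With this split, the hot path only adds tick deltas to a counter, and the multiplication and division happen once per query.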
* Back out changes made in rev. 1.151. They were bogus.
  Author: Stephan Uphoff | Age: 2006-01-25 | Files: 1 | Lines: -1/+1
  Cluebat applied by: jhb@
  Notes: svn path=/head/; revision=154793