path: root/sys/kern
Commit message | Author | Age | Files | Lines
* This patch adds an M_NOFREE flag which allows one to mark an mbuf as | Kip Macy | 2007-10-06 | 2 | -0/+12
  not being independently freeable. This allows one to embed an mbuf in the cluster itself. This confers the benefits of the packet zone on all cluster sizes.
  Embedded mbufs currently suffer from the same limitation that packet zone mbufs do in that one cannot disconnect them and pass them around independently of the cluster. It would likely be possible to eliminate this limitation in the future by adding a second reference for the mbuf itself.
  Approved by: re (gnn)
  Notes: svn path=/head/; revision=172463
* Allow drivers to free an mbuf without having the mbuf be touched if | Kip Macy | 2007-10-06 | 1 | -2/+5
  the driver has already freed any attached tags
  Approved by: re (gnn)
  Notes: svn path=/head/; revision=172462
* Fix sx_try_slock(), so it only fails when there is an exclusive owner. | Pawel Jakub Dawidek | 2007-10-02 | 1 | -9/+12
  Before this fix, it was possible for the function to fail if the number of sharers changed between the 'x = sx->sx_lock' step and the atomic_cmpset_acq_ptr() call.
  This fixes a ZFS problem where ZFS returns strange EIO errors under load. In ZFS there is code that depends on the fact that sx_try_slock() can only fail if there is an exclusive owner.
  Discussed with: attilio
  Reviewed by: jhb
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172416
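The retry behaviour this commit describes can be sketched in userland with C11 atomics. The lock-word layout (`EXCL_BIT`, sharer count in the upper bits) and the names `toy_sx` and `try_slock` are illustrative assumptions, not the real sx(9) encoding or implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: bit 0 = exclusively owned, other bits = sharer count. */
#define EXCL_BIT   ((uintptr_t)1)
#define ONE_SHARER ((uintptr_t)2)

struct toy_sx {
    _Atomic uintptr_t word;
};

/*
 * Try to take a shared lock. Fails only when an exclusive owner is
 * present; losing a race against another sharer just retries.
 */
static bool
try_slock(struct toy_sx *sx)
{
    uintptr_t x = atomic_load_explicit(&sx->word, memory_order_relaxed);

    for (;;) {
        if (x & EXCL_BIT)
            return false;
        /* On failure x is refreshed with the current word; loop and re-check. */
        if (atomic_compare_exchange_weak_explicit(&sx->word, &x,
            x + ONE_SHARER, memory_order_acquire, memory_order_relaxed))
            return true;
    }
}
```

The point of the loop is that a failed compare-and-swap is not treated as a failed lock attempt; only the exclusive bit is.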
* - Reassign the thread queue lock to newtd prior to switching. Assigning | Jeff Roberson | 2007-10-02 | 1 | -4/+6
  after the switch leads to a race where the outgoing thread still owns the local queue lock while another cpu may switch it in. This race is only possible on machines where cpu_switch can take significantly longer on different cpus which in practice means HTT machines with unfair thread scheduling algorithms.
  Found by: kris (of course)
  Approved by: re
  Notes: svn path=/head/; revision=172411
* - Move the rebalancer back into hardclock to prevent potential softclock | Jeff Roberson | 2007-10-02 | 1 | -55/+86
  starvation caused by unbalanced interrupt loads.
  - Change the rebalancer to work on stathz ticks but retain randomization.
  - Simplify locking in tdq_idled() to use the tdq_lock_pair() rather than complex sequences of locks to avoid deadlock.
  Reported by: kris
  Approved by: re
  Notes: svn path=/head/; revision=172409
* - Honor the PREEMPTION and FULL_PREEMPTION flags by setting the default | Jeff Roberson | 2007-09-27 | 1 | -2/+10
  value for kern.sched.preempt_thresh appropriately. It can still be adjusted at runtime. ULE will still use IPI_PREEMPT in certain migration situations.
  - Assert that we're not trying to compile ULE on an unsupported architecture. To date, I believe only i386 and amd64 have implemented the third cpu switch argument required.
  Approved by: re
  Notes: svn path=/head/; revision=172345
* Fix the description of the formula used to autosize the number of | Ruslan Ermilov | 2007-09-26 | 1 | -1/+1
  buffers in the buffer cache.
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172329
* Change the management of cached pages (PQ_CACHE) in two fundamental | Alan Cox | 2007-09-25 | 2 | -11/+5
  ways:
  (1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock.
  This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock.
  Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq.
  (2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed.
  Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail.
  Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@
  Tested by: an earlier version by kris@
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172317
* - Bound the interactivity score so that it cannot become negative. | Jeff Roberson | 2007-09-24 | 1 | -1/+1
  Approved by: re
  Notes: svn path=/head/; revision=172308
* - Improve grammar. s/it's/its/. | Jeff Roberson | 2007-09-22 | 1 | -5/+13
  - Improve the long-term load balancer by always IPIing exactly once. Previously the delay after rebalancing could cause problems with uneven workloads.
  - Allow nice to have a linear effect on the interactivity score. This allows negatively niced programs to stay interactive longer. It may be useful with very expensive Xorg servers under high loads. In general it should not be necessary to alter the nice level to improve interactive response. We may also want to consider never allowing positively niced processes to become interactive at all.
  - Initialize ccpu to 0 rather than 0.0. The decimal point was left over from when the code was copied from 4bsd. ccpu is 0 in ULE because ULE only exports weighted cpu values.
  Reported by: Steve Kargl (Load balancing problem)
  Approved by: re
  Notes: svn path=/head/; revision=172293
* Fix some locking cases where we ask for an exclusively locked vnode, but we get | Pawel Jakub Dawidek | 2007-09-21 | 2 | -4/+25
  a shared locked vnode instead when vfs.lookup_shared is set to 1.
  Discussed with: kib, kris
  Tested by: kris
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172274
* - Redefine p_swtime and td_slptime as p_swtick and td_slptick. This | Jeff Roberson | 2007-09-21 | 4 | -26/+29
  changes the units from seconds to the value of 'ticks' when swapped in/out. ULE does not have a periodic timer that scans all threads in the system and as such maintaining a per-second counter is difficult.
  - Change computations requiring the unit in seconds to subtract ticks and divide by hz. This does make the wraparound condition hz times more frequent but this is still in the range of several months to years and the adverse effects are minimal.
  Approved by: re
  Notes: svn path=/head/; revision=172264
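The "subtract ticks and divide by hz" computation can be sketched as below. The function name `ticks_to_seconds` and its parameters are hypothetical; the sketch assumes a signed tick counter that wraps, and does the subtraction in unsigned arithmetic so a wrapped counter still yields the right difference.

```c
#include <limits.h>

/*
 * Seconds elapsed between a recorded tick stamp and the current tick
 * counter. Unsigned subtraction is modular, so the result stays correct
 * across a single wraparound of the counter.
 */
static unsigned int
ticks_to_seconds(int now_ticks, int start_ticks, int hz)
{
    unsigned int delta = (unsigned int)now_ticks - (unsigned int)start_ticks;

    return delta / (unsigned int)hz;
}
```

A caller would store the tick value at swap-out time and pass it as `start_ticks` later, instead of maintaining a per-second counter.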
* - Call sched_sleep() before we suspend threads. sched_wakeup() is already | Jeff Roberson | 2007-09-21 | 1 | -0/+2
  called via setrunnable(). This allows time slept while suspended to be accounted for swap.
  Approved by: re
  Notes: svn path=/head/; revision=172263
* Fix some entries in the locks static table of witness. | Attilio Rao | 2007-09-20 | 3 | -11/+9
  In particular:
  - smp_tlb_mtx is no longer used, so it is axed.
  - smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement in the table, however, has been the source of a false positive LOR reporting with the dt_lock. However, smp rendezvous lock would have had sched_lock there for older lock, so it wasn't still a leaf lock.
  - allpmaps is only used in ia32 architecture, so it is inserted in the appropriate stub.
  Additionally:
  - kse_zombie_lock is no longer present, so its definition is axed out.
  - zombie_lock doesn't need to have an exported symbol, so just let it be declared as static.
  Tested by: kris
  Approved by: jeff (mentor)
  Approved by: re
  Notes: svn path=/head/; revision=172256
* - Move all of the PS_ flags into either p_flag or td_flags. | Jeff Roberson | 2007-09-17 | 13 | -69/+47
  - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time.
  - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags.
  - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM.
  Reported by: pho
  Reviewed by: attilio, kib
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172207
* Remove the definition and implementation of 'CALLOUT_NETGIANT', a now- (and | Robert Watson | 2007-09-15 | 1 | -11/+2
  possibly always-) unused define.
  Reported by: kmacy
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172184
* Currently the LO_NOPROFILE flag (which is masked on upper level code by | Attilio Rao | 2007-09-14 | 1 | -1/+1
  per-primitive macros like MTX_NOPROFILE, SX_NOPROFILE or RW_NOPROFILE) is not really honoured. In particular lock_profile_obtain_lock_failure() and lock_profile_obtain_lock_success() do not respect this flag. The bug leads to locks marked with no-profiling being profiled as well. In the case of the clock_lock, used by the i8254 timer, this leads to unpredictable behaviour on both amd64 and ia32 (double fault panics, sudden reboots, etc.). The amd64 clock_lock is also not marked as not profilable as it should be. Fix these bugs by adding proper checks in the lock profiling code and at clock_lock initialization time.
  i8254 bug pointed out by: kris
  Tested by: matteo, Giuseppe Cocomazzi <sbudella at libero dot it>
  Approved by: jeff (mentor)
  Approved by: re
  Notes: svn path=/head/; revision=172163
* subr_sleepqueue.c is missing a thread lock, which leads to dangerous | Attilio Rao | 2007-09-13 | 1 | -0/+2
  races for some struct thread members. More specifically, this bug seems responsible for some memory dumping problems people were experiencing. Fix this by adding correct thread locking.
  Tested by: rwatson
  Submitted by: tegge
  Approved by: jeff
  Approved by: re
  Notes: svn path=/head/; revision=172155
* When restoring the mount after umount failed, the MNTK_UNMOUNT flag | Konstantin Belousov | 2007-09-12 | 2 | -5/+10
  prevents insmntque() from placing the reallocated syncer vnode on the mount list, which causes a panic in vfs_allocate_syncvnode().
  Introduce the MNTK_NOINSMNTQ flag, which marks the period when insmntque() is not allowed to succeed, instead of MNTK_UNMOUNT. MNTK_NOINSMNTQ is set and cleared simultaneously with MNTK_UNMOUNT, except on the umount error path, where it is cleared just before the syncer vnode is going to be allocated.
  Reported by: Peter Jeremy <peterjeremy optushome com au>
  Suggested by: tegge
  Approved by: re (rwatson)
  Notes: svn path=/head/; revision=172151
* This is a follow-up, cleaning-up commit about recent changes involving | Attilio Rao | 2007-09-11 | 1 | -1/+1
  topology foo functions. Working on the patch for topology problems in ia32/amd64 exposed some problems regarding functions ordering in the SI_SUB_CPU family of SYSINIT'ed subsystems. In order to avoid problems with new modified to involved functions, a correct ordering is not semantically specified for SI_SUB_CPU functions (for a larger view of the issue please visit: http://lists.freebsd.org/pipermail/freebsd-current/2007-July/075409.html )
  Discussed with: peter
  Tested by: kris, Rui Paulo <rpaulo@FreeBSD.org>
  Approved by: jeff
  Approved by: re
  Notes: svn path=/head/; revision=172144
* Rename mac_check_vnode_delete() MAC Framework and MAC Policy entry | Robert Watson | 2007-09-10 | 1 | -2/+2
  point to mac_check_vnode_unlink(), reflecting UNIX naming conventions.
  This is the first of several commits to synchronize the MAC Framework in FreeBSD 7.0 with the MAC Framework as it will appear in Mac OS X Leopard.
  Reviewed by: csjp, Samy Bahra <sbahra at gwu dot edu>
  Submitted by: Jacques Vidrine <nectar at apple dot com>
  Obtained from: Apple Computer, Inc.
  Sponsored by: SPARTA, SPAWAR
  Approved by: re (bmah)
  Notes: svn path=/head/; revision=172107
* In userland_sysctl(), call useracc() with the actual newlen value to be | Robert Watson | 2007-09-02 | 1 | -1/+1
  used, rather than the one passed via 'req', which may not reflect a rewrite. This call to useracc() is redundant to validation performed by later copyin()/copyout() calls, so there isn't a security issue here, but this could technically lead to excessive validation of addresses if the length in newlen is shorter than req.newlen.
  Approved by: re (kensmith)
  Reviewed by: jhb
  Submitted by: Constantine A. Murenin <cnst+freebsd@bugmail.mojo.ru>
  Sponsored by: Google Summer of Code 2007
  Notes: svn path=/head/; revision=172038
* Close a race that snuck in with the recent changes to fix a LOR between | John Baldwin | 2007-08-31 | 1 | -13/+27
  the callout_lock spin lock and the sleepqueue spin locks. In the fix, callout_drain() has to drop the callout_lock so it can acquire the sleepqueue lock. The state of the callout can change while the callout_lock is held however (for example, it can be rescheduled via callout_reset()). The previous code assumed that the only state change that could happen is that the callout could finish executing. This change alters callout_drain() to effectively restart and recheck everything after it acquires the sleepqueue lock thus handling all the possible states that the callout could be in after any changes while callout_lock was dropped.
  Approved by: re (kensmith)
  Tested by: kris
  Notes: svn path=/head/; revision=172025
* Add missing newline in the log message of the previous commit. | Diomidis Spinellis | 2007-08-31 | 1 | -1/+1
  Approved by: re (kensmith) - implied
  Notes: svn path=/head/; revision=172024
* Don't panic. When encountering a negative value call log(LOG_NOTICE, ...) | Diomidis Spinellis | 2007-08-31 | 1 | -1/+7
  and record LONG_MAX, instead of calling KASSERT(...).
  Reported by: rwatson
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172023
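The "log and substitute" approach, as opposed to asserting, might look like the following userland sketch. The function name `clamp_nonnegative` is hypothetical, and `fprintf(stderr, ...)` stands in for the kernel's log(LOG_NOTICE, ...) call; the commit does not show the surrounding code, so this only illustrates the pattern.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Return val unchanged when it is non-negative. A negative value is
 * reported and replaced by LONG_MAX rather than stopping the program.
 */
static long
clamp_nonnegative(long val)
{
    if (val < 0) {
        /* Stand-in for log(LOG_NOTICE, ...) in the kernel. */
        fprintf(stderr, "notice: negative value %ld, recording LONG_MAX\n", val);
        return LONG_MAX;
    }
    return val;
}
```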
* Partially revert the previous change. I failed to notice that where | John Baldwin | 2007-08-29 | 1 | -2/+0
  ktruserret() is invoked, an unlocked check of the per-process queue is performed inline, thus, we don't lock the ktrace_sx on every userret().
  Pointy hat to: jhb
  Approved by: re (kensmith)
  Pointy hat recovered from: rwatson
  Notes: svn path=/head/; revision=172011
* Rework the routines to convert a 5.x+ statfs structure (with fixed-size | John Baldwin | 2007-08-28 | 1 | -4/+46
  64-bit counters) to a 4.x statfs structure (with long-sized counters).
  - For block counters, we scale up the block size sufficiently large so that the resulting block counts fit into the long-sized (long for the ABI, so 32-bit in freebsd32) counters. In 4.x the NFS client's statfs VOP did this already. This can lie about the block size to 4.x binaries, but it presents a more accurate picture of the ratios of free and available space.
  - For non-block counters, fix the freebsd32 stats converter to cap the values at INT32_MAX rather than losing the upper 32-bits to match the behavior of the 4.x statfs conversion routine in vfs_syscalls.c
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=172003
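The two conversion rules above (scale the block size for block counters, cap everything else) can be sketched as follows. The struct layouts and the name `narrow_statfs` are hypothetical simplifications, not the real statfs structures; clamping an oversized block size is also an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the wide (5.x+) and narrow (4.x) layouts. */
struct wide_fs   { uint64_t bsize; uint64_t blocks; uint64_t bfree; int64_t files; };
struct narrow_fs { int32_t  bsize; int32_t  blocks; int32_t  bfree; int32_t files; };

static void
narrow_statfs(const struct wide_fs *in, struct narrow_fs *out)
{
    uint64_t bsize = in->bsize, blocks = in->blocks, bfree = in->bfree;

    /* Double the block size until every block count fits in 32 bits. */
    while (blocks > INT32_MAX || bfree > INT32_MAX) {
        bsize <<= 1;
        blocks >>= 1;
        bfree >>= 1;
    }
    /* Assumption: an out-of-range scaled block size is clamped as well. */
    out->bsize = bsize > INT32_MAX ? INT32_MAX : (int32_t)bsize;
    out->blocks = (int32_t)blocks;
    out->bfree = (int32_t)bfree;

    /* Non-block counters are capped instead of truncated. */
    out->files = in->files > INT32_MAX ? INT32_MAX : (int32_t)in->files;
}
```

Scaling keeps the product of block size and block count roughly constant, so the ratios of free to total space seen by an old binary stay meaningful even though the reported block size is not the real one.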
* - During shutdown pending, when the last sack came in and | Randall Stewart | 2007-08-27 | 1 | -8/+5
  the last message on the send stream was "null" but still there, a state we allow, we could get hung and not clean it up and wait for the shutdown guard timer to clear the association without a graceful close. Fix this so that we properly clean up.
  - Added support for Multiple ASCONF per new RFC. We only (so far) accept input of these and cannot yet generate a multi-asconf.
  - Sysctl'd support for experimental Fast Handover feature. Always disabled unless sysctl or socket option changes to enable.
  - Error case in add-ip where the peer supports AUTH and ADD-IP but does NOT require AUTH of ASCONF/ASCONF-ACK. We need to ABORT in this case.
  - According to the Kyoto summit of socket api developers (Solaris, Linux, BSD). We need to have:
    o non-eeor mode messages be atomic - Fixed
    o Allow implicit setup of an assoc in 1-2-1 model if using the sctp_**() send calls - Fixed
    o Get rid of HAVE_XXX declarations - Done
    o add a sctp_pr_policy in hole in sndrcvinfo structure - Done
    o add a PR_SCTP_POLICY_VALID type flag - yet to-do in a future patch!
  - Optimize sctp6 calls to reuse code in sctp_usrreq. Also optimize when we close sending out the data and disabling Nagle.
  - Change key concatenation order to match the auth RFC
  - When sending OOTB shutdown_complete always do csum.
  - Don't send PKT-DROP to a PKT-DROP
  - For abort chunks just always checksums same for shutdown-complete.
  - inpcb_free front state had a bug where in queue data could wedge an assoc. We need to just abandon ones in front states (free_assoc).
  - If a peer sends us a 64k abort, we would try to assemble a response packet which may be larger than 64k. This then would be dropped by IP. Instead make a "minimum" size for us 64k-2k (we want at least 2k for our initack). If we receive such an init discard it early without all the processing.
  - When we peel off we must increment the tcb ref count to keep it from being freed from underneath us.
  - handling fwd-tsn had bugs that caused memory overwrites when given faulty data, fixed so can't happen and we also stop at the first bad stream no.
  - Fixed so comm-up generates the adaption indication.
  - peeloff did not get the hmac params copied.
  - fix it so we lock the addr list when doing src-addr selection (in future we need to use a multi-reader/one writer lock here)
  - During lowlevel output, we could end up with a _l_addr set to null if the iterator is calling the output routine. This means we would possibly crash when we gather the MTU info. Fix so we only do the gather where we have a src address cached.
  - we need to be sure to set abort flag on conn state when we receive an abort.
  - peeloff could leak a socket. Moved code so the close will find the socket if the peeloff fails (uipc_syscalls.c)
  Approved by: re@freebsd.org (Ken Smith)
  Notes: svn path=/head/; revision=171990
* Destroy the kaio_mtx when freeing the struct kaioinfo in | Konstantin Belousov | 2007-08-20 | 1 | -1/+5
  aio_proc_rundown(). Do not allow a zero-length read to be passed to the fo_read file method by aio.
  Reported and tested by: Peter Holm
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=171901
* - Improve runq_findbit_from() which is used by ULE's circular queue. Mask | Jeff Roberson | 2007-08-20 | 1 | -32/+22
  off the bits we want to ignore on the first pass rather than doing a linear scan. This puts us within a few instructions of the cost of runq_findbit() and removes this function from the top of profiling output for context switch heavy workloads.
  Approved by: re
  Notes: svn path=/head/; revision=171900
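The masking idea can be sketched on a single 32-bit status word. The real run queue bitmap spans several words, so this is a simplified illustration; the name `findbit_from` is hypothetical and POSIX `ffs()` stands in for the kernel's find-first-set primitive.

```c
#include <stdint.h>
#include <strings.h>   /* ffs(): POSIX, 1-based index of the lowest set bit */

/*
 * Return the index of the first set bit at or after 'start', wrapping
 * around to bit 0, or -1 if no bit is set.
 */
static int
findbit_from(uint32_t bits, unsigned int start)
{
    /* Ones at positions >= start; lower bits are ignored on the first pass. */
    uint32_t mask = ~(uint32_t)0 << (start & 31);
    int bit;

    bit = ffs((int)(bits & mask));
    if (bit != 0)
        return bit - 1;

    /* Second pass: wrap around and consider the whole word. */
    bit = ffs((int)bits);
    return bit - 1;
}
```

Each pass is one mask and one find-first-set, rather than a bit-by-bit walk from the start position.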
* - Set steal_thresh to log2(ncpus). This improves idle-time load balancing | Jeff Roberson | 2007-08-20 | 1 | -0/+6
  on 2cpu machines by reducing it to 1 by default. This improves loaded operation on 8cpu machines by increasing it to 3 where the extra idle time is not as critical.
  Approved by: re
  Notes: svn path=/head/; revision=171899
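A small sketch of the log2(ncpus) default, reproducing the two data points the message gives (2 CPUs gives 1, 8 CPUs gives 3). The function name `default_steal_thresh` is hypothetical, and rounding down for non-power-of-two counts is an assumption, since the message does not say how those are handled.

```c
/*
 * floor(log2(ncpus)) for ncpus >= 1, computed by counting how many
 * times the value can be halved before reaching zero.
 */
static int
default_steal_thresh(unsigned int ncpus)
{
    int log = 0;

    while (ncpus >>= 1)
        log++;
    return log;
}
```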
* Always call sched_bind(), even if on the CPU in question. It is wrong to | Nate Lawson | 2007-08-20 | 1 | -25/+15
  check if we're already on that cpu and skip the bind since the thread could be migrated off in the meantime.
  Suggested by: jeff
  Approved by: re
  Notes: svn path=/head/; revision=171898
* Use a different loop variable for the inner loop. This previous reuse could | Nate Lawson | 2007-08-19 | 1 | -4/+4
  have caused a hang, but we got lucky with the available multi-CPU states on actual hardware.
  Submitted by: Bjorn Koenig <bkoenig / alpha-tierchen.de>
  Approved by: re
  MFC after: 3 days
  Notes: svn path=/head/; revision=171896
* Regenerate. | David Xu | 2007-08-16 | 3 | -0/+11
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=171861
* Add thr_kill2 syscall which sends a signal to a thread in another process. | David Xu | 2007-08-16 | 2 | -0/+56
  Submitted by: Tijl Coosemans tijl at ulyssis dot org
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=171859
* On 6.x this works: | John Baldwin | 2007-08-15 | 1 | -11/+56
    % mount | grep home
    /dev/ad4s1e on /home (ufs, local, noatime, soft-updates)
    % mount -u -o atime /home
    % mount | grep home
    /dev/ad4s1e on /home (ufs, local, soft-updates)
  Restore this behavior on 7.x for the following mount options: noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow
  In addition, on 7.x, the following are equivalent:
    mount -u -o atime /home
    mount -u -o nonoatime /home
  Ideally, when we introduce new mount options, we should avoid options starting with "no". :)
  Requested by: jhb
  Reported by: Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com>
  Approved by: re (bmah)
  Proxy commit for: rodrigc
  Notes: svn path=/head/; revision=171852
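One way to picture why "atime" and "nonoatime" both cancel "noatime" is a helper that treats two option names as opposites when one is the other with a "no" prefix. This is only an illustrative sketch; `is_negation_of` and `option_cancels` are hypothetical names and this is not the actual mount option parsing code.

```c
#include <stdbool.h>
#include <string.h>

/* True when a is exactly "no" followed by b, e.g. "noatime" vs "atime". */
static bool
is_negation_of(const char *a, const char *b)
{
    return strncmp(a, "no", 2) == 0 && strcmp(a + 2, b) == 0;
}

/*
 * True when a newly supplied option should cancel an existing one:
 * either name is the "no"-prefixed form of the other.
 */
static bool
option_cancels(const char *newopt, const char *existing)
{
    return is_negation_of(newopt, existing) ||
        is_negation_of(existing, newopt);
}
```

Under this rule, "atime" cancels "noatime" because the existing option is the prefixed form, and "nonoatime" cancels it because the new option is.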
* Improve vn_printf() by: | Pawel Jakub Dawidek | 2007-08-13 | 1 | -7/+45
  - adding missing vnode flags,
  - printing unknown flags as numbers,
  - using strlcat() instead of strcat().
  Approved by: re (bmah)
  Notes: svn path=/head/; revision=171823
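The second and third points can be sketched together: append known flag names with a bounded append, then print whatever bits remain as a number. The flag table and the name `format_flags` are hypothetical, and `snprintf()` is used as the bounded append here because strlcat() is not available in every userland C library.

```c
#include <stdio.h>
#include <string.h>

struct flag_name {
    unsigned long bit;
    const char *name;
};

/* Hypothetical flag table, for illustration only. */
static const struct flag_name toy_flags[] = {
    { 0x1, "|FLAG_A" },
    { 0x2, "|FLAG_B" },
};

/* Format 'flags' into buf (len must be > 0), never writing past len bytes. */
static void
format_flags(unsigned long flags, char *buf, size_t len)
{
    size_t i, used;

    buf[0] = '\0';
    for (i = 0; i < sizeof(toy_flags) / sizeof(toy_flags[0]); i++) {
        if (flags & toy_flags[i].bit) {
            used = strlen(buf);
            snprintf(buf + used, len - used, "%s", toy_flags[i].name);
            flags &= ~toy_flags[i].bit;   /* mark this bit as handled */
        }
    }
    /* Any bits without a name are printed numerically. */
    if (flags != 0) {
        used = strlen(buf);
        snprintf(buf + used, len - used, "|0x%lx", flags);
    }
}
```

Clearing each bit as it is named is what makes the leftover value exactly the set of unknown flags.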
* Do not call free() while holding vnode interlock. | Konstantin Belousov | 2007-08-07 | 1 | -27/+44
  Reported and tested by: Peter Holm
  Reviewed by: jeff
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=171772
* Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which | Robert Watson | 2007-08-06 | 5 | -107/+25
  previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminating quite a bit of unwinding of locking in error cases.
  While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency.
  Reviewed by: bz, csjp
  Tested by: kris
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=171744
* - Fix one line that erroneously crept in my last commit. | Jeff Roberson | 2007-08-04 | 1 | -1/+0
  Approved by: re
  Notes: svn path=/head/; revision=171715
* - Share scheduler locks between hyper-threaded cores to protect the | Jeff Roberson | 2007-08-03 | 1 | -114/+200
  tdq_group structure. Hyper-threaded cores won't really benefit from separate locks anyway.
  - Separate out the migration case from sched_switch to simplify the main switch code. We only migrate here if called via sched_bind().
  - When preempted place the preempted thread back in the same queue at the head.
  - Improve the cpu group and topology infrastructure.
  Tested by: many on current@
  Approved by: re
  Notes: svn path=/head/; revision=171713
* - Set SW_PREEMPT when we preempt in critical_exit(). | Jeff Roberson | 2007-08-03 | 1 | -1/+1
  Approved by: re
  Notes: svn path=/head/; revision=171712
* First in a series of changes to remove the now-unused Giant compatibility | Robert Watson | 2007-07-27 | 2 | -10/+2
  framework for non-MPSAFE network protocols:
  - Remove debug_mpsafenet variable, sysctl, and tunable.
  - Remove NET_NEEDS_GIANT() and associated SYSINITs used by it to force debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel.
  - Remove logic to automatically flag interrupt handlers as non-MPSAFE if debug.mpsafenet is set for an INTR_TYPE_NET handler.
  - Remove logic to automatically flag netisr handlers as non-MPSAFE if debug.mpsafenet is set.
  - Remove references in a few subsystems, including NFS and Cronyx drivers, which keyed off debug_mpsafenet to determine various aspects of their own locking behavior.
  - Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into no-op's, as their entire behavior was determined by the value in debug_mpsafenet.
  - Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE.
  Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still present in subsystems, and will be removed in followup commits.
  Reviewed by: bz, jhb
  Approved by: re (kensmith)
  Notes: svn path=/head/; revision=171613
* Actually, upcalls cannot be freed while destroying the thread because we | Attilio Rao | 2007-07-27 | 2 | -0/+20
  would call uma_zfree() with various spinlocks held. Rearranging the code would not help here because we cannot break atomicity with respect to the process spinlock, so the only choice we have is to defer the operation. In order to do this use a global queue synchronized through the kse_lock spinlock which is freed at any thread_alloc() / thread_wait() through a call to thread_reap(). Note that this approach is not ideal as we would want a per-process list of zombie upcalls, but it follows the initial guidelines of the KSE authors.
  Tested by: jkim, pav
  Approved by: jeff, julian
  Approved by: re
  Notes: svn path=/head/; revision=171611
* When we do open, we should lock the vnode exclusively. This fixes a few races: | Pawel Jakub Dawidek | 2007-07-26 | 2 | -3/+3
  - fifo race, where two threads assign v_fifoinfo,
  - v_writecount modifications,
  - v_object modifications,
  - and probably more...
  Discussed with: kib, ups
  Approved by: re (rwatson)
  Notes: svn path=/head/; revision=171599
* The v_mountedhere field is protected by the vnode lock, not vnode's internal | Pawel Jakub Dawidek | 2007-07-26 | 1 | -1/+1
  lock.
  Approved by: re (rwatson)
  Notes: svn path=/head/; revision=171598
* upcall_free() was only used in kse_GC() which has been removed so it is now | Attilio Rao | 2007-07-23 | 1 | -8/+0
  unused; with gcc's -Werror option this raises a warning that breaks buildkernel. Fix this by removing upcall_free().
  Reported by: various
  Approved by: jeff
  Approved by: re
  Pointy hat to: attilio
  Notes: svn path=/head/; revision=171558
* Actually, KSE kernel bits locking is broken and can likely lead to | Attilio Rao | 2007-07-23 | 2 | -82/+71
  dangerous races. Fix these problems by adding correct locking for the members of 'struct kse_upcall' and other struct proc/struct thread related members. For the moment, just leave ku_mflag and ku_flags "lazy" locked. While here, clean up the code by removing the function kse_GC() (unused), and merging upcall_link(), upcall_unlink(), upcall_stash() into their respective callers (static functions, very short and only called in one place).
  Reported by: pav
  Tested by: pav (on some pointyhat cluster nodes)
  Approved by: jeff
  Approved by: re
  Sponsored by: NGX Italy (http://www.ngx.it)
  Notes: svn path=/head/; revision=171556
* If clock_ct_to_ts fails to convert the time from the real time clock, | David Malone | 2007-07-23 | 1 | -1/+1
  print a one line error message. Add some comments on not being able to trust the day of week field (I'll act on these comments in a follow up commit).
  Approved by: re
  MFC after: 3 weeks
  Notes: svn path=/head/; revision=171553
* ttyfree() frees the cdev(). But if there are pending kevents, | Konstantin Belousov | 2007-07-20 | 1 | -7/+17
  filt_ttyrdetach() etc would later attempt to dereference cdev->si_tty, causing a 0xdeadc0de dereference.
  Change kn_hook value from cdev to struct tty to avoid dereferencing freed cdev. In ttygone(), wake up select(), sigio and kevent() users in addition to the queue sleepers. Return EV_EOF from kevent filters if TS_GONE is set.
  Submitted by: peter
  Tested by: Peter Holm
  Approved by: re (kensmith)
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=171517