path: root/sys/netinet/tcp_usrreq.c
Commit message | Author | Age | Files | Lines
* Remove advertising clause from University of California Regent's | Warner Losh | 2004-04-07 | 1 | -4/+0
| license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson.
| Approved by: core, peter, alc, rwatson
| Notes: svn path=/head/; revision=128019
* Fix a panic possibility caused by returning without releasing locks. | Pawel Jakub Dawidek | 2004-04-04 | 1 | -37/+26
| It was fixed by moving the problematic checks, as well as checks that do not need locking, before the locks are acquired.
| Submitted by: Ryan Sommers <ryans@gamersimpact.com>
| In co-operation with: cperciva, maxim, mlaier, sam
| Tested by: submitter (previous patch), me (current patch)
| Reviewed by: cperciva, mlaier (previous patch), sam (current patch)
| Approved by: sam
| Dedicated to: enough!
| Notes: svn path=/head/; revision=127862
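The pattern this commit describes (run the checks that need no lock first, then leave through one exit path that always unlocks) can be sketched in user-space C roughly as follows. `struct toy_lock`, `struct toy_pcb`, `toy_bind()` and the chosen errno values are hypothetical stand-ins for illustration, not the kernel's locking macros or the actual code in tcp_usrreq.c.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel mutex; counts how often it is held. */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held++; }
static void toy_lock_release(struct toy_lock *l) { l->held--; }

/* Hypothetical protocol control block guarded by the lock. */
struct toy_pcb {
    struct toy_lock lock;
    int bound_port;     /* 0 means "not bound yet" */
};

/*
 * Argument checks run before the lock is taken, so an early return
 * cannot leave it held. Once locked, every path leaves through "out".
 */
static int toy_bind(struct toy_pcb *pcb, int port)
{
    int error = 0;

    if (pcb == NULL || port <= 0 || port > 65535)
        return EINVAL;              /* no lock held yet */

    toy_lock_acquire(&pcb->lock);
    if (pcb->bound_port != 0) {
        error = EADDRINUSE;         /* illustrative errno choice */
        goto out;
    }
    pcb->bound_port = port;
out:
    toy_lock_release(&pcb->lock);
    return error;
}
```

Whatever `toy_bind()` returns, the `held` counter is back at zero afterwards, which is the property the fix is after.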
* Remove unused argument. | Pawel Jakub Dawidek | 2004-03-28 | 1 | -4/+3
| Notes: svn path=/head/; revision=127526
* Reduce 'td' argument to 'cred' (struct ucred) argument in those functions: | Pawel Jakub Dawidek | 2004-03-27 | 1 | -8/+9
| - in_pcbbind(),
| - in_pcbbind_setup(),
| - in_pcbconnect(),
| - in_pcbconnect_setup(),
| - in6_pcbbind(),
| - in6_pcbconnect(),
| - in6_pcbsetport().
| "It should simplify/clarify things a great deal." --rwatson
| Requested by: rwatson
| Reviewed by: rwatson, ume
| Notes: svn path=/head/; revision=127505
* Remove unused argument. | Pawel Jakub Dawidek | 2004-03-27 | 1 | -1/+1
| Reviewed by: ume
| Notes: svn path=/head/; revision=127504
* Shorten the name of the socket option used to enable TCP-MD5 packet | Bruce M Simpson | 2004-02-16 | 1 | -2/+2
| treatment.
| Submitted by: Vincent Jardin
| Notes: svn path=/head/; revision=125890
* Brucification. | Bruce M Simpson | 2004-02-13 | 1 | -1/+1
| Submitted by: bde
| Notes: svn path=/head/; revision=125783
* Initial import of RFC 2385 (TCP-MD5) digest support. | Bruce M Simpson | 2004-02-11 | 1 | -0/+19
| This is the first of two commits; bringing in the kernel support first. This can be enabled by compiling a kernel with options TCP_SIGNATURE and FAST_IPSEC.
| For the uninitiated, this is a TCP option which provides for a means of authenticating TCP sessions which came into being before IPSEC. It is still relevant today, however, as it is used by many commercial router vendors, particularly with BGP, and as such has become a requirement for interconnect at many major Internet points of presence.
| Several parts of the TCP and IP headers, including the segment payload, are digested with MD5, including a shared secret. The PF_KEY interface is used to manage the secrets using security associations in the SADB.
| There is a limitation here in that as there is no way to map a TCP flow per-port back to an SPI without polluting tcpcb or using the SPD; the code to do the latter is unstable at this time. Therefore this code only supports per-host keying granularity.
| Whilst FAST_IPSEC is mutually exclusive with KAME IPSEC (and thus IPv6), TCP_SIGNATURE applies only to IPv4. For the vast majority of prospective users of this feature, this will not pose any problem.
| This implementation is output-only; that is, the option is honoured when responding to a host initiating a TCP session, but no effort is made [yet] to authenticate inbound traffic. This is, however, sufficient to interwork with Cisco equipment.
| Tested with a Cisco 2501 running IOS 12.0(27), and Quagga 0.96.4 with local patches. Patches for tcpdump to validate TCP-MD5 sessions are also available from me upon request.
| Sponsored by: sentex.net
| Notes: svn path=/head/; revision=125680
* Check that sa_len is the appropriate value in tcp_usr_bind(), | Don Lewis | 2004-01-10 | 1 | -0/+8
| tcp6_usr_bind(), tcp_usr_connect(), and tcp6_usr_connect() before checking to see whether the address is multicast so that the proper errno value will be returned if sa_len is incorrect. The checks are identical to the ones in in_pcbbind_setup(), in6_pcbbind(), and in6_pcbladdr(), which are called after the multicast address check passes.
| MFC after: 30 days
| Notes: svn path=/head/; revision=124336
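The ordering described in this commit (validate the length first, then inspect the address) can be illustrated with a small user-space sketch. `check_bind_addr()` is a hypothetical helper, its `len` parameter stands in for `sa_len` (which not every platform stores in `struct sockaddr`), and the errno values are assumptions made for the example rather than a copy of the kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical analogue of the check order: "len" plays the role
 * of sa_len.
 */
static int check_bind_addr(const struct sockaddr *sa, size_t len)
{
    const struct sockaddr_in *sin;

    /* Length first: a wrong size is reported as EINVAL ... */
    if (sa == NULL || len != sizeof(struct sockaddr_in))
        return EINVAL;

    sin = (const struct sockaddr_in *)sa;

    /* ... and only then is the address itself examined. */
    if (sin->sin_family == AF_INET &&
        IN_MULTICAST(ntohl(sin->sin_addr.s_addr)))
        return EAFNOSUPPORT;

    return 0;
}
```

With the checks in this order, a caller that passes a truncated address gets the length error even when the bytes that are present happen to look like a multicast address.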
* Limiters and sanity checks for TCP MSS (maximum segment size) | Andre Oppermann | 2004-01-08 | 1 | -1/+2
| resource exhaustion attacks.
| For network link optimization TCP can adjust its MSS and thus packet size according to the observed path MTU. This is done dynamically based on feedback from the remote host and network components along the packet path. This information can be abused to feign an extremely low path MTU.
| The resource exhaustion works in two ways:
| o During TCP connection setup the advertised local MSS is exchanged between the endpoints. The remote endpoint can set this arbitrarily low (except for a minimum MTU of 64 octets enforced in the BSD code). When the local host is sending data it is forced to send many small IP packets instead of a large one.
|   For example, instead of the normal TCP payload size of 1448 it forces a TCP payload size of 12 (MTU 64) and thus we have a 120 times increase in workload and packets. On fast links this quickly saturates the local CPU and may also hit pps processing limits of network components along the path.
|   This type of attack is particularly effective for servers where the attacker can download large files (WWW and FTP).
|   We mitigate it by enforcing a minimum MTU settable by sysctl net.inet.tcp.minmss defaulting to 256 octets.
| o The local host is receiving data on a TCP connection from the remote host. The local host has no control over the packet size the remote host is sending. The remote host may choose to do what is described in the first attack and send the data in packets with a TCP payload of at least one byte. For each packet the tcp_input() function will be entered, the packet is processed and a sowakeup() is signalled to the connected process.
|   For example, an attack with 2 Mbit/s gives 4716 packets per second and the same amount of sowakeup()s to the process (and context switches).
|   This type of attack is particularly effective for servers where the attacker can upload large amounts of data. Normally this is the case with a WWW server where large POSTs can be made.
|   We mitigate this by calculating the average MSS payload per second. If it goes below 'net.inet.tcp.minmss' and the pps rate is above 'net.inet.tcp.minmssoverload' defaulting to 1000 this particular TCP connection is reset and dropped.
| MITRE CVE: CAN-2004-0002
| Reviewed by: sam (mentor)
| MFC after: 1 day
| Notes: svn path=/head/; revision=124258
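The figures quoted in this commit message can be reproduced with a few lines of arithmetic. The sketch below assumes 52 bytes of headers per packet (20 bytes of IPv4 plus 32 bytes of TCP header and options), which is the value that makes 1448, 12 and 4716 come out. The function names and the exact form of the overload test are hypothetical, written from the prose description rather than from the kernel source.

```c
#include <stdbool.h>

/* Assumed header sizes: 20-byte IPv4 header, 32 bytes of TCP header
 * plus options. These are assumptions chosen to match the message. */
enum { IP_HDR = 20, TCP_HDR_OPTS = 32 };

/* TCP payload that fits in one packet of the given MTU. */
static int payload_for_mtu(int mtu)
{
    return mtu - IP_HDR - TCP_HDR_OPTS;
}

/* Packets per second on a link of bits_per_sec when every packet
 * carries `payload` bytes of data. */
static long packets_per_second(long bits_per_sec, int payload)
{
    return (bits_per_sec / 8) / (IP_HDR + TCP_HDR_OPTS + payload);
}

/*
 * Hypothetical reading of the mitigation: flag a connection when its
 * packet rate exceeds `minmssoverload` while the average payload per
 * packet stays below `minmss`.
 */
static bool mss_overload(long payload_bytes_per_sec, long pps,
                         long minmss, long minmssoverload)
{
    if (pps <= 0)
        return false;
    return pps > minmssoverload &&
           payload_bytes_per_sec / pps < minmss;
}
```

Under these assumptions `payload_for_mtu(1500)` is 1448 and `payload_for_mtu(64)` is 12, a ratio of about 120, and `packets_per_second(2000000, 1)` is 4716, matching the numbers in the message.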
* Split the "inp" mutex class into separate classes for each of divert, | Sam Leffler | 2003-11-26 | 1 | -1/+1
| raw, tcp, udp, raw6, and udp6 sockets to avoid spurious witness complaints.
| Reviewed by: rwatson
| Approved by: re (rwatson)
| Notes: svn path=/head/; revision=122991
* Introduce tcp_hostcache and remove the tcp specific metrics from | Andre Oppermann | 2003-11-20 | 1 | -20/+25
| the routing table. Move all usage and references in the tcp stack from the routing table metrics to the tcp hostcache.
| It caches measured parameters of past tcp sessions to provide better initial start values for following connections from or to the same source or destination. Depending on the network parameters to/from the remote host this can lead to significant speedups for new tcp connections after the first one because they inherit and shortcut the learning curve.
| tcp_hostcache is designed for multiple concurrent access in SMP environments with high contention and is hash indexed by remote ip address.
| It removes significant locking requirements from the tcp stack with regard to the routing table.
| Reviewed by: sam (mentor), bms
| Reviewed by: -net, -current, core@kame.net (IPv6 parts)
| Approved by: re (scottl)
| Notes: svn path=/head/; revision=122922
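As a rough illustration of "hash indexed by remote ip address", the sketch below keeps one entry of cached metrics per hash bucket. Everything in it (`struct hc_entry`, `hc_update()`, `hc_lookup()`, the table size, the hash function, the overwrite-on-collision policy) is a hypothetical simplification. It also leaves out the locking and expiry that a cache built for concurrent SMP access would need.

```c
#include <stddef.h>
#include <stdint.h>

#define HC_BUCKETS 64   /* hypothetical table size, a power of two */

/* Hypothetical cached metrics for one remote host. */
struct hc_entry {
    uint32_t addr;      /* remote IPv4 address, host byte order */
    uint32_t rtt_ms;    /* last measured round-trip time */
    uint32_t mtu;       /* last known path MTU */
    int      used;      /* non-zero once the slot holds data */
};

static struct hc_entry hc_table[HC_BUCKETS];

/* Multiplicative hash of the address, reduced to a bucket index. */
static size_t hc_hash(uint32_t addr)
{
    return (size_t)((addr * 2654435761u) >> 16) & (HC_BUCKETS - 1);
}

/* Record metrics for a host; a colliding older entry is overwritten. */
static void hc_update(uint32_t addr, uint32_t rtt_ms, uint32_t mtu)
{
    struct hc_entry *e = &hc_table[hc_hash(addr)];

    e->addr = addr;
    e->rtt_ms = rtt_ms;
    e->mtu = mtu;
    e->used = 1;
}

/* Return cached metrics for a host, or NULL if nothing is cached. */
static const struct hc_entry *hc_lookup(uint32_t addr)
{
    const struct hc_entry *e = &hc_table[hc_hash(addr)];

    return (e->used && e->addr == addr) ? e : NULL;
}
```

A new connection to a host seen before could call `hc_lookup()` and start from the cached round-trip time instead of a conservative default, which is the speedup the message refers to.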
* Introduce a MAC label reference in 'struct inpcb', which caches | Robert Watson | 2003-11-18 | 1 | -2/+2
| the MAC label referenced from 'struct socket' in the IPv4 and IPv6-based protocols. This permits MAC labels to be checked during network delivery operations without dereferencing inp->inp_socket to get to so->so_label, which will eventually avoid our having to grab the socket lock during delivery at the network layer.
| This change introduces 'struct inpcb' as a labeled object to the MAC Framework, along with the normal circus of entry points: initialization, creation from socket, destruction, as well as a delivery access control check.
| For most policies, the inpcb label will simply be a cache of the socket label, so a new protocol switch method is introduced, pr_sosetlabel() to notify protocols that the socket layer label has been updated so that the cache can be updated while holding appropriate locks. Most protocols implement this using pru_sosetlabel_null(), but IPv4/IPv6 protocols using inpcbs use the worker function in_pcbsosetlabel(), which calls into the MAC Framework to perform a cache update.
| Biba, LOMAC, and MLS implement these entry points, as do the stub policy, and test policy.
| Reviewed by: sam, bms
| Obtained from: TrustedBSD Project
| Sponsored by: DARPA, Network Associates Laboratories
| Notes: svn path=/head/; revision=122875
* speedup stream socket recv handling by tracking the tail of | Sam Leffler | 2003-10-28 | 1 | -2/+2
| the mbuf chain instead of walking the list for each append
| Submitted by: ps/jayanth
| Obtained from: netbsd (jason thorpe)
| Notes: svn path=/head/; revision=121628
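The idea of tracking the tail instead of walking the list can be shown with a generic singly linked chain. `struct buf` and `struct chain` below are hypothetical stand-ins, not the kernel's mbuf or sockbuf structures.

```c
#include <stddef.h>

/* Hypothetical buffer node standing in for an mbuf. */
struct buf {
    struct buf *next;
    int len;
};

/* Chain that remembers its last node so appends need no traversal. */
struct chain {
    struct buf *head;
    struct buf *tail;
    int total_len;
};

/* Append in constant time by linking after the remembered tail. */
static void chain_append(struct chain *c, struct buf *b)
{
    b->next = NULL;
    if (c->tail == NULL)
        c->head = b;            /* first node in an empty chain */
    else
        c->tail->next = b;
    c->tail = b;
    c->total_len += b->len;
}
```

Without the `tail` pointer, each append would have to walk from `head` to the end, so filling a chain of n buffers would cost on the order of n squared steps rather than n.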
* Remove check for t_state == TCPS_TIME_WAIT and introduce the tw structure. | Jonathan Lemon | 2003-03-08 | 1 | -13/+15
| Sponsored by: DARPA, NAI Labs
| Notes: svn path=/head/; revision=112010
* Hold the TCP protocol lock while modifying the connection hash table. | Jeffrey Hsu | 2003-02-25 | 1 | -4/+4
| Notes: svn path=/head/; revision=111459
* Unbreak the automatic remapping of an INADDR_ANY destination address | Ian Dowse | 2002-10-24 | 1 | -5/+4
| to the primary local IP address when doing a TCP connect(). The tcp_connect() code was relying on in_pcbconnect (actually in_pcbladdr) modifying the passed-in sockaddr, and I failed to notice this in the recent change that added in_pcbconnect_setup(). As a result, tcp_connect() was ending up using the unmodified sockaddr address instead of the munged version.
| There are two cases to handle: if in_pcbconnect_setup() succeeds, then the PCB has already been updated with the correct destination address as we pass it pointers to inp_faddr and inp_fport directly. If in_pcbconnect_setup() fails due to an existing but dead connection, then copy the destination address from the old connection.
| Notes: svn path=/head/; revision=105840
* Replace in_pcbladdr() with a more generic inner subroutine for | Ian Dowse | 2002-10-21 | 1 | -14/+12
| in_pcbconnect() called in_pcbconnect_setup(). This version performs all of the functions of in_pcbconnect() except for the final committing of changes to the PCB. In the case of an EADDRINUSE error it can also provide to the caller the PCB of the duplicate connection, avoiding an extra in_pcblookup_hash() lookup in tcp_connect().
| This change will allow the "temporary connect" hack in udp_output() to be removed and is part of the preparation for adding the IP_SENDSRCADDR control message.
| Discussed on: -net
| Approved by: re
| Notes: svn path=/head/; revision=105629
* Replace (ab)uses of "NULL" where "0" is really meant. | Archie Cobbs | 2002-08-22 | 1 | -2/+2
| Notes: svn path=/head/; revision=102291
* Create new functions in_sockaddr(), in6_sockaddr(), and | Don Lewis | 2002-08-21 | 1 | -20/+43
| in6_v4mapsin6_sockaddr() which allocate the appropriate sockaddr_in* structure and initialize it with the address and port information passed as arguments. Use calls to these new functions to replace code that is replicated multiple times in in_setsockaddr(), in_setpeeraddr(), in6_setsockaddr(), in6_setpeeraddr(), in6_mapped_sockaddr(), and in6_mapped_peeraddr().
| Inline COMMON_END in tcp_usr_accept() so that we can call in_sockaddr() with temporary copies of the address and port after the PCB is unlocked.
| Fix the lock violation in tcp6_usr_accept() (caused by calling MALLOC() inside in6_mapped_peeraddr() while the PCB is locked) by changing the implementation of tcp6_usr_accept() to match tcp_usr_accept().
| Reviewed by: suz
| Notes: svn path=/head/; revision=102218
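A user-space analogue of "allocate the appropriate structure and initialize it with the address and port passed as arguments" might look like the sketch below. `make_sockaddr_in()` is a hypothetical name and uses `calloc()`. It is not the kernel's in_sockaddr(), whose exact signature is not given in this log.

```c
#include <stdlib.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: allocate a zeroed sockaddr_in and fill it from
 * copies of the port and address. Both are expected in network byte
 * order. The caller owns the result and releases it with free().
 */
static struct sockaddr_in *make_sockaddr_in(in_port_t port,
                                            const struct in_addr *addr)
{
    struct sockaddr_in *sin = calloc(1, sizeof(*sin));

    if (sin == NULL)
        return NULL;
    sin->sin_family = AF_INET;
    sin->sin_port = port;
    sin->sin_addr = *addr;
    return sin;
}
```

Because the helper works only on copies of the port and address, a caller can read both while holding a lock, drop the lock, and allocate afterwards, which is the ordering the commit describes for the accept path.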
* Implement TCP bandwidth delay product window limiting, similar to (but | Matthew Dillon | 2002-08-17 | 1 | -0/+2
| not meant to duplicate) TCP/Vegas. Add four sysctls and default the implementation to 'off'.
| net.inet.tcp.inflight_enable   enable algorithm (defaults to 0=off)
| net.inet.tcp.inflight_debug    debugging (defaults to 1=on)
| net.inet.tcp.inflight_min      minimum window limit
| net.inet.tcp.inflight_max      maximum window limit
| MFC after: 1 week
| Notes: svn path=/head/; revision=102017
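The bandwidth delay product itself is simply bandwidth multiplied by round-trip time. A minimal sketch of limiting a window to that product, clamped between a minimum and a maximum in the spirit of the inflight_min and inflight_max sysctls, is shown below. `bdp_window()` and its parameters are hypothetical, and the sketch does not attempt to reproduce how the kernel estimates bandwidth.

```c
#include <stdint.h>

/*
 * Hypothetical helper: window in bytes for a path with the given
 * bandwidth and round-trip time, clamped to [min_win, max_win].
 */
static uint64_t bdp_window(uint64_t bytes_per_sec, uint64_t rtt_ms,
                           uint64_t min_win, uint64_t max_win)
{
    uint64_t w = bytes_per_sec * rtt_ms / 1000;  /* bandwidth x delay */

    if (w < min_win)
        w = min_win;
    if (w > max_win)
        w = max_win;
    return w;
}
```

For example, a 10 Mbit/s path (1,250,000 bytes per second) with a 40 ms round trip has a product of 50,000 bytes. Keeping more than that in flight only fills queues along the path.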
* Use a common way to release locks before exit. | Maxim Konovalov | 2002-07-29 | 1 | -2/+4
| Reviewed by: hsu
| Notes: svn path=/head/; revision=100871
* make setsockopt(IPV6_V6ONLY, 0) actually work for tcp6. | Hajimu UMEMOTO | 2002-07-25 | 1 | -3/+3
| MFC after: 1 week
| Notes: svn path=/head/; revision=100685
* cleanup usage of ip6_mapped_addr_on and ip6_v6only. now, | Hajimu UMEMOTO | 2002-07-25 | 1 | -5/+3
| ip6_mapped_addr_on is unified into ip6_v6only.
| MFC after: 1 week
| Notes: svn path=/head/; revision=100683
* Because we're holding an exclusive write lock on the head, references to | Jeffrey Hsu | 2002-06-13 | 1 | -3/+0
| the new inp cannot leak out even though it has been placed on the head list.
| Notes: svn path=/head/; revision=98191
* Lock up inpcb. | Jeffrey Hsu | 2002-06-10 | 1 | -37/+161
| Submitted by: Jennifer Yang <yangjihui@yahoo.com>
| Notes: svn path=/head/; revision=98102
* Back out my last commit of locking down a socket, it conflicts with hsu's work. | Seigo Tanimura | 2002-05-31 | 1 | -51/+12
| Requested by: hsu
| Notes: svn path=/head/; revision=97658
* Lock down a socket, milestone 1. | Seigo Tanimura | 2002-05-20 | 1 | -12/+51
| o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket.
| o Determine the lock strategy for each member in struct socket.
| o Lock down the following members:
|   - so_count
|   - so_options
|   - so_linger
|   - so_state
| o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket:
|   - sodisconnect()
|   - soisconnected()
|   - soisconnecting()
|   - soisdisconnected()
|   - soisdisconnecting()
|   - sofree()
|   - soref()
|   - sorele()
|   - sorwakeup()
|   - sotryfree()
|   - sowakeup()
|   - sowwakeup()
| Reviewed by: alfred
| Notes: svn path=/head/; revision=96972
* Fixed some style bugs in the removal of __P(()). Continuation lines | Bruce Evans | 2002-03-24 | 1 | -3/+3
| were not outdented to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting.
| Notes: svn path=/head/; revision=93085
* Remove __P. | Alfred Perlstein | 2002-03-19 | 1 | -7/+7
| Notes: svn path=/head/; revision=92723
* - Set inc_isipv6 in tcp6_usr_connect(). | Hajimu UMEMOTO | 2002-02-28 | 1 | -0/+1
| - When making a pcb from a sync cache, do not forget to copy inc_isipv6.
| Obtained from: KAME
| MFC After: 1 week
| Notes: svn path=/head/; revision=91492
* Simple p_ucred -> td_ucred changes to start using the per-thread ucred | John Baldwin | 2002-02-27 | 1 | -2/+2
| reference.
| Notes: svn path=/head/; revision=91406
* Introduce a syncache, which enables FreeBSD to withstand a SYN flood | Jonathan Lemon | 2001-11-22 | 1 | -2/+2
| DoS in an improved fashion over the existing code.
| Reviewed by: silby (in a previous iteration)
| Sponsored by: DARPA, NAI Labs
| Notes: svn path=/head/; revision=86764
* KSE Milestone 2 | Julian Elischer | 2001-09-12 | 1 | -35/+35
| Note: ALL MODULES MUST BE RECOMPILED
| Make the kernel aware that there are smaller units of scheduling than the process (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process.
| Sorry john! (your next MFC will be a doozy!)
| Reviewed by: peter@freebsd.org, dillon@freebsd.org
| X-MFC after: ha ha ha ha
| Notes: svn path=/head/; revision=83366
* Much delayed but now present: RFC 1948 style sequence numbers | Mike Silbersack | 2001-08-22 | 1 | -2/+2
| In order to ensure security and functionality, RFC 1948 style initial sequence number generation has been implemented. Barring any major cryptographic breakthroughs, this algorithm should be unbreakable. In addition, the problems with TIME_WAIT recycling which affect our currently used algorithm are not present.
| Reviewed by: jesper
| Notes: svn path=/head/; revision=82122
* move ipsec security policy allocation into in_pcballoc, before | Hajimu UMEMOTO | 2001-07-26 | 1 | -12/+0
| making pcbs available to the outside world. otherwise, we will see inpcb without ipsec security policy attached (-> panic() in ipsec.c).
| Obtained from: KAME
| MFC after: 3 days
| Notes: svn path=/head/; revision=80406
* Bump net.inet.tcp.sendspace to 32k and net.inet.tcp.recvspace to 65k. | David E. O'Brien | 2001-07-13 | 1 | -2/+2
| This should help us in naive benchmark "tests".
| It seems a wide number of people think 32k buffers would not cause major issues, and is in fact in use by many other OS's at this time. The receive buffers can be bumped higher as buffers are hardly used and several research papers indicate that receive buffers rarely use much space at all.
| Submitted by: Leo Bicknell <bicknell@ufp.org> <20010713101107.B9559@ussenterprise.ufp.org>
| Agreed to in principle by: dillon (at the 32k level)
| Notes: svn path=/head/; revision=79685
* Temporary feature: Runtime tuneable tcp initial sequence number | Mike Silbersack | 2001-07-08 | 1 | -2/+2
| generation scheme. Users may now select between the currently used OpenBSD algorithm and the older random positive increment method.
| While the OpenBSD algorithm is more secure, it also breaks TIME_WAIT handling; this is causing trouble for an increasing number of folks.
| To switch between generation schemes, one sets the sysctl net.inet.tcp.tcp_seq_genscheme. 0 = random positive increments, 1 = the OpenBSD algorithm. 1 is still the default.
| Once a secure _and_ compatible algorithm is implemented, this sysctl will be removed.
| Reviewed by: jlemon
| Tested by: numerous subscribers of -net
| Notes: svn path=/head/; revision=79413
* Eliminate the allocation of a tcp template structure for each | Mike Silbersack | 2001-06-23 | 1 | -12/+0
| connection. The information contained in a tcptemp can be reconstructed from a tcpcb when needed.
| Previously, tcp templates required the allocation of one mbuf per connection. On large systems, this change should free up a large number of mbufs.
| Reviewed by: bmilekic, jlemon, ru
| MFC after: 2 weeks
| Notes: svn path=/head/; revision=78642
* Sync with recent KAME. | Hajimu UMEMOTO | 2001-06-11 | 1 | -5/+8
| This work was based on kame-20010528-freebsd43-snap.tgz, and some critical problems found after the snap was out were fixed. There are many, many changes since the last KAME merge.
| TODO:
| - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT.
| - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT.
| Reviewed by: itojun
| Obtained from: KAME
| MFC after: 3 weeks
| Notes: svn path=/head/; revision=78064
* Say goodbye to TCP_COMPAT_42 | Jesper Skriver | 2001-04-20 | 1 | -9/+0
| Reviewed by: wollman
| Requested by: wollman
| Notes: svn path=/head/; revision=75733
* Randomize the TCP initial sequence numbers more thoroughly. | Kris Kennaway | 2001-04-17 | 1 | -1/+10
| Obtained from: OpenBSD
| Reviewed by: jesper, peter, -developers
| Notes: svn path=/head/; revision=75619
* Unbreak LINT. | Jonathan Lemon | 2001-03-12 | 1 | -5/+17
| Pointed out by: phk
| Notes: svn path=/head/; revision=74134
* Push the test for a disconnected socket when accept()ing down to the | Jonathan Lemon | 2001-03-09 | 1 | -0/+8
| protocol layer. Not all protocols behave identically. This fixes the brokenness observed with unix-domain sockets (and postfix)
| Notes: svn path=/head/; revision=74018
* o Move per-process jail pointer (p->pr_prison) to inside of the subject | Robert Watson | 2001-02-21 | 1 | -1/+4
|   credential structure, ucred (cr->cr_prison).
| o Allow jail inheritance to be a function of credential inheritance.
| o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code.
| o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments.
| o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place.
| o Convert PRISON_CHECK() macro to prison_check() function.
| o Move jail() function prototypes to jail.h.
| o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself.
| o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use.
| Notes:
| o Some further cleanup of the linux/jail code is still required.
| o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code.
| o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure.
| Reviewed by: freebsd-arch
| Obtained from: TrustedBSD Project
| Notes: svn path=/head/; revision=72786
* When turning off TCP_NOPUSH, call tcp_output to immediately flush | Jonathan Lemon | 2001-02-02 | 1 | -4/+14
| out any data pending in the buffer.
| Submitted by: Tony Finch <dot@dotat.at>
| Notes: svn path=/head/; revision=71937
* Support per socket based IPv4 mapped IPv6 addr enable/disable control. | Yoshinobu Inoue | 2000-04-01 | 1 | -4/+3
| Submitted by: ume
| Notes: svn path=/head/; revision=58907
* tcp updates to support IPv6. | Yoshinobu Inoue | 2000-01-09 | 1 | -1/+287
| also a small patch to sys/nfs/nfs_socket.c, as max_hdr size change.
| Reviewed by: freebsd-arch, cvs-committers
| Obtained from: KAME project
| Notes: svn path=/head/; revision=55679
* IPSEC support in the kernel. | Yoshinobu Inoue | 1999-12-22 | 1 | -0/+12
| pr_input() routines prototype is also changed to support IPSEC and IPV6 chained protocol headers.
| Reviewed by: freebsd-arch, cvs-committers
| Obtained from: KAME project
| Notes: svn path=/head/; revision=55009
* Always set INP_IPV4 flag for IPv4 pcb entries, because netstat needs it | Yoshinobu Inoue | 1999-12-13 | 1 | -3/+0
| to print out protocol specific pcb info.
| A patch submitted by guido@gvr.org, and asmodai@wxs.nl also reported the problem. Thanks and sorry for your troubles.
| Submitted by: guido@gvr.org
| Reviewed by: shin
| Notes: svn path=/head/; revision=54526