path: root/sys/netinet/tcp_syncache.c
Commit message | Author | Age | Files | Lines
* Fix two occurrences of a typo in a comment introduced in r367530. | Michael Tuexen | 2020-11-23 | 1 | -1/+1
  Reported by: lstewart@
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D27148
  Notes: svn path=/head/; revision=367946
* RFC 7323 specifies that: | Michael Tuexen | 2020-11-09 | 1 | -26/+34
    * TCP segments without timestamps should be dropped when support for the timestamp option has been negotiated.
    * TCP segments with timestamps should be processed normally if support for the timestamp option has not been negotiated.
  This patch enforces the above.
  PR: 250499
  Reviewed by: gnn, rrs
  MFC after: 1 week
  Sponsored by: Netflix, Inc
  Differential Revision: https://reviews.freebsd.org/D27148
  Notes: svn path=/head/; revision=367530
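The two rules above can be sketched as a small decision function. This is only an illustration: `ts_action` and `ts_check` are hypothetical names, not identifiers from tcp_syncache.c, and the special handling RFC 7323 gives to RST segments is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from tcp_syncache.c. */
enum ts_action {
	TS_PROCESS,		/* handle the segment normally */
	TS_PROCESS_IGNORE_TS,	/* handle it normally, ignoring the option */
	TS_DROP			/* discard the segment */
};

/*
 * ts_negotiated: the timestamp option was agreed on in the handshake.
 * seg_has_ts:    the arriving segment carries a timestamp option.
 */
static enum ts_action
ts_check(bool ts_negotiated, bool seg_has_ts)
{
	if (ts_negotiated && !seg_has_ts)
		return (TS_DROP);
	if (!ts_negotiated && seg_has_ts)
		return (TS_PROCESS_IGNORE_TS);
	return (TS_PROCESS);
}
```

For example, `ts_check(true, false)` yields `TS_DROP`, which is the case the commit starts enforcing.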
* net: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -1/+0
  Notes: svn path=/head/; revision=365071
* Fix the following issues related to the TCP SYN-cache: | Michael Tuexen | 2020-08-10 | 1 | -13/+28
    * Let the accepted TCP/IPv4 socket inherit the configured TTL and TOS value.
    * Let the accepted TCP/IPv6 socket inherit the configured Hop Limit.
    * Use the configured Hop Limit and Traffic Class when sending IPv6 packets.
  Reviewed by: rrs, lutz_donnerhacke.de
  MFC after: 1 week
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D25909
  Notes: svn path=/head/; revision=364089
* Improve the ECN negotiation when the TCP SYN-cache is used by making | Michael Tuexen | 2020-08-08 | 1 | -0/+10
  sure that
    * ECN is disabled if the client sends a non-ECN-setup SYN segment.
    * ECN is disabled if the ECN-setup SYN-ACK segment is retransmitted more than net.inet.tcp.ecn.maxretries times.
  Reviewed by: rscheff
  MFC after: 1 week
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D26008
  Notes: svn path=/head/; revision=364054
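A minimal sketch of the two conditions, assuming hypothetical names throughout: `synack_uses_ecn`, `rexmits`, and `ecn_maxretries` are illustrative, with the last one standing in for the value of net.inet.tcp.ecn.maxretries. It is not the code from the commit.

```c
#include <stdbool.h>

/*
 * Decide whether a SYN-ACK sent from the SYN-cache should still be an
 * ECN-setup SYN-ACK.  All names here are hypothetical.
 *
 * syn_was_ecn_setup: the client's SYN was an ECN-setup SYN.
 * rexmits:           how often this SYN-ACK has been retransmitted so far.
 * ecn_maxretries:    stand-in for net.inet.tcp.ecn.maxretries.
 */
static bool
synack_uses_ecn(bool syn_was_ecn_setup, int rexmits, int ecn_maxretries)
{
	if (!syn_was_ecn_setup)
		return (false);
	/* "More than maxretries times" disables ECN. */
	return (rexmits <= ecn_maxretries);
}
```

With a limit of 1, the initial SYN-ACK and its first retransmission would keep ECN, and the second retransmission would drop it.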
* When using automatically generated flow labels and using TCP SYN | Michael Tuexen | 2020-03-04 | 1 | -1/+2
  cookies, use the same flow label for the segments sent during the handshake and after the handshake. This fixes a bug by making sure that sc_flowlabel is always stored in network byte order.
  Reviewed by: bz@
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D23957
  Notes: svn path=/head/; revision=358621
* Don't send an uninitialised traffic class in the IPv6 header, when | Michael Tuexen | 2020-03-04 | 1 | -1/+2
  sending a TCP segment from the TCP SYN cache (like a SYN-ACK). This fix initialises it to zero. This is correct for the ECN bits, but it does not honor the DSCP that an application might have set via the IPPROTO_IPV6 level socket option IPV6_TCLASS. That will be fixed separately.
  Reviewed by: Richard Scheffenegger
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D23900
  Notes: svn path=/head/; revision=358614
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 1 | -2/+3
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes.
  This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.
  Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT.
  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* White space cleanup -- remove trailing tabs or spaces | Randall Stewart | 2020-02-12 | 1 | -7/+7
  from any line.
  Sponsored by: Netflix Inc.
  Notes: svn path=/head/; revision=357818
* This small fix makes it so we properly follow | Randall Stewart | 2020-02-12 | 1 | -1/+2
  the RFC and only enable ECN when both the CWR and ECE bits are set within the SYN packet.
  Sponsored by: Netflix Inc.
  Differential Revision: https://reviews.freebsd.org/D23645
  Notes: svn path=/head/; revision=357816
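RFC 3168 describes an ECN-setup SYN as a SYN with both the ECE and CWR flags set. A sketch of a "both bits" test follows, assuming the conventional TCP header flag values and a hypothetical helper name `is_ecn_setup_syn`. A test of the form `flags & (TH_ECE | TH_CWR)` would also accept a SYN with only one of the two bits, which is the kind of mistake a fix like this guards against.

```c
#include <stdbool.h>
#include <stdint.h>

/* Conventional TCP header flag bits. */
#define	TH_SYN	0x02
#define	TH_ECE	0x40
#define	TH_CWR	0x80

/* Hypothetical helper: true only if BOTH ECE and CWR are set. */
static bool
is_ecn_setup_syn(uint8_t flags)
{
	return ((flags & (TH_ECE | TH_CWR)) == (TH_ECE | TH_CWR));
}
```

A SYN carrying only ECE, `is_ecn_setup_syn(TH_SYN | TH_ECE)`, is rejected by this test.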
* Make ip6_output() and ip_output() require network epoch. | Gleb Smirnoff | 2020-01-22 | 1 | -0/+3
  All callers that previously may have called into these functions without network epoch now must enter it.
  Notes: svn path=/head/; revision=356974
* Add some documenting NET_EPOCH_ASSERTs. | Gleb Smirnoff | 2020-01-22 | 1 | -0/+3
  Notes: svn path=/head/; revision=356969
* Fix race when accepting TCP connections. | Michael Tuexen | 2020-01-12 | 1 | -28/+2
  When expanding a SYN-cache entry to a socket/inp a two step approach was taken:
    1) The local address was filled in, then the inp was added to the hash table.
    2) The remote address was filled in and the inp was relocated in the hash table.
  Before the epoch changes, a write lock was held when this happened and the code looking up entries was holding a corresponding read lock. Since the read lock has gone away after the introduction of the epochs, the half populated inp was found during lookup. This resulted in processing TCP segments in the context of the wrong TCP connection.
  This patch changes the above procedure so that the inp is fully populated before being inserted into the hash table.
  Thanks to Paul <devgs@ukr.net> for reporting the issue on the net@ mailing list and for testing the patch!
  Reviewed by: rrs@
  MFC after: 1 week
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D22971
  Notes: svn path=/head/; revision=356663
* Move all ECN related flags from the flags to the flags2 field. | Michael Tuexen | 2019-12-01 | 1 | -1/+1
  This allows adding more ECN related flags in the future. No functional change intended.
  Submitted by: Richard Scheffenegger
  Reviewed by: rrs@, tuexen@
  Differential Revision: https://reviews.freebsd.org/D22497
  Notes: svn path=/head/; revision=355273
* In order for the TCP Handshake to support ECN++, and further ECN-related | Michael Tuexen | 2019-12-01 | 1 | -1/+1
  improvements, the ECN bits need to be exposed to the TCP SYNcache. This change is a minimal modification to the function headers, without any functional change intended.
  Submitted by: Richard Scheffenegger
  Reviewed by: rgrimes@, rrs@, tuexen@
  Differential Revision: https://reviews.freebsd.org/D22436
  Notes: svn path=/head/; revision=355266
* Now that there is no R/W lock on PCB list the pcblist sysctls | Gleb Smirnoff | 2019-11-07 | 1 | -19/+14
  handlers can be greatly simplified. All the previous double cycling and complex locking was added to avoid these functions holding global PCB locks for an extended period of time, preventing addition of new entries.
  Notes: svn path=/head/; revision=354484
* Mechanically convert INP_INFO_RLOCK() to NET_EPOCH_ENTER(). | Gleb Smirnoff | 2019-11-07 | 1 | -11/+3
  Remove a few outdated comments and extraneous assertions. No functional change here.
  Notes: svn path=/head/; revision=354421
* Add new functionality to switch to using cookies exclusively when the | Jonathan T. Looney | 2019-09-26 | 1 | -18/+206
  syn cache overflows. Whether this is due to an attack or due to the system having more legitimate connections than the syn cache can hold, this situation can quickly impact performance.
  To make the system perform better during these periods, the code will now switch to exclusively using cookies until the syn cache stops overflowing. In order for this to occur, the system must be configured to use the syn cache with syn cookie fallback. If syn cookies are completely disabled, this change should have no functional impact.
  When the system is exclusively using syn cookies (either due to configuration or the overflow detection enabled by this change), the code will now skip acquiring a lock on the syn cache bucket. Additionally, the code will now skip lookups in several places (such as when the system receives a RST in response to a SYN|ACK frame).
  Reviewed by: rrs, gallatin (previous version)
  Discussed with: tuexen
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D21644
  Notes: svn path=/head/; revision=352746
* Access the syncache secret directly from the V_tcp_syncache variable, | Jonathan T. Looney | 2019-09-26 | 1 | -7/+3
  rather than indirectly through the backpointer to the tcp_syncache structure stored in the hashtable bucket.
  This also allows us to remove the requirement in syncookie_generate() and syncookie_lookup() that the syncache hashtable bucket must be locked.
  Reviewed by: gallatin, rrs
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D21644
  Notes: svn path=/head/; revision=352745
* Remove the unused sch parameter to the syncache_respond() function. The | Jonathan T. Looney | 2019-09-26 | 1 | -8/+6
  use of this parameter was removed in r313330. This commit now removes passing this now-unused parameter.
  Reviewed by: gallatin, rrs
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D21644
  Notes: svn path=/head/; revision=352744
* Avoid unneeded call to arc4random() in syncache_add() | Andrew Gallatin | 2019-09-11 | 1 | -1/+2
  Don't call arc4random() unconditionally to initialize sc_iss, and then when syncookies are enabled, just overwrite it with the return value from syncookie_generate(). Instead, only call arc4random() to initialize sc_iss when syncookies are not enabled.
  Note that on a system under a syn flood attack, arc4random() becomes quite expensive, and the chacha_poly crypto that it calls is one of the more expensive things happening on the system. Removing this unneeded arc4random() call reduces CPU from about 40% to about 35% in my test scenario (Broadwell Xeon, 6Mpps syn flood attack).
  Reviewed by: rrs, tuxen, bz
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D21591
  Notes: svn path=/head/; revision=352228
* When an ACK segment as the third message of the three way handshake is | Michael Tuexen | 2019-05-26 | 1 | -0/+22
  received and support for time stamps was negotiated in the SYN/SYNACK exchange, perform the PAWS check and only expand the syn cache entry if the check is passed. Without this check, endpoints may get stuck on the incomplete queue.
  Reviewed by: jtl@
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D20374
  Notes: svn path=/head/; revision=348290
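The PAWS check (Protection Against Wrapped Sequences, RFC 7323) compares the timestamp value of an arriving segment with the most recent timestamp recorded for the connection, using arithmetic modulo 2^32. A minimal sketch of that comparison, assuming a hypothetical helper name `paws_ok`; how the commit stores the reference timestamp in the syn cache entry is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the segment passes PAWS if its timestamp value is
 * not older than the reference value, compared modulo 2^32.  The unsigned
 * subtraction wraps; reinterpreting the difference as a signed 32-bit
 * value (two's complement assumed) gives the ordering.
 */
static bool
paws_ok(uint32_t seg_tsval, uint32_t ts_recent)
{
	return ((int32_t)(seg_tsval - ts_recent) >= 0);
}
```

The wrap-around case is what makes the signed difference necessary: a segment timestamp of 5 is newer than a reference of 0xfffffff0, even though it is numerically smaller.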
* Track TCP connection's NUMA domain in the inpcb | Andrew Gallatin | 2019-04-25 | 1 | -0/+3
  Drivers can now pass up numa domain information via the mbuf numa domain field. This information is then used by TCP syncache_socket() to associate that information with the inpcb. The domain information is then fed back into transmitted mbufs in ip{6}_output(). This mechanism is nearly identical to what is done to track RSS hash values in the inp_flowid.
  Follow on changes will use this information for lacp egress port selection, binding TCP pacers to the appropriate NUMA domain, etc.
  Reviewed by: markj, kib, slavash, bz, scottl, jtl, tuexen
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D20028
  Notes: svn path=/head/; revision=346677
* Add sysctl variable net.inet.tcp.rexmit_initial for setting RTO.Initial | Michael Tuexen | 2019-03-23 | 1 | -6/+7
  used by TCP.
  Reviewed by: rrs@, 0mp@
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D19355
  Notes: svn path=/head/; revision=345458
* Reduce the TCP initial retransmission timeout from 3 seconds to | Michael Tuexen | 2019-02-20 | 1 | -1/+1
  1 second as allowed by RFC 6298.
  Reviewed by: kbowling@, Richard Scheffenegger
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D18941
  Notes: svn path=/head/; revision=344368
* Use exponential backoff for retransmitting SYN segments as specified | Michael Tuexen | 2019-02-20 | 1 | -6/+6
  in the TCP RFCs.
  Reviewed by: rrs@, Richard Scheffenegger
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D18974
  Notes: svn path=/head/; revision=344367
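Exponential backoff means the retransmission timeout doubles with every retransmission, up to some maximum. The sketch below assumes plain doubling and hypothetical names (`syn_rexmt_timeout_ms` and its parameters); the kernel may well use a lookup table and its own constants instead, so treat this as an illustration of the idea only.

```c
#include <stdint.h>

/*
 * Hypothetical helper: timeout before the next retransmission, given the
 * initial timeout, the number of retransmissions already sent, and an
 * upper bound.  Doubles once per retransmission and saturates at max_ms
 * without overflowing.
 */
static uint32_t
syn_rexmt_timeout_ms(uint32_t initial_ms, unsigned int rexmits,
    uint32_t max_ms)
{
	uint32_t t = initial_ms;

	while (rexmits-- > 0 && t < max_ms)
		t = (t > max_ms / 2) ? max_ms : t * 2;
	return (t < max_ms ? t : max_ms);
}
```

With an initial timeout of 1000 ms and a cap of 64000 ms, three retransmissions give 8000 ms and ten give the cap.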
* Get the arithmetic right... | Michael Tuexen | 2019-01-24 | 1 | -1/+1
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=343403
* Kill a trailing whitespace character... | Michael Tuexen | 2019-01-24 | 1 | -1/+1
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=343402
* Update a comment to reflect the current reality. | Michael Tuexen | 2019-01-24 | 1 | -1/+6
  SYN-cache entries live for about 12 seconds, not 45, when default settings are used.
  MFC after: 1 week
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=343401
* Remove debug code which slipped in accidentally. | Michael Tuexen | 2018-11-01 | 1 | -1/+0
  MFC after: 4 weeks
  X-MFC with: r339989
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=339991
* Improve a comment to refer to the actual sections in the TCP | Michael Tuexen | 2018-11-01 | 1 | -4/+18
  specification for the comparisons made.
  Thanks to lstewart@ for the suggestion.
  MFC after: 4 weeks
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D17595
  Notes: svn path=/head/; revision=339989
* The handling of RST segments in the SYN-RCVD state exists in two | Michael Tuexen | 2018-10-18 | 1 | -38/+61
  code paths. The two are not consistent, and the one in the syn cache code does not conform to the relevant specifications (page 69 of RFC 793 and Section 4.2 of RFC 5961).
  This patch fixes this:
    * The sequence number checks are fixed as specified on page 69 of RFC 793.
    * The sysctl variable net.inet.tcp.insecure_rst is now honoured and the behaviour is as specified in Section 4.2 of RFC 5961.
  Approved by: re (gjb@)
  Reviewed by: bz@, glebius@, rrs@
  Differential Revision: https://reviews.freebsd.org/D17595
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=339430
* Remove the unused parameter 'locked' from the function | Michael Tuexen | 2018-09-23 | 1 | -5/+5
  syncache_respond(). There is no functional change. The parameter became unused in r313330, but wasn't removed.
  Approved by: re (kib@)
  MFC after: 1 month
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=338898
* Fix the inheritance of IPv6 level socket options on TCP sockets. | Michael Tuexen | 2018-08-21 | 1 | -3/+7
  This was broken for IPv6 listening sockets which are not IPV6_ONLY when the accepted TCP connection was using IPv4.
  Reviewed by: bz@, rrs@
  MFC after: 1 month
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D16792
  Notes: svn path=/head/; revision=338137
* Don't expose the uptime via the TCP timestamps. | Michael Tuexen | 2018-08-19 | 1 | -6/+2
  The TCP client side or the TCP server side when not using SYN-cookies used the uptime as the TCP timestamp value. This patch uses in all cases an offset, which is the result of a keyed hash function taking the source and destination addresses and port numbers into account. The keyed hash function is the same as used for the initial TSN.
  Reviewed by: rrs@
  MFC after: 1 month
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D16636
  Notes: svn path=/head/; revision=338053
* Add missing send/recv dtrace probes for TCP. | Michael Tuexen | 2018-07-30 | 1 | -1/+9
  These missing probes are mostly in the syncache and timewait code.
  Reviewed by: markj@, rrs@
  MFC after: 1 month
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D16369
  Notes: svn path=/head/; revision=336932
* Use the new VNET_DEFINE_STATIC macro when we are defining static VNET | Andrew Turner | 2018-07-24 | 1 | -4/+4
  variables.
  Reviewed by: bz
  Sponsored by: DARPA, AFRL
  Differential Revision: https://reviews.freebsd.org/D16147
  Notes: svn path=/head/; revision=336676
* When retransmitting TCP SYN-ACK segments with the TCP timestamp option | Michael Tuexen | 2018-06-15 | 1 | -25/+2
  enabled use an updated timestamp instead of reusing the one used in the initial TCP SYN-ACK segment.
  This patch ensures that an updated timestamp is used when sending the SYN-ACK from the syncache code. It was already done if the SYN-ACK was retransmitted from the generic code.
  This makes the behaviour consistent and also conformant with the TCP specification.
  Reviewed by: jtl@, Jason Eggleston
  MFC after: 1 month
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D15634
  Notes: svn path=/head/; revision=335194
* Limit the retransmission timer for SYN-ACKs by TCPTV_REXMTMAX. | Michael Tuexen | 2018-06-01 | 1 | -2/+8
  Use the same logic to handle the SYN-ACK retransmission when sent from the syn cache code as when sent from the main code.
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=334497
* Ensure net.inet.tcp.syncache.rexmtlimit is limited by TCP_MAXRXTSHIFT. | Michael Tuexen | 2018-06-01 | 1 | -1/+20
  If the sysctl variable is set to a value larger than TCP_MAXRXTSHIFT+1, the array tcp_syn_backoff[] is accessed out of bounds.
  Discussed with: jtl@
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=334494
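The general pattern behind a fix like this is to validate a tunable before storing it, so that it can never be used later as an out-of-range array index. A minimal userland sketch follows; `set_rexmtlimit` is a hypothetical function, the value given for `TCP_MAXRXTSHIFT` is an assumption, and the upper bound follows the subject line ("limited by TCP_MAXRXTSHIFT"). The real change is a sysctl handler in the kernel and is not reproduced here.

```c
#include <errno.h>

#define	TCP_MAXRXTSHIFT	12	/* assumed value, for illustration only */

/*
 * Hypothetical setter: accept the new limit only if it cannot index past
 * the end of a backoff table; otherwise leave the old value untouched
 * and report EINVAL.
 */
static int
set_rexmtlimit(unsigned int *limit, unsigned int new_limit)
{
	if (new_limit > TCP_MAXRXTSHIFT)
		return (EINVAL);
	*limit = new_limit;
	return (0);
}
```

Rejecting the write, rather than silently clamping it, lets an administrator see that the requested value was not applied.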
* This commit brings in the TCP high precision timer system (tcp_hpts). | Randall Stewart | 2018-04-19 | 1 | -0/+6
  It is the forerunner/foundational work of bringing in both Rack and BBR which use hpts for pacing out packets. The feature is optional and requires the TCPHPTS option to be enabled before the feature will be active. TCP modules that use it must ensure that the base component is compiled into the kernel in which they are loaded.
  MFC after: Never
  Sponsored by: Netflix Inc.
  Differential Revision: https://reviews.freebsd.org/D15020
  Notes: svn path=/head/; revision=332770
* Set the inp_vflag consistently for accepted TCP/IPv6 connections when | Michael Tuexen | 2018-03-16 | 1 | -0/+2
  net.inet6.ip6.v6only=0.
  Without this patch, the inp_vflag would have both the INP_IPV4 and the INP_IPV6 flags set for accepted TCP/IPv6 connections if the sysctl variable net.inet6.ip6.v6only is 0. This resulted in netstat reporting the source and destination addresses as IPv4 addresses, even though they are IPv6 addresses.
  PR: 226421
  Reviewed by: bz, hiren, kib
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Differential Revision: https://reviews.freebsd.org/D13514
  Notes: svn path=/head/; revision=331061
* Greatly reduce the number of #ifdefs supporting the TCP_RFC7413 kernel option. | Patrick Kelsey | 2018-02-26 | 1 | -22/+0
  The conditional compilation support is now centralized in tcp_fastopen.h and tcp_var.h. This doesn't provide the minimum theoretical code/data footprint when TCP_RFC7413 is disabled, but nearly all the TFO code should wind up being removed by the optimizer, the additional footprint in the syncache entries is a single pointer, and the additional overhead in the tcpcb is at the end of the structure.
  This enables the TCP_RFC7413 kernel option by default in amd64 and arm64 GENERIC.
  Reviewed by: hiren
  MFC after: 1 month
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D14048
  Notes: svn path=/head/; revision=330002
* This is an implementation of the client side of TCP Fast Open (TFO) | Patrick Kelsey | 2018-02-26 | 1 | -3/+4
  [RFC7413]. It also includes a pre-shared key mode of operation in which the server requires the client to be in possession of a shared secret in order to successfully open TFO connections with that server.
  The names of some existing fastopen sysctls have changed (e.g., net.inet.tcp.fastopen.enabled -> net.inet.tcp.fastopen.server_enable).
  Reviewed by: tuexen
  MFC after: 1 month
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D14047
  Notes: svn path=/head/; revision=330001
* sys: general adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-27 | 1 | -0/+2
  Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  No functional change intended.
  Notes: svn path=/head/; revision=326272
* The soisconnected() call removed from syncache_socket() in r307966 was | Patrick Kelsey | 2017-10-01 | 1 | -0/+1
  not extraneous in the TCP Fast Open (TFO) passive-open case. In the TFO passive-open case, syncache_socket() is being called during processing of a TFO SYN bearing a valid cookie, and a call to soisconnected() is required in order to allow the application to immediately consume any data delivered in the SYN and to have a chance to generate response data to accompany the SYN-ACK. The removal of this call to soisconnected() effectively converted all TFO passive opens to having the same RTT cost as a standard 3WHS.
  This commit adds a call to soisconnected() to syncache_tfo_expand() so that it is only in the TFO passive-open path, thereby restoring TFO passive-open RTT performance and preserving the non-TFO connection-rate performance gains realized by r307966.
  MFC after: 1 week
  Sponsored by: Limelight Networks
  Notes: svn path=/head/; revision=324181
* tcp: Don't "negotiate" MSS. | Sepherosa Ziehau | 2017-09-27 | 1 | -6/+4
  _NO_ OSes actually "negotiate" MSS.
  RFC 879: "... This Maximum Segment Size (MSS) announcement (often mistakenly called a negotiation) ..."
  This negotiation behaviour was introduced 11 years ago by r159955 without any explanation about why FreeBSD had to "negotiate" MSS:
      In syncache_respond() do not reply with a MSS that is larger than what the peer announced to us but make it at least tcp_minmss in size.
      Sponsored by: TCP/IP Optimization Fundraise 2005
  The tcp_minmss behaviour is still kept.
  Syncookie fix was prodded by tuexen, who also helped to test this patch w/ packetdrill.
  Reviewed by: tuexen, karels, bz (previous version)
  MFC after: 2 weeks
  Sponsored by: Microsoft
  Differential Revision: https://reviews.freebsd.org/D12430
  Notes: svn path=/head/; revision=324050
* Listening sockets improvements. | Gleb Smirnoff | 2017-06-08 | 1 | -3/+4
  o Separate fields of struct socket that belong to listening from fields that belong to normal dataflow, and unionize them. This shrinks the structure a bit.
    - Take out selinfo's from the socket buffers into the socket. The first reason is to support the braindamaged scenario when a socket is added to kevent(2) and then listen(2) is cast on it. The second reason is that there is a future plan to make socket buffers pluggable, so that for a dataflow socket a socket buffer can be changed, and in this case we also want to keep the same selinfos through the lifetime of a socket.
    - Remove struct so_accf. Since now listening stuff no longer affects struct socket size, just move its fields into the listening part of the union.
    - Provide sol_upcall field and enforce that so_upcall_set() may be called only on a dataflow socket, which has buffers, and for listening sockets provide solisten_upcall_set().
  o Remove ACCEPT_LOCK() global.
    - Add a mutex to socket, to be used instead of the socket buffer lock to lock fields of struct socket that don't belong to a socket buffer.
    - Allow acquiring two socket locks, but the first one must belong to a listening socket.
    - Make soref()/sorele() use atomic(9). This allows in some situations to do soref() without owning the socket lock. There is room for improvement here; it is possible to make sorele() also lock optionally.
    - Most protocols aren't touched by this change, except UNIX local sockets. See below for more information.
  o Reduce copy-and-paste in kernel modules that accept connections from listening sockets: provide function solisten_dequeue(), and use it in the following modules: ctl(4), iscsi(4), ng_btsocket(4), ng_ksocket(4), infiniband, rpc.
  o UNIX local sockets.
    - Removal of ACCEPT_LOCK() global uncovered several races in the UNIX local sockets. Most races exist around spawning a new socket, when we are connecting to a local listening socket. To cover them, we need to hold locks on both PCBs when spawning a third one. This means holding them across sonewconn(). This creates a LOR between pcb locks and unp_list_lock.
    - To fix the new LOR, abandon the global unp_list_lock in favor of global unp_link_lock. Indeed, separating these two locks didn't provide us any extra parallelism in the UNIX sockets.
    - Now a call into uipc_attach() may happen with unp_link_lock held if we are accepting, or without unp_link_lock if we are just creating a socket.
    - Another problem in UNIX sockets is that uipc_close() basically did nothing for a listening socket. The vnode remained opened for connections. This is fixed by removing the vnode in uipc_close(). Maybe the right way would be to do it for all sockets (not only listening), simply move the vnode teardown from uipc_detach() to uipc_close()?
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D9770
  Notes: svn path=/head/; revision=319722
* Fix the ICMP6 handling for TCP. | Michael Tuexen | 2017-06-03 | 1 | -2/+2
  The ICMP6 packets might not be contained in a single mbuf. So don't assume this. Keep the IPv4 and IPv6 code in sync and make explicit that the syncache code only needs the TCP sequence number, not the complete TCP header.
  MFC after: 3 days
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=319556
* Represent "a syncache overflow hasn't happened yet" by using | Michael Tuexen | 2017-04-21 | 1 | -1/+2
  -(SYNCOOKIE_LIFETIME + 1) instead of INT64_MIN, since it is good enough and works when time_t is int32 or int64. This fixes the issue reported by cy@ on i386.
  Reported by: cy
  MFC after: 1 week
  Sponsored by: Netflix, Inc.
  Notes: svn path=/head/; revision=317244
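One way to see why -(SYNCOOKIE_LIFETIME + 1) is "good enough": any non-negative uptime minus that sentinel is already larger than the lifetime, so a "did we overflow recently?" test is false without a separate flag and without depending on the width of time_t. The sketch below is an assumption about how such a test could look; the value 15, the macro `NEVER_OVERFLOWED`, and the function `overflowed_recently` are hypothetical and not taken from the commit.

```c
#include <stdbool.h>
#include <time.h>

#define	SYNCOOKIE_LIFETIME	15	/* seconds; assumed for illustration */

/* Hypothetical sentinel: small enough to look "long ago" at uptime 0. */
#define	NEVER_OVERFLOWED	((time_t)-(SYNCOOKIE_LIFETIME + 1))

/* Hypothetical test: did an overflow happen within the last lifetime? */
static bool
overflowed_recently(time_t last_overflow, time_t uptime)
{
	return (uptime - last_overflow <= SYNCOOKIE_LIFETIME);
}
```

At uptime 0 the difference is SYNCOOKIE_LIFETIME + 1, so the test is false. Using INT64_MIN as the sentinel would not fit a 32-bit time_t, and on a 64-bit one the subtraction would overflow.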