| Commit message (Collapse) | Author | Age | Files | Lines |
|
When a packet is provided to LRO using tcp_lro_queue_mbuf(), a
sequence number is computed based on the m_pkthdr.flowid provided
by the driver. The implicit assumption is that the m_pkthdr.flowid
has hash properties.
The recent use of tcp_lro_queue_mbuf() in iflib exposed a bug in at
least one driver (igc), which
* always reports that it uses M_HASHTYPE_OPAQUE.
* in some cases sets m_pkthdr.flowid inconsistently for packets
belonging to the same TCP connection.
This results in severe performance problems for the base TCP stack,
since it handles the packets in the wrong sequence, although they were
received in the correct sequence.
To protect against such misbehaving drivers, only take the
m_pkthdr.flowid into account if it has hash properties.
The performance problems were observed by gallatin@ and analyzed
together with rrs@.
Reviewed by: gallatin
Tested by: gallatin
MFC after: 5 Minutes
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52989
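The protective check described in this commit can be sketched in userland C. This is only an illustration, not the kernel code: `struct pkt`, the `HASH_*` constants, `hash_is_hash()` and `lro_sort_key()` are hypothetical stand-ins for the mbuf fields and macros, and the key layout (flow bits above arrival order) is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hash types a driver can report. */
enum hash_type { HASH_NONE, HASH_OPAQUE, HASH_OPAQUE_HASH };

struct pkt {
	enum hash_type hashtype; /* what the driver claims about flowid */
	uint32_t flowid;         /* driver-provided flow identifier */
};

/* Only some hash types promise that flowid behaves like a hash. */
static bool
hash_is_hash(enum hash_type t)
{
	return (t == HASH_OPAQUE_HASH);
}

/*
 * Build a sort key: flow bits in the upper half, arrival order in the
 * lower half. Without hash properties the flow bits stay zero, so
 * sorting by key keeps the packets in arrival order.
 */
static uint64_t
lro_sort_key(const struct pkt *p, uint32_t arrival)
{
	uint64_t key = arrival;

	if (hash_is_hash(p->hashtype))
		key |= (uint64_t)p->flowid << 32;
	return (key);
}
```

With this shape, a driver that reports an opaque, non-hash flowid can no longer reorder packets of one connection, whatever values it puts into flowid.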
|
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
According to the fifth step in SEGMENT ARRIVES, send a RST segment in
response to an ACK segment which fails the SEG.ACK check, but leave
the endpoint state unchanged.
FreeBSD handles this correctly when entering the SYN-RECEIVED state via
the SYN-SENT state, but not in the SYN-cache code, which handles the
SYN-RECEIVED state via the LISTEN state.
This also fixes a panic reported by Alexander Leidinger.
Reviewed by: jtl, glebius
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52934
|
No functional change intended.
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
Using the word "contested" for the case where there are threads blocked
on the lock is misleading at best (the lock is already contested if it
is being held by one thread and wanted by another). It also diverges
from naming used in other primitives (which refer to them as "waiters").
Rename it for some consistency.
There were uses of the flag outside of mutex code itself.
This is an abuse of the interface. The netgraph thing looks suspicious
at best, the sctp thing is fundamentally wrong. Fixing those up is left
as an exercise for the reader.
While here touch up stale commentary.
Sponsored by: Rubicon Communications, LLC ("Netgate")
|
* use ND_NA_FLAG_ROUTER flag in carp_send_na() when we work as router.
* use in6addr_any as destination address for nd6_na_output(), then it
will use ipv6-all-nodes multicast address.
* add in6_selectsrc_nbr() function that accepts additional argument
ip6_moptions. Use this function from ND6 code to avoid cases when
nd6_na_output/nd6_ns_output can not find source address for
multicast destinations.
* add some comments from RFC2461 for better understanding.
* use tlladdr argument as flags and use ND6_NA_OPT_LLA when we need
to add target link-layer address option, and ND6_NA_CARP_MASTER when
we know that target address is CARP master. Then we can prepare
correct CARP's mac address if target address is CARP master.
* move the blocks of code where multicast options are initialized and
use them when the destination address is multicast.
Reviewed by: kp
Obtained from: Yandex LLC
MFC after: 2 weeks
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D52825
|
The syncache entry is locked by the hash bucket lock. After running
SCH_UNLOCK(), we have no guarantee that the syncache entry still
exists.
Resolve the race by moving SCH_UNLOCK() after the log() call which
reads variables from the syncache entry.
Reviewed by: rrs, tuexen, Nick Banks
Sponsored by: Netflix
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D52868
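The ordering fix can be illustrated with a small single-threaded model. Everything here is hypothetical (`struct bucket`, `struct entry`, `log_entry()`, `report_and_release()`); the lock is only a flag so that the example can show whether the entry was read while it was still protected.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a hash bucket lock: just a flag. */
struct bucket { bool locked; };
struct entry { struct bucket *b; unsigned port; };

static void bucket_lock(struct bucket *b) { b->locked = true; }
static void bucket_unlock(struct bucket *b) { b->locked = false; }

/* Formats the entry; returns true if it was read under the lock. */
static bool
log_entry(const struct entry *e, char *buf, size_t len)
{
	snprintf(buf, len, "port %u", e->port);
	return (e->b->locked);
}

/* Log first, unlock second: the entry is only touched while locked. */
static bool
report_and_release(struct entry *e, char *buf, size_t len)
{
	struct bucket *b = e->b;
	bool safe;

	safe = log_entry(e, buf, len);
	bucket_unlock(b); /* after this point e must not be used */
	return (safe);
}
```

Swapping the two calls in `report_and_release()` would make `log_entry()` return false, which corresponds to the race the commit removes.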
|
The validation of SEG.SEQ (first step in SEGMENT ARRIVES of RFC 9293)
should be done before the validation of SEG.ACK (fifth step in
SEGMENT ARRIVES in RFC 9293).
Furthermore, when the SEG.SEQ validation fails, a challenge ACK
should be sent instead of sending a RST-segment and moving the
endpoint to CLOSED.
Reported by: Tilnel on freebsd-net
Reviewed by: Nick Banks
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52849
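A minimal sketch of the check order described above, assuming a simplified acceptance window. `struct syn_entry`, `enum verdict` and `check_ack_segment()` are hypothetical names, and the exact window test is an assumption, not the SYN-cache code.

```c
#include <stdint.h>

enum verdict { SEG_ACCEPT, SEG_CHALLENGE_ACK, SEG_RST };

/* Hypothetical subset of what an entry in SYN-RECEIVED remembers. */
struct syn_entry {
	uint32_t irs; /* initial receive sequence number */
	uint32_t iss; /* initial send sequence number */
	uint32_t wnd; /* advertised receive window */
};

/* Serial number arithmetic, so that wrap-around is handled. */
static int seq_gt(uint32_t a, uint32_t b) { return ((int32_t)(a - b) > 0); }
static int seq_leq(uint32_t a, uint32_t b) { return ((int32_t)(a - b) <= 0); }

static enum verdict
check_ack_segment(const struct syn_entry *e, uint32_t seg_seq,
    uint32_t seg_ack)
{
	/* First step: SEG.SEQ. On failure send a challenge ACK. */
	if (!(seq_gt(seg_seq, e->irs) && seq_leq(seg_seq, e->irs + e->wnd)))
		return (SEG_CHALLENGE_ACK);
	/* Fifth step: SEG.ACK. On failure send a RST. */
	if (seg_ack != e->iss + 1)
		return (SEG_RST);
	return (SEG_ACCEPT);
}
```

Neither failure verdict changes the endpoint state in this model; a segment that fails both checks gets a challenge ACK because SEG.SEQ is examined first.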
|
Don't drop a SYN-cache entry just because a challenge ACK couldn't
be sent. This might only be a temporary failure.
Reviewed by: Nick Banks, glebius, jtl
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52840
|
Only validate SEG.SEQ and SEG.ACK when processing a real SYN-cache
entry. In the SYN-cookie case, these conditions are always true, since
the SYN-cache entry on the stack is constructed from the incoming
TCP segment.
While there, fix the logging messages.
Reviewed by: Nick Banks
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52816
|
When sending challenge ACKs from the SYN-cache, apply the same rate
limiting as in other states.
Reviewed by: cc, rrs
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52754
|
Refactor tcp_send_challenge_ack() such that the logic checking whether
a challenge ACK is sent or not is available in the separate function
tcp_challenge_ack_check(). This new function will also be used for
sending challenge ACKs in the SYN-cache code, which will be added in
upcoming commits.
No functional change intended.
Reviewed by: cc, Nick Banks, Peter Lei
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52717
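Separating the "may I send?" decision from the sending itself could look roughly like this. It is a sketch under assumptions: `struct ack_limiter`, `challenge_ack_check()` and the window/limit parameters are hypothetical and do not reproduce the signature or policy of tcp_challenge_ack_check().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-endpoint rate limiting state. */
struct ack_limiter {
	uint32_t epoch_end; /* tick at which the current window ends */
	uint32_t sent;      /* challenge ACKs sent in the current window */
};

/*
 * Decide whether a challenge ACK may be sent at tick 'now'.
 * At most 'limit' ACKs are allowed per 'window' ticks.
 */
static bool
challenge_ack_check(struct ack_limiter *l, uint32_t now,
    uint32_t window, uint32_t limit)
{
	/* Start a new window once the old one has expired. */
	if ((int32_t)(now - l->epoch_end) >= 0) {
		l->epoch_end = now + window;
		l->sent = 0;
	}
	if (l->sent >= limit)
		return (false);
	l->sent++;
	return (true);
}
```

Because the check is a separate function, a caller that builds the ACK differently can still share the same limit.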
|
No functional change intended.
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
The two sysctls net.inet.tcp.hostcache.list and net.inet.tcp.hostcache.histo
are read-only and operate on the hostcache of vnet jails. Add the CTLFLAG_VNET
flag to them since they are per-vnet sysctls.
This change does not have any impact on reading the two sysctls, but
`sysctl -ANV net.inet.tcp.hostcache` will report them correctly.
Reviewed by: tuexen, #transport, #network
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D52693
|
A jailed process, run via `sysctl -j foo` or `jexec foo sysctl`, does not
have the privilege to write to non-vnet sysctls, only to those marked as
jail writable, i.e. sysctls marked with the CTLFLAG_VNET flag.
Without this change we will get EPERM when trying to expire and purge
hostcache entries of vnet jails via the net.inet.tcp.hostcache.purgenow
sysctl. Fix that by adding a CTLFLAG_VNET flag.
Reviewed by: tuexen, #transport, #network
Fixes: 264563806496 Add a new sysctl net.inet.tcp.hostcache.purgenow=1 to expire ...
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D52692
|
Remove a check which is also done in tcp_lro_rx_common().
Reviewed by: gallatin
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52683
|
Take endpoint parameters into account when available.
Fixes: 463b5aed0d62 ("tcp: retire rstreason")
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
No functional change intended.
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
When building with DDB support, the inclusion of in_kdtrace.h
is needed. Make this explicit and don't rely on tcp_var.h to do this.
This is required for stable/14.
Fixes: a62c6b0de48a ("ddb: add optional printing of BBLog entries")
MFC after: immediately
Sponsored by: Netflix, Inc.
|
Deprecate the support for the old RFC3517 behavior of SACK loss
recovery, and simplify the code to always adhere to RFC6675.
Reviewed By: tuexen, cc, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D52383
|
When adding an interface with an IP address to a bridge, or assigning an
IP address to an interface which is in a bridge, and member_ifaddrs=1,
print a warning so users are informed this is deprecated. Also add
"(deprecated)" to the sysctl description.
MFC after: 9 hours
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D52335
|
While diagnosing PR 279653 and PR 285129, I observed that a thread may
write to freed memory without the system crashing. This hides the
real problem. A clear NULL pointer dereference is much better than writing
to freed memory.
PR: 279653
PR: 285129
Reviewed by: glebius
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D49444
|
Only compute wscale when it is actually used. While there, change the
type of wscale to u_int as suggested by glebius.
No functional change intended.
Reviewed by: glebius, rscheff (older version)
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52296
|
Make it bool. Reword the comment, add note that mbuf is always consumed.
In case tunnel consumed the mbuf, don't INP_RUNLOCK(), behave just like
all the other normal exits from the function.
Reviewed by: tuexen, kp, markj
Differential Revision: https://reviews.freebsd.org/D52171
|
Fixes: e1751ef896119d7372035b1b60f18a6342bd0e3b
Reviewed by: tuexen, kp, markj
Differential Revision: https://reviews.freebsd.org/D52170
|
and fix assigning IP addresses to the gif(4) interface when it is a
member of an if_bridge(4) interface.
When setting the sysctl net.link.bridge.member_ifaddrs to 0, if_bridge(4)
can eliminate an unnecessary walk of the member list to determine whether
the inbound unicast packets are for us or not.
When a gif(4) interface is a member of an if_bridge(4) interface, it
acts as the tunnel endpoint to tunnel Ethernet frames over an IP network,
aka the EtherIP protocol, so the IP addresses configured on it are
independent of the if_bridge(4) interface or other if_bridge(4) members,
hence the sysctl net.link.bridge.member_ifaddrs should not have any
influence over the gif(4) interface's behavior of assigning IP addresses.
PR: 227450
Reported by: Siva Mahadevan <me@svmhdvn.name>
Reviewed by: ivy, #bridge
MFC after: 1 week
Fixes: 0a1294f6c610 bridge: allow IP addresses on members to be disabled
Differential Revision: https://reviews.freebsd.org/D52200
|
Ensure that when the sysctl-variable net.inet.tcp.syncookies_only is
non zero, SYN-cookies are sent and no SYN-cache entry is added to the
SYN-cache. In particular, this behavior should not depend on the value
of the sysctl-variable net.inet.tcp.syncookies, which controls whether
SYN cookies are used in combination with the SYN-cache to deal with
bucket overflows.
Also ensure that tcps_sc_completed does not include TCP connections
established via a SYN-cookie.
While there, make V_tcp_syncookies and V_tcp_syncookiesonly bool
instead of int, since they are used as boolean variables.
Reviewed by: rscheff, cc, Peter Lei, Nick Banks
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52225
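The intended independence of the two sysctl variables can be written down as a small decision function. This is a simplified model with hypothetical names (`struct syn_plan`, `syn_plan_for()`); it leaves out bucket overflow handling and everything else the SYN-cache does.

```c
#include <stdbool.h>

/* Hypothetical result: what to do with an incoming SYN. */
struct syn_plan {
	bool add_cache_entry; /* store state in the SYN-cache */
	bool use_cookie;      /* encode state in a SYN-cookie */
};

/*
 * syncookies_only wins regardless of syncookies: cookies are sent and
 * nothing is added to the cache. Otherwise the cache is used and
 * syncookies decides whether cookies back it up.
 */
static struct syn_plan
syn_plan_for(bool syncookies, bool syncookies_only)
{
	struct syn_plan p;

	if (syncookies_only) {
		p.add_cache_entry = false;
		p.use_cookie = true;
	} else {
		p.add_cache_entry = true;
		p.use_cookie = syncookies;
	}
	return (p);
}
```

Both parameters are bool here, matching the change of the two variables from int to bool.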
|
MFC after: 3 days
Sponsored by: Netflix, Inc.
|
The need for such a variant comes from the fact that we need to
re-calculate the checksum after ng_nat(4) transformations while getting
mbufs from layer 2 (Ethernet) directly.
Reviewed by: markj, tuexen
Approved by: tuexen
Sponsored by: Sippy Software, Inc.
Differential Revision: https://reviews.freebsd.org/D49677
MFC After: 2 weeks
|
- s/assigments/assignments/
MFC after: 3 days
|
Don't subtract tcp_sack_adjust() twice in some cases; subtract it
exactly once in all cases.
Reviewed by: rscheff
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52140
|
Take the condition of RFC 6675 into account.
While there, remove stale comments.
PR: 282605
Reviewed by: cc (earlier version)
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51426
|
When reflecting a packet, use an offset of 0 and clear all three bits,
in particular the DF bit.
PR: 288558
Reviewed by: markj, zlei
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51991
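For reference, the 16-bit IPv4 flags/offset field holds three flag bits followed by a 13-bit fragment offset. The sketch below works on a host-byte-order value; the macro and function names are hypothetical, only the bit positions come from the IPv4 header layout.

```c
#include <stdint.h>

#define F_RESERVED 0x8000u /* reserved bit */
#define F_DF       0x4000u /* don't fragment */
#define F_MF       0x2000u /* more fragments */
#define F_OFFMASK  0x1fffu /* 13-bit fragment offset */

/* Value to use for the reflected packet: no flags, offset 0. */
static uint16_t
reflected_ip_off(uint16_t received)
{
	return ((uint16_t)(received &
	    ~(F_RESERVED | F_DF | F_MF | F_OFFMASK)));
}
```

Since the four masks cover all 16 bits, the result is always 0, whatever the received packet carried.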
|
- s/datgram/datagram/
MFC after: 3 days
|
Fixes: c3fc0db3bc50df18a724e6e6b12ea4e060fd9255
|
RFC 2460 section 5 paragraph 7 allowed a Packet Too Big message
to report a Next-Hop MTU less than 1280 in support of 6-to-4 routers.
A node receiving such a message was required to add a Fragment
Header to outgoing packets, even though they were not fragmented.
Almost 20 years later, RFC 8200 was published. It obsoletes RFC 2460
and removes that paragraph. UNH IOL Intact was updated to test for
compliance with the new standard.
Remove code supporting that obsolete paragraph.
Test cases v6LC_4_1_06a and 06b failed before this change, saying:
DUT processed PTB and sent a fragmented echo reply
Those two test cases now pass:
DUT did not process PTB and sent un-fragmented echo reply
All PMTU test cases pass except v6LC_4_1_08. It fails because we
ignore the MTU in RAs.
Reviewed by: tuexen
MFC After: 1 month
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D51835
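The behavior after this change, as reported by the test output above, can be modeled as follows. `MIN_IPV6_MTU` and `handle_ptb()` are hypothetical names for this sketch; it only shows that a Packet Too Big message reporting less than 1280 is not processed.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_IPV6_MTU 1280u /* smallest MTU a PTB may usefully report */

/* Returns true if the stored path MTU was lowered. */
static bool
handle_ptb(uint32_t *path_mtu, uint32_t reported_mtu)
{
	/* Too small: ignore the message, no Fragment Header fallback. */
	if (reported_mtu < MIN_IPV6_MTU)
		return (false);
	/* A PTB can only lower the path MTU, never raise it. */
	if (reported_mtu < *path_mtu) {
		*path_mtu = reported_mtu;
		return (true);
	}
	return (false);
}
```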
|
With the latest changes, this variable and parameter for
tcp_dropwithreset() is not needed anymore.
It also makes it harder to introduce the usage of multiple counters
for TCP again, which might open side channel attacks.
No functional change intended.
Reviewed by: rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51872
|
Don't use the rstreason variable as a hint that a second lookup is
performed, since the rstreason variable will be removed.
Use the INPLOOKUP_WILDCARD flag in the lookupflag variable instead.
No functional change intended.
Reviewed by: rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51847
|
Since there are multicast and broadcast specific error counters,
use them.
Reviewed by: rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51869
|
Reviewed by: Nick Banks
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51849
|
A blind attacker may try to determine whether a TCP connection exists
by sending ACK segments, which might trigger a challenge ACK on an
existing TCP connection. To make such a hit non-observable for the
attacker, also increment the global counter, which would have been
incremented if it had been a non-hit.
This issue was reported as issue number 11 in Keyu Man et al.:
SCAD: Towards a Universal and Automated Network Side-Channel
Vulnerability Detection
Reviewed by: Nick Banks, Peter Lei
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51724
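The idea can be reduced to a toy model with two counters. All names are hypothetical (`struct counter`, `account_probe()`); the point is only that the globally visible counter moves the same way whether or not the probe hit a connection.

```c
#include <stdbool.h>

struct counter { unsigned count; };

/*
 * Account for one probing ACK segment. The per-connection counter is
 * touched only on a hit; the global counter is charged in both cases,
 * so observing it does not reveal whether a connection exists.
 */
static void
account_probe(struct counter *global, struct counter *conn, bool hit)
{
	if (hit)
		conn->count++;
	global->count++;
}
```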
|
Also rate limit the sending of RST segments in the following cases:
* when receiving data on a closed socket.
* when a socket can not be created at the end of the handshake and
the sysctl-variable net.inet.tcp.syncache.rst_on_sock_fail is 1.
* when an ACK segment is received in SYN SENT state and it does not
acknowledge the SYN segment.
After this change, there is no need anymore to provide a rstreason
to tcp_dropwithreset(), since it is always BANDLIM_TCP_RST.
This will be a follow-up commit, since it will change the code in a
couple of places, but will not change the functionality.
Reviewed by: rrs, Nick Banks, Peter Lei
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51815
|
rstreason is only relevant in the code paths with the label
'dropwithreset', but not in the one with the label 'drop'.
No functional change intended.
Reviewed by: Nick Banks, rrs, Peter Lei, imp
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51814
|
Note: btw submitted a number of other things in this area that haven't
made it into the tree, so I'm making an exception to the no typo rule
since it was done in that context.
Submitted by: btw (Tiwei Bie GSOC 2015 so unsure what to use for author)
Differential Revision: https://reviews.freebsd.org/D3510
|
When an RTO happens during SACK loss recovery, snd_recover can possibly be
pulled left. With Lost Retransmission Detection (LRD) this can lead to the
rxmit of a hole ending up pointing to the left of the hole, which is
unexpected and leads to complications.
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D51725
|
delivered_data is the number of bytes, which have newly been
delivered to the peer. This includes the number of bytes
cumulatively acknowledged and selectively acknowledged.
Reviewed by: rscheff
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51718
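As a worked example of the definition above, assuming an ACK that advances the cumulative acknowledgment point. The function and its parameters are hypothetical; only the sum of the two contributions is taken from the commit message.

```c
#include <stdint.h>

/*
 * Bytes newly delivered to the peer by one ACK: the advance of the
 * cumulative ACK point plus the bytes newly covered by SACK blocks.
 * Assumes seg_ack is not behind snd_una_old.
 */
static uint32_t
delivered_data(uint32_t snd_una_old, uint32_t seg_ack,
    uint32_t newly_sacked)
{
	uint32_t cum_acked = seg_ack - snd_una_old;

	return (cum_acked + newly_sacked);
}
```

For instance, an ACK moving the cumulative point from 1000 to 2460 while newly SACKing another 1460 bytes delivers 2920 bytes.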
|
When panicking, don't print the condition which was violated,
but the condition which holds at the time of the panic.
Reviewed by: Nick Banks
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51726
|
No functional change intended.
Reviewed by: Nick Banks
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D51734
|
This is mostly for better readability when we need to resolve
which opcode corresponds to a specific number.
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D51457
|