| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
ROUTE_MPATH is the new config option controlling new multipath routing
implementation. Remove the last pieces of RADIX_MPATH-related code and
the config option.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D27244
Notes:
svn path=/head/; revision=368164
|
|
|
|
|
|
|
|
| |
Discussed with: gnn
MFC with: r367628
Notes:
svn path=/head/; revision=367645
|
|
|
|
|
|
|
|
|
|
|
|
| |
due to its lack of support for ICMP redirects. The following commit
adds redirects to the fastforward path, again allowing for decent
forwarding performance in the kernel.
Reviewed by: ae, melifaro
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Notes:
svn path=/head/; revision=367628
|
|
|
|
| |
Notes:
svn path=/head/; revision=365071
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It may be possible to fix this by deferring the lookup, but let's
keep the initial change simple to make MFCs easier.
PR: 246951
Reviewed by: melifaro
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25519
Notes:
svn path=/head/; revision=362840
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New fib[46]_lookup() functions support multipath transparently.
Given that, switch the last rtalloc_mpath_fib() calls to
dib4_lookup() and eliminate the function itself.
Note: proper flowid generation (especially for the outbound traffic) is a
bigger topic and will be handled in a separate review.
This change leaves flowid generation intact.
Differential Revision: https://reviews.freebsd.org/D24595
Notes:
svn path=/head/; revision=360431
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change is build on top of nexthop objects introduced in r359823.
Nexthops are separate datastructures, containing all necessary information
to perform packet forwarding such as gateway interface and mtu. Nexthops
are shared among the routes, providing more pre-computed cache-efficient
data while requiring less memory. Splitting the LPM code and the attached
data solves multiple long-standing problems in the routing layer,
drastically reduces the coupling with outher parts of the stack and allows
to transparently introduce faster lookup algorithms.
Route caching was (re)introduced to minimise (slow) routing lookups, allowing
for notably better performance for large TCP senders. Caching works by
acquiring rtentry reference, which is protected by per-rtentry mutex.
If the routing table is changed (checked by comparing the rtable generation id)
or link goes down, cache record gets withdrawn.
Nexthops have the same reference counting interface, backed by refcount(9).
This change merely replaces rtentry with the actual forwarding nextop as a
cached object, which is mostly mechanical. Other moving parts like cache
cleanup on rtable change remains the same.
Differential Revision: https://reviews.freebsd.org/D24340
Notes:
svn path=/head/; revision=360292
|
|
|
|
| |
Notes:
svn path=/head/; revision=359381
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
Notes:
svn path=/head/; revision=358333
|
|
|
|
|
|
|
|
|
| |
from any line.
Sponsored by: Netflix Inc.
Notes:
svn path=/head/; revision=357818
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.
However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.
Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.
On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().
This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.
Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.
This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.
Reviewed by: gallatin, hselasky, cy, adrian, kristof
Differential Revision: https://reviews.freebsd.org/D19111
Notes:
svn path=/head/; revision=353292
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are a few places that use hand crafted versions of the macros
from sys/netinet/in.h making it difficult to actually alter the
values in use by these macros. Correct that by replacing handcrafted
code with proper macro usage.
Reviewed by: karels, kristof
Approved by: bde (mentor)
MFC after: 3 weeks
Sponsored by: John Gilmore
Differential Revision: https://reviews.freebsd.org/D19317
Notes:
svn path=/head/; revision=345888
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The KPI have been reviewed and cleansed of features that were planned
back 20 years ago and never implemented. The pfil(9) internals have
been made opaque to protocols with only returned types and function
declarations exposed. The KPI is made more strict, but at the same time
more extensible, as kernel uses same command structures that userland
ioctl uses.
In nutshell [KA]PI is about declaring filtering points, declaring
filters and linking and unlinking them together.
New [KA]PI makes it possible to reconfigure pfil(9) configuration:
change order of hooks, rehook filter from one filtering point to a
different one, disconnect a hook on output leaving it on input only,
prepend/append a filter to existing list of filters.
Now it possible for a single packet filter to provide multiple rulesets
that may be linked to different points. Think of per-interface ACLs in
Cisco or Juniper. None of existing packet filters yet support that,
however limited usage is already possible, e.g. default ruleset can
be moved to single interface, as soon as interface would pride their
filtering points.
Another future feature is possiblity to create pfil heads, that provide
not an mbuf pointer but just a memory pointer with length. That would
allow filtering at very early stages of a packet lifecycle, e.g. when
packet has just been received by a NIC and no mbuf was yet allocated.
Differential Revision: https://reviews.freebsd.org/D18951
Notes:
svn path=/head/; revision=343631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Remove macros that covertly create epoch_tracker on thread stack. Such
macros a quite unsafe, e.g. will produce a buggy code if same macro is
used in embedded scopes. Explicitly declare epoch_tracker always.
- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
locking macros to what they actually are - the net_epoch.
Keeping them as is is very misleading. They all are named FOO_RLOCK(),
while they no longer have lock semantics. Now they allow recursion and
what's more important they now no longer guarantee protection against
their companion WLOCK macros.
Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.
This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.
Discussed with: jtl, gallatin
Notes:
svn path=/head/; revision=342872
|
|
|
|
|
|
|
|
|
|
| |
PR: 220078
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12457
Tested-by: Cassiano Peixoto and others
Notes:
svn path=/head/; revision=339808
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was lost when tryforward appeared. Now ip[6]_tryforward will be enabled
only when sending redirects for corresponding IP version is disabled via
sysctl. Otherwise will be used default forwarding function.
PR: 221137
Submitted by: mckay@
MFC after: 2 weeks
Notes:
svn path=/head/; revision=337736
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
Notes:
svn path=/head/; revision=336914
|
|
|
|
|
|
|
|
|
|
|
| |
variables.
Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147
Notes:
svn path=/head/; revision=336676
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cumulative throughput while running 64
netperf -H $DUT -t UDP_STREAM -- -m 1
on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps
Single stream throughput increases from 910kpps to 1.18Mpps
Baseline:
https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg
- Protect read access to global ifnet list with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg
- Protect short lived ifaddr references with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg
- Convert if_afdata read lock path to epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg
A fix for the inpcbhash contention is pending sufficient time
on a canary at LLNW.
Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15409
Notes:
svn path=/head/; revision=334118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Run on LLNW canaries and tested by pho@
gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.
When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:
InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32
4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32
4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32
4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32
4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32
4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32
4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32
After the patch
InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree
5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51
5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51
5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51
5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51
5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52
5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52
Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch
Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366
Notes:
svn path=/head/; revision=333813
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Forwarded packets passed through PFIL_OUT, which made it difficult for
firewalls to figure out if they were forwarding or producing packets. This in
turn is an issue for pf for IPv6 fragment handling: it needs to call
ip6_output() or ip6_forward() to handle the fragments. Figuring out which was
difficult (and until now, incorrect).
Having pfil distinguish the two removes an ugly piece of code from pf.
Introduce a new variant of the netpfil callbacks with a flags variable, which
has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if
a packet is forwarded.
Reviewed by: ae, kevans
Differential Revision: https://reviews.freebsd.org/D13715
Notes:
svn path=/head/; revision=331436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
Notes:
svn path=/head/; revision=326023
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and other similar socket options.
Provide new control message SCM_TIME_INFO to supply information about
timestamp. Currently it indicates that the timestamp was
hardware-assisted and high-precision, for software timestamps the
message is not returned. Reserved fields are added to ABI to report
additional info about it, it is expected that raw hardware clock value
might be useful for some applications.
Reviewed by: gallatin (previous version), hselasky
Sponsored by: Mellanox Technologies
MFC after: 2 weeks
X-Differential revision: https://reviews.freebsd.org/D12638
Notes:
svn path=/head/; revision=325507
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce IPCTL_INTRDQMAXLEN and IPCTL_INTRDQDROPS macros for this purpose.
Reviewed by: gnn
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10358
Notes:
svn path=/head/; revision=316715
|
|
|
|
|
|
|
|
|
|
|
|
| |
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
Notes:
svn path=/head/; revision=314436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Small summary
-------------
o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
option IPSEC_SUPPORT added. It enables support for loading
and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
support was removed. Added TCP/UDP checksum handling for
inbound packets that were decapsulated by transport mode SAs.
setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
build as part of ipsec.ko module (or with IPSEC kernel).
It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
methods. The only one header file <netipsec/ipsec_support.h>
should be included to declare all the needed things to work
with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
- now all security associations stored in the single SPI namespace,
and all SAs MUST have unique SPI.
- several hash tables added to speed up lookups in SADB.
- SADB now uses rmlock to protect access, and concurrent threads
can do SA lookups in the same time.
- many PF_KEY message handlers were reworked to reflect changes
in SADB.
- SADB_UPDATE message was extended to support new PF_KEY headers:
SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
avoid locking protection for ipsecrequest. Now we support
only limited number (4) of bundled SAs, but they are supported
for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
check for full history of applied IPsec transforms.
o References counting rules for security policies and security
associations were changed. The proper SA locking added into xform
code.
o xform code was also changed. Now it is possible to unregister xforms.
tdb_xxx structures were changed and renamed to reflect changes in
SADB/SPDB, and changed rules for locking and refcounting.
Reviewed by: gnn, wblock
Obtained from: Yandex LLC
Relnotes: yes
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D9352
Notes:
svn path=/head/; revision=313330
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sources to return timestamps when SO_TIMESTAMP is enabled. Two additional
clock sources are:
o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME);
o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC).
In addition to this, this option provides unified interface to get bintime
(equivalent of using SO_BINTIME), except it also supported with IPv6 where
SO_BINTIME has never been supported. The long term plan is to depreciate
SO_BINTIME and move everything to using SO_TS_CLOCK.
Idea for this enhancement has been briefly discussed on the Net session
during dev summit in Ottawa last June and the general input was positive.
This change is believed to benefit network benchmarks/profiling as well
as other scenarios where precise time of arrival measurement is necessary.
There are two regression test cases as part of this commit: one extends unix
domain test code (unix_cmsg) to test new SCM_XXX types and another one
implementis totally new test case which exchanges UDP packets between two
processes using both conventional methods (i.e. calling clock_gettime(2)
before recv(2) and after send(2)), as well as using setsockopt()+recv() in
receive path. The resulting delays are checked for sanity for all supported
clock types.
Reviewed by: adrian, gnn
Differential Revision: https://reviews.freebsd.org/D9171
Notes:
svn path=/head/; revision=312296
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can lead to change of mbuf pointer (packet filter could do m_pullup(),
NAT, etc). Also in case of change of destination address, tryforward can
decide that packet should be handled by local system. In this case modified
mbuf can be returned to the ip[6]_input(). To handle this correctly, check
M_FASTFWD_OURS flag after return from ip[6]_tryforward. And if it is present,
update variables that depend from mbuf pointer and skip another inbound
firewall processing.
No objection from: #network
MFC after: 3 weeks
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D8764
Notes:
svn path=/head/; revision=310258
|
|
|
|
|
|
|
|
| |
Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D7878
Notes:
svn path=/head/; revision=305824
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.
Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.
Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.
For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.
Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.
For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).
Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.
Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747
Notes:
svn path=/head/; revision=302054
|
|
|
|
|
|
|
|
| |
It was used for conditional build IPFIREWALL_FORWARD support.
But IPFIREWALL_FORWARD option was removed a long time ago.
Notes:
svn path=/head/; revision=301717
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add accessor functions to toggle the state per VNET.
The base system (vnet0) will always enable itself with the normal
registration. We will share the registered protocol handlers in all
VNETs minimising duplication and management.
Upon disabling netisr processing for a VNET drain the netisr queue from
packets for that VNET.
Update netisr consumers to (de)register on a per-VNET start/teardown using
VNET_SYS(UN)INIT functionality.
The change should be transparent for non-VIMAGE kernels.
Reviewed by: gnn (, hiren)
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6691
Notes:
svn path=/head/; revision=301270
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
specific order. VNET_SYSUNINITs however are doing exactly that.
Thus remove the VIMAGE conditional field from the domain(9) protosw
structure and replace it with VNET_SYSUNINITs.
This also allows us to change some order and to make the teardown functions
file local static.
Also convert divert(4) as it uses the same mechanism ip(4) and ip6(4) use
internally.
Slightly reshuffle the SI_SUB_* fields in kernel.h and add a new ones, e.g.,
for pfil consumers (firewalls), partially for this commit and for others
to come.
Reviewed by: gnn, tuexen (sctp), jhb (kernel.h)
Obtained from: projects/vnet
MFC after: 2 weeks
X-MFC: do not remove pr_destroy
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6652
Notes:
svn path=/head/; revision=301114
|
|
|
|
|
|
|
|
|
|
|
|
| |
It does not cause any real issues because the variable is overwritten
only when the packet is forwarded (and the variable is not used anymore).
Obtained from: pfSense
MFC after: 2 weeks
Sponsored by: Rubicon Communications (Netgate)
Notes:
svn path=/head/; revision=297985
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ip_dooptions(), icmp6_redirect_input(), in6_lltable_rtcheck(),
in6p_lookup_mcast_ifp() and in6_selecthlim() use new routing api.
Eliminate now-unused ip_rtaddr().
Fix lookup key fib6_lookup_nh_basic() which was lost diring merge.
Make fib6_lookup_nh_basic() and fib6_lookup_nh_extended() always
return IPv6 destination address with embedded scope. Currently
rw_gateway has it scope embedded, do the same for non-gatewayed
destinations.
Sponsored by: Yandex LLC
Notes:
svn path=/head/; revision=292015
|
|
|
|
|
|
|
|
|
|
|
| |
Use hhook(9) framework to achieve ability of loading and unloading
if_enc(4) kernel module. INET and INET6 code on initialization registers
two helper hooks points in the kernel. if_enc(4) module uses these helper
hook points and registers its hooks. IPSEC code uses these hhook points
to call helper hooks implemented in if_enc(4).
Notes:
svn path=/head/; revision=291292
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sysctl and will always be on. The former split between default and
fast forwarding is removed by this commit while preserving the ability
to use all network stack features.
Differential Revision: https://reviews.freebsd.org/D4042
Reviewed by: ae, melifaro, olivier, rwatson
MFC after: 1 month
Sponsored by: Rubicon Communications (Netgate)
Notes:
svn path=/head/; revision=290383
|
|
|
|
|
|
|
|
|
|
| |
because the RSS hash may need to be recalculated.
Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn>
Differential Revision: https://reviews.freebsd.org/D3564
Notes:
svn path=/head/; revision=287524
|
|
|
|
|
|
|
|
|
|
| |
an IPv6 version to show up.
Submitted by: Tiwei Bie <btw@mail.ustc.edu.cn>
Differential Revision: https://reviews.freebsd.org/D3504
Notes:
svn path=/head/; revision=287277
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both are used to protect access to IP addresses lists and they can be
acquired for reading several times per packet. To reduce lock contention
it is better to use rmlock here.
Reviewed by: gnn (previous version)
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D3149
Notes:
svn path=/head/; revision=286001
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
apparently neither clang nor gcc complain about this.
But clang intis the var to NULL correctly while gcc on at least mips does not.
Correct the undefined behavior by initializing the variable properly.
PR: 201371
Differential Revision: https://reviews.freebsd.org/D3036
Reviewed by: gnn
Approved by: gnn(mentor)
Notes:
svn path=/head/; revision=285325
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ip_forward() does a route lookup for testing this packet can be sent to a known destination,
it also can do another route lookup if it detects that an ICMP redirect is needed,
it forgets all of this and handovers to ip_output() to do the same lookup yet again.
This optimisation just does one route lookup during the forwarding path and handovers that to be considered by ip_output().
Differential Revision: https://reviews.freebsd.org/D2964
Approved by: ae, gnn(mentor)
MFC after: 1 week
Notes:
svn path=/head/; revision=285051
|
|
|
|
|
|
|
| |
functions that were moved to ip_reass.c.
Notes:
svn path=/head/; revision=281388
|
|
|
|
|
|
|
|
|
|
|
|
| |
function names have changed and comments are reformatted or added, but
there is no functional change.
Claim copyright for me and Adrian.
Sponsored by: Nginx, Inc.
Notes:
svn path=/head/; revision=281351
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of allocations in V_nipq is racy. To fix that, we would simply stop doing
book-keeping ourselves, and rely on UMA doing that. There could be a
slight overcommit due to caches, but that isn't a big deal.
o V_nipq and V_maxnipq go away.
o net.inet.ip.fragpackets is now just SYSCTL_UMA_CUR()
o net.inet.ip.maxfragpackets could have been just SYSCTL_UMA_MAX(), but
historically it has special semantics about values of 0 and -1, so
provide sysctl_maxfragpackets() to handle these special cases.
o If zone limit lowers either due to net.inet.ip.maxfragpackets or due to
kern.ipc.nmbclusters, then new function ipq_drain_tomax() goes over
buckets and frees the oldest packets until we are in the limit.
The code that (incorrectly) did that in ip_slowtimo() is removed.
o ip_reass() doesn't check any limits and calls uma_zalloc(M_NOWAIT).
If it fails, a new function ipq_reuse() is called. This function will
find the oldest packet in the currently locked bucket, and if there is
none, it will search in other buckets until success.
Sponsored by: Nginx, Inc.
Notes:
svn path=/head/; revision=281342
|
|
|
|
|
|
|
|
|
| |
locks and doing lookups.
Sponsored by: Nginx, Inc.
Notes:
svn path=/head/; revision=281334
|
|
|
|
|
|
|
|
|
| |
that they protect.
Sponsored by: Nginx, Inc.
Notes:
svn path=/head/; revision=281329
|
|
|
|
|
|
|
| |
Sponsored by: Nginx, Inc.
Notes:
svn path=/head/; revision=281296
|
|
|
|
|
|
|
|
|
| |
relying on timeouts.
Sponsored by: Nginx, Inc.
Notes:
svn path=/head/; revision=281295
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
free a fragment, provide two inline functions that do that for us:
ipq_drop() and ipq_timeout().
o Rename ip_free_f() to ipq_free() to match the name scheme of IP reassembly.
o Remove assertion from ipq_free(), since it requires extra argument to be
passed, but locking scheme is simple enough and function is static.
Sponsored by: Nginx, Inc.
Notes:
svn path=/head/; revision=281294
|