summaryrefslogtreecommitdiff
path: root/sys/netinet6/in6_proto.c
Commit message (Collapse)AuthorAgeFilesLines
* Remove unused nhop_ref_any() function.Alexander V. Chernikov2020-09-201-1/+0
| | | | | | | | | Remove "opt_mpath.h" header where not needed. No functional changes. Notes: svn path=/head/; revision=365930
* Simplify dom_<rtattach|rtdetach>.Alexander V. Chernikov2020-08-141-5/+0
| | | | | | | | | | | | | | Remove unused arguments from dom_rtattach/dom_rtdetach functions and make them return/accept 'struct rib_head' instead of 'void **'. Declare inet/inet6 implementations in the relevant _var.h headers similar to domifattach / domifdetach. Add rib_subscribe_internal() function to accept subscriptions to the rnh directly. Differential Revision: https://reviews.freebsd.org/D26053 Notes: svn path=/head/; revision=364238
* Fix typo.Andrey V. Elsukov2020-08-051-1/+1
| | | | | | | | | Submitted by: Evgeniy Khramtsov <evgeniy at khramtsov org> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25932 Notes: svn path=/head/; revision=363900
* Add the SCTP_SUPPORT kernel option.Mark Johnston2020-06-181-1/+1
| | | | | | | | | | | | | This is in preparation for enabling a loadable SCTP stack. Analogous to IPSEC/IPSEC_SUPPORT, the SCTP_SUPPORT kernel option must be configured in order to support a loadable SCTP implementation. Discussed with: tuexen MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=362338
* Avoid calling protocol drain routines more than once per reclamation event.Jonathan T. Looney2020-04-161-4/+4
| | | | | | | | | | | | | | | | | | | | | | | mb_reclaim() calls the protocol drain routines for each protocol in each domain. Some protocols exist in more than one domain and share drain routines. In the case of SCTP, it also uses the same drain routine for its SOCK_SEQPACKET and SOCK_STREAM entries in the same domain. On systems with INET, INET6, and SCTP all defined, mb_reclaim() calls sctp_drain() four times. On systems with INET and INET6 defined, mb_reclaim() calls tcp_drain() twice. mb_reclaim() is the only in-tree caller of the pr_drain protocol entry. Eliminate this duplication by ensuring that each pr_drain routine is only specified for one protocol entry in one domain. Reviewed by: tuexen MFC after: 2 weeks Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D24418 Notes: svn path=/head/; revision=360020
* Remove per-AF radix_mpath initializtion functions.Alexander V. Chernikov2020-04-111-7/+0
| | | | | | | | | | | Split their functionality by moving random seed allocation to SYSINIT and calling (new) generic multipath function from standard IPv4/IPv5 RIB init handlers. Differential Revision: https://reviews.freebsd.org/D24356 Notes: svn path=/head/; revision=359797
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)Pawel Biernacki2020-02-261-10/+16
| | | | | | | | | | | | | | | | | | | r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718 Notes: svn path=/head/; revision=358333
* Bring back redirect route expiration.Alexander V. Chernikov2020-01-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Redirect (and temporal) route expiration was broken a while ago. This change brings route expiration back, with unified IPv4/IPv6 handling code. It introduces net.inet.icmp.redirtimeout sysctl, allowing to set an expiration time for redirected routes. It defaults to 10 minutes, analogues with net.inet6.icmp6.redirtimeout. Implementation uses separate file, route_temporal.c, as route.c is already bloated with tons of different functions. Internally, expiration is implemented as an per-rnh callout scheduled when route with non-zero rt_expire time is added or rt_expire is changed. It does not add any overhead when no temporal routes are present. Callout traverses entire routing tree under wlock, scheduling expired routes for deletion and calculating the next time it needs to be run. The rationale for such implemention is the following: typically workloads requiring large amount of routes have redirects turned off already, while the systems with small amount of routes will not inhibit large overhead during tree traversal. This changes also fixes netstat -rn display of route expiration time, which has been broken since the conversion from kread() to sysctl. Reviewed by: bz MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D23075 Notes: svn path=/head/; revision=356984
* Add fibnum, family and vnet pointer to each rib head.Alexander V. Chernikov2020-01-091-1/+1
| | | | | | | | | | | | | Having metadata such as fibnum or vnet in the struct rib_head is handy as it eases building functionality in the routing space. This change is required to properly bring back route redirect support. Reviewed by: bz MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D23047 Notes: svn path=/head/; revision=356559
* frag6.c: move variables and sysctls into local fileBjoern A. Zeeb2019-08-021-35/+0
| | | | | | | | | | | | | | | | | | Move the sysctls and the related variables only used in frag6.c into the file and out of in6_proto.c. That way everything belonging together is in one place. Sort the variables into global and per-vnet scopes and make them static. No longer export the (helper) function frag6_set_bucketsize() now also file-local only. Should be no functional changes, only reduced public KPI/KBI surface. MFC after: 3 months Sponsored by: Netflix Notes: svn path=/head/; revision=350533
* Update for IETF draft-ietf-6man-ipv6only-flag.Bjoern A. Zeeb2019-03-061-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | All changes are hidden behind the EXPERIMENTAL option and are not compiled in by default. Add ND6_IFF_IPV6_ONLY_MANUAL to be able to set the interface into no-IPv4-mode manually without router advertisement options. This will allow developers to test software for the appropriate behaviour even on dual-stack networks or IPv6-Only networks without the option being set in RA messages. Update ifconfig to allow setting and displaying the flag. Update the checks for the filters to check for either the automatic or the manual flag to be set. Add REVARP to the list of filtered IPv4-related protocols and add an input filter similar to the output filter. Add a check, when receiving the IPv6-Only RA flag to see if the receiving interface has any IPv4 configured. If it does, ignore the IPv6-Only flag. Add a per-VNET global sysctl, which is on by default, to not process the automatic RA IPv6-Only flag. This way an administrator (if this is compiled in) has control over the behaviour in case the node still relies on IPv4. Notes: svn path=/head/; revision=344859
* Implement a limit on on the number of IPv6 reassembly queues per bucket.Jonathan T. Looney2018-08-141-2/+21
| | | | | | | | | | | | | | | | | | | | | There is a hashing algorithm which should distribute IPv6 reassembly queues across the available buckets in a relatively even way. However, if there is a flaw in the hashing algorithm which allows a large number of IPv6 fragment reassembly queues to end up in a single bucket, a per- bucket limit could help mitigate the performance impact of this flaw. Implement such a limit, with a default of twice the maximum number of reassembly queues divided by the number of buckets. Recalculate the limit any time the maximum number of reassembly queues changes. However, allow the user to override the value using a sysctl (net.inet6.ip6.maxfragbucketsize). Reviewed by: jhb Security: FreeBSD-SA-18:10.ip Security: CVE-2018-6923 Notes: svn path=/head/; revision=337783
* Add a limit of the number of fragments per IPv6 packet.Jonathan T. Looney2018-08-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | The IPv4 fragment reassembly code supports a limit on the number of fragments per packet. The default limit is currently 17 fragments. Among other things, this limit serves to limit the number of fragments the code must parse when trying to reassembly a packet. Add a limit to the IPv6 reassembly code. By default, limit a packet to 65 fragments (64 on the queue, plus one final fragment to complete the packet). This allows an average fragment size of 1,008 bytes, which should be sufficient to hold a fragment. (Recall that the IPv6 minimum MTU is 1280 bytes. Therefore, this configuration allows a full-size IPv6 packet to be fragmented on a link with the minimum MTU and still carry approximately 272 bytes of headers before the fragmented portion of the packet.) Users can adjust this limit using the net.inet6.ip6.maxfragsperpacket sysctl. Reviewed by: jhb Security: FreeBSD-SA-18:10.ip Security: CVE-2018-6923 Notes: svn path=/head/; revision=337782
* Make the IPv6 fragment limits be global, rather than per-VNET, limits.Jonathan T. Looney2018-08-141-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | The IPv6 reassembly fragment limit is based on the number of mbuf clusters, which are a global resource. However, the limit is currently applied on a per-VNET basis. Given enough VNETs (or given sufficient customization on enough VNETs), it is possible that the sum of all the VNET fragment limits will exceed the number of mbuf clusters available in the system. Given the fact that the fragment limits are intended (at least in part) to regulate access to a global resource, the IPv6 fragment limit should be applied on a global basis. Note that it is still possible to disable fragmentation for a particular VNET by setting the net.inet6.ip6.maxfragpackets sysctl to 0 for that VNET. In addition, it is now possible to disable fragmentation globally by setting the net.inet6.ip6.maxfrags sysctl to 0. Reviewed by: jhb Security: FreeBSD-SA-18:10.ip Security: CVE-2018-6923 Notes: svn path=/head/; revision=337781
* Allow implicit TCP connection setup for TCP/IPv6.Michael Tuexen2018-07-301-1/+1
| | | | | | | | | | | | | | | TCP/IPv4 allows an implicit connection setup using sendto(), which is used for TTCP and TCP fast open. This patch adds support for TCP/IPv6. While there, improve some tests for detecting multicast addresses, which are mapped. Reviewed by: bz@, kbowling@, rrs@ Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D16458 Notes: svn path=/head/; revision=336940
* Remove empty encap_init() function.Andrey V. Elsukov2018-05-291-3/+0
| | | | | | | MFC after: 2 weeks Notes: svn path=/head/; revision=334324
* sys: further adoption of SPDX licensing ID tags.Pedro F. Giffuni2017-11-201-0/+2
| | | | | | | | | | | | | | | | | Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point. Notes: svn path=/head/; revision=326023
* Renumber copyright clause 4Warner Losh2017-02-281-1/+1
| | | | | | | | | | | | Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96 Notes: svn path=/head/; revision=314436
* Remove IPsec related PCB code from SCTP.Andrey V. Elsukov2017-02-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The inpcb structure has inp_sp pointer that is initialized by ipsec_init_pcbpolicy() function. This pointer keeps strorage for IPsec security policies associated with a specific socket. An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options to configure these security policies. Then ip[6]_output() uses inpcb pointer to specify that an outgoing packet is associated with some socket. And IPSEC_OUTPUT() method can use a security policy stored in the inp_sp. For inbound packet the protocol-specific input routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms to inbound security policy configured in the inpcb. SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends packets. Thus IPSEC_OUTPUT() method does not consider such packets as associated with some socket and can not apply security policies from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY() method is called from protocol-specific input routine, it can specify inpcb pointer and associated with socket inbound policy will be checked. But there are two problems: 1. Such check is asymmetric, becasue we can not apply security policy from inpcb for outgoing packet. 2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and access to inp_sp is protected. But for SCTP this is not correct, becasue SCTP uses own locks to protect inpcb. To fix these problems remove IPsec related PCB code from SCTP. This imply that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options will be not applicable to SCTP sockets. To be able correctly check inbound security policies for SCTP, mark its protocol header with the PR_LASTHDR flag. Reported by: tuexen Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D9538 Notes: svn path=/head/; revision=313697
* Merge projects/ipsec into head/.Andrey V. Elsukov2017-02-061-33/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Small summary ------------- o Almost all IPsec releated code was moved into sys/netipsec. o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel option IPSEC_SUPPORT added. It enables support for loading and unloading of ipsec.ko and tcpmd5.ko kernel modules. o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type support was removed. Added TCP/UDP checksum handling for inbound packets that were decapsulated by transport mode SAs. setkey(8) modified to show run-time NAT-T configuration of SA. o New network pseudo interface if_ipsec(4) added. For now it is build as part of ipsec.ko module (or with IPSEC kernel). It implements IPsec virtual tunnels to create route-based VPNs. o The network stack now invokes IPsec functions using special methods. The only one header file <netipsec/ipsec_support.h> should be included to declare all the needed things to work with IPsec. o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed. Now these protocols are handled directly via IPsec methods. o TCP_SIGNATURE support was reworked to be more close to RFC. o PF_KEY SADB was reworked: - now all security associations stored in the single SPI namespace, and all SAs MUST have unique SPI. - several hash tables added to speed up lookups in SADB. - SADB now uses rmlock to protect access, and concurrent threads can do SA lookups in the same time. - many PF_KEY message handlers were reworked to reflect changes in SADB. - SADB_UPDATE message was extended to support new PF_KEY headers: SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They can be used by IKE daemon to change SA addresses. o ipsecrequest and secpolicy structures were cardinally changed to avoid locking protection for ipsecrequest. Now we support only limited number (4) of bundled SAs, but they are supported for both INET and INET6. o INPCB security policy cache was introduced. Each PCB now caches used security policies to avoid SP lookup for each packet. o For inbound security policies added the mode, when the kernel does check for full history of applied IPsec transforms. o References counting rules for security policies and security associations were changed. The proper SA locking added into xform code. o xform code was also changed. Now it is possible to unregister xforms. tdb_xxx structures were changed and renamed to reflect changes in SADB/SPDB, and changed rules for locking and refcounting. Reviewed by: gnn, wblock Obtained from: Yandex LLC Relnotes: yes Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D9352 Notes: svn path=/head/; revision=313330
* Improve some of the sysctl descriptions added in r299827.Mark Johnston2017-01-161-6/+10
| | | | | | | | | | Submitted by: Marie Helene Kvello-Aune <marieheleneka@gmail.com> (original version) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D5336 Notes: svn path=/head/; revision=312307
* The pr_destroy field does not allow us to run the teardown code in aBjoern A. Zeeb2016-06-011-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | specific order. VNET_SYSUNINITs however are doing exactly that. Thus remove the VIMAGE conditional field from the domain(9) protosw structure and replace it with VNET_SYSUNINITs. This also allows us to change some order and to make the teardown functions file local static. Also convert divert(4) as it uses the same mechanism ip(4) and ip6(4) use internally. Slightly reshuffle the SI_SUB_* fields in kernel.h and add a new ones, e.g., for pfil consumers (firewalls), partially for this commit and for others to come. Reviewed by: gnn, tuexen (sctp), jhb (kernel.h) Obtained from: projects/vnet MFC after: 2 weeks X-MFC: do not remove pr_destroy Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6652 Notes: svn path=/head/; revision=301114
* Add PR_CONNREQUIRED for SOCK_STREAM sockets using SCTP.Michael Tuexen2016-05-301-1/+1
| | | | | | | | | | | This is required to signal connetion setup on non-blocking sockets via becoming writable. This still allows for implicit connection setup. MFC after: 1 week Notes: svn path=/head/; revision=301000
* Add sysctl descriptions for net.inet6.ip6 and net.inet6.icmp6.Mark Johnston2016-05-151-69/+92
| | | | | | | | | | | | | icmp6.redirtimeout, icmp6.nd6_maxnudhint and ip6.rr_prune are left undocumented as they appear to have no effect. Some existing sysctl descriptions were modified for consistency and style, and the ip6.tempvltime and ip6.temppltime handlers were rewritten to be a bit simpler and to avoid setting the sysctl value before validating it. MFC after: 3 weeks Notes: svn path=/head/; revision=299827
* Indentation issues.Pedro F. Giffuni2016-04-201-2/+1
| | | | | | | | | Contract some lines leftover from r298310. Mea culpa. Notes: svn path=/head/; revision=298354
* kernel: use our nitems() macro when it is available through param.h.Pedro F. Giffuni2016-04-191-1/+1
| | | | | | | | | No functional change, only trivial cases are done in this sweep, Discussed in: freebsd-current Notes: svn path=/head/; revision=298310
* These files were getting sys/malloc.h and vm/uma.h with header pollutionGleb Smirnoff2016-02-011-0/+1
| | | | | | | via sys/mbuf.h Notes: svn path=/head/; revision=295126
* Renove faith(4) and faithd(8) from base. It looks like industryAlexander V. Chernikov2014-11-091-3/+0
| | | | | | | | | | | | have chosen different (and more traditional) stateless/statuful NAT64 as translation mechanism. Last non-trivial commits to both faith(4) and faithd(8) happened more than 12 years ago, so I assume it is time to drop RFC3142 in FreeBSD. No objections from: net@ Notes: svn path=/head/; revision=274331
* Overhaul if_gre(4).Andrey V. Elsukov2014-11-071-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split it into two modules: if_gre(4) for GRE encapsulation and if_me(4) for minimal encapsulation within IP. gre(4) changes: * convert to if_transmit; * rework locking: protect access to softc with rmlock, protect from concurrent ioctls with sx lock; * correct interface accounting for outgoing datagramms (count only payload size); * implement generic support for using IPv6 as delivery header; * make implementation conform to the RFC 2784 and partially to RFC 2890; * add support for GRE checksums - calculate for outgoing datagramms and check for inconming datagramms; * add support for sending sequence number in GRE header; * remove support of cached routes. This fixes problem, when gre(4) doesn't work at system startup. But this also removes support for having tunnels with the same addresses for inner and outer header. * deprecate support for various GREXXX ioctls, that doesn't used in FreeBSD. Use our standard ioctls for tunnels. me(4): * implementation conform to RFC 2004; * use if_transmit; * use the same locking model as gre(4); PR: 164475 Differential Revision: D1023 No objections from: net@ Relnotes: yes Sponsored by: Yandex LLC Notes: svn path=/head/; revision=274246
* Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.Gleb Smirnoff2014-11-071-82/+85
| | | | | | | Sponsored by: Nginx, Inc. Notes: svn path=/head/; revision=274225
* Remove VNET_SYSCTL_ARG(). The generic sysctl(9) code handles that.Gleb Smirnoff2014-11-071-4/+0
| | | | | | | | Reviewed by: ae Sponsored by: Nginx, Inc. Notes: svn path=/head/; revision=274223
* Finish r274118: remove useless fields from struct domain.Alexander V. Chernikov2014-11-061-2/+0
| | | | | | | Sponsored by: Yandex LLC Notes: svn path=/head/; revision=274177
* Make checks for rt_mtu generic:Alexander V. Chernikov2014-11-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some virtual if drivers has (ab)used ifa ifa_rtrequest hook to enforce route MTU to be not bigger that interface MTU. While ifa_rtrequest hooking might be an option in some situation, it is not feasible to do MTU checks there: generic (or per-domain) routing code is perfectly capable of doing this. We currrently have 3 places where MTU is altered: 1) route addition. In this case domain overrides radix _addroute callback (in[6]_addroute) and all necessary checks/fixes are/can be done there. 2) route change (especially, GW change). In this case, there are no explicit per-domain calls, but one can override rte by setting ifa_rtrequest hook to domain handler (inet6 does this). 3) ifconfig ifaceX mtu YYYY In this case, we have no callbacks, but ip[6]_output performes runtime checks and decreases rt_mtu if necessary. Generally, the goals are to be able to handle all MTU changes in control plane, not in runtime part, and properly deal with increased interface MTU. This commit changes the following: * removes hooks setting MTU from drivers side * adds proper per-doman MTU checks for case 1) * adds generic MTU check for case 2) * The latter is done by using new dom_ifmtu callback since if_mtu denotes L3 interface MTU, e.g. maximum trasmitted _packet_ size. However, IPv6 mtu might be different from if_mtu one (e.g. default 1280) for some cases, so we need an abstract way to know maximum MTU size for given interface and domain. * moves rt_setmetrics() before MTU/ifa_rtrequest hooks since it copies user-supplied data which must be checked. * removes RT_LOCK_ASSERT() from other ifa_rtrequest hooks to be able to use this functions on new non-inserted rte. More changes will follow soon. MFC after: 1 month Sponsored by: Yandex LLC Notes: svn path=/head/; revision=274175
* Change pr_output's prototype to avoid the need for explicit casts.Kevin Lo2014-08-151-6/+6
| | | | | | | | | | This is a follow up to r269699. Phabric: D564 Reviewed by: jhb Notes: svn path=/head/; revision=270008
* Merge 'struct ip6protosw' and 'struct protosw' into one. Now we haveKevin Lo2014-08-081-7/+7
| | | | | | | | | | only one protocol switch structure that is shared between ipv4 and ipv6. Phabric: D476 Reviewed by: jhb Notes: svn path=/head/; revision=269699
* Minor style cleanups.Kevin Lo2014-04-071-1/+1
| | | | Notes: svn path=/head/; revision=264213
* Add support for UDP-Lite protocol (RFC 3828) to IPv4 and IPv6 stacks.Kevin Lo2014-04-071-0/+13
| | | | | | | | | | | Tested with vlc and a test suite [1]. [1] http://www.erg.abdn.ac.uk/~gerrit/udp-lite/files/udplite_linux.tar.gz Reviewed by: jhb, glebius, adrian Notes: svn path=/head/; revision=264212
* o Revamp API between flowtable and netinet, netinet6.Gleb Smirnoff2014-02-071-14/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | - ip_output() and ip_output6() simply call flowtable_lookup(), passing mbuf and address family. That's the only code under #ifdef FLOWTABLE in the protocols code now. o Revamp statistics gathering and export. - Remove hand made pcpu stats, and utilize counter(9). - Snapshot of statistics is available via 'netstat -rs'. - All sysctls are moved into net.flowtable namespace, since spreading them over net.inet isn't correct. o Properly separate at compile time INET and INET6 parts. o General cleanup. - Remove chain of multiple flowtables. We simply have one for IPv4 and one for IPv6. - Flowtables are allocated in flowtable.c, symbols are static. - With proper argument to SYSINIT() we no longer need flowtable_ready. - Hash salt doesn't need to be per-VNET. - Removed rudimentary debugging, which use quite useless in dtrace era. The runtime behavior of flowtable shouldn't be changed by this commit. Sponsored by: Netflix Sponsored by: Nginx, Inc. Notes: svn path=/head/; revision=261601
* The r48589 promised to remove implicit inclusion of if_var.h soon. PrepareGleb Smirnoff2013-10-261-0/+1
| | | | | | | | | | | to this event, adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc. Notes: svn path=/head/; revision=257176
* Migrate structs ip6stat, icmp6stat and rip6stat to PCPU counters.Andrey V. Elsukov2013-07-091-6/+8
| | | | Notes: svn path=/head/; revision=253085
* Use FF02:0:0:0:0:2:FF00::/104 prefix for IPv6 Node Information GroupHiroki Sato2013-05-041-0/+6
| | | | | | | | | | | | | | | | | | | | | Address. Although KAME implementation used FF02:0:0:0:0:2::/96 based on older versions of draft-ietf-ipngwg-icmp-name-lookup, it has been changed in RFC 4620. The kernel always joins the /104-prefixed address, and additionally does /96-prefixed one only when net.inet6.icmp6.nodeinfo_oldmcprefix=1. The default value of the sysctl is 1. ping6(8) -N flag now uses /104-prefixed one. When this flag is specified twice, it uses /96-prefixed one instead. Reviewed by: ume Based on work by: Thomas Scheffler PR: conf/174957 MFC after: 2 weeks Notes: svn path=/head/; revision=250251
* Remove unused global variables.Kevin Lo2013-03-221-15/+0
| | | | | | | Reviewed by: ae, glebius Notes: svn path=/head/; revision=248607
* o Convert IPv6 read-only stats sysctls to the read-write ones.Maxim Konovalov2011-12-191-3/+3
| | | | | | | | | | | | o Teach netstat(1) -z to reset these stats sysctls. PR: bin/153206 Reviewed by: glebuis Sponsored by: NGINX, Inc. MFC after: 1 month Notes: svn path=/head/; revision=228700
* Add $ipv6_cpe_wanif to enable functionality required for IPv6 CPEHiroki Sato2011-09-131-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | (r225485). When setting an interface name to it, the following configurations will be enabled: 1. "no_radr" is set to all IPv6 interfaces automatically. 2. "-no_radr accept_rtadv" will be set only for $ipv6_cpe_wanif. This is done just before evaluating $ifconfig_IF_ipv6 in the rc.d scripts (this means you can manually supersede this configuration if necessary). 3. The node will add RA-sending routers to the default router list even if net.inet6.ip6.forwarding=1. This mode is added to conform to RFC 6204 (a router which connects the end-user network to a service provider network). To enable packet forwarding, you still need to set ipv6_gateway_enable=YES. Note that accepting router entries into the default router list when packet forwarding capability and a routing daemon are enabled can result in messing up the routing table. To minimize such unexpected behaviors, "no_radr" is set on all interfaces but $ipv6_cpe_wanif. Approved by: re (bz) Notes: svn path=/head/; revision=225521
* The socket API only specifies SCTP for SOCK_SEQPACKET andMichael Tuexen2011-07-121-31/+19
| | | | | | | | | SOCK_STREAM, but not SOCK_DGRAM. So don't register it for SOCK_DGRAM. While there, fix some indentation. Notes: svn path=/head/; revision=223963
* - Accept Router Advertisement messages even when net.inet6.ip6.forwarding=1.Hiroki Sato2011-06-061-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - A new per-interface knob IFF_ND6_NO_RADR and sysctl IPV6CTL_NO_RADR. This controls if accepting a route in an RA message as the default route. The default value for each interface can be set by net.inet6.ip6.no_radr. The system wide default value is 0. - A new sysctl: net.inet6.ip6.norbit_raif. This controls if setting R-bit in NA on RA accepting interfaces. The default is 0 (R-bit is set based on net.inet6.ip6.forwarding). Background: IPv6 host/router model suggests a router sends an RA and a host accepts it for router discovery. Because of that, KAME implementation does not allow accepting RAs when net.inet6.ip6.forwarding=1. Accepting RAs on a router can make the routing table confused since it can change the default router unintentionally. However, in practice there are cases where we cannot distinguish a host from a router clearly. For example, a customer edge router often works as a host against the ISP, and as a router against the LAN at the same time. Another example is a complex network configurations like an L2TP tunnel for IPv6 connection to Internet over an Ethernet link with another native IPv6 subnet. In this case, the physical interface for the native IPv6 subnet works as a host, and the pseudo-interface for L2TP works as the default IP forwarding route. Problem: Disabling processing RA messages when net.inet6.ip6.forwarding=1 and accepting them when net.inet6.ip6.forward=0 cause the following practical issues: - A router cannot perform SLAAC. It becomes a problem if a box has multiple interfaces and you want to use SLAAC on some of them, for example. A customer edge router for IPv6 Internet access service using an IPv6-over-IPv6 tunnel sometimes needs SLAAC on the physical interface for administration purpose; updating firmware and so on (link-local addresses can be used there, but GUAs by SLAAC are often used for scalability). - When a host has multiple IPv6 interfaces and it receives multiple RAs on them, controlling the default route is difficult. Router preferences defined in RFC 4191 works only when the routers on the links are under your control. Details of Implementation Changes: Router Advertisement messages will be accepted even when net.inet6.ip6.forwarding=1. More precisely, the conditions are as follow: (ACCEPT_RTADV && !NO_RADR && !ip6.forwarding) => Normal RA processing on that interface. (as IPv6 host) (ACCEPT_RTADV && (NO_RADR || ip6.forwarding)) => Accept RA but add the router to the defroute list with rtlifetime=0 unconditionally. This effectively prevents from setting the received router address as the box's default route. (!ACCEPT_RTADV) => No RA processing on that interface. ACCEPT_RTADV and NO_RADR are per-interface knob. In short, all interface are classified as "RA-accepting" or not. An RA-accepting interface always processes RA messages regardless of ip6.forwarding. The difference caused by NO_RADR or ip6.forwarding is whether the RA source address is considered as the default router or not. R-bit in NA on the RA accepting interfaces is set based on net.inet6.ip6.forwarding. While RFC 6204 W-1 rule (for CPE case) suggests a router should disable the R-bit completely even when the box has net.inet6.ip6.forwarding=1, I believe there is no technical reason with doing so. This behavior can be set by a new sysctl net.inet6.ip6.norbit_raif (the default is 0). Usage: # ifconfig fxp0 inet6 accept_rtadv => accept RA on fxp0 # ifconfig fxp0 inet6 accept_rtadv no_radr => accept RA on fxp0 but ignore default route information in it. # sysctl net.inet6.ip6.norbit_no_radr=1 => R-bit in NAs on RA accepting interfaces will always be set to 0. Notes: svn path=/head/; revision=222728
* Add FEATURE() definitions for IPv4 and IPv6 so that we can useBjoern A. Zeeb2011-05-251-0/+1
| | | | | | | | | | | | | feature_present(3) to dynamically decide whether to use one or the other family. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 10 days Notes: svn path=/head/; revision=222272
* MFp4 CH=191760,191770:Bjoern A. Zeeb2011-04-201-0/+9
| | | | | | | | | | | | | | | | Not compiling in and not initializing from inetsw from in_proto.c for IPv6 only, we need to initialize upper layer protocols from inet6sw. Make sure to not initialize them twice in a Dual-Stack environment but only conditionally on no INET as we have done for TCP for a long time. Otherwise we would leak resources. Reviewed by: gnn Sponsored by: The FreeBSD Foundation Sponsored by: iXsystems MFC after: 3 days Notes: svn path=/head/; revision=220881
* Allow carp(4) to be loaded as a kernel module. Follow precedent set byWill Andrews2010-08-111-17/+0
| | | | | | | | | | | | | | | | | | bridge(4), lagg(4) etc. and make use of function pointers and pf_proto_register() to hook carp into the network stack. Currently, because of the uncertainty about whether the unload path is free of race condition panics, unloads are disallowed by default. Compiling with CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure. This commit requires IP6PROTOSPACER, introduced in r211115. Reviewed by: bz, simon Approved by: ken (mentor) MFC after: 2 weeks Notes: svn path=/head/; revision=211157
* MFp4 CH180235:Bjoern A. Zeeb2010-08-091-0/+17
| | | | | | | | | | | | | | Add proto spacers to inet6sw like we have for legacy IP. This allows us to dynamically pf_proto_register() for INET6 from modules, needed by upcoming CARP changes and SeND. MC and SCTP could make use of it as well in theory in the future after upcoming VIMAGE vnet teardown work. Discussed with: will, anchie MFC after: 10 days Notes: svn path=/head/; revision=211115