path: root/sys/netinet6/nd6.c
Commit message | Author | Age | Files | Lines
* netinet6: Remove ndpr_raf_ra_derived flag | Hiroki Sato | 2025-06-12 | 1 | -5/+3
  This flag was introduced at 8036234c72c9361711e867cc1a0c6a7fe0babd84 to prevent the SIOCSPFXFLUSH_IN6 ioctl from removing manually-added entries. However, this flag did not actually work due to an incomplete implementation making prelist_update() not handle it before calling nd6_prelist_add().
  This patch removes the flag because a prefix derived from an RA always has an entry in the ndpr_advrtrs member in the struct nd_prefix. Having a separate flag is not a good idea because it can cause a mismatch between the flag and the ndpr_advrtrs entry. Testing using LIST_EMPTY() is simpler for the original goal.
  This also removes the prefix check in the ICMPV6CTL_ND6_PRLIST sysctl that excluded manually-added entries. This sysctl is designed to list all entries, and there is no relationship to SIOCSPFXFLUSH_IN6.
  Differential Revision: https://reviews.freebsd.org/D46441
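The idea in the message above, deriving "came from an RA" from the advertising-router list instead of a separate flag, can be sketched in userland C. This is only an illustration: `struct adv_router`, `struct prefix`, `prefix_is_ra_derived()` and `count_flushable()` are hypothetical stand-ins, not the kernel's `struct nd_prefix` or its list macros.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct adv_router {
	struct adv_router *next;
};

struct prefix {
	struct adv_router *advrtrs;	/* plays the role of ndpr_advrtrs */
};

/*
 * A prefix counts as RA-derived when at least one router advertised it.
 * There is no second flag that could drift out of sync with the list.
 */
static bool
prefix_is_ra_derived(const struct prefix *pr)
{
	return (pr->advrtrs != NULL);
}

/* Count the prefixes a flush would remove: only the RA-derived ones. */
static size_t
count_flushable(const struct prefix *prefixes, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (prefix_is_ra_derived(&prefixes[i]))
			count++;
	}
	return (count);
}
```

Because the answer is computed from the list itself, a manually-added prefix (empty list) can never be mistaken for an advertised one.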
* netinet6: Remove a set but not used global variable in6_maxmtu | Zhenlei Huang | 2025-05-21 | 1 | -4/+0
  and its setter in6_setmaxmtu().
  This variable was introduced by the KAME project [1]. It holds the max IPv6 MTU through all the interfaces, but is never used anywhere.
  [1] 82cd038d51e2 KAME netinet6 basic part(no IPsec,no V6 Multicast Forwarding, no UDP/TCP for IPv6 yet)
  Reviewed by: glebius
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D49357
* inet6: add the missing lock acquire to nd6_get_llentry | Mateusz Guzik | 2025-02-10 | 1 | -0/+1
  Reported by: Lexi Winter
  PR: 282378
  Sponsored by: Rubicon Communications, LLC ("Netgate")
* nd6: Fix the routing table subscription | Mark Johnston | 2024-07-25 | 1 | -3/+3
  The nd6 code listens for RTM_DELETE events so that it can mark the corresponding default router as inactive in the case where the default route is deleted. A subsequent RA from the router may then reinstall the default route.
  Commit fedeb08b6a58e broke this for non-multipath routes, as rib_decompose_notification() only invokes the callback for multipath routes. Restore the old behaviour. Also ensure that we update the router only for RTM_DELETE notifications, lost in commit 2259a03020fe0.
  Reviewed by: bz
  Fixes: fedeb08b6a58 ("Introduce scalable route multipath.")
  Fixes: 2259a03020fe ("Rework part of routing code to reduce difference to D26449.")
  MFC after: 2 weeks
  Sponsored by: Klara, Inc.
  Sponsored by: Bell Tower Integration
  Differential Revision: https://reviews.freebsd.org/D46020
* icmp6: move ICMPv6 related tunables to the files where they are used | Gleb Smirnoff | 2024-03-24 | 1 | -13/+31
  Most of them can be declared as static after the move out of in6_proto.c. Keeping sysctl(9) declarations with their text descriptions next to the variable declaration creates self-documenting code. There should be no functional changes.
  Differential Revision: https://reviews.freebsd.org/D44481
* sys: Remove $FreeBSD$: one-line .c pattern | Warner Losh | 2023-08-16 | 1 | -2/+0
  Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
* IfAPI: Explicitly include <net/if_private.h> in netstack | Justin Hibbits | 2023-01-31 | 1 | -0/+1
  Summary: In preparation of making if_t completely opaque outside of the netstack, explicitly include the header. <net/if_var.h> will stop including the header in the future.
  Sponsored by: Juniper Networks, Inc.
  Reviewed by: glebius, melifaro
  Differential Revision: https://reviews.freebsd.org/D38200
* nd6: fix panic in lltable_drop_entry_queue() | Alexander V. Chernikov | 2023-01-15 | 1 | -3/+6
  nd6_resolve_slow() can be called without mbuf. If the LLE entry is not reachable, nd6_resolve_slow() will add this NULL mbuf to the holdchain via lltable_append_entry_queue, which will "append" NULL to the end of the queue (effectively a no-op) and bump the la_numhold value. When this entry gets freed, the kernel will panic due to the inconsistency between the number of mbufs in the queue and the value of la_numhold.
  Fix the panic by checking that the mbuf is not NULL prior to inserting it into the holdchain.
  Reported by: kib
  MFC after: 3 days
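The invariant described above (the counter must always equal the number of queued packets) can be illustrated with a small userland sketch. `struct pkt`, `struct entry`, `entry_append_hold()` and `entry_chain_length()` are hypothetical names chosen for this example; they only mimic the shape of an mbuf hold chain and its counter and are not the kernel's lltable code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a packet buffer (struct mbuf in the kernel). */
struct pkt {
	struct pkt *next;
};

/* Hypothetical stand-in for a link-layer entry with a hold chain. */
struct entry {
	struct pkt *hold;	/* head of the chain of held packets */
	int numheld;		/* must equal the length of the chain */
};

/*
 * Append a packet to the hold chain. A NULL packet is ignored, so the
 * counter is only bumped when something was really queued.
 */
static void
entry_append_hold(struct entry *e, struct pkt *m)
{
	struct pkt **tail;

	if (m == NULL)
		return;
	m->next = NULL;
	for (tail = &e->hold; *tail != NULL; tail = &(*tail)->next)
		;
	*tail = m;
	e->numheld++;
}

/* Walk the chain and return how many packets are actually queued. */
static int
entry_chain_length(const struct entry *e)
{
	const struct pkt *p;
	int n = 0;

	for (p = e->hold; p != NULL; p = p->next)
		n++;
	return (n);
}
```

Without the NULL check, a call with no packet would leave the chain unchanged but still increment the counter, which is the mismatch the commit message describes.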
* Import the WireGuard driver from zx2c4.com. | John Baldwin | 2022-10-28 | 1 | -2/+2
  This commit brings back the driver from FreeBSD commit f187d6dfbf633665ba6740fe22742aec60ce02a2 plus subsequent fixes from upstream.
  Relative to upstream this commit includes a few other small fixes such as additional INET and INET6 #ifdef's, #include cleanups, and updates for recent API changes in main.
  Reviewed by: pauamma, gbe, kevans, emaste
  Obtained from: git@git.zx2c4.com:wireguard-freebsd @ 3cc22b2
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D36909
* routing: constantify @rc in rib_decompose_notification(). | Alexander V. Chernikov | 2022-08-29 | 1 | -1/+1
  Clarify the @rc immutability by explicitly marking @rc const.
  MFC after: 2 weeks
* routing: make rib_add_redirect() use new nhop-based KPI | Alexander V. Chernikov | 2022-08-29 | 1 | -5/+2
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D36169
* netinet6: fix SIOCSPFXFLUSH_IN6 by skipping manually-configured prefixes | Alexander V. Chernikov | 2022-08-24 | 1 | -3/+4
  Summary: Currently netinet6/ code allocates IPv6 prefixes (nd_prefix) for both manually-assigned addresses and advertised prefixes. As a result, prefixes from manually-assigned addresses can be seen in the `ndp -p` list and be cleared via `ndp -P`. The latter relies on the SIOCSPFXFLUSH_IN6 ioctl to clear the prefix list. The original intent of SIOCSPFXFLUSH_IN6 was to clear prefixes originating from the advertising routers:
  ```
  1998-09-02 JINMEI, Tatuya <jinmei@isl.rdc.toshiba.co.jp>
  	* nd6.c (nd6_ioctl): added 2 new ioctls; SIOCSRTRFLUSH_IN6 and
  	SIOCSPFXFLUSH_IN6. The former is to flush all default routers
  	in the default router list, and the latter is to flush all the
  	prefixes and the addresses derived from them in the prefix list.
  ```
  Restore the intent by marking prefixes derived from the RA messages with the newly-added ndpr_flags.ra_derived flag and skipping prefixes not marked with this flag during deletion and listing.
  Differential Revision: https://reviews.freebsd.org/D36312
  MFC after: 2 weeks
* netinet6: allow ND entries creation for all directly-reachable | Alexander V. Chernikov | 2022-08-10 | 1 | -68/+22
  destinations.
  The current assumption is that kernel-handled rtadv prefixes along with the interface address prefixes are the only prefixes considered in the ND neighbor eligibility code. Change this by allowing any non-gateway routes to be eligible. This will allow DHCPv6-controlled routes to be correctly handled by the ND code.
  Refactor nd6_is_new_addr_neighbor() to enable more deterministic performance in the "found" case and remove the unneeded V_rt_add_addr_allfibs handling logic.
  Reviewed By: kbowling
  Differential Revision: https://reviews.freebsd.org/D23695
  MFC after: 1 month
* inet6(4): Fix a typo in a source code comment | Gordon Bergling | 2022-08-07 | 1 | -1/+1
  - s/Unreachablity/Unreachability/
  MFC after: 3 days
* Adjust function definition in nd6.c to avoid clang 15 warnings | Dimitry Andric | 2022-07-26 | 1 | -1/+1
  With clang 15, the following -Werror warning is produced:
    sys/netinet6/nd6.c:247:12: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
    nd6_destroy()
               ^
                void
  This is because nd6_destroy() is declared with a (void) argument list, but defined with an empty argument list. Make the definition match the declaration.
  MFC after: 3 days
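The mismatch described above can be shown with a minimal sketch. `demo_destroy()` and the `destroyed` variable are hypothetical names used only for this illustration; the point is simply that the definition repeats the `(void)` parameter list of the declaration.

```c
/* Declaration with an explicit prototype: the function takes no arguments. */
void demo_destroy(void);

/* Hypothetical state, only here so the function has an observable effect. */
static int destroyed;

/*
 * The definition uses the same (void) parameter list as the declaration.
 * Writing "demo_destroy()" here instead would define the function without
 * a prototype, which is the pattern -Wstrict-prototypes complains about.
 */
void
demo_destroy(void)
{
	destroyed = 1;
}
```

Keeping declaration and definition textually identical in their parameter lists avoids the warning without changing behavior.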
* netinet6: Fix mbuf leak in NDP | Arseny Smalyuk | 2022-05-31 | 1 | -42/+11
  Mbufs leak when manually removing incomplete NDP records with pending packets via ndp -d. It happens because lltable_drop_entry_queue() relies on the `la_numheld` counter when dropping NDP entries (lles). It turned out the NDP code never increased `la_numheld`, so the actual free never happened.
  Fix the issue by introducing a unified lltable_append_entry_queue(), common for both ARP and NDP code, properly addressing packet queue maintenance.
  Reviewed By: melifaro
  Differential Revision: https://reviews.freebsd.org/D35365
  MFC after: 2 weeks
* net: Fix memory leaks in lltable_calc_llheader() error paths | Mark Johnston | 2022-04-08 | 1 | -1/+3
  Also convert raw epoch_call() calls to lltable_free_entry() calls, no functional change intended. There's no need to asynchronously free the LLEs in that case to begin with, but we might as well use the lltable interfaces consistently.
  Noticed by code inspection; I believe lltable_calc_llheader() failures do not generally happen in practice.
  Reviewed by: bz
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D34832
* inet6(4): Fix a few common typos in source code comments | Gordon Bergling | 2021-08-28 | 1 | -2/+2
  - s/reshedule/reschedule/
  MFC after: 3 days
* lltable: Add support for "child" LLEs holding encap for IPv4oIPv6 entries. | Alexander V. Chernikov | 2021-08-21 | 1 | -35/+141
  Currently we use pre-calculated headers inside LLE entries as prepend data for `if_output` functions. Using these headers allows saving some CPU cycles/memory accesses on the fast path.
  However, this approach makes adding an L2 header for IPv4 traffic with IPv6 nexthops more complex, as it is not possible to store multiple pre-calculated headers inside an lle. Additionally, the solution space is limited by the fact that PCB caching saves LLEs in addition to the nexthop.
  Thus, add support for creating special "child" LLEs for the purpose of holding custom family encaps and storing mbufs pending resolution. To simplify handling of those LLEs, store them in a linked list inside a "parent" (e.g. normal) LLE. Such LLEs are not visible when iterating the LLE table. Their lifecycle is bound to the "parent" LLE: it is not possible to delete a "child" while its parent is alive. Furthermore, "child" LLEs are static (RTF_STATIC), avoiding the complex state machine used by the standard LLEs.
  nd6_lookup() and nd6_resolve() now accept an additional argument, family, allowing them to return such child LLEs. This change uses the `LLE_SF()` macro, which packs family and flags in a single int field. This is done to simplify merging back to stable/. Once this code lands, most of the cases will be converted to use a dedicated `family` parameter.
  Differential Revision: https://reviews.freebsd.org/D31379
  MFC after: 2 weeks
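Packing an address family and a flags word into one int, as the message says `LLE_SF()` does, might look like the sketch below. The bit layout (flags in the low 16 bits, family in the bits above) and the names `SF_PACK`, `SF_FAMILY` and `SF_FLAGS` are assumptions made for this illustration; the actual kernel macro may be defined differently.

```c
#include <stdint.h>

/*
 * Assumed layout for illustration only: the low 16 bits carry the flags,
 * the bits above carry the address family.
 */
#define SF_PACK(fam, flags) \
	((int)(((uint32_t)(flags) & 0xFFFFu) | ((uint32_t)(fam) << 16)))

/* Recover the address family from a packed value. */
#define SF_FAMILY(v)	((int)((uint32_t)(v) >> 16))

/* Recover the flags from a packed value. */
#define SF_FLAGS(v)	((int)((uint32_t)(v) & 0xFFFFu))
```

The appeal of this approach is that existing function signatures taking a single `int flags` argument do not need to change, which fits the message's stated goal of keeping merges to stable/ simple.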
* nd6: Mark several callouts as MPSAFE | Mark Johnston | 2021-08-09 | 1 | -2/+2
  The use of Giant here is vestigial and does not provide any useful synchronization. Furthermore, non-MPSAFE callouts can cause the softclock threads to block waiting for long-running newbus operations to complete.
  Reported by: mav
  Reviewed by: bz
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D31470
* [lltable] Restructure nd6 code. | Alexander V. Chernikov | 2021-08-07 | 1 | -40/+74
  Factor out lltable locking logic from lltable_try_set_entry_addr() into a separate lltable_acquire_wlock(), so the latter can be used in other parts of the code w/o duplication.
  Create nd6_try_set_entry_addr() to avoid code duplication in nd6.c and nd6_nbr.c.
  Move lle creation logic from nd6_resolve_slow() into a separate nd6_get_llentry() to simplify the former.
  These changes serve as a pre-requisite for implementing RFC8950 (IPv4 prefixes with IPv6 nexthops).
  Differential Revision: https://reviews.freebsd.org/D31432
  MFC after: 2 weeks
* Use lltable calculated header when sending lle holdchain after successful lle resolution. | Alexander V. Chernikov | 2021-08-05 | 1 | -11/+19
  Subscribers: imp, ae, bz
  Differential Revision: https://reviews.freebsd.org/D31391
* [lltable] Unify datapath feedback mechanism. | Alexander V. Chernikov | 2021-08-04 | 1 | -24/+8
  Use the newly-created llentry_request_feedback(), llentry_mark_used() and llentry_get_hittime() to request a datapath usage check and fetch the results in the same fashion both in IPv4 and IPv6.
  While here, simplify the llentry_provide_feedback() wrapper by eliminating 1 condition check.
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D31390
* base: remove if_wg(4) and associated utilities, manpage | Kyle Evans | 2021-03-17 | 1 | -2/+2
  After lengthy discussion, we've decided that the if_wg(4) driver and related work is not yet ready to live in the tree. This driver has larger security implications than many, and thus will be held to more scrutiny than other drivers.
  Please also see the related message sent to the freebsd-hackers@ and freebsd-arch@ lists by Kyle Evans <kevans@FreeBSD.org> on 2021/03/16, with the subject line "Removing WireGuard Support From Base" for additional context.
* if_wg: import latest fixup work from the wireguard-freebsd project | Kyle Evans | 2021-03-15 | 1 | -2/+2
  This is the culmination of about a week of work from three developers to fix a number of functional and security issues. This patch consists of work done by the following folks:
  - Jason A. Donenfeld <Jason@zx2c4.com>
  - Matt Dunwoodie <ncon@noconroy.net>
  - Kyle Evans <kevans@FreeBSD.org>
  Notable changes include:
  - Packets are now correctly staged for processing once the handshake has completed, resulting in less packet loss in the interim.
  - Various race conditions have been resolved, particularly w.r.t. socket and packet lifetime (panics)
  - Various tests have been added to assure correct functionality and tooling conformance
  - Many security issues have been addressed
  - if_wg now maintains jail-friendly semantics: sockets are created in the interface's home vnet so that it can act as the sole network connection for a jail
  - if_wg no longer fails to remove peer allowed-ips of 0.0.0.0/0
  - if_wg now exports via ioctl a format that is future proof and complete. It is additionally supported by the upstream wireguard-tools (which we plan to merge into base soon)
  - if_wg now conforms to the WireGuard protocol and is more closely aligned with security auditing guidelines
  Note that the driver has been rebased away from using iflib. iflib poses a number of challenges for a cloned device trying to operate in a vnet that are non-trivial to solve and adds complexity to the implementation for little gain.
  The crypto implementation that was previously added to the tree was a super complex integration of what previously appeared in an old out of tree Linux module, which has been reduced to crypto.c containing simple boring reference implementations. This is part of a near-to-mid term goal to work with FreeBSD kernel crypto folks and take advantage of or improve accelerated crypto already offered elsewhere.
  There's additional test suite effort underway out-of-tree taking advantage of the aforementioned jail-friendly semantics to test a number of real-world topologies, based on netns.sh.
  Also note that this is still a work in progress; work going further will be much smaller in nature.
  MFC after: 1 month (maybe)
* arp/nd: Cope with late calls to iflladdr_event | Kristof Provost | 2021-02-23 | 1 | -0/+2
  When tearing down vnet jails we can move an if_bridge out (as part of the normal vnet_if_return()). This can, when it's clearing out its list of member interfaces, change its link layer address. That sends an iflladdr_event, but at that point we've already freed the AF_INET/AF_INET6 if_afdata pointers.
  In other words: when the iflladdr_event callbacks fire we can't assume that ifp->if_afdata[AF_INET] will be set.
  Reviewed by: donner@, melifaro@
  MFC after: 1 week
  Sponsored by: Orange Business Services
  Differential Revision: https://reviews.freebsd.org/D28860
* When we are about to send down to the driver layer | Randall Stewart | 2021-01-27 | 1 | -0/+1
  we need to make sure that the m_nextpkt field is NULL else the lower layers may do unwanted things.
  Reviewed By: gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D28377
* Bump amount of queued packets for unresolved ARP/NDP entries to 16. | Alexander V. Chernikov | 2021-01-11 | 1 | -1/+1
  Currently the default behaviour is to keep only 1 packet per unresolved entry. The ability to queue more than one packet was added 10 years ago, in r215207, though the default value was kept intact.
  Things have changed since that time. Systems tend to initiate multiple connections at once for a variety of reasons. For example, the recent kern/252278 bug report describes happy-eyeball DNS behaviour sending multiple requests to the DNS server.
  The primary driver for determining the upper value of the queue length is memory consumption. Remote actors should not be able to easily exhaust local memory by sending packets to unresolved arp/ND entries.
  For now, bump the value to 16 packets, to match the Darwin implementation.
  The proper approach would be to switch the limit to calculate memory consumption instead of packet count and limit based on memory.
  We should MFC this with a variation of D22447.
  Reviewers: #manpages, #network, bz, emaste
  Reviewed By: emaste, gbe(doc), jilles(doc)
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D28068
* Remove RADIX_MPATH config option. | Alexander V. Chernikov | 2020-11-29 | 1 | -5/+1
  ROUTE_MPATH is the new config option controlling the new multipath routing implementation. Remove the last pieces of RADIX_MPATH-related code and the config option.
  Reviewed by: glebius
  Differential Revision: https://reviews.freebsd.org/D27244
  Notes: svn path=/head/; revision=368164
* IPv6: set ifdisabled in the kernel rather than in rc | Bjoern A. Zeeb | 2020-11-25 | 1 | -1/+8
  Enable ND6_IFF_IFDISABLED when the interface is created in the kernel before returning to user space.
  This avoids a race when an interface is created by a program which also calls ifconfig IF inet6 -ifdisabled and races with the devd -> /etc/pccard_ether -> .. netif start IF -> ifdisabled calls (the devd/rc framework disabling IPv6 again after the program had enabled it already).
  In case the global net.inet6.ip6.accept_rtadv was turned on, we also default to enabling IPv6 on the interfaces, rather than disabling them.
  PR: 248172
  Reported by: Gert Doering (gert greenie.muc.de)
  Reviewed by: glebius (, phk)
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D27324
  Notes: svn path=/head/; revision=368031
* Introduce scalable route multipath. | Alexander V. Chernikov | 2020-10-03 | 1 | -0/+5
  This change is based on the nexthop objects landed in D24232.
  The change introduces the concept of nexthop groups. Each group contains the collection of nexthops with their relative weights and a dataplane-optimized structure to enable efficient nexthop selection.
  Similar to the nexthops, nexthop groups are immutable. The dataplane part gets compiled during group creation and is basically an array of nexthop pointers, compiled w.r.t. their weights.
  With this change, the `rt_nhop` field of `struct rtentry` contains either a nexthop or a nexthop group. They are distinguished by the presence of the NHF_MULTIPATH flag. All dataplane lookup functions return a pointer to the nexthop object, leaving nexthop group details inside the routing subsystem.
  User-visible changes:
  The change is intended to be backward-compatible: all non-mpath operations should work as before with ROUTE_MPATH and net.route.multipath=1.
  All routes now come with a weight; the default weight is 1, the maximum is 2^24-1.
  Current maximum multipath group width is statically set to 64. This will become sysctl-tunable in the followup changes.
  Using functionality:
  * Recompile kernel with ROUTE_MPATH
  * set net.route.multipath to 1
    route add -6 2001:db8::/32 2001:db8::2 -weight 10
    route add -6 2001:db8::/32 2001:db8::3 -weight 20
    netstat -6On
    Nexthop groups data
    Internet6:
    GrpIdx  NhIdx   Weight  Slots   Gateway       Netif   Refcnt
    1       ------- ------- ------- ------------- ------- 1
            13      10      1       2001:db8::2   vlan2
            14      20      2       2001:db8::3   vlan2
  Next steps:
  * Land outbound hashing for locally-originated routes ( D26523 ).
  * Fix net/bird multipath (net/frr seems to work fine)
  * Add ROUTE_MPATH to GENERIC
  * Set net.route.multipath=1 by default
  Tested by: olivier
  Reviewed by: glebius
  Relnotes: yes
  Differential Revision: https://reviews.freebsd.org/D26449
  Notes: svn path=/head/; revision=366390
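The phrase "an array of nexthop pointers, compiled w.r.t. their weights" can be made concrete with a small sketch. The reduction used here (divide every weight by the greatest common divisor and give each nexthop that many slots) is an assumption chosen because it reproduces the 1-slot and 2-slot split shown in the netstat output above for weights 10 and 20; the kernel's actual compilation algorithm may differ. `struct nhop`, `compile_group()` and `select_nhop()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified nexthop: a gateway label and a weight. */
struct nhop {
	const char *gw;
	unsigned weight;
};

/* Greatest common divisor; gcd_u(0, w) returns w. */
static unsigned
gcd_u(unsigned a, unsigned b)
{
	while (b != 0) {
		unsigned t = a % b;
		a = b;
		b = t;
	}
	return (a);
}

/*
 * Fill slots[] so that each nexthop appears weight/gcd times.
 * Returns the number of slots used, or 0 if the weights are all zero
 * or the result would not fit into max_slots.
 */
static size_t
compile_group(const struct nhop *nh, size_t n,
    const struct nhop **slots, size_t max_slots)
{
	unsigned g = 0, k;
	size_t i, used = 0;

	for (i = 0; i < n; i++)
		g = gcd_u(g, nh[i].weight);
	if (g == 0)
		return (0);
	for (i = 0; i < n; i++) {
		for (k = 0; k < nh[i].weight / g; k++) {
			if (used == max_slots)
				return (0);
			slots[used++] = &nh[i];
		}
	}
	return (used);
}

/* Dataplane side: pick a nexthop with a single modulo on the flow id. */
static const struct nhop *
select_nhop(const struct nhop **slots, size_t nslots, uint32_t flowid)
{
	return (slots[flowid % nslots]);
}
```

With the two example routes, `compile_group()` yields three slots (one for 2001:db8::2, two for 2001:db8::3), so a uniformly distributed flow id sends roughly twice as many flows to the heavier nexthop while the lookup stays a single array index.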
* Rework part of routing code to reduce difference to D26449. | Alexander V. Chernikov | 2020-09-21 | 1 | -10/+15
  * Split rt_setmetrics into get_info_weight() and rt_set_expire_info(), as these two can be applied at different entities and at different times.
  * Start filling route weight in route change notifications
  * Pass flowid to UDP/raw IP route lookups
  * Rework nd6_subscription_cb() and sysctl_dumpentry() to prepare for the fact that rtentry can contain multiple nexthops.
  Differential Revision: https://reviews.freebsd.org/D26497
  Notes: svn path=/head/; revision=365973
* net: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -16/+2
  Notes: svn path=/head/; revision=365071
* Transition from rtrequest1_fib() to rib_action(). | Alexander V. Chernikov | 2020-07-21 | 1 | -1/+2
  Remove all variations of rtrequest <rtrequest1_fib, rtrequest_fib, in6_rtrequest, rtrequest_fib> and their uses and switch to rib_action(). This is part of the new routing KPI.
  Submitted by: Neel Chauhan <neel AT neelc DOT org>
  Differential Revision: https://reviews.freebsd.org/D25546
  Notes: svn path=/head/; revision=363403
* Temporarily revert r363319 to unbreak the build. | Alexander V. Chernikov | 2020-07-19 | 1 | -2/+1
  Reported by: CI
  Pointy hat to: melifaro
  Notes: svn path=/head/; revision=363320
* Transition from rtrequest1_fib() to rib_action(). | Alexander V. Chernikov | 2020-07-19 | 1 | -1/+2
  Remove all variations of rtrequest <rtrequest1_fib, rtrequest_fib, in6_rtrequest, rtrequest_fib> and their uses and switch to rib_action(). This is part of the new routing KPI.
  Submitted by: Neel Chauhan <neel AT neelc DOT org>
  Differential Revision: https://reviews.freebsd.org/D25546
  Notes: svn path=/head/; revision=363319
* Switch inet6 default route subscription to the new rib subscription api. | Alexander V. Chernikov | 2020-07-12 | 1 | -25/+7
  The old subscription model allowed only a single customer. Switch inet6 to the new subscription api and eliminate the old model.
  Differential Revision: https://reviews.freebsd.org/D25615
  Notes: svn path=/head/; revision=363128
* Use epoch(9) for rtentries to simplify control plane operations. | Alexander V. Chernikov | 2020-05-23 | 1 | -0/+3
  Currently the only reason for refcounting rtentries is the need to report the rtable operation details immediately after the execution. Delaying rtentry reclamation makes it possible to stop refcounting and simplify the code. Additionally, this change makes it possible to reimplement rib_lookup_info(), which is used by some of the customers to get the matching prefix along with nexthops, in a more efficient way.
  The change keeps the per-vnet rtzone uma zone. It adds the nh_vnet field to nhop_priv to be able to reliably set curvnet even during vnet teardown. The rest of the reference counting code will be removed in D24867.
  Differential Revision: https://reviews.freebsd.org/D24866
  Notes: svn path=/head/; revision=361409
* IPv6: Fix a panic in the nd6 code with unmapped mbufs. | Andrew Gallatin | 2020-05-12 | 1 | -3/+21
  If the neighbor entry for an IPv6 TCP session using unmapped mbufs times out, IPv6 will send an icmp6 dest. unreachable message. In doing this, it will try to do a software checksum on the reflected packet. If this is a TCP session using unmapped mbufs, then there will be a kernel panic.
  To fix this, just free packets with unmapped mbufs, rather than sending the icmp.
  Reviewed by: np, rrs
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D24821
  Notes: svn path=/head/; revision=360982
* Add nhop to the ifa_rtrequest() callback. | Alexander V. Chernikov | 2020-04-29 | 1 | -4/+4
  With the upcoming multipath changes described in D24141, rt->rt_nhop can potentially point to a nexthop group instead of an individual nhop. To simplify caller handling of such cases, change the ifa_rtrequest() callback to pass the changed nhop directly.
  Differential Revision: https://reviews.freebsd.org/D24604
  Notes: svn path=/head/; revision=360475
* Move route_temporal.c and route_var.h to net/route. | Alexander V. Chernikov | 2020-04-28 | 1 | -1/+1
  Nexthop objects implementation, defined in r359823, introduced the sys/net/route directory intended to hold all routing-related code. Move the recently-introduced route_temporal.c and the private route_var.h header there.
  Differential Revision: https://reviews.freebsd.org/D24597
  Notes: svn path=/head/; revision=360449
* Move struct rtentry definition to nhop_var.h. | Alexander V. Chernikov | 2020-04-28 | 1 | -0/+1
  One of the goals of the new routing KPI defined in r359823 is to entirely hide `struct rtentry` from the consumers. This will make it possible to improve routing subsystem internals and deliver features much faster. This is one of the last changes, effectively moving the struct rtentry definition to a net/route_var.h header, internal to the routing subsystem.
  Differential Revision: https://reviews.freebsd.org/D24580
  Notes: svn path=/head/; revision=360447
* Convert rtentry field accesses into nhop field accesses. | Alexander V. Chernikov | 2020-04-23 | 1 | -9/+6
  One of the goals of the new routing KPI defined in r359823 is to entirely hide `struct rtentry` from the consumers. This will make it possible to improve routing subsystem internals and deliver more features much faster.
  This commit is a mostly mechanical change to eliminate direct struct rtentry field accesses.
  The only notable difference is AF_LINK gateway encoding. An AF_LINK gw is used in the routing stack for operations with interface routes and host loopback routes. In the former case it indicates _some_ non-NULL gateway, as the interface is the same as in rt_ifp in the kernel and rtm_ifindex in rtsock reporting. In the latter case the interface index inside the gateway was used by the IPv6 datapath to verify address scope for link-local interfaces.
  The kernel uses struct sockaddr_dl for this type of gateway. This structure allows for specifying rich interface data, such as mac address and interface name. However, this results in a relatively large structure size: 52 bytes. The routing stack fills in only 2 fields, sdl_index and sdl_type, which reside in the first 8 bytes of the structure.
  In the new KPI, struct nhop_object tries to be cache-efficient, hence embodies the gateway address inside the structure. In the AF_LINK case it stores a shortened version of the structure, struct sockaddr_dl_short, which occupies 16 bytes. After the D24340 changes, the data inside the AF_LINK gateway will not be used in the kernel at all, leaving rtsock as the only potential concern.
  The difference in rtsock reporting:
  (old)
    got message of size 240 on Thu Apr 16 03:12:13 2020
    RTM_ADD: Add Route: len 240, pid: 0, seq 0, errno 0, flags:<UP,DONE,PINNED>
    locks: inits:
    sockaddrs: <DST,GATEWAY,NETMASK>
     10.0.0.0 link#5 255.255.255.0
  (new)
    got message of size 200 on Sun Apr 19 09:46:32 2020
    RTM_ADD: Add Route: len 200, pid: 0, seq 0, errno 0, flags:<UP,DONE,PINNED>
    locks: inits:
    sockaddrs: <DST,GATEWAY,NETMASK>
     10.0.0.0 link#5 255.255.255.0
  Note the 40-byte difference (52-16 + alignment). However, the gateway is still a valid AF_LINK gateway with proper data filled in.
  It is worth noting that these particular messages (interface routes) are mostly ignored by routing daemons:
  * bird/quagga/frr uses RTM_NEWADDR and ignores prefix route addition messages.
  * quagga/frr ignores routes without gateway
  A more detailed overview of how rtsock messages are used by the routing daemons to reconstruct the kernel view can be found in D22974.
  Differential Revision: https://reviews.freebsd.org/D24519
  Notes: svn path=/head/; revision=360218
* Add nhop parameter to rti_filter callback. | Alexander V. Chernikov | 2020-04-16 | 1 | -2/+4
  One of the goals of the new routing KPI defined in r359823 is to entirely hide `struct rtentry` from the consumers. This will make it possible to improve routing subsystem internals and deliver more features much faster. This change is one of the ongoing changes to eliminate direct struct rtentry field accesses.
  Additionally, with the followup multipath changes, a single rtentry can point to multiple nexthops.
  With that in mind, convert the rti_filter callback used when traversing the routing table to accept the pair (rt, nhop) instead of nexthop.
  Reviewed by: ae
  Differential Revision: https://reviews.freebsd.org/D24440
  Notes: svn path=/head/; revision=360014
* nd6: sysctl | Bjoern A. Zeeb | 2019-11-19 | 1 | -12/+10
  Move the SYSCTL_DECL to the top of the file. Move the sysctl function before SYSCTL_PROC so that we don't need an extra function declaration in the middle of the file.
  No functional changes.
  MFC after: 3 weeks
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=354863
* nd6: make nd6_timer_ch static | Bjoern A. Zeeb | 2019-11-19 | 1 | -1/+1
  nd6_timer_ch is only used in file local context. There is no need to export it, so make it static.
  MFC after: 3 weeks
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=354862
* nd6: retire defrouter_select(), use _fib() variant. | Bjoern A. Zeeb | 2019-11-16 | 1 | -2/+2
  Burn bridges and replace the last two calls of defrouter_select() with defrouter_select_fib(). That allows us to retire defrouter_select() and make it more clear in the calling code that it applies to all FIBs.
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=354758
* nd6: simplify code | Bjoern A. Zeeb | 2019-11-15 | 1 | -7/+1
  We are taking the same actions in both cases of the branch inside the block. Simplify that code as the extra branch is not needed.
  MFC after: 3 weeks
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=354731
* nd6: make nd6_alloc() file static | Bjoern A. Zeeb | 2019-11-13 | 1 | -1/+1
  nd6_alloc() is a function used only locally. Make it static and no longer export it. Keeps the KPI smaller.
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=354681
* nd6 defrouter: consolidate nd_defrouter manipulations in nd6_rtr.c | Bjoern A. Zeeb | 2019-11-13 | 1 | -98/+8
  Move the nd_defrouter along with the sysctl handler from nd6.c to nd6_rtr.c and make the variable file static. Provide (temporary) new accessor functions for code manipulating nd_defrouter from nd6.c, and stop exporting functions no longer needed outside nd6_rtr.c. This also shuffles a few functions around in nd6_rtr.c without functional changes.
  Given all nd_defrouter logic is now in one place we can tidy up the code, locking, and other open items.
  MFC after: 3 weeks
  X-MFC: keep exporting the functions
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=354680