aboutsummaryrefslogtreecommitdiff
path: root/sys/netinet/if_ether.c
Commit message (Collapse)AuthorAgeFilesLines
* net: clean up empty lines in .c and .h filesMateusz Guzik2020-09-011-4/+0
| | | | Notes: svn path=/head/; revision=365071
* Complete conversions from fib<4|6>_lookup_nh_<basic|ext> to fib<4|6>_lookup().Alexander V. Chernikov2020-07-021-7/+9
| | | | | | | | | | | | | fib[46]_lookup_nh_ represents pre-epoch generation of fib api, providing less guarantees over pointer validness and requiring on-stack data copying. With no callers remaining, remove fib[46]_lookup_nh_ functions. Submitted by: Neel Chauhan <neel AT neelc DOT org> Differential Revision: https://reviews.freebsd.org/D25445 Notes: svn path=/head/; revision=362900
* Use interface fib for proxyarp checks.Alexander V. Chernikov2020-04-021-5/+9
| | | | | | | | | | | | | Before the change, proxyarp checks for src and dst addresses were performed using default fib, breaking multi-fib scenario. PR: 245181 Submitted by: Scott Aitken (original version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D24244 Notes: svn path=/head/; revision=359580
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)Pawel Biernacki2020-02-261-2/+6
| | | | | | | | | | | | | | | | | | | r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718 Notes: svn path=/head/; revision=358333
* White space cleanup -- remove trailing tab's or spacesRandall Stewart2020-02-121-2/+2
| | | | | | | | | from any line. Sponsored by: Netflix Inc. Notes: svn path=/head/; revision=357818
* Widen NET_EPOCH coverage.Gleb Smirnoff2019-10-071-21/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111 Notes: svn path=/head/; revision=353292
* Extract eventfilter declarations to sys/_eventfilter.hConrad Meyer2019-05-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h" in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header pollution substantially. EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c files into appropriate headers (e.g., sys/proc.h, powernv/opal.h). As a side effect of reduced header pollution, many .c files and headers no longer contain needed definitions. The remainder of the patch addresses adding appropriate includes to fix those files. LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by sys/mutex.h since r326106 (but silently protected by header pollution prior to this change). No functional change (intended). Of course, any out of tree modules that relied on header pollution for sys/eventhandler.h, sys/lock.h, or sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped. Notes: svn path=/head/; revision=347984
* Improve ARP logging.Bjoern A. Zeeb2019-03-091-2/+11
| | | | | | | | | | | | | | | | | | r344504 added an extra ARP_LOG() call in case of an if_output() failure. It turns out IPv4 can be noisy. In order to not spam the console by default: (a) add a counter for these events so people can keep better track of how often it happens, and (b) add a sysctl to select the default ARP_LOG log level and set it to INFO avoiding the one (the new) DEBUG level by default. Claim a spare (1st one after 10 years since the stats were added) in order to not break netstat from FreeBSD 12->13 updates in the future. Reviewed by: karels Differential Revision: https://reviews.freebsd.org/D19490 Notes: svn path=/head/; revision=344954
* Make arp code return (more) errors.Bjoern A. Zeeb2019-02-241-8/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | arprequest() is a void function and in case of error we simply return without any feedback. In case of any local operation or *if_output() failing no feedback is send up the stack for the packet which triggered the arp request to be sent. arpresolve_full() has three pre-canned possible errors returned (if we have not yet sent enough arp requests or if we tried often enough without success) otherwise "no error" is returned. Make arprequest() an "internal" function arprequest_internal() which does return a possible error to the caller. Preserve arprequest() as a void wrapper function for external consumers. In arpresolve_full() add an extra error checking. Use the arprequest_internal() function and only return an error if non of the three ones (mentioend above) are already set. This will return possible errors all the way up the stack and allows functions and programs to react on the send errors rather than leaving them in the dark. Also they might get more detailed feedback of why packets cannot be sent and they will receive it quicker. Reviewed by: karels, hselasky Differential Revision: https://reviews.freebsd.org/D18904 Notes: svn path=/head/; revision=344504
* garp: Fix vnet related panic for gratuitous arpKristof Provost2019-02-121-0/+4
| | | | | | | | | | | | | | | | | | | Gratuitous ARP packets are sent from a timer, which means we don't have a vnet context set. As a result we panic trying to send the packet. Set the vnet context based on the interface associated with the interface address. To reproduce: sysctl net.link.ether.inet.garp_rexmit_count=2 ifconfig vtnet1 10.0.0.1/24 up PR: 235699 Reviewed by: vangyzen@ MFC after: 1 week Notes: svn path=/head/; revision=344061
* Mechanical cleanup of epoch(9) usage in network stack.Gleb Smirnoff2019-01-091-14/+19
| | | | | | | | | | | | | | | | | | | | | | | | - Remove macros that covertly create epoch_tracker on thread stack. Such macros a quite unsafe, e.g. will produce a buggy code if same macro is used in embedded scopes. Explicitly declare epoch_tracker always. - Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read locking macros to what they actually are - the net_epoch. Keeping them as is is very misleading. They all are named FOO_RLOCK(), while they no longer have lock semantics. Now they allow recursion and what's more important they now no longer guarantee protection against their companion WLOCK macros. Note: INP_HASH_RLOCK() has same problems, but not touched by this commit. This is non functional mechanical change. The only functionally changed functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter epoch recursively. Discussed with: jtl, gallatin Notes: svn path=/head/; revision=342872
* Improve the comment for arpresolve_full() in if_ether.c.Bjoern A. Zeeb2018-11-171-3/+3
| | | | | | | | | No functional changes. MFC after: 6 weeks Notes: svn path=/head/; revision=340494
* Retire arpresolve_addr(), which is not used anywhere, from if_ether.c.Bjoern A. Zeeb2018-11-171-15/+0
| | | | Notes: svn path=/head/; revision=340493
* Use the new VNET_DEFINE_STATIC macro when we are defining static VNETAndrew Turner2018-07-241-6/+6
| | | | | | | | | | | variables. Reviewed by: bz Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16147 Notes: svn path=/head/; revision=336676
* ifnet: Replace if_addr_lock rwlock with epoch + mutexMatt Macy2018-05-181-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Run on LLNW canaries and tested by pho@ gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace. When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch: InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 4.98 0.00 4.42 0.00 4235592 33 83.80 4720653 2149771 1235 247.32 4.73 0.00 4.20 0.00 4025260 33 82.99 4724900 2139833 1204 247.32 4.72 0.00 4.20 0.00 4035252 33 82.14 4719162 2132023 1264 247.32 4.71 0.00 4.21 0.00 4073206 33 83.68 4744973 2123317 1347 247.32 4.72 0.00 4.21 0.00 4061118 33 80.82 4713615 2188091 1490 247.32 4.72 0.00 4.21 0.00 4051675 33 85.29 4727399 2109011 1205 247.32 4.73 0.00 4.21 0.00 4039056 33 84.65 4724735 2102603 1053 247.32 After the patch InMpps OMpps InGbs OGbs err TCP Est %CPU syscalls csw irq GBfree 5.43 0.00 4.20 0.00 3313143 33 84.96 5434214 1900162 2656 245.51 5.43 0.00 4.20 0.00 3308527 33 85.24 5439695 1809382 2521 245.51 5.42 0.00 4.19 0.00 3316778 33 87.54 5416028 1805835 2256 245.51 5.42 0.00 4.19 0.00 3317673 33 90.44 5426044 1763056 2332 245.51 5.42 0.00 4.19 0.00 3314839 33 88.11 5435732 1792218 2499 245.52 5.44 0.00 4.19 0.00 3293228 33 91.84 5426301 1668597 2121 245.52 Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D15366 Notes: svn path=/head/; revision=333813
* Remove support for the Arcnet protocol.Brooks Davis2018-04-131-4/+0
| | | | | | | | | | | | | | | | | | While Arcnet has some continued deployment in industrial controls, the lack of drivers for any of the PCI, USB, or PCIe NICs on the market suggests such users aren't running FreeBSD. Evidence in the PR database suggests that the cm(4) driver (our sole Arcnet NIC) was broken in 5.0 and has not worked since. PR: 182297 Reviewed by: jhibbits, vangyzen Relnotes: yes Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15057 Notes: svn path=/head/; revision=332490
* Remove support for FDDI networks.Brooks Davis2018-04-111-4/+0
| | | | | | | | | | | | Defines in net/if_media.h remain in case code copied from ifconfig is in use elsewere (supporting non-existant media type is harmless). Reviewed by: kib, jhb Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15017 Notes: svn path=/head/; revision=332412
* Fix outgoing TCP/UDP packet drop on arp/ndp entry expiration.Alexander V. Chernikov2018-03-171-12/+4
| | | | | | | | | | | | | | | | | | | | | | Current arp/nd code relies on the feedback from the datapath indicating that the entry is still used. This mechanism is incorporated into the arpresolve()/nd6_resolve() routines. After the inpcb route cache introduction, the packet path for the locally-originated packets changed, passing cached lle pointer to the ether_output() directly. This resulted in the arp/ndp entry expire each time exactly after the configured max_age interval. During the small window between the ARP/NDP request and reply from the router, most of the packets got lost. Fix this behaviour by plugging datapath notification code to the packet path used by route cache. Unify the notification code by using single inlined function with the per-AF callbacks. Reported by: sthaug at nethelp.no Reviewed by: ae MFC after: 2 weeks Notes: svn path=/head/; revision=331098
* sys: further adoption of SPDX licensing ID tags.Pedro F. Giffuni2017-11-201-0/+2
| | | | | | | | | | | | | | | | | Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point. Notes: svn path=/head/; revision=326023
* Fix comment typo.Oleg Bulyzhin2017-08-091-1/+1
| | | | Notes: svn path=/head/; revision=322307
* Fix the L2 address printed in the "arp: %s moved from %*D" message.Andrey V. Elsukov2017-03-111-1/+1
| | | | | | | | | | | In the r292978 struct llentry was changed and the ll_addr field become the pointer. PR: 217667 MFC after: 1 week Notes: svn path=/head/; revision=315050
* Renumber copyright clause 4Warner Losh2017-02-281-1/+1
| | | | | | | | | | | | Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96 Notes: svn path=/head/; revision=314436
* Use inet_ntoa_r() instead of inet_ntoa() throughout the kernelEric van Gyzen2017-02-161-9/+17
| | | | | | | | | | | | | | | inet_ntoa() cannot be used safely in a multithreaded environment because it uses a static local buffer. Instead, use inet_ntoa_r() with a buffer on the caller's stack. Suggested by: glebius, emaste Reviewed by: gnn MFC after: 2 weeks Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D9625 Notes: svn path=/head/; revision=313821
* Add GARP retransmit capabilityEric van Gyzen2016-10-021-0/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A single gratuitous ARP (GARP) is always transmitted when an IPv4 address is added to an interface, and that is usually sufficient. However, in some circumstances, such as when a shared address is passed between cluster nodes, this single GARP may occasionally be dropped or lost. This can lead to neighbors on the network link working with a stale ARP cache and sending packets destined for that address to the node that previously owned the address, which may not respond. To avoid this situation, GARP retransmissions can be enabled by setting the net.link.ether.inet.garp_rexmit_count sysctl to a value greater than zero. The setting represents the maximum number of retransmissions. The interval between retransmissions is calculated using an exponential backoff algorithm, doubling each time, so the retransmission intervals are: {1, 2, 4, 8, 16, ...} (seconds). Due to the exponential backoff algorithm used for the interval between GARP retransmissions, the maximum number of retransmissions is limited to 16 for sanity. This limit corresponds to a maximum interval between retransmissions of 2^16 seconds ~= 18 hours. Increasing this limit is possible, but sending out GARPs spaced days apart would be of little use. Submitted by: David A. Bright <david.a.bright@dell.com> MFC after: 1 month Relnotes: yes Sponsored by: Dell EMC Differential Revision: https://reviews.freebsd.org/D7695 Notes: svn path=/head/; revision=306577
* Fix per-connection L2 caching in fast pathMike Karels2016-07-221-1/+8
| | | | | | | | | | | | r301217 re-added per-connection L2 caching from a previous change, but it omitted caching in the fast path. Add it. Reviewed By: gallatin Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D7239 Notes: svn path=/head/; revision=303171
* Introduce a per-VNET flag to enable/disable netisr prcessing on that VNET.Bjoern A. Zeeb2016-06-031-5/+25
| | | | | | | | | | | | | | | | | | | | | | | Add accessor functions to toggle the state per VNET. The base system (vnet0) will always enable itself with the normal registration. We will share the registered protocol handlers in all VNETs minimising duplication and management. Upon disabling netisr processing for a VNET drain the netisr queue from packets for that VNET. Update netisr consumers to (de)register on a per-VNET start/teardown using VNET_SYS(UN)INIT functionality. The change should be transparent for non-VIMAGE kernels. Reviewed by: gnn (, hiren) Obtained from: projects/vnet MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D6691 Notes: svn path=/head/; revision=301270
* This change re-adds L2 caching for TCP and UDP, as originally added in D4306George V. Neville-Neil2016-06-021-5/+15
| | | | | | | | | | | | but removed due to other changes in the system. Restore the llentry pointer to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as appropriate. Submitted by: Mike Karels Differential Revision: https://reviews.freebsd.org/D6262 Notes: svn path=/head/; revision=301217
* netinet: for pointers replace 0 with NULL.Pedro F. Giffuni2016-04-151-1/+1
| | | | | | | | | | | These are mostly cosmetical, no functional change. Found with devel/coccinelle. Reviewed by: ae. tuexen Notes: svn path=/head/; revision=298066
* Remove second EVENTHANDLER_REGISTER slipped in r292978.Alexander V. Chernikov2016-01-011-3/+0
| | | | | | | Describe the reason of doing unconditional M_PREPEND in ether_output(). Notes: svn path=/head/; revision=293035
* Implement interface link header precomputation API.Alexander V. Chernikov2015-12-311-36/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add if_requestencap() interface method which is capable of calculating various link headers for given interface. Right now there is support for INET/INET6/ARP llheader calculation (IFENCAP_LL type request). Other types are planned to support more complex calculation (L2 multipath lagg nexthops, tunnel encap nexthops, etc..). Reshape 'struct route' to be able to pass additional data (with is length) to prepend to mbuf. These two changes permits routing code to pass pre-calculated nexthop data (like L2 header for route w/gateway) down to the stack eliminating the need for other lookups. It also brings us closer to more complex scenarios like transparently handling MPLS nexthops and tunnel interfaces. Last, but not least, it removes layering violation introduced by flowtable code (ro_lle) and simplifies handling of existing if_output consumers. ARP/ND changes: Make arp/ndp stack pre-calculate link header upon installing/updating lle record. Interface link address change are handled by re-calculating headers for all lles based on if_lladdr event. After these changes, arpresolve()/nd6_resolve() returns full pre-calculated header for supported interfaces thus simplifying if_output(). Move these lookups to separate ether_resolve_addr() function which ether returs error or fully-prepared link header. Add <arp|nd6_>resolve_addr() compat versions to return link addresses instead of pre-calculated data. BPF changes: Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT. Despite the naming, both of there have ther header "complete". The only difference is that interface source mac has to be filled by OS for AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside BPF and not pollute if_output() routines. Convert BPF to pass prepend data via new 'struct route' mechanism. Note that it does not change non-optimized if_output(): ro_prepend handling is purely optional. Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI. It is not needed for ethernet anymore. The only remaining FDDI user is dev/pdq mostly untouched since 2007. FDDI support was eliminated from OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65). Flowtable changes: Flowtable violates layering by saving (and not correctly managing) rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated header data from that lle. Differential Revision: https://reviews.freebsd.org/D4102 Notes: svn path=/head/; revision=292978
* Fix typo (s/harware/hardware/)Kevin Lo2015-12-251-3/+3
| | | | Notes: svn path=/head/; revision=292730
* Revert r292275 & r292379Steven Hartland2015-12-171-81/+15
| | | | | | | | | | glebius has concerns about these changes so reverting those can be discussed and addressed. Sponsored by: Multiplay Notes: svn path=/head/; revision=292402
* Fix issues introduced by r292275Steven Hartland2015-12-161-4/+7
| | | | | | | | | | | | | | | * Fix panic for etherswitches which don't have a LLADDR. * Disabled DELAY in unsolicited NDA, which needs further work. * Fixed missing DELAY in carp_send_na. * style(9) fix. Reported by: kp & melifaro X-MFC-With: r292275 MFC after: 1 month Sponsored by: Multiplay Notes: svn path=/head/; revision=292379
* Fix ARP reply handling changed in r286955.Alexander V. Chernikov2015-12-161-4/+12
| | | | | | | | | | | If source of ARP request didn't pass the routing check (e.g. not in directly connected network), be polite and still answer the request instead of dropping frame. Reported by: quadro at irc@rusnet Notes: svn path=/head/; revision=292329
* Fix lagg failover due to missing notificationsSteven Hartland2015-12-151-15/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using lagg failover mode neither Gratuitous ARP (IPv4) or Unsolicited Neighbour Advertisements (IPv6) are sent to notify other nodes that the address may have moved. This results is slow failover, dropped packets and network outages for the lagg interface when the primary link goes down. We now use the new if_link_state_change_cond with the force param set to allow lagg to force through link state changes and hence fire a ifnet_link_event which are now monitored by rip and nd6. Upon receiving these events each protocol trigger the relevant notifications: * inet4 => Gratuitous ARP * inet6 => Unsolicited Neighbour Announce This also fixes the carp IPv6 NA's that stopped working after r251584 which added the ipv6_route__llma route. The new behavour can be controlled using the sysctls: * net.link.ether.inet.arp_on_link * net.inet6.icmp6.nd6_on_link Also removed unused param from lagg_port_state and added descriptions for the sysctls while here. PR: 156226 MFC after: 1 month Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D4111 Notes: svn path=/head/; revision=292275
* Make in_arpinput(), inp_lookup_mcast_ifp(), icmp_reflect(),Alexander V. Chernikov2015-12-091-17/+9
| | | | | | | | | | | | | | | | | ip_dooptions(), icmp6_redirect_input(), in6_lltable_rtcheck(), in6p_lookup_mcast_ifp() and in6_selecthlim() use new routing api. Eliminate now-unused ip_rtaddr(). Fix lookup key fib6_lookup_nh_basic() which was lost diring merge. Make fib6_lookup_nh_basic() and fib6_lookup_nh_extended() always return IPv6 destination address with embedded scope. Currently rw_gateway has it scope embedded, do the same for non-gatewayed destinations. Sponsored by: Yandex LLC Notes: svn path=/head/; revision=292015
* Remove LLE read lock from IPv4 fast path.Alexander V. Chernikov2015-12-051-50/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LLE structure is mostly unchanged during its lifecycle. To be more specific, there are 2 things relevant for fast path lookup code: 1) link-level address change. Since r286722, these updates are performed under AFDATA WLOCK. 2) Some sort of feedback indicating that this particular entry is used so we re-send arp request to perform reachability verification instead of expiring entry. The only signal that is needed from fast path is something like binary yes/no. The latter is solved by the following changes: 1) introduce special r_skip_req field which is read lockless by fast path, but updated under (new) req_mutex mutex. If this field is non-zero, then fast path will acquire lock and set it back to 0. 2) introduce simple state machine: incomplete->reachable<->verify->deleted. Before that we implicitely had incomplete->reachable->deleted state machine, with V_arpt_keep between "reachable" and "deleted". Verification was performed in runtime 5 seconds before V_arpt_keep expire. This is changed to "change state to verify 5 seconds before V_arpt_keep, set r_skip_req to non-zero value and check it every second". If the value is zero - then send arp verification probe. These changes do not introduce any signifficant control plane overhead: typically lle callout timer would fire 1 time more each V_arpt_keep (1200s) for used lles and up to arp_maxtries (5) for dead lles. As a result, all packets towards "reachable" lle are handled by fast path without acquiring lle read lock. Additional "req_mutex" is needed because callout / arpresolve_slow() or eventhandler might keep LLE lock for signifficant amount of time, which might not be feasible for fast path locking (e.g. having rmlock as ether AFDATA or lltable own lock). Differential Revision: https://reviews.freebsd.org/D3688 Notes: svn path=/head/; revision=291853
* Decompose arp_ifinit() into arp_add_ifa_lle() and arp_announce_ifaddr().Alexander V. Chernikov2015-11-091-23/+28
| | | | | | | | | | | Rename arp_ifinit2() into arp_announce_ifaddr(). Eliminate zeroing ifa_rtrequest: it was used for calling arp_rtrequest() which was responsible for handling route cloning requests. It became obsolete since r186119 (L2/L3 split). Notes: svn path=/head/; revision=290604
* Use lladdr_event to propagate gratiotus arp.Alexander V. Chernikov2015-11-091-0/+31
| | | | | | | Differential Revision: https://reviews.freebsd.org/D4019 Notes: svn path=/head/; revision=290603
* Unify setting lladdr for AF_INET[6].Alexander V. Chernikov2015-11-071-15/+2
| | | | Notes: svn path=/head/; revision=290486
* Fix regression from r287779, that bite me. If we call m_pullup()Gleb Smirnoff2015-10-071-5/+8
| | | | | | | | | | | | | | unconditionally, we end up with an mbuf chain of two mbufs, which later in in_arpreply() is rewritten from ARP request to ARP reply and is sent out. Looks like igb(4) (at least mine, and at least at my network) fails on such mbuf chain, so ARP reply doesn't go out wire. Thus, make the m_pullup() call conditional, as it is everywhere. Of course, the bug in igb(?) should be investigated, but better first fix the head. And unconditional m_pullup() was suboptimal, anyway. Notes: svn path=/head/; revision=288991
* * Improve logging invalid arp messagesAlexander V. Chernikov2015-09-151-26/+32
| | | | | | | | | | * Remove redundant check in ip_arpinput Suggested by: glebius MFC after: 2 weeks Notes: svn path=/head/; revision=287815
* * Require explicitl lle unlink prior to calling llentry_delete().Alexander V. Chernikov2015-09-151-7/+5
| | | | | | | | | This one slightly decreases time of holding afdata wlock. * While here, make nd6_free() return void. No one has used its return value since r186119. Notes: svn path=/head/; revision=287813
* * Do more fine-grained locking: call eventhandlers/free_entryAlexander V. Chernikov2015-09-141-20/+0
| | | | | | | | | | | | | without holding afdata wlock * convert per-af delete_address callback to global lltable_delete_entry() and more low-level "delete this lle" per-af callback * fix some bugs/inconsistencies in IPv4/IPv6 ifscrub procedures Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D3573 Notes: svn path=/head/; revision=287789
* * Improve error checking for arp messages.Alexander V. Chernikov2015-09-141-20/+48
| | | | | | | | | | | * Clean stale headers from if_ether.c. Reported by: rozhuk.im at gmail.com Reviewed by: ae MFC after: 2 weeks Notes: svn path=/head/; revision=287779
* * Split allocation and table linking for lle's.Alexander V. Chernikov2015-08-201-33/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before that, the logic besides lle_create() was the following: return existing if found, create if not. This behaviour was error-prone since we had to deal with 'sudden' static<>dynamic lle changes. This commit fixes bunch of different issues like: - refcount leak when lle is converted to static. Simple check case: console 1: while true; do for i in `arp -an|awk '$4~/incomp/{print$2}'|tr -d '()'`; do arp -s $i 00:22:44:66:88:00 ; arp -d $i; done; done console 2: ping -f any-dead-host-in-L2 console 3: # watch for memory consumption: vmstat -m | awk '$1~/lltable/{print$2}' - possible problems in arptimer() / nd6_timer() when dropping/reacquiring lock. New logic explicitly handles use-or-create cases in every lla_create user. Basically, most of the changes are purely mechanical. However, we explicitly avoid using existing lle's for interface/static LLE records. * While here, call lle_event handlers on all real table lle change. * Create lltable_free_entry() calling existing per-lltable lle_free_t callback for entry deletion Notes: svn path=/head/; revision=286955
* Check value return from lle_create() for NULL.Alexander V. Chernikov2015-08-191-3/+6
| | | | | | | | | This bug sneaked unnoticed in r286722. Reported by: adrian Notes: svn path=/head/; revision=286945
* Fix panic when handling non-inet arp message introduced in r286825.Alexander V. Chernikov2015-08-181-1/+0
| | | | | | | Submitted by: delphij Notes: svn path=/head/; revision=286869
* Split arpresolve() into fast/slow path.Alexander V. Chernikov2015-08-161-61/+93
| | | | | | | | | | | | | | | This change isolates the most common case (e.g. successful lookup) from more complicates scenarios. It also (tries to) make code more simple by avoiding retry: cycle. The actual goal is to prepare code to the upcoming change that will allow LL address retrieval without acquiring LLE lock at all. Reviewed by: ae Differential Revision: https://reviews.freebsd.org/D3383 Notes: svn path=/head/; revision=286825
* Move lle update code from from gigantic ip_arpinput() toAlexander V. Chernikov2015-08-131-94/+167
| | | | | | | | | | | | separate bunch of functions. The goal is to isolate actual lle updates to permit more fine-grained locking. Do all lle link-level update under AFDATA wlock. Sponsored by: Yandex LLC Notes: svn path=/head/; revision=286722