path: root/sys/netinet
Commit message | Author | Age | Files | Lines
* MFC r208553 | Qing Li | 2010-06-25 | 4 | -6/+7
  This patch fixes the problem where proxy ARP entries cannot be added over the if_ng interface.
  Approved by: re (bz)
  Notes: svn path=/releng/8.1/; revision=209524
* MFC 209264 | Michael Tuexen | 2010-06-22 | 2 | -13/+20
  * Fix a bug where the length of the ASCONF-ACK was calculated wrong due to using an uninitialized variable.
  * Fix a bug where a NULL pointer was dereferenced when interfaces come and go at a high rate.
  * Fix a bug where inps were not deregistered from iterators.
  * Fix a race condition in freeing an association.
  * Fix a refcount problem related to the iterator.
  Each of the above bugs results in a panic. It shows up when interfaces come and go at a high rate.
  Approved by: re
  Notes: svn path=/releng/8.1/; revision=209433
* MFC 209029 | Michael Tuexen | 2010-06-11 | 3 | -43/+28
  3 fixes:
  a) There was a case where an ICMP message could cause us to return leaving a stuck lock on an stcb.
  b) The iterator needed some tweaks to fix its lock ordering.
  c) The ITERATOR_LOCK is no longer needed in the freeing of a stcb. Now that the timer-based one is gone we don't have a multiple resume situation. In addition, some path out of the freeing of an assoc did NOT release the iterator_lock, so it was time to clean up this old code and, in the process, fix the lock bug.
  Approved by: re (bz)
  Notes: svn path=/stable/8/; revision=209067
* MFC: | Randall Stewart | 2010-06-11 | 15 | -423/+421
  Fix a number of bugs and race conditions.
  r208160: Bring back the iterator thread. It now properly handles VNETS having only one thread. The old timer-based code was full of LORs and other issues.
  r208852: Cleanup bug. When an un-accepted socket was hanging on a closed listener, we would leak the inp, never cleaning it up.
  r208853: Enhance the use under invariants of the audit for locks function and fix a bug where a close collision with a cookie being processed would cause a crash.
  r208854: Use the proper increment macros when working with the sent_queue_retran_cnt.
  r208855: Align comments properly. Fix a bug where we were NOT looking at the resend markings for control chunks and also not decrementing the retran count, which caused extra calls to retransmission. Also add a valid no locks call to the output routine.
  r208856: Spacing issues in auth/bsd addr.
  r208857: Get rid of a Windows ifdef that somehow leaked in.
  r208863: Missing error leg returns in some failure cases.
  r208864: LOR fix between the iterator and sctp_inpcb_close.
  r208874: Don't call sctp_inpcb_free when aborting an association, since you don't know what locks you hold, and a timer will take care of the situation when the gone flag is set.
  r208875: sctp_inpcb_free bug: a socket under the right situation could get stuck (from the accept queue) and never start the proper cleanup timer.
  r208876: Further enhance invariant lock validation. Fix a bug where a closed socket and an INIT-ACK could collide and cause a crash.
  r208878: Clear up another bug in sctp_inpcb_free where a race in freeing could end up destroying a contended lock.
  r208879: Optimize the cleanup and make some additional fixes in the sysctl code so that it won't reference a GONE INP and crash us.
  r208883 & r208891: Fix so we don't open a hole between a sock lock and a call to socantrcvmore. Before, we could hit a race that would kill the socket underneath us, leading to a crash.
  r208897: CUM-ACK calculation was messed up, so large messages were broken since the original NR_sack integration.
  r208902: Make sure that we don't move a bit to the NR array that is behind the cum-ack.
  r208952: Use both bit maps to calculate the cum-ack.
  r208953: Fix bugs in sctp_inpcb_free(): 1) make sure not to remove the flag until you get the lock again; 2) make sure all log_closing calls hold the lock; 3) release all the locks when everything is done and call callout_drain, not callout_stop.
  r208970: In the places where user allocation of a new sctp_inpcb runs out of resources, make sure to NULL the so_pcb pointer.
  Approved by: re - (bz@freebsd.org)
  Notes: svn path=/stable/8/; revision=209028
* Merge r204830 from head to stable/8 | Robert Watson | 2010-06-03 | 1 | -3/+0
  Locking the tcbinfo structure should not be necessary in tcp_timer_delack(), so don't.
  Reviewed by: bz
  Sponsored by: Juniper Networks
  Approved by: re (kib)
  Notes: svn path=/stable/8/; revision=208768
* Merge r204826 from head to stable/8: | Robert Watson | 2010-06-03 | 1 | -12/+5
  Make udp_set_kernel_tunneling() less forgiving when its invariants are violated: so_pcb can never be NULL for a valid UDP socket, and it is always SOCK_DGRAM. Use sotoinpcb() as the rest of the UDP code does.
  Reviewed by: bz
  Sponsored by: Juniper Networks
  Approved by: re (kib)
  Notes: svn path=/stable/8/; revision=208767
* Merge r204810 from head to stable/8: | Robert Watson | 2010-06-03 | 1 | -3/+0
  Remove unnecessary locking of divcbinfo lock from div_output(): this has not been required since FreeBSD 7.0 when the so_pcb pointer leading to inp was guaranteed to be stable when a valid socket reference is held (as it is in the output path).
  Reviewed by: bz
  Sponsored by: Juniper Networks
  Approved by: re (kib)
  Notes: svn path=/stable/8/; revision=208766
* Merge r204809 from head to stable/8: | Robert Watson | 2010-06-01 | 1 | -3/+9
  Add a comment to tcp_usr_accept() to indicate why we acquire the tcbinfo lock there: r175612, which re-added it, masked a race between sonewconn(2) and accept(2) that could allow an incompletely initialized address on a newly-created socket on a listen queue to be exposed. Full details can be found in that commit message.
  Sponsored by: Juniper Networks
  Approved by: re (bz)
  Notes: svn path=/stable/8/; revision=208700
* Merge r204806 from head to stable/8: | Robert Watson | 2010-06-01 | 2 | -2/+3
  Wrap use of rw_try_upgrade() on pcbinfo with macro INP_INFO_TRY_UPGRADE() to match other pcbinfo locking macros.
  Approved by: re (bz)
  Notes: svn path=/stable/8/; revision=208698
* MFC r206844: | Kenneth D. Merry | 2010-05-21 | 1 | -1/+1
  Don't clear other flags (e.g. CSUM_TCP) when setting CSUM_TSO. This was causing TSO to break for the Xen netfront driver.
  Reviewed by: gibbs, rwatson
  Notes: svn path=/stable/8/; revision=208367
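The difference between overwriting a flag word and OR-ing a bit into it can be sketched as follows. This is a minimal illustration, not the stack or driver code: the `EX_` constants and both helper functions are hypothetical, and the bit values are made up for the example (the real CSUM_TCP and CSUM_TSO definitions live in the kernel's mbuf header).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; NOT the kernel's definitions. */
#define EX_CSUM_TCP 0x0002u
#define EX_CSUM_TSO 0x0020u

/* Assignment: every previously requested offload bit is lost. */
uint32_t ex_set_tso_overwrite(uint32_t csum_flags)
{
    (void)csum_flags;           /* old flags are discarded */
    return EX_CSUM_TSO;
}

/* OR-ing: the TSO bit is added and the existing bits survive. */
uint32_t ex_set_tso_preserve(uint32_t csum_flags)
{
    return csum_flags | EX_CSUM_TSO;
}

/* Small usage example for the two helpers above. */
void ex_csum_demo(void)
{
    uint32_t flags = EX_CSUM_TCP;

    assert((ex_set_tso_overwrite(flags) & EX_CSUM_TCP) == 0);
    assert((ex_set_tso_preserve(flags) & EX_CSUM_TCP) != 0);
}
```

With the overwriting form, a driver that expects both bits would see only the TSO bit, which is consistent with the breakage the commit message describes.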
* MFC 207985 | Randall Stewart | 2010-05-16 | 1 | -3/+2
  Fix a long-standing bug in generating a FWD-TSN. It appeared when more TSNs than fit in an mbuf needed to be skipped.
  Notes: svn path=/stable/8/; revision=208158
* MFC 207983 | Randall Stewart | 2010-05-16 | 2 | -4/+5
  More PR-SCTP bugs:
  - Make sure that when you kick the streams you add correctly using a 16-bit unsigned.
  - Make sure that when sending out you allow the FWD-TSN to skip over and list the ACKED chunks in the stream/seq list (so the receiver will kick the stream).
  Notes: svn path=/stable/8/; revision=208157
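The point about 16-bit unsigned arithmetic can be illustrated with a short sketch. SCTP stream sequence numbers are 16 bits wide, so advancing one must wrap modulo 65536. The helper names below are hypothetical and this is not the code touched by the commit; it only shows why the width of the type matters.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a 16-bit sequence number; the cast makes the wrap explicit. */
uint16_t ex_ssn_advance(uint16_t ssn, uint16_t delta)
{
    return (uint16_t)(ssn + delta);
}

/* Contrast: doing the same addition in a wider type does not wrap. */
uint32_t ex_ssn_advance_wide(uint32_t ssn, uint32_t delta)
{
    return ssn + delta;
}

/* Small usage example for the two helpers above. */
void ex_ssn_demo(void)
{
    assert(ex_ssn_advance(65535u, 1u) == 0u);          /* wraps to 0 */
    assert(ex_ssn_advance_wide(65535u, 1u) == 65536u); /* out of range */
}
```

A comparison against the next expected sequence number only works after a wrap if both sides were computed in the 16-bit type.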
* MFC 207966 (for Michael) | Randall Stewart | 2010-05-16 | 2 | -16/+0
  Get rid of unused constants.
  Notes: svn path=/stable/8/; revision=208156
* MFC of 207963 | Randall Stewart | 2010-05-16 | 1 | -36/+13
  This fixes PR-SCTP issues:
  - Slide the map at the proper place.
  - Mark the bits in the nr_array ONLY if there is no marking.
  - When generating a FWD-TSN, allow skipping past ACKED chunks too.
  Notes: svn path=/stable/8/; revision=208155
* MFC of 207924: | Randall Stewart | 2010-05-16 | 4 | -7/+22
  This fixes a bug with the one-2-one model socket when a user sets up a socket to a server, sends data, and closes the socket before the server has called accept(). It used to NOT work at all. Now we add a flag to the assoc and defer assoc cleanup so that the accept will succeed.
  Notes: svn path=/stable/8/; revision=208154
* MFC 206758, 206840, 206891, 206892, 207099, 207191, 207197 | Michael Tuexen | 2010-05-07 | 4 | -31/+28
  * Fix a bug where SACKs are not sent when they should.
  * Get delayed SACK working again.
  * Really print the nr_mapping array when it should be printed.
  * Update highest_tsn variables when sliding mapping arrays.
  * Sending a FWDTSN chunk should not affect the retran count.
  * Cleanups.
  Notes: svn path=/stable/8/; revision=207756
* MFC r207369: | Bjoern A. Zeeb | 2010-05-06 | 21 | -306/+177
  MFP4: @176978-176982, 176984, 176990-176994, 177441

  "Whitespace" churn after the VIMAGE/VNET whirls.

  Remove the need for some "init" functions within the network stack, like pim6_init(), icmp_init() or significantly shorten others like ip6_init() and nd6_init(), using static initialization again where possible and formerly missed.

  Move (most) variables back to the place they used to be before the container structs and VIMAGE_GLOBALS (before r185088) and try to reduce the diff to stable/7 and earlier as much as possible, to help out-of-tree consumers to update from 6.x or 7.x to 8 or 9.

  This also removes some header file pollution for putatively static global variables.

  Revert VIMAGE specific changes in ipfilter::ip_auth.c, that are no longer needed.
  Reviewed by: jhb
  Discussed with: rwatson
  Sponsored by: The FreeBSD Foundation
  Sponsored by: CK Software GmbH
  Notes: svn path=/stable/8/; revision=207695
* MFC r207275: | Bruce M Simpson | 2010-05-03 | 1 | -6/+11
  Fix a regression where DVMRP diagnostic traffic, such as that used by mrinfo and mtrace, was dropped by the IGMP TTL check. IGMP control traffic must always have a TTL of 1.
  Submitted by: Matthew Luckie
  Notes: svn path=/stable/8/; revision=207558
* MFC r207277: | Bjoern A. Zeeb | 2010-05-02 | 1 | -5/+18
  Enhance the historic behaviour of raw sockets and jails in a way that we allow all possible jail IPs as source address rather than forcing the "primary". While IPv6 naturally has source address selection, for legacy IP we do not go through the pain in case IP_HDRINCL was not set. People should bind(2) for that.

  This will, for example, allow ping(|6) -S to work correctly for non-primary addresses.
  Reported by: (ten 211.ru)
  Tested by: (ten 211.ru)
  Notes: svn path=/stable/8/; revision=207515
* MFC r206989: | Bjoern A. Zeeb | 2010-05-02 | 1 | -1/+1
  Avoid memory access after free. Use the (shortened) copy for the ipsec mtu lookup as well.
  PR: kern/145736
  Submitted by: Peter Molnar (peter molnar.cc)
  Notes: svn path=/stable/8/; revision=207512
* MFC 206452: | Bruce M Simpson | 2010-04-27 | 1 | -4/+22
  Fix a few issues related to the legacy 4.4 BSD multicast APIs.

  IPv4 addresses can and do change during normal operation. Testing by pfSense developers exposed an issue where OpenOSPFD was using the IPv4 address to leave the OSPF link-scope multicast groups on a dynamic OpenVPN tun interface, rather than using RFC 3678 with the interface index, which won't be raced when the interface's addresses change.

  In inp_join_group():
  If we are already a member of an ASM group, and IP_ADD_MEMBERSHIP or MCAST_JOIN_GROUP ioctls are re-issued, return EADDRINUSE as per the legacy 4.4BSD multicast API. This bends RFC 3678 slightly, but does not violate POLA for apps using the old API. It also stops us falling through to kicking IGMP state transactions in what is otherwise a no-op case. [This has already been dealt with in HEAD, but make it explicit before we MFC the change to 8.]

  In inp_leave_group():
  Fix a bogus conditional. Move the ifp null check to ioctls MCAST_LEAVE* in the switch..case where it actually belongs. If an interface was specified, by primary IPv4 address, for ioctl IP_DROP_MEMBERSHIP or MCAST_LEAVE_GROUP (an ASM full leave operation), then and only then should we look up the ifp from the IPv4 address in mreqs.imr_interface. If not, we fall through to imo_match_group() as before, but only in the IP_DROP_MEMBERSHIP case.

  With these changes, the legacy 4.4BSD multicast API idempotence should be mostly preserved in the SSM enabled IPv4 stack.

  [Note: this is not a straight svn merge as head and 8 differ slightly]
  Found by: ermal (with pfSense)
  Notes: svn path=/stable/8/; revision=207274
* MFC r206481: | Bjoern A. Zeeb | 2010-04-21 | 2 | -4/+20
  Plug reference leaks in the link-layer code ("new-arp") that previously prevented the link-layer entry from being freed.

  In both in.c and in6.c (though that code path seems to be basically dead) plug a reference leak in case of a pending callout being drained.

  In if_ether.c consistently add a reference before resetting the callout and in case we canceled a pending one remove the reference for that. In the final case in arptimer, before freeing the expired entry, remove the reference again and explicitly call callout_stop() to clear the active flag.

  In nd6.c:nd6_free() we are only ever called from the callout function and thus need to remove the reference there as well before calling into llentry_free().

  In if_llatbl.c when freeing the entire tables make sure that in case we cancel a pending callout to remove the reference as well.
  Reviewed by: qingli (earlier version)
  MFC after: 10 days
  Problem observed, patch tested by: simon on ipv6gw.f.o, Christian Kratzer (ck cksoft.de), Evgenii Davidov (dado korolev-net.ru)
  PR: kern/144564
  Configurations still affected: with options FLOWTABLE
  Notes: svn path=/stable/8/; revision=207013
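The rule described for if_ether.c (one reference per armed timer, dropped whenever a pending timer is cancelled or fires) can be modelled in a few lines. This is a toy model under stated assumptions, not the kernel code: `struct ex_entry` and all `ex_` functions are hypothetical, and the real code uses the callout API and the llentry reference macros rather than plain integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a link-layer entry with one timer. */
struct ex_entry {
    int  refcnt;        /* references held on the entry */
    bool timer_pending; /* true while a timer is armed  */
};

/* Arm (or re-arm) the timer: take a reference first, then give back
 * the reference of a pending timer that the reset cancelled. */
void ex_timer_reset(struct ex_entry *e)
{
    e->refcnt++;
    if (e->timer_pending)
        e->refcnt--;    /* the cancelled timer no longer holds one */
    e->timer_pending = true;
}

/* Cancel the timer; drop its reference only if one was pending. */
bool ex_timer_stop(struct ex_entry *e)
{
    if (!e->timer_pending)
        return false;
    e->timer_pending = false;
    e->refcnt--;
    return true;
}

/* Timer handler: the firing timer gives up its reference. */
void ex_timer_fired(struct ex_entry *e)
{
    assert(e->timer_pending);
    e->timer_pending = false;
    e->refcnt--;
}
```

In this model the count returns to its starting value on every path, which is the property whose absence would keep an entry from ever being freed.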
* MFC r206456: | Rui Paulo | 2010-04-17 | 1 | -4/+2
  Honor the CE bit even when the CWR bit is set.
  PR: 145600
  Submitted by: Richard Scheffenegger <rs at netapp.com>
  Notes: svn path=/stable/8/; revision=206762
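One way to read this change, following the receiver behaviour in RFC 3168: a segment carrying CWR tells the receiver it may stop echoing congestion (ECE), but if the same packet also arrives with the CE codepoint in its IP header, that is a new congestion mark and ECE must be sent again. The sketch below models only that ordering. The function and parameter names are hypothetical, and it is an interpretation of the commit message, not the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Return whether the receiver should keep echoing ECE after processing
 * one segment.
 *   send_ece: state before this segment
 *   th_cwr:   TCP header of the segment has CWR set
 *   ip_ce:    IP header of the segment carries the CE codepoint */
bool ex_next_send_ece(bool send_ece, bool th_cwr, bool ip_ce)
{
    if (th_cwr)
        send_ece = false;   /* sender reacted to the earlier mark */
    if (ip_ce)
        send_ece = true;    /* new mark on this packet is honored */
    return send_ece;
}

/* Small usage example for the helper above. */
void ex_ecn_demo(void)
{
    /* CWR and CE on the same packet: keep echoing congestion. */
    assert(ex_next_send_ece(true, true, true) == true);
    /* CWR alone: stop echoing. */
    assert(ex_next_send_ece(true, true, false) == false);
}
```

Checking CE after CWR, rather than skipping the CE check when CWR is present, is what keeps the second congestion mark from being lost.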
* MFC of 206281 | Randall Stewart | 2010-04-17 | 2 | -33/+18
  Final MFC of all the IETF hack-a-thon changes. head and stable are now in sync ;-)
  Notes: svn path=/stable/8/; revision=206744
* MFC of 206151 | Randall Stewart | 2010-04-17 | 1 | -13/+60
  Notes: svn path=/stable/8/; revision=206743
* MFC of 206137 | Randall Stewart | 2010-04-17 | 14 | -806/+437
  This is Part III of the great IETF hack-a-thon to fix the NR-Sack code (the last one on the cpu options was a lull, i.e. MFC 205629). Still 2 more to go.
  Notes: svn path=/stable/8/; revision=206742
* MFC of 205629 | Randall Stewart | 2010-04-17 | 4 | -2/+173
  Adds the option of separating out the sctp stats per processor. This will be refined further and is definitely exploratory (which is why it's an option), i.e. making it allocate the actual number of processors is coming ;-D.
  Notes: svn path=/stable/8/; revision=206741
* MFC of 205628 | Randall Stewart | 2010-04-17 | 1 | -2/+0
  Out goes the nr_mapping_array expand.
  Notes: svn path=/stable/8/; revision=206740
* MFC of 205627 | Randall Stewart | 2010-04-17 | 4 | -666/+193
  Part II (more to follow) of the great IETF hack-a-thon to fix the NR-Sack code.
  Notes: svn path=/stable/8/; revision=206739
* MFC of 204141 | Randall Stewart | 2010-04-17 | 3 | -6/+24
  Cleans up so we can have a vtag reflected argument. One of Michael's fixes ;-)
  Notes: svn path=/stable/8/; revision=206738
* MFC of 204096 | Randall Stewart | 2010-04-17 | 2 | -4/+12
  One of Michael's changes to fix some sign issues and some minor locking.
  Notes: svn path=/stable/8/; revision=206736
* MFC 204040 | Randall Stewart | 2010-04-17 | 1 | -3/+3
  Fixes some argument calls (u_long vs uint32_t).
  Notes: svn path=/stable/8/; revision=206735
* MFC of 203847 | Randall Stewart | 2010-04-17 | 1 | -2/+2
  Puts in missing packed declarations (from Michael). It worked only because it was properly aligned anyway ;-)
  Notes: svn path=/stable/8/; revision=206734
* MFC of 203503 | Randall Stewart | 2010-04-17 | 1 | -1/+1
  A fix to how the checksum code works that Michael put in.
  Notes: svn path=/stable/8/; revision=206733
* MFC of 202782 | Randall Stewart | 2010-04-17 | 3 | -19/+19
  Michael's changes that replaced [0] with [].
  Notes: svn path=/stable/8/; revision=206732
* MFC of 202526 | Randall Stewart | 2010-04-17 | 6 | -1945/+241
  The first round of some of Michael's changes to get the sack processing in better shape.
  Notes: svn path=/stable/8/; revision=206731
* MFC of 205502 | Randall Stewart | 2010-04-17 | 3 | -50/+35
  The first of Michael's and my long fight at the IETF to get the NR sack code fixed and aligned.
  Notes: svn path=/stable/8/; revision=206730
* MFC of 202523 | Randall Stewart | 2010-04-17 | 1 | -1/+7
  This fixes a closing race condition that is unlikely to ever happen, but good to fix ;-)
  Notes: svn path=/stable/8/; revision=206729
* MFC 202521 | Randall Stewart | 2010-04-17 | 1 | -6/+0
  More stray ifdefs that had worked their way into the code base somehow (yes, that's ifdef Windows going out. Our stack runs on Windows; big thanks for that go to Kozuka-san and Bruce Cran ;-D)
  Notes: svn path=/stable/8/; revision=206728
* MFC of 202520 | Randall Stewart | 2010-04-17 | 2 | -15/+18
  This aligns us to the socket API of the stream reset with proper naming, and adds a define for backward compatibility.
  Notes: svn path=/stable/8/; revision=206727
* MFC of 202518 | Randall Stewart | 2010-04-17 | 1 | -12/+0
  More ifdefs that should not be present...
  Notes: svn path=/stable/8/; revision=206726
* MFC of 202517 | Randall Stewart | 2010-04-17 | 1 | -4/+0
  Again gets rid of some rather strange ifdefs for APPLE/USERSPACE that drifted in through our scrubber programs.
  Notes: svn path=/stable/8/; revision=206725
* MFC 202516 | Randall Stewart | 2010-04-17 | 1 | -12/+0
  This gets rid of some stray #ifdef APPLE that drifted in somehow.
  Notes: svn path=/stable/8/; revision=206724
* Merge of SVN 196507. | Randall Stewart | 2010-04-17 | 1 | -460/+255
  This optimizes the sack handling a bit and restructures it so it's much more readable ;-)
  Notes: svn path=/stable/8/; revision=206723
* add priority scheduler. | Luigi Rizzo | 2010-04-07 | 2 | -1/+231
  Notes: svn path=/stable/8/; revision=206342
* fix breakage in ipfw removal. | Luigi Rizzo | 2010-04-07 | 1 | -57/+97
  Notes: svn path=/stable/8/; revision=206340
* MFC of 2 items to fix the csum for v6 issue: | Randall Stewart | 2010-04-05 | 5 | -8/+6
  Revision 205075 and 205104:

  r205075:
  With the recent change of the sctp checksum to support offload, no delayed checksum was added to the ip6 output code. This causes cards that do not support SCTP checksum offload to have SCTP packets that are IPv6 NOT have the sctp checksum performed. Thus you could not communicate with a peer. This adds the missing bits to make the checksum happen for these cards.

  r205104:
  The proper fix for the delayed SCTP checksum is to have the delayed function take an argument as to the offset to the SCTP header. This allows it to work for V4 and V6. This of course means changing all callers of the function to either pass the header len, if they have it, or create it (ip_hl << 2 or sizeof(ip6_hdr)).
  PR: 144529
  Notes: svn path=/stable/8/; revision=206181
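The idea in r205104 (pass the offset of the SCTP header so one routine serves both address families) can be sketched in isolation. All `ex_` names below are hypothetical, and this is not the kernel's delayed-checksum function. The sketch computes a plain bitwise CRC32c over everything from the given offset onward; it skips the real steps of zeroing the checksum field, walking an mbuf chain, and storing the result back into the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_IP6_HDR_LEN 40u  /* size of the fixed IPv6 header in bytes */

/* Bitwise CRC32c (Castagnoli), reflected form. */
uint32_t ex_crc32c(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* IPv4 header length in bytes from the 4-bit ip_hl field (32-bit words). */
size_t ex_ipv4_hdr_len(uint8_t ip_hl)
{
    return (size_t)ip_hl << 2;
}

/* Checksum the packet starting at sctp_off, whichever IP version the
 * caller is handling; the caller supplies the offset. */
uint32_t ex_delayed_cksum(const uint8_t *pkt, size_t pkt_len, size_t sctp_off)
{
    assert(sctp_off <= pkt_len);
    return ex_crc32c(pkt + sctp_off, pkt_len - sctp_off);
}
```

An IPv4 caller would pass `ex_ipv4_hdr_len(ip_hl)` and an IPv6 caller `EX_IP6_HDR_LEN`, mirroring the two cases named in the commit message; the checksum routine itself needs no knowledge of the address family.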
* MFC 204902 | Qing Li | 2010-04-02 | 2 | -1/+13
  One of the advantages of enabling ECMP (a.k.a. RADIX_MPATH) is to allow for connection load balancing across interfaces. Currently the address alias handling method is colliding with the ECMP code. For example, when two interfaces are configured on the same prefix, only one prefix route is installed. So connection load balancing among the available interfaces is not possible.

  The other advantage of ECMP is for failover. The issue with the current code is that the interface link-state is not reflected in the route entry. For example, if there are two interfaces on the same prefix and the cable on one interface is unplugged, new and existing connections should switch over to the other interface. This is not done today and packets go into a black hole.

  Also, there is a small bug in the kernel where deleting ECMP routes in the userland will always return an error even though the command is successfully executed.
  Notes: svn path=/stable/8/; revision=206067
* MFC 201131 | Qing Li | 2010-04-02 | 1 | -18/+22
  Introduce a local variable rte acting as a cache of ro->ro_rt within ip_output, achieving (in random order of importance):
  - a reduction of the number of 'r's in the source code;
  - improved legibility;
  - a reduction of 64 bytes in the .text
  Notes: svn path=/stable/8/; revision=206066
* MFC 205066, 205069, 205093, 205097, 205488: | Kip Macy | 2010-04-01 | 2 | -11/+29
  r205066:
  - restructure flowtable to support ipv6
  - add a name argument to flowtable_alloc for printing with ddb commands
  - extend ddb commands to print destination address or 4-tuples
  - don't parse ports in ulp header if FL_HASH_ALL is not passed
  - add kern_flowtable_insert to enable more generic use of flowtable (e.g. system calls for adding entries)
  - don't hash loopback addresses
  - cleanup whitespace
  - keep statistics per-cpu for per-cpu flowtables to avoid cache line contention
  - add sysctls to accumulate stats and report aggregate

  r205069:
  - fix stats reporting sysctl

  r205093:
  - re-update copyright to 2010 (pointed out by danfe@)

  r205097:
  - flowtable_get_hashkey is only used by a DDB function, so move it under #ifdef DDB (pointed out by jkim@)

  r205488:
  - boot-time size the ipv4 flowtable and the maximum number of flows
  - increase flow cleaning frequency and decrease flow caching time when near the flow limit
  - stop allocating new flows when within 3% of maxflows; don't start allocating again until below 12.5%
  Notes: svn path=/stable/8/; revision=206024
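The last item of r205488 describes a hysteresis: allocation stops near the limit and resumes only after the count has dropped well below it. A minimal sketch follows, with two loudly stated assumptions: "within 3% of maxflows" is taken as more than 97% of the limit, and "below 12.5%" is read as more than 12.5% under the limit (that is, below 87.5% of it). The function name and the exact arithmetic are hypothetical and are not taken from the flowtable source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the new "full" state given the previous state, the current
 * flow count, and the configured maximum. */
bool ex_flowtable_full(bool was_full, uint32_t count, uint32_t maxflows)
{
    uint64_t c = count, m = maxflows;   /* widen to avoid overflow */

    if (c * 100 > m * 97)
        return true;        /* within 3% of the limit: stop allocating */
    if (c * 1000 < m * 875)
        return false;       /* over 12.5% under the limit: resume */
    return was_full;        /* in between: keep the previous decision */
}

/* Small usage example for the helper above. */
void ex_flowtable_demo(void)
{
    bool full = false;

    full = ex_flowtable_full(full, 971, 1000);  /* crosses the 97% mark */
    assert(full);
    full = ex_flowtable_full(full, 900, 1000);  /* still in the dead band */
    assert(full);
    full = ex_flowtable_full(full, 874, 1000);  /* below 87.5% */
    assert(!full);
}
```

The gap between the two thresholds keeps the table from flapping between allocating and refusing when the flow count hovers near the limit.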