path: root/sys/ofed/include
Commit message | Author | Age | Files | Lines
* ofed: Fix a typo in a source code comment | Gordon Bergling | 2026-03-27 | 1 | -1/+1
  - s/refereced/referenced/
  MFC after: 3 days
* ofed: reduce usage of struct dma_attrs *dma_attrs | Bjoern A. Zeeb | 2026-02-24 | 1 | -4/+4
  ib_verbs.h still uses struct dma_attrs *dma_attrs everywhere. It is beyond my knowledge when that struct got deprecated upstream but it is still supported by our LinuxKPI. The problem is that the functions called with that argument (dma_map_single_attrs, dma_unmap_single_attrs, dma_map_sg_attrs, dma_unmap_sg_attrs) so far are #defines in LinuxKPI and drop the last argument (attrs), so it was never a problem. In preparation to pass the attrs to the actual implementation in LinuxKPI, which has gained support for them, we now pass dma_attrs->flags, which is the expected unsigned long bit field. If anyone has serious interest in updating our ofed implementation they could look into this some more and remove the usage of struct dma_attrs entirely.
  Sponsored by: The FreeBSD Foundation
  MFC after: 3 days
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D55390
* ibcore: Introduce enum ib_raw_packet_caps from Linux 4.11 | Ka Ho Ng | 2023-10-28 | 1 | -0/+14
  This enum also exists as enum ibv_raw_packet_caps in libibverbs/verbs.h.
  [khng: cherry-picked from Linux ebaaee253ad3a3c573ab7d3d77e849056bdfa9ea]
  Sponsored by: Juniper Networks, Inc.
  MFC after: 7 days
  Reviewed by: kib, zlei
  Differential Revision: https://reviews.freebsd.org/D42177
* sys: Remove $FreeBSD$: two-line .h pattern | Warner Losh | 2023-08-16 | 35 | -70/+0
  Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
* spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD | Warner Losh | 2023-05-12 | 1 | -1/+1
  The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause.
  Discussed with: pfg
  MFC After: 3 days
  Sponsored by: Netflix
* ofed: Fix a logic inversion from IfAPI conversion | Justin Hibbits | 2023-04-19 | 1 | -1/+1
  Reported by: bartosz.sobczak_intel.com
  Fixes: 3e142e07675b ("ofed: Mechanically convert to IfAPI")
  Sponsored by: Juniper Networks, Inc.
* infiniband: Opt-in for net epoch | Zhenlei Huang | 2023-04-05 | 1 | -0/+1
  This is the counterpart to e87c4940156c, which did the same for ethernet.
  Suggested by: hselasky
  Reviewed by: hselasky, kib
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D39405
* ofed: Mechanically convert to IfAPI | Justin Hibbits | 2023-03-24 | 5 | -36/+37
  Summary: Because of the intricacies of this code it wasn't purely scripted, but instead hand-mechanical.
  Reviewed by: hselasky
  Sponsored by: Juniper Networks, Inc.
  Differential Revision: https://reviews.freebsd.org/D38560
* ofed: Fix a typo in a source code comment | Gordon Bergling | 2022-04-09 | 1 | -1/+1
  - s/it it/it to/
  MFC after: 3 days
* ibcore: Add support for NDR link speed. | Hans Petter Selasky | 2022-02-21 | 1 | -1/+2
  Add new IBTA speed NDR, supporting a signaling rate of 100Gb.
  Linux commit: c7adf7717301558e8852949d8e3dc3748d1a4a97
  MFC after: 1 week
  Sponsored by: NVIDIA Networking
* mlx5ib: Add support for parsing udata in mlx5_ib_create_flow(). | Hans Petter Selasky | 2022-02-10 | 1 | -0/+22
  Backport from Linux 5.17 (drivers/infiniband/hw/mlx5/fs.c)
  This fixes creating flow rules from user-space after the kernel space update based on Linux 5.7-rc1.
  Sponsored by: NVIDIA Networking
* ibcore: Kernel space update based on Linux 5.7-rc1. | Hans Petter Selasky | 2021-07-28 | 15 | -339/+3686
  Overview:
  This is the first stage of a RDMA stack upgrade introducing kernel changes only based on Linux 5.7-rc1.
  This patch is based on about four main areas of work:
  - Update of the IB uobjects system:
    - The memory holding so-called AH, CQ, PD, SRQ and UCONTEXT objects is now managed by ibcore. This also requires some changes in the kernel verbs API. The verbs changes are typically about initializing and deinitializing objects, and removing allocation and freeing of memory.
  - Update of the uverbs IOCTL framework:
    - The parsing and handling of user-space commands has been completely refactored to integrate with the updated IB uobjects system.
  - Various changes and updates to the generic uverbs interfaces in device drivers including the new uAPI surface.
  - The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.
  Dependencies:
  - The mlx4ib driver code has been updated with the minimum changes needed.
  - The mlx5ib driver code has been updated with the minimum changes needed including DV support.
  Compatibility:
  - All user-space facing APIs are backwards compatible after this change.
  - All kernel-space facing RDMA APIs are backwards compatible after this change, with the exception of ib_create_ah() and ib_destroy_ah(), which take a new flag.
  - The "ib_device_ops" structure exists, but only contains the driver ID and some structure sizes.
  Differences from Linux:
  - Infiniband drivers must use the INIT_IB_DEVICE_OPS() macro to set the sizes needed for allocating various IB objects, when adding IB device instances.
  Security:
  - PRIV_NET_RAW is needed to use raw ethernet transmit features.
  - PRIV_DRIVER is needed to use other privileged operations.
  Based on upstream Linux, Torvalds (5.7-rc1): 8632e9b5645bbc2331d21d892b0d6961c1a08429
  MFC after: 1 week
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D31149
  Sponsored by: NVIDIA Networking
* ibcore: Add some functions and definitions for selecting and querying retryable ucontext cleanup. | Hans Petter Selasky | 2021-07-12 | 1 | -0/+56
  Linux commit: 1c77483e4c50339b0306572167ccbff6b55d051b
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* ibcore: Declare ib_post_send() and ib_post_recv() arguments const | Hans Petter Selasky | 2021-07-12 | 1 | -17/+17
  Since neither ib_post_send() nor ib_post_recv() modify the data structure their second argument points at, declare that argument const. This change makes it necessary to declare the 'bad_wr' argument const too and also to modify all ULPs that call ib_post_send(), ib_post_recv() or ib_post_srq_recv(). This patch does not change any functionality but makes it possible for the compiler to verify whether the ib_post_(send|recv|srq_recv) really do not modify the posted work request.
  Linux commit:
    f696bf6d64b195b83ca1bdb7cd33c999c9dcf514
    7bb1fafc2f163ad03a2007295bb2f57cfdbfb630
    d34ac5cd3a73aacd11009c4fc3ba15d7ea62c411
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* ibcore: Implement ib_uverbs_get_ucontext_file(). | Hans Petter Selasky | 2021-07-12 | 1 | -0/+3
  Expose ib_ucontext from a given ib_uverbs_file. Drivers that use the ioctl(9) API may have the ib_uverbs_file and need a way to get the related ib_ucontext from it; this patch enables that. Downstream patches from this series will use it.
  Linux commit: 7dc08dcfc8c86cb4457e383734ff6844ddaff876
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* ibcore: Simplify ib_modify_qp_is_ok(). | Hans Petter Selasky | 2021-07-12 | 1 | -4/+2
  All callers of ib_modify_qp_is_ok() provide an enum ib_qp_state, which makes the out-of-scope checks redundant. Let's remove them and update the function signature to return a boolean result. While at it, remove the unused "ll" parameter from ib_modify_qp_is_ok().
  Linux commit:
    19b1f54099b6ee334acbfbcfbdffd1d1f057216d
    d31131bba5a1630304c55ea775c48cc84912ab59
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* ibcore: Support rate limit for packet pacing | Hans Petter Selasky | 2021-07-12 | 1 | -0/+2
  Add new member rate_limit to ib_qp_attr which holds the packet pacing rate in kbps, 0 means unlimited. IB_QP_RATE_LIMIT is added to ib_attr_mask and could be used by RAW QPs when changing QP state from RTR to RTS, RTS to RTS.
  Linux commit: 528e5a1bd3f0e9b760cb3a1062fce7513712a15d
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* ibcore: Define option to set ack timeout. | Hans Petter Selasky | 2021-07-12 | 2 | -0/+5
  Define new option in 'rdma_set_option' to override calculated QP timeout when requested to provide QP attributes to modify a QP. At the same time, pack tos_set to be a bitfield.
  Linux commit: 2c1619edef61a03cb516efaa81750784c3071d10
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* ibcore: Protect against concurrent access to hardware stats. | Hans Petter Selasky | 2021-07-12 | 1 | -0/+4
  Currently access to hardware stats buffer isn't protected, this can result in multiple writes and reads at the same time to the same memory location. This can lead to providing an incorrect value to the user. Add a mutex to protect against it.
  Linux commit: e945130b52bea65d15f9bdf54949d4cb7a88db7f
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* ibcore: Introduce ib_port_phys_state enum. | Hans Petter Selasky | 2021-07-12 | 1 | -0/+10
  In order to improve readability, add ib_port_phys_state enum to replace the use of magic numbers.
  Linux commit: 72a7720fca37fec0daf295923f17ac5d88a613e1
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
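The idea of replacing magic numbers with a named enum can be sketched as follows. The constant names and values below are an assumption based on the InfiniBand PortPhysicalState encoding, not a copy of the header touched by this commit, and the demo_* helpers are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Sketch only: names and values assumed from the InfiniBand
 * PortPhysicalState encoding (1 = Sleep ... 7 = Phy Test).
 */
enum ib_port_phys_state {
	IB_PORT_PHYS_STATE_SLEEP = 1,
	IB_PORT_PHYS_STATE_POLLING = 2,
	IB_PORT_PHYS_STATE_DISABLED = 3,
	IB_PORT_PHYS_STATE_PORT_CONFIGURATION_TRAINING = 4,
	IB_PORT_PHYS_STATE_LINK_UP = 5,
	IB_PORT_PHYS_STATE_LINK_ERROR_RECOVERY = 6,
	IB_PORT_PHYS_STATE_PHY_TEST = 7,
};

/* Hypothetical helper: reads better than "phys_state == 5". */
static bool
demo_link_is_up(unsigned int phys_state)
{
	return (phys_state == IB_PORT_PHYS_STATE_LINK_UP);
}

/* Hypothetical helper: map a state to a printable name, NULL if unknown. */
static const char *
demo_phys_state_name(unsigned int phys_state)
{
	switch (phys_state) {
	case IB_PORT_PHYS_STATE_SLEEP:
		return ("Sleep");
	case IB_PORT_PHYS_STATE_POLLING:
		return ("Polling");
	case IB_PORT_PHYS_STATE_DISABLED:
		return ("Disabled");
	case IB_PORT_PHYS_STATE_PORT_CONFIGURATION_TRAINING:
		return ("PortConfigurationTraining");
	case IB_PORT_PHYS_STATE_LINK_UP:
		return ("LinkUp");
	case IB_PORT_PHYS_STATE_LINK_ERROR_RECOVERY:
		return ("LinkErrorRecovery");
	case IB_PORT_PHYS_STATE_PHY_TEST:
		return ("Phy Test");
	default:
		return (NULL);
	}
}
```

A driver that previously reported a bare 5 for an active link would, under this sketch, report IB_PORT_PHYS_STATE_LINK_UP instead.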
* ibcore: Add rdma_reject_msg() helper function. | Hans Petter Selasky | 2021-07-12 | 3 | -0/+20
  rdma_reject_msg() returns a pointer to a string message associated with the transport reject reason codes.
  Linux commit: 77a5db13153906a7e00740b10b2730e53385c5a8
  MFC after: 1 week
  Reviewed by: kib
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* mlx4/OFED: replace the struct net_device with struct ifnet | Bjoern A. Zeeb | 2021-06-18 | 4 | -15/+15
  Given all the code does operate on struct ifnet, the last step in this longer series of changes now is to rename struct net_device to struct ifnet (that is what it was defined to in the LinuxKPI code). While mlx4 and OFED are "shared" code the decision was made years ago to not write it based on the netdevice KPI but the native ifnet KPI for most of it. This commit simply spells this out and with that frees "struct netdevice" to be re-done on LinuxKPI to become a more native/mixed implementation over time as needed by, e.g., wireless drivers.
  Sponsored by: The FreeBSD Foundation
  MFC after: 10 days
  Reviewed by: hselasky
  Differential Revision: https://reviews.freebsd.org/D30515
* OFED: migrate LinuxKPI net_device/ifnet macros into ofed | Bjoern A. Zeeb | 2021-05-27 | 1 | -0/+7
  The LinuxKPI net_device actually is an ifnet; in order to further clean that up so we can extend "net_device", migrate the few macros left into ofed and make sure the header is included in all files which need access to the macros.
  Sponsored by: The FreeBSD Foundation
  MFC after: 12 days
  Reviewed by: kib
  Differential Revision: https://reviews.freebsd.org/D30477
* LinuxKPI/OFED/mlx4: cleanup netdevice.h some more | Bjoern A. Zeeb | 2021-05-26 | 3 | -1/+17
  This removes all unused bits from linux/netdevice.h and migrates two inline functions into the mlx4 and ofed code respectively. This gets the mlx4/ofed (struct ifnet) specific bits down to 7 lines in netdevice.h.
  Sponsored by: The FreeBSD Foundation
  MFC after: 13 days
  Reviewed by: hselasky, kib
  Differential Revision: https://reviews.freebsd.org/D30461
* LinuxKPI/OFED: (re)move inetdevice.h implementation | Bjoern A. Zeeb | 2021-03-30 | 2 | -1/+95
  The two functions in linux/inetdevice.h are highly FreeBSD/ifnet specific. This is a result of struct net_device being mapped to struct ifnet. The only known consumer of these functions are two files in the ofed/infiniband code. As a first step of cleaning up copy linux/inetdevice.h to rdma/ib_addr_freebsd.h. (It stayed a separate file to preserve copyright and license of the original file; otherwise it could be merged into ib_addr.h where more EPOCH/vnet/.. are already used). Slightly rename the function to not conflict with LinuxKPI in the future. Remove the three last, now unneeded includes of inetdevice.h and zap linux/inetdevice.h to an empty header file with only the forward include to netdevice.h remaining.
  Sponsored-by: The FreeBSD Foundation
  MFC-after: 2 weeks
  Reviewed-by: hselasky, kib
  X-D-R: D29366 (extracted as further cleanup)
  Differential Revision: https://reviews.freebsd.org/D29434
* Update user access region, UAR, APIs in the core in mlx5core. | Hans Petter Selasky | 2021-01-08 | 1 | -8/+28
  This change includes several changes as listed below, all related to UAR. UAR is a special PCI memory area where the so-called doorbell register and blue flame register live. Blue flame is a feature for sending small packets more efficiently via a PCI memory page, instead of using PCI DMA.
  - All structures and functions named xxx_uuars were renamed into xxx_bfreg.
  - Remove partially implemented Blueflame support from mlx5en(4) and mlx5ib.
  - Implement blue flame register allocator.
  - Use blue flame register allocator in mlx5ib.
  - A common UAR page is now allocated by the core to support doorbell register writes for all of mlx5en and mlx5ib, instead of allocating one UAR per sendqueue.
  - Add support for DEVX query UAR.
  - Add support for 4K UAR for libmlx5.
  Linux commits:
    7c043e908a74ae0a935037cdd984d0cb89b2b970
    2f5ff26478adaff5ed9b7ad4079d6a710b5f27e7
    0b80c14f009758cefeed0edff4f9141957964211
    30aa60b3bd12bd79b5324b7b595bd3446ab24b52
    5fe9dec0d045437e48f112b8fa705197bd7bc3c0
    0118717583cda6f4f36092853ad0345e8150b286
    a6d51b68611e98f05042ada662aed5dbe3279c1e
  MFC after: 1 week
  Sponsored by: Mellanox Technologies // NVIDIA Networking
* infiniband: Appease Coverity | Eric van Gyzen | 2020-08-31 | 1 | -0/+7
  Coverity claims the call to rdma_gid2ip in cma_igmp_send overwrites addr. Use a consistent definition of sockaddr to prevent detections and code changes in the future.
  Submitted by: bret_ketchum@dell.com
  Reported by: Coverity
  Reviewed by: hselasky, kib
  MFC after: 2 weeks
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D26229
  Notes: svn path=/head/; revision=364997
* Prevent potential underflow in ibcore. | Hans Petter Selasky | 2019-11-15 | 1 | -1/+1
  Linux commit: a9018adfde809d44e71189b984fa61cc89682b5e
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=354728
* Correct MR length field to be 64-bit in ibcore. | Hans Petter Selasky | 2019-11-15 | 1 | -1/+1
  Linux commit: edd31551148c09608feee6b8756ad148d550ee3b
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=354727
* VLAN_TRUNKDEV() requires epochification in ibcore after r353292. | Hans Petter Selasky | 2019-10-16 | 1 | -3/+7
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=353633
* OFED: Fix accidental double-copy of rdma_sdp.h in r351176 | Conrad Meyer | 2019-08-18 | 1 | -78/+0
  The mistake came about like this: the first attempt to commit was blocked by a pre-commit hook due to missing SVN tags. svn revert doesn't delete new files, I guess. While reapplying the fixed diff, the non-empty target file was just concatenated with the new contents? Ugh. :-(
  Notes: svn path=/head/; revision=351180
* OFED: Unbreak SDP support in ibcore | Conrad Meyer | 2019-08-17 | 1 | -0/+158
  This regression was introduced in the r326169 Linux v4.9 Infiniband upgrade. Restore the functionality.
  Reviewed by: hselasky
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D21298
  Notes: svn path=/head/; revision=351176
* OFED: Fix ib_mad.h ib_user_mad.h include to match new uapi path | Conrad Meyer | 2019-08-17 | 1 | -1/+1
  Sponsored by: Dell EMC Isilon
  Notes: svn path=/head/; revision=351163
* Add new rates to ibcore. | Hans Petter Selasky | 2019-05-08 | 1 | -1/+7
  Add the new rates that were added to the Infiniband specification as part of HDR and 2x support.
  Submitted by: slavash@
  MFC after: 3 days
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=347301
* Introduce and use sgid_index in CM requests in ibcore. | Hans Petter Selasky | 2018-09-09 | 2 | -1/+19
  For RoCE, when CM requests are received for RC and UD connections, netdevice of the incoming request is unavailable. Because of that CM requests are always forwarded to init_net namespace. Now that we have the GID index available, introduce SGID index in incoming CM requests and refer to the netdevice of it. While at it fix some incorrect uses of init_net and make sure the rdma_create_id() function stores the VNET it is passed.
  Based on linux commit: cee104334c98dd04e9dd4d9a4fa4784f7f6aada9
  MFC after: 3 days
  Approved by: re (gjb)
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=338541
* ibcore: Fix endless loop in searching for matching VLAN device | Slava Shwartsman | 2018-09-06 | 1 | -2/+2
  In r337943 ifnet's if_pcp was set to the PCP value in use instead of IFNET_PCP_NONE. Current ibcore code assumes that if_pcp is IFNET_PCP_NONE with VLAN interfaces so it can identify prio-tagged traffic. Fix that by explicitly verifying that the if_type is IFT_ETHER and not IFT_L2VLAN.
  MFC after: 3 days
  Approved by: re (Marius), hselasky (mentor), kib (mentor)
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=338491
* Only NULL check the VNET pointer when VIMAGE is enabled in ibcore. | Hans Petter Selasky | 2018-07-31 | 1 | -1/+5
  Else a NULL VNET pointer should be ignored. This fixes address resolving when VIMAGE is disabled.
  MFC after: 3 days
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=336964
* Remove blank line. | Hans Petter Selasky | 2018-07-17 | 1 | -1/+0
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=336390
* Check port number supplied by user verbs cmds in ibcore. | Hans Petter Selasky | 2018-07-17 | 1 | -0/+7
  The ib_uverbs_create_ah() and ib_uverbs_modify_qp() calls receive the port number from user input as part of their attributes and assume it is valid. Down on the stack, that parameter is used to access kernel data structures. If the value is invalid, the kernel accesses memory it should not. To prevent this, verify the port number before using it.
  Linux commit:
    5ecce4c9b17bed4dc9cb58bfb10447307569b77b
    a62ab66b13a0f9bcb17b7b761f6670941ed5cd62
    5a7a88f1b488e4ee49eb3d5b82612d4d9ffdf2c3
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=336383
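The validation described above can be sketched as a range check performed before the port number is used as an index. This is a minimal illustration, not the ibcore implementation: struct demo_device, demo_port_is_valid() and demo_port_lookup() are hypothetical names, and the assumption is that a device exposes a contiguous range of valid port numbers.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; only its port range matters here. */
struct demo_device {
	unsigned int first_port;	/* lowest valid port number */
	unsigned int last_port;		/* highest valid port number */
};

/* Return true only when the user-supplied port lies inside the range. */
static bool
demo_port_is_valid(const struct demo_device *dev, unsigned int port)
{
	return (port >= dev->first_port && port <= dev->last_port);
}

/*
 * Look up per-port data. The check happens before the port is turned
 * into an array index, so a bad value yields -EINVAL instead of an
 * out-of-bounds access.
 */
static int
demo_port_lookup(const struct demo_device *dev, const int *per_port,
    unsigned int port, int *out)
{
	if (!demo_port_is_valid(dev, port))
		return (-EINVAL);
	*out = per_port[port - dev->first_port];
	return (0);
}
```

Placing the check in one helper keeps every user-facing entry point from repeating, or forgetting, the range comparison.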
* Check AF family prior to resolving address and introduce safer rdma_addr_size() variants in ibcore. | Hans Petter Selasky | 2018-07-17 | 1 | -0/+2
  Garbage supplied by the user will cause the UCMA module to provide a zero memory size for memcpy(); because it wasn't checked, it will produce unpredictable results in rdma_resolve_addr().
  There are several places in the ucma ABI where userspace can pass in a sockaddr but set the address family to AF_IB. When that happens, rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6, and the ucma kernel code might end up copying past the end of a buffer not sized for a struct sockaddr_ib.
  Fix this by introducing new variants
    int rdma_addr_size_in6(struct sockaddr_in6 *addr);
    int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr);
  that are type-safe for the types used in the ucma ABI and return 0 if the size computed is bigger than the size of the type passed in. We can use these new variants to check what size userspace has passed in before copying any addresses.
  Linux commit:
    2975d5de6428ff6d9317e9948f0968f7d42e5d74
    09abfe7b5b2f442a85f4c4d59ecf582ad76088d7
    84652aefb347297aa08e91e283adf7b18f77c2d5
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=336380
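The size check described above can be sketched in user-space C. This is an illustration under stated assumptions, not the ibcore code: DEMO_AF_IB and struct demo_sockaddr_ib are hypothetical stand-ins for AF_IB and struct sockaddr_ib, and the demo_* functions only mirror the behaviour the commit message describes (return 0 when the computed size does not fit the type passed in).

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical family number and oversized address type for the sketch. */
#define	DEMO_AF_IB	27

struct demo_sockaddr_ib {
	unsigned short	family;
	unsigned char	pad[46];	/* larger than struct sockaddr_in6 */
};

/* Size implied by the address family; 0 for an unknown family. */
static size_t
demo_addr_size(const struct sockaddr *addr)
{
	switch (addr->sa_family) {
	case AF_INET:
		return (sizeof(struct sockaddr_in));
	case AF_INET6:
		return (sizeof(struct sockaddr_in6));
	case DEMO_AF_IB:
		return (sizeof(struct demo_sockaddr_ib));
	default:
		return (0);
	}
}

/*
 * Type-safe variant: the caller's buffer is a sockaddr_in6, so any
 * family whose size exceeds that buffer is reported as 0 (invalid).
 */
static size_t
demo_addr_size_in6(const struct sockaddr_in6 *addr)
{
	size_t len = demo_addr_size((const struct sockaddr *)addr);

	return (len <= sizeof(*addr) ? len : 0);
}
```

A caller would treat a return value of 0 as an error and skip the copy, which covers both garbage families and families too large for the buffer.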
* Add support for prio-tagged traffic for RDMA in ibcore. | Hans Petter Selasky | 2018-07-17 | 1 | -1/+5
  When receiving a PCP change all GID entries are reloaded. This ensures the relevant GID entries use prio tagging, by setting VLAN present and VLAN ID to zero.
  The priority for prio tagged traffic is set using the regular rdma_set_service_type() function.
  Fake the real network device to have a VLAN ID of zero when prio tagging is enabled. This logic is hidden inside the rdma_vlan_dev_vlan_id() function which must always be used to retrieve the VLAN ID throughout all of ibcore and the infiniband network drivers.
  The VLAN presence information then propagates through all of ibcore and so incoming connections will have the VLAN bit set. The incoming VLAN ID is then checked against the return value of rdma_vlan_dev_vlan_id().
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=336372
* Set RoCEv2 MGID according to spec in ibcore. | Hans Petter Selasky | 2018-07-17 | 1 | -1/+7
  RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding IPv4 address is encoded into the GID according to the following rule:
    GID = ::ffff:<IPv4 address>
  Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it zeroed and change rdma_is_multicast_addr() to consider the new logic.
  Linux commit:
    be1d325a335840a86c133a56c6a911c368bac0fd
    1c3aea2bc8f0b2e5b57375ead40457ff75a3a2ec
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=336370
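The mapping rule above can be sketched as follows. This is an illustration, not the ibcore code: struct demo_gid is a hypothetical stand-in for a 128-bit GID, and demo_gid_is_multicast() shows one way a multicast test could account for IPv4-mapped GIDs (IPv6 multicast starts with 0xff; IPv4 multicast is 224.0.0.0/4 inside a ::ffff: prefix).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit GID in network byte order. */
struct demo_gid {
	uint8_t raw[16];
};

/* Encode an IPv4 address as ::ffff:a.b.c.d (prefix bytes stay zero). */
static void
demo_ipv4_to_gid(const uint8_t ip[4], struct demo_gid *gid)
{
	memset(gid->raw, 0, sizeof(gid->raw));
	gid->raw[10] = 0xff;
	gid->raw[11] = 0xff;
	memcpy(&gid->raw[12], ip, 4);
}

/* True when the GID has the ::ffff:0:0/96 IPv4-mapped prefix. */
static bool
demo_gid_is_v4mapped(const struct demo_gid *gid)
{
	static const uint8_t zeros[10];

	return (memcmp(gid->raw, zeros, sizeof(zeros)) == 0 &&
	    gid->raw[10] == 0xff && gid->raw[11] == 0xff);
}

/* Multicast test covering both IPv6 and IPv4-mapped GIDs. */
static bool
demo_gid_is_multicast(const struct demo_gid *gid)
{
	if (gid->raw[0] == 0xff)
		return (true);		/* IPv6 multicast, ff00::/8 */
	return (demo_gid_is_v4mapped(gid) &&
	    (gid->raw[12] & 0xf0) == 0xe0);	/* IPv4 224.0.0.0/4 */
}
```

Because the first byte of an IPv4-mapped multicast GID is zero under this rule, a test that only looks for a leading 0xff would miss it, which is why the second branch is needed.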
* ifnet: Replace if_addr_lock rwlock with epoch + mutex | Matt Macy | 2018-05-18 | 1 | -1/+2
  Run on LLNW canaries and tested by pho@
  gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace.
  When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch:
    InMpps OMpps  InGbs  OGbs      err TCP Est  %CPU syscalls      csw   irq GBfree
      4.98  0.00   4.42  0.00  4235592      33 83.80  4720653  2149771  1235 247.32
      4.73  0.00   4.20  0.00  4025260      33 82.99  4724900  2139833  1204 247.32
      4.72  0.00   4.20  0.00  4035252      33 82.14  4719162  2132023  1264 247.32
      4.71  0.00   4.21  0.00  4073206      33 83.68  4744973  2123317  1347 247.32
      4.72  0.00   4.21  0.00  4061118      33 80.82  4713615  2188091  1490 247.32
      4.72  0.00   4.21  0.00  4051675      33 85.29  4727399  2109011  1205 247.32
      4.73  0.00   4.21  0.00  4039056      33 84.65  4724735  2102603  1053 247.32
  After the patch:
    InMpps OMpps  InGbs  OGbs      err TCP Est  %CPU syscalls      csw   irq GBfree
      5.43  0.00   4.20  0.00  3313143      33 84.96  5434214  1900162  2656 245.51
      5.43  0.00   4.20  0.00  3308527      33 85.24  5439695  1809382  2521 245.51
      5.42  0.00   4.19  0.00  3316778      33 87.54  5416028  1805835  2256 245.51
      5.42  0.00   4.19  0.00  3317673      33 90.44  5426044  1763056  2332 245.51
      5.42  0.00   4.19  0.00  3314839      33 88.11  5435732  1792218  2499 245.52
      5.44  0.00   4.19  0.00  3293228      33 91.84  5426301  1668597  2121 245.52
  Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch.
  Reviewed by: gallatin
  Sponsored by: Limelight Networks
  Differential Revision: https://reviews.freebsd.org/D15366
  Notes: svn path=/head/; revision=333813
* Add IB_SPEED_HDR definition in ibcore. | Hans Petter Selasky | 2018-03-07 | 1 | -1/+2
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=330581
* Optimize ibcore RoCE address handle creation from user-space. | Hans Petter Selasky | 2018-03-05 | 2 | -1/+12
  Creating a UD address handle from user-space or from the kernel-space, when the link layer is ethernet, requires resolving the remote L3 address into a L2 address. Doing this from the kernel is easy because the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily available. In userspace such an interface does not exist and kernel help is required.
  It should be noted that in an IP-based GID environment, the GID itself does not contain all the information needed to resolve the destination IP address. For example information like VLAN ID and SCOPE ID, is not part of the GID and must be fetched from the GID attributes. Therefore a source GID should always be referred to as a GID index. Instead of going through various racy steps to obtain information about the GID attributes from user-space, this is now all done by the kernel.
  This patch optimises the L3 to L2 address resolving using the existing create address handle uverbs interface, retrieving back the L2 address as an additional user-space information structure.
  This commit combines the following Linux upstream commits:
    IB/core: Let create_ah return extended response to user
    IB/core: Change ib_resolve_eth_dmac to use it in create AH
    IB/mlx5: Make create/destroy_ah available to userspace
    IB/mlx5: Use kernel driver to help userspace create ah
    IB/mlx5: Report that device has udata response in create_ah
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=330508
* Get correct network device when accepting incoming RDMA connections in ibcore. | Hans Petter Selasky | 2018-03-05 | 1 | -4/+3
  This patch ensures the GID index is always used as a basis of resolving incoming RDMA connections, as compared to the GID value itself.
  Background: On a per infiniband port basis, the GID identifier is not a unique identifier! This assumption falls apart when VLAN ID, IPv6 scope ID and RoCE type, as supported by RoCE v2, is taken into account. This additional information is stored in the so-called GID attributes and is needed to correctly identify the destination network interface for an incoming connection. Different VLANs are allowed to define the same IPv4 addresses and especially for the default IPv6 link-local addresses or when using so-called containers or jails, this is true. The VNET information for the destination network interface is needed in order to perform the L2 address lookup in the right Virtual Network Stack context.
  Consequently old functions previously used by RoCE v1, like rdma_addr_find_smac_by_sgid() are impossible to support, because there can be multiple identical GIDs associated with the same infiniband port, and the answer to such a request becomes undefined. This function has been removed.
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=330507
* Add missing FreeBSD tags and SVN properties to ibcore. | Hans Petter Selasky | 2018-03-05 | 33 | -18/+117
  MFC after: 1 week
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=330490
* Import the mthca kernel side infiniband driver from Linux 4.9 and fix | Hans Petter Selasky | 2018-02-13 | 1 | -0/+111
  compilation under FreeBSD. The mthca driver was temporarily removed as part of the Linux 4.9 RoCE/infiniband upgrade.
  Top commit in Linux source tree: 69973b830859bc6529a7a0468ba0d80ee5117826
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=329222
* sys: general adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-27 | 15 | -15/+45
  Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  No functional change intended.
  Notes: svn path=/head/; revision=326272
* Merge ^/head r323559 through r325504. | Hans Petter Selasky | 2017-11-07 | 1 | -0/+6
  Notes: svn path=/projects/bsd_rdma_4_9/; revision=325505