path: root/sys/dev/nvme/nvme.c
Commit message | Author | Age | Files | Lines
* Fix panic if NVMe is detached before the intrhook call. | Alexander Motin | 2020-11-12 | 1 | -1/+6
  MFC after: 1 week
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=367625
* nvme: change nvme_request_zone into a malloc type | Mateusz Guzik | 2020-11-05 | 1 | -5/+0
  Both the size (128 bytes) and ephemeral nature of allocations make it a great fit for malloc. A dedicated zone unnecessarily avoids sharing buckets with 128-byte objects.
  Reviewed by: imp
  Differential Revision: https://reviews.freebsd.org/D27103
  Notes: svn path=/head/; revision=367400
* nvme: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -1/+0
  Notes: svn path=/head/; revision=365189
* Fix various Coverity-detected errors in nvme driver | David Bright | 2020-05-02 | 1 | -1/+2
  This fixes several Coverity-detected errors in the nvme driver.
  CIDs addressed: 1008344, 1009377, 1009380, 1193740, 1305470, 1403975, 1403980
  Reviewed by: imp@, vangyzen@
  MFC after: 5 days
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D24532
  Notes: svn path=/head/; revision=360568
* Add KASSERT to ensure sane nsid. | Warner Losh | 2020-05-01 | 1 | -1/+6
  All callers currently filter out bad nsid values before calling this function; however, we'll have undefined behavior if that ever stops being true. Add the KASSERT to prevent that.
  Notes: svn path=/head/; revision=360550
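  A minimal user-space sketch of the idea behind this assertion, not the driver's code. It assumes namespace IDs are 1-based and bounded by a per-controller maximum; assert() stands in for the kernel's KASSERT, and the names nsid_is_sane and nsid_to_index are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a namespace ID is usable only in 1..max_nsid. */
static bool
nsid_is_sane(uint32_t nsid, uint32_t max_nsid)
{
	return (nsid >= 1 && nsid <= max_nsid);
}

/*
 * Hypothetical helper: convert a 1-based nsid into a 0-based array index.
 * The assertion documents the contract that callers filter bad values;
 * without it, nsid == 0 would wrap around to a huge index.
 */
static uint32_t
nsid_to_index(uint32_t nsid, uint32_t max_nsid)
{
	assert(nsid_is_sane(nsid, max_nsid));
	return (nsid - 1);
}
```

  Keeping the check in a separate predicate lets callers filter input without tripping the assertion, while the assertion still catches a caller that forgets to.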
* Move reset to the interrupt processing stage | Warner Losh | 2019-12-11 | 1 | -19/+0
  This trims the boot time a bit more for AWS and other platforms that have nvme drives. There's no reason to do this inline. This has been in my tree a while, but IIRC I talked to Jim Harris about this at one of our face to face meetings.
  MFC After: 2 weeks
  Notes: svn path=/head/; revision=355631
* Implement nvme suspend / resume for pci attachment | Warner Losh | 2019-09-03 | 1 | -3/+4
  When we suspend, we need to properly shut down the NVMe controller. The controller may go into D3 state (or may have the power removed), and to properly flush the metadata to non-volatile RAM, we must complete a normal shutdown. This consists of deleting the I/O queues and setting the shutdown bit. We have to do some extra stuff to make sure we reset the software state of the queues as well.
  On resume, we have to reset the card twice, for reasons described in the attach function. Once we've done that, we can restart the card. If any of this fails, we'll fail the NVMe card, just like we do when a reset fails.
  Set is_resetting for the duration of the suspend / resume. This keeps the reset taskqueue from running a concurrent reset, and also is needed to prevent any hw completions from queueing more I/O to the card.
  Pass resetting flag to nvme_ctrlr_start. It doesn't need to get that from the global state of the ctrlr. Wait for any pending reset to finish. All queued I/O will get sent to the hardware as part of nvme_ctrlr_start(), though the upper layers shouldn't send any down. Disabling the qpairs is the other failsafe to ensure all I/O is queued.
  Rename nvme_ctrlr_destroy_qpairs to nvme_ctrlr_delete_qpairs to avoid confusion with all the other destroy functions. It just removes the queues in hardware, while the other _destroy_ functions tear down driver data structures.
  Split parts of the hardware reset function up so that I can do part of the reset in suspend. Split out the software disabling of the qpairs into nvme_ctrlr_disable_qpairs.
  Finally, fix a couple of spelling errors in comments related to this.
  Relnotes: Yes
  MFC After: 1 week
  Reviewed by: scottl@ (prior version)
  Differential Revision: https://reviews.freebsd.org/D21493
  Notes: svn path=/head/; revision=351747
* It turns out the duplication is only mostly harmless. | Warner Losh | 2019-08-23 | 1 | -0/+16
  While it worked with the kernel, it wasn't working with the loader. It failed to handle dependencies correctly. The reason for that is that we never created a nvme module with the DRIVER_MODULE, but instead a nvme_pci and nvme_ahci module. Create a real nvme module that nvd can be dependent on so it can import the nvme symbols it needs from there.
  Arguably, nvd should just be a simple child of nvme, but transitioning to that (and winning that argument given why it was done this way) is beyond the scope of this change.
  Reviewed by: jhb@
  Differential Revision: https://reviews.freebsd.org/D21382
  Notes: svn path=/head/; revision=351447
* Remove stray line that was duplicated. | Warner Losh | 2019-08-22 | 1 | -1/+0
  Noticed by: rpokala@
  Notes: svn path=/head/; revision=351376
* Separate the pci attachment from the rest of nvme | Warner Losh | 2019-08-21 | 1 | -145/+5
  Nvme drives can be attached in a number of different ways. Separate out the PCI attachment so that we can have other attachment types, like ahci and various types of NVMeoF.
  Submitted by: cognet@
  Notes: svn path=/head/; revision=351355
* Formalize NVMe controller consumer life cycle. | Alexander Motin | 2019-08-21 | 1 | -9/+23
  This fixes a possible double call of fail_fn, for example on hot removal. It also allows ctrlr_fn to safely return a NULL cookie in case of failure and not get a useless ns_fn or fail_fn call with a NULL cookie later.
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=351320
* Use sysctl + CTLRWTUN for hw.nvme.verbose_cmd_dump. | Warner Losh | 2019-07-19 | 1 | -2/+0
  Also convert it to a bool. While the rest of the driver isn't yet bool clean, this will help.
  Reviewed by: cem@
  Differential Revision: https://reviews.freebsd.org/D20988
  Notes: svn path=/head/; revision=350120
* Provide new tunable hw.nvme.verbose_cmd_dump | Warner Losh | 2019-07-18 | 1 | -0/+3
  The nvme driver dumps only the most relevant details about a command when it fails. However, there are times this is not sufficient (such as debugging weird issues for a new drive with a vendor). Setting hw.nvme.verbose_cmd_dump=1 in loader.conf will enable more complete debugging information about each command that fails.
  Reviewed by: rpokala
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D20988
  Notes: svn path=/head/; revision=350118
* Remove now-obsolete comment. | Warner Losh | 2019-07-17 | 1 | -2/+1
  Notes: svn path=/head/; revision=350094
* Remove do-nothing nvme_modevent. | Warner Losh | 2018-11-16 | 1 | -30/+1
  nvme_modevent no longer does anything interesting, so remove it.
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=340481
* Put a workaround in for command timeout malfunctioning | Warner Losh | 2018-10-26 | 1 | -0/+20
  At least one NVMe drive has a bug that makes the Command Time Out PCIe feature unreliable. The workaround is to disable this feature. The driver wouldn't deal correctly with a timeout anyway. Only do this for drives that are known bad.
  Sponsored by: Netflix, Inc
  Differential Revision: https://reviews.freebsd.org/D17708
  Notes: svn path=/head/; revision=339775
* Make NVMe compatible with the original API | Chuck Tuffli | 2018-08-22 | 1 | -5/+1
  The original NVMe API used bit-fields to represent fields in data structures defined by the specification (e.g. the op-code in the command data structure). The implementation targeted x86_64 processors and defined the bit fields for little endian dwords (i.e. 32 bits).
  This approach does not work as-is for big endian architectures and was changed to use a combination of bit shifts and masks to support PowerPC. Unfortunately, this changed the NVMe API and forces #ifdef's based on the OS revision level in user space code.
  This change reverts to something that looks like the original API, but it uses bytes instead of bit-fields inside the packed command structure. As a bonus, this works as-is for both big and little endian CPU architectures.
  Bump __FreeBSD_version to 1200081 due to API change
  Reviewed by: imp, kbowling, smh, mav
  Approved by: imp (mentor)
  Differential Revision: https://reviews.freebsd.org/D16404
  Notes: svn path=/head/; revision=338182
* Refactor NVMe CAM integration. | Alexander Motin | 2018-05-25 | 1 | -0/+18
  - Remove layering violation, when NVMe SIM code accessed CAM internal device structures to set pointers on controller and namespace data. Instead make NVMe XPT probe fetch the data directly from hardware.
  - Cleanup NVMe SIM code, fixing support for multiple namespaces per controller (reporting them as LUNs) and adding controller detach support and run-time namespace change notifications.
  - Add initial support for namespace change async events. So far only in CAM mode, but it allows run-time namespace arrival and departure.
  - Add missing nvme_notify_fail_consumers() call on controller detach. Together with previous changes this allows NVMe device detach/unplug.
  Non-CAM mode still requires a lot of love to stay on par, but at least CAM mode code should not stay in the way so much, becoming much more self-sufficient.
  Reviewed by: imp
  MFC after: 1 month
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=334200
* NVMe: Add big-endian support | Wojciech Macek | 2018-02-22 | 1 | -8/+23
  Remove bitfields from defined structures as they are not portable. Instead use shift and mask macros in the driver and nvmecontrol application.
  NVMe is now working on powerpc64 host.
  Submitted by: Michal Stanek <mst@semihalf.com>
  Obtained from: Semihalf
  Reviewed by: imp, wma
  Sponsored by: IBM, QCM Technologies
  Differential revision: https://reviews.freebsd.org/D13916
  Notes: svn path=/head/; revision=329824
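  To illustrate the shift-and-mask approach described above, here is a small stand-alone sketch rather than the driver's actual macros. The EX_* names are hypothetical; the bit positions follow the 16-bit completion status layout in the NVMe specification (phase in bit 0, status code in bits 8:1, status code type in bits 11:9, do-not-retry in bit 15). The value is assembled from little-endian bytes explicitly, so the result does not depend on host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shift/mask pairs for a 16-bit NVMe completion status. */
#define EX_STATUS_P_SHIFT	0
#define EX_STATUS_P_MASK	0x1
#define EX_STATUS_SC_SHIFT	1
#define EX_STATUS_SC_MASK	0xFF
#define EX_STATUS_SCT_SHIFT	9
#define EX_STATUS_SCT_MASK	0x7
#define EX_STATUS_DNR_SHIFT	15
#define EX_STATUS_DNR_MASK	0x1

/* Extract a named field: shift it down to bit 0, then mask off the rest. */
#define EX_FIELD(val, name) \
	(((val) >> EX_##name##_SHIFT) & EX_##name##_MASK)

/* Load a little-endian 16-bit value byte by byte (host-endian neutral). */
static uint16_t
ex_le16_load(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint16_t)(p[0] | (p[1] << 8)));
}
```

  With the raw bytes 0x05 0x82, the loaded value is 0x8205, which decodes to P = 1, SC = 0x02, SCT = 1 and DNR = 1 on any host, where a C bit-field layout would have been compiler and endian dependent.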
* Use atomic load and stores to ensure that the compiler doesn't | Warner Losh | 2018-01-29 | 1 | -2/+1
  optimize away these loops. Change boolean to int to match what atomic API supplies. Remove wmb() since the atomic_store_rel() on status.done ensures the prior writes to status are ordered before it. It also fixes the fact that there wasn't a rmb() before reading done. This should also be more efficient since wmb() is fairly heavy weight.
  Sponsored by: Netflix
  Reviewed by: kib@, jim harris
  Differential Revision: https://reviews.freebsd.org/D14053
  Notes: svn path=/head/; revision=328521
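  The release/acquire pairing described above can be sketched in portable C11. This is an illustration only: it uses <stdatomic.h> in place of the kernel's atomic_store_rel()/atomic_load_acq() family, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical polled-completion record: payload plus a "done" flag. */
struct ex_poll_status {
	uint16_t	status;	/* written by the completer before done */
	atomic_int	done;	/* int, matching what the atomic API offers */
};

/* Completer side: publish the payload, then set done with release order. */
static void
ex_complete(struct ex_poll_status *s, uint16_t status)
{
	assert(s != NULL);
	s->status = status;
	/* Release: the write to status cannot be reordered after this store. */
	atomic_store_explicit(&s->done, 1, memory_order_release);
}

/* Waiter side: spin on done with acquire order, then read the payload. */
static uint16_t
ex_wait(struct ex_poll_status *s)
{
	assert(s != NULL);
	/* The atomic load must be re-issued each pass; it cannot be hoisted. */
	while (atomic_load_explicit(&s->done, memory_order_acquire) == 0)
		;
	return (s->status);
}
```

  The acquire load plays the role of the read barrier the commit message says was missing, and the release store replaces the separate wmb().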
* When we're disabling the nvme device, some drives have a controller | Warner Losh | 2017-12-18 | 1 | -0/+19
  bug that requires 'hands off' for a period of time (2.3s) before we check the RDY bit. Since this is a very odd quirk for a very limited selection of drives, do this as a quirk. This prevented a successful reset of the card when the card wedged.
  Also, make sure that we comply with the advice from section 3.1.5 of the 1.3 spec, which says that transitioning CC.EN from 0 to 1 when CSTS.RDY is 1 or transitioning CC.EN from 1 to 0 when CSTS.RDY is 0 "has undefined results". Short circuit when EN == RDY == desired state.
  Finally, fail the reset if the disable fails. This will lead to a failed device, which is what we want. (note: nda device needs work for coping with a failed device).
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D13389
  Notes: svn path=/head/; revision=326937
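  The EN/RDY rule quoted above can be expressed as a small decision function. This is a sketch of the rule as stated in the commit message, not the driver's enable/disable code; the enum and function names are hypothetical, and the register reads are replaced by plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes for one step of an enable/disable request. */
enum ex_action {
	EX_ALREADY_THERE,	/* EN == RDY == desired: short circuit */
	EX_WAIT_FOR_RDY,	/* EN already as desired, RDY still catching up */
	EX_WAIT_THEN_FLIP_EN,	/* RDY must reach EN before EN may change */
	EX_FLIP_EN		/* RDY == EN, so changing EN is well defined */
};

/* Decide what to do given the current CC.EN, CSTS.RDY and the goal. */
static enum ex_action
ex_next_action(bool en, bool rdy, bool want_enabled)
{
	if (en == want_enabled)
		return (rdy == want_enabled ? EX_ALREADY_THERE :
		    EX_WAIT_FOR_RDY);
	/* 0->1 with RDY == 1, or 1->0 with RDY == 0, is undefined. */
	if (rdy != en)
		return (EX_WAIT_THEN_FLIP_EN);
	return (EX_FLIP_EN);
}

/* Flipping EN is only defined once RDY has caught up with EN. */
static bool
ex_flip_en(bool en, bool rdy)
{
	assert(en == rdy);
	return (!en);
}
```

  The 2.3 second hands-off quirk is not modeled here; in this sketch it would belong between writing EN and the first RDY poll.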
* sys/dev: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-27 | 1 | -0/+2
  Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task.
  The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
  Notes: svn path=/head/; revision=326255
* The nvme module should explicitly declare dependency on the cam. | Konstantin Belousov | 2017-08-31 | 1 | -0/+1
  If both nvme and cam are compiled as modules, nvme cannot be kldloaded otherwise.
  Reviewed by: imp
  Sponsored by: The FreeBSD Foundation
  MFC after: 1 week
  Notes: svn path=/head/; revision=323054
* Enable bus mastering on the device before resetting the device. The | Warner Losh | 2017-08-25 | 1 | -2/+6
  card has to do PCIe transactions to complete the reset process, but can't do them, per the PCIe spec, unless bus mastering is enabled.
  Submitted by: Kinjal Patel
  PR: 22166
  Notes: svn path=/head/; revision=322872
* Move NVME controller shutdown from being called as part of module unloading | Nathan Whitehorn | 2017-08-12 | 1 | -15/+7
  to being called through the newbus DEVICE_SHUTDOWN() path. This ensures that the NVME controller gets shut down before the device and bus disappear and prevents data corruption on shutdown on at least Samsung EVO 960 SSDs.
  PR: kern/211852
  Reviewed by: imp
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=322443
* Make multi-namespace nvme drives more robust. | Warner Losh | 2017-03-07 | 1 | -1/+3
  Fix assumptions about name spaces in NVME driver. First, it assumes cdata.nn is the number of configured devices. However, it is the number of supported name spaces. Second, it assumes that there will never be more than 16 name spaces supported, but a certain drive I'm testing reports 1024. It assumes that name spaces are a tightly packed namespace, but the standard seems to indicate otherwise. Finally, it assumes that an error would be generated when querying an unconfigured namespace. Instead, it succeeds but the identify data is all zeros.
  Fix these by limiting the number of name spaces we probe to 16. Remove aborting when we find one in error. When the size of the name space is zero, ignore it.
  This is admittedly a band-aid. The long term fix will be to participate in the enumeration and name space change protocols defined in the NVMe standard.
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=314884
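  The probing policy described above (cap the scan at 16, skip zero-sized entries instead of aborting) might look roughly like this. It is a user-space sketch under stated assumptions: ex_nsdata is a hypothetical stand-in for the identify-namespace data, and the caller is assumed to have already filled in at least min(nn, 16) entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_NAMESPACES	16	/* probe limit named in the commit */

/* Hypothetical stand-in for identify data; nsze == 0 means unconfigured. */
struct ex_nsdata {
	uint64_t	nsze;	/* namespace size in logical blocks */
};

/*
 * Count usable namespaces. nn is the number of *supported* namespaces
 * reported by the controller, which may far exceed what is configured.
 */
static uint32_t
ex_count_usable(const struct ex_nsdata *ns, uint32_t nn)
{
	uint32_t limit = nn < EX_MAX_NAMESPACES ? nn : EX_MAX_NAMESPACES;
	uint32_t usable = 0;

	assert(ns != NULL || limit == 0);
	for (uint32_t i = 0; i < limit; i++) {
		if (ns[i].nsze == 0)
			continue;	/* all-zero identify data: ignore it */
		usable++;
	}
	return (usable);
}
```

  Because empty slots are skipped rather than treated as the end of the list, a sparse layout such as namespaces 1 and 4 being configured is still found.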
* nvme: do not revert to single I/O queue when per-CPU queues not possible | Jim Harris | 2016-01-07 | 1 | -2/+0
  Previously nvme(4) would revert to a single I/O queue if it could not allocate enough interrupt vectors or NVMe submission/completion queues to have one I/O queue per core. This patch determines how to utilize a smaller number of available interrupt vectors, and assigns (as closely as possible) an equal number of cores to each associated I/O queue.
  MFC after: 3 days
  Sponsored by: Intel
  Notes: svn path=/head/; revision=293328
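  One simple way to spread cores over a smaller number of I/O queues is ceiling division, sketched below. This is an assumed scheme for illustration; the function names are hypothetical and the commit message does not spell out the exact arithmetic the driver uses.

```c
#include <assert.h>
#include <stdint.h>

/* Cores served by each queue: ceil(ncpus / nqueues). */
static uint32_t
ex_cpus_per_queue(uint32_t ncpus, uint32_t nqueues)
{
	assert(ncpus > 0 && nqueues > 0);
	return ((ncpus + nqueues - 1) / nqueues);
}

/* Map a CPU index to the queue that serves it. */
static uint32_t
ex_queue_for_cpu(uint32_t cpu, uint32_t ncpus, uint32_t nqueues)
{
	assert(cpu < ncpus);
	return (cpu / ex_cpus_per_queue(ncpus, nqueues));
}
```

  With 8 cores and 3 queues this gives groups of 3, 3 and 2 cores. A side effect of rounding up is that some queue counts leave the last queue unused (8 cores over 5 queues only ever selects queues 0 to 3), so a caller may want to trim the queue count to match.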
* nvme: do not notify a consumer about failures that occur during initialization | Jim Harris | 2015-07-29 | 1 | -0/+9
  MFC after: 3 days
  Sponsored by: Intel
  Notes: svn path=/head/; revision=286043
* nvme: remove CHATHAM related code | Jim Harris | 2015-04-08 | 1 | -1/+0
  Chatham was an internal NVMe prototype board used for early driver development.
  MFC after: 1 week
  Sponsored by: Intel
  Notes: svn path=/head/; revision=281283
* nvme: add device strings for Intel DC series NVMe SSDs | Jim Harris | 2015-04-08 | 1 | -10/+38
  MFC after: 1 week
  Sponsored by: Intel
  Notes: svn path=/head/; revision=281282
* nvme: Close hole where nvd(4) would not be notified of all nvme(4) | Jim Harris | 2014-03-18 | 1 | -26/+62
  instances if modules loaded during boot.
  Sponsored by: Intel
  MFC after: 3 days
  Notes: svn path=/head/; revision=263310
* Do not leak resources during attach if nvme_ctrlr_construct() or the initial | Jim Harris | 2013-10-08 | 1 | -3/+9
  controller resets fail.
  Sponsored by: Intel
  Reviewed by: carl
  Approved by: re (hrs)
  MFC after: 1 week
  Notes: svn path=/head/; revision=256155
* If a controller fails to initialize, do not notify consumers (nvd) of its | Jim Harris | 2013-08-13 | 1 | -0/+9
  namespaces.
  Sponsored by: Intel
  Reviewed by: carl
  MFC after: 3 days
  Notes: svn path=/head/; revision=254303
* Send a shutdown notification in the driver unload path, to ensure | Jim Harris | 2013-08-13 | 1 | -17/+1
  notification gets sent in cases where system shuts down with driver unloaded.
  Sponsored by: Intel
  Reviewed by: carl
  MFC after: 3 days
  Notes: svn path=/head/; revision=254302
* Add message when nvd disks are attached and detached. | Jim Harris | 2013-07-19 | 1 | -1/+0
  As part of this commit, add an nvme_strvis() function which borrows heavily from cam_strvis(). This will allow stripping of leading/trailing whitespace and also handle unprintable characters in model/serial numbers. This function goes into a new nvme_util.c file which is used by both the driver and nvmecontrol.
  Sponsored by: Intel
  Reviewed by: carl
  MFC after: 3 days
  Notes: svn path=/head/; revision=253476
* Update copyright dates. | Jim Harris | 2013-07-09 | 1 | -1/+1
  MFC after: 3 days
  Notes: svn path=/head/; revision=253112
* Add pci_enable_busmaster() and pci_disable_busmaster() calls in | Jim Harris | 2013-07-09 | 1 | -0/+3
  nvme_attach() and nvme_detach() respectively.
  Sponsored by: Intel
  MFC after: 3 days
  Notes: svn path=/head/; revision=253107
* Move the busdma mapping functions to nvme_qpair.c. | Jim Harris | 2013-04-12 | 1 | -37/+0
  This removes nvme_uio.c completely.
  Sponsored by: Intel
  Notes: svn path=/head/; revision=249420
* Do not panic when a busdma mapping operation fails. | Jim Harris | 2013-04-12 | 1 | -1/+7
  Instead, print an error message and fail the associated command with DATA_TRANSFER_ERROR NVMe completion status.
  Sponsored by: Intel
  Notes: svn path=/head/; revision=249416
* Replace usages of mtx_pool_find used for admin commands with a polling | Jim Harris | 2013-03-26 | 1 | -0/+15
  mechanism. Now that all requests are timed, we are guaranteed to get a completion notification, even if it is an abort status due to a timed out admin command.
  This has the effect of simplifying the controller and namespace setup code, so that it reads straight through rather than being broken up into a bunch of different callback functions.
  Sponsored by: Intel
  Reviewed by: carl
  Notes: svn path=/head/; revision=248769
* Add the ability to internally mark a controller as failed, if it is unable to | Jim Harris | 2013-03-26 | 1 | -1/+17
  start or reset. Also add a notifier for NVMe consumers for controller fail conditions and plumb this notifier for nvd(4) to destroy the associated GEOM disks when a failure occurs.
  This requires a bit of work to cover the races when a consumer is sending I/O requests to a controller that is transitioning to the failed state. To help cover this condition, add a task to defer completion of I/Os submitted to a failed controller, so that the consumer will still always receive its completions in a different context than the submission.
  Sponsored by: Intel
  Reviewed by: carl
  Notes: svn path=/head/; revision=248767
* Remove the is_started flag from struct nvme_controller. | Jim Harris | 2013-03-26 | 1 | -1/+3
  This flag was originally added to communicate to the sysctl code which oids should be built, but there are easier ways to do this. This needs to be cleaned up prior to adding new controller states - for example, controller failure.
  Sponsored by: Intel
  Reviewed by: carl
  Notes: svn path=/head/; revision=248763
* Cap the number of retry attempts to a configurable number. This ensures | Jim Harris | 2013-03-26 | 1 | -1/+2
  that if a specific I/O repeatedly times out, we don't retry it indefinitely. The default number of retries is 4, and can be adjusted using hw.nvme.retry_count.
  Sponsored by: Intel
  Reviewed by: carl
  Notes: svn path=/head/; revision=248761
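  A retry cap of this kind reduces to a per-request counter checked against the limit. The sketch below is illustrative only: ex_request and ex_try_retry are hypothetical names, and the limit is passed in as a parameter where the driver would read its hw.nvme.retry_count tunable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_DEFAULT_RETRY_COUNT	4	/* default named in the commit */

/* Hypothetical request record carrying its own retry counter. */
struct ex_request {
	uint32_t	retries;	/* retries already consumed */
};

/*
 * Returns true and consumes one retry if the request may be resubmitted;
 * returns false once the cap is reached, so the caller fails the I/O.
 */
static bool
ex_try_retry(struct ex_request *req, uint32_t retry_count)
{
	assert(req != NULL);
	if (req->retries >= retry_count)
		return (false);
	req->retries++;
	return (true);
}
```

  With the default of 4, a request that keeps timing out is resubmitted four times and then failed back to the consumer on the fifth timeout.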
* Pass associated log page data to async event consumers, if requested. | Jim Harris | 2013-03-26 | 1 | -2/+5
  Sponsored by: Intel
  Reviewed by: carl
  Notes: svn path=/head/; revision=248760
* Create struct nvme_status. | Jim Harris | 2013-03-26 | 1 | -2/+2
  NVMe error log entries include status, so breaking this out into its own data structure allows it to be included in both the nvme_completion data structure as well as error log entry data structures.
  While here, expose nvme_completion_is_error(), and change all of the places that were explicitly looking at sc/sct bits to use this macro instead.
  Sponsored by: Intel
  Reviewed by: carl
  Notes: svn path=/head/; revision=248756
* Add controller reset capability to nvme(4) and ability to explicitly | Jim Harris | 2013-03-26 | 1 | -3/+3
  invoke it from nvmecontrol(8).
  Controller reset will be performed in cases where I/Os are repeatedly timing out, the controller reports an unrecoverable condition, or when explicitly requested via IOCTL or an nvme consumer. Since the controller may be in such a state where it cannot even process queue deletion requests, we will perform a controller reset without trying to clean up anything on the controller first.
  Sponsored by: Intel
  Reviewed by: carl
  Notes: svn path=/head/; revision=248746
* Add an interface for nvme shim drivers (i.e. nvd) to register for | Jim Harris | 2013-03-26 | 1 | -14/+49
  notifications when new nvme controllers are added to the system.
  Sponsored by: Intel
  Notes: svn path=/head/; revision=248738
* Move controller destruction code from nvme_detach() to new nvme_ctrlr_destruct() | Jim Harris | 2013-03-26 | 1 | -46/+1
  function.
  Sponsored by: Intel
  Notes: svn path=/head/; revision=248736
* Fix GCC build: | David E. O'Brien | 2013-03-07 | 1 | -5/+3
  /usr/src/sys/modules/nvme/../../dev/nvme/nvme.c:211: warning: format '%qx' expects type 'long unsigned int', but argument 9 has type 'long long unsigned int' [-Wformat]
  Notes: svn path=/head/; revision=247963
* Map BAR 4/5, because NVMe spec says devices may place the MSI-X table | Jim Harris | 2012-12-18 | 1 | -0/+5
  behind BAR 4/5, rather than in BAR 0/1 with the control/doorbell registers.
  Sponsored by: Intel
  Notes: svn path=/head/; revision=244413