path: root/sys/geom/raid/g_raid.c
Commit message | Author | Age | Files | Lines
* Make g_attach() return ENXIO for orphaned providers; update various | Edward Tomasz Napierala | 2020-10-18 | 1 | -1/+3
  classes to add missing error checking.
  Reviewed by: imp
  MFC after: 2 weeks
  Sponsored by: NetApp, Inc.
  Sponsored by: Klara, Inc.
  Differential Revision: https://reviews.freebsd.org/D26658
  Notes: svn path=/head/; revision=366811
* sys/geom: consistently use _PATH_DEV instead of hardcoding "/dev/". | Xin LI | 2020-07-09 | 1 | -1/+1
  Reviewed by: cem
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D25565
  Notes: svn path=/head/; revision=363034
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 1 | -1/+2
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
  still not MPSAFE (or already are but aren’t properly marked). Use it in
  preparation for a general review of all nodes.
  This is a non-functional change that adds annotations to SYSCTL_NODE and
  SYSCTL_PROC nodes using one of the soon-to-be-required flags.
  Mark all obvious cases as MPSAFE. All entries that haven't been marked as
  MPSAFE before are by default marked as NEEDGIANT.
  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* Pass BIO_SPEEDUP through all the geom layers | Warner Losh | 2020-01-17 | 1 | -0/+1
  While some geom layers pass unknown commands down, not all do. For the ones
  that don't, pass BIO_SPEEDUP down to the providers that constitute the geom,
  as applicable. No changes to vinum or virstor because I was unsure how to add
  this support, and I'm also unsure how to test these. gvinum doesn't implement
  BIO_FLUSH either, so it may just be poorly maintained. gvirstor is for testing
  and not supporting BIO_SPEEDUP is fine.
  Reviewed by: chs
  Differential Revision: https://reviews.freebsd.org/D23183
  Notes: svn path=/head/; revision=356818
* GEOM: Reduce unnecessary log interleaving with sbufs | Conrad Meyer | 2019-08-07 | 1 | -0/+1
  Similar to what was done for device_printfs in r347229.
  Convert g_print_bio() to a thin shim around g_format_bio(), which acts on an
  sbuf; documented in g_bio.9.
  Reviewed by: markj
  Discussed with: rlibby
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D21165
  Notes: svn path=/head/; revision=350694
* Use sbuf_cat() in GEOM confxml generation. | Alexander Motin | 2019-06-19 | 1 | -10/+10
  When it comes to megabytes of text, the difference between sbuf_printf() and
  sbuf_cat() becomes substantial.
  MFC after: 2 weeks
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=349195
* Use g_handleattr() to reply to GEOM::candelete queries. | Mark Johnston | 2019-01-02 | 1 | -8/+4
  g_handleattr() fills out bp->bio_completed; otherwise, g_getattr() returns an
  error in response to the query. This caused BIO_DELETE support to not be
  propagated through stacked configurations, e.g., a gconcat of gmirror volumes
  would not handle BIO_DELETE even when the gmirrors do. g_io_getattr() was not
  affected by the problem.
  PR: 232676
  Reported and tested by: noah.bergbauer@tum.de
  MFC after: 1 week
  Notes: svn path=/head/; revision=342687
* Extend stripeoffset and stripesize of GEOMs from u_int to off_t | Eugene Grosbein | 2018-10-27 | 1 | -1/+1
  GEOM's stripeoffset overflows at the 4 gigabyte boundary (2^32) because of
  its u_int type. This leads to incorrect data in the output generated by the
  "sysctl kern.geom.confxml" command, "graid list" etc. when a GEOM array has
  volumes larger than 4G, for example.
  This change does not affect ABI but changes KBI. No MFC planned.
  Differential Revision: https://reviews.freebsd.org/D13426
  Notes: svn path=/head/; revision=339815
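The overflow described in this entry can be illustrated in userland C. This is a minimal sketch under stated assumptions, not the kernel code: both function names are hypothetical, `uint32_t` stands in for the old `u_int` field, and `int64_t` stands in for `off_t`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the value that survives when a 64-bit byte offset
 * is stored in a 32-bit unsigned field (the pre-change u_int behaviour).
 * Conversion to an unsigned type reduces the value modulo 2^32.
 */
uint32_t
stripeoffset_as_u32(int64_t offset)
{
	return ((uint32_t)offset);
}

/*
 * Hypothetical helper: the same offset kept in a 64-bit signed field
 * (the post-change off_t behaviour on FreeBSD). Nothing is lost.
 */
int64_t
stripeoffset_as_i64(int64_t offset)
{
	return (offset);
}
```

For an offset of 2^32 + 512 bytes, the 32-bit variant keeps only 512, which is the kind of wrong value the entry reports in confxml output, while the 64-bit variant keeps the full offset.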
* Do a pass removing some write-only variables from the kernel. | Alexander Kabaev | 2017-12-25 | 1 | -2/+0
  This reduces noise when the kernel is compiled by newer GCC versions, such as
  the one used by external toolchain ports.
  Reviewed by: kib, andrew (sys/arm and sys/arm64), emaste (partial), erj (partial)
  Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
  Differential Revision: https://reviews.freebsd.org/D10385
  Notes: svn path=/head/; revision=327173
* sys/geom: adoption of SPDX licensing ID tags. | Pedro F. Giffuni | 2017-11-27 | 1 | -0/+2
  Mainly focus on files that use the BSD 2-Clause license, however the tool I
  was using misidentified many licenses so this was mostly a manual - error
  prone - task.
  The Software Package Data Exchange (SPDX) group provides a specification to
  make it easier for automated tools to detect and summarize well known open
  source licenses. We are gradually adopting the specification, noting that the
  tags are considered only advisory and do not, in any way, supersede or
  replace the license texts.
  Notes: svn path=/head/; revision=326270
* Removal of Giant-dropping wrappers for GEOM classes. | Konstantin Belousov | 2016-05-20 | 1 | -2/+0
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=300288
* Create an API to reset a struct bio (g_reset_bio). This is mandatory | Warner Losh | 2016-02-17 | 1 | -1/+1
  for all struct bio you get back from g_{new,alloc}_bio. Temporary bios that
  you create on the stack or elsewhere should use this before first use of the
  bio, and between uses of the bio. At the moment, it is nothing more than a
  wrapper around bzero, but that may change in the future. The wrapper also
  removes one place where we encode the size of struct bio in the KBI.
  Notes: svn path=/head/; revision=295707
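The "wrapper around bzero" idea in this entry can be sketched in userland C. Everything named here is a hypothetical stand-in: `struct sketch_bio` has only a few illustrative fields, unlike the real `struct bio`, and `sketch_reset_bio()` mirrors the described behaviour of g_reset_bio() without being its actual source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct bio; the real structure is much larger. */
struct sketch_bio {
	int	cmd;
	long	offset;
	long	length;
	int	error;
};

/*
 * Hypothetical analogue of g_reset_bio(): zero the whole structure so a
 * reused bio carries no state from its previous use. Callers never spell
 * out the structure size themselves; only this one function does.
 */
void
sketch_reset_bio(struct sketch_bio *bp)
{
	memset(bp, 0, sizeof(*bp));
}
```

Routing every reset through one function means a later change to the structure, or to what a reset has to do, touches a single place rather than every caller.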
* Remove compatibility shims for legacy ATA device names. | Alexander Motin | 2015-10-11 | 1 | -20/+0
  We got the new ATA stack in FreeBSD 8.x, switched to it at 9.x, and
  completely removed the old stack at 10.x, so at 11.x it is time to remove
  the compat shims.
  Notes: svn path=/head/; revision=289137
* Remove request sorting from GEOM_MIRROR and GEOM_RAID. | Alexander Motin | 2015-03-27 | 1 | -3/+3
  When the CPU is not busy, those queues are typically empty. When the CPU is
  busy, one more extra sorting is the last thing it needs. If a specific device
  (HDD) really needs sorting, then it will be done later by CAM.
  This is supposed to fix a livelock reported for a mirror of two SSDs, when
  UFS fires a zillion BIO_DELETE requests, which totally blocks the I/O
  subsystem by pointless sorting of requests and responses under a single
  mutex lock.
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=280757
* Follow up to r225617. In order to maximize the re-usability of kernel code | Davide Italiano | 2014-10-16 | 1 | -1/+1
  in userland rename in-kernel getenv()/setenv() to kern_getenv()/kern_setenv().
  This fixes a namespace collision with libc symbols.
  Submitted by: kmacy
  Tested by: make universe
  Notes: svn path=/head/; revision=273174
* Pull in r267961 and r267973 again. Fix for issues reported will follow. | Hans Petter Selasky | 2014-06-28 | 1 | -21/+10
  Notes: svn path=/head/; revision=267992
* Revert r267961, r267973: | Glen Barber | 2014-06-27 | 1 | -10/+21
  These changes prevent sysctl(8) from returning proper output, such as:
  1) no output from sysctl(8)
  2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
     truss: can not get etype: Cannot allocate memory
  Notes: svn path=/head/; revision=267985
* Extend the meaning of the CTLFLAG_TUN flag to automatically check if | Hans Petter Selasky | 2014-06-27 | 1 | -21/+10
  there is an environment variable which shall initialize the SYSCTL during
  early boot. This works for all SYSCTL types, both statically and dynamically
  created ones, except for the SYSCTL NODE type and SYSCTLs which belong to
  VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a
  tunable sysctl has a custom initialisation function, allowing the sysctl to
  still be marked as a tunable. The kernel SYSCTL API is mostly the same, with
  a few exceptions for some special operations like iterating children of a
  static/extern SYSCTL node. This operation should probably be made into a
  factored out common macro, since some device drivers use this. The reason
  for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and
  not only the SYSCTL parent OID list pointer in order to quickly generate the
  sysctl path. The motivation behind this patch is to avoid parameter loading
  kludges inside the OFED driver subsystem. Instead of adding special code to
  the OFED driver subsystem to post-load tunables into dynamically created
  sysctls, we generalize this in the kernel.
  Other changes:
  - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to
    "hw.pcic.intr_mask".
  - Removed redundant TUNABLE statements throughout the kernel.
  - Some minor code rewrites in connection to removing not needed TUNABLE
    statements.
  - Added a missing SYSCTL_DECL().
  - Wrapped two very long lines.
  - Avoid malloc()/free() inside sysctl string handling, in case it is called
    to initialize a sysctl from a tunable, since malloc()/free() is not ready
    when sysctls from the sysctl dataset are registered.
  - Bumped FreeBSD version to indicate SYSCTL API change.
  MFC after: 2 weeks
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=267961
* Reduce number of opens by GEOM RAID during provider taste. | Alexander Motin | 2014-04-28 | 1 | -1/+7
  Instead of opening/closing the provider in each of the metadata classes, do
  it only once in core code. Since for SCSI disks open/close means sending some
  SCSI commands to the device, this change reduces taste time.
  MFC after: 2 weeks
  Sponsored by: iXsystems, Inc.
  Notes: svn path=/head/; revision=265054
* Merge GEOM direct dispatch changes from the projects/camlock branch. | Alexander Motin | 2013-10-22 | 1 | -0/+3
  When safety requirements are met, it allows to avoid passing I/O requests to
  the GEOM g_up/g_down threads, executing them directly in the caller context.
  That avoids CPU bottlenecks in the g_up/g_down threads, plus avoids several
  context switches per I/O.
  The safety requirements currently defined are:
  - caller should not hold any locks and should be reentrant;
  - callee should not depend on GEOM dual-threaded concurrency semantics;
  - on the way down, if request is unmapped while callee doesn't support it,
    the context should be sleepable;
  - kernel thread stack usage should be below 50%.
  To keep compatibility with GEOM classes not meeting the above requirements,
  new provider and consumer flags were added:
  - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
  - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
  - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
  - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
  A capable GEOM class can set them, allowing direct dispatch in cases where
  it is safe. If any of the requirements are not met, the request is queued to
  the g_up or g_down thread same as before.
  The following GEOM classes were reviewed and updated to support direct
  dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID,
  STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI
  (LABEL, MAP, FLASHMAP, etc).
  To declare direct completion capability, the disk(9) KPI got a new flag
  equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. The da(4) and
  ada(4) disk drivers got it set now thanks to earlier CAM locking work.
  This change more than doubles peak block storage performance on systems with
  many CPUs, together with earlier CAM locking changes reaching more than
  1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to 256
  user-level threads).
  Sponsored by: iXsystems, Inc.
  MFC after: 2 months
  Notes: svn path=/head/; revision=256880
* MFprojects/camlock r256445: | Alexander Motin | 2013-10-16 | 1 | -7/+15
  Add unmapped I/O support to GEOM RAID.
  Notes: svn path=/head/; revision=256610
* Return error when opening read-only volumes (like RAID4/5/...) for writing. | Alexander Motin | 2013-08-13 | 1 | -0/+5
  Previously opens succeeded, but actual write operations returned errors.
  Requested by: peter
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=254275
* Introduce 3 seconds timeout on `graid stop` command (mostly with -f flag). | Alexander Motin | 2013-07-27 | 1 | -10/+7
  Since completion waiting happens in the g_event thread, it may cause a GEOM
  deadlock if the consumer on top (for example, ZFS) uses the g_event thread
  for closing.
  Notes: svn path=/head/; revision=253706
* Return "descr" field like "Intel RAID1 volume" for GEOM RAID to make | Alexander Motin | 2013-04-27 | 1 | -0/+4
  it look better in bsdinstall.
  Notes: svn path=/head/; revision=249974
* Add legacy support to geom raid to create a /dev/arX device for support | Sean Bruno | 2013-03-08 | 1 | -0/+22
  of upgrading older machines using ataraid(4) to newer releases.
  This optional parameter is controlled via kern.geom.raid.legacy_aliases and
  will create a /dev/ar0 device that will point at /dev/raid/r0 for example.
  Tested on Dell SC 1425 DDF-1 format software raid controllers installing
  from stable/7 and upgrading to stable/9 without having to adjust /etc/fstab
  Reviewed by: mav
  Obtained from: Yahoo!
  MFC after: 2 Weeks
  Notes: svn path=/head/; revision=248068
* Improve support for disabled disks. If a disabled disk is disconnected and then | Alexander Motin | 2013-01-13 | 1 | -1/+1
  reconnected, leave it as disconnected. If a new disk is inserted instead of
  the disabled one, rebuild it and leave it as enabled.
  Notes: svn path=/head/; revision=245363
* Add basic support for Intel Rapid Recover Technology (Intel RRT). | Alexander Motin | 2013-01-12 | 1 | -1/+5
  It is similar to RAID1, but with dedicated master and recovery disks and
  manual control over synchronization. It allows the recovery disk to be used
  as a snapshot of the master disk from the time of the last sync.
  This implementation is not functionally complete compared to Windows, but it
  is better than silent conversion to RAID1 on first boot.
  Notes: svn path=/head/; revision=245326
* Add basic BIO_DELETE support to GEOM RAID class for all RAID levels. | Alexander Motin | 2012-10-29 | 1 | -1/+56
  If at least one subdisk in the volume supports it, BIO_DELETE requests will
  be propagated down. Unfortunately, for RAID levels with redundancy, unmapped
  blocks will be mapped back during the first rebuild/resync process.
  Sponsored by: iXsystems, Inc.
  MFC after: 1 month
  Notes: svn path=/head/; revision=242323
* Make GEOM RAID more aggressive in marking volumes as clean on shutdown | Alexander Motin | 2012-10-29 | 1 | -17/+20
  and move that action from shutdown_pre_sync to shutdown_post_sync stage to
  avoid extra flapping.
  ZFS tends to not close devices on shutdown, which doesn't allow GEOM RAID to
  shut down gracefully. To handle that, mark the volume as clean just when
  shutdown time comes and there are no active writes.
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=242314
* Add global and per-module sysctls/tunables to enable/disable metadata taste. | Alexander Motin | 2012-09-13 | 1 | -0/+10
  That should help to handle some cases when disk has some RAID metadata that
  should be ignored, especially during boot.
  MFC after: 3 days
  Notes: svn path=/head/; revision=240465
* Add missing FAILED event to g_raid_subdisk_event2str() to print it properly | Alexander Motin | 2012-08-10 | 1 | -0/+2
  in debug messages.
  Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>
  Notes: svn path=/head/; revision=239175
* Add support for RAID5R. Slightly improve support for RAIDMDF. | Alexander Motin | 2012-05-06 | 1 | -0/+1
  Notes: svn path=/head/; revision=235076
* Implement read-only support for volumes in optimal state (without using | Alexander Motin | 2012-05-04 | 1 | -4/+4
  redundancy) for the following RAID levels: RAID4/5E/5EE/6/MDF.
  Notes: svn path=/head/; revision=234993
* Add optional -o argument to `graid label` to specify some metadata | Alexander Motin | 2012-05-03 | 1 | -3/+4
  format options. Use it for specifying byte order for the DDF metadata:
  big-endian defined by specification and little-endian used by Adaptec.
  Notes: svn path=/head/; revision=234940
* s/gmirror/graid/ | Alexander Motin | 2012-04-29 | 1 | -2/+2
  Notes: svn path=/head/; revision=234816
* Fix copy-paste typo in r234603. | Alexander Motin | 2012-04-23 | 1 | -2/+2
  Submitted by: kan
  Notes: svn path=/head/; revision=234610
* Add names for all primary RAID levels defined by DDF 2.0 specification. | Alexander Motin | 2012-04-23 | 1 | -17/+147
  Notes: svn path=/head/; revision=234603
* Add to GEOM RAID class a module for reading non-degraded RAID5 volumes and | Alexander Motin | 2012-04-19 | 1 | -2/+21
  some environment to differentiate 4 possible RAID5 on-disk layouts.
  Tested with Intel and AMD RAID BIOSes.
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=234458
* Include sys/sbuf.h directly. | Andrey V. Elsukov | 2011-07-11 | 1 | -0/+1
  Reviewed by: pjd
  Notes: svn path=/head/; revision=223921
* Reduce geom_raid log verbosity. | Alexander Motin | 2011-04-18 | 1 | -1/+1
  Notes: svn path=/head/; revision=220790
* Bunch of small bugfixes and cleanups. | Alexander Motin | 2011-03-31 | 1 | -4/+1
  Found with: Clang Static Analyzer
  Notes: svn path=/head/; revision=220210
* MFgraid/head: | Alexander Motin | 2011-03-24 | 1 | -0/+2340
  Add new RAID GEOM class, that is going to replace ataraid(4) in supporting
  various BIOS-based software RAIDs. Unlike ataraid(4), this implementation
  does not depend on the legacy ata(4) subsystem and can be used with any disk
  drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4) with
  `options ATA_CAM`). To make the code more readable and extensible, this
  implementation follows a modular design, including a core part and two sets
  of modules, implementing support for different metadata formats and RAID
  levels.
  Support for the following popular metadata formats is now implemented:
  Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage.
  The following RAID levels are now supported:
  RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT.
  For all of these RAID levels and metadata formats this class supports the
  full cycle of volume operations: reading, writing, creation, deletion, disk
  removal and insertion, rebuilding, dirty shutdown detection and
  resynchronization, bad sector recovery, faulty disks tracking, hot-spare
  disks. For Intel and Promise formats there is support for multiple volumes
  per disk set.
  See the graid(8) manual page for additional details.
  Co-authored by: imp
  Sponsored by: Cisco Systems, Inc. and iXsystems, Inc.
  Notes: svn path=/head/; revision=219974