aboutsummaryrefslogtreecommitdiff
path: root/sys/dev/xen/blkback
Commit message (Collapse)AuthorAgeFilesLines
* sys/xen: Use __printflike() instead of format(printf)Alex Richardson2025-12-281-1/+1
| | | | | | | | The __printflike macro sets the format to freebsd_kprintf which recent clang understands and warns about. Fixes the following error: `passing 'printf' format string where 'freebsd_kprintf' format string is expected [-Werror,-Wformat]` MFC after: 1 week
* xen/blk{front,back}: fix usage of sector sizes different than 512bRoger Pau Monné2024-10-081-7/+15
| | | | | | | | | | | | | | | | | | | | The units of the size reported in the 'sectors' xenbus node is always 512b, regardless of the value of the 'sector-size' node. The sector offsets in the ring requests are also always based on 512b sectors, regardless of the 'sector-size' reported in xenbus. Fix both blkfront and blkback to assume 512b sectors in the required fields. The blkif.h public header has been recently updated in upstream Xen repository to fix the regressions in the specification introduced by later modifications, and clarify the base units of xenstore and shared ring fields. PR: 280884 Reported by: Christian Kujau MFC after: 1 week Sponsored by: Cloud Software Group Reviewed by: markj Differential revision: https://reviews.freebsd.org/D46756
* xen/dev: switch to DEVMETHOD_ENDElliott Mitchell2023-11-281-1/+2
| | | | | | | Switch to the preferred end of the device method table. These hadn't been updated previously. Reviewed by: royger
* sys: Remove $FreeBSD$: one-line .c patternWarner Losh2023-08-161-2/+0
| | | | Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
* spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSDWarner Losh2023-05-121-1/+1
| | | | | | | | | The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
* Fix unused variable warning in xen's blkback.cDimitry Andric2022-07-261-3/+0
| | | | | | | | | | | | | | With clang 15, the following -Werror warning is produced: sys/dev/xen/blkback/blkback.c:1561:12: error: variable 'req_seg_idx' set but not used [-Werror,-Wunused-but-set-variable] u_int req_seg_idx; ^ The 'req_seg_idx' variable was used in the for loop later in the xbb_dispatch_io() function, but refactoring in 112cacaee408 got rid of it. Remove the variable since it no longer serves any purpose. MFC after: 3 days
* xen/blkback: do not use x86 CPUID in generic codeRoger Pau Monné2022-06-281-6/+1
| | | | | | | | | Move checker for whether Xen creates IOMMU mappings for foreign pages into a helper that's defined in arch-specific code. Reported by: Elliott Mitchell <ehem+freebsd@m5p.com> Fixes: 1d528f95e8ce ('xen/blkback: remove bounce buffering mode') Sponsored by: Citrix Systems R&D
* xen/blkback: remove bounce buffering modeRoger Pau Monné2022-06-071-178/+22
| | | | | | | | | | | | | | Remove bounce buffering code for blkback and only attach if Xen creates IOMMU entries for grant mapped pages. Such bounce buffering consumed a non trivial amount of memory and CPU resources to do the memory copy, when it's been a long time since Xen has been creating IOMMU entries for grant maps. Refuse to attach blkback if Xen doesn't advertise that IOMMU entries are created for grant maps. Sponsored by: Citrix Systems R&D
* xen/blkback: fix tear-down issuesRoger Pau Monné2022-06-071-33/+30
| | | | | | | | | | | | | | Handle tearing down a blkback that hasn't been fully initialized. This requires carefully checking that fields are allocated before trying to access them. Also communication memory is allocated before setting XBBF_RING_CONNECTED, so gating it's freeing on XBBF_RING_CONNECTED being set is wrong and will lead to memory leaks. Also stop using xbb_disconnect() in error paths. Use xenbus_dev_fatal and let the normal disconnection procedure take care of the cleanup. Reported by: Ze Dupsys <zedupsys@gmail.com> Sponsored by: Citrix Systems R&D
* xen: Remove unused devclass arguments to DRIVER_MODULE.John Baldwin2022-05-061-2/+1
|
* vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)Mateusz Guzik2022-03-241-1/+1
|
* xen: switch to use headers in contribElliott Mitchell2022-02-071-3/+3
| | | | | | | | | | These headers originate with the Xen project and shouldn't be mixed with the main portion of the FreeBSD kernel. Notably they shouldn't be the target of clean-up commits. Switch to use the headers in sys/contrib/xen. Reviewed by: royger
* xen: plug some of set-but-not-used varsMateusz Guzik2021-12-151-3/+1
| | | | Sponsored by: Rubicon Communications, LLC ("Netgate")
* vfs: remove the unused thread argument from NDINIT*Mateusz Guzik2021-11-251-1/+1
| | | | | | See b4a58fbf640409a1 ("vfs: remove cn_thread") Bump __FreeBSD_version to 1400043.
* xen/blkback: fix reconnection of backendRoger Pau Monné2021-05-111-35/+48
| | | | | | | | | | | | | | | | | | | | | | The hotplug script will be executed only once for each backend, regardless of the frontend triggering reconnections. Fix blkback to deal with the hotplug script being executed only once, so that reconnections don't stall waiting for a hotplug script execution that will never happen. As a result of the fix move the initialization of dev_mode, dev_type and dev_name to the watch callback, as they should be set only once the first time the backend connects. This fix is specially relevant for guests wanting to use UEFI OVMF firmware, because OVMF will use Xen PV block devices and disconnect afterwards, thus allowing them to be used by the guest OS. Without this change the guest OS will stall waiting for the block backed to attach. Fixes: de0bad00010c ('blkback: add support for hotplug scripts') MFC after: 1 week Sponsored by: Citrix Systems R&D
* xen-blkback: fix leak of grant maps on ring setup failureRoger Pau Monné2021-02-221-0/+21
| | | | | | | | | | | | | | | Multi page rings are mapped using a single hypercall that gets passed an array of grants to map. One of the grants in the array failing to map would lead to the failure of the whole ring setup operation, but there was no cleanup of the rest of the grant maps in the array that could have likely been created as a result of the hypercall. Add proper cleanup on the failure path during ring setup to unmap any grants that could have been created. This is part of XSA-361. Sponsored by: Citrix Systems R&D
* xen: allow limiting the amount of duplicated pending xenstore watchesRoger Pau Monné2020-12-301-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xenstore watches received are queued in a list and processed in a deferred thread. Such queuing was done without any checking, so a guest could potentially trigger a resource starvation against the FreeBSD kernel if such kernel is watching any user-controlled xenstore path. Allowing limiting the amount of pending events a watch can accumulate to prevent a remote guest from triggering this resource starvation issue. For the PV device backends and frontends this limitation is only applied to the other end /state node, which is limited to 1 pending event, the rest of the watched paths can still have unlimited pending watches because they are either local or controlled by a privileged domain. The xenstore user-space device gets special treatment as it's not possible for the kernel to know whether the paths being watched by user-space processes are controlled by a guest domain. For this reason watches set by the xenstore user-space device are limited to 1000 pending events. Note this can be modified using the max_pending_watch_events sysctl of the device. This is XSA-349. Sponsored by: Citrix Systems R&D MFC after: 3 days
* Make MAXPHYS tunable. Bump MAXPHYS to 1M.Konstantin Belousov2020-11-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allow several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initialize maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embeded structures, get private MAXPHYS-like constant; their convertion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, where either submitted by, or based on changes by mav. Suggested by: mav (*) Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225 Notes: svn path=/head/; revision=368124
* dev/xen: clean up empty lines in .c and .h filesMateusz Guzik2020-09-011-29/+5
| | | | Notes: svn path=/head/; revision=365128
* vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_errorMateusz Guzik2020-08-191-1/+1
| | | | | | | Most consumers pass NULL. Notes: svn path=/head/; revision=364372
* vfs: drop the mostly unused flags argument from VOP_UNLOCKMateusz Guzik2020-01-031-4/+4
| | | | | | | | | | | Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427 Notes: svn path=/head/; revision=356337
* vfs: introduce v_irflag and make v_type smallerMateusz Guzik2019-12-081-1/+1
| | | | | | | | | | | | | | | | | | The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715 Notes: svn path=/head/; revision=355537
* xen-blkback: don't unbind the interrupt while holding the lockRoger Pau Monné2018-05-241-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There's no need to perform the interrupt unbind while holding the blkback lock, and doing so leads to the following LOR: lock order reversal: (sleepable after non-sleepable) 1st 0xfffff8000802fe90 xbbd1 (xbbd1) @ /usr/src/sys/dev/xen/blkback/blkback.c:3423 2nd 0xffffffff81fdf890 intrsrc (intrsrc) @ /usr/src/sys/x86/x86/intr_machdep.c:224 stack backtrace: #0 0xffffffff80bdd993 at witness_debugger+0x73 #1 0xffffffff80bdd814 at witness_checkorder+0xe34 #2 0xffffffff80b7d798 at _sx_xlock+0x68 #3 0xffffffff811b3913 at intr_remove_handler+0x43 #4 0xffffffff811c63ef at xen_intr_unbind+0x10f #5 0xffffffff80a12ecf at xbb_disconnect+0x2f #6 0xffffffff80a12e54 at xbb_shutdown+0x1e4 #7 0xffffffff80a10be4 at xbb_frontend_changed+0x54 #8 0xffffffff80ed66a4 at xenbusb_back_otherend_changed+0x14 #9 0xffffffff80a2a382 at xenwatch_thread+0x182 #10 0xffffffff80b34164 at fork_exit+0x84 #11 0xffffffff8101ec9e at fork_trampoline+0xe Reported by: Nathan Friess <nathan.friess@gmail.com> Sponsored by: Citrix Systems R&D Notes: svn path=/head/; revision=334145
* xen-blkback: do not use state 3 (XenbusStateInitialised)Roger Pau Monné2018-05-221-6/+13
| | | | | | | | | | | | | | | | | | | Linux will not connect to a backend that's in state 3 (XenbusStateInitialised), it needs to be in state 2 (XenbusStateInitWait) for Linux to attempt to connect to the backend. The protocol seems to suggest that the backend should indeed wait in state 2 for the frontend to connect, which makes state 3 unusable for disk backends. Also make sure blkback will connect to the frontend if the frontend reaches state 3 (XenbusStateInitialised) before blkback has processed the results from the hotplug script (Submitted by Nathan Friess). MFC after: 1 week Notes: svn path=/head/; revision=334027
* Correct pseudo misspelling in sys/ commentsEd Maste2018-02-231-3/+3
| | | | | | | contrib code and #define in intel_ata.h unchanged. Notes: svn path=/head/; revision=329873
* Revert r327828, r327949, r327953, r328016-r328026, r328041:Pedro F. Giffuni2018-01-211-3/+3
| | | | | | | | | | | | | | | | | | Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197 Notes: svn path=/head/; revision=328218
* dev: make some use of mallocarray(9).Pedro F. Giffuni2018-01-131-3/+3
| | | | | | | | | | | | | | Focus on code where we are doing multiplications within malloc(9). None of these is likely to overflow, however the change is still useful as some static checkers can benefit from the allocation attributes we use for mallocarray. This initial sweep only covers malloc(9) calls with M_NOWAIT. No good reason but I started doing the changes before r327796 and at that time it was convenient to make sure the sorrounding code could handle NULL values. Notes: svn path=/head/; revision=327949
* sys/dev: further adoption of SPDX licensing ID tags.Pedro F. Giffuni2017-11-271-0/+2
| | | | | | | | | | | | | | | Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Notes: svn path=/head/; revision=326255
* xen-blkback: fix error path on failed attachRoger Pau Monné2016-06-031-4/+5
| | | | | | | | | | | | | | | | | | The current error path in case of failure during attach/initialization is not correct and leaves blkback in a stuck state. This is due to blkback waiting for blkfront to switch to state XenbusStateClosed, but if blkfront never attached (because the guest is not even started) it cannot possibly make it to that state. Instead just wait for the frontend to be in a state different than XenbusStateConnected in order to proceed with the shutdown. Also, it is wrong to call xbb_detach directly because it destroys the lock which can still be used by xbb_frontend_changed. Sponsored by: Citrix Systems R&D Notes: svn path=/head/; revision=301269
* blkback: add support for hotplug scriptsRoger Pau Monné2016-06-031-56/+104
| | | | | | | | | | | | | | | | | | | | Hotplug scripts are needed in order to use fancy disk configurations in xl, like iSCSI disks. The job of hotplug scripts is to locally attach the disk and present it to blkback as a block device or a regular file. This change introduces a new xenstore node in the blkback hierarchy, called "physical-device-path". This is a straigh replacement for the "params" node, which was used before. Hotplug scripts will need to read the "params" node, perform whatever actions are necessary and then write the "physical-device-path" node. The hotplug script is also in charge of detaching the disk once the domain has been shutdown. Sponsored by: Citrix Systems R&D Notes: svn path=/head/; revision=301268
* Improve performance and functionality of the bitstring(3) apiAlan Somers2016-05-041-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two new functions are provided, bit_ffs_at() and bit_ffc_at(), which allow for efficient searching of set or cleared bits starting from any bit offset within the bit string. Performance is improved by operating on longs instead of bytes and using ffsl() for searches within a long. ffsl() is a compiler builtin in both clang and gcc for most architectures, converting what was a brute force while loop search into a couple of instructions. All of the bitstring(3) API continues to be contained in the header file. Some of the functions are large enough that perhaps they should be uninlined and moved to a library, but that is beyond the scope of this commit. sys/sys/bitstring.h: Convert the majority of the existing bit string implementation from macros to inline functions. Properly protect the implementation from inadvertant macro expansion when included in a user's program by prefixing all private macros/functions and local variables with '_'. Add bit_ffs_at() and bit_ffc_at(). Implement bit_ffs() and bit_ffc() in terms of their "at" counterparts. Provide a kernel implementation of bit_alloc(), making the full API usable in the kernel. Improve code documenation. share/man/man3/bitstring.3: Add pre-exisiting API bit_ffc() to the synopsis. Document new APIs. Document the initialization state of the bit strings allocated/declared by bit_alloc() and bit_decl(). Correct documentation for bitstr_size(). The original code comments indicate the size is in bytes, not "elements of bitstr_t". The new implementation follows this lead. Only hastd assumed "elements" rather than bytes and it has been corrected. etc/mtree/BSD.tests.dist: tests/sys/Makefile: tests/sys/sys/Makefile: tests/sys/sys/bitstring.c: Add tests for all existing and new functionality. include/bitstring.h Include all headers needed by sys/bitstring.h lib/libbluetooth/bluetooth.h: usr.sbin/bluetooth/hccontrol/le.c: Include bitstring.h instead of sys/bitstring.h. sbin/hastd/activemap.c: Correct usage of bitstr_size(). sys/dev/xen/blkback/blkback.c Use new bit_alloc. sys/kern/subr_unit.c: Remove hard-coded assumption that sizeof(bitstr_t) is 1. Get rid of unrb.busy, which caches the number of bits set in unrb.map. When INVARIANTS are disabled, nothing needs to know that information. callapse_unr can be adapted to use bit_ffs and bit_ffc instead. Eliminating unrb.busy saves memory, simplifies the code, and provides a slight speedup when INVARIANTS are disabled. sys/net/flowtable.c: Use the new kernel implementation of bit-alloc, instead of hacking the old libc-dependent macro. sys/sys/param.h Update __FreeBSD_version to indicate availability of new API Submitted by: gibbs, asomers Reviewed by: gibbs, ngie MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D6004 Notes: svn path=/head/; revision=299090
* sys/dev: minor spelling fixes.Pedro F. Giffuni2016-05-031-7/+7
| | | | | | | Most affect comments, very few have user-visible effects. Notes: svn path=/head/; revision=298955
* xen: Code cleanup and small bug fixesRoger Pau Monné2015-10-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xen/hypervisor.h: - Remove unused helpers: MULTI_update_va_mapping, is_initial_xendomain, is_running_on_xen - Remove unused define CONFIG_X86_PAE - Remove unused variable xen_start_info: note that it's used inpcifront which is not built at all - Remove forward declaration of HYPERVISOR_crash xen/xen-os.h: - Remove unused define CONFIG_X86_PAE - Drop unused helpers: test_and_clear_bit, clear_bit, force_evtchn_callback - Implement a generic version (based on ofed/include/linux/bitops.h) of set_bit and test_bit and prefix them by xen_ to avoid any use by other code than Xen. Note that It would be worth to investigate a generic implementation in FreeBSD. - Replace barrier() by __compiler_membar() - Replace cpu_relax() by cpu_spinwait(): it's exactly the same as rep;nop = pause xen/xen_intr.h: - Move the prototype of xen_intr_handle_upcall in it: Use by all the platform x86/xen/xen_intr.c: - Use BITSET* for the enabledbits: Avoid to use custom helpers - test_bit/set_bit has been renamed to xen_test_bit/xen_set_bit - Don't export the variable xen_intr_pcpu dev/xen/blkback/blkback.c: - Fix the string format when XBB_DEBUG is enabled: host_addr is typed uint64_t dev/xen/balloon/balloon.c: - Remove set but not used variable - Use the correct type for frame_list: xen_pfn_t represents the frame number on any architecture dev/xen/control/control.c: - Return BUS_PROBE_WILDCARD in xs_probe: Returning 0 in a probe callback means the driver can handle this device. If by any chance xenstore is the first driver, every new device with the driver is unset will use xenstore. dev/xen/grant-table/grant_table.c: - Remove unused cmpxchg - Drop unused include opt_pmap.h: Doesn't exist on ARM64 and it doesn't contain anything required for the code on x86 dev/xen/netfront/netfront.c: - Use the correct type for rx_pfn_array: xen_pfn_t represents the frame number on any architecture dev/xen/netback/netback.c: - Use the correct type for gmfn: xen_pfn_t represents the frame number on any architecture dev/xen/xenstore/xenstore.c: - Return BUS_PROBE_WILDCARD in xctrl_probe: Returning 0 in a probe callback means the driver can handle this device. If by any chance xenstore is the first driver, every new device with the driver is unset will use xenstore. Note that with the changes, x86/include/xen/xen-os.h doesn't contain anymore arch-specific code. Although, a new series will add some helpers that differ between x86 and ARM64, so I've kept the headers for now. Submitted by: Julien Grall <julien.grall@citrix.com> Reviewed by: royger Differential Revision: https://reviews.freebsd.org/D3921 Sponsored by: Citrix Systems R&D Notes: svn path=/head/; revision=289686
* Code cleanup unused-but-set-variable spotted by gcc.Marcelo Araujo2015-08-251-2/+0
| | | | | | | | | Reviewed by: royger Approved by: bapt (mentor) Differential Revision: D3476 Notes: svn path=/head/; revision=287132
* Create a dedicated function for ensuring that cdir and rdir are populated.Mateusz Guzik2015-07-111-12/+1
| | | | | | | | | | | Previously several places were doing it on its own, partially incorrectly (e.g. without the filedesc locked) or even actively harmful by populating jdir or assigning rootvnode without vrefing it. Reviewed by: kib Notes: svn path=/head/; revision=285391
* xen-blk{front/back}: remove broken FreeBSD extensionsRoger Pau Monné2015-06-121-164/+50
| | | | | | | | | | | | | | | | | | The FreeBSD extension adds a new request type, called blkif_segment_block which has a size of 112bytes for both i386 and amd64. This is fine on amd64, since requests have a size of 112B there also. But this is not true for i386, where requests have a size of 108B. So on i386 we basically overrun the ring slot when queuing a request of type blkif_segment_block_t, which is very bad. Remove this extension (including a cleanup of the public blkif.h header file) from blkfront and blkback. Sponsored by: Citrix Systems R&D Tested-by: cperciva Notes: svn path=/head/; revision=284296
* xen: introduce a newbus function to allocate unused memoryRoger Pau Monné2015-05-081-7/+4
| | | | | | | | | | | | | | | | | | | In order to map memory from other domains when running on Xen FreeBSD uses unused physical memory regions. Until now this memory has been allocated using bus_alloc_resource, but this is not completely safe as we can end up using unreclaimed MMIO or ACPI regions. Fix this by introducing a new newbus method that can be used by Xen drivers to request for unused memory regions. On amd64 we make sure this memory comes from regions above 4GB in order to prevent clashes with MMIO/ACPI regions. On i386 there's nothing we can do, so just fall back to the previous mechanism. Sponsored by: Citrix Systems R&D Tested by: Gustau Pérez <gperez@entel.upc.edu> Notes: svn path=/head/; revision=282634
* Remove support for Xen PV domU kernels. Support for HVM domU kernelsJohn Baldwin2015-04-301-13/+0
| | | | | | | | | | | | | | | | | | | | | | | | remains. Xen is planning to phase out support for PV upstream since it is harder to maintain and has more overhead. Modern x86 CPUs include virtualization extensions that support HVM guests instead of PV guests. In addition, the PV code was i386 only and not as well maintained recently as the HVM code. - Remove the i386-only NATIVE option that was used to disable certain components for PV kernels. These components are now standard as they are on amd64. - Remove !XENHVM bits from PV drivers. - Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3, etc.) - Remove duplicate copy of <xen/features.h>. - Remove unused, i386-only xenstored.h. Differential Revision: https://reviews.freebsd.org/D2362 Reviewed by: royger Tested by: royger (i386/amd64 HVM domU and amd64 PVH dom0) Relnotes: yes Notes: svn path=/head/; revision=282274
* xen: fix blkback pushing responses before releasing internal resourcesRoger Pau Monné2014-09-301-27/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a problem where the blockback driver could run out of requests, despite the fact that we allocate enough request and reqlist structures to satisfy the maximum possible number of requests. The problem was that we were sending responses back to the other end (blockfront) before freeing resources. The Citrix Windows driver is pretty agressive about queueing, and would queue more I/O to us immediately after we sent responses to it. We would run into a resource shortage and stall out I/O until we freed resources. It isn't clear whether the request shortage condition was an indirect cause of the I/O hangs we've been seeing between Windows with the Citrix PV drivers and FreeBSD's blockback, but the above problem is certainly a bug. Sponsored by: Spectra Logic Submitted by: ken Reviewed by: royger dev/xen/blkback/blkback.c: - Break xbb_send_response() into two sub-functions, xbb_queue_response() and xbb_push_responses(). Remove xbb_send_response(), because it is no longer used. - Adjust xbb_complete_reqlist() so that it calls the two new functions, and holds the mutex around both calls. The mutex insures that another context can't come along and push responses before we've freed our resources. - Change xbb_release_reqlist() so that it requires the mutex to be held instead of acquiring the mutex itself. Both callers could easily hold the mutex while calling it, and one really needs to hold the mutex during the call. - Add two new counters, accessible via sysctl variables. The first one counts the number of I/Os that are queued and waiting to be pushed (reqs_queued_for_completion). The second one (reqs_completed_with_error) counts the number of requests we've completed with an error status. Notes: svn path=/head/; revision=272321
* xen: fix incorrectly accounted freeRoger Pau Monné2014-08-221-3/+3
| | | | | | | | | | | | | | Fix some frees incorrectly assigned to M_XENBUS when the memory is allocated with M_XENSTORE. Sponsored by: Citrix Systems R&D MFC after: 1 week dev/xen/blkback/blkback.c: - Fix incorrect frees. Notes: svn path=/head/; revision=270339
* dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINEAndriy Gapon2013-11-261-6/+6
| | | | | | | | | | | In its stead use the Solaris / illumos approach of emulating '-' (dash) in probe names with '__' (two consecutive underscores). Reviewed by: markj MFC after: 3 weeks Notes: svn path=/head/; revision=258622
* - For kernel compiled only with KDTRACE_HOOKS and not any lock debuggingAttilio Rao2013-11-251-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invokation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip Notes: svn path=/head/; revision=258541
* Correct blkback handling of the BLKIF_OP_FLUSH_DISKCACHE opcode.Justin T. Gibbs2013-09-041-12/+4
| | | | | | | | | | | | | | | | | | | | | | Properly round-trip the "operation code" for client requests. sys/dev/xen/blkback/blkback.c: In xbb_dispatch_dev() when processing a flush request, correctly set bio->bio_caller1 to the request list (not bare request) for the operation, as is expected by the completion handler xbb_bio_done(). In xbb_get_resources(), initialize "operation" in the driver's internal request object from the client's "ring request", so it is correct when used to populate the reply when this operation completes. Submitted by: Roger Pau Monné Sponsored by: Citrix Systems R&D Reviewed by: gibbs Notes: svn path=/head/; revision=255218
* sys/dev/xen/blkback/blkback.c:Justin T. Gibbs2013-09-031-1/+1
| | | | | | | | | | | | | Initialize the request id for requests in xbb_get_resources() instead of its previous location in xbb_dispatch_io(). This guarantees that all request types (e.g. BLKIF_OP_FLUSH_DISKCACHE) have the front-end specified id recorded. Submitted by: Roger Pau Monné Sponsored by: Citrix Systems R&D Notes: svn path=/head/; revision=255179
* Implement vector callback for PVHVM and unify event channel implementationsJustin T. Gibbs2013-08-291-26/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-structure Xen HVM support so that: - Xen is detected and hypercalls can be performed very early in system startup. - Xen interrupt services are implemented using FreeBSD's native interrupt delivery infrastructure. - the Xen interrupt service implementation is shared between PV and HVM guests. - Xen interrupt handlers can optionally use a filter handler in order to avoid the overhead of dispatch to an interrupt thread. - interrupt load can be distributed among all available CPUs. - the overhead of accessing the emulated local and I/O apics on HVM is removed for event channel port events. - a similar optimization can eventually, and fairly easily, be used to optimize MSI. Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure, and misc Xen cleanups: Sponsored by: Spectra Logic Corporation Unification of PV & HVM interrupt infrastructure, bug fixes, and misc Xen cleanups: Submitted by: Roger Pau Monné Sponsored by: Citrix Systems R&D sys/x86/x86/local_apic.c: sys/amd64/include/apicvar.h: sys/i386/include/apicvar.h: sys/amd64/amd64/apic_vector.S: sys/i386/i386/apic_vector.s: sys/amd64/amd64/machdep.c: sys/i386/i386/machdep.c: sys/i386/xen/exception.s: sys/x86/include/segments.h: Reserve IDT vector 0x93 for the Xen event channel upcall interrupt handler. On Hypervisors that support the direct vector callback feature, we can request that this vector be called directly by an injected HVM interrupt event, instead of a simulated PCI interrupt on the Xen platform PCI device. This avoids all of the overhead of dealing with the emulated I/O APIC and local APIC. It also means that the Hypervisor can inject these events on any CPU, allowing upcalls for different ports to be handled in parallel. sys/amd64/amd64/mp_machdep.c: sys/i386/i386/mp_machdep.c: Map Xen per-vcpu area during AP startup. sys/amd64/include/intr_machdep.h: sys/i386/include/intr_machdep.h: Increase the FreeBSD IRQ vector table to include space for event channel interrupt sources. sys/amd64/include/pcpu.h: sys/i386/include/pcpu.h: Remove Xen HVM per-cpu variable data. These fields are now allocated via the dynamic per-cpu scheme. See xen_intr.c for details. sys/amd64/include/xen/hypercall.h: sys/dev/xen/blkback/blkback.c: sys/i386/include/xen/xenvar.h: sys/i386/xen/clock.c: sys/i386/xen/xen_machdep.c: sys/xen/gnttab.c: Prefer FreeBSD primatives to Linux ones in Xen support code. sys/amd64/include/xen/xen-os.h: sys/i386/include/xen/xen-os.h: sys/xen/xen-os.h: sys/dev/xen/balloon/balloon.c: sys/dev/xen/blkback/blkback.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/console/xencons_ring.c: sys/dev/xen/control/control.c: sys/dev/xen/netback/netback.c: sys/dev/xen/netfront/netfront.c: sys/dev/xen/xenpci/xenpci.c: sys/i386/i386/machdep.c: sys/i386/include/pmap.h: sys/i386/include/xen/xenfunc.h: sys/i386/isa/npx.c: sys/i386/xen/clock.c: sys/i386/xen/mp_machdep.c: sys/i386/xen/mptable.c: sys/i386/xen/xen_clock_util.c: sys/i386/xen/xen_machdep.c: sys/i386/xen/xen_rtc.c: sys/xen/evtchn/evtchn_dev.c: sys/xen/features.c: sys/xen/gnttab.c: sys/xen/gnttab.h: sys/xen/hvm.h: sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbus_if.m: sys/xen/xenbus/xenbusb_front.c: sys/xen/xenbus/xenbusvar.h: sys/xen/xenstore/xenstore.c: sys/xen/xenstore/xenstore_dev.c: sys/xen/xenstore/xenstorevar.h: Pull common Xen OS support functions/settings into xen/xen-os.h. sys/amd64/include/xen/xen-os.h: sys/i386/include/xen/xen-os.h: sys/xen/xen-os.h: Remove constants, macros, and functions unused in FreeBSD's Xen support. sys/xen/xen-os.h: sys/i386/xen/xen_machdep.c: sys/x86/xen/hvm.c: Introduce new functions xen_domain(), xen_pv_domain(), and xen_hvm_domain(). These are used in favor of #ifdefs so that FreeBSD can dynamically detect and adapt to the presence of a hypervisor. The goal is to have an HVM optimized GENERIC, but more is necessary before this is possible. sys/amd64/amd64/machdep.c: sys/dev/xen/xenpci/xenpcivar.h: sys/dev/xen/xenpci/xenpci.c: sys/x86/xen/hvm.c: sys/sys/kernel.h: Refactor magic ioport, Hypercall table and Hypervisor shared information page setup, and move it to a dedicated HVM support module. HVM mode initialization is now triggered during the SI_SUB_HYPERVISOR phase of system startup. This currently occurs just after the kernel VM is fully setup which is just enough infrastructure to allow the hypercall table and shared info page to be properly mapped. sys/xen/hvm.h: sys/x86/xen/hvm.c: Add definitions and a method for configuring Hypervisor event delievery via a direct vector callback. sys/amd64/include/xen/xen-os.h: sys/x86/xen/hvm.c: sys/conf/files: sys/conf/files.amd64: sys/conf/files.i386: Adjust kernel build to reflect the refactoring of early Xen startup code and Xen interrupt services. sys/dev/xen/blkback/blkback.c: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkfront/block.h: sys/dev/xen/control/control.c: sys/dev/xen/evtchn/evtchn_dev.c: sys/dev/xen/netback/netback.c: sys/dev/xen/netfront/netfront.c: sys/xen/xenstore/xenstore.c: sys/xen/evtchn/evtchn_dev.c: sys/dev/xen/console/console.c: sys/dev/xen/console/xencons_ring.c Adjust drivers to use new xen_intr_*() API. sys/dev/xen/blkback/blkback.c: Since blkback defers all event handling to a taskqueue, convert this task queue to a "fast" taskqueue, and schedule it via an interrupt filter. This avoids an unnecessary ithread context switch. sys/xen/xenstore/xenstore.c: The xenstore driver is MPSAFE. Indicate as much when registering its interrupt handler. sys/xen/xenbus/xenbus.c: sys/xen/xenbus/xenbusvar.h: Remove unused event channel APIs. sys/xen/evtchn.h: Remove all kernel Xen interrupt service API definitions from this file. It is now only used for structure and ioctl definitions related to the event channel userland device driver. Update the definitions in this file to match those from NetBSD. Implementing this interface will be necessary for Dom0 support. sys/xen/evtchn/evtchnvar.h: Add a header file for implemenation internal APIs related to managing event channels event delivery. This is used to allow, for example, the event channel userland device driver to access low-level routines that typical kernel consumers of event channel services should never access. sys/xen/interface/event_channel.h: sys/xen/xen_intr.h: Standardize on the evtchn_port_t type for referring to an event channel port id. In order to prevent low-level event channel APIs from leaking to kernel consumers who should not have access to this data, the type is defined twice: Once in the Xen provided event_channel.h, and again in xen/xen_intr.h. The double declaration is protected by __XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared twice within a given compilation unit. sys/xen/xen_intr.h: sys/xen/evtchn/evtchn.c: sys/x86/xen/xen_intr.c: sys/dev/xen/xenpci/evtchn.c: sys/dev/xen/xenpci/xenpcivar.h: New implementation of Xen interrupt services. This is similar in many respects to the i386 PV implementation with the exception that events for bound to event channel ports (i.e. not IPI, virtual IRQ, or physical IRQ) are further optimized to avoid mask/unmask operations that aren't necessary for these edge triggered events. Stubs exist for supporting physical IRQ binding, but will need additional work before this implementation can be fully shared between PV and HVM. sys/amd64/amd64/mp_machdep.c: sys/i386/i386/mp_machdep.c: sys/i386/xen/mp_machdep.c sys/x86/xen/hvm.c: Add support for placing vcpu_info into an arbritary memory page instead of using HYPERVISOR_shared_info->vcpu_info. This allows the creation of domains with more than 32 vcpus. sys/i386/i386/machdep.c: sys/i386/xen/clock.c: sys/i386/xen/xen_machdep.c: sys/i386/xen/exception.s: Add support for new event channle implementation. Notes: svn path=/head/; revision=255040
* Replace kernel virtual address space allocation with vmem. This providesJeff Roberson2013-08-071-2/+2
| | | | | | | | | | | | | | | | transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division Notes: svn path=/head/; revision=254025
* Remove the support for using non-mpsafe filesystem modules.Konstantin Belousov2012-10-221-14/+0
| | | | | | | | | | | | | | | In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho Notes: svn path=/head/; revision=241896
* Fix regression in the handling of blkback close events forJustin T. Gibbs2012-02-171-6/+2
| | | | | | | | | | | | | | | devices that are unplugged via QEMU. sys/dev/xen/blkback/blkback.c: Toolstack initiated closures change the frontend's state to Closing. The backend must change to Closing as well, even if we can't actually close yet, in order for the frontend to notice and start the closing process. MFC after: 3 days Notes: svn path=/head/; revision=231883
* Fix typo in a printf string: "specificed" -> "specified".Justin T. Gibbs2012-02-161-4/+4
| | | | | | | MFC after: 1 day Notes: svn path=/head/; revision=231837
* Enhance documentation, improve interoperability, and fix defects inJustin T. Gibbs2012-02-151-55/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FreeBSD's front and back Xen blkif interface drivers. sys/dev/xen/blkfront/block.h: sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkback/blkback.c: Replace FreeBSD specific multi-page ring impelementation with support for both the Citrix and Amazon/RedHat versions of this extension. sys/dev/xen/blkfront/blkfront.c: o Add a per-instance sysctl tree that exposes all negotiated transport parameters (ring pages, max number of requests, max request size, max number of segments). o In blkfront_vdevice_to_unit() add a missing return statement so that we properly identify the unit number for high numbered xvd devices. sys/dev/xen/blkback/blkback.c: o Add static dtrace probes for several events in this driver. o Defer connection shutdown processing until the front-end enters the closed state. This avoids prematurely tearing down the connection when buggy front-ends transition to the closing state, even though the device is open and they veto the close request from the tool stack. o Add nodes for maximum request size and the number of active ring pages to the exising, per-instance, sysctl tree. o Miscelaneous style cleanup. sys/xen/interface/io/blkif.h: o Add extensive documentation of the XenStore nodes used to implement the blkif interface. o Document the startup sequence between a front and back driver. o Add structures and documenatation for the "discard" feature (AKA Trim). o Cleanup some definitions related to FreeBSD's request number/size/segment-limit extension. sys/dev/xen/blkfront/blkfront.c: sys/dev/xen/blkback/blkback.c: sys/xen/xenbus/xenbusvar.h: Add the convenience function xenbus_get_otherend_state() and use it to simplify some logic in both block-front and block-back. MFC after: 1 day Notes: svn path=/head/; revision=231743