path: root/sys/amd64
Commit message | Author | Age | Files | Lines
...
* Fix LINT: Add backlight to NOTES | Emmanuel Vadot | 2020-10-02 | 1 | -0/+1
| | | | Notes: svn path=/head/; revision=366382
* Remove svn:executable from a couple of vmm(4) source files. | Mark Johnston | 2020-10-01 | 2 | -0/+0
| | | | | | | MFC after: 3 days Notes: svn path=/head/; revision=366347
* Clear the upper 32-bits of registers in x86_emulate_cpuid(). | John Baldwin | 2020-10-01 | 4 | -48/+49
| | | | | | | | | | | | | | | | | | | | | Per the Intel manuals, CPUID is supposed to unconditionally zero the upper 32 bits of the involved (rax/rbx/rcx/rdx) registers. Previously, the emulation would cast pointers to the 64-bit register values down to `uint32_t`, which while properly manipulating the lower bits, would leave any garbage in the upper bits uncleared. While no existing guest OSes seem to stumble over this in practice, the bhyve emulation should match x86 expectations. This was discovered through alignment warnings emitted by gcc9, while testing it against SmartOS/bhyve. SmartOS bug: https://smartos.org/bugview/OS-8168 Submitted by: Patrick Mooney Reviewed by: rgrimes Differential Revision: https://reviews.freebsd.org/D24727 Notes: svn path=/head/; revision=366328
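The difference between the old and the corrected register update can be sketched in plain C. This is only an illustration, not the code from x86_emulate_cpuid(): the function names `ex_store_keep_upper` and `ex_store_zero_extend` are hypothetical, and the first one models the old pointer-cast store with explicit masking so the sketch does not depend on byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the old behaviour (hypothetical helper): only the low 32 bits
 * of the 64-bit guest register are replaced, so stale data in bits 63:32
 * survives the emulated CPUID.
 */
void
ex_store_keep_upper(uint64_t *reg, uint32_t val)
{
	assert(reg != NULL);
	*reg = (*reg & 0xffffffff00000000ULL) | val;
}

/*
 * Model of the corrected behaviour (hypothetical helper): the 32-bit
 * result is zero-extended, which clears bits 63:32 unconditionally.
 */
void
ex_store_zero_extend(uint64_t *reg, uint32_t val)
{
	assert(reg != NULL);
	*reg = (uint64_t)val;
}
```

Writing the full 64-bit value also avoids forming a `uint32_t *` into a `uint64_t` object, which is the kind of cast the commit message says gcc9 warned about.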
* Rename kernel option ACPI_DMAR to IOMMU. | Ruslan Bukin | 2020-09-29 | 2 | -2/+2
| | | | | | | | | | | This is mostly needed for a common arm64/amd64 iommu code. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D26587 Notes: svn path=/head/; revision=366267
* Get rid of sa->narg. It serves no purpose; use sa->callp->sy_narg instead. | Edward Tomasz Napierala | 2020-09-27 | 7 | -17/+11
| | | | | | | | | Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D26458 Notes: svn path=/head/; revision=366205
* Regen after r366145. | Edward Tomasz Napierala | 2020-09-25 | 4 | -858/+870
| | | | | | | Sponsored by: DARPA Notes: svn path=/head/; revision=366147
* Add a vmparam.h constant indicating pmap support for large pages. | Mark Johnston | 2020-09-23 | 1 | -0/+5
| | | | | | | | | | | Enable SHM_LARGEPAGE support on arm64. Reviewed by: alc, kib Sponsored by: Juniper Networks, Inc., Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26467 Notes: svn path=/head/; revision=366090
* Use envvar rather than nonstandard hint. lines | Warner Losh | 2020-09-23 | 1 | -11/+11
| | | | | | | | | | | | | The NOTES files have a bunch of hint lines that are removed when generating LINT. However, we can achieve the same effect by prepending each of the lines with 'envvar' so the NOTES files become standard config(8) files. No functional changes as the sed script to generate the LINT files filters these either way. Suggested by: kevans Notes: svn path=/head/; revision=366088
* amd64 pmap: More unification for psind = 1 vs 2 in pmap_enter_largepage(). | Konstantin Belousov | 2020-09-22 | 1 | -49/+28
| | | | | | | | | | | | | | | | | | | | | Move the pkru check, the wait for page allocation, the wire accounting update, and the assertion of allowed updates for valid mappings out of the psind conditions. Also add an assert that psind references a supported page size. Remove an untrue comment. Avoid unnecessary page table walks from the top level. Reviewed by: alc, markj (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D26513 Notes: svn path=/head/; revision=366030
* Sparsify the vm_page_dump bitmap | D Scott Phillips | 2020-09-21 | 2 | -1/+13
| | | | | | | | | | | | | | | | | | | On Ampere Altra systems, the sparse population of RAM within the physical address space causes the vm_page_dump bitmap to be much larger than necessary, increasing the size from ~8 MiB to > 2 GiB (and overflowing `int` for the size). Changing the page dump bitmap also changes the minidump file format, so changes are also necessary in libkvm. Reviewed by: jhb Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26131 Notes: svn path=/head/; revision=365978
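The size figures in this message follow from one bit per page. The sketch below reproduces the arithmetic under stated assumptions: a 4 KiB page size, and illustrative figures of 256 GiB of populated RAM and a 64 TiB physical address span, chosen only to match the quoted magnitudes (they are not taken from the commit). The helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define EX_PAGE_SIZE	4096ULL		/* assumed page size */

/* Bytes needed for a bitmap with one bit per page in span_bytes. */
uint64_t
ex_dump_bitmap_bytes(uint64_t span_bytes)
{
	uint64_t npages;

	assert(span_bytes % EX_PAGE_SIZE == 0);
	npages = span_bytes / EX_PAGE_SIZE;
	return ((npages + 7) / 8);
}

/* Nonzero if a size in bytes can be represented by a plain int. */
int
ex_fits_in_int(uint64_t nbytes)
{
	return (nbytes <= (uint64_t)INT_MAX);
}
```

With these assumptions, a bitmap covering only populated pages needs 8 MiB, while one covering the whole span needs 2 GiB, which is one more than `INT_MAX` on platforms with a 32-bit `int`.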
* Move vm_page_dump bitset array definition to MI code | D Scott Phillips | 2020-09-21 | 3 | -54/+20
| | | | | | | | | | | | | | | | | | | These definitions were repeated by all architectures, with small variations. Consolidate the common definitions in machine independent code and use bitset(9) macros for manipulation. Many opportunities for deduplication remain in the machine dependent minidump logic. The only intended functional change is increasing the bit index type to vm_pindex_t, allowing the indexing of pages with address of 8 TiB and greater. Reviewed by: kib, markj Approved by: scottl (implicit) MFC after: 1 week Sponsored by: Ampere Computing, Inc. Differential Revision: https://reviews.freebsd.org/D26129 Notes: svn path=/head/; revision=365977
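A page-indexed bitmap with a 64-bit index can be sketched as follows. This is a hand-rolled illustration, not the bitset(9) macros the commit adopts, and every `ex_` name is hypothetical. The last helper shows where the 8 TiB figure plausibly comes from, assuming the old index was a signed 32-bit `int` (31 usable bits) and pages are 4 KiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_WORD_BITS	64u

/* Set the bit for page index idx; idx is 64 bits wide on purpose. */
void
ex_bit_set(uint64_t *map, uint64_t idx)
{
	assert(map != NULL);
	map[idx / EX_WORD_BITS] |= 1ULL << (idx % EX_WORD_BITS);
}

/* Clear the bit for page index idx. */
void
ex_bit_clear(uint64_t *map, uint64_t idx)
{
	assert(map != NULL);
	map[idx / EX_WORD_BITS] &= ~(1ULL << (idx % EX_WORD_BITS));
}

/* Return nonzero if the bit for page index idx is set. */
int
ex_bit_test(const uint64_t *map, uint64_t idx)
{
	assert(map != NULL);
	return ((map[idx / EX_WORD_BITS] >> (idx % EX_WORD_BITS)) & 1);
}

/* First physical address whose page index needs more than index_bits. */
uint64_t
ex_first_unindexable_addr(unsigned index_bits)
{
	assert(index_bits < 52);
	return ((1ULL << index_bits) * 4096ULL);
}
```

Under those assumptions, `ex_first_unindexable_addr(31)` is 2^43 bytes, that is 8 TiB.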
* amd64 pmap: only calculate page table page when needed. | Konstantin Belousov | 2020-09-21 | 1 | -4/+6
| | | | | | | | | | | Noted by: alc Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D26499 Notes: svn path=/head/; revision=365951
* amd64 pmap: handle cases where pml4 page table page is not allocated. | Konstantin Belousov | 2020-09-20 | 1 | -6/+8
| | | | | | | | | | | | Possible in LA57 pmap config. Noted by: alc Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26492 Notes: svn path=/head/; revision=365931
* Fix some nits in 1G page support in the amd64 pmap. | Mark Johnston | 2020-09-19 | 1 | -46/+66
| | | | | | | | | | | | | | | | | | - Move assertions out of the main loop to avoid duplicate conditional expressions, and improve assertion messages. - Fix va_next updates. In some cases we were not doing the wraparound check before continuing the loop. - Use the right va_next. In pmap_advise() and pmap_copy() we would step through 1G pages 2M at a time. - Copy 1G mappings in pmap_copy(). Reviewed by: alc, kib MFC with: r365518 Sponsored by: Juniper Networks, Inc., Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26463 Notes: svn path=/head/; revision=365906
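The va_next items above concern how a range loop advances to the next page-size boundary. A minimal sketch of that step, with the wraparound check, might look like the following; `ex_va_next` and the `EX_` constants are hypothetical stand-ins, and the 2 MiB and 1 GiB sizes are the usual amd64 superpage sizes.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NBPDR	(1ULL << 21)	/* 2 MiB step */
#define EX_NBPDP	(1ULL << 30)	/* 1 GiB step */

/*
 * Return the start of the next pgsize-aligned region after sva.  If the
 * addition wraps past the top of the address space, clamp to eva so the
 * caller's loop terminates.
 */
uint64_t
ex_va_next(uint64_t sva, uint64_t eva, uint64_t pgsize)
{
	uint64_t va_next;

	assert(pgsize != 0 && (pgsize & (pgsize - 1)) == 0);
	va_next = (sva + pgsize) & ~(pgsize - 1);
	if (va_next < sva)		/* wrapped around */
		va_next = eva;
	return (va_next);
}
```

Passing `EX_NBPDR` while walking a 1 GiB mapping would visit the same mapping 512 times, which is the "2M at a time" problem the message describes; passing `EX_NBPDP` skips it in one step.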
* amd64 pmap_pkru_same: prev_ppr was always NULL | Eric van Gyzen | 2020-09-18 | 1 | -2/+4
| | | | | | | | | | | | | Fix the logic so it works as it appears. Reported by: Coverity Reviewed by: kib MFC after: 2 weeks Sponsored by: Dell EMC Isilon Differential Revision: D26211 (in progress, so omitting full URL) Notes: svn path=/head/; revision=365890
* Ensure that a protection key is selected in pmap_enter_largepage(). | Mark Johnston | 2020-09-18 | 1 | -12/+12
| | | | | | | | | | Reviewed by: alc, kib Reported by: Coverity MFC with: r365518 Differential Revision: https://reviews.freebsd.org/D26464 Notes: svn path=/head/; revision=365878
* Get rid of sv_errtbl and SV_ABI_ERRNO(). | Edward Tomasz Napierala | 2020-09-17 | 3 | -10/+2
| | | | | | | | | Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D26388 Notes: svn path=/head/; revision=365832
* bhyve: do not permit write access to VMCB / VMCS | Ed Maste | 2020-09-15 | 2 | -2/+9
| | | | | | | | | Reported by: Patrick Mooney Submitted by: jhb Security: CVE-2020-24718 Notes: svn path=/head/; revision=365775
* bhyve: intercept AMD SVM instructions. | Konstantin Belousov | 2020-09-15 | 2 | -36/+74
| | | | | | | | | | | | | | | Intercept and report #UD to VM on SVM/AMD in case VM tried to execute an SVM instruction. Otherwise, SVM allows execution of them, and instructions operate on host physical addresses despite being executed in guest mode. Reported by: Maxime Villard <max@m00nbsd.net> admbug: 972 CVE: CVE-2020-7467 Reviewed by: grehan, markj Differential revision: https://reviews.freebsd.org/D26313 Notes: svn path=/head/; revision=365766
* Move SV_ABI_ERRNO translation into linux-specific code, to simplify | Edward Tomasz Napierala | 2020-09-15 | 3 | -2/+21
| | | | | | | | | | | | the syscall path and declutter it a bit. No functional changes intended. Reviewed by: kib (earlier version) MFC after: 2 weeks Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D26378 Notes: svn path=/head/; revision=365755
* Add constant for the DE_CFG MSR on AMD CPUs. | John Baldwin | 2020-09-11 | 1 | -3/+3
| | | | | | | | | Reported by: Patrick Mooney <pmooney@pfmooney.com> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25885 Notes: svn path=/head/; revision=365642
* Use vmcb_read/write for the vmcb snapshot functions. | John Baldwin | 2020-09-10 | 1 | -2/+2
| | | | | | | This avoids some unnecessary layers of indirection. Notes: svn path=/head/; revision=365616
* Add pmap_enter(9) PMAP_ENTER_LARGEPAGE flag and implement it on amd64. | Konstantin Belousov | 2020-09-09 | 1 | -0/+120
| | | | | | | | | | | | | | | | | | The flag requests entry of non-managed superpage mapping of size pagesizes[psind] into the page table. Pmap supports fake wiring of the largepage mappings. Only attributes of the largepage mapping can be changed by calling pmap_enter(9) over existing mapping, physical address of the page must be unchanged. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365518
* Fix assert. | Konstantin Belousov | 2020-09-09 | 1 | -1/+1
| | | | | | | | Noted by: alc MFC after: 1 week Notes: svn path=/head/; revision=365514
* amd64 pmap: teach functions walking user page tables about PG_PS bit in PDPE. | Konstantin Belousov | 2020-09-09 | 1 | -40/+130
| | | | | | | | | | | | | Only unmanaged 1G superpages are handled. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365512
* amd64: report support for 1G superpages in getpagesizes(2). | Konstantin Belousov | 2020-09-09 | 1 | -0/+5
| | | | | | | | | | | Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D24652 Notes: svn path=/head/; revision=365511
* Include the psind in data returned by mincore(2). | Mark Johnston | 2020-09-02 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Currently we use a single bit to indicate whether the virtual page is part of a superpage. To support a forthcoming implementation of non-transparent 1GB superpages, it is useful to provide more detailed information about large page sizes. The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind) values, indicating a mapping of size psind, where psind is an index into the pagesizes array returned by getpagesizes(3), which in turn comes from the hw.pagesizes sysctl. MINCORE_PSIND(1) is equal to the old value of MINCORE_SUPER. For now, two bits are used to record the page size, permitting values of MAXPAGESIZES up to 4. Reviewed by: alc, kib Sponsored by: Juniper Networks, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26238 Notes: svn path=/head/; revision=365267
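The encoding described above (a two-bit field whose value 1 coincides with the old single flag) can be sketched as below. The `EX_` names are hypothetical, and the concrete bit positions (shift 5, mask 0x60, old flag 0x20) are assumptions made for illustration; the authoritative values are whatever sys/mman.h defines.

```c
#include <assert.h>

#define EX_MINCORE_SUPER_OLD	0x20u	/* assumed old single-bit flag */
#define EX_MINCORE_SUPER	0x60u	/* assumed two-bit mask */
#define EX_MINCORE_PSIND_SHIFT	5

/* Encode a page-size index (0..3) into the two-bit field. */
unsigned
ex_mincore_psind_encode(unsigned psind)
{
	assert(psind < 4);	/* two bits: MAXPAGESIZES up to 4 */
	return ((psind << EX_MINCORE_PSIND_SHIFT) & EX_MINCORE_SUPER);
}

/* Extract the page-size index from a mincore(2)-style status byte. */
unsigned
ex_mincore_psind_decode(unsigned flags)
{
	return ((flags & EX_MINCORE_SUPER) >> EX_MINCORE_PSIND_SHIFT);
}
```

Because index 1 encodes to the old flag value, code that only tests the old bit keeps working for 2 MiB mappings, while new code can distinguish larger sizes by decoding the whole field.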
* Add the MEM_EXTRACT_PADDR ioctl to /dev/mem. | Mark Johnston | 2020-09-02 | 2 | -3/+2
| | | | | | | | | | | | | | | This allows privileged userspace processes to find information about the physical page backing a given mapping. It is useful in applications such as DPDK which perform some of their own memory management. Reviewed by: kib, jhb (previous version) MFC after: 2 weeks Sponsored by: Juniper Networks, Inc. Sponsored by: Klara Inc. Differential Revision: https://reviews.freebsd.org/D26237 Notes: svn path=/head/; revision=365265
* Fix a page table pages leak after LA57. | Konstantin Belousov | 2020-09-02 | 1 | -1/+22
| | | | | | | | | | | | | | | | | | If the call to _pmap_allocpte() is not sleepable, it is possible that allocation of PML4 or PDP page is successful but either PDP or PD page is not. Restructured code in _pmap_allocpte() leaves zero-referenced page in the paging structure. Handle it by checking refcount of the page one level above failed alloc and free that page if its reference count is zero. Reported and tested by: pho Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26293 Notes: svn path=/head/; revision=365251
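The cleanup rule in this message (on a failed lower-level allocation, free the page one level up if nothing references it) can be modelled with a toy structure. This is a simplified sketch, not _pmap_allocpte(): `struct ex_ptpage`, `ex_alloc_child`, and the `child_alloc_ok` parameter that simulates a non-sleepable allocation failing are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page-table page: present flag plus count of children below it. */
struct ex_ptpage {
	bool	present;
	int	ref_count;
};

/*
 * Make sure parent exists, then try to hang child off it.  If the child
 * allocation fails and the parent has no other children, release the
 * parent again so no zero-referenced page is left behind.
 */
bool
ex_alloc_child(struct ex_ptpage *parent, struct ex_ptpage *child,
    bool child_alloc_ok)
{
	assert(parent != NULL && child != NULL);
	if (!parent->present) {
		parent->present = true;
		parent->ref_count = 0;
	}
	if (!child_alloc_ok) {
		if (parent->ref_count == 0)
			parent->present = false;	/* undo: avoid the leak */
		return (false);
	}
	child->present = true;
	child->ref_count = 0;
	parent->ref_count++;
	return (true);
}
```

A parent that already has other children is left alone on failure, since its reference count is nonzero.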
* amd64: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 38 | -93/+44
| | | | Notes: svn path=/head/; revision=365067
* ZFS: clarify dependencies for static linking | Matt Macy | 2020-08-28 | 1 | -0/+1
| | | | Notes: svn path=/head/; revision=364923
* Restore workaround for sysret fault on non-canonical address after LA57. | Konstantin Belousov | 2020-08-24 | 1 | -1/+2
| | | | | | | Sponsored by: The FreeBSD Foundation Notes: svn path=/head/; revision=364734
* cpu_auxmsr: assert caller is preventing CPU migration. | Peter Grehan | 2020-08-24 | 1 | -1/+5
| | | | | | | | | | | | Submitted by: Adam Fenn (adam at fenn dot io) Requested by: kib Reviewed by: kib, grehan Approved by: kib MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D26166 Notes: svn path=/head/; revision=364656
* amd64: Handle 5-level paging on wakeup. | Konstantin Belousov | 2020-08-23 | 2 | -3/+15
| | | | | | | | | | We can switch into long mode directly with LA57 enabled. Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25273 Notes: svn path=/head/; revision=364534
* amd64: Handle 5-level paging for efirt calls. | Konstantin Belousov | 2020-08-23 | 1 | -11/+36
| | | | | | | | Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25273 Notes: svn path=/head/; revision=364533
* Add bhyve support for LA57 guest mode. | Konstantin Belousov | 2020-08-23 | 3 | -5/+14
| | | | | | | | | Noted and reviewed by: grehan Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25273 Notes: svn path=/head/; revision=364531
* Add amd64 procctl(2) ops to manage forced LA48/LA57 VA after exec. | Konstantin Belousov | 2020-08-23 | 1 | -22/+93
| | | | | | | | | Tested by: pho (LA48 hardware) Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25273 Notes: svn path=/head/; revision=364530
* amd64 pmap: LA57 AKA 5-level paging | Konstantin Belousov | 2020-08-23 | 14 | -226/+949
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since LA57 was moved to the main SDM document with revision 072, it seems that we should have support for it, and silicon is coming. This patch makes pmap support both LA48 and LA57 hardware. The selection of page table level is done at startup; the kernel always receives control from the loader with 4-level paging. It is not clear how the UEFI spec would adopt LA57; for instance, it could hand out control in LA57 mode sometimes. Switching from LA48 to LA57 requires turning off long mode, requesting LA57 in CR4, then re-entering long mode. This is somewhat delicate and done in pmap_bootstrap_la57(). AP startup in LA57 mode is much easier: we only need to toggle a bit in CR4 and load the right value in CR3. I decided to not change the kernel map for now. A single PML5 entry is created that points to the existing kernel_pml4 (KML4Phys) page, and a pml5 entry to create our recursive mapping for vtopte()/vtopde(). This decision is motivated by the fact that we cannot overcommit for KVA, so the large space there is unusable until machines start providing wider physical memory addressing. Another reason is that I do not want to break our fragile autotuning, so the KVA expansion is not included in this first step. A nice side effect is that minidumps are compatible. On the other hand, a (very) large address space is definitely immediately useful for some userspace applications. For userspace, numbering of pte entries (or page table pages) is always done for 5-level structures even if we operate in 4-level mode. The pmap_is_la57() function is added to report the mode of the specified pmap; this is done not to allow simultaneous 4-/5-levels (which is not allowed by hw), but to accommodate EPT, which has separate level control and in principle might not allow 5-level EPT even though x86 paging supports it. Anyway, it does not seem critical to have 5-level EPT support now. 
Tested by: pho (LA48 hardware) Reviewed by: alc Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D25273 Notes: svn path=/head/; revision=364527
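The LA48 and LA57 names refer to the virtual address width each paging depth translates. As a small worked sketch, assuming 4 KiB base pages (12 offset bits) and 512-entry tables (9 index bits per level), the widths and per-level table indices can be computed as below; the `ex_` helpers are hypothetical and are not part of the pmap code.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT	12	/* 4 KiB pages */
#define EX_LEVEL_BITS	9	/* 512 entries per table */

/* Virtual address bits translated by a page table of the given depth. */
unsigned
ex_va_bits(unsigned levels)
{
	assert(levels >= 1 && levels <= 5);
	return (EX_PAGE_SHIFT + EX_LEVEL_BITS * levels);
}

/*
 * Index into the table at the given level (1 = page table, 4 = PML4,
 * 5 = PML5) used when translating va.
 */
unsigned
ex_pt_index(uint64_t va, unsigned level)
{
	assert(level >= 1 && level <= 5);
	return ((unsigned)(va >> (EX_PAGE_SHIFT + EX_LEVEL_BITS * (level - 1))) &
	    0x1ffu);
}
```

Four levels give 48 bits and five give 57. An address with only bit 48 set has PML5 index 1 and PML4 index 0, so a single extra top-level entry is enough to reach an existing 4-level tree.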
* amd64 pmap: potential integer overflowing expression | Eric van Gyzen | 2020-08-21 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | Coverity has identified the line in this change as "Potential integer overflowing expression" due to the variable i declared as an int and used in an expression with vm_paddr_t, a 64-bit variable. This change has very little effect as when this line is executed nkpt is small and phys_addr is at the beginning of physical memory. But there is no explicit protection that the above is true. Submitted by: bret_ketchum@dell.com Reported by: Coverity Reviewed by: markj MFC after: 2 weeks Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D26141 Notes: svn path=/head/; revision=364457
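The class of problem Coverity flags here is a product computed in a 32-bit type before it is widened to 64 bits. The sketch below illustrates it with hypothetical helpers; it uses unsigned 32-bit arithmetic to model the truncation, because overflowing a signed `int` as in the real code would be undefined behaviour in C. It is not the line changed by the commit.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SIZE	4096u

/*
 * Narrow version: the multiplication happens in 32 bits and wraps before
 * the result is widened for the return value.
 */
uint64_t
ex_offset_narrow(uint32_t i)
{
	uint32_t off = i * EX_PAGE_SIZE;

	return (off);
}

/* Wide version: widen the index first, then multiply in 64 bits. */
uint64_t
ex_offset_wide(uint32_t i)
{
	uint64_t off = (uint64_t)i * EX_PAGE_SIZE;

	assert((off >> 12) == i);	/* no bits were lost */
	return (off);
}
```

For small indices both versions agree, which matches the message's remark that the change has little practical effect; they diverge once the index reaches 2^20 pages (4 GiB).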
* Use pmap_mapbios() to map ACPI tables on amd64 and i386. | Mark Johnston | 2020-08-20 | 1 | -92/+19
| | | | | | | | | | | | | | | | | | | The ACPI table-mapping code used pmap_kenter_temporary() to create mappings, which in turn uses the fixed-size crashdump map. Moreover, the code was not verifying that the table fits in this map, so when mapping large tables we could clobber adjacent mappings. This use of pmap_kenter_temporary() appears to predate support in pmap_mapbios() for creating early mappings, but that restriction no longer applies. PR: 248746 Reviewed by: kib, mav Tested by: gallatin, Curtis Villamizar <curtis@ipv6.occnc.com> MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26125 Notes: svn path=/head/; revision=364411
* Remove some noisy ACPI tables messages from verbose dmesg. | Alexander Motin | 2020-08-19 | 1 | -9/+1
| | | | | | | | | | | | Those messages were printed hundreds of times during boot, often multiple times for each table. We already print information about the tables in more organized form once to not duplicate it when random ACPI drivers are attaching. MFC after: 1 week Notes: svn path=/head/; revision=364399
* linux: add sysctl compat.linux.use_emul_path | Mateusz Guzik | 2020-08-18 | 1 | -5/+9
| | | | | | | | | | | | | | | | This is a step towards facilitating jails with only Linux binaries. Supporting emul_path adds path lookups which are completely spurious if the binary at hand runs in a Linux-based root directory. It defaults to on (== current behavior). make -C /root/linux-5.3-rc8 -s -j 1 bzImage: use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total Notes: svn path=/head/; revision=364366
* linux: consistently use LFREEPATH instead of open-coding it | Mateusz Guzik | 2020-08-18 | 1 | -1/+1
| | | | Notes: svn path=/head/; revision=364365
* Export a routine to provide the TSC_AUX MSR value and use this in vmm. | Peter Grehan | 2020-08-18 | 2 | -6/+15
| | | | | | | | | | | Also, drop an unnecessary set of braces. Requested by: kib Reviewed by: kib MFC after: 3 weeks Notes: svn path=/head/; revision=364343
* Support guest rdtscp and rdpid instructions on Intel VT-x | Peter Grehan | 2020-08-18 | 6 | -7/+181
| | | | | | | | | | | | | | | | Enable any of rdtscp and/or rdpid for bhyve guests on Intel-based hosts that support the "enable RDTSCP" VM-execution control. Submitted by: adam_fenn.io Reported by: chuck Reviewed by: chuck, grehan, jhb Approved by: jhb (bhyve), grehan MFC after: 3 weeks Relnotes: Yes Differential Revision: https://reviews.freebsd.org/D26003 Notes: svn path=/head/; revision=364340
* Allow guest device MMIO access from bootmem memory segments. | Peter Grehan | 2020-08-18 | 1 | -2/+1
| | | | | | | | | | | | | | | | | | | Recent versions of UEFI have moved local APIC timer initialization into the early SEC phase which runs out of ROM, prior to self-relocating into RAM. This results in a hypervisor exit. Currently bhyve prevents instruction emulation from segments that aren't marked as "sysmem" aka guest RAM, with the vm_gpa_hold() routine failing. However, there is no reason for this restriction: the hypervisor already controls whether EPT mappings are marked as executable. Fix by dropping the redundant check of sysmem. MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D25955 Notes: svn path=/head/; revision=364339
* o Add machine/iommu.h and include MD iommu headers from it, | Ruslan Bukin | 2020-08-05 | 1 | -0/+6
| | | | | | | | | | | | | so we don't ifdef for every arch in busdma_iommu.c; o No need to include specialreg.h for x86, remove it. Requested by: andrew Reviewed by: kib Sponsored by: DARPA/AFRL Differential Revision: https://reviews.freebsd.org/D25957 Notes: svn path=/head/; revision=363929
* Allow swi_sched() to be called from NMI context. | Alexander Motin | 2020-07-25 | 3 | -0/+15
| | | | | | | | | | | | | | | | | | For purposes of handling hardware error reported via NMIs I need a way to escape NMI context, being too restrictive to do something significant. To do it this change introduces new swi_sched() flag SWI_FROMNMI, making it careful about used KPIs. On platforms allowing IPI sending from NMI context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI, otherwise it works just like SWI_DELAY. To handle the delayed SWIs this patch calls clk_intr_event on every hardclock() tick. MFC after: 2 weeks Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D25754 Notes: svn path=/head/; revision=363527
* Include TMPFS in all the GENERIC kernel configs | Alex Richardson | 2020-07-24 | 1 | -0/+1
| | | | | | | | | | | | | | | | Being able to use tmpfs without kernel modules is very useful when building small MFS_ROOT kernels without a real file system. Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs. Compiling TMPFS only adds 4 .c files so this should not make much of a difference to NO_MODULES build times (as we do for our minimal RISC-V images). Reviewed By: br (earlier version for riscv), brooks, emaste Differential Revision: https://reviews.freebsd.org/D25317 Notes: svn path=/head/; revision=363471
* Untie nmi_handle_intr() from DEV_ISA. | Alexander Motin | 2020-07-22 | 1 | -4/+0
| | | | | | | | | | The only part of nmi_handle_intr() depending on ISA is isa_nmi(), which is already wrapped. Entering debugger on NMI does not really depend on ISA. MFC after: 2 weeks Notes: svn path=/head/; revision=363431