path: root/sys/dev/acpica/acpi_cpu.c
Commit message | Author | Age | Files | Lines
* Replace calls to bus_generic_attach with bus_attach_children | John Baldwin | 2024-12-06 | 1 | -1/+1
  Reviewed by: imp
  Differential Revision: https://reviews.freebsd.org/D47675
* Replace calls to bus_generic_probe with bus_identify_children | John Baldwin | 2024-12-06 | 1 | -1/+1
  Reviewed by: imp
  Differential Revision: https://reviews.freebsd.org/D47674
* Use the correct idle routine on recent AMD EPYC servers | Andrew Gallatin | 2024-11-08 | 1 | -2/+7
  We have been incorrectly choosing the "hlt" idle method on modern AMD EPYC servers for C1 idle. This is because AMD also uses the Functional Fixed Hardware interface. Due to not parsing the table properly for AMD, and due to a weird quirk where the mwait latency for C1 is misinterpreted as the latency for hlt, we wind up choosing hlt for C1, which has a far higher wake up latency (similar to IO) of roughly 400us on my test system (AMD 7502P).

  This patch fixes this by:
  - Looking for AMD in addition to Intel in the FFH (Note the vendor id of "2" for AMD is not publicly documented, but AMD has confirmed they are using "2" and has promised to document it.)
  - Using mwait on AMD when specified in the table, and when CPUID says it is supported
  - Fixing a weird issue where we copy the contents of cx_ptr for C1 and when moving to C2, we do not reinitialize cx_ptr. This leads to mwait being selected, and ignoring the specified i/o halt method unless we clear mwait before looking at the table for C2.

  Differential Revision: https://reviews.freebsd.org/D47444
  Reviewed by: dab, kib, vangyzen
  Sponsored by: Netflix
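The selection logic this commit describes can be sketched roughly as follows. This is an illustrative model, not the driver's code: the struct layouts, enum names, and `select_idle_methods()` are hypothetical, and the Intel vendor ID of 1 is an assumption. Only the AMD vendor ID of 2 and the need to clear the mwait choice before parsing the next state come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical idle methods; names are for illustration only. */
enum idle_method { IDLE_HLT, IDLE_MWAIT, IDLE_IO_READ };

#define FFH_VENDOR_INTEL 1 /* assumption, not stated in the commit */
#define FFH_VENDOR_AMD   2 /* value quoted in the commit message */

/* Hypothetical, simplified view of one _CST entry. */
struct cst_entry {
    bool is_ffh;      /* register lives in Functional Fixed Hardware space */
    int  vendor;      /* FFH vendor ID, meaningful only if is_ffh */
    bool wants_mwait; /* table asks for mwait, meaningful only if is_ffh */
};

/* Scratch state reused across entries, standing in for a shared cx_ptr. */
struct cx_scratch {
    bool use_mwait;
};

/*
 * Pick an idle method for each entry. The scratch state is cleared before
 * each entry so an mwait choice made for C1 cannot leak into C2.
 */
void
select_idle_methods(const struct cst_entry *entries, size_t n,
    bool cpu_has_mwait, enum idle_method *out)
{
    struct cx_scratch cx = { .use_mwait = false };

    for (size_t i = 0; i < n; i++) {
        const struct cst_entry *e = &entries[i];

        cx.use_mwait = false; /* models the reinitialization the fix adds */

        if (e->is_ffh) {
            bool known_vendor = e->vendor == FFH_VENDOR_INTEL ||
                e->vendor == FFH_VENDOR_AMD;
            if (known_vendor && e->wants_mwait && cpu_has_mwait)
                cx.use_mwait = true;
        }

        if (cx.use_mwait)
            out[i] = IDLE_MWAIT;
        else if (e->is_ffh)
            out[i] = IDLE_HLT;
        else
            out[i] = IDLE_IO_READ; /* the I/O method the table specifies */
    }
}
```

In this model, dropping the `cx.use_mwait = false` line would let a C1 mwait decision override the I/O method of a following C2 entry, which is the kind of leak the third bullet describes.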
* acpi_cpu: Reduce BUS_MASTER_RLD manipulations | Alexander Motin | 2023-12-26 | 1 | -9/+9
  Instead of setting and clearing the BUS_MASTER_RLD register on every C3 state enter/exit, set it only once if the system supports C3 state and we are going to "disable" bus master arbitration while in it. This is what Linux has done for the past 14 years, and for even longer this register has not been implemented in relevant hardware. At the same time, since this is only a single bit in a bigger register, ACPI has to take a global lock and do a read-modify-write for it. That is too expensive, saved only by C3 not being entered frequently, but enough to be seen in idle system CPU profiles.

  MFC after: 1 month
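The cost difference between the two patterns can be illustrated with a small model. Everything here is hypothetical (the `fake_pm_regs` struct and the function names are not from the driver); each call to `write_bm_rld()` stands in for one global-lock acquisition plus read-modify-write.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the register; counts expensive writes. */
struct fake_pm_regs {
    bool bus_master_rld;
    unsigned long locked_writes; /* each models lock + read-modify-write */
};

static void
write_bm_rld(struct fake_pm_regs *regs, bool value)
{
    regs->bus_master_rld = value;
    regs->locked_writes++;
}

/* Old pattern: set on every C3 entry, clear on every exit. */
void
c3_cycle_old(struct fake_pm_regs *regs)
{
    write_bm_rld(regs, true);
    /* ... the CPU would sleep in C3 here ... */
    write_bm_rld(regs, false);
}

/* New pattern: set once, only if C3 exists and arbitration will be disabled. */
void
c3_setup_new(struct fake_pm_regs *regs, bool has_c3, bool disable_arbitration)
{
    if (has_c3 && disable_arbitration)
        write_bm_rld(regs, true);
}

/* New pattern: the idle path no longer touches the register. */
void
c3_cycle_new(struct fake_pm_regs *regs)
{
    (void)regs;
    /* ... the CPU would sleep in C3 here ... */
}
```

Under this model, a thousand C3 cycles cost two thousand locked writes with the old pattern and a single write with the new one.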
* sys: Remove $FreeBSD$: one-line .c pattern | Warner Losh | 2023-08-16 | 1 | -2/+0
  Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
* acpi: Create cppc_notify sysctl before it is checked | Tom Jones | 2022-10-23 | 1 | -8/+8
  Reported by: Henrix
  Reviewed by: jhb
  Differential Revision: https://reviews.freebsd.org/D37081
* acpi: Put CPPC workaround behind i386/amd64 if def | Tom Jones | 2022-10-11 | 1 | -0/+4
  While CPPC is available on arm64 platforms with ACPI we don't know if we need to work around issues with firmware there.
* acpi: Tell SMM we will handle CPPC notifications | Tom Jones | 2022-10-10 | 1 | -0/+14
  Buggy SMM implementations can hang while processing CPPC notifications. This leads to some laptops (notably Thinkpads) hanging when the hwpstate_intel driver is loaded.

  Tell the SMM that we will handle CPPC notifications as described in:
  - Intel® Processor Vendor-Specific ACPI
  - Intel® 64 and IA-32 Architectures Software Developer’s Manual

  CPPC events default to masked (disabled) so while we do not do any handling right now this does not seem to lead to any issues.

  This approach was found via this Linux Kernel patch: https://lkml.org/lkml/2016/3/17/563

  PR: 253288
  Reviewed by: imp, jhb
  Sponsored by: Modirum
  Sponsored by: Klara, Inc.
  Differential Revision: https://reviews.freebsd.org/D36699
* acpi: Remove unused devclass arguments to DRIVER_MODULE. | John Baldwin | 2022-05-06 | 1 | -2/+1
* acpi_cpu: Use device_get_devclass to find devclass in attach. | John Baldwin | 2022-04-21 | 1 | -1/+2
  Reviewed by: imp
  Differential Revision: https://reviews.freebsd.org/D34988
* xen/acpi: upload Cx and Px data to Xen | Roger Pau Monné | 2022-04-12 | 1 | -1/+1
  When FreeBSD is running as dom0 (initial domain) on a Xen system it has access to the native ACPI tables and is the OSPM. However the hypervisor is the entity in charge of the CPU idle and frequency states, and in order to perform this duty it requires information found in the ACPI dynamic tables that can only be parsed by the OSPM.

  Introduce a new Xen specific ACPI driver to fetch the Processor related information and upload it to Xen.

  Note that this driver needs to take precedence over the generic ACPI CPU driver when running as dom0, so downgrade the probe score of the native driver to BUS_PROBE_DEFAULT in order for the Xen specific driver to use BUS_PROBE_SPECIFIC.

  Tested on an Intel NUC to successfully parse and upload both the Cx and Px states to Xen.

  Sponsored by: Citrix Systems R&D
  Reviewed by: jhb kib
  Differential revision: https://reviews.freebsd.org/D34841
* acpica: Remove CTLFLAG_NEEDGIANT from most sysctls. | Alexander Motin | 2021-12-27 | 1 | -60/+48
  MFC after: 2 weeks
* acpi_cpu: Replace Giant with bus_topo_lock. | Alexander Motin | 2021-12-10 | 1 | -2/+2
* Check cpu_softc is not NULL before dereferencing | Andrew Turner | 2021-09-27 | 1 | -0/+3
  In the acpi_cpu_postattach SYSINIT function cpu_softc may be NULL, e.g. on arm64 when booting from FDT. Check it is not NULL at the start of the function so we don't try to dereference a NULL pointer.

  Sponsored by: The FreeBSD Foundation
* acpi_cpu: Fix panic if some CPU devices are disabled. | Alexander Motin | 2021-09-25 | 1 | -37/+29
  While there, remove a couple of unneeded global variables.
* acpi_cpu: Make device unit numbers match OS CPU IDs. | Alexander Motin | 2021-09-25 | 1 | -62/+21
  There are already APIC ID, ACPI ID and OS ID for each CPU. In a perfect world all of those may match, but at least for SuperMicro server boards none of them do. Plus none of them match the CPU devices listing order by ACPI.

  Previous code used the ACPI device listing order to number cpuX devices. It looked nice from a NewBus perspective, but introduced a 4th different set of IDs. An extremely confusing one, since in some places the device unit numbers were treated as OS CPU IDs (coretemp), but not in others (sysctl dev.cpu.X.%location).
* Move time math out of disabled interrupts sections. | Alexander Motin | 2021-03-10 | 1 | -14/+16
  We don't need the result before the next sleep time, so there is no reason to additionally increase interrupt latency.

  While there, remove the extra PM ticks to microseconds conversion, which made C2/C3 sleep times look 4 times smaller than they really are. The conversion is already done by AcpiGetTimerDuration(). Now I see reported sleep times up to 0.5s, just as expected for the planned 2 wakeups per second.

  MFC after: 1 month
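The effect of converting twice can be shown with a short calculation. This sketch assumes the ACPI PM timer frequency of 3.579545 MHz and models the redundant second conversion as a divide by four (an assumed approximation of about four ticks per microsecond, chosen to match the "4 times smaller" figure in the commit message). The function names are hypothetical and do not appear in the driver.

```c
#include <stdint.h>

#define PM_TIMER_HZ 3579545ULL /* ACPI PM timer frequency in Hz */

/* Convert PM timer ticks to microseconds once; this is the correct result. */
uint64_t
pm_ticks_to_usec(uint64_t ticks)
{
    return ticks * 1000000ULL / PM_TIMER_HZ;
}

/*
 * Model of the bug: a value that is already in microseconds is treated as
 * ticks and divided by roughly four ticks per microsecond again (assumption).
 */
uint64_t
usec_after_double_conversion(uint64_t ticks)
{
    return pm_ticks_to_usec(ticks) >> 2;
}
```

For one second of ticks (3579545), the single conversion yields 1000000 microseconds, while the double conversion yields 250000, a quarter of the real sleep time.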
* Do not read timer extra time when MWAIT is used. | Alexander Motin | 2021-03-08 | 1 | -9/+10
  When we enter a C2+ state via memory read, it may take the chipset some time to stop the CPU. An extra register read covers that time. But MWAIT makes the CPU stop immediately, so we don't need to waste time after wakeup with interrupts still disabled, increasing latency.

  On my system it reduces ping localhost latency, waking up all CPUs once a second, from 277us to 242us.

  MFC after: 1 month
* Change mwait_bm_avoidance use to match Linux. | Alexander Motin | 2021-03-08 | 1 | -4/+6
  Even though the information is very limited, it seems the intent of this flag is to control ACPI_BITREG_BUS_MASTER_STATUS use for C3, not to force ACPI_BITREG_ARB_DISABLE manipulations for C2, where it was never needed, and where that register has not really done anything for years. It wasted lots of CPU time on the congested global ACPI hardware lock when many CPU cores were trying to enter/exit deep C-states at the same time.

  On an idle 80-core system it pushed ping localhost latency up to 20ms, since badport_bandlim() via counter_ratecheck() wakes up all CPUs at the same time once a second just to synchronously reset the counters. Now enabling C-states increases the latency from 0.1 to just 0.25ms.

  Discussed with: kib
  MFC after: 1 month
* acpica: clean up empty lines in .c and .h files | Mateusz Guzik | 2020-09-01 | 1 | -1/+0
  Notes: svn path=/head/; revision=365096
* Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) | Pawel Biernacki | 2020-02-26 | 1 | -18/+17
  r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes.

  This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags.

  Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT.

  Approved by: kib (mentor, blanket)
  Commented by: kib, gallatin, melifaro
  Differential Revision: https://reviews.freebsd.org/D23718
  Notes: svn path=/head/; revision=358333
* Distinguish _CID match and _HID match and make lower priority probe | Takanori Watanabe | 2018-10-26 | 1 | -1/+1
  when _CID match.

  Reviewed by: jhb, imp
  Differential Revision: https://reviews.freebsd.org/D16468
  Notes: svn path=/head/; revision=339754
* Use device_quiet_children to silence verbose CPU probe messages. | Warner Losh | 2018-05-07 | 1 | -0/+5
  Have cpu0 be noisy, but all the other CPU devices be quiet on boot.

  Notes: svn path=/head/; revision=333334
* Implement ACPI CPU support when Processor object is not present | Conrad Meyer | 2017-12-19 | 1 | -35/+67
  By the ACPI standard (ACPI 5 chapter 8.4 Declaring Processors) Processors can be implemented in 2 distinct ways:
  * Through a Processor object type (which provides P_BLK)
  * Through a Device object type

  Prior to this change, the FreeBSD driver only supported the former. AMD Epyc / Poweredge systems we are testing both implement the latter only. Add the missing support.

  Because P_BLK is not defined in the device object case, C-states entering must be completely controlled via _CST methods rather than P_LVL2/3.

  John Baldwin points out that ACPI 6.0 formally deprecates the Processor keyword, so eventually processors will only be enumerated as Device objects.

  Submitted by: attilio
  Reviewed by: jhb, markj, Anton Rang <rang AT acm.org>
  Relnotes: maybe
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D13457
  Notes: svn path=/head/; revision=326956
* Merge ACPICA 20170929 (take 2). | Jung-uk Kim | 2017-10-10 | 1 | -1/+1
  Notes: svn path=/head/; revision=324502
* Revert r324109. This commit broke a number of systems. | Jung-uk Kim | 2017-09-30 | 1 | -1/+1
  Reported by: lwhsu, kib
  Requested by: ngie
  Notes: svn path=/head/; revision=324136
* Merge ACPICA 20170929. | Jung-uk Kim | 2017-09-29 | 1 | -1/+1
  Notes: svn path=/head/; revision=324109
* Merge ACPICA 20170831. | Jung-uk Kim | 2017-08-31 | 1 | -3/+3
  Notes: svn path=/head/; revision=323076
* Corrected misspelled versions of rendezvous. | Patrick Kelsey | 2017-04-09 | 1 | -2/+2
  The MFC will include a compat definition of smp_no_rendevous_barrier() that calls smp_no_rendezvous_barrier().

  Reviewed by: gnn, kib
  MFC after: 1 week
  Differential Revision: https://reviews.freebsd.org/D10313
  Notes: svn path=/head/; revision=316648
* Remove cpu_deepest_sleep variable. | Konstantin Belousov | 2017-02-24 | 1 | -8/+1
  On Core2 and older Intel CPUs, where TSC stops in C2, the system does not allow C2 entrance if the timecounter hardware is TSC. This is done by tc_windup(), which tests for the TC_FLAGS_C2STOP flag of the new timecounter and increases cpu_disable_c2_sleep if the flag is set. Right now init_TSC_tc() only sets the flag if cpu_deepest_sleep >= 2, but TSC is initialized too early for this variable to be set by acpi_cpu.c.

  There is no reason to require that ACPI reported C2 and deeper states to set TC_FLAGS_C2STOP, so remove the cpu_deepest_sleep test from the init_TSC_tc() condition. And since this is the only use of the variable, remove it altogether.

  Reported and submitted by: Jia-Shiun Li <jiashiun@gmail.com>
  Suggested by: jhb
  MFC after: 2 weeks
  Notes: svn path=/head/; revision=314211
* Ensure the idle thread's loop services interrupts in a timely way when | Jonathan T. Looney | 2017-02-08 | 1 | -0/+3
  using the ACPI C1/mwait sleep method.

  Previously, the mwait instruction would return when an interrupt was pending; however, the idle loop did not actually enable interrupts when this occurred. This led to a situation where the idle loop could quickly spin through the C1/mwait sleep method a number of times when an interrupt was pending. (Eventually, the situation corrected itself when something other than an interrupt triggered the idle loop to either enable interrupts or schedule another thread.)

  Reviewed by: kib, imp (earlier version)
  Input from: jhb
  MFC after: 1 week
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=313447
* Add an EARLY_AP_STARTUP option to start APs earlier during boot. | John Baldwin | 2016-05-14 | 1 | -0/+4
  Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot.

  This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed.

  This last feature fixes a problem with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs. However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu.

  Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case.

  As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code.

  These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).

  PR: kern/199321
  Reviewed by: markj, gnn, kib
  Sponsored by: Netflix
  Notes: svn path=/head/; revision=299746
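The vector arithmetic in the EARLY_AP_STARTUP message can be worked through with a tiny model. The function names and the example numbers are hypothetical; only the N * mp_ncpu versus N relationship comes from the commit message.

```c
/*
 * Vectors taken from the boot CPU's pool during boot in the old model:
 * every driver allocates n_per_cpu vectors for each of the ncpu CPUs,
 * and all of them land on the boot CPU until interrupts are shuffled.
 */
unsigned
boot_cpu_vectors_old(unsigned drivers, unsigned n_per_cpu, unsigned ncpu)
{
    return drivers * n_per_cpu * ncpu;
}

/*
 * With early AP startup each driver places vectors on their target CPUs,
 * so the boot CPU only carries its own share.
 */
unsigned
boot_cpu_vectors_new(unsigned drivers, unsigned n_per_cpu)
{
    return drivers * n_per_cpu;
}
```

As an illustration, four multiqueue drivers with two interrupts per CPU on a 64-CPU machine would ask the boot CPU for 512 vectors in the old model and for 8 in the new one.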
* sys/dev: minor spelling fixes. | Pedro F. Giffuni | 2016-05-03 | 1 | -1/+1
  Most affect comments, very few have user-visible effects.

  Notes: svn path=/head/; revision=298955
* Only count CPU devices that are using the ACPI CPU driver. | John Baldwin | 2016-04-28 | 1 | -1/+2
  Arguably we should only be doing the probe/attach to children of these devices as well.

  Tested by: Michal Stanek <mst_semihalf.com> (arm64)
  Differential Revision: https://reviews.freebsd.org/D6133
  Notes: svn path=/head/; revision=298754
* Optionally return the output capabilities list from _OSC. | John Baldwin | 2016-04-22 | 1 | -2/+2
  Both of the callers were expecting the input cap_set to be modified. This fixes them to request cap_set to be updated with the returned buffer.

  Reviewed by: jkim
  Differential Revision: https://reviews.freebsd.org/D6040
  Notes: svn path=/head/; revision=298484
* Queue the CPU-probing task after all acpi_cpu devices are attached. | John Baldwin | 2016-04-21 | 1 | -3/+10
  Eventually with earlier AP startup this code will change to call the startup function synchronously instead of queueing the task. Moving the time we queue the task should be a no-op since taskqueue threads don't start executing tasks until much later, but this reduces the diff with the earlier AP startup patches.

  Sponsored by: Netflix
  Notes: svn path=/head/; revision=298425
* There is no need to use array any more. No functional change. | Jung-uk Kim | 2016-04-20 | 1 | -5/+5
  Notes: svn path=/head/; revision=298379
* Remove query flag from acpi_EvaluateOSC(). This function does not support | Jung-uk Kim | 2016-04-20 | 1 | -2/+2
  return buffer (yet).

  Notes: svn path=/head/; revision=298377
* Add a wrapper for evaluating _OSC methods. | John Baldwin | 2016-04-20 | 1 | -15/+3
  This wrapper does not translate errors in the first word to ACPI error status returns. Use this wrapper in the acpi_cpu(4) driver in place of the existing _OSC code. While here, fix a bug where the wrong count of words was passed when invoking _OSC.

  Reviewed by: jkim
  MFC after: 2 weeks
  Differential Revision: https://reviews.freebsd.org/D6022
  Notes: svn path=/head/; revision=298370
* Add basic support for ACPI. It splits out the nexus driver to two new | Andrew Turner | 2015-06-11 | 1 | -0/+8
  drivers, one for fdt, one for acpi. It then uses this to decide if it will use fdt or acpi. The GICv2 (interrupt controller) and Generic Timer drivers have been updated to handle both cases.

  As this is early code we still need FDT to find the kernel console, and some parts are still missing, including PCI support.

  Differential Revision: https://reviews.freebsd.org/D2463
  Reviewed by: jhb, jkim, emaste
  Obtained from: ABT Systems Ltd
  Relnotes: Yes
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=284273
* Check status of AcpiReadBitRegister() calls. | Jung-uk Kim | 2015-06-09 | 1 | -4/+6
  Reported by: Coverity
  CID: 1306132
  Notes: svn path=/head/; revision=284195
* Do not probe Intel PIIX4 south bridge quirks on amd64. These quirky south | Jung-uk Kim | 2015-05-21 | 1 | -7/+14
  bridges only supported Intel Pentium and Pentium II era processors and there is no reason for hardware virtualizations to emulate these quirks.

  MFC after: 1 week
  Notes: svn path=/head/; revision=283261
* Hide code only used on i386 and amd64. | Andrew Turner | 2015-05-11 | 1 | -1/+6
  Notes: svn path=/head/; revision=282771
* If x86 CPU implementation of the MWAIT instruction reasonably | Konstantin Belousov | 2015-05-09 | 1 | -19/+157
  interacts with interrupts, query ACPI and use MWAIT for entrance into Cx sleep states. Support C1 "I/O then halt" mode. See Intel's document 302223-007 "Intel® Processor Vendor-Specific ACPI Interface Specification" for description.

  Move the acpi_cpu_c1() function into x86/cpu_machdep.c and use it instead of inlining the "sti; hlt" sequence in several places.

  In the acpi(4) man page, besides documenting the dev.cpu.N.cx_methods sysctl, correct the names for dev.cpu.N.{cx_usage,cx_lowest,cx_supported} sysctls.

  Both jkim and avg have some other patches implementing the mwait functionality; this work is unrelated. Linux does not rely on the ACPI to provide correct tables describing Cx modes. Instead, the driver has pre-defined knowledge of the CPU models, which was supplied by Intel.

  Tested by: pho (previous versions)
  Sponsored by: The FreeBSD Foundation
  Notes: svn path=/head/; revision=282678
* When disabling C3+ CPU states due to the CPU_QUIRK_NO_C3 quirk, don't | Colin Percival | 2015-01-18 | 1 | -1/+1
  accidentally enable non-existent states.

  This bug was triggered if ACPI advertises the presence of a C2 state which we fail to parse via acpi_PkgGas due to our lack of support for FFixedHW resources, and causes an immediate panic when an attempt is made to enter the (NULL) state. One affected platform is the EC2 c4.8xlarge VM instance type; there may be others.

  MFC after: 1 week
  Thanks to: jkim, @_msw_
  Notes: svn path=/head/; revision=277318
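One plausible reading of this one-line fix is a clamp: when capping the state count at the deepest non-C3 state, never raise it above the number of states that were actually parsed. The sketch below illustrates that idea only; the function names and parameters are hypothetical and are not taken from the driver.

```c
/*
 * Hypothetical model of the buggy behavior: the count is overwritten with
 * the non-C3 limit even when fewer states were successfully parsed.
 */
int
cx_count_unclamped(int parsed_count, int deepest_non_c3_index)
{
    (void)parsed_count;
    return deepest_non_c3_index + 1;
}

/*
 * Hypothetical model of the fix: the quirk can only lower the count,
 * so a state that failed to parse is never exposed.
 */
int
cx_count_clamped(int parsed_count, int deepest_non_c3_index)
{
    int limit = deepest_non_c3_index + 1;

    return parsed_count < limit ? parsed_count : limit;
}
```

With one parsed state and a non-C3 index of 1, the unclamped version reports two states, the second of which does not exist, while the clamped version keeps the count at one.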
* On some Intel CPUs with a P-state but not C-state invariant TSC the TSC | John Baldwin | 2015-01-05 | 1 | -3/+14
  may also halt in C2 and not just C3 (it seems that in some cases the BIOS advertises its C3 state as a C2 state in _CST). Just play it safe and disable both C2 and C3 states if a user forces the use of the TSC as the timecounter on such CPUs.

  PR: 192316
  Differential Revision: https://reviews.freebsd.org/D1441
  No objection from: jkim
  MFC after: 1 week
  Notes: svn path=/head/; revision=276724
* xen: add ACPI bus to xen_nexus when running as Dom0 | Roger Pau Monné | 2014-08-04 | 1 | -1/+5
  Also disable a couple of ACPI devices that are not usable under Dom0. To this end a couple of booleans are added that allow disabling ACPI specific devices.

  Sponsored by: Citrix Systems R&D
  Reviewed by: jhb

  x86/xen/xen_nexus.c:
  - Return BUS_PROBE_SPECIFIC in the Xen Nexus attachment routine to force the usage of the Xen Nexus.
  - Attach the ACPI bus when running as Dom0.

  dev/acpica/acpi_cpu.c:
  dev/acpica/acpi_hpet.c:
  dev/acpica/acpi_timer.c:
  - Add a variable that gates the addition of the devices.

  x86/include/init.h:
  - Declare variables that control the attachment of ACPI cpu, hpet and timer devices.

  Notes: svn path=/head/; revision=269515
* Pull in r267961 and r267973 again. Fix for issues reported will follow. | Hans Petter Selasky | 2014-06-28 | 1 | -1/+0
  Notes: svn path=/head/; revision=267992
* Revert r267961, r267973: | Glen Barber | 2014-06-27 | 1 | -0/+1
  These changes prevent sysctl(8) from returning proper output, such as:
  1) no output from sysctl(8)
  2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
     truss: can not get etype: Cannot allocate memory

  Notes: svn path=/head/; revision=267985
* Extend the meaning of the CTLFLAG_TUN flag to automatically check if | Hans Petter Selasky | 2014-06-27 | 1 | -1/+0
  there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs.

  A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable.

  The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path.

  The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.

  Other changes:
  - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
  - Removed redundant TUNABLE statements throughout the kernel.
  - Some minor code rewrites in connection to removing not needed TUNABLE statements.
  - Added a missing SYSCTL_DECL().
  - Wrapped two very long lines.
  - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered.
  - Bumped FreeBSD version to indicate SYSCTL API change.

  MFC after: 2 weeks
  Sponsored by: Mellanox Technologies
  Notes: svn path=/head/; revision=267961