Diffstat (limited to 'lib/libpmc/pmc.corei7.3')
-rw-r--r-- | lib/libpmc/pmc.corei7.3 | 1576 |
1 files changed, 1576 insertions, 0 deletions
diff --git a/lib/libpmc/pmc.corei7.3 b/lib/libpmc/pmc.corei7.3 new file mode 100644 index 0000000000000..ec310548d08e2 --- /dev/null +++ b/lib/libpmc/pmc.corei7.3 @@ -0,0 +1,1576 @@ +.\" Copyright (c) 2010 Fabien Thomas. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD$ +.\" +.Dd March 24, 2010 +.Dt PMC.COREI7 3 +.Os +.Sh NAME +.Nm pmc.corei7 +.Nd measurement events for +.Tn Intel +.Tn Core i7 and Xeon 5500 +family CPUs +.Sh LIBRARY +.Lb libpmc +.Sh SYNOPSIS +.In pmc.h +.Sh DESCRIPTION +.Tn Intel +.Tn "Core i7" +CPUs contain PMCs conforming to version 2 of the +.Tn Intel +performance measurement architecture. 
+These CPUs may contain up to three classes of PMCs: +.Bl -tag -width "Li PMC_CLASS_IAP" +.It Li PMC_CLASS_IAF +Fixed-function counters that count only one hardware event per counter. +.It Li PMC_CLASS_IAP +Programmable counters that may be configured to count one of a defined +set of hardware events. +.El +.Pp +The number of PMCs available in each class and their widths need to be +determined at run time by calling +.Xr pmc_cpuinfo 3 . +.Pp +Intel Core i7 and Xeon 5500 PMCs are documented in +.Rs +.%B "Intel(R) 64 and IA-32 Architectures Software Developer's Manual" +.%T "Volume 3B: System Programming Guide, Part 2" +.%N "Order Number: 253669-033US" +.%D December 2009 +.%Q "Intel Corporation" +.Re +.Ss COREI7 AND XEON 5500 FIXED FUNCTION PMCS +These PMCs and their supported events are documented in +.Xr pmc.iaf 3 . +Not all CPUs in this family implement fixed-function counters. +.Ss COREI7 AND XEON 5500 PROGRAMMABLE PMCS +The programmable PMCs support the following capabilities: +.Bl -column "PMC_CAP_INTERRUPT" "Support" +.It Em Capability Ta Em Support +.It PMC_CAP_CASCADE Ta \&No +.It PMC_CAP_EDGE Ta Yes +.It PMC_CAP_INTERRUPT Ta Yes +.It PMC_CAP_INVERT Ta Yes +.It PMC_CAP_READ Ta Yes +.It PMC_CAP_PRECISE Ta \&No +.It PMC_CAP_SYSTEM Ta Yes +.It PMC_CAP_TAGGING Ta \&No +.It PMC_CAP_THRESHOLD Ta Yes +.It PMC_CAP_USER Ta Yes +.It PMC_CAP_WRITE Ta Yes +.El +.Ss Event Qualifiers +Event specifiers for these PMCs support the following common +qualifiers: +.Bl -tag -width indent +.It Li rsp= Ns Ar value +Configure the Off-core Response bits. +.Bl -tag -width indent +.It Li DMND_DATA_RD +Counts the number of demand and DCU prefetch data reads of full +and partial cachelines as well as demand data page table entry +cacheline reads. Does not count L2 data read prefetches or +instruction fetches. +.It Li DMND_RFO +Counts the number of demand and DCU prefetch reads for ownership +(RFO) requests generated by a write to a data cacheline. Does not +count L2 RFO. 
+.It Li DMND_IFETCH +Counts the number of demand and DCU prefetch instruction cacheline +reads. Does not count L2 code read prefetches. +.It Li WB +Counts the number of writeback (modified to exclusive) transactions. +.It Li PF_DATA_RD +Counts the number of data cacheline reads generated by L2 prefetchers. +.It Li PF_RFO +Counts the number of RFO requests generated by L2 prefetchers. +.It Li PF_IFETCH +Counts the number of code reads generated by L2 prefetchers. +.It Li OTHER +Counts one of the following transaction types, including L3 invalidate, +I/O, full or partial writes, WC or non-temporal stores, CLFLUSH, Fences, +lock, unlock, split lock. +.It Li UNCORE_HIT +L3 Hit: local or remote home requests that hit L3 cache in the uncore +with no coherency actions required (snooping). +.It Li OTHER_CORE_HIT_SNP +L3 Hit: local or remote home requests that hit L3 cache in the uncore +and were serviced by another core with a cross core snoop where no modified +copies were found (clean). +.It Li OTHER_CORE_HITM +L3 Hit: local or remote home requests that hit L3 cache in the uncore +and were serviced by another core with a cross core snoop where modified +copies were found (HITM). +.It Li REMOTE_CACHE_FWD +L3 Miss: local homed requests that missed the L3 cache and were serviced +by forwarded data following a cross package snoop where no modified +copies were found. (Remote home requests are not counted.) +.It Li REMOTE_DRAM +L3 Miss: remote home requests that missed the L3 cache and were serviced +by remote DRAM. +.It Li LOCAL_DRAM +L3 Miss: local home requests that missed the L3 cache and were serviced +by local DRAM. +.It Li NON_DRAM +Non-DRAM requests that were serviced by IOH. +.El +.It Li cmask= Ns Ar value +Configure the PMC to increment only if the number of configured +events measured in a cycle is greater than or equal to +.Ar value . +.It Li edge +Configure the PMC to count the number of de-asserted to asserted +transitions of the conditions expressed by the other qualifiers. 
+If specified, the counter will increment only once whenever a +condition becomes true, irrespective of the number of clocks during +which the condition remains true. +.It Li inv +Invert the sense of comparison when the +.Dq Li cmask +qualifier is present, making the counter increment when the number of +events per cycle is less than the value specified by the +.Dq Li cmask +qualifier. +.It Li os +Configure the PMC to count events happening at processor privilege +level 0. +.It Li usr +Configure the PMC to count events occurring at privilege levels 1, 2 +or 3. +.El +.Pp +If neither of the +.Dq Li os +or +.Dq Li usr +qualifiers are specified, the default is to enable both. +.Ss Event Specifiers (Programmable PMCs) +Core i7 and Xeon 5500 programmable PMCs support the following events: +.Bl -tag -width indent +.It Li SB_DRAIN.ANY +.Pq Event 04H , Umask 07H +Counts the number of store buffer drains. +.It Li STORE_BLOCKS.AT_RET +.Pq Event 06H , Umask 04H +Counts number of loads delayed with at-Retirement block code. The following +loads need to be executed at retirement and wait for all senior stores on +the same thread to be drained: load splitting across 4K boundary (page +split), load accessing uncacheable (UC or USWC) memory, load lock, and load +with page table in UC or USWC memory region. +.It Li STORE_BLOCKS.L1D_BLOCK +.Pq Event 06H , Umask 08H +Cacheable loads delayed with L1D block code +.It Li PARTIAL_ADDRESS_ALIAS +.Pq Event 07H , Umask 01H +Counts false dependency due to partial address aliasing +.It Li DTLB_LOAD_MISSES.ANY +.Pq Event 08H , Umask 01H +Counts all load misses that cause a page walk +.It Li DTLB_LOAD_MISSES.WALK_COMPLETED +.Pq Event 08H , Umask 02H +Counts number of completed page walks due to load miss in the STLB. 
+.It Li DTLB_LOAD_MISSES.STLB_HIT +.Pq Event 08H , Umask 10H +Number of cache load STLB hits +.It Li DTLB_LOAD_MISSES.PDE_MISS +.Pq Event 08H , Umask 20H +Number of DTLB cache load misses where the low part of the linear to +physical address translation was missed. +.It Li DTLB_LOAD_MISSES.LARGE_WALK_COMPLETED +.Pq Event 08H , Umask 80H +Counts number of completed large page walks due to load miss in the STLB. +.It Li MEM_INST_RETIRED.LOADS +.Pq Event 0BH , Umask 01H +Counts the number of instructions with an architecturally-visible load +retired on the architected path. +In conjunction with ld_lat facility +.It Li MEM_INST_RETIRED.STORES +.Pq Event 0BH , Umask 02H +Counts the number of instructions with an architecturally-visible store +retired on the architected path. +In conjunction with ld_lat facility +.It Li MEM_INST_RETIRED.LATENCY_ABOVE_THRESHOLD +.Pq Event 0BH , Umask 10H +Counts the number of instructions exceeding the latency specified with +ld_lat facility. +In conjunction with ld_lat facility +.It Li MEM_STORE_RETIRED.DTLB_MISS +.Pq Event 0CH , Umask 01H +The event counts the number of retired stores that missed the DTLB. The DTLB +miss is not counted if the store operation causes a fault. Does not count +prefetches. Counts both primary and secondary misses to the TLB +.It Li UOPS_ISSUED.ANY +.Pq Event 0EH , Umask 01H +Counts the number of Uops issued by the Register Allocation Table to the +Reservation Station, i.e. the UOPs issued from the front end to the back +end. +.It Li UOPS_ISSUED.STALLED_CYCLES +.Pq Event 0EH , Umask 01H +Counts the number of cycles in which no Uops are issued by the Register Allocation Table +to the Reservation Station, i.e. the UOPs issued from the front end to the +back end. +Set invert=1, cmask=1. +.It Li UOPS_ISSUED.FUSED +.Pq Event 0EH , Umask 02H +Counts the number of fused Uops that were issued from the Register +Allocation Table to the Reservation Station. 
+.It Li MEM_UNCORE_RETIRED.L3_DATA_MISS_UNKNOWN +.Pq Event 0FH , Umask 01H +Counts number of memory load instructions retired where the memory reference +missed L3 and data source is unknown. +Available only for CPUID signature 06_2EH +.It Li MEM_UNCORE_RETIRED.OTHER_CORE_L2_HITM +.Pq Event 0FH , Umask 02H +Counts number of memory load instructions retired where the memory reference +hit modified data in a sibling core residing on the same socket. +.It Li MEM_UNCORE_RETIRED.REMOTE_CACHE_LOCAL_HOME_HIT +.Pq Event 0FH , Umask 08H +Counts number of memory load instructions retired where the memory reference +missed the L1, L2 and L3 caches and HIT in a remote socket's cache. Only +counts locally homed lines. +.It Li MEM_UNCORE_RETIRED.REMOTE_DRAM +.Pq Event 0FH , Umask 10H +Counts number of memory load instructions retired where the memory reference +missed the L1, L2 and L3 caches and was remotely homed. This includes both +DRAM access and HITM in a remote socket's cache for remotely homed lines. +.It Li MEM_UNCORE_RETIRED.LOCAL_DRAM +.Pq Event 0FH , Umask 20H +Counts number of memory load instructions retired where the memory reference +missed the L1, L2 and L3 caches and required a local socket memory +reference. This includes locally homed cachelines that were in a modified +state in another socket. +.It Li MEM_UNCORE_RETIRED.UNCACHEABLE +.Pq Event 0FH , Umask 80H +Counts number of memory load instructions retired where the memory reference +missed the L1, L2 and L3 caches and to perform I/O. +Available only for CPUID signature 06_2EH +.It Li FP_COMP_OPS_EXE.X87 +.Pq Event 10H , Umask 01H +Counts the number of FP Computational Uops Executed. The number of FADD, +FSUB, FCOM, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTs, integer +DIVs, and IDIVs. This event does not distinguish an FADD used in the middle +of a transcendental flow from a separate FADD instruction. +.It Li FP_COMP_OPS_EXE.MMX +.Pq Event 10H , Umask 02H +Counts number of MMX Uops executed. 
+.It Li FP_COMP_OPS_EXE.SSE_FP +.Pq Event 10H , Umask 04H +Counts number of SSE and SSE2 FP uops executed. +.It Li FP_COMP_OPS_EXE.SSE2_INTEGER +.Pq Event 10H , Umask 08H +Counts number of SSE2 integer uops executed. +.It Li FP_COMP_OPS_EXE.SSE_FP_PACKED +.Pq Event 10H , Umask 10H +Counts number of SSE FP packed uops executed. +.It Li FP_COMP_OPS_EXE.SSE_FP_SCALAR +.Pq Event 10H , Umask 20H +Counts number of SSE FP scalar uops executed. +.It Li FP_COMP_OPS_EXE.SSE_SINGLE_PRECISION +.Pq Event 10H , Umask 40H +Counts number of SSE* FP single precision uops executed. +.It Li FP_COMP_OPS_EXE.SSE_DOUBLE_PRECISION +.Pq Event 10H , Umask 80H +Counts number of SSE* FP double precision uops executed. +.It Li SIMD_INT_128.PACKED_MPY +.Pq Event 12H , Umask 01H +Counts number of 128 bit SIMD integer multiply operations. +.It Li SIMD_INT_128.PACKED_SHIFT +.Pq Event 12H , Umask 02H +Counts number of 128 bit SIMD integer shift operations. +.It Li SIMD_INT_128.PACK +.Pq Event 12H , Umask 04H +Counts number of 128 bit SIMD integer pack operations. +.It Li SIMD_INT_128.UNPACK +.Pq Event 12H , Umask 08H +Counts number of 128 bit SIMD integer unpack operations. +.It Li SIMD_INT_128.PACKED_LOGICAL +.Pq Event 12H , Umask 10H +Counts number of 128 bit SIMD integer logical operations. +.It Li SIMD_INT_128.PACKED_ARITH +.Pq Event 12H , Umask 20H +Counts number of 128 bit SIMD integer arithmetic operations. +.It Li SIMD_INT_128.SHUFFLE_MOVE +.Pq Event 12H , Umask 40H +Counts number of 128 bit SIMD integer shuffle and move operations. +.It Li LOAD_DISPATCH.RS +.Pq Event 13H , Umask 01H +Counts number of loads dispatched from the Reservation Station that bypass +the Memory Order Buffer. +.It Li LOAD_DISPATCH.RS_DELAYED +.Pq Event 13H , Umask 02H +Counts the number of delayed RS dispatches at the stage latch. If an RS +dispatch can not bypass to LB, it has another chance to dispatch from the +one-cycle delayed staging latch before it is written into the LB. 
+.It Li LOAD_DISPATCH.MOB +.Pq Event 13H , Umask 04H +Counts the number of loads dispatched from the Reservation Station to the +Memory Order Buffer. +.It Li LOAD_DISPATCH.ANY +.Pq Event 13H , Umask 07H +Counts all loads dispatched from the Reservation Station. +.It Li ARITH.CYCLES_DIV_BUSY +.Pq Event 14H , Umask 01H +Counts the number of cycles the divider is busy executing divide or square +root operations. The divide can be integer, X87 or Streaming SIMD Extensions +(SSE). The square root operation can be either X87 or SSE. +Set 'edge=1, invert=1, cmask=1' to count the number of divides. +Count may be incorrect when SMT is on. +.It Li ARITH.MUL +.Pq Event 14H , Umask 02H +Counts the number of multiply operations executed. This includes integer as +well as floating point multiply operations but excludes DPPS mul and MPSAD. +Count may be incorrect when SMT is on. +.It Li INST_QUEUE_WRITES +.Pq Event 17H , Umask 01H +Counts the number of instructions written into the instruction queue every +cycle. +.It Li INST_DECODED.DEC0 +.Pq Event 18H , Umask 01H +Counts number of instructions that require decoder 0 to be decoded. Usually, +this means that the instruction maps to more than 1 uop +.It Li TWO_UOP_INSTS_DECODED +.Pq Event 19H , Umask 01H +An instruction that generates two uops was decoded +.It Li INST_QUEUE_WRITE_CYCLES +.Pq Event 1EH , Umask 01H +This event counts the number of cycles during which instructions are written +to the instruction queue. Dividing this counter by the number of +instructions written to the instruction queue (INST_QUEUE_WRITES) yields the +average number of instructions decoded each cycle. If this number is less +than four and the pipe stalls, this indicates that the decoder is failing to +decode enough instructions per cycle to sustain the 4-wide pipeline. +If SSE* instructions that are 6 bytes or longer arrive one after another, +then front end throughput may limit execution speed.
In such case, +.It Li LSD_OVERFLOW +.Pq Event 20H , Umask 01H +Counts number of loops that cannot stream from the instruction queue. +.It Li L2_RQSTS.LD_HIT +.Pq Event 24H , Umask 01H +Counts number of loads that hit the L2 cache. L2 loads include both L1D +demand misses as well as L1D prefetches. L2 loads can be rejected for +various reasons. Only non-rejected loads are counted. +.It Li L2_RQSTS.LD_MISS +.Pq Event 24H , Umask 02H +Counts the number of loads that miss the L2 cache. L2 loads include both L1D +demand misses as well as L1D prefetches. +.It Li L2_RQSTS.LOADS +.Pq Event 24H , Umask 03H +Counts all L2 load requests. L2 loads include both L1D demand misses as well +as L1D prefetches. +.It Li L2_RQSTS.RFO_HIT +.Pq Event 24H , Umask 04H +Counts the number of store RFO requests that hit the L2 cache. L2 RFO +requests include both L1D demand RFO misses as well as L1D RFO prefetches. +Count includes WC memory requests, where the data is not fetched but the +permission to write the line is required. +.It Li L2_RQSTS.RFO_MISS +.Pq Event 24H , Umask 08H +Counts the number of store RFO requests that miss the L2 cache. L2 RFO +requests include both L1D demand RFO misses as well as L1D RFO prefetches. +.It Li L2_RQSTS.RFOS +.Pq Event 24H , Umask 0CH +Counts all L2 store RFO requests. L2 RFO requests include both L1D demand +RFO misses as well as L1D RFO prefetches. +.It Li L2_RQSTS.IFETCH_HIT +.Pq Event 24H , Umask 10H +Counts number of instruction fetches that hit the L2 cache. L2 instruction +fetches include both L1I demand misses as well as L1I instruction +prefetches. +.It Li L2_RQSTS.IFETCH_MISS +.Pq Event 24H , Umask 20H +Counts number of instruction fetches that miss the L2 cache. L2 instruction +fetches include both L1I demand misses as well as L1I instruction +prefetches. +.It Li L2_RQSTS.IFETCHES +.Pq Event 24H , Umask 30H +Counts all instruction fetches. L2 instruction fetches include both L1I +demand misses as well as L1I instruction prefetches. 
+.It Li L2_RQSTS.PREFETCH_HIT +.Pq Event 24H , Umask 40H +Counts L2 prefetch hits for both code and data. +.It Li L2_RQSTS.PREFETCH_MISS +.Pq Event 24H , Umask 80H +Counts L2 prefetch misses for both code and data. +.It Li L2_RQSTS.PREFETCHES +.Pq Event 24H , Umask C0H +Counts all L2 prefetches for both code and data. +.It Li L2_RQSTS.MISS +.Pq Event 24H , Umask AAH +Counts all L2 misses for both code and data. +.It Li L2_RQSTS.REFERENCES +.Pq Event 24H , Umask FFH +Counts all L2 requests for both code and data. +.It Li L2_DATA_RQSTS.DEMAND.I_STATE +.Pq Event 26H , Umask 01H +Counts number of L2 data demand loads where the cache line to be loaded is +in the I (invalid) state, i.e. a cache miss. L2 demand loads are both L1D +demand misses and L1D prefetches. +.It Li L2_DATA_RQSTS.DEMAND.S_STATE +.Pq Event 26H , Umask 02H +Counts number of L2 data demand loads where the cache line to be loaded is +in the S (shared) state. L2 demand loads are both L1D demand misses and L1D +prefetches. +.It Li L2_DATA_RQSTS.DEMAND.E_STATE +.Pq Event 26H , Umask 04H +Counts number of L2 data demand loads where the cache line to be loaded is +in the E (exclusive) state. L2 demand loads are both L1D demand misses and +L1D prefetches. +.It Li L2_DATA_RQSTS.DEMAND.M_STATE +.Pq Event 26H , Umask 08H +Counts number of L2 data demand loads where the cache line to be loaded is +in the M (modified) state. L2 demand loads are both L1D demand misses and +L1D prefetches. +.It Li L2_DATA_RQSTS.DEMAND.MESI +.Pq Event 26H , Umask 0FH +Counts all L2 data demand requests. L2 demand loads are both L1D demand +misses and L1D prefetches. +.It Li L2_DATA_RQSTS.PREFETCH.I_STATE +.Pq Event 26H , Umask 10H +Counts number of L2 prefetch data loads where the cache line to be loaded is +in the I (invalid) state, i.e. a cache miss. +.It Li L2_DATA_RQSTS.PREFETCH.S_STATE +.Pq Event 26H , Umask 20H +Counts number of L2 prefetch data loads where the cache line to be loaded is +in the S (shared) state. 
A prefetch RFO will miss on an S state line, while +a prefetch read will hit on an S state line. +.It Li L2_DATA_RQSTS.PREFETCH.E_STATE +.Pq Event 26H , Umask 40H +Counts number of L2 prefetch data loads where the cache line to be loaded is +in the E (exclusive) state. +.It Li L2_DATA_RQSTS.PREFETCH.M_STATE +.Pq Event 26H , Umask 80H +Counts number of L2 prefetch data loads where the cache line to be loaded is +in the M (modified) state. +.It Li L2_DATA_RQSTS.PREFETCH.MESI +.Pq Event 26H , Umask F0H +Counts all L2 prefetch requests. +.It Li L2_DATA_RQSTS.ANY +.Pq Event 26H , Umask FFH +Counts all L2 data requests. +.It Li L2_WRITE.RFO.I_STATE +.Pq Event 27H , Umask 01H +Counts number of L2 demand store RFO requests where the cache line to be +loaded is in the I (invalid) state, i.e. a cache miss. The L1D prefetcher +does not issue a RFO prefetch. +This is a demand RFO request +.It Li L2_WRITE.RFO.S_STATE +.Pq Event 27H , Umask 02H +Counts number of L2 store RFO requests where the cache line to be loaded is +in the S (shared) state. The L1D prefetcher does not issue a RFO prefetch. +This is a demand RFO request +.It Li L2_WRITE.RFO.M_STATE +.Pq Event 27H , Umask 08H +Counts number of L2 store RFO requests where the cache line to be loaded is +in the M (modified) state. The L1D prefetcher does not issue a RFO prefetch. +This is a demand RFO request +.It Li L2_WRITE.RFO.HIT +.Pq Event 27H , Umask 0EH +Counts number of L2 store RFO requests where the cache line to be loaded is +in either the S, E or M states. The L1D prefetcher does not issue a RFO +prefetch. +This is a demand RFO request +.It Li L2_WRITE.RFO.MESI +.Pq Event 27H , Umask 0FH +Counts all L2 store RFO requests. The L1D prefetcher does not issue a RFO +prefetch. +This is a demand RFO request +.It Li L2_WRITE.LOCK.I_STATE +.Pq Event 27H , Umask 10H +Counts number of L2 demand lock RFO requests where the cache line to be +loaded is in the I (invalid) state, i.e. a cache miss. 
+.It Li L2_WRITE.LOCK.S_STATE +.Pq Event 27H , Umask 20H +Counts number of L2 lock RFO requests where the cache line to be loaded is +in the S (shared) state. +.It Li L2_WRITE.LOCK.E_STATE +.Pq Event 27H , Umask 40H +Counts number of L2 demand lock RFO requests where the cache line to be +loaded is in the E (exclusive) state. +.It Li L2_WRITE.LOCK.M_STATE +.Pq Event 27H , Umask 80H +Counts number of L2 demand lock RFO requests where the cache line to be +loaded is in the M (modified) state. +.It Li L2_WRITE.LOCK.HIT +.Pq Event 27H , Umask E0H +Counts number of L2 demand lock RFO requests where the cache line to be +loaded is in either the S, E, or M state. +.It Li L2_WRITE.LOCK.MESI +.Pq Event 27H , Umask F0H +Counts all L2 demand lock RFO requests. +.It Li L1D_WB_L2.I_STATE +.Pq Event 28H , Umask 01H +Counts number of L1 writebacks to the L2 where the cache line to be written +is in the I (invalid) state, i.e. a cache miss. +.It Li L1D_WB_L2.S_STATE +.Pq Event 28H , Umask 02H +Counts number of L1 writebacks to the L2 where the cache line to be written +is in the S (shared) state. +.It Li L1D_WB_L2.E_STATE +.Pq Event 28H , Umask 04H +Counts number of L1 writebacks to the L2 where the cache line to be written +is in the E (exclusive) state. +.It Li L1D_WB_L2.M_STATE +.Pq Event 28H , Umask 08H +Counts number of L1 writebacks to the L2 where the cache line to be written +is in the M (modified) state. +.It Li L1D_WB_L2.MESI +.Pq Event 28H , Umask 0FH +Counts all L1 writebacks to the L2. +.It Li L3_LAT_CACHE.REFERENCE +.Pq Event 2EH , Umask 4FH +This event counts requests originating from the core that reference a cache +line in the last level cache. The event count includes speculative traffic +but excludes cache line fills due to a L2 hardware-prefetch. Because cache +hierarchy, cache sizes and other implementation-specific characteristics vary, +comparing values to estimate performance differences is not recommended.
+see Table A-1 +.It Li L3_LAT_CACHE.MISS +.Pq Event 2EH , Umask 41H +This event counts each cache miss condition for references to the last level +cache. The event count may include speculative traffic but excludes cache +line fills due to L2 hardware-prefetches. Because cache hierarchy, cache +sizes and other implementation-specific characteristics vary, comparing values to +estimate performance differences is not recommended. +see Table A-1 +.It Li CPU_CLK_UNHALTED.THREAD_P +.Pq Event 3CH , Umask 00H +Counts the number of thread cycles while the thread is not in a halt state. +The thread enters the halt state when it is running the HLT instruction. The +core frequency may change from time to time due to power or thermal +throttling. +see Table A-1 +.It Li CPU_CLK_UNHALTED.REF_P +.Pq Event 3CH , Umask 01H +Increments at the frequency of TSC when not halted. +see Table A-1 +.It Li L1D_CACHE_LD.I_STATE +.Pq Event 40H , Umask 01H +Counts L1 data cache read requests where the cache line to be loaded is in +the I (invalid) state, i.e. the read request missed the cache. +Counter 0, 1 only +.It Li L1D_CACHE_LD.S_STATE +.Pq Event 40H , Umask 02H +Counts L1 data cache read requests where the cache line to be loaded is in +the S (shared) state. +Counter 0, 1 only +.It Li L1D_CACHE_LD.E_STATE +.Pq Event 40H , Umask 04H +Counts L1 data cache read requests where the cache line to be loaded is in +the E (exclusive) state. +Counter 0, 1 only +.It Li L1D_CACHE_LD.M_STATE +.Pq Event 40H , Umask 08H +Counts L1 data cache read requests where the cache line to be loaded is in +the M (modified) state. +Counter 0, 1 only +.It Li L1D_CACHE_LD.MESI +.Pq Event 40H , Umask 0FH +Counts L1 data cache read requests. +Counter 0, 1 only +.It Li L1D_CACHE_ST.S_STATE +.Pq Event 41H , Umask 02H +Counts L1 data cache store RFO requests where the cache line to be loaded is +in the S (shared) state. 
+Counter 0, 1 only +.It Li L1D_CACHE_ST.E_STATE +.Pq Event 41H , Umask 04H +Counts L1 data cache store RFO requests where the cache line to be loaded is +in the E (exclusive) state. +Counter 0, 1 only +.It Li L1D_CACHE_ST.M_STATE +.Pq Event 41H , Umask 08H +Counts L1 data cache store RFO requests where the cache line to be loaded is in +the M (modified) state. +Counter 0, 1 only +.It Li L1D_CACHE_LOCK.HIT +.Pq Event 42H , Umask 01H +Counts retired load locks that hit in the L1 data cache or hit in an already +allocated fill buffer. The lock portion of the load lock transaction must +hit in the L1D. +The initial load will pull the lock into the L1 data cache. Counter 0, 1 +only +.It Li L1D_CACHE_LOCK.S_STATE +.Pq Event 42H , Umask 02H +Counts L1 data cache retired load locks that hit the target cache line in +the shared state. +Counter 0, 1 only +.It Li L1D_CACHE_LOCK.E_STATE +.Pq Event 42H , Umask 04H +Counts L1 data cache retired load locks that hit the target cache line in +the exclusive state. +Counter 0, 1 only +.It Li L1D_CACHE_LOCK.M_STATE +.Pq Event 42H , Umask 08H +Counts L1 data cache retired load locks that hit the target cache line in +the modified state. +Counter 0, 1 only +.It Li L1D_ALL_REF.ANY +.Pq Event 43H , Umask 01H +Counts all references (uncached, speculated and retired) to the L1 data +cache, including all loads and stores with any memory types. The event +counts memory accesses only when they are actually performed. For example, a +load blocked by unknown store address and later performed is only counted +once. +The event does not include non-memory accesses, such as I/O accesses. +Counter 0, 1 only +.It Li L1D_ALL_REF.CACHEABLE +.Pq Event 43H , Umask 02H +Counts all data reads and writes (speculated and retired) from cacheable +memory, including locked operations. +Counter 0, 1 only +.It Li DTLB_MISSES.ANY +.Pq Event 49H , Umask 01H +Counts the number of misses in the STLB which cause a page walk. 
+.It Li DTLB_MISSES.WALK_COMPLETED +.Pq Event 49H , Umask 02H +Counts number of misses in the STLB which resulted in a completed page walk. +.It Li DTLB_MISSES.STLB_HIT +.Pq Event 49H , Umask 10H +Counts the number of DTLB first level misses that hit in the second level +TLB. This event is only relevant if the core contains multiple DTLB levels. +.It Li DTLB_MISSES.PDE_MISS +.Pq Event 49H , Umask 20H +Number of DTLB misses caused by low part of address, includes references to 2M pages because 2M pages do not use the PDE. +.It Li DTLB_MISSES.LARGE_WALK_COMPLETED +.Pq Event 49H , Umask 80H +Counts number of misses in the STLB which resulted in a completed page walk for large pages. +.It Li LOAD_HIT_PRE +.Pq Event 4CH , Umask 01H +Counts load operations sent to the L1 data cache while a previous SSE +prefetch instruction to the same cache line has started prefetching but has +not yet finished. +.It Li L1D_PREFETCH.REQUESTS +.Pq Event 4EH , Umask 01H +Counts number of hardware prefetch requests dispatched out of the prefetch +FIFO. +.It Li L1D_PREFETCH.MISS +.Pq Event 4EH , Umask 02H +Counts number of hardware prefetch requests that miss the L1D. There are two +prefetchers in the L1D. A streamer, which predicts lines sequentially after +this one should be fetched, and the IP prefetcher that remembers access +patterns for the current instruction. The streamer prefetcher stops on an +L1D hit, while the IP prefetcher does not. +.It Li L1D_PREFETCH.TRIGGERS +.Pq Event 4EH , Umask 04H +Counts number of prefetch requests triggered by the Finite State Machine and +pushed into the prefetch FIFO. Some of the prefetch requests are dropped due +to overwrites or competition between the IP index prefetcher and streamer +prefetcher. The prefetch FIFO contains 4 entries. +.It Li L1D.REPL +.Pq Event 51H , Umask 01H +Counts the number of lines brought into the L1 data cache. 
+Counter 0, 1 only +.It Li L1D.M_REPL +.Pq Event 51H , Umask 02H +Counts the number of modified lines brought into the L1 data cache. +Counter 0, 1 only +.It Li L1D.M_EVICT +.Pq Event 51H , Umask 04H +Counts the number of modified lines evicted from the L1 data cache due to +replacement. +Counter 0, 1 only +.It Li L1D.M_SNOOP_EVICT +.Pq Event 51H , Umask 08H +Counts the number of modified lines evicted from the L1 data cache due to +snoop HITM intervention. +Counter 0, 1 only +.It Li L1D_CACHE_PREFETCH_LOCK_FB_HIT +.Pq Event 52H , Umask 01H +Counts the number of cacheable load lock speculated instructions accepted +into the fill buffer. +.It Li L1D_CACHE_LOCK_FB_HIT +.Pq Event 53H , Umask 01H +Counts the number of cacheable load lock speculated or retired instructions +accepted into the fill buffer. +.It Li CACHE_LOCK_CYCLES.L1D_L2 +.Pq Event 63H , Umask 01H +Cycle count during which the L1D and L2 are locked. A lock is asserted when +there is a locked memory access, due to uncacheable memory, a locked +operation that spans two cache lines, or a page walk from an uncacheable +page table. +Counter 0, 1 only. L1D and L2 locks have a very high performance penalty and +it is highly recommended to avoid such accesses. +.It Li CACHE_LOCK_CYCLES.L1D +.Pq Event 63H , Umask 02H +Counts the number of cycles that cacheline in the L1 data cache unit is +locked. +Counter 0, 1 only. +.It Li IO_TRANSACTIONS +.Pq Event 6CH , Umask 01H +Counts the number of completed I/O transactions. +.It Li L1I.HITS +.Pq Event 80H , Umask 01H +Counts all instruction fetches that hit the L1 instruction cache. +.It Li L1I.MISSES +.Pq Event 80H , Umask 02H +Counts all instruction fetches that miss the L1I cache. This includes +instruction cache misses, streaming buffer misses, victim cache misses and +uncacheable fetches. An instruction fetch miss is counted only once and not +once for every cycle it is outstanding. 
+.It Li L1I.READS +.Pq Event 80H , Umask 03H +Counts all instruction fetches, including uncacheable fetches that bypass +the L1I. +.It Li L1I.CYCLES_STALLED +.Pq Event 80H , Umask 04H +Cycle counts for which an instruction fetch stalls due to a L1I cache miss, +ITLB miss or ITLB fault. +.It Li LARGE_ITLB.HIT +.Pq Event 82H , Umask 01H +Counts number of large ITLB hits. +.It Li ITLB_MISSES.ANY +.Pq Event 85H , Umask 01H +Counts the number of misses in all levels of the ITLB which cause a page +walk. +.It Li ITLB_MISSES.WALK_COMPLETED +.Pq Event 85H , Umask 02H +Counts number of misses in all levels of the ITLB which resulted in a +completed page walk. +.It Li ILD_STALL.LCP +.Pq Event 87H , Umask 01H +Cycles Instruction Length Decoder stalls due to length changing prefixes: +66, 67 or REX.W (for EM64T) instructions which change the length of the +decoded instruction. +.It Li ILD_STALL.MRU +.Pq Event 87H , Umask 02H +Instruction Length Decoder stall cycles due to Branch Prediction Unit (BPU) +Most Recently Used (MRU) bypass. +.It Li ILD_STALL.IQ_FULL +.Pq Event 87H , Umask 04H +Stall cycles due to a full instruction queue. +.It Li ILD_STALL.REGEN +.Pq Event 87H , Umask 08H +Counts the number of regen stalls. +.It Li ILD_STALL.ANY +.Pq Event 87H , Umask 0FH +Counts any cycles the Instruction Length Decoder is stalled. +.It Li BR_INST_EXEC.COND +.Pq Event 88H , Umask 01H +Counts the number of conditional near branch instructions executed, but not +necessarily retired. +.It Li BR_INST_EXEC.DIRECT +.Pq Event 88H , Umask 02H +Counts all unconditional near branch instructions excluding calls and +indirect branches. +.It Li BR_INST_EXEC.INDIRECT_NON_CALL +.Pq Event 88H , Umask 04H +Counts the number of executed indirect near branch instructions that are not +calls. +.It Li BR_INST_EXEC.NON_CALLS +.Pq Event 88H , Umask 07H +Counts all non call near branch instructions executed, but not necessarily +retired. 
+.It Li BR_INST_EXEC.RETURN_NEAR +.Pq Event 88H , Umask 08H +Counts indirect near branches that have a return mnemonic. +.It Li BR_INST_EXEC.DIRECT_NEAR_CALL +.Pq Event 88H , Umask 10H +Counts unconditional near call branch instructions, excluding non call +branch, executed. +.It Li BR_INST_EXEC.INDIRECT_NEAR_CALL +.Pq Event 88H , Umask 20H +Counts indirect near calls, including both register and memory indirect, +executed. +.It Li BR_INST_EXEC.NEAR_CALLS +.Pq Event 88H , Umask 30H +Counts all near call branches executed, but not necessarily retired. +.It Li BR_INST_EXEC.TAKEN +.Pq Event 88H , Umask 40H +Counts taken near branches executed, but not necessarily retired. +.It Li BR_INST_EXEC.ANY +.Pq Event 88H , Umask 7FH +Counts all near executed branches (not necessarily retired). This includes +only instructions and not micro-op branches. Frequent branching is not +necessarily a major performance issue. However, frequent branch +mispredictions may be a problem. +.It Li BR_MISP_EXEC.COND +.Pq Event 89H , Umask 01H +Counts the number of mispredicted conditional near branch instructions +executed, but not necessarily retired. +.It Li BR_MISP_EXEC.DIRECT +.Pq Event 89H , Umask 02H +Counts mispredicted macro unconditional near branch instructions, excluding +calls and indirect branches (should always be 0). +.It Li BR_MISP_EXEC.INDIRECT_NON_CALL +.Pq Event 89H , Umask 04H +Counts the number of executed mispredicted indirect near branch instructions +that are not calls. +.It Li BR_MISP_EXEC.NON_CALLS +.Pq Event 89H , Umask 07H +Counts mispredicted non call near branches executed, but not necessarily +retired. +.It Li BR_MISP_EXEC.RETURN_NEAR +.Pq Event 89H , Umask 08H +Counts mispredicted indirect branches that have a near return mnemonic. +.It Li BR_MISP_EXEC.DIRECT_NEAR_CALL +.Pq Event 89H , Umask 10H +Counts mispredicted non-indirect near calls executed (should always be 0). 
+.It Li BR_MISP_EXEC.INDIRECT_NEAR_CALL +.Pq Event 89H , Umask 20H +Counts mispredicted indirect near calls executed, including both register +and memory indirect. +.It Li BR_MISP_EXEC.NEAR_CALLS +.Pq Event 89H , Umask 30H +Counts all mispredicted near call branches executed, but not necessarily +retired. +.It Li BR_MISP_EXEC.TAKEN +.Pq Event 89H , Umask 40H +Counts executed mispredicted near branches that are taken, but not +necessarily retired. +.It Li BR_MISP_EXEC.ANY +.Pq Event 89H , Umask 7FH +Counts the number of mispredicted near branch instructions that were +executed, but not necessarily retired. +.It Li RESOURCE_STALLS.ANY +.Pq Event A2H , Umask 01H +Counts the number of Allocator resource related stalls. Includes register +renaming buffer entries, memory buffer entries. In addition to resource +related stalls, this event counts some other events. Includes stalls arising +during branch misprediction recovery, such as if retirement of the +mispredicted branch is delayed and stalls arising while store buffer is +draining from synchronizing operations. +Does not include stalls due to SuperQ (off core) queue full, too many cache +misses, etc. +.It Li RESOURCE_STALLS.LOAD +.Pq Event A2H , Umask 02H +Counts the cycles of stall due to lack of load buffer for load operation. +.It Li RESOURCE_STALLS.RS_FULL +.Pq Event A2H , Umask 04H +This event counts the number of cycles when the number of instructions in +the pipeline waiting for execution reaches the limit the processor can +handle. A high count of this event indicates that there are long latency +operations in the pipe (possibly load and store operations that miss the L2 +cache, or instructions dependent upon instructions further down the pipeline +that have yet to retire). +When RS is full, new instructions can not enter the reservation station and +start execution. 
+.It Li RESOURCE_STALLS.STORE +.Pq Event A2H , Umask 08H +This event counts the number of cycles that a resource related stall will +occur due to the number of store instructions reaching the limit of the +pipeline, (i.e. all store buffers are used). The stall ends when a store +instruction commits its data to the cache or memory. +.It Li RESOURCE_STALLS.ROB_FULL +.Pq Event A2H , Umask 10H +Counts the cycles of stall due to reorder buffer full. +.It Li RESOURCE_STALLS.FPCW +.Pq Event A2H , Umask 20H +Counts the number of cycles while execution was stalled due to writing the +floating-point unit (FPU) control word. +.It Li RESOURCE_STALLS.MXCSR +.Pq Event A2H , Umask 40H +Stalls due to the MXCSR register rename occurring too close to a previous +MXCSR rename. The MXCSR provides control and status for the MMX registers. +.It Li RESOURCE_STALLS.OTHER +.Pq Event A2H , Umask 80H +Counts the number of cycles while execution was stalled due to other +resource issues. +.It Li MACRO_INSTS.FUSIONS_DECODED +.Pq Event A6H , Umask 01H +Counts the number of instructions decoded that are macro-fused but not +necessarily executed or retired. +.It Li BACLEAR_FORCE_IQ +.Pq Event A7H , Umask 01H +Counts number of times a BACLEAR was forced by the Instruction Queue. The IQ +is also responsible for providing conditional branch prediction direction +based on a static scheme and dynamic data provided by the L2 Branch +Prediction Unit. If the conditional branch target is not found in the Target +Array and the IQ predicts that the branch is taken, then the IQ will force +the Branch Address Calculator to issue a BACLEAR. Each BACLEAR asserted by +the BAC generates approximately an 8 cycle bubble in the instruction fetch +pipeline. 
+.It Li LSD.UOPS +.Pq Event A8H , Umask 01H +Counts the number of micro-ops delivered by the loop stream detector. +Use cmask=1 and invert to count cycles. +.It Li ITLB_FLUSH +.Pq Event AEH , Umask 01H +Counts the number of ITLB flushes. +.It Li OFFCORE_REQUESTS.L1D_WRITEBACK +.Pq Event B0H , Umask 40H +Counts number of L1D writebacks to the uncore. +.It Li UOPS_EXECUTED.PORT0 +.Pq Event B1H , Umask 01H +Counts number of Uops executed that were issued on port 0. Port 0 handles +integer arithmetic, SIMD and FP add Uops. +.It Li UOPS_EXECUTED.PORT1 +.Pq Event B1H , Umask 02H +Counts number of Uops executed that were issued on port 1. Port 1 handles +integer arithmetic, SIMD, integer shift, FP multiply and FP divide Uops. +.It Li UOPS_EXECUTED.PORT2_CORE +.Pq Event B1H , Umask 04H +Counts number of Uops executed that were issued on port 2. Port 2 handles +the load Uops. This is a core count only and can not be collected per +thread. +.It Li UOPS_EXECUTED.PORT3_CORE +.Pq Event B1H , Umask 08H +Counts number of Uops executed that were issued on port 3. Port 3 handles +store Uops. This is a core count only and can not be collected per thread. +.It Li UOPS_EXECUTED.PORT4_CORE +.Pq Event B1H , Umask 10H +Counts number of Uops executed that were issued on port 4. Port 4 handles +the value to be stored for the store Uops issued on port 3. This is a core +count only and can not be collected per thread. +.It Li UOPS_EXECUTED.CORE_ACTIVE_CYCLES_NO_PORT5 +.Pq Event B1H , Umask 1FH +Counts cycles when the Uops executed were issued from any ports except port +5. Use Cmask=1 for active cycles; Cmask=0 for weighted cycles; use Cmask=1, +Invert=1 to count P0-4 stalled cycles; use Cmask=1, Edge=1, Invert=1 to count +P0-4 stalls. +.It Li UOPS_EXECUTED.PORT5 +.Pq Event B1H , Umask 20H +Counts number of Uops executed that were issued on port 5. +.It Li UOPS_EXECUTED.CORE_ACTIVE_CYCLES +.Pq Event B1H , Umask 3FH +Counts cycles when the Uops are executing. 
Use Cmask=1 for active cycles; +Cmask=0 for weighted cycles; use Cmask=1, Invert=1 to count P0-4 stalled +cycles; use Cmask=1, Edge=1, Invert=1 to count P0-4 stalls. +.It Li UOPS_EXECUTED.PORT015 +.Pq Event B1H , Umask 40H +Counts number of Uops executed that were issued on port 0, 1, or 5. +Use cmask=1, invert=1 to count stall cycles. +.It Li UOPS_EXECUTED.PORT234 +.Pq Event B1H , Umask 80H +Counts number of Uops executed that were issued on port 2, 3, or 4. +.It Li OFFCORE_REQUESTS_SQ_FULL +.Pq Event B2H , Umask 01H +Counts number of cycles the SQ is full to handle off-core requests. +.It Li OFF_CORE_RESPONSE_0 +.Pq Event B7H , Umask 01H +See Section 30.6.1.3, Off-core Response Performance Monitoring in the +Processor Core. +Requires programming MSR 01A6H. +.It Li SNOOP_RESPONSE.HIT +.Pq Event B8H , Umask 01H +Counts HIT snoop response sent by this thread in response to a snoop +request. +.It Li SNOOP_RESPONSE.HITE +.Pq Event B8H , Umask 02H +Counts HIT E snoop response sent by this thread in response to a snoop +request. +.It Li SNOOP_RESPONSE.HITM +.Pq Event B8H , Umask 04H +Counts HIT M snoop response sent by this thread in response to a snoop +request. +.It Li OFF_CORE_RESPONSE_1 +.Pq Event BBH , Umask 01H +See Section 30.6.1.3, Off-core Response Performance Monitoring in the +Processor Core. +Requires programming MSR 01A7H. +.It Li INST_RETIRED.ANY_P +.Pq Event C0H , Umask 01H +See Table A-1 +Notes: INST_RETIRED.ANY is counted by a designated fixed counter. +INST_RETIRED.ANY_P is counted by a programmable counter and is an +architectural performance event. Event is supported if CPUID.A.EBX[1] = 0. +Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not +count as retired instructions. +.It Li INST_RETIRED.X87 +.Pq Event C0H , Umask 02H +Counts the number of floating point computational operations retired: +floating point computational operations executed by the assist handler and +sub-operations of complex floating point instructions like transcendental +instructions. 
+.It Li INST_RETIRED.MMX +.Pq Event C0H , Umask 04H +Counts the number of MMX instructions retired. +.It Li UOPS_RETIRED.ANY +.Pq Event C2H , Umask 01H +Counts the number of micro-ops retired, (macro-fused=1, micro-fused=2, +others=1; maximum count of 8 per cycle). Most instructions are composed of +one or two micro-ops. Some instructions are decoded into longer sequences +such as repeat instructions, floating point transcendental instructions, and +assists. +Use cmask=1 and invert to count active cycles or stalled cycles. +.It Li UOPS_RETIRED.RETIRE_SLOTS +.Pq Event C2H , Umask 02H +Counts the number of retirement slots used each cycle. +.It Li UOPS_RETIRED.MACRO_FUSED +.Pq Event C2H , Umask 04H +Counts number of macro-fused uops retired. +.It Li MACHINE_CLEARS.CYCLES +.Pq Event C3H , Umask 01H +Counts the cycles machine clear is asserted. +.It Li MACHINE_CLEARS.MEM_ORDER +.Pq Event C3H , Umask 02H +Counts the number of machine clears due to memory order conflicts. +.It Li MACHINE_CLEARS.SMC +.Pq Event C3H , Umask 04H +Counts the number of times that a program writes to a code section. +Self-modifying code causes a severe penalty in all Intel 64 and IA-32 +processors. The modified cache line is written back to the L2 and L3 caches. +.It Li BR_INST_RETIRED.ALL_BRANCHES +.Pq Event C4H , Umask 00H +See Table A-1 +.It Li BR_INST_RETIRED.CONDITIONAL +.Pq Event C4H , Umask 01H +Counts the number of conditional branch instructions retired. 
+.It Li BR_INST_RETIRED.NEAR_CALL +.Pq Event C4H , Umask 02H +Counts the number of direct & indirect near unconditional calls retired. +.It Li BR_INST_RETIRED.ALL_BRANCHES +.Pq Event C4H , Umask 04H +Counts the number of branch instructions retired. +.It Li BR_MISP_RETIRED.ALL_BRANCHES +.Pq Event C5H , Umask 00H +See Table A-1 +.It Li BR_MISP_RETIRED.NEAR_CALL +.Pq Event C5H , Umask 02H +Counts mispredicted direct & indirect near unconditional retired calls. +.It Li SSEX_UOPS_RETIRED.PACKED_SINGLE +.Pq Event C7H , Umask 01H +Counts SIMD packed single-precision floating point Uops retired. +.It Li SSEX_UOPS_RETIRED.SCALAR_SINGLE +.Pq Event C7H , Umask 02H +Counts SIMD scalar single-precision floating point Uops retired. +.It Li SSEX_UOPS_RETIRED.PACKED_DOUBLE +.Pq Event C7H , Umask 04H +Counts SIMD packed double-precision floating point Uops retired. +.It Li SSEX_UOPS_RETIRED.SCALAR_DOUBLE +.Pq Event C7H , Umask 08H +Counts SIMD scalar double-precision floating point Uops retired. +.It Li SSEX_UOPS_RETIRED.VECTOR_INTEGER +.Pq Event C7H , Umask 10H +Counts 128-bit SIMD vector integer Uops retired. +.It Li ITLB_MISS_RETIRED +.Pq Event C8H , Umask 20H +Counts the number of retired instructions that missed the ITLB when the +instruction was fetched. +.It Li MEM_LOAD_RETIRED.L1D_HIT +.Pq Event CBH , Umask 01H +Counts number of retired loads that hit the L1 data cache. +.It Li MEM_LOAD_RETIRED.L2_HIT +.Pq Event CBH , Umask 02H +Counts number of retired loads that hit the L2 data cache. +.It Li MEM_LOAD_RETIRED.L3_UNSHARED_HIT +.Pq Event CBH , Umask 04H +Counts number of retired loads that hit their own, unshared lines in the L3 +cache. +.It Li MEM_LOAD_RETIRED.OTHER_CORE_L2_HIT_HITM +.Pq Event CBH , Umask 08H +Counts number of retired loads that hit in a sibling core's L2 (on die +core). Since the L3 is inclusive of all cores on the package, this is an L3 +hit. This counts both clean and modified hits. 
+.It Li MEM_LOAD_RETIRED.L3_MISS +.Pq Event CBH , Umask 10H +Counts number of retired loads that miss the L3 cache. The load was +satisfied by a remote socket, local memory or an IOH. +.It Li MEM_LOAD_RETIRED.HIT_LFB +.Pq Event CBH , Umask 40H +Counts number of retired loads that miss the L1D and the address is located +in an allocated line fill buffer and will soon be committed to cache. This +is counting secondary L1D misses. +.It Li MEM_LOAD_RETIRED.DTLB_MISS +.Pq Event CBH , Umask 80H +Counts the number of retired loads that missed the DTLB. The DTLB miss is +not counted if the load operation causes a fault. This event counts loads +from cacheable memory only. The event does not count loads by software +prefetches. Counts both primary and secondary misses to the TLB. +.It Li FP_MMX_TRANS.TO_FP +.Pq Event CCH , Umask 01H +Counts the first floating-point instruction following any MMX instruction. +You can use this event to estimate the penalties for the transitions between +floating-point and MMX technology states. +.It Li FP_MMX_TRANS.TO_MMX +.Pq Event CCH , Umask 02H +Counts the first MMX instruction following a floating-point instruction. You +can use this event to estimate the penalties for the transitions between +floating-point and MMX technology states. +.It Li FP_MMX_TRANS.ANY +.Pq Event CCH , Umask 03H +Counts all transitions from floating point to MMX instructions and from MMX +instructions to floating point instructions. You can use this event to +estimate the penalties for the transitions between floating-point and MMX +technology states. +.It Li MACRO_INSTS.DECODED +.Pq Event D0H , Umask 01H +Counts the number of instructions decoded, (but not necessarily executed or +retired). +.It Li UOPS_DECODED.MS +.Pq Event D1H , Umask 02H +Counts the number of Uops decoded by the Microcode Sequencer, MS. The MS +delivers uops when the instruction is more than 4 uops long or a microcode +assist is occurring. 
+.It Li UOPS_DECODED.ESP_FOLDING +.Pq Event D1H , Umask 04H +Counts number of stack pointer (ESP) instructions decoded: push, pop, call, +ret, etc. ESP instructions do not generate a Uop to increment or decrement +ESP. Instead, they update an ESP_Offset register that keeps track of the +delta to the current value of the ESP register. +.It Li UOPS_DECODED.ESP_SYNC +.Pq Event D1H , Umask 08H +Counts number of stack pointer (ESP) sync operations where an ESP +instruction is corrected by adding the ESP offset register to the current +value of the ESP register. +.It Li RAT_STALLS.FLAGS +.Pq Event D2H , Umask 01H +Counts the number of cycles during which execution stalled due to several +reasons, one of which is a partial flag register stall. A partial register +stall may occur when two conditions are met: 1) an instruction modifies +some, but not all, of the flags in the flag register and 2) the next +instruction, which depends on flags, depends on flags that were not modified +by this instruction. +.It Li RAT_STALLS.REGISTERS +.Pq Event D2H , Umask 02H +This event counts the number of cycles instruction execution latency became +longer than the defined latency because the instruction used a register that +was partially written by previous instruction. +.It Li RAT_STALLS.ROB_READ_PORT +.Pq Event D2H , Umask 04H +Counts the number of cycles when ROB read port stalls occurred, which did +not allow new micro-ops to enter the out-of-order pipeline. Note that, at +this stage in the pipeline, additional stalls may occur at the same cycle +and prevent the stalled micro-ops from entering the pipe. In such a case, +micro-ops retry entering the execution pipe in the next cycle and the +ROB-read port stall is counted again. +.It Li RAT_STALLS.SCOREBOARD +.Pq Event D2H , Umask 08H +Counts the cycles where we stall due to microarchitecturally required +serialization. Microcode scoreboarding stalls. 
+.It Li RAT_STALLS.ANY +.Pq Event D2H , Umask 0FH +Counts all Register Allocation Table stall cycles due to: cycles when ROB +read port stalls occurred, which did not allow new micro-ops to enter the +execution pipe; cycles when partial register stalls occurred; cycles when +flag stalls occurred; and cycles when floating-point unit (FPU) status word stalls +occurred. To count each of these conditions separately use the events: +RAT_STALLS.ROB_READ_PORT, RAT_STALLS.PARTIAL, RAT_STALLS.FLAGS, and +RAT_STALLS.FPSW. +.It Li SEG_RENAME_STALLS +.Pq Event D4H , Umask 01H +Counts the number of stall cycles due to the lack of renaming resources for +the ES, DS, FS, and GS segment registers. If a segment is renamed but not +retired and a second update to the same segment occurs, a stall occurs in +the front-end of the pipeline until the renamed segment retires. +.It Li ES_REG_RENAMES +.Pq Event D5H , Umask 01H +Counts the number of times the ES segment register is renamed. +.It Li UOP_UNFUSION +.Pq Event DBH , Umask 01H +Counts unfusion events due to floating point exception to a fused uop. +.It Li BR_INST_DECODED +.Pq Event E0H , Umask 01H +Counts the number of branch instructions decoded. +.It Li BPU_MISSED_CALL_RET +.Pq Event E5H , Umask 01H +Counts number of times the Branch Prediction Unit missed predicting a call +or return branch. +.It Li BACLEAR.CLEAR +.Pq Event E6H , Umask 01H +Counts the number of times the front end is resteered, mainly when the +Branch Prediction Unit cannot provide a correct prediction and this is +corrected by the Branch Address Calculator at the front end. This can occur +if the code has many branches such that they cannot be consumed by the BPU. +Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble +in the instruction fetch pipeline. The effect on total execution time +depends on the surrounding code. 
+.It Li BACLEAR.BAD_TARGET +.Pq Event E6H , Umask 02H +Counts number of Branch Address Calculator clears (BACLEAR) asserted due to +conditional branch instructions in which there was a target hit but the +direction was wrong. Each BACLEAR asserted by the BAC generates +approximately an 8 cycle bubble in the instruction fetch pipeline. +.It Li BPU_CLEARS.EARLY +.Pq Event E8H , Umask 01H +Counts early (normal) Branch Prediction Unit clears: BPU predicted a taken +branch after incorrectly assuming that it was not taken. +The BPU clear leads to a 2 cycle bubble in the Front End. +.It Li BPU_CLEARS.LATE +.Pq Event E8H , Umask 02H +Counts late Branch Prediction Unit clears due to Most Recently Used +conflicts. The BPU clear leads to a 3 cycle bubble in the Front End. +.It Li L2_TRANSACTIONS.LOAD +.Pq Event F0H , Umask 01H +Counts L2 load operations due to HW prefetch or demand loads. +.It Li L2_TRANSACTIONS.RFO +.Pq Event F0H , Umask 02H +Counts L2 RFO operations due to HW prefetch or demand RFOs. +.It Li L2_TRANSACTIONS.IFETCH +.Pq Event F0H , Umask 04H +Counts L2 instruction fetch operations due to HW prefetch or demand ifetch. +.It Li L2_TRANSACTIONS.PREFETCH +.Pq Event F0H , Umask 08H +Counts L2 prefetch operations. +.It Li L2_TRANSACTIONS.L1D_WB +.Pq Event F0H , Umask 10H +Counts L1D writeback operations to the L2. +.It Li L2_TRANSACTIONS.FILL +.Pq Event F0H , Umask 20H +Counts L2 cache line fill operations due to load, RFO, L1D writeback or +prefetch. +.It Li L2_TRANSACTIONS.WB +.Pq Event F0H , Umask 40H +Counts L2 writeback operations to the L3. +.It Li L2_TRANSACTIONS.ANY +.Pq Event F0H , Umask 80H +Counts all L2 cache operations. +.It Li L2_LINES_IN.S_STATE +.Pq Event F1H , Umask 02H +Counts the number of cache lines allocated in the L2 cache in the S (shared) +state. +.It Li L2_LINES_IN.E_STATE +.Pq Event F1H , Umask 04H +Counts the number of cache lines allocated in the L2 cache in the E +(exclusive) state. 
+.It Li L2_LINES_IN.ANY +.Pq Event F1H , Umask 07H +Counts the number of cache lines allocated in the L2 cache. +.It Li L2_LINES_OUT.DEMAND_CLEAN +.Pq Event F2H , Umask 01H +Counts L2 clean cache lines evicted by a demand request. +.It Li L2_LINES_OUT.DEMAND_DIRTY +.Pq Event F2H , Umask 02H +Counts L2 dirty (modified) cache lines evicted by a demand request. +.It Li L2_LINES_OUT.PREFETCH_CLEAN +.Pq Event F2H , Umask 04H +Counts L2 clean cache lines evicted by a prefetch request. +.It Li L2_LINES_OUT.PREFETCH_DIRTY +.Pq Event F2H , Umask 08H +Counts L2 modified cache lines evicted by a prefetch request. +.It Li L2_LINES_OUT.ANY +.Pq Event F2H , Umask 0FH +Counts all L2 cache lines evicted for any reason. +.It Li SQ_MISC.SPLIT_LOCK +.Pq Event F4H , Umask 10H +Counts the number of SQ lock splits across a cache line. +.It Li SQ_FULL_STALL_CYCLES +.Pq Event F6H , Umask 01H +Counts cycles the Super Queue is full. Neither of the threads on this core +will be able to access the uncore. +.It Li FP_ASSIST.ALL +.Pq Event F7H , Umask 01H +Counts the number of floating point operations executed that required +micro-code assist intervention. Assists are required in the following cases: +SSE instructions, (Denormal input when the DAZ flag is off or Underflow +result when the FTZ flag is off): x87 instructions, (NaN or denormal are +loaded to a register or used as input from memory, Division by 0 or +Underflow output). +.It Li FP_ASSIST.OUTPUT +.Pq Event F7H , Umask 02H +Counts number of floating point micro-code assist when the output value +(destination register) is invalid. +.It Li FP_ASSIST.INPUT +.Pq Event F7H , Umask 04H +Counts number of floating point micro-code assist when the input value (one +of the source operands to an FP instruction) is invalid. +.It Li SIMD_INT_64.PACKED_MPY +.Pq Event FDH , Umask 01H +Counts number of SIMD integer 64 bit packed multiply operations. 
+.It Li SIMD_INT_64.PACKED_SHIFT +.Pq Event FDH , Umask 02H +Counts number of SIMD integer 64 bit packed shift operations. +.It Li SIMD_INT_64.PACK +.Pq Event FDH , Umask 04H +Counts number of SIMD integer 64 bit pack operations. +.It Li SIMD_INT_64.UNPACK +.Pq Event FDH , Umask 08H +Counts number of SIMD integer 64 bit unpack operations. +.It Li SIMD_INT_64.PACKED_LOGICAL +.Pq Event FDH , Umask 10H +Counts number of SIMD integer 64 bit logical operations. +.It Li SIMD_INT_64.PACKED_ARITH +.Pq Event FDH , Umask 20H +Counts number of SIMD integer 64 bit arithmetic operations. +.It Li SIMD_INT_64.SHUFFLE_MOVE +.Pq Event FDH , Umask 40H +Counts number of SIMD integer 64 bit shift or move operations. +.El +.Ss Event Specifiers (Programmable PMCs) +Core i7 and Xeon 5500 programmable PMCs support the following events as of the +June 2009 document (removed in December 2009): +.Bl -tag -width indent +.It Li SB_FORWARD.ANY +.Pq Event 02H , Umask 01H +Counts the number of store forwards. +.It Li LOAD_BLOCK.STD +.Pq Event 03H , Umask 01H +Counts the number of loads blocked by a preceding store with unknown data. +.It Li LOAD_BLOCK.ADDRESS_OFFSET +.Pq Event 03H , Umask 04H +Counts the number of loads blocked by a preceding store address. +.It Li LOAD_BLOCK.ADDRESS_OFFSET +.Pq Event 01H , Umask 04H +Counts the cycles of store buffer drains. +.It Li MISALIGN_MEM_REF.LOAD +.Pq Event 05H , Umask 01H +Counts the number of misaligned load references. +.It Li MISALIGN_MEM_REF.STORE +.Pq Event 05H , Umask 02H +Counts the number of misaligned store references. +.It Li MISALIGN_MEM_REF.ANY +.Pq Event 05H , Umask 03H +Counts the number of misaligned memory references. +.It Li STORE_BLOCKS.NOT_STA +.Pq Event 06H , Umask 01H +This event counts the number of load operations delayed by preceding +stores whose addresses are known but whose data is unknown, and preceding +stores that conflict with the load but which incompletely overlap the load. 
+.It Li STORE_BLOCKS.STA +.Pq Event 06H , Umask 02H +This event counts load operations delayed by preceding stores whose +addresses are unknown (STA block). +.It Li STORE_BLOCKS.ANY +.Pq Event 06H , Umask 0FH +All loads delayed due to store blocks. +.It Li MEMORY_DISAMBIGURATION.RESET +.Pq Event 09H , Umask 01H +Counts memory disambiguation reset cycles. +.It Li MEMORY_DISAMBIGURATION.SUCCESS +.Pq Event 09H , Umask 02H +Counts the number of loads for which memory disambiguation succeeded. +.It Li MEMORY_DISAMBIGURATION.WATCHDOG +.Pq Event 09H , Umask 04H +Counts the number of times the memory disambiguation watchdog kicked in. +.It Li MEMORY_DISAMBIGURATION.WATCH_CYCLES +.Pq Event 09H , Umask 08H +Counts the cycles that the memory disambiguation watchdog is active. +Set invert=1, cmask=1. +.It Li HW_INT.RCV +.Pq Event 1DH , Umask 01H +Number of interrupts received. +.It Li HW_INT.CYCLES_MASKED +.Pq Event 1DH , Umask 02H +Number of cycles interrupts are masked. +.It Li HW_INT.CYCLES_PENDING_AND_MASKED +.Pq Event 1DH , Umask 04H +Number of cycles interrupts are pending and masked. +.It Li HW_INT.CYCLES_PENDING_AND_MASKED +.Pq Event 04H , Umask 04H +Counts number of L2 store RFO requests where the cache line to be loaded is +in the E (exclusive) state. The L1D prefetcher does not issue a RFO +prefetch. +This is a demand RFO request. +.It Li HW_INT.CYCLES_PENDING_AND_MASKED +.Pq Event 27H , Umask 04H +LONGEST_LAT_CACHE.MISS +.It Li UOPS_DECODED.DEC0 +.Pq Event 3DH , Umask 01H +Counts micro-ops decoded by decoder 0. +.It Li UOPS_DECODED.DEC0 +.Pq Event 01H , Umask 01H +Counts L1 data cache store RFO requests where the cache line to be loaded is +in the I state. +Counter 0, 1 only +.It Li L1D_CACHE_ST.MESI +.Pq Event 41H , Umask 0FH +Counts L1 data cache store RFO requests. +Counter 0, 1 only +.It Li DTLB_MISSES.PDE_MISS +.Pq Event 49H , Umask 20H +Number of DTLB cache misses where the low part of the linear to physical +address translation was missed. 
+.It Li DTLB_MISSES.PDP_MISS +.Pq Event 49H , Umask 40H +Number of DTLB misses where the high part of the linear to physical address +translation was missed. +.It Li DTLB_MISSES.LARGE_WALK_COMPLETED +.Pq Event 49H , Umask 80H +Counts number of completed large page walks due to misses in the STLB. +.It Li SSE_MEM_EXEC.NTA +.Pq Event 4BH , Umask 01H +Counts number of SSE NTA prefetch/weakly-ordered instructions which missed +the L1 data cache. +.It Li SSE_MEM_EXEC.STREAMING_STORES +.Pq Event 4BH , Umask 08H +Counts number of SSE non temporal stores. +.It Li SFENCE_CYCLES +.Pq Event 4DH , Umask 01H +Counts store fence cycles. +.It Li EPT.EPDE_MISS +.Pq Event 4FH , Umask 02H +Counts Extended Page Directory Entry misses. The Extended Page Directory +cache is used by Virtual Machine operating systems while the guest operating +systems use the standard TLB caches. +.It Li EPT.EPDPE_HIT +.Pq Event 4FH , Umask 04H +Counts Extended Page Directory Pointer Entry hits. +.It Li EPT.EPDPE_MISS +.Pq Event 4FH , Umask 08H +Counts Extended Page Directory Pointer Entry misses. +.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_DATA +.Pq Event 60H , Umask 01H +Counts weighted cycles of offcore demand data read requests. Does not +include L2 prefetch requests. +counter 0 +.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND.READ_CODE +.Pq Event 60H , Umask 02H +Counts weighted cycles of offcore demand code read requests. Does not +include L2 prefetch requests. +counter 0 +.It Li OFFCORE_REQUESTS_OUTSTANDING.DEMAND.RFO +.Pq Event 60H , Umask 04H +Counts weighted cycles of offcore demand RFO requests. Does not include L2 +prefetch requests. +counter 0 +.It Li OFFCORE_REQUESTS_OUTSTANDING.ANY.READ +.Pq Event 60H , Umask 08H +Counts weighted cycles of offcore read requests of any kind. Includes L2 +prefetch requests. +counter 0 +.It Li IFU_IVC.FULL +.Pq Event 81H , Umask 01H +Instruction Fetch Unit victim cache full. 
+.It Li IFU_IVC.L1I_EVICTION +.Pq Event 81H , Umask 02H +L1 Instruction cache evictions. +.It Li L1I_OPPORTUNISTIC_HITS +.Pq Event 83H , Umask 01H +Opportunistic hits in streaming. +.It Li ITLB_MISSES.WALK_CYCLES +.Pq Event 85H , Umask 04H +Counts ITLB miss page walk cycles. +.It Li ITLB_MISSES.PMH_BUSY_CYCLES +.Pq Event 85H , Umask 04H +Counts PMH busy cycles. +.It Li ITLB_MISSES.STLB_HIT +.Pq Event 85H , Umask 10H +Counts the number of ITLB misses that hit in the second level TLB. +.It Li ITLB_MISSES.PDE_MISS +.Pq Event 85H , Umask 20H +Number of ITLB misses where the low part of the linear to physical address +translation was missed. +.It Li ITLB_MISSES.PDP_MISS +.Pq Event 85H , Umask 40H +Number of ITLB misses where the high part of the linear to physical address +translation was missed. +.It Li ITLB_MISSES.LARGE_WALK_COMPLETED +.Pq Event 85H , Umask 80H +Counts number of completed large page walks due to misses in the STLB. +.It Li OFFCORE_REQUESTS.DEMAND.READ_DATA +.Pq Event B0H , Umask 01H +Counts number of offcore demand data read requests. Does not count L2 +prefetch requests. +.It Li OFFCORE_REQUESTS.DEMAND.READ_CODE +.Pq Event B0H , Umask 02H +Counts number of offcore demand code read requests. Does not count L2 +prefetch requests. +.It Li OFFCORE_REQUESTS.DEMAND.RFO +.Pq Event B0H , Umask 04H +Counts number of offcore demand RFO requests. Does not count L2 prefetch +requests. +.It Li OFFCORE_REQUESTS.ANY.READ +.Pq Event B0H , Umask 08H +Counts number of offcore read requests. Includes L2 prefetch requests. +.It Li OFFCORE_REQUESTS.ANY.RFO +.Pq Event B0H , Umask 10H +Counts number of offcore RFO requests. Includes L2 prefetch requests. +.It Li OFFCORE_REQUESTS.UNCACHED_MEM +.Pq Event B0H , Umask 20H +Counts number of offcore uncached memory requests. +.It Li OFFCORE_REQUESTS.ANY +.Pq Event B0H , Umask 80H +Counts all offcore requests. +.It Li SNOOPQ_REQUESTS_OUTSTANDING.DATA +.Pq Event B3H , Umask 01H +Counts weighted cycles of snoopq requests for data. 
Counter 0 only. +Use cmask=1 to count cycles not empty. +.It Li SNOOPQ_REQUESTS_OUTSTANDING.INVALIDATE +.Pq Event B3H , Umask 02H +Counts weighted cycles of snoopq invalidate requests. Counter 0 only. +Use cmask=1 to count cycles not empty. +.It Li SNOOPQ_REQUESTS_OUTSTANDING.CODE +.Pq Event B3H , Umask 04H +Counts weighted cycles of snoopq requests for code. Counter 0 only. +Use cmask=1 to count cycles not empty. +.It Li PIC_ACCESSES.TPR_READS +.Pq Event BAH , Umask 01H +Counts number of TPR reads. +.It Li PIC_ACCESSES.TPR_WRITES +.Pq Event BAH , Umask 02H +Counts number of TPR writes. +.It Li MACHINE_CLEARS.FUSION_ASSIST +.Pq Event C3H , Umask 10H +Counts the number of macro-fusion assists. +.It Li BOGUS_BR +.Pq Event E4H , Umask 01H +Counts the number of bogus branches. +.It Li L2_HW_PREFETCH.HIT +.Pq Event F3H , Umask 01H +Count L2 HW prefetcher detector hits. +.It Li L2_HW_PREFETCH.ALLOC +.Pq Event F3H , Umask 02H +Count L2 HW prefetcher allocations. +.It Li L2_HW_PREFETCH.DATA_TRIGGER +.Pq Event F3H , Umask 04H +Count L2 HW data prefetcher triggered. +.It Li L2_HW_PREFETCH.CODE_TRIGGER +.Pq Event F3H , Umask 08H +Count L2 HW code prefetcher triggered. +.It Li L2_HW_PREFETCH.DCA_TRIGGER +.Pq Event F3H , Umask 10H +Count L2 HW DCA prefetcher triggered. +.It Li L2_HW_PREFETCH.KICK_START +.Pq Event F3H , Umask 20H +Count L2 HW prefetcher kick started. +.It Li SQ_MISC.PROMOTION +.Pq Event F4H , Umask 01H +Counts the number of L2 secondary misses that hit the Super Queue. +.It Li SQ_MISC.PROMOTION_POST_GO +.Pq Event F4H , Umask 02H +Counts the number of L2 secondary misses during the Super Queue filling L2. +.It Li SQ_MISC.LRU_HINTS +.Pq Event F4H , Umask 04H +Counts number of Super Queue LRU hints sent to L3. +.It Li SQ_MISC.FILL_DROPPED +.Pq Event F4H , Umask 08H +Counts the number of SQ L2 fills dropped due to L2 busy. 
+.It Li SEGMENT_REG_LOADS +.Pq Event F8H , Umask 01H +Counts number of segment register loads. +.El +.Sh SEE ALSO +.Xr pmc 3 , +.Xr pmc.atom 3 , +.Xr pmc.core 3 , +.Xr pmc.corei7uc 3 , +.Xr pmc.iaf 3 , +.Xr pmc.k7 3 , +.Xr pmc.k8 3 , +.Xr pmc.p4 3 , +.Xr pmc.p5 3 , +.Xr pmc.p6 3 , +.Xr pmc.soft 3 , +.Xr pmc.tsc 3 , +.Xr pmc.ucf 3 , +.Xr pmc.westmere 3 , +.Xr pmc.westmereuc 3 , +.Xr pmc_cpuinfo 3 , +.Xr pmclog 3 , +.Xr hwpmc 4 +.Sh HISTORY +The +.Nm pmc +library first appeared in +.Fx 6.0 . +.Sh AUTHORS +The +.Lb libpmc +library was written by +.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .
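Reviewer note: several entries in this page pair an Event/Umask code with hints such as "Use cmask=1 and invert to count cycles". As a rough illustration only, the sketch below shows how those fields combine into one event-select value, assuming the standard Intel IA32_PERFEVTSELx bit layout. The function name `perfevtsel` and its default arguments are hypothetical and are not part of libpmc; libpmc users would normally pass an event name plus qualifiers instead of a raw value.

```python
def perfevtsel(event, umask, cmask=0, invert=False, edge=False,
               usr=True, os=True, enable=True):
    """Pack event fields into an IA32_PERFEVTSELx-style value (illustrative helper)."""
    for name, value in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits, got {value:#x}")

    value = event                # bits 0-7: event select
    value |= umask << 8          # bits 8-15: unit mask
    if usr:
        value |= 1 << 16         # count while in user mode
    if os:
        value |= 1 << 17         # count while in kernel mode
    if edge:
        value |= 1 << 18         # count rising edges instead of cycles
    if enable:
        value |= 1 << 22         # enable the counter
    if invert:
        value |= 1 << 23         # invert the cmask comparison
    value |= cmask << 24         # bits 24-31: counter mask threshold
    return value


# UOPS_EXECUTED.CORE_ACTIVE_CYCLES (Event B1H, Umask 3FH) with cmask=1 and
# invert=1, which the page describes as counting stalled cycles.
print(hex(perfevtsel(0xB1, 0x3F, cmask=1, invert=True)))  # 0x1c33fb1
```

With `cmask=1` alone the counter increments on every cycle in which at least one matching event occurs; adding `invert` flips the comparison so cycles with no matching event are counted, and adding `edge` counts transitions into that condition rather than cycles.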