pmc - library for accessing hardware performance monitoring counters
Performance Counters Library (libpmc, -lpmc)
Intel Pentium PMCs are present in Intel Pentium and Pentium MMX
processors. These PMCs are documented in
Volume 3B: System Programming Guide, Part 2,
Intel(R) 64 and IA-32 Architectures Software Developer's Manual,
Order Number 253669-024US, Intel Corporation, August 2007.
These CPUs contain two PMCs, each 40 bits wide. These PMCs support
the following capabilities:
Capability | Support
PMC_CAP_CASCADE | No
PMC_CAP_EDGE | No
PMC_CAP_INTERRUPT | No
PMC_CAP_INVERT | No
PMC_CAP_READ | Yes
PMC_CAP_PRECISE | No
PMC_CAP_SYSTEM | Yes
PMC_CAP_TAGGING | No
PMC_CAP_THRESHOLD | No
PMC_CAP_USER | Yes
PMC_CAP_WRITE | Yes
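Because each counter is 40 bits wide and PMC_CAP_INTERRUPT is not supported, software that polls a counter may need to account for wrap-around itself. The following is a minimal sketch of that arithmetic, assuming the caller has two raw 40-bit readings and that the counter wrapped at most once between them. The names P5_PMC_WIDTH, P5_PMC_MASK and p5_pmc_delta() are hypothetical and are not part of libpmc.

```c
#include <stdint.h>

/* Counter width stated above; these macro names are hypothetical. */
#define P5_PMC_WIDTH 40
#define P5_PMC_MASK  ((UINT64_C(1) << P5_PMC_WIDTH) - 1)

/*
 * Hypothetical helper, not a libpmc function: number of events between
 * two raw readings, assuming at most one wrap between prev and cur.
 * Unsigned subtraction followed by masking handles the wrapped case.
 */
static uint64_t
p5_pmc_delta(uint64_t prev, uint64_t cur)
{
    return ((cur - prev) & P5_PMC_MASK);
}
```

For example, a reading of 2^40 - 6 followed by a reading of 10 yields a delta of 16.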
Event specifiers for Intel Pentium PMCs can have the following
common qualifiers:
duration
- Count duration (in clocks) of events. The default is to count events.
os
- Measure events at privilege levels 0, 1 and 2.
overflow
- Assert the external processor pin associated with a counter on counter
overflow.
usr
- Measure events at privilege level 3.
If neither the “os” nor the “usr” qualifier is specified, the
default is to enable both.
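As an illustration of how an event name and the qualifiers above combine, the sketch below formats a specifier string. It assumes the comma-separated form "event,qualifier,..." for strings passed to pmc_allocate(3). The flag macros and p5_format_spec() are hypothetical names used only for this example and are not part of libpmc.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical qualifier flags, one per qualifier listed above. */
#define P5Q_DURATION 0x1
#define P5Q_OS       0x2
#define P5Q_OVERFLOW 0x4
#define P5Q_USR      0x8

/*
 * Hypothetical helper: write "event[,qualifier...]" into buf.
 * Returns 0 on success and -1 if the result does not fit in len bytes.
 */
static int
p5_format_spec(char *buf, size_t len, const char *event, unsigned quals)
{
    static const struct { unsigned flag; const char *name; } tab[] = {
        { P5Q_DURATION, "duration" },
        { P5Q_OS,       "os" },
        { P5Q_OVERFLOW, "overflow" },
        { P5Q_USR,      "usr" },
    };
    size_t i, used;
    int n;

    n = snprintf(buf, len, "%s", event);
    if (n < 0 || (size_t)n >= len)
        return (-1);
    used = (size_t)n;

    /* Append each requested qualifier, separated by commas. */
    for (i = 0; i < sizeof(tab) / sizeof(tab[0]); i++) {
        if ((quals & tab[i].flag) == 0)
            continue;
        n = snprintf(buf + used, len - used, ",%s", tab[i].name);
        if (n < 0 || (size_t)n >= len - used)
            return (-1);
        used += (size_t)n;
    }
    return (0);
}
```

Called with the event "p5-data-read" and P5Q_USR, the helper produces "p5-data-read,usr", which would restrict counting to privilege level 3.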
Some events may only be used on specific counters and some events
are defined only on processors supporting the MMX instruction set. Note that
these PMCs do not have the ability to interrupt the CPU.
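The counter restrictions matter when two events are to be measured at once, since there are only two PMCs. The sketch below records the allowed counters for a small excerpt of the events documented on this page and checks whether a pair can be placed on distinct counters. The structure, macros and function names are hypothetical; libpmc performs its own allocation and does not expose this table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counter masks: bit 0 is counter 0, bit 1 is counter 1. */
#define P5_CTR0 0x1
#define P5_CTR1 0x2
#define P5_ANY  (P5_CTR0 | P5_CTR1)

struct p5_event {
    const char *name;
    unsigned    code;   /* hardware event number */
    unsigned    ctrs;   /* mask of counters the event may use */
};

/* Excerpt only; codes and restrictions are copied from this page. */
static const struct p5_event p5_events[] = {
    { "p5-data-read",                  0x00, P5_ANY  },
    { "p5-instructions-executed",      0x16, P5_ANY  },
    { "p5-bus-ownership-latency",      0x2A, P5_CTR0 },
    { "p5-bus-ownership-transfers",    0x2A, P5_CTR1 },
    { "p5-mmx-instruction-data-reads", 0x31, P5_CTR0 },
    { "p5-returns",                    0x39, P5_CTR0 },
};

static const struct p5_event *
p5_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(p5_events) / sizeof(p5_events[0]); i++)
        if (strcmp(p5_events[i].name, name) == 0)
            return (&p5_events[i]);
    return (NULL);
}

/* Return 1 if events a and b can occupy the two PMCs simultaneously. */
static int
p5_can_pair(const char *a, const char *b)
{
    const struct p5_event *ea = p5_lookup(a);
    const struct p5_event *eb = p5_lookup(b);

    if (ea == NULL || eb == NULL)
        return (0);
    return (((ea->ctrs & P5_CTR0) && (eb->ctrs & P5_CTR1)) ||
        ((ea->ctrs & P5_CTR1) && (eb->ctrs & P5_CTR0)));
}
```

With this table, "p5-bus-ownership-latency" and "p5-bus-ownership-transfers" can be paired because they use different counters, while "p5-bus-ownership-latency" and "p5-returns" cannot because both require counter 0.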
The event specifiers supported by Intel Pentium PMCs are:
p5-any-segment-register-loaded
- (Event 0FH) The number of writes to any segment register, including the
LDTR, GDTR, TR and IDTR. Far control transfers and task switches that
involve privilege level changes will count this event twice.
p5-bank-conflicts
- (Event 0AH) The number of actual bank conflicts.
p5-branches
- (Event 12H) The number of taken and not taken branches, including
conditional branches, jumps, calls, software interrupts and interrupt
returns.
p5-breakpoint-match-on-dr0-register
- (Event 23H) The number of matches on the DR0 breakpoint register.
p5-breakpoint-match-on-dr1-register
- (Event 24H) The number of matches on the DR1 breakpoint register.
p5-breakpoint-match-on-dr2-register
- (Event 25H) The number of matches on the DR2 breakpoint register.
p5-breakpoint-match-on-dr3-register
- (Event 26H) The number of matches on the DR3 breakpoint register.
p5-btb-false-entries
- (Event 3AH, Pentium MMX) The number of false entries in the BTB. This
event is only allocated on counter 0.
p5-btb-hits
- (Event 13H) The number of branches executed that hit in the branch target
buffer.
p5-btb-miss-prediction-on-not-taken-branch
- (Event 3AH, Pentium MMX) The number of times the BTB predicted a not-taken
branch as taken. This event is only allocated on counter 1.
p5-bus-cycle-duration
- (Event 18H) The number of cycles while a bus cycle was in progress.
p5-bus-ownership-latency
- (Event 2AH, Pentium MMX) The time from bus ownership being requested to
ownership being granted. This event is only allocated on counter 0.
p5-bus-ownership-transfers
- (Event 2AH, Pentium MMX) The number of bus ownership transfers. This event
is only allocated on counter 1.
p5-bus-utilization-due-to-processor-activity
- (Event 2EH, Pentium MMX) The number of clocks the bus is busy due to the
processor's own activity. This event is only allocated on counter 0.
p5-cache-line-sharing
- (Event 2CH, Pentium MMX) The number of shared data lines in L1 cache. This
event is only allocated on counter 1.
p5-cache-m-state-line-sharing
- (Event 2CH, Pentium MMX) The number of hits to an M-state line due to a
memory access by another processor. This event is only allocated on
counter 0.
p5-code-cache-miss
- (Event 0EH) The number of instruction reads that miss the internal code
cache. Both cacheable and un-cacheable misses are counted.
p5-code-read
- (Event 0CH) The number of instruction reads to both cacheable and
un-cacheable regions.
p5-code-tlb-miss
- (Event 0DH) The number of instruction reads that miss the instruction TLB.
Both cacheable and un-cacheable reads are counted.
p5-d1-starvation-and-fifo-is-empty
- (Event 33H, Pentium MMX) The number of times the D1 stage cannot issue any
instructions because the FIFO was empty. This event is only allocated on
counter 0.
p5-d1-starvation-and-only-one-instruction-in-fifo
- (Event 33H, Pentium MMX) The number of times the D1 stage could issue only
one instruction because the FIFO had one instruction ready. This event is
only allocated on counter 1.
p5-data-cache-lines-written-back
- (Event 06H) The number of data cache lines that are written back,
including those caused by internal and external snoops.
p5-data-cache-tlb-miss-stall-duration
- (Event 30H, Pentium MMX) The number of clocks the pipeline is stalled due
to a data cache TLB miss. This event is only allocated on counter 1.
p5-data-read
- (Event 00H) The number of memory data reads, counting internal data cache
hits and misses. I/O and data memory accesses due to TLB miss processing
are not included. Split cycle reads are counted individually.
p5-data-read-miss
- (Event 03H) The number of memory read accesses that miss the data cache,
counting both cacheable and un-cacheable accesses. Data accesses that are
part of TLB miss processing are not included. I/O accesses are not
included.
p5-data-read-miss-or-write-miss
- (Event 29H) The number of data reads and writes that miss the internal
data cache, counting un-cacheable accesses. Data accesses due to TLB miss
processing are not counted.
p5-data-read-or-write
- (Event 28H) The number of data reads and writes including internal data
cache hits and misses. Data reads due to TLB miss processing are not
counted.
p5-data-tlb-miss
- (Event 02H) The number of misses to the data cache translation lookaside
buffer.
p5-data-write
- (Event 01H) The number of memory data writes, counting internal data cache
hits and misses. I/O is not included and split cycle writes are counted
individually.
p5-data-write-miss
- (Event 04H) The number of memory write accesses that miss the data cache,
counting both cacheable and un-cacheable accesses. I/O accesses are not
counted.
p5-emms-instructions-executed
- (Event 2DH, Pentium MMX) The number of EMMS instructions executed. This
event is only allocated on counter 0.
p5-external-data-cache-snoop-hits
- (Event 08H) The number of external snoops to the data cache that hit a
valid line, or the data line fill buffer, or one of the write back
buffers.
p5-external-snoops
- (Event 07H) The number of external snoop requests accepted, including
snoops that hit in the code cache, the data cache and that hit in
neither.
p5-floating-point-stalls-duration
- (Event 32H, Pentium MMX) The number of cycles the pipeline is stalled due
to a floating point freeze. This event is only allocated on counter
0.
p5-flops
- (Event 22H) The number of floating point adds, subtracts, multiplies,
divides and square roots. Transcendental instructions trigger this event
multiple times. Instructions generating divide-by-zero, negative square
root, special operand and stack exceptions are not counted. Integer
multiply instructions that use the x87 FPU are counted.
p5-full-write-buffer-stall-duration-while-executing-mmx-instructions
- (Event 3BH, Pentium MMX) The number of clocks the pipeline has stalled due
to full write buffers when executing MMX instructions. This event is only
allocated on counter 0.
p5-hardware-interrupts
- (Event 27H) The number of taken INTR and NMI interrupts.
p5-instructions-executed
- (Event 16H) The number of instructions executed. Repeat prefixed
instructions are counted only once. The HLT instruction is counted only
once, irrespective of the number of cycles spent in the halted state. All
hardware and software exceptions are counted as instructions, and fault
handler invocations are also counted as instructions.
p5-instructions-executed-v-pipe
- (Event 17H) The number of instructions that executed in the V pipe.
p5-io-read-or-write-cycle
- (Event 1DH) The number of bus cycles directed to I/O space.
p5-locked-bus-cycle
- (Event 1CH) The number of locked bus cycles that occur on account of the
lock prefixes, LOCK instructions, page table updates and descriptor table
updates.
p5-memory-accesses-in-both-pipes
- (Event 09H) The number of data memory reads or writes that are paired in
both pipes.
p5-misaligned-data-memory-or-io-references
- (Event 0BH) The number of memory or I/O reads or writes that are not
aligned on natural boundaries. 2- and 4-byte accesses are counted as
misaligned if they cross a 4-byte boundary.
p5-misaligned-data-memory-reference-on-mmx-instructions
- (Event 36H, Pentium MMX) The number of misaligned data memory references
when executing MMX instructions. This event is only allocated on counter
0.
p5-mispredicted-or-unpredicted-returns
- (Event 37H, Pentium MMX) The number of returns predicted incorrectly or
not at all, only counting RET instructions. This event is only allocated
on counter 0.
p5-mmx-instruction-data-read-misses
- (Event 31H, Pentium MMX) The number of MMX instruction data read misses.
This event is only allocated on counter 1.
p5-mmx-instruction-data-reads
- (Event 31H, Pentium MMX) The number of MMX instruction data reads. This
event is only allocated on counter 0.
p5-mmx-instruction-data-write-misses
- (Event 34H, Pentium MMX) The number of data write misses caused by MMX
instructions. This event is only allocated on counter 1.
p5-mmx-instruction-data-writes
- (Event 34H, Pentium MMX) The number of data writes caused by MMX
instructions. This event is only allocated on counter 0.
p5-mmx-instructions-executed-u-pipe
- (Event 2BH, Pentium MMX) The number of MMX instructions executed in the U
pipe. This event is only allocated on counter 0.
p5-mmx-instructions-executed-v-pipe
- (Event 2BH, Pentium MMX) The number of MMX instructions executed in the V
pipe. This event is only allocated on counter 1.
p5-mmx-multiply-unit-interlock
- (Event 38H, Pentium MMX) The number of clocks the pipeline is stalled
because the destination of a prior MMX multiply is not ready. This event
is only allocated on counter 0.
p5-movd-movq-store-stall-due-to-previous-mmx-operation
- (Event 38H, Pentium MMX) The number of clocks a MOVD/MOVQ instruction
stalled in the D2 stage of the pipeline due to a previous MMX instruction.
This event is only allocated on counter 1.
p5-noncacheable-memory-reads
- (Event 1EH) The number of bus cycles for non-cacheable instruction or data
reads, including cycles caused by TLB misses.
p5-number-of-cycles-not-in-halt-state
- (Event 30H, Pentium MMX) The number of cycles the processor is not idle
due to the HLT instruction. This event is only allocated on counter
0.
p5-pipeline-agi-stalls
- (Event 1FH) The number of address generation interlock stalls. An AGI that
occurs in both the U and V pipelines in the same clock signals the event
twice.
p5-pipeline-flushes
- (Event 15H) The number of pipeline flushes that occur. Pipeline flushes
are caused by branch mispredicts, exceptions, interrupts, some segment
register loads, and BTB misses. Prefetch queue flushes due to serializing
instructions are not counted.
p5-pipeline-flushes-due-to-wrong-branch-predictions
- (Event 35H, Pentium MMX) The number of pipeline flushes due to wrong
branch predictions resolved in either the E or WB stage of the pipeline.
This event is only allocated on counter 0.
p5-pipeline-flushes-due-to-wrong-branch-predictions-resolved-in-wb-stage
- (Event 35H, Pentium MMX) The number of pipeline flushes due to wrong
branch predictions resolved in the WB stage of the pipeline. This event is
only allocated on counter 1.
p5-pipeline-stall-for-mmx-instruction-data-memory-reads
- (Event 36H, Pentium MMX) The number of clocks during pipeline stalls
caused by waiting MMX data memory reads. This event is only allocated on
counter 1.
p5-predicted-returns
- (Event 37H, Pentium MMX) The number of predicted returns, whether correct
or incorrect. This counter only counts RET instructions. This event is
only allocated on counter 1.
p5-returns
- (Event 39H, Pentium MMX) The number of RET instructions executed. This
event is only allocated on counter 0.
p5-saturating-mmx-instructions-executed
- (Event 2FH, Pentium MMX) The number of saturating MMX instructions
executed. This event is only allocated on counter 0.
p5-saturations-performed
- (Event 2FH, Pentium MMX) The number of saturating MMX instructions
executed for which at least one result was actually saturated. This
event is only allocated on counter 1.
p5-stall-on-mmx-instruction-write-to-e-o-m-state-line
- (Event 3BH, Pentium MMX) The number of clocks during stalls on MMX
instructions writing to E- or M-state cache lines. This event is only
allocated on counter 1.
p5-stall-on-write-to-an-e-or-m-state-line
- (Event 1BH) The number of stalls on a write to an exclusive or modified
data cache line.
p5-taken-branch-or-btb-hit
- (Event 14H) The number of events that may cause a hit in the BTB, namely
either taken branches or BTB hits.
p5-taken-branches
- (Event 32H, Pentium MMX) The number of taken branches. This event is only
allocated on counter 1.
p5-transitions-between-mmx-and-fp-instructions
- (Event 2DH, Pentium MMX) The number of transitions between MMX and
floating-point instructions and vice-versa. This event is only allocated
on counter 1.
p5-waiting-for-data-memory-read-stall-duration
- (Event 1AH) The number of clocks the pipeline was stalled waiting for data
memory reads. Data TLB miss processing is included in this count.
p5-write-buffer-full-stall-duration
- (Event 19H) The number of clocks while the pipeline was stalled due to
write buffers being full.
p5-write-hit-to-m-or-e-state-lines
- (Event 05H) The number of writes that hit exclusive or modified lines in
the data cache.
p5-writes-to-noncacheable-memory
- (Event 2EH, Pentium MMX) The number of writes to non-cacheable memory,
including write cycles caused by TLB misses and I/O writes. This event is
only allocated on counter 1.
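Counts from two of the events above can be combined into a derived metric. As a minimal sketch, the function below computes a data cache miss ratio from a count of p5-data-read-or-write (Event 28H) and a count of p5-data-read-miss-or-write-miss (Event 29H); neither event is documented above as restricted to one counter, so both should fit on the two PMCs together. The function name is hypothetical, and because Event 29H also counts un-cacheable accesses, the result is only an approximation of true cache behaviour.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fraction of data accesses that missed the
 * internal data cache.  Returns 0.0 when no accesses were counted,
 * to avoid dividing by zero.
 */
static double
p5_dcache_miss_ratio(uint64_t accesses, uint64_t misses)
{
    if (accesses == 0)
        return (0.0);
    return ((double)misses / (double)accesses);
}
```

For instance, 200 accesses with 50 misses give a ratio of 0.25.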
The following table shows the mapping between the PMC-independent
aliases supported by Performance Counters Library (libpmc,
-lpmc) and the underlying hardware events used.
pmc(3),
pmc.atom(3),
pmc.core(3),
pmc.core2(3),
pmc.iaf(3),
pmc.k7(3),
pmc.k8(3),
pmc.p4(3),
pmc.p6(3),
pmc.soft(3),
pmc.tsc(3),
pmclog(3),
hwpmc(4)
The pmc library first appeared in FreeBSD 6.0.
The Performance Counters Library (libpmc, -lpmc) was written by
Joseph Koshy <jkoshy@FreeBSD.org>.