# APPENDIX A PERFORMANCE-MONITORING EVENTS

This appendix contains a list of the performance-monitoring events that can be monitored with the Intel Architecture processors. In the Intel Architecture processors, both the ability to monitor performance events and the set of events that can be monitored are model specific. Section A.1., "P6 Family Processor Performance-Monitoring Events," lists and describes the events that can be monitored with the P6 family of processors. Section A.2., "Pentium® Processor Performance-Monitoring Events," lists and describes the events that can be monitored with Pentium® processors.

# A.1. P6 FAMILY PROCESSOR PERFORMANCE-MONITORING EVENTS

Table A-1 lists the events that can be counted with the performance-monitoring counters and read with the RDPMC instruction for the P6 family of processors. The unit column gives the microarchitecture or bus unit that produces the event; the event number column gives the hexadecimal number identifying the event; the mnemonic event name column gives the name of the event; the unit mask column gives the unit mask required (if any); the description column describes the event; and the comments column gives additional information about the event.
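
As a rough illustration of how the event number and unit mask columns are used together, the sketch below packs them into a single 32-bit event-select value. Only the PC bit (bit 19) is named in Table A-1; the placement of the event number in bits 0 through 7, the unit mask in bits 8 through 15, and the USR, OS, and EN flags are assumptions based on the commonly documented P6 PerfEvtSel0/PerfEvtSel1 layout and should be checked against the register description for the target processor. The helper name `perfevtsel_value` is hypothetical.

```python
# Bit positions assumed from the commonly documented P6 PerfEvtSel0/1 layout;
# verify them against the register description before relying on them.
USR_FLAG = 1 << 16   # count while CPL > 0 (assumed position)
OS_FLAG = 1 << 17    # count while CPL = 0 (assumed position)
PC_FLAG = 1 << 19    # pin control, the PC bit referenced in Table A-1
EN_FLAG = 1 << 22    # enable counters (assumed position)


def perfevtsel_value(event_num, unit_mask=0x00, flags=0):
    """Pack a Table A-1 event number and unit mask into one 32-bit value."""
    if not 0 <= event_num <= 0xFF:
        raise ValueError("event number must fit in 8 bits")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit mask must fit in 8 bits")
    return (unit_mask << 8) | event_num | flags


# L2_LD (event 29H) with the MESI unit mask 0FH, counted at all privilege
# levels with the counters enabled.
value = perfevtsel_value(0x29, 0x0F, USR_FLAG | OS_FLAG | EN_FLAG)
print(f"{value:08X}")  # prints 00430F29
```

Writing the resulting value to the event-select register requires privileged code and is outside the scope of this sketch.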

These performance-monitoring events are intended to be used as guides for performance tuning. The counter values reported are not guaranteed to be absolutely accurate and should be used as a relative guide for tuning. Known discrepancies are documented where applicable.
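
Because the counts are best read relatively, one reasonable approach is to compare a ratio of two events between a baseline run and a tuned run instead of interpreting either raw count on its own. The sketch below does this for instruction fetch misses. The counter readings are made-up placeholder numbers, and the function names are hypothetical.

```python
def miss_ratio(misses, fetches):
    """Fraction of instruction fetches that missed the IFU."""
    if fetches == 0:
        return 0.0
    return misses / fetches


def relative_change(before, after):
    """Signed fractional change from `before` to `after`."""
    if before == 0:
        raise ValueError("baseline ratio is zero; change is undefined")
    return (after - before) / before


# Placeholder readings of IFU_IFETCH (80H) and IFU_IFETCH_MISS (81H).
baseline = miss_ratio(misses=5_000, fetches=100_000)
tuned = miss_ratio(misses=4_000, fetches=100_000)
print(f"miss ratio changed by {relative_change(baseline, tuned):+.1%}")
```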

Some performance events are model specific. Those added in later generations of the P6 family processors are listed in this table. Performance events are not architecturally guaranteed in future versions of the P6 family processors. All performance event encodings not listed in Table A-1 are reserved and their use will result in undefined counter results.
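
Since unlisted encodings are reserved, tooling may want to reject an event number before it is ever programmed. A minimal sketch follows, using a hypothetical lookup table populated with only a handful of the Table A-1 entries; a real table would carry every listed event. The names `KNOWN_EVENTS` and `check_event` are illustrative, and treating any nonzero combination of the four MESI bits as a valid unit mask for L2 events is an assumption to confirm against the notes at the end of the table.

```python
# Partial, illustrative subset of Table A-1:
# event number -> (mnemonic, set of accepted unit masks).
KNOWN_EVENTS = {
    0x43: ("DATA_MEM_REFS", {0x00}),
    0x48: ("DCU_MISS_OUTSTANDING", {0x00}),
    0x80: ("IFU_IFETCH", {0x00}),
    # Assumption: any nonzero combination of the MESI bits (01H..0FH).
    0x29: ("L2_LD", set(range(0x01, 0x10))),
    0x62: ("BUS_DRDY_CLOCKS", {0x00, 0x20}),  # Self or Any
}


def check_event(event_num, unit_mask):
    """Return the mnemonic, or raise if the encoding is not in the table."""
    entry = KNOWN_EVENTS.get(event_num)
    if entry is None:
        raise ValueError(f"event {event_num:02X}H is reserved or not listed")
    name, masks = entry
    if unit_mask not in masks:
        raise ValueError(f"unit mask {unit_mask:02X}H is not valid for {name}")
    return name


print(check_event(0x62, 0x20))  # prints BUS_DRDY_CLOCKS
```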

Refer to the end of the table for notes related to certain entries in the table.



| Unit                               | Event<br>Num. | Mnemonic Event<br>Name   | Unit<br>Mask | Description                                                                                                                                                                                                                                                                                                                                                                                                        | Comments                                                                                                                      |
|------------------------------------|---------------|--------------------------|--------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
| Data Cache<br>Unit (DCU)           | 43H           | DATA_MEM_REFS            | 00H          | All loads from any memory type.<br>All stores to any memory type.<br>Each part of a split is counted<br>separately. The internal logic<br>counts not only memory loads<br>and stores, but also internal<br>retries.                                                                                                                                                                                                |                                                                                                                               |
|                                    |               |                          |              | Note: 80-bit floating-point<br>accesses are double counted,<br>since they are decomposed into<br>a 16-bit exponent load and a 64-<br>bit mantissa load. Memory<br>accesses are only counted<br>when they are actually<br>performed (such as a load that<br>gets squashed because a<br>previous cache miss is<br>outstanding to the same<br>address, and which finally gets<br>performed, is only counted<br>once). |                                                                                                                               |
|                                    |               |                          |              | Does not include I/O accesses,<br>or other nonmemory accesses.                                                                                                                                                                                                                                                                                                                                                     |                                                                                                                               |
|                                    | 45H           | DCU_LINES_IN             | 00H          | Total lines allocated in the DCU.                                                                                                                                                                                                                                                                                                                                                                                  |                                                                                                                               |
|                                    | 46H           | DCU_M_LINES_IN           | 00H          | Number of M state lines allocated in the DCU.                                                                                                                                                                                                                                                                                                                                                                      |                                                                                                                               |
|                                    | 47H           | DCU_M_LINES_OUT          | 00H          | Number of M state lines evicted<br>from the DCU. This includes<br>evictions via snoop HITM,<br>intervention or replacement.                                                                                                                                                                                                                                                                                        |                                                                                                                               |
|                                    | 48H           | DCU_MISS_<br>OUTSTANDING | 00H          | Weighted number of cycles<br>while a DCU miss is<br>outstanding, incremented by the<br>number of outstanding cache<br>misses at any particular time.                                                                                                                                                                                                                                                               | An access that also<br>misses the L2 is short-<br>changed by 2 cycles<br>(i.e., if counts N cycles,<br>should be N+2 cycles). |
|                                    |               |                          |              | Cacheable read requests only are considered.                                                                                                                                                                                                                                                                                                                                                                       | Subsequent loads to the<br>same cache line will not<br>result in any additional                                               |
|                                    |               |                          |              | Uncacheable requests are<br>excluded.                                                                                                                                                                                                                                                                                                                                                                              | counts.                                                                                                                       |
|                                    |               |                          |              | Read-for-ownerships are<br>counted, as well as line fills,<br>invalidates, and stores.                                                                                                                                                                                                                                                                                                                             | Count value not precise,<br>but still useful.                                                                                 |
| Instruction<br>Fetch Unit<br>(IFU) | 80H           | IFU_IFETCH               | 00H          | Number of instruction fetches,<br>both cacheable and<br>noncacheable, including UC<br>fetches.                                                                                                                                                                                                                                                                                                                     |                                                                                                                               |
|                                    | 81H           | IFU_IFETCH_MISS          | 00H          | Number of instruction fetch misses.                                                                                                                                                                                                                                                                                                                                                                                |                                                                                                                               |
|                                    |               |                          |              | All instruction fetches that do not<br>hit the IFU (i.e., that produce<br>memory requests).                                                                                                                                                                                                                                                                                                                        |                                                                                                                               |
|                                    |               |                          |              | Includes UC accesses.                                                                                                                                                                                                                                                                                                                                                                                              |                                                                                                                               |
|                                    | 85H           | ITLB_MISS                | 00H          | Number of ITLB misses.                                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                                               |



| Unit                  | Event<br>Num. | Mnemonic Event<br>Name | Unit<br>Mask | Description                                                                                                                                                                 | Comments |
|-----------------------|---------------|------------------------|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
|                       | 86H           | IFU_MEM_STALL          | 00H          | Number of cycles instruction fetch is stalled, for any reason.                                                                                                              |          |
|                       |               |                        |              | Includes IFU cache misses,<br>ITLB misses, ITLB faults, and<br>other minor stalls.                                                                                          |          |
|                       | 87H           | ILD_STALL              | 00H          | Number of cycles that the<br>instruction length decoder is<br>stalled.                                                                                                      |          |
| L2 Cache <sup>1</sup> | 28H           | L2_IFETCH              | MESI<br>0FH  | Number of L2 instruction fetches.                                                                                                                                           |          |
|                       |               |                        |              | This event indicates that a normal instruction fetch was received by the L2.                                                                                                |          |
|                       |               |                        |              | The count includes only L2 cacheable instruction fetches; it does not include UC instruction fetches.                                                                       |          |
|                       |               |                        |              | It does not include ITLB miss accesses.                                                                                                                                     |          |
|                       | 29H           | L2_LD                  | MESI<br>0FH  | Number of L2 data loads.                                                                                                                                                    |          |
|                       |               |                        |              | This event indicates that a<br>normal, unlocked, load memory<br>access was received by the L2.                                                                              |          |
|                       |               |                        |              | It includes only L2 cacheable<br>memory accesses; it does not<br>include I/O accesses, other<br>nonmemory accesses, or<br>memory accesses such as<br>UC/WT memory accesses. |          |
|                       |               |                        |              | It does include L2 cacheable<br>TLB miss memory accesses.                                                                                                                   |          |
|                       | 2AH           | L2_ST                  | MESI<br>0FH  | Number of L2 data stores.                                                                                                                                                   |          |
|                       |               |                        |              | This event indicates that a normal, unlocked, store memory access was received by the L2.                                                                                   |          |
|                       |               |                        |              | Specifically, it indicates that the DCU sent a read-for-ownership request to the L2.                                                                                        |          |
|                       |               |                        |              | It also includes Invalid to<br>Modified requests sent by the<br>DCU to the L2.                                                                                              |          |
|                       |               |                        |              | It includes only L2 cacheable<br>memory accesses; it does not<br>include I/O accesses, other<br>nonmemory accesses, or<br>memory accesses such as<br>UC/WT memory accesses. |          |
|                       |               |                        |              | It includes TLB miss memory accesses.                                                                                                                                       |          |
|                       | 24H           | L2_LINES_IN            | 00H          | Number of lines allocated in the L2.                                                                                                                                        |          |
|                       | 26H           | L2_LINES_OUT           | 00H          | Number of lines removed from the L2 for any reason.                                                                                                                         |          |
|                                          | 25H           | L2_M_LINES_INM          | 00H                           | Number of modified lines allocated in the L2.                                                                                                               |                                                                                                                                                                                         |
|                                          | 27H           | L2_M_LINES_OUTM         | 00H                           | Number of modified lines removed from the L2 for any reason.                                                                                                |                                                                                                                                                                                         |
|                                          | 2EH           | L2_RQSTS                | MESI<br>0FH                   | Total number of L2 requests.                                                                                                                                |                                                                                                                                                                                         |
|                                          | 21H           | L2_ADS                  | 00H                           | Number of L2 address strobes.                                                                                                                               |                                                                                                                                                                                         |
|                                          | 22H           | L2_DBUS_BUSY            | 00H                           | Number of cycles during which the L2 cache data bus was busy.                                                                                               |                                                                                                                                                                                         |
|                                          | 23H           | L2_DBUS_BUSY_RD         | 00H                           | Number of cycles during which<br>the data bus was busy<br>transferring read data from L2 to<br>the processor.                                               |                                                                                                                                                                                         |
| External Bus<br>Logic (EBL) <sup>2</sup> | 62H           | BUS_DRDY_<br>CLOCKS     | 00H<br>(Self)<br>20H<br>(Any) | Number of clocks during which<br>DRDY# is asserted.<br>Utilization of the external system<br>data bus during data transfers.                                | Unit Mask = 00H counts<br>bus clocks when the<br>processor is driving<br>DRDY#.<br>Unit Mask = 20H counts<br>in processor clocks<br>when any agent is<br>driving DRDY#.                 |
|                                          | 63H           | BUS_LOCK_<br>CLOCKS     | 00H<br>(Self)<br>20H<br>(Any) | Number of clocks during which<br>LOCK# is asserted on the<br>external system bus. <sup>3</sup>                                                              | Always counts in<br>processor clocks.                                                                                                                                                   |
|                                          | 60H           | BUS_REQ_<br>OUTSTANDING | 00H<br>(Self)                 | Number of bus requests<br>outstanding.<br>This counter is incremented by<br>the number of cacheable read<br>bus requests outstanding in any<br>given cycle. | Counts only DCU full-<br>line cacheable reads, not<br>RFOs, writes, instruction<br>fetches, or anything else.<br>Counts "waiting for bus<br>to complete" (last data<br>chunk received). |
|                                          | 65H           | BUS_TRAN_BRD            | 00H<br>(Self)<br>20H<br>(Any) | Number of burst read transactions.                                                                                                                          |                                                                                                                                                                                         |
|                                          | 66H           | BUS_TRAN_RFO            | 00H<br>(Self)<br>20H<br>(Any) | Number of completed read for<br>ownership transactions.                                                                                                     |                                                                                                                                                                                         |
|                                          | 67H           | BUS_TRANS_WB            | 00H<br>(Self)<br>20H<br>(Any) | Number of completed write back transactions.                                                                                                                |                                                                                                                                                                                         |
|                                          | 68H           | BUS_TRAN_<br>IFETCH     | 00H<br>(Self)<br>20H<br>(Any) | Number of completed instruction fetch transactions.                                                                                                         |                                                                                                                                                                                         |

| Unit | Event<br>Num. | Mnemonic Event<br>Name | Unit<br>Mask                  | Description                                                                                                                                                                 | Comments |
|------|---------------|------------------------|-------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
|      | 69H           | BUS_TRAN_INVAL         | 00H<br>(Self)<br>20H<br>(Any) | Number of completed invalidate transactions.                                                                                                                                |          |
|      | 6AH           | BUS_TRAN_PWR           | 00H<br>(Self)<br>20H<br>(Any) | Number of completed partial write transactions.                                                                                                                             |          |
|      | 6BH           | BUS_TRANS_P            | 00H<br>(Self)<br>20H<br>(Any) | Number of completed partial transactions.                                                                                                                                   |          |
|      | 6CH           | BUS_TRANS_IO           | 00H<br>(Self)<br>20H<br>(Any) | Number of completed I/O transactions.                                                                                                                                       |          |
|      | 6DH           | BUS_TRAN_DEF           | 00H<br>(Self)<br>20H<br>(Any) | Number of completed deferred transactions.                                                                                                                                  |          |
|      | 6EH           | BUS_TRAN_BURST         | 00H<br>(Self)<br>20H<br>(Any) | Number of completed burst transactions.                                                                                                                                     |          |
|      | 70H           | BUS_TRAN_ANY           | 00H<br>(Self)<br>20H<br>(Any) | Number of all completed bus<br>transactions.<br>Address bus utilization can be<br>calculated knowing the minimum<br>address bus occupancy.<br>Includes special cycles, etc. |          |
|      | 6FH           | BUS_TRAN_MEM           | 00H<br>(Self)<br>20H<br>(Any) | Number of completed memory transactions.                                                                                                                                    |          |
|      | 64H           | BUS_DATA_RCV           | 00H<br>(Self)                 | Number of bus clock cycles<br>during which this processor is<br>receiving data.                                                                                             |          |
|      | 61H           | BUS_BNR_DRV            | 00H<br>(Self)                 | Number of bus clock cycles<br>during which this processor is<br>driving the BNR# pin.                                                                                       |          |
|      | 7AH           | BUS_HIT_DRV            | 00H<br>(Self) | Number of bus clock cycles<br>during which this processor is<br>driving the HIT# pin.  | Includes cycles due to<br>snoop stalls.                                                                                                                                                |
|      |               |                        |               |                                                                                        | The event counts<br>correctly, but the BPM <i>i</i><br>pins function as follows<br>based on the setting of<br>the PC bits (bit 19 in the<br>PerfEvtSel0 and<br>PerfEvtSel1 registers): |
|      |               |                        |               |                                                                                        | If the core-clock-to-bus-<br>clock ratio is 2:1 or 3:1,<br>and a PC bit is set, the<br>BPM <i>i</i> pins will be<br>asserted for a single<br>clock when the counters<br>overflow.      |
|      |               |                        |               |                                                                                        | If the PC bit is clear, the processor toggles the BPM <i>i</i> pins when the counter overflows.                                                                                        |
|      |               |                        |               |                                                                                        | If the clock ratio is not<br>2:1 or 3:1, the BPM <i>i</i><br>pins will not function for<br>these performance-<br>monitoring counter<br>events.                                         |
|      | 7BH           | BUS_HITM_DRV           | 00H<br>(Self) | Number of bus clock cycles<br>during which this processor is<br>driving the HITM# pin. | Includes cycles due to<br>snoop stalls.                                                                                                                                                |
|      |               |                        |               |                                                                                        | The event counts<br>correctly, but the BPM <i>i</i><br>pins function as follows<br>based on the setting of<br>the PC bits (bit 19 in the<br>PerfEvtSel0 and<br>PerfEvtSel1 registers): |
|      |               |                        |               |                                                                                        | If the core-clock-to-bus-<br>clock ratio is 2:1 or 3:1,<br>and a PC bit is set, the<br>BPM <i>i</i> pins will be<br>asserted for a single<br>clock when the counters<br>overflow.      |
|      |               |                        |               |                                                                                        | If the PC bit is clear, the processor toggles the BPM <i>i</i> pins when the counter overflows.                                                                                        |
|      |               |                        |               |                                                                                        | If the clock ratio is not<br>2:1 or 3:1, the BPM <i>i</i><br>pins will not function for<br>these performance-<br>monitoring counter<br>events.                                         |
|      | 7EH           | BUS_SNOOP_STALL        | 00H<br>(Self) | Number of clock cycles during which the bus is snoop stalled.                          |                                                                                                                                                                                        |



| Unit                    | Event<br>Num. | Mnemonic Event<br>Name | Unit<br>Mask | Description                                                                                                                    | Comments                                                                          |
|-------------------------|---------------|------------------------|--------------|--------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|
| Floating-<br>Point Unit | C1H           | FLOPS                  | 00H          | Number of computational floating-point operations retired.                                                                     | Counter 0 only.                                                                   |
|                         |               |                        |              | Excludes floating-point<br>computational operations that<br>cause traps or assists.                                            |                                                                                   |
|                         |               |                        |              | Includes floating-point<br>computational operations<br>executed by the assist handler.                                         |                                                                                   |
|                         |               |                        |              | Includes internal sub-operations<br>for complex floating-point<br>instructions like<br>transcendentals.                        |                                                                                   |
|                         |               |                        |              | Excludes floating-point loads and stores.                                                                                      |                                                                                   |
|                         | 10H           | FP_COMP_OPS_<br>EXE    | 00H          | Number of computational<br>floating-point operations<br>executed.                                                              | Counter 0 only.                                                                   |
|                         |               |                        |              | The number of FADD, FSUB,<br>FCOM, FMULs, integer MULs<br>and IMULs, FDIVs, FPREMs,<br>FSQRTS, integer DIVs, and<br>IDIVs.     |                                                                                   |
|                         |               |                        |              | Note: This is not the number of cycles, but the number of operations.                                                          |                                                                                   |
|                         |               |                        |              | This event does not distinguish<br>an FADD used in the middle of a<br>transcendental flow from a<br>separate FADD instruction. |                                                                                   |
|                         | 11H           | FP_ASSIST              | 00H          | Number of floating-point<br>exception cases handled by<br>microcode.                                                           | Counter 1 only.<br>This event includes<br>counts due to<br>speculative execution. |
|                         | 12H           | MUL                    | 00H          | Number of multiplies.                                                                                                          | Counter 1 only.                                                                   |
|                         |               |                        |              | Note: Includes integer as well as FP multiplies and is speculative.                                                            |                                                                                   |
|                         | 13H           | DIV                    | 00H          | Number of divides.                                                                                                             | Counter 1 only.                                                                   |
|                         |               |                        |              | Note: Includes integer as well as FP divides and is speculative.                                                               |                                                                                   |
|                         | 14H           | CYCLES_DIV_BUSY        | 00H          | Number of cycles during which the divider is busy, and cannot accept new divides.                                              | Counter 0 only.                                                                   |
|                         |               |                        |              | Note: Includes integer and FP<br>divides, FPREM, FPSQRT, etc.,<br>and is speculative.                                          |                                                                                   |



| Unit               | Event<br>Num. | Mnemonic Event<br>Name       | Unit<br>Mask             | Description                                                                                                                                                                                                                                                                                                                                                         | Comments                                                                                                                                                                                                                                                                      |
|--------------------|---------------|------------------------------|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Memory<br>Ordering | 03H           | LD_BLOCKS                    | 00H                      | Number of store buffer blocks.<br>Includes counts caused by<br>preceding stores whose<br>addresses are unknown,<br>preceding stores whose<br>addresses are known but whose<br>data is unknown, and preceding<br>stores that conflict with the load<br>but which incompletely overlap<br>the load.                                                                   |                                                                                                                                                                                                                                                                               |
|                    | 04H           | SB_DRAINS                    | 00H                      | Number of store buffer drain<br>cycles.<br>Incremented every cycle the<br>store buffer is draining.<br>Draining is caused by serializing<br>operations like CPUID,<br>synchronizing operations like<br>XCHG, interrupt<br>acknowledgment, as well as<br>other conditions (such as cache<br>flushing).                                                               |                                                                                                                                                                                                                                                                               |
|                    | 05H           | MISALIGN_<br>MEM_REF         | 00H                      | Number of misaligned data<br>memory references.<br>Incremented by 1 every cycle<br>during which either the processor's load<br>or store pipeline dispatches a<br>misaligned uop.<br>Counting is performed if it is the<br>first or second half, or if it is<br>blocked, squashed, or missed.<br>Note: In this context, misaligned<br>means crossing a 64-bit<br>boundary. | It should be noted that<br>MISALIGN_MEM_REF<br>is only an approximation<br>to the true number of<br>misaligned memory<br>references.<br>The value returned is<br>roughly proportional to<br>the number of<br>misaligned memory<br>accesses, i.e., the size<br>of the problem. |
|                    | 07H           | EMON_KNI_PREF_<br>DISPATCHED | 00H<br>01H<br>02H<br>03H | Number of Streaming SIMD<br>extensions prefetch/weakly-<br>ordered instructions dispatched<br>(speculative prefetches are<br>included in counting)<br>0: prefetch NTA<br>1: prefetch T1<br>2: prefetch T2<br>3: weakly ordered stores                                                                                                                               | Counters 0 and 1.<br>Pentium <sup>®</sup> III processor<br>only.                                                                                                                                                                                                              |
|                    | 4BH           | EMON_KNI_PREF_<br>MISS       | 00H<br>01H<br>02H<br>03H | Number of prefetch/weakly-<br>ordered instructions that miss all<br>caches.<br>0: prefetch NTA<br>1: prefetch T1<br>2: prefetch T2<br>3: weakly ordered stores                                                                                                                                                                                                      | Counters 0 and 1.<br>Pentium <sup>®</sup> III processor<br>only.                                                                                                                                                                                                              |

| Unit                                         | Event<br>Num. | Mnemonic Event<br>Name                | Unit<br>Mask | Description                                                                                                                                                                   | Comments                                                                                                                                             |
|----------------------------------------------|---------------|---------------------------------------|--------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
| Instruction<br>Decoding<br>and<br>Retirement | C0H           | INST_RETIRED                          | 00H          | Number of instructions retired.                                                                                                                                               | A hardware interrupt<br>received during/after the<br>last iteration of the REP<br>STOS flow causes the<br>counter to undercount by<br>1 instruction. |
|                                              | C2H           | UOPS_RETIRED                          | 00H          | Number of UOPs retired.                                                                                                                                                       |                                                                                                                                                      |
|                                              | D0H           | INST_DECODED                          | 00H          | Number of instructions decoded.                                                                                                                                               |                                                                                                                                                      |
|                                              | D8H           | EMON_KNI_INST_<br>RETIRED             | 00H<br>01H   | Number of Streaming SIMD<br>extensions retired<br>0: packed & scalar<br>1: scalar                                                                                             | Counters 0 and 1.<br>Pentium <sup>®</sup> III processor<br>only.                                                                                     |
|                                              | D9H           | EMON_KNI_COMP_<br>INST_RET            | 00H<br>01H   | Number of Streaming SIMD<br>extensions computation<br>instructions retired.<br>0: packed and scalar<br>1: scalar                                                              | Counters 0 and 1.<br>Pentium <sup>®</sup> III processor<br>only.                                                                                     |
| Interrupts                                   | C8H           | HW_INT_RX                             | 00H          | Number of hardware interrupts received.                                                                                                                                       |                                                                                                                                                      |
|                                              | C6H           | CYCLES_INT_<br>MASKED                 | 00H          | Number of processor cycles for which interrupts are disabled.                                                                                                                 |                                                                                                                                                      |
|                                              | C7H           | CYCLES_INT_<br>PENDING_<br>AND_MASKED | 00H          | Number of processor cycles for which interrupts are disabled and interrupts are pending.                                                                                      |                                                                                                                                                      |
| Branches                                     | C4H           | BR_INST_RETIRED                       | 00H          | Number of branch instructions retired.                                                                                                                                        |                                                                                                                                                      |
|                                              | C5H           | BR_MISS_PRED_<br>RETIRED              | 00H          | Number of mispredicted branches retired.                                                                                                                                      |                                                                                                                                                      |
|                                              | C9H           | BR_TAKEN_<br>RETIRED                  | 00H          | Number of taken branches retired.                                                                                                                                             |                                                                                                                                                      |
|                                              | CAH           | BR_MISS_PRED_<br>TAKEN_RET            | 00H          | Number of taken mispredicted branches retired.                                                                                                                                |                                                                                                                                                      |
|                                              | E0H           | BR_INST_DECODED                       | 00H          | Number of branch instructions decoded.                                                                                                                                        |                                                                                                                                                      |
|                                              | E2H           | BTB_MISSES                            | 00H          | Number of branches for which<br>the BTB did not produce a<br>prediction.                                                                                                      |                                                                                                                                                      |
|                                              | E4H           | BR_BOGUS                              | 00H          | Number of bogus branches.                                                                                                                                                     |                                                                                                                                                      |
|                                              | E6H           | BACLEARS                              | 00H          | Number of times BACLEAR is asserted.                                                                                                                                          |                                                                                                                                                      |
|                                              |               |                                       |              | This is the number of times that<br>a static branch prediction was<br>made, in which the branch<br>decoder decided to make a<br>branch prediction because the<br>BTB did not. |                                                                                                                                                      |



| Unit                         | Event<br>Num. | Mnemonic Event<br>Name  | Unit<br>Mask | Description                                                                                                                                                                                                                  | Comments                                                                                                                                               |
|------------------------------|---------------|-------------------------|--------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|
| Stalls                       | A2H           | RESOURCE_STALLS         | 00H          | Incremented by 1 during every<br>cycle for which there is a<br>resource related stall.                                                                                                                                       |                                                                                                                                                        |
|                              |               |                         |              | Includes register renaming<br>buffer entries, memory buffer<br>entries.                                                                                                                                                      |                                                                                                                                                        |
|                              |               |                         |              | Does not include stalls due to<br>bus queue full, too many cache<br>misses, etc.                                                                                                                                             |                                                                                                                                                        |
|                              |               |                         |              | In addition to resource related<br>stalls, this event counts some<br>other events.                                                                                                                                           |                                                                                                                                                        |
|                              |               |                         |              | Includes stalls arising during<br>branch misprediction recovery,<br>such as if retirement of the<br>mispredicted branch is delayed<br>and stalls arising while store<br>buffer is draining from<br>synchronizing operations. |                                                                                                                                                        |
|                              | D2H           | PARTIAL_RAT_<br>STALLS  | 00H          | Number of cycles or events for partial stalls.                                                                                                                                                                               |                                                                                                                                                        |
|                              |               |                         |              | Note: Includes flag partial stalls.                                                                                                                                                                                          |                                                                                                                                                        |
| Segment<br>Register<br>Loads | 06H           | SEGMENT_REG_<br>LOADS   | 00H          | Number of segment register loads.                                                                                                                                                                                            |                                                                                                                                                        |
| Clocks                       | 79H           | CPU_CLK_<br>UNHALTED    | 00H          | Number of cycles during which the processor is not halted.                                                                                                                                                                   |                                                                                                                                                        |
| MMX <sup>™</sup> Unit        | B0H           | MMX_INSTR_EXEC          | 00H          | Number of MMX <sup>™</sup> Instructions<br>Executed.                                                                                                                                                                         | Available in Intel <sup>®</sup><br>Celeron <sup>™</sup> , Pentium <sup>®</sup> II<br>and Pentium <sup>®</sup> II Xeon <sup>™</sup><br>processors only. |
|                              |               |                         |              |                                                                                                                                                                                                                              | Does not account for<br>MOVQ and MOVD<br>stores from register to<br>memory.                                                                            |
|                              | B1H           | MMX_SAT_<br>INSTR_EXEC  | 00H          | Number of MMX <sup>™</sup> Saturating<br>Instructions Executed.                                                                                                                                                              | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III<br>processors only.                                                                 |
|                              | B2H           | MMX_UOPS_EXEC           | 0FH          | Number of MMX <sup>™</sup> UOPS<br>Executed.                                                                                                                                                                                 | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III<br>processors only.                                                                 |
|                              | B3H           | MMX_INSTR_<br>TYPE_EXEC | 01H          | MMX <sup>™</sup> packed multiply<br>instructions executed.                                                                                                                                                                   | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III                                                                                     |
|                              |               |                         | 02H          | MMX <sup>™</sup> packed shift instructions executed.                                                                                                                                                                         | processors only.                                                                                                                                       |
|                              |               |                         | 04H          | MMX <sup>™</sup> pack operation instructions executed.                                                                                                                                                                       |                                                                                                                                                        |
|                              |               |                         | 08H          | MMX <sup>™</sup> unpack operation<br>instructions executed.                                                                                                                                                                  |                                                                                                                                                        |
|                              |               |                         | 10H          | MMX <sup>™</sup> packed logical instructions executed.                                                                                                                                                                       |                                                                                                                                                        |
|                              |               |                         | 20H          | MMX <sup>™</sup> packed arithmetic instructions executed.                                                                                                                                                                    |                                                                                                                                                        |



| Unit                            | Event<br>Num. | Mnemonic Event<br>Name | Unit<br>Mask                    | Description                                                                                                                                                                          | Comments                                                                               |
|---------------------------------|---------------|------------------------|---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------|
|                                 | CCH           | FP_MMX_TRANS           | 00H<br>01H                      | Transitions from MMX <sup>™</sup><br>instruction to floating-point<br>instructions.<br>Transitions from floating-point<br>instructions to MMX <sup>™</sup><br>instructions.          | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III<br>processors only. |
|                                 | CDH           | MMX_ASSIST             | 00H                             | Number of MMX <sup>™</sup> Assists (that<br>is, the number of EMMS<br>instructions executed).                                                                                        | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III<br>processors only. |
|                                 | CEH           | MMX_INSTR_RET          | 00H                             | Number of MMX <sup>™</sup> Instructions Retired.                                                                                                                                     | Available in Pentium <sup>®</sup> II processor only.                                   |
| Segment<br>Register<br>Renaming | D4H           | SEG_RENAME_<br>STALLS  | 01H<br>02H<br>04H<br>08H<br>0FH | Number of Segment Register<br>Renaming Stalls:<br>Segment register ES<br>Segment register DS<br>Segment register FS<br>Segment register GS<br>Segment registers ES + DS +<br>FS + GS | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III<br>processors only. |
|                                 | D5H           | SEG_REG_<br>RENAMES    | 01H<br>02H<br>04H<br>08H<br>0FH | Number of Segment Register<br>Renames:<br>Segment register ES<br>Segment register DS<br>Segment register FS<br>Segment register GS<br>Segment registers ES + DS +<br>FS + GS         | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III<br>processors only. |
|                                 | D6H           | RET_SEG_<br>RENAMES    | 00H                             | Number of segment register rename events retired.                                                                                                                                    | Available in Pentium <sup>®</sup> II<br>& Pentium <sup>®</sup> III<br>processors only. |

#### NOTES:

- 1. Several L2 cache events, where noted, can be further qualified using the Unit Mask (UMSK) field in the PerfEvtSel0 and PerfEvtSel1 registers. The lower 4 bits of the Unit Mask field are used in conjunction with L2 events to indicate the cache state or cache states involved. The P6 family processors identify cache states using the "MESI" protocol and consequently each bit in the Unit Mask field represents one of the four states: UMSK[3] = M (8H) state, UMSK[2] = E (4H) state, UMSK[1] = S (2H) state, and UMSK[0] = I (1H) state. UMSK[3:0] = MESI (FH) should be used to collect data for all states; UMSK = 0H, for the applicable events, will result in nothing being counted.
- 2. All of the external bus logic (EBL) events, except where noted, can be further qualified using the Unit Mask (UMSK) field in the PerfEvtSel0 and PerfEvtSel1 registers. Bit 5 of the UMSK field is used in conjunction with the EBL events to indicate whether the processor should count transactions that are self-generated (UMSK[5] = 0) or transactions that result from any processor on the bus (UMSK[5] = 1).
- 3. L2 cache locks, so it is possible to have a zero count.
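The unit-mask encodings described in notes 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch only: the names `L2State`, `EBL_ANY_PROCESSOR`, `l2_umsk`, and `ebl_umsk` are hypothetical helpers introduced here, and the code computes only the UMSK value itself, not its placement within the PerfEvtSel0 or PerfEvtSel1 register.

```python
from enum import IntFlag


class L2State(IntFlag):
    """MESI cache-state bits in UMSK[3:0], as listed in note 1."""
    I = 0x1  # UMSK[0]
    S = 0x2  # UMSK[1]
    E = 0x4  # UMSK[2]
    M = 0x8  # UMSK[3]


# UMSK[5]: 0 = self-generated transactions, 1 = any processor on the bus (note 2).
EBL_ANY_PROCESSOR = 1 << 5


def l2_umsk(states: L2State) -> int:
    """Return the unit mask for an L2 event limited to the given MESI states."""
    mask = int(states) & 0xF
    if mask == 0:
        # Note 1: UMSK = 0H results in nothing being counted.
        raise ValueError("UMSK = 0H counts nothing for the applicable L2 events")
    return mask


def ebl_umsk(any_processor: bool) -> int:
    """Return the unit mask bit that qualifies an EBL event."""
    return EBL_ANY_PROCESSOR if any_processor else 0


ALL_STATES = L2State.M | L2State.E | L2State.S | L2State.I
print(f"{l2_umsk(ALL_STATES):02X}H")             # all four MESI states
print(f"{l2_umsk(L2State.M | L2State.E):02X}H")  # modified or exclusive lines only
print(f"{ebl_umsk(True):02X}H")                  # transactions from any processor
```

Rejecting a zero mask in `l2_umsk` is a design choice of this sketch; it simply surfaces the "nothing being counted" case from note 1 before a counter is programmed.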



# A.2. PENTIUM<sup>®</sup> PROCESSOR PERFORMANCE-MONITORING EVENTS

Table A-2 lists the events that can be counted with the performance-monitoring counters for the Pentium<sup>®</sup> processor. The Event Number column gives the hexadecimal code that identifies the event and that is entered in the ES0 or ES1 (event select) fields of the CESR MSR. The Mnemonic Event Name column gives the name of the event, and the Description and Comments columns give detailed descriptions of the events. Most events can be counted with either counter 0 or counter 1; however, some events can be counted only with counter 0 or only with counter 1 (as noted).
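As a rough illustration of how an event number is entered in the ES0 and ES1 fields, the following Python sketch packs two event selections into a single CESR value. The field positions (ES0 in bits 5:0, CC0 in bits 8:6, ES1 in bits 21:16, CC1 in bits 24:22) are an assumption taken from the Pentium processor's CESR description rather than from this appendix, and should be checked against Chapter 15. The helper `cesr_value` and the constant names are hypothetical.

```python
# Assumed CESR field positions (not stated in this appendix; verify before use).
ES0_SHIFT, CC0_SHIFT = 0, 6     # counter 0: event select, counter control
ES1_SHIFT, CC1_SHIFT = 16, 22   # counter 1: event select, counter control
ES_MAX = 0x3F                   # 6-bit event select field
CC_MAX = 0x7                    # 3-bit counter control field

# Event numbers taken from Table A-2.
DATA_READ = 0x00
DATA_READ_MISS = 0x03


def cesr_value(es0: int, cc0: int, es1: int, cc1: int) -> int:
    """Pack two event selections into a CESR value (hypothetical helper)."""
    fields = (("es0", es0, ES_MAX), ("cc0", cc0, CC_MAX),
              ("es1", es1, ES_MAX), ("cc1", cc1, CC_MAX))
    for name, value, limit in fields:
        if not 0 <= value <= limit:
            raise ValueError(f"{name}={value:#x} does not fit its field")
    return ((es0 << ES0_SHIFT) | (cc0 << CC0_SHIFT) |
            (es1 << ES1_SHIFT) | (cc1 << CC1_SHIFT))


# The control value 0b011 is only a placeholder; its meaning is defined
# with the CESR description, not in this appendix.
value = cesr_value(DATA_READ, 0b011, DATA_READ_MISS, 0b011)
print(f"{value:08X}H")
```

Writing the resulting value to the MSR requires privileged code and is not shown. Pairing DATA_READ on counter 0 with DATA_READ_MISS on counter 1 is one way to derive a data cache read miss ratio from a single run.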

#### NOTE

The events in the table that are shaded are implemented only in the Pentium<sup>®</sup> processor with MMX<sup>TM</sup> technology.

| Event<br>Num. | Mnemonic Event<br>Name | Description                                                                                                                              | Comments                                                                                                                                                                                                                                                                                                                                                             |
|---------------|------------------------|------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 00H           | DATA_READ              | Number of memory data<br>reads (internal data cache<br>hit and miss combined).                                                           | Split cycle reads are counted<br>individually. Data Memory Reads that<br>are part of TLB miss processing are not<br>included. These events may occur at a<br>maximum of two per clock. I/O is not<br>included.                                                                                                                                                       |
| 01H           | DATA_WRITE             | Number of memory data<br>writes (internal data cache<br>hit and miss combined),<br>I/O is not included.                                  | Split cycle writes are counted<br>individually. These events may occur at<br>a maximum of two per clock. I/O is not<br>included.                                                                                                                                                                                                                                     |
| 0H2           | DATA_TLB_MISS          | Number of misses to the data cache translation look-aside buffer.                                                                        |                                                                                                                                                                                                                                                                                                                                                                      |
| 03H           | DATA_READ_MISS         | Number of memory read<br>accesses that miss the<br>internal data cache<br>whether or not the access<br>is cacheable or<br>noncacheable.  | Additional reads to the same cache line<br>after the first BRDY# of the burst line fill<br>is returned but before the final (fourth)<br>BRDY# has been returned, will not<br>cause the counter to be incremented<br>additional times. Data accesses that<br>are part of TLB miss processing are not<br>included. Accesses directed to I/O<br>space are not included. |
| 04H           | DATA WRITE MISS        | Number of memory write<br>accesses that miss the<br>internal data cache<br>whether or not the access<br>is cacheable or<br>noncacheable. | Data accesses that are part of TLB<br>miss processing are not included.<br>Accesses directed to I/O space are not<br>included.                                                                                                                                                                                                                                       |
| 05H           | WRITE_HIT_TO_<br>M-_OR_E-<br>STATE_LINES         | Number of write hits to<br>exclusive or modified lines<br>in the data cache.                                                | These are the writes that may be held<br>up if EWBE# is inactive. These events<br>may occur at a maximum of two per clock.                                                                                                             |
| 06H           | DATA_CACHE_<br>LINES_<br>WRITTEN_BACK          | Number of dirty lines (all)<br>that are written back,<br>regardless of the cause.                                           | Replacements and internal and external<br>snoops can all cause writeback and are<br>counted.                                                                                                                                           |
| 07H           | EXTERNAL_<br>SNOOPS                            | Number of accepted<br>external snoops whether<br>they hit in the code cache<br>or data cache or neither.                    | Assertions of EADS# outside of the sampling interval are not counted, and no internal snoops are counted.                                                                                                                              |
| 08H           | EXTERNAL_DATA_<br>CACHE_SNOOP_<br>HITS         | Number of external snoops to the data cache.                                                                                | Snoop hits to a valid line in either the data cache, the data line fill buffer, or one of the write back buffers are all counted as hits.                                                                                              |
| 09H           | MEMORY<br>ACCESSES IN<br>BOTH PIPES            | Number of data memory<br>reads or writes that are<br>paired in both pipes of the<br>pipeline.                               | These accesses are not necessarily run<br>in parallel due to cache misses, bank<br>conflicts, etc.                                                                                                                                     |
| 0AH           | BANK CONFLICTS                                 | Number of actual bank conflicts.                                                                                            |                                                                                                                                                                                                                                        |
| 0BH           | MISALIGNED DATA<br>MEMORY OR I/O<br>REFERENCES | Number of memory or I/O<br>reads or writes that are<br>misaligned.                                                          | A 2- or 4-byte access is misaligned<br>when it crosses a 4-byte boundary; an<br>8-byte access is misaligned when it<br>crosses an 8-byte boundary. Ten byte<br>accesses are treated as two separate<br>accesses of 8 and 2 bytes each. |
| 0CH           | CODE READ                                      | Number of instruction<br>reads whether the read is<br>cacheable or<br>noncacheable.                                         | Individual 8-byte noncacheable instruction reads are counted.                                                                                                                                                                          |
| 0DH           | CODE TLB MISS                                  | Number of instruction<br>reads that miss the code<br>TLB whether the read is<br>cacheable or<br>noncacheable.               | Individual 8-byte noncacheable instruction reads are counted.                                                                                                                                                                          |
| 0EH           | CODE CACHE MISS                                | Number of instruction<br>reads that miss the<br>internal code cache<br>whether the read is<br>cacheable or<br>noncacheable. | Individual 8-byte noncacheable instruction reads are counted.                                                                                                                                                                          |
| 0FH           | ANY SEGMENT<br>REGISTER LOADED | Number of writes into any<br>segment register in real or<br>protected mode including<br>the LDTR, GDTR, IDTR,<br>and TR.                                                                               | Segment loads are caused by explicit<br>segment register load instructions, far<br>control transfers, and task switches. Far<br>control transfers and task switches<br>causing a privilege level change will<br>signal this event twice. Note that<br>interrupts and exceptions may initiate a<br>far control transfer.                                                                      |
| 10H           | Reserved                       |                                                                                                                                                                                                        |                                                                                                                                                                                                                                                                                                                                                                                              |
| 11H           | Reserved                       |                                                                                                                                                                                                        |                                                                                                                                                                                                                                                                                                                                                                                              |
| 12H           | Branches                       | Number of taken and not<br>taken branches, including<br>conditional branches,<br>jumps, calls, returns,<br>software interrupts, and<br>interrupt returns.                                              | Also counted as taken branches are<br>serializing instructions, VERR and<br>VERW instructions, some segment<br>descriptor loads, hardware interrupts<br>(including FLUSH#), and programmatic<br>exceptions that invoke a trap or fault<br>handler. The pipe is not necessarily<br>flushed. The number of branches<br>actually executed is measured, not the<br>number of predicted branches. |
| 13H           | BTB_HITS                       | Number of BTB hits that occur.                                                                                                                                                                         | Hits are counted only for those instructions that are actually executed.                                                                                                                                                                                                                                                                                                                     |
| 14H           | TAKEN_BRANCH_<br>OR_BTB_HIT    | Number of taken<br>branches or BTB hits that<br>occur.                                                                                                                                                 | This event type is a logical OR of taken<br>branches and BTB hits. It represents an<br>event that may cause a hit in the BTB.<br>Specifically, it is either a candidate for a<br>space in the BTB or it is already in the<br>BTB.                                                                                                                                                            |
| 15H           | PIPELINE FLUSHES               | Number of pipeline<br>flushes that occur.<br>Pipeline flushes are<br>caused by BTB misses on<br>taken branches,<br>mispredictions,<br>exceptions, interrupts,<br>and some segment<br>descriptor loads. | The counter will not be incremented for<br>serializing instructions (serializing<br>instructions cause the prefetch queue<br>to be flushed but will not trigger the<br>Pipeline Flushed event counter) and<br>software interrupts (software interrupts<br>do not flush the pipeline).                                                                                                        |
| 16H           | INSTRUCTIONS_<br>EXECUTED                               | Number of instructions<br>executed (up to two per<br>clock).                                                                                                  | Invocations of a fault handler are<br>considered instructions. All hardware<br>and software interrupts and exceptions<br>will also cause the count to be<br>incremented. Repeat-prefixed string<br>instructions increment this counter<br>only once, even though the<br>repeat loop executes the same<br>instruction multiple times until the loop<br>criterion is satisfied. This applies to all<br>the repeat string instruction prefixes<br>(that is, REP, REPE, REPZ, REPNE, and<br>REPNZ). This counter also increments<br>only once for each HLT<br>instruction executed, regardless of how<br>many cycles the processor remains in<br>the HALT state. |
| 17H           | INSTRUCTIONS_<br>EXECUTED_ V PIPE                       | Number of instructions<br>executed in the V-pipe. It<br>indicates the number of<br>instructions that were<br>paired.                                          | This event is the same as the 16H<br>event except that it counts only the number<br>of instructions actually executed in the<br>V-pipe. |
| 18H           | BUS_CYCLE_<br>DURATION                                  | Number of clocks while a<br>bus cycle is in progress.<br>This event measures bus<br>use.                                                                      | The count includes HLDA, AHOLD, and BOFF# clocks.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| 19H           | WRITE_BUFFER_<br>FULL_STALL_<br>DURATION                | Number of clocks while<br>the pipeline is stalled due<br>to full write buffers.                                                                               | Full write buffers stall data memory<br>read misses, data memory write<br>misses, and data memory write hits to<br>S-state lines. Stalls on I/O accesses are<br>not included.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| 1AH           | WAITING_FOR_<br>DATA_MEMORY_<br>READ_STALL_<br>DURATION | Number of clocks while<br>the pipeline is stalled<br>while waiting for data<br>memory reads.                                                                  | Data TLB Miss processing is also<br>included in the count. The pipeline stalls<br>while a data memory read is in progress<br>including attempts to read that are not<br>bypassed while a line is being filled.                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| 1BH           | STALL ON WRITE<br>TO AN E- OR M-<br>STATE LINE          | Number of stalls on writes to E- or M-state lines                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| 1CH           | LOCKED BUS<br>CYCLE                                     | Number of locked bus<br>cycles that occur as the<br>result of the LOCK prefix<br>or LOCK instruction,<br>page-table updates, and<br>descriptor table updates. | Only the read portion of the locked<br>read-modify-write is counted. Split<br>locked cycles (SCYC active) count as<br>two separate accesses. Cycles<br>restarted due to BOFF# are not re-<br>counted.                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| 1DH           | I/O READ OR<br>WRITE CYCLE    | Number of bus cycles directed to I/O space.                                                                                                                                             | Misaligned I/O accesses will generate<br>two bus cycles. Bus cycles restarted<br>due to BOFF# are not re-counted.                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| 1EH           | NONCACHEABLE_<br>MEMORY_READS | Number of noncacheable<br>instruction or data<br>memory read bus cycles.<br>Count includes read<br>cycles caused by TLB<br>misses, but does not<br>include read cycles to I/O<br>space. | Cycles restarted due to BOFF# are not re-counted.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| 1FH           | PIPELINE_AGI_<br>STALLS       | Number of address<br>generation interlock (AGI)<br>stalls. An AGI occurring in<br>both the U- and V-<br>pipelines in the same<br>clock signals this event<br>twice.                     | An AGI occurs when the instruction in<br>the execute stage of either the U- or<br>V-pipeline is writing to either the index or<br>base address register of an instruction<br>in the D2 (address generation) stage of<br>either the U- or V-pipeline. |
| 20H           | Reserved                      |                                                                                                                                                                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| 21H           | Reserved                      |                                                                                                                                                                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| 22H           | FLOPS                         | Number of floating-point operations that occur.                                                                                                                                         | Floating-point adds,<br>subtracts, multiplies, divides,<br>remainders, and square roots are<br>counted. The transcendental<br>instructions consist of multiple adds and<br>multiplies and will signal this event<br>multiple times. Instructions generating<br>the divide-by-zero, negative square<br>root, special operand, or stack<br>exceptions will not be counted.<br>Instructions generating all other<br>floating-point exceptions will be<br>counted. The integer multiply<br>instructions and other instructions<br>that use the FPU will be counted. |



| Event<br>Num. | Mnemonic Event<br>Name                 | Description                                                                                                                                          | Comments                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|---------------|----------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 23H | BREAKPOINT<br>MATCH ON DR0<br>REGISTER | Number of matches on register DR0 breakpoint. | The counter is incremented regardless of whether the breakpoints are enabled. However, if breakpoints are not enabled, code breakpoint matches will not be checked for instructions executed in the V-pipe and will not cause this counter to be incremented. (When breakpoints are not enabled, they are checked only on instructions executed in the U-pipe.) These events correspond to the signals driven on the BP[3:0] pins. Refer to Chapter 15, Debugging and Performance Monitoring, for more information. |
| 24H           | BREAKPOINT<br>MATCH ON DR1<br>REGISTER | Number of matches on register DR1 breakpoint.                                                                                                        | Refer to comment for 23H event.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| 25H           | BREAKPOINT<br>MATCH ON DR2<br>REGISTER | Number of matches on register DR2 breakpoint.                                                                                                        | Refer to comment for 23H event.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| 26H           | BREAKPOINT<br>MATCH ON DR3<br>REGISTER | Number of matches on register DR3 breakpoint.                                                                                                        | Refer to comment for 23H event.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| 27H           | HARDWARE<br>INTERRUPTS                 | Number of taken INTR and NMI interrupts.                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| 28H           | DATA_READ_OR_<br>WRITE                 | Number of memory data<br>reads and/or writes<br>(internal data cache hit<br>and miss combined).                                                      | Split cycle reads and writes are counted<br>individually. Data Memory Reads that<br>are part of TLB miss processing are not<br>included. These events may occur at a<br>maximum of two per clock. I/O is not<br>included.                                                                                                                                                                                                                                                                                                                                |
| 29H | DATA_READ_MISS_<br>OR_WRITE_MISS | Number of memory read and/or write accesses that miss the internal data cache, whether or not the access is cacheable or noncacheable. | Additional reads to the same cache line after the first BRDY# of the burst line fill is returned but before the final (fourth) BRDY# has been returned will not cause the counter to be incremented additional times. Data accesses that are part of TLB miss processing are not included. Accesses directed to I/O space are not included. |

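The counts from events 28H and 29H can be combined into an approximate data-cache miss ratio. The following is a minimal sketch in Python, assuming both counts were collected over the same code region; the function name and the sample values are hypothetical and do not come from any measurement. Because the counter values are intended as a relative guide, the result should be read as an estimate.

```python
def data_cache_miss_ratio(accesses_28h: int, misses_29h: int) -> float:
    """Return the fraction of data reads/writes that missed the data cache.

    accesses_28h: count of event 28H (hits and misses combined).
    misses_29h:   count of event 29H (misses only).
    Both parameter names are illustrative, not part of any Intel interface.
    """
    if accesses_28h <= 0:
        raise ValueError("event 28H count must be positive")
    if misses_29h < 0:
        raise ValueError("event 29H count must not be negative")
    return misses_29h / accesses_28h


# Hypothetical counter readings used only to show the call.
ratio = data_cache_miss_ratio(accesses_28h=200_000, misses_29h=5_000)
print(f"approximate data-cache miss ratio: {ratio:.3f}")
```

Event 29H does not count additional reads to a line whose burst fill is still in progress, so this ratio may slightly understate the share of accesses that had to wait for memory.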


| Event<br>Num. | Mnemonic Event<br>Name | Description | Comments |
|---------------|------------------------|-------------|----------|
| 2AH | BUS_OWNERSHIP_<br>LATENCY (Counter<br>0) | The time from LRM bus ownership request to bus ownership granted (that is, the time from the earlier of a PBREQ (O), PHITM# or HITM# assertion to a PBGNT assertion). | The ratio of the 2AH events counted on counter 0 and counter 1 is the average stall time due to bus ownership conflict. |
| 2AH | BUS_OWNERSHIP_<br>TRANSFERS<br>(Counter 1) | The number of bus ownership transfers (that is, the number of PBREQ (O) assertions). | The ratio of the 2AH events counted on counter 0 and counter 1 is the average stall time due to bus ownership conflict. |
| 2BH                  | MMX_<br>INSTRUCTIONS_<br>EXECUTED_<br>U-PIPE (Counter 0)               | Number of MMX <sup>™</sup><br>instructions executed in<br>the U-pipe.                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                         |
| 2BH                  | MMX_<br>INSTRUCTIONS_<br>EXECUTED_<br>V-PIPE (Counter 1)               | Number of MMX <sup>™</sup><br>instructions executed in<br>the V-pipe.                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                         |
| 2CH                  | CACHE_M-<br>STATE_LINE_<br>SHARING<br>(Counter 0)                      | Number of times a<br>processor identified a hit<br>to a modified line due to a<br>memory access in the<br>other processor (PHITM<br>(O)).                                                                                          | If the average memory latencies of the<br>system are known, this event enables<br>the user to count the Write Backs on<br>PHITM(O) penalty and the Latency on<br>Hit Modified(I) penalty.                                                                                                                               |
| 2CH                  | CACHE_LINE_<br>SHARING<br>(Counter 1)                                  | Number of shared data<br>lines in the L1 cache<br>(PHIT (O)).                                                                                                                                                                      |                                                                                                                                                                                                                                                                                                                         |
| 2DH                  | EMMS_<br>INSTRUCTIONS_<br>EXECUTED<br>(Counter 0)                      | Number of EMMS instructions executed.                                                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                         |
| 2DH | TRANSITIONS_<br>BETWEEN_MMX_<br>AND_FP_<br>INSTRUCTIONS<br>(Counter 1) | Number of transitions between MMX™ and floating-point instructions or vice versa. An even count indicates the processor is in MMX™ state; an odd count indicates it is in FP state. | This event counts the first floating-point instruction following an MMX™ instruction or the first MMX™ instruction following a floating-point instruction. The count may be used to estimate the penalty in transitions between floating-point state and MMX™ state. |
| 2EH | BUS_UTILIZATION_<br>DUE_TO_<br>PROCESSOR_<br>ACTIVITY<br>(Counter 0) | Number of clocks the bus is busy due to the processor's own activity, that is, the bus activity that is caused by the processor. |  |

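The comment for event 2AH describes a ratio of the two counters. A minimal sketch of that arithmetic in Python follows, assuming counter 0 accumulates bus-ownership latency and counter 1 counts ownership transfers over the same interval; the function name, the unit (assumed here to be clocks), and the sample values are illustrative assumptions.

```python
def average_bus_ownership_stall(latency_2ah_c0: int, transfers_2ah_c1: int) -> float:
    """Average stall per bus-ownership transfer (event 2AH, counter 0 / counter 1).

    Returns 0.0 when no transfers were counted, since there is then
    nothing to average over.
    """
    if latency_2ah_c0 < 0 or transfers_2ah_c1 < 0:
        raise ValueError("counter values must not be negative")
    if transfers_2ah_c1 == 0:
        return 0.0
    return latency_2ah_c0 / transfers_2ah_c1


# Hypothetical readings: 12,000 clocks of latency over 400 transfers.
print(average_bus_ownership_stall(12_000, 400))
```

Returning 0.0 for an empty interval is a design choice of this sketch; a caller that needs to tell "no transfers" apart from "no stall" could check counter 1 before calling.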


| Event<br>Num. | Mnemonic Event<br>Name                                          | Description                                                                                                                                  | Comments                                                                                                                                                                                                                                                                                                                                           |
|---------------|-----------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 2EH           | WRITES_TO_<br>NONCACHEABLE_<br>MEMORY<br>(Counter 1)            | Number of write accesses to noncacheable memory.                                                                                             | The count includes write cycles caused<br>by TLB misses and I/O write cycles.<br>Cycles restarted due to BOFF# are not<br>re-counted.                                                                                                                                                                                                              |
| 2FH           | SATURATING_<br>MMX_<br>INSTRUCTIONS_<br>EXECUTED<br>(Counter 0) | Number of saturating<br>MMX <sup>™</sup> instructions<br>executed, independently<br>of whether they actually<br>saturated.                   |                                                                                                                                                                                                                                                                                                                                                    |
| 2FH | SATURATIONS_<br>PERFORMED<br>(Counter 1) | Number of MMX™ instructions that used saturating arithmetic and for which at least one result actually saturated. | If an MMX™ instruction operating on 4 words saturated in three out of the four results, the counter will be incremented by one only. |
| 30H | NUMBER_OF_<br>CYCLES_NOT_IN_<br>HALT_STATE<br>(Counter 0) | Number of cycles during which the processor is not idle due to the HLT instruction. | This event will enable the user to calculate "net CPI". Note that during the time that the processor is executing the HLT instruction, the Time-Stamp Counter is not disabled. Because this event is controlled by the Counter Controls CC0 and CC1, it can be used to calculate the CPI at CPL=3, which the TSC cannot provide. |
| 30H           | DATA_CACHE_<br>TLB_MISS_<br>STALL_DURATION<br>(Counter 1)       | Number of clocks the<br>pipeline is stalled due to a<br>data cache translation<br>look-aside buffer (TLB)<br>miss.                           |                                                                                                                                                                                                                                                                                                                                                    |
| 31H           | MMX_<br>INSTRUCTION_<br>DATA_READS<br>(Counter 0)               | Number of MMX™<br>instruction data reads.                                                                                                    |                                                                                                                                                                                                                                                                                                                                                    |
| 31H           | MMX_<br>INSTRUCTION_<br>DATA_READ_<br>MISSES<br>(Counter 1)     | Number of MMX™<br>instruction data read<br>misses.                                                                                           |                                                                                                                                                                                                                                                                                                                                                    |
| 32H           | FLOATING_POINT_<br>STALLS_DURATION<br>(Counter 0)               | Number of clocks while<br>pipe is stalled due to a<br>floating-point freeze.                                                                 |                                                                                                                                                                                                                                                                                                                                                    |
| 32H           | TAKEN_BRANCHES<br>(Counter 1)                                   | Number of taken branches.                                                                                                                    |                                                                                                                                                                                                                                                                                                                                                    |
| 33H | D1_STARVATION_<br>AND_FIFO_IS_<br>EMPTY<br>(Counter 0) | Number of times the D1 stage cannot issue ANY instructions because the FIFO buffer is empty. | The D1 stage can issue 0, 1, or 2 instructions per clock if those are available in the instruction FIFO buffer. |

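The "net CPI" mentioned for event 30H (counter 0) is the number of non-halted cycles divided by the number of instructions executed. The sketch below, in Python, assumes the instruction count is obtained separately, for example from the Instruction Executed event (16H) over the same interval; the function name and the sample values are hypothetical.

```python
def net_cpi(non_halt_cycles_30h: int, instructions_executed: int) -> float:
    """Cycles per instruction, excluding cycles spent halted by HLT.

    non_halt_cycles_30h:   count of event 30H on counter 0.
    instructions_executed: instruction count for the same interval
                           (assumed to come from event 16H).
    """
    if instructions_executed <= 0:
        raise ValueError("instruction count must be positive")
    if non_halt_cycles_30h < 0:
        raise ValueError("cycle count must not be negative")
    return non_halt_cycles_30h / instructions_executed


# Hypothetical readings: 1.5 million non-halted cycles, 1 million instructions.
print(net_cpi(1_500_000, 1_000_000))
```

Restricting both counters to the same privilege level through the counter controls keeps the two counts comparable.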


| Event<br>Num. | Mnemonic Event<br>Name                                                                                       | Description                                                                                                                                    | Comments                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
|---------------|--------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 33H | D1_STARVATION_<br>AND_ONLY_ONE_<br>INSTRUCTION_IN_<br>FIFO<br>(Counter 1) | Number of times the D1 stage issues just a single instruction because the FIFO buffer had just one instruction ready. | The D1 stage can issue 0, 1, or 2 instructions per clock if those are available in the instruction FIFO buffer. When combined with the previously defined events, Instruction Executed (16H) and Instruction Executed in the V-pipe (17H), this event enables the user to calculate the number of times pairing rules prevented the issuing of two instructions. |
| 34H           | MMX_<br>INSTRUCTION_<br>DATA_WRITES<br>(Counter 0)                                                           | Number of data writes<br>caused by MMX <sup>™</sup><br>instructions.                                                                           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| 34H           | MMX_<br>INSTRUCTION_<br>DATA_WRITE_<br>MISSES<br>(Counter 1)                                                 | Number of data write<br>misses caused by MMX™<br>instructions.                                                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| 35H | PIPELINE_<br>FLUSHES_DUE_<br>TO_WRONG_<br>BRANCH_<br>PREDICTIONS<br>(Counter 0) | Number of pipeline flushes due to wrong branch predictions resolved in either the E-stage or the WB-stage. | The count includes any pipeline flush due to a branch that the pipeline did not follow correctly. It includes cases where a branch was not in the BTB, cases where a branch was in the BTB but was mispredicted, and cases where a branch was correctly predicted but to the wrong address. Branches are resolved in either the Execute stage (E-stage) or the Writeback stage (WB-stage). In the latter case, the misprediction penalty is larger by one clock. The difference between the 35H event count in counter 0 and counter 1 is the number of E-stage resolved branches. |
| 35H           | PIPELINE_<br>FLUSHES_DUE_<br>TO_WRONG_<br>BRANCH_<br>PREDICTIONS_<br>RESOLVED_IN_<br>WB-STAGE (Counter<br>1) | Number of pipeline<br>flushes due to wrong<br>branch predictions<br>resolved in the WB-stage.                                                  | Refer to note for event 35H (Counter 0).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| 36H           | MISALIGNED_<br>DATA_MEMORY_<br>REFERENCE_ON_<br>MMX_<br>INSTRUCTIONS<br>(Counter 0)                          | Number of misaligned<br>data memory references<br>when executing MMX <sup>™</sup><br>instructions.                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |

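The comment for event 35H gives the E-stage share as the difference between the two counters. A minimal sketch of that subtraction in Python follows, assuming counter 0 and counter 1 were both programmed with event 35H over the same interval; the function name and the sample values are hypothetical. Since WB-stage resolution costs one extra clock per flush, the WB-stage count also approximates the additional penalty clocks.

```python
def split_mispredict_flushes(total_35h_c0: int, wb_stage_35h_c1: int) -> tuple[int, int]:
    """Split event 35H flushes into (E-stage resolved, WB-stage resolved).

    total_35h_c0:    counter 0, flushes resolved in either stage.
    wb_stage_35h_c1: counter 1, flushes resolved in the WB-stage only.
    """
    if total_35h_c0 < 0 or wb_stage_35h_c1 < 0:
        raise ValueError("counter values must not be negative")
    if wb_stage_35h_c1 > total_35h_c0:
        raise ValueError("WB-stage count exceeds the total count")
    e_stage = total_35h_c0 - wb_stage_35h_c1
    return e_stage, wb_stage_35h_c1


# Hypothetical readings: 1,000 flushes in total, 300 resolved in the WB-stage.
e_stage, wb_stage = split_mispredict_flushes(1_000, 300)
print(f"E-stage: {e_stage}, WB-stage: {wb_stage}, extra WB clocks: {wb_stage}")
```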
| Event<br>Num. | Mnemonic Event<br>Name | Description | Comments |  |
|---------------|------------------------|-------------|----------|--|
| 36H | PIPELINE_<br>STALL_FOR_MMX_<br>INSTRUCTION_<br>DATA_MEMORY_<br>READS<br>(Counter 1) | Number of clocks during pipeline stalls caused by waits from MMX™ instruction data memory reads. |  |  |
| 37H | MISPREDICTED_<br>OR_<br>UNPREDICTED_<br>RETURNS<br>(Counter 0) | Number of returns predicted incorrectly or not predicted at all. | The count is the difference between the total number of executed returns and the number of returns that were correctly predicted. Only RET instructions are counted (for example, IRET instructions are not counted). |  |
| 37H | PREDICTED_<br>RETURNS<br>(Counter 1) | Number of predicted returns (whether they are predicted correctly or incorrectly). | Only RET instructions are counted (for example, IRET instructions are not counted). |  |
| 38H                          | MMX_MULTIPLY_<br>UNIT_INTERLOCK<br>(Counter 0)                                       | Number of clocks the pipe<br>is stalled because the<br>destination of a previous<br>MMX <sup>™</sup> multiply<br>instruction is not yet<br>ready.                                            | The counter will not be incremented if<br>there is another cause for a stall. For<br>each occurrence of a multiply interlock<br>this event will be counted twice (if the<br>stalled instruction comes on the next<br>clock after the multiply) or once (if the<br>stalled instruction comes two clocks<br>after the multiply).   |  |
| 38H                          | MOVD/MOVQ_<br>STORE_STALL_<br>DUE_TO_<br>PREVIOUS_MMX_<br>OPERATION<br>(Counter 1)   | Number of clocks a<br>MOVD/MOVQ instruction<br>store is stalled in D2 stage<br>due to a previous MMX <sup>™</sup><br>operation with a<br>destination to be used in<br>the store instruction. |                                                                                                                                                                                                                                                                                                                                  |  |
| 39H                          | RETURNS<br>(Counter 0)                                                               | Number of returns<br>executed.                                                                                                                                                               | Only RET instructions are counted;<br>IRET instructions are not counted. Any<br>exception taken on a RET instruction<br>and any interrupt recognized by the<br>processor on the instruction boundary<br>prior to the execution of the RET<br>instruction will also cause this counter<br>to be incremented.                      |  |
| 39H                          | Reserved                                                                             |                                                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                                  |  |
| 3AH                          | BTB_FALSE_<br>ENTRIES<br>(Counter 0)                                                 | Number of false entries in the Branch Target Buffer.                                                                                                                                         | False entries are causes for<br>misprediction other than a wrong<br>prediction.                                                                                                                                                                                                                                                  |  |

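The arithmetic described in the comments for events 37H and 38H can be sketched in Python. This is an illustrative sketch only, not part of the processor documentation: the function names and the sample counts are hypothetical, and it assumes that the 39H RETURNS count is used as the total number of executed returns. Because 39H is also incremented by exceptions and interrupts around a RET instruction, the derived values are relative guides rather than exact figures.

```python
def correctly_predicted_returns(returns_executed: int,
                                mispredicted_or_unpredicted: int) -> int:
    """Estimate correctly predicted returns from two raw counter readings.

    returns_executed: reading for event 39H, RETURNS (assumed total).
    mispredicted_or_unpredicted: reading for event 37H,
        MISPREDICTED_OR_UNPREDICTED_RETURNS.
    """
    if returns_executed < 0 or mispredicted_or_unpredicted < 0:
        raise ValueError("counter readings must be non-negative")
    # The 37H count is (executed returns - correctly predicted returns),
    # so the correctly predicted count is recovered by subtraction.
    # Clamp at zero because the readings are not guaranteed to be exact.
    return max(0, returns_executed - mispredicted_or_unpredicted)


def return_misprediction_ratio(returns_executed: int,
                               mispredicted_or_unpredicted: int) -> float:
    """Fraction of executed returns that were mispredicted or unpredicted."""
    if returns_executed == 0:
        return 0.0
    return mispredicted_or_unpredicted / returns_executed


def multiply_interlock_count(next_clock_stalls: int,
                             two_clock_stalls: int) -> int:
    """Expected 38H MMX_MULTIPLY_UNIT_INTERLOCK count.

    Each interlock is counted twice when the stalled instruction comes on
    the next clock after the multiply, and once when it comes two clocks
    after the multiply.
    """
    return 2 * next_clock_stalls + two_clock_stalls
```

For example, with hypothetical readings of 1000 executed returns and 50 mispredicted or unpredicted returns, `correctly_predicted_returns(1000, 50)` gives 950.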


| Event<br>Num.                | Mnemonic Event<br>Name                                                                               | Description                                                                                                                         | Comments |
|------------------------------|------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------|
| 3AH                          | BTB_MISS_<br>PREDICTION_ON_<br>NOT-TAKEN_<br>BRANCH<br>(Counter 1)                                   | Number of times the BTB predicted a not-taken branch as taken.                                                                      |          |
| 3BH                          | FULL_WRITE_<br>BUFFER_STALL_<br>DURATION_<br>WHILE_<br>EXECUTING_MMX_<br>INSTRUCTIONS<br>(Counter 0) | Number of clocks while<br>the pipeline is stalled due<br>to full write buffers while<br>executing MMX <sup>™</sup><br>instructions. |          |
| 3BH                          | STALL_ON_MMX_<br>INSTRUCTION_<br>WRITE_TO_E-_OR_<br>M-STATE_LINE<br>(Counter 1)                      | Number of clocks during<br>stalls on MMX <sup>™</sup><br>instructions writing to E-<br>or M-state lines.                            |          |