Observing stale instruction fetching on x86 with self-modifying code by userbinator

I think you should check the MACHINE_CLEARS.SMC performance counter (part of the MACHINE_CLEARS event) on your CPU. It is available on Sandy Bridge, which is used in your MacBook Air, and also on your Xeon, which is Nehalem (search for “smc”). You can use oprofile, perf or Intel’s VTune to read its value:

http://software.intel.com/sites/products/documentation/doclib/iss/2013/amplifier/lin/ug_docs/GUID-F0FD7660-58B5-4B5D-AA9A-E1AF21DDCA0E.htm

Machine Clears

Metric Description

Certain events require the entire pipeline to be cleared and restarted from just after the last retired instruction. This metric measures three such events: memory ordering violations, self-modifying code, and certain loads to illegal address ranges.

Possible Issues

A significant portion of execution time is spent handling machine clears. Examine the MACHINE_CLEARS events to determine the specific cause.

SMC: http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/amplifierxe/win/win_reference/snb/events/machine_clears.html

MACHINE_CLEARS Event Code: 0xC3
SMC Mask: 0x04

Self-modifying code (SMC) detected.

Number of self-modifying-code machine clears detected.
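
As a minimal sketch of how the event code and mask above could be passed to perf: this assumes Linux perf's raw-event syntax (`r` followed by the hex config) and the usual Intel layout with the event select in bits 0-7 and the unit mask in bits 8-15. The helper names and the `./smc_test` binary are hypothetical, not something from the question.

```python
def perf_raw_event(event_code, umask):
    """Build a perf raw-event string from an Intel event code and unit mask."""
    if not (0 <= event_code <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event code and umask must each fit in one byte")
    # Assumed Intel layout: umask in bits 8-15, event select in bits 0-7.
    config = (umask << 8) | event_code
    return "r{:04x}".format(config)


def perf_stat_command(program):
    """Argument list for counting SMC machine clears while running `program`."""
    smc = perf_raw_event(0xC3, 0x04)  # MACHINE_CLEARS.SMC values quoted above
    return ["perf", "stat", "-e", smc, "--", program]


if __name__ == "__main__":
    # "./smc_test" is a placeholder name for the program under test.
    print(" ".join(perf_stat_command("./smc_test")))
```

Running the printed command on the affected machine should report a count for the raw event. Whether that count maps one-to-one to pipeline restarts is exactly what needs to be measured.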

Intel also says this about SMC at http://software.intel.com/en-us/forums/topic/345561 (linked from the Intel Performance Bottleneck Analyzer’s taxonomy):

This event fires when self-modifying code is detected. This can be typically used by folks who do binary editing to force it to take certain path (e.g. hackers). This event counts the number of times that a program writes to a code section. Self-modifying code causes a severe penalty in all Intel 64 and IA-32 processors. The modified cache line is written back to the L2 and LLC caches. Also, the instructions would need to be re-loaded hence causing performance penalty.

I think you will see some of these events. If you do, then the CPU was able to detect the self-modification and raised a “machine clear”, a full restart of the pipeline. The first stages are fetch, and they will ask the L2 cache for the new opcode. I’m very interested in the exact count of SMC events per execution of your code; this would give us some estimate of the latencies. (SMC is counted in units where 1 unit is assumed to be 1.5 CPU cycles; see B.6.2.6 of the Intel optimization manual.)

We can see that Intel says “restarted from just after the last retired instruction”, so I think the last retired instruction will be the mov, and your nops are already in the pipeline. But the SMC machine clear will be raised at the mov’s retirement, and it will kill everything in the pipeline, including the nops.

This SMC-induced pipeline restart is not cheap. Agner has some measurements in Optimizing_assembly.pdf, “17.10 Self-modifying code (All processors)” (I think any Core2/CoreiX is like PM here):

The penalty for executing a piece of code immediately after modifying it is approximately 19 clocks for P1, 31 for PMMX, and 150-300 for PPro, P2, P3, PM. The P4 will purge the entire trace cache after self-modifying code. The 80486 and earlier processors require a jump between the modifying and the modified code in order to flush the code cache.
…

Self-modifying code is not considered good programming practice. It should be used only if
the gain in speed is substantial and the modified code is executed so many times that the
advantage outweighs the penalties for using self-modifying code.
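
To put the quoted figures in perspective, here is a rough sketch that turns a measured SMC count into a range of lost clock cycles. It assumes the 150-300 clock penalty Agner gives for PPro through PM also applies to newer cores, which is only my guess above. The function name and the sample count of 1000 are made up for illustration.

```python
def smc_penalty_range(smc_clears, low_clocks=150, high_clocks=300):
    """Estimate (low, high) clock cycles lost to SMC machine clears.

    The default per-clear penalties are Agner's PPro/P2/P3/PM figures;
    applying them to Core2 or later CPUs is an assumption.
    """
    if smc_clears < 0:
        raise ValueError("smc_clears must be non-negative")
    return smc_clears * low_clocks, smc_clears * high_clocks


if __name__ == "__main__":
    # 1000 is a made-up count, not a measurement.
    low, high = smc_penalty_range(1000)
    print("estimated penalty: {} to {} clocks".format(low, high))
```

Comparing this range against the total cycle count of a run gives a quick check of whether the machine clears can account for the slowdown being observed.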

Using different linear addresses to defeat the SMC detector was recommended here:
https://stackoverflow.com/a/10994728/196561 – I’ll try to find actual intel docume
