xor eax, eax
will always set eax
to zero, right? So, why does MSVC++ sometimes put it in my executable's code? Is it more efficient than mov eax, 0
?
012B1002 in al,dx
012B1003 push ecx
int i = 5;
012B1004 mov dword ptr [i],5
return 0;
012B100B xor eax,eax
Also, what does it mean to do in al, dx
?
Yes, it is more efficient.
The encoding is shorter than mov eax, 0
, only 2 bytes instead of 5, and the processor recognizes the special case and treats it as a mov eax, 0
without a false read dependency on eax
, so the execution time is the same.
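To make the size difference concrete, here is a small Python sketch that just compares the raw machine-code bytes. The constant names are made up for illustration; the byte values are the standard 32-bit encodings (a compiler may also emit the equivalent 33 C0 form for the xor).

```python
# Standard 32-bit x86 encodings; the names are illustrative only.
XOR_EAX_EAX = bytes.fromhex("31c0")        # xor eax, eax (opcode + ModRM)
MOV_EAX_0 = bytes.fromhex("b800000000")    # mov eax, 0 (opcode B8 + 32-bit immediate)

print(len(XOR_EAX_EAX))  # 2
print(len(MOV_EAX_0))    # 5
```

The three extra bytes of the mov form are all part of the immediate operand, which the xor form does not need.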
It is also used to avoid zero bytes in the compiled code, as in shellcode for exploiting buffer overflows. Why avoid the 0? Well, 0 marks the end of a string in C/C++, and the shellcode would be truncated if the means of exploitation is a string-processing function or the like.
Btw, I'm referring to the original question: "Any reason to do a “xor eax, eax”?", not to what the MSVC++ compiler does.
Since there's some debate in the comments about how this is pertinent in the real world, see this article and this section on Wikipedia.
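As a rough illustration of the truncation point above, the following Python sketch mimics what a NUL-terminated string routine would see for each encoding. The helper name `as_c_string` is hypothetical, and the split on the first zero byte only emulates C string semantics; it is not a real exploit or a real C call.

```python
# Standard 32-bit encodings; names are illustrative only.
XOR_EAX_EAX = bytes.fromhex("31c0")        # xor eax, eax
MOV_EAX_0 = bytes.fromhex("b800000000")    # mov eax, 0

def as_c_string(payload: bytes) -> bytes:
    """Return the bytes before the first zero byte, as strcpy-style code would copy."""
    head, _, _ = payload.partition(b"\x00")
    return head

print(as_c_string(XOR_EAX_EAX) == XOR_EAX_EAX)  # True: no zero bytes, copied whole
print(len(as_c_string(MOV_EAX_0)))              # 1: cut off right after the B8 opcode
```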
.text
segment. I also can't see how terminating a string somehow allows someone to move the .data
segment to mirror the .text
segment to modify anything in the first place. Please be more specific than "MOST BASICIST TECHNIQUE"
xor eax, eax
is a faster way of setting eax
to zero. This is happening because you're returning zero.
The in
instruction is doing stuff with I/O ports. Basically, it reads a byte of data from the port specified in dx
and stores it in al
. It's not clear why it is happening here. Here's a reference that seems to explain it in detail.
Another reason to use XOR reg, reg
or XORPS reg, reg
is to break dependency chains. This allows the CPU to execute the surrounding instructions in parallel more efficiently (even if it adds some more instruction throughput pressure).
AND reg, 0
, but not over MOV reg, 0
. Dep-chain breaking is a special case for xor
, but always the case for mov
. It doesn't get mentioned, leading to occasional confusion from people thinking that mov
has a false dependency on the old value being overwritten. But of course it doesn't.
op dest, src1, src2
instruction (e.g. VPSHUFB dest, src1, src2
or lea eax, [rbx + 2*rdx]
) breaks the dep chain on the old value of dest
. It's only notable when there's a false dependency on the old value: like mov ax, bx
, which (on AMD/Silvermont/P4, but not P6/SnB) has a false dep on the old value of eax
, even if you never read eax
. On Intel, the big notable one is that popcnt/lzcnt/tzcnt
have a false dep on their output
mov ax, bx / mov [mem], eax
has a dependency on the previous value of eax
, but it's not a false dependency. You're actually using those bits, so it's a true dependency.
mov reg,imm
can run on AGU ports as well as ALU where mov could have a back-end throughput advantage over xor zeroing for some surrounding code. But compilers always just use xor-zeroing when tuning for anything, and IMO that's the correct decision.
The XOR operation is indeed very fast. If the result is to set a register to zero, the compiler will often do it the fastest way it knows. A register XOR takes only one CPU cycle; on modern x86 CPUs a mov of an immediate is no slower to execute, but its encoding is longer (5 bytes versus 2 for eax).
Often compiler writers will even have different behaviors given different target CPU architectures.
from the OP > any reason to do "xor eax,eax"
return 0;
012B100B xor eax,eax
ret <-- OP doesn't show this
XOR EAX,EAX simply zeroes the EAX register. It executes faster than MOV EAX,$0 and doesn't need to fetch immediate data of 0 to load into eax.
It's very obvious this is the "return 0" that MSVC is optimizing. EAX is the register used to return a value from a function in MSVC.
eax
/ rax
as the register for return values. Also, immediate data doesn't have to be fetched, other than as a pre-requisite for instruction decoding. xor is shorter than mov, and even leaves more spare space in the uop cache line it's in, but neither of those effects are well described as "not having to fetch".
xor is often used to encrypt code, for example:
mov eax,[ecx+ValueHere]
xor eax,[ecx+ValueHere]
mov [ebx+ValueHere],esi
xor esi,[esp+ValueHere]
pop edi
mov [ebx+ValueHere],esi
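The reason xor is handy for this kind of obfuscation is that applying the same key twice restores the original bytes. A minimal Python sketch, assuming a single-byte key; the helper `xor_bytes` and the key value 0x5A are made up for illustration and are not taken from the listing above:

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a one-byte key (hypothetical helper)."""
    return bytes(b ^ key for b in data)

code = bytes.fromhex("31c0c3")       # xor eax, eax ; ret
masked = xor_bytes(code, 0x5A)       # "encrypted" form, no longer the same bytes
restored = xor_bytes(masked, 0x5A)   # the same operation undoes itself

print(restored == code)  # True
```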
The XOR instruction combines two values using logical exclusive OR (remember, OR uses inclusive OR). To understand XOR better, consider these two binary values:
1001010110
0101001101
If you XOR them, the result is 1100011011. When two bits on top of each other are equal, the resulting bit is 0; otherwise the resulting bit is 1. You can use calc.exe to calculate XOR.
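If calc.exe is not at hand, the same bit rule can be checked with a few lines of Python. This sketch uses the two values from the example; the variable names are illustrative only.

```python
a = 0b1001010110
b = 0b0101001101

xor_result = format(a ^ b, "010b")  # exclusive OR: 1 only where the bits differ
or_result = format(a | b, "010b")   # inclusive OR: 1 where either bit is set

print(xor_result)  # 1100011011
print(or_result)   # 1101011111
print(a ^ a)       # 0
```

The last line is the reason xor eax, eax clears the register: every bit is equal to itself, so every result bit is 0.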