Originally published on my blog in 2014
I still remember an interview I had around February 2001, in which an embedded firmware engineer talked about how his team wrote code:
"We write stuff in Assembler, because we're too lazy to write stuff in C."
Wait...what? I thought the whole purpose of C was to be a portable assembly, so you could control the bare metal correctly? I did get an inkling that, if you were that good, assembly could be seductive in its ability to let you do whatever you want.
This came to mind again when a former colleague of mine posed a similar question on Facebook the other night:
Pop quiz: When you run this, what prints out?
Basically, the above is a quiz to determine whether you understand loops, expressions versus statements, and the pre-decrement operator (--). Pre-decrement means that the value of the expression is the current value minus one, and the variable is assigned that decremented value. Post-decrement has the same effect on the variable (decrementing it), but the value of the expression is the PREVIOUS value.
As is my wont, I got the above wrong, but that's not the point. :-D
To check my answer, I sucked it into a quick C program using vim:
Compiling that program and using macOS's otool to dump the assembly gives you this:
Unoptimized version
Some things to note in the above:
- The compiler has done a faithful job of translating exactly the program (as-is) to assembler:
  - We load the variables in lines 9 and 10
  - We have the first loop in lines 11-22
  - The second loop (despite being a no-op) still exists, in lines 24-29
Compiler-optimized version
Things get slightly more interesting when you pass the -O (optimize) flag:
Some things to note:
- This looks nothing like the C code. There are no loops (or indeed, branch instructions) at all.
- The compiler determined the second loop to be a no-op, and compiled it away completely.
- Our stack variables are gone. The compiler is using x64 CPU registers exclusively.
- The compiler has analyzed the loop and unrolled it into discrete callq instructions, one per call to the printf function.
Lastly: The answer to the quiz is in the assembly if you look hard enough:
5 9
5 8
5 7
5 6
5 5
5 4
5 3
5 2
5 1
Pretty cool... I never get to look at assembly in my day job, so getting this close to the CPU was a neat
Top comments (2)
AMD has a great guide on writing optimized C/C++ code.
Personally I think of C as kinda pseudocode for assembly: I can write stuff out without having to think as much, and it allows me to mess around without having to write as much code.
The compiler also does this to sum(1, ..., n), replacing the loop with the closed form n + n(n-1)/2. Pretty intelligent compiler devs.