~ C vs ASM ~

by kalekale on 2023-12-30
comparing C and ASM programs

Euclid's GCD algorithm

given by gcd(a,b) = gcd(b, a mod b) {until b is 0, at which point a is the answer}

gcd.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <gcd>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	48 83 ec 10          	sub    $0x10,%rsp
   8:	89 7d fc             	mov    %edi,-0x4(%rbp)
   b:	89 75 f8             	mov    %esi,-0x8(%rbp)
   e:	83 7d f8 00          	cmpl   $0x0,-0x8(%rbp)
  12:	75 05                	jne    19 
  14:	8b 45 fc             	mov    -0x4(%rbp),%eax
  17:	eb 13                	jmp    2c 
  19:	8b 45 fc             	mov    -0x4(%rbp),%eax
  1c:	99                   	cltd
  1d:	f7 7d f8             	idivl  -0x8(%rbp)
  20:	8b 45 f8             	mov    -0x8(%rbp),%eax
  23:	89 d6                	mov    %edx,%esi
  25:	89 c7                	mov    %eax,%edi
  27:	e8 00 00 00 00       	call   2c 
  2c:	c9                   	leave
  2d:	c3                   	ret

gcd.c:

int gcd(int a, int b) {
	if (b == 0) {
		return a;
	}
	else {
		return gcd(b, a%b);
	}
}
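
To see the recurrence in action before reading the assembly, here is a small sketch of the same recursion that prints every (a, b) pair it visits. The name gcd_trace is made up for this example and is not part of gcd.c.

```c
#include <stdio.h>

/* Same recursion as gcd.c, but prints each (a, b) pair on the way down.
   gcd_trace is a hypothetical helper name used only for illustration. */
int gcd_trace(int a, int b) {
	printf("gcd(%d, %d)\n", a, b);
	if (b == 0) {
		return a;
	}
	return gcd_trace(b, a % b);
}
```

Calling gcd_trace(48, 18) visits gcd(48, 18), gcd(18, 12), gcd(12, 6), gcd(6, 0) and returns 6.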

00-04 set up the stack frame and allocate 16 bytes of space. 08-0b store the arguments `int a` (passed in %edi) and `int b` (passed in %esi) on the stack, at %rbp-4 and %rbp-8.

0e-12 is where the first computation takes place. %rbp-8 has our `int b` argument, and the cmpl instruction compares 0x0 with the 32-bit value (a "long" in AT&T syntax) at address %rbp-8

 
0e:	83 7d f8 00          	cmpl   $0x0,-0x8(%rbp)
12:	75 05                	jne    19
14:	8b 45 fc             	mov    -0x4(%rbp),%eax
17:	eb 13                	jmp    2c

the jne instruction tells the cpu to jump to address 19 if the comparison is not equal. however if the value at %rbp-8 is equal to 0, the program moves on to the next instructions, 14-17, where it copies the value of `int a` into %eax, the return value register, and jumps to 2c where it leaves the function and returns the gcd computed.
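
The branch structure can be written back into C with gotos, as a rough sketch. The function name gcd_goto, the variable eax and the labels L19 and L2c are made up here, the labels being named after the disassembly addresses they stand for.

```c
/* Hypothetical goto-style rewrite of gcd(); not the original source.
   Each line is annotated with the disassembly addresses it mirrors. */
int gcd_goto(int a, int b) {
	int eax;                  /* stands in for the return register %eax */
	if (b != 0) goto L19;     /* 0e-12: cmpl $0x0, b ; jne 19 */
	eax = a;                  /* 14:    mov a, %eax */
	goto L2c;                 /* 17:    jmp 2c */
L19:
	eax = gcd_goto(b, a % b); /* 19-27: divide, load arguments, call gcd */
L2c:
	return eax;               /* 2c-2d: leave ; ret */
}
```

It computes the same result as gcd.c, for example gcd_goto(48, 18) is 6.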
19-27 is where the recursive call is set up and made.

1d:	f7 7d f8             	idivl  -0x8(%rbp)
20:	8b 45 f8             	mov    -0x8(%rbp),%eax
23:	89 d6                	mov    %edx,%esi
25:	89 c7                	mov    %eax,%edi

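1d performs the `idivl` instruction, but before that it should be noted that instructions 19-1c load `int a` into %eax and sign-extend it into the register pair %edx:%eax with the `cltd` instruction, because idivl divides the 64-bit value in %edx:%eax by its operand. the idivl instruction computes the quotient and remainder; the quotient is stored in %eax and the remainder in %edx. since we only need the remainder for the next iteration we set %esi (the second argument) to the value of %edx. the mov -0x8(%rbp),%eax is done to discard the quotient and get the value of `int b` from the stack slot we stored it in at 0b, and 25 copies it into %edi (the first argument). with the parameters set for the gcd() function we call <gcd> again at 27. the call target shows as 2c only because gcd.o is an unlinked object file; the linker fills in the real address of gcd. this is repeated until the program reaches instruction 14, and then each call leaves and returns from the function in turn.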
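
As a rough model of what `cltd` followed by `idivl` computes, the two instructions can be sketched in C. The function name idivl_model and its parameters are made up for this example, and it assumes the divisor is not 0 (which the b == 0 check guarantees in gcd).

```c
#include <stdint.h>

/* Hypothetical model of `cltd` followed by `idivl`, assuming divisor != 0.
   Returns the remainder (what lands in %edx) and stores the quotient
   (what lands in %eax) through the quotient pointer. */
int idivl_model(int eax, int divisor, int *quotient) {
	int64_t edx_eax = (int64_t)eax;        /* cltd: sign-extend %eax into %edx:%eax */
	*quotient = (int)(edx_eax / divisor);  /* idivl: quotient goes to %eax */
	return (int)(edx_eax % divisor);       /* idivl: remainder goes to %edx */
}
```

For a = 48 and b = 18 this gives quotient 2 and remainder 12, so the next call is gcd(18, 12). Like idivl, C division truncates toward zero, so -7 divided by 2 gives quotient -3 and remainder -1.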

conclusion

This demonstrates how a high level C program compares to machine code. The C compiler makes our lives easier by handling data type conversions and by managing addresses and operand sizes based on the operations we do. However it should also be noted that a C program only does what the programmer tells it to do: there are no hidden conversions or allocations, just a simple flow which is machine code in human readable form. C++ and Rust also provide a low level interface, but they make it very hard for a programmer to tell what his code is doing at the machine level. A good systems C programmer can tell you exactly what his program's machine code looks like, but a good C++/Rust programmer may not be able to explain what his program does on the machine level.