Friday, November 25, 2016

Introduction to x86 assembly for Linux - Hello world

Introduction to Assembly

Assembly language is the closest most programmers get to the hardware. You can write any software in assembly, and you can also write almost all of the same software without a single line of it.

So why do people still learn assembly?

There are some tasks that assembly handles better than any other language, such as bootloaders and low-level kernel functions. It is also used inside high-level language code to optimize performance.

Assembly is not only used for programming; nowadays it is mostly used in debugging.

How does it work?

Assembly language is built from the instructions the CPU provides, not from statements like most other languages.

Each processor family has its own instruction set, so you can't execute ARM assembly code on an Intel processor, or 64-bit assembly code on a 16-bit processor.

To check whether your PC matches the code I will write, type this command in the terminal:

uname -m

If the output is x86_64, you are ready to follow along. (uname -i also works on some distributions, but on others it only prints "unknown".)


Requirements:

nasm - type sudo apt-get install nasm to install it.
ld linker - it is usually already installed (it is part of GNU binutils).
Any text editor.
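
If you want to confirm that both tools are reachable from your terminal before continuing, a minimal sketch like the following should do. The check_tool name is a helper made up for this example, not a standard command.

```shell
# check_tool is a hypothetical helper name, not a standard command.
# It reports whether a program can be found on your PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

check_tool nasm   # the assembler
check_tool ld     # the linker
```

If either line says "not found", install the missing tool before moving on.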

Hello World

I will assume you know what a register is and know a little bit about assembly.
I will talk about these basics later on my YouTube channel; it's easier to talk than to type :D

The code is below, discussed and explained in the comments.

Wait, I assumed you know about registers and the basics of assembly, so it would be silly to write simple code that only prints Hello world and exits. I will make it a little more advanced and write a subroutine to print the message.


;Data section.
;All our variables should be declared here.

section .data
    msg: db "Hello, World!",10,0 ;the text, a newline (10) and a terminating 0 byte

;Text section.
;Think of it as the code section.
;Our code goes here.

section .text
    global _start ;This is important: it tells the linker where our entry point is

_start:

;Skip this for now: scroll down, read the subroutine code, then come back here
    mov rax, msg  ;store the address of our string (the msg label) in rax
    call echo     ;call our function

    
    mov rax, 60 ;system call number 60 is exit
    mov rdi, 0  ;exit status 0
    syscall     ;rax=60, rdi=0 exits the program cleanly


;input = string address in rax (the string must end with a 0 byte).
;output = printed string.
echo: ;if you are familiar with any high-level language, treat this label as a function name
    push rax    ;save the string address on the stack, we need it later for the write call
    mov rbx, 0  ;rbx counts the characters in our string:
    ;we compare each byte (char) of the string with 0
    ;and, if it is not 0, increase rbx by one
    
print:
    mov cl, [rax] ;load the byte that rax points to into cl
    cmp cl, 0     ;is this byte = 0?
    je .done      ;if it is zero we reached the end of the string, stop counting
    inc rax       ;move rax to the next character so we read a different char every loop
    inc rbx       ;increase rbx, the counter
    jmp print     ;we still have chars to count, loop again

.done:
    mov rax, 1    ;system call number 1 is write
    mov rdi, 1    ;file descriptor 1 is standard output
    pop rsi       ;the address of our string was saved on the stack, pop it into rsi
    mov rdx, rbx  ;rbx holds the length of our string, write expects it in rdx
    syscall       ;call the kernel

    ret           ;return to where we called the routine
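
To try the program, it has to be assembled with nasm and linked with ld. The sketch below assumes the listing was saved as hello.asm in the current directory; that file name and the build_and_run helper are made up for this example, so adjust them to whatever you used.

```shell
# Assumption: the listing above was saved as hello.asm in the current
# directory. The file name and the build_and_run helper are hypothetical.
build_and_run() {
    base="${1%.asm}"                       # hello.asm -> hello
    nasm -f elf64 "$1" -o "$base.o" &&     # assemble into a 64-bit ELF object file
    ld "$base.o" -o "$base" &&             # link; ld uses _start as the default entry point
    "./$base"                              # run the resulting executable
}

if [ -f hello.asm ]; then
    build_and_run hello.asm
    echo "exit status: $?"                 # if the build succeeded, the value the program put in rdi
else
    echo "hello.asm not found, save the listing first"
fi
```

The -f elf64 option matters here: the code uses 64-bit registers and the syscall instruction, so it needs a 64-bit ELF object file.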