Recently I've run into some issues with Blogger: some things I'd like to do are not doable there, or not easy to do (so what should I focus on, creating content or arranging my blog?). So I decided to move to WordPress, where everything is easier. My new blog is https://llcode.wordpress.com
I'm waiting for you there.
Wednesday, October 11, 2017
Friday, October 6, 2017
How to start writing Assembly 16-bit code on your 64-bit machine
Hello there. The first thing that comes to anyone's mind after deciding to learn Assembly is: why can't I run my 16-bit assembly code on my PC? Why is there still 16-bit Assembly when all processors and OSs nowadays are 64-bit?
The answer is... history! Let me explain.
Do you know what the BIOS is? It's the first thing to run when you press the power button on your PC or laptop. BIOS stands for Basic Input/Output System. It is a piece of software stored on a small memory chip on the motherboard.
What is BIOS used for?
The BIOS tells your PC how to perform a number of basic functions that are important for the boot process (loading your OS), like identifying the hard drive, floppy drive, optical drive, memory, etc.
It also does some other things, like the POST (Power-On Self-Test).
To avoid getting bogged down in nitpicky details, let's get back on track.
The BIOS must run, and the processor has to be in the proper mode to run it. From here on, things are going to make sense.
The original IBM PC had an 8088 processor, a variant of the 8086 with an 8-bit data bus. Internally it was a 16-bit processor, but back then hardware designs were centered around 8-bit parts. So pairing a 16-bit processor with an 8-bit data bus helped it fit into the existing hardware market, and that made it more successful.
But there was another detail that helped the processor fit into the market: the ability to run pre-existing software (like the BIOS). The 8086 and 8088 were designed to be compatible at the assembly-language level with the earlier 8-bit 8080, which made it easier to port existing software. This became a recurring theme: each processor in the family has to be compatible somehow with what came before.
After that, when the 80386 came out, the first 32-bit processor in the family, the designers had a problem. The processor had to be able to run the existing 16-bit software or it was not going to succeed. The solution was to make the processor start in a 16-bit compatible mode (real mode); software then switches it to the 32-bit mode (protected mode) to make use of the other features that 32-bit processors offer, like a much larger addressable memory.
So the existing BIOS, which was written for 16-bit machines, could still work on 386 machines. MS-DOS, which was written for 16-bit machines, could still work on 386 machines.
And that's why you can run 16-bit assembly on a 32-bit OS. But what happened after? What about 64-bit?
64-bit processors offered a new mode, long mode, and the designers broke with this time-honored tradition. The BIOS was replaced by UEFI (Unified Extensible Firmware Interface), which allows computers to boot directly into protected mode, with the ability to switch into long mode. Software leaders like Microsoft removed the 16-bit subsystem from 64-bit Windows, since a processor running in long mode can no longer run 16-bit real-mode code the way 32-bit Windows did. Bye bye, real mode.
Now that you know the story, how do you get around that? How do you learn the easiest and most widely supported assembly, and then make your way up to the other variants?
There are many solutions; I'm going to mention some of them.
If you are on Windows, you can get emu8086. As its name implies, it's an 8086 emulator and debugger.
You can download it here
You can also get DOSBox. It's a DOS emulator that lets you use DOS on your modern OS, so you will be able to use any assembler you need (unlike emu8086, which uses FASM by default).
Get DOSBox here
The third option is to download a VM like VirtualBox or VMware Workstation and install any DOS-like OS on it, like FreeDOS.
If you are on Linux, you can still use DOSBox because it has been ported to Linux. And if you want to use emu8086, you can install Wine and run emu8086 through it. It runs perfectly; I've tried it under Linux. You can also use a VM, because VMware Workstation and VirtualBox are available on Linux too.
Wednesday, October 4, 2017
Hello Assembly
*wipes the dust off the blog* Hello there. It's been a long time since I last wrote, but today I want to talk about the Assembly language itself (unlike the previous article, which illustrated how to write your first Assembly program).
At the beginning of the computing revolution, computers were hardwired: the circuit of a computer was designed to perform a single operation.
Over time, it quickly became obvious that we needed a universal computer that could perform any task it is programmed to do.
And this was the real beginning (from my perspective) of the computing revolution.
When you talk about general-purpose computers, you ought to mention two people without whose efforts computers wouldn't have come true: Alan Turing and John von Neumann.
Alan Turing is famous for breaking the Enigma cipher in the Second World War, and he is the godfather of computing. He was the first one who proposed the idea of the "Turing machine", a machine that can compute anything that is computable. I won't go further into how he described it and what it really is, but it's all about something that can be written somewhere, then executed by something, which then produces something else as a result that can be written somewhere else, and so on.
This idea was picked up and taken further by von Neumann, who introduced the idea of a CPU with some memory that the CPU can alter after processing and executing instructions.
In fact, the vast majority of the computers we use nowadays follow the so-called "von Neumann architecture".
So, you might be wondering, what does all of this have to do with Assembly language?
OK, let me explain. When the first programmable computers came out, they were pretty hard to program, and the coding process was backbreaking even for a professional computer scientist.
They used to program those computers using only zeros and ones: a bunch of zeros and ones tells the CPU to do something. Change a single zero to a one and you will face the butterfly effect: the CPU will do something completely different from what you expected.
After that, they came up with an idea to solve that problem and get rid of programming in the error-prone machine language. If 01001010 means moving something from A to B, OK, I will substitute that value with a fixed word, and I will create a piece of software that understands this word and converts it back to its original value so the CPU can understand and execute it.
And this idea was the Assembly language (of course it's not that easy, it's more complicated than you might imagine, but I don't want to take you through mind-blowing details).
The program that converts Assembly code into machine code is called an assembler.
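To make the idea concrete, here is a toy sketch in C++ of what an assembler does at its core: look up a human-readable word and return the number the CPU expects. The function name assemble, the mnemonics and the second bit pattern are made up for illustration (only 01001010 comes from the example above); none of them are real 8086 opcodes.

```cpp
#include <map>
#include <string>

// Toy "assembler": returns the machine-code byte for a mnemonic,
// or -1 if the word is unknown.
// The table is hypothetical; these are NOT real 8086 opcodes.
int assemble(const std::string& mnemonic) {
    static const std::map<std::string, int> table = {
        {"MOVE_A_TO_B", 0x4A},  // 0x4A is 01001010 in binary, as in the text
        {"STOP",        0xF0},  // made-up second entry
    };
    std::map<std::string, int>::const_iterator it = table.find(mnemonic);
    return it == table.end() ? -1 : it->second;
}
```

A real assembler does much more (operands, addresses, labels), but the word-to-number lookup is the part that replaced writing zeros and ones by hand.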
Assembly language has many variants. Each processor family has its own Assembly language. Yeah, Assembly language is not portable: a program written in Assembly for an ARM computer cannot run on, for example, an Intel computer.
Shocked? OK, let me tell you more. Each variant has different versions, which span the gamut from 16-bit to 64-bit. Painful, I know.
But the most popular Assembly flavors today are ARM and x86. Google and read about them on your own, because in this article I'm going to talk about 16-bit Assembly for the Intel 8086 processor.
To illustrate the difference between Assembly and other programming languages, let me show you a snippet of Assembly and its equivalent in another language like C.
This code will print Hello World on the screen, simple enough.
In C, you can do it in a single line (after doing the essential stuff like declaring the main function and including stdio).
printf("Hello World");
The code explains itself. You are calling a function that prints the passed string on the screen. Pretty easy and straightforward.
What about Assembly?
OK, let's see...
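.data
str db "Hello World$"   ; the text to print, ending with '$'
.code
mov ah, 9               ; function 9 of int 21h: print a string
lea dx, str             ; dx = address of the first byte of str
int 21h                 ; call DOS
hlt                     ; halt the processor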
Is it hard? Then what about coding it using only 0s and 1s?
Let's go through the first segment, which is the data segment. Unlike in other programming languages, in Assembly you can't define variables wherever and whenever you want.
Variables are defined in the data segment, before getting to the code segment (some developers begin with the code segment and place the data segment underneath; that's OK, you can do either as long as the variables reside in the data segment).
The next line is str db "Hello World$"
In Assembly, the variable definition formula is var_name size_directive var_value
str is the variable name, db (define byte) tells the assembler that each element is a single byte, and Hello World is the variable value. Carry on; we will get to the dollar sign later.
Hey hey, hold on. Are you kidding? You are declaring a byte to store a string? How come!
Actually, str itself does not hold the whole string. It only stands for the memory address of the first letter, 'H', and by going through memory byte by byte from there you can access the whole string, until you get to the dollar sign. The dollar sign is the string delimiter: the DOS print function used in this program stops when it reaches it.
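The byte-by-byte walk up to the dollar sign can be modelled in C++. This is only a sketch of the behaviour, not DOS code: the function name dollar_string is made up, and a const char* stands in for the address that the label str represents.

```cpp
#include <string>

// Model of how a '$'-terminated string is read: start at the first byte
// and keep going until the '$' delimiter, which is not part of the text.
// The name dollar_string is hypothetical; the buffer MUST contain a '$',
// otherwise the loop would run past the end of the data.
std::string dollar_string(const char* first_byte) {
    std::string text;
    for (const char* p = first_byte; *p != '$'; ++p) {
        text += *p;  // copy one byte, then move to the next address
    }
    return text;
}
```

Calling dollar_string("Hello World$") collects the eleven characters before the delimiter. It also shows the limitation of this convention: the text itself cannot contain a dollar sign.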
Now, let's move to the next line, which is mov ah, 9
Let's discuss the mov instruction first. As its name implies, it moves (copies) something from one place to another. In this line, mov is putting 9 into the register AH.
Come on Mohamed! Stop saying mysterious stuff, what is a register? OK, a register is a temporary memory that can store a very tiny amount of data. It is a small data-holding place that is part of the processor itself. A processor usually has a set of registers; for example, the 8086 has more than 10.
The general-purpose registers in the 8086 (AX, BX, CX and DX) are 16-bit registers, each divided into two sub-registers. For AX, the main register, AH is the high 8 bits and AL is the low 8 bits.
E.g. if AH = 01010101 and AL = 11001011 then AX = 0101010111001011, clear enough.
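The relationship between the two halves and the full register is just a shift and an OR. Here is a small C++ sketch of it; the function names make_ax, high_byte and low_byte are made up for illustration. With AH = 01010101 (0x55) and AL = 11001011 (0xCB), the full 16-bit value is 0101010111001011 (0x55CB).

```cpp
#include <cstdint>

// Combine the high and low 8-bit halves into the full 16-bit value:
// AH occupies bits 15..8, AL occupies bits 7..0.
std::uint16_t make_ax(std::uint8_t ah, std::uint8_t al) {
    return static_cast<std::uint16_t>((ah << 8) | al);
}

// Split a 16-bit value back into its two halves.
std::uint8_t high_byte(std::uint16_t ax) {
    return static_cast<std::uint8_t>(ax >> 8);
}

std::uint8_t low_byte(std::uint16_t ax) {
    return static_cast<std::uint8_t>(ax & 0xFF);
}
```

This is also why writing to AH changes AX: they are the same storage, seen at two different widths.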
Let's get back on track. The register AH has the value 9. OK, but why? I will tell you later.
After that, we have lea dx, str. This line loads the address of the first character of our string into the DX register. lea stands for load effective address.
Then comes the most important line, which is int 21h
This is where the magic happens. For simplicity, you can think of this line as a function call in a higher-level language: int 21h calls the DOS service routine number 21h, and the parameters are 9 and str. Yeah, we placed them in registers so the interrupt could access them. When int 21h is reached, it looks at AH to get the function code and know which task it should do. 9 is for printing a string (you can use Google to find out what else this interrupt can do).
OK, I'm about to print, but what to print? int 21h will print whatever DX is pointing to, which in our case is str.
Last but not least, we have hlt. Think of it as the return 0; at the end of main in C or C++: it ends our program. Strictly speaking, hlt halts the processor rather than returning control to the OS; that is typically enough to stop the program in an emulator like emu8086, while a program running under real DOS would normally exit with int 21h function 4Ch so you can run another program. And that's it!
Assembly is not as hard as it seems. It's easy, very easy, but it's really complicated too. Being easy doesn't make it simple. C is hard but simple. Can you feel the difference?
Friday, November 25, 2016
Introduction to assembly x86 for Linux - Hello world
Introduction to Assembly
Assembly language is considered the closest language to the hardware nowadays. You can write any software using assembly. You can also write almost the same software without a single line of assembly.
Then why do people still learn assembly?
There are some tasks that assembly does better than any other language, like bootloaders and low-level kernel functions. It is also used inside high-level language code to optimize performance.
Assembly is not only used for programming; nowadays it is mostly used in debugging.
How it works?
Assembly language is based on the instructions provided by the CPU, not on statements like most other languages.
Each processor family has its own instruction set, so you can't execute ARM assembly code on an Intel processor, or 64-bit assembly code on a 16-bit processor.
To check whether your PC matches the code I will write, type this command in the terminal:
uname -m
If the output is x86_64, then you are ready to follow me.
Requirements:
nasm - type sudo apt-get install nasm to install it.
ld linker - it's usually already installed by default.
Any text editor
Hello World
I will assume you know what a register is and know a little bit about assembly.
I will talk about these basics later on my YouTube channel; it's easier to talk than to type :D
The code is below, discussed and explained in the comments.
Wait, I assumed you know about registers and the basics of assembly, so it's silly to write simple code that prints Hello world and exits. I will make it a little more advanced and write a subroutine to print out the message.
;Data section.
;All our variables should be declared here.
section .data
    msg: db "Hello, World!",10,0    ;the text, a newline (10) and a terminating 0

;Text section.
;Treat it as if its name were the code section.
;Our code goes here.
section .text
global _start       ;tells the linker where our main entry point is

_start:
    ;Skip this for now, scroll down and read the subroutine code,
    ;then come back here.
    mov rax, msg    ;store our string offset in rax
    call echo       ;call our function
    mov rax, 60
    mov rdi, 0
    syscall         ;rax=60 (exit), rdi=0 will perform a clean exit

;input = string offset in rax (the string must not be empty).
;output = printed string.
echo:               ;if you are familiar with any high-level language,
                    ;treat this label as a function name
    push rax        ;we need the string offset later,
                    ;so we save it on the stack
    mov rbx, 0      ;rbx counts the characters in our string:
                    ;we compare each byte (char) with 0 and,
                    ;if it's not 0, increase rbx by one
print:
    inc rax         ;move rax to the next character of our string
    inc rbx         ;increase rbx, the counter
    mov cl, [rax]   ;load the byte that rax points to into cl
    cmp cl, 0       ;is this byte = 0?
    jne print       ;if not, we still have chars to count, loop again
    ;if it's zero then the following block will be executed
    mov rax, 1      ;rax=1 is the write system call
    mov rdi, 1      ;rdi=1 is standard output
    pop rsi         ;pop the saved string offset from the stack into rsi
    mov rdx, rbx    ;rdx = length of our string, counted in rbx
    syscall         ;call the kernel
    ret             ;return to where we called the routine
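If the assembly is hard to follow, the same logic can be sketched in C++: first count the bytes up to the terminating 0, then hand the address and the length to a single write call. This is only a rough equivalent under stated assumptions, not a translation the assembler would produce: the name count_bytes is made up, and the code assumes a POSIX system (such as Linux) where write is declared in unistd.h.

```cpp
#include <cstddef>
#include <unistd.h>  // write (POSIX)

// Count bytes up to, but not including, the terminating 0.
// This plays the role of the rbx counter in the print loop.
std::size_t count_bytes(const char* s) {
    std::size_t n = 0;
    while (s[n] != 0) {
        ++n;
    }
    return n;
}

// Rough equivalent of the echo routine: measure, then make one write call.
// File descriptor 1 is standard output, like rdi = 1 in the assembly.
// Returns the number of bytes written, or -1 on error.
long echo(const char* s) {
    return static_cast<long>(write(1, s, count_bytes(s)));
}
```

One difference worth noting: this version checks a byte before counting it, so it also works for an empty string, while the assembly loop advances first and therefore expects at least one character.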
Thursday, September 22, 2016
Dynamic memory allocating errors handling
Sometimes you don't know in advance how much memory you will need to store some information, like a string entered by a user. You can declare a variable that is large enough to store any expected data, but that wastes memory. The alternative is to allocate the memory at runtime, using dynamic memory allocation.
Unlike memory whose size is fixed at compile time, memory requested at runtime is allocated on the heap, a pool of unused memory that applications can allocate from while running.
As you know, computer resources are limited, and dynamic memory allocation may fail if there is not enough free memory on the heap for what you asked for.
C++ gives you two ways to check whether the allocation succeeded.
The first is exception handling. When memory allocation fails, a bad_alloc exception is thrown, and handling this exception with a proper handler prevents your application from terminating.
The second is to pass the nothrow object to new when creating the pointer. This way no exception is thrown; instead, the returned pointer has the value nullptr, and by checking the pointer you know whether the allocation succeeded.
Example:
#include <new>   // std::nothrow
int *x = new (std::nothrow) int[5];
if (x == nullptr)
{
    // allocation failed: do something
}
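For comparison, here is a minimal sketch that puts both ways side by side. The function names try_with_exception and try_with_nothrow are made up for illustration; each one returns true when the allocation succeeded and false when it failed.

```cpp
#include <cstddef>
#include <new>  // std::bad_alloc, std::nothrow

// Way 1: let new throw, and catch std::bad_alloc.
bool try_with_exception(std::size_t count) {
    try {
        int* data = new int[count];
        if (count > 0) {
            data[0] = 42;      // use the memory
        }
        delete[] data;         // always release what you allocate
        return true;
    } catch (const std::bad_alloc&) {
        return false;          // allocation failed, program keeps running
    }
}

// Way 2: ask new not to throw, and check the pointer for nullptr.
bool try_with_nothrow(std::size_t count) {
    int* data = new (std::nothrow) int[count];
    if (data == nullptr) {
        return false;          // allocation failed, no exception involved
    }
    if (count > 0) {
        data[0] = 42;
    }
    delete[] data;
    return true;
}
```

Which one to pick is mostly a matter of style: the exception version keeps the normal path free of checks, while the nothrow version suits code that avoids exceptions altogether.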
Tuesday, August 23, 2016
How to call a function before the main function and after 'return 0' line
Some applications need to call a function before the main function takes control, for example to load a specific value from somewhere, or after the program ends, to save settings or something like that. But how?
It's very simple.
__attribute__
This piece of code (a GCC and Clang extension) takes one of two options.
If constructor is passed, the function will be executed before the main function.
If destructor is passed instead, the function will be executed after main returns.
Examples:
1- Executing before the main function
#include <iostream>
using namespace std;

void beforemain(void) __attribute__((constructor));

void beforemain(void)
{
    cout << "hi before main\n";
}

int main()
{
    cout << "main\n";
    return 0;
}
2- Executing after closing the program
#include <iostream>
using namespace std;

void aftermain(void) __attribute__((destructor));

void aftermain(void)
{
    cout << "bye after main\n";
}

int main()
{
    cout << "main\n";
    return 0;
}
Pretty easy and straightforward, right!
Friday, August 12, 2016
Introduction to computer programming
Before getting to computer programming, let's first understand computer programs and what they do.
A computer program is a sequence of instructions, written in a programming language and converted to machine language, that performs a specific task.
You can think of a programming language as something that converts your logic into the computer's logic.
If somebody asks you how to log in to Facebook, your answer will be something like:
Open Facebook, create an account, then log in
Now, try to arrange your answer in sequence and you will get:
1 - Open Facebook
2 - Create an account
3 - Log in
Actually, you can consider this sequence a computer program written in your own language.
Now, let's talk about programming again. A computer program is a sequence of instructions that tells the computer what to do.
For example, this C code tells the computer to print "Hello World!" on the screen.
printf("Hello World!");
Introduction to computer programming.
If you understood what a computer program is, then you will tell yourself, "Writing a computer program is computer programming."
You are right, but how? As we mentioned, the computer can't understand what you say, so you have to convert the instructions you want it to carry out into machine language.
For this purpose, programming languages were invented.
There are many programming languages, such as:
- Assembly
- C
- C++
- Java
- Python
- C#
- Perl
- VB.NET
And many more.
This is just an introduction to the world of programming; other guides on how to be a programmer will be posted soon.