Wednesday, October 11, 2017
Recently, I've run into some issues with Blogger. Things I'd like to do are either not doable there or not easy to do (so what should I focus on, creating content or arranging my blog?), so I decided to move to WordPress, where everything is easier. My new blog is https://llcode.wordpress.com
I'm waiting for you there.
Friday, October 6, 2017
How to start writing 16-bit Assembly code on your 64-bit machine
Hello there. The first things that come to anyone's mind after deciding to learn Assembly are: Why can't I run my 16-bit assembly code on my PC? Why is there still 16-bit Assembly when all the processors and OSs nowadays are 64-bit?
The answer is... history! Let me explain.
Do you know what the BIOS is? It's the first thing that runs when you press the power button on your PC or laptop. BIOS is an abbreviation of Basic Input/Output System, and it is software stored on a small memory chip on the motherboard.
What is the BIOS used for?
The BIOS instructs your PC on how to perform a number of basic functions that are essential to the boot process (loading your OS), like identifying the hard drive, floppy drive, optical drive, memory, and so on.
It also handles some other things, like the POST, or Power-On Self-Test.
To avoid getting bogged down in nitpicky details, let's get back on track.
The BIOS must run, and the processor has to be in the proper mode to run it. From here onwards, things are going to make sense.
The very first IBM PC had an 8088 processor, which was a variant of the 8086 with an 8-bit data bus. Internally it was a 16-bit chip, but back then hardware designers were still centered around 8-bit parts, so a 16-bit processor with an 8-bit data bus fit the existing hardware market better and therefore had a better chance of succeeding.
But there is another detail that helped the processor fit into the market: the ability to run pre-existing software (like the BIOS). The 8086 and 8088 were designed to be assembly-language compatible with the earlier 8-bit 8080 to make it easier to port existing software. And this became a recurring theme: each processor in the family has to be compatible, somehow, with what came before.
Later, when the 80386 came out, the first truly 32-bit processor in the family, the designers had a problem: the processor had to be able to run the existing 16-bit software or it wasn't going to succeed. The solution was to have the processor start in the 16-bit compatible mode (real mode), and then let software switch it into the 32-bit mode (protected mode) to take advantage of what a 32-bit processor offers, like a much larger address space.
So the existing BIOS, which was written for 16-bit machines, could still work on 386 machines, and so could MS-DOS.
And that's why you can run 16-bit assembly on a 32-bit OS. But what happened next? What about 64-bit?
64-bit processors introduced a new mode, long mode, and the designers finally broke with the time-honored tradition. The BIOS was replaced by UEFI (Unified Extensible Firmware Interface), which lets the computer boot directly into protected mode and then switch into long mode. Software vendors like Microsoft removed the 16-bit subsystem from 64-bit Windows because it was no longer useful. Bye bye, real mode.
Now that you know the story, how do you get around that? How do you learn the easiest, best-supported flavor of Assembly and then work your way up to the other variants?
There are many options; I'm going to mention some of them.
If you are on Windows, you can get emu8086. As its name implies, it's an 8086 emulator and debugger.
You can download it here
You can also get DOSBox. It's a DOS emulator that lets you run DOS on your modern OS, so you can use whichever assembler you like (unlike emu8086, which ties you to its built-in assembler).
Get DOSBox here
The third option is to download a VM like VirtualBox or VMware Workstation and install a DOS-like OS on it, such as FreeDOS.
If you are on Linux, you can still use DOSBox, since it has been ported to Linux. If you want to use emu8086, you can install Wine and run emu8086 through it; it works perfectly, I've tried it. And you can also go the VM route, because VMware Workstation and VirtualBox run on Linux too.
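Whichever option you pick, here is a minimal, complete 16-bit program you can try, written as a DOS .COM file in NASM syntax (the file name, the nasm command line, and the message are just my own choices for this sketch):

; hello.asm - a tiny 16-bit DOS .COM program (NASM syntax)
; Assemble on your host with:  nasm -f bin hello.asm -o hello.com
; then run hello.com inside DOSBox or FreeDOS.
org 100h              ; .COM programs are loaded at offset 100h
mov ah, 09h           ; DOS function 09h: print a '$'-terminated string
mov dx, msg           ; DS:DX -> the string (DS equals CS in a .COM program)
int 21h               ; call DOS
mov ax, 4C00h         ; DOS function 4Ch: terminate with return code 0
int 21h
msg db "Hello, 16-bit world!$"

In DOSBox, mount the folder that contains hello.com with the mount command, switch to that drive, and type hello to run it.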
Wednesday, October 4, 2017
Hello Assembly
*wipes the dust off the blog* Hello there. It's been a long time since I last wrote, but today I want to talk about the Assembly language itself (unlike the previous article, which showed how to write your first Assembly program).
At the beginning of the computing revolution, computers were hardwired to perform only one operation: the machine's circuitry was designed for a single task.
It became obvious very quickly that we needed a universal computer, one that can perform any task it is programmed to do.
And this was the real beginning (from my perspective) of the computing revolution.
When you talk about general-purpose computers, you have to mention two people without whose efforts computers wouldn't have become reality: Alan Turing and John von Neumann.
Alan Turing is famous for breaking the Enigma cipher in the Second World War, and he is the godfather of computing. He was the first to propose the idea of the "Turing machine", a machine that can compute anything that is computable. I won't go deeper into how he described it and what it really is, but the gist is a device that reads something written somewhere, executes something based on it, and produces a result that can be written somewhere else, over and over.
This idea was picked up and taken further by von Neumann, who introduced the idea of a CPU attached to memory that the CPU itself can alter while processing and executing instructions.
In fact, the vast majority of the computers we use nowadays follow the so-called "von Neumann architecture".
So, you might be wondering, what does all of this have to do with Assembly language?
OK, let me explain. When the first programmable computers came out, they were pretty hard to program, and the coding process was backbreaking even for a professional computer scientist.
Programmers had to use only zeros and ones: a particular pattern of bits tells the CPU to do something. Change a single zero to a one and you get the butterfly effect: the CPU does something completely different from what you expected.
Then came an idea to solve that problem and get rid of programming in error-prone raw machine language: if 01001010 means "move something from A to B", then I can substitute that value with a fixed word, and write a program that understands this word and converts it back to its original value so the CPU can execute it.
And this idea became the Assembly language (of course it's not quite that simple, it's more complicated than you might imagine, but I don't want to drag you through mind-blowing details).
The program that converts Assembly code into machine code is called an assembler.
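To make the mapping concrete, here are a few 8086 mnemonics next to the machine-code bytes an assembler turns them into (these encodings are the standard 8086 ones; the operands are just examples):

mov ah, 9      ; assembles to the bytes B4 09
int 21h        ; assembles to CD 21
hlt            ; assembles to F4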
Assembly language has many variants: each processor family has its own Assembly language. Yes, Assembly is not portable; a program written in Assembly for an ARM computer cannot run on, for example, an Intel computer.
Shocked? OK, let me tell you more. Each variant has different versions that span the gamut from 16-bit to 64-bit instruction sets. Brace yourself.
The most popular Assembly flavors today are ARM and x86. Google them and read on your own, because in this article I'm going to talk about the 16-bit Assembly language of the Intel 8086 processor.
To illustrate the difference between Assembly and other programming languages, let me show you a snippet of Assembly and its equivalent in another language, such as C.
This code will print Hello World on the screen, simple enough.
In C, you can do it in a single line (after the essentials, like declaring the main function and including stdio):
printf("Hello World");
The code explains itself: you are calling a function that prints the passed string on the screen. Pretty easy and straightforward.
What about Assembly?
OK, let's see...
.data
str db "Hello World$"   ; our string, ending with the '$' delimiter
.code
mov ah, 9               ; select DOS function 9: print a '$'-terminated string
lea dx, str             ; DX -> the first byte of the string
int 21h                 ; call DOS to do the printing
hlt                     ; stop here
Is it hard? Then what about coding it using only 0s and 1s?
Let's go through the first segment, the data segment. Unlike in other programming languages, in Assembly you can't define variables wherever and whenever you want.
Variables are defined in the data segment before you get to the code segment (some developers prefer to begin with the code segment and place the data segment underneath; that's fine, either way works as long as the variables live in the data segment).
The next line is str db "Hello World$"
In Assembly, the variable definition formula is variable_name size value, where the size is given by a directive such as db (define byte).
str is the variable name, db tells the assembler to declare bytes, Hello World is the value, and hold on, we will get to the dollar sign later.
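A few more declarations in the same formula, just as a sketch (the names and values are made up for illustration):

count   db 5          ; db = define byte: one byte holding the value 5
price   dw 1234h      ; dw = define word: two bytes
bye     db "Bye$"     ; a string is simply a run of bytes ending with '$'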
Hey, hold on. Are you kidding? You are declaring a byte to store a whole string? How come?
Actually, str is not storing our string as a single value. The label str refers to the memory address of our first letter, 'H', and by going through memory byte by byte you can access the whole string until you reach the dollar sign. The dollar sign is the string delimiter: it is what tells DOS's print routine that the string has ended.
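To see what "going through memory byte by byte" looks like in code, here is a small sketch that prints str one character at a time using DOS function 02h and stops at the '$' delimiter (it assumes the str variable from the listing above and that DS already points at the data segment, as in emu8086's default template):

    lea si, str       ; SI -> the first byte of the string ('H')
next_char:
    mov dl, [si]      ; fetch the current byte
    cmp dl, '$'       ; reached the delimiter?
    je  done          ; yes: stop printing
    mov ah, 2         ; DOS function 02h: print the character in DL
    int 21h
    inc si            ; step to the next byte in memory
    jmp next_char
done:
    hlt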
Now let's move to the next line, which is mov ah, 9.
Let's discuss the mov instruction first. As its name implies, it moves (really, copies) a value from one place to another. In this line, mov puts 9 into the register AH.
Come on, Mohamed! Stop with the mysterious stuff, what is a register? OK: a register is a tiny piece of storage built into the processor itself, a small data-holding place that is part of the processor architecture. A processor usually has a set of more than four registers; the 8086, for example, has more than ten.
The general-purpose registers in the 8086 are 16-bit registers, and some of them are divided into two sub-registers. AX is one of the main registers; AH is its high 8 bits and AL is its low 8 bits.
For example, if AH = 01010101 and AL = 11001011, then AX = 0101010111001011. Clear enough.
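In code, that composition looks like this (just a sketch; the binary constants mirror the example above):

mov ah, 01010101b   ; put the first pattern into the high half
mov al, 11001011b   ; and the second into the low half
; AX now reads 0101010111001011b (55CBh): AH sits in the top 8 bits, AL in the bottom 8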
Let's get back on track. The register AH now holds the value 9. OK, but why? I'll tell you in a moment.
After that, we have lea dx, str. This line loads the address of the first character of our string into the DX register; lea stands for "load effective address".
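By the way, in this MASM/emu8086-style syntax there are two common ways to get that address into DX; both leave DX holding the offset of the first byte of str:

lea dx, str          ; compute the effective address of str at run time
mov dx, offset str   ; or load the assembly-time offset of str directly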
Then comes the most important line: int 21h.
This is where the magic happens. For simplicity, you can think of this line as a function call in a higher-level language: int 21h invokes DOS interrupt 21h, and the "parameters" are 9 and the address of str; that's why we placed them in registers, so the interrupt handler could access them. When int 21h runs, it looks at AH to obtain the function number and know which task it should do. 9 means "print a string" (you can use Google to find out what else this interrupt can do).
OK, I'm about to print, but print what? Function 9 of int 21h prints whatever DX points to, which in our case is str.
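Just to show that 9 is only one of many "functions" hiding behind int 21h, here is a small sketch using two others: function 01h waits for a keypress and returns the character in AL, and function 02h prints the character in DL:

mov ah, 1     ; DOS function 01h: read one character from the keyboard (with echo)
int 21h       ; the typed character comes back in AL
mov dl, al    ; function 02h expects its character in DL
mov ah, 2     ; DOS function 02h: print the character in DL
int 21h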
Last but not least, we have hlt. You can loosely compare it to return 0; in C or C++: it marks the end of our little program. Strictly speaking, hlt just halts the processor until the next interrupt; emulators like emu8086 treat it as the end of the program, but a real DOS program hands control back to the OS through int 21h instead. And that's it!
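For completeness, this is how a real DOS program usually returns control to the OS, using function 4Ch of the same interrupt (a sketch; the return code plays the role of return 0; in C):

mov ah, 4Ch   ; DOS function 4Ch: terminate the program
mov al, 0     ; return code 0
int 21h       ; back to DOS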
Assembly is not as hard as it seems. It's easy, very easy, but it's really complicated too; being easy doesn't make it simple. C is hard but simple. Can you feel the difference?