Compilers and Abstraction
How Data Is Organised and Coded to Create Programmes
In my last post I finished by explaining a rudimentary theoretical model of a programme built from instruction codes. That model used only raw binary instructions, whereas a lot more goes into making actual software work.
For programmes to be practical to write and use, abstraction takes what happens at the lower levels of computation and hides it behind layers that are easier to work with.
If we look at any .exe program file as an example, at its lowest level the program consists of 1s and 0s, since binary is the lowest level at which a computer stores data.
At this lowest level, not all of the binary takes the form of instruction codes. For example, many file formats begin with what is called a magic number: a kind of identifier or signature, with a set number of bytes reserved for it, that tells the operating system or whatever is opening the file what kind of file it is. Portable executables, for instance, begin with the two bytes 4D 5A in hex, which are the ASCII characters "MZ".
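To make the magic number idea concrete, here is a minimal sketch in Python that checks whether some bytes start with the "MZ" signature found at the start of portable executables. The function names `looks_like_exe` and `file_looks_like_exe` are hypothetical names made up for this illustration, and the sample data is invented rather than taken from real files. A real tool would check more of the file than the first two bytes.

```python
# Sketch only: the helper names and sample bytes below are assumptions for illustration.
PE_MAGIC = b"MZ"  # the bytes 0x4D 0x5A at the start of a portable executable

def looks_like_exe(data: bytes) -> bool:
    """Return True if the data begins with the two-byte "MZ" magic number."""
    return data[:len(PE_MAGIC)] == PE_MAGIC

def file_looks_like_exe(path: str) -> bool:
    """Read only the first two bytes of a file and compare them to the magic number."""
    with open(path, "rb") as f:
        return looks_like_exe(f.read(len(PE_MAGIC)))

# Made-up sample data: one blob that starts like an .exe, one that starts like a PNG image
fake_exe = b"MZ\x90\x00" + b"\x00" * 60
fake_png = b"\x89PNG\r\n\x1a\n"

print(looks_like_exe(fake_exe))  # True
print(looks_like_exe(fake_png))  # False
```

Note that the check never looks at the file extension. The first bytes of the file are what identify it.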
Next, a file will hold plenty of data that makes up its contents, such as images or strings.
Then there is the instruction layer, which allows the program to run. It tells the processor what to process and compute so that the program performs the actions we expect. These are instruction codes drawn from the instruction set of the processor we want to run the program on. Together, these codes are generally referred to as machine language, and they are sometimes shown in hex rather than binary.
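As a small illustration of the same machine code shown in binary and in hex, the sketch below prints six bytes both ways. The bytes are an example I have picked for this purpose: on an x86 processor they encode "put 42 in the eax register, then return". The variable names are just for this sketch.

```python
# Six bytes of x86 machine code chosen for illustration: "mov eax, 42" followed by "ret".
code = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])

# The same bytes written out as 1s and 0s, eight bits per byte
as_binary = " ".join(f"{b:08b}" for b in code)

# The same bytes written out in hex, two digits per byte
as_hex = " ".join(f"{b:02X}" for b in code)

print(as_binary)  # 10111000 00101010 00000000 00000000 00000000 11000011
print(as_hex)     # B8 2A 00 00 00 C3
```

Both lines describe exactly the same data. Hex is simply a more compact way for humans to read it.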
There is also assembly language, which sits just above machine language and gives each operation a short human-readable name.
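The relationship between machine codes and assembly names can be sketched as a simple lookup table. The example below maps three single-byte x86 instruction codes to their assembly names. The `MNEMONICS` table and the `to_assembly` function are hypothetical and only for illustration; a real disassembler has to handle far more instructions, as well as the operands that follow them.

```python
# Toy lookup table: a few single-byte x86 instruction codes and their assembly names.
# This is a sketch, not a real disassembler.
MNEMONICS = {
    0x90: "nop",   # do nothing
    0xC3: "ret",   # return from the current function
    0xCC: "int3",  # breakpoint used by debuggers
}

def to_assembly(code: bytes) -> list[str]:
    """Translate each byte into its name, or show the raw byte if it is not in the table."""
    return [MNEMONICS.get(b, f"db 0x{b:02X}") for b in code]

print(to_assembly(bytes([0x90, 0x90, 0xC3])))  # ['nop', 'nop', 'ret']
```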
These low-level languages are rarely used to write whole programmes nowadays, as coding this way would be far too complicated and take too long. Instead we use higher-level programming languages such as C, C++, Java, Python or Go, to name just a few. These languages read more like English, which makes it easier for developers to write the code and to troubleshoot and debug it at later stages.
That being said, computers and devices still need the lower-level instructions in order to actually run the programmes.
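This is where a compiler comes in. A compiler is a program that reads code written in a higher-level language and converts it into a format the computer can execute, such as low-level machine code. Compilers allow humans to write programs in a human-readable form and have the computer translate them into something it can use. Not every language is compiled straight to machine code (Python, for example, is usually run by an interpreter), but the principle of translating down to something the machine can run is the same.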
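To give a feel for what that translation involves, here is a toy compiler sketched in Python. It takes a readable arithmetic expression and turns it into a flat list of simple instructions for an imaginary stack machine, then runs them. The instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) and the functions `compile_expr` and `run` are all made up for this sketch. Real compilers target real processor instruction sets and do a great deal more work.

```python
import ast

# Hypothetical instruction names for an imaginary stack machine
OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(source: str) -> list[tuple]:
    """Translate an arithmetic expression into a flat list of toy instructions."""
    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return [("PUSH", node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            # Emit both operands first, then the operation that combines them
            return emit(node.left) + emit(node.right) + [(OPS[type(node.op)],)]
        raise ValueError("unsupported syntax in toy compiler")
    return emit(ast.parse(source, mode="eval").body)

def run(program: list[tuple]) -> int:
    """Execute the toy instructions one at a time using a simple stack."""
    stack = []
    for instr in program:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            results = {"ADD": left + right, "SUB": left - right, "MUL": left * right}
            stack.append(results[instr[0]])
    return stack.pop()

program = compile_expr("2 + 3 * 4")
print(program)       # [('PUSH', 2), ('PUSH', 3), ('PUSH', 4), ('MUL',), ('ADD',)]
print(run(program))  # 14
```

The person writing `2 + 3 * 4` never has to think about pushes and pops. The compiler works out the order of operations and produces the low-level steps for them.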
We can understand abstraction by looking, at a basic level, at the different layers of interaction with a piece of software, from the highest to the lowest. At the highest level, the user level, we don’t need to know anything about the underlying programming that makes everything work. We just need to know how to use and interact with the software. We don’t need to know what language it was coded in or what the executable contains. These aspects are abstracted away from us.
If we go down a layer of abstraction from the user layer to the developer layer, here we can see the language that the software was coded in. The developers need to know this language, and perhaps the compiler and its options too; however, the compiler itself was most likely developed by someone else.
From here we can go down another layer of abstraction to the compiler developer. The compiler developer needs to know the programming language being compiled from, and also the instruction set of the processor being compiled to, which sits a layer below.
Another layer down from this could be the hardware: the actual make-up of the processor itself. The manufacturer has to understand this layer so that they know what components and configurations are needed to successfully produce the right parts.
Abstraction, then, can be looked at as a system of discrete layers, one at each stage of processing and programming, so that we can easily identify what exists where, who is involved in the work at each stage and what needs to be done.