Creating programs in assembly language can be tedious and time-consuming, so professional
programmers tend to prefer a higher-level programming language such as C (its predecessor was called "B"),
which was developed at Bell Laboratories by Dennis Ritchie in 1972. Thus, we've been contemplating writing a small
C compiler that we could use to create programs to run on the DIY Calculator.
Introduction
Creating our programs in assembly language gives us ultimate control as to the way in which they
function, but it can be a laborious and time-consuming process. Thus, professional programmers prefer to use higher-level
languages, which allow programs to be captured quickly and concisely, and which also make them much easier to understand.
For example, consider the following simple function called "lower" presented in C:
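The original listing is not reproduced in this text, so here is a minimal sketch of what such a function might look like, assuming the ASCII character set (the exact code originally shown may differ):

```c
/* Return the lowercase form of c if it is an uppercase ASCII letter;
 * otherwise return c unchanged. Assumes the ASCII character set, in
 * which 'A'..'Z' and 'a'..'z' are each contiguous ranges. */
int lower(int c)
{
    if (c >= 'A' && c <= 'Z')
        return c + 'a' - 'A';   /* shift into the lowercase range */
    else
        return c;               /* not uppercase: leave it alone */
}
```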
Think of a function as being like a subroutine in assembly language. This one accepts an ASCII character called “c” and, if this character
is in uppercase (“A” to “Z”), the function will return its lowercase counterpart (“a” to “z”). The point is that, even if you don’t know C,
you can probably scan through the above code and work out what’s happening a lot more easily than if you were looking at an
unfamiliar assembly language.
And so, we’ve been mulling over the idea of creating a small C compiler for use with the DIY Calculator as discussed
in the following topics; now read on …
Alternative Outputs
If we should decide to create a C compiler, we know that it will accept C source code (that is, programs
written in C) as its input, but what should it generate as output? In fact there are two main scenarios here;
first, our compiler could directly generate a *.ram file containing the DIY Calculator’s machine code as shown below:
Alternatively, the compiler could output an intermediate *.asm file containing DIY Calculator assembly source code.
This file would subsequently be assembled into machine code using the DIY Calculator’s assembler as shown below:
In the context of what we’re trying to do here, this second scenario would be the more advantageous for two reasons:
(a) it’s easier to get a compiler to generate an intermediate format at a reasonably high level of abstraction and (b)
during the process of developing the compiler, it’s easier for us to look at the assembly code it generates to evaluate
how well the compiler is doing its job.
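To make the second scenario a little more concrete, here is a minimal sketch of a code generator emitting assembly text for a statement of the form dst = a + b. The function name emit_add, the LDA/ADD/STA mnemonics, and the bracketed operand style are illustrative assumptions rather than confirmed DIY Calculator syntax, so check the assembler's documentation before relying on them:

```c
#include <stdio.h>

/* Write assembly text for "dst = a + b" into the buffer 'out'.
 * NOTE: the LDA/ADD/STA mnemonics and the [label] operand style are
 * placeholders chosen for illustration; they are assumptions, not
 * verified DIY Calculator assembly syntax.
 * Returns the number of characters snprintf would have written. */
int emit_add(char *out, size_t size,
             const char *dst, const char *a, const char *b)
{
    return snprintf(out, size,
                    "    LDA [%s]\n"   /* load first operand          */
                    "    ADD [%s]\n"   /* add second operand          */
                    "    STA [%s]\n",  /* store result in destination */
                    a, b, dst);
}
```

Because the output is ordinary text, we can read it directly while debugging the compiler, which is reason (b) above.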
Note: Assuming the second scenario as discussed above, once the compiler has successfully generated an assembly
code file, there are two ways we could go: (a) we could run the assembler by hand, or (b) the compiler could invoke
the assembler automatically. In this latter case, if you are the one creating the C compiler (as opposed to us doing it),
then we can provide you with the assembler in the form of a DLL (Dynamic Link Library) object that you can call from
within your program. Email us as described on the About/Contact Us page
on the main DIY Calculator website for more details.
Choosing a C Subset
The next main consideration is to choose the subset of C that you intend to implement. We have to target
a subset, because creating a compiler that could handle the entire C language would be a formidable task.
Choosing a subset that is relatively powerful while remaining reasonably easy to implement is a non-trivial problem that
will require some serious pondering. We’ve bounced a few ideas around, but haven’t come up with anything definite at the time
of this writing. We will, however, return to this topic when we get a few moments.
Note: See also the discussions on creating a Backus-Naur Form (BNF) description of the language subset you intend
to support in the General Notes section below.
Compiling the Compiler
Hmmm, let’s take a deep breath … from our discussions above we know that we are going to create a C compiler that
accepts C source code and generates DIY Calculator assembly code. However, we haven’t actually specified the language we are
going to use to create our compiler.
Think about this for a moment: there are a lot of computer programming languages around, such as Ada, Algol, BASIC, C, C++,
COBOL, Forth, FORTRAN, Java, Lisp, Pascal, Perl, and Prolog (to name but a few). The point is that we could create our C
compiler using any of these languages. In reality, however, creating our compiler in anything other than C would probably
make our brains hurt so much that it wouldn’t be worth the effort.
OK, so assuming that we are going to create our compiler in C, we will probably compile this program using an industry-standard
C compiler running on our home computer as shown below:
Now that we’ve created our own DIY Calculator compiler, we can use this to compile source code programs (written in our
C subset) into the DIY Calculator’s assembly code, which can subsequently be assembled into the DIY Calculator’s machine
code as shown below:
This is the easy scenario, and therefore it’s the one we will assume that we’re aiming at. However, it would be remiss of us
if we failed to mention that there is a potential twist to the tale as discussed in the following topic …
A Bit of a Brain Boggler
This would be a good time to prepare to have your mind well and truly boggled. But first we should note that
our ruminations in this topic are intended primarily to stimulate the old gray brain cells, because our intention is to follow
the relatively straightforward usage scenarios presented in the previous topics.
Having got the “weasel words” out of the way (and remembering that “Eagles may soar, but weasels rarely get sucked into jet
engines!”), let’s cast our minds back to the way in which we create and compile our DIY Calculator compiler as discussed
in the previous topic and as illustrated below:
As illustrated here, if we wish, the C source code we use to actually create our DIY Calculator compiler could employ every
construct in the C programming language (even the really “hairy” ones). That is, when we talked about our supporting only
a subset of the C programming language earlier in these discussions, we were referring to the fact that the C
programs that we intend to compile using our new DIY Calculator compiler would be written using only this subset.
However, let’s suppose that, when we capture the source code for the DIY Calculator compiler, we actually create this compiler
using only our defined subset. In this case, we open the door to an intriguing possibility as illustrated below:
First we use the path sporting the “(a)” annotations (with the blue arrows) to generate the DIY Calculator compiler that
will run on the main computer. Next (and this is the tricky part), we use the path with the “(b)” annotations (and the magenta arrows)
to generate a version of the DIY Calculator compiler that will actually run on the DIY Calculator itself! (For your interest, the color "magenta"
was named after the dye of the same moniker, which – in turn – was named after the Battle of Magenta that occurred in Italy in
1859: the year in which the dye was first discovered!)
We know, we know … the mind starts to go into overdrive at this point trying to wrap itself around all of the implications.
We should also note that everything tends to sound easy if you say it quickly and wave your hands around a lot, but
actually running our compiler on the DIY Calculator would take a little thought.
For example, we would need some way to enter or load our C source code programs into the DIY Calculator’s memory in order to
give the DIY Calculator's compiler something to actually compile. One solution to this conundrum might be to use the "Console Window"
and the "QWERTY Keyboard" as discussed in the BASIC Interpreter topic on the More Tools page
of the main DIY Calculator website.
General Notes (Sharing Your Work)
1) If you do decide to create a C compiler as described here, we’re sure that other users would be very interested
in seeing it and using it. We would be very happy to make such a tool available via the DIY Calculator website
(giving full credit to you, of course).
2) Note that the ideas presented here are just a few thoughts that have popped into our minds during the course
of writing How Computers Do Math. If you think of any other considerations we should note regarding our proposed
C compiler, email us as described on the About/Contact Us page on the main DIY Calculator website and we’ll add
them to the list so that they are there should we (or someone else) decide to actually create such a compiler.
3) Introduced by John Backus and Peter Naur in the late 1950s and early
1960s, Backus-Naur Form (BNF) is a technique for recursively defining the grammar – the words, symbols,
and tokens – associated with a computer language. Having such a description greatly eases the task of creating a
parser for a language. A full BNF description of our existing assembly language is provided in Appendix E of The
Official DIY Calculator Data Book, which is itself provided on the CD-ROM accompanying our book, How Computers
Do Math. The point is that, if you are contemplating creating a C compiler, it would be a very good idea to kick
off the process by documenting a full BNF description of the language subset you intend to support.
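As a hedged illustration of how a BNF description can guide a parser, consider the hypothetical grammar fragment in the comment below (it is not the subset we will eventually choose) and a small recursive-descent evaluator in which each rule becomes one C function. The names evaluate, parse_expr, parse_term, and parse_factor are our own inventions for this sketch:

```c
#include <ctype.h>

/* Hypothetical grammar fragment (an assumption for illustration only):
 *   <expr>   ::= <term>   | <expr> "+" <term> | <expr> "-" <term>
 *   <term>   ::= <factor> | <term> "*" <factor>
 *   <factor> ::= <number> | "(" <expr> ")"
 * The left-recursive rules are handled with loops, which keeps the
 * operators left-associative. Error handling is deliberately omitted. */

static int parse_expr(const char **s);

static void skip_spaces(const char **s)
{
    while (isspace((unsigned char)**s))
        (*s)++;
}

/* <factor>: a decimal number or a parenthesized expression. */
static int parse_factor(const char **s)
{
    int v = 0;
    skip_spaces(s);
    if (**s == '(') {
        (*s)++;                       /* consume "(" */
        v = parse_expr(s);
        skip_spaces(s);
        if (**s == ')')
            (*s)++;                   /* consume ")" */
        return v;
    }
    while (isdigit((unsigned char)**s)) {
        v = v * 10 + (**s - '0');
        (*s)++;
    }
    return v;
}

/* <term>: factors joined by "*". */
static int parse_term(const char **s)
{
    int v = parse_factor(s);
    for (;;) {
        skip_spaces(s);
        if (**s == '*') { (*s)++; v *= parse_factor(s); }
        else return v;
    }
}

/* <expr>: terms joined by "+" or "-". */
static int parse_expr(const char **s)
{
    int v = parse_term(s);
    for (;;) {
        skip_spaces(s);
        if (**s == '+')      { (*s)++; v += parse_term(s); }
        else if (**s == '-') { (*s)++; v -= parse_term(s); }
        else return v;
    }
}

/* Parse and evaluate an arithmetic expression held in a string. */
int evaluate(const char *text)
{
    return parse_expr(&text);
}
```

A real compiler would build a syntax tree or emit assembly code at each step instead of computing a value, but the correspondence between grammar rules and functions stays the same.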
Questions?
There are always a lot of points to ponder before embarking on a new software development quest.
We’ve had a head start, because we’ve been pondering furiously for a long time. Thus, if you are interested in
creating a C compiler and want to bounce some ideas around, please feel free to drop us a line as described
on the About/Contact Us page on the main DIY Calculator website.