In this topic, we first consider some of the problems associated with testing a program to ensure it works as planned. Next, we present some ideas pertaining to a Code Coverage utility that would aid us in the task of verifying our programs. In particular, we consider a variety of mechanisms one might use to present code coverage data.




 
Introduction
Once you’ve created a program, the first thing you are going to do is run it to make sure that the little scamp works as planned. In the case of small, simple programs, convincing yourself that everything is “tip-top” is not overly demanding. As your programs become larger and more complex, however, testing them becomes somewhat more problematic.

The worst case scenario is that you run your program and it appears to work as expected. Then, flush with excitement, you gather your family and friends (the ones who weren’t fast enough to get out of the way when they saw you approaching), run your program again ... and it crashes and burns in a most spectacular way. Oh the shame! Oh the humiliation!

One common reason for a program crashing is that it contains a number of subroutines, but not all of them are used all of the time. Thus, you might run the program in one way and have it work, and then run the program in another way and have it fail. In some cases, this may be because a subroutine wasn’t used at all during the first run. In other cases, the subroutine in question may have been used in a different context; for example, it may have been called from different points of the program or used in conjunction with different subroutines on the two different runs.

Very often, the root cause of the problem is that your program and/or subroutines contain a lot of “IF ... THEN ... ELSE” type branches. For example, consider the following pseudo-code example of a nested conditional statement buried deep in the bowels of your program:
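A nested conditional of this kind might look like the following sketch. Only the three messages come from the discussion that follows; the variable names `a` and `b` and the tests applied to them are assumptions made purely for illustration.

```python
def respond(a: int, b: int) -> str:
    """Pick a message using a nested IF ... THEN ... ELSE (hypothetical tests)."""
    if a == 0:
        if b == 0:
            return "Hello there"
        else:
            return "Goodbye"
    else:
        return "My head hurts"


# Three distinct paths exist, so at least three runs with different
# inputs are needed before every branch has been exercised.
for a, b in [(0, 0), (0, 1), (1, 0)]:
    print(respond(a, b))
```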

In fact, this really isn’t too complex. But now suppose that, instead of saying “Hello there”, “Goodbye”, and “My head hurts”, these different conditions caused us to jump to different parts of the program or to call different subroutines, and that these targets may themselves contain nested conditional statements, and on and on it goes.

What all of this means is that you have to run your program under a variety of different conditions to ensure that you’ve tested all of the different possibilities. The trick, of course, is to know whether or not you’ve actually executed all of the different areas and subroutines in your program. The solution is to create some sort of Code Coverage utility that can examine the results from a program run and report which portions of your code have been exercised and which portions remain unverified.




 
Gathering Code Coverage Data
When we run our assembler, it uses a *.asm (source code) file as input and generates three files as output: a *.lst (list) file, a *.ram (machine code) file, and a *.rad (raw assembly dump) file. (The formats for each of these files are fully documented under the File Formats topic on the More Tools page.) For example, consider what happens when we assemble a program called test.asm:

As we know, we can load *.ram machine code files into the DIY Calculator’s memory by means of the Memory > Load RAM command, at which point we can run these programs. Thus far, we’ve used the assembler’s list output only to debug our programs by stepping through the program and visually examining these files, and we haven’t used the Raw Assembly Dump (*.rad) files at all, but that’s about to change.

In fact, we've augmented the DIY Calculator with a “Data Gatherer” utility, which can be used to gather data during the course of a program run. The type of data we’re talking about here is the number of times each opcode is executed and the number of times each data location (reserved using a .BYTE command or its .2BYTE and .4BYTE cousins) is written to and/or read from. In the case of conditional jump instructions such as a JZ (“jump if zero”), the Data Gatherer utility will record how many times the instruction passes its test and how many times it fails.
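To make the kinds of data concrete, here is a minimal Python sketch of the counters such a gatherer might keep. It is not the DIY Calculator's actual implementation: the class names, method names, and addresses are all hypothetical, and the real *.pad layout is the one documented under the File Formats topic.

```python
from dataclasses import dataclass


@dataclass
class OpcodeStats:
    executed: int = 0  # times the opcode was executed
    passed: int = 0    # conditional jumps only: test passed (jump taken)
    failed: int = 0    # conditional jumps only: test failed (no jump)


@dataclass
class DataStats:
    reads: int = 0     # times the location was read from
    writes: int = 0    # times the location was written to


class Gatherer:
    """Hypothetical in-memory stand-in for the Data Gatherer's counters."""

    def __init__(self):
        self.opcodes = {}  # address -> OpcodeStats
        self.data = {}     # address -> DataStats

    def record_execute(self, address, jump_taken=None):
        # jump_taken is None for ordinary instructions,
        # True/False for conditional jumps such as JZ.
        stats = self.opcodes.setdefault(address, OpcodeStats())
        stats.executed += 1
        if jump_taken is True:
            stats.passed += 1
        elif jump_taken is False:
            stats.failed += 1

    def record_read(self, address):
        self.data.setdefault(address, DataStats()).reads += 1

    def record_write(self, address):
        self.data.setdefault(address, DataStats()).writes += 1


# Example run using made-up addresses.
gatherer = Gatherer()
gatherer.record_execute(0x4000)                   # ordinary opcode
gatherer.record_execute(0x4003, jump_taken=True)  # JZ whose test passed
gatherer.record_execute(0x4003, jump_taken=True)
gatherer.record_write(0x4100)                     # a .BYTE location
print(gatherer.opcodes[0x4003])
```

In this run the JZ at the made-up address 0x4003 has passed twice and never failed, which is exactly the sort of gap a coverage report should flag.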

The way in which the Data Gatherer utility works is described in the documentation accompanying this release of the DIY Calculator, as presented on the Download page. For our purposes here, we need only be aware of the fact that – when we instruct it to do so – the Data Gatherer can be used to read in the *.rad (raw assembly dump) file generated by the assembler and to write out a corresponding *.pad (processed assembly dump) file that contains the gathered data:

Once again, the formats for each of these files – including *.pad files – are fully documented under the File Formats topic on the More Tools page. The point is that all of the above is handled by the DIY Calculator and its environment. What we are proposing is for someone (maybe you, maybe us) to create a Code Coverage utility. This tool will read in the *.lst (list) file generated by the assembler and the *.pad (processed assembly dump) file generated by the DIY Calculator’s new Data Gatherer utility and – by means of some form of Display utility – will inform us as to which portions of our program have been exercised and which portions remain untested:
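The core of such a tool is a merge of the two inputs. The Python sketch below assumes both files have already been parsed into simple in-memory structures; the tuple layout, the `classify` function, and the sample addresses and instructions are hypothetical and do not reflect the real *.lst or *.pad formats.

```python
def classify(listing, counts):
    """Label every listing line as exercised, untested, or carrying no code.

    listing: list of (line_number, address_or_None, text) tuples, where
             address is None for comment and blank lines (assumed layout).
    counts:  dict mapping address -> number of times it was executed.
    """
    report = []
    for line_number, address, text in listing:
        if address is None:
            status = "no code"
        elif counts.get(address, 0) > 0:
            status = "exercised"
        else:
            status = "untested"
        report.append((line_number, status, text))
    return report


# Made-up data standing in for a parsed list file and gathered counts.
listing = [
    (1, None, "# Start of program"),
    (2, 0x4000, "LDA 0"),
    (3, 0x4002, "JZ [DONE]"),
    (4, 0x4005, "INCA"),
]
counts = {0x4000: 1, 0x4002: 1}

for line_number, status, text in classify(listing, counts):
    print(f"{line_number:4} {status:10} {text}")
```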

But how are we going to display the code coverage results? Ah Ha! Therein lies the question. In fact, there are a large number of techniques we might adopt; a few of these approaches are presented in the following topics.




 
Colored Text Displays
Perhaps the simplest display technique would be for the Code Coverage utility to color the lines of text from the *.lst (list) file using the data from the *.pad (processed assembly dump) file as shown below:

In this case, the fact that all of the text in lines 24 and 25 is green indicates that these instructions were executed. By comparison, only the left-hand side of the text associated with the JZ instruction on line 26 is green, indicating that the test did pass and the jump was performed. Meanwhile, the right-hand side of this line is colored red, thereby indicating that this test never failed. This explains why the instructions on lines 28, 29, and 30 are colored red; they were never executed because the JZ instruction on line 26 never failed.
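The split-line coloring can be sketched in a few lines of Python. This version uses ANSI terminal escape codes and splits the text at its midpoint; both choices, along with the `color_line` name and its parameters, are assumptions for illustration rather than part of any existing tool.

```python
GREEN = "\033[32m"
RED = "\033[31m"
RESET = "\033[0m"


def color_line(text, executed, passed=None, failed=None):
    """Wrap a listing line in ANSI colors.

    Ordinary instructions are all green (executed) or all red (not).
    For conditional jumps, pass the passed/failed counts: the left half
    reflects the pass case and the right half reflects the fail case.
    """
    if passed is None or failed is None:
        color = GREEN if executed else RED
        return f"{color}{text}{RESET}"
    middle = len(text) // 2
    left = GREEN if passed else RED
    right = GREEN if failed else RED
    return f"{left}{text[:middle]}{right}{text[middle:]}{RESET}"


print(color_line("ADD", executed=True))
print(color_line("JZ LABEL", executed=True, passed=3, failed=0))
print(color_line("SUB", executed=False))
```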

Observe that we’ve chosen to display blank lines and comment lines in black; perhaps another color (even green) would be more appropriate? (See also the Handling Comments and Blank Lines topic later on this page.) Also, is it better to color all of the text as shown above, or would it be preferable to just color the line numbers? Alternatively, as opposed to coloring the text itself, another technique would be to leave the text as-is and instead modify the color of the background as shown below:

Observe that the green color covers the comment lines at the top and persists until we reach a point where the red color commences (at the point where the JZ instruction never fails, in this case). In the scheme we're showing here, once we start using the red color, we continue through comment lines and blank lines until we arrive at some instructions that have been exercised and revert to a green color again. Once again, however, there are different techniques for handling comments and blank lines that we could choose to use (see also the Handling Comments and Blank Lines topic later on this page).

The above offers just two main possibilities; there are many more. What would be really interesting would be to create a Code Coverage utility that allowed the user to experiment with different approaches to presenting this data, and to then interview users to determine which they found to be the most efficacious. We would be very interested to learn the results of any such experiments.




 
Vertical Bar Graph Display
The colored text approach presented in the previous topic is great for fine-grained (detailed) line-by-line analysis of the code, but it would also be useful to be able to work with a “bigger picture” view that allowed the user to quickly locate and focus on problem areas. One way of doing this would be to use a vertical bar-graph type display:

The idea here is that each vertical line in the new display corresponds to a line of source code. Each bar is split into two halves. In the case of a standard instruction, both halves would be colored green or red, corresponding to whether that instruction had or had not been executed, respectively. In the case of conditional jump instructions like a JZ (“jump if zero”), the bottom half of the bar could correspond to the test passing, while the top half could correspond to the test failing. Similarly, in the case of a data location, the bottom half could correspond to that location being read from, while the top half could correspond to it being written to.

Once again, we would have to decide how to handle comment and blank lines. Perhaps the cleanest approach is to start off by coloring them green, and to subsequently have them inherit the color from the previous line.
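The bar rules above can be written down as a small Python function. The `(kind, first, second)` tuple layout is hypothetical, and letting a comment inherit both halves of a split bar is just one reading of "inherit the color from the previous line".

```python
GREEN, RED = "green", "red"


def shade(count):
    """Green if something happened at least once, otherwise red."""
    return GREEN if count > 0 else RED


def bar_halves(lines):
    """Return a (bottom, top) color pair for every source line.

    lines: list of (kind, first, second) tuples (assumed layout), where
      "instruction": first = times executed, second is ignored
      "jump":        first = times passed,   second = times failed
      "data":        first = reads,          second = writes
      anything else: a comment or blank line
    """
    bars = []
    previous = (GREEN, GREEN)  # leading comments start off green
    for kind, first, second in lines:
        if kind == "instruction":
            bar = (shade(first), shade(first))
        elif kind in ("jump", "data"):
            bar = (shade(first), shade(second))
        else:
            bar = previous  # comments and blanks inherit
        bars.append(bar)
        previous = bar
    return bars


program = [
    ("comment", 0, 0),
    ("instruction", 1, 0),
    ("jump", 1, 0),         # passed once, never failed
    ("comment", 0, 0),
    ("instruction", 0, 0),  # never reached
    ("data", 2, 0),         # read twice, never written
]
for bar in bar_halves(program):
    print(bar)
```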

Last but not least, clicking the mouse in this bar-chart display should auto-scroll the text window to display the corresponding lines of source code (see also the Advanced Navigation Capabilities topic later on this page). (Note that the size and location of the “thumb” in the horizontal scroll bar associated with this bar-chart display correspond to the relative amount and location of the text being presented in the main display.)
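Both behaviors reduce to simple proportional arithmetic. The following Python sketch assumes bars of equal pixel width and 1-based line numbers; the function names and parameters are hypothetical.

```python
def line_at(x, bar_width, first_bar_line, total_lines):
    """Map a click at pixel x in the bar display to a source line number."""
    line = first_bar_line + x // bar_width
    return min(max(line, 1), total_lines)  # clamp to the valid range


def thumb(first_visible, visible_count, total_lines, track_width):
    """Return (start, size) in pixels for the scroll-bar thumb.

    The thumb covers the same fraction of the track as the visible
    text covers of the whole program.
    """
    start = (first_visible - 1) * track_width // total_lines
    size = max(1, visible_count * track_width // total_lines)
    return start, size


print(line_at(x=37, bar_width=4, first_bar_line=1, total_lines=100))
print(thumb(first_visible=51, visible_count=25, total_lines=100, track_width=400))
```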




 
Handling Comments and Blank Lines
Consider the illustration below. The two vertical line displays at the top reflect the beginning of two different programs. Both programs commence with a few comment (or blank) lines, which are shown colored gray. Both then have a few instructions that are executed and are colored green (it’s safe to assume that at least the first instruction will be green [exercised] because when we run the program it's going to run the first instruction, even if that instruction is an immediate [unconditional] jump to another part of the program).

The program on the left then has two largish blocks of red (untested) instructions surrounding a small number of gray (comment/blank) lines. By comparison, the program on the right has two smallish blocks of red instructions surrounding a largish block of gray comment/blank lines. Each program then has a few more gray lines followed by a block of green (tested) lines. So, what we need to do is to decide how we wish to handle the comments and blank lines.

The obvious and – arguably – the simplest approach would be to leave these lines colored gray. One argument against this technique is that there are going to be a lot of 1-line and 2-line comments in the program, which will result in a somewhat fragmented display. By comparison, an argument in its favor would be that indicating the comments and blank lines with their own color more accurately reflects the nature of the program and will make it easier to relate lines in the source code to lines on the display.

Another technique – indicated by the (a) annotation – would be to color the comment on line 1 in the source code green, and for each succeeding comment to inherit the color from the line preceding it (if the first line was an instruction, then that line would already have a color). As shown above, this doesn’t look too bad in the case of the program on the left, but it appears overly pessimistic in the case of the program on the right.

Similarly, the technique of coloring any comments red if they are surrounded by red instructions on both sides – indicated by the (b) annotation – doesn’t look too bad in the case of the program on the left, but it appears overly pessimistic in the case of the program on the right.

Yet another alternative – indicated by the (c) annotation – would be to “weight” the amounts of red surrounding the comments and blank lines. In this case, if there is a lot of red and relatively little gray, then we would color these comments and blank lines red as shown in the program on the left; but if there is relatively little red compared to the gray, then we would color these comments and blank lines green as shown in the program on the right.

Another approach would be to simply color all comments and blank lines green as reflected in the examples sporting the (d) annotation. What would be really interesting would be to create a display that could be switched between all of these different approaches, and to then evaluate which was the most intuitive and easy to use.
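A switchable display of this kind could start from something like the Python sketch below, which recolors the gray lines under each of the four schemes. The text leaves several details open, so this sketch makes assumptions: under (b), gray lines not enclosed by red stay gray; under (c), "a lot of red" means the two neighboring red blocks together outnumber the gray lines, and gray lines not enclosed by red become green.

```python
from itertools import groupby


def recolor(colors, strategy):
    """Recolor "gray" entries in a list of "green"/"red"/"gray" line colors."""
    if strategy == "d":  # every comment/blank line becomes green
        return ["green" if c == "gray" else c for c in colors]
    if strategy == "a":  # inherit from the preceding line, starting green
        result, previous = [], "green"
        for c in colors:
            previous = previous if c == "gray" else c
            result.append(previous)
        return result
    # Strategies (b) and (c) look at whole runs of identical colors.
    runs = [(color, len(list(group))) for color, group in groupby(colors)]
    result = []
    for i, (color, length) in enumerate(runs):
        if color == "gray":
            before = runs[i - 1] if i > 0 else None
            after = runs[i + 1] if i + 1 < len(runs) else None
            enclosed = (before is not None and after is not None
                        and before[0] == "red" and after[0] == "red")
            if strategy == "b":
                color = "red" if enclosed else "gray"
            else:  # strategy "c": weigh surrounding red against the gray
                red_total = before[1] + after[1] if enclosed else 0
                color = "red" if red_total > length else "green"
        result.extend([color] * length)
    return result


# Stand-ins for the two programs described above (block sizes are made up).
left = (["gray"] * 2 + ["green"] * 3 + ["red"] * 6 + ["gray"] * 1
        + ["red"] * 6 + ["gray"] * 2 + ["green"] * 3)
right = (["gray"] * 2 + ["green"] * 3 + ["red"] * 1 + ["gray"] * 6
         + ["red"] * 1 + ["gray"] * 2 + ["green"] * 3)

for strategy in "abcd":
    print(strategy, recolor(left, strategy).count("red"),
          recolor(right, strategy).count("red"))
```

Counting the red lines per scheme, as the loop at the end does, gives a rough measure of how pessimistic each scheme looks for the two programs.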

Note that, whichever of the above techniques we choose (or any other approach that we might eventually settle on), we should be consistent in the main text window, the vertical bar graph display (discussed in the previous topic), and the bird’s eye skyscraper display (introduced in the following topic).




 
Bird’s Eye Skyscraper Display
This is somewhat related to the previous display. The idea here, however, is to consider things from a bird’s eye view flying high above a city containing skyscrapers, in which case all our feathered friend would be able to see would be small squares corresponding to the top of each building. Similarly, in our display, each square would correspond to one line of code:

This type of display may convey advantages when working with larger programs because it facilitates a “big picture” view that encompasses more data. One consideration is that it is no longer possible to split conditional jump instructions into two colors indicating whether or not the cases where the jump passed and/or failed were both tested. One solution would be to introduce two new colors into the display; for example, we could use blue if the instruction's pass condition was exercised but the fail was not, and orange if the pass was not exercised but the fail was (of course green would continue to indicate that both pass and fail paths had been tested, while red would indicate that this instruction was not exercised in either mode).
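The four-color rule is easy to express as code. In this Python sketch the function name and parameters are hypothetical; the color assignments are the ones suggested above.

```python
def square_color(is_conditional_jump, executed=0, passed=0, failed=0):
    """Pick the single color for one square in the bird's eye display."""
    if not is_conditional_jump:
        return "green" if executed else "red"
    if passed and failed:
        return "green"   # both paths tested
    if passed:
        return "blue"    # pass exercised, fail not
    if failed:
        return "orange"  # fail exercised, pass not
    return "red"         # never exercised in either mode


print(square_color(False, executed=5))
print(square_color(True, passed=3, failed=0))
print(square_color(True, passed=0, failed=1))
```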

A very important consideration will be how we organize the elements in this display. For example, let’s assume for the sake of discussion that 100 elements (squares) are presented across the display in the horizontal direction and 10 elements in the vertical direction. Assuming the top-left corner corresponds to line 1 in the source code, do we progress from left to right until we reach line 100, and then continue from the left-hand side of the next row with line 101? For example:

     1, 2, 3, 4, 5, 6 … 97, 98, 99, 100
     101, 102, 103, ...   198, 199, 200
     201, 202, 203, ... etc.

Alternatively, would it be better for the display order to wend its way back and forth sort of like a snake as follows:

     1, 2, 3, 4, 5, 6 … 97, 98, 99, 100
     200, 199, 198, ...   103, 102, 101
     201, 202, 203, ... etc.

Or maybe it would be better to present the squares (lines) in a vertical ordering ... or perhaps to present each group of 100 lines in a 10 x 10 array of squares. Once again, it would be very interesting to experiment with these different possibilities to see which is the easiest and most efficient to use.
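Each of these orderings is just a different mapping from a source line number to a (row, column) position. The Python sketch below uses the 100-wide by 10-high display assumed above; the function names are hypothetical, and rows and columns are counted from zero.

```python
def row_major(line, width=100):
    """Left to right, then start again at the left of the next row."""
    index = line - 1
    return index // width, index % width


def snake(line, width=100):
    """Left to right on even rows, right to left on odd rows."""
    row, col = row_major(line, width)
    return (row, col) if row % 2 == 0 else (row, width - 1 - col)


def column_major(line, height=10):
    """Top to bottom, then start again at the top of the next column."""
    index = line - 1
    return index % height, index // height


def blocks(line, block=10, blocks_per_row=10):
    """Each group of block*block lines fills its own square array."""
    index = line - 1
    block_index, offset = divmod(index, block * block)
    block_row, block_col = divmod(block_index, blocks_per_row)
    row, col = divmod(offset, block)
    return block_row * block + row, block_col * block + col


for layout in (row_major, snake, column_major, blocks):
    print(layout.__name__, [layout(n) for n in (1, 100, 101, 200)])
```

Because every layout has the same signature for its first argument, a display could hold the current layout in a variable and let the user switch between them at run time.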

Of course it would also be interesting to allow the user to have access to both the vertical bar display and the bird’s eye view display to see which they preferred (it should be possible to turn these different displays on and off individually). And once again, clicking the mouse in this bird's eye display should auto-scroll the text window to display the corresponding lines of source code (see also the Advanced Navigation Capabilities topic later on this page).




 
Organic Lava-Lamp Display
While we’re letting our imaginations run wild and free, another form of display might look something like the oil in a lava lamp of the type that was prevalent during the 1970s (it would be ultra-cool and add a lot of visual interest if the various shapes maintained their relative positions but they were all gradually "undulating"):

The idea here is that the main rectangle represents the top-level program, of which we can assume that at least the first instruction has been executed. The various ellipses correspond to either loops in the main body of the program or to subroutines; ellipses containing ellipses represent nested loops or nested subroutines. Note the orange ellipse; this represents the fact that this nested subroutine wasn’t tested from the surrounding subroutine, but it was called and tested from elsewhere in the program. One might also wish to experiment with different shades of colors representing subroutines that have been partially tested and so forth.

Ideally, it would be nice to annotate the various ellipses with the names of their corresponding loops or subroutines. Alternatively, the names of the loops or subroutines could appear in a small “pop-up” window as the mouse cursor was moved over the display. Clicking on an area in this display should auto-scroll the text in the main display to the corresponding lines of source code (see also the following topic).

This type of display could be used to give a much higher view of our code-coverage world, but it would require our Code Coverage utility to perform an in-depth analysis of the list file to determine the extent of each of the loops and subroutines. Also, actually generating this display would pose some interesting challenges.
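As a hint of what that analysis might feed into, the Python sketch below takes line extents that have already been worked out and arranges them into the tree of "ellipses inside ellipses". Finding the extents in the list file is the hard part and is not shown. The `Region` class, the `nest` function, and the sample names are hypothetical, and the sketch handles only regions nested by position in the source; nesting by subroutine call would need a call graph instead.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    name: str
    start: int  # first source line of the loop or subroutine
    end: int    # last source line (inclusive)
    children: list = field(default_factory=list)


def nest(regions, total_lines):
    """Arrange regions into a containment tree under a "main" rectangle.

    Assumes every region lies within lines 1..total_lines and that
    regions either nest completely or do not overlap at all.
    """
    root = Region("main", 1, total_lines)
    stack = [root]
    # Earlier starts first; for equal starts, the longer (outer) region first.
    for region in sorted(regions, key=lambda r: (r.start, -r.end)):
        # Pop until the region on top of the stack contains this one.
        while not (stack[-1].start <= region.start and region.end <= stack[-1].end):
            stack.pop()
        stack[-1].children.append(region)
        stack.append(region)
    return root


def show(region, depth=0):
    print("  " * depth + f"{region.name} ({region.start}-{region.end})")
    for child in region.children:
        show(child, depth + 1)


tree = nest([Region("inner_loop", 20, 30),
             Region("outer_loop", 10, 40),
             Region("sub_b", 50, 80)], total_lines=100)
show(tree)
```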




 
Advanced Navigation Capabilities
As was noted in the discussions associated with all of the displays discussed above (the vertical bar graph, the bird’s eye view, and the organic lava-lamp display), clicking on an area in the display should cause the main text window to auto-scroll to display the corresponding lines of source code.

However, we might be able to offer something a little more sophisticated. Let’s suppose that we have a subroutine that is colored red because it has not been exercised at all. In this case, the problem isn’t really with the subroutine; instead, we need to look at the point (or points) in the program from whence this routine is called in order to determine why they never actually got around to calling the routine.

One technique we could use would be that if we left-mouse-click in the display, then the main text window auto-scrolls to display the corresponding lines of source code; but if we right-mouse-click in the display, then we are presented with a pop-up list of options, including jumping to the subroutine itself or jumping to the point in the program from whence the routine was called (if there are several such calling points, they would be listed in order).

Another alternative would be that if we held the <Ctrl> key down while left-mouse-clicking in the display, then the main window would auto-scroll to the point from whence the subroutine was called; but if we click without holding the <Ctrl> key, the main window would auto-scroll to the subroutine itself. There are of course many more techniques one might explore, so put your thinking cap on and start pondering furiously.
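The <Ctrl>-click variant might be sketched in Python as follows. The data structures (a dictionary of subroutine extents and a list of call lines), the function names, and the sample subroutine names are hypothetical stand-ins for whatever the utility would extract from the list file.

```python
def call_sites(calls, subroutine):
    """Return the lines that call a subroutine, in source order.

    calls: list of (line_number, target_name) pairs (assumed layout).
    """
    return sorted(line for line, target in calls if target == subroutine)


def click_target(clicked_line, subroutines, calls, ctrl=False):
    """Return the list of lines the text window could scroll to.

    subroutines: dict mapping name -> (start_line, end_line).
    A plain click goes to the start of the enclosing subroutine;
    a <Ctrl>-click goes to the points from which it is called.
    Clicks outside any subroutine go to the clicked line itself.
    """
    for name, (start, end) in subroutines.items():
        if start <= clicked_line <= end:
            return call_sites(calls, name) if ctrl else [start]
    return [clicked_line]


subroutines = {"GETKEY": (120, 140), "SHOWNUM": (150, 190)}
calls = [(61, "SHOWNUM"), (35, "GETKEY"), (48, "SHOWNUM")]

print(click_target(160, subroutines, calls))             # the routine itself
print(click_target(160, subroutines, calls, ctrl=True))  # its callers
```

Returning a list rather than a single line leaves room for the pop-up variant: when several calling points exist, the same list can populate the menu.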




 
General Notes (Sharing Your Work)
1)   If you do decide to create a Code Coverage utility as described here, we’re sure that other users would be very interested in seeing it and using it. We would be very happy to make such a tool available via the DIY Calculator website (giving full credit to you, of course).
  
2)   The Code Coverage utility discussed here is very similar in concept (and in the display mechanisms it uses) to the Code Profiler utility discussed elsewhere on the More Tools page. Thus, it might be a good idea to combine these two tools and to provide the ability to switch back and forth between the different views of the data.
  
3)   Note that the ideas presented here are just a few thoughts that have popped into our minds during the course of writing How Computers Do Math. If you think of any other considerations we should note regarding our proposed Code Coverage utility, email us as described on the About/Contact Us page on the main DIY Calculator website and we’ll add them to the list so that they are there should we (or someone else) decide to actually create such a tool.




 
Questions?
There are always a lot of points to ponder before embarking on a new software development quest. We’ve had a head-start, because we’ve been pondering furiously for a long time. Thus, if you are interested in creating a Code Coverage utility and want to bounce some ideas around, please feel free to drop us a line as described on the About/Contact Us page on the main DIY Calculator website.