Loading a program from tape - Part 2: The Bootstrap Loader
Updated: Jan 29, 2021
This post is a deep-dive on the PDP-11 bootstrap loader. It is a follow-on from Part 1 of this series on loading a program from tape, which is recommended reading before this post.
Running the bootstrap loader
The sole purpose of the bootstrap loader is to load the absolute loader into memory from an external medium, such as a paper tape. On real PDP-11 hardware the bootstrap loader would have been either entered manually at the operator panel or loaded from a boot ROM. In a modern simulated environment the bootstrap loader is entered into the configuration file.
Configuring the bootstrap loader
If you take a look, for example, at the PDP-11 Programming Card, you will see that the bootstrap loader code is there. However, the code is provided in the form of a sort of template that you need to adjust according to your requirements:
Firstly, note that the "Contents" of the bootstrap loader are listed but the memory addresses at which the bootstrap loader should be deposited are incomplete. That's because the location for the bootstrap loader depends on how much physical memory was contained in the machine you were loading the bootstrap loader onto.
On the left hand side of the image above you can see a "Memory Size" table. If your PDP-11 had 4K of memory, you used addresses for the bootstrap loader starting with "017". If your PDP-11 had 8K of memory, all of your addresses started with "037", and so on.
When I am emulating the PDP-11, I use 4M of physical memory, so I use "157" for all addresses. Every "hyphen" in the bootstrap loader listing is replaced with "157" (or whatever is the appropriate choice in your case). In my case, for example, the first instruction of the bootstrap loader ("016 701") is deposited at memory address "157 744". Note that you also need to use the same prefix to complete the instruction word stored at address "- 766", which in my case becomes "157 400".
The reason why this is done is that it was intended that the bootstrap loader would be deposited at the highest available memory address locations, which will obviously depend on how much memory is available.
The other thing to note is that the final word of the bootstrap loader, in my case at memory address "157 776", has two possible values: "177 560" for TTY and "177 550" for PC11. This configures where the bootstrap loader loads the absolute loader from. You use "177 560" to load the absolute loader via a TTY and "177 550" to load it from paper tape (i.e. from a PC11 tape reader device). This value is the memory address of the Control and Status Register (CSR) of the device from which the absolute loader will be loaded. Unlike the other addresses, it does not depend on the memory size, so it does not take the "157" prefix.
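The template substitution described in this section can be sketched in Python. This is only an illustration: the names `TEMPLATE` and `build_bootstrap` are made up for this post, the contents are copied from my listing, and the word at "- 766" is assumed to be the chosen prefix followed by "400".

```python
# Template from the listing: (low three octal digits of the address, contents).
# None marks the word at "- 766", which depends on the memory-size prefix.
TEMPLATE = [
    (0o744, 0o016701), (0o746, 0o000026),
    (0o750, 0o012702), (0o752, 0o000352),
    (0o754, 0o005211),
    (0o756, 0o105711), (0o760, 0o100376),
    (0o762, 0o116162), (0o764, 0o000002), (0o766, None),
    (0o770, 0o005267), (0o772, 0o177756),
    (0o774, 0o000765),
]


def build_bootstrap(prefix, device_csr):
    """Return (address, value) pairs for a given memory-size prefix."""
    base = prefix << 9  # the prefix supplies the top octal digits, e.g. 0o157000
    words = []
    for low, value in TEMPLATE:
        if value is None:
            value = base + 0o400  # completes the word at "- 766"
        words.append((base + low, value))
    words.append((base + 0o776, device_csr))  # final word: CSR of the input device
    return words


for address, value in build_bootstrap(0o157, 0o177550):
    print(f"{address:06o} {value:06o}")
```

Calling it with a different prefix from the "Memory Size" table, or with the TTY address instead of the PC11 one, gives the corresponding variant of the listing.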
The bootstrap loader
So, here is my bootstrap loader. The first item in each row is a memory address and the second is the value to be placed at that memory address. Note that, as discussed above, all addresses start with "157" and that the last word has the value "177 550" (the CSR of the paper tape reader).
157744 016701
157746 000026
157750 012702
157752 000352
157754 005211
157756 105711
157760 100376
157762 116162
157764 000002
157766 157400
157770 005267
157772 177756
157774 000765
157776 177550
Let's take the code instruction-by-instruction and see how it works.
157744 016701 157746 000026
The first two octal digits in the first word ("01") are a MOV instruction. The second two octal digits ("67") indicate that the source value to be moved can be found at an offset from the Program Counter. The offset is found in the next word ("000026").
When an instruction is being executed, the content of the location pointed to by the PC is loaded and then the PC is incremented before the instruction is actually executed. Similarly, when an offset value is being read the value is read and the PC is incremented before the value is used. Therefore, the PC always points at the word after the instruction that is currently being executed. In this case, as the MOV instruction is being executed, the PC will have the value 157750.
The offset here is 26 (octal), meaning that the source operand for the MOV instruction will be found at address 157776. As mentioned above, this value is the address of the control and status register (CSR) of the input device, in this case the paper tape reader.
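The address arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the helper name `pc_relative_address` is made up here, and it assumes the offset word immediately follows the instruction word.

```python
def pc_relative_address(instruction_address, index_word):
    """Effective address of an operand given as an offset from the PC."""
    # By the time the offset is applied, the PC has stepped past both the
    # instruction word and the offset word: two words, i.e. four bytes.
    pc = instruction_address + 4
    return (pc + index_word) & 0o177777  # addresses wrap at 16 bits


source = pc_relative_address(0o157744, 0o000026)
print(f"{source:06o}")  # 157776
```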
The final two octal digits of the instruction ("01") are the destination operand, which in this case is register R1.
In summary, these first two words represent "MOV 26(PC), R1".
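To make the field boundaries concrete, here is a small Python sketch that splits an instruction word the same way the paragraphs above do. The function name and dictionary keys are my own invention, and it only covers double-operand instructions such as MOV.

```python
def decode_double_operand(word):
    """Split a 16-bit double-operand instruction word into its fields."""
    return {
        "opcode": (word >> 12) & 0o17,   # top four bits; 0o01 is MOV
        "src_mode": (word >> 9) & 0o7,   # first digit of the source pair
        "src_reg": (word >> 6) & 0o7,    # second digit of the source pair
        "dst_mode": (word >> 3) & 0o7,   # first digit of the destination pair
        "dst_reg": word & 0o7,           # second digit of the destination pair
    }


fields = decode_double_operand(0o016701)
print({name: oct(value) for name, value in fields.items()})
```

For "016701" this gives opcode 01, source mode 6 on register 7 (the PC), and destination mode 0 on register 1, matching "MOV 26(PC), R1".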
157750 012702 157752 000352
These next two octal words are another MOV instruction, as indicated by the first two octal digits in the first word ("01"). The next two octal digits ("27") reflect the source operand of the move instruction. In this case they represent the immediate value pointed to by the PC. Remember that at the time the MOV instruction is being executed, the PC will already have been incremented, so it will point at address 157752. In other words, the source operand is the immediate value 352 (in octal).
The destination in this case, represented by the final two octal digits of the MOV instruction ("02") is register R2.
Therefore, these two words represent the instruction "MOV #352, R2".
157754 005211
The next instruction is an increment word instruction, indicated by the four octal digits "0052". The operand to be incremented is represented by the final two octal digits ("11"). In this case, the value to be incremented is the value in the memory address that is contained in register R1. Put another way, this word represents the instruction INC (R1).
Remember that R1 contains the address of the CSR of the paper tape reader. Incrementing this value sets the lowest bit of the CSR to 1, which enables the device.
157756 105711 157760 100376
The next two instructions are a loop to wait for the device to become ready. The paper tape reader is significantly slower than the CPU (at least at the time when physical hardware was being used rather than emulation!) and so the CPU needs to wait until the paper tape reader has a byte of data ready to be loaded into memory.
The first word is a TSTB (test byte) instruction, given by the octal digits "1057". This instruction sets the flags in the Processor Status Word (PSW) based on the value of the operand. The operand in this case is the value in the memory address that is contained in register R1, given by the octal digits "11". Again, recall that this is the CSR of the paper tape reader.
Bit 7 of the CSR value represents the DONE status bit of the paper tape reader. At the same time, the highest bit of any signed binary value represents the sign of that value with a "1" representing a negative value and a "0" representing a positive value. Therefore when bit 7 of the low byte of the CSR is set to 1, that will mean that the byte has a negative value whereas if bit 7 is zero, the CSR will have a positive value.
That brings us onto the second of these instructions, the branch instruction, which is a BPL instruction, or branch if plus, given by the octal digits ("100"). This instruction tests the value of the "N" bit in the PSW. If the value tested, in this case the low byte of the CSR of the paper tape reader, is negative, the N bit will be set, otherwise the N bit will not be set.
The branch instruction will branch if the value of the CSR was positive, meaning that bit 7 was not set, which in turn means that the paper tape reader does not yet have a value ready to be loaded into memory.
The location to branch to if the value of the CSR is positive is given by the offset portion of the branch instruction, which has the octal value 376. Without digressing into a whole discussion about how 2's complement numbers are structured, suffice it to say that this is a branch offset of -2 words.
Recall that the PC will have been incremented before the branch instruction is executed, so it will have the value 157762. Therefore the -2 offset, two words back from the value of the PC, will branch back to address 157756, which is the TSTB instruction again.
In other words, these two instructions loop until bit 7 of the CSR of the paper tape reader is set, indicating that the paper tape reader has a value ready to be loaded into memory.
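The wait loop can be mimicked in Python. This is a rough sketch, not a model of the real device: `FakeReader` is a hypothetical stand-in that reports DONE after a fixed number of polls, and the function names are made up for illustration.

```python
DONE = 0o200  # bit 7 of the CSR


class FakeReader:
    """Hypothetical stand-in for the tape reader: DONE appears after a few polls."""

    def __init__(self, polls_until_done):
        self.polls_left = polls_until_done

    def read_csr(self):
        if self.polls_left > 0:
            self.polls_left -= 1
            return 0o000  # DONE clear: no byte ready yet
        return DONE       # DONE set: a byte is ready


def n_flag_for_byte(value):
    """N flag as TSTB would set it: the sign bit (bit 7) of the low byte."""
    return (value & 0o377) >= 0o200


def wait_for_done(reader):
    """Poll the CSR until the N flag is set; return how many polls it took."""
    polls = 0
    while True:
        polls += 1
        if n_flag_for_byte(reader.read_csr()):  # TSTB (R1)
            return polls
        # BPL: N is clear, so branch back and test again


print(wait_for_done(FakeReader(3)))  # 4
```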
157762 116162 157764 000002 157766 157400
These next three words represent a single MOVB (move byte) instruction, as reflected by the first two octal digits of the first word ("11").
The source operand of the move instruction is specified by the second two octal digits ("61"). This value means that the source is the value of a memory location obtained by adding the value in the R1 register (i.e. the address of the CSR of the paper tape reader, "177 550") to the value specified in the next word of the instruction ("000002"). In other words, the source address is "177 552", which is the data buffer register of the paper tape reader.
The destination operand of the move instruction is specified by the final two octal digits ("62") of the first word. This value means that the destination is the value of a memory location obtained by adding the value in the R2 register (which currently contains the octal value 352) plus the value specified in the final word of the instruction ("157400"). This gives a destination memory address of "157752".
Note that this value is actually one of the addresses that make up the bootstrap loader program itself. In fact, it is the memory address location of the value placed in register R2 and just used to determine the memory address location where the byte read from the paper tape is to be stored.
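Both operand addresses of the MOVB can be reproduced with the same small calculation. This is a sketch with a made-up helper name; the register values are the ones loaded by the two MOV instructions earlier in the listing.

```python
def indexed_address(register_value, index_word):
    """Effective address for an indexed operand: register plus index word."""
    return (register_value + index_word) & 0o177777  # wrap at 16 bits


r1 = 0o177550  # CSR of the paper tape reader, loaded by the first MOV
r2 = 0o000352  # offset loaded by the second MOV

source = indexed_address(r1, 0o000002)
destination = indexed_address(r2, 0o157400)
print(f"{source:06o} {destination:06o}")  # 177552 157752
```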
157770 005267 157772 177756
This next instruction is an INC (increment) instruction, as indicated by the octal value "0052". The location to be incremented is specified as a PC offset in the next word. The value "177756" is -18 (decimal) in 2's complement. The value in the PC, after the offset word has been read, is "157774". Subtracting 18 (22 in octal) from this gives a memory address of "157752".
So, in summary, this instruction increments the value in memory location "157752", which is the location that holds the offset loaded into R2.
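For anyone who wants to check the 2's complement step, here is a short Python sketch. The helper name `to_signed16` is my own; the PC value is the one reached after the offset word at 157772 has been read.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's complement signed integer."""
    return word - 0o200000 if word & 0o100000 else word


offset = to_signed16(0o177756)
pc = 0o157774  # PC after the offset word at 157772 has been read
target = (pc + offset) & 0o177777
print(offset, f"{target:06o}")  # -18 157752
```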
157774 000765
The final instruction in the bootstrap loader program is an unconditional branch (BR). The branch opcode is "0004", which leaves a branch offset of 365 in octal, or -11 in decimal. After reading this instruction the PC will have the value "157776". Branching back 11 means subtracting 11 words (or 22 bytes) from that value, which means that this instruction branches back to memory location "157750".
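The branch arithmetic works the same way for this instruction and for the BPL in the wait loop earlier, so a single Python sketch covers both. The function name is made up here, and it assumes the usual branch layout of an opcode in the high byte and a signed word offset in the low byte.

```python
def branch_target(instruction_address, word):
    """Target of a branch: PC after the fetch plus twice the signed 8-bit offset."""
    offset = word & 0o377                # the low byte holds the offset in words
    if offset & 0o200:                   # sign bit set: negative offset
        offset -= 0o400
    pc = instruction_address + 2         # the PC already points past the branch
    return (pc + 2 * offset) & 0o177777  # the offset counts words, addresses count bytes


print(f"{branch_target(0o157774, 0o000765):06o}")  # 157750
print(f"{branch_target(0o157760, 0o100376):06o}")  # 157756
```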
157776 177550
The final word of the bootstrap loader is the address of the CSR of the input device, in this case the CSR of the paper tape reader.
Summary of code
Here is a summary of the bootstrap loader:
1. Move address of paper tape reader CSR into R1
2. Move destination address offset into R2
3. Enable the paper tape reader and wait for a byte to become available
4. Move a byte from the paper tape reader buffer into the memory address 157400 + the destination address offset in R2
5. Increment the destination address offset
6. Loop back to number 2
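The steps above can be traced with a small Python model. This is a simplified sketch under stated assumptions: the device handshake in step 3 is skipped (a byte is always ready), the tape bytes are made-up values rather than real tape contents, and the names `run_loop`, `OFFSET_WORD` and `BASE` are my own. It models memory rather than a plain counter because, as noted earlier, the first destination address is the offset word itself.

```python
OFFSET_WORD = 0o157752  # holds the 000352 operand of MOV #352, R2
BASE = 0o157400         # index word of the MOVB destination


def run_loop(tape_bytes):
    """Run steps 2 to 6 once per tape byte; return where each byte was stored."""
    memory = bytearray(1 << 16)
    memory[OFFSET_WORD] = 0o352  # low byte of 000352 (the high byte stays 0)
    stored_at = []
    for byte in tape_bytes:  # step 3 simplified: a byte is always ready
        # Step 2: reload R2 from the (possibly modified) offset word
        r2 = memory[OFFSET_WORD] | (memory[OFFSET_WORD + 1] << 8)
        # Step 4: store the byte at 157400 + R2
        destination = (BASE + r2) & 0o177777
        memory[destination] = byte & 0o377
        stored_at.append(destination)
        # Step 5: increment the offset word in memory
        word = memory[OFFSET_WORD] | (memory[OFFSET_WORD + 1] << 8)
        word = (word + 1) & 0o177777
        memory[OFFSET_WORD] = word & 0o377
        memory[OFFSET_WORD + 1] = word >> 8
    return stored_at


print([f"{address:06o}" for address in run_loop([0o100, 0o123])])
```

With these two made-up bytes, the first is stored at 157752 and the second at 157501, because the byte just stored is what gets incremented and reloaded as the next offset.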
The sharper-eyed reader might have noticed that this code appears to be an infinite loop. So, how does it ever stop reading bytes from the paper tape? That's the subject of the next in this series of posts.
A useful reference for understanding the operation of the PC11 paper tape reader, particularly the location of the various memory mapped I/O registers, is the PC11 control manual.
I mentioned in my post on a really useful resource for learning PDP-11 that there was a series of training videos made by DEC in 1977 that describe a lot of the features of the PDP-11. Two of them in particular describe the bootstrap loading process in detail: Tape 23 - IO Programming part C and Tape 24 - IO Programming part D.