Reverse engineering PDP-11 BASIC: Part 12
In this post I'll describe some of the BASIC commands that use runtime state storage, including GOSUB, GOTO and RETURN.
For context and a list of other posts on this topic, see the PDP-11 BASIC reverse engineering project page.
The GOSUB command is used to jump to a subroutine starting at a specific line in the program code. For example:
GOSUB 100
This command will jump to line 100 and execute code there until a RETURN command is encountered, at which point control will return to the line after the GOSUB command.
The code is very simple:
004204 016700 MOV 13660, R0
004210 052700 BIS #20000, R0
004214 104512 TRAP 112
The currently executing line number, which is stored in memory location 13660, is moved to R0. The value 20000 is then logically OR'd into the line number (BIS sets bits rather than masking them), and TRAP 112 is used to store this calculated return address value in the runtime state.
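As a rough illustration of what the BIS instruction does to the line number, here is a minimal Python sketch. The names RETURN_TYPE and encode_return_entry are mine, not taken from the interpreter, and the sketch assumes the entry is a single 16-bit word.

```python
RETURN_TYPE = 0o20000  # the bit set by BIS #20000, R0


def encode_return_entry(line_number: int) -> int:
    """Tag a line number as a return address entry (BIS is a bitwise OR)."""
    return (line_number | RETURN_TYPE) & 0xFFFF


# Line 100 decimal is 144 octal, so the stored word would be 20144 octal.
entry = encode_return_entry(100)
print(oct(entry))
```

Note that this tagging only round-trips for line numbers below 20000 octal, since a larger number would already have that bit (or a higher one) set.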
Once the return address has been stored by the code above, execution continues with the GOTO code, which follows immediately after the GOSUB code.
The GOTO command is very similar to the GOSUB command, except that control never returns from a GOTO, so storing a return address is not required. Apart from that, both involve jumping to a line number in the code and continuing with execution from there.
Here is the GOTO code (which is also used by GOSUB):
004216 104410 TRAP 10
004220 104474 TRAP 74
004222 001005 BNE 4236
004224 012767 MOV #1, 13664
004232 000167 JMP 3032
004236 104405 TRAP 5
First, a TRAP 10 is used to get the numeric value in the GOTO/GOSUB command. This is the line number to be executed next. TRAP 10 will store this value in R0. Then, TRAP 74 is used to check whether the line number stored in R0 exists in the program. If the line number is found, the zero flag will be set and R1 will point at the location of the command specified by the line number.
If the line number is not found, control jumps to the TRAP 5 instruction, and an error code is generated.
Otherwise, 1 is moved into the memory location 13664 to indicate that the program is running (because GOTO can be used a bit like RUN to start a program running) and then control jumps back to the main syntax parsing loop to execute the next instruction.
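The control flow of the GOTO code can be modelled roughly as follows. This is only a sketch under my own assumptions: the program is represented as a Python dict keyed by line number, the "running" flag at 13664 is a dict entry, and UndefinedLineError is a hypothetical stand-in for whatever error TRAP 5 reports.

```python
class UndefinedLineError(Exception):
    """Hypothetical stand-in for the error generated by TRAP 5."""


def goto(program: dict, target: int, state: dict) -> int:
    """Model of the GOTO code at 4216; target is the value from TRAP 10."""
    if target not in program:             # TRAP 74 leaves Z clear, BNE 4236
        raise UndefinedLineError(target)  # TRAP 5
    state["running"] = 1                  # MOV #1, 13664
    return target                         # JMP 3032: parsing resumes at this line


program = {10: "GOSUB 100", 20: "END", 100: "RETURN"}
state = {}
next_line = goto(program, 100, state)
```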
Before explaining how RETURN works, there are a couple of additional TRAPs that are used that haven't been encountered before. First we have TRAP 134, which positions R3 at the beginning of the runtime state storage.
Here's the code:
002362 016703 MOV 13666, R3
002366 005203 INC R3
002370 006203 ASR R3
002372 000241 CLC
002374 006303 ASL R3
002376 000207 RTS PC
First, the content of memory address 13666 is moved into R3. When a program is running, this will contain the maximum memory address of the program code storage area. This value will then be incremented. It is possible that the maximum memory address of the program code area may have been an odd value, which needs to be aligned to an even value.
Therefore, the value in R3 is shifted right, the carry bit is cleared, and then R3 is shifted left again. An odd address is rounded up to the next even address, while an even address comes back unchanged. This ensures that (a) R3 is no smaller than the maximum memory address of the program code area and (b) it is aligned on a word boundary.
This value is then returned in R3.
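The arithmetic performed by TRAP 134 can be checked with a small Python sketch. The function name is my own, and the sketch assumes the address is below 100000 octal so that the sign-preserving behaviour of ASR does not come into play.

```python
def align_to_word(addr: int) -> int:
    """Model of TRAP 134's arithmetic on the value read from 13666."""
    r3 = (addr + 1) & 0xFFFF  # INC R3
    r3 >>= 1                  # ASR R3 (drops the low bit)
    r3 = (r3 << 1) & 0xFFFF   # ASL R3 (low bit is now zero)
    return r3


print(oct(align_to_word(0o1001)))  # odd input
print(oct(align_to_word(0o1000)))  # even input
```

An odd input such as 1001 octal comes back as 1002, and an even input such as 1000 comes back as 1000.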
The other TRAP we need to discuss before looking at the RETURN command is TRAP 120. This is used to delete the number of bytes specified in R4 from the location in the runtime state storage specified by R3.
The first part of the code is unique to TRAP 120:
001610 010301 MOV R3, R1
001612 010102 MOV R1, R2
001614 060401 ADD R4, R1
001616 000404 BR 1630
The value in R3 (the location from which the bytes are to be deleted) is moved into R1, with a backup copy made in R2. The number of bytes to be deleted is added to R1 and then control branches to address 1630.
Now, the instruction(s) at address 1630 are part of TRAP 76, which was discussed in Part 7, but it's no harm to go through them here and see how they are re-used as part of TRAP 120.
Here they are:
001630 020105 CMP R1, R5
001632 103002 BCC 1640
001634 112123 MOVB (R1)+, (R3)+
001636 000774 BR 1630
001640 010305 MOV R3, R5
001642 010201 MOV R2, R1
001644 000207 RTS PC
R1 is compared to R5, which, when a program is running, contains the upper extent of the runtime state storage. If R1 is equal to or greater than R5, meaning we have reached the end of the runtime state, the program jumps ahead to 1640.
Otherwise, we overwrite the bytes at address R3 with the remaining bytes of the runtime state. R1 points at the bytes after the runtime state entry to be deleted and R3 points at the entry to be overwritten. A byte is moved from the address in R1 to the address in R3 and then both pointers are incremented. The code then loops around to address 1630 and bytes are copied until R1 reaches the end of the runtime state.
When the loop ends, R3 will contain the address of the end of the runtime state. This value is now moved to R5, the address where the end of the runtime state is normally kept.
Finally R2, which contains a backup of the original value of R1 is now moved back to R1 and then control returns from the subroutine.
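To make the byte shuffling concrete, here is a small Python model of TRAP 120 together with the shared tail at 1630. It is a sketch only: memory is a bytearray, registers are plain integers, and the function name and return convention are my own.

```python
def delete_bytes(mem: bytearray, r3: int, r4: int, r5: int) -> tuple:
    """Remove r4 bytes at address r3; r5 is the end of the runtime state.

    Returns (r1, r5) as the registers would be left on return.
    """
    r1 = r3                # MOV R3, R1
    r2 = r1                # MOV R1, R2 (backup copy)
    r1 += r4               # ADD R4, R1: first byte to keep
    while r1 < r5:         # CMP R1, R5 / BCC 1640 exits once R1 >= R5
        mem[r3] = mem[r1]  # MOVB (R1)+, (R3)+
        r1 += 1
        r3 += 1
    r5 = r3                # MOV R3, R5: new end of runtime state
    r1 = r2                # MOV R2, R1
    return r1, r5


mem = bytearray(b"AABBCCDD")
r1, r5 = delete_bytes(mem, 2, 2, 8)
print(mem[:r5])
```

Deleting the two bytes "BB" leaves "AACCDD" as the live runtime state, and the end pointer moves down by two. The stale bytes beyond the new end are simply left in place.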
With that groundwork completed, we can now look at the RETURN command to see how it works.
Here's the code:
004246 005046 CLR -(SP)
004250 012704 MOV #20000, R4
004254 104534 TRAP 134
004256 001424 BEQ 4330
004260 012700 MOV #17777, R0
004264 104514 TRAP 114
004266 001403 BEQ 4276
004270 010316 MOV R3, (SP)
004272 005723 TST (R3)+
004274 000773 BR 4264
004276 012603 MOV (SP)+, R3
004300 001413 BEQ 4330
004302 011300 MOV (R3), R0
004304 040400 BIC R4, R0
004306 005200 INC R0
004310 012704 MOV #2, R4
004314 104520 TRAP 120
004316 020027 CMP R0, #1
004322 001677 BEQ 4122
004324 104474 TRAP 74
004326 000736 BR 4224
004330 104411 TRAP 11
Let's see how this works.
004246 005046 CLR -(SP)
004250 012704 MOV #20000, R4
First, a zero word is pushed onto the stack and the value 20000 is moved into R4.
004254 104534 TRAP 134
004256 001424 BEQ 4330
TRAP 134 is then used to position R3 at the beginning of the runtime state storage. If TRAP 134 sets the zero flag then control branches to address 4330, where TRAP 11 generates an error.
004260 012700 MOV #17777, R0
The value 17777 is moved into R0.
004264 104514 TRAP 114
004266 001403 BEQ 4276
TRAP 114 is used to locate a return address in the runtime state storage. Return addresses have a "type" of 20000. The mask value of 17777 will mean that any runtime state entry with a "type" of 20000 will be returned. Since R3 contains the address of the beginning of the runtime state, the first return address in the runtime state will be returned.
If the zero flag is set, meaning that a matching value was not found in the runtime state, control jumps to address 4276.
004270 010316 MOV R3, (SP)
004272 005723 TST (R3)+
004274 000773 BR 4264
Otherwise, the location of the return address entry found in the runtime state (the value in R3) overwrites the zero word pre-pushed onto the stack. The word at R3 is then tested and R3 is advanced by two bytes, stepping past that entry.
Control then branches back up to address 4264 to read another return address from the runtime state. This is because GOSUBs can be nested within each other, so there may be more than one return address in the runtime state.
The return address we are looking for is the one that was added to the runtime state last. So, the code loops looking for return addresses until no more can be found (i.e. TRAP 114 fails to find an entry and sets the zero flag) and then the return address that is used is the one that was most recently found, whose location will be stored at the top of the stack.
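The search loop can be sketched in Python as shown here. Several things are assumptions of mine rather than facts from the disassembly: the runtime state is modelled as a list of 16-bit words, find_entry is a hypothetical stand-in for TRAP 114 that matches on the bits outside the mask, and None is used in place of the zero word to mean "nothing found".

```python
RETURN_TYPE = 0o20000  # value loaded into R4
LINE_MASK = 0o17777    # value loaded into R0


def find_entry(state, start, type_bits, mask):
    """Hypothetical TRAP 114: index of the first matching word, or None."""
    for i in range(start, len(state)):
        if (state[i] & ~mask & 0xFFFF) == type_bits:
            return i
    return None


def find_last_return(state):
    """Scan the whole runtime state, remembering the last return entry."""
    last = None                # CLR -(SP)
    pos = 0                    # TRAP 134: start of the runtime state
    while True:
        hit = find_entry(state, pos, RETURN_TYPE, LINE_MASK)  # TRAP 114
        if hit is None:        # BEQ 4276
            return last
        last = hit             # MOV R3, (SP)
        pos = hit + 1          # TST (R3)+ steps over the one-word entry


# Two nested GOSUBs (from lines 10 and 20) with an unrelated entry between them.
state = [0o20012, 0o40005, 0o20024]
print(find_last_return(state))
```

With this state the function returns index 2, the entry for the innermost GOSUB, which is the behaviour described above.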
004276 012603 MOV (SP)+, R3
004300 001413 BEQ 4330
Once the test to locate another return address fails, control continues at address 4276. The value from the top of the stack (the location of the most recently identified return address) is popped into R3. If this value is zero, a return address has not been found and therefore an error is generated by branching to 4330.
004302 011300 MOV (R3), R0
004304 040400 BIC R4, R0
004306 005200 INC R0
The value of the return address, stored at memory address R3, is moved into R0. The bit clear command (BIC) is used to remove the 20000 from the return address value, leaving just the line number of the GOSUB command. This value is then incremented.
004310 012704 MOV #2, R4
004314 104520 TRAP 120
These next two lines are used to delete the return address from the runtime state. TRAP 120 will delete the number of bytes specified in R4 from the runtime state at the location specified in R3.
004316 020027 CMP R0, #1
004322 001677 BEQ 4122
The value in R0 is compared to 1, which would indicate that the line number stored in the return address was zero, in which case control branches to 4122. The code there clears the currently executing line of the program (by clearing memory address 13660) and then jumps to the syntax parsing loop.
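The decoding of the stored word and the special case for a zero line number can be summarised in a short Python sketch. The function name is mine, and returning None for the branch to 4122 is just a modelling convenience.

```python
RETURN_TYPE = 0o20000  # value held in R4 while the entry is decoded


def resume_line(entry: int):
    """Turn a stored return address word into the value handed to TRAP 74."""
    r0 = entry & ~RETURN_TYPE & 0xFFFF  # BIC R4, R0: strip the type bit
    r0 += 1                             # INC R0
    if r0 == 1:                         # CMP R0, #1 / BEQ 4122
        return None                     # stored line number was zero
    return r0


print(resume_line(0o20144))  # GOSUB on line 100 (144 octal)
print(resume_line(0o20000))  # stored line number of zero
```

For a GOSUB on line 100 the result is 101, the value passed to TRAP 74. A stored line number of zero takes the other path.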
004324 104474 TRAP 74
004326 000736 BR 4224
004330 104411 TRAP 11
Otherwise, TRAP 74 is used to get the next line in the program after the value specified in R0. When this returns, R1 will point at the next line to be executed in the program code area. Then, control branches to address 4224 in the GOTO code, which marks the program as running and jumps to the syntax parsing loop.
The TRAP 11 is used in case of errors, as described above.