Reverse engineering PDP-11 BASIC: Part 3
Updated: Feb 10, 2021
I was in the middle of writing a post about the analysis of the BASIC startup code, but it was getting massive, and I'm actually going to have to split it into a few posts to make it manageable. A few more posts will follow in fairly quick succession.
In this post I'm going to talk about a few more of the TRAP calls. They are used in the setup code, so they are a necessary prerequisite before moving on to the setup/option parsing code. I'll also describe the use of odd-numbered TRAP calls.
Here is the TRAP 70 handling code:
000430 020227 CMP R2, #60
000434 002415 BLT 470
000436 020227 CMP R2, #71
000442 003002 BGT 450
000444 000264 SEZ
000446 000207 RTS PC
000450 020227 CMP R2, #101
000454 002405 BLT 470
000456 020227 CMP R2, #132
000462 003002 BGT 470
000464 000257 CCC
000466 000207 RTS PC
000470 000257 CCC
000472 000262 SEV
000474 000207 RTS PC
The purpose of this code is to look at the value stored in R2 and set flags according to its value. It's actually quite simple and can be broken up into a few logical pieces.
000430 020227 CMP R2, #60
000434 002415 BLT 470
000436 020227 CMP R2, #71
000442 003002 BGT 450
000444 000264 SEZ
000446 000207 RTS PC
The first thing that happens is that the value in R2 is compared to "0" (ASCII 60, in octal like all the values here). If the ASCII code is less than 60, control jumps to address 470. Otherwise (the code is 60 or greater) the value in R2 is compared to "9" (ASCII 71). If the ASCII code is greater than 71, control jumps to address 450.
Otherwise, we have determined that the ASCII code in R2 is between "0" and "9" (inclusive), meaning that R2 contains the ASCII code of a digit. In this case, we set the zero flag ("SEZ") and return.
000450 020227 CMP R2, #101
000454 002405 BLT 470
000456 020227 CMP R2, #132
000462 003002 BGT 470
000464 000257 CCC
000466 000207 RTS PC
Remember that if the value in R2 was greater than "9" (ASCII 71) then we jumped to 450. At 450 the value is compared to "A" (ASCII 101); if it is less than 101 we jump to 470. If it is 101 or greater, it is then compared to "Z" (ASCII 132). If it is greater than "Z" then we jump to 470.
Otherwise, we have determined that the ASCII code in R2 is between "A" and "Z" inclusive. In this case we clear all status flags ("CCC") and return.
000470 000257 CCC
000472 000262 SEV
000474 000207 RTS PC
Finally, if we're at address 470 we have determined that the ASCII code in R2 is not in the ranges "0"-"9" or "A"-"Z", in which case we clear all status bits, set the overflow bit, and return.
In summary, therefore, this code tests and returns as follows:
If the value in R2 is an ASCII digit, set the zero flag
If the value in R2 is an ASCII upper-case letter, clear all flags
Otherwise clear all flags and set the overflow flag
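The three outcomes can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not a transcription of the original: the function name classify_char and the returned labels are my own inventions, and the real routine reports its result through the processor's condition flags rather than a return value.

```python
def classify_char(code):
    """Model the TRAP 70 test on the character code held in R2.

    The return value is a hypothetical label standing in for the flags:
      "Z" -> zero flag set      (ASCII digit)
      ""  -> all flags cleared  (upper-case letter)
      "V" -> overflow flag set  (anything else)
    """
    if 0o60 <= code <= 0o71:      # "0" .. "9"
        return "Z"
    if 0o101 <= code <= 0o132:    # "A" .. "Z"
        return ""
    return "V"


for ch in "7Q?":
    print(repr(ch), repr(classify_char(ord(ch))))
```

A caller that only cares about digits can follow the call with a single BNE, which is the pattern the TRAP 10 code uses.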
TRAP 10 is an ASCII-to-integer converter. It converts a string of digits, pointed to by R1, into a numeric value stored in R0. Here's the code:
011440 005000 CLR R0
011442 004767 JSR PC, 1224
011446 004767 JSR PC, 430
011452 001013 BNE 11502
011454 162702 SUB #60, R2
011460 006300 ASL R0
011462 060002 ADD R0, R2
011464 006300 ASL R0
011466 006300 ASL R0
011470 060200 ADD R2, R0
011472 032700 BIT #160000, R0
011476 001761 BEQ 11442
011500 104441 TRAP 41
011502 005301 DEC R1 ; decrement R1
011504 000207 RTS PC
So, let's take a look line-by-line.
011440 005000 CLR R0
Firstly, the value in R0 is set to zero.
011442 004767 JSR PC, 1224
011446 004767 JSR PC, 430
Then there are two subroutine calls. The first, the jump to address 1224, is the same as invoking TRAP 72. This code gets the next non-whitespace character in the string pointed to by R1 and stores it in R2. The second subroutine call, to address 430, is the same as invoking TRAP 70, which checks whether the value in R2 is a digit. If the ASCII code in R2 represents a digit, the zero flag is set.
011452 001013 BNE 11502
If the value is non-numeric (i.e. the previous call returned with the zero flag clear) then branch to 11502, which is the end of the subroutine. In other words, if the character is not a digit we exit from the subroutine.
011454 162702 SUB #60, R2
At this point, therefore, we know that R2 contains an ASCII digit representing a number. We therefore subtract 60 from the value in R2, which will convert the ASCII digit in R2 into the number it represents. In other words, if R2 contains "1" (ASCII 61) this command will change the value of R2 into 1.
011460 006300 ASL R0
011462 060002 ADD R0, R2
011464 006300 ASL R0
011466 006300 ASL R0
011470 060200 ADD R2, R0
Thinking about these five lines might make your head spin, because you need to map octal to decimal and back again while thinking about the impact of each calculation. What we need to do is take the currently running tally of the value, multiply it by ten and add the next digit.
Each shift left of the value in R0 is a multiplication by 2. Also, remember that R2 contains the next, least significant, decimal digit to be added. So, shifting R0 left gives us (2 x R0), which is added to R2. So now R2 contains (R2 + 2R0), where R0 means its original value. Then R0 is shifted left again, so now it contains 4 times the original value. Then it is shifted left again, so now it contains 8 times the original value. Finally R2 is added to R0, but R2 contains R2 + 2R0, so the upshot is that after these lines R0 will contain (10R0 + R2): ten times its original value plus the value that used to be in R2.
In summary, the effect of these lines is to take the current value, multiply it by ten and add the next digit.
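If the register shuffling is hard to follow, it can help to replay the five instructions one at a time. The Python below is just a sketch that mirrors each instruction on ordinary integers; the function name times_ten_plus_digit is mine, and the 0xFFFF masking is there to imitate the PDP-11's 16-bit registers.

```python
WORD = 0xFFFF  # PDP-11 registers are 16 bits wide


def times_ten_plus_digit(r0, r2):
    """Replay the five instructions at 11460-11470 on plain integers."""
    r0 = (r0 << 1) & WORD   # ASL R0      R0 = 2 * original R0
    r2 = (r2 + r0) & WORD   # ADD R0, R2  R2 = digit + 2 * original R0
    r0 = (r0 << 1) & WORD   # ASL R0      R0 = 4 * original R0
    r0 = (r0 << 1) & WORD   # ASL R0      R0 = 8 * original R0
    r0 = (r0 + r2) & WORD   # ADD R2, R0  R0 = 10 * original R0 + digit
    return r0


print(times_ten_plus_digit(12, 3))  # running total 12, next digit 3 -> 123
```

The trick is that 10 = 8 + 2, so the code stashes the "times 2" part in R2 alongside the digit while R0 carries on to "times 8".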
011472 032700 BIT #160000, R0
011476 001761 BEQ 11442
011500 104441 TRAP 41
Having incorporated the next digit into the running total in R0, the highest three bits are checked (in other words, the total must stay below 20000 octal, which is 8192 decimal). If any of them are non-zero this is an error and TRAP 41 is generated; otherwise control loops back to the top to check for, and process, another digit.
011502 005301 DEC R1
011504 000207 RTS PC
Finally, before returning, R1 is decremented so that it points at the last digit of the number just converted, and then control returns from the subroutine.
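Putting the pieces together, here is a rough Python model of the whole routine. It is a sketch under several assumptions of mine: the function name ascii_to_int is hypothetical, whitespace is taken to mean spaces and tabs, the bookkeeping on R1 is left out entirely, and TRAP 41 is represented by raising OverflowError.

```python
def ascii_to_int(text):
    """Sketch of TRAP 10: convert leading digits of text to an integer."""
    r0 = 0                                   # CLR R0
    for ch in text:
        if ch in " \t":                      # stand-in for TRAP 72 skipping whitespace
            continue
        if not ("0" <= ch <= "9"):           # stand-in for TRAP 70; BNE 11502 exits
            break
        digit = ord(ch) - 0o60               # SUB #60, R2
        r0 = (r0 * 10 + digit) & 0xFFFF      # the five shift/add instructions
        if r0 & 0o160000:                    # BIT #160000, R0
            raise OverflowError("TRAP 41")   # any of the top three bits set
    return r0


print(ascii_to_int("120 PRINT"))
```

One consequence of calling the whitespace-skipping routine on every pass, at least in this model, is that digits separated by spaces are read as a single number.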
TRAP 4 is used to display the line number currently being executed by the running program. Here's the code:
002312 162706 SUB #10, SP
002316 010600 MOV SP, R0
002320 016701 MOV 13660, R1
002324 104412 TRAP 12
002326 010600 MOV SP, R0
002330 005720 TST (R0)+
002332 105066 CLRB 7(SP)
002336 104466 TRAP 66
002340 062706 ADD #10, SP
002344 000207 RTS PC
Here's the code, line-by-line:
002312 162706 SUB #10, SP
The calculations are done using stack space, so firstly, some space is made on the stack by subtracting 10 (octal, i.e. 8 bytes) from the current value of the stack pointer.
002316 010600 MOV SP, R0
The current value of the stack pointer is then moved into R0.
002320 016701 MOV 13660, R1
Then, the value at memory address 13660 is moved to R1. Address 13660 is used to store the number of the line currently being executed by the running BASIC program.
002324 104412 TRAP 12
TRAP 12 is an integer-to-ASCII subroutine, used to convert the numeric value in R1 to a string at the location pointed to by R0. I'll walk through this subroutine in another post. For now, suffice it to say that afterwards the buffer on the stack will contain a string representation of the value in R1.
002326 010600 MOV SP, R0
The stack pointer is moved into R0 again, to move R0 back to the beginning of the string that now represents the value from address 13660.
002330 005720 TST (R0)+
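The word at R0 is tested and then R0 is incremented. Because TST is a word instruction, the auto-increment adds 2 to R0, so R0 now points two bytes past the start of the buffer.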
002332 105066 CLRB 7(SP)
Next, the byte at SP+7 is set to zero. This will add a zero terminator to the end of the string representing the current line number.
002336 104466 TRAP 66
TRAP 66 is used to display the string pointed to by R0.
002340 062706 ADD #10, SP
002344 000207 RTS PC
Finally, the stack pointer is restored to its original value and control returns from the subroutine.
Odd numbered TRAPs
In my previous post about TRAPs I explained how all even TRAP values are used like function calls but, at the time, I didn't explain the purpose and use of odd TRAP values. Well, odd TRAPs are used to represent error conditions. I'll explain how it works.
Firstly, if you re-read the first TRAPs post, you'll see that the TRAP instruction opcode is shifted right to see whether the TRAP value is odd or even. If it is odd, control jumps to instruction 126, and that's where we'll pick up the story.
Here's the code starting at instruction 126, with SP pointing at the location of the TRAP instruction:
000126 042716 BIC #177600, (SP)
000132 012602 MOV (SP)+, R2
000134 020227 CMP R2, #100
000140 003011 BGT 164
000142 005067 CLR 13674
000146 012767 MOV #1, 13700
000154 016706 MOV 13712, SP
000160 012746 MOV #4122, -(SP)
000164 010146 MOV R1, -(SP)
000166 010201 MOV R2, R1
000170 012700 MOV #233, R0
000174 010146 MOV R1, -(SP)
000176 104412 TRAP 12
000200 104402 TRAP 2
000202 012700 MOV #226, R0
000206 104466 TRAP 66
000210 104404 TRAP 4
000212 104402 TRAP 2
000214 005726 TST (SP)+
000216 001001 BNE 222
000220 104516 TRAP 116
000222 012601 MOV (SP)+, R1
000224 000207 RTS PC
Let's take a look at this line-by-line.
000126 042716 BIC #177600, (SP)
Firstly, all except for the low 7 bits are masked away, using the BIC (bit clear) instruction: the mask 177600 has bits 7 to 15 set, and BIC clears exactly those bits.
000132 012602 MOV (SP)+, R2
The TRAP "parameter" value is popped from the stack and stored in R2.
000134 020227 CMP R2, #100
000140 003011 BGT 164
The value in R2 is then compared to 100. This checks whether the TRAP value is greater than 100. If it is, control jumps to address 164.
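The mask and the comparison are easy to check with a little Python. This is only an illustration: the function names are mine, and the example word 042211 is a made-up value chosen to show the masking, not something read from the real stack.

```python
MASK = 0o177600  # operand of BIC #177600, (SP): bits 7 to 15 set


def trap_parameter(stacked_word):
    """BIC clears every destination bit that is set in the mask."""
    return stacked_word & ~MASK & 0xFFFF


def runs_reset_code(parameter):
    """CMP R2, #100 then BGT 164: only values above 100 (octal) skip ahead."""
    return parameter <= 0o100


word = 0o042211  # hypothetical stacked word, for illustration only
param = trap_parameter(word)
print(oct(param), runs_reset_code(param))
```

Since BIC leaves bits 0 to 6 alone, the parameter can range from 0 to 177 octal, which is why a comparison against 100 octal (bit 6 on its own) makes sense.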
000142 005067 CLR 13674
000146 012767 MOV #1, 13700
000154 016706 MOV 13712, SP
000160 012746 MOV #4122, -(SP)
If the TRAP parameter value is less than or equal to 100, then this code is run, in addition to the code that follows.
Firstly, the value in 13674 is cleared. This value is set and used when reading data in (e.g. during the BASIC OLD command) to indicate that the data should be read from paper tape. Clearing this value will prevent reading data from paper tape.
Next the value 1 is moved into 13700. This prevents simultaneous reading and writing to I/O devices.
Then the value from 13712 is moved into the stack pointer. This will reset the stack pointer to the default value.
Finally the value 4122 is pushed onto the stack. When this value is popped from the stack and used as a return address, this will cause execution to re-enter the parsing loop and allow interactive BASIC to continue.
000164 010146 MOV R1, -(SP)
Now we move on to the code that is executed irrespective of the odd TRAP value, i.e. execution continues here regardless of whether the TRAP parameter was greater than 100 or not. Firstly, the value in R1 is pushed onto the stack.
000166 010201 MOV R2, R1
Then, the value in R2 (the TRAP "parameter") is moved into R1.
000170 012700 MOV #233, R0
000174 010146 MOV R1, -(SP)
000176 104412 TRAP 12
Next the value 233 is moved into R0. This points at the string " AT LINE ". R1, containing the TRAP parameter value, is pushed onto the stack, and then TRAP 12 is invoked.
TRAP 12 converts the numeric value in R1 into a string, which is placed at R0. In other words, suppose the TRAP parameter value (remember, it has been shifted right by one bit) is 9. Then the result will be to create the string "9 AT LINE ", stored at memory address 233.
000200 104402 TRAP 2
A TRAP 2 is used to create a newline (i.e. a CR-LF pair).
000202 012700 MOV #226, R0
000206 104466 TRAP 66
Now the error string is displayed. Starting at address 226 we (normally) have "ERROR AT LINE ", but because of the code just executed we have placed the error code into the middle of that string. Therefore, using our example error code of 9, at address 226 we find the string "ERROR 9 AT LINE ". TRAP 66 is used to display the string pointed to by R0.
000210 104404 TRAP 4
000212 104402 TRAP 2
TRAP 4 is used to display the current line number of the executing program. This will be displayed straight after the "ERROR 9 AT LINE " string just printed. The TRAP 2 then produces another CR-LF pair.
000214 005726 TST (SP)+
000216 001001 BNE 222
000220 104516 TRAP 116
000222 012601 MOV (SP)+, R1
000224 000207 RTS PC
Then, we do a last bit of tidying up before returning. Firstly the value at SP (the TRAP parameter that was pushed at address 174) is tested and then SP is incremented, which pops it. If the value popped from the stack is not equal to zero, control jumps to 222. If the popped value is zero, TRAP 116 is executed. I'm not certain what this TRAP does yet, but I think it's something to do with tracking memory usage.
The value in R1 is popped off the stack and then control returns to the stored PC on the stack (remember, when the TRAP parameter was 100 or less, this will be address 4122 that was pushed onto the stack earlier).
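To convince myself about where RTS PC ends up, it helps to trace the pushes and pops with a Python list standing in for the stack. This is a sketch with some assumptions baked in: the TRAPs called along the way are assumed to leave the stack as they found it, the function name is mine, and prior_top is a hypothetical placeholder for whatever word is on top of the stack when the reset code is skipped.

```python
def error_handler_stack(parameter, r1, prior_top):
    """Trace the stack from address 142 to 224.

    Returns (restored R1, address used by RTS PC, whether TRAP 116 runs).
    """
    stack = [prior_top]              # hypothetical contents left by the caller
    if parameter <= 0o100:           # BGT 164 not taken
        stack = []                   # MOV 13712, SP resets the stack
        stack.append(0o4122)         # MOV #4122, -(SP)
    stack.append(r1)                 # MOV R1, -(SP) at 164 saves R1
    stack.append(parameter)          # MOV R1, -(SP) at 174 (R1 now holds the parameter)
    popped = stack.pop()             # TST (SP)+
    runs_trap_116 = popped == 0      # BNE 222 skips TRAP 116 when non-zero
    restored_r1 = stack.pop()        # MOV (SP)+, R1
    return_address = stack.pop()     # RTS PC
    return restored_r1, return_address, runs_trap_116


print(error_handler_stack(9, 0o1234, 0o5000))
```

In this model only parameters of 100 or less return to 4122; for larger values RTS PC uses whatever was already on the stack.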