Reverse engineering PDP-11 BASIC: Part 22

daveor
Mar 21, 2021

This post will describe the operation of TRAP 124 and the BASIC DIM and LET commands. For completeness, I'll also describe the DATA and REM commands.

For context and a list of other posts on this topic, see the PDP-11 BASIC reverse engineering project page.

TRAP 124

This TRAP confirms that the value passed in register R0 is between 0 and 255. It is used when defining an array, to check that each dimension is within the acceptable range. If the value is in the valid range, the zero flag is set; otherwise all flags are cleared.

Here's the code:

001772 005700 TST R0
001774 002746 BLT 1712
001776 020027 CMP R0, #377
002002 003343 BGT 1712
002004 000264 SEZ
002006 000207 RTS PC

The value in R0 is tested. If it is less than zero, control branches to address 1712, where all flags are cleared and control is returned. Otherwise R0 is compared to 255 (octal 377). If R0 is greater than 255, control again branches to address 1712.

Otherwise the zero flag is set and control is returned.
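As a rough illustration, the check can be modelled in Python. This is only a sketch: the function name is made up, only the zero flag is modelled, and R0 is treated as a signed 16-bit word because BLT and BGT are signed branches.

```python
def trap_124_zero_flag(r0: int) -> bool:
    """Return the state of the zero flag after the range check.

    True means 0 <= R0 <= 255 (the SEZ path); False stands in for the
    "all flags cleared" path taken via address 1712.
    """
    # Interpret R0 as a signed 16-bit word, as TST/BLT and CMP/BGT do.
    value = r0 & 0xFFFF
    if value & 0x8000:
        value -= 0x10000

    if value < 0:        # TST R0 / BLT 1712
        return False
    if value > 0o377:    # CMP R0, #377 / BGT 1712
        return False
    return True          # SEZ / RTS PC
```

Note that a word such as 177777 (octal) counts as -1 here, so it is rejected by the first test rather than the second.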

DIM command

The DIM command is used for allocating space for an array variable. One- and two-dimensional arrays are allowed, and each dimension can be any value from 0 to 255. Here are some example DIM commands:

DIM A(100)
DIM A1(1, 10)
DIM B(255,255)

Here is the DIM command code:

004332 104544 TRAP 144
004334 102427 BVS 4414
004336 001055 BNE 4472
004340 010446 MOV R4, -(SP)
004342 104472 TRAP 72
004344 020227 CMP R2, #50
004350 001021 BNE 4414
004352 104410 TRAP 10
004354 104524 TRAP 124
004356 001016 BNE 4414
004360 010046 MOV R0, -(SP)
004362 000316 SWAB (SP)
004364 104472 TRAP 72
004366 120227 CMPB R2, #54
004372 001005 BNE 4406
004374 104410 TRAP 10
004376 104524 TRAP 124
004400 001005 BNE 4414
004402 050016 BIS R0, (SP)
004404 104472 TRAP 72
004406 020227 CMP R2, #51
004412 001401 BEQ 4416
004414 104433 TRAP 33
004416 012602 MOV (SP)+, R2
004420 012600 MOV (SP)+, R0
004422 010146 MOV R1, -(SP)
004424 104512 TRAP 112
004426 010200 MOV R2, R0
004430 104512 TRAP 112
004432 010201 MOV R2, R1
004434 000301 SWAB R1
004436 104522 TRAP 122
004440 102413 BVS 4470
004442 104504 TRAP 104
004444 103411 BCS 4470
004446 060005 ADD R0, R5
004450 012601 MOV (SP)+, R1
004452 104472 TRAP 72
004454 020227 CMP R2, #54
004460 001724 BEQ 4332
004462 005301 DEC R1
004464 000167 JMP 2762
004470 104435 TRAP 35
004472 104443 TRAP 43

Let's see how this works.

004332 104544 TRAP 144
004334 102427 BVS 4414
004336 001055 BNE 4472
004340 010446 MOV R4, -(SP)

TRAP 144 is used to get the variable name. If the overflow flag is set, meaning the variable name is invalid, control jumps to 4414 to return an error. If the zero flag is set, the variable was not found, which is what DIM expects. If the zero flag is not set, the variable has already been defined, which is invalid, so control branches to 4472 to return an error. The variable identifier is returned in R4 and is pushed onto the stack for later use.

004342 104472 TRAP 72
004344 020227 CMP R2, #50
004350 001021 BNE 4414

TRAP 72 is used to get the next non-whitespace character. The character, returned in R2, is compared to "(" (ASCII 50; the character codes in this post are octal). If not equal, this is invalid, and control branches to 4414 to return an error.

004352 104410 TRAP 10
004354 104524 TRAP 124
004356 001016 BNE 4414

TRAP 10 is used to get a number pointed to by R1 and then TRAP 124 is used to confirm that the resulting value is in the range 0 to 255. If the array dimension is valid, the zero flag will be set. Otherwise the zero flag will not be set, in which case control jumps to 4414 to return an error.

004360 010046 MOV R0, -(SP)
004362 000316 SWAB (SP)

The array dimension is pushed onto the stack and then the bytes of the word on the top of the stack are swapped, so that the X dimension of the array is in the higher byte of the word on the top of the stack.

004364 104472 TRAP 72
004366 120227 CMPB R2, #54
004372 001005 BNE 4406

The next non-whitespace character is extracted using TRAP 72 and compared to "," (ASCII 54). If it is not a comma, control branches to 4406, skipping the parsing of the second array dimension.

004374 104410 TRAP 10
004376 104524 TRAP 124
004400 001005 BNE 4414

004402 050016 BIS R0, (SP)

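The second dimension is parsed and range-checked in the same way as the first: TRAP 10 reads the number and TRAP 124 confirms it is between 0 and 255, branching to 4414 on failure. The resulting value is then OR'd (BIS) into the word on the top of the stack, so the low byte of that word now holds the second dimension.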

004404 104472 TRAP 72
004406 020227 CMP R2, #51
004412 001401 BEQ 4416
004414 104433 TRAP 33

TRAP 72 is used to get the next non-whitespace character, which is then compared to ")" (ASCII 51). If the character equals a closing bracket, control branches to 4416. Otherwise, an error code is generated.
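The parsing steps so far can be sketched in Python for a single array declaration. This is a simplified model, not a translation: `next_char` and `read_number` are hypothetical stand-ins for TRAP 72 and TRAP 10, numbers are assumed to be plain unsigned decimal digits, and `DimSyntaxError` stands in for the error raised through TRAP 33.

```python
class DimSyntaxError(Exception):
    """Stand-in for the error path at 4414 (TRAP 33)."""


def skip_blanks(text, pos):
    while pos < len(text) and text[pos] == " ":
        pos += 1
    return pos


def next_char(text, pos):
    """Hypothetical TRAP 72: return the next non-blank character and the new position."""
    pos = skip_blanks(text, pos)
    if pos >= len(text):
        return "", pos
    return text[pos], pos + 1


def read_number(text, pos):
    """Hypothetical TRAP 10: read an unsigned decimal number (an assumption)."""
    pos = skip_blanks(text, pos)
    start = pos
    while pos < len(text) and text[pos] in "0123456789":
        pos += 1
    if pos == start:
        raise DimSyntaxError("number expected")
    return int(text[start:pos]), pos


def parse_dims(text, pos=0):
    """Parse "(x)" or "(x, y)" and return (packed dimensions word, new position)."""
    ch, pos = next_char(text, pos)
    if ch != "(":                       # CMP R2, #50 / BNE 4414
        raise DimSyntaxError("'(' expected")
    x, pos = read_number(text, pos)
    if not 0 <= x <= 255:               # TRAP 124 / BNE 4414
        raise DimSyntaxError("dimension out of range")
    word = x << 8                       # MOV R0, -(SP) / SWAB (SP)
    ch, pos = next_char(text, pos)
    if ch == ",":                       # CMPB R2, #54
        y, pos = read_number(text, pos)
        if not 0 <= y <= 255:
            raise DimSyntaxError("dimension out of range")
        word |= y                       # BIS R0, (SP)
        ch, pos = next_char(text, pos)
    if ch != ")":                       # CMP R2, #51
        raise DimSyntaxError("')' expected")
    return word, pos
```

For example, `parse_dims("(1, 10)")` yields a word with 1 in the high byte and 10 in the low byte, while `parse_dims("(256)")` raises `DimSyntaxError`.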

004416 012602 MOV (SP)+, R2
004420 012600 MOV (SP)+, R0
004422 010146 MOV R1, -(SP)

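The array dimensions word is popped from the stack into R2. The variable identifier word is popped from the stack into R0. R1, which now points just past the closing bracket in the program text, is then pushed onto the stack, because R1 is about to be reused as a scratch register.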

004424 104512 TRAP 112
004426 010200 MOV R2, R0
004430 104512 TRAP 112

TRAP 112 is used to push the value from R0 (the variable identifier) into the runtime state storage. The value in R2 is copied into R0 and then this value (the array dimensions) is pushed into the runtime state storage.

004432 010201 MOV R2, R1
004434 000301 SWAB R1

The dimensions of the array, currently in register R2, are copied into R1. The bytes of R1 are swapped so the X dimension of the array is now in the low byte of R1.
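To make the byte layout concrete, here is a small Python sketch of what SWAB does to the packed dimensions word. The helper names are mine and the sketch assumes only the layout described above (first dimension in the high byte, second in the low byte); the condition codes that SWAB also sets are not modelled.

```python
def swab(word: int) -> int:
    """Swap the two bytes of a 16-bit word, like the PDP-11 SWAB instruction."""
    word &= 0xFFFF
    return ((word << 8) | (word >> 8)) & 0xFFFF


def split_bytes(word: int):
    """Return (high byte, low byte) of a 16-bit word."""
    return (word >> 8) & 0xFF, word & 0xFF


# DIM A1(1, 10): R2 holds X = 1 in the high byte and Y = 10 in the low byte.
r2 = (1 << 8) | 10
r1 = swab(r2)                 # MOV R2, R1 followed by SWAB R1
high, low = split_bytes(r1)   # X is now in the low byte of R1
```

Applying `swab` twice gives back the original word, which is why the same instruction is used both to build the word on the stack and to rearrange it here.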

004436 104522 TRAP 122
004440 102413 BVS 4470

TRAP 122 is used to calculate the space required to store the array. If there is not enough runtime state storage available the overflow flag will be set and control will branch to 4470 to return an error. Otherwise, the number of words required to store the array values will be returned in R0.

004442 104504 TRAP 104
004444 103411 BCS 4470

TRAP 104 is used to check whether the number of words specified in R0 is available in the runtime state storage. If the carry flag is set, there is not enough runtime state storage to store the array, so control branches to 4470 to return an error.

004446 060005 ADD R0, R5

Otherwise R0 is added to R5, allocating the space for the array in the runtime state storage.
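The check-then-add pattern is a simple bump allocation. The Python sketch below shows the general idea only: the class, its `limit` field and the use of `MemoryError` are all hypothetical, since this post does not show how TRAP 104 decides how much room is left or what units R0 is measured in.

```python
class RuntimeStorage:
    """Toy model of a storage area that grows upwards from `top`."""

    def __init__(self, top: int, limit: int):
        self.top = top      # plays the role of R5
        self.limit = limit  # hypothetical upper bound of the storage area

    def has_room(self, size: int) -> bool:
        """Stand-in for TRAP 104: True corresponds to carry clear."""
        return self.top + size <= self.limit

    def allocate(self, size: int) -> int:
        """Reserve `size` units and return the start of the new block."""
        if not self.has_room(size):
            # Corresponds to BCS 4470, the error path.
            raise MemoryError("not enough runtime state storage")
        start = self.top
        self.top += size    # ADD R0, R5
        return start
```

Nothing is initialised or copied at this point in the sketch; as in the listing, allocating is just a matter of moving the top-of-storage pointer.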

004450 012601 MOV (SP)+, R1

R1 is restored from the stack.

004452 104472 TRAP 72
004454 020227 CMP R2, #54
004460 001724 BEQ 4332
004462 005301 DEC R1

At first glance this bit of code looks really weird.

TRAP 72 is used to get the next non-whitespace character, which is then compared to "," (ASCII 54). If it is equal, control branches back up to 4332. Otherwise R1 is decremented to undo the side effect of TRAP 72, which will have incremented R1.

The interesting case is the one where the character matches a comma. Remember that R1 has just been popped off the stack, so we know that R1 points at the character after the closing bracket of the array declaration. So this code tests whether there is a comma AFTER the closing bracket. Address 4332 is the first instruction of the DIM handler, where TRAP 144 reads a variable name, so the most plausible reading is that a comma starts another array declaration in the same statement, for example:

DIM A(10), B(5, 5)

Any values left in the registers from the first array (the result of the array size calculation, the ASCII code for the comma) do not appear to matter, since the handler parses the name and dimensions again and recalculates the size before allocating.

There is no mention of a comma after the closing bracket of a DIM command as valid syntax in the PDP-11 BASIC Programming Manual, so treat this reading as tentative. Anyone who can confirm or correct it, feel free to comment below.

004464 000167 JMP 2762
004470 104435 TRAP 35
004472 104443 TRAP 43

When no comma follows, control jumps back to the main syntax parsing loop. The two odd-numbered TRAP instructions (TRAP 35 and TRAP 43) raise the errors for the failure cases above.
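The branch targets quoted throughout the DIM listing can be cross-checked against the instruction words themselves. On the PDP-11, a conditional branch stores a signed 8-bit word offset in its low byte, and the target is the address of the following instruction plus twice that offset. A minimal Python sketch (the function name is mine):

```python
def branch_target(address: int, word: int) -> int:
    """Compute the target of a PDP-11 conditional branch instruction.

    `address` is where the instruction word lives and `word` is the
    instruction itself; both are easiest to write as octal literals.
    """
    offset = word & 0o377          # low byte holds the offset in words
    if offset & 0o200:             # sign-extend the 8-bit offset
        offset -= 0o400
    return address + 2 + 2 * offset


# BEQ at 004460 (001724) branches back to the start of the DIM handler.
print(oct(branch_target(0o004460, 0o001724)))
# BNE at 004350 (001021) branches forward to the error path.
print(oct(branch_target(0o004350, 0o001021)))
```

The same calculation does not apply to `JMP 2762`, which uses a full addressing mode with a separate operand word that the listing omits.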

LET command

The LET command is used to assign a value to a variable. The variable may or may not already exist, and the value can be any valid mathematical expression.

Here's the code:

006070 104544 TRAP 144
006072 102420 BVS 6134
006074 001002 BNE 6102
006076 010400 MOV R4, R0
006100 104546 TRAP 146
006102 010046 MOV R0, -(SP)
006104 104472 TRAP 72
006106 020227 CMP R2, #75
006112 001010 BNE 6134
006114 104536 TRAP 136
006116 102407 BVS 6136
006120 012600 MOV (SP)+, R0
006122 010220 MOV R2, (R0)+
006124 010320 MOV R3, (R0)+
006126 010420 MOV R4, (R0)+
006130 000167 JMP 2762
006134 104421 TRAP 21
006136 104417 TRAP 17

Let's see how this works.

006070 104544 TRAP 144
006072 102420 BVS 6134
006074 001002 BNE 6102
006076 010400 MOV R4, R0

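TRAP 144 is used to get the variable name. If the overflow flag is set, meaning the variable name is invalid, control jumps to 6134 to return an error. If the zero flag is set, the variable was not found, so it has to be created. If the zero flag is not set, the variable has already been defined, so control branches to 6102, skipping the creation step. This is not an error, since LET may assign to an existing variable. In that case R0 presumably already points at the variable's value storage, because it is pushed at 6102 and later used as the destination of the assignment.

When the variable is new, its identifier is returned in R4, which is then copied into R0.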

006100 104546 TRAP 146
006102 010046 MOV R0, -(SP)

TRAP 146 is then used to push a variable structure into the runtime state storage. The variable identifier from R0 is pushed into the runtime state, followed by a zero word (used to store array dimensions in the case of an array), followed by three words for recording the variable value. After returning, R0 will point at the variable value location in the runtime state storage, and this location is pushed onto the stack.

006104 104472 TRAP 72
006106 020227 CMP R2, #75
006112 001010 BNE 6134

TRAP 72 is used to get the next non-whitespace character. The returned character is compared to "=" (ASCII 75), and if it is not equal, this is invalid and control branches to 6134 to return an error.

006114 104536 TRAP 136
006116 102407 BVS 6136

After the equals sign may be any valid expression, so TRAP 136 is used to parse the expression. If the overflow flag is set, a bracket was consumed at the end of TRAP 136, which is unexpected, so control branches to 6136 to return an error.

006120 012600 MOV (SP)+, R0
006122 010220 MOV R2, (R0)+
006124 010320 MOV R3, (R0)+
006126 010420 MOV R4, (R0)+

The storage location for the variable in the runtime state storage is popped off the stack into R0 and then R2/R3/R4 are copied into the three consecutive word locations at R0 (i.e. into the variable structure in the runtime state storage).
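The variable structure and the final store can be pictured with a short Python sketch. The runtime state storage is modelled as a plain list of 16-bit words; the function names and the example identifier value are made up, and the meaning of the three value words (the contents of R2, R3 and R4) is not interpreted here.

```python
def create_variable(storage: list, identifier: int) -> int:
    """Hypothetical model of TRAP 146: append a new variable structure.

    Layout, as described above: identifier word, dimensions word
    (zero for a non-array variable), then three words for the value.
    Returns the index of the first value word, like R0 on return.
    """
    storage.append(identifier & 0xFFFF)
    storage.append(0)                  # array dimensions: none
    value_index = len(storage)
    storage.extend([0, 0, 0])          # room for the three value words
    return value_index


def store_value(storage: list, value_index: int, r2: int, r3: int, r4: int) -> None:
    """Model of the three MOV Rn, (R0)+ instructions."""
    for offset, word in enumerate((r2, r3, r4)):
        storage[value_index + offset] = word & 0xFFFF


storage = []
index = create_variable(storage, 0o101)   # 0o101 is an arbitrary example identifier
store_value(storage, index, 1, 2, 3)      # placeholder words for R2, R3, R4
```

After these calls the list holds five words: the identifier, a zero dimensions word and the three value words, in that order.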

006130 000167 JMP 2762
006134 104421 TRAP 21
006136 104417 TRAP 17

Afterwards, control jumps to the main syntax parsing loop. The two odd-numbered TRAPs (TRAP 21 and TRAP 17) return the error codes for the failure cases above.

DATA and REM commands

When the DATA and REM commands are parsed, both are simply skipped. The information provided in a DATA command is not used until the corresponding READ commands are encountered.

Therefore, both DATA and REM are handled using the same code:

006324 104502 TRAP 102
006326 005301 DEC R1
006330 000167 JMP 2762

TRAP 102 is used to skip to the end of the current command. R1 is decremented and then control jumps back to the main parsing loop.
