@ HP-28C BYTES
Subject: Studying the byte structure of HP-28C commands.
Author: Wlodek Mier-Jedrzejowicz

On the day that the HP-28C was announced I sent out a few copies of an article called "NOMAS programming on the HP-28C". My hope was that an exchange of ideas and discoveries would develop rapidly, similar to that in the PPC Journal following the introduction of the HP-41. Since I had been fortunate in being allowed to use an HP-28C for a week, I had already made a few discoveries and hoped to share them through "NOMAS programming" so as to save other people having to make the same discoveries.

In the event, things have not turned out quite that way. The American PPC Journal has so far printed little more than HP-28C advertising material, and Richard Nelson is winding up the other American club, CHHU, so there is no chance that the CHHU Chronicle will become a medium for the rapid dissemination of HP-28C information. In the light of the poor performance of PPC and CHHU it is not surprising that HP have not told them anything exciting about the internals of the HP-28C, as HP did with the HP-41C. Maybe Brian Walsh's new club will eventually pick up the pieces, but for now we in the other user clubs are left to our own resources.

What can we do, and what do we want to do? Well, for a start "NOMAS programming..." was published in DATAFILE. Since then I have been writing "HP-28C Notes" - a regular column of 28C tips and news. The purpose of that column is to spread information I get, and to give non-28C owners some idea of what is going on. The article you are reading now is different - it explains how you can analyse the byte structure of HP-28C objects, hence its title. The methods described should be a very useful step along the road to understanding the HP-28C, so I have sent copies of the article to friends and to other user clubs.

One thing I proposed in "NOMAS programming" was that a group of people divide up the HP-28C ROM address area among themselves and each one study his or her area with SYSEVAL. Only Ian Maw took up that idea, so I suggested that he and I look at the top of the HP-28C ROM area; it seemed to me that this area could contain some interesting general-purpose operations. I took addresses #3F000 to #3F7FF and Ian took #3F800 to #3FFFF. Neither of us got very far, but Ian discovered that #261031 (decimal) works like STO, but does not check for any errors. Thus you can use this to turn an object displayed in level 1 into a name on the user menu, then press " and the menu key to get the name back into the command line as a character string - and then use NUM to convert this into bytes.

One of the things I have done with this is to get the addresses of all HP-28C commands; I made a full list by going through the catalogue, so it was in alphabetic order. I then wrote a set of HP-71B programs to let me enter this list, sort it in address order, and print it out. One very interesting discovery came out of this: nearly every command is separated from the next one by (5*n)+1 nybbles. Since the Saturn CPU in the 28C uses 5-nybble addresses, this shows that each command is made up of a string of calls to lower-level routines. This is very similar to FORTH, and is confirmed by Bill Wickes, who has said that the built-in commands are written in a superset of the language available to the user. Thus one question raised in "NOMAS programming" is answered - most commands are addresses of lower-level commands, so they are indirect addresses. At some point the 28C must reach a level where machine language instructions are executed, and this must need a special form, either the address of a command which says "execute the following as machine language", or a special code which is not 5 nybbles long; possibly both.
(Indeed there are other reasons for storing some data immediately after a 5-nybble address.) This might explain why each command is (5*n)+1 nybbles long - a special nybble of data appears to be stored just ahead of each command.

Recognising that much of the HP-28C work is done in terms of 5-nybble addresses gives us further insights. For example many (if not all) items on the stack are actually stored as addresses. I have already written that DUP, OVER, and other commands put only 5 nybbles onto the stack: they just copy the address at which the object is stored in RAM. For example the value of a real number is stored as 16 nybbles; sign, 12 digit mantissa, 3 digit exponent. This is preceded by the 5-nybble address of the routine which identifies it as a real number. That takes up 21 nybbles if the number is stored as part of a program. But if a real number is put on the stack it uses 32 nybbles (assuming LAST, UNDO and COMMAND are disabled). I surmise that the number is stored together with a further 5 nybble address and one data nybble to say that this is not a named variable but a stack item. Finally, the number's RAM address is stored in 5 more nybbles in the stack; a total of 32 nybbles, 16 bytes. Complex numbers are stored in the same way, except that they contain 16 more nybbles for the imaginary part, so a complex number entered from the command line onto the stack takes up 24 bytes.

A great many HP-28C operations are based on the use of five-nybble addresses. An address is not itself an object, but it usually points to an object, or to some instructions or to another address. If the address pointed to does not define an object or a command then it is shown in the display as a "System Object". Although addresses are not objects they are such important entities that I suggest HP-28C users could accept the word "address" as the standard name for 5 nybbles (20 bits), just as "nybble" is used as a standard name for 4 bits.

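
The nybble counts quoted above can be checked with a little arithmetic. The following is only a sketch in Python, meant to be run on a separate computer, not on the 28C; the function and constant names are my own inventions for illustration, and the breakdown of the 32 nybbles follows the surmise in the text and is not a confirmed memory layout.

```python
# Sizes in nybbles, taken from the figures quoted in the text.
# All names here are hypothetical, chosen only for this illustration.
ADDRESS = 5      # one 5-nybble (20-bit) address
REAL_VALUE = 16  # sign + 12 mantissa digits + 3 exponent digits
STACK_TAG = 1    # surmised data nybble marking a stack item


def real_in_program():
    """Type address followed by the 16-nybble value."""
    return ADDRESS + REAL_VALUE


def real_on_stack():
    """Surmised layout: the object itself, a further address plus one
    data nybble, and the 5-nybble pointer held in the stack."""
    return real_in_program() + ADDRESS + STACK_TAG + ADDRESS


def complex_on_stack():
    """As a real number, plus 16 more nybbles for the imaginary part."""
    return real_on_stack() + REAL_VALUE


def nybbles_to_bytes(nybbles):
    """Convert a nybble count to bytes; an odd count is rejected."""
    whole, odd = divmod(nybbles, 2)
    if odd:
        raise ValueError("odd number of nybbles")
    return whole


def fits_command_spacing(gap):
    """True if a gap between two command addresses has the form (5*n)+1."""
    return gap > 0 and (gap - 1) % 5 == 0


print(real_in_program())                     # 21
print(real_on_stack())                       # 32
print(nybbles_to_bytes(real_on_stack()))     # 16
print(nybbles_to_bytes(complex_on_stack()))  # 24
```

The last helper, fits_command_spacing, can be applied to the difference between any two neighbouring command addresses to see whether it matches the (5*n)+1 pattern.
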
I could give more details, but you are probably more interested to learn how you can make this sort of discovery for yourself. The rest of this article will therefore describe how I use Ian Maw's discovery of the "Synthetic" STO to analyse the byte structure of HP-28C objects shown in level 1:

1. Put the object to be studied in level 1 of the stack. (In fact, as I wrote above, it is probably only the object's address that is put in level 1, but STO will replace the name with its value, so the distinction does not matter.)

2. Make sure there is at least one more object on the stack (in level 2). You can do this by pressing ENTER to duplicate the object being studied.

3. Execute the program << #261031 SYSEVAL >> (with the 28C in DECimal mode). This is equivalent to the STO command, EXCEPT that it does not test whether the object in level 1 is a valid name. You can check this by putting a name in level 1 and executing this program - the name will be added to the USER menu, just as if you had used STO. With any other type of object in level 1, the object is also treated as if it was a name, and added to the USER menu. The name produced this way usually means nothing, but you can check what the object is by pressing the key and seeing what it does - that is why I suggest storing the object itself under this name. (Another use of this synthetic STO can be to create menu options with names that the HP-28C does not normally allow.) Bruce Bailey suggests that we call this process Ian-ization in honour of its discoverer!

4. Press SHIFT, then quote (key marked "), then the key to which this name has just been assigned; this brings the name you have created into the command line.

5. You have now turned the object you wish to study into a text string which is in the command line. You can press ENTER to put it into level 1 and can then use the NUM function from the STRING menu to extract bytes from this string one by one, and see what they are.
This lets you analyse the object one byte at a time. Ignore the first byte, which is a space that the 28C automatically inserts after a " at the beginning of a text string. The last character will also be a space and can be used to recognise the end of the string (see point 10).

Unfortunately there are several problems to be dealt with.

6. First of all, NUM tells you only what the first byte is in a string. You need to keep a copy of the rest of the string so as to be able to study it. I use the program << DUP 2 999 SUB SWAP NUM HEX R->B >> to leave the hexadecimal value of the first byte in level 1 and the rest of the string in level 2.

7. Secondly, numbers and addresses are stored in memory back to front. For example the number 1.23454443424E104 is represented by the hexadecimal string 0123454443424104 (sign, followed by twelve mantissa digits, followed by three exponent digits). If you put two copies of this into the stack, then execute #261031 SYSEVAL (in DEC mode), you will see the name ABCD added to the USER menu. The last two hexadecimal characters (04) were taken as the length of a name, and the preceding four bytes (44434241) were stored as the name. 41 represents the character A, 42 represents B, 43 represents C, and so on. The code used here for the letters is the ASCII code which is also used by the HP-41 and HP-71. In other words the number was really stored as the bytes 04,41,42,43,44,45,23,01. Actually, this is preceded by the address of the routine which defines a real number, as I mentioned above, but that is ignored by the synthetic STO. Addresses are also stored in reverse order; for example the hexadecimal address of SYSEVAL is #1A582, so if you create a program which contains SYSEVAL and then analyse it, you will find it contains the hex bytes 2X, 58, 1A or 82, A5, X1.
You can see that the first represents 1A582 stored with its individual bytes in reverse order and preceded by a nybble X, while the second is 1A582 stored with a nybble X at the end. This does not make for easy decoding - but you can always write a program to do the reversing.

8. From the above you can see that the first five nybbles of the object in level 1 are not stored in the name string - and the next byte is used as the length of the string. To study the whole of an object, put it in a list, and store the list with #261031 - then the lost bytes will be part of the list definition, not part of your object. Furthermore, the byte used to define the string length will be hexadecimal 96, so 150 bytes will be stored in the name, allowing you to study long objects. The first three nybbles of the name will be the rest of the list definition; they will be followed by the bytes which make up the object.

9. If the name string you have created contains any " characters then only the part of it before the first " will stay in one complete string. One way to deal with this is to press the key at the right of SHIFT to go into the editing menu, then step through the string, note the position of each " (hex 22) and replace it with something else, such as ' (hex 27).

10. The name might also contain null characters (byte 00) or newline characters (byte 0A). The HP-28C manuals are not clear about null characters; they just say you cannot edit text strings which contain nulls. This is because a null is stored as the last character of a string in the command line, to tell the HP-28C that this is the end of the string. If you were to use CHR to put a null into a character string and then edit that string, the 28C would assume that the null marks the end of the string. To avoid this you are prevented from editing such strings. However the method described above can put a string with nulls into the command line.
If you press ENTER then a null will be treated as the end of the string, just as a " character will. To get rid of nulls, newline characters and " characters, do the following:

a/ Go into the editing menu (as in 9. above).

b/ Press SHIFT < to get to the beginning of the string.

c/ Press SHIFT > to get to where the 28C thinks is the end of the string. This is either the first null, or the true end of the string, or a newline, with which I shall deal in d/ below. If the character to the immediate left of the cursor is a space then it is most probably the true end of the string (see the end of point 5. above). If the character to the left of the cursor position is not a blank, then the cursor has stopped at a null or a newline. Usually it is a null; make a note of its position, then press a key to replace the null with some other character (I use Z because it stands for zero). If the character is a single null, then the Z will replace it and you will see more characters to its right. However, the character might be a null followed by another null; in that case you will see nothing to the right of the Z. Repeat SHIFT > and press Z again to see if this will replace another null. You might have to do this several times, for example the number 1.2 contains a sign, two digits, and five null bytes which all need to be replaced. Eventually, you should see some more characters, or the space which marks the end of the string.

d/ If repeatedly pressing Z fails to show any more characters to the right then the 28C may have come to a newline byte. You can find if this is so by pressing ENTER. If there is a newline character in the string then it will show up as a small filled-in square (the symbol for a non-displayable character) following the Zs and followed by characters which you have not seen in the display.
A newline character (hex 0A) is difficult to deal with; one way to get round it is to create a new version of the string you are analysing, with an extra command somewhere ahead of the 0A byte. This will be five nybbles long, so the nybbles 0 and A will be put into two different bytes. Another method is to find the position of the newline in the way just described, then not to try replacing it, simply press ENTER and accept the fact that the first null or " after it will be treated as the end of the string.

e/ I repeat c/ (and d/ if necessary) until I am sure I have reached the end of the string, or until I have replaced enough nulls to make it possible to analyse the relevant part of the object. This is a matter of judgement; strings can be long, and the 28C has little RAM, so it is not always possible to analyse the whole string, particularly if you are using memory for other purposes too.

f/ Press SHIFT < to get back to the beginning of the string again and use > to step to each " character, make a note of its position and replace it with a '.

g/ Press ENTER to get the string into level 1 where it can be analysed.

Now you can use the program shown in point 6 to analyse the string, or write another program which suits your needs better. Remember to check whether Z and ' characters are ones which you have used to replace null and " characters.

Well, that's a very useful trick, but I haven't yet found PEEK or POKE commands - they would be very helpful. At least the HP-28C machine language is the same as that of the HP-71B, so we have some help in decoding anything we do read. Indeed we may eventually be able to write programs to do PEEK and POKE by creating long character strings containing the bytes that make up the required machine language programs. We would still need to know the addresses of these programs in RAM so as to use SYSEVAL to execute them.
(We would also need to know the code which tells the 28C that a set of nybbles is a machine language routine, not the address of some program steps. Could a machine language routine be one of the object types 11 to 15?) Some of the SYSEVAL operations return RAM addresses to the stack as binary numbers, so it may be possible to find a SYSEVAL which returns the PC (Program Counter) - that would help a lot!

All these questions are for the future - for now I hope you will find ways of exploiting the method described here - maybe the method will even help us answer some of the questions outlined above. If you do make any discoveries, please let me know so we can all share them.
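
As a footnote to point 7, the reversing can be sketched in Python on a separate computer once the byte values have been copied down from the 28C. This is not 28C code, and the function names are hypothetical ones chosen for this illustration. The sketch assumes that the low nybble of each byte comes first in memory, which is what the two SYSEVAL byte patterns quoted in point 7 imply. It only splits a real number into its sign, mantissa and exponent digits, and does not try to interpret negative exponents.

```python
def real_fields(stored):
    """Split the 8 bytes of a real number, given in the order they appear
    in the name string, into (sign, mantissa, exponent) digit strings."""
    if len(stored) != 8:
        raise ValueError("a real number value is 8 bytes (16 nybbles)")
    digits = stored[::-1].hex().upper()  # undo the back-to-front storage
    return digits[0], digits[1:13], digits[13:16]


def name_from_bytes(stored):
    """Read the bytes the way the synthetic STO does: the first byte is
    the name length, the following bytes are the ASCII characters."""
    length = stored[0]
    return stored[1:1 + length].decode("ascii")


def to_nybbles(data):
    """Expand bytes into nybbles, low nybble first (an assumption
    inferred from the byte patterns quoted in the text)."""
    out = []
    for b in data:
        out.append(b & 0x0F)
        out.append(b >> 4)
    return out


def address_at(data, nybble_offset):
    """Rebuild a 5-nybble address that starts nybble_offset nybbles in."""
    nybs = to_nybbles(data)[nybble_offset:nybble_offset + 5]
    if len(nybs) < 5:
        raise ValueError("not enough nybbles for an address")
    # The least significant nybble is stored first.
    return sum(n << (4 * i) for i, n in enumerate(nybs))


# The example number 1.23454443424E104 from point 7.
stored = bytes([0x04, 0x41, 0x42, 0x43, 0x44, 0x45, 0x23, 0x01])
print(real_fields(stored))      # ('0', '123454443424', '104')
print(name_from_bytes(stored))  # ABCD

# SYSEVAL with the unknown nybble X taken as 7 purely for illustration.
print(hex(address_at(bytes([0x27, 0x58, 0x1A]), 1)))  # 0x1a582
print(hex(address_at(bytes([0x82, 0xA5, 0x71]), 0)))  # 0x1a582
```

The nybble offset passed to address_at is 1 when the extra nybble X comes before the address and 0 when it comes after, matching the two byte patterns given in point 7.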