HP-28C BYTES

On the day that the HP-28C was announced I sent out a few copies of an article called "NOMAS programming on the HP-28C". My hope was that an exchange of ideas and discoveries would develop rapidly, similar to that in the PPC Journal after the introduction of the HP-41. I had been fortunate in being allowed to preview a 28C for a week, and had made a few discoveries which I hoped to share through "NOMAS programming" so as to save others having to make the same discoveries.

In the event, things have not turned out quite that way. The PPC Journal has so far printed little more than HP-28C advertising material, and Richard Nelson is winding up CHHU, so the CHHU Chronicle will not become the hoped-for medium for rapid dissemination of HP-28C information. In the light of the above it is not surprising that HP seem not to have told PPC or CHHU anything exciting about the internals of the HP-28C, as had been the case with the HP-41C. Maybe Brian Walsh's new club will pick up the pieces, but for now other user clubs are left to their own resources.

What can we do, what do we want to do? For a start "NOMAS programming..." was published in DATAFILE, the journal of the British club HPCC. Since then I have been writing "HP-28C Notes" - a regular column of tips and news in DATAFILE. The purpose of the column is to spread information I get, and to give non-28C owners some idea of what is going on. (The June 1987 column includes a description of the 28C internals and tells you how to put in more RAM.) The article you are reading now explains how you can analyse the byte structure of HP-28C objects, hence its title. The methods described should be a useful step along the road to understanding the 28C, so I am putting the article on a swap disk for friends and user clubs.

One thing I proposed in "NOMAS programming" was that a group of people divide up the HP-28C ROM address area among themselves and each one study his or her area with SYSEVAL.
Only Ian Maw took up that idea, so I suggested that he and I look at the top of the HP-28C ROM area; it seemed to me that this area could contain some interesting general-purpose operations. I took addresses #3F000 to #3F7FF and Ian took #3F800 to #3FFFF. Ian very cleverly recognised that #261031 (decimal) works like STO, but does not check for any errors. Thus you can use this to turn an object displayed in level 1 into a name on the user menu, then get the name back into the command line as a character string and use NUM to convert this into bytes.

One of the things I have done with this is to get the addresses of all HP-28C commands; I made a full list by going through the catalogue. I then wrote the HP-71B programs on this swap disk to sort and print the list in address order. This led to a very interesting discovery: nearly every command is separated from the next one by (5*n)+1 nybbles. Since the Saturn CPU in the 28C uses 5-nybble addresses, this shows that each command is made up of a string of calls to lower-level routines. This is very similar to FORTH, and is confirmed by Bill Wickes who has said that the built-in commands are written in a superset of the language available to the user. Thus one question raised in "NOMAS programming" is answered - most commands are addresses of lower-level commands, so they are indirect addresses.

At some point the 28C must reach a level where machine language instructions are executed; this must need a special form, either the address of a command which says "execute the following as machine language", or a special code which is not 5 nybbles long; possibly both. (There are other reasons too for storing some data immediately after a 5-nybble address.) This might explain why each command is (5*n)+1 nybbles long - a nybble of data appears to be stored just ahead of each command.
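
The spacing check itself is easy to repeat on any computer once the address list has been typed in. What follows is only a minimal sketch in Python, not the HP-71B programs on the swap disk. The function names are hypothetical, and the sample addresses are invented for illustration and are NOT real HP-28C entry points. It sorts a list of nybble addresses and reports whether each gap has the form (5*n)+1.

```python
ADDRESS_NYBBLES = 5  # the Saturn CPU uses 5-nybble (20-bit) addresses


def gap_fits_pattern(gap):
    """Return True if a gap in nybbles has the form (5*n)+1 with n >= 1.

    Requiring n >= 1 is an assumption of this sketch: it treats a command
    as containing at least one 5-nybble address.
    """
    return gap >= ADDRESS_NYBBLES + 1 and gap % ADDRESS_NYBBLES == 1


def check_spacing(addresses):
    """Sort the addresses and describe the gap between each adjacent pair.

    Returns a list of (lower, upper, gap, fits) tuples.
    """
    ordered = sorted(addresses)
    return [
        (lo, hi, hi - lo, gap_fits_pattern(hi - lo))
        for lo, hi in zip(ordered, ordered[1:])
    ]


# Invented sample addresses, deliberately out of order.
# They are NOT taken from the HP-28C ROM.
sample = [0x1001B, 0x10000, 0x10037, 0x10010, 0x10030]

for lo, hi, gap, fits in check_spacing(sample):
    verdict = "fits (5*n)+1" if fits else "does not fit"
    print(f"#{lo:05X} -> #{hi:05X}: {gap:3d} nybbles, {verdict}")
```

Any gap reported as not fitting marks a place worth a closer look, since it may be one of the commands that holds extra data or machine language.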
Recognising that much of the HP-28C work is done in terms of 5-nybble addresses provides further insights; for instance many (if not all) items on the stack are actually stored as addresses. I have previously written that DUP, OVER, and other commands put only 5 nybbles onto the stack: they just copy the address at which the object is stored in RAM.

For example the value of a real number is stored as 16 nybbles; sign, 12 digit mantissa, 3 digit exponent. This is preceded by the 5-nybble address of the routine which identifies it as a real number. That takes up 21 nybbles if the number is stored as part of a program. But if a real number is put on the stack it uses 32 nybbles (assuming LAST, UNDO and COMMAND are disabled). I surmise that the number is stored together with a further 5-nybble address and one data nybble to say that this is not a named variable but a stack item. Finally, the number's RAM address is stored in 5 more nybbles in the stack; a total of 32 nybbles, 16 bytes. Complex numbers are stored the same way, but contain 16 more nybbles for the imaginary part, so a complex number entered from the command line onto the stack takes up 24 bytes.

Many HP-28C operations are based on the use of 5-nybble addresses. Addresses are not themselves objects, but point to objects, or to instructions or other addresses. If an address pointed to does not define an object or a command then it is displayed as "System Object". Addresses are such important entities that I suggest HP-28C users could choose some word as a standard name for 5 nybbles (20 bits), just as "nybble" is used as a standard name for 4 bits.

I could give more details, but you are probably more interested to learn how you can make this sort of discovery for yourself. The rest of this article will therefore describe how I use Ian Maw's discovery of the "Synthetic" STO to analyse the byte structure of HP-28C objects shown in level 1:

1. Put the object to be studied in level 1 of the stack.
(In fact, as I wrote above, it is probably only the object's address that is put in level 1, but STO will replace the name with its value, so the distinction does not matter.)

2. Make sure there is at least one more object on the stack (in level 2). It is best to do this by pressing ENTER to duplicate the object being studied.

3. Execute the program << #261031 SYSEVAL >> (with the 28C in DECimal mode). This is equivalent to the STO command, EXCEPT that it does not test whether the object in level 1 is a valid name. You can check this by putting a name in level 1 and executing this program - the name will be added to the USER menu, just as if you had used STO. With any other type of object in level 1, the object is also treated as if it was a name, and added to the USER menu. The name produced this way usually means nothing, but you can check what the object is by pressing the key and seeing what it does - that is why I suggested storing the object itself under this name. (Another use of this synthetic STO can be to create menu options with names that the HP-28C does not normally allow.) Bruce Bailey suggests that we call this process Ian-ization in honour of its discoverer!

4. Press SHIFT, then quote (key marked "), then the key to which this name has just been assigned; this brings the name you have created into the command line.

5. You have now turned the object you wish to study into a text string which is in the command line. You can press ENTER to put it into level 1 and can then use the NUM function from the STRING menu to extract bytes from this string one by one, and thus analyze the object one byte at a time. Ignore the first byte which is a space that the 28C inserts after a " at the beginning of a text string. The last character will also be a space and can be used to recognise the end of the string (see point 10).

Unfortunately there are several problems.

6. First of all, NUM tells you only what the first byte is in a string.
You need to keep a copy of the rest of the string so as to be able to study it. I use the program << DUP 2 999 SUB SWAP NUM HEX R->B >> to leave the hexadecimal value of the first byte in level 1 and the rest of the string in level 2.

7. Secondly, numbers and addresses are stored in memory back-to-front. For example the number 1.23454443424E104 is represented by the hexadecimal string 0123454443424104 (sign, followed by twelve mantissa digits, followed by three exponent digits). If you put two copies of this into the stack, then execute #261031 SYSEVAL (in DEC mode), you will see the name ABCD added to the USER menu. The last byte (04) was taken as the length of a name, and the preceding four bytes (44434241) were stored as the name. 41 represents the character A, 42 represents B, 43 represents C, and so on; this is the ASCII code which is also used by the HP-41 and HP-71. In other words the number was really stored as the bytes 04,41,42,43,44,45,23,01. Actually, this is preceded by the address of the routine which defines a real number, as I mentioned above, but that is ignored by the synthetic STO.

Addresses are also stored in reverse order. For example, the hexadecimal address of SYSEVAL is #1A582, so if you create a program which contains SYSEVAL and then analyse it, you will find it contains the hex bytes 2X, 58, 1A or 82, A5, X1. You can see that the first represents 1A582 stored with its individual bytes in reverse order and preceded by a nybble X, while the second is 1A582 stored with a nybble X at the end. This does not make for easy decoding - but you can always write a program to do the reversing.

8. From the above you can see that the first five nybbles of the object in level 1 are lost and the next byte is used as the length of the string. To study the whole of an object, put it in a list, and store the list with #261031 - then the lost bytes will be part of the list definition, not part of the object.
The byte used to define the string length will be hexadecimal 96, allowing you to study long objects. The first three nybbles of the name will be the rest of the list definition; they will be followed by the bytes which make up the object.

9. If the name string you have created contains any " characters then only the part of it before the first " will stay in one complete string. One way to deal with this is to press the key at the right of SHIFT to go into the editing menu, then step through the string, note the position of each " (hex 22) and replace it with something else, such as ' (hex 27).

10. The name might also contain null or newline characters (bytes 00 or 0A). The manuals are not clear about null characters; they just say you cannot edit text strings which contain nulls. This is because a null is stored as the last character of a command line string, to mark the end of the string. If you were to use CHR to put a null into a character string and then edit that string, the 28C would assume that the null marks the end of the string. To avoid this you are prevented from editing such strings. However the method described above can put a string with nulls into the command line. If you press ENTER then a null will be treated as the end of the string, just as a " character will. To get rid of nulls, newline characters and " characters, do the following:

a/ Go into the editing menu (as in 9. above).

b/ Press SHIFT < to get to the beginning of the string.

c/ Press SHIFT > to get to where the 28C thinks is the end of the string. This is either the first null, or the true end of the string, or a newline. If the character to the immediate left of the cursor is a space then it is most probably the true end of the string (see the end of point 5. above). If the character to the left of the cursor position is not a blank, then the cursor has stopped at a null or a newline.
Usually it is a null; make a note of its position and press a key to replace the null with some other character (I use Z for zero). If the character is a single null, then the Z will replace it and you will see more characters to its right. However, a null might be followed by another null; in that case you will see nothing to the right of the Z. Repeat SHIFT > and press Z again to see if this will replace another null. You might have to do this several times, for example the number 1.2 contains a sign, two digits, and five null bytes which all need to be replaced. Eventually, you should see some more characters, or the space which marks the end of the string.

d/ If repeatedly pressing Z fails to show any more characters to the right then the 28C may have come to a newline byte. You can find out whether this is so by pressing ENTER. If there is a newline character in the string then it will show up as a small filled-in square (the symbol for a non-displayable character) following the Zs and followed by characters which were not in the display. Newline characters (hex 0A) are difficult to deal with; one approach is to create a new version of the string you are analysing, with an extra command somewhere ahead of the 0A byte. This will be five nybbles long, so the nybbles 0 and A will be put into two different bytes. Another method is to find the position of the newline as just described, then simply press ENTER and accept the fact that the first null or " after it will be treated as the end of the string.

e/ I repeat c/ (and d/ if necessary) until I am sure I have reached the end of the string, or until I have replaced enough nulls to make it possible to analyse the relevant part of the object. This is a matter of judgement; strings can be long, and the 28C has little RAM, so it is not always possible to analyse the whole string, particularly if you are using memory for other purposes too.
f/ Press SHIFT < to get back to the beginning of the string again and use > to step to each " character, make a note of its position and replace it with a '.

g/ Press ENTER to get the string into level 1 where it can be analysed.

Now you can use the program shown in point 6 to analyse the string, or write another program which suits your needs better. Remember to check whether Z and ' characters are ones which you have used to replace null and " characters.

Well, that's a very useful trick, but I haven't yet found PEEK or POKE commands - they would be very helpful. At least the HP-28C machine language is the same as that of the HP-71B, so we have some help in decoding anything we do read. Indeed we may eventually be able to write programs to do PEEK and POKE by creating long character strings containing the bytes that make up the required machine language programs. We would still need to know the addresses of these programs in RAM so as to use SYSEVAL to execute them. (We would also need to know the code which tells the 28C that a set of nybbles is a machine language routine, not the address of a routine.) Some of the SYSEVAL operations return RAM addresses to the stack as binary numbers, so it may be possible to find a SYSEVAL which returns the PC (Program Counter) - that would help a lot!

Wlodek Mier-Jedrzejowicz, June 1987
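
A note on the reversing mentioned in point 7: it can be tried out away from the calculator before any 28C program is written for it. The following is a minimal sketch in Python, not a description of how the 28C itself works. All the function names are hypothetical. The assumption that the low nybble of each byte comes first in memory is my reading of the byte values quoted in point 7, and X stands for the unknown neighbouring nybble, as it does there.

```python
def real_to_digits(sign, mantissa, exponent):
    """Build the 16-digit form of a real number described in point 7:
    one sign digit, twelve mantissa digits, three exponent digits."""
    return f"{sign:01d}{mantissa:012d}{exponent:03d}"


def bytes_in_memory_order(hex_digits):
    """Split a hex string into bytes and reverse them, giving the order
    in which NUM would return them."""
    pairs = [hex_digits[i:i + 2] for i in range(0, len(hex_digits), 2)]
    return [int(pair, 16) for pair in reversed(pairs)]


def read_as_name(stored):
    """Treat the first byte as a length and the following bytes as the
    ASCII characters of a name, as the synthetic STO appears to do."""
    length = stored[0]
    return bytes(stored[1:1 + length]).decode("ascii")


def address_byte_views(address, filler="X"):
    """Show a 5-nybble address as three bytes, for both possible alignments.

    Assumption of this sketch: nybbles are stored lowest first, and within
    each byte the first nybble in memory is printed as the right-hand digit.
    """
    nybbles = f"{address:05X}"[::-1]          # lowest nybble first

    def as_bytes(six):
        return [six[i + 1] + six[i] for i in range(0, 6, 2)]

    preceded = as_bytes(filler + nybbles)     # unknown nybble before the address
    followed = as_bytes(nybbles + filler)     # unknown nybble after the address
    return preceded, followed


digits = real_to_digits(0, 123454443424, 104)   # the number 1.23454443424E104
stored = bytes_in_memory_order(digits)
print(digits)
print(" ".join(f"{b:02X}" for b in stored))
print(read_as_name(stored))                     # the name seen on the USER menu

preceded, followed = address_byte_views(0x1A582)  # SYSEVAL address from point 7
print(preceded, followed)
```

Run on the two examples in point 7, the sketch reproduces the bytes 04,41,42,43,44,45,23,01 and the name ABCD, and gives both byte patterns quoted for the address #1A582.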