@                                  HP-28C BYTES

Subject: Studying the byte structure of HP-28C commands.
Author:  Wlodek Mier-Jedrzejowicz

On  the day that the HP-28C was announced I sent out a few copies of an  article
called "NOMAS programming on the HP-28C".  My hope was that an exchange of ideas
and  discoveries  would  develop rapidly,  similar to that in  the  PPC  Journal
following  the introduction of the HP-41.   Since I had been fortunate in  being
allowed  to use an HP-28C for a week,  I had already made a few discoveries  and
hoped  to  share  them through "NOMAS programming" so as to  save  other  people
having to make the same discoveries.   In the event,  things have not turned out
quite  that way.   The American PPC Journal has so far printed little more  than
HP-28C advertising material, and Richard Nelson is winding up the other American
club,  CHHU,  so there is no chance that the CHHU Chronicle will become a medium
for  the  rapid dissemination of HP-28C information.   In the light of the  poor
performance of PPC and CHHU it is not surprising that HP have not told them
anything exciting about the internals of the HP-28C, as they did with the
HP-41C.  Maybe Brian Walsh's new club will eventually pick up the pieces,
but for now we in the other user clubs are left to our own resources.  What
can we do, and what do we want to do?

Well,  for a start "NOMAS programming..." was published in DATAFILE.  Since then
I have been writing "HP-28C Notes" - a regular column of 28C tips and news.  The
purpose  of  that column is to spread information I get,  and  to  give  non-28C
owners  some  idea  of what is going on.   The article you are  reading  now  is
different  - it  explains  how  you can analyse the  byte  structure  of  HP-28C
objects,  hence  its title.   The methods described should be a very useful step
along the road to understanding the HP-28C, so I have sent copies of the article
to friends and to other user clubs.

One thing I proposed in "NOMAS programming" was that a group of people divide up
the  HP-28C ROM address area among themselves and each one study his or her area
with SYSEVAL.  Only Ian Maw took up that idea, so I suggested that he and I look
at the top of the HP-28C ROM area;  it seemed to me that this area could contain
some interesting general-purpose operations.   I took addresses #3F000 to #3F7FF
and Ian took #3F800 to #3FFFF.   Neither of us got very far,  but Ian discovered
that #261031 (decimal) works like STO,  but does not check for any errors.  Thus
you can use this to turn an object displayed in level 1 into a name on the  user
menu,  then  press " and the menu key to get the name back into the command line
as a character string - and then use NUM to convert this into bytes.  One of the
things I have done with this is to get the addresses of all HP-28C  commands;  I
made a full list by going through the catalogue,  so it was in alphabetic order.
I  then  wrote a set of HP-71B programs to let me enter this list,  sort  it  in
address  order,  and print it out.   One very interesting discovery came out  of
this;  nearly  every command is separated from the next one by (n*5)+1  nybbles.
Since  the Saturn CPU in the 28C uses 5-nybble addresses,  this shows that  each
command is  made up of a string of calls to lower-level routines.   This is very
similar to FORTH, and is confirmed by Bill Wickes who has said that the built-in
commands are written in a superset of the language available to the user.   Thus
one  question  raised  in "NOMAS programming" is answered  - most  commands  are
addresses  of lower-level commands,  so they are indirect  addresses.   At  some
point  the  28C  must  reach a level where  machine  language  instructions  are
executed,  and  this must need a special form,  either the address of a  command
which says "execute the following as machine language",  or a special code which
is  not  5 nybbles long;  possibly both.   (Indeed there are other  reasons  for
storing some data immediately after a 5-nybble address.)  This might explain why
each  command  is (n*5)+1 nybbles long - a special nybble of data appears to  be
stored just ahead of each command.
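
The arithmetic in this paragraph can be checked away from the calculator.  The
following is only a sketch in Python (not something the 28C itself runs).  The
function names are invented for illustration, and the three sample addresses
are made up: they are NOT real HP-28C command addresses.

```python
# Sketch only: function names and sample addresses are illustrative assumptions.

def to_hex_address(decimal_value):
    """Format a decimal number as a 5-nybble (20-bit) hexadecimal address."""
    return "#%05X" % decimal_value

def fits_spacing_rule(gap_in_nybbles):
    """True if a gap between two command addresses has the form (n*5)+1."""
    return gap_in_nybbles % 5 == 1

synthetic_sto = 261031                       # Ian Maw's address, in decimal
print(to_hex_address(synthetic_sto))         # prints #3FBA7
print(0x3F800 <= synthetic_sto <= 0x3FFFF)   # True: inside Ian's search range

# Invented addresses, chosen so that the gaps are 11 and 16 nybbles.
hypothetical_addresses = [0x10000, 0x1000B, 0x1001B]
gaps = [later - earlier
        for earlier, later in zip(hypothetical_addresses,
                                  hypothetical_addresses[1:])]
print([(gap, fits_spacing_rule(gap)) for gap in gaps])
```

With a real, sorted address list in place of the invented one, the same gap
test would show which commands break the (n*5)+1 pattern.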

Recognising  that much of the HP-28C work is done in terms of 5-nybble addresses
gives us further insights.  For example many (if not all) items on the stack are
actually stored as addresses.   I have already written that DUP, OVER, and other
commands put only 5 nybbles onto the stack:  they just copy the address at which
the  object is stored in RAM.   For example the value of a real number is stored
as 16 nybbles;  sign,  12 digit mantissa, 3 digit exponent.  This is preceded by
the 5-nybble address of the routine which identifies it as a real number.   That
takes up 21 nybbles if the number is stored as part of a program.  But if a real
number  is put on the stack it uses 32 nybbles (assuming LAST,  UNDO and COMMAND
are  disabled).   I surmise that the number is stored together with a further  5
nybble address and one data nybble to say that this is not a named variable  but
a stack item.  Finally,  the number's RAM address is stored in 5 more nybbles in
the stack;  a total of 32 nybbles,  16 bytes.  Complex numbers are stored in the
same way,  except that they contain 16 more nybbles for the imaginary part, so a
complex number entered from the command line onto the stack takes up 24 bytes.
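
The nybble counts above can be added up in a few lines.  This is only a sketch
in Python.  The constant and function names are invented, and the extra address
and data nybble for a stack item are the surmise made in the text, not a
confirmed layout.

```python
# Sketch only: names are illustrative; the stack-item layout is a surmise.

ADDRESS = 5                  # nybbles in one address
REAL_BODY = 1 + 12 + 3       # sign + mantissa digits + exponent digits

def real_in_program():
    """Type address followed by the 16-nybble value."""
    return ADDRESS + REAL_BODY

def real_on_stack():
    """Stored object + surmised extra address + surmised data nybble
    + the 5-nybble pointer held in the stack itself."""
    return real_in_program() + ADDRESS + 1 + ADDRESS

def complex_on_stack():
    """As a real number, plus 16 more nybbles for the imaginary part."""
    return real_on_stack() + REAL_BODY

for label, nybbles in [("real in program", real_in_program()),
                       ("real on stack", real_on_stack()),
                       ("complex on stack", complex_on_stack())]:
    print(label, nybbles, "nybbles =", nybbles / 2, "bytes")
```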

A  great  many HP-28C operations are based on the use of five-nybble  addresses.
An address is not itself an object,  but it usually points to an object,  or  to
some  instructions  or to another address.   If the address pointed to does  not
define  an  object  or a command then it is shown in the display  as  a  "System
Object".   Although  addresses are not objects they are such important  entities
that I suggest HP-28C users could accept the word "address" as the standard name
for 5 nybbles (20 bits), just as "nybble" is used as a standard name for 4 bits.

I could give more details, but you are probably more interested to learn how you
can  make this sort of discovery for yourself.   The rest of this  article  will
therefore  describe  how  I use Ian Maw's discovery of the  "Synthetic"  STO  to
analyse the byte structure of HP-28C objects shown in level 1:

1.  Put the object to be studied in level 1 of the stack.   (In fact, as I wrote
above,  it is probably only the object's address that is put in level 1, but STO
will replace the name with its value, so the distinction does not matter.)

2.  Make sure there is at least one more object on the stack (in level 2).   You
can do this by pressing ENTER to duplicate the object being studied.

3.  Execute  the program  << #261031 SYSEVAL >>  (with the 28C in DECimal mode).
This is equivalent to the STO command,  EXCEPT that it does not test whether the
object in level 1 is a valid name. You can check this by putting a name in level
1 and executing this program - the name will be added to the USER menu,  just as
if  you had used STO.   With any other type of object in level 1,  the object is
also treated as if it was a name, and added to the USER menu.  The name produced
this way usually means nothing, but you can check what the object is by pressing
the  key  and  seeing what it does - that is why I suggest  storing  the  object
itself  under  this name.   (Another use of this synthetic STO can be to  create
menu options with names that the HP-28C does not normally allow.)  Bruce  Bailey
suggests that we call this process Ian-ization in honour of its discoverer!

4.  Press SHIFT,  then quote (key marked "), then the key to which this name has
just been assigned; this brings the name you have created into the command line.

5.  You have now turned the object you wish to study into a text string which is
in  the command line.   You can press ENTER to put it into level 1 and can  then
use the NUM function from the STRING menu to extract bytes from this string  one
by  one,  and see what they are.  This lets you analyse the object one byte at a
time.  Ignore the first byte which is a space that the 28C automatically inserts
after a " at the beginning of a text string.  The last character will also be  a
space  and  can  be  used to recognise the end of the  string  (see  point  10).
Unfortunately there are several problems to be dealt with.

6.  First  of all,  NUM tells you only what the first byte is in a string.   You
need to keep a copy of the rest of the string so as to be able to study  it.   I
use the program <<  DUP 2 999 SUB SWAP NUM HEX R->B  >> to leave the hexadecimal
value of the first byte in level 1 and the rest of the string in level 2.
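
For readers who want to try the idea away from the calculator, here is a rough
Python model of what that program does.  It is only a sketch: the function
names are invented, and it models the effect of the RPL commands rather than
the 28C display.

```python
# Sketch only: a Python model of << DUP 2 999 SUB SWAP NUM HEX R->B >>.

def split_first_byte(text):
    """Return (rest_of_string, value_of_first_byte)."""
    rest = text[1:999]       # 2 999 SUB: characters 2 to 999, counting from 1
    first = ord(text[0])     # NUM: character code of the first character
    return rest, first       # HEX R->B only changes how the value is shown

def dump_bytes(text):
    """Apply split_first_byte repeatedly until the string is used up."""
    values = []
    while text:
        text, value = split_first_byte(text)
        values.append(value)
    return values

# A name string as it might arrive from the command line: space, name, space.
print(["%02X" % value for value in dump_bytes(" ABCD ")])
```

The first and last values printed are 20 (a space), matching the leading and
trailing spaces mentioned in point 5.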

7.  Secondly,  numbers  and addresses are stored in memory  back-to-front.   For
example  the  number 1.23454443424E104 is represented by the hexadecimal  string
0123454443424104  (sign,  followed by twelve mantissa digits followed  by  three
exponent  digits).   If you put two copies of this into the stack,  then execute
#261031  SYSEVAL  (in DEC mode),  you will see the name ABCD added to  the  USER
menu.   The  last two hexadecimal characters (04) were taken as the length of  a
name,  and  the  preceding four bytes (44434241) were stored as  the  name.   41
represents the character A,  42 represents B,  43 represents C,  and so on.  The
code used here for the letters is the ASCII code which is also used by the HP-41
and  HP-71.   In  other  words  the  number  was  really  stored  as  the  bytes
04,41,42,43,44,45,23,01.   Actually,  this  is  preceded by the address  of  the
routine which defines a real number,  as I mentioned above,  but that is ignored
by the synthetic STO.   Addresses are also stored in reverse order,  for example
the  hexadecimal address of SYSEVAL is #1A582,  so if you create a program which
contains  SYSEVAL and then analyse it,  you will find it contains the hex  bytes
2X, 58, 1A  or 82, A5, X1.   You can see that the first represents 1A582  stored
with its individual bytes in reverse order and preceded by a nybble X, while the
second is 1A582 stored with a nybble X at the end.   This does not make for easy
decoding - but you can always write a program to do the reversing.
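
One possible shape for such a reversing program is sketched here in Python
rather than on the 28C.  The function names are invented, and the letter X
stands for the unknown neighbouring nybble, exactly as in the text.

```python
# Sketch only: function names are illustrative assumptions.

def stored_bytes(hex_digits):
    """Split a hex string into bytes and reverse their order."""
    pairs = [hex_digits[i:i + 2] for i in range(0, len(hex_digits), 2)]
    return [int(pair, 16) for pair in reversed(pairs)]

def as_name(byte_values):
    """Treat the first byte as a name length, the next bytes as ASCII."""
    length = byte_values[0]
    return bytes(byte_values[1:1 + length]).decode("ascii")

def address_as_bytes(address, unknown_first):
    """Show how a 5-nybble address appears when read as three bytes."""
    nybbles = ("%05X" % address)[::-1]      # low nybble first, as in memory
    nybbles = "X" + nybbles if unknown_first else nybbles + "X"
    # Each byte holds two nybbles; the earlier nybble is the low half.
    return [nybbles[i + 1] + nybbles[i] for i in range(0, 6, 2)]

real = stored_bytes("0123454443424104")     # 1.23454443424E104
print(["%02X" % value for value in real])   # 04 41 42 43 44 45 23 01
print(as_name(real))                        # ABCD

print(address_as_bytes(0x1A582, True))      # ['2X', '58', '1A']
print(address_as_bytes(0x1A582, False))     # ['82', 'A5', 'X1']
```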

8. From the above you can see that the first five nybbles of the object in level
1 are not stored in the name string - and the next byte is used as the length of
the string.   To study the whole of an object,  put it in a list,  and store the
list with #261031 - then the lost bytes will be part of the list definition, not
part  of your object.   Furthermore,  the byte used to define the string  length
will be hexadecimal 96, so 150 bytes will be stored in the name, allowing you to
study long objects.  The first three nybbles of the name will be the rest of the
list definition; they will be followed by the bytes which make up the object.

9.  If  the name string you have created contains any " characters then only the
part of it before the first " will stay in one complete string.  One way to deal
with this is to press the key at the right of SHIFT to go into the editing menu,
then step through the string,  note the position of each " (hex 22) and  replace
it  with  something else,  such as ' (hex 27).

10.  The name might also contain null characters (byte 00) or newline characters
(byte 0A). The HP-28C manuals are not clear about null characters; they just say
you  cannot edit text strings which contain nulls.   This is because a  null  is
stored as the last character of a string in the command line, to tell the HP-28C
that this is the end of the string.  If you were to use CHR to put a null into a
character  string and then edit that string,  the 28C would assume that the null
marks the end of the string.   To avoid this you are prevented from editing such
strings.   However  the method described above can put a string with nulls  into
the command line.  If you press ENTER then a null will  be treated as the end of
the string, just as a " character will.  To get rid of nulls, newline characters
and " characters, do the following:

a/ Go into the editing menu (as in 9. above).

b/ Press SHIFT < to get to the beginning of the string.

c/ Press SHIFT > to get to where the 28C thinks is the end of the string.   This
is  either the first null,  or the true end of the string,  or a  newline,  with
which I shall deal in d/ below.   If the character to the immediate left of  the
cursor  is a space then it is most probably the true end of the string (see  the
end of point 5.  above).  If the character to the left of the cursor position is
not a blank,  then the cursor has stopped at a null or a newline.  Usually it is
a null;  make a note of its position,  then press a key to replace the null with
some other character (I use Z because it stands for zero).   If  the  character
is a single null, then the Z will replace it and you will see more characters to
its right.   However, the character might be a null followed by another null; in
that case you will see nothing to the right of the Z.   Repeat SHIFT > and press
Z  again  to see if this will replace another null.   You might have to do  this
several times,  for example the number 1.2 contains a sign, two digits, and five
null bytes which all need to be replaced.   Eventually, you should see some more
characters, or the space which marks the end of the string.

d/ If repeatedly pressing Z fails to show any more characters to the right  then
the 28C may have come to a newline byte.  You can find if this is so by pressing
ENTER.   If there is a newline character in the string then it will show up as a
small  filled-in  square (the symbol for a non-displayable character)  following
the  Zs and followed by characters which you have not seen in  the  display.   A
newline character (hex 0A) is difficult to deal with; one way to get round it is
to  create a new version of the string you are analysing,  with an extra command
somewhere ahead of the 0A byte.   The extra command takes up five nybbles, so the
nybbles 0 and A will be put into two different bytes.  Another method is to find the
position of the newline in the way just described, then not to try replacing it,
simply press ENTER and accept the fact that the first null or " after it will be
treated as the end of the string.

e/  I repeat c/ (and d/ if necessary) until I am sure I have reached the end  of
the string, or until I have replaced enough nulls to make it possible to analyse
the relevant part of the object.   This is a matter of judgement; strings can be
long,  and the 28C has little RAM,  so it is not always possible to analyse  the
whole string, particularly if you are using memory for other purposes too.

f/  Press SHIFT < to get back to the beginning of the string again and use >  to
step to each " character, make a note of its position and replace it with a '.

g/ Press ENTER to get the string into level 1 where it can be analysed.
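
The remark in d/ about moving a newline by adding an extra command can be
illustrated with a small Python sketch.  The object nybbles are invented, the
packing rule (earlier nybble is the low half of the byte) is the reading
suggested by point 7, and the five extra nybbles are simply the SYSEVAL address
from point 7 used as an example.

```python
# Sketch only: invented data, and an assumed nybble-to-byte packing rule.

def pack(nybbles):
    """Pair nybbles into bytes, earlier nybble = low half; pad an odd tail."""
    if len(nybbles) % 2:
        nybbles = nybbles + [0]
    return [nybbles[i] | (nybbles[i + 1] << 4)
            for i in range(0, len(nybbles), 2)]

object_nybbles = [0x1, 0x2, 0xA, 0x0, 0x3, 0x4]   # A then 0 packs into byte 0A
extra_command = [0x2, 0x8, 0x5, 0xA, 0x1]         # a 5-nybble address (1A582)

before = pack(object_nybbles)
after = pack(extra_command + object_nybbles)

print(["%02X" % value for value in before], 0x0A in before)
print(["%02X" % value for value in after], 0x0A in after)
```

Because five is odd, every later nybble changes from the low half of a byte to
the high half or vice versa, which is why the 0A byte disappears.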

Now  you  can use the program shown in point 6 to analyse the string,  or  write
another program which suits your needs better.   Remember to check whether Z and
' characters are ones which you have used to replace null and " characters.
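
The bookkeeping in steps c/ and f/ amounts to replacing awkward bytes while
keeping a note of where each one was.  The Python sketch here shows that idea
only.  The function names are invented, and newline bytes are left alone
because the text treats them separately in d/.

```python
# Sketch only: models the note-taking, not the 28C editing keys.

def replace_awkward_bytes(raw):
    """Replace nulls with Z and " with ', noting each position (from 1)."""
    cleaned = bytearray(raw)
    notes = {}
    for index, value in enumerate(raw):
        if value == 0x00:
            cleaned[index] = ord("Z")
            notes[index + 1] = 0x00
        elif value == 0x22:
            cleaned[index] = 0x27
            notes[index + 1] = 0x22
    return bytes(cleaned), notes

def restore(cleaned, notes):
    """Put back the original byte at every noted position."""
    original = bytearray(cleaned)
    for position, value in notes.items():
        original[position - 1] = value
    return bytes(original)

raw = b'A\x00"Z\''                  # contains a genuine Z and a genuine '
cleaned, notes = replace_awkward_bytes(raw)
print(cleaned, notes)
print(restore(cleaned, notes) == raw)
```

The genuine Z and ' in the sample are not in the notes, so they are left
unchanged when the original bytes are restored.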

Well,  that's a very useful trick, but I haven't yet found PEEK or POKE commands
- they would be very helpful.   At least the HP-28C machine language is the same
as  that of the HP-71B,  so we have some help in decoding anything we  do  read.
Indeed  we  may  eventually  be able to write programs to do PEEK  and  POKE  by
creating  long character strings containing the bytes that make up the  required
machine language programs.   We would still need to know the addresses of  these
programs  in RAM so as to use SYSEVAL to execute them.   (We would also need  to
know  the  code which tells the 28C that a set of nybbles is a machine  language
routine,  not  the  address of some program steps.   Could  a  machine  language
routine  be  one of the object types 11 to 15?)  Some of the SYSEVAL  operations
return  RAM addresses to the stack as binary numbers,  so it may be possible  to
find  a SYSEVAL which returns the PC (Program Counter) - that would help a  lot!
All  these questions are for the future - for now I hope you will find  ways  of
exploiting the method described here - maybe the method will even help us answer
some  of the questions outlined above.   If you do make any discoveries,  please
let me know so we can all share them.
