DISASM DOCUMENTATION - by Michael Markov 1988/06/03
DISASM is an advanced HP-71 LEX file disassembler that actually can produce a
source code file, given the LEX file itself. It introduces concepts that could
revolutionize disassemblers for other controllers such as the HP-75 and the
HP-41. It has successfully disassembled a wide variety of HP-71 LEX files. The
major improvement over all the other disassemblers I have seen is that most
commonly used data table structures and all data fields inherent in the basic
structure of LEX files are disassembled correctly, as data, without any user
intervention. Furthermore, the program is specifically designed to allow you
to make minor modifications that will readily handle non-standard data
structures.
In order to run DISASM, you will need PEEKLEX, DISASMLX, USERLIBA, JPCROM,
HPILROM and EDLEX. (I think that's all, but I may be wrong.)
The program is easy to use - just RUN DISASM and answer the prompts. You can
keep track of what is going on if you have a video monitor. Very quickly, you
will get a source code file. For example, ROMCOPY can be disassembled in
approximately 15 minutes on my machine, VER$ 2CDCCC, running at 1.18 MHz.
(Yes, my machine has been sped up quite a bit. Also, I have been combining
LEX files together, splitting linked LEX files, effectively reducing the total
number of LEX files in my machine to improve system performance. The job is
not finished, but progress is excellent!)
Once you have the source file, you should inspect it for possible problems due
to non-standard data structures. Search the file for a "?" in column 1. If you
find some, take corrective action. For example, the ROMCOPY source will
have a single error, "?addrs GOTO address2", which needs to be changed to
" addrs REL(5)".
Thereafter, do ASSIGN#1 TO and FIXUP#1. The source file is
now ready for assembly, using either SASM or AREUH (the DOS assemblers I run
on my HP-110). You can also assemble the source on the HP-71 using John Baker's
enhancements (See FIX5F), provided you convert CON(6) pseudo-ops to equivalent
NIBHEX instructions, at least until we add some hooks for CON(n)
instructions that allow n>5.
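The CON(6) to NIBHEX conversion mentioned above can be sketched as follows.
This is only an illustration in Python, not part of DISASM or FIX5F. It
assumes that CON(n) stores its value low-order nibble first, while NIBHEX
emits its digits in the order written; the function name is hypothetical.

```python
def con_to_nibhex(value, nibbles):
    """Return a NIBHEX line equivalent to CON(nibbles) value.

    Assumption: CON(n) stores the low-order nibble first, and NIBHEX
    emits its hex digits in the order they are written.
    """
    if value < 0 or value >= 16 ** nibbles:
        raise ValueError("value does not fit in the requested field")
    # Zero-padded hex, most significant digit first.
    digits = format(value, "0{}X".format(nibbles))
    # Reverse so the low-order nibble comes first.
    return "NIBHEX " + digits[::-1]


if __name__ == "__main__":
    print(con_to_nibhex(0x123456, 6))
```

Check the result against a field you already know before relying on it.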
Reassembling the LEX file without errors, followed by a byte-by-byte check to
ensure the new file is identical to the original, is the best proof of
successful disassembly!
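If both LEX files have been copied to a machine that runs Python, the
byte-by-byte check can be sketched as below. The function names are
hypothetical, and the sketch assumes each file fits comfortably in memory.

```python
def first_difference(a, b):
    """Return the offset of the first differing byte, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One file is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def files_identical(path_a, path_b):
    """Compare two files on disk byte by byte."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return first_difference(fa.read(), fb.read()) is None
```

Reporting the offset of the first mismatch, rather than a plain yes or no,
tells you where in the source file to start looking.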
Thereafter, you can combine keywords with relative ease. The tools (WRITHEAD,
DISASM, PACKCODE, etc.) are here to be used. Enjoy!
DISASM has an option that will delight users who have had the doubtful pleasure
of searching the HP-71 IDS volumes to find out what a mainframe routine does,
starting with its execution code address. The EQSORTED LEX file provides a new
keyword, EQUATE$(), that returns the name of the
subject routine almost instantaneously. This LEX was developed because I found
that using the SEARCH keyword on text files is very, very slow. Also, the
EQSORTED lex file uses only 10 Kbytes of memory, as compared to 27 Kbytes for
the text file it replaces. This keyword is used in various ways to make the
source file produced by DISASM more meaningful.
DISASM also allows you to select the base "start-of-file" address that goes
into the output source file. This is very convenient when you try to compare
the output source file to the original listing file. It can also be used to
advantage when combining several LEX files into one, as it allows you to easily
avoid the problem of duplicate labels. This option is supported by a new
keyword, OFFSET$(,). OFFSET$ is available separately in
the OFFSET lex file, to allow you to use DISASM without the 10 Kbyte long
EQSORTED lex file. (EQSORTED now provides both OFFSET$ and EQUATE$. Other
keywords are still in the development stage.)
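The user instructions warn that the base address must be less than FFFFF
minus the file length in nibbles, headers included. A minimal Python sketch
of that check follows. The function name is hypothetical, and the length is
assumed to be already known in nibbles.

```python
ADDRESS_LIMIT = 0xFFFFF  # highest address in the 5-nibble address space


def base_address_ok(base_hex, length_nibbles):
    """True if the file, placed at base_hex, stays below the limit.

    base_hex is a hex string such as '00000'; length_nibbles is the
    file length in nibbles, including headers.
    """
    return int(base_hex, 16) < ADDRESS_LIMIT - length_nibbles
```

The comparison is strict, matching the wording "less than" in the warning.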
DISASM is intended to disassemble entire lex files automatically. However,
since we never have enough spare memory, DISASM allows you to disassemble up to
ten selected keywords at a time. (Complete source files are 15 to 25 times
bigger than the lex file you are disassembling... ). This feature can be used
to advantage when 'splitting' linked lex files, as any common code will be
disassembled correctly.
It should be mentioned that the output source file is compatible with the HP-71
FORTH ROM assembler, as modified by John R. Baker's FIX5F enhancements.
Effectively, this allows the HP-71 owner to both assemble and disassemble most
lex files without assistance from more powerful machines such as the HP-110,
the HP-200 or the IBM PC. Compatibility is provided by the FIXUP program, which
deletes unnecessary labels and does some additional pre-processing.
The most convincing proof of successful disassembly is the error-free assembly
of the output source file. The AREUH and SASM cross-assemblers, which I run
either on my HP-110 or on a friend's IBM PC, have been life-savers through the
tedious process of debugging DISASM. My thanks to PPC Paris and HHP for
providing these time-saving tools.
REQUIREMENTS: DISASM uses keywords from the following lex files : EDLEX,
HPILROM, JPCROM ver. C00, STRINGLX and either OFFSET or EQSORTED.
JPCROM keywords OPCODE$ and NEXTOP$ shorten DISASM by some 4K of BASIC. They
help improve execution speed, and make possible many features that would
be hard to implement in BASIC. Some of these features are automatic
documentation of jumps to mainframe routines, error detection and the
associated routines that prevent the propagation of known errors, and much more.
USER INSTRUCTIONS:
1) Copy DISASM, STRINGLX, and either OFFSET or EQSORTED into your HP-71.
2) Make sure you also have EDLEX (FORTH/ASSEMBLER ROM), HPILROM and JPCROM.
3) Copy the lex to be disassembled to Independent RAM (IRAM).
4) (Optional) Connect your HP-71 to a display device. This will allow you to
monitor the progress, and give you an opportunity to intervene if DISASM runs
into problems because of non-standard data tables.
5) RUN DISASM. Answer each prompt, and watch the fun!
5a) The first prompt asks you to provide the name of the output file. The
default output file is JUNK. If JUNK already exists, it will be purged. If you
specify any other name and the file exists, you will get suitable error
messages if the file is not a text file, or if the file is not a source code
file. Work has been started on a special interactive mode that will be
enabled if the specified file is a suitable source code file. This feature is
not implemented in the current version of DISASM.
5b) You are prompted for the name of the LEX file to be disassembled.
5c) Next, you will be prompted for the "base address". This will be the address
of the start of the file shown in the output file. The default '00000' is very
convenient if you wish to compare the output file with the original assembler
listing files. You can also respond to the prompt with ADDR$('filename'),
which can be very handy if you need to determine the structure of non-standard
data tables with SYSEDIT (JPCROM), as the output file addressing will match
the physical address of the object lex file.
WARNING: The base address must be less than (FFFFF - file length in nibbles
including headers). Later versions of DISASM will re-prompt you if you specify
a base address that is too high.
5d) Next, you will be asked to specify the keywords to be disassembled. The
default 'all' is the recommended response if you have enough memory. The
keyword select option is provided to cope with the problem of not having enough
memory, and to ease the pain associated with 'splitting' linked lex files. The
current version of DISASM allows you to specify up to 10 keywords. I have found
this to be more than enough, but you could increase this simply by changing
DIM ...,F9$(10)[9] to DIM ...,F9$(20)[9]. See line 370.
A null response (just press [ENDLINE]) to the 'Keyword? ' prompt terminates
the keyword selection process.
That's all, at least for a while. You may now enjoy watching your HP-71 work
for you.
6) Next comes the most difficult part of the entire process: making sure
disassembly is correct. The best way is to document the code, making sure you
understand the purpose of every machine language instruction. This, however,
requires an excellent grasp of machine language programming. It is also a very
time consuming job. DISASM flags any errors it detects with a leading '?', to
help you find trouble spots more easily.
Therefore, the first thing to do is to call on your text editor, and search the
file for leading '?' with S/\^?. If you do not find any, chances are fairly
good that you can reassemble the file without problems.
If you find leading '?', list about 20 lines starting about 10 lines before the
'?'. Leading '?' will follow calls on system routines TBLJMP and TBLJMC even
if disassembly is correct. Here, you really should document code to determine
how many REL(3) instructions really follow. The HP Debugger can also be used
to advantage for some trial and error testing.
If TBLJM(P/C) is not the problem, scan the source file backwards from the '?'
and look for a GOSUB, GOSUBL or GOSBVL instruction. The probability is high
that a data table of some kind follows such instructions. Studying the
routines that are invoked by the subroutine call can tell you a lot about the
structure of the data table. The JPCROM SYSEDIT keyword provides a handy way of
testing your guesses to determine the structure of such data tables.
Once you have taken care of obvious errors, you should look for anomalies, such
as code that follows a GOTO, GOLONG, GOVLNG, RTN, RTNSC, RTNCC, RTNSXM or RTI,
but is not designated as a label by an 'o' prefix. Such code may well be
harmless left-overs from the development stage of the object lex (code that
performs no function and which should have been deleted prior to release).
It could also mean that such code is accessed by means of a computed jump.
The ultimate test remains reassembling the source
file, and comparing the original lex file with the lex file produced by your
assembler.
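Step 6 asks you to find every line with a leading '?' and then list about 20
lines starting about 10 lines before it. A minimal Python sketch of that
search follows. It assumes the source file has been transferred to a machine
that runs Python and read into a list of lines. The function names are
hypothetical and are not part of DISASM.

```python
def flagged_lines(lines):
    """Return 1-based numbers of lines that DISASM flagged with a leading '?'."""
    return [n for n, line in enumerate(lines, start=1) if line.startswith("?")]


def context(lines, line_no, before=10, total=20):
    """Return about `total` lines starting `before` lines ahead of line_no."""
    start = max(line_no - 1 - before, 0)  # clamp at the top of the file
    return lines[start:start + total]


if __name__ == "__main__":
    source = [" start  GOSUB  sub1", "?addrs  GOTO   address2", "        RTN"]
    for n in flagged_lines(source):
        print("flag at line", n)
        for line in context(source, n):
            print(line)
```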
NOTES:
1) JPCROM is currently available on EPROMs to members of the PPC Paris Chapter.
This minimizes the cost of upgrades (new keywords, enhancements, etc.).
JPCROM is about 25K bytes and growing, with upgrades to be available on an
irregular basis in the future. 64K EPROMs are an excellent idea, as unused
capacity can be filled with your choice of software, freeing up RAM. (Janick
Tallandier, 335 rue Lecourbe, 75015 Paris, FRANCE is the contact person for
English speaking people.)
2) The HP-75 equivalent of EQSORTED lex is the 11264-byte ENTRYPNT lex.
3) The BASIC equivalent of OFFSET$(A$,O) is DTH$(MOD(HTD(A$)+O,1048576)).
4) JPCROM should soon be available through commercial channels.
5) JPCROM keyword SYSEDIT provides a very powerful tool for determining the
structure of local data tables. It allows you to try various combinations of
CON( ), REL( ), NIBHEX, NIBASC or LCASC instructions. This capability, combined
with the information provided by disassembling the subroutines that read the
data tables, makes disassembly of such data tables considerably less difficult.
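For readers who want to check OFFSET$ arithmetic away from the HP-71, the
BASIC expression in note 3 can be mirrored in Python as sketched below. The
function name is hypothetical. The five-digit, zero-padded, upper-case result
is an assumption about how DTH$ formats its output.

```python
ADDRESS_SPACE = 1048576  # 16**5, the modulus used in note 3


def offset_hex(address_hex, offset):
    """Add a decimal offset to a hex address, wrapping at 5 nibbles.

    Mirrors DTH$(MOD(HTD(A$)+O,1048576)); the 5-digit zero-padded
    upper-case output format is an assumption.
    """
    value = (int(address_hex, 16) + offset) % ADDRESS_SPACE
    return format(value, "05X")
```

For example, offset_hex("00000", 16) gives "00010", and an address of FFFFF
plus 1 wraps around to "00000".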