DISASM DOCUMENTATION - by Michael Markov 1988/06/03

DISASM is an advanced HP-71 LEX file disassembler that can produce a source code file from the LEX file alone. It introduces concepts that could revolutionize disassemblers for other controllers such as the HP-75 and the HP-41. It has successfully disassembled a wide variety of HP-71 LEX files. The major improvement over all the other disassemblers I have seen is that the most commonly used data table structures, and all data fields inherent in the basic structure of LEX files, are disassembled correctly, as data, without any user intervention. Furthermore, the program is specifically designed so that minor modifications will let it handle non-standard data structures.

In order to run DISASM, you will need PEEKLEX, DISASMLX, USERLIBA, JPCROM, HPILROM and EDLEX. (I think that's all, but I may be wrong.) The program is easy to use: just RUN DISASM and answer the prompts. You can keep track of what is going on if you have a video monitor. Very quickly, you will get a source code file. For example, ROMCOPY can be disassembled in approximately 15 minutes on my machine, VER$ 2CDCCC, running at 1.18 MHz. (Yes, my machine has been sped up quite a bit. Also, I have been combining LEX files together and splitting linked LEX files, effectively reducing the total number of LEX files in my machine to improve system performance. The job is not finished, but progress is excellent!)

Once you have the source file, you should inspect it for possible problems due to non-standard data structures. Search the file for a "?" in column 1. If you find any, take corrective action. For example, the ROMCOPY source will have a single error, "?addrs GOTO address2", which needs to be changed to " addrs REL(5)". Thereafter, do ASSIGN#1 TO and FIXUP#1. The source file is now ready for assembly, using either SASM or AREUH (the DOS assemblers I run on my HP110). You can also assemble the source on the HP-71 using John Baker's enhancements (see FIX5F), provided you convert CON(6) pseudo-ops to equivalent NIBHEX instructions, at least until we add some hooks for CON(n) instructions that allow n>5.

Reassembling the LEX file without errors, followed by a byte-by-byte check to ensure the new file is identical to the original, is the best proof of successful disassembly! Thereafter, you can combine keywords with relative ease. The tools (WRITHEAD, DISASM, PACKCODE, etc.) are here to be used. Enjoy!

DISASM has an option that will delight users who have had the doubtful pleasure of searching the HP-71 IDS volumes to find out what a mainframe routine does, starting with its execution code address. The EQSORTED LEX file provides a new keyword, EQUATE$(), that returns the name of the subject routine almost instantaneously. This LEX was developed because I found that using the SEARCH keyword on text files is very, very slow. Also, the EQSORTED LEX file uses only 10 Kbytes of memory, as compared to 27 Kbytes for the text file it replaces. This keyword is used in various ways to make the source file produced by DISASM more meaningful.

DISASM also allows you to select the base "start-of-file" address that goes into the output source file. This is very convenient when you try to compare the output source file to the original listing file. It can also be used to advantage when combining several LEX files into one, as it allows you to easily avoid the problem of duplicate labels. This option is supported by a new keyword, OFFSET$(,). OFFSET$ is available separately in the OFFSET LEX file, to allow you to use DISASM without the 10 Kbyte long EQSORTED LEX file. (EQSORTED now provides both OFFSET$ and EQUATE$. Other keywords are still in the development stage.)

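The check for a "?" in column 1 described earlier can also be done off the HP-71 once the source file has been transferred to another machine. The following Python sketch is only an illustration, not part of DISASM: the function names `flagged_lines` and `context` are hypothetical, the sample lines are invented apart from the flagged ROMCOPY line quoted above, and the window sizes follow the "about 20 lines starting about 10 lines before" suggestion in step 6 of the user instructions.

```python
def flagged_lines(source_lines):
    """Return the 1-based numbers of lines that have '?' in column 1."""
    return [number for number, line in enumerate(source_lines, start=1)
            if line.startswith("?")]


def context(source_lines, line_number, before=10, total=20):
    """Return up to `total` lines, starting up to `before` lines ahead of the flag."""
    first = max(line_number - before, 1)
    return source_lines[first - 1:first - 1 + total]


# Invented sample lines; only the flagged line is quoted from the ROMCOPY example.
sample = [
    "       GOSUB  sub1",
    "?addrs GOTO address2",
    "       RTNCC",
]

for number in flagged_lines(sample):
    print(number, sample[number - 1])
```

Only a "?" in the very first column counts as a flag here, so a "?" that appears later in a line (inside a comment, for instance) is ignored.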
DISASM is intended to disassemble entire LEX files automatically. However, since we never have enough spare memory, DISASM allows you to disassemble up to ten selected keywords at a time. (Complete source files are 15 to 25 times bigger than the LEX file you are disassembling.) This feature can be used to advantage when 'splitting' linked LEX files, as any common code will be disassembled correctly.

It should be mentioned that the output source file is compatible with the HP-71 FORTH ROM assembler, as modified by John R. Baker's FIX5F enhancements. Effectively, this allows the HP-71 owner to both assemble and disassemble most LEX files without assistance from more powerful machines such as the HP-110, the HP-200 or the IBM PC. Compatibility is provided by the FIXUP program, which deletes unnecessary labels and does some additional pre-processing.

The most convincing proof of successful disassembly is the error-free assembly of the output source file. The AREUH and SASM cross-assemblers, which I run either on my HP110 or on a friend's IBM PC, have been life-savers through the tedious process of debugging DISASM. My thanks to PPC Paris and HHP for providing these time-saving tools.

REQUIREMENTS:

DISASM uses keywords from the following LEX files: EDLEX, HPILROM, JPCROM ver. C00, STRINGLX and either OFFSET or EQSORTED. The JPCROM keywords OPCODE$ and NEXTOP$ shorten DISASM by some 4K of BASIC. They help improve execution speed, and make possible many features that would be hard to implement in BASIC. Some of these features are automatic documentation of jumps to mainframe routines, error detection and the associated routines that prevent the propagation of known errors, and much more.

USER INSTRUCTIONS:

1) Copy DISASM, STRINGLX, and either OFFSET or EQSORTED into your HP-71.

2) Make sure you also have EDLEX (FORTH/ASSEMBLER ROM), HPILROM and JPCROM.

3) Copy the LEX file to be disassembled to Independent RAM (IRAM).

4) (Optional) Connect your HP-71 to a display device. This will allow you to monitor the progress, and give you an opportunity to intervene if DISASM runs into problems because of non-standard data tables.

5) RUN DISASM. Answer each prompt, and watch the fun!

5a) The first prompt asks you to provide the name of the output file. The default output file is JUNK. If JUNK already exists, it will be purged. If you specify any other name and the file exists, you will get suitable error messages if the file is not a text file, or if the file is not a source code file. Work has been started on a special interactive mode that will be enabled if the specified file is a suitable source code file. This feature is not implemented in the current version of DISASM.

5b) You are prompted for the name of the LEX file to be disassembled.

5c) Next, you will be prompted for the "base address". This is the address that the output file will show for the start of the file. The default '00000' is very convenient if you wish to compare the output file with the original assembler listing files. You can also respond to the prompt with ADDR$('filename'), which can be very handy if you need to determine the structure of non-standard data tables with SYSEDIT (JPCROM), as the output file addressing will match the physical address of the object LEX file.

WARNING: The base address must be less than (FFFFF - file length in nibbles including headers). Later versions of DISASM will re-prompt you if you specify a base address that is too high.

5d) Next, you will be asked to specify the keywords to be disassembled. The default 'all' is the recommended response if you have enough memory. The keyword select option is provided to cope with the problem of not having enough memory, and to ease the pain associated with 'splitting' linked LEX files. The current version of DISASM allows you to specify up to 10 keywords.
I have found this to be more than enough, but you could increase the limit simply by changing DIM ...,F9$(10)[9] to DIM ...,F9$(20)[9]. See line 370. A null response (just press [ENDLINE]) to the 'Keyword? ' prompt terminates the keyword selection process.

That's all, at least for a while. You may now enjoy watching your HP-71 work for you.

6) Next comes the most difficult part of the entire process: making sure disassembly is correct. The best way is to document the code, making sure you understand the purpose of every machine language instruction. This, however, requires an excellent grasp of machine language programming. It is also a very time-consuming job.

DISASM flags any errors it detects with a leading '?', to help you find trouble spots more easily. Therefore, the first thing to do is to call on your text editor, and search the file for a leading '?' with S/\^?. If you do not find any, chances are fairly good that you can reassemble the file without problems. If you do find a leading '?', list about 20 lines starting about 10 lines before the '?'.

A leading '?' will follow calls to the system routines TBLJMP and TBLJMC even if disassembly is correct. Here, you really should document the code to determine how many REL(3) instructions actually follow. The HP Debugger can also be used to advantage for some trial and error testing.

If TBLJM(P/C) is not the problem, scan the source file backwards from the '?' and look for a GOSUB, GOSUBL or GOSBVL instruction. The probability is high that a data table of some kind follows such instructions. Studying the routines that are invoked by the subroutine call can tell you a lot about the structure of the data table. The JPCROM SYSEDIT keyword provides a handy way of testing your guesses to determine the structure of such data tables.

Once you have taken care of obvious errors, you should look for anomalies, such as code that follows a GOTO, GOLONG, GOVLNG, RTN, RTNSC, RTNCC, RTNSXM or RTI, but is not designated as a label by an 'o' prefix.
Such code may well be harmless left-overs from the development stage of the object LEX file (code that performs no function and which should have been deleted prior to release). It could also mean that such code is accessed by means of a computed jump.

The final check is reassembling the source file, and comparing the original LEX file with the LEX file produced by your assembler.

NOTES:

1) JPCROM is currently available on EPROMs to members of the PPC Paris Chapter. This minimizes the cost of upgrades (new keywords, enhancements, etc.). JPCROM is about 25K bytes and growing, with upgrades to be available on an irregular basis in the future. 64K EPROMs are an excellent idea, as unused capacity can be filled with your choice of software, freeing up RAM. (Janick Tallandier, 335 rue Lecourbe, 75015 Paris, FRANCE is the contact person for English-speaking people.)

2) The HP-75 equivalent of the EQSORTED LEX file is the 11264-byte ENTRYPNT LEX file.

3) The BASIC equivalent of OFFSET$(A$,O) is DTH$(MOD(HTD(A$)+O,1048576))

4) JPCROM should soon be available through commercial channels.

5) The JPCROM keyword SYSEDIT provides a very powerful tool for determining the structure of local data tables. It allows you to try various combinations of CON( ), REL( ), NIBHEX, NIBASC or LCASC instructions. This capability, combined with the information provided by disassembling the subroutines that read the data tables, makes disassembly of such data tables considerably less difficult.
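
Note 3 gives the BASIC equivalent of OFFSET$, and the warning in step 5c gives the limit on the base address. As a rough illustration only, the same arithmetic can be sketched in Python. The function names `offset` and `base_address_ok` are hypothetical, the five-digit uppercase result is an assumption about how DTH$ formats its output, and the sketch assumes the BASIC MOD yields a non-negative result for a positive modulus.

```python
ADDRESS_SPACE = 0x100000  # 1048576, the modulus used in the BASIC expression


def offset(address_hex, delta):
    """Add `delta` nibbles to a hex address, wrapping within the 20-bit space."""
    value = (int(address_hex, 16) + delta) % ADDRESS_SPACE
    return format(value, "05X")  # assumption: five uppercase hex digits, like DTH$


def base_address_ok(base_hex, file_length_nibbles):
    """Apply the warning from step 5c: base < FFFFF - file length in nibbles."""
    return int(base_hex, 16) < 0xFFFFF - file_length_nibbles


print(offset("00000", 16))
print(offset("FFFFF", 1))
print(base_address_ok("F0000", 0x20000))
```

The wrap-around in `offset` mirrors the MOD(...,1048576) term: an address that runs past FFFFF comes back around to 00000 rather than growing a sixth digit. That wrap-around is presumably what the base address warning is meant to prevent, which is why `base_address_ok` is shown alongside it.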