.TOC .STL AS Reference Manual .TTL DECUS C LANGUAGE SYSTEM The AS Assembler for the PDP-11 by David G. Conroy Edited by Martin Minow .INT This document describes the AS assembler, used to compile the output of the Decus C compiler to PDP-11 object code. .MID DECUS Structured Languages SIG .MID Version of 1-Aug-80 .PAG .MID Copyright (C) 1980, DECUS .LIN General permission to copy or modify, but not for profit, is hereby granted, provided that the above copyright notice is included and reference made to the fact that reproduction privileges were granted by DECUS. .LIN The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation or by DECUS. .LIN Neither Digital Equipment Corporation, DECUS, nor the authors assume any responsibility for the use or reliability of this document or the described software. .LIN This software is made available without any support whatsoever. The person responsible for an implementation of this system should expect to have to understand and modify the source code if any problems are encountered in implementing or maintaining the compiler or its run-time library. The DECUS `Structured Languages Special Interest Group' is the primary focus for communication among users of this software. .LIN UNIX is a trademark of Bell Telephone Laboratories. RSX, RSTS/E, RT11 and VMS are trademarks of Digital Equipment Corporation. .CPT Introduction .LIN AS is a three pass assembler for the PDP-11 intended for use as a target assembler for compilers. It lacks some of the more esoteric assembler directives of the standard DEC assemblers; it also lacks any form of macro facility. However, it does possess the ability to extend any branch instruction which is unable to reach its target label into the appropriate branch over a jump. .LIN The input syntax is, in general, similar in style to MACRO-11. 
The actual details have in many cases been changed so that a programme intended to be assembled with AS cannot be assembled with MACRO-11. The syntax is compatible with the Unix AS assembler. .CPT Usage .LIN Under RSX11M or RSX compatibility mode under VMS, AS is invoked from MCR as follows: .MID AS [-b] [-d] [-g] [-l] [-n] file [file ...] .BLN 0 Under RSX compatibility mode under RSTS/E, AS is invoked as follows: .MID XAS [-b] [-d] [-g] [-l] [-n] file [file ...] .BLN 0 Under RT11 compatibility mode under RSTS/E, AS is invoked as follows: .MID AS file,file=file[/ switches] .BLN 0 Under RT11, AS is invoked as follows: .SWT 1 RUN AS AS> object,list=source[/ switches] .BLN 0 In RSX mode, the specified files are assembled separately and the object code is placed in a file (on the same device and in the same UIC) having the same name as the source file but with a filetype of OBJ. The default filetype for source files is .S. Wildcard file names are permitted on native RSX11-M and RSX11-M compatibility mode under VMS. They are not permitted on RSTS/E or RT11. .LIN In RT11 mode, only one file may be assembled and the object and list files may be explicitly named. The default case: .SWT 1 AS file .BLN0 is equivalent to .SWT 1 AS file=file .BLN The following switches are defined: .SWT 1 b Causes the assembler to flag all branches that were extended to jumps. d Causes the assembler to delete the source file after assembly. This option makes compiling C programmes much easier. This option is turned off by errors. g Causes all symbols which are undefined at the end of pass 1 to be given the type undefined external; this corresponds to the .ENABL GBL directive of MACRO-11. l Causes the assembler to generate a listing file. This is the only way to generate a listing file on RSX mode. It works on RT11 mode as well. n Causes the assembler to produce no object file. 
This option is primarily used to prevent creating a lot of useless object files when debugging the assembler; however, it may be of use when one wishes to simply check a file for errors. .LIN The object file is in standard DEC format. AS writes a full symbol table (including internal symbols) into the GSD records. These symbols are almost useless because the DEC linkers discard internal symbols, but are present for the benefit of future generations of debuggers. Note that the RSX compiler produces an object file in the format required by TKB.TSK, while the RT11 compiler produces an object file in the format required by LINK.SAV. .LIN The title of the object file is always set to the first six characters of the source file name. This is of interest only to people who load overlaid programmes off libraries. .CPT Lexical Conventions .LIN Assembler tokens consist of identifiers (`symbols', `names'), constants and operators. .HLV 1 Identifiers .LIN An identifier consists of a sequence of alphanumeric characters (including the period `.', the tilde `~', and the underscore `_'), the first of which may not be numeric. Only the first eight characters of the name are significant; the rest are discarded. Upper and lower case are treated identically. Because the DEC object file stores symbols in radix 50, the `_' character is mapped to `.' in the symbol table while the `~' character is mapped to `$'. Global symbols must be unique within the first six characters. .MID WARNING .BLN Note that the mapping of `_' to `.' is an incompatible change from previous versions of this assembler. .HLV 1 Constants .LIN An octal constant is a sequence of digits; `8' and `9' are taken to have octal values of 10 and 11. The number is truncated to 16 bits and interpreted in two's complement notation. .LIN A decimal constant is a sequence of digits terminated by a period. The magnitude of the constant should be representable in 15 bits; i.e., be less than 32,768. 
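.LIN The two numeric constant forms described above can be sketched in Python as follows. This is an illustration only, not the assembler's own code: the function name parse_constant is hypothetical, and applying the 16 bit truncation to decimal constants as well as octal ones is an assumption (the text states it only for octal constants).

```python
def parse_constant(text):
    """Return the signed 16-bit value of an AS-style numeric constant.

    A trailing period selects decimal; otherwise the digits are read as
    octal, with '8' and '9' contributing digit values 8 and 9 (octal 10
    and 11) as the manual describes.
    """
    if text.endswith("."):
        digits, base = text[:-1], 10
    else:
        digits, base = text, 8
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError("not a numeric constant: %r" % text)

    value = 0
    for ch in digits:
        value = value * base + int(ch)

    # Truncate to 16 bits, then interpret as two's complement.
    # (Assumption: the same truncation is applied to decimal constants.)
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value
```

Under these assumptions `18` yields 16 (one eight plus a digit worth eight), `177777` yields -1, and `100.` yields 100.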
.LIN A single character constant consists of the single quote (`'') followed by any ASCII character (except the newline). The constant's value is the code for the character right justified in the word, with zeros on the left. .HLV 1 Operators .LIN There are several single and multiple character operators; see section 6.1. .HLV 1 Blanks and Tabs .LIN Blanks and tabs may be used freely between tokens, but may not appear within identifiers. A blank or a tab is required to separate adjacent tokens not otherwise separated. .HLV 1 Comments .LIN The character `/' introduces a comment, which continues until the end of line. Comments are ignored by the assembler. .CPT Programme Sections .LIN AS permits multiple programme sections (PSECTS); however, it does not allow the attributes of the programme sections to be specified. All programme sections receive the attributes LOW, NOLIB, CON, RW, REL, LCL and ISPACE. .CPT The Location Counter .LIN The special symbol `.' is the location counter. Its value is the offset into the current programme section of the start of the statement in which it appears. It may be assigned to, with the restriction that the assignment must neither change the programme section nor cause the value to decrease. .CPT Statements .LIN A programme consists of a sequence of statements separated by newlines or semicolons. There are three kinds of statements: null statements, assignment statements and keyword statements. .LIN Any statement may be preceded by any number of labels. .HLV 1 Labels .LIN A name label consists of an identifier followed by a colon (`:'). The programme section and value of the label are set to those of the location counter. It is an error for the value of a label to change between pass 1 and pass 2. .LIN A temporary label consists of a digit `0' to `9' followed by a colon (`:'). Such a label serves to define temporary symbols of the form `xf' and `xb', where `x' is the digit of the label. 
References of the form `xf' refer to the first temporary label `x:' forward from the reference; those of the form `xb' refer to the first temporary label `x:' backward from the reference. Such labels tend to conserve both the symbol table space of the assembler and the inventive powers of the programmer. .HLV 1 Null Statements .LIN A null statement is just an empty line (which may have labels and be followed by a comment). Null statements are ignored by the assembler. Common examples of null statements are empty lines or lines consisting of only a label. .HLV 1 Assignment Statements .LIN An assignment statement consists of an identifier followed by an equal sign (`=') and an expression. The value and programme section of the identifier are set to that of the expression. Any symbol defined by an assignment statement may be redefined, either by another assignment statement or by a label. .HLV 1 Expression Statements .LIN An expression statement consists of an arithmetic expression not beginning with a keyword. The assembler computes its (16 bit) value and places it in the output stream along with the appropriate relocation. .HLV 1 String Statements .LIN A (UNIX style) string statement generates a sequence of bytes containing ASCII characters. It consists of a left string quote `<' followed by a sequence of ASCII characters not including newline followed by a right string quote `>'. Any of the characters may be replaced by an escape sequence as follows: .SWT 1 \b Backspace (0010) \f Formfeed (0014) \n Newline (0012) \r Carriage Return (0015) \t Tab (0011) \nnn Octal value (0nnn) .BLN 1 These escape sequences may also be used in the .ASCII and .ASCIZ keyword statements and in character constants. .HLV 1 Keyword Statements .LIN Keyword statements are the most common type; all of the machine operations and assembler pseudo operations are of this type. 
A keyword statement begins with one of the assembler's predefined keywords, followed by any operands required by that keyword. All of the keywords and their required operands are described in sections 7 and 8. .CPT Expressions .LIN An expression is a sequence of symbols representing a value and a programme section. Expressions are made up of identifiers, constants, operators and brackets. All binary operators have equal precedence and are executed in a strict left to right order (unless altered by brackets). .HLV 1 Types .LIN Every expression has a type determined by its operands. The types that will be met explicitly are: .SWT2 Undefined Upon first encounter, each symbol is undefined. A symbol may also become undefined if it is assigned to an undefined expression. It is an error to assemble an undefined expression in pass 2. Pass 1 allows assembly of undefined expressions, but phase errors may result if undefined expressions are used in certain contexts (e.g., in a .BLKW or .BLKB). Absolute An absolute symbol is one defined ultimately from a constant or from the difference of two relocatable values. Register Register symbols refer to the general registers of the PDP-11. They are required to distinguish register addressing from normal memory addressing. The symbols R0, R1, R2, R3, R4, R5, SP and PC are predefined as register symbols. Relocatable All other user symbols are relocatable symbols in some programme section. Each programme section is a different relocatable type. .LIN Each keyword in the assembler has a secret type which identifies it internally. However, all of these secret types are converted to absolute in expressions. Thus any keyword may be used in an expression to obtain the basic value of that keyword. For machine operations the basic value is the opcode with all of the bits set to zero; the basic value of pseudo operations is, in general, uninteresting. 
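.LIN The strict left to right rule stated at the head of this chapter differs from the precedence rules of most high level languages. The following Python fragment is a minimal sketch of that rule for absolute operands only. The function name eval_left_to_right and the pre-split token list are assumptions made for illustration; brackets, symbols and relocation are not modelled, and only three of the binary operators are shown.

```python
import operator

# Hypothetical subset of the binary operators, for illustration only.
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}

def eval_left_to_right(tokens):
    """Evaluate [value, op, value, op, value, ...] with equal precedence.

    Each operator is applied as soon as its right operand is seen, and
    the running result is kept to 16 bits.
    """
    if len(tokens) % 2 == 0:
        raise ValueError("expected operand, operator, operand, ...")
    value = tokens[0] & 0xFFFF
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        value = BINARY_OPS[op](value, operand) & 0xFFFF
    return value
```

With this rule the token list `[2, "+", 3, "*", 4]` evaluates to 20, not the 14 that conventional precedence would give; the addition is performed first because it appears first.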
.HLV 1 Operators .SWT 2 The operators are: `+' Addition `-' Subtraction `*' Multiplication `%' Integer (truncating) division `&' Bitwise AND `|' Bitwise OR `>>' Arithmetic right shift `<<' Arithmetic left shift `-' Unary negation `!' Unary ones complement `^' Value of the left, type of the right .LIN 1 Expressions may be grouped by means of square brackets (`[' and `]'); parentheses are reserved for use in address expressions. .HLV 1 Type Propagation in Expressions .LIN When operands are combined in expressions the resulting type is a function of both the types of the operands and the operator. Only the `+' and binary `-' operators can manipulate non-absolute operands. .LIN The `+' operator permits the addition of two absolute operands (yielding an absolute result) and the addition of an absolute to a non-absolute operand (yielding a result with the same type as the non-absolute operand). As a consequence, R3 may be referred to as R0+3. .LIN The binary `-' operator permits two operands of the same type, including relocatable, to be subtracted (yielding an absolute result) and an absolute to be subtracted from a non-absolute (yielding a result with the same type as the non-absolute operand). .LIN The notion of `complex relocation' is not supported. .CPT Pseudo Operations .LIN The keywords listed below introduce statements which generate data or affect the later operation of the assembler. .HLV 1 .BYTE expression [ , expression ] ... .LIN The expressions in the comma separated list are truncated to 8 bits and are assembled into successive bytes. The expressions must be absolute. .HLV 1 .WORD expression [ , expression ] ... .LIN The expressions in the comma separated list are assembled into successive words. .HLV 1 .ASCII string .LIN The first nonblank (or tab) character after the .ascii keyword is taken as a delimiter. Successive characters from the string are assembled into successive bytes until the delimiter is encountered. 
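.LIN The delimiter rule for .ASCII can be sketched in Python as below. This is a hedged illustration rather than the assembler's implementation: the function name ascii_bytes is hypothetical, its argument is assumed to be the text following the keyword on the line, and the escape sequences that AS accepts inside strings are deliberately not handled.

```python
def ascii_bytes(operand):
    """Return the byte values generated by an .ASCII operand field.

    The first character that is not a blank or tab is the delimiter;
    everything up to its next occurrence is assembled, one byte per
    character. Escape sequences are NOT interpreted in this sketch.
    """
    rest = operand.lstrip(" \t")
    if not rest:
        raise ValueError("missing string")
    delimiter = rest[0]
    end = rest.find(delimiter, 1)
    if end < 0:
        raise ValueError("missing closing delimiter")
    return [ord(ch) for ch in rest[1:end]]
```

For example, the operand field ` /hi/` produces the two bytes 104 and 105. Any character may serve as the delimiter provided it does not occur in the string itself.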
.HLV 1 .ASCIZ string .LIN This pseudo operation is identical to .ASCII except that it appends a null byte to the end of the string. .HLV 1 .EVEN .LIN If the location counter is odd, output a null byte so the next statement will be assembled on an even boundary. .HLV1 .ODD .LIN If the location counter is even, assemble a null byte so that the next statement will be assembled on an odd boundary. .HLV 1 .BLKB expression .LIN This statement assembles into expression null bytes. The expression must be absolute. .HLV 1 .BLKW expression .LIN This statement assembles into expression null words. The expression must be absolute. .HLV 1 .GLOBL identifier [ , identifier ] ... .LIN The identifiers in the comma separated list are marked as global. If they are defined in the current assembly they may be referenced by other object modules; if they are undefined they must be resolved by the loader before execution. .HLV 1 .ENTRY expression .LIN The value of the expression becomes the transfer address of the object file. This provides the function of the label on the .END statement in MACRO-11. .HLV 1 .PSECT identifier .LIN This statement switches the assembler to the specified programme section. If the programme section has not been encountered previously `.' is set to 0; otherwise it is set to the highest location already assembled into that programme section. .PSECT attributes cannot be defined. .HLV 1 .LIMIT .LIN This statement assembles into two words that are filled in with programme limit information by the linker. When the programme is executed, the first word will have the lowest address in the load image and the second word will have the highest address in the load image. .HLV 1 .FLT2 number [ , number ] ... .LIN The floating point numbers in the comma separated list are converted to single precision floating point binary and are assembled into successive two word blocks. .HLV 1 .FLT4 number [ , number ] ... 
.LIN The floating point numbers in the comma separated list are converted to double precision floating point binary and are assembled into successive four word blocks. .HLV 1 End of file .LIN The assembly source is terminated by the end of the input file. There is no separate .END statement. .CPT Machine Instructions .LIN Because of the rather complicated instruction and addressing structure of the PDP-11, the syntax of the machine instructions is varied. The PDP-11 handbooks should be consulted for the detailed semantics of the instructions. .HLV 1 Sources and Destinations .LIN The syntax of general source and destination addresses is the same as in MACRO-11, except that `$' has been substituted for `#' and `*' has been substituted for `@'. .HLV 1 Simple Machine Instructions .SWT1 The following simple machine instructions are defined: HALT WAIT RTI BPT IOT RESET RTT CLC CLV CLZ CLN SEC SEV SEZ SEN CLE (= CLC) SEE (= SEC) .LIN 1 The PDP-11 hardware allows more than one of the `set' and `clear' instructions to be ORed together. There is no syntactic provision for this; such instructions may be generated by means of the .WORD pseudo operation. .HLV 1 Branches .LIN The following instructions take an expression as an operand. The expression must lie in the same programme section as `.'. If the value of the expression differs from the current location by more than 254 (decimal) bytes the instruction assembles as a branch to .+6 having the opposite sense to the coded instruction, followed by a jump to the desired location. .SWT 1 BR BNE BEQ BPL BMI BVS BVC BCS BCC BES (= BCS) BEC (= BCC) BLT BGT BGE BLE BHI BLOS BHIS BLO .HLV 1 Single Operand Instructions .LIN The following single operand instructions take one address of the general source-destination type. 
.SWT 1 CLR CLRB COM COMB INC INCB DEC DECB NEG NEGB TST TSTB ASR ASRB ASL ASLB ROR RORB ROL ROLB ADC ADCB SBC SBCB JMP SWAB SXT .HLV 1 Double Operand Instructions .LIN The double operand instructions take two general source-destination type address fields, separated by a comma. .SWT 1 MOV MOVB CMP CMPB BIT BITB BIS BISB BIC BICB ADD SUB .HLV 1 Other Instructions .LIN The following instructions have a more specialised syntax. Here reg specifies a register expression, src and dst general source-destination addresses, and expr an expression. .SWT 1 JSR reg,dst RTS reg CALL dst (same as JSR PC,dst) CALLR dst (same as JMP dst) RETURN (same as RTS PC) EMT expr TRAP expr SYS expr (same as EMT expr) ASH src,reg ASHC src,reg MUL src,reg DIV src,reg XOR reg,dst MARK expr SOB reg,expr .LIN 1 The expression in a SOB must be in the same programme section as `.' and must be within 176 bytes of `.'. The assembler does not attempt to adjust SOB's which cannot reach their destinations into DEC and BNE, because the DEC and BNE do not set the condition codes in the same manner. .HLV 1 Floating Point Instructions .LIN The following floating point instructions are defined, with syntax as indicated. .SWT 1 ABSF dst ABSD dst ADDF src,reg ADDD src,reg CLRF dst CLRD dst CMPF reg,dst CMPD reg,dst DIVF src,reg DIVD src,reg LDCDF src,reg LDCFD src,reg LDCIF src,reg LDCID src,reg LDCLF src,reg LDCLD src,reg LDEXP src LDFPS src LDF src,reg LDD src,reg MODF src,reg MODD src,reg MULF src,reg MULD src,reg NEGF dst NEGD dst STCFD reg,dst STCDF reg,dst STCFI reg,dst STCDI reg,dst STCFL reg,dst STCDL reg,dst STEXP dst STFPS dst STF reg,dst STD reg,dst SUBF src,reg SUBD src,reg SETF SETD SETI SETL STST dst CFCC TSTF dst TSTD dst .CPT Diagnostics .LIN Syntactic or semantic errors in the source are reported by displaying the offending line on the console device, preceded by the appropriate error flags. The name of the file is also displayed. 
.SWT 1 a Addressing b Byte alignment d Illegal operation on `.' e Expression syntax j Jump (-b) m Multiple definition o Illegal operation code p Phase q Questionable syntax r Relocation u Undefined symbol .LIN Errors encountered in the accessing of files or in internal assembler operations are reported in English on the console device. Overflows are fatal and require that the assembler be modified to have a larger table (this is easy). .LIN The only really cryptic diagnostic is `AS>', which means that the MCR command line could not be obtained (you probably typed RUN AS). Type the command line you would have typed at MCR (including the leading AS). .CPT Shortcomings and Caveats .LIN The assembler has some limitations. Some of the more important ones are listed below, and are classified as H (hard to fix), L (likely to remain), F (fixable, and likely to be fixed) and G (go away). .SWT 1 F No conditional assembly L No macro facilities .CPT Installation Notes .LIN The AS assembler may be installed on RSTS/E (V7.0 or later) in RSX or RT11 modes, on RSX11M (V3.2 or later), on RT11 (V3B or later), or on VMS compatibility mode. This section contains installation instructions for RSTS/E and VMS. .HLV 1 Installation on RSTS/E .LIN The AS assembler is installed after building the CC compiler as it must refer to the compiler support library which is built together with CC. Assuming the AS modules are stored in account [5,3], the following command files should be run by ATPK: .SWT 1 XMAKAS Make RSX-style assembler RMAKAS Make RT11-style assembler .LIN 1 You will have to edit the various command files as needed. .LIN 0 In order to use AS, the System Manager must add the following to the startup command file: .SWT 1 RUN $UTILTY ADD LOGICAL [5,2] C CCL XAS-=C:AS.TSK;0 CCL AS-=C:AS.SAV;8192 EXIT .LIN 1 Note that the RSX-style assembler must run from a defined command. 
.HLV 1 Installation on VMS .LIN 1 The build command file for the AS assembler is VMAKAS.COM, located in the same directory as other compiler components. You will have to edit it as needed, running it as an indirect command file. .LIN 1 To use the assembler, define it as a command as follows: .SWT 1 XAS :== $device:[account]XAS.EXE XAS .HLV 1 Installation on RT11 .LIN 1 After building the CC compiler, the TMAKAS.COM command file should be edited and executed. .HLV 1 Installation on RSX-11M .LIN 1 After building the CC compiler, the MMAKAS.CMD command file should be invoked.