.TOC
.STL The C Language
.TTL The C Language Martin Minow
.INT
.MID Introduction
.LIN
This memorandum is an introduction to, and justification for, the C
programming language.
.LIN
"C is a general-purpose programming language. It has been closely
associated with the UNIX system, since it was developed on that
system, and since UNIX and its software are written in C. The
language, however, is not tied to any one operating system or
machine; and although it has been called a "system programming
language" because it is useful for writing operating systems, it has
been used equally well to write major numerical, text-processing, and
database programs.
.LIN
"C is a relatively "low level" language. This characterization is not
pejorative; it simply means that C deals with the same sort of
objects that most computers do, namely characters, numbers, and
addresses. These may be combined and moved about with the usual
arithmetic and logical operations implemented by actual machines.
.LIN
"C provides no operations to deal directly with composite objects
such as character strings, sets, lists, or arrays considered as a
whole. There is no analog, for example, of the PL/I operations which
manipulate an entire array or string. The language does not define
any storage allocation facility other than static definition and the
stack discipline provided by the local variables of functions: there
is no heap or garbage collection like that provided by Algol 68.
Finally, C itself provides no input-output facilities: there are no
READ or WRITE statements, and no wired-in file access methods. All of
these higher-level mechanisms must be provided by explicitly-called
functions.
.LIN
"Similarly, C offers only straightforward, single-thread control flow
constructions: tests, loops, grouping, and subprograms, but not
multiprogramming, parallel operations, synchronization, or
coroutines.
.LIN
"... Keeping the language down to modest dimensions has brought real
benefits.
Since C is relatively small, it can be described in a small space and
learned quickly. A compiler for C can be simple and compact.
Compilers are also easily written; using current technology, one can
expect to prepare a compiler for a new machine in a couple of months,
and to find that 80 percent of the code of a new compiler is common
with existing ones. This provides a high degree of language
mobility."
.LIN
From Kernighan and Ritchie, The C Programming Language.
.CPT Background
.LIN
C was developed at Bell Labs by Dennis Ritchie. It was based on BCPL
via the B language (a minimal language for system programming). The
language has been available for about five years. Over 90% of the
UNIX operating system (and essentially all UNIX applications
software) is written in C.
.LIN
The language is available on PDP-11's (all major operating systems,
including PDT150's), VAX, GCOS (HIS 6070), OS 370, Interdata 8/32,
SEL86, Nova, Eclipse, and on 8080/86 and Z80 microcomputers. On
PDP-11's, the compiler generates direct machine code. (The compiler
can be recompiled on an RT11 system.) There are at least three
compiler vendors, along with a version available from Decus.
.LIN
A C language standard was published in 1977. While the language does
not define any I/O or operating-system conventions, a "portable C
library" has been defined to provide the necessary functionality in a
machine-independent fashion. The Decus C compiler supports a run-time
library that is portable between RSX11, RT11, and either operating
system emulated under RSTS/E.
.LIN
The documentation available includes a 150 page paperback book of
excellent quality. UNIX documentation includes a 30 page reference
manual describing the entire language.
.LIN
This discussion was prepared using the UNIX system documentation, the
discussion of C in the Bell System Technical Journal (July-August
1978), and the book by Kernighan and Ritchie quoted above.
.CPT Utilization of System/microprocessor Architecture
.LIN
The language is completely machine-independent. Operating system
interaction is by (separately compiled) functions. This includes all
I/O processing.
.LIN
All variables must be declared prior to use. A variable may be
declared "register" type. If so, the compiler will attempt to use
hardware registers. Registers (R0, SP, etc.) cannot be explicitly
named by C programs. (It is possible, however, to equate pointer
variables to absolute memory locations, allowing C programs access to
device registers.)
.LIN
Assembly language cannot be embedded in C programs. However,
separately compiled C and assembly-language routines can be called.
All C programs are re-entrant and the "standard" PDP-11 calling
sequence is not used. However, the Decus runtime library includes an
interface subroutine to allow calling a Fortran library routine,
should this be necessary. This allows C programs running on RT11 to
access all operating system functionality. The subroutine call/return
sequence requires about 20 microseconds on a PDP11/70. C programs can
be compiled with "profiling" support. If this is selected, the Decus
C runtime library includes tests for stack overflow and prints
"traceback" messages if the program aborts.
.LIN
By slightly extending the definition of pointer variables, it is
possible to address any location in the machine (this is not
explicitly provided for in the language definition, and may not be
completely transportable). Thus, C programs can be used as device
drivers (all UNIX device drivers are written in C). They can also be
connected to interrupt vectors (again, not provided for in the
language standard).
.CPT Language Functionality
.LIN
C is a typed block-structured language. All "normal" control
constructs are included. The compiler generates reentrant code
(reentrancy is inherent in the language). There is no language
construct for coroutines.
.LIN
One of the most important features of the language is its ability to
define structures, and to manipulate them by use of pointer
variables. In addition, an increment/decrement operator is available
which allows flexible indexing through a structure or vector.
Multi-dimensioned arrays are also available. Also, variables may be
partitioned into bit fields.
.LIN
While the language is typed, it is possible to escape from the
typing. This is important when storage must be dynamically allocated
from free memory. The system library contains a storage allocation
subroutine which returns enough space to store the item required.
The allocator is called as follows:
.SWT 1
struct cell {                   /* Define a lisp cell           */
    struct cell *car;           /* car pointer                  */
    struct cell *cdr;           /* cdr pointer                  */
    int valtype;                /* What is stored in the cell   */
    union cellval {             /* Cell can store any of:       */
        int ival;               /* Integer or                   */
        double rval;            /* Real or                      */
        struct cell *vlist;     /* Value list pointer or        */
        char *cval;             /* Character string             */
    } v;                        /* v.ival, etc.                 */
} *head;                        /* Head points to a cell        */
.LIN 0
.SWT
/*
 * Allocate a cell and initialize its contents
 */
head = (struct cell *)alloc(sizeof(struct cell));
head->valtype = INTEGER;
head->v.ival = 123;
/*
 * Print value if a cell contains an integer
 */
if (head->valtype == INTEGER)
    printf("%d", head->v.ival);
.LIN 1
The above sequence defines a structure and an uninitialized pointer
to that structure (at compile time). When the program is run, space
of the proper size and alignment for the structure is allocated.
.LIN
Note that "sizeof(struct cell)" is evaluated at compile-time. The
program body need not be modified if the structure changes, or if the
number of bytes needed to store address pointers or other storage
units changes. This means that programs are transportable between
PDP11's (with 16 bit address pointers) and VAX's (with 32 bit
pointers).
The C "type cast" declaration allows the compiler to type-check the
assignment statements.
.LIN
The following example shows how bit fields and PDP-11 hardware
registers may be defined and manipulated:
.SWT
/*
 * Define a KW11-P programmable counter
 */
struct kw11p {
    struct {                    /* Define control-status reg.   */
        unsigned run   : 1;     /* Run bit                      */
        unsigned rate  : 2;     /* Rate select (2 bits)         */
        unsigned mode  : 1;     /* Mode                         */
        unsigned updwn : 1;     /* Up/down bit                  */
        unsigned fix   : 1;     /* Fix bit (maintenance)        */
        unsigned inten : 1;     /* Interrupt enable             */
        unsigned done  : 1;     /* Device done                  */
        unsigned       : 7;     /* Unused                       */
        unsigned error : 1;     /* Overrun                      */
    } csr;                      /* All packed in this word      */
    int csb;                    /* Count set buffer             */
    int cnt;                    /* Count                        */
};

/* Programmable clock address, cast to a pointer to the structure */
#define KW11P ((struct kw11p *)0172540)

KW11P->csr.rate = 1;            /* Set rate to 10kHz.           */
KW11P->csb = (-100);            /* Count set to 100 int/sec.    */
KW11P->csr.run |= 1;            /* Start counting               */
.LIN
The above shows how device registers might be defined, and how the
pointer mechanism may be used to access actual machine locations.
This (nonstandard) use of pointer variables follows the actual coding
practices used within the UNIX operating system.
.LIN
Note, of course, that interrupt handling would require an
operating-system specific function.
.CPT C language characteristics
.LIN
C is a typed block-structured language, many of whose concepts can be
traced back to Algol-60.
.HLV 1 C control structures
.SWT 1
if (expression) statement; else statement;
while (expression) statement;
do statement; while (expression);
for (expression; expression; expression) statement;
switch (expression) statement;
case constant: statement;
default:
break;
continue;
goto