The C Programming Language - Reference Manual
Dennis M. Ritchie
AT&T Bell Laboratories
Murray Hill, New Jersey 07974
This manual is a reprint, with updates to the current C
standard, from The C Programming Language, by Brian W. Kernighan
and Dennis M. Ritchie, Prentice-Hall, Inc., 1978.
1. Introduction
This manual describes the C language on the DEC PDP-11-, the
DEC VAX-11, and the AT&T 3B 20=. Where differences exist, it
concentrates on the VAX, but tries to point out
implementation-dependent details. With few exceptions, these
dependencies follow directly from the underlying properties of
the hardware; the various compilers are generally quite compatible.
2. Lexical Conventions
There are six classes of tokens - identifiers, keywords,
constants, strings, operators, and other separators. Blanks,
tabs, new-lines, and comments (collectively, ``white space'') as
described below are ignored except as they serve to separate
tokens. Some white space is required to separate otherwise adja-
cent identifiers, keywords, and constants.
If the input stream has been parsed into tokens up to a
given character, the next token is taken to include the longest
string of characters which could possibly constitute a token.
2.1. Comments
The characters /* introduce a comment which terminates with
the characters */. Comments do not nest.
_________________________
- DEC PDP-11, and DEC VAX-11 are trademarks of Digital Equipment
Corporation.
= 3B 20 is a trademark of AT&T.
December 26, 2008
2.2. Identifiers (Names)
An identifier is a sequence of letters and digits. The first
character must be a letter. The underscore (_) counts as a
letter. Uppercase and lowercase letters are different. Although
there is no limit on the length of a name, only initial charac-
ters are significant: at least eight characters of a non-external
name, and perhaps fewer for external names. Moreover, some imple-
mentations may collapse case distinctions for external names. The
external name sizes include:
    PDP-11        7 characters, 2 cases
    VAX-11        >100 characters, 2 cases
    AT&T 3B 20    >100 characters, 2 cases
2.3. Keywords
The following identifiers are reserved for use as keywords
and may not be used otherwise:
    auto        do          for         return      typedef
    break       double      goto        short       union
    case        else        if          sizeof      unsigned
    char        enum        int         static      void
    continue    extern      long        struct      while
    default     float       register    switch
Some implementations also reserve the words fortran, asm,
gfloat, hfloat, and quad.
2.4. Constants
There are several kinds of constants. Each has a type; an
introduction to types is given in ``NAMES.'' Hardware charac-
teristics that affect sizes are summarized in ``Hardware Charac-
teristics'' under ``LEXICAL CONVENTIONS.''
2.4.1. Integer Constants
An integer constant consisting of a sequence of digits is
taken to be octal if it begins with 0 (digit zero). An octal con-
stant consists of the digits 0 through 7 only. A sequence of
digits preceded by 0x or 0X (digit zero) is taken to be a hexade-
cimal integer. The hexadecimal digits include a or A through f or
F with values 10 through 15. Otherwise, the integer constant is
taken to be decimal. A decimal constant whose value exceeds the
largest signed machine integer is taken to be long; an octal or
hex constant which exceeds the largest unsigned machine integer
is likewise taken to be long. Otherwise, integer constants are
int.
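The rules above can be illustrated with a short sketch. The function names are hypothetical, and the code is written in ANSI-style C for a present-day compiler rather than for the machines this manual covers.

```c
/* One value written in each of the three bases. */
int same_value(void)
{
    int dec = 63;      /* decimal */
    int oct = 077;     /* octal: 7*8 + 7 */
    int hex = 0x3F;    /* hexadecimal: 3*16 + 15 */

    return dec == oct && oct == hex;
}

/* A leading zero selects octal, so 010 is eight, not ten. */
int leading_zero(void)
{
    return 010;
}
```

The second function shows a common pitfall: padding a decimal constant with a leading zero silently changes its value.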
2.4.2. Explicit Long Constants
A decimal, octal, or hexadecimal integer constant immedi-
ately followed by l (letter ell) or L is a long constant. As dis-
cussed below, on some machines integer and long values may be
considered identical.
2.4.3. Character Constants
A character constant is a character enclosed in single
quotes, as in 'x'. The value of a character constant is the
numerical value of the character in the machine's character set.
Certain nongraphic characters, the single quote (') and the
backslash (\), may be represented according to the following
table of escape sequences:
    new-line          NL (LF)   \n
    horizontal tab    HT        \t
    vertical tab      VT        \v
    backspace         BS        \b
    carriage return   CR        \r
    form feed         FF        \f
    backslash         \         \\
    single quote      '         \'
    bit pattern       ddd       \ddd
The escape \ddd consists of the backslash followed by 1, 2,
or 3 octal digits which are taken to specify the value of the
desired character. A special case of this construction is \0 (not
followed by a digit), which indicates the character NUL. If the
character following a backslash is not one of those specified,
the behavior is undefined. A new-line character is illegal in a
character constant. The type of a character constant is int.
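A minimal sketch of these rules follows. The function names are hypothetical; the letter noted in the first comment assumes an ASCII character set, although the numeric value of the escape itself does not depend on it.

```c
/* \101 is octal for 65, which is the letter A in ASCII. */
int octal_escape(void)
{
    return '\101';
}

/* \0 not followed by a digit denotes the character NUL. */
int nul_value(void)
{
    return '\0';
}

/* A character constant has type int, not char. */
int constant_is_int(void)
{
    return sizeof 'x' == sizeof(int);
}
```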
2.4.4. Floating Constants
A floating constant consists of an integer part, a decimal
point, a fraction part, an e or E, and an optionally signed
integer exponent. The integer and fraction parts both consist of
a sequence of digits. Either the integer part or the fraction
part (not both) may be missing. Either the decimal point or the e
and the exponent (not both) may be missing. Every floating con-
stant has type double.
2.4.5. Enumeration Constants
Names declared as enumerators (see ``Structure, Union, and
Enumeration Declarations'' under ``DECLARATIONS'') have type int.
2.5. Strings
A string is a sequence of characters surrounded by double
quotes, as in "...". A string has type ``array of char'' and
storage class static (see ``NAMES'') and is initialized with the
given characters. The compiler places a null byte (\0) at the end
of each string so that programs which scan the string can find
its end. In a string, the double quote character (") must be pre-
ceded by a \; in addition, the same escapes as described for
character constants may be used.
A \ and the immediately following new-line are ignored. All
strings, even when written identically, are distinct.
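As an illustration of the terminating null byte and of escapes inside strings, consider the sketch below. The function names are hypothetical, and strlen is the standard library routine declared in <string.h>.

```c
#include <string.h>

/* The array holds three characters plus the null byte added by the compiler. */
unsigned long array_size(void)
{
    return (unsigned long) sizeof "abc";
}

/* strlen scans up to, but does not count, the null byte. */
unsigned long visible_length(void)
{
    return (unsigned long) strlen("abc");
}

/* Each double quote inside the string is preceded by a backslash. */
unsigned long quoted_length(void)
{
    return (unsigned long) strlen("say \"hi\"");
}
```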
2.6. Hardware Characteristics
The following figure summarizes certain hardware properties
that vary from machine to machine.
________________________________________________________
|             | DEC PDP-11 | DEC VAX-11 | AT&T 3B      |
|             | (ASCII)    | (ASCII)    | (ASCII)      |
|_____________|____________|____________|______________|
| char        | 8 bits     | 8 bits     | 8 bits       |
| int         | 16         | 32         | 32           |
| short       | 16         | 16         | 16           |
| long        | 32         | 32         | 32           |
| float       | 32         | 32         | 32           |
| double      | 64         | 64         | 64           |
| float range | ±10^±38    | ±10^±38    | ±10^±38      |
| double range| ±10^±38    | ±10^±38    | ±10^±308     |
|_____________|____________|____________|______________|
3. Syntax Notation
Syntactic categories are indicated by italic type and
literal words and characters in bold type. Alternative categories
are listed on separate lines. An optional terminal or nonterminal
symbol is indicated by the subscript ``opt,'' so that
{ expression_opt }
indicates an optional expression enclosed in braces. The syntax
is summarized in ``SYNTAX SUMMARY''.
4. Names
The C language bases the interpretation of an identifier
upon two attributes of the identifier - its storage class and its
type. The storage class determines the location and lifetime of
the storage associated with an identifier; the type determines
the meaning of the values found in the identifier's storage.
4.1. Storage Class
There are four declarable storage classes: automatic, static,
external, and register.
Automatic variables are local to each invocation of a block
(see ``Compound Statement or Block'' in ``STATEMENTS'') and are
discarded upon exit from the block. Static variables are local to
a block but retain their values upon reentry to a block even
after control has left the block. External variables exist and
retain their values throughout the execution of the entire pro-
gram and may be used for communication between functions, even
separately compiled functions. Register variables are (if possi-
ble) stored in the fast registers of the machine; like automatic
variables, they are local to each block and disappear on exit
from the block.
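The four storage classes can be seen side by side in a small sketch. All names here are hypothetical, and the function is written in ANSI style for a present-day compiler.

```c
int calls_seen;                    /* external: exists for the whole program */

int next_id(void)
{
    static int id = 0;             /* static: keeps its value between calls */
    int step = 1;                  /* automatic: created afresh on every call */
    register int r = id + step;    /* register: automatic, kept in a register if possible */

    id = r;
    calls_seen++;
    return id;
}
```

Successive calls return 1, 2, 3, and so on, because id survives each exit from the block while step and r do not.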
4.2. Type
The C language supports several fundamental types of
objects. Objects declared as characters (char) are large enough
to store any member of the implementation's character set. If a
genuine character from that character set is stored in a char
variable, its value is equivalent to the integer code for that
character. Other quantities may be stored into character vari-
ables, but the implementation is machine dependent. In particu-
lar, char may be signed or unsigned by default.
Up to three sizes of integer, declared short int, int, and
long int, are available. Longer integers provide no less storage
than shorter ones, but the implementation may make either short
integers or long integers, or both, equivalent to plain integers.
``Plain'' integers have the natural size suggested by the host
machine architecture. The other sizes are provided to meet spe-
cial needs.
The properties of enum types (see ``Structure, Union, and
Enumeration Declarations'' under ``DECLARATIONS'') are identical
to those of some integer types. The implementation may use the
range of values to determine how to allocate storage.
Unsigned integers, declared unsigned, obey the laws of
arithmetic modulo 2^n where n is the number of bits in the
representation. (On the PDP-11, unsigned long quantities are not
supported.)
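Modular arithmetic means that unsigned values wrap around rather than overflow. A brief sketch, with hypothetical function names and UINT_MAX taken from the standard header <limits.h>:

```c
#include <limits.h>

/* One past the largest unsigned value wraps to zero. */
unsigned wrap_up(void)
{
    unsigned u = UINT_MAX;
    return u + 1;
}

/* One below zero wraps to the largest unsigned value. */
unsigned wrap_down(void)
{
    unsigned u = 0;
    return u - 1;
}
```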
Single-precision floating point (float) and double precision
floating point (double) may be synonymous in some implementa-
tions.
Because objects of the foregoing types can usefully be
interpreted as numbers, they will be referred to as arithmetic
types. Char, int of all sizes whether unsigned or not, and enum
will collectively be called integral types. The float and double
types will collectively be called floating types.
The void type specifies an empty set of values. It is used
as the type returned by functions that generate no value.
Besides the fundamental arithmetic types, there is a
conceptually infinite class of derived types constructed from the
fundamental types in the following ways:
    Arrays of objects of most types
    Functions which return objects of a given type
    Pointers to objects of a given type
    Structures containing a sequence of objects of various types
    Unions capable of containing any one of several objects of various types.
In general these methods of constructing objects can be
applied recursively.
5. Objects and Lvalues
An object is a manipulatable region of storage. An lvalue is
an expression referring to an object. An obvious example of an
lvalue expression is an identifier. There are operators which
yield lvalues: for example, if E is an expression of pointer
type, then *E is an lvalue expression referring to the object to
which E points. The name ``lvalue'' comes from the assignment
expression E1 = E2 in which the left operand E1 must be an lvalue
expression. The discussion of each operator below indicates
whether it expects lvalue operands and whether it yields an
lvalue.
6. Conversions
A number of operators may, depending on their operands,
cause conversion of the value of an operand from one type to
another. This part explains the result to be expected from such
conversions. The conversions demanded by most ordinary operators
are summarized under ``Arithmetic Conversions.'' The summary will
be supplemented as required by the discussion of each operator.
6.1. Characters and Integers
A character or a short integer may be used wherever an
integer may be used. In all cases the value is converted to an
integer. Conversion of a shorter integer to a longer preserves
sign. Whether or not sign-extension occurs for characters is
machine dependent, but it is guaranteed that a member of the
standard character set is non-negative. Of the machines treated
here, only the PDP-11 and VAX-11 sign-extend. On these machines,
char variables range in value from -128 to 127. The more explicit
type unsigned char forces the values to range from 0 to 255.
On machines that treat characters as signed, the characters
of the ASCII set are all non-negative. However, a character con-
stant specified with an octal escape suffers sign extension and
may appear negative; for example, '\377' has the value -1.
When a longer integer is converted to a shorter integer or
to a char, it is truncated on the left. Excess bits are simply
discarded.
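The widening and narrowing rules can be sketched as follows. The function names are hypothetical, and the narrowing example assumes 8-bit characters, as on all three machines described here.

```c
/* Widening a shorter integer to a longer one preserves sign and value. */
long widen(short s)
{
    return s;
}

/* Narrowing discards the excess high-order bits (8-bit char assumed). */
unsigned char narrow(int i)
{
    return (unsigned char) i;
}
```

With these definitions, narrow(0x1234) yields 0x34: only the low-order byte survives.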
6.2. Float and Double
All floating arithmetic in C is carried out in double preci-
sion. Whenever a float appears in an expression it is lengthened
to double by zero padding its fraction. When a double must be
converted to float, for example by an assignment, the double is
rounded before truncation to float length. This result is
undefined if it cannot be represented as a float. On the VAX, the
compiler can be directed to use single precision for expressions
containing only float and integer operands.
6.3. Floating and Integral
Conversions of floating values to integral type are rather
machine dependent. In particular, the direction of truncation of
negative numbers varies. The result is undefined if it will not
fit in the space provided.
Conversions of integral values to floating type are well
behaved. Some loss of accuracy occurs if the destination lacks
sufficient bits.
6.4. Pointers and Integers
An expression of integral type may be added to or subtracted
from a pointer; in such a case, the first is converted as speci-
fied in the discussion of the addition operator. Two pointers to
objects of the same type may be subtracted; in this case, the
result is converted to an integer as specified in the discussion
of the subtraction operator.
6.5. Unsigned
Whenever an unsigned integer and a plain integer are com-
bined, the plain integer is converted to unsigned and the result
is unsigned. The value is the least unsigned integer congruent to
the signed integer (modulo 2^wordsize). In a 2's complement
representation, this conversion is conceptual; and there is no
actual change in the bit pattern.
When an unsigned short integer is converted to long, the
value of the result is the same numerically as that of the
unsigned integer. Thus the conversion amounts to padding with
zeros on the left.
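Both unsigned rules can be checked with a short sketch. The function names are hypothetical; UINT_MAX comes from <limits.h>.

```c
#include <limits.h>

/* The plain integer -1 is converted to unsigned before the comparison,
   giving the least unsigned value congruent to -1, which is UINT_MAX. */
int minus_one_is_max(void)
{
    int i = -1;
    unsigned u = UINT_MAX;
    return i == u;
}

/* An unsigned short widens to long with its numeric value unchanged. */
long widen_unsigned(unsigned short us)
{
    return us;
}
```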
6.6. Arithmetic Conversions
A great many operators cause conversions and yield result
types in a similar way. This pattern will be called the ``usual
arithmetic conversions.'' First, any operands of type char or
short are converted to int, and any operands of type unsigned
char or unsigned short are converted to unsigned int. Then, if
either operand is double, the other is converted to double and
that is the type of the result. Otherwise, if either operand is
unsigned long, the other is converted to unsigned long and that
is the type of the result. Otherwise, if either operand is long,
the other is converted to long and that is the type of the
result. Otherwise, if one operand is long, and the other is
unsigned int, they are both converted to unsigned long and that
is the type of the result. Otherwise, if either operand is
unsigned, the other is converted to unsigned and that is the type
of the result. Otherwise, both operands must be int, and that is
the type of the result.
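A few of these conversions can be observed through sizeof and through a mixed comparison. This is only a sketch with hypothetical names; it deliberately sticks to cases that behave the same way under this manual and under later C standards.

```c
/* Operands of type char are converted to int before the addition. */
int chars_promote(void)
{
    char a = 1, b = 2;
    return sizeof(a + b) == sizeof(int);
}

/* If either operand is double, the result is double. */
int mixes_to_double(void)
{
    int i = 1;
    double d = 2.0;
    return sizeof(i + d) == sizeof(double);
}

/* If either operand is unsigned, the other is converted to unsigned,
   so -1 becomes a very large value and is not less than 1. */
int signed_vs_unsigned(void)
{
    int i = -1;
    unsigned u = 1;
    return i < u;
}
```

The last function returns 0, which surprises many readers; it follows directly from the unsigned rule.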
6.7. Void
The (nonexistent) value of a void object may not be used in
any way, and neither explicit nor implicit conversion may be
applied. Because a void expression denotes a nonexistent value,
such an expression may be used only as an expression statement
(see ``Expression Statement'' under ``STATEMENTS'') or as the
left operand of a comma expression (see ``Comma Operator'' under
``EXPRESSIONS'').
An expression may be converted to type void by use of a
cast. For example, this makes explicit the discarding of the
value of a function call used as an expression statement.
7. Expressions
The precedence of expression operators is the same as the
order of the major subsections of this section, highest pre-
cedence first. Thus, for example, the expressions referred to as
the operands of + (see ``Additive Operators'') are those expres-
sions defined under ``Primary Expressions'', ``Unary Operators'',
and ``Multiplicative Operators''. Within each subpart, the opera-
tors have the same precedence. Left- or right-associativity is
specified in each subsection for the operators discussed therein.
The precedence and associativity of all the expression operators
are summarized in the grammar of ``SYNTAX SUMMARY''.
Otherwise, the order of evaluation of expressions is unde-
fined. In particular, the compiler considers itself free to com-
pute subexpressions in the order it believes most efficient even
if the subexpressions involve side effects. The order in which
subexpression evaluation takes place is unspecified. Expressions
involving a commutative and associative operator (*, +, &, |, ^)
may be rearranged arbitrarily even in the presence of
parentheses; to force a particular order of evaluation, an expli-
cit temporary must be used.
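The use of an explicit temporary might look like the sketch below. The helper functions and the trace variable are hypothetical; they exist only to make the order of the two calls observable.

```c
static int trace;    /* records the call order as decimal digits */

static int first(void)  { trace = trace * 10 + 1; return 10; }
static int second(void) { trace = trace * 10 + 2; return 20; }

/* Writing first() + second() leaves the order of the calls to the
   compiler. Assigning to a temporary forces first() to run before
   the next statement begins. */
int ordered_sum(void)
{
    int t;

    trace = 0;
    t = first();
    return t + second();
}

int call_trace(void)
{
    return trace;
}
```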
The handling of overflow and divide check in expression
evaluation is undefined. Most existing implementations of C
ignore integer overflows; treatment of division by 0 and all
floating-point exceptions varies between machines and is usually
adjustable by a library function.
7.1. Primary Expressions
Primary expressions involving ., ->, subscripting, and func-
tion calls group left to right.
primary-expression:
identifier
constant
string
( expression )
primary-expression [ expression ]
primary-expression ( expression-list_opt )
primary-expression . identifier
primary-expression -> identifier
expression-list:
expression
expression-list , expression
An identifier is a primary expression provided it has been
suitably declared as discussed below. Its type is specified by
its declaration. If the type of the identifier is ``array of
...'', then the value of the identifier expression is a pointer
to the first object in the array; and the type of the expression
is ``pointer to ...''. Moreover, an array identifier is not an
lvalue expression. Likewise, an identifier which is declared
``function returning ...'', when used except in the function-name
position of a call, is converted to ``pointer to function return-
ing ...''.
A constant is a primary expression. Its type may be int,
long, or double depending on its form. Character constants have
type int and floating constants have type double.
A string is a primary expression. Its type is originally
``array of char'', but following the same rule given above for
identifiers, this is modified to ``pointer to char'' and the
result is a pointer to the first character in the string. (There
is an exception in certain initializers; see ``Initialization''
under ``DECLARATIONS.'')
A parenthesized expression is a primary expression whose
type and value are identical to those of the unadorned expres-
sion. The presence of parentheses does not affect whether the
expression is an lvalue.
A primary expression followed by an expression in square
brackets is a primary expression. The intuitive meaning is that
of a subscript. Usually, the primary expression has type
``pointer to ...'', the subscript expression is int, and the type
of the result is ``...''. The expression E1[E2] is identical (by
definition) to *((E1)+(E2)). All the clues needed to understand
this notation are contained in this subpart together with the
discussions in ``Unary Operators'' and ``Additive Operators'' on
identifiers, * and + respectively. The implications are summar-
ized under ``Arrays, Pointers, and Subscripting'' under ``TYPES
REVISITED.''
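The identity between subscripting and pointer addition can be demonstrated directly. The function name is hypothetical, and the initialized automatic array assumes a present-day compiler.

```c
int subscript_forms_agree(void)
{
    int a[4] = { 10, 20, 30, 40 };
    int i = 2;

    return a[i] == *(a + i)     /* the definition of subscripting */
        && a[i] == *(i + a)     /* addition is commutative */
        && a[i] == i[a];        /* so this odd spelling is also legal */
}
```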
A function call is a primary expression followed by
parentheses containing a possibly empty, comma-separated list of
expressions which constitute the actual arguments to the func-
tion. The primary expression must be of type ``function returning
...,'' and the result of the function call is of type ``...''. As
indicated below, a hitherto unseen identifier followed immedi-
ately by a left parenthesis is contextually declared to represent
a function returning an integer; thus in the most common case,
integer-valued functions need not be declared.
Any actual arguments of type float are converted to double
before the call. Any of type char or short are converted to int.
Array names are converted to pointers. No other conversions are
performed automatically; in particular, the compiler does not
compare the types of actual arguments with those of formal argu-
ments. If conversion is needed, use a cast; see ``Unary Opera-
tors'' and ``Type Names'' under ``DECLARATIONS.''
In preparing for the call to a function, a copy is made of
each actual parameter. Thus, all argument passing in C is
strictly by value. A function may change the values of its formal
parameters, but these changes cannot affect the values of the
actual parameters. It is possible to pass a pointer on the under-
standing that the function may change the value of the object to
which the pointer points. An array name is a pointer expression.
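The difference between changing a copy and changing the object a pointer refers to is shown in this sketch. The function names are hypothetical, and prototypes are used in ANSI style.

```c
/* Changes only the local copy of the argument. */
void bump_copy(int n)
{
    n = n + 1;
}

/* Changes the object to which the pointer points. */
void bump_through(int *p)
{
    *p = *p + 1;
}

int demo_call_by_value(void)
{
    int x = 5;

    bump_copy(x);        /* x is still 5 */
    bump_through(&x);    /* x is now 6 */
    return x;
}
```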
The order of evaluation of arguments is undefined by the
language; take note that the various compilers differ. Recursive
calls to any function are permitted.
A primary expression followed by a dot followed by an iden-
tifier is an expression. The first expression must be a structure
or a union, and the identifier must name a member of the struc-
ture or union. The value is the named member of the structure or
union, and it is an lvalue if the first expression is an lvalue.
A primary expression followed by an arrow (built from - and
> ) followed by an identifier is an expression. The first
expression must be a pointer to a structure or a union and the
identifier must name a member of that structure or union. The
result is an lvalue referring to the named member of the struc-
ture or union to which the pointer expression points. Thus the
expression E1->MOS is the same as (*E1).MOS. Structures and
unions are discussed in ``Structure, Union, and Enumeration
Declarations'' under ``DECLARATIONS.''
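The equivalence of the arrow and dot forms can be sketched as follows; struct point and the function name are hypothetical.

```c
struct point { int x; int y; };

int arrow_equals_dot(void)
{
    struct point pt;
    struct point *p = &pt;

    pt.x = 3;      /* member of a structure lvalue */
    p->y = 4;      /* member reached through a pointer */

    return p->x == (*p).x           /* same value */
        && &p->y == &(*p).y         /* same object */
        && pt.y == 4;
}
```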
7.2. Unary Operators
Expressions with unary operators group right to left.
unary-expression:
* expression
& lvalue
- expression
! expression
~ expression
++ lvalue
-- lvalue
lvalue ++
lvalue --
( type-name ) expression
sizeof expression
sizeof ( type-name )
The unary * operator means indirection; the expression must
be a pointer, and the result is an lvalue referring to the object
to which the expression points. If the type of the expression is
``pointer to ...,'' the type of the result is ``...''.
The result of the unary & operator is a pointer to the
object referred to by the lvalue. If the type of the lvalue is
``...'', the type of the result is ``pointer to ...''.
The result of the unary - operator is the negative of its
operand. The usual arithmetic conversions are performed. The
negative of an unsigned quantity is computed by subtracting its
value from 2^n where n is the number of bits in the corresponding
signed type.
There is no unary + operator.
The result of the logical negation operator ! is one if the
value of its operand is zero, zero if the value of its operand is
nonzero. The type of the result is int. It is applicable to any
arithmetic type or to pointers.
The ~ operator yields the one's complement of its operand.
The usual arithmetic conversions are performed. The type of the
operand must be integral.
The object referred to by the lvalue operand of prefix ++ is
incremented. The value is the new value of the operand but is not
an lvalue. The expression ++x is equivalent to x=x+1. See the
discussions ``Additive Operators'' and ``Assignment Operators''
for information on conversions.
The lvalue operand of prefix -- is decremented analogously
to the prefix ++ operator.
When postfix ++ is applied to an lvalue, the result is the
value of the object referred to by the lvalue. After the result
is noted, the object is incremented in the same manner as for the
prefix ++ operator. The type of the result is the same as the
type of the lvalue expression.
When postfix -- is applied to an lvalue, the result is the
value of the object referred to by the lvalue. After the result
is noted, the object is decremented in the same manner as for the
prefix -- operator. The type of the result is the same as the type
of the lvalue expression.
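The difference between the prefix and postfix forms lies only in the value of the expression, as this sketch with hypothetical function names suggests.

```c
/* Prefix: the object is incremented, and the result is the new value. */
int prefix_inc(int *x)
{
    return ++*x;
}

/* Postfix: the result is the old value; the object is incremented afterwards. */
int postfix_inc(int *x)
{
    return (*x)++;
}
```

Starting from 5, prefix_inc yields 6 and leaves 6 in the object; a following postfix_inc also yields 6 but leaves 7.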
An expression preceded by the parenthesized name of a data
type causes conversion of the value of the expression to the
named type. This construction is called a cast. Type names are
described in ``Type Names'' under ``DECLARATIONS.''
The sizeof operator yields the size in bytes of its operand.
(A byte is undefined by the language except in terms of the value
of sizeof. However, in all existing implementations, a byte is
the space required to hold a char.) When applied to an array, the
result is the total number of bytes in the array. The size is
determined from the declarations of the objects in the expres-
sion. This expression is semantically an unsigned constant and
may be used anywhere a constant is required. Its major use is in
communication with routines like storage allocators and I/O sys-
tems.
The sizeof operator may also be applied to a parenthesized
type name. In that case it yields the size in bytes of an object
of the indicated type.
The construction sizeof(type) is taken to be a unit, so the
expression sizeof(type)-2 is the same as (sizeof(type))-2.
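Two common uses of sizeof are sketched below. The function names are hypothetical, and results are returned as unsigned long purely for convenience.

```c
/* Applied to an array, sizeof gives the total number of bytes, so
   dividing by the size of one element gives the element count. */
unsigned long elements(void)
{
    int table[10];
    return (unsigned long) (sizeof table / sizeof table[0]);
}

/* sizeof(long) is taken as a unit, so this is (sizeof(long)) - 2
   and not the size of the cast expression (long)-2. */
unsigned long minus_two(void)
{
    return (unsigned long) (sizeof(long)-2);
}
```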
7.3. Multiplicative Operators
The multiplicative operators *, /, and % group left to
right. The usual arithmetic conversions are performed.
multiplicative-expression:
expression * expression
expression / expression
expression % expression
The binary * operator indicates multiplication. The * opera-
tor is associative, and expressions with several multiplications
at the same level may be rearranged by the compiler. The binary /
operator indicates division.
The binary % operator yields the remainder from the division
of the first expression by the second. The operands must be
integral.
When positive integers are divided, truncation is toward 0;
but the form of truncation is machine-dependent if either operand
is negative. On all machines covered by this manual, the
remainder has the same sign as the dividend. It is always true
that (a/b)*b + a%b is equal to a (if b is not 0).
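The identity relating / and % can be written as a function and tried with operands of either sign. The function name is hypothetical, and the caller is assumed to pass a nonzero b.

```c
/* Rebuilds a from its quotient and remainder; b must not be 0. */
int recombine(int a, int b)
{
    return (a / b) * b + a % b;
}
```

The individual quotient and remainder for negative operands may differ between machines, but their recombination does not.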
7.4. Additive Operators
The additive operators + and - group left to right. The
usual arithmetic conversions are performed. There are some addi-
tional type possibilities for each operator.
additive-expression:
expression + expression
expression - expression
The result of the + operator is the sum of the operands. A
pointer to an object in an array and a value of any integral type
may be added. The latter is in all cases converted to an address
offset by multiplying it by the length of the object to which the
pointer points. The result is a pointer of the same type as the
original pointer which points to another object in the same
array, appropriately offset from the original object. Thus if P
is a pointer to an object in an array, the expression P+1 is a
pointer to the next object in the array. No further type combina-
tions are allowed for pointers.
The + operator is associative, and expressions with several
additions at the same level may be rearranged by the compiler.
The result of the - operator is the difference of the
operands. The usual arithmetic conversions are performed. Addi-
tionally, a value of any integral type may be subtracted from a
pointer, and then the same conversions for addition apply.
If two pointers to objects of the same type are subtracted,
the result is converted (by division by the length of the object)
to an int representing the number of objects separating the
pointed-to objects. This conversion will in general give unex-
pected results unless the pointers point to objects in the same
array, since pointers, even to objects of the same type, do not
necessarily differ by a multiple of the object length.
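Pointer addition and subtraction within a single array can be sketched as follows; the function names and array contents are hypothetical.

```c
/* P+1 points to the next object in the array, whatever its size. */
int next_element(void)
{
    int a[5] = { 10, 20, 30, 40, 50 };
    int *p = &a[1];

    return *(p + 1);
}

/* Subtracting two pointers into the same array gives the number of
   objects between them, not the number of bytes. */
int distance(void)
{
    int a[5];
    int *lo = &a[1];
    int *hi = &a[4];

    return (int) (hi - lo);
}
```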
7.5. Shift Operators
The shift operators << and >> group left to right. Both per-
form the usual arithmetic conversions on their operands, each of
which must be integral. Then the right operand is converted to
int; the type of the result is that of the left operand. The
result is undefined if the right operand is negative or greater
than or equal to the length of the object in bits. On the VAX a
negative right operand is interpreted as reversing the direction
of the shift.
shift-expression:
expression << expression
expression >> expression
The value of E1<<E2 is E1 (interpreted as a bit pattern)
left-shifted E2 bits. Vacated bits are 0 filled. The value of
E1>>E2 is E1 right-shifted E2 bit positions. The right shift is
guaranteed to be logical (0 fill) if E1 is unsigned; otherwise,
it may be arithmetic.
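A small sketch of both shifts on unsigned operands, where the right shift is guaranteed to fill with zeros. The function names are hypothetical, and the shift counts used are assumed to be non-negative and smaller than the width of unsigned.

```c
/* Left shift: vacated low-order bits are filled with 0. */
unsigned shl(unsigned v, int n)
{
    return v << n;
}

/* Right shift of an unsigned value is logical: 0 fill from the left. */
unsigned shr(unsigned v, int n)
{
    return v >> n;
}
```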
7.6. Relational Operators
The relational operators group left to right.
relational-expression:
expression < expression
expression > expression
expression <= expression
expression >= expression
The operators < (less than), > (greater than), <= (less than
or equal to), and >= (greater than or equal to) all yield 0 if
the specified relation is false and 1 if it is true. The type of
the result is int. The usual arithmetic conversions are per-
formed. Two pointers may be compared; the result depends on the
relative locations in the address space of the pointed-to
objects. Pointer comparison is portable only when the pointers
point to objects in the same array.
7.7. Equality Operators
equality-expression:
expression == expression
expression != expression
The == (equal to) and the != (not equal to) operators are
exactly analogous to the relational operators except for their
lower precedence. (Thus a<b == c<d is 1 whenever a<b and c<d have
the same truth value).
A pointer may be compared to an integer only if the integer
is the constant 0. A pointer to which 0 has been assigned is
guaranteed not to point to any object and will appear to be equal
to 0. In conventional usage, such a pointer is considered to be
null.
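The precedence remark and the null pointer convention can both be tried in a short sketch. The function names are hypothetical.

```c
/* Parsed as (a<b) == (c<d), because == has lower precedence than <. */
int same_truth(int a, int b, int c, int d)
{
    return a<b == c<d;
}

/* A pointer may be compared with the constant 0. */
int is_null(int *p)
{
    return p == 0;
}
```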
7.8. Bitwise AND Operator
and-expression:
expression & expression
The & operator is associative, and expressions involving &
may be rearranged. The usual arithmetic conversions are per-
formed. The result is the bitwise AND function of the operands.
The operator applies only to integral operands.
7.9. Bitwise Exclusive OR Operator
exclusive-or-expression:
expression ^ expression
The ^ operator is associative, and expressions involving ^
may be rearranged. The usual arithmetic conversions are per-
formed; the result is the bitwise exclusive OR function of the
operands. The operator applies only to integral operands.
7.10. Bitwise Inclusive OR Operator
inclusive-or-expression:
expression | expression
The | operator is associative, and expressions involving |
may be rearranged. The usual arithmetic conversions are per-
formed; the result is the bitwise inclusive OR function of its
operands. The operator applies only to integral operands.
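The three bitwise operators can be compared on the same pair of operands. The function names are hypothetical; the comments work through the bit patterns 1100 and 1010.

```c
/* 1100 & 1010 = 1000: bits set in both operands. */
unsigned and_bits(unsigned a, unsigned b) { return a & b; }

/* 1100 ^ 1010 = 0110: bits set in exactly one operand. */
unsigned xor_bits(unsigned a, unsigned b) { return a ^ b; }

/* 1100 | 1010 = 1110: bits set in either operand. */
unsigned or_bits(unsigned a, unsigned b)  { return a | b; }
```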
7.11. Logical AND Operator
logical-and-expression:
expression && expression
The && operator groups left to right. It returns 1 if both
its operands evaluate to nonzero, 0 otherwise. Unlike &, &&
guarantees left to right evaluation; moreover, the second operand
is not evaluated if the first operand is 0.
The operands need not have the same type, but each must have
one of the fundamental types or be a pointer. The result is
always int.
7.12. Logical OR Operator
logical-or-expression:
expression || expression
The || operator groups left to right. It returns 1 if either
of its operands evaluates to nonzero, 0 otherwise. Unlike |, ||
guarantees left to right evaluation; moreover, the second operand
is not evaluated if the value of the first operand is nonzero.
The operands need not have the same type, but each must have
one of the fundamental types or be a pointer. The result is
always int.
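The guarantee that the second operand of && and || may be skipped can be observed with a sketch such as the following. The counter calls and the functions bump, calls_for_and, and calls_for_or are hypothetical names introduced only for this illustration; the (void) parameter lists and casts are modern notation, not taken from the manual.

```c
static int calls;   /* hypothetical counter: how many times bump() ran */

static int bump(int v)
{
    calls++;
    return v;
}

/* Number of bump() calls made while evaluating  first && bump(1). */
int calls_for_and(int first)
{
    calls = 0;
    (void)(first && bump(1));   /* bump runs only if first is nonzero */
    return calls;
}

/* Number of bump() calls made while evaluating  first || bump(1). */
int calls_for_or(int first)
{
    calls = 0;
    (void)(first || bump(1));   /* bump runs only if first is 0 */
    return calls;
}

/* The result of either operator is always the int 0 or 1. */
int and_value(int a, int b) { return a && b; }
int or_value(int a, int b)  { return a || b; }
```

For example, calls_for_and(0) and calls_for_or(1) both return 0 because the second operand is never evaluated, and and_value(5, 7) returns 1 rather than 5 or 7.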
7.13. Conditional Operator
conditional-expression:
expression ? expression : expression
Conditional expressions group right to left. The first
expression is evaluated; and if it is nonzero, the result is the
value of the second expression, otherwise that of the third
expression. If possible, the usual arithmetic conversions are
performed to bring the second and third expressions to a common
type. If both are structures or unions of the same type, the
result has the type of the structure or union. If both are
pointers of the same type, the result has the common type.
Otherwise, one must be
a pointer and the other the constant 0, and the result has the
type of the pointer. Only one of the second and third expressions
is evaluated.
7.14. Assignment Operators
There are a number of assignment operators, all of which
group right to left. All require an lvalue as their left operand,
and the type of an assignment expression is that of its left
operand. The value is the value stored in the left operand after
the assignment has taken place. The two parts of a compound
assignment operator are separate tokens.
assignment-expression:
lvalue = expression
lvalue += expression
lvalue -= expression
lvalue *= expression
lvalue /= expression
lvalue %= expression
lvalue >>= expression
lvalue <<= expression
lvalue &= expression
lvalue ^= expression
lvalue |= expression
In the simple assignment with =, the value of the expression
replaces that of the object referred to by the lvalue. First, if
both operands have arithmetic type, the right operand is converted
to the type of the left preparatory to the assignment. Second, both
operands may be structures or unions of the same type. Finally,
if the left operand is a pointer, the right operand must in gen-
eral be a pointer of the same type. However, the constant 0 may
be assigned to a pointer; it is guaranteed that this value will
produce a null pointer distinguishable from a pointer to any
object.
The behavior of an expression of the form E1 op = E2 may be
inferred by taking it as equivalent to E1 = E1 op (E2); however,
E1 is evaluated only once. In += and -=, the left operand may be
a pointer; in which case, the (integral) right operand is con-
verted as explained in ``Additive Operators.'' All right operands
and all nonpointer left operands must have arithmetic type.
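The remark that E1 is evaluated only once matters when E1 has a side effect. The following sketch, whose function names are hypothetical, shows a compound assignment whose left operand contains a post-increment, and a += applied to a pointer. It assumes that ip does not point into the array a.

```c
/* Hypothetical helper: adds 10 to a[*ip] and post-increments *ip in
   the same expression. Because the left operand is evaluated only
   once, *ip advances by exactly 1. */
void add_ten_and_advance(int a[], int *ip)
{
    a[(*ip)++] += 10;
}

/* Hypothetical helper: with a pointer as left operand, the integral
   right operand counts elements, not bytes. */
int *skip_two(int *p)
{
    p += 2;
    return p;
}
```

Had the expression been written out as a[(*ip)++] = a[(*ip)++] + 10, the index would be changed twice and the behavior would not be what the compound form gives.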
7.15. Comma Operator
comma-expression:
expression , expression
A pair of expressions separated by a comma is evaluated left
to right, and the value of the left expression is discarded. The
type and value of the result are the type and value of the right
operand. This operator groups left to right. In contexts where
comma is given a special meaning, e.g., in lists of actual argu-
ments to functions (see ``Primary Expressions'') and lists of
initializers (see ``Initialization'' under ``DECLARATIONS''), the
comma operator as described in this subpart can only appear in
parentheses. For example,
f(a, (t=3, t+2), c)
has three arguments, the second of which has the value 5.
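The call above can be tried with a sketch like the following, in which second_of_three is a hypothetical stand-in for f that simply returns its second argument. The function is written with a modern parameter list, an assumption made so that current compilers accept it.

```c
/* Hypothetical stand-in for f: returns its second argument. */
static int second_of_three(int a, int b, int c)
{
    (void)a;
    (void)c;
    return b;
}

int comma_demo(void)
{
    int t;
    /* The parenthesized (t = 3, t + 2) is a single argument: t is
       assigned 3, that value is discarded, and t + 2 is passed. */
    return second_of_three(1, (t = 3, t + 2), 4);
}
```

Here comma_demo returns 5, the value of the right operand of the comma operator.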
8. Declarations
Declarations are used to specify the interpretation which C
gives to each identifier; they do not necessarily reserve storage
associated with the identifier. Declarations have the form
declaration:
decl-specifiers declarator-list<opt> ;
The declarators in the declarator-list contain the identif-
iers being declared. The decl-specifiers consist of a sequence of
type and storage class specifiers.
decl-specifiers:
type-specifier decl-specifiers<opt>
sc-specifier decl-specifiers<opt>
The list must be self-consistent in a way described below.
8.1. Storage Class Specifiers
The sc-specifiers are:
sc-specifier:
auto
static
extern
register
typedef
The typedef specifier does not reserve storage and is called
a ``storage class specifier'' only for syntactic convenience. See
``Typedef'' for more information. The meanings of the various
storage classes were discussed in ``Names.''
The auto, static, and register declarations also serve as
definitions in that they cause an appropriate amount of storage
to be reserved. In the extern case, there must be an external
definition (see ``External Definitions'') for the given identif-
iers somewhere outside the function in which they are declared.
A register declaration is best thought of as an auto
declaration, together with a hint to the compiler that the vari-
ables declared will be heavily used. Only the first few such
declarations in each function are effective. Moreover, only vari-
ables of certain types will be stored in registers; on the PDP-
11, they are int or pointer. One other restriction applies to
register variables: the address-of operator & cannot be applied
to them. Smaller, faster programs can be expected if register
declarations are used appropriately, but future improvements in
code generation may render them unnecessary.
At most, one sc-specifier may be given in a declaration. If
the sc-specifier is missing from a declaration, it is taken to be
auto inside a function, extern outside. Exception: functions are
never automatic.
8.2. Type Specifiers
The type-specifiers are
type-specifier:
basic-type-specifier
struct-or-union-specifier
typedef-name
enum-specifier
basic-type-specifier:
basic-type
basic-type basic-type-specifiers
basic-type:
char
short
int
long
unsigned
float
double
void
At most one of the words long or short may be specified in
conjunction with int; the meaning is the same as if int were not
mentioned. The word long may be specified in conjunction with
float; the meaning is the same as double. The word unsigned may
be specified alone, or in conjunction with int or any of its
short or long varieties, or with char.
Otherwise, at most one type-specifier may be given in a
declaration. In particular, adjectival use of long, short, or
unsigned is not permitted with typedef names. If the type-
specifier is missing from a declaration, it is taken to be int.
Specifiers for structures, unions, and enumerations are dis-
cussed in ``Structure, Union, and Enumeration Declarations.''
Declarations with typedef names are discussed in ``Typedef.''
8.3. Declarators
The declarator-list appearing in a declaration is a comma-
separated sequence of declarators, each of which may have an ini-
tializer.
declarator-list:
init-declarator
init-declarator , declarator-list
init-declarator:
declarator initializer<opt>
Initializers are discussed in ``Initialization''. The
specifiers in the declaration indicate the type and storage class
of the objects to which the declarators refer. Declarators have
the syntax:
declarator:
identifier
( declarator )
* declarator
declarator ()
declarator [ constant-expression<opt> ]
The grouping is the same as in expressions.
8.4. Meaning of Declarators
Each declarator is taken to be an assertion that when a con-
struction of the same form as the declarator appears in an
expression, it yields an object of the indicated type and storage
class.
Each declarator contains exactly one identifier; it is this
identifier that is declared. If an unadorned identifier appears
as a declarator, then it has the type indicated by the specifier
heading the declaration.
A declarator in parentheses is identical to the unadorned
declarator, but the binding of complex declarators may be altered
by parentheses. See the examples below.
Now imagine a declaration
T D1
where T is a type-specifier (like int, etc.) and D1 is a
declarator. Suppose this declaration makes the identifier have
type ``... T,'' where the ``...'' is empty if D1 is just a plain
identifier (so that the type of x in ``int x'' is just int). Then
if D1 has the form
*D
the type of the contained identifier is ``... pointer to T.''
If D1 has the form
D()
then the contained identifier has the type ``... function return-
ing T.''
If D1 has the form
D[constant-expression]
or
D[]
then the contained identifier has type ``... array of T.'' In the
first case, the constant expression is an expression whose value
is determinable at compile time, whose type is int, and whose
value is positive. (Constant expressions are defined precisely in
``Constant Expressions.'') When several ``array of'' specifica-
tions are adjacent, a multidimensional array is created; the con-
stant expressions which specify the bounds of the arrays may be
missing only for the first member of the sequence. This elision
is useful when the array is external and the actual definition,
which allocates storage, is given elsewhere. The first constant
expression may also be omitted when the declarator is followed by
initialization. In this case the size is calculated from the
number of initial elements supplied.
An array may be constructed from one of the basic types,
from a pointer, from a structure or union, or from another array
(to generate a multidimensional array).
Not all the possibilities allowed by the syntax above are
actually permitted. The restrictions are as follows: functions
may not return arrays or functions although they may return
pointers; there are no arrays of functions although there may be
arrays of pointers to functions. Likewise, a structure or union
may not contain a function; but it may contain a pointer to a
function.
As an example, the declaration
int i, *ip, f(), *fip(), (*pfi)();
declares an integer i, a pointer ip to an integer, a function f
returning an integer, a function fip returning a pointer to an
integer, and a pointer pfi to a function which returns an
integer. It is especially useful to compare the last two. The
binding of *fip() is *(fip()). The declaration suggests, and the
same construction in an expression requires, the calling of a
function fip, and then using indirection through the (pointer)
result to yield an integer. In the declarator (*pfi)(), the extra
parentheses are necessary, as they are also in an expression, to
indicate that indirection through a pointer to a function yields
a function, which is then called; it returns an integer.
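The difference between *fip() and (*pfi)() can be seen in use. In the sketch below the function bodies, the variable seven, and use_declarators are invented for illustration; the parameter lists are written as (void), a modern notation that the manual's empty parentheses predate.

```c
static int seven = 7;            /* hypothetical object for fip to point at */

/* f: function returning an integer. */
int f(void) { return 1; }

/* fip: function returning a pointer to an integer. */
int *fip(void) { return &seven; }

/* pfi: pointer to a function which returns an integer. */
int (*pfi)(void) = f;

int use_declarators(void)
{
    int i, *ip;

    i = *fip();               /* call fip, then indirect through the result */
    ip = &i;
    return *ip + (*pfi)();    /* indirect through pfi, then call the function */
}
```

With these definitions, use_declarators returns 8: the 7 obtained through fip plus the 1 returned by the function to which pfi points.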
As another example,
float fa[17], *afp[17];
declares an array of float numbers and an array of pointers to
float numbers. Finally,
static int x3d[3][5][7];
declares a static 3-dimensional array of integers, with rank
3x5x7. In complete detail, x3d is an array of three items; each
item is an array of five arrays; each of the latter arrays is an
array of seven integers. Any of the expressions x3d, x3d[i],
x3d[i][j], x3d[i][j][k] may reasonably appear in an expression.
The first three have type ``array'' and the last has type int.
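The nesting of x3d can be checked with sizeof, which reports the size of each level of subarray. This is only a sketch; the helper function names are hypothetical.

```c
static int x3d[3][5][7];

/* Number of ints in the whole array and in each level of subarray. */
int ints_in_x3d(void)     { return (int)(sizeof x3d / sizeof(int)); }
int ints_in_x3d_i(void)   { return (int)(sizeof x3d[0] / sizeof(int)); }
int ints_in_x3d_i_j(void) { return (int)(sizeof x3d[0][0] / sizeof(int)); }

/* Only the fully subscripted form x3d[i][j][k] has type int. */
int store_and_load(int i, int j, int k, int v)
{
    x3d[i][j][k] = v;
    return x3d[i][j][k];
}
```

The three size functions return 105, 35, and 7 respectively, matching three items of five arrays of seven integers.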
8.5. Structure and Union Declarations
A structure is an object consisting of a sequence of named
members. Each member may have any type. A union is an object
which may, at a given time, contain any one of several members.
Structure and union specifiers have the same form.
struct-or-union-specifier:
struct-or-union { struct-decl-list }
struct-or-union identifier { struct-decl-list }
struct-or-union identifier
struct-or-union:
struct
union
The struct-decl-list is a sequence of declarations for the
members of the structure or union:
struct-decl-list:
struct-declaration
struct-declaration struct-decl-list
struct-declaration:
type-specifier struct-declarator-list ;
struct-declarator-list:
struct-declarator
struct-declarator , struct-declarator-list
In the usual case, a struct-declarator is just a declarator
for a member of a structure or union. A structure member may also
consist of a specified number of bits. Such a member is also
called a field; its length, a non-negative constant expression,
is set off from the field name by a colon.
struct-declarator:
declarator
declarator : constant-expression
: constant-expression
Within a structure, the objects declared have addresses
which increase as the declarations are read left to right. Each
nonfield member of a structure begins on an addressing boundary
appropriate to its type; therefore, there may be unnamed holes in
a structure. Field members are packed into machine integers; they
do not straddle words. A field which does not fit into the space
remaining in a word is put into the next word. No field may be
wider than a word.
Fields are assigned right to left on the PDP-11 and VAX-11,
left to right on the 3B 20.
A struct-declarator with no declarator, only a colon and a
width, indicates an unnamed field useful for padding to conform
to externally-imposed layouts. As a special case, a field with a
width of 0 specifies alignment of the next field at an
implementation-dependent boundary.
The language does not restrict the types of things that are
declared as fields, but implementations are not required to sup-
port any but integer fields. Moreover, even int fields may be
considered to be unsigned. On the PDP-11, fields are not signed
and have only integer values; on the VAX-11, fields declared with
int are treated as containing a sign. For these reasons, it is
strongly recommended that fields be declared as unsigned. In all
implementations, there are no arrays of fields, and the address-
of operator & may not be applied to them, so that there are no
pointers to fields.
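A sketch of a structure with fields follows. The structure flags, its member names, and the function pack_and_read_mode are hypothetical. All fields are declared unsigned, as recommended above, so the value read back does not depend on whether the implementation treats fields as signed; the placement of the fields within a word remains implementation dependent.

```c
/* Hypothetical packed flags; every field is unsigned. */
struct flags {
    unsigned is_open : 1;
    unsigned mode    : 3;   /* holds 0..7 */
    unsigned         : 4;   /* unnamed field used only for padding */
    unsigned count   : 8;   /* holds 0..255 */
};

unsigned pack_and_read_mode(unsigned m)
{
    struct flags fl;

    fl.is_open = 1;
    fl.mode = m;        /* the value is reduced modulo 8 to fit 3 bits */
    fl.count = 200;
    return fl.mode;
}
```

Under these assumptions pack_and_read_mode(5) returns 5, and pack_and_read_mode(9) returns 1 because only the low three bits are kept.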
A union may be thought of as a structure all of whose
members begin at offset 0 and whose size is sufficient to contain
any of its members. At most, one of the members can be stored in
a union at any time.
A structure or union specifier of the second form, that is,
one of
struct identifier { struct-decl-list }
union identifier { struct-decl-list }
declares the identifier to be the structure tag (or union tag) of
the structure specified by the list. A subsequent declaration may
then use the third form of specifier, one of
struct identifier
union identifier
Structure tags allow definition of self-referential struc-
tures. Structure tags also permit the long part of the declara-
tion to be given once and used several times. It is illegal to
declare a structure or union which contains an instance of
itself, but a structure or union may contain a pointer to an
instance of itself.
The third form of a structure or union specifier may be used
prior to a declaration which gives the complete specification of
the structure or union in situations in which the size of the
structure or union is unnecessary. The size is unnecessary in two
situations: when a pointer to a structure or union is being
declared and when a typedef name is declared to be a synonym for
a structure or union. This, for example, allows the declaration
of a pair of structures which contain pointers to each other.
The names of members and tags do not conflict with each
other or with ordinary variables. A particular name may not be
used twice in the same structure, but the same name may be used
in several different structures in the same scope.
A simple but important example of a structure declaration is
the following binary tree structure:
struct tnode
{
char tword[20];
int count;
struct tnode *left;
struct tnode *right;
};
which contains an array of 20 characters, an integer, and two
pointers to similar structures. Once this declaration has been
given, the declaration
struct tnode s, *sp;
declares s to be a structure of the given sort and sp to be a
pointer to a structure of the given sort. With these declara-
tions, the expression
sp->count
refers to the count field of the structure to which sp points;
s.left
refers to the left subtree pointer of the structure s; and
s.right->tword[0]
refers to the first character of the tword member of the right
subtree of s.
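The member references above can be exercised with a short sketch. The helpers init_node and first_char_of_right are hypothetical, and the const qualifier and the use of strncpy from string.h are modern conveniences assumed here rather than anything the manual prescribes.

```c
#include <string.h>

struct tnode
{
    char tword[20];
    int count;
    struct tnode *left;
    struct tnode *right;
};

/* Hypothetical helper: fills in a node that has no subtrees. */
void init_node(struct tnode *sp, const char *word)
{
    strncpy(sp->tword, word, sizeof sp->tword - 1);
    sp->tword[sizeof sp->tword - 1] = '\0';
    sp->count = 1;
    sp->left = 0;      /* the constant 0 gives a null pointer */
    sp->right = 0;
}

/* Hypothetical helper: first character of the word in the right subtree. */
char first_char_of_right(struct tnode s)
{
    return s.right->tword[0];
}
```

After two nodes are initialized and one is attached as the right subtree of the other, first_char_of_right yields the first character of the attached node's tword.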
8.6. Enumeration Declarations
Enumeration variables and constants have integral type.
enum-specifier:
enum { enum-list }
enum identifier { enum-list }
enum identifier
enum-list:
enumerator
enum-list , enumerator
enumerator:
identifier
identifier = constant-expression
The identifiers in an enum-list are declared as constants
and may appear wherever constants are required. If no enumerators
with = appear, then the values of the corresponding constants
begin at 0 and increase by 1 as the declaration is read from left
to right. An enumerator with = gives the associated identifier
the value indicated; subsequent identifiers continue the progres-
sion from the assigned value.
The names of enumerators in the same scope must all be dis-
tinct from each other and from those of ordinary variables.
The role of the identifier in the enum-specifier is entirely
analogous to that of the structure tag in a struct-specifier; it
names a particular enumeration. For example,
enum color { chartreuse, burgundy, claret=20, winedark };
...
enum color *cp, col;
...
col = claret;
cp = &col;
...
if (*cp == burgundy) ...
makes color the enumeration-tag of a type describing various
colors, and then declares cp as a pointer to an object of that
type, and col as an object of that type. The possible values are
drawn from the set {0,1,20,21}.
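The numbering of the enumerators can be confirmed with a sketch such as the one below. The functions color_value and is_burgundy are hypothetical helpers; is_burgundy takes a pointer to an object of the enumeration type and tests it by indirection.

```c
enum color { chartreuse, burgundy, claret = 20, winedark };

/* Hypothetical helper: the integral value of an enumeration constant. */
int color_value(enum color c)
{
    return (int)c;
}

/* Hypothetical helper: tests the object to which cp points. */
int is_burgundy(enum color *cp)
{
    return *cp == burgundy;
}
```

Here chartreuse and burgundy take the values 0 and 1, claret is given 20 explicitly, and winedark continues the progression with 21.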
8.7. Initialization
A declarator may specify an initial value for the identifier
being declared. The initializer is preceded by = and consists of
an expression or a list of values nested in braces.
initializer:
= expression
= { initializer-list }
= { initializer-list , }
initializer-list:
expression
initializer-list , initializer-list
{ initializer-list }
{ initializer-list , }
All the expressions in an initializer for a static or exter-
nal variable must be constant expressions, which are described in
``CONSTANT EXPRESSIONS'', or expressions which reduce to the
address of a previously declared variable, possibly offset by a
constant expression. Automatic or register variables may be ini-
tialized by arbitrary expressions involving constants and previ-
ously declared variables and functions.
Static and external variables that are not initialized are
guaranteed to start off as zero. Automatic and register variables
that are not initialized are guaranteed to start off as garbage.
When an initializer applies to a scalar (a pointer or an
object of arithmetic type), it consists of a single expression,
perhaps in braces. The initial value of the object is taken from
the expression; the same conversions as for assignment are per-
formed.
When the declared variable is an aggregate (a structure or
array), the initializer consists of a brace-enclosed, comma-
separated list of initializers for the members of the aggregate
written in increasing subscript or member order. If the aggregate
contains subaggregates, this rule applies recursively to the
members of the aggregate. If there are fewer initializers in the
list than there are members of the aggregate, then the aggregate
is padded with zeros. It is not permitted to initialize unions or
automatic aggregates.
Braces may in some cases be omitted. If the initializer
begins with a left brace, then the succeeding comma-separated
list of initializers initializes the members of the aggregate; it
is erroneous for there to be more initializers than members. If,
however, the initializer does not begin with a left brace, then
only enough elements from the list are taken to account for the
members of the aggregate; any remaining initializers are left to
initialize the next member of the aggregate of which the current
aggregate is a part.
A final abbreviation allows a char array to be initialized
by a string. In this case successive characters of the string
initialize the members of the array.
For example,
int x[] = { 1, 3, 5 };
declares and initializes x as a one-dimensional array which has
three members, since no size was specified and there are three
initializers.
float y[4][3] =
{
{ 1, 3, 5 },
{ 2, 4, 6 },
{ 3, 5, 7 },
};
is a completely-bracketed initialization: 1, 3, and 5 initialize
the first row of the array y[0], namely y[0][0], y[0][1], and
y[0][2]. Likewise, the next two lines initialize y[1] and y[2].
The initializer ends early and therefore y[3] is initialized with
0. Precisely the same effect could have been achieved by
float y[4][3] =
{
1, 3, 5, 2, 4, 6, 3, 5, 7
};
The initializer for y begins with a left brace but that for
y[0] does not; therefore, three elements from the list are used.
Likewise, the next three are taken successively for y[1] and
y[2]. Also,
float y[4][3] =
{
{ 1 }, { 2 }, { 3 }, { 4 }
};
initializes the first column of y (regarded as a two-dimensional
array) and leaves the rest 0.
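The three initializations of y can be compared directly. In the sketch below the arrays are given distinct hypothetical names (y_full, y_flat, y_col) so that they can coexist in one file, and the accessor functions are invented for illustration. Some compilers may warn about the missing inner braces in y_flat, which is the form being demonstrated.

```c
static float y_full[4][3] =
{
    { 1, 3, 5 },
    { 2, 4, 6 },
    { 3, 5, 7 },
};

static float y_flat[4][3] =
{
    1, 3, 5, 2, 4, 6, 3, 5, 7
};

static float y_col[4][3] =
{
    { 1 }, { 2 }, { 3 }, { 4 }
};

/* Returns 1 if the bracketed and unbracketed forms hold equal values. */
int same_contents(void)
{
    int i, j;

    for (i = 0; i < 4; i++)
        for (j = 0; j < 3; j++)
            if (y_full[i][j] != y_flat[i][j])
                return 0;
    return 1;
}

float full_elem(int i, int j) { return y_full[i][j]; }
float col_elem(int i, int j)  { return y_col[i][j]; }
```

In this sketch same_contents returns 1, full_elem(3, 0) is 0 because the last row is padded with zeros, and col_elem(3, 0) is 4 while col_elem(0, 1) is 0.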
Finally,
char msg[] = "Syntax error on line %s\n";
shows a character array whose members are initialized with a
string.
8.8. Type Names
In two contexts (to specify type conversions explicitly by
means of a cast and as an argument of sizeof), it is desired to
supply the name of a data type. This is accomplished using a
``type name'', which in essence is a declaration for an object of
that type which omits the name of the object.
type-name:
type-specifier abstract-declarator
abstract-declarator:
empty
( abstract-declarator )
* abstract-declarator
abstract-declarator ()
abstract-declarator [ constant-expression<opt> ]
To avoid ambiguity, in the construction
( abstract-declarator )
the abstract-declarator is required to be nonempty. Under this
restriction, it is possible to identify uniquely the location in
the abstract-declarator where the identifier would appear if the
construction were a declarator in a declaration. The named type
is then the same as the type of the hypothetical identifier. For
example,
int
int *
int *[3]
int (*)[3]
int *()
int (*)()
int (*[3])()
name respectively the types ``integer,'' ``pointer to integer,''
``array of three pointers to integers,'' ``pointer to an array of
three integers,'' ``function returning pointer to integer,''
``pointer to function returning an integer,'' and ``array of
three pointers to functions returning an integer.''
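Type names appear in practice as operands of sizeof and in casts. The following sketch uses two of the forms listed above; the function names are hypothetical, and the function types are written with (void) parameter lists, a modern notation assumed here in place of the manual's empty parentheses.

```c
static int one(void) { return 1; }

/* sizeof applied to the type name "array of three pointers to integers". */
int ptrs_in_array_type(void)
{
    return (int)(sizeof(int *[3]) / sizeof(int *));
}

/* Casts using the type name "pointer to function returning an integer". */
int call_through_cast(void)
{
    void (*generic)(void) = (void (*)(void))one;  /* to another function pointer type */
    int (*back)(void) = (int (*)(void))generic;   /* and back to the original type */

    return (*back)();
}
```

Here ptrs_in_array_type returns 3, and call_through_cast returns 1 because the pointer is converted back to its original type before the call.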
8.9. Typedef
Declarations whose ``storage class'' is typedef do not
define storage but instead define identifiers which can be used
later as if they were type keywords naming fundamental or derived
types.
typedef-name:
identifier
Within the scope of a declaration involving typedef, each
identifier appearing as part of any declarator therein becomes
syntactically equivalent to the type keyword naming the type
associated with the identifier in the way described in ``Meaning
of Declarators.'' For example, after
typedef int MILES, *KLICKSP;
typedef struct { double re, im; } complex;
the constructions
MILES distance;
extern KLICKSP metricp;
complex z, *zp;
are all legal declarations; the type of distance is int, that of
metricp is ``pointer to int,'' and that of z is the specified
structure. The variable zp is a pointer to such a structure.
The typedef does not introduce brand-new types, only
synonyms for types which could be specified in another way. Thus
in the example above distance is considered to have exactly the
same type as any other int object.
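That typedef names are synonyms and not new types can be seen in a sketch like the following, where MILES and int values mix freely. The functions add_miles, deref_klicks, and real_of are hypothetical; the sketch assumes no header that defines complex as a macro has been included.

```c
typedef int MILES, *KLICKSP;
typedef struct { double re, im; } complex;

/* A MILES value and a plain int combine without any conversion. */
MILES add_miles(MILES a, int b)
{
    return a + b;
}

/* KLICKSP is just "pointer to int". */
int deref_klicks(KLICKSP metricp)
{
    return *metricp;
}

/* Builds a complex object and reads a member through a pointer to it. */
double real_of(double re, double im)
{
    complex z, *zp;

    z.re = re;
    z.im = im;
    zp = &z;
    return zp->re;
}
```

For instance, add_miles(3, 4) returns 7, and the address of any int object may be passed to deref_klicks.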
9. Statements
Except as indicated, statements are executed in sequence.
9.1. Expression Statement
Most statements are expression statements, which have the
form
expression ;
Usually expression statements are assignments or function
calls.
9.2. Compound Statement or Block
So that several statements can be used where one is
expected, the compound statement (also, and equivalently, called
``block'') is provided:
compound-statement:
{ declaration-list<opt> statement-list<opt> }
declaration-list:
declaration
declaration declaration-list
statement-list:
statement
statement statement-list
If any of the identifiers in the declaration-list were pre-
viously declared, the outer declaration is pushed down for the
duration of the block, after which it resumes its force.
Any initializations of auto or register variables are per-
formed each time the block is entered at the top. It is currently
possible (but a bad practice) to transfer into a block; in that
case the initializations are not performed. Initializations of
static variables are performed only once when the program begins
execution. Inside a block, extern declarations do not reserve
storage so initialization is not permitted.
9.3. Conditional Statement
The two forms of the conditional statement are
if ( expression ) statement
if ( expression ) statement else statement
In both cases, the expression is evaluated; and if it is
nonzero, the first substatement is executed. In the second case,
the second substatement is executed if the expression is 0. The
``else'' ambiguity is resolved by connecting an else with the
last encountered else-less if.
9.4. While Statement
The while statement has the form
while ( expression ) statement
The substatement is executed repeatedly so long as the value
of the expression remains nonzero. The test takes place before
each execution of the statement.
9.5. Do Statement
The do statement has the form
do statement while ( expression ) ;
The substatement is executed repeatedly until the value of
the expression becomes 0. The test takes place after each execu-
tion of the statement.
9.6. For Statement
The for statement has the form:
for ( exp-1<opt> ; exp-2<opt> ; exp-3<opt> ) statement
Except for the behavior of continue, this statement is
equivalent to
exp-1 ;
while ( exp-2 )
{
statement
exp-3 ;
}
Thus the first expression specifies initialization for the
loop; the second specifies a test, made before each iteration,
such that the loop is exited when the expression becomes 0. The
third expression often specifies an incrementing that is per-
formed after each iteration.
Any or all of the expressions may be dropped. A missing
exp-2 makes the implied while clause equivalent to while(1);
other missing expressions are simply dropped from the expansion
above.
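The equivalence between the for statement and its while expansion can be illustrated with two hypothetical functions that compute the same sum. The comments in the second function mark which line plays the role of each expression.

```c
/* Sum of 1..n written with a for statement. */
int sum_with_for(int n)
{
    int i, s = 0;

    for (i = 1; i <= n; i++)
        s += i;
    return s;
}

/* The same loop written as the equivalent while expansion. */
int sum_with_while(int n)
{
    int i, s = 0;

    i = 1;                 /* exp-1 */
    while (i <= n)         /* exp-2 */
    {
        s += i;            /* statement */
        i++;               /* exp-3 */
    }
    return s;
}
```

Both functions return 55 for n equal to 10 and 0 for n equal to 0, since in each the test is made before the first iteration.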
9.7. Switch Statement
The switch statement causes control to be transferred to one
of several statements depending on the value of an expression. It
has the form
switch ( expression ) statement
The usual arithmetic conversion is performed on the
expression, but the result must be int. The statement is typi-
cally compound. Any statement within the statement may be labeled
with one or more case prefixes as follows:
case constant-expression :
where the constant expression must be int. No two of the case
constants in the same switch may have the same value. Constant
expressions are precisely defined in ``CONSTANT EXPRESSIONS.''
There may also be at most one statement prefix of the form
default :
When the switch statement is executed, its expression is
evaluated and compared with each case constant. If one of the
case constants is equal to the value of the expression, control
is passed to the statement following the matched case prefix. If
no case constant matches the expression and if there is a
default prefix, control passes to the prefixed statement. If no
case matches and if there is no default, then none of the state-
ments in the switch is executed.
The prefixes case and default do not alter the flow of con-
trol, which continues unimpeded across such prefixes. To exit
from a switch, see ``Break Statement.''
Usually, the statement that is the subject of a switch is
compound. Declarations may appear at the head of this statement,
but initializations of automatic or register variables are inef-
fective.
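Because case prefixes do not alter the flow of control, execution that enters at one case continues through the following ones until a break is reached. The hypothetical function below counts how many case bodies are executed for a given starting value.

```c
/* Hypothetical helper: counts the case bodies run from a given entry. */
int count_from(int start)
{
    int n = 0;

    switch (start)
    {
    case 1:
        n++;        /* control continues into case 2 */
    case 2:
        n++;        /* control continues into case 3 */
    case 3:
        n++;
        break;      /* leaves the switch */
    default:
        n = -1;     /* no case constant matched */
    }
    return n;
}
```

Under these assumptions count_from returns 3, 2, and 1 for the arguments 1, 2, and 3, and -1 for any other value.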
9.8. Break Statement
The statement
break ;
causes termination of the smallest enclosing while, do, for, or
switch statement; control passes to the statement following the
terminated statement.
9.9. Continue Statement
The statement
continue ;
causes control to pass to the loop-continuation portion of the
smallest enclosing while, do, or for statement; that is, to the
end of the loop. More precisely, in each of the statements
    while (...) {          do {                   for (...) {
        statement ;            statement ;            statement ;
    contin: ;              contin: ;              contin: ;
    }                      } while (...);         }
a continue is equivalent to goto contin. (Following the contin:
is a null statement, see ``Null Statement''.)
9.10. Return Statement
A function returns to its caller by means of the return
statement which has one of the forms
return ;
return expression ;
In the first case, the returned value is undefined. In the
second case, the value of the expression is returned to the
caller of the function. If required, the expression is converted,
as if by assignment, to the type of the function in which it appears.
Flowing off the end of a function is equivalent to a return with
no returned value. The expression may be parenthesized.
9.11. Goto Statement
Control may be transferred unconditionally by means of the
statement
goto identifier ;
The identifier must be a label (see ``Labeled Statement'')
located in the current function.
9.12. Labeled Statement
Any statement may be preceded by label prefixes of the form
identifier :
which serve to declare the identifier as a label. The only use of
a label is as a target of a goto. The scope of a label is the
current function, excluding any subblocks in which the same iden-
tifier has been redeclared. See ``SCOPE RULES.''
9.13. Null Statement
The null statement has the form
;
A null statement is useful to carry a label just before the
} of a compound statement or to supply a null body to a looping
statement such as while.
10. External Definitions
A C program consists of a sequence of external definitions.
An external definition declares an identifier to have storage
class extern (by default) or perhaps static, and a specified
type. The type-specifier (see ``Type Specifiers'' in ``DECLARA-
TIONS'') may also be empty, in which case the type is taken to be
int. The scope of external definitions persists to the end of the
file in which they are declared just as the effect of declara-
tions persists to the end of a block. The syntax of external
definitions is the same as that of all declarations except that
only at this level may the code for functions be given.
10.1. External Function Definitions
Function definitions have the form
function-definition:
decl-specifiers<opt> function-declarator function-body
The only sc-specifiers allowed among the decl-specifiers are
extern or static; see ``Scope of Externals'' in ``SCOPE RULES''
for the distinction between them. A function declarator is simi-
lar to a declarator for a ``function returning ...'' except that
it lists the formal parameters of the function being defined.
function-declarator:
declarator ( parameter-list<opt> )
parameter-list:
identifier
identifier , parameter-list
The function-body has the form
function-body:
declaration-list<opt> compound-statement
The identifiers in the parameter list, and only those
identifiers, may be declared in the declaration list. Any iden-
tifiers whose type is not given are taken to be int. The only
storage class which may be specified is register; if it is speci-
fied, the corresponding actual parameter will be copied, if pos-
sible, into a register at the outset of the function.
A simple example of a complete function definition is
int max(a, b, c)
int a, b, c;
{
    int m;
    m = (a > b) ? a : b;
    return((m > c) ? m : c);
}
Here int is the type-specifier; max(a, b, c) is the
function-declarator; int a, b, c; is the declaration-list for the
formal parameters; { ... } is the block giving the code for the
statement.
C converts all float actual parameters to double, so formal
parameters declared float have their declaration
adjusted to read double. All char and short formal parameter
declarations are similarly adjusted to read int. Also, since a
reference to an array in any context (in particular as an actual
parameter) is taken to mean a pointer to the first element of the
array, declarations of formal parameters declared ``array of
...'' are adjusted to read ``pointer to ....''
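The array adjustment can be observed directly. In the following sketch (sum and clear are hypothetical names, and modern prototypes are assumed), the parameter a is declared as an array but behaves as a pointer: it can be incremented, and stores through it change the caller's array.

```c
/* Declared ``array of int,'' but adjusted to ``pointer to int.'' */
int sum(int a[], int n)
{
    int s = 0;

    while (n-- > 0)
        s += *a++;      /* legal only because a is really a pointer */
    return s;
}

/* Stores through the parameter reach the caller's array. */
void clear(int a[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        a[i] = 0;
}
```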
10.2. External Data Definitions
An external data definition has the form
data-definition:
declaration
The storage class of such data may be extern (which is the
default) or static but not auto or register.
11. Scope Rules
A C program need not all be compiled at the same time. The
source text of the program may be kept in several files, and
precompiled routines may be loaded from libraries. Communication
among the functions of a program may be carried out both through
explicit calls and through manipulation of external data.
Therefore, there are two kinds of scopes to consider: first,
what may be called the lexical scope of an identifier, which is
essentially the region of a program during which it may be used
without drawing ``undefined identifier'' diagnostics; and second,
the scope associated with external identifiers, which is charac-
terized by the rule that references to the same external identif-
ier are references to the same object.
11.1. Lexical Scope
The lexical scope of identifiers declared in external defin-
itions persists from the definition through the end of the source
file in which they appear. The lexical scope of identifiers which
are formal parameters persists through the function with which
they are associated. The lexical scope of identifiers declared at
the head of a block persists until the end of the block. The lex-
ical scope of labels is the whole of the function in which they
appear.
In all cases, however, if an identifier is explicitly
declared at the head of a block, including the block constituting
a function, any declaration of that identifier outside the block
is suspended until the end of the block.
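A small sketch may make the suspension rule concrete. The names level and scope_demo are invented for the example, which assumes a compiler accepting modern prototypes.

```c
int level = 1;                      /* external definition: visible to end of file */

int scope_demo(void)
{
    int seen = level;               /* 1: the external level */

    {
        int level = 2;              /* suspends the outer declaration */
        seen = seen * 10 + level;   /* 12 */
    }
    seen = seen * 10 + level;       /* 121: the external level again */
    return seen;
}
```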
Remember also (see ``Structure, Union, and Enumeration
Declarations'' in ``DECLARATIONS'') that tags, identifiers associated
with ordinary variables, and identifiers associated with
structure and union members form three disjoint classes which do
not conflict. Members and tags follow the same scope rules as
other identifiers. The enum constants are in the same class as
ordinary variables and follow the same scope rules. The typedef
names are in the same class as ordinary identifiers. They may be
redeclared in inner blocks, but an explicit type must be given in
the inner declaration:
typedef float distance;
...
{
    auto int distance;
    ...
}
The int must be present in the second declaration, or it
would be taken to be a declaration with no declarators and type
distance.
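A complete version of this fragment might look as follows. It is only a sketch: typedef_demo is a hypothetical name, the auto keyword is omitted because it is the default inside a block, and a modern prototype is assumed.

```c
typedef float distance;

int typedef_demo(void)
{
    distance d = 2.5;           /* here distance names the type float */
    int whole;

    {
        int distance = 7;       /* the explicit int makes this a variable */
        whole = distance;
    }
    return whole + (int) d;     /* 7 + 2 */
}
```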
11.2. Scope of Externals
If a function refers to an identifier declared to be extern,
then somewhere among the files or libraries constituting the com-
plete program there must be at least one external definition for
the identifier. All functions in a given program which refer to
the same external identifier refer to the same object, so care
must be taken that the type and size specified in the definition
are compatible with those specified by each function which refer-
ences the data.
It is illegal to explicitly initialize any external identif-
ier more than once in the set of files and libraries comprising a
multi-file program. It is legal to have more than one data defin-
ition for any external non-function identifier; explicit use of
extern does not change the meaning of an external declaration.
In restricted environments, the use of the extern storage
class takes on an additional meaning. In these environments, the
explicit appearance of the extern keyword in external data
declarations of identifiers without initialization indicates that
the storage for the identifiers is allocated elsewhere, either in
this file or another file. It is required that there be exactly
one definition of each external identifier (without extern) in
the set of files and libraries comprising a multi-file program.
Identifiers declared static at the top level in external
definitions are not visible in other files. Functions may be
declared static.
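The following sketch shows the three cases side by side. In a real program the extern declaration and the definition of shared would normally be in different files; they are placed in one file here only so that the example compiles on its own. All names are invented for the illustration.

```c
extern int shared;          /* declaration: storage is allocated elsewhere */

static int next_id = 0;     /* static: not visible in other files */

static int bump(void)       /* functions may be static too */
{
    return ++next_id;
}

int new_id(void)            /* external: may be called from other files */
{
    return shared + bump();
}

int shared = 100;           /* the one definition of shared */
```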
12. Compiler Control Lines
The C compiler contains a preprocessor capable of macro sub-
stitution, conditional compilation, and inclusion of named files.
Lines beginning with # communicate with this preprocessor. There
may be any number of blanks and horizontal tabs between the # and
the directive. These lines have syntax independent of the rest of
the language; they may appear anywhere and have effect which
lasts (independent of scope) until the end of the source program
file.
12.1. Token Replacement
A compiler-control line of the form
#define identifier token-stringopt
causes the preprocessor to replace subsequent instances of the
identifier with the given string of tokens. Semicolons in or at
the end of the token-string are part of that string. A line of
the form
#define identifier(identifier, ... )token-stringopt
where there is no space between the first identifier and the (,
is a macro definition with arguments. There may be zero or more
formal parameters. Subsequent instances of the first identifier
followed by a (, a sequence of tokens delimited by commas, and a
) are replaced by the token string in the definition. Each
occurrence of an identifier mentioned in the formal parameter
list of the definition is replaced by the corresponding token
string from the call. The actual arguments in the call are token
strings separated by commas; however, commas in quoted strings or
protected by parentheses do not separate arguments. The number of
formal and actual parameters must be the same. Strings and char-
acter constants in the token-string are scanned for formal param-
eters, but strings and character constants in the rest of the
program are not scanned for defined identifiers to be replaced.
In both forms the replacement string is rescanned for more
defined identifiers. In both forms a long definition may be con-
tinued on another line by writing \ at the end of the line to be
continued.
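As a sketch of these rules, the following definitions use a macro with arguments, a replacement string that is itself rescanned, and a continued definition. The macro and function names are hypothetical, and the code assumes a compiler accepting modern prototypes.

```c
#define TABSIZE 100
#define DOUBLE_TAB (2 * TABSIZE)    /* replacement text is rescanned */
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define AREA(w, h) \
    ((w) * (h))                     /* definition continued with \ */

static int add(int x, int y)
{
    return x + y;
}

int macro_demo(void)
{
    int x = 3, y = 4;

    /* The comma inside add(...) is protected by parentheses, so
       MAX still receives exactly two arguments. */
    return MAX(add(x, y), 2) + AREA(x + 1, 2) + DOUBLE_TAB;
}
```

The parentheses around each formal parameter in the token-strings are a matter of caution rather than a requirement: replacement is textual, so without them AREA(x + 1, 2) would expand to x + 1 * 2.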
This facility is most valuable for definition of ``manifest
constants,'' as in
#define TABSIZE 100
int table[TABSIZE];
A control line of the form
#undef identifier
causes the identifier's preprocessor definition (if any) to be
forgotten.
If a #defined identifier is the subject of a subsequent
#define with no intervening #undef, then the two token-strings
are compared textually. If the two token-strings are not identi-
cal (all white space is considered as equivalent), then the iden-
tifier is considered to be redefined.
12.2. File Inclusion
A compiler control line of the form
#include "filename"
causes the replacement of that line by the entire contents of the
file filename. The named file is searched for first in the direc-
tory of the file containing the #include, and then in a sequence
of specified or standard places. Alternatively, a control line of
the form
#include <filename>
searches only the specified or standard places and not the direc-
tory of the #include. (How the places are specified is not part
of the language.)
#includes may be nested.
12.3. Conditional Compilation
A compiler control line of the form
#if restricted-constant-expression
checks whether the restricted-constant expression evaluates to
nonzero. (Constant expressions are discussed in ``CONSTANT
EXPRESSIONS''; the following additional restrictions apply here:
the constant expression may not contain sizeof, casts, or an
enumeration constant.)
A restricted constant expression may also contain the addi-
tional unary expression
defined identifier
or
defined( identifier )
which evaluates to one if the identifier is currently defined in
the preprocessor and zero if it is not.
All currently defined identifiers in restricted-constant-
expressions are replaced by their token-strings (except those
identifiers modified by defined) just as in normal text. The
restricted constant expression will be evaluated only after all
expansions have finished. During this evaluation, all identifiers
undefined to the preprocessor evaluate to zero.
A control line of the form
#ifdef identifier
checks whether the identifier is currently defined in the
preprocessor; i.e., whether it has been the subject of a #define
control line. It is equivalent to #if defined(identifier). A control line
of the form
#ifndef identifier
checks whether the identifier is currently undefined in the
preprocessor. It is equivalent to
#if !defined(identifier).
All three forms are followed by an arbitrary number of
lines, possibly containing a control line
#else
and then by a control line
#endif
If the checked condition is true, then any lines between
#else and #endif are ignored. If the checked condition is false,
then any lines between the test and a #else or, lacking a #else,
the #endif are ignored.
These constructions may be nested.
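The three forms can be compared in a short sketch. The identifiers DEBUG_LEVEL, VERBOSE, and the rest are invented for the example; VERBOSE is deliberately never defined.

```c
#define DEBUG_LEVEL 2
/* VERBOSE is deliberately left undefined. */

#if DEBUG_LEVEL > 1 && !defined(VERBOSE)
#define MODE 1              /* taken: DEBUG_LEVEL is 2, VERBOSE undefined */
#else
#define MODE 0
#endif

#ifdef DEBUG_LEVEL          /* same test as #if defined(DEBUG_LEVEL) */
#define HAVE_DEBUG 1
#else
#define HAVE_DEBUG 0
#endif

#ifndef VERBOSE             /* same test as #if !defined(VERBOSE) */
#define QUIET 1
#else
#define QUIET 0
#endif

#if VERBOSE                 /* an undefined identifier evaluates to zero */
#define CHATTY 1
#else
#define CHATTY 0
#endif

int build_flags(void)
{
    return MODE * 1000 + HAVE_DEBUG * 100 + QUIET * 10 + CHATTY;
}
```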
12.4. Line Control
For the benefit of other preprocessors which generate C pro-
grams, a line of the form
#line constant "filename"
causes the compiler to believe, for purposes of error diagnos-
tics, that the line number of the next source line is given by
the constant and the current input file is named by "filename".
If "filename" is absent, the remembered file name does not
change.
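The effect can be observed with the predefined names __LINE__ and __FILE__. These come from later standard C and are not described in this manual, so the following is a sketch under that assumption; generated.c and the function names are hypothetical.

```c
#include <string.h>

/* Pretend the lines below came from line 100 of generated.c. */
#line 100 "generated.c"
int line_here(void) { return __LINE__; }    /* reported as line 100 */

int in_generated_file(void)
{
    return strcmp(__FILE__, "generated.c") == 0;
}
```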
13. Implicit Declarations
It is not always necessary to specify both the storage class
and the type of identifiers in a declaration. The storage class
is supplied by the context in external definitions and in
declarations of formal parameters and structure members. In a
declaration inside a function, if a storage class but no type is
given, the identifier is assumed to be int; if a type but no
storage class is indicated, the identifier is assumed to be auto.
An exception to the latter rule is made for functions because
auto functions do not exist. If the type of an identifier is
``function returning ...,'' it is implicitly declared to be
extern.
In an expression, an identifier followed by ( and not
already declared is contextually declared to be ``function
returning int.''
14. Types Revisited
This part summarizes the operations which can be performed
on objects of certain types.
14.1. Structures and Unions
Structures and unions may be assigned, passed as arguments
to functions, and returned by functions. Other plausible opera-
tors, such as equality comparison and structure casts, are not
implemented.
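A brief sketch of the permitted operations follows; struct point and the two functions are invented for the example, and modern prototypes are assumed. Since equality comparison is not provided, the comparison is written out member by member.

```c
struct point {
    int x;
    int y;
};

/* Structures are passed by value and may be returned by value. */
struct point shift(struct point p, int dx, int dy)
{
    p.x += dx;          /* changes only the local copy */
    p.y += dy;
    return p;
}

/* Equality is not defined on structures, so compare the members. */
int same_point(struct point a, struct point b)
{
    return a.x == b.x && a.y == b.y;
}
```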
In a reference to a structure or union member, the name on
the right of the -> or the . must specify a member of the aggre-
gate named or pointed to by the expression on the left. In gen-
eral, a member of a union may not be inspected unless the value
of the union has been assigned using that same member. However,
one special guarantee is made by the language in order to sim-
plify the use of unions: if a union contains several structures
that share a common initial sequence and if the union currently
contains one of these structures, it is permitted to inspect the
common initial part of any of the contained structures. For exam-
ple, the following is a legal fragment:
union
{
    struct
    {
        int type;
    } n;
    struct
    {
        int type;
        int intnode;
    } ni;
    struct
    {
        int type;
        float floatnode;
    } nf;
} u;
...
u.nf.type = FLOAT;
u.nf.floatnode = 3.14;
...
if (u.n.type == FLOAT)
    ... sin(u.nf.floatnode) ...
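A self-contained variant of the fragment is sketched below. The type codes INT and FLOAT, the union tag node, and the function node_value are assumptions made for the example; the fragment above does not define them.

```c
enum { INT, FLOAT };        /* hypothetical type codes */

union node {
    struct { int type; } n;
    struct { int type; int intnode; } ni;
    struct { int type; float floatnode; } nf;
};

/* Inspect the common initial member through n, whichever
   structure was actually stored. */
double node_value(union node *u)
{
    if (u->n.type == FLOAT)
        return u->nf.floatnode;
    return u->ni.intnode;
}
```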
14.2. Functions
There are only two things that can be done with a function:
call it or take its address. If the name of a function appears
in an expression not in the function-name position of a call, a
pointer to the function is generated. Thus, to pass one function
to another, one might say
int f();
...
g(f);
Then the definition of g might read
g(funcp)
int (*funcp)();
{
    ...
    (*funcp)();
    ...
}
Notice that f must be declared explicitly in the calling
routine since its appearance in g(f) was not followed by (.
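The same idea in runnable form is sketched below. It uses modern prototypes, so the parameter types of the pointed-to function are spelled out; the function names are invented for the example.

```c
int twice(int x) { return 2 * x; }
int square(int x) { return x * x; }

/* funcp is a pointer to a function taking an int and returning int. */
int apply(int (*funcp)(int), int x)
{
    return (*funcp)(x);     /* call through the pointer */
}

int apply_demo(void)
{
    /* A function name outside the call position yields a pointer. */
    return apply(twice, 5) + apply(square, 5);
}
```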
14.3. Arrays, Pointers, and Subscripting
Every time an identifier of array type appears in an expres-
sion, it is converted into a pointer to the first member of the
array. Because of this conversion, arrays are not lvalues. By
definition, the subscript operator [] is interpreted in such a
way that E1[E2] is identical to *((E1)+(E2)). Because of the
conversion rules which apply to +, if E1 is an array and E2 an
integer, then E1[E2] refers to the E2-th member of E1. Therefore,
despite its asymmetric appearance, subscripting is a commutative
operation.
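The equivalence is easy to check. In this sketch (subscript_demo is a hypothetical name), the three expressions designate the same element.

```c
int subscript_demo(void)
{
    int a[4] = { 10, 20, 30, 40 };
    int i = 2;

    /* All three expressions designate the same element, a[2]. */
    return a[i] == *(a + i) && a[i] == i[a] && i[a] == 30;
}
```

The form i[a] is shown only to illustrate the definition; it is rarely written in practice.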
A consistent rule is followed in the case of multidimen-
sional arrays. If E is an n-dimensional array of rank ixjx...xk,
then E appearing in an expression is converted to a pointer to an
(n-1)-dimensional array with rank jx...xk. If the * operator,
either explicitly or implicitly as a result of subscripting, is
applied to this pointer, the result is the pointed-to (n-1)-
dimensional array, which itself is immediately converted into a
pointer.
For example, consider
int x[3][5];
Here x is a 3x5 array of integers. When x appears in an
expression, it is converted to a pointer to (the first of three)
5-membered arrays of integers. In the expression x[i], which is
equivalent to *(x+i), x is first converted to a pointer as
described; then i is converted to the type of x, which involves
multiplying i by the length of the object to which the pointer
points, namely 5 integer objects. The results are added and
indirection applied to yield an array (of five integers) which in
turn is converted to a pointer to the first of the integers. If
there is another subscript, the same argument applies again; this
time the result is an integer.
Arrays in C are stored row-wise (last subscript varies
fastest) and the first subscript in the declaration helps
determine the amount of storage consumed by an array but plays no
other part in subscript calculations.
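The layout of the 3x5 array x can be checked with a short sketch. The functions offset_of and row_major are invented for the example; the first measures where an element actually lies, the second computes the same position by hand.

```c
int x[3][5];

/* Offset, in ints, of x[i][j] from the start of the array,
   measured through char pointers into the whole object. */
long offset_of(int i, int j)
{
    return (long) (((char *) &x[i][j] - (char *) x) / (long) sizeof(int));
}

/* The same offset computed by hand: only the second bound, 5,
   takes part; the first bound, 3, does not. */
long row_major(int i, int j)
{
    return (long) i * 5 + j;
}
```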
14.4. Explicit Pointer Conversions
Certain conversions involving pointers are permitted but
have implementation-dependent aspects. They are all specified by
means of an explicit type-conversion operator; see ``Unary
Operators'' under ``EXPRESSIONS'' and ``Type Names'' under
``DECLARATIONS.''
A pointer may be converted to any of the integral types
large enough to hold it. Whether an int or long is required is
machine dependent. The mapping function is also machine dependent
but is intended to be unsurprising to those who know the address-
ing structure of the machine. Details for some particular
machines are given below.
An object of integral type may be explicitly converted to a
pointer. The mapping always carries an integer converted from a
pointer back to the same pointer but is otherwise machine depen-
dent.
A pointer to one type may be converted to a pointer to
another type. The resulting pointer may cause addressing excep-
tions upon use if the subject pointer does not refer to an object
suitably aligned in storage. It is guaranteed that a pointer to
an object of a given size may be converted to a pointer to an
object of a smaller size and back again without change.
For example, a storage-allocation routine might accept a
size (in bytes) of an object to allocate, and return a char
pointer; it might be used in this way.
extern char *malloc();
double *dp;
dp = (double *) malloc(sizeof(double));
*dp = 22.0 / 7.0;
The malloc routine must ensure (in a machine-dependent way) that its
return value is suitable for conversion to a pointer to double;
then the use of the function is portable.
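A complete version of this usage is sketched below. In current C, malloc is declared in <stdlib.h> and returns a void pointer, so the cast is kept only to match the fragment above. The function names are hypothetical; round_trip illustrates the guarantee about converting to a pointer to a smaller object and back.

```c
#include <stdlib.h>

/* Allocate one double, store a value through the converted
   pointer, read it back, and release the storage. */
double stored_value(double v)
{
    double *dp;
    double result;

    dp = (double *) malloc(sizeof(double));
    if (dp == NULL)
        return 0.0;             /* allocation failed */
    *dp = v;
    result = *dp;
    free(dp);
    return result;
}

/* A pointer converted to a pointer to a smaller object and back
   again is unchanged. */
int round_trip(void)
{
    double d = 0.0;
    char *cp = (char *) &d;

    return (double *) cp == &d;
}
```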
The pointer representation on the PDP-11 corresponds to a
16-bit integer and measures bytes. The char's have no alignment
requirements; everything else must have an even address.
On the VAX-11, pointers are 32 bits long and measure bytes.
Elementary objects are aligned on a boundary equal to their
length, except that double quantities need be aligned only on
even 4-byte boundaries. Aggregates are aligned on the strictest
boundary required by any of their constituents.
The 3B 20 computer has 24-bit pointers placed into 32-bit
quantities. Most objects are aligned on 4-byte boundaries. Shorts
are aligned in all cases on 2-byte boundaries. Arrays of charac-
ters, all structures, ints, longs, floats, and doubles are
aligned on 4-byte boundaries; but structure members may be packed
tighter.
14.5. Constant Expressions
In several places C requires expressions that evaluate to a
constant: after case, as array bounds, and in initializers. In
the first two cases, the expression can involve only integer con-
stants, character constants, casts to integral types, enumeration
constants, and sizeof expressions, possibly connected by the
binary operators
+ - * / % & | ^ << >> == != < > <= >= && ||
or by the unary operators
- ~
or by the ternary operator
?:
Parentheses can be used for grouping but not for function
calls.
More latitude is permitted for initializers; besides con-
stant expressions as discussed above, one can also use floating
constants and arbitrary casts and can also apply the unary &
operator to external or static objects and to external or static
arrays subscripted with a constant expression. The unary & can
also be applied implicitly by appearance of unsubscripted arrays
and functions. The basic rule is that initializers must evaluate
either to a constant or to the address of a previously declared
external or static object plus or minus a constant.
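The places where constant expressions are required can be illustrated with a sketch. All names are invented for the example, and a modern prototype is assumed for the function.

```c
#define NROWS 4

enum { BASE = 8 };

int table[NROWS * 2 + 1];               /* array bound: constant expression */
char buf[sizeof(long) > 2 ? 16 : 8];    /* sizeof and ?: are allowed */

static int cells[10];
int *third = &cells[0] + 2;             /* initializer: address plus constant */
int *fifth = &cells[4];                 /* & on a static array, constant subscript */

int classify(int c)
{
    switch (c) {
    case 'a' - 'a' + 1:                 /* character constants: value 1 */
        return 10;
    case BASE << 1:                     /* enumeration constant: value 16 */
        return 20;
    case (int) sizeof(char) + 1:        /* cast and sizeof: value 2 */
        return 30;
    default:
        return 0;
    }
}
```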
15. Portability Considerations
Certain parts of C are inherently machine dependent. The
following list of potential trouble spots is not meant to be
all-inclusive but to point out the main ones.
Purely hardware issues like word size and the properties of
floating point arithmetic and integer division have proven in
practice to be not much of a problem. Other facets of the
hardware are reflected in differing implementations. Some of
these, particularly sign extension (converting a negative charac-
ter into a negative integer) and the order in which bytes are
placed in a word, are nuisances that must be carefully watched.
Most of the others are only minor problems.
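The sign-extension nuisance can be sketched as follows. The function names are hypothetical, and the example assumes 8-bit characters.

```c
/* Whether plain char is signed is machine dependent, so this
   result may be negative on one machine and positive on another. */
int widen_plain(char c)
{
    return c;
}

/* Converting through unsigned char first gives a nonnegative
   result everywhere. */
int widen_unsigned(char c)
{
    return (unsigned char) c;
}
```

A program that uses characters as table indices, for instance, is safer with the second form.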
The number of register variables that can actually be placed
in registers varies from machine to machine as does the set of
valid types. Nonetheless, the compilers all do things properly
for their own machine; excess or invalid register declarations
are ignored.
Some difficulties arise only when dubious coding practices
are used. It is exceedingly unwise to write programs that depend
on any of these properties.
The order of evaluation of function arguments is not speci-
fied by the language. The order in which side effects take place
is also unspecified.
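A sketch of the hazard and one way around it follows. The names are invented for the example; next has a side effect, so the order in which its two calls are made is visible in the result.

```c
static int counter = 0;

static int next(void)
{
    return ++counter;
}

static int diff(int a, int b)
{
    return a - b;
}

/* Unspecified: next() may be called for either argument first, so
   the result may be -1 or 1 depending on the compiler. */
int risky(void)
{
    counter = 0;
    return diff(next(), next());
}

/* Portable: separate statements fix the order of the calls. */
int safe(void)
{
    int first, second;

    counter = 0;
    first = next();
    second = next();
    return diff(first, second);     /* always 1 - 2 */
}
```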
Since character constants are really objects of type int,
multicharacter character constants may be permitted. The specific
implementation is very machine dependent because the order in
which characters are assigned to a word varies from one machine
to another.
Fields are assigned to words and characters to integers
right to left on some machines and left to right on other
machines. These differences are invisible to isolated programs
that do not indulge in type punning (e.g., by converting an int
pointer to a char pointer and inspecting the pointed-to storage)
but must be accounted for when conforming to externally-imposed
storage layouts.
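The kind of type punning meant here, and a way to avoid depending on it, can be sketched as follows. The function names are hypothetical; the result of low_byte_first differs from machine to machine, which is exactly the point.

```c
/* Returns 1 if the low-order byte of an int is stored at the
   lowest address, 0 otherwise. */
int low_byte_first(void)
{
    int word = 1;
    char *p = (char *) &word;   /* the type punning described above */

    return *p == 1;
}

/* Layout-independent: pick bytes out by masking, not by address. */
int low_byte(unsigned value)
{
    return (int) (value & 0377);
}
```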
16. Syntax Summary
This summary of C syntax is intended more for aiding
comprehension than as an exact statement of the language.
16.1. Expressions
The basic expressions are:
expression:
primary
* expression
& lvalue
- expression
! expression
~ expression
++ lvalue
-- lvalue
lvalue ++
lvalue --
sizeof expression
sizeof (type-name)
( type-name ) expression
expression binop expression
expression ? expression : expression
lvalue asgnop expression
expression , expression
primary:
identifier
constant
string
( expression )
primary ( expression-listopt )
primary [ expression ]
primary . identifier
primary -> identifier
lvalue:
identifier
primary [ expression ]
lvalue . identifier
primary -> identifier
* expression
( lvalue )
The primary-expression operators
() [] . ->
have highest priority and group left to right. The unary opera-
tors
* & - ! ~ ++ -- sizeof ( type-name )
have priority below the primary operators but higher than any
binary operator and group right to left. Binary operators group
left to right; they have priority decreasing as indicated below.
binop:
* / %
+ -
>> <<
< > <= >=
== !=
&
^
|
&&
||
The conditional operator groups right to left.
Assignment operators all have the same priority and all
group right to left.
asgnop:
= += -= *= /= %= >>= <<= &= ^= |=
The comma operator has the lowest priority and groups left
to right.
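Several of these grouping rules can be checked in a few lines. The following is only a sketch (grouping_demo is a hypothetical name); each test is chosen so that a different grouping would give a different answer.

```c
int grouping_demo(void)
{
    int v[3] = { 5, 6, 7 };
    int *p = v;
    int a, b, first;

    first = *p++;       /* unary operators group right to left: *(p++) */
    a = b = 9;          /* assignment groups right to left: a = (b = 9) */

    return 2 + 3 * 4 == 14              /* * has higher priority than + */
        && 10 - 4 - 3 == 3              /* - groups left to right: (10 - 4) - 3 */
        && first == 5 && p == &v[1]     /* the pointer was incremented, not v[0] */
        && a == 9 && b == 9
        && (1 ? 2 : 0 ? 3 : 4) == 2;    /* ?: groups right to left */
}
```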
16.2. Declarations
declaration:
decl-specifiers init-declarator-listopt ;
decl-specifiers:
type-specifier decl-specifiersopt
sc-specifier decl-specifiersopt
sc-specifier:
auto
static
extern
register
typedef
type-specifier:
struct-or-union-specifier
typedef-name
enum-specifier
basic-type-specifier:
basic-type
basic-type basic-type-specifiers
basic-type:
char
short
int
long
unsigned
float
double
void
enum-specifier:
enum { enum-list }
enum identifier { enum-list }
enum identifier
enum-list:
enumerator
enum-list , enumerator
enumerator:
identifier
identifier = constant-expression
init-declarator-list:
init-declarator
init-declarator , init-declarator-list
init-declarator:
declarator initializeropt
declarator:
identifier
( declarator )
* declarator
declarator ()
declarator [ constant-expressionopt ]
struct-or-union-specifier:
struct { struct-decl-list }
struct identifier { struct-decl-list }
struct identifier
union { struct-decl-list }
union identifier { struct-decl-list }
union identifier
struct-decl-list:
struct-declaration
struct-declaration struct-decl-list
struct-declaration:
type-specifier struct-declarator-list ;
struct-declarator-list:
struct-declarator
struct-declarator , struct-declarator-list
struct-declarator:
declarator
declarator : constant-expression
: constant-expression
initializer:
= expression
= { initializer-list }
= { initializer-list , }
initializer-list:
expression
initializer-list , initializer-list
{ initializer-list }
{ initializer-list , }
type-name:
type-specifier abstract-declarator
abstract-declarator:
empty
( abstract-declarator )
* abstract-declarator
abstract-declarator ()
abstract-declarator [ constant-expressionopt ]
typedef-name:
identifier
16.3. Statements
compound-statement:
{ declaration-listopt statement-listopt }
declaration-list:
declaration
declaration declaration-list
statement-list:
statement
statement statement-list
statement:
compound-statement
expression ;
if ( expression ) statement
if ( expression ) statement else statement
while ( expression ) statement
do statement while ( expression ) ;
for (expopt;expopt;expopt) statement
switch ( expression ) statement
case constant-expression : statement
default : statement
break ;
continue ;
return ;
return expression ;
goto identifier ;
identifier : statement
;
16.4. External definitions
program:
external-definition
external-definition program
external-definition:
function-definition
data-definition
function-definition:
decl-specifiersopt function-declarator function-body
function-declarator:
declarator ( parameter-listopt )
parameter-list:
identifier
identifier , parameter-list
function-body:
declaration-listopt compound-statement
data-definition:
extern declaration ;
static declaration ;
17. Preprocessor
#define identifier token-stringopt
#define identifier(identifier,...)token-stringopt
#undef identifier
#include "filename"
#include <filename>
#if restricted-constant-expression
#ifdef identifier
#ifndef identifier
#else
#endif
#line constant "filename"