BC - An Arbitrary Precision Desk-CalculatorLanguageLorinda Cherry Robert MorrisABSTRACTBC is a language and a compiler for doing arbitrary precision arithmetic on the PDP-11 under the UNIX-time-sharing system. The output of the compiler is interpreted and executed by a collec- tion of routines which can input, output, and do arithmetic on indefinitely large integers and on scaled fixed-point numbers. These routines are themselves based on a dynamic storage allocator. Overflow does not occur until all available core storage is exhausted. The language has a complete control structure as well as immediate-mode operation. Functions can be defined and saved for later execution. Two five hundred-digit numbers can be multi- plied to give a thousand digit result in about ten seconds. A small collection of library functions is also available, including sin, cos, arctan, log, exponential, and Bessel functions of integer order. Some of the uses of this compiler are - to do computation with large integers, - to do computation accurate to many decimal places, - conversion of numbers from one base to another base. _________________________-UNIX is a registered trademark of AT&T Bell Labora- tories in the USA and other countries. December 7, 2021 USD:6-2 BC - An Arbitrary Precision Desk-Calculator LanguageIntroductionBC is a language and a compiler for doing arbitrary precision arithmetic on the UNIX time-sharing system [1]. The compiler was written to make conveniently available a collection of routines (called DC [5]) which are capable of doing arithmetic on integers of arbitrary size. The com- piler is by no means intended to provide a complete program- ming language. It is a minimal language facility. There is a scaling provision that permits the use of decimal point notation. Provision is made for input and out- put in bases other than decimal. Numbers can be converted from decimal to octal by simply setting the output base to equal 8. The actual limit on the number of digits that can be handled depends on the amount of storage available on the machine. Manipulation of numbers with many hundreds of digits is possible even on the smallest versions of UNIX. The syntax of BC has been deliberately selected to agree substantially with the C language [2]. Those who are familiar with C will find few surprises in this language.Simple Computations with IntegersThe simplest kind of statement is an arithmetic expres- sion on a line by itself. For instance, if you type in the line:142857 + 285714the program responds immediately with the line428571The operators -, *, /, %, and ^ can also be used; they indi- cate subtraction, multiplication, division, remaindering, and exponentiation, respectively. Division of integers pro- duces an integer result truncated toward zero. Division by zero produces an error comment. Any term in an expression may be prefixed by a minus sign to indicate that it is to be negated (the 'unary' minus sign). The expression7+-3is interpreted to mean that -3 is to be added to 7. More complex expressions with several operators and with parentheses are interpreted just as in Fortran, with ^ having the greatest binding power, then * and % and /, and December 7, 2021 BC - An Arbitrary Precision Desk-Calculator Language USD:6-3 finally + and -. Contents of parentheses are evaluated before material outside the parentheses. Exponentiations are performed from right to left and the other operators from left to right. The two expressionsa^b^c and a^(b^c)are equivalent, as are the two expressionsa*b*c and (a*b)*cBC shares with Fortran and C the undesirable convention thata/b*cis equivalent to(a/b)*cInternal storage registers to hold numbers have single lower-case letter names. The value of an expression can be assigned to a register in the usual way. The statementx = x + 3has the effect of increasing by three the value of the con- tents of the register named x. When, as in this case, the outermost operator is an =, the assignment is performed but the result is not printed. Only 26 of these named storage registers are available. There is a built-in square root function whose result is truncated to an integer (but see scaling below). The linesx = sqrt(191)xproduce the printed result13BasesThere are special internal quantities, called 'ibase' and 'obase'. The contents of 'ibase', initially set to 10, determines the base used for interpreting numbers read in. For example, the linesibase = 811will produce the output line9December 7, 2021 USD:6-4 BC - An Arbitrary Precision Desk-Calculator Language and you are all set up to do octal to decimal conversions. Beware, however of trying to change the input base back to decimal by typingibase = 10Because the number 10 is interpreted as octal, this state- ment will have no effect. For those who deal in hexadecimal notation, the characters A-F are permitted in numbers (no matter what base is in effect) and are interpreted as digits having values 10-15 respectively. The statementibase = Awill change you back to decimal input base no matter what the current input base is. Negative and large positive input bases are permitted but useless. No mechanism has been pro- vided for the input of arbitrary numbers in bases less than 1 and greater than 16. The contents of 'obase', initially set to 10, are used as the base for output numbers. The linesobase = 161000will produce the output line3E8which is to be interpreted as a 3-digit hexadecimal number. Very large output bases are permitted, and they are some- times useful. For example, large numbers can be output in groups of five digits by setting 'obase' to 100000. Strange (i.e. 1, 0, or negative) output bases are handled appropri- ately. Very large numbers are split across lines with 70 char- acters per line. Lines which are continued end with \. Decimal output conversion is practically instantaneous, but output of very large numbers (i.e., more than 100 digits) with other bases is rather slow. Non-decimal output conver- sion of a one hundred digit number takes about three seconds. It is best to remember that 'ibase' and 'obase' have no effect whatever on the course of internal computation or on the evaluation of expressions, but only affect input and output conversion, respectively.ScalingA third special internal quantity called 'scale' is used to determine the scale of calculated quantities. December 7, 2021 BC - An Arbitrary Precision Desk-Calculator Language USD:6-5 Numbers may have up to a specific number of decimal digits after the decimal point. This fractional part is retained in further computations. We refer to the number of digits after the decimal point of a number as its scale. The current implementation allows scales to be as large as can be represented by a 32-bit unsigned number minus one. This is a non-portable extension. The original implementation allowed for a maximum scale of 99. When two scaled numbers are combined by means of one of the arithmetic operations, the result has a scale determined by the following rules. For addition and subtraction, the scale of the result is the larger of the scales of the two operands. In this case, there is never any truncation of the result. For multiplications, the scale of the result is never less than the maximum of the two scales of the operands, never more than the sum of the scales of the operands and, subject to those two restrictions, the scale of the result is set equal to the contents of the internal quantity 'scale'. The scale of a quotient is the contents of the internal quantity 'scale'. The scale of a remainder is the sum of the scales of the quotient and the divisor. The result of an exponentiation is scaled as if the implied mul- tiplications were performed. An exponent must be an integer. The scale of a square root is set to the maximum of the scale of the argument and the contents of 'scale'. All of the internal operations are actually carried out in terms of integers, with digits being discarded when necessary. In every case where digits are discarded, trunca- tion and not rounding is performed. The contents of 4294967294 and no less than 0. It is initially set to 0. The internal quantities 'scale', 'ibase', and 'obase' can be used in expressions just like other variables. The linescale = scale + 1increases the value of 'scale' by one, and the linescalecauses the current value of 'scale' to be printed. The value of 'scale' retains its meaning as a number of decimal digits to be retained in internal computation even when 'ibase' or 'obase' are not equal to 10. The internal computations (which are still conducted in decimal, regard- less of the bases) are performed to the specified number of decimal digits, never hexadecimal or octal or any other kind of digits. December 7, 2021 USD:6-6 BC - An Arbitrary Precision Desk-Calculator LanguageFunctionsThe name of a function is a single lower-case letter. Function names are permitted to collide with simple variable names. Twenty-six different defined functions are permitted in addition to the twenty-six variable names. The linedefine a(x){begins the definition of a function with one argument. This line must be followed by one or more statements, which make up the body of the function, ending with a right brace }. Return of control from a function occurs when a return statement is executed or when the end of the function is reached. The return statement can take either of the two formsreturnreturn(x)In the first case, the value of the function is 0, and in the second, the value of the expression in parentheses. Variables used in the function can be declared as automatic by a statement of the formauto x,y,zThere can be only one 'auto' statement in a function and it must be the first statement in the definition. These automatic variables are allocated space and initialized to zero on entry to the function and thrown away on return. The values of any variables with the same names outside the function are not disturbed. Functions may be called recur- sively and the automatic variables at each level of call are protected. The parameters named in a function definition are treated in the same way as the automatic variables of that function with the single exception that they are given a value on entry to the function. An example of a function definition isdefine a(x,y){auto zz = x*yreturn(z)}The value of this function, when called, will be the product of its two arguments. A function is called by the appearance of its name fol- lowed by a string of arguments enclosed in parentheses and separated by commas. The result is unpredictable if the wrong number of arguments is used. December 7, 2021 BC - An Arbitrary Precision Desk-Calculator Language USD:6-7 Functions with no arguments are defined and called using parentheses with nothing between them: b(). If the functionaabove has been defined, then the linea(7,3.14)would cause the result 21.98 to be printed and the linex = a(a(3,4),5)would cause the value of x to become 60.Subscripted VariablesA single lower-case letter variable name followed by an expression in brackets is called a subscripted variable (an array element). The variable name is called the array name and the expression in brackets is called the subscript. Only one-dimensional arrays are permitted. The names of arrays are permitted to collide with the names of simple variables and function names. Any fractional part of a subscript is discarded before use. Subscripts must be greater than or equal to zero and less than or equal to 2047. Subscripted variables may be freely used in expres- sions, in function calls, and in return statements. An array name may be used as an argument to a function, or may be declared as automatic in a function definition by the use of empty brackets:f(a[])define f(a[])auto a[]When an array name is so used, the whole contents of the array are copied for the use of the function, and thrown away on exit from the function. Array names which refer to whole arrays cannot be used in any other contexts.Control StatementsThe 'if', the 'while', and the 'for' statements may be used to alter the flow within programs or to cause itera- tion. The range of each of them is a statement or a compound statement consisting of a collection of statements enclosed in braces. They are written in the following wayif(relation) statementif(relation) statement else statementwhile(relation) statementfor(expression1; relation; expression2) statementDecember 7, 2021 USD:6-8 BC - An Arbitrary Precision Desk-Calculator Language orif(relation) {statements}if(relation) {statements} else {statements}while(relation) {statements}for(expression1; relation; expression2) {statements}A relation in one of the control statements is an expression of the formx>ywhere two expressions are related by one of the six rela- tional operators '<', '>', '<=', '>=', '==', or '!='. The relation '==' stands for 'equal to' and '!=' stands for 'not equal to'. The meaning of the remaining relational operators is clear. BEWARE of using '=' instead of '==' in a relational. Unfortunately, both of them are legal, so you will not get a diagnostic message, but '=' really will not do a comparison. The 'if' statement causes execution of its range if and only if the relation is true. Then control passes to the next statement in sequence. If an 'else' branch is present, the statements in this branch are executed if the relation is false. The 'else' keyword is a non-portable extension. The 'while' statement causes execution of its range repeatedly as long as the relation is true. The relation is tested before each execution of its range and if the rela- tion is false, control passes to the next statement beyond the range of the while. The 'for' statement begins by executing 'expression1'. Then the relation is tested and, if true, the statements in the range of the 'for' are executed. Then 'expression2' is executed. The relation is tested, and so on. The typical use of the 'for' statement is for a controlled iteration, as in the statementfor(i=1; i<=10; i=i+1) iwhich will print the integers from 1 to 10. Here are some examples of the use of the control statements.define f(n){auto i, xx=1for(i=1; i<=n; i=i+1) x=x*ireturn(x)}December 7, 2021 BC - An Arbitrary Precision Desk-Calculator Language USD:6-9 The linef(a)will printafactorial ifais a positive integer. Here is the definition of a function which will compute values of the binomial coefficient (m and n are assumed to be positive integers).define b(n,m){auto x, jx=1for(j=1; j<=m; j=j+1) x=x*(n-j+1)/jreturn(x)}The following function computes values of the exponential function by summing the appropriate series without regard for possible truncation errors:scale = 20define e(x){auto a, b, c, d, na = 1b = 1c = 1d = 0n = 1while(1==1){a = a*xb = b*nc = c + a/bn = n + 1if(c==d) return(c)d = c}}Some DetailsThere are some language features that every user should know about even if he will not use them. Normally statements are typed one to a line. It is also permissible to type several statements on a line separated by semicolons. If an assignment statement is parenthesized, it then has a value and it can be used anywhere that an expression can. For example, the line(x=y+17)December 7, 2021 USD:6-10BC - An Arbitrary Precision Desk-Calculator Language not only makes the indicated assignment, but also prints the resulting value. Here is an example of a use of the value of an assign- ment statement even when it is not parenthesized.x = a[i=i+1]causes a value to be assigned to x and also increments i before it is used as a subscript. The following constructs work in BC in exactly the same manner as they do in the C language. Consult the appendix or the C manuals [2] for their exact workings.x=y=z is the same asx=(y=z)x += y x = x+yx -= y x = x-yx *= y x = x*yx /= y x = x/yx %= y x = x%yx ^= y x = x^yx++ (x=x+1)-1x-- (x=x-1)+1++x x = x+1--x x = x-1Even if you don't intend to use the constructs, if you type one inadvertently, something correct but unexpected may hap- pen.Three Important Things1. To exit a BC program, type 'quit'. 2. There is a comment convention identical to that of C and of PL/I. Comments begin with '/*' and end with '*/'. As a non-portable extension, comments may also start with a '#' and end with a newline. The newline is not part of the com- ment. 3. There is a library of math functions which may be obtained by typing at command levelbc -lThis command will load a set of library functions which, at the time of writing, consists of sine (named 's'), cosine ('c'), arctangent ('a'), natural logarithm ('l'), exponen- tial ('e') and Bessel functions of integer order ('j(n,x)'). Doubtless more functions will be added in time. The library sets the scale to 20. You can reset it to something else if you like. The design of these mathematical library routines is discussed elsewhere [3]. December 7, 2021 BC - An Arbitrary Precision Desk-Calculator LanguageUSD:6-11 If you typebc file ...BC will read and execute the named file or files before accepting commands from the keyboard. In this way, you may load your favorite programs and function definitions.AcknowledgementThe compiler is written in YACC [4]; its original ver- sion was written by S. C. Johnson.References[1] K. Thompson and D. M. Ritchie,UNIX Programmer'sManual, Bell Laboratories, 1978. [2] B. W. Kernighan and D. M. Ritchie,The C ProgrammingLanguage, Prentice-Hall, 1978. [3] R. Morris,A Library of Reference Standard MathematicalSubroutines, Bell Laboratories internal memorandum, 1975. [4] S. C. Johnson,YACC - Yet Another Compiler-Compiler. Bell Laboratories Computing Science Technical Report #32, 1978. [5] R. Morris and L. L. Cherry,DC - An Interactive DeskCalculator. December 7, 2021 USD:6-12BC - An Arbitrary Precision Desk-Calculator Language Appendix1. NotationIn the following pages syntactic categories are initalics; literals are inbold; material in brackets [] is optional.2. TokensTokens consist of keywords, identifiers, constants, operators, and separators. Token separators may be blanks, tabs or comments. Newline characters or semicolons separate statements.2.1. CommentsComments are introduced by the characters /* and ter- minated by */. As a non-portable extension, comments may also start with a # and end with a newline. The newline is not part of the comment.2.2. IdentifiersThere are three kinds of identifiers - ordinary iden- tifiers, array identifiers and function identifiers. All three types consist of single lower-case letters. Array identifiers are followed by square brackets, possibly enclosing an expression describing a subscript. Arrays are singly dimensioned and may contain up to 2048 elements. Indexing begins at zero so an array may be indexed from 0 to 2047. Subscripts are truncated to integers. Function iden- tifiers are followed by parentheses, possibly enclosing arguments. The three types of identifiers do not conflict; a program can have a variable namedx, an array namedxand a function namedx, all of which are separate and distinct.2.3. KeywordsThe following are reserved keywords:ibaseifobasebreakscaledefinesqrt autolengthreturnwhilequitfor continueelse last2.4. ConstantsConstants consist of arbitrarily long numbers with an optional decimal point. The hexadecimal digitsA-Fare also recognized as digits with values 10-15, respectively.3. ExpressionsThe value of an expression is printed unless the main operator is an assignment. The value printed is assigned to the special variablelast. A single dot may be used as a synonym forlast. This is a non-portable extension. Pre- cedence is the same as the order of presentation here, with highest appearing first. Left or right associativity, where applicable, is discussed with each operator. December 7, 2021 USD:6-14BC - An Arbitrary Precision Desk-Calculator Language3.1. Primitive expressions3.1.1. Named expressionsNamed expressions are places where values are stored. Simply stated, named expressions are legal on the left side of an assignment. The value of a named expression is the value stored in the place named.3.1.1.1. identifiersSimple identifiers are named expressions. They have an initial value of zero.3.1.1.2. array-name[expression] Array elements are named expressions. They have an ini- tial value of zero.3.1.1.3.scale,ibaseandobaseThe internal registersscale,ibaseandobaseare all named expressions.scaleis the number of digits after the decimal point to be retained in arithmetic operations.scalehas an initial value of zero.ibaseandobaseare the input and output number radix respectively. Bothibaseandobasehave initial values of 10.3.1.2. Function calls3.1.2.1. function-name([expression[,expression...]])A function call consists of a function name followed by parentheses containing a comma-separated list of expres- sions, which are the function arguments. A whole array passed as an argument is specified by the array name fol- lowed by empty square brackets. All function arguments are passed by value. As a result, changes made to the formal parameters have no effect on the actual arguments. If the function terminates by executing a return statement, the value of the function is the value of the expression in the parentheses of the return statement or is zero if no expres- sion is provided or if there is no return statement.3.1.2.2. sqrt(expression) The result is the square root of the expression. The result is truncated in the least significant decimal place. The scale of the result is the scale of the expression or the value ofscale,whichever is larger.3.1.2.3. length(expression) The result is the total number of significant decimal December 7, 2021 BC - An Arbitrary Precision Desk-Calculator LanguageUSD:6-15 digits in the expression. The scale of the result is zero.3.1.2.4. scale(expression) The result is the scale of the expression. The scale of the result is zero.3.1.3. ConstantsConstants are primitive expressions.3.1.4. ParenthesesAn expression surrounded by parentheses is a primitive expression. The parentheses are used to alter the normal precedence.3.2. Unary operatorsThe unary operators bind right to left.3.2.1. -expressionThe result is the negative of the expression.3.2.2. ++named-expressionThe named expression is incremented by one. The result is the value of the named expression after incrementing.3.2.3. --named-expressionThe named expression is decremented by one. The result is the value of the named expression after decrementing.3.2.4. named-expression++ The named expression is incremented by one. The result is the value of the named expression before incrementing.3.2.5. named-expression--The named expression is decremented by one. The result is the value of the named expression before decrementing.3.3. Exponentiation operatorThe exponentiation operator binds right to left.3.3.1. expression ^ expressionThe result is the first expression raised to the power of the second expression. The second expression must be an integer. Ifais the scale of the left expression andbis December 7, 2021 USD:6-16BC - An Arbitrary Precision Desk-Calculator Language the absolute value of the right expression, then the scale of the result is: min(axb,max(scale,a))3.4. Multiplicative operatorsThe operators *, /, % bind left to right.3.4.1. expression * expressionThe result is the product of the two expressions. Ifaandbare the scales of the two expressions, then the scale of the result is: min(a+b,max(scale,a,b))3.4.2. expression / expressionThe result is the quotient of the two expressions. The scale of the result is the value ofscale.3.4.3. expression % expressionThe % operator produces the remainder of the division of the two expressions. More precisely,a%bisa-a/b*b. The scale of the result is the sum of the scale of the divisor and the value ofscale3.5. Additive operatorsThe additive operators bind left to right.3.5.1. expression+expressionThe result is the sum of the two expressions. The scale of the result is the maximum of the scales of the expres- sions.3.5.2. expression - expressionThe result is the difference of the two expressions. The scale of the result is the maximum of the scales of the expressions.3.6. assignment operatorsThe assignment operators bind right to left.3.6.1. named-expression=expressionThis expression results in assigning the value of the expression on the right to the named expression on the left. December 7, 2021 BC - An Arbitrary Precision Desk-Calculator LanguageUSD:6-173.6.2. named-expression+=expression3.6.3. named-expression-=expression3.6.4. named-expression*=expression3.6.5. named-expression/=expression3.6.6. named-expression%=expression3.6.7. named-expression^=expressionThe result of the above expressions is equivalent to "named expression = named expression OP expression", where OP is the operator after the = sign.4. RelationsUnlike all other operators, the relational operators are only valid as the object of anif,while, or inside aforstatement.4.1. expression<expression4.2. expression>expression4.3. expression<=expression4.4. expression>=expression4.5. expression==expression4.6. expression!=expression5. Storage classesThere are only two storage classes in BC, global and automatic (local). Only identifiers that are to be local to a function need be declared with theautocommand. The argu- ments to a function are local to the function. All other identifiers are assumed to be global and available to all functions. All identifiers, global and local, have initial values of zero. Identifiers declared asautoare allocated on entry to the function and released on returning from the function. They therefore do not retain values between func- tion calls.autoarrays are specified by the array name fol- lowed by empty square brackets. Automatic variables in BC do not work in exactly the same way as in either C or PL/I. On entry to a function, the old values of the names that appear as parameters and as automatic variables are pushed onto a stack. Until return is made from the function, reference to these names refers only to the new values. December 7, 2021 USD:6-18BC - An Arbitrary Precision Desk-Calculator Language6. StatementsStatements must be separated by semicolon or newline. Except where altered by control statements, execution is sequential.6.1. Expression statementsWhen a statement is an expression, unless the main operator is an assignment, the value of the expression is printed, followed by a newline character.6.2. Compound statementsStatements may be grouped together and used when one statement is expected by surrounding them with { }.6.3. Quoted string statements"any string" This statement prints the string inside the quotes.6.4. If statementsif(relation)statementThe substatement is executed if the relation is true.6.5. If-else statementsif(relation)statementelsestatementThe first substatement is executed if the relation is true, the second substatement if the relation is false. Theif-elsestatement is a non-portable extension.6.6. While statementswhile(relation)statementThe statement is executed while the relation is true. The test occurs before each execution of the statement.6.7. For statementsfor(expression;relation;expression)statementTheforstatement is the same asfirst-expressionwhile(relation) {statementlast-expression} All three expressions may be left out. This is a non- portable extension. December 7, 2021 BC - An Arbitrary Precision Desk-Calculator LanguageUSD:6-196.8. Break statementsbreakbreakcauses termination of afororwhilestatement.6.9. Continue statementscontinuecontinuecauses the next iteration of afororwhilestatement to start, skipping the remainder of the loop. For awhilestatement, execution continues with the evaluation of the condition. For aforstatement, execution continues with evaluation of the last-expression. Thecontinuestate- ment is a non-portable extension.6.10. Auto statementsautoidentifier[,identifier] Theautostatement causes the values of the identifiers to be pushed down. The identifiers can be ordinary identif- iers or array identifiers. Array identifiers are specified by following the array name by empty square brackets. The auto statement must be the first statement in a function definition.6.11. Define statementsdefine([parameter[,parameter...]]){statements}Thedefinestatement defines a function. The parameters may be ordinary identifiers or array names. Array names must be followed by empty square brackets. As a non-portable extension, the opening brace may also appear on the next line.6.12. Return statementsreturnreturn(expression)Thereturnstatement causes termination of a function, popping of its auto variables, and specifies the result of the function. The first form is equivalent toreturn(0). The result of the function is the result of the expression in parentheses. Leaving out the expression between parentheses is equivalent toreturn(0). As a non-portable extension, the parentheses may be left out.6.13. PrintThe6.14. QuitThequitstatement stops execution of a BC program and returns control to UNIX when it is first encountered. Because it is not treated as an executable statement, it cannot be used in a function definition or in anif, for,orwhilestatement. December 7, 2021