MirOS Manual: 23.eqn(USD)


A System for Typesetting Mathematics                     USD:26-1

              A System for Typesetting Mathematics

            Brian W. Kernighan and Lorinda L. Cherry

                     AT&T Bell Laboratories
                  Murray Hill, New Jersey 07974

                            ABSTRACT

          This paper describes the design and implementation
     of  a  system for typesetting mathematics. The language
     has been designed to be easy to learn  and  to  use  by
     people  (for example, secretaries and mathematical typ-
     ists) who know  neither  mathematics  nor  typesetting.
     Experience  indicates  that the language can be learned
     in an hour or so, for it has few rules and fewer excep-
     tions.  For  typical  expressions,  the  size  and font
     changes, positioning, line drawing, and the like neces-
     sary to print according to mathematical conventions are
     all done automatically. For example, the input
          sum from i=0 to infinity x sub i = pi over 2
     produces
          $$\sum_{i=0}^{\infty} x_i = \frac{\pi}{2}$$
          The syntax of the language is specified by a small
     context-free  grammar;  a  compiler-compiler is used to
     make a compiler  that  translates  this  language  into
     typesetting  commands. Output may be produced on either
     a phototypesetter or on a  terminal  with  forward  and
     reverse   half-line   motions.  The  system  interfaces
     directly with text formatting programs, so mixtures  of
     text and mathematics may be handled simply.

          This paper is a revision  of  a  paper  originally
     published in CACM, March, 1975.

                                   1. Introduction

                          July 4, 2014

     ``Mathematics is known in the trade as difficult, or penalty,
copy because it is slower, more difficult, and more expensive to
set in type than any other kind of copy normally occurring in
books and journals.'' [1]

     One difficulty with mathematical text is the multiplicity of
characters, sizes, and fonts. An expression such as

   $$\lim_{x \to \pi/2} (\tan x)^{\sin 2x} = 1$$

requires an intimate mixture of roman, italic and greek letters,
in three sizes, and a special character or two. (``Requires'' is
perhaps the wrong word, but mathematics has its own typographical
conventions which are quite different from those of ordinary
text.) Typesetting such an expression by traditional methods is
still an essentially manual operation.

     A second difficulty is the two-dimensional character of
mathematics, which the superscript and limits in the preceding
example showed in its simplest form. This is carried further by

   $$a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cdots}}}$$

and still further by

   $$\int \frac{dx}{a e^{mx} - b e^{-mx}} =
     \begin{cases}
       \dfrac{1}{2m\sqrt{ab}} \log \dfrac{\sqrt{a}\,e^{mx} - \sqrt{b}}{\sqrt{a}\,e^{mx} + \sqrt{b}} \\[2ex]
       \dfrac{1}{m\sqrt{ab}} \tanh^{-1}\left( \dfrac{\sqrt{a}}{\sqrt{b}}\, e^{mx} \right) \\[2ex]
       \dfrac{-1}{m\sqrt{ab}} \coth^{-1}\left( \dfrac{\sqrt{a}}{\sqrt{b}}\, e^{mx} \right)
     \end{cases}$$

These examples also show line-drawing, built-up characters like
braces and radicals, and a spectrum of positioning problems.
(Section 6 shows what a user has to type to produce these on our
system.)

2. Photocomposition

     Photocomposition techniques can be used to solve some of the
problems of typesetting mathematics. A phototypesetter is a
device which exposes a piece of photographic paper or film,
placing characters wherever they are wanted. The Graphic Systems
phototypesetter[2] on the UNIX operating system[3] works by
shining light through a character stencil. The character is made
the right size by lenses, and the light beam directed by fiber
optics to the desired place on a piece of photographic paper. The
exposed paper is developed and typically used in some form of
photo-offset reproduction.

     On UNIX, the phototypesetter is driven by a formatting
program called TROFF [4]. TROFF was designed for setting running
text. It also provides all of the facilities that one needs for
doing mathematics, such as arbitrary horizontal and vertical
motions, line-drawing, size changing, but the syntax for
describing these special operations is difficult to learn, and
difficult even for experienced users to type correctly.

     For this reason we decided to use TROFF as an ``assembly
language,'' by designing a language for describing mathematical
expressions, and compiling it into TROFF.

3. Language Design

     The fundamental principle upon which we based our language
design is that the language should be easy to use by people (for
example, secretaries) who know neither mathematics nor
typesetting.

     This principle implies several things. First, ``normal''
mathematical conventions about operator precedence,
parentheses, and the like cannot be used, for to give special
meaning to such characters means that the user has to understand
what he or she is typing. Thus the language should not assume,
for instance, that parentheses are always balanced, for they are
not in the half-open interval (a,b]. Nor should it assume that
$\sqrt{a+b}$ can be replaced by $(a+b)^{1/2}$, or that $1/(1-x)$
is better written as $\frac{1}{1-x}$ (or vice versa).

     Second, there should be relatively few rules, keywords,
special symbols and operators, and the like. This keeps the
language easy to learn and remember. Furthermore, there should be
few exceptions to the rules that do exist: if something works in
one situation, it should work everywhere. If a variable can have
a subscript, then a subscript can have a subscript, and so on
without limit.

     Third, ``standard'' things should happen automatically.
Someone who types ``x=y+z+1'' should get ``$x=y+z+1$''.
Subscripts and superscripts should automatically be printed in an
appropriately smaller size, with no special intervention.
Fraction bars have to be made the right length and positioned at
the right height. And so on. Indeed a mechanism for overriding
default actions has to exist, but its application is the
exception, not the rule.

     We assume that the typist has a reasonable picture (a
two-dimensional representation) of the desired final form, as
might be handwritten by the author of a paper. We also assume
that the input is typed on a computer terminal much like an
ordinary typewriter. This implies an input alphabet of perhaps
100 characters, none of them special.

     A secondary, but still important, goal in our design was
that the system should be easy to implement, since neither of the
authors had any desire to make a long-term project of it. Since
our design was not firm, it was also necessary that the program
be easy to change at any time.

     To make the program easy to build and to change, and to
guarantee regularity (``it should work everywhere''), the
language is defined by a context-free grammar, described in
Section 5. The compiler for the language was built using a
compiler-compiler.

     A priori, the grammar/compiler-compiler approach seemed the
right thing to do. Our subsequent experience leads us to believe
that any other course would have been folly. The original
language was designed in a few days. Construction of a working
system sufficient to try significant examples required perhaps a
person-month. Since then, we have spent a modest amount of
additional time over several years tuning, adding facilities, and
occasionally changing the language as users make criticisms and
suggestions.

     We also decided quite early that we would let TROFF do our
work for us whenever possible. TROFF is quite a powerful program,
with a macro facility, text and arithmetic variables, numerical
computation and testing, and conditional branching. Thus we have
been able to avoid writing a lot of mundane but tricky software.
For example, we store no text strings, but simply pass them on to
TROFF. Thus we avoid having to write a storage management
package. Furthermore, we have been able to isolate ourselves from
most details of the particular device and character set currently
in use. For example, we let TROFF compute the widths of all
strings of characters; we need know nothing about them.

     A third design goal is special to our environment. Since our
program is only useful for typesetting mathematics, it is
necessary that it interface cleanly with the underlying
typesetting language for the benefit of users who want to set
intermingled mathematics and text (the usual case). The standard
mode of operation is that when a document is typed, mathematical
expressions are input as part of the text, but marked by
user-settable delimiters. The program reads this input and treats
as comments those things which are not mathematics, simply
passing them
through untouched. At the same time it converts the mathematical
input into the necessary TROFF commands. The resulting output is
passed directly to TROFF where the comments and the mathematical
parts both become text and/or TROFF commands.

4. The Language

     We will not try to describe the language precisely here;
interested readers may refer to the appendix for more details.
Throughout this section, we will write expressions exactly as
they are handed to the typesetting program (hereinafter called
``EQN''), except that we won't show the delimiters that the user
types to mark the beginning and end of the expression. The
interface between EQN and TROFF is described at the end of this
section.

     As we said, typing x=y+z+1 should produce $x=y+z+1$, and
indeed it does. Variables are made italic, operators and digits
become roman, and normal spacings between letters and operators
are altered slightly to give a
more pleasing appearance.

     Input is free-form. Spaces and new lines in the input are
used by EQN to separate pieces of the input; they are not used to
create space in the output. Thus

   x    =    y
      + z + 1

also gives $x=y+z+1$. Free-form input is easier to type
initially; subsequent editing is also easier, for an expression
may be typed as many short lines.

     Extra white space can be forced into the output by several
characters of various sizes. A tilde ``~'' gives a space equal to
the normal word spacing in text; a circumflex gives half this
much, and a tab character spaces to the next tab stop.

     Spaces (or tildes, etc.) also serve to delimit pieces of the
input. For example, to get

   $$f(t) = 2\pi \int \sin(\omega t)\,dt$$

we write

   f(t) = 2 pi int sin ( omega t )dt

Here spaces are necessary in the input to indicate that
sin, pi, int, and omega are special, and potentially worth
special treatment. EQN looks up each such string of characters in
a table, and if appropriate gives it a translation. In this case,
pi and omega become their greek equivalents, int becomes the
integral sign (which must be moved down and enlarged so it looks
``right''), and sin is made roman, following conventional
mathematical practice. Parentheses, digits and operators are
automatically made roman wherever found.

     Fractions are specified with the keyword over:

   a+b over c+d+e = 1

produces

   $$\frac{a+b}{c+d+e} = 1$$

     Similarly, subscripts and superscripts are introduced by the
keywords sub and sup:

   $$x^2 + y^2 = z^2$$

is produced by

   x sup 2 + y sup 2 = z sup 2

The spaces after the 2's are necessary to mark the end of the
superscripts; similarly
the keyword sup has to be marked off by spaces or some equivalent
delimiter. The return to the proper baseline is automatic.
Multiple levels of subscripts or superscripts are of course
allowed: ``x sup y sup z'' is $x^{y^z}$. The construct
``something sub something sup something'' is recognized as a
special case, so ``x sub i sup 2'' is $x_i^2$ instead of
$x_{i^2}$.

     More complicated expressions can now be formed with these
primitives:

   $$\frac{\partial^2 f}{\partial x^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}$$

is produced by

   {partial sup 2 f} over {partial x sup 2} =
   x sup 2 over a sup 2 + y sup 2 over b sup 2

Braces {} are used to group objects together; in this case they
indicate unambiguously what goes over what on the left-hand side
of the expression. The language defines the precedence of sup to
be higher than that of over, so no braces are needed to get the
correct association on the right side. Braces can always be used
when in doubt about precedence.

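The way precedence and braces interact can be illustrated with a
small sketch in Python. This is not EQN's implementation (the
paper states that the real parser is generated by a
compiler-compiler from a grammar); it is a hand-written
recursive-descent parser covering only braces, sup and over, and
it emits LaTeX notation purely so the grouping is visible. The
names tokenize, Parser and to_latex are invented for this
example, and the associativity it assumes (over to the left, sup
to the right) is the one the paper gives in Section 5.

```python
import re


def tokenize(src):
    # Braces delimit themselves; everything else is split on white space.
    return re.findall(r"[{}]|[^\s{}]+", src)


class Parser:
    """Hypothetical mini-parser: braces, `sup`, and `over` only."""

    def __init__(self, src):
        self.toks = tokenize(src)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self):
        tok = self.toks[self.pos]  # IndexError on truncated input
        self.pos += 1
        return tok

    def primary(self):
        # A primary is a single token or any expression in braces.
        tok = self.take()
        if tok == "{":
            parts = []
            while self.peek() != "}":
                parts.append(self.fraction())
            self.take()  # consume the closing brace
            return " ".join(parts)
        return tok

    def power(self):
        # `sup` binds tighter than `over` and associates to the right.
        base = self.primary()
        if self.peek() == "sup":
            self.take()
            return base + "^{" + self.power() + "}"
        return base

    def fraction(self):
        # `over` has lower precedence and associates to the left.
        left = self.power()
        while self.peek() == "over":
            self.take()
            left = "\\frac{" + left + "}{" + self.power() + "}"
        return left


def to_latex(src):
    parser = Parser(src)
    parts = []
    while parser.peek() is not None:
        parts.append(parser.fraction())
    return " ".join(parts)


print(to_latex("a sup 2 over b"))    # \frac{a^{2}}{b}
print(to_latex("a sup {2 over b}"))  # a^{\frac{2}{b}}
```

Without braces the superscript attaches to a before the fraction
is formed; with braces the whole fraction becomes the
superscript. The sketch does no error recovery and ignores sub,
sqrt, piles and the other constructs.
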
     The braces convention is an example of the power of using a
recursive grammar to define the language. It is part of the
language that if a construct can appear in some context, then any
expression in braces can also occur in that context.

     There is a sqrt operator for making square roots of the
appropriate size: ``sqrt a+b'' produces $\sqrt{a+b}$, and

   x = {-b +- sqrt{b sup 2 -4ac}} over 2a

is

   $$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Since large radicals look poor on our typesetter, sqrt is not
useful for tall expressions.

     Limits on summations, integrals and similar constructions
are specified with the keywords from and to. To get

   $$\sum_{i=0}^{\infty} x_i \to 0$$

we need only type

   sum from i=0 to inf x sub i -> 0

Centering and making the $\Sigma$ big enough and the limits
smaller are all automatic. The from and to parts are both
optional, and the central part
(e.g., the $\Sigma$) can in fact be anything:

   lim from {x -> pi /2} ( tan~x) = inf

is

   $$\lim_{x \to \pi/2} (\tan x) = \infty$$

Again, the braces indicate just what goes into the from part.

     There is a facility for making braces, brackets,
parentheses, and vertical bars of the right height, using the
keywords left and right:

   left [ x+y over 2a right ]~=~1

makes

   $$\left[ \frac{x+y}{2a} \right] = 1$$

A left need not have a corresponding right, as we shall see in
the next example. Any characters may follow left and right, but
generally only various parentheses and bars are meaningful.

     Big brackets, etc., are often used with another facility,
called piles, which make vertical piles of objects. For example,
to get

   $$sign(x) \equiv \left\{ \begin{array}{rll}
        1 & \mathit{if} & x>0 \\
        0 & \mathit{if} & x=0 \\
       -1 & \mathit{if} & x<0
     \end{array} \right.$$

we can type

   sign (x) ~==~ left {
      rpile {1 above 0 above -1}
      ~~lpile {if above if above if}
      ~~lpile {x>0 above x=0 above x<0}

The construction ``left {'' makes a left brace big enough to
enclose the ``rpile {...}'', which is a right-justified pile of
``above ... above ...''. ``lpile'' makes a left-justified pile.
There are also centered piles. Because of the recursive language
definition, a pile can contain any number of elements; any
element of a pile can of course contain piles.

     Although EQN makes a valiant attempt to use the right sizes
and fonts, there are times when the default assumptions are
simply not what is wanted. For instance the italic sign in the
previous example would conventionally be in roman. Slides and
transparencies often require larger characters than normal text.
Thus we also provide size and font changing commands: ``size 12
bold {A~x~=~y}'' will produce
$\mathbf{A\ x\ =\ y}$. Size is followed by a number representing
a character size in points. (One point is 1/72 inch; this paper
is set in 9 point type.)

     If necessary, an input string can be quoted in "...", which
turns off grammatical significance, and any font or spacing
changes that might otherwise be done on it. Thus we can say

   lim~ roman "sup" ~x sub n = 0

to ensure that the supremum doesn't become a superscript:

   $$\lim\ \mathrm{sup}\ x_n = 0$$

     Diacritical marks, long a problem in traditional
typesetting, are straightforward:

   $$\underline{\dot{x}} + \hat{x} + \tilde{y} + \hat{X} + \ddot{Y} = \overline{z+Z}$$

is made by typing

   x dot under + x hat + y tilde
   + X hat + Y dotdot = z+Z bar

     There are also facilities for globally changing default
sizes and fonts, for example for making viewgraphs or for setting
chemical equations. The language allows for matrices, and for
lining up
equations at the same horizontal position.

     Finally, there is a definition facility, so a user can say

   define name "..."

at any time in the document; henceforth, any occurrence of the
token ``name'' in an expression will be expanded into whatever
was inside the double quotes in its definition. This lets users
tailor the language to their own specifications, for it is quite
possible to redefine keywords like sup or over. Section 6 shows
an example of definitions.

     The EQN preprocessor reads intermixed text and equations,
and passes its output to TROFF. Since TROFF uses lines beginning
with a period as control words (e.g., ``.ce'' means ``center the
next output line''), EQN uses the sequence ``.EQ'' to mark the
beginning of an equation and ``.EN'' to mark the end. The ``.EQ''
and ``.EN'' are passed through to TROFF untouched, so they can
also be used by a knowledgeable user
to center equations, number them automatically, etc. By default,
however, ``.EQ'' and ``.EN'' are simply ignored by TROFF, so by
default equations are printed in-line.

     ``.EQ'' and ``.EN'' can be supplemented by TROFF commands as
desired; for example, a centered display equation can be produced
with the input:

   .ce
   .EQ
   x sub i = y sub i ...
   .EN

     Since it is tedious to type ``.EQ'' and ``.EN'' around very
short expressions (single letters, for instance), the user can
also define two characters to serve as the left and right
delimiters of expressions. These characters are recognized
anywhere in subsequent text. For example if the left and right
delimiters have both been set to ``#'', the input:

   Let #x sub i#, #y# and #alpha# be positive

produces:

   Let $x_i$, $y$ and $\alpha$ be positive

     Running a preprocessor is strikingly easy on UNIX. To
typeset text stored in file ``f'', one issues the command:

   eqn f | troff

The vertical bar connects the output of one process (EQN) to the
input of another (TROFF).

5. Language Theory

     The basic structure of the language is not a particularly
original one. Equations are pictured as a set of ``boxes,''
pieced together in various ways. For example, something with a
subscript is just a box followed by another box moved downward
and shrunk by an appropriate amount. A fraction is just a box
centered above another box, at the right altitude, with a line of
correct length drawn
between them.

     The grammar for the language is shown below. For purposes of
exposition, we have collapsed some productions. In the original
grammar, there are about 70 productions, but many of these are
simple ones used only to guarantee that some keyword is
recognized early enough in the parsing process. Symbols in
capital letters are terminal symbols; lower case symbols are
non-terminals, i.e., syntactic categories. The vertical bar |
indicates an alternative; the brackets [ ] indicate optional
material. A TEXT is a string of non-blank characters or any
string inside double quotes; the other terminal symbols represent
literal occurrences of the corresponding keyword.

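The distinction between TEXT and keyword terminals can be
sketched as a small tokenizer in Python. This is an illustration
under stated assumptions, not EQN's lexical analyzer: the
lower-case keyword spellings are inferred from the terminal names
in the grammar, the SPACE token kind for ``~'' and ``^'' is
invented here, and the names KEYWORDS, TOKEN and tokenize are
hypothetical.

```python
import re

# Assumed lower-case spellings of the grammar's keyword terminals.
KEYWORDS = {
    "over", "sqrt", "sub", "sup", "pile", "lpile", "cpile", "rpile",
    "above", "left", "right", "from", "to", "size", "roman", "bold",
    "italic", "hat", "bar", "dot", "dotdot", "tilde", "define",
}

# A quoted string, a brace, a spacing character, or a run of other
# non-blank characters.
TOKEN = re.compile(r'"([^"]*)"|([{}])|([~^])|([^\s{}~^"]+)')


def tokenize(src):
    """Return a list of (kind, text) pairs for a line of EQN-like input."""
    tokens = []
    for match in TOKEN.finditer(src):
        quoted, brace, space, word = match.groups()
        if quoted is not None:
            # Quoting turns off grammatical significance: always TEXT.
            tokens.append(("TEXT", quoted))
        elif brace:
            tokens.append((brace, brace))
        elif space:
            # Assumption: spacing characters get their own token kind.
            tokens.append(("SPACE", space))
        elif word in KEYWORDS:
            tokens.append((word.upper(), word))
        else:
            tokens.append(("TEXT", word))
    return tokens


for kind, text in tokenize('lim~ roman "sup" ~x sub n = 0'):
    print(kind, text)
```

In this input the quoted "sup" comes out as TEXT while the
unquoted sub comes out as the SUB terminal, which is the behavior
the quoting rule describes.
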

   eqn  : box | eqn box
   box  : text
        | { eqn }
        | box OVER box
        | SQRT box
        | box SUB box | box SUP box
        | [ L | C | R ]PILE { list }
        | LEFT text eqn [ RIGHT text ]
        | box [ FROM box ] [ TO box ]
        | SIZE text box
        | [ROMAN | BOLD | ITALIC] box
        | box [HAT | BAR | DOT | DOTDOT | TILDE]
        | DEFINE text text
   list : eqn | list ABOVE eqn
   text : TEXT

     The grammar makes it obvious why there are few exceptions.
For example, the observation that something can be replaced by a
more complicated something in braces is
USD:26-22                    A System for Typesetting Mathematics

implicit in the productions:

   eqn    : box | eqn box
   box    : text | { eqn }

Anywhere a single character could be used, any legal construction can
be used.

     Clearly, our grammar is highly ambiguous. What, for instance, do
we do with the input

   a over b over c  ?

Is it

   {a over b} over c

or is it

   a over {b over c}  ?

     To answer questions like this, the grammar is supplemented with
a small set of rules that describe the precedence and associativity
of operators. In particular, we specify (more or less arbitrarily)
that over associates to the left, so the first alternative above is
the one chosen. On the other hand, sub and sup bind to the right,
because this is closer to standard mathematical practice. That is, we
assume x^a^b is x^(a^b), not (x^a)^b.

     The precedence rules resolve the ambiguity in a construction
like

   a sup 2 over b

We define sup to have a higher precedence than over, so this
construction is parsed as (a^2)/b instead of a^(2/b).

     Naturally, a user can always force a particular parsing by
placing braces around expressions.

     The ambiguous grammar approach seems to be quite useful. The
grammar we use is small enough to be easily understood, for it
contains none of the productions that would normally be used for
resolving ambiguity. Instead the supplemental information about
precedence and associativity (also small enough to be understood)
provides the compiler-compiler with the information it needs to make
a fast, deterministic parser for the specific language we want. When
the language is supplemented by the disambiguating
rules, it is in fact LR(1) and thus easy to parse[5].

     The output code is generated as the input is scanned. Any time a
production of the grammar is recognized, (potentially) some TROFF
commands are output. For example, when the lexical analyzer reports
that it has found a TEXT (i.e., a string of contiguous characters),
we have recognized the production:

   text    : TEXT

The translation of this is simple. We generate a local name for the
string, then hand the name and the string to TROFF, and let TROFF
perform the storage management. All we save is the name of the
string, its height, and its baseline.

     As another example, the translation associated with the
production

   box    : box OVER box

is:

 Width of output box =
   slightly more than largest input width
 Height of output box =
   slightly more than sum of input heights
 Base of output box =
   slightly more than height of bottom input box
 String describing output box =
   move down;
   move right enough to center bottom box;
   draw bottom box (i.e., copy string for bottom box);
   move up; move left enough to center top box;
   draw top box (i.e., copy string for top box);
   move down and left; draw line full width;
   return to proper base line.

     Most of the other productions have equally simple semantic
actions. Picturing the output as a set of properly placed boxes makes
the right sequence of positioning commands quite obvious. The main
difficulty is in finding the right
numbers to use for esthetically pleasing positioning.

     With a grammar, it is usually clear how to extend the language.
For instance, one of our users suggested a TENSOR operator, to make
constructions like

               kj
              lT
              mni

Grammatically, this is easy: it is sufficient to add a production
like

   box    : TENSOR { list }

Semantically, we need only juggle the boxes to the right places.

6. Experience

     There are really three aspects of interest: how well EQN sets
mathematics, how well it satisfies its goal of being ``easy to
use,'' and how easy it was to build.

     The first question is easily addressed. This entire paper has
been set by the program. Readers can judge for themselves whether it
is good enough for their purposes. One of our users commented that
although the output is not as good as the best hand-set material, it
is still better than average, and much better
than the worst. In any case, who cares? Printed books cannot compete
with the birds and flowers of illuminated manuscripts on esthetic
grounds, either, but they have some clear economic advantages.

     Some of the deficiencies in the output could be cleaned up with
more work on our part. For example, we sometimes leave too much space
between a roman letter and an italic one. If we were willing to keep
track of the fonts involved, we could do this better more of the
time.

     Some other weaknesses are inherent in our output device. It is
hard, for instance, to draw a line of an arbitrary length without
getting a perceptible overstrike at one end.

     As to ease of use, at the time of writing, the system has been
used by two distinct groups. One user population consists of
mathematicians, chemists, physicists, and computer scientists. Their
typical reaction has been something like:

 (1) It's easy to write, although I make the following mistakes...

 (2) How do I do...?

 (3) It botches the following things.... Why don't you fix them?

 (4) You really need the following features...

     The learning time is short. A few minutes gives the general
flavor, and typing a page or two of a paper generally uncovers most
of the misconceptions about how it works.

     The second user group is much larger, the secretaries and
mathematical typists who were the original target of the system. They
tend to be enthusiastic converts. They find the language easy to
learn (most are largely self-taught), and have little trouble
producing the output they want. They are of course less critical of
the esthetics of their output than users trained in mathematics.
After a transition period, most find using a computer more
interesting than a regular typewriter.

     The main difficulty that users have seems to be remembering that
a blank is a
delimiter; even experienced users use blanks where they shouldn't and
omit them when they are needed. A common instance is typing

   f(x sub i)

which produces

   f(x_{i)}

instead of

   f(x_i)

Since the EQN language knows no mathematics, it cannot deduce that
the right parenthesis is not part of the subscript.

     The language is somewhat prolix, but this doesn't seem excessive
considering how much is being done, and it is certainly more compact
than the corresponding TROFF commands. For example, here is the
source for the continued fraction expression in Section 1 of this
paper:

   a sub 0 + b sub 1 over
     {a sub 1 + b sub 2 over
       {a sub 2 + b sub 3 over
         {a sub 3 + ... }}}

This is the input for the large integral of Section 1; notice the use
of definitions:

   define emx "{e sup mx}"
   define mab "{m sqrt ab}"
   define sa "{sqrt a}"
   define sb "{sqrt b}"
   int dx over {a emx - be sup -mx} ~=~
   left { lpile {
        1 over {2 mab} ~log~
              {sa emx - sb} over {sa emx + sb}
      above
        1 over mab ~ tanh sup -1 ( sa over sb emx )
      above
        -1 over mab ~ coth sup -1 ( sa over sb emx )
   }

     As to ease of construction, we have already mentioned that there
are really only a few person-months invested. Much of this time has
gone into two things: fine-tuning (what is the most esthetically
pleasing space to use between the numerator and denominator of a
fraction?), and changing things found deficient by our users
(shouldn't a tilde be a delimiter?).

     The program consists of a number of small, essentially
unconnected modules for code generation, a simple lexical analyzer, a
canned parser which we did not have to write, and some miscellany
associated with input files and the macro facility. The program is
now about 1600 lines of C [6], a high-level language reminiscent of
BCPL. About 20 percent of these lines are ``print'' statements,
generating the output code.

     The semantic routines that generate the actual TROFF commands
can be changed to accommodate other formatting languages and devices.
For example, in less than 24 hours, one of us changed the entire
semantic package to drive NROFF, a variant of TROFF, for typesetting
mathematics on teletypewriter devices capable of reverse line
motions. Since many potential users do not have access to a
typesetter, but still have to type mathematics, this provides a way
to get a typed version of the final output which is close enough for
debugging purposes, and sometimes even for ultimate use.

7. Conclusions

     We think we have shown that it is possible to do acceptably good
typesetting of mathematics on a phototypesetter, with an input
language that is easy to learn and use and that satisfies many users'
demands. Such a package can be implemented in short order, given a
compiler-compiler and a decent typesetting program underneath.

     Defining a language, and building a compiler for it with a
compiler-compiler seems like the only sensible way to do business.
Our experience with the use of a grammar and a compiler-compiler has
been uniformly favorable. If we had written everything into code
directly, we would have been locked into our original design.
Furthermore, we would have never been sure where the exceptions and
special cases were. But because we have a grammar, we can change our
minds readily and still be reasonably sure that if a construction
works in one place it will work everywhere.

Acknowledgements

     We are deeply indebted to J. F. Ossanna, the author of TROFF,
for his willingness to modify TROFF to make our task easier and for
his continuous assistance during the development of our program. We
are also grateful to A. V. Aho for help with language theory, to
S. C. Johnson for aid with the compiler-compiler, and to our
early users A. V. Aho, S. I. Feldman, S. C. Johnson, R. W. Hamming,
and M. D. McIlroy for their constructive criticisms.

References

[1]  A Manual of Style, 12th Edition. University of Chicago Press,
     1969. p 295.

[2]  Model C/A/T Phototypesetter. Graphic Systems, Inc., Hudson,
     N. H.

[3]  Ritchie, D. M., and Thompson, K. L., ``The UNIX time-sharing
     system.'' Comm. ACM 17, 7 (July 1974), 365-375.

[4]  Ossanna, J. F., TROFF User's Manual. Bell Laboratories
     Computing Science Technical Report 54, 1977.

[5]  Aho, A. V., and Johnson, S. C., ``LR Parsing.'' Comp. Surv. 6,
     2 (June 1974), 99-124.

[6]  Kernighan, B. W., and Ritchie, D. M., The C Programming
     Language. Prentice-Hall, Inc., 1978.
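
     The precedence and associativity rules described in Section 5
can be tried out with a small stand-alone parser. The Python sketch
below is not the EQN implementation: EQN gets its parser from a
compiler-compiler, whereas this sketch uses a hand-written
precedence-climbing loop over a small subset of the language. The
OPERATORS table and the helper names (tokenize, parse, show) are
assumptions made for illustration. Only three rules come from the
text: over associates to the left, sub and sup bind to the right, and
sup has higher precedence than over. Giving sub the same level as sup
is a further assumption.

```python
"""Sketch of EQN-style precedence rules for over, sub and sup."""

# Hypothetical table (not taken from the EQN source): a higher number
# binds tighter; the second field is the associativity.
OPERATORS = {
    "over": (1, "left"),
    "sub": (2, "right"),   # same level as sup: an assumption
    "sup": (2, "right"),
}


def tokenize(source):
    """Split on blanks, treating each brace as its own token."""
    return source.replace("{", " { ").replace("}", " } ").split()


def parse(source):
    """Return a nested-tuple tree for a small subset of EQN input."""
    tokens = tokenize(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def primary():
        # box : text | { eqn }
        tok = advance()
        if tok == "{":
            node = sequence()
            if peek() != "}":
                raise SyntaxError("missing }")
            advance()
            return node
        if tok == "}" or tok in OPERATORS:
            raise SyntaxError(f"unexpected {tok!r}")
        return tok

    def binary(min_prec):
        # Precedence climbing: keep absorbing operators that bind at
        # least as tightly as min_prec.
        left = primary()
        while peek() in OPERATORS and OPERATORS[peek()][0] >= min_prec:
            op = advance()
            prec, assoc = OPERATORS[op]
            # A left-associative operator must not capture another
            # operator of the same level on its right-hand side.
            right = binary(prec + 1 if assoc == "left" else prec)
            left = (op, left, right)
        return left

    def sequence():
        # eqn : box | eqn box   (boxes placed side by side)
        boxes = [binary(1)]
        while peek() not in (None, "}"):
            boxes.append(binary(1))
        return boxes[0] if len(boxes) == 1 else ("seq", *boxes)

    tree = sequence()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree


def show(tree, top=True):
    """Render a tree as EQN input with every grouping braced."""
    if isinstance(tree, str):
        return tree
    op, *args = tree
    if op == "seq":
        body = " ".join(show(arg, False) for arg in args)
        return body if top else "{" + body + "}"
    left, right = (show(arg, False) for arg in args)
    return "{" + left + " " + op + " " + right + "}"
```

For example, show(parse("a sup 2 over b")) returns
{{a sup 2} over b}, the grouping the text describes, while braces in
the input still force the other reading: show(parse("a sup {2 over
b}")) returns {a sup {2 over b}}.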

