
               Directions of UNIX at Berkeley

                   Marshall Kirk McKusick

                     Michael J. Karels

              Computer Systems Research Group
                 Computer Science Division
 Department of Electrical Engineering and Computer Science
             University of California, Berkeley
                Berkeley, California  94720


          This paper gives a brief overview of the
     contributions to UNIX made by the research community
     and  describes the needs that prompted the distri-
     butions from Berkeley. The  next  Berkeley  system
     will  attempt  to  adapt  to  the current state of
     technology in the areas of virtual memory and file
     system  interfaces. The paper makes a brief survey
     of this  available  technological  base  and  then
     speculates  on  the  ways in which future Berkeley
     systems will use this technology.

1. The Role of Research in Maintaining System Vitality

     Since the divestiture of  AT&T,  UNIX  has  become  the
focus of a massive marketing effort. To succeed, this effort
must convince potential customers that the product  is  sup-
ported,  that future versions will continue to be developed,
and that these versions will be upwardly compatible with all
past applications.

     AT&T's size alone ensures that it will be around in
years to come. Because the company has allocated increasing
research, development, and support resources to UNIX over
the past 10 years, it provides an assurance of its
commitment. Its massive advertising campaign for System V,
its presence on the /usr/group UNIX standards committee, and
the publication of the System V Interface Definition testify
to the company's intention to remain compatible with past
systems.

UNIX is a registered trademark of AT&T in the US and other
countries.

     Although repeal of the law of entropy  is  a  necessary
step  along  the  road  to a viable commercial product, this
runs counter to orderly system evolution. Be that as it may,
AT&T's  major UNIX commercialization effort has succeeded in
making the system available to a much broader audience  than
was previously possible.

     The freezing of  what  previously  had  been  an  ever-
changing  UNIX  interface represented a major departure from
the pattern that the small but highly skilled UNIX community
had  come  to expect. Most early users had accounts at sites
that had the source to the programs they ran. Thus,  as  the
system interface evolved to reflect more current technology,
software could be changed to keep pace. Users simply updated
their  programs to account for the new interface, recompiled
them, and continued to use them  as  before.  Although  this
required  a  large  effort, it allowed the system -- and the
tools that ran on it -- to reflect changes in software
technology.

     At the forefront of the technological wave  was  AT&T's
own  Bell  Laboratories  [Ritchie74].  It was there that the
UNIX system was born and nurtured, and it was there that its
evolution  was  controlled  -- up through the release of the
7th Edition. Universities also were involved with the system
almost  from  its inception. The University of California at
Berkeley was among the first participants, playing  host  to
several  researchers  on  sabbatical  from  the  Labs.  This
cooperation typified the harmony that was characteristic  of
the  early  UNIX community. Work that was contributed to the
Labs by different members of the community helped produce  a
rapidly expanding set of tools and facilities.

     With the release of the 7th Edition, though,  the  use-
fulness  of  UNIX  already had been clearly established, and
other organizations within AT&T began to handle  the  public
releases  of  the  system.  These groups took far less input
from the community as they began to freeze the system
interface in preparation for entry into the commercial
marketplace.

     As the research community continued to modify the  UNIX
system,  it  found that it needed an organization that could
produce releases. Berkeley quickly stepped  into  the  role.
Before  the  final  public  release  of  UNIX from the Labs,
Berkeley's work had been focused on the development of tools
designed  to  be  added  to existing UNIX systems. After the
AT&T freeze, though, a group of researchers at  the  univer-
sity  found  that  they  could  easily  expand their role to
include the coalescing function previously provided by the
Labs.  Out of this came the first full Berkeley distribution
of UNIX (3.0BSD), complete with virtual memory  --  a  first
for  UNIX  users.  The  idea was so successful that System V
eventually adopted it six years later.

1.1. Motivations for Change

     At the same time that AT&T was  beginning  to  put  the
brakes  on  further  change in UNIX, local area networks and
bitmapped workstations were just beginning  to  emerge  from
Xerox PARC and other research centers. Users in the academic
and  research  community  realized  that   there   were   no
production-quality  operating  systems capable of using such
hardware. They also saw that networking unquestionably would
be  an  indispensable  facility  in future systems research.
Though it was not clear that UNIX was the  correct  base  on
which  to  build  a networked system, it was clear that UNIX
offered the most expedient means by which to build such a
system.

     This presented the Berkeley group with an interesting
challenge: how to meet the needs of the community of users
without adding needless complexity to existing applications.
Their  efforts  were  aided  by  the presence of a large and
diverse local group of users who were teaching  introductory
programming, typesetting documents, developing software sys-
tems, and trying to build huge Lisp-based systems capable of
solving  differential equations. In addition, they were able
to discuss current problems and hash out potential solutions
at semi-annual technical conferences run by the Usenix
Association.

     The assistance of  a  steering  committee  composed  of
academics, commercial vendors, DARPA researchers, and people
from the Labs made it possible for  the  architecture  of  a
networking-based  UNIX  system  to  be developed. By keeping
with the UNIX tradition of integrating work done  by  others
in preference to writing everything from scratch, 4.2BSD was
released less than two years later [Joy83].

2. The Future of UNIX at Berkeley

     The release of 4.3BSD in April of 1986  addressed  many
of   the  performance  problems  and  unfinished  interfaces
present in 4.2BSD [Leffler84] [McKusick85]. Berkeley has now
embarked on a new development phase to likewise update other
old parts of the system. There are three main areas of work.
The  first  is  to rewrite the virtual memory system to take
advantage of current technology and to provide new capabili-
ties  such  as mapped files and shared memory. The second is
to provide a standard interface to file systems so that mul-
tiple local and remote file systems can be supported much as
multiple networking protocols are by 4.3BSD. Finally,  there
is a need to provide more internal flexibility in a way
similar to the System V Streams paradigm.

2.1. A New Virtual Memory Implementation

     With the cost per byte of memory approaching the cost
per byte of disk storage, and with file systems increasingly
removed from host machines, a new approach to the
implementation of virtual memory is necessary. In 4.3BSD the
swap space is preallocated; this limits the maximum  virtual
memory  that  can  be supported to the size of the swap area
[Babaoglu79] [Someren84]. The new system should support vir-
tual  memory  space at least as great as the sum of sizes of
physical memory plus swap space (a system may  run  with  no
swap space if it has no local disk). For systems that have a
local swap disk, but utilize remote file systems, using some
memory  to  keep  track of the contents of swap space may be
useful to avoid multiple fetches of the same data  from  the
file system.

     The new implementation should also add new  functional-
ity.   Processes  should  be  allowed  to  have large sparse
address spaces, to map files into their address  spaces,  to
map  device  memory  into their address spaces, and to share
memory with other processes. The shared  address  space  may
either  be  obtained  by  mapping a file into (possibly dif-
ferent) parts of the address  space,  or  by  arranging  for
processes  to  share  ``anonymous  memory'' (that is, memory
that is zero-fill on demand, and  whose  contents  are  lost
when  the  last  process  unmaps  the  memory).  This latter
approach was the one adopted by the developers of System V.
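
     As an illustration of the anonymous-memory case, the
following is a minimal sketch that assumes the mmap() call
found on current UNIX-like systems, including the widely
available but not strictly standard MAP_ANONYMOUS flag; the
paper itself does not fix an interface. The function name
share_anonymous_int is hypothetical. A parent maps one
zero-filled shared page, forks, and then reads back the value
that the child stored.

```c
#define _DEFAULT_SOURCE          /* expose MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* BSD spelling of the same flag */
#endif

/*
 * Share one int between a parent and a child through anonymous memory.
 * Returns the value the parent observes after the child has written
 * it, or -1 on failure.
 */
int share_anonymous_int(int value_from_child)
{
    /* Zero-fill-on-demand memory, shared with processes that inherit it. */
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    assert(*shared == 0);        /* anonymous memory starts out zeroed */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {              /* child: store into the shared page */
        *shared = value_from_child;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    int seen = *shared;          /* parent sees the child's store */
    munmap(shared, sizeof *shared);   /* last unmap discards the contents */
    return seen;
}
```

Calling share_anonymous_int(42) from a main() should return
42 on a system that supports shared anonymous mappings.
Replacing the -1 descriptor with an open file descriptor and
dropping MAP_ANONYMOUS would give the mapped-file variant.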

     One possible use of  shared  memory  is  to  provide  a
high-speed   Inter-Process   Communication  (IPC)  mechanism
between two or more cooperating processes. To ensure the
integrity  of  data structures in a shared region, processes
must be able to use semaphores to coordinate their access to
these  shared  structures.  In System V, semaphores are pro-
vided as a set of system calls. Unfortunately,  the  use  of
system calls reduces the throughput of the shared memory IPC
to  that  of  existing  IPC  mechanisms.   To   avoid   this
bottleneck,  we  expect  that  the  next release of BSD will
incorporate a scheme  that  places  the  semaphores  in  the
shared  memory segment, so that machines with a test-and-set
instruction will be able to  handle  the  usual  uncontested
``lock'' and ``unlock'' without doing two system calls. Only
in the unusual case of trying to lock an already-locked lock
or  when a desired lock is being released will a system call
be required.  The interface will allow a  user-level  imple-
mentation  of  the  System  V  semaphore  interface  on most
machines with a much lower runtime cost [McKusick86].
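
     The fast path described above can be sketched with the
C11 atomic_flag type, whose test-and-set operation maps onto
the machine instruction where one exists. This is only an
illustration under stated assumptions, not the interface
proposed in [McKusick86]: the names shm_lock, shm_lock_try,
shm_lock_acquire, and shm_lock_release are hypothetical, and
sched_yield() merely stands in for sleeping in the kernel
until the lock is released. Because waiters here poll rather
than sleep, the release path needs no wakeup call, which a
real implementation would require when a waiter exists.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical lock record that would live inside the shared segment. */
struct shm_lock {
    atomic_flag held;        /* set while some process owns the lock */
    atomic_ulong slow_path;  /* how often the contended path was taken */
};

void shm_lock_init(struct shm_lock *l)
{
    atomic_flag_clear(&l->held);
    atomic_init(&l->slow_path, 0);
}

/* One test-and-set, no system call. Returns 1 if the lock was taken. */
int shm_lock_try(struct shm_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

void shm_lock_acquire(struct shm_lock *l)
{
    while (!shm_lock_try(l)) {
        /* Contended case only: this is where a system call happens. */
        atomic_fetch_add(&l->slow_path, 1);
        sched_yield();       /* stand-in for "sleep until released" */
    }
}

void shm_lock_release(struct shm_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

In the uncontested case, shm_lock_acquire() followed by
shm_lock_release() leaves slow_path at zero, meaning no
system call was made. In practice the struct would be placed
in memory shared by the cooperating processes rather than in
a single process's private data.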

2.2. Toward a Compatible File System Interface

     As network or remote file systems have been implemented
for  UNIX, several stylized interfaces between the file sys-
tem implementation and the rest  of  the  kernel  have  been
developed. Among these are Sun Microsystems' Virtual File
System interface (VFS) using vnodes [Sandberg85] [Kleiman86],
Digital Equipment's Generic File System (GFS) architecture
[Rodriguez86], AT&T's File System Switch (FSS) [Rifkin86],
the LOCUS distributed file system [Walker85], and
Masscomp's extended file system [Cole85]. Other remote  file
systems  have  been  implemented  in  research or university
groups for internal use -- notably the network file system in
the  Eighth  Edition UNIX system [Weinberger84] and two dif-
ferent file  systems  used  at  Carnegie  Mellon  University
[Satyanarayanan85].   Numerous   other  remote  file  access
methods have been devised for  use  within  individual  UNIX
processes,  many  of  them  by  modifications  to  the C I/O
library similar to those in the Newcastle Connection
[Brownbridge82].

     Each design attempts to isolate  file  system-dependent
details below a generic interface and to provide a framework
within which new file systems may be incorporated.  However,
each of these interfaces is different from and is incompati-
ble with  the  others.  Each  addresses  somewhat  different
design goals, having been based on a different starting ver-
sion of UNIX, having targeted a different set of  file  sys-
tems  with  varying  characteristics,  and having selected a
different set of file system primitive operations.

     We have studied the various file system  interfaces  to
determine  their generality, completeness, robustness, effi-
ciency,  and  aesthetics.  Based  on  this  study,  we  have
developed a proposal for a new file system interface that we
believe includes the best features of each of  the  existing
implementations.  Briefly,  the  proposal  adopts the 4.3BSD
calling convention for name lookup, but otherwise is closely
related  to  Sun's  VFS.  A  prototype implementation now is
being developed. This proposal and the rationale  underlying
its  development  have been presented to major software ven-
dors as an early step toward  convergence  on  a  compatible
file system interface [Karels86].
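
     The general shape that these interfaces share, a table
of operations supplied by each file system type and a
generic layer that calls through it, can be sketched as
follows. This is a much-reduced illustration and is neither
Sun's VFS nor the proposal in [Karels86]; every name in it
(fs_ops, mount, generic_lookup, and the two toy file
systems) is hypothetical, and the "file ids" are invented
constants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Operations every file system type supplies (hypothetical, reduced). */
struct fs_ops {
    const char *type;                   /* e.g. "local" or "remote" */
    long (*lookup)(const char *name);   /* name -> file id, or -1 */
};

static long local_lookup(const char *name)
{
    return strcmp(name, "passwd") == 0 ? 7 : -1;       /* toy inode number */
}

static long remote_lookup(const char *name)
{
    return strcmp(name, "paper.ms") == 0 ? 1001 : -1;  /* toy remote handle */
}

static const struct fs_ops local_fs  = { "local",  local_lookup  };
static const struct fs_ops remote_fs = { "remote", remote_lookup };

/* A mount table binds a path prefix to one file system implementation. */
struct mount {
    const char *prefix;
    const struct fs_ops *ops;
};

static const struct mount mounts[] = {
    { "/etc/", &local_fs  },
    { "/net/", &remote_fs },
};

/* Generic layer: choose the file system by prefix, then call through ops. */
long generic_lookup(const char *path, const char **type_out)
{
    for (size_t i = 0; i < sizeof mounts / sizeof mounts[0]; i++) {
        size_t n = strlen(mounts[i].prefix);
        if (strncmp(path, mounts[i].prefix, n) == 0) {
            if (type_out != NULL)
                *type_out = mounts[i].ops->type;
            return mounts[i].ops->lookup(path + n);
        }
    }
    return -1;   /* no file system claims this path */
}
```

The caller of generic_lookup() never names a file system
type; adding a third type means adding one fs_ops table and
one mount entry, with no change to the generic layer.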

2.3. Changes to the Protocol Layering Interface

     The original work on restructuring the  UNIX  character
I/O  system  to allow flexible configuration of the internal
modules by user processes  was  done  at  Bell  Laboratories
[Ritchie84].  Known  as  stackable  line  disciplines, these
interfaces allowed a user process to  open  a  raw  terminal
port  and  then push on appropriate processing modules (such
as one to do line editing). This model allowed terminal pro-
cessing  modules  to  be  used  with virtual-circuit network
modules to create ``network virtual terminals'' by  stacking
a terminal processing module on top of a networking
protocol.
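
     The idea of pushing processing modules onto an open
port can be illustrated with a small user-level sketch. It
is not the Eighth Edition implementation described in
[Ritchie84]; the names stream, stream_push, stream_read_up,
line_edit, and upcase are hypothetical, and the modules are
toys that only transform a buffer on its way up from the
device.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 4

/* A module rewrites a buffer in place and returns the new length. */
typedef size_t (*module_fn)(char *buf, size_t len);

struct stream {
    module_fn stack[MAX_MODULES];   /* stack[0] sits closest to the device */
    int depth;
};

/* Push a module on top of the stream; 0 on success, -1 if full. */
int stream_push(struct stream *s, module_fn m)
{
    if (s->depth >= MAX_MODULES)
        return -1;
    s->stack[s->depth++] = m;
    return 0;
}

/* Data from the device passes upward through every module in order. */
size_t stream_read_up(const struct stream *s, char *buf, size_t len)
{
    for (int i = 0; i < s->depth; i++)
        len = s->stack[i](buf, len);
    return len;
}

/* Toy module: honor backspace ('\b') as a minimal form of line editing. */
size_t line_edit(char *buf, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\b') {
            if (out > 0)
                out--;
        } else {
            buf[out++] = buf[i];
        }
    }
    return out;
}

/* Toy module: fold input to upper case. */
size_t upcase(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
    return len;
}
```

Each module has exactly one neighbor above and one below,
so the structure is a simple linear chain with no place for
one module to feed several others.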


     The design of the networking facilities for 4.2BSD took
a  different  approach  based  on the socket interface. This
design allows a single system to support  multiple  sets  of
networking  protocols with stream, datagram, and other types
of access. Protocol modules may deal  with  multiplexing  of
data from different connections onto a single transport
medium.

     A problem with stackable line disciplines, though, is
that they are inherently linear in nature. Thus, they do not
adequately model the fan-in and fan-out associated with mul-
tiplexing.  The simple and elegant stackable line discipline
implementation of Eighth Edition UNIX was converted  to  the
full  production  implementation  of  Streams  in  System  V
Release 3. In doing the conversion,  many  pragmatic  issues
were addressed, including the handling of multiplexed
connections and commercially important protocols.
Unfortunately, the implementation complexity increased
enormously.

     Because AT&T will not allow others to  include  Streams
unless  they  also change their interface to comply with the
System V Interface Definition base and Networking Extension,
we cannot use the Release 3 implementation of Streams in the
Berkeley system. Given that compatibility thus will be  dif-
ficult,  we  feel  we will have complete freedom to make our
choices based solely on technical merits. As a  result,  our
implementation  will appear far more like the simpler stack-
able line  disciplines  than  the  more  complex  Release  3
Streams [Chandler86]. A socket interface will be used rather
than a character device interface, and  demultiplexing  will
be  handled  internally by the protocols in the kernel. How-
ever, like Streams, the interfaces between  kernel  protocol
modules will follow a uniform convention.
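
     The way one socket interface covers several types of
access can be seen in a short sketch using the socket calls
as they exist on current systems. It assumes that
socketpair() supports both SOCK_STREAM and SOCK_DGRAM in the
AF_UNIX domain, which holds on common UNIX-like systems; the
function name first_read_size is hypothetical. The same code
runs for either type, and only the delivery semantics
differ.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send two 3-byte messages over a connected pair of local sockets of
 * the given type, then return how many bytes the first read delivers
 * (or -1 on error).
 */
long first_read_size(int type)
{
    int sv[2];
    char buf[16];

    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;

    if (write(sv[0], "abc", 3) != 3 || write(sv[0], "def", 3) != 3) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* The caller's code is identical for stream and datagram sockets. */
    ssize_t n = read(sv[1], buf, sizeof buf);

    close(sv[0]);
    close(sv[1]);
    return (long)n;
}
```

With SOCK_DGRAM the first read returns 3, because each write
is delivered as a separate message. With SOCK_STREAM the
boundaries are not preserved, so the first read may return
anywhere from 1 to 6 bytes.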

3. References

Babaoglu79        Babaoglu, O., W.  Joy,  ``Data  Structures
                  Added   in  the  Berkeley  Virtual  Memory
                  Extensions to the UNIX Operating System,''
                  Computer  Systems  Research Group, Dept of
                  EECS, University of California,  Berkeley,
                  CA 94720, USA, November 1979.

Brownbridge82     Brownbridge, D.R., L.F. Marshall, B.  Ran-
                  dell,   ``The   Newcastle  Connection,  or
                  UNIXes of the  World  Unite!,''  Software-
                  Practice  and  Experience,  Vol.  12,  pp.
                  1147-1162, 1982.

Chandler86        Chandler, D., ``The Monthly Report - Up
                  the Streams Without a Standard,'' UNIX
                  Review, Vol. 4, No. 9, pp. 6-14, September
                  1986.

Cole85            Cole, C.T., P.B. Flinn, A.B.  Atlas,  ``An
                  Implementation  of an Extended File System
                  for UNIX,'' Usenix Conference Proceedings,
                  pp. 131-150, June, 1985.

Joy83             Joy, W., E. Cooper, R. Fabry, S.  Leffler,
                  M.  McKusick,  D.  Mosher, ``4.2BSD System
                  Manual,'' 4.2BSD UNIX Programmer's Manual,
                  Vol 2c, Document #68, August 1983.

Karels86          Karels, M., M. McKusick, ``Towards a  Com-
                  patible  File System Interface,'' Proceed-
                  ings of  the  European  UNIX  Users  Group
                  Meeting, Manchester, England, pp. 481-496,
                  September 1986.

Kleiman86         Kleiman, S., ``Vnodes: An Architecture for
                  Multiple  File System Types in Sun UNIX,''
                  Usenix Conference  Proceedings,  pp.  238-
                  247, June, 1986.

Leffler84         Leffler, S.,  M.K.  McKusick,  M.  Karels,
                  ``Measuring  and Improving the Performance
                  of 4.2BSD,''  Usenix  Conference  Proceed-
                  ings, pp. 237-252, June, 1984.

McKusick85        McKusick, M.K.,  M.  Karels,  S.  Leffler,
                  ``Performance  Improvements and Functional
                  Enhancements in 4.3BSD,''  Usenix  Confer-
                  ence Proceedings, pp. 519-531, June, 1985.

McKusick86        McKusick, M., M. Karels, ``A  New  Virtual
                  Memory Implementation for Berkeley UNIX,''
                  Proceedings of  the  European  UNIX  Users
                  Group  Meeting,  Manchester,  England, pp.
                  451-460, September 1986.

Someren84         Someren,  J.  van,  ``Paging  in  Berkeley
                  UNIX,''  Laboratorium voor schakeltechniek
                  en techniek v.d. informatieverwerkende
                  machines,   Codenummer  051560-44(1984)01,
                  February 1984.


Rifkin86          Rifkin, A.P., M.P. Forbes, R.L.  Hamilton,
                  M.  Sabrio, S. Shah, K. Yueh, ``RFS Archi-
                  tectural  Overview,''  Usenix   Conference
                  Proceedings, pp. 248-259, June, 1986.

Ritchie74         Ritchie, D.M.,  K.  Thompson,  ``The  Unix
                  Time-Sharing  System,''  Communications of
                  the ACM, Vol. 17, pp. 365-375, July, 1974.

Ritchie84         Ritchie,  D.M.,  ``A  Stream  Input-Output
                  System,'' AT&T Bell Laboratories Technical
                  Journal, Vol 63, No 8, Part 2,  pp.  1897-
                  1910, October 1984.

Rodriguez86       Rodriguez, R., M. Koehler, R. Hyde,  ``The
                  Generic  File  System,'' Usenix Conference
                  Proceedings, pp. 260-269, June, 1986.

Sandberg85        Sandberg, R., D. Goldberg, S. Kleiman,  D.
                  Walsh,  B.  Lyon, ``Design and Implementa-
                  tion of the  Sun  Network  File  System,''
                  Usenix  Conference  Proceedings,  pp. 119-
                  130, June, 1985.

Satyanarayanan85  Satyanarayanan, M., et al., ``The ITC Dis-
                  tributed   File   System:  Principles  and
                  Design,'' Proc. 10th Symposium on  Operat-
                  ing  Systems  Principles,  pp. 35-50, ACM,
                  December, 1985.

Walker85          Walker, B.J. and S.H. Kiser,  ``The  LOCUS
                  Distributed  File System,'' The LOCUS Dis-
                  tributed System Architecture,  G.J.  Popek
                  and  B.J. Walker, ed., The MIT Press, Cam-
                  bridge, MA, 1985.

Weinberger84      Weinberger, P.J., ``The Version 8  Network
                  File System,'' Usenix Conference presenta-
                  tion, June, 1984.

                       April 13, 2015

Generated on 2015-04-13 10:26:13 by $MirOS: src/scripts/roff2htm,v 1.80 2015/01/02 13:54:19 tg Exp $

These manual pages and other documentation are copyrighted by their respective writers; their source is available at our CVSweb, AnonCVS, and other mirrors. The rest is Copyright © 2002–2015 The MirOS Project, Germany.
This product includes material provided by Thorsten Glaser.
