Directions of UNIX at Berkeley

Marshall Kirk McKusick
Michael J. Karels

Computer Systems Research Group
Computer Science Division
Department of Electrical Engineering and Computer Science
University of California, Berkeley
Berkeley, California 94720

ABSTRACT

This paper gives a brief overview of the contributions to UNIX† made by the research community and describes the needs that prompted the distributions from Berkeley. The next Berkeley system will attempt to adapt to the current state of technology in the areas of virtual memory and file system interfaces. The paper makes a brief survey of this available technological base and then speculates on the ways in which future Berkeley systems will use this technology.

_________________________
† UNIX is a registered trademark of AT&T in the US and other countries.

February 10, 2014

1. The Role of Research in Maintaining System Vitality

Since the divestiture of AT&T, UNIX has become the focus of a massive marketing effort. To succeed, this effort must convince potential customers that the product is supported, that future versions will continue to be developed, and that these versions will be upwardly compatible with all past applications.

AT&T's size alone ensures that it will be around in years to come. Because the company has allocated increasing research, development, and support resources to UNIX over the past 10 years, it provides an assurance of its commitment. Its massive advertising campaign for System V, its presence on the /usr/group UNIX standards committee, and the publication of the System V Interface Definition testify to the company's intention to remain compatible with past systems. Although repeal of the law of entropy is a necessary step along the road to a viable commercial product, this runs counter to orderly system evolution.
Be that as it may, AT&T's major UNIX commercialization effort has succeeded in making the system available to a much broader audience than was previously possible.

The freezing of what previously had been an ever-changing UNIX interface represented a major departure from the pattern that the small but highly skilled UNIX community had come to expect. Most early users had accounts at sites that had the source to the programs they ran. Thus, as the system interface evolved to reflect more current technology, software could be changed to keep pace. Users simply updated their programs to account for the new interface, recompiled them, and continued to use them as before. Although this required a large effort, it allowed the system -- and the tools that ran on it -- to reflect changes in software technology.

At the forefront of the technological wave was AT&T's own Bell Laboratories [Ritchie74]. It was there that the UNIX system was born and nurtured, and it was there that its evolution was controlled -- up through the release of the 7th Edition.

Universities also were involved with the system almost from its inception. The University of California at Berkeley was among the first participants, playing host to several researchers on sabbatical from the Labs. This cooperation typified the harmony that was characteristic of the early UNIX community. Work that was contributed to the Labs by different members of the community helped produce a rapidly expanding set of tools and facilities.

With the release of the 7th Edition, though, the usefulness of UNIX already had been clearly established, and other organizations within AT&T began to handle the public releases of the system. These groups took far less input from the community as they began to freeze the system interface in preparation for entry into the commercial marketplace. As the research community continued to modify the UNIX system, it found that it needed an organization that could produce releases.
Berkeley quickly stepped into the role. Before the final public release of UNIX from the Labs, Berkeley's work had been focused on the development of tools designed to be added to existing UNIX systems. After the AT&T freeze, though, a group of researchers at the university found that they could easily expand their role to include the coalescing function previously provided by the Labs. Out of this came the first full Berkeley distribution of UNIX (3.0BSD), complete with virtual memory -- a first for UNIX users. The idea was so successful that System V eventually adopted it six years later.

1.1. Motivations for Change

At the same time that AT&T was beginning to put the brakes on further change in UNIX, local area networks and bitmapped workstations were just beginning to emerge from Xerox PARC and other research centers. Users in the academic and research community realized that there were no production-quality operating systems capable of using such hardware. They also saw that networking unquestionably would be an indispensable facility in future systems research. Though it was not clear that UNIX was the correct base on which to build a networked system, it was clear that UNIX offered the most expedient means by which to build such a system.

This presented the Berkeley group with an interesting challenge: how to meet the needs of the community of users without adding needless complexity to existing applications. Their efforts were aided by the presence of a large and diverse local group of users who were teaching introductory programming, typesetting documents, developing software systems, and trying to build huge Lisp-based systems capable of solving differential equations. In addition, they were able to discuss current problems and hash out potential solutions at semi-annual technical conferences run by the Usenix organization.
The assistance of a steering committee composed of academics, commercial vendors, DARPA researchers, and people from the Labs made it possible for the architecture of a networking-based UNIX system to be developed. By keeping with the UNIX tradition of integrating work done by others in preference to writing everything from scratch, 4.2BSD was released less than two years later [Joy83].

2. The Future of UNIX at Berkeley

The release of 4.3BSD in April of 1986 addressed many of the performance problems and unfinished interfaces present in 4.2BSD [Leffler84] [McKusick85]. Berkeley has now embarked on a new development phase to likewise update other old parts of the system. There are three main areas of work. The first is to rewrite the virtual memory system to take advantage of current technology and to provide new capabilities such as mapped files and shared memory. The second is to provide a standard interface to file systems so that multiple local and remote file systems can be supported much as multiple networking protocols are by 4.3BSD. Finally, there is a need to provide more internal flexibility in a way similar to the System V Streams paradigm.

2.1. A New Virtual Memory Implementation

With the cost per byte of memory approaching that of disks, and with file systems increasingly removed from host machines, a new approach to the implementation of virtual memory is necessary. In 4.3BSD the swap space is preallocated; this limits the maximum virtual memory that can be supported to the size of the swap area [Babaoglu79] [Someren84]. The new system should support virtual memory space at least as great as the sum of the sizes of physical memory and swap space (a system may run with no swap space if it has no local disk).
For systems that have a local swap disk, but utilize remote file systems, using some memory to keep track of the contents of swap space may be useful to avoid multiple fetches of the same data from the file system.

The new implementation should also add new functionality. Processes should be allowed to have large sparse address spaces, to map files into their address spaces, to map device memory into their address spaces, and to share memory with other processes. The shared address space may either be obtained by mapping a file into (possibly different) parts of the address space, or by arranging for processes to share ``anonymous memory'' (that is, memory that is zero-fill on demand, and whose contents are lost when the last process unmaps the memory). This latter approach was the one adopted by the developers of System V.

One possible use of shared memory is to provide a high-speed Inter-Process Communication (IPC) mechanism between two or more cooperating processes. To ensure the integrity of data structures in a shared region, processes must be able to use semaphores to coordinate their access to these shared structures. In System V, semaphores are provided as a set of system calls. Unfortunately, the use of system calls reduces the throughput of the shared memory IPC to that of existing IPC mechanisms.

To avoid this bottleneck, we expect that the next release of BSD will incorporate a scheme that places the semaphores in the shared memory segment, so that machines with a test-and-set instruction will be able to handle the usual uncontested ``lock'' and ``unlock'' without doing two system calls. Only in the unusual case of trying to lock an already-locked lock or when a desired lock is being released will a system call be required. The interface will allow a user-level implementation of the System V semaphore interface on most machines with a much lower runtime cost [McKusick86].

2.2. Toward a Compatible File System Interface

As network or remote file systems have been implemented for UNIX, several stylized interfaces between the file system implementation and the rest of the kernel have been developed. Among these are Sun Microsystems' Virtual File System interface (VFS) using vnodes [Sandberg85] [Kleiman86], Digital Equipment's Generic File System (GFS) architecture [Rodriguez86], AT&T's File System Switch (FSS) [Rifkin86], the LOCUS distributed file system [Walker85], and Masscomp's extended file system [Cole85]. Other remote file systems have been implemented in research or university groups for internal use, notably the network file system in the Eighth Edition UNIX system [Weinberger84] and two different file systems used at Carnegie Mellon University [Satyanarayanan85]. Numerous other remote file access methods have been devised for use within individual UNIX processes, many of them by modifications to the C I/O library similar to those in the Newcastle Connection [Brownbridge82].

Each design attempts to isolate file system-dependent details below a generic interface and to provide a framework within which new file systems may be incorporated. However, each of these interfaces is different from and is incompatible with the others. Each addresses somewhat different design goals, having been based on a different starting version of UNIX, having targeted a different set of file systems with varying characteristics, and having selected a different set of file system primitive operations.

We have studied the various file system interfaces to determine their generality, completeness, robustness, efficiency, and aesthetics. Based on this study, we have developed a proposal for a new file system interface that we believe includes the best features of each of the existing implementations.
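
As a general illustration of what it means to isolate file system-dependent details below a generic interface, the following minimal sketch may help. It is not the interface of VFS, GFS, FSS, or of the proposal discussed here; every name in it (`FileSystem`, `LocalFS`, `RemoteFS`, `MountTable`) is hypothetical, and the "remote" file system merely counts requests instead of using a network.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class FileSystem(ABC):
    """Hypothetical generic interface: the only thing the caller sees."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        """Return the contents of the file named by path."""


class LocalFS(FileSystem):
    """Stand-in for a local disk file system (contents held in a dict)."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = dict(files)

    def read(self, path: str) -> bytes:
        return self.files[path]


class RemoteFS(FileSystem):
    """Stand-in for a remote file system; counts simulated server requests."""

    def __init__(self, server_files: Dict[str, bytes]) -> None:
        self.server_files = dict(server_files)
        self.requests = 0

    def read(self, path: str) -> bytes:
        self.requests += 1  # a real implementation would send a request here
        return self.server_files[path]


class MountTable:
    """Routes each path to the file system mounted at the longest prefix."""

    def __init__(self) -> None:
        self.mounts: Dict[str, FileSystem] = {}

    def mount(self, prefix: str, fs: FileSystem) -> None:
        self.mounts[prefix] = fs

    def resolve(self, path: str) -> Tuple[FileSystem, str]:
        best = None
        for prefix in self.mounts:
            covers = (prefix == "/" or path == prefix
                      or path.startswith(prefix + "/"))
            if covers and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise FileNotFoundError(path)
        # Path as seen by the chosen file system, relative to its mount point.
        rel = path if best == "/" else path[len(best):] or "/"
        return self.mounts[best], rel

    def read(self, path: str) -> bytes:
        fs, rel = self.resolve(path)
        return fs.read(rel)  # the caller never learns which type served it
```

A new file system type can be added by writing another `FileSystem` subclass and mounting it; `MountTable.read` does not change. The incompatibility described above arises because each real interface chose a different set of operations to play the role that `read` plays in this sketch.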
Briefly, the proposal adopts the 4.3BSD calling convention for name lookup, but otherwise is closely related to Sun's VFS. A prototype implementation now is being developed. This proposal and the rationale underlying its development have been presented to major software vendors as an early step toward convergence on a compatible file system interface [Karels86].

2.3. Changes to the Protocol Layering Interface

The original work on restructuring the UNIX character I/O system to allow flexible configuration of the internal modules by user processes was done at Bell Laboratories [Ritchie84]. Known as stackable line disciplines, these interfaces allowed a user process to open a raw terminal port and then push on appropriate processing modules (such as one to do line editing). This model allowed terminal processing modules to be used with virtual-circuit network modules to create ``network virtual terminals'' by stacking a terminal processing module on top of a networking protocol.

The design of the networking facilities for 4.2BSD took a different approach based on the socket interface. This design allows a single system to support multiple sets of networking protocols with stream, datagram, and other types of access. Protocol modules may deal with multiplexing of data from different connections onto a single transport medium.

A problem with stackable line disciplines, though, is that they are inherently linear in nature. Thus, they do not adequately model the fan-in and fan-out associated with multiplexing.

The simple and elegant stackable line discipline implementation of Eighth Edition UNIX was converted to the full production implementation of Streams in System V Release 3. In doing the conversion, many pragmatic issues were addressed, including the handling of multiplexed connections and commercially important protocols. Unfortunately, the implementation complexity increased enormously.
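
The linear nature of stackable line disciplines can be seen in a small sketch. The code below is an illustration only, not the Eighth Edition or System V implementation: the `Stream` class, its `push` and `read_up` methods, and the two example modules are hypothetical names, and only the upward (read) direction is modeled.

```python
from typing import Callable, List

# A processing module is modeled as a function from bytes to bytes.
Module = Callable[[bytes], bytes]


class Stream:
    """Hypothetical linear stack of processing modules above a raw port."""

    def __init__(self) -> None:
        # modules[0] sits next to the device; the last one is nearest the user.
        self.modules: List[Module] = []

    def push(self, module: Module) -> None:
        """Push a module on top of the stack, as a user process would."""
        self.modules.append(module)

    def read_up(self, raw: bytes) -> bytes:
        """Pass data from the device upward through every module in order."""
        data = raw
        for module in self.modules:
            data = module(data)
        return data


def strip_parity(data: bytes) -> bytes:
    """Clear the high bit of every byte."""
    return bytes(b & 0x7F for b in data)


def line_edit(data: bytes) -> bytes:
    """Minimal line editing: a backspace (0x08) erases the previous byte."""
    out = bytearray()
    for b in data:
        if b == 0x08:
            if out:
                out.pop()
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    port = Stream()
    port.push(strip_parity)
    port.push(line_edit)
    print(port.read_up(b"ab\x08c\n"))
```

Each module in this sketch has exactly one neighbor below it and one above it. Nothing in the structure lets one module feed several upper streams or merge several lower ones, which is the fan-in and fan-out limitation noted above.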
Because AT&T will not allow others to include Streams unless they also change their interface to comply with the System V Interface Definition base and Networking Extension, we cannot use the Release 3 implementation of Streams in the Berkeley system. Given that compatibility thus will be difficult, we feel we will have complete freedom to make our choices based solely on technical merits. As a result, our implementation will appear far more like the simpler stackable line disciplines than the more complex Release 3 Streams [Chandler86]. A socket interface will be used rather than a character device interface, and demultiplexing will be handled internally by the protocols in the kernel. However, like Streams, the interfaces between kernel protocol modules will follow a uniform convention.

3. References

Babaoglu79  Babaoglu, O., W. Joy, ``Data Structures Added in the Berkeley Virtual Memory Extensions to the UNIX Operating System,'' Computer Systems Research Group, Dept of EECS, University of California, Berkeley, CA 94720, USA, November 1979.
Brownbridge82  Brownbridge, D.R., L.F. Marshall, B. Randell, ``The Newcastle Connection, or UNIXes of the World Unite!,'' Software -- Practice and Experience, Vol. 12, pp. 1147-1162, 1982.
Chandler86  Chandler, D., ``The Monthly Report - Up the Streams Without a Standard,'' UNIX Review, Vol. 4, No. 9, pp. 6-14, September 1986.
Cole85  Cole, C.T., P.B. Flinn, A.B. Atlas, ``An Implementation of an Extended File System for UNIX,'' Usenix Conference Proceedings, pp. 131-150, June, 1985.
Joy83  Joy, W., E. Cooper, R. Fabry, S. Leffler, M. McKusick, D. Mosher, ``4.2BSD System Manual,'' 4.2BSD UNIX Programmer's Manual, Vol 2c, Document #68, August 1983.
Karels86  Karels, M., M. McKusick, ``Towards a Compatible File System Interface,'' Proceedings of the European UNIX Users Group Meeting, Manchester, England, pp. 481-496, September 1986.
Kleiman86  Kleiman, S., ``Vnodes: An Architecture for Multiple File System Types in Sun UNIX,'' Usenix Conference Proceedings, pp. 238-247, June, 1986.
Leffler84  Leffler, S., M.K. McKusick, M. Karels, ``Measuring and Improving the Performance of 4.2BSD,'' Usenix Conference Proceedings, pp. 237-252, June, 1984.
McKusick85  McKusick, M.K., M. Karels, S. Leffler, ``Performance Improvements and Functional Enhancements in 4.3BSD,'' Usenix Conference Proceedings, pp. 519-531, June, 1985.
McKusick86  McKusick, M., M. Karels, ``A New Virtual Memory Implementation for Berkeley UNIX,'' Proceedings of the European UNIX Users Group Meeting, Manchester, England, pp. 451-460, September 1986.
Someren84  Someren, J. van, ``Paging in Berkeley UNIX,'' Laboratorium voor schakeltechniek en techniek v.d. informatieverwerkende machines, Codenummer 051560-44(1984)01, February 1984.
Rifkin86  Rifkin, A.P., M.P. Forbes, R.L. Hamilton, M. Sabrio, S. Shah, K. Yueh, ``RFS Architectural Overview,'' Usenix Conference Proceedings, pp. 248-259, June, 1986.
Ritchie74  Ritchie, D.M., K. Thompson, ``The Unix Time-Sharing System,'' Communications of the ACM, Vol. 17, pp. 365-375, July, 1974.
Ritchie84  Ritchie, D.M., ``A Stream Input-Output System,'' AT&T Bell Laboratories Technical Journal, Vol 63, No 8, Part 2, pp. 1897-1910, October 1984.
Rodriguez86  Rodriguez, R., M. Koehler, R. Hyde, ``The Generic File System,'' Usenix Conference Proceedings, pp. 260-269, June, 1986.
Sandberg85  Sandberg, R., D. Goldberg, S. Kleiman, D. Walsh, B. Lyon, ``Design and Implementation of the Sun Network File System,'' Usenix Conference Proceedings, pp. 119-130, June, 1985.
Satyanarayanan85  Satyanarayanan, M., et al., ``The ITC Distributed File System: Principles and Design,'' Proc. 10th Symposium on Operating Systems Principles, pp. 35-50, ACM, December, 1985.
Walker85  Walker, B.J. and S.H. Kiser, ``The LOCUS Distributed File System,'' The LOCUS Distributed System Architecture, G.J. Popek and B.J. Walker, ed., The MIT Press, Cambridge, MA, 1985.
Weinberger84  Weinberger, P.J., ``The Version 8 Network File System,'' Usenix Conference presentation, June, 1984.