Berkeley Software Architecture Manual
4.4BSD Edition
William Joy, Robert Fabry,
Samuel Leffler, M. Kirk McKusick,
Michael Karels
Computer Systems Research Group
Computer Science Division
Department of Electrical Engineering and Computer Science
University of California, Berkeley
Berkeley, CA 94720
ABSTRACT
This document summarizes the facilities pro-
vided by the 4.4BSD version of the UNIX* operating
system. It does not attempt to act as a tutorial
for use of the system nor does it attempt to
explain or justify the design of the system facil-
ities. It gives neither motivation nor implementa-
tion details, in favor of brevity.
The first section describes the basic kernel
functions provided to a UNIX process: process nam-
ing and protection, memory management, software
interrupts, object references (descriptors), time
and statistics functions, and resource controls.
These facilities, as well as facilities for
bootstrap, shutdown and process accounting, are
provided solely by the kernel.
The second section describes the standard
system abstractions for files and file systems,
communication, terminal handling, and process con-
trol and debugging. These facilities are imple-
mented by the operating system or by network
server processes.
_________________________
* UNIX is a trademark of Bell Laboratories.
PSD:5-2 4.4BSD Architecture Manual
TABLE OF CONTENTS
Introduction.
0. Notation and types
1. Kernel primitives
1.1. Processes and protection
1.1.1. Host and process identifiers
1.1.2. Process creation and termination
1.1.3. User and group ids
1.1.4. Process groups
1.2. Memory management
1.2.1. Text, data and stack
1.2.2. Mapping pages
1.2.3. Page protection control
1.2.4. Giving and getting advice
1.2.5. Synchronization primitives
1.3. Signals
1.3.1. Overview
1.3.2. Signal types
1.3.3. Signal handlers
1.3.4. Sending signals
1.3.5. Protecting critical sections
1.3.6. Signal stacks
1.4. Timing and statistics
1.4.1. Real time
1.4.2. Interval time
1.5. Descriptors
1.5.1. The reference table
1.5.2. Descriptor properties
1.5.3. Managing descriptor references
1.5.4. Multiplexing requests
1.5.5. Descriptor wrapping
1.6. Resource controls
1.6.1. Process priorities
1.6.2. Resource utilization
1.6.3. Resource limits
1.7. System operation support
1.7.1. Bootstrap operations
1.7.2. Shutdown operations
1.7.3. Accounting
2. System facilities
2.1. Generic operations
2.1.1. Read and write
2.1.2. Input/output control
2.1.3. Non-blocking and asynchronous operations
2.2. File system
2.2.1. Overview
2.2.2. Naming
2.2.3. Creation and removal
2.2.3.1. Directory creation and removal
2.2.3.2. File creation
2.2.3.3. Creating references to devices
2.2.3.4. Portal creation
2.2.3.6. File, device, and portal removal
2.2.4. Reading and modifying file attributes
2.2.5. Links and renaming
2.2.6. Extension and truncation
2.2.7. Checking accessibility
2.2.8. Locking
2.2.9. Disc quotas
2.3. Interprocess communication
2.3.1. Interprocess communication primitives
2.3.1.1. Communication domains
2.3.1.2. Socket types and protocols
2.3.1.3. Socket creation, naming and service establishment
2.3.1.4. Accepting connections
2.3.1.5. Making connections
2.3.1.6. Sending and receiving data
2.3.1.7. Scatter/gather and exchanging access rights
2.3.1.8. Using read and write with sockets
2.3.1.9. Shutting down halves of full-duplex connections
2.3.1.10. Socket and protocol options
2.3.2. UNIX domain
2.3.2.1. Types of sockets
2.3.2.2. Naming
2.3.2.3. Access rights transmission
2.3.3. INTERNET domain
2.3.3.1. Socket types and protocols
2.3.3.2. Socket naming
2.3.3.3. Access rights transmission
2.3.3.4. Raw access
2.4. Terminals and devices
2.4.1. Terminals
2.4.1.1. Terminal input
2.4.1.1.1 Input modes
2.4.1.1.2 Interrupt characters
2.4.1.1.3 Line editing
2.4.1.2. Terminal output
2.4.1.3. Terminal control operations
2.4.1.4. Terminal hardware support
2.4.2. Structured devices
2.4.3. Unstructured devices
2.5. Process control and debugging
I. Summary of facilities
0. Notation and types
The notation used to describe system calls is a variant
of a C language call, consisting of a prototype call fol-
lowed by declaration of parameters and results. An addi-
tional keyword result, not part of the normal C language, is
used to indicate which of the declared entities receive
results. As an example, consider the read call, as described
in section 2.1:
cc = read(fd, buf, nbytes);
result int cc; int fd; result char *buf; int nbytes;
The first line shows how the read routine is called, with
three parameters. As shown on the second line, cc is an
integer and read also returns information in the parameter
buf.
Description of all error conditions arising from each
system call is not provided here; they appear in the
programmer's manual. In particular, when accessed from the C
language, many calls return a characteristic -1 value when
an error occurs, returning the error code in the global
variable errno. Other languages may present errors in dif-
ferent ways.
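The error convention described above can be sketched in C as follows. This is only an illustration: the helper name read_or_report is hypothetical, and the modern POSIX types ssize_t and size_t are assumed in place of the int types shown in the notation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to nbytes from fd into buf.
 * Returns the count from read, or -1 after reporting errno on stderr.
 */
ssize_t read_or_report(int fd, char *buf, size_t nbytes)
{
    ssize_t cc;
    int saved;

    assert(buf != NULL);
    cc = read(fd, buf, nbytes);        /* cc and buf receive results */
    if (cc == -1) {                    /* characteristic error value */
        saved = errno;                 /* keep the code across stdio calls */
        fprintf(stderr, "read: %s\n", strerror(saved));
        errno = saved;
    }
    return cc;
}
```

A caller tests the return value against -1 first and consults errno only when the call has failed.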
A number of system standard types are defined in the
include file <sys/types.h> and used in the specifications
here and in many C programs. These include caddr_t giving a
memory address (typically as a character pointer), off_t
giving a file offset (typically as a long integer), and a
set of unsigned types u_char, u_short, u_int and u_long,
shorthand names for unsigned char, unsigned short, etc.
1. Kernel primitives
The facilities available to a UNIX user process are
logically divided into two parts: kernel facilities directly
implemented by UNIX code running in the operating system,
and system facilities implemented either by the system, or
in cooperation with a server process. The kernel facilities
are described in this section.
The facilities implemented in the kernel are those
which define the UNIX virtual machine in which each process
runs. Like many real machines, this virtual machine has
memory management hardware, an interrupt facility, timers
and counters. The UNIX virtual machine also allows access
to files and other objects through a set of descriptors.
Each descriptor resembles a device controller, and supports
a set of operations. Just as some devices on a real machine
are internal to the machine and others are external, parts
of the descriptor machinery are built into the operating
system, while other parts are often implemented in server
processes on other machines. The facilities
provided through the descriptor machinery are described in
section 2.
1.1. Processes and protection
1.1.1. Host and process identifiers
Each UNIX host has associated with it a 32-bit host id,
and a host name of up to 256 characters (as defined by
MAXHOSTNAMELEN in <sys/param.h>). These are set (by a
privileged user) and returned by the calls:
sethostid(hostid)
long hostid;
hostid = gethostid();
result long hostid;
sethostname(name, len)
char *name; int len;
len = gethostname(buf, buflen)
result int len; result char *buf; int buflen;
On each host runs a set of processes. Each process is
largely independent of other processes, having its own pro-
tection domain, address space, timers, and an independent
set of references to system or user implemented objects.
Each process in a host is named by an integer called
the process id. This number is in the range 1-30000 and is
returned by the getpid routine:
pid = getpid();
result int pid;
On each UNIX host this identifier is guaranteed to be
unique; in a multi-host environment, the (hostid, process
id) pairs are guaranteed unique.
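As a rough illustration of naming a process within a multi-host environment, the sketch below combines the host name with the process id. The helper name host_process_name and the buffer size NAMEBUF are assumptions made for this example, and it assumes the modern convention in which gethostname returns 0 on success rather than a length.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define NAMEBUF 257   /* assumed: 256 characters plus a terminating NUL */

/*
 * Hypothetical helper: format "hostname:pid" into out.
 * Returns 0 on success, -1 if gethostname fails or out is too small.
 */
int host_process_name(char *out, size_t outlen)
{
    char host[NAMEBUF];
    int n;

    assert(out != NULL);
    if (gethostname(host, sizeof host) == -1)
        return -1;
    host[sizeof host - 1] = '\0';      /* a truncated name may lack a NUL */
    n = snprintf(out, outlen, "%s:%ld", host, (long)getpid());
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```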
1.1.2. Process creation and termination
A new process is created by making a logical duplicate
of an existing process:
pid = fork();
result int pid;
The fork call returns twice, once in the parent process,
where pid is the process identifier of the child, and once
in the child process where pid is 0. The parent-child rela-
tionship induces a hierarchical structure on the set of
processes in the system.
A process may terminate by executing an exit call:
exit(status)
int status;
returning 8 bits of exit status to its parent.
When a child process exits or terminates abnormally,
the parent process may use the wait call to receive
information about the event that caused termination of the
child process. A second call, wait3, provides a non-blocking
interface and may also be used to retrieve information about
resources consumed by the process during its lifetime.
#include <sys/wait.h>
pid = wait(astatus);
result int pid; result union wait *astatus;
pid = wait3(astatus, options, arusage);
result int pid; result union wait *astatus;
int options; result struct rusage *arusage;
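The creation and termination calls above might be combined as in the following sketch. It assumes the POSIX interface found on current systems, in which the status is an int examined with WIFEXITED and WEXITSTATUS rather than a union wait, and it uses waitpid so that only the child created here is collected. The helper name child_exit_status is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a child that exits with the given status
 * and return the 8 bits of exit status the parent collects, or -1 on
 * error.
 */
int child_exit_status(int status)
{
    int wstatus;
    pid_t pid = fork();

    if (pid == -1)
        return -1;                 /* fork failed; no child was created */
    if (pid == 0)
        _exit(status);             /* child: fork returned 0 */

    /* parent: pid is the process identifier of the child */
    if (waitpid(pid, &wstatus, 0) != pid)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Only the low-order 8 bits of the value passed to exit reach the parent, so a status of 259 is seen as 3.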
A process can overlay itself with the memory image of
another program, passing the new image a set of
parameters, using the call:
execve(name, argv, envp)
char *name, **argv, **envp;
The specified name must be a file which is in a format
recognized by the system, either a binary executable file or
a file which causes the execution of a specified interpreter
program to process its contents.
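A minimal sketch of overlaying a child process with another program follows. The path /bin/sh, the environment string, and the helper name run_shell are assumptions made for this example; the status handling again assumes the POSIX int status word.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run "/bin/sh -c command" in a child process with
 * a minimal environment and return its exit status, or -1 on error.
 * The path /bin/sh is an assumption about the host system.
 */
int run_shell(const char *command)
{
    char *argv[] = { "sh", "-c", (char *)command, NULL };
    char *envp[] = { "PATH=/bin:/usr/bin", NULL };
    int wstatus;
    pid_t pid;

    assert(command != NULL);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve("/bin/sh", argv, envp);   /* returns only on failure */
        _exit(127);
    }
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```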
1.1.3. User and group ids
Each process in the system has associated with it two
user-id's: a real user id and an effective user id, both 16
bit unsigned integers (type uid_t). Each process has a real
accounting group id and an effective accounting group id and
a set of access group id's. The group id's are 16 bit
unsigned integers (type gid_t). Each process may be in
several different access groups, with the maximum concurrent
number of access groups a system compilation parameter, the
constant NGROUPS in the file <sys/param.h>, guaranteed to be
at least 8.
The real and effective user ids associated with a pro-
cess are returned by:
ruid = getuid();
result uid_t ruid;
euid = geteuid();
result uid_t euid;
the real and effective accounting group ids by:
rgid = getgid();
result gid_t rgid;
egid = getegid();
result gid_t egid;
The access group id set is returned by a getgroups call*:
ngroups = getgroups(gidsetsize, gidset);
result int ngroups; int gidsetsize; result int gidset[gidsetsize];
The user and group id's are assigned at login time
using the setreuid, setregid, and setgroups calls:
setreuid(ruid, euid);
int ruid, euid;
setregid(rgid, egid);
int rgid, egid;
setgroups(gidsetsize, gidset)
int gidsetsize; int gidset[gidsetsize];
The setreuid call sets both the real and effective user-
id's, while the setregid call sets both the real and effec-
tive accounting group id's. Unless the caller is the super-
user, ruid must be equal to either the current real or
effective user-id, and rgid equal to either the current real
or effective accounting group id. The setgroups call is
restricted to the super-user.
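The id calls above can be used to test group membership, as in the sketch below. It assumes the gid_t array type used by getgroups on current systems rather than the int array shown above; the helper name in_access_groups and the array bound of 1024 are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if gid is the effective group id or
 * appears in the access group set, 0 if not, -1 on error.
 */
int in_access_groups(gid_t gid)
{
    gid_t set[1024];   /* assumed bound; the real limit is system-defined */
    int i, n;

    if (gid == getegid())
        return 1;
    n = getgroups((int)(sizeof set / sizeof set[0]), set);
    if (n == -1)
        return -1;     /* for example, more groups than the array holds */
    for (i = 0; i < n; i++)
        if (set[i] == gid)
            return 1;
    return 0;
}
```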
1.1.4. Process groups
Each process in the system is also normally associated
with a process group. The group of processes in a process
group is sometimes referred to as a job and manipulated by
high-level system software (such as the shell). The current
process group of a process is returned by the getpgrp call:
_________________________
* The type of the gidset array in getgroups and
setgroups remains integer for compatibility with 4.2BSD.
It may change to gid_t in future releases.
pgrp = getpgrp(pid);
result int pgrp; int pid;
When a process is in a specific process group it may receive
software interrupts affecting the group, causing the group
to suspend or resume execution or to be interrupted or ter-
minated. In particular, a system terminal has a process
group and only processes which are in the process group of
the terminal may read from the terminal, allowing arbitra-
tion of terminals among several different jobs.
The process group associated with a process may be
changed by the setpgrp call:
setpgrp(pid, pgrp);
int pid, pgrp;
Newly created processes are assigned process id's distinct
from all processes and process groups, and the same process
group as their parent. A normal (unprivileged) process may
set its process group equal to its process id. A privileged
process may set the process group of any process to any
value.
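The rule that an unprivileged process may set its process group equal to its process id can be sketched as follows. The example assumes the POSIX calls setpgid and the argument-less getpgrp, which differ from the setpgrp(pid, pgrp) and getpgrp(pid) forms shown above. The helper name child_becomes_group_leader is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that places itself in a new process
 * group whose id equals its own process id. Returns 1 if the child then
 * observed pgrp == pid, 0 if it did not, -1 on error.
 */
int child_becomes_group_leader(void)
{
    int wstatus;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* the child starts in its parent's group, then changes it */
        if (setpgid(0, 0) == -1)
            _exit(2);
        _exit(getpgrp() == getpid() ? 1 : 0);
    }
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus) == 2 ? -1 : WEXITSTATUS(wstatus);
}
```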
1.2. Memory management-
1.2.1. Text, data and stack
Each process begins execution with three logical areas
of memory called text, data and stack. The text area is
read-only and shared, while the data and stack areas are
private to the process. Both the data and stack areas may
be extended and contracted on program request. The call
addr = sbrk(incr);
result caddr_t addr; int incr;
changes the size of the data area by incr bytes and returns
the new end of the data area, while
addr = sstk(incr);
result caddr_t addr; int incr;
changes the size of the stack area. The stack area is also
automatically extended as needed. On the VAX the text and
data areas are adjacent in the P0 region, while the stack
section is in the P1 region, and grows downward.
1.2.2. Mapping pages
The system supports sharing of data between processes
by allowing pages to be mapped into memory. These mapped
pages may be shared with other processes or private to the
process. Protection and sharing options are defined in
<sys/mman.h> as:
/* protections are chosen from these bits, or-ed together */
#define PROT_READ 0x04 /* pages can be read */
#define PROT_WRITE 0x02 /* pages can be written */
#define PROT_EXEC 0x01 /* pages can be executed */
/* flags contain mapping type, sharing type and options */
/* mapping type; choose one */
#define MAP_FILE 0x0001 /* mapped from a file or device */
#define MAP_ANON 0x0002 /* allocated from memory, swap space */
#define MAP_TYPE 0x000f /* mask for type field */
_________________________
- This section represents the interface planned for
later releases of the system. Of the calls described
in this section, only sbrk and getpagesize are included
in 4.3BSD.
/* sharing types; choose one */
#define MAP_SHARED 0x0010 /* share changes */
#define MAP_PRIVATE 0x0000 /* changes are private */
/* other flags */
#define MAP_FIXED 0x0020 /* map addr must be exactly as requested */
#define MAP_INHERIT 0x0040 /* region is retained after exec */
#define MAP_HASSEMAPHORE 0x0080 /* region may contain semaphores */
#define MAP_NOPREALLOC 0x0100 /* do not preallocate space */
The cpu-dependent size of a page is returned by the
getpagesize system call:
pagesize = getpagesize();
result int pagesize;
The call:
maddr = mmap(addr, len, prot, flags, fd, pos);
result caddr_t maddr; caddr_t addr; int *len, prot, flags, fd; off_t pos;
causes the pages starting at addr and continuing for at most
len bytes to be mapped from the object represented by
descriptor fd, starting at byte offset pos. The starting
address of the region is returned; for the convenience of
the system, it may differ from that supplied unless the
MAP_FIXED flag is given, in which case the exact address
will be used or the call will fail. The actual amount mapped
is returned in len. The addr, len, and pos parameters must
all be multiples of the pagesize. A successful mmap will
delete any previous mapping in the allocated address range.
The parameter prot specifies the accessibility of the mapped
pages. The parameter flags specifies the type of object to
be mapped, mapping options, and whether modifications made
to this mapped copy of the page are to be kept private, or
are to be shared with other references. Possible types
include MAP_FILE, mapping a regular file or character-
special device memory, and MAP_ANON, which maps memory not
associated with any specific file. The file descriptor used
for creating MAP_ANON regions is used only for naming, and
may be given as -1 if no name is associated with the
region.= The MAP_INHERIT flag allows a region to be inher-
ited after an exec. The MAP_HASSEMAPHORE flag allows special
handling for regions that may contain semaphores. The
_________________________
= The current design does not allow a process to speci-
fy the location of swap space. In the future we may de-
fine an additional mapping type, MAP_SWAP, in which the
file descriptor argument specifies a file or device to
which swapping should be done.
MAP_NOPREALLOC flag allows processes to allocate regions
whose virtual address space, if fully allocated, would
exceed the available memory plus swap resources. Processes
using such regions may receive a SIGSEGV signal if they page
fault and resources are not available to service the request;
typically they would free up some resources via munmap so
that when they return from the signal the page fault can be
successfully completed.
A facility is provided to synchronize a mapped region
with the file it maps; the call
msync(addr, len);
caddr_t addr; int len;
writes any modified pages back to the filesystem and updates
the file modification time. If len is 0, all modified pages
within the region containing addr will be flushed; if len is
non-zero, only the pages containing addr and len succeeding
locations will be examined. Any required synchronization of
memory caches will also take place at this time. Filesystem
operations on a file that is mapped for shared modifications
are unpredictable except after an msync.
A mapping can be removed by the call
munmap(addr, len);
caddr_t addr; int len;
This call deletes the mappings for the specified address
range, and causes further references to addresses within the
range to generate invalid memory references.
1.2.3. Page protection control
A process can control the protection of pages using the
call
mprotect(addr, len, prot);
caddr_t addr; int len, prot;
This call changes the specified pages to have protection
prot. Not all implementations will guarantee protection on a
page basis; the granularity of protection changes may be as
large as an entire region.
1.2.4. Giving and getting advice
A process that has knowledge of its memory behavior may
use the madvise call:
madvise(addr, len, behav);
caddr_t addr; int len, behav;
Behav describes expected behavior, as given in <sys/mman.h>:
#define MADV_NORMAL 0 /* no further special treatment */
#define MADV_RANDOM 1 /* expect random page references */
#define MADV_SEQUENTIAL 2 /* expect sequential references */
#define MADV_WILLNEED 3 /* will need these pages */
#define MADV_DONTNEED 4 /* don't need these pages */
#define MADV_SPACEAVAIL 5 /* ensure that resources are reserved */
Finally, a process may obtain information about whether
pages are core resident by using the call
mincore(addr, len, vec)
caddr_t addr; int len; result char *vec;
Here the current core residency of the pages is returned in
the character array vec, with a value of 1 meaning that the
page is in-core.
1.2.5. Synchronization primitives
Primitives are provided for synchronization using sema-
phores in shared memory. Semaphores must lie within a
MAP_SHARED region with at least modes PROT_READ and
PROT_WRITE. The MAP_HASSEMAPHORE flag must have been speci-
fied when the region was created. To acquire a lock a pro-
cess calls:
value = mset(sem, wait)
result int value; semaphore *sem; int wait;
Mset indivisibly tests and sets the semaphore sem. If the
previous value is zero, the process has acquired the lock
and mset returns true immediately. Otherwise, if the wait
flag is zero, failure is returned. If wait is true and the
previous value is non-zero, mset relinquishes the processor
until notified that it should retry.
To release a lock a process calls:
mclear(sem)
semaphore *sem;
Mclear indivisibly tests and clears the semaphore sem. If
the ``WANT'' flag is zero in the previous value, mclear
returns immediately. If the ``WANT'' flag is non-zero in the
previous value, mclear arranges for waiting processes to
retry before returning.
Two routines provide services analogous to the kernel
sleep and wakeup functions interpreted in the domain of
shared memory. A process may relinquish the processor by
calling msleep with a set semaphore:
msleep(sem)
semaphore *sem;
If the semaphore is still set when it is checked by the ker-
nel, the process will be put in a sleeping state until some
other process issues an mwakeup for the same semaphore
within the region using the call:
mwakeup(sem)
semaphore *sem;
An mwakeup may awaken all sleepers on the semaphore, or may
awaken only the next sleeper on a queue.
1.3. Signals
1.3.1. Overview
The system defines a set of signals that may be
delivered to a process. Signal delivery resembles the
occurrence of a hardware interrupt: the signal is blocked
from further occurrence, the current process context is
saved, and a new one is built. A process may specify the
handler to which a signal is delivered, or specify that the
signal is to be blocked or ignored. A process may also
specify that a default action is to be taken when signals
occur.
Some signals will cause a process to exit when they are
not caught. This may be accompanied by creation of a core
image file, containing the current memory image of the pro-
cess for use in post-mortem debugging. A process may choose
to have signals delivered on a special stack, so that
sophisticated software stack manipulations are possible.
All signals have the same priority. If multiple sig-
nals are pending simultaneously, the order in which they are
delivered to a process is implementation specific. Signal
routines execute with the signal that caused their invoca-
tion blocked, but other signals may yet occur. Mechanisms
are provided whereby critical sections of code may protect
themselves against the occurrence of specified signals.
1.3.2. Signal types
The signals defined by the system fall into one of five
classes: hardware conditions, software conditions,
input/output notification, process control, or resource con-
trol. The set of signals is defined in the file <signal.h>.
Hardware signals are derived from exceptional condi-
tions which may occur during execution. Such signals
include SIGFPE representing floating point and other arith-
metic exceptions, SIGILL for illegal instruction execution,
SIGSEGV for addresses outside the currently assigned area of
memory, and SIGBUS for accesses that violate memory protec-
tion constraints. Other, more cpu-specific hardware signals
exist, such as those for the various customer-reserved
instructions on the VAX (SIGIOT, SIGEMT, and SIGTRAP).
Software signals reflect interrupts generated by user
request: SIGINT for the normal interrupt signal; SIGQUIT for
the more powerful quit signal, that normally causes a core
image to be generated; SIGHUP and SIGTERM that cause grace-
ful process termination, either because a user has ``hung
up'', or by user or program request; and SIGKILL, a more
powerful termination signal which a process cannot catch or
ignore. Programs may define their own asynchronous events
using SIGUSR1 and SIGUSR2. Other software signals (SIGALRM,
SIGVTALRM, SIGPROF) indicate the expiration of interval
timers.
A process can request notification via a SIGIO signal
when input or output is possible on a descriptor, or when a
non-blocking operation completes. A process may request to
receive a SIGURG signal when an urgent condition arises.
A process may be stopped by a signal sent to it or the
members of its process group. The SIGSTOP signal is a
powerful stop signal, because it cannot be caught. Other
stop signals SIGTSTP, SIGTTIN, and SIGTTOU are used when a
user request, input request, or output request respectively
is the reason for stopping the process. A SIGCONT signal is
sent to a process when it is continued from a stopped state.
Processes may receive notification with a SIGCHLD signal
when a child process changes state, either by stopping or by
terminating.
Exceeding resource limits may cause signals to be gen-
erated. SIGXCPU occurs when a process nears its CPU time
limit and SIGXFSZ warns that the limit on file size creation
has been reached.
1.3.3. Signal handlers
A process has a handler associated with each signal.
The handler controls the way the signal is delivered. The
call
#include <signal.h>
struct sigvec {
int (*sv_handler)();
int sv_mask;
int sv_flags;
};
sigvec(signo, sv, osv)
int signo; struct sigvec *sv; result struct sigvec *osv;
assigns interrupt handler address sv_handler to signal
signo. Each handler address specifies either an interrupt
routine for the signal, that the signal is to be ignored, or
that a default action (usually process termination) is to
occur if the signal occurs. The constants SIG_IGN and
SIG_DFL used as values for sv_handler cause ignoring or
defaulting of a condition. The sv_mask value specifies the
signal mask to be used when the handler is invoked; it
implicitly includes the signal which invoked the handler.
Signal masks include one bit for each signal; the mask for a
signal signo is provided by the macro sigmask(signo), from
<signal.h>. Sv_flags specifies whether system calls should
be restarted if the signal handler returns and whether the
handler should operate on the normal run-time stack or a
special signal stack (see below). If osv is non-zero, the
previous signal vector is returned.
When a signal condition arises for a process, the sig-
nal is added to a set of signals pending for the process. If
the signal is not currently blocked by the process then it
will be delivered. The process of signal delivery adds the
signal to be delivered and those signals specified in the
associated signal handler's sv_mask to a set of those masked
for the process, saves the current process context, and
places the process in the context of the signal handling
routine. The call is arranged so that if the signal han-
dling routine exits normally the signal mask will be
restored and the process will resume execution in the origi-
nal context. If the process wishes to resume in a different
context, then it must arrange to restore the signal mask
itself.
The mask of blocked signals is independent of handlers
for signals. It delays signals from being delivered much as
a raised hardware interrupt priority level delays hardware
interrupts. Preventing an interrupt from occurring by chang-
ing the handler is analogous to disabling a device from
further interrupts.
The signal handling routine sv_handler is called by a C
call of the form
(*sv_handler)(signo, code, scp);
int signo; long code; struct sigcontext *scp;
The signo gives the number of the signal that occurred, and
the code, a word of information supplied by the hardware.
The scp parameter is a pointer to a machine-dependent struc-
ture containing the information for restoring the context
before the signal.
1.3.4. Sending signals
A process can send a signal to another process or group
of processes with the calls:
kill(pid, signo)
int pid, signo;
killpgrp(pgrp, signo)
int pgrp, signo;
Unless the process sending the signal is privileged, it must
have the same effective user id as the process receiving the
signal.
Signals are also sent implicitly from a terminal device
to the process group associated with the terminal when cer-
tain input characters are typed.
1.3.5. Protecting critical sections
To block a section of code against one or more signals,
a sigblock call may be used to add a set of signals to the
existing mask, returning the old mask:
oldmask = sigblock(mask);
result long oldmask; long mask;
The old mask can then be restored later with sigsetmask,
oldmask = sigsetmask(mask);
result long oldmask; long mask;
The sigblock call can be used to read the current mask by
specifying an empty mask.
A process can check conditions with some signals blocked
and then atomically install a different mask and pause until
a signal arrives by using:
sigpause(mask);
long mask;
1.3.6. Signal stacks
Applications that maintain complex or fixed size stacks
can use the call
struct sigstack {
caddr_t ss_sp;
int ss_onstack;
};
sigstack(ss, oss)
struct sigstack *ss; result struct sigstack *oss;
to provide the system with a stack based at ss_sp for
delivery of signals. The value ss_onstack indicates whether
the process is currently on the signal stack, a notion main-
tained in software by the system.
When a signal is to be delivered, the system checks
whether the process is on a signal stack. If not, then the
process is switched to the signal stack for delivery, with
the return from the signal arranged to restore the previous
stack.
If the process wishes to take a non-local exit from the
signal routine, or run code from the signal stack that uses
a different stack, a sigstack call should be used to reset
the signal stack.
1.4. Timers
1.4.1. Real time
The system's notion of the current Greenwich time and
the current time zone is set and returned by the
calls:
#include <sys/time.h>
settimeofday(tp, tzp);
struct timeval *tp;
struct timezone *tzp;
gettimeofday(tp, tzp);
result struct timeval *tp;
result struct timezone *tzp;
where the structures are defined in <sys/time.h> as:
struct timeval {
long tv_sec; /* seconds since Jan 1, 1970 */
long tv_usec; /* and microseconds */
};
struct timezone {
int tz_minuteswest; /* of Greenwich */
int tz_dsttime; /* type of dst correction to apply */
};
The precision of the system clock is hardware dependent.
Earlier versions of UNIX contained only a 1-second resolu-
tion version of this call, which remains as a library rou-
tine:
time(tvsec)
result long *tvsec;
returning only the tv_sec field from the gettimeofday call.
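A common use of the timeval structure is measuring elapsed time between two gettimeofday calls. The sketch below shows the arithmetic on the two fields; the helper name timeval_diff_usec is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/*
 * Hypothetical helper: return end - start in microseconds.
 * Both values are assumed to come from gettimeofday.
 */
long long timeval_diff_usec(const struct timeval *start,
                            const struct timeval *end)
{
    assert(start != NULL && end != NULL);
    return ((long long)end->tv_sec - (long long)start->tv_sec) * 1000000LL
         + ((long long)end->tv_usec - (long long)start->tv_usec);
}
```

Because the microsecond difference may be negative, the two fields are combined in one signed expression rather than subtracted separately.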
1.4.2. Interval time
The system provides each process with three interval
timers, defined in <sys/time.h>:
#define ITIMER_REAL 0 /* real time intervals */
#define ITIMER_VIRTUAL 1 /* virtual time intervals */
#define ITIMER_PROF 2 /* user and system virtual time */
The ITIMER_REAL timer decrements in real time. It could be
used by a library routine to maintain a wakeup service
queue. A SIGALRM signal is delivered when this timer
expires.
The ITIMER_VIRTUAL timer decrements in process virtual
time. It runs only when the process is executing. A
SIGVTALRM signal is delivered when it expires.
The ITIMER_PROF timer decrements both in process vir-
tual time and when the system is running on behalf of the
process. It is designed to be used by processes to statisti-
cally profile their execution. A SIGPROF signal is delivered
when it expires.
A timer value is defined by the itimerval structure:
struct itimerval {
struct timeval it_interval; /* timer interval */
struct timeval it_value; /* current value */
};
and a timer is set or read by the call:
getitimer(which, value);
int which; result struct itimerval *value;
setitimer(which, value, ovalue);
int which; struct itimerval *value; result struct itimerval *ovalue;
The third argument to setitimer specifies an optional struc-
ture to receive the previous contents of the interval timer.
A timer can be disabled by specifying a timer value of 0.
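Setting, reading and disabling an interval timer might look like the following sketch. The timer is armed with a long value and disabled before it can expire, so no SIGALRM handler is installed. The helper name arm_and_read_back is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/*
 * Hypothetical helper: arm ITIMER_REAL as a one-shot timer for the given
 * number of seconds, read it back, and disable it before it can expire.
 * Returns the remaining microseconds reported by getitimer, or -1 on
 * error.
 */
long long arm_and_read_back(long seconds)
{
    struct itimerval set, cur, off;
    int rc;

    memset(&set, 0, sizeof set);       /* it_interval of 0: no reload */
    set.it_value.tv_sec = seconds;
    memset(&off, 0, sizeof off);       /* a timer value of 0 disables */

    if (setitimer(ITIMER_REAL, &set, NULL) == -1)
        return -1;
    rc = getitimer(ITIMER_REAL, &cur);

    /* always disable the timer, even if the read failed */
    if (setitimer(ITIMER_REAL, &off, NULL) == -1 || rc == -1)
        return -1;

    return (long long)cur.it_value.tv_sec * 1000000LL
         + (long long)cur.it_value.tv_usec;
}
```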
The system rounds argument timer intervals to be not
less than the resolution of its clock. This clock resolu-
tion can be determined by loading a very small value into a
timer and reading the timer back to see what value resulted.
The alarm system call of earlier versions of UNIX is
provided as a library routine using the ITIMER_REAL timer.
The process profiling facilities of earlier versions of UNIX
remain because it is not always possible to guarantee the
automatic restart of system calls after receipt of a signal.
The profil call arranges for the kernel to begin gathering
execution statistics for a process:
profil(buf, bufsize, offset, scale);
result char *buf; int bufsize, offset, scale;
This begins sampling of the program counter, with statistics
maintained in the user-provided buffer.
1.5. Descriptors
1.5.1. The reference table
Each process has access to resources through descrip-
tors. Each descriptor is a handle allowing the process to
reference objects such as files, devices and communications
links.
Rather than allowing processes direct access to
descriptors, the system introduces a level of indirection,
so that descriptors may be shared between processes. Each
process has a descriptor reference table, containing
pointers to the actual descriptors. The descriptors them-
selves thus have multiple references, and are reference
counted by the system.
Each process has a fixed size descriptor reference
table, where the size is returned by the getdtablesize call:
nds = getdtablesize();
result int nds;
and guaranteed to be at least 20. The entries in the
descriptor reference table are referred to by small
integers; for example if there are 20 slots they are num-
bered 0 to 19.
1.5.2. Descriptor properties
Each descriptor has a logical set of properties main-
tained by the system and defined by its type. Each type sup-
ports a set of operations; some operations, such as reading
and writing, are common to several abstractions, while oth-
ers are unique. The generic operations applying to many of
these types are described in section 2.1. Naming contexts,
files and directories are described in section 2.2. Section
2.3 describes communications domains and sockets. Terminals
and (structured and unstructured) devices are described in
section 2.4.
1.5.3. Managing descriptor references
A duplicate of a descriptor reference may be made by
doing
new = dup(old);
result int new; int old;
returning a copy of descriptor reference old indistinguish-
able from the original. The new chosen by the system will
be the smallest unused descriptor reference slot. A copy of
a descriptor reference may be made in a specific slot by
doing
dup2(old, new);
int old, new;
The dup2 call causes the system to deallocate the descriptor
reference currently occupying slot new, if any, replacing it
with a reference to the same descriptor as old. This deallo-
cation is also performed by:
close(old);
int old;
1.5.4. Multiplexing requests
The system provides a standard way to do synchronous
and asynchronous multiplexing of operations.
Synchronous multiplexing is performed by using the
select call to examine the state of multiple descriptors
simultaneously, and to wait for state changes on those
descriptors. Sets of descriptors of interest are specified
as bit masks, as follows:
#include <sys/types.h>
nds = select(nd, in, out, except, tvp);
result int nds; int nd; result fd_set *in, *out, *except;
struct timeval *tvp;
FD_ZERO(&fdset);
FD_SET(fd, &fdset);
FD_CLR(fd, &fdset);
FD_ISSET(fd, &fdset);
int fd; fd_set fdset;
The select call examines the descriptors specified by the
sets in, out and except, replacing the specified bit masks
by the subsets that select true for input, output, and
exceptional conditions respectively (nd indicates the number
of file descriptors specified by the bit masks). If any
descriptors meet the following criteria, then the number of
such descriptors is returned in nds and the bit masks are
updated.
* A descriptor selects for input if an input oriented
operation such as read or receive is possible, or if a
connection request may be accepted (see section
2.3.1.4).
* A descriptor selects for output if an output oriented
operation such as write or send is possible, or if an
operation that was ``in progress'', such as connection
establishment, has completed (see section 2.1.3).
* A descriptor selects for an exceptional condition if a
condition that would cause a SIGURG signal to be gen-
erated exists (see section 1.3.2), or other device-
specific events have occurred.
If none of the specified conditions is true, the operation
waits for one of the conditions to arise, blocking at most
the amount of time specified by tvp. If tvp is given as 0,
the select waits indefinitely.
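A minimal C sketch of synchronous multiplexing on a single
descriptor follows. The helper name wait_readable is
hypothetical, and the inclusion of <sys/select.h> is an
assumption about current systems.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait up to msec milliseconds for fd to select
 * true for input.  Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long msec)
{
    fd_set in;
    struct timeval tv;
    int nds;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;              /* fd would not fit in the bit mask */

    FD_ZERO(&in);
    FD_SET(fd, &in);
    tv.tv_sec = msec / 1000;
    tv.tv_usec = (msec % 1000) * 1000;

    /* nd is one more than the highest descriptor in any mask */
    nds = select(fd + 1, &in, NULL, NULL, &tv);
    if (nds < 0)
        return -1;
    return nds > 0 && FD_ISSET(fd, &in);
}
```

Passing a null tvp instead of &tv would make the call wait
indefinitely, as described above.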
Options affecting I/O on a descriptor may be read and
set by the call:
dopt = fcntl(d, cmd, arg)
result int dopt; int d, cmd, arg;
/* interesting values for cmd */
#define F_GETFL 3 /* get descriptor options */
#define F_SETFL 4 /* set descriptor options */
#define F_GETOWN 5 /* get descriptor owner (pid/pgrp) */
#define F_SETOWN 6 /* set descriptor owner (pid/pgrp) */
The F_SETFL cmd may be used to set a descriptor in non-
blocking I/O mode and/or enable signaling when I/O is possi-
ble. F_SETOWN may be used to specify a process or process
group to be signaled when using the latter mode of operation
or when urgent indications arise.
Operations on non-blocking descriptors will either com-
plete immediately, note an error EWOULDBLOCK, partially com-
plete an input or output operation returning a partial
count, or return an error EINPROGRESS noting that the
requested operation is in progress. A descriptor which has
signaling enabled will cause the specified process and/or
process group to be signaled, with a SIGIO for input, output,
or in-progress operation complete, or a SIGURG for excep-
tional conditions.
For example, when writing to a terminal using non-
blocking output, the system will accept only as much data as
there is buffer space for and return; when making a connec-
tion on a socket, the operation may return indicating that
the connection establishment is ``in progress''. The select
facility can be used to determine when further output is
possible on the terminal, or when the connection establish-
ment attempt is complete.
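The following C sketch places a descriptor in non-blocking
mode and reports the ``would block'' case separately from
other errors. The helper names are hypothetical, and the
sketch assumes the POSIX option name O_NONBLOCK for the
non-blocking flag.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: add the non-blocking option to fd, keeping the
 * other descriptor options.  Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: returns the byte count (0 at end of file),
 * -1 on error, or -2 if the read would have blocked. */
long read_nowait(int fd, void *buf, size_t nbytes)
{
    ssize_t cc = read(fd, buf, nbytes);

    if (cc < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
        return -2;
    return (long)cc;
}
```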
1.5.5. Descriptor wrapping.-
A user process may build descriptors of a specified
type by wrapping a communications channel with a system sup-
plied protocol translator:
new = wrap(old, proto)
result int new; int old; struct dprop *proto;
Operations on the descriptor new are then translated by the
system provided protocol translator into requests on the
underlying object old in a way defined by the protocol. The
protocols supported by the kernel may vary from system to
system and are described in the programmers manual.
Protocols may be based on communications multiplexing
or a rights-passing style of handling multiple requests made
on the same object. For instance, a protocol for implement-
ing a file abstraction may or may not include locally gen-
erated ``read-ahead'' requests. A protocol that provides
for read-ahead may provide higher performance but have a
more difficult implementation.
Another example is the terminal driving facilities.
Normally a terminal is associated with a communications
line, and the terminal type and standard terminal access
protocol are wrapped around a synchronous communications
line and given to the user. If a virtual terminal is
required, the terminal driver can be wrapped around a com-
munications link, the other end of which is held by a vir-
tual terminal protocol interpreter.
_________________________
- The facilities described in this section are not in-
cluded in 4.3BSD.
1.6. Resource controls
1.6.1. Process priorities
The system gives CPU scheduling priority to processes
that have not used CPU time recently. This tends to favor
interactive processes and processes that execute only for
short periods. It is possible to determine the priority
currently assigned to a process, process group, or the
processes of a specified user, or to alter this priority
using the calls:
#define PRIO_PROCESS 0 /* process */
#define PRIO_PGRP 1 /* process group */
#define PRIO_USER 2 /* user id */
prio = getpriority(which, who);
result int prio; int which, who;
setpriority(which, who, prio);
int which, who, prio;
The value prio is in the range -20 to 20. The default prior-
ity is 0; numerically lower values cause more favorable
execution. The getpriority call returns the highest priority
(lowest numerical value) enjoyed by any of the specified
processes. The setpriority call sets the priorities of all
of the specified processes to the specified value. Only the
super-user may lower the numerical value of a priority.
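Since -1 is a legitimate priority value, a caller of
getpriority has to consult errno to detect failure. The C
sketch below shows one way to do so. The helper names are
hypothetical, and the sketch assumes that a who value of 0
with PRIO_PROCESS denotes the calling process.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: store the caller's priority in *prio.
 * Returns 0 on success, -1 on error. */
int current_priority(int *prio)
{
    int value;

    errno = 0;                  /* -1 may be a valid priority */
    value = getpriority(PRIO_PROCESS, 0);
    if (value == -1 && errno != 0)
        return -1;
    *prio = value;
    return 0;
}

/* Hypothetical helper: make the caller less favored by delta steps,
 * which does not require privilege.  Returns 0 on success, -1 on error. */
int be_nicer(int delta)
{
    int prio;

    if (delta < 0 || current_priority(&prio) < 0)
        return -1;
    prio += delta;
    if (prio > 20)
        prio = 20;              /* stay within the documented range */
    return setpriority(PRIO_PROCESS, 0, prio);
}
```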
1.6.2. Resource utilization
The resources used by a process are returned by a
getrusage call, returning information in a structure defined
in <sys/resource.h>:
#define RUSAGE_SELF 0 /* usage by this process */
#define RUSAGE_CHILDREN -1 /* usage by all children */
getrusage(who, rusage)
int who; result struct rusage *rusage;
struct rusage {
struct timeval ru_utime; /* user time used */
struct timeval ru_stime; /* system time used */
int ru_maxrss; /* maximum core resident set size: kbytes */
int ru_ixrss; /* integral shared memory size (kbytes*sec) */
int ru_idrss; /* unshared data memory size */
int ru_isrss; /* unshared stack memory size */
int ru_minflt; /* page-reclaims */
int ru_majflt; /* page faults */
int ru_nswap; /* swaps */
int ru_inblock; /* block input operations */
int ru_oublock; /* block output operations */
int ru_msgsnd; /* messages sent */
int ru_msgrcv; /* messages received */
int ru_nsignals; /* signals received */
int ru_nvcsw; /* voluntary context switches */
int ru_nivcsw; /* involuntary context switches */
};
The who parameter specifies whose resource usage is to be
returned. The resources used by the current process, or by
all the terminated children of the current process may be
requested.
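As a brief C sketch, the user and system times in the rusage
structure can be combined into a single CPU time figure. The
helper name cpu_usec is hypothetical.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: total CPU time (user plus system) in microseconds
 * for who (RUSAGE_SELF or RUSAGE_CHILDREN), or -1 on error. */
long long cpu_usec(int who)
{
    struct rusage ru;

    if (getrusage(who, &ru) < 0)
        return -1;
    return ((long long)ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) * 1000000LL
        + ru.ru_utime.tv_usec + ru.ru_stime.tv_usec;
}
```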
1.6.3. Resource limits
The resources of a process for which limits are con-
trolled by the kernel are defined in <sys/resource.h>, and
controlled by the getrlimit and setrlimit calls:
#define RLIMIT_CPU 0 /* cpu time in milliseconds */
#define RLIMIT_FSIZE 1 /* maximum file size */
#define RLIMIT_DATA 2 /* maximum data segment size */
#define RLIMIT_STACK 3 /* maximum stack segment size */
#define RLIMIT_CORE 4 /* maximum core file size */
#define RLIMIT_RSS 5 /* maximum resident set size */
#define RLIM_NLIMITS 6
#define RLIM_INFINITY 0x7fffffff
struct rlimit {
int rlim_cur; /* current (soft) limit */
int rlim_max; /* hard limit */
};
getrlimit(resource, rlp)
int resource; result struct rlimit *rlp;
setrlimit(resource, rlp)
int resource; struct rlimit *rlp;
Only the super-user can raise the maximum limits. Other
users may only alter rlim_cur within the range from 0 to
rlim_max or (irreversibly) lower rlim_max.
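The C sketch below adjusts only the soft limit, leaving the
hard limit untouched so that the change can later be undone.
The helper name set_soft_limit is hypothetical, and the
sketch assumes the rlim_t type that current systems use for
the limit fields in place of int.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: set the soft limit for resource to want, clamped
 * to the current hard limit.  Returns 0 on success, -1 on error. */
int set_soft_limit(int resource, rlim_t want)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) < 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && want > rl.rlim_max)
        want = rl.rlim_max;     /* an ordinary user cannot exceed rlim_max */
    rl.rlim_cur = want;
    return setrlimit(resource, &rl);
}
```

For example, set_soft_limit(RLIMIT_CORE, 0) would suppress
core files for the calling process and its future children.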
1.7. System operation support
Unless noted otherwise, the calls in this section are
permitted only to a privileged user.
1.7.1. Bootstrap operations
The call
mount(blkdev, dir, ronly);
char *blkdev, *dir; int ronly;
extends the UNIX name space. The mount call specifies a
block device blkdev containing a UNIX file system to be made
available starting at dir. If ronly is set then the file
system is read-only; writes to the file system will not be
permitted and access times will not be updated when files
are referenced. Dir is normally a name in the root direc-
tory.
The call
swapon(blkdev, size);
char *blkdev; int size;
specifies a device to be made available for paging and swap-
ping.
1.7.2. Shutdown operations
The call
unmount(dir);
char *dir;
unmounts the file system mounted on dir. This call will
succeed only if the file system is not currently being used.
The call
sync();
schedules input/output to clean all system buffer caches.
(This call does not require privileged status.)
The call
reboot(how)
int how;
causes a machine halt or reboot. The call may request a
reboot by specifying how as RB_AUTOBOOT, or that the machine
be halted with RB_HALT. These constants are defined in
<sys/reboot.h>.
1.7.3. Accounting
The system optionally keeps an accounting record in a
file for each process that exits on the system. The format
of this record is beyond the scope of this document.
Accounting to a named file may be enabled by doing
acct(path);
char *path;
If path is null, then accounting is disabled. Otherwise,
the named file becomes the accounting file.
2. System facilities
This section discusses the system facilities that are
not considered part of the kernel.
The system abstractions described are:
Directory contexts
A directory context is a position in the UNIX file sys-
tem name space. Operations on files and other named
objects in a file system are always specified relative
to such a context.
Files
Files are used to store uninterpreted sequences of bytes
on which random access reads and writes may occur.
Pages from files may also be mapped into process
address space.- A directory may be read as a file.
Communications domains
A communications domain represents an interprocess com-
munications environment, such as the communications
facilities of the UNIX system, communications in the
INTERNET, or the resource sharing protocols and access
rights of a resource sharing system on a local network.
Sockets
A socket is an endpoint of communication and the focal
point for IPC in a communications domain. Sockets may
be created in pairs, or given names and used to rendez-
vous with other sockets in a communications domain,
accepting connections from these sockets or exchanging
messages with them. These operations model a labeled
or unlabeled communications graph, and can be used in a
wide variety of communications domains. Sockets can
have different types to provide different semantics of
communication, increasing the flexibility of the model.
Terminals and other devices
Devices include terminals, providing input editing and
interrupt generation and output flow control and edit-
ing, magnetic tapes, disks and other peripherals. They
often support the generic read and write operations as
well as a number of ioctls.
Processes
Process descriptors provide facilities for control and
debugging of other processes.
_________________________
- Support for mapping files is not included in the 4.3
release.
2.1. Generic operations
Many system abstractions support the operations read,
write and ioctl. We describe the basics of these common
primitives here. Similarly, the mechanisms whereby normally
synchronous operations may occur in a non-blocking or asyn-
chronous fashion are common to all system-defined abstrac-
tions and are described here.
2.1.1. Read and write
The read and write system calls can be applied to com-
munications channels, files, terminals and devices. They
have the form:
cc = read(fd, buf, nbytes);
result int cc; int fd; result caddr_t buf; int nbytes;
cc = write(fd, buf, nbytes);
result int cc; int fd; caddr_t buf; int nbytes;
The read call transfers as much data as possible from the
object defined by fd to the buffer at address buf of size
nbytes. The number of bytes transferred is returned in cc,
which is -1 if a return occurred before any data was
transferred because of an error or use of non-blocking
operations.
The write call transfers data from the buffer to the
object defined by fd. Depending on the type of fd, it is
possible that the write call will accept only some portion of
the provided bytes; in this case the user should resubmit
the remaining bytes in a later request. Error returns because
of interrupted or otherwise incomplete operations are possible.
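Resubmission of the unaccepted bytes can be sketched in C as
a loop. The helper name write_all is hypothetical, and
retrying after an EINTR error is an assumption about how the
caller wishes to treat interrupted calls.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all nbytes of buf to fd, resubmitting
 * whatever a short write left over.  Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t nbytes)
{
    const char *p = buf;

    while (nbytes > 0) {
        ssize_t cc = write(fd, p, nbytes);

        if (cc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any transfer */
            return -1;
        }
        p += cc;                /* skip the portion that was accepted */
        nbytes -= (size_t)cc;
    }
    return 0;
}
```

On a non-blocking descriptor this loop would fail with
EWOULDBLOCK rather than wait; such a caller would normally
select for output before retrying.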
Scattering of data on input or gathering of data for
output is also possible using an array of input/output vec-
tor descriptors. The type for the descriptors is defined in
<sys/uio.h> as:
struct iovec {
caddr_t iov_base; /* base of a component */
int iov_len; /* length of a component */
};
The calls using an array of descriptors are:
cc = readv(fd, iov, iovlen);
result int cc; int fd; struct iovec *iov; int iovlen;
cc = writev(fd, iov, iovlen);
result int cc; int fd; struct iovec *iov; int iovlen;
Here iovlen is the count of elements in the iov array.
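A gathered write of two separate buffers might be sketched in
C as follows. The helper name write_two is hypothetical; the
member names iov_base and iov_len are those found in
<sys/uio.h> on current systems.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send head followed by body with one writev call.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_two(int fd, const char *head, const char *body)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)head; /* writev only reads the buffers */
    iov[0].iov_len = strlen(head);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);
    return writev(fd, iov, 2);      /* 2 is the element count, iovlen */
}
```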
2.1.2. Input/output control
Control operations on an object are performed by the
ioctl operation:
ioctl(fd, request, buffer);
int fd, request; caddr_t buffer;
This operation causes the specified request to be performed
on the object fd. The request parameter specifies whether
the argument buffer is to be read, written, read and writ-
ten, or is not needed, and also the size of the buffer, as
well as the request. Different descriptor types and subtypes
within descriptor types may use distinct ioctl requests.
For example, operations on terminals control flushing of
input and output queues and setting of terminal parameters;
operations on disks cause formatting operations to occur;
operations on tapes control tape positioning.
The names for basic control operations are defined in
<sys/ioctl.h>.
2.1.3. Non-blocking and asynchronous operations
A process that wishes to do non-blocking operations on
one of its descriptors sets the descriptor in non-blocking
mode as described in section 1.5.4. Thereafter the read
call will return a specific EWOULDBLOCK error indication if
there is no data to be read. The process may select the
associated descriptor to determine when a read is possible.
Output attempted when a descriptor can accept less than
is requested will either accept some of the provided data,
returning a shorter than normal length, or return an error
indicating that the operation would block. More output can
be performed as soon as a select call indicates the object
is writeable.
Operations other than data input or output may be per-
formed on a descriptor in a non-blocking fashion. These
operations will return with a characteristic error indicat-
ing that they are in progress if they cannot complete
immediately. The descriptor may then be selected for write
to find out when the operation has been completed. When
select indicates the descriptor is writeable, the operation
has completed. Depending on the nature of the descriptor and
the operation, additional activity may be started or the new
state may be tested.
2.2. File system
2.2.1. Overview
The file system abstraction provides access to a
hierarchical file system structure. The file system contains
directories (each of which may contain other sub-
directories) as well as files and references to other
objects such as devices and inter-process communications
sockets.
Each file is organized as a linear array of bytes. No
record boundaries or system related information is present
in a file. Files may be read and written in a random-access
fashion. The user may read the data in a directory as though
it were an ordinary file to determine the names of the con-
tained files, but only the system may write into the direc-
tories. The file system stores only a small amount of owner-
ship, protection and usage information with a file.
2.2.2. Naming
The file system calls take path name arguments. These
consist of zero or more component file names separated by
``/'' characters, where each file name is up to 255 ASCII
characters excluding null and ``/''.
Each process always has two naming contexts: one for
the root directory of the file system and one for the
current working directory. These are used by the system in
the filename translation process. If a path name begins with
a ``/'', it is called a full path name and interpreted rela-
tive to the root directory context. If the path name does
not begin with a ``/'' it is called a relative path name and
interpreted relative to the current directory context.
The system limits the total length of a path name to
1024 characters.
The file name ``..'' in each directory refers to the
parent directory of that directory. The parent directory of
the root of the file system is always the root itself.
The calls
chdir(path);
char *path;
chroot(path)
char *path;
change the current working directory and root directory
context of a process. Only the super-user can change the
root directory context of a process.
2.2.3. Creation and removal
The file system allows directories, files, special dev-
ices, and ``portals'' to be created and removed from the
file system.
2.2.3.1. Directory creation and removal
A directory is created with the mkdir system call:
mkdir(path, mode);
char *path; int mode;
where the mode is defined as for files (see below). Direc-
tories are removed with the rmdir system call:
rmdir(path);
char *path;
A directory must be empty if it is to be deleted.
2.2.3.2. File creation
Files are created with the open system call,
fd = open(path, oflag, mode);
result int fd; char *path; int oflag, mode;
The path parameter specifies the name of the file to be
created. The oflag parameter must include O_CREAT from
below to cause the file to be created. Bits for oflag are
defined in <sys/file.h>:
#define O_RDONLY 000 /* open for reading */
#define O_WRONLY 001 /* open for writing */
#define O_RDWR 002 /* open for read & write */
#define O_NDELAY 004 /* non-blocking open */
#define O_APPEND 010 /* append on each write */
#define O_CREAT 01000 /* open with file create */
#define O_TRUNC 02000 /* open with truncation */
#define O_EXCL 04000 /* error on create if file exists */
One of O_RDONLY, O_WRONLY and O_RDWR should be speci-
fied, indicating what types of operations are desired to be
performed on the open file. The operations will be checked
against the user's access rights to the file before allowing
the open to succeed. Specifying O_APPEND causes writes to
automatically append to the file. The flag O_CREAT causes
the file to be created if it does not exist, owned by the
current user and the group of the containing directory. The
protection for the new file is specified in mode. The file
mode is used as a three digit octal number. Each digit
encodes read access as 4, write access as 2 and execute
access as 1, or'ed together. The 0700 bits describe owner
access, the 070 bits describe the access rights for
processes in the same group as the file, and the 07 bits
describe the access rights for other processes.
If the open specifies to create the file with O_EXCL
and the file already exists, then the open will fail without
affecting the file in any way. This provides a simple
exclusive access facility. If the file exists but is a sym-
bolic link, the open will fail regardless of the existence
of the file specified by the link.
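The exclusive access facility can be sketched in C as a lock
file whose existence marks the resource as busy. The helper
names and the 0600 mode are hypothetical choices made for the
example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to create path as a lock file.
 * Returns 1 if created, 0 if it already exists, -1 on other errors. */
int acquire_lock_file(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return errno == EEXIST ? 0 : -1;
    close(fd);                  /* only the name matters, not the contents */
    return 1;
}

/* Hypothetical helper: release the lock by removing the name. */
int release_lock_file(const char *path)
{
    return unlink(path);
}
```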
2.2.3.3. Creating references to devices
The file system allows entries which reference peri-
pheral devices. Peripherals are distinguished as block or
character devices according to their ability to support
block-oriented operations. Devices are identified by their
``major'' and ``minor'' device numbers. The major device
number determines the kind of peripheral it is, while the
minor device number indicates one of possibly many peri-
pherals of that kind. Structured devices have all operations
performed internally in ``block'' quantities while unstruc-
tured devices often have a number of special ioctl opera-
tions, and may have input and output performed in varying
units. The mknod call creates special entries:
mknod(path, mode, dev);
char *path; int mode, dev;
where mode is formed from the object type and access permis-
sions. The parameter dev is a configuration dependent
parameter used to identify specific character or block I/O
devices.
2.2.3.4. Portal creation-
The call
fd = portal(name, server, param, dtype, protocol, domain, socktype)
result int fd; char *name, *server, *param; int dtype, protocol;
int domain, socktype;
places a name in the file system name space that causes con-
nection to a server process when the name is used. The por-
tal call returns an active portal in fd as though an access
had occurred to activate an inactive portal, as now
described.
_________________________
- The portal call is not implemented in 4.3BSD.
When an inactive portal is accessed, the system sets up
a socket of the specified socktype in the specified communi-
cations domain (see section 2.3), and creates the server
process, giving it the specified param as argument to help
it identify the portal, and also giving it the newly created
socket as descriptor number 0. The accessor of the portal
will create a socket in the same domain and connect to the
server. The user will then wrap the socket in the specified
protocol to create an object of the required descriptor type
dtype and proceed with the operation which was in progress
before the portal was encountered.
While the server process holds the socket (which it
received as fd from the portal call on descriptor 0 at
activation) further references will result in connections
being made to the same socket.
2.2.3.5. File, device, and portal removal
A reference to a file, special device or portal may be
removed with the unlink call,
unlink(path);
char *path;
The caller must have write access to the directory in which
the file is located for this call to be successful.
2.2.4. Reading and modifying file attributes
Detailed information about the attributes of a file may
be obtained with the calls:
#include <sys/stat.h>
stat(path, stb);
char *path; result struct stat *stb;
fstat(fd, stb);
int fd; result struct stat *stb;
The stat structure includes the file type, protection, own-
ership, access times, size, and a count of hard links. If
the file is a symbolic link, then the status of the link
itself (rather than the file the link references) may be
found using the lstat call:
lstat(path, stb);
char *path; result struct stat *stb;
A newly created file is assigned the user id of the
process that created it and the group id of the directory in
which it was created. The ownership of a file may be
changed by either of the calls
chown(path, owner, group);
char *path; int owner, group;
fchown(fd, owner, group);
int fd, owner, group;
In addition to ownership, each file has three levels of
access protection associated with it. These levels are
owner relative, group relative, and global (all users and
groups). Each level of access has separate indicators for
read permission, write permission, and execute permission.
The protection bits associated with a file may be set by
either of the calls:
chmod(path, mode);
char *path; int mode;
fchmod(fd, mode);
int fd, mode;
where mode is a value indicating the new protection of the
file, as listed in section 2.2.3.2.
Finally, the access and modify times on a file may be
set by the call:
utimes(path, tvp)
char *path; struct timeval tvp[2];
This is particularly useful when moving files between media,
to preserve relationships between the times the file was
modified.
2.2.5. Links and renaming
Links allow multiple names for a file to exist. Links
exist independently of the file linked to.
Two types of links exist, hard links and symbolic
links. A hard link is a reference counting mechanism that
allows a file to have multiple names within the same file
system. Symbolic links cause string substitution during the
pathname interpretation process.
Hard links and symbolic links have different proper-
ties. A hard link ensures the target file will always be
accessible, even after its original directory entry is
removed; no such guarantee exists for a symbolic link. Sym-
bolic links can span file system boundaries.
The following calls create a new link, named path2, to
path1:
link(path1, path2);
char *path1, *path2;
symlink(path1, path2);
char *path1, *path2;
The unlink primitive may be used to remove either type of
link.
If a file is a symbolic link, the ``value'' of the link
may be read with the readlink call,
len = readlink(path, buf, bufsize);
result int len; char *path; result char *buf; int bufsize;
This call returns, in buf, the string substituted into
pathnames passing through path, and returns its length in
len. The string is not null-terminated.
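A C sketch of reading a link value into a C string follows.
The helper name link_target is hypothetical. POSIX does not
guarantee that readlink appends a terminating null, so the
sketch reserves one byte and adds the null itself.

```c
#include <unistd.h>

/* Hypothetical helper: copy the value of the symbolic link path into buf
 * as a null-terminated string.  Returns its length, or -1 on error.
 * A value longer than bufsize - 1 bytes is silently truncated. */
long link_target(const char *path, char *buf, size_t bufsize)
{
    ssize_t len;

    if (bufsize == 0)
        return -1;
    len = readlink(path, buf, bufsize - 1); /* leave room for the null */
    if (len < 0)
        return -1;
    buf[len] = '\0';
    return (long)len;
}
```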
Atomic renaming of file system resident objects is pos-
sible with the rename call:
rename(oldname, newname);
char *oldname, *newname;
where both oldname and newname must be in the same file sys-
tem. If newname exists and is a directory, then it must be
empty.
2.2.6. Extension and truncation
Files are created with zero length and may be extended
simply by writing or appending to them. While a file is
open the system maintains a pointer into the file indicating
the current location in the file associated with the
descriptor. This pointer may be moved about in the file in
a random access fashion. To set the current offset into a
file, the lseek call may be used,
oldoffset = lseek(fd, offset, type);
result off_t oldoffset; int fd; off_t offset; int type;
where type is given in <sys/file.h> as one of:
#define SEEK_SET 0 /* set file offset to offset */
#define SEEK_CUR 1 /* set file offset to current plus offset */
#define SEEK_END 2 /* set file offset to EOF plus offset */
The call ``lseek(fd, 0, SEEK_CUR)'' returns the current
offset into the file.
Files may have ``holes'' in them. Holes are void areas
in the linear extent of the file where data has never been
written. These may be created by seeking to a location in a
file past the current end-of-file and writing. Holes are
treated by the system as zero valued bytes.
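Creating a hole can be sketched in C as a seek past the end
of file followed by a write. The helper name append_after_gap
is hypothetical; SEEK_END is the name current systems give to
the seek type that is relative to end-of-file.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: leave a hole of gap bytes after the current end
 * of file, then write a single byte.  Returns the new file size, or -1
 * on error. */
off_t append_after_gap(int fd, off_t gap)
{
    if (lseek(fd, gap, SEEK_END) < 0)   /* position past end-of-file */
        return -1;
    if (write(fd, "x", 1) != 1)         /* the write extends the file */
        return -1;
    return lseek(fd, 0, SEEK_END);      /* offset of EOF is the size */
}
```

Bytes read back from within the gap are expected to be zero.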
A file may be truncated with either of the calls:
truncate(path, length);
char *path; int length;
ftruncate(fd, length);
int fd, length;
reducing the size of the specified file to length bytes.
2.2.7. Checking accessibility
A process running with different real and effective
user ids may interrogate the accessibility of a file to the
real user by using the access call:
accessible = access(path, how);
result int accessible; char *path; int how;
Here how is constructed by or'ing the following bits,
defined in <sys/file.h>:
#define F_OK 0 /* file exists */
#define X_OK 1 /* file is executable */
#define W_OK 2 /* file is writable */
#define R_OK 4 /* file is readable */
The presence or absence of advisory locks does not affect
the result of access.
2.2.8. Locking
The file system provides basic facilities that allow
cooperating processes to synchronize their access to shared
files. A process may place an advisory read or write lock
on a file, so that other cooperating processes may avoid
interfering with the process' access. This simple mechanism
provides locking with file granularity. More granular lock-
ing can be built using the IPC facilities to provide a lock
manager. The system does not force processes to obey the
locks; they are of an advisory nature only.
Locking is performed after an open call by applying the
flock primitive,
flock(fd, how);
int fd, how;
where the how parameter is formed from bits defined in
<sys/file.h>:
#define LOCK_SH 1 /* shared lock */
#define LOCK_EX 2 /* exclusive lock */
#define LOCK_NB 4 /* don't block when locking */
#define LOCK_UN 8 /* unlock */
Successive lock calls may be used to increase or decrease
the level of locking. If an object is currently locked by
another process when a flock call is made, the caller will
be blocked until the current lock owner releases the lock;
this may be avoided by including LOCK_NB in the how parame-
ter. Specifying LOCK_UN removes all locks associated with
the descriptor. Advisory locks held by a process are
automatically deleted when the process terminates.
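A non-blocking attempt to take an exclusive lock might be
sketched in C as below. The helper names are hypothetical,
and the sketch assumes that a refused LOCK_NB request is
reported with the error EWOULDBLOCK.

```c
#include <errno.h>
#include <sys/file.h>

/* Hypothetical helper: try to take an exclusive advisory lock on fd
 * without blocking.  Returns 1 if the lock was obtained, 0 if another
 * holder has it, -1 on other errors. */
int try_lock_exclusive(int fd)
{
    if (flock(fd, LOCK_EX | LOCK_NB) == 0)
        return 1;
    return errno == EWOULDBLOCK ? 0 : -1;
}

/* Hypothetical helper: remove the locks associated with fd. */
int unlock_file(int fd)
{
    return flock(fd, LOCK_UN);
}
```

Because the locks are advisory, they restrain only those
processes that also call flock before touching the file.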
2.2.9. Disk quotas
As an optional facility, each file system may be
requested to impose limits on a user's disk usage. Two quan-
tities are limited: the total amount of disk space which a
user may allocate in a file system and the total number of
files a user may create in a file system. Quotas are
expressed as hard limits and soft limits. A hard limit is
always imposed; if a user would exceed a hard limit, the
operation which caused the resource request will fail. A
soft limit results in the user receiving a warning message,
but with allocation succeeding. Facilities are provided to
turn soft limits into hard limits if a user has exceeded a
soft limit for an unreasonable period of time.
To enable disk quotas on a file system the setquota
call is used:
setquota(special, file)
char *special, *file;
where special refers to a structured device file where a
mounted file system exists, and file refers to a disk quota
file (residing on the file system associated with special)
from which user quotas should be obtained. The format of
the disk quota file is implementation dependent.
To manipulate disk quotas the quota call is provided:
#include <sys/quota.h>
quota(cmd, uid, arg, addr)
int cmd, uid, arg; caddr_t addr;
The indicated cmd is applied to the user ID uid. The parame-
ters arg and addr are command specific. The file
<sys/quota.h> contains definitions pertinent to the use of
this call.
2.3. Interprocess communications
2.3.1. Interprocess communication primitives
2.3.1.1. Communication domains
The system provides access to an extensible set of com-
munication domains. A communication domain is identified by
a manifest constant defined in the file <sys/socket.h>.
Important standard domains supported by the system are the
``unix'' domain, AF_UNIX, for communication within the sys-
tem, the ``Internet'' domain for communication in the DARPA
Internet, AF_INET, and the ``NS'' domain, AF_NS, for commun-
ication using the Xerox Network Systems protocols. Other
domains can be added to the system.
2.3.1.2. Socket types and protocols
Within a domain, communication takes place between com-
munication endpoints known as sockets. Each socket has the
potential to exchange information with other sockets of an
appropriate type within the domain.
Each socket has an associated abstract type, which
describes the semantics of communication using that socket.
Properties such as reliability, ordering, and prevention of
duplication of messages are determined by the type. The
basic set of socket types is defined in <sys/socket.h>:
/* Standard socket types */
#define SOCK_DGRAM 1 /* datagram */
#define SOCK_STREAM 2 /* virtual circuit */
#define SOCK_RAW 3 /* raw socket */
#define SOCK_RDM 4 /* reliably-delivered message */
#define SOCK_SEQPACKET 5 /* sequenced packets */
The SOCK_DGRAM type models the semantics of datagrams in
network communication: messages may be lost or duplicated
and may arrive out-of-order. A datagram socket may send mes-
sages to and receive messages from multiple peers. The
SOCK_RDM type models the semantics of reliable datagrams:
messages arrive unduplicated and in-order, and the sender is
notified if messages are lost. The send and receive opera-
tions (described below) generate reliable/unreliable
datagrams. The SOCK_STREAM type models connection-based vir-
tual circuits: two-way byte streams with no record boun-
daries. Connection setup is required before data communica-
tion may begin. The SOCK_SEQPACKET type models a
connection-based, full-duplex, reliable, sequenced packet
exchange; the sender is notified if messages are lost, and
messages are never duplicated or presented out-of-order.
Users of the last two abstractions may use the facilities
for out-of-band transmission to send out-of-band data.
SOCK_RAW is used for unprocessed access to internal
network layers and interfaces; it has no specific semantics.
Other socket types can be defined.
Each socket may have a specific protocol associated
with it. This protocol is used within the domain to provide
the semantics required by the socket type. Not all socket
types are supported by each domain; support depends on the
existence and the implementation of a suitable protocol
within the domain. For example, within the ``Internet''
domain, the SOCK_DGRAM type may be implemented by the UDP
user datagram protocol, and the SOCK_STREAM type may be
implemented by the TCP transmission control protocol, while
no standard protocols to provide SOCK_RDM or SOCK_SEQPACKET
sockets exist.
2.3.1.3. Socket creation, naming and service establishment
Sockets may be connected or unconnected. An uncon-
nected socket descriptor is obtained by the socket call:
s = socket(domain, type, protocol);
result int s; int domain, type, protocol;
The socket domain and type are as described above, and are
specified using the definitions from <sys/socket.h>. The
protocol may be given as 0, meaning any suitable protocol.
One of several possible protocols may be selected using
identifiers obtained from a library routine, getprotobyname.
An unconnected socket descriptor of a connection-
oriented type may yield a connected socket descriptor in one
of two ways: either by actively connecting to another
socket, or by becoming associated with a name in the commun-
ications domain and accepting a connection from another
socket. Datagram sockets need not establish connections
before use.
To accept connections or to receive datagrams, a socket
must first have a binding to a name (or address) within the
communications domain. Such a binding may be established by
a bind call:
bind(s, name, namelen);
int s; struct sockaddr *name; int namelen;
Datagram sockets may have default bindings established when
first sending data if not explicitly bound earlier. In
either case, a socket's bound name may be retrieved with a
getsockname call:
getsockname(s, name, namelen);
int s; result struct sockaddr *name; result int *namelen;
while the peer's name can be retrieved with getpeername:
getpeername(s, name, namelen);
int s; result struct sockaddr *name; result int *namelen;
Domains may support sockets with several names.
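As an illustration of socket, bind and getsockname used together, the following sketch creates a datagram socket in the Internet domain and asks the system which port it was given. The helper name bound_dgram_socket is an assumption made for the example, and the code uses the socklen_t length type found on current systems where this manual shows int.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a datagram socket in the Internet domain,
 * bind it to the loopback address with a system-chosen port, and report
 * the port that was assigned. Returns the descriptor, or -1 on error.
 */
int
bound_dgram_socket(unsigned short *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s = socket(AF_INET, SOCK_DGRAM, 0);	/* 0: any suitable protocol */

	if (s < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;			/* let the system choose */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
		close(s);
		return (-1);
	}
	*port = ntohs(sin.sin_port);		/* the bound name's port */
	return (s);
}
```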
2.3.1.4. Accepting connections
Once a binding is made to a connection-oriented socket,
it is possible to listen for connections:
listen(s, backlog);
int s, backlog;
The backlog specifies the maximum count of connections that
can be simultaneously queued awaiting acceptance.
An accept call:
t = accept(s, name, anamelen);
result int t; int s; result struct sockaddr *name; result int *anamelen;
returns a descriptor for a new, connected, socket from the
queue of pending connections on s. If no new connections are
queued for acceptance, the call will wait for a connection
unless non-blocking I/O has been enabled.
2.3.1.5. Making connections
An active connection to a named socket is made by the
connect call:
connect(s, name, namelen);
int s; struct sockaddr *name; int namelen;
Although datagram sockets do not establish connections, the
connect call may be used with such sockets to create an
association with the foreign address. The address is
recorded for use in future send calls, which then need not
supply destination addresses. Datagrams will be received
only from that peer, and asynchronous error reports may be
received.
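The passive and active sides of connection establishment can be exercised within a single process, as in the sketch below. The helper name loopback_connection is an assumption made for the example; it relies on the system queuing the connection request (up to the listen backlog) until accept is called.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: listen on the loopback address, connect to the
 * listening socket from the same process, and accept the connection.
 * On success stores the connecting descriptor in conn[0] and the
 * accepted descriptor in conn[1] and returns 0; returns -1 on error.
 */
int
loopback_connection(int conn[2])
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int ls, cs, as;

	if ((ls = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(ls, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    getsockname(ls, (struct sockaddr *)&sin, &len) < 0 ||
	    listen(ls, 1) < 0) {
		close(ls);
		return (-1);
	}
	if ((cs = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
		close(ls);
		return (-1);
	}
	/* Active side: the request waits in the queue of pending connections. */
	if (connect(cs, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		close(cs);
		close(ls);
		return (-1);
	}
	/* Passive side: take the new, connected socket from the queue. */
	as = accept(ls, NULL, NULL);
	close(ls);
	if (as < 0) {
		close(cs);
		return (-1);
	}
	conn[0] = cs;
	conn[1] = as;
	return (0);
}
```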
It is also possible to create connected pairs of sock-
ets without using the domain's name space to rendezvous;
this is done with the socketpair call†:
_________________________
† 4.3BSD supports socketpair creation only in the
``unix'' communication domain.
socketpair(domain, type, protocol, sv);
int domain, type, protocol; result int sv[2];
Here the returned sv descriptors correspond to those
obtained with accept and connect.
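A minimal sketch of socketpair in use follows. The helper name socketpair_roundtrip is an assumption made for the example; it sends one byte in each direction to show that both descriptors are connected and full-duplex.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a connected pair of stream sockets in the
 * UNIX domain and pass one byte in each direction. Returns 0 if both
 * bytes arrive intact, -1 otherwise.
 */
int
socketpair_roundtrip(void)
{
	int sv[2], ok;
	char c1 = 0, c2 = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return (-1);
	ok = write(sv[0], "a", 1) == 1 && read(sv[1], &c1, 1) == 1 &&
	    write(sv[1], "b", 1) == 1 && read(sv[0], &c2, 1) == 1 &&
	    c1 == 'a' && c2 == 'b';
	close(sv[0]);
	close(sv[1]);
	return (ok ? 0 : -1);
}
```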
The call
pipe(pv)
result int pv[2];
creates a pair of SOCK_STREAM sockets in the UNIX domain,
with pv[0] only readable and pv[1] only writable.
2.3.1.6. Sending and receiving data
Messages may be sent from a socket by:
cc = sendto(s, buf, len, flags, to, tolen);
result int cc; int s; caddr_t buf; int len, flags; caddr_t to; int tolen;
if the socket is not connected or:
cc = send(s, buf, len, flags);
result int cc; int s; caddr_t buf; int len, flags;
if the socket is connected. The corresponding receive primi-
tives are:
msglen = recvfrom(s, buf, len, flags, from, fromlenaddr);
result int msglen; int s; result caddr_t buf; int len, flags;
result caddr_t from; result int *fromlenaddr;
and
msglen = recv(s, buf, len, flags);
result int msglen; int s; result caddr_t buf; int len, flags;
In the unconnected case, the parameters to and tolen
specify the destination of the message, while the from
parameter stores the source of the message; *fromlenaddr
initially gives the size of the from buffer and is updated
to reflect the true length of the from address.
All calls cause the message to be received in or sent
from the message buffer of length len bytes, starting at
address buf. The flags specify peeking at a message without
reading it or sending or receiving high-priority out-of-band
messages, as follows:
#define MSG_PEEK 0x1 /* peek at incoming message */
#define MSG_OOB 0x2 /* process out-of-band data */
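The unconnected send and receive primitives, together with the MSG_PEEK flag, might be used as in the following sketch. The helper name dgram_exchange is an assumption made for the example; it sends a datagram between two sockets on the loopback address, peeks at it, and then receives it.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: send msg as one datagram between two unconnected
 * sockets on the loopback address. The receiver first peeks at the
 * message, then reads it. Returns the length received by the final
 * recvfrom, or -1 on error.
 */
int
dgram_exchange(const char *msg, char *out, int outlen)
{
	struct sockaddr_in to, from;
	socklen_t tolen = sizeof(to), fromlen = sizeof(from);
	int rs, ss, n = -1;

	rs = socket(AF_INET, SOCK_DGRAM, 0);
	ss = socket(AF_INET, SOCK_DGRAM, 0);
	if (rs < 0 || ss < 0)
		goto done;
	memset(&to, 0, sizeof(to));
	to.sin_family = AF_INET;
	to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	/* Bind the receiver and learn the name the sender must use. */
	if (bind(rs, (struct sockaddr *)&to, sizeof(to)) < 0 ||
	    getsockname(rs, (struct sockaddr *)&to, &tolen) < 0)
		goto done;
	if (sendto(ss, msg, strlen(msg), 0, (struct sockaddr *)&to, tolen) < 0)
		goto done;
	/* MSG_PEEK leaves the datagram queued for the next receive. */
	if (recvfrom(rs, out, outlen, MSG_PEEK, NULL, NULL) < 0)
		goto done;
	/* from and fromlen are filled in with the sender's address. */
	n = recvfrom(rs, out, outlen, 0, (struct sockaddr *)&from, &fromlen);
done:
	if (rs >= 0)
		close(rs);
	if (ss >= 0)
		close(ss);
	return (n);
}
```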
2.3.1.7. Scatter/gather and exchanging access rights
It is possible to scatter and gather data and to exchange
access rights with messages. When either of these opera-
tions is involved, the number of parameters to the call
becomes large. Thus the system defines a message header
structure, in <sys/socket.h>, which can be used to con-
veniently contain the parameters to the calls:
struct msghdr {
        caddr_t msg_name;               /* optional address */
        int     msg_namelen;            /* size of address */
        struct  iovec *msg_iov;         /* scatter/gather array */
        int     msg_iovlen;             /* # elements in msg_iov */
        caddr_t msg_accrights;          /* access rights sent/received */
        int     msg_accrightslen;       /* size of msg_accrights */
};
Here msg_name and msg_namelen specify the source or destina-
tion address if the socket is unconnected; msg_name may be
given as a null pointer if no names are desired or required.
The msg_iov and msg_iovlen describe the scatter/gather loca-
tions, as described in section 2.1.3. Access rights to be
sent along with the message are specified in msg_accrights,
which has length msg_accrightslen. In the ``unix'' domain
these are an array of integer descriptors, taken from the
sending process and duplicated in the receiver.
This structure is used in the operations sendmsg and
recvmsg:
sendmsg(s, msg, flags);
int s; struct msghdr *msg; int flags;
msglen = recvmsg(s, msg, flags);
result int msglen; int s; result struct msghdr *msg; int flags;
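The scatter/gather use of sendmsg and recvmsg can be sketched as below. The helper name gather_scatter is an assumption made for the example. No address or access rights are supplied: the header is cleared and only msg_iov and msg_iovlen are set, which also keeps the sketch independent of the rights fields (current systems carry rights in msg_control rather than msg_accrights).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: gather two 4-byte buffers into one message with
 * sendmsg, then scatter that message into first and second with recvmsg.
 * Returns the received message length, or -1 on error.
 */
int
gather_scatter(char first[4], char second[4])
{
	char head[] = "head", tail[] = "tail";
	struct msghdr msg;
	struct iovec iov[2];
	int sv[2], n = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return (-1);

	/* Gather: both pieces are sent as a single 8-byte message. */
	iov[0].iov_base = head;
	iov[0].iov_len = 4;
	iov[1].iov_base = tail;
	iov[1].iov_len = 4;
	memset(&msg, 0, sizeof(msg));	/* no name, no access rights */
	msg.msg_iov = iov;
	msg.msg_iovlen = 2;

	if (sendmsg(sv[0], &msg, 0) == 8) {
		/* Scatter: the first 4 bytes fill first, the rest second. */
		iov[0].iov_base = first;
		iov[1].iov_base = second;
		n = recvmsg(sv[1], &msg, 0);
	}
	close(sv[0]);
	close(sv[1]);
	return (n);
}
```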
2.3.1.8. Using read and write with sockets
The normal UNIX read and write calls may be applied to
connected sockets; they are translated into send and receive
calls from or to a single area of memory, and any rights
received are discarded. A process may operate on a virtual circuit
socket, a terminal or a file with blocking or non-blocking
input/output operations without distinguishing the descrip-
tor type.
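The following sketch shows one routine serving both sockets and files. The helper names copy_once and socket_file_relay are assumptions made for the example; copy_once uses only read and write and never inspects the kind of descriptor it is given.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to max bytes (at most 512) from one
 * descriptor to another with a single read and a single write.
 * Returns the number of bytes copied, or -1 on error.
 */
int
copy_once(int from, int to, int max)
{
	char buf[512];
	int n;

	if (max > (int)sizeof(buf))
		max = (int)sizeof(buf);
	n = read(from, buf, max);
	if (n > 0 && write(to, buf, n) != n)
		return (-1);
	return (n);
}

/*
 * Usage sketch: the same helper moves len bytes from a socket into a
 * temporary file and then from the file back into a socket. Returns the
 * number of bytes that arrive at the far end, or -1 on error.
 */
int
socket_file_relay(const char *msg, int len, char *out, int outlen)
{
	int sv[2], fd, n = -1;
	FILE *fp = tmpfile();

	if (fp == NULL)
		return (-1);
	fd = fileno(fp);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == 0) {
		if (write(sv[0], msg, len) == len &&
		    copy_once(sv[1], fd, len) == len &&	/* socket to file */
		    lseek(fd, 0, SEEK_SET) == 0 &&
		    copy_once(fd, sv[1], len) == len)	/* file to socket */
			n = read(sv[0], out, outlen);
		close(sv[0]);
		close(sv[1]);
	}
	fclose(fp);
	return (n);
}
```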
2.3.1.9. Shutting down halves of full-duplex connections
A process that has a full-duplex socket such as a vir-
tual circuit and no longer wishes to read from or write to
this socket can give the call:
shutdown(s, direction);
int s, direction;
where direction is 0 to not read further, 1 to not write
further, or 2 to completely shut the connection down. If the
underlying protocol supports unidirectional or bidirectional
shutdown, this indication will be passed to the peer. For
example, a shutdown for writing might produce an end-of-file
condition at the remote end.
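The end-of-file effect of a shutdown for writing can be observed with a connected pair of stream sockets, as in this sketch. The helper name half_close_demo is an assumption made for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: shut down the writing half of one end of a
 * stream socket pair and check that the peer reads end-of-file while
 * the opposite direction remains usable. Returns 0 if both hold,
 * -1 otherwise.
 */
int
half_close_demo(void)
{
	int sv[2], n, ok = 0;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return (-1);
	if (shutdown(sv[0], 1) == 0) {	/* 1: no further writes on sv[0] */
		n = read(sv[1], &c, 1);	/* the peer sees end-of-file */
		/* Data can still flow from sv[1] to sv[0]. */
		ok = n == 0 && write(sv[1], "x", 1) == 1 &&
		    read(sv[0], &c, 1) == 1 && c == 'x';
	}
	close(sv[0]);
	close(sv[1]);
	return (ok ? 0 : -1);
}
```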
2.3.1.10. Socket and protocol options
Sockets, and their underlying communication protocols,
may support options. These options may be used to manipu-
late implementation- or protocol-specific facilities. The
getsockopt and setsockopt calls are used to control options:
getsockopt(s, level, optname, optval, optlen)
int s, level, optname; result caddr_t optval; result int *optlen;
setsockopt(s, level, optname, optval, optlen)
int s, level, optname; caddr_t optval; int optlen;
The option optname is interpreted at the indicated protocol
level for socket s. If a value is specified with optval and
optlen, it is interpreted by the software operating at the
specified level. The level SOL_SOCKET is reserved to indi-
cate options maintained by the socket facilities. Other
level values indicate a particular protocol which is to act
on the option request; these values are normally interpreted
as a ``protocol number''.
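As a sketch of the option calls at the SOL_SOCKET level, the following sets an option and reads it back. The helper name enable_reuseaddr is an assumption made for the example, and SO_REUSEADDR is used only as a familiar socket-level option defined in <sys/socket.h>; it is not named in this section.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: turn on SO_REUSEADDR for socket s at the socket
 * level, then read the option back. Returns 1 if the option is reported
 * as set, 0 if it is not, and -1 on error.
 */
int
enable_reuseaddr(int s)
{
	int on = 1, val = 0;
	socklen_t len = sizeof(val);

	if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
		return (-1);
	/* len is updated to the size of the value actually returned. */
	if (getsockopt(s, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
		return (-1);
	return (val != 0);
}
```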
2.3.2. UNIX domain
This section describes briefly the properties of the
UNIX communications domain.
2.3.2.1. Types of sockets
In the UNIX domain, the SOCK_STREAM abstraction pro-
vides pipe-like facilities, while SOCK_DGRAM provides (usu-
ally) reliable message-style communications.
2.3.2.2. Naming
Socket names are strings and may appear in the UNIX
file system name space through portals†.
_________________________
† The 4.3BSD implementation of the UNIX domain embeds
bound sockets in the UNIX file system name space; this
may change in future releases.
2.3.2.3. Access rights transmission
The ability to pass UNIX descriptors with messages in
this domain allows migration of service within the system
and allows user processes to be used in building system
facilities.
2.3.3. INTERNET domain
This section describes briefly how the Internet domain
is mapped to the model described in this section. More
information will be found in the document describing the
network implementation in 4.3BSD.
2.3.3.1. Socket types and protocols
SOCK_STREAM is supported by the Internet TCP protocol;
SOCK_DGRAM by the UDP protocol. Each is layered atop the
network-level Internet Protocol (IP). The Internet Control
Message Protocol is implemented atop/beside IP and is
accessible via a raw socket. The SOCK_SEQPACKET type has no direct
Internet family analogue; a protocol based on one from the
XEROX NS family and layered on top of IP could be imple-
mented to fill this gap.
2.3.3.2. Socket naming
Sockets in the Internet domain have names composed of
the 32 bit Internet address, and a 16 bit port number.
Options may be used to provide IP source routing or security
options. The 32-bit address is composed of network and host
parts; the network part is variable in size and is frequency
encoded. The host part may optionally be interpreted as a
subnet field plus the host on subnet; this is enabled by
setting a network address mask at boot time.
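The effect of a network address mask on a 32-bit address can be sketched with two bitwise operations. The helper names net_part and host_part are assumptions made for the example, and addresses and masks are taken to be in host byte order.

```c
#include <stdint.h>

/*
 * Hypothetical helpers: split a 32-bit Internet address using a network
 * address mask. Bits set in the mask select the network (and subnet)
 * part; the remaining bits select the host on that subnet.
 */
uint32_t
net_part(uint32_t addr, uint32_t mask)
{
	return (addr & mask);
}

uint32_t
host_part(uint32_t addr, uint32_t mask)
{
	return (addr & ~mask);
}
```

For instance, with the arbitrary address 0x80208207 and the mask 0xffffff00, net_part yields 0x80208200 and host_part yields 0x00000007.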
2.3.3.3. Access rights transmission
No access rights transmission facilities are provided
in the Internet domain.
2.3.3.4. Raw access
The Internet domain allows the super-user access to the
raw facilities of IP. These interfaces are modeled as
SOCK_RAW sockets. Each raw socket is associated with one IP
protocol number, and receives all traffic received for that
protocol. This allows administrative and debugging functions
to occur, and enables user-level implementations of
special-purpose protocols such as inter-gateway routing pro-
tocols.
2.4. Terminals and Devices
2.4.1. Terminals
Terminals support read and write I/O operations, as
well as a collection of terminal specific ioctl operations,
to control input character interpretation and editing, and
output format and delays.
2.4.1.1. Terminal input
Terminals are handled according to the underlying com-
munication characteristics such as baud rate and required
delays, and a set of software parameters.
2.4.1.1.1. Input modes
A terminal is in one of three possible modes: raw,
cbreak, or cooked. In raw mode all input is passed through
to the reading process immediately and without interpreta-
tion. In cbreak mode, the handler interprets input only by
looking for characters that cause interrupts or output flow
control; all other characters are made available as in raw
mode. In cooked mode, input is processed to provide standard
line-oriented local editing functions, and input is
presented on a line-by-line basis.
2.4.1.1.2. Interrupt characters
Interrupt characters are interpreted by the terminal
handler only in cbreak and cooked modes, and cause a
software interrupt to be sent to all processes in the pro-
cess group associated with the terminal. Interrupt charac-
ters exist to send SIGINT and SIGQUIT signals, and to stop a
process group with the SIGTSTP signal either immediately, or
when all input up to the stop character has been read.
2.4.1.1.3. Line editing
When the terminal is in cooked mode, editing of an
input line is performed. Editing facilities allow deletion
of the previous character or word, or deletion of the
current input line. In addition, a special character may be
used to reprint the current input line after some number of
editing operations have been applied.
Certain other characters are interpreted specially when
the terminal is in cooked mode. The end of line character
determines the end of an input record. The end of file
character simulates an end of file occurrence on terminal
input. Flow control is provided by stop output and start
output control characters. Output may be flushed with the
flush output character; and a literal character may be used
to force literal input of the immediately following charac-
ter in the input line.
Input characters may be echoed to the terminal as they
are received. Non-graphic ASCII input characters may be
echoed as a two-character printable representation, ``^char-
acter.''
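The two-character echo form can be computed as in the sketch below. The helper name echo_form is an assumption made for the example, and the mapping shown (the character code with bit 0x40 inverted, so that code 3 echoes as ^C and DEL as ^?) is the conventional one, assumed here rather than taken from this section.

```c
/*
 * Hypothetical helper: write the echo form of input character c into
 * out as a NUL-terminated string and return its length. ASCII control
 * characters and DEL become '^' followed by a printable character;
 * all other characters are echoed unchanged.
 */
int
echo_form(unsigned char c, char out[3])
{
	if (c < 0x20 || c == 0x7f) {
		out[0] = '^';
		out[1] = (char)(c ^ 0x40);	/* 0x03 gives 'C', 0x7f gives '?' */
		out[2] = '\0';
		return (2);
	}
	out[0] = (char)c;
	out[1] = '\0';
	return (1);
}
```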
2.4.1.2. Terminal output
On output, the terminal handler provides some simple
formatting services. These include converting the carriage
return character to the two character return-linefeed
sequence, inserting delays after certain standard control
characters, expanding tabs, and providing translations for
upper-case only terminals.
2.4.1.3. Terminal control operations
When a terminal is first opened it is initialized to a
standard state and configured with a set of standard con-
trol, editing, and interrupt characters. A process may
alter this configuration with certain control operations,
specifying parameters in a standard structure:†
struct ttymode {
        short   tt_ispeed;      /* input speed */
        int     tt_iflags;      /* input flags */
        short   tt_ospeed;      /* output speed */
        int     tt_oflags;      /* output flags */
};
and ``special characters'' are specified with the ttychars
structure,
_________________________
† The control interface described here is an internal
interface only in 4.3BSD. Future releases will prob-
ably use a modified interface based on currently-
proposed standards.
struct ttychars {
        char    tc_erasec;      /* erase char */
        char    tc_killc;       /* erase line */
        char    tc_intrc;       /* interrupt */
        char    tc_quitc;       /* quit */
        char    tc_startc;      /* start output */
        char    tc_stopc;       /* stop output */
        char    tc_eofc;        /* end-of-file */
        char    tc_brkc;        /* input delimiter (like nl) */
        char    tc_suspc;       /* stop process signal */
        char    tc_dsuspc;      /* delayed stop process signal */
        char    tc_rprntc;      /* reprint line */
        char    tc_flushc;      /* flush output (toggles) */
        char    tc_werasc;      /* word erase */
        char    tc_lnextc;      /* literal next character */
};
2.4.1.4. Terminal hardware support
The terminal handler allows a user to access basic
hardware related functions; e.g. line speed, modem control,
parity, and stop bits. A special signal, SIGHUP, is
automatically sent to processes in a terminal's process
group when a carrier transition is detected. This is nor-
mally associated with a user hanging up on a modem con-
trolled terminal line.
2.4.2. Structured devices
Structured devices are typified by disks and magnetic
tapes, but may represent any random-access device. The sys-
tem performs read-modify-write type buffering actions on
block devices to allow them to be read and written in a
totally random access fashion like ordinary files. File sys-
tems are normally created in block devices.
2.4.3. Unstructured devices
Unstructured devices are those devices which do not
support block structure. Familiar unstructured devices are
raw communications lines (with no terminal handler), raster
plotters, magnetic tape and disks unfettered by buffering
and permitting large block input/output and positioning and
formatting commands.
2.5. Process and kernel descriptors
The status of the facilities in this section is still
under discussion. The ptrace facility of earlier UNIX sys-
tems remains in 4.3BSD. Planned enhancements would allow a
descriptor-based process control facility.
I. Summary of facilities
1. Kernel primitives
1.1. Process naming and protection
sethostid         set UNIX host id
gethostid         get UNIX host id
sethostname       set UNIX host name
gethostname       get UNIX host name
getpid            get process id
fork              create new process
exit              terminate a process
execve            execute a different process
getuid            get user id
geteuid           get effective user id
setreuid          set real and effective user id's
getgid            get accounting group id
getegid           get effective accounting group id
getgroups         get access group set
setregid          set real and effective group id's
setgroups         set access group set
getpgrp           get process group
setpgrp           set process group
1.2 Memory management
<sys/mman.h>      memory management definitions
sbrk              change data section size
sstk†             change stack section size
getpagesize       get memory page size
mmap†             map pages of memory
msync†            flush modified mapped pages to filesystem
munmap†           unmap memory
mprotect†         change protection of pages
madvise†          give memory management advice
mincore†          determine core residency of pages
msleep†           sleep on a lock
mwakeup†          wakeup process sleeping on a lock
1.3 Signals
<signal.h>        signal definitions
sigvec            set handler for signal
kill              send signal to process
killpgrp          send signal to process group
sigblock          block set of signals
sigsetmask        restore set of blocked signals
sigpause          wait for signals
sigstack          set software stack for signals
_________________________
† Not supported in 4.3BSD.
1.4 Timing and statistics
<sys/time.h>      time-related definitions
gettimeofday      get current time and timezone
settimeofday      set current time and timezone
getitimer         read an interval timer
setitimer         get and set an interval timer
profil            profile process
1.5 Descriptors
getdtablesize     descriptor reference table size
dup               duplicate descriptor
dup2              duplicate to specified index
close             close descriptor
select            multiplex input/output
fcntl             control descriptor options
wrap†             wrap descriptor with protocol
1.6 Resource controls
<sys/resource.h>  resource-related definitions
getpriority       get process priority
setpriority       set process priority
getrusage         get resource usage
getrlimit         get resource limitations
setrlimit         set resource limitations
1.7 System operation support
mount             mount a device file system
swapon            add a swap device
umount            unmount a file system
sync              flush system caches
reboot            reboot a machine
acct              specify accounting file
2. System facilities
2.1 Generic operations
read              read data
write             write data
<sys/uio.h>       scatter-gather related definitions
readv             scattered data input
writev            gathered data output
<sys/ioctl.h>     standard control operations
ioctl             device control operation
_________________________
† Not supported in 4.3BSD.
2.2 File system
Operations marked with a * exist in two forms: as
shown, operating on a file name, and operating on a file
descriptor, when the name is preceded with an ``f''.
<sys/file.h>      file system definitions
chdir             change directory
chroot            change root directory
mkdir             make a directory
rmdir             remove a directory
open              open a new or existing file
mknod             make a special file
portal†           make a portal entry
unlink            remove a link
stat*             return status for a file
lstat             return status of link
chown*            change owner
chmod*            change mode
utimes            change access/modify times
link              make a hard link
symlink           make a symbolic link
readlink          read contents of symbolic link
rename            change name of file
lseek             reposition within file
truncate*         truncate file
access            determine accessibility
flock             lock a file
_________________________
† Not supported in 4.3BSD.
2.3 Communications
<sys/socket.h>    standard definitions
socket            create socket
bind              bind socket to name
getsockname       get socket name
listen            allow queuing of connections
accept            accept a connection
connect           connect to peer socket
socketpair        create pair of connected sockets
sendto            send data to named socket
send              send data to connected socket
recvfrom          receive data on unconnected socket
recv              receive data on connected socket
sendmsg           send gathered data and/or rights
recvmsg           receive scattered data and/or rights
shutdown          partially close full-duplex connection
getsockopt        get socket option
setsockopt        set socket option
2.4 Terminals, block and character devices
2.5 Processes and kernel hooks