Notes 6: Processes, new and old

Corresponds to Chapter 8 in Advanced Programming in the Unix Environment.

On Forking and Death

"A dream to some; a nightmare to others!!" --Merlin, Excalibur

The only way to create new processes is to fork() them. Here are some rules about forking and killing processes that you should remember:

  1. When a process dies, it becomes a zombie until the parent wait()s for it (sometimes called "reaping").

  2. When the parent process of a child dies, the child is reparented to the init process (PID 1). init will reap any zombies it gets.

The moral is this: in general, wait() for each process that you fork(), or have them reparented to init. Otherwise, your process table will fill with zombies and you won't be able to fork() anything.

pid_t fork(void);

This call returns twice: once to the parent, and once to the child process. The program can tell which one it is by checking the return value of the call: fork() returns 0 to the child process, and returns the PID of the child to the parent:

    if ((pid = fork()) == 0) {
        printf("I'm the child and my parent is PID %d!\n", getppid());
        exit(0);  /* the child ends here */
    }
    printf("I'm the parent and my child is PID %d!\n", pid);

As you can see, the child can always get its parent's PID from the call getppid().
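The fragment above skips the failure case: fork() returns -1 when no child could be created. A slightly fuller sketch of the three-way check might look like the following. The function name fork_and_announce() is made up for illustration, and the use of _exit() in the child (so the child does not flush stdio buffers it shares with the parent) is a choice made here, not something these notes require.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once, lets each side announce itself,
 * and returns 0 to the caller on success or -1 on failure. */
int fork_and_announce(void)
{
    pid_t pid;

    fflush(stdout);             /* don't let the child inherit unflushed output */
    pid = fork();

    if (pid == -1) {            /* fork() failed: no child was created */
        perror("fork");
        return -1;
    }

    if (pid == 0) {             /* child: fork() returned 0 */
        printf("child:  my PID is %d, my parent is %d\n",
               (int)getpid(), (int)getppid());
        fflush(stdout);
        _exit(0);               /* leave without running the parent's code */
    }

    /* parent: fork() returned the child's PID */
    printf("parent: my child is PID %d\n", (int)pid);
    if (waitpid(pid, NULL, 0) == -1) {   /* reap it so no zombie is left */
        perror("waitpid");
        return -1;
    }
    return 0;
}
```

Calling fork_and_announce() from main() should print one line from each process; the order of the two lines is not guaranteed.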

When the child is fork()ed, it inherits many things from the parent (adapted from Stevens APITUE):

Things that the child does not share with the parent include:

void exit(int status);

Also from the previous example, you can see that a call to exit() will end a (child) process. Well, all processes are children at some level or another (except for a select couple). You can also call the more direct _exit() or abort() (raises SIGABRT) or wait for one of a dozen other signals that will terminate you.

For controlled termination, exit() will work for almost all occasions.

The status that you pass it? It can be retrieved through a wait() call.

pid_t wait(int *status);
pid_t waitpid(pid_t pid, int *status, int options);

The two calls are similar: wait() blocks until any one of your child processes terminates, then reaps it; waitpid() can wait for a specific child process and allows you to specify some additional options.

With both functions, if status is not NULL, the child process's encoded return value is stored in it. Several macros exist to help decode this value (these take the status itself, not a pointer to it!):

Macro                 Meaning
WIFEXITED(status)     True if the child exited normally.
WEXITSTATUS(status)   If WIFEXITED() is true, gives the child's exit status (what you passed to exit()).
WIFSIGNALED(status)   True if the process exited due to a signal that wasn't caught.
WTERMSIG(status)      If WIFSIGNALED() is true, gives the signal number that caused the termination.
WCOREDUMP(status)     If WIFSIGNALED() is true, this is true if a core dump file was generated.
WIFSTOPPED(status)    True if status is returned for a child that is currently stopped (see WUNTRACED, below).
WSTOPSIG(status)      If WIFSTOPPED() is true, gives the signal number that stopped the child.
Table 1. Macros for checking exit status.

Again, you can just set this pointer to NULL if you don't care what the exit status is.
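To see the macros in action, here is a rough sketch that creates a child, collects its status word with waitpid(), and decodes it. Both function names (child_status() and describe_status()) are invented for this example; the child uses raise() to kill itself when asked to die by a signal.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` (when
 * sig == 0) or kills itself with signal `sig`. The raw status word
 * from waitpid() is stored in *status. Returns 0, or -1 on error. */
int child_status(int code, int sig, int *status)
{
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);         /* default action terminates the child */
        _exit(code);
    }
    return waitpid(pid, status, 0) == -1 ? -1 : 0;
}

/* Print a one-line description of a status word. */
void describe_status(int status)
{
    if (WIFEXITED(status))
        printf("exited normally, status %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("killed by signal %d\n", WTERMSIG(status));
    else if (WIFSTOPPED(status))
        printf("stopped by signal %d\n", WSTOPSIG(status));
}
```

For example, after child_status(3, 0, &st), WIFEXITED(st) is true and WEXITSTATUS(st) is 3; after child_status(0, SIGKILL, &st), WIFSIGNALED(st) is true and WTERMSIG(st) is SIGKILL.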

The second call, waitpid(), allows you to specify which PID to wait for. This has a little more functionality than first appears. The pid can be one of:

< -1
Wait for any child process whose process group ID is equal to the absolute value of the number specified.

= -1
Wait for any child process--this is just like wait().

= 0
Wait for any child process whose process group ID is the same as that of the calling process.

> 0
Wait for the child process with the specified PID.
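As an illustration of the "> 0" case, the sketch below starts two children and deliberately collects the second one first. The helper names are hypothetical, and the exit codes 1 and 2 are arbitrary values chosen so the two children can be told apart.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child that immediately exits with `code`.
 * Returns the child's PID to the parent, or -1 if fork() failed. */
static pid_t spawn_exiting(int code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(code);
    return pid;
}

/* Start two children (exit codes 1 and 2), then reap the second one
 * before the first. Their exit codes are stored in *first and *second.
 * Returns 0 on success, -1 on error. */
int wait_second_then_first(int *first, int *second)
{
    int status;
    pid_t a, b;

    a = spawn_exiting(1);
    if (a == -1)
        return -1;
    b = spawn_exiting(2);
    if (b == -1) {
        waitpid(a, NULL, 0);    /* don't leave the first child as a zombie */
        return -1;
    }

    /* pid > 0: wait for exactly that child, even if `a` is already dead */
    if (waitpid(b, &status, 0) != b)
        return -1;
    *second = WEXITSTATUS(status);

    if (waitpid(a, &status, 0) != a)
        return -1;
    *first = WEXITSTATUS(status);
    return 0;
}
```

With plain wait() there would be no way to control which of the two children is reported first.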

Finally, there is an options argument to waitpid(). This can be assigned one or both of the following values by ORing them together (or set options to 0 if you don't want any of them):

WNOHANG
Causes waitpid() to return immediately if there are no children waiting to be reaped.

WUNTRACED
Causes waitpid() to also report stopped children whose status has not been reported since they stopped. Specify this flag if you want to use the WIFSTOPPED() macro, above.

The WNOHANG option leaves room for abuse; people might think it's a good idea to poll for zombie children, but this can gobble CPU time. It's better to handle the signal SIGCHLD which is raised whenever a child process dies. The handler can then reap the zombie asynchronously.

The exec calls

This group of routines, collectively known as exec(), replaces the currently executing process with a new program. Of course, if you want to spawn off a new process and leave the parent running, you'll have to fork() a child to exec(), first. There are several things which the newly exec'd process shares in common with the original (adapted from Stevens APITUE):

There are two variants of exec: execl and execv. The prototypes are:

    int execl(const char *path, const char *arg, ...);
    int execv(const char *path, char *const arg[]);

execl() takes a list of arguments to the program, ending with a NULL. The first argument should be the name of the program itself:

    execl("/bin/ls", "ls", "-l", NULL);

Whereas execv() takes an array similar to argv[]:

    char *arg[] = { "ls", "-l", NULL };
    execv("/bin/ls", arg);

To complicate matters, both execl() and execv() have three sub-variants (you've already seen the first of these). The other two add a "p" or "e" suffix to the call:

    int execlp(const char *file, const char *arg, ...);
    int execle(const char *path, const char *arg, ..., char *const envp[]);
    int execvp(const char *file, char *const argv[]);
    int execve(const char *filename, char *const argv[], char *const envp[]);

A suffix of "p" means "search the directories in the PATH environment variable for this command." For instance, if we have "/bin" in the PATH, we can use execlp() to run ls:

    execlp("ls", "ls", "-l", NULL);

Lastly, the "e" suffix tells exec that you want to pass an array of environment variables to be used by the new process instead of that of the currently running process. First declare an argv-style array with the environment variables you want to pass, then include them in the argument list to one of the exec-e functions:

    char *env[] = { "PATH=/bin:/usr/bin", "HOME=/home/beej", NULL };

    execle("/bin/ls", "ls", "-l", NULL, env);

As was previously mentioned, most often you will be fork()ing right before the exec.
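Putting the pieces together, a fork-then-exec-then-wait sequence might be sketched like this. run_program() is a hypothetical helper, and the exit status 127 for "could not exec" is borrowed from shell convention rather than required by anything in these notes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` with the given
 * argv[] in a child process and wait for it to finish.
 * Returns the child's exit status (127 if the exec itself failed),
 * or -1 if fork()/waitpid() failed or the child was killed. */
int run_program(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        execv(path, argv);      /* only returns if the exec failed */
        _exit(127);             /* shell convention for "could not exec" */
    }

    /* parent: the child is now running the new program */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with char *args[] = { "ls", "-l", NULL }, the call run_program("/bin/ls", args) lists the current directory and returns the exit status of ls.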

int system(const char *cmd);

This is a library routine that has a fork(), exec(), and waitpid() all bundled into one. It will fork off a child process, execute the named command, and wait for it to complete:

    system("/bin/ls -l");

If the command has any output and you're using buffered I/O, you should fflush() the common output streams (like stdout and stderr) before calling system(), or the output might be out of order.

It is a C library routine, not a system call. Nevertheless, it can be very useful at times.
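The value returned by system() is a wait()-style status word, so the same macros apply to it. A small wrapper that flushes the output streams first and then decodes the result could look like this (run_shell() is a made-up name):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical wrapper: flush stdio, run `cmd` through system(), and
 * return the command's exit status. Returns -1 if the command could
 * not be run or did not exit normally. */
int run_shell(const char *cmd)
{
    int status;

    fflush(stdout);             /* keep our output ahead of the command's */
    fflush(stderr);

    status = system(cmd);
    if (status == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_shell("/bin/ls -l") returns 0 if ls succeeded.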

Dealing with User IDs

There are three kinds of user IDs:

User ID (UID)
This is your normal user ID. This doesn't change unless you're root and you want to change it.

Effective UID (EUID)
When a program is exec'd, the EUID is set to the UID, unless the SUID bit is set, in which case, the EUID is set to the owner of the program. The EUID is used to determine access permissions to files.

Saved set-UID
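This is a copy of your EUID, saved when the program is exec'd, so that a process that changes its EUID can switch back to it later.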

There are corresponding versions of the group IDs.

uid_t getuid(void);
uid_t geteuid(void);

These functions return your (numerical) UID and EUID, respectively. Use the getpwuid()-type functions to retrieve other user information.
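As a quick sketch, the following prints both IDs along with the matching login names from getpwuid(). The helper names are invented here, and the "(unknown)" fallback covers the case where a UID has no password-file entry.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the login name for `uid`, or
 * "(unknown)" if there is no password entry for it. The returned
 * string may be overwritten by the next getpwuid() call. */
const char *name_for_uid(uid_t uid)
{
    struct passwd *pw = getpwuid(uid);

    return pw != NULL ? pw->pw_name : "(unknown)";
}

/* Print the real and effective UIDs of the calling process. */
void print_ids(void)
{
    uid_t uid = getuid();
    uid_t euid = geteuid();

    printf("real UID:      %ld (%s)\n", (long)uid, name_for_uid(uid));
    printf("effective UID: %ld (%s)\n", (long)euid, name_for_uid(euid));
}
```

In an ordinary program the two lines show the same user; in a set-UID program the second line shows the owner of the executable.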

gid_t getgid(void);
gid_t getegid(void);

These correspond to the above calls for UIDs, except they operate on GIDs.

int setuid(uid_t uid);
int seteuid(uid_t euid);

If you're root, a call to setuid() will set all three of your UID, EUID, and saved set-UID. If you're a normal user, you can use this call to set your EUID to either your real UID or your saved set-UID. That's all.

The call seteuid() will only set your EUID, regardless of whether you are root or not.
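A typical use of seteuid() in a set-UID program is to give up the extra privileges for a while and take them back later. The sketch below assumes that pattern; drop_and_restore_euid() is a hypothetical name. Switching back works because the saved set-UID still holds the original EUID. In a program that is not set-UID, both calls simply leave the EUID unchanged.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: temporarily run with the real UID as the
 * effective UID, then switch back. Returns 0 on success, -1 on error. */
int drop_and_restore_euid(void)
{
    uid_t privileged = geteuid();   /* remember the current EUID */

    if (seteuid(getuid()) == -1)    /* drop to the real UID */
        return -1;

    /* ... work that should be done with the invoking user's
     * permissions goes here ... */
    printf("working with EUID %ld\n", (long)geteuid());

    if (seteuid(privileged) == -1)  /* allowed: matches the saved set-UID */
        return -1;
    return geteuid() == privileged ? 0 : -1;
}
```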

int setgid(gid_t gid);
int setegid(gid_t egid);

These correspond to the above calls for UIDs, except they operate on GIDs.

int setreuid(uid_t ruid, uid_t euid);
int setregid(gid_t rgid, gid_t egid);

Before saved set-UIDs, you could make these calls in 4.3BSD to set your UID and EUID. If you are root, you can set them both to whatever you want. Otherwise, you are only allowed to set UID to EUID and vice versa. If saved set-UIDs are supported by your system, you are also allowed to set UID to saved set-UID.

Process Accounting

int acct(const char *filename);

This call turns on process accounting and dumps records to the specified file name. Each time a process terminates, information will be appended to this file. If filename is NULL, accounting is turned off.

The information stored in each record of the binary file is defined as struct acct in <sys/acct.h> (the Linux version is shown here):

    struct acct {
        char    ac_comm[ACCT_COMM]; /* Accounting command name */
        time_t  ac_utime;           /* Accounting user time */
        time_t  ac_stime;           /* Accounting system time */
        time_t  ac_etime;           /* Accounting elapsed time */
        time_t  ac_btime;           /* Beginning time */
        uid_t   ac_uid;             /* Accounting user ID */
        gid_t   ac_gid;             /* Accounting group ID */
        dev_t   ac_tty;             /* controlling tty */
        char    ac_flag;            /* Accounting flag */
        long    ac_minflt;          /* Accounting minor pagefaults */
        long    ac_majflt;          /* Accounting major pagefaults */
        long    ac_exitcode;        /* Accounting process exitcode */
    };

The field ac_flag can be any of the following OR'd together (might not hold for all systems--check your man page):

AFORK
Process is from a fork() but never called exec.

ASU
Process used superuser privileges.

ACORE
Process dumped core.

AXSIG
Process killed by a signal.

Of course, you have to be superuser to turn process accounting on and off. If accounting is already on and you want to examine the file (and it's readable), it's generally /var/adm/pacct.

char *getlogin(void);

Returns the name of the user logged in to the controlling terminal of this process.

clock_t times(struct tms *buf);

This function returns the number of clock ticks since the system was last booted, and fills the buf structure with information pertaining to the number of clock ticks that have elapsed for the current process (and its child processes).

The argument buf is a pointer to a struct tms:

    struct tms {
        clock_t tms_utime;  /* user time */
        clock_t tms_stime;  /* system time */
        clock_t tms_cutime; /* user time of children */
        clock_t tms_cstime; /* system time of children */
    };

By sampling the time before a section of code (or child process) is executed, then sampling it again afterward and subtracting, you can come up with the total real time, user time, and system time used by that section of a process.
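Here is a rough sketch of that before-and-after technique, timing a simple busy loop. The function name time_busy_loop() is invented, and the ticks-per-second value is obtained with sysconf(_SC_CLK_TCK), which is an assumption of this example rather than something taken from these notes.

```c
#include <sys/times.h>
#include <unistd.h>

/* Hypothetical helper: time `iterations` rounds of a busy loop and
 * report the real, user, and system time used, in seconds.
 * Returns 0 on success, -1 on error. */
int time_busy_loop(long iterations, double *real, double *user, double *sys)
{
    struct tms before, after;
    clock_t t0, t1;
    long ticks = sysconf(_SC_CLK_TCK);  /* clock ticks per second */
    volatile unsigned long sink = 0;    /* keeps the loop from being optimized away */
    long i;

    if (ticks <= 0)
        return -1;

    t0 = times(&before);                /* first sample */
    if (t0 == (clock_t)-1)
        return -1;

    for (i = 0; i < iterations; i++)    /* the code being measured */
        sink += (unsigned long)i;

    t1 = times(&after);                 /* second sample */
    if (t1 == (clock_t)-1)
        return -1;

    /* subtract the samples and convert ticks to seconds */
    *real = (double)(t1 - t0) / ticks;
    *user = (double)(after.tms_utime - before.tms_utime) / ticks;
    *sys  = (double)(after.tms_stime - before.tms_stime) / ticks;
    return 0;
}
```

To time a child process instead, take the difference of tms_cutime and tms_cstime; those fields only include children that have already been waited for.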

Actually, it returns the number of clock ticks for real, user, and system time, but you can find the number of seconds by dividing it by CLK_TCK (the number of ticks per second). CLK_TCK