Chapter 1: The fork() and exec() system calls


The fork() system call

 

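The fork() system call creates a child process that is a copy of the calling (parent) process. The child starts executing from the instruction immediately following the fork() call. Because every process that reaches a fork() call is duplicated, n successive fork() calls leave 2^n processes in total, that is, 2^n - 1 new child processes in addition to the original one.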

 

An important thing to remember about fork() is that every child is an independent process with its own process ID (PID). Hence, the parent process and the child process can run concurrently.

 

             Example 1:

 

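     #include <stdio.h>
     #include <sys/types.h>
     #include <unistd.h>

     int main(void) {
          pid_t pid;

          pid = fork();

          if (pid > 0)        /* parent: fork() returned the child's PID */
               printf("Child process ID: %d\n", (int)pid);
          else if (pid == 0)  /* child: fork() returned 0 */
               printf("Parent process ID is %d\n", (int)getppid());
          else                /* fork() failed */
               perror("fork");

          return 0;
     }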

 

To the parent process, fork() returns the value of the child’s PID.

 

As soon as fork() is called, a child process is created. The child also gets its own copy of the variable "pid", but in the child fork() returns 0. Thus, the variable "pid" in the child process is always zero.

 

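Sometimes fork() cannot create the child process, for example when memory is short or the limit on the number of processes has been reached. In that case, fork() returns -1 to the caller, no child is created, and errno is set to indicate the error.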

 

The getpid() function returns the PID of the process from which it is called. The getppid() function returns the PID of the parent of the process from which it is called.

 


Orphans and Zombies

 

Orphans

 

            Example 2:

 

     #include <stdio.h>
     #include <sys/types.h>
     #include <unistd.h>

     int main(void) {
          pid_t pid;

          pid = fork();

          if (pid == 0) { // child's part
               printf("I am the child\n");
               printf("My parent's PID: %d\n", (int)getppid());
               printf("My PID: %d\n", (int)getpid());

               sleep(20); // child put to sleep for 20 sec

               printf("I am the child, I have become ORPHAN");
               printf("\nMy PID is: %d", (int)getpid());
               printf("\nMy parent's PID is: %d\n", (int)getppid());
          } else if (pid > 0) { // parent's part
               printf("\nI am the Parent & I am dying \n");
               printf("My PID is: %d", (int)getpid());
               printf("\nThe PID of my parent: %d\n", (int)getppid());
          } else { // fork() failed
               perror("fork");
          }

          return 0;
     }

 

Here, the child prints the PID of its parent and its own PID and goes to sleep. Meanwhile, the parent continues to execute: it prints its own PID and the PID of its parent (the shell) and terminates.

 

After 20 seconds, the child wakes up to find that the parent has already terminated. Thus, the child now becomes an orphan.

 

Unlike in the case of humans, the orphan child is adopted immediately, by the process dispatcher (the init process, PID 1). Hence, after sleep() returns, the child displays its own PID as before, but the getppid() call now returns the PID of the process dispatcher instead of the PID of its original parent.

 

The ps -l command in UNIX shows 'O' in the second column against the child process, notifying that its original parent has terminated.

Unfortunately, a ps -l command in Linux will not show 'O' in the second column, because Linux does not mark orphan processes; there the adoption is visible only in the PPID column.

 

The following is the output of the above program when I ran it on my PIV machine:

 

I am the child

My parent’s PID: 3871

My PID: 3872

 

I am the Parent & I am dying

My PID is: 3871

The PID of my parent: 3828

 

[AFTER 20 SEC]

 

I am the child, I have become ORPHAN

My PID is: 3872

My parent’s PID is: 1

 

As we can see, after the parent dies, the process dispatcher (the init process, PID 1) adopts the child. Hence, after 20 seconds, the child prints the PID of its parent as 1.

 

Zombies

 

Zombies are processes that have terminated but have not yet been removed from the process table. Let us assume that there is a parent process that creates a child. Both of them now have an entry in the process table. Let us further assume that the child process terminates well before the parent does. The child's entry is kept so that the parent can still collect its exit status, and as long as the running parent has not done so, the entry cannot be removed. The child therefore exists in a twilight zone and becomes a zombie.

 

Let us take an example:

 

     Example 3:

 

     #include <stdio.h>
     #include <unistd.h>

     int main(void) {
          if (fork() > 0) { // parent's part
               printf("\nI am the Parent going to sleep\n");
               sleep(20); // parent put to sleep for 20 sec
          } else { // child's part
               printf("\nI am the Child & after my ");
               printf("termination I become a zombie!\n");
          }

          return 0;
     }

 

 

The command ps -l, run while the parent is still alive, shows 'Z' in the second column for the child process, indicating that it is a zombie.

 

[root@DEVELOPMENT root]$ ps -l

 

F S  UID PID   PPID  C PRI  NI ADDR  SZ  WCHAN  TTY      TIME      CMD

0 S  500 3828  3826  0 75   0    -  1081 wait4  pts/0    00:00:00 bash

0 T  500 4438  3828  0 75   0    -   335 finish pts/0    00:00:00 a.out

1 Z  500 4439  4438  0 75   0    -     0 do_exi pts/0    00:00:00 a.out

0 R  500 4448  3828  2 75   0    -   781   -    pts/0    00:00:00 ps

 

 

The above table shows that the child process with PID 4439 is a zombie, as expected.

 

Note that fork() creates two identical processes, but they are totally independent. The two processes start with the same variables holding the same values, but each has its own copy, so a change made by one process is not seen by the other. This can be shown by the following example:

 

 

            Example 4:

 

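     #include <stdio.h>
     #include <unistd.h>

     int main(void) {
          int i = 10;

          if (fork() > 0) {
               printf("\nThe value of i in parent is %d\n", i);
          } else {
               i += 10; // changes only the child's copy of i
               printf("\nThe value of i in child");
               printf(" after incrementation is %d\n", i);
          }

          return 0;
     }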

 

 

The ‘exec()’ system call

 

We start with an example to illustrate the exec() function and how it differs from fork().

 

            Example 5

 

     ex1.c

 

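     #include <stdio.h>
     #include <unistd.h>

     int main(void) {
          printf("\nBefore exec my PID: %d", (int)getpid());
          printf("\nMy parent's PID: %d", (int)getppid());

          // No trailing newline: this text may still be in the stdio
          // buffer when execl() replaces the program, and is then lost.
          printf("\nExec Starts ...");
          execl("/root/ex2", "ex2", (char *)0);

          // Reached only if execl() fails.
          printf("\nThis should not print ...\n");
          return 1;
     }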

 

    

     ex2.c

     #include <stdio.h>
     #include <unistd.h>

     int main(void) {
          printf("\nAfter exec, my PID:%d", (int)getpid());
          printf("\nMy Parent PID:%d", (int)getppid());
          printf("\nExec ends\n");
          return 0;
     }

 

 

Compile the two programs as:

 

[root@DEVELOPMENT root]$ cc -o ex1 ex1.c

[root@DEVELOPMENT root]$ cc -o ex2 ex2.c

 

 

That is, each of the two programs must be compiled separately into its own executable file.

 

Now, let us analyze this program:

 

 

ex1.c is the main program that calls the ex2 program through the execl() call. First, it prints its own process ID and that of its parent. Next, it calls the ex2 program through execl(). This function takes a variable number of arguments, depending on what parameters we want to pass to the newly called program. The first argument is the path of the ex2 executable, i.e., its file name including the directory where ex2 resides. The next argument is the name of the program itself, here "ex2"; it becomes argv[0] of the new program. The subsequent arguments are the command line parameters that are passed to ex2. The last argument is always a null pointer, written (char*)0, which marks the end of the list.

 

After the call to execl(), the memory space previously occupied by ex1 is overwritten by the ex2 program, and thus any code in ex1 that follows the call to execl() never gets executed, unless execl() fails and returns -1. Here, the last printf() is never called. Instead, the main() of ex2 gets executed.

 

The ex1 program above, when run on my system, gives:

 

    [root@DEVELOPMENT root]$ ./ex1

 

     Before exec my PID: 3931

     My parent’s PID: 3816

 

     After exec, my PID:3931

     My Parent PID:3816

     Exec ends

 

It is clear from the above output that, unlike fork(), execl() does not create a new process. The PID is the same before and after the call, indicating that the same process now runs the new program: the memory image of the old program is completely overwritten by the new one, and the old program no longer has an existence of its own.

 

To better understand the execl() call, we take another example:

 

Example 6

 

     ex1.c

 

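     #include <stdio.h>
     #include <unistd.h>

     int main(int argc, char *argv[]) {
          if (argc < 5) { // need: path, name and two arguments
               fprintf(stderr, "usage: %s path name arg1 arg2\n", argv[0]);
               return 1;
          }

          // print PID's here ...

          execl(argv[1], argv[2], argv[3], argv[4], (char *)0);

          printf("\nThis should not print\n");
          return 1;
     }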

 

     ex2.c

 

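     #include <stdio.h>

     int main(int argc, char *argv[]) {
          if (argc < 3) // expects two arguments after the program name
               return 1;

          printf("\nChild process after exec() call is %s", argv[0]);
          printf(" & its arguments are: %s %s\n", argv[1], argv[2]);
          return 0;
     }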

 

 

Run ex1 as:

 

[root@DEVELOPMENT root]$ ex1 /root/ex2 ex2 Hello World

                   

Here, for ex1 we have:

 

argv[0]="ex1"

argv[1]="/root/ex2"

argv[2]="ex2"

argv[3]="Hello"

argv[4]="World"

 

So, essentially a call to execl() from ex1.c can be written as:

 

execl("/root/ex2","ex2","Hello","World",(char*)0);

 

Thus, "Hello" and "World" are the two arguments that are passed to main() of ex2.

 

In ex2,

argv[0]="ex2" & argv[1]="Hello" & argv[2]="World"

 

which get printed by the printf() calls.

The execv() call:

 

We can use the execv() call instead of execl().

 

To use execv() in the previous ex1.c program, we modify it as follows:

 

            Example 7

 

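     #include <stdio.h>
     #include <unistd.h>

     int main(int argc, char *argv[]) {
          char *temp[4];

          if (argc < 5) { // need: path, name and two arguments
               fprintf(stderr, "usage: %s path name arg1 arg2\n", argv[0]);
               return 1;
          }

          temp[0] = argv[2];
          temp[1] = argv[3];
          temp[2] = argv[4];
          temp[3] = (char *)0;

          // print PID's here ...

          execv(argv[1], temp);

          printf("This should not print \n");
          return 1;
     }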

 

Thus, instead of separately listing the called program's name and its command line parameters in the call, we put them in an array and pass the array instead. This avoids hard-coding the passed parameters in the call itself, so the number of arguments given to the called program can be decided and adjusted at run time.

(C) Anirban Sinha, 2003; All Rights Reserved.