CSE 451/851 Programming Assignment 2

Writing a UNIX shell


Before we even start :

DO NOT PUT THE FORK() STATEMENT INSIDE A INFINATE LOOP!!. This means no while(1), or equivalent ANYWHERE in your code.

1. Overview

It is not the purpose of this assignment to write a fully functional UNIX shell. Rather, the purpose of this programming assignment is to introduce and/our familiarize ourselves to various concepts useful in systems programming such file I/O, processes control, and interprocess communication. As such, we will only code parts of the shell that help introduce these topics.

2. Due Dates

Parsing Command Line input (Phase 1) : Feb 6th - Optional - But:
- Handing in a mostly functioning phase 1 will grant you 48hrs (2 days) extra to work on your code without being penalized.
Complete Programming Assignment (Phase 3) : Feb 10th (Feb 12 if you turn in Phase 1)

3. Handing in your assignment

Use web handin to submit your program as homework 2a and 2b respectively.

Include a Makefile with your assignment, running make in your project directory must generate an executable called osh

Limit extranious output in your final shell, to help debug your code use a verbose flag in your code to control printing of informative statements based on this flag

Include a README file in with your code. Use the README file to:

A tester app is available on cse at ~cse451/public/prog2/testshell

To use the tester shell against the demo shell apps you would run

> ~cse451/public/prog2/testshell ~cse451/public/prog2/shell_phase3 

To use the tester shell against your shell you would run

> ~cse451/public/prog2/testshell osh

4. Assignment in brief

Write a shell that prints a prompt "osh>", reads in a line from standard in, parses the line, populates a command data structure such as the one given in comm.h and execute the commands from the command data structure. This shell will support the following redirection and command combination syntax shown below (Much like the standard 'csh' or 'tcsh' shells in UNIX).
'>' - redirect stdout to file
'<' - redirect stdin from file
'>>' - append stdout to file
'|' - pipe stdout of one command into stdin of next.
'&&' - Execute next command on success.
'||' - Execute next command on failure.
';' - Execute next command regardless of success or failure.

4.1 Examples

osh> ls
osh> ls > out
osh> cat < code.c
osh> cat < code.c > out
osh> cat >> out
osh> cat < code.c | more
osh> who | grep paul && echo "Paul is logged in"
osh> who | grep paul || echo "Paul not logged in"
osh> exit

5. Detailed Discussion and Description

This assignemnt is based on programming assignent Project 1 (part I) which is described on pg 157 of the text. We will not implement Part II of the project in the text. Instead we will extend the shell to suppot the file redirections, IPC(pipes) and command combinations described above. This assignment can be overwhelming if not addressed correctly, to help, I have broken this assignemnt into three phases. I suggest you also work on this assignemnt as such - completing each phase before you proceed to the next.

5.1. Phase 1 - Parsing the Input String

The key to this assignment is correctly parsing the input line and building a data structure to represent the parsed data. There is a lot of booking in this program, opening various files, in various modes, etc. If you are able to tokenize the input(command) line and represent it in a list of objects or structures the rest of the program gets significantly easier. Consider the following recommendations below:

In phase 1, the program should correctly parse the input string, identify malformed commands and when complete, the internal command structure should be shown. A sample program can be executed on cse.unl.edu: ~cse451/public/prog2/shell_phase1 Note: This program either takes a "-v" flag or type in "verbose" in the command prompt to place the program in verbose mode, which will shed some light on the internal workings of the program.

5.2. Phase 2 - Forking a child process and execing a command

In this phase, fork() and exec() the commands with their respective argumetns, ignoring file redirects and pipes (treat pipes as ";"). A sample program can be executed from cse:~cse451/public/prog2/shell_phase2 Note: This program either takes a "-v" flag or type in "verbose" in the command prompt to place the program in verbose mode.

Create a child process to launch the executable. As covered in class you must use fork to exec a new program, else the shell will be replaced with the exec command and cease to exist. Shown below is pseudo code to accomplish this. Note the four rules when using fork as covered in class.

  1. Fork is not placed in a while(1) loop.
  2. We trap and exit if the fork fails.
  3. We always have an exit in the child (NOT Shown in Fig 3.9 of the text)
  4. Have a wait in parent (unless we have multiple exec's - i.e. pipes)

     /* Rule 1: Dont forget this should not go inside a while(1) loop anywhere in your program */
     pit_t cpid = fork();
     if (cpid < 0 ){    /* Rule 2: Exit if Fork failed */
        fprintf(stderr, "Fork Failed \n");
        exit(1);
     }
    
     else if (cpid == 0 ){
 
         execlp(....);
         fprintf(stderr, "Exec Failed \n");
         exit(1)  /*Rule 3: Always exit in child */
     }
    
     else {

         int status;  
         wait(&status);	/* Rule 4: Wait for child unless we need to launch another exec */
         printf("Child caught bla bla bla\n");
     } 

You have a choice of six exec functions, execl(), execlp(), execle(), execv(), execv(), execvp(), execve(). You are restricted by two facts, first you do not know ahead of time how many arguments you will have and second our shell does not maintain a hash-table of executables and their location. We would instead like the exec function to find the executable for us (Very unshell like) What all this means is that only one of the above exec function will work.

You will also need to write a function to build the argv structure. Argv (char **argv) is a null terminated array of strings and is represented by the figure below. The pseudo code to build an array of strings follows.

char ** BuildArgv(Arg *arg_list){

     char **argv;
     int argc;

     /*
      * Compute argc
      */
     snip.. snip.. snip.. 

     /*
      * Malloc/New the space for argv - Select one! - Note the + 1 for the null pointer.
      */
     argv = (char **)(malloc(sizeof(char *) * argc + 1 )); /* C  malloc*/
     argv = new char*[argc+1]; /* C++ new  */

     /*
      * Make the argv pointers point to the strings in arg_list
      */
     for (argc = 0; tmp_list != NULL ; argc++){

         argv[argc] = snip.. snip.. snip.. 

     }

     /*
      * Terminate the argv structure with a Null as shown in
      * the above figure.
      */

     argv[argc] = NULL;
     return (argv);
}

5.3. Phase 3, File redirection and Pipes.

In this final phase, the program(child) process will open any files for redirection and dup these file handles into stdin and stdout as appropritae. The parent process will have to create any pipes as necessary (common ancestor requried). The parent process will also have to wait for each exec to complete, unless we have a pipe, in which case a queue is used to collect outstanding child pids and we wait for these children at the end of the pipe.

Processes reading from pipes will hang (not receive an EOF) until ALL write file descriptors to the pipe have been closed. If you pipe hangs - You have open write pipe descriptors. Look as the slides for this chapter to understand how creating the pipes in the parent and forking two children will require careful closing of various file descriptors for pipes to work. To minimize open write pipe descriptors, follow the below recommendations.

Again, you can access a sample program on cse by executing ~cse451/public/prog2/shell_phase3