
Redirection and Pipes


 

Command Return Values

Every Unix command returns a code which indicates whether the command was successful. The code is an integer in the range 0 to 255, where a value of 0 means that the command was successful, and any nonzero value indicates an error.

After each command, the shell stores the return code in a special shell variable, which we read by writing "$?". We can print this after a command to see whether it was successful:

ifinlay@cpsc:~$ rm real-file
ifinlay@cpsc:~$ echo $?
0
ifinlay@cpsc:~$ rm fake-file
rm: cannot remove `fake-file': No such file or directory
ifinlay@cpsc:~$ echo $?
1

As you can see, the first remove command succeeded (with a code of 0), while the second one failed (with a code of 1).

Return codes will be helpful for writing scripts as we can test if the commands in the script succeed or not.
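
As a minimal sketch of how a script might test a return code, the snippet below uses the standard true and false commands, which always succeed and always fail respectively, as stand-ins for real commands. The report function is a hypothetical helper written for this illustration; it is not a standard command.

```shell
# report is a hypothetical helper: it runs the command it is given
# and prints a message based on that command's return code.
report() {
    if "$@"; then
        echo "succeeded: $*"
    else
        # In the else branch, $? still holds the code of the failed command.
        echo "failed with code $?: $*"
    fi
}

report true     # true always returns 0
report false    # false always returns a nonzero code
```

The if statement never looks at what the command prints. It only checks whether the return code is 0.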


 

Combining Commands

Return codes are also useful for chaining together commands. For instance, if we are writing a Java program, we might want to compile it and, if the compilation is successful, then run the program. This can be done with the && operator:

ifinlay@cpsc:~$ javac Program.java && java Program

This line combines two commands. The first, javac Program.java, compiles Program.java; it may succeed (if the program compiles cleanly) or fail (if the program has errors). The second, java Program, runs the program. The && operator runs the first command and, only if it is successful, runs the second command afterwards.

Note: You may wonder at the usefulness of this, but by using the up arrow to repeat the last command, you can rerun both commands together instead of one at a time, which is more efficient.

There is also an || operator, which likewise uses the return code of the first command to determine whether to run the second. Unlike &&, which runs the second command only if the first succeeds, || runs the second command only when the first fails.

This is commonly used to put error messages onto commands that fail. For instance, we could run:

ifinlay@cpsc:~$ javac Program.java || echo "Compilation failed"

If the compilation step fails, then the echo command will run. Otherwise, it will not.

These are most useful for scripting, but are occasionally useful for interactive use as well. They are also often used in command line instructions, such as those telling us how to install or configure software, so they are good to know.

Another way of combining commands is with the ; operator. This connects two commands and simply runs them in sequence, regardless of whether the first one is successful:

ifinlay@cpsc:~$ echo "Hello"; echo "There"

This command will run the first echo, then the second. This is helpful when we want to run multiple commands as one unit. For example, if we are compiling, then running, then editing a program, we would need different commands for each of these. Instead, we can do something like the following:

ifinlay@cpsc:~$ javac Program.java && java Program; sleep 1; vim Program.java

This will compile the program and then, if the compilation was successful, run it. The command then sleeps for one second (so we can check the output of compiling and running it), then opens the file back up in Vim. Note that the sleep and vim commands will run regardless of whether the compilation succeeds, because the ; operator ignores return codes.

Having all this as one command means we can repeat it simply by tapping the up arrow on the keyboard.
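
The three operators can be compared side by side in a small sketch. Here true and false stand in for a command that succeeds and a command that fails, and demo_chaining is a hypothetical function name used only to group the lines together.

```shell
# true and false stand in for commands that succeed and fail.
demo_chaining() {
    true  && echo "A: printed, because the first command succeeded"
    false && echo "B: never printed"
    true  || echo "C: never printed"
    false || echo "D: printed, because the first command failed"
    false ;  echo "E: printed regardless of the first command"
}

demo_chaining
```

Only lines A, D, and E appear. Swapping true and false in a line is a quick way to check our understanding of each operator.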


 

Output Redirection

Recall that the sed program prints the result of its substitution to the screen:

ifinlay@cpsc:~$ sed 's/old text/new text/' file.txt
new text
a line with nothing important
here is some new text again

This may seem like a useless default, but the reason is that Unix allows you to redirect the output of any command into a file. This is done by placing a '>' character after the command, followed by the name of the file to redirect to. To save the result of the sed command above:

ifinlay@cpsc:~$ sed 's/old text/new text/' file.txt > newfile.txt

This command runs sed and writes its output to "newfile.txt" instead of to the screen.

As another example, we can use echo to create files with text already in them:

ifinlay@cpsc:~$ echo "Hello" > a.txt
ifinlay@cpsc:~$ cat a.txt
Hello

If the file we redirect to already exists, it will be overwritten. We can also append the output of a program to a file using >>:

ifinlay@cpsc:~$ echo "There" >> a.txt
ifinlay@cpsc:~$ echo "Friend" >> a.txt
ifinlay@cpsc:~$ cat a.txt
Hello
There
Friend
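
The difference between > and >> is easiest to see when the file already exists. The following is a small sketch that works in a scratch directory made with mktemp -d so that no real files are touched; the file name notes.txt is made up for the example.

```shell
workdir=$(mktemp -d)          # scratch directory for the experiment
notes="$workdir/notes.txt"    # hypothetical file name

echo "first"  > "$notes"      # creates the file
echo "second" > "$notes"      # > overwrites: "first" is gone
echo "third" >> "$notes"      # >> appends after "second"

cat "$notes"                  # shows "second" then "third"
```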

Output redirection is also very useful for the programs that we write ourselves. For instance, if we are working on a program which produces some output, and we want to check that it is correct, we can redirect it to a file, then run diff between that file and a file containing the known correct output:

ifinlay@cpsc:~$ python3 program.py > output
ifinlay@cpsc:~$ diff output correct

 

Error Redirection

There are actually two types of output that Unix commands print to your terminal window. The one usually used is called "standard output", often abbreviated "stdout". The other, which is used for error messages and warnings, is called "standard error" or "stderr". The > redirection above redirects only stdout. To redirect stderr, we can use "2>".

One instance where this is helpful is when using find to search in places other than our home directory. For instance, if we are looking for a header file like "omp.h" any place in /, find will produce warnings for every directory it is not allowed to search in (due to file permissions):

ifinlay@cpsc:~$ find / -name omp.h
find: `/var/www/clipboard/X': Permission denied
find: `/var/www/data/configs/checks/problems/resources': Permission denied
find: `/var/run/sudo': Permission denied
find: `/var/webmin': Permission denied
find: `/var/log/apache2': Permission denied
find: `/var/log/mysql': Permission denied
find: `/var/lib/mediawiki/config': Permission denied
find: `/var/spool/cups': Permission denied

These are only a few of the 735 permission denied messages, which make the actual result nearly impossible to find. To redirect stderr into a file, so we can actually see the output of find:

ifinlay@cpsc:~$ find / -name omp.h 2>error
/usr/lib/gcc/x86_64-linux-gnu/7/include/omp.h

Now the error messages are in a file called "error", and only stdout gets printed to the screen. This file is part of the gcc package by the way.

For output like this that we just do not care about, we can redirect to a special file called "/dev/null", which everyone has access to write to, but which just discards any data that is written to it:

ifinlay@cpsc:~$ find / -name omp.h 2>/dev/null
/usr/lib/gcc/x86_64-linux-gnu/7/include/omp.h
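
To experiment with the two streams without searching the whole file system, we can sketch a command that writes one line to each. The noisy function below is hypothetical, and it uses >&2, which sends the output of a single command to stderr; the file names are also made up for the example.

```shell
# noisy is a hypothetical command that writes one line to each stream.
noisy() {
    echo "a normal result"            # goes to stdout
    echo "a warning message" >&2      # >&2 sends this line to stderr
}

workdir=$(mktemp -d)                                # scratch directory
noisy > "$workdir/out.txt" 2> "$workdir/err.txt"    # split the two streams
quiet=$(noisy 2>/dev/null)                          # discard stderr, keep stdout

cat "$workdir/out.txt"    # a normal result
cat "$workdir/err.txt"    # a warning message
echo "$quiet"             # a normal result
```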

 

Input Redirection

We can also redirect a program's input to come from a file instead of the keyboard. This is very useful when working on our own programs. For instance, suppose we have a Python program which asks the user to enter a number of values, and then calculates the average:


# program which finds the average of some numbers
N = int(input("Enter the number of values:\n"))
total = 0
for i in range(N):
    total = total + int(input("Enter a value:\n"))
print("Average is", total / N)

When we run this program, we could enter the values by hand each time:

ifinlay@cpsc:~$ python3 program.py
Enter the number of values:
5
Enter a value:
1
Enter a value:
2
Enter a value:
3
Enter a value:
4
Enter a value:
5
Average is 3.0

However, this will become tedious, especially as our test cases grow larger. Instead, we can place the input to this program into a file, and then redirect it into the program.

To do this, we can create the text file with Vim, and put the input we want in it:

ifinlay@cpsc:~$ vim input.txt

We can check the contents of the file with cat, and then run the program with its input redirected from input.txt:

ifinlay@cpsc:~$ cat input.txt 
5
1
2
3
4
5
ifinlay@cpsc:~$ python3 program.py < input.txt
Enter the number of values:
Enter a value:
Enter a value:
Enter a value:
Enter a value:
Enter a value:
Average is 3.0

Here we placed the input intended for the program in a file called "input.txt", and then fed that into the program we are working on using the < operator. Note that our program does not know that the input is coming from a file; it thinks that it is reading directly from the user, and the shell is making the switch for us. This trick works with any programming language; it's not Python specific.

File redirection can be very helpful when testing programs. For instance we can create test input files, and files that contain the known correct outputs, and test them automatically with redirection and diff:

ifinlay@cpsc:~$ python3 program.py < input.txt > output.txt
ifinlay@cpsc:~$ diff output.txt correct.txt

Here, we redirect input and output, then compare the output we got with the known correct output by passing both files to diff. If diff reports no changes, we know we have the right answer. This allows us to test our programs automatically as we work on them, which can prevent introducing regressions as we work.
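
The same testing idea can be sketched as a small script that reports a verdict based on the return code of diff, which is 0 when the files match. To keep the sketch self-contained, the shell function sum_input stands in for the program under test (it reads a count and then that many values, and prints their sum); the function and the file names are assumptions made for illustration.

```shell
# sum_input is a hypothetical stand-in for the program being tested.
# It reads from stdin and does not know whether that is a keyboard or a file.
sum_input() {
    read -r count
    total=0
    i=0
    while [ "$i" -lt "$count" ]; do
        read -r value
        total=$((total + value))
        i=$((i + 1))
    done
    echo "Sum is $total"
}

testdir=$(mktemp -d)                                  # scratch directory
printf '5\n1\n2\n3\n4\n5\n' > "$testdir/input.txt"    # test input
echo "Sum is 15" > "$testdir/correct.txt"             # known correct output

# Redirect both input and output, then let diff decide.
sum_input < "$testdir/input.txt" > "$testdir/output.txt"
if diff "$testdir/output.txt" "$testdir/correct.txt"; then
    verdict="PASS"
else
    verdict="FAIL"
fi
echo "$verdict"
```

When the output is wrong, diff prints the differing lines and the script reports FAIL, so one run shows both that a test failed and why.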


 

Pipes

Another Unix technique which changes the input or output of a command is a pipe. A pipe connects the output of one program to the input of another program. The form of this is command1 | command2. The shell starts both commands and funnels the stdout of command1 into the stdin of command2 as it is produced.

For an example of when this is useful, the ps -A command lists all processes running, which can be a large number. If we want to search for something specific in the output, we can pipe the output of ps -A through grep. The example below searches for apache processes:

ifinlay@cpsc:~$ ps -A | grep apache
 3550 ?        00:04:17 apache2
 3714 ?        00:03:30 apache2
 5718 ?        00:02:23 apache2
 8368 ?        00:02:30 apache2
 8615 ?        00:02:27 apache2
 8656 ?        00:02:25 apache2
 8658 ?        00:02:24 apache2
10093 ?        00:01:02 apache2
10094 ?        00:01:00 apache2
10579 ?        00:00:57 apache2
12372 ?        00:00:26 apache2

This works because if we don't pass grep a file to search, then it will search from stdin instead. By using the pipe, we pass the output of ps into the input of grep which does the searching.

Pipes allow programs to each do some specific task, and work together to accomplish more complex tasks. Rather than having the ps command itself perform searching, we can use grep to search as it is tailored specifically to that task.

We can chain as many pipes together as we wish. For example, if we wanted to know how many apache2 processes are running, we could use two pipes:

ifinlay@cpsc:~$ ps -A | grep apache | wc -l
11

This takes the output of ps -A, and passes it into grep, just like before. Now, however, the output of grep is passed into wc -l which is a word count program. The "-l" option says to count lines, so wc -l takes its input and reports how many lines it contains.
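
Because the output of ps -A differs on every machine, the sketch below uses a hypothetical fake_ps function that prints a fixed, made-up process list. The pipeline itself is the same one shown above.

```shell
# fake_ps is a hypothetical stand-in for ps -A with fixed, made-up output,
# so the result is the same on every machine.
fake_ps() {
    printf '%s\n' \
        "  PID TTY          TIME CMD" \
        "    1 ?        00:00:34 init" \
        " 3550 ?        00:04:17 apache2" \
        " 3714 ?        00:03:30 apache2" \
        " 9001 ?        00:00:02 sshd"
}

fake_ps | grep apache                     # prints the two apache2 lines
count=$(fake_ps | grep apache | wc -l)    # count the lines grep let through
echo "apache processes: $count"
```

Each stage only sees the lines the previous stage printed, so wc -l counts two lines rather than five.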


 

More Pipe Examples

One program that is handy to pipe into is less which is the pager program used by man. For instance, we can pipe the output of ps -A into less so that we can more easily browse and search the output using the less interface:

ifinlay@cpsc:~$ ps -A | less

We can also use vim to view long command output. To do this, we launch vim with a hyphen after it, which tells it to read from stdin instead of a regular file:

ifinlay@cpsc:~$ ps -A | vim -

Another command which produces lots of output is the history command which prints your most recently used commands. If we did a really cool pipe command that we want to look at again, we can search for it:

ifinlay@cpsc:~$ history | grep "|"

The head and tail commands are also useful for showing only the first few, or last few, lines of output respectively. Each takes a numeric option which indicates how many lines to print (the default is 10):

ifinlay@cpsc:~$ ps -A | head -3 
  PID TTY          TIME CMD
    1 ?        00:00:34 init
    2 ?        00:00:00 kthreadd
ifinlay@cpsc:~$ ps -A | tail -3
21905 ?        00:11:10 httpd
23453 ?        00:02:46 pbx_exchange
24935 ?        00:01:34 vnetd
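
The line count can also be written with the -n option, as in head -n 3, which is the spelling the POSIX standard specifies; the shorter head -3 form is an older one that many systems still accept. The sketch below tries both commands on a fixed list of twelve lines printed by a hypothetical letters function.

```shell
# letters is a hypothetical helper that prints twelve lines, a through l.
letters() {
    printf '%s\n' a b c d e f g h i j k l
}

first_three=$(letters | head -n 3)         # a, b, c
last_three=$(letters | tail -n 3)          # j, k, l
default_count=$(letters | head | wc -l)    # head alone keeps 10 lines

echo "$first_three"
echo "$last_three"
echo "lines kept by default: $default_count"
```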

Another handy trick is the awk command. awk is actually a very versatile tool, but one common usage of it is to print only certain columns of its input. Many Unix commands print column-oriented data, so this is a handy tool. For instance, if we wanted to print only the "CMD" column of the ps command, we could use awk to do so:

ifinlay@cpsc:~$ ps -A | awk '{print $4;}'

The syntax for awk is a little tricky since it is actually a full programming language, but the $4 refers to the fourth column of input, which we tell awk to print. The first 10 lines of this can be seen with:

ifinlay@cpsc:~$ ps -A | awk '{print $4;}' | head
CMD
init
kthreadd
migration/0
ksoftirqd/0
watchdog/0
migration/1
ksoftirqd/1
watchdog/1
migration/2

We could also use this to print only the commands in our history and not the numbers on the side:

ifinlay@cpsc:notes$ history | tail
 1036  history
 1037  vim 08-redirection-pipes.html
 1038  ls -l
 1039  pwd
 1040  ls
 1041  cd
 1042  cd -
 1043  cd 225/notes
 1044  ls
 1045  history
ifinlay@cpsc:notes$ history | awk '{print $2;}' | tail
history
ps
history
ls
cd
cd
cd
ls
history
history

This only shows the second column, which is the name of the command we ran.
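
To see exactly what awk keeps and drops, we can run it on a few fixed lines. The fake_history function below is a hypothetical stand-in for history that prints four of the entries shown above.

```shell
# fake_history is a hypothetical stand-in for history with fixed output.
fake_history() {
    printf '%s\n' \
        " 1041  cd" \
        " 1042  cd -" \
        " 1043  cd 225/notes" \
        " 1044  ls"
}

fake_history | awk '{print $2;}'        # cd, cd, cd, ls
fake_history | awk '{print $1, $2;}'    # the number, then the command name
```

awk splits each line on whitespace, so any arguments (such as 225/notes) land in $3 and are not printed by the first command.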

Suppose we wanted to know how many distinct commands are in our history. This would count all cd commands only once, all vim commands only once, and so on.

For this, we can use the sort and uniq commands. sort takes its input, sorts it, and outputs it.

We can see the result of sorting our list of commands:

ifinlay@cpsc:notes$ history | awk '{print $2;}' | sort | tail
vim
vim
vim
vimdiff
wdiff
whic
which
which
which
yes

uniq takes its input and removes duplicated lines which are adjacent, which is why we sort first. This will collapse the three vim lines into one:

ifinlay@cpsc:notes$ history | awk '{print $2;}' | sort | uniq | tail 
top
touch
v
vim
vimdiff
vimtutor
wdiff
whic
which
yes

Now we can count them by piping all this into wc -l:

ifinlay@cpsc:notes$ history | awk '{print $2;}' | sort | uniq | wc -l
84
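
The reason sort must come before uniq can be checked with a short sketch. The commands function is a hypothetical stand-in for the list of command names, arranged so that no two equal lines are next to each other.

```shell
# commands is a hypothetical stand-in for the list of command names.
commands() {
    printf '%s\n' vim ls vim cd ls vim
}

unsorted_count=$(commands | uniq | wc -l)        # 6: no duplicates are adjacent
sorted_count=$(commands | sort | uniq | wc -l)   # 3: cd, ls, vim

echo "without sort: $unsorted_count"
echo "with sort: $sorted_count"
```

Without sort, uniq removes nothing here because it only compares each line with the one directly before it.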

This is a somewhat silly example, but learning to use pipes will allow you to combine the commands you learn in interesting ways to solve tricky problems.

Copyright © 2024 Ian Finlayson | Licensed under a Creative Commons BY-NC-SA 4.0 License.