Bash book
The Bash Shell - Simple usage
Before we dive
Bash stands for the Bourne Again Shell. The name is a pun on the Bourne shell (sh), the classic Unix shell it replaces, and on the phrase “born again”. Bash is GNU/Linux's de facto standard shell. Its syntax is very close to that of the Korn shell, another widespread Unix shell. You may never have encountered such terms, though, so let's start with some definitions. A shell, in this context, is a command-line shell, that is, an interface that stands between a user and an operating system, enabling interaction between them. Unlike the standard Windows graphical shell, however, every interaction goes through text commands.
A typical bash session could look like this:
<syntaxhighlight lang="bash">
sav@phiber ~ $ ls
l2 los
sav@phiber ~ $ vim .qmail
sav@phiber ~ $ irssi
sav@phiber ~ $ exit
</syntaxhighlight>
The keyboard inputs are what's between the $ and the end of line. ls stands for “list” (as in “list files”), vim is a text editor, irssi is an IRC client, exit, well, exits the shell.
Bash can be used in two modes, interactive and non-interactive. In interactive mode, users type commands, which are executed as they are issued. In non-interactive mode, programmers write programs (scripts) and then call them with parameters. The program runs, gets its input from somewhere else (a file, the keyboard, or another program), performs its operations, and exits.
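As a minimal sketch of the non-interactive mode, the snippet below writes a tiny script to a scratch directory and then calls it with one parameter. The file name greet.sh and its contents are made up for illustration; it also uses a here-document and output redirection, which are not covered in this chapter.

```bash
# Scratch directory so the sketch does not touch any real files.
workdir=$(mktemp -d)

# Write a hypothetical two-line script; $1 is its first parameter.
cat > "$workdir/greet.sh" <<'EOF'
#!/bin/bash
echo "Hello, $1"
EOF

# Non-interactive use: bash runs the program with the parameter "world".
greeting=$(bash "$workdir/greet.sh" world)
echo "$greeting"
```

Typing the same echo command directly at the prompt would be the interactive equivalent.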
Getting started
We won't go too deep into Unix/Linux programs and will try, as far as possible, to stick to shell commands. When that's not possible (for example, when it comes to filtering output), we will use standard Unix programs. We will perform basic operations such as reading a file, writing to a file, reading from the standard input, and chaining programs together. This knowledge will make you proficient in using the Unix shell.
Reading a file
The Unix command for reading a file is “cat” (short for “concatenate”). It takes as parameters the names of the files to display. For example:
<syntaxhighlight lang="bash"> sav@phiber ~ $ cat /etc/motd ┏━┃┃ ┃┛┏━ ┏━┛┏━┃ ┏━┛┏━┃┏━ ━┏┛┃ ┃┏━┃┏━┛┃ ┃ ┃ ┃┏━┛ ┏━┛┏━┃┃┏━┃┏━┛┏┏┛ ┃ ┏━┃┃ ┃ ┃ ┏━┃┏━┃┃ ┏┛ ┃ ┃━━┃ ┛ ┛ ┛┛━━ ━━┛┛ ┛┛━━┛┛ ┛┛ ┛ ┛ ┛ ┛┛ ┛━━┛┛ ┛┛━━┛━━┛ </syntaxhighlight> |
This displays the file /etc/motd (“Message of the Day”, a standard Unix file that is usually shown at login). And yes, it's ASCII art; I made it with toilet.
To read a long file, we will use the “less” command (or “more” if “less” is not available). more and less let you read through a file page by page, much like the DOS MORE command.
<syntaxhighlight lang="bash">
sav@phiber ~ $ more /proc/cpuinfo
</syntaxhighlight>
To read the first lines of a file, we use “head”:
<syntaxhighlight lang="bash">
sav@phiber ~ $ head -n 1 /etc/passwd
root:x:0:0:root:/root:/bin/bash
</syntaxhighlight>
To read the last lines of a file, we use “tail”:
<syntaxhighlight lang="bash">
sav@phiber ~ $ tail -n 2 /etc/passwd
randomx:x:1004:1004::/home/randomx:/bin/false
randomy:x:1005:1005::/home/randomy:/bin/false
</syntaxhighlight>
These two examples read the file /etc/passwd, which contains details about the different accounts on the system. We'll cover this file in more detail later.
To read a whole file while skipping its first lines, we still use “tail”:
<syntaxhighlight lang="bash">
sav@phiber ~ $ cat test.csv
name,address,phone,e-mail
joe biden,12 example st.,+1-234-8888,[email protected]
barack obama,1 example ave.,+1-234-1337,[email protected]
joe frazier,3 boxing bd.,+1-234-0101,[email protected]
sav@phiber ~ $ tail -n +2 test.csv
joe biden,12 example st.,+1-234-8888,[email protected]
barack obama,1 example ave.,+1-234-1337,[email protected]
joe frazier,3 boxing bd.,+1-234-0101,[email protected]
</syntaxhighlight>
These basic commands are essential once you are logged into a system. You will use them constantly.
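The three ways of reading part of a file can be tried safely on a throwaway file. This is only a sketch: the file sample.txt and its four lines are invented for the purpose.

```bash
# Create a scratch file with four numbered lines.
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\nline4\n' > "$workdir/sample.txt"

# head -n N keeps the first N lines.
first=$(head -n 1 "$workdir/sample.txt")

# tail -n N keeps the last N lines.
last_two=$(tail -n 2 "$workdir/sample.txt")

# tail -n +N starts at line N, so +2 skips only the first line.
no_header=$(tail -n +2 "$workdir/sample.txt")

printf '%s\n' "$first" "$last_two" "$no_header"
```

The tail -n +2 form is handy for dropping a header row, as in the CSV example above.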
Navigating the file system
During prehistory and antiquity, there were no computers, so we didn't care much about computer filesystems. When the 1960s and 1970s came, random-access mass storage started to appear, and with it came the metaphor of files. Folders entered the picture later, when storage became big enough to hold a significant amount of data that needed to be organized.
Most probably, unless you're an old-scene old-timer, you've always lived with files and folders (or directories, which are the same thing). Often you will need to search for files with a certain name, with a name matching a certain pattern, or which contain certain data. But let's do simple things first, and navigate through the file system.
The Unix filesystem, for the scientifically minded, is a particular kind of graph called a tree. That is, it is a graph without cycles (unless we follow symbolic links, which, if taken into account, can introduce cycles and turn the tree into a general graph). I suggest you search the net for Unix Tree / Linux Tree and look into Google Images. The Windows file system, on the other hand, is not a single tree but rather a set of trees, one for each “drive letter”.
To know where we are, we will use the pwd command. It stands for “print working directory”, that is, the directory in which we currently are.
<syntaxhighlight lang="bash">
sav@phiber ~ $ pwd
/home/sav
</syntaxhighlight>
In order to navigate to other directories, we'll use the cd command.
<syntaxhighlight lang="bash">
sav@phiber ~ $ pwd
/home/sav
sav@phiber ~ $ cd ..
sav@phiber /home $ pwd
/home
sav@phiber /home $ cd /proc/self/
sav@phiber /proc/self $ pwd
/proc/self
sav@phiber /proc/self $ cd ../../tmp/
sav@phiber /tmp $ pwd
/tmp
</syntaxhighlight>
A bit of explanation for all this. Historically in file systems, . stands for the current directory, while .. stands for the parent directory. So, basically, cd . does nothing, and cd .. navigates to the parent directory. These are what we call relative paths, that is, paths relative to the current position in the file system. On the other hand, cd /proc/self makes use of an absolute path, easily identified by its leading slash. cd /proc/self will always have the same outcome no matter what directory you are currently in.
cd ../../tmp/ is a longer relative path, which performs navigation two directory levels upwards, and then to the tmp directory.
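The difference between relative and absolute paths can be checked in a scratch tree. The sketch below assumes a made-up layout (a/b and tmp under a temporary directory) and uses pwd -P, which prints the path with symbolic links resolved, so the comparison stays reliable on systems where the temporary directory is itself a symlink.

```bash
# Build a small scratch tree: $base/a/b and $base/tmp.
base=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$base/a/b" "$base/tmp"

# Absolute path first, then a relative path: up two levels, then into tmp.
# The subshell $( ... ) keeps the caller's working directory unchanged.
via_relative=$(cd "$base/a/b" && cd ../../tmp && pwd -P)

# 'cd .' stays in place, so the directory is the same before and after.
via_dot=$(cd "$base/a" && cd . && pwd -P)

echo "$via_relative"
echo "$via_dot"
```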
Exercise: try to cd here and there, to existing directories and non-existing directories.
Searching for files and directories
There are two ways to search for files under Unix. The first one, find, walks through the file system from a given start node and searches under it. The second one, locate, maintains a database of existing files and searches through it, which is much quicker but requires regular maintenance.
Using find can be pretty awkward, especially for newcomers. We'll explain a simplified version. Its first argument is the path to search in. After that, several command-line switches alter the behaviour of find. A complete guide can be found by typing “man find” into the console or into a Google search box.
In order to search for files whose names match a certain pattern, we will use the -name switch.
<syntaxhighlight lang="bash">
sav@phiber /tmp $ find /usr/share/ -name '*.sh'
/usr/share/git/contrib/fast-import/git-import.sh
/usr/share/git/contrib/ciabot/ciabot.sh
/usr/share/git/contrib/rerere-train.sh
/usr/share/git/contrib/remotes2config.sh
/usr/share/vim/vim73/macros/less.sh
</syntaxhighlight>
This searches for all files whose names match the '*.sh' pattern, that is, any file whose name ends with .sh. In order to search for all files belonging to user sav, we will use the -user switch:
<syntaxhighlight lang="bash">
sav@phiber ~ $ find /home/sav/ -user sav
/home/sav/
/home/sav/l2
/home/sav/los
/home/sav/.ssh
/home/sav/.bashrc
/home/sav/.bash_logout
/home/sav/test.csv
/home/sav/.viminfo
</syntaxhighlight>
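To experiment with -name without scanning a real system, a scratch tree is enough. This is a sketch only; the directory and file names (scripts, docs, build.sh and so on) are invented.

```bash
# Scratch tree with two shell scripts and one text file.
workdir=$(mktemp -d)
mkdir -p "$workdir/scripts" "$workdir/docs"
touch "$workdir/scripts/build.sh" "$workdir/scripts/deploy.sh" "$workdir/docs/notes.txt"

# The pattern is quoted so that the shell passes it to find untouched
# instead of expanding it against the current directory.
found=$(find "$workdir" -name '*.sh' | sort)
printf '%s\n' "$found"
```

The output is piped through sort because find makes no promise about the order in which it lists files.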
There are many other possibilities, as find is a very rich tool. Here comes an exercise.
Build the commands to:
- Find all SUID bit binaries in /usr
- Find all files belonging to user root in the /home directory.
- Find all files modified less than a week ago (huge hint: this one can be “touch”y)
- Find all executable files (not directories) in /usr/share
- Find all .txt files, and all .h files, in the filesystem.
and provide us with output.
Using locate is much easier. First, from time to time, run the updatedb command as user root. Then type locate 'pattern' and you will be given a list of matching files.
Advanced find use
find is a power tool for any Unix administrator. Its ability to find files matching certain properties covers nearly every need you'll ever have. With find you can chain conditions and negate them. For example, try to build a command that finds all files modified less than 7 days ago but more than 1 day ago (tip: first use touch -d "7 days ago" /tmp/marker).
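Chaining and negating conditions can be sketched on a different problem than the exercise, so as not to spoil it. The file names below are hypothetical; the point is that conditions written side by side are combined with a logical AND, and that ! negates the condition that follows it.

```bash
# Scratch tree: three .log files (one of them "old-") and one .txt file.
workdir=$(mktemp -d)
mkdir "$workdir/logs"
touch "$workdir/app.log" "$workdir/old-app.log" "$workdir/readme.txt" "$workdir/logs/db.log"

# Regular files AND name ends in .log AND NOT name starts with old-
matches=$(find "$workdir" -type f -name '*.log' ! -name 'old-*' | sort)
printf '%s\n' "$matches"
```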
Combining find with xargs
Typically we will use find with the -print0 option (to separate file names with a NUL character instead of whitespace or a newline) together with xargs and its -0 option. This rules out all the whitespace and quotes-in-file-names issues. Add the -n option (or -L) to limit how many arguments (or input lines) xargs passes to each invocation of the command. An example:
<syntaxhighlight lang="bash">
find . -type f |
</syntaxhighlight>
will pass md5sum 30 arguments each time
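The following sketch shows one way such a pipeline can be assembled from the options described above. It is an assumption about the general shape, not a reconstruction of the original command, and it presumes md5sum is installed (it is part of GNU coreutils). The awkward file names are invented to show why NUL separation matters.

```bash
# Scratch directory with file names containing a space and a quote.
workdir=$(mktemp -d)
printf 'a\n' > "$workdir/plain.txt"
printf 'b\n' > "$workdir/with space.txt"
printf 'c\n' > "$workdir/it's quoted.txt"

# -print0 ends each file name with a NUL byte; xargs -0 splits on NUL only,
# so spaces and quotes inside names are passed through intact.
# -n 30 hands md5sum at most 30 file names per invocation.
checksums=$(find "$workdir" -type f -print0 | xargs -0 -n 30 md5sum)
printf '%s\n' "$checksums"
```

Without -print0 and -0, xargs would split "with space.txt" into two arguments and complain about the unmatched quote in the third name.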
Executing several commands in a row
In order to execute several commands in a row, we usually put a semicolon (“;”) between the instructions. This executes the programs one after the other, no matter what their results are: if a program fails, the next one still runs. Example:
<syntaxhighlight lang="bash">
sav@phiber ~ $ head -n 1 /etc/shadow; head -n 1 /etc/passwd
head: cannot open `/etc/shadow' for reading: Permission denied
root:x:0:0:root:/root:/bin/bash
</syntaxhighlight>
In some situations, for example when checking for dependencies using “configure” before building and installing a program, this kind of chaining is a poor fit. Indeed, we cannot proceed with the build or installation if the previous step has failed.
So, to go forward if and only if the programs succeed, we use the “&&” (AND) operator.
Example:
<syntaxhighlight lang="bash">
sav@phiber ~ $ head -n 1 /etc/shadow && head -n 1 /etc/passwd
head: cannot open `/etc/shadow' for reading: Permission denied
</syntaxhighlight>
In other situations, we may want to execute a command only if another has failed. For this, we use the “||” (OR) operator.
Example:
<syntaxhighlight lang="bash">
sav@phiber ~ $ head -n /etc/shadow |
</syntaxhighlight>
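The three operators can be compared side by side with two stand-in commands. The helper functions fails and succeeds below are hypothetical; they only print a message and return a failing or successful exit status, taking the place of real programs.

```bash
# Hypothetical stand-ins for real programs.
fails()    { echo "fails ran";    return 1; }
succeeds() { echo "succeeds ran"; return 0; }

demo() {
  fails ; succeeds     # ';'  : the second command runs regardless
  echo "--"
  fails && succeeds    # '&&' : the second command is skipped after a failure
  echo "--"
  fails || succeeds    # '||' : the second command runs because the first failed
}

output=$(demo)
printf '%s\n' "$output"
```

In the output, "succeeds ran" appears in the first and third groups but not in the second.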
Chaining programs
Sometimes we may want to take the output of a program and refine it further, for example when a search returns a large number of files. In that case, we may want to read the output of the find or locate command with the help of the less or more program.
In the Unix philosophy, basically everything is a file. Block devices, keyboard, sound card, RAM: everything's a file, really. Each program has at least one input “file” (often wired to the terminal's input) and two output “files”, both wired to the terminal by default, too. Each of these file descriptors is numbered, from 0 to 2: 0 is the standard input (STDIN), 1 is the standard output (STDOUT) and 2 is the error channel (STDERR). Well-behaved programs write their error messages to the error channel.
Many standard file manipulation programs, when not given a file name, take their input from STDIN. This allows, as you will have guessed, for seamless program chaining.
So, for the proposed example, try:
<syntaxhighlight lang="bash">
sav@phiber ~ $ find /etc/ -name '*.conf' |
</syntaxhighlight>
The | is a “pipe”. On QWERTY keyboards it's situated on the rightmost part of the keyboard, next to the Enter/Return key. On Mac keyboards, it's typed using Apple+Alt+L. On other keyboards, check it for yourself. You will need that symbol basically every second you type in a Linux/Unix terminal.
Program chains can be virtually unlimited in length. Something like
<syntaxhighlight lang="bash">
sav@phiber ~ $ find /etc/ -name '*.conf' |
</syntaxhighlight>
with 5 programs in a chain is pretty common. Unix uses and abuses this, and so should you.
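A short chain can be tried on a scratch directory. The sketch below is an illustration with invented file names, not the chain from the example above: each program reads on its STDIN what the previous one wrote to its STDOUT.

```bash
# Scratch directory with three .conf files and one unrelated file.
workdir=$(mktemp -d)
touch "$workdir/b.conf" "$workdir/a.conf" "$workdir/c.conf" "$workdir/notes.txt"

# find lists the matching paths, sort orders them, head keeps the first one.
first_conf=$(find "$workdir" -name '*.conf' | sort | head -n 1)

# wc -l counts the lines it receives on STDIN.
conf_count=$(find "$workdir" -name '*.conf' | wc -l)

echo "$first_conf"
echo "$conf_count"
```

None of sort, head or wc is given a file name here, which is exactly what makes them read from the pipe.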
Writing to files
The output of any command can be redirected to a file using the right-pointing angle bracket (“>”). An example makes this clearer:
<syntaxhighlight lang="bash">
sav@phiber ~ $ head -n 4 /proc/cpuinfo > /tmp/test
sav@phiber ~ $ cat /tmp/test
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
</syntaxhighlight>
When using the simple right-pointing angle bracket without any option, error messages are not stored in the target file. In order to log them, we need to redirect STDERR to STDOUT. For this, we use the 2>&1 instruction, which tells the shell to redirect channel 2 (STDERR) to channel 1 (&1, STDOUT). This way, the errors go to the same channel as the standard messages and can thus be stored in a log file.
For example:
<syntaxhighlight lang="bash">
sav@phiber ~ $ find /etc/skel/ -name foo > /tmp/test
find: `/etc/skel/.ssh': Permission denied
find: `/etc/skel/.maildir': Permission denied
sav@phiber ~ $ cat /tmp/test
sav@phiber ~ $
</syntaxhighlight>
And now:
<syntaxhighlight lang="bash">
sav@phiber ~ $ find /etc/skel/ -name foo > /tmp/test 2>&1
sav@phiber ~ $ cat /tmp/test
find: `/etc/skel/.ssh': Permission denied
find: `/etc/skel/.maildir': Permission denied
</syntaxhighlight>
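The same behaviour can be reproduced without needing a directory you are not allowed to read. In this sketch, the function emit is a hypothetical stand-in for a program that writes one line to STDOUT and one to STDERR. It also shows that the position of 2>&1 matters, since the shell applies redirections from left to right.

```bash
workdir=$(mktemp -d)

# Hypothetical program: one line on STDOUT, one on STDERR.
emit() { echo "to stdout"; echo "to stderr" >&2; }

# Only STDOUT goes to the file; "to stderr" still shows up on the terminal.
emit > "$workdir/stdout_only.log"

# STDOUT goes to the file, then STDERR is pointed at the same place.
emit > "$workdir/both.log" 2>&1

# Reversed order: STDERR is pointed at the terminal (where STDOUT still is),
# and only afterwards is STDOUT sent to the file.
emit 2>&1 > "$workdir/reversed.log"

cat "$workdir/both.log"
```

Only both.log ends up containing the two lines.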
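Always specify file descriptor redirections such as 2>&1 AFTER any redirection of the standard output to a file: the shell processes redirections from left to right, so 2>&1 must come once STDOUT already points to the file. You may also want to log output to a file while keeping the standard display on the terminal. Chaining your program to tee will help you do so: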
<syntaxhighlight lang="bash">
sav@phiber ~ $ find /etc/skel/ -name '*foo*' 2>&1 |
</syntaxhighlight>
As you can see, the output is both copied to stdout and written to the file passed as argument. Think of tee as a “golf tee” or, better, a T-shaped pipe, which takes one input and splits it into two flows, one to a file and one to stdout. tee can be used to keep “raw”, unfiltered data in files without breaking the processing chain. We are done with the basic tools, and can now start to learn more advanced uses of the Unix/Linux shell, especially of bash.
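To see tee keep the raw data while the chain carries on, here is a small sketch. The input lines and the file name raw.log are invented for illustration.

```bash
workdir=$(mktemp -d)

# tee copies its STDIN both to raw.log and to its STDOUT,
# so sort and head keep working on the same data.
first_sorted=$(printf 'beta\nalpha\ngamma\n' | tee "$workdir/raw.log" | sort | head -n 1)

echo "$first_sorted"
cat "$workdir/raw.log"
```

raw.log holds the three lines in their original order, while the rest of the chain only lets the first sorted line through.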
Back on board
In this chapter, you have learned how to use the basic functionality of a common Unix shell. This knowledge applies primarily to bash but carries over to virtually any POSIX-compliant shell. We tried to put real-world examples into this course, but the best examples will be reading and processing data you actually need. So practice, practice and practice. Soon we will move on to a more advanced level that will allow you to write more complete bash programs.