The good news is that most SAs don't need to know the shell very well for most daily tasks, because UNIX offers both smart and awkward ways to get things done. However, solid UNIX skills are essential for working efficiently, and they mark the fundamental difference (the glass wall) between junior and senior SAs.
What we know about UNIX, even after 10 years of working with it, is still like a drop of water in the sea. I'm still learning something new about UNIX every day. The pleasure of acquiring new skills inspires me and makes me proud to be a UNIX administrator. Working and living is one thing; working with passion and living a happy, productive life is another. I'm very happy, and hopefully productive, every day.
Command Substitution
· The standard output from a command list enclosed in parentheses preceded by a dollar sign ( $(list) ), or in a brace group preceded by a dollar sign ( ${ list;} ), or in a pair of grave accents ( ` ` ) (the obsolete form) may be used as part or all of a word; trailing newlines are removed.
· The command substitution $( cat file ) can be replaced by the equivalent but faster $( <file ) .
· The command substitution $( n <# ) will expand to the current byte offset for file descriptor n.
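The forms described above can be compared side by side. This is only a minimal sketch: the scratch file comes from mktemp (an assumption, it is not part of the original notes), and the $(<file) shortcut is a ksh/bash feature, so a plain POSIX sh may not support it.

```shell
# Scratch file with two lines followed by extra blank lines (hypothetical data).
tmp=$(mktemp)
printf 'alpha\nbeta\n\n\n' >"$tmp"

new_style=$(cat "$tmp")    # $( ... ) form
old_style=`cat "$tmp"`     # grave accents, the obsolete form
fast=$(<"$tmp")            # ksh/bash shortcut: no cat process is started

# Trailing newlines are removed in every form, so only the two lines remain.
printf '[%s]\n' "$new_style"

rm -f "$tmp"
```

The brace form ${ list;} is specific to ksh93, so it is left out here to keep the sketch runnable in bash as well.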
Example 1, Count the frequency of English words in a text file (there are many ways to implement it; I'm using a lazy way with command substitution. A classic way of doing this, from Brian Kernighan's book, is listed at the end of this example):
1. Turn the text file into one word per line:
$ for i in $( <my_txtfile )
do
echo $i
done | sort > txt.1
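2. Count each word's frequency:
$ for i in $(uniq txt.1)
do
echo $i: $(grep "\b$i\b" txt.1 | wc -l)
done
(note: the \b word boundaries in grep's pattern avoid counting words like 'them' or 'theory' as 'the'. The pattern must be quoted, otherwise the shell strips the backslashes before grep sees them. \b is a GNU grep extension; grep -w is a common alternative.)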
output:
succeed.: 2
succeeded.: 1
symbol: 2
symbol.: 1
that: 1
the: 36
The: 2
them:: 1
those: 1
to: 17
Things to improve in the above example:
1. Count "symbol." the same as "symbol" - hint: sed 's/\([A-Za-z]*\)[.:#,]/\1/'.
2. Sort the output by word frequency.
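Both improvements can be tried on a small sample first. This is a rough sketch under assumptions: the sample words and counts are made up for illustration, and the sed expression uses basic-regex grouping with \( and \).

```shell
# Improvement 1: strip one punctuation mark so "symbol." counts as "symbol".
cleaned=$(printf '%s\n' 'symbol.' 'them:' 'symbol' |
  sed 's/\([A-Za-z]*\)[.:#,]/\1/')
printf '%s\n' "$cleaned"

# Improvement 2: sort "word: count" lines by the numeric second field, largest first.
counts='that: 1
the: 36
to: 17'
sorted=$(printf '%s\n' "$counts" | sort -k2,2nr)
printf '%s\n' "$sorted"
```

In the real script the sed step would sit before the sort that builds txt.1, and the sort -k2,2nr step would be piped after the counting loop.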
BTW, here is the classic solution to this problem. It is on page 107 of Brian Kernighan's book mentioned above.
cat $* |
tr -sc A-Za-z '\012' |
sort |
uniq -c |
sort -nr
output:
36 the
21 command
18 of
17 to
12 a
8 is
8 file
5 shell
Two key commands are used in this classic solution.
The first is tr -sc A-Za-z '\012', which turns every non-letter into a newline ('\012') and squeezes runs of newlines into one. The -c option complements (negates) the set of characters 'A-Za-z', and -s does the squeezing.
The second is 'uniq -c', which prefixes each line with its number of occurrences.
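To watch the two commands work together, the pipeline can be fed a short string instead of a file. The sample sentence is made up for illustration; everything else is the pipeline from the classic solution.

```shell
# Hypothetical sample text; any input works.
text='the cat and the hat. The end: the cat'

freq=$(printf '%s\n' "$text" |
  tr -sc 'A-Za-z' '\012' |   # one word per line, punctuation and blanks dropped
  sort |                     # uniq only merges adjacent duplicates
  uniq -c |                  # prefix each distinct word with its count
  sort -nr)                  # most frequent first

printf '%s\n' "$freq"
```

Note that "the" and "The" are counted separately, just as in the output shown earlier; folding case with tr A-Z a-z before sorting would merge them.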
Also, '\012' is the octal value of the special character 'newline' ('\n'). In case we want to know the octal values of other common special characters, here is a list:
Character | Octal Value |
---|---|
Bell | 7 |
Backspace | 10 |
Tab | 11 |
Newline | 12 |
Linefeed | 12 |
Formfeed | 14 |
Carriage Return | 15 |
Escape | 33 |
Example 2, Read a whole file into a variable:
$ content=$( <web.py ); print $content
#! /usr/bin/env python import sys, webbrowser def main(): args = sys.argv[1:] if not args: print "Usage: %s querystring" % sys.argv[0] return list = [] for arg in args: if '+' in arg: arg = arg.replace('+', '%2B') if ' ' in arg: arg = '"%s"' % arg arg = arg.replace(' ', '+') list.append(arg) s = '+'.join(list) url = "http://www.google.com/search?q=%s" % s webbrowser.open(url) if __name__ == '__main__': main()
Try $(<$content) in the shell and see what happens.
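The Python script above prints on a single line because the unquoted $content is split into words and the words are re-joined with spaces. Here is a small sketch of the difference, using a made-up two-line value rather than web.py:

```shell
# Hypothetical two-line value standing in for the file contents.
content=$(printf 'line one\nline two\n')

unquoted=$(echo $content)     # word splitting: the newline becomes a space
quoted=$(echo "$content")     # quoting keeps the newline

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
```

In ksh the same applies to print: print -r -- "$content" keeps the original line breaks.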
Shell Special Characters
Here is the meaning of some of them:
$$ pid of the program being executed
$? The exit status of the last command that was not run in the background
$! The pid of the last program sent to the background
$- The current shell options in effect (see set manpage)
$# number of the positional parameters passed to the command
$* expands to all positional parameters passed to the command
$@ expands to all positional parameters passed to the command, but individually quoted when "$@" is used.
$1 the value of the first positional parameter passed to the command. $2 is the second positional parameter, and so on up to $9; parameters beyond the ninth are written with braces, e.g. ${10}.
{ } enclose a group of commands to be run by the current shell, e.g. { dir; }. The space after { and the ; (or a newline) before } are required.
&& is an "AND" connecting two commands. command1 && command2 will execute command2 only if command1 exits with the exit status 0 (no error). For example: cat file1 && cat file2 will display file2 only if displaying file1 succeeded.
|| is an "OR" connecting two commands. command1 || command2 will execute command2 only if command1 exits with the exit status of non-zero (with an error). For example: cat file1 || cat file2 will display file2 only if displaying file1 didn't succeed.
\ ' and " are used for quoting.
< and > are used for input and output redirection.
| pipes the output of the command to the left of the pipe symbol "|" to the input of the command on the right of the pipe symbol.
; separates multiple commands written on a single line
* when a filename is expected, it matches any filename except those starting with a dot (or any part of a filename, except the initial dot).
? when a filename is expected, it matches any single character.
[ ] when a filename is expected, it matches any single character enclosed inside the pair of [ ].
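A few of the entries above are easier to remember after seeing them run. In this minimal sketch, count_args is a hypothetical helper (not a standard command) that reports how many arguments it received.

```shell
# count_args is a hypothetical helper that prints its argument count ($#).
count_args() { echo $#; }

set -- "one two" three          # two positional parameters; the first contains a space

star_quoted=$(count_args "$*")  # "$*" joins all parameters into a single word
at_quoted=$(count_args "$@")    # "$@" keeps each parameter as its own word
unquoted=$(count_args $*)       # unquoted, the value is split on whitespace

printf '"$*" -> %s, "$@" -> %s, $* -> %s\n' "$star_quoted" "$at_quoted" "$unquoted"

# && runs the right-hand side only on success, || only on failure.
true  && and_result="ran"
false || or_result="ran"
false && skipped="ran"          # never assigned, because false exits non-zero
```

This is why "$@" is the usual choice when a script passes its own arguments on to another command.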
Shell's I/O
>& : The notation >& specifies output redirection to the file associated with the file descriptor that follows. For example, echo "Invalid number of arguments" >&2 writes to file descriptor 2, which is standard error.
">&-" closes standard output. If preceded by a file descriptor, the associated file is closed instead. E.g. the output of "ls >&-" goes nowhere, since standard output is closed by the shell before ls is executed.
"<&-" does the same for standard input.
Use exec to redirect I/O:
exec <input_file
will cause all subsequent commands that read from standard input to read from "input_file" instead. Use "exec </dev/tty" to switch standard input back to the terminal.
exec >/tmp/output
will cause all subsequent commands that write to standard output to write to /tmp/output.
exec 2> /tmp/errors
will cause all subsequent commands that write to standard error to write to /tmp/errors.
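Inside a script it is often handy to redirect with exec for a while and then restore the original output. One common way, sketched here, is to park standard output on a spare file descriptor first. The choice of descriptor 3 and the mktemp scratch file are assumptions made for this example.

```shell
tmp=$(mktemp)            # hypothetical scratch file for the redirected output

exec 3>&1                # save the current standard output on descriptor 3
exec >"$tmp"             # from here on, standard output goes to the file
echo "captured"
exec 1>&3 3>&-           # restore standard output, then close descriptor 3

echo "back on the original standard output"
captured=$(cat "$tmp")
rm -f "$tmp"
```

Saving and restoring through a descriptor works even when the script is not attached to a terminal, which is where /dev/tty would not help.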
In-line Input Redirection (here document) <<
command <<word
The shell uses the lines that follow as the standard input for command, up to a line that contains just word.
e.g.
$ wc -l <<EOF
> a
> b
> c
> d
> EOF
4
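The same construct works inside a script, where there are no > continuation prompts. The sketch below also shows that variables are expanded inside the here document unless the delimiter word is quoted; the variable name user is made up for the example.

```shell
# Same count as the interactive example, written as it would appear in a script.
count=$(wc -l <<EOF
a
b
c
d
EOF
)

user=admin               # hypothetical variable used only for this illustration

# Unquoted delimiter: $user is expanded inside the here document.
greeting=$(cat <<EOF
hello $user
EOF
)

# Quoted delimiter: the text is taken literally.
literal=$(cat <<'EOF'
hello $user
EOF
)

printf '%s\n' "$greeting" "$literal"
```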
Aliasing
(The following aliases are compiled into the Korn shell but can be unset or redefined.)
autoload='typeset -fu'
command='command '
compound='typeset -C'
fc=hist
float='typeset -lE'
functions='typeset -f'
hash='alias -t --'
history='hist -l'
integer='typeset -li'
nameref='typeset -n'
nohup='nohup '
r='hist -s'
redirect='command exec'
source='command .'
stop='kill -s STOP'
suspend='kill -s STOP $$'
times='{ { time;} 2>&1;}'
type='whence -v'
Shell's (ksh and bash) Pattern-Matching Operators
Operator | Meaning |
---|---|
${variable#pattern} | If the pattern matches the beginning of the variable's value, delete the shortest part that matches and return the rest. |
${variable##pattern} | If the pattern matches the beginning of the variable's value, delete the longest part that matches and return the rest. |
${variable%pattern} | If the pattern matches the end of the variable's value, delete the shortest part that matches and return the rest. |
${variable%%pattern} | If the pattern matches the end of the variable's value, delete the longest part that matches and return the rest. |
For example, with path=/home/billr/mem/long.file.name:

Expression | Result |
---|---|
${path##/*/} | long.file.name |
${path#/*/} | billr/mem/long.file.name |
$path | /home/billr/mem/long.file.name |
${path%.*} | /home/billr/mem/long.file |
${path%%.*} | /home/billr/mem/long |
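The table can be checked directly in ksh or bash. The variable names on the left are made up for this sketch; the expressions and the path value are the ones from the table.

```shell
path=/home/billr/mem/long.file.name

base=${path##/*/}        # longest leading /*/ removed  -> file name only
tail_part=${path#/*/}    # shortest leading /*/ removed -> first directory dropped
no_ext=${path%.*}        # shortest trailing .* removed -> last extension dropped
stem=${path%%.*}         # longest trailing .* removed  -> everything from the first dot dropped

printf '%s\n' "$base" "$tail_part" "$no_ext" "$stem"
```

A handy way to remember the direction: on a US keyboard # sits to the left of % , so # trims from the left (the beginning) and % trims from the right (the end).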