Unix: Sorting, searching, comparing and counting

Sorting

Unix has a standard command named sort that sorts the lines of a text file alphabetically or numerically. The default is an alphabetic sort; the -n option requests a numeric sort instead. To sort the lines in the file unixpast.txt, type:

$ sort unixpast.txt
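The difference between the two orderings shows up as soon as a file contains numbers of different lengths. The following is a minimal sketch, assuming a throwaway file called numbers.txt (a made-up name, not one of the tutorial files) created in a temporary directory:

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
printf '10\n9\n100\n2\n' > "$workdir/numbers.txt"

# Default (alphabetic) sort compares the lines character by character,
# so "100" comes before "2".
alpha=$(sort "$workdir/numbers.txt")

# -n compares the lines as numbers.
numeric=$(sort -n "$workdir/numbers.txt")

echo "alphabetic:"
echo "$alpha"
echo "numeric:"
echo "$numeric"

rm -r "$workdir"
```

The alphabetic run lists 10, 100, 2, 9, while the numeric run lists 2, 9, 10, 100.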

Searching for a file

find

This command searches recursively through directories for files and directories with a given name, date, size, or any other attribute you specify.

Examples:

To search for all files with the file extension .txt, starting at the current directory (.) and working through all sub-directories, then printing the names of the files found to the screen, type:

$ find .  -name "*.txt" -print

Since -print is the default operation, this is equivalent to:

$ find .  -name "*.txt"
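To see the recursive search at work without touching real files, the sketch below builds a small scratch tree first. All file and directory names in it are invented for the demonstration.

```shell
# Build a small scratch tree (all names here are made up for the demo).
workdir=$(mktemp -d)
mkdir -p "$workdir/docs/old"
touch "$workdir/notes.txt" "$workdir/docs/readme.txt" \
      "$workdir/docs/old/draft.txt" "$workdir/docs/image.png"

# Quoting the pattern stops the shell from expanding *.txt itself,
# so find receives the pattern and applies it in every sub-directory.
# The subshell (cd ... && find .) keeps the reported paths short.
found=$(cd "$workdir" && find . -name "*.txt" | LC_ALL=C sort)
echo "$found"

rm -r "$workdir"
```

The three .txt files are reported from all three levels of the tree, and image.png is skipped. The output is piped through sort only because find itself does not guarantee any particular order.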

To find files larger than 1 MB and display the result as a long listing, type:

$ find . -size +1M -ls

To find all files modified within the last 24 hours in the current directory and its sub-directories, type:

$ find . -mtime -1

To search for and remove all files ending with .bak that have not been accessed for more than 7 days, type:

$ find . -name "*.bak" -type f -atime +7 -exec rm {} \;
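Because -exec rm deletes without asking, it can help to run the same search without the -exec part first and check the list. The sketch below does this in a scratch directory. The file names are made up, and touch -t is used only to give one file an old timestamp.

```shell
workdir=$(mktemp -d)
touch "$workdir/recent.bak" "$workdir/keep.txt"
# Give one backup file an old access and modification time (1 Jan 2020).
touch -t 202001010000 "$workdir/stale.bak"

# Step 1: list what would be removed, without removing anything.
preview=$(find "$workdir" -name "*.bak" -type f -atime +7)
echo "would remove: $preview"

# Step 2: once the list looks right, run the same search with rm.
find "$workdir" -name "*.bak" -type f -atime +7 -exec rm {} \;

# Only the recently touched files are left.
remaining=$(ls "$workdir" | LC_ALL=C sort)
echo "$remaining"

rm -r "$workdir"
```

Only stale.bak is old enough to match, so recent.bak and keep.txt survive.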

Searching the contents of a file

grep

The command grep is one of many standard Unix utilities. Its name comes from the ed editor command g/re/p (globally search for a regular expression and print). It searches files for specified words or patterns. First clear the screen, then type:

$ grep open unixpast.txt

As you can see, grep has printed out each line containing the word open.

Or has it?

Try typing

$ grep Open unixpast.txt

The grep command is case sensitive; it distinguishes between Open and open.

To ignore upper/lower case distinctions, use the -i option, i.e. type

$ grep -i open unixpast.txt

To search for a phrase or pattern, enclose it in single quotes (the apostrophe symbol) so that the shell passes it to grep as a single argument. For example, to search for unix systems, type:

$ grep -i 'unix systems' unixpast.txt

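Some of the other options of grep are:

-v display those lines that do NOT match
-n precede each matching line with the line number
-c print only the total count of matched lines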

Try some of them and see the different results. Don't forget, you can use more than one option at a time. For example, to count the lines that contain neither open nor Open, type:

$ grep -ivc open unixpast.txt
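The effect of combining options is easier to follow on a file whose contents are known. The sketch below uses a three-line sample file (sample.txt is a made-up stand-in for unixpast.txt) and the standard -n, -c, -i and -v options.

```shell
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
printf 'Open systems are common.\nThe source is open.\nNothing to see here.\n' > "$sample"

# -n prefixes each matching line with its line number.
numbered=$(grep -n open "$sample")
echo "$numbered"

# -c prints only the number of matching lines (case sensitive here).
count=$(grep -c open "$sample")

# -i ignores case, so "Open" on line 1 now matches as well.
count_any_case=$(grep -ic open "$sample")

# -v inverts the match: count the lines that contain neither form.
count_without=$(grep -ivc open "$sample")

echo "open: $count, open/Open: $count_any_case, neither: $count_without"

rm -r "$workdir"
```

Here the case-sensitive count is 1, the case-insensitive count is 2, and one line contains neither form.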

Comparing files

diff

This command compares the contents of two files and displays the differences. Suppose you have a file called file1.txt and you edit some part of it and save it as file2.txt. To see the differences type:

$ diff file1.txt file2.txt

Lines beginning with a < come from file1.txt, while lines beginning with a > come from file2.txt.
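A short worked example makes the output format concrete. The sketch below creates two three-line files that differ only in their second line. The file contents are invented for the demonstration.

```shell
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n'    > "$workdir/file1.txt"
printf 'apple\nblueberry\ncherry\n' > "$workdir/file2.txt"

# diff exits with status 1 when the files differ; that is not an error,
# so the status is captured instead of being treated as a failure.
diff_status=0
changes=$(diff "$workdir/file1.txt" "$workdir/file2.txt") || diff_status=$?

echo "$changes"
echo "exit status: $diff_status"

rm -r "$workdir"
```

The first output line, 2c2, means that line 2 of the first file was changed into line 2 of the second. It is followed by the old line marked with <, a --- separator, and the new line marked with >.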

You can also recursively compare two directory trees:

$ diff -r dir1/ dir2/

To only list the names of the files that differ, use --brief:

$ diff --brief -r dir1/ dir2/
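As a rough illustration, the sketch below compares two scratch directories that share one identical file, one changed file, and one file present on only one side. The file names are invented, and the exact wording of the report may vary between diff implementations.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/dir1" "$workdir/dir2"
printf 'same\n'  > "$workdir/dir1/common.txt"
printf 'same\n'  > "$workdir/dir2/common.txt"
printf 'old\n'   > "$workdir/dir1/changed.txt"
printf 'new\n'   > "$workdir/dir2/changed.txt"
printf 'extra\n' > "$workdir/dir2/added.txt"

# Run from inside the scratch directory so the report shows short paths.
# diff returns a non-zero status when differences exist, hence "|| true".
report=$(cd "$workdir" && diff --brief -r dir1/ dir2/) || true
echo "$report"

rm -r "$workdir"
```

The report names changed.txt and added.txt and leaves out common.txt, since identical files are not mentioned.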

Counting

wc (word count)

A handy little utility is the wc command, short for word count. It actually counts characters, words and lines. To do a word count on unixpast.txt, type:

$ wc -w unixpast.txt

To find out how many lines the file has, type:

$ wc -l unixpast.txt
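When only the number is wanted, for instance inside a script, wc can read from standard input or from a pipe. The sketch below uses a two-line sample file (sample.txt is a made-up name). Note that -c counts bytes, which equals the character count for plain ASCII text.

```shell
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
printf 'one two three\nfour five\n' > "$sample"

# Reading from standard input makes wc print the number alone,
# without the file name after it.
lines=$(wc -l < "$sample")
words=$(wc -w < "$sample")
chars=$(wc -c < "$sample")
echo "lines:" $lines "words:" $words "chars:" $chars

# wc also works at the end of a pipeline, e.g. to count matching lines.
matches=$(grep 'f' "$sample" | wc -l)
echo "lines containing f:" $matches

rm -r "$workdir"
```

For this sample the counts are 2 lines, 5 words and 24 characters, and one line contains the letter f.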

Summary

Command               Meaning
find                  search for files in a directory hierarchy
cp file1 file2        copy file1 and call it file2
mv file1 file2        move or rename file1 to file2
rm file               remove a file
rmdir directory       remove a directory
cat file              display a file
less file             display a file a page at a time
head file             display the first few lines of a file
tail file             display the last few lines of a file
grep 'keyword' file   search a file for keywords
wc file               count number of lines/words/characters in file