Finding Files and Directories

Part II: The Bioinformatics Workstation

Chapter 4. Files and Directories in Unix

4.2 Commands for Working with Directories and Files

4.2.2 Finding Files and Directories

Unix provides many ways to find files, from simply listing out the contents of a directory to search programs that look for specified filenames and the locations of executable programs.

4.2.2.1 Listing files with ls

Usage: ls [-options] pathname

Now that you know where you are, how do you find out what's around you? Simply typing the Unix list command, ls, at the prompt gives you a listing of all the files and subdirectories in the current working directory. You can also give a directory name as an argument to ls. It then prints the names of all files in the named directory.

If you have a directory that contains a lot of files, you can use ls combined with the wildcard character * (asterisk) to produce a partial listing of files. There are several ways to use the *. If you have files in a series (such as ch1 to ch14), or files with common characters (like those ending in .txt), you can use * to specify all of them at once. When given as part of an argument to a command, * takes the place of any number of characters in a filename. For example, let's say you're looking for files called seq11, seq25, and seq34 in a directory of 400 files. Instead of scanning the full list by eye, you could find them by typing:

% ls seq*

What if in that same directory you wanted to find all the text files? You know that text files usually end with .txt, so you can search for them by typing:

% ls *.txt
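The two listings above can be tried safely in a throwaway directory. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the filenames and the variable names (scratch, seq_files, txt_files) are made up for illustration.

```shell
# Build a scratch directory holding a few hypothetical files.
scratch=$(mktemp -d)
touch "$scratch/seq11" "$scratch/seq25" "$scratch/seq34"
touch "$scratch/notes.txt" "$scratch/readme.txt" "$scratch/hi.c"

# The shell expands seq* into the matching names before ls runs,
# so ls only ever sees seq11, seq25, and seq34.
seq_files=$(cd "$scratch" && ls seq*)
echo "$seq_files"

# Every name that ends in .txt.
txt_files=$(cd "$scratch" && ls *.txt)
echo "$txt_files"

# Remove the scratch directory again.
rm -r "$scratch"
```

Because the expansion is done by the shell and not by ls, the same patterns work with any other command that takes filenames as arguments.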

There are also a variety of command-line options to use with ls. The most useful of these are:

-a

Lists all the files in a directory, even those preceded by a dot. Filenames beginning with a dot (.) aren't listed by ls by default and consequently are referred to as hidden files. Hidden files often contain configuration instructions for programs, and it's sometimes necessary to examine or modify them.

-R

Lists subdirectories recursively. The content of the current directory is listed, and whenever a subdirectory is reached, its contents are also explicitly included in the listing. This command can create a catalog of files in your filesystem.

-1

Lists exactly one filename per line, a useful option. A single-column listing of all your source datafiles can quickly be turned into a shell script that executes an identical operation on each file, using just a few regular-expression tricks.

-F

Includes a code indicating the file type. A / following the filename indicates that the file is a directory, * indicates that the file is executable, and @ following the filename indicates that the file is a symbolic link.

-s

Lists the size of the file in blocks along with the filename.

Safari | Developing Bioinformatics Computer Skills -> 4.2 Commands for Working with Directories and Files

http://safari.oreilly.com/main.asp?bookname=bioskills&snode=46 (2 of 7) [6/2/2002 8:54:50 AM]

-t

Lists files sorted by the time they were last modified, most recent first.

-l

Lists files in the long format.

--color

Uses color to distinguish different file types.
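To see what a few of these options change, the sketch below compares a plain listing with -a and -F in a scratch directory. It assumes a POSIX shell with mktemp; the file and variable names are hypothetical.

```shell
# Scratch directory with a subdirectory, a hidden file, and an executable.
scratch=$(mktemp -d)
mkdir "$scratch/data"
touch "$scratch/.hidden" "$scratch/run.sh" "$scratch/seq.txt"
chmod +x "$scratch/run.sh"

# Default listing: the dot file is left out.
plain=$(ls "$scratch")

# -a: the dot file appears, along with the . and .. entries.
all=$(ls -a "$scratch")

# -1F: one name per line, with / marking the directory and * the executable.
flagged=$(ls -1F "$scratch")
printf '%s\n' "$flagged"

rm -r "$scratch"
```

Single-letter options can be combined after one dash, which is why -1F here behaves the same as -1 -F.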

4.2.2.2 Interpreting ls output

ls gives its output in two formats, the short and the long format. The short format is the default. It includes only the name of each file along with information requested using the -F or -s options:

#corr.pl#        commands.txt       hi.c           psimg.c
#eva.pl#         corr.pl            nsmail         res.sty
#pitch.txt#      corr.pl~           paircount.pl   res.sty~
#wish-list.txt#  correlation.pl     paircount.pl~  resume.tex
Xrootenv.0       correlation.pl~    pj-resume.dvi  seq-scratch.txt
a.out            detailed-prac.txt  pj-resume.log  sources.txt

The long format of the ls command output contains a variety of useful information about file ownership and permissions, file sizes, and the dates and times that files were last modified:

drwxrwxr-x 4 jambeck weasel 2048 Mar 5 18:23 ./

This listing was generated with the command ls -alF. The first 10 characters in the line give information about file permissions. The first character describes the file type. You will commonly encounter three types of files: the ordinary file (represented by -), the directory (d), and the symbolic link (l).

The next nine characters are three sets of three characters containing file permission information. The first three characters following the file type are the file permissions for the user who owns the file. The next set are for the file's group, and the final set are for all other users. The character string rwxrwxrwx indicates a file is readable (r), writable (w), and executable (x) by any user. We talk about how to change file permissions and file ownership in Section 4.3.3.2.

The next column in the long format file listing tells you how many links a file has; that is, how many directory listings for that file exist on the filesystem. The same file can be named in multiple directories. In Section 4.2.3, we talk about how to create links (directory listings) for new and existing files.

The next two columns show the ownership of the file. The owner of the files in the preceding example is jambeck, a member of the group weasel.

The next columns show the size of the file in bytes, followed by the date and time that the file was last modified.

The final column shows the name of the file.
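The layout of the first 10 characters can be made concrete with a small decoder. The function below, describe_mode, is a hypothetical helper written for this illustration and not a standard Unix command; it is a minimal sketch that handles only the three common file types named above.

```shell
# describe_mode: hypothetical helper that splits a 10-character mode
# string from a long listing into file type and three permission sets.
describe_mode() {
    mode=$1
    # The first character gives the file type.
    case "$mode" in
        -*) kind="ordinary file" ;;
        d*) kind="directory" ;;
        l*) kind="symbolic link" ;;
        *)  kind="other" ;;
    esac
    # Characters 2-4, 5-7, and 8-10 hold the user, group, and other sets.
    user=$(printf '%s' "$mode" | cut -c2-4)
    group=$(printf '%s' "$mode" | cut -c5-7)
    other=$(printf '%s' "$mode" | cut -c8-10)
    echo "$kind: user=$user group=$group other=$other"
}

describe_mode 'drwxrwxr-x'
# prints: directory: user=rwx group=rwx other=r-x
```

For the sample line shown earlier, this reads as a directory that its owner and group can read, write, and enter, while all other users can read and enter it but not write to it.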

4.2.2.3 Finding files with find

Usage: find pathname list -[test] criterion

The find command is one of the most powerful, flexible, and complicated commands in the standard set of Unix programs. find searches a path or paths for files based on various tests. There are over 20 different tests that can be used with find; here are a few of the most useful:

-print

This test is always true and sends the pathname of the current file to standard output. -print should be the last item on the command line: because it is always true, placing it before other tests causes every file under the pathname being searched to be printed, whether or not it passes the tests that follow.

-name

This is the test most commonly applied with find and the one that is the most immediately useful. find / -name weasel.txt -print lists to standard output the full pathnames of all files on the filesystem named weasel.txt. The wildcard operator * can be used within the filename criterion to find files that match a given pattern; quote the pattern so that the shell passes it to find unexpanded. find / -name 'weas*' -print finds not only weasel.txt, but also weasel.c and weasel.

-user uname

This test finds all files owned by the specified user.

-group gname

This test finds all files owned by the specified group.

-ctime n

This test is true if the current file was changed n days ago. Changing a file refers to any change, including a change in permissions, whereas modification refers only to changes to the contents of the file.

-atime and -mtime tests, which check the access and modification times of the files, are also available.

Performing two find tests one after another amounts to applying a logical "and" between the tests. A -o between tests indicates a logical "or." An exclamation point ( ! ) before a test negates it, which means find returns only those files that fail the test.
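The ways of combining tests can be sketched in a scratch directory. The example assumes a POSIX shell with mktemp; the filenames and variable names are invented for illustration. Note that negation is written with ! and that an "or" group is wrapped in escaped parentheses so that -print applies to the whole group.

```shell
# Scratch tree with a handful of hypothetical files.
scratch=$(mktemp -d)
mkdir "$scratch/sub"
touch "$scratch/weasel.txt" "$scratch/weasel.c"
touch "$scratch/sub/weasel" "$scratch/sub/stoat.txt"

# Quote the pattern so that find, not the shell, expands the wildcard.
weas=$(cd "$scratch" && find . -name 'weas*' -print | sort)

# Two tests in a row act as "and": regular files whose names end in .txt.
txt=$(cd "$scratch" && find . -type f -name '*.txt' -print | sort)

# -o is "or"; the escaped parentheses group the two -name tests.
c_or_txt=$(cd "$scratch" && find . \( -name '*.c' -o -name '*.txt' \) -print | sort)

# ! negates the test that follows: regular files NOT matching weas*.
not_weas=$(cd "$scratch" && find . -type f ! -name 'weas*' -print)

printf '%s\n' "$weas" "$txt" "$c_or_txt" "$not_weas"
rm -r "$scratch"
```

Without the parentheses, -print would bind only to the second -name test, and files matching the first pattern would be silently dropped from the output.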

find can be combined with other commands to selectively archive or remove particular files from a filesystem. Let's say you want a list of every file you have modified in your home directory and all subdirectories in the last week:

% find ~ -type f -mtime -7 -print

Changing the type to d shows only new directories; changing the -7 to +7 shows all files modified more than a week ago. Now let's go back to the original problem and find executable files. One way to do this with find is to use the following command:

% find / -name progname -type f -exec ls -alF '{}' ';'

This example finds every match for progname and executes ls -alF FullPathName for every match. Any Unix command can be used as the object of -exec. Cleanup of the /tmp directory, which is usually done automatically by the operating system, can be done with this command:

% find /tmp -type f -mtime +1 -exec rm -rf '{}' ';'

This deletes every file under /tmp that was last modified more than a day ago. As always, you need to refer to your manual pages, or manpages, for more details (for more on manpages, see Chapter 5).
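Because -exec rm is unforgiving, it is worth rehearsing the same pattern where nothing important can be lost. The sketch below assumes a POSIX shell with mktemp and a touch that accepts -t; the filenames are hypothetical. It backdates one file, previews the matches with -print, and only then deletes them.

```shell
# Scratch directory with one backdated file and one fresh file.
scratch=$(mktemp -d)
touch "$scratch/new.tmp"
touch -t 200001010000 "$scratch/old.tmp"   # pretend it dates from 1 Jan 2000

# Dry run: print what the tests match before removing anything.
doomed=$(find "$scratch" -type f -mtime +1 -print)
echo "$doomed"

# Real run: {} is replaced by each matching pathname, and ';' ends -exec.
find "$scratch" -type f -mtime +1 -exec rm '{}' ';'

# Only the recently modified file should be left.
remaining=$(ls "$scratch")
echo "$remaining"

rm -r "$scratch"
```

Swapping -exec for -print first, as done here, is a cheap way to confirm that the tests select exactly the files you intend to act on.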

4.2.2.4 Finding an executable file with which

Usage: which progname

The which command searches your current path and reports the full path of the program that executes if you enter progname at the command prompt. This is useful if you want to know where a program is located, if, for instance, you want to be sure you're using the right version of the program. which can't find a program in a directory that isn't in your path.
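What which does can be approximated in a few lines of shell. The function below, find_in_path, is a hypothetical stand-in written for illustration, not the actual implementation of which; it is a minimal sketch that walks the directories in PATH in order and reports the first executable match, ignoring details such as empty PATH entries, aliases, and shell built-ins.

```shell
# find_in_path: hypothetical helper that mimics the basic search of which.
find_in_path() {
    prog=$1
    old_ifs=$IFS
    IFS=:                       # split PATH on colons
    for dir in $PATH; do
        if [ -f "$dir/$prog" ] && [ -x "$dir/$prog" ]; then
            IFS=$old_ifs
            echo "$dir/$prog"   # first match wins, as when running a command
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                    # not found in any PATH directory
}

find_in_path ls
```

A program installed in a directory that is not listed in PATH makes the function return a nonzero status and print nothing, which is the limitation of which described above.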

4.2.2.5 Finding an executable file with whereis

Usage: whereis -[options] progname

The whereis command searches a standard set of directories for executables, manpages, and source files. Unlike which, whereis isn't dependent on your path, but it looks for programs only in a limited set of directories, so it doesn't give a definitive answer about the existence of a program.
