Filename Extensions

In Microsoft Windows and some other operating systems, filenames often have the form name.extension. For example, plain text files have extensions such as .txt. The operating system treats the extension as separate from the filename and has rules about how long it must be, and so forth.

Unix doesn't have any special rules about extensions. The dot has no special meaning as a separator, and extensions can be any length. However, a number of programs (especially compilers) make use of extensions to recognize the different types of files they work with. In addition, there are a number of conventions that users have adopted to make clear the contents of their files. For example, you might name a text file containing some design notes notes.txt.
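As a rough illustration of how a program might use an extension to recognize a file type, here is a minimal sketch in Bourne shell. The classify function and the sample filenames are hypothetical, invented for this example; real compilers apply their own rules.

```shell
# classify: hypothetical helper that guesses a file's type from its name alone.
classify() {
    case "$1" in
        *.c)             echo "C source" ;;
        *.h)             echo "C header" ;;
        *.o)             echo "object file" ;;
        *.tar.gz|*.tgz)  echo "gzipped tar archive" ;;
        # ${1##*.} strips everything up to the last dot, leaving the extension.
        *.*)             echo "unrecognized extension: ${1##*.}" ;;
        *)               echo "no extension" ;;
    esac
}

classify main.c
classify notes.txt
classify a.very.long.extension_name
classify README
```

Nothing here depends on the length of the extension. The dot is just another character that the patterns happen to look for.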

Table 1-1 lists some of the filename extensions you might see and a brief description of the programs that recognize them.

Table 1-1. Filename extensions that programs expect

Extension       Description
.a              Archive file (library)
.c              C program source file
.f              FORTRAN program source file
.F              FORTRAN program source file to preprocess
.gz             gzipped file (Section 15.6)
.h              C program header file
.html or .htm   HTML file for web servers
.xhtml          XHTML file for web servers
.o              Object file (compiled and assembled code)
.s              Assembly language code
.z              Packed file
.Z              Compressed file (Section 15.6)
.1 to .8        Online manual (Section 2.1) source file
~               Emacs editor backup file (Section 19.4)

Table 1-2 lists some extensions that users often adopt to signal the contents of a file but that the programs themselves do not actually recognize.

Table 1-2. Filename extensions for user's benefit

Extension         Description
.tar              tar archive (Section 39.2)
.tar.gz or .tgz   gzipped (Section 15.6) tar archive (Section 39.2)
.shar             Shell archive
.sh               Bourne shell script (Section 1.8)
.csh              C shell script
.mm               Text file containing troff's mm macros
.ms               Text file containing troff's ms macros
.ps               PostScript source file
.pdf              Adobe Portable Document Format

—ML and TOR

1.13 Wildcards

The shells provide a number of wildcards that you can use to abbreviate filenames or refer to groups of files. For example, let's say you want to delete all filenames ending in .txt in the current directory (Section 1.16). You could delete these files one by one, but that would be boring if there were only 5 and very boring if there were 100. Instead, you can use a wildcarded name to say, "I want all files whose names end with .txt, regardless of what the first part is." The wildcard is the "regardless" part. Like a wildcard in a poker game, a wildcard in a filename can have any value.

The wildcard you see most often is * (an asterisk), but we'll start with something simpler: ? (a question mark). When it appears in a filename, the ? matches any single character. For example, letter? refers to any filename that begins with letter and has exactly one character after that. This would include letterA, letter1, as well as filenames with a nonprinting character as their last letter, such as letter^C.
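To see the ? wildcard in action, here is a small sketch for a Bourne-style shell. The scratch directory and the filenames are made up for the demonstration, and it assumes the widely available mktemp command.

```shell
LC_ALL=C; export LC_ALL          # fix the sort order of the expanded names
demo=$(mktemp -d) || exit 1      # scratch directory, so no real files are touched
cd "$demo" || exit 1
touch letter letter1 letterA letterAB

# letter? needs exactly one character after "letter", so neither
# "letter" (zero extra characters) nor "letterAB" (two) matches.
one_char=$(echo letter?)
echo "$one_char"                 # prints: letter1 letterA

cd / && rm -rf "$demo"           # clean up the scratch directory
```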

The * wildcard matches any group of zero or more characters. For example, *.txt matches all files whose names end with .txt; c* matches all files whose names start with c; c*b* matches names starting with c and containing at least one b; and so on.
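The following sketch tries a few * patterns of this kind in a Bourne-style shell. The filenames are invented for the example, and the scratch directory assumes the mktemp command is available.

```shell
LC_ALL=C; export LC_ALL          # fix the sort order of the expanded names
demo=$(mktemp -d) || exit 1      # scratch directory, so no real files are touched
cd "$demo" || exit 1
touch notes.txt todo.txt c cab club core abc

ends_txt=$(echo *.txt)           # names ending in .txt
starts_c=$(echo c*)              # names starting with c, including plain "c"
c_then_b=$(echo c*b*)            # names starting with c with a b somewhere after it

echo "$ends_txt"                 # prints: notes.txt todo.txt
echo "$starts_c"                 # prints: c cab club core
echo "$c_then_b"                 # prints: cab club

cd / && rm -rf "$demo"           # clean up the scratch directory
```

Note that c* matches the one-character name c, because * is allowed to match zero characters.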

The * and ? wildcards are sufficient for 90 percent of the situations that you will find. However, there are some situations that they can't handle. For example, you may want to list files whose names end with .txt, mail, or let. There's no way to do this with a single *; it won't let you exclude the files you don't want. In this situation, use a separate * with each filename ending:

*.txt *mail *let
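A quick sketch of combining several patterns on one command line, again with made-up filenames in a scratch directory (assuming mktemp is available). The shell expands each pattern separately, in the order given.

```shell
LC_ALL=C; export LC_ALL          # fix the sort order within each expansion
demo=$(mktemp -d) || exit 1      # scratch directory, so no real files are touched
cd "$demo" || exit 1
touch notes.txt junkmail mail booklet report.doc

# Three patterns, one per filename ending; report.doc matches none of them.
wanted=$(echo *.txt *mail *let)
echo "$wanted"                   # prints: notes.txt junkmail mail booklet

cd / && rm -rf "$demo"           # clean up the scratch directory
```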

Sometimes you need to match a particular group of characters. For example, you may want to list all filenames that begin with digits or all filenames that begin with uppercase letters. Let's assume that you want to work with the files program.n, where n is a single-digit number. Use the filename:

program.[0123456789]

In other words, the wildcard [character-list] matches any single character that appears in the list. The character list can be any group of ASCII characters; however, if they are consecutive (e.g., A-Z, a-z, 0-9, or 3-5, for that matter), you can use a hyphen as shorthand for the range. For example, [a-zA-Z] means any alphabetic English character.
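Here is a short sketch of character lists and ranges in a Bourne-style shell. The filenames are hypothetical, and the C locale is set explicitly because ranges such as A-Z can behave differently in other locales.

```shell
LC_ALL=C; export LC_ALL          # plain ASCII ordering for ranges and sorting
demo=$(mktemp -d) || exit 1      # scratch directory, so no real files are touched
cd "$demo" || exit 1
touch program.0 program.5 program.9 program.a program.10 Alpha beta 9lives

listed=$(echo program.[0123456789])   # every digit spelled out
ranged=$(echo program.[0-9])          # the same set, written as a range
upper=$(echo [A-Z]*)                  # names beginning with an uppercase letter
digit=$(echo [0-9]*)                  # names beginning with a digit

echo "$listed"                   # prints: program.0 program.5 program.9
echo "$ranged"                   # prints: program.0 program.5 program.9
echo "$upper"                    # prints: Alpha
echo "$digit"                    # prints: 9lives

cd / && rm -rf "$demo"           # clean up the scratch directory
```

The name program.10 is left out because the bracket expression stands for exactly one character.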

There is one exception to these wildcarding rules. Wildcards never match /, which is both the name of the filesystem root (Section 1.14) and the character used to separate directory names in a path (Section 1.16). The only way to match on this character is to escape it using the backslash character (\). However, you'll find it difficult to use the forward slash within a filename anyway (the system will keep trying to use it as a directory separator).
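The sketch below shows what this means in practice: a * stops at each directory boundary, so files in a subdirectory are reached only when the slash is typed explicitly in the pattern. The directory and file names are invented for the example, and mktemp is assumed to be available.

```shell
LC_ALL=C; export LC_ALL          # fix the sort order of the expanded names
demo=$(mktemp -d) || exit 1      # scratch directory, so no real files are touched
cd "$demo" || exit 1
mkdir sub
touch top.txt sub/inner.txt

here=$(echo *.txt)               # only the current directory is searched
below=$(echo */*.txt)            # the slash is typed explicitly; * matches "sub"

echo "$here"                     # prints: top.txt
echo "$below"                    # prints: sub/inner.txt

cd / && rm -rf "$demo"           # clean up the scratch directory
```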

If you are new to computers, you probably will catch on to Unix wildcarding quickly. If you have used any other computer system, you have to watch out for one important detail. Virtually all computer systems except for Unix consider a period (.) a special character within a filename. Many operating systems even require a filename to have a period in it. With these operating systems, a * does not match a period; you have to say *.*. Therefore, the equivalent of rm * does virtually nothing on some operating systems.

Under Unix, it is dangerous: it means "delete all the files in the current directory, regardless of their name." You only want to give this command when you really mean it.

But here's the exception to the exception. The shells and the ls command consider a . special if it is the first character of a filename. This is often used to hide initialization files and other files with which you aren't normally concerned; the ls command doesn't show these files unless you ask (Section 8.9) for them.

If a file's name begins with ., you always have to type the . explicitly. For example, .*rc matches all files whose names begin with . and end with rc. This is a common convention for the names of Unix initialization files.
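A small sketch of the leading-dot rule, using made-up initialization files in a scratch directory (assuming mktemp is available and the shell's default settings for dot files).

```shell
LC_ALL=C; export LC_ALL          # fix the sort order of the expanded names
demo=$(mktemp -d) || exit 1      # scratch directory, so no real files are touched
cd "$demo" || exit 1
touch .exrc .mailrc .profile myrc notes.txt

visible=$(echo *)                # a leading dot is never matched by *
plain_rc=$(echo *rc)             # so the dot files are skipped here too
dot_rc=$(echo .*rc)              # the dot must be typed explicitly

echo "$visible"                  # prints: myrc notes.txt
echo "$plain_rc"                 # prints: myrc
echo "$dot_rc"                   # prints: .exrc .mailrc

cd / && rm -rf "$demo"           # clean up the scratch directory
```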

Table 1-3 has a summary of common wildcards.

Table 1-3. Common shell wildcards

Wildcard   Matches
?          Any single character
*          Any group of zero or more characters
[ab]       Either a or b
[a-z]      Any character between a and z, inclusive

Wildcards can be used at any point or points within a path. Remember, wildcards only match names that already exist. You can't use them to create new files (Section 28.3), though many shells have curly braces ({}) for doing that. Section 33.3 explains how wildcards are handled, and Section 33.2 has more about wildcards, including specialized wildcards in each of the shells.
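The following sketch illustrates that wildcards match only existing names. It assumes a Bourne-style shell (sh or bash with default settings), where a pattern that matches nothing is passed along unchanged; other shells, such as csh and zsh, report an error instead. The filenames are made up.

```shell
demo=$(mktemp -d) || exit 1      # empty scratch directory
cd "$demo" || exit 1

# Nothing ends in .txt yet, so the pattern cannot expand to any name.
before=$(echo *.txt)
echo "$before"                   # prints the pattern itself: *.txt

# Create a file by giving its name explicitly; now the pattern has a match.
touch notes.txt
after=$(echo *.txt)
echo "$after"                    # prints: notes.txt

cd / && rm -rf "$demo"           # clean up the scratch directory
```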

— ML
