Changing the Frequency of Hyphenation

You can increase or decrease the frequency of hyphenation in a document using the MINPT parameter of the .HY [Hyphenate] control word. MINPT controls the minimum hyphenation point (the smallest number of characters acceptable as a hyphenation point for the word).

The default value for MINPT is 4. To change it, you could specify:

.hy set minpt 3

.hy on

SPELLING VERIFICATION

The spelling of words in your input file will be checked by the SCRIPT/VS spelling verification function when you include the SPELLCHK option in the SCRIPT command.

Spelling verification is accomplished by attempting to find each word in the input line in the SCRIPT/VS dictionary (described later in this chapter).

For purposes of spelling verification, a "word" is a string of 2 to 55 characters delimited by "word delimiters." The default word delimiters are listed in Figure 37 on page 363. You may change the word delimiters for spelling verification with the .DC [Define Character] WORD control word.

Punctuation characters are considered part of the word if they appear within it, but are removed before spelling verification is performed if they appear at the end of the word. The default punctuation characters are the hyphen (-) and apostrophe ('). You can change the punctuation characters with the .DC [Define Character] PUNC control word.
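The word rules above can be pictured with a short Python sketch. It is an illustration only, not the SCRIPT/VS implementation: the delimiter set is an assumption (the real defaults are in Figure 37), the name candidate_words is hypothetical, and applying the length limits after trailing punctuation is removed is also an assumption.

```python
# Assumed delimiters, for illustration only; the real defaults are in Figure 37.
WORD_DELIMITERS = " \t,.;:!?()\""
PUNCTUATION = "-'"          # default punctuation characters named in the text
MIN_LEN, MAX_LEN = 2, 55    # word length limits named in the text


def candidate_words(line):
    """Return the strings in a line that would count as words to be verified."""
    # Turn every delimiter into a blank, then split on blanks.
    table = str.maketrans({ch: " " for ch in WORD_DELIMITERS})
    words = []
    for token in line.translate(table).split():
        # Punctuation at the end of a word is removed; punctuation inside is kept.
        token = token.rstrip(PUNCTUATION)
        # Assumption: the length limits apply to the word after this removal.
        if MIN_LEN <= len(token) <= MAX_LEN:
            words.append(token)
    return words
```

With these assumed delimiters, "well-known" and "editor's" stay whole because their punctuation is inside the word, while a trailing hyphen is dropped and a one-letter string such as "I" is not treated as a word.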

When words are verified for correct spelling, the original word, using the case (upper, lower, or mixed) as it occurs in the input line after symbol substitution, is checked against both the main and addenda dictionaries that make up the SCRIPT/VS dictionary being used. If no match is found and the word is in uppercase, all of the letters except the first are translated to lowercase and the word is again checked against both dictionaries. If still no match is found, the first letter is translated to lowercase and the word is again checked against both dictionaries. (If the word is not all uppercase and any character other than the first is capitalized, the word is considered unverified.) If no match is found this time, SCRIPT/VS removes the prefix and suffix, if any, to yield the word's "root." This form of the word is then checked against both dictionaries. If again no match is found, the word is considered unverified. SCRIPT/VS issues a message listing all of the unverified (and potentially misspelled) words in an input line by invoking the .UW [Unverified Word] control word.

Note:

Since stem processing is performed only after each word is translated to lowercase, all words placed in the addenda dictionary should be in lowercase if stem processing is desired. No match will be found for a lowercase occurrence of a word if that word was added to the addenda dictionary in uppercase.
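The lookup sequence described above can be sketched in Python as follows. This is a reading of the prose, not the actual SCRIPT/VS code: the function name verify, the use of Python sets for the two dictionaries, and the find_root parameter are all assumptions made for illustration. Because the text does not give the prefix and suffix rules, find_root defaults to a placeholder that returns the word unchanged.

```python
def verify(word, main, addenda, find_root=lambda w: w):
    """Return True if the word is verified by the lookup sequence in the text.

    main and addenda are sets of words (an assumed representation);
    find_root stands in for stem processing, whose rules are not given here.
    """
    def known(form):
        return form in main or form in addenda

    # 1. The word as it occurs in the input line.
    if known(word):
        return True

    if word.isupper():
        # 2. All uppercase: translate every letter except the first to lowercase.
        word = word[0] + word[1:].lower()
        if known(word):
            return True
    elif word[1:] != word[1:].lower():
        # Not all uppercase, with a capital after the first letter: unverified.
        return False

    # 3. Translate the first letter to lowercase as well.
    word = word.lower()
    if known(word):
        return True

    # 4. Remove prefix and suffix (stem processing) and check the root.
    return known(find_root(word))
```

Step 4 only ever sees a lowercase word, which is why an addenda entry must be in lowercase to be matched through stem processing. With an addenda set containing only "Teri", this sketch accepts "Teri" and "TERI" but not "teri".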

Spelling verification is normally performed using the main and addenda dictionaries with stem processing. Words that contain numbers are not checked unless requested with the NUM parameter of the .SV [Spelling Verification] control word.

You can specify that:

• The addenda dictionary is not to be used:

.sv noadd

• Full word processing rather than stem processing is to be performed:

.sv nostem

• Words that contain numbers are to be checked:

.sv num

Spelling verification can also be used to verify that proper names start with an initial capital letter. For example, if an entry is made in the addenda dictionary as follows,

.du add Teri

then "Teri" and "TERI" will both be correctly spelled. However, "teri" will be regarded as misspelled.

Chapter 16. Automatic Hyphenation and Spelling Verification 173

Fallibility

SCRIPT/VS spelling verification is not infallible. A misspelled word with a suffix or prefix could possibly yield a correctly spelled word after stem processing. For example, "disbooked" (with the stem "book"), and "missteak" (with the stem "steak") are both "correctly" spelled.

Also, the stem processing algorithms do not handle all exceptions to general spelling rules used in the English language. For example, the plural of "mouse" must be explicitly added to an addenda dictionary.
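A toy Python sketch shows how both kinds of failure can arise. The prefix and suffix lists, the dictionary contents, and the names toy_root and accepted are all hypothetical; the real SCRIPT/VS stem processing rules are language-specific and are not described here.

```python
# Hypothetical affix lists and dictionary, chosen only to reproduce the examples.
PREFIXES = ("dis", "mis")
SUFFIXES = ("ed", "s")
DICTIONARY = {"book", "steak", "mouse"}


def toy_root(word):
    """Strip at most one listed prefix and one listed suffix from the word."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix):
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            word = word[:-len(suffix)]
            break
    return word


def accepted(word):
    """A word passes if it, or its root, is in the dictionary."""
    return word in DICTIONARY or toy_root(word) in DICTIONARY
```

Under these assumed rules, "disbooked" and "missteak" pass because their roots are "book" and "steak". The irregular plural "mice" fails because no stripping rule reduces it to "mouse", so it would have to be added to an addenda dictionary.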

THE SCRIPT/VS DICTIONARIES

There are three types of SCRIPT/VS dictionaries that are used for hyphenation and spelling verification:

• Read-only dictionaries of root words provided by IBM with SCRIPT/VS. Each contains about 10,000 words. Since suffixes and prefixes are removed before a word is searched for in this dictionary, the effective dictionary size is significantly larger.

• User dictionaries created by your installation using the Dictionary Maintenance program. These dictionaries contain words that are not in the main dictionaries but are used in most of the documents produced at your installation. These words often reflect the nature of a particular business and usually include technical terms and company acronyms. Once created, these dictionaries are also read-only.

• Addenda dictionaries you create for a specific document or group of documents using the .DU [Dictionary Update] ADD control word. Addenda dictionaries contain words that are not in the main or user-created dictionaries but are frequently used in a specific document or a group of documents. This type of dictionary often includes acronyms that apply to a particular product, jargon, and the names of people and places. It is the most temporary of the three types of dictionaries since it is rebuilt in storage every time SCRIPT/VS processes a document that requires it. Addenda dictionaries can be updated as required.

IBM provides root word dictionaries in nine languages:

• American English

• United Kingdom English

• Canadian English

• Canadian French

• French

• German

• Italian

• Dutch

• Spanish

The unique stem processing routine that IBM provides with each of these languages is used by all three types of SCRIPT/VS dictionaries in performing hyphenation and spelling verification in a given language.

Use the .DL [Dictionary List] control word to specify which language you want to use for hyphenation and spelling verification. This control word automatically activates the corresponding stem processing routine for that language, as well as any user dictionaries that are associated with that root word dictionary.


The hexadecimal codepoints used for accented characters in SCRIPT/VS spelling checking and hyphenation are listed in Figure 12.

Figure 12. Codepoint Assignments for Accented Characters: Accented characters in the SCRIPT/VS Spelling Checking and Hyphenation dictionaries are represented using the hexadecimal codepoints shown under each language for uppercase (UC) and lowercase (lc) characters.

Building a User Dictionary

A user dictionary is created and updated using the dictionary maintenance procedures that are described in "Appendix F. Maintaining User Dictionaries" on page 391. The words that are to be

