Appendix D: Glossary

Ascii Character

An Ascii character is one of the standard Roman characters that English-language computers display. PC-compatible computers use an extended Ascii character set, which consists of the standard Ascii character set (character numbers(D- - 2) 0 to 127) and the so-called extended characters(D- - 3) (128 to 255). See the ASCII Chart(8- 5).


Ascii Code

The Ascii code is one of Smart Characters' methods of representing a character number(D- - 2): the number is written in ordinary digits (0 to 9), preceded by the Chars(D- - 2) object type code(4- 5) (^R). If you were viewing this document in Smart Characters, you would see a Chinese character here: ^R2312. Besides the object type code, from 1 to 6 digits are used per Chinese character, requiring at least twice as much storage per Chinese character as the Binary Code(8- 1). However, many tens of thousands of characters can be encoded this way.
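
The layout described above can be sketched in a few lines of Python. This is only an illustration: it assumes that ^R is the ordinary control character Ctrl+R (code 18) and that the digits are plain decimal, and the function names are hypothetical rather than part of Smart Characters.

```python
# Assumption: ^R is the control character Ctrl+R (code 18, hex 12).
CHARS_TYPE_CODE = "\x12"

def to_ascii_code(character_number: int) -> str:
    """Return the type code followed by the number in decimal digits."""
    digits = str(character_number)
    if character_number < 0 or len(digits) > 6:
        raise ValueError("expected a non-negative number of 1 to 6 digits")
    return CHARS_TYPE_CODE + digits

def caret_notation(text: str) -> str:
    """Show control characters (codes 0-31) as ^X so they can be read."""
    return "".join("^" + chr(ord(c) + 64) if ord(c) < 32 else c for c in text)

encoded = to_ascii_code(2312)
print(caret_notation(encoded))  # prints ^R2312
print(len(encoded), "characters of storage for one Chinese character")
```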


Autoexec.bat

The Autoexec.bat file controls your computer when you first turn it on, or re-boot. Its main use is to set the DOS environment, and to install resident or pop-up programs. Smart Characters does not require anything special in your Autoexec.bat file.

Big Five

Big Five (Big-5) is the name of a traditional Chinese document format(7- 2) and symbol set(D- - 7). See GuoBiao(D- - 4).

Binary Code

Binary code is a way Smart Characters uses to represent a character number(D- - 2) using two extended characters(D- - 3). This method handles character numbers from 0 to 15,875 (126 x 126). Higher numbers require using the ASCII Code(D- - 1). There are various binary codes: Sc is Smart Characters' own, Shift JIS(D- - 7) and EUC are typical Japanese codes, and Big Five(D- - 1) and GuoBiao(D- - 4) (GB) are used for Chinese.
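
The arithmetic behind the 126 x 126 limit can be sketched as follows. The glossary does not say which 126 extended characters the Sc code uses, so the starting value below is purely hypothetical, as are the function names; only the two-character, base-126 structure comes from the text.

```python
VALUES_PER_CHARACTER = 126       # from the glossary: 126 x 126 combinations
HYPOTHETICAL_FIRST_CODE = 0x81   # assumption only, not the documented Sc value

def to_binary_code(character_number: int) -> bytes:
    """Split a character number into two extended characters (128-255)."""
    if not 0 <= character_number < VALUES_PER_CHARACTER ** 2:
        raise ValueError("number too large for two characters; use the ASCII code")
    high, low = divmod(character_number, VALUES_PER_CHARACTER)
    return bytes([HYPOTHETICAL_FIRST_CODE + high, HYPOTHETICAL_FIRST_CODE + low])

def from_binary_code(pair: bytes) -> int:
    """Recombine the two extended characters into the character number."""
    high, low = (b - HYPOTHETICAL_FIRST_CODE for b in pair)
    return high * VALUES_PER_CHARACTER + low

print(VALUES_PER_CHARACTER ** 2 - 1)  # prints 15875, the highest number
print(from_binary_code(to_binary_code(2312)))  # prints 2312
```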


Buffer

A buffer is an area in the computer's RAM (memory) that holds documents or other text for editing, viewing, or reference. These areas are called document windows(4- 3), documents, etc.


Bugs

Bugs are errors in the software that can cause tremendous grief. If you spot one, call Apropos Customer Service(F- - 1).

Bulletin Board

An electronic bulletin board service (or BBS) is a telecommunications information service. Dial-up bulletin boards can be accessed using a modem. See User's Group(D- - 8).

Character Dictionary

A character dictionary is a printed or electronic Chinese character dictionary indexed by character (instead of pronunciation).

Character Map

A character map is a chart that displays characters in a certain typeface(4- 11) and code page(D- - 2) or symbol set(D- - 7). Windows provides a character map application, CharMap.exe, for selecting symbols from a font. These characters are copied to the Windows clipboard(D- - 9) for later pasting into an application. Use the Keyboard Character Map(3- 29) command to launch CharMap.

Character Number

Unlike English dictionaries, a Chinese or Japanese character dictionary(D- - 1) gives each character a character number for easy reference. The numbering scheme is different from dictionary to dictionary, and there are dozens of conflicting numbering schemes. These schemes are generally based upon radicals(D- - 6) and the number of strokes(D- - 7). The Smart Characters Combined(4- 9) symbol set uses a numbering scheme based upon Practical Chinese Dictionary, with additional Japanese, Chinese, and Korean characters added to the end.


Chars

Chars is the name given to the Input Mode(4- 2) that is used for direct keyboard entry (by character number(D- - 2)) of kanji or hanzi (Chinese characters).

Code Page

A code page is a single byte code space(D- - 2), consisting of codes 0-255.

Code Space

A code space (also known as a code page(D- - 2)) specifies character codes (or character numbers(D- - 2)) and their corresponding characters. See Code Spaces(5- 10).
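
As a small illustration of how one code can stand for different characters in different code pages, the sketch below decodes single bytes with two code pages that ship with Python. The choice of cp437 (the original PC code page) and cp1252 (a Windows Western code page) is an assumption made for the example, and the function name is hypothetical.

```python
def characters_for(code: int, code_pages=("cp437", "cp1252")) -> dict:
    """Return the character each code page assigns to a single-byte code."""
    # errors="replace" covers the few codes a code page leaves undefined.
    return {cp: bytes([code]).decode(cp, errors="replace") for cp in code_pages}

# A standard Ascii code (below 128) is the same in both code pages.
print(characters_for(0x41))
# An extended code (128 to 255) depends on the code page in use.
print(characters_for(0xE9))
```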

Compound Objects

Compound objects are structured groups of other objects. A word consists of a notes object followed by one or more text objects(4- 1). Additional examples include headers, footers, footnotes, bookmarks, and OLE(D- - 5) (Object Linking and Embedding) objects.


Config.sys

Config.sys is the name of a file that DOS uses in the boot process. Your computer "boots up" when you turn the power on, press the Reset button (if you have one), or press Ctrl+Alt+Delete. Smart Characters does not require any entry in Config.sys.


Concordance

A concordance is a dictionary that lists the correspondence of one set of numbers or words (the From set) to another set (the To set). A concordance has a natural order, such as English to Japanese. Each From entry has exactly one To correspondence, but a To entry can have more than one corresponding From entry. If so, the reverse order concordance (e.g. Japanese to English) will have ambiguities.
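
The ambiguity of a reversed concordance can be shown with an ordinary mapping. The sketch below uses invented sample entries and hypothetical function names; it says nothing about how Smart Characters stores its concordance files.

```python
from collections import defaultdict

# Hypothetical sample data: each From entry has exactly one To entry.
english_to_japanese = {"big": "ookii", "large": "ookii", "small": "chiisai"}

def reverse_concordance(concordance: dict) -> dict:
    """Build the reverse order concordance, keeping every From candidate."""
    reverse = defaultdict(list)
    for from_entry, to_entry in concordance.items():
        reverse[to_entry].append(from_entry)
    return dict(reverse)

def ambiguities(reverse: dict) -> dict:
    """Return the reversed entries that have more than one correspondence."""
    return {key: values for key, values in reverse.items() if len(values) > 1}

japanese_to_english = reverse_concordance(english_to_japanese)
print(japanese_to_english)
print(ambiguities(japanese_to_english))
```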

Control Character

Control characters are the first 32 codes (0-31) in the Ascii character set, known as the control range. You can generate control characters by pressing the Ctrl shift key plus a letter key or one of @ ^ _ \ [ ]. See Control Code History(5- 11) and the ASCII Chart(5- 5).
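
In standard Ascii, the code produced by Ctrl plus a key is the key's code with all but its low five bits cleared. The short sketch below shows that relationship; the function names are hypothetical.

```python
def control_code(key: str) -> int:
    """Return the control code (0-31) for Ctrl plus a letter or @ ^ _ \\ [ ]."""
    return ord(key.upper()) & 0x1F

def caret_name(code: int) -> str:
    """Return the conventional ^X name of a control code."""
    return "^" + chr(code + 64)

for key in ["@", "A", "R", "[", "\\", "]", "^", "_"]:
    code = control_code(key)
    print(key, code, caret_name(code))
```

For example, Ctrl+R gives code 18 (^R) and Ctrl+] gives code 29 (^]).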

Current Directory

The current directory is the directory that is the base for paths and file names. There are several current directories relevant to Smart Characters operation: the system path(D- - 7), the default path(D- - 3), and the DOS current directory.

Current Line

The current line is the line the insertion point(5- 1) (flashing text cursor) is on.

Default Object Type

The default object type, usually English, is the object type(4- 2) that the Smart Characters text encoding(D- - 8) method presumes at the beginning of a line.

Default Path

The default path is where document or text files are presumed to reside. This is controlled by the User.ini WorkStation(B- - 6) section DefPath entry.


Dictionary

Generally, a dictionary consists of entries of a key word and information about it. For example, a typical English dictionary has English words as key words with pronunciation, etymology, definitions, and gloss(D- - 4). See Dictionary Basics(4- 7).

Dictionary Set

A dictionary set consists of a group of dictionaries related by language. Japanese, Chinese (bopomofo input), the obsolete Chi-Pin (Pinyin input), and English are typical dictionary set definitions.

Encoding Method

An encoding method is the way a character number(D- - 2) is represented in a document. There are many standards, such as Shift JIS(D- - 7) and Big Five(D- - 1). Smart Characters uses two methods: ASCII Code(D- - 1) and Binary Code(D- - 1). See How a Character Number Is Stored(12- 8). Use the File Format(3- 2) dialog and ScConv(D- - 7) to convert documents from one method to another.

Escape JIS

Escape JIS is a method of encapsulating Shin JIS(D- - 7) text in escape sequences, which lets some software identify the embedded Japanese text.

Extended Characters

Extended characters are ASCII characters(D- - 1) with values from 128 to 255 ($80 to $FF in hexadecimal). These codes display characters according to the current code space(D- - 2). The two most important code spaces are

Flash Cards

Flash cards are a set of printed cards used for memorizing vocabulary. Each has a word or phrase on one side, and its pronunciation and meaning on the other. They are handy on the subway, until you drop them and lose the one that will be asked on tomorrow's quiz at school.

Font Resolution

Font resolution is a measure of the number of pixels(D- - 6) used to represent each character.

Font Parameters

Font parameters are numbers or names that describe a font, such as font resolution(D- - 3), baseline, symbol set(D- - 7), size, etc. See Setup Bitmap Font(3- 43).

Font Families

Font families allow selection of multiple matching fonts (Chinese, alphabetic(4- 13), and punctuation characters) by one name, such as Combined, Plain-JIS, or Fancy-JIS.

Font Family Name

A font family name is the way fonts were designated in Smart Characters for Students versions 1.0-2.9: a particular combination of typeface(4- 11), proportion, and style (e.g. HelveticaBold, Combined, Plain-JIS).

Format Codes

Format codes do not print but rather affect the way subsequent characters print. DOS ASCII text recognizes only simple format codes: Carriage Return, Line Feed, Form Feed, Tab, and End of File. WordPerfect® uses the ASCII extended characters(D- - 3) as hidden codes(D- - 4), allowing for a wide range of features, such as centering, font changes, automatic indexing, tables of contents, etc. The addition of Chinese characters radically complicates matters, since they require an additional 15000 or so codes! Smart Characters uses format codes consisting of the format object type code(D- - 5) ("^]") followed by an identifier and a value. You can view the format codes by turning on the Ascii Codes(5- 5) window. For a list of format codes, see FmtCodes.txt.

Other word processors (e.g., Microsoft Word) store formatting information for each paragraph in a data structure called a paragraph mark. It is not possible to view codes using these products, because there are none.


Furigana

Furigana are small Japanese kana, usually written above a kanji (Chinese character) to indicate its pronunciation.


Gloss

A gloss is an equivalent word or phrase in an alternate language. Contrast this to a definition, which is a sentence that describes what an item is and is not.


Glossary

A glossary is a list of specific or frequently used characters, words, or phrases. A dynamic glossary(4- 6) facilitates extremely rapid input of large amounts of standard text. See dictionary(D- - 3).


Glyph

A glyph is a rendering (drawing) of a character in a particular typeface(4- 11) and point size(D- - 6). A character can be represented by glyphs in an infinite variety of typefaces, styles, and sizes.


GuoBiao

GuoBiao (GB) is the name of a Chinese document format(7- 2) and symbol set(D- - 7) used primarily for simplified(4- 10) characters. The encoding method is equivalent to JIS(D- - 4) (7 bit) and EUC (8 bit). HZ is escaped 7 bit GuoBiao. See Big Five(D- - 1).
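
The relationship between the 7 bit and 8 bit forms can be checked with the codecs built into Python. This is a sketch only: it assumes Python's gb2312 and hz codecs are fair stand-ins for the 8 bit and escaped 7 bit formats named above, and the function name is hypothetical.

```python
def hz_payload(text: str) -> bytes:
    """Return the 7 bit GuoBiao codes from HZ text, without the escapes."""
    hz = text.encode("hz")
    # For text that is entirely Chinese, HZ wraps the codes in ~{ and ~}.
    if not (hz.startswith(b"~{") and hz.endswith(b"~}")):
        raise ValueError("expected a single escaped GuoBiao run")
    return hz[2:-2]

text = "中文"
seven_bit = hz_payload(text)
eight_bit = text.encode("gb2312")

print(seven_bit.hex(), eight_bit.hex())
# Setting the high bit of every 7 bit code gives the 8 bit form.
print(bytes(b | 0x80 for b in seven_bit) == eight_bit)
```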

Help Topic

A Help Topic is a Topic Entry(12- 3) written for and used by help.

Hidden Codes

Hidden codes are codes in the text that do not print, but affect the display, the printer, or how Smart Characters works. See format codes(D- - 3). You can examine them in the hidden characters(5- 4) window or the Ascii Codes(5- 5) window, or by using ASCII Only display mode(3- 7).


Hypertext

Hypertext is a way of writing and presenting text with hot buttons(5- 13): words or phrases that the user can highlight and pick to get further information on a concept or term. See Writing Hypertext(12- 1), a discussion of hypertext objects beginning with context reference(12- 3), and a summary of hypertext codes(5- 12).


JIS

Japanese Industrial Standards (JIS) apply to the encoding, storage, and transfer of Chinese character text. The symbol set(D- - 7) is divided into three levels(D- - 5) (I - common, II - less common, III - very uncommon). Two encoding methods(D- - 3) are specified: Shin JIS(D- - 7) and Shift JIS(D- - 7).

Keyboard Mapping

Keyboard mapping is the relationship between the keys on the keyboard and the characters they represent. Smart Characters uses keyboard definition(4- 6) files to map the standard PC keyboard to kana, bopomofo, and Asian punctuation. See the Asian Punctuation Chart(E- - 1).


Keystroke

A keystroke is what the computer receives each time you press a key. A typewriter key(5- 6) such as a, Q, or 3 sends an Ascii Character(D- - 1) plus a scan code that indicates which keyboard key was pressed. The arrow and function keys send only scan codes.


Levels

Font levels are portions of a symbol set(D- - 7) that have been placed into separate font files in order to segment the symbol set for different applications. See JIS(D- - 4) and the Combined(4- 9) symbol set.


Macro

A macro is a very short computer program that you create and use to accelerate text entry, menu selections, or other tasks. See macro keyboard(4- 6) and the Macro(3- 29) command.


Mapping

Mapping is the concordance(D- - 2) or correspondence between one list and another. For example, the 3 key maps to the English number 3, and the Japanese letter A.


Maximized Window

A maximized window is a window of the largest possible size, neither restored(D- - 7) nor minimized(D- - 5). A maximized window fills its parent window completely.

Menu Item

A menu item is a line or part of a menu. It can represent a choice, an indicator, or a title. Active items represent choices, and can be selected by the mouse or arrow keys. Inactive items cannot.


Minimized Window

A minimized window is a window of the smallest possible size, neither restored(D- - 7) nor maximized(D- - 5). A minimized window displays as an icon in its parent window.

Object Type Code

An object type code is a single control character(D- - 2) that sets the object type(4- 2) of the following text objects(4- 1) until the next object type code. You can enter object type codes(5- 11) by pressing the corresponding control keys. See Entering Object Type Codes(5- 7).

OLE (Object Linking and Embedding)

Object linking and embedding (OLE) is a group of related methods used by Windows applications to share data. You can insert an object from an OLE server (e.g., Smart Characters or Paint(D- - 6)) into an OLE client (e.g., Smart Characters or Microsoft Write) in one of three ways: as a server object, as a link to a server object, or as a static converted picture or bitmap. You need only double click the inserted object in the client application to edit it (using the server application). See Using Chinese and Japanese in Other Applications(10- 1) and Inserting Objects from Other Programs(10- 2).


Modal is a term applied to an item or event, such as a character code or a keystroke, whose meaning depends on the current mode. Software is modal in order to increase functionality without requiring a correspondingly large number of keys or character codes. Smart Characters is particularly modal, because of the large number of languages that must be written and displayed. The advantage is cost and size: a standard keyboard can be used. The disadvantage is in remembering how to use the software: each key has multiple meanings.


Paint

Paint is the name of a simple OLE(D- - 5) server bitmap editor application that is part of Windows and located in the Accessory group.

Phonetic Dictionary

A phonetic dictionary is a dictionary(D- - 3) that is keyed or indexed by the pronunciation or sound of a word.


Pinyin

Pinyin is romanized Chinese text, e.g. wo_ yo_ hun_dou bu\dong_de- di\fang_. The check or underscore indicates the 3rd (low) tone, while the dot or hyphen is the neutral tone. The first (high) tone has no mark, while 2nd (rising) and 4th (falling) are indicated by "/" and "\". Smart Characters also uses numbers for tone markings, e.g. wo3 yo3 hun3dou bu4dong3de5 di4fang3. Pinyin to bopomofo translation is controlled by keyboard definition(4- 6) files, typically ChiRules.kbd.
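
The correspondence between the two tone notations is a one-for-one substitution of marks, which the sketch below applies to the sample sentence from this entry. It illustrates the notation only and is not how Smart Characters or ChiRules.kbd performs the translation; the function name is hypothetical.

```python
# Tone marks as described in this entry: mark -> tone number.
# The first tone has no mark, so it has no number either.
TONE_MARKS = {"/": "2", "_": "3", "\\": "4", "-": "5"}

def marks_to_numbers(pinyin: str) -> str:
    """Replace each tone mark with the matching tone number."""
    return pinyin.translate(str.maketrans(TONE_MARKS))

print(marks_to_numbers(r"wo_ yo_ hun_dou bu\dong_de- di\fang_"))
# prints wo3 yo3 hun3dou bu4dong3de5 di4fang3
```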


Pixel

A pixel is one dot on the screen or printed page. Characters are stored and written by a rectangular pattern of pixels. Each pixel has a color. Simple images have just two possible colors: black and white, while full color images may have millions of possible colors. Resolution is the number of rows and columns of pixels that make up an image. Chinese character font resolutions(D- - 3) include 16x16, 24x24, and 48x48 pixels, and higher. See Resolution and Point Size(4- 11) and Point Size(D- - 6).

Point Size

Point size is the size of a printed character in points; one point is 1/72 of an inch. Point size also measures the number of vertical pixels(D- - 6) or scan lines used to render a glyph(D- - 4) in a bitmap font(8- 5). See Resolution and Point Size(4- 11).
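
Because a point is 1/72 of an inch, the number of pixels needed for a given point size follows from the resolution of the device. The sketch below shows the arithmetic; the function name and the sample resolutions are assumptions chosen for illustration.

```python
def points_to_pixels(point_size: float, dots_per_inch: float) -> int:
    """Return the pixel height of a character of the given point size."""
    return round(point_size * dots_per_inch / 72)

print(points_to_pixels(12, 300))  # prints 50
print(points_to_pixels(24, 72))   # prints 24
```

At 300 dots per inch, for instance, a 12 point character is about 50 pixels tall, so a 48x48 bitmap font would be slightly too small for it.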

Postscript Name

The postscript name is a name using English characters stored in a TrueType(D- - 8) font file for use with postscript printers. The names are unique, but are not intended for user selection. The DoubleByte TrueType Font Interface(13- 1) uses postscript names anyway as a basis for a platform independent way to uniquely designate a font using English letters.

Preferred Mode

A preferred mode is an input mode whose Access is set to F12, Side + - using the Keyboard Setup(3- 26) dialog.

Proxy Font

A proxy font is a small font that contains the user characters(4- 10) used in the document, and which is often embedded (see embedded proxy font(9- 1)) into the document for electronic transmission, and correct display on other Smart Characters systems.


Radical

A radical is a part of a Chinese Character(4- 8), used to classify characters so they can be identified by sight without knowing their pronunciations. Chinese characters can be created using roughly 800 component parts, but only 214 of these are considered radicals.


Restored Window

A restored window is a window of intermediate size, neither maximized(D- - 5) nor minimized(D- - 5). You can adjust the size of a restored(D- - 7) active window by dragging its borders with the mouse (or the arrow keys via the child window system menu).

Rich Text Format

Rich text format is the name of an interchangeable word processing file format that uses Ascii characters but preserves all types of formatting. This format is also used for the "formatted text" Windows clipboard(D- - 9) data type.


Romaji

Romaji is romanized Japanese text, e.g. Watashi wa nihongo wo benkyoo shiteiru mono desu.

Scalable Typeface

A scalable typeface can be scaled to any point size(D- - 6) to create a font. The two most popular scalable type formats are Adobe Type 1 and TrueType(D- - 8).


ScConv

ScConv converts or translates documents made by other Chinese and Japanese word processors. It also converts romaji to hiragana, and pinyin to bopomofo.


ScDict

The accessory utility ScDict builds new dictionaries of various types.


ScDisk

The accessory utility ScDisk reads and writes foreign (non-DOS) disks, as used in Japanese computers and word processors.

Shift JIS

Shift JIS is a text encoding(D- - 8) method that uses the ASCII extended characters(D- - 3) to make kanji and special symbols.

Shin JIS

Shin JIS is a text encoding(D- - 8) method that uses the lower 128 standard Ascii characters to make Chinese characters and special symbols.


Stroke

A stroke is a continuous line in a Chinese character. Most strokes are straight or curved, but many have right-angle bends, and look like two strokes. This makes counting strokes a difficult task for students.

Symbol Set

A symbol set is the group of characters intended for a particular use. The most popular single byte symbol sets in use are realized in the standard roman fonts using the Windows code page(D- - 2), and the Symbol and WingDings fonts. There are various incompatible Asian symbol sets(4- 9).

A symbol set is similar to a printed character dictionary(D- - 1) in that both contain certain characters numbered with a character number(D- - 2) in a specific order. You can browse a symbol set in the symbol set view(8- 1) window.

System Path

The system path is the directory Smart Characters uses for font, keyboard, glossary, dictionary, concordance, menu and dialog resource, and help files.

Text Encoding

Text encoding is the relationship between a character as shown on the screen or page, and how it is stored in a document file. Whatever method is used has to be distinguishable from whatever format codes(D- - 3) are employed. The notable world standards are ASCII, China's GB2312, Taiwan's CNS11643, Japan's JIS X0208, Korea's KS C5601, ANSI Z39.64, and Unicode.


TrueType

The TrueType font format is one of several formats that uses mathematical formulas instead of fixed pixel(D- - 6) bitmaps. The formulas are scaled to create bitmaps for high quality display at any resolution over a certain minimum. TrueType was developed by Apple Computer and adopted by Microsoft Windows. Its principal competitor is the Adobe Type 1 format. See DoubleByte TrueType Font Interface(13- 1).


Unify

Unify is a DOS utility for symbol set(D- - 7) unification supplied with font updates, or available on the User's Group(D- - 8) bulletin board(D- - 1).

User's Group

Anyone can donate their vocabulary lessons(D- - 8) or user dictionary(4- 7) to the Smart Characters User's Group. You can obtain these lessons whether you donate or not. The best of the User's Group materials are available in the \Sc\Jpn and \Sc\Chi directories. For new materials, check the User's Group bulletin board(D- - 1).

Unix Japanese

Unix Japanese (frequently called EUC) is the Japanese JIS text encoding(D- - 8) method adapted for the Unix operating system by setting bit 7 high.
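
The "bit 7 high" relationship can be checked with the codecs built into Python. This sketch assumes that Python's iso2022_jp codec is a fair stand-in for escaped 7 bit JIS and that euc_jp corresponds to Unix Japanese; whether these match the formats Smart Characters reads byte for byte is not confirmed here, and the function name is hypothetical.

```python
def jis_payload(text: str) -> bytes:
    """Return the 7 bit JIS codes for kanji text, without the escapes."""
    jis = text.encode("iso2022_jp")
    # For all-kanji text the codec emits ESC $ B, the codes, then ESC ( B.
    if not (jis.startswith(b"\x1b$B") and jis.endswith(b"\x1b(B")):
        raise ValueError("expected a single escaped kanji run")
    return jis[3:-3]

text = "漢字"
seven_bit = jis_payload(text)
euc = text.encode("euc_jp")

print(seven_bit.hex(), euc.hex())
# Setting bit 7 of every JIS code gives the EUC form.
print(bytes(b | 0x80 for b in seven_bit) == euc)
# Shift JIS stores the same characters with different codes.
print(text.encode("shift_jis").hex())
```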


Version

The version name indicates the name and revision of the product. For example, Smart Characters comes in various versions: for Students and for Windows. The latter versions contain more word processing functions, and are adaptable for use by native as well as non-native speakers. Additional versions include specific user interfaces to machine translation systems, etc. A version number indicates the version of the software currently in use. For example, this manual covers Smart Characters for Windows version 3.0.029.

Vocabulary Lesson

A vocabulary lesson is a Smart Characters document consisting of vocabulary words or phrases organized one per line in the vocabulary lesson format(6- 3). Chinese vocabulary lessons typically have an extension of .cv#, where the "c" stands for Chinese, the "v" for vocabulary, and the # is the font symbol set(D- - 7) character order number (usually 0, 1 or 2). The User's Group(D- - 8) vocabulary lessons are included for your amusement.

Wild Cards

Wild cards are special characters which indicate that any character can be substituted when searching or matching. The DOS wild cards are the asterisk ("*") and the question mark ("?"). The question mark matches any single character. The asterisk matches any character and all following characters. See your DOS or Windows manual.
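
Wild card matching can be tried out with Python's fnmatch module, which uses the same two symbols. The file names below are hypothetical, and fnmatch follows its own shell-style rules, which differ from DOS in some details (for example, it lets an asterisk be followed by more literal characters), so treat this as an illustration rather than a DOS emulation.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, for illustration only.
names = ["lesson1.cv0", "lesson2.cv1", "lesson10.cv2", "notes.txt"]

def matching(pattern: str) -> list:
    """Return the names that match a wild card pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]

print(matching("*.cv?"))        # any name, any single final character
print(matching("lesson?.cv?"))  # exactly one character after "lesson"
```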


Windows

Windows is a popular operating system, written and sold by Microsoft Corporation. Smart Characters works on the following versions of Windows: 3.1 (enhanced mode), 3.11, Windows for Workgroups 3.11, NT, and Windows 95.

Windows Clipboard

The Windows clipboard is a Windows function to copy and paste text and objects between and within applications. Smart Characters uses its own internal clipboard(4- 3) window and font clipboard for speed. The Edit Paste(3- 9) command pastes text from the internal clipboard(4- 3) window. The Edit Copy(3- 9) command copies text to the clipboard(4- 3) window, and exports text, text objects(4- 1), and graphics to the Windows clipboard for importing into other applications.

Windows Directory

The Windows directory is the directory containing the version of and Win.ini which launched the currently-running version of Windows. Most Windows directories are named c:\Windows, but for those who run multiple versions of Windows, it is convenient to name the Windows directory after the version name followed by the language name (e.g., d:\Wfw311jp, or \W95us).

ZIP Archive File

An archive file or archive is a file that contains one or more other files. An archive can hold dozens or hundreds of files and their directory path names. Archives can be compressed to take up much less disk space than the individual files alone. The ZIP archive file format uses a variety of compression schemes to yield high compression ratios, and is widely used on computer bulletin boards(D- - 1). A Zip program makes archives from files, while an UnZip program extracts and expands the original files from an archive.
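
The two roles described above, making an archive and extracting from it, can be sketched with Python's standard zipfile module. The file names and contents are hypothetical, and the archive is built in memory so that the example leaves nothing on disk.

```python
import io
import zipfile

def make_archive(files: dict) -> bytes:
    """Pack {path name: contents} into a compressed ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path_name, contents in files.items():
            archive.writestr(path_name, contents)
    return buffer.getvalue()

def extract_archive(data: bytes) -> dict:
    """Expand a ZIP archive back into {path name: contents}."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}

# Hypothetical files; repetitive text compresses well.
files = {"Chi/lesson1.cv0": b"vocabulary " * 500, "Jpn/readme.txt": b"notes " * 500}
packed = make_archive(files)
print(len(packed), "bytes archived, from", sum(len(c) for c in files.values()))
print(extract_archive(packed) == files)
```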

