Emacs Online Documentation

define-charset

define-charset is a compiled Lisp function in `mule.el'.

(define-charset NAME DOCSTRING &rest PROPS)

Define NAME (symbol) as a charset with DOCSTRING.
The remaining arguments must come in pairs ATTRIBUTE VALUE. ATTRIBUTE
may be any symbol. The following have special meanings, and one of
`:code-offset', `:map', `:subset', `:superset' must be specified.

`:short-name'

VALUE must be a short string to identify the charset. If omitted,
NAME is used.

`:long-name'

VALUE must be a string longer than `:short-name' to identify the
charset. If omitted, the value of the `:short-name' attribute is used.

`:dimension'

VALUE must be an integer 0, 1, 2, or 3, specifying the dimension of
code-points of the charsets. If omitted, it is calculated from the
value of the `:code-space' attribute.

`:code-space'

VALUE must be a vector of length at most 8 specifying the byte code
range of each dimension in this format:
[ MIN-1 MAX-1 MIN-2 MAX-2 ... ]
where MIN-N is the minimum byte value of Nth dimension of code-point,
MAX-N is the maximum byte value of that.

`:min-code'

VALUE must be an integer specifying the minimum code point of the
charset. If omitted, it is calculated from `:code-space'. VALUE may
be a cons (HIGH . LOW), where HIGH is the most significant 16 bits of
the code point and LOW is the least significant 16 bits.

`:max-code'

VALUE must be an integer specifying the maximum code point of the
charset. If omitted, it is calculated from `:code-space'. VALUE may
be a cons (HIGH . LOW), where HIGH is the most significant 16 bits of
the code point and LOW is the least significant 16 bits.

`:iso-final-char'

VALUE must be a character in the range 32 to 127 (inclusive)
specifying the final char of the charset for ISO-2022 encoding. If
omitted, the charset can't be encoded by ISO-2022 based
coding-systems.

`:iso-revision-number'

VALUE must be an integer in the range 0..63, specifying the revision
number of the charset for ISO-2022 encoding.

`:emacs-mule-id'

VALUE must be an integer of 0, 129..255. If omitted, the charset
can't be encoded by coding-systems of type `emacs-mule'.

`:ascii-compatible-p'

VALUE must be nil or t (default nil). If VALUE is t, the charset is
compatible with ASCII, i.e. the first 128 code points map to ASCII.

`:supplementary-p'

VALUE must be nil or t. If the VALUE is t, the charset is
supplementary, which means it is used only as a parent or a
subset of some other charset, or it is provided just for backward
compatibility.

`:invalid-code'

VALUE must be a nonnegative integer that can be used as an invalid
code point of the charset. If the minimum code is 0 and the maximum
code is greater than Emacs's maximum integer value, `:invalid-code'
should not be omitted.

`:code-offset'

VALUE must be an integer added to the index number of a character to
get the corresponding character code.

`:map'

VALUE must be vector or string.

If it is a vector, the format is [ CODE-1 CHAR-1 CODE-2 CHAR-2 ... ],
where CODE-n is a code-point of the charset, and CHAR-n is the
corresponding character code.

If it is a string, it is a name of file that contains the above
information. Each line of the file must be this format:
0xXXX 0xYYY
where XXX is a hexadecimal representation of CODE-n and YYY is a
hexadecimal representation of CHAR-n. A line starting with `#' is a
comment line.

`:subset'

VALUE must be a list:
( PARENT MIN-CODE MAX-CODE OFFSET )
PARENT is a parent charset. MIN-CODE and MAX-CODE specify the range
of characters inherited from the parent. OFFSET is an integer value
to add to a code point of the parent charset to get the corresponding
code point of this charset.

`:superset'

VALUE must be a list of parent charsets. The charset inherits
characters from them. Each element of the list may be a cons (PARENT
. OFFSET), where PARENT is a parent charset, and OFFSET is an offset
value to add to a code point of PARENT to get the corresponding code
point of this charset.

`:unify-map'

VALUE must be vector or string.

If it is a vector, the format is [ CODE-1 CHAR-1 CODE-2 CHAR-2 ... ],
where CODE-n is a code-point of the charset, and CHAR-n is the
corresponding Unicode character code.

If it is a string, it is a name of file that contains the above
information. The file format is the same as what described for `:map'
attribute.