.TH TCS 1
.SH NAME
tcs \- translate character sets
.SH SYNOPSIS
.B tcs
[
.B -slcv
]
[
.B -f
.I ics
]
[
.B -t
.I ocs
]
[
.I file ...
]
.SH DESCRIPTION
.I Tcs
interprets the named
.I file(s)
(standard input by default) as a stream of characters from the
.I ics
character set or format, converts them to runes,
and then converts them into a stream of characters from the
.I ocs
character set or format on the standard output.
The default value for
.I ics
and
.I ocs
is
.BR utf ,
the
.SM UTF
encoding described in
.IR utf (6).
The
.B -l
option lists the character sets known to
.IR tcs .
Processing continues in the face of conversion errors (the
.B -s
option prevents reporting of these errors).
The
.B -c
option forces the output to contain only correctly converted characters;
otherwise,
.B Runeerror
(0xFFFD)
characters will be substituted for
.SM UTF
encoding errors and unknown characters.
.PP
The
.B -v
option generates various diagnostic and summary information on standard error,
or makes the
.B -l
output more verbose.
.PP
.I Tcs
recognizes an ever-changing list of character sets.
In particular, it supports a variety of Russian and Japanese encodings.
Some of the supported encodings are
.TF jis-kanji
.TP
.B utf
The Plan 9
.SM UTF
encoding, known by ISO as UTF-8
.TP
.B utf1
The deprecated original
.SM UTF
encoding from ISO 10646
.TP
.B ascii
7-bit ASCII
.TP
.B 8859-1
Latin-1 (Western European)
.TP
.B 8859-2
Latin-2 (Czech .. Slovak)
.TP
.B 8859-3
Latin-3 (Dutch .. Turkish)
.TP
.B 8859-4
Latin-4 (Scandinavian)
.TP
.B 8859-5
Part 5 (Cyrillic)
.TP
.B 8859-6
Part 6 (Arabic)
.TP
.B 8859-7
Part 7 (Greek)
.TP
.B 8859-8
Part 8 (Hebrew)
.TP
.B 8859-9
Latin-5 (Finnish .. Portuguese)
.TP
.B html
Unicode as encoded by HTML
.TP
.B koi8
KOI-8 (GOST 19769-74)
.TP
.B jis-kanji
ISO 2022-JP
.TP
.B ujis
EUC-JX: JIS 0208
.TP
.B ms-kanji
Microsoft, or Shift-JIS
.TP
.B jis
(from only) guesses among ISO 2022-JP, EUC and Shift-JIS
.TP
.B gb
Chinese national standard (GB2312-80)
.TP
.B big5
Big 5 (HKU version)
.TP
.B unicode
Unicode Standard 1.0
.TP
.B tis
Thai character set plus
.SM ASCII
(TIS 620-1986)
.TP
.B msdos
IBM PC: CP 437
.TP
.B atari
Atari-ST character set
.SH EXAMPLES
.TP
.B tcs -f 8859-1
Convert 8859-1 (Latin-1) characters into
.SM UTF
format.
.TP
.B tcs -s -f jis
Convert characters encoded in one of several JIS encodings into
.SM UTF
format.
Unknown Kanji will be converted into
.B 0xFFFD
characters.
.TP
.B tcs -t html
Convert
.SM UTF
into character set-independent HTML.
.TP
.B tcs -lv
Print an up-to-date list of the supported character sets.
.SH SOURCE
.B /sys/src/cmd/tcs
.SH SEE ALSO
.IR ascii (1),
.IR rune (2),
.IR utf (6).