MirOS Manual: Encode::TW(3p)


ext::Encode::TW::TW(3p)  Perl Programmers Reference Guide  ext::Encode::TW::TW(3p)

NAME

     Encode::TW - Taiwan-based Chinese Encodings

SYNOPSIS

         use Encode qw/encode decode/;
         $big5 = encode("big5", $utf8); # loads Encode::TW implicitly
         $utf8 = decode("big5", $big5); # ditto

DESCRIPTION

     This module implements traditional Chinese charset encodings
     as used in Taiwan and Hong Kong. Encodings supported are as
     follows.

       Canonical   Alias             Description
       --------------------------------------------------------------------
       big5-eten   /\bbig-?5$/i      Big5 encoding (with ETen extensions)
                   /\bbig5-?et(en)?$/i
                   /\btca-?big5$/i
       big5-hkscs  /\bbig5-?hk(scs)?$/i
                   /\bhk(scs)?-?big5$/i
                                     Big5 + Cantonese characters in Hong Kong
       MacChineseTrad                Big5 + Apple Vendor Mappings
       cp950                         Code Page 950
                                     = Big5 + Microsoft vendor mappings
       --------------------------------------------------------------------

     To find out how to use this module in detail, see Encode.
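
     As a minimal sketch of how the Alias column above behaves in
     practice, the following resolves a few sample names to their
     canonical encodings with Encode::resolve_alias. The sample names
     and the %canonical hash are illustrative choices, not part of
     the module.

```perl
use strict;
use warnings;
use Encode qw(resolve_alias);

# Sample spellings chosen to match the alias patterns in the table above.
my @names = qw(big5 Big-5 big5-et tca-big5 big5-hk hkscs-big5 cp950 MacChineseTrad);

my %canonical;
for my $name (@names) {
    # resolve_alias returns the canonical name, or a false value if the
    # name is not recognised.
    my $canon = resolve_alias($name);
    $canonical{$name} = $canon;
    printf "%-16s => %s\n", $name, $canon || '(not found)';
}
```

     Names that do not match any alias pattern come back as a false
     value, so the result should be checked before it is passed to
     encode or decode.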

NOTES

     Due to size concerns, "EUC-TW" (Extended Unix Character),
     "CCCII" (Chinese Character Code for Information
     Interchange), "BIG5PLUS" (CMEX's Big5+) and "BIG5EXT" (CMEX's
     Big5e) are distributed separately on CPAN, under the name
     Encode::HanExtra. That module also contains extra
     China-based encodings.

BUGS

     Since the original "big5" encoding (1984) is not supported
     anywhere (glibc and DOS-based systems use "big5" to mean
     "big5-eten"; Microsoft uses "big5" to mean "cp950"), a
     conscious decision was made to alias "big5" to "big5-eten",
     which is the de facto superset of the original big5.
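
     The aliasing described above can be checked with a short
     sketch: encoding the same string under "big5" and under
     "big5-eten" should give identical octets. The sample text
     (U+4E2D U+6587) and the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+4E2D U+6587, written as escapes so the example does not depend on
# the encoding of this source file.
my $text = "\x{4E2D}\x{6587}";

my $via_alias     = encode("big5",      $text);
my $via_canonical = encode("big5-eten", $text);

# Both calls should yield the same octets, since "big5" is an alias.
my $same_bytes = $via_alias eq $via_canonical ? 1 : 0;

# Decoding the octets should give back the original characters.
my $round_trip = decode("big5", $via_alias) eq $text ? 1 : 0;

printf "octets: %s\n", join " ", map { sprintf "%02X", ord } split //, $via_alias;
print "alias and canonical agree: $same_bytes\n";
print "round trip ok: $round_trip\n";
```

     Code that needs Microsoft's interpretation of "big5" should
     request "cp950" explicitly instead of relying on the alias.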

     The "CNS11643" encoding files are not complete. For common
     "CNS11643" manipulation, please use "EUC-TW" in
     Encode::HanExtra, which contains planes 1-7.

     The ASCII region (0x00-0x7f) is preserved for all encodings,
     even though this conflicts with mappings by the Unicode
     Consortium.  See

     <http://www.debian.or.jp/~kubota/unicode-symbols.html.en>

     to find out why it is implemented that way.
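
     A minimal sketch of what ASCII preservation means: decoding
     each octet in 0x00-0x7f under the four canonical encodings
     should yield the code point with the same value. The
     %mismatches hash is an illustrative name; a count of zero for
     an encoding indicates that the ASCII region passed through
     unchanged.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Canonical names from the table in DESCRIPTION.
my @encodings = qw(big5-eten big5-hkscs MacChineseTrad cp950);

# All 128 octets of the ASCII region.
my $ascii = join "", map { chr } 0x00 .. 0x7f;

my %mismatches;
for my $enc (@encodings) {
    my $octets  = $ascii;                  # work on a copy
    my @chars   = split //, decode($enc, $octets);
    $mismatches{$enc} = 0;
    for my $i (0x00 .. 0x7f) {
        # Count every position whose decoded code point differs from the octet.
        $mismatches{$enc}++
            unless defined $chars[$i] && ord($chars[$i]) == $i;
    }
    print "$enc: $mismatches{$enc} mismatches\n";
}
```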

SEE ALSO

     Encode

perl v5.8.8                2005-02-05                           2
