CJK Unified Ideographs

CJKV character in traditional and simplified Chinese, Korean, Vietnamese and Japanese forms

The Chinese, Japanese and Korean (CJK) scripts share a common background in Chinese characters, which in this context are collectively known as CJK characters. During the process called Han unification, the characters shared among these scripts were identified and named CJK Unified Ideographs. As of Unicode 17.0, Unicode defines a total of 101,996 CJK unified ideographs.[1]
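The 101,996 figure can be cross-checked against the per-block counts given later in this article. The sketch below simply adds them up. The name `BLOCK_COUNTS` is illustrative, and the numbers are the ones quoted in this article for Unicode 17.0, not values read from the Unicode Character Database.

```python
# Per-block counts of CJK unified ideographs, as quoted in this article
# for Unicode 17.0 (assumption: copied from the text, not from the UCD).
BLOCK_COUNTS = {
    "CJK Unified Ideographs (URO)": 20_992,
    "Extension A": 6_592,
    "Extension B": 42_720,
    "Extension C": 4_160,
    "Extension D": 222,
    "Extension E": 5_774,
    "Extension F": 7_473,
    "Extension G": 4_939,
    "Extension H": 4_192,
    "Extension I": 622,
    "Extension J": 4_298,
    # Only twelve characters of the compatibility block are unified ideographs.
    "CJK Compatibility Ideographs (unified subset)": 12,
}

total = sum(BLOCK_COUNTS.values())
print(total)  # 101996
```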

The term ideographs is a misnomer, as the Chinese script is not ideographic but rather logographic.[citation needed]

Until the early 20th century, Vietnam also used Chinese characters (Chữ Nôm), so sometimes the abbreviation CJKV is used.

Sources

The Ideographic Research Group (IRG) is responsible for developing extensions to the encoded repertoires of CJK unified ideographs. The IRG processes proposals for new CJK unified ideographs submitted by its member bodies and, after several rounds of expert review, submits a consolidated set of characters to ISO/IEC JTC 1/SC 2 Working Group 2 (WG2) and the Unicode Technical Committee (UTC) for consideration for inclusion in the ISO/IEC 10646 and Unicode standards. The following IRG member bodies have been involved in the standardization of CJK unified ideographs:

The ideographs submitted by the UTC and the United Kingdom are not specific to any particular region, but are characters which have been suggested for encoding by individual experts. The ideographs submitted by SAT are required for the SAT Daizōkyō text database.

The table below gives the numbers of encoded CJK unified ideographs for each IRG source for Unicode 17.0.[4] The total number of characters (267,742) far exceeds the number of encoded CJK unified ideographs (101,996) as many characters have more than one source.

CJK unified ideographs by source
Country or region Character count
China 69,724
Hong Kong 17,654
Macau 344
Taiwan (TCA) 59,570
Japan 52,560
South Korea 21,358
North Korea 23,975
Vietnam 14,276
United Kingdom 3,409
SAT 3,714
SATM 1,157
UTC 1,026
Total 267,742

UTC sources

The majority of characters submitted by the UTC to the IRG are derived from Unicode Technical Committee (UTC) documents.[5] Other sources include:

Ordering

The ordering of CJK Unified Ideographs within the earlier Unicode blocks (not counting characters added to a block later) was initially determined by consulting the following four dictionaries. Characters were arranged primarily in Kangxi Dictionary order. For characters not found in the Kangxi Dictionary, the other dictionaries were consulted, in the order listed, to determine which Kangxi Dictionary character they should follow.[6]

  1. Kangxi Dictionary
  2. Dai Kan-Wa Jiten
  3. Hanyu Da Zidian
  4. Dae Jaweon

This system is not used for more recently added Unicode blocks. The Ideographic Research Group no longer uses the Dae Jaweon,[7] nor the Dai Kan-Wa Jiten,[8] in its work. The Kangxi Dictionary and Hanyu Da Zidian are still used,[7] both in existing character source references[9] and as potential replacements for existing source references discovered to be erroneous.[10] Similarly, although a (real or virtual) Kangxi Dictionary index was previously provided as part of the submission data for UTC-source characters, this is no longer the case.[11] Instead, the stroke type of the first residual stroke (the first stroke that does not form part of the radical) is supplied with all submitted characters, and is used to order characters with the same radical and stroke count within the new Unicode block.[12]
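The ordering rule for newer blocks can be pictured as a three-part sort key: radical, then stroke count, then the type of the first residual stroke. The following is only a sketch under stated assumptions: the `Candidate` records are hypothetical placeholders rather than real submissions, and representing the stroke type as a small integer is an assumption made for illustration.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    label: str             # hypothetical identifier, not a real submission
    radical: int           # Kangxi radical number (1-214)
    residual_strokes: int  # strokes that are not part of the radical
    first_stroke: int      # type of first residual stroke (assumed integer code)

def ordering_key(c: Candidate) -> tuple:
    # Radical first, then stroke count, then first residual stroke type.
    return (c.radical, c.residual_strokes, c.first_stroke)

# Hypothetical sample data, deliberately out of order.
candidates = [
    Candidate("c", 85, 4, 3),
    Candidate("a", 9, 5, 1),
    Candidate("d", 85, 4, 1),
    Candidate("b", 9, 2, 4),
]

ordered = sorted(candidates, key=ordering_key)
print([c.label for c in ordered])  # ['b', 'a', 'd', 'c']
```

The first residual stroke only matters as a tie-breaker: "d" and "c" share radical 85 and four residual strokes, so their stroke types decide their relative order.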

CJK Unified Ideographs blocks

CJK Unified Ideographs

The basic block named CJK Unified Ideographs (4E00–9FFF) contains 20,992 basic Chinese characters in the range U+4E00 through U+9FFF. The block includes not only characters used in the Chinese writing system but also kanji used in the Japanese writing system, hanja used in Korean, and chữ Nôm characters used in Vietnamese. Many characters in this block are used in all of these writing systems, while others are used in only some of them.

This block is also known as the Unified Repertoire and Ordering (URO), especially when it needs to be differentiated from the other CJK Unified Ideographs blocks.[13]

The first 20,902 characters in the block are arranged according to the Kangxi Dictionary ordering of radicals. In this system the characters written with the fewest strokes are listed first. The remaining characters were added later, and so are not in radical order.
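The block boundaries and the size of the originally ordered portion follow from simple arithmetic on the code points. The sketch below uses only the range and counts stated above. The helper names `in_uro` and `in_original_ordering` are illustrative.

```python
URO_START, URO_END = 0x4E00, 0x9FFF
ORIGINAL_COUNT = 20_902  # characters arranged in Kangxi radical order

def in_uro(ch: str) -> bool:
    """True if ch lies in the basic CJK Unified Ideographs block."""
    return URO_START <= ord(ch) <= URO_END

def in_original_ordering(ch: str) -> bool:
    """True if ch is among the first 20,902 code points of the block."""
    return URO_START <= ord(ch) < URO_START + ORIGINAL_COUNT

print(URO_END - URO_START + 1)               # 20992
print(hex(URO_START + ORIGINAL_COUNT - 1))   # 0x9fa5
print(in_uro("\u4e2d"), in_original_ordering("\u9fc3"))  # True False
```

The last code point of the ordered portion works out to U+9FA5, so anything from U+9FA6 onward belongs to the later additions.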

The block is the result of Han unification,[14] which was somewhat controversial within East Asia.[15] Because single characters used in more than one of Chinese, Japanese and Korean were coded in the same location, and because modern typographical conventions and handwriting curricula differ slightly between regions (not necessarily along language boundaries; for example, Hong Kong and Taiwan, which both use Traditional Chinese, have slightly different local conventions),[16] the appearance of a selected glyph can depend on the particular font being used. However, the URO applies the source separation rule, meaning that pairs of characters treated as distinct in a character set used as a source for the URO (for example JIS X 0208, as used in encodings such as Shift JIS) remain pairs of separate characters in Unicode.[17]

Using variation selectors, it is possible to specify certain variant CJK ideographs within Unicode.[18] The Adobe-Japan1 character set, which has 14,684 ideographic variation sequences,[19] is an extreme example of the use of variation selectors.[20]
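In text, an ideographic variation sequence is a base ideograph followed by one variation selector from the range U+E0100 through U+E01EF. The sketch below pairs each base character with a following selector, if any. The function name `split_ivs` is illustrative, and the sample sequence (U+845B followed by U+E0100) is assumed to be a registered sequence; whether a variant glyph is actually displayed depends on the font.

```python
# Variation selectors used for ideographic variation sequences
# (VS17 through VS256) occupy U+E0100..U+E01EF.
IVS_FIRST, IVS_LAST = 0xE0100, 0xE01EF

def split_ivs(text: str) -> list:
    """Return (base, selector_or_None) pairs for each character in text."""
    result = []
    for ch in text:
        is_selector = IVS_FIRST <= ord(ch) <= IVS_LAST
        if is_selector and result and result[-1][1] is None:
            # Attach the selector to the preceding base character.
            base, _ = result[-1]
            result[-1] = (base, ch)
        else:
            result.append((ch, None))
    return result

# Assumed example: U+845B followed by VS17 (U+E0100).
sample = "\u845B\U000E0100"
pairs = split_ivs(sample)
print(len(sample))             # 2 code points
print(hex(ord(pairs[0][1])))   # 0xe0100
```

Keeping the selector attached to its base matters for operations such as truncation or reversal, which would otherwise separate the two code points.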

Charts

4E00-62FF, 6300-77FF, 7800-8CFF, 8D00-9FFF.

Sources

Note: Most characters appear in multiple sources, so the sum of individual character counts (108,493) is far greater than the number of encoded characters (20,992).[21]

Country or region Code Source[22] Character count Total
China G0 GB 2312-80 6,763 20,938
G1 GB 12345-90 (Traditional Chinese analogue to GB 2312-80) 2,202
G3 GB 13131 (unpublished Traditional Chinese analogue to GB 7589-87) 4,833
G5 GB 13132 (unpublished Traditional Chinese analogue to GB 7590-87) 2,843
G7 Modern Chinese general character chart (Simplified Chinese: 现代汉语通用字表) 42
G8 GB 8565-88 203
GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 6
GCE National Academy for Educational Research 4
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 2
GE GB 16500-95 3,767
GFC Modern Chinese Standard Dictionary (现代汉语规范词典第二版) 2
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 6
GGT Characters collected by the National Library of China (中国国家图书馆) 1
GH GB/T 15564-1995 59
GHZ Hanyu Da Zidian (漢語大字典) 1
GHZR Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版) 29
GK GB 12052-89 89
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 16
GKX Kangxi Dictionary (康熙字典) 43
GLK Longkan Shoujian (龍龕手鑑) 1
GT Standard Telegraph Codebook (revised), 1983 16
GU No source (the original source reference may have been moved) 8
GWY Cultural Heritage Ideographs (文化遗产用字) 1
GZFY Hanyu Fangyan Dacidian (汉语方言大词典) 1
Hong Kong H Hong Kong Supplementary Character Set, 2008 2,292 15,376
HB0 Computer Chinese Glyph and Character Code Mapping Table, Technical Report C-26 (電腦用中文字型與字碼對照表, 技術通報C-26) 9
HB1 Big-5, Level 1 5,401
HB2 Big-5, Level 2 7,650
HD Hong Kong Supplementary Character Set, 2016 24
Japan J0 JIS X 0208-1990 6,356 18,249
J1 JIS X 0212-1990 3,058
J13 JIS X 0213:2004 level-3 characters replacing J1 characters 1,037
J13A JIS X 0213:2004 level-3 character addendum from JIS X 0213:2000 level-3 replacing J1 character 2
J14 JIS X 0213:2004 level-4 characters replacing J1 characters 1,704
J3 JIS X 0213:2004 Level 3 95
J3A JIS X 0213:2004 Level 3 addendum 7
J4 JIS X 0213:2004 Level 4 301
JARIB ARIB STD-B24 3
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 5,686
South Korea K0 KS C 5601-87 (now KS X 1001:2004) 4,620 15,450
K1 KS C 5657-91 (now KS X 1002:2001) 2,855
K2 PKS C 5700-1:1994 (now KS X 1027-1:2011) 7,911
K3 PKS C 5700-2:1994 (now KS X 1027-2:2011) 1
K4 PKS C 5700-3:1998 (now KS X 1027-3:2011) 4
K6 KS X 1027-5:2014 57
KC Korean History On-Line (한국 역사 정보 통합 시스템) 1
KU No source (the original source reference may have been moved) 1
North Korea KP0 KPS 9566-97 4,652 15,008
KP1 KPS 10721-2000 10,356
 Macau MA HKSCS-2008 29 200
MB1 Big Five 10
MB2 Big Five 7
MC MCSCS Reference 3
MD MCSCS horizontal extensions 127
MDH MCSCS horizontal extensions 24
Taiwan T1 CNS 11643-1992 plane 1 5,413 18,385
T2 CNS 11643-1992 plane 2 7,651
T3 CNS 11643-1992 plane 3 4,145
T4 CNS 11643-1992 plane 4 893
T5 CNS 11643-1992 plane 5 64
T6 CNS 11643-1992 plane 6 31
T7 CNS 11643-1992 plane 7 16
TB CNS 11643-2007 plane 11 2
TC CNS 11643-2007 plane 12 2
TE CNS 11643-2007 plane 14 9
TF CNS 11643-2007 plane 15 159
Vietnam V0 TCVN 5773:1993 598 4,808
V1 TCVN 6056:1995 3,305
V2 VHN 01-1998 759
V3 VHN 02-1998 91
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 19
VN Vietnamese horizontal extensions 36
N/A UTC UTC sources 79 79

In Unicode 4.1, 14 HKSCS-2004 characters and 8 GB 18030 characters were assigned to code points U+9FA6 through U+9FBB. Since then, further characters have been added to this block for various reasons, all summarized in the version history section below.

CJK Unified Ideographs Extension A

The block named CJK Unified Ideographs Extension A (3400–4DBF) contains 6,592 additional characters in the range U+3400 through U+4DBF.

Charts

3400-4DBF.

Sources

Note: Most characters appear in more than one source, so the sum of individual character counts (23,997) is far greater than the number of encoded characters (6,592).[21]

Country or region Code Source[22] Character count Total
 China G3 GB 13131 (unpublished Traditional Chinese analogue to GB 7589-87) 2,390 6,230
G5 GB 13132 (unpublished Traditional Chinese analogue to GB 7590-87) 1,226
G7 Modern Chinese general character chart 120
GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 12
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 7
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 2
GHZ Hanyu Da Zidian (漢語大字典) 341
GHZR Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版) 1
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 3
GKX Kangxi Dictionary (康熙字典) 1,889
GS Singapore Chinese characters[note 1] 226
GWY Cultural Heritage Ideographs (文化遗产用字) 1
GZ Ancient Zhuang Character Dictionary (古壮字字典) 12
Hong Kong H Hong Kong Supplementary Character Set, 2008 572 572
 Japan J3 JIS X 0213:2004 Level 3 2 5,856
J4 JIS X 0213:2004 Level 4 78
JA Japanese IT Vendors Contemporary Ideographs, 1993 574
JA3 JIS X 0213:2004 level-3 characters replacing JA characters 17
JA4 JIS X 0213:2004 level-4 characters replacing JA characters 67
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 5,118
 South Korea K3 PKS C 5700-2:1994 (now KS X 1027-2:2011) 1,833 1,876
K4 PKS C 5700-3:1998 (now KS X 1027-3:2011) 2
K6 KS X 1027-5:2014 37
KC Korean History On-Line (한국 역사 정보 통합 시스템) 3
KU No source (the original source reference may have been moved) 1
 North Korea KP0 KPS 9566-97 1 3,191
KP1 KPS 10721-2000 3,190
 Macau MA HKSCS-2008 4 12
MD MCSCS horizontal extensions 8
Taiwan T3 CNS 11643-1992 plane 3 2,179 5,917
T4 CNS 11643-1992 plane 4 2,920
T5 CNS 11643-1992 plane 5 400
T6 CNS 11643-1992 plane 6 200
T7 CNS 11643-1992 plane 7 133
TE CNS 11643-2007 plane 14 1
TF CNS 11643-2007 plane 15 84
United Kingdom UK IRG N2107R2 3 3
 Vietnam V0 TCVN 5773:1993 140 319
V2 VHN 01-1998 149
V3 VHN 02-1998 19
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 5
VN Vietnamese horizontal extensions 6
N/A UTC UTC sources 21 21

CJK Unified Ideographs Extension B

The block named CJK Unified Ideographs Extension B (20000–2A6DF) contains 42,720 characters in the range U+20000 through U+2A6DF. These include most of the characters used in the Kangxi Dictionary that are not in the basic CJK Unified Ideographs block, as well as many Hán-Nôm characters that were formerly used to write Vietnamese.

Charts

20000-215FF, 21600-230FF, 23100-245FF, 24600-260FF, 26100-275FF, 27600-290FF, 29100-2A6DF.

Sources

Note: Many characters appear in more than one source, so the sum of individual character counts (100,887) is far greater than the number of encoded characters (42,720).[21]

Country or region Code Source[22] Character count Total
 China G3 GB 13131 (unpublished Traditional Chinese analogue to GB 7589-87) 1 31,345
G4K Siku Quanshu (四庫全書) 474
GBK Encyclopedia of China (中國大百科全書) 59
GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 78
GCESI Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院) 102
GCH Cihai (辞海) 247
GCY Ciyuan (辭源) 66
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 146
GFZ Founder Press System 65
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 5
GHC Hanyu Da Cidian (漢語大詞典) 553
GHF Hanwen fodian yinan suzi huishi yu yanjiu (漢文佛典疑難俗字彙釋與研究) 1
GHZ Hanyu Da Zidian (漢語大字典) 10,506
GHZR Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版) 4
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 17
GKX Kangxi Dictionary (康熙字典) 18,472
GU No source (the original source reference may have been moved) 73
GWY Cultural Heritage Ideographs (文化遗产用字) 12
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 8
GZ Ancient Zhuang Character Dictionary (古壮字字典) 453
GZFY Hanyu Fangyan Dacidian (汉语方言大词典) 3
Hong Kong H Hong Kong Supplementary Character Set, 2008 1,703 1,703
 Japan J3 JIS X 0213:2004 Level 3 25 25,745
J3A JIS X 0213:2004 Level 3 addendum 1
J4 JIS X 0213:2004 Level 4 277
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 25,442
 South Korea K1 KS C 5657-91 (now KS X 1002:2001) 1 683
K4 PKS C 5700-3:1998 (now KS X 1027-3:2011) 166
K6 KS X 1027-5:2014 502
KC Korean History On-Line (한국 역사 정보 통합 시스템) 14
 North Korea KP1 KPS 10721-2000 5,766 5,766
 Macau MA HKSCS-2008 9 38
MC MCSCS Reference 2
MD MCSCS horizontal extensions 27
Taiwan T3 CNS 11643-1992 plane 3 28 30,212
T4 CNS 11643-1992 plane 4 3,408
T5 CNS 11643-1992 plane 5 8,114
T6 CNS 11643-1992 plane 6 5,942
T7 CNS 11643-1992 plane 7 6,299
TA CNS 11643-2007 plane 10 12
TB CNS 11643-2007 plane 11 7
TC CNS 11643-2007 plane 12 1
TF CNS 11643-2007 plane 15 6,401
 United Kingdom UK IRG N2107R2 12 12
 Vietnam V0 TCVN 5773:1993 1,570 5,299
V1 TCVN 6056:1995 1
V2 VHN 01-1998 2,286
V3 VHN 02-1998 422
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 33
VN Vietnamese horizontal extensions 987
Buddhist canon SAT SAT Daizōkyō Text Database 1 1
N/A UTC UTC sources 83 83

CJK Unified Ideographs Extension C

The block named CJK Unified Ideographs Extension C (2A700–2B73F) contains 4,160 characters in the range U+2A700 through U+2B73F. It was initially added in Unicode 5.2 (2009).

Charts

2A700-2B73F.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (4,967) is greater than the number of encoded characters (4,160).[21]

Country or region Code Source[22] Character count Total
 China GBK Encyclopedia of China (中國大百科全書) 74 1,456
GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 12
GCESI Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院) 117
GCH Cihai (辞海) 264
GCY Ciyuan (辭源) 1
GCYY Chinese Academy of Surveying and Mapping ideographs 55
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 83
GFZ Founder Press System 1
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 2
GGH Gudai Hanyu Cidian (古代汉语词典) 51
GHC Hanyu Da Cidian (漢語大詞典) 14
GHZ Hanyu Da Zidian (漢語大字典) 1
GHZR Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版) 1
GJZ Commercial Press ideographs 61
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 8
GWY Cultural Heritage Ideographs (文化遗产用字) 2
GKX Kangxi Dictionary (康熙字典) 6
GXC Xiandai Hanyu Cidian (现代汉语词典) 25
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 2
GZ Ancient Zhuang Character Dictionary (古壮字字典) 109
GZFY Hanyu Fangyan Dacidian (汉语方言大词典) 202
GZJW Yin Zhou Jinwen Jicheng Yinde (殷周金文集成引得) 365
Hong Kong H Hong Kong Supplementary Character Set, 2008 1 1
 Japan JK Japanese Kokuji Collection (Mojikyō subset) 367 431
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 64
 South Korea K5 Korean IRG Hanja Character Set (later became KS X 1027-4:2011) 404 407
K6 KS X 1027-5:2014 2
KC Korean History On-Line (한국 역사 정보 통합 시스템) 1
 North Korea KP1 KPS 10721-2000 9 9
 Macau MC MCSCS Reference 17 21
MD MCSCS horizontal extensions 4
Taiwan T4 CNS 11643-1992 plane 4 1 1,757
T5 CNS 11643-1992 plane 5 1
T6 CNS 11643-1992 plane 6 2
TB CNS 11643-2007 plane 11 2
TC CNS 11643-2007 plane 12 634
TD CNS 11643-2007 plane 13 766
TE CNS 11643-2007 plane 14 350
TU No source (the original source reference may have been moved) 1
 United Kingdom UK IRG N2107R2 1 1
 Vietnam V0 TCVN 5773:1993 4 795
V1 TCVN 6056:1995 2
V2 VHN 01-1998 1
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 782
VN Vietnamese horizontal extensions 6
N/A UTC UTC sources 89 89

CJK Unified Ideographs Extension D

The block named CJK Unified Ideographs Extension D (2B740–2B81F) contains 222 characters in the range U+2B740 through U+2B81D that were added in Unicode 6.0 (2010).

Charts

2B740–2B81F.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (260) is greater than the number of encoded characters (222).[21]

Country or region Code Source[22] Character count Total
 China GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 12 99
GCESI Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院) 6
GCH Cihai (辞海) 1
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 1
GIDC ID System of the Ministry of Public Security of China 9
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 2
GXC Xiandai Hanyu Cidian (现代汉语词典) 4
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 22
GZ Ancient Zhuang Character Dictionary (古壮字字典) 3
GZH Zhonghua Zihai (中华字海) 39
 Japan JH Hanyo-Denshi Program (汎用電子情報交換環境整備プログラム) 117 117
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 10
Taiwan TB CNS 11643-2007 plane 11 24 24
N/A UTC UTC sources 20 20

CJK Unified Ideographs Extension E

The block named CJK Unified Ideographs Extension E (2B820–2CEAF) contains 5,774 characters in the range U+2B820 through U+2CEAD. It was originally added in Unicode 8.0 (2015).

Charts

2B820–2CEAF.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (6,272) is greater than the number of encoded characters (5,774).[21]

Country or region Code Source[22] Character count Total
 China GBK Encyclopedia of China (中國大百科全書) 11 3,173
GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 20
GCESI Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院) 211
GCH Cihai (辞海) 112
GCY Ciyuan (辭源) 3
GCYY Chinese Academy of Surveying and Mapping ideographs 98
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 10
GDZ Geology Press ideographs 1
GGFZ Tongyong Guifan Hanzi Zidian (通用规范汉字字典) 4
GGH Gudai Hanyu Cidian (古代汉语词典) 175
GGT Characters collected by the National Library of China (中国国家图书馆) 2
GHC Hanyu Da Cidian (漢語大詞典) 7
GIDC ID System of the Ministry of Public Security of China 37
GJZ Commercial Press ideographs 147
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 2
GKX Kangxi Dictionary (康熙字典) 22
GRM People's Daily ideographs 3
GU No source (the original source reference may have been removed) 3
GWY Cultural Heritage Ideographs (文化遗产用字) 1
GWZ Hanyu Da Cidian Press ideographs 12
GXC Xiandai Hanyu Cidian (现代汉语词典) 57
GXH Xinhua Zidian (新华字典) 4
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 1
GZ Ancient Zhuang Character Dictionary (古壮字字典) 107
GZFY Hanyu Fangyan Dacidian (汉语方言大词典) 712
GZHSJ Characters collected by the Zhonghua Book Company (中华书局) 1
GZJW Yin Zhou Jinwen Jicheng Yinde (殷周金文集成引得) 1,410
Hong Kong HD Hong Kong Supplementary Character Set, 2016 1 1
 Japan JK Japanese Kokuji Collection (Mojikyō subset) 415 503
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 88
 South Korea KC Korean History On-Line (한국 역사 정보 통합 시스템) 7 7
Macau MC MCSCS Reference 48 51
MD MCSCS horizontal extensions 3
Taiwan T3 CNS 11643-1992 plane 3 2 1,261
TB CNS 11643-2007 plane 11 2
TC CNS 11643-2007 plane 12 323
TD CNS 11643-2007 plane 13 595
TE CNS 11643-2007 plane 14 339
 United Kingdom UK IRG N2107R2 2 2
 Vietnam V0 TCVN 5773:1993 7 1,037
V2 VHN 01-1998 1
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 1,023
VN Vietnamese horizontal extensions 6
N/A UTC UTC sources 237 237

CJK Unified Ideographs Extension F

The block named CJK Unified Ideographs Extension F (2CEB0–2EBEF) contains 7,473 characters in the range U+2CEB0 through U+2EBE0 that were added in Unicode 10.0 (2017). It includes more than 1,000 Sawndip characters for Zhuang.

Charts

2CEB0–2EBEF.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (8,015) is greater than the number of encoded characters (7,473).[21]

Country or region Code Source[22] Character count Total
 China GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 46 1,546
GCESI Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院) 73
GCY Ciyuan (辭源) 122
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 31
GFC Modern Chinese Standard Dictionary (现代汉语规范词典第二版) 27
GIDC ID System of the Ministry of Public Security of China 1
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 5
GLGYJ Zhuang Liao Songs Research (壮族嘹歌研究) 1
GOCD Oxford English-Chinese Chinese-English Dictionary (牛津英汉汉英词典) 2
GPGLG Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌) 69
GWY Cultural Heritage Ideographs (文化遗产用字) 6
GXHZ Xinhua Da Zidian (新华大字典) 51
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 2
GZ Ancient Zhuang Character Dictionary (古壮字字典) 1,075
GZJW Yin Zhou Jinwen Jicheng Yinde (殷周金文集成引得) 33
GZYS Chinese Ancient Ethnic Characters Research (中国民族古文字研究) 2
Hong Kong HD Hong Kong Supplementary Character Set, 2016 1 1
 Japan JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 1,646 1,646
 South Korea KC Korean History On-Line (한국 역사 정보 통합 시스템) 1,810 1,810
 Macau MC MCSCS Reference 22 22
Taiwan T3 CNS 11643-1992 plane 3 1 6
T6 CNS 11643-1992 plane 6 2
T7 CNS 11643-1992 plane 7 2
T6 CNS 11643-1992 plane 6 1
TC CNS 11643-2007 plane 12 1
 United Kingdom UK IRG N2107R2 2 2
Vietnam V0 TCVN 5773:1993 1 17
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 8
VN Vietnamese horizontal extensions 8
Buddhist canon SAT SAT Daizōkyō Text Database 2,884 2,884
N/A UTC UTC sources 81 81

CJK Unified Ideographs Extension G

A block named CJK Unified Ideographs Extension G was added as part of Unicode 13.0 to the Tertiary Ideographic Plane in the range U+30000 through U+3134F, containing 4,939 characters.[24]
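The plane that a block belongs to follows directly from the code point: each plane spans 0x10000 code points, so the plane number is the code point shifted right by 16 bits. A minimal sketch, in which `plane_of` and `PLANE_NAMES` are illustrative names and the table covers only the planes relevant to this article:

```python
# Planes that contain CJK unified ideographs.
PLANE_NAMES = {
    0: "Basic Multilingual Plane",
    2: "Supplementary Ideographic Plane",
    3: "Tertiary Ideographic Plane",
}

def plane_of(code_point: int) -> int:
    """Return the Unicode plane number (0-16) of a code point."""
    return code_point >> 16

# First code points of the URO, Extension B and Extension G.
for cp in (0x4E00, 0x20000, 0x30000):
    print(f"U+{cp:04X}", PLANE_NAMES[plane_of(cp)])
```

By this rule the whole Extension G range, U+30000 through U+3134F, falls in plane 3.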

Charts

30000–3134F.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (5,239) is greater than the number of encoded characters (4,939).[21]

Country or region Code Source[22] Character count Total
 China GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 69 2,082
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 49
GHZR Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版) 878
GPGLG Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌) 13
GWY Cultural Heritage Ideographs (文化遗产用字) 11
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 11
GZ Ancient Zhuang Character Dictionary (古壮字字典) 2,239
 South Korea KC Korean History On-Line (한국 역사 정보 통합 시스템) 435 435
 Taiwan T13 CNS 11643 (pending new version) plane 19 347 354
T5 CNS 11643-1992 plane 5 1
TB CNS 11643-2007 plane 11 3
TC CNS 11643-2007 plane 12 2
TD CNS 11643-2007 plane 13 1
 United Kingdom UK IRG N2107R2 1,566 1,566
Vietnam V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 6 76
VN Vietnamese horizontal extensions 70
Buddhist canon SAT SAT Daizōkyō Text Database 329 329
N/A UTC UTC sources 240 240

CJK Unified Ideographs Extension H

A block named CJK Unified Ideographs Extension H was added as part of Unicode 15.0 to the Tertiary Ideographic Plane in the range U+31350 through U+323AF, containing 4,192 characters.[25]

Charts

31350–323AF.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (4,541) is greater than the number of encoded characters (4,192).[21]

Country or region Code Source[22] Character count Total
 China GCA Culture and Art Publishing House Ideographs (文化艺术出版社用字) 9 1,059
GCESI Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院) 1
GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 128
GHC Hanyu Da Cidian (漢語大詞典) 27
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 30
GLGYJ Zhuang Liao Songs Research (壮族嘹歌研究) 11
GPGLG Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌) 14
GU No source (the original source reference may have been moved) 1
GWY Cultural Heritage Ideographs (文化遗产用字) 5
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 216
GZ Ancient Zhuang Character Dictionary (古壮字字典) 330
GZA-1 A Vibrant and Unbroken Transmission—Filial Piety and Zhuang Funeral Songs (生生不息的传承•孝与壮族行孝歌之研究) 6
GZA-2 Annotated Long Zhuang Morality Songs (壮族伦理道德长诗传扬歌译注) 38
GZA-3 Compendium of Old Zhuang Folksong Texts—Wooing Songs vol. 1—Liao Songs (壮族民歌古籍集成•情歌(一)嘹歌) 2
GZA-4 Compendium of Old Zhuang Folksong Texts—Wooing Songs vol. 2—Fwen Nganx (壮族民歌古籍集成•情歌(二)欢𭪤) 11
GZA-6 Zhuang Proverbs from China (中国壮族谚语) 59
GZA-7 Ancient Remembrance—Zhuang Creation Myth Songs (远古的追忆•壮族创世神话古歌研究) 1
 South Korea KC Korean History On-Line (한국 역사 정보 통합 시스템) 512 512
 North Korea KP1 KPS 10721-2000 1 1
 Taiwan T12 CNS 11643 (pending new version) plane 18 7 716
T13 CNS 11643 (pending new version) plane 19 696
T4 CNS 11643-1992 plane 4 1
T6 CNS 11643-1992 plane 6 1
T7 CNS 11643-1992 plane 7 2
TB CNS 11643-2007 plane 11 5
TC CNS 11643-2007 plane 12 3
TE CNS 11643-2007 plane 14 1
 United Kingdom UK IRG N2232R 917 917
Vietnam V0 TCVN 5773:1993 6 931
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 74
VN Vietnamese horizontal extensions 851
Buddhist canon SAT SAT Daizōkyō Text Database 241 241
N/A UTC UTC sources 164 164

CJK Unified Ideographs Extension I

A block named CJK Unified Ideographs Extension I was added as part of Unicode 15.1 to the Supplementary Ideographic Plane in the range U+2EBF0 through U+2EE5F, containing 622 characters.[26]

Charts

2EBF0–2EE5F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (625) more than the number of encoded characters (622).[21]

Country or region Code Source[22] Character count Total
 China GIDC23 ID system of the Ministry of Public Security of China, 2023 622 622
 Japan JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 1 1
N/A UTC UTC sources 2 2

CJK Unified Ideographs Extension J

A block named CJK Unified Ideographs Extension J was added as part of Unicode 17.0 to the Tertiary Ideographic Plane in the range U+323B0 through U+33479, containing 4,298 characters.

Charts

323B0-3347F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (4,406) more than the number of encoded characters (4,298).[21]

Country or region Code Source[22] Character count Total
 China GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 144 1,059
GKJ Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST) 567
GXM Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China 4
GZ Ancient Zhuang Character Dictionary (古壮字字典) 290
 South Korea KC Korean History On-Line (한국 역사 정보 통합 시스템) 178 178
 Taiwan T9 CNS 11643 (pending new version) plane 9 17 937
T11 CNS 11643 (pending new version) plane 17 1
TB CNS 11643-2007 plane 11 69
TC CNS 11643-2007 plane 12 165
TD CNS 11643-2007 plane 13 241
TE CNS 11643-2007 plane 14 396
TF CNS 11643-2007 plane 15 6
 United Kingdom UK IRG N2232R 6 906
UK IRG N2487 900
 Vietnam V0 TCVN 5773:1993 16 991
V1 TCVN 6056:1995 1
V2 VHN 01-1998 2
V3 VHN 02-1998 1
V4 Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire) 58
VN Vietnamese horizontal extensions 913
Buddhist canon SAT SAT Daizōkyō Text Database 241 906
SATM SAT manuscript collection for Buddhist studies 1
N/A UTC UTC sources 129 129

CJK Compatibility Ideographs

The block named CJK Compatibility Ideographs (F900–FAFF) was created to retain round-trip compatibility with other standards.

However, twelve characters in this block actually have the "Unified Ideograph" property: U+FA0E 﨎, U+FA0F 﨏, U+FA11 﨑, U+FA13 﨓, U+FA14 﨔, U+FA1F 﨟, U+FA21 﨡, U+FA23 﨣, U+FA24 﨤, U+FA27 﨧, U+FA28 﨨, and U+FA29 﨩.[1] None of the other characters in this and other "Compatibility" blocks relate to CJK unification.
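One observable difference between the twelve unified ideographs and the other characters in this block is how they behave under normalization. The sketch below uses Python's standard `unicodedata` module and assumes that an ordinary compatibility ideograph such as U+F900 is replaced by its unified counterpart under NFC, while the twelve characters listed above are left unchanged. The helper name `has_canonical_mapping` is illustrative.

```python
import unicodedata

def has_canonical_mapping(ch: str) -> bool:
    """True if NFC normalization replaces ch with a different character."""
    return unicodedata.normalize("NFC", ch) != ch

# The twelve code points in the compatibility block that carry the
# Unified_Ideograph property, as listed in the text above.
UNIFIED_IN_COMPAT_BLOCK = [
    0xFA0E, 0xFA0F, 0xFA11, 0xFA13, 0xFA14, 0xFA1F,
    0xFA21, 0xFA23, 0xFA24, 0xFA27, 0xFA28, 0xFA29,
]

# An ordinary compatibility ideograph is mapped to a unified ideograph.
print(has_canonical_mapping("\uF900"))  # True

# None of the twelve unified ideographs is changed by normalization.
print(any(has_canonical_mapping(chr(cp)) for cp in UNIFIED_IN_COMPAT_BLOCK))  # False
```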

While 龜 and 亀 are not considered unifiable, U+FA20 CJK COMPATIBILITY IDEOGRAPH-FA20 is considered a duplicate of U+8612 CJK UNIFIED IDEOGRAPH-8612.

Charts

F900–FAFF.

Sources

Note: All characters appear in more than one source, so the sum of individual character counts (40) is greater than the number of encoded characters (12).[21]

Country or region Code Source[22] Character count Total
China GDM Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China 1 12
GU No source (the original source reference may have been moved) 11
 Japan J3 JIS X 0213:2004 Level 3 3 12
J4 JIS X 0213:2004 Level 4 3
JA Japanese IT Vendors Contemporary Ideographs, 1993 1
JA3 JIS X 0213:2004 level-3 characters replacing JA characters 1
JMJ Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業) 4
Taiwan TF CNS 11643-2007 plane 15 1 1
 Vietnam V0 TCVN 5773:1993 3 3
N/A UTC UTC sources 12 12
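
The arithmetic behind the note above can be reproduced from the table. This is a small sketch that simply copies the per-source counts from the table; the variable names are illustrative.

```python
# Per-source character counts copied from the table above.
SOURCE_COUNTS = {
    "GDM": 1, "GU": 11,                              # China
    "J3": 3, "J4": 3, "JA": 1, "JA3": 1, "JMJ": 4,   # Japan
    "TF": 1,                                         # Taiwan
    "V0": 3,                                         # Vietnam
    "UTC": 12,                                       # UTC
}
ENCODED = 12  # unified ideographs in the CJK Compatibility Ideographs block

# Each source reference is counted once per character it covers, so a
# character with several sources contributes several times to the sum.
total_references = sum(SOURCE_COUNTS.values())
average_sources = total_references / ENCODED

print(total_references, ENCODED, round(average_sources, 2))
```

On these figures each of the twelve characters carries a little over three source references on average.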

Known issues

Disunification

U+4039

The character U+4039 (䀹) was a unification of two different characters (one with jiā 夾 phonetic and one with shǎn 㚒 phonetic) until Unicode 5.0. However, they were lexically different characters that should not have been unified; they have different pronunciations and different meanings.

The proposal to disunify U+4039[27] was accepted for Unicode 5.1, which encoded a new character at U+9FC3 (鿃) to represent the shǎn form.

Three other glyphs in Extension B

In CJK Unified Ideographs Extension B, some characters were incorrectly unified with others. These characters include U+2017B (𠅻), U+204AF (𠒯) and U+24CB2 (𤲲). The first two wrongly unified a mainland Chinese source glyph with a Vietnamese one, while the last wrongly unified mainland Chinese and Taiwanese source glyphs.[28]

The glyphs for U+2017B (𠅻) and U+204AF (𠒯) were corrected in version 10.0, and the erroneous UCS2003 source glyph for U+24CB2 (𤲲) was removed in version 13.0.

Unifiable variants and exact duplicates

Also in CJK Unified Ideographs Extension B, hundreds of glyph variants were encoded by mistake.[29] Additionally, an ISO/IEC JTC 1/SC 2 report has found that six exact duplicates (where the same character has inadvertently been encoded twice) and two semi-duplicates (where the CJK-B character represents a de facto disunification of two glyph forms unified in the corresponding BMP character) were encoded by mistake:[30]

  • U+34A8 㒨 = U+20457 𠑗 : U+20457 is the same as the China-source glyph for U+34A8, but it is significantly different from the Taiwan-source glyph for U+34A8
  • U+3DB7 㶷 = U+2420E 𤈎 : same glyph shapes
  • U+8641 虁 = U+27144 𧅄 : U+27144 is the same as the Korean-source glyph for U+8641, but it is significantly different from the Chinese Mainland-, Taiwan- and Japan-source glyphs for U+8641
  • U+204F2 𠓲 = U+23515 𣔕 : same glyph shapes, but ordered under different radicals
  • U+249BC 𤦼 = U+249E9 𤧩 : same glyph shapes
  • U+24BD2 𤯒 = U+2A415 𪐕 : same glyph shapes, but ordered under different radicals
  • U+26842 𦡂 = U+26866 𦡦 : same glyph shapes
  • U+FA23 﨣 = U+27EAF 𧺯 : same glyph shapes (U+FA23 﨣 is a unified CJK ideograph, despite its name "CJK COMPATIBILITY IDEOGRAPH-FA23.")
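
Software that compares or searches text may want to treat the pairs listed above as equal. The following is a minimal sketch in Python of one way to do that, using a translation table built from the list. The direction of the mapping (second code point folded onto the first) is an assumption made for illustration and is not prescribed by the defect report; the names `DUPLICATE_PAIRS`, `FOLD_TABLE` and `fold_duplicates` are hypothetical.

```python
# Pairs copied from the list above: (first code point, second code point).
DUPLICATE_PAIRS = [
    (0x34A8, 0x20457),
    (0x3DB7, 0x2420E),
    (0x8641, 0x27144),
    (0x204F2, 0x23515),
    (0x249BC, 0x249E9),
    (0x24BD2, 0x2A415),
    (0x26842, 0x26866),
    (0xFA23, 0x27EAF),
]

# Assumption: the second member of each pair is folded onto the first.
FOLD_TABLE = {dup: keep for keep, dup in DUPLICATE_PAIRS}

def fold_duplicates(text: str) -> str:
    """Replace each listed duplicate with its counterpart for comparison."""
    return text.translate(FOLD_TABLE)

if __name__ == "__main__":
    a, b = chr(0x3DB7), chr(0x2420E)
    print(a == b, fold_duplicates(a) == fold_duplicates(b))
```

Such folding is lossy, particularly for the two semi-duplicates, whose glyphs differ between sources, so it is better suited to matching and indexing than to rewriting stored text.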

Other CJK ideographs in Unicode, not Unified

Apart from the eleven blocks of "Unified Ideographs", Unicode has about a dozen more blocks containing CJK characters that are not unified ideographs. These are mainly CJK radicals, strokes, punctuation, marks, symbols and compatibility characters. Although some of these characters have (decomposable) counterparts in other blocks, their usage can differ. One example is U+3007 IDEOGRAPHIC NUMBER ZERO in the CJK Symbols and Punctuation block. Although it is not covered under "CJK Unified Ideographs", it is treated as a CJK character for all other intents and purposes.[31]
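
The distinction can be made concrete with a simple block-range test. The sketch below is written in Python and rests on two assumptions: the block ranges are those believed to be listed in Blocks.txt for Unicode 17.0 and should be verified before use, and block membership is only an approximation of the "Unified Ideograph" property, since blocks contain unassigned code points and the twelve unified ideographs in the CJK Compatibility Ideographs block are not covered. The names `UNIFIED_BLOCKS` and `in_unified_block` are illustrative.

```python
import unicodedata

# Assumed block ranges (believed to match Blocks.txt for Unicode 17.0;
# verify before relying on them).
UNIFIED_BLOCKS = [
    (0x3400, 0x4DBF),    # Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0x20000, 0x2A6DF),  # Extension B
    (0x2A700, 0x2B73F),  # Extension C
    (0x2B740, 0x2B81F),  # Extension D
    (0x2B820, 0x2CEAF),  # Extension E
    (0x2CEB0, 0x2EBEF),  # Extension F
    (0x2EBF0, 0x2EE5F),  # Extension I
    (0x30000, 0x3134F),  # Extension G
    (0x31350, 0x323AF),  # Extension H
    (0x323B0, 0x3347F),  # Extension J
]

def in_unified_block(ch: str) -> bool:
    """Approximate test: is ch inside one of the eleven unified blocks?"""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UNIFIED_BLOCKS)

if __name__ == "__main__":
    zero = "\u3007"  # IDEOGRAPHIC NUMBER ZERO
    print(in_unified_block(zero))          # outside every unified block
    print(unicodedata.category(zero))      # general category of U+3007
    print(unicodedata.numeric(zero))       # numeric value of U+3007
    print(in_unified_block("\u4E00"))      # first code point of the main block
```

U+3007 falls outside every range in the table, and the character database classifies it as a letter-like number rather than as an ordinary ideograph, which is why range-based "is this a Han character" checks commonly need to special-case it.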

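Four blocks of compatibility characters are included for compatibility with legacy text handling systems and older character sets:

  • CJK Compatibility (3300–33FF)
  • CJK Compatibility Forms (FE30–FE4F)
  • CJK Compatibility Ideographs (F900–FAFF)
  • CJK Compatibility Ideographs Supplement (2F800–2FA1F)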

They include forms of characters for vertical text layout and rich text characters that Unicode recommends handling through other means. Therefore, their use is discouraged.

Font support

The blocks CJK Unified Ideographs and CJK Unified Ideographs Extension A, being parts of the Basic Multilingual Plane, are supported by the majority of CJK fonts. However, Japanese and Korean fonts usually have fewer characters (about 13,000 and 8,000, respectively) than Chinese fonts. Extensions B, C and D are supported by the additional fonts MingLiU-ExtB, MingLiU_HKSCS-ExtB, PMingLiU-ExtB and SimSun-ExtB, which have been included in Microsoft Windows since Vista.[32]
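
Whether a character belongs to the Basic Multilingual Plane or to a supplementary plane follows directly from its code point. The following Python sketch shows the calculation; the helper names `plane_of` and `utf16_units` are illustrative, and the plane abbreviations are those used in this article.

```python
PLANE_NAMES = {0: "BMP", 2: "SIP", 3: "TIP"}

def plane_of(ch: str) -> str:
    """Return the plane of a character: the code point divided by 0x10000."""
    plane = ord(ch) >> 16
    return PLANE_NAMES.get(plane, f"plane {plane}")

def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to store ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2

if __name__ == "__main__":
    for ch in ("\u4E00", "\u3400", "\U00020000", "\U00030000"):
        print(f"U+{ord(ch):04X}", plane_of(ch), utf16_units(ch))
```

Characters outside the Basic Multilingual Plane need a surrogate pair (two code units) in UTF-16, which is one practical reason why software and font support for Extension B and later blocks has historically lagged behind support for the two BMP blocks.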

Unicode version history

CJK unified ideographs additions per Unicode version
Unicode version Addition Plane Characters added Total characters
1.0 (1991) CJK Unified Ideographs Basic Multilingual Plane (BMP) 20,902 20,914
CJK Compatibility Ideographs BMP 12
3.0 (1999) CJK Unified Ideographs Extension A BMP 6,582 27,496
3.1 (2001) CJK Unified Ideographs Extension B Supplementary Ideographic Plane (SIP) 42,711 70,207
4.1 (2005) CJK Unified Ideographs: Ideographs from HKSCS-2004 and GB 18030-2000 not in ISO 10646 BMP 22 70,229
5.1 (2008) CJK Unified Ideographs: Ideographs from Adobe Japan and disunification of U+4039 BMP 8 70,237
5.2 (2009) CJK Unified Ideographs Extension C SIP 4,149 74,394
8 other characters from ARIB #47, #95, #93 and HKSCS BMP 8
6.0 (2010) CJK Unified Ideographs Extension D SIP 222 74,616
6.1 (2012) 1 character corresponding to Adobe-Japan1-6 CID+20156 BMP 1 74,617
8.0 (2015) CJK Unified Ideographs Extension E SIP 5,762 80,388
9 other characters BMP 9
10.0 (2017) CJK Unified Ideographs Extension F SIP 7,473 87,882
21 other characters BMP 21
11.0 (2018) CJK Unified Ideographs BMP 5 87,887
13.0 (2020) CJK Unified Ideographs BMP 13 92,856
CJK Unified Ideographs Extension A BMP 10
CJK Unified Ideographs Extension B SIP 7
CJK Unified Ideographs Extension G Tertiary Ideographic Plane (TIP) 4,939
14.0 (2021) CJK Unified Ideographs BMP 3 92,865
CJK Unified Ideographs Extension B SIP 2
CJK Unified Ideographs Extension C SIP 4
15.0 (2022) CJK Unified Ideographs Extension C SIP 1 97,058
CJK Unified Ideographs Extension H TIP 4,192
15.1 (2023) CJK Unified Ideographs Extension I SIP 622 97,680
17.0 (2025) CJK Unified Ideographs Extension C SIP 6 101,996
CJK Unified Ideographs Extension E SIP 12
CJK Unified Ideographs Extension J TIP 4,298
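
The "Total characters" column is a running sum of the "Characters added" column. The short Python sketch below recomputes it from the per-row figures copied from the table, as a consistency check; the names `ADDITIONS` and `running_totals` are illustrative.

```python
# (Unicode version, characters added per table row), copied from the table.
ADDITIONS = [
    ("1.0", [20902, 12]),
    ("3.0", [6582]),
    ("3.1", [42711]),
    ("4.1", [22]),
    ("5.1", [8]),
    ("5.2", [4149, 8]),
    ("6.0", [222]),
    ("6.1", [1]),
    ("8.0", [5762, 9]),
    ("10.0", [7473, 21]),
    ("11.0", [5]),
    ("13.0", [13, 10, 7, 4939]),
    ("14.0", [3, 2, 4]),
    ("15.0", [1, 4192]),
    ("15.1", [622]),
    ("17.0", [6, 12, 4298]),
]

def running_totals(additions):
    """Return {version: cumulative number of CJK unified ideographs}."""
    totals, total = {}, 0
    for version, rows in additions:
        total += sum(rows)
        totals[version] = total
    return totals

if __name__ == "__main__":
    for version, total in running_totals(ADDITIONS).items():
        print(f"{version:>5}  {total:,}")
```

The final value produced for version 17.0 agrees with the total of 101,996 characters given in the table.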

See also

Notes

  1. ^ Characters presumably intended for Singapore Chinese characters, but apparently an ad hoc collection rather than a Singapore national standard.[23]

References

  1. ^ a b "Unicode 16.0 UCD: PropList.txt". 2025-06-30. Retrieved 2025-09-11.
  2. ^ IRG Convenor (2024-12-10). "IRG Experts List". ISO/IEC JTC1/SC2/WG2/IRG N2769.
  3. ^ Lunde, Ken (2024-09-13). "US/Unicode Activity Report for IRG #63 Meeting" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2700.
  4. ^ "Unicode 17.0 UCD: Unihan: Unihan_IRGSources.txt". 2025-07-24. Retrieved 2025-09-12.
  5. ^ "UAX #45: U-source Ideographs". Unicode Consortium. 2025-07-24.
  6. ^ "18.1.7. Han Ideograph Arrangement". The Unicode Standard: Core Specification. Version 16.0.0. Unicode Consortium.
  7. ^ a b "3.3. Dictionary Indices". Unicode Han Database (Unihan). UAX #38. Three of the dictionary properties represent official IRG indices for the dictionaries used in the four dictionary sorting algorithm. Two (kIRGHanyuDaZidian and kIRGKangXi) are still being used by the IRG, but the other one (kIRGDaeJaweon) is not.
  8. ^ Lunde, Ken (2022-09-01). "Proposal to remove/improve provisional Unihan database properties" (PDF). p. 6. UTC L2/22-188. In addition, the IRG no longer uses this dictionary for its ongoing work.
  9. ^ "kIRG_GSource". Unicode Han Database (Unihan). UAX #38. GKX: Kangxi Dictionary ideographs (康熙字典) 9th edition (1958) including the addendum (康熙字典)補遺. GHZ: Hanyu Dazidian ideographs (漢語大字典).
  10. ^ Lunde, Ken (2018-02-22). "Proposed kIRG_GSource Changes & Corrections" (PDF). UTC L2/18-065; ISO/IEC JTC1/SC2/WG2/IRG N2297.
  11. ^ "2. Text File Data". U-Source Ideographs. Unicode Consortium. UAX #45. A KangXi dictionary index for the ideograph, as described in Unicode Standard Annex #38, "Unicode Han Database (Unihan)" [UAX38]. This field is no longer used and contains no data.
  12. ^ Lunde, Ken (2024-09-30). "Proposal to remove FS (first residual stroke) value from submissions" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2713. This document proposes that the inclusion of first residual stroke (aka FS) values be removed from the submission requirements for new CJK Unified Ideographs […] The ISO/IEC 10646 Project Editor, when compiling an IRG working set into a new CJK Unified Ideographs extension block, uses the FS values to sort ideographs that share the same Radical-Stroke (Radical + SC) value.
  13. ^ Lunde, Ken (2012-09-16). "URO". CJK Type Blog. Adobe Inc.
  14. ^ The Unicode Standard 4.0, Appendix A - Han Unification History
  15. ^ Suzanne Topping, "The secret life of Unicode". Archived from the original on 2007-11-14. Retrieved 2010-05-12.
  16. ^ Lu, Qin (2015-06-08). "The Proposed Hong Kong Character Set" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2074.
  17. ^ "Chapter 11 - East Asian scripts", The Unicode standard, 4.0.
  18. ^ "Ideographic Variation Database". 2022-09-13. Retrieved 2022-09-20.
  19. ^ "IVD Stats". 2025-07-14. Retrieved 2025-09-12.
  20. ^ PRI 108: Combined registration of the Adobe Japan1 collection and of sequences in that collection
  21. ^ a b c d e f g h i j k l "Unihan_IRGSources.txt (from Unihan.zip)". 2025-07-24. Retrieved 2025-09-12.
  22. ^ a b c d e f g h i j k l "UAX #38: Unicode Han Database (Unihan)". Unicode Consortium. 2025-08-21.
  23. ^ Lunde, Ken (2009). "Chapter 3: Character Set Standards § Chinese Character Set Standards—Singapore". CJKV information processing (2nd ed.). Sebastopol, Calif.: O'Reilly Media, Inc. p. 130. ISBN 978-0-596-15611-4. OCLC 317878469. To what extent these 226 characters are ad-hoc, or codified by a Singapore national standard, is unknown, at least to me. My suspicion is that they are ad-hoc simply for the apparent lack of any Singapore national standard.
  24. ^ "Unicode 13.0.0". 10 March 2020. Retrieved 10 March 2020.
  25. ^ "Unicode 15.0.0". 13 September 2022. Retrieved 14 September 2022.
  26. ^ "Unicode 15.1.0". 2023-09-12. Retrieved 2023-09-12.
  27. ^ Andrew West and John Jenkins, proposal of disunification of U+4039
  28. ^ Eiso Chan (陈永聪), Comments on four error glyphs on CJK Unified Ideographs Ext B & E.
  29. ^ Taichi Kawabata. "IRGN1155 Possible Duplicates" (.zip). Retrieved 2019-06-22.
  30. ^ Cook, Richard (6 October 2003). "Defect Report on Duplicate Encoded CJK Forms" (PDF). ISO/IEC JTC1/SC2/WG2. Retrieved 2025-08-21.
  31. ^ GB/T 15835-2011《出版物上数字用法》. China Guojia Biaozhun. https://journals.usst.edu.cn/uploadfile/file/GBT%2015835-2011%E3%80%8A%E5%87%BA%E7%89%88%E7%89%A9%E4%B8%8A%E6%95%B0%E5%AD%97%E7%94%A8%E6%B3%95%E3%80%8B.pdf
  32. ^ Lunde, Ken (2009). CJKV Information Processing. O'Reilly. pp. 633–634. ISBN 978-0-596-51447-1.