CJK Unified Ideographs

The Chinese, Japanese and Korean (CJK) scripts share a common background, and the characters they have in common are collectively known as CJK characters. During the process called Han unification, the common (shared) characters were identified and named CJK Unified Ideographs. As of Unicode 17.0, Unicode defines a total of 101,996 CJK Unified Ideographs.^[1]

The term ideographs is a misnomer, as the Chinese script is not ideographic but rather logographic.^{[citation needed]}

Until the early 20th century, Vietnam also used Chinese characters (Chữ Nôm), so sometimes the abbreviation CJKV is used.

Sources

The Ideographic Research Group (IRG) is responsible for developing extensions to the encoded repertoires of CJK unified ideographs. IRG processes proposals for new CJK unified ideographs submitted by its member bodies; after the proposals have undergone several rounds of expert review, it submits a consolidated set of characters to ISO/IEC JTC 1/SC 2 Working Group 2 (WG2) and the Unicode Technical Committee (UTC) for consideration for inclusion in the ISO/IEC 10646 and Unicode standards. The following IRG member bodies have been involved in the standardization of CJK unified ideographs:

China
Hong Kong
Japan
North Korea
South Korea
Macau
Taiwan, liaison member represented by the Taipei Computer Association (TCA)
Vietnam
Unicode Technical Committee (liaison member, also representing the United States)^[2]^[3]
United Kingdom
SAT (liaison member)

The ideographs submitted by the UTC and the United Kingdom are not specific to any particular region, but are characters which have been suggested for encoding by individual experts. The ideographs submitted by SAT are required for the SAT Daizōkyō text database.

The table below gives the numbers of encoded CJK unified ideographs for each IRG source for Unicode 17.0.^[4] The total number of characters (267,742) far exceeds the number of encoded CJK unified ideographs (101,996) as many characters have more than one source.

CJK unified ideographs by source
Member	Character count
China	69,724
Hong Kong	17,654
Japan	52,560
North Korea	23,975
South Korea	21,358
Macau	344
Taiwan	59,570
United Kingdom	3,409
Vietnam	14,276
SAT	3,715
UTC	1,157
Total	267,742
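
The arithmetic behind the table can be checked with a short sketch. The counts are copied from the table above; the variable names and the "mean sources per ideograph" figure are illustrative additions, not part of the Unicode data.

```python
# Per-source counts copied from the table above (Unicode 17.0).
SOURCE_COUNTS = {
    "China": 69_724,
    "Hong Kong": 17_654,
    "Japan": 52_560,
    "North Korea": 23_975,
    "South Korea": 21_358,
    "Macau": 344,
    "Taiwan": 59_570,
    "United Kingdom": 3_409,
    "Vietnam": 14_276,
    "SAT": 3_715,
    "UTC": 1_157,
}
ENCODED_IDEOGRAPHS = 101_996  # number of encoded ideographs quoted in the text

# Every source reference is counted once per member, so a character with
# several sources contributes several times to this sum.
total_references = sum(SOURCE_COUNTS.values())

# Illustrative derived figure: average number of sources per ideograph.
mean_sources = total_references / ENCODED_IDEOGRAPHS

print(total_references)
print(round(mean_sources, 2))
```

The sum reproduces the table's total row, and the ratio suggests that an encoded ideograph has, on average, between two and three source references.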

UTC sources

The majority of characters submitted by the UTC to the IRG are derived from Unicode Technical Committee (UTC) documents.^[5] Other sources include:

ABC Chinese-English Dictionary by John DeFrancis
The Adobe-CNS1 glyph collection
The Adobe-Japan1 glyph collection
A Complete Checklist of Species and Subspecies of Chinese Birds (中国鸟类系统检索)
The Great Nom Dictionary (Đại Tự Điển Chữ Nôm)
Annotations to Shuowen Jiezi (annotated by Duan Yucai)
GB18030-2000
Required Character List Supplied by the Church of Jesus Christ of Latter-day Saints (Hong Kong)
New Commercial Dictionary (商务新词典), Hong Kong
Modern Chinese Dictionary (现代汉语词典), by Chinese Academy of Social Sciences, Linguistics Research Institute, Dictionary Editorial Office
Working Group (WG2) documents

Ordering

The ordering of CJK Unified Ideographs within Unicode blocks (not counting those added to the block later) was initially determined by consulting the following four dictionaries. Primarily, they were arranged in Kangxi Dictionary order, with the other dictionaries consulted, in order, for characters not found in the Kangxi Dictionary, to determine which Kangxi Dictionary character they should follow in the ordering.^[6]

Kangxi Dictionary
Dai Kan-Wa Jiten
Hanyu Da Zidian
Dae Jaweon
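
The fallback described above can be sketched as a lookup that tries each dictionary in priority order. This is only a minimal illustration: the function name, the index structure, and the toy data are hypothetical, and the real process also had to decide which Kangxi Dictionary character an unlisted character should follow.

```python
# Priority order taken from the list above.
DICTIONARY_PRIORITY = [
    "Kangxi Dictionary",
    "Dai Kan-Wa Jiten",
    "Hanyu Da Zidian",
    "Dae Jaweon",
]


def first_listing(char, indexes):
    """Return (dictionary name, position) for the first dictionary, in
    priority order, that lists `char`, or None if none of them does."""
    for name in DICTIONARY_PRIORITY:
        position = indexes.get(name, {}).get(char)
        if position is not None:
            return name, position
    return None


# Hypothetical toy indexes: the placeholder keys "A" and "B" and their
# positions are invented for illustration only.
toy_indexes = {
    "Kangxi Dictionary": {"A": 10},
    "Hanyu Da Zidian": {"A": 7, "B": 42},
}

print(first_listing("A", toy_indexes))  # ('Kangxi Dictionary', 10)
print(first_listing("B", toy_indexes))  # ('Hanyu Da Zidian', 42)
print(first_listing("C", toy_indexes))  # None
```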

This system is not used for more recently added Unicode blocks. The Ideographic Research Group no longer uses the Dae Jaweon,^[7] nor the Dai Kan-Wa Jiten,^[8] in its work. The Kangxi Dictionary and Hanyu Da Zidian are still used^[7] both in existing character source references,^[9] and as potential replacements for existing source references discovered to be erroneous.^[10] Similarly, although a (real or virtual) Kangxi Dictionary index was previously provided as part of the submission data for UTC-source characters, this is no longer the case.^[11] Instead, the stroke type of the first residual stroke (the first stroke that does not form part of the radical) is supplied with all submitted characters and is used to order characters with the same radical and stroke count within the new Unicode block.^[12]
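
The newer ordering rule can be pictured as a three-part sort key. The sketch below is an assumption-laden illustration: the `Submission` class, the numeric stroke-type codes, and the sample values are hypothetical and do not reproduce IRG's actual submission format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    label: str                  # hypothetical identifier for the character
    radical: int                # radical number
    stroke_count: int           # stroke count used for ordering
    first_residual_stroke: int  # hypothetical numeric code for the stroke type


def ordering_key(s: Submission):
    # Radical first, then stroke count; the first residual stroke only
    # breaks ties between characters that agree on both.
    return (s.radical, s.stroke_count, s.first_residual_stroke)


# Invented sample data for illustration only.
submissions = [
    Submission("c", 85, 4, 3),
    Submission("a", 85, 4, 1),
    Submission("d", 86, 2, 1),
    Submission("b", 85, 4, 2),
]

ordered = sorted(submissions, key=ordering_key)
print([s.label for s in ordered])  # ['a', 'b', 'c', 'd']
```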

CJK Unified Ideographs blocks

CJK Unified Ideographs

The basic block named CJK Unified Ideographs (4E00–9FFF) contains 20,992 basic Chinese characters in the range U+4E00 through U+9FFF. The block includes not only characters used in the Chinese writing system but also kanji used in the Japanese writing system, hanja used in Korean, and chữ Nôm characters used in Vietnamese. Many characters in this block are used in all of these writing systems, while others are used in only some of them.

This block is also known as the Unified Repertoire and Ordering (URO), especially when it needs to be differentiated from the other CJK Unified Ideographs blocks.^[13]

The first 20,902 characters in the block are arranged according to the Kangxi Dictionary ordering of radicals. In this system the characters written with the fewest strokes are listed first. The remaining characters were added later, and so are not in radical order.
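
As a rough illustration, the block size and the boundary of the originally ordered portion follow directly from the figures quoted above. The function and constant names are hypothetical.

```python
URO_FIRST = 0x4E00
URO_LAST = 0x9FFF
ORIGINAL_COUNT = 20_902  # characters in Kangxi radical order, per the text


def in_uro(ch: str) -> bool:
    """True if `ch` lies in the basic CJK Unified Ideographs block."""
    return URO_FIRST <= ord(ch) <= URO_LAST


def in_original_ordering(ch: str) -> bool:
    """True if `ch` is among the first 20,902 code points of the block."""
    return URO_FIRST <= ord(ch) < URO_FIRST + ORIGINAL_COUNT


block_size = URO_LAST - URO_FIRST + 1
last_original = URO_FIRST + ORIGINAL_COUNT - 1

print(block_size)                # 20992
print(f"U+{last_original:04X}")  # U+9FA5
print(in_uro("\u4e00"), in_uro("\u3042"))  # True False
```

Code points after U+9FA5 in this block are therefore the later additions that fall outside the radical ordering.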

The block is the result of Han unification,^[14] which was somewhat controversial within East Asia.^[15] Because a character used in more than one of Chinese, Japanese and Korean was coded at a single location, and modern typographical conventions and handwriting curricula differ slightly between regions (not necessarily along language boundaries; for example, Hong Kong and Taiwan, which both use Traditional Chinese, have slightly different local conventions),^[16] the appearance of a given character can depend on the particular font being used. However, the URO applies the source separation rule, meaning that pairs of characters treated as distinct in a character set used as a source for the URO (e.g. JIS X 0208, as used in encodings such as Shift JIS) remain pairs of separate characters in the Unicode encoding.^[17]

Using variation selectors, it is possible to specify certain variant CJK ideograms within Unicode.^[18] The Adobe-Japan1 character set, which has 14,684 ideographic variation sequences,^[19] is an extreme example of the use of variation selectors.^[20]
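
In text, an ideographic variation sequence is simply a base ideograph followed by a variation selector. The sketch below assumes the selectors come from the Variation Selectors Supplement (U+E0100 through U+E01EF); the pairing of U+845B with U+E0100 is used purely as an illustration, and the code does not check whether a sequence is actually registered.

```python
IVS_FIRST = 0xE0100  # VARIATION SELECTOR-17
IVS_LAST = 0xE01EF   # VARIATION SELECTOR-256


def is_ideographic_vs(ch: str) -> bool:
    """True if `ch` is in the Variation Selectors Supplement."""
    return IVS_FIRST <= ord(ch) <= IVS_LAST


def strip_variation_selectors(text: str) -> str:
    """Drop supplementary variation selectors, leaving the base characters."""
    return "".join(ch for ch in text if not is_ideographic_vs(ch))


base = "\u845B"                  # base ideograph (illustrative choice)
sequence = base + "\U000E0100"   # base followed by VARIATION SELECTOR-17

print(len(sequence))                                # 2 code points
print(strip_variation_selectors(sequence) == base)  # True
```

A renderer that does not support the sequence can fall back to the base character, which is why stripping the selector yields comparable plain text.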

Charts

4E00–62FF, 6300–77FF, 7800–8CFF, 8D00–9FFF.

Sources

Note: Most characters appear in multiple sources, so the sum of individual character counts (108,493) is far greater than the number of encoded characters (20,992).^[21]

Member	Code	Source^[22]	Character count	Total
China	G0	GB/T 2312-1980 (formerly GB 2312-80)	6,763	20,938
	G1	GB/T 12345-1990 (formerly GB/T 12345-90); Traditional Chinese analogue to GB 2312-80	2,202
	G3	GB/T 13131 (unpublished GB/T 7589-1987 unsimplified forms)	4,833
	G5	GB/T 13132 (unpublished GB/T 7590-1987 unsimplified forms)	2,843
	G7	General Purpose Hanzi List for Modern Chinese Language, and General List of Simplified Hanzi (现代汉语通用字表)	42
	G8	GB/T 8565.2-1988 (formerly GB 8565.2-88)	203
	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	6
	GCE	Names of newly-discovered chemical elements as assigned by the China National Committee for Terms in Sciences and Technologies and the China National Language and Character Working Committee (全国科学技术名词审定委员会，国家语言文字工作委员会)	4
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	2
	GE	GB/T 16500-1998	3,767
	GFC	Modern Chinese Standard Dictionary (现代汉语规范词典第二版)	2
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	6
	GGT	Characters collected by the National Library of China (中国国家图书馆)	1
	GH	GB/T 15564-1995	59
	GHZ	Hanyu Da Zidian (漢語大字典)	1
	GHZR	Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版)	29
	GK	GB/T 12052-1989 (formerly GB 12052-89)	89
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	16
	GKX	Kangxi Dictionary (康熙字典)	43
	GLK	Longkan Shoujian (龍龕手鑑)	1
	GT	Standard Telegraph Codebook (revised), 1983 (标准电码本（修订本）)	16
	GU	No source (the original source reference has been moved)	8
	GWY	Cultural Heritage Ideographs (文化遗产用字)	1
	GZFY	Hanyu Fangyan Dacidian (汉语方言大词典)	1
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	2,292	15,376
	HB0	Computer Chinese Glyph and Character Code Mapping Table, Technical Report C-26 (電腦用中文字型與字碼對照表, 技術通報C-26)	9
	HB1	Big-5, Level 1	5,401
	HB2	Big-5, Level 2	7,650
	HD	Hong Kong Supplementary Character Set, 2016	24
Japan	J0	JIS X 0208-1990	6,356	18,249
	J1	JIS X 0212-1990	3,058
	J13	JIS X 0213:2004 level-3 characters replacing J1 characters	1,037
	J13A	JIS X 0213:2004 level-3 character addendum from JIS X 0213:2000 level-3 replacing J1 character	2
	J14	JIS X 0213:2004 level-4 characters replacing J1 characters	1,704
	J3	JIS X 0213:2004 Level 3	95
	J3A	JIS X 0213:2004 Level 3 addendum from JIS X 0213:2000 Level 3	7
	J4	JIS X 0213:2004 Level 4	301
	JARIB	ARIB STD-B24 Version 5.1, March 14 2007	3
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	5,686
North Korea	KP0	KPS 9566-97	4,652	15,008
	KP1	KPS 10721-2000	10,356
South Korea	K0	KS X 1001:2004 (formerly KS C 5601-1987)	4,620	15,450
	K1	KS X 1002:2001 (formerly KS C 5657-1991)	2,855
	K2	KS X 1027-1:2011 (formerly PKS C 5700-1 1994)	7,911
	K3	KS X 1027-2:2011 (formerly PKS C 5700-2 1994)	1
	K4	KS X 1027-3:2011 (formerly PKS 5700-3:1998)	4
	K6	KS X 1027-5:2021	57
	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	1
	KU	No source (the original source reference has been moved)	1
Macau	MA	HKSCS-2008	29	200
	MB1	Big Five	10
	MB2	Big Five	7
	MC	Macau Supplementary Character Set (MSCS) reference	3
	MD	Macau Supplementary Character Set (MSCS) horizontal extensions	127
	MDH	HKSCS-2016	24
Taiwan	T1	CNS 11643-1986 plane 1	5,413	18,385
	T2	CNS 11643-1986 plane 2	7,651
	T3	CNS 11643-1992 plane 3	4,145
	T4	CNS 11643-1992 plane 4	893
	T5	CNS 11643-1992 plane 5	64
	T6	CNS 11643-1992 plane 6	31
	T7	CNS 11643-1992 plane 7	16
	TB	CNS 11643-2007 plane 11	2
	TC	CNS 11643-2007 plane 12	2
	TE	CNS 11643-2007 plane 14	9
	TF	CNS 11643-2007 plane 15	159
Vietnam	V0	TCVN 5773:1993	598	4,808
	V1	TCVN 6056:1995	3,305
	V2	VHN 01-1998	759
	V3	VHN 02-1998	91
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	19
	VN	Vietnamese horizontal and vertical extensions	36
N/A	UTC	UTC sources	79	79

In Unicode 4.1, 14 HKSCS-2004 characters and 8 GB 18030 characters were assigned to code points between U+9FA6 and U+9FBB. Further characters have since been added to this block for various reasons, all summarized in the version history section below.

CJK Unified Ideographs Extension A

The block named CJK Unified Ideographs Extension A (3400–4DBF) contains 6,592 additional characters in the range U+3400 through U+4DBF.

Charts

3400–4DBF.

Sources

Note: Most characters appear in more than one source, so the sum of individual character counts (23,997) is far greater than the number of encoded characters (6,592).^[21]

Member	Code	Source^[22]	Character count	Total
China	G3	GB/T 13131 (unpublished GB/T 7589-1987 unsimplified forms)	2,390	6,230
	G5	GB/T 13132 (unpublished GB/T 7590-1987 unsimplified forms)	1,226
	G7	General Purpose Hanzi List for Modern Chinese Language, and General List of Simplified Hanzi (现代汉语通用字表)	120
	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	12
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	7
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	2
	GHZ	Hanyu Da Zidian (漢語大字典)	341
	GHZR	Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版)	1
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	3
	GKX	Kangxi Dictionary (康熙字典)	1,889
	GS	Singapore Chinese characters^{[note 1]}	226
	GWY	Cultural Heritage Ideographs (文化遗产用字)	1
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	12
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	572	572
Japan	J3	JIS X 0213:2004 Level 3	2	5,856
	J4	JIS X 0213:2004 Level 4	78
	JA	Japanese IT Vendors Contemporary Ideographs, 1993	574
	JA3	JIS X 0213:2004 level-3 characters replacing JA characters	17
	JA4	JIS X 0213:2004 level-4 characters replacing JA characters	67
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	5,118
North Korea	KP0	KPS 9566-97	1	3,191
	KP1	KPS 10721-2000	3,190
South Korea	K3	KS X 1027-2:2011 (formerly PKS C 5700-2 1994)	1,833	1,876
	K4	KS X 1027-3:2011 (formerly PKS 5700-3:1998)	2
	K6	KS X 1027-5:2021	37
	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	3
	KU	No source (the original source reference has been moved)	1
Macau	MA	HKSCS-2008	4	12
	MD	Macau Supplementary Character Set (MSCS) horizontal extensions	8
Taiwan	T3	CNS 11643-1992 plane 3	2,179	5,917
	T4	CNS 11643-1992 plane 4	2,920
	T5	CNS 11643-1992 plane 5	400
	T6	CNS 11643-1992 plane 6	200
	T7	CNS 11643-1992 plane 7	133
	TE	CNS 11643-2007 plane 14	1
	TF	CNS 11643-2007 plane 15	84
United Kingdom	UK	IRG N2107R2	3	3
Vietnam	V0	TCVN 5773:1993	140	319
	V2	VHN 01-1998	149
	V3	VHN 02-1998	19
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	5
	VN	Vietnamese horizontal and vertical extensions	6
N/A	UTC	UTC sources	21	21

CJK Unified Ideographs Extension B

The block named CJK Unified Ideographs Extension B (20000–2A6DF) contains 42,720 characters in the range U+20000 through U+2A6DF. These include most of the characters used in the Kangxi Dictionary that are not in the basic CJK Unified Ideographs block, as well as many Hán-Nôm characters that were formerly used to write Vietnamese.

Charts

20000–215FF, 21600–230FF, 23100–245FF, 24600–260FF, 26100–275FF, 27600–290FF, 29100–2A6DF.

Sources

Note: Many characters appear in more than one source, so the sum of individual character counts (100,887) is far greater than the number of encoded characters (42,720).^[21]

Member	Code	Source^[22]	Character count	Total
China	G3	GB/T 13131 (unpublished GB/T 7589-1987 unsimplified forms)	1	31,345
	G4K	Siku Quanshu (四庫全書)	474
	GBK	Encyclopedia of China (中國大百科全書)	59
	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	78
	GCESI	Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院)	102
	GCH	Cihai (辞海)	247
	GCY	Ciyuan (辭源)	66
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	146
	GFZ	Founder Press System (方正排版系统)	65
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	5
	GHC	Hanyu Da Cidian (漢語大詞典)	553
	GHF	Hanwen fodian yinan suzi huishi yu yanjiu (漢文佛典疑難俗字彙釋與研究)	1
	GHZ	Hanyu Da Zidian (漢語大字典)	10,506
	GHZR	Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版)	4
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	17
	GKX	Kangxi Dictionary (康熙字典)	18,472
	GU	No source (the original source reference has been moved)	73
	GWY	Cultural Heritage Ideographs (文化遗产用字)	12
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	8
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	453
	GZFY	Hanyu Fangyan Dacidian (汉语方言大词典)	3
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	1,703	1,703
Japan	J3	JIS X 0213:2004 Level 3	25	25,745
	J3A	JIS X 0213:2004 Level 3 addendum from JIS X 0213:2000 Level 3	1
	J4	JIS X 0213:2004 Level 4	277
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	25,442
North Korea	KP1	KPS 10721-2000	5,766	5,766
South Korea	K1	KS X 1002:2001 (formerly KS C 5657-1991)	1	683
	K4	KS X 1027-3:2011 (formerly PKS 5700-3:1998)	166
	K6	KS X 1027-5:2021	502
	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	14
Macau	MA	HKSCS-2008	9	38
	MC	Macau Supplementary Character Set (MSCS) reference	2
	MD	Macau Supplementary Character Set (MSCS) horizontal extensions	27
Taiwan	T3	CNS 11643-1992 plane 3	28	30,212
	T4	CNS 11643-1992 plane 4	3,408
	T5	CNS 11643-1992 plane 5	8,114
	T6	CNS 11643-1992 plane 6	5,942
	T7	CNS 11643-1992 plane 7	6,299
	TA	CNS 11643-2007 plane 10	12
	TB	CNS 11643-2007 plane 11	7
	TC	CNS 11643-2007 plane 12	1
	TF	CNS 11643-2007 plane 15	6,401
United Kingdom	UK	IRG N2107R2	12	12
Vietnam	V0	TCVN 5773:1993	1,570	5,299
	V1	TCVN 6056:1995	1
	V2	VHN 01-1998	2,286
	V3	VHN 02-1998	422
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	33
	VN	Vietnamese horizontal and vertical extensions	987
*Buddhist canon*	SAT	SAT Daizōkyō Text Database	1	1
N/A	UTC	UTC sources	83	83

CJK Unified Ideographs Extension C

The block named CJK Unified Ideographs Extension C (2A700–2B73F) contains 4,160 characters in the range U+2A700 through U+2B73F. It was initially added in Unicode 5.2 (2009).

Charts

2A700–2B73F.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (4,967) is greater than the number of encoded characters (4,160).^[21]

Member	Code	Source^[22]	Character count	Total
China	GBK	Encyclopedia of China (中國大百科全書)	74	1,456
	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	12
	GCESI	Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院)	117
	GCH	Cihai (辞海)	264
	GCY	Ciyuan (辭源)	1
	GCYY	Chinese Academy of Surveying and Mapping ideographs (中国测绘科学院用字)	55
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	83
	GFZ	Founder Press System (方正排版系统)	1
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	2
	GGH	Gudai Hanyu Cidian (古代汉语词典)	51
	GHC	Hanyu Da Cidian (漢語大詞典)	14
	GHZ	Hanyu Da Zidian (漢語大字典)	1
	GHZR	Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版)	1
	GJZ	Commercial Press ideographs (商务印书馆用字)	61
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	6
	GKX	Kangxi Dictionary (康熙字典)	8
	GWY	Cultural Heritage Ideographs (文化遗产用字)	2
	GXC	Xiandai Hanyu Cidian (现代汉语词典)	25
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	2
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	109
	GZFY	Hanyu Fangyan Dacidian (汉语方言大词典)	202
	GZJW	Yin Zhou Jinwen Jicheng Yinde (殷周金文集成引得)	365
Hong Kong	H	Hong Kong Supplementary Character Set, 2008	1	1
Japan	JK	Japanese Kokuji Collection (Mojikyō subset)	367	431
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	64
North Korea	KP1	KPS 10721-2000	9	9
South Korea	K5	KS X 1027-4:2011 (formerly Korean IRG Hanja Character Set 5th Edition: 2001)	404	407
	K6	KS X 1027-5:2021	2
	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	1
Macau	MC	Macau Supplementary Character Set (MSCS) reference	17	21
	MD	Macau Supplementary Character Set (MSCS) horizontal extensions	4
Taiwan	T4	CNS 11643-1992 plane 4	1	1,757
	T5	CNS 11643-1992 plane 5	1
	T6	CNS 11643-1992 plane 6	2
	TB	CNS 11643-2007 plane 11	2
	TC	CNS 11643-2007 plane 12	634
	TD	CNS 11643-2007 plane 13	766
	TE	CNS 11643-2007 plane 14	350
	TU	No source (the original source reference has been moved)	1
United Kingdom	UK	IRG N2107R2	1	1
Vietnam	V0	TCVN 5773:1993	4	795
	V1	TCVN 6056:1995	2
	V2	VHN 01-1998	1
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	782
	VN	Vietnamese horizontal and vertical extensions	6
N/A	UTC	UTC sources	89	89

CJK Unified Ideographs Extension D

The block named CJK Unified Ideographs Extension D (2B740–2B81F) contains 222 characters in the range U+2B740 through U+2B81D that were added in Unicode 6.0 (2010).
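
The block range and the range of assigned characters differ slightly here, which a quick calculation makes explicit. The helper name is hypothetical; the endpoints are those quoted above.

```python
def span(first: int, last: int) -> int:
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


BLOCK = (0x2B740, 0x2B81F)     # Extension D block boundaries
ASSIGNED = (0x2B740, 0x2B81D)  # range of encoded characters quoted above

print(span(*BLOCK))                    # 224
print(span(*ASSIGNED))                 # 222
print(span(*BLOCK) - span(*ASSIGNED))  # 2 code points left unassigned
```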

Charts

2B740–2B81F.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (260) is greater than the number of encoded characters (222).^[21]

Member	Code	Source^[22]	Character count	Total
China	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	12	99
	GCESI	Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院)	6
	GCH	Cihai (辞海)	1
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	1
	GIDC	ID System of the Ministry of Public Security of China (公安人口信息专用字库补充汉字)	9
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	2
	GXC	Xiandai Hanyu Cidian (现代汉语词典)	4
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	22
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	3
	GZH	Zhonghua Zihai (中华字海)	39
Japan	JH	Hanyo-Denshi Program (汎用電子情報交換環境整備プログラム)	107	117
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	10
Taiwan	TB	CNS 11643-2007 plane 11	24	24
N/A	UTC	UTC sources	20	20

CJK Unified Ideographs Extension E

The block named CJK Unified Ideographs Extension E (2B820–2CEAF) contains 5,774 characters in the range U+2B820 through U+2CEAD. It was originally added in Unicode 8.0 (2015).

Charts

2B820–2CEAF.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (6,272) is greater than the number of encoded characters (5,774).^[21]

Member	Code	Source^[22]	Character count	Total
China	GBK	Encyclopedia of China (中國大百科全書)	11	3,173
	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	20
	GCESI	Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院)	211
	GCH	Cihai (辞海)	112
	GCY	Ciyuan (辭源)	3
	GCYY	Chinese Academy of Surveying and Mapping ideographs (中国测绘科学院用字)	98
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	10
	GDZ	Geological Publishing House ideographs (地质出版社用字)	1
	GGFZ	Tongyong Guifan Hanzi Zidian (通用规范汉字字典)	4
	GGH	Gudai Hanyu Cidian (古代汉语词典)	175
	GGT	Characters collected by the National Library of China (中国国家图书馆)	2
	GHC	Hanyu Da Cidian (漢語大詞典)	7
	GIDC	ID System of the Ministry of Public Security of China (公安人口信息专用字库补充汉字)	37
	GJZ	Commercial Press ideographs (商务印书馆用字)	147
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	2
	GKX	Kangxi Dictionary (康熙字典)	22
	GRM	People's Daily ideographs (人民日报用字)	3
	GU	No source (the original source reference has been moved)	3
	GWY	Cultural Heritage Ideographs (文化遗产用字)	1
	GWZ	Hanyu Da Cidian Press ideographs (漢語大詞典出版社用字)	12
	GXC	Xiandai Hanyu Cidian (现代汉语词典)	57
	GXH	Xinhua Zidian (新华字典)	4
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	1
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	107
	GZFY	Hanyu Fangyan Dacidian (汉语方言大词典)	712
	GZHSJ	Characters collected by the Zhonghua Book Company (中华书局)	1
	GZJW	Yin Zhou Jinwen Jicheng Yinde (殷周金文集成引得)	1,410
Hong Kong	HD	Hong Kong Supplementary Character Set, 2016	1	1
Japan	JK	Japanese Kokuji Collection (Mojikyō subset)	415	503
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	88
South Korea	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	7	7
Macau	MC	Macau Supplementary Character Set (MSCS) reference	48	51
	MD	Macau Supplementary Character Set (MSCS) horizontal extensions	3
Taiwan	T3	CNS 11643-1992 plane 3	2	1,261
	TB	CNS 11643-2007 plane 11	2
	TC	CNS 11643-2007 plane 12	323
	TD	CNS 11643-2007 plane 13	595
	TE	CNS 11643-2007 plane 14	339
United Kingdom	UK	IRG N2107R2	2	2
Vietnam	V0	TCVN 5773:1993	7	1,037
	V2	VHN 01-1998	1
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	1,023
	VN	Vietnamese horizontal and vertical extensions	6
N/A	UTC	UTC sources	237	237

CJK Unified Ideographs Extension F

The block named CJK Unified Ideographs Extension F (2CEB0–2EBEF) contains 7,473 characters in the range U+2CEB0 through U+2EBE0 that were added in Unicode 10.0 (2017). It includes more than 1,000 Sawndip characters for Zhuang.

Charts

2CEB0–2EBEF.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (8,015) is greater than the number of encoded characters (7,473).^[21]

Member	Code	Source^[22]	Character count	Total
China	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	46	1,546
	GCESI	Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院)	73
	GCY	Ciyuan (辭源)	122
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	31
	GFC	Modern Chinese Standard Dictionary (现代汉语规范词典第二版)	27
	GIDC	ID System of the Ministry of Public Security of China (公安人口信息专用字库补充汉字)	1
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	5
	GLGYJ	Zhuang Liao Songs Research (壮族嘹歌研究)	1
	GOCD	Oxford English-Chinese Chinese-English Dictionary (牛津英汉汉英词典)	2
	GPGLG	Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌)	69
	GWY	Cultural Heritage Ideographs (文化遗产用字)	6
	GXHZ	Xinhua Da Zidian (新华大字典)	51
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	2
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	1,075
	GZJW	Yin Zhou Jinwen Jicheng Yinde (殷周金文集成引得)	33
	GZYS	Chinese Ancient Ethnic Characters Research, 1984 (中国民族古文字研究)	2
Hong Kong	HD	Hong Kong Supplementary Character Set, 2016	1	1
Japan	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	1,646	1,646
South Korea	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	1,810	1,810
Macau	MC	Macau Supplementary Character Set (MSCS) reference	22	22
Taiwan	T3	CNS 11643-1992 plane 3	1	6
	T6	CNS 11643-1992 plane 6	2
	T7	CNS 11643-1992 plane 7	2
	TC	CNS 11643-2007 plane 12	1
United Kingdom	UK	IRG N2107R2	2	2
Vietnam	V0	TCVN 5773:1993	1	17
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	8
	VN	Vietnamese horizontal and vertical extensions	8
*Buddhist canon*	SAT	SAT Daizōkyō Text Database	2,884	2,884
N/A	UTC	UTC sources	81	81

CJK Unified Ideographs Extension G

A block named CJK Unified Ideographs Extension G was added as part of Unicode 13.0 to the Tertiary Ideographic Plane in the range U+30000 through U+3134F, containing 4,939 characters.^[24]
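
The plane that a code point belongs to can be read off its upper bits, since each plane spans 0x10000 code points. The following sketch uses that fact; the function name and the small name table are illustrative and cover only the planes relevant to CJK Unified Ideographs.

```python
# Names of the planes that contain CJK Unified Ideographs blocks.
PLANE_NAMES = {
    0: "Basic Multilingual Plane",
    2: "Supplementary Ideographic Plane",
    3: "Tertiary Ideographic Plane",
}


def plane_of(code_point: int) -> int:
    """Each plane holds 0x10000 code points, so the plane is cp >> 16."""
    return code_point >> 16


for cp in (0x4E00, 0x20000, 0x2EBF0, 0x30000, 0x323B0):
    plane = plane_of(cp)
    print(f"U+{cp:04X} -> plane {plane} ({PLANE_NAMES[plane]})")
```

Under this calculation Extension G's first code point, U+30000, falls in plane 3, matching the description above.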

Charts

30000–3134F.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (5,239) is greater than the number of encoded characters (4,939).^[21]

Member	Code	Source^[22]	Character count	Total
China	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	69	2,239
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	49
	GHZR	Hanyu Da Zidian 2nd ed. (汉语大字典, 第二版)	878
	GPGLG	Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌)	13
	GWY	Cultural Heritage Ideographs (文化遗产用字)	11
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	11
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	1,208
South Korea	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	435	435
Taiwan	T13	CNS 11643 (pending new version) plane 19	347	354
	T5	CNS 11643-1992 plane 5	1
	TB	CNS 11643-2007 plane 11	3
	TC	CNS 11643-2007 plane 12	2
	TD	CNS 11643-2007 plane 13	1
United Kingdom	UK	IRG N2107R2	1,566	1,566
Vietnam	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	6	76
	VN	Vietnamese horizontal and vertical extensions	70
*Buddhist canon*	SAT	SAT Daizōkyō Text Database	329	329
N/A	UTC	UTC sources	240	240

CJK Unified Ideographs Extension H

A block named CJK Unified Ideographs Extension H was added as part of Unicode 15.0 to the Tertiary Ideographic Plane in the range U+31350 through U+323AF, containing 4,192 characters.^[25]

Charts

31350–323AF.

Sources

Note: Some characters appear in more than one source, so the sum of individual character counts (4,541) is greater than the number of encoded characters (4,192).^[21]

Member	Code	Source^[22]	Character count	Total
China	GCA	Culture and Art Publishing House Ideographs (文化艺术出版社用字)	9	1,059
	GCESI	Characters collected by China Electronics Standardization Institute (中国电子技术标准化研究院)	1
	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	298
	GHC	Hanyu Da Cidian (漢語大詞典)	27
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	30
	GLGYJ	Zhuang Liao Songs Research (壮族嘹歌研究)	11
	GPGLG	Zhuang Folk Song Culture Series - Pingguo County Liao Songs (壮族民歌文化丛书•平果嘹歌)	14
	GU	No source (the original source reference has been moved)	1
	GWY	Cultural Heritage Ideographs (文化遗产用字)	5
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	216
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	330
	GZA-1	A Vibrant and Unbroken Transmission—Filial Piety and Zhuang Funeral Songs (生生不息的传承•孝与壮族行孝歌之研究)	6
	GZA-2	Annotated Long Zhuang Morality Songs (壮族伦理道德长诗传扬歌译注)	38
	GZA-3	Compendium of Old Zhuang Folksong Texts—Wooing Songs vol. 1—Liao Songs (壮族民歌古籍集成•情歌（一）嘹歌)	2
	GZA-4	Compendium of Old Zhuang Folksong Texts—Wooing Songs vol. 2—Fwen Nganx (壮族民歌古籍集成•情歌（二）欢𭪤)	11
	GZA-6	Zhuang Proverbs from China (中国壮族谚语)	59
	GZA-7	Ancient Remembrance—Zhuang Creation Myth Songs (远古的追忆•壮族创世神话古歌研究)	1
North Korea	KP1	KPS 10721-2000	1	1
South Korea	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	512	512
Taiwan	T12	CNS 11643 (pending new version) plane 18	7	716
	T13	CNS 11643 (pending new version) plane 19	696
	T4	CNS 11643-1992 plane 4	1
	T6	CNS 11643-1992 plane 6	1
	T7	CNS 11643-1992 plane 7	2
	TB	CNS 11643-2007 plane 11	5
	TC	CNS 11643-2007 plane 12	3
	TE	CNS 11643-2007 plane 14	1
United Kingdom	UK	IRG N2232R	917	917
Vietnam	V0	TCVN 5773:1993	6	931
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	74
	VN	Vietnamese horizontal and vertical extensions	851
*Buddhist canon*	SAT	SAT Daizōkyō Text Database	241	241
N/A	UTC	UTC sources	164	164

CJK Unified Ideographs Extension I

A block named CJK Unified Ideographs Extension I was added as part of Unicode 15.1 to the Supplementary Ideographic Plane in the range U+2EBF0 through U+2EE5F, containing 622 characters.^[26]

Charts

2EBF0–2EE5F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (625) more than the number of encoded characters (622).^[21]

Member	Code	Source^[22]	Character count	Total
China	GIDC23	ID system of the Ministry of Public Security of China, 2023	622	622
Japan	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	1	1
N/A	UTC	UTC sources	2	2

CJK Unified Ideographs Extension J

A block named CJK Unified Ideographs Extension J was added as part of Unicode 17.0 to the Tertiary Ideographic Plane in the range U+323B0 through U+33479, containing 4,298 characters.

Charts

323B0–3347F.

Sources

Note: Some characters appear in more than one source, making the sum of individual character counts (4,406) more than the number of encoded characters (4,298).^[21]

Member	Code	Source^[22]	Character count	Total
China	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	144	1,005
	GKJ	Terms in Sciences and Technologies (科技用字) approved by the China National Committee for Terms in Sciences and Technologies (CNCTST)	567
	GXM	Characters for use in personal names in China from Public Order Administration, Ministry of Public Security of the People's Republic of China	4
	GZ	Ancient Zhuang Character Dictionary (古壮字字典)	290
South Korea	KC	Korean History On-Line (한국 역사 정보 통합 시스템)	178	178
Taiwan	T11	CNS 11643 (pending new version) plane 17	1	937
	T9	CNS 11643 (pending new version) plane 9	59
	TB	CNS 11643-2007 plane 11	69
	TC	CNS 11643-2007 plane 12	165
	TD	CNS 11643-2007 plane 13	241
	TE	CNS 11643-2007 plane 14	396
	TF	CNS 11643-2007 plane 15	6
United Kingdom	UK	IRG N2232R	6	906
	UK	IRG N2487	900
Vietnam	V0	TCVN 5773:1993	16	991
	V1	TCVN 6056:1995	1
	V2	VHN 01-1998	2
	V3	VHN 02-1998	1
	V4	Kho Chữ Hán Nôm Mã Hoá (Hán Nôm Coded Character Repertoire)	58
	VN	Vietnamese horizontal and vertical extensions	913
*Buddhist canon*	SAT	SAT Daizōkyō Text Database	259	260
	SATM	SAT manuscript collection for Buddhist studies	1
N/A	UTC	UTC sources	129	129
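The arithmetic behind the note on Extension J can be reproduced from the per-member totals in the table. This is only a sketch over numbers copied from this article; the variable names are illustrative.

```python
# Per-member totals copied from the Extension J source table.
ext_j_totals = {
    "China": 1005,
    "South Korea": 178,
    "Taiwan": 937,
    "United Kingdom": 906,
    "Vietnam": 991,
    "SAT": 260,
    "UTC": 129,
}
ENCODED = 4298  # characters encoded in Extension J

source_refs = sum(ext_j_totals.values())
extra_refs = source_refs - ENCODED

print(source_refs, extra_refs)  # 4406 108
```

The surplus of 108 counts additional source references, not characters: assuming every character has at least one source, at most 108 characters can have more than one.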

CJK Compatibility Ideographs

The block named CJK Compatibility Ideographs (F900–FAFF) was created to retain round-trip compatibility with other standards.

However, twelve characters in this block actually have the "Unified Ideograph" property: U+FA0E 﨎, U+FA0F 﨏, U+FA11 﨑, U+FA13 﨓, U+FA14 﨔, U+FA1F 﨟, U+FA21 﨡, U+FA23 﨣, U+FA24 﨤, U+FA27 﨧, U+FA28 﨨, and U+FA29 﨩.^[1] None of the other characters in this and other "Compatibility" blocks relate to CJK unification.

While 龜 and 亀 are not considered unifiable, U+FA20 蘒 CJK COMPATIBILITY IDEOGRAPH-FA20 is considered a duplicate of U+8612 蘒 CJK UNIFIED IDEOGRAPH-8612.
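The difference between the twelve unified characters and the true compatibility ideographs in this block can be observed programmatically. Python's `unicodedata` module does not expose the Unified_Ideograph property, so the sketch below uses a proxy, which is an assumption of this example: an assigned code point in F900–FAFF with no decomposition mapping is taken to be one of the unified ones. The result depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

def undecomposed_in_compat_block() -> list:
    """Assigned code points in F900-FAFF that have no decomposition mapping."""
    found = []
    for cp in range(0xF900, 0xFB00):
        ch = chr(cp)
        assigned = unicodedata.category(ch) != "Cn"  # "Cn" means unassigned
        if assigned and not unicodedata.decomposition(ch):
            found.append(cp)
    return found

print(" ".join(f"U+{cp:04X}" for cp in undecomposed_in_compat_block()))

# A true compatibility ideograph is replaced by its unified counterpart
# under canonical normalization; U+FA20 is the duplicate mentioned above.
print(unicodedata.normalize("NFC", "\uFA20") == "\u8612")
```

With a reasonably recent Unicode database, the first line should list the twelve code points named above, and the second should confirm that U+FA20 normalizes to U+8612.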

Charts

F900–FAFF.

Sources

Note: All characters appear in more than one source, so the sum of individual character counts (40) is greater than the number of encoded characters (12).^[21]

Member	Code	Source^[22]	Character count	Total
China	GDM	Place name characters from the Public Order Administration, Ministry of Public Security of the People's Republic of China	1	12
	GU	No source (the original source reference has been moved)	11
Japan	J3	JIS X 0213:2004 Level 3	3	12
	J4	JIS X 0213:2004 Level 4	3
	JA	Japanese IT Vendors Contemporary Ideographs, 1993	1
	JA3	JIS X 0213:2004 level-3 characters replacing JA characters	1
	JMJ	Character Information Development and Maintenance Project for e-Government "MojiJoho-Kiban Project" (文字情報基盤整備事業)	4
Taiwan	TF	CNS 11643-2007 plane 15	1	1
Vietnam	V0	TCVN 5773:1993	3	3
N/A	UTC	UTC sources	12	12

Known issues

Disunification

U+4039

Through Unicode 5.0, the character U+4039 (䀹) unified two different characters, one with the phonetic jiā 夾 and one with the phonetic shǎn 㚒. They are lexically distinct characters with different pronunciations and meanings, and should not have been unified.

The proposal to disunify U+4039^[27] was accepted for Unicode 5.1, which encoded a new character at U+9FC3 (鿃) to represent shǎn.

Three other glyphs in Extension B

In CJK Unified Ideographs Extension B, some characters were incorrectly unified with others. These characters include U+2017B (𠅻), U+204AF (𠒯) and U+24CB2 (𤲲). The first two wrongly unified mainland Chinese and Vietnamese source glyphs, while the last wrongly unified mainland Chinese and Taiwanese source glyphs.^[28]

The glyphs for U+2017B (𠅻) and U+204AF (𠒯) were corrected in version 10.0, and the erroneous UCS2003 source glyph for U+24CB2 (𤲲) was removed in version 13.0.

Unifiable variants and exact duplicates

Also in CJK Unified Ideographs Extension B, hundreds of glyph variants were encoded by mistake.^[29] Additionally, an ISO/IEC JTC 1/SC 2 report has found that six exact duplicates (where the same character has inadvertently been encoded twice) and two semi-duplicates (where the CJK-B character represents a de facto disunification of two glyph forms unified in the corresponding BMP character) were encoded by mistake:^[30]

U+34A8 㒨 = U+20457 𠑗 : U+20457 is the same as the China-source glyph for U+34A8, but it is significantly different from the Taiwan-source glyph for U+34A8
U+3DB7 㶷 = U+2420E 𤈎 : same glyph shapes
U+8641 虁 = U+27144 𧅄 : U+27144 is the same as the Korean-source glyph for U+8641, but it is significantly different from the Chinese Mainland-, Taiwan- and Japan-source glyphs for U+8641
U+204F2 𠓲 = U+23515 𣔕 : same glyph shapes, but ordered under different radicals
U+249BC 𤦼 = U+249E9 𤧩 : same glyph shapes
U+24BD2 𤯒 = U+2A415 𪐕 : same glyph shapes, but ordered under different radicals
U+26842 𦡂 = U+26866 𦡦 : same glyph shapes
U+FA23 﨣 = U+27EAF 𧺯 : same glyph shapes (U+FA23 﨣 is a unified CJK ideograph, despite its name "CJK COMPATIBILITY IDEOGRAPH-FA23")
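Because unified ideographs carry no decomposition mappings, Unicode normalization does not fold these duplicates together, so software that wants to treat them as equal has to supply its own mapping. The sketch below checks this for the eight pairs listed above; the names `DUPLICATE_PAIRS` and `still_distinct` are illustrative.

```python
import unicodedata

# Pairs copied from the list above.
DUPLICATE_PAIRS = [
    (0x34A8, 0x20457),
    (0x3DB7, 0x2420E),
    (0x8641, 0x27144),
    (0x204F2, 0x23515),
    (0x249BC, 0x249E9),
    (0x24BD2, 0x2A415),
    (0x26842, 0x26866),
    (0xFA23, 0x27EAF),
]

def still_distinct(a: int, b: int, form: str = "NFKC") -> bool:
    """True if the two code points remain different after normalization."""
    return unicodedata.normalize(form, chr(a)) != unicodedata.normalize(form, chr(b))

print(all(still_distinct(a, b) for a, b in DUPLICATE_PAIRS))  # True
```

A custom folding table, for example a dictionary built from `DUPLICATE_PAIRS`, would be one way to equate the pairs in a search index, if that behaviour is wanted.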

Other CJK ideographs in Unicode, not Unified

Apart from the eleven blocks of "Unified Ideographs", Unicode has about a dozen more blocks containing CJK characters that are not unified ideographs. These are mainly CJK radicals, strokes, punctuation, marks, symbols and compatibility characters. Although some of these characters have (decomposable) counterparts in other blocks, their usage can differ. One example is U+3007 〇 IDEOGRAPHIC NUMBER ZERO in the CJK Symbols and Punctuation block: although it is not covered under "CJK Unified Ideographs", it is treated as a CJK character for all other intents and purposes.^[31]

Four blocks of compatibility characters are included for compatibility with legacy text handling systems and older character sets:

CJK Compatibility (3300–33FF)
CJK Compatibility Forms (FE30–FE4F)
CJK Compatibility Ideographs (F900–FAFF)
CJK Compatibility Ideographs Supplement (2F800–2FA1F)

They include forms of characters for vertical text layout and rich text characters that Unicode recommends handling through other means. Therefore, their use is discouraged.
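A text-checking tool could flag characters from these four blocks with a simple range lookup. The following is a sketch only, using the ranges listed above; the function name `compat_block` is hypothetical.

```python
from typing import Optional

# Block names and inclusive ranges as listed above.
COMPAT_BLOCKS = [
    ("CJK Compatibility", 0x3300, 0x33FF),
    ("CJK Compatibility Forms", 0xFE30, 0xFE4F),
    ("CJK Compatibility Ideographs", 0xF900, 0xFAFF),
    ("CJK Compatibility Ideographs Supplement", 0x2F800, 0x2FA1F),
]

def compat_block(ch: str) -> Optional[str]:
    """Return the compatibility block containing ch, or None if there is none."""
    cp = ord(ch)
    for name, first, last in COMPAT_BLOCKS:
        if first <= cp <= last:
            return name
    return None

print(compat_block("\u3300"))  # CJK Compatibility
print(compat_block("\u3007"))  # None (CJK Symbols and Punctuation)
```

Note that a block-based check also flags the twelve unified ideographs that live in the CJK Compatibility Ideographs block, so a real tool would probably want to exempt them.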

Font support

The blocks CJK Unified Ideographs and CJK Unified Ideographs Extension A, being parts of the Basic Multilingual Plane, are supported by the majority of CJK fonts. However, Japanese and Korean fonts usually have fewer characters (about 13,000 and 8,000, respectively) than Chinese fonts. Extensions B, C and D are supported by the additional fonts MingLiU-ExtB, MingLiU_HKSCS-ExtB, PMingLiU-ExtB and SimSun-ExtB, which have been included in Microsoft Windows since Vista.^[32]
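Since coverage differs between the Basic Multilingual Plane and the supplementary planes, it can be useful to find which characters of a text lie outside the BMP. The plane number of a code point is its value shifted right by 16 bits. The helper names below are illustrative, and the sketch says nothing about what any particular font actually contains.

```python
# Abbreviations for the planes that hold CJK unified ideographs.
PLANE_NAMES = {0: "BMP", 2: "SIP", 3: "TIP"}

def plane_of(ch: str) -> str:
    """Short name of the Unicode plane containing ch."""
    plane = ord(ch) >> 16
    return PLANE_NAMES.get(plane, f"plane {plane}")

def outside_bmp(text: str) -> list:
    """Characters of text that lie beyond the Basic Multilingual Plane."""
    return [ch for ch in text if ord(ch) > 0xFFFF]

print(plane_of("\u4039"))        # BMP
print(plane_of("\U0002017B"))    # SIP (an Extension B character)
print(plane_of("\U000323B0"))    # TIP (first code point of Extension J)
print(len(outside_bmp("\u4039\U0002017B")))  # 1
```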

Unicode version history

CJK unified ideograph additions per Unicode version
Unicode version	Addition	Plane	Characters added	Total characters
1.0 (1991)	CJK Compatibility Ideographs	Basic Multilingual Plane (BMP)	12	20,914
	CJK Unified Ideographs	BMP	20,902
3.0 (1999)	CJK Unified Ideographs Extension A	BMP	6,582	27,496
3.1 (2001)	CJK Unified Ideographs Extension B	Supplementary Ideographic Plane (SIP)	42,711	70,207
4.1 (2005)	CJK Unified Ideographs: Ideographs from HKSCS-2004 and GB 18030-2000 not in ISO 10646	BMP	22	70,229
5.1 (2008)	CJK Unified Ideographs: Ideographs from Adobe Japan and disunification of U+4039	BMP	8	70,237
5.2 (2009)	CJK Unified Ideographs: Characters from ARIB #47, #95, #93 and HKSCS	BMP	8	74,394
	CJK Unified Ideographs Extension C	SIP	4,149
6.0 (2010)	CJK Unified Ideographs Extension D	SIP	222	74,616
6.1 (2012)	CJK Unified Ideographs: Character corresponding to Adobe-Japan1-6 CID+20156	BMP	1	74,617
8.0 (2015)	CJK Unified Ideographs	BMP	9	80,388
	CJK Unified Ideographs Extension E	SIP	5,762
10.0 (2017)	CJK Unified Ideographs	BMP	21	87,882
	CJK Unified Ideographs Extension F	SIP	7,473
11.0 (2018)	CJK Unified Ideographs	BMP	5	87,887
13.0 (2020)	CJK Unified Ideographs	BMP	13	92,856
	CJK Unified Ideographs Extension A	BMP	10
	CJK Unified Ideographs Extension B	SIP	7
	CJK Unified Ideographs Extension G	Tertiary Ideographic Plane (TIP)	4,939
14.0 (2021)	CJK Unified Ideographs	BMP	3	92,865
	CJK Unified Ideographs Extension B	SIP	2
	CJK Unified Ideographs Extension C	SIP	4
15.0 (2022)	CJK Unified Ideographs Extension C	SIP	1	97,058
	CJK Unified Ideographs Extension H	TIP	4,192
15.1 (2023)	CJK Unified Ideographs Extension I	SIP	622	97,680
17.0 (2025)	CJK Unified Ideographs Extension C	SIP	6	101,996
	CJK Unified Ideographs Extension E	SIP	12
	CJK Unified Ideographs Extension J	TIP	4,298
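The "Total characters" column is a running sum of the "Characters added" column, which can be checked mechanically. The sketch below uses figures copied from the table, with the additions of each version summed per row; the variable names are illustrative.

```python
from itertools import accumulate

# (version, characters added in that version, total stated in the table)
HISTORY = [
    ("1.0", 12 + 20902, 20914),
    ("3.0", 6582, 27496),
    ("3.1", 42711, 70207),
    ("4.1", 22, 70229),
    ("5.1", 8, 70237),
    ("5.2", 8 + 4149, 74394),
    ("6.0", 222, 74616),
    ("6.1", 1, 74617),
    ("8.0", 9 + 5762, 80388),
    ("10.0", 21 + 7473, 87882),
    ("11.0", 5, 87887),
    ("13.0", 13 + 10 + 7 + 4939, 92856),
    ("14.0", 3 + 2 + 4, 92865),
    ("15.0", 1 + 4192, 97058),
    ("15.1", 622, 97680),
    ("17.0", 6 + 12 + 4298, 101996),
]

running = list(accumulate(added for _, added, _ in HISTORY))
mismatches = [ver for (ver, _, total), r in zip(HISTORY, running) if r != total]

print(running[-1], mismatches)  # 101996 []
```

An empty mismatch list means every stated total agrees with the cumulative additions, ending at the 101,996 characters reported for Unicode 17.0.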

Notes

^ Characters presumably intended for Singapore Chinese characters, but apparently an ad hoc collection rather than a Singapore national standard.^[23]

References

[1] "Unicode PropList.txt". 2025-06-30. Retrieved 2025-09-11.
[2] IRG Convenor (2024-12-10). "IRG Experts List". ISO/IEC JTC1/SC2/WG2/IRG N2769.
[3] Lunde, Ken (2024-09-13). "US/Unicode Activity Report for IRG #63 Meeting" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2700.
[4] "Unihan_IRGSources.txt". 2025-07-24. Retrieved 2025-09-12.
[5] "UAX #45: U-source Ideographs". Unicode Consortium. 2025-07-24.
[6] "18.1.7. Han Ideograph Arrangement". The Unicode Standard: Core Specification. Version 16.0.0. Unicode Consortium.
[7] "3.3. Dictionary Indices". Unicode Han Database (Unihan). UAX #38. Three of the dictionary properties represent official IRG indices for the dictionaries used in the four dictionary sorting algorithm. Two (kIRGHanyuDaZidian and kIRGKangXi) are still being used by the IRG, but the other one (kIRGDaeJaweon) is not.
[8] Lunde, Ken (2022-09-01). "Proposal to remove/improve provisional Unihan database properties" (PDF). p. 6. UTC L2/22-188. In addition, the IRG no longer uses this dictionary for its ongoing work.
[9] "kIRG_GSource". Unicode Han Database (Unihan). UAX #38. GKX: Kangxi Dictionary ideographs (康熙字典) 9th edition (1958) including the addendum (康熙字典)補遺. GHZ: Hanyu Dazidian ideographs (漢語大字典).
[10] Lunde, Ken (2018-02-22). "Proposed kIRG_GSource Changes & Corrections" (PDF). UTC L2/18-065; ISO/IEC JTC1/SC2/WG2/IRG N2297.
[11] "2. Text File Data". U-Source Ideographs. Unicode Consortium. UAX #45. A KangXi dictionary index for the ideograph, as described in Unicode Standard Annex #38, "Unicode Han Database (Unihan)" [UAX38]. This field is no longer used and contains no data.
[12] Lunde, Ken (2024-09-30). "Proposal to remove FS (first residual stroke) value from submissions" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2713. This document proposes that the inclusion of first residual stroke (aka FS) values be removed from the submission requirements for new CJK Unified Ideographs […] The ISO/IEC 10646 Project Editor, when compiling an IRG working set into a new CJK Unified Ideographs extension block, uses the FS values to sort ideographs that share the same Radical-Stroke (Radical + SC) value.
[13] Lunde, Ken (2012-09-16). "URO". CJK Type Blog. Adobe Inc.
[14] The Unicode Standard 4.0, Appendix A - Han Unification History
[15] Suzanne Topping, "The secret life of Unicode". Archived from the original on 2007-11-14. Retrieved 2010-05-12.
[16] Lu, Qin (2015-06-08). "The Proposed Hong Kong Character Set" (PDF). ISO/IEC JTC1/SC2/WG2/IRG N2074.
[17] "Chapter 11 - East Asian scripts", The Unicode standard, 4.0.
[18] "Ideographic Variation Database". 2022-09-13. Retrieved 2022-09-20.
[19] "IVD Stats". 2025-07-14. Retrieved 2025-09-12.
[20] PRI 108: Combined registration of the Adobe Japan1 collection and of sequences in that collection
[21] "Unihan_IRGSources.txt (from Unihan.zip)". 2025-07-24. Retrieved 2025-09-12.
[22] "UAX #38: Unicode Han Database (Unihan)". Unicode Consortium.
[23] Lunde, Ken (2009). "Chapter 3: Character Set Standards § Chinese Character Set Standards—Singapore". CJKV information processing (2nd ed.). Sebastopol, Calif.: O'Reilly Media, Inc. p. 130. ISBN 978-0-596-15611-4. OCLC 317878469. To what extent these 226 characters are ad-hoc, or codified by a Singapore national standard, is unknown, at least to me. My suspicion is that they are ad-hoc simply for the apparent lack of any Singapore national standard.
[24] "Unicode 13.0.0". 10 March 2020. Retrieved 10 March 2020.
[25] "Unicode 15.0.0". 13 September 2022. Retrieved 14 September 2022.
[26] "Unicode 15.1.0". 2023-09-12. Retrieved 2023-09-12.
[27] Andrew West and John Jenkins, proposal of disunification of U+4039
[28] Eiso Chan (陈永聪), Comments on four error glyphs on CJK Unified Ideographs Ext B & E.
[29] Taichi Kawabata. "IRGN1155 Possible Duplicates" (.zip). Retrieved 2019-06-22.
[30] Cook, Richard (6 October 2003). "Defect Report on Duplicate Encoded CJK Forms" (PDF). ISO/IEC JTC1/SC2/WG2. Retrieved 2025-08-21.
[31] GB/T 15835-2011《出版物上数字用法》. China Guojia Biaozhun. https://journals.usst.edu.cn/uploadfile/file/GBT%2015835-2011%E3%80%8A%E5%87%BA%E7%89%88%E7%89%A9%E4%B8%8A%E6%95%B0%E5%AD%97%E7%94%A8%E6%B3%95%E3%80%8B.pdf
[32] Lunde, Ken (2009). CJKV Information Processing. O'Reilly. pp. 633–634. ISBN 978-0-596-51447-1.

External links

UK-Source Ideographs (Documents IRG N2107R2 and IRG N2232R)
