Stroke-based sorting

From Wikipedia, the free encyclopedia

Strokes (Pinyin: bǐhuà; Traditional Chinese: 筆畫; Simplified Chinese: 笔画) are the most basic writing units of Chinese characters. Stroke-based sorting, also called stroke-based ordering or stroke-based order, is one of the five sorting methods frequently used in modern Chinese dictionaries, the others being radical-based sorting, pinyin-based sorting, bopomofo and the four-corner method.[1] In addition to functioning as an independent sorting method, stroke-based sorting is often employed to support the other methods.[2] For example, in Xinhua Dictionary (新华字典), Xiandai Hanyu Cidian (现代汉语词典) and the Oxford Chinese Dictionary,[3] stroke-based sorting is used to order homophones within the pinyin-sorted entries, while in radical-based sorting it is used to order the radical list, the characters under a common radical, and the list of characters that are difficult to look up by radical.

In stroke-based sorting, Chinese characters are ordered by various features of their strokes, including stroke count, stroke form, stroke order, stroke combination and stroke position.[4]

Stroke-count sorting

This method arranges characters in ascending order of their stroke counts. A character with fewer strokes is placed before one with more strokes. For example, the distinct characters in "漢字筆畫, 汉字笔画" (Chinese character strokes) are sorted into "汉(5)字(6)画(8)笔(10)[筆(12)畫(12)]漢(14)", where stroke counts are given in parentheses. (Note that 筆 and 畫 both have 12 strokes, so their relative order cannot be determined by stroke-count sorting alone.)
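As a minimal sketch of the idea, the following Python snippet sorts the example characters by stroke count. The `STROKE_COUNTS` table and the function name `sort_by_stroke_count` are illustrative assumptions; the table is hand-entered from the counts quoted above, not taken from any character database.

```python
# Stroke counts copied from the example in the text. A real application
# would load them from a full character database (hypothetical here).
STROKE_COUNTS = {"汉": 5, "字": 6, "画": 8, "笔": 10, "筆": 12, "畫": 12, "漢": 14}


def sort_by_stroke_count(chars):
    """Return the characters sorted by ascending stroke count.

    sorted() is stable, so characters with equal counts keep their input
    order. This mirrors the limitation noted in the text: stroke count
    alone does not decide the order of 筆 and 畫.
    """
    return sorted(chars, key=lambda ch: STROKE_COUNTS[ch])


result = "".join(sort_by_stroke_count("漢字筆畫汉笔画"))
print(result)
```

Because 筆 and 畫 tie at 12 strokes, their relative position in the output depends only on the order in which they were supplied.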

Stroke-count sorting was first used in Zihui, published in 1615, to arrange the radicals and the characters under each radical.[5] It was also used in the Kangxi Chinese Character Dictionary when that dictionary was first compiled in the 1710s.[5]

Stroke-count-stroke-order sorting

This is a combination of stroke-count sorting and stroke-order sorting. Characters are first arranged by stroke count in ascending order. Stroke-order sorting is then used to order characters with the same number of strokes. These characters are first compared by their first strokes according to a fixed order of stroke form groups, such as “heng (横, ㇐), shu (竖, ㇑), pie (撇, ㇓), dian (点, ㇔), zhe (折, ㇕)”, or “dian (点), heng (横), shu (竖), pie (撇), zhe (折)”. If the first strokes of two characters belong to the same group, the characters are compared by their second strokes in the same way, and so on. In the example of the previous section, 筆 and 畫 both have 12 strokes. 筆 starts with the stroke "㇓" of the pie (撇) group and 畫 starts with "㇕" of the zhe (折) group; since pie precedes zhe in the group order, 筆 comes before 畫. Hence the distinct characters in "汉字笔画, 漢字筆畫" are finally sorted into "汉(5)字(6)画(8)笔(10)筆(12)畫(12)漢(14)", where each character has a unique position.
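The two-level comparison can be sketched in Python with a composite sort key. This is an illustration under stated assumptions: the names `GROUP_RANK`, `CHAR_DATA` and `count_then_order_key` are hypothetical, the group order used is heng, shu, pie, dian, zhe, and only the first stroke of each character is recorded, which happens to be enough to break the single tie in this example. A full implementation would store the complete stroke sequence.

```python
# Rank of each stroke form group in the order heng, shu, pie, dian, zhe.
GROUP_RANK = {"heng": 0, "shu": 1, "pie": 2, "dian": 3, "zhe": 4}

# Hand-entered illustrative data: (stroke count, stroke groups in writing order).
# Only the first stroke is listed here, a simplification for this example.
CHAR_DATA = {
    "汉": (5, ("dian",)),
    "字": (6, ("dian",)),
    "画": (8, ("heng",)),
    "笔": (10, ("pie",)),
    "筆": (12, ("pie",)),
    "畫": (12, ("zhe",)),
    "漢": (14, ("dian",)),
}


def count_then_order_key(ch):
    """Sort key: stroke count first, then the sequence of group ranks."""
    count, groups = CHAR_DATA[ch]
    return (count, tuple(GROUP_RANK[g] for g in groups))


result = "".join(sorted("畫筆漢汉画笔字", key=count_then_order_key))
print(result)
```

Python compares tuples element by element, so the stroke sequence is consulted only when the stroke counts are equal, which is the behaviour described above.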

Stroke-count-stroke-order sorting was used in Xinhua Dictionary and Xiandai Hanyu Cidian before the national standard for stroke-based sorting was released in 1999.

GB stroke-based order

The Standard of GB13000.1 Character Set Chinese Character Order (Stroke-Based Order) (GB13000.1字符集汉字字序(笔画序)规范)[6] is a standard for sorting Chinese characters by strokes, released by the National Language Commission of China in 1999. It is an enhanced version of the traditional stroke-count-stroke-order sorting.

According to this standard, two characters are first sorted by stroke count. If they have the same stroke count, they are sorted by stroke order (using the five families heng, shu, pie, dian and zhe). If they also have the same stroke order, they are sorted by the primary-secondary stroke order. For example, 子 and 孑 have the same five-family stroke order (㇐ and ㇀ both belong to the heng family), but under the primary-secondary rule the primary stroke ㇐ precedes the secondary stroke ㇀, so 子 comes before 孑. If two characters have the same stroke count, stroke order and primary-secondary strokes, they are sorted by their mode of stroke combination: stroke separation comes before stroke connection, and connection comes before stroke intersection. For example, 八 comes before 人, and 人 comes before 乂. The standard contains further rules for cases that these criteria do not resolve.
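The cascade of criteria described above can be sketched as a four-part sort key. This is not the standard's own algorithm, only an illustration of the comparison levels named in this section. The `StrokeInfo` type, the numeric codes and the `CHARS` table are assumptions: stroke families are coded 1 to 5 for heng, shu, pie, dian, zhe; the family sequences are hand-entered; and the primary/secondary flags are placeholders except for the last stroke of 子 and 孑, which is the only difference the text describes.

```python
from typing import NamedTuple, Tuple

# Combination modes in the order given in the text.
SEPARATION, CONNECTION, INTERSECTION = 0, 1, 2


class StrokeInfo(NamedTuple):
    count: int                  # level 1: number of strokes
    order: Tuple[int, ...]      # level 2: stroke families in writing order
    secondary: Tuple[int, ...]  # level 3: 0 = primary stroke, 1 = secondary
    mode: int                   # level 4: how the strokes combine


# Hand-entered illustrative data (assumed, not taken from the standard).
CHARS = {
    "八": StrokeInfo(2, (3, 4), (0, 0), SEPARATION),
    "人": StrokeInfo(2, (3, 4), (0, 0), CONNECTION),
    "乂": StrokeInfo(2, (3, 4), (0, 0), INTERSECTION),
    "子": StrokeInfo(3, (5, 2, 1), (0, 0, 0), CONNECTION),  # ends with primary ㇐
    "孑": StrokeInfo(3, (5, 2, 1), (0, 0, 1), CONNECTION),  # ends with secondary ㇀
}


def gb_style_key(ch):
    """A NamedTuple is a tuple, so fields are compared in declaration order."""
    return CHARS[ch]


result = "".join(sorted("乂孑人子八", key=gb_style_key))
print(result)
```

Each later field is consulted only when all earlier fields are equal, so 八, 人 and 乂 are separated by combination mode, while 子 and 孑 are separated one level earlier by the primary-secondary flag.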

This standard has been adopted by the new editions of Xinhua Dictionary[7] and Xiandai Hanyu Cidian.[8]

YES sorting

Handbook of the YES Stroke-Order Sorting for Chinese Characters

YES is a simplified stroke-based sorting method that requires no stroke counting or grouping, without compromising accuracy. In brief, YES arranges Chinese characters according to their stroke orders and an "alphabet" of 30 strokes:

㇐ ㇕ ㇅ ㇎ ㇡ ㇋ ㇊ ㇍ ㇈ ㇆ ㇇ ㇌  ㇀ ㇑ ㇗ ㇞ ㇉ ㄣ ㇙ ㇄ ㇟ ㇚ ㇓ ㇜ ㇛ ㇢ ㇔ ㇏ ㇂ 

This alphabet is built on the basis of Unicode CJK strokes.[9][10] The YES order of the distinct characters in "汉字笔画, 漢字筆畫" is "画畫筆笔字漢汉", where each character has a unique position.
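The comparison can be sketched as a plain lexicographic sort over stroke sequences, much like sorting words by their letters. The snippet below is an illustration under stated assumptions, not the method's reference implementation: `YES_ALPHABET` copies the strokes exactly as they are listed above, `STROKE_PREFIX` holds hand-entered stroke prefixes that are just long enough to separate these characters, and 字 is left out because its third stroke is not among the strokes listed above.

```python
# Stroke alphabet copied from the list in the text, in the order given there.
YES_ALPHABET = (
    "㇐ ㇕ ㇅ ㇎ ㇡ ㇋ ㇊ ㇍ ㇈ ㇆ ㇇ ㇌ ㇀ ㇑ ㇗ ㇞ ㇉ ㄣ ㇙ ㇄ ㇟ ㇚ ㇓ ㇜ ㇛ ㇢ ㇔ ㇏ ㇂"
).split()
RANK = {stroke: i for i, stroke in enumerate(YES_ALPHABET)}

# Hand-entered stroke prefixes (assumed for illustration, not full sequences).
STROKE_PREFIX = {
    "画": "㇐",
    "畫": "㇕",
    "筆": "㇓㇐㇔㇓㇐㇔㇕",  # bamboo top, then the first stroke of 聿
    "笔": "㇓㇐㇔㇓㇐㇔㇓",  # bamboo top, then the first stroke of 毛
    "漢": "㇔㇔㇀㇐",
    "汉": "㇔㇔㇀㇇",
}


def yes_style_key(ch):
    """Sort key: the rank of each stroke in the alphabet, in writing order."""
    return tuple(RANK[s] for s in STROKE_PREFIX[ch])


result = "".join(sorted("汉笔画漢筆畫", key=yes_style_key))
print(result)
```

No stroke count appears in the key: 画 (8 strokes) precedes 汉 (5 strokes) simply because ㇐ comes before ㇔ in the alphabet.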

YES sorting has been applied to the indexing of all the characters in Xinhua Zidian and Xiandai Hanyu Cidian.[10]

See also

References

  1. ^ Su, Peicheng (苏培成) (2014). 现代汉字学纲要 (Essentials of Modern Chinese Characters) (in Chinese) (3rd ed.). Beijing: Commercial Press. pp. 189–207. ISBN 978-7-100-10440-1.
  2. ^ Wang, Ning (王寧,鄒曉麗) (2003). 工具書 (Reference Books) (in Chinese). Hong Kong: 和平圖書有限公司. pp. 23–25. ISBN 962-238-363-7.
  3. ^ Kleeman, Julie (and Harry Yu) (2010). Oxford Chinese Dictionary (牛津英漢-漢英詞典). Oxford: Oxford University Press. ISBN 978-0-19-920761-9.
  4. ^ Su 2014, pp. 205–207.
  5. ^ a b Su 2014, p. 187.
  6. ^ National Language Commission of China (October 1, 1999). GB13000.1字符集汉字字序(笔画序)规范 (Standard of GB13000.1 Character Set Chinese Character Order (Stroke-Based Order)) (PDF) (in Chinese). Shanghai Education Press. ISBN 7-5320-6674-6.
  7. ^ Language Institute, Chinese Academy of Social Sciences (2020). 新华字典 (Xinhua Dictionary) (in Chinese) (12th ed.). Beijing: Commercial Press. ISBN 978-7-100-17093-2.
  8. ^ Language Institute, Chinese Academy of Social Sciences (2016). 现代汉语词典 (Modern Chinese Dictionary) (in Chinese) (7th ed.). Beijing: Commercial Press. ISBN 978-7-100-12450-8.
  9. ^ "Unicode CJK Strokes" (PDF). The Unicode Standard. Retrieved 2023-06-21.
  10. ^ a b Zhang, Xiaoheng et al. (张小衡, 李笑通) (2013). 一二三笔顺检字手册 (Handbook of the YES Sorting Method) (in Chinese). Beijing: 语文出版社 (The Language Press). ISBN 978-7-80241-670-3.