ISSN 2708-9517
夏诠真
摘要
直到目前,汉语只有字被编码。无论是手机书写或是网上阅读,所有的文章,全部是以 unicode(国际统一码)为内存。Unicode 是字形的代码,然而,文章其实是句的组合,句是词的组合,词是最小的理解单位。以 unicode 去记忆汉语文章是间接方式,unicode 不带义,所以无论是人阅读或是机器翻译都会被误解或曲解。百宝箱的思路是改用概念码(词的代号)来记忆文章,概念码是用语素定义,于是文字变得非常清晰准确,可以帮助汉语进入人工智能领域。本文详细解释语素码的设计。
关键词
统一码,概念码,中性码,语素码
The Morpheme Dictionary of the Chinese Toolbox
Qianzhen Xia
Abstract
To date, only the individual characters of Chinese have been encoded. Whether written on a mobile phone or read online, every text is stored in memory as Unicode. Unicode encodes the written form of characters; a text, however, is a combination of sentences, a sentence is a combination of words, and the word is the smallest unit of understanding. Storing Chinese text as Unicode is therefore an indirect method: because Unicode carries no meaning, a text can be misunderstood or distorted whether it is read by a person or translated by a machine. The approach of the Chinese Toolbox is to store texts with conceptual codes (codes that stand for words) instead. Conceptual codes are defined by morphemes, which makes the text clear and precise and can help the Chinese language enter the field of artificial intelligence. This paper explains the design of the morpheme code in detail.
Keywords
Unicode, conceptual code, neutral code, morpheme code
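
The contrast drawn in the abstract between character-level and word-level storage can be illustrated with a minimal sketch. The code-point listing uses only Python's built-in `ord`; the concept codes (`C0001` and so on), the tiny lexicon, and the function names are hypothetical, invented for this illustration, and are not the codes or the design of the Chinese Toolbox. The sketch also assumes the text has already been segmented into words.

```python
# Illustration only: the concept codes and lexicon below are made up for this
# sketch and are NOT taken from the Chinese Toolbox.

def code_points(text: str) -> list[str]:
    """Return the Unicode code point of each character as a U+XXXX string."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Toy word-level lexicon (hypothetical): one invented code per word.
CONCEPT_CODES = {
    "东西": "C0001",  # the word "thing"
    "东": "C0002",    # the word "east"
    "西": "C0003",    # the word "west"
}


def concept_encode(words: list[str]) -> list[str]:
    """Encode an already-segmented list of words as (hypothetical) concept codes."""
    return [CONCEPT_CODES[w] for w in words]


if __name__ == "__main__":
    # Character-level storage is the same whichever reading is intended.
    print(code_points("东西"))
    # Word-level storage differs depending on how the text is segmented.
    print(concept_encode(["东西"]))      # read as one word, "thing"
    print(concept_encode(["东", "西"]))  # read as two words, "east" and "west"
```

In this toy setting the two characters yield the same pair of code points under either reading, while the word-level encoding records which reading was meant. How the actual morpheme codes are assigned is the subject of the paper itself and is not reproduced here.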