clt-international

ISSN: 2708-9517

Home Journal Index 2024-2

百宝箱语素字词典

夏诠真

 

摘要

直到目前,汉语只有字被编码。无论是手机书写或是网上阅读,所有的文章,全部是以 unicode(国际统一码)为内存。Unicode 是字形的代码,然而,文章其实是句的组合,句是词的组合,词是最小的理解单位。以 unicode 去记忆汉语文章是间接方式,unicode 不带义,所以无论是人阅读或是机器翻译都会被误解或曲解。百宝箱的思路是改用概念码(词的代号)来记忆文章,概念码是用语素定义,于是文字变得非常清晰准确,可以帮助汉语进入人工智能领域。本文详细解释语素码的设计。

 

关键词

统一码,概念码,中性码,语素码

 

The Morpheme Dictionary of the Chinese Toolbox


Qianzhen Xia


Abstract

To date, only the individual characters of Chinese have been encoded. Whether written on a mobile phone or read online, every text is stored as Unicode (the international unified code). Unicode, however, is a code for character forms, whereas a text is a combination of sentences, a sentence is a combination of words, and the word is the smallest unit of comprehension. Storing Chinese text as Unicode is therefore an indirect method: Unicode carries no meaning, so a text is liable to be misunderstood or distorted whether it is read by a person or translated by a machine. The approach of the Chinese Language Toolbox project is to store text using conceptual codes (codes that stand for words) instead of Unicode. Each conceptual code is defined in terms of morphemes, which makes the text clear and precise and can help the Chinese language enter the field of artificial intelligence. This paper explains the design of the morpheme code in detail.

 

Keywords

Unicode, conceptual code, neutral code, morpheme code
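
The contrast drawn in the abstract, between storing character forms and storing word-level concepts, can be illustrated with a minimal Python sketch. The word 东西 is used because it can mean either "east and west" or "thing", yet both readings are stored as the same two Unicode code points. The concept-code identifiers (`C0001`, `C0002`), the `Concept` record, and the `render` helper are hypothetical names introduced only for illustration; they are not the project's actual code format, and the way morphemes define each code is not described in this abstract, so it is left out of the sketch.

```python
# Illustrative sketch only: the concept-code IDs and the table layout are
# hypothetical and are NOT taken from the Chinese Language Toolbox project.
from dataclasses import dataclass


def code_points(text: str) -> list[str]:
    """Return the Unicode code point of each character in `text`."""
    return [f"U+{ord(ch):04X}" for ch in text]


@dataclass(frozen=True)
class Concept:
    code: str   # hypothetical concept code (one per word sense)
    word: str   # written form of the word
    gloss: str  # English gloss of this sense


# Two senses of the same written word, each with its own (made-up) code.
CONCEPTS = {
    "C0001": Concept("C0001", "东西", "east and west"),
    "C0002": Concept("C0002", "东西", "thing, object"),
}


def render(codes: list[str]) -> str:
    """Turn a sequence of concept codes back into written characters."""
    return "".join(CONCEPTS[c].word for c in codes)


if __name__ == "__main__":
    # Character-level storage: both senses give identical code points.
    print(code_points("东西"))  # ['U+4E1C', 'U+897F']

    # Concept-level storage: the written form is the same, the stored code differs.
    for code in ("C0001", "C0002"):
        print(code, render([code]), "=", CONCEPTS[code].gloss)
```

In this toy table the two entries render to identical characters, so the distinction between the senses survives only in the stored code, which is the property the abstract attributes to conceptual codes.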