By Kam-Fai Wong, Wenji Li, Ruifeng Xu, Zheng-sheng Zhang, Graeme Hirst
This book introduces Chinese language-processing issues and techniques to readers who already have a basic background in natural language processing (NLP). Since the major difference between Chinese and Western languages is at the word level, the book primarily focuses on Chinese morphological analysis and introduces the concept, structure, and interword semantics of Chinese words. The following topics are covered: a general introduction to Chinese NLP; Chinese characters, morphemes, and words and the characteristics of Chinese words that have to be considered in NLP applications; Chinese word segmentation; unknown word detection; word meaning and Chinese linguistic resources; interword semantics based on word collocation and NLP techniques for collocation extraction.
Table of Contents: Introduction / Words in Chinese / Challenges in Chinese Morphological Processing / Chinese Word Segmentation / Unknown Word Identification / Word Meaning / Chinese Collocations / Automatic Chinese Collocation Extraction / Appendix / References / Author Biographies
Read or Download Introduction to Chinese Natural Language Processing PDF
Similar AI & machine learning books
Artificial Intelligence through Prolog book
As a pioneer in computational linguistics, working in the earliest days of language processing by computer, Margaret Masterman believed that meaning, not grammar, was the key to understanding languages, and that machines could determine the meaning of sentences. This volume brings together Masterman's groundbreaking papers for the first time, demonstrating the importance of her work in the philosophy of science and the nature of iconic languages.
This study explores the design and application of natural language text-based processing systems, based on generative linguistics, empirical corpus analysis, and artificial neural networks. It emphasizes the practical tools to accommodate the chosen approach
Extra resources for Introduction to Chinese Natural Language Processing
To like. This is a case of lexical ambiguity due to 好 being a homograph. The process to resolve such lexical ambiguity is commonly referred to as word sense disambiguation (WSD). Structural ambiguity results from having more than one way to analyze a complex linguistic unit. Due to the need to segment character strings into words in Chinese, the possibility of multiple segmentation arises. At the most basic level, a character string can exhibit overlapping ambiguity, combinatorial ambiguity, or a mixture of both.
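The two basic ambiguity types can be illustrated with a minimal sketch that enumerates every way a character string can be split into lexicon words. The toy lexicons and the example strings 结合成 and 才能 are assumptions chosen for illustration, not taken from the excerpt, and the function name `segmentations` is hypothetical. Overlapping ambiguity is taken here to mean a string ABC in which both AB and BC are words; combinatorial ambiguity, a string AB in which A, B, and AB are all words.

```python
def segmentations(text, lexicon):
    """Return every split of `text` into words found in `lexicon`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


# Overlapping ambiguity (assumed example): 结合 and 合成 share the character 合.
overlapping = segmentations("结合成", {"结", "成", "结合", "合成"})

# Combinatorial ambiguity (assumed example): 才, 能, and 才能 are all words.
combinatorial = segmentations("才能", {"才", "能", "才能"})

for words in overlapping + combinatorial:
    print("/".join(words))
```

A string is ambiguous for segmentation purposes whenever this enumeration returns more than one result. Exhaustive enumeration grows quickly with string length, so it is suitable only for short illustrative inputs.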
6 Regional Variation
Despite outward appearances, Chinese is, in fact, a family of languages including Mandarin, Cantonese, Shanghai, Min, Hakka, etc. These languages, popularly known as dialects, are often mutually unintelligible when spoken mainly due to great differences in pronunciation. They do, however, share the same writing system, which makes communication between the dialect speakers possible. Although written Chinese (which is the main concern of the present volume) does share greater similarity than the spoken varieties, differences do exist in vocabulary and grammar between the dialects.
Thus, the number 300,000 can be written simply as 300000.
Date and address format. In Chinese, both date and address formats follow the “large to small” principle, namely, the larger units preceding the smaller ones. Thus, June 3, 2009, is written as 二零零九年六月三日 (2009, June 3rd), and #35 Heping Lane, Chaoyang District, Beijing, China, is rendered as 中国北京朝阳区和平里35号 (China, Beijing, Chaoyang District, Heping Lane, #35).
Percentage. Instead of N%, Chinese uses the format of %N.
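The “large to small” date format can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a complete formatter: the helper names `small_number` and `chinese_date` are hypothetical, the year is spelled digit by digit as in the example above, and months and days use ordinary Chinese numerals up to 99.

```python
from datetime import date

DIGITS = "零一二三四五六七八九"


def small_number(n: int) -> str:
    """Chinese numeral for 1..99, enough for months and days."""
    if not 1 <= n <= 99:
        raise ValueError("only 1..99 is supported in this sketch")
    tens, ones = divmod(n, 10)
    if tens == 0:
        return DIGITS[ones]
    # 10-19 are written 十, 十一, ...; 20 and above take a leading digit.
    prefix = "十" if tens == 1 else DIGITS[tens] + "十"
    return prefix + (DIGITS[ones] if ones else "")


def chinese_date(d: date) -> str:
    """Format a date as year, month, day (largest unit first)."""
    year = "".join(DIGITS[int(c)] for c in str(d.year))
    return f"{year}年{small_number(d.month)}月{small_number(d.day)}日"


print(chinese_date(date(2009, 6, 3)))  # 二零零九年六月三日
```

The same ordering applies to addresses, where country, city, district, street, and number would be concatenated from the largest unit to the smallest.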