Academic literature on the topic 'Chinese language Technical Chinese Data processing'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Chinese language Technical Chinese Data processing.'

Journal articles on the topic "Chinese language Technical Chinese Data processing"

1

Cheng, Xian Yi, Wei Kang, and Yu Guo. "An Algorithm of Network Sensitive Information Features Extracting." Applied Mechanics and Materials 556-562 (May 2014): 3558–61. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.3558.

Abstract:
In the vast ocean of data, how to keep the best and discard the dross has become a major issue of the Internet age. It poses a great challenge for data processing and is key to developing the national network economy. To address the time lag, low accuracy, and poor self-adaptability of sensitive information filtering, this study takes Chinese Internet text media (webpages, micro-blogs, forums, etc.) as its object. Using opinion mining and natural language processing, it studies a sensitive information feature extraction algorithm that reveals the interrelationship between sensitive information and the sensitive dictionary, providing technical support for the sensitive dictionary and for sensitive information recognition.
2

ZHANG, WEN, TAKETOSHI YOSHIDA, and XIJIN TANG. "DISTRIBUTION OF MULTI-WORDS IN CHINESE AND ENGLISH DOCUMENTS." International Journal of Information Technology & Decision Making 08, no. 02 (June 2009): 249–65. http://dx.doi.org/10.1142/s0219622009003399.

Abstract:
As a hybrid of N-gram in natural language processing and collocation in statistical linguistics, multi-word is becoming a hot topic in area of text mining and information retrieval. In this paper, a study concerning distribution of multi-words is carried out to explore a theoretical basis for probabilistic term-weighting scheme. Specifically, the Poisson distribution, zero-inflated binomial distribution, and G-distribution are comparatively studied on a task of predicting probabilities of multi-words' occurrences using these distributions, for both technical multi-words and nontechnical multi-words. In addition, a rule-based multi-word extraction algorithm is proposed to extract multi-words from texts based on words' occurring patterns and syntactical structures. Our experimental results demonstrate that G-distribution has the best capability to predict probabilities of frequency of multi-words' occurrence and the Poisson distribution is comparable to zero-inflated binomial distribution in estimation of multi-word distribution. The outcome of this study validates that burstiness is a universal phenomenon in linguistic count data, which is applicable not only for individual content words but also for multi-words.
3

Shi, Lijuan, Ang Li, and Lei Zhang. "Sustainable Fault Diagnosis of Imbalanced Text Mining for CTCS-3 Data Preprocessing." Sustainability 13, no. 4 (February 17, 2021): 2155. http://dx.doi.org/10.3390/su13042155.

Abstract:
At present, the method for fault diagnosis and maintenance of the CTCS-3 (Chinese Train Control System Level 3) electronic equipment relies too heavily on expert knowledge. Moreover, the use of historical fault data is not valued. This paper proposes a sustainable fault diagnosis model based on imbalanced text mining. First, to process fault data from the field recorded in natural language, natural language processing technology is used to extract fault feature words. Then, a term frequency-inverse document frequency model is used to transform the fault feature words extracted from the database into vectors. It is worth noting that imbalance in the fault samples affects the accuracy of this sustainable fault diagnosis model. To solve this problem, we use the borderline-synthetic minority over-sampling technique in the step of predicting train fault components. We also use the backpropagation neural network we proposed and the naive Bayesian model, which is commonly used as a classification model, to compare the prediction accuracy of these two algorithms. The experimental results are good, which shows that the fault diagnosis method using the backpropagation neural network can further assist engineers in completing timely repair and maintenance work. The research in this paper has played a very important role in technical support for intelligent train dispatching and command, and will also play a positive role in technical support for the automatic operation of urban rail transit under the prevention and control of the new coronavirus.
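
The term frequency-inverse document frequency step mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `tfidf`, the unsmoothed weighting `tf * ln(N / df)`, and the example fault words are all assumptions chosen for illustration, and the input is assumed to be already segmented into feature words.

```python
import math
from collections import Counter

def tfidf(docs):
    """Turn tokenized documents into TF-IDF vectors.

    docs: list of non-empty token lists (hypothetical, already segmented).
    Returns (vocab, vectors), with one vector per document in vocab order.
    """
    vocab = sorted({t for d in docs for t in d})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        # Relative term frequency times inverse document frequency.
        vectors.append([tf[t] / len(d) * idf[t] for t in vocab])
    return vocab, vectors

# Hypothetical fault records, reduced to feature words.
docs = [["radio", "timeout"], ["balise", "timeout"], ["balise", "loss"]]
vocab, vectors = tfidf(docs)
```

Words that occur in few records receive a higher weight than words shared by many records, which is why such vectors are a common input to a downstream classifier.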
4

Lingling, Wu, and Chen Fuli. "Role of AI Technology in Brand Building of Chinese Higher Education Institution – Thought Based on Integrated Marketing Communication." Marketing and Digital Technologies 5, no. 2 (June 29, 2021): 7–13. http://dx.doi.org/10.15276/mdt.5.2.2021.1.

Abstract:
As the competitions among higher education institutions (HEIs) intensify, brand building has gradually become an important means for HEIs to build their images and enhance their competitiveness. For HEIs, the significance of integrated marketing communication lies in the integration of brand image communication content, communication channel and communication process. At present, the influence of traditional communication channels declines, the influence of self-established media is limited, and the negative information is not monitored well. Under such circumstances, AI technology can provide technical support for integrated marketing communications of HEI brand. In terms of communication content, VR/AR, UAV, interactive games and chatbot are mainly applied. In the aspect of communication channels, the data mining technique is mainly used to achieve differentiated communication, and the big data analysis technique is adopted to integrate brand image information communication channels. With regard to negative information monitoring, the natural language processing technology can provide high-efficiency, full-coverage and round-the-clock negative information monitoring.
5

He, Lanfei, Xuefei Zhang, Zhiwei Li, Peng Xiao, Ziming Wei, Xu Cheng, and Shaocheng Qu. "A Chinese Named Entity Recognition Model of Maintenance Records for Power Primary Equipment Based on Progressive Multitype Feature Fusion." Complexity 2022 (February 7, 2022): 1–11. http://dx.doi.org/10.1155/2022/8114217.

Abstract:
Presently, the State Grid Corporation of China has accumulated a large amount of maintenance records for power primary equipment. Unfortunately, most of these records are unstructured data, which makes them difficult to analyze and utilize. The emergence of natural language processing technology and deep learning methods provides a solution for unstructured text data. This paper proposes a progressive multitype feature fusion model to recognize Chinese named entities in unstructured maintenance records for power primary equipment. Firstly, the textual characteristics and word separation difficulties of maintenance records are analyzed, then 7 main entity categories of power technical terms from unstructured maintenance records are chosen, and 3452 maintenance records are labeled with these categories, forming the so-called EPE-MR training dataset. Secondly, the standard test reports, standard maintenance, and fault analysis reports for three types of power primary equipment (namely, main transformer, circuit breaker, and isolating switch) are employed as a corpus to train character embeddings that capture word representations for maintenance records. After that, a progressive multilevel radical feature extraction module is designed to get detailed and fine semantic information in a hierarchical manner. Further, the radical feature representation and character embedding are concatenated and sent to a BiLSTM module to extract contextual information in order to improve Chinese entity recognition ability. Moreover, a CRF is introduced to handle the dependencies among prediction labels and to output the optimal prediction sequence, from which structured data of maintenance records can easily be obtained. Finally, comparative experiments on the public MSRA dataset, the China People’s Daily corpus, and the EPE-MR dataset are carried out, which show the effectiveness of the proposed method.
6

Chen, Zi Li. "Research and Application of Clustering Algorithm for Text Big Data." Computational Intelligence and Neuroscience 2022 (June 8, 2022): 1–8. http://dx.doi.org/10.1155/2022/7042778.

Abstract:
In the era of big data, text as an information store is very important in all walks of life. From humanities research to government decision-making, from precision medicine to quantitative finance, from customer management to marketing, massive text, as one of the most important information carriers, plays an important role everywhere. The text data generated in these practical problems of humanities research, the financial industry, marketing, and other fields often has obvious domain characteristics, often contains the professional vocabulary and unique language patterns of these fields, and is often accompanied by a variety of “noise.” Dealing with such texts is a great challenge under current technical conditions, especially for Chinese texts. Clustering algorithms provide a better solution for processing text big data. The clustering algorithm is the core of cluster analysis. The K-means algorithm is widely used in cluster analysis because its implementation principle is simple and its time complexity is low, but its K value needs to be preset, and random selection of the initial cluster centers can lead to a local optimum. Other clustering algorithms, such as mean shift clustering, are used alongside K-means clustering in mining text big data. In view of the problems of the above algorithms, this paper first extracts and analyzes the text big data and then runs experiments with the clustering algorithms. Experimental conclusion: when analyzing large-scale text data, limited here to large and simple data sets, the traditional K-means algorithm has low efficiency and reduced accuracy, and it is susceptible to the influence of the initial centers and abnormal data. To address these problems, the K-means cluster analysis algorithm is analyzed and improved to raise its execution efficiency and accuracy on data sets with large data volumes.
Mean shift clustering can be regarded as making many random centers move towards the direction of maximum density gradually, that is, moving their mean centroid continuously according to the probability density of data and finally obtaining multiple maximum density centers. It can also be said that mean shift clustering is a kernel density estimation algorithm.
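
The mean shift description above can be sketched in a few lines. This is a hedged illustration, not the paper's code: the function names `mean_shift` and `merge_modes`, the Gaussian kernel, the bandwidth, and the toy points are assumptions, and the sketch seeds one center at every data point rather than at random positions.

```python
import math

def mean_shift(points, bandwidth, n_iter=50, tol=1e-6):
    """Move a copy of every point uphill toward a local density maximum."""
    centers = [list(p) for p in points]
    for _ in range(n_iter):
        max_move = 0.0
        for i, c in enumerate(centers):
            # Gaussian kernel weight of each data point relative to this center.
            weights = [
                math.exp(-sum((a - b) ** 2 for a, b in zip(c, p)) / (2 * bandwidth ** 2))
                for p in points
            ]
            total = sum(weights)
            # The new center is the kernel-weighted mean of the data.
            new_c = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(len(c))
            ]
            max_move = max(max_move, math.dist(c, new_c))
            centers[i] = new_c
        if max_move < tol:
            break
    return centers

def merge_modes(centers, radius):
    """Collapse converged centers that lie within `radius` of each other."""
    modes = []
    for c in centers:
        if all(math.dist(c, m) > radius for m in modes):
            modes.append(c)
    return modes

# Toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
modes = merge_modes(mean_shift(points, bandwidth=1.0), radius=0.5)
```

Unlike K-means, the number of clusters is not preset here; it falls out of how many distinct density maxima the centers converge to, which in turn depends on the chosen bandwidth.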
7

Kamal, Suhail Muhammad, Yidong Chen, Shaozi Li, Xiaodong Shi, and Jiangbin Zheng. "Technical Approaches to Chinese Sign Language Processing: A Review." IEEE Access 7 (2019): 96926–35. http://dx.doi.org/10.1109/access.2019.2929174.

8

Semenov, Kirill I., Armine K. Titizian, Aleksandra O. Piskunova, Yulia O. Korotkova, Alena D. Tsvetkova, Elena A. Volf, Alexandra S. Konovalova, and Yulia N. Kuznetsova. "Linguistic Annotation of Translated Chinese Texts: Coordinating Theory, Algorithms and Data." Journal of Linguistics/Jazykovedný casopis 72, no. 2 (December 1, 2021): 590–602. http://dx.doi.org/10.2478/jazcas-2021-0054.

Abstract:
The article tackles the problems of linguistic annotation in the Chinese texts presented in the Ruzhcorp – Russian-Chinese Parallel Corpus of RNC, and the ways to solve them. Particular attention is paid to the processing of Russian loanwords. On the one hand, we present the theoretical comparison of the widespread standards of Chinese text processing. On the other hand, we describe our experiments in three fields: word segmentation, grapheme-to-phoneme conversion, and PoS-tagging, on the specific corpus data that contains many transliterations and loanwords. As a result, we propose the preprocessing pipeline of the Chinese texts that will be implemented in Ruzhcorp.
9

Xu, Yi. "Processing relative clauses in Chinese as a second language." Second Language Research 30, no. 4 (July 8, 2014): 439–61. http://dx.doi.org/10.1177/0267658313511485.

Abstract:
This project investigates second language (L2) learners’ processing of four types of Chinese relative clauses crossing extraction types and demonstrative-classifier (DCl) positions. Using a word order judgment task with a whole-sentence reading technique, the study also discusses how psycholinguistic theories bear explanatory power in L2 data. An overall preference for DCl-first structures and an advantage of DCl-subject relative clauses over the other three structures were found. Results were largely compatible with the filler-gap domain theory and indicated a weak subject-gap advantage. These motivations are subject to influences from other factors, and a multi-constraint proposal was proposed.
10

Lu, Cailing, Frank Boers, and Averil Coxhead. "Exploring learners’ understanding of technical vocabulary in Traditional Chinese Medicine." Studies in Second Language Learning and Teaching 11, no. 1 (March 29, 2021): 71–101. http://dx.doi.org/10.14746/ssllt.2021.11.1.4.

Abstract:
This study explores English for specific purposes learners’ understanding of technical words in a previously-developed technical word list in Traditional Chinese Medicine (TCM). The principal aim was to estimate what kind of technical terms pose problems to TCM learners and might therefore merit special attention in instruction. Of particular interest was the question whether there is a divergence in the understanding of technical vocabulary in TCM between Chinese and Western background learners. To achieve these aims, a combination of word association tasks and retrospective interviews was implemented with 11 Chinese and 10 Western background TCM learners. The data showed that both Chinese and Western learners encountered certain difficulties in understanding technical vocabulary in their study. However, their sources of difficulty were different. Comparisons of typical word associations between Chinese and Western learners indicated that there was a degree of divergence in the way these two participant groups understood TCM terms.

Dissertations / Theses on the topic "Chinese language Technical Chinese Data processing"

1

洪進德 and Chun-tak Hung. "Chinese workbench: an integrated environment for Chinese writers." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1992. http://hub.hku.hk/bib/B31210314.

2

Yiu, Lai Kuen Candy. "Chinese character synthesis : towards universal Chinese information exchange." HKBU Institutional Repository, 2003. http://repository.hkbu.edu.hk/etd_ra/477.

3

余銘龍 and Ming-lung Yu. "Automatic processing of Chinese language bank cheques." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B31225548.

4

羅憲璋 and Hin-cheung Hubert Law. "A language model for mandarin Chinese." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1997. http://hub.hku.hk/bib/B29913391.

5

Lee, Chi-yin. "A pure orthographic stage in processing Chinese characters: evidence from data of sub-morphemic processing in preschool children." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2003. http://lookup.lib.hku.hk/lookup/bib/B38888919.

Abstract:
Thesis (B.Sc.)--University of Hong Kong, 2003.
"A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, April 30, 2003." Includes bibliographical references (p. 28-30) Also available in print.
6

Lee, Hiu-wing Doris, and 李曉穎. "A study of automatic expansion of Chinese abbreviations." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B31609338.

7

施雷 and Lui Sze. "Computer recognition of printed Chinese characters." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B31213601.

8

Wong, Kun-wing Peter, and 黃冠榮. "Breaking the learning barrier of Chinese Changjei input method." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31961198.

9

黃伯光 and Pak-kwong Wong. "Statistical language models for Chinese recognition: speech and character." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31239456.

10

陳國評 and Kwok-ping Chan. "Fuzzy set theoretic approach to handwritten Chinese character recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1989. http://hub.hku.hk/bib/B30425876.


Books on the topic "Chinese language Technical Chinese Data processing"

1

Gao, Liwei. Chinese Internet language: A study of identity constructions. München: Lincom Europa, 2007.

2

Dian nao Ying yu: Computer English. 2nd ed. Jinan: Shandong ke xue ji shu chu ban she, 2001.

3

Lunde, Ken. CJKV information processing. 2nd ed. [Sebastopol, CA]: O'Reilly, 2009.

4

Huang, Timothy D., ed. An introduction to Chinese, Japanese, and Korean computing. Singapore: World Scientific, 1989.

5

Zhong yi xi tong: Chinese binary system. Jilong Shi: Qi Tongxin, 1990.

6

Suchenwirth, Richard, ed. Optical recognition of Chinese characters. Braunschweig: Friedr. Vieweg, 1989.

8

Zhong wen dian nao bai bu shu ru fa chu gao. Taibei Shi: Taiwan shang wu yin shu guan, 1987.

9

Shi yong Han zi cao zuo xi tong da quan. Shanghai Shi: Shanghai jiao tong da xue chu ban she, 1995.

10

Ji suan ji wu zi ku zhi neng zao zi: Han zi ye ke yi zhe yang ji suan ji xin xi hua = Computer no character of intelligent building : Chinese characters can also be computer information. Beijing: Guo fang gong ye chu ban she, 2013.


Book chapters on the topic "Chinese language Technical Chinese Data processing"

1

Zheng, Thomas Fang, Zhanjiang Song, Lihong Zhang, Michael Brasser, Wei Wu, and Jing Deng. "CCC Speaker Recognition Evaluation 2006: Overview, Methods, Data, Results and Perspective." In Chinese Spoken Language Processing, 485–93. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11939993_51.

2

Li, Yang, Qingliang Miao, Ji Geng, Christoph Alt, Robert Schwarzenberg, Leonhard Hennig, Changjian Hu, and Feiyu Xu. "Question Answering for Technical Customer Support." In Natural Language Processing and Chinese Computing, 3–15. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-99495-6_1.

3

Zhao, Yu, Mingyue Zhou, Zhenghua Li, and Min Zhang. "Dependency Parsing with Noisy Multi-annotation Data." In Natural Language Processing and Chinese Computing, 120–31. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60457-8_10.

4

Yang, Liang, Jingjie Zeng, Shuqun Li, Zhexu Shen, Yansong Sun, and Hongfei Lin. "Metaphor Recognition and Analysis via Data Augmentation." In Natural Language Processing and Chinese Computing, 746–57. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-88480-2_60.

5

Wan, Jing, Haoming Li, Lei Hou, and Juanzi Li. "Reinforcement Learning for Named Entity Recognition from Noisy Data." In Natural Language Processing and Chinese Computing, 333–45. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60450-9_27.

6

Zhang, Hu, Xin Wang, Hongye Tan, and Ru Li. "Applying Data Discretization to DPCNN for Law Article Prediction." In Natural Language Processing and Chinese Computing, 459–70. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32233-5_36.

7

Ma, Fu-Yuan, Wen-Qi Chen, Min-Hao Xiao, Xin Wang, and Ying Wang. "Explanation Chains Model Based on the Fine-Grained Data." In Natural Language Processing and Chinese Computing, 684–98. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32236-6_63.

8

Wu, Huanqin, Muyun Yang, Jiaqi Wang, Junguo Zhu, and Tiejun Zhao. "Target Oriented Data Generation for Quality Estimation of Machine Translation." In Natural Language Processing and Chinese Computing, 393–405. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-32233-5_31.

9

Jiang, Shengyi, Yingwen Fu, Xiaotian Lin, and Nankai Lin. "Pre-trained Language Models for Tagalog with Multi-source Data." In Natural Language Processing and Chinese Computing, 210–23. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-88480-2_17.

10

Fang, Jie, and Peifeng Li. "Data Augmentation with Reinforcement Learning for Document-Level Event Coreference Resolution." In Natural Language Processing and Chinese Computing, 751–63. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60450-9_59.


Conference papers on the topic "Chinese language Technical Chinese Data processing"

1

You, Liping, Tao Liu, and Kaiying Liu. "Chinese FrameNet Data in Semantic Web Language." In 2007 International Conference on Natural Language Processing and Knowledge Engineering. IEEE, 2007. http://dx.doi.org/10.1109/nlpke.2007.4368010.

2

"Technical program at a glance." In 2004 International Symposium on Chinese Spoken Language Processing. IEEE, 2004. http://dx.doi.org/10.1109/chinsl.2004.1409562.

3

Chung, I., and Chuan-Jie Lin. "TOCAB: A Dataset for Chinese Abusive Language Processing." In 2021 IEEE 22nd International Conference on Information Reuse and Integration for Data Science (IRI). IEEE, 2021. http://dx.doi.org/10.1109/iri51335.2021.00069.

4

Li, Wen, and Markus Dickinson. "Gender Prediction for Chinese Social Media Data." In RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning. Incoma Ltd. Shoumen, Bulgaria, 2017. http://dx.doi.org/10.26615/978-954-452-049-6_058.

5

"ISCSLP 2008 Technical Program Committee." In 2008 6th International Symposium on Chinese Spoken Language Processing. IEEE, 2008. http://dx.doi.org/10.1109/chinsl.2008.ecp.7.

6

Noever, David, Josh Kalin, Matthew Ciolino, Dom Hambrick, and Gerry Dozier. "Local Translation Services for Neglected Languages." In 8th International Conference on Artificial Intelligence and Applications (AIAP 2021). AIRCC Publishing Corporation, 2021. http://dx.doi.org/10.5121/csit.2021.110110.

Abstract:
Taking advantage of computationally lightweight but high-quality translators prompts consideration of new applications that address neglected languages. For projects with protected or personal data, translators for less popular or low-resource languages require specific compliance checks before posting to a public translation API. In these cases, locally run translators can render reasonable, cost-effective solutions if done with an army of offline, small-scale pair translators. Like handling a specialist’s dialect, this research illustrates translating two historically interesting but obfuscated languages: 1) hacker-speak (“l33t”) and 2) reverse (or “mirror”) writing as practiced by Leonardo da Vinci. The work generalizes a deep learning architecture to translatable variants of hacker-speak with lite, medium, and hard vocabularies. The original contribution highlights a fluent translator of hacker-speak in under 50 megabytes and demonstrates a companion text generator for augmenting future datasets with greater than a million bilingual sentence pairs. A primary motivation stems from the need to understand and archive the evolution of the international computer community, one that continuously enhances its talent for speaking openly but in hidden contexts. This training of bilingual sentences supports deep learning models using a long short-term memory recurrent neural network (LSTM-RNN). It extends previous work demonstrating an English-to-foreign translation service built from as little as 10,000 bilingual sentence pairs. This work further solves the equivalent translation problem in twenty-six additional (non-obfuscated) languages and rank orders those models and their proficiency quantitatively, with Italian as the most successful and Mandarin Chinese as the most challenging.
For neglected languages, the method prototypes novel services for smaller niche translations such as Kabyle (Algerian dialect) which covers between 5-7 million speakers but one which for most enterprise translators, has not yet reached development. One anticipates the extension of this approach to other important dialects, such as translating technical (medical or legal) jargon and processing health records or handling many of the dialects collected from specialized domains (mixed languages like “Spanglish”, acronym-laden Twitter feeds, or urban slang).
7

Dong, Minghui, and Kim-Teng Lua. "Automatic prosodic break labeling for Mandarin Chinese speech data." In 7th International Conference on Spoken Language Processing (ICSLP 2002). ISCA, 2002. http://dx.doi.org/10.21437/icslp.2002-145.

8

"Technical Program Committee." In 7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010). IEEE, 2010. http://dx.doi.org/10.1109/iscslp.2010.5684919.

9

Lu, Shixiang, Wei Wei, Xiaoyin Fu, Lichun Fan, and Bo Xu. "Phrase-based data selection for language model adaptation in spoken language translation." In 2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012). IEEE, 2012. http://dx.doi.org/10.1109/iscslp.2012.6423483.

10

Feng, JunLan, XianFang Wang, and LiMin Du. "Data collection and processing in a Chinese spontaneous speech corpus IIS_CSS." In 6th International Conference on Spoken Language Processing (ICSLP 2000). ISCA, 2000. http://dx.doi.org/10.21437/icslp.2000-558.


Reports on the topic "Chinese language Technical Chinese Data processing"

1

Murdick, Dewey, Daniel Chou, Ryan Fedasiuk, and Emily Weinstein. The Public AI Research Portfolio of China’s Security Forces. Center for Security and Emerging Technology, March 2021. http://dx.doi.org/10.51593/20200057.

Abstract:
New analytic tools are used in this data brief to explore the public artificial intelligence (AI) research portfolio of China’s security forces. The methods contextualize Chinese-language scholarly papers that claim a direct working affiliation with components of the Ministry of Public Security, People's Armed Police Force, and People’s Liberation Army. The authors review potential uses of computer vision, robotics, natural language processing and general AI research.