To see the other types of publications on this topic, follow the link: Corpora Processing Software.

Journal articles on the topic 'Corpora Processing Software'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Corpora Processing Software.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles in a wide variety of disciplines and organise your bibliography correctly.

1

Soloviev, F. N. "Embedding Additional Natural Language Processing Tools into the TXM Platform." Vestnik NSU. Series: Information Technologies 18, no. 1 (2020): 74–82. http://dx.doi.org/10.25205/1818-7900-2020-18-1-74-82.

Full text
Abstract:
In our work we present a description of the integration of natural language processing tools (pseudostem extraction, noun phrase extraction, verb government analysis) intended to extend the analytic facilities of the TXM corpora analysis platform. The tools introduced in the paper are combined into a single software package that provides the TXM platform with an effective specialized corpora preparation tool for further analysis.
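As a rough illustration of one of the processing steps listed above (noun phrase extraction), the sketch below uses NLTK's regular-expression chunker. It is not the TXM module described in the paper; the chunk grammar and the example sentence are assumptions made for the example.

```python
# Minimal noun-phrase extraction sketch with NLTK's regexp chunker.
# Illustrative only; not the TXM module described in the paper.
import nltk

# Resource names vary across NLTK versions; unknown ids are skipped quietly.
for res in ("punkt", "punkt_tab", "averaged_perceptron_tagger",
            "averaged_perceptron_tagger_eng"):
    nltk.download(res, quiet=True)

GRAMMAR = "NP: {<DT>?<JJ>*<NN.*>+}"   # determiner + adjectives + noun(s)
chunker = nltk.RegexpParser(GRAMMAR)

def noun_phrases(sentence):
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    tree = chunker.parse(tagged)
    return [" ".join(word for word, _ in subtree.leaves())
            for subtree in tree.subtrees() if subtree.label() == "NP"]

print(noun_phrases("The corpus preparation tool extracts complex noun phrases."))
```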
APA, Harvard, Vancouver, ISO, and other styles
2

Dutsova, Ralitsa. "Web-based software system for processing bilingual digital resources." Cognitive Studies | Études cognitives, no. 14 (September 4, 2014): 33–43. http://dx.doi.org/10.11649/cs.2014.004.

Full text
Abstract:
The article describes a software management system developed at the Institute of Mathematics and Informatics, BAS, for the creation, storing and processing of digital language resources in Bulgarian. Independent components of the system are intended for the creation and management of bilingual dictionaries, for information retrieval and data mining from a bilingual dictionary, and for the presentation of aligned corpora. A module which connects these components is also being developed. The system, implemented as a web-application, contains tools for compilation, editing and search within all components.
APA, Harvard, Vancouver, ISO, and other styles
3

Ali, Mohammed Abdulmalik. "Artificial intelligence and natural language processing: the Arabic corpora in online translation software." International Journal of ADVANCED AND APPLIED SCIENCES 3, no. 9 (September 2016): 59–66. http://dx.doi.org/10.21833/ijaas.2016.09.010.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Renouf, Antoinette. "The Establishment and Use of Text Corpora at Birmingham University." HERMES - Journal of Language and Communication in Business 4, no. 7 (July 28, 2015): 71. http://dx.doi.org/10.7146/hjlcb.v4i7.21475.

Full text
Abstract:
The School of English at Birmingham University has over the last ten years increasingly integrated the study and use of corpora into its research and teaching activities. Cobuild Ltd and the English for Overseas Students Unit are particularly active, as is the Research and Development Unit for English Language Studies. Members of the Research Unit have created the purpose-built corpora that make up the Birmingham Collection of English Text. The Research Unit is using these to support its linguistic research projects and the development of new types of text-processing software, as well as for specialised teaching purposes.
APA, Harvard, Vancouver, ISO, and other styles
5

ZHAO, SHIQI, HAIFENG WANG, TING LIU, and SHENG LI. "Extracting paraphrase patterns from bilingual parallel corpora." Natural Language Engineering 15, no. 4 (September 16, 2009): 503–26. http://dx.doi.org/10.1017/s1351324909990155.

Full text
Abstract:
Paraphrase patterns are semantically equivalent patterns, which are useful in both paraphrase recognition and generation. This paper presents a pivot approach for extracting paraphrase patterns from bilingual parallel corpora, whereby the paraphrase patterns in English are extracted using the patterns in another language as pivots. We make use of log-linear models for computing the paraphrase likelihood between pattern pairs and exploit feature functions based on maximum likelihood estimation (MLE), lexical weighting (LW), and monolingual word alignment (MWA). Using the presented method, we extract more than 1 million pairs of paraphrase patterns from about 2 million pairs of bilingual parallel sentences. The precision of the extracted paraphrase patterns is above 78%. Experimental results show that the presented method significantly outperforms a well-known method called discovery of inference rules from text (DIRT). Additionally, the log-linear model with the proposed feature functions is effective. The extracted paraphrase patterns are fully analyzed. In particular, we found that the extracted paraphrase patterns can be classified into five types, which are useful in multiple natural language processing (NLP) applications.
APA, Harvard, Vancouver, ISO, and other styles
6

MIHALCEA, RADA, and DAN I. MOLDOVAN. "AutoASC — A SYSTEM FOR AUTOMATIC ACQUISITION OF SENSE TAGGED CORPORA." International Journal of Pattern Recognition and Artificial Intelligence 14, no. 01 (February 2000): 3–17. http://dx.doi.org/10.1142/s0218001400000039.

Full text
Abstract:
Many natural language processing tasks, such as word sense disambiguation, knowledge acquisition, and information retrieval, use semantically tagged corpora. Until recently, these corpus-based systems relied on text manually annotated with semantic tags, but the massive human intervention in this process has become a serious impediment in building robust systems. In this paper, we present AutoASC, a system which automatically acquires sense tagged corpora. It is based on (1) the information provided in WordNet, particularly the word definitions found within the glosses, and (2) the information gathered from the Internet using existing search engines. The system was tested on a set of 46 concepts, for which 2071 example sentences were acquired; for these, a precision of 87% was observed.
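A minimal sketch of the WordNet side of this idea, assuming NLTK's WordNet interface: pull each sense's gloss and example sentences, which could then seed search-engine queries as the abstract describes. This is illustrative only and not the AutoASC implementation.

```python
# Sketch: per-sense glosses and examples from WordNet via NLTK.
# Gloss keywords could seed web queries, as in the approach described above.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

def sense_seeds(word):
    seeds = []
    for synset in wn.synsets(word, pos=wn.NOUN):
        seeds.append({
            "sense": synset.name(),          # e.g. 'bank.n.01'
            "gloss": synset.definition(),    # dictionary-style definition
            "examples": synset.examples(),   # usage examples, if any
        })
    return seeds

for s in sense_seeds("bank"):
    print(s["sense"], "-", s["gloss"])
```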
APA, Harvard, Vancouver, ISO, and other styles
7

Chersoni, E., E. Santus, L. Pannitto, A. Lenci, P. Blache, and C. R. Huang. "A structured distributional model of sentence meaning and processing." Natural Language Engineering 25, no. 4 (July 2019): 483–502. http://dx.doi.org/10.1017/s1351324919000214.

Full text
Abstract:
Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a structured distributional model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from discourse representation theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modelled as a graph extracted from parsed corpora and encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded in extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality data sets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.
APA, Harvard, Vancouver, ISO, and other styles
8

PERIÑAN-PASCUAL, CARLOS. "DEXTER: A workbench for automatic term extraction with specialized corpora." Natural Language Engineering 24, no. 2 (October 5, 2017): 163–98. http://dx.doi.org/10.1017/s1351324917000365.

Full text
Abstract:
Automatic term extraction has become a priority area of research within corpus processing. Despite the extensive literature in this field, there are still some outstanding issues that should be dealt with during the construction of term extractors, particularly those oriented to support research in terminology and terminography. In this regard, this article describes the design and development of DEXTER, an online workbench for the extraction of simple and complex terms from domain-specific corpora in English, French, Italian and Spanish. In this framework, three issues contribute to placing the most important terms in the foreground. First, unlike the elaborate morphosyntactic patterns proposed by most previous research, shallow lexical filters have been constructed to discard term candidates. Second, a large number of common stopwords are automatically detected by means of a method that relies on the IATE database together with the frequency distribution of the domain-specific corpus and a general corpus. Third, the term-ranking metric, which is grounded on the notions of salience, relevance and cohesion, is guided by the IATE database to display an adequate distribution of terms.
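DEXTER's ranking metric combines salience, relevance and cohesion guided by IATE; as a much simpler stand-in, the sketch below ranks candidate terms by their relative frequency in a domain corpus versus a general corpus. The corpora and candidate list are invented for illustration.

```python
# Toy term-candidate ranking by domain-vs-general relative frequency.
# A "weirdness"-style ratio, only a stand-in for DEXTER's own metric.
from collections import Counter

def rank_candidates(domain_tokens, general_tokens, candidates):
    d, g = Counter(domain_tokens), Counter(general_tokens)
    nd, ng = sum(d.values()), sum(g.values())
    scores = {}
    for term in candidates:
        rel_domain = d[term] / nd
        rel_general = (g[term] + 1) / (ng + 1)   # add-one smoothing for unseen terms
        scores[term] = rel_domain / rel_general
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

domain = "term extraction uses a domain specific corpus and a ranking metric".split()
general = "the cat sat on the mat and the dog slept".split()
print(rank_candidates(domain, general, ["corpus", "metric", "the"]))
```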
APA, Harvard, Vancouver, ISO, and other styles
9

XUE, NAIWEN, FEI XIA, FU-DONG CHIOU, and MARTA PALMER. "The Penn Chinese TreeBank: Phrase structure annotation of a large corpus." Natural Language Engineering 11, no. 2 (May 19, 2005): 207–38. http://dx.doi.org/10.1017/s135132490400364x.

Full text
Abstract:
With growing interest in Chinese Language Processing, numerous NLP tools (e.g., word segmenters, part-of-speech taggers, and parsers) for Chinese have been developed all over the world. However, since no large-scale bracketed corpora are available to the public, these tools are trained on corpora with different segmentation criteria, part-of-speech tagsets and bracketing guidelines, and therefore, comparisons are difficult. As a first step towards addressing this issue, we have been preparing a large bracketed corpus since late 1998. The first two installments of the corpus, 250 thousand words of data, fully segmented, POS-tagged and syntactically bracketed, have been released to the public via LDC (www.ldc.upenn.edu). In this paper, we discuss several Chinese linguistic issues and their implications for our treebanking efforts and how we address these issues when developing our annotation guidelines. We also describe our engineering strategies to improve speed while ensuring annotation quality.
APA, Harvard, Vancouver, ISO, and other styles
10

Altheneyan, Alaa, and Mohamed El Bachir Menai. "Evaluation of State-of-the-Art Paraphrase Identification and Its Application to Automatic Plagiarism Detection." International Journal of Pattern Recognition and Artificial Intelligence 34, no. 04 (August 22, 2019): 2053004. http://dx.doi.org/10.1142/s0218001420530043.

Full text
Abstract:
Paraphrase identification is a natural language processing (NLP) problem that involves the determination of whether two text segments have the same meaning. Various NLP applications rely on a solution to this problem, including automatic plagiarism detection, text summarization, machine translation (MT), and question answering. The methods for identifying paraphrases found in the literature fall into two main classes: similarity-based methods and classification methods. This paper presents a critical study and an evaluation of existing methods for paraphrase identification and its application to automatic plagiarism detection. It presents the classes of paraphrase phenomena, the main methods, and the sets of features used by each particular method. All the methods and features used are discussed and enumerated in a table for easy comparison. Their performances on benchmark corpora are also discussed and compared via tables. Automatic plagiarism detection is presented as an application of paraphrase identification. The performances on benchmark corpora of existing plagiarism detection systems able to detect paraphrases are compared and discussed. The main outcome of this study is the identification of word overlap, structural representations, and MT measures as feature subsets that lead to the best performance results for support vector machines in both paraphrase identification and plagiarism detection on corpora. The performance results achieved by deep learning techniques highlight that these techniques are the most promising research direction in this field.
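A minimal sketch of one of the feature subsets named above (word overlap) feeding a linear SVM, using scikit-learn. The sentence pairs and labels are toy data; a real system would add the structural and MT-based features the survey discusses.

```python
# One word-overlap feature feeding a linear SVM (toy data, illustrative only).
import numpy as np
from sklearn.svm import LinearSVC

def word_overlap(s1, s2):
    a, b = set(s1.lower().split()), set(s2.lower().split())
    return len(a & b) / max(len(a | b), 1)   # Jaccard-style overlap

pairs = [
    ("the firm bought the company", "the company was acquired by the firm", 1),
    ("he plays the guitar well", "stock prices fell sharply today", 0),
    ("she wrote the report quickly", "the report was written fast by her", 1),
    ("cats chase mice at night", "the committee approved the budget", 0),
]
X = np.array([[word_overlap(a, b)] for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])

clf = LinearSVC().fit(X, y)
test = word_overlap("the firm acquired the company",
                    "the company was bought by the firm")
print(clf.predict([[test]]))   # 1 = paraphrase, 0 = not
```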
APA, Harvard, Vancouver, ISO, and other styles
11

Bestgen, Yves. "Getting rid of the Chi-square and Log-likelihood tests for analysing vocabulary differences between corpora." Quaderns de Filologia - Estudis Lingüístics 22, no. 22 (January 7, 2018): 33. http://dx.doi.org/10.7203/qf.22.11299.

Full text
Abstract:
Log-likelihood and Chi-square tests are probably the most popular statistical tests used in corpus linguistics, especially when the research aims to describe the lexical variations between corpora. However, because this specific use of the Chi-square test is not valid, it produces far too many significant results. This paper explains the source of the problem (i.e., the non-independence of the observations), the reasons for which the usual solutions are not acceptable, and which kinds of statistical test should be used instead. A corpus analysis conducted on the lexical differences between American and British English is then reported, in order to demonstrate the problem and to confirm the adequacy of the proposed solution. The last section presents the commands that can be used with WordSmith Tools, a very popular software package for corpus processing, to obtain the data needed for the adequate tests, as well as a very easy-to-use procedure in R, a free and easy-to-install statistical package, that performs these tests.
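For reference, the statistic in question can be computed as follows; this is the common two-cell log-likelihood (G²) formulation used in corpus linguistics, with invented counts, not the paper's recommended alternative.

```python
# Log-likelihood (G^2) for one word's frequency in two corpora -- the statistic
# whose routine use the paper criticizes. Counts below are invented.
import math

def log_likelihood(freq1, size1, freq2, size2):
    """Two-cell G^2 for a word observed freq1/freq2 times in corpora of size1/size2 words."""
    expected1 = size1 * (freq1 + freq2) / (size1 + size2)   # expected count, corpus 1
    expected2 = size2 * (freq1 + freq2) / (size1 + size2)   # expected count, corpus 2
    g2 = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# e.g. "colour": 150 hits in a 1M-word British corpus, 20 in a 1M-word American one
print(round(log_likelihood(150, 1_000_000, 20, 1_000_000), 2))
```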
APA, Harvard, Vancouver, ISO, and other styles
12

Zhukovska, Victoriia V., Oleksandr O. Mosiiuk, and Veronika V. Komarenko. "ЗАСТОСУВАННЯ ПРОГРАМНОГО ПАКЕТУ R У НАУКОВИХ ДОСЛІДЖЕННЯХ МАЙБУТНІХ ФІЛОЛОГІВ" [The use of the R software package in the research of future philologists]. Information Technologies and Learning Tools 66, no. 4 (September 30, 2018): 272. http://dx.doi.org/10.33407/itlt.v66i4.2196.

Full text
Abstract:
Corpus linguistics is a newly emerging field of study in applied linguistics that deals with the construction, processing, and exploitation of text corpora. To date, a high-quality analysis of the vast amounts of empirical language data provided by computerized corpora is impossible without computer technologies and relevant statistical methods. Therefore, teaching future philologists to effectively apply statistical computer programs is an important stage in their research training. The article discusses the possibilities of using one of the software packages for statistical data analysis that is leading in Western linguistics but not yet well known in Ukraine – the R statistical software environment – in the research of future philologists. The paper reveals the advantages and disadvantages of this program in comparison with other similar software packages (SPSS and Statistica) and provides Internet links to R self-learn tutorials. The flexibility and efficacy of R for linguistic research are demonstrated on the example of a statistical analysis of the use of hedges in a corpus of academic speech. For novice philologists to properly understand the peculiarities of conducting a statistical linguistic experiment with R, a detailed description of each stage of the study is provided. The statistical verification of hedges in the speech of students and lecturers was carried out using the Kolmogorov–Smirnov test and the Mann–Whitney U test. The article presents the developed algorithms to calculate the specified tests applying the built-in commands and various specialized library functions created by the R user community to enhance the functionality of this statistical software. Each script for statistical calculations in R is accompanied by a detailed description and interpretation of the results obtained. Further study of the issue will involve a number of activities aimed at raising awareness and improving the skills of future philologists in using R statistical software, which is important for their professional development as researchers.
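The abstract's analysis is carried out in R; the sketch below runs the same two tests through SciPy as a Python stand-in, on invented per-text hedge counts for students and lecturers.

```python
# SciPy stand-in for the two tests named above (the paper itself works in R).
# The hedge counts per text are made up for illustration.
from scipy import stats

students  = [3, 5, 2, 7, 4, 6, 3, 5, 4, 8]   # hedges per student text
lecturers = [6, 9, 7, 5, 8, 10, 7, 6, 9, 8]  # hedges per lecturer text

# Kolmogorov-Smirnov test against a normal distribution (normality check)
print(stats.kstest(students, "norm", args=(5.0, 2.0)))

# Mann-Whitney U test comparing the two groups without assuming normality
print(stats.mannwhitneyu(students, lecturers, alternative="two-sided"))
```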
APA, Harvard, Vancouver, ISO, and other styles
13

Johnston, Trevor. "The reluctant oracle: using strategic annotations to add value to, and extract value from, a signed language corpus." Corpora 9, no. 2 (November 2014): 155–89. http://dx.doi.org/10.3366/cor.2014.0056.

Full text
Abstract:
In this paper, I discuss the ways in which multimedia annotation software is being used to transform an archive of Auslan recordings into a true machine-readable language corpus. After the basic structure of the annotation files in the Auslan corpus is described and the exercise differentiated from transcription, the glossing and annotation conventions are explained. Following this, I exemplify the searching and pattern-matching at different levels of linguistic organisation that these annotations make possible. The paper shows how, in the creation of signed language corpora, it is important to be clear about the difference between transcription and annotation. Without an awareness of this distinction – and despite time-consuming and expensive processing of the video recordings – we may not be able to discern the types of patterns in our corpora that we hope to. The conventions are designed to ensure that the annotations really do enable researchers to identify regularities at different levels of linguistic organisation in the corpus and, thus, to test, or build on, existing descriptions of the language.
APA, Harvard, Vancouver, ISO, and other styles
14

Giannella, Chris R., Ransom K. Winder, and Joseph P. Jubinski. "Annotation projection for temporal information extraction." Natural Language Engineering 25, no. 3 (May 2019): 385–403. http://dx.doi.org/10.1017/s1351324919000044.

Full text
Abstract:
Approaches to building temporal information extraction systems typically rely on large, manually annotated corpora. Thus, porting these systems to new languages requires acquiring large corpora of manually annotated documents in the new languages. Acquiring such corpora is difficult owing to the complexity of temporal information extraction annotation. One strategy for addressing this difficulty is to reduce or eliminate the need for manually annotated corpora through annotation projection. This technique utilizes a temporal information extraction system for a source language (typically English) to automatically annotate the source language side of a parallel corpus. It then uses automatically generated word alignments to project the annotations, thereby creating noisily annotated target language training data. We developed an annotation projection technique for producing target language temporal information extraction systems. We carried out an English (source) to French (target) case study wherein we compared a French temporal information extraction system built using annotation projection with one built using a manually annotated French corpus. While annotation projection has been applied to building other kinds of Natural Language Processing tools (e.g., Named Entity Recognizers), to our knowledge, this is the first paper examining annotation projection as applied to temporal information extraction where no manual corrections of the target language annotations were made. We found that, even using manually annotated data to build a temporal information extraction system, F-scores were relatively low (<0.35), which suggests that the problem is challenging even with manually annotated data. Our annotation projection approach performed well (relative to the system built from manually annotated data) on some aspects of temporal information extraction (e.g., event–document creation time temporal relation prediction), but it performed poorly on the other kinds of temporal relation prediction (e.g., event–event and event–time).
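A minimal sketch of the projection step itself: copy source-side labels to target tokens through word-alignment pairs. The alignment and labels are toy data, not the paper's pipeline.

```python
# Minimal sketch of projecting source-side annotations to the target side
# through word alignments (toy data; not the paper's pipeline).
def project_annotations(src_labels, alignments, tgt_len):
    """src_labels: one label per source token ('O' = none);
    alignments: (src_idx, tgt_idx) pairs; returns one label per target token."""
    tgt_labels = ["O"] * tgt_len
    for s, t in alignments:
        if src_labels[s] != "O":
            tgt_labels[t] = src_labels[s]
    return tgt_labels

# EN: "the meeting on Monday"  ->  FR: "la réunion de lundi"
src_labels = ["O", "EVENT", "O", "TIMEX"]
alignments = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(project_annotations(src_labels, alignments, tgt_len=4))
```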
APA, Harvard, Vancouver, ISO, and other styles
15

VILA, M., H. RODRÍGUEZ, and M. A. MARTÍ. "Relational paraphrase acquisition from Wikipedia: The WRPA method and corpus." Natural Language Engineering 21, no. 3 (September 16, 2013): 355–89. http://dx.doi.org/10.1017/s1351324913000235.

Full text
Abstract:
Paraphrase corpora are an essential but scarce resource in Natural Language Processing. In this paper, we present the Wikipedia-based Relational Paraphrase Acquisition (WRPA) method, which extracts relational paraphrases from Wikipedia, and the derived WRPA paraphrase corpus. The WRPA corpus currently covers person-related and authorship relations in English and Spanish, respectively, suggesting that, given adequate Wikipedia coverage, our method is independent of the language and the relation addressed. WRPA extracts entity pairs from structured information in Wikipedia applying distant learning and, based on the distributional hypothesis, uses them as anchor points for candidate paraphrase extraction from the free text in the body of Wikipedia articles. Focussing on relational paraphrasing and taking advantage of Wikipedia-structured information allows for an automatic and consistent evaluation of the results. The WRPA corpus characteristics distinguish it from other types of corpora that rely on string similarity or transformation operations. WRPA relies on distributional similarity and is the result of the free use of language outside any reformulation framework. Validation results show a high precision for the corpus.
APA, Harvard, Vancouver, ISO, and other styles
16

LANGLOIS, D., M. SAAD, and K. SMAILI. "Alignment of comparable documents: Comparison of similarity measures on French–English–Arabic data." Natural Language Engineering 24, no. 5 (June 19, 2018): 677–94. http://dx.doi.org/10.1017/s1351324918000232.

Full text
Abstract:
The objective of this article is to address the issue of the comparability of documents, which are extracted from different sources and written in different languages. These documents are not necessarily translations of each other. This material is referred to as multilingual comparable corpora. These language resources are useful for multilingual natural language processing applications, especially for low-resourced language pairs. In this paper, we collect different data in Arabic, English, and French. Two corpora are built by using available hyperlinks for Wikipedia and Euronews. Euronews is an aligned multilingual (Arabic, English, and French) corpus of 34k documents collected from the Euronews website. A more challenging issue is to build a comparable corpus from two different and independent media outlets with distinct editorial lines, such as the British Broadcasting Corporation (BBC) and Al Jazeera (JSC). To build such a corpus, we propose to use the Cross-Lingual Latent Semantic approach. For this purpose, documents were harvested from the BBC and JSC websites for each month of the years 2012 and 2013. The comparability is calculated for each Arabic–English pair of documents of each month. This automatic task is then validated by hand. This led to a multilingual (Arabic–English) aligned corpus of 305 pairs of documents (233k English words and 137k Arabic words). In addition, a study is presented in this paper to analyze the performance of three methods from the literature for measuring the comparability of documents on the multilingual reference corpora. A recall at rank 1 of 50.16 per cent is achieved with the Cross-lingual LSI approach for the BBC–JSC test corpus, while the dictionary-based method reaches a recall of only 35.41 per cent.
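As a rough, monolingual stand-in for the comparability scoring, the sketch below builds an LSI-style space with TF-IDF plus truncated SVD and compares documents by cosine similarity using scikit-learn. A genuine cross-lingual LSI model would be trained on aligned Arabic–English or Arabic–French documents.

```python
# Monolingual stand-in for comparability scoring: TF-IDF + truncated SVD
# (an LSI-style space) + cosine similarity. Documents are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "protests continued in the capital as talks stalled",
    "negotiations stalled while demonstrations went on in the capital",
    "the football final was decided on penalties last night",
]
tfidf = TfidfVectorizer().fit_transform(docs)
lsi = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Comparability of document 0 with the other two in the latent space
print(cosine_similarity(lsi[0:1], lsi[1:]))
```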
APA, Harvard, Vancouver, ISO, and other styles
17

Szakács, Béla Benedek, and Tamás Mészáros. "Hybrid Distance-based, CNN and Bi-LSTM System for Dictionary Expansion." Infocommunications journal, no. 4 (2020): 6–13. http://dx.doi.org/10.36244/icj.2020.4.2.

Full text
Abstract:
Dictionaries like WordNet can help in a variety of Natural Language Processing applications by providing additional morphological data. They can be used in Digital Humanities research, building knowledge graphs and other applications. Creating dictionaries from large corpora of texts written in a natural language is a task that has not been a primary focus of research, as other tasks (such as chat-bots) have dominated the field, but it can be a very useful tool in analysing texts. Even in the case of contemporary texts, categorizing the words according to their dictionary entry is a complex task, and for less conventional texts (in old or less researched languages) it is even harder to solve this problem automatically. Our task was to create software that helps in expanding a dictionary containing word forms and in tagging unprocessed text. We used a manually created corpus for training and testing the model. We created a combination of bidirectional Long Short-Term Memory networks, convolutional networks and a distance-based solution that outperformed other existing solutions. While manual post-processing of the tagged text is still needed, our system significantly reduces the amount required.
APA, Harvard, Vancouver, ISO, and other styles
18

ZENNAKI, O., N. SEMMAR, and L. BESACIER. "A neural approach for inducing multilingual resources and natural language processing tools for low-resource languages." Natural Language Engineering 25, no. 1 (August 6, 2018): 43–67. http://dx.doi.org/10.1017/s1351324918000293.

Full text
Abstract:
This work focuses on the rapid development of linguistic annotation tools for low-resource languages (languages that have no labeled training data). We experiment with several cross-lingual annotation projection methods using recurrent neural network (RNN) models. The distinctive feature of our approach is that our multilingual word representation requires only a parallel corpus between source and target languages. More precisely, our approach has the following characteristics: (a) it does not use word alignment information, (b) it does not assume any knowledge about target languages (one requirement is that the two languages (source and target) are not too syntactically divergent), which makes it applicable to a wide range of low-resource languages, (c) it provides authentic multilingual taggers (one tagger for N languages). We investigate both uni- and bidirectional RNN models and propose a method to include external information (for instance, low-level information from part-of-speech tags) in the RNN to train higher level taggers (for instance, Super Sense taggers). We demonstrate the validity and genericity of our model by using parallel corpora (obtained by manual or automatic translation). Our experiments are conducted to induce cross-lingual part-of-speech and Super Sense taggers. We also use our approach in a weakly supervised context, and it shows an excellent potential for very low-resource settings (less than 1k training utterances).
APA, Harvard, Vancouver, ISO, and other styles
19

Celard, P., A. Seara Vieira, E. L. Iglesias, and L. Borrajo. "LDA filter: A Latent Dirichlet Allocation preprocess method for Weka." PLOS ONE 15, no. 11 (November 9, 2020): e0241701. http://dx.doi.org/10.1371/journal.pone.0241701.

Full text
Abstract:
This work presents an alternative method to represent documents based on LDA (Latent Dirichlet Allocation) and examines how it affects classification algorithms, in comparison to common text representation. LDA assumes that each document deals with a set of predefined topics, which are distributions over an entire vocabulary. Our main objective is to use the probability of a document belonging to each topic to implement a new text representation model. This proposed technique is deployed as an extension of the Weka software as a new filter. To demonstrate its performance, the created filter is tested with different classifiers such as a Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Naive Bayes on different document corpora (OHSUMED, Reuters-21578, 20Newsgroup, Yahoo! Answers, YELP Polarity, and TREC Genomics 2015). Then, it is compared with the Bag of Words (BoW) representation technique. Results suggest that the application of our proposed filter achieves similar accuracy as BoW but greatly improves classification processing times.
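The general idea behind the filter, sketched with scikit-learn rather than the authors' Weka extension: represent each document by its LDA topic distribution and train a classifier on those features. Documents, labels and the number of topics are assumptions for the example.

```python
# Documents represented by their LDA topic distributions, then classified.
# Sketch with scikit-learn; not the authors' Weka filter.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.naive_bayes import GaussianNB

docs = ["the patient was treated with antibiotics",
        "gene expression was measured in tumour cells",
        "the court ruled on the appeal yesterday",
        "the judge dismissed the case for lack of evidence"]
labels = ["bio", "bio", "law", "law"]

counts = CountVectorizer().fit_transform(docs)
topics = LatentDirichletAllocation(n_components=2, random_state=0) \
             .fit_transform(counts)          # each row: document as topic mixture

clf = GaussianNB().fit(topics, labels)       # classifier runs on topic features
print(clf.predict(topics))
```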
APA, Harvard, Vancouver, ISO, and other styles
20

MELERO, M., M. R. COSTA-JUSSÀ, P. LAMBERT, and M. QUIXAL. "Selection of correction candidates for the normalization of Spanish user-generated content." Natural Language Engineering 22, no. 1 (February 24, 2014): 135–61. http://dx.doi.org/10.1017/s1351324914000011.

Full text
Abstract:
We present research aiming to build tools for the normalization of User-Generated Content (UGC). We argue that processing this type of text requires revisiting the initial steps of Natural Language Processing, since UGC (micro-blog, blog, and, generally, Web 2.0 user-generated texts) presents a number of nonstandard communicative and linguistic characteristics – often closer to oral and colloquial language than to edited text. We present a corpus of UGC text in Spanish from three different sources: Twitter, consumer reviews, and blogs, and describe its main characteristics. We motivate the need for UGC text normalization by analyzing the problems found when processing this type of text through a conventional language processing pipeline, particularly in the tasks of lemmatization and morphosyntactic tagging. Our aim with this paper is to seize the power of already existing spell and grammar correction engines and endow them with automatic normalization capabilities in order to pave the way for the application of standard Natural Language Processing tools to typical UGC text. In particular, we propose a strategy for automatically normalizing UGC by adding a module on top of a pre-existing spell-checker that selects the most plausible correction from an unranked list of candidates provided by the spell-checker. To build this selector module we train four language models, each one containing a different type of linguistic information in a trade-off with its generalization capabilities. Our experiments show that the models trained on truecase and lowercase word forms are more discriminative than the others at selecting the best candidate. We have also experimented with a parametrized combination of the models by both optimizing directly on the selection task and doing a linear interpolation of the models. The resulting parametrized combinations obtain results close to the best performing model but do not improve on those results, as measured on the test set. The precision of the selector module in ranking the expected correction proposal first on the test corpora reaches 82.5% for Twitter text (baseline 57%) and 88% for non-Twitter text (baseline 64%).
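A toy version of the selector idea, assuming a plain unigram language model: score each spell-checker candidate against a reference corpus and keep the most probable one. The corpus and the candidate list are invented; the paper trains four much richer models.

```python
# Toy candidate selector: pick the spell-checker candidate with the highest
# unigram language-model probability. Corpus and candidates are invented.
import math
from collections import Counter

reference = "que haces hoy en la casa de tu madre hoy hace calor".split()
counts = Counter(reference)
total = sum(counts.values())

def unigram_logprob(word):
    # add-one smoothing so unseen candidates are penalised, not excluded
    return math.log((counts[word] + 1) / (total + len(counts)))

def select(candidates):
    return max(candidates, key=unigram_logprob)

# hypothetical spell-checker output for the UGC token "aces"
print(select(["haces", "ases", "aces"]))
```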
APA, Harvard, Vancouver, ISO, and other styles
21

Başkaya, Osman, and David Jurgens. "Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation." Journal of Artificial Intelligence Research 55 (April 22, 2016): 1025–58. http://dx.doi.org/10.1613/jair.4917.

Full text
Abstract:
Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words. While unsupervised techniques have been proposed to overcome this data sparsity problem, such techniques have not outperformed supervised methods. In this paper, we propose a new approach to building semi-supervised WSD systems that combines a small amount of sense-annotated data with information from Word Sense Induction, a fully-unsupervised technique that automatically learns the different senses of a word based on how it is used. In three experiments, we show how sense induction models may be effectively combined to ultimately produce high-performance semi-supervised WSD systems that exceed the performance of state-of-the-art supervised WSD techniques trained on the same sense-annotated data. We anticipate that our results and released software will also benefit evaluation practices for sense induction systems and those working in low-resource languages by demonstrating how to quickly produce accurate WSD systems with minimal annotation effort.
APA, Harvard, Vancouver, ISO, and other styles
22

LI, YAOYONG, KALINA BONTCHEVA, and HAMISH CUNNINGHAM. "Adapting SVM for data sparseness and imbalance: a case study in information extraction." Natural Language Engineering 15, no. 2 (April 2009): 241–71. http://dx.doi.org/10.1017/s1351324908004968.

Full text
Abstract:
Support Vector Machines (SVM) have been used successfully in many Natural Language Processing (NLP) tasks. The novel contribution of this paper is in investigating two techniques for making SVM more suitable for language learning tasks. Firstly, we propose an SVM with uneven margins (SVMUM) model to deal with the problem of imbalanced training data. Secondly, SVM active learning is employed in order to alleviate the difficulty in obtaining labelled training data. The algorithms are presented and evaluated on several Information Extraction (IE) tasks, where they achieved better performance than the standard SVM and the SVM with passive learning, respectively. Moreover, by combining SVMUM with the active learning algorithm, we achieve the best reported results on the seminars and jobs corpora, which are benchmark data sets used for evaluation and comparison of machine learning algorithms for IE. In addition, we also evaluate the token based classification framework for IE with three different entity tagging schemes. In comparison to previous methods dealing with the same problems, our methods are both effective and efficient, which are valuable features for real-world applications. Due to the similarity in the formulation of the learning problem for IE and for other NLP tasks, the two techniques are likely to be beneficial in a wide range of applications.
APA, Harvard, Vancouver, ISO, and other styles
23

HEWAVITHARANA, SANJIKA, and STEPHAN VOGEL. "Extracting parallel phrases from comparable data for machine translation." Natural Language Engineering 22, no. 4 (June 15, 2016): 549–73. http://dx.doi.org/10.1017/s1351324916000139.

Full text
Abstract:
Mining parallel data from comparable corpora is a promising approach for overcoming the data sparseness in statistical machine translation and other natural language processing applications. In this paper, we address the task of detecting parallel phrase pairs embedded in comparable sentence pairs. We present a novel phrase alignment approach that is designed to align only the parallel sections, bypassing non-parallel sections of the sentence. We compare the proposed approach with two other alignment methods: (1) the standard phrase extraction algorithm, which relies on the Viterbi path of the word alignment, and (2) a binary classifier to detect parallel phrase pairs when presented with a large collection of phrase pair candidates. We evaluate the accuracy of these approaches using a manually aligned data set, and show that the proposed approach outperforms the other two approaches. Finally, we demonstrate the effectiveness of the extracted phrase pairs by using them in Arabic–English and Urdu–English translation systems, which resulted in improvements of up to 1.2 BLEU over the baseline. The main contributions of this paper are two-fold: (1) novel phrase alignment algorithms to extract parallel phrase pairs from comparable sentences, and (2) evaluating the utility of the extracted phrases by using them directly in the MT decoder.
APA, Harvard, Vancouver, ISO, and other styles
24

Wang, Yu, Yining Sun, Zuchang Ma, Lisheng Gao, and Yang Xu. "Named Entity Recognition in Chinese Medical Literature Using Pretraining Models." Scientific Programming 2020 (September 9, 2020): 1–9. http://dx.doi.org/10.1155/2020/8812754.

Full text
Abstract:
The medical literature contains valuable knowledge, such as the clinical symptoms, diagnosis, and treatments of a particular disease. Named Entity Recognition (NER) is the initial step in extracting this knowledge from unstructured text and presenting it as a Knowledge Graph (KG). However, the previous approaches of NER have often suffered from small-scale human-labelled training data. Furthermore, extracting knowledge from Chinese medical literature is a more complex task because there is no segmentation between Chinese characters. Recently, the pretraining models, which obtain representations with the prior semantic knowledge on large-scale unlabelled corpora, have achieved state-of-the-art results for a wide variety of Natural Language Processing (NLP) tasks. However, the capabilities of pretraining models have not been fully exploited, and applications of other pretraining models except BERT in specific domains, such as NER in Chinese medical literature, are also of interest. In this paper, we enhance the performance of NER in Chinese medical literature using pretraining models. First, we propose a method of data augmentation by replacing the words in the training set with synonyms through the Mask Language Model (MLM), which is a pretraining task. Then, we consider NER as the downstream task of the pretraining model and transfer the prior semantic knowledge obtained during pretraining to it. Finally, we conduct experiments to compare the performances of six pretraining models (BERT, BERT-WWM, BERT-WWM-EXT, ERNIE, ERNIE-tiny, and RoBERTa) in recognizing named entities from Chinese medical literature. The effects of feature extraction and fine-tuning, as well as different downstream model structures, are also explored. Experimental results demonstrate that the method of data augmentation we proposed can obtain meaningful improvements in the performance of recognition. Besides, RoBERTa-CRF achieves the highest F1-score compared with the previous methods and other pretraining models.
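A sketch of the masked-language-model replacement step using the Hugging Face fill-mask pipeline. The model name (bert-base-chinese) and the sentence are assumptions for illustration; the paper's own augmentation setup may differ.

```python
# Sketch of MLM-based synonym replacement for data augmentation using the
# Hugging Face fill-mask pipeline. Model name and sentence are assumptions.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-chinese")

sentence = "患者出现[MASK]痛症状"   # mask one character of a symptom mention
for candidate in unmasker(sentence)[:3]:
    # each candidate substitution can yield one augmented training sentence
    print(candidate["token_str"], round(candidate["score"], 3))
```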
APA, Harvard, Vancouver, ISO, and other styles
25

Nagy T., István, Anita Rácz, and Veronika Vincze. "Detecting light verb constructions across languages." Natural Language Engineering 26, no. 3 (July 15, 2019): 319–48. http://dx.doi.org/10.1017/s1351324919000330.

Full text
Abstract:
Light verb constructions (LVCs) are verb and noun combinations in which the verb has lost its meaning to some degree and the noun is used in one of its original senses, typically denoting an event or an action. They exhibit special linguistic features, especially when regarded in a multilingual context. In this paper, we focus on the automatic detection of LVCs in raw text in four different languages, namely, English, German, Spanish, and Hungarian. First, we analyze the characteristics of LVCs from a linguistic point of view based on parallel corpus data. Then, we provide a standardized (i.e., language-independent) representation of LVCs that can be used in machine learning experiments. Next, we experiment with identifying LVCs in different languages: we exploit language adaptation techniques which demonstrate that data from an additional language can be successfully employed in improving the performance of supervised LVC detection for a given language. As there are several annotated corpora from several domains in the case of English and Hungarian, we also investigate the effect of simple domain adaptation techniques to reduce the gap between domains. Furthermore, we combine domain adaptation techniques with language adaptation techniques for these two languages. Our results show that both out-domain and additional language data can improve performance. We believe that our language adaptation method may have practical implications in several fields of natural language processing, especially in machine translation.
APA, Harvard, Vancouver, ISO, and other styles
26

Rubino, Raphael, Benjamin Marie, Raj Dabre, Atsushi Fujita, Masao Utiyama, and Eiichiro Sumita. "Extremely low-resource neural machine translation for Asian languages." Machine Translation 34, no. 4 (December 2020): 347–82. http://dx.doi.org/10.1007/s10590-020-09258-6.

Full text
Abstract:
This paper presents a set of effective approaches to handle extremely low-resource language pairs for self-attention based neural machine translation (NMT), focusing on English and four Asian languages. Starting from an initial set of parallel sentences used to train bilingual baseline models, we introduce additional monolingual corpora and data processing techniques to improve translation quality. We describe a series of best practices and empirically validate the methods through an evaluation conducted on eight translation directions, based on state-of-the-art NMT approaches such as hyper-parameter search, data augmentation with forward and backward translation in combination with tags and noise, as well as joint multilingual training. Experiments show that the commonly used default architecture of self-attention NMT models does not reach the best results, validating previous work on the importance of hyper-parameter tuning. Additionally, empirical results indicate the amount of synthetic data required to efficiently increase the parameters of the models leading to the best translation quality measured by automatic metrics. We show that the best NMT models trained on a large amount of tagged back-translations outperform three other synthetic data generation approaches. Finally, comparison with statistical machine translation (SMT) indicates that extremely low-resource NMT requires a large amount of synthetic parallel data obtained with back-translation in order to close the performance gap with the preceding SMT approach.
APA, Harvard, Vancouver, ISO, and other styles
27

CHEN, QINGCAI, XIAOLONG WANG, PENGFEI SU, and YI YAO. "AUTO ADAPTED ENGLISH PRONUNCIATION EVALUATION: A FUZZY INTEGRAL APPROACH." International Journal of Pattern Recognition and Artificial Intelligence 22, no. 01 (February 2008): 153–68. http://dx.doi.org/10.1142/s0218001408006090.

Full text
Abstract:
To evaluate the pronunciation skills of spoken English is one of the key tasks for computer-aided spoken language learning (CALL). While most researchers focus on improving speech recognition techniques to build a reliable evaluation system, another important aspect of this task has been ignored, i.e. the pronunciation evaluation model that integrates both the reliabilities of existing speech processing systems and the learner's pronunciation personalities. To take this aspect into consideration, a Sugeno integral-based evaluation model is introduced in this paper. At first, the English phonemes that are hard to distinguish (HDP) for Chinese language learners are grouped into different HDP sets. Then, the system reliabilities for distinguishing the phonemes within a HDP set are computed from the standard speech corpus and are integrated with the phoneme recognition results under the Sugeno integral framework. The fuzzy measures are given for each subset of speech segments that contains n occurrences of phonemes within a HDP set. Rather than providing quantitative scores, the model gives linguistic descriptions of the evaluation results, which is more helpful for users seeking to improve their spoken language skills. To achieve better performance, genetic algorithm (GA)-based parameter optimization is also applied to optimize the model parameters. Experiments are conducted on the Sphinx-4 speech recognition platform. They show that, with an 84.7% average recognition rate of the SR system on the standard speech corpus, our pronunciation evaluation model obtains reasonable and reliable results for three kinds of test corpora.
APA, Harvard, Vancouver, ISO, and other styles
28

Tarmom, Taghreed, William Teahan, Eric Atwell, and Mohammad Ammar Alsalka. "Compression versus traditional machine learning classifiers to detect code-switching in varieties and dialects: Arabic as a case study." Natural Language Engineering 26, no. 6 (May 5, 2020): 663–76. http://dx.doi.org/10.1017/s135132492000011x.

Full text
Abstract:
The occurrence of code-switching in online communication, when a writer switches among multiple languages, presents a challenge for natural language processing tools, since they are designed for texts written in a single language. To answer the challenge, this paper presents detailed research on ways to detect code-switching in Arabic text automatically. We compare the prediction by partial matching (PPM) compression-based classifier, implemented in Tawa, and a traditional machine learning classifier, sequential minimal optimization (SMO), implemented in the Waikato Environment for Knowledge Analysis (WEKA), working specifically on Arabic text taken from Facebook. Three experiments were conducted in order to: (1) detect code-switching between the Egyptian dialect and English; (2) detect code-switching among the Egyptian dialect, the Saudi dialect, and English; and (3) detect code-switching among the Egyptian dialect, the Saudi dialect, Modern Standard Arabic (MSA), and English. Our experiments showed that PPM achieved a higher accuracy rate than SMO with 99.8% versus 97.5% in the first experiment and 97.8% versus 80.7% in the second. In the third experiment, PPM achieved a lower accuracy rate than SMO with 53.2% versus 60.2%. Code-switching between Egyptian Arabic and English text is easiest to detect because Arabic and English are generally written in different character sets. It is more difficult to distinguish between Arabic dialects and MSA as these use the same character set, and most users of Arabic, especially Saudis and Egyptians, frequently mix MSA with their dialects. We also note that the MSA corpus used for training the MSA model may not represent MSA Facebook text well, being built from news websites. This paper also describes in detail the new Arabic corpora created for this research and our experiments.
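PPM is not available in the Python standard library, so the sketch below uses zlib as a rough stand-in for the compression-based idea: assign a snippet to the class whose training text encodes it most cheaply. The training snippets are invented and far too small for real use.

```python
# Compression-based classification sketch: zlib as a rough stand-in for the
# PPM compressor (Tawa) used in the paper. Training snippets are invented.
import zlib

train = {
    "egyptian_arabic": "ازيك عامل ايه النهارده يا صاحبي",
    "english": "hey how are you doing today my friend",
}

def extra_bytes(train_text, snippet):
    base = len(zlib.compress(train_text.encode("utf-8")))
    both = len(zlib.compress((train_text + " " + snippet).encode("utf-8")))
    return both - base          # cost of encoding the snippet after the class data

def classify(snippet):
    return min(train, key=lambda label: extra_bytes(train[label], snippet))

print(classify("how are you"))
print(classify("عامل ايه"))
```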
APA, Harvard, Vancouver, ISO, and other styles
29

Mochalov, Valery P., Gennady I. Linets, Natalya Yu Bratchenko, and Svetlana V. Govorova. "An Analytical Model of a Corporate Software-Controlled Network Switch." Scalable Computing: Practice and Experience 21, no. 2 (June 27, 2020): 337–46. http://dx.doi.org/10.12694/scpe.v21i2.1698.

Full text
Abstract:
Implementing the almost limitless possibilities of a software-defined network requires additional study of its infrastructure level and assessment of the telecommunications aspect. The aim of this study is to develop an analytical model for analyzing the main quality indicators of modern network switches. Based on the general theory of queuing systems and networks, generating functions and Laplace–Stieltjes transforms, a three-phase model of a network switch was developed. Given that, in this case, the relationship between processing steps is not significant, quality indicators were obtained by taking into account the parameters of single-phase networks. This research identified the dependencies of service latency and service time of incoming network packets on load, as well as equations for finding the required size of a switch's buffer memory for an acceptable probability of message loss.
APA, Harvard, Vancouver, ISO, and other styles
30

Ivakin, Ya A., S. A. Morozov, V. M. Balashov, and M. S. Smirnova. "QUALITY ASSURANCE OF SOFTWARE AND HARDWARE COMPLEXES FOR DATA STORAGE AND PROCESSING CENTERS." Issues of radio electronics, no. 3 (March 20, 2018): 145–50. http://dx.doi.org/10.21778/2218-5453-2018-3-145-150.

Full text
Abstract:
The article presents an analysis of the structure of software and hardware systems, reviews the documentation on regulatory and technical regulation, and identifies the main performance indicators for software and hardware systems for data storage and processing centers. A generalized representation of the software and hardware structure of such complexes for data centers was developed and presented in the form of an embedded scheme. The article also identifies and structures the basic, or typical, services supported by modern software and hardware systems of data centers. The role and place of software and hardware complexes of data centers in the information support of state and corporate governance bodies are determined. The main indicators of the quality of functioning of the software and hardware complexes of a data center when hosting services are provided are presented in the work. The problem of creating the normative and technical base and the scientific and methodological tools for assessing and improving the quality of the corresponding software and hardware complexes is identified.
APA, Harvard, Vancouver, ISO, and other styles
31

Gwoździewicz, Sylwia, Dariusz Prokopowicz, Jan Grzegorek, and Martin Dahl. "APPLICATION OF DATA BASE SYSTEMS BIG DATA AND BUSINESS INTELLIGENCE SOFTWARE IN INTEGRATED RISK MANAGEMENT IN ORGANIZATION." International Journal of New Economics and Social Sciences 8, no. 2 (December 30, 2018): 12–14. http://dx.doi.org/10.5604/01.3001.0012.9925.

Full text
Abstract:
Currently, business analytics uses computerized platforms containing ready-made reporting formulas in the field of Business Intelligence. In recent years, software companies supporting enterprise management have offered advanced information-analytical Business Intelligence class systems, built up module by module and combining business intelligence software with platforms that use data warehouse technology, multi-dimensional analytical processing software, and data mining and processing applications. This article describes an example of this type of computerized analytical platform for business entities, which belongs to the class of analytical applications that allow quick access to necessary, aggregated and multi-criteria processed information. The software allows entrepreneurs and corporate managers, as well as entities from the SME sector, on the one hand to use embedded patterns of reports or analyses, and on the other hand to develop and configure their own analyses, tailored to the specifics of a particular entity. Such analytical applications make it possible to build integrated risk management systems in the organization.
APA, Harvard, Vancouver, ISO, and other styles
32

Dahl, Göran, Stephan Steigele, Per Hillertz, Anna Tigerström, Anders Egnéus, Alexander Mehrle, Martin Ginkel, et al. "Unified Software Solution for Efficient SPR Data Analysis in Drug Research." SLAS DISCOVERY: Advancing the Science of Drug Discovery 22, no. 2 (October 28, 2016): 203–9. http://dx.doi.org/10.1177/1087057116675316.

Full text
Abstract:
Surface plasmon resonance (SPR) is a powerful method for obtaining detailed molecular interaction parameters. Modern instrumentation with its increased throughput has enabled routine screening by SPR in hit-to-lead and lead optimization programs, and SPR has become a mainstream drug discovery technology. However, the processing and reporting of SPR data in drug discovery are typically performed manually, which is both time-consuming and tedious. Here, we present the workflow concept, design and experiences with a software module relying on a single, browser-based software platform for the processing, analysis, and reporting of SPR data. The efficiency of this concept lies in the immediate availability of end results: data are processed and analyzed upon loading the raw data file, allowing the user to immediately quality control the results. Once completed, the user can automatically report those results to data repositories for corporate access and quickly generate printed reports or documents. The software module has resulted in a very efficient and effective workflow through saved time and improved quality control. We discuss these benefits and show how this process defines a new benchmark in the drug discovery industry for the handling, interpretation, visualization, and sharing of SPR data.
APA, Harvard, Vancouver, ISO, and other styles
33

Dewi, Sofia Prima, and Cynthia Cynthia. "Aggressiveness tax in Indonesia." Jurnal Akuntansi 22, no. 2 (August 29, 2018): 239. http://dx.doi.org/10.24912/ja.v22i2.350.

Full text
Abstract:
The purpose of this study was to obtain empirical evidence about the influence of liquidity, corporate social responsibility, earnings management, and firm size on tax aggressiveness in manufacturing companies listed consistently on the Indonesia Stock Exchange during 2013-2015. This study used a sample of sixty-four manufacturing companies. The EViews software program was used for data processing. The results indicate that liquidity has an influence on tax aggressiveness, while corporate social responsibility, earnings management, and firm size have no influence on tax aggressiveness.
APA, Harvard, Vancouver, ISO, and other styles
34

Espinel Villalobos, Rodrigo Ivan, Erick Ardila Triana, Henry Zarate Ceballos, and Jorge Eduardo Ortiz Triviño. "Design and Implementation of Network Monitoring System for Campus Infrastructure Using Software Agents." Ingeniería e Investigación 42, no. 1 (July 16, 2021): e87564. http://dx.doi.org/10.15446/ing.investig.v42n1.87564.

Full text
Abstract:
In network management and monitoring systems, or Network Management Stations (NMS), the Simple Network Management Protocol (SNMP) is normally used, with which it is possible to obtain information on the behavior, the values of the variables, and the status of the network architecture. However, for large corporate networks, the protocol can present latency in data collection and processing, thus making real-time monitoring difficult. This article proposes a layer-based multi-agent system with three types of agents: a collector agent, which uses Management Information Base (MIB) values to collect information from the network equipment and build an input table of device information; a consolidator agent, which processes the collected data and leaves it in a consumable format; and an application agent, which presents the result as a web service, in this case as a heat map.
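A sketch of the kind of SNMP polling a collector agent performs, assuming the pysnmp high-level API; the host address, community string and OID are placeholders, and the article's system wraps such polling in its multi-agent architecture.

```python
# Collector-style SNMP GET sketch, assuming pysnmp's high-level API.
# Host, community string and OID are placeholders for illustration.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

error_indication, error_status, error_index, var_binds = next(getCmd(
    SnmpEngine(),
    CommunityData("public", mpModel=1),                  # SNMPv2c community
    UdpTransportTarget(("192.0.2.1", 161)),              # placeholder switch IP
    ContextData(),
    ObjectType(ObjectIdentity("1.3.6.1.2.1.1.3.0")),     # sysUpTime MIB value
))

if error_indication:
    print(error_indication)
else:
    for name, value in var_binds:
        print(f"{name} = {value}")
```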
APA, Harvard, Vancouver, ISO, and other styles
35

Orazbekov, Zh N., A. K. Moshkalov, and K. Zh Sabraev. "OPTIMIZATION OF ANOTHER MANAGEMENT ALGORITHM IN THE PROCESS OF PROCESSING AND EXCHANGE OF PRODUCTION DATA IN THE ENVIRONMENT OF THE CORPORATE PORTAL." BULLETIN Series of Physics & Mathematical Sciences 69, no. 1 (March 10, 2020): 395–99. http://dx.doi.org/10.51889/2020-1.1728-7901.72.

Full text
Abstract:
Processes of integrating small enterprises into corporations are now actively under way. A corporation's information system usually needs to support the work of several geographically dispersed units. The automation process begins with an analysis of the company's activities and the formulation of basic recommendations for a future information system. Only then does the question arise of choosing a ready-made system or developing one's own. In this case, it is necessary to solve a number of problems, such as the choice of base software and hardware, the design of the functional structure of the information system, the design of distributed databases, and the calculation of the parameters of its functioning. This article describes a topological model for managing a specialized enterprise's corporate data warehouse, tools and components for transmitting production data in the form of key topological objects, and an algorithm and a mathematical model for efficient queue management in the process of processing and exchanging corporate data. This model describes the result of performing an additional function in the process of transferring and processing corporate portal data.
APA, Harvard, Vancouver, ISO, and other styles
36

Zhang, Yang You. "Two Integration Model and Algorithm Design of Supply Chain Based on Swarm Calculation and Simulation." Applied Mechanics and Materials 608-609 (October 2014): 181–85. http://dx.doi.org/10.4028/www.scientific.net/amm.608-609.181.

Full text
Abstract:
When settlement volume and settlement efficiency grow, enterprise credit grows, financing interest rates decrease, and corporate earnings increase. Realizing the 'two integration' of financial supply chain management from order to settlement can improve the utilization ratio of funds, help enterprises better manage capital costs, and improve their economic efficiency. In this paper we mainly use computer data processing methods to improve the financial data processing model, and use function approximation to speed up the convergence of data processing. Combined with the nonlinear fitting principle, we establish an interpolation function model of financial management, and use VB software to develop and design the financial supply chain management system. Finally, through calculation we obtain the economic benefit of the 'two integration' management program. This provides a technical reference for the application of computer technology in financial management.
APA, Harvard, Vancouver, ISO, and other styles
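The abstract mentions building an interpolation (nonlinear fitting) model; as a loose, hedged analogue, here is a tiny Python sketch of fitting and evaluating a polynomial approximation. The article itself used VB, and the settlement data points below are invented for the example.

```python
# Rough analogue of the interpolation-function idea; data are made up for illustration.
import numpy as np

days = np.array([1, 5, 10, 20, 30])              # days from order to settlement
cost = np.array([1.00, 0.97, 0.93, 0.86, 0.78])  # relative capital cost

# Fit a quadratic as a simple nonlinear approximation, then evaluate it at an unseen point.
coeffs = np.polyfit(days, cost, deg=2)
estimate_at_15_days = np.polyval(coeffs, 15)
print(round(float(estimate_at_15_days), 3))
```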
37

Sutisna, Husen, Aida Vitayala S. Hubeis, and Muhammad Syamsun. "Peran Human Capital, Corporate Value dan Good Corporate Governance melalui Kinerja Karyawan terhadap Kinerja Perusahaan di PTPN VII Lampung." MANAJEMEN IKM: Jurnal Manajemen Pengembangan Industri Kecil Menengah 9, no. 2 (December 2, 2014): 131–39. http://dx.doi.org/10.29244/mikm.9.2.131-139.

Full text
Abstract:
Changes in the business environment lead various companies to keep striving to improve their business strategies in order to survive and gain a competitive advantage. The peak of these changes came with the arrival of the business era of information and science. In this era, business strategies considered suitable include the application of a human resource development system based on human capital and company management based on corporate values and good corporate governance (GCG). PTPN VII, a state-owned enterprise in agribusiness, has tried to implement such a system. This study aimed to examine the relationships between human capital, corporate values and GCG, and their relation to the performance of employees and of the company. The population of this research consisted of 400 employees of the PTPN VII head office, from which a sample of 120 respondents was drawn. The sampling technique used was non-probability sampling with quota sampling. The data were processed and analyzed with structural equation modeling-partial least squares (SEM-PLS) using SmartPLS software. The results indicated that the company's implementation of human capital contributed positively to the increase in employee performance. The implementation and internalization of corporate values among employees also contributed positively to the improvement of employee performance. The increased employee performance played a positive role in improving company performance. The implementation of corporate governance principles could improve the performance of the company, but did not play a great role in improving employee performance.
APA, Harvard, Vancouver, ISO, and other styles
38

DALE, ROBERT. "Industry Watch." Natural Language Engineering 11, no. 4 (November 10, 2005): 435–38. http://dx.doi.org/10.1017/s1351324905003979.

Full text
Abstract:
Suppose you're a corporate vice president at a well-known international software company, and you want to check on the visibility of one of your leading researchers in the outside world. You're sitting at your desk, so the most obvious thing to do is to enter their name into a search engine. If the well-known international software company happened to be Microsoft, and if the leading researcher happened to be Microsoft's Susan Dumais, and if the search engine you decided to use happened to be Google, you might be surprised to find that the sponsored link that comes atop the search results is actually from Google itself, exhorting you to ‘Work on NLP at Google’, and alerting you to the fact that ‘Google is hiring experts in statistical language processing’.
APA, Harvard, Vancouver, ISO, and other styles
39

Surnin, O. L., P. V. Sitnikov, A. A. Khorina, A. V. Ivaschenko, A. A. Stolbova, and N. Yu Ilyasova. "Industrial application of big data services in digital economy." Information Technology and Nanotechnology, no. 2416 (2019): 409–16. http://dx.doi.org/10.18287/1613-0073-2019-2416-409-416.

Full text
Abstract:
Nowadays the world is moving toward automation, and many companies develop programs for implementing industrial applications. But is it so easy to implement systems capable of processing large amounts of information in production? Despite multiple positive results in research and development of Big Data technologies, their practical implementation and use remain challenging. At the same time, the most prominent trends of the digital economy require Big Data analysis in various problem domains. We analyzed existing data processing work and, based on a generalization of theoretical research and a number of real-economy projects in this area, this paper proposes the architecture of a software development kit that can be used as a solid platform for building industrial applications. A basic algorithm was formulated for processing data from various sources (sensors, corporate systems, etc.). Examples are given for the automobile industry, with reference to a practical implementation of the Industry 4.0 paradigm, and are illustrated by trend graphs and by a subject-area ontology of the automotive industry. (A minimal collect-and-aggregate sketch follows this entry.)
APA, Harvard, Vancouver, ISO, and other styles
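As a hedged illustration of the "basic algorithm for processing data from various sources" mentioned above, here is a small Python sketch of a collect-normalize-aggregate step; the record schema, field names, and example values are assumptions, not taken from the paper.

```python
# Minimal collect -> normalize -> aggregate flow; schema and values are invented.
from statistics import mean
from typing import Iterable

def normalize(record: dict) -> dict:
    """Bring readings from different sources (sensors, corporate systems) to one schema."""
    return {
        "source": record.get("source", "unknown"),
        "metric": record["metric"],
        "value": float(record["value"]),
    }

def aggregate(records: Iterable[dict]) -> dict:
    """Average each metric across sources, the kind of summary a trend graph needs."""
    grouped: dict[str, list[float]] = {}
    for rec in records:
        grouped.setdefault(rec["metric"], []).append(rec["value"])
    return {metric: mean(values) for metric, values in grouped.items()}

raw = [
    {"source": "sensor-7", "metric": "spindle_temp", "value": "71.2"},
    {"source": "mes", "metric": "spindle_temp", "value": "70.8"},
    {"source": "erp", "metric": "scrap_rate", "value": "0.013"},
]
print(aggregate(normalize(r) for r in raw))
```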
40

Manurung, Vio Lolyta, and Francis Hutabarat. "PENGARUH CORPORATE GOVERNANCE TERHADAP TAX AVOIDANCE DENGAN MEDIASI LIKUIDITAS PADA PERUSAHAAN BUMN YANG TERDAPAT DI BEI TAHUN 2017-2019." GOING CONCERN : JURNAL RISET AKUNTANSI 15, no. 3 (October 12, 2020): 478. http://dx.doi.org/10.32400/gc.15.3.30275.2020.

Full text
Abstract:
The aim of this study is to test the effect of corporate governance on tax avoidance, using liquidity as a mediator, in state-owned companies (BUMN) listed on the Indonesia Stock Exchange (BEI) for the 2017-2019 period. The study uses descriptive analysis, with processing carried out in SPSS software. Twenty state-owned companies (BUMN) listed on the Indonesia Stock Exchange (BEI) for the 2017-2019 period were used as the sample. Tax avoidance serves as the dependent variable, corporate governance as the independent variable, and liquidity as the mediating variable. In addition, descriptive statistics, the F test, and the t test were used as statistical analyses. The results show that corporate governance has a significant effect on tax avoidance, while liquidity, as the mediating variable, does not have a significant effect on tax avoidance.
APA, Harvard, Vancouver, ISO, and other styles
41

Zörög, Zoltán, Tamás Csomós, and Csaba Szűcs. "ERP systems in higher education." Applied Studies in Agribusiness and Commerce 6, no. 3-4 (November 30, 2012): 103–9. http://dx.doi.org/10.19041/apstract/2012/3-4/14.

Full text
Abstract:
In the past few decades, data processing and in-company communication have changed significantly. At first only a few computers were purchased at companies, so departments developed applications that covered corporate administration, which led to so-called isolated solutions. These days, with the spread of electronic data processing, the greatest problem for companies is not obtaining information, since it can be found in all sorts of databases and data warehouses as internal or external information, but rather producing the information that is necessary in a given situation. What can help in this situation? Informatics, more precisely ERP systems, which have replaced the software that provided isolated solutions at companies for decades. System-based thinking is important in their application, alongside the requirement that only the data absolutely necessary for managerial decisions be produced. This paper points out why we consider practice-oriented teaching of ERP systems in higher education important.
APA, Harvard, Vancouver, ISO, and other styles
42

Bangun, Nurainun, Yuniarwati Yuniarwati, and Linda Santioso. "Pengaruh corporate governance, profitability, dan foreign ownership terhadap dividend policy pada perusahaan manufaktur yang terdaftar di bursa efek indonesia Periode 2014-2016." Jurnal Akuntansi 22, no. 2 (August 29, 2018): 279. http://dx.doi.org/10.24912/ja.v22i2.353.

Full text
Abstract:
The purpose of this research is to analyze the effect of corporate governance, profitability, and foreign ownership on dividend policy. The population comprises manufacturing companies listed on the Indonesia Stock Exchange for the period 2014-2016. Using purposive sampling, 95 data points were selected as the sample. Data processing was performed with the software program IBM SPSS version 23. The results show that board size has a significant effect on dividend policy, whereas board independence and CEO duality do not have a significant effect on dividend policy. Profitability has a significant effect on dividend policy, and foreign ownership also has a significant effect on dividend policy.
APA, Harvard, Vancouver, ISO, and other styles
43

Sokolov, B., and A. Kolosov. "Blockchain Technology as a Platform for Integrating Corporate Systems." Automatic Control and Computer Sciences 55, no. 3 (May 2021): 234–42. http://dx.doi.org/10.3103/s014641162103010x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Loginovskiy, O. V., A. A. Maximov, S. A. Zolotykh, and V. O. Loginovskaya. "DEVELOPMENT OF ORGANIZATIONAL AND CORPORATE SYSTEMS USING MODERN MATHEMATICAL METHODS AND MODELS." Bulletin of the South Ural State University. Ser. Computer Technologies, Automatic Control & Radioelectronics 21, no. 1 (February 2021): 116–35. http://dx.doi.org/10.14529/ctcr210111.

Full text
Abstract:
Analysis of the modern technologies, methods, and models used in various types of organizational and corporate structures convincingly shows that the preparation and making of management decisions for these structures is currently based mainly on approaches that are often outdated and do not fully correspond to the modern capabilities of computer technology and of information and software developments. The article shows that, under modern conditions of global instability, it becomes necessary to use adequate methods for data analysis and for preparing management decisions on the development of organizational and corporate structures. Proposals and recommendations are presented for improving analytical data processing and the extraction of useful information from the large amounts of data held in the relevant organizational and corporate systems, along with adequate mathematical models and algorithms that can be used to improve the quality of management decisions made by company management. The purpose of the study is to form methods and models for the analysis of strategic alternatives for the development of organizational and corporate systems using the concept of big data, technologies for extracting the necessary information from existing data banks, etc. Materials and methods. The research methods are based on modern information and analytical technologies, data science, and models developed by the authors for the analysis of strategic alternatives for the development of organizational and corporate systems. Results. The scientific provisions and developments presented in the article can be used to improve the efficiency of management in various information and analytical systems for various management structures. Conclusion. The results presented in this article make it possible to perform qualitative data analysis and to model options for the operation of organizational and corporate structures in online mode, which increases the efficiency of managing their development based on a comparison of alternative management decisions.
APA, Harvard, Vancouver, ISO, and other styles
45

Nurutdinova, I. N., and L. A. Dimitrova. "Information system for assessing maturity level of an organization." Advanced Engineering Research 20, no. 3 (October 5, 2020): 317–24. http://dx.doi.org/10.23947/2687-1653-2020-20-3-317-324.

Full text
Abstract:
Introduction. The paper considers the problem of creating information support for assessing the maturity level of an organization. It is proposed to use intelligent information systems, i.e. expert systems. The substantive aspects of the various stages of creating such systems are briefly described, and the architecture of an expert system based on a fuzzy expert knowledge base is given. The objective of the work was to create new software for assessing the maturity level of an organization. Materials and Methods. Earlier modeling of the subject domain allowed us to create a knowledge base in the form of production memory, which is the basis of the fuzzy inference mechanism. The software system is a web application written primarily in PHP and JavaScript and is suitable for embedding in complex web applications. It works in all modern web browsers, which significantly accelerates implementation and deployment at both the parent enterprise and its subsidiaries. Results. New software has been created to automate the processing of questionnaires during the organization's self-assessment based on key indicators, taking into account six main groups of quality management system indicators. Application of the program significantly speeds up the input and processing of the expert information required for self-assessment and gives organizations an adequate idea of the opportunities and prospects for improving their quality management system. Some fragments of the software system interface are shown. Discussion and Conclusions. The proposed software can be used to determine the maturity level of an organization. The use of web technologies improves usability and reduces software support costs. The software can either be deployed in a customer's existing network infrastructure or used in full by connecting to a remote server. It is optimized for various screen resolutions, which allows it to be used not only at the central office but also when analyzing the quality management systems of corporate customers, and the traffic generated by the web application is optimized for mobile devices with low-speed Internet connections. Application of the program significantly reduces the time users spend entering and processing the expert information required for the task and eliminates duplication of information. (A toy fuzzy-membership sketch follows this entry.)
APA, Harvard, Vancouver, ISO, and other styles
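The article's fuzzy knowledge base is implemented in PHP and its rules are not published in the abstract; the following Python sketch only illustrates the general idea of mapping an averaged questionnaire score onto fuzzy maturity levels, with membership functions and level names invented for the example.

```python
# Toy illustration only: fuzzy membership of an averaged self-assessment score in three
# maturity levels; it does not reproduce the article's PHP expert system.
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with support [a, c] and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def maturity_memberships(score: float) -> dict[str, float]:
    """Degree to which a 0..10 questionnaire score belongs to each maturity level."""
    return {
        "initial": triangular(score, 0.0, 2.0, 5.0),
        "defined": triangular(score, 3.0, 5.0, 8.0),
        "optimized": triangular(score, 6.0, 9.0, 10.0),
    }

score = 6.4  # e.g. the average score of one indicator group from the questionnaire
degrees = maturity_memberships(score)
print(max(degrees, key=degrees.get), degrees)
```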
46

Hadiah Fitriyah, Bambang Tjahjadi, and Noorlailie Soewarno. "Peran Green Product Innovation Dalam Memediasi Pengaruh Corporate Social Responsibility Terhadap Kinerja Bisnis Industri Kreatif." Journal of Accounting Science 4, no. 1 (January 31, 2020): 12–28. http://dx.doi.org/10.21070/jas.v4i1.397.

Full text
Abstract:
This study aims to empirically examine the role of green product innovation in mediating the influence of corporate social responsibility on the business performance of the creative batik industries. The research approach used is quantitative. The population comprises the creative batik industries spread across the Sidoarjo and Bangkalan regions; the number of small and medium batik craft industries in Sidoarjo and Bangkalan is 98. The sample consisted of 79 batik creative industries, determined using the Slovin formula, with selection based on simple random sampling. Data were analyzed using SEM-PLS with the help of SmartPLS software. The results showed that: (1) corporate social responsibility (CSR) influences green product innovation (GPI), with a t value of 5.384 and a p-value of 0.000; (2) green product innovation (GPI) influences business performance, with a t value of 5.492 and a p-value of 0.000; (3) green product innovation (GPI) is able to mediate the influence of corporate social responsibility (CSR) on the business performance (BP) of the creative industries, with a t value of 3.771 and a p-value of 0.000.
APA, Harvard, Vancouver, ISO, and other styles
47

Istianingsih, Terri Trireksani, and Daniel T. H. Manurung. "The Impact of Corporate Social Responsibility Disclosure on the Future Earnings Response Coefficient (ASEAN Banking Analysis)." Sustainability 12, no. 22 (November 19, 2020): 9671. http://dx.doi.org/10.3390/su12229671.

Full text
Abstract:
Corporate social responsibility in the banking industry has an impact on the environment and society. Research was conducted on the impact of corporate social responsibility disclosure on the future earnings response coefficients of ASEAN (Association of South East Asian Nations) banking, to determine how concerned ASEAN banks are with disclosing corporate responsibility and to understand the levels of future earnings response coefficients. Corporate social responsibility disclosure was the measured variable, while the Future Earnings Response Coefficient (FERC) was based on the value of banking stocks. Size, growth, earnings persistence, and earnings volatility served as control variables. The sampling method was a purposive sampling approach; a research sample of 280 banks in 5 ASEAN countries was determined with this provision: banking report data were taken from the stock exchanges of each country and from sustainability reports using the Global Reporting Initiative (GRI) standard version 4 (G4) from 2014 to 2018. The researchers conducted multiple regression analysis to examine the variables; the analysis used panel data, and data processing was carried out using review software. The results show that corporate social responsibility disclosure has a positive and significant effect on the future earnings response coefficient, whereas the other variables (company size, growth, and earnings persistence) have no relationship with the disclosure of corporate responsibility or FERC. Only earnings volatility has an influence on the disclosure of corporate social responsibility and FERC.
APA, Harvard, Vancouver, ISO, and other styles
48

Pathak, Jagdish, and Navneet Vidyarthi. "Cost Framework for Evaluation of Information Technology Alternatives in Supply Chain." International Journal of Strategic Decision Sciences 2, no. 1 (January 2011): 66–84. http://dx.doi.org/10.4018/jsds.2011010104.

Full text
Abstract:
Organizations often face the problem of determining the degree of investment in building information links with their suppliers and buyers to reduce costs, lead times, and quality problems, improve timely customized delivery, increase asset utilization, and improve corporate profitability. One of the critical enablers of an efficient and effective supply chain is timely planning and information processing across the entire value-added chain. This paper presents an analytical model for selecting the right mix of analytical software and hardware alternatives at the various planning and execution levels of an organization to remain competitive in a supply chain. Factors such as quality, reliability, flexibility, timeliness, and organizational compatibility are quantified into cost components that form a weighted cost function. The weights of the various cost components of software and hardware are derived from pair-wise comparison and account for the relative importance of alternative supply chain strategies for an organization. A numerical example demonstrates the applicability of the proposed framework and the efficacy of the procedures and algorithms. (A short pair-wise weighting sketch follows this entry.)
APA, Harvard, Vancouver, ISO, and other styles
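As a hedged sketch of how weights can be derived from pair-wise comparisons, here is a small Python example using the geometric-mean (row-product) method; the 3x3 judgment matrix and factor names are invented, and the paper's actual procedure and values may differ.

```python
# Derive normalized weights from a pair-wise comparison matrix (geometric-mean method).
import math

# Pairwise judgments for three cost factors, e.g. quality, flexibility, timeliness:
# matrix[i][j] says how much more important factor i is than factor j.
matrix = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

geo_means = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
total = sum(geo_means)
weights = [g / total for g in geo_means]

print([round(w, 3) for w in weights])  # weights sum to 1 and feed the weighted cost function
```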
49

Podroiko, Ye V., and Yu M. Lysetskyi. "Network technologies: evolution and peculiarities. Mathematical machines and systems." Mathematical machines and systems 2 (2020): 14–29. http://dx.doi.org/10.34121/1028-9763-2020-2-14-29.

Full text
Abstract:
Today a corporate network is seen as a complex system and traditionally comprises a set of interacting essential components: the Main Site (the head-office network); Remote Sites (Branches, the networks of remote offices); the WAN (the global network uniting the office networks); the LAN (a local network); the WAN Edge (the point of connection to the WAN); the Internet Edge (the point of connection to the Internet); and the Data Center (the corporate data processing center). Some sources also regard the Service Block, a separate network segment with specific services, as a component. Every component of a corporate network involves an individual set of technologies, each with its own history of origination and development. The paper offers a short review of the basic technologies that form the history of corporate network development, as well as their evolution from a set of separate network technologies to a unified multi-service network infrastructure. This unified infrastructure is inextricably linked with the global Internet, which is both a service and a carrier for the majority of modern corporate networks. The paper describes the origination and development of the Internet, local and global networks, Wi-Fi networks, and software-defined networks. The corporate network has gone through a long evolution, from the coexistence of separate technologies to a modern unified intelligent network infrastructure with high security and reliable management. Due to the fast-moving development of information technologies, corporate networks have transformed dynamically in several directions: network functions virtualization (NFV), the use of SDN solutions, automation of management processes, analytics, security, and cloud services. In the course of this transformation the corporate network has turned into a unified, flexible, application-oriented infrastructure with high reliability, easily modified and extended functionality, a single management center, unified security policies, and fast, detailed analysis of internal network processes.
APA, Harvard, Vancouver, ISO, and other styles
50

Neroda, Tetyana. "APPROACHES TO TECHNOLOGICAL STAGES SIMULATION IN ACADEMIC MEDIA PLATFORM ENVIRONMENT OF LEARNING EXPERIMENT." ГРААЛЬ НАУКИ, no. 7 (September 3, 2021): 163–69. http://dx.doi.org/10.36074/grail-of-science.27.08.2021.029.

Full text
Abstract:
The paper presents a methodology for organizing a virtual laboratory workshop, using as an example the study of the lamination stage of a print order within the post-press processing of printed products, for the training of qualified specialists in engineering specialties. Although the application of commercial simulation-modeling complexes in the educational process is extensively covered in open sources, the analysis performed showed the need to design an original client-server virtual platform for the learning experiment and to further develop industry-oriented structural components as a pedagogical toolkit for it. Therefore, a software engine is proposed that, supported by relevant program libraries and up-to-date information from the enterprise's corporate database, performs operational computation and dynamic management of the active learning environment based on requests and subsequent decision-making, while the student independently builds a strategy for achieving the goal by means of the most adequate simulation models. The applied architecture of the software engine presupposes interdisciplinary skills, allows the academic media platform of the learning experiment to work in three educationally oriented modes, and provides work experience close to that of production.
APA, Harvard, Vancouver, ISO, and other styles
