
Journal articles on the topic 'Paraphrase extraction'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 39 journal articles for your research on the topic 'Paraphrase extraction.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

HO, CHUKFONG, MASRAH AZRIFAH AZMI MURAD, RABIAH ABDUL KADIR, and SHYAMALA DORAISAMY. "COMPARING TWO CORPUS-BASED METHODS FOR EXTRACTING PARAPHRASES TO DICTIONARY-BASED METHOD." International Journal of Semantic Computing 05, no. 02 (June 2011): 133–78. http://dx.doi.org/10.1142/s1793351x11001225.

Full text
Abstract:
Paraphrase extraction plays an increasingly important role in language-related research and applications in areas such as information retrieval, question answering and automatic machine evaluation. Most existing methods extract paraphrases from different types of corpora using syntactic-based approaches. Because a syntactic-based approach relies on contextual similarity to identify and capture paraphrases, terms other than paraphrases that tend to appear in similar contexts, such as loosely related terms and functionally similar yet unrelated terms, also tend to be extracted. In addition, different types of corpora suffer from different problems, such as limited availability and domain bias. This paper presents a purely semantic-based paraphrase extraction model. The model collects paraphrase candidates from multiple lexical resources and validates them semantically in three ways: by computing domain similarity, definition similarity and word similarity. The model is benchmarked against two leading syntactic-based approaches. Experimental results from a manual evaluation show that the proposed model outperforms the benchmarks, indicating that a semantic-based approach should be preferred over a syntactic-based approach for paraphrase extraction. The results further suggest that a hybrid of the two approaches should be applied if one targets strictly precise paraphrases.
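To illustrate the kind of semantic validation described in the abstract, here is a minimal sketch in Python: paraphrase candidates are pulled from one lexical resource (WordNet via NLTK) and kept only if their definitions (glosses) overlap. The choice of resource, the Jaccard measure, and the 0.2 threshold are assumptions made for illustration; this is not the authors' implementation.

```python
# Minimal sketch: collect paraphrase candidates from a lexical resource (WordNet)
# and validate them by definition (gloss) similarity, in the spirit of the
# semantic validation described above. Not the authors' implementation.
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')


def gloss_tokens(word):
    """Union of longer words from the glosses of all synsets of `word`."""
    tokens = set()
    for synset in wn.synsets(word):
        tokens.update(t.lower() for t in synset.definition().split() if len(t) > 3)
    return tokens


def definition_similarity(word_a, word_b):
    """Jaccard overlap between the gloss vocabularies of two words."""
    a, b = gloss_tokens(word_a), gloss_tokens(word_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def validated_paraphrases(word, threshold=0.2):
    """Candidate lemmas from WordNet synsets, kept only if their glosses overlap."""
    candidates = {lemma.name().replace('_', ' ')
                  for synset in wn.synsets(word)
                  for lemma in synset.lemmas()} - {word}
    return {c for c in candidates if definition_similarity(word, c) >= threshold}


print(validated_paraphrases('extract'))
```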
APA, Harvard, Vancouver, ISO, and other styles
2

Pöckelmann, Marcus, Janis Dähne, Jörg Ritter, and Paul Molitor. "Fast paraphrase extraction in Ancient Greek literature." it - Information Technology 62, no. 2 (April 26, 2020): 75–89. http://dx.doi.org/10.1515/itit-2019-0042.

Full text
Abstract:
In this paper, we present a method for paraphrase extraction in Ancient Greek that can be applied to huge text corpora in interactive humanities applications. (A shorter version of the paper appeared in German in the final report of the Digital Plato project, which was funded by the Volkswagen Foundation from 2016 to 2019 [35], [28].) Since lexical databases and POS tagging are either unavailable or do not achieve sufficient accuracy for ancient languages, our approach is based on pure word embeddings and the word mover's distance (WMD) [20]. We show how to adapt the WMD approach to paraphrase searching such that the expensive WMD computation has to be carried out for only a small fraction of the text segments contained in the corpus. Formally, the time complexity is reduced from O(N·K³·log K) to O(N + K³·log K) compared to the brute-force approach, which computes the WMD between each text segment of the corpus and the search query; here N is the length of the corpus and K the size of its vocabulary. The method, which searches not only for paraphrases of the same length as the search query but also for paraphrases of varying lengths, was evaluated on the Thesaurus Linguae Graecae® (TLG®) [25]. The TLG consists of about 75 million Greek words. We searched the whole TLG for paraphrases of given passages of Plato. The experimental results show that our method and the brute-force approach propose, with only very few exceptions, the same text passages in the TLG as possible paraphrases. The computation times of our method are in a range that allows its application in interactive systems and lets humanities scholars work productively and smoothly.
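The following sketch illustrates the general two-stage idea of restricting the expensive WMD computation to a small candidate set. The toy corpus, the gensim-based embeddings, and the centroid pre-filter are assumptions made for illustration; the paper's own pruning scheme, which yields the complexity reduction stated above, is not reproduced here.

```python
# Two-stage paraphrase search sketch: a cheap centroid-distance filter first, the
# expensive word mover's distance (WMD) only for surviving candidates. Toy corpus
# and pre-filter are illustrative assumptions, not the authors' exact pruning.
import numpy as np
from gensim.models import Word2Vec

segments = [["wisdom", "begins", "in", "wonder"],
            ["philosophy", "starts", "with", "wonder"],
            ["the", "ship", "sailed", "to", "crete"]]
wv = Word2Vec(segments, vector_size=25, min_count=1, epochs=200).wv

def centroid(tokens):
    vecs = [wv[t] for t in tokens if t in wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(wv.vector_size)

def search(query, keep=2, top=1):
    q = centroid(query)
    # Stage 1: rank every segment by the cheap centroid distance.
    shortlist = sorted(segments, key=lambda s: np.linalg.norm(q - centroid(s)))[:keep]
    # Stage 2: exact WMD only on the shortlist (needs the POT/pyemd backend).
    return sorted(shortlist, key=lambda s: wv.wmdistance(query, s))[:top]

print(search(["wisdom", "begins", "in", "wonder"]))
```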
APA, Harvard, Vancouver, ISO, and other styles
3

Chitra, A., and Anupriya Rajkumar. "Paraphrase Extraction using fuzzy hierarchical clustering." Applied Soft Computing 34 (September 2015): 426–37. http://dx.doi.org/10.1016/j.asoc.2015.05.017.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

VILA, M., H. RODRÍGUEZ, and M. A. MARTÍ. "Relational paraphrase acquisition from Wikipedia: The WRPA method and corpus." Natural Language Engineering 21, no. 3 (September 16, 2013): 355–89. http://dx.doi.org/10.1017/s1351324913000235.

Full text
Abstract:
Paraphrase corpora are an essential but scarce resource in Natural Language Processing. In this paper, we present the Wikipedia-based Relational Paraphrase Acquisition (WRPA) method, which extracts relational paraphrases from Wikipedia, and the derived WRPA paraphrase corpus. The WRPA corpus currently covers person-related and authorship relations in English and Spanish, respectively, suggesting that, given adequate Wikipedia coverage, our method is independent of the language and the relation addressed. WRPA extracts entity pairs from structured information in Wikipedia applying distant learning and, based on the distributional hypothesis, uses them as anchor points for candidate paraphrase extraction from the free text in the body of Wikipedia articles. Focussing on relational paraphrasing and taking advantage of Wikipedia-structured information allows for an automatic and consistent evaluation of the results. The WRPA corpus characteristics distinguish it from other types of corpora that rely on string similarity or transformation operations. WRPA relies on distributional similarity and is the result of the free use of language outside any reformulation framework. Validation results show a high precision for the corpus.
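A minimal sketch of the anchor-point idea: entity pairs known to stand in a relation (here a toy authorship list standing in for Wikipedia's structured data) are located in free text, and the words between the two anchors are kept as candidate relational paraphrase patterns. All names and sentences below are invented for illustration; this is not the WRPA implementation.

```python
# Toy sketch of distant-supervision anchoring: relation pairs anchor candidate
# paraphrase patterns in free text. Slot order (X/Y) is not normalized here.
authorship_pairs = [("Miguel de Cervantes", "Don Quixote"),
                    ("George Orwell", "1984")]

sentences = [
    "Miguel de Cervantes wrote Don Quixote in the early 17th century.",
    "Don Quixote, the best-known work of Miguel de Cervantes, appeared in 1605.",
    "George Orwell is the author of 1984.",
]

def candidate_patterns(pairs, sentences):
    patterns = set()
    for author, work in pairs:
        for sent in sentences:
            if author in sent and work in sent:
                # Keep the text between the two anchors, in whichever order they occur.
                left, right = sorted((author, work), key=sent.index)
                between = sent[sent.index(left) + len(left):sent.index(right)]
                patterns.add("X " + between.strip(" ,") + " Y")
    return patterns

print(candidate_patterns(authorship_pairs, sentences))
# e.g. {'X wrote Y', 'X is the author of Y', 'X the best-known work of Y'}
```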
APA, Harvard, Vancouver, ISO, and other styles
5

Recasens, Marta, and Marta Vila. "On Paraphrase and Coreference." Computational Linguistics 36, no. 4 (December 2010): 639–47. http://dx.doi.org/10.1162/coli_a_00014.

Full text
Abstract:
By providing a better understanding of paraphrase and coreference in terms of similarities and differences in their linguistic nature, this article delimits what the focus of paraphrase extraction and coreference resolution tasks should be, and to what extent they can help each other. We argue for the relevance of this discussion to Natural Language Processing.
APA, Harvard, Vancouver, ISO, and other styles
6

ZHAO, Shi-Qi, Lin ZHAO, Ting LIU, and Sheng LI. "Paraphrase Collocation Extraction Based on Binary Classification." Journal of Software 21, no. 6 (June 29, 2010): 1267–76. http://dx.doi.org/10.3724/sp.j.1001.2010.03586.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Mauro Mirto, Ignazio. "Automatic Extraction of Semantic Roles in Support Verb Constructions." International Journal on Natural Language Computing 10, no. 03 (June 30, 2021): 1–10. http://dx.doi.org/10.5121/ijnlc.2021.10301.

Full text
Abstract:
This paper deals with paraphrastic relations in Italian. In the following sentences: (a) Max strappò delle lacrime a Sara 'Max moved Sara to tears' and (b) Max fece piangere Sara 'Max made Sara cry', the verbs differ syntactically and semantically. Strappare 'tear/rip/wring' is transitive, fare ‘have/make’ is a causative, and piangere 'cry' is intransitive. Despite this, a translation of (a) as (b) is legitimate and therefore (a) is a paraphrase of (b). In theoretical linguistics this raises an issue concerning the relationship between strappare and fare/piangere in Italian, and that in English between move and make. In computational linguistics, can such paraphrases be obtained automatically? Which apparatus should be deployed? The aim of this paper is to suggest a pathway with which to answer these questions.
APA, Harvard, Vancouver, ISO, and other styles
8

Hu Hongsi, Zhang Wenbo, and Yao Tianfang. "Paraphrase Extraction from Interactive Q&A Communities." International Journal of Information Processing and Management 4, no. 2 (April 30, 2013): 45–52. http://dx.doi.org/10.4156/ijipm.vol4.issue2.6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Glazkova, Anna Valer'evna. "Statistical evaluation of the information content of attributes for the task of searching for semantically close sentences." Программные системы и вычислительные методы, no. 1 (January 2020): 8–17. http://dx.doi.org/10.7256/2454-0714.2020.1.31728.

Full text
Abstract:
The paper presents the results of evaluating the informativeness of quantitative and binary features for the task of finding semantically close sentences (paraphrases). Three types of features are considered: features built on vector representations of words (the Word2Vec model), features based on the extraction of numbers and structured information, and features reflecting quantitative characteristics of the text. As indicators of informativeness, the percentage of paraphrases among examples exhibiting a given feature (for binary features) and estimates obtained with the accumulated-frequency method (for quantitative features) are used. The assessment was conducted on a Russian paraphrase corpus. The feature set was tested as input for two machine-learning models for detecting semantically close sentences: a support vector machine (SVM) and a recurrent neural network. The first model takes only the feature set as input; the second takes the text as a sequence of tokens, with the feature set as an additional input. The quality of the models was 67.06% (F-measure) and 69.49% (accuracy), and 79.85% (F-measure) and 74.16% (accuracy), respectively. These results are comparable with the best systems presented at the 2017 shared task on paraphrase detection for the Russian language (second-best F-measure, third-best accuracy). The results can be used both for implementing search over semantically close text fragments in natural language and for analysing Russian-language paraphrases from the perspective of computational linguistics.
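A hedged sketch of this general setup: sentence-pair features built from averaged word vectors plus simple quantitative features, fed to an SVM via gensim and scikit-learn. The toy data, the specific features, and the hyperparameters are assumptions made for illustration, not the paper's exact feature set or corpus.

```python
# Sketch of a feature-based paraphrase classifier: averaged word2vec vectors plus
# simple quantitative features, classified with an SVM. Toy data, illustrative only.
import numpy as np
from gensim.models import Word2Vec
from sklearn.svm import SVC

pairs = [(("кошка", "сидит", "на", "крыше"), ("кот", "сидит", "на", "крыше"), 1),
         (("кошка", "сидит", "на", "крыше"), ("поезд", "прибыл", "вовремя"), 0)]

sentences = [list(s) for a, b, _ in pairs for s in (a, b)]
w2v = Word2Vec(sentences, vector_size=50, min_count=1, epochs=50)

def sent_vec(tokens):
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(50)

def features(a, b):
    va, vb = sent_vec(a), sent_vec(b)
    overlap = len(set(a) & set(b)) / max(len(set(a) | set(b)), 1)  # quantitative feature
    return np.concatenate([np.abs(va - vb), [overlap, abs(len(a) - len(b))]])

X = np.array([features(a, b) for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X))
```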
APA, Harvard, Vancouver, ISO, and other styles
10

박에스더, 임해창, 김민정, and 이형규. "Pivot Discrimination Approach for Paraphrase Extraction from Bilingual Corpus." Korean Journal of Cognitive Science 22, no. 1 (March 2011): 57–78. http://dx.doi.org/10.19066/cogsci.2011.22.1.004.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Mahmoud, Adnen, and Mounir Zrigui. "Distributional Semantic Model Based on Convolutional Neural Network for Arabic Textual Similarity." International Journal of Cognitive Informatics and Natural Intelligence 14, no. 1 (January 2020): 35–50. http://dx.doi.org/10.4018/ijcini.2020010103.

Full text
Abstract:
The problem addressed is to develop a model that can reliably identify whether a previously unseen document pair is a paraphrase or not. Paraphrase detection in Arabic documents is challenging because of the language's variability and the lack of publicly available corpora. Faced with these problems, the authors propose a semantic approach. At the feature extraction level, they use a global-vectors representation combining global co-occurrence counting with a contextual skip-gram model. At the paraphrase identification level, they apply a convolutional neural network model to learn richer contextual and semantic information between documents. For experiments, the authors use the Open Source Arabic Corpora as a source corpus and collect different datasets to create a vocabulary model. For the paraphrased-corpus construction, they replace each word in the source corpus with its most similar word of the same grammatical class, using the word2vec algorithm and part-of-speech annotation. Experiments show that the model achieves promising results in terms of precision and recall compared with existing approaches in the literature.
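The paraphrased-corpus construction step lends itself to a short sketch: each word is replaced by its most similar embedding neighbour that shares the same part-of-speech tag. The toy English corpus and the tiny POS lexicon below stand in for the Arabic corpus and a real tagger; they are assumptions for illustration only.

```python
# Sketch of the paraphrased-corpus construction step described above: replace each
# word with its most word2vec-similar neighbour sharing the same POS tag.
# TOY_POS is a stand-in for a real POS tagger; the corpus is a toy assumption.
from gensim.models import Word2Vec

corpus = [["the", "student", "reads", "the", "book"],
          ["the", "pupil", "reads", "the", "novel"],
          ["the", "student", "writes", "the", "essay"]]
TOY_POS = {"student": "NOUN", "pupil": "NOUN", "book": "NOUN", "novel": "NOUN",
           "essay": "NOUN", "reads": "VERB", "writes": "VERB", "the": "DET"}

w2v = Word2Vec(corpus, vector_size=20, min_count=1, window=2, epochs=200)

def paraphrase_sentence(tokens, topn=5):
    out = []
    for token in tokens:
        replacement = token
        for candidate, _ in w2v.wv.most_similar(token, topn=topn):
            # Keep only a neighbour with the same (toy) POS tag.
            if TOY_POS.get(candidate) == TOY_POS.get(token):
                replacement = candidate
                break
        out.append(replacement)
    return out

print(paraphrase_sentence(["the", "student", "reads", "the", "book"]))
```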
APA, Harvard, Vancouver, ISO, and other styles
12

Choi, Sung-Pil, and Sung-Hyon Myaeng. "Terminological paraphrase extraction from scientific literature based on predicate argument tuples." Journal of Information Science 38, no. 6 (August 31, 2012): 593–611. http://dx.doi.org/10.1177/0165551512459920.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Aguilar, Jose, Camilo Salazar, Henry Velasco, Julian Monsalve-Pulido, and Edwin Montoya. "Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents." Computation 8, no. 2 (April 15, 2020): 30. http://dx.doi.org/10.3390/computation8020030.

Full text
Abstract:
This paper analyses the capabilities of different techniques for building a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and these semantic representations can be obtained from different LOM fields, such as the title and description, in order to extract the features/characteristics of the digital resources. The feature extraction methods used in this paper are Best Matching 25 (BM25), Latent Semantic Analysis (LSA), Doc2Vec, and Latent Dirichlet Allocation (LDA). The features/descriptors they generate are tested on three types of educational digital resources (scientific publications, learning objects, patents) and a paraphrase corpus, and in two use cases: an information retrieval context and an educational recommendation system. For this analysis, unsupervised metrics (two similarity functions and entropy) are used to assess the quality of the features produced by each method. In addition, the paper presents tests of the techniques for the classification of paraphrases. The experiments show that, depending on the type of content and the metric, the performance of the feature extraction methods differs considerably; some methods are better in some cases, and the ranking is reversed in others.
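As an illustration of how such document features can be produced, the sketch below derives LSA (via LSI), LDA, and Doc2Vec representations with gensim on a toy corpus; BM25 is omitted. The corpus, dimensions, and hyperparameters are assumptions, not the paper's configuration.

```python
# Sketch of extracting document features with three of the methods compared above
# (LSA via LSI, LDA, Doc2Vec), using gensim; toy corpus and dimensions.
from gensim.corpora import Dictionary
from gensim.models import LsiModel, LdaModel, TfidfModel
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [["learning", "object", "metadata", "standard"],
        ["feature", "extraction", "from", "educational", "contents"],
        ["paraphrase", "corpus", "for", "semantic", "similarity"]]

dictionary = Dictionary(docs)
bow = [dictionary.doc2bow(d) for d in docs]

tfidf = TfidfModel(bow)
lsa = LsiModel(tfidf[bow], id2word=dictionary, num_topics=2)       # LSA features
lda = LdaModel(bow, id2word=dictionary, num_topics=2, passes=10)   # LDA features
d2v = Doc2Vec([TaggedDocument(d, [i]) for i, d in enumerate(docs)],
              vector_size=16, min_count=1, epochs=50)               # Doc2Vec features

print(lsa[tfidf[bow[0]]])            # LSA topic weights for doc 0
print(lda.get_document_topics(bow[0]))
print(d2v.dv[0])                     # Doc2Vec embedding of doc 0
```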
APA, Harvard, Vancouver, ISO, and other styles
14

DORR, BONNIE J., REBECCA J. PASSONNEAU, DAVID FARWELL, REBECCA GREEN, NIZAR HABASH, STEPHEN HELMREICH, EDUARD HOVY, et al. "Interlingual annotation of parallel text corpora: a new framework for annotation and evaluation." Natural Language Engineering 16, no. 3 (June 15, 2010): 197–243. http://dx.doi.org/10.1017/s1351324910000070.

Full text
Abstract:
This paper focuses on an important step in the creation of a system of meaning representation and the development of semantically annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to annotate multiple translations of foreign-language texts with interlingual content. Three levels of representation are introduced: deep syntactic dependencies (IL0), intermediate semantic representations (IL1), and a normalized representation that unifies conversives, nonliteral language, and paraphrase (IL2). The resulting annotated, multilingually induced, parallel corpora will be useful as an empirical basis for a wide range of research, including the development and evaluation of interlingual NLP systems and paraphrase-extraction systems as well as a host of other research and development efforts in theoretical and applied linguistics, foreign language pedagogy, translation studies, and other related disciplines.
APA, Harvard, Vancouver, ISO, and other styles
15

Xu, Wei, Alan Ritter, Chris Callison-Burch, William B. Dolan, and Yangfeng Ji. "Extracting Lexically Divergent Paraphrases from Twitter." Transactions of the Association for Computational Linguistics 2 (December 2014): 435–48. http://dx.doi.org/10.1162/tacl_a_00194.

Full text
Abstract:
We present MultiP (Multi-instance Learning Paraphrase Model), a new model suited to identifying paraphrases within the short messages on Twitter. We jointly model paraphrase relations between word and sentence pairs and assume only sentence-level annotations during learning. Using this principled latent variable model alone, we achieve performance competitive with a state-of-the-art method which combines a latent space model with a feature-based supervised classifier. Our model also captures lexically divergent paraphrases that differ from yet complement previous methods; combining our model with previous work significantly outperforms the state-of-the-art. In addition, we present a novel annotation methodology that has allowed us to crowdsource a paraphrase corpus from Twitter. We make this new dataset available to the research community.
APA, Harvard, Vancouver, ISO, and other styles
16

Diedrichsen, Elke. "Linguistic challenges in automatic summarization technology." Journal of Computer-Assisted Linguistic Research 1, no. 1 (June 26, 2017): 40. http://dx.doi.org/10.4995/jclr.2017.7787.

Full text
Abstract:
Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one document or a multiplicity of documents that will retain the sense and the most important aspects while reducing the length considerably, to a size that may be user-defined. One differentiates between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse or paraphrase sections of the source document. As of today, most summarization systems are extractive. Automatic document summarization technology presents interesting challenges for Natural Language Processing. It works on the basis of coreference resolution, discourse analysis, named entity recognition (NER), information extraction (IE), natural language understanding, topic segmentation and recognition, word segmentation and part-of-speech tagging. This study will overview some current approaches to the implementation of auto summarization technology and discuss the state of the art of the most important NLP tasks involved in them. We will pay particular attention to current methods of sentence extraction and compression for single and multi-document summarization, as these applications are based on theories of syntax and discourse and their implementation therefore requires a solid background in linguistics. Summarization technologies are also used for image collection summarization and video summarization, but the scope of this paper will be limited to document summarization.
APA, Harvard, Vancouver, ISO, and other styles
17

DIAS, GAËL, RUMEN MORALIYSKI, JOÃO CORDEIRO, ANTOINE DOUCET, and HELENA AHONEN-MYKA. "Automatic discovery of word semantic relations using paraphrase alignment and distributional lexical semantics analysis." Natural Language Engineering 16, no. 4 (October 2010): 439–67. http://dx.doi.org/10.1017/s135132491000015x.

Full text
Abstract:
Thesauri, which list the most salient semantic relations between words, have mostly been compiled manually. Therefore, the inclusion of an entry depends on the subjective decision of the lexicographer. As a consequence, those resources are usually incomplete. In this paper, we propose an unsupervised methodology to automatically discover pairs of semantically related words by highlighting their local environment and evaluating their semantic similarity in local and global semantic spaces. This proposal differs from all other research presented so far as it tries to take the best of two different methodologies, i.e. semantic space models and information extraction models. In particular, it can be applied to extract close semantic relations, it limits the search space to a few highly probable options, and it is unsupervised.
APA, Harvard, Vancouver, ISO, and other styles
18

Gates, Kelly. "Policing as Digital Platform." Surveillance & Society 17, no. 1/2 (March 31, 2019): 63–68. http://dx.doi.org/10.24908/ss.v17i1/2.12940.

Full text
Abstract:
Much of the discussion about platforms and “platform capitalism” centers on commercial platform companies like Google, Facebook, Amazon, and Apple. Shoshana Zuboff’s (2015) analysis of “surveillance capitalism” similarly focuses on Google as the trailblazer pushing the new logic of accumulation that is focused on data extraction and analysis of human activities. In his typology of platform companies, Nick Srnicek (2017) includes less visible industrial platforms that situate themselves as intermediaries between companies rather than between companies and consumer-users. In this article, the focus is a platform-building effort that looks something like an industrial platform but differs in the sense that the company in question, Axon Enterprise, aims to situate itself as an intermediary within and among law enforcement agencies (non-market entities) as a means of building a large-scale data-extractive system of monetization. Axon’s business strategy is emblematic of the ways that police evidence and record-keeping systems are being reimagined, and to some extent reconfigured, as sources of data extraction and analytics on the model of the platform. Whether Axon succeeds or is eclipsed by a competitor like Palantir or even Amazon or Microsoft, the process of reimagining and reorganizing policing as a platform is underway—a process that, to paraphrase Zuboff, deeply imbricates public and private surveillance activities, dissolving the boundary between public and private authority in the surveillance project.
APA, Harvard, Vancouver, ISO, and other styles
19

Madnani, Nitin, and Bonnie J. Dorr. "Generating Phrasal and Sentential Paraphrases: A Survey of Data-Driven Methods." Computational Linguistics 36, no. 3 (September 2010): 341–87. http://dx.doi.org/10.1162/coli_a_00002.

Full text
Abstract:
The task of paraphrasing is inherently familiar to speakers of all languages. Moreover, the task of automatically generating or extracting semantic equivalences for the various units of language—words, phrases, and sentences—is an important part of natural language processing (NLP) and is being increasingly employed to improve the performance of several NLP applications. In this article, we attempt to conduct a comprehensive and application-independent survey of data-driven phrasal and sentential paraphrase generation methods, while also conveying an appreciation for the importance and potential use of paraphrases in the field of NLP research. Recent work done in manual and automatic construction of paraphrase corpora is also examined. We also discuss the strategies used for evaluating paraphrase generation techniques and briefly explore some future trends in paraphrase generation.
APA, Harvard, Vancouver, ISO, and other styles
20

Taghizadeh, Nasrin, and Heshaam Faili. "Cross-lingual Adaptation Using Universal Dependencies." ACM Transactions on Asian and Low-Resource Language Information Processing 20, no. 4 (May 26, 2021): 1–23. http://dx.doi.org/10.1145/3448251.

Full text
Abstract:
We describe a cross-lingual adaptation method based on syntactic parse trees obtained from the Universal Dependencies (UD), which are consistent across languages, to develop classifiers in low-resource languages. The idea of UD parsing is to capture similarities as well as idiosyncrasies among typologically different languages. In this article, we show that models trained using UD parse trees for complex NLP tasks can characterize very different languages. We study two tasks of paraphrase identification and relation extraction as case studies. Based on UD parse trees, we develop several models using tree kernels and show that these models trained on the English dataset can correctly classify data of other languages, e.g., French, Farsi, and Arabic. The proposed approach opens up avenues for exploiting UD parsing in solving similar cross-lingual tasks, which is very useful for languages for which no labeled data is available.
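A small sketch of the underlying idea: Universal Dependencies structure as a language-agnostic feature space. It extracts (head, relation, dependent) triples with stanza and compares two sentences by triple overlap, a crude stand-in for the tree kernels used in the paper; the pipeline configuration and the overlap score are assumptions for illustration.

```python
# Sketch: extract Universal Dependencies triples with stanza and compare two
# sentences by triple overlap; a crude stand-in for tree-kernel similarity,
# meant only to show how UD structure yields language-agnostic features.
import stanza  # requires: stanza.download('en') once

nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,lemma,depparse")

def ud_triples(text):
    doc = nlp(text)
    triples = set()
    for sent in doc.sentences:
        for word in sent.words:
            head = sent.words[word.head - 1].lemma if word.head > 0 else "ROOT"
            triples.add((head, word.deprel, word.lemma))
    return triples

def overlap(a, b):
    ta, tb = ud_triples(a), ud_triples(b)
    return len(ta & tb) / max(len(ta | tb), 1)

print(overlap("The cat chased the mouse.", "A mouse was chased by the cat."))
```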
APA, Harvard, Vancouver, ISO, and other styles
21

Chitra, A., and Anupriya Rajkumar. "Plagiarism Detection Using Machine Learning-Based Paraphrase Recognizer." Journal of Intelligent Systems 25, no. 3 (July 1, 2016): 351–59. http://dx.doi.org/10.1515/jisys-2014-0146.

Full text
Abstract:
Plagiarism in free text has become a common occurrence due to the wide availability of voluminous information resources. Automatic plagiarism detection systems aim to identify plagiarized content present in large repositories. This task is rendered difficult by the use of sophisticated plagiarism techniques such as paraphrasing and summarization, which mask the occurrence of plagiarism. In this work, a monolingual plagiarism detection technique has been developed to tackle cases of paraphrased plagiarism. A support vector machine based paraphrase recognition system, which works by extracting lexical, syntactic, and semantic features from input text has been used. Both sentence-level and passage-level approaches have been investigated. The performance of the system has been evaluated on various corpora, and the passage level approach has registered promising results.
APA, Harvard, Vancouver, ISO, and other styles
22

ZHAO, SHIQI, HAIFENG WANG, TING LIU, and SHENG LI. "Extracting paraphrase patterns from bilingual parallel corpora." Natural Language Engineering 15, no. 4 (September 16, 2009): 503–26. http://dx.doi.org/10.1017/s1351324909990155.

Full text
Abstract:
Paraphrase patterns are semantically equivalent patterns, which are useful in both paraphrase recognition and generation. This paper presents a pivot approach for extracting paraphrase patterns from bilingual parallel corpora, whereby the paraphrase patterns in English are extracted using the patterns in another language as pivots. We make use of log-linear models for computing the paraphrase likelihood between pattern pairs and exploit feature functions based on maximum likelihood estimation (MLE), lexical weighting (LW), and monolingual word alignment (MWA). Using the presented method, we extract more than 1 million pairs of paraphrase patterns from about 2 million pairs of bilingual parallel sentences. The precision of the extracted paraphrase patterns is above 78%. Experimental results show that the presented method significantly outperforms a well-known method called discovery of inference rules from text (DIRT). Additionally, the log-linear model with the proposed feature functions is effective. The extracted paraphrase patterns are fully analyzed. In particular, we found that the extracted paraphrase patterns can be classified into five types, which are useful in multiple natural language processing (NLP) applications.
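The pivot idea can be written down in a few lines: the paraphrase score of two English patterns is obtained by summing over foreign pivot phrases, roughly p(e2|e1) ≈ Σ_f p(e2|f)·p(f|e1). The toy probability tables below are invented, and the paper's full log-linear model with MLE, LW, and MWA feature functions is not reproduced.

```python
# Toy illustration of the pivot idea: score an English paraphrase pair by summing
# over foreign pivot phrases, p(e2|e1) ~ sum_f p(e2|f) * p(f|e1). The probabilities
# below are invented for illustration only.
p_f_given_e = {  # English pattern -> pivot phrase translation probabilities
    "X solves Y": {"X löst Y": 0.7, "X behebt Y": 0.3},
    "X deals with Y": {"X löst Y": 0.2, "X behandelt Y": 0.8},
}
p_e_given_f = {  # pivot phrase -> English pattern translation probabilities
    "X löst Y": {"X solves Y": 0.6, "X deals with Y": 0.4},
    "X behebt Y": {"X solves Y": 0.9, "X fixes Y": 0.1},
    "X behandelt Y": {"X deals with Y": 0.7, "X treats Y": 0.3},
}

def pivot_score(e1, e2):
    return sum(p_f * p_e_given_f.get(f, {}).get(e2, 0.0)
               for f, p_f in p_f_given_e.get(e1, {}).items())

print(pivot_score("X solves Y", "X deals with Y"))  # 0.7*0.4 + 0.3*0.0 = 0.28
```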
APA, Harvard, Vancouver, ISO, and other styles
23

Anchiêta, Rafael T., Rogério F. de Sousa, and Thiago A. S. Pardo. "Modeling the Paraphrase Detection Task over a Heterogeneous Graph Network with Data Augmentation." Information 11, no. 9 (September 1, 2020): 422. http://dx.doi.org/10.3390/info11090422.

Full text
Abstract:
Paraphrase detection is a Natural-Language Processing (NLP) task that aims at automatically identifying whether two sentences convey the same meaning (even with different words). For the Portuguese language, most of the works model this task as a machine-learning solution, extracting features and training a classifier. In this paper, following a different line, we explore a graph structure representation and model the paraphrase identification task over a heterogeneous network. We also adopt a back-translation strategy for data augmentation to balance the dataset we use. Our approach, although simple, outperforms the best results reported for the paraphrase detection task in Portuguese, showing that graph structures may better capture the semantic relatedness among sentences.
APA, Harvard, Vancouver, ISO, and other styles
24

Ho, ChukFong, Masrah Azrifah Azmi Murad, Shyamala Doraisamy, and Rabiah Abdul Kadir. "Extracting lexical and phrasal paraphrases: a review of the literature." Artificial Intelligence Review 42, no. 4 (October 10, 2012): 851–94. http://dx.doi.org/10.1007/s10462-012-9357-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Keshtkar, Fazel, and Diana Inkpen. "A BOOTSTRAPPING METHOD FOR EXTRACTING PARAPHRASES OF EMOTION EXPRESSIONS FROM TEXTS." Computational Intelligence 29, no. 3 (September 4, 2012): 417–35. http://dx.doi.org/10.1111/j.1467-8640.2012.00458.x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Parikh, Soham, Quaizar Vohra, and Mitul Tiwari. "Automated Utterance Generation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 08 (April 3, 2020): 13344–49. http://dx.doi.org/10.1609/aaai.v34i08.7047.

Full text
Abstract:
Conversational AI assistants are becoming popular and question-answering is an important part of any conversational assistant. Using relevant utterances as features in question-answering has been shown to improve both the precision and recall of retrieving the right answer by a conversational assistant. Hence, utterance generation has become an important problem, with the goal of generating relevant utterances (sentences or phrases) from a knowledge base article that consists of a title and a description. However, generating good utterances usually requires a lot of manual effort, creating the need for automated utterance generation. In this paper, we propose an utterance generation system which 1) uses extractive summarization to extract important sentences from the description, 2) uses multiple paraphrasing techniques to generate a diverse set of paraphrases of the title and summary sentences, and 3) selects good candidate paraphrases with the help of a novel candidate selection algorithm.
APA, Harvard, Vancouver, ISO, and other styles
27

K, Manjula, and M. B. Anandaraju. "A comparative study on feature extraction and classification of mind waves for brain computer interface (BCI)." International Journal of Engineering & Technology 7, no. 1.9 (March 1, 2018): 132. http://dx.doi.org/10.14419/ijet.v7i1.9.9749.

Full text
Abstract:
Brain Computer Interfacing (BCI) is a methodology that provides a communication path between the brain and the external world, using brain signals interpreted by a computer. BCI identifies specific patterns in a person's changing brain activity and uses them to initiate control that reflects the person's intention. The BCI system paraphrases these signal patterns into meaningful control commands. Numerous signal processing algorithms have been proposed for evolving BCI systems. Non-invasive electroencephalogram (EEG) signals, or mind waves, are used to extract distinguishing features, which are then classified with an appropriate classifier. This study reviews the different feature extraction and classification algorithms used in EEG-based BCI research and identifies their distinct properties. The paper presents different methodologies for feature extraction and feature classification and addresses the methods and technology adopted in every phase of EEG signal processing. This comparative survey also helps in selecting a suitable algorithm for the development and accomplishment of further signal classification.
APA, Harvard, Vancouver, ISO, and other styles
28

SZPEKTOR, IDAN, HRISTO TANEV, IDO DAGAN, BONAVENTURA COPPOLA, and MILEN KOUYLEKOV. "Unsupervised acquisition of entailment relations from the Web." Natural Language Engineering 21, no. 1 (July 30, 2013): 3–47. http://dx.doi.org/10.1017/s1351324913000156.

Full text
Abstract:
Entailment recognition is a primary generic task in natural language inference, whose focus is to detect whether the meaning of one expression can be inferred from the meaning of the other. Accordingly, many NLP applications would benefit from high coverage knowledgebases of paraphrases and entailment rules. To this end, learning such knowledgebases from the Web is especially appealing due to its huge size as well as its highly heterogeneous content, allowing for a more scalable rule extraction of various domains. However, the scalability of state-of-the-art entailment rule acquisition approaches from the Web is still limited. We present a fully unsupervised learning algorithm for Web-based extraction of entailment relations. We focus on increased scalability and generality with respect to prior work, with the potential of a large-scale Web-based knowledgebase. Our algorithm takes as its input a lexical–syntactic template and searches the Web for syntactic templates that participate in an entailment relation with the input template. Experiments show promising results, achieving performance similar to a state-of-the-art unsupervised algorithm, operating over an offline corpus, but with the benefit of learning rules for different domains with no additional effort.
APA, Harvard, Vancouver, ISO, and other styles
29

DAGAN, IDO, BILL DOLAN, BERNARDO MAGNINI, and DAN ROTH. "Recognizing textual entailment: Rational, evaluation and approaches." Natural Language Engineering 15, no. 4 (October 2009): i—xvii. http://dx.doi.org/10.1017/s1351324909990209.

Full text
Abstract:
The goal of identifying textual entailment – whether one piece of text can be plausibly inferred from another – has emerged in recent years as a generic core problem in natural language understanding. Work in this area has been largely driven by the PASCAL Recognizing Textual Entailment (RTE) challenges, which are a series of annual competitive meetings. The current work exhibits strong ties to some earlier lines of research, particularly automatic acquisition of paraphrases and lexical semantic relationships and unsupervised inference in applications such as question answering, information extraction and summarization. It has also opened the way to newer lines of research on more involved inference methods, on knowledge representations needed to support this natural language understanding challenge and on the use of learning methods in this context. RTE has fostered an active and growing community of researchers focused on the problem of applied entailment. This special issue of the JNLE provides an opportunity to showcase some of the most important work in this emerging area.
APA, Harvard, Vancouver, ISO, and other styles
30

Ahmed, Mahtab, and Robert E. Mercer. "Modelling Sentence Pairs via Reinforcement Learning: An Actor-Critic Approach to Learn the Irrelevant Words." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7358–66. http://dx.doi.org/10.1609/aaai.v34i05.6230.

Full text
Abstract:
Learning sentence representation is a fundamental task in Natural Language Processing. Most of the existing sentence pair modelling architectures focus only on extracting and using the rich sentence pair features. The drawback of utilizing all of these features makes the learning process much harder. In this study, we propose a reinforcement learning (RL) method to learn a sentence pair representation when performing tasks like semantic similarity, paraphrase identification, and question-answer pair modelling. We formulate this learning problem as a sequential decision making task where the decision made in the current state will have a strong impact on the following decisions. We address this decision making with a policy gradient RL method which chooses the irrelevant words to delete by looking at the sub-optimal representation of the sentences being compared. With this policy, extensive experiments show that our model achieves on par performance when learning task-specific representations of sentence pairs without needing any further knowledge like parse trees. We suggest that the simplicity of each task inference provided by our RL model makes it easier to explain.
APA, Harvard, Vancouver, ISO, and other styles
31

Shuvalov, Petr. "Die Blonden des 11. Buches des Pseudo-Maurikios." Amsterdamer Beiträge zur älteren Germanistik 80, no. 1-2 (August 12, 2020): 108–33. http://dx.doi.org/10.1163/18756719-12340182.

Full text
Abstract:
This analysis of the text of Pseudo-Maurice’s Strategikon ch. xi,3, discussing the “light-haired peoples,” is based on a new investigation of the MSS via online photocopies, and shows that the text contains many inner citations and paraphrases as well as some traces of redactions prior to the archetype (i.e. the common ancestor of the MSS). The analysis of the punctuation allows us to propose the hypothesis that the cola in Leo’s Problemata directly reflect the system of punctuation in the hyparchetype α (i.e. the ancestor of β, which is the progenitor of the main MSS). The text’s development before the first split of the tradition between MSS families can be separated into the following phases for ch. xi,3: (1) Xanth (the Urtext of the chapter), (2) Kairos (many interpolations and possible extraction of the text of Xanth, including the first part of the title), and (3) Abar (some additional interpolations, including the names of the Franks and Lombards). The blonds (xantha ethne) of the first phase are neither Franks nor Lombards. More likely they are different gentes of the Middle Danube between the time of Attila and the appearance of the Avars, such as Ostrogoths, Gepids, Heruls, etc.
APA, Harvard, Vancouver, ISO, and other styles
32

Guan, Xiaohan, Jianhui Han, Zhi Liu, and Mengmeng Zhang. "Sentence Similarity Algorithm Based on Fused Bi-Channel Dependency Matching Feature." International Journal of Pattern Recognition and Artificial Intelligence 34, no. 07 (October 18, 2019): 2050019. http://dx.doi.org/10.1142/s0218001420500196.

Full text
Abstract:
Many natural language processing tasks, such as information retrieval, intelligent question answering, and machine translation, require the calculation of sentence similarity. Traditional calculation methods do not handle semantic understanding well. First, model structures based on a Siamese architecture lack interaction between the sentences; second, matching-based models suffer from missing position information and use only partial matching factors. In this paper, a combination of words and word dependencies is proposed for calculating sentence similarity. This combination can extract word features and word-dependency features. To extract more matching features, a bi-directional multi-interaction matching sequence model is proposed using word2vec and dependency2vec. This model obtains matching features by convolving and pooling the word-granularity (word vector, dependency vector) interaction sequences in two directions. Next, the model aggregates the bi-directional matching features. The paper evaluates the model on two tasks: paraphrase identification and natural language inference. The experimental results show that combining words and word dependencies enhances the ability to extract matching features between two sentences. The results also show that the model with dependency information achieves higher accuracy than models without it.
APA, Harvard, Vancouver, ISO, and other styles
33

Gupta, Dhruv. "Search and analytics using semantic annotations." ACM SIGIR Forum 53, no. 2 (December 2019): 100–101. http://dx.doi.org/10.1145/3458553.3458567.

Full text
Abstract:
Current information retrieval systems are limited to text in documents for helping users with their information needs. With the progress in the field of natural language processing, there now exists the possibility of enriching large document collections with accurate semantic annotations. Annotations in the form of part-of-speech tags, temporal expressions, numerical values, geographic locations, and other named entities can help us look at terms in text with additional semantics. This doctoral dissertation presents methods for search and analysis of large semantically annotated document collections. Concretely, we make contributions along three broad directions: indexing, querying, and mining of large semantically annotated document collections. Indexing Annotated Document Collections. Knowledge-centric tasks such as information extraction, question answering, and relationship extraction require a user to retrieve text regions within documents that detail relationships between entities. Current search systems are ill-equipped to handle such tasks, as they can only provide phrase querying with Boolean operators. To enable knowledge acquisition at scale, we propose gyani, an indexing infrastructure for knowledge-centric tasks. gyani enables search for structured query patterns by allowing regular expression operators to be expressed between word sequences and semantic annotations. To implement grep-like search capabilities over large annotated document collections, we present a data model and index design choices involving word sequences, annotations, and their combinations. We show that by using our proposed indexing infrastructure we bring about drastic speedups in crucial knowledge-centric tasks: 95× in information extraction, 53× in question answering, and 12× in relationship extraction. Hyper-phrase queries are multi-phrase set queries that naturally arise when attempting to spot knowledge graph facts or subgraphs in large document collections. An example hyper-phrase query for the fact 〈mahatma gandhi, nominated for, nobel peace prize〉 is: 〈{ mahatma gandhi, m k gandhi, gandhi }, { nominated, nominee, nomination received }, { nobel peace prize, nobel prize for peace, nobel prize in peace }〉. Efficient execution of hyper-phrase queries is of essence when attempting to verify and validate claims concerning named entities or emerging named entities. To do so, it is required that the fact concerning the entity can be contextualized in text. To acquire text regions given a hyper-phrase query, we propose a retrieval framework using combinations of n-gram and skip-gram indexes. Concretely, we model the combinatorial space of the phrases in the hyper-phrase query to be retrieved using vertical and horizontal operators and propose a dynamic programming approach for optimized query processing. We show that using our proposed optimizations we can retrieve sentences in support of knowledge graph facts and subgraphs from large document collections within seconds. Querying Annotated Document Collections. Users often struggle to convey their information needs in short keyword queries. This often results in a series of query reformulations, in an attempt to find relevant documents. To assist users navigate large document collections and lead them to their information needs with ease, we propose methods that leverage semantic annotations. As a first step, we focus on temporal information needs. 
Specifically, we leverage temporal expressions in large document collections to serve time-sensitive queries better. Time-sensitive queries, e.g., summer olympics implicitly carry a temporal dimension for document retrieval. To help users explore longitudinal document collections, we propose a method that generates time intervals of interest as query reformulations. For instance, for the query world war , time intervals of interest are: [1914; 1918] and [1939;1945]. The generated time intervals are immediately useful in search-related tasks such as temporal query classification and temporal diversification of documents. As a second and final step, we focus on helping the user in navigating large document collections by generating semantic aspects. The aspects are generated using semantic annotations in the form of temporal expressions, geographic locations, and other named entities. Concretely, we propose the xFactor algorithm that generates semantic aspects in two steps. In the first step, xFactor computes the salience of annotations in models informed of their semantics. Thus, the temporal expressions 1930s and 1939 are considered similar as well as entities such as usain bolt and justin gatlin are considered related when computing their salience. Second, the xFactor algorithm computes the co-occurrence salience of annotations belonging to different types by using an efficient partitioning procedure. For instance, the aspect 〈{usain bolt}, {beijing, London}, [2008;2012]〉 signifies that the entity, locations, and the time interval are observed frequently in isolation as well as together in the documents retrieved for the query olympic medalists. Mining Annotated Document Collections. Large annotated document collections are a treasure trove of historical information concerning events and entities. In this regard, we first present EventMiner, a clustering algorithm, that mines events for keyword queries by using annotations in the form of temporal expressions, geographic locations, and other disambiguated named entities present in a pseudo-relevant set of documents. EventMiner aggregates the annotation evidences by mathematically modeling their semantics. Temporal expressions are modeled in an uncertainty and proximity-aware time model. Geographic locations are modeled as minimum bounding rectangles over their geographic co-ordinates. Other disambiguated named entities are modeled as a set of links corresponding to their Wikipedia articles. For a set of history-oriented queries concerning entities and events, we show that our approach can truly identify event clusters when compared to approaches that disregard annotation semantics. Second and finally, we present jigsaw, an end-to-end query-driven system that generates structured tables for user-defined schema from unstructured text. To define the table schema, we describe query operators that help perform structured search on annotated text and fill in table cell values. To resolve table cell values whose values can not be retrieved, we describe methods for inferring null values using local context. jigsaw further relies on semantic models for text and numbers to link together near-duplicate rows. This way, jigsaw is able to piece together paraphrased, partial, and redundant text regions retrieved in response to structured queries to generate high-quality tables within seconds. This doctoral dissertation was supervised by Klaus Berberich at the Max Planck Institute for Informatics and htw saar in Saarbrücken, Germany. 
This thesis is available online at: https://people.mpi-inf.mpg.de/~dhgupta/pub/dhruv-thesis.pdf.
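As a toy illustration of the hyper-phrase query idea described above, the sketch below retrieves sentences that contain at least one phrase from each phrase set of the query, using a plain in-memory index as a stand-in for the n-gram/skip-gram index design of the dissertation. Sentences and phrases are invented for illustration.

```python
# Toy sketch of a hyper-phrase query: retrieve sentences that contain at least one
# phrase from *each* phrase set of the query. A plain in-memory index standing in
# for the n-gram/skip-gram index design described above.
from collections import defaultdict

sentences = [
    "mahatma gandhi was nominated for the nobel peace prize five times",
    "the nobel prize in peace was awarded in oslo",
    "m k gandhi received a nomination for the nobel prize for peace",
]

index = defaultdict(set)  # phrase -> ids of sentences containing it

def add_phrase(phrase):
    for i, s in enumerate(sentences):
        if phrase in s:
            index[phrase].add(i)

query = [{"mahatma gandhi", "m k gandhi", "gandhi"},
         {"nominated", "nomination"},
         {"nobel peace prize", "nobel prize for peace"}]

for phrase_set in query:
    for phrase in phrase_set:
        add_phrase(phrase)

hits = set(range(len(sentences)))
for phrase_set in query:
    # A sentence survives only if it matches some phrase from this set.
    hits &= set.union(set(), *(index[p] for p in phrase_set))
print(sorted(hits))  # sentences 0 and 2 support the fact
```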
APA, Harvard, Vancouver, ISO, and other styles
34

Мелик-Гайказян, Ирина Вигеновна. "BIOETHICS AND SEMIOTICS: INSTEAD OF A FOREWORD." ΠΡΑΞΗMΑ. Journal of Visual Semiotics, no. 3(29) (June 18, 2021): 9–18. http://dx.doi.org/10.23951/2312-7899-2021-3-9-18.

Full text
Abstract:
Обстоятельства помешали научному редактору номера – Елене Георгиевне Гребенщиковой – написать предисловие. В нашем молодом журнале есть уже своя традиция: научный редактор предваряет номер концептуальной преамбулой к статьям, посвященных обсуждению различных аспектов одной проблемы. Авторов данного номера объединяют исследовательские и организационные обстоятельства. Все мы были вовлечены в исследовательское поле биоэтики Борисом Григорьевичем Юдиным. Привлечение же методологических потенциалов семиотики для решения задач биоэтики произошло на «томской почве» как результат организации серий конференций. Эти серии стартовали с конференции «Антропологические основания биоэтики», которая, по словам Б.Г. Юдина, была первой в России научной конференцией «по биоэтике». Основные доклады этой научной встречи составили содержание номера журнала «Бюллетень сибирской медицины». Среди его публикаций были две статьи, определившее дальнейшее, – «Чтоб сказку сделать былью? (Конструирование человека)» [Юдин 2006] и «Тело страдания: философско-антропологическое истолкование» [Тищенко 2006]. Слова «Конструирование человека» дали случайным образом название ряду конференций, проводимых Томским государственным педагогическим университетом. Специализация указанного университета понятным образом привлекла к участию коллег-педагогов, интерпретировавших слово «конструирование» в качестве указания на суть образования. Было любопытно наблюдать такую стихийно складывающуюся прагматику с учетом различий русскоязычного и англоязычного названия статьи [Юдин 2006] – словами «Чтоб сказку сделать былью?», заканчивающими вопросительным знаком цитату из песни о цели рождения советского человека, и «To make a dream true?», где под вопрос ставилась возможность мечтой-иллюзией заменить действительность. Поскольку организаторы конференции испытывают некоторую зависимость от тематики заявляемых докладов, для «чистой» биоэтики была создана отдельная секция «Тело и власть», название которой стало парафразом другой упомянутой статьи [Тищенко 2006]. Параллельно с серией конференций «Конструирование человека» был организован ряд конференций «Системы и модели: границы интерпретаций», акцентировавших не столько биоэтику, сколько постнеклассическую парадигму семиотического моделирования. Слова «системы и модели» цитировали название известной в семиотике книги А.А. Шарова и Ю.А.Шрейдера; фиксировали проблему нахождения пределов применимости математических моделей в синергетике; приглашали к обсуждению системной методологии. Слова «границы интерпретаций» изначально определяли соотношение герменевтики и семиотики в методологии исследования социокультурной динамики. Замечу, что географическое образование заставляло меня трактовать понятие «границы» исключительно в семиотическом ракурсе. Доклады, представленные на первой конференции «Системы и модели: границы интерпретаций», составили номер журнала «Вестник Томского государственного педагогического университета» (2008, № 1), из содержания которого ясно, что эта научная встреча была посвящена обсуждению наследия Эрика Григорьевича Юдина и современному прочтению его версии системного подхода. В этом номере есть много заслуживающих внимания материалов, но для определения области пересечения биоэтики и семиотики, необходимо назвать статью Р.Г. Апресяна [Апресян 2008], хотя в самой статье отсутствует прямое обращение и к семиотике, и к биоэтике. В статье лаконично и предметно установлены ценностные границы локусов модели «этический квадрат», созданной Р.Г. Апресяном. 
Ясность изложения [Апресян 2008] позволила разглядеть сразу несколько моментов, объединяющих и разделяющих биоэтику с педагогикой. Настороженность биоэтики вызывает любое воззвание превзойти норму, а любая педагогика взывает к тому, чтобы превзойти норму. Общность биоэтики и педагогики составляет распределение символики ролей для всех своих субъектов, подчиненное коммуникативным форматам, устанавливающих пределы допустимого/отвергаемого внутри коммуникативных ареалов. Выяснение этой общности позволило увидеть, что модели биоэтики, по сути, формируют синтаксис коммуникативных ролей субъектов биомедицины. Причем, фиксируют эти роли или в индексах, или в иконических знаках, или даже в символах. Генезис биоэтики есть ответ на социальный запрос в необходимости «сторожа» (в смысле: «…пойди, поставь сторожа; пусть он сказывает, что увидит») для предотвращения моральных катаклизмов, вызываемых темпом прогресса (в чем бы он не выражался) и соблазнами прогресса. Темп трансформаций порождает неопределенность социальных сценариев, а, следовательно, технику биоэтических экспертиз диктует синхрония и абдукция, что явно служит указателем релевантности концепции семиозиса, созданной Ч.С. Пирсом. Семиотическая сущность решения задач биоэтики была увидена столь четко, что удивление вызывало только одно: почему биоэтика, потенциально обладая «ключом» методов семиотики, подбирает «отмычки» для анализа спонтанно возникающих кейсов? Подчеркну, что упомянутый «ключ» открывает «двери», не столько ведущие к анализу нарративов субъектов, нормативных дискурсов и уже случившихся кейсов, сколько к прогностике кейсов в синхронии с меняющимся контекстом. Вместе с тем представленные резоны для обоснования того, что методы семиотики обладают релевантностью для решения задач биоэтики, могут встретить вопрос: зачем биоэтике применять эти методы, если она и без них прекрасно решает свои задачи? Возможным ответом на этот вопрос будет – для того чтобы осуществить точную и опережающую диагностику социокультурных трансформаций, способных вызвать и уже вызывающих модификацию «человеческого в человеке», т.е. вызывающих скольжение границ нормы. Излишне объяснять преимущества диагностики в качестве процедуры с точным результатом, способным опередить наступление необратимых состояний. Семиотические методы устремлены к достижению точности, и их применение объединяет всё гуманитарное знание. При этом семиотика сама существует в нескольких конкурирующих направлениях, и ее исследовательские методы очень отличаются в конкретных научных областях, в том числе и тех, которые составляют «computer science», ответственных за тотальную «цифровизацию» и самоорганизацию «информационного общества». У перечисленных сущностей – точности, самоорганизации, семиотики, «computer science» и даже «цифровизации» – есть идейный общий знаменатель: философия процесса. Создатель философии процесса – А.Н. Уайтхед – видел ее результат в оригинальной концепции символизма, обладающей сущностными пересечениями с концепцией семиозиса Ч.С. Пирса. Если концепция семиозиса раскрывала микропроцессы, обеспечивающие самопроизвольный «рост символов», то концепция А.Н. Уайтхеда устанавливала направления воздействий этого «роста» и основной оператор воздействия: навык организаций «революций в символизме». А также критерии, по которым можно диагностировать событие «перекодирования» символа, т.е. отличить его от того, что событийным рождением нового символизма не является. Т.е. диагностировать дистанцию, отделяющую «событие-в-действительности» от «события-в-реальности». 
А, следовательно, диагностировать генезисы и цели идейных направлений «конструирования человека». Или диагностировать «семиотические аттракторы», завершающие фазовые переходы в конкуренции сценариев социокультурной динамики. В социокультурных трансформациях сила аттракторов аналогична силе мечты (вспомним слово «dream» в названии статьи [Юдин 2006]). Мечты, имеющей две стороны, – миф и утопию. Биоэтика как «сторож» социокультурных трансформаций вынуждена распознавать миф и утопию в манипуляции целями. Жизненными целями. Целями, диктующими вариативную селекцию ценностей. В феномене мечты и в его воплощениях – в мифе и в утопии – отсутствуют позитив или негатив. Всё зависит, с какой целью им придаются. Зависит от разновидности процессов: процесса рецепции символики мифа с компенсаторными целями или процесса акцептации символических асимптот с целями cамореализации [Брызгалина 2020; Бараш, Антоновский 2019; Шульман, Кутузова 2020]. Обладание представлениями о «росте символов» способно из наблюдаемых мерцаний визуального «вытянуть» то, что скрывают внешние эффекты – осуществить семиотическую диагностику на основе аналогий между симптомами и семантикой, синдромами и синтактикой, анамнезом (в сочетании с целеполаганием) и прагматикой. Такая диагностика создает область пересечения биоэтики и семиотики. Область формируют: (а) изначальные позиции биоэтики в её прагматической концентрации трансдисциплинарного знания для разрешения конкретной проблемы индивидуальности (с сочувственным принятием веера индивидуальных целей); и (б) постнеклассические преимущества семиотики, эффективно реализующие свои потенциалы в расширяющемся трансдисциплинарном поле. В завершении должна выразить благодарность Р.Г. Апресяну, П.Д. Тищенко, Б.Г.Юдину за их щедрые разъяснения существа задач биоэтики, что позволило увидеть эти задачи посредством «оптических» инструментов семиотики. И благодарность всем авторам этого номера за интерес к семиотике, проявленный на основе глубокого понимания решаемых в настоящее время биоэтикой задач и/или оригинальной постановкой этих задач. Circumstances prevented the scientific editor of the issue – Elena G. Grebenshchikova – from writing a foreword. Our young journal already has its own tradition: a scientific editor prefaces an issue with a conceptual preamble to articles discussing various aspects of one problem. The authors of this issue are united by research and organizational circumstances. We were all involved in the research field of bioethics by Boris G. Yudin. Implementation of the methodological potentials of semiotics for solving the problems of bioethics was begun on “Tomsk grounds” as a result of the organization of several conference series. These series started with the conference “Anthropological Foundations of Bioethics”, which, according to Yudin, was the first scientific conference “on bioethics” in Russia. The main reports of this scientific meeting became the content of an issue of Bulletin of Siberian Medicine journal. Among the publications in the issue, there were two articles that determined the further development – “To make a dream true? (Human engineering)” [Yudin 2006] and “The body of suffering: The philosophical and anthropological interpretation” [Tishchenko 2006]. The words “human engineering” (or “human construction”) incidentally gave a title to a number of conferences held by Tomsk State Pedagogical University. 
The specialization of this university understandably attracted the participation of fellow educators, who interpreted the word “construction” as pointing to the essence of education. It was curious to observe such spontaneously emerging pragmatism, given the differences between the Russian and English versions of the article title [Yudin 2006]: the Russian “To make a fairy tale come true?” appends a question mark to a quotation from a song about the purpose for which the Soviet person is born, whereas the English “To make a dream true?” calls into question the very possibility of replacing reality with a dream-illusion. Since conference organizers depend to some extent on the subject matter of the submitted reports, a separate section, “Body and Power”, was created for “pure” bioethics, and its title became a paraphrase of the other article mentioned above [Tishchenko 2006]. In parallel with the “Human Construction” series, a series of conferences “Systems and Models: Limits of Interpretation” was organized, which emphasized the post-nonclassical paradigm of semiotic modeling rather than bioethics itself. The words “systems and models” quoted the title of the well-known book on semiotics by A. A. Sharov and Yu. A. Shreider; pointed to the problem of finding the limits of applicability of mathematical models in synergetics; and invited discussion of systems methodology. The words “limits of interpretation” initially defined the relationship between hermeneutics and semiotics in the methodology of studying sociocultural dynamics. I note that my background in geographical education compelled me to interpret the concept of “limit”, or “border”, exclusively from a semiotic perspective. The reports presented at the first “Systems and Models: Limits of Interpretation” conference made up an issue of the Tomsk State Pedagogical University Bulletin (2008, No. 1). The contents of that issue make clear that the meeting was devoted to discussing the legacy of Eric G. Yudin and the modern interpretation of his version of the systems approach. The issue contains many noteworthy materials; however, in order to define the area where bioethics and semiotics intersect, it is necessary to single out the article by Ruben G. Apressyan [Apressyan 2008], even though the article itself contains no direct reference to either semiotics or bioethics. The article laconically and substantively establishes the value boundaries of the loci of the “ethical square” model that Apressyan created. The clarity of that presentation [Apressyan 2008] made it possible to discern at once several points that both unite and separate bioethics and pedagogy. Any appeal to surpass the norm arouses the wariness of bioethics, while every pedagogy calls precisely for surpassing the norm. What bioethics and pedagogy have in common is the distribution of the symbolism of roles among all of their subjects, subordinated to communicative formats that set the limits of what is permissible or rejected within communicative areas. Elucidating this commonality made it possible to see that the models of bioethics, in effect, form the syntax of the communicative roles of the subjects of biomedicine, and that they fix these roles in indices, in iconic signs, or even in symbols.
The genesis of bioethics is a response to a social demand for a “watchman” (in the sense: “Go, set a watchman, let him declare what he seeth”) to prevent the moral cataclysms caused by the pace of progress (in whatever form it is expressed) and by the temptations of progress. The pace of transformation gives rise to uncertainty in social scenarios; consequently, the technique of bioethical expert review is dictated by synchrony and abduction, which clearly points to the relevance of the concept of semiosis created by Charles Sanders Peirce. The semiotic essence of solving the problems of bioethics was seen so clearly that only one thing remained surprising: why does bioethics, potentially holding the “key” of semiotic methods, keep picking “lock-picks” for the analysis of spontaneously arising cases? Let me emphasize that this “key” opens “doors” leading not so much to the analysis of subjects’ narratives, normative discourses, and cases that have already occurred as to the forecasting of cases in synchrony with the changing context. At the same time, the reasons presented to substantiate the relevance of semiotic methods for the problems of bioethics may meet with the question: why should bioethics apply these methods if it solves its problems perfectly well without them? A possible answer is: in order to carry out an accurate and anticipatory diagnostics of the sociocultural transformations that can cause, and are already causing, a modification of the “human in the human being”, i.e. a sliding of the boundaries of the norm. It is unnecessary to explain the advantages of diagnostics as a procedure with an accurate result, capable of anticipating the onset of irreversible conditions. Semiotic methods strive for accuracy, and their application unites all the humanities. At the same time, semiotics itself exists in several competing currents, and its research methods differ greatly across specific scientific fields, including those that make up “computer science”, which is responsible for total “digitalization” and the self-organization of the “information society”. The entities just listed – accuracy, self-organization, semiotics, “computer science”, and even “digitalization” – share a common ideological denominator: process philosophy. The creator of process philosophy, Alfred North Whitehead, saw its outcome in an original concept of symbolism that has essential intersections with Peirce’s concept of semiosis. While the concept of semiosis revealed the microprocesses that ensure the spontaneous “growth of symbols”, Whitehead’s concept established the directions in which this “growth” exerts its impact and the main operator of that impact: the skill of organizing “revolutions in symbolism”. It also supplied the criteria by which one can diagnose the occasion of the “re-coding” of a symbol, i.e. distinguish it from what is not the eventful birth of a new symbolism; diagnose the distance separating the “occasion-in-actuality” from the “occasion-in-reality”; and, consequently, diagnose the geneses and goals of the ideological currents of “human construction”, or the “semiotic attractors” that complete phase transitions in the competition of scenarios of sociocultural dynamics. In sociocultural transformations the power of attractors is similar to the power of a dream (recall the word “dream” in the article title [Yudin 2006]), a dream that has two sides – myth and utopia.
Bioethics, as a “watchman” of sociocultural transformations, is forced to recognize myth and utopia in the manipulation of goals: life goals, goals dictating a variable selection of values. There is nothing inherently positive or negative in the phenomenon of the dream and its incarnations, myth and utopia; everything depends on the goals for which they are deployed and on the kind of process involved: the reception of the symbolism of a myth with compensatory goals, or the acceptance of symbolic asymptotes with goals of self-realization [Bryzgalina 2020; Barash, Antonovskiy 2019; Schulman, Kutuzova 2020]. A grasp of the “growth of symbols” makes it possible to extract from the observed flickers of the visual what external effects conceal – to carry out a semiotic diagnostics based on the analogies between symptoms and semantics, syndromes and syntactics, anamnesis (combined with goal-setting) and pragmatics. Such diagnostics creates an area of intersection between bioethics and semiotics. The area is formed by (a) the initial position of bioethics, with its pragmatic concentration of transdisciplinary knowledge on solving a specific problem of individuality (and its sympathetic acceptance of a set of individual goals), and (b) the post-nonclassical advantages of semiotics, which effectively realize their potential in an expanding transdisciplinary field. In conclusion, I would like to express my gratitude to Ruben G. Apressyan, Pavel D. Tishchenko, and Boris G. Yudin for their generous explanations of the essence of the tasks of bioethics, which made it possible to see these tasks through the “optical” instruments of semiotics. I also thank all the authors of this issue for their interest in semiotics, shown on the basis of a deep understanding of the problems currently being solved by bioethics and/or an original formulation of those problems.
APA, Harvard, Vancouver, ISO, and other styles
35

Yagunova, Elena, Ekaterina Pronoza, and Nataliya Kochetkova. "Construction of Paraphrase Graphs as a Means of News Clusters Extraction." Computación y Sistemas 22, no. 4 (December 31, 2018). http://dx.doi.org/10.13053/cys-22-4-3065.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Purwanto, Devi Dwi. "SINONIM DAN WORD SENSE DISAMBIGUATION UNTUK MELENGKAPI DETEKTOR PLAGIAT DOKUMEN TUGAS AKHIR." Jurnal Sistem Informasi 11, no. 1 (April 27, 2015). http://dx.doi.org/10.21609/jsi.v11i1.412.

Full text
Abstract:
Plagiarism can be categorized into several levels: carbon copy, addition of words, word substitution, changing active sentences into passive ones, and paraphrase. In this research, detection is performed only with a local similarity assessment method. The research comprises three major processes: preprocessing, candidate determination, and similarity calculation. In preprocessing, a PDF file is extracted and converted into XML; stopword removal and stemming are also performed at this stage. For candidate determination, the process uses the VSM (Vector Space Model) algorithm via Lucene.NET and then calculates the similarity values of the candidates. Candidates whose similarity values meet the threshold are processed in the third stage. The next step is detecting plagiarism at the carbon-copy level. Plagiarism at the substitution level is detected by finding synonyms with the Lesk algorithm, using WordNet as the lexical dictionary; since Lesk takes the surrounding words into account, sentence extraction is performed before the synonym search. From the experiments it is concluded that determining synonyms with WordNet and the Lesk algorithm does not appear to play a significant role in increasing the similarity value, owing to the difficulty of detecting plagiarism committed by merely substituting words. However, plagiarism at the carbon-copy level can be handled with the help of sentence matching.
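To make the synonym-checking step concrete, the fragment below is a minimal sketch of the simplified Lesk approach the abstract refers to. It is not the author's Lucene.NET implementation: it uses NLTK's WordNet interface instead of the paper's toolchain, and the function name and example sentence are invented for illustration.

```python
# Illustrative sketch (not the paper's code): simplified Lesk sense selection over
# WordNet, returning the lemmas of the best-matching sense as candidate synonyms.
# Requires: pip install nltk, plus nltk.download("wordnet").
from nltk.corpus import wordnet as wn


def lesk_synonyms(context_words, target_word):
    """Pick the WordNet sense of `target_word` whose gloss overlaps the context most,
    then return that sense's lemmas as candidate synonyms for substitution checks."""
    context = set(w.lower() for w in context_words)
    best_sense, best_overlap = None, -1
    for sense in wn.synsets(target_word):
        # Gloss = definition plus usage examples of the candidate sense.
        gloss = sense.definition().split()
        for example in sense.examples():
            gloss.extend(example.split())
        overlap = len(context & set(w.lower() for w in gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    if best_sense is None:
        return []
    return [lemma.replace("_", " ") for lemma in best_sense.lemma_names()]


# Usage: a word substitution counts as a potential paraphrase if the replacement word
# appears among the Lesk-selected synonyms of the original word in its sentence.
sentence = "The bank approved the loan application yesterday".split()
print(lesk_synonyms(sentence, "bank"))
```

In the pipeline described above, such a step would run only on candidate document pairs already retrieved by the vector-space search, not on the whole collection.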
APA, Harvard, Vancouver, ISO, and other styles
37

"An Adaptive Correlation Based Video Data Mining using Machine Learning." International Journal of Recent Technology and Engineering 8, no. 4 (November 30, 2019): 11066–72. http://dx.doi.org/10.35940/ijrte.d5437.118419.

Full text
Abstract:
With the immense growth of multimedia content for education and other purposes, the availability of video content has also increased; nevertheless, retrieving that content remains a challenge. Identifying the similarity of two videos based on their internal content depends heavily on key-frame extraction, which makes the process highly time-complex. In recent years, many research attempts have approached this problem with the intention of reducing the time complexity by various means, such as converting video to text and then analysing the similarity of the extracted texts. Needless to say, this strategy is language-dependent and has been criticised for reasons such as local-language and paraphrase dependencies. Hence, this work approaches the problem from a different angle: reducing the number of video key frames using adaptive similarity. The proposed method analyses the key frames extracted from the library content and from the query video based on various parameters and reduces them using adaptive similarity. The work also uses machine-learning and parallel-programming algorithms to reduce the time complexity further. The final outcome is a reduced-time-complexity algorithm for video-to-video content retrieval. The work demonstrates a nearly 50% reduction in key frames without loss of information, a nearly 70% reduction in time complexity, and 100% accuracy on search results.
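Since the abstract describes key-frame reduction only at a high level, the following sketch illustrates the general idea with a generic Python/OpenCV histogram-similarity filter. The sampling step, the adaptation rule, the thresholds, and the file name are assumptions made for the example; the paper's actual adaptive-similarity measure, machine-learning stage, and parallelization are not reproduced here.

```python
# Illustrative sketch (not the paper's algorithm): drop key frames that are too
# similar to the last kept frame, with a threshold that tightens as frames accumulate.
import cv2


def reduce_key_frames(video_path, base_threshold=0.90, step=10):
    """Return indices of retained frames; a frame is kept only if its histogram
    correlation with the last kept frame falls below an adaptive threshold."""
    cap = cv2.VideoCapture(video_path)
    kept, last_hist, index = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:  # coarse sampling before the similarity test
            hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                                [0, 256, 0, 256, 0, 256])
            cv2.normalize(hist, hist)
            if last_hist is None:
                kept.append(index)
                last_hist = hist
            else:
                similarity = cv2.compareHist(last_hist, hist, cv2.HISTCMP_CORREL)
                # Adaptive part (assumed rule): demand more novelty as the kept set grows.
                threshold = base_threshold - 0.01 * min(len(kept), 5)
                if similarity < threshold:
                    kept.append(index)
                    last_hist = hist
        index += 1
    cap.release()
    return kept


# Usage (hypothetical file): indices of the reduced key-frame set for later matching.
print(reduce_key_frames("lecture.mp4"))
```

A reduced key-frame set like this is what would then be compared between the library video and the query video during retrieval.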
APA, Harvard, Vancouver, ISO, and other styles
38

Nahorniak, Ivanna, and Svitlana Fedorenko. "PECULIARITIES OF THE REPRODUCTION OF REALIA IN TRANSLATIONS OF MODERN ENGLISH PROSE (ON THE EXAMPLES OF THE NOVELS «WHITE TEETH» BY Z. SMITH AND «THE SELLOUT» BY P. BEATTY)." Young Scientist 11, no. 87 (2020). http://dx.doi.org/10.32839/2304-5809/2020-11-87-24.

Full text
Abstract:
Drawing on theoretical sources, the article clarifies the ways in which realia are reproduced in translations of modern English prose. For a more detailed analysis, 400 realia were selected from the novels «White Teeth» by the modern British author Z. Smith and «The Sellout» by the modern American writer P. Beatty. The peculiarities of transferring both the substantive meaning of realia and their connotation (national and historical coloration) are analyzed, as are the factors influencing the choice of the way a realia item is reproduced. It was found that the most common way to reproduce realia is transcription, owing to the presence of a large number of geographical realia. Combined renomination took second place in terms of frequency of use in the translations of «White Teeth» and «The Sellout». This is primarily due to the large proportion of polynomial realia, the reproduction of which requires the simultaneous application of different approaches, such as transcription, calquing, addition, extraction, situational equivalent, and the assimilation method. The third most popular method in «White Teeth» turned out to be transliteration, which often served to convey the meaning of realia denoting geographical objects, and in «The Sellout» – calquing, which was used to reproduce the names of educational institutions, authorities, and various organizations. Among the least common ways of reproducing realia in the two works were hyperonymic renaming, descriptive paraphrase, situational equivalent, the assimilation method, and direct inclusion. The translation of «White Teeth» also features a small number of examples of transposition at the connotative level and of extraction. The low prevalence of the above-mentioned methods can be explained by the risk of losing the important coloration of realia. In most cases, preference was given to those methods that conveyed the meaning of realia as accurately as possible while preserving the coloration of such national and cultural units.
APA, Harvard, Vancouver, ISO, and other styles
39

Crane, Gregg. "Criticism Against Itself." American Literary History, December 31, 2020. http://dx.doi.org/10.1093/alh/ajaa037.

Full text
Abstract:
A prominent strand of literary criticism today assumes that literature as literature is not significant enough to merit critical scrutiny. Instead of attending to the features that distinguish literature from everyday expression, this criticism values literature for its closeness to and reflection of life. Different as they might appear in their subject matter and approach, Character, The Disposition of Nature, and None Like Us share this “literature-as-life” orientation. In Character, Toril Moi, Rita Felski, and Amanda Anderson remind us of the pleasures of identification, embracing the layperson’s native inclination to consider “characters as objects of identification, sources of emotional response, or agents of moral vision and behavior” (4). Blending life writing with cultural criticism, Stephen Best uses his own narrative to attempt to rewrite the “‘traumatic model of black history’ in which the present is merely an endless, Oedipal repetition of slavery and Jim Crow” (6). Instead of the presence of an identity founded on this never-ending circuit of remembrance and despair, Best wants to dwell in impossibility, contradiction, paradox, and in-betweenness. Jennifer Wenzel’s study, The Disposition of Nature, blends paraphrases of literary representations of real-world environmental problems with references to political and cultural theories and to historical and journalistic accounts. In Wenzel’s case studies on such things as the story of oil extraction in the Niger Delta, the lifelikeness of literature engenders a meditative kind of outrage and skepticism. Given the fact that literary critics are no more expert in life than are their readers, the literature-as-life orientation shared by these authors leads to a kind of critical self-destruction.
APA, Harvard, Vancouver, ISO, and other styles
