Journal articles on the topic 'Vectorial embeddings'

Author: Grafiati

Published: 25 May 2024

Consult the top 27 journal articles for your research on the topic 'Vectorial embeddings.'

1

Rydhe, Eskil. "Vectorial Hankel operators, Carleson embeddings, and notions of BMOA." Geometric and Functional Analysis 27, no. 2 (March 7, 2017): 427–51. http://dx.doi.org/10.1007/s00039-017-0400-4.

2

Szymański, Piotr. "A broadband multistate interferometer for impedance measurement." Journal of Telecommunications and Information Technology, no. 2 (June 30, 2005): 29–33. http://dx.doi.org/10.26636/jtit.2005.2.311.

Abstract:

We present a new four-state interferometer for measuring the vectorial reflection coefficient from 50 to 1800 MHz. The interferometer is composed of a four-state phase shifter, a double-directional coupler, and a spectrum analyzer with a built-in tracking generator. We describe the design of the interferometer and the methods developed for its calibration and for de-embedding the measurements. Experimental data verify the good accuracy of the impedance measurement.

3

Hammer, Barbara, and Alexander Hasenfuss. "Topographic Mapping of Large Dissimilarity Data Sets." Neural Computation 22, no. 9 (September 2010): 2229–84. http://dx.doi.org/10.1162/neco_a_00012.

Abstract:

Topographic maps such as the self-organizing map (SOM) or neural gas (NG) constitute powerful data mining techniques that allow simultaneously clustering data and inferring their topological structure, such that additional features, for example, browsing, become available. Both methods have been introduced for vectorial data sets; they require a classical feature encoding of information. Often data are available in the form of pairwise distances only, such as arise from a kernel matrix, a graph, or some general dissimilarity measure. In such cases, NG and SOM cannot be applied directly. In this article, we introduce relational topographic maps as an extension of relational clustering algorithms, which offer prototype-based representations of dissimilarity data, to incorporate neighborhood structure. These methods are equivalent to the standard (vectorial) techniques if a Euclidean embedding exists, while preventing the need to explicitly compute such an embedding. Extending these techniques for the general case of non-Euclidean dissimilarities makes possible an interpretation of relational clustering as clustering in pseudo-Euclidean space. We compare the methods to well-known clustering methods for proximity data based on deterministic annealing and discuss how far convergence can be guaranteed in the general case. Relational clustering is quadratic in the number of data points, which makes the algorithms infeasible for huge data sets. We propose an approximate patch version of relational clustering that runs in linear time. The effectiveness of the methods is demonstrated in a number of examples.
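
The idea of clustering on dissimilarities without computing an embedding can be illustrated with a minimal Python sketch. It assumes the standard relational identity: if a prototype is a combination of data points with coefficients alpha that sum to 1, its squared distance to point i equals (D·alpha)_i − ½·alphaᵀ·D·alpha, where D holds the squared pairwise dissimilarities. The function name and the toy points are illustrative assumptions, not taken from the article; the explicit coordinates are used only to check the formula.

```python
def relational_sq_distance(D, alpha, i):
    """Squared distance from data point i to the prototype sum_j alpha[j] * x_j,
    computed only from the matrix D of squared pairwise dissimilarities."""
    n = len(D)
    first = sum(alpha[j] * D[i][j] for j in range(n))
    second = sum(alpha[j] * alpha[k] * D[j][k] for j in range(n) for k in range(n))
    return first - 0.5 * second


# Toy data that does have a Euclidean embedding (hypothetical points).
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
D = [[(p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for q in points] for p in points]
alpha = [0.5, 0.25, 0.25]  # prototype coefficients, summing to 1

# Explicit prototype, built here only to verify the relational formula.
proto = tuple(sum(a * p[d] for a, p in zip(alpha, points)) for d in range(2))
explicit = [(p[0] - proto[0]) ** 2 + (p[1] - proto[1]) ** 2 for p in points]
relational = [relational_sq_distance(D, alpha, i) for i in range(len(points))]
```

In this Euclidean toy case the two lists agree, which is the equivalence the abstract refers to; with non-Euclidean dissimilarities only the relational form is available.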

4

Riesen, Kaspar, and Horst Bunke. "Graph Classification Based on Vector Space Embedding." International Journal of Pattern Recognition and Artificial Intelligence 23, no. 06 (September 2009): 1053–81. http://dx.doi.org/10.1142/s021800140900748x.

Abstract:

Graphs provide us with a powerful and flexible representation formalism for pattern classification. Many classification algorithms have been proposed in the literature. However, the vast majority of these algorithms rely on vectorial data descriptions and cannot be applied directly to graphs. Recently, a growing interest in graph kernel methods can be observed. Graph kernels aim at bridging the gap between the high representational power and flexibility of graphs and the large number of algorithms available for object representations in terms of feature vectors. In the present paper, we propose an approach transforming graphs into n-dimensional real vectors by means of prototype selection and graph edit distance computation. This approach allows one to build graph kernels in a straightforward way. It is applicable not only to graphs but also to other kinds of symbolic data in conjunction with any kind of dissimilarity measure. Thus it is characterized by a high degree of flexibility. With several experimental results, we prove the robustness and flexibility of our new method and show that our approach outperforms other graph classification methods on several graph data sets of diverse nature.
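
The prototype-based mapping described in this abstract can be sketched in a few lines of Python. Since the abstract notes that the approach also works for other symbolic data with any dissimilarity measure, this sketch substitutes strings and the Levenshtein distance for graphs and graph edit distance. The function names and the prototype set are hypothetical, and the prototype selection step is not shown.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences
    (a stand-in here for graph edit distance)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def embed(obj, prototypes, dissimilarity=edit_distance):
    """Map an object to the vector of its dissimilarities to the n prototypes."""
    return [dissimilarity(obj, p) for p in prototypes]


prototypes = ["graph", "vector", "kernel"]  # hypothetical prototype set
vec = embed("grape", prototypes)            # one coordinate per prototype
```

Each object becomes an n-dimensional vector, with n the number of prototypes, so any vector-based classifier can then be applied to the embedded data.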

5

Zhu, Huiming, Chunhui He, Yang Fang, Bin Ge, Meng Xing, and Weidong Xiao. "Patent Automatic Classification Based on Symmetric Hierarchical Convolution Neural Network." Symmetry 12, no. 2 (January 21, 2020): 186. http://dx.doi.org/10.3390/sym12020186.

Abstract:

With the rapid growth of patent applications, automatically classifying accepted patent application documents accurately and quickly has become an urgent problem. Most previous patent automatic classification studies are based on feature engineering and traditional machine learning methods like SVM, and some even rely on the knowledge of domain experts; hence they suffer from low accuracy and poor generalization ability. In this paper, we propose a patent automatic classification method via the symmetric hierarchical convolution neural network (CNN) named PAC-HCNN. We use the title and abstract of the patent as the input data, and then apply the word embedding technique to segment and vectorize the input data. Then we design a symmetric hierarchical CNN framework to classify the patents based on the word embeddings, which is much more efficient than traditional RNN models dealing with texts, while keeping the history and future information of the input sequence. We also add gated linear units (GLUs) and residual connections to help realize the deep CNN. Additionally, we equip our model with a self-attention mechanism to address the long-term dependency problem. Experiments are performed on large-scale datasets for Chinese short text patent classification. Experimental results prove our proposed model’s effectiveness, and it significantly and consistently outperforms other state-of-the-art models on both fine-grained and coarse-grained classification.

6

Ji, Jiayi, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, and Rongrong Ji. "Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (May 18, 2021): 1655–63. http://dx.doi.org/10.1609/aaai.v35i2.16258.

Abstract:

Transformer-based architectures have shown great success in image captioning, where object regions are encoded and then attended into the vectorial representations to guide the caption decoding. However, such vectorial representations only contain region-level information without considering the global information reflecting the entire image, which fails to expand the capability of complex multi-modal reasoning in image captioning. In this paper, we introduce a Global Enhanced Transformer (termed GET) to enable the extraction of a more comprehensive global representation, and then adaptively guide the decoder to generate high-quality captions. In GET, a Global Enhanced Encoder is designed for the embedding of the global feature, and a Global Adaptive Decoder is designed for the guidance of the caption generation. The former models intra- and inter-layer global representation by taking advantage of the proposed Global Enhanced Attention and a layer-wise fusion module. The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation. Extensive experiments on the MS COCO dataset demonstrate the superiority of our GET over many state-of-the-art methods.

7

Dutta, Anjan, Pau Riba, Josep Lladós, and Alicia Fornés. "Hierarchical stochastic graphlet embedding for graph-based pattern recognition." Neural Computing and Applications 32, no. 15 (December 6, 2019): 11579–96. http://dx.doi.org/10.1007/s00521-019-04642-7.

Abstract:

Despite being very successful within the pattern recognition and machine learning community, graph-based methods are often unusable because of the lack of mathematical operations defined in the graph domain. Graph embedding, which maps graphs to a vectorial space, has been proposed as a way to tackle these difficulties, enabling the use of standard machine learning techniques. However, it is well known that graph embedding functions usually suffer from the loss of structural information. In this paper, we consider the hierarchical structure of a graph as a way to mitigate this loss of information. The hierarchical structure is constructed by topologically clustering the graph nodes and considering each cluster as a node in the upper hierarchical level. Once this hierarchical structure is constructed, we consider several configurations to define the mapping into a vector space given a classical graph embedding; in particular, we propose to make use of the stochastic graphlet embedding (SGE). Broadly speaking, SGE produces a distribution of uniformly sampled low-to-high-order graphlets as a way to embed graphs into the vector space. In what follows, the coarse-to-fine structure of a graph hierarchy and the statistics fetched by the SGE complement each other and include important structural information with varied contexts. Altogether, these two techniques substantially cope with the usual information loss involved in graph embedding techniques, obtaining a more robust graph representation. This fact has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform the state-of-the-art methods.

8

Szemenyei, Márton, and Ferenc Vajda. "3D Object Detection and Scene Optimization for Tangible Augmented Reality." Periodica Polytechnica Electrical Engineering and Computer Science 62, no. 2 (May 23, 2018): 25–37. http://dx.doi.org/10.3311/ppee.10482.

Abstract:

Object recognition in 3D scenes is one of the fundamental tasks in computer vision. It is used frequently in robotics or augmented reality applications [1]. In our work we intend to apply 3D shape recognition to create a Tangible Augmented Reality system that is able to pair virtual and real objects in natural indoor scenes. In this paper we present a method for arranging virtual objects in a real-world scene based on primitive shape graphs. For our scheme, we propose a graph node embedding algorithm for graphs with vectorial nodes and edges, and genetic operators designed to improve the quality of the global setup of virtual objects. We show that our methods improve the quality of the arrangement significantly.

9

Shrock, R. "Recent results on renormalization-group evolution of theories with gauge, fermion, and scalar fields." International Journal of Modern Physics A 32, no. 35 (December 20, 2017): 1747007. http://dx.doi.org/10.1142/s0217751x17470078.

Abstract:

We discuss recent results on renormalization-group evolution of several types of theories. First, we consider asymptotically free vectorial gauge theories with various fermion contents and discuss higher-loop calculations of the UV to IR evolution in these theories, including an IR zero of the beta function and the value of the anomalous dimension [Formula: see text] at this point, together with comparisons with lattice measurements. Effects of scheme transformations are discussed. We then present a novel way to determine the value of [Formula: see text] in an [Formula: see text] technicolor model from a particular embedding in an extended technicolor theory. Finally, we analyze the renormalization-group behavior of several non-asymptotically free theories, including a U(1) gauge theory, a non-Abelian gauge theory with many fermions, an [Formula: see text] [Formula: see text] scalar theory, and Yukawa theories.

10

Machet, B. "Comments on the Standard Model of Electroweak Interactions." International Journal of Modern Physics A 11, no. 01 (January 10, 1996): 29–63. http://dx.doi.org/10.1142/s0217751x96000031.

Abstract:

The Standard Model of electroweak interactions is shown to include a gauge theory for the observed scalar and pseudoscalar mesons. This is done by exploiting the consequences of embedding the SU(2)L×U(1) group into the chiral group of strong interactions and by explicitly considering as composite the Higgs boson and its three companions inside the standard scalar four-plet. No extra scale of interaction is needed. Quantizing by the Feynman path integral reveals how, in the “Nambu-Jona-Lasinio approximation,” the quarks and the Higgs boson become unobservable, and the theory anomaly-free. Nevertheless, the “anomalous” couplings of the pseudoscalar mesons to gauge fields spring again from the constraints associated with their compositeness itself. This work is the complement of Ref. 1, where the leptonic sector was shown to be compatible with a purely vectorial theory and, consequently, to be also anomaly-free. The bond between quarks and leptons loosens.

11

Güzel, Mehmet, Hakan Erten, and Erkan Bostanci. "Generating Turkish Lyrics with Long Short Term Memory." Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering 62, no. 1 (June 30, 2020): 71–78. http://dx.doi.org/10.33769/aupse.584380.

Abstract:

Long Short Term Memory (LSTM) has achieved notable success on sequential data such as video, text, and time series. In this paper, we aim to generate lyrics with a newly created “Turkish Lyrics” dataset. To date, studies on generating Turkish lyrics have worked at the character level. Unlike previous studies, we propose a Turkish lyrics generator that works at the word level instead of the character level. To employ an LSTM, the words cannot be passed as strings and must be vectorized. To vectorize, we tried two ways of encoding the words used in the dataset and compared them: first, one-hot encoding, and second, word embedding (Word2Vec). Our observations show that word-level generation with word embeddings gives more meaningful and realistic lyrics. The results are not yet good enough to be used for a song because of Turkish grammar. Nevertheless, this study encourages authors to work in this field, and we believe it will initiate research in this area and lead researchers to contribute to it as well.

12

Olotu, Samuel Ibukun, and Oladunni Abosede Daramola. "Hybrid spam message detection using convolutional neural network and long short-term memory techniques." Applied and Computational Engineering 2, no. 1 (March 22, 2023): 265–75. http://dx.doi.org/10.54254/2755-2721/2/20220601.

Abstract:

Short Message Service (SMS) is a mobile phone feature that enables a convenient and instant way of sending electronic messages between users. As SMS usage increases, fraudulent text messages, known as spam, are becoming more common. Spam SMS may result in leaking personal information, invasion of privacy, or unauthorized access to data on mobile devices. Users of mobile phones can mistakenly give away personal information on the assumption that they are sharing it with the right recipients. This work proposes an SMS spam detection method that combines convolutional neural network (CNN) and long short-term memory (LSTM) deep learning algorithms. The CNN is used for feature extraction while the LSTM classifies the message. The SMS spam dataset, collected from an online repository, is used to train the model. Word embeddings are used to vectorize the words in the message to make it suitable for the model. The result obtained from the implementation outperforms other machine learning algorithms with an accuracy of 99.77%.

13

Mutinda, James, Waweru Mwangi, and George Okeyo. "Sentiment Analysis of Text Reviews Using Lexicon-Enhanced Bert Embedding (LeBERT) Model with Convolutional Neural Network." Applied Sciences 13, no. 3 (January 21, 2023): 1445. http://dx.doi.org/10.3390/app13031445.

Abstract:

Sentiment analysis has become an important area of research in natural language processing. This technique has a wide range of applications, such as comprehending user preferences in e-commerce feedback portals, politics, and governance. However, accurate sentiment analysis requires robust text representation techniques that can convert words into precise vectors that represent the input text. There are two categories of text representation techniques: lexicon-based techniques and machine learning-based techniques. Research shows that both have limitations. For instance, pre-trained word embeddings, such as Word2Vec, GloVe, and bidirectional encoder representations from transformers (BERT), generate vectors by considering word distances, similarities, and occurrences while ignoring other aspects such as word sentiment orientation. To address these limitations, this paper presents a sentiment classification model (named LeBERT) combining sentiment lexicon, N-grams, BERT, and CNN. In the model, sentiment lexicon, N-grams, and BERT are used to vectorize words selected from a section of the input text. CNN is used as the deep neural network classifier for feature mapping and giving the output sentiment class. The proposed model is evaluated on three public datasets, namely, Amazon products’ reviews, IMDb movies’ reviews, and Yelp restaurants’ reviews datasets. Accuracy, precision, and F-measure are used as the model performance metrics. The experimental results indicate that the proposed LeBERT model outperforms the existing state-of-the-art models, with an F-measure score of 88.73% in binary sentiment classification.

14

Faraz, Anum, Fardin Ahsan, Jinane Mounsef, Ioannis Karamitsos, and Andreas Kanavos. "Enhancing Child Safety in Online Gaming: The Development and Application of Protectbot, an AI-Powered Chatbot Framework." Information 15, no. 4 (April 19, 2024): 233. http://dx.doi.org/10.3390/info15040233.

Abstract:

This study introduces Protectbot, an innovative chatbot framework designed to improve safety in children’s online gaming environments. At its core, Protectbot incorporates DialoGPT, a conversational Artificial Intelligence (AI) model rooted in Generative Pre-trained Transformer 2 (GPT-2) technology, engineered to simulate human-like interactions within gaming chat rooms. The framework is distinguished by a robust text classification strategy, rigorously trained on the Publicly Available Natural 2012 (PAN12) dataset, aimed at identifying and mitigating potential sexual predatory behaviors through chat conversation analysis. By utilizing fastText for word embeddings to vectorize sentences, we have refined a support vector machine (SVM) classifier, achieving remarkable performance metrics, with recall, accuracy, and F-scores approaching 0.99. These metrics not only demonstrate the classifier’s effectiveness, but also signify a significant advancement beyond existing methodologies in this field. The efficacy of our framework is additionally validated on a custom dataset, composed of 71 predatory chat logs from the Perverted Justice website, further establishing the reliability and robustness of our classifier. Protectbot represents a crucial innovation in enhancing child safety within online gaming communities, providing a proactive, AI-enhanced solution to detect and address predatory threats promptly. Our findings highlight the immense potential of AI-driven interventions to create safer digital spaces for young users.

15

Roth, Jan-Philipp, Thomas Kühler, and Elmar Griese. "Utilizing multimode interference effects in integrated graded-index optical waveguides for efficient power splitting." COMPEL - The international journal for computation and mathematics in electrical and electronic engineering 37, no. 4 (July 2, 2018): 1556–63. http://dx.doi.org/10.1108/compel-09-2017-0374.

Abstract:

Purpose: For the realization of optical waveguide components, needed for photonic integrated circuits, multimode-interference based (MMI-based) devices are an excellent component class for the realization of low loss optical splitters. A promising approach to the manufacturing of these components is their embedding in thin glass sheets by ion-exchange diffusion processes, which has not yet been extensively studied. This study aims to significantly enhance the modeling of the diffusion process to support manufacturing of graded-index, MMI-based optical splitters. Design/methodology/approach: The methods of design and analysis of MMI-based components are based on a step-index refractive index profile. In this work, fundamental correlations between the properties of the manufacturing ion-exchange process and the characteristics of the graded-index, MMI-based components are established. The refractive index profile is calculated with a proprietary solver based on the finite element method. Any further investigation with respect to parameter influence is based on the beam propagation method, specifically a finite difference based, semi-vectorial, wide-angle beam propagation algorithm. The influence of the parameters of the self-imaging effect is investigated. On this basis, different approaches for efficient power splitting with graded-index, MMI-based waveguide components are evaluated. Findings: Easy approximations (mostly linear) can be found to model the dependencies of the investigated parameters. The resulting graded-index splitters are characterized by their low excess and insertion loss. Originality/value: These findings are the first step in the direction of the semi-analytical modeling of the respective waveguide components to reduce the numerical effort.

16

Liu, Xinda, and Lili Wang. "Multi-granularity sequence generation for hierarchical image classification." Computational Visual Media 10, no. 2 (January 3, 2024): 243–60. http://dx.doi.org/10.1007/s41095-022-0332-2.

Abstract:

Hierarchical multi-granularity image classification is a challenging task that aims to tag each given image with multiple granularity labels simultaneously. Existing methods tend to overlook that different image regions contribute differently to label prediction at different granularities, and also insufficiently consider relationships between the hierarchical multi-granularity labels. We introduce a sequence-to-sequence mechanism to overcome these two problems and propose a multi-granularity sequence generation (MGSG) approach for the hierarchical multi-granularity image classification task. Specifically, we introduce a transformer architecture to encode the image into visual representation sequences. Next, we traverse the taxonomic tree and organize the multi-granularity labels into sequences, and vectorize them and add positional information. The proposed multi-granularity sequence generation method builds a decoder that takes visual representation sequences and semantic label embedding as inputs, and outputs the predicted multi-granularity label sequence. The decoder models dependencies and correlations between multi-granularity labels through a masked multi-head self-attention mechanism, and relates visual information to the semantic label information through a cross-modality attention mechanism. In this way, the proposed method preserves the relationships between labels at different granularity levels and takes into account the influence of different image regions on labels with different granularities. Evaluations on six public benchmarks qualitatively and quantitatively demonstrate the advantages of the proposed method. Our project is available at https://github.com/liuxindazz/mgsg.

17

Goldstein, Ariel, Avigail Grinstein-Dabush, Mariano Schain, Haocheng Wang, Zhuoqiao Hong, Bobbi Aubrey, et al. "Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns." Nature Communications 15, no. 1 (March 30, 2024). http://dx.doi.org/10.1038/s41467-024-46631-y.

Abstract:

Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we densely record the neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. Using stringent zero-shot mapping we demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns. The common geometric patterns allow us to predict the brain embedding in IFG of a given left-out word based solely on its geometrical relationship to other non-overlapping words in the podcast. Furthermore, we show that contextual embeddings capture the geometry of IFG embeddings better than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain.

18

Chersoni, Emmanuele, Enrico Santus, Chu-Ren Huang, and Alessandro Lenci. "Decoding Word Embeddings with Brain-Based Semantic Features." Computational Linguistics, August 26, 2021, 1–36. http://dx.doi.org/10.1162/coli_a_00412.

Abstract:

Word embeddings are vectorial semantic representations built with either counting or predicting techniques aimed at capturing shades of meaning from word co-occurrences. Since their introduction, these representations have been criticized for lacking interpretable dimensions. This property of word embeddings limits our understanding of the semantic features they actually encode. Moreover, it contributes to the “black box” nature of the tasks in which they are used, since the reasons for word embedding performance often remain opaque to humans. In this contribution, we explore the semantic properties encoded in word embeddings by mapping them onto interpretable vectors, consisting of explicit and neurobiologically motivated semantic features (Binder et al. 2016). Our exploration takes into account different types of embeddings, including factorized count vectors and predict models (Skip-Gram, GloVe, etc.), as well as the most recent contextualized representations (i.e., ELMo and BERT). In our analysis, we first evaluate the quality of the mapping in a retrieval task, then we shed light on the semantic features that are better encoded in each embedding type. A large number of probing tasks is finally set to assess how the original and the mapped embeddings perform in discriminating semantic categories. For each probing task, we identify the most relevant semantic features and we show that there is a correlation between the embedding performance and how they encode those features. This study sets itself as a step forward in understanding which aspects of meaning are captured by vector spaces, by proposing a new and simple method to carve human-interpretable semantic representations from distributional vectors.

19

le Gorrec, Luce, Philip A. Knight, and Auguste Caen. "Learning network embeddings using small graphlets." Social Network Analysis and Mining 12, no. 1 (December 15, 2021). http://dx.doi.org/10.1007/s13278-021-00846-9.

Abstract:

Techniques for learning vectorial representations of graphs (graph embeddings) have recently emerged as an effective approach to facilitate machine learning on graphs. Some of the most popular methods involve sophisticated features such as graph kernels or convolutional networks. In this work, we introduce two straightforward supervised learning algorithms based on small-size graphlet counts, combined with a dimension reduction step. The first relies on a classic feature extraction method powered by principal component analysis (PCA). The second is a feature selection procedure also based on PCA. Despite their conceptual simplicity, these embeddings are arguably more meaningful than some popular alternatives and at the same time are competitive with state-of-the-art methods. We illustrate this second point on a downstream classification task. We then use our algorithms in a novel setting, namely to conduct an analysis of author relationships in Wikipedia articles, for which we present an original dataset. Finally, we provide empirical evidence suggesting that our methods could also be adapted to unsupervised learning algorithms.
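
As a rough illustration of what a small-graphlet count feature looks like, the following Python sketch counts the two connected induced subgraphs on three nodes (open wedges and triangles) in an undirected graph. It covers only the counting step; the PCA-based reduction mentioned in the abstract is not shown, and the function name, the restriction to three-node graphlets, and the toy graph are assumptions made for this example rather than the authors' procedure.

```python
from itertools import combinations


def three_node_graphlet_counts(edges):
    """Count induced 3-node graphlets in an undirected graph:
    open wedges (exactly two edges) and triangles (three edges)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    wedges = triangles = 0
    # Brute force over all node triples; fine for small graphs only.
    for a, b, c in combinations(sorted(adj), 3):
        k = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if k == 2:
            wedges += 1
        elif k == 3:
            triangles += 1
    return {"wedge": wedges, "triangle": triangles}


# Toy graph (hypothetical): triangle 0-1-2 with a pendant node 3 attached to 2.
counts = three_node_graphlet_counts([(0, 1), (1, 2), (0, 2), (2, 3)])
```

A vector of such counts, one entry per graphlet type, is the kind of raw feature that a dimension reduction step could then compress.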

20

Xenos, A., N. Malod-Dognin, S. Milinković, and N. Pržulj. "Linear functional organization of the omic embedding space." Bioinformatics, July 2, 2021. http://dx.doi.org/10.1093/bioinformatics/btab487.

Abstract:

Motivation: We are increasingly accumulating complex omics data that capture different aspects of cellular functioning. A key challenge is to untangle their complexity and effectively mine them for new biomedical information. To decipher this new information, we introduce algorithms based on network embeddings. Such algorithms represent biological macromolecules as vectors in d-dimensional space, in which topologically similar molecules are embedded close in space and knowledge is extracted directly by vector operations. Recently, it has been shown that neural networks used to obtain vectorial representations (embeddings) are implicitly factorizing a mutual information matrix, called Positive Pointwise Mutual Information (PPMI) matrix. Thus, we propose the use of the PPMI matrix to represent the human protein–protein interaction (PPI) network and also introduce the graphlet degree vector PPMI matrix of the PPI network to capture different topological (structural) similarities of the nodes in the molecular network. Results: We generate the embeddings by decomposing these matrices with Nonnegative Matrix Tri-Factorization. We demonstrate that genes that are embedded close in these spaces have similar biological functions, so we can extract new biomedical knowledge directly by doing linear operations on their embedding vector representations. We exploit this property to predict new genes participating in protein complexes and to identify new cancer-related genes based on the cosine similarities between the vector representations of the genes. We validate 80% of our novel cancer-related gene predictions in the literature and also by patient survival curves, demonstrating that 93.3% of them have a potential clinical relevance as biomarkers of cancer. Availability and implementation: Code and data are available online at https://gitlab.bsc.es/axenos/embedded-omics-data-geometry/ Supplementary information: Supplementary data are available at Bioinformatics online.
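
The two generic building blocks named in this abstract, the PPMI matrix and cosine similarity, can be sketched in plain Python. The sketch applies the textbook PPMI definition, max(0, log(p(i,j) / (p(i)·p(j)))), directly to a toy adjacency matrix and compares raw PPMI rows. That is a simplification: the paper obtains its embeddings by factorizing such matrices, a step not reproduced here, and the matrix, function names, and choice of raw rows as vectors are all assumptions for illustration.

```python
import math


def ppmi(counts):
    """Positive pointwise mutual information of a non-negative count matrix."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    out = []
    for i, row in enumerate(counts):
        out_row = []
        for j, c in enumerate(row):
            if c == 0:
                out_row.append(0.0)  # log(0) is undefined; PPMI sets it to 0
            else:
                pmi = math.log(c * total / (row_sums[i] * col_sums[j]))
                out_row.append(max(0.0, pmi))
        out.append(out_row)
    return out


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Toy symmetric adjacency matrix of a 4-node network (hypothetical data).
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
M = ppmi(A)
sim01 = cosine(M[0], M[1])  # similarity of nodes 0 and 1 in PPMI space
```

In the setting described above, the vectors passed to `cosine` would be the factorized embedding vectors rather than raw PPMI rows.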

21

Zhang, Fei, Bo Sun, Xiaolin Diao, Wei Zhao, and Ting Shu. "Prediction of adverse drug reactions based on knowledge graph embedding." BMC Medical Informatics and Decision Making 21, no. 1 (February 4, 2021). http://dx.doi.org/10.1186/s12911-021-01402-3.

Abstract:

Background: Adverse drug reactions (ADRs) are an important concern in the medication process and can pose a substantial economic burden for patients and hospitals. Because of the limitations of clinical trials, it is difficult to identify all possible ADRs of a drug before it is marketed. We developed a new model based on data mining technology to predict potential ADRs from available drug data. Method: Based on the Word2Vec model in Natural Language Processing, we propose a new knowledge graph embedding method that embeds drugs and ADRs into their respective vectors and builds a logistic regression classification model to predict whether a given drug will have ADRs. Result: First, a new knowledge graph embedding method was proposed, and comparison with similar studies showed that our model not only had high prediction accuracy but also had a simpler model structure. In our experiments, the AUC of the classification model reached a maximum of 0.87, and the mean AUC was 0.863. Conclusion: In this paper, we introduce a new method that embeds a knowledge graph to vectorize drugs and ADRs, then use a logistic regression classification model to predict whether there is a causal relationship between them. The experiments showed that knowledge graph embedding can effectively encode drugs and ADRs, and that the proposed ADR prediction system is also very effective.
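
The embed-then-classify idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: the 2-dimensional vectors are hand-written stand-ins for learned embeddings, the drug and ADR names are hypothetical, and combining a drug vector with an ADR vector by element-wise product is an assumption (the abstract does not say how the pair is encoded).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def pair_features(drug_vec, adr_vec):
    # Assumption: encode a (drug, ADR) pair as the element-wise product
    # of the two embedding vectors.
    return [d * a for d, a in zip(drug_vec, adr_vec)]

def predict_proba(weights, bias, features):
    """Probability that the drug-ADR pair is a positive example."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Logistic regression fitted by batch gradient descent on log-loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            for k, value in enumerate(features):
                grad_w[k] += error * value
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical embeddings and labels, for illustration only.
drug_vecs = {"drug_a": [1.0, 0.0], "drug_b": [0.0, 1.0]}
adr_vecs = {"adr_x": [1.0, 0.0], "adr_y": [0.0, 1.0]}
pairs = [
    ("drug_a", "adr_x", 1),
    ("drug_b", "adr_y", 1),
    ("drug_a", "adr_y", 0),
    ("drug_b", "adr_x", 0),
]
X = [pair_features(drug_vecs[d], adr_vecs[a]) for d, a, _ in pairs]
y = [label for _, _, label in pairs]
weights, bias = train_logistic(X, y)
```

Calling `predict_proba(weights, bias, pair_features(drug_vecs["drug_a"], adr_vecs["adr_x"]))` then gives the estimated probability for one pair; a threshold of 0.5 turns it into a yes/no prediction.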

22

Alonso-Álvarez, Gonzalo, and James M. Cline. "Gauging lepton flavor SU(3) for the muon g − 2." Journal of High Energy Physics 2022, no. 3 (March 2022). http://dx.doi.org/10.1007/jhep03(2022)042.

Abstract:

Gauging a specific difference of lepton numbers such as Lμ − Lτ is a popular model-building option, which gives rise to economical explanations for the muon anomalous magnetic moment. However, this choice of gauge group seems rather arbitrary, and additional physics is required to reproduce the observed neutrino masses and mixings. We address these shortcomings by embedding Lμ − Lτ in the vectorial SU(3) gauge symmetry of lepton flavor. The vacuum expectation values (VEVs) of scalar fields in the fundamental, six-dimensional and adjoint representations allow for phenomenologically viable lepton and gauge boson masses. The octet scalar gives rise to charged lepton masses, and together with the triplet scalar generates masses for all the leptophilic gauge bosons except for the Lμ − Lτ one. The latter gets its smaller mass from the sextet VEVs, which also generate the neutrino masses, and are determined up to an overall scaling by the observed masses and mixings. The model predicts three heavy neutral leptons at the GeV-TeV scale as well as vectorlike charged lepton partners; it requires the mass of the lightest active neutrino to exceed 10⁻⁴ eV, and it naturally provides a resolution of the Cabibbo angle anomaly.

23

Peñafiel-Saiz, Carmen, Jordi Morales-i-Gras, and Lázaro Echegaray-Eizaguirre. "Las imágenes como recurso fundamental de la información durante la covid-19 y la fase de vacunación en medios digitales españoles." Revista de Comunicación, January 31, 2024. http://dx.doi.org/10.26441/rc23.1-2024-3427.

Abstract:

The study aims to characterize the images that accompany news coverage of the COVID-19 pandemic, vaccination, and coronavirus treatments in Spanish digital media: ABC, Deia, EITB.eus, El Correo, elDiario.es, El Mundo, La Razón, La Vanguardia, Naiz and Público (2020-2022). The work uses a sample of 15,654 unique images, in which 15 clusters were identified with Artificial Intelligence techniques, including the Inception V3 algorithm and embeddings in vector spaces. An unsupervised strategy was chosen, as is typical of exploratory and inductive research. Notable among the results is the identification of different types of images used by the media: images with a medical and health orientation, and representations of death and of the human drama caused by the pandemic, linked to the more political and economic dimension of the vaccination campaign. The images analyzed form part of 'political communication': a communication model that seeks and reinforces the relationship between the government and the media, and vice versa, in order to build trust in the management and in the system itself. The diversity of approaches observed is a positive value, since it helps establish a richer and more multifaceted operative social reality. The media outlets show different preferences when representing topics visually, which results in an uneven distribution of those topics. The study identified the different narratives, revealing different uses of emotionally charged images that pose a complex scenario.

24

Chen, Zhao, Yin Jiang, Xiaoyu Zhang, Rui Zheng, Ruijin Qiu, Yang Sun, Chen Zhao, and Hongcai Shang. "ResNet18DNN: prediction approach of drug-induced liver injury by deep neural network with ResNet18." Briefings in Bioinformatics 23, no. 1 (December 9, 2021). http://dx.doi.org/10.1093/bib/bbab503.

Abstract:

Drug-induced liver injury (DILI) has always been a focus of clinicians and drug researchers. How to improve the performance of DILI prediction models so that they accurately predict liver injury is an urgent problem for researchers in the field of medical research. To address this problem, this research collected a comprehensive, accurate and high-quality DILI dataset based on clinically confirmed DILI compound datasets, including 1446 chemical compounds. Then, a model combining an 18-layer residual neural network that uses more 5-layer blocks (ResNet18) with a deep neural network (ResNet18DNN) was proposed to predict DILI; it improves DILI prediction by vectorizing images of compound structures. In predicting DILI, ResNet18DNN learned effectively and outperformed the existing state-of-the-art DILI predictors. For the DILI prediction model based on ResNet18DNN, the AUC (area under the curve), accuracy, recall, precision, F1-score and specificity on the training set were 0.973, 0.992, 0.995, 0.994, 0.995 and 0.975; those on the test set were, respectively, 0.958, 0.976, 0.935, 0.947, 0.926 and 0.913, which were better than the performance of previously published DILI prediction models. This method adopted the ResNet18 embedding method to vectorize molecular structure images, and the evaluation indicators of ResNet18DNN were obtained after 10 000 iterations. This prediction approach will greatly improve the performance of predictive models of DILI and provide an accurate and precise early warning method for DILI in drug development and clinical medication.
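
For readers unfamiliar with the evaluation indicators listed in the abstract above, the sketch below shows how accuracy, recall, precision, F1-score and specificity follow from the four confusion-matrix counts. The counts are hypothetical and are not taken from the paper; AUC is omitted because it requires ranked prediction scores, not just counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall": recall,
        "precision": precision,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=6, fp=2, fn=3, tn=9)
```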

25

Ouyang, Ningjing. "Analyze IMDb movies by sentiment and topic analysis." Environment and Social Psychology 8, no. 3 (October 25, 2023). http://dx.doi.org/10.54517/esp.v8i3.1958.

Abstract:

Movies are an important cultural form, carrying multiple levels of meaning such as artistic, entertainment and social value. Movie review and rating data sets are huge, and deep learning and natural language processing methods are widely used today. Advances in big data and deep learning offer unprecedented opportunities to understand moviegoer behavior and preferences while providing a cost-effective way to gain insights relevant to the entertainment industry. This project conducts sentiment analysis, topic modeling, and visual statistical analysis based on the IMDb movie data set to identify key factors and deeper insights that influence successful decision-making in film production. The project first uses a word embedding method to vectorize the movie review text, and then uses Bidirectional Long Short-Term Memory (Bi-LSTM) to perform sentiment classification. In addition, statistical methods such as visualization were used to reach conclusions such as November having the highest average number of movies released, and to identify trends, patterns and relationships between the variables of IMDb movies. Finally, a Latent Dirichlet Allocation (LDA) topic model was constructed, which found that the important topic with increased demand is light entertainment movies, highlighting the commercial feasibility of comedy movies as a profitable business model. In summary, this project uses an emotion-topic fusion analysis method based on the Bi-LSTM emotion classification method and the LDA topic modeling method. The results show that the Bi-LSTM model can better identify positive and negative emotions in movie reviews, and the LDA topic model performs well in mining popular topics.

26

Matougui, Brahim, Abdelbasset Boukelia, Hacene Belhadef, Clovis Galiez, and Mohamed Batouche. "NLP-MeTaxa: A Natural Language Processing approach for Metagenomic Taxonomic Binning based on deep learning." Current Bioinformatics 16 (June 21, 2021). http://dx.doi.org/10.2174/1574893616666210621101150.

Abstract:

Background: Metagenomics is the study of genomic content in mass from an environment of interest such as the human gut or soil. Taxonomy, the science of defining and naming groups of microbial organisms that share the same characteristics, is one of the most important fields of metagenomics. The problem of taxonomic classification is the identification and quantification of microbial species or higher-level taxa sampled by high-throughput sequencing. Objective: Although many methods exist to deal with the taxonomic classification problem, assignment to low taxonomic ranks remains an important challenge for binning methods, as is scalability to Gb-sized datasets generated with deep sequencing techniques. Methods: In this paper, we introduce NLP-MeTaxa, a novel composition-based method for taxonomic binning, which relies on word embeddings and a deep learning architecture. The proposed approach is word-based: the metagenomic DNA fragments are processed as a set of overlapping words, and the word2vec model is used to vectorize them in order to feed the deep learning model. NLP-MeTaxa output is visualized as an NCBI taxonomy tree; this representation helps to show the connection between the predicted taxonomic identifiers. NLP-MeTaxa was trained on large-scale data from NCBI RefSeq, comprising more than 14,000 complete microbial genomes. The NLP-MeTaxa code is available at https://github.com/padriba/NLP_MeTaxa/. Results: We evaluated NLP-MeTaxa with real and simulated metagenomic datasets and compared our results to those of other tools. The experimental results show that our method outperforms the other methods, especially for the classification of low taxonomic ranks such as species and genus. Conclusion: In summary, our new method might provide novel insight for understanding the microbial community through the identification of the organisms it might contain.
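
The "overlapping words" step described in the abstract above can be illustrated with a short sketch. It only shows the tokenization of a DNA fragment into overlapping k-mers; the word length `k=3`, the function name and the example fragment are assumptions, since the abstract does not give these details, and the word2vec training itself is not shown.

```python
from collections import Counter

def kmer_words(sequence, k=3):
    """Split a DNA fragment into overlapping words (k-mers) of length k."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

# Hypothetical fragment, for illustration only.
fragment = "ATGCGTATG"
words = kmer_words(fragment, k=3)
word_counts = Counter(words)
```

Each fragment becomes a "sentence" of such words, which is the form of input a word2vec-style model expects.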

27

Xu, Liang, Lu Lu, Minglu Liu, Chengxuan Song, and Lizhen Wu. "Nanjing Yunjin intelligent question-answering system based on knowledge graphs and retrieval augmented generation technology." Heritage Science 12, no. 1 (April 9, 2024). http://dx.doi.org/10.1186/s40494-024-01231-3.

Abstract:

Nanjing Yunjin, a traditional Chinese silk weaving craft, is celebrated globally for its unique local characteristics and exquisite workmanship, forming an integral part of the world's intangible cultural heritage. However, with the advancement of information technology, the experiential knowledge of the Nanjing Yunjin production process is predominantly stored in text format. Because this is a highly specialized vertical domain, the information is not readily converted into usable data. Previous studies on a knowledge graph-based Nanjing Yunjin Question-Answering System have partially addressed this issue. However, knowledge graphs need to be constantly updated and rely on predefined entities and relationship types. Knowledge graph information retrieval also faces challenges when handling ambiguous or complex natural language problems. Therefore, this study proposes a Nanjing Yunjin Question-Answering System that integrates Knowledge Graphs and Retrieval Augmented Generation techniques. In this system, the RoBERTa model is first used to vectorize Nanjing Yunjin textual information, delving deep into textual semantics to unveil its profound cultural connotations. Additionally, the FAISS vector database is employed for efficient storage and retrieval of Nanjing Yunjin information, achieving a deep semantic match between questions and answers. Finally, the related retrieval results are fed into the Large Language Model for enhanced generation, aiming for more accurate text generation and improving the interpretability and logic of the Question-Answering System. This research merges technologies such as text embedding, vectorized retrieval, and natural language generation, aiming to overcome the limitations of knowledge graph-based Question-Answering Systems in terms of graph updating, dependency on predefined types, and semantic understanding.
System implementation and testing have shown that the Nanjing Yunjin Intelligent Question-Answering System, built on Knowledge Graphs and Retrieval Augmented Generation, possesses a broader knowledge base that considers context, resolves issues of polysemy, vague language, and sentence ambiguity, and efficiently and accurately generates answers to natural language queries. This significantly facilitates the retrieval and use of Yunjin knowledge, provides a paradigm for constructing Question-Answering Systems for other intangible cultural heritages, and holds substantial theoretical and practical significance for the deep exploration and discovery of the knowledge structure of human intangible heritage, promoting cultural inheritance and protection.
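
The retrieve-then-generate flow described in the abstract above can be sketched in a few lines. This is a simplified stand-in, not the system from the paper: hand-written 3-dimensional vectors replace RoBERTa embeddings, a brute-force cosine search replaces the FAISS index, and the passages, the question and the prompt layout are all hypothetical placeholders. The call to a Large Language Model is left out.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve(query_vec, index, top_k=2):
    """Return the top_k passage texts most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(question, passages):
    """Assemble retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hypothetical index of (passage text, embedding) pairs.
index = [
    ("placeholder passage about weaving tools", [0.9, 0.1, 0.0]),
    ("placeholder passage about colour patterns", [0.1, 0.9, 0.0]),
    ("placeholder passage about history", [0.0, 0.2, 0.9]),
]
question = "Which tools are used?"
query_vec = [0.8, 0.2, 0.1]  # stand-in for the embedded question
top_passages = retrieve(query_vec, index, top_k=2)
prompt = build_prompt(question, top_passages)
```

The resulting `prompt` is what would be passed to the language model, so that its answer is grounded in the retrieved passages.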
