A selection of scholarly literature on the topic "Auto-encodeur variationnel à graphes (VGAE)"

Author: Grafiati

Published: July 7, 2024

Updated: July 7, 2024

Browse lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Auto-encodeur variationnel à graphes (VGAE)".

Contents

Journal articles
Dissertations

Journal articles on the topic "Auto-encodeur variationnel à graphes (VGAE)":

1

Patel, Neel, Nhat Le, Tan Nguyen, Fedaa Najdawi, Sandhya Srinivasan, Adam Stanford-Moore, Deeksha Kartik, et al. "Abstract 4912: Unsupervised detection of stromal phenotypes with distinct fibrogenic and inflamed properties in NSCLC." Cancer Research 84, no. 6_Supplement (March 22, 2024): 4912. http://dx.doi.org/10.1158/1538-7445.am2024-4912.

Abstract:

Background: Understanding the composition of cancer-associated stroma (CAS) is vital, as the number and location of immune cells and fibroblasts, as well as the degree of extracellular matrix deposition, have implications for cancer progression and response to treatment, including in non-small cell lung cancer (NSCLC). Manual analysis of CAS does not fully describe the stromal milieu, especially from a spatial perspective, and is highly subjective. To this end, we have developed an unsupervised machine learning (ML) model to characterize the CAS in NSCLC from hematoxylin and eosin (H&E) stained whole slide images (WSI) at scale. Methods: PathExplore™ models were deployed to predict stromal tissue and cell types, while another ML model was used to detect collagen fibers from H&E stained WSIs from the TCGA LUAD (N=536) and LUSC (N=464) datasets. Stroma was divided into small regions (median = 0.02 mm²), and 88 features characterizing cell distribution, tissue composition and fiber density were extracted from each region. Graphs were generated connecting neighboring regions (nodes), and an unsupervised variational graph auto-encoder (VGAE) model was trained to learn 8 latent features through dimensionality reduction. Stromal phenotypes were then derived from the latent features using k-means clustering. The fraction of each phenotype in the stroma was correlated against immune- and stroma-related gene expression signatures (GES) and overall survival (OS). Results: Deployment of VGAE on LUAD and LUSC WSIs revealed three distinct stromal phenotypes - P0, P1 and P2. Fibroblast density was elevated in P0 and P1 regions (p<0.001), immune cell density was elevated in P2 regions (p<0.001), and collagen fiber intensity was highest in P1 regions (p<0.001). P2 enrichment was correlated with elevated expression of the T cell-inflamed gene expression profile (TGEP; Spearman ρ = 0.43 in LUAD; ρ = 0.27 in LUSC) and with improved OS (HR = 0.696; 95% CI: 0.571-0.847 in LUSC). Conversely, P1 enrichment was positively associated with a transforming growth factor-β-induced cancer-associated fibroblast GES (TGFβ-CAF: ρ = 0.19 in LUAD and ρ = 0.12 in LUSC) and poor OS (HR = 1.358; 95% CI: 1.149-1.603 in LUSC). These phenotypes are consistent with fibroblast-enriched, collagen-depleted stroma (P0), collagen-rich, fibroblast-enriched tumor-promoting stroma (P1), and immune cell-enriched, tumor-suppressive stroma (P2). Conclusions: We describe an unsupervised, data-driven method of predicting stromal regions with discrete patterns of cell composition and collagen deposition in NSCLC. This approach identified three phenotypes of NSCLC stroma. These results highlight the ability of ML models to characterize and find meaningful patterns within the cell, tissue, and matrix components of a tumor. This work provides further evidence of the potential of ML to discover novel precision medicine biomarkers in NSCLC.

Citation Format: Neel Patel, Nhat Le, Tan Nguyen, Fedaa Najdawi, Sandhya Srinivasan, Adam Stanford-Moore, Deeksha Kartik, Jun Zhang, Jacqueline Brosnan-Cashman, Robert Egger, Justin Lee, Matthew Bronnimann. Unsupervised detection of stromal phenotypes with distinct fibrogenic and inflamed properties in NSCLC [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 4912.
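The VGAE step named in this abstract can be illustrated with the standard formulation of a variational graph auto-encoder: each node gets a Gaussian latent code, edges are reconstructed from inner products of those codes, and a KL term regularizes the codes. The sketch below is not the authors' implementation. The four-node graph, the random means, the fixed log-variances, and every function name are assumptions made for illustration, and the encoder that would normally produce the means and log-variances from the 88 region features is left out.

```python
import math
import random

def sigmoid(x):
    """Numerically safe logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one draw per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def decode_edge_prob(z_i, z_j):
    """Inner-product decoder: probability that nodes i and j are connected."""
    return sigmoid(sum(a * b for a, b in zip(z_i, z_j)))

def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, diag(exp(logvar))) to N(0, I) for one node."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

def vgae_loss(mus, logvars, adjacency, rng):
    """Return (reconstruction loss, KL term, sampled latent codes)."""
    n = len(mus)
    zs = [reparameterize(mus[i], logvars[i], rng) for i in range(n)]
    recon = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Clamp to keep log() finite, then add the binary cross-entropy of edge (i, j).
            p = min(max(decode_edge_prob(zs[i], zs[j]), 1e-9), 1.0 - 1e-9)
            a = adjacency[i][j]
            recon -= a * math.log(p) + (1 - a) * math.log(1.0 - p)
    kl = sum(kl_to_standard_normal(mus[i], logvars[i]) for i in range(n))
    return recon, kl, zs

# Hypothetical toy input: four stromal regions on a path graph, 8 latent features each.
rng = random.Random(0)
adjacency = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
mus = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(4)]
logvars = [[-2.0] * 8 for _ in range(4)]
recon, kl, zs = vgae_loss(mus, logvars, adjacency, rng)
```

In a full pipeline the sum of the two terms would be minimized over the encoder weights, and the resulting 8-dimensional codes would then be passed to k-means, as the abstract describes.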

2

Zhu, Guixiang, Jie Cao, Lei Chen, Youquan Wang, Zhan Bu, Shuxin Yang, Jianqing Wu, and Zhiping Wang. "A Multi-task Graph Neural Network with Variational Graph Auto-Encoders for Session-based Travel Packages Recommendation." ACM Transactions on the Web, February 2023. http://dx.doi.org/10.1145/3577032.

Abstract:

Session-based travel packages recommendation aims to predict users’ next click based on their current and historical sessions recorded by Online Travel Agencies (OTA). Recently, an increasing number of studies have attempted to apply Graph Neural Networks (GNN) to session-based recommendation and obtained promising results. However, most of them do not take full advantage of the explicit latent structure from attributes of items, making learned representations of items less effective and difficult to interpret. Moreover, they only combine historical sessions (long-term preferences) with a current session (short-term preference) to learn a unified representation of users, ignoring the effects of historical sessions on the current session. To this end, this paper proposes a novel session-based model named STR-VGAE, which fulfills the subtasks of travel packages recommendation and variational graph auto-encoders simultaneously. STR-VGAE mainly consists of three components: a travel packages encoder, a users behaviors encoder, and interaction modeling. Specifically, the travel packages encoder module is used to learn a unified travel package representation from co-occurrence attribute graphs by using multi-view variational graph auto-encoders and a multi-view attention network. The users behaviors encoder module is used to encode users’ historical and current sessions with a personalized GNN, which considers the effects of historical sessions on the current session, and coalesces these two kinds of session representations to learn high-quality user representations by exploiting a gated fusion approach. The interaction modeling module is used to calculate recommendation scores over all candidate travel packages. Extensive experiments on a real-life tourism e-commerce dataset from China show that STR-VGAE yields significant performance advantages over several competitive methods, while also providing an interpretation for the generated recommendation list.
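The abstract does not give the exact form of its gated fusion, so the following is only a sketch of one common variant: a sigmoid gate, computed from both session representations, mixes the long-term and short-term vectors element by element. The function name, the weight matrices, and the example vectors are all hypothetical.

```python
import math

def gated_fusion(h_long, h_short, w_long, w_short, bias):
    """Blend two equally sized vectors with a learned element-wise gate.

    gate = sigmoid(w_long @ h_long + w_short @ h_short + bias)
    fused = gate * h_long + (1 - gate) * h_short
    """
    fused = []
    for d in range(len(h_long)):
        pre = bias[d]
        pre += sum(w_long[d][k] * h_long[k] for k in range(len(h_long)))
        pre += sum(w_short[d][k] * h_short[k] for k in range(len(h_short)))
        gate = 1.0 / (1.0 + math.exp(-pre))
        fused.append(gate * h_long[d] + (1.0 - gate) * h_short[d])
    return fused

# Hypothetical 3-dimensional session representations.
h_long = [0.8, -0.2, 0.5]    # historical sessions (long-term preference)
h_short = [0.1, 0.4, -0.3]   # current session (short-term preference)
zero = [[0.0] * 3 for _ in range(3)]

# With zero weights and bias the gate is 0.5, so the result is the plain average.
fused_even = gated_fusion(h_long, h_short, zero, zero, [0.0] * 3)

# A large positive bias pushes the gate towards 1, so the long-term vector dominates.
fused_long = gated_fusion(h_long, h_short, zero, zero, [20.0] * 3)
```

Because the gate lies strictly between 0 and 1, each fused component stays between the corresponding long-term and short-term values.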

3

Duy Nguyen, Viet Thanh, and Truong Son Hy. "Multimodal pretraining for unsupervised protein representation learning." Biology Methods and Protocols, June 18, 2024. http://dx.doi.org/10.1093/biomethods/bpae043.

Abstract:

Proteins are complex biomolecules essential for numerous biological processes, making them crucial targets for advancements in molecular biology, medical research, and drug design. Understanding their intricate, hierarchical structures and functions is vital for progress in these fields. To capture this complexity, we introduce MPRL (Multimodal Protein Representation Learning), a novel framework for symmetry-preserving multimodal pretraining that learns unified, unsupervised protein representations by integrating primary and tertiary structures. MPRL employs Evolutionary Scale Modeling (ESM-2) for sequence analysis, Variational Graph Auto-Encoders (VGAE) for residue-level graphs, and PointNet Autoencoder (PAE) for 3D point clouds of atoms, each designed to capture the spatial and evolutionary intricacies of proteins while preserving critical symmetries. By leveraging Auto-Fusion to synthesize joint representations from these pretrained models, MPRL ensures robust and comprehensive protein representations. Our extensive evaluation demonstrates that MPRL significantly enhances performance in various tasks such as protein-ligand binding affinity prediction, protein fold classification, enzyme activity identification, and mutation stability prediction. This framework advances the understanding of protein dynamics and facilitates future research in the field. Our source code is publicly available at https://github.com/HySonLab/Protein_Pretrain.

4

Yuan, Wei, Shiyu Zhao, Li Wang, Lijia Cai, and Yong Zhang. "Online course evaluation model based on graph auto-encoder." Intelligent Data Analysis, March 21, 2024, 1–23. http://dx.doi.org/10.3233/ida-230557.

Abstract:

In the post-epidemic era, online learning has gained increasing attention due to the advancements in information and big data technology, leading to large-scale online course data with various student behaviors. Online data mining has become a popular and important way of extracting valuable insights from large amounts of data. However, previous online course analysis methods often focused on individual aspects of the data and neglected the correlation among the large-scale learning behavior data, which can lead to an incomplete understanding of the overall learning behavior and patterns within the online course. To address these problems, this paper proposes an online course evaluation model based on a graph auto-encoder. In our method, the features of collected online course data are used to construct K-Nearest Neighbor (KNN) graphs to represent the association among the courses. Then the variational graph auto-encoder (VGAE) is introduced to learn the useful implicit features. Finally, we feed the learned implicit features into unsupervised and semi-supervised downstream tasks for online course evaluation, respectively. We conduct experiments on two datasets. In the clustering task, our method showed a more than tenfold increase in the Calinski-Harabasz index compared to unoptimized features, demonstrating significant structural distinction and group coherence. In the classification task, compared to traditional methods, our model exhibited an overall performance improvement of about 10%, indicating its effectiveness in handling complex network data.
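The KNN graph construction mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and a symmetrized adjacency matrix. The abstract does not state the distance metric, the value of k, or the features used, so the function name and the toy course features are hypothetical.

```python
import math

def knn_graph(features, k):
    """Return a symmetric 0/1 adjacency matrix linking each row to its k nearest rows."""
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Sort all other rows by Euclidean distance to row i.
        dists = sorted((math.dist(features[i], features[j]), j) for j in range(n) if j != i)
        for _, j in dists[:k]:
            adj[i][j] = 1
            adj[j][i] = 1  # symmetrize so the graph is undirected
    return adj

# Hypothetical 2-D course features forming two well-separated groups of three courses.
courses = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.2],
           [5.0, 5.0], [5.1, 5.0], [5.0, 5.2]]
adj = knn_graph(courses, k=2)
```

With k=2 each course links only to the two other courses in its own group, so the adjacency matrix splits into two triangles. A matrix of this kind is what a graph auto-encoder would take as its structural input.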

Dissertations on the topic "Auto-encodeur variationnel à graphes (VGAE)":

1

Belhadj, Djedjiga. "Multi-GAT semi-supervisé pour l’extraction d’informations et son adaptation au chiffrement homomorphe." Electronic Thesis or Diss., Université de Lorraine, 2024. http://www.theses.fr/2024LORR0023.

Abstract:

This thesis was carried out as part of the BPI DeepTech project, in collaboration with the company Fair&Smart, which focuses primarily on protecting personal data in accordance with the General Data Protection Regulation (GDPR). In this context, we proposed a deep neural model for extracting information from semi-structured administrative documents (SSDs). Due to the lack of public training datasets, we proposed an artificial generator of SSDs that can generate several classes of documents with a wide variation in content and layout. Documents are generated using random variables to manage content and layout, while respecting constraints aimed at ensuring their similarity to real documents. Metrics were introduced to evaluate the content and layout diversity of the generated SSDs. The evaluation showed that the datasets generated for three SSD types (payslips, receipts and invoices) present a high level of diversity, which makes it possible to avoid overfitting when training information extraction systems. Based on the specific format of SSDs, which consists of word pairs (keyword, information) located in spatially close neighborhoods, the document is modeled as a graph in which nodes represent words and edges represent neighborhood relations. The graph is fed into a multi-layer graph attention network (Multi-GAT). The latter applies the multi-head attention mechanism to learn the importance of each word's neighbors in order to better classify it. A first version of this model was used in supervised mode and obtained an F1 score of 96% on two generated invoice and payslip datasets, and 89% on a real receipt dataset (SROIE). We then enriched the Multi-GAT with a multimodal embedding of word-level information (textual, visual and positional components) and combined it with a variational graph auto-encoder (VGAE). This model operates in semi-supervised mode and can learn from labeled and unlabeled data simultaneously. To further optimize graph node classification, we proposed a semi-VGAE whose encoder shares its first layers with the Multi-GAT classifier. This optimization is further reinforced by a VGAE loss function managed by the classification loss. Using a small unlabeled dataset, we were able to improve the F1 score obtained on a generated invoice dataset by over 3%. Because the model is intended to operate in a protected environment, we adapted its architecture for homomorphic encryption. We studied a method of dimensionality reduction of the Multi-GAT model. We then proposed a polynomial approximation approach for the non-linear functions in the model. To reduce the dimensionality of the model, we proposed a multimodal feature fusion method that requires few additional parameters and reduces the dimensions of the model while improving its performance. For the encryption adaptation, we studied low-degree polynomial approximations of non-linear functions, using knowledge distillation and fine-tuning techniques to better adapt the model to the new approximations. We were able to minimize the approximation loss by around 3% on two invoice datasets as well as one payslip dataset and by 5% on SROIE.
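Homomorphic encryption schemes evaluate additions and multiplications, which is why non-linear functions are replaced by low-degree polynomials. The sketch below shows the general idea with a least-squares cubic fit to the sigmoid on [-4, 4]. The choice of sigmoid, the degree, the interval, and the fitting method are assumptions made for illustration, since the abstract does not say which functions or approximation procedure the thesis used.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_polynomial(fn, degree, lo, hi, samples=201):
    """Least-squares coefficients c[0] + c[1] x + ... on an even grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    ys = [fn(x) for x in xs]
    # Normal equations (V^T V) c = V^T y, where V is the Vandermonde matrix of xs.
    gram = [[sum(x ** (p + q) for x in xs) for q in range(degree + 1)]
            for p in range(degree + 1)]
    rhs = [sum((x ** p) * y for x, y in zip(xs, ys)) for p in range(degree + 1)]
    return solve_linear(gram, rhs), xs, ys

def evaluate(coeffs, x):
    """Horner evaluation, which uses only additions and multiplications."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def mean_squared_error(coeffs, xs, ys):
    return sum((evaluate(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

cubic, xs, ys = fit_polynomial(sigmoid, 3, -4.0, 4.0)
linear, _, _ = fit_polynomial(sigmoid, 1, -4.0, 4.0)
mse_cubic = mean_squared_error(cubic, xs, ys)
mse_linear = mean_squared_error(linear, xs, ys)
```

Comparing `mse_cubic` with `mse_linear` shows the trade-off behind the low-degree constraint: a higher degree fits better but costs more encrypted multiplications. That trade-off is consistent with the abstract's use of distillation and fine-tuning to recover accuracy after the approximation.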