Academic literature on the topic 'Protein Representation Learning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Protein Representation Learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Protein Representation Learning"

1

Kim, Paul T., Robin Winter, and Djork-Arné Clevert. "Unsupervised Representation Learning for Proteochemometric Modeling." International Journal of Molecular Sciences 22, no. 23 (November 28, 2021): 12882. http://dx.doi.org/10.3390/ijms222312882.

Abstract:
In silico protein–ligand binding prediction is an ongoing area of research in computational chemistry and machine learning based drug discovery, as an accurate predictive model could greatly reduce the time and resources necessary for the detection and prioritization of possible drug candidates. Proteochemometric modeling (PCM) attempts to create an accurate model of the protein–ligand interaction space by combining explicit protein and ligand descriptors. This requires the creation of information-rich, uniform and computer interpretable representations of proteins and ligands. Previous studies in PCM modeling rely on pre-defined, handcrafted feature extraction methods, and many methods use protein descriptors that require alignment or are otherwise specific to a particular group of related proteins. However, recent advances in representation learning have shown that unsupervised machine learning can be used to generate embeddings that outperform complex, human-engineered representations. Several different embedding methods for proteins and molecules have been developed based on various language-modeling methods. Here, we demonstrate the utility of these unsupervised representations and compare three protein embeddings and two compound embeddings in a fair manner. We evaluate performance on various splits of a benchmark dataset, as well as on an internal dataset of protein–ligand binding activities and find that unsupervised-learned representations significantly outperform handcrafted representations.
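As a concrete illustration of the PCM setup described in this abstract, the following sketch concatenates precomputed protein and ligand embedding vectors into a joint descriptor and trains an off-the-shelf regressor on binding values. All data are random placeholders and the random-forest model is an arbitrary choice for illustration, not the authors' pipeline or benchmark.

```python
# Illustrative sketch: proteochemometric modeling with precomputed embeddings.
# Protein and ligand embedding vectors (random placeholders here) are
# concatenated and fed to a standard regressor for binding-affinity prediction.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_pairs = 500
protein_emb = rng.normal(size=(n_pairs, 128))   # e.g. from a protein language model
ligand_emb = rng.normal(size=(n_pairs, 64))     # e.g. from a SMILES autoencoder
affinity = rng.normal(size=n_pairs)             # stand-in for experimental binding values

X = np.hstack([protein_emb, ligand_emb])        # joint protein-ligand descriptor
X_tr, X_te, y_tr, y_te = train_test_split(X, affinity, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out pairs:", r2_score(y_te, model.predict(X_te)))
```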
2

Heinzinger, Michael, Christian Dallago, and Burkhard Rost. "Protein matchmaking through representation learning." Cell Systems 12, no. 10 (October 2021): 948–50. http://dx.doi.org/10.1016/j.cels.2021.09.007.

3

Fasoulis, Romanos, Georgios Paliouras, and Lydia E. Kavraki. "Graph representation learning for structural proteomics." Emerging Topics in Life Sciences 5, no. 6 (October 19, 2021): 789–802. http://dx.doi.org/10.1042/etls20210225.

Abstract:
The field of structural proteomics, which is focused on studying the structure–function relationship of proteins and protein complexes, is experiencing rapid growth. Since the early 2000s, structural databases such as the Protein Data Bank have been storing increasing amounts of protein structural data, in addition to modeled structures becoming increasingly available. This, combined with the recent advances in graph-based machine-learning models, enables the use of protein structural data in predictive models, with the goal of creating tools that will advance our understanding of protein function. Similar to the application of graph learning tools to molecular graphs, an area currently undergoing rapid development, there is also an increasing trend toward using graph learning approaches on protein structures. In this short review paper, we survey studies that use graph learning techniques on proteins, and examine their successes and shortcomings, while also discussing future directions.
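A minimal sketch of the starting point for such graph-based models: a protein structure is converted into a residue graph by connecting residues whose coordinates fall within a distance cutoff. The coordinates, the 8 Å cutoff, and the one-hot node features below are illustrative placeholders, not any specific method from the review.

```python
# Illustrative sketch: turning a protein structure into a residue graph.
# Residue coordinates are random placeholders for, e.g., C-alpha positions;
# an edge is drawn between residues closer than a distance cutoff.
import numpy as np

rng = np.random.default_rng(0)
n_residues = 50
coords = rng.uniform(0, 30, size=(n_residues, 3))   # stand-in for C-alpha coordinates (angstroms)

cutoff = 8.0
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
adjacency = (dists < cutoff) & ~np.eye(n_residues, dtype=bool)

# Node features (e.g. one-hot amino-acid type) plus the adjacency matrix are the
# typical inputs to a graph neural network layer.
node_features = np.eye(20)[rng.integers(0, 20, size=n_residues)]
print("edges:", int(adjacency.sum() // 2), "feature matrix:", node_features.shape)
```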
4

Rives, Alexander, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, et al. "Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences." Proceedings of the National Academy of Sciences 118, no. 15 (April 5, 2021): e2016239118. http://dx.doi.org/10.1073/pnas.2016239118.

Abstract:
In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and statistical generation. In the life sciences, the anticipated growth of sequencing promises unprecedented data on natural sequence diversity. Protein language modeling at the scale of evolution is a logical step toward predictive and generative artificial intelligence for biology. To this end, we use unsupervised learning to train a deep contextual language model on 86 billion amino acids across 250 million protein sequences spanning evolutionary diversity. The resulting model contains information about biological properties in its representations. The representations are learned from sequence data alone. The learned representation space has a multiscale organization reflecting structure from the level of biochemical properties of amino acids to remote homology of proteins. Information about secondary and tertiary structure is encoded in the representations and can be identified by linear projections. Representation learning produces features that generalize across a range of applications, enabling state-of-the-art supervised prediction of mutational effect and secondary structure and improving state-of-the-art features for long-range contact prediction.
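A minimal sketch of the probing idea mentioned above ("identified by linear projections"): a linear classifier is fit on per-residue embedding vectors to predict secondary-structure classes. The embeddings and labels below are random placeholders rather than output of the actual protein language model.

```python
# Illustrative sketch: probing per-residue embeddings with a linear model.
# Embeddings and secondary-structure labels are random placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_residues, emb_dim = 2000, 256
residue_emb = rng.normal(size=(n_residues, emb_dim))   # stand-in for language-model features
ss_labels = rng.integers(0, 3, size=n_residues)        # helix / strand / coil

X_tr, X_te, y_tr, y_te = train_test_split(residue_emb, ss_labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("linear-probe accuracy:", probe.score(X_te, y_te))
```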
5

Warikoo, Neha, Yung-Chun Chang, and Shang-Pin Ma. "Gradient Boosting over Linguistic-Pattern-Structured Trees for Learning Protein–Protein Interaction in the Biomedical Literature." Applied Sciences 12, no. 20 (October 11, 2022): 10199. http://dx.doi.org/10.3390/app122010199.

Abstract:
Protein-based studies contribute significantly to gathering functional information about biological systems; therefore, the protein–protein interaction detection task is one of the most researched topics in the biomedical literature. To this end, many state-of-the-art systems using syntactic tree kernels (TK) and deep learning have been developed. However, these models are computationally complex and have limited learning interpretability. In this paper, we introduce a linguistic-pattern-representation-based Gradient-Tree Boosting model, i.e., LpGBoost. It uses linguistic patterns to optimize and generate semantically relevant representation vectors for learning over the gradient-tree boosting. The patterns are learned via unsupervised modeling by clustering invariant semantic features. These linguistic representations are semi-interpretable with rich semantic knowledge, and owing to their shallow representation, they are also computationally less expensive. Our experiments with six protein–protein interaction (PPI) corpora demonstrate that LpGBoost outperforms the SOTA tree-kernel models, as well as the CNN-based interaction detection studies for BioInfer and AIMed corpora.
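A minimal sketch of the general approach of training gradient-boosted trees on shallow, pattern-derived feature vectors for sentence-level interaction classification; the features and labels are random placeholders, and this is not the LpGBoost pipeline itself.

```python
# Illustrative sketch: gradient-tree boosting over shallow pattern-based
# feature vectors for protein-protein interaction sentence classification.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_sentences, n_pattern_features = 400, 50
X = rng.normal(size=(n_sentences, n_pattern_features))   # stand-in for linguistic-pattern vectors
y = rng.integers(0, 2, size=n_sentences)                 # interaction vs. no interaction

clf = GradientBoostingClassifier(random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```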
6

Chornozhuk, S. "The New Geometric “State-Action” Space Representation for Q-Learning Algorithm for Protein Structure Folding Problem." Cybernetics and Computer Technologies, no. 3 (October 27, 2020): 59–73. http://dx.doi.org/10.34229/2707-451x.20.3.6.

Abstract:
Introduction. Spatial protein structure folding is an important and timely problem in computational biology. Considering the mathematical model of the task, it can easily be concluded that finding an optimal protein conformation in a three-dimensional grid is an NP-hard problem. Therefore, reinforcement learning techniques such as the Q-learning approach can be used to solve the problem. The article proposes a new geometric “state-action” space representation which significantly differs from all alternative representations used for this problem. The purpose of the article is to analyze existing state and action space representations for the Q-learning algorithm applied to the protein structure folding problem, reveal their advantages and disadvantages, and propose the new geometric “state-action” space representation. A further goal is to compare the existing and proposed approaches, draw conclusions, and describe possible steps of further research. Result. The proposed algorithm is compared with others on the basis of 10 known chains of length 48 first proposed in [16]. For each of the chains, the Q-learning algorithm with the proposed representation outperformed the same Q-learning algorithm with the alternative existing representations in terms of both the average and minimal energy values of the resulting conformations. Moreover, many of the existing representations were designed for 2D protein structure prediction; during the experiments, both the existing and proposed representations were slightly adapted to solve the problem in 3D, which is a more computationally demanding task. Conclusion. The quality of the Q-learning algorithm with the proposed geometric “state-action” space representation has been experimentally confirmed, showing that further research is promising. Several possible directions of future research, such as combining the proposed approach with deep learning techniques, have already been suggested.
Keywords: Spatial protein structure, combinatorial optimization, relative coding, machine learning, Q-learning, Bellman equation, state space, action space, basis in 3D space.
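For reference, the tabular Q-learning update (Bellman equation) that such lattice-folding formulations rely on looks as follows; the toy environment here merely stands in for a real geometric "state-action" space and an energy-based reward.

```python
# Illustrative sketch: tabular Q-learning with an epsilon-greedy policy.
# The transition function is a trivial stand-in, not the article's lattice model.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 100, 6        # e.g. 6 relative moves on a 3D cubic lattice
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def step(state, action):
    """Toy transition: random next state and a reward standing in for an energy change."""
    return rng.integers(n_states), rng.normal()

state = 0
for _ in range(10_000):
    action = rng.integers(n_actions) if rng.random() < epsilon else int(Q[state].argmax())
    next_state, reward = step(state, action)
    # Bellman update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print("learned Q-table shape:", Q.shape)
```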
7

Yao, Yu, Xiuquan Du, Yanyu Diao, and Huaixu Zhu. "An integration of deep learning with feature embedding for protein–protein interaction prediction." PeerJ 7 (June 17, 2019): e7126. http://dx.doi.org/10.7717/peerj.7126.

Abstract:
Protein–protein interactions are closely related to protein function and drug discovery. Hence, accurately identifying protein–protein interactions will help us to understand the underlying molecular mechanisms and significantly facilitate drug discovery. However, the majority of existing computational methods for protein–protein interaction prediction focus on feature extraction and combination, and there have been limited gains from the state-of-the-art models. In this work, a new residue representation method named Res2vec is designed for protein sequence representation. Residue representations obtained by Res2vec describe residue–residue interactions from raw sequences more precisely and supply more effective inputs for the downstream deep learning model. Combining effective feature embedding with powerful deep learning techniques, our method provides a general computational pipeline to infer protein–protein interactions, even when protein structure knowledge is entirely unknown. The proposed method DeepFE-PPI is evaluated on the S. cerevisiae and human datasets. The experimental results show that DeepFE-PPI achieves 94.78% (accuracy), 92.99% (recall), 96.45% (precision), 89.62% (Matthews correlation coefficient, MCC) and 98.71% (accuracy), 98.54% (recall), 98.77% (precision), 97.43% (MCC) on these two datasets, respectively. In addition, we also evaluate the performance of DeepFE-PPI on five independent species datasets, and all the results are superior to those of existing methods. The comparisons show that DeepFE-PPI is capable of predicting protein–protein interactions at an acceptable level of accuracy using a novel residue representation method and a deep learning classification framework. The code, along with instructions to reproduce this work, is available from https://github.com/xal2019/DeepFE-PPI.
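A rough, illustrative analogue of the Res2vec idea: residues are treated as words and a word2vec-style model is trained on sequences to obtain residue embeddings, which can then be pooled or concatenated as input to a downstream classifier. This assumes the gensim package and toy sequences; it is not the authors' exact procedure.

```python
# Illustrative sketch: word2vec-style residue embeddings from raw sequences.
# Requires gensim; the sequences are toy examples.
from gensim.models import Word2Vec

sequences = ["MKTAYIAKQR", "GAVLIMCFYW", "MSTNPKPQRK"]   # toy protein sequences
residue_sentences = [list(seq) for seq in sequences]    # each residue treated as a "word"

model = Word2Vec(residue_sentences, vector_size=20, window=3, min_count=1, sg=1, seed=0)
print("embedding for residue 'A':", model.wv["A"][:5])

# Per-protein input for a downstream model: stack or pool the residue vectors.
protein_matrix = [model.wv[res] for res in sequences[0]]
print("residues encoded for first protein:", len(protein_matrix))
```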
8

Garruss, Alexander S., Katherine M. Collins, and George M. Church. "Deep representation learning improves prediction of LacI-mediated transcriptional repression." Proceedings of the National Academy of Sciences 118, no. 27 (June 29, 2021): e2022838118. http://dx.doi.org/10.1073/pnas.2022838118.

Abstract:
Recent progress in DNA synthesis and sequencing technology has enabled systematic studies of protein function at a massive scale. We explore a deep mutational scanning study that measured the transcriptional repression function of 43,669 variants of the Escherichia coli LacI protein. We analyze structural and evolutionary aspects that relate to how the function of this protein is maintained, including an in-depth look at the C-terminal domain. We develop a deep neural network to predict transcriptional repression mediated by the lac repressor of Escherichia coli using experimental measurements of variant function. When measured across 10 separate training and validation splits using 5,009 single mutations of the lac repressor, our best-performing model achieved a median Pearson correlation of 0.79, exceeding any previous model. We demonstrate that deep representation learning approaches, first trained in an unsupervised manner across millions of diverse proteins, can be fine-tuned in a supervised fashion using lac repressor experimental datasets to more effectively predict a variant’s effect on repression. These findings suggest a deep representation learning model may improve the prediction of other important properties of proteins.
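The evaluation protocol described above (median Pearson correlation over repeated train/validation splits) can be sketched as follows, with synthetic variant representations and repression scores standing in for the real data and model.

```python
# Illustrative sketch: median Pearson correlation over repeated splits for a
# variant-effect regressor. All data are synthetic placeholders.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import ShuffleSplit

rng = np.random.default_rng(0)
n_variants, emb_dim = 1000, 64
X = rng.normal(size=(n_variants, emb_dim))                      # stand-in for learned variant representations
y = X[:, 0] * 0.5 + rng.normal(scale=0.5, size=n_variants)      # synthetic repression scores

correlations = []
for train_idx, val_idx in ShuffleSplit(n_splits=10, test_size=0.2, random_state=0).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])
    r, _ = pearsonr(y[val_idx], model.predict(X[val_idx]))
    correlations.append(r)

print("median Pearson r over 10 splits:", np.median(correlations))
```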
9

Rahman, Julia, Nazrul Islam Mondal, Khaled Ben Islam, and Al Mehedi Hasan. "Feature Fusion Based SVM Classifier for Protein Subcellular Localization Prediction." Journal of Integrative Bioinformatics 13, no. 1 (March 1, 2016): 23–33. http://dx.doi.org/10.1515/jib-2016-288.

Abstract:
Owing to the importance of protein subcellular localization in different branches of life science and drug discovery, researchers have focused their attention on protein subcellular localization prediction. Effective representation of features from protein sequences plays the most vital role in protein subcellular localization prediction, especially in the case of machine learning techniques. Single feature representations such as pseudo amino acid composition (PseAAC), the physicochemical property model (PPM), and amino acid index distribution (AAID) contain insufficient information from protein sequences. To deal with this problem, we have proposed two feature fusion representations, AAIDPAAC and PPMPAAC, to work with a Support Vector Machine classifier, which fuse PseAAC with AAID and PPM, respectively. We have evaluated performance for both single and fused feature representations on a Gram-negative bacterial dataset. We obtained at least 3% higher actual accuracy with AAIDPAAC and 2% higher locative accuracy with PPMPAAC than with single feature representations.
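A minimal sketch of feature fusion by concatenation followed by an SVM classifier, in the spirit of AAIDPAAC/PPMPAAC; the feature matrices and localization labels are random placeholders rather than the Gram-negative bacterial dataset.

```python
# Illustrative sketch: fusing two feature representations by concatenation and
# training an SVM for subcellular localization. All values are placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_proteins = 300
pseaac = rng.normal(size=(n_proteins, 50))      # pseudo amino acid composition features
physchem = rng.normal(size=(n_proteins, 30))    # physicochemical / AAID-style features
labels = rng.integers(0, 4, size=n_proteins)    # subcellular localization classes

fused = np.hstack([pseaac, physchem])           # feature-fusion representation
print("fused CV accuracy:", cross_val_score(SVC(kernel="rbf"), fused, labels, cv=5).mean())
```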
10

Jin, Chen, Zhuangwei Shi, Chuanze Kang, Ken Lin, and Han Zhang. "TLCrys: Transfer Learning Based Method for Protein Crystallization Prediction." International Journal of Molecular Sciences 23, no. 2 (January 16, 2022): 972. http://dx.doi.org/10.3390/ijms23020972.

Abstract:
The X-ray diffraction technique is one of the most common methods of ascertaining protein structures, yet only 2–10% of proteins can produce diffraction-quality crystals. Several computational methods have been proposed so far to predict protein crystallization. Nevertheless, the current state-of-the-art computational methods are limited by the scarcity of experimental data, so the prediction accuracy of existing models has not reached the ideal level. To address these problems, we propose a novel transfer-learning-based framework for protein crystallization prediction, named TLCrys. The framework proceeds in two steps: pre-training and fine-tuning. The pre-training step adopts an attention mechanism to extract both global and local information from the protein sequences. The representation learned in the pre-training step is regarded as knowledge to be transferred and fine-tuned to enhance the performance of crystallization prediction. During pre-training, TLCrys adopts a multi-task learning method, which not only improves the learning ability of protein encoding but also enhances the robustness and generalization of the protein representation. The multi-head self-attention layer guarantees that different levels of the protein representation can be extracted in the fine-tuning step. During transfer learning, the fine-tuning strategy used by TLCrys improves the task-specialized learning ability of the network. Our method significantly outperforms all previous predictors in the prediction of five crystallization stages. Furthermore, the proposed methodology generalizes well to other protein sequence classification tasks.
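A minimal sketch of the generic pre-train/fine-tune pattern the abstract describes: a (notionally pretrained) sequence encoder is reused and a new classification head is trained on the downstream task. The tiny encoder, binary labels, and dimensions are illustrative assumptions, not the TLCrys architecture.

```python
# Illustrative sketch of the pre-train / fine-tune pattern in PyTorch.
# The encoder here is a tiny stand-in for a pretrained sequence encoder.
import torch
import torch.nn as nn

vocab_size, emb_dim, n_classes = 25, 32, 2    # ~20 amino acids + special tokens; binary label as a stand-in

encoder = nn.Sequential(                      # pretend this was pretrained on a large corpus
    nn.Embedding(vocab_size, emb_dim),
    nn.TransformerEncoderLayer(d_model=emb_dim, nhead=4, batch_first=True),
)
head = nn.Linear(emb_dim, n_classes)          # new task-specific head

# Fine-tuning: update both encoder and head (or freeze the encoder if desired).
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
tokens = torch.randint(0, vocab_size, (8, 100))     # a batch of toy tokenized sequences
labels = torch.randint(0, n_classes, (8,))

hidden = encoder(tokens)                      # (batch, length, emb_dim)
logits = head(hidden.mean(dim=1))             # mean-pool over residues
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()
optimizer.step()
print("fine-tuning loss:", float(loss))
```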

Dissertations / Theses on the topic "Protein Representation Learning"

1

Tubiana, Jérôme. "Restricted Boltzmann machines : from compositional representations to protein sequence analysis." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE039/document.

Abstract:
Restricted Boltzmann machines (RBM) are graphical models that jointly learn a probability distribution and a representation of data. Despite their simple architecture, they can learn complex data distributions very well, such as the MNIST handwritten digit database. Moreover, they are empirically known to learn compositional representations of data, i.e. representations that effectively decompose configurations into their constitutive parts. However, not all variants of RBM perform equally well, and few theoretical arguments exist for these empirical observations. In the first part of this thesis, we ask how such a simple model can learn such complex probability distributions and representations. By analyzing an ensemble of RBM with random weights using the replica method, we have characterised a compositional regime for RBM, and shown under which conditions (statistics of the weights, choice of transfer function) it can and cannot arise. Both the qualitative and quantitative predictions obtained with our theoretical analysis are in agreement with observations from RBM trained on real data. In the second part, we present an application of RBM to protein sequence analysis and design. Owing to their large size, it is very difficult to run physical simulations of proteins, and hence to predict their structure and function. It is however possible to infer information about a protein's structure from the way its sequence varies across organisms. For instance, Boltzmann Machines can leverage correlations of mutations to predict the spatial proximity of a sequence's amino acids. In the same spirit, we have shown on several synthetic and real protein families that, provided a compositional regime is enforced, RBM can go beyond structure and extract extended motifs of coevolving amino acids that reflect the phylogenetic, structural, and functional constraints within proteins. Moreover, RBM can be used to design new protein sequences with putative functional properties by recombining these motifs at will. Lastly, we have designed new training algorithms and model parametrizations that significantly improve RBM generative performance, to the point where they can compete with state-of-the-art generative models such as Generative Adversarial Networks or Variational Autoencoders on medium-scale data.
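As an illustration of the basic workflow, the sketch below fits a Restricted Boltzmann Machine to one-hot encoded sequences and reads out its hidden-unit activations as a learned representation. scikit-learn's BernoulliRBM and the random sequences stand in for the custom RBM variants and protein families studied in the thesis.

```python
# Illustrative sketch: an RBM on one-hot encoded sequences; the hidden-unit
# activations serve as the learned representation. Sequences are placeholders.
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
n_sequences, seq_len, n_amino_acids = 200, 30, 20
seqs = rng.integers(0, n_amino_acids, size=(n_sequences, seq_len))
one_hot = np.eye(n_amino_acids)[seqs].reshape(n_sequences, -1)   # flatten to binary visible units

rbm = BernoulliRBM(n_components=50, learning_rate=0.05, n_iter=20, random_state=0)
hidden = rbm.fit_transform(one_hot)     # hidden-unit activations = learned representation
print("representation shape:", hidden.shape)
```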
2

Irby, Stefan M. "Evaluation of a Novel Biochemistry Course-Based Undergraduate Research Experience (CURE)." Thesis, 2019.

Abstract:

Course-based Undergraduate Research Experiences (CUREs) have been described in a range of educational contexts. Although various learning objectives, termed anticipated learning outcomes (ALOs) in this project, have been proposed, processes for identifying them may not be rigorous or well-documented, which can lead to inappropriate assessment and speculation about what students actually learn from CUREs. Additionally, evaluation of CUREs has primarily relied on student and instructor perception data rather than more reliable measures of learning. This dissertation investigated a novel biochemistry laboratory curriculum for a Course-based Undergraduate Research Experience (CURE) known as the Biochemistry Authentic Scientific Inquiry Lab (BASIL). Students participating in this CURE use a combination of computational and biochemical wet-lab techniques to elucidate the function of proteins of known structure but unknown function. The goal of the project was to evaluate the efficacy of the BASIL CURE curriculum for developing students’ research abilities across implementations. Towards achieving this goal, we addressed the following four research questions (RQs): RQ1) How can ALOs be rigorously identified for the BASIL CURE; RQ2) How can the identified ALOs be used to develop a matrix that characterizes the BASIL CURE; RQ3) What are students’ perceptions of their knowledge, confidence and competence regarding their abilities to perform the top-rated ALOs for this CURE; RQ4) What are appropriate assessments for student achievement of the identified ALOs and what is the nature of student learning, and related difficulties, developed by students during the BASIL CURE? To address these RQs, this project focused on the development and use of qualitative and quantitative methods guided by constructivism and situated cognition theoretical frameworks. Data were collected using a range of instruments, including content analysis, Qualtrics surveys, open-ended questions and interviews, in order to identify ALOs and to determine student learning for the BASIL CURE. Analysis of the qualitative data was through inductive coding guided by the concept-reasoning-mode (CRM) model and the assessment triangle, while analysis of quantitative data was done using standard statistical techniques (e.g., conducting a paired t-test and computing effect sizes). The results led to the development of a novel method for identifying ALOs, namely a process for identifying course-based undergraduate research abilities (PICURA; RQ1; Irby, Pelaez, & Anderson, 2018b). Application of PICURA to the BASIL CURE resulted in the identification and rating by instructors of a wide range of ALOs, termed course-based undergraduate research abilities (CURAs), which were formulated into a matrix (RQ2; Irby, Pelaez, & Anderson, 2018a). The matrix was, in turn, used to characterize the BASIL CURE and to inform the design of student assessments aimed at evaluating student development of the identified CURAs (RQ4; Irby, Pelaez, & Anderson, 2018a). Preliminary findings from implementation of the open-ended assessments in a small case study of students revealed a range of student competencies for selected top-rated CURAs as well as evidence for student difficulties (RQ4). In this way we were able to confirm that students are developing some of the ALOs as actual learning outcomes, which we term verified learning outcomes (VLOs).
In addition, a participant perception indicator (PPI) survey was used to gauge students’ perceptions of their gains in knowledge, experience, and confidence during the BASIL CURE and, therefore, to inform which CURAs should be specifically targeted for assessment in specific BASIL implementations (RQ3). These results indicate that, across implementations of the CURE, students perceived significant gains with large effect sizes in their knowledge, experience, and confidence for items on the PPI survey (RQ3). In our view, the results of this dissertation will make important contributions to the CURE literature, as well as to the biochemistry education and assessment literature in general. More specifically, they will significantly improve understanding of the nature of student learning from CUREs and how to identify ALOs and design assessments that reveal what students actually learn from such CUREs, an area where there has been a dearth of available knowledge in the past. The outcomes of this dissertation could also help instructors and administrators identify and align assessments with the actual features of a CURE (or courses in general), use the identified CURAs to ensure the material fits departmental or university needs, and evaluate the benefits of students participating in these innovative curricula. Future research will focus on expanding the development and validation of assessments so that practitioners can better evaluate the efficacy of their CUREs for developing the research competencies of their undergraduate students and continue to render improvements to their curricula.


Books on the topic "Protein Representation Learning"

1

Lv, Zhibin, Hong Wenjing, and Xue Xu, eds. Feature Representation and Learning Methods With Applications in Protein Secondary Structure. Frontiers Media SA, 2021. http://dx.doi.org/10.3389/978-2-88971-555-8.

2

Faflik, David. Urban Formalism. Fordham University Press, 2020. http://dx.doi.org/10.5422/fordham/9780823288045.001.0001.

Abstract:
Urban Formalism radically reimagines what it meant to “read” a brave new urban world during the transformative middle decades of the nineteenth century. At a time when contemporaries in the twin capitals of modernity in the West, New York and Paris, were learning to make sense of unfamiliar surroundings, city peoples increasingly looked to the experiential patterns, or forms, from their everyday lives in an attempt to translate urban experience into something they could more easily comprehend. Urban Formalism interrogates both the risks and rewards of an interpretive practice that depended on the mutual relation between urbanism and formalism, at a moment when the subjective experience of the city had reached unprecedented levels of complexity. What did it mean to read a city sidewalk as if it were a literary form, like a poem? On what basis might the material form of a burning block of buildings be received as a pleasurable spectacle? How closely aligned were the ideology and choreography of the political form of a revolutionary street protest? And what were the implications of conceiving of the city’s exciting dynamism in the static visual form of a photographic composition? These are the questions that Urban Formalism asks and begins to answer, with the aim of proposing a revisionist semantics of the city. This book not only provides an original cultural history of forms. It posits a new form of urban history, comprised of the representative rituals of interpretation that have helped give meaningful shape to metropolitan life.

Book chapters on the topic "Protein Representation Learning"

1

Dawn, Sucheta, and Monidipa Das. "Graph Representation Learning for Protein Classification." In Artificial Intelligence Technologies for Computational Biology, 1–28. Boca Raton: CRC Press, 2022. http://dx.doi.org/10.1201/9781003246688-1.

2

Rahman, Taseef, Yuanqi Du, and Amarda Shehu. "Graph Representation Learning for Protein Conformation Sampling." In Computational Advances in Bio and Medical Sciences, 16–28. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-17531-2_2.

3

Quadrini, Michela, Sebastian Daberdaku, and Carlo Ferrari. "Hierarchical Representation and Graph Convolutional Networks for the Prediction of Protein–Protein Interaction Sites." In Machine Learning, Optimization, and Data Science, 409–20. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-64580-9_34.

4

Zhou, Peixuan, Yijia Zhang, Fei Chen, Kuo Pang, and Mingyu Lu. "Heterogeneous PPI Network Representation Learning for Protein Complex Identification." In Bioinformatics Research and Applications, 217–28. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-23198-8_20.

5

Cantoni, Virginio, Alessio Ferone, Ozlem Ozbudak, and Alfredo Petrosino. "Protein Structural Blocks Representation and Search through Unsupervised NN." In Artificial Neural Networks and Machine Learning – ICANN 2012, 515–22. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33266-1_64.

6

Ma, Wenzheng, Wenzheng Bao, Yi Cao, Bin Yang, and Yuehui Chen. "Prediction of Protein-Protein Interaction Based on Deep Learning Feature Representation and Random Forest." In Intelligent Computing Theories and Application, 654–62. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-84532-2_59.

7

Haj Mohamed, Hela, Samir Belaid, and Wady Naanaa. "RingNet: Geometric Deep Representation Learning for 3D Multi-domain Protein Shape Retrieval." In Computational Collective Intelligence, 135–47. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-16014-1_12.

8

Nanni, Luca. "Computational Inference of DNA Folding Principles: From Data Management to Machine Learning." In Special Topics in Information Technology, 79–88. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-85918-3_7.

Abstract:
DNA is the molecular basis of life and would total about three meters if linearly untangled. To fit in the cell nucleus at the micrometer scale, DNA therefore has to fold itself into several layers of hierarchical structures, which are thought to be associated with functional compartmentalization of genomic features like genes and their regulatory elements. For this reason, understanding the mechanisms of genome folding is a major biological research problem. Studying chromatin conformation requires high computational resources and complex data analysis pipelines. In this chapter, we first present the PyGMQL software for interactive and scalable exploration of genomic data. PyGMQL allows the user to inspect genomic datasets and design complex analysis pipelines. The software presents itself as an easy-to-use Python library and interacts seamlessly with other data analysis packages. We then use the software for the study of chromatin conformation data. We focus on the epigenetic determinants of Topologically Associating Domains (TADs), which are regions of high chromatin self-interaction. The results of this study highlight the existence of a “grammar of genome folding” that dictates the formation of TADs and their boundaries and is based on the CTCF insulator protein. Finally, we focus on the relationship between chromatin conformation and gene expression, designing a graph representation learning model for the prediction of gene co-expression from gene topological features obtained from chromatin conformation data. We demonstrate a correlation between chromatin topology and co-expression, shedding new light on this debated topic and providing a novel computational framework for the study of co-expression networks.
9

Orhobor, Oghenejokpeme I., Joseph French, Larisa N. Soldatova, and Ross D. King. "Generating Explainable and Effective Data Descriptors Using Relational Learning: Application to Cancer Biology." In Discovery Science, 374–85. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-61527-7_25.

Abstract:
The key to success in machine learning is the use of effective data representations. The success of deep neural networks (DNNs) is based on their ability to utilize multiple neural network layers, and big data, to learn how to convert simple input representations into richer internal representations that are effective for learning. However, these internal representations are sub-symbolic and difficult to explain. In many scientific problems explainable models are required, and the input data is semantically complex and unsuitable for DNNs. This is true in the fundamental problem of understanding the mechanism of cancer drugs, which requires complex background knowledge about the functions of genes/proteins, their cells, and the molecular structure of the drugs. This background knowledge cannot be compactly expressed propositionally, and requires at least the expressive power of Datalog. Here we demonstrate the use of relational learning to generate new data descriptors in such semantically complex background knowledge. These new descriptors are effective: adding them to standard propositional learning methods significantly improves prediction accuracy. They are also explainable, and add to our understanding of cancer. Our approach can readily be expanded to include other complex forms of background knowledge, and combines the generality of relational learning with the efficiency of standard propositional learning.
10

Feinstein, Joseph, Wentao Shi, J. Ramanujam, and Michal Brylinski. "Bionoi: A Voronoi Diagram-Based Representation of Ligand-Binding Sites in Proteins for Machine Learning Applications." In Methods in Molecular Biology, 299–312. New York, NY: Springer US, 2021. http://dx.doi.org/10.1007/978-1-0716-1209-5_17.


Conference papers on the topic "Protein Representation Learning"

1

Zhang, Da, and Mansur R. Kabuka. "Multimodal Deep Representation Learning for Protein-Protein Interaction Networks." In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2018. http://dx.doi.org/10.1109/bibm.2018.8621366.

2

Xia, Tian, Bo Hui, and Wei-Shinn Ku. "APIP: Attention-based Protein Representation Learning for Protein-Ligand Interface Prediction." In 2022 IEEE International Conference on Big Data (Big Data). IEEE, 2022. http://dx.doi.org/10.1109/bigdata55660.2022.10020490.

3

Xia, Tian, and Wei-Shinn Ku. "Geometric Graph Representation Learning on Protein Structure Prediction." In KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3447548.3467323.

4

Xu, Bo, Kun Li, Xiaoxia Liu, Delong Liu, Yijia Zhang, Hongfei Lin, Zhihao Yang, Jian Wang, and Feng Xia. "Protein Complexes Detection Based on Global Network Representation Learning." In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2018. http://dx.doi.org/10.1109/bibm.2018.8621541.

5

Quan, Zhe, Yan Guo, Xuan Lin, Zhi-Jie Wang, and Xiangxiang Zeng. "GraphCPI: Graph Neural Representation Learning for Compound-Protein Interaction." In 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2019. http://dx.doi.org/10.1109/bibm47256.2019.8983267.

6

Zhou, Peixuan, Yijia Zhang, Fei Chen, Mingyu Lu, Wen Qu, Hongfei Lin, and Xiaoxia Liu. "Contrastive Self-Supervised Representation Learning for Protein Complexes Identification." In 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2022. http://dx.doi.org/10.1109/bibm55620.2022.9995094.

7

Zhijian, Lyu, Jiang Shaohua, Liang Yigao, and Gao Min. "GDGRU-DTA: Predicting Drug-Target Binding Affinity based on GNN and Double GRU." In 3rd International Conference on Data Mining and Machine Learning (DMML 2022). Academy and Industry Research Collaboration Center (AIRCC), 2022. http://dx.doi.org/10.5121/csit.2022.120703.

Abstract:
The task of predicting drug–target affinity (DTA) is crucial for drug development and repurposing. In this work, we propose a novel method called GDGRU-DTA to predict the binding affinity between drugs and targets. It is based on GraphDTA, but since protein sequences are long, a simple CNN cannot capture their context dependencies well. Therefore, we improve on GraphDTA by interpreting the protein sequences as time series and extracting their features using Gated Recurrent Units (GRU) and Bidirectional Gated Recurrent Units (BiGRU). For the drug, our processing method is similar to that of GraphDTA, but uses two different graph convolution methods. Subsequently, the representations of drugs and proteins are concatenated for the final prediction. We evaluate the proposed model on two benchmark datasets. Our model outperforms several state-of-the-art deep learning methods, and the results demonstrate the feasibility and excellent feature-capture ability of our model.
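A minimal sketch of the protein branch described above: a bidirectional GRU encodes the tokenized protein sequence, and its final states are concatenated with a drug vector (e.g., from a graph neural network) for affinity regression. Dimensions and inputs are toy placeholders, not the GDGRU-DTA model.

```python
# Illustrative sketch: BiGRU protein encoder + drug vector -> affinity regression.
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden_dim, drug_dim = 25, 32, 64, 128

embed = nn.Embedding(vocab_size, emb_dim)
bigru = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
regressor = nn.Linear(2 * hidden_dim + drug_dim, 1)       # joint protein + drug features -> affinity

protein_tokens = torch.randint(0, vocab_size, (4, 200))   # a batch of toy tokenized protein sequences
drug_vector = torch.randn(4, drug_dim)                    # stand-in for a graph neural network output

_, h_n = bigru(embed(protein_tokens))                     # h_n: (2, batch, hidden_dim)
protein_repr = torch.cat([h_n[0], h_n[1]], dim=-1)        # forward + backward final states
affinity = regressor(torch.cat([protein_repr, drug_vector], dim=-1))
print("predicted affinities shape:", affinity.squeeze(-1).shape)
```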
8

Wang, Zhizheng, Yuanyuan Sun, Yawen Guan, Yibin Zhang, Liang Yang, Kan Xu, Yijia Zhang, and Hongfei Lin. "A Weak Supervised Learning Method for Essential Protein Detection Based on STRING Database and Learning Representation." In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2018. http://dx.doi.org/10.1109/bibm.2018.8621469.

9

Arango-Rodriguez, J. D., A. F. Cardona-Escobar, J. A. Jaramillo-Garzon, and J. C. Arroyave-Ospina. "Machine learning based protein-protein interaction prediction using physical-chemical representations." In 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA). IEEE, 2016. http://dx.doi.org/10.1109/stsiva.2016.7743304.

10

Alam, Fardina Fathmiul, Taseef Rahman, and Amarda Shehu. "Learning Reduced Latent Representations of Protein Structure Data." In BCB '19: 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3307339.3343866.

