Academic literature on the topic 'Protein language models'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Protein language models.'

Journal articles on the topic "Protein language models"

1

Tang, Lin. "Protein language models using convolutions." Nature Methods 21, no. 4 (April 2024): 550. http://dx.doi.org/10.1038/s41592-024-02252-3.

2

Ali, Sarwan, Prakash Chourasia, and Murray Patterson. "When Protein Structure Embedding Meets Large Language Models." Genes 15, no. 1 (December 23, 2023): 25. http://dx.doi.org/10.3390/genes15010025.

Abstract:
Protein structure analysis is essential in various bioinformatics domains such as drug discovery, disease diagnosis, and evolutionary studies. Within structural biology, the classification of protein structures is pivotal, employing machine learning algorithms to categorize structures based on data from databases like the Protein Data Bank (PDB). To predict protein functions, embeddings based on protein sequences have been employed. Creating numerical embeddings that preserve vital information while considering protein structure and sequence presents several challenges. The existing literature lacks a comprehensive and effective approach that combines structural and sequence-based features to achieve efficient protein classification. While large language models (LLMs) have exhibited promising outcomes for protein function prediction, their focus primarily lies on protein sequences, disregarding the 3D structures of proteins. The quality of embeddings heavily relies on how well the geometry of the embedding space aligns with the underlying data structure, posing a critical research question. Traditionally, Euclidean space has served as a widely utilized framework for embeddings. In this study, we propose a novel method for designing numerical embeddings in Euclidean space for proteins by leveraging 3D structure information, specifically employing the concept of contact maps. These embeddings are synergistically combined with features extracted from LLMs and traditional feature engineering techniques to enhance the performance of embeddings in supervised protein analysis. Experimental results on benchmark datasets, including PDB Bind and STCRDAB, demonstrate the superior performance of the proposed method for protein function prediction.
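
The contact-map idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes one 3D coordinate per residue (for example the C-alpha atom) and an 8 Å distance cutoff chosen here purely for illustration, and the names `contact_map` and `flatten_upper` are hypothetical.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a binary residue-residue contact matrix.

    coords: list of (x, y, z) tuples, one per residue (assumed input format).
    threshold: distance cutoff in angstroms (illustrative assumption).
    """
    n = len(coords)
    return [
        [
            1 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


def flatten_upper(matrix):
    """Flatten the strict upper triangle of a symmetric matrix into a vector."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


# Toy coordinates: three residues spaced 3.8 angstroms apart, one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
toy_features = flatten_upper(toy_map)
```

A vector such as `toy_features` could then be concatenated with sequence-derived features before classification; how the paper actually combines them is not specified in the abstract.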
3

Ferruz, Noelia, and Birte Höcker. "Controllable protein design with language models." Nature Machine Intelligence 4, no. 6 (June 2022): 521–32. http://dx.doi.org/10.1038/s42256-022-00499-z.

4

Li, Xiang, Zhuoyu Wei, Yueran Hu, and Xiaolei Zhu. "GraphNABP: Identifying nucleic acid-binding proteins with protein graphs and protein language models." International Journal of Biological Macromolecules 280 (November 2024): 135599. http://dx.doi.org/10.1016/j.ijbiomac.2024.135599.

5

Singh, Arunima. "Protein language models guide directed antibody evolution." Nature Methods 20, no. 6 (June 2023): 785. http://dx.doi.org/10.1038/s41592-023-01924-w.

6

Tran, Chau, Siddharth Khadkikar, and Aleksey Porollo. "Survey of Protein Sequence Embedding Models." International Journal of Molecular Sciences 24, no. 4 (February 14, 2023): 3775. http://dx.doi.org/10.3390/ijms24043775.

Abstract:
Derived from the natural language processing (NLP) algorithms, protein language models enable the encoding of protein sequences, which are widely diverse in length and amino acid composition, in fixed-size numerical vectors (embeddings). We surveyed representative embedding models such as Esm, Esm1b, ProtT5, and SeqVec, along with their derivatives (GoPredSim and PLAST), to conduct the following tasks in computational biology: embedding the Saccharomyces cerevisiae proteome, gene ontology (GO) annotation of the uncharacterized proteins of this organism, relating variants of human proteins to disease status, correlating mutants of beta-lactamase TEM-1 from Escherichia coli with experimentally measured antimicrobial resistance, and analyzing diverse fungal mating factors. We discuss the advances and shortcomings, differences, and concordance of the models. Of note, all of the models revealed that the uncharacterized proteins in yeast tend to be less than 200 amino acids long, contain fewer aspartates and glutamates, and are enriched for cysteine. Less than half of these proteins can be annotated with GO terms with high confidence. The distribution of the cosine similarity scores of benign and pathogenic mutations to the reference human proteins shows a statistically significant difference. The differences in embeddings of the reference TEM-1 and mutants have low to no correlation with minimal inhibitory concentrations (MIC).
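
The cosine-similarity comparison of embeddings described in this abstract can be sketched as follows. This is a minimal illustration, not the survey's code: the vectors are made-up toy values standing in for fixed-size protein embeddings, and the function name `cosine_similarity` is introduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy stand-ins for a reference embedding and a variant embedding.
reference = [1.0, 2.0, 3.0]
variant = [2.0, 4.0, 6.0]  # same direction as the reference
score = cosine_similarity(reference, variant)
```

A score close to 1 means the variant embedding points in nearly the same direction as the reference; lower scores indicate a larger shift in embedding space.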
7

Pokharel, Suresh, Pawel Pratyush, Hamid D. Ismail, Junfeng Ma, and Dukka B. KC. "Integrating Embeddings from Multiple Protein Language Models to Improve Protein O-GlcNAc Site Prediction." International Journal of Molecular Sciences 24, no. 21 (November 6, 2023): 16000. http://dx.doi.org/10.3390/ijms242116000.

Abstract:
O-linked β-N-acetylglucosamine (O-GlcNAc) is a distinct monosaccharide modification of serine (S) or threonine (T) residues of nucleocytoplasmic and mitochondrial proteins. O-GlcNAc modification (i.e., O-GlcNAcylation) is involved in the regulation of diverse cellular processes, including transcription, epigenetic modifications, and cell signaling. Despite the great progress in experimentally mapping O-GlcNAc sites, there is an unmet need to develop robust prediction tools that can effectively locate the presence of O-GlcNAc sites in protein sequences of interest. In this work, we performed a comprehensive evaluation of a framework for prediction of protein O-GlcNAc sites using embeddings from pre-trained protein language models. In particular, we compared the performance of three protein sequence-based large protein language models (pLMs), Ankh, ESM-2, and ProtT5, for prediction of O-GlcNAc sites and also evaluated various ensemble strategies to integrate embeddings from these protein language models. Upon investigation, the decision-level fusion approach that integrates the decisions of the three embedding models, which we call LM-OGlcNAc-Site, outperformed the models trained on these individual language models as well as other fusion approaches and other existing predictors in almost all of the parameters evaluated. The precise prediction of O-GlcNAc sites will facilitate the probing of O-GlcNAc site-specific functions of proteins in physiology and diseases. Moreover, these findings also indicate the effectiveness of combined uses of multiple protein language models in post-translational modification prediction and open exciting avenues for further research and exploration in other protein downstream tasks. LM-OGlcNAc-Site’s web server and source code are publicly available to the community.
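
Decision-level fusion, as named in this abstract, combines the final decisions of several models rather than their embeddings. The abstract does not state the exact fusion rule, so the sketch below assumes simple majority voting over binary per-site predictions; the function name `majority_vote` and the toy predictions are hypothetical.

```python
def majority_vote(decisions):
    """Fuse binary predictions from several models by majority vote.

    decisions: list of per-model prediction lists, each holding 0/1 labels
    for the same candidate sites in the same order (assumed input format).
    """
    if not decisions:
        return []
    n_models = len(decisions)
    if any(len(d) != len(decisions[0]) for d in decisions):
        raise ValueError("all models must predict the same number of sites")
    # A site is called positive when more than half of the models say so.
    return [1 if sum(site) * 2 > n_models else 0 for site in zip(*decisions)]


# Toy predictions from three models for three candidate sites.
model_a = [1, 0, 1]
model_b = [1, 1, 0]
model_c = [0, 0, 1]
fused = majority_vote([model_a, model_b, model_c])
```

With an odd number of models, as here, ties cannot occur; an even number would need an explicit tie-breaking choice.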
8

Wang, Wenkai, Zhenling Peng, and Jianyi Yang. "Single-sequence protein structure prediction using supervised transformer protein language models." Nature Computational Science 2, no. 12 (December 19, 2022): 804–14. http://dx.doi.org/10.1038/s43588-022-00373-3.

9

Pang, Yihe, and Bin Liu. "IDP-LM: Prediction of protein intrinsic disorder and disorder functions based on language models." PLOS Computational Biology 19, no. 11 (November 22, 2023): e1011657. http://dx.doi.org/10.1371/journal.pcbi.1011657.

Abstract:
Intrinsically disordered proteins (IDPs) and regions (IDRs) are a class of functionally important proteins and regions that lack stable three-dimensional structures under the native physiologic conditions. They participate in critical biological processes and thus are associated with the pathogenesis of many severe human diseases. Identifying the IDPs/IDRs and their functions will be helpful for a comprehensive understanding of protein structures and functions, and inform studies of rational drug design. Over the past decades, the exponential growth in the number of proteins with sequence information has deepened the gap between uncharacterized and annotated disordered sequences. Protein language models have recently demonstrated their powerful abilities to capture complex structural and functional information from the enormous quantity of unlabelled protein sequences, providing opportunities to apply protein language models to uncover the intrinsic disorders and their biological properties from the amino acid sequences. In this study, we proposed a computational predictor called IDP-LM for predicting intrinsic disorder and disorder functions by leveraging the pre-trained protein language models. IDP-LM takes the embeddings extracted from three pre-trained protein language models as the exclusive inputs, including ProtBERT, ProtT5 and a disorder specific language model (IDP-BERT). The ablation analysis showed that the IDP-BERT provided fine-grained feature representations of disorder, and the combination of three language models is the key to the performance improvement of IDP-LM. The evaluation results on independent test datasets demonstrated that the IDP-LM provided high-quality prediction results for intrinsic disorder and four common disordered functions.
10

Weber, Leon, Kirsten Thobe, Oscar Arturo Migueles Lozano, Jana Wolf, and Ulf Leser. "PEDL: extracting protein–protein associations using deep language models and distant supervision." Bioinformatics 36, Supplement_1 (July 1, 2020): i490–i498. http://dx.doi.org/10.1093/bioinformatics/btaa430.

Abstract:
Motivation: A significant portion of molecular biology investigates signalling pathways and thus depends on an up-to-date and complete resource of functional protein–protein associations (PPAs) that constitute such pathways. Despite extensive curation efforts, major pathway databases are still notoriously incomplete. Relation extraction can help to gather such pathway information from biomedical publications. Current methods for extracting PPAs typically rely exclusively on rare manually labelled data which severely limits their performance.
Results: We propose PPA Extraction with Deep Language (PEDL), a method for predicting PPAs from text that combines deep language models and distant supervision. Due to the reliance on distant supervision, PEDL has access to an order of magnitude more training data than methods solely relying on manually labelled annotations. We introduce three different datasets for PPA prediction and evaluate PEDL for the two subtasks of predicting PPAs between two proteins, as well as identifying the text spans stating the PPA. We compared PEDL with a recently published state-of-the-art model and found that on average PEDL performs better in both tasks on all three datasets. An expert evaluation demonstrates that PEDL can be used to predict PPAs that are missing from major pathway databases and that it correctly identifies the text spans supporting the PPA.
Availability and implementation: PEDL is freely available at https://github.com/leonweber/pedl. The repository also includes scripts to generate the used datasets and to reproduce the experiments from this article.
Supplementary information: Supplementary data are available at Bioinformatics online.

Dissertations / Theses on the topic "Protein language models"

1

Meynard, Barthélémy. "Language Models towards Conditional Generative Models of Proteins Sequences." Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS195.

Abstract:
We begin by examining what makes a generative model effective for protein sequences. In our first study, "Interpretable Pairwise Distillations for Generative Protein Sequence Models," we compare complex neural network models with simpler pairwise distribution models. This comparison reveals that the simpler models can closely match the performance of the more complex models in predicting the effect of mutations on proteins. This finding challenges the assumption that more complex models are always better and sets the stage for further exploration. In a second part, we turn to sequence conditioning with "Generating Interacting Protein Sequences using Domain-to-Domain Translation." This study introduces a novel approach for generating protein sequences that can interact with specific other proteins. By treating this as a translation problem, similar to methods used in natural language processing, we create sequences with intended functionalities. In addition, we address the crucial challenge of predicting the interaction between the T-cell receptor (TCR) and the epitope in "TULIP—a Transformer based Unsupervised Language model for Interacting Peptides and T-cell receptors." This study introduces an unsupervised learning approach to accurately predict TCR-epitope binding, overcoming the data-quality limitations and training biases inherent in previous models. These advances underline the potential of sequence conditioning for creating functionally specific, interaction-aware protein designs. Finally, we explore structure conditioning in "Uncovering Sequence Diversity from a Known Protein Structure." Here, we present InvMSAFold, a method that produces diverse protein sequences designed to fold into a specific structure. This approach highlights the importance of considering the protein's final structure in the design process, enabling the generation of sequences that are not only diverse but also maintain their intended structural integrity.
This thesis explores the intersection of artificial intelligence (AI) and biology, focusing on how generative models can innovate in protein sequence design. Our research unfolds in three distinct yet interconnected stages, each building upon the insights of the previous to enhance the model's applicability and performance in protein engineering. We begin by examining what makes a generative model effective for protein sequences. In our first study, "Interpretable Pairwise Distillations for Generative Protein Sequence Models," we compare complex neural network models to simpler, pairwise distribution models. This comparison shows that deep learning strategies mainly model second-order interactions, highlighting the fundamental role of these interactions in modeling protein families. In a second part, we try to extend this principle of using second-order interactions to inverse folding. We explore structure conditioning in "Uncovering Sequence Diversity from a Known Protein Structure." Here, we present InvMSAFold, a method that produces diverse protein sequences designed to fold into a specific structure. This approach tries to combine two different traditions of protein modeling: the MSA-based models that try to capture the entire fitness landscape and the inverse folding models that focus on recovering one specific sequence. This is a first step towards the possibility of conditioning the fitness landscape by considering the protein's final structure in the design process, enabling the generation of sequences that are not only diverse but also maintain their intended structural integrity. Finally, we delve into sequence conditioning with "Generating Interacting Protein Sequences using Domain-to-Domain Translation." This study introduces a novel approach to generate protein sequences that can interact with specific other proteins. By treating this as a translation problem, similar to methods used in language processing, we create sequences with intended functionalities. Furthermore, we address the critical challenge of T-cell receptor (TCR) and epitope interaction prediction in "TULIP—a Transformer based Unsupervised Language model for Interacting Peptides and T-cell receptors." This study introduces an unsupervised learning approach to accurately predict TCR-epitope bindings, overcoming limitations in data quality and training bias inherent in previous models. These advancements underline the potential of sequence conditioning in creating functionally specific and interaction-aware protein designs.
2

Hladiš, Matej. "Réseaux de neurones en graphes et modèle de langage des protéines pour révéler le code combinatoire de l'olfaction." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ5024.

Abstract:
Mammals identify and interpret a myriad of olfactory stimuli using a complex coding mechanism involving interactions between odorant molecules and hundreds of olfactory receptors (ORs). These interactions generate unique combinations of activated receptors, called the combinatorial code, which the human brain interprets as the sensation we call smell. Until now, the vast number of possible receptor-molecule combinations has prevented a large-scale experimental study of this code and its link to odor perception. Therefore, revealing this code is crucial to answering the long-term question of how we perceive our intricate chemical environment. ORs belong to the class A of G protein-coupled receptors (GPCRs) and constitute the largest known multigene family. To systematically study olfactory coding, we develop M2OR, a comprehensive database compiling the last 25 years of OR bioassays. Using this dataset, a tailored deep learning model is designed and trained. It combines the [CLS] token embedding from a protein language model with graph neural networks and multi-head attention. This model predicts the activation of ORs by odorants and reveals the resulting combinatorial code for any odorous molecule. This approach is refined by developing a novel model capable of predicting the activity of an odorant at a specific concentration, subsequently allowing the estimation of the EC50 value for any OR-odorant pair. Finally, the combinatorial codes derived from both models are used to predict the odor perception of molecules. By incorporating inductive biases inspired by olfactory coding theory, a machine learning model based on these codes outperforms the current state-of-the-art in smell prediction. To the best of our knowledge, this is the most comprehensive and successful application of combinatorial coding to odor quality prediction. Overall, this work provides a link between the complex molecule-receptor interactions and human perception.

Books on the topic "Protein language models"

1

Bovington, Sue. Tigris/Thames. [United Kingdom]: [Sue Bovington], 2011.

2

Solís Leree, Beatriz, ed. La Ley televisa y la lucha por el poder en México. México, D.F.: Universidad Autónoma Metropolitana, Unidad Xochimilco, 2009.

3

Yoshikawa, Saeko. William Wordsworth and Modern Travel. Liverpool University Press, 2020. http://dx.doi.org/10.3828/liverpool/9781789621181.001.0001.

Abstract:
This book explores William Wordsworth’s pervasive influence on the tourist landscapes of the Lake District throughout the age of transport revolutions, popular tourism, and the Great 1914-18 War. It reveals how Wordsworth’s response to railways was not a straightforward matter of opposition and protest; his ideas were taken up by advocates and opponents of railways, and through their controversies had a surprising impact on the earliest motorists as they sought a language to describe the liberty and independence of their new mode of travel. Once the age of motoring was underway, the outbreak of the First World War encouraged British people to connect Wordsworth’s patriotic passion with his wish to protect the Lake District as a national heritage—a transition that would have momentous effects in the interwar period when the popularisation of motoring paradoxically brought a vogue for open-air activities and a renewal of Romantic pedestrianism. With the arrival of global tourism, preservation of the cultural landscape of the Lake District became an urgent national and international concern. By revealing how Romantic ideas of nature, travel, liberty and self-reliance were re-interpreted and utilized in discourses on landscape, transport, accessibility, preservation, war and cultural heritage, this book portrays multiple Wordsworthian legacies in modern ways of perceiving and valuing the nature and culture of the Lake District.
4

Hardiman, David. The Nonviolent Struggle for Indian Freedom, 1905-19. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780190920678.001.0001.

Abstract:
Much of the recent surge in writing about the practice of nonviolent forms of resistance has focused on movements that occurred after the end of the Second World War, many of which have been extremely successful. Although the fact that such a method of civil resistance was developed in its modern form by Indians is acknowledged in this writing, there has not until now been an authoritative history of the role of Indians in the evolution of the phenomenon. The book argues that while nonviolence is associated above all with the towering figure of Mahatma Gandhi, 'passive resistance' was already being practiced as a form of civil protest by nationalists in British-ruled India, though there was no principled commitment to nonviolence as such. The emphasis was on efficacy, rather than the ethics of such protest. It was Gandhi, first in South Africa and then in India, who evolved a technique that he called 'satyagraha'. He envisaged this as primarily a moral stance, though it had a highly practical impact. From 1915 onwards, he sought to root his practice in terms of the concept of ahimsa, a Sanskrit term that he translated as ‘nonviolence’. His endeavors saw 'nonviolence' forged as both a new word in the English language, and as a new political concept. This book conveys in vivid detail exactly what such nonviolence entailed, and the formidable difficulties that the pioneers of such resistance encountered in the years 1905-19.
5

McNally, Michael D. Defend the Sacred. Princeton University Press, 2020. http://dx.doi.org/10.23943/princeton/9780691190907.001.0001.

Abstract:
From North Dakota's Standing Rock encampments to Arizona's San Francisco Peaks, Native Americans have repeatedly asserted legal rights to religious freedom to protect their sacred places, practices, objects, knowledge, and ancestral remains. But these claims have met with little success in court because Native American communal traditions don't fit easily into modern Western definitions of religion. This book explores how, in response to this situation, Native peoples have creatively turned to other legal means to safeguard what matters to them. To articulate their claims, Native peoples have resourcefully used the languages of cultural resources under environmental and historic preservation law; of sovereignty under treaty-based federal Indian law; and, increasingly, of Indigenous rights under international human rights law. Along the way, Native nations still draw on the rhetorical power of religious freedom to gain legislative and regulatory successes beyond the First Amendment. This book casts new light on discussions of religious freedom, cultural resource management, and the vitality of Indigenous religions today.
6

Meddings, Jennifer, Vineet Chopra, and Sanjay Saint. Preventing Hospital Infections. 2nd ed. Oxford University Press, 2021. http://dx.doi.org/10.1093/med/9780197509159.001.0001.

Abstract:
This book provides a detailed, step-by-step description of a model quality improvement intervention for hospitals, pinpointing the obstacles and showing how to surmount them. This second edition has been carefully updated, with new material describing some technical aspects of infection prevention, new tools for use by front-line providers, and results of recent large collaborative infection prevention studies. In easy-to-read, user-friendly language, it explains why clinicians neglect or actively oppose quality changes—from physicians who distrust change, to nurses who want to protect their turf, to infection preventionists who avoid the wards. The book also sheds light on how and why hospitals embark on quality improvements, the role of the hospital’s leadership cadre, the selection and training of the project team, and how to sustain quality gains long term. The intervention framework described in the book focuses on the prevention of hospital-associated infections—in particular, catheter-associated urinary tract infection (CAUTI)—but it is directly applicable to a variety of other hospital issues, such as falls, pressure sores, and Clostridioides difficile infection (CDI). In fact, the book includes a chapter applying this framework to a CDI prevention initiative. In addition, for hospitals having trouble with staff adherence to a quality initiative, we provide three infection-specific questionnaires (for CAUTI, CLABSI, and CDI) to help pinpoint individual problems, and provide a link to a website offering advice tailored to their specific circumstances.
7

Halvorsen, Tor, and Peter Vale. One World, Many Knowledges: Regional experiences and cross-regional links in higher education. African Minds, 2016. http://dx.doi.org/10.47622/978-0-620-55789-4.

Abstract:
Various forms of academic co-operation criss-cross the modern university system in a bewildering number of ways, from the open exchange of ideas and knowledge, to the sharing of research results, and frank discussions about research challenges. Embedded in these scholarly networks is the question of whether a global template for the management of both higher education and national research organisations is necessary, and if so, must institutions slavishly follow the high-flown language of the global knowledge society or risk falling behind in the ubiquitous university ranking system? Or are there alternatives that can achieve a better, more ethically inclined, world? Basing their observations on their own experiences, an interesting mix of seasoned scholars and new voices from southern Africa and the Nordic region offer critical perspectives on issues of inter- and cross-regional academic co-operation. Several of the chapters also touch on the evolution of the higher education sector in the two regions. An absorbing and intelligent study, this book will be invaluable for anyone interested in the strategies scholars are using to adapt to the interconnectedness of the modern world. It offers fresh insights into how academics are attempting to protect the spaces in which they can freely and openly debate the challenges they face, while aiming to transform higher education, and foster scholarly collaboration. The Southern African-Nordic Centre (SANORD) is a partnership of higher education institutions from Denmark, Finland, Iceland, Norway, Sweden, Botswana, Namibia, Malawi, South Africa, Zambia and Zimbabwe. SANORD's primary aim is to promote multilateral research co-operation on matters of importance to the development of both regions. Our activities are based on the values of democracy, equity, and mutually beneficial academic engagement.

Book chapters on the topic "Protein language models"

1

Xu, Yaoyao, Xinjian Zhao, Xiaozhuang Song, Benyou Wang, and Tianshu Yu. "Boosting Protein Language Models with Negative Sample Mining." In Lecture Notes in Computer Science, 199–214. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-70381-2_13.

2

Zhao, Junming, Chao Zhang, and Yunan Luo. "Contrastive Fitness Learning: Reprogramming Protein Language Models for Low-N Learning of Protein Fitness Landscape." In Lecture Notes in Computer Science, 470–74. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-1-0716-3989-4_55.

3

Ghazikhani, Hamed, and Gregory Butler. "A Study on the Application of Protein Language Models in the Analysis of Membrane Proteins." In Distributed Computing and Artificial Intelligence, Special Sessions, 19th International Conference, 147–52. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-23210-7_14.

4

Zeng, Shuai, Duolin Wang, Lei Jiang, and Dong Xu. "Prompt-Based Learning on Large Protein Language Models Improves Signal Peptide Prediction." In Lecture Notes in Computer Science, 400–405. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-1-0716-3989-4_40.

5

Fernández, Diego, Álvaro Olivera-Nappa, Roberto Uribe-Paredes, and David Medina-Ortiz. "Exploring Machine Learning Algorithms and Protein Language Models Strategies to Develop Enzyme Classification Systems." In Bioinformatics and Biomedical Engineering, 307–19. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-34953-9_24.

6

Paaß, Gerhard, and Sven Giesselbach. "Foundation Models for Speech, Images, Videos, and Control." In Artificial Intelligence: Foundations, Theory, and Algorithms, 313–82. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-23190-2_7.

Abstract:
Foundation Models are able to model not only tokens of natural language but also token elements of arbitrary sequences. For images, square image patches can be represented as tokens; for videos, we can define tubelets that span an image patch across multiple frames. Subsequently, the proven self-attention algorithms can be applied to these tokens. Most importantly, several modalities like text and images can be processed in the same sequence allowing, for instance, the generation of images from text and text descriptions from video. In addition, the models are scalable to very large networks and huge datasets. The following multimedia types are covered in the subsequent sections. Speech recognition and text-to-speech models describe the translation of spoken language into text and vice versa. Image processing has the task to interpret images, describe them by captions, and generate new images according to textual descriptions. Video interpretation aims at recognizing action in videos and describing them through text. Furthermore, new videos can be created according to a textual description. Dynamical system trajectories characterize sequential decision problems, which can be simulated and controlled. DNA and protein sequences can be analyzed with Foundation Models to predict the structure and properties of the corresponding molecules.
7

Shan, Kaixuan, Xiankun Zhang, and Chen Song. "Prediction of Protein-DNA Binding Sites Based on Protein Language Model and Deep Learning." In Advanced Intelligent Computing in Bioinformatics, 314–25. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-5692-6_28.

8

Matsiunova, Antonina. "Semantic opposition of US versus THEM in late 2020 Russian-language Belarusian discourse." In Protest in Late Modern Societies, 42–55. London: Routledge, 2023. http://dx.doi.org/10.4324/9781003270065-4.

9

Mullett, Michael. "Language and Action in Peasant Revolts." In Popular Culture and Popular Protest in Late Medieval and Early Modern Europe, 71–109. London: Routledge, 2021. http://dx.doi.org/10.4324/9781003188858-3.

10

Fan, Jianye, Xiaofeng Liu, Shoubin Dong, and Jinlong Hu. "Enriching Pre-trained Language Model with Dependency Syntactic Information for Chemical-Protein Interaction Extraction." In Lecture Notes in Computer Science, 58–69. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-56725-5_5.


Conference papers on the topic "Protein language models"

1

Jiang, Yanfeng, Ning Sun, Zhengxian Lu, Shuang Peng, Yi Zhang, Fei Yang, and Tao Li. "MEFold: Memory-Efficient Optimization for Protein Language Models via Chunk and Quantization." In 2024 International Joint Conference on Neural Networks (IJCNN), 1–8. IEEE, 2024. http://dx.doi.org/10.1109/ijcnn60899.2024.10651470.

2

Kim, Yunsoo. "Foundation Model for Biomedical Graphs: Integrating Knowledge Graphs and Protein Structures to Large Language Models." In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), 346–55. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.acl-srw.30.

3

Engel, Ryan, and Gilchan Park. "Evaluating Large Language Models for Predicting Protein Behavior under Radiation Exposure and Disease Conditions." In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, 427–39. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.bionlp-1.34.

4

Lin, Ming. "DPPred-indel: a pathogenic inframe indel prediction method based on biological language models and fusion of DNA and protein features." In 2024 Fourth International Conference on Biomedicine and Bioinformatics Engineering (ICBBE 2024), edited by Pier Paolo Piccaluga, Ahmed El-Hashash, and Xiangqian Guo, 67. SPIE, 2024. http://dx.doi.org/10.1117/12.3044406.

5

Zeinalipour, Kamyar, Neda Jamshidi, Monica Bianchini, Marco Maggini, and Marco Gori. "Design Proteins Using Large Language Models: Enhancements and Comparative Analyses." In Proceedings of the 1st Workshop on Language + Molecules (L+M 2024), 34–47. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.langmol-1.5.

6

Peng, Shuang, Fei Yang, Ning Sun, Sheng Chen, Yanfeng Jiang, and Aimin Pan. "Exploring Post-Training Quantization of Protein Language Models." In 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2023. http://dx.doi.org/10.1109/bibm58861.2023.10385775.

7

Gao, Liyuan, Kyler Shu, Jun Zhang, and Victor S. Sheng. "Explainable Transcription Factor Prediction with Protein Language Models." In 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2023. http://dx.doi.org/10.1109/bibm58861.2023.10385498.

8

Škrhák, Vít, Kamila Riedlova, Marian Novotný, and David Hoksza. "Cryptic binding site prediction with protein language models." In 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2023. http://dx.doi.org/10.1109/bibm58861.2023.10385497.

9

Wu, Xinbo, Alexandru Hanganu, Ayuko Hoshino, and Lav R. Varshney. "Source Identification for Exosomal Communication via Protein Language Models." In 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2022. http://dx.doi.org/10.1109/mlsp55214.2022.9943418.

10

Iinuma, Naoki, Makoto Miwa, and Yutaka Sasaki. "Improving Supervised Drug-Protein Relation Extraction with Distantly Supervised Models." In Proceedings of the 21st Workshop on Biomedical Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2022. http://dx.doi.org/10.18653/v1/2022.bionlp-1.16.


Reports on the topic "Protein language models"

1

Wu, Jyun-Jie. Improving Predictive Efficiency and Literature Quality Assessment for Lung Cancer Complications Post-Proton Therapy Through Large Language Models and Meta-Analysis. INPLASY - International Platform of Registered Systematic Review and Meta-analysis Protocols, August 2024. http://dx.doi.org/10.37766/inplasy2024.8.0103.

2

Shani, Uri, Lynn Dudley, Alon Ben-Gal, Menachem Moshelion, and Yajun Wu. Root Conductance, Root-soil Interface Water Potential, Water and Ion Channel Function, and Tissue Expression Profile as Affected by Environmental Conditions. United States Department of Agriculture, October 2007. http://dx.doi.org/10.32747/2007.7592119.bard.

Abstract:
Constraints on water resources and the environment necessitate more efficient use of water. The key to efficient management is an understanding of the physical and physiological processes occurring in the soil-root hydraulic continuum. While both soil and plant leaf water potentials are well understood, modeled and measured, the root-soil interface where actual uptake processes occur has not been sufficiently studied. The water potential at the root-soil interface (ψᵣₒₒₜ), determined by environmental conditions and by soil and plant hydraulic properties, serves as a boundary value in soil and plant uptake equations. In this work, we propose to 1) refine and implement a method for measuring ψᵣₒₒₜ; 2) measure ψᵣₒₒₜ, water uptake and root hydraulic conductivity for wild type tomato and Arabidopsis under varied θ, K⁺, Na⁺ and Cl⁻ levels in the root zone; 3) verify the role of MIPs and ion channels response to θ, K⁺ and Na⁺ levels in Arabidopsis and tomato; 4) study the relationships between ψᵣₒₒₜ and root hydraulic conductivity for various crops representing important botanical and agricultural species, under conditions of varying soil types, water contents and salinity; and 5) integrate the above into water uptake term(s) to be implemented in models. We have made significant progress toward establishing the efficacy of the emittensiometer and on the molecular biology studies. We have added an additional method for measuring ψᵣₒₒₜ. High-frequency water application through the water source while the plant emerges and becomes established encourages roots to develop towards and into the water source itself. The ψᵣₒₒₜ and ψₛₒᵢₗ values reflected wetting and drying processes in the rhizosphere and in the bulk soil. Thus, ψᵣₒₒₜ can be manipulated by changing irrigation level and frequency. An important and surprising finding resulting from the current research is the obtained ψᵣₒₒₜ value.
The ψᵣₒₒₜ values measured using the three different methods (emittensiometer, micro-tensiometer and MRI imaging) in sunflower, tomato and corn plants fell in the same range and were higher by one to three orders of magnitude than the values of -600 to -15,000 cm suggested in the literature. We have added additional information on the regulation of aquaporins and transporters at the transcript and protein levels, particularly under stress. Our preliminary results show that overexpression of one aquaporin gene in tomato dramatically increases its transpiration level (unpublished results). Based on this information, we started screening mutants for other aquaporin genes. During the feasibility testing year, we identified homozygous mutants for eight aquaporin genes, including six mutants for five of the PIP2 genes. Including the homozygous mutants directly available at the ABRC seed stock center, we now have mutants for 11 of the 19 aquaporin genes of interest. Currently, we are screening mutants for other aquaporin genes and ion transporter genes. Understanding plant water uptake under stress is essential for the further advancement of molecular plant stress tolerance work as well as for efficient use of water in agriculture. Virtually all of Israel’s agriculture and about 40% of US agriculture is made possible by irrigation. Both countries face increasing risk of water shortages as urban requirements grow. Both countries will have to find methods of protecting the soil resource while conserving water resources—goals that appear to be in direct conflict. The climate-plant-soil-water system is nonlinear with many feedback mechanisms. Conceptual plant uptake and growth models and mechanism-based computer-simulation models will be valuable tools in developing irrigation regimes and methods that maximize the efficiency of agricultural water.
This proposal will contribute to the development of these models by providing critical information on water extraction by the plant that will result in improved predictions of both water requirements and crop yields. Plant water use and plant response to environmental conditions cannot possibly be understood by using the tools and language of a single scientific discipline. This proposal links the disciplines of soil physics and soil physical chemistry with plant physiology and molecular biology in order to correctly treat and understand the soil-plant interface in terms of integrated comprehension. Results from the project will contribute to a mechanistic understanding of the SPAC and will inspire continued multidisciplinary research.
3

Melnyk, Iurii. JUSTIFICATION OF OCCUPATION IN GERMAN (1938) AND RUSSIAN (2014) MEDIA: SUBSTITUTION OF AGGRESSOR AND VICTIM. Ivan Franko National University of Lviv, March 2021. http://dx.doi.org/10.30970/vjo.2021.50.11101.

Abstract:
The article examines and compares the justification of the occupation of a neighboring country in the German (1938) and Russian (2014) media. The objective of the study is to reveal the mechanics of the classical manipulative method of substituting aggressor and victim, using German and Russian propaganda from 1938 and 2014 respectively. According to the results of the study, clear parallels between the two information strategies can be traced at the level of the condemnation of internal aggression against a national minority loyal to Berlin / Moscow and its political representative (the Sudeten Germans – the pro-Russian Ukrainians, as well as the security forces of the Yanukovych regime); the reflections on dangers that Czechoslovakia / Ukraine poses to itself and to its neighbors; condemnation of the violation of the cultural rights of the minority that the occupier intends to protect (German language and culture – Russian language and culture); the historical parallels designed to deepen the modern conflict, to show it as a long-standing and a natural one (“Hussites” – “Banderites”). In the manipulative strategy of both media, the main focus is not on factual fabrication, but on the biased selection of facts, which is meant to give the reader an unambiguous understanding of who is the permanent aggressor in the conflict (Czechoslovakia, Czechs – Ukraine, Ukrainians), and who is the permanent victim (Germans – Russians, Russian speakers). The substitution of victim and aggressor in the media in both cases became one of the most important manipulative strategies designed to justify the German occupation of part of Czechoslovakia and the Russian occupation of part of Ukraine.
4

Yatsymirska, Mariya. KEY IMPRESSIONS OF 2020 IN JOURNALISTIC TEXTS. Ivan Franko National University of Lviv, March 2021. http://dx.doi.org/10.30970/vjo.2021.50.11107.

Abstract:
The article explores the key vocabulary of 2020 in the network space of Ukraine. Texts of journalistic and official-business style and analytical publications by well-known journalists on current topics are analyzed. Extralinguistic factors of new word formation and their adaptation to the sphere of special and socio-political vocabulary of the Ukrainian language are determined. Examples show modern impressions in the media, their stylistic use and impact on public opinion in a pandemic. New meanings of foreign expressions, media terminology, and peculiarities of translation of neologisms from English into Ukrainian have been clarified. According to the materials of the online media, a «dictionary of the coronavirus era» is provided. The journalistic text functions in the media on the basis of logical judgments, credible arguments, and impressive language. Its purpose is to show the socio-political problem, to sharpen its significance for society and to propose solutions through convincing considerations. Most researchers emphasize the influential role of journalistic style, which through the media shapes public opinion on issues of politics, economics, education, health care, war, and the future of the country. To cover such a wide range of topics, socio-political vocabulary is used first of all – neutral and emotionally-evaluative, rhetorical questions and imperatives, special terminology, foreign words. There is an ongoing discussion in online publications about the use of the new foreign word «lockdown» instead of the word «quarantine», which has long been established in the Ukrainian language. Research on this topic has shown that at the initial stage of the pandemic, the word «lockdown» prevailed in the colloquial language of politicians and media personalities, and part of society did not quite understand its meaning.
Lockdown, in its current interpretation, is a restrictive measure to protect people from a dangerous virus that has spread to many countries; isolation of the population («stay in place») in case of risk of spreading Covid-19. In English, US citizens are told what a lockdown is: «A lockdown is a restriction policy for people or communities to stay where they are, usually due to specific risks to themselves or to others if they can move and interact freely. The term «stay-at-home» or «shelter-in-place» is often used for lockdowns that affect an area, rather than specific locations». Content analysis of online texts leads to the conclusion that in 2020 a special vocabulary was actively functioning, with the appropriate definitions, which the media described as a «dictionary of coronavirus vocabulary». Media broadcasting is the deepest and pulsating source of creative texts with new meanings, phrases, expressiveness. The influential power of the word finds its unconditional embodiment in the media. Journalists, bloggers, experts, politicians, analyzing current events, produce concepts of a new reality. The world is changing and the language of the media is responding to these changes. It manifests itself most vividly and emotionally in the network sphere, in various genres and styles.
5

Or, Etti, David Galbraith, and Anne Fennell. Exploring mechanisms involved in grape bud dormancy: Large-scale analysis of expression reprogramming following controlled dormancy induction and dormancy release. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7587232.bard.

Abstract:
The timing of dormancy induction and release is very important to the economic production of table grapes. Advances in manipulation of dormancy induction and dormancy release are dependent on the establishment of a comprehensive understanding of biological mechanisms involved in bud dormancy. To gain insight into these mechanisms we initiated research that had two main objectives: A. Analyzing the expression profiles of large subsets of genes, following controlled dormancy induction and dormancy release, and assessing the role of known metabolic pathways, known regulatory genes and novel sequences involved in these processes. B. Comparing expression profiles following the perception of various artificial as well as natural signals known to induce dormancy release, and searching for genes showing similar expression patterns, as candidates for further study of pathways having potential to play a central role in dormancy release. We first created targeted EST collections from V. vinifera and V. riparia mature buds. Clones were randomly selected from cDNA libraries prepared following controlled dormancy release and controlled dormancy induction and from respective controls. The entire collection (7920 vinifera and 1194 riparia clones) was sequenced and subjected to bioinformatics analysis, including clustering, annotations and GO classifications. PCR products from the entire collection were used for printing of cDNA microarrays. Bud tissue in general, and the dormant bud in particular, are under-represented within the grape EST database. Accordingly, 59% of our vinifera EST collection, composed of 5516 unigenes, are not included within the current Vitis TIGR collection and about 22% of these transcripts bear no resemblance to any known plant transcript, corroborating the current need for our targeted EST collection and the bud-specific cDNA array. Analysis of the V. 
riparia sequences yielded 814 unigenes, of which 140 are unique (Keilin et al., manuscript, Appendix B). Results from computational expression profiling of the vinifera collection suggest that oxidative stress, calcium signaling, intracellular vesicle trafficking and anaerobic mode of carbohydrate metabolism play a role in the regulation and execution of grape-bud dormancy release. A comprehensive analysis confirmed the induction of transcription from several calcium-signaling-related genes following HC treatment, and detected an inhibiting effect of calcium channel blocker and calcium chelator on HC-induced and chilling-induced bud break. It also detected the existence of HC-induced and calcium-dependent protein phosphorylation activity. These data suggest, for the first time, that calcium signaling is involved in the mechanism of dormancy release (Pang et al., in preparation). We compared the effects of heat shock (HS) to those detected in buds following HC application and found that HS led to earlier and higher bud break. We also demonstrated similar temporary reduction in catalase expression and temporary induction of ascorbate peroxidase, glutathione reductase, thioredoxin and glutathione S transferase expression following both treatments. These findings further support the assumption that temporary oxidative stress is part of the mechanism leading to bud break. The temporary induction of sucrose synthase, pyruvate decarboxylase and alcohol dehydrogenase indicates that temporary respiratory stress develops and suggests that mitochondrial function may be of central importance for that mechanism. These findings, which suggest that HS and HC trigger identical mechanisms, justified the comparison of expression profiles of HC- and HS-treated buds as a tool for the identification of pathways with a central role in dormancy release (Halaly et al., in preparation). 
RNA samples from buds treated with HS, HC and water were hybridized with the cDNA arrays in an interconnected loop design. Differentially expressed genes were selected using LIMMA, an R-language package from the Bioconductor project, and clones showing a significant change following both HS and HC treatments, compared to control, were selected for further analysis. A total of 1541 clones show significant induction, of which 37% have no hit or unknown function and the rest represent 661 genes with identified function. Similarly, out of 1452 clones showing significant reduction, only 53% of the clones have identified function and they represent 573 genes. The 661 induced genes are involved in 445 different molecular functions. About 90% of those functions were classified to 20 categories based on careful survey of the literature. Among other things, it appears that carbohydrate metabolism and mitochondrial function may be of central importance in the mechanism of dormancy release and studies in this direction are ongoing. Analysis of the reduced function is ongoing (Appendix A). A second set of hybridizations was carried out with RNA samples from buds exposed to short photoperiod, leading to induction of bud dormancy, and long photoperiod treatment, as control. Analysis indicated that 42 genes differed significantly between LD and SD and 11 of these were unique.