
Journal articles on the topic 'Generative sequence models'


Consult the top 50 journal articles for your research on the topic 'Generative sequence models.'


Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Wang, Yongkang, Xuan Liu, Feng Huang, Zhankun Xiong, and Wen Zhang. "A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 1 (March 24, 2024): 3–11. http://dx.doi.org/10.1609/aaai.v38i1.27749.

Abstract:
Therapeutic peptides represent a unique class of pharmaceutical agents crucial for the treatment of human diseases. Recently, deep generative models have exhibited remarkable potential for generating therapeutic peptides, but they use sequence or structure information alone, which hinders generation performance. In this study, we propose a Multi-Modal Contrastive Diffusion model (MMCD) that fuses both sequence and structure modalities in a diffusion framework to co-generate novel peptide sequences and structures. Specifically, MMCD constructs sequence-modal and structure-modal diffusion models and devises a multi-modal contrastive learning strategy with inter-contrastive and intra-contrastive objectives at each diffusion timestep, aiming to capture the consistency between the two modalities and boost model performance. The inter-contrastive objective aligns the sequences and structures of peptides by maximizing the agreement of their embeddings, while the intra-contrastive objective differentiates therapeutic from non-therapeutic peptides by maximizing the disagreement of their sequence/structure embeddings. Extensive experiments demonstrate that MMCD outperforms other state-of-the-art deep generative methods in generating therapeutic peptides across various metrics, including antimicrobial/anticancer score, diversity, and peptide docking.
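As a concrete illustration of the inter-contrastive objective described in this abstract, the sketch below aligns paired sequence and structure embeddings with a symmetric InfoNCE-style loss. The function name, shapes, and temperature are illustrative assumptions, not MMCD's actual implementation.

```python
# Hedged sketch: InfoNCE-style alignment of paired sequence/structure
# embeddings, illustrating an "inter-contrastive" objective (not MMCD code).
import torch
import torch.nn.functional as F

def inter_contrastive_loss(seq_emb, struct_emb, temperature=0.1):
    """seq_emb, struct_emb: (batch, dim) embeddings of the same peptides."""
    seq = F.normalize(seq_emb, dim=-1)
    struct = F.normalize(struct_emb, dim=-1)
    logits = seq @ struct.t() / temperature                  # all-pairs similarities
    targets = torch.arange(seq.size(0), device=seq.device)   # matches on the diagonal
    # Symmetric cross-entropy pulls matched pairs together, pushes others apart.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```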
2

Wu, Zachary, Kadina E. Johnston, Frances H. Arnold, and Kevin K. Yang. "Protein sequence design with deep generative models." Current Opinion in Chemical Biology 65 (December 2021): 18–27. http://dx.doi.org/10.1016/j.cbpa.2021.04.004.

3

Akl, Hoda, Brooke Emison, Xiaochuan Zhao, Arup Mondal, Alberto Perez, and Purushottam D. Dixit. "GENERALIST: A latent space based generative model for protein sequence families." PLOS Computational Biology 19, no. 11 (November 27, 2023): e1011655. http://dx.doi.org/10.1371/journal.pcbi.1011655.

Abstract:
Generative models of protein sequence families are an important tool in the repertoire of protein scientists and engineers alike. However, state-of-the-art generative approaches face inference-, accuracy-, and overfitting-related obstacles when modeling moderately sized to large proteins and/or protein families with low sequence coverage. Here, we present a simple-to-learn, tunable, and accurate generative model, GENERALIST: GENERAtive nonLInear tenSor-factorizaTion for protein sequences. GENERALIST accurately captures several higher-order summary statistics of amino acid covariation. GENERALIST also predicts conservative, locally optimal sequences which are likely to fold into stable 3D structures. Importantly, unlike current methods, the density of sequences in GENERALIST-modeled sequence ensembles closely resembles that of the corresponding natural ensembles. Finally, GENERALIST embeds protein sequences in an informative latent space. GENERALIST will be an important tool for studying protein sequence variability.
4

Feinauer, Christoph, Barthelemy Meynard-Piganeau, and Carlo Lucibello. "Interpretable pairwise distillations for generative protein sequence models." PLOS Computational Biology 18, no. 6 (June 23, 2022): e1010219. http://dx.doi.org/10.1371/journal.pcbi.1010219.

Abstract:
Many different types of generative models for protein sequences have been proposed in the literature. Their uses include the prediction of mutational effects, protein design, and the prediction of structural properties. Neural network (NN) architectures have shown great performance, commonly attributed to their capacity to extract non-trivial higher-order interactions from the data. In this work, we analyze two different NN models and assess how close they are to simple pairwise distributions, which have been used in the past for similar problems. We present an approach for extracting pairwise models from more complex ones using an energy-based modeling framework. We show that, for the tested models, the extracted pairwise models can replicate the energies of the original models and are close in performance on tasks such as mutational effect prediction. In addition, we show that even simpler, factorized models often come close in performance to the original models.
5

Won, K. J., C. Saunders, and A. Prügel-Bennett. "Evolving Fisher Kernels for Biological Sequence Classification." Evolutionary Computation 21, no. 1 (March 2013): 83–105. http://dx.doi.org/10.1162/evco_a_00065.

Abstract:
Fisher kernels have been successfully applied to many problems in bioinformatics. However, their success depends on the quality of the generative model upon which they are built. For Fisher kernel techniques to be used on novel problems, a mechanism for creating accurate generative models is required. A novel framework is presented for automatically creating domain-specific generative models that can be used to produce Fisher kernels for support vector machines (SVMs) and other kernel methods. The framework enables the capture of prior knowledge and addresses the issue of domain-specific kernels, both of which are currently lacking in many kernel-based methods. To obtain the generative model, genetic algorithms are used to evolve the structure of hidden Markov models (HMMs). A Fisher kernel is subsequently created from the HMM and used in conjunction with an SVM to improve discriminative power. This paper investigates the effectiveness of the proposed method, named GA-SVM. We show that its performance is comparable to, if not better than, that of other state-of-the-art methods in classifying secretory protein sequences of malaria. More interestingly, it showed better results than the sequence-similarity-based approach in protein enzyme family classification, without the need for additional homologous sequence information. The experiments clearly demonstrate that GA-SVM is a novel way to find well-performing features in biological sequences that does not require extensive tuning of a complex model.
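For readers unfamiliar with the construction, a Fisher kernel is built by fitting a generative model, computing each sequence's Fisher score (the gradient of its log-likelihood with respect to the model parameters), and taking inner products of those scores as the SVM kernel. The toy sketch below substitutes an i.i.d. categorical model for the evolved HMM purely for illustration.

```python
# Hedged sketch of the Fisher-kernel construction, with a toy i.i.d.
# categorical model standing in for the paper's evolved HMMs.
import numpy as np

def fisher_score(seq, theta):
    """Gradient of the sequence log-likelihood w.r.t. softmax parameters theta."""
    p = np.exp(theta - theta.max())
    p /= p.sum()
    counts = np.bincount(seq, minlength=theta.size).astype(float)
    return counts - len(seq) * p   # d/d theta_k of sum_t log softmax(theta)[x_t]

def fisher_kernel(x, y, theta):
    """A simple (unwhitened) Fisher kernel: inner product of Fisher scores."""
    return fisher_score(x, theta) @ fisher_score(y, theta)

theta = np.zeros(4)   # uniform toy model over a 4-symbol alphabet
print(fisher_kernel(np.array([0, 1, 1]), np.array([1, 1, 2]), theta))
```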
6

Liu, Yitian, and Zhouhui Lian. "DeepCalliFont: Few-Shot Chinese Calligraphy Font Synthesis by Integrating Dual-Modality Generative Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 4 (March 24, 2024): 3774–82. http://dx.doi.org/10.1609/aaai.v38i4.28168.

Abstract:
Few-shot font generation, especially for Chinese calligraphy fonts, is a challenging and ongoing problem. With the help of prior knowledge that is mainly based on glyph consistency assumptions, some recently proposed methods can synthesize high-quality Chinese glyph images. However, glyphs in calligraphy font styles often do not meet these assumptions. To address this problem, we propose a novel model, DeepCalliFont, for few-shot Chinese calligraphy font synthesis by integrating dual-modality generative models. Specifically, the proposed model consists of image synthesis and sequence generation branches, generating consistent results via a dual-modality representation learning strategy. The two modalities (i.e., glyph images and writing sequences) are properly integrated using a feature recombination module and a rasterization loss function. Furthermore, a new pre-training strategy is adopted to improve the performance by exploiting large amounts of uni-modality data. Both qualitative and quantitative experiments have been conducted to demonstrate the superiority of our method to other state-of-the-art approaches in the task of few-shot Chinese calligraphy font synthesis. The source code can be found at https://github.com/lsflyt-pku/DeepCalliFont.
7

Safranchik, Esteban, Shiying Luo, and Stephen Bach. "Weakly Supervised Sequence Tagging from Noisy Rules." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 5570–78. http://dx.doi.org/10.1609/aaai.v34i04.6009.

Abstract:
We propose a framework for training sequence tagging models with weak supervision consisting of multiple heuristic rules of unknown accuracy. In addition to supporting rules that vote on tags in the output sequence, we introduce a new type of weak supervision, called linking rules, that vote on how sequence elements should be grouped into spans with the same tag. These rules are an alternative to candidate span generators that require significantly more human effort. To estimate the accuracies of the rules and combine their conflicting outputs into training data, we introduce a new type of generative model, linked hidden Markov models (linked HMMs), and prove they are generically identifiable (up to a tag permutation) without any observed training labels. We find that linked HMMs provide an average 7 F1 point boost on benchmark named entity recognition tasks versus generative models that assume the tags are i.i.d. Further, neural sequence taggers trained with these structure-aware generative models outperform comparable state-of-the-art approaches to weak supervision by an average of 2.6 F1 points.
8

Polceanu, Mihai, Julie Porteous, Alan Lindsay, and Marc Cavazza. "Narrative Plan Generation with Self-Supervised Learning." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 7 (May 18, 2021): 5984–92. http://dx.doi.org/10.1609/aaai.v35i7.16747.

Abstract:
Narrative Generation has attracted significant interest as a novel application of Automated Planning techniques. However, the vast amount of narrative material available opens the way to the use of Deep Learning techniques. In this paper, we explore the feasibility of narrative generation through self-supervised learning, using sequence embedding techniques or auto-encoders to produce narrative sequences. We use datasets of well-formed plots generated by a narrative planning approach, using pre-existing, published, narrative planning domains, to train generative models. Our experiments demonstrate the ability of generative sequence models to produce narrative plots with similar structure to those obtained with planning techniques, but with significant plot novelty in comparison with the training set. Most importantly, generated plots share structural properties associated with narrative quality measures used in Planning-based methods. As plan-based structures account for a higher level of causality and narrative consistency, this suggests that our approach is able to extend a set of narratives with novel sequences that display the same high-level narrative properties. Unlike methods developed to extend sets of textual narratives, ours operates at the level of plot structure. Thus, it has the potential to be used across various media for plots of significant complexity, being initially limited to training and generation operating in the same narrative genre.
9

Zhang, Zhiyuan, and Zhanshan Wang. "Multi-Objective Prediction of Integrated Energy System Using Generative Tractive Network." Mathematics 11, no. 20 (October 19, 2023): 4350. http://dx.doi.org/10.3390/math11204350.

Abstract:
Accurate load forecasting can bring economic benefits and scheduling optimization. The complexity and uncertainty arising from the coupling of different energy sources in integrated energy systems pose challenges for simultaneously predicting multiple target load sequences. Existing data-driven methods for load forecasting in integrated energy systems use multi-task learning to address these challenges. When determining the input data for multi-task learning, existing research primarily relies on data correlation analysis and considers the influence of external environmental factors in terms of feature engineering. However, such feature engineering methods do not exploit the characteristics of the multi-target sequences themselves. Language generation models trained on textual logic structures and other sequence features can generate synthetic data that can even be applied to self-training to improve model performance, which suggests an approach to feature engineering for data-driven time-series forecasting models. However, because time-series data differ from textual data, existing transformer-based language generation models cannot be directly applied to generating time-series data. To account for the characteristics of multi-target load sequences in integrated energy system load forecasting, this paper proposes a generative tractive network (GTN) model. By selectively utilizing appropriate autoregressive feature data for temporal data, the model facilitates feature mining from time-series data. It analyzes temporal data variations and generates novel synthetic time-series data that align with the intrinsic temporal patterns of the original sequences, producing synthetic samples that closely mimic the variations in the original time series. Finally, through the integration of the GTN and autoregressive feature data, various prediction models are employed in case studies to affirm the effectiveness of the proposed methodology.
10

Hawkins-Hooker, Alex, Florence Depardieu, Sebastien Baur, Guillaume Couairon, Arthur Chen, and David Bikard. "Generating functional protein variants with variational autoencoders." PLOS Computational Biology 17, no. 2 (February 26, 2021): e1008736. http://dx.doi.org/10.1371/journal.pcbi.1008736.

Abstract:
The vast expansion of protein sequence databases provides an opportunity for new protein design approaches which seek to learn the sequence-function relationship directly from natural sequence variation. Deep generative models trained on protein sequence data have been shown to learn biologically meaningful representations helpful for a variety of downstream tasks, but their potential for direct use in the design of novel proteins remains largely unexplored. Here we show that variational autoencoders trained on a dataset of almost 70000 luciferase-like oxidoreductases can be used to generate novel, functional variants of the luxA bacterial luciferase. We propose separate VAE models to work with aligned sequence input (MSA VAE) and raw sequence input (AR-VAE), and offer evidence that while both are able to reproduce patterns of amino acid usage characteristic of the family, the MSA VAE is better able to capture long-distance dependencies reflecting the influence of 3D structure. To confirm the practical utility of the models, we used them to generate variants of luxA whose luminescence activity was validated experimentally. We further showed that conditional variants of both models could be used to increase the solubility of luxA without disrupting function. Altogether 6/12 of the variants generated using the unconditional AR-VAE and 9/11 generated using the unconditional MSA VAE retained measurable luminescence, together with all 23 of the less distant variants generated by conditional versions of the models; the most distant functional variant contained 35 differences relative to the nearest training set sequence. These results demonstrate the feasibility of using deep generative models to explore the space of possible protein sequences and generate useful variants, providing a method complementary to rational design and directed evolution approaches.
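As a schematic of the approach, a minimal VAE over fixed-length aligned sequences can be written as below. The layer sizes, the 21-letter alphabet (20 amino acids plus gap), and all names are assumptions for illustration, not the authors' MSA VAE architecture.

```python
# Hedged sketch: a minimal VAE over one-hot-encoded, fixed-length protein
# sequences, returning the negative ELBO (illustrative, not the paper's model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqVAE(nn.Module):
    def __init__(self, length, alphabet=21, latent=32, hidden=256):
        super().__init__()
        self.length, self.alphabet = length, alphabet
        self.enc = nn.Sequential(nn.Flatten(),
                                 nn.Linear(length * alphabet, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                 nn.Linear(hidden, length * alphabet))

    def forward(self, x):              # x: (batch, length, alphabet), one-hot
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        logits = self.dec(z).view(-1, self.length, self.alphabet)
        recon = F.cross_entropy(logits.transpose(1, 2), x.argmax(-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon + kl              # negative ELBO, up to constants
```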
11

Tang, Fangfang, Mengyuan Ren, Xiaofan Li, Zhanglin Lin, and Xiaofeng Yang. "Generating Novel and Soluble Class II Fructose-1,6-Bisphosphate Aldolase with ProteinGAN." Catalysts 13, no. 12 (November 22, 2023): 1457. http://dx.doi.org/10.3390/catal13121457.

Abstract:
Fructose-1,6-bisphosphate aldolase (FBA) is an important enzyme involved in central carbon metabolism (CCM) with promising industrial applications. Artificial intelligence models like generative adversarial networks (GANs) can design novel sequences that differ from natural ones. To expand the sequence space of FBA, we applied the generative adversarial network model ProteinGAN to the de novo design of FBA in this study. First, we corroborated the viability of the ProteinGAN model by replicating the generation of functional MDH variants. The model was then applied to the design of class II FBA. Computational analysis showed that the model successfully captured features of natural class II FBA sequences while expanding sequence diversity. Experimental results validated soluble expression and activity for the generated FBAs. Among the 20 generated FBA sequences (identity ranging from 85% to 99% with the closest natural FBA sequences), 4 were successfully expressed as soluble proteins in E. coli, and 2 of these 4 were functional. We further proposed a filter based on sequence identity to the endogenous FBA of E. coli and reselected 10 sequences (sequence identity ranging from 85% to 95%). Among them, six were successfully expressed as soluble proteins, and five of these six were functional, a significant improvement over the previous results. Furthermore, one generated FBA exhibited activity 1.69-fold that of the control FBA. This study demonstrates that enzyme design with GANs can generate functional protein variants with enhanced performance and unique sequences.
12

Lu, Zhengdong, Todd K. Leen, and Jeffrey Kaye. "Kernels for Longitudinal Data with Variable Sequence Length and Sampling Intervals." Neural Computation 23, no. 9 (September 2011): 2390–420. http://dx.doi.org/10.1162/neco_a_00164.

Abstract:
We develop several kernel methods for classification of longitudinal data and apply them to detect cognitive decline in the elderly. We first develop mixed-effects models, a type of hierarchical empirical Bayes generative model, for the time series. After demonstrating their utility in likelihood ratio classifiers (and the improvement over standard regression models for such classifiers), we develop novel Fisher kernels based on mixtures of mixed-effects models and use them in support vector machine classifiers. The hierarchical generative model allows us to handle variations in sequence length and sampling interval gracefully. We also give nonparametric kernels not based on generative models but rather on the reproducing kernel Hilbert space. We apply the methods to detecting cognitive decline from longitudinal clinical data on motor and neuropsychological tests. The likelihood ratio classifiers based on the neuropsychological tests perform better than classifiers based on the motor behavior. Discriminant classifiers performed better than likelihood ratio classifiers for the motor behavior tests.
13

Philip, Philemon, and Sidra Minhas. "A Brief Survey on Natural Language Processing Based Text Generation and Evaluation Techniques." VFAST Transactions on Software Engineering 10, no. 3 (September 27, 2022): 24–36. http://dx.doi.org/10.21015/vtse.v10i3.1104.

Abstract:
Text generation is a pressing topic in Natural Language Processing that involves the prediction of upcoming text. Applications like auto-complete, chatbots, auto-correct, and many others use text generation to meet certain communicative requirements, but more accurate text generation methods are needed to encapsulate all possibilities of natural language communication. In this survey, we present cutting-edge methods being adopted for text generation. These methods are divided into three broad categories: 1) sequence-to-sequence models (Seq2Seq), 2) generative adversarial networks (GANs), and 3) miscellaneous. Sequence-to-sequence models involve supervised methods, while GANs are unsupervised, aimed at reducing the dependence of models on training data. We then list a few other text generation methods and summarize evaluation metrics available for text generation along with their performance.
14

Bitard-Feildel, Tristan. "Navigating the amino acid sequence space between functional proteins using a deep learning framework." PeerJ Computer Science 7 (September 17, 2021): e684. http://dx.doi.org/10.7717/peerj-cs.684.

Abstract:
Motivation: Shedding light on the relationships between protein sequences and functions is a challenging task with many implications in protein evolution, disease understanding, and protein design. The mapping of the protein sequence space to specific functions is, however, hard to comprehend due to its complexity. Generative models help to decipher complex systems thanks to their ability to learn and recreate data specificity. Applied to proteins, they can capture the sequence patterns associated with functions and point out important relationships between sequence positions. By learning these dependencies between sequences and functions, they can ultimately be used to generate new sequences and navigate through uncharted areas of molecular evolution. Results: This study presents an Adversarial Auto-Encoder (AAE) approach, an unsupervised generative model, to generate new protein sequences. AAEs are tested on three protein families known for their multiple functions: the sulfatase, HUP, and TPP families. Clustering results on the encoded sequences from the latent space computed by AAEs display a high level of homogeneity with respect to protein sequence function. The study also reports and analyzes, for the first time, two sampling strategies based on latent space interpolation and latent space arithmetic to generate intermediate protein sequences sharing sequential properties of the original sequences linked to known functional properties from different families and functions. Sequences generated by interpolation between latent space data points demonstrate the ability of the AAE to generalize and produce meaningful biological sequences from an evolutionarily uncharted area of the biological sequence space. Finally, 3D structure models computed by comparative modelling using generated sequences and templates of different sub-families point to the ability of latent space arithmetic to successfully transfer protein sequence properties linked to function between different sub-families. All in all, this study confirms the ability of deep learning frameworks to model biological complexity and brings new tools to explore amino acid sequence and functional spaces.
15

Ramakers, Julius, Christopher Frederik Blum, Sabrina König, Stefan Harmeling, and Markus Kollmann. "De novo prediction of RNA 3D structures with deep generative models." PLOS ONE 19, no. 2 (February 15, 2024): e0297105. http://dx.doi.org/10.1371/journal.pone.0297105.

Abstract:
We present a Deep Learning approach to predict 3D folding structures of RNAs from their nucleic acid sequence. Our approach combines an autoregressive Deep Generative Model, Monte Carlo Tree Search, and a score model to find and rank the most likely folding structures for a given RNA sequence. We show that RNA de novo structure prediction by deep learning is possible at atom resolution, despite the low number of experimentally measured structures that can be used for training. We confirm the predictive power of our approach by achieving competitive results in a retrospective evaluation of the RNA-Puzzles prediction challenges, without using structural contact information from multiple sequence alignments or additional data from chemical probing experiments. Blind predictions for recent RNA-Puzzle challenges under the name “Dfold” further support the competitive performance of our approach.
16

Hazra, Debapriya, Mi-Ryung Kim, and Yung-Cheol Byun. "Generative Adversarial Networks for Creating Synthetic Nucleic Acid Sequences of Cat Genome." International Journal of Molecular Sciences 23, no. 7 (March 28, 2022): 3701. http://dx.doi.org/10.3390/ijms23073701.

Abstract:
Nucleic acids are the basic units of deoxyribonucleic acid (DNA) sequencing. Every organism demonstrates different DNA sequences with specific nucleotides. It reveals the genetic information carried by a particular DNA segment. Nucleic acid sequencing expresses the evolutionary changes among organisms and revolutionizes disease diagnosis in animals. This paper proposes a generative adversarial networks (GAN) model to create synthetic nucleic acid sequences of the cat genome tuned to exhibit specific desired properties. We obtained the raw sequence data from Illumina next generation sequencing. Various data preprocessing steps were performed using Cutadapt and DADA2 tools. The processed data were fed to the GAN model that was designed following the architecture of Wasserstein GAN with gradient penalty (WGAN-GP). We introduced a predictor and an evaluator in our proposed GAN model to tune the synthetic sequences to acquire certain realistic properties. The predictor was built for extracting samples with a promoter sequence, and the evaluator was built for filtering samples that scored high for motif-matching. The filtered samples were then passed to the discriminator. We evaluated our model based on multiple metrics and demonstrated outputs for latent interpolation, latent complementation, and motif-matching. Evaluation results showed our proposed GAN model achieved 93.7% correlation with the original data and produced significant outcomes as compared to existing models for sequence generation.
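For context, the WGAN-GP objective mentioned above stabilizes training by penalizing the critic's gradient norm on random interpolates between real and generated samples. The sketch below shows that penalty for a hypothetical critic over one-hot sequence tensors; it illustrates the standard technique, not this paper's code.

```python
# Hedged sketch of the standard WGAN-GP gradient penalty (not the paper's code).
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    """real, fake: (batch, length, alphabet) tensors; critic: hypothetical module."""
    eps = torch.rand(real.size(0), 1, 1, device=real.device)   # per-sample mix
    mix = (eps * real + (1 - eps) * fake).requires_grad_(True)
    score = critic(mix).sum()
    grad, = torch.autograd.grad(score, mix, create_graph=True)
    # Push the critic's gradient norm toward 1 on the interpolates.
    return lam * ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
```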
17

Zhang, Zhaohui, Lijun Yang, Ligong Chen, Qiuwen Liu, Ying Meng, Pengwei Wang, and Maozhen Li. "A generative adversarial network–based method for generating negative financial samples." International Journal of Distributed Sensor Networks 16, no. 2 (February 2020): 155014772090705. http://dx.doi.org/10.1177/1550147720907053.

Abstract:
In the financial anti-fraud field, negative samples are few and sparse, creating a serious class-imbalance problem. Generating negative samples consistent with the original data is a natural way to address this imbalance, but doing so is difficult. This article proposes a new method to solve this problem. We introduce a new generation model that combines a Generative Adversarial Network with a Long Short-Term Memory network for one-dimensional negative financial samples. The characteristic associations between transaction sequences can be learned by the long short-term memory layer, and the generator covers the real data distribution via the adversarial discriminator with time-sequence information. Mapping data distributions to feature space is a common evaluation method for synthetic data; however, relationships between data attributes have been ignored in online transactions. We define a comprehensive evaluation method to evaluate the validity of generated samples in terms of both data distribution and attribute characteristics. Experimental results on real bank B2B transaction data show that the proposed model has higher overall ratings, 10% higher than traditional generation models. Finally, the well-trained model is used to generate negative samples and form a new dataset. Classification results on the new dataset show that precision and recall are both higher than baseline models. Our work has practical value and provides a new idea for solving imbalance problems in other fields.
18

Shen, Xiaojuan, Tao Zeng, Nianhang Chen, Jiabo Li, and Ruibo Wu. "NIMO: A Natural Product-Inspired Molecular Generative Model Based on Conditional Transformer." Molecules 29, no. 8 (April 19, 2024): 1867. http://dx.doi.org/10.3390/molecules29081867.

Abstract:
Natural products (NPs) have diverse biological activity and significant medicinal value. The structural diversity of NPs is the mainstay of drug discovery. Expanding the chemical space of NPs is an urgent need. Inspired by the concept of fragment-assembled pseudo-natural products, we developed a computational tool called NIMO, which is based on the transformer neural network model. NIMO employs two tailor-made motif extraction methods to map a molecular graph into a semantic motif sequence. All these generated motif sequences are used to train our molecular generative models. Various NIMO models were trained under different task scenarios by recognizing syntactic patterns and structure–property relationships. We further explored the performance of NIMO in structure-guided, activity-oriented, and pocket-based molecule generation tasks. Our results show that NIMO had excellent performance for molecule generation from scratch and structure optimization from a scaffold.
19

Xuan, Bona, Jin Li, and Yafei Song. "SFCWGAN-BiTCN with Sequential Features for Malware Detection." Applied Sciences 13, no. 4 (February 5, 2023): 2079. http://dx.doi.org/10.3390/app13042079.

Abstract:
In the field of adversarial attacks, the generative adversarial network (GAN) has shown strong performance. There have been few studies applying it to malware sample supplementation, due to the complexity of handling discrete data. More importantly, unbalanced malware family samples interfere with the analytical power of malware detection models and mislead malware classification. To address the impact of malware family imbalance on accuracy, a selection feature conditional Wasserstein generative adversarial network (SFCWGAN) and a bidirectional temporal convolutional network (BiTCN) are proposed. First, we extract the features of malware Opcode and API sequences and use Word2Vec to represent them, emphasizing the semantic logic between API calling and Opcode calling sequences. Second, the Spearman correlation coefficient and the whale optimization algorithm extreme gradient boosting (WOA-XGBoost) algorithm are combined to select features, filter out invalid features, and simplify the structure. Finally, we propose a GAN-based sequence feature generation algorithm. Samples were generated using the conditional Wasserstein generative adversarial network (CWGAN) on the imbalanced malware family dataset, added to the training set to supplement the samples, and used to train BiTCN. In tests on the Kaggle and DataCon datasets, the model achieved detection accuracies of 99.56% and 96.93%, respectively, which were 0.18% and 2.98% higher than those of other methods.
20

Truong, Thanh-Dat, Chi Nhan Duong, Minh-Triet Tran, Ngan Le, and Khoa Luu. "Fast Flow Reconstruction via Robust Invertible n × n Convolution." Future Internet 13, no. 7 (July 8, 2021): 179. http://dx.doi.org/10.3390/fi13070179.

Abstract:
Flow-based generative models have recently become one of the most efficient approaches to model data generation. Indeed, they are constructed with a sequence of invertible and tractable transformations. Glow first introduced a simple type of generative flow using an invertible 1×1 convolution. However, the 1×1 convolution suffers from limited flexibility compared to the standard convolutions. In this paper, we propose a novel invertible n×n convolution approach that overcomes the limitations of the invertible 1×1 convolution. In addition, our proposed network is not only tractable and invertible but also uses fewer parameters than standard convolutions. The experiments on CIFAR-10, ImageNet and Celeb-HQ datasets, have shown that our invertible n×n convolution helps to improve the performance of generative models significantly.
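For reference, the invertible 1×1 convolution from Glow that this paper generalizes can be sketched as follows: a learned channel-mixing matrix whose log-determinant enters the change-of-variables likelihood. This is an illustrative sketch, not the authors' n×n construction.

```python
# Hedged sketch of Glow's invertible 1x1 convolution (the baseline the paper
# above generalizes), with its log-determinant contribution.
import torch
import torch.nn as nn

class Invertible1x1Conv(nn.Module):
    def __init__(self, channels):
        super().__init__()
        w, _ = torch.linalg.qr(torch.randn(channels, channels))  # orthogonal init
        self.W = nn.Parameter(w)

    def forward(self, x):                    # x: (batch, C, H, W)
        _, _, h, w = x.shape
        z = torch.einsum('ij,bjhw->bihw', self.W, x)
        logdet = h * w * torch.slogdet(self.W)[1]   # added to log p(x)
        return z, logdet

    def inverse(self, z):
        return torch.einsum('ij,bjhw->bihw', torch.inverse(self.W), z)
```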
21

You, Yuyang, Xiaoyu Guo, Xuyang Zhong, and Zhihong Yang. "A Few-Shot Learning-Based EEG and Stage Transition Sequence Generator for Improving Sleep Staging Performance." Biomedicines 10, no. 12 (November 22, 2022): 3006. http://dx.doi.org/10.3390/biomedicines10123006.

Abstract:
In this study, generative adversarial networks named SleepGAN are proposed to expand the training set for automatic sleep stage classification tasks by generating both electroencephalogram (EEG) epochs and sequence relationships of sleep stages. In order to reach high accuracy, most existing classification methods require substantial amounts of training data, but obtaining such quantities of real EEG epochs is expensive and time-consuming. We introduce few-shot learning, which is a method of training a GAN using a very small set of training data. This paper presents progressive Wasserstein divergence generative adversarial networks (GANs) and a relational memory generator to generate EEG epochs and stage transition sequences, respectively. For the evaluation of our generated data, we use single-channel EEGs from the public dataset Sleep-EDF. The addition of our augmented data and sequence to the training set was shown to improve the performance of the classification model. The accuracy of the model increased by approximately 1% after incorporating generated EEG epochs. Adding both the augmented data and sequence to the training set resulted in a further increase of 3%, from the original accuracy of 79.40% to 83.06%. The result proves that SleepGAN is a set of GANs capable of generating realistic EEG epochs and transition sequences under the condition of insufficient training data and can be used to enlarge the training dataset and improve the performance of sleep stage classification models in clinical practice.
22

Wang, Zhenyi, Ping Yu, Yang Zhao, Ruiyi Zhang, Yufan Zhou, Junsong Yuan, and Changyou Chen. "Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 12281–88. http://dx.doi.org/10.1609/aaai.v34i07.6911.

Abstract:
Human-motion generation is a long-standing challenging task due to the requirement of accurately modeling complex and diverse dynamic patterns. Most existing methods adopt sequence models such as RNN to directly model transitions in the original action space. Due to high dimensionality and potential noise, such modeling of action transitions is particularly challenging. In this paper, we focus on skeleton-based action generation and propose to model smooth and diverse transitions on a latent space of action sequences with much lower dimensionality. Conditioned on a latent sequence, actions are generated by a frame-wise decoder shared by all latent action-poses. Specifically, an implicit RNN is defined to model smooth latent sequences, whose randomness (diversity) is controlled by noise from the input. Different from standard action-prediction methods, our model can generate action sequences from pure noise without any conditional action poses. Remarkably, it can also generate unseen actions from mixed classes during training. Our model is learned with a bi-directional generative-adversarial-net framework, which can not only generate diverse action sequences of a particular class or mix classes, but also learns to classify action sequences within the same model. Experimental results show the superiority of our method in both diverse action-sequence generation and classification, relative to existing methods.
23

Kim, Ha Young, and Dongsup Kim. "Prediction of mutation effects using a deep temporal convolutional network." Bioinformatics 36, no. 7 (November 20, 2019): 2047–52. http://dx.doi.org/10.1093/bioinformatics/btz873.

Abstract:
Motivation: Accurate prediction of the effects of genetic variation is a major goal in biological research. Towards this goal, numerous machine learning models have been developed to learn information from evolutionary sequence data. The most effective method so far is a deep generative model based on the variational autoencoder (VAE) that models the distributions using a latent variable. In this study, we propose a deep autoregressive generative model named mutationTCN, which employs dilated causal convolutions and an attention mechanism for the modeling of inter-residue correlations in a biological sequence. Results: We show that this model is competitive with the VAE model when tested against a set of 42 high-throughput mutation scan experiments, with a mean improvement in Spearman rank correlation of ∼0.023. In particular, our model can more efficiently capture information from multiple sequence alignments with a lower effective number of sequences, such as viral sequence families, compared with the latent variable model. We also extend this architecture to a semi-supervised learning framework, which shows high prediction accuracy. We show that our model enables direct optimization of the data likelihood and allows for a simple and stable training process. Availability and implementation: Source code is available at https://github.com/ha01994/mutationTCN. Supplementary information: Supplementary data are available at Bioinformatics online.
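The dilated causal convolutions at the core of mutationTCN can be sketched as below: left-only padding keeps position t blind to later positions, and stacking layers with growing dilation widens the receptive field. Channel counts and sizes are illustrative assumptions, not the published architecture.

```python
# Hedged sketch of a dilated causal 1D convolution of the kind TCNs stack.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    def __init__(self, channels, kernel_size=3, dilation=2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation       # pad on the left only
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                             # x: (batch, C, T)
        return self.conv(F.pad(x, (self.pad, 0)))     # position t sees only <= t

x = torch.randn(1, 8, 20)
print(CausalConv1d(8)(x).shape)    # same length as the input: (1, 8, 20)
```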
24

Härkönen, Erik, Miika Aittala, Tuomas Kynkäänniemi, Samuli Laine, Timo Aila, and Jaakko Lehtinen. "Disentangling random and cyclic effects in time-lapse sequences." ACM Transactions on Graphics 41, no. 4 (July 2022): 1–13. http://dx.doi.org/10.1145/3528223.3530170.

Abstract:
Time-lapse image sequences offer visually compelling insights into dynamic processes that are too slow to observe in real time. However, playing a long time-lapse sequence back as a video often results in distracting flicker due to random effects, such as weather, as well as cyclic effects, such as the day-night cycle. We introduce the problem of disentangling time-lapse sequences in a way that allows separate, after-the-fact control of overall trends, cyclic effects, and random effects in the images, and describe a technique based on data-driven generative models that achieves this goal. This enables us to "re-render" the sequences in ways that would not be possible with the input images alone. For example, we can stabilize a long sequence to focus on plant growth over many months, under selectable, consistent weather. Our approach is based on Generative Adversarial Networks (GAN) that are conditioned with the time coordinate of the time-lapse sequence. Our architecture and training procedure are designed so that the networks learn to model random variations, such as weather, using the GAN's latent space, and to disentangle overall trends and cyclic variations by feeding the conditioning time label to the model using Fourier features with specific frequencies. We show that our models are robust to defects in the training data, enabling us to amend some of the practical difficulties in capturing long time-lapse sequences, such as temporary occlusions, uneven frame spacing, and missing frames.
25

Wilburn, Grey W., and Sean R. Eddy. "Remote homology search with hidden Potts models." PLOS Computational Biology 16, no. 11 (November 30, 2020): e1008085. http://dx.doi.org/10.1371/journal.pcbi.1008085.

Abstract:
Most methods for biological sequence homology search and alignment work with primary sequence alone, neglecting higher-order correlations. Recently, statistical physics models called Potts models have been used to infer all-by-all pairwise correlations between sites in deep multiple sequence alignments, and these pairwise couplings have improved 3D structure predictions. Here we extend the use of Potts models from structure prediction to sequence alignment and homology search by developing what we call a hidden Potts model (HPM), which merges a Potts emission process with a generative probability model of insertion and deletion. Because an HPM is incompatible with efficient dynamic programming alignment algorithms, we develop an approximate algorithm based on importance sampling, using simpler probabilistic models as proposal distributions. We test an HPM implementation on RNA structure homology search benchmarks, where we can compare directly to exact alignment methods that capture nested RNA base-pairing correlations (stochastic context-free grammars). HPMs perform promisingly in these proof-of-principle experiments.
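For reference, the Potts model underlying the HPM's emission process assigns an aligned sequence x = (x_1, ..., x_L) an energy built from per-site fields and pairwise couplings; in the standard direct-coupling-analysis notation (the paper's own parameterization may differ):

```latex
E(x) = -\sum_{i=1}^{L} h_i(x_i) \;-\; \sum_{1 \le i < j \le L} J_{ij}(x_i, x_j),
\qquad
P(x) = \frac{e^{-E(x)}}{Z}, \qquad Z = \sum_{x'} e^{-E(x')} .
```

The pairwise couplings J_ij are what a profile HMM lacks, and they are also what makes exact dynamic-programming alignment intractable, motivating the importance-sampling approximation described above.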
26

Hu, Yijia. "Performance exploration of Generative Pre-trained Transformer-2 for lyrics generation." Applied and Computational Engineering 48, no. 1 (March 19, 2024): 53–60. http://dx.doi.org/10.54254/2755-2721/48/20241154.

Abstract:
In recent years, the field of Natural Language Processing (NLP) has undergone a revolution, with text generation playing a key role in this transformation. This shift is not limited to technological areas but has also seamlessly penetrated creative domains, with a prime example being the generation of song lyrics. To be truly effective, generative models, like Generative Pre-trained Transformer (GPT)-2, require fine-tuning as a crucial step. This paper, utilizing the robustness of the widely-referenced Kaggle dataset titled "Song Lyrics", carefully explores the impacts of modulating three key parameters: learning rate, batch size, and sequence length. The dataset presents a compelling narrative that highlights the learning rate as the most influential determinant, directly impacting the quality and coherence of the lyrics generated. While increasing the batch size and extending sequence lengths promise enhanced model performance, it is evident that there is a saturation point beyond which further benefits are limited. Through this exploration, the paper aims to demystify the complex world of model calibration and emphasize the importance of strategic parameter selection in pursuit of lyrical excellence.
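A minimal sketch of the kind of fine-tuning run the study varies is shown below, using the Hugging Face Transformers Trainer; the corpus, hyperparameter values, and output path are placeholders, not the paper's settings.

```python
# Hedged sketch: fine-tuning GPT-2 on lyric text while exposing the three knobs
# the study varies (learning rate, batch size, sequence length). Placeholders only.
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token        # GPT-2 ships without a pad token
model = GPT2LMHeadModel.from_pretrained("gpt2")

lyrics = ["placeholder lyric line one", "placeholder lyric line two"]  # stand-in corpus
dataset = [tokenizer(t, truncation=True, max_length=64) for t in lyrics]  # sequence length

args = TrainingArguments(
    output_dir="gpt2-lyrics",
    learning_rate=5e-5,                # the study's most influential parameter
    per_device_train_batch_size=8,     # batch size
    num_train_epochs=1,
)
Trainer(model=model, args=args, train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)).train()
```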
27

Nguyen, Viet, Giang Vu, Tung Nguyen Thanh, Khoat Than, and Toan Tran. "On Inference Stability for Diffusion Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 13 (March 24, 2024): 14449–56. http://dx.doi.org/10.1609/aaai.v38i13.29359.

Abstract:
Denoising Probabilistic Models (DPMs) represent an emerging domain of generative models that excel in generating diverse and high-quality images. However, most current training methods for DPMs often neglect the correlation between timesteps, limiting the model's performance in generating images effectively. Notably, we theoretically point out that this issue can be caused by the cumulative estimation gap between the predicted and the actual trajectory. To minimize that gap, we propose a novel sequence-aware loss that aims to reduce the estimation gap to enhance the sampling quality. Furthermore, we theoretically show that our proposed loss function is a tighter upper bound of the estimation loss in comparison with the conventional loss in DPMs. Experimental results on several benchmark datasets including CIFAR10, CelebA, and CelebA-HQ consistently show a remarkable improvement of our proposed method regarding the image generalization quality measured by FID and Inception Score compared to several DPM baselines. Our code and pre-trained checkpoints are available at https://github.com/VinAIResearch/SA-DPM.
28

Li, Shuyu, and Yunsick Sung. "Transformer-Based Seq2Seq Model for Chord Progression Generation." Mathematics 11, no. 5 (February 23, 2023): 1111. http://dx.doi.org/10.3390/math11051111.

Abstract:
Machine learning is widely used in various practical applications, with deep learning models demonstrating advantages in handling huge datasets. Treating music as a special language and using deep learning models to accomplish melody recognition, music generation, and music analysis has proven feasible. In some music-related deep learning research, recurrent neural networks have been replaced with transformers, achieving significant results, whereas traditional approaches with recurrent neural networks limit the length of input sequences. This paper proposes a method to generate chord progressions for melodies using a transformer-based sequence-to-sequence model, which is divided into a pre-trained encoder and a decoder. The pre-trained encoder extracts contextual information from melodies, whereas the decoder uses this information to produce chords asynchronously and finally outputs chord progressions. The proposed method addresses the length limitation issue while considering the harmony between chord progressions and melodies, so chord progressions can be generated for melodies in practical music composition applications. Evaluation experiments were conducted using the proposed method and three baseline models: bidirectional long short-term memory (BLSTM), bidirectional encoder representations from transformers (BERT), and the generative pre-trained transformer (GPT-2). The proposed method outperformed the baseline models in Hits@k (k = 1) by 25.89%, 1.54%, and 2.13%, respectively.
29

ZAKI, NAZAR M., SAFAAI DERIS, and ROSLI M. ILLIAS. "FEATURES EXTRACTION FOR PROTEIN HOMOLOGY DETECTION USING HIDDEN MARKOV MODELS COMBINING SCORES." International Journal of Computational Intelligence and Applications 04, no. 01 (March 2004): 1–12. http://dx.doi.org/10.1142/s1469026804001161.

Abstract:
A few years back, Jaakkola and Haussler published a method combining generative and discriminative approaches for detecting protein homologies. The method was a variant of support vector machines using a new kernel function called the Fisher kernel. They begin by training a generative hidden Markov model for a protein family. Then, using the model, they derive a vector of features, called Fisher scores, assigned to the sequence, and use a support vector machine in conjunction with the Fisher scores for protein homology detection. In this paper, we revisit the idea of using a discriminative approach, and in particular support vector machines, for protein homology detection. However, in place of the Fisher scoring method, we present a new Hidden Markov Model Combining Scores approach. Six scoring algorithms are combined as a way of extracting features from a protein sequence. Experiments show that our method improves on previous methods for homology detection of protein domains.
30

Isacchini, Giulio, Aleksandra M. Walczak, Thierry Mora, and Armita Nourmohammad. "Deep generative selection models of T and B cell receptor repertoires with soNNia." Proceedings of the National Academy of Sciences 118, no. 14 (April 1, 2021): e2023141118. http://dx.doi.org/10.1073/pnas.2023141118.

Abstract:
Subclasses of lymphocytes carry different functional roles and work together to produce an immune response and lasting immunity. In addition to these functional roles, T and B cell lymphocytes rely on the diversity of their receptor chains to recognize different pathogens. The lymphocyte subclasses emerge from common ancestors generated with the same diversity of receptors during selection processes. Here, we combine biophysical models of receptor generation with machine learning models of selection to identify specific sequence features characteristic of functional lymphocyte repertoires and subrepertoires. Specifically, using only repertoire-level sequence information, we classify CD4+ and CD8+ T cells, find correlations between receptor chains arising during selection, and identify T cell subsets that are targets of pathogenic epitopes. We also show examples of when simple linear classifiers do as well as more complex machine learning methods.
31

Wang, Chuantao, Xuexin Yang, and Linkai Ding. "Imbalanced sentiment classification based on sequence generative adversarial nets." Journal of Intelligent & Fuzzy Systems 39, no. 5 (November 19, 2020): 7909–19. http://dx.doi.org/10.3233/jifs-201370.

Abstract:
The purpose of sentiment classification is to automatically judge sentiment tendency. In sentiment classification tasks on text data (such as online reviews), traditional deep learning models focus on algorithm optimization but ignore the imbalanced distribution of the number of samples in each class, which causes the classification performance of the model to decrease in practical applications. In this paper, the experiment is divided into two stages. In the first stage, samples of the minority class are used to train a sequence generative adversarial net, so that it can learn the features of the minority-class samples in depth. In the second stage, the trained generator of the sequence generative adversarial net is used to generate synthetic minority-class samples, which are mixed with the original samples to balance the sample distribution. The mixed samples are then input into the deep sentiment classification model to complete model training. Experimental results show that the model has excellent classification performance compared with a variety of deep learning models based on classic imbalanced learning methods in the sentiment classification task of hotel reviews.
32

Stokes, James, and John Terilla. "Probabilistic Modeling with Matrix Product States." Entropy 21, no. 12 (December 17, 2019): 1236. http://dx.doi.org/10.3390/e21121236.

Abstract:
Inspired by the possibility that generative models based on quantum circuits can provide a useful inductive bias for sequence modeling tasks, we propose an efficient training algorithm for a subset of classically simulable quantum circuit models. The gradient-free algorithm, presented as a sequence of exactly solvable effective models, is a modification of the density matrix renormalization group procedure adapted for learning a probability distribution. The conclusion that circuit-based models offer a useful inductive bias for classical datasets is supported by experimental results on the parity learning problem.
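To make the construction concrete: a matrix product state assigns each sequence an amplitude by contracting one matrix per symbol, and the Born rule squares that amplitude to give an unnormalized probability. The toy sketch below shows only the evaluation step, with illustrative shapes and boundary vectors, not the paper's DMRG-style training procedure.

```python
# Hedged sketch: evaluating an (unnormalized) MPS/Born-machine probability.
import numpy as np

def mps_unnormalized_prob(tensors, seq):
    """tensors[t]: (alphabet, D, D) array per site; seq: list of symbol indices."""
    v = np.ones(tensors[0].shape[1])     # left boundary vector of bond dimension D
    for A, s in zip(tensors, seq):
        v = v @ A[s]                     # contract the matrix chosen by symbol s
    amp = v.sum()                        # close with an all-ones right boundary
    return amp ** 2                      # Born rule: probability ~ amplitude^2

sites = [np.random.rand(2, 4, 4) for _ in range(6)]   # 6 binary sites, D = 4
print(mps_unnormalized_prob(sites, [0, 1, 1, 0, 1, 0]))
```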
33

Wang, Xun, Changnan Gao, Peifu Han, Xue Li, Wenqi Chen, Alfonso Rodríguez Patón, Shuang Wang, and Pan Zheng. "PETrans: De Novo Drug Design with Protein-Specific Encoding Based on Transfer Learning." International Journal of Molecular Sciences 24, no. 2 (January 6, 2023): 1146. http://dx.doi.org/10.3390/ijms24021146.

Abstract:
Recent years have seen tremendous success in the design of novel drug molecules through deep generative models. Nevertheless, existing methods only generate drug-like molecules, which require additional structural optimization to be developed into actual drugs. In this study, a deep learning method for generating target-specific ligands was proposed. This method is useful when the dataset for target-specific ligands is limited. Deep learning methods can extract and learn features (representations) in a data-driven way with little or no human participation. Generative pretraining (GPT) was used to extract the contextual features of the molecule. Three different protein-encoding methods were used to extract the physicochemical properties and amino acid information of the target protein. Protein-encoding and molecular sequence information are combined to guide molecule generation. Transfer learning was used to fine-tune the pretrained model to generate molecules with better binding ability to the target protein. The model was validated using three different targets. The docking results show that our model is capable of generating new molecules with higher docking scores for the target proteins.
34

Wu, Shaohan, Jingfeng Xue, Yong Wang, and Zixiao Kong. "Black-Box Evasion Attack Method Based on Confidence Score of Benign Samples." Electronics 12, no. 11 (May 23, 2023): 2346. http://dx.doi.org/10.3390/electronics12112346.

Abstract:
Recently, malware detection models based on deep learning have gradually replaced manual analysis as the first line of defense for anti-malware systems. However, it has been shown that these models are vulnerable to a specific class of inputs called adversarial examples. It is possible to evade the detection model by adding some carefully crafted tiny perturbations to the malicious samples without changing the sample functions. Most of the adversarial example generation methods ignore the information contained in the detection results of benign samples from detection models. Our method extracts sequence fragments called benign payload from benign samples based on detection results and uses an RNN generative model to learn benign features embedded in these sequences. Then, we use the end of the original malicious sample as input to generate an adversarial perturbation that reduces the malicious probability of the sample and append it to the end of the sample to generate an adversarial sample. According to different adversarial scenarios, we propose two different generation strategies, which are the one-time generation method and the iterative generation method. Under different query times and append scale constraints, the maximum evasion success rate can reach 90.8%.
35

Russ, William P., Matteo Figliuzzi, Christian Stocker, Pierre Barrat-Charlaix, Michael Socolich, Peter Kast, Donald Hilvert, et al. "An evolution-based model for designing chorismate mutase enzymes." Science 369, no. 6502 (July 23, 2020): 440–45. http://dx.doi.org/10.1126/science.aba3304.

Abstract:
The rational design of enzymes is an important goal for both fundamental and practical reasons. Here, we describe a process to learn the constraints for specifying proteins purely from evolutionary sequence data, design and build libraries of synthetic genes, and test them for activity in vivo using a quantitative complementation assay. For chorismate mutase, a key enzyme in the biosynthesis of aromatic amino acids, we demonstrate the design of natural-like catalytic function with substantial sequence diversity. Further optimization focuses the generative model toward function in a specific genomic context. The data show that sequence-based statistical models suffice to specify proteins and provide access to an enormous space of functional sequences. This result provides a foundation for a general process for evolution-based design of artificial proteins.
36

Chen, Yunjie, Marius Staring, Olaf M. Neve, Stephan R. Romeijn, Erik F. Hensen, Berit M. Verbist, Jelmer M. Wolterink, and Qian Tao. "CoNeS: Conditional neural fields with shift modulation for multi-sequence MRI translation." Machine Learning for Biomedical Imaging 2, Generative Models (February 9, 2024): 657–85. http://dx.doi.org/10.59275/j.melba.2024-d61g.

Abstract:
Multi-sequence magnetic resonance imaging (MRI) has found wide applications in both modern clinical studies and deep learning research. However, in clinical practice, it frequently occurs that one or more of the MRI sequences are missing due to different image acquisition protocols or contrast agent contraindications of patients, limiting the utilization of deep learning models trained on multi-sequence data. One promising approach is to leverage generative models to synthesize the missing sequences, which can serve as a surrogate acquisition. State-of-the-art methods tackling this problem are based on convolutional neural networks (CNN) which usually suffer from spectral biases, resulting in poor reconstruction of high-frequency fine details. In this paper, we propose Conditional Neural fields with Shift modulation (CoNeS), a model that takes voxel coordinates as input and learns a representation of the target images for multi-sequence MRI translation. The proposed model uses a multi-layer perceptron (MLP) instead of a CNN as the decoder for pixel-to-pixel mapping. Hence, each target image is represented as a neural field that is conditioned on the source image via shift modulation with a learned latent code. Experiments on BraTS 2018 and an in-house clinical dataset of vestibular schwannoma patients showed that the proposed method outperformed state-of-the-art methods for multi-sequence MRI translation both visually and quantitatively. Moreover, we conducted spectral analysis, showing that CoNeS was able to overcome the spectral bias issue common in conventional CNN models. To further evaluate the usage of synthesized images in clinical downstream tasks, we tested a segmentation network using the synthesized images at inference. The results showed that CoNeS improved the segmentation performance when some MRI sequences were missing and outperformed other synthesis models. We concluded that neural fields are a promising technique for multi-sequence MRI translation.
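A rough sketch of the shift-modulation idea: an MLP maps voxel coordinates to target intensities while a latent code, derived from the source image, enters each hidden layer as an additive shift. Layer sizes and names below are illustrative assumptions, not the CoNeS architecture.

```python
# Hedged sketch of a shift-modulated conditional neural field (illustrative).
import torch
import torch.nn as nn

class ShiftModulatedField(nn.Module):
    def __init__(self, coord_dim=2, latent_dim=64, hidden=128, depth=3):
        super().__init__()
        self.inp = nn.Linear(coord_dim, hidden)
        self.layers = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(depth))
        self.shifts = nn.ModuleList(nn.Linear(latent_dim, hidden) for _ in range(depth))
        self.out = nn.Linear(hidden, 1)

    def forward(self, coords, latent):   # coords: (N, coord_dim), latent: (N, latent_dim)
        h = torch.relu(self.inp(coords))
        for layer, shift in zip(self.layers, self.shifts):
            h = torch.relu(layer(h) + shift(latent))  # latent code enters as a shift
        return self.out(h)               # per-voxel target-sequence intensity
```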
APA, Harvard, Vancouver, ISO, and other styles
37

Qi, Jingyuan, Minqian Liu, Ying Shen, Zhiyang Xu, and Lifu Huang. "MULTISCRIPT: Multimodal Script Learning for Supporting Open Domain Everyday Tasks." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 18888–96. http://dx.doi.org/10.1609/aaai.v38i17.29854.

Full text
Abstract:
Automatically generating scripts (i.e., sequences of key steps described in text) from video demonstrations and reasoning about the subsequent steps are crucial for modern AI virtual assistants that guide humans through everyday tasks, especially unfamiliar ones. However, current methods for generative script learning rely heavily on well-structured preceding steps described in text and/or images, or are limited to a certain domain, resulting in a disparity with real-world user scenarios. To address these limitations, we present MULTISCRIPT, a new benchmark with two tasks on task-oriented multimodal script learning: (1) multimodal script generation, and (2) subsequent step prediction. For both tasks, the input consists of a target task name and a video illustrating what has been done to complete the target task, and the expected output is (1) a sequence of structured step descriptions in text based on the demonstration video, and (2) a single text description of the subsequent step, respectively. Built from WikiHow, MULTISCRIPT covers multimodal scripts in videos and text descriptions for over 6,655 human everyday tasks across 19 diverse domains. To establish baseline performance on MULTISCRIPT, we propose two knowledge-guided multimodal generative frameworks that incorporate task-related knowledge prompted from large language models such as Vicuna. Experimental results show that our proposed approaches significantly improve over the competitive baselines.
APA, Harvard, Vancouver, ISO, and other styles
38

Liu, Danyang, Juntao Li, Meng-Hsuan Yu, Ziming Huang, Gongshen Liu, Dongyan Zhao, and Rui Yan. "A Character-Centric Neural Model for Automated Story Generation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 02 (April 3, 2020): 1725–32. http://dx.doi.org/10.1609/aaai.v34i02.5536.

Full text
Abstract:
Automated story generation is a challenging task that aims to automatically generate convincing stories composed of successive plots correlated with consistent characters. Most recent generation models are built upon advanced neural networks, e.g., the variational autoencoder, the generative adversarial network, and the convolutional sequence-to-sequence model. Although these models have achieved promising results in learning linguistic patterns, very few methods consider the attributes and prior knowledge of the story genre, especially from the perspectives of explainability and consistency. To fill this gap, we propose a character-centric neural storytelling model, in which a story is created around a given character, i.e., each part of the story is conditioned on the given character and the corresponding context. In this way, we explicitly capture character information and the relations between plots and characters to improve explainability and consistency. Experimental results on an open dataset indicate that our model yields meaningful improvements over several strong baselines in both human and automatic evaluations.
APA, Harvard, Vancouver, ISO, and other styles
39

Tubiana, Jérôme, Lucia Adriana-Lifshits, Michael Nissan, Matan Gabay, Inbal Sher, Marina Sova, Haim J. Wolfson, and Maayan Gal. "Funneling modulatory peptide design with generative models: Discovery and characterization of disruptors of calcineurin protein-protein interactions." PLOS Computational Biology 19, no. 2 (February 2, 2023): e1010874. http://dx.doi.org/10.1371/journal.pcbi.1010874.

Full text
Abstract:
Design of peptide binders is an attractive strategy for targeting “undruggable” protein-protein interfaces. Current design protocols rely on the extraction of an initial sequence from one known protein interactor of the target protein, followed by in-silico or in-vitro mutagenesis-based optimization of its binding affinity. Wet lab protocols can explore only a minor portion of the vast sequence space and cannot efficiently screen for other desirable properties such as high specificity and low toxicity, while in-silico design requires intensive computational resources and often relies on simplified binding models. Yet, for a multivalent protein target, dozens to hundreds of natural protein partners already exist in the cellular environment. Here, we describe a peptide design protocol that harnesses this diversity via a machine learning generative model. After identifying putative natural binding fragments via literature and homology searches, a compositional Restricted Boltzmann Machine is trained and sampled to yield hundreds of diverse candidate peptides. These candidates are further filtered via flexible molecular docking and an in-vitro microchip-based binding assay. We validate and test our protocol on calcineurin, a calcium-dependent protein phosphatase involved in various cellular pathways in health and disease. In a single screening round, we identified multiple 16-residue peptides with up to six mutations from their closest natural sequence that successfully interfere with the binding of calcineurin to its substrates. In summary, integrating protein interaction and sequence databases, generative modeling, molecular docking and interaction assays enables the discovery of novel protein-protein interaction modulators.
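A minimal sketch of the basic mechanic behind sampling candidates from a Restricted Boltzmann Machine: block Gibbs sampling. The paper's compositional RBM operates on categorical amino-acid units; here the units are binary and the weights are random placeholders, purely to illustrate the sampling loop.

import numpy as np

# Block Gibbs sampling from a binary RBM with placeholder weights.
rng = np.random.default_rng(0)
n_v, n_h = 16 * 21, 64                 # visible (16 positions x 21 symbols), hidden
W = rng.normal(scale=0.05, size=(n_v, n_h))
b_v, b_h = np.zeros(n_v), np.zeros(n_h)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v):
    h = (rng.random(n_h) < sigmoid(v @ W + b_h)).astype(float)   # sample hidden
    v = (rng.random(n_v) < sigmoid(W @ h + b_v)).astype(float)   # sample visible
    return v

v = (rng.random(n_v) < 0.5).astype(float)
for _ in range(1000):                  # burn-in, then read off samples
    v = gibbs_step(v)
print(v[:21])                          # block of units for the first position
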
APA, Harvard, Vancouver, ISO, and other styles
40

Shen, Kevin. "Multi-world Model in Continual Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 21 (March 24, 2024): 23757–59. http://dx.doi.org/10.1609/aaai.v38i21.30555.

Full text
Abstract:
World Models are made of generative networks that can predict future states of a single environment which it was trained on. This research proposes a Multi-world Model, a foundational model built from World Models for the field of continual reinforcement learning that is trained on many different environments, enabling it to generalize state sequence predictions even for unseen settings.
APA, Harvard, Vancouver, ISO, and other styles
41

Vaškevičius, Mantas, Jurgita Kapočiūtė-Dzikienė, and Liudas Šlepikas. "Generative LLMs in Organic Chemistry: Transforming Esterification Reactions into Natural Language Procedures." Applied Sciences 13, no. 24 (December 11, 2023): 13140. http://dx.doi.org/10.3390/app132413140.

Full text
Abstract:
This paper presents a novel approach to predicting esterification procedures in organic chemistry by employing generative large language models (LLMs) to interpret and translate SMILES molecular notation into detailed procedural texts of synthesis reactions. The esterification reaction is important in the production of various industrial intermediates, fragrances, and flavors. Recognizing the challenges of accurate prediction in complex chemical landscapes, we have compiled and made publicly available a curated dataset of esterification reactions to enhance research collaboration. We systematically compare machine learning algorithms, ranging from conventional k-nearest neighbors (kNN) to advanced sequence-to-sequence transformer models, including FLAN-T5 and ChatGPT-based variants. Our analysis highlights the FLAN-T5 model as the standout performer with a BLEU score of 51.82, suggesting that the model has significant potential for this task. Our findings contribute to the growing field of AI in chemistry and offer a promising direction for improving the efficiency of reaction planning and chemical synthesis.
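A minimal sketch of the mechanics of SMILES-to-procedure generation with a public FLAN-T5 checkpoint. The prompt template below is an assumption for illustration; the paper's fine-tuned weights and exact prompting are not reproduced here.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load a public FLAN-T5 checkpoint (not the paper's fine-tuned model).
tok = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Acetic acid + ethanol -> ethyl acetate, written as reaction SMILES.
prompt = ("Describe an esterification procedure for the reaction: "
          "CC(=O)O.OCC>>CC(=O)OCC")
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
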
APA, Harvard, Vancouver, ISO, and other styles
42

Marocco, Paolo, and Roberto Gigliucci. "An Investigation about Entailment and Narrative by AI Techniques (Generative Models)." Communication, Society and Media 3, no. 4 (November 16, 2020): p61. http://dx.doi.org/10.22158/csm.v3n4p61.

Full text
Abstract:
Many story-generation problems stem from the difficulty of modeling the sequence of sentences. Language models can generally assign high scores to well-formed text, especially short texts, but fail when they try to simulate human textual inference. Although automatically generated text sometimes sounds bland, incoherent, repetitive, and unrelated to the context, in other cases the process shows a capability to surprise the reader and avoid being boring or predictable, even while the generated text satisfies entailment requirements. The lyric tradition often does not proceed by strict logical inference but embraces alternatives such as unexpectedness, which is useful for predicting when a narrative will be perceived as interesting. To better capture this narrative variety, we propose a novel measure based on two components, inference and unexpectedness, whose relative weights let readers have different experiences of a generated story. We also propose a supervised validation procedure that compares the authorial original text, learned by the model, with the generated one.
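A minimal sketch of such a two-component measure as a weighted blend. The component scorers and the weighting scheme are illustrative assumptions; the abstract does not fix them to these forms.

# Weighted blend of an inference (entailment) score and an unexpectedness score.
def narrative_score(inference, unexpectedness, w=0.5):
    """Blend entailment strength and surprise; w trades one off for the other."""
    assert 0.0 <= w <= 1.0
    return w * inference + (1.0 - w) * unexpectedness

# e.g. a reader who values surprise over strict entailment:
print(narrative_score(inference=0.7, unexpectedness=0.9, w=0.3))  # 0.84
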
APA, Harvard, Vancouver, ISO, and other styles
43

Lee, Sangho, Hayun Lee, and Dongkun Shin. "Proxyformer: Nyström-Based Linear Transformer with Trainable Proxy Tokens." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 12 (March 24, 2024): 13418–26. http://dx.doi.org/10.1609/aaai.v38i12.29244.

Full text
Abstract:
Transformer-based models have demonstrated remarkable performance in various domains, including natural language processing, image processing and generative modeling. The most significant contributor to the successful performance of Transformer models is the self-attention mechanism, which allows for a comprehensive understanding of the interactions between tokens in the input sequence. However, there is a well-known scalability issue, the quadratic dependency (i.e. O(n^2)) of self-attention operations on the input sequence length n, making the handling of lengthy sequences challenging. To address this limitation, there has been a surge of research on efficient transformers, aiming to alleviate the quadratic dependency on the input sequence length. Among these, the Nyströmformer, which utilizes the Nyström method to decompose the attention matrix, achieves superior performance in both accuracy and throughput. However, its landmark selection exhibits redundancy, and the model incurs computational overhead when calculating the pseudo-inverse matrix. We propose a novel Nyström method-based transformer, called Proxyformer. Unlike the traditional approach of selecting landmarks from input tokens, the Proxyformer utilizes trainable neural memory, called proxy tokens, for landmarks. By integrating contrastive learning, input injection, and a specialized dropout for the decomposed matrix, Proxyformer achieves top-tier performance for long sequence tasks in the Long Range Arena benchmark.
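A minimal sketch of Nyström-approximated self-attention with m landmarks, the construction Proxyformer builds on. Here the landmarks are simple mean-pooled segments of the input; Proxyformer instead learns them as trainable proxy tokens. Dimensions are illustrative.

import torch

# Nystrom-approximated attention: three small softmax factors replace the
# full n x n attention matrix; cost is linear in n rather than O(n^2).
def nystrom_attention(Q, K, V, m=32):
    n, d = Q.shape
    scale = d ** -0.5
    Ql = Q.reshape(m, n // m, d).mean(dim=1)        # landmark queries
    Kl = K.reshape(m, n // m, d).mean(dim=1)        # landmark keys
    F1 = torch.softmax(Q @ Kl.T * scale, dim=-1)    # n x m
    A  = torch.softmax(Ql @ Kl.T * scale, dim=-1)   # m x m
    F2 = torch.softmax(Ql @ K.T * scale, dim=-1)    # m x n
    return F1 @ torch.linalg.pinv(A) @ (F2 @ V)     # pseudo-inverse of the core

n, d = 1024, 64
Q, K, V = (torch.randn(n, d) for _ in range(3))
print(nystrom_attention(Q, K, V).shape)             # -> torch.Size([1024, 64])
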
APA, Harvard, Vancouver, ISO, and other styles
44

LEE, CHAN-SU, and DIMITRIS SAMARAS. "ANALYSIS AND CONTROL OF FACIAL EXPRESSIONS USING DECOMPOSABLE NONLINEAR GENERATIVE MODELS." International Journal of Pattern Recognition and Artificial Intelligence 28, no. 05 (July 31, 2014): 1456009. http://dx.doi.org/10.1142/s0218001414560096.

Full text
Abstract:
Facial expressions convey personal characteristics and subtle emotional states. This paper presents a new framework for modeling subtle facial motions of different people with different types of expressions from high-resolution facial expression tracking data to synthesize new stylized subtle facial expressions. A conceptual facial motion manifold is used for a unified representation of facial motion dynamics from three-dimensional (3D) high-resolution facial motions as well as from two-dimensional (2D) low-resolution facial motions. Variant subtle facial motions in different people with different expressions are modeled by nonlinear mappings from the embedded conceptual manifold to input facial motions using empirical kernel maps. We represent facial expressions by a factorized nonlinear generative model, which decomposes expression style factors and expression type factors from different people with multiple expressions. We also provide a mechanism to control the high-resolution facial motion model from low-resolution facial video sequence tracking and analysis. Using the decomposable generative model with a common motion manifold embedding, we can estimate parameters to control 3D high resolution facial expressions from 2D tracking results, which allows performance-driven control of high-resolution facial expressions.
APA, Harvard, Vancouver, ISO, and other styles
45

Liu, Zuozhu, Thiparat Chotibut, Christopher Hillar, and Shaowei Lin. "Biologically Plausible Sequence Learning with Spiking Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 02 (April 3, 2020): 1316–23. http://dx.doi.org/10.1609/aaai.v34i02.5487.

Full text
Abstract:
Motivated by the celebrated discrete-time model of nervous activity outlined by McCulloch and Pitts in 1943, we propose a novel continuous-time model, the McCulloch-Pitts network (MPN), for sequence learning in spiking neural networks. Our model has a local learning rule, such that the synaptic weight updates depend only on the information directly accessible by the synapse. By exploiting asymmetry in the connections between binary neurons, we show that MPN can be trained to robustly memorize multiple spatiotemporal patterns of binary vectors, generalizing the ability of the symmetric Hopfield network to memorize static spatial patterns. In addition, we demonstrate that the model can efficiently learn sequences of binary pictures as well as generative models for experimental neural spike-train data. Our learning rule is consistent with spike-timing-dependent plasticity (STDP), thus providing a theoretical ground for the systematic design of biologically inspired networks with large and robust long-range sequence storage capacity.
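A minimal sketch of the textbook pair-based STDP rule that the abstract says the MPN learning rule is consistent with. The amplitudes and time constants are generic illustrative values, not the paper's rule.

import numpy as np

# Pair-based STDP: potentiate when the presynaptic spike precedes the
# postsynaptic spike, depress otherwise, with exponential time windows.
def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """dt = t_post - t_pre in ms; returns the weight change."""
    if dt > 0:
        return a_plus * np.exp(-dt / tau_plus)
    return -a_minus * np.exp(dt / tau_minus)

for dt in (5.0, 20.0, -5.0):
    print(dt, stdp_dw(dt))
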
APA, Harvard, Vancouver, ISO, and other styles
46

Zhou, Kun, Wenyong Wang, Teng Hu, and Kai Deng. "Time Series Forecasting and Classification Models Based on Recurrent with Attention Mechanism and Generative Adversarial Networks." Sensors 20, no. 24 (December 16, 2020): 7211. http://dx.doi.org/10.3390/s20247211.

Full text
Abstract:
Time series classification and forecasting have long been studied with traditional statistical methods. Recently, deep learning has achieved remarkable success in areas such as image, text, video, and audio processing, yet studies applying deep neural networks to time series remain scarce. In this paper, we therefore propose and evaluate several state-of-the-art neural network models for these tasks. We first review the basics of representative models, namely long short-term memory (LSTM) and its variants, the temporal convolutional network, and the generative adversarial network. We then propose LSTM models with autoencoders and attention, a temporal convolutional network, and a generative adversarial model, and apply them to time series classification and forecasting. Gaussian sliding-window weights are proposed to speed up training. Finally, the proposed methods are assessed with five optimizers and loss functions on public benchmark datasets, and the proposed temporal convolutional network is compared against several classical models. Experiments demonstrate the proposed models’ effectiveness and confirm that the temporal convolutional network outperforms LSTM models in sequence modeling, reducing time consumption to around 80% of the others’ while retaining the same accuracy. The generative adversarial network’s unstable training is circumvented by tuning hyperparameters and choosing Adam as the optimizer; it also achieves forecasting accuracy comparable to traditional methods.
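A minimal sketch of Gaussian sliding-window weights over a lookback window, with the weight peak placed on the newest sample. The center and width are assumptions, as the abstract does not specify them.

import numpy as np

# Gaussian weights over a lookback window: recent steps get larger weights.
def gaussian_window_weights(window, sigma=None):
    sigma = sigma or window / 4.0
    idx = np.arange(window)
    w = np.exp(-0.5 * ((idx - (window - 1)) / sigma) ** 2)  # peak at newest step
    return w / w.sum()                                      # normalize to sum to 1

w = gaussian_window_weights(16)
print(w.round(3))   # weights increase toward the end of the window
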
APA, Harvard, Vancouver, ISO, and other styles
47

Wang, Yi. "Intelligent auxiliary system for music performance under edge computing and long short-term recurrent neural networks." PLOS ONE 18, no. 5 (May 8, 2023): e0285496. http://dx.doi.org/10.1371/journal.pone.0285496.

Full text
Abstract:
Music performance action generation, a research hotspot in computer vision and cross-sequence analysis, can be applied in multiple real-world scenarios. However, current methods for generating music performance actions have consistently ignored the connection between music and performance actions, producing a strong sense of separation between visual and auditory content. This paper first analyzes the attention mechanism, the Recurrent Neural Network (RNN), and the long short-term RNN, which is well suited to sequence data with strong temporal correlation. On this basis, the current learning method is improved: a new model combining attention mechanisms with long short-term RNNs is proposed, which can generate performance actions from music beat sequences. In addition, image-description generative models with attention mechanisms are adopted, and the abstract network structure of the RNN-Long Short-Term Memory (LSTM) is optimized against the non-recursive RNN structure. Through music beat recognition and dance movement extraction, data resources are allocated and adjusted in the edge server architecture. The evaluation metric is the value of the model’s loss function. The superiority of the proposed model lies mainly in the high accuracy and low resource consumption of dance movement recognition. Experiments show that the model’s loss reaches as low as 0.00026, and the video quality is best when the LSTM module has 3 layers, 256 nodes, and a lookback of 15. Compared with three other cross-domain sequence analysis models, the new model generates harmonious and rich performance action sequences while keeping generation stable, and it excels at combining music with performance actions. This paper offers practical reference value for applying edge computing in intelligent auxiliary systems for music performance.
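A minimal sketch of the kind of sequence model the abstract describes: a stacked LSTM that maps a window of music-beat features to a performance action. The 3 layers, 256 units, and lookback of 15 follow the reported best configuration; the feature and action dimensions are illustrative assumptions.

import torch
import torch.nn as nn

# Stacked LSTM from beat-feature windows to a performance action.
class BeatToAction(nn.Module):
    def __init__(self, feat_dim=32, hidden=256, layers=3, action_dim=24):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, action_dim)

    def forward(self, x):                # x: (batch, lookback, feat_dim)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])     # predict the action from the last step

model = BeatToAction()
beats = torch.randn(8, 15, 32)           # batch of 8 windows, lookback = 15
print(model(beats).shape)                # -> torch.Size([8, 24])
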
APA, Harvard, Vancouver, ISO, and other styles
48

Li, Longyuan, Jihai Zhang, Junchi Yan, Yaohui Jin, Yunhao Zhang, Yanjie Duan, and Guangjian Tian. "Synergetic Learning of Heterogeneous Temporal Sequences for Multi-Horizon Probabilistic Forecasting." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 8420–28. http://dx.doi.org/10.1609/aaai.v35i10.17023.

Full text
Abstract:
Time series are ubiquitous across applications such as transportation, finance, and healthcare. They are often influenced by external factors, especially asynchronous events, which makes forecasting difficult. However, existing models are mainly designed for either synchronous time series or asynchronous event sequences, and hardly provide a unified way to capture the relation between the two. We propose the Variational Synergetic Multi-Horizon Network (VSMHN), a novel deep conditional generative model. To learn complex correlations across heterogeneous sequences, a tailored encoder is devised that combines advances in deep point process models and variational recurrent neural networks. In addition, an aligned time coding and an auxiliary transition scheme are carefully devised for batched training on unaligned sequences. Our model can be trained effectively using stochastic variational inference and generates probabilistic predictions via Monte Carlo simulation. Furthermore, it produces accurate, sharp, and realistic probabilistic forecasts. We also show that modeling asynchronous event sequences is crucial for multi-horizon time series forecasting.
APA, Harvard, Vancouver, ISO, and other styles
49

Murad, Taslim, Sarwan Ali, and Murray Patterson. "Exploring the Potential of GANs in Biological Sequence Analysis." Biology 12, no. 6 (June 14, 2023): 854. http://dx.doi.org/10.3390/biology12060854.

Full text
Abstract:
Biological sequence analysis is an essential step toward a deeper understanding of the underlying functions, structures, and behaviors of sequences. It can help identify the characteristics of the associated organisms, such as viruses, and support prevention mechanisms to curb their spread and impact, since viruses are known to cause epidemics that can become global pandemics. Machine learning (ML) technologies provide new tools for biological sequence analysis to effectively analyze the functions and structures of sequences. However, these ML-based methods face challenges with the data imbalance generally associated with biological sequence datasets, which hinders their performance. Although various strategies exist to address this issue, such as the SMOTE algorithm, which creates synthetic data, they focus on local information rather than the overall class distribution. In this work, we explore a novel approach to the data imbalance issue based on generative adversarial networks (GANs), which use the overall data distribution. GANs are utilized to generate synthetic data that closely resemble the real data, and the generated data can then be employed to enhance ML models’ performance by removing the class imbalance problem in biological sequence analysis. We perform four distinct classification tasks using four different sequence datasets (Influenza A Virus, PALMdb, VDJdb, Host), and our results illustrate that GANs can improve overall classification performance.
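A minimal sketch of GAN-based oversampling of a minority class: train a generator/discriminator pair on minority-class feature vectors, then sample the generator to rebalance the training set. Network sizes and the single training step shown are illustrative, not the paper's full protocol.

import torch
import torch.nn as nn

latent, feat = 16, 64
G = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(), nn.Linear(128, feat))
D = nn.Sequential(nn.Linear(feat, 128), nn.ReLU(), nn.Linear(128, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, feat)                  # stand-in minority-class features
fake = G(torch.randn(32, latent))

# One discriminator step, then one generator step.
loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()
loss_g = bce(D(fake), torch.ones(32, 1))      # generator tries to fool D
opt_g.zero_grad(); loss_g.backward(); opt_g.step()

synthetic = G(torch.randn(100, latent)).detach()  # extra minority samples
print(synthetic.shape)
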
APA, Harvard, Vancouver, ISO, and other styles
50

Trinquier, Jeanne, Guido Uguzzoni, Andrea Pagnani, Francesco Zamponi, and Martin Weigt. "Efficient generative modeling of protein sequences using simple autoregressive models." Nature Communications 12, no. 1 (October 4, 2021). http://dx.doi.org/10.1038/s41467-021-25756-4.

Full text
Abstract:
Generative models emerge as promising candidates for novel sequence-data driven approaches to protein design, and for the extraction of structural and functional information about proteins deeply hidden in rapidly growing sequence databases. Here we propose simple autoregressive models as highly accurate but computationally efficient generative sequence models. We show that they perform similarly to existing approaches based on Boltzmann machines or deep generative models, but at a substantially lower computational cost (by a factor between 10² and 10³). Furthermore, the simple structure of our models has distinctive mathematical advantages, which translate into an improved applicability in sequence generation and evaluation. Within these models, we can easily estimate both the probability of a given sequence, and, using the model’s entropy, the size of the functional sequence space related to a specific protein family. In the example of response regulators, we find a huge number of ca. 10⁶⁸ possible sequences, which nevertheless constitute only the astronomically small fraction 10⁻⁸⁰ of all amino-acid sequences of the same length. These findings illustrate the potential and the difficulty in exploring sequence space via generative sequence models.
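A minimal sketch of the entropy argument in this abstract: a model entropy S (in nats) implies roughly exp(S) typical functional sequences, to be compared against the 20^L sequences of length L. The values of S and L below are assumptions chosen only to reproduce the quoted orders of magnitude.

import numpy as np

S = 68 * np.log(10)     # assumed model entropy in nats (gives ~10^68 sequences)
L = 114                 # assumed response-regulator sequence length

log10_functional = S / np.log(10)           # ~ 68
log10_total = L * np.log10(20)              # ~ 148
print(f"functional space ~ 10^{log10_functional:.0f} sequences")
print(f"fraction of all length-{L} sequences ~ 10^{log10_functional - log10_total:.0f}")
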
APA, Harvard, Vancouver, ISO, and other styles