Dissertations / Theses on the topic 'Generative sequence models'

Consult the top 18 dissertations / theses for your research on the topic 'Generative sequence models.'


1

Svensk, Gustav. "TDNet : A Generative Model for Taxi Demand Prediction." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158514.

Abstract:
Supplying the right number of taxis in the right place at the right time is very important for taxi companies. In this paper, the machine learning model Taxi Demand Net (TDNet) is presented, which predicts short-term taxi demand in different zones of a city. It is based on WaveNet, a causal dilated convolutional neural network for time-series generation. TDNet uses historical demand from recent years and transforms features such as time of day, day of week, and day of month into 26-hour taxi demand forecasts for all zones in a city. It has been applied to one city in northern Europe and one in South America. In the northern European city, an error of one taxi or less per hour per zone was achieved in 64% of cases; in the South American city, the figure was 40%. In both cities, it beat the SARIMA and stacked ensemble benchmarks. This performance was achieved by tuning the hyperparameters with a Bayesian optimization algorithm. Additionally, weather and holiday features were added as inputs for the northern European city, but they did not improve the accuracy of TDNet.
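The abstract describes WaveNet as a causal dilated convolutional network. As a rough illustration of that building block only (not TDNet itself), the following sketch applies a causal dilated convolution to a short demand series. The function names, the toy demand values, and the averaging weights are assumptions made for this example.

```python
def causal_dilated_conv(series, weights, dilation):
    """1-D causal convolution: output[t] uses series[t], series[t - d], series[t - 2d], ...

    Positions before the start of the series are treated as zeros.
    """
    out = []
    for t in range(len(series)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # never look into the future or before the start
                total += w * series[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Number of time steps visible to the top layer of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy hourly demand for one zone (made-up numbers).
demand = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
layer1 = causal_dilated_conv(demand, [0.5, 0.5], dilation=1)
layer2 = causal_dilated_conv(layer1, [0.5, 0.5], dilation=2)
```

Doubling the dilation at each layer lets the receptive field grow quickly with depth: with kernel size 2 and dilations 1, 2, 4, 8, `receptive_field` returns 16 time steps.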
2

Goodman, Genghis. "A Machine Learning Approach to Artificial Floorplan Generation." UKnowledge, 2019. https://uknowledge.uky.edu/cs_etds/89.

Abstract:
The process of designing a floorplan is highly iterative and requires extensive human labor. Currently, there are a number of computer programs that aid humans in floorplan design. These programs, however, are limited by their inability to fully automate the creative process. Such automation would allow a professional to quickly generate many possible floorplan solutions, greatly expediting the process. However, automating this creative process is very difficult because of the many implicit and explicit rules a model must learn in order to create viable floorplans. In this paper, we propose a method of floorplan generation using two machine learning models: a sequential model that generates rooms within the floorplan, and a graph-based model that finds adjacencies between generated rooms. Each of these models can be altered so that it is capable of producing a floorplan independently; however, we find that the combination of these models outperforms each of its pieces, as well as a statistic-based approach.
3

Tubiana, Jérôme. "Restricted Boltzmann machines : from compositional representations to protein sequence analysis." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE039/document.

Abstract:
Restricted Boltzmann machines (RBMs) are graphical models that jointly learn a probability distribution and a representation of data. Despite their simple architecture, they can learn complex data distributions very well, such as the MNIST database of handwritten digits. Moreover, they are empirically known to learn compositional representations of data, i.e. representations that effectively decompose configurations into their constitutive parts. However, not all variants of RBM perform equally well, and few theoretical arguments exist for these empirical observations. In the first part of this thesis, we ask how such a simple model can learn such complex probability distributions and representations. By analyzing an ensemble of RBMs with random weights using the replica method, we have characterised a compositional regime for RBMs, and shown under which conditions (statistics of weights, choice of transfer function) it can and cannot arise. Both qualitative and quantitative predictions obtained with our theoretical analysis are in agreement with observations from RBMs trained on real data. In the second part, we present an application of RBMs to protein sequence analysis and design. Owing to their large size, it is very difficult to run physical simulations of proteins, and to predict their structure and function. It is however possible to infer information about a protein structure from the way its sequence varies across organisms. For instance, Boltzmann machines can leverage correlations of mutations to predict spatial proximity of the sequence amino acids. Here, we have shown on several synthetic and real protein families that, provided a compositional regime is enforced, RBMs can go beyond structure and extract extended motifs of coevolving amino acids that reflect phylogenetic, structural and functional constraints within proteins. Moreover, RBMs can be used to design new protein sequences with putative functional properties by recombining these motifs at will. Lastly, we have designed new training algorithms and model parametrizations that significantly improve RBM generative performance, to the point where it can compete with state-of-the-art generative models such as Generative Adversarial Networks or Variational Autoencoders on medium-scale data.
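For readers unfamiliar with how an RBM generates data, the sketch below shows one step of block Gibbs sampling for the textbook RBM with binary visible and hidden units. This is an assumption made for illustration: the thesis studies several transfer functions and its own training algorithms, none of which are reproduced here. All names (`gibbs_step`, `W`, `b`, `c`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_hidden(v, W, c, rng):
    """Sample h_j ~ Bernoulli(sigmoid(c_j + sum_i v_i * W[i][j]))."""
    h = []
    for j in range(len(c)):
        p = sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        h.append(1 if rng.random() < p else 0)
    return h


def sample_visible(h, W, b, rng):
    """Sample v_i ~ Bernoulli(sigmoid(b_i + sum_j W[i][j] * h_j))."""
    v = []
    for i in range(len(b)):
        p = sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
        v.append(1 if rng.random() < p else 0)
    return v


def gibbs_step(v, W, b, c, rng):
    """One block Gibbs update: visible -> hidden -> visible."""
    h = sample_hidden(v, W, c, rng)
    return sample_visible(h, W, b, rng), h


# Tiny example: 3 visible units, 2 hidden units, small random weights.
rng = random.Random(0)
W = [[rng.gauss(0.0, 0.1) for _ in range(2)] for _ in range(3)]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0]
v_new, h_new = gibbs_step([1, 0, 1], W, b, c, rng)
```

Because the units within a layer are conditionally independent given the other layer, each layer can be sampled in a single pass, which is what makes alternating updates of this kind practical.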
4

Rehn, Martin. "Aspects of memory and representation in cortical computation." Doctoral thesis, KTH, Numerisk Analys och Datalogi, NADA, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4161.

Abstract:
In this thesis I take a modular approach to cortical function. I investigate how the cerebral cortex may realise a number of basic computational tasks, within the framework of its generic architecture. I present novel mechanisms for certain assumed computational capabilities of the cerebral cortex, building on the established notions of attractor memory and sparse coding. A sparse binary coding network for generating efficient representations of sensory input is presented. It is demonstrated that this network model well reproduces the simple cell receptive field shapes seen in the primary visual cortex and that its representations are efficient with respect to storage in associative memory. I show how an autoassociative memory, augmented with dynamical synapses, can function as a general sequence learning network. I demonstrate how an abstract attractor memory system may be realised on the microcircuit level -- and how it may be analysed using tools similar to those used experimentally. I outline some predictions from the hypothesis that the macroscopic connectivity of the cortex is optimised for attractor memory function. I also discuss methodological aspects of modelling in computational neuroscience.
5

Shimagaki, Kai. "Advanced statistical modeling and variable selection for protein sequences." Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS548.

Abstract:
Over the last few decades, protein sequencing techniques have been developed and experiments have been carried out continuously. Thanks to these efforts, more than two hundred million protein sequences are now available. To deal with such a huge amount of biological data, we need theories and technologies that extract information we can understand and interpret. The key ideas for resolving this problem come from statistical physics and state-of-the-art machine learning (ML). Statistical physics is a field of physics that can successfully describe many complex systems by extracting or reducing variables to interpretable ones based on simple principles. ML, on the other hand, can represent data (for example through reconstruction or classification) without assuming how the data was generated, i.e. the physical phenomenon behind the data. In this dissertation, we report studies of protein sequence generative modeling and protein residue contact prediction using statistical physics-inspired modeling and ML-oriented methods. In the first part, we review the general background of biology and genomics. We then discuss statistical modeling for protein sequences. In particular, we review Direct Coupling Analysis (DCA), which is the core technology of our research. We also discuss the effects of higher-order statistics contained in protein sequences and introduce deep learning-based generative models as models that can go beyond pairwise interactions.
6

Adak, Bulent Mehmet. "Model-based Code Generation For The High Level Architecture Federates." Phd thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/3/12609032/index.pdf.

Abstract:
We tackle the problem of automated code generation for a High Level Architecture (HLA)-compliant federate application, given a model of the federation architecture including the federate's behavior model. The behavior model is based on Live Sequence Charts (LSCs), adopted as the behavioral specification formalism in the Federation Architecture Metamodel (FAMM). The FAMM is constructed conforming to metaGME, the meta-metamodel offered by Generic Modeling Environment (GME). FAMM serves as a formal language for describing federation architectures. We present a code generator that generates Java/AspectJ code directly from a federation architecture model. An objective is to help verify a federation architecture by testing it early in the development lifecycle. Another objective is to help developers construct complete federate applications. Our approach to achieve these objectives is aspect-oriented in that the code generated from the LSC in conjunction with the Federation Object Model (FOM) serves as the base code on which the computation logic is woven as an aspect.
7

Kunst, Rafael. "Um injetor de erros aplicado à avaliação de desempenho do codificador de canal em redes IEEE 802.16." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2009. http://hdl.handle.net/10183/17800.

Abstract:
The demand for multimedia services is driving the development of wireless networks. An important issue is therefore to guarantee correct transmission over channels affected by time- and frequency-variant conditions, caused by physical impairments that lead to errors during transmission. These errors are basically of two types: burst errors and random errors, the latter typically modeled as Additive White Gaussian Noise (AWGN). Simulating the behavior of wireless channels affected by physical impairments has been the subject of several investigations in recent years. Nevertheless, much of the current research does not consider the occurrence of both error types at the same time, which may lead to imprecise simulation results. This work proposes an error sequence generator that is able to generate both burst and AWGN error sequences. Moreover, the proposed model can generate hybrid error sequences composed of both error types simultaneously. The proposed error sequence generator is applied to a case study that evaluates the performance of the channel encoder of nomadic (fixed) and mobile IEEE 802.16 networks. In this context, we evaluate the error correction capability of the Forward Error Correction (FEC) encoders that are mandatory according to the IEEE 802.16 standard. Furthermore, we study the impact of applying time diversity techniques to the transmission, considering scenarios affected by burst errors and by AWGN. We also present a study of the theoretical throughput that can be reached by nomadic and mobile technologies. Finally, we discuss the technological advances brought by the Orthogonal Frequency Division Multiple Access (OFDMA) channel multiplexing technique, which is employed in IEEE 802.16 mobile networks.
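To make the idea of a hybrid error sequence concrete, here is a minimal sketch that combines a burst-error mask with independent random bit errors. It is not the injector from the thesis. As assumptions for illustration, bursts come from a simple two-state (good/bad) Markov chain, and the AWGN contribution is approximated at bit level by independent flips with a fixed bit error rate. All function and parameter names are hypothetical.

```python
import random


def burst_error_mask(n_bits, p_enter, p_exit, rng):
    """Two-state Markov chain: a bit is marked as an error while the channel is 'bad'."""
    mask, bad = [], False
    for _ in range(n_bits):
        if bad:
            bad = rng.random() >= p_exit   # stay in the burst unless we exit
        else:
            bad = rng.random() < p_enter   # start a new burst
        mask.append(1 if bad else 0)
    return mask


def random_error_mask(n_bits, bit_error_rate, rng):
    """Independent bit errors, a bit-level stand-in for AWGN-induced errors."""
    return [1 if rng.random() < bit_error_rate else 0 for _ in range(n_bits)]


def hybrid_error_mask(n_bits, p_enter, p_exit, bit_error_rate, rng):
    """A bit is in error if either the burst model or the random model hits it."""
    burst = burst_error_mask(n_bits, p_enter, p_exit, rng)
    rand = random_error_mask(n_bits, bit_error_rate, rng)
    return [b | r for b, r in zip(burst, rand)]


def inject(bits, mask):
    """Flip every bit whose mask entry is 1."""
    return [bit ^ m for bit, m in zip(bits, mask)]


rng = random.Random(42)
payload = [1, 0, 1, 1, 0, 0, 1, 0] * 4
mask = hybrid_error_mask(len(payload), p_enter=0.05, p_exit=0.3,
                         bit_error_rate=0.01, rng=rng)
received = inject(payload, mask)
```

Keeping the mask separate from the payload makes it easy to feed the same error sequence to different encoder configurations, which is useful when comparing their correction capability.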
8

Künstner, Axel. "Birds as a Model for Comparative Genomic Studies." Doctoral thesis, Uppsala universitet, Evolutionsbiologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-159766.

Abstract:
Comparative genomics provides a tool to investigate large biological datasets, i.e. genomic datasets. In my thesis I focused on inferring patterns of selection in coding and non-coding regions of avian genomes. Until recently, large comparative studies on selection were mainly restricted to model species with sequenced genomes. This limitation has been overcome with advances in sequencing technologies, and it is now possible to gather large genomic data sets for non-model species. Next-generation sequencing data was used to study patterns of nucleotide substitutions, and from this we inferred how selection has acted in the genomes of 10 non-model bird species. In general, we found evidence for a negative correlation between neutral substitution rate and chromosome size in birds. In a follow-up study, we investigated two closely related bird species to study expression levels in different tissues and patterns of selection. We found that between 2% and 18% of all genes were differentially expressed between the two species. We showed that non-coding regions adjacent to genes are under evolutionary constraint in birds, which suggests that non-coding DNA plays an important functional role in the genome. Regions downstream of genes (3') showed a particularly high level of constraint. The level of constraint in these regions was not correlated with the length of untranslated regions, which suggests that other causes also play a role in sequence conservation. We compared the rate of nonsynonymous substitutions to the rate of synonymous substitutions in order to infer levels of selection in protein-coding sequences. Synonymous substitutions are often assumed to evolve neutrally. We studied synonymous substitutions by estimating constraint on 4-fold degenerate sites of avian genes and found significant evolutionary constraint on this category of sites (between 24% and 43%). These results call for a reappraisal of synonymous substitution rates being used as neutral standards in molecular evolutionary analysis (e.g. the dN/dS ratio to infer positive selection). Finally, the problem of sequencing errors in next-generation sequencing data was investigated. We developed a program that removes erroneous bases from the reads. We showed that low-coverage sequencing projects and large genome sequencing projects will especially gain from trimming erroneous reads.
9

Alsafi, Radi Taha M. "Generation of complex recombinant fowlpox virus 9 (FP9) encoding simian immunodeficiency virus (SIVmac239) sequences as a model HIV vaccine candidate." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/generation-of-complex-recombinant-fowlpox-virus-9-fp9-encoding-simian-immunodeficiency-virus-sivmac239-sequences-as-a-model-hiv-vaccine-candidate(1a015762-8dc2-4153-a586-d7fab88b9658).html.

Abstract:
The development of a safe and effective HIV vaccine remains challenging due to its high antigenic variability. Poxviruses are large, stable, and have a track record of use as human vaccine candidates. Recombinant fowlpox virus 9 (rFP9), a highly attenuated host range-restricted poxvirus strain, has been safely administered to humans with no ill effects, and is known to be immunogenic. This thesis describes the construction of complex rFP9 encoding various sequences of SIVmac239. The SIVmac239/macaque model is widely used for HIV vaccine development. The ultimate aim of this work was to combine the advantages of FP9 with those of live attenuated SIV to produce a safe yet hopefully effective model HIV vaccine candidate. Transfer plasmids for five different insertion sites within the FP9 genome were designed and constructed. Homologous recombination (HR) of adjacent FP9 sequences was employed to facilitate the integration of SIVmac239 sequences into the FP9 genome. Positive rFP9 were identified by blue colouration in presence of X-gal using a transient colour selection (TCS) technique, and the final markerless pure recombinants were confirmed by PCR. Expression of the target SIV proteins in the presence of T7 polymerase has been demonstrated by immunocytochemical (ICC) staining and Western blotting (WB) assays. Expression was also quantified by enzyme-linked immunosorbent assay (ELISA) in various cell lines at multiple time points. Five different unique rFP9 have been constructed through this project. All SIVmac239 open reading frames (ORFs) save nef have been integrated into the FP9 genome, and protein expression demonstrated where possible. Moreover, a single rFP9 vector expressing the defective SIVmac239 genome driven by T7 RNA polymerase has been successfully constructed and validated using a green fluorescent protein marker. rFP9 showed appropriate transgene expression in both avian and mammalian cells, although at different levels. 
The expression efficiency of rFP9 was finally compared to another attenuated poxvirus vector, modified vaccinia Ankara (MVA). Comparing the protein expression levels between rFP9 and rMVA was quite difficult because different poxvirus promoters (early/late in rFP9; intermediate in rMVA) were used to direct the transcription of the T7 RNA gene. Given this limitation, although generally higher levels of expression were seen with rFP9, this cannot be attributed to the FP9 with any certainty.
10

Blazejewski, Tomasz. "Generative Models for Synthetic Biology." Thesis, 2020. https://doi.org/10.7916/d8-0xvy-cw79.

Abstract:
Over the past several years, the fields of synthetic biology and machine learning have demonstrated marked advances in the scale of their capabilities and the success of their applications. The work presented in this thesis focuses on the translation of recent advances in machine learning toward new applications in synthetic biology. In particular it is argued that the needs of synthetic biology researchers and practitioners are well met by a class of generative machine learning models, and that the scale of synthetic biology capabilities allows for their successful application across multiple domains of interest. In Chapter 1, a novel algorithm utilizing Markov Random Fields is used to, for the first time, design functional synthetic overlapping pairs of genes with potential applications for improved biological robustness and biosafety. In Chapter 2, motivated by a desire to extend the scope of protein sequence modeling to a greater range and diversity of protein sequences, a variant of a variational autoencoder model is used to project hundreds of millions of protein sequences into a continuous latent space with potentially useful representation features. Finally, in Chapter 3, we move beyond the realm of protein sequences to define a probabilistic species-specific model of regulatory sequences and explore this model’s utility for the challenging task of gene expression prediction for non-model bacterial organisms. Machine learning models presented in this thesis represent novel applications of models traditionally applied to data in the domains of images, text or sound toward addressing challenging problems in biology. Particular attention is devoted to the challenging task of utilizing large amounts of unlabeled data present in metagenomic sequences and the genomes of poorly characterized bacteria in the hope of improving researchers’ abilities to manipulate complex biological phenomena.
11

Chiang, Yi-Heng, and 蔣宜衡. "Using the Sequence to Sequence Generative Model for Bidirectional Text Rewriting." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/trmcm5.

Abstract:
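Master's degree
Tamkang University
Master's Program, Department of Information Management
Academic year 106 (ROC calendar)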
Although the ability to understand and master a language varies from person to person, it is also affected by the evolution of the language itself. In particular, Classical Chinese, as a written language of the past, differs markedly from the Vernacular Chinese used in modern society. As a consequence, many Chinese speakers today find it hard to understand Classical Chinese texts. To bridge the gap between the two writing styles, this work takes bidirectional text rewriting between Classical Chinese (CC) and Vernacular Chinese (VC) as its topic. A parallel corpus is collected and processed with natural language processing techniques. The corpus is used to train a sequence-to-sequence model within a deep learning architecture, and the model can then generate sentences in the desired writing style. In addition, this work uses two separate monolingual corpora to train two independent sets of word vectors, one for Classical Chinese and one for Vernacular Chinese, in order to extract the semantic relatedness between words in each writing style. From the parallel corpus, this work tries to find the correspondence relations between CC and VC, applying a neural machine translation model to extract the relevant word alignments. The BLEU metric is used to evaluate the generated sentences. On the test dataset, the word-level model rewrites VC to CC better than CC to VC. In contrast, the character-level model rewrites CC to VC better than VC to CC. Overall, the character-level model performs better than the word-level model in Chinese text rewriting. In this work, natural language technologies are applied to rewriting between the two Chinese writing styles, and the bidirectional text rewriting method used here offers a promising direction for studying related writing styles.
12

Shen, Liang-Hsin, and 沈亮欣. "Acrostic Generating System: An Application of Control Signals on Sequence-to-Sequence Models." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/92fu45.

Abstract:
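Master's degree
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
Academic year 107 (ROC calendar)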
An acrostic is a form of writing in which the first token of each line (or another recurring feature in the text) forms a meaningful sequence. In this paper we present a generalized acrostic generation system that can hide a message in a flexible pattern specified by the user. Unlike previous works that focus on rule-based solutions, this work adopts a neural sequence-to-sequence model to achieve this goal. Besides the acrostic, users can also specify the rhyme and length of the output sequences. To our knowledge, this is the first neural natural language generation system that demonstrates the capability of performing micro-level control over output sentences.
13

Sridhar, Adepu. "Generating Test Sequences and Slices for Simulink/Stateflow Models." Thesis, 2013. http://ethesis.nitrkl.ac.in/5000/1/211CS3301.pdf.

Abstract:
In a typical software development project, more than 50 percent of the development effort is spent in the testing phase. Test case design and execution consume a lot of time, so automated generation of test cases is highly desirable. In this thesis, we generate test sequences from Simulink/Stateflow models, which are used to develop embedded control systems. Testing these systems is very important for delivering error-free systems and for quality assurance. Test cases are used for this purpose, and the test sequences we develop are used to generate them. First, we represent the system using Simulink/Stateflow (SL/SF) models, normally built with the Simulink tool available in MATLAB. We then derive dependency graphs from the SL/SF model: an output dependency graph for the Simulink part of the model and a control dependency graph for the Stateflow part. From those graphs we generate the test sequences. Simulink/Stateflow models often consist of more than ten thousand blocks and a large number of hierarchical levels. We therefore also present an approach for slicing Simulink/Stateflow models from the automotive and avionics domains using dependence graphs. With slicing, the complexity of a model can be reduced with respect to a given point of interest by removing unrelated model elements.
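Slicing on a dependence graph comes down to a reachability question: which elements can influence the chosen point of interest? The sketch below shows a backward slice over a plain dictionary. It is a simplified illustration, not the thesis's algorithm. The block names and the `deps` mapping are made up for the example, and data and control dependencies are not distinguished.

```python
from collections import deque


def backward_slice(dependencies, criterion):
    """Return every block the criterion depends on, directly or transitively."""
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        node = queue.popleft()
        for dep in dependencies.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# Hypothetical model: each block maps to the blocks it depends on.
deps = {
    "Out1": ["Gain", "Switch"],
    "Switch": ["Chart"],
    "Gain": ["In1"],
    "Chart": ["In2"],
    "Out2": ["Sum"],
    "Sum": ["In1", "In3"],
}
slice_out1 = backward_slice(deps, "Out1")
removed = set(deps) - slice_out1
```

Here the slice for `Out1` keeps `Out1`, `Gain`, `Switch`, `Chart`, `In1`, and `In2`, while `Out2` and `Sum` are dropped because they cannot affect that output.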
14

Montella, Sébastien, and 李胤龍. "Emotionally-Triggered Short Text Conversation using Attention-Based Sequence Generation Models." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/hfpcxx.

Abstract:
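Master's thesis
National Central University
Department of Computer Science and Information Engineering
Academic year 107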
Emotional intelligence is a field that is attracting growing attention. Coupled with language generation, it is expected to further humanize the machine and bring it a step closer to the user by generating responses that are consistent with a specific emotion. The analysis of sentiment within documents or sentences has been widely studied and improved, while the generation of emotional content remains under-researched. Meanwhile, generative models have recently seen a series of improvements thanks to Generative Adversarial Networks (GANs). Promising results are frequently reported in both natural language processing and computer vision. However, when applied to text generation, adversarial learning may lead to poor-quality sentences and mode collapse. In this paper, we leverage one-round conversation data from social media to propose a novel approach for generating grammatically correct and emotionally consistent answers for the Short-Text Conversation task (STC-3) of the NTCIR-14 workshop. We use an attention-based sequence-to-sequence model as our generator, inspired by the StarGAN framework. We provide emotion embeddings and direct feedback from an emotion classifier to guide the generator. To avoid the aforementioned issues with adversarial networks, we alternately train our generator using maximum likelihood and adversarial loss.
15

Felix, Reyes Alejandro. "Test case generation using symbolic grammars and quasirandom sequences." Master's thesis, 2010. http://hdl.handle.net/10048/1668.

Abstract:
This work presents a new test case generation methodology that offers a high degree of automation (reducing cost) while providing increased defect-detection power (increasing benefit). Our solution is a variation of model-based testing that takes advantage of symbolic grammars (context-free grammars in which terminals are replaced by regular expressions that represent their solution space) and quasi-random sequences to generate test cases. Previous test case generation techniques are enhanced with adaptive random testing to maximize input space coverage, and with selective and directed sentence generation techniques to optimize sentence generation. Our solution was tested by generating 200 firewall policies containing up to 20 000 rules from a generic firewall grammar. Our results show that our system generates test cases with superior coverage of the input space, increasing the probability of defect detection while considerably reducing the number of test cases needed compared with previously used approaches.
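As a rough illustration of how a quasi-random sequence can drive sentence generation from a grammar, the sketch below uses a Halton sequence (one van der Corput sequence per decision) to choose among alternatives. It is a simplification under stated assumptions, not the thesis's method: the tiny firewall "grammar" is hypothetical, terminals are modeled as finite choices and an integer range rather than regular expressions, and no adaptive random testing is included.

```python
def van_der_corput(index, base):
    """Return element `index` (1-based) of the base-`base` van der Corput sequence."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def halton(index, bases=(2, 3, 5)):
    """One low-discrepancy point in [0, 1)^3, one coordinate per grammar decision."""
    return [van_der_corput(index, b) for b in bases]


# Hypothetical rule grammar: rule -> ACTION PROTOCOL "port" PORT
ACTIONS = ["allow", "deny"]
PROTOCOLS = ["tcp", "udp", "icmp"]
PORT_RANGE = (1, 65535)


def pick(options, u):
    """Map u in [0, 1) to one of the options, guarding against float round-up."""
    return options[min(int(u * len(options)), len(options) - 1)]


def make_rule(index):
    u_action, u_proto, u_port = halton(index)
    low, high = PORT_RANGE
    port = min(low + int(u_port * (high - low + 1)), high)
    return f"{pick(ACTIONS, u_action)} {pick(PROTOCOLS, u_proto)} port {port}"


rules = [make_rule(i) for i in range(1, 7)]
```

Because consecutive Halton points spread evenly over the unit cube, even a handful of generated rules touches every action and protocol, which is the coverage property the abstract attributes to quasi-random sequences.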
Software Engineering and Intelligent Systems
16

Xu, Kelvin. "Exploring Attention Based Model for Captioning Images." Thèse, 2017. http://hdl.handle.net/1866/20194.

17

Hao, Yangyang. "Computational modeling for identification of low-frequency single nucleotide variants." 2015. http://hdl.handle.net/1805/8891.

Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
Reliable detection of low-frequency single nucleotide variants (SNVs) carries great significance in many applications. In cancer genetics, the frequencies of somatic variants from tumor biopsies tend to be low due to contamination with normal tissue and tumor heterogeneity. Circulating tumor DNA monitoring also faces the challenge of detecting low-frequency variants due to the small percentage of tumor DNA in blood. Moreover, in population genetics, although pooled sequencing is cost-effective compared with individual sequencing, pooling dilutes the signals of variants from any individual. Detection of low-frequency variants is difficult and can be confounded by multiple sources of error, especially next-generation sequencing artifacts. Existing methods are limited in sensitivity and mainly focus on frequencies around 5%; most fail to consider differential, context-specific sequencing artifacts. To address this challenge, we developed a computational and experimental framework, RareVar, to reliably identify low-frequency SNVs from high-throughput sequencing data. For optimized performance, RareVar uses a supervised learning framework to model artifacts originating from different components of a specific sequencing pipeline. This is enabled by a customized, comprehensive benchmark dataset enriched with known low-frequency SNVs from the sequencing pipeline of interest. A genomic-context-specific sequencing error model was trained on the benchmark data to characterize the systematic sequencing artifacts and to derive the position-specific detection limit for sensitive low-frequency SNV detection. Further, a machine-learning algorithm used sequencing quality features to refine SNV candidates for higher specificity. RareVar outperformed existing approaches, especially at 0.5% to 5% frequency. We further explored the influence of statistical modeling on position-specific error modeling and showed the zero-inflated negative binomial to be the best-performing statistical distribution.
When replicating the analyses on an Illumina MiSeq benchmark dataset, our method seamlessly adapted to technologies with different biochemistries. RareVar enables sensitive detection of low-frequency SNVs across different sequencing platforms and will facilitate research and clinical applications such as pooled sequencing, early cancer detection, prognostic assessment, metastatic monitoring, and identification of relapse or acquired resistance.
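For readers unfamiliar with the zero-inflated negative binomial mentioned in this abstract, the sketch below evaluates its probability mass function. The mean/dispersion parameterization and the names `mu`, `r`, and `pi` are assumptions chosen for illustration; the abstract does not state how RareVar parameterizes or fits the distribution.

```python
import math


def zinb_pmf(k, mu, r, pi):
    """P(K = k) under a zero-inflated negative binomial.

    mu: mean of the negative binomial component (mu > 0)
    r:  dispersion (size) parameter (r > 0)
    pi: probability of an extra, structural zero (0 <= pi < 1)
    """
    p = r / (r + mu)
    # Negative binomial log-probability, computed with lgamma for stability.
    log_nb = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log1p(-p)
    )
    nb = math.exp(log_nb)
    # Zero inflation adds mass pi at k == 0 and rescales the rest by (1 - pi).
    return (pi if k == 0 else 0.0) + (1.0 - pi) * nb


# Hypothetical parameter values, for illustration only.
total = sum(zinb_pmf(k, mu=2.0, r=1.5, pi=0.3) for k in range(2000))
mean = sum(k * zinb_pmf(k, mu=2.0, r=1.5, pi=0.3) for k in range(2000))
```

With these values the probabilities sum to 1 and the mean is `(1 - pi) * mu`, so the extra zeros lower the expected count without changing the shape of the nonzero part.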
18

Andere, Anne A. "De novo genome assembly of the blow fly Phormia regina (Diptera: Calliphoridae)." Thesis, 2014. http://hdl.handle.net/1805/5630.

Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
Phormia regina (Meigen), commonly known as the black blow fly, is a dipteran that belongs to the family Calliphoridae. Calliphorids play an important role in various research fields, including ecology, medical studies, and veterinary and forensic sciences. P. regina, a non-model organism, is one of the most common forensically relevant insects in North America and is typically used to assist in estimating postmortem intervals (PMI). To better understand the roles P. regina plays in these research fields, we reconstructed its genome using next-generation sequencing technologies. The focus was on generating a reference genome through de novo assembly of high-throughput short-read sequences. Following assembly, genetic markers were identified in the form of microsatellites and single nucleotide polymorphisms (SNPs) to aid future population genetic surveys of P. regina. A total of 530 million 100 bp paired-end reads were obtained from five pooled male and female P. regina flies using the Illumina HiSeq2000 sequencing platform. A 524 Mbp draft genome was assembled using both sexes, with 11,037 predicted genes. The draft reference genome assembled in this study provides an important resource for investigating the genetic diversity that exists between and among blow fly species, and supports the understanding of their genetic basis in terms of adaptation, population structure, and evolution. The genomic tools will facilitate genome-wide studies using modern genomic techniques to refine our understanding of the evolutionary processes underlying genomic evolution between blow flies and other insect species.