Doctoral dissertations: "Sequence Feature"

1

Smith, Stephen Mark. "Feature based image sequence understanding". Thesis, University of Oxford, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.316951.


2

Sung, Raymond Chun Wai. "Automatic assembly feature recognition and disassembly sequence generation". Thesis, Heriot-Watt University, 2001. http://hdl.handle.net/10399/478.


3

Lai, Man Lok Michael. "Image sequence coding using intensity-based feature separation". Thesis, University of Cambridge, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.284145.


4

Tangirala, Karthik. "Unsupervised feature construction approaches for biological sequence classification". Diss., Kansas State University, 2015. http://hdl.handle.net/2097/19123.

Abstract:

Doctor of Philosophy
Department of Computing and Information Sciences
Doina Caragea
Recent advancements in biological sciences have resulted in the availability of large amounts of sequence data (DNA and protein sequences). Biological sequence data can be annotated using machine learning techniques, but most learning algorithms require data to be represented by a vector of features. In the absence of biologically informative features, k-mers generated using a sliding window-based approach are commonly used to represent biological sequences. A larger k value typically results in better features; however, the number of k-mer features is exponential in k, and many k-mers are not informative. Feature selection is widely used to reduce the dimensionality of the input feature space. Most feature selection techniques use feature-class dependency scores to rank the features. However, when the amount of available labeled data is small, feature selection techniques may not accurately capture feature-class dependency scores. Therefore, instead of working with all k-mers, this dissertation proposes the construction of a reduced set of informative k-mers that can be used to represent biological sequences. This work resulted in three novel unsupervised approaches to construct features: 1. Burrows Wheeler Transform-based approach, which uses the sorted permutations of a given sequence to construct sequential features (subsequences) that occur multiple times in a given sequence. 2. Community detection-based approach, which uses a community detection algorithm to group similar subsequences into communities and refines the communities to form motifs (groups of similar subsequences). Motifs obtained using the community detection-based approach satisfy the ZOMOPS constraint (Zero, One or Multiple Occurrences of a Motif Per Sequence). All possible unique subsequences of the obtained motifs are then used as features to represent the sequences. 3. Hybrid-based approach, which combines the Burrows Wheeler Transform-based approach and the community detection-based approach to allow certain mismatches to the features constructed using the Burrows Wheeler Transform-based approach. To evaluate the predictive power of the features constructed using the proposed approaches, experiments were conducted in three learning scenarios: supervised, semi-supervised, and domain adaptation for both nucleotide and protein sequence classification problems. The performance of classifiers learned using features generated with the proposed approaches was compared with the performance of the classifiers learned using k-mers (with feature selection) and feature hashing (another unsupervised dimensionality reduction technique). Experimental results from the three learning scenarios showed that features constructed with the proposed approaches were typically more informative than k-mers and feature hashing.
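
The sliding-window k-mer baseline that this abstract starts from can be sketched as follows. This is a minimal illustration and not code from the dissertation; the function names kmer_counts and kmer_vector and the default DNA alphabet are assumptions made for the example. It also makes the dimensionality problem visible: the vocabulary has len(alphabet)**k entries, so it grows exponentially in k.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every k-mer seen by a sliding window of width k and step 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty range, hence an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_vector(sequence: str, k: int, alphabet: str = "ACGT"):
    """Return (vocabulary, counts) over all len(alphabet)**k possible k-mers.

    Windows containing letters outside the alphabet (for example 'N')
    are simply not represented in the vector.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = kmer_counts(sequence, k)
    return vocab, [counts.get(word, 0) for word in vocab]
```

For example, kmer_vector("ACGTAC", 2) returns a 16-entry vocabulary in which only four 2-mers have non-zero counts; most entries are zero even for this tiny case, which is the sparsity that feature selection or feature construction tries to address.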


5

Nilsson, Daniel. "Genomic feature identification in trypanosomatid parasites /". Stockholm, 2006. http://diss.kib.ki.se/2006/91-7140-789-8/.


6

Bozkurt, Burcin. "Prediction Of Protein Subcellular Localization Using Global Protein Sequence Feature". Master's thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/3/1135292/index.pdf.

Abstract:

The problem of identifying genes in eukaryotic genomic sequences by computational methods has attracted considerable research attention in recent years. Many early approaches to the problem focused on prediction of individual functional elements and compositional properties of coding and non-coding deoxyribonucleic acid (DNA) in entire eukaryotic gene structures. More recently, a number of approaches have been developed which integrate multiple types of information including structure, function and genetic properties of proteins. Knowledge of the structure of a protein is essential for describing and understanding its function. In addition, subcellular localization of a protein can be used to provide some amount of characterization of a protein. In this study, a method for the prediction of protein subcellular localization based on primary sequence data is described. Primary sequence data for a protein is based on amino acid sequence. The frequency value for each amino acid is computed in one given position. Assigned frequencies are used in a new encoding scheme that conserves biological information based on the point accepted mutation (PAM) substitution matrix. This method can be used to predict the nuclear and cytosolic sequences, the mitochondrial targeting peptides (mTP) and the signal peptides (SP). For clustering purposes, other than well-known traditional techniques, principal component analysis (PCA) and self-organizing maps (SOM) are used. For classification purposes, support vector machines (SVM), a method of statistical learning theory recently introduced to bioinformatics, is used. The aim of the combination of feature extraction, clustering and classification methods is to design an accurate system that predicts the subcellular localization of proteins presented to the system. Our scheme for combining several methods is cascading or serial combination according to its architecture. In the cascading architecture, the output of one method serves as the input of the next.


7

Islamaj, Rezarta. "Feature generation and analysis applied to sequence classification for splice-site prediction". College Park, Md.: University of Maryland, 2007. http://hdl.handle.net/1903/7745.

Abstract:

Thesis (Ph. D.) -- University of Maryland, College Park, 2007.
Thesis research directed by: Dept. of Computer Science. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.


8

Lakshmanan, Arun. "Practice makes imperfect? sequence learning and the discontinuous acquisition of feature use skills /". [Bloomington, Ind.] : Indiana University, 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3331241.

Abstract:

Thesis (Ph.D.)--Indiana University, Kelley School of Business, 2008.
Title from PDF t.p. (viewed on Jul 23, 2009). Source: Dissertation Abstracts International, Volume: 69-11, Section: A, page: 4418. Adviser: Shanker Krishnan.


9

Ali, Isse. "Analysing and predicting differences between methylated and unmethylated DNA sequence features". Thesis, De Montfort University, 2015. http://hdl.handle.net/2086/12616.

Abstract:

DNA methylation is involved in various biological phenomena, and its dysregulation has been shown to correlate with a number of human disease processes, including cancers, autism, and autoimmune, mental health and neurodegenerative diseases. Characterising and modelling these biological phenomena has become important and useful for understanding the mechanism of such occurrences, in relation to both health and disease. An attempt has previously been made to map DNA methylation across human tissues; however, the means of distinguishing between methylated, unmethylated and differentially methylated groups using DNA sequence features remains unclear. The aim of this study is therefore: firstly, to investigate DNA methylation classes and predict these based on DNA sequence features; secondly, to further identify methylation-associated DNA sequence features, and distinguish methylation differences between males and females in relation to both healthy and diseased statuses. This research is conducted in relation to three samples within nine biological feature sub-sets extracted from DNA sequence patterns (Human genome database). Two samples contain classes (methylated, unmethylated and differentially methylated) within a total of 642 samples with 3,809 attributes derived from four human chromosomes, i.e. chromosomes 6, 20, 21 and 22. The third sample contains all human chromosomes and encompasses 1,628 individuals; from it, 1,505 CpG loci (features) were extracted by using hierarchical clustering (a heatmap process) along with pair correlation distance, and feature selection methods were then applied. From this analysis, the author extracts 47 features associated with gender and age, with 17 revealing significant methylation differences between males and females. Methylation classes were predicted with a K-nearest neighbour classifier combined with ten-fold cross-validation. Some data were severely imbalanced (i.e., existed in sub-classes), and it has been established that direct analysis in machine learning is biased towards the majority class. Hence, the author proposes a Modified Leave-One-Out (MLOO) cross-validation and AdaBoost methods to tackle these issues, with the aim of producing a balanced outcome and limiting the bias interference from inter-differences of the classes involved. This has provided potential predictive accuracies between 75% and 100%, based on the DNA sequence context.


10

Abril, Ferrando Josep Francesc. "Comparative analysis of eukaryotic gene sequence features". Doctoral thesis, Universitat Pompeu Fabra, 2005. http://hdl.handle.net/10803/7108.

Abstract:

The constantly increasing amount of available genome sequences, along
with an increasing number of experimental techniques, will help to
produce the complete catalog of cellular functions for different
organisms, including humans. Such a catalog will define the base from
which we will better understand how organisms work at the molecular
level. At the same time it will shed light on which changes are
associated with disease. Therefore, the raw sequence from genome
sequencing projects is worthless without the complete analysis and
further annotation of the genomic features that define those functions.
This dissertation presents our contribution to three related aspects of
gene annotation on eukaryotic genomes.

First, a comparison at sequence level of human and mouse genomes was
performed by developing a semi-automatic analysis pipeline. The SGP2
gene-finding tool was developed from procedures used in this pipeline.
The concept behind SGP2 is that similarity regions obtained by TBLASTX
are used to increase the score of exons predicted by geneid, in order to
produce a more accurate set of gene structures. SGP2 provides a
specificity that is high enough for its predictions to be experimentally
verified by RT-PCR. The RT-PCR validation of predicted splice junctions
also serves as an example of how combined computational and experimental
approaches will yield the best results.

Then, we performed a descriptive analysis at sequence level of the
splice site signals from a reliable set of orthologous genes for human,
mouse, rat and chicken. We have explored the differences at nucleotide
sequence level between U2 and U12 for the set of orthologous introns
derived from those genes. We found that orthologous splice signals
between human and rodents and within rodents are more conserved than
unrelated splice sites. However, additional conservation can be
explained mostly by background intron conservation. Additional
conservation over background is detectable in orthologous mammalian and
chicken splice sites. Our results also indicate that the U2 and U12
intron classes have evolved independently since the split of mammals and
birds. We found neither a convincing case of interconversion between these
two classes in our sets of orthologous introns, nor any single case of
switching between AT-AC and GT-AG subtypes within U12 introns. In
contrast, switching between GT-AG and GC-AG U2 subtypes does not appear
to be unusual.
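
The subtype labels used above (GT-AG, GC-AG, AT-AC) are named after the first two and last two nucleotides of an intron. A minimal sketch of how such labels could be read off and compared between two orthologous introns is shown below. It is an illustration only, not code from the thesis; the function names are hypothetical, and terminal dinucleotides alone do not distinguish U2 from U12 introns, which is a separate classification not attempted here.

```python
def intron_boundary_subtype(intron: str) -> str:
    """Return the terminal dinucleotide pair of an intron, e.g. 'GT-AG'."""
    # Accept lower-case input and RNA letters (U is treated as T).
    seq = intron.upper().replace("U", "T")
    if len(seq) < 4:
        raise ValueError("intron too short to have two terminal dinucleotides")
    return f"{seq[:2]}-{seq[-2:]}"


def is_subtype_switch(intron_a: str, intron_b: str) -> bool:
    """True when two (assumed orthologous) introns have different subtypes."""
    return intron_boundary_subtype(intron_a) != intron_boundary_subtype(intron_b)
```

Under these assumptions, comparing an intron that begins with GT and ends with AG against an ortholog that begins with GC and ends with AG would be reported as a switch, the kind of GT-AG to GC-AG change the abstract describes as not unusual among U2 introns.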

Finally, we implemented visualization tools to integrate annotation
features for gene-finding and comparative analyses. One of those tools,
gff2ps, was used to draw the whole genome maps for human, fruitfly and
mosquito. gff2aplot and the accompanying parsers facilitate the task of
integrating sequence annotations with the output of homology-based tools,
like BLAST. We have also adapted the concept of pictograms to the
comparative analysis of orthologous splice sites, by developing compi.


11

Ozturk, Ozgur. "Feature extraction and similarity-based analysis for proteome and genome databases". The Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=osu1190138805.


12

Lyons, James Geoffrey. "Enhanced Feature Extraction from Evolutionary Profiles for Protein Fold Recognition". Thesis, Griffith University, 2016. http://hdl.handle.net/10072/365732.

Abstract:

Proteins are important biological macromolecules that play important roles in almost all biological reactions. The function of a protein is dependent on the shape it folds into, which is in turn dependent on the protein’s amino acid sequence. Experimental approaches for determining a protein’s 3D structure are expensive and time consuming, so computational methods for determining the structure from the amino acid sequence are desired. Methods for directly computing the 3D structure of a protein exist; however, they are impractical for large proteins and high resolution models due to the large search space. Instead of trying to directly find the 3D structure from first principles, the primary structure can be compared to proteins with known 3D structure. A ‘fold’ is a way of classifying proteins with the same major secondary structures in the same arrangement and with the same topological connections. Protein Fold Recognition (PFR) is an important step towards determining a protein’s structure, simplifying the protein structure prediction problem. This is a multi-class classification problem solvable using machine learning techniques. The PFR problem has been widely studied in the past, with feature extraction approaches including using counts of amino acids and pairs of amino acids, physicochemical information, evolutionary information from the Position Specific Scoring Matrix (PSSM), and structural information from its predicted secondary structure. These approaches do work, but with limited success. Current state of the art features use information from the PSSM as well as the predicted secondary structure.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
Griffith School of Engineering
Science, Environment, Engineering and Technology


13

Kabir, Mitra. "Prediction of mammalian essential genes based on sequence and functional features". Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/prediction-of-mammalian-essential-genes-based-on-sequence-and-functional-features(cf8eeed5-c2b3-47c3-9a8f-2cc290c90d56).html.

Abstract:

Essential genes are those whose presence is imperative for an organism's survival, whereas the functions of non-essential genes may be useful but not critical. Abnormal functionality of essential genes may lead to defects or death at an early stage of life. Knowledge of essential genes is therefore key to understanding development, maintenance of major cellular processes and tissue-specific functions that are crucial for life. Existing experimental techniques for identifying essential genes are accurate, but most of them are time consuming and expensive. Predicting essential genes using computational methods, therefore, would be of great value as they circumvent experimental constraints. Our research is based on the hypothesis that mammalian essential (lethal) and non-essential (viable) genes are distinguishable by various properties. We examined a wide range of features of Mus musculus genes, including sequence, protein-protein interactions, gene expression and function, and found 75 features that were statistically discriminative between lethal and viable genes. These features were used as inputs to create a novel machine learning classifier, allowing the prediction of a mouse gene as lethal or viable with the cross-validation and blind test accuracies of ∼91% and ∼93%, respectively. The prediction results are promising, indicating that our classifier is an effective mammalian essential gene prediction method. We further developed the mouse gene essentiality study by analysing the association between essentiality and gene duplication. Mouse genes were labelled as singletons or duplicates, and their expression patterns over 13 developmental stages were examined. We found that lethal genes originating from duplicates are considerably lower in proportion than singletons. At all developmental stages a significantly higher proportion of singletons and lethal genes are expressed than duplicates and viable genes. Lethal genes were also found to be more ancient than viable genes. In addition, we observed that duplicate pairs with similar patterns of developmental co-expression are more likely to be viable; lethal gene duplicate pairs do not have such a trend. Overall, these results suggest that duplicate genes in mouse are less likely to be essential than singletons. Finally, we investigated the evolutionary age of mouse genes across development to see if the morphological hourglass pattern exists in the mouse. We found that in mouse embryos, genes expressed in early and late stages are evolutionarily younger than those expressed in mid-embryogenesis, thus yielding an hourglass pattern. However, the oldest genes are not expressed at the phylotypic stage stated in prior studies, but instead at an earlier time point - the egg cylinder stage. These results question the application of the hourglass model to mouse development.


14

Zhang, Nan. "Feature selection based segmentation of multi-source images : application to brain tumor segmentation in multi-sequence MRI". Phd thesis, INSA de Lyon, 2011. http://tel.archives-ouvertes.fr/tel-00701545.

Abstract:

Multi-spectral images have the advantage of providing complementary information to resolve some ambiguities. But, the challenge is how to make use of the multi-spectral images effectively. In this thesis, our study focuses on the fusion of multi-spectral images by extracting the most useful features to obtain the best segmentation with the least cost in time. The Support Vector Machine (SVM) classification integrated with a selection of the features in a kernel space is proposed. The selection criterion is defined by the kernel class separability. Based on this SVM classification, a framework to follow up brain tumor evolution is proposed, which consists of the following steps: to learn the brain tumors and select the features from the first magnetic resonance imaging (MRI) examination of the patients; to automatically segment the tumor in new data using a multi-kernel SVM based classification; to refine the tumor contour by a region growing technique; and to possibly carry out an adaptive training. The proposed system was tested on 13 patients with 24 examinations, including 72 MRI sequences and 1728 images. Compared with the manual traces of the doctors as the ground truth, the average classification accuracy reaches 98.9%. The system utilizes several novel feature selection methods to test the integration of feature selection and SVM classifiers. Also compared with the traditional SVM, Fuzzy C-means, the neural network and an improved level set method, the segmentation results and quantitative data analysis demonstrate the effectiveness of our proposed system.


15

Taha, May. "Probing sequence-level instructions for gene expression". Thesis, Montpellier, 2018. http://www.theses.fr/2018MONTT096/document.

Abstract:

Gene regulation is tightly controlled to ensure a wide variety of cell types and functions. These controls take place at different levels and are associated with different genomic regulatory regions. A current challenge is to understand how the gene regulation machinery works in each cell type and to identify the most important regulators. Several studies attempt to understand the regulatory mechanisms by modeling gene expression using epigenetic marks. Nonetheless, these approaches rely on experimental data, which are limited to some samples and are costly and time-consuming to produce. Besides, the important component of gene regulation encoded at the sequence level cannot be captured by these approaches. The main objective of this thesis is to explain mRNA expression based only on DNA sequence features. In a first work, we use Lasso penalized linear regression to predict gene expression using DNA features such as transcription factor binding sites (motifs) and nucleotide compositions. We measure the accuracy of our approach on several datasets from the TCGA database and find performance similar to that of models fitted with experimental data. In addition, we show that the nucleotide compositions of different regulatory regions have a major impact on gene expression. Furthermore, we rank the influence of each regulatory region and show a strong effect of the gene body, especially introns. In a second part, we try to increase the performance of the model. We first consider adding interactions between nucleotide compositions and applying non-linear transformations to predictive variables. This induces a slight increase in model performance. To go one step further, we then train deep neural networks. We consider two types of neural networks: multilayer perceptrons and convolutional networks. Hyperparameters of each network are optimized. The performance of both types of networks appears slightly higher than that of a Lasso penalized linear model.
In this thesis, we were able to (i) demonstrate the existence of sequence-level instructions for gene expression and (ii) provide different frameworks based on complementary approaches. Additional work is ongoing, in particular in the last direction based on deep learning, with the aim of detecting additional information present in the sequence.
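
As a rough illustration of the Lasso step mentioned in the abstract above, the following sketch fits an L1-penalized linear regression by coordinate descent. It is not the thesis's implementation: the function names (`soft_threshold`, `lasso_coordinate_descent`), the toy data, and the objective scaling (squared error divided by 2n, plus `lam` times the L1 norm of the weights) are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 (assumed scaling).

    X is a list of rows (samples), y a list of targets. No intercept is
    fitted, so inputs are assumed to be centred beforehand.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


if __name__ == "__main__":
    # Hypothetical toy data: the target depends on the first feature only.
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
    y = [2.0 * row[0] for row in X]
    print(lasso_coordinate_descent(X, y, lam=1.0))
```

In a setting like the one described, each row of `X` would hold sequence-derived features (for example nucleotide compositions of regulatory regions) and `y` the measured expression; the penalty `lam` controls how many features keep a non-zero weight.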

Style APA, Harvard, Vancouver, ISO itp.

16

Eberhardt, Katharina [Verfasser], Hilde [Akademischer Betreuer] Haider i Iring [Akademischer Betreuer] Koch. "What are the basic modules of implicit sequence learning? A feature-based account / Katharina Eberhardt. Gutachter: Hilde Haider ; Iring Koch". Köln : Universitäts- und Stadtbibliothek Köln, 2016. http://d-nb.info/1098427408/34.

Pełny tekst źródła

Style APA, Harvard, Vancouver, ISO itp.

17

Eberhardt, Katharina [Verfasser], Hilde [Akademischer Betreuer] Haider i Iring [Akademischer Betreuer] Koch. "What are the basic modules of implicit sequence learning? A feature-based account / Katharina Eberhardt. Gutachter: Hilde Haider ; Iring Koch". Köln : Universitäts- und Stadtbibliothek Köln, 2016. http://nbn-resolving.de/urn:nbn:de:hbz:38-67021.

Pełny tekst źródła

Style APA, Harvard, Vancouver, ISO itp.

18

Li, Xiaomeng. "Human Promoter Recognition Based on Principal Component Analysis". Thesis, The University of Sydney, 2008. http://hdl.handle.net/2123/3656.

Pełny tekst źródła

Streszczenie:

This thesis presents an innovative human promoter recognition model, HPR-PCA. Principal component analysis (PCA) is applied to the selection of context features from DNA sequences, and the prediction network is built with an artificial neural network (ANN). A thorough literature review of all the relevant topics in the promoter prediction field is also provided. As the main technique of HPR-PCA, the application of PCA to feature selection is developed first. In order to find informative and discriminative features for effective classification, PCA is applied to the different n-mer promoter and exon combined frequency matrices, and the principal components (PCs) of each matrix are generated to construct the new feature space. ANN-based classifiers are used to test the discriminability of each feature space. Finally, the 3- and 5-mer feature matrix is selected as the context feature in this model. Two proposed schemes of the HPR-PCA model are discussed, and the implementations of the sub-modules in each scheme are introduced. The context features selected by PCA are used to build three promoter and non-promoter classifiers. CpG-island modules are embedded into the models in different ways. In the comparison, Scheme I obtains better prediction results on two test sets, so it is adopted as the model for HPR-PCA for further evaluation. Three existing promoter prediction systems are compared with HPR-PCA on three test sets, including the chromosome 22 sequence. The performance of HPR-PCA is outstanding compared to the other four systems.
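
The n-mer frequency matrices mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`kmer_frequencies`, `frequency_matrix`), the lexicographic column order and the handling of ambiguous bases are choices made here, not details taken from the thesis, and the PCA step that would follow is not shown.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(sequence, k):
    """Relative frequencies of all 4**k k-mers, in lexicographic order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    sequence = sequence.upper()
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def frequency_matrix(sequences, k):
    """One row per sequence, one column per k-mer."""
    return [kmer_frequencies(s, k) for s in sequences]


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for promoter and exon samples.
    for row in frequency_matrix(["ACGTACGT", "GGGCGCGC"], k=1):
        print(row)
```

Each row of the resulting matrix is a fixed-length feature vector, which is the form a PCA or a neural network classifier expects as input.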

Style APA, Harvard, Vancouver, ISO itp.

19

Li, Xiaomeng. "Human Promoter Recognition Based on Principal Component Analysis". University of Sydney, 2008. http://hdl.handle.net/2123/3656.

Pełny tekst źródła

Streszczenie:

Master of Engineering
This thesis presents an innovative human promoter recognition model, HPR-PCA. Principal component analysis (PCA) is applied to the selection of context features from DNA sequences, and the prediction network is built with an artificial neural network (ANN). A thorough literature review of all the relevant topics in the promoter prediction field is also provided. As the main technique of HPR-PCA, the application of PCA to feature selection is developed first. In order to find informative and discriminative features for effective classification, PCA is applied to the different n-mer promoter and exon combined frequency matrices, and the principal components (PCs) of each matrix are generated to construct the new feature space. ANN-based classifiers are used to test the discriminability of each feature space. Finally, the 3- and 5-mer feature matrix is selected as the context feature in this model. Two proposed schemes of the HPR-PCA model are discussed, and the implementations of the sub-modules in each scheme are introduced. The context features selected by PCA are used to build three promoter and non-promoter classifiers. CpG-island modules are embedded into the models in different ways. In the comparison, Scheme I obtains better prediction results on two test sets, so it is adopted as the model for HPR-PCA for further evaluation. Three existing promoter prediction systems are compared with HPR-PCA on three test sets, including the chromosome 22 sequence. The performance of HPR-PCA is outstanding compared to the other four systems.

Style APA, Harvard, Vancouver, ISO itp.

20

Xi, Min. "Image sequence guidance for mobile robot navigation". Thesis, Queensland University of Technology, 1998. https://eprints.qut.edu.au/36082/1/36082_Xi_1998.pdf.

Pełny tekst źródła

Streszczenie:

Vision-based mobile robot navigation is a challenging issue in automated robot control. Using a camera as an active sensor requires the processing of a huge amount of visual data that is captured as an image sequence. The relevant visual information for a robot navigation system needs to be extracted from the visual data and used for real-time control. Several questions need to be answered, including: 1) What is the relevant information and how can it be extracted from a sequence of 2D images? 2) How can the 3D surrounding environment be recognised from the extracted images? 3) How can a collision-free path be generated for robot navigation? This thesis discusses all three questions and presents the design of a complete vision-based mobile robot navigation system for an a priori unknown indoor environment. The image sequence is captured continuously via an on-board camera during robot navigation. The movement of the robot with the mounted camera causes an optical flow of image points, which is utilised for the extraction of three-dimensional information and the estimation of robot motion in the scene. The developed image sequence processing algorithm is designed with emphasis on speed so that the system is fast enough to meet the real-time control requirement. The introduction of a reference image enables the prediction of regions of interest in the image sequence, reducing computational complexity. The system is able to recognise three-dimensional surroundings from the image sequence and to reconstruct them into a two-dimensional map with information about the location of obstacles. From this map, a collision-free path for robot navigation is generated with the grid potential algorithm. Furthermore, the system is capable of learning and establishing the geometric construction of the building by exploration, which is the first step in building a preliminary artificially intelligent mobile robot.

Style APA, Harvard, Vancouver, ISO itp.

21

Panahandeh, Ghazaleh, Nasser Mohammadiha i Magnus Jansson. "Ground Plane Feature Detection in Mobile Vision-Aided Inertial Navigation". KTH, Signalbehandling, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-99448.

Pełny tekst źródła

Streszczenie:

In this paper, a method for determining ground plane features in a sequence of images captured by a mobile camera is presented. The hardware of the mobile system consists of a monocular camera that is mounted on an inertial measurement unit (IMU). An image processing procedure is proposed, first to extract image features and match them across consecutive image frames, and second to detect the ground plane features using a two-step algorithm. In the first step, the planar homography of the ground plane is constructed using an IMU-camera motion estimation approach. The obtained homography constraints are used to detect the most likely ground features in the sequence of images. To reject the remaining outliers, as the second step, a new plane normal vector computation approach is proposed. To obtain the normal vector of the ground plane, only three pairs of corresponding features are used for a general camera transformation. The normal-based computation approach generalizes the existing methods that are developed for specific camera transformations. Experimental results on real data validate the reliability of the proposed method.
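
For readers unfamiliar with the planar homography referred to in the abstract above, a short derivation may help. It is a sketch under assumed conventions, not the paper's own notation: a 3D point X_1 in the first camera frame maps to X_2 = R X_1 + t in the second, ground-plane points satisfy n^T X_1 = d (unit normal n, distance d to the first camera centre), K is an intrinsic matrix assumed identical in both views, and x_1, x_2 are homogeneous pixel coordinates.

```latex
% Assumed convention: X_2 = R X_1 + t, and plane points satisfy n^T X_1 = d.
\begin{align}
X_2 &= R X_1 + t \\
    &= R X_1 + t \,\frac{n^{\top} X_1}{d} \\
    &= \Bigl(R + \tfrac{1}{d}\, t\, n^{\top}\Bigr) X_1 , \\
x_2 &\simeq K \Bigl(R + \tfrac{1}{d}\, t\, n^{\top}\Bigr) K^{-1} x_1 = H x_1 .
\end{align}
```

Here the symbol \simeq denotes equality up to scale. Other texts write the plane as n^T X + d = 0, which flips the sign of the second term. Under either convention, matched features that satisfy x_2 \simeq H x_1 within a tolerance are candidates for lying on the ground plane.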

Style APA, Harvard, Vancouver, ISO itp.

22

Yin, Pei. "Segmental discriminative analysis for American Sign Language recognition and verification". Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33939.

Pełny tekst źródła

Streszczenie:

This dissertation presents segmental discriminative analysis techniques for American Sign Language (ASL) recognition and verification. ASL recognition is a sequence classification problem. One of the most successful techniques for recognizing ASL is the hidden Markov model (HMM) and its variants. This dissertation addresses two problems in sign recognition by HMMs. The first is discriminative feature selection for temporally-correlated data. Temporal correlation in sequences often causes difficulties in feature selection. To mitigate this problem, this dissertation proposes segmentally-boosted HMMs (SBHMMs), which construct the state-optimized features in a segmental and discriminative manner. The second problem is the decomposition of ASL signs for efficient and accurate recognition. For this problem, this dissertation proposes discriminative state-space clustering (DISC), a data-driven method of automatically extracting sub-sign units by state-tying from the results of feature selection. DISC and SBHMMs can jointly search for discriminative feature sets and representation units of ASL recognition. ASL verification, which determines whether an input signing sequence matches a pre-defined phrase, shares similarities with ASL recognition, but it has more prior knowledge and a higher expectation of accuracy. Therefore, ASL verification requires additional discriminative analysis not only in utilizing prior knowledge but also in actively selecting a set of phrases that have a high expectation of verification accuracy in the service of improving the experience of users. This dissertation describes ASL verification using CopyCat, an ASL game that helps deaf children acquire language abilities at an early age. 
It then presents the "probe" technique, which automatically searches for an optimal threshold for verification using prior knowledge, and BIG, a bi-gram error-ranking predictor which efficiently selects or creates phrases that, based on the previous performance of existing verification systems, should have high verification accuracy. This work demonstrates the utility of the described technologies in a series of experiments. SBHMMs are validated in ASL phrase recognition as well as various other applications such as lip reading and speech recognition. DISC-SBHMMs consistently produce fewer errors than traditional HMMs and SBHMMs in recognizing ASL phrases using an instrumented glove. Probe achieves verification efficacy comparable to the optimum obtained from a manual exhaustive search. Finally, when verifying phrases in CopyCat, BIG predicts which CopyCat phrases, even those unseen in training, will have the best verification accuracy, with results comparable to much more computationally intensive methods.

Style APA, Harvard, Vancouver, ISO itp.

23

Tucker, Dominic M. "Mapping and Characterization of Phytophthora sojae and Soybean Mosaic Virus Resistance in Soybean". Diss., Virginia Tech, 2009. http://hdl.handle.net/10919/79598.

Pełny tekst źródła

Streszczenie:

Phytophthora sojae, the causal organism of stem and root rot, and Soybean mosaic virus (SMV) cause two of the most destructive diseases of soybean (Glycine max L. Merr). P. sojae can be managed either through deployment of race-specific resistance or through quantitative resistance, termed partial resistance. In the current study, partial resistance to P. sojae was mapped in an interspecific recombinant inbred line (RIL) population of Glycine max by Glycine soja. One major quantitative trait locus (QTL) on molecular linkage group (MLG)-J (chromosome 16) and two minor QTL on MLG-I (chromosome 20) and -G (chromosome 18) were mapped using conventional molecular markers. Additionally, partial resistance to P. sojae was mapped in the same RIL population using single feature polymorphism (SFP) markers, which further fine-mapped the P. sojae QTL and identified potential candidate genes contributing to resistance. In a separate study, race-specific resistance was characterized in PI96983, revealing a potentially new allele of Rps4 on MLG-G. Finally, using the newly available whole-genome shotgun sequence of soybean, Rsv4, which confers resistance to the strains of SMV known in the US, was localized to an approximately 100 kb region on chromosome 2 (MLG-D1B). Newly designed PCR-based markers permit efficient selection of Rsv4 by breeding programs. Identified candidate genes for Rsv4 are discussed. Genomic resources developed in all of these studies provide breeders with the tools necessary for developing durable resistance to both SMV and P. sojae.
Ph. D.

Style APA, Harvard, Vancouver, ISO itp.

24

Elbita, Abdulhakim Mehemed. "Efficient processing of corneal confocal microscopy images : development of a computer system for the pre-processing, feature extraction, classification, enhancement and registration of a sequence of corneal images". Thesis, University of Bradford, 2013. http://hdl.handle.net/10454/6463.

Pełny tekst źródła

Streszczenie:

Corneal diseases are one of the major causes of visual impairment and blindness worldwide. Used for diagnoses, a laser confocal microscope provides a sequence of images, at incremental depths, of the various corneal layers and structures. From these, ophthalmologists can extract clinical information on the state of health of a patient’s cornea. However, many factors impede ophthalmologists in forming diagnoses, starting with the large number and variable quality of the individual images (blurring, non-uniform illumination within images, variable illumination between images and noise), and there are also difficulties posed for automatic processing caused by eye movements in both lateral and axial directions during the scanning process. Aiding ophthalmologists working with long sequences of corneal images requires the development of new algorithms which enhance, correctly order and register the corneal images within a sequence. The novel algorithms devised for this purpose and presented in this thesis are divided into four main categories. The first is enhancement to reduce the problems within individual images. The second is automatic image classification to identify which part of the cornea each image belongs to, when the images may not be in the correct sequence. The third is automatic reordering of the images to place them in the right sequence. The fourth is automatic registration of the images with each other. A flexible application called CORNEASYS has been developed and implemented using MATLAB and the C language to provide and run all the algorithms and methods presented in this thesis. CORNEASYS offers users a collection of all the proposed approaches and algorithms in this thesis in one platform package. CORNEASYS also provides a facility to help the research team and ophthalmologists, who are in discussions to determine future system requirements which meet clinicians’ needs.

Style APA, Harvard, Vancouver, ISO itp.

25

Elbita, Abdulhakim M. "Efficient Processing of Corneal Confocal Microscopy Images. Development of a computer system for the pre-processing, feature extraction, classification, enhancement and registration of a sequence of corneal images". Thesis, University of Bradford, 2013. http://hdl.handle.net/10454/6463.

Pełny tekst źródła

Streszczenie:

Corneal diseases are one of the major causes of visual impairment and blindness worldwide. Used for diagnoses, a laser confocal microscope provides a sequence of images, at incremental depths, of the various corneal layers and structures. From these, ophthalmologists can extract clinical information on the state of health of a patient’s cornea. However, many factors impede ophthalmologists in forming diagnoses, starting with the large number and variable quality of the individual images (blurring, non-uniform illumination within images, variable illumination between images and noise), and there are also difficulties posed for automatic processing caused by eye movements in both lateral and axial directions during the scanning process. Aiding ophthalmologists working with long sequences of corneal images requires the development of new algorithms which enhance, correctly order and register the corneal images within a sequence. The novel algorithms devised for this purpose and presented in this thesis are divided into four main categories. The first is enhancement to reduce the problems within individual images. The second is automatic image classification to identify which part of the cornea each image belongs to, when the images may not be in the correct sequence. The third is automatic reordering of the images to place them in the right sequence. The fourth is automatic registration of the images with each other. A flexible application called CORNEASYS has been developed and implemented using MATLAB and the C language to provide and run all the algorithms and methods presented in this thesis. CORNEASYS offers users a collection of all the proposed approaches and algorithms in this thesis in one platform package. CORNEASYS also provides a facility to help the research team and ophthalmologists, who are in discussions to determine future system requirements which meet clinicians’ needs.
The data and image files accompanying this thesis are not available online.

Style APA, Harvard, Vancouver, ISO itp.

26

Hauser, Václav. "Rozpoznávání obličejů v obraze". Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2012. http://www.nusl.cz/ntk/nusl-219434.

Pełny tekst źródła

Streszczenie:

This master's thesis deals with the detection and recognition of faces in images. It describes the methods used for face detection and recognition, with principal component analysis (PCA) described in detail. This method is subsequently used in the implementation of face recognition in a video sequence. In connection with the implementation, the thesis describes the OpenCV library, which was used for the implementation, specifically its C++ API. Finally, tests of the application on two different video sequences are described.

Style APA, Harvard, Vancouver, ISO itp.

27

Costa, Gabriella Castro Barbosa. "Uma abordagem para linha de produtos de software científico baseada em ontologia e workflow". Universidade Federal de Juiz de Fora (UFJF), 2013. https://repositorio.ufjf.br/jspui/handle/ufjf/4787.

Pełny tekst źródła

Streszczenie:

CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Uma forma de aprimorar a reutilização e a manutenção de uma família de produtos de software é através da utilização de uma abordagem de Linha de Produtos de Software (LPS). Em algumas situações, tais como aplicações científicas para uma determinada área, é vantajoso desenvolver uma coleção de produtos de software relacionados, utilizando uma abordagem de LPS. Linhas de Produtos de Software Científico (LPSC) diferem-se de Linhas de Produtos de Software pelo fato de que LPSC fazem uso de um modelo abstrato de workflow científico. Esse modelo abstrato de workflow é definido de acordo com o domínio científico e, através deste workflow, os produtos da LPSC serão instanciados. Analisando as dificuldades em especificar experimentos científicos e considerando a necessidade de composição de aplicações científicas para a sua implementação, constata-se a necessidade de um suporte semântico mais adequado para a fase de análise de domínio. Para tanto, este trabalho propõe uma abordagem baseada na associação de modelo de features e ontologias, denominada PL-Science, para apoiar a especificação e a condução de experimentos científicos. A abordagem PL-Science, que considera o contexto de LPSC, visa auxiliar os cientistas através de um workflow que engloba as aplicações científicas de um dado experimento. Usando os conceitos de LPS, os cientistas podem reutilizar modelos que especificam a LPSC e tomar decisões de acordo com suas necessidades. Este trabalho enfatiza o uso de ontologias para facilitar o processo de aplicação de LPS em domínios científicos. Através do uso de ontologia como um modelo de domínio consegue-se fornecer informações adicionais, bem como adicionar mais semântica ao contexto de LPSC.
One way to improve the reusability and maintainability of a family of software products is the Software Product Line (SPL) approach. In some situations, such as scientific applications for a given area, it is advantageous to develop a collection of related software products using an SPL approach. Scientific Software Product Lines (SSPL) differ from Software Product Lines in that an SSPL uses an abstract scientific workflow model. This workflow is defined according to the scientific domain, and the products are instantiated from this abstract workflow model. Given the difficulty of specifying scientific experiments, and the need to compose scientific applications to implement them, appropriate semantic support for the domain analysis phase is necessary. Therefore, this work proposes an approach based on the combination of feature models and ontologies, named PL-Science, to support the specification and conduct of scientific experiments. The PL-Science approach, presented in this dissertation, considers the context of SPL and aims to assist scientists in defining a scientific experiment by specifying a workflow that encompasses the scientific applications of a given experiment. Using SPL concepts, scientists can reuse models that specify the scientific product line and carefully make decisions according to their needs. This work also focuses on the use of ontologies to facilitate the process of applying Software Product Lines to scientific domains. Through the use of an ontology as a domain model, we can provide additional information as well as add more semantics to the context of Scientific Software Product Lines.

Style APA, Harvard, Vancouver, ISO itp.

28

Li, Ming. "Sequence and text classification : features and classifiers". Thesis, University of East Anglia, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.426966.

Pełny tekst źródła

Style APA, Harvard, Vancouver, ISO itp.

29

Sutanto, Kevin. "RNA Sequence Classification Using Secondary Structure Fingerprints, Sequence-Based Features, and Deep Learning". Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/41876.

Pełny tekst źródła

Streszczenie:

RNAs are involved in many facets of biological processes, including but not limited to controlling and inhibiting gene expression, enabling transcription and translation from DNA to proteins, processes involving diseases such as cancer, and virus-host interactions. As such, useful applications may arise from studies and analyses involving RNAs, such as detecting cancer by measuring the abundance of specific RNAs, detecting and identifying infections involving RNA viruses, identifying the origins of and relationships between RNA viruses, and identifying potential targets when designing novel drugs. Extracting sequences from RNA samples is usually no longer a major limitation thanks to sequencing technologies such as RNA-Seq. However, accurately identifying and analyzing the extracted sequences is often still the bottleneck when it comes to developing RNA-based applications. Like proteins, functional RNAs are able to fold into complex structures in order to perform specific functions throughout their lifecycle. This suggests that structural information can be used to identify or classify RNA sequences, in addition to the sequence information of the RNA itself. Furthermore, a strand of RNA may have more than one possible structural conformation it can fold into, and it is also possible for a strand to form different structures in vivo and in vitro. However, past studies that utilized secondary structure information for RNA identification purposes have relied on one predicted secondary structure for each RNA sequence, despite the possible one-to-many relationship between a strand of RNA and the possible secondary structures. Therefore, we hypothesized that using a representation that includes the multiple possible secondary structures of an RNA for classification purposes may improve the classification performance.
We proposed and built a pipeline that produces secondary structure fingerprints given a sequence of RNA and that takes into account the aforementioned multiple possible secondary structures for a single RNA. Using this pipeline, we explored and developed different types of secondary structure fingerprints in our studies. One type of fingerprint serves as a high-level topological representation of the RNA structure, while another type represents matches with common known RNA secondary structure motifs we have curated from databases and the literature. Next, to test our hypothesis, the different fingerprints are used with deep learning and with different datasets, alone and together with various sequence-based features, to investigate how the secondary structure fingerprints affect the classification performance. Finally, by analyzing our findings, we also propose approaches that can be adopted by future studies to further improve our secondary structure fingerprints and classification performance.

Style APA, Harvard, Vancouver, ISO itp.

30

Piroddi, R. "Multiple-feature object-based segmentation of video sequences". Thesis, University of Surrey, 2004. http://epubs.surrey.ac.uk/842727/.

Pełny tekst źródła

Streszczenie:

Emerging multimedia applications and services require efficient and flexible coding (MPEG-4) and description (MPEG-7) of visual information. Object-based representations of visual information obtained by scene segmentation are particularly well-suited to this purpose. In this work, the segmentation of video sequences is addressed using a combination of features, such as motion, texture and colour. First, the Recursive Shortest Spanning Tree (RSST) is considered as a baseline segmentation tool and is adapted to perform single-feature segmentation using different visual cues. A novel motion-based RSST segmentation algorithm that incorporates multiple motion features into a single cost function is presented. Effective texture segmentation is achieved by a novel scheme relying on mathematical morphology operators. This approach is further extended to become applicable to colour texture segmentation. Second, multiple-feature segmentation of video sequences emerges as a major focus of this work. The RSST has been employed in order to perform simultaneous multiple-feature segmentation of video sequences in a hierarchical fashion. The presented work demonstrates that the performance of this approach rapidly degrades as the dimensionality of the feature space increases. To overcome this problem, a novel two-stage architecture for object-based segmentation is presented. The first stage locates perceptually meaningful objects using a hierarchy of single-feature segmentation processes. The second stage refines the boundaries of located objects using a suitable combination of features and a set of appropriate rules. This model is further simplified by minimizing the number of required sequence-dependent parameters and also by minimizing the number of inputs to the rule-based part of the algorithm.
A comparative evaluation with state-of-the-art competing algorithms is favourable, demonstrating that the proposed architecture is capable of achieving accurate, meaningful and consistent segmentations which are intuitively correct and have good correspondence with a human viewer's notion of the decomposition of a natural scene to its constituent objects.

Style APA, Harvard, Vancouver, ISO itp.

31

Wang, Caixia. "Using Linear Features for Aerial Image Sequence Mosaiking". Fogler Library, University of Maine, 2004. http://www.library.umaine.edu/theses/pdf/WangC2004.pdf.

Pełny tekst źródła

Style APA, Harvard, Vancouver, ISO itp.

32

Mulet, Parada Miguel. "Intensity independent feature extraction and tracking in echocardiographic sequences". Thesis, University of Oxford, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.343557.

Pełny tekst źródła

Style APA, Harvard, Vancouver, ISO itp.

33

Ng, Siu-Kin. "Lineage specific genomics features and insights into evolutionary pathways /". View abstract or full-text, 2007. http://library.ust.hk/cgi/db/thesis.pl?BIEN%202007%20NG.

Pełny tekst źródła

Style APA, Harvard, Vancouver, ISO itp.

34

Schaidnagel, Michael. "Automated feature construction for classification of complex, temporal data sequences". Thesis, University of the West of Scotland, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.692834.

Pełny tekst źródła

Streszczenie:

Data collected from internet applications are mainly stored in the form of transactions. All transactions of one user form a sequence, which shows the user's behaviour on the site. Nowadays, it is important to be able to classify this behaviour in real time for various reasons, e.g. to increase the conversion rate of customers while they are in the store or to prevent fraudulent transactions before they are placed. However, this is difficult due to the complex structure of the data sequences (i.e. a mix of categorical and continuous data types, constant data updates) and the large amounts of data that are stored. Therefore, this thesis studies the classification of complex data sequences. It surveys the fields of time series analysis (temporal data mining), sequence data mining and standard classification algorithms. It turns out that these algorithms are either difficult to apply to data sequences or do not deliver a classification: time series methods need a predefined model and are not able to handle complex data types; sequence classification algorithms such as the Apriori algorithm family are not able to utilize the time aspect of the data. The strengths and weaknesses of the candidate algorithms are identified and used to build a new approach to the classification of complex data sequences. The problem is solved by a two-step process. First, feature construction is used to create and discover suitable features in a training phase. Then, the blueprints of the discovered features are used in a formula during the classification phase to perform the real-time classification. The features are constructed by combining and aggregating the original data over the span of the sequence, including the elapsed time, by using a calculated time axis. Additionally, a combination of features and feature selection are used to simplify complex data types. This allows catching behavioural patterns that occur in the course of time.
The proposed approach combines techniques from several research fields. Part of the algorithm originates from the field of feature construction and is used to reveal behaviour over time and express this behaviour in the form of features. A combination of the features is used to highlight relations between them. The blueprints of these features can then be used to achieve classification in real time on an incoming data stream. An automated framework is presented that allows the features to adapt iteratively to a change in underlying patterns in the data stream. This core feature of the presented work is achieved by separating the feature application step from the computationally costly feature construction step and by iteratively restarting the feature construction step on the new incoming data. The algorithm and the corresponding models are described in detail and applied to three case studies (customer churn prediction, bot detection in computer games, credit card fraud detection). The case studies show that the proposed algorithm is able to find distinctive information in data sequences and use it effectively for classification tasks. The promising results indicate that the suggested approach can be applied to a wide range of other application areas that involve data sequences.
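
The idea of aggregating a transaction sequence over its span, including the elapsed time, can be illustrated with a minimal Python sketch. This is not the algorithm from the thesis: the `Transaction` fields, the function name `construct_features`, and the particular aggregates are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transaction:
    # Hypothetical record layout; the thesis does not specify one.
    timestamp: float  # seconds on some common time axis (assumed unit)
    amount: float     # a continuous attribute
    category: str     # a categorical attribute


def construct_features(sequence):
    """Aggregate one user's transaction sequence into a flat feature dict."""
    if not sequence:
        raise ValueError("sequence must contain at least one transaction")
    ordered = sorted(sequence, key=lambda t: t.timestamp)
    # Elapsed time between the first and the last transaction.
    elapsed = ordered[-1].timestamp - ordered[0].timestamp
    # Time gaps between consecutive transactions.
    gaps = [b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:])]
    amounts = [t.amount for t in ordered]
    return {
        "n_transactions": len(ordered),
        "elapsed_time": elapsed,
        "mean_gap": mean(gaps) if gaps else 0.0,
        "total_amount": sum(amounts),
        "mean_amount": mean(amounts),
        # Simplify the categorical attribute to a count of distinct values.
        "n_distinct_categories": len({t.category for t in ordered}),
        # A combined feature relating an aggregate to the elapsed time.
        "amount_per_second": sum(amounts) / elapsed if elapsed > 0 else 0.0,
    }
```

Because the function depends only on the transactions seen so far, it could in principle be re-evaluated each time a new transaction arrives, which mirrors the separation between costly feature construction and cheap feature application described in the abstract.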

35

Lawver, Jordan D. "Robust Feature Tracking in Image Sequences Using View Geometric Constraints". The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1365611706.

36

Yao, Xiaoquan. "Sequence features affecting translation initiation in eukaryotes: A bioinformatic approach". Thesis, University of Ottawa (Canada), 2008. http://hdl.handle.net/10393/27658.

Abstract:

Sequence features play an important role in the regulation of translation initiation. This thesis focuses on the sequence features affecting eukaryotic initiation. The characteristics of the 5' untranslated region in Saccharomyces cerevisiae were explored. It was found that the 40 nucleotides upstream of the start codon form the critical region for translation initiation in yeast. Moreover, this thesis attempted to resolve some controversies related to the start codon context. Two key nucleotides in the start codon context are the third nucleotide upstream of the start codon (-3 site) and the nucleotide immediately following the start codon (+4 site). Two hypotheses regarding +4G (G at the +4 site) in the Kozak consensus, the translation initiation hypothesis and the amino acid constraint hypothesis, were tested. The relationship between the -3 and +4 sites in seven eukaryotic species does not support the translation initiation hypothesis. The amino acid usage at the position after the initiator (P1' position) was compared with that at other positions in the coding sequences of seven eukaryotic species. The result is consistent with the amino acid constraint hypothesis. In addition, this thesis explored the relationship between the +4 nucleotide and translation efficiency in yeast. The result shows that the +4 nucleotide is not important for translation efficiency, which does not support the translation initiation hypothesis. This work improves our current understanding of the eukaryotic translation initiation process.
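
The -3 and +4 sites defined above can be located with simple index arithmetic. The following Python sketch is only an illustration of the position convention (the A of the start codon is +1 and there is no position 0); the function name `start_codon_context` and its arguments are hypothetical and do not come from the thesis.

```python
def start_codon_context(mrna, start_index):
    """Return the nucleotides at the -3 and +4 sites around a start codon.

    `start_index` is the 0-based index of the A in AUG/ATG (position +1).
    Position -3 is three bases upstream of that A; position +4 is the base
    immediately following the codon.
    """
    # Accept RNA or DNA notation and either letter case.
    seq = mrna.upper().replace("U", "T")
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("no ATG start codon at start_index")
    if start_index < 3 or start_index + 3 >= len(seq):
        raise ValueError("sequence too short to cover the -3 and +4 sites")
    return seq[start_index - 3], seq[start_index + 3]
```

For the sequence `GCCACCATGGCG` with the start codon at index 6, the function returns `("A", "G")`, i.e. an A at -3 and a G at +4.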

37

Clark, Angus Alistair. "Region classification for the interpretation of video sequences". Thesis, University of Bristol, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.302167.

38

Perera, Weliwaththage Thilini. "Elucidating the Sequence and Structural Features of Human Bence-Jones Proteins". University of Toledo / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1526078353370352.

39

Perkins, David Neil. "Computer methods for identifying significant features in protein sequences". Thesis, University of Leeds, 1994. http://etheses.whiterose.ac.uk/6772/.

Abstract:

The research described in this thesis can be easily and conveniently separated under two broad headings: the definition of discriminating motif sets for protein families, and software development. In this instance the phrase "motif set" refers to a combination of features in the amino acid sequences of a family of proteins that is diagnostic of family membership and therefore has predictive value in identifying new family members. Under the first heading, a number of sets of motifs are described in detail, while a number of others are included as an appendix in a format compatible with the PRINTS motif database. All these studies involved the multiple alignment of protein sequences extracted from the database and the use of database scanning techniques. From these motif sets it has been possible to identify new members of protein families, and they may also supply valuable information for the exploration of the possible function and structure of the protein families. A number of sequence analysis software packages are also described. They include both novel software and the reworking of old algorithms with additions to make them more efficient, more useful for modern requirements, and to fix existing problems. In the former category, new sequence alignment programs have been developed which integrate structural information (if any is available) with sequence and physicochemical properties. A number of programs are also discussed that allow the display and manipulation of a variety of sequence parameters, such as hydropathy and positional variability, which are very useful tools for motif definition. All these programs are written in C, and the majority make use of the X/Motif programming libraries where appropriate and are available on a variety of different hardware platforms. The ADSP system has also been rewritten to make it more efficient, and it has been ported to the UNIX operating system to make it more accessible to a larger number of users.
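
As an illustration of one of the sequence parameters mentioned above, a hydropathy profile can be computed as a sliding-window average of per-residue scores. The Python sketch below is an assumption-laden example, not the thesis software (which was written in C): it uses the Kyte-Doolittle scale, which the abstract does not name, and the function name `hydropathy_profile` and the default window of 7 are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every full window along a protein."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # A KeyError here signals a residue outside the 20 standard amino acids.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
```

Runs of high positive values in such a profile indicate hydrophobic stretches, which is one reason plots of this kind are useful when looking for conserved regions to define as motifs.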

40

Ragupathi, Sundaraj. "Three dimensional motion parameter estimation from image sequences : a feature based approach". Thesis, Imperial College London, 1990. http://hdl.handle.net/10044/1/46515.

41

Lappas, Pelopidas. "Optimal motion estimation of features and objects in long image sequences". Thesis, University of Southampton, 2004. https://eprints.soton.ac.uk/265064/.

42

Tao, Chuang. "Automated approaches to object measurement and feature extraction from georeferenced mobile mapping image sequences". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/tape17/PQDD_0011/NQ31076.pdf.

43

Chen, Julie Chih-yu. "Computational analysis of transcriptional regulation from local sequence features to three dimensional chromatin domains". Thesis, University of British Columbia, 2016. http://hdl.handle.net/2429/59384.

Abstract:

Regulation of gene expression spans different levels of complexity: from genomic sequence, transcription factor binding and epigenetics, to three-dimensional chromatin interactions. Data from different individuals, such as genetic variations, present an extra dimension to consider. Abnormal activities at any level may lead to disease phenotypes, motivating deeper exploration of gene regulation. New high-throughput sequencing techniques have empowered genome-wide studies of the regulatory mechanisms within cells. This thesis uses computational approaches to examine gene regulation with high-throughput data in order to address biological hypotheses ranging from short local sequence features to megabase-sized topologically associating domains (TADs). The hypotheses addressed in the thesis have two central themes: 1) the elucidation of local and domain regulation of gene expression, and 2) the application of such knowledge to identify functional phenotypic variants. We developed a computational approach to identify functional variants associated with cancer, and demonstrated how annotating regulatory sequences and linking these regions to target genes can strengthen genome interpretation. The concurrent and intertwined nature of local and domain regulation of gene expression emerges as the thesis unfolds. In a study of genes that escape from X-chromosome inactivation, we found the YY1 transcription factor to be a key regulator that is potentially associated with long-distance chromatin looping mechanisms. Similarly, when studying the spread of inactivation to the autosomes in translocated cells, we detected local features associated with inactivation status, and at the domain level, we observed the spreading to be in accordance with TADs. Lastly, when considering TADs as transcriptional units, the identification of cell type-selectively co-expressed and co-localized TADs highlighted an organized and dynamic chromatin architecture across multiple cell types.
In summary, this thesis provides insights into the mechanisms involved in gene expression across multiple scales (from local sequences to chromatin domains) using computational analyses on publicly available datasets. The presented methods and results have potential applications in interpreting genetic variations and furthering our understanding of diseases and phenotypes. The findings may contribute to a coming era of preventative and regenerative medicine.
Science, Faculty of
Graduate

44

Huang, Gaofeng. "A Seed-and-Grow Algorithmic Framework for identifying Features in Genome Sequences". Thesis, University of Oxford, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.525344.

45

Umakanthan, Sabanadesan. "Human action recognition from video sequences". Thesis, Queensland University of Technology, 2016. https://eprints.qut.edu.au/93749/1/Sabanadesan_Umakanthan_Thesis.pdf.

Abstract:

This PhD research has proposed new machine learning techniques to improve human action recognition based on local features. Several novel video representation and classification techniques have been proposed to increase performance with lower computational complexity. The major contributions are the construction of new feature representation techniques, based on advanced machine learning techniques such as multiple instance dictionary learning, Latent Dirichlet Allocation (LDA) and sparse coding. A binary-tree-based classification technique was also proposed to deal with large numbers of action categories. These techniques not only improve the classification accuracy with constrained computational resources but are also robust to challenging environmental conditions. The developed techniques can be easily extended to a wide range of video applications to provide near real-time performance.

46

Allison, Deborah. "Promises in the dark : opening title sequences in American feature films of the sound period". Thesis, University of East Anglia, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.247223.

47

Toda, Nicholas Rafael Tetsuo. "Analysis of the sequence features contributing to centromere organisation and CENP-A positioning and incorporation". Thesis, University of Edinburgh, 2015. http://hdl.handle.net/1842/16174.

Abstract:

Centromere identity is integral for proper kinetochore formation and chromosome segregation. In most species chromosomes have a centromere at a defined locus that is propagated across generations. The histone H3 variant CENP-A acts as an epigenetic mark for centromere identity in most species studied. CENP-A is absent from the inactivated centromere on dicentric chromosomes and present at neocentromeres that form on non-centromeric sequences. Thus, the canonical centromere sequence is neither necessary nor sufficient for centromere function. Nevertheless, centromeres are generally associated with particular sequences. Understanding the organisation of centromeric sequence features will provide insight into centromere function and identity. In this study I use the fission yeast Schizosaccharomyces pombe model system to address the relationship between CENP-ACnp1 and centromeric sequence features. These analyses reveal that CENP-ACnp1 nucleosomes are highly positioned within the central domain by large asymmetric AT-rich gaps. The same sequence features underlying CENP-ACnp1 positioning are conserved in the related species S. octosporus, but are not found at neocentromeres, suggesting that they are important but non-essential for centromere function. CENP-ACnp1 over-expression leads to ectopic CENP-ACnp1 incorporation primarily at sites associated with heterochromatin, including the sites where stable neocentromeres form. Ectopic CENP-ACnp1 also occupies additional sites within the central domain that are not occupied in cells with wild-type CENP-ACnp1 levels. In wild-type cells, sites occupied by CENP-ACnp1 are likely also occupied by H3 nucleosomes or the CENP-T/W/S/X nucleosome-like complex in a mixed population. Several candidate proteins were investigated to determine whether a protein residing in the large gaps between CENP-ACnp1 nucleosomes could be identified.
No proteins could be localised to the AT-rich gaps between CENP-ACnp1 nucleosomes, but the origin recognition complex is a promising candidate. The results presented in this thesis demonstrate that nucleosomes within the fission yeast centromere central domain are highly positioned by sequence features in a conserved manner. This positioning also allows for another complex, possibly the origin recognition complex, to bind to DNA. Nucleosome positioning, DNA replication, and transcription could individually and collectively influence CENP-ACnp1 assembly and centromere function. Further experiments in fission yeast will continue to provide insight into the general properties of centromere function and identity.

48

Hu, Jing. "Prediction of Protein Function and Functional Sites From Protein Sequences". DigitalCommons@USU, 2009. https://digitalcommons.usu.edu/etd/292.

Abstract:

High-throughput genomics projects have resulted in a rapid accumulation of protein sequences. Therefore, computational methods that can predict protein functions and functional sites efficiently and accurately are in high demand. In addition, prediction methods utilizing only sequence information are of particular interest because for most proteins, 3-dimensional structures are not available. However, there are several key challenges in developing methods for predicting protein function and functional sites. These challenges include the following: the construction of representative datasets to train and evaluate the method, the collection of features related to the protein functions, the selection of the most useful features, and the integration of selected features into suitable computational models. In this proposed study, we tackle these challenges by developing procedures for benchmark dataset construction and protein feature extraction, implementing efficient feature selection strategies, and developing effective machine learning algorithms for protein function and functional site predictions. We investigate these challenges in three bioinformatics tasks: the discovery of transmembrane beta-barrel (TMB) proteins in gram-negative bacterial proteomes, the identification of deleterious non-synonymous single nucleotide polymorphisms (nsSNPs), and the identification of helix-turn-helix (HTH) motifs from protein sequence.

49

Ding, Sheng. "A Detachable LSTM with Residual-Autoencoder Features Method for Motion Recognition in Video Sequences". The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu160673417735023.

50

Habib, Ayman Fawzy. "Estimation of motion parameters for stereo-image sequences using data association of linear features". The Ohio State University, 1994. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487859313347387.
