Dissertations on the topic "Inférence de réseau omic"
Browse the top 27 dissertations for research on the topic "Inférence de réseau omic".
Arsenteva, Polina. "Statistical modeling and analysis of radio-induced adverse effects based on in vitro and in vivo data." Electronic Thesis or Diss., Bourgogne Franche-Comté, 2023. http://www.theses.fr/2023UBFCK074.
In this work we address the problem of adverse effects induced by radiotherapy on healthy tissues. The goal is to propose a mathematical framework for comparing the effects of different irradiation modalities, so as ultimately to choose the treatments that produce the fewest adverse effects for potential use in the clinical setting. The adverse effects are studied in the context of two types of data: the in vitro omic response of human endothelial cells, and the adverse effects observed in mice in in vivo experiments. In the in vitro setting, we encounter the problem of extracting key information from complex temporal data that cannot be treated with the methods available in the literature. We model the radio-induced fold change, the object that encodes the difference in the effect of two experimental conditions, in a way that takes into account the uncertainties of measurements as well as the correlations between the observed entities. We construct a distance, with a further generalization to a dissimilarity measure, that allows fold changes to be compared in terms of all the important statistical properties. Finally, we propose a computationally efficient algorithm that performs clustering jointly with temporal alignment of the fold changes. The key features extracted by this algorithm are visualized using two types of network representations to facilitate biological interpretation. In the in vivo setting, the statistical challenge is to establish a predictive link between variables that, due to the specificities of the experimental design, can never be observed on the same animals. Without access to joint distributions, we leverage the additional information on the observed groups to infer the linear regression model. We propose two estimators of the regression parameters, one based on the method of moments and the other on optimal transport, as well as estimators of the confidence intervals based on the stratified bootstrap procedure.
Hulot, Audrey. "Analyses de données omiques : clustering et inférence de réseaux Female ponderal index at birth and idiopathic infertility." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASL034.
The development of high-throughput biological technologies (next-generation sequencing and mass spectrometry) has provided researchers with a large amount of data, known as -omics data, that help to better understand biological processes. However, each data source taken separately explains only a very small part of a given process. Linking the different -omics sources together should help us understand more of these processes. In this manuscript, we focus on two approaches, clustering and network inference, applied to omics data. The first part of the manuscript presents three methodological developments on this topic. The first two methods are applicable when the data are heterogeneous. The first method is an algorithm for aggregating trees in order to create a consensus out of a set of trees. The complexity of the process is sub-quadratic, which allows it to be used on data leading to a large number of leaves in the trees. This algorithm is available in an R package named mergeTrees on CRAN. The second method deals with the integration of data from trees and networks by transforming these objects into distance matrices using cophenetic and shortest-path distances, respectively. This method relies on Multidimensional Scaling and Multiple Factor Analysis and can also be used to build consensus trees or networks. Finally, we use the Gaussian Graphical Models setting and seek to estimate a graph, as well as communities in the graph, from several tables. This method is based on a combination of the Stochastic Block Model, the Latent Block Model and the Graphical Lasso. The second part of the manuscript presents analyses conducted on transcriptomics and metagenomics data to identify targets and gain insight into the predisposition to Ankylosing Spondylitis.
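The abstract above mentions turning trees and networks into distance matrices via cophenetic and shortest-path distances. The following is a minimal sketch of those two transformations only, not the thesis's method; the nested-tuple tree encoding, the function names, and the toy gene labels (`g1`, `g2`, `g3`) are assumptions made for illustration.

```python
from collections import deque
from itertools import product


def cophenetic_distances(tree):
    """Return ({(leaf_a, leaf_b): height}, leaves) for a dendrogram.

    Hypothetical encoding: a tree is a leaf name (str) or a tuple
    (height, left, right). The cophenetic distance of two leaves is the
    height of the merge at which they first share a cluster.
    """
    if isinstance(tree, str):
        return {}, [tree]
    height, left, right = tree
    dist_l, leaves_l = cophenetic_distances(left)
    dist_r, leaves_r = cophenetic_distances(right)
    dist = {**dist_l, **dist_r}
    # Leaves from different subtrees first meet at this node.
    for a, b in product(leaves_l, leaves_r):
        dist[(a, b)] = dist[(b, a)] = height
    return dist, leaves_l + leaves_r


def shortest_path_distances(adjacency):
    """All-pairs hop distances in an unweighted, undirected graph (BFS)."""
    dist = {}
    for source in adjacency:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen[neighbor] = seen[node] + 1
                    queue.append(neighbor)
        for target, d in seen.items():
            if target != source:
                dist[(source, target)] = d
    return dist


# Toy inputs (illustrative only): one dendrogram and one network on the same genes.
toy_tree = (3.0, (1.0, "g1", "g2"), "g3")
toy_network = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2"]}
tree_dist, _ = cophenetic_distances(toy_tree)
net_dist = shortest_path_distances(toy_network)
```

Once both objects are expressed as distances over the same set of leaves or nodes, they can in principle be fed to a common multivariate method such as multidimensional scaling, which is the kind of integration the abstract describes.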
Kazhuthuveettil, Sreedharan Jithin. "Échantillonnage et inférence dans réseaux complexes." Thesis, Université Côte d'Azur (ComUE), 2016. http://www.theses.fr/2016AZUR4121/document.
The recent emergence of large networks, mainly due to the rise of online social networks, has highlighted the difficulty of gathering a complete picture of a network and has prompted the development of new distributed techniques. In this thesis, we design and analyze algorithms based on random walks and diffusion for sampling, estimation and inference of network functions, and for approximating the spectrum of graph matrices. The thesis starts with the classical problem of finding the dominant eigenvalues and eigenvectors of symmetric graph matrices such as the Laplacian of undirected graphs. Using the fact that the eigenspectrum is associated with a Schrödinger-type differential equation, we develop scalable techniques based on diffusion over the graph and on gossiping algorithms. They are also adaptable to a simple algorithm based on quantum computing. Next, we consider sampling and estimation of network functions (sum and average) using random walks on the graph. In order to avoid the burn-in time of random walks, we use the idea of regeneration at revisits to a fixed node to develop an estimator for the aggregate function that is non-asymptotically unbiased, and we derive an approximation to its Bayesian posterior. An estimator based on reinforcement learning that makes use of regeneration is also developed. The final part of the thesis deals with the use of extreme value theory to make inferences from the stationary samples of the random walks. Extremal events such as the first hitting time of a large-degree node, order statistics and mean cluster size are well captured by the parameter "extremal index". We theoretically study and estimate the extremal index of different random walk sampling techniques.
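The regeneration idea in the abstract above can be illustrated with a short sketch. It relies on a standard identity for a simple random walk on a connected undirected graph: over one tour that starts at an anchor node and stops on its first return, the expected sum of f(v)/deg(v) equals (1/deg(anchor)) times the sum of f over all nodes. The function name, the toy graph and the parameters below are assumptions for illustration; this is not presented as the thesis's exact estimator.

```python
import random


def tour_estimate(adjacency, f, anchor, n_tours, rng):
    """Estimate sum_v f(v) from random-walk tours that start and end at `anchor`.

    Each tour contributes deg(anchor) * sum_t f(X_t) / deg(X_t), which is an
    unbiased estimate of the total because tours are independent renewals.
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    total = 0.0
    for _ in range(n_tours):
        node = anchor
        tour_sum = 0.0
        while True:
            tour_sum += f(node) / degree[node]
            node = rng.choice(adjacency[node])
            if node == anchor:  # the walk has regenerated; the tour is over
                break
        total += degree[anchor] * tour_sum
    return total / n_tours


# Toy undirected graph with 6 nodes and 8 edges (illustrative only).
graph = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1, 4],
    3: [1, 4, 5],
    4: [2, 3, 5],
    5: [3, 4],
}
rng = random.Random(0)
# With f(v) = 1 the target is the number of nodes, here 6.
estimate = tour_estimate(graph, lambda v: 1.0, anchor=0, n_tours=20000, rng=rng)
```

Because every tour begins afresh at the anchor, no burn-in period is needed, which is the motivation the abstract gives for using regeneration.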
Castel, David. "Inférence du réseau génétique d'Id2 dans les kératinocytes humains par intégration de données génomiques à large échelle." Evry-Val d'Essonne, 2007. http://www.biblio.univ-evry.fr/theses/2007/interne/2007/2007EVRY0026.pdf.
We report in the present study the characterization of the genetic regulatory network of Id2, a dominant negative regulator of bHLH, to further understand its role in the control of the proliferation/differentiation balance in human keratinocytes. To identify Id2 gene targets, we first used gene expression profiling in cells exhibiting Id2 overexpression or knock-down. At the same time, we screened an siRNA library using an siRNA microarray approach to characterize Id2 transcriptional regulators. These results, with additional phenotypic observations, show that Id2 exerts a key role in the control of keratinocyte commitment to differentiation or proliferation. Furthermore, we uncover new functions of Id2 in anaphase promotion and DNA recombination control. Overall, our results allowed a first description of the topology of the Id2 genetic regulatory network.
Vincent, Jonathan. "Inférence des réseaux de régulation de la synthèse des protéines de réserve du grain de blé tendre (Triticum aestivum L.) en réponse à l'approvisionnement en azote et en soufre." Thesis, Clermont-Ferrand 2, 2014. http://www.theses.fr/2014CLF22485/document.
Grain storage protein content and composition are the main determinants of bread wheat (Triticum aestivum L.) end-use value. Scaling laws governing grain protein composition according to grain nitrogen and sulfur content could be the outcome of a finely tuned regulation network. Although it has been demonstrated that the main regulation of grain storage protein accumulation occurs at the transcriptomic level in cereals, knowledge of the underlying molecular mechanisms remains elusive. Moreover, the effects of nitrogen and sulfur on these mechanisms are unknown. The issue of skyrocketing data generation in research projects is addressed by developing high-throughput bioinformatics approaches. Extracting knowledge from such massive amounts of data is therefore an important challenge. The work presented herein aims at elucidating regulatory networks involved in grain storage protein synthesis and their response to nitrogen and sulfur supply using a rule discovery approach. This approach was extended and implemented in the form of a web-oriented platform dedicated to the inference and analysis of regulatory networks from qualitative and quantitative -omics data. This platform allowed us to define different semantics in a comprehensive framework, each semantic having its own biological meaning, thus providing us with globally informative networks. Spatiotemporal specificity of transcription factor expression was observed, and particular attention was paid to the relationship of transcription factors with grain storage proteins in the inferred networks. The work initiated here opens up a field of innovative investigation to identify new targets for plant breeding and for an improved end-use value and nutritional quality of wheat in the context of input limitation. Further analyses should enhance the understanding of the control of grain protein composition and make it possible to provide wheat adapted to specific uses or deficient in the protein fractions responsible for gluten allergenicity and intolerance.
Gallopin, Mélina. "Classification et inférence de réseaux pour les données RNA-seq." Thesis, Université Paris-Saclay (ComUE), 2015. http://www.theses.fr/2015SACLS174/document.
This thesis gathers methodological contributions to the statistical analysis of next-generation high-throughput transcriptome sequencing data (RNA-seq). RNA-seq data are discrete and the number of samples sequenced is usually small due to the cost of the technology. These two points are the main statistical challenges for modelling RNA-seq data. The first part of the thesis is dedicated to the co-expression analysis of RNA-seq data using model-based clustering. A natural model for discrete RNA-seq data is a Poisson mixture model. However, a Gaussian mixture model in conjunction with a simple transformation applied to the data is a reasonable alternative. We propose to compare the two alternatives using a data-driven criterion to select the model that best fits each dataset. In addition, we present a model selection criterion that takes into account external gene annotations. This model selection criterion is not specific to RNA-seq data. It is useful in any co-expression analysis using model-based clustering designed to enrich functional annotation databases. The second part of the thesis is dedicated to network inference using graphical models. The aim of network inference is to detect relationships among genes based on their expression. We propose a network inference model based on a Poisson distribution that takes into account the discrete nature and high inter-sample variability of RNA-seq data. However, network inference methods require a large number of samples. For Gaussian graphical models, we propose a non-asymptotic approach to detect relevant subsets of genes based on a block-diagonal decomposition of the covariance matrix. This method is not specific to RNA-seq data and reduces the dimension of any network inference problem based on the Gaussian graphical model.
Brinza, Lilia. "Exploration et inférence du réseau de régulation de la transcription de la bactérie symbiotique intracellulaire à génome réduit Buchnera aphidicola." Phd thesis, INSA de Lyon, 2010. http://tel.archives-ouvertes.fr/tel-00750363.
Haury, Anne-Claire. "Sélection de variables à partir de données d'expression : signatures moléculaires pour le pronostic du cancer du sein et inférence de réseaux de régulation génique." Phd thesis, Ecole Nationale Supérieure des Mines de Paris, 2012. http://pastel.archives-ouvertes.fr/pastel-00818345.
Chevalier, Stéphanie. "Inférence logique de réseaux booléens à partir de connaissances et d'observations de processus de différenciation cellulaire." Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG061.
Dynamic models are essential tools for exploring regulatory mechanisms in biology. This thesis was guided by the need expressed in oncology and developmental biology to automatically infer Boolean networks reproducing cellular differentiation processes. By considering the observations and knowledge that modelers have at their disposal, this thesis presents an approach that makes it possible to model the richness of this cellular behavior by inferring all the compatible Boolean networks at the scale of the regulatory networks commonly considered in biology. To develop this method, three main contributions are presented. The first contribution is a formal framework for the properties of data collected to study cellular differentiation. This framework allows reasoning about the dynamic properties that Boolean networks must have to be consistent with this cellular behavior. The second contribution concerns the encoding of the model inference problem as a Boolean satisfiability problem whose solutions are the Boolean networks compatible with the biological data. For this, constraints on the dynamics of Boolean networks corresponding to the previously formalized properties have been implemented in logic programming. The last contribution was to apply the model inference method, named BoNesis, which was developed from these constraints, to real biological problems. These applications showed the benefit of inferring a set of models for the process analysis and illustrated the modeling methodology, from the preparation of biological data to the analysis of the inferred models.
Maesano, Ariele. "Bayesian dynamic scheduling for service composition testing." Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066100/document.
Connectivity between systems is becoming increasingly common. It removes human mediation and allows complex distributed systems to complete long and complex tasks autonomously. SOA is a model-driven, contract-based approach that allows legacy systems to collaborate by exchanging messages. Collaboration is the key word here, in the sense that multiple organisations can, with this approach, automate service exchanges between them without putting their confidentiality at risk. This leads to the first difficulty: although messages are exchanged between the different partners, knowledge of the internal processes that produce the exchanged information is restricted to some partners, and therefore to some of the testers. This puts us in a grey-box testing situation in which the systems are black boxes and only the message exchanges are visible. That is why we propose a probabilistic approach using Bayesian inference to test the architectures. The second challenge is the size of the SOA. Since the systems are connected by loosely coupling them two by two according to SOA specifications, an SOA can contain a very large number of participants. In fact, most existing SOAs are very large. The size of the SOA is reflected in the complexity of the Bayesian inference. This second challenge compels us to search for better solutions for the Bayesian inference. In order to cope with the size and density of the BN, even for small service architectures, techniques of model-driven inference by compilation, which allow quick generation of arithmetic circuits directly from the services architecture model and the test suite, are being developed.
Petiet, Florence. "Réseau bayésien dynamique hybride : application à la modélisation de la fiabilité de systèmes à espaces d'états discrets." Thesis, Paris Est, 2019. http://www.theses.fr/2019PESC2014/document.
Reliability analysis is an integral part of system design and operation, especially for systems running critical applications. Recent works have shown the value of using Bayesian Networks in the field of reliability for modeling the degradation of a system. Graphical Duration Models are a specific case of Bayesian Networks that make it possible to overcome the Markovian property of dynamic Bayesian Networks. They are suited to systems whose sojourn time in each state is not necessarily exponentially distributed, which is the case for most industrial applications. Previous works, however, have shown limitations of these models in terms of storage capacity and computing time, due to the discrete nature of the sojourn time variable. A solution might be to allow the sojourn time variable to be continuous. According to expert opinion, sojourn time variables follow a Weibull distribution in many systems. The goal of this thesis is to integrate sojourn time variables following a Weibull distribution into a Graphical Duration Model by proposing a new approach. After a presentation of Bayesian networks, and more particularly Graphical Duration Models and their limitations, this report focuses on presenting the new model for the degradation process. This new model is called the Weibull Hybrid Graphical Duration Model. An original algorithm allowing inference in such a network has been developed. Various databases built for this purpose were used to learn, on the one hand, a Graphical Duration Model and, on the other hand, a Weibull Hybrid Graphical Duration Model, in order to compare them in terms of learning quality, inference quality, computing time and storage space.
Smail, Linda. "Algorithmique pour les Réseaux Bayésiens et leurs extensions." Phd thesis, Université de Marne la Vallée, 2004. http://tel.archives-ouvertes.fr/tel-00007170.
Chapter 1 presents the theory of Bayesian networks. We introduce a new notion, that of a level-two Bayesian network, which is useful for introducing our computation algorithm on Bayesian networks; we also give some fundamental results and place within our formalism the textbook Bayesian network example known as "Visit to Asia" ("Visite en Asie").
In the second chapter, we present a graphical property called "d-separation", by which one can determine, for any pair of random variables or groups of variables and any conditioning set, whether or not conditional independence necessarily holds. We also present in this chapter results on computing probabilities or conditional probabilities in Bayesian networks using the properties of d-separation. These results, which concern what are, to our knowledge, original expressions of the factorization of the joint law and of the conditional law of a family of random variables of the Bayesian network (in connection with the notion of level-two Bayesian network), should prove useful for large Bayesian networks.
The third chapter gives a detailed presentation and justification of one of the known algorithms for computation in Bayesian networks: the LS algorithm (Lauritzen and Spiegelhalter), based on the junction tree method. For our part, after presenting the notion of a proper covering sequence possessing the running intersection property, we propose an algorithm in two versions (one of which is original) for constructing a sequence of subsets of a Bayesian network possessing this property. This presentation is accompanied by examples.
In chapter 4, we give a detailed presentation of the successive restrictions algorithm that we propose for computing laws (in its first version) and conditional laws (in its second version). This follows the introduction of a new notion, that of close descendance ("descendance proche"). We also present an application of the successive restrictions algorithm to the "Visit to Asia" example introduced in chapter 1, and we compare the number of elementary operations performed with the number required when applying the LS algorithm to the same example. The computational gain in favor of the successive restrictions algorithm that appears in this example will, as always, be all the more marked as the size of the networks and the number of values taken by the variables grow. This is what justifies the integration of our algorithm into "ProBT", a probabilistic inference software package developed and distributed by the Laplace team, located in the Gravir laboratory at INRIA Rhône-Alpes.
In the appendices, we recall the properties of directed acyclic graphs, the basic notions of conditional independence, and the equivalence of several definitions of Bayesian networks.
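The d-separation property discussed in this abstract can be checked mechanically. The sketch below uses the classical moral-graph criterion (restrict to the ancestral set, moralise, delete the conditioning set, test connectivity) on the standard "Visit to Asia" structure. It is offered as an illustration under assumptions: the function names and the short node labels (`asia`, `tub`, `smoke`, `lung`, `bronc`, `either`, `xray`, `dysp`) are chosen here, and the code is unrelated to the thesis's successive restrictions algorithm.

```python
def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    found, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents.get(node, ()))
    return found


def d_separated(parents, xs, ys, zs):
    """True if xs and ys are d-separated given zs (moral-graph criterion)."""
    keep = ancestors(parents, set(xs) | set(ys) | set(zs))
    # Moralise the ancestral subgraph: link each child to its parents
    # and "marry" every pair of parents sharing that child.
    neighbors = {v: set() for v in keep}
    for child in keep:
        pars = list(parents.get(child, ()))
        for p in pars:
            neighbors[child].add(p)
            neighbors[p].add(child)
        for i, p in enumerate(pars):
            for q in pars[i + 1:]:
                neighbors[p].add(q)
                neighbors[q].add(p)
    # Delete the conditioning set, then look for any undirected path.
    blocked, targets = set(zs), set(ys)
    stack, seen = [x for x in xs if x not in blocked], set()
    while stack:
        node = stack.pop()
        if node in targets:
            return False
        if node in seen:
            continue
        seen.add(node)
        stack.extend(n for n in neighbors[node] if n not in blocked)
    return True


# "Visit to Asia" structure, given as child -> list of parents.
asia_parents = {
    "tub": ["asia"],
    "lung": ["smoke"],
    "bronc": ["smoke"],
    "either": ["tub", "lung"],
    "xray": ["either"],
    "dysp": ["either", "bronc"],
}

marginal = d_separated(asia_parents, ["tub"], ["smoke"], [])             # True
given_either = d_separated(asia_parents, ["tub"], ["smoke"], ["either"])  # False
given_tub = d_separated(asia_parents, ["asia"], ["xray"], ["tub"])        # True
```

The second query shows the collider effect: `tub` and `smoke` are marginally independent, but conditioning on their common descendant `either` opens the path between them.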
Tembo, Mouafo Serge Romaric. "Applications de l'intelligence artificielle à la détection et l'isolation de pannes multiples dans un réseau de télécommunications." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0004/document.
Telecommunication networks must be reliable and robust to ensure high availability of services. Operators are currently seeking to automate complex network management operations, such as fault diagnosis, as much as possible. In this thesis we focus on self-diagnosis of failures in the optical access networks of the operator Orange. The diagnostic tool used up to now, called DELC, is an expert system based on decision rules. This system is efficient but difficult to maintain, due in particular to the very large volume of information to analyze. It is also impossible to have a rule for each possible fault configuration, so some faults are currently not diagnosed. We propose in this thesis a new approach, in which the diagnosis of the root causes of malfunctions and alarms is based on a Bayesian network probabilistic model of dependency relationships between the different alarms, counters, intermediate faults and root causes at the level of the various network components. This probabilistic model has been designed in a modular way, so that it can evolve if the physical architecture of the network is modified. Self-diagnosis of the root causes of malfunctions and alarms is performed by inferring, in the Bayesian network model, the state of the unobserved nodes given the observations (counters, alarms, etc.) collected on the operator's network. The structure of the Bayesian network, as well as the order of magnitude of the probabilistic parameters of this model, were determined by integrating into the model the knowledge of the diagnostic experts on this segment of the network. The analysis of thousands of fault diagnosis cases made it possible to fine-tune the probabilistic parameters of the model using an Expectation Maximization algorithm. The performance of the developed probabilistic tool, named PANDA, was evaluated over two months of fault diagnosis in Orange's GPON-FTTH network in July-August 2015. In most cases, the new system, PANDA, and the system in production, DELC, make an identical diagnosis. However, a number of cases are not diagnosed by DELC but are correctly diagnosed by PANDA. The cases for which the self-diagnosis results of the two systems differ were evaluated manually, which made it possible to demonstrate in each of these cases the relevance of the decisions taken by PANDA.
Leurent, Fabien. "Modélisation du trafic, des déplacements sur un réseau et de l'accessibilité aux activités grâce au transport." Habilitation à diriger des recherches, Université Paris Dauphine - Paris IX, 2006. http://tel.archives-ouvertes.fr/tel-00348286.
Such modeling involves four aspects: a semantic content, physical or economic in nature; a mathematical formulation; a technical solver; and an empirical aspect (metrology, statistics, econometrics).
The disciplines involved are varied: network theory, optimization, algorithmics, probability and statistics, as well as economics, socio-economics and traffic physics. My theoretical contributions concern network theory, transport economics and traffic physics.
My work falls into four themes:
A. Traffic measurement and modeling. At the local level of a road, I analyzed the relationship between flow and speed by reconciling the disaggregate analysis, which is probabilistic at the level of the individual vehicle, with the macroscopic analysis in terms of flow and the statistical distribution of times.
B. Modeling of networks and routes. The equilibrium between transport supply and travel demand combines a spatial and topological dimension, a temporal dimension, and a behavioral and economic dimension. The modeling issues concern: the representation of supply and demand; the formulation and the properties of existence, uniqueness and stability; and the algorithms. I focused on the diversity of behaviors, on the fine-grained modeling of supply, and on the temporal dimension.
C. Socio-economic analysis of travel. I studied the use of various means of transport and the prospecting of their potential customer base; the choice of travel time; and the characteristics, both economic and dynamic, of congestion.
D. The spatial distribution of trips and activities. I studied, on the one hand, the observation of flows by origin-destination (O-D) pair and the statistical inference of O-D matrices and, on the other hand, the microeconomic justification of trips in terms of the location and utility of activities.
Dumora, Christophe. "Estimation de paramètres clés liés à la gestion d'un réseau de distribution d'eau potable : Méthode d'inférence sur les noeuds d'un graphe." Thesis, Bordeaux, 2020. http://www.theses.fr/2020BORD0325.
The growth of data generated by sensors and operational tools for water distribution network (WDN) management makes these systems increasingly complex and, in general, makes events more difficult to predict. Combining historical data on the quality of distributed water with knowledge of network assets, contextual data and temporal parameters leads to the study of a system that is complex because of its volume and because of the interactions between these various types of data, which may vary in time and space. This wide variety of data is brought together using mathematical graphs, which make it possible to represent a WDN as a whole, along with all the events that may arise within it or influence its proper functioning. The graph theory associated with these mathematical graphs allows a structural and spectral analysis of WDNs to answer specific needs and enhance existing processes. These graphs are then used to address the problem of inference on the nodes of a large graph from the observation of data on a small number of nodes. An approach based on an optimisation algorithm is used to construct a flow variable on every node of a graph (and therefore at any point of a physical network) using a flow algorithm and data measured in real time by flowmeters. Then, a kernel prediction approach based on a Ridge estimator, which raises problems of spectral analysis of a large sparse matrix, allows a signal measured on specific nodes of a graph to be inferred at any point of a WDN.
Prost, Vincent. "Sparse unsupervised learning for metagenomic data." Electronic Thesis or Diss., université Paris-Saclay, 2020. http://www.theses.fr/2020UPASL013.
The development of massively parallel sequencing technologies makes it possible to sequence DNA at high throughput and low cost, fueling the rise of metagenomics, which is the study of complex microbial communities sequenced in their natural environment. Metagenomic problems are usually computationally difficult and are further complicated by the massive amount of data involved. In this thesis we consider two different metagenomics problems: 1. raw reads binning and 2. microbial network inference from taxonomic abundance profiles. We address them using unsupervised machine learning methods leveraging the parsimony principle, typically involving l1-penalized log-likelihood maximization. The assembly of genomes from raw metagenomic datasets is a challenging task akin to assembling a mixture of large puzzles composed of billions or trillions of pieces (DNA sequences). In the first part of this thesis, we consider the related task of clustering sequences into biologically meaningful partitions (binning). Most of the existing computational tools perform binning after read assembly as a pre-processing step, which is error-prone (yielding artifacts like chimeric contigs) and discards vast amounts of information in the form of unassembled reads (up to 50% for highly diverse metagenomes). This motivated us to address the problem of raw read binning (without prior assembly). We exploit the co-abundance of species across samples as a discriminative signal. Abundance is usually measured via the number of occurrences of long k-mers (subsequences of size k). The use of Locality-Sensitive Hashing (LSH) allows us to contain, at the cost of some approximation, the combinatorial explosion of long k-mer indexing. The first contribution of this thesis is to propose a sparse Non-Negative Matrix Factorization (NMF) of the samples × k-mers count matrix in order to extract abundance variation signals. We first show that using sparse NMF is well-grounded since the data are a sparse linear mixture of non-negative components. Sparse NMF exploiting online dictionary learning algorithms caught our attention, notably for its good behavior on highly asymmetric data matrices. Because the validation of metagenomic binning is difficult on real datasets in the absence of ground truth, we created and used several benchmarks to evaluate the different methods. We show that sparse NMF improves on state-of-the-art binning methods on those datasets. Experiments conducted on a real metagenomic cohort of 1135 human gut microbiota showed the relevance of the approach. In the second part of the thesis, we consider metagenomic data after taxonomic profiling: multivariate data representing abundances of taxa across samples. It is known that microbes live in communities structured by ecological interactions between the members of the community. We focus on the problem of inferring microbial interaction networks from taxonomic profiles. This problem is frequently cast into the paradigm of Gaussian graphical models (GGMs), for which efficient structure inference algorithms are available, such as the graphical lasso. Unfortunately, GGMs and their variants cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros corresponding to true absences of biological signals fail to be properly handled by most statistical methods. We present in this part a zero-inflated log-normal graphical model specifically aimed at handling such "biological" zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with the most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.
Donat, Roland. "Modélisation de la fiabilité et de la maintenance par modèles graphiques probabilistes : application à la prévention des ruptures de rail." Phd thesis, INSA de Rouen, 2009. http://tel.archives-ouvertes.fr/tel-00474389.
Kanso, Assem. "Evaluation des modèles de calcul des flux polluants des rejets urbains par temps de pluie : Apport de l'approche bayésienne." Phd thesis, Ecole des Ponts ParisTech, 2004. http://pastel.archives-ouvertes.fr/pastel-00001264.
Lhoussaine, Cédric. "Réceptivité, mobilité et π-Calcul". Aix-Marseille 1, 2002. http://www.theses.fr/2002AIX11046.
Sella, Nadir. "Reconstruction de réseaux à partir de données génomiques et cliniques." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS351.
This thesis develops a novel methodological approach to reconstruct networks from biological and clinical data. It overcomes some technical and computational problems of existing methods for this task. Our algorithm (MIIC) allows the study of discrete, continuous and mixed datasets with any type of probability and density distributions, including the possible presence of latent variables, which are very important in real contexts where it is not always possible to collect all relevant variables. MIIC is available through a web interface at the address: https://miic.curie.fr, and as an R package available on CRAN. The second part of the thesis is devoted to the analysis of real-life applications: from gene regulatory network reconstruction and protein contact map reconstruction, to the study of clinical records of patients affected by cognitive disorders or breast cancer. MIIC can help physicians visualize and analyse direct, indirect and possibly causal effects from patient medical records, discovering novel unexpected direct interdependencies between clinically relevant information or explaining a missing connection through other links found in the reconstruction.
Duong, Vu Nguyen. "La résolution des réseaux de contraintes algébriques et qualitatives : une approche d'aide à la conception en ingéniérie." Phd thesis, Ecole Nationale des Ponts et Chaussées, 1990. http://tel.archives-ouvertes.fr/tel-00520680.
Pawlowski, Filip igor. "High-performance dense tensor and sparse matrix kernels for machine learning." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSEN081.
In this thesis, we develop high-performance algorithms for certain computations involving dense tensors and sparse matrices. We address kernel operations that are useful for machine learning tasks, such as inference with deep neural networks (DNNs). We develop data structures and techniques to reduce memory use, to improve data locality and hence to improve cache reuse of the kernel operations. We design both sequential and shared-memory parallel algorithms. In the first part of the thesis we focus on dense tensor kernels. Tensor kernels include the tensor-vector multiplication (TVM), tensor-matrix multiplication (TMM), and tensor-tensor multiplication (TTM). Among these, TVM is the most bandwidth-bound and constitutes a building block for many algorithms. We focus on this operation and develop a data structure and sequential and parallel algorithms for it. We propose a novel data structure which stores the tensor as blocks, which are ordered using the space-filling curve known as the Morton curve (or Z-curve). The key idea consists of dividing the tensor into blocks small enough to fit in cache, and storing them according to the Morton order, while keeping a simple, multi-dimensional order on the individual elements within them. Thus, high-performance BLAS routines can be used as microkernels for each block. We evaluate our techniques on a set of experiments. The results not only demonstrate superior performance of the proposed approach over the state-of-the-art variants by up to 18%, but also show that the proposed approach induces 71% less sample standard deviation for the TVM across the d possible modes. We also show that our data structure naturally extends to other tensor kernels by demonstrating that it yields up to 38% higher performance for the higher-order power method. Finally, we investigate shared-memory parallel TVM algorithms which use the proposed data structure. 
Several alternative parallel algorithms were characterized theoretically and implemented using OpenMP to compare them experimentally. Our results on systems with up to 8 sockets show near-peak performance for the proposed algorithm for 2, 3, 4, and 5-dimensional tensors. In the second part of the thesis, we explore the sparse computations in neural networks, focusing on the high-performance sparse deep inference problem. Sparse DNN inference is the task of using sparse DNNs to classify a batch of data elements forming, in our case, a sparse feature matrix. The performance of sparse inference hinges on efficient parallelization of the sparse matrix-sparse matrix multiplication (SpGEMM) repeated for each layer in the inference function. We first characterize efficient sequential SpGEMM algorithms for our use case. We then introduce model-parallel inference, which uses a two-dimensional partitioning of the weight matrices obtained using hypergraph partitioning software. The model-parallel variant uses barriers to synchronize at layers. Finally, we introduce tiling model-parallel and tiling hybrid algorithms, which increase cache reuse between the layers, and use a weak synchronization module to hide load imbalance and synchronization costs. We evaluate our techniques on the large network data from the IEEE HPEC 2019 Graph Challenge on shared-memory systems and report up to 2x speed-up versus the baseline.
Bouzeghoub, Mokrane. "Secsi : un système expert en conception de systèmes d'informations, modélisation conceptuelle de schémas de bases de données." Paris 6, 1986. http://www.theses.fr/1986PA066046.
Gallet, Emmanuelle. "Techniques de model-checking pour l’inférence de paramètres et l’analyse de réseaux biologiques." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLC035/document.
In this thesis, we present the use of model-checking techniques for the inference of parameters of Gene Regulatory Networks (GRNs) and the formal analysis of a signalling pathway. In the first and main part, we provide an approach to infer biological parameters governing the dynamics of discrete models of GRNs. GRNs are encoded in the form of a meta-model, called Parametric GRN, such that a parameter instance defines a discrete model of the original GRN. Provided that targeted biological properties are expressed in the form of LTL formulas, LTL model-checking techniques are combined with symbolic execution and constraint solving techniques to select discrete models satisfying these properties. The challenge is to prevent combinatorial explosion in terms of size and number of discrete models. Our method is implemented in Java, in a tool called SPuTNIk. The second part describes work performed in collaboration with child neurologists, who aim to understand the occurrence of a toxic or protective phenotype of microglia (a type of macrophage in the brain) in premature infants. We use another type of model checking, statistical model checking, to study a particular type of biological network: the Wnt/β-catenin pathway, which transmits an external signal into the cells via a cascade of biochemical reactions. Here we present the benefit of the stochastic model checker COSMOS, using the Hybrid Automata Stochastic Logic (HASL), which is a very expressive formalism allowing a sophisticated formal analysis of the dynamics of the Wnt/β-catenin pathway, modelled as a discrete event stochastic process.
Raybaud, Sylvain. "De l'utilisation de mesures de confiance en traduction automatique : évaluation, post-édition et application à la traduction de la parole." Thesis, Université de Lorraine, 2012. http://www.theses.fr/2012LORR0260/document.
In this thesis I shall deal with the issues of confidence estimation for machine translation and of statistical machine translation of large-vocabulary spontaneous speech. I shall first formalize the problem of confidence estimation. I present experiments under the paradigm of multivariate classification and regression. I review the performance yielded by different techniques, present the results obtained during the WMT2012 international evaluation campaign and give the details of an application to post-editing of automatically translated documents. I then deal with the issue of speech translation. After going into the details of what makes it a very specific and particularly challenging problem, I present original methods to partially solve it, by using phonetic confusion networks, confidence estimation techniques and speech segmentation. I show that the prototype I developed performs comparably to state-of-the-art systems of more standard design.
Sahin, Serdar. "Advanced receivers for distributed cooperation in mobile ad hoc networks." Thesis, Toulouse, INPT, 2019. http://www.theses.fr/2019INPT0089.
Mobile ad hoc networks (MANETs) are rapidly deployable wireless communications systems, operating with minimal coordination in order to avoid spectral efficiency losses caused by overhead. Cooperative transmission schemes are attractive for MANETs, but the distributed nature of such protocols comes with an increased level of interference, whose impact is further amplified by the need to push the limits of energy and spectral efficiency. Hence, the impact of interference has to be mitigated through the use of PHY-layer signal processing algorithms with reasonable computational complexity. Recent advances in iterative digital receiver design techniques exploit approximate Bayesian inference and derivative message passing techniques to improve the capabilities of well-established turbo detectors. In particular, expectation propagation (EP) is a flexible technique which offers attractive complexity-performance trade-offs in situations where conventional belief propagation is limited by computational complexity. Moreover, thanks to emerging techniques in deep learning, such iterative structures are cast into deep detection networks, where learning the algorithmic hyper-parameters further improves receiver performance. In this thesis, EP-based finite-impulse response decision feedback equalizers are designed, and they achieve significant improvements, especially in high spectral efficiency applications, over more conventional turbo-equalization techniques, while having the advantage of being asymptotically predictable. A framework for designing frequency-domain EP-based receivers is proposed, in order to obtain detection architectures with low computational complexity. This framework is theoretically and numerically analysed with a focus on channel equalization, and then it is also extended to handle detection for time-varying channels and multiple-antenna systems. 
The design of multiple-user detectors and the impact of channel estimation are also explored to understand the capabilities and limits of this framework. Finally, a finite-length performance prediction method is presented for carrying out link abstraction for the EP-based frequency-domain equalizer. The impact of accurate physical layer modelling is evaluated in the context of cooperative broadcasting in tactical MANETs, thanks to a flexible MAC-level simulator.
Arya, Vijay. "Inférence de congestion et Ingénierie de Trafic dans les Réseaux." Phd thesis, 2005. http://tel.archives-ouvertes.fr/tel-00403607.