Theses on the topic "Structures de données probabilistes"
Consult the top 50 theses for your research on the topic "Structures de données probabilistes". Where available in the metadata, the full text of each publication can be downloaded as a PDF and its abstract read online. Explore theses across a wide variety of disciplines and organize your bibliography correctly.
Perrin, Frédéric. "Prise en compte des données expérimentales dans les modèles probabilistes pour la prévision de la durée de vie des structures". Clermont-Ferrand 2, 2008. http://www.theses.fr/2008CLF21823.
El Abri, Marwa. "Probabilistic relational models learning from graph databases". Thesis, Nantes, 2018. http://www.theses.fr/2018NANT4019/document.
Historically, Probabilistic Graphical Models (PGMs) are a solution for learning from uncertain and flat data, also called propositional or attribute-value representations. In the early 2000s, great interest turned to the processing of relational data, which involve large numbers of objects participating in different relations. Probabilistic Relational Models (PRMs) extend PGMs to this relational context. With the rise of the internet, numerous technological innovations and web applications have driven a dramatic increase in varied and complex data; Big Data has emerged as a consequence. Several types of data stores have been created to manage this new data, including graph databases, which have recently attracted increasing interest for modeling objects and their interactions. However, all existing PRM structure-learning methods use well-structured data stored in relational databases, whereas graph databases are unstructured, schema-free data stores: edges between nodes can have various signatures, so relationships that do not correspond to an ER model can appear in a database instance. These relationships are treated as exceptions. In this thesis, we are interested in this type of data store, and we study two kinds of PRMs, namely Directed Acyclic Probabilistic Entity Relationship models (DAPERs) and Markov Logic Networks (MLNs). We propose two main contributions: first, an approach to learn DAPERs from partially structured graph databases; second, an approach that uses first-order logic, via the MLN framework, to learn DAPERs while taking into account the exceptions dropped during DAPER learning. We conduct experimental studies comparing our proposed methods with existing approaches.
Fekete, Eric. "Etude probabiliste d'arbres issus de l'algorithmique". Versailles-St Quentin en Yvelines, 2007. http://www.theses.fr/2007VERS0016.
The aim of this thesis is to study the behavior of trees that arise in the analysis of algorithms. We use probabilistic techniques to study various random objects connected with trees. We formally define the trees we deal with and introduce our main results in chapter one; each of the three other parts of the thesis treats a specific random phenomenon. We first establish a result on the asymptotics of the rescaled occupation measure of a branching random walk on binary search trees (BSTs). Under weak hypotheses on the increments, we show that this measure converges to a deterministic measure depending on the stable law whose domain of attraction contains the law of the increments. The proof is based on fundamental structural properties of BSTs, one of them being Louchard's result on the height of a typical node. This convergence yields results on two other objects associated with BSTs: homogeneous fragmentations of ]0, 1[ and recursive trees. The second study also concerns BSTs. We study the profile of the tree (the number of leaves at each level) while distinguishing the types of the leaves: arms are leaves whose sibling is an internal node, and feet are leaves whose sibling is also a leaf. We use a vector whose coordinates are the level polynomials of arms and feet; the coefficient of order k of these polynomials is the number of arms and feet at level k in a BST of size n. Comparing the two projections of this vector onto the eigenspaces of a so-called evolution matrix, we obtain almost-sure and L2 convergence of a martingale vector, connected to the profile, to a vector associated with the limit of the Jabbour martingale. Finally, the last part deals with another kind of random tree: suffix trees. These trees are defined from an infinite word, and their randomness comes from the source that generates the word; here we are concerned with mixing sources. We prove that the fill-up level of a suffix tree with n keys, normalized by log n, converges almost surely to a constant depending on the source. By the definition of suffix trees, the study of this parameter turns out to be a word-occurrence-time problem, and we obtain the convergence using results of Abadi and Vergne in this field.
Scholler, Rémy. "Analyse de données de signalisation mobile pour l’étude de la mobilité respectueuse de la vie privée : Application au secteur du transport routier de marchandises". Electronic Thesis or Diss., Bourgogne Franche-Comté, 2024. http://www.theses.fr/2024UBFCD001.
Mobile network operators hold a significant data source derived from the communications of all connected objects (not just smartphones) with the network. These signaling data are a massive source of location data and are regularly used for mobility analysis. However, their potential uses face two major challenges: low spatiotemporal precision and a highly sensitive nature with respect to privacy. In a first phase, the thesis improves the understanding of the mobility state (stationary or in motion), speed, and direction of movement of connected objects, as well as the route they take on a transportation infrastructure (e.g., road or rail). In a second phase, we demonstrate how to ensure the confidentiality of continuously produced mobility statistics. The use of signaling data, whether related to users or to various connected objects, is legally regulated; for mobility studies, operators tend to publish anonymized, aggregated statistics. Specifically, the aim is to compute complex, anonymized mobility statistics "on the fly" using differential-privacy methods and probabilistic data structures (such as Bloom filters). Finally, in a third phase, we illustrate the potential of signaling data and the approaches proposed in this manuscript for the quasi-real-time computation of anonymous statistics on road freight transport. This is, however, just one example of what could apply to other analyses of population behaviors and activities with significant public and economic policy implications.
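The Bloom filters mentioned in this abstract are a canonical probabilistic data structure: a compact bit array answering set-membership queries with no false negatives and a tunable false-positive rate. Below is a minimal illustrative sketch, not the thesis's implementation; the class and parameter names are mine.

```python
import hashlib

class BloomFilter:
    """Compact probabilistic set: no false negatives, tunable false-positive rate."""

    def __init__(self, n_bits=1024, n_hashes=4):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, item):
        # Derive k bit positions from salted SHA-256 digests.
        for i in range(self.n_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        # All k positions set => "probably present"; any unset => definitely absent.
        return all(self.bits[p // 8] >> (p % 8) & 1 for p in self._positions(item))
```

With 1024 bits and 4 hash functions, a handful of inserted items yields a false-positive rate far below one in a million, which is the trade-off that makes such sketches usable for privacy-preserving aggregate statistics.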
Boyer, Laurent. "Apprentissage probabiliste de similarités d'édition". PhD thesis, Université Jean Monnet - Saint-Etienne, 2011. http://tel.archives-ouvertes.fr/tel-00718835.
Jabbour-Hattab, Jean. "Une approche probabiliste du profil des arbres binaires de recherche". Versailles-St Quentin en Yvelines, 2001. http://www.theses.fr/2001VERS002V.
Barriot, Roland. "Intégration des connaissances biologiques à l'échelle de la cellule". Bordeaux 1, 2005. http://www.theses.fr/2005BOR13100.
Mohamed, Hanène. "Etude probabiliste d'algorithmes en arbre". Paris 6, 2007. https://tel.archives-ouvertes.fr/tel-00270742.
In this thesis, a general class of tree algorithms is analyzed. It is shown that, by using an appropriate probabilistic representation of the quantities of interest, the asymptotic behavior of these algorithms can be obtained quite easily. This approach gives a unified probabilistic treatment of these questions and simplifies and extends some of the results known in this domain.
Mohamed, Hanene. "Étude Probabiliste d'Algorithmes en Arbre". PhD thesis, Université Pierre et Marie Curie - Paris VI, 2007. http://tel.archives-ouvertes.fr/tel-00270742.
Reype, Christophe. "Modélisation probabiliste et inférence bayésienne pour l’analyse de la dynamique des mélanges de fluides géologiques : détection des structures et estimation des paramètres". Electronic Thesis or Diss., Université de Lorraine, 2022. http://www.theses.fr/2022LORR0235.
The analysis of hydrogeochemical data aims to improve the understanding of mass transfer in the subsurface and the Earth’s crust. This work focuses on the study of fluid-fluid interactions through fluid mixing systems, and more particularly on detecting the compositions of the mixing sources. The detection relies on a point process: the proposed model, called the HUG model, is unsupervised and applicable to multidimensional data. Physical knowledge of the mixtures and geological knowledge of the data are integrated directly into the probability density of a Gibbs point process that distributes point patterns in the data space. The detected sources form the point pattern that maximizes the probability density of the HUG model, which is known up to its normalization constant. Knowledge about the parameters of the model, acquired experimentally or through inference methods, is integrated into the method in the form of prior distributions. The configuration of the sources is obtained by a simulated-annealing algorithm and Markov chain Monte Carlo (MCMC) methods, and the parameters of the model are estimated by approximate Bayesian computation (ABC). The model is applied first to synthetic data and then to real data; the parameters are estimated on a synthetic data set with known sources. Finally, the sensitivity of the model to data uncertainties, parameter choices, and algorithm settings is studied.
Souihli, Asma. "Interrogation des bases de données XML probabilistes". Thesis, Paris, ENST, 2012. http://www.theses.fr/2012ENST0046/document.
Probabilistic XML is a probabilistic model for uncertain tree-structured data, with applications to data integration, information extraction, and uncertain version control. In this dissertation we explore efficient algorithms for evaluating tree-pattern queries with joins over probabilistic XML or, more specifically, for approximating the probability of each item of a query result. The approach relies on, first, extracting the query lineage over the probabilistic XML document and, second, looking for an optimal strategy to approximate the probability of the propositional lineage formula. ProApproX, the probabilistic query manager presented in this thesis, allows users to query uncertain tree-structured data in the form of probabilistic XML documents. It integrates a query engine that searches for an optimal strategy to evaluate the probability of the query lineage, following a query-optimizer-like approach: it explores different evaluation plans for different parts of the formula and predicts the cost of each plan, using a cost model for the various evaluation algorithms. We demonstrate the efficiency of this approach on datasets used in the most popular previous work on probabilistic XML querying, as well as on synthetic data. An early version of the system was demonstrated at the ACM SIGMOD 2011 conference; first steps towards the new query solution were discussed in an EDBT/ICDT PhD Workshop paper (2011); and a fully redesigned version implementing the techniques presented in this thesis was published as a demonstration at CIKM 2012. Our contributions are also part of an IEEE ICDE
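The core task described here, approximating the probability of a propositional lineage formula over independent probabilistic events, can be shown on a toy example. The formula and probabilities below are invented for illustration; a real engine such as ProApproX chooses among several evaluation algorithms using a cost model, whereas this sketch only contrasts exact enumeration with naive Monte Carlo sampling.

```python
import itertools
import random

# Hypothetical lineage variables with their independent event probabilities.
probs = {"x1": 0.8, "x2": 0.5, "x3": 0.3}

def lineage(v):
    # A toy propositional lineage formula: (x1 AND x2) OR x3.
    return (v["x1"] and v["x2"]) or v["x3"]

def exact_probability(formula, probs):
    """Exact probability by enumerating all truth assignments (exponential)."""
    total = 0.0
    for bits in itertools.product([False, True], repeat=len(probs)):
        v = dict(zip(probs, bits))
        if formula(v):
            p = 1.0
            for name, b in v.items():
                p *= probs[name] if b else 1 - probs[name]
            total += p
    return total

def monte_carlo(formula, probs, n=100_000, seed=0):
    """Additive approximation by sampling the independent events."""
    rng = random.Random(seed)
    hits = sum(formula({k: rng.random() < p for k, p in probs.items()})
               for _ in range(n))
    return hits / n
```

For this formula the exact answer is 0.4 + 0.3 - 0.12 = 0.58; the sampler converges to it at the usual 1/sqrt(n) rate, which is why choosing between exact and approximate plans per subformula matters.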
Périnel, Emmanuel. "Segmentation en analyse de données symboliques : le cas de données probabilistes". Paris 9, 1996. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=1996PA090079.
Ba, Mouhamadou Lamine. "Exploitation de la structure des données incertaines". Electronic Thesis or Diss., Paris, ENST, 2015. http://www.theses.fr/2015ENST0013.
This thesis addresses fundamental problems inherent to the need for uncertainty handling in multi-source Web applications with structured information, namely uncertain version control in Web-scale collaborative editing platforms, integration of uncertain Web sources under constraints, and truth finding over structured Web sources. Its major contributions are: uncertainty management in version control of tree-structured data using a probabilistic XML model; initial steps towards a probabilistic XML data-integration system for uncertain and dependent Web sources; precision measures for location data; and exploration algorithms for an optimal partitioning of the input attribute set during a truth-finding process over conflicting Web sources.
Rigouste, Loïs. "Méthodes probabilistes pour l'analyse exploratoire de données textuelles". PhD thesis, Télécom ParisTech, 2006. http://pastel.archives-ouvertes.fr/pastel-00002424.
Hillali, Younès. "Analyse et modélisation des données probabilistes : capacités et lois multidimensionnelles". Paris 9, 1998. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=1998PA090015.
Febrissy, Mickaël. "Nonnegative Matrix Factorization and Probabilistic Models : A unified framework for text data". Electronic Thesis or Diss., Paris, CNAM, 2021. http://www.theses.fr/2021CNAM1291.
Since the exponential growth of available data (Big Data), dimensionality-reduction techniques have become essential for the exploration and analysis of high-dimensional data arising in many scientific areas. By creating a low-dimensional space intrinsic to the original data space, these techniques offer better understanding across many data-science applications. In text analysis, where the data gathered are mainly nonnegative, well-known techniques producing transformations in the space of real numbers (e.g., principal component analysis, latent semantic analysis) are less intuitive, as they do not provide a straightforward interpretation. Such applications show the need for dimensionality-reduction techniques like Nonnegative Matrix Factorization (NMF), useful for embedding, for instance, documents or words in a space of reduced dimension. By definition, NMF approximates a nonnegative matrix by the product of two lower-dimensional nonnegative matrices, which leads to a nonlinear optimization problem. This objective can be harnessed for document/word clustering, even though clustering is not the stated objective of NMF. Relying on NMF, this thesis focuses on improving the clustering of large text data arising in the form of highly sparse document-term matrices. This objective is first pursued by proposing several regularizations of the original NMF objective function. Setting the objective in a probabilistic context, a new NMF model is then introduced, bringing theoretical foundations for the connection between NMF and finite mixture models of exponential families and thereby offering interesting regularizations; this places NMF in a genuine clustering spirit. Finally, a Bayesian Poisson latent block model is proposed to improve document and word clustering simultaneously by capturing noisy term features. This can be connected to NMTF (Nonnegative Matrix Tri-Factorization), devoted to co-clustering. Experiments on real datasets support the proposals of the thesis.
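As a rough illustration of the NMF objective discussed in this abstract, here is a pure-Python sketch of the classic Lee-Seung multiplicative updates for the Frobenius loss. This is the textbook baseline, not the regularized or Bayesian models proposed in the thesis, and the function names are mine.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf(V, k, iters=500, seed=0):
    """Approximate nonnegative V (n x m) as W (n x k) @ H (k x m),
    using Lee-Seung multiplicative updates for the Frobenius objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        WH = matmul(W, H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, WH)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        WH = matmul(W, H)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(WH, Ht)
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return W, H

def frobenius_error(V, W, H):
    WH = matmul(W, H)
    return sum((V[i][j] - WH[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0])))
```

The multiplicative form keeps every factor nonnegative by construction, which is exactly the property that makes the factors interpretable as soft document/word cluster memberships.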
Santha, Miklos. "Contributions à l'étude des structures aléatoires et des méthodes probabilistes". Paris 11, 1988. http://www.theses.fr/1988PA112057.
This thesis contains several contributions to the study of randomness and probabilistic methods in theoretical computer science. Some of them deal with random sequences and probabilistic complexity classes, the others with concrete problems. We introduce a new mathematical model of imperfect physical sources of randomness, and we show how to convert the output of these sources into quasi-random sequences that are indistinguishable from truly random ones in a strong sense. We show the existence of pseudo-random number generators that can efficiently reduce the error probability of probabilistic algorithms. We present a separation result for relativized interactive probabilistic proof systems. We study, in a probabilistic model, the optimal distribution of processors in a network when they are subject to failure. We compute the radius of certain graphs on alphabets, and we obtain tight bounds on the parallel complexity of the searching problem in multi-dimensional cubes by using probabilistic arguments.
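The thesis's model of imperfect sources is more general, but the classic von Neumann extractor below conveys the basic idea of converting a biased (yet independent) bit stream into unbiased output bits; it is offered here only as an illustration of the concept, not as the thesis's construction.

```python
def von_neumann_extract(bits):
    """Von Neumann unbiasing: read the stream in pairs;
    emit 0 for the pair (0, 1), 1 for (1, 0), and discard (0, 0) and (1, 1).
    If input bits are independent with a fixed bias p, both emitted symbols
    occur with probability p * (1 - p), so the output is unbiased."""
    out = []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out
```

The price of the fix is throughput: on a source with bias p, only 2p(1 - p) of the input pairs produce an output bit, which is one reason stronger extractors are studied.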
Santha, Miklos. "Contributions à l'étude des structures aléatoires et des méthodes probabilistes". Grenoble 2 : ANRT, 1988. http://catalogue.bnf.fr/ark:/12148/cb37618440v.
Gelgon, Marc. "Structuration statistique de données multimédia pour la recherche d'information". Habilitation à diriger des recherches, Université de Nantes, 2007. http://tel.archives-ouvertes.fr/tel-00450297.
Belazzougui, Djamal. "Structures compactes d'indexation de données". Paris 7, 2011. http://www.theses.fr/2011PA077190.
One of the most important tasks in computing is to answer various queries on given data. There are two ways to do this. The first is to read the whole data for each query, which can be very slow if the data is large. The second is to preprocess the data and build a "data structure"; a query is then answered by accessing the data structure and possibly a small part of the original data. This approach usually answers queries in much less time than is needed to read all the data. In some cases, however, the data structure may occupy much more space than the original data, which can cause a major slowdown if it becomes too large to fit in fast memory. The solution is a smart encoding that makes the data structure "compact" enough to fit in the available fast memory while retaining the fastest possible query time. The contribution of this thesis is several improvements to known compact data structures as well as new compact data structures, mostly concerning problems on strings. Our first result is a new compact and fast solution to the dictionary-matching problem. Our second result is a compact and fast data structure for the 1-error approximate dictionary problem. Our third result is a new kind of perfect hashing, which we call "monotone minimal perfect hashing". In our last two results, we use monotone minimal perfect hashing to improve known solutions for the text indexing and full-text top-k document retrieval problems.
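A staple building block behind compact data structures of this kind is constant-time rank on a bitvector via precomputed block counts. The sketch below uses a single level of blocks, so it is neither space-optimal nor one of the thesis's constructions; it only illustrates the space/time trade-off the abstract describes.

```python
class RankBitvector:
    """Bitvector with fast rank queries: precompute the number of 1s up to
    each block boundary, then scan at most one block per query."""

    BLOCK = 64

    def __init__(self, bits):
        self.bits = bits
        self.block_ranks = [0]  # block_ranks[b] = number of 1s in bits[0 : b*BLOCK]
        acc = 0
        for i, b in enumerate(bits):
            acc += b
            if (i + 1) % self.BLOCK == 0:
                self.block_ranks.append(acc)

    def rank1(self, i):
        """Number of 1s in bits[0:i]."""
        blk = i // self.BLOCK
        return self.block_ranks[blk] + sum(self.bits[blk * self.BLOCK:i])
```

Real succinct bitvectors add a second level of small counters and table lookups to reach true O(1) time in o(n) extra bits; the principle, trading a little precomputed space for query speed, is the same.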
Vrac, Mathieu. "Analyse et modélisation de données probabilistes par décomposition de mélange de copules et application à une base de données climatologiques". PhD thesis, Université Paris Dauphine - Paris IX, 2002. http://tel.archives-ouvertes.fr/tel-00002386.
Arnst, Maarten. "Inversion de modèles probabilistes de structures à partir de fonctions de transfert expérimentales". PhD thesis, Ecole Centrale Paris, 2007. http://tel.archives-ouvertes.fr/tel-00238573.
Sellier, Alain. "Modélisations probabilistes du comportement de matériaux et de structures en génie civil". Cachan, Ecole normale supérieure, 1995. http://www.theses.fr/1995DENS0012.
Sellier, Alain. "Modélisations probabilistes du comportement de matériaux et de structures en génie civil /". Cachan : Laboratoire de mécanique et technologie, 1995. http://catalogue.bnf.fr/ark:/12148/cb35814905t.
Siolas, Georges. "Modèles probabilistes et noyaux pour l'extraction d'informations à partir de documents". Paris 6, 2003. http://www.theses.fr/2003PA066487.
Jaillet, Léonard. "Méthodes probabilistes pour la planification réactive de mouvement". PhD thesis, Université Paul Sabatier - Toulouse III, 2005. http://tel.archives-ouvertes.fr/tel-00853031.
Castelli Aleardi, Luca. "Représentations compactes de structures de données géométriques". PhD thesis, Ecole Polytechnique X, 2006. http://tel.archives-ouvertes.fr/tel-00336188.
Darlay, Julien. "Analyse combinatoire de données : structures et optimisation". PhD thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00683651.
Sierocinski, Thomas. "Méthodes probabilistes, floues et quantiques pour l'extraction de l'information biologique". PhD thesis, Université Rennes 1, 2008. http://tel.archives-ouvertes.fr/tel-00429878.
Gille-Genest, Anne. "Utilisation des méthodes numériques probabilistes dans les applications au domaine de la fiabilité des structures". Paris 6, 1999. http://www.theses.fr/1999PA066212.
Bouklit, Mohamed. "Autour du graphe du web : modélisations probabilistes de l'internaute et détection de structures de communauté". Montpellier 2, 2006. http://www.theses.fr/2006MON20096.
Fejoz, Loïc. "Développement prouvé de structures de données sans verrou". PhD thesis, Université Henri Poincaré - Nancy I, 2008. http://tel.archives-ouvertes.fr/tel-00594978.
Jaff, Luaï. "Structures de Données dynamiques pour les Systèmes Complexes". PhD thesis, Université du Havre, 2007. http://tel.archives-ouvertes.fr/tel-00167104.
Texto completola porte vers des applications en économie via les systèmes complexes.
Les structures de données que nous avons étudiées sont les permutations qui ne contiennent pas de sous-suite croissante de longueur plus que deux, les tableaux de Young standards rectangles à deux lignes, les mots de Dyck et les codes qui lient ces structures de données.
Nous avons proposé un modèle économique qui modélise le bénéfice d'un compte bancaire dont l'énumération des configurations possible se fait à l'aide d'un code adapté. Une seconde application
concerne l'évolution de populations d'automate génétique . Ces populations sont étudiées par analyse spectrale et des expérimentations sont données sur des automates probabilistes dont l'évolution conduit à contrôler la dissipation par auto-régulation.
L'ensemble de ce travail a pour ambition de donner quelques outils calculatoires liés à la dynamique de structures de données pour analyser la complexité des systèmes.
Simacek, Jiri. "Vérification de programmes avec structures de données complexes". Phd thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00805794.
Auber, David. "Outils de visualisation de larges structures de données". Bordeaux 1, 2002. http://www.theses.fr/2002BOR12607.
Fejoz, Loïc. "Développement prouvé de structures de données sans verrou". Thesis, Nancy 1, 2009. http://www.theses.fr/2009NAN10022/document.
The central topic of this thesis is the proof-based development of lock-free data-structure algorithms. A first motivation comes from new computer architectures that provide new synchronization features; these enable concurrent algorithms that do not use locks and are thus more efficient. The second motivation is the search for provably correct programs: embedded software is now used everywhere, including in systems where safety is central. We propose a refinement-based method for designing and verifying non-blocking, and in particular lock-free, implementations of data structures. The entire method has been formalized in Isabelle/HOL. An associated prototype tool generates verification conditions that can be discharged by SMT solvers or automatic theorem provers for first-order logic, and we have used this approach to verify a number of such algorithms.
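For intuition about the kind of lock-free algorithm such verification targets, here is a schematic Treiber-style stack. Python has no hardware compare-and-swap, so the `_cas` helper below only emulates the atomicity of that primitive; this is a pedagogical sketch of the retry pattern, not a genuinely lock-free implementation and not the thesis's case study.

```python
import threading

class Cell:
    __slots__ = ("value", "next")
    def __init__(self, value, nxt):
        self.value = value
        self.next = nxt

class TreiberStack:
    """Treiber-style stack: push/pop read the head, prepare a new head,
    and retry until a compare-and-swap on the head pointer succeeds."""

    def __init__(self):
        self.head = None
        self._guard = threading.Lock()  # stands in for hardware CAS atomicity only

    def _cas(self, expected, new):
        # Emulated compare-and-swap: atomically set head to `new`
        # if and only if it is still `expected`.
        with self._guard:
            if self.head is expected:
                self.head = new
                return True
            return False

    def push(self, value):
        while True:
            h = self.head
            if self._cas(h, Cell(value, h)):
                return

    def pop(self):
        while True:
            h = self.head
            if h is None:
                return None
            if self._cas(h, h.next):
                return h.value
```

The interesting proof obligations live exactly in those retry loops (linearizability, absence of the ABA problem), which is why machine-checked refinement proofs of such algorithms are valuable.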
Maria, Clément. "Algorithmes et structures de données en topologie algorithmique". Thesis, Nice, 2014. http://www.theses.fr/2014NICE4081/document.
The theory of homology generalizes the notion of connectivity in graphs to higher dimensions. It defines a family of groups on a domain, described discretely by a simplicial complex, that captures the connected components, the holes, the cavities, and their higher-dimensional equivalents. In practice, the generality and flexibility of homology allow the analysis of complex data interpreted as point clouds in metric spaces. The theory of persistent homology introduces a robust notion of homology for topology inference. Its applications are varied and range from the description of high-dimensional configuration spaces of complex dynamical systems to the classification of shapes under deformation and learning in medical imaging. In this thesis, we explore the algorithmic ramifications of persistent homology. We first introduce the simplex tree, an efficient data structure for constructing and maintaining high-dimensional simplicial complexes. We then present a fast implementation of persistent cohomology via the compressed annotation matrix data structure. We also refine the computation of persistence by describing ideas of homological torsion in this framework and introduce the modular reconstruction method for its computation. Finally, we present an algorithm to compute zigzag persistent homology, an algebraic generalization of persistence; to do so, we introduce new local transformation theorems in quiver representation theory, called diamond principles. All algorithms are implemented in the computational library Gudhi.
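In dimension 0, persistent homology reduces to tracking connected components along a filtration, which a union-find structure computes directly. The small sketch below (my own illustration; the thesis's algorithms for general dimensions and zigzag persistence are far more involved) assumes all vertices are born at time 0, so every merge simply kills one component at the weight of the merging edge.

```python
def persistence_0d(n_vertices, weighted_edges):
    """0-dimensional persistence of a graph filtration.
    weighted_edges: iterable of (weight, u, v). Every vertex is born at 0;
    a component dies when an edge merges it into another one.
    Returns a list of (birth, death) pairs; surviving components get death=inf."""
    parent = list(range(n_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for w, u, v in sorted(weighted_edges):  # process edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru
            pairs.append((0.0, w))  # one component dies at this weight
    # components never merged persist forever
    pairs += [(0.0, float("inf"))] * sum(1 for x in range(n_vertices) if find(x) == x)
    return pairs
```

The resulting (birth, death) pairs are exactly the bars of the degree-0 persistence barcode of the filtration, the simplest instance of the diagrams the thesis computes in full generality.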
Senellart, Pierre. "XML probabiliste: Un modèle de données pour le Web". Habilitation à diriger des recherches, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00758055.
Gao, Yingzhong. "Modèles probabilistes et possibilistes pour la prise en compte de l'incertain dans la sécurité des structures". PhD thesis, Ecole Nationale des Ponts et Chaussées, 1996. http://pastel.archives-ouvertes.fr/pastel-00569129.
Li, Xinran. "Evaluation et amélioration des méthodes de chaînage de données". Thesis, Clermont-Ferrand 1, 2015. http://www.theses.fr/2015CLF1MM02/document.
Record linkage is the task of identifying which records from different data sources refer to the same entities. Without a common identification key across databases, this task can be performed by comparing the corresponding fields (those containing the identifying information) of the records to link, and many record-linkage methods have been proposed over the last decades. In order to ensure valid and fast linkage of the same patients' records for GINSENG, a research project that aimed to implement a grid-computing infrastructure for sharing medical data, we first studied various commonly used record-linkage methods: approximate comparison of record fields according to their spelling and pronunciation, and deterministic and probabilistic record linkage together with their extensions. The advantages and disadvantages of these methods are clearly demonstrated. In practice, as the fields to compare are sometimes subject to typographical errors, we focused on probabilistic record linkage. The implementation of the probabilistic methods proposed by Fellegi and Sunter (PRL-FS) and Winkler (PRL-W) is described in detail, along with their evaluation and comparison. Synthetic data sets, for which the true matches are known, were used to evaluate the linkage results, and a configurable algorithm for generating synthetic data is proposed. To our knowledge, PRL-W is one of the most effective methods in terms of linkage validity in the presence of typographical errors in the fields. However, PRL-W does not satisfactorily handle missing data in the fields, and its implementation is complex, with a computational time that impairs its suitability for routine use. Solutions are proposed with the objective of improving the effectiveness of PRL-W in the presence of missing data in the fields, and other solutions are tested to simplify the PRL-W algorithm, reducing computational time while keeping optimal linkage accuracy.
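At the heart of the Fellegi-Sunter approach (PRL-FS) mentioned above is a composite match weight: for each compared field, one adds the log-ratio of the agreement probability among true matches (m) to that among non-matches (u). A minimal sketch with invented probabilities, not the thesis's implementation:

```python
import math

def match_weight(fields_agree, m, u):
    """Fellegi-Sunter composite weight for a record pair.
    fields_agree: {field: True if the two records agree on this field}
    m[field]: P(agreement | records are a true match)
    u[field]: P(agreement | records are not a match)
    Agreement contributes log2(m/u) > 0; disagreement log2((1-m)/(1-u)) < 0.
    Pairs above an upper threshold are linked, below a lower one rejected."""
    w = 0.0
    for field, agree in fields_agree.items():
        if agree:
            w += math.log2(m[field] / u[field])
        else:
            w += math.log2((1 - m[field]) / (1 - u[field]))
    return w
```

Fields that rarely agree by chance (low u, like a date of birth) carry far more weight than common ones, which is what makes the probabilistic approach robust when some fields contain typographical errors.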
Amoualian, Hesam. "Modélisation et apprentissage de dépendances á l’aide de copules dans les modéles probabilistes latents". Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM078/document.
This thesis focuses on scaling latent topic models to big data collections, especially document streams. Although the main goal of probabilistic modeling is to find word topics, an equally interesting objective is to examine topic evolutions and transitions. To accomplish this task, we propose in Chapter 3 three new models for modeling topic and word-topic dependencies between consecutive documents in document streams. The first model is a direct extension of the Latent Dirichlet Allocation model (LDA) and uses a Dirichlet distribution to balance the influence of the LDA prior parameters against the topic and word-topic distributions of the previous document. The second extension uses copulas, a generic tool for modeling dependencies between random variables. We rely here on Archimedean copulas, and more precisely on the Frank copula, as they are symmetric and associative and thus appropriate for exchangeable random variables. Lastly, the third model is a non-parametric extension of the second through the integration of copulas into the stick-breaking construction of Hierarchical Dirichlet Processes (HDP). Our experiments, conducted on five standard collections used in several studies on topic modeling, show that our proposals outperform previous ones, such as dynamic topic models, temporal LDA and Evolving Hierarchical Processes, both in terms of perplexity and in tracking similar topics in document streams. Compared to previous proposals, our models have extra flexibility and can adapt to situations where there are no dependencies between the documents. On the other hand, the "exchangeability" assumption in topic models like LDA often results in inferring inconsistent topics for the words of text spans like noun phrases, which are usually expected to be topically coherent.
In Chapter 4 we propose copulaLDA (copLDA), which extends LDA by integrating part of the text structure into the model and relaxes the conditional independence assumption between the word-specific latent topics given the per-document topic distributions. To this end, we assume that the words of text spans like noun phrases are topically bound, and we model this dependence with copulas. We demonstrate empirically the effectiveness of copLDA on both intrinsic and extrinsic evaluation tasks on several publicly available corpora. To complete the previous model (copLDA), Chapter 5 presents an LDA-based model that generates topically coherent segments within documents by jointly segmenting documents and assigning topics to their words. The coherence between topics is ensured through a copula binding the topics associated with the words of a segment. In addition, this model relies on both document- and segment-specific topic distributions so as to capture fine-grained differences in topic assignments. We show that the proposed model naturally encompasses other state-of-the-art LDA-based models designed for similar tasks. Furthermore, our experiments, conducted on six publicly available datasets, show the effectiveness of our model in terms of perplexity, Normalized Pointwise Mutual Information, which captures the coherence between the generated topics, and the micro F1 measure for text classification.
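The Frank copula this family of models relies on has a simple closed form whose symmetry (the exchangeability that makes it suitable for latent-topic variables) can be checked directly. A minimal illustrative sketch, not code from the thesis:

```python
import math

def frank_copula(u, v, theta):
    # Bivariate Frank copula C(u, v; theta) on [0, 1]^2.
    # theta = 0.0 is the independence limit C(u, v) = u * v;
    # theta > 0 induces positive dependence between the two
    # uniform marginals u and v.
    if theta == 0.0:
        return u * v
    num = (math.exp(-theta * u) - 1.0) * (math.exp(-theta * v) - 1.0)
    den = math.exp(-theta) - 1.0
    return -math.log(1.0 + num / den) / theta

# Symmetry (exchangeability): C(u, v) == C(v, u), the property the
# abstract invokes for Archimedean copulas.
```

As a sanity check, the copula also satisfies the boundary conditions C(u, 1) = u and C(u, 0) = 0 for any theta.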
Bornard, Raphaël. "Approches probabilistes appliquées à la restauration numérique d'archives télévisées". Phd thesis, Ecole Centrale Paris, 2002. http://tel.archives-ouvertes.fr/tel-00657636.
Claisse, Harry. "Structures chainées et environnement paginé". Compiègne, 1987. http://www.theses.fr/1987COMPI270.
This thesis discusses problems related to pointer data structures. The first part is concerned with the effects of design decisions on execution efficiency in a paged LISP environment. The main problem can be summarized as a continuing search to reduce the number of page faults. Measurements are used to identify crucial points such as the structure of core memory, working-set sizes and garbage collection. We study several storage implementations of symbols (strings of characters identifying each entity) and evaluate their performance. The conclusions are applied directly to a query database used at UTC.
Mebarki, Abdelkrim. "Implantation de structures de données compactes pour les triangulations". Phd thesis, Université de Nice Sophia-Antipolis, 2008. http://tel.archives-ouvertes.fr/tel-00336178.
[…] representing triangulations compactly. To do so, two avenues are explored: modifying the internal in-memory representation of the geometric objects, and redefining the abstract types of the corresponding geometric objects. A first solution consists in using indices of an arbitrary number of bits instead of absolute references. The gains depend on the size of the triangulation and also on the machine's word size. The major drawback is the method's high cost in execution time. A second approach uses stable catalogs: the idea is to group triangles into micro-triangulations and to represent the triangulation as a set of these micro-triangulations. The number of multiple references to vertices, and of reciprocal references between neighbors, is then markedly reduced. The results are promising, given that execution time is not dramatically degraded by the change in the triangle-access methods. A third solution consists in decomposing the triangulation into several sub-triangulations, which makes it possible to encode references within a sub-triangulation on fewer bits than absolute references. The results of this technique are encouraging and can be amplified by other techniques such as relative encoding of references, or sharing the geometric information of boundary vertices between the different sub-triangulations. The design of compact structures still deserves further attention, and several avenues remain to be explored to reach solutions that are more economical in memory.
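The first of the approaches above (indices of arbitrary bit width instead of absolute machine-word references) can be sketched as follows. `PackedRefs` and its methods are illustrative names, not the thesis implementation; the point is that a 20-bit field already addresses about a million triangles where a 64-bit pointer would otherwise be spent:

```python
class PackedRefs:
    # Hypothetical sketch: store triangle/vertex references as
    # fixed-width k-bit fields packed into one big integer used
    # as a bit buffer, instead of full machine-word references.
    def __init__(self, bits_per_ref):
        self.bits = bits_per_ref
        self.mask = (1 << bits_per_ref) - 1
        self.data = 0
        self.count = 0

    def append(self, ref):
        assert 0 <= ref <= self.mask, "ref exceeds field width"
        self.data |= ref << (self.bits * self.count)
        self.count += 1

    def get(self, i):
        # Extract the i-th k-bit field.
        return (self.data >> (self.bits * i)) & self.mask

refs = PackedRefs(bits_per_ref=20)   # 20 bits: up to 2**20 triangles
for r in (0, 7, 1048575):
    refs.append(r)
```

As the abstract notes, the price of this encoding is extra shift/mask work on every access, which is exactly the execution-time cost identified as the method's major drawback.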
Toss, Julio. "Algorithmes et structures de données parallèles pour applications interactives". Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM056/document.
The quest for performance has been a constant through the history of computing systems. It has been more than a decade now since the sequential processing model showed its first signs of exhaustion in sustaining performance improvements. The walls facing sequential computation pushed a paradigm shift and established parallel processing as the standard in modern computing systems. With the widespread adoption of parallel computers, many algorithms and applications have been ported to fit these new architectures. However, in unconventional applications with interactivity and real-time requirements, achieving efficient parallelizations is still a major challenge. The real-time performance requirement shows up, for instance, in user-interactive simulations where the system must be able to react to the user's input within a computation time-step of the simulation loop. The same kind of constraint appears in streaming-data monitoring applications, for instance when an external source of data, such as traffic sensors or social media posts, provides a continuous flow of information to be consumed by an on-line analysis system. The consumer system has to keep a controlled memory budget and deliver processed information about the stream quickly. Common optimizations relying on pre-computed models or static indexes of the data are not possible in these highly dynamic scenarios. The dynamic nature of the data brings up several performance issues, originating from the problem decomposition for parallel processing and from the maintenance of data locality for efficient cache utilization. In this thesis we address data-dependent problems in two different applications: one in physics-based simulation and the other in streaming-data analysis. For the simulation problem, we present a parallel GPU algorithm for computing multiple shortest paths and Voronoi diagrams on a grid-like graph.
For the streaming-data analysis problem, we present a parallelizable data structure, based on packed memory arrays, for indexing dynamic geo-located data while keeping good memory locality.
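A packed memory array keeps sorted keys in an array with deliberate gaps, so an insert only shifts elements up to the nearest gap and range scans stay cache-friendly. A toy sketch under simplified assumptions (a single global density threshold and a uniform re-spread instead of the real structure's per-segment rebalancing; all names are illustrative, not the thesis API):

```python
GAP = None  # marker for an empty cell

class ToyPMA:
    def __init__(self, capacity=8, max_density=0.7):
        self.cells = [GAP] * capacity
        self.size = 0
        self.max_density = max_density

    def _spread(self):
        # Rebuild at double capacity with uniformly distributed gaps.
        keys = [c for c in self.cells if c is not GAP]
        cap = max(8, 2 * len(self.cells))
        self.cells = [GAP] * cap
        step = cap / max(1, len(keys))
        for i, k in enumerate(keys):
            self.cells[int(i * step)] = k

    def insert(self, key):
        if (self.size + 1) / len(self.cells) > self.max_density:
            self._spread()
        # Insertion point: just after the last stored key smaller than `key`.
        pos = 0
        for i, c in enumerate(self.cells):
            if c is not GAP and c < key:
                pos = i + 1
        # Slide occupied cells right until a gap absorbs the shift.
        carry, i = key, pos
        while carry is not GAP:
            if i == len(self.cells):
                self.cells.append(GAP)  # toy fallback; a real PMA rebalances
            carry, self.cells[i] = self.cells[i], carry
            i += 1
        self.size += 1

    def keys(self):
        # Keys read left to right, skipping gaps, are always sorted.
        return [c for c in self.cells if c is not GAP]
```

Inserting 5, 1, 9, 3, 7, 2 in that order leaves `keys()` equal to `[1, 2, 3, 5, 7, 9]`; each insert touches only the cells between the insertion point and the next gap, which is the locality property the thesis exploits for parallel, streaming workloads.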
Yahia, Hussein. "Analyse des structures de données arborescentes représentant des images". Paris 11, 1986. http://www.theses.fr/1986PA112292.
Vincent, Céline. "Détection de structures tourbillonnaires par analyse de données directionnelles". Montpellier 2, 2007. http://www.theses.fr/2007MON20128.
Jaillet, Léonard. "Méthodes probabilistes pour la planification réactive de mouvements". Phd thesis, Université Paul Sabatier - Toulouse III, 2005. http://tel.archives-ouvertes.fr/tel-00011515.
Rochd, El Mehdi. "Modèles probabilistes de consommateurs en ligne : personnalisation et recommandation". Thesis, Aix-Marseille, 2015. http://www.theses.fr/2015AIXM4086.
Search systems have facilitated access to the information available on the web using mechanisms for collecting, indexing and storing heterogeneous content. They generate data resulting from users' activity on the Internet (queries, log files). The next step is to analyze these data using data mining tools, in order to improve the quality of the systems' responses or to customize responses based on user profiles. Some actors, such as the company Marketshot, position themselves as intermediaries between consumers and professionals: they link potential buyers with the leading brands and distribution networks through their websites. For such purposes, these intermediaries have developed effective portals and have stored large volumes of data related to the activity of users on their websites. These data repositories are exploited to respond to the needs of users as well as those of professionals, who seek to understand the behavior of their customers and anticipate their purchasing actions. My thesis falls within the framework of mining the data collected from the web. The idea is to build models that explain the correlation between users' activity on purchase-assistance websites and sales trends of products in "real life". In fact, my research concerns probabilistic learning, in particular topic models. It involves modeling users' behavior from their use of merchant websites.