Doctoral dissertations on the topic "Grands Jeux de Données"
Create an accurate reference in APA, MLA, Chicago, Harvard, and many other styles
Consult the top 50 doctoral dissertations on the topic "Grands Jeux de Données".
An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a ".pdf" file and read its abstract online, whenever these are available in the metadata.
Browse doctoral dissertations from a wide variety of disciplines and compile your bibliography correctly.
Allart, Thibault. "Apprentissage statistique sur données longitudinales de grande taille et applications au design des jeux vidéo". Thesis, Paris, CNAM, 2017. http://www.theses.fr/2017CNAM1136/document.
This thesis focuses on longitudinal time-to-event data that may be large along three axes: number of individuals, observation frequency, and number of covariates. We introduce a penalised estimator based on the Cox complete likelihood with data-driven weights, together with proximal optimization algorithms to fit the model coefficients efficiently. We have implemented these methods in C++ and in the R package coxtv, allowing anyone to analyse data sets bigger than RAM by using data streaming and online learning algorithms such as proximal stochastic gradient descent with adaptive learning rates. We illustrate performance on simulations and benchmark against existing models. Finally, we investigate the issue of video game design. We show that applying our model to the large datasets available in the video game industry brings to light ways of improving the design of the studied games. We first look at low-level covariates, such as equipment choices over time, and show that the model quantifies the effect of each game element, giving designers ways to improve the game design. We then show that the model can be used to extract more general design recommendations, such as the influence of difficulty on player motivation.
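The abstract's optimization machinery can be made concrete with a small sketch: proximal stochastic gradient descent with AdaGrad-style adaptive learning rates, applied here to a plain L1-penalised least-squares objective rather than the thesis's Cox likelihood (all names and settings below are illustrative, not taken from the coxtv package).

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrinks each coefficient toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def proximal_sgd(X, y, lam=0.1, lr=0.05, epochs=50, seed=0):
    """L1-penalised least squares fitted by proximal SGD with AdaGrad-style
    per-coordinate learning rates (illustrative, not the coxtv Cox estimator)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    g2 = np.zeros(p)                                 # accumulated squared gradients
    for _ in range(epochs):
        for i in rng.permutation(n):
            grad = (X[i] @ w - y[i]) * X[i]          # stochastic gradient of one term
            g2 += grad ** 2
            step = lr / np.sqrt(g2 + 1e-8)           # adaptive per-coordinate step
            w = soft_threshold(w - step * grad, step * lam)  # proximal step
    return w

# Toy data: only the first two of five covariates matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.01 * rng.normal(size=200)
w = proximal_sgd(X, y, lam=0.5)
```

Because each update touches a single observation, the same loop works when observations are streamed from disk, which is the point of the bigger-than-RAM setting mentioned in the abstract.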
Schertzer, Jérémie. "Exploiting modern GPUs architecture for real-time rendering of massive line sets". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAT037.
In this thesis, we consider massive line sets generated from brain tractograms. They describe neural connections represented by millions of poly-line fibers, summing up to billions of segments. Thanks to the two-stage mesh shader pipeline, we build a tractogram renderer surpassing state-of-the-art performance by two orders of magnitude. Our performance comes from fiblets: a compressed representation of segment blocks. By combining temporal coherence and morphological dilation on the z-buffer, we define a fast occlusion culling test for fiblets. Thanks to our heavily optimized parallel decompression algorithm, surviving fiblets are swiftly synthesized into poly-lines. We also showcase how our fiblet pipeline speeds up advanced tractogram interaction features. For the general case of line rendering, we propose morphological marching: a screen-space technique rendering custom-width tubes from the thin rasterized lines of the G-buffer. By approximating a tube as the union of spheres densely distributed along its axis, the sphere shading each pixel is retrieved by a multi-pass neighborhood propagation filter. Accelerated by the compute pipeline, we reach real-time performance for the rendering of depth-dependent wide lines. To conclude our work, we implement a virtual reality prototype combining fiblets and morphological marching. It makes the immersive visualization of huge tractograms with fast shading of thick fibers possible for the first time, thus paving the way for diverse perspectives.
Mansiaux, Yohann. "Analyse d'un grand jeu de données en épidémiologie : problématiques et perspectives méthodologiques". Thesis, Paris 6, 2014. http://www.theses.fr/2014PA066272/document.
The increasing size of datasets is a growing issue in epidemiology. The CoPanFlu-France cohort (1450 subjects), intended to study H1N1 pandemic influenza infection risk as a combination of biological, environmental, socio-demographic and behavioral factors, and in which hundreds of covariates are collected for each patient, is a good example. The statistical methods usually employed to explore associations have many limits in this context. We compare the contribution of data-driven exploratory methods, assuming the absence of a priori hypotheses, to hypothesis-driven methods, which require the development of preliminary hypotheses.
First, a data-driven study is presented, assessing the ability to detect influenza infection determinants of two data mining methods, random forests (RF) and boosted regression trees (BRT), of the conventional logistic regression framework (Univariate Followed by Multivariate Logistic Regression, UFMLR) and of the Least Absolute Shrinkage and Selection Operator (LASSO), with a penalty in multivariate logistic regression to achieve a sparse selection of covariates. A simulation approach was used to estimate the True (TPR) and False (FPR) Positive Rates associated with these methods. Between three and twenty-four determinants of infection were identified, the pre-epidemic antibody titer being the only covariate selected by all methods. The mean TPR was highest for RF (85%) and BRT (80%), followed by the LASSO (up to 78%), while the UFMLR methodology was inefficient (below 50%). A slight increase of the alpha risk (mean FPR up to 9%) was observed for the logistic regression-based models, LASSO included, while the mean FPR was 4% for the data mining methods.
Second, we propose a hypothesis-driven causal analysis of the infection risk with a structural equation model (SEM). We exploited the SEM's ability to model latent variables to study very diverse factors, their relative impact on the infection, and their possible relationships. Only the latent variables describing host susceptibility (modeled by the pre-epidemic antibody titer) and compliance with preventive behaviors were directly associated with infection. The behavioral factors describing risk perception and the perception of preventive measures positively influenced compliance with preventive behaviors. The intensity (number and duration) of social contacts was not associated with the infection. This thesis shows the necessity of considering novel statistical approaches for the analysis of large datasets in epidemiology. Data mining and the LASSO are credible alternatives to the tools generally used to explore associations with a high number of variables. SEM allows the integration of variables describing diverse dimensions and the explicit modeling of their relationships; such models are therefore of major interest in a multidisciplinary study such as CoPanFlu.
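The True and False Positive Rates used to benchmark the selection methods are simple set computations against the simulated truth; a minimal sketch, with hypothetical covariate names:

```python
def selection_rates(selected, true_determinants, all_covariates):
    """True/false positive rates of a covariate-selection result
    against the known simulation truth."""
    selected, truth = set(selected), set(true_determinants)
    negatives = set(all_covariates) - truth
    tpr = len(selected & truth) / len(truth)
    fpr = len(selected & negatives) / len(negatives)
    return tpr, fpr

covariates = [f"x{i}" for i in range(20)]
truth = ["x0", "x1", "x2", "x3"]        # hypothetical true determinants
picked = ["x0", "x1", "x2", "x10"]      # e.g. covariates kept by one method
tpr, fpr = selection_rates(picked, truth, covariates)
# tpr = 3/4 = 0.75, fpr = 1/16 = 0.0625
```

Averaging these rates over many simulated cohorts gives the mean TPR/FPR figures quoted in the abstract.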
Barbier, Sébastien. "Visualisation distance temps-réel de grands volumes de données". Grenoble 1, 2009. http://www.theses.fr/2009GRE10155.
Numerical simulations produce larger and larger meshes that can reach dozens of millions of tetrahedra. These datasets must be visually analyzed to understand the simulated physical phenomenon and draw conclusions. The computational power available for scientific visualization of such datasets is often smaller than for the numerical simulation itself; as a consequence, interactive exploration of massive meshes is rarely achieved. In this document, we propose a new method to interactively explore massive tetrahedral meshes with over forty million tetrahedra. This method is fully integrated into the simulation process and is based on two meshes of the same simulation at different resolutions, one fine and one coarse. A partition of the fine vertices is computed, guided by the coarse mesh. It allows the on-the-fly extraction of a mesh, called biresolution, mixing the two initial resolutions as in usual multiresolution approaches. The extraction of such meshes is carried out in main memory (CPU), on the latest generation of graphics cards (GPU), and with an out-of-core algorithm, guaranteeing extraction rates never reached in previous work. To visualize the biresolution meshes, a new direct volume rendering (DVR) algorithm is fully implemented on graphics cards. Approximations can be performed and are evaluated in order to guarantee interactive rendering of any biresolution mesh.
Mbimbe, Dean. "L'abus de droit dans les grands évènements sportifs : l'exemple des Jeux Olympiques". Master's thesis, Université Laval, 2017. http://hdl.handle.net/20.500.11794/28341.
Since 1984, legal protection for mega sports events, abuse of privilege and ambush marketing have been investigated by jurists, journalists and sociologists. Nevertheless, exploring those areas through intellectual property without being influenced by the negative connotations of the "ambush" terminology is not easy. It is even harder when the practice is presented to the public as the main harm done to the world's most beloved sports event: the Olympics. However, digging back successively to the roots of the disparaged practice and of the Movement enables a certain understanding. It unveils the goodwill shown by the law toward the NGOs behind mega events, such as the IOC, FIFA or UEFA, a kind of benevolence that nowadays has to stop. We therefore found it necessary to remind those organisations of the mission they assigned to themselves when they chose to govern sports events through the protection of intellectual property: a social mission. In order to do so, we undertook what may be described as a "vagrancy study", commanded by the study of an event unyielding to settlement.--Keywords: Abuse of Process, Ambush Marketing, Monopoly, Special Legislation, IOC, Trademark Law, Fundamental Rights.
Coveliers, Alexandre. "Sensibilité aux jeux de données de la compilation itérative". Paris 11, 2007. http://www.theses.fr/2007PA112255.
In the context of processor architecture design, the search for performance leads to constant growth in architectural complexity, which makes it harder to exploit the architectures' potential performance. To improve this exploitation, new optimization techniques based on dynamic behavior (i.e., run-time behavior) have been proposed. Iterative compilation is one such optimization approach: it can determine more relevant transformations than those obtained by static analysis. The main drawback of this optimization method is that the information driving the code transformations is specific to a particular data set, so the chosen optimizations depend on the data set used during the optimization process. In this thesis, we study the performance variations of optimized applications according to the data set used, for two iterative code transformation techniques. We introduce different metrics to quantify this sensitivity. We also propose data set selection methods for choosing which data sets to use during the code transformation process; the selected data sets yield an optimized code with good performance across all other available data sets.
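The sensitivity the thesis quantifies can be illustrated with a toy metric: how much an optimized variant's runtime varies across data sets, and a minimax rule for picking a robust variant. The metrics and names below are illustrative stand-ins, not the thesis's own definitions.

```python
def dataset_sensitivity(runtimes):
    """Spread of an optimized code's performance across data sets:
    relative gap between the worst and best observed runtimes."""
    best, worst = min(runtimes.values()), max(runtimes.values())
    return (worst - best) / best

def robust_variant(runtimes_by_variant):
    """Pick the code variant whose worst-case runtime over all
    data sets is smallest (a minimax choice)."""
    return min(runtimes_by_variant,
               key=lambda v: max(runtimes_by_variant[v].values()))

# Hypothetical runtimes (seconds) of two transformed variants on three data sets.
runs = {
    "unroll4": {"d1": 1.0, "d2": 1.4, "d3": 1.1},
    "unroll8": {"d1": 0.9, "d2": 2.0, "d3": 1.0},
}
best = robust_variant(runs)  # "unroll4": its worst case (1.4s) beats 2.0s
```

A variant that wins on one data set ("unroll8" on d1) can still be the fragile choice overall, which is exactly the data-set dependence the thesis studies.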
Rougui, Jamal. "Indexation de documents audio : Cas des grands volumes de données". Phd thesis, Université de Nantes, 2008. http://tel.archives-ouvertes.fr/tel-00450812.
Rougui, Jamal-Eddine. "Indexation de documents audio : cas des grands volumes de données". Nantes, 2008. http://www.theses.fr/2008NANT2031.
This thesis is devoted to techniques allowing speaker-based recognition systems to scale up to large amounts of data and speaker models. We have chosen to partition audio documents (news broadcasts) according to speakers. The mel-cepstral acoustic characteristics of each speaker are modeled through a probabilistic Gaussian mixture model. Speaker change detection in the stream is carried out by Bayesian hypothesis testing. The scheme is incremental: as new speakers are detected, they are either identified in the database or new entries are created for them. We then examined issues related to building a tree structure exploiting a similarity between speaker models, with several contributions: a proposal for organising a set of speaker models based on elementary model grouping; the use of an approximation of the Kullback-Leibler divergence for this purpose; and, through two studies using binary and n-ary tree structures, a discussion of a version suitable for incremental processing. Finally, perspectives are drawn regarding joint audio/video analysis and future needs are analyzed.
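The Kullback-Leibler machinery behind the model-grouping step can be illustrated with the closed form for single diagonal Gaussians, the building block from which mixture-level approximations are typically assembled (an illustration, not the thesis's exact approximation):

```python
import numpy as np

def kl_diag_gauss(m0, s0, m1, s1):
    """Closed-form KL divergence KL(N(m0, diag(s0^2)) || N(m1, diag(s1^2))).
    For Gaussian *mixtures* (as in GMM speaker models) no closed form exists,
    so tree-building schemes rely on approximations built from such terms."""
    m0, s0, m1, s1 = map(np.asarray, (m0, s0, m1, s1))
    return 0.5 * np.sum(2 * np.log(s1 / s0)
                        + (s0 ** 2 + (m0 - m1) ** 2) / s1 ** 2 - 1)

same = kl_diag_gauss([0, 0], [1, 1], [0, 0], [1, 1])   # identical models -> 0
far  = kl_diag_gauss([0, 0], [1, 1], [3, 0], [1, 1])   # shifted mean -> 4.5
```

A divergence like this gives the pairwise similarity needed to group speaker models into a search tree instead of scanning the whole database linearly.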
Buron, Maxime. "Raisonnement efficace sur des grands graphes hétérogènes". Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAX061.
The Semantic Web offers knowledge representations which allow heterogeneous data from several sources to be integrated into a unified knowledge base. In this thesis, we investigate techniques for querying such knowledge bases. The first part is devoted to query answering techniques on a knowledge base represented by an RDF graph subject to ontological constraints. Implicit information entailed by reasoning, enabled by the set of RDFS entailment rules, has to be taken into account to correctly answer such queries. First, we present a sound and complete query reformulation algorithm for Basic Graph Pattern queries, which exploits a partition of the RDFS entailment rules into assertion and constraint rules. Second, we introduce a novel RDF storage layout, which combines two well-known layouts. For both contributions, our experiments assess our theoretical and algorithmic results. The second part considers the issue of querying heterogeneous data sources integrated into an RDF graph using BGP queries. Following the Ontology-Based Data Access paradigm, we introduce a framework for data integration under an RDFS ontology, using Global-Local-As-View mappings, rarely considered in the literature. We present several query answering strategies, which may materialize the integrated RDF graph or leave it virtual, and which differ in how and when RDFS reasoning is handled. We implement these strategies in a platform and conduct experiments that demonstrate the particular interest of one strategy based on mapping saturation. Finally, we show that mapping saturation can be extended to reasoning defined by a subset of existential rules.
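The role of RDFS entailment in query answering can be sketched with a tiny fixpoint saturation over two of the RDFS rules, subclass transitivity and type propagation (a toy illustration, not the thesis's reformulation algorithm):

```python
def saturate(triples):
    """Fixpoint application of two RDFS entailment rules:
    transitivity of rdfs:subClassOf and rdf:type propagation along it."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in facts:
            if p == "rdfs:subClassOf":
                for s2, p2, o2 in facts:
                    if p2 == "rdfs:subClassOf" and s2 == o:
                        new.add((s, p, o2))           # subclass transitivity
                    if p2 == "rdf:type" and o2 == s:
                        new.add((s2, "rdf:type", o))  # type propagation
        if not new <= facts:
            facts |= new
            changed = True
    return facts

g = {(":Cat", "rdfs:subClassOf", ":Mammal"),
     (":Mammal", "rdfs:subClassOf", ":Animal"),
     (":felix", "rdf:type", ":Cat")}
closure = saturate(g)
# entails, among others, (:felix, rdf:type, :Animal)
```

Materializing this closure up front versus reformulating queries to account for it on the fly is precisely the kind of trade-off the surveyed strategies differ on.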
Caron, Maxime. "Données confidentielles : génération de jeux de données synthétisés par forêts aléatoires pour des variables catégoriques". Master's thesis, Université Laval, 2015. http://hdl.handle.net/20.500.11794/25935.
Confidential data are very common in statistics nowadays. One way to treat them is to create partially synthetic datasets for data sharing. We present an algorithm based on random forests to generate such datasets for categorical variables. We are interested in the formula used to make inference from multiple synthetic datasets, and we show that the order of synthesis has an impact on the estimation of the variance with this formula. We propose a variant of the algorithm inspired by differential privacy, and show that we are then unable to estimate a regression coefficient or its variance. Finally, we show the impact of synthetic datasets on structural equation modeling; one conclusion is that the synthetic datasets do not really affect the coefficients between latent variables and measured variables.
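The inference formula in question is typically a combining rule over the m synthetic copies; here is a sketch in the style of the partially synthetic combining rules from the synthetic-data literature (the exact formula analysed in the thesis may differ):

```python
from statistics import mean, variance

def combine_partially_synthetic(estimates, variances):
    """Combining rule for m partially synthetic data sets (Reiter-style):
    the point estimate is the mean of the m estimates; the total variance
    adds the between-synthesis variance, deflated by m, to the mean
    within-synthesis variance."""
    m = len(estimates)
    q_bar = mean(estimates)        # combined point estimate
    b = variance(estimates)        # between-synthesis variance
    u_bar = mean(variances)        # average within-synthesis variance
    total_var = u_bar + b / m
    return q_bar, total_var

# Hypothetical coefficient estimates and their variances from m = 3 syntheses.
q, v = combine_partially_synthetic([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
# q = 1.0, total variance = 0.04 + 0.04/3
```

Because the between term b depends on how the m copies were generated, a synthesis order that inflates b directly inflates the estimated variance, which is the effect the thesis examines.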
Bletery, Quentin. "Analyse probabiliste et multi-données de la source de grands séismes". Thesis, Nice, 2015. http://www.theses.fr/2015NICE4092/document.
Earthquakes are the result of rapid slip on active faults loaded in stress by tectonic plate motion. It is now established, at least for large earthquakes, that the distribution of this rapid slip along the rupturing faults is heterogeneous. Imaging the complexity of such slip distributions is one of the main challenges in seismology because of its potential implications for understanding earthquake genesis and the associated possibility of better anticipating devastating shaking and tsunami. To improve the imaging of co-seismic slip distributions, three axes may be followed: increase the constraints on the source models by including more observations in the inversions, improve the physical modeling of the forward problem, and improve the formalism used to solve the inverse problem. In this PhD thesis, we explore these three axes by studying two recent major earthquakes: the Tohoku-Oki (Mw 9.0) and Sumatra-Andaman (Mw 9.1-9.3) earthquakes, which occurred in 2011 and 2004 respectively.
Ben, Ellefi Mohamed. "La recommandation des jeux de données basée sur le profilage pour le liage des données RDF". Thesis, Montpellier, 2016. http://www.theses.fr/2016MONTT276/document.
With the emergence of the Web of Data, most notably Linked Open Data (LOD), an abundance of data has become available on the web. However, LOD datasets and their inherent subgraphs vary heavily with respect to their size, topic and domain coverage, their schemas, and their dynamicity over time (of both schemas and metadata). To this extent, identifying suitable datasets which meet specific criteria has become an increasingly important, yet challenging, task to support issues such as entity retrieval, semantic search and data linking. Particularly with respect to interlinking, the current topology of the LOD cloud underlines the need for practical and efficient means to recommend suitable datasets: currently, only well-known reference graphs such as DBpedia (the most obvious target), YAGO or Freebase show a high number of in-links, while there exists a long tail of potentially suitable yet under-recognized datasets. This problem is due to the Semantic Web tradition in dealing with "finding candidate datasets to link to", where data publishers are expected to identify target datasets for interlinking. While an understanding of the nature of the content of specific datasets is a crucial prerequisite for the mentioned issues, we adopt in this dissertation the notion of "dataset profile": a set of features that describe a dataset and allow the comparison of different datasets with regard to their represented characteristics. Our first research direction was to implement a collaborative-filtering-like dataset recommendation approach, which exploits both existing dataset topic profiles and traditional dataset connectivity measures in order to link LOD datasets into a global dataset-topic graph. This approach relies on the LOD graph in order to learn the connectivity behaviour between LOD datasets.
However, experiments have shown that the current topology of the LOD cloud is far from complete enough to be considered a ground truth and, consequently, learning data. Facing these limits, our research moved away from the topic-profile "learning to rank" representation and adopted a new approach to candidate dataset identification, where the recommendation is based on the overlap between the intensional profiles of different datasets. By intensional profile, we mean the formal representation of a set of schema concept labels that best describe a dataset, potentially enriched by retrieving the corresponding textual descriptions. This representation provides richer contextual and semantic information and allows similarities between profiles to be computed efficiently and inexpensively. We identify schema overlap with the help of a semantico-frequential concept similarity measure and a ranking criterion based on tf*idf cosine similarity. The experiments, conducted over all available linked datasets in the LOD cloud, show that our method achieves an average precision of up to 53% for a recall of 100%. Furthermore, our method returns the mappings between schema concepts across datasets, a particularly useful input for the data linking step. In order to ensure high-quality, representative dataset schema profiles, we introduce Datavore, a tool oriented towards metadata designers that provides ranked lists of vocabulary terms to reuse in the data modeling process, together with additional metadata and cross-term relations. The tool relies on the Linked Open Vocabularies (LOV) ecosystem for acquiring vocabularies and metadata, and is made available to the community.
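The tf*idf cosine ranking over intensional profiles can be sketched directly; the dataset profiles below are invented for illustration:

```python
import math
from collections import Counter

def tfidf_vectors(profiles):
    """tf*idf weights for each dataset profile (a bag of schema concept labels)."""
    n = len(profiles)
    df = Counter(term for p in profiles for term in set(p))
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(p).items()}
            for p in profiles]

def cosine(u, v):
    """Cosine similarity between two sparse weight vectors (dicts)."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

profiles = [["Person", "City", "Country"],   # hypothetical dataset profiles
            ["Person", "City", "River"],
            ["Protein", "Gene", "Enzyme"]]
vecs = tfidf_vectors(profiles)
# profile 0 overlaps with profile 1 but shares nothing with profile 2
```

Ranking candidate datasets by this similarity against a source dataset's profile is enough to surface the long-tail candidates the abstract mentions, without relying on the (incomplete) link topology.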
Courjault-Rade, Vincent. "Ballstering : un algorithme de clustering dédié à de grands échantillons". Thesis, Toulouse 3, 2018. http://www.theses.fr/2018TOU30126/document.
Ballstering belongs to the machine learning methods that aim to group a set of objects forming the studied dataset into classes, without any knowledge of the true classes within it. This type of method, of which k-means is one of the most famous representatives, is called clustering. Recently, a new clustering algorithm, "Fast Density Peak Clustering" (FDPC), aroused great interest in the scientific community for its innovative aspect and its efficiency on non-concentric distributions. However, this algorithm has such complexity that it cannot easily be applied to large datasets. Moreover, we identified several weaknesses that affect the quality of the results, as well as a global parameter dc that is difficult to choose yet has a significant impact on the results. In view of these limitations, we reworked the principal idea of FDPC in a new light and modified it successively, finally creating a distinct algorithm that we call Ballstering. The work carried out during these three years can be summarised as the conception of this clustering algorithm, especially designed to be effective on large datasets. Like its precursor, Ballstering works in two phases: a density estimation phase followed by a clustering step. Its conception is mainly based on a procedure that handles the first step with lower complexity while avoiding the difficult choice of dc, which becomes automatically defined according to local density. We name this procedure ICMDW; it represents a substantial part of our contributions. We also overhauled the core definitions of FDPC and entirely reworked the second phase (relying on the graph structure of ICMDW's intermediate results), finally producing an algorithm that overcomes all the limitations we identified.
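FDPC's two per-point quantities, on which Ballstering builds, are easy to state: a local density rho and the distance delta to the nearest denser point; cluster centres are points where both are large. A brute-force sketch follows (this quadratic computation is exactly the cost Ballstering is designed to avoid):

```python
import numpy as np

def density_peaks(points, dc):
    """Per-point local density rho (neighbours within dc) and delta
    (distance to the nearest point of strictly higher density): the two
    quantities FDPC uses to single out cluster centres."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    rho = (d < dc).sum(axis=1) - 1          # exclude the point itself
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = np.where(rho > rho[i])[0]
        # densest point gets the largest distance from itself as delta
        delta[i] = d[i, higher].min() if len(higher) else d[i].max()
    return rho, delta

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (30, 2)),
                 rng.normal((5, 5), 0.3, (30, 2))])
rho, delta = density_peaks(pts, dc=0.5)
# candidate centres are the points with both large rho and large delta
```

Note the hand-picked dc=0.5 here: this is the sensitive global parameter that Ballstering replaces with a locally defined density scale.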
Coninx, Alexandre. "Visualisation interactive de grands volumes de données incertaines : pour une approche perceptive". Phd thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00749885.
Boudjeloud-Assala, Baya Lydia. "Visualisation et algorithmes génétiques pour la fouille de grands ensembles de données". Nantes, 2005. http://www.theses.fr/2005NANT2065.
We present cooperative approaches using interactive visualization methods and automatic dimension selection methods for knowledge discovery in databases. Most existing data mining methods work automatically; the user is not involved in the process. We try to involve the user more significantly in the data mining process in order to improve their confidence in, and comprehension of, the obtained models and results. Furthermore, since the size of data sets is constantly increasing, these methods must be able to handle large data sets, and we try to improve the performance of the algorithms on high-dimensional data. We developed a genetic algorithm for dimension selection with a distance-based fitness function for outlier detection in high-dimensional data sets. This algorithm uses only a few dimensions to find the same outliers as in the whole data set and can easily treat high-dimensional data sets. The number of dimensions used being low enough, it is also possible to use visualization methods to explain and interpret the results of the outlier detection algorithm. It is then possible to build a model with the data expert, for example to qualify a detected element as a true outlier or simply an error. We also developed an evaluation measure for dimension selection in unsupervised classification and outlier detection. This measure enables us to find the same clusters as in the data set with all its dimensions, as well as clusters containing very few elements (outliers). Visual interpretation of the results shows the dimensions involved, which are considered relevant and interesting for clustering and outlier detection. Finally, we present a semi-interactive genetic algorithm that involves the user more significantly in the selection and evaluation process of the algorithm.
Boucheny, Christian. "Visualisation scientifique interactive de grands volumes de données : pour une approche perceptive". Grenoble 1, 2009. http://www.theses.fr/2009GRE10021.
With the fast increase in computing power, numerical simulations of physical phenomena can nowadays rely on up to billions of elements. To extract relevant information from the resulting huge data sets, engineers need visualization tools permitting interactive exploration and analysis of the computed fields. The goal of this thesis is to improve the visualizations performed by engineers by taking into account the characteristics of human visual perception, with a particular focus on the perception of space and volume during the visualization of dense 3D data. First, three psychophysics experiments showed that direct volume rendering, a technique relying on the ordered accumulation of transparencies, provides very ambiguous depth cues. This is particularly true for static presentations, while the addition of motion and exaggerated perspective cues helps to resolve part of these difficulties. Then, two algorithms were developed to improve depth perception during the visualization of complex 3D structures. They have been implemented on the GPU to achieve interactive rendering independently of the geometric nature of the analysed data. EyeDome Lighting is a new non-photorealistic shading technique that relies on the projected depth image of the scene and enhances the perception of shapes and relative depths in complex 3D scenes. In addition, a new fast view-dependent cutaway technique was implemented, which permits access to otherwise occluded objects while providing cues to understand the depth structure of the masking objects.
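The idea of shading from the projected depth image alone can be illustrated with a simplified screen-space filter: darken each pixel according to how many of its neighbours are closer to the viewer, which outlines creases and silhouettes. This is a toy take on the depth-based approach, not the EyeDome Lighting algorithm itself:

```python
import numpy as np

def depth_darkening(depth, radius=1):
    """Screen-space shading computed from a depth image alone: pixels whose
    neighbours are closer to the viewer are darkened, outlining depth
    discontinuities (simplified illustration, not the thesis's EDL)."""
    shade = np.zeros_like(depth, dtype=float)
    for dy in (-radius, 0, radius):
        for dx in (-radius, 0, radius):
            if dy == dx == 0:
                continue
            shifted = np.roll(np.roll(depth, dy, axis=0), dx, axis=1)
            shade += np.maximum(depth - shifted, 0.0)  # neighbour closer => darken
    return np.exp(-shade)  # 1.0 on flat regions, darker near discontinuities

# A flat background at depth 10 with a closer square object at depth 5.
z = np.full((32, 32), 10.0)
z[8:24, 8:24] = 5.0
img = depth_darkening(z)
```

Because the filter needs only the depth buffer, it works on any rendered scene regardless of its geometry, which is the property that makes such techniques attractive for GPU implementation.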
Meddeb, Hamrouni Boubaker. "Méthodes et algorithmes de représentation et de compression de grands dictionnaires de formes". Université Joseph Fourier (Grenoble), 1996. http://www.theses.fr/1996GRE10278.
Pełny tekst źródłaNdiaye, Marie. "Exploration de grands ensembles de motifs". Thesis, Tours, 2010. http://www.theses.fr/2010TOUR4029/document.
The abundance of patterns generated by knowledge extraction algorithms is a major problem in data mining. To facilitate the exploration of these patterns, two approaches are often used: the first is to summarize the sets of extracted patterns, and the second relies on the construction of visual representations of the patterns. However, the summaries are not structured and are proposed without an exploration method, while the visualizations do not provide an overview of the pattern sets. We define a generic framework that combines the advantages of both approaches. It allows summaries of pattern sets to be built at different levels of detail; these summaries provide an overview of the pattern sets and are structured in the form of cubes on which OLAP navigational operators can be applied in order to explore the pattern sets. Moreover, we propose an algorithm which provides a summary of good quality whose size is below a given threshold. Finally, we instantiate our framework with association rules.
Ducoffe, Guillaume. "Propriétés métriques des grands graphes". Thesis, Université Côte d'Azur (ComUE), 2016. http://www.theses.fr/2016AZUR4134/document.
Large-scale communication networks are everywhere, ranging from data centers with millions of servers to social networks with billions of users. This thesis is devoted to the fine-grained complexity analysis of combinatorial problems on these networks. In the first part, we focus on the embeddability of communication networks in tree topologies. This property has been shown to be crucial to understanding some aspects of network traffic (such as congestion). More precisely, we study the computational complexity of Gromov hyperbolicity and of tree decomposition parameters in graphs, including treelength and treebreadth. Along the way, we give new bounds on these parameters in several graph classes of interest, some of them being used in the design of data center interconnection networks. The main result in this part is a relationship between treelength and treewidth, another well-studied graph parameter that gives a unifying view of treelikeness in graphs and has algorithmic applications. This part borrows from graph theory and recent techniques in complexity theory. The second part of the thesis is on the modeling of two privacy concerns with social networking services. We aim at analysing information flows in these networks, represented as dynamical processes on graphs. First, a coloring game on graphs is studied as a solution concept for the dynamics of online communities. We give a fine-grained complexity analysis for computing Nash and strong Nash equilibria in this game, thereby answering open questions from the literature. Along the way, we propose new directions in algorithmic game theory and parallel complexity, using coloring games as a case example.
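Gromov hyperbolicity, one of the parameters studied, has a compact four-point definition that can be checked by brute force on small graphs: over every quadruple of vertices, take half the gap between the two largest of the three pairwise distance sums.

```python
from itertools import combinations

def bfs_distances(adj, src):
    """Unweighted shortest-path distances from src via breadth-first search."""
    dist = {src: 0}
    frontier = [src]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        frontier = nxt
    return dist

def hyperbolicity(adj):
    """Gromov hyperbolicity via the four-point condition.
    Brute force over all quadruples: fine for small graphs only."""
    nodes = list(adj)
    d = {u: bfs_distances(adj, u) for u in nodes}
    delta = 0
    for a, b, c, e in combinations(nodes, 4):
        s = sorted([d[a][b] + d[c][e], d[a][c] + d[b][e], d[a][e] + d[b][c]])
        delta = max(delta, (s[2] - s[1]) / 2)
    return delta

# Trees are 0-hyperbolic ("perfectly tree-like"); a 4-cycle is 1-hyperbolic.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

The O(n^4) quadruple scan is exactly why the thesis's complexity analysis of computing this parameter on large graphs matters.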
Sirgue, Laurent. "Inversion de la forme d'onde dans le domaine fréquentiel de données sismiques grands offsets". Paris 11, 2003. http://www.theses.fr/2003PA112088.
The standard imaging approach in exploration seismology relies on a decomposition of the velocity model by spatial scales: the determination of the low wavenumbers of the velocity field is followed by the reconstruction of the high wavenumbers. However, for models presenting a complex structure, the recovery of the high wavenumbers may be significantly improved by the determination of intermediate wavenumbers. These can potentially be recovered by local, non-linear waveform inversion of wide-angle data. Waveform inversion is, however, limited by the non-linearity of the inverse problem, which is in turn governed by the minimum frequency in the data and by the starting model. For very low frequencies, below 7 Hz, the problem is reasonably linear, so that waveform inversion may be applied using a starting model obtained from traveltime tomography. The frequency domain is then particularly advantageous, as inversion from the low to the high frequencies is very efficient. Moreover, it is possible to discretise the frequencies with a much larger sampling interval than dictated by the sampling theorem and still obtain a good imaging result. A strategy for selecting frequencies is developed where the number of input frequencies can be reduced when a range of offsets is available: the larger the maximum offset, the fewer frequencies are required. Real seismic data unfortunately do not contain very low frequencies, and waveform inversion at higher frequencies is likely to fail due to convergence into a local minimum. Preconditioning techniques must hence be applied to the gradient vector and the data residuals in order to enhance the efficacy of waveform inversion starting from realistic frequencies. Smoothing the gradient vector and inverting early arrivals significantly improve the chance of convergence to the global minimum. The efficacy of preconditioning methods is, however, limited by the accuracy of the starting model.
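The offset-dependent frequency selection can be illustrated under an assumed form of the rule, where each next frequency is the previous one divided by alpha = z / sqrt(z^2 + h_max^2) for maximum half-offset h_max and target depth z; the precise criterion developed in the thesis may differ, but the qualitative behaviour (larger offsets, fewer frequencies) is the one described in the abstract.

```python
import math

def select_frequencies(f_min, f_max, h_max, z):
    """Frequency list for frequency-domain inversion under an assumed
    offset-based criterion: successive frequencies grow by the factor
    1/alpha, with alpha = z / sqrt(z**2 + h_max**2) for maximum
    half-offset h_max and target depth z (illustrative form only)."""
    alpha = z / math.sqrt(z ** 2 + h_max ** 2)
    freqs, f = [], f_min
    while f < f_max:
        freqs.append(round(f, 2))
        f /= alpha          # wider offsets -> smaller alpha -> bigger jumps
    freqs.append(f_max)
    return freqs

short = select_frequencies(5.0, 20.0, h_max=2.0, z=3.0)   # small maximum offset
wide  = select_frequencies(5.0, 20.0, h_max=12.0, z=3.0)  # large maximum offset
# the wider acquisition covers the same band with fewer discrete frequencies
```

This is why wide-angle acquisitions pair so well with the frequency-domain formulation: a handful of well-chosen frequencies can stand in for the full band.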
Dumonceaux, Frédéric. "Approches algébriques pour la gestion et l’exploitation de partitions sur des jeux de données". Nantes, 2015. http://archive.bu.univ-nantes.fr/pollux/show.action?id=c655f585-5cf3-4554-bea2-8e488315a2b9.
The rise of data analysis methods in many growing contexts requires the design of new tools enabling the management and handling of extracted data. The summarization process is often formalized through the use of set partitions, whose handling depends on the applicative context and their inherent properties. Firstly, we suggest modelling the management of aggregation query results over a data cube within the algebraic framework of the partition lattice. We highlight the value of such an approach with a view to minimizing both the space and the time required to generate those results. We then deal with the consensus-of-partitions issue, in which we emphasize the challenges related to the lack of properties ruling partition combination. The idea put forward is to deepen the algebraic properties of the partition lattice in order to strengthen its understanding and to generate new consensus functions. As a conclusion, we propose the modelling and implementation of operators defined over generic partitions, and we carry out experiments asserting the benefit of their conceptual and operational use.
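The partition-lattice algebra this abstract refers to can be illustrated with a minimal sketch (illustrative code, not the thesis's actual operators): in the lattice of set partitions ordered by refinement, the meet of two partitions is their common refinement, obtained by intersecting blocks pairwise.

```python
from itertools import product

def meet(p, q):
    """Meet (common refinement) of two set partitions of the same ground set,
    each given as a list of disjoint frozensets."""
    blocks = [a & b for a, b in product(p, q) if a & b]
    return sorted(blocks, key=min)

# Two partitions of {1, ..., 6}
p = [frozenset({1, 2, 3}), frozenset({4, 5, 6})]
q = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]
print(meet(p, q))  # blocks {1,2}, {3}, {4}, {5,6}
```

The meet is the finest structure both partitions agree on; consensus functions over partitions can be seen as ways of picking a representative element between the meet and the join of the inputs.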
Fan, Qingfeng. "Stratégie de transfert de données dans les grilles de capteurs". Versailles-St Quentin en Yvelines, 2014. http://www.theses.fr/2014VERS0012.
The big data era is coming, and the amount of data increases dramatically in many application fields every day. This thesis mostly focuses on big data transmission strategies for query optimization in Grid infrastructures. Firstly, we discuss, at the file level, the ring and thread replication strategies and, below the file level, the file-parted replication strategy, to improve the efficiency of the Data Grid. We also tackle the data packet level using multicast data transfer within a Sensor Grid, which is widely used in in-network query operations. The system comprehensively considers the location factor and the data factor, and combines them in a general weighted vector. In a third stage, we extend our model to account for the energy factor in wireless sensor grids, which corresponds to a three-vector correlation problem. We show that our approach can be extended further to any finite number of factors. The last part deals with the mobile context, i.e. when users and the queried resources are mobile. We propose an extension of semantic-cache-based optimization for such mobile distributed queries. In this context, query optimization depends not only on the cache size and its freshness, but also on the mobility of the user.
Chebbo, Manal. "Simulation fine d'optique adaptative à très grand champ pour des grands et futurs très grands télescopes". Thesis, Aix-Marseille, 2012. http://www.theses.fr/2012AIXM4733/document.
Refined simulation tools for wide-field AO systems on ELTs present new challenges. Increasing the number of degrees of freedom makes standard simulation codes unusable due to the huge number of operations to be performed at each step of the AO loop. The classical matrix inversion and the VMM have to be replaced by a cleverer iterative resolution of the Least Square or Minimum Mean Square Error criterion. For this new generation of AO systems, the concepts themselves will become more complex: data fusion coming from multiple LGS and NGS will have to be optimized; mirrors covering the whole field of view will have to be coupled, using split or integrated tomography schemes, with dedicated mirrors inside the scientific instrument itself; and differential pupil and/or field rotations will have to be considered. All these new features should be carefully simulated, analysed and quantified in terms of performance before any implementation in AO systems. For these reasons I developed, in collaboration with ONERA, a full simulation code based on the iterative solution of linear systems with many parameters (sparse matrices). On this basis, I introduced new concepts of filtering and data fusion to effectively manage modes such as tip, tilt and defocus in the entire tomographic reconstruction process. The code will also eventually help to develop and test the complex control laws which have to manage a combination of an adaptive telescope and a post-focal instrument including dedicated DMs.
Grassot, Lény. "Mobilités évènementielles et espace urbain : Exploitation des données de téléphonie mobile pour la modélisation des grands évènements urbains". Rouen, 2016. http://www.theses.fr/2016ROUEL015.
This research is devoted to the apprehension, detection, understanding and analysis of large planned urban events through mobile phone data provided by the French telecom operator Orange. The three cases studied are the Armada de Rouen 2008, the Braderie de Lille 2011 and the Armada de Rouen 2013. The aim of this thesis is to study and evaluate the impacts on urban spatial patterns through modelling and simulation methodologies. To tackle the huge amount of data, statistical methods, spatial analysis and a new agent-based model (GAMA) have been used. This research led us to highlight the role of the spatial (attractiveness, concentration, etc.) and temporal (rhythms, urban pulses, etc.) patterns of urban spaces during the ongoing agenda of a popular large planned event. The outcomes of this research underline the relevance of mobile phone data for understanding the short-lived functioning as well as the routine of the city during major events. Moreover, impacts in terms of mobility and social behavior must be taken into account.
Abdelmoula, Mariem. "Génération automatique de jeux de tests avec analyse symbolique des données pour les systèmes embarqués". Thesis, Nice, 2014. http://www.theses.fr/2014NICE4149/document.
One of the biggest challenges in hardware and software design is to ensure that a system is error-free. Small errors in reactive embedded systems can have disastrous and costly consequences for a project. Preventing such errors by identifying the most probable cases of erratic system behavior is quite challenging. Indeed, tests in industry are generally non-exhaustive, while formal verification in scientific research often suffers from the combinatorial explosion problem. In this context, we present a new approach for generating exhaustive test sets that combines the underlying principles of industrial testing techniques and academic formal verification. Our approach builds a generic model of the system under test according to the synchronous approach. The goal is to identify the optimal preconditions for restricting the state space of the model so that test generation can take place on significant subspaces only. All the possible test sets are then generated from the extracted subspace preconditions. Our approach exhibits a simpler and more efficient quasi-flattening algorithm than existing techniques, and a useful compiled internal description to check security properties and reduce the combinatorial explosion of the state space. It also provides a symbolic processing technique for numeric data that yields a more expressive and concrete test of the system. We have implemented our approach in a tool called GAJE. To illustrate our work, this tool was applied to verify an industrial project on contactless smart card security.
Modrzejewski, Richard. "Recalage déformable, jeux de données et protocoles d'évaluation pour la chirurgie mini-invasive abdominale augmentée". Thesis, Université Clermont Auvergne (2017-2020), 2020. http://www.theses.fr/2020CLFAC044.
This thesis deals with deformable registration techniques of preoperative data to the intra-operative scene as an indispensable step in the realisation of augmented reality for abdominal surgery. Such techniques are discussed, as well as the evaluation methodologies associated with them. Two contexts are considered: registration for computer-assisted laparoscopic surgery, and postural registration of the patient on the operating table. For these two contexts, the needs to be met by the registration algorithms are discussed, as well as the main limitations of existing solutions. Algorithms developed during this thesis to meet these needs are then proposed and discussed. Special attention is given to their evaluation. Different datasets allowing a quantitative evaluation of the accuracy of registration algorithms, also produced during this thesis and made public, are discussed as well. Such data are extremely important because they respond to a lack of evaluation data needed to evaluate the registration error quantitatively and thus to compare the different algorithms. The modeling of the illumination of the laparoscopic scene, which allows one to extract strong constraints between the data to be registered and the surface of the observed organ, and thus to constrain these registration problems, is also discussed. This manuscript has seven parts. The first deals with the context surrounding this thesis. Minimally invasive surgery is presented, as well as various general computer vision problems which, when applied to the medical context, allow the definition of computer-assisted surgery. The second part deals with the prerequisites for reading the thesis. The pre-processing of pre-operative and per-operative data, before their use by the presented registration algorithms, is thus discussed. The third part corresponds to the registration of hepatic data in laparoscopy and the evaluation associated with this problem. The fourth part deals with the problem of postural registration. The fifth part proposes a modelling of the lighting in laparoscopy which can be used to obtain strong constraints between the observed surface and the laparoscopic images. The sixth part proposes a use of the light models discussed in the previous part in order to refine and densify reconstructions of the laparoscopic scene. Finally, the seventh and last part corresponds to our conclusions regarding the issues addressed during this thesis, and to future work.
Simon, Franck. "Découverte causale sur des jeux de données classiques et temporels. Application à des modèles biologiques". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS528.
This thesis focuses on the field of causal discovery: the construction of causal graphs from observational data, and in particular temporal causal discovery and the reconstruction of large gene regulatory networks. After a brief history, this thesis introduces the main concepts, hypotheses and theorems underlying causal graphs, as well as the two main approaches: score-based and constraint-based methods. The MIIC (Multivariate Information-based Inductive Causation) method, developed in our laboratory, is then described with its latest improvements: Interpretable MIIC. The issues and solutions involved in constructing a temporal version (tMIIC) are presented, as well as benchmarks reflecting the advantages of tMIIC compared to other state-of-the-art methods. The application to sequences of microscope images of a tumor environment reconstituted on microchips illustrates the ability of tMIIC to recover, solely from data, both known and new relationships. Finally, this thesis introduces the use of an a priori on consequence variables to apply causal discovery to the reconstruction of gene regulatory networks. By assuming that all genes except transcription factors are consequence-only genes, it becomes possible to reconstruct graphs with thousands of genes. The ability to identify key transcription factors de novo is illustrated by an application to single-cell RNA sequencing data, with the discovery of two transcription factors likely to be involved in the biological process of interest.
Soler, Maxime. "Réduction et comparaison de structures d'intérêt dans des jeux de données massifs par analyse topologique". Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS364.
In this thesis, we propose different methods, based on topological data analysis, to address modern problems concerning the increasing difficulty of analyzing scientific data. In the case of scalar data defined on geometrical domains, extracting meaningful knowledge from static data, then time-varying data, then ensembles of time-varying data proves increasingly challenging. Our approaches for the reduction and analysis of such data are based on the idea of defining structures of interest in scalar fields as topological features. In a first effort to address data volume growth, we propose a new lossy compression scheme which offers strong topological guarantees, allowing topological features to be preserved throughout compression. The approach is shown to yield high compression factors in practice. Extensions are proposed to offer additional control over the geometrical error. We then target time-varying data by designing a new method for tracking topological features over time, based on topological metrics. We extend these metrics to overcome robustness and performance limitations, and we propose a new, efficient way to compute them, gaining orders-of-magnitude speedups over state-of-the-art approaches. Finally, we apply and adapt our methods to ensemble data related to reservoir simulation, for modeling viscous fingering in porous media. We show how to capture viscous fingers with topological features, adapt topological metrics to capture discrepancies between simulation runs and a ground truth, evaluate the proposed metrics with feedback from experts, and implement an in-situ ranking framework for rating the fidelity of simulation runs.
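The idea of "structures of interest as topological features" can be made concrete in the simplest setting: the 0-dimensional sublevel-set persistence of a 1D scalar field, computed with a union-find sweep. Each (birth, death) pair is a candidate feature, and its persistence (death minus birth) measures its importance. This is a generic textbook sketch, not the thesis code:

```python
def persistence_pairs_1d(values):
    """0-dimensional sublevel-set persistence pairs of a 1D scalar field.
    Returns (birth, death) pairs; the global minimum pairs with +inf."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    comp_min = {}  # minimum value (birth) of each component, by root
    pairs = []
    for i in order:  # sweep vertices by increasing value
        parent[i] = i
        comp_min[i] = values[i]
        for j in (i - 1, i + 1):
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # The younger component (larger minimum) dies at this merge.
                    if comp_min[ri] > comp_min[rj]:
                        ri, rj = rj, ri
                    if comp_min[rj] < values[i]:  # skip zero-persistence pairs
                        pairs.append((comp_min[rj], values[i]))
                    parent[rj] = ri
    pairs.append((min(values), float("inf")))
    return sorted(pairs)

print(persistence_pairs_1d([0, 3, 1, 4, 2, 5]))  # → [(0, inf), (1, 3), (2, 4)]
```

Topology-preserving compression schemes of the kind described above aim to quantize the field while leaving such pairs (above a persistence threshold) unchanged.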
Hollocou, Alexandre. "Nouvelles approches pour le partitionnement de grands graphes". Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE063.
Graphs are ubiquitous in many fields of research, ranging from sociology to biology. A graph is a very simple mathematical structure consisting of a set of elements, called nodes, connected to each other by edges. It is nevertheless able to represent complex systems such as protein-protein interactions or scientific collaborations. Graph clustering is a central problem in the analysis of graphs, whose objective is to identify dense groups of nodes that are sparsely connected to the rest of the graph. These groups of nodes, called clusters, are fundamental to an in-depth understanding of graph structures. There is no universal definition of what a good cluster is, and different approaches might be best suited to different applications. Whereas most classic methods focus on finding node partitions, i.e. on coloring graph nodes so that each node has one and only one color, more elaborate approaches are often necessary to model the complex structure of real-life graphs and to address sophisticated applications. In particular, in many cases, we must consider that a given node can belong to more than one cluster. Besides, many real-world systems exhibit multi-scale structures, and one must seek hierarchies of clusters rather than flat clusterings. Furthermore, graphs often evolve over time and are too massive to be handled in one batch, so one must be able to process streams of edges. Finally, in many applications, processing entire graphs is irrelevant or expensive, and it can be more appropriate to recover local clusters in the neighborhood of nodes of interest rather than to color all graph nodes. In this work, we study alternative approaches and design novel algorithms to tackle these different problems.
The novel methods that we propose to address these different problems are mostly inspired by variants of modularity, a classic measure that assesses the quality of a node partition, and by random walks, stochastic processes whose properties are closely related to the graph structure. We provide analyses that give theoretical guarantees for the different proposed techniques, and we evaluate these algorithms on real-world datasets and use cases.
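Modularity, the measure named above, compares the fraction of edges inside communities with the fraction expected under a degree-preserving random null model: Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(c_i, c_j). A minimal pure-Python sketch of the classic (Newman-Girvan) definition, not the thesis's variants:

```python
def modularity(edges, communities):
    """Newman-Girvan modularity of a node partition.
    edges: undirected (u, v) pairs; communities: dict node -> label."""
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    # Fraction of edges inside communities...
    intra = sum(1 for u, v in edges if communities[u] == communities[v]) / m
    # ...minus the expected fraction under the configuration null model.
    expected = sum(
        (sum(deg[n] for n in communities if communities[n] == c) / (2 * m)) ** 2
        for c in set(communities.values())
    )
    return intra - expected

# Two triangles joined by a single edge: the natural split scores highly.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, part))  # ≈ 0.357
```

Methods such as the Louvain algorithm greedily move nodes between communities to increase exactly this quantity; the thesis's variants adjust the null model and the resolution.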
Osty, Guillaume. "Extraction de particularités sur données discrètes issues de numérisation 3D : partitionnement de grands nuages de points". Cachan, Ecole normale supérieure, 2002. http://www.theses.fr/2002DENS0003.
Chamekh, Rabeb. "Stratégies de jeux pour quelques problèmes inverses". Thesis, Université Côte d'Azur (ComUE), 2019. http://www.theses.fr/2019AZUR4103.
In this PhD thesis, we focus on solving the coupled problem of data completion and parameter identification. The Cauchy problem is a problem of identifying the boundary condition on one part of the boundary from overabundant data on the remaining part. Parameter identification is the problem of recovering the parameters of the system. These two problems are known to be ill-posed in the sense of Hadamard. This thesis is divided into four parts. The first part is dedicated to a bibliographical study. In the second chapter, we apply game theory to the resolution of the coupled problem of data completion and conductivity identification in electrocardiography. We discuss the identifiability of the conductivity and show the uniqueness of this parameter using only the Cauchy data on one part of the boundary. Our numerical experiments target medical applications in electrocardiography: we apply our procedure to two-dimensional and three-dimensional thorax models. The third part is dedicated to the resolution of the coupled problem in linear elasticity by applying game theory. A numerical study has been carried out in which we consider a particular configuration that ensures the identifiability of the parameters. In the last part, we are interested in a thermoelasticity problem, which couples two different disciplines: heat transfer and elasticity. The problem of crack identification is a natural application in this case.
Lamarche-Perrin, Robin. "Analyse macroscopique des grands systèmes : émergence épistémique et agrégation spatio-temporelle". Phd thesis, Université de Grenoble, 2013. http://tel.archives-ouvertes.fr/tel-00933186.
Chebbo, Manal. "Simulation fine d'optique adaptative à très grand champ pour des grands et futurs très grands télescopes". Phd thesis, Aix-Marseille Université, 2012. http://tel.archives-ouvertes.fr/tel-00742873.
Longueville, Véronique. "Modélisation, calcul et évaluation de liens pour la navigation dans les grands ensembles d'images fixes". Toulouse 3, 1993. http://www.theses.fr/1993TOU30149.
Derriere, Sébastien. "Gestion de grands catalogues et application de relevés infrarouges à l'étude de la structure galactique". Université Louis Pasteur (Strasbourg) (1971-2008), 2001. http://www.theses.fr/2001STR13112.
Legtchenko, Sergey. "Adaptation dynamique des architectures réparties pour jeux massivement multijoueurs". Phd thesis, Université Pierre et Marie Curie - Paris VI, 2012. http://tel.archives-ouvertes.fr/tel-00931865.
Roy-Pomerleau, Xavier. "Inférence d'interactions d'ordre supérieur et de complexes simpliciaux à partir de données de présence/absence". Master's thesis, Université Laval, 2020. http://hdl.handle.net/20.500.11794/66994.
Despite the effectiveness of networks for representing complex systems, recent work has shown that their structure sometimes limits the explanatory power of theoretical models, since it only encodes dyadic interactions. If a more complex interaction exists in the system, it is automatically reduced to a group of pairwise, first-order interactions. We thus need structures that can take higher-order interactions into account. However, whether relationships are of higher order or not is rarely explicit in real data sets. This is the case for presence/absence data, which only indicate which species (of animals, plants or others) can be found (or not) on a site, without showing the interactions between them. The goal of this project is to develop an inference method to find higher-order interactions within presence/absence data. Two frameworks are examined here. The first is based on comparing the topology of the data, obtained under a non-restrictive hypothesis, with the topology of a random ensemble. The second uses log-linear models and hypothesis testing to infer interactions one by one up to the desired order. From this framework, we have developed several inference methods to generate simplicial complexes (or hypergraphs) that can be studied with the regular tools of network science as well as homology. In order to validate these methods, we have developed a generative model of presence/absence data in which the true interactions are known. Results have also been obtained on real data sets. For instance, from presence/absence data of nesting birds in Québec, we were able to infer co-occurrences of order two.
Maillet, Nicolas. "Comparaison de novo de données de séquençage issues de très grands échantillons métagénomiques : application sur le projet Tara Oceans". Phd thesis, Université Rennes 1, 2013. http://tel.archives-ouvertes.fr/tel-00941922.
Stoica, Beck Alina. "Analyse de la structure locale des grands réseaux sociaux". Phd thesis, Université Paris-Diderot - Paris VII, 2010. http://tel.archives-ouvertes.fr/tel-00987880.
Semboloni, Elisabetta. "Mesure et interprétation du cisaillement cosmologique". Paris 6, 2006. https://tel.archives-ouvertes.fr/tel-00114489.
Conde, Cespedes Patricia. "Modélisations et extensions du formalisme de l'analyse relationnelle mathématique à la modularisation des grands graphes". Paris 6, 2013. http://www.theses.fr/2013PA066654.
Graphs are the mathematical representation of networks. Since a graph is a special type of binary relation, graph clustering (or modularization) can be mathematically modelled using Mathematical Relational Analysis. This modelling allows numerous graph clustering criteria to be compared within the same formal representation. Through a relational coding, we show how to compare different modularization criteria such as: Newman-Girvan, Zahn-Condorcet, Owsinski-Zadrozny, Demaine-Immorlica, Wei-Cheng, Profile Difference and Michalski-Goldberg. We introduce three modularization criteria: the Balanced Modularity, the deviation to Indetermination and the deviation to Uniformity. We identify the properties verified by those criteria and, for some of them, especially linear criteria, we characterize the partitions obtained by their optimization. The final goal is to facilitate their understanding and their use in practical contexts, where their purposes become easily interpretable and understandable.
Sridhar, Srivatsan. "Analyse statistique de la distribution des amas de galaxies à partir des grands relevés de la nouvelle génération". Thesis, Université Côte d'Azur (ComUE), 2016. http://www.theses.fr/2016AZUR4152/document.
I aim to study with which accuracy it is actually possible to recover the real-space two-point correlation function from cluster catalogues based on photometric redshifts. I make use of cluster sub-samples selected from a simulated light-cone catalogue. Photometric redshifts are assigned to each cluster by randomly drawing from a Gaussian distribution with a dispersion varied in the range σ(z=0) = 0.005 to 0.050. The correlation function in real space is computed through a deprojection method. Four mass ranges and six redshift slices covering the redshift range 0
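The deprojection procedure is specific to the thesis, but the underlying two-point statistics rest on pair counting. A minimal sketch of the standard Landy-Szalay estimator, ξ = (DD − 2DR + RR) / RR with normalised pair counts, on toy 3D positions (illustrative only, not the thesis pipeline and far too slow for real catalogues):

```python
import random
from itertools import combinations

def pair_counts(points, r_lo, r_hi):
    """Count pairs within one set with separation in [r_lo, r_hi)."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return sum(1 for p, q in combinations(points, 2)
               if r_lo ** 2 <= d2(p, q) < r_hi ** 2)

def landy_szalay(data, rand, r_lo, r_hi):
    """Landy-Szalay estimator xi = (DD - 2*DR + RR) / RR, with each pair
    count normalised by the total number of pairs in its set(s)."""
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, r_lo, r_hi) / (nd * (nd - 1) / 2)
    rr = pair_counts(rand, r_lo, r_hi) / (nr * (nr - 1) / 2)
    dr = sum(1 for p in data for q in rand
             if r_lo ** 2 <= sum((a - b) ** 2
                                 for a, b in zip(p, q)) < r_hi ** 2)
    dr /= nd * nr
    return (dd - 2 * dr + rr) / rr

random.seed(0)
rand = [(random.random(), random.random(), random.random()) for _ in range(200)]
data = [(random.random(), random.random(), random.random()) for _ in range(200)]
# Both samples are uniform here, so xi should be close to 0.
print(landy_szalay(data, rand, 0.1, 0.2))
```

Photometric-redshift scatter smears pair separations along the line of sight, which is precisely why the deprojection step is needed before such an estimator recovers the real-space signal.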
Lavallard, Anne. "Exploration interactive d'archives de forums : Le cas des jeux de rôle en ligne". Phd thesis, Université de Caen, 2008. http://tel.archives-ouvertes.fr/tel-00292617.
Campigotto, Romain. "Algorithmes d'approximation à mémoire limitée pour le traitement de grands graphes : le problème du Vertex Cover". Phd thesis, Université d'Evry-Val d'Essonne, 2011. http://tel.archives-ouvertes.fr/tel-00677774.
Fender, Alexandre. "Solutions parallèles pour les grands problèmes de valeurs propres issus de l'analyse de graphe". Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLV069/document.
Graphs, or networks, are mathematical structures representing relations between elements. These systems can be analyzed to extract information on their overall structure or on the nature of individual components. The analysis of networks often results in problems of high complexity. At large scale, the exact solution is prohibitively expensive to compute. Fortunately, this is an area where iterative approximation methods can be employed to find accurate estimations. Historical methods suitable for a small number of variables do not scale to the large, sparse matrices arising in graph applications. Therefore, the design of scalable and efficient solvers remains an essential problem. Simultaneously, the emergence of parallel architectures such as the GPU has brought remarkable improvements in performance and power efficiency. In this dissertation, we focus on solving large eigenvalue problems arising in network analytics with the goal of efficiently utilizing parallel architectures. We revisit spectral graph analysis theory and propose novel parallel algorithms and implementations. Experimental results indicate improvements on real, large applications in the context of ranking and clustering problems.
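The iterative eigenvalue methods this abstract alludes to are typified by power iteration, the building block behind PageRank-style ranking. A minimal sketch (the dissertation targets sparse, GPU-parallel solvers, which this pure-Python version does not attempt): the solver only needs a matrix-vector product, which is exactly the access pattern that makes sparse and parallel implementations possible.

```python
def power_iteration(matvec, n, iters=100):
    """Approximate the dominant eigenpair of an n x n operator
    given only as a matrix-vector product."""
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        w = matvec(v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(matvec(v), v))  # Rayleigh quotient
    return lam, v

# Adjacency matrix of a triangle: dominant eigenvalue is 2.
adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
mv = lambda v: [sum(a * x for a, x in zip(row, v)) for row in adj]
lam, vec = power_iteration(mv, 3)
print(round(lam, 6))  # → 2.0
```

Production spectral solvers replace this with Lanczos- or LOBPCG-type iterations, but the matvec-only interface is the same.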
Lelu, Alain. "Modeles neuronaux pour l'analyse de donnees documentaires et textuelles : organiser de très grands tableaux de données qualitatives en pôles et zones d'influence". Paris 6, 1993. http://www.theses.fr/1993PA066148.
Hénon, Pascal. "Distribution des données et régulation statique des calculs et des communications pour la résolution de grands systèmes linéaires creux par méthode directe". Bordeaux 1, 2001. http://www.theses.fr/2001BOR12432.
Bereau, Philippe. "Traitements informatiques de l'information formelle et informelle pour l'aide à la veille technologique et à la planification stratégique des petites et moyennes entreprises et des grands groupes industriels". Aix-Marseille 3, 1999. http://www.theses.fr/1999AIX3A001.
Teyssière, Gilles. "Processus d'appariements sur le marché du travail : une étude à partir de données d'une agence locale de l'ANPE". Aix-Marseille 2, 1991. http://www.theses.fr/1991AIX24001.
The purpose of this thesis is to determine the explanatory elements of the employer's hiring decision when meeting a worker through the national agency for employment. We use for this study a theoretical framework constituted by matching models. These models explain the wage level the worker receives by his labour productivity (or his level of education) and the alternative meeting opportunities of the two agents. We adapt these models to a sample of observed meetings and explain the worker's hiring probability with a nested logit model. As explanatory variables we use the individual characteristics of the worker (such as age, sex, marital status, level of education, past situation in the labour market...), the characteristics of the vacancies (such as the type of labour contract, the offered wage...) and the employer's alternative meeting opportunities. Moreover, we explain the employer's hiring behaviour through time with survival models. The estimation results show a segmentation of the labour market on the basis of the worker's level of education: a worker is hired only if his level of education is greater than a level fixed by the employer.