Dissertations / Theses on the topic 'Série de données'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Série de données.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Meyer, Nicolas. "Méthodes statistiques d'analyse des données d'allélotypage en présence d'homozygotes." Université Louis Pasteur (Strasbourg) (1971-2008), 2007. https://publication-theses.unistra.fr/public/theses_doctorat/2007/MEYER_Nicolas_2007.pdf.
Full text
Allelotyping data contain measurements made using Polymerase Chain Reaction on a batch of DNA microsatellites in order to ascertain the presence or absence of an allelic imbalance for these microsatellites. From a statistical point of view, these data are characterised by a high number of missing values (in the case of homozygous microsatellites), square or flat matrices, binomial data, sample sizes which may be small with respect to the number of variables, and possibly some collinearity. Frequentist statistical methods have a number of shortcomings which led us to choose a Bayesian framework to analyse these data. For univariate analyses, the Bayes factor is explored and several variants, according to the presence or absence of missing data, are compared. Different types of multiple imputation are then studied. Meta-analysis models are also assessed. For multivariate analyses, a Partial Least Squares model is developed. The model is applied under a generalised linear model (logistic regression) and combined with a Non Iterative Partial Least Squares algorithm, which makes it possible to manage simultaneously all the limits of allelotyping data. Properties of this model are explored. It is then applied to allelotyping data on 33 microsatellites from 104 patients with colon cancer to predict the Astler-Coller tumour stage. A model with all possible pairwise microsatellite interactions is also run.
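The Partial Least Squares building block mentioned in this abstract can be illustrated with a minimal single-component PLS sketch in the NIPALS spirit (synthetic data; this is a generic illustration, not the author's Bayesian logistic PLS model):

```python
import numpy as np

def pls1_component(X, y):
    """One PLS component in the NIPALS spirit: weights are taken
    from the covariance between predictors and response."""
    w = X.T @ y                    # weight vector, proportional to cov(X, y)
    w /= np.linalg.norm(w)         # normalise the weights
    t = X @ w                      # scores: projection of X onto w
    p = X.T @ t / (t @ t)          # loadings: regression of X on the scores
    b = (t @ y) / (t @ t)          # regression coefficient of y on the scores
    return w, t, p, b

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 0.1 * rng.normal(size=200)   # response driven by the first predictor
X -= X.mean(axis=0)                        # centre, as PLS assumes
y -= y.mean()
w, t, p, b = pls1_component(X, y)
print(abs(w[0]) > np.abs(w[1:]).max())     # the informative predictor dominates
```

Further components would be extracted after deflating X by the rank-one approximation t pᵀ; the logistic variant replaces the final least-squares step with a generalised linear fit.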
Ponchateau, Cyrille. "Conception et exploitation d'une base de modèles : application aux data sciences." Thesis, Chasseneuil-du-Poitou, Ecole nationale supérieure de mécanique et d'aérotechnique, 2018. http://www.theses.fr/2018ESMA0005/document.
Full text
It is common practice in experimental science to use time series to represent experimental results, which usually come as a list of values in chronological order (indexed by time) and are generally obtained via sensors connected to the studied physical system. These series are analyzed to obtain a mathematical model that allows the data to be described, and thus the behavior of the studied system to be understood and explained. Nowadays, storage and analysis technologies for time series are numerous and mature, but storage and management technologies for mathematical models, and their linking to experimental numerical data, are both scarce and recent. Still, mathematical models have an essential role to play in the interpretation and validation of experimental results. Consequently, an adapted storage system would ease the management and re-usability of mathematical models. This work aims at developing a models database to manage mathematical models and provide a “query by data” system, to help retrieve or identify a model from an experimental time series. In this work, I describe the design (from the modeling of the system to its software architecture) of the models database and its extensions that allow the “query by data”. Then, I describe the prototype of the models database that I implemented and the results of tests performed on it.
Iraqui, Samir. "Détection statique en temps semi réel de valeurs aberrantes dans une série chronologique de données bactériologiques." Rouen, 1986. http://www.theses.fr/1986ROUES042.
Full text
Benson, Marie Anne. "Pouvoir prédictif des données d'enquête sur la confiance." Master's thesis, Université Laval, 2021. http://hdl.handle.net/20.500.11794/69497.
Full text
Confidence survey data are time series containing the responses to questions aiming to measure the confidence and expectations of economic agents about future economic activity. The richness of these data and their availability in real time attract the interest of many forecasters, who see them as a way to improve their traditional forecasts. In this thesis, I assess the predictive power of survey data for the future evolution of Canadian GDP, while comparing the forecasting performance of the Conference Board of Canada's own confidence indices to the indicators I construct using principal component analysis. Using three simple linear models, I carry out an out-of-sample forecasting experiment with rolling windows on the period 1980 to 2019. The results show that principal component analysis provides better-performing indicators than the indices produced by the Conference Board. However, the study cannot show clearly that confidence unambiguously improves forecasts once the lagged growth rate of GDP is added to the analysis.
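The principal-component construction of a confidence indicator lends itself to a compact illustration. The sketch below builds a first-principal-component indicator from synthetic standardized survey balances; the data and dimensions are invented for illustration, not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 160, 6                      # quarters of data, number of survey questions
factor = rng.normal(size=T)        # latent "confidence" common to all questions
surveys = factor[:, None] + 0.5 * rng.normal(size=(T, k))  # noisy survey balances

# Standardise each question, then take scores on the first principal component
Z = (surveys - surveys.mean(axis=0)) / surveys.std(axis=0)
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
indicator = Z @ Vt[0]

# The indicator recovers the common confidence factor up to sign and scale
corr = abs(np.corrcoef(indicator, factor)[0, 1])
print(corr > 0.9)
```

In a forecasting exercise, the lagged indicator would then enter a simple linear regression for GDP growth, re-estimated on each rolling window.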
Peng, Tao. "Analyse de données IoT en flux." Electronic Thesis or Diss., Aix-Marseille, 2021. http://www.theses.fr/2021AIXM0649.
Full text
Since the advent of the IoT (Internet of Things), we have witnessed an unprecedented growth in the amount of data generated by sensors. To exploit this data, we first need to model it, and then we need to develop analytical algorithms to process it. For the imputation of missing data from a sensor f, we propose ISTM (Incremental Space-Time Model), an incremental multiple linear regression model adapted to non-stationary data streams. ISTM updates its model by selecting: 1) data from sensors located in the neighborhood of f, and 2) the most recent near-past data gathered from f. To evaluate data trustworthiness, we propose DTOM (Data Trustworthiness Online Model), a prediction model that relies on online regression ensemble methods such as AddExp (Additive Expert) and BNNRW (Bagging NNRW) to assign a trust score in real time. DTOM consists of: 1) an initialization phase, 2) an estimation phase, and 3) a heuristic update phase. Finally, we are interested in predicting multiple outputs in the presence of imbalanced data, i.e. when there are more instances in one value interval than in another. We propose MORSTS, an online regression ensemble method with specific features: 1) the sub-models have multiple outputs, 2) a cost-sensitive strategy is adopted, i.e. incorrectly predicted instances are given a higher weight, and 3) over-fitting is managed by means of k-fold cross-validation. Experimentation with real data has been conducted and the results were compared with well-known techniques.
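ISTM is described as an incremental multiple linear regression over a stream; the generic recursive least-squares update below conveys the idea of refreshing coefficients one observation at a time, without refitting on the full history (an illustrative sketch, not the thesis's exact algorithm):

```python
import numpy as np

class RecursiveLS:
    """Generic recursive least squares: update regression coefficients
    one observation at a time, keeping only the running state."""
    def __init__(self, dim, delta=1000.0):
        self.beta = np.zeros(dim)
        self.P = delta * np.eye(dim)        # running inverse information matrix

    def update(self, x, y):
        x = np.asarray(x, float)
        Px = self.P @ x
        gain = Px / (1.0 + x @ Px)          # Kalman-style gain vector
        self.beta += gain * (y - x @ self.beta)
        self.P -= np.outer(gain, Px)
        return self.beta

rng = np.random.default_rng(2)
true_beta = np.array([2.0, -1.0])
model = RecursiveLS(dim=2)
for _ in range(500):                        # simulate a stream of observations
    x = rng.normal(size=2)
    y = x @ true_beta + 0.1 * rng.normal()
    model.update(x, y)
print(np.round(model.beta, 1))              # close to [ 2. -1.]
```

A forgetting factor would be added to track the non-stationary streams the abstract mentions.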
Hugueney, Bernard. "Représentations symboliques de longues séries temporelles." Paris 6, 2003. http://www.theses.fr/2003PA066161.
Full text
David, Bogdan-Simion. "Les données climatiques instrumentales de Roumanie sont-elles susceptibles d'identifier un changement climatique ?" Strasbourg, 2010. http://www.theses.fr/2010STRA5004.
Full text
Ladjouze, Salim. "Problèmes d'estimation dans les séries temporelles stationnaires avec données manquantes." Phd thesis, Université Joseph Fourier (Grenoble ; 1971-2015), 1986. http://tel.archives-ouvertes.fr/tel-00319946.
Full text
Walter, Patricia. "L'effet du traitement chirurgical dans l'acromégalie : à propos d'une série caennaise de 29 acromégalies opérées : données comparées à celle de la littérature." Caen, 1993. http://www.theses.fr/1993CAEN3096.
Full text
Hocine, Mounia Nacima. "Analyse de données de comptage bivarié dans les études de série de cas ou de cohorte : application à la résistance bactérienne aux antibiotiques." Paris 11, 2005. http://www.theses.fr/2005PA11T052.
Full text
Langlois, Vincent. "Couple de friction métallique de nouvelle génération en arthroplastie totale primaire de hanche : historique, données actuelles et résultats préliminaires d'une série de 54 cas." Bordeaux 2, 2001. http://www.theses.fr/2001BOR23022.
Full text
Cheysson, Felix. "Maladies infectieuses et données agrégées : estimation de la fraction attribuable et prise en compte de biais." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASR012.
Full text
Epidemiological surveillance is most often based on the analysis of aggregate health indicators. We study the methodological problems encountered when working with this type of data in a public health context. First, we focus on calculating the attributable fraction when the exposure is epidemic and the number of health events exhibits seasonality. For the most frequently used time series models, we present a method for estimating this fraction and its confidence intervals. This work enabled us to show that the awareness campaign "Antibiotics are not automatic!" led to a reduction of more than half of the antibiotic prescriptions associated with influenza epidemics as early as 2005. Moreover, in recent years 17% of prescriptions are thought to be attributable to viral infections of the lower respiratory tract during the cold period, and nearly 38% in children, half of which are attributable to bronchiolitis. In a second step, we propose Hawkes processes as models for contagious diseases and study the impact of data aggregation on their estimation. In this context, we develop a method for estimating the process parameters and prove that the estimators have good asymptotic properties. This work provides statistical tools to avoid certain biases due to the use of aggregate data in the study of attributable fractions and contagious diseases.
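The attributable-fraction idea (the share of events explained by an epidemic exposure once seasonality is accounted for) can be sketched with a toy regression on synthetic weekly counts. This simplified OLS setup is for illustration only; the thesis works with proper time series models and confidence intervals:

```python
import numpy as np

rng = np.random.default_rng(3)
weeks = np.arange(520)                         # ten years of weekly data
season = 10 * np.cos(2 * np.pi * weeks / 52)   # seasonal baseline
flu = (weeks % 52 < 8).astype(float)           # epidemic indicator (8 weeks/year)
counts = 100 + season + 30 * flu + rng.normal(0, 2, size=weeks.size)

# Regress counts on seasonal terms plus the epidemic indicator
Xd = np.column_stack([np.ones_like(weeks, dtype=float),
                      np.cos(2 * np.pi * weeks / 52),
                      np.sin(2 * np.pi * weeks / 52),
                      flu])
coef, *_ = np.linalg.lstsq(Xd, counts, rcond=None)

# Attributable fraction: excess events during exposure over all events
attributable = coef[3] * flu.sum()
af = attributable / counts.sum()
print(round(af, 2))                            # share attributable to the epidemic
```

The epidemic coefficient is identified here because the seasonal terms absorb the baseline winter rise, which is exactly the confounding the abstract alludes to.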
Dkhil, Abdellatif. "Identification systématique de structures visuelles de flux physique de production." Strasbourg, 2011. http://www.theses.fr/2011STRA6012.
Full text
This research is motivated by the competitive environment of manufacturing companies. It mainly concerns the design of physical production systems. Specifically, the study is performed during the preliminary design phase. This phase is particularly sensitive and plays a major role, as different points of view can be considered to realize the conceptual design. Only one point of view, concerning the static production flow, is considered in this work. To generate a conceptual design from this point of view, a usual method of conceptual design elaboration is used. This method, introduced in much of the literature, can be seen as a chain of data-processing steps generated by three main activities. The first activity extracts the data flow from product routing data. During the second activity, properties of analysis are used to analyze the data flow. The single or combined analysis results are called visual structures. The third activity draws the production flow graph using visual structures. After a literature review, 44 properties of analysis are obtained. From these properties of analysis we can deduce 1.75 × 10^13 possible visual structures, and the same number of production flow graphs. Recognizing this, a scientific problem of model reduction based on expert knowledge is defined. Here, the model reduction is a restriction process based on expert rules and validated with industrial data. Through this restriction process, three contributions are proposed. The first concerns the identification of referential properties of analysis, considered the most useful and relevant in the preliminary design phase. The second allows the identification of referential visual structures. The third contribution is a method to automatically identify particular visual structures. In order to evaluate these contributions, an industrial case study is proposed.
Bajja, Ali. "Nouvelles données pétrographiques et géochimiques sur les formations volcaniques précambriennes du Djebel Saghro (anti-atlas marocain), basaltes en coussins du P II et volcanites de la série de Ouarzazate (P III)." Nancy 1, 1987. http://www.theses.fr/1987NAN10130.
Full text
Nguyen, Hoang Viet Tuan. "Prise en compte de la qualité des données lors de l’extraction et de la sélection d’évolutions dans les séries temporelles de champs de déplacements en imagerie satellitaire." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAA011.
Full text
This PhD thesis deals with knowledge discovery from Displacement Field Time Series (DFTS) obtained by satellite imagery. Such series now occupy a central place in the study and monitoring of natural phenomena such as earthquakes, volcanic eruptions and glacier displacements. These series are indeed rich in both spatial and temporal information and can now be produced regularly at low cost thanks to space programs such as the European Copernicus program and its Sentinel satellites. Our proposals are based on the extraction of grouped frequent sequential patterns. These patterns, originally defined for the extraction of knowledge from Satellite Image Time Series (SITS), showed their potential in early work on DFTS analysis. Nevertheless, they cannot use the confidence indices that come along with DFTS, and the swap method used to select the most promising patterns does not take into account their spatio-temporal complementarity, each pattern being evaluated individually. Our contribution is thus twofold. A first proposal aims to associate a measure of reliability with each pattern by using the confidence indices. This measure allows the selection of patterns whose occurrences in the data are, on average, sufficiently reliable. We propose a corresponding constraint-based extraction algorithm. It relies on an efficient search for the most reliable occurrences by dynamic programming, and on a pruning of the search space provided by a partial push strategy. This new method has been implemented on the basis of the existing prototype SITS-P2miner, developed by the LISTIC and LIRIS laboratories to extract and rank grouped frequent sequential patterns. A second contribution concerns the selection of the most promising patterns. Based on an informational criterion, it makes it possible to take into account both the confidence indices and the way the patterns complement each other spatially and temporally.
To this end, the confidence indices are interpreted as probabilities, and the DFTS are seen as probabilistic databases whose distributions are only partially known. The informational gain associated with a pattern is then defined by the ability of its occurrences to complete or refine the distributions characterizing the data. On this basis, a heuristic is proposed to select informative and complementary patterns. This method provides a set of weakly redundant patterns, which are therefore easier to interpret than those provided by swap randomization. It has been implemented in a dedicated prototype. Both proposals are evaluated quantitatively and qualitatively using a reference DFTS covering Greenland glaciers, constructed from Landsat optical data. Another DFTS, which we built from TerraSAR-X radar data covering the Mont-Blanc massif, is also used. In addition to being constructed from different data and remote sensing techniques, these series differ drastically in terms of confidence indices, the series covering the Mont-Blanc massif having very low confidence levels. In both cases, the proposed methods operate under standard conditions of resource consumption (time, space), and experts' knowledge of the studied areas is confirmed and completed.
Linardi, Michele. "Variable-length similarity search for very large data series : subsequence matching, motif and discord detection." Electronic Thesis or Diss., Sorbonne Paris Cité, 2019. http://www.theses.fr/2019USPCB056.
Full text
Data series (ordered sequences of real-valued points, a.k.a. time series) have become one of the most important and popular data types, present in almost all scientific fields. For the last two decades, and more evidently in recent years, interest in this data type has been growing at a fast pace. The reason behind this is mainly the recent advances in sensing, networking, data processing and storage technologies, which have significantly assisted the process of generating and collecting large amounts of data series. Data series similarity search has emerged as a fundamental operation at the core of several analysis tasks and applications related to data series collections. Many solutions to different data mining problems, such as clustering, subsequence matching, imputation of missing values, motif discovery, and anomaly detection, work by means of similarity search. Data series indexes have been proposed for fast similarity search. Nevertheless, all existing indexes can only answer queries of a single length (fixed at index construction time), which is a severe limitation. In this regard, all solutions to the aforementioned problems require prior knowledge of the series length on which similarity search is performed. Consequently, the user must know the length of the expected results, which is often an unrealistic assumption. This aspect is thus of paramount importance. In several cases, the length is a critical parameter that heavily influences the quality of the final outcome. In this thesis, we propose scalable solutions that enable variable-length analysis of very large data series collections. We propose ULISSE, the first data series index structure designed for answering similarity search queries of variable length. Our contribution is twofold. First, we introduce a novel representation technique, which effectively and succinctly summarizes multiple sequences of different lengths.
Based on the proposed index, we describe efficient algorithms for approximate and exact similarity search, combining disk-based index visits and in-memory sequential scans. Our approach supports non-Z-normalized and Z-normalized sequences, and can be used without changes with both Euclidean Distance and Dynamic Time Warping, for answering both κ-NN and ε-range queries. We experimentally evaluate our approach using several synthetic and real datasets. The results show that ULISSE is several times, and up to orders of magnitude, more efficient in terms of both space and time cost when compared to competing approaches. Subsequently, we introduce a new framework, which provides an exact and scalable motif and discord discovery algorithm that efficiently finds all motifs and discords in a given range of lengths. The experimental evaluation we conducted over several diverse real datasets shows that our approaches are up to orders of magnitude faster than the alternatives. We moreover demonstrate that we can remove the unrealistic constraint of performing analytics using a predefined length, leading to more intuitive and actionable results, which would have otherwise been missed.
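The role of Z-normalization in similarity search can be seen in a brute-force subsequence matcher: normalizing each window makes the match invariant to offset and scale. This is a naive fixed-length sketch for illustration; ULISSE's contribution is precisely to index variable lengths efficiently instead of scanning:

```python
import numpy as np

def znorm(s):
    """Z-normalise a sequence (zero mean, unit variance)."""
    s = np.asarray(s, float)
    return (s - s.mean()) / s.std()

def best_match(series, query):
    """Brute-force z-normalised Euclidean subsequence search."""
    m, q = len(query), znorm(query)
    dists = [np.linalg.norm(znorm(series[i:i + m]) - q)
             for i in range(len(series) - m + 1)]
    return int(np.argmin(dists)), min(dists)

rng = np.random.default_rng(4)
t = np.sin(np.linspace(0, 20, 400))
pattern = rng.normal(size=20)
t[100:120] = pattern                  # embed a distinctive shape at offset 100
query = 3.0 * pattern + 7.0           # same shape, different offset and scale
pos, d = best_match(t, query)
print(pos, d < 1e-6)                  # found at 100 despite the affine change
```

An index such as ULISSE avoids the O(n·m) per-query cost of this scan by pruning candidate windows through a summarized representation.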
Bahamonde, Natalia. "Estimation de séries chronologiques avec données manquantes." Paris 11, 2007. http://www.theses.fr/2007PA112115.
Full text
Nkoumbou, Charles. "I. Étude géologique des Monts Roumpi : un ensemble plutonique et volcanique de la "Ligne du Cameroun". II. Données pétrologiques sur les néphélinites du Mont Etinde (Cameroun)." Nancy 1, 1990. http://docnum.univ-lorraine.fr/public/SCD_T_1990_0460_NKOUMBOU.pdf.
Full text
El-Taib, El-Rafehi Ahmed. "Estimation des données manquantes dans les séries chronologiques." Montpellier 2, 1992. http://www.theses.fr/1992MON20239.
Full text
Khiali, Lynda. "Fouille de données à partir de séries temporelles d’images satellites." Thesis, Montpellier, 2018. http://www.theses.fr/2018MONTS046/document.
Full text
Nowadays, remotely sensed images constitute a rich source of information that can be leveraged to support several applications, including risk prevention, land use planning, land cover classification and many other tasks. In this thesis, Satellite Image Time Series (SITS) are analysed to depict the dynamics of natural and semi-natural habitats. The objective is to identify, organize and highlight the evolution patterns of these areas. We introduce an object-oriented method to analyse SITS that considers segmented satellite images. Firstly, we identify the evolution profiles of the objects in the time series. Then, we analyse these profiles using machine learning methods. To identify the evolution profiles, we explore all the objects to select a subset (spatio-temporal entities/reference objects) to be tracked. The evolution of the selected spatio-temporal entities is described using evolution graphs. To analyse these evolution graphs, we introduce three contributions. The first contribution explores annual SITS. It analyses the evolution graphs using clustering algorithms, to identify similar evolutions among the spatio-temporal entities. In the second contribution, we perform a multi-annual cross-site analysis. We consider several study areas described by multi-annual SITS. We use clustering algorithms to identify intra- and inter-site similarities. In the third contribution, we introduce a semi-supervised method based on constrained clustering. We propose a method to select the constraints that will be used to guide the clustering and adapt the results to the user's needs. Our contributions were evaluated on several study areas. The experimental results allow us to pinpoint relevant landscape evolutions in each study site. We also identify evolutions common to the different sites. In addition, the constraint selection method proposed for constrained clustering allows relevant entities to be identified.
Thus, the results obtained using unsupervised learning were improved and adapted to meet the user's needs.
Moyse, Gilles. "Résumés linguistiques de données numériques : interprétabilité et périodicité de séries." Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066526/document.
Full text
Our research is in the field of fuzzy linguistic summaries (FLS), which allow the generation of natural language sentences describing very large amounts of numerical data, providing concise and intelligible views of these data. We first focus on the interpretability of FLS, crucial to provide end-users with an easily understandable text, but hard to achieve due to its linguistic form. Beyond existing works on that topic, based on the basic components of FLS, we propose a general approach to the interpretability of summaries, considering them globally as groups of sentences. We focus more specifically on their consistency. In order to guarantee it in the framework of standard fuzzy logic, we introduce a new model of oppositions between increasingly complex sentences. The model allows us to show that these consistency properties can be satisfied by selecting a specific negation approach. Moreover, based on this model, we design a 4-dimensional cube displaying all the possible oppositions between sentences in an FLS, and show that it generalises several existing logical opposition structures. We then consider the case of data in the form of numerical series and focus on linguistic summaries about their periodicity: the sentences we propose indicate the extent to which the series are periodic and offer an appropriate linguistic expression of their periods. The proposed extraction method, called DPE (Detection of Periodic Events), splits the data in an adaptive manner and without any prior information, using tools from mathematical morphology. The segments are then exploited to compute the period and the periodicity, measuring the quality of the estimation and the extent to which the series is periodic. Lastly, DPE returns descriptive sentences of the form "Approximately every 2 hours, customer arrivals are high". Experiments with artificial and real data show the relevance of the proposed DPE method.
From an algorithmic point of view, we propose an incremental and efficient implementation of DPE, based on established update formulas. This implementation makes DPE scalable and allows it to process real-time data streams. We also present an extension of DPE based on the concept of local periodicity, allowing the identification of locally periodic subsequences in a numerical series using an original statistical test. The method, validated on artificial and real data, returns natural language sentences that extract information of the form "Every two weeks during the first semester of the year, sales are high".
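DPE itself relies on mathematical morphology; as a point of comparison, a generic period estimator based on the autocorrelation function can be sketched in a few lines (illustrative only, not the DPE algorithm):

```python
import numpy as np

def estimate_period(x, max_lag=None):
    """Estimate the dominant period of a series as the first prominent
    peak of its autocorrelation function (a generic baseline, not DPE)."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    max_lag = max_lag or n // 2
    acf = np.array([np.dot(x[:n - k], x[k:]) for k in range(max_lag)])
    acf /= acf[0]
    k = 1
    while k < max_lag and acf[k] > 0:   # skip the initial positive lobe
        k += 1
    return k + int(np.argmax(acf[k:]))  # highest peak after the first dip

t = np.arange(600)
series = np.sin(2 * np.pi * t / 50) + 0.1 * np.random.default_rng(5).normal(size=600)
period = estimate_period(series)
print(period)                           # close to the true period of 50
```

Unlike this global estimate, DPE segments the series adaptively, which is what enables its localized sentences about when the periodic behaviour holds.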
Rouy, Jean-Pierre. "Décomposition cycle-tendance des données françaises désagrégées." Paris 1, 1998. http://www.theses.fr/1998PA010027.
Full text
Measuring business cycles is at the forefront of modern economic research. It provides stylized facts that can be used to examine the quantitative and qualitative validity of theoretical models. The disaggregated data studied are the production of the intermediate, equipment and consumer goods sectors in France from 1963 to 1993. Several identifying methods are used, based on different concepts of business cycle fluctuations; they extract different types of information from the original series. The filtering tools, proposed among others by Hodrick-Prescott or Baxter-King, imply that we have knowledge of the cycle duration. Conversely, the key identifying assumption of the Harvey unobserved-components models is an explicit econometric representation of each component. Although the long-run economic representation of these methods is the same, they give us different short-term cyclical characteristics. Nevertheless, we propose in this study a chronology for each French manufacturing production series and describe their morphology in the NBER's spirit. In the multivariate case, cointegration and codependence principles indicate the number of common trends and common cycles existing in the series. The presence of random walks in the permanent components emphasizes that a large number of shocks drive production growth rates. This study shows that the properties of the business cycle vary widely across detrending methods. This is not an obstacle to understanding economic facts but evidence of their complexity.
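The Hodrick-Prescott filter cited among the detrending tools has a compact closed form: the trend minimizes a fit-plus-smoothness criterion, which reduces to a linear system in the second-difference operator. A small sketch with synthetic quarterly data (illustrative; λ = 1600 is the conventional quarterly value):

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott decomposition: the trend tau solves
    (I + lam * D'D) tau = y, with D the second-difference operator."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)          # shape (n-2, n)
    trend = np.linalg.solve(np.eye(n) + lam * D.T @ D, y)
    return trend, y - trend                      # trend and cyclical component

t = np.arange(120)                               # thirty years of quarterly data
cycle_true = np.sin(2 * np.pi * t / 20)          # a five-year cycle
y = 0.05 * t + cycle_true                        # linear trend plus cycle
trend, cycle = hp_filter(y)
corr = np.corrcoef(cycle, cycle_true)[0, 1]
print(corr > 0.9)                                # the filter isolates the cycle
```

The abstract's point about needing to know the cycle duration shows up here as the choice of λ, which fixes the frequencies assigned to trend versus cycle.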
Bayar, Mohamed Amine. "Randomized Clinical Trials in Oncology with Rare Diseases or Rare Biomarker-based Subtypes." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS441.
Full text
Large sample sizes are required in randomized trials designed to meet the typical one-sided α-level of 0.025 with at least 80% power. This may be unachievable in a reasonable time frame even with international collaborations, either because the medical condition is rare, or because the trial focuses on an uncommon subset of patients with a rare molecular subtype for which the treatment tested is deemed relevant. We simulated a series of two-arm superiority trials over a long research horizon (15 years). Within the series of trials, the treatment selected after each trial becomes the control treatment of the next one. Different disease severities, accrual rates, and hypotheses of how treatments improve over time were considered. We showed that, compared with two larger trials with the typical one-sided α-level of 0.025, performing a series of small trials with relaxed α-levels leads on average to larger survival benefits over a long research horizon, but also to a higher risk of selecting a worse treatment at the end of the research period. We then extended this framework with more 'flexible' designs, including interim analyses for futility and/or efficacy, and three-arm adaptive designs with treatment selection at interim. We showed that including an interim analysis with a futility rule is associated with an additional survival gain and better risk control compared to series with no interim analysis. Including an interim analysis for efficacy yields almost no additional gain. Series based on three-arm trials are associated with a systematic improvement of the survival gain and the risk control compared to series of two-arm trials. In the third part of the thesis, we examined the issue of randomized trials evaluating a treatment algorithm instead of a single drug's efficacy. The treatment in the experimental group depends on the mutation, unlike in the control group.
We evaluated two methods based on the Cox frailty model to estimate the treatment effect in each mutation: Maximum Integrated Partial Likelihood (MIPL) using the package coxme and Maximum H-Likelihood (MHL) using the package frailtyHL. The MIPL method performs slightly better. In the presence of a heterogeneous treatment effect, the two methods underestimate the treatment effect in mutations where it is large, and overestimate it in mutations where it is small.
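The sample-size pressure motivating the thesis is easy to quantify with the standard normal-approximation formula for a two-arm comparison of means; relaxing the one-sided α shrinks the trial markedly (generic formula, not the thesis's simulation framework; the effect size is invented):

```python
from statistics import NormalDist

def n_per_arm(alpha, power, delta, sigma=1.0):
    """Patients per arm for a one-sided two-arm superiority test,
    normal approximation: n = 2 * (sigma * (z_a + z_b) / delta)^2."""
    z = NormalDist().inv_cdf
    return 2 * (sigma * (z(1 - alpha) + z(power)) / delta) ** 2

standard = n_per_arm(alpha=0.025, power=0.80, delta=0.30)
relaxed = n_per_arm(alpha=0.10, power=0.80, delta=0.30)
print(round(standard), round(relaxed))   # the relaxed design needs far fewer patients
```

The thesis's question is precisely whether the accrual saved by such relaxed designs, reinvested in a longer series of trials, outweighs the extra risk of a false-positive selection.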
Paquin, Jean. "Développement d'algorithmes pour l'analyse des séries temporelles des données de production d'eau potable." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0017/MQ56951.pdf.
Cuenca, Pauta Erick. "Visualisation de données dynamiques et complexes : des séries temporelles hiérarchiques aux graphes multicouches." Thesis, Montpellier, 2018. http://www.theses.fr/2018MONTS054/document.
The analysis of data that is increasingly complex, large and drawn from different sources (e.g. the internet, social media, etc.) is a difficult task. It nevertheless remains crucial for many fields of application: extracting knowledge from such data requires a better understanding of its nature, its evolution and the many complex relationships it may contain. Information visualization provides visual and interactive representation methods that help a user extract knowledge. The work presented in this document takes place in this context. First, we are interested in the visualization of large hierarchical time series. After analyzing the existing approaches, we present the MultiStream system for visualizing, exploring and comparing the evolution of series organized into a hierarchical structure. We illustrate its use with two examples: emotions expressed in social media and the evolution of musical genres. Second, we tackle the problem of complex data modeled as multilayer graphs (different types of edges can connect the nodes). More specifically, we are interested in the visual querying of large graphs, and we present VERTIGo, a system which makes it possible to build queries, to launch them on a specific engine, to visualize and explore the results at different levels of detail, and to suggest new query extensions. We illustrate its use with a graph of co-authors from different communities.
Walwer, Damian. "Dynamique non linéaire des systèmes volcaniques à partir des données géodésiques." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE004/document.
We study the use of multichannel singular spectrum analysis (M-SSA) on GPS time series. This method makes it possible to analyze a set of time series simultaneously in order to extract common modes of variability, without any a priori assumption about the temporal or spatial structure of the geophysical fields. The extracted modes correspond to nonlinear trends, oscillations or noise. The method is applied to a set of GPS time series recorded at Akutan, a volcano located in the Aleutian arc in Alaska. Two types of signals are extracted: the first corresponds to seasonal deformations, the other represents two successive cycles of inflation and subsidence of Akutan volcano. The inflations are fast and short and are followed by deflations that are slower and longer. In the second part, we take advantage of M-SSA to analyze GPS time series recorded at several volcanoes. Okmok and Shishaldin in Alaska and Piton de la Fournaise on La Réunion have deformation histories partly similar to that of Akutan. The cyclic nature of the observed deformations leads us to draw an analogy between the oscillatory regime of a simple nonlinear oscillator and the deformation cycles of these volcanoes. Geochemical, petrological and geophysical data available for Okmok and Piton de la Fournaise, combined with the constraints on the qualitative dynamics brought by the nonlinear oscillator, allow us to propose a physical model: two shallow reservoirs connected by a cylindrical conduit in which the magma viscosity depends on temperature. Such a system behaves like the nonlinear oscillator mentioned above. When the temperature gradient inside the conduit is large enough and the flux of magma entering the shallow system is bounded by values that are determined analytically, a nonlinear oscillatory regime arises.
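The single-channel core of SSA can be sketched in a few lines (a simplification — the thesis uses the multichannel variant, which stacks the trajectory matrices of several series, and `ssa_reconstruct` with its arguments is an illustrative name): embed the series in a lagged (Hankel) trajectory matrix, truncate its SVD, and diagonal-average back to a series.

```python
import numpy as np

def ssa_reconstruct(x, window, rank):
    """Reconstruct series x from its `rank` leading SSA components."""
    n = len(x)
    k = n - window + 1
    # Trajectory (Hankel) matrix: lagged windows of the series as columns
    X = np.column_stack([x[i:i + window] for i in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_r = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-truncated matrix
    # Diagonal (Hankel) averaging maps the matrix back to a series
    rec = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        for j in range(k):
            rec[i + j] += X_r[i, j]
            counts[i + j] += 1
    return rec / counts
```

A pure oscillation occupies exactly two SSA components, which is why keeping a handful of leading components separates trends and oscillations (here, volcanic deformation cycles and seasonal terms) from noise.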
Ben, Salem Mélika. "Changement structurel et croissance : essai d'économétrie comparative sur données françaises et américaines." Paris 1, 1997. http://www.theses.fr/1997PA010078.
This dissertation analyses the empirical implications of growth models à la Solow-Swan (1956) and Romer (1986, 1990). Depending on the nature of the shocks affecting the economy and the nature of growth (exogenous or endogenous), the macroeconomic variables are characterized by particular statistical properties of persistence, which may originate from a common source. These statistical properties are tested in the framework of new time series methods, in order to reveal the propagation mechanisms of growth from the data. Thus the stability of the trend component in output, capital and hours worked, as well as the constancy of the connections between production and its usual inputs, is studied on French and American data, with annual observations over the period 1870-1994 and quarterly postwar observations.
Baudry, Maximilien. "Quelques problèmes d’apprentissage statistique en présence de données incomplètes." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSE1002.
Most statistical methods are not designed to work directly with incomplete data. The study of data incompleteness is not new, and strong methods have been established to handle it prior to a statistical analysis. On the other hand, the deep learning literature mainly works with unstructured data such as images, text or raw audio, and very little has been done on tabular data; hence, the modern machine learning literature tackling data incompleteness on tabular data is scarce. This thesis focuses on the use of machine learning models applied to incomplete tabular data, in an insurance context. Through our contributions, we propose ways to model complex phenomena in the presence of incompleteness schemes, and show that our approaches outperform state-of-the-art models.
Ben, othmane Zied. "Analyse et visualisation pour l'étude de la qualité des séries temporelles de données imparfaites." Thesis, Reims, 2020. http://www.theses.fr/2020REIMS002.
This thesis focuses on the quality of the information collected by sensors on the web. These data form time series that are incomplete, imprecise, and expressed on quantitative scales that are hard to compare. In this context, we are particularly interested in the variability and stability of these time series, and we propose two approaches to quantify them: the first is based on a quantile representation, the second is a fuzzy approach. Using these indicators, we propose an interactive visualization tool dedicated to analysing the quality of the data harvested by the sensors. This work is part of a CIFRE collaboration with Kantar.
Benkabou, Seif-Eddine. "Détection d’anomalies dans les séries temporelles : application aux masses de données sur les pneumatiques." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSE1046/document.
Anomaly detection is a crucial task that has attracted the interest of many research studies in the machine learning and data mining communities. The complexity of this task depends on the nature of the data, the availability of labels, and the application framework. In this thesis, we address this problem for complex data, and particularly for univariate and multivariate time series. The term "anomaly" refers to an observation that deviates from other observations so much as to arouse suspicion that it was generated by a different process. More generally, the underlying problem (also called novelty detection or outlier detection) aims to identify, in a data set, the items that differ significantly from the others, do not conform to an "expected behavior" (which could be defined or learned), and indicate a different mechanism. The "abnormal" patterns thus detected often carry critical information. We focus specifically on two aspects of unsupervised anomaly detection in time series. The first is global and consists in detecting time series that are abnormal with respect to an entire database, whereas the second, called contextual, aims to detect locally the points that are abnormal with respect to the global structure of the relevant time series. To this end, we propose optimization approaches based on weighted clustering and time warping for global detection, and matrix-based modeling for contextual detection. Finally, we present several empirical studies on public data to validate the proposed approaches and compare them with other known approaches in the literature. In addition, an experimental validation is provided on a real problem, the detection of outlier price time series in tyre data, to meet the needs expressed by Lizeo, the industrial partner of this thesis.
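The global-detection idea can be sketched with a deliberately simplified variant (the thesis uses weighted clustering with time warping; this sketch uses plain k-means with Euclidean distance, and `outlier_scores` is an illustrative name): cluster the series, then score each one by its distance to the nearest centroid.

```python
import numpy as np

def outlier_scores(X, k, n_iter=20, seed=0):
    """Global anomaly scoring for a set of series (rows of X): run k-means,
    then score each series by its distance to the nearest centroid.
    High score = far from every cluster = candidate outlier."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters in place
                centroids[j] = X[labels == j].mean(axis=0)
    d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
    return d.min(axis=1)
```

Replacing the Euclidean distance with an elastic measure such as DTW, and down-weighting suspect series during clustering, leads toward the kind of approach the abstract describes.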
Assaad, Mohammad. "Un nouvel algorithme de boosting pour les réseaux de neurones récurrents : application au traitement des données sequentielles." Tours, 2006. http://www.theses.fr/2006TOUR4024.
This thesis proposes a new boosting algorithm dedicated to learning time dependencies for time series prediction, using recurrent neural networks as regressors. The algorithm builds on classical boosting and concentrates training on difficult examples; a new parameter is introduced to regulate the influence of boosting. To evaluate the algorithm, systematic experiments were carried out on two types of time series prediction problems: single-step-ahead prediction and multi-step-ahead prediction. The results obtained on several reference series are close to the best results reported in the literature.
Lazar, Cosmin. "Méthodes non supervisées pour l’analyse des données multivariées." Reims, 2008. http://theses.univ-reims.fr/exl-doc/GED00000846.pdf.
Many scientific disciplines deal with multivariate data: different recordings of the same phenomenon are usually embedded in a multivariate data set, and multivariate data analysis gathers efficient tools for extracting the relevant information needed to comprehend the phenomenon under study. Gathering data into groups or classes according to some similarity criterion is an essential step in the analysis. The intrinsic dimension or dimension reduction of multivariate data, the choice of the similarity criterion, and cluster validation are problems that still leave open questions. This work tries to take a step forward on two of them: the choice of the similarity measure for data clustering and the dimension reduction of multivariate data. The choice of the similarity measure is investigated from the point of view of the metric concentration phenomenon. Non-Euclidean metrics are tested as alternatives to the classical Euclidean distance, and we tested whether less concentrated metrics are more discriminative for multivariate data clustering. We also proposed indices that take the inter-class distance into account (e.g. the Davies-Bouldin index) in order to find the optimal metric when the classes are supposed to be Gaussian. Blind Source Separation (BSS) methods are also investigated for dimension reduction of multivariate data. A BSS method based on a geometrical interpretation of the linear mixing model is proposed, and BSS methods that take application constraints into account are used for dimension reduction in two different applications of multivariate imaging. These methods allow the extraction of meaningful factors from the whole data set; they also reduce the complexity and computing time of the clustering algorithms used later in the analysis. Applications to multivariate image analysis are also presented.
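The concentration phenomenon mentioned here is easy to observe empirically (an illustrative sketch, not the thesis' protocol; the function name and the uniform-data assumption are ours): as dimension grows, the relative contrast between the farthest and nearest pairwise distances collapses, and fractional Minkowski exponents concentrate less than the Euclidean one.

```python
import numpy as np

def relative_contrast(n, dim, p, seed=0):
    """(Dmax - Dmin) / Dmin over the pairwise Minkowski-p distances of n
    uniform random points in [0, 1]^dim.  Smaller values mean the metric
    'concentrates': all points look almost equally far apart."""
    rng = np.random.default_rng(seed)
    X = rng.random((n, dim))
    diffs = np.abs(X[:, None] - X[None])            # (n, n, dim)
    D = (diffs ** p).sum(axis=2) ** (1.0 / p)
    d = D[np.triu_indices(n, k=1)]                  # distinct pairs only
    return (d.max() - d.min()) / d.min()
```

A clustering criterion built on a less concentrated metric therefore has more discriminative power in high dimension, which is the hypothesis the abstract tests.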
Wu, Fei. "Knowledge discovery in time-series databases." Versailles-St Quentin en Yvelines, 2001. http://www.theses.fr/2001VERS0023.
Full textOlteanu, Madalina. "Modèles à changements de régime : applications aux données financières." Phd thesis, Université Panthéon-Sorbonne - Paris I, 2006. http://tel.archives-ouvertes.fr/tel-00133132.
We propose to study these questions through two approaches. The first consists in showing the weak consistency of a penalized maximum likelihood estimator under stationarity and weak dependence conditions. The hypotheses introduced on the bracketing entropy of the class of generalized score functions are then verified in a linear Gaussian framework. The second, more empirical approach stems from unsupervised classification methods and combines Kohonen maps with a hierarchical classification, for which a new dispersion measure based on the residual sum of squares is introduced.
Rahier, Thibaud. "Réseaux Bayésiens pour fusion de données statiques et temporelles." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAM083/document.
Prediction and inference on temporal data are very frequently performed using timeseries data alone. We believe that these tasks could benefit from leveraging the contextual metadata associated with timeseries, such as location, type, etc. Conversely, tasks involving prediction and inference on metadata could benefit from the information held within timeseries. However, there exists no standard way of jointly modeling timeseries data and descriptive metadata. Moreover, metadata frequently contains highly correlated or redundant information, and may contain errors and missing values. We first consider the problem of learning the inherent probabilistic graphical structure of metadata as a Bayesian network. This has two main benefits: (i) once structured as a graphical model, metadata is easier to use to improve tasks on temporal data, and (ii) the learned model enables inference tasks on metadata alone, such as missing-data imputation. However, Bayesian network structure learning is a tremendous mathematical challenge that involves an NP-hard optimization problem. We present a tailor-made structure learning algorithm, inspired by novel theoretical results, that exploits the (quasi-)deterministic dependencies typically present in descriptive metadata. This algorithm is tested on numerous benchmark datasets and on industrial metadatasets containing deterministic relationships. In both cases it proved to be significantly faster than the state of the art, and even found better-performing structures on industrial data. Moreover, the learned Bayesian networks are consistently sparser and therefore more readable. We then focus on designing a model that includes both static (meta)data and dynamic data.
Taking inspiration from state-of-the-art probabilistic graphical models for temporal data (dynamic Bayesian networks) and from our previously described approach to metadata modeling, we present a general methodology to jointly model metadata and temporal data as a hybrid static-dynamic Bayesian network. We propose two main algorithms associated with this representation: (i) a learning algorithm which, while optimized for industrial data, is still generalizable to any task of static and dynamic data fusion, and (ii) an inference algorithm enabling both the usual tasks on temporal or static data alone and tasks using the two types of data. We then provide results on diverse cross-field applications such as forecasting, metadata replenishment from timeseries and alarm dependency analysis, using data from some of Schneider Electric's challenging use cases. Finally, we discuss some of the notions introduced during the thesis, including ways to measure the generalization performance of a Bayesian network by a score inspired by the cross-validation procedure of supervised machine learning. We also propose various extensions to the algorithms and theoretical results presented in the previous chapters, and formulate some research perspectives.
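One standard way to screen for the (quasi-)deterministic dependencies that such a structure-learning algorithm can exploit is the empirical conditional entropy (a generic sketch, not the thesis' actual criterion; the function name is ours): H(Y | X) = 0 means X determines Y exactly, and values near 0 indicate quasi-determinism.

```python
from collections import defaultdict
from math import log2

def conditional_entropy(x, y):
    """Empirical H(Y | X) in bits over paired discrete samples.
    0 means x determines y exactly (a deterministic dependency)."""
    n = len(x)
    joint = defaultdict(int)   # counts of (x, y) pairs
    marg = defaultdict(int)    # counts of x values
    for xi, yi in zip(x, y):
        joint[(xi, yi)] += 1
        marg[xi] += 1
    # H(Y|X) = -sum p(x,y) log2 p(y|x), with p(y|x) = c(x,y) / c(x)
    return -sum(c / n * log2(c / marg[xi]) for (xi, _), c in joint.items())
```

Edges whose conditional entropy is (near) zero can be fixed early, shrinking the NP-hard search space the abstract mentions.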
Coelho, Rodrigues Pedro Luiz. "Exploration des invariances de séries temporelles multivariées via la géométrie Riemannienne : validation sur des données EEG." Electronic Thesis or Diss., Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAT095.
Multivariate time series are the standard tool for describing and analysing measurements from multiple sensors during an experiment. In this work, we discuss different aspects of such representations that are invariant to transformations occurring in practical situations. The main source of inspiration for our investigations is experiments with neural signals from electroencephalography (EEG), but the ideas we present carry over to other kinds of time series. The first invariance we consider concerns the dimensionality of the multivariate time series. Very often, signals recorded from neighbouring sensors exhibit strong statistical dependencies. We present techniques for disposing of the redundancy of these correlated signals and obtaining new multivariate time series that represent the same phenomenon in a smaller dimension. The second invariance we treat relates to time series describing the same phenomena but recorded under different experimental conditions, for instance signals recorded with the same experimental apparatus but on different days of the week, on different test subjects, etc. In such cases, despite an underlying variability, the multivariate time series share certain commonalities that can be exploited for joint analysis. Moreover, reusing information already available from other datasets is a very appealing idea and allows for "data-efficient" machine learning methods. We present an original transfer learning procedure that transforms these time series so that their statistical distributions become aligned and can be pooled together for further statistical analysis. Finally, we extend the previous case to time series obtained from different experimental conditions and different experimental setups; a practical example is having EEG recordings from subjects executing the same cognitive task but with the electrodes positioned differently.
We present an original method that transforms these multivariate time series so that they become compatible in terms of dimensionality as well as statistical distribution. We illustrate the techniques described above on EEG epochs recorded during brain-computer interface (BCI) experiments. We show examples where the reduction of the multivariate time series does not affect the performance of the statistical classifiers used to distinguish their classes, as well as instances where our transfer learning and dimension-matching proposals provide remarkable classification results in cross-session and cross-subject settings. For exploring the invariances presented above, we rely on a framework that parametrizes the statistics of the multivariate time series via Hermitian positive definite (HPD) matrices. We manipulate these matrices by considering them as points of a Riemannian manifold equipped with an adequate metric, and use concepts from Riemannian geometry to define notions such as geodesic distance, center of mass, and statistical classifiers for time series. This approach is rooted in fundamental results of differential geometry for Hermitian positive definite matrices and has links with other well-established areas of applied mathematics, such as information geometry and signal processing.
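For real symmetric positive definite covariance matrices, the affine-invariant Riemannian distance at the heart of such approaches reduces to the log-eigenvalues of a generalized eigenproblem (a standard formula, sketched here with an illustrative function name, not the thesis' code):

```python
import numpy as np
from scipy.linalg import eigvalsh

def airm_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B:
    sqrt(sum_i log^2 lambda_i), where the lambda_i are the generalized
    eigenvalues solving B v = lambda A v."""
    lam = eigvalsh(B, A)       # generalized symmetric eigenvalues of (B, A)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

The distance is invariant under any joint congruence A → W A Wᵀ, B → W B Wᵀ with invertible W — precisely the property that makes it robust to linear mixing of the sensors.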
Goffinet, Étienne. "Clustering multi-blocs et visualisation analytique de données séquentielles massives issues de simulation du véhicule autonome." Thesis, Paris 13, 2021. http://www.theses.fr/2021PA131090.
The validation of advanced driving-assistance systems remains one of the biggest challenges car manufacturers must tackle to provide safe driverless cars. The reliable validation of these systems requires assessing the quality and consistency of their reactions to a broad spectrum of driving scenarios. In this context, large-scale simulation systems bypass the physical «on-track» limitations and produce large quantities of high-dimensional time series data. The challenge is to find valuable information in these multivariate unlabelled datasets, which may contain noisy, sometimes correlated or non-informative variables. This thesis proposes several model-based tools for univariate and multivariate time series clustering, based on a dictionary approach or on a Bayesian nonparametric framework. The objective is to automatically find relevant and natural groups of driving behaviors and, in the multivariate case, to perform model selection and multivariate time series dimension reduction. The methods are tested on simulated datasets and applied to industrial use cases from Groupe Renault.
Simon, Franck. "Découverte causale sur des jeux de données classiques et temporels. Application à des modèles biologiques." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS528.
This thesis focuses on the field of causal discovery: the construction of causal graphs from observational data, and in particular temporal causal discovery and the reconstruction of large gene regulatory networks. After a brief history, this thesis introduces the main concepts, hypotheses and theorems underlying causal graphs, as well as the two main approaches: score-based and constraint-based methods. The MIIC (Multivariate Information-based Inductive Causation) method, developed in our laboratory, is then described with its latest improvements: Interpretable MIIC. The issues and solutions involved in constructing a temporal version (tMIIC) are presented, as well as benchmarks reflecting the advantages of tMIIC compared with other state-of-the-art methods. The application to sequences of microscopy images of a tumor environment reconstituted on microchips illustrates the ability of tMIIC to recover, solely from data, known and new relationships. Finally, this thesis introduces the use of an a priori assumption on consequences to apply causal discovery to the reconstruction of gene regulatory networks. By assuming that all genes except transcription factors are consequence-only genes, it becomes possible to reconstruct graphs with thousands of genes. The ability to identify key transcription factors de novo is illustrated by an application to single-cell RNA sequencing data, with the discovery of two transcription factors likely to be involved in the biological process of interest.
Hmamouche, Youssef. "Prédiction des séries temporelles larges." Electronic Thesis or Diss., Aix-Marseille, 2018. http://www.theses.fr/2018AIXM0480.
Nowadays, storage and data processing systems are expected to store and process large time series. As the number of observed variables increases very rapidly, their prediction becomes more and more complicated, and using all the variables poses problems for classical prediction models. Univariate prediction models are among the first prediction models; to improve on them, the use of multiple variables has become common, and multivariate models are increasingly used because they take more information into account. As ever more interrelated data accumulate, however, the application of multivariate models also becomes questionable, because using all the available information does not necessarily lead to the best predictions. The challenge is therefore to find, among all the available data, the factors most relevant to a target variable. In this thesis, we study this problem by presenting a detailed analysis of the approaches proposed in the literature. We address the problem of prediction and dimension reduction for massive data, and we also discuss these approaches in the context of Big Data. The proposed approaches show promising and very competitive results compared to well-known algorithms, and lead to an improvement in prediction accuracy on the data used. We then present our contributions and propose a complete methodology for the prediction of wide time series. We also extend this methodology to big data via distributed computing and parallelism, with an implementation of the proposed prediction process in the Hadoop/Spark environment.
Nakkar, Osman. "Modélisation espace d'états de la dynamique des séries temporelles : traitement automatique des données du marché du cuivre." Montpellier 1, 1994. http://www.theses.fr/1994MON10027.
The state-space model and the Kalman filter, originally used in control engineering, have recently been used in econometrics to model time series and to estimate linear models with time-varying coefficients. We first present the state-space model and the Kalman filter, then discuss the problems associated with the identification and estimation of this model. To this end, we propose combining two methods: the EM algorithm and Hankel matrix factorization. To model nonstationary time series, we propose using a VAR model with time-varying coefficients, estimated by the Kalman filter. We also use a state-space model estimated in two steps, based on the classical idea of decomposing a nonstationary series into a trend component and a cyclical or irregular component. The empirical application to the copper market shows the practical advantage of the state-space model for forecasting, and its ability, through its transfer function, to detect the dynamic interactions between the variables involved in the model.
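The filtering recursion underlying this kind of model can be sketched in a few lines (the textbook predict/update equations, not the thesis' estimation procedure; the function name and arguments are illustrative):

```python
import numpy as np

def kalman_filter(y, F, H, Q, R, x0, P0):
    """Minimal Kalman filter for the linear state-space model
        x_t = F x_{t-1} + w_t,  w_t ~ N(0, Q)   (state equation)
        y_t = H x_t + v_t,      v_t ~ N(0, R)   (observation equation)
    Returns the filtered state means for each observation in y."""
    x, P = x0, P0
    out = []
    for yt in y:
        # Predict: propagate the state and its covariance
        x = F @ x
        P = F @ P @ F.T + Q
        # Update: correct with the new observation
        S = H @ P @ H.T + R                      # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
        x = x + K @ (yt - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        out.append(x.copy())
    return np.array(out)
```

A VAR with time-varying coefficients fits this template by putting the coefficients themselves in the state vector, with the lagged observations playing the role of H.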
Melzi, Fateh. "Fouille de données pour l'extraction de profils d'usage et la prévision dans le domaine de l'énergie." Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1123/document.
Nowadays, countries are called upon to take measures aimed at a better rationalization of electricity resources with a view to sustainable development. Smart metering solutions have been implemented and now allow a fine-grained reading of consumption. The massive spatio-temporal data collected can thus help to better understand consumption behaviors, forecast them, and manage them precisely. The aim is to ensure an "intelligent" use of resources so as to consume less and consume better, for example by reducing consumption peaks or by using renewable energy sources. The thesis work takes place in this context and aims to develop data mining tools in order to better understand electricity consumption behaviors and to predict solar energy production, thus enabling intelligent energy management. The first part of the thesis focuses on the classification of typical electricity consumption behaviors at the scale of a building and then of a territory. In the first case, typical daily power consumption profiles are identified using the functional K-means algorithm and a Gaussian mixture model. At the territorial scale and in an unsupervised context, the aim is to identify typical electricity consumption profiles of residential users and to link these profiles to contextual variables and metadata collected on the users. An extension of the classical Gaussian mixture model is proposed that allows exogenous variables, such as the type of day (Saturday, Sunday, working day, etc.), to be taken into account in the classification, leading to a parsimonious model. The proposed model is compared with classical models and applied to an Irish database including both electricity consumption data and user surveys. An analysis of the results over a monthly period makes it possible to extract a reduced set of user groups that are homogeneous in terms of their electricity consumption behaviors.
We also quantify the regularity of users in terms of consumption, as well as the temporal evolution of their consumption behaviors during the year. These two aspects are necessary to evaluate the potential for behavioral change required by a demand-response policy (shifting the consumption peak, for example) set up by electricity suppliers. The second part of the thesis concerns the forecasting of solar irradiance over two time horizons: short and medium term. To do this, several approaches are developed, including autoregressive statistical approaches for time series modelling and machine learning approaches based on neural networks, random forests and support vector machines. In order to take advantage of the different models, a hybrid model combining them is proposed. An exhaustive evaluation of the different approaches is conducted on a large database covering four locations (Carpentras, Brasilia, Pamplona and Reunion Island), each characterized by a specific climate, with weather parameters both measured and predicted by numerical weather prediction (NWP) models. The results obtained show that the hybrid model improves photovoltaic production forecasts for all locations.
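The profile-clustering step can be illustrated with a plain Gaussian mixture on synthetic daily load curves (a simplification: the thesis extends the mixture model with exogenous variables such as the day type; the two-peak synthetic data here are an assumption for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
hours = np.arange(24)
# Two synthetic behavior types: morning-peak and evening-peak daily curves
morning = np.exp(-0.5 * ((hours - 8) / 2.0) ** 2)
evening = np.exp(-0.5 * ((hours - 19) / 2.0) ** 2)
X = np.vstack([morning + 0.05 * rng.standard_normal((50, 24)),
               evening + 0.05 * rng.standard_normal((50, 24))])

# Each mixture component plays the role of one typical consumption profile
gmm = GaussianMixture(n_components=2, covariance_type="diag",
                      random_state=0).fit(X)
labels = gmm.predict(X)
```

The fitted component means can then be read as the "typical daily profiles" of each user group, which is the object the abstract extracts from the Irish smart-meter data.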
Julea, Andreea Maria. "Extraction de motifs spatio-temporels dans des séries d'images de télédétection : application à des données optiques et radar." Phd thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00652810.
Full textMalgras, Jacques. "Applications, à des données de la biologie des populations et de l'écologie, de méthodes d'analyse des séries temporelles." Lyon 1, 1996. http://www.theses.fr/1996LYO10240.
Full textBündgen, Blanche. "Évolution des comportements techniques au Magdalénien supérieur : les données de l'industrie lithique de La Madeleine (Dordogne), séries récentes." Bordeaux 1, 2002. http://www.theses.fr/2002BOR12515.
Full textBen, Hamadou Radhouane. "Contribution à l'analyse spatio-temporelle de séries écologiques marines." Paris 6, 2003. http://www.theses.fr/2003PA066021.
Full textPetitjean, François. "Dynamic time warping : apports théoriques pour l'analyse de données temporelles : application à la classification de séries temporelles d'images satellites." Thesis, Strasbourg, 2012. http://www.theses.fr/2012STRAD023.
Satellite image time series are becoming increasingly available and will continue to do so in the coming years thanks to the launch of space missions which aim at providing coverage of the Earth every few days with high spatial resolution (ESA's Sentinel program). In the case of optical imagery, it will be possible to produce land use and land cover change maps with detailed nomenclatures. However, due to meteorological phenomena such as clouds, these time series will be irregular in terms of temporal sampling. In order to consistently handle the huge amount of information that will be produced (for instance, Sentinel-2 will cover the entire Earth's surface every five days, at 10m to 60m spatial resolution and with 13 spectral bands), new methods have to be developed. This Ph.D. thesis focuses on the Dynamic Time Warping similarity measure, which is able to make the most of the temporal structure of the data in order to provide an efficient and relevant analysis of the remotely observed phenomena.
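The classic Dynamic Time Warping dynamic program is short enough to state in full (the textbook unconstrained formulation, with absolute-difference cost; the thesis' contributions around DTW are not reproduced here):

```python
import numpy as np

def dtw(a, b):
    """Dynamic Time Warping distance between two 1-D sequences:
    O(len(a) * len(b)) dynamic program, no constraint band."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # D[i, j] = best cost to align
    D[0, 0] = 0.0                         # prefixes a[:i] and b[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],        # a[i-1] repeats
                                 D[i, j - 1],        # b[j-1] repeats
                                 D[i - 1, j - 1])    # one-to-one match
    return D[n, m]
```

Because a sample may align with several samples of the other series, DTW tolerates the irregular temporal sampling (cloud gaps, shifted acquisition dates) that makes a plain point-wise distance unsuitable for satellite image time series.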
Gong, Xing. "Analyse de séries temporelles d’images à moyenne résolution spatiale : reconstruction de profils de LAI, démélangeage : application pour le suivi de la végétation sur des images MODIS." Thesis, Rennes 2, 2015. http://www.theses.fr/2015REN20021/document.
This PhD dissertation is concerned with time series analysis for medium spatial resolution (MSR) remote sensing images. The main advantage of MSR data is their high temporal rate, which makes it possible to monitor land use. However, two main problems arise with such data. First, because of cloud coverage and bad acquisition conditions, the resulting time series are often corrupted and not directly exploitable. Second, pixels in medium spatial resolution images are often "mixed", in the sense that the spectral response is a combination of the responses of "pure" elements. These two problems are addressed in this thesis. First, we propose a data assimilation technique able to recover consistent time series of Leaf Area Index from corrupted MODIS sequences. To this end, a plant growth model, namely GreenLab, is used as a dynamical constraint. Second, we propose a new and efficient unmixing technique for time series, based in particular on the use of "elastic" kernels able to properly compare time series shifted in time or of various lengths. Experimental results are shown on both synthetic and real data and demonstrate the efficiency of the proposed methodologies.
Ghaddar, Alia. "Improving the quality of aggregation using data analysis in WSNs." Thesis, Lille 1, 2011. http://www.theses.fr/2011LIL10068/document.
Full textThe promise and application domain of Wireless Sensor Networks (WSNs) continue to grow, spanning health care, home automation, industrial process control, object tracking, and more. This growth is driven by the emergence of small, embedded, intelligent sensor devices in our everyday life. These devices are getting smarter, with the capability to interact with the environment or with other devices, to analyze data and to make decisions. They have made it possible not only to gather data from the environment, but also to bridge the physical and virtual worlds and assist people in their activities, while achieving transparent integration of the wireless technology around us. Along with this promise, however, WSN deployments face several challenges, especially for battery-operated sensor networks, for which power consumption is the most important one. In fact, most WSNs are composed of low-power, battery-operated sensor nodes that are expected to replace human activities in many critical places, such as disaster relief terrains, active volcanoes, battlefields, difficult border terrains, etc. This makes their battery replacement or recharging a non-trivial task. We are concerned with the most energy-consuming part of these networks, namely communication. We propose methods to reduce the cost of transmission in energy-constrained sensor nodes. For this purpose, we observe the way data is collected and processed in order to save energy during transmission. Our work is built on three basic axes: data estimation, data similarity detection and abnormal behavior detection
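The idea of exploiting data similarity to suppress redundant transmissions can be sketched with a simple dead-band filter: a node transmits a reading only when it differs from the last transmitted value by more than a tolerance, and the sink reuses the last received value otherwise. This is a common illustrative baseline, not the specific estimation models of the thesis:

```python
class ThresholdReporter:
    """Dead-band filtering on a sensor node.

    A reading is transmitted only when it differs from the last
    transmitted value by more than `epsilon`; the sink keeps using
    the last received value as its estimate in between.
    """
    def __init__(self, epsilon):
        self.epsilon = epsilon
        self.last_sent = None

    def should_send(self, reading):
        if self.last_sent is None or abs(reading - self.last_sent) > self.epsilon:
            self.last_sent = reading
            return True
        return False  # similar enough: suppress the radio transmission
```

With `epsilon = 0.5`, the temperature stream `20.0, 20.1, 20.4, 21.0, 21.2` triggers only two transmissions (the first reading and the jump to 21.0), trading a bounded estimation error at the sink for radio energy.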
Rhéaume, François. "Une méthode de machine à état liquide pour la classification de séries temporelles." Thesis, Université Laval, 2012. http://www.theses.ulaval.ca/2012/28815/28815.pdf.
Full textThere are a number of reasons that motivate the interest in computational neuroscience for engineering applications of artificial intelligence. Among them is the speed at which the domain is growing and evolving, promising further capabilities for artificial intelligent systems. In this thesis, a method that exploits the recent advances in computational neuroscience is presented: the liquid state machine. A liquid state machine is a biologically inspired computational model that aims at learning on input stimuli. The model constitutes a promising temporal pattern recognition tool and has been shown to perform very well in many applications. In particular, temporal pattern recognition is a problem of interest in military surveillance applications such as automatic target recognition. Until now, most liquid state machine implementations for spatiotemporal pattern recognition have remained fairly similar to the original model. From an engineering perspective, a challenge is to adapt liquid state machines to increase their ability to solve practical temporal pattern recognition problems. Solutions to this challenge are proposed. The first one concentrates on the sampling of the liquid state: a method that exploits frequency features of neurons is defined. The combination of different liquid state vectors is also discussed. Secondly, a method for training the liquid is developed. The method implements synaptic spike-timing-dependent plasticity to shape the liquid. A new class-conditional approach is proposed, where different networks of neurons are trained exclusively on particular classes of input data. For the suggested liquid sampling methods and the liquid training method, comparative tests were conducted with both simulated and real data sets from different application areas. The tests reveal that the methods outperform the conventional liquid state machine approach.
The methods are even more promising in that the results are obtained without optimization of many internal parameters for the different data sets. Finally, measures of the liquid state are investigated for predicting the performance of the liquid state machine.
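The architecture the abstract describes (a fixed, randomly connected "liquid" whose sampled state feeds a simple trained readout) can be sketched in a few lines. Real liquid state machines use spiking neurons; this rate-based reservoir is a simplified illustrative analogue, not the thesis's model:

```python
import numpy as np

def run_reservoir(inputs, n_neurons=50, leak=0.3, seed=0):
    """Drive a random recurrent network with an input stream and
    return its final state (the sampled "liquid state").

    The recurrent and input weights are random and fixed, as in a
    liquid state machine; only a separate readout (e.g. a linear
    classifier on these state vectors) would be trained.
    """
    rng = np.random.default_rng(seed)
    w_in = rng.normal(scale=0.5, size=n_neurons)
    w_rec = rng.normal(scale=1.0 / np.sqrt(n_neurons), size=(n_neurons, n_neurons))
    x = np.zeros(n_neurons)
    for u in inputs:
        # Leaky integration of the recurrent and input drive.
        x = (1 - leak) * x + leak * np.tanh(w_rec @ x + w_in * u)
    return x
```

Two input sequences with different temporal structure leave the liquid in different states, so a simple linear readout fitted on such state vectors can separate temporal pattern classes; the sampling and training methods of the thesis refine precisely these two stages.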