Theses / dissertations on the topic "Classification de séries"
The 50 best works (theses / dissertations) for studies on the subject "Classification de séries".
Bailly, Adeline. "Classification de séries temporelles avec applications en télédétection". Thesis, Rennes 2, 2018. http://www.theses.fr/2018REN20021/document.
Time Series Classification (TSC) has attracted considerable interest over the past years owing to its many real-life applications. In this PhD, we create new algorithms for TSC, with a particular emphasis on Remote Sensing (RS) time series data. We first propose the Dense Bag-of-Temporal-SIFT-Words (D-BoTSW) method, which uses dense local features based on SIFT features for 1D data. Extensive experiments show that D-BoTSW significantly outperforms nearly all compared standalone baseline classifiers. Then, we propose an enhancement of the Learning Time Series Shapelets (LTS) algorithm called Adversarially-Built Shapelets (ABS), based on the introduction of adversarial time series during the learning process. Adversarial time series provide an additional regularization benefit for the shapelets, and experiments show a performance improvement of our proposed framework over the baseline. Due to the lack of available RS time series datasets, we also present and experiment on two remote sensing time series datasets called TiSeLaC and Brazilian-Amazon.
Jebreen, Kamel. "Modèles graphiques pour la classification et les séries temporelles". Thesis, Aix-Marseille, 2017. http://www.theses.fr/2017AIXM0248/document.
First, in this dissertation, we show that Bayesian network classifiers are very accurate models when compared to other classical machine learning methods. Discretising input variables often increases the performance of Bayesian network classifiers, as does a feature selection procedure. Different types of Bayesian networks may be used for supervised classification. We combine such approaches with feature selection and discretisation to show that the combination gives rise to powerful classifiers. A wide selection of data sets from the UCI machine learning repository is used in our experiments, and the application to epilepsy type prediction based on PET scan data confirms the efficiency of our approach. Second, we consider modelling interactions between a set of variables in the context of time series and high dimension. We suggest two approaches: the first is similar to the neighbourhood lasso, where the lasso model is replaced by Support Vector Machines (SVMs); the second is a restricted Bayesian network for time series. We demonstrate the efficiency of our approaches through simulations using linear and nonlinear data sets and a mixture of both.
Jean, Sandrine. "Classification à conjugaison près des séries de p-torsion". Limoges, 2008. https://aurore.unilim.fr/theses/nxfile/default/730bf760-8418-47c7-bec5-45796c5d7e8f/blobholder:0/2008LIMO4011.pdf.
According to Green-Matignon's version of the conjecture of F. Oort, any series of order p^n can be lifted to a series of the same order whose coefficients are integers in a certain extension of Q_p. To lift all formal power series of order p^n, it is therefore enough to lift one series from every conjugacy class. That is why, in this report, we study conjugacy classes of formal power series of order p^n with coefficients in the algebraic closure F_p^alg of F_p. The first chapter is a review of local fields, especially local fields of characteristic p. In the second chapter, we give a second proof of the theorem of B. Klopsch, which describes the conjugacy classes of series of order p when the residue field is perfect. The third chapter is dedicated to Witt vectors and gives a reduction of these vectors. Then, in the fourth chapter, we use Witt vectors of length n which, thanks to Artin-Schreier-Witt theory, determine all extensions of degree p^n. In the fifth chapter, we use the equivalence between endomorphisms and formal power series to construct the first bijection, which links a set A_n of Witt vectors to a certain characterization of extensions of degree p^n of K. The second bijection, obtained through a certain group action, gives a correspondence between conjugacy classes of order p^n and the orbits of A_n under this action. We build this bijection in the sixth chapter. Finally, in the last chapter, we give two calculations, the first using Lubin-Tate theory and the second Artin-Schreier-Witt theory, to obtain an explicit expression of series of order 4 for the conjugation law.
Caiado, Aníbal Jorge da Costa Cristóvão. "Distance-based methods for classification and clustering of time series". Doctoral thesis, Instituto Superior de Economia e Gestão, 2006. http://hdl.handle.net/10400.5/3531.
Renard, Xavier. "Time series representation for classification : a motif-based approach". Electronic Thesis or Diss., Paris 6, 2017. http://www.theses.fr/2017PA066593.
The research described in this thesis concerns learning a motif-based representation from time series in order to perform automatic classification. Meaningful information in time series can be encoded across time through trends, shapes or subsequences, usually with distortions. Approaches have been developed to overcome these issues, often at the price of high computational complexity. Among these techniques, distance measures and time series representations are worth pointing out. We focus on the representation of the information contained in the time series. We propose a framework to generate a new time series representation for classical feature-based classification, based on the discovery of discriminant sets of time series subsequences (motifs). This framework transforms a set of time series into a feature space, using subsequences enumerated from the time series, distance measures and aggregation functions. One particular instance of this framework is the well-known shapelet approach. The potential drawback of such an approach is the large number of subsequences to enumerate, which induces a very large feature space and a very high computational complexity. We show that most subsequences in a time series dataset are redundant. Therefore, random sampling can be used to generate a very small fraction of the exhaustive set of subsequences, preserving the information necessary for classification and thus generating a much smaller feature space compatible with common machine learning algorithms and tractable computations. We also demonstrate that the number of subsequences to draw is not linked to the number of instances in the training set, which guarantees the scalability of the approach. 
Combining these results within our framework enables us to take advantage of advanced techniques (such as multivariate feature selection) to discover richer motif-based time series representations for classification, for example by taking into account the relationships between the subsequences. These theoretical results have been extensively tested on more than one hundred classical benchmarks from the literature, with univariate and multivariate time series. Moreover, since this research was conducted under an industrial research agreement (CIFRE) with ArcelorMittal, our work has been applied to the detection of defective steel products based on production-line sensor measurements.
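To make the idea in the abstract above concrete, here is a minimal sketch of a feature space built from randomly sampled subsequences. It is not the author's implementation: the function names, the fixed subsequence length, the Euclidean distance and the minimum as aggregation function are all assumptions chosen for illustration.

```python
import math
import random

def min_distance(series, pattern):
    """Smallest Euclidean distance between pattern and any window of series."""
    m = len(pattern)
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, pattern)))
        best = min(best, d)
    return best

def sample_subsequences(dataset, n_samples, length, rng):
    """Draw n_samples random subsequences of a fixed length (an assumption)."""
    patterns = []
    for _ in range(n_samples):
        series = rng.choice(dataset)
        start = rng.randrange(len(series) - length + 1)
        patterns.append(series[start:start + length])
    return patterns

def transform(dataset, patterns):
    """Map each series to its vector of minimum distances to every pattern."""
    return [[min_distance(s, p) for p in patterns] for s in dataset]

# Toy data (hypothetical): three short univariate series.
rng = random.Random(0)
dataset = [
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
patterns = sample_subsequences(dataset, n_samples=4, length=3, rng=rng)
features = transform(dataset, patterns)
```

Each row of `features` can then be passed to any ordinary feature-based classifier; the number of columns depends only on how many subsequences were drawn, not on the number of training series.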
Ziat, Ali Yazid. "Apprentissage de représentation pour la prédiction et la classification de séries temporelles". Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066324/document.
This thesis deals with the development of time series analysis methods. Our contributions focus on two tasks: time series forecasting and classification. Our first contribution presents a method for the prediction and completion of multivariate and relational time series. The aim is to simultaneously predict the evolution of a group of time series connected to each other according to a graph, and to complete the missing values in these series (which may correspond, for example, to the failure of a sensor during a given time interval). We propose to use representation learning techniques to forecast the evolution of the series while completing the missing values and taking into account the relationships that may exist between them. Extensions of this model are proposed and described: first in the context of the prediction of heterogeneous time series, and then in the case of the prediction of time series with an expressed uncertainty. A prediction model for spatio-temporal series is then proposed, in which the relations between the different series can be expressed more generally and can be learned. Finally, we are interested in the classification of time series. A joint model of metric learning and time series classification is proposed, and an experimental comparison is conducted.
Dilmi, Mohamed Djallel. "Méthodes de classification des séries temporelles : application à un réseau de pluviomètres". Electronic Thesis or Diss., Sorbonne université, 2019. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2019SORUS087.pdf.
The impact of climate change on the temporal evolution of precipitation, as well as the impact of the Parisian heat island on the spatial distribution of precipitation, motivates studying the variability of the water cycle at a small scale in Île-de-France. One way to analyse this variability using data from a rain gauge network is to cluster the time series measured by this network. In this thesis, we explore two approaches to time series clustering. For the first approach, based on describing series by characteristics, an algorithm for selecting characteristics based on genetic algorithms and topological maps is proposed. For the second approach, based on shape comparison, a dissimilarity measure (iterative downscaling time warping) was developed to compare two rainfall time series. The limits of the two approaches are then discussed, followed by a proposal for a mixed approach that combines the advantages of each. The approach was first applied to the evaluation of the spatial variability of precipitation in Île-de-France. To evaluate the temporal variability of precipitation, a clustering of the precipitation events observed by one station was carried out and then extended to the whole rain gauge network. The application to the historical series of Paris-Montsouris (1873-2015) makes it possible to automatically discriminate "remarkable" years from a meteorological point of view.
Esling, Philippe. "Multiobjective time series matching and classification". Paris 6, 2012. http://www.theses.fr/2012PA066704.
Millions of years of genetic evolution have shaped our auditory system, allowing us to discriminate acoustic events in a flexible manner. We can perceptually process multiple de-correlated scales in a multidimensional way. In addition, humans have a natural ability to extract a coherent structure from temporal shapes. We show that emulating these mechanisms in our algorithmic choices allows us to create efficient approaches to matching and classification, with a scope beyond musical issues. We introduce the problem of MultiObjective Time Series (MOTS) and propose an efficient algorithm to solve it. We introduce two innovative querying paradigms on audio files. We introduce a new classification paradigm based on the hypervolume dominated by different classes, called hypervolume-MOTS (HV-MOTS). This system studies the behavior of the whole class through its distribution and spread over the optimization space. We show an improvement over state-of-the-art methods on a wide range of scientific problems. We present a biometric identification system based on the sounds produced by heartbeats. This system reaches low error rates equivalent to those of other biometric features. These results are confirmed on the extensive cardiac data set of the Mars500 isolation study. Finally, we study the problem of generating orchestral mixtures that best imitate a sound target. The search algorithm based on the MOTS problem yields a set of solutions that approximate any audio source.
Rhéaume, François. "Une méthode de machine à état liquide pour la classification de séries temporelles". Thesis, Université Laval, 2012. http://www.theses.ulaval.ca/2012/28815/28815.pdf.
There are a number of reasons that motivate the interest in computational neuroscience for engineering applications of artificial intelligence. Among them is the speed at which the domain is growing and evolving, promising further capabilities for artificial intelligent systems. In this thesis, a method that exploits recent advances in computational neuroscience is presented: the liquid state machine. A liquid state machine is a biologically inspired computational model that aims to learn from input stimuli. The model constitutes a promising temporal pattern recognition tool and has been shown to perform very well in many applications. In particular, temporal pattern recognition is a problem of interest in military surveillance applications such as automatic target recognition. Until now, most liquid state machine implementations for spatiotemporal pattern recognition have remained fairly similar to the original model. From an engineering perspective, a challenge is to adapt liquid state machines to increase their ability to solve practical temporal pattern recognition problems. Solutions are proposed. The first concentrates on the sampling of the liquid state. In this regard, a method that exploits frequency features of neurons is defined. The combination of different liquid state vectors is also discussed. Secondly, a method for training the liquid is developed. The method implements synaptic spike-timing dependent plasticity to shape the liquid. A new class-conditional approach is proposed, where different networks of neurons are trained exclusively on particular classes of input data. For the suggested liquid sampling methods and the liquid training method, comparative tests were conducted with both simulated and real data sets from different application areas. The tests reveal that the methods outperform the conventional liquid state machine approach. 
The methods are even more promising in that the results are obtained without optimization of many internal parameters for the different data sets. Finally, measures of the liquid state are investigated for predicting the performance of the liquid state machine.
Plaud, Angéline. "Classification ensembliste des séries temporelles multivariées basée sur les M-histogrammes et une approche multi-vues". Thesis, Université Clermont Auvergne (2017-2020), 2019. http://www.theses.fr/2019CLFAC047.
Recording measurements of various phenomena and exchanging information about them contribute to the emergence of a type of data called time series. Today, huge quantities of such data are routinely collected. A time series is characterized by numerous points, and interactions can be observed between those points. A time series is multivariate when multiple measures are recorded at each timestamp, meaning that a point is in fact a vector of values. While univariate time series, with one value at each timestamp, are well studied and well defined, this is not the case for multivariate ones, whose analysis remains challenging. Indeed, classification techniques developed for univariate data cannot be applied directly to multivariate data, because interactions must be taken into account not only between points but also between dimensions. Moreover, in industrial settings such as at Michelin, the data are large and the series differ in length, i.e. in the number of points they contain. This adds further complexity to the analysis. None of the current techniques for classifying multivariate time series (MTS) satisfies all of the following criteria: low computational complexity, support for a varying number of points, and good classification results. In our approach, we explore a tool that has not previously been applied to MTS classification, called the M-histogram. An M-histogram is a visualization tool that uses M axes to project the density function underlying the data. We use it here to produce a new representation of the data that brings out the interactions between dimensions. Searching for links between dimensions corresponds in particular to a family of learning techniques called multi-view learning. A view is an extraction of dimensions of a dataset that are of the same nature or type. 
The goal is then to expose the links between the dimensions inside each view in order to classify all the data using an ensemble classifier. We therefore propose a multi-view ensemble model to classify multivariate time series. The model creates multiple M-histograms from different groups of dimensions. Each view then yields a prediction, and these predictions are aggregated to obtain a final prediction. In this thesis, we show that the proposed model allows fast classification of multivariate time series of different sizes. In particular, we applied it to a Michelin use case.
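As a rough illustration of why a histogram over M axes gives a fixed-size representation whatever the series length, the sketch below bins the paired values of two dimensions (M = 2). The abstract does not describe the thesis's actual construction, so the function name `m_histogram_2d`, the equal-width bins, the shared value range and the normalization are all assumptions.

```python
def m_histogram_2d(xs, ys, bins, lo, hi):
    """Normalized 2-D histogram of the paired values (xs[t], ys[t]).

    Assumes both dimensions share the value range [lo, hi]; values outside
    the range are clamped into the first or last bin.
    """
    width = (hi - lo) / bins
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        i = max(0, min(int((x - lo) / width), bins - 1))
        j = max(0, min(int((y - lo) / width), bins - 1))
        counts[i][j] += 1
    total = len(xs)
    return [[c / total for c in row] for row in counts]

# Two hypothetical bivariate series of different lengths (3 and 5 points).
short = m_histogram_2d([0.1, 0.9, 0.5], [0.2, 0.8, 0.5],
                       bins=2, lo=0.0, hi=1.0)
longer = m_histogram_2d([0.1, 0.2, 0.3, 0.8, 0.9], [0.9, 0.8, 0.7, 0.2, 0.1],
                        bins=2, lo=0.0, hi=1.0)
```

Both results are 2 x 2 grids, so series of different lengths become comparable inputs for a classifier; one such grid per group of dimensions would play the role of a view in an ensemble.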
Varasteh, Yazdi Saeed. "Représentations parcimonieuses et apprentissage de dictionnaires pour la classification et le clustering de séries temporelles". Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAM062/document.
Learning a dictionary for sparsely representing time series is an important way to extract latent temporal features, reveal salient primitives and sparsely represent complex temporal data. This thesis addresses the sparse coding and dictionary learning problem for time series classification and clustering under time warp. To that end, we propose a time warp invariant sparse coding and dictionary learning framework where both input samples and atoms define time series of different lengths that involve varying delays. In the first part, we formalize an L0 sparse coding problem and propose a time warp invariant orthogonal matching pursuit based on a new cosine maximization time warp operator. For the dictionary learning stage, a nonlinear time warp invariant kSVD (TWI-kSVD) is proposed. Thanks to a rotation transformation between each atom and its sibling atoms, a singular value decomposition is used to jointly approximate the coefficients and update the dictionary, similarly to the standard kSVD. In the second part, time warp invariant dictionary learning for time series clustering is formalized and a gradient descent solution is proposed. The proposed methods are compared with major shift invariant, convolved and kernel dictionary learning methods on several public and real temporal data sets. The experiments show the potential of the proposed frameworks to efficiently sparsely represent, classify and cluster time series under time warp.
Cano, Emmanuelle. "Cartographie des formations végétales naturelles à l’échelle régionale par classification de séries temporelles d’images satellitaires". Thesis, Rennes 2, 2016. http://www.theses.fr/2016REN20024/document.
Forest cover mapping is an essential tool for forest management. Detailed maps, characterizing forest types at a regional scale, are needed. This need can be fulfilled by time series of medium spatial resolution optical satellite images. This thesis aims at improving the supervised classification procedure applied to a time series, to produce maps detailing forest types at a regional scale. To meet this goal, we assessed the improvement of the results obtained by classifying a MODIS time series when the study area is stratified. An improvement of classification accuracy due to stratification built by object-based image analysis was observed, with an increase of the Kappa index value and an increase of the reject fraction rate. These two phenomena are correlated to the classified vegetation area. A minimal and a maximal value were identified, respectively related to a too high reject fraction rate and a neutral stratification impact. We carried out a second study, aiming at assessing the influence of the organization of the medium spatial resolution time series and of the algorithm on classification quality. Three distinct classification algorithms (maximum likelihood, Support Vector Machine, Random Forest) and several time series were studied. The results highlighted a significant improvement due to temporal and radiometric effects and the superiority of Random Forest. Thematic confusions and low user's and producer's accuracies were still observed for several classes. We finally studied the improvement brought by a change in the spatial resolution of the images composing the time series, in order to discriminate classes of mixed forest species. The conclusions of the former study (MODIS images) were confirmed with DEIMOS images. We can conclude that these effects are independent of the input data and their spatial resolution. 
A significant improvement was also observed, with an increase of the Kappa index value from 0.60 with MODIS data to 0.72 with DEIMOS data, due to a decrease of the mixed pixels rate.
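For readers unfamiliar with the Kappa index quoted above, the following sketch computes Cohen's kappa from a confusion matrix: the observed agreement is corrected by the agreement expected by chance from the row and column totals. The function name and the toy matrix are illustrative assumptions and have no connection to the thesis data.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column marginals, summed per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(k)
    ) / (total * total)
    return (observed - expected) / (1 - expected)

# Hypothetical two-class example: 80 of 100 samples correctly classified.
toy_confusion = [[40, 10],
                 [10, 40]]
kappa = cohen_kappa(toy_confusion)  # approximately 0.6
```

Here the observed agreement is 0.8 and the chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5), i.e. about 0.6.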
Nicolae, Maria-Irina. "Learning similarities for linear classification : theoretical foundations and algorithms". Thesis, Lyon, 2016. http://www.theses.fr/2016LYSES062/document.
The notion of metric plays a key role in machine learning problems such as classification, clustering and ranking. Learning metrics from training data in order to adapt them to the task at hand has attracted growing interest in recent years. This research field, known as metric learning, usually aims at finding the best parameters for a given metric under some constraints from the data. The learned metric is then used in a machine learning algorithm in the hope of improving performance. Most metric learning algorithms focus on learning the parameters of Mahalanobis distances for feature vectors. Current state-of-the-art methods scale well to datasets of significant size. On the other hand, the more complex topic of multivariate time series has received only limited attention, despite the omnipresence of this type of data in applications. An important part of the research on time series is based on dynamic time warping (DTW), which computes the optimal alignment between two time series. The current state of metric learning suffers from some significant limitations, which we aim to address in this thesis. The most important one is probably the lack of theoretical guarantees for the learned metric and its performance for classification. The theory of (ε, γ, τ)-good similarity functions was one of the first results relating the properties of a similarity to its classification performance. A second limitation in metric learning comes from the fact that most methods work with metrics that enforce distance properties, which are computationally expensive and often not justified. In this thesis, we address these limitations through two main contributions. The first is a novel general framework for jointly learning a similarity function and a linear classifier. This formulation is inspired by the (ε, γ, τ)-good theory, providing a link between the similarity and the linear classifier. 
It is also convex for a broad range of similarity functions and regularizers. We derive two equivalent generalization bounds through the frameworks of algorithmic robustness and uniform convergence using the Rademacher complexity, proving the good theoretical properties of our framework. Our second contribution is a method for learning similarity functions based on DTW for multivariate time series classification. The formulation is convex and makes use of the (ε, γ, τ)-good framework to relate the performance of the metric to that of its associated linear classifier. Using uniform stability arguments, we prove the consistency of the learned similarity, leading to the derivation of a generalization bound.
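The dynamic time warping mentioned in the abstract above can be sketched with the textbook dynamic program below. This is the classical univariate DTW with an absolute-difference local cost, not the learned multivariate similarity proposed in the thesis; the function name is an assumption.

```python
import math

def dtw_distance(a, b):
    """Classical DTW cost between two univariate sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]; each step
    may advance in a, in b, or in both.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # advance in a only
                                     cost[i][j - 1],      # advance in b only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[n][m]

# A series and a time-stretched copy align at zero cost.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

The first call returns 0 because the repeated value can be absorbed by the warping path, while the second returns 2 because every aligned pair differs by 1.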
Oliveira, Adriano Lorena Inácio de. "Neural networks forecasting and classification-based techniques for novelty detection in time series". Universidade Federal de Pernambuco, 2011. https://repositorio.ufpe.br/handle/123456789/1825.
The novelty detection problem can be defined as the identification of new or unknown data to which a machine learning system had no access during training. Novelty detection algorithms are designed to classify a given input pattern as normal or novel. These algorithms are used in many areas, such as computer vision, machine fault detection, computer network security and fraud detection. The behavior of a large number of systems can be modeled by time series. Recently, the problem of novelty detection in time series has received considerable attention. Several techniques have been proposed, including techniques based on time series forecasting with artificial neural networks and on the classification of time series windows. Forecasting-based novelty detection techniques for time series have been criticized for their unsatisfactory performance. In many practical problems, the amount of data available in the series is quite small, which makes forecasting an even more complex problem. This is the case for some important auditing problems, such as accounting auditing and payroll auditing. As an alternative to forecasting-based methods, some classification-based methods have recently been proposed for novelty detection in time series, including methods based on artificial immune systems, wavelets and one-class support vector machines. This thesis proposes a set of methods based on artificial neural networks for novelty detection in time series. The proposed methods were designed specifically to detect frauds arising from relatively small deviations, which are quite important in fraud detection applications in financial systems. 
The first method was proposed to improve the performance of forecasting-based novelty detection. It is based on robust confidence intervals, which are used to define suitable values for the thresholds used for novelty detection. The proposed method was applied to several financial time series and obtained much better results than previous forecasting-based methods. This thesis also proposes two different classification-based methods for novelty detection in time series. The first method is based on negative samples, whereas the second is based on RBF-DDA artificial neural networks and does not use negative samples in the training phase. Simulation results using several time series extracted from real applications showed that the second method achieves better performance than the first. Moreover, the performance of the second method does not depend on the size of the test set, unlike that of the first method. Besides the methods for novelty detection in time series, this thesis proposes and investigates four different methods for improving the performance of RBF-DDA neural networks. The proposed methods were evaluated on six data sets from the UCI repository, and the results showed that they considerably improve the performance of RBF-DDA networks and also that they outperform MLP networks and the AdaBoost method. In addition, we show that the proposed methods obtain results similar to k-NN. The methods proposed for improving RBF-DDA were also used together with the negative-samples-based method proposed in this thesis for novelty detection in time series. The results of several experiments showed that these methods also greatly improve the performance of fraud detection in time series, which is the main focus of this thesis.
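The forecasting-based scheme summarized above (flag a point as novel when its forecast error falls outside an interval estimated from past errors) can be sketched as follows. This is only an illustration under strong assumptions: the thesis uses neural network forecasts, whereas this sketch uses a naive last-value forecaster, and the empirical-quantile interval, the function names and the toy data are all hypothetical.

```python
def naive_forecast(history):
    """Stand-in forecaster (an assumption): predict the last observed value."""
    return history[-1]

def error_bounds(errors, coverage=0.9):
    """Empirical bounds that cover roughly `coverage` of the past errors."""
    ordered = sorted(errors)
    tail = (1.0 - coverage) / 2.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = int((1.0 - tail) * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

def detect_novelties(train, test, coverage=0.9):
    """Flag test points whose one-step forecast error leaves the interval."""
    errors = [train[t] - naive_forecast(train[:t]) for t in range(1, len(train))]
    low, high = error_bounds(errors, coverage)
    history = list(train)
    flags = []
    for value in test:
        err = value - naive_forecast(history)
        flags.append(err < low or err > high)
        history.append(value)
    return flags

# Hypothetical series: normal values oscillate between 10 and 12.
train = [10, 11, 10, 12, 11, 10, 11, 12, 11, 10, 11]
test = [11, 12, 25, 24]
flags = detect_novelties(train, test)
```

With these toy values the error interval is [-1, 1], so only the jump to 25 is flagged. The width of the interval acts as the detection threshold: a wider interval produces fewer alarms but can miss the small deviations the thesis is concerned with.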
Régis, Sébastien. "Segmentation, classification et fusion de séries temporelles multi-sources : application à des signaux dans un bio-procédé". Antilles-Guyane, 2004. http://www.theses.fr/2004AGUY0121.
This PhD is devoted to knowledge discovery using signal analysis and classification tools on time series. The application is the detection of new, known or abnormal physiological states in an alcoholic bioprocess. Data from time series are analysed, classified and fused. First, the wavelet transform and the Hölder exponent (linked to the singularities of the time series) are used to detect phenomena and physiological states of the system. A new approach combining the wavelet transform and differential evolutionary methods is proposed and gives better results than other classical methods for evaluating the Hölder exponent. Then the LAMDA classification method and its tools are presented. The aggregation operators of LAMDA are presented and a new operator is proposed. A comparison with other classifiers shows that LAMDA gives better results for this application. The relevance of data sources is studied, and a method based on evidence theory is proposed. Experimental results show that the relevance evaluation is quite interesting. This approach, using signal processing, classification and evidence theory, enables the analysis and characterisation of biological systems without using a deterministic model. The combination of these tools thus makes it possible to discover new knowledge and to confirm the knowledge of the expert, mainly by using time series describing biological systems.
Ben, Hamadou Radhouane. "Contribution à l'analyse spatio-temporelle de séries écologiques marines". Paris 6, 2003. http://www.theses.fr/2003PA066021.
Benkabou, Seif-Eddine. "Détection d’anomalies dans les séries temporelles : application aux masses de données sur les pneumatiques". Thesis, Lyon, 2018. http://www.theses.fr/2018LYSE1046/document.
Anomaly detection is a crucial task that has attracted the interest of several research studies in the machine learning and data mining communities. The complexity of this task depends on the nature of the data, the availability of their labeling and the application framework on which they depend. As part of this thesis, we address this problem for complex data, and particularly for univariate and multivariate time series. The term "anomaly" can refer to an observation that deviates from other observations so as to arouse suspicion that it was generated by a different generation process. More generally, the underlying problem (also called novelty detection or outlier detection) aims to identify, in a set of data, those which differ significantly from others, which do not conform to an "expected behavior" (which could be defined or learned), and which indicate a different mechanism. The "abnormal" patterns thus detected often translate into critical information. We focus specifically on two aspects of unsupervised anomaly detection in time series. The first is global and consists in detecting abnormal time series compared to an entire database, whereas the second is contextual and aims to detect, locally, the abnormal points with respect to the global structure of the relevant time series. To this end, we propose optimization approaches based on weighted clustering and time warping for global detection, and matrix-based modeling for contextual detection. Finally, we present several empirical studies on public data to validate the proposed approaches and compare them with other known approaches in the literature. In addition, an experimental validation is provided on a real problem, the detection of outlier price time series in tyre data, to meet the needs expressed by LIZEO, the industrial partner of this thesis.
Renard, Xavier. "Time series representation for classification : a motif-based approach". Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066593/document.
The research described in this thesis concerns learning a motif-based representation from time series in order to perform automatic classification. Meaningful information in time series can be encoded across time through trends, shapes or subsequences, usually with distortions. Approaches have been developed to overcome these issues, often at the price of high computational complexity. Among these techniques, it is worth pointing out distance measures and time series representations. We focus on the representation of the information contained in the time series. We propose a framework that generates a new time series representation for classical feature-based classification, based on the discovery of discriminant sets of time series subsequences (motifs). This framework transforms a set of time series into a feature space, using subsequences enumerated from the time series, distance measures and aggregation functions. One particular instance of this framework is the well-known shapelet approach. The potential drawback of such an approach is the large number of subsequences to enumerate, inducing a very large feature space and a very high computational complexity. We show that most subsequences in a time series dataset are redundant. Therefore, random sampling can be used to generate a very small fraction of the exhaustive set of subsequences, preserving the information necessary for classification and thus generating a much smaller feature space, compatible with common machine learning algorithms and with tractable computations. We also demonstrate that the number of subsequences to draw is not linked to the number of instances in the training set, which guarantees the scalability of the approach.
The combination of the latter in the context of our framework enables us to take advantage of advanced techniques (such as multivariate feature selection techniques) to discover richer motif-based time series representations for classification, for example by taking into account the relationships between the subsequences. These theoretical results have been extensively tested on more than one hundred classical benchmarks from the literature, with univariate and multivariate time series. Moreover, since this research was conducted in the context of an industrial research agreement (CIFRE) with ArcelorMittal, our work has been applied to the detection of defective steel products based on production-line sensor measurements.
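The transformation described in this abstract (subsequences drawn at random, then one distance feature per subsequence) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the author's implementation: the function names, the squared Euclidean distance and the minimum as aggregation function are all hypothetical choices, and every series is assumed to be at least as long as the sampled subsequences.

```python
import random


def min_subsequence_distance(series, motif):
    """Smallest squared Euclidean distance between `motif` and any window of `series`."""
    m = len(motif)
    best = float("inf")
    for start in range(len(series) - m + 1):
        d = sum((series[start + i] - motif[i]) ** 2 for i in range(m))
        best = min(best, d)
    return best


def sample_motifs(dataset, n_motifs, length, seed=0):
    """Draw `n_motifs` random subsequences of the given length from the dataset."""
    rng = random.Random(seed)
    motifs = []
    for _ in range(n_motifs):
        series = rng.choice(dataset)
        start = rng.randrange(len(series) - length + 1)
        motifs.append(series[start:start + length])
    return motifs


def transform(dataset, motifs):
    """Map each series to a feature vector: one distance per sampled motif."""
    return [[min_subsequence_distance(s, m) for m in motifs] for s in dataset]


dataset = [[0, 0, 1, 2, 1, 0], [0, 0, 0, 0, 0, 0]]
motifs = sample_motifs(dataset, n_motifs=3, length=3)
features = transform(dataset, motifs)  # 2 rows, 3 columns
```

The resulting feature matrix can be passed to any ordinary feature-based classifier; the number of columns depends only on the number of sampled subsequences, not on the number of training series.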
Amaral, Bruno Ferraz do. "Classificação semissupervisionada de séries temporais extraídas de imagens de satélite". Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-18112016-105621/.
The amount of digital data generated and stored, as well as the need for the creation and management of large databases, has increased significantly in recent decades. The possibility of finding valid and potentially useful patterns and information in large databases has attracted the attention of many scientific areas. Time series databases have been explored using data mining methods in several application domains, such as economics, medicine and agrometeorology. Due to the large volume and complexity of some time series databases, the process of labeling data for supervised tasks, such as classification, can be very expensive. To overcome the scarcity of labeled data, semi-supervised classification, which benefits from both the labeled and the unlabeled data available, can be applied to classify data from large time series databases. In this Master's dissertation, we propose 1) a framework for the analysis of data extracted from satellite image time series (SITS) using data mining tasks and 2) a graph-based semi-supervised classification method, developed to classify temporal data obtained from satellite images. According to experts in agrometeorology, the proposed method and framework provide an automatic way of analyzing data extracted from SITS, which is very useful for supporting research in this application domain. We apply the framework and the proposed semi-supervised classification method to the analysis of vegetation index time series, aiming at identifying sugarcane crop fields in Brazil. Experimental results indicate that our proposed framework is useful for supporting research in agriculture, according to experts in the application domain. We also show that our method is more accurate than traditional supervised methods and related semi-supervised methods.
Leverger, Colin. "Investigation of a framework for seasonal time series forecasting". Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S033.
To deploy web applications, using web servers is paramount. If there are too few of them, application performance can quickly deteriorate. However, if they are too numerous, resources are wasted and costs increase. In this context, engineers use capacity planning tools to follow the performance of the servers, to collect time series data and to anticipate future needs. The necessity of creating reliable forecasts seems clear. Data generated by the infrastructure often exhibit seasonality. The activity cycle followed by the infrastructure is determined by some seasonal cycles (for example, the users' daily rhythms). This thesis introduces a framework for seasonal time series forecasting. This framework is composed of two machine learning models (e.g. clustering and classification) and aims at producing reliable mid-term forecasts with a limited number of parameters. Three instantiations of the framework are presented: one baseline, one deterministic and one probabilistic. The baseline is composed of K-means clustering algorithms and Markov models. The deterministic version is composed of several clustering algorithms (K-means, K-shape, GAK and MODL) and of several classifiers (naive Bayes, decision trees, random forests and logistic regression). The probabilistic version relies on co-clustering to create probabilistic time series grids, which are used to describe the data in an unsupervised way. The performance of the various implementations is compared with several state-of-the-art models, including the autoregressive models ARIMA and SARIMA, Holt-Winters, and Prophet for the probabilistic paradigm. The results of the baseline are encouraging and confirm the interest of the proposed framework. Good results are observed for the deterministic implementation, and correct results for the probabilistic version. One Orange use case is studied, and the interest and limits of the methodology are discussed.
Phan, Thi-Thu-Hong. "Elastic matching for classification and modelisation of incomplete time series". Thesis, Littoral, 2018. http://www.theses.fr/2018DUNK0483/document.
Missing data are a prevalent problem in many domains of pattern recognition and signal processing. Most of the existing techniques in the literature suffer from one major drawback, which is their inability to process incomplete datasets. Missing data produce a loss of information and thus yield inaccurate data interpretation, biased results or unreliable analysis, especially for large missing sub-sequence(s). This thesis therefore focuses on dealing with large consecutive missing values in univariate and low/un-correlated multivariate time series. We begin by investigating an imputation method to overcome these issues in univariate time series. This approach is based on the combination of a shape-feature extraction algorithm and the Dynamic Time Warping method. A new R package, namely DTWBI, is then developed. In the following work, the DTWBI approach is extended to complete large successive missing data in low/un-correlated multivariate time series (called DTWUMI), and a DTWUMI R package is also established. The key to these two proposed methods is the use of elastic matching to retrieve similar values in the series before and/or after the missing values. This optimizes as much as possible the dynamics and shape of the known data, while applying the shape-feature extraction algorithm reduces the computing time. Subsequently, we introduce a new method for filling large successive missing values in low/un-correlated multivariate time series, namely FSMUMI, which makes it possible to manage a high level of uncertainty. To this end, we propose to use novel fuzzy grades of basic similarity measures and fuzzy logic rules. Finally, we employ DTWBI to (i) complete the MAREL Carnot dataset and then perform a detection of rare/extreme events in this database, and (ii) forecast various meteorological univariate time series collected in Vietnam.
Bergomi, Mattia Giuseppe. "Dynamical and topological tools for (modern) music analysis". Electronic Thesis or Diss., Paris 6, 2015. http://www.theses.fr/2015PA066465.
In this work, we suggest a collection of novel models for the representation of music. These models are endowed with two main features. First, they originate from a topological and geometrical inspiration; second, their low dimensionality allows simple and informative visualisations to be built. We tackle the problem of music representation following three non-orthogonal directions. First, we propose an interpretation of counterpoint as a multivariate time series of partial permutation matrices, whose observations are characterised by a degree of complexity. After providing both a static and a dynamic representation of counterpoint, voice leadings are reinterpreted as a special class of partial singular braids, and their main features are visualised. Thereafter, we give a topological interpretation of the Tonnetz (a graph commonly used in computational musicology), whose vertices are deformed by both a harmonic and a consonance-oriented function. The shapes derived from these deformations are classified using the formalism of persistent homology. This novel representation of music is then evaluated on a collection of heterogeneous musical datasets. Finally, a combination of the two approaches is proposed. A model at the crossroads between the signal and symbolic analysis of music uses multiple sequence alignment to provide an encompassing, novel viewpoint on the transfer of musical inspiration among compositions belonging to different artists, genres and times. Then, music is represented as a time series of topological fingerprints, allowing the comparison of pairs of time-varying shapes in both topological and musical terms.
Bergomi, Mattia Giuseppe. "Dynamical and topological tools for (modern) music analysis". Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066465/document.
Petitjean, François. "Dynamic time warping : apports théoriques pour l'analyse de données temporelles : application à la classification de séries temporelles d'images satellites". Thesis, Strasbourg, 2012. http://www.theses.fr/2012STRAD023.
Satellite Image Time Series are becoming increasingly available and will continue to do so in the coming years thanks to the launch of space missions that aim at providing coverage of the Earth every few days with high spatial resolution (ESA’s Sentinel program). In the case of optical imagery, it will be possible to produce land use and land cover change maps with detailed nomenclatures. However, due to meteorological phenomena such as clouds, these time series will become irregular in terms of temporal sampling. In order to consistently handle the huge amount of information that will be produced (for instance, Sentinel-2 will cover the entire Earth’s surface every five days, with 10 m to 60 m spatial resolution and 13 spectral bands), new methods have to be developed. This Ph.D. thesis focuses on the “Dynamic Time Warping” similarity measure, which is able to make the most of the temporal structure of the data, in order to provide an efficient and relevant analysis of the remotely observed phenomena.
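For readers unfamiliar with the measure named in this abstract, the textbook form of Dynamic Time Warping can be sketched as below. This is the standard dynamic-programming recurrence with an absolute-difference local cost, shown as an assumption-laden illustration; it does not reproduce the theoretical contributions of the thesis, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences of possibly different lengths."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again with b[j-1]'s predecessor path
                cost[i][j - 1],      # b[j-1] is matched again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Two series with the same shape but different sampling align at zero cost.
same_shape = dtw_distance([0, 1, 1, 2], [0, 1, 2])
```

Because the alignment may stretch or compress either sequence, series of different lengths or with irregular sampling can still be compared, which is the property the abstract relies on.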
Santos, Irineu Júnior Pinheiro dos. "TRACTS : um método para classificação de trajetórias de objetos móveis usando séries temporais". reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2011. http://hdl.handle.net/10183/55445.
The growing use of global positioning systems (GPS) and other location systems has made the tracking of moving objects possible, producing a large volume of a new kind of data, called trajectories of moving objects. However, there is a large gap between the amount of data generated by these devices and the knowledge that can be inferred from these data. One type of knowledge discovery in trajectories of moving objects is classification. Trajectory classification is a relatively new research subject, and few methods have been proposed so far. Most of these methods were developed for a specific application. Only a few have proposed a general method, applicable to multiple domains or datasets. This work presents a new classification method that transforms the trajectories into time series, in order to obtain more discriminative features for classification. Experiments with real trajectory data revealed that the proposed approach is more effective than existing approaches.
Anghinoni, Leandro. "Classificação e previsão de séries temporais através de redes complexas". Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/59/59143/tde-11122018-095106/.
Extracting knowledge from time series analysis has been growing in importance and complexity over the last decade as the amount of stored data has increased exponentially. In this scenario, new data mining techniques have been continuously developed to deal with the situation. In this work, we propose to study time series based on their topological characteristics, observed on a complex network generated from the time series data. Specifically, the aim of the proposed model is to create a trend detection algorithm for stochastic time series based on community detection and network metrics. The proposed model presents some advantages over traditional time series analysis, such as an adaptive number of classes with measurable strength and better noise absorption. Experimental results on artificial and real datasets show that the proposed method is able to classify the time series into local and global patterns, improving the predictability of the series when using machine learning methods.
Flocon-Cholet, Joachim. "Classification audio sous contrainte de faible latence". Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S030/document.
This thesis focuses on audio classification under low-latency constraints. Audio classification has been widely studied for the past few years; however, a large majority of the existing work presents classification systems that are not subject to temporal constraints: the audio signal can be scanned freely in order to gather the information needed to make the decision (in that case, we may refer to offline classification). Here, we consider audio classification in the telecommunication domain. The working conditions are now more severe: algorithms work in real time, and the analysis and processing steps are operated on the fly, as the signal is transmitted. Hence, the audio classification step has to meet real-time constraints, which can modify its behaviour in different ways: only the current and past observations of the signal are available and, despite this, the classification system has to remain reliable and reactive. Thus, the first question that arises is: what classification strategy can we adopt in order to tackle the real-time constraints? In the literature, we can find two main approaches: frame-level classification and segment-level classification. In frame-level classification, the decision is made using only the information extracted from the current audio frame. In segment-level classification, we exploit short-term information using data computed from the current frame and a few past frames. The data fusion here is obtained through temporal feature integration, which consists of deriving relevant information from the temporal evolution of the audio features. Based on that, there are several questions that need to be answered. What are the limits of these two classification frameworks? Can frame-level and segment-level classification be used efficiently for any classification task? Is it possible to obtain good performance with these approaches?
Which classification framework may lead to the best trade-off between accuracy and reactivity? Furthermore, for the segment-level classification framework, the temporal feature integration process is mainly based on statistical models, but would it be possible to propose other methods? Throughout this thesis, we investigate this subject by working on several concrete case studies. First, we contribute to the development of a novel audio algorithm dedicated to audio protection. The purpose of this algorithm is to detect and suppress, very quickly, sounds that are potentially dangerous for the listener. Our method, which relies on three proposed features, shows a high detection rate and a low false alarm rate in many use cases. Then, we focus on temporal feature integration in a low-latency framework. To that end, we propose and evaluate several methodologies for the use of temporal integration that lead to a good compromise between performance and reactivity. Finally, we propose a novel approach that exploits the temporal evolution of the features. This approach is based on the use of a symbolic representation that can capture the temporal structure of the features. The idea is thus to find temporal patterns that are specific to each audio class. The experiments performed with this approach show promising results.
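The segment-level idea described in this abstract (summarising the current frame and a few past frames, never future ones) can be illustrated with a small causal sliding window. This is a sketch under assumptions: the class name is hypothetical, and the mean and variance are just one common statistical choice for temporal feature integration, not necessarily the one evaluated in the thesis.

```python
from collections import deque
from statistics import fmean, pvariance


class TemporalIntegrator:
    """Summarise the last `n_frames` feature vectors with per-feature mean and variance."""

    def __init__(self, n_frames):
        # Only the current and past frames are kept, so the summary is causal.
        self.history = deque(maxlen=n_frames)

    def push(self, frame_features):
        """Add the newest frame and return [means..., variances...] over the window."""
        self.history.append(list(frame_features))
        columns = list(zip(*self.history))
        means = [fmean(c) for c in columns]
        variances = [pvariance(c) for c in columns]
        return means + variances


integrator = TemporalIntegrator(n_frames=2)
first = integrator.push([1.0, 10.0])   # window holds one frame
second = integrator.push([3.0, 10.0])  # window holds two frames
```

The window length is the latency knob: a longer window gives more stable statistics but makes the classifier slower to react to a change in the signal.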
Do, Cao Tri. "Apprentissage de métrique temporelle multi-modale et multi-échelle pour la classification robuste de séries temporelles par plus proches voisins". Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAM028/document.
The definition of a metric between time series is inherent to several data analysis and mining tasks, including clustering, classification and forecasting. Time series data naturally present several characteristics, called modalities, covering their amplitude, behavior or frequency spectrum, which may be expressed with varying delays and at different temporal granularities and localizations, exhibited globally or locally. Combining several modalities at multiple temporal scales to learn a holistic metric is a key challenge for many real temporal data applications. This PhD thesis proposes a Multi-modal and Multi-scale Temporal Metric Learning (M2TML) approach for robust time series nearest neighbors classification. The solution is based on the embedding of pairs of time series into a pairwise dissimilarity space, in which a large margin optimization process is performed to learn the metric. The M2TML solution is proposed for both linear and nonlinear contexts, and is studied for different regularizers. A sparse and interpretable variant of the solution shows the ability of the learned temporal metric to accurately localize discriminative modalities as well as their temporal scales. A wide range of 30 public and challenging datasets, encompassing images, traces and ECG data, that are linearly or nonlinearly separable, are used to show the efficiency and the potential of M2TML for time series nearest neighbors classification.
Giusti, Rafael. "Classificação de séries temporais utilizando diferentes representações de dados e ensembles". Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-05122017-170029/.
Temporal data are ubiquitous in nearly all areas of human knowledge. The research field known as machine learning has contributed to temporal data mining with algorithms for classification, clustering, anomaly or exception detection, and motif detection, among others. These algorithms often rely on a distance function that must be capable of expressing a similarity concept among the data. One of the most important classification models, the 1-NN, employs a distance function when comparing a time series of interest against a reference set, and assigns to the former the label of the most similar reference time series. There are, however, several domains in which the temporal data are insufficient to characterize neighbors according to the concepts associated with the classes. One possible approach to this problem is to transform the time series into a representation domain in which the attributes that are meaningful for the classifier are more clearly expressed. For instance, a time series may be decomposed into periodic components of different frequency and amplitude values. For several applications, those components are much more meaningful in discriminating the classes than the temporal evolution of the original observations. In this work, we employ a diversity of representations and distance functions for the classification of time series. By choosing a data representation that is more suitable for expressing the discriminating characteristics of the domain, we are able to achieve classifications that are more faithful to the target concept. With this goal in mind, we promote a study of time series representation domains, and we evaluate how such domains can provide alternative decision spaces. Different models of the 1-NN classifier are evaluated both in isolation and combined in classification ensembles in order to construct more robust classifiers.
We also use distance functions and alternative representation domains in order to extract nontemporal attributes, known as distance features. Distance features reflect neighborhood concepts of the instances to the training samples, and they may be used to induce classification models which are typically not as efficient when trained with the original time series observations. We show that distance features allow for classification results compatible with the state-of-the-art.
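The 1-NN rule and the effect of changing the representation domain, both described in this abstract, can be shown with a toy example. This is a minimal sketch under assumptions: the Euclidean distance and the first-difference representation are illustrative choices made here, not the specific distances or representations studied in the thesis, and all names are hypothetical.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def one_nn(query, references, labels, distance=euclidean):
    """Return the label of the reference series closest to `query`."""
    best = min(range(len(references)), key=lambda i: distance(query, references[i]))
    return labels[best]


def first_difference(series):
    """Alternative representation: changes between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]


references = [[0, 1, 2, 3], [9, 8, 7, 6]]
labels = ["up", "down"]
query = [5, 6, 7, 8]  # rising series, but its values are closer to the "down" reference

raw_label = one_nn(query, references, labels)
diff_label = one_nn(
    first_difference(query),
    [first_difference(r) for r in references],
    labels,
)
```

In the raw time domain the query is assigned to the falling reference because of its offset, whereas in the difference domain the trend dominates and the rising reference is the nearest neighbor.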
Dachraoui, Asma. "Cost-Sensitive Early classification of Time Series". Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLA002/document.
Early classification of time series is becoming an increasingly valuable task for assisting decision making in many application domains. In this setting, information can be gained by waiting for more evidence to arrive, thus helping to make better decisions that incur lower misclassification costs; meanwhile, however, the cost associated with delaying the decision generally increases, rendering the decision less attractive. Making predictions that are both early and accurate then requires solving an optimization problem combining two types of competing costs. This thesis introduces a new general framework for the early classification of time series. Unlike classical approaches, which implicitly assume that misclassification errors are equally costly and that the cost of delaying the decision is constant over time, we cast the problem as a cost-sensitive online decision making problem in which delaying the decision is costly. We then propose a new formal criterion, along with two approaches that estimate the optimal decision time for a new incoming, yet incomplete, time series. In particular, they capture the evolutions of typical complete time series in the training set thanks to a segmentation technique that forms meaningful groups, and leverage this complete information to estimate the costs for all future time steps where data points are still missing. These approaches are interesting in two ways: (i) they estimate, online, the earliest time in the future where a minimization of the criterion can be expected. They thus go beyond the classical approaches that myopically decide at each time step whether to make a decision or to postpone the call by one more time step, and (ii) they are adaptive, in that the properties of the incoming time series are taken into account to decide when is the optimal time to output a prediction.
Results of extensive experiments on synthetic and real data sets show that both approaches successfully meet the behaviors expected from early classification systems.
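The trade-off between the two competing costs in this abstract can be made concrete with a deliberately simplified sketch. It assumes that an estimate of the error probability at each future time step is already available (the thesis obtains its estimates from groups of training series, which is not reproduced here), and it uses a single misclassification cost; the function and parameter names are hypothetical.

```python
def best_decision_time(error_rates, misclassification_cost, delay_cost):
    """Pick the time step minimising expected misclassification cost plus delay cost.

    error_rates[t] is an (assumed given) estimate of the probability of a wrong
    prediction if the decision is taken at step t; delay_cost(t) is the cost of
    having waited until step t.
    """
    costs = [
        p * misclassification_cost + delay_cost(t)
        for t, p in enumerate(error_rates)
    ]
    t_best = min(range(len(costs)), key=costs.__getitem__)
    return t_best, costs[t_best]


# Errors shrink as more of the series is observed, while waiting gets more expensive.
t_star, expected_cost = best_decision_time(
    error_rates=[0.5, 0.3, 0.1, 0.05],
    misclassification_cost=10.0,
    delay_cost=lambda t: 1.0 * t,
)
```

Scanning all future steps at once, rather than only asking "decide now or wait one more step", is what distinguishes this non-myopic view from the classical approaches mentioned in the abstract.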
Mousheimish, Raef. "Combinaison de l’Internet des objets, du traitement d’évènements complexes et de la classification de séries temporelles pour une gestion proactive de processus métier". Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLV073/document.
The Internet of Things is at the core of smart industrial processes thanks to its capacity for event detection from data conveyed by sensors. However, much remains to be done to make the most of this recent technology and make it scale. This thesis aims at filling the gap between the massive data flow collected by sensors and its effective exploitation in business process management. It proposes a global approach, which combines stream data processing, supervised learning and/or the use of complex event processing rules that make it possible to predict (and thereby avoid) undesirable events, and finally business process management extended to these complex rules. The scientific contributions of this thesis lie in several topics: making the business process more intelligent and more dynamic; automation of complex event processing by learning the rules; and last but not least, data mining for multivariate time series by early prediction of risks. The target application of this thesis is the instrumented transportation of artworks.
Kallas, Maya. "Méthodes à noyaux en reconnaissance de formes, prédiction et classification : applications aux biosignaux". Troyes, 2012. http://www.theses.fr/2012TROY0026.
The proliferation of kernel methods rests essentially on the kernel trick, which induces an implicit nonlinear transformation at reduced computational cost. Still, the inverse transformation is often necessary. Solving this so-called pre-image problem opens new fields of application for these methods. The main purpose of this thesis is to show that recent advances in statistical learning theory provide relevant solutions to several issues raised in signal and image processing. The first part focuses on the pre-image problem, and on solutions with constraints imposed by physiology. Non-negativity is probably the most commonly stated constraint when dealing with natural signals and images. Non-negativity constraints on the result, as well as on the additivity of the contributions, are studied. The second part focuses on time series analysis according to a predictive approach. Autoregressive models are developed in the transformed space, while the prediction requires solving the pre-image problem. Two kernel-based predictive models are considered: the first is derived by solving a least-squares problem, and the second by providing the adequate Yule-Walker equations. The last part deals with the classification of electrocardiograms, in order to detect anomalies. Detection and multi-class classification are explored in the light of support vector machines and self-organizing maps.
Silva, Diego Furtado. "Classificação de séries temporais por similaridade e extração de atributos com aplicação na identificação automática de insetos". Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-28042014-142456/.
One of the major challenges in data mining is the integration of temporal data into its process. There are a number of emerging applications that involve temporal data, including fraud detection in credit card transactions and phone calls, intrusion detection in computer systems, the prediction of secondary structures of proteins, the analysis of data from sensors, and many others. In this work, our main interest is the classification of time series that represent audio signals, and specifically an application for classifying signals of insects collected from an optical sensor, which should count and classify insects automatically. Although these signals are optically collected, they resemble audio signals. The objective of this research is to compare classification methods based on similarity and on feature extraction in the context of insect classification. For this purpose, we used the main classification methods for audio signals, which have been proposed for problems such as musical instrument, speech and animal species recognition. This work shows that, in general, the approach based on feature extraction is more accurate than classification by similarity. More specifically, the best results are obtained with mel-frequency cepstrum coefficients. This work also presents significant contributions to other applications, also related to the analysis of time series and audio signals by similarity and feature extraction.
Masse, Antoine. "Développement et automatisation de méthodes de classification à partir de séries temporelles d'images de télédétection - Application aux changements d'occupation des sols et à l'estimation du bilan carbone". Phd thesis, Université Paul Sabatier - Toulouse III, 2013. http://tel.archives-ouvertes.fr/tel-00921853.
Breton, Marc. "Application de méthodes de classification par séries temporelles au diagnostic médical et à la détection de changements statistiques et étude de la robustesse". Ecole Centrale de Lille, 2004. http://www.theses.fr/2004ECLI0005.
Bleakley, Kevin. "Quelques contributions à l'analyse statistique et à la classification des graphes et des courbes : applications à l'immunobiologie et à la reconstruction des réseaux biologiques". Montpellier 2, 2007. http://www.theses.fr/2007MON20209.
This thesis proposes a set of results in the domain of statistical learning and supervised classification, both from a theoretical and an algorithmic point of view, along with several real-world applications. The thesis is divided into two independent research projects. The first project, essentially theoretical, involves studies into the reconstruction of biological networks, as well as the analysis and classification of time series. The second project presents the results of a statistical study undertaken in collaboration with immunobiologists from Montpellier University 2, in which a step-by-step analysis of the rearrangement of genes in the formation of T cell receptor junctions was carried out. Since this domain is very young, we first had to propose a system of notation for the biological variables of interest. We then developed statistical analysis methods aiming to better understand the physical processes involved in these rearrangements, which are at present poorly understood.
Goffinet, Étienne. "Clustering multi-blocs et visualisation analytique de données séquentielles massives issues de simulation du véhicule autonome". Thesis, Paris 13, 2021. http://www.theses.fr/2021PA131090.
The validation of advanced driving-assistance systems remains one of the biggest challenges car manufacturers must tackle to provide safe driverless cars. Reliable validation of these systems requires assessing the quality and consistency of their reactions across a broad spectrum of driving scenarios. In this context, large-scale simulation systems bypass the physical "on-track" limitations and produce large quantities of high-dimensional time series data. The challenge is to find valuable information in these multivariate unlabelled datasets, which may contain noisy, sometimes correlated or non-informative variables. This thesis proposes several model-based tools for univariate and multivariate time series clustering based on a dictionary approach or a Bayesian nonparametric framework. The objective is to automatically find relevant and natural groups of driving behaviors and, in the multivariate case, to perform model selection and dimension reduction of multivariate time series. The methods are tested on simulated datasets and applied to industrial use cases from Groupe Renault Coclustering
Derksen, Dawa. "Classification contextuelle de gros volumes de données d'imagerie satellitaire pour la production de cartes d'occupation des sols sur de grandes étendues". Thesis, Toulouse 3, 2019. http://www.theses.fr/2019TOU30290.
This work studies the application of supervised classification to the production of land cover maps using time series of satellite images at high spatial, spectral, and temporal resolutions. In this problem, certain classes, such as urban cover, depend more on the context of the pixel than on its content. The aim of this Ph.D. work is therefore to take the neighborhood of the pixel into account in order to improve the recognition rates of these classes. This research first questions the definition of the context and considers different possible shapes for it. The next step is to describe the context, that is, to create a representation or a model that allows the target classes to be recognized. The combinations of these two aspects are evaluated on two experimental data sets, one of Sentinel-2 images and the other of SPOT-7 images.
Melzi, Fateh. "Fouille de données pour l'extraction de profils d'usage et la prévision dans le domaine de l'énergie". Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1123/document.
Nowadays, countries are called upon to take measures aimed at a better rationalization of electricity resources with a view to sustainable development. Smart metering solutions have been implemented and now allow fine-grained reading of consumption. The massive spatio-temporal data collected can thus help to better understand consumption behaviors, forecast them and manage them precisely. The aim is to ensure "intelligent" use of resources to consume less and consume better, for example by reducing consumption peaks or by using renewable energy sources. The thesis work takes place in this context and aims to develop data mining tools in order to better understand electricity consumption behaviors and to predict solar energy production, thereby enabling intelligent energy management. The first part of the thesis focuses on the classification of typical electrical consumption behaviors at the scale of a building and then of a territory. In the first case, typical daily power consumption profiles were identified using the functional K-means algorithm and a Gaussian mixture model. On a territorial scale and in an unsupervised context, the aim is to identify typical electricity consumption profiles of residential users and to link these profiles to contextual variables and metadata collected on users. An extension of the classical Gaussian mixture model has been proposed. It allows exogenous variables such as the type of day (Saturday, Sunday, working day, ...) to be taken into account in the classification, thus leading to a parsimonious model. The proposed model was compared with classical models and applied to an Irish database including both electricity consumption data and user surveys. An analysis of the results over a monthly period made it possible to extract a reduced set of user groups that are homogeneous in terms of their electricity consumption behaviors.
We have also endeavoured to quantify the regularity of users in terms of consumption as well as the temporal evolution of their consumption behaviors during the year. These two aspects are indeed necessary to evaluate the potential for changing consumption behavior that a demand response policy set up by electricity suppliers would require (a shift in peak consumption, for example). The second part of the thesis concerns the forecasting of solar irradiance over two time horizons: short and medium term. To do this, several approaches have been developed, including autoregressive statistical approaches for modelling time series and machine learning approaches based on neural networks, random forests and support vector machines. In order to take advantage of the different models, a hybrid model combining them was proposed. An exhaustive evaluation of the different approaches was conducted on a large database covering four locations (Carpentras, Brasilia, Pamplona and Reunion Island), each characterized by a specific climate, together with weather parameters both measured and predicted using NWP (Numerical Weather Prediction) models. The results obtained showed that the hybrid model improves photovoltaic production forecasts for all locations.
Pelletier, Charlotte. "Cartographie de l'occupation des sols à partir de séries temporelles d'images satellitaires à hautes résolutions : identification et traitement des données mal étiquetées". Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30241/document.
Land surface monitoring is a key challenge for diverse applications such as environment, forestry, hydrology and geology. Such monitoring is particularly helpful for the management of territories and the prediction of climate trends. For this purpose, mapping approaches that employ satellite-based Earth observations at different spatial and temporal scales are used to obtain land surface characteristics. More precisely, supervised classification algorithms that exploit satellite data present many advantages compared to other mapping methods. In addition, the recent launches of new satellite constellations, Landsat-8 and Sentinel-2, enable the acquisition of satellite image time series at high spatial and spectral resolutions, which are of great interest for describing vegetation land cover. These satellite data open new perspectives, but also call into question the choice of classification algorithms and the choice of input data. In addition, training classification algorithms over large areas requires a substantial number of instances per land cover class to describe landscape variability. Accordingly, training data can be extracted from existing maps or specific existing databases, such as farmers' crop parcel declarations or government databases. When using these databases, the main drawbacks are the lack of accuracy and update problems due to a long production time. Unfortunately, the use of these imperfect training data leads to the presence of mislabeled training instances that may affect classification performance, and hence the quality of the produced land cover map. Taking into account the above challenges, this Ph.D. work aims at improving the classification of new satellite image time series at high resolutions. The work has been divided into two main parts. The first goal consists in studying different classification systems by evaluating two classification algorithms with several input datasets.
In addition, the stability and the robustness of the classification methods are discussed. The second goal deals with the errors contained in the training data. First, methods for the detection of mislabeled data are proposed and analyzed. Second, a filtering method is proposed to take the mislabeled data into account in the classification framework. The objective is to reduce the influence of mislabeled data on classification performance, and thus to improve the produced land cover map.
Conti, José Carlos 1966. "Eficácia de medidas de similaridade para a classificação de séries temporais associadas ao comportamento fenológico de plantas". [s.n.], 2013. http://repositorio.unicamp.br/jspui/handle/REPOSIP/267746.
Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Tecnologia
Summary: Phenology is the study of periodic natural phenomena and their relationship with climate. In recent years it has proved relevant as the simplest and most reliable indicator of the effects of climate change on plants and animals. In this context stands e-phenology, a multidisciplinary project involving research in computing and phenology. Its main features are: the use of new environmental monitoring technologies; the provision of models, methods and algorithms to support the management, integration and remote analysis of phenology data; and the creation of a protocol for a phenology monitoring program. From the computing point of view, the research seeks models, tools and techniques based on image processing, extracting and indexing features of images associated with different types of vegetation, and also focuses on data management and mining and on time series processing. Against this background, this work specifically aims to investigate the effectiveness of similarity measures for classifying time series of phenological phenomena characterized by feature vectors extracted from vegetation images. The computations were carried out on regions of vegetation images under different evaluation criteria: plant species, time of day and color channels. The results allow several lines of analysis but, overall, the Edit Distance with Real Penalty (ERP) distance measure achieved the highest accuracy, at 29.90%. Additionally, the results show that the early hours of the day and the late afternoon, probably because of lighting conditions, yield the highest accuracy rates for all analysis views.
Abstract: Phenology is the study of periodic natural phenomena and their relationship to climate. In recent years, it has gained importance as the simplest and most reliable indicator of the effects of climate change on plants and animals. In this context, we highlight e-phenology, a multidisciplinary research project in computer science and phenology. Its main characteristics are: the use of new technologies for environmental monitoring; the provision of models, methods and algorithms to support the management, integration and remote analysis of phenology data; and the creation of a protocol for a phenology monitoring program. From the computer science point of view, the e-phenology project has been dedicated to creating models, tools and techniques based on image processing algorithms, extracting and indexing image features associated with different types of vegetation, and implementing data mining algorithms for processing time series. The main goal of this work is to investigate the effectiveness of similarity measures for the classification of time series associated with phenological phenomena characterized by feature vectors extracted from images. The experiments considered different regions containing individuals of different species and different criteria such as plant species, time of day and color channels. The results show that the Edit Distance with Real Penalty (ERP) distance measure yields the highest accuracy. Additionally, the analyses show that the early morning and late afternoon, probably due to light conditions, yield the highest accuracy rates for all analysis views.
Master's degree
Technology and Innovation
Master of Technology
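The Edit Distance with Real Penalty (ERP) named in the abstract above can be sketched with the usual dynamic program, in which an unmatched sample is compared with a constant gap value. The code below is a minimal illustration and not the dissertation's implementation; the function names, the default gap of 0.0, and the 1-nearest-neighbour wrapper are assumptions.

```python
def erp_distance(x, y, gap=0.0):
    """Edit Distance with Real Penalty between two numeric sequences."""
    n, m = len(x), len(y)
    # d[i][j] holds the ERP distance between the prefixes x[:i] and y[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + abs(x[i - 1] - gap)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + abs(y[j - 1] - gap)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + abs(x[i - 1] - y[j - 1]),  # match both samples
                d[i - 1][j] + abs(x[i - 1] - gap),           # x[i-1] aligned to a gap
                d[i][j - 1] + abs(y[j - 1] - gap),           # y[j-1] aligned to a gap
            )
    return d[n][m]


def nearest_neighbour_label(query, train_series, train_labels, gap=0.0):
    """Assign the label of the training series closest to the query under ERP."""
    distances = [erp_distance(query, s, gap) for s in train_series]
    return train_labels[distances.index(min(distances))]
```

A similarity-based classifier of the kind compared in the dissertation can then be approximated by calling `nearest_neighbour_label` with a labelled set of reference series.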
Masse, Antoine. "Développement et automatisation de méthodes de classification à partir de séries temporelles d'images de télédétection : application aux changements d'occupation des sols et à l'estimation du bilan carbone". Phd thesis, Toulouse 3, 2013. http://thesesups.ups-tlse.fr/2106/.
As acquisition technology progresses, remote sensing data contain an ever-increasing amount of information. Future remote sensing projects such as Copernicus will provide a high temporal repeatability of acquisitions and will cover large geographical areas. As part of the Copernicus project, Sentinel-2 combines a large swath, frequent revisit (5 days), and systematic acquisition of all land surfaces at high spatial resolution and with a large number of spectral bands. The context of my research activities has involved the automation and improvement of classification processes for land use and land cover mapping in line with the characteristics of the new satellites. This research has focused on four main axes: selection of the input data for the classification processes; improvement of classification systems through the introduction of ancillary data; fusion of multi-sensor, multi-temporal and multi-spectral classification image results; and classification without ground truth data. These new methodologies have been validated on a wide range of available images: various sensors (optical: Landsat 5/7, Worldview-2, Formosat-2, Spot 2/4/5, Pleiades; and radar: Radarsat, Terrasar-X), various spatial resolutions (30 meters to 0.5 meters), various temporal repeatabilities (up to 46 images per year) and various geographical areas (agricultural area in Toulouse, France, Pyrenean mountains and arid areas in Morocco and Algeria). These methodologies are applicable to a wide range of thematic applications such as land cover mapping, carbon flux estimation and greenbelt mapping.
Soheily-Khah, Saeid. "Generalized k-means-based clustering for temporal data under time warp". Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAM064/document.
Temporal alignment of multiple time series is an important unresolved problem in many scientific disciplines. Major challenges for an accurate temporal alignment include determining and modeling the common and differential characteristics of classes of time series. This thesis is motivated by recent work on extending dynamic time warping (DTW) to align multiple time series in several applications, including speech recognition, curve matching, micro-array data analysis, temporal segmentation and human motion. However, these DTW-based approaches suffer from several limitations: 1) they address the problem of aligning two time series regardless of the remaining time series, 2) they weight the features of the multiple time series uniformly, 3) the time series are aligned globally, using all of the observations. The aim of this thesis is to explore a generalized dynamic time warping for time series clustering. This work first addresses the problem of prototype extraction, then the alignment of multiple and multidimensional time series.
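For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of standard pairwise dynamic time warping with an absolute-difference local cost. It illustrates only the classical two-series alignment (limitation 1 above), not the generalized variant developed in the thesis; the function name `dtw_distance` is an assumption.

```python
def dtw_distance(x, y):
    """Classical DTW cost between two non-empty numeric sequences."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment cost of x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y[j-1] is reused
                cost[i][j - 1],      # y advances, x[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because every sample must be matched to at least one sample of the other series, two series that differ only by local stretching, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, have a DTW cost of zero.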
Wagner, Nicolas. "Détection des modifications de l’organisation circadienne des activités des animaux en relation avec des états pré-pathologiques, un stress, ou un événement de reproduction". Thesis, Université Clermont Auvergne (2017-2020), 2020. http://www.theses.fr/2020CLFAC032.
Precision livestock farming consists of recording parameters on the animals or their environment using various sensors. In this thesis, the aim is to monitor the behaviour of dairy cows via a real-time localisation system. The data are collected as a sequence of values at regular intervals, a so-called time series. The problems associated with the use of sensors are the large amount of data generated and the quality of these data. Machine learning (ML) helps to alleviate these problems. The aim of this thesis is to detect abnormal cow behaviour. The working hypothesis, supported by the biological literature, is that the circadian rhythm of a cow's activity changes, at a very early stage, when the animal goes from a normal state to a state of disease, stress or a specific physiological stage (oestrus, farrowing). The detection of a behavioural anomaly would allow decisions to be taken more quickly on the farm. Time Series Classification (TSC) tools exist for this purpose. The problem with behavioural data is that the so-called normal behavioural pattern varies from cow to cow, day to day, farm to farm, season to season, and so on. Finding a normal pattern common to all cows is therefore impossible. However, most TSC tools rely on learning a global model and deciding whether a given behaviour is close to this model or not. This thesis is structured around two major contributions. The first is the development of a new TSC method, FBAT. It is based on Fourier transforms to identify a pattern of activity over 24 hours and compare it to another consecutive 24-hour period, in order to overcome the lack of a common normal pattern across cows. The second contribution is the use of fuzzy labels. Indeed, around the days considered abnormal, it is possible to define an uncertain zone in which the cow would be in an intermediate state.
We show that fuzzy logic improves results when labels are uncertain and we introduce a fuzzy variant of FBAT: F-FBAT
Hedhli, Ihsen. "Modèles de classification hiérarchiques d'images satellitaires multi-résolutions, multi-temporelles et multi-capteurs. Application aux désastres naturels". Thesis, Nice, 2016. http://www.theses.fr/2016NICE4006/document.
The capabilities to monitor the Earth's surface, notably in urban and built-up areas, for example in the framework of protection from environmental disasters such as floods or earthquakes, play important roles from multiple social, economic, and human viewpoints. In this framework, accurate and time-efficient classification methods are important tools required to support the rapid and reliable assessment of ground changes and damage induced by a disaster, in particular when an extensive area has been affected. Given the substantial amount and variety of data currently available from last-generation very-high-resolution (VHR) satellite missions such as Pléiades, COSMO-SkyMed, or RadarSat-2, the main methodological difficulty is to develop classifiers that are powerful and flexible enough to exploit the benefits of multiband, multiresolution, multi-date, and possibly multi-sensor input imagery. With the proposed approaches, multi-date/multi-sensor and multi-resolution fusion are based on explicit statistical modeling. The method combines a joint statistical model of multi-sensor and multi-temporal images through hierarchical Markov random field (MRF) modeling, leading to statistical supervised classification approaches. We have developed novel hierarchical Markov random field models, based on the marginal posterior modes (MPM) criterion, that support information extraction from multi-temporal and/or multi-sensor information and allow the joint supervised classification of multiple images taken over the same area at different times, from different sensors, and/or at different spatial resolutions. The developed methods have been experimentally validated with complex optical multispectral (Pléiades), X-band SAR (COSMO-SkyMed), and C-band SAR (RadarSat-2) imagery taken over the Haiti site.
Olteanu, Madalina. "Modèles à changements de régime : applications aux données financières". Phd thesis, Université Panthéon-Sorbonne - Paris I, 2006. http://tel.archives-ouvertes.fr/tel-00133132.
We propose to study these questions through two approaches. The first shows the weak consistency of a penalized maximum likelihood estimator under stationarity and weak dependence conditions. The assumptions introduced on the bracketing entropy of the class of generalized score functions are then verified in a linear Gaussian setting. The second approach, more empirical, stems from unsupervised classification methods and combines Kohonen maps with a hierarchical clustering for which a new dispersion measure based on the residual sum of squares is introduced.
Manabe, Victor Danilo 1986. "Metodologia para mapeamento da expansão de cana-de-açúcar no Estado de Mato Grosso por meio de séries temporais de NDVI/MODIS". [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/257105.
Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Agrícola
Summary: The increase in sugarcane production has generated much discussion about the sustainability of production and its direct influence on land use change, mainly in pasture and annual crop areas. The study of sugarcane dynamics bears directly on issues such as the composition of agricultural production, impacts on biodiversity, social and human development, and the definition of public policies. Vegetation indices, through image time series, have been used to map land use over large areas (states, countries or regions) using products from the MODerate resolution Imaging Spectroradiometer (MODIS) sensor. This work evaluated the performance of different time series filtering techniques and also performed automated detection of sugarcane areas and the main land uses for the years 2005, 2008 and 2012, and the consequent land use change, using NDVI/MODIS time series in the state of Mato Grosso. The NDVI from the MOD13Q1 and MYD13Q1 products of the MODIS sensor was used to identify areas of different land uses. First, the Savitzky-Golay, HANTS and Flat Bottom filters were evaluated individually and in the combinations Flat Bottom + HANTS and Flat Bottom + Savitzky-Golay, on data series from NDVI MODIS/Terra alone and together with NDVI MODIS/Aqua. The result was that using MODIS/Terra and MODIS/Aqua together brought a significant improvement in the classification result when combined with a time series filter, with Savitzky-Golay giving the best result in differentiating the targets.
In the automated identification and mapping of sugarcane areas and the other main land uses in the region (annual crops, pasture, cerrado and forest) for 2005, 2008 and 2012, the accuracy for sugarcane was 83%, 82% and 85% in 2005, 2008 and 2012, respectively, and the overall accuracy was 89%, 88% and 89% for the same years. Crossing the maps made it possible to analyze land use change to sugarcane. The certainty of the land use change when sugarcane was established in areas previously used for annual agriculture was 80% and 82% for the 2005 to 2008 and 2008 to 2012 comparisons, respectively. For previous use as pasture and cerrado, the values were 69% and 30%, respectively, for the change from 2005 to 2008, and 66% and 34%, respectively, for the change from 2008 to 2012. The land use change analysis showed pasture as the predominant use prior to sugarcane, followed by agriculture, with cerrado accounting for the remainder of the previous land use. Thus, the method for identifying land use change has an error that must be taken into account, but the trend it reveals is consistent.
Abstract: The increase in sugarcane production has generated discussion about the sustainability of production and its direct impact on land use change, especially in pasture and annual crop areas. The study of the dynamics of sugarcane has a direct impact on issues such as the composition of agricultural production, the impacts on biodiversity, social and human development and the definition of public policies. Vegetation indices derived from image time series have been used to map land use over large areas (states, countries or regions) using the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor. This study evaluated the performance of different time series smoothing techniques and also carried out automated detection of sugarcane areas and main land uses for the years 2005, 2008 and 2012, and the consequent land use change, using NDVI/MODIS time series in Mato Grosso state. The NDVI from the MOD13Q1 and MYD13Q1 products was used to identify areas of different land uses. First, the Savitzky-Golay, HANTS and Flat Bottom filters, individually and in the combinations Flat Bottom + HANTS and Flat Bottom + Savitzky-Golay, were applied to NDVI time series from MODIS/Terra alone and in conjunction with MODIS/Aqua. The result was that using MODIS/Terra and MODIS/Aqua together brought a significant improvement in the overall classification when combined with a time series smoothing filter, and Savitzky-Golay showed the best results in the differentiation of targets. In the mapping of sugarcane areas and other major land uses (annual crops, grassland, savanna and forest) for the years 2005, 2008 and 2012, the accuracy for sugarcane was 83%, 82% and 85% in 2005, 2008 and 2012, respectively, and the total accuracy was 89%, 88% and 89% for the same years. Crossing the maps made it possible to analyze the land use change to sugarcane.
The certainty of the land use change when sugarcane was established in areas previously used for annual agriculture was 80% and 82% for 2005 compared to 2008 and 2008 compared to 2012, respectively. For previous use as grassland and savanna, the values were 69% and 30%, respectively, for the change from 2005 to 2008, and 66% and 34%, respectively, for the change from 2008 to 2012. The land use change analysis showed a predominance of grazing areas as the main use prior to sugarcane, followed by agriculture, with savanna accounting for the remainder of the previous land use. Thus, the method for identifying land use change has an error that must be considered, but the trend appears consistently.
Master's degree
Sustainable Rural Planning and Development
Master of Agricultural Engineering
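To make the time series smoothing step in the abstract above more concrete, here is a minimal sketch of a Savitzky-Golay filter with a fixed 5-point window and a quadratic fit, whose convolution coefficients are (-3, 12, 17, 12, -3)/35. The window size, the unchanged edge samples, the function name `savgol5` and the NDVI values are all assumptions for illustration; the dissertation's actual filter settings are not stated in the abstract.

```python
# Standard Savitzky-Golay smoothing weights for a 5-point window and a
# quadratic (or cubic) local polynomial; they sum to 35.
SG_WEIGHTS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0


def savgol5(series):
    """Smooth a numeric sequence with a 5-point quadratic Savitzky-Golay filter.

    The two samples at each end are returned unchanged (a simplification;
    other edge treatments are possible).
    """
    values = list(series)
    n = len(values)
    smoothed = list(values)
    for i in range(2, n - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window)) / SG_NORM
    return smoothed


# Hypothetical NDVI profile with one abrupt drop, as a cloudy
# observation might produce.
ndvi = [0.30, 0.35, 0.42, 0.10, 0.55, 0.60, 0.58, 0.50]
ndvi_smooth = savgol5(ndvi)
```

Because the filter fits a local quadratic, it leaves constant, linear and quadratic trends unchanged while damping isolated spikes, which is why it is a common choice before classifying vegetation index profiles.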
Mure, Simon. "Classification non supervisée de données spatio-temporelles multidimensionnelles : Applications à l’imagerie". Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI130/document.
Due to the dramatic increase in longitudinal acquisitions over the past decades, such as video sequences, global positioning system (GPS) tracking or medical follow-up, many applications for time series data mining have been developed. Unsupervised time series data mining has thus become highly relevant, with the aim of automatically detecting and identifying similar temporal patterns between time series. In this work, we propose a new spatio-temporal filtering scheme based on the mean-shift procedure, a state-of-the-art approach in the field of image processing, which clusters multivariate spatio-temporal data. We also propose a hierarchical time series clustering algorithm based on the dynamic time warping measure that identifies similar but asynchronous temporal patterns. Our choices have been motivated by the need to analyse magnetic resonance images acquired from people affected by multiple sclerosis. The genetic and environmental factors triggering and governing the evolution of the disease, as well as the occurrence and evolution of individual lesions, are still mostly unknown and under intense investigation. Therefore, there is a strong need to develop new methods allowing automatic extraction and quantification of lesion characteristics. This has motivated our work on time series clustering methods, which are not yet widely used in image processing and allow image sequences to be processed without prior knowledge of the final results.
Al, Saleh Mohammed. "SPADAR : Situation-aware and proactive analytics for dynamic adaptation in real time". Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG060.
Radiation level is a serious concern that requires continuous monitoring, and many existing systems are designed to perform this task. The Radiation Early Warning System (REWS) is one of these systems; it monitors the gamma radiation level in the air. Such a system requires a high degree of manual intervention, depends entirely on experts' analysis, and has some shortcomings that can sometimes be risky. In this thesis, the RIMI (Refining Incoming Monitored Incidents) approach is introduced, which aims to improve this system by making it more autonomous while leaving the final decision to the experts. A new method is presented that helps make this system more intelligent by learning from the past incidents of each specific system.