
Dissertations / Theses on the topic 'Time series data management'


Consult the top 50 dissertations / theses for your research on the topic 'Time series data management.'


1

Matus Castillejos, Abel. "Management of Time Series Data." University of Canberra. Information Sciences & Engineering, 2006. http://erl.canberra.edu.au./public/adt-AUC20070111.095300.

Full text
Abstract:
Every day large volumes of data are collected in the form of time series. Time series are collections of events or observations, predominantly numeric in nature, sequentially recorded on a regular or irregular time basis. Time series are becoming increasingly important in nearly every organisation and industry, including banking, finance, telecommunication, and transportation. Banking institutions, for instance, rely on the analysis of time series for forecasting economic indices, elaborating financial market models, and registering international trade operations. More and more time series are being used in this type of investigation, and they are becoming a valuable resource in today's organisations. This thesis investigates and proposes solutions to some current and important issues in time series data management (TSDM), using Design Science Research Methodology. The thesis presents new models for mapping time series data to relational databases which optimise the use of disk space, can handle different time granularities and status attributes, and facilitate time series data manipulation in a commercial Relational Database Management System (RDBMS). These new models provide a good solution for current time series database applications with RDBMSs and are tested with a case study and a prototype using financial time series information. Also included is a temporal data model for illustrating time series data lifetime behaviour, based on a new set of time dimensions (confidentiality, definitiveness, validity, and maturity times) specially targeted at time series data and introduced to correctly represent the different statuses of time series data along a timeline. The proposed temporal data model gives a clear and accurate picture of the time series data lifecycle. Formal definitions of these time series dimensions are also presented. In addition, a time series grouping mechanism in an extensible commercial relational database system is defined, illustrated, and justified.
The extension consists of a new data type and its corresponding rich set of routines that support modelling and operating on time series information at a higher level of abstraction. It extends the capability of the database server to organise and manipulate time series in groups. Thus, this thesis presents a new data type, referred to as GroupTimeSeries, together with its corresponding architecture and supporting functions and operations. Implementation options for the GroupTimeSeries data type in relational technologies are also presented. Finally, a framework for TSDM is defined that is expressive enough to capture the main requirements of time series applications and the management of their data. The framework aims at providing initial domain know-how and requirements for time series data management, avoiding the impracticability of designing a TSDM system on paper from scratch. Many aspects of time series applications, including the way time series data are organised at the conceptual level, are addressed. The central abstractions of the proposed domain-specific framework are the notions of business sections, groups of time series, and the time series itself. The framework integrates a comprehensive specification of the structural and functional aspects of time series data management. A formal framework specification using conceptual graphs is also explored.
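The relational-mapping idea the abstract describes can be sketched in miniature. This is an illustrative sketch only, with invented table and column names; the thesis's actual models and the GroupTimeSeries data type are far richer:

```python
import sqlite3

# Hypothetical sketch of mapping grouped time series onto a relational
# schema with a time granularity and a status attribute. Table and column
# names are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ts_group (
        group_id    INTEGER,
        series_id   INTEGER,
        observed_at TEXT,   -- regular or irregular timestamps
        granularity TEXT,   -- e.g. 'daily', 'tick'
        status      TEXT,   -- e.g. 'provisional', 'definitive'
        value       REAL,
        PRIMARY KEY (group_id, series_id, observed_at)
    )
""")
rows = [
    (1, 10, "2006-01-02", "daily", "definitive", 101.5),
    (1, 10, "2006-01-03", "daily", "provisional", 102.1),
]
conn.executemany("INSERT INTO ts_group VALUES (?, ?, ?, ?, ?, ?)", rows)

# A group-level operation: the latest definitive value of each series in group 1.
latest = conn.execute("""
    SELECT series_id, MAX(observed_at), value
    FROM ts_group
    WHERE group_id = 1 AND status = 'definitive'
    GROUP BY series_id
""").fetchall()
print(latest)
```

The status column hints at how a lifetime dimension such as definitiveness could be represented alongside the values themselves.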
2

Siwela, Blessing. "Web-based management of time-series raster data." Master's thesis, University of Cape Town, 2010. http://hdl.handle.net/11427/6441.

Full text
Abstract:
Data discovery and data handling often presents serious challenges to organizations that manage huge archives of raster datasets such as those generated by satellite remote sensing. Satellite remote sensing produces a regular stream of raster datasets used in many applications including environmental and agricultural monitoring. This thesis presents a system architecture for the management of time-series GIS raster datasets. The architecture is then applied in a prototype implementation for a department that uses remote sensing data for agricultural monitoring. The architecture centres on three key components. The first is a metadatabase to hold metadata for the raster datasets, and an interface to manage the metadatabase and facilitate the search and discovery of raster metadata. The design of the metadatabase involved the examination of existing standards for geographic raster metadata and the determination of the metadata elements required for time-series raster data. The second component is an interactive tool for viewing the time-series raster data discovered via the metadatabase. The third component provides basic image analysis functionality typically required by users of time-series raster datasets. A prototype was implemented using open source software and following the Open Geospatial Consortium specifications for web map services (WMS) version 1.3.0. After implementation, an evaluation of the prototype was carried out by the target users from the RRSU (Regional Remote Sensing Unit) to assess the usability, the added value of the prototype and its impact on the work of the users. The evaluation showed that the prototype system was generally well received, since it allowed both the data managers and users of time-series datasets to save significant amounts of time in their work routines and it also offered some raster data analyses that are useful to a wider community of time-series raster data managers.
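To give a flavour of the WMS 1.3.0 interface the prototype follows, the sketch below builds a GetMap request with the optional TIME dimension used for time-series rasters. The endpoint and layer name are hypothetical; the parameter names follow the OGC WMS 1.3.0 specification:

```python
from urllib.parse import urlencode

# Sketch of the WMS 1.3.0 GetMap request a time-series raster viewer might
# issue. The base URL and layer name are invented; the parameters are the
# standard ones from the WMS 1.3.0 specification.
def getmap_url(base, layer, bbox, time, size=(512, 512)):
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": size[0],
        "HEIGHT": size[1],
        "FORMAT": "image/png",
        "TIME": time,  # which time slice of the raster series to render
    }
    return base + "?" + urlencode(params)

url = getmap_url("https://example.org/wms", "ndvi", (-35, 10, -20, 40), "2010-01-01")
print(url)
```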
3

Mousavi, Bamdad. "Scalable Stream Processing and Management for Time Series Data." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42295.

Full text
Abstract:
There has been an enormous growth in the generation of time series data in the past decade. This trend is driven by the widespread adoption of IoT technologies, the data generated by monitoring cloud computing resources, and cyber-physical systems. Although time series data have been a topic of discussion in the domain of data management for several decades, this recent growth has brought the topic to the forefront. Many of the time series management systems available today lack the features necessary to successfully manage and process the sheer amount of time series being generated. In this thesis we strive to examine the field and study the prior work in time series management. We then propose a large system capable of handling time series management end to end, from generation to consumption by the end user. Our system is composed of open-source data processing frameworks. It has the capability to collect time series data, perform stream processing over it, store it for immediate and future processing, and create the necessary visualizations. We present the implementation of the system and perform experiments to show its scalability in handling growing pipelines of incoming data from various sources.
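The collect / stream-process / store / consume stages described above can be caricatured in a few lines. In a real deployment each stage would be an open-source framework (a message broker, a stream processor, a time-series store); here, purely for illustration, a generator plays the broker and a dict plays the store, and all names and thresholds are invented:

```python
from collections import defaultdict
from statistics import mean

# Toy stand-in for an end-to-end time series pipeline. Every name and
# threshold here is illustrative, not from the thesis.
def sensor_stream():
    for t, v in enumerate([21.0, 21.4, 35.0, 21.2]):
        yield {"sensor": "s1", "t": t, "value": v}

store = defaultdict(list)
for event in sensor_stream():                  # ingestion
    if event["value"] < 30.0:                  # stream-side outlier filter
        store[event["sensor"]].append(event)   # storage

avg = mean(e["value"] for e in store["s1"])    # consumption / visualization
print(round(avg, 2))
```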
4

Romanazzi, Stefano. "Water Supply Network Management: Sensor Analysis using Google Cloud Dataflow." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.

Find full text
Abstract:
The growing field of IoT increases the amount of time series data produced every day. With such information overload it is necessary to promptly clean and process this information, extracting meaningful knowledge and avoiding raw data storage. Nowadays cloud infrastructures can absorb this processing demand by providing new models for defining data-parallel processing pipelines, such as the Apache Beam unified model, which evolved from Google Cloud Dataflow and the MapReduce paradigm. The projects of this thesis were implemented during a three-month internship at Injenia srl and follow this exact trail: externally acquired IoT data go through a cleansing and a processing phase in order to obtain data ready to feed neural networks. The sewerage project acquires signals from IoT sensors of a sewerage infrastructure and aims at predicting the signals' trends over close future periods. The aqueduct project acquires the same type of information from aqueduct plants and aims to reduce the false alarm rate of the telecontrol system. Given the good results of both projects, it can be concluded that the data processing phase has produced high-quality information, which is the main objective of this thesis.
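The kind of cleansing step described above might look like the following sketch: filling short gaps in an IoT signal before it is fed to a neural network. In the actual projects this would run inside an Apache Beam / Cloud Dataflow pipeline; this plain-Python version, which handles single-sample gaps only, merely shows the transformation itself:

```python
# Illustrative gap-filling for a sensor signal. None marks a missing
# reading; single-sample gaps are filled with the midpoint of the
# surrounding samples. This is a sketch, not the thesis's pipeline code.
def interpolate_gaps(values):
    out = list(values)
    for i, v in enumerate(out):
        if v is None:
            prev = next(out[j] for j in range(i - 1, -1, -1) if out[j] is not None)
            nxt = next(out[j] for j in range(i + 1, len(out)) if out[j] is not None)
            out[i] = (prev + nxt) / 2  # midpoint of the surrounding samples
    return out

signal = [1.0, None, 3.0, 4.0]
cleaned = interpolate_gaps(signal)
print(cleaned)
```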
5

Alvidrez, Carlos. "A systematic framework for preparing and enhancing structured data sets for time series analysis." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/100367.

Full text
Abstract:
Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, Engineering Systems Division, System Design and Management Program, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 216-217).
This thesis proposes a framework to systematically prepare and enhance structured data for time series analysis. It suggests the production of intermediate derived calculations, which aid in the analysis and rationalization of variation over time, to enhance the consistency and the efficiency of data analysis. This thesis was developed with the cooperation of a major international financial firm. The use of their actual historical financial credit risk data sets significantly aided this work by providing genuine feedback, validating specific results, and confirming the usefulness of the method. While illustrated through the use of credit risk data sets, the methodology this thesis presents is designed to be applied easily and transparently to structured data sets used for time series analysis.
by Carlos Alvidrez.
S.M. in Engineering and Management
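The "intermediate derived calculations" idea can be sketched minimally: for each period, derive absolute and relative change columns so variation over time can be analyzed consistently. The field names and sample series below are hypothetical, not from the thesis's credit risk data sets:

```python
# Minimal illustration of enhancing a structured series with derived
# change columns. All names and numbers are invented.
def enhance(series):
    derived = []
    for prev, curr in zip(series, series[1:]):
        derived.append({
            "value": curr,
            "delta": curr - prev,                 # absolute change
            "pct": (curr - prev) / prev * 100.0,  # relative change, %
        })
    return derived

exposure = [100.0, 110.0, 99.0]
derived = enhance(exposure)
print(derived)
```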
6

Battaglia, Bruno. "Studio e valutazione di database management system per la gestione di serie temporali." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/17270/.

Full text
Abstract:
This thesis focuses on time series and their management. After explaining what a time series is and presenting some use cases, the dissertation lists the families of DBMSs and the criteria by which to evaluate them. It then describes the model each DBMS implements and, after outlining it, moves on to the techniques used for managing and analysing time series. Next, it covers techniques for modelling a database capable of handling historical series, and all the DBMSs under examination are analysed against the criteria above. A comparison, partly in tabular form, is accompanied by a description intended to guide the reader to a quick understanding of the differences, strengths, and weaknesses of each TSDB. Finally, the conclusions that seemed most appropriate in light of the work carried out are drawn, key points on which to centre future work are identified, and further lines of work are proposed that could not be pursued for lack of additional time and of availability of the software complete with all its features.
7

Gogolou, Anna. "Iterative and Expressive Querying for Big Data Series." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS415.

Full text
Abstract:
Time series are becoming ubiquitous in modern life, and given their sizes, their analysis is becoming increasingly challenging. Time series analysis involves tasks such as pattern matching, anomaly detection, frequent pattern identification, and time series clustering or classification. These tasks rely on the notion of time series similarity. The data-mining community has proposed several techniques, including many similarity measures (or distance measure algorithms), for calculating the distance between two time series, as well as corresponding indexing techniques and algorithms, in order to address the scalability challenges during similarity search.To effectively support their tasks, analysts need interactive visual analytics systems that combine extremely fast computation, expressive querying interfaces, and powerful visualization tools. We identified two main challenges when considering the creation of such systems: (1) similarity perception and (2) progressive similarity search. The former deals with how people perceive similar patterns and what the role of visualization is in time series similarity perception. The latter is about how fast we can give back to users updates of progressive similarity search results and how good they are, when system response times are long and do not support real-time analytics in large data series collections. The goal of this thesis, that lies at the intersection of Databases and Human-Computer Interaction, is to answer and give solutions to the above challenges.In the first part of the thesis, we studied whether different visual representations (Line Charts, Horizon Graphs, and Color Fields) alter time series similarity perception. We tried to understand if automatic similarity search results are perceived in a similar manner, irrespective of the visualization technique; and if what people perceive as similar with each visualization aligns with different automatic similarity measures and their similarity constraints. 
Our findings indicate that Horizon Graphs promote as invariant local variations in temporal position or speed, and as a result they align with measures that allow variations in temporal shifting or scaling (i.e., dynamic time warping). On the other hand, Horizon Graphs do not align with measures that allow amplitude and y-offset variations (i.e., measures based on z-normalization), because they exaggerate these differences, while the inverse seems to be the case for Line Charts and Color Fields. Overall, our work indicates that the choice of visualization affects what temporal patterns humans consider as similar, i.e., the notion of similarity in time series is visualization-dependent. In the second part of the thesis, we focused on progressive similarity search in large data series collections. We investigated how fast first approximate and then progressively updated answers are detected while we execute similarity search queries. Our findings indicate that there is a gap between the time the final answer is found and the time when the search algorithm terminates, resulting in inflated waiting times without any improvement. Computing probabilistic estimates of the final answer could help users decide when to stop the search process. We developed and experimentally evaluated, using benchmarks, a new probabilistic learning-based method that computes quality guarantees (error bounds) for progressive k-Nearest Neighbour (k-NN) similarity search results. Our approach learns from a set of queries and builds prediction models based on two observations: (i) similar queries have similar answers; and (ii) progressive best-so-far (bsf) answers returned by the state-of-the-art data series indexes are good predictors of the final k-NN answer. We provide both initial and incrementally improved estimates of the final answer
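The z-normalization invariance discussed above can be shown in a few lines: it removes amplitude and y-offset differences, so two series that differ only by a vertical scale and shift become identical after normalization. The sample series are invented:

```python
from statistics import mean, pstdev

# Sketch of z-normalization: subtract the mean and divide by the
# (population) standard deviation.
def znorm(xs):
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

a = [1.0, 2.0, 3.0, 4.0]
b = [15.0, 25.0, 35.0, 45.0]  # same shape, different amplitude and offset
identical = all(abs(x - y) < 1e-9 for x, y in zip(znorm(a), znorm(b)))
print(identical)
```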
8

Waitayangkoon, Chalermpol. "Factors Affecting the Efficient Performance of the Thai State Railway Authority: a Time-Series Data Analysis." Thesis, University of North Texas, 1988. https://digital.library.unt.edu/ark:/67531/metadc330635/.

Full text
Abstract:
The Thai State Railway Authority (RSR) is a public enterprise in Thailand. As an organization its performance is subject to the argument of contingency theorists that operating efficiency is dependent upon various factors both in the internal and external environments of the enterprise. Most of the internal factors are those that organization theorists in the developed world have identified such as goals and objectives, resources, and organization structures. Meanwhile, external factors such as political, economic and social conditions of the society are regarded as indirect factors that have less importance than do the internal factors. Scholars of the developing world have argued that political, social and economic conditions in the society are as important as internal factors. These factors may have a very significant influence on the enterprises and on the society as a whole. Consequently, public enterprises in developing countries always encounter the same problem of operating inefficiency. The RSR is selected as a case study because of its advantages over the other public enterprises in Thailand in terms of size of operation, length of service, and data availability. For the purpose of this project, data are collected from 1960 to 1984 for longitudinal analysis. The methods of analysis are divided into two major sections: simple regression testing and multiple regression testing. The principal component technique is used in both testings to reduce variables to a smaller number for further analysis. The simple regression tests yielded mixed results, but the multiple regression tests resulted in significant relationships. The three new factors derived from the factor analysis technique were labeled as "the organizational pressures," "the socio-political downturn," and "the public criticisms." They explained 84% of all the variance of operating efficiency. 
The other 16% was attributable to other factors, including management skills, which were excluded from this analysis.
9

Winn, David. "An analysis of neural networks and time series techniques for demand forecasting." Thesis, Rhodes University, 2007. http://hdl.handle.net/10962/d1004362.

Full text
Abstract:
This research examines the plausibility of developing demand forecasting techniques which are consistently and accurately able to predict demand. Time Series Techniques and Artificial Neural Networks are both investigated. Deodorant sales in South Africa are specifically studied in this thesis. Marketing techniques which are used to influence consumer buyer behaviour are considered, and these factors are integrated into the forecasting models wherever possible. The results of this research suggest that Artificial Neural Networks can be developed which consistently outperform industry forecasting targets as well as Time Series forecasts, suggesting that producers could reduce costs by adopting this more effective method.
10

Jin, Chao. "Methodology on Exact Extraction of Time Series Features for Robust Prognostics and Health Monitoring." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504795992214385.

Full text
11

Li, Yang. "The time-series approaches in forecasting one-step-ahead cash-flow data of mining companies listed on the Johannesburg Stock Exchange." Thesis, University of the Western Cape, 2007. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_1552_1254470577.

Full text
Abstract:

Previous research pertaining to the financial aspect of the mining industry has focused predominantly on mining products' values and the companies' sensitivity to exchange rates. There has been very little empirical research carried out in the field of the statistical behaviour of mining companies' cash flow data. This paper aimed to study the time-series behaviour of the cash flow data series of JSE-listed mining companies.

12

Zeng, Chunqiu. "Large Scale Data Mining for IT Service Management." FIU Digital Commons, 2016. http://digitalcommons.fiu.edu/etd/3051.

Full text
Abstract:
More than ever, businesses heavily rely on IT service delivery to meet their current and frequently changing business requirements. Optimizing the quality of service delivery improves customer satisfaction and continues to be a critical driver for business growth. The routine maintenance procedure plays a key role in IT service management, which typically involves problem detection, determination, and resolution for the service infrastructure. Many IT service providers adopt partial automation for incident diagnosis and resolution, where the operations of the system administrators and the automation are intertwined. Often the system administrators' role is limited to helping triage tickets to the processing teams for problem resolution. The processing teams are responsible for performing a complex root cause analysis, given the system statistics, event, and ticket data. The large scale of system statistics, event, and ticket data aggravates the burden of problem diagnosis on both the system administrators and the processing teams during routine maintenance procedures. Alleviating the human effort involved in IT service management calls for intelligent and efficient solutions that maximize the automation of routine maintenance procedures. Three research directions are identified and considered helpful for IT service management optimization: (1) automatically determine problem categories according to the symptom description in a ticket; (2) intelligently discover interesting temporal patterns from system events; (3) instantly identify temporal dependencies among system performance statistics data. Provided with ticket, event, and system performance statistics data, the three directions can be effectively addressed with a data-driven solution. The quality of IT service delivery can thus be improved in an efficient and effective way. The dissertation addresses the research topics outlined above.
Concretely, we design and develop data-driven solutions to help system administrators better manage the system and alleviate the human efforts involved in IT Service management, including (1) a knowledge guided hierarchical multi-label classification method for IT problem category determination based on both the symptom description in a ticket and the domain knowledge from the system administrators; (2) an efficient expectation maximization approach for temporal event pattern discovery based on a parametric model; (3) an online inference on time-varying temporal dependency discovery from large-scale time series data.
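A toy illustration of the temporal dependency idea in direction (3): choose the lag that maximizes the lagged cross-correlation between two performance metrics. The dissertation's online, time-varying method is far more sophisticated; the metric names and data here are invented:

```python
from statistics import mean

# Pearson correlation between x[t] and y[t + lag], a crude probe for a
# lagged dependency between two metrics. All data are illustrative.
def xcorr(x, y, lag):
    xs, ys = x[:len(x) - lag], y[lag:]
    mx, my = mean(xs), mean(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = (sum((a - mx) ** 2 for a in xs) * sum((b - my) ** 2 for b in ys)) ** 0.5
    return num / den

cpu = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 2.0, 7.0]
latency = [0.0] + cpu[:-1]   # latency follows cpu with a one-step delay
best_lag = max(range(3), key=lambda lag: xcorr(cpu, latency, lag))
print(best_lag)
```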
13

Muhammad Fuad, Muhammad Marwan. "Similarity Search in High-dimensional Spaces with Applications to Time Series Data Mining and Information Retrieval." PhD thesis, Université de Bretagne Sud, 2011. http://tel.archives-ouvertes.fr/tel-00619953.

Full text
Abstract:
We present one of the main problems in information retrieval and data mining: the similarity search problem. We approach this problem from a mainly metric perspective. We focus on time series data, but our general objective is to develop methods and algorithms that can be extended to other data types. We study new methods to handle the similarity search problem in high-dimensional spaces. The new methods and algorithms we introduce are extensively tested, and they show superiority over the other methods and algorithms in the literature.
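The metric perspective on similarity search rests on the triangle inequality, which gives the lower bound |d(q,p) - d(p,c)| ≤ d(q,c): a search can skip any candidate whose lower bound already exceeds the best distance found so far. The pivot, data, and names in this sketch are illustrative:

```python
import math

# Pivot-based pruning for nearest-neighbour search in a metric space.
# Distances from the pivot to every candidate are assumed precomputed.
def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nn_search(query, pivot, candidates, dist_to_pivot):
    d_qp = euclid(query, pivot)
    best, best_d, pruned = None, float("inf"), 0
    for cand, d_pc in zip(candidates, dist_to_pivot):
        if abs(d_qp - d_pc) >= best_d:  # lower bound: cannot beat current best
            pruned += 1
            continue
        d = euclid(query, cand)
        if d < best_d:
            best, best_d = cand, d
    return best, pruned

pivot = (0.0, 0.0)
cands = [(1.0, 1.0), (5.0, 5.0), (100.0, 100.0)]
pre = [euclid(pivot, c) for c in cands]  # computed offline, once
best, pruned = nn_search((1.2, 1.0), pivot, cands, pre)
print(best, pruned)
```

Here two of the three candidate distances are never computed, which is the scalability benefit such methods exploit.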
14

Dalderop, Jeroen Wilhelmus Paulus. "Essays on nonparametric estimation of asset pricing models." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/277966.

Full text
Abstract:
This thesis studies the use of nonparametric econometric methods to reconcile the empirical behaviour of financial asset prices with theoretical valuation models. The confrontation of economic theory with asset price data requires various functional form assumptions about the preferences and beliefs of investors. Nonparametric methods provide a flexible class of models that can prevent misspecification of agents’ utility functions or the distribution of asset returns. Evidence for potential nonlinearity is seen in the presence of non-Gaussian distributions and excessive volatility of stock returns, or non-monotonic stochastic discount factors in option prices. More robust model specifications are therefore likely to contribute to risk management and return predictability, and lend credibility to economists’ assertions. Each of the chapters in this thesis relaxes certain functional form assumptions that seem most important for understanding certain asset price data. Chapter 1 focuses on the state-price density in option prices, which confounds the nonlinearity in both the preferences and the beliefs of investors. To understand both sources of nonlinearity in equity prices, Chapter 2 introduces a semiparametric generalization of the standard representative agent consumption-based asset pricing model. Chapter 3 returns to option prices to understand the relative importance of changes in the distribution of returns and in the shape of the pricing kernel. More specifically, Chapter 1 studies the use of noisy high-frequency data to estimate the time-varying state-price density implicit in European option prices. A dynamic kernel estimator of the conditional pricing function and its derivatives is proposed that can be used for model-free risk measurement. Infill asymptotic theory is derived that applies when the pricing function is either smoothly varying or driven by diffusive state variables. 
Trading times and moneyness levels are modelled by marked point processes to capture intraday trading patterns. A simulation study investigates the performance of the estimator using an iterated plug-in bandwidth in various scenarios. Empirical results using S&P 500 E-mini European option quotes finds significant time-variation at intraday frequencies. An application towards delta- and minimum variance-hedging further illustrates the use of the estimator. Chapter 2 proposes a semiparametric asset pricing model to measure how consumption and dividend policies depend on unobserved state variables, such as economic uncertainty and risk aversion. Under a flexible specification of the stochastic discount factor, the state variables are recovered from cross-sections of asset prices and volatility proxies, and the shape of the policy functions is identified from the pricing functions. The model leads to closed-form price-dividend ratios under polynomial approximations of the unknown functions and affine state variable dynamics. In the empirical application uncertainty and risk aversion are separately identified from size-sorted stock portfolios exploiting the heterogeneous impact of uncertainty on dividend policy across small and large firms. I find an asymmetric and convex response in consumption (-) and dividend growth (+) towards uncertainty shocks, which together with moderate uncertainty aversion, can generate large leverage effects and divergence between macroeconomic and stock market volatility. Chapter 3 studies the nonparametric identification and estimation of projected pricing kernels implicit in the pricing of options, the underlying asset, and a riskfree bond. The sieve minimum-distance estimator based on conditional moment restrictions avoids the need to compute ratios of estimated risk-neutral and physical densities, and leads to stable estimates even in regions with low probability mass. 
The conditional empirical likelihood (CEL) variant of the estimator is used to extract implied densities that satisfy the pricing restrictions while incorporating the forward-looking information from option prices. Moreover, I introduce density combinations in the CEL framework to measure the relative importance of changes in the physical return distribution and in the pricing kernel. The nonlinear dynamic pricing kernels can be used to understand return predictability, and provide model-free quantities that can be compared against those implied by structural asset pricing models.
APA, Harvard, Vancouver, ISO, and other styles
15

Vander, Elst Harry-Paul. "Measuring, Modeling, and Forecasting Volatility and Correlations from High-Frequency Data." Doctoral thesis, Universite Libre de Bruxelles, 2016. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/228960.

Full text
Abstract:
This dissertation contains four essays that all share a common purpose: developing new methodologies to exploit the potential of high-frequency data for the measurement, modeling and forecasting of the volatility and correlations of financial assets. The first two chapters provide useful tools for univariate applications while the last two chapters develop multivariate methodologies. In chapter 1, we introduce a new class of univariate volatility models named FloGARCH models. FloGARCH models provide a parsimonious joint model for low frequency returns and realized measures, and are sufficiently flexible to capture long memory as well as asymmetries related to leverage effects. We analyze the performance of the models in a realistic numerical study and on the basis of a data set composed of 65 equities. Using more than 10 years of high-frequency transactions, we document significant statistical gains related to the FloGARCH models in terms of in-sample fit, out-of-sample fit and forecasting accuracy compared to classical and Realized GARCH models. In chapter 2, using 12 years of high-frequency transactions for 55 U.S. stocks, we argue that combining low-frequency exogenous economic indicators with high-frequency financial data improves the ability of conditionally heteroskedastic models to forecast the volatility of returns, their full multi-step ahead conditional distribution and the multi-period Value-at-Risk. Using a refined version of the Realized LGARCH model allowing for a time-varying intercept and implemented with realized kernels, we document that nominal corporate profits and term spreads have strong long-run predictive ability and generate accurate risk-measure forecasts over long horizons. The results are based on several loss functions and tests, including the Model Confidence Set. Chapter 3 is a joint work with David Veredas.
We study the class of disentangled realized estimators for the integrated covariance matrix of Brownian semimartingales with finite activity jumps. These estimators separate correlations and volatilities. We analyze different combinations of quantile- and median-based realized volatilities, and four estimators of realized correlations with three synchronization schemes. Their finite sample properties are studied under four data generating processes, in presence, or not, of microstructure noise, and under synchronous and asynchronous trading. The main finding is that the pre-averaged version of disentangled estimators based on Gaussian ranks (for the correlations) and median deviations (for the volatilities) provide a precise, computationally efficient, and easy alternative to measure integrated covariances on the basis of noisy and asynchronous prices. Along these lines, a minimum variance portfolio application shows the superiority of this disentangled realized estimator in terms of numerous performance metrics. Chapter 4 is co-authored with Niels S. Hansen, Asger Lunde and Kasper V. Olesen, all affiliated with CREATES at Aarhus University. We propose to use the Realized Beta GARCH model to exploit the potential of high-frequency data in commodity markets. The model produces high quality forecasts of pairwise correlations between commodities which can be used to construct a composite covariance matrix. We evaluate the quality of this matrix in a portfolio context and compare it to models used in the industry. We demonstrate significant economic gains in a realistic setting including short selling constraints and transaction costs.
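The realized measures discussed throughout these chapters are all built from sums of (outer) products of intraday returns. As a hedged illustration of that basic building block only (not of the disentangled or pre-averaged estimators themselves; the function name and toy data below are invented), a daily realized covariance matrix can be computed from synchronized intraday log-prices:

```python
import numpy as np

def realized_covariance(log_prices):
    """Realized covariance matrix from synchronized intraday log-prices.

    log_prices: (n_obs, n_assets) array sampled on a common time grid.
    Returns the (n_assets, n_assets) realized covariance for the day,
    i.e. the sum of outer products of intraday log-returns.
    """
    returns = np.diff(log_prices, axis=0)   # intraday log-returns
    return returns.T @ returns              # sum of outer products

# Toy example: two correlated assets, 390 one-minute observations.
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.6], [0.6, 1.0]]) * 1e-4
shocks = rng.multivariate_normal([0.0, 0.0], true_cov, size=390)
log_prices = np.cumsum(shocks, axis=0)

rcov = realized_covariance(log_prices)
rcorr = rcov[0, 1] / np.sqrt(rcov[0, 0] * rcov[1, 1])   # realized correlation
```

Plain sums like this are biased under microstructure noise and asynchronous trading, which is precisely what the noise-robust, rank-based, and synchronization schemes studied in this chapter are designed to correct.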
Doctorate in Economics and Management Sciences
APA, Harvard, Vancouver, ISO, and other styles
16

Kilborn, Joshua Paul. "Investigating Marine Resources in the Gulf of Mexico at Multiple Spatial and Temporal Scales of Inquiry." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/7046.

Full text
Abstract:
The work in this dissertation represents an attempt to investigate multiple temporal and spatial scales of inquiry relating to the variability of marine resources throughout the Gulf of Mexico large marine ecosystem (Gulf LME). This effort was undertaken over two spatial extents within the greater Gulf LME using two different time-series of fisheries monitoring data. Case studies demonstrating simple frameworks and best practices are presented with the aim of aiding researchers seeking to reduce errors and biases in scientific decision making. Two of the studies focused on three years of groundfish survey data collected across the West Florida Shelf (WFS), an ecosystem that occupies the eastern portion of the Gulf LME and which spans the entire latitudinal extent of the state of Florida. A third study was related to the entire area covered by the Gulf LME, and explored a 30-year dataset containing over 100 long-term monitoring time-series of indicators representing (1) fisheries resource status and structure, (2) human use patterns and resource extractions, and (3) large- and small-scale environmental and climatological characteristics. Finally, a fourth project involved testing the reliability of a popular new clustering algorithm in ecology using data simulation techniques. The work in Chapter Two, focused on the WFS, describes a quantitatively defensible technique to define daytime and nighttime groundfish assemblages, based on the nautical twilight starting and ending times at a sampling station. It also describes the differences between these two unique diel communities, the indicator species that comprise them, and environmental drivers that organize them at daily and inter-annual time scales. Finally, the differential responses in the diel, and inter-annual communities were used to provide evidence for a large-scale event that began to show an environmental signal in 2010 and subsided in 2011 and beyond. 
The event was manifested in the organization of the benthic fishes beginning weakly in 2010, peaking in 2011, and fully dissipating by 2012. The biotic effects of the event appeared to disproportionately affect the nighttime assemblage of fishes sampled on the WFS. Chapter Three explores the same WFS ecosystem, using the same fisheries-independent dataset, but also includes explicit modeling of the spatial variability captured by the sampling program undertaking the annual monitoring effort. The results also provided evidence of a disturbance that largely affected the nighttime fish community, and which was operating at spatial scales of variability that were larger than the extent of the shelf system itself. Like the previous study, the timing of this event is coincident with the 2010 Deepwater Horizon oil spill, the subsequent sub-marine dispersal of pollutants, and the cessation of spillage. Furthermore, the spatial models uncovered the influence of known spatial-abiotic gradients within the Gulf LME related to (1) depth, (2) temperature, and (3) salinity on the organization of daytime groundfish communities. Finally, the models developed also described which non-spatially structured abiotic variables were important to the observed beta-diversity. The ultimate results were the decomposition of the biotic response, within years and divided by diel classification, into the (1) pure-spatial, (2) pure-abiotic, (3) spatial-abiotic, and (4) unexplained fractions of variation. This study, along with that in Chapter Two, also highlighted the relative importance of the nighttime fish community to the assessment of the structure and function of the WFS, and the challenges associated with adequately sampling it, both in space and time. 
Because one focus of this dissertation was to develop low-decision frameworks and mathematically defensible alternatives to some common methods in fisheries ecology, Chapter Five employs a clustering technique to identify regime states that relies on hypothesis testing and the use of resemblance profiles as decision criteria. This clustering method avoids some of the arbitrary nature of common clustering solutions seen in ecology; however, it had never been rigorously subjected to numerical data simulation studies. Therefore, a formal investigation of the functional limits of the clustering method was undertaken prior to its use on real fisheries monitoring data, and is presented in Chapter Four. The results of this study are a set of recommendations for researchers seeking to utilize the new method, and the advice is applied in a case study in Chapter Five. Chapter Five presents the ecosystem-level fisheries indicator selection heuristic (EL-FISH) framework for examining long-term time-series data based on ecological monitoring for resource management. The focus of this study is the Gulf LME, encompassing the period of 1980-2011, and it specifically sought to determine to what extent natural and anthropogenically induced environmental variability, including fishing extractions, affected the structure, function, and status of marine fisheries resources. The methods encompassed by EL-FISH, and the resulting ecosystem model that accounted for ~73% of the variability in biotic resources, allowed for (1) the identification and description of three fisheries resource regime state phase shifts in time, (2) the determination of the effects of fishing and environmental pressures on resources, and (3) providing context and evidence for trade-offs to be considered by managers and stakeholders when addressing fisheries management concerns. The EL-FISH method is fully transferable and readily adapts to any set of continuous monitoring data.
APA, Harvard, Vancouver, ISO, and other styles
17

Elmäng, Niclas. "Sequence classification on gamified behavior data from a learning management system : Predicting student outcome using neural networks and Markov chain." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18654.

Full text
Abstract:
This study investigated whether it is possible to classify time series data originating from a gamified learning management system. Using school data provided by the gamification company Insert Coin AB, the aim was to distribute the teacher’s supervision more efficiently towards students who are more likely to fail, motivated by the possibility of increasing student retention and completion rates. This was done by using long short-term memory (LSTM) and convolutional neural networks as well as a Markov chain to classify time series of event data. Since the classes are balanced, the classification was evaluated using only the accuracy metric. The neural networks show positive results, but overfitting occurs strongly for the convolutional network and less so for the LSTM network. The Markov chain shows potential, but further work is needed to mitigate the problem of a strong correlation between sequence length and likelihood.
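As a hedged illustration of the Markov chain approach described above (the event alphabet, toy sequences, and class labels below are invented, not Insert Coin data), one can fit a per-class first-order transition matrix over event symbols and classify a new sequence by log-likelihood:

```python
import numpy as np

def fit_transitions(sequences, n_symbols, alpha=1.0):
    """First-order transition matrix with Laplace smoothing (alpha)."""
    counts = np.full((n_symbols, n_symbols), alpha)
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def log_likelihood(seq, P):
    """Log-probability of the observed transitions under matrix P."""
    return sum(np.log(P[a, b]) for a, b in zip(seq[:-1], seq[1:]))

def classify(seq, models):
    """Pick the class whose transition model best explains the sequence."""
    return max(models, key=lambda c: log_likelihood(seq, models[c]))

# Invented event alphabet: 0 = login, 1 = submit exercise, 2 = idle.
passing = [[0, 1, 1, 1, 2, 1, 1], [0, 1, 1, 2, 1, 1, 1]]
failing = [[0, 2, 2, 2, 1, 2, 2], [0, 2, 2, 2, 2, 1, 2]]

models = {"pass": fit_transitions(passing, 3),
          "fail": fit_transitions(failing, 3)}
pred = classify([0, 1, 1, 1, 1, 2, 1], models)   # → "pass"
```

Because the log-likelihood sums over transitions, longer sequences receive systematically lower scores; this is exactly the sequence-length/likelihood correlation the thesis flags, and normalizing the score by sequence length is one common mitigation.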
APA, Harvard, Vancouver, ISO, and other styles
18

Mousheimish, Raef. "Combinaison de l’Internet des objets, du traitement d’évènements complexes et de la classification de séries temporelles pour une gestion proactive de processus métier." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLV073/document.

Full text
Abstract:
Internet of Things is at the core of smart industrial processes thanks to its capacity for event detection from data conveyed by sensors. However, much remains to be done to make the most out of this recent technology and make it scale. This thesis aims at filling the gap between the massive data flows collected by sensors and their effective exploitation in business process management. It proposes a global approach which combines stream data processing, supervised learning and/or the use of complex event processing rules allowing to predict (and thereby avoid) undesirable events, and finally business process management extended with these complex rules. The scientific contributions of this thesis lie in several areas: making business processes more intelligent and more dynamic; automating complex event processing by learning the rules; and, last but not least, data mining for multivariate time series through the early prediction of risks. The target application of this thesis is the instrumented transportation of artworks.
APA, Harvard, Vancouver, ISO, and other styles
19

Ahsan, Ramoza. "Time Series Data Analytics." Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-dissertations/529.

Full text
Abstract:
Given the ubiquity of time series data, and the exponential growth of databases, there has recently been an explosion of interest in time series data mining. Finding similar trends and patterns among time series data is critical for many applications ranging from financial planning and weather forecasting to stock analysis and policy making. With time series being high-dimensional objects, detection of similar trends, especially at the granularity of subsequences or among time series of different lengths and temporal misalignments, incurs prohibitively high computation costs. Finding trends using non-metric correlation measures further compounds the complexity, as traditional pruning techniques cannot be directly applied. My dissertation addresses these challenges while meeting the need to achieve near real-time responsiveness. First, for retrieving exact similarity results using Lp-norm distances, we design a two-layered time series index for subsequence matching. Time series relationships are compactly organized in a directed acyclic graph embedded with similarity vectors capturing subsequence similarities. Powerful pruning strategies leveraging the graph structure greatly reduce the number of time series as well as subsequence comparisons, resulting in a speed-up of several orders of magnitude. Second, to support a rich diversity of correlation analytics operations, we compress time series into Euclidean-based clusters augmented by a compact overlay graph encoding correlation relationships. Such a framework supports a rich variety of operations including retrieving positive or negative correlations, self-correlations and finding groups of correlated sequences. Third, to support flexible similarity specification using computationally expensive warped distances like Dynamic Time Warping, we design data reduction strategies leveraging the inexpensive Euclidean distance with subsequent time-warped matching on the reduced data.
This facilitates the comparison of sequences of different lengths and with flexible alignment still within a few seconds of response time. Comprehensive experimental studies using real-world and synthetic datasets demonstrate the efficiency, effectiveness and quality of the results achieved by our proposed techniques as compared to the state-of-the-art methods.
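The cheap-filter-then-exact-match idea behind the third contribution can be sketched generically (this is a generic illustration, not the dissertation's index or reduction strategy; the data and the `keep` fraction are invented): prune candidates with the inexpensive Euclidean distance, then run full DTW only on the survivors.

```python
import numpy as np

def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

def nearest_with_prefilter(query, candidates, keep=0.5):
    """Rank candidates by cheap Euclidean distance (assumes equal lengths),
    then run expensive DTW only on the closest fraction `keep` of them."""
    euclid = [np.linalg.norm(query - c) for c in candidates]
    order = np.argsort(euclid)[: max(1, int(len(candidates) * keep))]
    return min(order, key=lambda i: dtw(query, candidates[i]))

rng = np.random.default_rng(1)
base = np.sin(np.linspace(0, 4 * np.pi, 50))
candidates = [base + rng.normal(0, s, 50) for s in (0.05, 0.5, 1.0, 2.0)]
idx = nearest_with_prefilter(base, candidates)   # → 0, the least noisy copy
```

Unlike this toy prefilter, DTW itself also handles sequences of different lengths, which is why the reduced-data comparison in the dissertation still ends with a warped match.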
APA, Harvard, Vancouver, ISO, and other styles
20

Fischer, Ulrike. "Forecasting in Database Systems." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2014. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-133281.

Full text
Abstract:
Time series forecasting is a fundamental prerequisite for decision-making processes and crucial in a number of domains such as production planning and energy load balancing. In the past, forecasting was often performed by statistical experts in dedicated software environments outside of current database systems. However, forecasts are increasingly required by non-expert users or have to be computed fully automatically without any human intervention. Furthermore, we can observe an ever increasing data volume and the need for accurate and timely forecasts over large multi-dimensional data sets. As most data subject to analysis is stored in database management systems, a rising trend addresses the integration of forecasting inside a DBMS. Yet, many existing approaches follow a black-box style and try to keep changes to the database system as minimal as possible. While such approaches are more general and easier to realize, they miss significant opportunities for improved performance and usability. In this thesis, we introduce a novel approach that seamlessly integrates time series forecasting into a traditional database management system. In contrast to flash-back queries that allow a view on the data in the past, we have developed a Flash-Forward Database System (F2DB) that provides a view on the data in the future. It supports a new query type - a forecast query - that enables forecasting of time series data and is automatically and transparently processed by the core engine of an existing DBMS. We discuss necessary extensions to the parser, optimizer, and executor of a traditional DBMS. We furthermore introduce various optimization techniques for three different types of forecast queries: ad-hoc queries, recurring queries, and continuous queries. First, we ease the expensive model creation step of ad-hoc forecast queries by reducing the amount of processed data with traditional sampling techniques. 
Second, we decrease the runtime of recurring forecast queries by materializing models in a specialized index structure. However, a large number of time series as well as high model creation and maintenance costs require a careful selection of such models. Therefore, we propose a model configuration advisor that determines a set of forecast models for a given query workload and multi-dimensional data set. Finally, we extend forecast queries with continuous aspects allowing an application to register a query once at our system. As new time series values arrive, we send notifications to the application based on predefined time and accuracy constraints. All of our optimization approaches intend to increase the efficiency of forecast queries while ensuring high forecast accuracy.
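The notion of an ad-hoc forecast query can be illustrated in the black-box style that the thesis improves upon, i.e. fetching the series out of the DBMS and forecasting in the application. This is a minimal sketch only (table name, model choice, and parameters are invented; F2DB instead processes forecast queries inside the core engine):

```python
import sqlite3

# Toy table: a perfectly linear "sales" series, value = 100 + 2*t.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (t INTEGER, value REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [(t, 100 + 2 * t) for t in range(20)])

def forecast_query(con, table, horizon, alpha=0.5, beta=0.5):
    """Black-box 'forecast query': fetch the series, then extrapolate
    with Holt's double exponential smoothing (level + trend)."""
    rows = con.execute(f"SELECT value FROM {table} ORDER BY t").fetchall()
    y = [r[0] for r in rows]
    level, trend = y[0], y[1] - y[0]
    for v in y[1:]:
        prev = level
        level = alpha * v + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

preds = forecast_query(con, "sales", 3)   # → [140.0, 142.0, 144.0]
```

In the integrated approach, the model creation step hidden inside `forecast_query` is what gets sampled for ad-hoc queries, materialized and indexed for recurring queries, and kept incrementally updated for continuous ones.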
APA, Harvard, Vancouver, ISO, and other styles
21

Chinipardaz, Rahim. "Discrimination of time series data." Thesis, University of Newcastle Upon Tyne, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.481472.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

VALENTIM, CAIO DIAS. "DATA STRUCTURES FOR TIME SERIES." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2012. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=21522@1.

Full text
Abstract:
PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
Time series are important tools for the analysis of events that occur in different fields of human knowledge, such as medicine, physics, meteorology and finance. A common task in analysing time series is to find events that happen infrequently, as these events usually reflect facts of interest about the domain of the series. In this study, we develop techniques for the detection of rare events in time series. Technically, a time series A = (a1, a2, ..., an) is a sequence of real values indexed by integer numbers from 1 to n. Given an integer t and a real number d, we say that a pair of time indexes i and j is a (t, d)-event in A if and only if 0 < j - i <= t and aj - ai >= d. In this case, i is said to be the beginning of the event and j is its end. The parameters t and d control, respectively, the time window in which the event can occur and the magnitude of the variation in the series. Thus, we focus on two types of queries related to (t, d)-events: What are the (t, d)-events in a series A? And which indexes in the series A are the beginning of at least one (t, d)-event? Throughout this study we discuss, from both theoretical and practical points of view, several data structures and algorithms to answer these two queries.
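The (t, d)-event definition above is concrete enough to sketch in code. The following is not taken from the thesis; it is one minimal O(n) illustration of the second query (finding event beginnings), using a sliding-window maximum over the next t values:

```python
from collections import deque

def td_event_starts(A, t, d):
    """1-based indexes i that begin at least one (t, d)-event, i.e. some j
    with 0 < j - i <= t and A[j] - A[i] >= d (thesis notation).

    Scans right-to-left, keeping a monotonic deque of indexes so the
    maximum of the next t values is available in O(1) per position.
    """
    n, starts = len(A), []
    window = deque()                 # indexes j > i, values decreasing front-to-back
    for i in range(n - 1, -1, -1):
        while window and window[0] > i + t:      # expire j outside (i, i + t]
            window.popleft()
        if window and A[window[0]] - A[i] >= d:  # max future value is large enough
            starts.append(i + 1)                 # report 1-based index
        while window and A[window[-1]] <= A[i]:  # keep deque values decreasing
            window.pop()
        window.append(i)
    return sorted(starts)

events = td_event_starts([5, 1, 3, 2, 6, 4], t=2, d=3)   # → [3, 4]
```

For index 3 (value 3), index 5 (value 6) satisfies j - i = 2 <= t and 6 - 3 >= d; a naive double loop gives the same answer in O(n * t) time.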
APA, Harvard, Vancouver, ISO, and other styles
23

Matam, Basava R. "Watermarking biomedical time series data." Thesis, Aston University, 2009. http://publications.aston.ac.uk/15351/.

Full text
Abstract:
This thesis addresses the problem of information hiding in low dimensional digital data focussing on issues of privacy and security in Electronic Patient Health Records (EPHRs). The thesis proposes a new security protocol based on data hiding techniques for EPHRs. This thesis contends that embedding of sensitive patient information inside the EPHR is the most appropriate solution currently available to resolve the issues of security in EPHRs. Watermarking techniques are applied to one-dimensional time series data such as the electroencephalogram (EEG) to show that they add a level of confidence (in terms of privacy and security) in an individual’s diverse bio-profile (the digital fingerprint of an individual’s medical history), ensure belief that the data being analysed does indeed belong to the correct person, and also that it is not being accessed by unauthorised personnel. Embedding information inside single channel biomedical time series data is more difficult than the standard application for images due to the reduced redundancy. A data hiding approach which has an in built capability to protect against illegal data snooping is developed. The capability of this secure method is enhanced by embedding not just a single message but multiple messages into an example one-dimensional EEG signal. Embedding multiple messages of similar characteristics, for example identities of clinicians accessing the medical record helps in creating a log of access while embedding multiple messages of dissimilar characteristics into an EPHR enhances confidence in the use of the EPHR. The novel method of embedding multiple messages of both similar and dissimilar characteristics into a single channel EEG demonstrated in this thesis shows how this embedding of data boosts the implementation and use of the EPHR securely.
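For orientation only, the simplest family of watermarking schemes embeds message bits in the least significant bits (LSB) of quantized samples. The sketch below is a generic LSB illustration on a synthetic signal, not the thesis's secure multi-message scheme; the quantization scale and signal amplitude are invented assumptions:

```python
import numpy as np

def embed_lsb(signal, message_bits, scale):
    """Embed bits in the least significant bit of scaled integer samples.
    Quantizing to 1/scale bounds the distortion per sample by ~1/scale."""
    q = np.round(signal * scale).astype(np.int64)
    n = len(message_bits)
    q[:n] = (q[:n] & ~1) | message_bits      # clear LSB, then write the bit
    return q / scale

def extract_lsb(watermarked, n_bits, scale):
    """Recover the first n_bits from the LSBs of the re-quantized samples."""
    q = np.round(watermarked * scale).astype(np.int64)
    return (q[:n_bits] & 1).tolist()

rng = np.random.default_rng(2)
eeg = rng.normal(0, 50e-6, 256)              # synthetic "EEG", ~50 microvolts
bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(eeg, np.array(bits), scale=10**7)
recovered = extract_lsb(marked, len(bits), scale=10**7)   # → same bits
```

Real biomedical watermarking must do much more than this: survive filtering and compression, avoid clinically significant distortion, and, as in the thesis, resist snooping and carry several messages at once.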
APA, Harvard, Vancouver, ISO, and other styles
24

Mazel, David S. "Fractal modeling of time-series data." Diss., Georgia Institute of Technology, 1991. http://hdl.handle.net/1853/13916.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Brunsdon, T. M. "Time series analysis of compositional data." Thesis, University of Southampton, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.378257.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Tang, Fengzhen. "Kernel methods for time series data." Thesis, University of Birmingham, 2015. http://etheses.bham.ac.uk//id/eprint/5929/.

Full text
Abstract:
Kernel methods are powerful learning techniques with excellent generalization capability. This thesis develops three advanced approaches within the generic SVM framework in the application domain of time series data. The first contribution presents a new methodology for incorporating privileged information about the future evolution of time series, which is only available in the training phase. The task is prediction of the ordered categories of future time series movements. This is implemented by directly extending support vector ordinal regression with implicit constraints to the learning-using-privileged-information paradigm. The second contribution demonstrates a novel methodology of constructing efficient kernels for time series classification problems. These kernels are constructed by representing each time series through a linear readout model from a high-dimensional state space model with a fixed, deterministically constructed dynamic part. Learning is then performed in the linear readout model space. Finally, in the same context, we introduce yet another novel time series kernel by co-learning the dynamic part and a global metric in the linear readout model space, encouraging time series from the same class to have close model representations, while keeping model representations of time series from different classes well separated.
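The "fixed dynamic part plus linear readout" construction can be imitated with a small echo-state-style reservoir. This is a hedged toy sketch, not the kernels developed in the thesis: every parameter value below is arbitrary, and the Gaussian kernel on readout weights merely illustrates what "learning in the linear readout model space" means:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 20                                              # reservoir size (arbitrary)
W = rng.normal(0, 1, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))     # fixed dynamics, spectral radius < 1
w_in = rng.normal(0, 0.5, N)                        # fixed input weights

def readout_weights(series, ridge=1.0):
    """Drive the shared fixed reservoir with the series and fit a ridge
    readout predicting the next value; the weights are the series' model."""
    x, states = np.zeros(N), []
    for v in series[:-1]:
        x = np.tanh(W @ x + w_in * v)
        states.append(x)
    S, y = np.array(states), np.asarray(series[1:])
    return np.linalg.solve(S.T @ S + ridge * np.eye(N), S.T @ y)

def model_kernel(s1, s2, gamma=0.1):
    """Gaussian kernel between the two series' readout-model representations."""
    d = readout_weights(s1) - readout_weights(s2)
    return np.exp(-gamma * (d @ d))

t = np.linspace(0, 6 * np.pi, 200)
sine_a, sine_b = np.sin(t), np.sin(t + 0.5)         # same dynamics, shifted phase
noise = rng.normal(0, 1, 200)                       # unpredictable series
k_same = model_kernel(sine_a, sine_b)
k_diff = model_kernel(sine_a, noise)
```

Two sampled sinusoids of the same frequency obey the same next-value map, so their readout weights nearly coincide and `k_same` is close to 1, while the noise series yields a very different model and a smaller kernel value.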
APA, Harvard, Vancouver, ISO, and other styles
27

Granberg, Patrick. "Churn prediction using time series data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-294206.

Full text
Abstract:
Customer churn is problematic for any business trying to expand their customer base. The acquisition of new customers to replace churned ones is associated with additional costs, whereas taking measures to retain existing customers may prove more cost efficient. As such, it is of interest to estimate the time until the occurrence of a potential churn for every customer in order to take preventive measures. The application of deep learning and machine learning to this type of problem using time series data is relatively new and there is a lot of recent research on this topic. This thesis is based on the assumption that early signs of churn can be detected by the temporal changes in customer behavior. Recurrent neural networks, and more specifically long short-term memory (LSTM) and gated recurrent unit (GRU) networks, are suitable contenders since they are designed to take the sequential time aspect of the data into account. Random forest (RF) and support vector machine (SVM) are machine learning models that are frequently used in related research. The problem is solved through a classification approach, and a comparison is done with implementations using LSTM, GRU, RF, and SVM. According to the results, LSTM and GRU perform similarly while being slightly better than RF and SVM in the task of predicting customers that will churn in the coming six months, and all models could potentially lead to cost savings according to simulations (using non-official but reasonable costs assigned to each prediction outcome). Predicting the time until churn is a more difficult problem and none of the models can give reliable estimates, but all models are significantly better than random predictions.
APA, Harvard, Vancouver, ISO, and other styles
28

Svensson, Martin. "Unsupervised Segmentation of Time Series Data." Thesis, Linköpings universitet, Statistik och maskininlärning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176519.

Full text
Abstract:
In a modern vehicle system, the amount of time series data generated is large enough to qualify as big data. Many of the time series contain interesting patterns, either densely populated or scarcely distributed over the data. For engineers to review the data, a segmentation is crucial for data reduction, which is why this thesis investigates unsupervised segmentation of time series. This report uses two different methods, Fast Low-cost Unipotent Semantic Segmentation (FLUSS) and Information Gain-based Temporal Segmentation (IGTS), which take shape-based and statistical approaches, respectively. The goal is to evaluate their strengths and weaknesses on tailored time series data that has properties suiting one or more of the models. The data is constructed from an open dataset, the cricket dataset, which contains labelled segments. These are then concatenated to create datasets with specific properties. Evaluation metrics suitable for segmentation are discussed and evaluated. From the experiments it is clear that all models have strengths and weaknesses, so the outcome will depend on the data and model combination. The shape-based model, FLUSS, cannot handle reoccurring events or regimes. However, linear transitions between regimes, e.g. A to B to C, give very good results if the regimes are not too similar. The statistical model, IGTS, yields a segmentation that is non-intuitive for humans, but could be a good way to reduce data in a preprocessing step. It has the ability to automatically reduce the number of segments to the optimal value based on entropy, which, depending on the goal, may or may not be desirable. Overall the methods delivered at worst the same results as the random segmentation model, but in every test one or more models had better results than this baseline model. Unsupervised segmentation of time series is a difficult problem and results will be highly dependent on the target data.
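A single information-gain split of the kind IGTS performs can be sketched on a symbolized series. This is a simplified one-boundary illustration with invented toy data, not the full IGTS algorithm (which searches for multiple boundaries, e.g. top-down, and can stop when the gain levels off):

```python
import numpy as np
from collections import Counter

def entropy(symbols):
    """Shannon entropy (bits) of a sequence of discrete symbols."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_split(symbols):
    """One top-down step: the boundary whose weighted child entropies
    drop furthest below the parent's entropy (maximum information gain)."""
    n, parent = len(symbols), entropy(symbols)
    best_k, best_gain = None, 0.0
    for k in range(1, n):
        left, right = symbols[:k], symbols[k:]
        child = (k * entropy(left) + (n - k) * entropy(right)) / n
        if parent - child > best_gain:
            best_k, best_gain = k, parent - child
    return best_k, best_gain

# Two clear regimes: a low-symbol stretch followed by a high-symbol one.
series = [0, 0, 1, 0, 0, 1, 0, 3, 4, 3, 4, 4, 3, 4]
k, gain = best_split(series)   # splits exactly at the regime boundary
```

Recursing on the two halves, and keeping splits only while the gain stays substantial, yields the entropy-driven automatic choice of the number of segments mentioned above.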
APA, Harvard, Vancouver, ISO, and other styles
29

Guthrey, Delparde Raleigh. "Time series analysis of ozone data." CSUSB ScholarWorks, 1998. https://scholarworks.lib.csusb.edu/etd-project/1788.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

MacDonald, Iain L. "Time series models for discrete data." Doctoral thesis, University of Cape Town, 1992. http://hdl.handle.net/11427/26105.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Damle, Chaitanya. "Flood forecasting using time series data mining." [Tampa, Fla.] : University of South Florida, 2005. http://purl.fcla.edu/fcla/etd/SFE0001038.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Chintalapani, Gouthami. "Temporal treemaps for visualizing time series data." College Park, Md. : University of Maryland, 2004. http://hdl.handle.net/1903/1459.

Full text
Abstract:
Thesis (M.S.) -- University of Maryland, College Park, 2004.
Thesis research directed by: Dept. of Computer Science. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
APA, Harvard, Vancouver, ISO, and other styles
33

Xia, Betty Bin. "Similarity search in time series data sets." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp04/mq24275.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Clarke, Liam. "Nonlinear time series analysis of data streams." Thesis, University of Oxford, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.401147.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Hills, Jonathan F. F. "Mining time-series data using discriminative subsequences." Thesis, University of East Anglia, 2014. https://ueaeprints.uea.ac.uk/53397/.

Full text
Abstract:
Time-series data is abundant, and must be analysed to extract usable knowledge. Local-shape-based methods offer improved performance for many problems, and a comprehensible method of understanding both data and models. For time-series classification, we transform the data into a local-shape space using a shapelet transform. A shapelet is a time-series subsequence that is discriminative of the class of the original series. We use a heterogeneous ensemble classifier on the transformed data. The accuracy of our method is significantly better than the time-series classification benchmark (1-nearest-neighbour with dynamic time-warping distance), and significantly better than the previous best shapelet-based classifiers. We use two methods to increase interpretability: first, we cluster the shapelets using a novel, parameterless clustering method based on Minimum Description Length, reducing dimensionality and removing duplicate shapelets; second, we transform the shapelet data into binary data reflecting the presence or absence of particular shapelets, a representation that is straightforward to interpret and understand. We supplement the ensemble classifier with partial classification. We generate rule sets on the binary-shapelet data, improving performance on certain classes, and revealing the relationship between the shapelets and the class label. To aid interpretability, we use a novel algorithm, BruteSuppression, that can substantially reduce the size of a rule set without negatively affecting performance, leading to a more compact, comprehensible model. Finally, we propose three novel algorithms for unsupervised mining of approximately repeated patterns in time-series data, testing their performance in terms of speed and accuracy on synthetic data, and on a real-world electricity-consumption device-disambiguation problem. We show that individual devices can be found automatically and in an unsupervised manner using a local-shape-based approach.
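The core shapelet-transform step described above (each series becomes a vector of its minimum distances to a set of shapelets) can be sketched as follows; the toy shapelet and data are illustrative, not taken from the thesis:

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any
    equal-length sliding-window subsequence of the series."""
    m = len(shapelet)
    return min(np.linalg.norm(series[i:i + m] - shapelet)
               for i in range(len(series) - m + 1))

def shapelet_transform(dataset, shapelets):
    """Map each series to its vector of distances to every shapelet;
    any off-the-shelf classifier can then be trained on this matrix."""
    return np.array([[shapelet_distance(s, sh) for sh in shapelets]
                     for s in dataset])

# toy data: class 0 contains a bump shape, class 1 is flat
bump = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
s0 = np.concatenate([np.zeros(10), bump, np.zeros(10)])
s1 = np.zeros(25)
X = shapelet_transform([s0, s1], [bump])
```

A series containing the shapelet gets distance 0; a flat series gets the shapelet's norm, so the two classes separate cleanly in the transformed space.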
APA, Harvard, Vancouver, ISO, and other styles
36

Borella, Margherita. "Time series analyses of consumption grouped data." Thesis, University College London (University of London), 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.271818.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Hempel, Sabrina. "Deciphering gene regulation from time series data." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2012. http://dx.doi.org/10.18452/16602.

Full text
Abstract:
My thesis is about reconstructing gene regulatory networks in order to better understand the functionality of organisms and their reactions to various external influences. In this context, the analysis of short, time-resolved measurements with association measures can yield crucial insights into possible interactions. In an extensive comparison study, I examine the efficiency of different measures and scoring schemes for solving the network reconstruction problem. Furthermore, I introduce IOTA (inner composition alignment), a novel asymmetric, permutation-based association measure, as an efficient tool for reconstructing directed networks without the application of additional scoring schemes. In my thesis, I analyse the properties of various modifications of the measure. Moreover, I show that IOTA is valuable for studying statistically significant, directed, nonlinear couplings in several time series (autoregressive processes, Michaelis-Menten kinetics, and chaotic oscillators in different dynamical regimes), as well as autoregulation. In addition, IOTA, like correlation measures, permits identification of the type of regulation (activation or repression), making it the only measure that can determine all the characteristics necessary for reconstructing gene regulatory networks. Finally, I apply the novel association measure IOTA to infer a gene regulatory network for the green alga Chlamydomonas reinhardtii under carbon deprivation from experimentally obtained data.
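The abstract above describes IOTA only conceptually. As a loose sketch of the general permutation-based idea (reorder one series by the rank order of the other, then score how monotone the reordered series is), the function below is my own simplified stand-in, not the published IOTA definition:

```python
import numpy as np

def order_alignment(x, y):
    """Simplified permutation-based directed association score
    (illustrative only; NOT the exact IOTA measure from the thesis).
    Reorder y by the sort order of x, then return the fraction of
    adjacent pairs moving in the dominant direction: 1.0 means the
    reordered y is perfectly monotone (strong x -> y coupling)."""
    order = np.argsort(x)              # permutation that sorts x
    y_reordered = np.asarray(y)[order]
    diffs = np.diff(y_reordered)
    if len(diffs) == 0:
        return 0.5
    return max(np.mean(diffs > 0), np.mean(diffs < 0))

# a noiseless monotone coupling y = f(x) scores 1.0
x = np.array([0.1, 0.9, 0.4, 0.7, 0.2, 0.5])
y = x ** 2
score_xy = order_alignment(x, y)
```

Because the score depends on which series supplies the ordering, it is asymmetric, which is what makes this family of measures usable for inferring edge directions.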
APA, Harvard, Vancouver, ISO, and other styles
38

Ferreira, Leonardo Nascimento. "Time series data mining using complex networks." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-01022018-144118/.

Full text
Abstract:
A time series is a time-ordered dataset. Due to its ubiquity, time series analysis is of interest to many scientific fields. Time series data mining is a research area intended to extract information from these time-related data. To achieve this, different models are used to describe series and search for patterns. One approach for modeling temporal data is to use complex networks: temporal data are mapped to a topological space that allows data exploration using network techniques. In this thesis, we present solutions for time series data mining tasks using complex networks. The primary goal was to evaluate the benefits of using network theory to extract information from temporal data. We focused on three mining tasks. (1) In the clustering task, we represented every time series by a vertex and connected vertices that represent similar time series. We used community detection algorithms to cluster similar series. Results show that this approach performs better than traditional clustering. (2) In the classification task, we mapped every labeled time series in a database to a visibility graph. We performed classification by transforming an unlabeled time series into a visibility graph and comparing it to the labeled graphs using a distance function; the new label is the most frequent label among the k-nearest graphs. (3) In the periodicity detection task, we first transform a time series into a visibility graph. Local maxima in a time series are usually mapped to highly connected vertices that link two communities. We used this community structure to propose a periodicity detection algorithm that is robust to noisy data and does not require parameters. With the methods and results presented in this thesis, we conclude that network science is beneficial to time series data mining and can provide better results than traditional methods. It is a new way of extracting information from time series and can easily be extended to other tasks.
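The visibility-graph mapping used in the classification and periodicity tasks above can be sketched directly from its standard definition; this is a minimal O(n²) version, and the thesis's implementation may differ:

```python
def visibility_graph(series):
    """Natural visibility graph: points i and j are connected if the
    straight line between (i, y_i) and (j, y_j) stays strictly above
    every intermediate point (k, y_k)."""
    n = len(series)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            visible = all(
                series[k] < series[i]
                + (series[j] - series[i]) * (k - i) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges

series = [3.0, 1.0, 2.0, 1.0, 4.0]
g = visibility_graph(series)
```

High values "see" far along the series, so local maxima become highly connected vertices, which is exactly the property the periodicity detector exploits.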
APA, Harvard, Vancouver, ISO, and other styles
39

Sperl, Ryan E. "Hierarchical Anomaly Detection for Time Series Data." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1590709752916657.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Mitchell, F. "Painless knowledge acquisition for time series data." Thesis, University of Aberdeen, 1997. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU100889.

Full text
Abstract:
Knowledge Acquisition has long been acknowledged as the bottleneck in producing Expert Systems. This is because, until relatively recently, the KA (Knowledge Acquisition) process has concentrated on extracting knowledge from a domain expert, which is a very time consuming process. Support tools have been constructed to help this process, but these have not been able to reduce the time radically. However, in many domains, the expert is not the only source of knowledge, nor indeed the best source of knowledge. This is particularly true in industrial settings where performance information is routinely archived. This information, if processed correctly, can provide a substantial part of the knowledge required to build a KB (Knowledge Base). In this thesis I discuss current KA approaches and then go on to outline a methodology which uses KD (Knowledge Discovery) techniques to mine archived time series data to produce fault detection and diagnosis KBs with minimal expert input. This methodology is implemented in the TIGON system, which is the focus of this thesis. TIGON uses archived information (in TIGON's case the information is from a gas turbine engine) along with guidance from the expert to produce KBs for detecting and diagnosing faults in a gas turbine engine. TIGON's performance is also analysed in some detail. A comparison with other related work is also included.
APA, Harvard, Vancouver, ISO, and other styles
41

Tapinos, Avraam. "Time series data mining in systems biology." Thesis, University of Manchester, 2013. https://www.research.manchester.ac.uk/portal/en/theses/time-series-data-mining-in-systems-biology(5b538723-503b-4b82-959b-d4567e8d4658).html.

Full text
Abstract:
Analysis of time series data constitutes an important activity in many scientific disciplines. Over the last years there has been an increase in the collection of time series data in all scientific fields and disciplines, including industry and engineering. Due to the increasing size of time series datasets, new automated time series data mining techniques have been devised for comparing time series data and presenting information in a logical and easily comprehensible structure. In systems biology in particular, time series are used to study biological systems. The time series representations of a system's dynamic behaviour are multivariate time series: time series are considered multivariate when they contain observations for more than one variable component. The time series of a biological system's dynamics contain observations for every feature component included in the system, and thus are multivariate. Recently, there has been increasing interest in the collection of biological time series, so it would be beneficial for systems biologists to be able to compare these multivariate time series. Over the last decade, the field of time series analysis has attracted the attention of people from different scientific disciplines. A number of researchers from the data mining community have focused their efforts on providing solutions to numerous problems regarding different time series data mining tasks. Different methods have been proposed, for instance, for comparing, indexing, and clustering univariate time series. Furthermore, different methods have been proposed for creating abstract representations of time series data and investigating the benefits of using these representations for data mining tasks. The introduction of more advanced computing resources facilitated the collection of multivariate time series, which has become common practice in various scientific fields. The increasing number of multivariate time series triggered the demand for methods to compare them, and a small number of well-suited methods have been proposed. All currently available methods for multivariate time series comparison are more than adequate for comparing multivariate time series with the same dimensionality. However, they all suffer the same drawback: current techniques cannot process multivariate time series with different dimensions. A proposed solution for comparing multivariate time series with arbitrary dimensions requires the creation of weighted averages, but the accumulation of weight data is not always feasible. In this project, a new method is proposed which enables the comparison of multivariate time series with arbitrary dimensions. The method is evaluated on multivariate time series from different disciplines in order to test its applicability to data from different fields of science and industry. Lastly, the newly formed method is applied to perform different time series data mining analyses on a set of biological data.
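The thesis's own multivariate comparison method is not reproduced here; as context, the classic univariate comparison baseline that such work builds on, dynamic time warping, can be sketched as:

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic-time-warping distance between two univariate
    series: the cheapest monotone alignment of their points."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # stretch a
                                 cost[i, j - 1],      # stretch b
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# identical shapes shifted in time stay close under DTW
a = [0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 0.0, 1.0, 2.0, 1.0]
```

Extending such alignments to multivariate series of *different* dimensionality is the gap the thesis addresses.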
APA, Harvard, Vancouver, ISO, and other styles
42

Matsubara, Yasuko. "Statistical Data Mining for Time-series Datasets." 京都大学 (Kyoto University), 2012. http://hdl.handle.net/2433/157475.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

丁嘉慧 and Ka-wai Ting. "Time sequences: data mining." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31226760.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

謝永然 and Wing-yin Tse. "Time series analysis in inventory management." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1993. http://hub.hku.hk/bib/B31977510.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Pradhan, Shameer Kumar. "Investigation of Event-Prediction in Time-Series Data : How to organize and process time-series data for event prediction?" Thesis, Högskolan Kristianstad, Fakulteten för naturvetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hkr:diva-19416.

Full text
Abstract:
The thesis determines which deep learning algorithms to compare for a particular dataset containing time-series data. The research method includes the study of multiple literature sources and the running of 12 tests. It deals with organizing and processing the data so as to prepare it for prediction of an event in the time series. It also explains the algorithms selected and provides a detailed description of the steps taken for classification and prediction of the event. Multiple tests are run over varied timeframes in order to compare which algorithm provides better results in each. The comparison between the two selected deep learning algorithms identified that, on the provided dataset, Convolutional Neural Networks perform better for shorter timeframes while Recurrent Neural Networks achieve higher accuracy for longer timeframes. Furthermore, possible improvements to the experiments and to the research as a whole are discussed.
APA, Harvard, Vancouver, ISO, and other styles
46

Rekdal, Espen Ekornes. "Metric Indexing in Time Series." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2008. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-10487.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Jiang, Chunyu. "DATA MINING AND ANALYSIS ON MULTIPLE TIME SERIES OBJECT DATA." Wright State University / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=wright1177959264.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Morrill, Jeffrey P., and Jonathan Delatizky. "REAL-TIME RECOGNITION OF TIME-SERIES PATTERNS." International Foundation for Telemetering, 1993. http://hdl.handle.net/10150/608854.

Full text
Abstract:
International Telemetering Conference Proceedings / October 25-28, 1993 / Riviera Hotel and Convention Center, Las Vegas, Nevada
This paper describes a real-time implementation of the pattern recognition technology originally developed by BBN [Delatizky et al] for post-processing of time-sampled telemetry data. This makes it possible to monitor a data stream for a characteristic shape, such as an arrhythmic heartbeat or a step-response whose overshoot is unacceptably large. Once programmed to recognize patterns of interest, it generates a symbolic description of a time-series signal in intuitive, object-oriented terms. The basic technique is to decompose the signal into a hierarchy of simpler components using rules of grammar, analogous to the process of decomposing a sentence into phrases and words. This paper describes the basic technique used for pattern recognition of time-series signals and the problems that must be solved to apply the techniques in real time. We present experimental results for an unoptimized prototype demonstrating that 4000 samples per second can be handled easily on conventional hardware.
APA, Harvard, Vancouver, ISO, and other styles
49

Milton, Robert. "Time-series in distributed real-time databases." Thesis, University of Skövde, Department of Computer Science, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-827.

Full text
Abstract:

In a distributed real-time environment where it is imperative to make correct decisions it is important to have all facts available to make the most accurate decision in a certain situation. An example of such an environment is an Unmanned Aerial Vehicle (UAV) system where several UAVs cooperate to carry out a certain task and the data recorded is analyzed after the completion of the mission. This project aims to define and implement a time series architecture for use together with a distributed real-time database for the ability to store temporal data. The result from this project is a time series (TS) architecture that uses DeeDS, a distributed real-time database, for storage. The TS architecture is used by an application modelled from a UAV scenario for storing temporal data. The temporal data is produced by a simulator. The TS architecture solves the problem of storing temporal data for applications using DeeDS. The TS architecture is also useful as a foundation for integrating time series in DeeDS since it is designed for space efficiency and real-time requirements.

APA, Harvard, Vancouver, ISO, and other styles
50

Thornlow, Robert Timothy. "Spectrum estimation using extrapolated time series." Thesis, Monterey, California : Naval Postgraduate School, 1990. http://handle.dtic.mil/100.2/ADA246554.

Full text
Abstract:
Thesis (M.S. in Electrical Engineering)--Naval Postgraduate School, December 1990.
Thesis Advisor(s): Hippenstiel, Ralph. Second Reader: Tummala, Murali. "December 1990." Description based on title screen as viewed on March 30, 2010. DTIC Descriptor(s): Frequency, Density, Data Management, Models, Signal To Noise Ratio, Theses, Power Spectra, Sequences, Estimates, Short Range(Time), Spectra, Sampling, Fast Fourier Transforms, Extrapolation, Data Processing. DTIC Identifier(s): Power Spectra, Estimates, Time Series Analysis, Extrapolation, Density, Theses, Fast Fourier Transforms, Eigenvectors, Mathematical Prediction. Author(s) subject terms: Data Extrapolation, Periodogram, AR spectral estimates. Includes bibliographical references (p. 94). Also available in print.
APA, Harvard, Vancouver, ISO, and other styles