Dissertations / Theses on the topic 'Forecasting of data in the form of time series'

Consult the top 50 dissertations / theses for your research on the topic 'Forecasting of data in the form of time series.'

1

Li, Jing. "Clustering and forecasting for rain attenuation time series data." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-219615.

Full text
Abstract:
Clustering is an unsupervised learning technique that groups similar objects into the same cluster, such that objects in the same cluster are more similar to each other than to those in other clusters. Forecasting makes predictions from past data, using efficient artificial-intelligence models to predict how the data will develop, which can help in making appropriate decisions ahead of time. The datasets used in this thesis are signal-attenuation time series from microwave networks. Microwave networks are communication systems that transmit information between two fixed locations on the earth. They can support the increasing capacity demands of mobile networks and play an important role in next-generation wireless communication technology, but their inherent vulnerability to random fluctuations such as rainfall causes significant degradation of network performance. In this thesis, K-means, fuzzy c-means and a 2-state Hidden Markov Model are used to develop one-step and two-step clustering models for rain-attenuation data. The forecasting models are designed around the k-nearest-neighbour method and implemented with linear regression to predict real-time rain attenuation, in order to help microwave transport networks mitigate the impact of rain, make proper decisions ahead of time and improve overall performance.
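As a rough illustration of the two-stage pipeline this abstract describes (cluster the attenuation samples into regimes, then forecast with a k-nearest-neighbour model), here is a minimal sketch on synthetic data; the window length, cluster count and sample values are illustrative assumptions, not the thesis's actual setup:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in for attenuation samples (dB): a calm and a rainy regime.
attenuation = np.concatenate([rng.normal(2, 0.5, 200), rng.normal(12, 2.0, 200)])

# Step 1: cluster samples into regimes (k=2, mirroring a 2-state model).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    attenuation.reshape(-1, 1))

# Step 2: k-nearest-neighbour forecasting, predicting the next value from
# the previous `window` values.
window = 5
X = np.lib.stride_tricks.sliding_window_view(attenuation, window)[:-1]
y = attenuation[window:]
knn = KNeighborsRegressor(n_neighbors=3).fit(X, y)
next_value = knn.predict(attenuation[-window:].reshape(1, -1))[0]
```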
APA, Harvard, Vancouver, ISO, and other styles
2

Khadivi, Pejman. "Online Denoising Solutions for Forecasting Applications." Diss., Virginia Tech, 2016. http://hdl.handle.net/10919/72907.

Abstract:
Dealing with noisy time series is a crucial task in many data-driven real-time applications. Due to inaccuracies in data acquisition, time series suffer from noise and instability, which lead to inaccurate forecasting results. Therefore, in order to improve the performance of time series forecasting, an important pre-processing step is the denoising of data before performing any action. In this research, we propose various approaches to tackle noisy time series in forecasting applications. For this purpose, we use different machine learning methods and information-theoretic approaches to develop online denoising algorithms. In this dissertation, we propose four categories of time series denoising methods that can be used in different situations, depending on the noise and time series properties. In the first category, a seasonal regression technique is proposed for the denoising of time series with seasonal behavior. In the second category, multiple discrete universal denoisers are developed that can be used for the online denoising of discrete-valued time series. In the third category, we develop a noisy-channel reversal model based on the similarities between time series forecasting and data communication, and use that model to deploy out-of-band noise filtering in forecasting applications. The last category of proposed methods is deep-learning-based denoisers. We use information-theoretic concepts to analyze a general feed-forward deep neural network and to prove theoretical bounds on deep neural network behavior. Furthermore, we propose a denoising deep neural network method for the online denoising of time series. Real-world and synthetic time series are used for numerical experiments and performance evaluations. Experimental results show that the proposed methods can efficiently denoise the time series and improve their quality.
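Of the four categories, the first (seasonal regression denoising) is the easiest to sketch. The following is a minimal illustration, assuming a simple seasonal-means variant rather than the dissertation's exact estimator:

```python
import numpy as np

def seasonal_denoise(series, period):
    """Replace each point with the mean of all observations sharing its
    phase within the period (a seasonal-means regression)."""
    series = np.asarray(series, dtype=float)
    phases = np.arange(len(series)) % period
    seasonal = np.array([series[phases == p].mean() for p in range(period)])
    return seasonal[phases]

rng = np.random.default_rng(1)
t = np.arange(240)
clean = np.sin(2 * np.pi * t / 24)            # a daily cycle
noisy = clean + rng.normal(0, 0.4, t.size)    # plus measurement noise
denoised = seasonal_denoise(noisy, period=24)
```

Averaging over ten daily cycles shrinks the noise variance at each phase, so the denoised series tracks the clean signal far better than the raw one.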
Ph. D.
3

Li, Yuntao. "Federated Learning for Time Series Forecasting Using Hybrid Model." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254677.

Abstract:
Time series data have become ubiquitous thanks to affordable edge devices and sensors, and much of these data are valuable for decision making. For forecasting tasks, the conventional centralized approach has shown deficiencies regarding large-scale data communication and data privacy. Furthermore, plain neural network models cannot make use of the extra information in a time series, so they often fail to deliver time-series-specific results. Both issues pose a challenge to large-scale time series forecasting with neural network models. These limitations lead to our research question: can we realize decentralized time series forecasting with a federated learning mechanism whose forecasting performance is comparable to the conventional centralized setup? In this work, we propose a federated series forecasting framework that resolves the challenge by allowing users to keep their data locally and learning a shared model by aggregating locally computed updates. In addition, we design a hybrid model that enables neural network models to utilize the extra information from the time series and so achieve time-series-specific learning. In particular, the proposed hybrid outperforms state-of-the-art data-centralized baseline models on NN5 and Ericsson KPI data, while the federated setting of the proposed model yields results comparable to the data-centralized setting on both datasets. Together, these results answer the research question of this thesis.
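The central mechanism here, learning a shared model by aggregating locally computed updates, can be sketched as FedAvg-style weighted parameter averaging. Everything below (the AR(1) local models, the synthetic client data and sizes) is an illustrative assumption, not the thesis's hybrid model:

```python
import numpy as np

def federated_average(client_params, client_sizes):
    """FedAvg-style aggregation: a size-weighted mean of locally trained
    parameters, so the raw series never leave the clients."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_params, client_sizes))

def local_ar1_fit(series):
    """Each client fits an AR(1) coefficient on its own data and shares
    only that coefficient, not the data."""
    x, y = series[:-1], series[1:]
    return float(x @ y / (x @ x))

rng = np.random.default_rng(2)
def make_series(phi, n=500):
    s = np.zeros(n)
    for i in range(1, n):
        s[i] = phi * s[i - 1] + rng.normal()
    return s

clients = [make_series(0.6), make_series(0.7)]
coeffs = [local_ar1_fit(s) for s in clients]
global_phi = federated_average(coeffs, [len(s) for s in clients])
```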
4

Ben, Taieb Souhaib. "Machine learning strategies for multi-step-ahead time series forecasting." Doctoral thesis, Universite Libre de Bruxelles, 2014. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209234.

Abstract:
How much electricity is going to be consumed for the next 24 hours? What will be the temperature for the next three days? What will be the number of sales of a certain product for the next few months? Answering these questions often requires forecasting several future observations from a given sequence of historical observations, called a time series.

Historically, time series forecasting has been mainly studied in econometrics and statistics. In the last two decades, machine learning, a field that is concerned with the development of algorithms that can automatically learn from data, has become one of the most active areas of predictive modeling research. This success is largely due to the superior performance of machine learning prediction algorithms in many different applications as diverse as natural language processing, speech recognition and spam detection. However, there has been very little research at the intersection of time series forecasting and machine learning.

The goal of this dissertation is to narrow this gap by addressing the problem of multi-step-ahead time series forecasting from the perspective of machine learning. To that end, we propose a series of forecasting strategies based on machine learning algorithms.

Multi-step-ahead forecasts can be produced recursively by iterating a one-step-ahead model, or directly using a specific model for each horizon. As a first contribution, we conduct an in-depth study to compare recursive and direct forecasts generated with different learning algorithms for different data generating processes. More precisely, we decompose the multi-step mean squared forecast errors into the bias and variance components, and analyze their behavior over the forecast horizon for different time series lengths. The results and observations made in this study then guide us for the development of new forecasting strategies.
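The recursive/direct distinction drawn above can be made concrete with linear models; a minimal sketch on a synthetic series, where the lag order, horizon and data are all illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
series = np.sin(np.arange(300) * 0.1) + rng.normal(0, 0.1, 300)
p, H = 4, 3  # lag order and maximum forecast horizon (illustrative)

X = np.lib.stride_tricks.sliding_window_view(series, p)[:-H]
hist = series[-p:]

# Recursive strategy: one one-step-ahead model, iterated, feeding each
# prediction back in as if it were an observation.
one_step = LinearRegression().fit(X, series[p : p + len(X)])
window = hist.copy()
recursive = []
for _ in range(H):
    yhat = one_step.predict(window.reshape(1, -1))[0]
    recursive.append(yhat)
    window = np.append(window[1:], yhat)

# Direct strategy: a separate model per horizon h, each trained on the
# true h-step-ahead target.
direct = [
    LinearRegression()
    .fit(X, series[p + h : p + h + len(X)])
    .predict(hist.reshape(1, -1))[0]
    for h in range(H)
]
```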

In particular, we find that choosing between recursive and direct forecasts is not an easy task since it involves a trade-off between bias and estimation variance that depends on many interacting factors, including the learning model, the underlying data generating process, the time series length and the forecast horizon. As a second contribution, we develop multi-stage forecasting strategies that do not treat the recursive and direct strategies as competitors, but seek to combine their best properties. More precisely, the multi-stage strategies generate recursive linear forecasts, and then adjust these forecasts by modeling the multi-step forecast residuals with direct nonlinear models at each horizon, called rectification models. We propose a first multi-stage strategy, that we called the rectify strategy, which estimates the rectification models using the nearest neighbors model. However, because recursive linear forecasts often need small adjustments with real-world time series, we also consider a second multi-stage strategy, called the boost strategy, that estimates the rectification models using gradient boosting algorithms that use so-called weak learners.
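A stripped-down version of the rectify idea, recursive linear forecasts corrected by a direct nonlinear model of the h-step residuals, might look like the following; the nearest-neighbours rectifier matches the abstract's description, but all parameters and data are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(4)
s = np.sin(np.arange(400) * 0.1) ** 3 + rng.normal(0, 0.05, 400)
p, h = 4, 2  # lag order and the horizon being rectified (illustrative)

X = np.lib.stride_tricks.sliding_window_view(s, p)[:-h]
y_h = s[p - 1 + h :]  # h-step-ahead target for each window

def recurse(model, window, steps):
    """Iterate a one-step model `steps` times, feeding predictions back in."""
    w = window.copy()
    for _ in range(steps):
        yhat = model.predict(w.reshape(1, -1))[0]
        w = np.append(w[1:], yhat)
    return yhat

# Stage 1: recursive linear forecasts.
one_step = LinearRegression().fit(X, s[p : p + len(X)])
base = np.array([recurse(one_step, x, h) for x in X])

# Stage 2: a direct nonlinear model of the h-step residuals
# (the "rectification" model), here nearest neighbours.
resid = y_h - base
rect = KNeighborsRegressor(n_neighbors=5).fit(X, resid)

# Rectified forecast = recursive linear forecast + predicted correction.
last = s[-p:]
forecast = recurse(one_step, last, h) + rect.predict(last.reshape(1, -1))[0]
```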

Generating multi-step forecasts using a different model at each horizon provides a large modeling flexibility. However, selecting these models independently can lead to irregularities in the forecasts that can contribute to increase the forecast variance. The problem is exacerbated with nonlinear machine learning models estimated from short time series. To address this issue, and as a third contribution, we introduce and analyze multi-horizon forecasting strategies that exploit the information contained in other horizons when learning the model for each horizon. In particular, to select the lag order and the hyperparameters of each model, multi-horizon strategies minimize forecast errors over multiple horizons rather than just the horizon of interest.

We compare all the proposed strategies with both the recursive and direct strategies. We first apply a bias and variance study, then we evaluate the different strategies using real-world time series from two past forecasting competitions. For the rectify strategy, in addition to avoiding the choice between recursive and direct forecasts, the results demonstrate that it performs better than, or at least close to, the best of the recursive and direct forecasts in different settings. For the multi-horizon strategies, the results emphasize the decrease in variance compared to single-horizon strategies, especially with linear or weakly nonlinear data generating processes. Overall, we found that the accuracy of multi-step-ahead forecasts based on machine learning algorithms can be significantly improved if an appropriate forecasting strategy is used to select the model parameters and to generate the forecasts.

Lastly, as a fourth contribution, we have participated in the Load Forecasting track of the Global Energy Forecasting Competition 2012. The competition involved a hierarchical load forecasting problem where we were required to backcast and forecast hourly loads for a US utility with twenty geographical zones. Our team, TinTin, ranked fifth out of 105 participating teams, and we have been awarded an IEEE Power & Energy Society award.


Doctorate in Sciences, specialization in Computer Science
5

Mercurio, Danilo. "Adaptive estimation for financial time series." Doctoral thesis, [S.l. : s.n.], 2004. http://deposit.ddb.de/cgi-bin/dokserv?idn=972597263.

6

Winn, David. "An analysis of neural networks and time series techniques for demand forecasting." Thesis, Rhodes University, 2007. http://hdl.handle.net/10962/d1004362.

Abstract:
This research examines the plausibility of developing demand forecasting techniques which are consistently and accurately able to predict demand. Time Series Techniques and Artificial Neural Networks are both investigated. Deodorant sales in South Africa are specifically studied in this thesis. Marketing techniques which are used to influence consumer buyer behaviour are considered, and these factors are integrated into the forecasting models wherever possible. The results of this research suggest that Artificial Neural Networks can be developed which consistently outperform industry forecasting targets as well as Time Series forecasts, suggesting that producers could reduce costs by adopting this more effective method.
7

Marriott, Richard Keyworth. "Estimating and forecasting a demand chain for food using cross-section and time-series data." Thesis, University of Bristol, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.266903.

8

Díaz, González Fernando. "Federated Learning for Time Series Forecasting Using LSTM Networks: Exploiting Similarities Through Clustering." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254665.

Abstract:
Federated learning poses a statistical challenge when training on highly heterogeneous sequence data. For example, time-series telecom data collected over long intervals regularly shows mixed fluctuations and patterns. These distinct distributions are an inconvenience when a node not only plans to contribute to the creation of the global model but also intends to apply it to its local dataset. In this scenario, adopting a one-size-fits-all approach might be inadequate, even when using state-of-the-art machine learning techniques for time series forecasting such as Long Short-Term Memory (LSTM) networks, which have proven able to capture many idiosyncrasies and generalise to new patterns. In this work, we show that clustering the clients by these patterns and selectively aggregating their updates into different global models can improve local performance with minimal overhead, as we demonstrate through experiments using real-world time series datasets and a basic LSTM model.
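The cluster-then-aggregate idea can be sketched without a full LSTM training loop. Below, clients are grouped by a seasonal fingerprint and their "updates" (here simply the fingerprints, standing in for model deltas) are averaged per cluster; all names and data are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# Six hypothetical clients: three with a daily pattern, three roughly flat.
daily = [np.sin(2 * np.pi * np.arange(96) / 24) + rng.normal(0, 0.1, 96)
         for _ in range(3)]
flat = [rng.normal(0, 0.1, 96) for _ in range(3)]
clients = daily + flat

# Cluster clients on a compact fingerprint of their series:
# the mean value at each hour of the day (4 days of hourly data).
fingerprints = np.array([c.reshape(-1, 24).mean(axis=0) for c in clients])
cluster_of = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(fingerprints)

# Selective aggregation: average each client's locally computed update only
# with updates from the same cluster (fingerprints stand in for model deltas).
global_models = {int(k): fingerprints[cluster_of == k].mean(axis=0)
                 for k in set(cluster_of)}
```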
9

Li, Lei. "Fast Algorithms for Mining Co-evolving Time Series." Research Showcase @ CMU, 2011. http://repository.cmu.edu/dissertations/112.

Abstract:
Time series data arise in many applications, from motion capture, environmental monitoring and temperatures in data centers to physiological signals in health care. In this thesis, I focus on the theme of learning and mining large collections of co-evolving sequences, with the goal of developing fast algorithms for finding patterns, summarization, and anomalies. In particular, this thesis answers the following recurring challenges for time series: 1. Forecasting and imputation: How to do forecasting and recover missing values in time series data? 2. Pattern discovery and summarization: How to identify the patterns in the time sequences that would facilitate further mining tasks such as compression, segmentation and anomaly detection? 3. Similarity and feature extraction: How to extract compact and meaningful features from multiple co-evolving sequences that will enable better clustering and similarity queries of time series? 4. Scale-up: How to handle large data sets on modern computing hardware? We develop models to mine time series with missing values, to extract compact representations from time sequences, to segment the sequences, and to do forecasting. For large-scale data, we propose algorithms for learning time series models, in particular Linear Dynamical Systems (LDS) and Hidden Markov Models (HMM). We also develop a distributed algorithm for finding patterns in large web-click streams. The thesis also presents special models and algorithms that incorporate domain knowledge. For motion capture, we describe natural motion stitching and occlusion filling for human motion; in particular, we provide a metric for evaluating the naturalness of motion stitching, based on which we choose the best stitching. Thanks to domain knowledge (body structure and bone lengths), our algorithm is capable of recovering occlusions in mocap sequences, with better accuracy and over longer missing periods.
We also develop an algorithm for forecasting thermal conditions in a warehouse-sized data center. The forecast helps control and manage the data center in an energy-efficient way, which can save a significant percentage of the electric power consumed in data centers.
10

Kruger, Albertus Stephanus. "An investigation into the use of combined linear and neural network models for time series data / A.S. Kruger." Thesis, North-West University, 2009. http://hdl.handle.net/10394/4782.

Abstract:
Time series forecasting is an important area of forecasting in which past observations of the same variable are collected and analyzed to develop a model describing the underlying relationship. The model is then used to extrapolate the time series into the future. This modeling approach is particularly useful when little knowledge is available on the underlying data-generating process or when there is no satisfactory explanatory model that relates the prediction variable to other explanatory variables. Time series can be modeled in a variety of ways, e.g. using exponential smoothing techniques, regression models, autoregressive (AR) techniques or moving averages (MA). Recent research activities in forecasting also suggest that artificial neural networks can be used as an alternative to traditional linear forecasting models. This study will, along the lines of an existing study in the literature, investigate the use of a hybrid approach to time series forecasting using both linear and neural network models. The proposed methodology consists of two basic steps. In the first step, a linear model is used to analyze the linear part of the problem, and in the second step a neural network model is developed to model the residuals from the linear model. The results from the neural network can then be used to predict the error terms for the linear model, which means that the combined forecast of the time series depends on both models. Following an overview of the models, empirical tests on real-world data are performed to determine the forecasting performance of such a hybrid model. Results indicate that, depending on the forecasting period, it might be worthwhile to consider the use of a hybrid model.
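The two-step hybrid described above (a linear model first, then a neural network on its residuals) can be sketched as follows; the MLP stands in for whatever network the study actually uses, and the series, lag order and hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)
t = np.arange(500)
series = 0.01 * t + np.sin(t * 0.3) ** 2 + rng.normal(0, 0.05, 500)

p = 6  # lag order (illustrative)
X = np.lib.stride_tricks.sliding_window_view(series, p)[:-1]
y = series[p:]

# Step 1: a linear model captures the linear part of the problem.
lin = LinearRegression().fit(X, y)
resid = y - lin.predict(X)

# Step 2: a neural network models the residuals of the linear model,
# i.e. it learns to predict the linear model's error terms.
nn = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
nn.fit(X, resid)

# Combined forecast = linear forecast + predicted error term.
last = series[-p:].reshape(1, -1)
forecast = lin.predict(last)[0] + nn.predict(last)[0]
```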
Thesis (M.Sc. (Computer Science))--North-West University, Vaal Triangle Campus, 2010.
11

Zlicar, Blaz. "Algorithms for noisy and nonstationary data : advances in financial time series forecasting and pattern detection with machine learning." Thesis, University College London (University of London), 2018. http://discovery.ucl.ac.uk/10043123/.

12

Jaunzems, Davis. "Time-series long-term forcasting for A/B tests." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-205344.

Abstract:
The technological development of computing devices and communication tools has made it possible to store and process more information than ever before. For researchers it is a means of making more accurate scientific discoveries; for companies it is a way of better understanding their clients and products and gaining an edge over competitors. In industry, A/B testing is becoming an important and common way of obtaining insights that help to make data-driven decisions. An A/B test is a comparison of two or more versions to determine which performs better according to predetermined measurements. In combination with data mining and statistical analysis, these tests make it possible to answer important questions and help the transition from the state of "we think" to "we know". Nevertheless, running bad test cases can have a negative impact on businesses and can result in a bad user experience, which is why it is important to be able to forecast the long-term effects of an A/B test from short-term data. In this report, A/B tests and their forecasting are examined using univariate time-series analysis. However, the short duration and high diversity of the tests pose a great challenge to providing accurate long-term forecasts. This is a quantitative and empirical study that uses a real-world dataset from the social game development company King Digital Entertainment PLC (King.com). First, the data are analysed and pre-processed through a series of steps. Time-series forecasting has been around for generations, so an analysis and accuracy comparison of existing forecasting models, such as the mean forecast, ARIMA and artificial neural networks, is carried out. The results on the real dataset mirror what other researchers have found for long-term forecasts from short-term data. To improve forecasting accuracy, a time-series clustering method is proposed. The method exploits similarity between time series through Dynamic Time Warping and trains separate forecasting models per cluster. The clusters are chosen with high accuracy using a Random Forest classifier, and certainty about a time series' long-term range is obtained by using historical tests and a Markov chain. The proposed method shows superior results against the existing models and can be used to obtain long-term forecasts for A/B tests.
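Dynamic Time Warping, the similarity measure underpinning the proposed clustering, can be written out directly; the following is a textbook O(nm) implementation (not the thesis's own code), plus a pair of shifted sinusoids to show why it suits pattern-based grouping:

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic-time-warping distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two series with the same shape but shifted in time stay close under DTW,
# even though their pointwise distance is comparatively large.
t = np.linspace(0, 2 * np.pi, 50)
a, b = np.sin(t), np.sin(t - 0.5)
```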
13

Khakipoor, Banafsheh. "Applied Science for Water Quality Monitoring." University of Akron / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=akron1595858677325397.

14

Wang, Shuo. "An Improved Meta-analysis for Analyzing Cylindrical-type Time Series Data with Applications to Forecasting Problem in Environmental Study." Digital WPI, 2015. https://digitalcommons.wpi.edu/etd-theses/386.

Abstract:
This thesis provides a case study of how wind direction plays an important role in the amount of rainfall in the village of Somió. The primary goal is to illustrate how a meta-analysis, together with circular data analytic methods, helps in analyzing certain environmental issues. The existing GLS meta-analysis combines the merits of the usual meta-analysis, yielding better precision while accounting for covariance among coefficients; it remains limited, however, because information about the correlation between studies is not utilized. Hence, in my proposed meta-analysis, I take the correlations between adjacent studies into account when employing the GLS meta-analysis. I also fit a time series linear-circular regression as a comparable model. By comparing the confidence intervals of parameter estimates, covariance matrices, AIC, BIC and p-values, I discuss an improvement of the GLS meta-analysis model in its application to a forecasting problem in environmental study.
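A GLS pooling step of the kind this abstract builds on can be sketched as inverse-covariance weighting of per-study estimates; the studies, estimates and covariance matrix below are invented for illustration:

```python
import numpy as np

def gls_pool(estimates, cov):
    """Pool per-study estimates of one effect by GLS: with a design matrix of
    ones this is (1' V^-1 1)^-1 1' V^-1 y, i.e. inverse-covariance weighting,
    and it also yields the pooled variance."""
    v_inv = np.linalg.inv(cov)
    ones = np.ones(len(estimates))
    var = 1.0 / (ones @ v_inv @ ones)
    mean = var * (ones @ v_inv @ np.asarray(estimates))
    return float(mean), float(var)

# Three hypothetical studies of the same effect; adjacent studies are
# correlated (the off-diagonal terms), as in the proposed meta-analysis.
est = np.array([1.1, 0.9, 1.3])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.05, 0.01],
                [0.00, 0.01, 0.03]])
pooled, pooled_var = gls_pool(est, cov)
```

The pooled variance comes out smaller than any single study's variance, which is the precision gain the abstract attributes to meta-analysis.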
15

Claudino, Joana Filipa Caetano. "Intelligent system for time series pattern identification and prediction." Master's thesis, Instituto Superior de Economia e Gestão, 2020. http://hdl.handle.net/10400.5/21036.

Abstract:
Master's degree in Information Systems Management
The current growing volumes of data present a source of potentially valuable information for companies, but they also pose new challenges never faced before. Despite their intrinsic complexity, time series are a notably relevant kind of data in the entrepreneurial context, especially regarding prediction tasks. The Autoregressive Integrated Moving Average (ARIMA) models have been the most popular approach for such tasks, but they do not scale well to bigger and more granular time series which are becoming increasingly common. Hence, newer research trends involve the application of data-driven models, such as Recurrent Neural Networks (RNNs), to forecasting. Therefore, given the difficulty of time series prediction and the need for improved tools, the purpose of this project was to implement the classical ARIMA models and the most prominent RNN architectures in an automated fashion and posteriorly to use such models as foundation for the development of a modular system capable of supporting the common user along the entire forecasting process. Design science research was the adopted methodology to achieve the proposed goals and it comprised the activities of goal definition, followed by a thorough literature review aimed at providing the theoretical background necessary to the subsequent step that involved the actual project execution and, finally, the careful evaluation of the produced artifact. In general, each the established goals were accomplished, and the main contributions of the project were the developed system itself due to its practical usefulness along with some empirical evidence supporting the suitability of RNNs to time series forecasting.
info:eu-repo/semantics/publishedVersion
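Entries like this one center on the ARIMA family; as a rough illustration of the autoregressive recurrence those models build on (a hypothetical sketch, not the system implemented in the thesis), an AR(1) coefficient can be estimated by least squares in plain Python:

```python
# Minimal sketch of the autoregressive core behind ARIMA-style forecasting.
# Pure Python, no dependencies; a real workflow would use e.g. statsmodels,
# which additionally handles differencing, MA terms, and intercepts.

def fit_ar1(series):
    """Estimate phi in x[t] = phi * x[t-1] + e[t] by least squares (no intercept)."""
    num = sum(series[t - 1] * series[t] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def forecast_ar1(series, phi, steps):
    """Iterate the fitted AR(1) recurrence to produce a multi-step forecast."""
    preds, last = [], series[-1]
    for _ in range(steps):
        last = phi * last
        preds.append(last)
    return preds

# A series that halves each step is recovered exactly: phi = 0.5.
phi = fit_ar1([8.0, 4.0, 2.0, 1.0])
print(phi, forecast_ar1([8.0, 4.0, 2.0, 1.0], phi, 2))  # 0.5 [0.5, 0.25]
```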
APA, Harvard, Vancouver, ISO, and other styles
16

Kotriwala, Arzam Muzaffar. "Load Forecasting for Temporary Power Installations : A Machine Learning Approach." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-211554.

Full text
Abstract:
Sports events, festivals, construction sites, and film sets are examples of cases where power is required temporarily, often away from the power grid. Temporary Power Installations refer to systems set up for a limited time, with power typically generated on-site. Most load forecasting research has centered on settings with a permanent supply of power, such as residential buildings. In contrast, this work proposes machine learning approaches to accurately forecast load for Temporary Power Installations. In practice, these systems are typically powered by diesel generators that are over-sized and consequently operate at low, inefficient load levels. In this thesis, a ‘Pre-Event Forecasting’ approach is proposed to address this inefficiency by assigning a new Temporary Power Installation to a cluster of installations with similar load patterns. By doing so, the sizing of generators and power generation planning can be optimized, thereby improving system efficiency. Load forecasting is also useful while a Temporary Power Installation is operational. A ‘Real-Time Forecasting’ approach is proposed that uses monitored load data streamed to a server to forecast load two or more hours ahead. By doing so, practical measures can be taken in real time to meet unexpectedly high and low power demands, thereby improving system reliability.
APA, Harvard, Vancouver, ISO, and other styles
17

Saluja, Rohit. "Interpreting Multivariate Time Series for an Organization Health Platform." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-289465.

Full text
Abstract:
Machine learning-based systems are rapidly becoming popular because it has been realized that machines are more efficient and effective than humans at performing certain tasks. Although machine learning algorithms are extremely popular, they are also very literal and undeviating. This has led to a surge of research on interpretability in machine learning, aimed at ensuring that machine learning models are reliable, fair, and can be held accountable for their decision-making process. Moreover, in most real-world problems, merely making predictions with machine learning algorithms solves the problem only partially. Time series are one of the most popular and important data types because of their dominant presence in business, economics, and engineering. Despite this, interpretability for time series is still relatively unexplored compared to tabular, text, and image data. With the growing research on interpretability in machine learning, there is also a pressing need to quantify the quality of the explanations produced by interpreting machine learning models; for this reason, the evaluation of interpretability is extremely important. The evaluation of interpretability for models built on time series appears completely unexplored in research circles. This thesis work focused on achieving and evaluating model-agnostic interpretability in a time series forecasting problem. The use case discussed in this thesis concerns a problem faced by a digital consultancy company, which wants to take a data-driven approach to understanding the effect of its various sales-related activities on the sales deals it closes. The solution involved framing the problem as a time series forecasting problem to predict the sales deals and interpreting the underlying forecasting model.
Interpretability was achieved using two novel model-agnostic interpretability techniques: Local Interpretable Model-agnostic Explanations (LIME) and Shapley Additive Explanations (SHAP). The explanations produced were assessed through human evaluation of interpretability. The results of the human evaluation studies clearly indicate that the explanations produced by LIME and SHAP greatly helped lay users understand the predictions made by the machine learning model. The results also indicated that LIME and SHAP explanations were almost equally understandable, with LIME performing better, but by a very small margin. The work done during this project can easily be extended to any time series forecasting or classification scenario for achieving and evaluating interpretability. Furthermore, it can offer a solid framework for achieving and evaluating interpretability in any machine learning-based regression or classification problem.
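LIME and SHAP, mentioned above, both explain a single prediction by probing how the model reacts to perturbed inputs. The following toy attribution routine is only in that spirit; it is not either library's actual algorithm, and the names are illustrative:

```python
# Toy perturbation-based local attribution, loosely in the spirit of
# model-agnostic explainers such as LIME/SHAP (hypothetical sketch only):
# mask one feature at a time and record how much the prediction moves.

def local_attribution(predict, x):
    """Return per-feature contribution estimates for a single input x."""
    base = predict(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = 0.0  # crude baseline value for the masked feature
        scores.append(base - predict(perturbed))
    return scores

# For a linear model, masking recovers weight_i * x_i per feature.
def model(x):
    return 2.0 * x[0] + 0.5 * x[1]

print(local_attribution(model, [3.0, 4.0]))  # [6.0, 2.0]
```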
APA, Harvard, Vancouver, ISO, and other styles
18

Sävhammar, Simon. "Uniform interval normalization : Data representation of sparse and noisy data sets for machine learning." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-19194.

Full text
Abstract:
The uniform interval normalization technique is proposed as an approach to handle sparse data and noise in the data. The technique is evaluated by transforming and normalizing the MoodMapper and Safebase data sets; the predictive capabilities are compared by forecasting the data sets with an LSTM model. The results are compared to both the commonly used MinMax normalization technique and MinMax normalization with a time2vec layer. It was found that uniform interval normalization performed better on the sparse MoodMapper data set as well as on the denser Safebase data set. Future work consists of studying the performance of uniform interval normalization on other data sets and with other machine learning models.
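The MinMax baseline named in the abstract can be sketched in a few lines (the thesis's own uniform interval normalization is not specified here, so it is deliberately not reproduced):

```python
# MinMax normalization baseline: rescale a series linearly onto [0, 1].
# Minimal sketch, not the thesis's implementation.

def minmax_normalize(series):
    lo, hi = min(series), max(series)
    if hi == lo:  # constant series: map everything to 0 to avoid division by zero
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]

print(minmax_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```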
APA, Harvard, Vancouver, ISO, and other styles
19

Johansson, David. "Automatic Device Segmentation for Conversion Optimization : A Forecasting Approach to Device Clustering Based on Multivariate Time Series Data from the Food and Beverage Industry." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-81476.

Full text
Abstract:
This thesis investigates a forecasting approach to clustering device behavior based on multivariate time series data. Identifying an equitable selection to use in conversion optimization testing is a difficult task. As devices collect ever larger amounts of behavioral data, it becomes increasingly difficult to select segments manually in traditional conversion optimization systems. Forecasting the segments can be done automatically to reduce the time spent on testing while increasing test accuracy and relevance. The thesis evaluates the results of utilizing multiple forecasting models, clustering models, and data pre-processing techniques. Under optimal conditions, the proposed model achieves an average accuracy of 97.7%.
APA, Harvard, Vancouver, ISO, and other styles
20

Mejdi, Sami. "Encoder-Decoder Networks for Cloud Resource Consumption Forecasting." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291546.

Full text
Abstract:
Excessive resource allocation in telecommunications networks can be prevented by forecasting the resource demand when dimensioning the networks and then allocating the necessary resources accordingly, which is part of an ongoing effort to achieve more sustainable development. In this work, traffic data from cloud environments that host deployed virtualized network functions (VNFs) of an IP Multimedia Subsystem (IMS) has been collected along with the computational resource consumption of the VNFs. A supervised learning approach was adopted to address the forecasting problem by considering encoder-decoder networks. These networks were applied to forecast future resource consumption of the VNFs by regarding the problem as a time series forecasting problem and recasting it as a sequence-to-sequence (seq2seq) problem. Different encoder-decoder network architectures were then utilized to forecast the resource consumption. The encoder-decoder networks were compared against a widely deployed classical time series forecasting model that served as a baseline. The results show that while the considered encoder-decoder models failed to outperform the baseline model in overall Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), their forecasting capabilities were more resilient to degradation over time. This suggests that encoder-decoder networks are more appropriate for long-term forecasting, which is in agreement with related literature. Furthermore, the encoder-decoder models achieved competitive performance compared to the baseline, despite limited hyperparameter tuning and the absence of more sophisticated functionality such as attention. This work has shown that there is indeed potential for deep learning applications in forecasting of cloud resource consumption.
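Recasting a forecasting task as a sequence-to-sequence problem, as the abstract describes, amounts to slicing the series into (input window, output window) training pairs. A minimal sketch of that data preparation only, with illustrative names and not the thesis's code:

```python
# Build seq2seq training pairs from a univariate series: each sample maps an
# encoder input window to the decoder's target window that follows it.

def make_seq2seq_pairs(series, in_len, out_len):
    pairs = []
    for start in range(len(series) - in_len - out_len + 1):
        enc_in = series[start:start + in_len]
        dec_out = series[start + in_len:start + in_len + out_len]
        pairs.append((enc_in, dec_out))
    return pairs

pairs = make_seq2seq_pairs([1, 2, 3, 4, 5], in_len=3, out_len=1)
print(pairs)  # [([1, 2, 3], [4]), ([2, 3, 4], [5])]
```

The same windowing works for multi-step targets by raising `out_len`, which is how a model is trained to emit an entire future sequence at once.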
APA, Harvard, Vancouver, ISO, and other styles
21

Mejdi, Sami. "Encoder-Decoder Networks for Cloud Resource Consumption Forecasting." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-294066.

Full text
Abstract:
Excessive resource allocation in telecommunications networks can be prevented by forecasting the resource demand when dimensioning the networks and then allocating the necessary resources accordingly, which is part of an ongoing effort to achieve more sustainable development. In this work, traffic data from cloud environments that host deployed virtualized network functions (VNFs) of an IP Multimedia Subsystem (IMS) has been collected along with the computational resource consumption of the VNFs. A supervised learning approach was adopted to address the forecasting problem by considering encoder-decoder networks. These networks were applied to forecast future resource consumption of the VNFs by regarding the problem as a time series forecasting problem and recasting it as a sequence-to-sequence (seq2seq) problem. Different encoder-decoder network architectures were then utilized to forecast the resource consumption. The encoder-decoder networks were compared against a widely deployed classical time series forecasting model that served as a baseline. The results show that while the considered encoder-decoder models failed to outperform the baseline model in overall Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), their forecasting capabilities were more resilient to degradation over time. This suggests that encoder-decoder networks are more appropriate for long-term forecasting, which is in agreement with related literature. Furthermore, the encoder-decoder models achieved competitive performance compared to the baseline, despite limited hyperparameter tuning and the absence of more sophisticated functionality such as attention. This work has shown that there is indeed potential for deep learning applications in forecasting of cloud resource consumption.
APA, Harvard, Vancouver, ISO, and other styles
22

Almqvist, Olof. "A comparative study between algorithms for time series forecasting on customer prediction : An investigation into the performance of ARIMA, RNN, LSTM, TCN and HMM." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-16974.

Full text
Abstract:
Time series prediction is one of the main areas of statistics and machine learning. In 2018, two new algorithms, the higher-order hidden Markov model and the temporal convolutional network, were proposed and emerged as challengers to the more traditional recurrent neural network and long short-term memory network, as well as the autoregressive integrated moving average (ARIMA) model. In this study, most major algorithms, together with recent innovations for time series forecasting, are trained and evaluated on two datasets from the theme park industry with the aim of predicting the future number of visitors. The Python libraries Keras and Statsmodels were used to develop the models. Results from this thesis show that the neural network models are slightly better than ARIMA and the hidden Markov model, and that the temporal convolutional network does not perform significantly better than the recurrent or long short-term memory networks, although it has the lowest prediction error on one of the datasets. Interestingly, the Markov model performed worse than all neural network models even when using no independent variables.
APA, Harvard, Vancouver, ISO, and other styles
23

Zdybek, Mia. "Evaluating deep learning models for electricity spot price forecasting." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302642.

Full text
Abstract:
Electricity spot prices are difficult to predict since they depend on various unstable and erratic parameters, and because electricity is a commodity that cannot be stored efficiently. This results in volatile, highly fluctuating price behavior with many peaks. Machine learning algorithms have outperformed traditional methods in various areas due to their ability to learn complex patterns. In the last decade, deep learning approaches have been introduced to electricity spot price prediction problems, often exceeding their predecessors. In this thesis, several deep learning models were built and evaluated on their ability to predict the spot prices 10 days ahead. Several conclusions were drawn. Firstly, rather simple neural network architectures can predict prices with high accuracy, except for the most extreme sudden peaks. Secondly, all the deep networks outperformed the benchmark statistical model. Lastly, the proposed LSTM and CNN provided forecasts that were statistically significantly superior and had the lowest errors, suggesting they are the most suitable for the prediction task.
APA, Harvard, Vancouver, ISO, and other styles
24

Gheyas, Iffat A. "Novel computationally intelligent machine learning algorithms for data mining and knowledge discovery." Thesis, University of Stirling, 2009. http://hdl.handle.net/1893/2152.

Full text
Abstract:
This thesis addresses three major issues in data mining: feature subset selection in high-dimensional domains, plausible reconstruction of incomplete data in cross-sectional applications, and forecasting of univariate time series. For the automated selection of an optimal subset of features in real time, we present an improved hybrid algorithm, SAGA. SAGA combines the ability of Simulated Annealing to avoid being trapped in local minima with the very high convergence rate of the crossover operator of Genetic Algorithms, the strong local search ability of greedy algorithms, and the high computational efficiency of generalized regression neural networks (GRNNs). For imputing missing values and forecasting univariate time series, we propose a homogeneous neural network ensemble. The proposed ensemble consists of a committee of GRNNs trained on different subsets of features generated by SAGA, with the predictions of the base classifiers combined by a fusion rule. This approach makes it possible to discover all important interrelations between the values of the target variable and the input features. The proposed ensemble scheme has two innovative features that make it stand out amongst ensemble learning algorithms: (1) the ensemble makeup is optimized automatically by SAGA; and (2) GRNNs are used both as the base classifiers and as the top-level combiner classifier. Because of the GRNN, the proposed ensemble is a dynamic weighting scheme, in contrast to existing ensemble approaches that rely on simple voting and static weighting strategies. The basic idea of the dynamic weighting procedure is to give a higher reliability weight to scenarios that are similar to the new one. The simulation results demonstrate the validity of the proposed ensemble model.
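A GRNN is, at its core, Nadaraya-Watson kernel regression: the prediction is a kernel-weighted average of the training targets. A one-dimensional sketch of that core (illustrative only; the thesis's SAGA-optimized ensemble is far richer) could be:

```python
import math

# Nadaraya-Watson kernel regression, the mechanism underlying a GRNN:
# weight each training target by a Gaussian kernel on input distance.
# sigma is the free smoothing parameter.

def grnn_predict(train_x, train_y, x, sigma=1.0):
    weights = [math.exp(-((x - xi) ** 2) / (2 * sigma ** 2)) for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

# Querying at a training point with small sigma returns roughly its target;
# querying midway between two points returns their weighted average.
print(round(grnn_predict([0.0, 10.0], [1.0, 3.0], 0.0, sigma=0.5), 6))  # 1.0
```

The "dynamic weighting" described in the abstract is visible here: inputs close to the query receive large weights, so similar historical scenarios dominate the prediction.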
APA, Harvard, Vancouver, ISO, and other styles
25

Svensk, Gustav. "TDNet : A Generative Model for Taxi Demand Prediction." Thesis, Linköpings universitet, Programvara och system, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158514.

Full text
Abstract:
Supplying the right number of taxis in the right place at the right time is very important for taxi companies. In this paper, the machine learning model Taxi Demand Net (TDNet) is presented, which predicts short-term taxi demand in different zones of a city. It is based on WaveNet, a causal dilated convolutional neural network for time series generation. TDNet uses historical demand from the last years and transforms features such as time of day, day of week, and day of month into 26-hour taxi demand forecasts for all zones in a city. It has been applied to one city in northern Europe and one in South America. In northern Europe, an error of one taxi or less per hour per zone was achieved in 64% of the cases; in South America, the number was 40%. In both cities, it beat the SARIMA and stacked ensemble benchmarks. This performance was achieved by tuning the hyperparameters with a Bayesian optimization algorithm. Additionally, weather and holiday features were added as input features in the northern European city, but they did not improve the accuracy of TDNet.
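For benchmarks like the SARIMA model mentioned above, the simplest point of reference is a seasonal-naive forecast: predict each hour's demand with the value observed one season (e.g. one day or one week) earlier. A hedged sketch of that baseline, not TDNet or the thesis's actual benchmarks:

```python
# Seasonal-naive baseline: repeat the last observed season into the future.
# Toy example with a season of length 3; real taxi demand would use 24 (daily)
# or 168 (weekly) hourly lags.

def seasonal_naive_forecast(series, season, steps):
    return [series[-season + (i % season)] for i in range(steps)]

history = [10, 20, 30, 40, 50, 60]  # toy hourly demand
print(seasonal_naive_forecast(history, season=3, steps=4))  # [40, 50, 60, 40]
```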
APA, Harvard, Vancouver, ISO, and other styles
26

Damle, Chaitanya. "Flood forecasting using time series data mining." [Tampa, Fla.] : University of South Florida, 2005. http://purl.fcla.edu/fcla/etd/SFE0001038.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Логін, Вадим Вікторович. "Моделі для прогнозування характеристик трафіка цифрової реклами." Master's thesis, Київ, 2018. https://ela.kpi.ua/handle/123456789/23748.

Full text
Abstract:
Models for forecasting parameters of digital advertising traffic. Master's thesis: 112 pages, 48 figures, 40 tables, 3 appendices, and 30 sources. The object of study is digital advertising traffic in the form of statistical data. The subject of research is models and methods for the analysis of data in the form of time series, together with methods of applied statistics. The purpose of the work is to construct time series models for forecasting the most important characteristics of digital advertising traffic. The research methods are time series models for forecasting data and comparative analysis of the obtained models. This paper presents the results of constructing time series models intended for forecasting the most important characteristics of digital advertising traffic. It describes the results of a comparative analysis of the obtained models using information criteria, as well as in terms of their accuracy. It was found that, for this task, the best model is the ARIMAX model (autoregressive integrated moving-average model with exogenous inputs); this model is therefore recommended for further research. Based on the master's thesis, conference abstracts and a scientific article were written. The abstracts will be published in the SAIT-2018 conference Book of Abstracts, and the article will be published in the electronic collection of reports at the CEUR publishing house (CEUR Workshop Proceedings). Further development of the research object involves constructing new time series models, and improving existing ones, for forecasting the most important characteristics of digital advertising traffic, as well as generalizing the research conducted in this work to the analysis of individual sites within digital advertising traffic.
APA, Harvard, Vancouver, ISO, and other styles
28

Engström, Olof. "Deep Learning for Anomaly Detection in Microwave Links : Challenges and Impact on Weather Classification." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-276676.

Full text
Abstract:
Artificial intelligence is receiving a great deal of attention in various fields of science and engineering due to its promising applications. In today’s society, weather classification models with high accuracy are of utmost importance. An alternative to using conventional weather radars is to use measured attenuation data in microwave links as the input to deep learning-based weather classification models. Detecting anomalies in the measured attenuation data is of great importance as the output of a classification model cannot be trusted if the input to the classification model contains anomalies. Designing an accurate classification model poses some challenges due to the absence of predefined features to discriminate among the various weather conditions, and due to specific domain requirements in terms of execution time and detection sensitivity. In this thesis we investigate the relationship between anomalies in signal attenuation data, which is the input to a weather classification model, and the model’s misclassifications. To this end, we propose and evaluate two deep learning models based on long short-term memory networks (LSTM) and convolutional neural networks (CNN) for anomaly detection in a weather classification problem. We evaluate the feasibility and possible generalizations of the proposed methodology in an industrial case study at Ericsson AB, Sweden. The results show that both proposed methods can detect anomalies that correlate with misclassifications made by the weather classifier. Although the LSTM performed better than the CNN with regards to top performance on one link and average performance across all 5 tested links, the CNN performance is shown to be more consistent.
Artificial intelligence has received a great deal of attention in various fields of engineering and science because of its many promising applications. In today's society, weather classification models with high accuracy are of the utmost importance. An alternative to using conventional weather radar is to use measured attenuation data from microwave links as input to deep learning-based weather classification models. Detecting anomalies in the measured attenuation data is of great importance, since the reliability of a classification model decreases if the training data contains anomalies. Designing an accurate classification model is difficult because of the lack of predefined features for different types of weather conditions, and because of the specific domain requirements often imposed in terms of execution time and detection sensitivity. In this thesis we investigate the relationship between anomalies in measured attenuation data from microwave links and misclassifications made by a weather classification model. To this end, we evaluate anomaly detection in the context of weather classification using two deep learning models, based on long short-term memory networks (LSTM) and convolutional neural networks (CNN). We evaluate the feasibility and generalisability of the proposed methodology in an industrial case study at Ericsson AB. The results show that both proposed methods can detect anomalies that correlate with misclassifications made by the weather classification model. The LSTM model performed better than the CNN model both in terms of top performance on a single link and in terms of average performance across all 5 tested links, but the CNN model's performance was more consistent.
APA, Harvard, Vancouver, ISO, and other styles
29

Penzer, Jeremy. "Estimation of time series models with incomplete data." Thesis, Manchester Metropolitan University, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.294162.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Kumar, Prashant. "Forecasting Cloud Resource Utilization Using Time Series Methods." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-240595.

Full text
Abstract:
With contemporary technological advancements, the adoption of cloud as a service has been growing exponentially, providing a seemingly inexhaustible supply of resources such as storage, network, CPU and many more. In today's data centres, which accommodate thousands of servers, ensuring the availability of continuous services is a significant hurdle. To meet the compelling demand for resources, proactive forecasting of resource usage is of immense importance. Cloud service providers can profit from unbiased forecasts of future usage requirements on a cloud resource, obtained proactively from historical data patterns. Forecasting resource usage is therefore of great importance for the dynamic scaling of cloud resources, yielding gains such as cost savings and optimal energy consumption while vouching for the proper quality of service. Among the many resources in a cloud setup, we focus on the CPU utilisation metric. In this work, we review the performance of several forecasting methods discussed in the existing literature, experiment with ensemble models, identify the fundamental evaluation metrics, reframe the data as a standard machine learning problem, and ultimately compare the performance of the models on the given dataset. To assess the accuracy of the forecasting models we validate them against unseen test data using a walk-forward validation technique. In conclusion, we found the feed-forward neural network to be the best-performing model when evaluated on real traces of CPU utilisation from a cloud setup, where it showed an improvement of approximately 18.13% when a relative measure of forecasting error is considered. We also find that an amalgam of individual forecasting models, such as ARIMA and ETS, performs better than an individual time series method, with an improvement of approximately 2.6% over ARIMA in our case.
In the end, we also discuss possible approaches which could improve the performance of this work, and possible future work to encourage further research in the area of time series forecasting.
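The walk-forward validation used to assess the models above can be sketched in a few lines; the naive and moving-average forecasters and the synthetic CPU trace below are invented stand-ins, not the thesis's actual models or data:

```python
import numpy as np

def naive_forecast(history):
    """Predict the last observed value."""
    return history[-1]

def mean_forecast(history):
    """Predict the mean of the last five observations."""
    return float(np.mean(history[-5:]))

def walk_forward_mae(series, model, initial=20):
    """Walk-forward validation: forecast one step at a time,
    always training only on the data observed so far."""
    errors = []
    for t in range(initial, len(series)):
        pred = model(series[:t])
        errors.append(abs(pred - series[t]))
    return float(np.mean(errors))

rng = np.random.default_rng(1)
cpu = 50 + np.cumsum(rng.normal(0, 1, 120))  # synthetic CPU-utilisation trace
print(walk_forward_mae(cpu, naive_forecast))
print(walk_forward_mae(cpu, mean_forecast))
```

For a random-walk-like trace the naive forecaster should come out ahead; the point of the scheme is that no forecast ever uses future observations.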
With today's technological advances, the use of cloud services has grown exponentially, while providing a seemingly inexhaustible supply of resources in terms of storage, networking, CPU and much else. In today's data centres, which house thousands of servers, ensuring the availability of services is a significant challenge. To meet the growing demands on resources, proactive forecasting of resource usage is of great importance. Cloud service providers can build unbiased forecasts of future usage needs in the cloud with the help of historical data patterns. Predicting resource usage is therefore of great importance for the dynamic scaling of cloud resources, in order to achieve cost savings and optimal energy consumption while maintaining the required quality of service. Among the many resources in a cloud setup, we focus on CPU utilisation. In this work, we review the results of several forecasting methods discussed in the existing literature, experiment with ensemble models, identify fundamental evaluation metrics, formulate the problem as a standard machine learning problem, and finally compare the models' performance on the given dataset. To assess the accuracy of the forecasting model, it is validated against unseen test data using a walk-forward validation technique and then evaluated against an open dataset to check the method's validity. In conclusion, we found the feed-forward neural network to be the best-performing model when evaluated on real data for the CPU utilisation of a cloud service, showing an improvement of approximately 18.13% when a relative measure of forecasting error is used. We also find that a combination of individual forecasting models such as ARIMA and ETS performs better, with an improvement of approximately 2.6% over the individual time series method, ARIMA in our case.
In the end, we also discuss possible approaches that could improve the results of this work, and possible future work to encourage further research on time series forecasting.
APA, Harvard, Vancouver, ISO, and other styles
31

Fageehi, Yahya. "SIMULATION-BASED OPTIMIZATION FOR COMPLEX SYSTEMS WITH SUPPLY AND DEMAND UNCERTAINTY." University of Akron / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=akron1531147903589262.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Lundkvist, Emil. "Decision Tree Classification and Forecasting of Pricing Time Series Data." Thesis, KTH, Reglerteknik, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-151017.

Full text
Abstract:
Many companies today, in different fields of operations and sizes, have access to a vast amount of data which was not available only a couple of years ago. This situation gives rise to questions regarding how to organize and use the data in the best way possible. In this thesis a large database of pricing data for products within various market segments is analysed. The pricing data is from both external and internal sources and is therefore confidential. Because of the confidentiality, the labels from the database are substituted with generic ones in this thesis and the company is not referred to by name, but the analysis is carried out on the real data set. The data is initially unstructured and difficult to get an overview of, and is therefore first classified. This is performed by feeding some manually labelled training data into an algorithm which builds a decision tree. The decision tree is used to divide the rest of the products in the database into classes. Then, for each class, a multivariate time series model is built so that each product's future price within the class can be predicted. In order to interact with the classification and price prediction, a front end is also developed. The results show that the classification algorithm is both fast enough to operate in real time and performs well. The time series analysis shows that it is possible to use the information within each class to make predictions, and a simple vector autoregressive model used to perform them shows good predictive results.
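The classify-then-forecast pipeline above ends in a vector autoregression. A minimal VAR(1) fitted by least squares (a sketch of the general technique, not the thesis code; the coefficient matrix is invented) can look like this:

```python
import numpy as np

def fit_var1(Y):
    """Fit Y[t] = A @ Y[t-1] + c by least squares; Y has shape (T, k)."""
    X = np.hstack([Y[:-1], np.ones((len(Y) - 1, 1))])
    B, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
    return B[:-1].T, B[-1]  # A is (k, k), intercept c is (k,)

def forecast_var1(y_last, A, c):
    """One-step-ahead forecast from the last observation."""
    return A @ y_last + c

# Simulate two mutually dependent price series and recover the dynamics.
rng = np.random.default_rng(2)
A_true = np.array([[0.6, 0.2], [0.1, 0.5]])
Y = np.zeros((400, 2))
for t in range(1, 400):
    Y[t] = A_true @ Y[t - 1] + rng.normal(0, 0.1, 2)

A_hat, c_hat = fit_var1(Y)
print(np.round(A_hat, 2))  # should be close to A_true
```

In a real class of products, each column of `Y` would be one product's (suitably transformed) price series.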
APA, Harvard, Vancouver, ISO, and other styles
33

Lee, Fung-Man. "Studies in time series analysis and forecasting of energy data." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36032.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Sseguya, Raymond. "Forecasting anomalies in time series data from online production environments." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166044.

Full text
Abstract:
Anomaly detection on time series forecasts can be used by many industries, especially in forewarning systems that can predict anomalies before they happen. Infor (Sweden) AB is a software company that provides Enterprise Resource Planning cloud solutions. Infor is interested in predicting anomalies in their data, and that is the motivation for this thesis work. The general idea is first to forecast the time series and then to detect and classify anomalies on the forecast. The first part is time series forecasting and the second part is anomaly detection and classification performed on the forecasted values. In this thesis work, the time series forecasting to predict anomalous behaviour is done using two strategies, namely the recursive strategy and the direct strategy. The recursive strategy includes two methods: AutoRegressive Integrated Moving Average and Neural Network AutoRegression. The direct strategy is implemented with ForecastML-eXtreme Gradient Boosting. The three methods are then compared with respect to forecasting performance. The anomaly detection and classification is done by setting a decision rule based on a threshold. Since the true anomaly thresholds were not previously known, an arbitrary initial anomaly threshold is set by combining statistical methods for outlier detection with human judgement by the company commissioners. These statistical methods include Seasonal and Trend decomposition using Loess + InterQuartile Range, Twitter + InterQuartile Range and Twitter + GESD (Generalized Extreme Studentized Deviate). After defining what an anomaly threshold is in the usage context of Infor (Sweden) AB, a decision rule is set and used to classify anomalies in the time series forecasts. The results from comparing the classifications of the forecasts from the three time series forecasting methods are inconclusive, and no recommendation is made concerning which model or algorithm Infor (Sweden) AB should use.
However, the thesis work concludes by recommending other methods that can be tried in future research.
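The decompose-then-threshold idea described above (remove seasonality, then flag residuals outside an interquartile-range band) can be sketched roughly as follows. This is not the thesis's STL-based code: a simple seasonal-mean decomposition stands in for STL/Loess, and the series, period, and injected anomaly are all invented for the example:

```python
import numpy as np

def iqr_anomalies(series, period=7, k=3.0):
    """Remove a seasonal-mean component, then flag residuals outside
    [Q1 - k*IQR, Q3 + k*IQR] as anomalies (indices are returned)."""
    x = np.asarray(series, dtype=float)
    seasonal = np.array([x[i::period].mean() for i in range(period)])
    resid = x - seasonal[np.arange(len(x)) % period]
    q1, q3 = np.percentile(resid, [25, 75])
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return np.where((resid < lo) | (resid > hi))[0]

rng = np.random.default_rng(3)
x = 10 * np.sin(2 * np.pi * np.arange(140) / 7) + rng.normal(0, 0.5, 140)
x[40] += 8.0  # inject one anomalous spike

anoms = iqr_anomalies(x)
print(anoms)  # index 40 should be flagged
```

The multiplier `k` plays the role of the arbitrary initial threshold that the thesis sets together with the company commissioners.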
APA, Harvard, Vancouver, ISO, and other styles
35

Burgada, Muñoz Santiago. "Improvement on the sales forecast accuracy for a fast growing company by the best combination of historical data usage and clients segmentation." reponame:Repositório Institucional do FGV, 2014. http://hdl.handle.net/10438/13322.

Full text
Abstract:
Industrial companies in developing countries are facing rapid growth, and this requires having in place the best organizational processes to cope with the market demand. Sales forecasting, as a tool aligned with the general strategy of the company, needs to be as accurate as possible in order to achieve the sales targets: it makes the right information available for the purchasing, planning and production control areas, so that the generated demand can be met on time and in full. The present dissertation uses a single case study from the Brazilian subsidiary of Maxam, an international explosives company, which is experiencing high growth in sales and therefore faces the challenge of adapting its structure and processes to the rapid growth expected. Diverse sales forecast techniques have been analyzed to compare the actual monthly sales forecast, based on the sales force representatives' market knowledge, with forecasts based on the analysis of historical sales data. The dissertation findings show how combining qualitative and quantitative forecasts, by creating a combined forecast that joins the sales force's knowledge of client demand with time series analysis, improves the accuracy of the company's sales forecast.
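The combination of a qualitative (sales force) forecast with a quantitative (time series) forecast can be sketched as a weighted average whose weight is chosen by minimising the error on past data. Everything below is invented for illustration (the bias and noise levels, and the grid of weights); it only shows the combination mechanics, not the dissertation's actual procedure:

```python
import numpy as np

def combine(stat_forecast, sales_forecast, weight):
    """Weighted combination of a statistical forecast with the sales
    force's qualitative forecast (weight on the sales force)."""
    return weight * sales_forecast + (1 - weight) * stat_forecast

rng = np.random.default_rng(8)
actual = 100 + rng.normal(0, 5, 60)               # 60 months of sales
sales_force = actual + rng.normal(3, 6, 60)       # biased but informed
stats_model = actual + rng.normal(0, 8, 60)       # unbiased but noisier

def mae(f):
    return float(np.mean(np.abs(f - actual)))

# Pick the weight that would have minimised past error.
best_w = min(np.linspace(0, 1, 21),
             key=lambda w: mae(combine(stats_model, sales_force, w)))
print(mae(sales_force), mae(stats_model),
      mae(combine(stats_model, sales_force, best_w)))
```

Because the weight grid contains 0 and 1, the combined forecast can never do worse on the fitting period than the better of the two inputs; in practice the weight would be chosen on a hold-out period.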
APA, Harvard, Vancouver, ISO, and other styles
36

Vera, Barberán José María. "Adding external factors in Time Series Forecasting : Case study: Ethereum price forecasting." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-289187.

Full text
Abstract:
The main thrust of time-series forecasting models in recent years has gone in the direction of pattern-based learning, in which the input variable for the models is a vector of past observations of the variable itself to predict. The most used models based on this traditional pattern-based approach are the autoregressive integrated moving average model (ARIMA) and long short-term memory neural networks (LSTM). The main drawback of these approaches is their inability to react when the underlying relationships in the data change, resulting in degrading predictive performance. In order to solve this problem, various studies seek to incorporate external factors into the models, treating the system as a black box using a machine learning approach, which generates complex models that require a large amount of training data and have little interpretability. In this thesis, three different algorithms are proposed to incorporate additional external factors into these pattern-based models, obtaining a good balance between forecast accuracy and model interpretability. After applying these algorithms in a case study of Ethereum price time-series forecasting, it is shown that the prediction error can be efficiently reduced by taking these influential external factors into account, compared to traditional approaches, while maintaining full interpretability of the model.
The main thrust of time series forecasting models in recent years has been towards pattern-based learning, where the input variables of the models are a vector of past observations of the variable to be predicted. The most widely used models based on this traditional pattern-based approach are the autoregressive integrated moving average model (ARIMA) and long short-term memory neural networks (LSTM). The main drawback of these approaches is that they cannot react when the underlying relationships in the data change, which results in degraded predictive performance. To solve this problem, various studies try to integrate external factors into the models, treating the system as a black box with a machine learning approach that produces complex models which require large amounts of training data and have little explanatory capacity. In this thesis, three different algorithms are proposed for incorporating additional external factors into these pattern-based models, giving a good balance between forecast accuracy and model interpretability. After applying these algorithms in a case study of forecasting the Ethereum price time series, it is shown that the prediction error can be reduced efficiently by taking these influential external factors into account, compared with traditional approaches, while retaining full interpretability of the model.
APA, Harvard, Vancouver, ISO, and other styles
37

Tsang, Fan Cheong. "Advances in flood forecasting using radar rainfalls and time-series analysis." Thesis, Lancaster University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.481184.

Full text
Abstract:
This thesis reports the use of a time-series analysis approach to study the catchment hydrological system of the River Ribble. Rain gauge records, radar rainfall estimates and flow data are used in the analysis. The preliminary study consists of the flow forecasting at Reedyford, Pendle Water (82 km2). Flow forecasts generated from the rain gauge records are better than the radar rainfall estimates over this small catchment. However, the catchment response to rainfall is quick and no clear advantages in extending the lead-time of the forecast can be introduced by using an artificial time delayed rainfall input. A non-linear rainfall-flow relationship has been studied using the rain gauge rainfall and flow records at the River Hodder catchment (261 km2). A calibration scheme is used to identify the non-linear function of the catchment as well as the rainfall-flow system model. Although a better time-invariant system model can be identified, the non-linear rainfall-flow process cannot be fully explained by a power law function of effective rainfall. Assuming the dynamic, nonlinear system characteristics of the catchment can be reflected by a time-varying model gain parameter, relationships between the parameter and the flow, and between the parameter and the rainfall can be evaluated. These relationships have been used to improve the flow forecast during storm events. The results indicate, however, that the approach failed to improve the flow forecast near the peak flow condition. Radar data have been incorporated to forecast the flow at Jumbles Rock (1053 km2) and Samlesbury (1140 km2), River Ribble. The radar data calibrated by the Lancaster University Adaptive Radar Calibration System appears to produce better flow forecasts than the standard radar data product calibrated by the Meteorological Office. The proposed flow forecasting scheme generates better forecasts than the current system operated by the National Rivers Authority, North West Region.
APA, Harvard, Vancouver, ISO, and other styles
38

Katardjiev, Nikola. "High-variance multivariate time series forecasting using machine learning." Thesis, Uppsala universitet, Institutionen för informatik och media, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-353827.

Full text
Abstract:
There are several tools and models in machine learning that can be used to forecast a given time series; however, it is not always clear which model to select, as different models are suited to different types of data, and domain-specific transformations and considerations are usually required. This research aims to examine the issue by modelling four types of machine and deep learning algorithms (support vector machine, random forest, feed-forward neural network, and an LSTM neural network) on a high-variance, multivariate time series to forecast trend changes one time step into the future, accounting for lag. The models were trained on clinical trial data of patients in an alcohol addiction treatment plan provided by an Uppsala-based company. The results showed moderate performance differences, with a concern that the models were performing a random walk or naive forecast. Further analysis was able to prove that at least one model, the feed-forward neural network, was not doing so and was able to make meaningful forecasts one time step into the future. In addition, the research examined the effect of optimization processes by comparing a grid search, a random search, and a Bayesian optimization process. In all cases the grid search found the lowest minima, though its slow runtimes were consistently beaten by Bayesian optimization, whose performance was only slightly below that of the grid search.
There are several tools and models in machine learning that can be used to produce time series forecasts, but it is rarely clear which model is appropriate, since different models are suited to different kinds of data. This research aims to examine the problem by training four models (support vector machine, random forest, a neural network, and an LSTM network) on a high-variance multivariate time series to predict trend changes one time step ahead, controlling for lag. The models were trained on clinical trial data from patients who took part in an alcohol addiction treatment plan run by an Uppsala-based company. The results showed some moderate performance differences, and there was a concern that the models were performing a random-walk forecast. The analysis found, however, that one of the neural network models was not doing so, but instead produced meaningful predictions. The research also examined the effect of optimisation processes by comparing grid search, random search, and Bayesian optimisation. In all cases grid search found the lowest minimum, but its slow runtimes were consistently beaten by Bayesian optimisation, which also performed on a par with grid search.
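The grid-versus-random comparison discussed above can be illustrated on a toy one-dimensional hyperparameter problem. The loss function below is entirely hypothetical (a smooth curve with its minimum at a learning rate of 1e-2), and Bayesian optimisation is omitted since it needs a surrogate model:

```python
import numpy as np

def loss(lr):
    """Hypothetical validation loss as a function of one hyperparameter,
    minimised at lr = 1e-2."""
    return (np.log10(lr) + 2.0) ** 2 + 0.1

# Grid search: exhaustive over a fixed, coarse log-spaced grid.
grid = np.logspace(-4, 0, 9)
best_grid = min(grid, key=loss)

# Random search: the same evaluation budget, points drawn log-uniformly.
rng = np.random.default_rng(4)
candidates = 10 ** rng.uniform(-4, 0, 9)
best_rand = min(candidates, key=loss)

print(best_grid, loss(best_grid))
print(best_rand, loss(best_rand))
```

Here the grid happens to contain the true optimum, mirroring the thesis's finding that grid search reaches the lowest minimum; random (and Bayesian) search trade that guarantee for a better use of a fixed budget in higher dimensions.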
APA, Harvard, Vancouver, ISO, and other styles
39

Wang, Mu-Chun. "On the forecasting of economic time series structural versus data-based approaches." Göttingen Sierke, 2009. http://d-nb.info/994722028/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Börjesson, Lukas. "Forecasting Financial Time Series through Causal and Dilated Convolutional Neural Networks." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-167331.

Full text
Abstract:
In this paper, predictions of future price movements of a major American stock index were made by analysing past movements of the same and other correlated indices. A model that has shown very good results in speech recognition was modified to suit the analysis of financial data and was then compared to a base model restricted by assumptions made for an efficient market. The performance of any model that is trained by looking at past observations is heavily influenced by how the division of the data into train, validation and test sets is made. This is further exaggerated by the temporal structure of the financial data, which means that the causal relationship between the predictors and the response is time-dependent. The complexity of the financial system further increases the difficulty of making accurate predictions, but the model suggested here was still able to outperform the naive base model by more than 20 percent. The model is, however, too primitive to be used as a trading system, but suitable modifications that would turn it into one are discussed at the end of the paper.
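The speech-recognition architecture referenced above (WaveNet-style networks) is built from causal, dilated convolutions. A minimal sketch of such a convolution (illustrative only; the filter and dilation are invented, and a real model would stack many such layers with learned weights):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation=1):
    """1-D causal convolution: the output at time t depends only on
    x[t], x[t-d], x[t-2d], ... Left zero-padding keeps the output the
    same length as the input, so no future values leak in."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
                     for t in range(len(x))])

x = np.arange(8, dtype=float)  # 0, 1, ..., 7
y = causal_dilated_conv(x, w=np.array([1.0, -1.0]), dilation=2)
print(y)  # a lag-2 difference: y[t] = x[t] - x[t-2] (zeros before t=2)
```

Stacking layers with dilations 1, 2, 4, ... doubles the receptive field at each layer, which is what lets such networks see long histories of price movements cheaply.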
APA, Harvard, Vancouver, ISO, and other styles
41

Hartmann, Claudio [Verfasser], Wolfgang [Gutachter] Lehner, and Stephan [Gutachter] Günnemann. "Forecasting Large-scale Time Series Data / Claudio Hartmann ; Gutachter: Wolfgang Lehner, Stephan Günnemann." Dresden : Technische Universität Dresden, 2018. http://d-nb.info/1227315449/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Bäärnhielm, Arvid. "Multiple time-series forecasting on mobile network data using an RNN-RBM model." Thesis, Uppsala universitet, Datalogi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-315782.

Full text
Abstract:
The purpose of this project is to evaluate the performance of a forecasting model based on a multivariate dataset consisting of time series of traffic characteristic performance data from a mobile network. The forecasting is made using machine learning with a deep neural network. The first part of the project involves the adaptation of the model design to fit the dataset and is followed by a number of simulations where the aim is to tune the parameters of the model to give the best performance. The simulations show that with well-tuned parameters, the neural network performs better than the baseline model, even when using only a univariate dataset. If a multivariate dataset is used, the neural network outperforms the baseline model even when the dataset is small.
APA, Harvard, Vancouver, ISO, and other styles
43

Elsegai, Heba. "Network inference and data-based modelling with applications to stock market time series." Thesis, University of Aberdeen, 2015. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?pid=228017.

Full text
Abstract:
The inference of causal relationships between stock markets constitutes a major research topic in the field of financial time series analysis. A successful reconstruction of the underlying causality structure represents an important step towards the overall aim of improving stock market price forecasting. In this thesis, I utilise the concept of Granger-causality for the identification of causal relationships. One major challenge is the possible presence of latent variables that affect the measured components. An instantaneous interaction can arise in the inferred network of stock market relationships either spuriously, due to the existence of a latent confounder, or truly, as a result of hidden agreements between market players. I investigate the implications of such a scenario, proposing a new method that allows, for the first time, to distinguish between instantaneous interactions caused by a latent confounder and those resulting from hidden agreements. Another challenge is the implicit assumption of existing Granger-causality analysis techniques that the interactions have a time delay equal to, or a multiple of, the sampling interval of the observed data. Two sub-cases of this scenario are discussed: (i) when the collected data is simultaneously recorded, and (ii) when the collected data is non-simultaneously recorded. I propose two modified approaches based on time series shifting that provide correct inferences of the complete causal interaction structure. To investigate whether the above method improvements translate into better predictions, I present a modified version of the building-block model for modelling stock prices that allows the causality structure between stock prices to be modelled. To assess the forecasting ability of the extended model, I compare predictions resulting from the network reconstruction methods developed throughout this thesis to predictions made based on standard correlation analysis using stock market data.
The findings show that predictions based on the developed methods provide more accurate forecasts than predictions resulting from correlation analysis.
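The core Granger-causality question, does the past of one series improve the prediction of another, can be sketched by comparing restricted and unrestricted autoregressions. This is a bare-bones illustration (a full test would add an F-statistic and significance level; the simulated series and coefficients are invented):

```python
import numpy as np

def granger_improvement(x, y, p=2):
    """Ratio of residual sums of squares for predicting y with its own
    lags only versus with lagged x added (the Granger idea). Values
    well above 1 suggest the past of x helps predict y."""
    T = len(y)
    Y = y[p:]
    own = np.column_stack([y[p - i:T - i] for i in range(1, p + 1)])
    both = np.column_stack([own] + [x[p - i:T - i] for i in range(1, p + 1)])

    def rss(X):
        X1 = np.column_stack([np.ones(len(X)), X])
        beta, *_ = np.linalg.lstsq(X1, Y, rcond=None)
        r = Y - X1 @ beta
        return float(r @ r)

    return rss(own) / rss(both)

# x drives y with a one-step delay; x itself is exogenous noise.
rng = np.random.default_rng(7)
x = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

g_xy = granger_improvement(x, y)
g_yx = granger_improvement(y, x)
print(g_xy)  # well above 1: x Granger-causes y
print(g_yx)  # close to 1: y does not help predict x
```

The thesis's shifting-based corrections address what happens when the true delay is not a multiple of the sampling interval, which this fixed-lag sketch assumes away.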
APA, Harvard, Vancouver, ISO, and other styles
44

Taherifard, Ershad. "Load and Demand Forecasting in Iraqi Kurdistan using Time series modelling." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-260260.

Full text
Abstract:
This thesis examines time series forecasting. More specifically, it predicts the load and power demand in Sulaymaniyah, Iraqi Kurdistan, which today experiences frequent power shortages. The study applies a commonly used time series model, the autoregressive integrated moving average model, which is compared to the naïve method. Several key model properties are inspected to evaluate model accuracy. The model is then used to forecast the load and the demand on a daily, weekly and monthly basis. The forecasts are evaluated by examining the residual metrics. Furthermore, the quantitative results and the answers collected from interviews are used as a basis for investigating the conditions of capacity planning, in order to determine a suitable strategy to minimise the unserved power demand. The findings indicate an unsustainable overconsumption of power in the region due to low tariffs and subsidised energy. A suggested solution is to manage power demand by implementing better strategies, such as increasing tariffs, and to use demand forecasts to supply power accordingly. The monthly supply forecast in this study outperforms the baseline method, but the monthly demand forecast does not. On a weekly basis, both the load and the demand models underperform. The daily forecasts perform as well as or worse than the baseline. Overall, the supply predictions are more precise than the demand predictions. However, there is room for improvement: for instance, better model selection and data preparation could result in more accurate forecasts.
This study examines the prediction of time series. It looks at the load and power demand in Sulaymaniyah in Iraq, which today suffers regular power shortages. The report applies an established time series model, the autoregressive integrated moving average model, which is then compared with the naïve method. Some characteristic model properties are examined to evaluate the model's accuracy. The fitted model is then used to predict the load and power demand on a daily, monthly and yearly basis. The forecasts are evaluated by examining their residuals. Further, the qualitative answers from the interviews are used as a basis for examining the conditions for capacity planning and the strategy best suited to meeting the power shortage. The study shows that there is an unsustainable overconsumption of energy in the region as a consequence of low electricity tariffs and subsidised energy. A proposed solution is to manage demand by implementing strategies such as raising tariffs, and also to try to match production with demand with the help of forecasts. The monthly production forecasts in the study beat the naïve method, but the demand forecasts do not. On a weekly basis both models underperform. The daily forecasts perform as well as or worse than the naïve method. On the whole, the models predict supply better than power demand. But there is room for improvement: better results could probably be achieved through better preprocessing of the data and more carefully chosen time series models.
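Benchmarking against the naïve method, as this thesis does, can be sketched on a synthetic seasonal load series (the series, the monthly seasonality, and the two baselines below are invented for illustration; the thesis's actual comparator is ARIMA):

```python
import numpy as np

def naive(history):
    """Naive method: the last observed value."""
    return history[-1]

def seasonal_naive(history, m=12):
    """Seasonal naive: the value from the same month one year ago."""
    return history[-m]

def one_step_mae(series, fn, start=24):
    """One-step-ahead mean absolute error over the series."""
    errs = [abs(fn(series[:t]) - series[t]) for t in range(start, len(series))]
    return float(np.mean(errs))

rng = np.random.default_rng(5)
months = np.arange(240)  # 20 years of monthly load
load = 100 + 20 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 2, 240)

print(one_step_mae(load, naive))
print(one_step_mae(load, seasonal_naive))
```

On strongly seasonal data the seasonal naive baseline is much harder to beat than the plain naive one; a candidate model such as ARIMA only earns its keep if it improves on the appropriate baseline.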
APA, Harvard, Vancouver, ISO, and other styles
45

Stockhammar, Pär. "Some Contributions to Filtering, Modeling and Forecasting of Heteroscedastic Time Series." Doctoral thesis, Stockholms universitet, Statistiska institutionen, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-38627.

Full text
Abstract:
Heteroscedasticity (or time-dependent volatility) in economic and financial time series has been recognized for decades. Still, heteroscedasticity is surprisingly often neglected by practitioners and researchers. This may lead to inefficient procedures. Much of the work in this thesis is about finding more effective ways to deal with heteroscedasticity in economic and financial data. Paper I suggests a filter that, unlike the Box-Cox transformation, does not assume that the heteroscedasticity is a power of the expected level of the series. This is achieved by dividing the time series by a moving average of its standard deviations smoothed by a Hodrick-Prescott filter. It is shown that the filter does not colour white noise. An appropriate removal of heteroscedasticity allows more effective analyses of heteroscedastic time series. A few examples are presented in Papers II, III and IV of this thesis. Removing the heteroscedasticity using the proposed filter enables efficient estimation of the underlying probability distribution of economic growth. It is shown that the mixed Normal - Asymmetric Laplace (NAL) distributional fit is superior to the alternatives. This distribution represents a Schumpeterian model of growth, the driving mechanism of which is Poisson-distributed innovations (Aghion and Howitt, 1992). This distribution is flexible and has not been used before in this context. Another way of circumventing strong heteroscedasticity in the Dow Jones stock index is to divide the data into volatility groups using the procedure described in Paper III. For each such group, the most accurate probability distribution is searched for and is used in density forecasting. Interestingly, the NAL distribution fits best also here. This could hint at a new analogy between the financial sphere and the real economy, further investigated in Paper IV.
These series are typically heteroscedastic, making standard detrending procedures, such as Hodrick-Prescott or Baxter-King, inadequate. Prior to this comovement study, the univariate and bivariate frequency domain results from these filters are compared to the filter proposed in Paper I. The effect of often neglected heteroscedasticity may thus be studied.
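Paper I's filter divides the series by a smoothed moving estimate of its local standard deviation. A rough sketch of that idea, with the Hodrick-Prescott smoothing step replaced by a plain centered moving window for brevity (window length and series are illustrative, not the paper's actual settings):

```python
def rolling_std(series, window):
    # Local standard deviation over a centered moving window.
    half = window // 2
    out = []
    for i in range(len(series)):
        seg = series[max(0, i - half): i + half + 1]
        m = sum(seg) / len(seg)
        out.append((sum((v - m) ** 2 for v in seg) / len(seg)) ** 0.5)
    return out

def variance_stabilize(series, window=5):
    # Divide the series by its local standard deviation, so the
    # filtered series has roughly constant volatility over time.
    stds = rolling_std(series, window)
    return [v / sd if sd > 0 else 0.0 for v, sd in zip(series, stds)]

# Noise amplitude jumps by a factor of ten halfway through.
series = [(-1) ** i * (0.1 if i < 20 else 1.0) for i in range(40)]
filtered = variance_stabilize(series, window=5)
```

After filtering, values from the low- and high-volatility regimes sit on the same scale, which is what allows the distributional fits in Papers II-IV to proceed.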
APA, Harvard, Vancouver, ISO, and other styles
46

Alklid, Jonathan. "Time to Strike: Intelligent Detection of Receptive Clients : Predicting a Contractual Expiration using Time Series Forecasting." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-106217.

Full text
Abstract:
In recent years, with the advances in Machine Learning and Artificial Intelligence, the demand for ever smarter automation solutions could seem insatiable. One such demand was identified by Fortnox AB, but is undoubtedly shared by many other industries dealing with contractual services, who were looking for an intelligent solution capable of predicting the expiration date of a contractual period. As there was no clear evidence suggesting that Machine Learning models were capable of learning the patterns necessary to predict a contract's expiration, it was deemed desirable to determine the feasibility of the approach while also investigating whether it would perform better than a commonplace rule-based solution, something that Fortnox had already investigated in the past. To do this, two different solutions capable of predicting a contractual expiration were implemented. The first was a rule-based solution that was used as a baseline, and the second was a Machine Learning-based solution that featured a Decision Tree classifier as well as Neural Network models. The results suggest that Machine Learning models are indeed capable of learning and recognizing patterns relevant to the problem, with an average accuracy generally on the high end. Unfortunately, due to a lack of data available for testing and training, the results were too inconclusive to make a reliable assessment of overall accuracy beyond the learning capability. The conclusion of the study is that Machine Learning-based solutions show promising results, with the caveat that the results should likely be seen as indicative of overall viability rather than representative of actual performance.
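The prediction task described above can be framed as sliding-window classification over a client's activity history, with a threshold rule serving as the rule-based baseline. A toy sketch of that framing; the feature layout, threshold, and data below are invented for illustration and are not Fortnox's actual schema:

```python
def make_samples(activity, expired, window=3):
    # Slide a fixed window over monthly activity counts; label each
    # window 1 if the contract expired in the following month.
    X, y = [], []
    for i in range(len(activity) - window):
        X.append(activity[i:i + window])
        y.append(expired[i + window])
    return X, y

def rule_baseline(features, threshold=1):
    # Rule-based stand-in: predict expiration when the most recent
    # month's activity drops below a threshold.
    return [1 if f[-1] < threshold else 0 for f in features]

activity = [5, 4, 3, 0, 0, 0, 4, 5]   # e.g. invoices per month (made up)
expired  = [0, 0, 0, 0, 1, 0, 0, 0]   # 1 = contract expired that month
X, y = make_samples(activity, expired, window=3)
preds = rule_baseline(X)
```

The same (X, y) pairs could be fed to a Decision Tree or Neural Network classifier; the rule's mistakes on quiet-but-active clients illustrate why a learned model might do better.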
APA, Harvard, Vancouver, ISO, and other styles
47

Larsson, Klara, and Freja Ling. "Time Series forecasting of the S&P Global Clean Energy Index using a Multivariate LSTM." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301904.

Full text
Abstract:
Clean energy and machine learning are subjects that play significant roles in shaping our future. The current climate crisis has forced the world to take action towards more sustainable solutions. Arrangements such as the UN’s Sustainable Development Goals and the Paris Agreement are causing an increased interest in renewable energy solutions. Further, the EU Taxonomy Regulation, applied in 2020, aims to scale up sustainable investments and to direct cash flows toward sustainable projects and activities. These measures create interest in investing in renewable energy alternatives and in predicting future movements of stocks related to these businesses. Machine learning models have previously been used to predict time series with promising results. However, predicting time series in the form of stock price indices has, throughout previous attempts, proved to be a difficult task due to the complexity of the variables that play a role in the indices’ movements. This paper uses the machine learning algorithm long short-term memory (LSTM) to predict the S&P Global Clean Energy Index. The research question revolves around how well the LSTM model performs on this specific index and how the result is affected when past returns from correlating variables are added to the model. The researched variables are the crude oil price, the gold price, and the interest rate. A model for each correlating variable was created, as well as one with all three, and one standard model which used only historical data from the index. The study found that while the model with the most strongly correlated variable performed best among the multivariate models, the standard model using only the target variable gave the most accurate result of any of the LSTM models.
The ongoing climate crisis has forced more and more countries to take action, and the UN's Sustainable Development Goals and the Paris Agreement are increasing interest in renewable energy. Furthermore, on 21 April 2021 the European Commission launched a comprehensive package of measures aimed at increasing investments in sustainable activities. This in turn creates greater interest in investments in renewable energy and in methods for predicting the stock prices of these companies. Machine learning models have previously been used for time series analysis with good results, but predicting stock indices has proved difficult, largely because of the complexity of the task and the number of variables that affect the market. This paper uses the machine learning model long short-term memory (LSTM) to predict the S&P Global Clean Energy Index. The aim is to find out how accurately an LSTM model can predict this index, and how the result is affected when the model is given additional variables that correlate with the index. The variables examined are the price of crude oil, the price of gold, and the interest rate. A model was created for each variable, as well as one model with all three variables and one with only historical data from the index. The results show that the model with the variable that correlates most strongly with the index performed best among the multivariate models, but the model using only historical data from the index gave the most accurate result.
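Before any of the LSTM variants described above can be trained, the index returns and the exogenous series must be reshaped into the (samples, timesteps, features) layout an LSTM layer consumes. A minimal sketch of that windowing step; the column order, lookback length, and data are illustrative assumptions:

```python
def make_windows(rows, lookback):
    # rows: one list per trading day, e.g.
    # [index_return, oil_return, gold_return, interest_rate].
    # X[i] holds `lookback` consecutive rows; y[i] is the next
    # index return (assumed to be column 0).
    X, y = [], []
    for i in range(len(rows) - lookback):
        X.append(rows[i:i + lookback])
        y.append(rows[i + lookback][0])
    return X, y

# Synthetic four-feature rows standing in for the real market data.
rows = [[float(t), t * 0.1, t * 0.2, 0.01] for t in range(12)]
X, y = make_windows(rows, lookback=5)
```

Each X[i] is one training sample of shape (lookback, 4); dropping columns from the rows yields the single-variable models the study compares against the standard one.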
APA, Harvard, Vancouver, ISO, and other styles
48

Hellman, Simon. "Forecasting conflict using RNNs." Thesis, Uppsala universitet, Signaler och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-445859.

Full text
Abstract:
The rise in machine learning has made the subject interesting for new types of uses. This Master thesis implements and evaluates an LSTM-based algorithm on the conflict forecasting problem. Data is structured in country-month pairs, with information about conflict, economy, demography, democracy and unrest. The goal is to forecast the probability of at least one conflict event in a country based on a window of historic information. Results show that the model is not as good as a Random Forest. There are also indications of a lack of data, with the network having difficulty performing consistently and with learning curves that do not flatten. Naive models perform surprisingly well. The conclusion is that the problem needs some restructuring in order to improve performance compared to naive approaches. To help this endeavour, possible paths for future work have been identified.
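The abstract notes that naive models perform surprisingly well on this probability-forecasting task. One such baseline, persistence of last month's conflict indicator, and a standard scoring rule for probability forecasts can be sketched as follows (the smoothing constant and the country-month series are made up for illustration):

```python
def persistence_prob(history, eps=0.05):
    # Naive baseline: next month's conflict probability is last month's
    # 0/1 indicator, nudged away from certainty by eps.
    return 1.0 - eps if history[-1] == 1 else eps

def brier(probs, outcomes):
    # Brier score: mean squared error between forecast probabilities
    # and realized 0/1 outcomes (lower is better).
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

conflict = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]  # one country, by month
probs = [persistence_prob(conflict[:t]) for t in range(1, len(conflict))]
print("Brier score:", brier(probs, conflict[1:]))
```

An LSTM or Random Forest earns its keep only if its Brier score (or similar metric) beats this one-line baseline on held-out country-months.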
APA, Harvard, Vancouver, ISO, and other styles
49

Nigrini, L. B., and G. D. Jordaan. "Short term load forecasting using neural networks." Journal for New Generation Sciences, Vol 11, Issue 3: Central University of Technology, Free State, Bloemfontein, 2013. http://hdl.handle.net/11462/646.

Full text
Abstract:
Published Article
Several forecasting models are available for research in predicting the shape of electric load curves. The development of Artificial Intelligence (AI), especially Artificial Neural Networks (ANN), can be applied to model short-term load forecasting. Because of their input-output mapping ability, ANNs are well suited for load forecasting applications. ANNs have been used extensively as time series predictors; these can include feed-forward networks that make use of a sliding window over the input data sequence. Using a combination of a time series and a neural network prediction method, the past events of the load data can be explored and used to train a neural network to predict the next load point. In this study, an investigation into the use of ANNs for short-term load forecasting for Bloemfontein, Free State, has been conducted with the MATLAB Neural Network Toolbox, where ANN capabilities in load forecasting, with the use of only load history as input values, are demonstrated.
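The sliding-window scheme described above can be reduced to its simplest form: a single linear neuron over a window of past loads, trained by per-sample gradient descent. This is a pure-Python stand-in for the MATLAB toolbox networks; the synthetic load curve, window length, learning rate, and epoch count are all illustrative:

```python
import math

def train_window_model(series, window=3, lr=0.01, epochs=1000):
    # One linear neuron over a sliding window of past loads -- the
    # no-hidden-layer degenerate case of a feed-forward network,
    # trained by stochastic gradient descent on squared error.
    w, b = [0.0] * window, 0.0
    samples = [(series[i:i + window], series[i + window])
               for i in range(len(series) - window)]
    for _ in range(epochs):
        for x, target in samples:
            err = b + sum(wi * xi for wi, xi in zip(w, x)) - target
            b -= lr * err
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w, b

def predict(w, b, recent):
    # Forecast the next load point from the latest window.
    return b + sum(wi * xi for wi, xi in zip(w, recent))

# Synthetic periodic load curve standing in for real load history.
load = [0.5 + 0.4 * math.sin(2 * math.pi * t / 8) for t in range(80)]
w, b = train_window_model(load)
```

Because a sinusoid satisfies a linear recurrence, even this stripped-down model learns the curve; real load data is what motivates the nonlinear hidden layers the paper uses.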
APA, Harvard, Vancouver, ISO, and other styles
50

Vander, Elst Harry-Paul. "Measuring, Modeling, and Forecasting Volatility and Correlations from High-Frequency Data." Doctoral thesis, Universite Libre de Bruxelles, 2016. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/228960.

Full text
Abstract:
This dissertation contains four essays that all share a common purpose: developing new methodologies to exploit the potential of high-frequency data for the measurement, modeling and forecasting of financial assets volatility and correlations. The first two chapters provide useful tools for univariate applications while the last two chapters develop multivariate methodologies. In chapter 1, we introduce a new class of univariate volatility models named FloGARCH models. FloGARCH models provide a parsimonious joint model for low frequency returns and realized measures, and are sufficiently flexible to capture long memory as well as asymmetries related to leverage effects. We analyze the performances of the models in a realistic numerical study and on the basis of a data set composed of 65 equities. Using more than 10 years of high-frequency transactions, we document significant statistical gains related to the FloGARCH models in terms of in-sample fit, out-of-sample fit and forecasting accuracy compared to classical and Realized GARCH models. In chapter 2, using 12 years of high-frequency transactions for 55 U.S. stocks, we argue that combining low-frequency exogenous economic indicators with high-frequency financial data improves the ability of conditionally heteroskedastic models to forecast the volatility of returns, their full multi-step ahead conditional distribution and the multi-period Value-at-Risk. Using a refined version of the Realized LGARCH model allowing for a time-varying intercept and implemented with realized kernels, we document that nominal corporate profits and term spreads have strong long-run predictive ability and generate accurate risk-measure forecasts over long horizons. The results are based on several loss functions and tests, including the Model Confidence Set. Chapter 3 is a joint work with David Veredas.
We study the class of disentangled realized estimators for the integrated covariance matrix of Brownian semimartingales with finite activity jumps. These estimators separate correlations and volatilities. We analyze different combinations of quantile- and median-based realized volatilities, and four estimators of realized correlations with three synchronization schemes. Their finite sample properties are studied under four data generating processes, in presence, or not, of microstructure noise, and under synchronous and asynchronous trading. The main finding is that the pre-averaged version of disentangled estimators based on Gaussian ranks (for the correlations) and median deviations (for the volatilities) provide a precise, computationally efficient, and easy alternative to measure integrated covariances on the basis of noisy and asynchronous prices. Along these lines, a minimum variance portfolio application shows the superiority of this disentangled realized estimator in terms of numerous performance metrics. Chapter 4 is co-authored with Niels S. Hansen, Asger Lunde and Kasper V. Olesen, all affiliated with CREATES at Aarhus University. We propose to use the Realized Beta GARCH model to exploit the potential of high-frequency data in commodity markets. The model produces high quality forecasts of pairwise correlations between commodities which can be used to construct a composite covariance matrix. We evaluate the quality of this matrix in a portfolio context and compare it to models used in the industry. We demonstrate significant economic gains in a realistic setting including short selling constraints and transaction costs.
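The realized measures these chapters build on start from a simple quantity: summed squared (and cross-multiplied) intraday returns. A bare-bones sketch, assuming already-synchronized price grids and ignoring the microstructure-noise corrections (such as pre-averaging) that the dissertation's estimators add:

```python
import math

def log_returns(prices):
    # Intraday log-returns between consecutive observations.
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def realized_variance(prices):
    # Sum of squared intraday log-returns over one day.
    return sum(r * r for r in log_returns(prices))

def realized_covariance(prices_a, prices_b):
    # Sum of cross-products of synchronized intraday returns.
    return sum(x * y for x, y in zip(log_returns(prices_a),
                                     log_returns(prices_b)))

def realized_correlation(prices_a, prices_b):
    # Correlations and volatilities enter separately, echoing the
    # "disentangled" view taken in chapter 3.
    rv_a = realized_variance(prices_a)
    rv_b = realized_variance(prices_b)
    return realized_covariance(prices_a, prices_b) / math.sqrt(rv_a * rv_b)
```

With noisy, asynchronous prices these plain estimators break down, which is precisely what motivates the quantile- and median-based variants and synchronization schemes studied in the chapter.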
Doctorate in Economic and Management Sciences
APA, Harvard, Vancouver, ISO, and other styles
