Theses on the topic "Time series aggregation"

Follow this link to see other types of publications on the topic: Time series aggregation.

Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles

Choose the source type:

See the top 24 dissertations (master's or doctoral theses) for your research on the topic "Time series aggregation".

Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication as a .pdf file and read its abstract online, when one is available in the metadata.

Browse theses from many scientific areas and compile an accurate bibliography.

1

Tripodis, Georgios. "Heterogeneity and aggregation in seasonal time series". Thesis, London School of Economics and Political Science (University of London), 2007. http://etheses.lse.ac.uk/2933/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Seasonality is an important part of many real time series. While issues of seasonal heteroscedasticity and aggregation have been a cause of concern for data users, there has not been a great deal of theoretical research in this area. This thesis concentrates on these two issues. We consider seasonal time series with single-season heteroscedasticity. We show that when only one month has different variability from the others, there are constraints on the seasonal models that can be used, and that neither the dummy nor the trigonometric model is effective in modelling seasonal series with this type of variability. We suggest two models that permit single-season heteroscedasticity as a special case. We show that seasonal heteroscedasticity gives rise to a periodic autocorrelation function. We propose a new class, called periodic structural time series models (PSTSM), to deal with such periodicities. We show that PSTSM have a correlation structure equivalent to that of a periodic integrated moving average (PIMA) process. In a comparison of forecast performance for a set of quarterly macroeconomic series, PSTSM outperform periodic autoregressive (PAR) models both within and out of sample. We also consider the problem of contemporaneous aggregation of time series using the structural time series framework. We consider the conditions of identifiability for the aggregate series, and show that identifiability of the models for the component series is not sufficient for identifiability of the model for the aggregate series. We also consider the case where there is no estimation error as well as the case of modelling an unknown process. For the case of the unknown process, we provide recursions based on the Kalman filter that give the asymptotic variance of the estimated parameters.
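To make single-season heteroscedasticity concrete, the following minimal simulation sketch (an illustration of the concept, not code from the thesis; the month index and the variance ratio are arbitrary choices) shows how one month with a different shock scale produces a season-dependent variance profile:

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, s = 200, 12
sigma = np.ones(s)
sigma[11] = 3.0                    # a single season with different variability

# Month-specific shock scale plus a deterministic seasonal pattern
eps = rng.normal(0.0, sigma, size=(n_years, s))
pattern = np.sin(2 * np.pi * np.arange(s) / s)
y = pattern + eps                  # rows = years, columns = months

# A homoscedastic seasonal model implies a flat variance profile across months;
# single-season heteroscedasticity shows up as a spike at one month instead.
print(np.round(y.var(axis=0), 2))
```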
2

Lin, Shu-Chin. "Aggregation and time series implications of state-dependent consumption /". free to MU campus, to others for purchase, 1996. http://wwwlib.umi.com/cr/mo/fullcit?p9737881.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Sariaslan, Nazli. "The Effect Of Temporal Aggregation On Univariate Time Series Analysis". Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612528/index.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Most time series are constructed by some kind of aggregation; temporal aggregation can be defined as aggregation over consecutive time periods. Temporal aggregation plays an important role in time series analysis, since the choice of time unit clearly influences the type of model and the forecast results. A totally different time series model may be fitted to the same variable over different time periods. In this thesis, the effect of temporal aggregation on univariate time series models is studied through the modelling and forecasting procedure, via a simulation study and an application based on a southern oscillation data set. The simulation study shows how the model, the mean square forecast error and the estimated parameters change when temporally aggregated data are used, for different orders of aggregation and sample sizes. Furthermore, the effect of temporal aggregation is also demonstrated on the southern oscillation data set for different orders of aggregation. It is observed that the effect of temporal aggregation should be taken into account in data analysis, since temporal aggregation can give rise to misleading results and inferences.
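As a concrete illustration of the definition above (a sketch under my own assumptions, not the thesis code): temporal aggregation replaces a series by non-overlapping sums over m consecutive periods, and the fitted lag-1 coefficient of an AR(1) changes with the order of aggregation m:

```python
import numpy as np

rng = np.random.default_rng(1)
phi, n = 0.8, 12000
x = np.zeros(n)
for t in range(1, n):              # simulate an AR(1) flow variable
    x[t] = phi * x[t - 1] + rng.normal()

def aggregate(series, m):
    """Sum over non-overlapping blocks of m consecutive observations."""
    k = len(series) // m
    return series[:k * m].reshape(k, m).sum(axis=1)

def lag1_coef(series):
    """Least-squares estimate of the lag-1 autoregressive coefficient."""
    y, ylag = series[1:], series[:-1]
    return float(ylag @ y / (ylag @ ylag))

for m in (1, 3, 12):               # e.g. monthly, quarterly, annual totals
    print(m, round(lag1_coef(aggregate(x, m)), 3))
```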
4

Gehman, Andrew J. "The Effects of Spatial Aggregation on Spatial Time Series Modeling and Forecasting". Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/382669.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Spatio-temporal data analysis involves modeling a variable observed at different locations over time. A key component of space-time modeling is determining the spatial scale of the data. This dissertation addresses the following three questions: 1) How does spatial aggregation impact the properties of the variable and its model? 2) What spatial scale of the data produces more accurate forecasts of the aggregate variable? 3) What properties lead to the smallest information loss due to spatial aggregation? Answers to these questions involve a thorough examination of two common space-time models: the STARMA and GSTARMA models. These results are helpful to researchers seeking to understand the impact of spatial aggregation on temporal and spatial correlation as well as to modelers interested in determining a spatial scale for the data. Two data examples are included to illustrate the findings, and they concern states' annual labor force totals and monthly burglary counts for police districts in the city of Philadelphia.
5

Kim, Hang. "TIME SERIES BLOCK BOOTSTRAP APPLICATION AND EFFECT OF AGGREGATION AND SYSTEMATIC SAMPLING". Diss., Temple University Libraries, 2018. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/490644.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In this dissertation, we review the basic properties of the bootstrap and its application to time series. We then apply the parametric bootstrap to three simulated normal i.i.d. samples and the nonparametric bootstrap to four real-life financial return series. Among the time series bootstrap methods, we look into the specific method called the block bootstrap and investigate the block length consideration, in order to properly select a suitable block size for the AR(1) model. We propose a new blocking rule named the Combinatorially-Augmented Block Bootstrap (CABB). We compare the existing block bootstrap and the CABB method using the simulated i.i.d. samples, AR(1) time series, and the real-life examples. Both methods perform equally well in estimating AR(1) coefficients; CABB produces a smaller standard deviation in our simulated and empirical studies. We study two procedures for collecting time series: (i) aggregation of a flow variable and (ii) systematic sampling of a stock variable. For these two procedures, we derive theorems that give exact equations for the m-aggregated and mth systematically sampled series of the original AR(1) model. We evaluate the performance of block bootstrap estimation of the parameters of the ARMA(1,1) and AR(1) models using aggregated and systematically sampled series. Simulation and real data analyses show that in some cases the block bootstrap estimate of the MA(1) parameter of the ARMA(1,1) model in aggregated series performs better than the estimate obtained without the bootstrap. In an extreme case of stock price movement that is close to a random walk, the block bootstrap estimate using the systematically sampled series is closer to the true parameter, defined as the parameter calculated by the theorem. Specifically, the block bootstrap estimate of the AR(1) parameter using the systematically sampled series is closer to phi(n) than the estimate based on the MLE for the AR(1) model. Future research problems include a theoretical investigation of CABB and the effectiveness of the block bootstrap in other time series settings, such as nonlinear or VAR models.
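For readers unfamiliar with the block bootstrap, here is a minimal moving-block sketch for the AR(1) coefficient (a generic illustration, not the CABB rule proposed in the dissertation; the block length L = 25 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
phi, n, L, B = 0.6, 500, 25, 999
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

def ar1_hat(s):
    """Least-squares estimate of the AR(1) coefficient."""
    return float(s[:-1] @ s[1:] / (s[:-1] @ s[:-1]))

# All overlapping blocks of length L; resampling whole blocks preserves
# the short-range dependence that an i.i.d. bootstrap would destroy.
blocks = np.array([x[i:i + L] for i in range(n - L + 1)])
est = []
for _ in range(B):
    idx = rng.integers(0, len(blocks), size=n // L)
    est.append(ar1_hat(np.concatenate(blocks[idx])))

print(round(ar1_hat(x), 3), round(float(np.std(est)), 3))  # estimate, bootstrap s.e.
```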
6

APRAEZ, CESAR DAVID REVELO. "A HYBRID NEURO- EVOLUTIONARY APPROACH FOR DYNAMIC WEIGHTED AGGREGATION OF TIME SERIES FORECASTERS". PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2016. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=36950@1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Empirical studies on time series indicate that the combination of forecasting models, generated from different modeling techniques, leads to consensus forecasts that are more accurate than the forecasts of the individual models involved in the combination scheme. In this work, we present a methodology for the convex combination of statistical forecasting models, whose success depends on how the combination weights of each model are estimated. A Multilayer Perceptron (MLP) Artificial Neural Network is used to dynamically generate weighting vectors over the forecast horizon, which depend on the individual contribution of each forecaster observed in the historical data of the series. The MLP network parameters are adjusted via a hybrid training algorithm that integrates global search techniques, based on evolutionary computation, with the local search algorithm backpropagation, in order to optimize both the weights and the network architecture simultaneously. This approach aims to automatically generate a high-performance dynamic weighted forecast aggregation model. The proposed model, called Neural Expert Weighting - Genetic Algorithm (NEW-GA), was compared with other forecaster combination models, as well as with the individual models involved in the combination scheme, on 15 time series divided into two case studies: petroleum products and the reduced set of the NN3 forecasting competition, a competition between forecasting methodologies with greater emphasis on models based on neural networks. The results demonstrated the potential of NEW-GA to provide accurate time series forecasting models.
7

Weiss, Christoph. "Essays in hierarchical time series forecasting and forecast combination". Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/274757.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This dissertation comprises three original contributions to empirical forecasting research. Chapter 1 introduces the dissertation. Chapter 2 contributes to the literature on hierarchical time series (HTS) modelling by proposing a disaggregated forecasting system for both the inflation rate and its volatility. Using monthly data that underlie the Retail Prices Index for the UK, we analyse the dynamics of the inflation process. We examine patterns in the time-varying covariation among product-level inflation rates that aggregate up to industry-level inflation rates, which in turn aggregate up to the overall inflation rate. The aggregate inflation volatility closely tracks the time path of this covariation, which is seen to be driven primarily by the variances of common shocks shared by all products and by the covariances between idiosyncratic product-level shocks. We formulate a forecasting system that comprises models for the mean inflation rate and its variance, and exploit the index structure of the aggregate inflation rate using the HTS framework. Using a dynamic model selection approach to forecasting, we obtain forecasts that are between 9 and 155% more accurate than a SARIMA-GARCH(1,1) for the aggregate inflation volatility. Chapter 3 is on improving forecasts using forecast combinations. The paper documents the software implementation of the open-source R package for forecast combination that we coded and published on the official R package repository, CRAN. The GeomComb package is the only R package that covers a wide range of different popular forecast combination methods. We implement techniques from three broad categories: (a) simple non-parametric methods, (b) regression-based methods, and (c) geometric (eigenvector) methods, allowing for static or dynamic estimation of each approach. Using S3 classes/methods in R, the package provides a user-friendly environment for applied forecasting, implementing solutions for typical issues related to forecast combination (multicollinearity, missing values, etc.), criterion-based optimisation for several parametric methods, and post-fit functions to rationalise and visualise estimation results. The package has been listed in the official R Task Views for Time Series Analysis and for Official Statistics. The brief empirical application in the paper illustrates the package's functionality by estimating forecast combination techniques for monthly UK electricity supply. Chapter 4 introduces HTS forecasting and forecast combination to a healthcare staffing context. A slowdown of healthcare budget growth in the UK that does not keep pace with the growth of demand for hospital services has made efficient cost planning increasingly crucial for hospitals, in particular for staff, which accounts for more than half of hospitals' expenses. This is facilitated by accurate forecasts of patient census and churn. Using a dataset of more than 3 million observations from a large UK hospital, we show how HTS forecasting can improve forecast accuracy by using information at different levels of the hospital hierarchy (aggregate, emergency/electives, divisions, specialties), compared to the naïve benchmark: the seasonal random walk model applied to the aggregate. We show that forecast combination can improve accuracy even more in some cases, and leads to lower forecast error variance (decreasing forecasting risk).
We propose a comprehensive parametric approach that uses these forecasts in a nurse staffing model aimed at minimising cost while ensuring that care requirements (e.g. nurse-hours-per-patient-day thresholds) are met.
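GeomComb itself is an R package; as a language-neutral sketch of two of the combination categories named above, the following Python fragment contrasts (a) a simple average with (b) regression-based (OLS) weights fitted on a training window (all data here are synthetic, and the split point is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200
y = np.sin(np.linspace(0, 12, T)) + 0.3 * rng.normal(size=T)  # target series
F = np.column_stack([y + 0.5 * rng.normal(size=T),            # expert 1
                     0.8 * y + 0.4 * rng.normal(size=T),      # expert 2
                     np.roll(y, 1)])                          # expert 3 (lagged target)

train, test = slice(0, 150), slice(150, T)
simple = F[test].mean(axis=1)                                 # (a) simple average

X = np.column_stack([np.ones(150), F[train]])                 # (b) OLS combination weights
w, *_ = np.linalg.lstsq(X, y[train], rcond=None)
ols = np.column_stack([np.ones(T - 150), F[test]]) @ w

for name, f in (("simple", simple), ("ols", ols)):
    print(name, round(float(np.mean((y[test] - f) ** 2)), 4))  # out-of-sample MSE
```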
8

Doell, Christoph [Verfasser]. "Methods for Multivariate Time-Series Classification on Brain Data : Aggregation, Stratification and Neural Network Models / Christoph Doell". Konstanz : KOPS Universität Konstanz, 2021. http://d-nb.info/1232176648/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Bahl, Björn [Verfasser], André [Akademischer Betreuer] Bardow and Francois [Akademischer Betreuer] Marechal. "Optimization-based synthesis of large-scale energy systems by time-series aggregation / Björn Bahl ; André Bardow, Francois Marechal". Aachen : Universitätsbibliothek der RWTH Aachen, 2018. http://d-nb.info/1186069260/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Lee, Bu Hyoung. "The use of temporally aggregated data on detecting a structural change of a time series process". Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/375511.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
A time series process can be influenced by an interruptive event that starts at a certain time point, so a structural break in either mean or variance may occur between the periods before and after the event time. However, the traditional statistical tests for two independent samples, such as the t-test for a mean difference and the F-test for a variance difference, cannot be used directly to detect such structural breaks, because two random samples almost certainly cannot exist within a single time series. As alternative methods, the likelihood ratio (LR) test for a mean change and the cumulative sum (CUSUM) of squares test for a variance change have been widely employed in the literature. Another point of interest is temporal aggregation of a time series. Most published time series data are temporally aggregated from original observations at a small time unit to cumulative records at a large time unit. It is known, however, that temporal aggregation has substantial effects on process properties, because it transforms a high-frequency nonaggregate process into a low-frequency aggregate process. In this research, we investigate the effects of temporal aggregation on the LR test and the CUSUM test through the ARIMA model transformation. First, we derive the proper transformation of ARIMA model orders and parameters when a time series is temporally aggregated. The statistic of the LR test for a mean change is associated with model parameters and errors, and these must be changed when an AR(p) process transforms, under mth-order temporal aggregation, into an ARMA(P,Q) process. Using this property, we propose a modified LR test for aggregated time series. Through Monte Carlo simulations and empirical examples, we show that aggregation shifts the null distribution of the modified LR test statistic to the left, so the test power increases as the order of aggregation increases. For the CUSUM test for a variance change, we show that two aggregation terms appear in the test statistic and have negative effects on test results when an ARIMA(p,d,q) process transforms, under mth-order temporal aggregation, into an ARIMA(P,d,Q) process. We then propose a modified CUSUM test that controls these terms, which are interpreted as the aggregation effects. Through Monte Carlo simulations and empirical examples, the modified CUSUM test shows better performance and higher power for detecting a variance change in an aggregated time series than the original CUSUM test.
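The unmodified CUSUM-of-squares statistic that the dissertation starts from can be sketched in a few lines (the standard textbook form, not the modified statistic derived above; break location and variances are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
e = np.concatenate([rng.normal(0, 1, 300),   # variance 1 before the break
                    rng.normal(0, 2, 300)])  # variance 4 after the break

# C_k = cumulative sum of squares up to k over the total sum of squares;
# under no variance change, C_k stays close to the diagonal k/n.
n = len(e)
C = np.cumsum(e ** 2) / np.sum(e ** 2)
D = np.abs(C - np.arange(1, n + 1) / n)
print(int(np.argmax(D)) + 1, round(float(D.max()), 3))  # estimated break point, max gap
```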
11

Sharif, Abbass. "Visual Data Mining Techniques for Functional Actigraphy Data: An Object-Oriented Approach in R". DigitalCommons@USU, 2012. https://digitalcommons.usu.edu/etd/1394.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Actigraphy, a technology for measuring a subject's overall activity level almost continuously over time, has gained a lot of momentum over the last few years. An actigraph, a watch-like device that can be attached to the wrist or ankle of a subject, uses an accelerometer to measure human movement every minute or even every 15 seconds. Actigraphy data are often treated as functional data. In this dissertation, we discuss what has been done regarding the visualization of actigraphy data, and then explain the three main goals we achieved: (i) develop new multivariate visualization techniques for actigraphy data; (ii) integrate the new and current visualization tools into an R package using object-oriented model design; and (iii) develop an adaptive, user-friendly web interface for actigraphy software.
12

Silvestrini, Andrea. "Essays on aggregation and cointegration of econometric models". Doctoral thesis, Université Libre de Bruxelles, 2009. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210304.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This dissertation can be broadly divided into two independent parts. The first three chapters analyse issues related to temporal and contemporaneous aggregation of econometric models. The fourth chapter contains an application of Bayesian techniques to investigate whether the post-transition fiscal policy of Poland is sustainable in the long run and consistent with an intertemporal budget constraint.

Chapter 1 surveys the econometric methodology of temporal aggregation for a wide range of univariate and multivariate time series models.

A unified overview of temporal aggregation techniques for this broad class of processes is presented in the first part of the chapter and the main results are summarized. In each case, assuming that the underlying process at the disaggregate frequency is known, the aim is to find the appropriate model for the aggregated data. Additional topics concerning temporal aggregation of ARIMA-GARCH models (see Drost and Nijman, 1993) are discussed and several examples are presented. Systematic sampling schemes are also reviewed.

Multivariate models, which show interesting features under temporal aggregation (Breitung and Swanson, 2002; Marcellino, 1999; Hafner, 2008), are examined in the second part of the chapter. In particular, the focus is on temporal aggregation of VARMA models and on the related concept of spurious instantaneous causality, which is not a time series property invariant to temporal aggregation. On the other hand, as pointed out by Marcellino (1999), other important time series features, such as cointegration and the presence of unit roots, are invariant to temporal aggregation and are not induced by it.

Some empirical applications based on macroeconomic and financial data illustrate all the techniques surveyed and the main results.

Chapter 2 is an attempt to monitor fiscal variables in the Euro area by building an early warning indicator for assessing the development of public finances in the short run, exploiting the existence of monthly budgetary statistics from France, taken as an "example country".

The application is conducted focusing on the cash State deficit, looking at components from the revenue and expenditure sides. For each component, monthly ARIMA models are estimated and then temporally aggregated to the annual frequency, as the policy makers are interested in yearly predictions.

The short-run forecasting exercises carried out for years 2002, 2003 and 2004 highlight the fact that the one-step-ahead predictions based on the temporally aggregated models generally outperform those delivered by standard monthly ARIMA modeling, as well as the official forecasts made available by the French government, for each of the eleven components and thus for the whole State deficit. More importantly, by the middle of the year, very accurate predictions for the current year are made available.

The proposed method could be extremely useful, providing policy makers with a valuable indicator when assessing the development of public finances in the short run (a one-year horizon or even less).

Chapter 3 deals with the issue of forecasting contemporaneous time series aggregates. The performance of "aggregate" and "disaggregate" predictors in forecasting contemporaneously aggregated vector ARMA (VARMA) processes is compared. An aggregate predictor is built by forecasting directly the aggregate process, as it results from contemporaneous aggregation of the data generating vector process. A disaggregate predictor is a predictor obtained from aggregation of univariate forecasts for the individual components of the data generating vector process.
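In standard notation (my paraphrase, not a quotation from the chapter): if $z_t = F'y_t$ is the contemporaneous aggregate of the vector process $y_t$ with weight vector $F$, the two competing one-step predictors are

\[
\hat z^{(a)}_{t+1} = \mathrm{E}\left[z_{t+1} \mid z_t, z_{t-1}, \dots\right]
\qquad\text{(aggregate predictor)},
\]
\[
\hat z^{(d)}_{t+1} = \sum_i F_i\, \mathrm{E}\left[y_{i,t+1} \mid y_{i,t}, y_{i,t-1}, \dots\right]
\qquad\text{(disaggregate predictor)},
\]

and the chapter compares the mean squared errors of the two.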

The econometric framework is broadly based on Lütkepohl (1987). The necessary and sufficient condition for the equality of mean squared errors associated with the two competing methods in the bivariate VMA(1) case is provided. It is argued that the condition of equality of predictors as stated in Lütkepohl (1987), although necessary and sufficient for the equality of the predictors, is sufficient (but not necessary) for the equality of mean squared errors.

Furthermore, it is shown that the same forecasting accuracy for the two predictors can be achieved using specific assumptions on the parameters of the VMA(1) structure.

Finally, an empirical application that involves the problem of forecasting the Italian monetary aggregate M1 on the basis of annual time series ranging from 1948 until 1998, prior to the creation of the European Economic and Monetary Union (EMU), is presented to show the relevance of the topic. In the empirical application, the framework is further generalized to deal with heteroskedastic and cross-correlated innovations.

Chapter 4 deals with a cointegration analysis applied to the empirical investigation of fiscal sustainability. The focus is on a particular country: Poland. The choice of Poland is not random. First, the motivation stems from the fact that fiscal sustainability is a central topic for most of the economies of Eastern Europe. Second, Poland is one of the first countries to have started the transition process to a market economy (since 1989), providing a relatively favorable institutional setting within which to study fiscal sustainability (see Green, Holmes and Kowalski, 2001). The emphasis is on the feasibility of a permanent deficit in the long run, that is, on whether a government can continue to operate under its current fiscal policy indefinitely.

The empirical analysis used to examine debt stabilization is made up of two steps.

First, a Bayesian methodology is applied to conduct inference about the cointegrating relationship between budget revenues and expenditures (inclusive of interest) and to select the cointegrating rank. This task is complicated by the conceptual difficulty linked to the choice of the prior distributions for the parameters relevant to the economic problem under study (Villani, 2005).

Second, Bayesian inference is applied to the estimation of the normalized cointegrating vector between budget revenues and expenditures. With a single cointegrating equation, some known results concerning the posterior density of the cointegrating vector may be used (see Bauwens, Lubrano and Richard, 1999).

The priors used in the paper lead to straightforward posterior calculations that can be easily performed.

Moreover, the posterior analysis leads to a careful assessment of the magnitude of the cointegrating vector. Finally, it is shown to what extent the likelihood of the data is important in revising the available prior information, relying on numerical integration techniques based on deterministic methods.


Doctorate in Economics and Management Sciences

13

Hellström, Jörgen. "Count data modelling and tourism demand". Doctoral thesis, Umeå universitet, Institutionen för nationalekonomi, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-82168.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This thesis consists of four papers concerning the modelling of count data and tourism demand. Three of the papers focus on the integer-valued autoregressive moving average (INARMA) model class, and especially on the INAR(1) model. The fourth paper studies the interaction between households' choice of the number of leisure trips and the number of overnight stays within a bivariate count data modelling framework. Paper [I] extends the basic INAR(1) model to enable more flexible and realistic empirical economic applications. The model is generalized by relaxing some of its basic independence assumptions. Results are given in terms of first- and second-order conditional and unconditional moments. Extensions to general INAR(p), time-varying, multivariate and threshold models are also considered. Estimation by conditional least squares and generalized method of moments techniques is feasible. Monte Carlo simulations for two of the extended models indicate reasonable estimation and testing properties. An illustration based on the number of Swedish mechanical paper and pulp mills is considered. Paper [II] considers the robustness of a conventional Dickey-Fuller (DF) test for testing for a unit root in the INAR(1) model. Finite sample distributions for a model with Poisson distributed disturbance terms are obtained by Monte Carlo simulation. These distributions are wider than those of AR(1) models with normally distributed error terms. As the drift and sample size, respectively, increase, the distributions appear to tend to T-2 and standard normal distributions. The main results are summarized by an approximating equation that also enables the calculation of critical values for any sample and drift size. Paper [III] utilizes the INAR(1) model to model the day-to-day movements in the number of guest nights in hotels. By cross-sectional and temporal aggregation, an INARMA(1,1) model for monthly data is obtained. The approach enables easy interpretation and econometric modelling of the parameters, in terms of the daily mean check-in and check-out probabilities. Approaches that account for seasonality with dummies and with differenced series, as well as forecasting, are studied empirically for a series of Norwegian guest nights in Swedish hotels. In a forecast evaluation, the improvement from introducing economic variables is minute. Paper [IV] empirically studies households' joint choice of the number of leisure trips and the total number of nights to stay on these trips. The paper introduces a bivariate count hurdle model to account for the relatively high frequencies of zeroes. A truncated bivariate mixed Poisson lognormal distribution, allowing for both positive and negative correlation between the count variables, is utilized. Inflation techniques are used to account for the clustering of leisure time at weekends. Simulated maximum likelihood is used as the estimation method. A small policy study indicates that households substitute trips for nights as travel costs increase.
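The INAR(1) model referred to above is commonly written with the binomial thinning operator (standard notation, not quoted from the papers):

\[
X_t = \alpha \circ X_{t-1} + \varepsilon_t,
\qquad
\alpha \circ X_{t-1} = \sum_{j=1}^{X_{t-1}} B_j, \quad B_j \overset{iid}{\sim} \mathrm{Bernoulli}(\alpha),
\]

where $\varepsilon_t$ is an integer-valued innovation (e.g. Poisson). In the hotel application of Paper [III], $\alpha$ can be read as the daily probability that a guest stays on, so that $1-\alpha$ is the daily check-out probability and $\varepsilon_t$ counts check-ins.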

Together with 4 papers.


14

Inersjö, Adam. "Transformation of Time-based Sensor Data to Material Quality Data in Stainless Steel Production". Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-414802.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Quality assurance in stainless steel production requires large amounts of sensor data to monitor the processing steps. Digitalisation of the production would allow higher levels of control to both evaluate and increase the quality of the end products. At Outokumpu Avesta Works, continuous processing of coils creates sensor data without connecting it to individual steel coils, a connection needed to achieve the promises of digitalisation. In this project, the time series data generated from 12 sensors in the continuous processing was analysed and four alternative methods to connect the data to coils were presented. A method based on positional time series was deemed the most suitable for the data and was selected for implementation over other methods that would apply time series analysis to the sensor data itself. Evaluations of the selected method showed that it was able to connect sensor data to 98.10% of coils, just short of the accuracy requirement of 99%. Because the overhead of creating the positional time series was constant regardless of the number of sensors, the performance per sensor improved as the number of sensors increased. The median processing time for 24 hours of sensor data was less than 20 seconds per sensor when batch processing eight or more sensors. The performance when processing fewer than four sensors was not as good, requiring further optimization to reach the requirement of 30 seconds per sensor. Although the requirements were not completely fulfilled, the implemented method can still be used on historical production data to facilitate further quality estimation of stainless steel coils.
15

Gaillard, Pierre. "Contributions à l’agrégation séquentielle robuste d’experts : Travaux sur l’erreur d’approximation et la prévision en loi. Applications à la prévision pour les marchés de l’énergie". Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112133/document.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
We are interested in the online forecasting of an arbitrary sequence of observations. At each time step, some experts provide predictions of the next observation; we then form our prediction by combining the expert forecasts. This is the setting of online robust aggregation of experts. The goal is to ensure a small cumulative regret; in other words, we want our cumulative loss not to exceed by too much that of the best expert. We look for worst-case guarantees: no stochastic assumption is made on the data to be predicted, and the sequence of observations is arbitrary. A first objective of this work is to improve prediction accuracy. We investigate several possibilities: one example is to design fully automatic procedures that can exploit simplicity of the data whenever it is present; another relies on working on the expert set so as to improve its diversity. A second objective of this work is to produce probabilistic predictions: we are interested in coupling the point prediction with a measure of uncertainty (e.g., interval forecasts). The real-world applications of this setting are numerous: very few assumptions are made on the data, and online learning, which deals with data sequentially, is crucial for processing big data sets in real time. In this thesis, we carry out several empirical studies of energy data sets for EDF (electricity consumption, electricity prices, ...) and achieve good forecasting performance.
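The basic aggregation rule underlying this setting is the exponentially weighted average forecaster; here is a compact sketch (a generic illustration of the framework, not the refined algorithms studied in the thesis; the learning rate eta = 1.0 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
T, K, eta = 300, 3, 1.0
y = np.sin(np.linspace(0, 10, T))                 # arbitrary sequence to predict
experts = np.column_stack([y + 0.3 * rng.normal(size=T),
                           y + 0.6 * rng.normal(size=T),
                           np.zeros(T)])          # one deliberately poor expert

w = np.ones(K) / K
loss_agg, loss_exp = 0.0, np.zeros(K)
for t in range(T):
    pred = float(w @ experts[t])                  # convex combination of the experts
    loss_agg += (pred - y[t]) ** 2
    inst = (experts[t] - y[t]) ** 2               # instantaneous expert losses
    loss_exp += inst
    w = w * np.exp(-eta * inst)                   # exponential weight update
    w /= w.sum()

# Cumulative regret = aggregated loss minus the loss of the best single expert
print(round(loss_agg - float(loss_exp.min()), 2))
```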
16

Sànchez, Pérez Andrés. "Agrégation de prédicteurs pour des séries temporelles, optimalité dans un contexte localement stationnaire". Thesis, Paris, ENST, 2015. http://www.theses.fr/2015ENST0051/document.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This thesis gathers our results on the prediction of dependent time series. The work is divided into three main chapters in which we tackle different problems. The first concerns the aggregation of predictors of Causal Bernoulli Shifts using a Bayesian approach. The second treats the aggregation of predictors of what we define as sub-linear processes. Locally stationary time-varying autoregressive processes receive particular attention: we investigate an adaptive prediction scheme for them. In the last main chapter we study the linear regression problem for a general class of locally stationary processes.
17

Sànchez, Pérez Andrés. "Agrégation de prédicteurs pour des séries temporelles, optimalité dans un contexte localement stationnaire". Electronic Thesis or Diss., Paris, ENST, 2015. http://www.theses.fr/2015ENST0051.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This thesis gathers our results on the prediction of dependent time series. The work is divided into three main chapters in which we tackle different problems. The first concerns the aggregation of predictors of Causal Bernoulli Shifts using a Bayesian approach. The second treats the aggregation of predictors of what we define as sub-linear processes. Locally stationary time-varying autoregressive processes receive particular attention: we investigate an adaptive prediction scheme for them. In the last main chapter we study the linear regression problem for a general class of locally stationary processes.
18

Shirizadeh, Ghezeljeh Behrang. "Reaching carbon neutrality in France by 2050 : optimal choice of energy sources, carriers and storage options". Electronic Thesis or Diss., Paris, EHESS, 2021. http://www.theses.fr/2021EHES0013.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
To stay in line with 1.5°C of global warming, the French government has adopted the target of net-zero greenhouse gas emissions by 2050. Since the main greenhouse gas is carbon dioxide, and the majority of its emissions are due to energy combustion, this dissertation focuses on reaching carbon neutrality in French energy-related CO2 emissions by 2050. The thesis aims to study the relative roles of different low-carbon mitigation options in the energy sector in reaching carbon neutrality. More precisely, it first studies the French power sector, initially as a fully renewable power system and then as a power system containing other mitigation options, i.e., nuclear energy and carbon capture and storage. I study the impact of uncertainties related to the cost development of renewables and storage options and address the robustness of a fully renewable power system to cost uncertainties. Then, adding other low-carbon mitigation options to the power sector, I analyze the relative role of each option. Similarly, to incentivize investments in variable renewable energy sources such as wind and solar power, I study the investment risk related to the price and volume volatility of renewable electricity technologies, and the performance of different public policy support schemes. The analysis in this thesis goes beyond the electricity system and also considers the whole energy system in the presence of sector coupling. During this thesis, I have developed a family of models optimizing dispatch and investment to answer different questions regarding the French energy transition. These models minimize the cost of the considered system (the electricity system or the whole energy system) while satisfying the supply/demand equilibrium at each hour over at least one year and respecting the main technical, operational, resource-related and land-use constraints. Thus, both the short-term and the long-term variability of renewable energy sources are taken into account. Using these models, I address the questions raised above. The models are not used to find a single optimal solution, but several optimal solutions depending on different weather, cost, energy demand and technology availability scenarios. Therefore, robustness to uncertainty is at the center of the methodology used, alongside optimality. The findings of my thesis show that renewable energy sources are the main enablers of reaching carbon neutrality in a cost-effective way, whether only the electricity system or the whole energy system is considered. While the elimination of nuclear power barely increases the cost of a carbon-neutral energy system, the elimination of renewables is associated with high inefficiencies from both the cost and the emissions points of view. In fact, if renewable gas is not available, even a social cost of carbon of €500/tCO2 will not be enough to reach carbon neutrality. This is partially due to the negative emissions that renewable gas can provide once combined with carbon capture and storage, and partially due to the cost-optimality of renewable gas-fired internal combustion engines for reaching carbon neutrality in the transport sector. This dissertation has several important policy messages; the central one is that reaching carbon neutrality at the lowest cost requires a highly renewable energy system. Therefore, if we are to prioritize investment in low-carbon options, renewable gas and electricity technologies are of the highest importance.
19

Lin, Hsing-Chung, and 林信忠. "Contemporaneous on aggregation of stationary vector time series". Thesis, 1994. http://ndltd.ncl.edu.tw/handle/93708887088111805070.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Sebatjane, Phuti. "Understanding patterns of aggregation in count data". Diss., 2016. http://hdl.handle.net/10500/22067.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The term aggregation refers to overdispersion, and the two terms are used interchangeably in this thesis. In addressing the problem of the prevalence of infectious parasite species faced by most rural livestock farmers, we model the distribution of faecal egg counts of 15 parasite species (13 internal parasites and 2 ticks) common in sheep and goats. Aggregation and excess zeroes are addressed through the use of generalised linear models. The abundance of each species was modelled using six different distributions: the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-altered Poisson (ZAP) and zero-altered negative binomial (ZANB), and their fits were compared. Excess-zero models (ZIP, ZINB, ZAP and ZANB) were found to fit better than standard count models (Poisson and negative binomial) in all 15 cases. We further investigated how the distributional assumption affects aggregation and zero inflation. Aggregation and zero inflation (measured by the dispersion parameter k and the zero-inflation probability) were found to vary greatly with the distributional assumption; this in turn changed the fixed-effects structure. Serial autocorrelation between adjacent observations was then taken into account by fitting observation-driven time series models to the data. Simultaneously accounting for autocorrelation, overdispersion and zero inflation proved successful, as zero-inflated autoregressive models performed better than zero-inflated models in most cases. Apart from its contribution to scientific knowledge, predictability of parasite burden will help farmers with effective disease management interventions. Researchers confronted with the task of analysing count data with excess zeroes can use the findings of this illustrative study as a guideline, irrespective of their research discipline. Statistical methods from model selection and the quantification of zero inflation through to accounting for serial autocorrelation are described and illustrated.
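The comparison described above can be sketched with statsmodels (an illustrative toy, not the thesis code; intercept-only models on simulated counts with 40% structural zeroes):

```python
import numpy as np
from statsmodels.discrete.count_model import ZeroInflatedPoisson
from statsmodels.discrete.discrete_model import Poisson

rng = np.random.default_rng(6)
n = 1000
counts = rng.poisson(3.0, size=n)
counts[rng.random(n) < 0.4] = 0      # structural zeroes -> zero inflation

exog = np.ones((n, 1))               # intercept-only mean and inflation parts
pois = Poisson(counts, exog).fit(disp=0)
zip_ = ZeroInflatedPoisson(counts, exog, exog_infl=exog, inflation='logit').fit(disp=0)

print(round(pois.aic, 1), round(zip_.aic, 1))  # the ZIP fit should have much lower AIC
```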
21

Saker, Halima. "Segmentation of Heterogeneous Multivariate Genome Annotation Data". 2021. https://ul.qucosa.de/id/qucosa%3A75914.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Due to the potential impact of next-generation sequencing (NGS), we have seen a rapid increase in genomic information and in annotation information that can be naturally mapped to genomic locations. In cancer research, for example, there are significant efforts to chart DNA methylation at single-nucleotide resolution. The NIH Roadmap Epigenomics Project, on the other hand, has set out to chart a large number of different histone modifications. Throughout the last few years, a very diverse set of aspects has become the aim of large-scale experiments with a genome-wide readout. The identification of functional units of the genomic DNA is therefore considered a significant and essential challenge. This has motivated us to implement multi-dimensional segmentation approaches that accommodate gene variety and genome heterogeneity. The segmentation of multivariate genomic, epigenomic, and transcriptomic data from multiple time points, tissues, and cell types, in order to compare changes in genomic organization and identify common elements, forms the headline of our research. Next-generation sequencing offers rich material, used in bioinformatics research to explore molecular functions, the causes of diseases, and related questions. Rapid advances in technology have also led to a proliferation of experiment types. Although these experiments share next-generation sequencing as the readout, they produce signals with entirely different inherent resolutions, ranging from precise transcript structure at single-nucleotide resolution, to pull-down and enrichment-based protocols with resolutions on the order of 100 nt, to chromosome conformation data that are only accurate at kilobase resolution. The main goal of the dissertation project is therefore to design, implement, and test novel segmentation algorithms that work in one and multiple dimensions and can accommodate data of different types and resolutions. The target data in this project are multivariate genetic, epigenetic, transcriptomic, and proteomic data, because these datasets can change under several conditions, such as chemical, genetic and epigenetic modifications. A promising approach towards this end is to identify intervals of the genomic DNA that behave coherently across multiple conditions and tissues, which can be defined as intervals on which all measured quantities are constant within each experiment. A naive approach would take each dataset in isolation and estimate intervals in which the signal at hand is constant. Another approach takes all datasets at once as input, without recurring to one-dimensional segmentation. Once implemented, the algorithm should be applied to heterogeneous genomic, transcriptomic, proteomic, and epigenomic data; the aim here is to draw and improve the map of functionally coherent segments of a genome. Current approaches focus either on individual datasets, as in the case of tiling-array transcriptomics data, or on the analysis of comparable experiments, such as ChIP-seq data for various histone modifications. The simplest sub-problem in segmentation is to decide whether two adjacent intervals should form two distinct segments or be combined into a single one; as sketched below, in 1-D this is relatively well known, and we have to find out how it should be done in multi-dimensional segmentation. This leads to a segmentation of the genome with respect to the particular dataset; the intersection of segmentations for different datasets could then identify the DNA elements.
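A toy version of that merge-or-split decision (my illustration with a BIC-like penalty, not the criterion developed in the thesis) fits a constant to each interval and to their union, and merges when one segment explains the union almost as well as two:

```python
import numpy as np

def sse(x):
    """Residual sum of squares of a constant (mean) fit."""
    return float(((x - x.mean()) ** 2).sum())

def should_merge(a, b, penalty):
    """Merge adjacent intervals if one constant fit over the union costs no more
    than two separate fits plus a penalty for the extra segment boundary."""
    return sse(np.concatenate([a, b])) <= sse(a) + sse(b) + penalty

rng = np.random.default_rng(7)
left = rng.normal(0.0, 1.0, 50)
same = rng.normal(0.0, 1.0, 50)   # same level as `left`
jump = rng.normal(4.0, 1.0, 50)   # clearly different level

pen = np.log(100)                 # BIC-like penalty for one extra segment
print(should_merge(left, same, pen), should_merge(left, jump, pen))  # True False
```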
22

Alves, Pedro Miguel Carregueiro Jordão. "The effect of serial correlation in time-aggregation of annual Sharpe ratios from monthly data". Master's thesis, 2018. http://hdl.handle.net/10362/32318.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The Sharpe ratio is one of the most widely used measures of risk-adjusted returns. It rests on the estimation of the mean and standard deviation of returns, which is subject to estimation errors. Moreover, it assumes independently and identically distributed returns, normality and no serial correlation, which are very restrictive assumptions in general. By using the Generalized Method of Moments approach to estimate these quantities, the assumptions may be relaxed and a more efficient estimator can be derived that allows for serial correlation in returns. The purpose of this research is to show how serial correlation can affect the time-aggregation of Sharpe ratios, changing the ordering of a ranking of assets based on the ratio.
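The role of serial correlation in time-aggregation can be made explicit with the standard scaling result (Lo, 2002), which the setting above builds on: with q = 12 months,

\[
SR_{\text{annual}} = \sqrt{12}\, SR_{\text{monthly}} \quad \text{(i.i.d. returns)},
\qquad
SR_{\text{annual}} = \frac{12\, SR_{\text{monthly}}}{\sqrt{12 + 2\sum_{k=1}^{11} (12-k)\,\rho_k}} \quad \text{(autocorrelated returns)},
\]

where $\rho_k$ is the kth-order autocorrelation of monthly returns. Positive autocorrelation deflates the annualized ratio relative to the $\sqrt{12}$ rule, which is how a ranking of assets based on annualized Sharpe ratios can reorder.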
23

ALAIMO, LEONARDO. "Complexity of social phenomena: measurements, analysis, representations and synthesis". Doctoral thesis, 2020. http://hdl.handle.net/11573/1360691.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
During the three years of my Ph.D., I have analyzed and studied phenomena that are often very different and apparently distant from one another: well-being; sustainable development; gender inequalities; the Brexit vote. The aim has always been to understand these facets of reality and to explain them from a quantitative point of view. My interest was to provide a measure of concepts often considered difficult to deal with and to understand. When I finished my Ph.D., I tried to put together the experience I had gained. As mentioned, the research interests were many and it was therefore necessary to conceptualize them within a framework that would highlight their common elements. This thesis is the result of such an attempt at conceptualisation. The title itself highlights the concepts common to my research work over the years. The first concept I deal with is complexity. I have realized that all these different socio-economic phenomena have in common their complex structure, often mistakenly confused with complication and difficulty. Nowadays, complexity is a concept that characterizes all the natural and social sciences and defines our relationship with knowledge. Chapter 1 examines precisely the theme of complexity, presenting different approaches to and definitions of this issue. I have tried to reconstruct the way in which complexity became central to our relationship with knowledge, together with its qualifying concepts such as subjectivity, the concept of system, and circular causality. The second guiding concept of this research work is measurement. Understanding the world requires a sort of translation, a shift from the plane of reality, in which we observe phenomena, to the plane of numbers, in which we try to encode them. This translation must be meaningful: it must reproduce in the world of numbers, as faithfully as possible, the phenomenon observed in the plane of reality. Measurement is a prerequisite for the knowledge of reality, which speaks to us in the language of numbers. This issue is the subject of Chapter 2, in which I address the definition of this process. Subsequently, measurement is contextualised within sociology, presenting the essential contribution to this theme offered by Paul Felix Lazarsfeld with operationalisation. Finally, the concept of the indicator is explored, analysing its crucial importance in the measurement of social phenomena. The chapter presents all the main aspects through which it is possible to obtain a system of indicators, a tool for measuring complex social phenomena. The way in which we can measure complex socio-economic phenomena is dealt with in Chapter 3. Synthesis is presented from a methodological point of view, considering both aspects of a system: units (rows) and indicators (columns). I focus on the synthesis techniques that allow a dynamic analysis of phenomena, in order to obtain measures that are comparable not only in space but also in time; only in this way is a synthesis meaningful. In the chapter, I define the object of study, the three-way data array X = {x_ijt : i = 1, ..., N; j = 1, ..., M; t = 1, ..., T}, where x_ijt is the value of the j-th indicator for the i-th unit at the t-th temporal occasion. The methods for clustering these objects and summarising the indicators are addressed, considering both the aggregative and the non-aggregative approach (in particular, I propose an approach for applying posets to systems of indicators over time). In the last two chapters, I propose two applications to real data.
Both applications concern regional data. This choice was made because of the importance that the regional dimension has for a country like Italy, characterized by strong territorial disparities. The first application (Chapter 4) concerns the concept of well-being and, from a methodological point of view, the synthesis of statistical units. In particular, using the time series of regional composites produced by the Italian National Institute of Statistics for the Equitable and Sustainable Well-being project (BES), we classify the Italian regions according to different domains. We use a time series fuzzy clustering algorithm, which is particularly suitable for that type of data. Chapter 5 deals with sustainable development and the issue of the synthesis of statistical indicators over time. In particular, an aggregative method, the Adjusted Mazziotta-Pareto Index (AMPI), and a non-aggregative procedure based on posets are compared.
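For the aggregative side of that comparison, the following Python sketch shows the usual formulation of the Adjusted Mazziotta-Pareto Index: each indicator is rescaled to [70, 130] using fixed goalposts, and each unit's score is the mean of its rescaled indicators corrected by a variability penalty S_i * cv_i = S_i^2 / M_i. The goalposts and data are illustrative assumptions, not figures from the thesis.

import numpy as np

def ampi(X, min_j, max_j, higher_is_better=True):
    # X: units x indicators matrix; min_j, max_j: goalposts per indicator.
    # Rescaling: r_ij = (x_ij - Min_j) / (Max_j - Min_j) * 60 + 70.
    R = (X - min_j) / (max_j - min_j) * 60.0 + 70.0
    M = R.mean(axis=1)                 # mean of rescaled indicators per unit
    S = R.std(axis=1)                  # their standard deviation per unit
    penalty = S ** 2 / M               # S_i * cv_i, with cv_i = S_i / M_i
    # Penalise unbalanced profiles: subtract for "higher is better" indices.
    return M - penalty if higher_is_better else M + penalty

# Example: 3 regions x 2 indicators; goalposts held fixed over time so that
# scores remain comparable across years, the point stressed in the chapter.
X = np.array([[0.62, 0.71],
              [0.55, 0.40],
              [0.80, 0.78]])
print(ampi(X, min_j=np.array([0.3, 0.3]), max_j=np.array([0.9, 0.9])))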
24

Riba, Evans Mogolo. "Exploring advanced forecasting methods with applications in aviation". Diss., 2021. http://hdl.handle.net/10500/27410.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
Abstract (sommario):
Abstracts in English, Afrikaans and Northern Sotho
In recent years, more time series forecasting methods have been researched and made available. This is mainly due to the emergence of machine learning methods, which have also found applicability in time series forecasting. The emergence of a variety of methods and their variants presents a challenge when choosing appropriate forecasting methods. This study explored the performance of four advanced forecasting methods: autoregressive integrated moving averages (ARIMA); artificial neural networks (ANN); support vector machines (SVM); and regression models with ARIMA errors. To improve their performance, bagging was also applied. The performance of the different methods was illustrated using South African air passenger data collected for planning purposes by the Airports Company South Africa (ACSA). The dissertation discussed the different forecasting methods at length. Characteristics such as strengths and weaknesses, as well as the applicability of the methods, were explored. Some of the most popular forecast accuracy measures were discussed in order to understand how they could be used in the performance evaluation of the methods. It was found that the regression model with ARIMA errors outperformed all the other methods, followed by the ARIMA model. These findings are in line with the general findings in the literature. The ANN method is prone to overfitting, and this was evident from the results on the training and test data sets. The bagged models showed mixed results, with marginal improvement on some of the methods for some performance measures. It could be concluded that the traditional statistical forecasting methods (ARIMA and the regression model with ARIMA errors) performed better than the machine learning methods (ANN and SVM) on this data set, based on the measures of accuracy used. This calls for more research regarding the applicability of the machine learning methods to time series forecasting, which will assist in understanding and improving their performance relative to the traditional statistical methods.
In recent times, various time series forecasting methods have been investigated as a result of the development of machine learning methods with applications in time series forecasting. The new methods and their variants allow a wide choice among forecasting methods. This study investigates the performance of four advanced forecasting methods: autoregressive integrated moving averages (ARIMA), artificial neural networks (ANN), support vector machines (SVM) and regression models with ARIMA errors. Bagging was used to improve the performance of the methods. The performance of the four methods was compared by applying them to South African air passenger data collected for planning purposes by the Airports Company South Africa (ACSA). This dissertation describes the various forecasting methods comprehensively. Both the strengths and the weaknesses, as well as the applicability of the methods, are highlighted. Well-known performance measures were examined to evaluate the performance of the methods. The regression model with ARIMA errors and the ARIMA model performed best of the four methods. This finding is consistent with those in the literature. That the ANN method tends to overfit was confirmed by the results of the training and test data sets. The bagged models produced mixed results and improved marginally on some performance measures for some methods. Based on the values of the performance measures used in this study, it can be concluded that the traditional statistical forecasting methods (ARIMA and regression with ARIMA errors) performed better than the machine learning methods (ANN and SVM) on the chosen data set. This points to the need for further research into the applicability of machine learning methods to time series forecasting, in order to improve their performance relative to that of the traditional methods.
Research has been conducted on many time series forecasting methods that have been made available in recent years. This is because of the emergence of machine learning methods, which have also been used in time series forecasting. The emergence of a variety of methods and their variants presents a challenge when choosing appropriate forecasting methods. This study examined the performance of four advanced forecasting methods, namely: autoregressive integrated moving averages (ARIMA); artificial neural networks (ANN); support vector machines (SVM); and regression methods with ARIMA errors. To improve their performance, bagging was also applied. The performance of the different methods was illustrated using South African air passenger data collected for planning purposes by the Airports Company South Africa (ACSA). The dissertation discussed the different forecasting methods at length. Characteristics such as strengths and weaknesses and the applicability of the methods were considered. Some of the most popular measures of forecast accuracy were discussed with the aim of understanding how they can be used in evaluating the performance of these methods. It was found that the regression method with ARIMA errors outperformed all the other methods, followed by the ARIMA method. These findings are in line with the findings generally reported in the literature. The ANN method can overfit, and this was evident in the results of the training and test data sets. The bagged machine learning methods showed mixed results, with improvement on some of the performance measures for some methods. It can be concluded that the traditional statistical forecasting methods (ARIMA and the regression method with ARIMA errors) performed better than the machine learning methods (ANN and SVM) on this data set, according to the accuracy measures used. This calls for further research on the applicability of machine learning methods to time series forecasting, which will help in understanding and improving their performance against the traditional statistical methods.
Decision Sciences
M. Sc. (Operations Research)
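As a rough illustration of the comparison described in the abstract above, the following Python sketch fits a seasonal ARIMA model and a regression with ARIMA errors to a monthly series with statsmodels and compares their out-of-sample accuracy. The synthetic passenger-like data, model orders, and train/test split are illustrative assumptions, not the ACSA data or the dissertation's specifications.

import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic monthly series with trend and annual seasonality, standing in
# for the (non-public) ACSA air passenger data.
rng = np.random.default_rng(1)
t = np.arange(144)
y = 100 + 0.8 * t + 12 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 4, 144)

train, test = y[:120], y[120:]
x = t.reshape(-1, 1)                   # exogenous regressor: a linear trend
x_train, x_test = x[:120], x[120:]

# Plain seasonal ARIMA ("airline"-style specification).
arima = SARIMAX(train, order=(1, 1, 1),
                seasonal_order=(0, 1, 1, 12)).fit(disp=False)
fc_arima = arima.forecast(steps=len(test))

# Regression with ARIMA errors: the trend enters as an exogenous regressor
# and the seasonal ARMA structure models the regression residuals.
reg_arima = SARIMAX(train, exog=x_train, order=(1, 0, 1),
                    seasonal_order=(0, 1, 1, 12)).fit(disp=False)
fc_reg = reg_arima.forecast(steps=len(test), exog=x_test)

def rmse(actual, forecast):
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

print("ARIMA RMSE:                    ", rmse(test, fc_arima))
print("Regression with ARIMA errors:  ", rmse(test, fc_reg))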
