Academic literature on the topic 'Outliers'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Outliers.'

Journal articles on the topic "Outliers"

1

Seo, Han Son. "Outlier tests on potential outliers." Korean Journal of Applied Statistics 30, no. 1 (February 28, 2017): 159–67. http://dx.doi.org/10.5351/kjas.2017.30.1.159.

2

Srividya, S. Mohanavalli, N. Sripriya, and S. Poornima. "Outlier Detection using Clustering Techniques." International Journal of Engineering & Technology 7, no. 3.12 (July 20, 2018): 813. http://dx.doi.org/10.14419/ijet.v7i3.12.16508.

Abstract:
An outlier is a pattern that differs from the other patterns in a dataset. In some applications it is very important to understand and identify outliers. Detecting outliers is of major importance in fields such as cybersecurity, machine learning, finance, and healthcare. A clustering-based method is proposed to detect outliers using algorithms such as k-means, PAM, CLARA, DBSCAN, and LOF on datasets covering breast cancer, heart disease, and multi-shaped data. This work aims to identify the most suitable method for detecting outliers accurately.
3

Huda, Nur'ainul Miftahul, Utriweni Mukhaiyar, and Nurfitri Imro'ah. "AN ITERATIVE PROCEDURE FOR OUTLIER DETECTION IN GSTAR(1;1) MODEL." BAREKENG: Jurnal Ilmu Matematika dan Terapan 16, no. 3 (September 1, 2022): 975–84. http://dx.doi.org/10.30598/barekengvol16iss3pp975-984.

Abstract:
Outliers are observations that differ significantly from the others; they can affect the estimation results in a model and reduce the estimator's accuracy. One way to deal with outliers is to remove them from the data. However, outliers sometimes contain important information, so eliminating them can lead to misinterpretation. There are two types of outliers in time series models: Innovative Outlier (IO) and Additive Outlier (AO). In the GSTAR model, outliers can be detected alongside spatial and time correlations. We introduce an iterative procedure for detecting outliers in the GSTAR model. The first step is to form a GSTAR model without outlier factors. Outliers are then detected from the model's residuals. If an outlier is detected, an outlier factor is added to the initial model and the parameters are re-estimated, yielding a new GSTAR model and new residuals. The process of detecting outliers and adding them to the model is repeated until a GSTAR model with no detected outliers is obtained. As a result, outliers are not removed or ignored; instead, an outlier factor is added to the GSTAR model. This paper presents a case study of Dengue Hemorrhagic Fever cases in five locations in West Kalimantan Province, which are modeled with GSTAR with added outlier factors. The result is that the iterative procedure for detecting outliers based on the GSTAR residuals provides better accuracy than the regular GSTAR model (without outlier factors). The problem can thus be handled without removing outliers from the data, by adding outlier factors to the model. This way, the critical information in the outliers is not lost, and a more accurate model is obtained.
4

Muhima, Rani Rotul, Muchamad Kurniawan, and Oktavian Tegar Pambudi. "A LOF K-Means Clustering on Hotspot Data." International Journal of Artificial Intelligence & Robotics (IJAIR) 2, no. 1 (July 1, 2020): 29. http://dx.doi.org/10.25139/ijair.v2i1.2634.

Abstract:
K-Means is the most popular clustering method, but its drawback is sensitivity to outliers. This paper discusses adding an outlier removal step to K-Means to improve clustering performance. The outlier removal step uses the Local Outlier Factor (LOF), a representative density-based outlier detection algorithm. In this research, the combined method is called LOF K-Means. First, clustering is applied to hotspot data using K-Means, and then outliers are found using LOF. The objects detected as outliers are removed, and a new centroid for each group is obtained by running K-Means again. The dataset was taken from FIRMS, provided by the National Aeronautics and Space Administration (NASA). Clustering was done by varying the number of clusters (k = 10, 15, 20, 25, 30, 35, 40, 45 and 50), with the optimal number of clusters being k = 20. Based on the Sum of Squared Error (SSE), the LOF K-Means method performed better than the K-Means method.
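The pipeline this abstract describes (cluster, score with LOF, drop flagged points, cluster again) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy points, the fixed starting centroids, the neighbourhood size `k=3`, and the LOF cutoff of 1.5 are all assumptions chosen for the example, and the LOF here ignores ties at the k-distance and assumes no duplicate points.

```python
import math

def kmeans(points, centroids, iters=20):
    """Lloyd's algorithm from fixed starting centroids; returns (centroids, SSE)."""
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Recompute each centroid as its group mean (keep it if the group is empty).
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    sse = sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)
    return centroids, sse

def lof_scores(points, k=3):
    """Local Outlier Factor per point (simplified: exactly k neighbours, no ties)."""
    n = len(points)
    neigh = []
    for i in range(n):
        d = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        neigh.append(d[:k])
    kdist = [nb[-1][0] for nb in neigh]          # distance to the k-th neighbour
    # Local reachability density: inverse of the mean reachability distance.
    lrd = [k / sum(max(kdist[j], d) for d, j in neigh[i]) for i in range(n)]
    return [sum(lrd[j] for _, j in neigh[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical toy data: two tight clusters plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 30)]
start = [(0, 0), (10, 10)]                        # assumed initial centroids

_, sse_before = kmeans(points, start)             # step 1: plain K-Means
scores = lof_scores(points, k=3)                  # step 2: LOF on the same data
kept = [p for p, s in zip(points, scores) if s <= 1.5]  # step 3: drop flagged points
_, sse_after = kmeans(kept, start)                # step 4: K-Means again
print(sse_before, sse_after)
```

Note that the two SSE values are computed over different point sets, so the drop in SSE partly reflects the removed point itself; the abstract does not say how the paper accounts for this.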
5

Agyemang, Malik, Ken Barker, and Reda Alhajj. "Web outlier mining: Discovering outliers from web datasets." Intelligent Data Analysis 9, no. 5 (November 3, 2005): 473–86. http://dx.doi.org/10.3233/ida-2005-9505.

6

Syed Abd Mutalib, Sharifah Sakinah, Siti Zanariah Satari, and Wan Nur Syahidah Wan Yusoff. "SYNTHETIC MULTIVARIATE DATA GENERATION PROCEDURE WITH VARIOUS OUTLIER SCENARIOS USING R PROGRAMMING LANGUAGE." Jurnal Teknologi 84, no. 3 (March 31, 2022): 89–101. http://dx.doi.org/10.11113/jurnalteknologi.v84.17900.

Abstract:
A synthetic data generation procedure generates data from either a statistical or a mathematical model. Such procedures have been used in simulation studies to compare the performance of statistical methods or to propose a new statistical method for a specific distribution. This study formulates a synthetic multivariate data generation procedure with various outlier scenarios using R. An outlier generating model is used to generate multivariate data that contains outliers, and the data generation procedures for the various outlier scenarios in R are explained. Three outlier scenarios are produced, and graphical representations using 3D scatterplots and Chernoff faces are shown for them. The graphical representations show that as the distance between outliers and inliers increases through shifting the mean in Outlier Scenario 1, the outliers and inliers become completely separated. The same pattern can be seen when the distance between outliers and inliers increases through shifting the covariance in Outlier Scenario 2. For Outlier Scenario 3, when both values increase, the separation of outliers and inliers is more apparent. The data generation procedure in this study will continue to be used in other applications, such as identifying outliers with clustering methods.
7

Yulistiani, Selma, and Suliadi Suliadi. "Deteksi Pencilan pada Model ARIMA dengan Bayesian Information Criterion (BIC) Termodifikasi." STATISTIKA: Journal of Theoretical Statistics and Its Applications 19, no. 1 (June 20, 2019): 29–37. http://dx.doi.org/10.29313/jstat.v19i1.4740.

Abstract:
Time series data may be affected by special events or circumstances such as promotions or natural disasters. These events can produce observations that are inconsistent with the rest of the series, called outliers. Because outliers can invalidate conclusions, it is important to carry out procedures for detecting outlier effects. This study considers one type of outlier, the additive outlier (AO). Detecting additive outliers in an ARIMA model can be framed as a model selection problem, where each candidate model assumes additive outliers at certain times. Model selection requires a criterion for choosing the best model. One suitable criterion is the Bayesian Information Criterion (BIC) derived by Schwarz (1978). Galeano and Pena (2011) proposed a modified Bayesian Information Criterion for model selection and detection of potential outliers. The modified criterion is applied here to the outstanding loan data of PT Pegadaian Cimahi for 2013-2017. The best model obtained is ARIMA(1,0,0) with two added potential outliers, at observations 48 and 58, because it has the minimum BICUP value of 1064.95650.
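The idea of treating outlier detection as model selection can be illustrated with a deliberately simplified sketch. It swaps in the standard Schwarz BIC (not the modified criterion of Galeano and Pena) and a constant-mean model (not ARIMA); the series values and the helper names `bic` and `fit_mean` are hypothetical. Each candidate model adds one dummy variable that absorbs a single observation, and the candidate with the lowest BIC is compared against the no-outlier model.

```python
import math

def bic(residuals, n_params):
    """Gaussian BIC up to an additive constant: n*ln(RSS/n) + p*ln(n)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + n_params * math.log(n)

def fit_mean(series, outlier_at=None):
    """Residuals of a constant-mean model.

    A dummy variable at index `outlier_at` fits that observation exactly,
    so its residual is zero and the mean is estimated from the other points.
    """
    used = [x for i, x in enumerate(series) if i != outlier_at]
    mu = sum(used) / len(used)
    return [0.0 if i == outlier_at else x - mu for i, x in enumerate(series)]

# Hypothetical series with one additive outlier at index 5.
series = [10.1, 9.9, 10.0, 10.2, 9.8, 25.0, 10.0, 10.1, 9.9, 10.0]

bic_no_outlier = bic(fit_mean(series), n_params=1)
candidates = {t: bic(fit_mean(series, t), n_params=2) for t in range(len(series))}
best_t = min(candidates, key=candidates.get)
outlier_detected = candidates[best_t] < bic_no_outlier
print(best_t, outlier_detected)
```

The extra `ln(n)` penalty for the dummy is what keeps the procedure from declaring every observation an outlier: a candidate is accepted only if absorbing that point reduces the residual sum of squares enough to pay for the added parameter.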
8

Knight, Nathan L., and Jinling Wang. "A Comparison of Outlier Detection Procedures and Robust Estimation Methods in GPS Positioning." Journal of Navigation 62, no. 4 (October 2009): 699–709. http://dx.doi.org/10.1017/s0373463309990142.

Abstract:
With more satellite systems becoming available there is currently a need for Receiver Autonomous Integrity Monitoring (RAIM) to exclude multiple outliers. While the single outlier test can be applied iteratively, in the field of statistics robust methods are preferred when multiple outliers exist. This study compares the outlier test and numerous robust methods with simulated GPS measurements to identify which methods have the greatest ability to correctly exclude outliers. It was found that no method could correctly exclude outliers 100% of the time. However, for a single outlier the outlier test achieved the highest rates of correct exclusion followed by the MM-estimator and the L1-norm. As the number of outliers increased MM-estimators and the L1-norm obtained the highest rates of normal exclusion, which were up to ten percent higher than the outlier test.
9

Hasanah, Siti Tabi'atul. "Pendeteksian Outlier pada Regresi Nonlinier dengan Metode statistik Likelihood Displacement." CAUCHY 2, no. 3 (November 15, 2012): 177. http://dx.doi.org/10.18860/ca.v2i3.3127.

Abstract:
An outlier is an observation that differs greatly from the other observations, or one that does not follow the general pattern of the model. Sometimes outliers provide information that other data cannot, which is why they should not simply be eliminated. Outliers can also be influential observations. Many methods can be used to detect outliers. Previous studies addressed outlier detection in linear regression; this work develops outlier detection for nonlinear regression, specifically multiplicative nonlinear regression. Detection uses the likelihood displacement (LD) statistic, a method that detects outliers by removing the suspected outlying observations. The parameters are estimated by the maximum likelihood method. The LD method identifies the observations suspected of being outliers. The accuracy of the LD method in detecting outliers is then shown by comparing the MSE of LD with the MSE of the ordinary regression. The test statistic used is Λ. The null hypothesis is rejected when the test criterion is met, in which case the observation is declared an outlier.
10

Maia Lima, Luís Fernando, Alexandre Masson Maroldi, Dávilla Vieira Odízio da Silva, Carlos Roberto Massao Hayashi, and Maria Cristina Piumbato Innocentini Hayashi. "A influência de outliers nos estudos métricos da informação: uma análise de dados univariados." Em Questão 24 (December 31, 2018): 216. http://dx.doi.org/10.19132/1808-5245240.216-235.

Abstract:
This article presents a new formula for outlier detection via Exploratory Data Analysis that takes the skewness of the data into account, and also studies the effect of removing outliers from the original data. The formula is applied to three datasets published in the literature on metric studies of information. The first dataset has five lower outliers. The mean of the aggregated data gives the false impression that 40 universities, out of a total of 49, are above average. Removing the five lower outliers yields a new mean under which only 22 universities are above average. The second dataset contains five lower outliers and one upper outlier. In this case, the upper outlier mitigates the effect of the lower outliers. In the third dataset, five upper outliers and one lower outlier are detected. The mean of the aggregated data indicates that ten universities are above average. After removing the six outliers from the original data, 28 universities are above the new mean. For the three datasets analyzed, the work also demonstrates the effect of outliers on interval estimation (statistical inference): removing the outliers produces more representative values for both the mean and the standard deviation of the sample. This shows how outliers can affect results and conclusions in metric studies of information. However, the outlier detection formula remains open to future research.

Dissertations / Theses on the topic "Outliers"

1

Sean, Viseth. "Exploration Framework For Detecting Outliers In Data Streams." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-theses/395.

Abstract:
Current real-world applications generate large volumes of data that are often continuously updated over time. Detecting outliers in such evolving datasets requires continuously updating the result. Furthermore, response time is very important for these time-critical applications. This is challenging. First, the algorithm is complex; even mining outliers from a static dataset once is already very expensive. Second, users need to specify input parameters to approach the true outliers. Since the number of parameters is large, a trial-and-error approach online would be not only impractical and expensive but also tedious for the analysts. Worse yet, since the dataset is changing, the best parameters need to be updated to respond to user exploration requests. Overall, the large number of parameter settings and the evolving datasets make the problem of efficiently mining outliers from dynamic datasets very challenging. Thus, in this thesis, we design an exploration framework for detecting outliers in data streams, called EFO, which enables analysts to continuously explore anomalies in dynamic datasets. EFO is a continuous lightweight preprocessing framework. EFO embraces two optimization principles, namely "best life expectancy" and "minimal trial," to compress evolving datasets into a knowledge-rich abstraction of important interrelationships among data. An incremental sorting technique is also used to leverage the almost ordered lists in this framework. The knowledge abstraction generated by EFO supports not only traditional outlier detection requests but also novel outlier exploration operations on evolving datasets. Our experimental study conducted on two real datasets demonstrates that EFO outperforms the state-of-the-art technique in terms of CPU processing costs when varying stream volume, velocity and outlier rate.
2

Beau, Thabiso. "Normality of JSE Returns: Macro-outliers, Micro-outliers: an Empirical Evaluation." Master's thesis, Faculty of Commerce, 2019. https://hdl.handle.net/11427/31721.

Abstract:
Previous work on the empirical distribution of security returns has found that equity returns are not normally distributed. These findings have brought the applicability of certain asset allocation and pricing frameworks into question. This study examines whether the removal of a priori macro-outliers and micro-outliers leads to improved fits to the Gaussian distribution for single-listed equities on the Johannesburg Stock Exchange (JSE). Single-listed equities refer to stocks (i) listed on the JSE Main Board over the period covered in this study, (ii) that comprise of the exchange’s largest 100 stocks by market capitalisation, and (iii) have been determined, by comparing American Depository Receipt (ADR) trading volume to JSE trading volume, to be mainly exposed to the South African market. Regarding the predetermined outliers, the study categorises macro-outliers as days related to predictable market announcements which are US nonfarm payrolls announcement days. Similarly, micro-outliers are classified as days linked to predictable sector-specific and firm-specific news, which are sectoral announcement, and company earnings announcement days, respectively. The study aims to contribute to the empirical and theoretical literature on the distributional properties of South African equity returns. This study makes use of a filter to narrow the sample of stocks for empirical investigation over the period from 1 January 2016 to 31 December 2017, and analyses daily stock returns on a 65-day rolling basis. Using only those equities, an evaluation of the goodness-of-fit methodology is conducted using graphical methods, and statistical goodness-of-fit tests sorted into (i) empirical distribution function, (ii) regression and correlation, and (iii) moment tests. It is found that the majority of the data exhibits significant departures from normality in empirical distribution function, and regression and correlation tests. 
The results were statistically significant at three confidence levels. However, in the case of moment tests, the results show a clear divergence between the methods. It is further demonstrated that while the daily stock returns have improved fits to the normal distribution, they remain predominantly positively-skewed and thick-tailed even after the removal of the a priori outliers. On this basis, it is argued that some downside risk measures, and asset allocation frameworks may not be applicable in the South African context.
3

Mitchell, Napoleon. "Outliers and Regression Models." Thesis, University of North Texas, 1992. https://digital.library.unt.edu/ark:/67531/metadc279029/.

Abstract:
The mitigation of outliers serves to increase the strength of a relationship between variables. This study defined outliers in three different ways and used five regression procedures to describe the effects of outliers on 50 data sets. This study also examined the relationship among the shape of the distribution, skewness, and outliers.
4

Yin, Yong. "Outliers in Time Series /." Connect to resource, 1995. http://rave.ohiolink.edu/etdc/view.cgi?acc%5Fnum=osu1262638388.

5

Halldestam, Markus. "ANOVA - The Effect of Outliers." Thesis, Uppsala universitet, Statistiska institutionen, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-295864.

Abstract:
This bachelor’s thesis focuses on the effect of outliers on the one-way analysis of variance and examines whether the estimate in ANOVA is robust and whether the actual test itself is robust from influence of extreme outliers. The robustness of the estimates is examined using the breakdown point while the robustness of the test is examined by simulating the hypothesis test under some extreme situations. This study finds evidence that the estimates in ANOVA are sensitive to outliers, i.e. that the procedure is not robust. Samples with a larger portion of extreme outliers have a higher type-I error probability than the expected level.
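The breakdown-point argument mentioned in this abstract can be illustrated with a small sketch. ANOVA's estimates are group means, and the mean has a breakdown point of zero: a single arbitrarily bad observation can move it arbitrarily far. The sample values below are hypothetical, and the median is shown only as a contrasting robust estimator, not as something the thesis proposes.

```python
import statistics

# Hypothetical group of five measurements, then the same group with one
# observation replaced by an extreme value.
clean = [4.8, 5.1, 5.0, 4.9, 5.2]
corrupted = clean[:-1] + [1e6]

# A single corrupted value drags the mean far away from the bulk of the data...
mean_shift = abs(statistics.mean(corrupted) - statistics.mean(clean))

# ...while the median stays put as long as fewer than half the points are corrupted.
median_shift = abs(statistics.median(corrupted) - statistics.median(clean))

print(mean_shift, median_shift)
```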
6

Schall, Robert. "Outliers and influence under arbitrary variance." Doctoral thesis, University of Cape Town, 1986. http://hdl.handle.net/11427/21913.

Abstract:
Using a geometric approach to best linear unbiased estimation in the general linear model, the additional sum of squares principle, used to generate decompositions, can be generalized allowing for an efficient treatment of augmented linear models. The notion of the admissibility of a new variable is useful in augmenting models. Best linear unbiased estimation and tests of hypotheses can be performed through transformations and reparametrizations of the general linear model. The theory of outliers and influential observations can be generalized so as to be applicable for the general univariate linear model, where three types of outlier and influence may be distinguished. The adjusted models, adjusted parameter estimates, and test statistics corresponding to each type of outlier are obtained, and data adjustments can be effected. Relationships to missing data problems are exhibited. A unified approach to outliers in the general linear model is developed. The concept of recursive residuals admits generalization. The typification of outliers and influential observations in the general linear model can be extended to normal multivariate models. When the outliers in a multivariate regression model follow a nested pattern, maximum likelihood estimation of the parameters in the model adjusted for the different types of outlier can be performed in closed form, and the corresponding likelihood ratio test statistic is obtained in closed form. For an arbitrary outlier pattern, and for the problem of outliers in the generalized multivariate regression model, three versions of the EM-algorithm corresponding to three types of outlier are used to obtain maximum likelihood estimates iteratively. A fundamental principle is the comparison of observations with a choice of distribution appropriate to the presumed type of outlier present. Applications are not necessarily restricted to multivariate normality.
7

Campos, Guilherme Oliveira. "Estudo, avaliação e comparação de técnicas de detecção não supervisionada de outliers." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-04082015-084412/.

Abstract:
Outlier detection (or anomaly detection) plays an essential role in discovering patterns in data that can be considered exceptional from some perspective. Detecting such patterns matters because, in many data mining applications, they represent extraordinary behaviors that deserve special attention. An important distinction is made between supervised and unsupervised detection techniques. This project focuses on unsupervised techniques. There are dozens of algorithms in this category in the literature, and new ones are proposed from time to time, but each uses its own notion of what should be considered an outlier, which is a subjective concept in the unsupervised context. This considerably complicates the choice of a particular algorithm for a given practical application. While it is common knowledge that no machine learning algorithm can be superior to all others in all application scenarios, it is a relevant question whether the performance of certain algorithms tends, in general, to dominate that of certain others, at least in particular classes of problems. This project proposes to contribute to the study, selection, and pre-processing of databases suitable for inclusion in a benchmark collection for evaluating unsupervised outlier detection algorithms. It also proposes to comparatively evaluate the performance of outlier detection methods. During part of my master's work, I had the intellectual collaboration of Erich Schubert, Ira Assent, Barbora Micenková, Michael Houle and especially Joerg Sander and Arthur Zimek. Their contribution was essential to the analysis of the results and to the compact way of presenting them.
8

Berton, Lilian. "Caracterização de classes e detecção de outliers em redes complexa." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-19072011-132701/.

Abstract:
Complex networks have emerged as a new and important way of representing and abstracting data, capable of capturing the spatial, topological, and functional relationships, among other characteristics, present in many databases. Among the various approaches to data analysis, classification and outlier detection stand out. Data classification assigns a class to the data based on the characteristics of their attributes, and outlier detection searches for data whose characteristics differ from the others. Methods of data classification and outlier detection based on complex networks are still little studied. Given the benefits of using complex networks for data representation, this work develops a complex-network-based method for outlier detection that uses a random walk and a dissimilarity index. The method allows different types of outliers to be identified with the same measure. Depending on the structure of the network, outlier vertices can be either those far from the center or the central ones, and can be hubs or vertices with few connections. In general, the proposed measure is a good estimator of outlier vertices in a network, properly identifying vertices with a distinctive structure or a special function in the network. A network construction technique is also proposed that can represent similarity relationships between classes of data, based on an energy function that considers measures of purity and extension of the network. The resulting network was used to characterize mixing among data classes. Characterization of classes is an important issue in data classification, but it is still little explored. We consider this work to be one of the first attempts in this direction.
9

Iranzo, Pérez David. "Análisis de outliers: un caso a estudio." Doctoral thesis, Universitat de València, 2007. http://hdl.handle.net/10803/9467.

Abstract:
One of the limitations of using ARIMA modelling, and more specifically the Box-Jenkins approach, to study time series is the difficulty of correctly identifying the model and, where applicable, choosing the most suitable one. The standard filtering process used to estimate the business cycle can require prior correction of the series, since results could otherwise be seriously distorted. One prominent example is outlier correction, which is addressed here together with the other prior adjustments.
Outliers denote unusual observations that, generally speaking, cannot be explained by the ARIMA model and violate its underlying normality assumptions. As the ARIMA models frequently used in time series are designed to capture information in processes that have some degree of homogeneity, their efficiency and goodness of fit can be influenced by outliers and structural changes.
Following the seminal research by Fox, four different types of outliers have been proposed, together with various procedures to detect them. The four types of outliers considered in the literature are: Additive Outlier (AO), Level Shift (LS), Temporary Change (TC) and Innovational Outlier (IO).
This study compares the TRAMO/SEATS and X12ARIMA programmes, which are widely used (and recommended) by Eurostat and the European Central Bank. The comparison is important for deciding whether to promote the use of one of the two in order to harmonise the treatment of time series. Both programmes are highly configurable and offer a very large number of parameters that the user can set.
To illustrate this research, an experiment is first carried out using nine thousand white noise series simulated with a random data generation function, considering three different econometric models and, for each, three different sample periods (60, 120 and 300 observations). Furthermore, the presence of three types of outliers (AO, LS and TC) is forced, with three different levels of impact. A total of 100 series are studied for each of these specific cases.
Second, real series are used to analyse the influence of a shock caused by a terrorist attack on tourism activity in a given area. To do so, a detailed study is made of travellers' total overnight stays in hotels by country of origin. The theoretical framework draws on Enders et al. (1992) and Drakos et al. (2001), while the methodology draws on time series analysis, specifically the proposal of A. Maravall and V. Gómez (1996). Among terrorist actions, those targeting tourism activity in general and the transport sector in particular stand out; these sectors are the most vulnerable to security threats.
Both programmes, that is, TRAMO/SEATS and X12ARIMA, are used to analyse the series in both the experiment with generated series and the one with real series, in order to compare results and hence establish differences between the two.
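The three outlier types forced in the simulation experiment (AO, LS and TC) can be illustrated with a minimal sketch. It is not the thesis's data-generating code: the function names, the standard normal noise, and the decay factor `delta = 0.7` (a commonly used default for temporary changes) are assumptions made for illustration only.

```python
import random


def outlier_effect(kind, n, t0, omega, delta=0.7):
    """Deterministic effect of one outlier of size omega at index t0.

    kind  -- "AO" (additive outlier), "LS" (level shift) or "TC" (temporary change)
    delta -- assumed decay factor for TC; 0.7 is a common default, not a value
             taken from the abstract
    """
    if kind == "AO":  # one-off pulse at t0 only
        return [omega if t == t0 else 0.0 for t in range(n)]
    if kind == "LS":  # permanent step from t0 onwards
        return [omega if t >= t0 else 0.0 for t in range(n)]
    if kind == "TC":  # jump at t0 that decays geometrically afterwards
        return [omega * delta ** (t - t0) if t >= t0 else 0.0 for t in range(n)]
    raise ValueError(f"unknown outlier kind: {kind}")


def contaminated_white_noise(kind, n, t0, omega, seed=0):
    """White noise series of length n with one outlier of the given kind added."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    effect = outlier_effect(kind, n, t0, omega)
    return [e + d for e, d in zip(noise, effect)]


if __name__ == "__main__":
    # One series per outlier type for the shortest sample period in the abstract.
    for kind in ("AO", "LS", "TC"):
        series = contaminated_white_noise(kind, n=60, t0=30, omega=4.0)
        print(kind, [round(x, 2) for x in series[28:34]])
```

An innovational outlier (IO) is omitted because its effect depends on the ARMA structure of the series; for pure white noise it coincides with the additive case.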
10

Dunagan, John D. "A geometric theory of outliers and perturbation." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/8396.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2002.
Includes bibliographical references (p. 91-94).
We develop a new understanding of outliers and the behavior of linear programs under perturbation. Outliers are ubiquitous in scientific theory and practice. We analyze a simple algorithm for removal of outliers from a high-dimensional data set and show the algorithm to be asymptotically good. We extend this result to distributions that we can access only by sampling, and also to the optimization version of the problem. Our results cover both the discrete and continuous cases. This is joint work with Santosh Vempala. The complexity of solving linear programs has interested researchers for half a century now. We show that an arbitrary linear program subject to a small random relative perturbation has good condition number with high probability, and hence is easy to solve. This is joint work with Avrim Blum, Daniel Spielman, and Shang-Hua Teng. This result forms part of the smoothed analysis project initiated by Spielman and Teng to better explain mathematically the observed performance of algorithms.
by John D. Dunagan.
Ph.D.

Books on the topic "Outliers"

1

Gladwell, Malcolm. Outliers. New York: Little, Brown and Company, 2008.

2

Lewis, Toby, ed. Outliers in statistical data. 3rd ed. Chichester: Wiley, 1994.

3

Gladwell, Malcolm. Outliers: The story of success. New York: Little, Brown and Co. Large Print, 2008.

4

Gladwell, Malcolm. Outliers: The story of success. New York: Little, Brown and Co., 2008.

5

St. Kilda and other Hebridean outliers. Newton Abbot: David & Charles, 1988.

6

U.S. Government. Chacoan Outliers Protection Act of 1995. [Washington, D.C.?]: U.S. G.P.O., 1995.

7

Hoaglin, David C., ed. How to detect and handle outliers. Milwaukee, Wis: ASQC Quality Press, 1993.

8

Guttman, Irwin. Spuriosity and outliers in circular data. Toronto: University of Toronto, Dept. of Statistics, 1988.

9

Thompson, Francis. St Kilda and other Hebridean outliers. Newton Abbot: David & Charles, 1988.

10

Rumburg, Scot. Characteristics of directly expanded hog data outliers. Washington, D.C: Research and Applications Division, National Agricultural Statistics Service, U.S. Department of Agriculture, 1992.


Book chapters on the topic "Outliers"

1

Baragona, Roberto, Francesco Battaglia, and Irene Poli. "Outliers." In Evolutionary Statistical Procedures, 159–97. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-16218-3_6.

2

Wetherill, G. Barrie, P. Duncombe, M. Kenward, J. Köllerström, S. R. Paul, and B. J. Vowden. "Outliers." In Regression Analysis with Applications, 138–64. Dordrecht: Springer Netherlands, 1986. http://dx.doi.org/10.1007/978-94-009-4105-2_6.

3

Nahler, Gerhard. "outliers." In Dictionary of Pharmaceutical Medicine, 128. Vienna: Springer Vienna, 2009. http://dx.doi.org/10.1007/978-3-211-89836-9_983.

4

Krasker, William S. "Outliers." In Time Series and Statistics, 194–97. London: Palgrave Macmillan UK, 1990. http://dx.doi.org/10.1007/978-1-349-20865-4_25.

5

Liu, Yan. "Outliers." In Encyclopedia of Quality of Life and Well-Being Research, 4542–46. Dordrecht: Springer Netherlands, 2014. http://dx.doi.org/10.1007/978-94-007-0753-5_2039.

6

O’Connor, Jennifer. "Outliers." In EAI International Conference on Technology, Innovation, Entrepreneurship and Education, 173–81. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-16130-9_13.

7

Krasker, William S. "Outliers." In The New Palgrave Dictionary of Economics, 1–4. London: Palgrave Macmillan UK, 1987. http://dx.doi.org/10.1057/978-1-349-95121-5_1884-1.

8

Lewis, Toby. "Outliers." In International Encyclopedia of Statistical Science, 1043–45. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-04898-2_437.

9

Ooms, Marius. "Outliers." In Lecture Notes in Economics and Mathematical Systems, 139–203. Berlin, Heidelberg: Springer Berlin Heidelberg, 1994. http://dx.doi.org/10.1007/978-3-642-48792-7_5.

10

Krasker, William S. "Outliers." In The New Palgrave Dictionary of Economics, 9922–25. London: Palgrave Macmillan UK, 2018. http://dx.doi.org/10.1057/978-1-349-95189-5_1884.


Conference papers on the topic "Outliers"

1

Liu, Ninghao, Donghwa Shin, and Xia Hu. "Contextual Outlier Interpretation." In Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18). California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/341.

Abstract:
While outlier detection has been intensively studied in many applications, interpretation is becoming increasingly important to help people trust and evaluate the developed detection models through providing intrinsic reasons why the given outliers are identified. It is a nontrivial task for interpreting the abnormality of outliers due to the distinct characteristics of different detection models, complicated structures of data in certain applications, and imbalanced distribution of outliers and normal instances. In addition, contexts where outliers locate, as well as the relation between outliers and the contexts, are usually overlooked in existing interpretation frameworks. To tackle the issues, in this paper, we propose a Contextual Outlier INterpretation (COIN) framework to explain the abnormality of outliers spotted by detectors. The interpretability of an outlier is achieved through three aspects, i.e., outlierness score, attributes that contribute to the abnormality, and contextual description of its neighborhoods. Experimental results on various types of datasets demonstrate the flexibility and effectiveness of the proposed framework.
2

Li, Yongmou, Yijie Wang, and Hongtao Guan. "Improve the Detection of Clustered Outliers via Outlier Score Propagation." In 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). IEEE, 2019. http://dx.doi.org/10.1109/ispa-bdcloud-sustaincom-socialcom48970.2019.00155.

3

Wang, Maximilian J., Guifen Mao, and Haiquan Chen. "Mining multivariate outliers." In the 2014 ACM Southeast Regional Conference. New York, New York, USA: ACM Press, 2014. http://dx.doi.org/10.1145/2638404.2638526.

4

Qin, Jiahang, Yongping Hou, and Liying Ma. "Research on Automatic Removal of Outliers in Fuel Cell Test Data and Fitting Method of Polarization Curve." In WCX SAE World Congress Experience. 400 Commonwealth Drive, Warrendale, PA, United States: SAE International, 2024. http://dx.doi.org/10.4271/2024-01-2896.

Abstract:
Fuel cell vehicles have always garnered a lot of attention in terms of energy utilization and environmental protection. In the analysis of fuel cell performance, there are usually some outliers present in the raw experimental data that can significantly affect the data analysis results. Therefore, data cleaning work is necessary to remove these outliers. The polarization curve is a crucial tool for describing the basic characteristics of fuel cells, typically described by semi-empirical formulas. The parameters in these semi-empirical formulas are fitted using the raw experimental data, so how to quickly and effectively automatically identify and remove data outliers is a crucial step in the process of fitting polarization curve parameters. This article explores data-cleaning methods based on the Local Outlier Factor (LOF) algorithm and the Isolation Forest algorithm to remove data outliers. For fuel cell experimental data, two algorithms are used to score all data points for outliers, and a reasonable threshold is set for outlier identification and removal. Then the parameters in the empirical formula of the polarization curve are fitted. The evaluation indicators adopt the coefficient of determination and root mean square error. The results show that after removing data outliers using two algorithms, the polarization curve has greatly improved in terms of fitting effects compared to the raw data. In addition, this article also compares and analyzes the outlier removal effects of the Isolation Forest algorithm and LOF algorithm and the two evaluation indicators. The results show that the LOF algorithm has higher accuracy and stability than the Isolation Forest algorithm in detecting outliers.
5

Gupta, Manish, Jing Gao, Yizhou Sun, and Jiawei Han. "Integrating community matching and outlier detection for mining evolutionary community outliers." In the 18th ACM SIGKDD international conference. New York, New York, USA: ACM Press, 2012. http://dx.doi.org/10.1145/2339530.2339667.

6

Sidiropoulos, Anastasios, Dingkang Wang, and Yusu Wang. "Metric embeddings with outliers." In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2017. http://dx.doi.org/10.1137/1.9781611974782.43.

7

Wu, Ou, Jun Gao, Weiming Hu, Bing Li, and Mingliang Zhu. "Identifying Multi-instance Outliers." In Proceedings of the 2010 SIAM International Conference on Data Mining. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2010. http://dx.doi.org/10.1137/1.9781611972801.38.

8

Kolesárová, Anna, and Radko Mesiar. "Aggregation Based on Outliers." In 19th World Congress of the International Fuzzy Systems Association (IFSA), 12th Conference of the European Society for Fuzzy Logic and Technology (EUSFLAT), and 11th International Summer School on Aggregation Operators (AGOP). Paris, France: Atlantis Press, 2021. http://dx.doi.org/10.2991/asum.k.210827.078.

9

Følstad, Asbjørn, Effie Lai-Chong Law, and Kasper Hornbæk. "Outliers in usability testing." In the 7th Nordic Conference. New York, New York, USA: ACM Press, 2012. http://dx.doi.org/10.1145/2399016.2399056.

10

Har-Peled, Sariel, and Yusu Wang. "Shape fitting with outliers." In the nineteenth conference. New York, New York, USA: ACM Press, 2003. http://dx.doi.org/10.1145/777792.777798.


Reports on the topic "Outliers"

1

Fenimore, Edward E. The cause of outliers in electromagnetic pulse (EMP) locations. Office of Scientific and Technical Information (OSTI), October 2014. http://dx.doi.org/10.2172/1159220.

2

Álvarez, Luis J., Florens Odendahl, and Germán López-Espinosa. Data outliers and Bayesian VARs in the euro area. Madrid: Banco de España, November 2022. http://dx.doi.org/10.53479/23552.

Abstract:
We propose a method to adjust for data outliers in Bayesian Vector Autoregressions (BVARs), which allows for different outlier magnitudes across variables and rescales the reduced form error terms. We use the method to document several facts about the effect of outliers on estimation and out-of-sample forecasting results using euro area macroeconomic data. First, the COVID-19 pandemic led to large swings in macroeconomic data that distort the BVAR estimation results. Second, these swings can be addressed by rescaling the shocks’ variance. Third, taking into account outliers before 2020 leads to mild improvements in the point forecasts of BVARs for some variables and horizons. However, the density forecast performance considerably deteriorates. Therefore, we recommend taking into account outliers only on pre-specified dates around the onset of the COVID-19 pandemic.
3

Taveras, Elsie, Richard Marshall, Mona Sharifi, Earlene Avalon, Lauren Fiechtner, Christine Horan, Monica Gerber, et al. Improving Childhood Obesity Outcomes: Testing Best Practices of Positive Outliers. Patient-Centered Outcomes Research Institute (PCORI), March 2018. http://dx.doi.org/10.25302/3.2018.ih.13046739.

4

Sadler, Brian M., and Stephen D. Casey. On Periodic Pulse Interval Analysis with Outliers and Missing Observations. Fort Belvoir, VA: Defense Technical Information Center, January 1996. http://dx.doi.org/10.21236/ada454910.

5

Eidsvik, Jo, and Steinar L. Ellefmo. Fast detection of outliers and anomalies in joint frequency data. Cogeo@oeaw-giscience, September 2011. http://dx.doi.org/10.5242/iamg.2011.0025.

6

Mustard, P. S., and G. E. Rouse. Sedimentary Outliers of the eastern Georgia Basin Margin, British Columbia. Natural Resources Canada/ESS/Scientific and Technical Publishing Services, 1991. http://dx.doi.org/10.4095/132517.

7

Giltinan, D. M., R. J. Carroll, and D. Ruppert. Some New Estimation Methods for Weighted Regression When There are Possible Outliers. Fort Belvoir, VA: Defense Technical Information Center, January 1985. http://dx.doi.org/10.21236/ada152104.

8

Lucon, Enrico. Statistical Detection of Outliers in the Certification of NIST Reference Charpy Lots. Gaithersburg, MD: National Institute of Standards and Technology, 2024. http://dx.doi.org/10.6028/nist.ir.8526.

9

Taplin, Ross, and Adrian E. Raftery. Analysis of Agricultural Field Trials in the Presence of Outliers and Fertility Jumps. Fort Belvoir, VA: Defense Technical Information Center, September 1991. http://dx.doi.org/10.21236/ada242454.

10

Mathew, Jijo K., Christopher M. Day, Howell Li, and Darcy M. Bullock. Curating Automatic Vehicle Location Data to Compare the Performance of Outlier Filtering Methods. Purdue University, 2021. http://dx.doi.org/10.5703/1288284317435.

Abstract:
Agencies use a variety of technologies and data providers to obtain travel time information. The best quality data can be obtained from second-by-second tracking of vehicles, but that data presents many challenges in terms of privacy, storage requirements and analysis. More frequently, agencies collect or purchase segment travel time based upon some type of matching of vehicles between two spatially distributed points. Typical methods for that data collection involve license plate re-identification, Bluetooth, Wi-Fi, or some type of rolling DSRC identifier. One of the challenges in each of these sampling techniques is to employ filtering techniques to remove outliers associated with trip chaining, but not remove important features in the data associated with incidents or traffic congestion. This paper describes a curated data set that was developed from high-fidelity GPS trajectory data. The curated data contained 31,621 vehicle observations spanning 42 days; 2,550 observations had travel times greater than 3 minutes more than normal. From this baseline data set, outliers were determined using GPS waypoints to determine if the vehicle left the route. Two performance measures were identified for evaluating three outlier-filtering algorithms by the proportion of true samples rejected and proportion of outliers correctly identified. The effectiveness of the three methods over 10-minute sampling windows was also evaluated. The curated data set has been archived in a digital repository and is available online for others to test outlier-filtering algorithms.
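The two performance measures named in the abstract can be sketched as follows. This is a hedged illustration rather than the report's code: the function name and the boolean-list input format are hypothetical, and "true samples" is read here as observations that the ground truth does not mark as outliers.

```python
def filter_performance(is_outlier, rejected):
    """Score an outlier filter against ground-truth labels.

    is_outlier -- list of bools, True where the ground truth marks an outlier
    rejected   -- list of bools, True where the filter rejected the observation

    Returns (share of true samples rejected, share of outliers identified).
    """
    if len(is_outlier) != len(rejected):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(is_outlier, rejected))
    true_total = sum(1 for o, _ in pairs if not o)
    outlier_total = sum(1 for o, _ in pairs if o)
    true_rejected = sum(1 for o, r in pairs if r and not o)
    outliers_caught = sum(1 for o, r in pairs if r and o)

    # Guard against empty classes so the ratios are always defined.
    share_true_rejected = true_rejected / true_total if true_total else 0.0
    share_outliers_found = outliers_caught / outlier_total if outlier_total else 0.0
    return share_true_rejected, share_outliers_found


if __name__ == "__main__":
    truth = [False, False, False, False, True, True]
    flags = [False, True, False, False, True, False]
    print(filter_performance(truth, flags))
```

A good filter keeps the first value low and the second high; reporting both guards against a filter that scores well on one measure simply by rejecting almost everything or almost nothing.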