Dissertations / Theses on the topic 'Kolmogorov-Smirnov test'




Consult the top 31 dissertations / theses for your research on the topic 'Kolmogorov-Smirnov test.'


Where available in the metadata, you can also download the full text of each publication as a PDF and read its abstract online.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Böhm, Walter, and Kurt Hornik. "A Kolmogorov-Smirnov Test for r Samples." WU Vienna University of Economics and Business, 2010. http://epub.wu.ac.at/2960/1/Report105.pdf.

Full text
Abstract:
We consider the problem of testing whether r (>=2) samples are drawn from the same continuous distribution F(x). The test statistic we study in some detail is defined as the maximum of the circular differences of the empirical distribution functions, a generalization of the classical 2-sample Kolmogorov-Smirnov test to r (>=2) independent samples. For the case of equal sample sizes we derive the exact null distribution by counting lattice paths confined to stay in the scaled alcove $\mathcal{A}_r$ of the affine Weyl group $A_{r-1}$. This is done using a generalization of the classical reflection principle. By a standard diffusion scaling we also derive the asymptotic distribution of the test statistic in terms of a multivariate Dirichlet series. When the sample sizes are not equal, the reflection principle no longer works, but we are able to establish a weak convergence result even in this case, showing that, after a proper rescaling, a test statistic based on a linear transformation of the circular differences of the empirical distribution functions has the same asymptotic distribution as the test statistic in the case of equal sample sizes.
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
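As a rough illustration of the kind of statistic described in the abstract above, the sketch below computes the maximum of the circular differences of the empirical distribution functions for r equal-sized samples. It is only one plausible reading of the definition (Böhm and Hornik's report fixes the exact conventions and normalisation); the function name and the NumPy implementation are assumptions for illustration.

```python
import numpy as np

def circular_ks_statistic(samples):
    """Sketch of an r-sample statistic built from circular differences of
    empirical distribution functions (EDFs); one plausible reading only."""
    r = len(samples)
    grid = np.sort(np.concatenate(samples))            # pooled evaluation points
    edfs = [np.searchsorted(np.sort(s), grid, side="right") / len(s)
            for s in samples]                           # EDF of each sample on the grid
    diffs = [edfs[i] - edfs[(i + 1) % r] for i in range(r)]   # F_1-F_2, ..., F_r-F_1
    return max(float(np.max(d)) for d in diffs)

# toy usage: three equal-sized samples from the same distribution
rng = np.random.default_rng(0)
print(circular_ks_statistic([rng.normal(size=50) for _ in range(3)]))
```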
2

Andrade, Francisco Arruda Raposo. "New techniques for vibration condition monitoring : Volterra kernel and Kolmogorov-Smirnov." Thesis, Brunel University, 1999. http://bura.brunel.ac.uk/handle/2438/7871.

Full text
Abstract:
This research presents a complete review of the signal processing techniques used today in vibration-based industrial condition monitoring and diagnostics. It also introduces two techniques that are novel to this field, namely the Kolmogorov-Smirnov test and the Volterra series, which have not yet been applied to vibration-based condition monitoring. The first technique, the Kolmogorov-Smirnov test, relies on a statistical comparison of the cumulative probability distribution functions (CDF) from two time series. It must be emphasised that this is not a moment technique: it uses the whole CDF in the comparison process. The second tool suggested in this research is the Volterra series. This is a non-linear signal processing technique which can be used to model a time series; the parameters of this model are used for condition monitoring applications. Finally, this work also presents a comprehensive comparative study between these new methods and the existing techniques. This study is based on results from numerical and experimental applications of each technique discussed here. The concluding remarks include suggestions on how the novel techniques proposed here can be improved.
APA, Harvard, Vancouver, ISO, and other styles
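For readers unfamiliar with the first technique above, a minimal sketch of a two-sample Kolmogorov-Smirnov comparison of two amplitude series might look as follows. SciPy's ks_2samp implements the standard two-sample test; the synthetic "reference" and "changed" data and their parameters are purely illustrative, not the thesis' vibration measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
baseline = rng.normal(0.0, 1.0, size=2000)   # synthetic amplitudes, reference condition
current = rng.normal(0.0, 1.3, size=2000)    # synthetic amplitudes, changed condition

# The two-sample KS test compares the whole empirical CDFs, not just moments.
result = stats.ks_2samp(baseline, current)
print(f"KS statistic = {result.statistic:.3f}, p-value = {result.pvalue:.3g}")
```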
3

Steele, Michael C. "The Power of Categorical Goodness-Of-Fit Statistics." Griffith University. Australian School of Environmental Studies, 2003. http://www4.gu.edu.au:8080/adt-root/public/adt-QGU20031006.143823.

Full text
Abstract:
The relative power of goodness-of-fit test statistics has long been debated in the literature. Chi-Square type test statistics to determine 'fit' for categorical data are still dominant in the goodness-of-fit arena. Empirical Distribution Function type goodness-of-fit test statistics are known to be relatively more powerful than Chi-Square type test statistics for restricted types of null and alternative distributions. In many practical applications researchers who use a standard Chi-Square type goodness-of-fit test statistic ignore the rank of ordinal classes. This thesis reviews literature in the goodness-of-fit field, with major emphasis on categorical goodness-of-fit tests. The continued use of an asymptotic distribution to approximate the exact distribution of categorical goodness-of-fit test statistics is discouraged. It is unlikely that an asymptotic distribution will produce a more accurate estimation of the exact distribution of a goodness-of-fit test statistic than a Monte Carlo approximation with a large number of simulations. Due to their relatively higher powers for restricted types of null and alternative distributions, several authors recommend the use of Empirical Distribution Function test statistics over nominal goodness-of-fit test statistics such as Pearson's Chi-Square. In-depth power studies confirm the views of other authors that categorical Empirical Distribution Function type test statistics do not have higher power for some common null and alternative distributions. Because of this, it is not sensible to make a conclusive recommendation to always use an Empirical Distribution Function type test statistic instead of a nominal goodness-of-fit test statistic. Traditionally the recommendation to determine 'fit' for multivariate categorical data is to treat categories as nominal, an approach which precludes any gain in power which may accrue from a ranking, should one or more variables be ordinal. The presence of multiple criteria through multivariate data may result in partially ordered categories, some of which have equal ranking. This thesis proposes a modification to the currently available Kolmogorov-Smirnov test statistics for ordinal and nominal categorical data to account for situations of partially ordered categories. The new test statistic, called the Combined Kolmogorov-Smirnov, is relatively more powerful than Pearson's Chi-Square and the nominal Kolmogorov-Smirnov test statistic for some null and alternative distributions. A recommendation is made to use the new test statistic with higher power in situations where some benefit can be achieved by incorporating an Empirical Distribution Function approach, but the data lack a complete natural ordering of categories. The new and established categorical goodness-of-fit test statistics are demonstrated in the analysis of categorical data with brief applications as diverse as familiarity of defence programs, the number of recruits produced by the Merlin bird, a demographic problem, and DNA profiling of genotypes. The results from these applications confirm the recommendations associated with specific goodness-of-fit test statistics throughout this thesis.
APA, Harvard, Vancouver, ISO, and other styles
4

Steele, Michael C. "The Power of Categorical Goodness-Of-Fit Statistics." Thesis, Griffith University, 2003. http://hdl.handle.net/10072/366717.

Full text
Abstract:
The relative power of goodness-of-fit test statistics has long been debated in the literature. Chi-Square type test statistics to determine 'fit' for categorical data are still dominant in the goodness-of-fit arena. Empirical Distribution Function type goodness-of-fit test statistics are known to be relatively more powerful than Chi-Square type test statistics for restricted types of null and alternative distributions. In many practical applications researchers who use a standard Chi-Square type goodness-of-fit test statistic ignore the rank of ordinal classes. This thesis reviews literature in the goodness-of-fit field, with major emphasis on categorical goodness-of-fit tests. The continued use of an asymptotic distribution to approximate the exact distribution of categorical goodness-of-fit test statistics is discouraged. It is unlikely that an asymptotic distribution will produce a more accurate estimation of the exact distribution of a goodness-of-fit test statistic than a Monte Carlo approximation with a large number of simulations. Due to their relatively higher powers for restricted types of null and alternative distributions, several authors recommend the use of Empirical Distribution Function test statistics over nominal goodness-of-fit test statistics such as Pearson's Chi-Square. In-depth power studies confirm the views of other authors that categorical Empirical Distribution Function type test statistics do not have higher power for some common null and alternative distributions. Because of this, it is not sensible to make a conclusive recommendation to always use an Empirical Distribution Function type test statistic instead of a nominal goodness-of-fit test statistic. Traditionally the recommendation to determine 'fit' for multivariate categorical data is to treat categories as nominal, an approach which precludes any gain in power which may accrue from a ranking, should one or more variables be ordinal. The presence of multiple criteria through multivariate data may result in partially ordered categories, some of which have equal ranking. This thesis proposes a modification to the currently available Kolmogorov-Smirnov test statistics for ordinal and nominal categorical data to account for situations of partially ordered categories. The new test statistic, called the Combined Kolmogorov-Smirnov, is relatively more powerful than Pearson's Chi-Square and the nominal Kolmogorov-Smirnov test statistic for some null and alternative distributions. A recommendation is made to use the new test statistic with higher power in situations where some benefit can be achieved by incorporating an Empirical Distribution Function approach, but the data lack a complete natural ordering of categories. The new and established categorical goodness-of-fit test statistics are demonstrated in the analysis of categorical data with brief applications as diverse as familiarity of defence programs, the number of recruits produced by the Merlin bird, a demographic problem, and DNA profiling of genotypes. The results from these applications confirm the recommendations associated with specific goodness-of-fit test statistics throughout this thesis.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
Australian School of Environmental Studies
Full Text
APA, Harvard, Vancouver, ISO, and other styles
5

Larson, Lincoln Gary. "Investigating Statistical vs. Practical Significance of the Kolmogorov-Smirnov Two-Sample Test Using Power Simulations and Resampling Procedures." Thesis, North Dakota State University, 2018. https://hdl.handle.net/10365/28770.

Full text
Abstract:
This research examines the power of the Kolmogorov-Smirnov two-sample test. The motivation for this research is a large data set containing soil salinity values. One problem encountered was that the power of the Kolmogorov-Smirnov two-sample test became extremely high due to the large sample size. This extreme power resulted in statistically significant differences between two distributions when no practically significant difference was present. This research used resampling procedures to create simulated null distributions for the test statistic. These null distributions were used to obtain power approximations for the Kolmogorov-Smirnov tests under differing effect sizes. The research shows that the power of the Kolmogorov-Smirnov test can become very large in cases of large sample sizes.
APA, Harvard, Vancouver, ISO, and other styles
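The power issue described above can be reproduced in a few lines of simulation. The sketch below is a plain Monte Carlo power estimate for the two-sample KS test under a small location shift; it is not the thesis' resampling scheme, and the function name, Gaussian data model and effect size are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def ks_power(n, shift, n_sim=500, alpha=0.05, seed=0):
    """Monte Carlo power of the two-sample KS test for a small mean shift."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        x = rng.normal(0.0, 1.0, n)
        y = rng.normal(shift, 1.0, n)
        rejections += stats.ks_2samp(x, y).pvalue < alpha
    return rejections / n_sim

# the same tiny effect becomes "statistically significant" as n grows
for n in (100, 1000, 10000):
    print(n, ks_power(n, shift=0.05))
```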
6

Li, Rong. "A Tree-based Framework for Difference Summarization." Kent State University / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=kent1334277940.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Mao, Qian. "Clusters Identification: Asymmetrical Case." Thesis, Uppsala universitet, Informationssystem, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-208328.

Full text
Abstract:
Cluster analysis is one of the typical tasks in Data Mining: it groups data objects based only on information found in the data that describes the objects and their relationships. The purpose of this thesis is to verify a modified K-means algorithm in asymmetrical cases, which can be regarded as an extension of the research of Vladislav Valkovsky and Mikael Karlsson in the Department of Informatics and Media. In this thesis an experiment is designed and implemented to identify clusters with the modified algorithm in asymmetrical cases. The Java application developed for the experiment is based on knowledge established in previous research. The development procedures are described and the input parameters are discussed along with the analysis. The experiment consists of several test suites, each of which simulates a situation existing in the real world, and the test results are displayed graphically. The findings mainly emphasize the limitations of the algorithm, and future work to explore the algorithm further is also suggested.
APA, Harvard, Vancouver, ISO, and other styles
8

Carrier, Denis Joseph Gaston. "Automatic measurement of particles from holograms taken in the combustion chamber of a rocket motor." Thesis, Monterey, California. Naval Postgraduate School, 1988. http://hdl.handle.net/10945/22924.

Full text
Abstract:
Approved for public release; distribution is unlimited
This thesis describes the procedure used for the automatic measurement of particles from holograms taken in the combustion chamber of a rocket motor while firing. It describes the investigation of two averaging techniques used to reduce speckle noise: capturing the image focused on a spinning mylar disk, and software averaging of several image frames. The spinning disk technique proved superior for this application. The Kolmogorov-Smirnov two-sample test is applied to different particle samples in order to find an estimate of the number of particles required to obtain a stable distribution function. The number of particles is calculated and given. The last part of this study shows real particle distributions in the form of frequency histograms.
http://archive.org/details/automaticmeasure00carr
Major, Canadian Armed Forces
APA, Harvard, Vancouver, ISO, and other styles
9

Lekomtcev, Demian. "Snímání spektra pro kognitivní rádiové sítě - vliv vlastností reálného komunikačního řetězce." Doctoral thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2016. http://www.nusl.cz/ntk/nusl-255288.

Full text
Abstract:
The doctoral thesis deals with spectrum sensing in cognitive radio networks (CRN). A number of international organizations are currently actively engaged in the standardization of CRN, which points to the fact that this technology will be widely used in the near future. One of the key features of this technology is dynamic access to the spectrum, which can be affected by many different harmful factors occurring in the communication chain. The thesis investigates the influence of selected factors on the spectrum sensing process. Another contribution of the work is the optimization of the Kolmogorov-Smirnov statistical test, which can be applied to primary user signal detection. The work also incorporates an analysis of the influence of the harmful effects caused by commonly used transmitters and receivers on various spectrum sensing methods. The investigations are verified by the results of simulations and also by measurements with experimental platforms based on software-defined radio (SDR).
APA, Harvard, Vancouver, ISO, and other styles
10

Bagdonavičius, Vilijandas B., Ruta Levuliene, Mikhail S. Nikulin, and Olga Zdorova-Cheminade. "Tests for homogeneity of survival distributions against non-location alternatives and analysis of the gastric cancer data." Universität Potsdam, 2004. http://opus.kobv.de/ubp/volltexte/2011/5152/.

Full text
Abstract:
Two- and k-sample tests of equality of survival distributions against alternatives including cross-effects of survival functions and proportional and monotone hazard ratios are given for right-censored data. The asymptotic power against approaching alternatives is investigated. The tests are applied to the well-known chemotherapy and radiotherapy data of the Gastrointestinal Tumor Study Group. The P-values for both proposed tests are much smaller than in the case of other known tests. Unlike the test of Stablein and Koutrouvelis, the new tests can be applied not only to singly but also to randomly censored data.
APA, Harvard, Vancouver, ISO, and other styles
11

Zhang, Yan. "The impact of midbrain cauterize size on auditory and visual responses' distribution." unrestricted, 2009. http://etd.gsu.edu/theses/available/etd-04202009-145923/.

Full text
Abstract:
Thesis (M.S.)--Georgia State University, 2009.
Title from file title page. Yu-Sheng Hsu, committee chair; Xu Zhang, Sarah. L. Pallas, committee members. Description based on contents viewed June 12, 2009. Includes bibliographical references (p. 37). Appendix A: SAS code: p. 38-53.
APA, Harvard, Vancouver, ISO, and other styles
12

Novotná, Lenka. "Statistické metody pro popis provozu restaurace." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2010. http://www.nusl.cz/ntk/nusl-222784.

Full text
Abstract:
The diploma thesis illustrates the application of statistical methods for describing the course of economic processes in a company. The thesis is divided into two separate parts. The first part focuses on theoretical knowledge about control charts and time series. The second part is composed of chapters focused on their practical use. A simple application for creating control charts is enclosed.
APA, Harvard, Vancouver, ISO, and other styles
13

Wärn, Caroline. "Deviating time-to-onset in predictive models : detecting new adverse effects from medicines." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-257100.

Full text
Abstract:
Identifying previously unknown adverse drug reactions becomes more important as the number of drugs and the extent of their use increase. The aim of this Master's thesis project was to evaluate the performance of a novel approach for highlighting potential adverse drug reactions, also known as signal detection. The approach was based on deviating time-to-onset patterns and was implemented as a two-sample Kolmogorov-Smirnov test for non-vaccine data in the safety report database VigiBase. The method was outperformed by both disproportionality analysis and the multivariate predictive model vigiRank. The performance estimates indicate that deviating time-to-onset patterns are not a suitable approach to signal detection for non-vaccine data in VigiBase.
APA, Harvard, Vancouver, ISO, and other styles
14

Haluzová, Dana. "Uplatnění statistických metod pro zkoumání vlastností nejprodávanějších přípravků na ochranu rostlin a vztahů mezi nimi." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2018. http://www.nusl.cz/ntk/nusl-377387.

Full text
Abstract:
This diploma thesis focuses on the statistical examination of the properties of plant protection products at Agro-Artikel, s.r.o. Using the empirical distribution function, it examines the sales price and the shelf life of the products, and tests hypotheses about the properties of the products and the dependencies between them. The thesis also analyses the results of a questionnaire survey and offers recommendations for the introduction of new products.
APA, Harvard, Vancouver, ISO, and other styles
15

Apeltauer, Jiří. "Statistické vlastnosti mikrostruktury dopravního proudu." Doctoral thesis, Vysoké učení technické v Brně. Fakulta stavební, 2018. http://www.nusl.cz/ntk/nusl-390266.

Full text
Abstract:
Current traffic flow theory assumes interactions only between neighbouring vehicles within the traffic stream. This assumption is reasonable, but it is based on the possibilities of science and technology available decades ago, which have since been overcome. In general, there is obviously an interaction between vehicles at greater distances (or between multiple vehicles), but so far no procedure has been put forward to quantify the distance of this interaction. This work introduces a method that uses mathematical statistics and precise measurements of the time headways of individual vehicles to determine these interaction distances (between several vehicles), together with its validation for narrow ranges of traffic flow density. It has been revealed that at high traffic flow densities there is an interaction between at least three consecutive vehicles, and between four and five vehicles at lower densities. The results could be applied in the development of new traffic flow models and their verification.
APA, Harvard, Vancouver, ISO, and other styles
16

Au, Manix. "Automatic State Construction using Decision Trees for Reinforcement Learning Agents." Thesis, Queensland University of Technology, 2005. https://eprints.qut.edu.au/15965/1/Manix_Au_Thesis.pdf.

Full text
Abstract:
Reinforcement Learning (RL) is a learning framework in which an agent learns a policy from continual interaction with the environment. A policy is a mapping from states to actions. The agent receives rewards as feedback on the actions performed. The objective of RL is to design autonomous agents that search for the policy maximizing the expectation of the cumulative reward. When the environment is partially observable, the agent cannot determine the states with certainty; these states are called hidden in the literature. An agent that relies exclusively on the current observations will not always find the optimal policy. For example, a mobile robot needs to remember the number of doors it has gone by in order to reach a specific door down a corridor of identical doors. To overcome the problem of partial observability, an agent uses both current and past (memory) observations to construct an internal state representation, which is treated as an abstraction of the environment. This research focuses on how features of past events are extracted with variable granularity for the internal state construction. The project introduces a new method that applies Information Theory and decision tree techniques to derive a tree structure which represents the state and the policy. The relevance of a candidate feature is assessed by the Information Gain Ratio ranking with respect to the cumulative expected reward. Experiments carried out on three different RL tasks have shown that our variant of the U-Tree (McCallum, 1995) produces a more robust state representation and faster learning. This better performance can be explained by the fact that the Information Gain Ratio exhibits a lower variance in return prediction than the Kolmogorov-Smirnov statistical test used in the original U-Tree algorithm.
APA, Harvard, Vancouver, ISO, and other styles
17

Au, Manix. "Automatic State Construction using Decision Trees for Reinforcement Learning Agents." Queensland University of Technology, 2005. http://eprints.qut.edu.au/15965/.

Full text
Abstract:
Reinforcement Learning (RL) is a learning framework in which an agent learns a policy from continual interaction with the environment. A policy is a mapping from states to actions. The agent receives rewards as feedback on the actions performed. The objective of RL is to design autonomous agents that search for the policy maximizing the expectation of the cumulative reward. When the environment is partially observable, the agent cannot determine the states with certainty; these states are called hidden in the literature. An agent that relies exclusively on the current observations will not always find the optimal policy. For example, a mobile robot needs to remember the number of doors it has gone by in order to reach a specific door down a corridor of identical doors. To overcome the problem of partial observability, an agent uses both current and past (memory) observations to construct an internal state representation, which is treated as an abstraction of the environment. This research focuses on how features of past events are extracted with variable granularity for the internal state construction. The project introduces a new method that applies Information Theory and decision tree techniques to derive a tree structure which represents the state and the policy. The relevance of a candidate feature is assessed by the Information Gain Ratio ranking with respect to the cumulative expected reward. Experiments carried out on three different RL tasks have shown that our variant of the U-Tree (McCallum, 1995) produces a more robust state representation and faster learning. This better performance can be explained by the fact that the Information Gain Ratio exhibits a lower variance in return prediction than the Kolmogorov-Smirnov statistical test used in the original U-Tree algorithm.
APA, Harvard, Vancouver, ISO, and other styles
18

Ječmínková, Michaela. "Využití regulačních diagramů pro kontrolu jakosti." Master's thesis, Vysoké učení technické v Brně. Fakulta podnikatelská, 2014. http://www.nusl.cz/ntk/nusl-224704.

Full text
Abstract:
This diploma thesis deals with the use of Shewhart control charts in quality control. The thesis describes the process of quality control currently used in the enterprise. Afterwards, practical guidance is provided for implementing statistical process control of the selected component and evaluating capability. An application for creating control charts and monitoring product quality is included.
APA, Harvard, Vancouver, ISO, and other styles
19

Aguiar, Marcelo Figueiredo Massulo. "Redução no tamanho da amostra de pesquisas de entrevistas domiciliares para planejamento de transportes: uma verificação preliminar." Universidade de São Paulo, 2005. http://www.teses.usp.br/teses/disponiveis/18/18137/tde-28032014-193530/.

Full text
Abstract:
O trabalho tem por principal objetivo verificar, preliminarmente, a possibilidade de reduzir a quantidade de indivíduos na amostra de Pesquisa de Entrevistas Domiciliares, sem prejudicar a qualidade e representatividade da mesma. Analisar a influência das características espaciais e de uso de solo da área urbana constitui o objetivo intermediário. Para ambos os objetivos, a principal ferramenta utilizada foi o minerador de dados denominado Árvore de Decisão e Classificação contido no software S-Plus 6.1, que encontra as relações entre as características socioeconômicas dos indivíduos, as características espaciais e de uso de solo da área urbana e os padrões de viagens encadeadas. Os padrões de viagens foram codificados em termos de sequência cronológica de: motivos, modos, durações de viagem e períodos do dia em que as viagens ocorrem. As análises foram baseadas nos dados da Pesquisa de Entrevistas Domiciliares realizada pela Agência de Cooperação Internacional do Japão e Governo do Estado do Pará em 2000 na Região Metropolitana de Belém. Para se atingir o objetivo intermediário o método consistiu em analisar, através da Árvore de Decisão e Classificação, a influência da variável categórica Macrozona, que representa as características espaciais e de uso de solo da área urbana, nos padrões de viagens encadeadas realizados pelos indivíduos. Para o objetivo principal, o método consistiu em escolher, aleatoriamente, sub-amostras contendo 25% de pessoas da amostra final e verificar, através do Processamento de Árvores de Decisão e Classificação e do teste estatístico Kolmogorov - Smirnov, se os modelos obtidos a partir das amostras reduzidas conseguem ilustrar bem a freqüência de ocorrência dos padrões de viagens das pessoas da amostra final. Concluiu-se que as características espaciais e de uso de solo influenciam os padrões de encadeamento de viagens, e portanto foram incluídas como variáveis preditoras também nos modelos obtidos a partir das sub-amostras. A conclusão principal foi a não rejeição da hipótese de que é possível reduzir o tamanho da amostra de pesquisas domiciliares para fins de estudo do encadeamento de viagens. Entretanto ainda são necessárias muitas outras verificações antes de aceitar esta conclusão.
The main aim of this work is to verify the possibility of reducing the sample size of home-interview surveys without detriment to their quality and representativeness. The secondary aim is to analyze the influence of the spatial characteristics and land use of an urban area. For both aims, the main analysis tool used was the data miner called the Decision and Classification Tree, available in the software S-Plus 6.1. The data miner finds relations between trip chaining patterns and individual socioeconomic characteristics, spatial characteristics and land use patterns. The trip chaining patterns were coded in terms of the chronological sequence of trip purpose, travel mode, travel time and the period of day in which travel occurs. The analyses were based on home-interview surveys carried out in the Belém Metropolitan Area in 2000 by the Japan International Cooperation Agency and the Pará State Government. In order to achieve the secondary aim, the method consisted of analyzing, using the Decision and Classification Tree, the influence of the categorical variable "Macrozona", which represents spatial characteristics and urban land use patterns, on the trip chaining patterns of the individuals. Concerning the main aim, the method consisted of randomly choosing sub-samples containing 25% of the individuals in the final sample and verifying (using the Decision and Classification Tree and the Kolmogorov-Smirnov statistical test) whether the models obtained from the reduced samples can describe well the frequency of occurrence of the individuals' trip chaining patterns in the final sample. The first conclusion is that the spatial characteristics and land use of the urban area influence the trip chaining patterns, and therefore they were also included as independent variables in the models obtained from the sub-samples. The main conclusion was the non-rejection of the hypothesis that it is possible to reduce the sample size of home-interview surveys used for trip-chaining research. Nevertheless, several other verifications are necessary before accepting this conclusion.
APA, Harvard, Vancouver, ISO, and other styles
20

Belkhira, Moussaad. "Tests de Kolmogorov-Smirnov dans le cas où des paramètres sont estimés." Paris 6, 1988. http://www.theses.fr/1988PA066056.

Full text
Abstract:
When the distribution of a sample is estimated from the data, we cannot use the Kolmogorov-Smirnov test as if the distribution were fully known. Other statistical tables must therefore be constructed. This work consists of a simulation study of the various Kolmogorov-Smirnov statistics in the parametric case, which allows us to give a new relation between the distributions of these statistics, to pass from the parametric case to the non-parametric case, and also to construct tables of percentage points.
APA, Harvard, Vancouver, ISO, and other styles
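The problem described above, that standard Kolmogorov-Smirnov tables are invalid once parameters are estimated from the data, is often handled by simulating the null distribution of the statistic. The sketch below does this for a normal model with estimated mean and standard deviation (a Lilliefors-type construction); it is a generic illustration, not the tables or relations derived in the thesis, and the function name and the normal model are assumptions.

```python
import numpy as np
from scipy import stats

def simulate_ks_null(n, n_sim=2000, seed=0):
    """Simulate the null distribution of the KS statistic for a normal model
    whose mean and standard deviation are re-estimated from each sample."""
    rng = np.random.default_rng(seed)
    stats_null = np.empty(n_sim)
    for b in range(n_sim):
        x = rng.normal(size=n)
        mu, sigma = x.mean(), x.std(ddof=1)
        stats_null[b] = stats.kstest(x, stats.norm(mu, sigma).cdf).statistic
    return stats_null

n = 50
null_dist = simulate_ks_null(n)
# Simulated 95% percentage point vs. the classical large-sample critical value
# (about 1.36/sqrt(n)) that applies only when the distribution is fully specified.
print(np.quantile(null_dist, 0.95), 1.36 / np.sqrt(n))
```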
21

Belkhira, Moussaad. "Tests de Kolmogorov-Smirnov dans le cas ou des paramètres sont estimés." Grenoble 2 : ANRT, 1988. http://catalogue.bnf.fr/ark:/12148/cb376117075.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Povalač, Karel. "Sledování spektra a optimalizace systémů s více nosnými pro kognitivní rádio." Doctoral thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2012. http://www.nusl.cz/ntk/nusl-233577.

Full text
Abstract:
The doctoral thesis deals with spectrum sensing and the subsequent use of the frequency spectrum by a multicarrier communication system whose parameters are set on the basis of an optimization technique. The adaptation of the settings can be made with respect to several requirements as well as the state and occupancy of the individual communication channels. The system characterized above is often referred to as cognitive radio. Equipment operating on cognitive radio principles will be widely used in the near future because of frequency spectrum limitations. One of the main contributions of the work is the novel use of the Kolmogorov-Smirnov statistical test as an alternative detection of primary user signal presence. A new fitness function for Particle Swarm Optimization (PSO) has been introduced, and the Error Vector Magnitude (EVM) parameter has been used in the adaptive greedy algorithm and the PSO optimization. The dissertation also incorporates information about the reliability of the frequency spectrum sensing in the modified greedy algorithm. The proposed methods are verified by simulations, and the frequency-domain energy detection is implemented on a development board with an FPGA.
APA, Harvard, Vancouver, ISO, and other styles
23

Bezerra, Thiago Junqueira de Castro. "Estudo da sensibilidade do detector de neutrinos do Projeto ANGRA aos efeitos da queima do combustível nuclear." [s.n.], 2009. http://repositorio.unicamp.br/jspui/handle/REPOSIP/277698.

Full text
Abstract:
Advisor: Ernesto Kemp
Master's dissertation - Universidade Estadual de Campinas, Instituto de Física Gleb Wataghin
Resumo: Reatores nucleares constituem uma profusa fonte de antineutrinos, cujo espectro é determinado pelos decaimentos beta dos isótopos radioativos presentes no combustível nuclear. À medida que o combustível é consumido, sua composição isotópica é alterada, com reflexos diretos no espectro de antineutrinos. Desta forma, investigamos neste trabalho a viabilidade de um detector de neutrinos monitorar o reator de uma usina nuclear, sabendo seu estado de atividade. Também investigamos a evolução temporal da resposta do detector à queima gradual do combustível nuclear. Assim, determinamos o tempo necessário de coleta de dados para identificarmos que o combustível nuclear evoluiu para outra composição, para vários níveis de confiança, com relação ao início de operação da usina. Estes resultados fazem da detecção de antineutrinos de reatores nucleares uma ferramenta adicional para a verificação de salvaguardas nucleares
Abstract: Nuclear reactors are a profuse source of antineutrinos, whose spectrum is determined by the beta decay of the fissile isotopes in the nuclear fuel. As the fuel is consumed, its isotopic composition changes, producing trends in the antineutrino spectrum. In this work we therefore investigated the viability of monitoring the reactor of a nuclear power plant with a neutrino detector, knowing its state of activity. We also investigated the temporal evolution of the detector response as a function of the gradual burnup of the fuel. Thus, for several confidence levels, we determined the data-taking time needed to identify fuel changes in a PWR power plant, relative to the beginning of operation. Consequently, these results make the detection of antineutrinos from nuclear reactors an additional method for nuclear safeguards.
Master's degree
Elementary Particle Physics and Fields
Master in Physics
APA, Harvard, Vancouver, ISO, and other styles
24

Falk, Matthew Gregory. "Incorporating uncertainty in environmental models informed by imagery." Thesis, Queensland University of Technology, 2010. https://eprints.qut.edu.au/33235/1/Matthew_Falk_Thesis.pdf.

Full text
Abstract:
In this thesis, the issue of incorporating uncertainty for environmental modelling informed by imagery is explored by considering uncertainty in deterministic modelling, measurement uncertainty and uncertainty in image composition. Incorporating uncertainty in deterministic modelling is extended for use with imagery using the Bayesian melding approach. In the application presented, slope steepness is shown to be the main contributor to total uncertainty in the Revised Universal Soil Loss Equation. A spatial sampling procedure is also proposed to assist in implementing Bayesian melding given the increased data size with models informed by imagery. Measurement error models are another approach to incorporating uncertainty when data is informed by imagery. These models for measurement uncertainty, considered in a Bayesian conditional independence framework, are applied to ecological data generated from imagery. The models are shown to be appropriate and useful in certain situations. Measurement uncertainty is also considered in the context of change detection when two images are not co-registered. An approach for detecting change in two successive images is proposed that is not affected by registration. The procedure uses the Kolmogorov-Smirnov test on homogeneous segments of an image to detect change, with the homogeneous segments determined using a Bayesian mixture model of pixel values. Using the mixture model to segment an image also allows for uncertainty in the composition of an image. This thesis concludes by comparing several different Bayesian image segmentation approaches that allow for uncertainty regarding the allocation of pixels to different ground components. Each segmentation approach is applied to a data set of chlorophyll values and shown to have different benefits and drawbacks depending on the aims of the analysis.
APA, Harvard, Vancouver, ISO, and other styles
25

Messias, Cassiano Gustavo 1987. "Mapeamento das áreas suscetíveis à fragilidade ambiental na alta bacia do Rio São Francisco, Parque Nacional da Serra da Canastra - MG." [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/286612.

Full text
Abstract:
Advisor: Marcos Cesar Ferreira
Master's dissertation - Universidade Estadual de Campinas, Instituto de Geociências
Resumo: As paisagens rurais vêm sendo transformadas continuamente pela ocupação humana, principalmente em razão de adaptações técnicas requeridas para o desenvolvimento da agricultura. De maneira geral, estas alterações antrópicas estão diretamente ligadas à utilização dos recursos naturais como insumos da produção agrícola. Os recursos mais impactados por este modo de produção são a vegetação e o solo. Dentre as formas de avaliação do grau de comprometimento da paisagem em razão da exploração agrícola do território é o mapeamento da fragilidade ambiental. Esta pesquisa teve como principal objetivo avaliar os graus de fragilidade ambiental de diferentes áreas do Parque Nacional da Serra da Canastra, situado no sudoeste de Minas Gerais. Criado em 1972, o parque tem como meta principal preservar ecossistemas naturais ainda existentes no bioma do cerrado brasileiro. A metodologia de mapeamento da fragilidade ambiental utilizada neste trabalho baseia-se em cinco variáveis geográficas: índice de vegetação, probabilidade de ocorrência de chuvas intensas, declividades, densidade de estradas e densidade de lineamentos estruturais. Estas variáveis foram processadas em sistemas de informação geográfica, por meio de técnicas de análise espacial, utilizadas para a transformação destas, mapeadas segundo a lógica booleana, em variáveis probabilísticas fuzzy. Os mapas fuzzy foram combinados por meio de algoritmo baseado em soma ponderada, gerando-se um mapa de fragilidade ambiental do Parque Nacional. Este mapa final foi comparado a mapas de processos erosivos e de movimento de massa, checados em campo, com o objetivo de se atribuir pesos às variáveis ambientais por meio do teste de Kolmogorov-Smirnov. A metodologia se mostrou eficiente para a identificação e mapeamento de áreas com maior grau de fragilidade no parque, considerando-se a evidência dos processos erosivos e dos movimentos de massa
Abstract: Rural landscapes have been continually transformed by human occupation, mainly due to the technical adjustments required for the development of agriculture. In general, these anthropogenic changes are directly linked to the use of natural resources as inputs to agricultural production. The resources most impacted by this mode of production are vegetation and soil. Among the ways of assessing the degree of landscape vulnerability is environmental fragility mapping. This research aimed to assess the degree of environmental fragility of different areas located within the Serra da Canastra National Park, in southwestern Minas Gerais. Created in 1972, the park has as its main goal the preservation of remaining natural ecosystems of the Brazilian Cerrado biome. The methodology for environmental fragility mapping used in this work is based on five geographic variables: vegetation index, probability of intense rainfall, terrain slope, road density and structural lineament density. These variables, mapped according to Boolean logic, were processed in a geographic information system through spatial analysis techniques and transformed into fuzzy probabilistic variables. The fuzzy maps were combined by means of an algorithm based on a weighted sum, generating the environmental fragility map of the National Park. The weights of the five environmental variables were estimated by comparing erosion and mass-movement maps with the fragility map, using the D-value of the Kolmogorov-Smirnov test. Considering the evidence of erosion and mass movements, we conclude that the methodology is efficient for the identification and mapping of areas with a high degree of fragility in the park.
Master's degree
Environmental Analysis and Territorial Dynamics
Master in Geography
APA, Harvard, Vancouver, ISO, and other styles
26

Lindahl, Fred. "Detection of Sparse and Weak Effects in High-Dimensional Supervised Learning Problems, Applied to Human Microbiome Data." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288503.

Full text
Abstract:
This project studies the signal detection and identification problem in high-dimensional noisy data and the possibility of using it on microbiome data. An extensive simulation study was performed on generated data as well as on a microbiome dataset collected from patients with Parkinson's disease, using Donoho and Jin's Higher Criticism, Jager and Wellner's phi-divergence-based goodness-of-fit tests, and Stepanova and Pavlenko's CsCsHM statistic. We present some novel approaches based on established theory that perform better than existing methods, and show that it is possible to use the signal identification framework to detect differentially abundant features in microbiome data. Although the novel approaches produce good results, they lack substantial mathematical foundations and should be avoided if theoretical rigour is needed. We also conclude that, while it is possible to use signal identification methods to find abundant features in microbiome data, further refinement is necessary before they can be properly used in research.
Detta projekt studerar signaldetekterings- och identifieringsproblemet i högdimensionell brusig data och möjligheten att använda det på mikrobiomdata från människor. En omfattande simuleringsstudie utfördes på genererad data samt ett mikrobiomdataset som samlats in på patienter med Parkinsons sjukdom, med hjälp av ett antal goodness-of-fit-metoder: Donoho och Jins Higher criticis , Jager och Wellners phi-divergenser och Stepanova och Pavelenkos CsCsHM. Vi presenterar några nya tillvägagångssätt baserade på vedertagen teori som visar sig fungera bättre än befintliga metoder och visar att det är möjligt att använda signalidentifiering för att upptäcka olika funktioner i mikrobiomdata. Även om de nya metoderna ger goda resultat saknar de betydande matematiska grunder och bör undvikas om teoretisk formalism är nödvändigt. Vi drar också slutsatsen att medan vi har funnit att det är möjligt att använda signalidentifieringsmetoder för att hitta information i mikrobiomdata, är ytterligare experiment nödvändiga innan de kan användas på ett korrekt sätt i forskning.
APA, Harvard, Vancouver, ISO, and other styles
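One of the statistics named above, Donoho and Jin's Higher Criticism, can be written in a few lines. The sketch below implements one common variant of the statistic on a vector of p-values; it is only an illustration of the idea, not the specific procedures or the CsCsHM statistic evaluated in the thesis, and the function name, the alpha0 choice and the toy data are assumptions.

```python
import numpy as np

def higher_criticism(pvalues, alpha0=0.5):
    """One common variant of the Higher Criticism statistic: the largest
    standardised gap between the empirical and uniform CDFs of the sorted
    p-values, scanned over the smallest alpha0 fraction of them."""
    p = np.sort(np.asarray(pvalues))
    n = len(p)
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p))
    k = max(1, int(alpha0 * n))
    return float(np.max(hc[:k]))

# toy example: mostly null p-values plus a few sparse, weak signals
rng = np.random.default_rng(4)
p_null = rng.uniform(size=990)
p_signal = rng.uniform(0, 0.001, size=10)
print(higher_criticism(np.concatenate([p_null, p_signal])))
```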
27

Petit, Frédéric. "Modélisation et simulation d'une chambre réverbérante à brassage de modes à l'aide de la méthode des différences finies dans le domaine temporel." PhD thesis, Université de Marne la Vallée, 2002. http://tel.archives-ouvertes.fr/tel-00003238.

Full text
Abstract:
The development of means of communication via electromagnetic waves has grown at an unprecedented rate in recent years, notably thanks to the development of mobile telephony. The reverberation chamber is a test facility that makes it possible to study the influence of these electromagnetic waves on a particular electronic device. However, since the operation of a reverberation chamber is complex, it is essential to carry out simulations in order to determine which parameters are crucial.

The work of this thesis consists in modelling and simulating the operation of a reverberation chamber using the finite-difference time-domain method. After a brief study of some field and power measurement results obtained in a reverberation chamber, Chapter 2 addresses the various problems related to modelling the chamber. Since the notion of losses is decisive for evaluating the operation of a reverberation chamber, two methods implementing these losses are also presented in this chapter. The study carried out in Chapter 3 analyses the influence of the stirrer on the first eigenmodes of the chamber, which can be shifted by several MHz. Chapter 4 presents high-frequency simulation results compared with theoretical statistical results; the case of an object present inside the chamber, which can perturb the field, is also addressed. Finally, Chapter 5 presents a comparison of the statistical results for several stirrer shapes.
APA, Harvard, Vancouver, ISO, and other styles
28

Tsai, Wen-Chi (蔡文綺). "A Kolmogorov-Smirnov Type Goodness-of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/xyrjap.

Full text
Abstract:
Master's thesis
Tamkang University
Master's Program, Department of Statistics
Academic year 104 (2015-2016)
The multinomial logistic regression model is widely used for inferring the relationship between risk factors and a disease with multiple categories. Based on the discrepancy between the nonparametric and the semiparametric maximum likelihood estimators of the cumulative distribution function, this thesis proposes a Kolmogorov-Smirnov type test statistic to assess the adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the p-value of the proposed test statistic. Empirical type I error rates and powers of the test are evaluated by simulation studies. Several examples illustrate the implementation of the test.
APA, Harvard, Vancouver, ISO, and other styles
29

Witzgall, Laila Chahrazad. "Nonparametric segmentation of nonstationary time series." Master's thesis, 2016. http://hdl.handle.net/10451/24924.

Full text
Abstract:
Master's thesis in Physics, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2016
A análise de séries temporais trata do estudo de dados colectados durante determinado período de tempo. Uma série temporal consiste numa série de dados listados por ordem temporal, e é constituído por uma sequência de dados medida sucessivamente em intervalos de tempo equidistantes, ou não. O estudo de séries temporais é um campo vasto da estatística que se ramifica a várias áreas da ciência. A análise de séries temporais consiste em métodos de análise de dados com o objectivo de extrair elementos estatísticos e outras características relevantes e ocorre frequentemente no contexto da estatística, econometria, geofísica, meteorologia e outras áreas onde uma das principais motivações para o estudo destas séries temporais é a previsão. Uma grande parte dos sistemas complexos encontrados na vida real têm associados séries temporais empíricas que exibem graus variáveis de não-estacionariedade, como por exemplo medições da velocidade do vento, séries temporais financeiras, entre outros. Um processo estocástico estacionário tem como propriedade que a estrutura da média, variância e autocorrelação não se altera no tempo. Um dos focos desta área de estudo é o tratamento de séries temporais não-estacionárias através de algoritmos de segmentação. A segmentação de séries temporais consiste em dividir a série em fragmentos, baseando a decisão de segmentação num critério pré-determinado no algoritmo. Neste trabalho explora-se um algoritmo de segmentação automática recursiva não-paramétrica baseado no teste estatístico de Kolmogorov-Smirnov para séries temporais não-estacionárias provenientes de processos complexos. A segmentação permite dividir a série temporal em fragmentos onde a estatística é idêntica, criando assim janelas de estacionariedade dentro de uma série não-estacionária. O teste de Kolmogorov-Smirnov é um teste totalmente não-paramétrico que avalia a igualdade de distribuições de probabilidade contínuas que pode ser utilizado para comparar uma amostra de dados com uma distribuição de probabilidade de referência, Teste de Kolmogorov-Smirnov para uma amostra, ou pode ser utilizado para comparar duas amostras de dados e neste caso designa-se por Teste de Kolmogorov-Smirnov para duas amostras. Este teste possibilita-nos testar se duas amostras pertencem a uma mesma distribuição sem necessidade de especificar qual, isto resulta da análise da diferença entre duas funções de distribuição cumulativas e observar em que ponto esta diferença absoluta é máxima. Esta diferença designa-se por distância de Kolmogorov-Smirnov. Neste trabalho utiliza-se o conceito de teste de hipóteses que consiste numa categoria de inferência estatística fazendo parte de teoria da decisão. Um teste de hipóteses inicia com a proposta de uma hipótese nula, em como um modelo probabilístico descreve as observações de determinada experiência. A questão abordada no teste tem como consequência dois possíveis resultados: aceitar ou rejeitar a hipótese nula. Neste caso estamos interessados em testar a existência de uma distribuição comum entre duas amostras de séries temporais. Dada a hipótese nula de que as duas amostras pertencem à mesma distribuição, podemos testar esta relativamente à hipótese alternativa de que as distribuições têm funções de distribuição cumulativas diferentes. Para cada amostra calcula-se a função de distribuição cumulativa e a diferença entre elas ponto a ponto. 
Comparamos esta distância e extraímos a distância máxima que constitui a estatística do teste, a distância de Kolmogorov-Smirnov entre as duas funções. O algoritmo de segmentação para séries temporais aqui desenvolvido baseia-se nesta distância entre funções de distribuição cumulativas e funciona, em suma, da seguinte forma: dada uma série temporal e um ponteiro que se move sequencialmente em toda a série, a cada posição do ponteiro é feito um corte na amostra e são comparados os dois fragmentos resultantes. É calculada a estatística de Kolmogorov-Smirnov e quando o algoritmo percorre toda a série temporal é extraído o valor máximo desta estatística. Por sua vez, é nesta posição, onde o valor máximo é encontrado que o algoritmo propõe uma posição de corte da série temporal e compara este com a significância de uma possível posição de segmentação. Este processo é então aplicado iterativamente até não existirem mais propostas de posições de corte ou o fragmento testado tem tamanho inferior a um tamanho pré-determinado. O objectivo principal do trabalho consistiu em caracterizar o algoritmo de segmentação testando séries temporais artificiais compostas por números aleatórios de distribuições diferentes, Gaussiana, log-normal e Cauchy. A escolha das distribuições de log-normal e de Cauchy foi motivada por estas serem classificadas como classes de distribuições com heavy tails, i.e., a cauda da distribuição é mais acentuada e decai como uma power-law. Muitas séries temporais de sistemas reais apresentam heavy tails e por esta razão é importante explorar o algoritmo e optimizá-lo para este tipo de distribuições. Explora-se também a função de probabilidade do teste de Kolmogorov-Smirnov e o critério de significância para amostras de tamanho muito grande. Este critério não se mostra adequado para o algoritmo aqui desenvolvido porque assume que as amostras comparadas pelo algoritmo são independentes o que não é o caso. O algoritmo tem como entrada uma série temporal que é dividida recursivamente em pares de fragmentos que são posteriormente comparados entre si o que torna os dados interdependentes e por este motivo utiliza-se um critério de significância adequado sugerido na literatura. Numa fase seguinte realizam-se testes numéricos extensivos para avaliar a precisão e eficiência do algoritmo para diferentes distribuições, nomeadamente, Gaussiana, log-normal e Cauchy. O algoritmo de segmentação de Kolmogorov-Smirnov mostra comportar-se bem mesmo quando testado em distribuições com heavy tails, caso em que o teste de Kolmogorov-Smirnov é, em teoria, menos sensível. Motivados por isto e procurando optimizar o desempenho do algoritmo para distribuições com heavy-tails introduzimos uma mudança ao algoritmo onde substituímos o teste de Kolmogorov-Smirnov pelo teste de Anderson-Darling que consiste em adicionar um termo com uma função de peso. Esta função de peso permite uma maior flexibilidade no sentido que mediante a escolha certa dá mais peso a determinada zona da distribuição, no nosso caso, a cauda. Com esta alteração ao algoritmo de segmentação analisou-se o comportamento do critério de significância que se mostrou continuar adequado. O algoritmo de segmentação com o teste de Anderson-Darling foi então aplicado a séries temporais construídas a partir de números aleatórios gerados a partir da distribuição de Cauchy e comparado à versão do algoritmo com o teste de Kolmogorov-Smirnov. 
Em seguida analisa-se o desempenho do algoritmo de segmentação no espaço de parâmetros das distribuições para as duas versões do algoritmo, com o teste de Kolmogorov-Smirnov e com a introdução da modificação de Anderson-Darling. Com esta análise é possível fazer uma análise quantitativa do desempenho do algoritmo e deste modo estabelecer uma comparação entre ambas as vertentes do algoritmo. Esperava-se que a implementação do teste de Anderson-Darling otimizasse significativamente o desempenho do algoritmo quando aplicado a distribuições com heavy-tails verificando-se apenas uma ligeira melhoria quando aplicado a uma série temporal de Cauchy. Trabalho futuro poderia consistir em melhorar desempenho do algoritmo de segmentação em séries temporais com heavy tails, aumentando a sua sensibilidade nas caudas da distribuição. Será interessante aplicar o algoritmo a medições empíricas de sistemas complexos reais tais como sistemas geofísicos ou sistemas socio-económicos situações onde distribuições com heavy tails têm um papel crucial. Será igualmente interessante analisar como é que o algoritmo de segmentação modificado, com a implementação do teste de Anderson-Darling ao invés do de Kolmogorov-Smirnov, aqui apresentado poderá auxiliar na distinção de diferentes regimes de parâmetros em séries temporais complexas de sistemas físicos reais, como por exemplo dados de mercados financeiros onde ocorrem tipicamente oscilações entre diferentes estados de mercado acompanhados de alterações nas distribuições de retorno, estruturas de correlação, expoentes de Hurst entre outros. Possivelmente em combinação com outras ferramentas estatísticas sensíveis a alterações nas quantidades previamente mencionadas, uma rotina de segmentação automatizada poderá ser útil, eficiente e uma assistência facilmente programável em decision-making.
Many empirical time series that arise in real-world complex systems are found to exhibit varying degrees of nonstationarity, such as atmospheric wind fields and financial time series. A nonparametric segmentation method for nonstationary time series has been implemented based on an existing algorithm using the statistical Kolmogorov-Smirnov test for equality of cumulative distribution functions. Starting from an automated segmentation algorithm based on the Kolmogorov-Smirnov distance for Gaussian, log-normal and Cauchy distributed random time series, we have attempted to characterize and improve the segmentation performance for heavy-tailed time series. A time series can be understood to be composed of a series of reasonably long segments, within each of which its properties are stationary. The nonparametric segmentation algorithm presented here divides the time series recursively into segments, and for each pair of resulting segments congruence of the respective empirical probability distribution functions is assessed by the Kolmogorov-Smirnov test. The Kolmogorov-Smirnov test is weakly sensitive in the tails of the tested sample, even though these tail events are often the most interesting. For this reason we introduce a modification to the segmentation algorithm, replacing the Kolmogorov-Smirnov test with the Anderson-Darling test, which incorporates a weight function to allow more flexibility in the test and account for the tails. In a first phase we make a complete characterization of the segmentation algorithm and look for improvements for heavy-tailed distributions. We explore the Kolmogorov-Smirnov probability function for large sample sizes and the significance criterion for the classic Kolmogorov-Smirnov test, and examine a proposed significance criterion suited to data that are not independent, which is our case because we start from the whole time series and recursively divide it into fragments that are then compared. In a final phase we investigate the efficiency and performance range of the segmentation algorithm with the Kolmogorov-Smirnov test for Gaussian, log-normal and Cauchy distributed time series. We implement the Anderson-Darling test and establish a comparison with the Kolmogorov-Smirnov based segmentation algorithm for heavy-tailed distributed time series.
APA, Harvard, Vancouver, ISO, and other styles
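The recursive segmentation procedure described in the English abstract above can be sketched compactly. The version below scans all admissible cut points, keeps the cut with the maximum two-sample KS statistic, and recurses if the split is significant; note that it uses the plain ks_2samp p-value as the significance criterion, whereas the thesis uses a criterion adapted to the dependence introduced by the recursion. The function name, minimum segment size and threshold are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def ks_segment(series, min_size=50, alpha=0.01, offset=0, cuts=None):
    """Recursively cut `series` at the point of maximum two-sample KS distance,
    keeping a cut only if the split passes a (simplified) significance test."""
    if cuts is None:
        cuts = []
    n = len(series)
    if n < 2 * min_size:
        return cuts
    best = None                                # (statistic, p-value, cut index)
    for i in range(min_size, n - min_size):
        res = stats.ks_2samp(series[:i], series[i:])
        if best is None or res.statistic > best[0]:
            best = (res.statistic, res.pvalue, i)
    if best is not None and best[1] < alpha:
        i = best[2]
        cuts.append(offset + i)
        ks_segment(series[:i], min_size, alpha, offset, cuts)
        ks_segment(series[i:], min_size, alpha, offset + i, cuts)
    return sorted(cuts)

# toy example: white noise whose scale changes halfway through
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0, 1, 500), rng.normal(0, 3, 500)])
print(ks_segment(x))
```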
30

Greenberg, Simon L. "Bivariate goodness-of-fit tests based on Kolmogorov-Smirnov type statistics." Thesis, 2008. http://hdl.handle.net/10210/437.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Neto, Miguel Ângelo Silva. "Traffic classification based on statistical tests for matching empirical distributions of lengths of IP packets." Master's thesis, 2013. http://hdl.handle.net/10400.6/3885.

Full text
Abstract:
Nowadays, traffic classification constitutes one of the most important resources in the task of managing computer networks. The tools and techniques that enable network traffic to be segregated into classes are critical for administrators to maintain their networks operating at the required Quality of Service (QoS) and security levels. Nonetheless, the steady evolution of the infrastructure and mainly of the terminal devices, as well as the consequent increase in the complexity of the networks, make this task much harder to achieve, both in terms of accuracy and computational requirements. Some of the factors that most prejudice traffic classification are the encryption and evasive techniques adopted by network applications. Several researchers have thus been focusing efforts on finding new means to classify traffic or on improving the existing ones. This dissertation discusses research work on the network traffic classification subject, focused on the segregation of network flows according to the application that generated them, independently of the fact that such applications use different communication paradigms. For that purpose, a network scenario similar to a real one was set up in a lab environment, and several traffic traces generated using different contemporary applications were collected. These traces were initially subjected to human analysis, which enabled the identification of behavior patterns without resorting to information inside the contents of the packets, using only the empirical distribution of the sizes of the packets. After the initial analysis, a set of signatures composed of the aforementioned empirical distributions and the names of the respective applications was built for each of the applications and types of traffic under analysis. Subsequently, the best means to obtain the correspondence between the signatures and the network traffic in real time and in a packet-by-packet manner was investigated, from which resulted the modification of two statistical tests known as Chi-Squared and Kolmogorov-Smirnov, later implemented in prototypes for traffic classification. To enable the packet-by-packet analysis, the two statistics of the aforementioned tests are calculated over a sliding window of values, which is updated each time a new packet of the flow arrives. The number of operations involved in updating the statistics is constant and low, which enables obtaining a classification at any given moment during the lifetime of a flow. Each of the two classification methods was implemented in a different prototype and then combined, using a heuristic, to obtain a third classifier. The classifiers were tested and evaluated separately using new traffic traces, generated by the different applications considered in the study and captured at a network aggregation point. Even though the results obtained for each of the two classifiers were good, with an accuracy above 70%, the combination of the two methods improves those results, correctly classifying more than 90% of the analysed flows. Additionally, the developed prototypes were compared with other similar tools discussed in the related literature and available online, and it was verified that, in many cases, the proposed classifiers produce better results for the analysed traces.
APA, Harvard, Vancouver, ISO, and other styles
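As a very rough sketch of the matching idea described above, one can store, per application, an empirical distribution of packet lengths as a signature and assign a flow to the application whose signature is closest in KS distance to the flow's current window of observed lengths. The application names, the synthetic signatures and the function names below are all illustrative assumptions; the thesis' modified Chi-Squared and Kolmogorov-Smirnov tests and its incremental sliding-window update are not reproduced here.

```python
import numpy as np

def ks_distance(a_sorted, b_sorted):
    """Maximum absolute difference between two empirical CDFs (KS distance)."""
    grid = np.union1d(a_sorted, b_sorted)
    fa = np.searchsorted(a_sorted, grid, side="right") / len(a_sorted)
    fb = np.searchsorted(b_sorted, grid, side="right") / len(b_sorted)
    return float(np.max(np.abs(fa - fb)))

def classify_flow(window_lengths, signatures):
    """Assign a flow to the application whose packet-length signature is
    closest in KS distance to the current window of observed lengths."""
    window = np.sort(np.asarray(window_lengths, dtype=float))
    return min(signatures,
               key=lambda app: ks_distance(window, np.sort(signatures[app])))

# purely illustrative "signatures": synthetic packet-length samples per application
rng = np.random.default_rng(3)
signatures = {"app_A": rng.normal(160, 20, 1000), "app_B": rng.normal(1400, 60, 1000)}
print(classify_flow(rng.normal(150, 25, 200), signatures))
```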
