Dissertations / Theses on the topic 'Nearest neighbor analysis (Statistics)'
Consult the top 50 dissertations / theses for your research on the topic 'Nearest neighbor analysis (Statistics).'
Shen, Qiong Mao. "Group nearest neighbor queries /." View abstract or full-text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20SHEN.
Hui, Michael Chun Kit. "Aggregate nearest neighbor queries /." View abstract or full-text, 2004. http://library.ust.hk/cgi/db/thesis.pl?COMP%202004%20HUI.
Includes bibliographical references (leaves 91-95). Also available in electronic version. Access restricted to campus users.
Xie, Xike, and 谢希科. "Evaluating nearest neighbor queries over uncertain databases." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B4784954X.
Computer Science
Doctoral
Doctor of Philosophy
Zhang, Jun. "Nearest neighbor queries in spatial and spatio-temporal databases /." View abstract or full-text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20ZHANG.
Ram, Parikshit. "New paradigms for approximate nearest-neighbor search." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49112.
Zhang, Peiwu, and 张培武. "Voronoi-based nearest neighbor search for multi-dimensional uncertain databases." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2012. http://hub.hku.hk/bib/B49618179.
Computer Science
Master
Master of Philosophy
Wong, Wing Sing. "K-nearest-neighbor queries with non-spatial predicates on range attributes /." View abstract or full-text, 2005. http://library.ust.hk/cgi/db/thesis.pl?COMP%202005%20WONGW.
Yiu, Man-lung. "Advanced query processing on spatial networks." Click to view the E-thesis via HKUTO, 2006. http://sunzi.lib.hku.hk/hkuto/record/B36279365.
Yiu, Man-lung, and 姚文龍. "Advanced query processing on spatial networks." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2006. http://hub.hku.hk/bib/B36279365.
Dastile, Xolani Collen. "Improved tree species discrimination at leaf level with hyperspectral data combining binary classifiers." Thesis, Rhodes University, 2011. http://hdl.handle.net/10962/d1002807.
Bengtsson, Thomas. "Time series discrimination, signal comparison testing, and model selection in the state-space framework /." free to MU campus, to others for purchase, 2000. http://wwwlib.umi.com/cr/mo/fullcit?p9974611.
Ali, Khan Syed Irteza. "Classification using residual vector quantization." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50300.
Hawash, Maher Mofeid. "Methods for Efficient Synthesis of Large Reversible Binary and Ternary Quantum Circuits and Applications of Linear Nearest Neighbor Model." PDXScholar, 2013. https://pdxscholar.library.pdx.edu/open_access_etds/1090.
Lee, Jong-Seok. "Preserving nearest neighbor consistency in cluster analysis." [Ames, Iowa : Iowa State University], 2009. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3369852.
Zhong, Xiao. "A study of several statistical methods for classification with application to microbial source tracking." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0430104-155106/.
Keywords: classification; k-nearest-neighbor (k-n-n); neural networks; linear discriminant analysis (LDA); support vector machines; microbial source tracking (MST); quadratic discriminant analysis (QDA); logistic regression. Includes bibliographical references (p. 59-61).
Cheng, Si. "Hierarchical Nearest Neighbor Co-kriging Gaussian Process For Large And Multi-Fidelity Spatial Dataset." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1613750570927821.
Ogden, Mitchell S. "Observing Clusters and Point Densities in Johnson City, TN Crime Using Nearest Neighbor Hierarchical Clustering and Kernel Density Estimation." Digital Commons @ East Tennessee State University, 2019. https://dc.etsu.edu/asrf/2019/schedule/138.
Gard, Rikard. "Design-based and Model-assisted estimators using Machine learning methods : Exploring the k-Nearest Neighbor metod applied to data from the Recreational Fishing Survey." Thesis, Örebro universitet, Handelshögskolan vid Örebro Universitet, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-72488.
Funai, Tomohiko. "Extensions of Nearest Shrunken Centroid Method for Classification." BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2402.
Ma, Tao. "Statistics of Quantum Energy Levels of Integrable Systems and a Stochastic Network Model with Applications to Natural and Social Sciences." University of Cincinnati / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1378196433.
Zhang, Xianjie, and Sebastian Bogic. "Datautvinning av klickdata : Kombination av klustring och klassifikation." Thesis, KTH, Hälsoinformatik och logistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230630.
Owners of websites and applications usually profit from users who click on their links. These can be advertisements or items for sale, among others. Many data-analysis studies predict whether a link will be clicked, but only a few focus on what needs to be adjusted to get the link clicked. The problem facing Flygresor.se is that it lacks a tool for its customers, travel agencies, that analyses their tickets and then adjusts the attributes of those trips. The requested solution was an application that suggests how to change the tickets so that they are clicked more often and thereby generate more sales. A prototype was constructed that uses two data mining methods: clustering with the DBSCAN algorithm and classification with the k-nearest neighbor algorithm. These algorithms were used together with an evaluation process, called DNNA, which analyzes the results from the algorithms and suggests changes that could be made to the attributes of the links. The combination of the algorithms and DNNA was tested and evaluated as the solution to the problem. The program was able to predict which attributes of the tickets needed to be adjusted to get the tickets more clicks. The recommended adjustments were reasonable, but this result could not be compared to similar tools since they had not been published.
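The classification step named in this abstract, k-nearest neighbor, can be illustrated with a minimal sketch. This is not the thesis's prototype or its DNNA process; the feature tuples, labels, and the function name `knn_predict` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_tuple, label) pairs; the features here are
    hypothetical numeric attributes, not the attributes used in the thesis.
    """
    # Sort training items by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels of those neighbors and return the most frequent one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of points.
train = [
    ((0, 0), "low"), ((0, 1), "low"), ((1, 0), "low"),
    ((5, 5), "high"), ((5, 6), "high"), ((6, 5), "high"),
]
print(knn_predict(train, (0.5, 0.5)))
print(knn_predict(train, (5.5, 5.5)))
```

An odd k is used so that a two-class vote cannot tie; with more classes a tie-breaking rule would have to be chosen explicitly.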
Sammon, Ryan. "Data Collection, Analysis, and Classification for the Development of a Sailing Performance Evaluation System." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/25481.
Nathan, Andrew Prashant. "Single Chain Statistics of a Polymer in a Crystallizable Solvent." University of Akron / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=akron1216146248.
Shi, Hongxiang. "Hierarchical Statistical Models for Large Spatial Data in Uncertainty Quantification and Data Fusion." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504802515691938.
Tsipenyuk, Gregory. "Evaluation of decentralized email architecture and social network analysis based on email attachment sharing." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/273963.
Prabhu, Chitra. "COMPARISON OF THE UTILITY OF REGRESSION ANALYSIS AND K-NEAREST NEIGHBOR TECHNIQUE TO ESTIMATE ABOVE-GROUND BIOMASS IN PINE FORESTS USING LANDSAT ETM+ IMAGERY." MSSTATE, 2006. http://sun.library.msstate.edu/ETD-db/theses/available/etd-08092006-091449/.
Sahtout, Mohammad Omar. "Improving the performance of the prediction analysis of microarrays algorithm via different thresholding methods and heteroscedastic modeling." Diss., Kansas State University, 2014. http://hdl.handle.net/2097/17914.
Department of Statistics
Haiyan Wang
This dissertation considers different methods to improve the performance of the Prediction Analysis of Microarrays (PAM). PAM is a popular algorithm for high-dimensional classification. However, it has the drawback of retaining too many features even after multiple runs of the algorithm to perform further feature selection. The average number of selected features is 2611 when PAM is applied to 10 multi-class microarray human cancer datasets. Such a large number of features makes follow-up studies difficult. This drawback results from the soft thresholding method used in the PAM algorithm and from PAM's thresholding parameter estimate. In this dissertation, we extend the PAM algorithm with two other thresholding methods (hard and order thresholding) and a deep search algorithm to achieve a better thresholding parameter estimate. In addition to the newly proposed algorithms, we derive an approximation for the probability of misclassification for the hard-thresholded algorithm in the binary case. Beyond the aforementioned work, this dissertation considers the heteroscedastic case, in which the variances of each feature differ across classes. In the PAM algorithm, the variance of the values for each predictor is assumed to be constant across classes. We found that this homogeneity assumption is invalid for many features in most data sets, which motivates us to develop new heteroscedastic versions of the algorithms. The different thresholding methods are considered in these algorithms as well. All new algorithms proposed in this dissertation are extensively tested and compared using real data or Monte Carlo simulation studies. In general, the newly proposed algorithms not only achieve better cancer status prediction accuracy but also yield more parsimonious models with significantly fewer genes.
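The difference between the soft and hard thresholding rules mentioned in this abstract can be sketched as follows. This uses the standard textbook definitions applied to a single standardized centroid difference `d` with threshold `delta`; both names are illustrative assumptions, and the dissertation's order thresholding and deep search procedures are not defined in the abstract, so they are not shown.

```python
import math


def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    shrunk = abs(d) - delta
    if shrunk <= 0:
        return 0.0
    # Keep the original sign, reduce the magnitude by delta.
    return math.copysign(shrunk, d)


def hard_threshold(d, delta):
    """Keep d unchanged if its magnitude exceeds delta; otherwise set it to 0."""
    return d if abs(d) > delta else 0.0


# Illustrative values only: the same inputs under both rules.
for d in (2.5, -2.5, 0.5):
    print(d, soft_threshold(d, 1.0), hard_threshold(d, 1.0))
```

Under either rule, a feature whose thresholded differences are zero for every class drops out of the classifier. Soft thresholding also shrinks the surviving values, whereas hard thresholding leaves them unchanged.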
Favaro, Martha Maria Andreotti 1981. "Exploração de dados multivariados de fontes e extratos de antocianinas ultilizando análise de componentes princiaipais e método do vizinho mais proximo." [s.n.], 2012. http://repositorio.unicamp.br/jspui/handle/REPOSIP/250159.
Tese (doutorado) - Universidade Estadual de Campinas, Instituto de Química
Resumo: Antocianinas (ACYS) são corantes naturais responsáveis pela coloração de frutas, hortaliças, flores e grãos. Novas perspectivas de usos de antocianinas em diversos segmentos industriais estimulam estudos analíticos para sistematizar a identificação e a classificação de fontes e extratos desses corantes. Neste trabalho foram utilizadas fontes de ACYS como frutas típicas brasileiras: AMORA (Morus nigra), amora preta (Rubus sp.), jabuticaba (Myrciaria cauliflora), jambolão (Syzygium cumini), jussara (Euterpe edulis Mart.), morango (Fragaria x ananassa Duch) e uva (Vitis vinífera e Vitis vinífera L. Brasil); hortaliças: alface roxa (Lactuca sativa), berinjela (Solanum melongena), cebola roxa (Allium cepa), rabanete (Raphanus sativus), repolho roxo (Brassica oleraceae) e flores: beijo-turco (Impatiens walleriana), gerânio (Pelargonium hortorum e Pelargonium peltatum L.), hibisco (Hibiscus sinensis e Hibiscus syriacus) e hortênsia (Hydrangea macrophylla). A literatura descreve diversas técnicas para análise de ACYS em vegetais e seus extratos, com destaque para cromatografia líquida de alta eficiência (HPLC), espectrometria de massas (MS) e espectrofotometria (UV-Vis), sendo que todas elas foram aplicadas neste trabalho, incluindo-se espectrofotometria de reflectância e a técnica de eletromigração em capilares cromatografia eletrocinética micelar (MEKC). As ferramentas quimiométricas utilizadas no tratamento dos dados foram análise de componentes principais (PCA) e método do vizinho mais próximo (KNN). Os modelos quimiométricos de classificação obtidos apresentaram-se robustos com erros de previsão de menos de 30 % sendo possível identificar as fontes de ACYS, o solvente extrator, a idade dos extratos e dados sobre sua estabilidade e condições de armazenamento. 
Os resultados apontaram que dados obtidos de técnicas analíticas simples como espectrofotometria de absorção e sem necessidade de preparo de amostra como reflectância difusa na região do visível são comparáveis a resultados de técnicas mais sofisticadas e caras como HPLC e MEKC e até superam o potencial de algumas informações obtidas por MS
Abstract: Anthocyanins (ACYS) are natural dyes responsible for color in fruits, vegetables, flowers and grains. New perspectives for use of anthocyanins in various industries stimulate analytical studies to systematize the identification and classification of sources and extracts of these dyes. In this work, typical Brazilian fruits: mulberry (Morus nigra), blackberry (Rubus sp.), jaboticaba (Myrciaria cauliflora), jambolan (Syzygium cumini), jussara fruit (Euterpe edulis Mart.), strawberry (Fragaria x ananassa Duch) and grapes (Vitis vinifera and Vitis vinifera L. 'Brazil'); vegetables: red lettuce (Lactuca sativa), eggplant (Solanum melongena), purple onion (Allium cepa), radish (Raphanus sativus), red cabbage (Brassica oleracea) and flowers: busy Lizzie (Impatiens walleriana), geranium (Pelargonium hortorum and Pelargonium peltatum L.), hibiscus (Hibiscus sinensis and Hibiscus syriacus) and hydrangea (Hydrangea macrophylla) were used as sources of ACYS. The literature describes several techniques for analyzing ACYS in vegetables and their extracts, with emphasis on high performance liquid chromatography (HPLC), mass spectrometry (MS) and spectrophotometry (UV-Vis). All of these techniques were applied in this work, along with reflectance spectrophotometry and micellar electrokinetic chromatography (MEKC), which is one of the capillary electromigration techniques. The chemometric tools used in data handling were principal component analysis (PCA) and the K-nearest neighbor method (KNN). The chemometric classification models obtained are robust, with prediction errors of less than 30 %. It is possible to identify the sources of ACYS, the extractor solvent, the age of the extracts, their stability and storage conditions. 
The results show that data obtained from simple analytical techniques such as absorption spectrophotometry and diffuse reflectance in the visible region (which needs no sample preparation) are comparable to results obtained from more sophisticated and expensive techniques such as HPLC and MEKC, and even surpass some of the information obtained by MS.
Doutorado
Quimica Analitica
Doutor em Ciências
Aygar, Alper. "Doppler Radar Data Processing And Classification." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609890/index.pdf.
Kucuktunc, Onur. "Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1374148621.
Ahmed, Mohamed Salem. "Contribution à la statistique spatiale et l'analyse de données fonctionnelles." Thesis, Lille 3, 2017. http://www.theses.fr/2017LIL30047/document.
This thesis is about statistical inference for spatial and/or functional data. Indeed, we are interested in estimation of unknown parameters of some models from random or nonrandom (stratified) samples composed of independent or spatially dependent variables. The specificity of the proposed methods lies in the fact that they take into consideration the nature of the sample considered (stratified or spatial sample). We begin by studying data valued in a space of infinite dimension, or so-called "functional data". First, we study a functional binary choice model explored in a case-control or choice-based sample design context. The specificity of this study is that the proposed method takes into account the sampling scheme. We describe a conditional likelihood function under the sampling distribution and a reduction of dimension strategy to define a feasible conditional maximum likelihood estimator of the model. Asymptotic properties of the proposed estimates as well as their application to simulated and real data are given. Secondly, we explore a functional linear autoregressive spatial model whose particularity lies in the functional nature of the explanatory variable and the structure of the spatial dependence. The estimation procedure consists of reducing the infinite dimension of the functional variable and maximizing a quasi-likelihood function. We establish the consistency and asymptotic normality of the estimator. The usefulness of the methodology is illustrated via simulations and an application to some real data. In the second part of the thesis, we address some estimation and prediction problems of real random spatial variables. We start by generalizing the k-nearest neighbors method, namely k-NN, to predict a spatial process at non-observed locations using some covariates. The specificity of the proposed k-NN predictor lies in the fact that it is flexible and allows for some heterogeneity in the covariate. 
We establish the almost complete convergence with rates of the spatial predictor, whose performance is confirmed by an application to simulated and environmental data. In addition, we generalize the partially linear probit model of independent data to the spatial case. We use a linear process for disturbances allowing various spatial dependencies and propose a semiparametric estimation approach based on weighted likelihood and generalized method of moments methods. We establish the consistency and asymptotic distribution of the proposed estimators and investigate the finite sample performance of the estimators on simulated data. We end with an application of spatial binary choice models to identify UADT (upper aerodigestive tract) cancer risk factors in the north region of France, which displays the highest rates of such cancer incidence and mortality in the country.
Dikkaya, Fahri. "Settlement Patterns Of Altinova In The Early Bronze Age." Master's thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/1254614/index.pdf.
Tandan, Isabelle, and Erika Goteman. "Bank Customer Churn Prediction : A comparison between classification and evaluation methods." Thesis, Uppsala universitet, Statistiska institutionen, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-411918.
Servien, Rémi. "Estimation de régularité locale." Phd thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2010. http://tel.archives-ouvertes.fr/tel-00730491.
Ambrožová, Monika. "Detekce fibrilace síní v krátkodobých EKG záznamech." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2019. http://www.nusl.cz/ntk/nusl-400984.
Baum, Kristen Anne. "Feral Africanized honey bee ecology in a coastal prairie landscape." Texas A&M University, 2003. http://hdl.handle.net/1969/150.
Ramraj, Varun. "Exploiting whole-PDB analysis in novel bioinformatics applications." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:6c59c813-2a4c-440c-940b-d334c02dd075.
Jelínková, Jana. "Rozpoznání hudebního slohu z orchestrální nahrávky za pomoci technik Music Information Retrieval." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-413256.
Bílý, Ondřej. "Moderní řečové příznaky používané při diagnóze chorob." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-218971.
(11181162), Jiexin Duan. "DISTRIBUTED NEAREST NEIGHBOR CLASSIFICATION WITH APPLICATIONS TO CROWDSOURCING." Thesis, 2021.
"Superseding neighbor search on uncertain data." 2009. http://library.cuhk.edu.hk/record=b5894020.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2009.
Includes bibliographical references (leaves [44]-46).
Abstract also in Chinese.
Thesis Committee
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 2 --- Related Work
Chapter 2.1 --- Nearest Neighbor Search on Precise Data
Chapter 2.2 --- NN Search on Uncertain Data
Chapter 3 --- Problem Definitions and Basic Characteristics
Chapter 4 --- The Full-Graph Approach
Chapter 5 --- The Pipeline Approach
Chapter 5.1 --- The Algorithm
Chapter 5.2 --- Edge Phase
Chapter 5.3 --- Pruning Phase
Chapter 5.4 --- Validating Phase
Chapter 5.5 --- Discussion
Chapter 6 --- Extension
Chapter 7 --- Experiment
Chapter 7.1 --- Properties of the SNN-core
Chapter 7.2 --- Efficiency of Our Algorithms
Chapter 8 --- Conclusions and Future Work
Chapter A --- List of Publications
Bibliography
Chamness, Kevin Andrew. "Multivariate fault detection and visualization in the semiconductor industry." Thesis, 2006. http://hdl.handle.net/2152/2830.
"Automatic text categorization for information filtering." 1998. http://library.cuhk.edu.hk/record=b5889734.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1998.
Includes bibliographical references (leaves 157-163).
Abstract also in Chinese.
Abstract
Acknowledgment
List of Figures
List of Tables
Chapter 1 --- Introduction
Chapter 1.1 --- Automatic Document Categorization
Chapter 1.2 --- Information Filtering
Chapter 1.3 --- Contributions
Chapter 1.4 --- Organization of the Thesis
Chapter 2 --- Related Work
Chapter 2.1 --- Existing Automatic Document Categorization Approaches
Chapter 2.1.1 --- Rule-Based Approach
Chapter 2.1.2 --- Similarity-Based Approach
Chapter 2.2 --- Existing Information Filtering Approaches
Chapter 2.2.1 --- Information Filtering Systems
Chapter 2.2.2 --- Filtering in TREC
Chapter 3 --- Document Pre-Processing
Chapter 3.1 --- Document Representation
Chapter 3.2 --- Classification Scheme Learning Strategy
Chapter 4 --- A New Approach - IBRI
Chapter 4.1 --- Overview of Our New IBRI Approach
Chapter 4.2 --- The IBRI Representation and Definitions
Chapter 4.3 --- The IBRI Learning Algorithm
Chapter 5 --- IBRI Experiments
Chapter 5.1 --- Experimental Setup
Chapter 5.2 --- Evaluation Metric
Chapter 5.3 --- Results
Chapter 6 --- A New Approach - GIS
Chapter 6.1 --- Motivation of GIS
Chapter 6.2 --- Similarity-Based Learning
Chapter 6.3 --- The Generalized Instance Set Algorithm (GIS)
Chapter 6.4 --- Using GIS Classifiers for Classification
Chapter 6.5 --- Time Complexity
Chapter 7 --- GIS Experiments
Chapter 7.1 --- Experimental Setup
Chapter 7.2 --- Results
Chapter 8 --- A New Information Filtering Approach Based on GIS
Chapter 8.1 --- Information Filtering Systems
Chapter 8.2 --- GIS-Based Information Filtering
Chapter 9 --- Experiments on GIS-based Information Filtering
Chapter 9.1 --- Experimental Setup
Chapter 9.2 --- Results
Chapter 10 --- Conclusions and Future Work
Chapter 10.1 --- Conclusions
Chapter 10.2 --- Future Work
Chapter A --- Sample Documents in the corpora
Chapter B --- Details of Experimental Results of GIS
Chapter C --- Computational Time of Reuters-21578 Experiments
Lawson, Kathryn Sahara. "Defining activity areas in the Early Neolithic site at Foeni-Salaş (southwest Romania): A spatial analytic approach with geographical information systems in archaeology." 2007. http://hdl.handle.net/1993/2838.
February 2008
Cheema, Muhammad Aamir Computer Science & Engineering Faculty of Engineering UNSW. "CircularTrip and ArcTrip:effective grid access methods for continuous spatial queries." 2007. http://handle.unsw.edu.au/1959.4/40512.
Chen, Hue-Ling, and 陳慧玲. "Design and Analysis of Nearest Neighbor Search Strategies." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/87412624225347213682.
國立中山大學
資訊工程學系研究所
90
With the proliferation of wireless communications and rapid advances in technologies, algorithms for efficiently answering queries about large amounts of spatial data are needed. Spatial data consists of spatial objects, including data of higher dimension. Neighbor finding is one of the most important spatial operations in the field of spatial data structures. In recent years, many researchers have focused on finding efficient solutions to the nearest neighbor (NN) problem, which involves determining the point in a data set that is nearest to a given query point. It is frequently used in Geographical Information Systems (GIS). A block B is said to be the neighbor of another block A if block B has the same property as block A and covers an equal-sized neighbor of block A. Jozef Voros has proposed a neighbor finding strategy on images represented by quadtrees, in which the four equal-sized neighbors (in the east, west, north, and south directions) of block A can be found. However, under Voros's strategy, the case in which the nearest neighbor occurs in a diagonal direction (northeast, northwest, southeast, or southwest) is ignored. Moreover, there is no total ordering that preserves proximity when mapping spatial data from a higher-dimensional space to a 1D space. One way of effecting such a mapping is to utilize space-filling curves. Space-filling curves pass through every point in a space and give a one-to-one correspondence between the coordinates and the 1D sequence number of a point. The Peano curve, proposed by Orenstein, which obtains the 1D coordinate of a point by simply interleaving the bits of its X and Y coordinates in the 2D space, can be easily used in neighbor finding. But with the data ordered by the RBG curve or the Hilbert curve, neighbor finding would be complex. The RBG curve achieves savings in random accesses on the disk for range queries, and the Hilbert curve achieves the best clustering for range queries. 
Therefore, in this thesis, we first show the missing case in Voros's strategy and show ways to find it. Next, we show that the Peano curve is the best mapping function for nearest neighbor finding. We also show the transformation rules between the Peano curve and the other curves, so that we can efficiently find the nearest neighbor when the data is linearly ordered by the other curves. Our simulations show that our two proposed strategies work correctly and faster than the conventional strategies in nearest neighbor finding. Finally, we present a revised version of NA-Trees, which supports exact match queries and range queries on a large, dynamic index, where an exact match query means finding a specific data object in a spatial database and a range query means reporting all data objects located in a specific range. By large, we mean that most of the index must be stored in secondary memory. By dynamic, we mean that insertions and deletions are intermixed with queries, so that the index cannot be built beforehand.
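The bit-interleaving mapping that the abstract attributes to the Peano curve can be sketched in a few lines. The function names, the 16-bit coordinate width, and the choice to place the X bit below the Y bit are assumptions for illustration. The neighbor lookup shown is a plain decode, offset, and re-encode, not the thesis's own strategy or its transformation rules to other curves.

```python
def interleave_bits(x, y, bits=16):
    """Map 2D grid coordinates to a 1D code by interleaving their bits.

    Assumption: bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def deinterleave_bits(code, bits=16):
    """Invert interleave_bits, recovering (x, y) from the 1D code."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


def neighbor_codes(code, bits=16):
    """Return the codes of the up to eight grid neighbors, diagonals included."""
    x, y = deinterleave_bits(code, bits)
    limit = 1 << bits
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the cell itself
            nx, ny = x + dx, y + dy
            if 0 <= nx < limit and 0 <= ny < limit:
                result.append(interleave_bits(nx, ny, bits))
    return result


code = interleave_bits(3, 5)
print(code, deinterleave_bits(code), sorted(neighbor_codes(0)))
```

Including the four diagonal offsets covers the case that, according to the abstract, a four-direction search misses.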
Li, Yung-Hsu, and 黎詠絮. "Adaptive Nearest Neighbor Discriminant Analysis for High-dimensional Data Classification." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/22255030417583197406.
輔仁大學
應用統計學研究所
98
With advances in technology, we can collect data with a large number of attributes; such data are called high-dimensional data. For example, each pixel in a hyperspectral image consists of hundreds or even thousands of bands. However, in high-dimensional data classification, the number of available training samples may be very limited. In fact, having only relatively small training sets is a common problem in high-dimensional data analysis. As a consequence of the curse of dimensionality, the classification accuracy may be unsatisfactory. DANN is an adaptive classifier for high-dimensional data. If the within-class covariance is singular, which often occurs in high-dimensional problems, DANN performs poorly in classification. In this paper we propose DANN_PRDA to reduce the effect of high dimensionality and small sample sizes on classification. Our study uses several kinds of data, including hyperspectral images and face recognition data. We used LDC, QDC, SVM, k-NN, DANN, DANN_DA, DANN_PRDA, and other classifiers to classify the material, and the experimental results show that the classification accuracy of DANN_PRDA exceeds that of the other classifiers.
Chen, Chih-Han, and 陳志翰. "Fault Diagnosis of Steam Turbine-Generator Sets Using K-Nearest Neighbor and Principal Component Analysis Methods." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/85499230315301703030.
正修科技大學
電機工程研究所
103
From the viewpoint of prevention, early detection of incipient faults in the fundamental equipment of steam power plants, especially steam turbine-generator sets, has attracted considerable attention. Because the steam turbine-generator set is a vital device in the power system, its failure can cause a wide-ranging system outage. Due to the increasing capacity and structural complexity of steam turbine-generator sets, the components of a set are more closely coupled than before. Thus, research on vibration fault diagnosis is not only important and beneficial for safe and stable machine operation, but is also a frontier issue in electrical engineering. This thesis presents a data mining approach based on a K-nearest neighbor (K-NN) classifier and principal component analysis (PCA) to diagnose the vibration faults of turbine-generator units. PCA is used to reduce the dimension of the input attributes through linear combinations of the original attributes. The test results demonstrate the feasibility of the proposed approach for diagnosing the vibration faults of turbine-generator units.
Chang, Huan Ling, and 張華玲. "A Study on Spatial Analysis of National Monument through the Method of Nearest Neighbor Index in Tainan City." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/pd99gr.
康寧大學
休閒管理研究所
105
The purpose of this study is to discuss the definition and classification of National Monuments. To analyze the spatial characteristics of the 22 National Monuments distributed across Tainan, the study marks them on a map and examines whether their distribution is centralized or decentralized. The research method uses the Nearest Neighbor Index to analyze the National Monuments located in 7 different districts. The findings are: (A) the spatial characteristics of the National Monuments not only convey humanistic characteristics but can also be combined with regional spatial characteristics as resources for local cultural tourism. (B) Most of the National Monuments are located in the central and western districts of Tainan, where the population density is also concentrated. Within this limited geographical space, the concentration of National Monuments and the convenience attract more tourists than other districts, in part because some of the National Monuments are within walking distance of each other. (C) Comparing the number of National Monuments in the old Tainan City with that in present-day Tainan, the distribution is decentralized.
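The Nearest Neighbor Index named in this abstract can be sketched as below. The abstract does not state which formula the thesis uses, so this assumes the common Clark and Evans form: the observed mean nearest-neighbor distance divided by the distance expected under complete spatial randomness, 0.5 / sqrt(n / A). The coordinates and study-area size are made up, and no edge correction is applied.

```python
import math


def nearest_neighbor_index(points, area):
    """Ratio of observed to expected mean nearest-neighbor distance.

    Values below 1 suggest clustering; values above 1 suggest dispersion.
    `area` is the size of the study region in squared coordinate units.
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")
    total = 0.0
    for i, p in enumerate(points):
        # Distance from p to its closest other point.
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)  # mean NN distance under randomness
    return observed / expected


# Hypothetical layouts in a 10 x 10 region (area 100).
clustered = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1)]
regular = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]
print(nearest_neighbor_index(clustered, 100.0))
print(nearest_neighbor_index(regular, 100.0))
```

The tight cluster gives an index far below 1 and the regular grid gives an index above 1, matching the centralized versus decentralized reading used in the abstract.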
Lo, Yu-Yan, and 羅玉燕. "Extracting Function-level Statements in Biological Expression Language from Biomedical Literature:A K Nearest Neighbor approach inspired by Principal Component Analysis." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/a7p3u7.
國立中央大學
資訊工程學系
104
Nowadays, understanding pathways is one of the main goals in the biomedical domain, because biological pathways involve various regulation mechanisms. Many regulation mechanisms have been discovered and presented in the biomedical literature, allowing life scientists to follow the latest results. Text mining for biomedical research is also in high demand within the scientific community. Biological Expression Language (BEL) is designed to capture relationships between two biological entities, such as genes, proteins, and chemicals, in the scientific literature. It can not only describe positive/negative relationships between biomedical entities but also represent biomedical function-level information, such as complex abundance, chaperone protein, catalyst and so on. In related research, the latest reported performance of function-level classification is 30.5%, and this performance affects the performance of BEL full-statement extraction. To enhance the integrity of BEL full statements, we propose a K-nearest neighbor (KNN) approach inspired by Principal Component Analysis (PCA) to recognize function-level terms automatically. In the experimental results, the combination of PCA and KNN outperforms the SVM-based method, achieving an F-score of 59.70%. In conclusion, we hope that the higher performance of function-level classification can not only enhance the integrity of BEL full statements but also help construct complete biological networks and accelerate biomedical research for life scientists.