Dissertations / Theses on the topic 'Clustering'
Consult the top 50 dissertations / theses for your research on the topic 'Clustering.'
Yoo, Jaiyul. "From galaxy clustering to dark matter clustering." Columbus, Ohio : Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1186586898.
Hinz, Joel. "Clustering the Web : Comparing Clustering Methods in Swedish." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-95228.
Bacarella, Daniele. "Distributed clustering algorithm for large scale clustering problems." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-212089.
Zimek, Arthur. "Correlation Clustering." Diss., lmu, 2008. http://nbn-resolving.de/urn:nbn:de:bvb:19-87361.
Rutten, Jeroen Hendrik Gerardus Christiaan. "Polyhedral clustering." Maastricht : Maastricht : Universiteit Maastricht ; University Library, Maastricht University [Host], 1998. http://arno.unimaas.nl/show.cgi?fid=6061.
Leisch, Friedrich. "Bagged clustering." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/1272/1/document.pdf.
Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
Eldridge, Justin. "Clustering Consistently." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1512070374903249.
Salamone, Johnny <1990>. "Speaker Clustering." Master's Degree Thesis, Università Ca' Foscari Venezia, 2018. http://hdl.handle.net/10579/12958.
Rosell, Magnus. "Text Clustering Exploration : Swedish Text Representation and Clustering Results Unraveled." Doctoral thesis, KTH, Numerisk Analys och Datalogi, NADA, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-10129.
QC 20100806
Rossi, Alfred Vincent III. "Temporal Clustering of Finite Metric Spaces and Spectral k-Clustering." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1500033042082458.
Keller, Jens. "Clustering biological data using a hybrid approach : Composition of clusterings from different features." Thesis, University of Skövde, School of Humanities and Informatics, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-1078.
Clustering of data is a well-researched topic in computer science. Many approaches have been designed for different tasks. In biology many of these approaches are hierarchical and the result is usually represented in dendrograms, e.g. phylogenetic trees. However, many non-hierarchical clustering algorithms are also well-established in biology. The approach in this thesis is based on such common algorithms. The algorithm which was implemented as part of this thesis uses a non-hierarchical graph clustering algorithm to compute a hierarchical clustering in a top-down fashion. It performs the graph clustering iteratively, with a previously computed cluster as input set. The innovation is that it focuses on another feature of the data in each step and clusters the data according to this feature. Common hierarchical approaches cluster, e.g. in biology, a set of genes according to the similarity of their sequences. The clustering then reflects a partitioning of the genes according to their sequence similarity. The approach introduced in this thesis uses many features of the same objects. These features can vary; in biology, for instance, they can be similarities of the sequences, of gene expression or of motif occurrences in the promoter region. As part of this thesis not only was the algorithm itself implemented and evaluated, but also a complete software package providing a graphical user interface. The software was implemented as a framework providing the basic functionality, with the algorithm as a plug-in extending the framework. The software is meant to be extended in the future, integrating a set of algorithms and analysis tools related to the process of clustering and analysing data not necessarily related to biology.
The thesis deals with topics in biology, data mining and software engineering and is divided into six chapters. The first chapter gives an introduction to the task and the biological background. It gives an overview of common clustering approaches and explains the differences between them. Chapter two shows the idea behind the new clustering approach and points out differences and similarities between it and common clustering approaches. The third chapter discusses the aspects concerning the software, including the algorithm. It illustrates the architecture and analyses the clustering algorithm. After the implementation the software was evaluated, which is described in the fourth chapter, pointing out observations made due to the use of the new algorithm. Furthermore this chapter discusses differences and similarities to related clustering algorithms and software. The thesis ends with the last two chapters, namely conclusions and suggestions for future work. Readers who are interested in repeating the experiments which were made as part of this thesis can contact the author via e-mail, to get the relevant data for the evaluation, scripts or source code.
Gondek, David. "Non-redundant clustering /." View online version; access limited to Brown University users, 2005. http://wwwlib.umi.com/dissertations/fullcit/3174612.
Gupta, Pramod. "Robust clustering algorithms." Thesis, Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/39553.
Achtert, Elke. "Hierarchical Subspace Clustering." Diss., lmu, 2007. http://nbn-resolving.de/urn:nbn:de:bvb:19-68071.
Whissell, John. "Significant Feature Clustering." Thesis, University of Waterloo, 2006. http://hdl.handle.net/10012/2926.
Johnson, Samuel. "Document Clustering Interface." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-112878.
Evans, Reuben James Emmanuel. "Clustering for Classification." The University of Waikato, 2007. http://hdl.handle.net/10289/2403.
Karim, Ehsanul, Sri Phani Venkata Siva Krishna Madani, and Feng Yun. "Fuzzy Clustering Analysis." Thesis, Blekinge Tekniska Högskola, Sektionen för ingenjörsvetenskap, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2165.
Parker, Jonathon Karl. "Accelerated Fuzzy Clustering." Scholar Commons, 2013. http://scholarcommons.usf.edu/etd/4929.
Shih, Benjamin. "Target Sequence Clustering." Research Showcase @ CMU, 2011. http://repository.cmu.edu/dissertations/177.
Afsarmanesh, Tehrani Nazanin. "Clustering Multilayer Networks." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-279745.
Oshiro, Marcio Takashi Iura. "Clustering de trajetórias." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-29102015-142559/.
This work aimed to study kinetic clustering problems, i.e., clustering problems in which the objects are moving. The study focused on the unidimensional case, where the objects are points moving on the real line. Several variants of this case are discussed. Regarding the movement, we consider the case where each point moves at a constant velocity in a given time interval, the case where the points move arbitrarily and we only know their positions at discrete time instants, the case where the points move at a random velocity of which only the expected value is known, and the case where, given a partition of the time interval, the points move at constant velocities in each sub-interval. Regarding the kind of clustering sought, we focused on the case where the number of clusters is part of the input of the problem, and we consider different measures of quality for the clustering. Two of them are traditional measures for clustering problems: the sum of the cluster diameters and the maximum diameter of a cluster. The third measure takes into account the kinetic character of the problem and allows, in a controlled manner, a cluster to change over time. For each of the variants of the problem, we present exact or approximation algorithms, some complexity results, and open questions.
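The two traditional quality measures named in the abstract above can be illustrated on a single static snapshot of points on the real line. The following is a minimal sketch, not the thesis's kinetic algorithms: it assumes fixed positions and relies on the standard observation that, on a line, the sum of cluster diameters is minimised by cutting the sorted points at the k − 1 largest gaps. The function names `min_sum_of_diameters` and `diameter` are hypothetical.

```python
def diameter(cluster):
    """Diameter of a sorted, non-empty 1-D cluster: largest minus smallest point."""
    return cluster[-1] - cluster[0]


def min_sum_of_diameters(points, k):
    """Split static points on a line into at most k clusters.

    Illustrative assumption: positions are fixed (one time instant).
    Cutting at the k - 1 largest gaps between consecutive sorted points
    minimises the sum of cluster diameters.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    xs = sorted(points)
    if k >= len(xs):
        return [[x] for x in xs]
    # Indices i such that the gap between xs[i] and xs[i + 1] is among the largest.
    largest_gaps = sorted(
        range(len(xs) - 1), key=lambda i: xs[i + 1] - xs[i], reverse=True
    )[: k - 1]
    clusters, start = [], 0
    for cut in sorted(largest_gaps):
        clusters.append(xs[start : cut + 1])
        start = cut + 1
    clusters.append(xs[start:])
    return clusters


clusters = min_sum_of_diameters([0, 1, 2, 10, 11, 20], 3)
sum_of_diameters = sum(diameter(c) for c in clusters)
max_diameter = max(diameter(c) for c in clusters)
```

Both measures are computed from the same partition here only for illustration; a partition that minimises the sum of diameters does not in general minimise the maximum diameter.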
Alqurashi, Tahani. "Clustering ensemble method." Thesis, University of East Anglia, 2017. https://ueaeprints.uea.ac.uk/62679/.
Pahmp, Oliver. "N-sphere Clustering." Thesis, Umeå universitet, Statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-172387.
Tadepalli, Sriram Satish. "Schemas of Clustering." Diss., Virginia Tech, 2009. http://hdl.handle.net/10919/26261.
Ph. D.
Loganathan, Satish Kumar. "Distributed Hierarchical Clustering." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1544001912266574.
Al-Razgan, Muna Saleh. "Weighted clustering ensembles." Fairfax, VA : George Mason University, 2008. http://hdl.handle.net/1920/3212.
Vita: p. 134. Thesis director: Carlotta Domeniconi. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Information Technology. Title from PDF t.p. (viewed Oct. 14, 2008). Includes bibliographical references (p. 128-133). Also issued in print.
Xu, Tianbing. "Nonparametric evolutionary clustering." Diss., Online access via UMI:, 2009.
Zhong, Wei. "Clustering System and Clustering Support Vector Machine for Local Protein Structure Prediction." Digital Archive @ GSU, 2006. http://digitalarchive.gsu.edu/cs_diss/7.
Hoffmann, Kai Delf. "Cosmology with galaxy clustering." Doctoral thesis, Universitat Autònoma de Barcelona, 2015. http://hdl.handle.net/10803/297700.
For constraining cosmological models via the growth of large-scale matter fluctuations it is important to understand how the observed galaxies trace the full matter density field. The relation between the density fields of matter and galaxies is often approximated by a second-order expansion of a so-called bias function. The freedom of the parameters in the bias function weakens cosmological constraints from observations. In this thesis we study two methods for determining the bias parameters independently from the growth. Our analysis is based on the matter field from the large MICE Grand Challenge simulation. Haloes, identified in this simulation, are associated with galaxies. The first method is to measure the bias parameters directly from third-order statistics of the halo and matter distributions. The second method is to predict them from the abundance of haloes as a function of halo mass (hereafter referred to as mass function). Our bias estimations from third-order statistics are based on three-point auto- and cross-correlations of halo and matter density fields in three-dimensional configuration space. Using three-point auto-correlations and a local quadratic bias model we find a ∼20% overestimation of the linear bias parameter with respect to the reference from two-point correlations. This deviation can originate from ignoring non-local and higher-order contributions to the bias function, as well as from systematics in the measurements. The effect of such inaccuracies in the bias estimations on growth measurements is comparable with errors in our measurements, coming from sampling variance and noise. We also present a new method for measuring the growth which does not require a model for the dark matter three-point correlation. Results from both approaches are in good agreement with predictions.
By combining three-point auto- and cross-correlations one can either measure the linear bias without being affected by quadratic (local or non-local) terms in the bias functions or one can isolate such terms and compare them to predictions. Our linear bias measurements from such combinations are in very good agreement with the reference linear bias. The comparison of the non-local contributions with predictions reveals a strong scale dependence of the measurements with significant deviations from the predictions, even at very large scales. Our second approach for obtaining the bias parameters consists of predictions derived from the mass function via the peak-background split approach. We find significant 5−10% deviations between these predictions and the reference from two-point clustering. These deviations can only partly be explained with systematics affecting the bias predictions, coming from the halo mass function binning, the mass function error estimation and the mass function parameterisation from which the bias predictions are derived. Studying the mass function we find unifying relations between different mass function parameterisations. Furthermore, we find that the standard Jack-Knife method overestimates the mass function error covariance in the low mass range. We explain these deviations and present a new improved covariance estimator.
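For orientation, the "local quadratic bias model" and the "reference from two-point correlations" mentioned in the abstract above are commonly written as follows. This is a sketch in one standard notation, which is an assumption here and may differ from the thesis: $\delta_h$ and $\delta_m$ denote the halo and matter density contrasts, $b_1$ and $b_2$ the linear and quadratic bias parameters, and $\xi_{hh}$, $\xi_{mm}$ the halo and matter two-point correlation functions.

```latex
% Local quadratic bias expansion (second order in the matter density contrast);
% the subtracted mean keeps the average halo density contrast at zero.
\delta_h(\mathbf{x}) \simeq b_1\,\delta_m(\mathbf{x})
  + \frac{b_2}{2}\left[\delta_m^2(\mathbf{x}) - \langle \delta_m^2 \rangle\right]

% At leading order on large scales, two-point clustering fixes the linear bias:
\xi_{hh}(r) \simeq b_1^2\,\xi_{mm}(r)
```

In this form, the roughly 20% overestimation quoted in the abstract refers to $b_1$ inferred from three-point auto-correlations relative to $b_1$ inferred from the two-point relation.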
Batet, Sanromà Montserrat. "Ontology based semantic clustering." Doctoral thesis, Universitat Rovira i Virgili, 2011. http://hdl.handle.net/10803/31913.
Clustering algorithms have focused on the management of numerical and categorical data. However, in recent years, textual information has grown in importance. Proper processing of this kind of information within data mining methods requires an interpretation of its meaning at a semantic level. In this work, a clustering method that aims to interpret numerical, categorical and textual data in an integrated manner is presented. Textual data are interpreted by means of semantic similarity measures. These measures calculate the alikeness between words by exploiting one or several knowledge sources. In this work we also propose two new ways of computing semantic similarity, based on 1) the exploitation of the taxonomical knowledge available in one or several ontologies and 2) the estimation of the information distribution of terms in the Web. Results show that a proper interpretation of textual data at a semantic level improves clustering results and eases the interpretability of the classifications.
Chang, Soong Uk. "Clustering with mixed variables /." [St. Lucia, Qld.], 2005. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe19086.pdf.
Galåen, Magnus. "Dokument-klynging (document clustering)." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2008. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-8868.
As document searching becomes more and more important with the rapid growth of document bases today, document clustering also becomes more important. Some of the most commonly used document clustering algorithms today are purely statistical in nature. Other algorithms have emerged, addressing some of the issues with numerical algorithms and claiming to be better. This thesis compares two well-known algorithms: Elliptic K-Means and Suffix Tree Clustering. They are compared in speed and quality, and it is shown that Elliptic K-Means performs better in speed, while Suffix Tree Clustering (STC) performs better in quality. It is further shown that STC performs better using small portions of relevant text (snippets) on real web data than using the full document. It is also shown that a threshold value for base cluster merging is unnecessary. As STC is shown to perform adequately in speed when running on snippets only, it is concluded that STC is the better algorithm for the purpose of search results clustering.
Buchta, Christian, Martin Kober, Ingo Feinerer, and Kurt Hornik. "Spherical k-Means Clustering." American Statistical Association, 2012. http://epub.wu.ac.at/4000/1/paper.pdf.
Cole, Rowena Marie. "Clustering with genetic algorithms." University of Western Australia. Dept. of Computer Science, 1998. http://theses.library.uwa.edu.au/adt-WU2003.0008.
Hou, Jean Fen-ju. "Clustering with obstacle entities." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0023/MQ51360.pdf.
Tzerpos, Vassilios. "Comprehension-driven software clustering." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/NQ63614.pdf.
Shortreed, Susan. "Learning in spectral clustering /." Thesis, Connect to this title online; UW restricted, 2006. http://hdl.handle.net/1773/8977.
Sheth, Ravi Kiran. "Gravitational clustering of galaxies." Thesis, University of Cambridge, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.320096.
Stratton, R. A. "Clustering in light nuclei." Thesis, University of Oxford, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.355812.
Bielby, Richard. "Galaxy clustering and feedback." Thesis, Durham University, 2008. http://etheses.dur.ac.uk/2344/.
Smith, Robert James. "QSO clustering and environments." Thesis, University of Cambridge, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.624809.
Isheden, Gabriel. "Bayesian Hierarchic Sample Clustering." Thesis, KTH, Matematik (Inst.), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-168316.
This report presents a new algorithm for hierarchical clustering, Bayesian Sample Clustering (BSC). BSC is a single-linkage algorithm that uses samples of data to create a predictive distribution for each sample. The predictive distributions are compared using the Chan-Darwiche distance, a metric over finite probability distributions, which makes it possible to build a hierarchy of clusters. An implementation of BSC is available at https://github.com/Skjulet/Bayesian Sample Clustering.
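The Chan-Darwiche distance named in the abstract above is usually defined, for two distributions P and Q over the same finite set of outcomes, as ln max(Q/P) − ln min(Q/P), with 0/0 taken as 1 and the distance infinite when the supports differ. The following is a minimal sketch of that definition only; it is not taken from the BSC repository, and the function name `chan_darwiche_distance` is hypothetical.

```python
import math


def chan_darwiche_distance(p, q):
    """Chan-Darwiche distance between two finite distributions.

    p and q are sequences of probabilities over the same outcomes, in the
    same order. Returns ln(max_i q_i/p_i) - ln(min_i q_i/p_i).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    ratios = []
    for p_i, q_i in zip(p, q):
        if p_i == 0 and q_i == 0:
            ratios.append(1.0)  # convention: 0/0 is treated as 1
        elif p_i == 0 or q_i == 0:
            return math.inf  # supports differ, so the distance is infinite
        else:
            ratios.append(q_i / p_i)
    if not ratios:
        return 0.0
    return math.log(max(ratios)) - math.log(min(ratios))


d = chan_darwiche_distance([0.5, 0.5], [0.25, 0.75])  # equals ln 3 up to rounding
```

Because the distance is symmetric and satisfies the triangle inequality, it can serve as the pairwise dissimilarity in a single-linkage procedure of the kind the abstract describes.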
Hahmann, Martin. "Feedback-Driven Data Clustering." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2014. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-135647.
Full textAl-Harbi, Sami. "Clustering in metric spaces." Thesis, University of East Anglia, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.396604.
Full textZhou, Hanson M. (Hanson Mi) 1977. "Clustering via matrix exponentiation." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/17671.
Full textIncludes bibliographical references (leaves 26-27).
Given a set of n points with a matrix of pairwise similarity measures, one would like to partition the points into clusters so that similar points are together and different ones apart. We present an algorithm requiring only matrix exponentiation that performs well in practice and bears an elegant interpretation in terms of random walks on a graph. Under a certain mixture model involving planting a partition via randomized rounding of tailored matrix entries, the algorithm can be proven effective for only a single squaring. It is shown that the clustering performance of the algorithm degrades with larger values of the exponent, thus revealing that a single squaring is optimal.
by Hanson M. Zhou.
S.M.
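The random-walk reading of Zhou's abstract above can be sketched as follows. This is an illustrative assumption, not the thesis's algorithm: the similarity matrix is row-normalised into a transition matrix, squared once, and two points are linked when each reaches the other in two steps with probability at least a chosen threshold; clusters are the connected groups of linked points. The names `cluster_by_squaring` and `threshold`, and the linking rule itself, are hypothetical.

```python
def row_normalize(sim):
    """Turn a non-negative similarity matrix into a random-walk transition matrix."""
    return [[value / sum(row) for value in row] for row in sim]


def mat_mul(a, b):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][x] * b[x][j] for x in range(n)) for j in range(n)]
        for i in range(n)
    ]


def cluster_by_squaring(sim, threshold):
    """Label points by linking pairs with high two-step walk probability.

    Assumption for illustration: i and j are linked when both two-step
    probabilities (i to j and j to i) are at least `threshold`.
    """
    p = row_normalize(sim)
    p2 = mat_mul(p, p)  # a single squaring of the transition matrix
    n = len(sim)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # flood-fill over the thresholded link graph
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and p2[i][j] >= threshold and p2[j][i] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
labels = cluster_by_squaring(similarity, threshold=0.25)
```

The threshold of 0.25 used here is simply 1/n for n = 4 points, i.e. the two-step probability a uniform walk would give; any other cut-off is an equally arbitrary choice in this sketch.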
Bouvrie, Jacob V. "Multi-source contingency clustering." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/33122.
Includes bibliographical references (p. 93-96).
This thesis examines the problem of clustering multiple, related sets of data simultaneously. Given datasets which are in some way connected (e.g. temporally) but which do not necessarily share label compatibility, we exploit co-occurrence information in the form of normalized multidimensional contingency tables in order to recover robust mappings between data points and clusters for each of the individual data sources. We outline a unifying formalism by which one might approach cross-channel clustering problems, and begin by defining an information-theoretic objective function that is small when the clustering can be expected to be good. We then propose and explore several multi-source algorithms for optimizing this and other relevant objective functions, borrowing ideas from both continuous and discrete optimization methods. More specifically, we adapt gradient-based techniques, simulated annealing, and spectral clustering to the multi-source clustering problem. Finally, we apply the proposed algorithms to a multi-source human identification task, where the overall goal is to cluster grayscale face images according to identity, using additional temporally connected features. It is our hope that the proposed multi-source clustering framework can ultimately shed light on the problem of when and how models might be automatically created to account for, and adapt to, novel individuals as a surveillance/recognition system accumulates sensory experience.
by Jacob V. Bouvrie.
M.Eng.
Bădoiu, Mihai 1978. "Clustering in high dimensions." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/87376.
Includes bibliographical references (p. 47-48).
by Mihai Bădoiu.
M.Eng. and S.B.
Dimitriadou, Evgenia, Andreas Weingessel, and Kurt Hornik. "Fuzzy voting in clustering." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/742/1/document.pdf.
Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
Madureira, Erikson Manuel Geraldo Vieira de. "Análise de mercado : clustering." Master's thesis, Instituto Superior de Economia e Gestão, 2016. http://hdl.handle.net/10400.5/13122.
This report describes the activities carried out during an internship at the company Quidgest. Because the company needed to study its various business areas, it was decided to extract and identify the information held in the company's database. To this end, a process known in data analysis as Knowledge Discovery in Databases (KDD) was used. The biggest challenge in using this process was the large volume of information accumulated by the company, which grew more rapidly from 2013 onwards. Of the phases of the KDD process, the most relevant is data mining, in which the characterizing variables required for the analysis are studied. Cluster analysis was chosen as the data mining technique so that the analysis would be efficient and effective and would yield results that are easy to read. After the KDD process had been developed, it was decided that the data mining phase could be automated to facilitate future analyses carried out by the company. To automate this phase, cluster analysis techniques were used and a user-centered program was developed in VBA/Excel. The program was tested on a concrete case from the company: determining which current customers contributed most to the company's growth in the years 2013 to 2015. Applying the program to this case produced results and information that were analyzed and interpreted.