Dissertations / Theses on the topic 'Cluster clustering'


1

Dimitriadou, Evgenia, Andreas Weingessel, and Kurt Hornik. "A voting-merging clustering algorithm." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/94/1/document.pdf.

Abstract:
In this paper we propose an unsupervised voting-merging scheme that is capable of clustering data sets and of finding the number of clusters they contain. The voting part of the algorithm allows us to combine several runs of clustering algorithms into a common partition. This helps us to overcome instabilities of the clustering algorithms and to improve the ability to find structures in a data set. Moreover, we develop a strategy to understand, analyze and interpret these results. In the second part of the scheme, a merging procedure starts on the clusters resulting from voting, in order to find the number of clusters in the data set.
Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
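The idea of combining several clustering runs into a common partition can be illustrated with a small sketch. This is not the authors' voting-merging algorithm; it is a simpler co-association vote, assumed here for illustration only: two points end up together if they share a cluster in more than a chosen fraction of runs. The function names and the `threshold` parameter are hypothetical.

```python
def co_association(partitions):
    """Fraction of runs in which each pair of points shares a cluster."""
    n = len(partitions[0])
    runs = len(partitions)
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / runs for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Link points whose co-association exceeds threshold, then take
    connected components. The number of clusters is not fixed in advance."""
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three runs over five points; label names differ between runs on purpose.
runs = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
print(consensus_clusters(runs))  # [0, 0, 1, 1, 1]
```

Because the vote only looks at whether two points share a label, it is unaffected by the arbitrary naming of clusters in each run.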
2

Al-Razgan, Muna Saleh. "Weighted clustering ensembles." Fairfax, VA : George Mason University, 2008. http://hdl.handle.net/1920/3212.

Abstract:
Thesis (Ph.D.)--George Mason University, 2008.
Vita: p. 134. Thesis director: Carlotta Domeniconi. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Information Technology. Title from PDF t.p. (viewed Oct. 14, 2008). Includes bibliographical references (p. 128-133). Also issued in print.
3

Gaertler, Marco. "Clustering with spectral methods." [S.l. : s.n.], 2002. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB10101213.

4

Ptitsyn, Andrey. "New algorithms for EST clustering." Thesis, University of the Western Cape, 2000. http://etd.uwc.ac.za/index.php?module=etd&amp.

Abstract:
The expressed sequence tag (EST) database is a rich and fast-growing source of data for gene expression analysis and drug discovery. Clustering of raw EST data is a necessary step for further analysis and one of the most challenging problems of modern computational biology.
5

Koepke, Hoyt Adam. "Bayesian cluster validation." Thesis, University of British Columbia, 2008. http://hdl.handle.net/2429/1496.

Abstract:
We propose a novel framework based on Bayesian principles for validating clusterings and present efficient algorithms for use with centroid- or exemplar-based clustering solutions. Our framework treats the data as fixed and introduces perturbations into the clustering procedure. In our algorithms, we scale the distances between points by a random variable whose distribution is tuned against a baseline null dataset. The random variable is integrated out, yielding a soft assignment matrix that gives the behavior under perturbation of the points relative to each of the clusters. From this soft assignment matrix, we are able to visualize inter-cluster behavior, rank clusters, and give a scalar index of the clustering stability. In a large test on synthetic data, our method matches or outperforms other leading methods at predicting the correct number of clusters. We also present a theoretical analysis of our approach, which suggests that it is useful for high-dimensional data.
6

Tittley, Eric Robert. "Hierarchical clustering and galaxy cluster scaling laws." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape9/PQDD_0008/NQ40291.pdf.

7

Kuah, Adrian T. H. "Determinants of clustering, cluster growth and performance." Thesis, University of Manchester, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.629921.

Abstract:
Previous studies on industry clusters argue that there are benefits and externalities that only incumbents enjoy. Many studies focus on manufacturing and high-technology clusters with less importance placed on services clusters. There are even fewer studies considering financial agglomerations as clusters that could create national competitive advantage. This thesis investigates the determinants of clustering by focusing on the UK and Singapore financial sectors. It adopts an approach that concentrates on pure agglomerations - the importance of clientele, suppliers, factor conditions, rivalry, and how the agglomerations of competing and related industries influence incumbents' growth and performance. The thesis combines two cluster models (Porter, 1990 and Swann et al., 1998) that contribute to an understanding of how, what and why certain determinants lead to better performance of firms in a cluster. It contributes to the existing literature by demonstrating that Porter's generic model is applicable to an international financial cluster and to another cluster in a smaller and open economy through the case studies of the London and Singapore financial centres. It advances knowledge on cluster theories by affirming Swann's lifetime growth model in the UK and Singapore financial clusters through econometric analysis, and extending the model to consider the causal and significant effects of cluster strengths in influencing firm performance and the growth lifecycle of some 17,000 firms in the UK and Singapore. The thesis highlights the importance of factor and demand conditions in a cluster, and at the same time, the role that competitors and related industries play in enhancing the competitive advantage of the location. It illuminates the need for strong agglomerations of competing and related industries in a regional cluster to enhance incumbents' growth and actual financial performance. Taken together, the two cluster models provide evidence that firm growth and actual financial performance improve the more financial services firms cluster together, and show how certain clustering conditions help incumbents attain their competitive advantage.
8

Shortreed, Susan. "Learning in spectral clustering /." Thesis, Connect to this title online; UW restricted, 2006. http://hdl.handle.net/1773/8977.

9

Chan, Alton Kam Fai. "Hyperplane based efficient clustering and searching /." View abstract or full-text, 2003. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202003%20CHANA.

10

Madureira, Erikson Manuel Geraldo Vieira de. "Análise de mercado : clustering." Master's thesis, Instituto Superior de Economia e Gestão, 2016. http://hdl.handle.net/10400.5/13122.

Abstract:
Master's in Economic and Business Decision-Making
This work describes the activities carried out during an internship at the company Quidgest. Because the company needed to study its various business areas, it was decided to extract and identify the information held in the company's database. To this end, a process known in data analysis as Knowledge Discovery in Databases (KDD) was used. The biggest challenge in using this process was the large volume of information accumulated by the company, which has grown faster since 2013. Of the phases of the KDD process, the most relevant is data mining, in which the characterizing variables required for the analysis are studied. Cluster analysis was chosen as the data mining technique so that the analysis would be efficient and effective and would produce results that are easy to read. After the KDD process was developed, it was decided that the data mining phase could be automated to facilitate future analyses carried out by the company. To automate this phase, cluster analysis techniques were used and a user-centered program was developed in VBA/Excel. The program was tested on a specific case from the company: determining which current customers contributed most to the company's evolution during the years 2013 to 2015. Applying the program to this case yielded results and information that were then analyzed and interpreted.
11

Gupta, Pramod. "Robust clustering algorithms." Thesis, Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/39553.

Abstract:
One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields, ranging from computational biology to social sciences to computer vision, in part because they are simple and their output is easy to interpret. However, many of these algorithms lack any performance guarantees when the data is noisy, incomplete or has outliers, which is the case for most real-world data. It is well known that standard linkage algorithms perform extremely poorly in the presence of noise. In this work we propose two new robust algorithms for bottom-up agglomerative clustering and give formal theoretical guarantees for their robustness. We show that our algorithms can be used to cluster accurately in cases where the data satisfies a number of natural properties and where the traditional agglomerative algorithms fail. We also extend our algorithms to an inductive setting with similar guarantees, in which we randomly choose a small subset of points from a much larger instance space, generate a hierarchy over this sample, and then insert the rest of the points into it to generate a hierarchy over the entire instance space. We then carry out a systematic experimental analysis of various linkage algorithms, compare their performance on a variety of real-world data sets, and show that our algorithms handle various forms of noise much better than other hierarchical algorithms.
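The sensitivity of standard linkage to outliers can be seen in a few lines. The sketch below is plain single-linkage agglomerative clustering on 1-D integers, chosen here only as a familiar baseline; it is not one of the robust algorithms proposed in the thesis, and the function name and data are illustrative assumptions.

```python
def single_linkage(points, k):
    """Bottom-up clustering: repeatedly merge the two closest clusters
    (closest pair of members) until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(abs(x - y) for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


clean = [0, 1, 2, 50, 51, 52]
print(single_linkage(clean, 2))          # [[0, 1, 2], [50, 51, 52]]

# One far-away outlier is enough to change the answer for k = 2:
# the outlier keeps its own cluster and the two real groups are merged.
print(single_linkage(clean + [500], 2))  # [[0, 1, 2, 50, 51, 52], [500]]
```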
12

Zhang, Yiqun. "Advances in categorical data clustering." HKBU Institutional Repository, 2019. https://repository.hkbu.edu.hk/etd_oa/658.

Abstract:
Categorical data are common in various research areas, and clustering is a prevalent technique used to analyse them. However, two challenging problems are encountered in categorical data clustering analysis. The first is that most categorical data distance metrics were actually proposed for nominal data (i.e., a categorical data set that comprises only nominal attributes), ignoring the fact that ordinal attributes are also common in various categorical data sets. As a result, these nominal data distance metrics cannot account for the order information of ordinal attributes and may thus inappropriately measure the distances for ordinal data (i.e., a categorical data set that comprises only ordinal attributes) and mixed categorical data (i.e., a categorical data set that comprises both ordinal and nominal attributes). The second problem is that most hierarchical clustering approaches were actually designed for numerical data and have very high computation costs; that is, with time complexity O(N^2) for a data set with N data objects. These issues have presented huge obstacles to the clustering analysis of categorical data. To address the ordinal data distance measurement problem, we studied the characteristics of ordered possible values (also called 'categories' interchangeably in this thesis) of ordinal attributes and propose a novel ordinal data distance metric, which we call the Entropy-Based Distance Metric (EBDM), to quantify the distances between ordinal categories. The EBDM adopts cumulative entropy as a measure to indicate the amount of information in the ordinal categories and simulates the thinking process of changing one's mind between two ordered choices to quantify the distances according to the amount of information in the ordinal categories. The order relationship and the statistical information of the ordinal categories are both considered by the EBDM for more appropriate distance measurement. Experimental results illustrate the superiority of the proposed EBDM in ordinal data clustering. In addition to designing an ordinal data distance metric, we further propose a unified categorical data distance metric that is suitable for distance measurement of all three types of categorical data (i.e., ordinal data, nominal data, and mixed categorical data). The extended version uniformly defines distances and attribute weights for both ordinal and nominal attributes, by which the distances measured for the two types of attributes of a mixed categorical data set can be directly combined to obtain the overall distances between data objects with no information loss. Extensive experiments on all three types of categorical data sets demonstrate the effectiveness of the unified distance metric in clustering analysis of categorical data. To address the hierarchical clustering problem of large-scale categorical data, we propose a fast hierarchical clustering framework called the Growing Multi-layer Topology Training (GMTT). The most significant merit of this framework is its ability to reduce the time complexity of most existing hierarchical clustering frameworks (i.e., O(N^2)) to O(N^1.5) without sacrificing the quality (i.e., clustering accuracy and hierarchical details) of the constructed hierarchy. According to our design, the GMTT framework is applicable to categorical data clustering simply by adopting a categorical data distance metric. To make the GMTT framework suitable for the processing of streaming categorical data, we also provide an incremental version of GMTT that can dynamically adopt new inputs into the hierarchy via local updating. Theoretical analysis proves that the GMTT frameworks have time complexity O(N^1.5). Extensive experiments show the efficacy of the GMTT frameworks and demonstrate that they achieve more competitive categorical data clustering performance by adopting the proposed unified distance metric.
13

Cole, Rowena Marie. "Clustering with genetic algorithms." University of Western Australia. Dept. of Computer Science, 1998. http://theses.library.uwa.edu.au/adt-WU2003.0008.

Abstract:
Clustering is the search for those partitions that reflect the structure of an object set. Traditional clustering algorithms search only a small sub-set of all possible clusterings (the solution space) and consequently, there is no guarantee that the solution found will be optimal. We report here on the application of Genetic Algorithms (GAs) -- stochastic search algorithms touted as effective search methods for large and complex spaces -- to the problem of clustering. GAs which have been made applicable to the problem of clustering (by adapting the representation, fitness function, and developing suitable evolutionary operators) are known as Genetic Clustering Algorithms (GCAs). There are two parts to our investigation of GCAs: first we look at clustering into a given number of clusters. The performance of GCAs on three generated data sets, analysed using 4320 differing combinations of adaptions, establishes their efficacy. Choice of adaptions and parameter settings is data set dependent, but comparison between results using generated and real data sets indicates that performance is consistent for similar data sets with the same number of objects, clusters, attributes, and a similar distribution of objects. Generally, group-number representations are better suited to the clustering problem, as are dynamic scaling, elite selection and high mutation rates. Independent generalised models fitted to the correctness and timing results for each of the generated data sets produced accurate predictions of the performance of GCAs on similar real data sets. While GCAs can be successfully adapted to clustering, and the method produces results as accurate and correct as traditional methods, our findings indicate that, given a criterion based on simple distance metrics, GCAs provide no advantages over traditional methods. Second, we investigate the potential of genetic algorithms for the more general clustering problem, where the number of clusters is unknown. We show that only simple modifications to the adapted GCAs are needed. We have developed a merging operator, which with elite selection, is employed to evolve an initial population with a large number of clusters toward better clusterings. With regard to accuracy and correctness, these GCAs are more successful than optimisation methods such as simulated annealing. However, such GCAs can become trapped in local minima in the same manner as traditional hierarchical methods. Such trapping is characterised by the situation where good (k-1)-clusterings do not result from our merge operator acting on good k-clusterings. A marked improvement in the algorithm is observed with the addition of a local heuristic.
14

Leisch, Friedrich. "Bagged clustering." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/1272/1/document.pdf.

Abstract:
A new ensemble method for cluster analysis is introduced, which can be interpreted in two different ways: As complexity-reducing preprocessing stage for hierarchical clustering and as combination procedure for several partitioning results. The basic idea is to locate and combine structurally stable cluster centers and/or prototypes. Random effects of the training set are reduced by repeatedly training on resampled sets (bootstrap samples). We discuss the algorithm both from a more theoretical and an applied point of view and demonstrate it on several data sets. (author's abstract)
Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
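The steps named in the abstract (resample, run a base partitioning method, pool the resulting centers, then combine them) can be sketched as follows. This is a toy 1-D illustration under stated assumptions, not the paper's implementation: `kmeans_1d` is a minimal stand-in for the base method, and `split_at_largest_gap` replaces the hierarchical step with the two-cluster cut that single linkage would give on sorted 1-D values. All names, the data, and the parameter values are hypothetical.

```python
import random


def kmeans_1d(data, k, rng, iters=20):
    """Plain Lloyd iterations on 1-D data; stands in for any base method."""
    centers = rng.sample(data, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[nearest].append(x)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers


def bagged_centers(data, k, n_boot, seed=0):
    """Run the base method on n_boot bootstrap samples and pool all centers."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]  # draw with replacement
        pooled.extend(kmeans_1d(sample, k, rng))
    return sorted(pooled)


def split_at_largest_gap(sorted_values):
    """Cut sorted 1-D values into two groups at the widest gap."""
    gaps = [b - a for a, b in zip(sorted_values, sorted_values[1:])]
    cut = gaps.index(max(gaps)) + 1
    return sorted_values[:cut], sorted_values[cut:]


data = [0.0, 0.5, 1.0, 1.5, 9.0, 9.5, 10.0, 10.5]
centers = bagged_centers(data, k=3, n_boot=10)
low, high = split_at_largest_gap(centers)
print(len(centers), "pooled centers;", len(low), "low,", len(high), "high")
```

Using more centers per run than the expected number of clusters (here `k=3` for two groups) is a deliberate choice in this sketch: the combination step, not the base method, decides the final grouping.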
15

Tantrum, Jeremy. "Model based and hybrid clustering of large datasets /." Thesis, Connect to this title online; UW restricted, 2003. http://hdl.handle.net/1773/8933.

16

Cui, Yingjie. "A study on privacy-preserving clustering." Click to view the E-thesis via HKUTO, 2009. http://sunzi.lib.hku.hk/hkuto/record/B4357225X.

17

Kübler, Bernhard Christian. "Risk classification by means of clustering." Frankfurt, M. Berlin Bern Bruxelles New York, NY Oxford Wien Lang, 2009. http://d-nb.info/998737291/04.

18

McClelland, Robyn L. "Regression based variable clustering for data reduction /." Thesis, Connect to this title online; UW restricted, 2000. http://hdl.handle.net/1773/9611.

19

Lee, King-for Foris. "Clustering uncertain data using Voronoi diagram." Click to view the E-thesis via HKUTO, 2009. http://sunzi.lib.hku.hk/hkuto/record/B43224131.

20

Speer, Nora. "Funktionelles Clustering von Genen mit der Gene Ontology /." Berlin : Logos-Verl, 2006. http://deposit.d-nb.de/cgi-bin/dokserv?id=2875270&prov=M&dok_var=1&dok_ext=htm.

21

Dimitriadou, Evgenia, Andreas Weingessel, and Kurt Hornik. "Fuzzy voting in clustering." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 1999. http://epub.wu.ac.at/742/1/document.pdf.

Abstract:
In this paper we present a fuzzy voting scheme for cluster algorithms. This fuzzy voting method allows us to combine several runs of cluster algorithms resulting in a common fuzzy partition. This helps us to overcome instabilities of the cluster algorithms and results in a better clustering.
Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
22

Zhou, Hong. "Visual clustering in parallel coordinates and graphs /." View abstract or full-text, 2009. http://library.ust.hk/cgi/db/thesis.pl?CSED%202009%20ZHOU.

23

Pourvali, Mohsen. "Improving the quality of text clustering and cluster labeling." Doctoral thesis, Università Ca' Foscari Venezia, 2016. http://hdl.handle.net/10579/10311.

Abstract:
The abundance of available electronic information is rapidly increasing with the advancements in digital processing. Furthermore, huge amounts of textual data have given rise to the need for efficient techniques that can organize the data in manageable forms. In order to tackle this challenge, clustering algorithms try to group similar documents automatically. While clustering plays a significant role in helping to categorize documents, it has intrinsic limits when it comes to allowing human users to understand the content of documents at a deeper level. This is where cluster labeling techniques come in. The goal of cluster labeling is to label - i.e., describe in an informative way - clusters of documents according to their content. Document clustering and cluster labeling are two vital problems in the information retrieval domain because of their ability to organize increasing amounts of text and describe such huge amounts in a concise way. In this thesis, we have addressed these problems in three parts. In the first part, we investigate how we can improve the effectiveness of text clustering by summarizing some documents in a corpus, specifically the ones that are significantly longer than the mean. The contribution in this part is twofold. First, we show that text summarization can improve the performance of classical text clustering algorithms, in particular, by reducing noise coming from long documents that can negatively affect clustering results. Moreover, we show that the clustering quality can be used to quantitatively evaluate different summarization methods. In the second part, we explore a multi-strategy technique that aims at enriching documents for improving clustering quality. Specifically, we use a combination of entity linking and document summarization to determine the identity of the most salient entities mentioned in texts. We further investigate ensemble clustering in order to combine multiple clustering results, generated based on the combination of the specific set of features, into a single result of better quality. In the third part, we investigate the problem of cluster labeling, whose quality depends on the quality of document clustering. To this end, we first explore and categorize cluster labeling techniques, providing a thorough discussion of the relevant state-of-the-art literature. We then present a fusion-based topic modeling approach to enrich the document vectors of a corpus with the aim of improving the quality of text clustering. We further exploit such vectors through a fusion method for cluster labeling. Finally, we experimentally demonstrate the effectiveness of our solutions, explained in three parts, in the clustering and cluster labeling problems with various datasets.
24

Wang, Dali. "Adaptive Double Self-Organizing Map for Clustering Gene Expression Data." Fogler Library, University of Maine, 2003. http://www.library.umaine.edu/theses/pdf/WangD2003.pdf.

25

Cui, Yingjie, and 崔英杰. "A study on privacy-preserving clustering." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://hub.hku.hk/bib/B4357225X.

26

Xiong, Yimin. "Time series clustering using ARMA models /." View abstract or full-text, 2004. http://library.ust.hk/cgi/db/thesis.pl?COMP%202004%20XIONG.

Abstract:
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2004.
Includes bibliographical references (leaves 49-55). Also available in electronic version. Access restricted to campus users.
27

Silva, Gustavo Girão Barreto da. "Resource-aware clustering design for NoC-based MPSoCs." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2014. http://hdl.handle.net/10183/95984.

Abstract:
The multicore paradigm is now a firmly established trend, including in the field of embedded systems. The degree of parallelism provided by such architectures has been the foundation of performance advancements in the field as well as of power and energy savings. However, obtaining efficient parallelism from such an architecture is not an easy task. Developers have therefore proposed several models of programming environments that try to provide as much transparency as possible. On the hardware side, the increasing number of on-chip components creates a management issue to be handled. In the context of this complex scenario, this thesis proposes the use of resource management approaches to improve the efficiency, regarding both performance and energy consumption, of MPSoC environments at different levels. These approaches have in common the notion of clustering, which tries to logically aggregate resources according to application demands. First, at the processor/application level, we propose a dynamically adaptable hardware to support distinct parallel programming models at no computational overhead, since the entire process is completely transparent to the programmer. Also in this environment, where distinct applications can be executed, we propose a resource-aware scheduling mechanism to improve performance named Processor Clustering. We propose four different resource mapping policies that leverage distinct aspects of the parallel nature of the applications and of the architectural constraints of the system. However, some applications have higher memory demands than computational demands. Therefore, a similar approach can be used at the level of the memory hierarchy. In this case, we aim at redistributing memory resources according to application demands. We explore memory redistribution at both design time and runtime and propose a distribution mapping mechanism based on the number of off-chip memory access requests. Finally, we propose a resource-aware fault-tolerance mechanism for distributed on-chip memories in NoCs. We introduce a Reliability Clustering model that leverages the NoC infrastructure. In this case, the routers have knowledge of faulty blocks and redundant blocks and, based on that knowledge, the mechanism is able to avoid high memory access latencies.
28

Woo, Kam Tim. "Applications of clustering techniques on communication systems /." View abstract or full-text, 2004. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202004%20WOO.

29

Butchart, Kate. "Hierarchical clustering using dynamic self organising neural networks." Thesis, University of Hertfordshire, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.338383.

30

Correia, Maria Inês Costa. "Cluster analysis of financial time series." Master's thesis, Instituto Superior de Economia e Gestão, 2020. http://hdl.handle.net/10400.5/21016.

Abstract:
Master's in Mathematical Finance
Esta dissertação aplica o método da Signature como medida de similaridade entre dois objetos de séries temporais usando as propriedades de ordem 2 da Signature e aplicando-as a um método de Clustering Asimétrico. O método é comparado com uma abordagem de Clustering mais tradicional, onde a similaridade é medida usando Dynamic Time Warping, desenvolvido para trabalhar com séries temporais. O intuito é considerar a abordagem tradicional como benchmark e compará-la ao método da Signature através do tempo de computação, desempenho e algumas aplicações. Estes métodos são aplicados num conjunto de dados de séries temporais financeiras de Fundos Mútuos do Luxemburgo. Após a revisão da literatura, apresentamos o método Dynamic Time Warping e o método da Signature. Prossegue-se com a explicação das abordagens de Clustering Tradicional, nomeadamente k-Means, e Clustering Espectral Assimétrico, nomeadamente k-Axes, desenvolvido por Atev (2011). O último capítulo é dedicado à Investigação Prática onde os métodos anteriores são aplicados ao conjunto de dados. Os resultados confirmam que o método da Signature têm efectivamente potencial para Machine Learning e previsão, como sugerido por Levin, Lyons and Ni (2013).
This thesis applies the Signature method as a measurement of similarities between two time-series objects, using the Signature properties of order 2, and its application to Asymmetric Spectral Clustering. The method is compared with a more Traditional Clustering approach where similarities are measured using Dynamic Time Warping, developed to work with time-series data. The intention for this is to consider the traditional approach as a benchmark and compare it to the Signature method through computation times, performance, and applications. These methods are applied to a financial time series data set of Mutual Exchange Funds from Luxembourg. After the literature review, we introduce the Dynamic Time Warping method and the Signature method. We continue with the explanation of Traditional Clustering approaches, namely k-Means, and Asymmetric Clustering techniques, namely the k-Axes algorithm, developed by Atev (2011). The last chapter is dedicated to Practical Research where the previous methods are applied to the data set. Results confirm that the Signature method has indeed potential for machine learning and prediction, as suggested by Levin, Lyons, and Ni (2013).
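For readers unfamiliar with the benchmark similarity measure named in this abstract, the following is a minimal sketch of textbook Dynamic Time Warping on two numeric sequences. It is not taken from the thesis: the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration, and the thesis may use a different cost or windowing scheme.

```python
def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (illustrative sketch).

    Uses |x - y| as the local cost (an assumption) and the standard
    O(len(a) * len(b)) dynamic programme with no warping window.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because the alignment may repeat points, a series and a time-stretched copy of it (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) are at distance zero, which is the property that makes DTW attractive for time series.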
31

Lee, King-for Foris, and 李敬科. "Clustering uncertain data using Voronoi diagram." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://hub.hku.hk/bib/B43224131.

32

Bigdeli, Elnaz. "Incremental Anomaly Detection Using Two-Layer Cluster-based Structure." Thesis, Université d'Ottawa / University of Ottawa, 2016. http://hdl.handle.net/10393/34299.

Abstract:
Anomaly detection algorithms face several challenges, including processing speed and dealing with noise in data. In this thesis, a two-layer cluster-based anomaly detection structure is presented which is fast, noise-resilient and incremental. In this structure, each normal pattern is considered as a cluster, and each cluster is represented using a Gaussian Mixture Model (GMM). Then, new instances are presented to the GMM to be labeled as normal or abnormal. The proposed structure comprises three main steps. In the first step, the data are clustered. The second step is to represent each cluster in a way that enables the model to classify new instances. The Summarization based on Gaussian Mixture Model (SGMM) proposed in this thesis represents each cluster as a GMM. In the third step, a two-layer structure efficiently updates clusters using the GMM representation while detecting and ignoring redundant instances. A new approach, called Collective Probabilistic Labeling (CPL), is presented to update clusters in a batch mode. This approach makes the updating phase noise-resistant and fast. The collective approach also introduces a new concept called 'rag bag', used to store new instances. The new instances collected in the rag bag are clustered and summarized by GMMs. This enables online systems to identify nearby clusters among the existing and new clusters and to merge them quickly, despite the presence of noise, to update the model. An important step in the updating is the merging of new clusters with existing ones. To this end, a new distance measure is proposed, which is a modified Kullback-Leibler distance between two GMMs. This modified distance allows accurate identification of nearby clusters. After neighboring clusters are found, they are merged quickly and accurately. One of the reasons that GMM is chosen to represent clusters is to have a clear and valid mathematical representation for clusters, which eases further cluster analysis.
In most real-time anomaly detection applications, incoming instances are often similar to previous ones. In these cases, there is no need to update clusters based on duplicates, since they have already been modeled in the cluster distribution. The two-layer structure is responsible for identifying redundant instances. In this structure, redundant instances are ignored, and the remaining new instances are used to update clusters. Ignoring redundant instances, which are typically in the majority, makes the detection phase fast. Each part of the general structure is validated in this thesis. The experiments cover detection rates, clustering goodness, time, memory usage and the complexity of the algorithms. The accuracy of the clustering and summarization of clusters using GMMs is evaluated and compared to that of other methods. Measured with the Davies-Bouldin (DB) and Dunn indexes, the distance between the original clusters and those regenerated from GMMs is almost zero with the SGMM method, while this value for ABACUS is around 0.01. Moreover, the results show that the SGMM algorithm is 3 times faster than ABACUS in running time, using one-third of the memory used by ABACUS. The CPL method, used to label new instances, is found to collectively remove the effect of noise, while increasing the accuracy of labeling new instances. In a noisy environment, the detection rate of the CPL method is 5% higher than that of other algorithms such as one-class SVM. The false alarm rate is decreased by 10% on average. Memory use is 20 times less than that of the one-class SVM. The proposed method is found to lower the false alarm rate, which is one of the basic problems for the one-class SVM. Experiments show that, with the two-layer structure, the false alarm rate is decreased by between 5% and 15% across datasets, while the detection rate is increased by between 5% and 10%. The memory usage of the two-layer structure is 20 to 50 times less than that of one-class SVM.
One-class SVM uses support vectors in labeling new instances, while the labeling of the two-layer structure depends on the number of GMMs. The experiments show that the two-layer structure is 20 to 50 times faster than the one-class SVM in labeling new instances. Moreover, the updating time of the two-layer structure is 2 to 3 times less than that of the one-layer structure. This reduction is the direct result of ignoring redundant instances and using the two-layer structure.
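The abstract does not define its modified Kullback-Leibler distance, so the sketch below is only a stand-in that shows the general idea of merging nearby clusters by a KL-based distance. It uses the closed-form KL divergence between two univariate Gaussians (single components, not full GMMs) and symmetrizes it. The names `kl_gauss`, `symmetric_kl` and `should_merge`, and the threshold value, are hypothetical.

```python
import math


def kl_gauss(m1, s1, m2, s2):
    """Closed-form KL( N(m1, s1^2) || N(m2, s2^2) ) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


def symmetric_kl(m1, s1, m2, s2):
    """Symmetrized KL: sum of both directions, so the order of arguments does not matter."""
    return kl_gauss(m1, s1, m2, s2) + kl_gauss(m2, s2, m1, s1)


def should_merge(c1, c2, threshold=0.5):
    """Hypothetical merge rule: merge two (mean, std) summaries if they are close.

    The threshold is an arbitrary illustrative value, not one from the thesis.
    """
    return symmetric_kl(*c1, *c2) < threshold
```

Under these assumptions, two summaries such as `(0.0, 1.0)` and `(0.1, 1.0)` would be merged, while `(0.0, 1.0)` and `(5.0, 1.0)` would stay separate. Extending this to mixtures requires an approximation, since the KL divergence between two GMMs has no closed form.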
33

Kumar, Swapnil. "Comparison of blocking and hierarchical ways to find cluster." Kansas State University, 2017. http://hdl.handle.net/2097/35425.

Abstract:
Master of Science
Department of Computing and Information Sciences
William H. Hsu
Clustering in data mining is a process of discovering groups in a set of data such that the similarity within each group is maximized and the similarity among groups is minimized. One way of approaching clustering is to treat it as a blocking problem of minimizing the maximum distance between any two units within the same group. This method is known as Threshold blocking. It works by treating blocking as a graph partition problem. Chameleon is a hierarchical clustering algorithm that measures the similarity between two clusters based on dynamic modelling. In the clustering process, two clusters are merged if the inter-connectivity and closeness between them are high relative to the internal inter-connectivity of the clusters and the closeness of items within them. Merging clusters with this dynamic model helps in discovering natural and homogeneous clusters. The main goal of this project is to build a local implementation of CHAMELEON and compare its output against the Threshold blocking algorithm suggested by Higgins et al., in both its hybridized and unhybridized forms.
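The blocking objective described in this abstract, the maximum distance between any two units in the same group, can be made concrete with a small sketch. This only evaluates the objective for a given partition; it is not the Threshold blocking algorithm itself. The function name `max_within_group_distance` and the use of one-dimensional points with absolute difference as the distance are assumptions.

```python
from itertools import combinations


def max_within_group_distance(groups, dist=lambda x, y: abs(x - y)):
    """Return the largest pairwise distance found inside any single group.

    `groups` is a list of lists of units; `dist` is any symmetric distance
    (absolute difference on numbers by default, an illustrative choice).
    Groups with fewer than two units contribute nothing.
    """
    worst = 0.0
    for group in groups:
        for x, y in combinations(group, 2):
            worst = max(worst, dist(x, y))
    return worst
```

A blocking method of this kind would search for the partition that makes this value as small as possible, subject to a minimum group size.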
34

Wang, Xinyu. "Toward Scalable Hierarchical Clustering and Co-clustering Methods : application to the Cluster Hypothesis in Information Retrieval." Thesis, Lyon, 2017. http://www.theses.fr/2017LYSE2123/document.

Abstract:
Comme une méthode d’apprentissage automatique non supervisé, la classification automatique est largement appliquée dans des tâches diverses. Différentes méthodes de la classification ont leurs caractéristiques uniques. La classification hiérarchique, par exemple, est capable de produire une structure binaire en forme d’arbre, appelée dendrogramme, qui illustre explicitement les interconnexions entre les instances de données. Le co-clustering, d’autre part, génère des co-clusters, contenant chacun un sous-ensemble d’instances de données et un sous-ensemble d’attributs de données. L’application de la classification sur les données textuelles permet d’organiser les documents et de révéler les connexions parmi eux. Cette caractéristique est utile dans de nombreux cas, par exemple, dans les tâches de recherche d’informations basées sur la classification. À mesure que la taille des données disponibles augmente, la demande de puissance du calcul augmente. En réponse à cette demande, de nombreuses plates-formes du calcul distribué sont développées. Ces plates-formes utilisent les puissances du calcul collectives des machines, pour couper les données en morceaux, assigner des tâches du calcul et effectuer des calculs simultanément.Dans cette thèse, nous travaillons sur des données textuelles. Compte tenu d’un corpus de documents, nous adoptons l’hypothèse de «bag-of-words» et applique le modèle vectoriel. Tout d’abord, nous abordons les tâches de la classification en proposant deux méthodes, Sim_AHC et SHCoClust. Ils représentent respectivement un cadre des méthodes de la classification hiérarchique et une méthode du co-clustering hiérarchique, basé sur la proximité. Nous examinons leurs caractéristiques et performances du calcul, grâce de déductions mathématiques, de vérifications expérimentales et d’évaluations. 
Ensuite, nous appliquons ces méthodes pour tester l’hypothèse du cluster, qui est l’hypothèse fondamentale dans la recherche d’informations basée sur la classification. Dans de tels tests, nous utilisons la recherche du cluster optimale pour évaluer l’efficacité de recherche pour tout les méthodes hiérarchiques unifiées par Sim_AHC et par SHCoClust . Nous aussi examinons l’efficacité du calcul et comparons les résultats. Afin d’effectuer les méthodes proposées sur des ensembles de données plus vastes, nous sélectionnons la plate-forme d’Apache Spark et fournissons implémentations distribuées de Sim_AHC et de SHCoClust. Pour le Sim_AHC distribué, nous présentons la procédure du calcul, illustrons les difficultés rencontrées et fournissons des solutions possibles. Et pour SHCoClust, nous fournissons une implémentation distribuée de son noyau, l’intégration spectrale. Dans cette implémentation, nous utilisons plusieurs ensembles de données qui varient en taille pour examiner l’échelle du calcul sur un groupe de noeuds
As a major type of unsupervised machine learning method, clustering has been widely applied in various tasks. Different clustering methods have different characteristics. Hierarchical clustering, for example, can output a binary tree-like structure, which explicitly illustrates the interconnections among data instances. Co-clustering, on the other hand, generates co-clusters, each containing a subset of data instances and a subset of data attributes. Applying clustering to textual data makes it possible to organize input documents and reveal connections among them. This characteristic is helpful in many cases, for example, in cluster-based Information Retrieval tasks. As the size of available data increases, the demand for computing power increases. In response to this demand, many distributed computing platforms have been developed. These platforms use the collective computing power of commodity machines to partition data, assign computing tasks and perform computation concurrently. In this thesis, we first address text clustering tasks by proposing two clustering methods, Sim_AHC and SHCoClust. They respectively represent a similarity-based hierarchical clustering and a similarity-based hierarchical co-clustering. We examine their properties and performance through mathematical deduction, experimental verification and evaluation. Then we apply these methods in testing the cluster hypothesis, which is the fundamental assumption in cluster-based Information Retrieval. In such tests, we apply the optimal cluster search to evaluate the retrieval effectiveness of different clustering methods. We examine the computing efficiency and compare the results of the proposed tests. In order to perform clustering on larger datasets, we select the Apache Spark platform and provide distributed implementations of Sim_AHC and of SHCoClust. For distributed Sim_AHC, we present the designed computing procedure, illustrate the difficulties encountered and provide possible solutions.
For SHCoClust, we provide a distributed implementation of its core, spectral embedding. In this implementation, we use several datasets that vary in size to examine scalability.
35

Konda, Swetha Reddy. "Classification of software components based on clustering." Morgantown, W. Va. : [West Virginia University Libraries], 2007. https://eidr.wvu.edu/etd/documentdata.eTD?documentid=5510.

Abstract:
Thesis (M.S.)--West Virginia University, 2007.
Title from document title page. Document formatted into pages; contains vi, 59 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 57-59).
36

Zhang, Kai. "Kernel-based clustering and low rank approximation /." View abstract or full-text, 2008. http://library.ust.hk/cgi/db/thesis.pl?CSED%202008%20ZHANG.

37

Chan, Yat-ling, and 陳逸靈. "An optimization algorithm for clustering using weighted dissimilarity measures." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2003. http://hub.hku.hk/bib/B26667009.

38

Johnston, Joshua Benjamin Hamerly Gregory James. "Clustering in high dimension and choosing cluster representatives for SimPoint." Waco, Tex. : Baylor University, 2007. http://hdl.handle.net/2104/5067.

39

Jia, Hong. "Clustering of categorical and numerical data without knowing cluster number." HKBU Institutional Repository, 2013. http://repository.hkbu.edu.hk/etd_ra/1495.

40

Daumová, Dora. "Clustering as a Tool of Competitiveness, the Case of the Czech Republic." Master's thesis, Vysoká škola ekonomická v Praze, 2008. http://www.nusl.cz/ntk/nusl-4285.

Abstract:
This paper deals with clusters as relatively new tools of competitiveness in economies and examines their linkage to the state through cluster policies and initiatives. It begins with the theoretical development of the concept. The cluster concept itself is then introduced, setting out a wide range of supporting rationales, presenting possible risks and explaining the link to innovation and competitiveness. The second part of the thesis treats cluster policies as a possible means of creating clusters. Besides examining cluster policies from different viewpoints, it emphasizes the justification of the role of the state, which can act as either a contributing or a disturbing factor. The empirical part presents the case of clustering in the Czech Republic. First, Czech cluster policy is analyzed with respect to its targets, instruments, approaches and other relevant issues, focusing on the policy's shortcomings and problems. Then a case study of the Moravian-Silesian automotive cluster is presented, examining the cluster's origin, development and activities. The aim is to compare the case with the classical Porter model and identify its inadequacies.
41

Chang, Soong Uk. "Clustering with mixed variables /." [St. Lucia, Qld.], 2005. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe19086.pdf.

42

Rantes, García Mónica Tahiz, and Quispe Lizbeth María Cruz. "Detección de fraudes usando técnicas de clustering." Universidad Nacional Mayor de San Marcos. Programa Cybertesis PERÚ, 2010. http://www.cybertesis.edu.pe/sisbib/2010/rantes_gm/html/index-frames.html.

Abstract:
El fraude con tarjetas de crédito es uno de los problemas más importantes a los que se enfrentan actualmente las entidades financieras. Si bien la tecnología ha permitido aumentar la seguridad en las tarjetas de crédito con el uso de claves PIN, la introducción de chips en las tarjetas, el uso de claves adicionales como tokens y mejoras en la reglamentación de su uso, también es una necesidad para las entidades bancarias, actuar de manera preventiva frente a este crimen. Para actuar de manera preventiva es necesario monitorear en tiempo real las operaciones que se realizan y tener la capacidad de reaccionar oportunamente frente a alguna operación dudosa que se realice. La técnica de Clustering frente a esta problemática es un método muy utilizado puesto que permite la agrupación de datos lo que permitiría clasificarlos por su similitud de acuerdo a alguna métrica, esta medida de similaridad está basada en los atributos que describen a los objetos. Además esta técnica es muy sensible a la herramienta Outlier que se caracteriza por el impacto que causa sobre el estadístico cuando va a analizar los datos
Credit card fraud is one of the most important problems financial institutions currently face. While technology has improved credit card security through PINs, chips embedded in the cards, additional keys such as tokens, and better regulation of card use, banks also need to act preventively against this crime. Acting preventively requires monitoring transactions in real time and being able to react promptly to any suspicious transaction. Clustering is a widely used technique for this problem because it groups data, allowing them to be classified by similarity according to some metric; this similarity measure is based on the attributes that describe the objects. Moreover, the technique is very sensitive to outliers, which are characterized by the impact they have on the statistic when the data are analyzed.
43

Xiong, Huojin. "Clustering in the Field of Vocational Education." Doctoral thesis, Universitätsbibliothek Chemnitz, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa-113319.

Abstract:
Diese Dissertation wendet komparative Methoden an, um eine vergleichende Analyse von einigen ausgewählten praktischen Beispielen von ‚Clustering‘ in dem Handlungsfeld der Erziehung in China. Auf der Grundlage der strukturellen, hierarchischen und funktionellen Herangehensweisen der Systemtheorie und auch in Anbetracht der sozialen wirtschaftlichen und pädagogischen Implikationen des Clusters in der Beruflichen Bildung werden die Theorie von Porter (und deren Erweiterungen), die Theorie des Humankapitals und die Theorie der Bildung für die Wahl der für einen Vergleich erforderlichen Kriterien (tertium comparationis) herangezogen. Aus den verfügbaren Berichten über Implementationsversuche wurden die Implementationsmodelle der Cluster von Henan, Shanghai, Hainan, Yongchuan und Yantai ausgewählt. Alle Erfahrungen aus diesen Regionen wurden in zwei Kategorien gemäß ihrer Eigenschaften als professionelle und regionale Cluster untersucht. Die komparativen Analysen verweisen jeweils auf die oben erwähnten drei Kriterien. In Anbetracht der in den praktisch umgesetzten Modellen offenbar gewordenen Probleme werden zusätzlich einige internationale Erfahrungen herangezogen und auf ihre Erfolgskomponenten hin untersucht, z.B. wie man Faktoren für das Cluster verbindet oder wie man Anreize für die Teilnahme von Unternehmen am Cluster setzt. Auf der Grundlage der theoretischen Analyse der praktischen Erfahrungen in China sowie andernorts werden abschließend einige Vorschläge für die zukünftige Entwicklung des Clusterings entwickelt
This dissertation applies comparative methods to analyse selected implementation modes of clustering in the field of vocational education in China. Based on the structural, hierarchical and functional aspects of systems theory, and in consideration of the socio-economic and educational features of clustering in vocational education, Porter’s theory and its amended models, the theory of human capital and the theory of education are reviewed to choose the comparative criteria. On the basis of the available information, representative implementation models are selected from Henan (province, South China), Shanghai (provincial-level city, East China), Hainan (province, Central China), Yongchuan (prefectural-level city, West China) and Yantai (prefectural-level city, North China). The experiences from these areas are grouped and compared in two categories according to their features: professional clustering and regional clustering. Comparative analyses are made with reference to the three criteria mentioned above. In view of the problems revealed in the implementation models, some international experiences are cited as examples for practical aspects such as how to connect factors for clustering, how to help a cluster survive its whole lifespan, and how to get enterprises involved. Finally, some suggestions for the future development of clustering are made from a theoretical point of view.
44

Cruz, Quispe Lizbeth María, and García Mónica Tahiz Rantes. "Detección de fraudes usando técnicas de clustering." Bachelor's thesis, Universidad Nacional Mayor de San Marcos, 2010. https://hdl.handle.net/20.500.12672/2644.

Abstract:
El fraude con tarjetas de crédito es uno de los problemas más importantes a los que se enfrentan actualmente las entidades financieras. Si bien la tecnología ha permitido aumentar la seguridad en las tarjetas de crédito con el uso de claves PIN, la introducción de chips en las tarjetas, el uso de claves adicionales como tokens y mejoras en la reglamentación de su uso, también es una necesidad para las entidades bancarias, actuar de manera preventiva frente a este crimen. Para actuar de manera preventiva es necesario monitorear en tiempo real las operaciones que se realizan y tener la capacidad de reaccionar oportunamente frente a alguna operación dudosa que se realice. La técnica de Clustering frente a esta problemática es un método muy utilizado puesto que permite la agrupación de datos lo que permitiría clasificarlos por su similitud de acuerdo a alguna métrica, esta medida de similaridad está basada en los atributos que describen a los objetos. Además esta técnica es muy sensible a la herramienta Outlier que se caracteriza por el impacto que causa sobre el estadístico cuando va a analizar los datos.
Credit card fraud is one of the most important problems financial institutions currently face. While technology has improved credit card security through PINs, chips embedded in the cards, additional keys such as tokens, and better regulation of card use, banks also need to act preventively against this crime. Acting preventively requires monitoring transactions in real time and being able to react promptly to any suspicious transaction. Clustering is a widely used technique for this problem because it groups data, allowing them to be classified by similarity according to some metric; this similarity measure is based on the attributes that describe the objects. Moreover, the technique is very sensitive to outliers, which are characterized by the impact they have on the statistic when the data are analyzed.
Thesis
45

Strehl, Alexander. "Relationship-based clustering and cluster ensembles for high-dimensional data mining." Thesis, Full text (PDF) from UMI/Dissertation Abstracts International, 2002. http://wwwlib.umi.com/cr/utexas/fullcit?p3088578.

46

Eldridge, Justin Eldridge. "Clustering Consistently." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1512070374903249.

47

梁德貞 and Tak-ching Leung. "Correspondence analysis and clustering with applications to site-species occurrence." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1991. http://hub.hku.hk/bib/B31209889.

48

Leung, Tak-ching. "Correspondence analysis and clustering with applications to site-species occurrence /." [Hong Kong] : University of Hong Kong, 1991. http://sunzi.lib.hku.hk/hkuto/record.jsp?B13039519.

49

Ndebele, Nothando Elizabeth. "Clustering algorithms and their effect on edge preservation in image compression." Thesis, Rhodes University, 2009. http://hdl.handle.net/10962/d1008210.

Abstract:
Image compression aims to reduce the amount of data that is stored or transmitted for images. One technique that may be used to this end is vector quantization. Vectors may be used to represent images. Vector quantization reduces the number of vectors required for an image by representing a cluster of similar vectors by one typical vector that is part of a set of vectors referred to as the codebook. For compression, for each image vector, only the closest codebook vector is stored or transmitted. For reconstruction, the image vectors are again replaced by the closest codebook vectors. Hence vector quantization is a lossy compression technique and the quality of the reconstructed image depends strongly on the quality of the codebook. The design of the codebook is therefore an important part of the process. In this thesis we examine three clustering algorithms which can be used for codebook design in image compression: c-means (CM), fuzzy c-means (FCM) and learning vector quantization (LVQ). We give a description of these algorithms and their application to codebook design. Edges are an important part of the visual information contained in an image. It is therefore essential to use codebooks which allow an accurate representation of the edges. One of the shortcomings of vector quantization is poor edge representation. We therefore carry out experiments using these algorithms to compare their edge-preserving qualities. We also investigate the combination of these algorithms with classified vector quantization (CVQ) and the replication method (RM). Both have been suggested as methods for improving edge representation. We use a cross-validation approach to estimate the mean squared error to measure the performance of each of the algorithms and the edge-preserving methods. The results show that the edges are less accurately represented than the non-edge areas when using CM, FCM and LVQ.
The advantage of using CVQ is that the time taken for codebook design is reduced, particularly for CM and FCM. RM is found to be effective where the codebook is trained using a set that has a larger proportion of edges than the test set.
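The compression and reconstruction steps described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's code: the codebook is given rather than trained, squared Euclidean distance is assumed for "closest", and the function names `nearest`, `encode`, `decode` and `mse` are hypothetical.

```python
def nearest(codebook, vector):
    """Index of the codebook vector closest to `vector` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((x - c) ** 2 for x, c in zip(vector, codebook[k])),
    )


def encode(vectors, codebook):
    """Compression: keep only the index of the closest codebook vector."""
    return [nearest(codebook, v) for v in vectors]


def decode(indices, codebook):
    """Reconstruction: replace each index by its codebook vector (lossy)."""
    return [codebook[k] for k in indices]


def mse(original, reconstructed):
    """Mean squared error over all vector components."""
    total = sum(
        (x - y) ** 2
        for v, r in zip(original, reconstructed)
        for x, y in zip(v, r)
    )
    count = sum(len(v) for v in original)
    return total / count
```

In practice the codebook would come from a clustering algorithm such as those compared in the thesis, and the error would be measured separately on edge and non-edge blocks to assess edge preservation.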
50

Song, Liumeng. "Cluster heads selection and cooperative nodes selection for cluster-based Internet of Things networks." Thesis, Queen Mary, University of London, 2017. http://qmro.qmul.ac.uk/xmlui/handle/123456789/24778.

Abstract:
Clustering and cooperative transmission are key enablers in power-constrained Internet of Things (IoT) networks. The challenges for power-constrained devices in IoT networks are to reduce energy consumption and to guarantee Quality of Service (QoS) provision. In this thesis, optimal node selection algorithms based on clustering and cooperative communication are proposed for different network scenarios, in particular:
• A QoS-aware, energy-efficient cluster head (CH) selection algorithm for one-hop capillary networks. This algorithm selects the optimum set of CHs and constructs clusters accordingly, based on the location and residual energy of devices.
• Cooperative node selection algorithms for cluster-based capillary networks. By utilising the spatial diversity of cooperative communication, these algorithms select the optimum set of cooperative nodes to assist the CHs in long-haul transmission. In addition, to distribute energy evenly in one-hop cluster-based capillary networks, CH selection is taken into consideration when developing the cooperative device selection algorithms.
The performance of the proposed selection algorithms is evaluated via comprehensive simulations. Simulation results show that the proposed algorithms can extend network lifetime by up to 20% and reduce the overall packet error rate (PER) by up to 50%. Furthermore, the simulation results also show that the optimal tradeoff between energy efficiency and QoS provision can be achieved in one-hop and multi-hop cluster-based scenarios.