Dissertations on the topic "Clustering based on correlation"
Browse the top 50 dissertations for research on the topic "Clustering based on correlation".
Rosén, Fredrik. "Correlation based clustering of the Stockholm Stock Exchange." Thesis, Stockholm University, School of Business, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-6500.
This thesis presents a topological classification of stocks traded on the Stockholm Stock Exchange based solely on the co-movements between individual stocks. The working hypothesis is that an ultrametric space is an appropriate space for linking stocks together. The hierarchical structure is obtained from the matrix of correlation coefficients computed between all pairs of stocks included in the OMXS 30 portfolio, using daily logarithmic returns. The dynamics of the system are investigated by studying the distribution and time dependence of the correlation coefficients. Average linkage clustering is proposed as an alternative to the conventional single linkage clustering. The empirical investigation shows that the Minimum Spanning Tree (the graphical representation of the clustering procedure) describes the reciprocal arrangement of the stocks included in the investigated portfolio in a way that also makes sense from an economic point of view. Average linkage clustering results in five main clusters, consisting of Machinery, Bank, Telecom, Paper & Forest and Security companies. Most groups are homogeneous with respect to their sector and often also with respect to their sub-industry, as specified by the GICS classification standard. For example, the Bank cluster consists of the Commercial Bank companies FöreningsSparbanken, SEB, Handelsbanken and Nordea. However, there are also examples where companies form a cluster without belonging to the same sector. One example of this is the Security cluster, consisting of ASSA (Building Products) and Securitas (Diversified Commercial & Professional Services). Even though they belong to different industries, both are active in the security area. ASSA is a manufacturer and supplier of locking solutions, and SECU focuses on guarding solutions, security systems and cash handling.
The empirical results show that it is possible to obtain a meaningful taxonomy based solely on the co-movements between individual stocks and the fundamental ultrametric assumption, without any presumptions about the companies' business activities. The obtained clusters indicate that common economic factors can affect certain groups of stocks, irrespective of their GICS industry classification. The outcome of the investigation is of fundamental importance for, e.g., asset classification and portfolio optimization, where the co-movement between assets is of vital importance.
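The procedure this abstract describes (log returns, pairwise correlations, a hierarchical structure read off a minimum spanning tree) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the distance d = sqrt(2(1 - rho)) is a common choice in this literature and is an assumption here, since the abstract does not state the metric, and the ticker names and prices are made-up toy values rather than OMXS 30 data.

```python
import math
from itertools import combinations

def log_returns(prices):
    """Daily logarithmic returns ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance(rho):
    """Assumed metric: d = sqrt(2 * (1 - rho)); 0 for rho = 1, 2 for rho = -1."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))

def minimum_spanning_tree(names, dist):
    """Kruskal's algorithm with a small union-find; returns edges in merge order."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    tree = []
    for (a, b), d in sorted(dist.items(), key=lambda kv: kv[1]):
        ra, rb = find(a), find(b)
        if ra != rb:  # the edge joins two separate components
            parent[ra] = rb
            tree.append((a, b, d))
    return tree

# Hypothetical toy prices: two co-moving "banks" and one unrelated "telco".
prices = {
    "BANK_A": [100, 102, 101, 105, 107, 106],
    "BANK_B": [50, 51, 50.4, 52.6, 53.5, 53.1],
    "TELCO": [80, 79, 81, 80, 78, 79],
}
returns = {name: log_returns(p) for name, p in prices.items()}
dist = {
    (a, b): correlation_distance(pearson(returns[a], returns[b]))
    for a, b in combinations(sorted(returns), 2)
}
mst = minimum_spanning_tree(sorted(returns), dist)
for a, b, d in mst:
    print(f"{a} - {b}: d = {d:.3f}")
```

The order in which Kruskal's algorithm adds edges matches the merge order of single linkage clustering, which is why the tree can serve as a picture of that clustering. Average linkage, which the thesis proposes as an alternative, would need a separate agglomerative step that is not shown here.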
Pettersson, Christoffer. "Investigating the Correlation Between Marketing Emails and Receivers Using Unsupervised Machine Learning on Limited Data : A comprehensive study using state of the art methods for text clustering and natural language processing." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-189147.
The goal of this project is to investigate possible correlations between marketing emails and their recipients using unsupervised machine learning on a limited amount of data. The data consists of about 1,200 email messages with 98,000 recipients. Initially, all messages are grouped by content via text clustering. The messages contain no information about prior grouping or categorization, which creates a need for an unsupervised learning approach in which only the raw text-based message is used as input. The project investigates modern techniques such as bag-of-words to determine the relevance of terms and the gap statistic to find an optimal number of clusters. The data is vectorized using term frequency-inverse document frequency to determine the relevance of terms relative to the document as well as to all documents combined. A fundamental problem that arises with this approach is high dimensionality, which is reduced using latent semantic analysis together with singular value decomposition. Once all clusters have been obtained, the most frequent terms in each cluster are analyzed and compared. Since an initial categorization of the messages is lacking, an alternative approach is required to evaluate the validity of the clusters. To do this, all recipients in each cluster who opened any of its messages are retrieved and analyzed. The recipients have various attributes regarding their purpose for using the product as well as personal information. Once these have been retrieved and examined, conclusions can be drawn about whether correlations can be found. There is a clear correlation between each cluster and its recipients, but only to a certain extent. Recipients from the same cluster showed similar attributes that were distinguishable from those of recipients from other clusters.
Hence, it can be said that the resulting clusters and their recipients are specific enough to be distinguished from one another but too general to handle more detailed information. With more data, this could become a useful tool for determining the recipients of specific email campaigns, in order to eventually increase the open rate and thereby reach more relevant recipients based on previous results.
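The vectorization step mentioned in this abstract can be illustrated with a small sketch. It assumes one common tf-idf variant (relative term frequency times ln(N / df)) and whitespace tokenization; the abstract does not say which variant or tokenizer the thesis uses. The three example messages are invented, and the later steps (latent semantic analysis, the gap statistic, the clustering itself) are left out.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one sparse {term: weight} vector per document.

    Assumed weighting: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical marketing messages, for illustration only.
emails = [
    "summer sale discount on shoes",
    "big discount sale this summer",
    "your invoice and payment receipt",
]
vecs = tf_idf(emails)
print(cosine(vecs[0], vecs[1]))  # the two sale messages share weighted terms
print(cosine(vecs[0], vecs[2]))  # no shared terms
```

A clustering algorithm would then operate on these vectors, or on a lower-dimensional projection of them, so that messages with overlapping weighted vocabulary end up in the same group.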
Zimek, Arthur. "Correlation Clustering." Diss., lmu, 2008. http://nbn-resolving.de/urn:nbn:de:bvb:19-87361.
To, Thang Long. "Video object segmentation using phase-base detection of moving object boundaries." Awarded by: University of New South Wales - Australian Defence Force Academy. School of Information Technology and Electrical Engineering, 2005. http://handle.unsw.edu.au/1959.4/38705.
Ren, Jinchang. "Semantic content analysis for effective video segmentation, summarisation and retrieval." Thesis, University of Bradford, 2009. http://hdl.handle.net/10454/4251.
Batet, Sanromà Montserrat. "Ontology based semantic clustering." Doctoral thesis, Universitat Rovira i Virgili, 2011. http://hdl.handle.net/10803/31913.
Clustering algorithms have focused on the management of numerical and categorical data. However, in recent years, textual information has grown in importance. Proper processing of this kind of information within data mining methods requires an interpretation of its meaning at a semantic level. In this work, a clustering method that interprets numerical, categorical and textual data in an integrated manner is presented. Textual data are interpreted by means of semantic similarity measures. These measures calculate the alikeness between words by exploiting one or several knowledge sources. In this work we also propose two new ways to compute semantic similarity, based on 1) the exploitation of the taxonomical knowledge available in one or several ontologies and 2) the estimation of the information distribution of terms on the Web. Results show that a proper interpretation of textual data at a semantic level improves clustering results and eases the interpretability of the classifications.
Luo, Yongfeng. "Range-Based Graph Clustering." University of Cincinnati / OhioLINK, 2002. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1014606422.
Fuentes, Garcia Ruth S. "Bayesian model-based clustering." Thesis, University of Bath, 2004. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.412350.
Albarakati, Rayan. "Density Based Data Clustering." CSUSB ScholarWorks, 2015. https://scholarworks.lib.csusb.edu/etd/134.
Faria, Rodrigo Augusto Dias. "Human skin segmentation using correlation rules on dynamic color clustering." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/45/45134/tde-01102018-101814/.
Human skin is made up of a series of distinct layers, each of which reflects a portion of the incident light after absorbing a certain amount of it through the pigments found in the layer. The main pigments responsible for skin color are melanin and hemoglobin. Skin segmentation plays an important role in a wide range of image processing and computer vision applications. In short, there are three main approaches to skin segmentation: rule-based, machine learning, and hybrid. They differ in terms of accuracy and computational efficiency. Generally, machine learning and hybrid approaches outperform rule-based methods, but they require a large and representative training dataset and, at times, a costly classification time, which can be a deciding factor for real-time applications. In this work, we propose an improvement, in three distinct versions, of a novel rule-based skin segmentation method that works in the YCbCr color space. Our motivation is based on the hypotheses that (1) the original rule can be complemented and (2) human skin pixels do not appear isolated, that is, neighborhood operations are taken into account. The method is a combination of some correlation rules based on these hypotheses. These rules evaluate the combinations of chrominance values Cb, Cr to identify skin pixels, depending on the shape and size of the dynamically generated skin color clusters. The method is very efficient in terms of computational effort, as well as robust on very complex images.
Xu, Tianbing. "Nonparametric evolutionary clustering." Diss., Online access via UMI:, 2009.
Jarjour, Riad. "Clustering financial time series for volatility modeling." Diss., University of Iowa, 2018. https://ir.uiowa.edu/etd/6439.
Erdem, Cosku. "Density Based Clustering Using Mathematical Morphology." Master's thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12608264/index.pdf.
This thesis proposes the "Density Based Clustering Using Mathematical Morphology" (DBCM) algorithm as an effective clustering method for extracting arbitrarily shaped clusters from noisy numerical data in a reasonable time. The algorithm is predicated on the analogy between images and data warehouses. It applies grayscale morphology, an image processing technique, to multidimensional data. In this study we evaluated the performance of the proposed algorithm on both synthetic and real data and observed that it produces successful and interpretable results with appropriate parameters. In addition, we found the computational complexity to be linear in the number of data points for low-dimensional data and exponential in the number of dimensions for high-dimensional data, mainly because of the morphology operations.
Malsiner-Walli, Gertraud, Daniela Pauger, and Helga Wagner. "Effect fusion using model-based clustering." Sage, 2018. http://dx.doi.org/10.1177/1471082X17739058.
Rand, McFadden Renata. "Aspect Mining Using Model-Based Clustering." NSUWorks, 2011. http://nsuworks.nova.edu/gscis_etd/281.
Dsouza, Jeevan. "Region-based Crossover for Clustering Problems." NSUWorks, 2012. http://nsuworks.nova.edu/gscis_etd/139.
Wei, Wutao. "Model Based Clustering Algorithms with Applications." Thesis, Purdue University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10830711.
In the predictive area of machine learning, unsupervised learning is applied when labels for the data are unavailable, laborious to obtain, or available for only a limited proportion of the data. Based on the special properties of the data, we can build models by understanding those properties and making some reasonable assumptions. In this thesis, we introduce three practical problems and discuss them in detail. This thesis produced three papers, as follows: Wei, Wutao, et al. "A Non-parametric Hidden Markov Clustering Model with Applications to Time Varying User Activity Analysis." ICMLA 2015; Wei, Wutao, et al. "Dynamic Bayesian predictive model for box office forecasting." IEEE Big Data 2017; Wei, Wutao, Bowei Xi, and Murat Kantarcioglu. "Adversarial Clustering: A Grid Based Clustering Algorithm Against Active Adversaries." Submitted.
User Profiling Clustering: Activity data of individual users on social media are easily accessible in this big data era. However, proper modeling strategies for user profiles have not been well developed in the literature. Existing methods or models usually have two limitations. The first limitation is that most methods target the population rather than individual users, and the second is that they cannot model non-stationary time-varying patterns. Different users in general demonstrate different activity modes on social media. Therefore, one population model may fail to characterize activities of individual users. Furthermore, online social media are dynamic and ever evolving, so are users’ activities. Dynamic models are needed to properly model users’ activities. In this paper, we introduce a non-parametric hidden Markov model to characterize the time-varying activities of social media users. In addition, based on the proposed model, we develop a clustering method to group users with similar activity patterns.
Adversarial Clustering: Nowadays more and more data are gathered for detecting and preventing cyber-attacks. Unique to the cyber security applications, data analytics techniques have to deal with active adversaries that try to deceive the data analytics models and avoid being detected. The existence of such adversarial behavior motivates the development of robust and resilient adversarial learning techniques for various tasks. In the past most of the work focused on adversarial classification techniques, which assumed the existence of a reasonably large amount of carefully labeled data instances. However, in real practice, labeling the data instances often requires costly and time-consuming human expertise and becomes a significant bottleneck. Meanwhile, a large number of unlabeled instances can also be used to understand the adversaries' behavior. To address the above mentioned challenges, we develop a novel grid based adversarial clustering algorithm. Our adversarial clustering algorithm is able to identify the core normal regions, and to draw defensive walls around the core positions of the normal objects utilizing game theoretic ideas. Our algorithm also identifies sub-clusters of attack objects, the overlapping areas within clusters, and outliers which may be potential anomalies.
Dynamic Bayesian Update for Profiling Clustering: The movie industry has become one of the most important consumer businesses, and it is increasingly competitive. For a movie producer, production and marketing carry large costs; for a theater owner, the problem is how to allocate a limited number of screens among the movies currently showing. However, all current models in the movie industry can only give an estimate for the opening week. We improve the dynamic linear model with a Bayesian framework. With this updating method, we are also able to update on streaming adversarial data and make defensive recommendations for defensive systems.
Chan, Alton Kam Fai. "Hyperplane based efficient clustering and searching /." View abstract or full-text, 2003. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202003%20CHANA.
Malsiner-Walli, Gertraud, Sylvia Frühwirth-Schnatter, and Bettina Grün. "Model-based clustering based on sparse finite Gaussian mixtures." Springer, 2016. http://dx.doi.org/10.1007/s11222-014-9500-2.
Braune, Christian [Verfasser]. "Skeleton-based validation for density-based clustering / Christian Braune." Magdeburg : Universitätsbibliothek Otto-von-Guericke-Universität, 2018. http://d-nb.info/1220035653/34.
Durkalec, Anna. "Properties and evolution of galaxy clustering at 2." Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM4758/document.
This thesis focuses on the study of the properties and evolution of galaxy clustering for galaxies in the redshift range 22. I was able to measure the spatial distribution of a general galaxy population at redshift z~3 for the first time with a high accuracy. I quantified the galaxy clustering by estimating and modelling the projected (real-space) two-point correlation function, for a general population of 3022 galaxies. I extended the clustering measurements to the luminosity and stellar mass-selected sub-samples. My results show that the clustering strength of the general galaxy population does not change significantly from redshift z~3.5 to z~2.5, but in both redshift ranges more luminous and more massive galaxies are more clustered than less luminous (massive) ones. Using the halo occupation distribution (HOD) formalism I measured an average host halo mass at redshift z~3 significantly lower than the observed average halo masses at low redshift. I concluded that the observed star-forming population of galaxies at z~3 might have evolved into the massive and bright (Mr<-21.5) galaxy population at redshift z=0. Also, I interpret clustering measurements in terms of a linear large-scale galaxy bias. I find it to be significantly higher than the bias of intermediate and low redshift galaxies. Finally, I computed the stellar-to-halo mass ratio (SHMR) and the integrated star formation efficiency (ISFE) to study the efficiency of star formation and stellar mass assembly. I find that the integrated star formation efficiency is quite high at ~16% for the average galaxies at z~3.
Wahid, Dewan Ferdous. "Random models and heuristic algorithms for correlation clustering problems on signed social networks." Thesis, University of British Columbia, 2017. http://hdl.handle.net/2429/61438.
Irving K. Barber School of Arts and Sciences (Okanagan)
Computer Science, Department of (Okanagan)
Graduate
Mata, Raman Deep. "Correlation based landmine detection technique /." free to MU campus, to others for purchase, 2004. http://wwwlib.umi.com/cr/mo/fullcit?p1426084.
Zhou, Dunke. "High-dimensional Data Clustering and Statistical Analysis of Clustering-based Data Summarization Products." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1338303646.
Holzapfel, Klaus. "Density-based clustering in large-scale networks." [S.l.] : [s.n.], 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=979979943.
Slaaen, Roger Antoniussen. "Clustering based localization for wireless sensor networks." Online access for everyone, 2006. http://www.dissertations.wsu.edu/Thesis/Spring2006/R%5FSlaaen%5F050406.pdf.
Shekar, B. "A Knowledge-Based Approach To Pattern Clustering." Thesis, Indian Institute of Science, 1988. http://hdl.handle.net/2005/86.
Sucasas, V. "Environmental-based smart clustering for mobile networks." Thesis, University of Surrey, 2016. http://epubs.surrey.ac.uk/811628/.
Frühwirth, Rudolf, Korbinian Eckstein, and Sylvia Frühwirth-Schnatter. "Vertex finding by sparse model-based clustering." IOP Publishing, 2016. http://epub.wu.ac.at/6173/1/jop.pdf.
Liu, Jun. "Model-based clustering algorithms, performance and application." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0030/NQ66280.pdf.
Konda, Swetha Reddy. "Classification of software components based on clustering." Morgantown, W. Va. : [West Virginia University Libraries], 2007. https://eidr.wvu.edu/etd/documentdata.eTD?documentid=5510.
Title from document title page. Document formatted into pages; contains vi, 59 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 57-59).
Ning, Hoi-Kwan Flora. "Model-based regression clustering with variable selection." Thesis, University of Oxford, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.497059.
FIGUEIREDO, AURELIO MORAES. "MAPPING SEISMIC EVENTS USING CLUSTERING-BASED METHODOLOGIES." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2015. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=26709@1.
COORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
PROGRAMA DE EXCELENCIA ACADEMICA
In this work we present methodologies based on data clustering algorithms used for processing 3D seismic data. In this processing, the input voxels of the volume are replaced by feature vectors that represent the local neighborhood of the voxel within its seismic trace. These vectors are processed by data clustering algorithms. The resulting set of clusters is then used to generate a new representation of the input seismic volume. This strategy makes it possible to model the global structure of the seismic signal along its lateral neighborhood, significantly reducing the impact of noise and other anomalies present in the original data. The post-processed data are then used for two main purposes: the automatic mapping of horizons throughout the volume, and the production of visualization volumes intended to emphasize possible discontinuities present in the input seismic data, particularly geological faults. Regarding horizon mapping, the fact that the input samples of the clustering processes contain no information about their 3D location in the volume allows an unbiased classification of the voxels into clusters. Consequently, the methodology performs robustly even in complicated cases, and the method proved capable of mapping a large part of the interfaces present in the tested data. The visualization attributes, in turn, are built through a self-adaptive function that uses the neighborhood information of the clusters and is able to emphasize the regions of the input data where faults or other discontinuities exist. We apply these methodologies to real data.
The results obtained demonstrate the ability of the methods to map even interfaces severely interrupted by seismic faults, salt domes, and other discontinuities, and we also produce visualization attributes that proved quite useful in the process of identifying discontinuities present in the data.
We present clustering-based methodologies used to process 3D seismic data. The approach first replaces the volume voxels with corresponding feature samples representing the local behavior in the seismic trace. These samples are then used as inputs to clustering procedures, and the resulting cluster maps are used to create a new representation of the original volume data. This strategy finds the global structure of the seismic signal and strongly reduces the impact of noise and small disagreements found in the voxels of the input volume. These clustered versions of the input seismic data can then be used in two different applications: to map 3D horizons automatically and to produce visual attribute volumes where seismic faults and any discontinuities present in the data are highlighted. Concerning the horizon mapping, as the method does not use any lateral similarity measure to organize horizon voxels into clusters, the methodology is very robust when mapping difficult cases. It is capable of mapping a great portion of the seismic interfaces present in the data. The visualization attribute is constructed by applying an auto-adaptable function that uses the voxel neighboring information through a specific measurement that globally highlights the fault regions and other discontinuities present in the original volume. We apply the methodologies to real seismic data, mapping even seismic horizons severely interrupted by various discontinuities and presenting visualization attributes where discontinuities are adequately highlighted.
Murugiah, S. "Bayesian nonparametric clustering based on Dirichlet processes." Thesis, University College London (University of London), 2010. http://discovery.ucl.ac.uk/20467/.
Coretto, Pietro. "The noise component in model-based clustering." Thesis, University College London (University of London), 2008. http://discovery.ucl.ac.uk/1445219/.
Akula, Ravi Kiran. "Botnet Detection Using Graph Based Feature Clustering." Thesis, Mississippi State University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10751733.
Detecting botnets in a network is crucial because bot activities impact numerous areas such as security, finance, health care, and law enforcement. Most existing rule-based and flow-based detection methods may not be capable of detecting bot activities in an efficient manner. Hence, designing a robust botnet-detection method is of high significance. In this study, we propose a botnet-detection methodology based on graph-based features. A Self-Organizing Map is applied to establish the clusters of nodes in the network based on these features. Our method is capable of isolating bots in small clusters while containing most normal nodes in the big clusters. A filtering procedure is also developed to further enhance the algorithm's efficiency by removing inactive nodes from bot detection. The methodology is verified using the real-world CTU-13 and ISCX botnet datasets and benchmarked against classification-based detection methods. The results show that our proposed method can efficiently detect the bots despite their varying behaviors.
Kim, Yeongwoo. "Dynamic GAN-based Clustering in Federated Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285576.
[...] a network has increased. The devices continuously generate data containing varied information, from power consumption to the configuration of the devices. Since the data contains the raw information about each local node in the network, manipulating the information has the potential to benefit the network through various methods. Because of the large amount of data generated at each node, and its non-i.i.d. nature, manual operations to process the data and tune the methods become challenging. To handle this challenge, there are attempts to use automated methods to build accurate machine learning models from a smaller amount of collected data, or to group nodes by leveraging clustering algorithms and using machine learning models within each cluster. However, conventional clustering algorithms are imperfect in a distributed and dynamic network because of the risk to data privacy, the non-dynamic clusters, and the fixed number of clusters. These limitations of the clustering algorithms degrade the performance of the machine learning models, since the clusters can become obsolete over time. Therefore, this thesis proposes a three-phase clustering algorithm for dynamic environments that leverages 1) GAN-based clustering, 2) cluster calibration, and 3) divisive clustering in federated learning. GAN-based clustering preserves data privacy because it eliminates the need to share raw data in a network to create clusters. Cluster calibration adds dynamics to the clustering by continuously updating clusters and benefits the methods that manage the network. In addition, the divisive clustering produces different numbers of clusters by iteratively selecting a cluster and splitting it into several clusters. As a result, we create clusters for dynamic environments and improve the performance of machine learning models within each cluster.
Zhang, Kai. "Kernel-based clustering and low rank approximation /." View abstract or full-text, 2008. http://library.ust.hk/cgi/db/thesis.pl?CSED%202008%20ZHANG.
McClelland, Robyn L. "Regression based variable clustering for data reduction /." Thesis, Connect to this title online; UW restricted, 2000. http://hdl.handle.net/1773/9611.
Davis, Aaron Samuel. "Bisecting Document Clustering Using Model-Based Methods /." Diss., CLICK HERE for online access, 2010. http://contentdm.lib.byu.edu/ETD/image/etd3332.pdf.
"Modeling multivariate financial time series based on correlation clustering." 2008. http://library.cuhk.edu.hk/record=b5896838.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2008.
Includes bibliographical references (leaves 61-70).
Abstracts in English and Chinese.
Chapter 1 --- Introduction
Chapter 1.1 --- Motivation and Objective
Chapter 1.2 --- Major Contribution
Chapter 1.3 --- Thesis Organization
Chapter 2 --- Measurement of Relationship between financial time series
Chapter 2.1 --- Linear Correlation
Chapter 2.1.1 --- Pearson Correlation Coefficient
Chapter 2.1.2 --- Rank Correlation
Chapter 2.2 --- Mutual Information
Chapter 2.2.1 --- Approaches of Mutual Information Estimation
Chapter 2.3 --- Copula
Chapter 2.4 --- Analysis from Experimental Data
Chapter 2.4.1 --- Experiment 1: Nonlinearity
Chapter 2.4.2 --- Experiment 2: Sensitivity of Outliers
Chapter 2.4.3 --- Experiment 3: Transformation Invariance
Chapter 2.5 --- Chapter Summary
Chapter 3 --- Clustered Dynamic Conditional Correlation Model
Chapter 3.1 --- Background Review
Chapter 3.1.1 --- GARCH Model
Chapter 3.1.2 --- Multivariate GARCH model
Chapter 3.2 --- DCC Multivariate GARCH Models
Chapter 3.2.1 --- DCC GARCH Model
Chapter 3.2.2 --- Generalized DCC GARCH Model
Chapter 3.2.3 --- Block-DCC GARCH Model
Chapter 3.3 --- Clustered DCC GARCH Model
Chapter 3.3.1 --- Minimum Distance Estimation (MDE)
Chapter 3.3.2 --- Clustered DCC (CDCC) based on MDE
Chapter 3.4 --- Clustering Method Selection
Chapter 3.5 --- Model Estimation and Testing Method
Chapter 3.5.1 --- Maximum Likelihood Estimation
Chapter 3.5.2 --- Box-Pierce Statistic Test
Chapter 3.6 --- Chapter Summary
Chapter 4 --- Experimental Result and Applications on CDCC
Chapter 4.1 --- Model Comparison and Analysis
Chapter 4.2 --- Portfolio Selection Application
Chapter 4.3 --- Value at Risk Application
Chapter 4.4 --- Chapter Summary
Chapter 5 --- Conclusion
Bibliography
Chen, Lien-Chin, and 陳連進. "A Correlation-Based Approach for Validating Gene Expression Clustering." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/r32u5a.
National Cheng Kung University
Department of Computer Science and Information Engineering (Master's and Doctoral Program)
90
This research explores various correlation-based clustering validation methods that are suitable for gene expression analysis. In biological analysis, clustering algorithms are often used first to partition the genes into groups exhibiting similar patterns of variation in expression level; clustering validation methods are then applied to evaluate the validity of the clustering results. However, most similarity measures used in existing clustering analyses belong to the distance-based category. In fact, a biologist aims to cluster together genes that have a similar expression tendency rather than the same expression values. This motivates the use of correlation-based clustering and validation indices in this study. In this thesis, an automatic clustering validation system is presented to guide the user in choosing a suitable validation index for cluster analysis. We developed a volumetric-clouds type cluster generator to synthesize various datasets, and a number of correlation-based validation indices were evaluated for measuring the quality of clustering results. Hence, the system can effectively suggest the best validation index for the different types of datasets supplied by users.
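The contrast this abstract draws between distance-based and correlation-based similarity can be seen on a tiny example. The sketch below is only an illustration with invented expression profiles; it is not one of the validation indices studied in the thesis.

```python
import math

def pearson(x, y):
    """Pearson correlation: compares the shape (tendency) of two profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def euclidean(x, y):
    """Euclidean distance: compares the raw expression values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical expression levels across five conditions.
gene_a = [1.0, 2.0, 3.0, 2.0, 1.0]
gene_b = [10.0, 20.0, 30.0, 20.0, 10.0]  # same tendency as gene_a, ten times the magnitude
gene_c = [2.0, 2.0, 1.0, 2.0, 3.0]       # values close to gene_a, opposite tendency

for name, other in (("gene_b", gene_b), ("gene_c", gene_c)):
    print(name, "distance:", round(euclidean(gene_a, other), 2),
          "correlation:", round(pearson(gene_a, other), 2))
```

By Euclidean distance, gene_c is the nearer neighbor of gene_a, while by correlation gene_b is a perfect match and gene_c is anticorrelated. A correlation-based measure therefore groups genes by expression tendency, which is the behavior the abstract argues for.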
Wang, Qian-Hao, and 王千豪. "Document Clustering based on Approximate Word Pattern Matching and Correlation of Co-occurrence." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/mf3227.
大同大學
資訊經營學系(所)
95
Because users often search the Internet for related texts to read or browse, this research aims to group large numbers of texts quickly and accurately through thematic document clustering, so that users can absorb them efficiently and turn them into genuinely useful information. The research covers feature extraction, feature strength measurement, document-feature vector space modeling, and clustering analysis on the document-document space. Feature extraction is based on approximate word pattern matching, and the strength of each feature is evaluated by the correlation of co-occurrence, which involves the approximation tolerance, that is, the distance between components of the pattern. We then extend the tf-idf concept of the vector space model from Information Retrieval to establish a document-feature vector space model based on correlation of co-occurrence and idf (pwf-idf or pa-idf). To perform effective clustering, the document-document vector space is generated from the similarity between all pairs of documents, where the similarity is calculated from the document-feature vector space model. Finally, a simple and effective clustering method that recursively merges data with high similarity is presented. Experimental analysis shows that the proposed method outperforms that of Yang & Yu, which uses word bi-grams as features, first clusters the words so that the documents containing them are grouped into concept clusters, and then combines concept clusters with high document repetition into the final document clustering. This research verifies that approximate word pattern matching can extract more common features from documents and that the proposed document clustering model can also solve the error propagation resulting from multiple clustering.
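The general shape of such a pipeline (document-feature vectors, pairwise document similarity, then repeated merging of highly similar groups) can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's method: it uses plain tf-idf rather than pwf-idf or pa-idf, whitespace tokens rather than approximate word patterns, and an average-linkage merge with an arbitrary `threshold`. The function names and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document (plain tf-idf)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def merge_clusters(vectors, threshold):
    """Merge the two most similar clusters (average linkage) until no
    pair of clusters reaches the similarity threshold."""
    clusters = [[i] for i in range(len(vectors))]

    def sim(a, b):
        total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = sim(clusters[i], clusters[j])
                if s > best:
                    best, pair = s, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical toy corpus: two finance documents, two biology documents.
docs = [
    "stock market price index".split(),
    "stock price market crash".split(),
    "gene expression cluster analysis".split(),
    "gene expression microarray analysis".split(),
]
clusters = merge_clusters(tfidf_vectors(docs), threshold=0.1)
print(clusters)
```

Because merging stops at a similarity threshold instead of a fixed cluster count, the number of clusters does not have to be chosen in advance; the threshold value itself would need tuning on real data.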
Chiu, Yi-Wen, and 邱伊文. "The Study of the Correlation on the Ability of Othello and Reading-habits through Bee-based Clustering Analysis." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/27258486255908780869.
嶺東科技大學
高階主管企管碩士在職專班
102
This study uses Bee-based Clustering (BBC) to analyze questionnaires on the correlation between students’ reading habits and Othello. More specifically, it seeks to understand the relationship between reading habits and reasoning ability in elementary school children. The research can provide a useful reference for teachers and education administration staff. The study administered a series of questionnaires to 5th-grade elementary school students. There were 163 samples, of which 161 were valid. We also collected the school subject records of these 161 students. The results show that mathematics ability is strongly correlated with Othello winning rates. They also show that reading preference has some link to good Othello winning rates.
Zimek, Arthur. "Correlation clustering." 2008. http://d-nb.info/989874494/34.
Amstutz, James Michael. "Cluster-Based Analysis Of Retinitis Pigmentosa Candidate Modifiers Using Drosophila Eye Size And Gene Expression Data." Thesis, 2021.
The goal of this thesis is to algorithmically identify candidate modifiers for retinitis pigmentosa (RP) to help improve therapy and predictions for this genetic disorder, which may lead to a complete loss of vision. Recent research by Chow et al. (2016) focused on the genetic contributors to RP by trying to identify a correlation between genetic modifiers and phenotypic variation in female Drosophila melanogaster, or fruit flies. In contrast to the genome-wide association analysis carried out in Chow et al.’s research, this study proposes using a K-Means clustering algorithm on RNA expression data to better understand which genes best exhibit characteristics of the RP degenerative model. Validating this algorithm’s effectiveness in identifying suspected genes takes priority over their classification.
This study investigates the linear relationship between Drosophila eye size and genetic expression to gather statistically significant, strongly correlated genes from the clusters with abnormally high or low eye sizes. The clustering algorithm is implemented in the R scripting language, and supplemental information details the steps of this computational process. Running the mean eye size and genetic expression data of 18,140 female Drosophila genes and 171 strains through the proposed algorithm in its four variations helped identify 140 suspected candidate modifiers for retinal degeneration. Although none of the top candidate genes found in this study matched Chow’s candidates, they were all statistically significant and strongly correlated, with several showing links to RP. These results may continue to improve as more of the 140 suspected genes are annotated using identical or comparative approaches.
Chen, Kuan-Chi, and 陳冠奇. "Combine Fuzzy Clustering and Correlation Coefficient for Medical Image Analysis." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/20086993754837383229.
龍華科技大學
資訊管理系碩士班
102
Automatic visual detection techniques have been widely applied in the medical field in recent years. Advances in image analysis technology have allowed medical images to provide more accurate references for physicians to use when making diagnoses. However, despite the rapid development of image analysis technology, body structure, organ size, and organ position differ among patients, so the image information may lead to misjudgments caused by human negligence and noise. This study applied image analysis and detection to CT images of patients with heart disease. In the proposed analytical framework, the correlation coefficient was used for detection. The results showed that the correlation coefficient could be applied both to enhanced grayscale images and to color CT images, and that both yielded good analysis results. Finally, the neighborhood intuitionistic fuzzy clustering algorithm was integrated for comparison, in order to identify the image types suited to different kinds of image analysis. This study is expected to provide a more accurate reference for physicians to use in making diagnoses.
Tsai, Kun-Hsiu, and 蔡坤修. "On the document clustering based on dynamical term clustering." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/74344328487512323211.
國立臺灣科技大學
資訊管理系
91
With the rapid growth of the World Wide Web, more and more information is accessible online. This explosion of information has resulted in an information overload problem: people have no time to read everything and have to decide which information is available. Document clustering is an important technology for addressing the information overload problem. In our project, we focus on the Chinese document clustering problem. Several difficulties remain unsolved, such as the Chinese sentence segmentation problem, the high dimensionality problem, and the unpredictable cluster number problem. We propose new methods to solve these problems. The first step in our Chinese document clustering system is to segment sentences into meaningful words. To overcome the shortcomings of the traditional Chinese sentence segmentation process, we propose a new method that combines segmentation with a thesaurus and compound word detection. Our experiments show that this method yields better clustering results. For the clustering phase, we design a dynamic term clustering method based on the SOM technique, and we propose a hierarchical, growing clustering structure to cluster the term vectors. Unlike traditional clustering methods that use document vectors, our approach yields an efficient clustering process and provides a much friendlier browsing interface.
HCWEI and 魏宏全. "Cross-Correlation-based." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/91848855387696610481.
國立交通大學
電信工程系
90
In adaptive acoustic echo cancellation (AEC), double-talk causes the echo canceller to fail to track the room impulse response. In this thesis, the cross-correlations between i) the microphone input and the estimated echo, and ii) the microphone input and the AEC error, are used to judge whether double-talk has occurred. We derive the theoretical cross-correlations, detection thresholds, and detection delays. For practical nonstationary speech signals, we propose the Variant threshold method to detect double-talk more efficiently. To distinguish echo-path changes from double-talk, we also propose a Modified-cross-correlation method. Computer simulations validate our derivations and proposed methods.
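The idea of judging double-talk from the cross-correlation between the microphone input and the estimated echo can be sketched as below. This is a simplified illustration, not the detector derived in the thesis: it uses only a zero-lag normalized cross-correlation on a single frame, synthetic sinusoids in place of speech, and a fixed `threshold` of 0.9 chosen arbitrarily. The function names are hypothetical.

```python
import math

def normalized_xcorr(x, y):
    """Zero-lag normalized cross-correlation of two equal-length frames."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den else 0.0

def is_double_talk(mic, echo_est, threshold=0.9):
    """Flag double-talk when the microphone frame is poorly explained by
    the estimated echo (threshold is an illustrative assumption)."""
    return normalized_xcorr(mic, echo_est) < threshold

# Synthetic frames: the estimated echo, and a near-end talker at another pitch.
echo = [math.sin(0.3 * n) for n in range(200)]
near = [0.8 * math.sin(1.1 * n + 0.5) for n in range(200)]

mic_single = list(echo)                          # far-end echo only
mic_double = [e + v for e, v in zip(echo, near)]  # echo plus near-end speech

print(is_double_talk(mic_single, echo))
print(is_double_talk(mic_double, echo))
```

When only the echo reaches the microphone, the normalized correlation stays near 1; added near-end speech lowers it. A fixed threshold like this cannot by itself separate double-talk from an echo-path change, which is the case the abstract's second statistic and modified method address.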
Shao, Qing. "Estimating the number of clusters in regression clustering /." 2004. http://wwwlib.umi.com/cr/yorku/fullcit?pNQ99236.
Typescript. Includes bibliographical references (leaves 114-124). Also available on the Internet. Mode of access: via web browser at http://wwwlib.umi.com/cr/yorku/fullcit?pNQ99236