Journal articles on the topic "Clustering"

Below are the top 50 journal articles for research on the topic "Clustering".

You can also download the full text of each scientific publication as a .pdf file and read its abstract online, where it is available in the metadata.

Browse journal articles from many scientific fields and compile an accurate bibliography.

1

Qian, Yue, Shixin Yao, Tianjun Wu, You Huang and Lingbin Zeng. "Improved Selective Deep-Learning-Based Clustering Ensemble". Applied Sciences 14, no. 2 (January 15, 2024): 719. http://dx.doi.org/10.3390/app14020719.

Abstract:
Clustering ensemble integrates multiple base clustering results to improve the stability and robustness of the single clustering method. It consists of two principal steps: a generation step, which is about the creation of base clusterings, and a consensus function, which is the integration of all clusterings obtained in the generation step. However, most of the existing base clustering algorithms used in the generation step are shallow clustering algorithms such as k-means. These shallow clustering algorithms do not work well or even fail when dealing with large-scale, high-dimensional unstructured data. The emergence of deep clustering algorithms provides a solution to address this challenge. Deep clustering combines the unsupervised commonality of deep representation learning to address complex high-dimensional data clustering, which has achieved excellent performance in many fields. In light of this, we introduce deep clustering into clustering ensemble and propose an improved selective deep-learning-based clustering ensemble algorithm (ISDCE). ISDCE exploits the deep clustering algorithm with different initialization parameters to generate multiple diverse base clusterings. Next, ISDCE constructs ensemble quality and diversity evaluation metrics of base clusterings to select higher-quality and rich-diversity candidate base clusterings. Finally, a weighted graph partition consensus function is utilized to aggregate the candidate base clusterings to obtain a consensus clustering result. Extensive experimental results on various types of datasets demonstrate that ISDCE performs significantly better than existing clustering ensemble approaches.
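
The two-step pipeline this abstract describes (a generation step followed by a consensus function) can be sketched in a few lines. The sketch below is illustrative only, not the authors' ISDCE code: it uses differently seeded k-means runs as base clusterings and a plain co-association consensus, and assumes scikit-learn 1.2 or later.

import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Generation step: diverse base clusterings from different initializations.
base = [KMeans(n_clusters=4, n_init=1, random_state=s).fit_predict(X) for s in range(10)]

# Consensus step: co-association matrix = fraction of base clusterings that
# place each pair of samples in the same cluster.
ca = np.mean([(l[:, None] == l[None, :]).astype(float) for l in base], axis=0)

# Partition the co-association structure (here with average linkage; the paper
# uses a weighted graph partition consensus function instead).
labels = AgglomerativeClustering(n_clusters=4, metric="precomputed",
                                 linkage="average").fit_predict(1.0 - ca)
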
2

Hess, Sibylle, Wouter Duivesteijn, Philipp Honysz and Katharina Morik. "The SpectACl of Nonconvex Clustering: A Spectral Approach to Density-Based Clustering". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3788–95. http://dx.doi.org/10.1609/aaai.v33i01.33013788.

Abstract:
When it comes to clustering nonconvex shapes, two paradigms are used to find the most suitable clustering: minimum cut and maximum density. The most popular algorithms incorporating these paradigms are Spectral Clustering and DBSCAN. Both paradigms have their pros and cons. While minimum cut clusterings are sensitive to noise, density-based clusterings have trouble handling clusters with varying densities. In this paper, we propose SPECTACL: a method combining the advantages of both approaches, while solving the two mentioned drawbacks. Our method is as easy to implement as Spectral Clustering, and theoretically founded to optimize a proposed density criterion of clusterings. Through experiments on synthetic and real-world data, we demonstrate that our approach provides robust and reliable clusterings.
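
Both baseline paradigms named in the abstract are available off the shelf, so a minimal comparison on nonconvex data is easy to set up (this sketches the baselines only, not SPECTACL itself; the parameter values are arbitrary):

from sklearn.datasets import make_moons
from sklearn.cluster import SpectralClustering, DBSCAN

X, _ = make_moons(n_samples=400, noise=0.05, random_state=0)

# Minimum-cut paradigm: spectral clustering on a nearest-neighbor graph.
cut = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                         random_state=0).fit_predict(X)

# Maximum-density paradigm: DBSCAN (label -1 marks noise points).
density = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
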
3

Manjunath, Mohith, Yi Zhang, Yeonsung Kim, Steve H. Yeo, Omar Sobh, Nathan Russell, Christian Followell, Colleen Bushell, Umberto Ravaioli and Jun S. Song. "ClusterEnG: an interactive educational web resource for clustering and visualizing high-dimensional data". PeerJ Computer Science 4 (May 21, 2018): e155. http://dx.doi.org/10.7717/peerj-cs.155.

Abstract:
Background: Clustering is one of the most common techniques in data analysis and seeks to group together data points that are similar in some measure. Although there are many computer programs available for performing clustering, a single web resource that provides several state-of-the-art clustering methods, interactive visualizations and evaluation of clustering results is lacking. Methods: ClusterEnG (acronym for Clustering Engine for Genomics) provides a web interface for clustering data and interactive visualizations including 3D views, data selection and zoom features. Eighteen clustering validation measures are also presented to aid the user in selecting a suitable algorithm for their dataset. ClusterEnG also aims at educating the user about the similarities and differences between various clustering algorithms and provides tutorials that demonstrate potential pitfalls of each algorithm. Conclusions: The web resource will be particularly useful to scientists who are not conversant with computing but want to understand the structure of their data in an intuitive manner. The validation measures facilitate the process of choosing a suitable clustering algorithm among the available options. ClusterEnG is part of a bigger project called KnowEnG (Knowledge Engine for Genomics) and is available at http://education.knoweng.org/clustereng.
4

Wei, Shaowei, Jun Wang, Guoxian Yu, Carlotta Domeniconi and Xiangliang Zhang. "Multi-View Multiple Clusterings Using Deep Matrix Factorization". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6348–55. http://dx.doi.org/10.1609/aaai.v34i04.6104.

Abstract:
Multi-view clustering aims at integrating complementary information from multiple heterogeneous views to improve clustering results. Existing multi-view clustering solutions can only output a single clustering of the data. Due to their multiplicity, multi-view data can have different groupings that are reasonable and interesting from different perspectives. However, how to find multiple, meaningful, and diverse clustering results from multi-view data is still a rarely studied and challenging topic in multi-view clustering and multiple clusterings. In this paper, we introduce a deep matrix factorization based solution (DMClusts) to discover multiple clusterings. DMClusts gradually factorizes multi-view data matrices into representational subspaces layer-by-layer and generates one clustering in each layer. To enforce the diversity between generated clusterings, it minimizes a new redundancy quantification term derived from the proximity between samples in these subspaces. We further introduce an iterative optimization procedure to simultaneously seek multiple clusterings with quality and diversity. Experimental results on benchmark datasets confirm that DMClusts outperforms state-of-the-art multiple clustering solutions.
5

Miklautz, Lukas, Dominik Mautz, Muzaffer Can Altinigneli, Christian Böhm and Claudia Plant. "Deep Embedded Non-Redundant Clustering". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 5174–81. http://dx.doi.org/10.1609/aaai.v34i04.5961.

Abstract:
Complex data types like images can be clustered in multiple valid ways. Non-redundant clustering aims at extracting those meaningful groupings by discouraging redundancy between clusterings. Unfortunately, clustering images directly in pixel space has been shown to work unsatisfactorily. This has increased interest in combining the high representational power of deep learning with clustering, termed deep clustering. Algorithms of this type combine the non-linear embedding of an autoencoder with a clustering objective and optimize both simultaneously. None of these algorithms tries to find multiple non-redundant clusterings. In this paper, we propose the novel Embedded Non-Redundant Clustering algorithm (ENRC). It is the first algorithm that combines neural-network-based representation learning with non-redundant clustering. ENRC can find multiple highly non-redundant clusterings of different dimensionalities within a data set. This is achieved by (softly) assigning each dimension of the embedded space to the different clusterings. For instance, in image data sets it can group the objects by color, material and shape, without the need for explicit feature engineering. We show the viability of ENRC in extensive experiments and empirically demonstrate the advantage of combining non-linear representation learning with non-redundant clustering.
6

Fisher, D. "Iterative Optimization and Simplification of Hierarchical Clusterings". Journal of Artificial Intelligence Research 4 (April 1, 1996): 147–78. http://dx.doi.org/10.1613/jair.276.

Abstract:
Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a 'tentative' clustering for initial examination, followed by iterative optimization, which continues to search in background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been constructed it is judged by analysts, often according to task-specific criteria. Several authors have abstracted these criteria and posited a generic performance task akin to pattern completion, where the error rate over completed patterns is used to 'externally' judge clustering utility. Given this performance task, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus promising to ease post-clustering analysis. Finally, we propose a number of objective functions, based on attribute-selection measures for decision-tree induction, that might perform well on the error rate and simplicity dimensions.
7

Li, Hongmin, Xiucai Ye, Akira Imakura and Tetsuya Sakurai. "LSEC: Large-scale spectral ensemble clustering". Intelligent Data Analysis 27, no. 1 (January 30, 2023): 59–77. http://dx.doi.org/10.3233/ida-216240.

Abstract:
A fundamental problem in machine learning is ensemble clustering, that is, combining multiple base clusterings to obtain an improved clustering result. However, most of the existing methods are unsuitable for large-scale ensemble clustering tasks owing to efficiency bottlenecks. In this paper, we propose a large-scale spectral ensemble clustering (LSEC) method to balance efficiency and effectiveness. In LSEC, a large-scale spectral clustering-based efficient ensemble generation framework is designed to generate various base clusterings with low computational complexity. Thereafter, all the base clusterings are combined using a bipartite graph partition-based consensus function to obtain improved consensus clustering results. The LSEC method achieves a lower computational complexity than most existing ensemble clustering methods. Experiments conducted on ten large-scale datasets demonstrate the efficiency and effectiveness of the LSEC method. The MATLAB code of the proposed method and the experimental datasets are available at https://github.com/Li-Hongmin/MyPaperWithCode.
8

Sun, Yuqin, Songlei Wang, Dongmei Huang, Yuan Sun, Anduo Hu and Jinzhong Sun. "A multiple hierarchical clustering ensemble algorithm to recognize clusters arbitrarily shaped". Intelligent Data Analysis 26, no. 5 (September 5, 2022): 1211–28. http://dx.doi.org/10.3233/ida-216112.

Abstract:
As a research hotspot in ensemble learning, clustering ensemble obtains robust and highly accurate algorithms by integrating multiple base clustering algorithms. Most existing clustering ensemble algorithms take linear clustering algorithms as the base clusterings. As a typical unsupervised learning technique, clustering algorithms have difficulty properly defining the accuracy of their findings, which makes it difficult to significantly enhance the performance of the final algorithm. In this article, the AGglomerative NESting (AGNES) method is used to build the base clusterings, and an integration strategy for combining multiple AGglomerative NESting clusterings is proposed. The algorithm has three main steps: evaluating the credibility of labels, producing multiple base clusterings, and constructing the relations among clusters. The proposed algorithm builds on the original advantages of AGglomerative NESting and further compensates for its inability to identify arbitrarily shaped clusters. Comparisons of clustering performance with existing clustering algorithms on different datasets establish the proposed algorithm's superiority.
9

Rouba, Baroudi, and Safia Nait Bahloul. "A Multicriteria Clustering Approach Based on Similarity Indices and Clustering Ensemble Techniques". International Journal of Information Technology & Decision Making 13, no. 04 (July 2014): 811–37. http://dx.doi.org/10.1142/s0219622014500631.

Abstract:
This paper deals with the problem of multicriteria cluster construction. The aim is to propose a multicriteria clustering procedure aimed at discovering data structures from a multicriteria perspective by defining a dissimilarity measure that takes into account the multicriteria nature of the problem. Comparing two objects in the multicriteria context is based on the preference information that expresses whether these objects are indifferent, incomparable, or one is preferred to the other. The proposed approach uses this preference information with an agreement–disagreement similarity index to compute a dissimilarity measure. The approach generates, according to the preference relations, a set of clusterings. Each clustering expresses a way of grouping objects according to the preference relation used. A good-quality final clustering is obtained by combining the previously generated clusterings using a clustering ensemble technique.
10

Gilpin, Sean, Siegfried Nijssen and Ian Davidson. "Formalizing Hierarchical Clustering as Integer Linear Programming". Proceedings of the AAAI Conference on Artificial Intelligence 27, no. 1 (June 30, 2013): 372–78. http://dx.doi.org/10.1609/aaai.v27i1.8671.

Abstract:
Hierarchical clustering is typically implemented as a greedy heuristic algorithm with no explicit objective function. In this work we formalize hierarchical clustering as an integer linear programming (ILP) problem with a natural objective function and the dendrogram properties enforced as linear constraints. Though exact solvers exist for ILP, we show that a simple randomized algorithm and a linear programming (LP) relaxation can be used to provide approximate solutions faster. Formalizing hierarchical clustering also has the benefit that relaxing the constraints can produce novel problem variations such as overlapping clusterings. Our experiments show that our formulation is capable of outperforming standard agglomerative clustering algorithms in a variety of settings, including traditional hierarchical clustering as well as learning overlapping clusterings.
11

Wang, Yanhua, Xiyu Liu and Laisheng Xiang. "GA-Based Membrane Evolutionary Algorithm for Ensemble Clustering". Computational Intelligence and Neuroscience 2017 (2017): 1–11. http://dx.doi.org/10.1155/2017/4367342.

Abstract:
Ensemble clustering can improve the generalization ability of a single clustering algorithm and generate a more robust clustering result by integrating multiple base clusterings, so it has become a focus of current clustering research. Ensemble clustering aims at finding a consensus partition that agrees as much as possible with the base clusterings. The genetic algorithm is a highly parallel, stochastic, and adaptive search algorithm inspired by natural selection and the evolutionary mechanisms of biology. In this paper, an improved genetic algorithm is designed by improving the coding of chromosomes. A new membrane evolutionary algorithm is constructed by using genetic mechanisms as evolution rules, combined with the communication mechanism of a cell-like P system. The proposed algorithm is used to optimize the base clusterings and find the optimal chromosome as the final ensemble clustering result. The global optimization ability of the genetic algorithm and the rapid convergence of the membrane system make the membrane evolutionary algorithm perform better than several state-of-the-art techniques on six real-world UCI data sets.
12

Sun, Tao, Saeed Mashdour and Mohammad Reza Mahmoudi. "An Ensemble Clusterer Framework based on Valid and Diverse Basic Small Clusters". International Journal of Information Technology & Decision Making 20, no. 04 (May 26, 2021): 1189–219. http://dx.doi.org/10.1142/s0219622021500309.

Abstract:
Clustering ensemble is a new problem in which the aim is to extract a clustering out of a pool of base clusterings. The pool of base clusterings is sometimes referred to as an ensemble. An ensemble is considered suitable if its members are diverse and each of them has a minimum quality. The method that maps an ensemble into an output partition (also called a consensus partition) is named the consensus function. The consensus function should find a consensus partition that all of the ensemble members agree on as much as possible. In this paper, a novel clustering ensemble framework that guarantees generation of a pool of base clusterings satisfying both conditions (diversity among ensemble members and high-quality members) is introduced. In accordance with its limitations, a novel consensus function is also introduced. We experimentally show that the proposed clustering ensemble framework is scalable, efficient and general. Using different base clustering algorithms, we show that our improved base clustering algorithm is better. Also, among different consensus functions, we show the effectiveness of our consensus function. Finally, comparing with the state of the art, we find that the clustering ensemble framework is comparable or even better in terms of scalability and efficacy.
13

Komori, Osamu, and Shinto Eguchi. "A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov–Nagumo Average". Entropy 23, no. 5 (April 24, 2021): 518. http://dx.doi.org/10.3390/e23050518.

Abstract:
Clustering is a major unsupervised learning technique and is widely applied in data mining and statistical data analyses. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized into hard, soft, and model-based clusterings, respectively. We propose a new clustering, called Pareto clustering, based on the Kolmogorov–Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clusterings plus maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which the underlying distribution that gives consistency is discussed. We build a minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare its performance with existing methods in simulation studies and in benchmark dataset analyses to demonstrate its high practical utility.
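
For reference, the Kolmogorov–Nagumo (quasi-arithmetic) average that the title refers to has the standard form below, for a monotone function φ; the paper's specific choice of φ, built from a Pareto survival function, is not reproduced here.

\[ M_\varphi(x_1,\dots,x_n) = \varphi^{-1}\!\left(\frac{1}{n}\sum_{i=1}^{n}\varphi(x_i)\right) \]

With φ(x) = x this reduces to the arithmetic mean used in k-means centroid updates, which is how a single formulation can interpolate between hard, soft and model-based clusterings.
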
14

Niu, Huan, Nasim Khozouie, Hamid Parvin, Hamid Alinejad-Rokny, Amin Beheshti and Mohammad Reza Mahmoudi. "An Ensemble of Locally Reliable Cluster Solutions". Applied Sciences 10, no. 5 (March 10, 2020): 1891. http://dx.doi.org/10.3390/app10051891.

Abstract:
Clustering ensemble refers to an approach in which a number of (usually weak) base clusterings are performed and their consensus clustering is used as the final clustering. Since democratic decisions are better than dictatorial ones, it may seem clear and simple that ensemble (here, clustering ensemble) decisions are better than simple model (here, clustering) decisions. But it is not guaranteed that every ensemble is better than a simple model. An ensemble is considered a better one if its members are valid or high-quality and if they participate according to their qualities in constructing the consensus clustering. In this paper, we propose a clustering ensemble framework that uses a simple clustering algorithm based on the k-medoids clustering algorithm. Our simple clustering algorithm guarantees that the discovered clusters are valid. From another point of view, it is also guaranteed that our clustering ensemble framework uses a mechanism to make use of each discovered cluster according to its quality. To implement this mechanism, an auxiliary ensemble named the reference set is created by running several k-means clustering algorithms.
15

Li, Hong-Dong, Yunpei Xu, Xiaoshu Zhu, Quan Liu, Gilbert S. Omenn and Jianxin Wang. "ClusterMine: A knowledge-integrated clustering approach based on expression profiles of gene sets". Journal of Bioinformatics and Computational Biology 18, no. 03 (June 2020): 2040009. http://dx.doi.org/10.1142/s0219720020400090.

Abstract:
Clustering analysis of gene expression data is essential for understanding complex biological data, and is widely used in important biological applications such as the identification of cell subpopulations and disease subtypes. In commonly used methods such as hierarchical clustering (HC) and consensus clustering (CC), holistic expression profiles of all genes are often used to assess the similarity between samples for clustering. While these methods have been proven successful in identifying sample clusters in many areas, they do not provide information about which gene sets (functions) contribute most to the clustering, thus limiting the interpretability of the resulting clusters. We hypothesize that integrating prior knowledge of annotated gene sets would not only achieve satisfactory clustering performance but also, more importantly, enable potential biological interpretation of clusters. Here we report ClusterMine, an approach that identifies clusters by assessing functional similarity between samples through integrating known annotated gene sets in functional annotation databases such as Gene Ontology. In addition to the cluster membership of each sample as provided by conventional approaches, it also outputs gene sets that most likely contribute to the clustering, thus facilitating biological interpretation. We compare ClusterMine with conventional approaches on nine real-world experimental datasets that represent different application scenarios in biology. We find that ClusterMine achieves better performance and that the gene sets prioritized by our method are biologically meaningful. ClusterMine is implemented as an R package and is freely available at: www.genemine.org/clustermine.php
16

Jia, Yuheng, Hui Liu, Junhui Hou and Qingfu Zhang. "Clustering Ensemble Meets Low-rank Tensor Approximation". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 9 (May 18, 2021): 7970–78. http://dx.doi.org/10.1609/aaai.v35i9.16972.

Abstract:
This paper explores the problem of clustering ensemble, which aims to combine multiple base clusterings to produce better performance than that of the individual one. The existing clustering ensemble methods generally construct a co-association matrix, which indicates the pairwise similarity between samples, as the weighted linear combination of the connective matrices from different base clusterings, and the resulting co-association matrix is then adopted as the input of an off-the-shelf clustering algorithm, e.g., spectral clustering. However, the co-association matrix may be dominated by poor base clusterings, resulting in inferior performance. In this paper, we propose a novel low-rank tensor approximation based method to solve the problem from a global perspective. Specifically, by inspecting whether two samples are clustered to an identical cluster under different base clusterings, we derive a coherent-link matrix, which contains limited but highly reliable relationships between samples. We then stack the coherent-link matrix and the co-association matrix to form a three-dimensional tensor, the low-rankness property of which is further explored to propagate the information of the coherent-link matrix to the co-association matrix, producing a refined co-association matrix. We formulate the proposed method as a convex constrained optimization problem and solve it efficiently. Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance, compared with 12 state-of-the-art methods. To the best of our knowledge, this is the first work to explore the potential of low-rank tensor on clustering ensemble, which is fundamentally different from previous approaches. Last but not least, our method only contains one parameter, which can be easily tuned.
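
The two building blocks named in this abstract can be sketched as follows; the coherent-link definition used here (pairs grouped together under every base clustering) is an assumption for illustration, not necessarily the authors' exact construction:

import numpy as np

def co_association(base_labels):
    # Fraction of base clusterings that put each pair in the same cluster.
    return np.mean([(l[:, None] == l[None, :]).astype(float) for l in base_labels], axis=0)

def coherent_links(base_labels):
    # Keep only pairs that share a cluster under *all* base clusterings:
    # few links, but highly reliable ones.
    agree = [l[:, None] == l[None, :] for l in base_labels]
    return np.logical_and.reduce(agree).astype(float)

# Stacking both n-by-n matrices yields the three-dimensional tensor whose
# low-rankness the paper exploits; that optimization is beyond this sketch.
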
17

Dasgupta, Sajib, Richard Golden and Vincent Ng. "Clustering Documents Along Multiple Dimensions". Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 879–85. http://dx.doi.org/10.1609/aaai.v26i1.8325.

Abstract:
Traditional clustering algorithms are designed to search for a single clustering solution despite the fact that multiple alternative solutions might exist for a particular dataset. For example, a set of news articles might be clustered by topic or by the author's gender or age. Similarly, book reviews might be clustered by sentiment or comprehensiveness. In this paper, we address the problem of identifying alternative clustering solutions by developing a Probabilistic Multi-Clustering (PMC) model that discovers multiple, maximally different clusterings of a data sample. Empirical results on six datasets representative of real-world applications show that our PMC model exhibits superior performance to comparable multi-clustering algorithms.
18

Davidson, Ian, and S. S. Ravi. "Making Existing Clusterings Fairer: Algorithms, Complexity Results and Insights". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3733–40. http://dx.doi.org/10.1609/aaai.v34i04.5783.

Abstract:
We explore the area of fairness in clustering from the different perspective of modifying clusterings from existing algorithms to make them fairer whilst retaining their quality. We formulate the minimal cluster modification for fairness (MCMF) problem, where the input is a given partitional clustering and the goal is to minimally change it so that the clustering is still of good quality and fairer. We show, using an intricate case analysis, that for a single protected variable the problem is efficiently solvable (i.e., in the class P) by proving that the constraint matrix for an integer linear programming (ILP) formulation is totally unimodular (TU). Interestingly, we show that even for a single protected variable, the addition of simple pairwise guidance (to, say, ensure individual-level fairness) makes the MCMF problem computationally intractable (i.e., NP-hard). Experimental results on Twitter, Census and NYT data sets show that our methods can modify existing clusterings for data sets in excess of 100,000 instances within minutes on laptops, and find clusterings that are as fair as, but of higher quality than, those produced by fair-by-design clustering algorithms.
19

AGOGINO, ADRIAN, and KAGAN TUMER. "A MULTIAGENT COORDINATION APPROACH TO ROBUST CONSENSUS CLUSTERING". Advances in Complex Systems 13, no. 02 (April 2010): 165–97. http://dx.doi.org/10.1142/s0219525910002499.

Abstract:
In many distributed modeling, control or information processing applications, clustering patterns that share certain similarities is the critical first step. However, many traditional clustering algorithms require centralized processing, reliable data collection and the availability of all the raw data in one place at one time. None of these requirements can be met in many complex real world problems. In this paper, we present an agent-based method for combining multiple base clusterings into a single unified "consensus" clustering that is robust against many types of failures and does not require spatial/temporal synchronization. In this approach, agents process clusterings coming from separate sources and pool them to produce a unified consensus. The first contribution of this work is to provide an adaptive method by which the agents update their selections to maximize an objective function based on the quality of the consensus clustering. The second contribution of this work is in providing intermediate agent-specific objective functions that significantly improve the quality of the consensus clustering process. Our results show that this agent-based method achieves comparable or better performance than traditional non-agent consensus clustering methods in fault-free conditions, and remains effective under a wide range of failure scenarios that paralyze the traditional methods.
20

Rodrigues, Pedro Pereira, João Araújo, João Gama and Luís Lopes. "A local algorithm to approximate the global clustering of streams generated in ubiquitous sensor networks". International Journal of Distributed Sensor Networks 14, no. 10 (October 2018): 155014771880823. http://dx.doi.org/10.1177/1550147718808239.

Abstract:
In ubiquitous streaming data sources, such as sensor networks, clustering nodes by the data they produce gives insights on the phenomenon being monitored. However, centralized algorithms force communication and storage requirements to grow unbounded. This article presents L2GClust, an algorithm to compute local clusterings at each node as an approximation of the global clustering. L2GClust performs local clustering of the sources based on the moving average of each node’s data over time: the moving average is approximated using memory-less statistics; clustering is based on the furthest-point algorithm applied to the centroids computed by the node’s direct neighbors. Evaluation is performed both on synthetic and real sensor data, using a state-of-the-art sensor network simulator and measuring sensitivity to network size, number of clusters, cluster overlapping, and communication incompleteness. A high level of agreement was found between local and global clusterings, with special emphasis on separability agreement, while an overall robustness to incomplete communications emerged. Communication reduction was also theoretically shown, with communication ratios empirically evaluated for large networks. L2GClust is able to keep a good approximation of the global clustering, using less communication than a centralized alternative, supporting the recommendation to use local algorithms for distributed clustering of streaming data sources.
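
The furthest-point heuristic mentioned in the abstract is the classic Gonzalez procedure; a minimal version applied to a set of neighbor centroids might look like the sketch below (the moving-average bookkeeping specific to L2GClust is omitted):

import numpy as np

def furthest_point_centers(points, k, seed=0):
    rng = np.random.default_rng(seed)
    centers = [points[rng.integers(len(points))]]   # arbitrary first center
    for _ in range(k - 1):
        # Distance of every point to its nearest chosen center so far.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])        # take the furthest point
    return np.array(centers)
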
21

VEGA-PONS, SANDRO, and JOSÉ RUIZ-SHULCLOPER. "A SURVEY OF CLUSTERING ENSEMBLE ALGORITHMS". International Journal of Pattern Recognition and Artificial Intelligence 25, no. 03 (May 2011): 337–72. http://dx.doi.org/10.1142/s0218001411008683.

Abstract:
Cluster ensemble has proved to be a good alternative when facing cluster analysis problems. It consists of generating a set of clusterings from the same dataset and combining them into a final clustering. The goal of this combination process is to improve the quality of individual data clusterings. Due to the increasing appearance of new methods, their promising results and the great number of applications, we consider that it is necessary to make a critical analysis of the existing techniques and future projections. This paper presents an overview of clustering ensemble methods that can be very useful for the community of clustering practitioners. The characteristics of several methods are discussed, which may help in the selection of the most appropriate one to solve a problem at hand. We also present a taxonomy of these techniques and illustrate some important applications.
22

Xu, Jiaxuan, Jiang Wu, Taiyong Li and Yang Nan. "Divergence-Based Locally Weighted Ensemble Clustering with Dictionary Learning and the L2,1-Norm". Entropy 24, no. 10 (September 21, 2022): 1324. http://dx.doi.org/10.3390/e24101324.

Abstract:
Accurate clustering is a challenging task with unlabeled data. Ensemble clustering aims to combine sets of base clusterings to obtain a better and more stable clustering and has shown its ability to improve clustering accuracy. Dense representation ensemble clustering (DREC) and entropy-based locally weighted ensemble clustering (ELWEC) are two typical methods for ensemble clustering. However, DREC treats each microcluster equally and hence, ignores the differences between each microcluster, while ELWEC conducts clustering on clusters rather than microclusters and ignores the sample–cluster relationship. To address these issues, a divergence-based locally weighted ensemble clustering with dictionary learning (DLWECDL) is proposed in this paper. Specifically, the DLWECDL consists of four phases. First, the clusters from the base clustering are used to generate microclusters. Second, a Kullback–Leibler divergence-based ensemble-driven cluster index is used to measure the weight of each microcluster. With these weights, an ensemble clustering algorithm with dictionary learning and the L2,1-norm is employed in the third phase. Meanwhile, the objective function is resolved by optimizing four subproblems and a similarity matrix is learned. Finally, a normalized cut (Ncut) is used to partition the similarity matrix and the ensemble clustering results are obtained. In this study, the proposed DLWECDL was validated on 20 widely used datasets and compared to some other state-of-the-art ensemble clustering methods. The experimental results demonstrated that the proposed DLWECDL is a very promising method for ensemble clustering.
23

Niu, Kun, Shubo Zhang and Junliang Chen. "Subspace clustering through attribute clustering". Frontiers of Electrical and Electronic Engineering in China 3, no. 1 (January 2008): 44–48. http://dx.doi.org/10.1007/s11460-008-0010-x.
24

Basu, Sumit, Danyel Fisher, Steven Drucker and Hao Lu. "Assisting Users with Clustering Tasks by Combining Metric Learning and Classification". Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (July 3, 2010): 394–400. http://dx.doi.org/10.1609/aaai.v24i1.7694.

Abstract:
Interactive clustering refers to situations in which a human labeler is willing to assist a learning algorithm in automatically clustering items. We present a related but somewhat different task, assisted clustering, in which a user creates explicit groups of items from a large set and wants suggestions on what items to add to each group. While the traditional approach to interactive clustering has been to use metric learning to induce a distance metric, our situation seems equally amenable to classification. Using clusterings of documents from human subjects, we found that one or the other method proved to be superior for a given cluster, but not uniformly so. We thus developed a hybrid mechanism for combining the metric learner and the classifier. We present results from a large number of trials based on human clusterings, in which we show that our combination scheme matches and often exceeds the performance of a method which exclusively uses either type of learner.
25

JONYER, ISTVAN, LAWRENCE B. HOLDER and DIANE J. COOK. "GRAPH-BASED HIERARCHICAL CONCEPTUAL CLUSTERING". International Journal on Artificial Intelligence Tools 10, no. 01n02 (March 2001): 107–35. http://dx.doi.org/10.1142/s0218213001000441.

Abstract:
Hierarchical conceptual clustering has proven to be a useful, although greatly under-explored, data mining technique. A graph-based representation of structural information combined with a substructure discovery technique has been shown to be successful in knowledge discovery. The SUBDUE substructure discovery system provides the advantages of both approaches. This work presents SUBDUE and the development of its clustering functionalities. Several examples are used to illustrate the validity of the approach both in structured and unstructured domains, as well as to compare SUBDUE to earlier clustering algorithms. Results show that SUBDUE successfully discovers hierarchical clusterings in both structured and unstructured data.
26

Zhang, Yinghui. "A Kernel Probabilistic Model for Semi-supervised Co-clustering Ensemble". Journal of Intelligent Systems 29, no. 1 (December 30, 2017): 143–53. http://dx.doi.org/10.1515/jisys-2017-0513.

Abstract:
Co-clustering is used to analyze the row and column clusters of a dataset, and it is widely used in recommendation systems. In general, different co-clustering models often obtain very different results for a dataset because each algorithm has its own optimization criteria. An alternative is to combine different co-clustering results to produce a final one, improving the quality of the co-clustering. In this paper, a semi-supervised co-clustering ensemble is illustrated in detail based on semi-supervised learning and ensemble learning. A semi-supervised co-clustering ensemble is a framework for combining multiple base co-clusterings and the side information of a dataset to obtain a stable and robust consensus co-clustering. First, the objective function of the semi-supervised co-clustering ensemble is formulated according to normalized mutual information. Then, a kernel probabilistic model for semi-supervised co-clustering ensemble (KPMSCE) is presented and the inference of KPMSCE is illustrated in detail. Furthermore, the corresponding algorithm is designed. Moreover, different algorithms and the proposed algorithm are used for experiments on real datasets. The experimental results demonstrate that the proposed algorithm can significantly outperform the compared algorithms in terms of several indices.
27

Berman, Piotr, Bhaskar DasGupta, Ming-Yang Kao and Jie Wang. "On constructing an optimal consensus clustering from multiple clusterings". Information Processing Letters 104, no. 4 (November 2007): 137–45. http://dx.doi.org/10.1016/j.ipl.2007.06.008.
28

Al-Najdi, Atheer, Nicolas Pasquier and Frédéric Precioso. "Using Closed Patterns to Solve the Consensus Clustering Problem". International Journal of Software Engineering and Knowledge Engineering 26, no. 09n10 (November 2016): 1379–97. http://dx.doi.org/10.1142/s021819401640009x.

Abstract:
Clustering is the process of partitioning a dataset into groups based on the similarity between the instances. Many clustering algorithms have been proposed, but none of them has proven to provide a good-quality partition in all situations. Consensus clustering aims to enhance the clustering process by combining different partitions obtained from different algorithms to yield a better-quality consensus solution. In this work, we propose a new consensus clustering method that uses a pattern mining technique in order to reduce the search space from an instance-based to a pattern-based space. Instead of finding one solution, our method generates multiple consensus candidates based on varying the number of base clusterings considered. The different solutions are then linked and presented as a tree that gives more insight into the similarities between the instances and the different partitions in the ensemble.
29

Altman, Naomi, and Martin Krzywinski. "Clustering". Nature Methods 14, no. 6 (May 30, 2017): 545–46. http://dx.doi.org/10.1038/nmeth.4299.
30

Brown, Harold, and Sandra Salisch. "Clustering". College Teaching 44, no. 1 (January 1996): 29–33. http://dx.doi.org/10.1080/87567555.1996.9925552.
31

Subramanian, Srividhya, and Keith E. Shafer. "Clustering". Journal of Library Administration 34, no. 3-4 (December 2001): 221–28. http://dx.doi.org/10.1300/j111v34n03_01.
32

OUELLETTE, JANE N., LINDA L. MARTIN, RACHAEL BRUGH HOLMES, JANICE M. ROGERS and ANN M. ROSSI. "Clustering". Nursing Management (Springhouse) 20, no. 6 (June 1989): 31–38. http://dx.doi.org/10.1097/00006247-198906000-00012.
33

Discombe, G. "Clustering". BMJ 290, no. 6475 (April 13, 1985): 1149. http://dx.doi.org/10.1136/bmj.290.6475.1149-a.
34

王, 田雨. "Clustering Algorithm Based on Initial Clustering". Advances in Applied Mathematics 10, no. 02 (2021): 461–70. http://dx.doi.org/10.12677/aam.2021.102052.
35

SATO-ILIC, Mika. "Fuzzy Clustering and Fuzzy Clustering Models". Journal of Japan Society for Fuzzy Theory and Intelligent Informatics 31, no. 3 (June 15, 2019): 75–81. http://dx.doi.org/10.3156/jsoft.31.3_75.
36

Liu, Shuai, and Meng Zhan. "Clustering versus non-clustering phase synchronizations". Chaos: An Interdisciplinary Journal of Nonlinear Science 24, no. 1 (March 2014): 013104. http://dx.doi.org/10.1063/1.4861685.
37

Lipor, John, and Laura Balzano. "Clustering quality metrics for subspace clustering". Pattern Recognition 104 (August 2020): 107328. http://dx.doi.org/10.1016/j.patcog.2020.107328.
38

Zhu, Wencheng, Jiwen Lu and Jie Zhou. "Nonlinear subspace clustering for image clustering". Pattern Recognition Letters 107 (May 2018): 131–36. http://dx.doi.org/10.1016/j.patrec.2017.08.023.
39

Abdullah, Manal, Ahlam Al-Thobaity, Afnan Bawazir and Nouf Al-Harbe. "Energy Efficient Ensemble K-means and SVM for Wireless Sensor Network". INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 11, no. 9 (November 27, 2013): 3034–42. http://dx.doi.org/10.24297/ijct.v11i9.3409.

Abstract:
A wireless sensor network (WSN) consists of a large number of small sensors with limited energy. For many WSN applications, a prolonged network lifetime is an important requirement. Different techniques have already been proposed to improve the energy consumption rate, such as clustering, efficient routing, and data aggregation. In this paper, we present a novel technique using clustering. Different clustering algorithms differ in their objectives. Clustering sometimes suffers from overlapping and redundant data, because a sensor node in a critical position does not know to which cluster it belongs. One option is to assign such nodes to both clusters, which amounts to overlapping nodes, and data redundancy occurs. This paper proposes a new method to solve this problem, making use of the advantages of the Support Vector Machine (SVM) to strengthen the K-means clustering algorithm and to give more accurate decision boundaries between classes. The new algorithm is called K-SVM. Numerical experiments are carried out using MATLAB to simulate sensor fields. Through comparison with the classical K-means clustering scheme, we confirm that the K-SVM algorithm improves clustering accuracy in these networks.
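
A rough sketch of the K-SVM idea as the abstract describes it (an assumed pipeline, not the authors' MATLAB implementation): cluster with k-means, then fit an SVM on the k-means labels so that borderline nodes are reassigned by the learned decision boundary.

from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, random_state=1)

km_labels = KMeans(n_clusters=3, random_state=1).fit_predict(X)
svm = SVC(kernel="rbf").fit(X, km_labels)  # learn boundaries between clusters
refined = svm.predict(X)                   # reassign borderline nodes
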
40

Wang, Xing, Jun Wang, Carlotta Domeniconi, Guoxian Yu, Guoqiang Xiao and Maozu Guo. "Multiple Independent Subspace Clusterings". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5353–60. http://dx.doi.org/10.1609/aaai.v33i01.33015353.

Abstract:
Multiple clustering aims at discovering diverse ways of organizing data into clusters. Despite the progress made, it is still a challenge for users to analyze and understand the distinctive structure of each output clustering. To ease this process, we consider diverse clusterings embedded in different subspaces, and analyze the embedding subspaces to shed light on the structure of each clustering. To this end, we provide a two-stage approach called MISC (Multiple Independent Subspace Clusterings). In the first stage, MISC uses independent subspace analysis to seek multiple statistically independent (i.e., non-redundant) subspaces, and determines the number of subspaces via the minimum description length principle. In the second stage, to account for the intrinsic geometric structure of samples embedded in each subspace, MISC performs graph-regularized semi-nonnegative matrix factorization to explore clusters. It additionally integrates the kernel trick into matrix factorization to handle non-linearly separable clusters. Experimental results on synthetic datasets show that MISC can find different interesting clusterings from the sought independent subspaces, and it also outperforms other related and competitive approaches on real-world datasets.
41

Miyamoto, Sadaaki, Youhei Kuroda and Kenta Arai. "Algorithms for Sequential Extraction of Clusters by Possibilistic Method and Comparison with Mountain Clustering". Journal of Advanced Computational Intelligence and Intelligent Informatics 12, no. 5 (September 20, 2008): 448–53. http://dx.doi.org/10.20965/jaciii.2008.p0448.

Abstract:
In addition to fuzzy c-means, possibilistic clustering is useful because it is robust against noise in data. The generated clusters are, however, strongly dependent on an initial value. We propose a family of algorithms for sequentially generating clusters “one cluster at a time,” which includes possibilistic medoid clustering. These algorithms automatically determine the number of clusters. Due to possibilistic clustering's similarity to the mountain clustering by Yager and Filev, we compare their formulation and performance in numerical examples.
42

Ma, Zhu-Juan, Zi-Han Wang, Xiang-Hua Chen and Feng Liu. "DP-Kmeans and Beyond: Optimal Clustering with a new Clustering Validity Index". 電腦學刊 33, no. 5 (October 2022): 001–17. http://dx.doi.org/10.53106/199115992022103305001.

Abstract:
The K-means clustering algorithm is widely used in many areas for its high efficiency. However, the performance of the traditional K-means algorithm is very sensitive to the selection of initial clustering centers. Furthermore, apart from convex distributed datasets, the traditional K-means algorithm still cannot optimally process many non-convex distributed datasets and datasets with outliers. To this end, this paper proposes DP-Kmeans, an improved K-means algorithm based on the Density Parameter and center replacement, which can be more accurate than the traditional K-means by dropping the random selection of the initial clustering centers and continuously updating the new centers. Due to the unsupervised learning setting, the number of clusters and the quality of the data partitions generated by the clustering algorithm cannot be guaranteed. In order to evaluate the results of the DP-Kmeans algorithm, this paper proposes the SII, a new clustering validity index based on the Sum of the Inner-cluster compactness and the Inter-cluster separateness. Based on the DP-Kmeans algorithm and the SII index, a new method is proposed to determine the optimal clustering numbers for different datasets. Experimental results on ten datasets with different distributions demonstrate that the proposed clustering method is more effective than the existing ones.
43

Muhammad Tianda, Izhar, Mohammad Noufal Ubadah, M. Fariz Fadillah Mardianto, Said Agil Al Munawwarah, Nurhalisa Ishak, Dita Amelia and Elly Ana. "Clustering Fake News with K-Means and Agglomerative Clustering Based on Word2Vec". INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTER RESEARCH 12, no. 02 (February 4, 2024): 3999–4007. http://dx.doi.org/10.47191/ijmcr/v12i2.01.

Abstract:
Fake News on digital platforms is a major problem in this digital age, and many people want methods to detect it. This research looks at a way to group Fake News articles using K-Means and Agglomerative Clustering techniques, building on the semantic representations from Word2Vec embeddings. The researchers use natural language methods and advanced machine learning to improve the accuracy and efficiency of Fake News detection. The study involves extracting meaningful features from textual data, turning them into vector representations using Word2Vec, and then applying clustering algorithms to group similar articles. The methodology aims to improve on the most recent state of the art in Fake News detection, helping to create more reliable and robust tools to fight misinformation in the digital age. In the comparative analysis of clustering metrics, K-Means clustering exhibits a Purity Score of 88.09% and an Adjusted Rand Score of 58.03%, whereas Agglomerative Clustering with the Ward method yields a Purity Score of 85.13% and an Adjusted Rand Score of 49.36%. The Purity Score of 88.09% for K-Means suggests a strong ability to form clusters where the majority of data points share the same true class. Agglomerative Clustering with Ward, though slightly lower at 85.13%, also demonstrates effective class separation within clusters. When considering the Adjusted Rand Score, which accounts for chance and measures the agreement between true and predicted labels, K-Means significantly outperforms Agglomerative Clustering with Ward, at 58.03% versus 49.36%.
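
The two reported metrics are standard and can be reproduced with scikit-learn on any pair of true/predicted label vectors (toy labels shown here, not the paper's data):

import numpy as np
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix

true = np.array([0, 0, 1, 1, 1, 2])
pred = np.array([1, 1, 0, 0, 2, 2])

# Purity: each predicted cluster is credited with its majority true class.
purity = contingency_matrix(true, pred).max(axis=0).sum() / len(true)

# Adjusted Rand Index: pair-counting agreement, corrected for chance.
ari = adjusted_rand_score(true, pred)
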
44

Miloudi, Salim, Yulin Wang and Wenjia Ding. "An Improved Similarity-Based Clustering Algorithm for Multi-Database Mining". Entropy 23, no. 5 (April 29, 2021): 553. http://dx.doi.org/10.3390/e23050553.

Abstract:
Clustering algorithms for multi-database mining (MDM) rely on computing (n² − n)/2 pairwise similarities between n multiple databases to generate and evaluate m ∈ [1, (n² − n)/2] candidate clusterings in order to select the ideal partitioning that optimizes a predefined goodness measure. However, when these pairwise similarities are distributed around the mean value, the clustering algorithm becomes indecisive when choosing which database pairs are eligible to be grouped together. Consequently, a trivial result is produced by putting all the n databases in one cluster or by returning n singleton clusters. To tackle the latter problem, we propose a learning algorithm to reduce the fuzziness of the similarity matrix by minimizing a weighted binary entropy loss function via gradient descent and back-propagation. As a result, the learned model will improve the certainty of the clustering algorithm by correctly identifying the optimal database clusters. Additionally, in contrast to gradient-based clustering algorithms, which are sensitive to the choice of the learning rate and require more iterations to converge, we propose a learning-rate-free algorithm to assess the candidate clusterings generated on the fly in fewer upper-bounded iterations. To achieve our goal, we use coordinate descent (CD) and back-propagation to search for the optimal clustering of the n multiple databases in a way that minimizes a convex clustering quality measure L(θ) in less than (n² − n)/2 iterations. By using a max-heap data structure within our CD algorithm, we optimally choose the largest weight variable θ_p,q^(i) at each iteration i such that taking the partial derivative of L(θ) with respect to θ_p,q^(i) allows us to attain the next steepest descent minimizing L(θ) without using a learning rate. Through a series of experiments on multiple database samples, we show that our algorithm outperforms the existing clustering algorithms for MDM.
45

Medrano, Cesar, Gastelum Alonso, Octavio Lafarga and Jose Cervantes. "MCClusteringSM: An approach for the multicriteria clustering problem based on a credibility similarity measure". Computer Science and Information Systems, no. 00 (2024): 33. http://dx.doi.org/10.2298/csis230302033m.

Abstract:
The multicriteria clustering problem has been studied and applied only scarcely. When a multicriteria clustering problem is tackled with an outranking approach, it is necessary to include the preferences of decision makers on the raw dataset, e.g., weights and thresholds of the evaluation criteria. It is then necessary to conduct a process to obtain a comprehensive model of preferences represented as a fuzzy or crisp outranking relation. Subsequently, the model can be exploited to derive a multicriteria clustering. This work presents an exhaustive search approach that uses a credibility similarity measure to exploit a fuzzy outranking relation and derive a multicriteria clustering. The work includes two experimental designs to evaluate the performance of the algorithm. Results show that the proposed method performs well in exploiting fuzzy outranking relations to create the clusterings.
46

Jadhav, Priyanka, and Rasika Patil. "Analysis of Clustering technique". International Journal of Trend in Scientific Research and Development Volume-2, Issue-4 (June 30, 2018): 2422–24. http://dx.doi.org/10.31142/ijtsrd15616.
47

Silva, Pavani Y. De, Chiran N. Fernando, Damith D. Wijethunge and Subha D. Fernando. "Recursive Hierarchical Clustering Algorithm". International Journal of Machine Learning and Computing 8, no. 1 (February 2018): 1–7. http://dx.doi.org/10.18178/ijmlc.2018.8.1.654.
48

FANG, LI ZHI. "QUASAR CLUSTERING AND ITS COSMOLOGICAL IMPLICATION". International Journal of Modern Physics A 04, no. 14 (August 20, 1989): 3477–502. http://dx.doi.org/10.1142/s0217751x89001394.

Abstract:
The clusterings of quasars and absorption line clouds have been analyzed from the viewpoint of the structure formation of the universe. It was found that the features of quasar clustering are quite different from those of galaxies. These results have already given several meaningful constraints on structure formation, as follows: (a) quasar clustering is much weaker than that of galaxies; (b) large scale structures, such as superclusters, were probably formed after the epoch z~2; (c) the amplitude of the total density inhomogeneity seems to be less than that of the galaxy distribution by at least a factor of 3–5 (in an Ω=1 universe); (d) Ly-α absorption clouds may be formed by clustering processes different from those of galaxies.
49

N Sujatha, Latha Narayanan Valli, A Prema, SK Rathiha and V Raja. "Initial centroid selection for K-means clustering algorithm using the statistical method". International Journal of Science and Research Archive 7, no. 2 (December 30, 2022): 474–78. http://dx.doi.org/10.30574/ijsra.2022.7.2.0309.

Abstract:
Practical clustering methods use an iterative process that converges to one of many local minima. K-means clustering is one of the most popular clustering methods, and it is well known that such iterative methods are very sensitive to the initial starting conditions. In order to improve the performance of K-means clustering, this research suggests a novel method for choosing the initial centroids. The suggested approach is evaluated on online access records, and the results demonstrate that better initial starting points and post-processing cluster refinement yield better solutions.
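
One simple statistical seeding scheme of the kind this abstract alludes to is quantile-based seeding; the sketch below is an assumption for illustration, not necessarily the paper's exact procedure:

import numpy as np
from sklearn.cluster import KMeans

def quantile_seeds(X, k):
    # Evenly spaced quantiles of each feature, so the seeds follow the data
    # distribution instead of being drawn at random.
    qs = np.linspace(0, 1, k + 2)[1:-1]
    return np.quantile(X, qs, axis=0)

X = np.random.default_rng(0).normal(size=(500, 2))
km = KMeans(n_clusters=3, init=quantile_seeds(X, 3), n_init=1).fit(X)
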
50

Du, Hang-Yuan, and Wen-Jian Wang. "A Clustering Ensemble Framework with Integration of Data Characteristics and Structure Information: A Graph Neural Networks Approach". Mathematics 10, no. 11 (May 26, 2022): 1834. http://dx.doi.org/10.3390/math10111834.

Abstract:
Clustering ensemble is a research hotspot of data mining that aggregates several base clustering results to generate a single output clustering with improved robustness and stability. However, the validity of the ensemble result is usually affected by unreliability in the generation and integration of base clusterings. In order to address this issue, we develop a clustering ensemble framework, viewed from the perspective of graph neural networks, that generates an ensemble result by integrating data characteristics and structure information. In this framework, we extract structure information from base clustering results of the data set by using a coupling affinity measure. After that, we combine structure information with data characteristics by using a graph neural network (GNN) to learn their joint embeddings in latent space. Then, we employ a Gaussian mixture model (GMM) to predict the final cluster assignment in the latent space. Finally, we construct the GNN and GMM as a unified optimization model to integrate the objectives of graph embedding and consensus clustering. Our framework can not only elegantly combine information in feature space and structure space, but can also achieve suitable representations for final cluster partitioning. Thus, it can produce an outstanding result. Experimental results on six synthetic benchmark data sets and six real-world data sets show that the proposed framework yields a better performance compared to 12 reference algorithms that are developed based on either clustering ensemble architecture or a deep clustering strategy.