Journal articles on the topic 'Co-clustering algorithm'

Consult the top 50 journal articles for your research on the topic 'Co-clustering algorithm.'

1

Kanzawa, Yuchi. "Bezdek-Type Fuzzified Co-Clustering Algorithm." Journal of Advanced Computational Intelligence and Intelligent Informatics 19, no. 6 (November 20, 2015): 852–60. http://dx.doi.org/10.20965/jaciii.2015.p0852.

Abstract:
In this study, two co-clustering algorithms based on Bezdek-type fuzzification of fuzzy clustering are proposed for categorical multivariate data. The two proposed algorithms are motivated by the fact that there are only two fuzzy co-clustering methods currently available – entropy regularization and quadratic regularization – whereas there are three fuzzy clustering methods for vectorial data: entropy regularization, quadratic regularization, and Bezdek-type fuzzification. The first proposed algorithm forms the basis of the second algorithm. The first algorithm is a variant of a spherical clustering method, with the kernelization of a maximizing model of Bezdek-type fuzzy clustering with multi-medoids. By interpreting the first algorithm in this way, the second algorithm, a spectral clustering approach, is obtained. Numerical examples demonstrate that the proposed algorithms can produce satisfactory results when suitable parameter values are selected.
2

Liu, Yongli, Jingli Chen, and Hao Chao. "A Fuzzy Co-Clustering Algorithm via Modularity Maximization." Mathematical Problems in Engineering 2018 (October 29, 2018): 1–11. http://dx.doi.org/10.1155/2018/3757580.

Abstract:
In this paper we propose a fuzzy co-clustering algorithm via modularity maximization, named MMFCC. In its objective function, we use the modularity measure as the criterion for co-clustering object-feature matrices. After converting into a constrained optimization problem, it is solved by an iterative alternative optimization procedure via modularity maximization. This algorithm offers some advantages such as directly producing a block diagonal matrix and interpretable description of resulting co-clusters, automatically determining the appropriate number of final co-clusters. The experimental studies on several benchmark datasets demonstrate that this algorithm can yield higher quality co-clusters than such competitors as some fuzzy co-clustering algorithms and crisp block-diagonal co-clustering algorithms, in terms of accuracy.
3

Zhang, Yinghui. "A Kernel Probabilistic Model for Semi-supervised Co-clustering Ensemble." Journal of Intelligent Systems 29, no. 1 (December 30, 2017): 143–53. http://dx.doi.org/10.1515/jisys-2017-0513.

Abstract:
Co-clustering is used to analyze the row and column clusters of a dataset, and it is widely used in recommendation systems. In general, different co-clustering models often obtain very different results for a dataset because each algorithm has its own optimization criteria. It is an alternative way to combine different co-clustering results to produce a final one for improving the quality of co-clustering. In this paper, a semi-supervised co-clustering ensemble is illustrated in detail based on semi-supervised learning and ensemble learning. A semi-supervised co-clustering ensemble is a framework for combining multiple base co-clusterings and the side information of a dataset to obtain a stable and robust consensus co-clustering. First, the objective function of the semi-supervised co-clustering ensemble is formulated according to normalized mutual information. Then, a kernel probabilistic model for semi-supervised co-clustering ensemble (KPMSCE) is presented and the inference of KPMSCE is illustrated in detail. Furthermore, the corresponding algorithm is designed. Moreover, different algorithms and the proposed algorithm are used for experiments on real datasets. The experimental results demonstrate that the proposed algorithm can significantly outperform the compared algorithms in terms of several indices.
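The abstract above names normalized mutual information (NMI) as the basis of the ensemble objective but does not give the formula. As a rough illustration only, the sketch below computes NMI between two flat labelings, assuming the common geometric-mean normalization I(A;B) / sqrt(H(A) H(B)); the function name, the choice of normalization, and the zero-entropy convention are assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """NMI of two labelings, normalized by sqrt(H(A) * H(B)) (an assumed convention)."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    # Mutual information from the joint and marginal label counts.
    mutual_info = 0.0
    for (a, b), n_ab in count_ab.items():
        mutual_info += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))

    # Shannon entropy of each labeling.
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())

    # Assumed convention: a labeling with a single cluster carries no information.
    if entropy_a == 0.0 or entropy_b == 0.0:
        return 0.0
    return mutual_info / math.sqrt(entropy_a * entropy_b)
```

Under this definition, two labelings that differ only in the names of their clusters score close to 1, and statistically independent labelings score 0. A consensus objective would typically sum such scores between a candidate consensus and each base clustering.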
4

Gu, Yi, and Kang Li. "Entropy-Based Multiview Data Clustering Analysis in the Era of Industry 4.0." Wireless Communications and Mobile Computing 2021 (April 30, 2021): 1–8. http://dx.doi.org/10.1155/2021/9963133.

Abstract:
In the era of Industry 4.0, single-view clustering algorithms struggle with complex data, i.e., multiview data. Multiview clustering, an extension of traditional single-view clustering, has become increasingly popular in recent years. Although multiview clustering algorithms are more effective than single-view ones, almost all current multiview clustering algorithms have two weaknesses. (1) The current multiview collaborative clustering strategy lacks theoretical support. (2) The weight of each view is averaged. To solve these problems, we used the Havrda-Charvat entropy and fuzzy index to construct a new collaborative multiview fuzzy c-means clustering algorithm using fuzzy weighting, called Co-MVFCM. The results show that Co-MVFCM has the best clustering performance among all the compared clustering algorithms.
5

Jin, Chun Xia, Hui Zhang, and Qiu Chan Bai. "Text Clustering Algorithm of Co-Occurrence Word Based on Association-Rule Mining." Applied Mechanics and Materials 599-601 (August 2014): 1749–52. http://dx.doi.org/10.4028/www.scientific.net/amm.599-601.1749.

Abstract:
According to an analysis of text features, documents with co-occurring words express topic information more strongly and more accurately. This paper therefore puts forward a text clustering algorithm based on word co-occurrence and association-rule mining. The method uses association-rule mining to extract the word co-occurrences that express the topic information in a document. The co-occurring words are used to build a model and a co-occurrence word similarity measure, and a hierarchical clustering algorithm based on word co-occurrence then performs the text clustering. Experimental results show that the proposed method improves the efficiency and accuracy of text clustering compared with other algorithms.
6

Hussain, Syed Fawad, and Shahid Iqbal. "CCGA: Co-similarity based Co-clustering using genetic algorithm." Applied Soft Computing 72 (November 2018): 30–42. http://dx.doi.org/10.1016/j.asoc.2018.07.045.

7

Hou, Jie, Xiufen Ye, Chuanlong Li, and Yixing Wang. "K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks." Genes 12, no. 1 (January 12, 2021): 87. http://dx.doi.org/10.3390/genes12010087.

Abstract:
Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for the construction of co-expression networks is weighted gene co-expression network analysis (WGCNA), which can identify highly co-expressed clusters of genes (modules). WGCNA identifies gene modules using hierarchical clustering. The major drawback of hierarchical clustering is that once two objects are clustered together, it cannot be reversed; thus, re-adjustment of the unbefitting decision is impossible. In this paper, we calculate the similarity matrix with the distance correlation for WGCNA to construct a gene co-expression network, and present a new approach called the k-module algorithm to improve the WGCNA clustering results. This method can assign all genes to the module with the highest mean connectivity with these genes. This algorithm re-adjusts the results of hierarchical clustering while retaining the advantages of the dynamic tree cut method. The validity of the algorithm is verified using six datasets from microarray and RNA-seq data. The k-module algorithm has fewer iterations, which leads to lower complexity. We verify that the gene modules obtained by the k-module algorithm have high enrichment scores and strong stability. Our method improves upon hierarchical clustering, and can be applied to general clustering algorithms based on the similarity matrix, not limited to gene co-expression network analysis.
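The reassignment idea described in this abstract (move each gene to the module with which it has the highest mean connectivity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' k-module implementation: the function name, the plain nested-list similarity matrix, the in-place update order, the strict-improvement rule, and the `max_iter` cap are all hypothetical choices.

```python
def reassign_by_mean_connectivity(similarity, labels, max_iter=100):
    """Move each item to the module with the highest mean similarity to it.

    similarity: symmetric n x n matrix (list of lists).
    labels: initial module label per item, e.g. from hierarchical clustering.
    """
    labels = list(labels)
    n = len(labels)
    for _ in range(max_iter):
        changed = False
        for i in range(n):
            # Accumulate similarity of item i to every module, excluding itself.
            totals, counts = {}, {}
            for j in range(n):
                if j == i:
                    continue
                module = labels[j]
                totals[module] = totals.get(module, 0.0) + similarity[i][j]
                counts[module] = counts.get(module, 0) + 1
            if not totals:
                continue
            means = {m: totals[m] / counts[m] for m in totals}
            current = means.get(labels[i], float("-inf"))
            best = max(means, key=means.get)
            # Move only on strict improvement so ties cannot cause oscillation.
            if means[best] > current:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```

For example, with two tight pairs of items and an initial labeling that puts three of them in one module, the misplaced item is moved to join its partner and the loop stops on the next pass, when no label changes.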
8

MA, PATRICK C. H., KEITH C. C. CHAN, and DAVID K. Y. CHIU. "CLUSTERING AND RE-CLUSTERING FOR PATTERN DISCOVERY IN GENE EXPRESSION DATA." Journal of Bioinformatics and Computational Biology 03, no. 02 (April 2005): 281–301. http://dx.doi.org/10.1142/s0219720005001053.

Abstract:
The combined interpretation of gene expression data and gene sequences is important for the investigation of the intricate relationships of gene expression at the transcription level. The expression data produced by microarray hybridization experiments can lead to the identification of clusters of co-expressed genes that are likely co-regulated by the same regulatory mechanisms. By analyzing the promoter regions of co-expressed genes, the common regulatory patterns characterized by transcription factor binding sites can be revealed. Many clustering algorithms have been used to uncover inherent clusters in gene expression data. In this paper, based on experiments using simulated and real data, we show that the performance of these algorithms could be further improved. For the clustering of expression data typically characterized by a lot of noise, we propose to use a two-phase clustering algorithm consisting of an initial clustering phase and a second re-clustering phase. The proposed algorithm has several desirable features: (i) it utilizes both local and global information by computing both a "local" pairwise distance between two gene expression profiles in Phase 1 and a "global" probabilistic measure of interestingness of cluster patterns in Phase 2, (ii) it distinguishes between relevant and irrelevant expression values when performing re-clustering, and (iii) it makes explicit the patterns discovered in each cluster for possible interpretations. Experimental results show that the proposed algorithm can be an effective algorithm for discovering clusters in the presence of very noisy data. The patterns that are discovered in each cluster are found to be meaningful and statistically significant, and cannot otherwise be easily discovered. Based on these discovered patterns, genes co-expressed under the same experimental conditions and range of expression levels have been identified and evaluated. When identifying regulatory patterns at the promoter regions of the co-expressed genes, we also discovered well-known transcription factor binding sites in them. These binding sites can provide explanations for the co-expressed patterns.
9

Shang, Ronghua, Yang Li, and Licheng Jiao. "Co-evolution-based immune clonal algorithm for clustering." Soft Computing 20, no. 4 (February 7, 2015): 1503–19. http://dx.doi.org/10.1007/s00500-015-1602-z.

10

Liu, Yongli, Shuai Wu, Zhizhong Liu, and Hao Chao. "A fuzzy co-clustering algorithm for biomedical data." PLOS ONE 12, no. 4 (April 26, 2017): e0176536. http://dx.doi.org/10.1371/journal.pone.0176536.

11

Sieranoja, Sami, and Pasi Fränti. "Adapting k-means for graph clustering." Knowledge and Information Systems 64, no. 1 (December 4, 2021): 115–42. http://dx.doi.org/10.1007/s10115-021-01623-y.

Abstract:
We propose two new algorithms for clustering graphs and networks. The first, called the K-algorithm, is derived directly from the k-means algorithm. It applies similar iterative local optimization but without the need to calculate the means. It inherits the properties of k-means clustering in terms of both good local optimization capability and the tendency to get stuck at a local optimum. The second algorithm, called the M-algorithm, gradually improves on the results of the K-algorithm to find new and potentially better local optima. It repeatedly merges and splits random clusters and tunes the results with the K-algorithm. Both algorithms are general in the sense that they can be used with different cost functions. We consider the conductance cost function and also introduce two new cost functions, called inverse internal weight and mean internal weight. According to our experiments, the M-algorithm outperforms eight other state-of-the-art methods. We also perform a case study by analyzing clustering results of a disease co-occurrence network, which demonstrate the usefulness of the algorithms in an important real-life application.
12

Rahman, Mohammad Arifur, Nathan LaPierre, Huzefa Rangwala, and Daniel Barbara. "Metagenome sequence clustering with hash-based canopies." Journal of Bioinformatics and Computational Biology 15, no. 06 (December 2017): 1740006. http://dx.doi.org/10.1142/s0219720017400066.

Abstract:
Metagenomics is the collective sequencing of co-existing microbial communities which are ubiquitous across various clinical and ecological environments. Due to the large volume and random short sequences (reads) obtained from community sequences, analysis of diversity, abundance and functions of different organisms within these communities are challenging tasks. We present a fast and scalable clustering algorithm for analyzing large-scale metagenome sequence data. Our approach achieves efficiency by partitioning the large number of sequence reads into groups (called canopies) using hashing. These canopies are then refined by using state-of-the-art sequence clustering algorithms. This canopy-clustering (CC) algorithm can be used as a pre-processing phase for computationally expensive clustering algorithms. We use and compare three hashing schemes for canopy construction with five popular and state-of-the-art sequence clustering methods. We evaluate our clustering algorithm on synthetic and real-world 16S and whole metagenome benchmarks. We demonstrate the ability of our proposed approach to determine meaningful Operational Taxonomic Units (OTU) and observe significant speedup with regards to run time when compared to different clustering algorithms. We also make our source code publicly available on GitHub.
13

Song, Kun, Xiwen Yao, Feiping Nie, Xuelong Li, and Mingliang Xu. "Weighted bilateral K-means algorithm for fast co-clustering and fast spectral clustering." Pattern Recognition 109 (January 2021): 107560. http://dx.doi.org/10.1016/j.patcog.2020.107560.

14

Ren, Jiaqi, and Youlong Yang. "Multitask possibilistic and fuzzy co-clustering algorithm for clustering data with multisource features." Neural Computing and Applications 32, no. 9 (November 8, 2018): 4785–804. http://dx.doi.org/10.1007/s00521-018-3851-0.

15

Wei, Jiahui, Huifang Ma, Yuhang Liu, Zhixin Li, and Ning Li. "Hierarchical high-order co-clustering algorithm by maximizing modularity." International Journal of Machine Learning and Cybernetics 12, no. 10 (July 23, 2021): 2887–98. http://dx.doi.org/10.1007/s13042-021-01375-9.

16

WU, Hu, Yong-Ji WANG, Zhe WANG, Xiu-Li WANG, and Shuan-Zhu DU. "Two-Phase Collaborative Filtering Algorithm Based on Co-Clustering." Journal of Software 21, no. 5 (May 17, 2010): 1042–54. http://dx.doi.org/10.3724/sp.j.1001.2010.03758.

17

Ahn, Jungyu, and Ju-Hong Lee. "Clustering Algorithm for Time Series with Co-movement Relationship." Journal of Korean Institute of Intelligent Systems 27, no. 6 (December 31, 2017): 552–59. http://dx.doi.org/10.5391/jkiis.2017.27.6.552.

18

Hussain, Syed Fawad, Adeel Pervez, and Masroor Hussain. "Co-clustering optimization using Artificial Bee Colony (ABC) algorithm." Applied Soft Computing 97 (December 2020): 106725. http://dx.doi.org/10.1016/j.asoc.2020.106725.

19

de França, Fabrício Olivetti. "A hash-based co-clustering algorithm for categorical data." Expert Systems with Applications 64 (December 2016): 24–35. http://dx.doi.org/10.1016/j.eswa.2016.07.024.

20

Yang, Hui, Han Peng, Jianyong Zhu, and Feiping Nie. "Co-Clustering Ensemble Based on Bilateral K-Means Algorithm." IEEE Access 8 (2020): 51285–94. http://dx.doi.org/10.1109/access.2020.2979915.

21

Yan, Yang, Lihui Chen, and William-Chandra Tjhi. "Semi-supervised fuzzy co-clustering algorithm for document categorization." Knowledge and Information Systems 34, no. 1 (November 15, 2011): 55–74. http://dx.doi.org/10.1007/s10115-011-0454-9.

22

Wartelle, Adrien, Farah Mourad-Chehade, Farouk Yalaoui, Jan Chrusciel, David Laplanche, and Stéphane Sanchez. "Clustering of a Health Dataset Using Diagnosis Co-Occurrences." Applied Sciences 11, no. 5 (March 7, 2021): 2373. http://dx.doi.org/10.3390/app11052373.

Abstract:
Assessing the health profiles of populations is a crucial task to create a coherent healthcare offer. Emergency Departments (EDs) are at the core of the healthcare system and could benefit from this evaluation via an improved understanding of the healthcare needs of their population. This paper proposes a novel hierarchical agglomerative clustering algorithm based on multimorbidity analysis. The proposed approach constructs the clustering dendrogram by introducing new quality indicators based on the relative risk of co-occurrences of patient diagnoses. This algorithm enables the detection of multimorbidity patterns by merging similar patient profiles according to their common diagnoses. The multimorbidity approach has been applied to the data of the largest ED of the Aube Department (Eastern France) to cluster its patient visits. Among the 120,718 visits identified during a 24-month period, 16 clusters were identified, accounting for 94.8% of the visits, with the five most prevalent clusters representing 63.0% of them. The new quality indicators show a coherent and good clustering solution with a cluster membership of 1.81 based on a cluster compactness of 1.40 and a cluster separation of 0.77. Compared to the literature, the proposed approach is appropriate for the discovery of multimorbidity patterns and could help to develop better clustering algorithms for more diverse healthcare datasets.
23

Ge, Shaodi, Hongjun Li, and Liuhong Luo. "Constrained Dual Graph Regularized Orthogonal Nonnegative Matrix Tri-Factorization for Co-Clustering." Mathematical Problems in Engineering 2019 (December 26, 2019): 1–17. http://dx.doi.org/10.1155/2019/7565640.

Abstract:
Coclustering approaches for grouping data points and features have recently been receiving extensive attention. In this paper, we propose a constrained dual graph regularized orthogonal nonnegative matrix trifactorization (CDONMTF) algorithm to solve the coclustering problems. The new method improves the clustering performance obviously by employing hard constraints to retain the priori label information of samples, establishing two nearest neighbor graphs to encode the geometric structure of data manifold and feature manifold, and combining with biorthogonal constraints as well. In addition, we have also derived the iterative optimization scheme of CDONMTF and proved its convergence. Clustering experiments on 5 UCI machine-learning data sets and 7 image benchmark data sets show that the achievement of the proposed algorithm is superior to that of some existing clustering algorithms.
24

Xu, Xiaoxi. "Discovering Latent Strategies." Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (August 4, 2011): 1834–35. http://dx.doi.org/10.1609/aaai.v25i1.8058.

Abstract:
Strategy mining is a new area of research about discovering strategies in decision-making. In this paper, we formulate the strategy-mining problem as a clustering problem, called the latent-strategy problem. In a latent-strategy problem, a corpus of data instances is given, each of which is represented by a set of features and a decision label. The inherent dependency of the decision label on the features is governed by a latent strategy. The objective is to find clusters, each of which contains data instances governed by the same strategy. Existing clustering algorithms are inappropriate to cluster dependency because they either assume feature independency (e.g., K-means) or only consider the co-occurrence of features without explicitly modeling the special dependency of the decision label on other features (e.g., Latent Dirichlet Allocation (LDA)). In this paper, we present a baseline unsupervised learning algorithm for dependency clustering. Our model-based clustering algorithm iterates between an assignment step and a minimization step to learn a mixture of decision tree models that represent latent strategies. Similar to the Expectation Maximization algorithm, our algorithm is grounded in the statistical learning theory. Different from other clustering algorithms, our algorithm is irrelevant-feature resistant and its learned clusters (modeled by decision trees) are strongly interpretable and predictive. We systematically evaluate our algorithm using a common law dataset comprised of actual cases. Experimental results show that our algorithm significantly outperforms K-means and LDA on clustering dependency.
25

ter, Pe, and Max well. "Co-Clustering based Classification Algorithm with Latent Semantic Relationship for Cross-Domain Text Classification through Wikipedia." Bonfring International Journal of Data Mining 7, no. 2 (May 31, 2017): 01–05. http://dx.doi.org/10.9756/bijdm.8330.

26

Gargi, Ullas, Wenjun Lu, Vahab Mirrokni, and Sangho Yoon. "Large-Scale Community Detection on YouTube for Topic Discovery and Exploration." Proceedings of the International AAAI Conference on Web and Social Media 5, no. 1 (August 3, 2021): 486–89. http://dx.doi.org/10.1609/icwsm.v5i1.14191.

Abstract:
Detecting coherent, well-connected communities in large graphs provides insight into the graph structure and can serve as the basis for content discovery. Clustering is a popular technique for community detection but global algorithms that examine the entire graph do not scale. Local algorithms are highly parallelizable but perform sub-optimally, especially in applications where we need to optimize multiple metrics. We present a multi-stage algorithm based on local-clustering that is highly scalable, combining a pre-processing stage, a local clustering stage, and a post-processing stage. We apply it to the YouTube video graph to generate named clusters of videos with coherent content. We formalize coverage, coherence, and connectivity metrics and evaluate the quality of the algorithm for large YouTube graphs. Our use of local algorithms for global clustering, and its implementation and practical evaluation on such a large scale is a first of its kind.
27

KHARMA, NAWWAF, CHING Y. SUEN, and PEI F. GUO. "PALMPRINTS: A COOPERATIVE CO-EVOLUTIONARY ALGORITHM FOR CLUSTERING HAND IMAGES." International Journal of Image and Graphics 05, no. 03 (July 2005): 595–616. http://dx.doi.org/10.1142/s0219467805001902.

Abstract:
The main objective of Project PalmPrints is to develop and demonstrate a special co-evolutionary genetic algorithm (GA) that optimizes a clustering fitness function with respect to three quantities: (a) the dimensions of the clustering space; (b) the number of clusters; and (c) the locations of the various clusters. This genetic algorithm is applied to the specific practical problem of hand image clustering, with success. In addition to the above, this research effort makes the following contributions: (i) a CD database of (raw and processed) right-hand images; (ii) a number of novel features designed specifically for hand image classification; (iii) an extended fitness function, which is particularly suited to a dynamic (i.e. dimensionality varying) clustering space. Despite the complexity of the multi-optimizational task, the results of this study are clear. The GA succeeded in achieving a maximum fitness value of 99.1%, while reducing the number of dimensions (features) of the space by more than half (from 84 to 41).
28

Kim, Junghoon, Kaiyu Feng, Gao Cong, Diwen Zhu, Wenyuan Yu, and Chunyan Miao. "ABC." Proceedings of the VLDB Endowment 15, no. 10 (June 2022): 2134–47. http://dx.doi.org/10.14778/3547305.3547318.

Abstract:
Finding a set of co-clusters in a bipartite network is a fundamental and important problem. In this paper, we present the Attributed Bipartite Co-clustering (ABC) problem which unifies two main concepts: (i) bipartite modularity optimization, and (ii) attribute cohesiveness. To the best of our knowledge, this is the first work to find co-clusters while considering the attribute cohesiveness. We prove that ABC is NP-hard and is not in APX, unless P=NP. We propose three algorithms: (1) a top-down algorithm; (2) a bottom-up algorithm; (3) a group matching algorithm. Extensive experimental results on real-world attributed bipartite networks demonstrate the efficiency and effectiveness of our algorithms.
29

Wu, X., A. Poorthuis, R. Zurita-Milla, and M. J. Kraak. "A WEB-BASED INTERACTIVE PLATFORM FOR CO-CLUSTERING SPATIO-TEMPORAL DATA." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W7 (September 12, 2017): 175–79. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w7-175-2017.

Abstract:
Since current studies on clustering analysis mainly focus on exploring spatial or temporal patterns separately, a co-clustering algorithm is utilized in this study to enable the concurrent analysis of spatio-temporal patterns. To allow users to adopt and adapt the algorithm for their own analysis, it is integrated within the server side of an interactive web-based platform. The client side of the platform, running within any modern browser, is a graphical user interface (GUI) with multiple linked visualizations that facilitates the understanding, exploration and interpretation of the raw dataset and co-clustering results. Users can also upload their own datasets and adjust clustering parameters within the platform. To illustrate the use of this platform, an annual temperature dataset from 28 weather stations over 20 years in the Netherlands is used. After the dataset is loaded, it is visualized in a set of linked visualizations: a geographical map, a timeline and a heatmap. This aids the user in understanding the nature of their dataset and the appropriate selection of co-clustering parameters. Once the dataset is processed by the co-clustering algorithm, the results are visualized in the small multiples, a heatmap and a timeline to provide various views for better understanding and also further interpretation. Since the visualization and analysis are integrated in a seamless platform, the user can explore different sets of co-clustering parameters and instantly view the results in order to do iterative, exploratory data analysis. As such, this interactive web-based platform allows users to analyze spatio-temporal data using the co-clustering method and also helps the understanding of the results using multiple linked visualizations.
30

Parraga-Alava, Jorge, and Mario Inostroza-Ponta. "Influence of the go-based semantic similarity measures in multi-objective gene clustering algorithm performance." Journal of Bioinformatics and Computational Biology 18, no. 06 (November 5, 2020): 2050038. http://dx.doi.org/10.1142/s0219720020500389.

Abstract:
Using a prior biological knowledge of relationships and genetic functions for gene similarity, from repository such as the Gene Ontology (GO), has shown good results in multi-objective gene clustering algorithms. In this scenario and to obtain useful clustering results, it would be helpful to know which measure of biological similarity between genes should be employed to yield meaningful clusters that have both similar expression patterns (co-expression) and biological homogeneity. In this paper, we studied the influence of the four most used GO-based semantic similarity measures in the performance of a multi-objective gene clustering algorithm. We used four publicly available datasets and carried out comparative studies based on performance metrics for the multi-objective optimization field and clustering performance indexes. In most of the cases, using Jiang–Conrath and Wang similarities stand in terms of multi-objective metrics. In clustering properties, Resnik similarity allows to achieve the best values of compactness and separation and therefore of co-expression of groups of genes. Meanwhile, in biological homogeneity, the Wang similarity reports greater number of significant GO terms. However, statistical, visual, and biological significance tests showed that none of the GO-based semantic similarity measures stand out above the rest in order to significantly improve the performance of the multi-objective gene clustering algorithm.
31

Li, Qinlu, Tao Du, Rui Zhang, Jin Zhou, and Shouning Qu. "DCE-IVI: Density-based clustering ensemble by selecting internal validity index." Intelligent Data Analysis 26, no. 6 (November 12, 2022): 1487–506. http://dx.doi.org/10.3233/ida-216105.

Abstract:
Because no single clustering algorithm can efficiently partition datasets with arbitrary shapes, clustering ensembles have been proposed to integrate multiple clustering results into a better partition. Most ensemble research applies a single algorithm with different parameters; the results are easy to integrate, but the approach struggles to partition complex datasets. Other methods integrate different algorithms, which partition datasets from different aspects but fail to take outliers into account, harming the partition results. To solve these problems, we cluster datasets with three different density-based algorithms. The innovations of this paper are: (1) by setting dynamic thresholds, lower-frequency evidence in the co-association matrix is gradually deleted to obtain multiple reconstructed matrices; (2) these reconstructed matrices are analyzed by hierarchical clustering to obtain basic clustering results; (3) an internal validity index is designed from the compactness within clusters and the correlation between clusters, and is used to select the final clustering result. With these innovations, the clustering effect is significantly improved. Finally, a series of experiments is designed, and the results verify the improvement and effectiveness of the proposed technique (DCE-IVI).
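The co-association matrix mentioned in this abstract is a standard ensemble-clustering construct: entry (i, j) is the fraction of base clusterings that place samples i and j in the same cluster. The sketch below builds it and applies a single fixed threshold to drop low-frequency evidence. It is only an illustration: the function names and the fixed `threshold` argument are assumptions, and the paper's dynamic threshold schedule is not reproduced here.

```python
def co_association(base_clusterings):
    """Fraction of base clusterings in which each pair of samples shares a cluster."""
    n = len(base_clusterings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in base_clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(base_clusterings)
    return [[c / m for c in row] for row in counts]


def prune_weak_evidence(matrix, threshold):
    """Zero out entries below the threshold (hypothetical fixed cut-off)."""
    return [[v if v >= threshold else 0.0 for v in row] for row in matrix]
```

With the two base clusterings `[0, 0, 1, 1]` and `[0, 1, 1, 1]`, samples 2 and 3 always co-occur (entry 1.0), samples 0 and 1 co-occur in half of the runs (entry 0.5), and samples 0 and 3 never do (entry 0.0). Pruning at 0.75 keeps only the first of these. The pruned matrix can then be handed to a hierarchical clustering routine as a similarity matrix.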
32

Fan, Jiachen, Xiaoxiao Wang, Tingfeng Wu, Jin Zhu, and Pingxin Wang. "Three-Way Ensemble Clustering Based on Sample’s Perturbation Theory." Mathematics 10, no. 15 (July 26, 2022): 2598. http://dx.doi.org/10.3390/math10152598.

Abstract:
The complexity of the data type and distribution leads to the increase in uncertainty in the relationship between samples, which brings challenges to effectively mining the potential cluster structure of data. Ensemble clustering aims to obtain a unified cluster division by fusing multiple different base clustering results. This paper proposes a three-way ensemble clustering algorithm based on sample’s perturbation theory to solve the problem of inaccurate decision making caused by inaccurate information or insufficient data. The algorithm first combines the natural nearest neighbor algorithm to generate two sets of perturbed data sets, randomly extracts the feature subsets of the samples, and uses the traditional clustering algorithm to obtain different base clusters. The sample’s stability is obtained by using the co-association matrix and determinacy function, and then the samples can be divided into a stable region and unstable region according to a threshold for the sample’s stability. The stable region consists of high-stability samples and is divided into the core region of each cluster using the K-means algorithm. The unstable region consists of low-stability samples and is assigned to the fringe regions of each cluster. Therefore, a three-way clustering result is formed. The experimental results show that the proposed algorithm in this paper can obtain better clustering results compared with other clustering ensemble algorithms on the UCI Machine Learning Repository data set, and can effectively reveal the clustering structure.
33

Song, Yangqiu, Shimei Pan, Shixia Liu, Furu Wei, Michelle Zhou, and Weihong Qian. "Constrained Coclustering for Textual Documents." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (July 3, 2010): 581–86. http://dx.doi.org/10.1609/aaai.v24i1.7680.

Abstract:
In this paper, we present a constrained co-clustering approach for clustering textual documents. Our approach combines the benefits of information-theoretic co-clustering and constrained clustering. We use a two-sided hidden Markov random field (HMRF) to model both the document and word constraints. We also develop an alternating expectation maximization (EM) algorithm to optimize the constrained co-clustering model. We have conducted two sets of experiments on a benchmark data set: (1) using human-provided category labels to derive document and word constraints for semi-supervised document clustering, and (2) using automatically extracted named entities to derive document constraints for unsupervised document clustering. Compared to several representative constrained clustering and co-clustering approaches, our approach is shown to be more effective for high-dimensional, sparse text data.
34

Zhang, Jiawen, Fuxing Yang, Zhongliang Deng, Xiao Fu, and Jiazhi Han. "Research on D2D co-localization algorithm based on clustering filtering." China Communications 17, no. 8 (August 2020): 121–32. http://dx.doi.org/10.23919/jcc.2020.08.010.

35

Verma, Om Prakash, and Heena Hooda. "A novel intuitionistic fuzzy co-clustering algorithm for brain images." Multimedia Tools and Applications 79, no. 41-42 (August 21, 2020): 31517–40. http://dx.doi.org/10.1007/s11042-020-09320-8.

36

Inbarani H., Hannah, Ahmad Taher Azar, and Jothi G. "Leukemia Image Segmentation Using a Hybrid Histogram-Based Soft Covering Rough K-Means Clustering Algorithm." Electronics 9, no. 1 (January 19, 2020): 188. http://dx.doi.org/10.3390/electronics9010188.

Abstract:
Segmenting an image of a nucleus is one of the most essential tasks in a leukemia diagnostic system. Accurate and rapid segmentation methods help the physicians identify the diseases and provide better treatment at the appropriate time. Recently, hybrid clustering algorithms have started being widely used for image segmentation in medical image processing. In this article, a novel hybrid histogram-based soft covering rough k-means clustering (HSCRKM) algorithm for leukemia nucleus image segmentation is discussed. This algorithm combines the strengths of a soft covering rough set and rough k-means clustering. The histogram method was utilized to identify the number of clusters to avoid random initialization. Different types of features such as gray level co-occurrence matrix (GLCM), color, and shape-based features were extracted from the segmented image of the nucleus. Machine learning prediction algorithms were applied to classify the cancerous and non-cancerous cells. The proposed strategy is compared with an existing clustering algorithm, and the efficiency is evaluated based on the prediction metrics. The experimental results show that the HSCRKM method efficiently segments the nucleus, and it is also inferred that logistic regression and neural network perform better than other prediction algorithms.
37

Zhou, G., Q. Li, G. Deng, T. Yue, and X. Zhou. "MINING CO-LOCATION PATTERNS WITH CLUSTERING ITEMS FROM SPATIAL DATA SETS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-3 (May 2, 2018): 2505–9. http://dx.doi.org/10.5194/isprs-archives-xlii-3-2505-2018.

Abstract:
The explosive growth of spatial data and the widespread use of spatial databases emphasize the need for spatial data mining. Co-location pattern discovery is an important branch of spatial data mining. Spatial co-locations represent subsets of features that are frequently located together in geographic space. However, the appearance of a spatial feature C is often determined not by a single spatial feature A or B but by the two spatial features A and B together; that is, where A and B appear together, C often appears. We note that this co-location pattern is different from the traditional co-location pattern. Thus, this paper presents a new concept called clustering terms, and this type of pattern is called a co-location pattern with clustering items. Because the traditional algorithm cannot mine this type of pattern, we introduce the related concepts in detail and propose a novel algorithm, which extends the join-based approach proposed by Huang. Finally, we evaluate the performance of this algorithm.
38

Wu, X., R. Zurita-Milla, M. J. Kraak, and E. Izquierdo-Verdiguier. "CLUSTERING-BASED APPROACHES TO THE EXPLORATION OF SPATIO-TEMPORAL DATA." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-2/W7 (September 14, 2017): 1387–91. http://dx.doi.org/10.5194/isprs-archives-xlii-2-w7-1387-2017.

Abstract:
As a spatio-temporal data mining task, clustering helps the exploration of patterns in the data by grouping similar elements together. However, previous studies on spatial or temporal clustering are incapable of analysing complex patterns in spatio-temporal data, for instance concurrent spatio-temporal patterns in 2D or 3D datasets. In this study we present two clustering algorithms for complex pattern analysis: (1) the Bregman block average co-clustering algorithm with I-divergence (BBAC_I), which enables the concurrent analysis of spatio-temporal patterns in a 2D data matrix, and (2) the Bregman cube average tri-clustering algorithm with I-divergence (BCAT_I), which enables the complete partitional analysis of a 3D data cube. The use of the two clustering algorithms is illustrated with a Dutch daily average temperature dataset from 28 weather stations from 1992 to 2011. BBAC_I is applied to the averaged yearly dataset to identify station-year co-clusters that contain similar temperatures along stations and years, thus revealing patterns along both spatial and temporal dimensions. BCAT_I is applied to the temperature dataset organized in a data cube with one spatial dimension (stations) and two nested temporal dimensions (years and days). By partitioning the whole dataset into clusters of stations and years with similar within-year temperature behaviour, BCAT_I explores the spatio-temporal patterns of intra-annual variability in the daily temperature dataset. As such, both BBAC_I and BCAT_I, combined with suitable geovisualization techniques, allow the exploration of complex spatial and temporal patterns, which contributes to a better understanding of complex patterns in spatio-temporal data.
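The objective behind a block-average co-clustering with I-divergence can be sketched as follows. This is only an illustration under stated assumptions: each entry is approximated by the mean of its co-cluster, the loss is the generalized I-divergence summed over all entries, and all values are strictly positive. The names `block_average` and `i_divergence` and the toy matrix are hypothetical; the alternating row and column reassignment of BBAC_I is not shown.

```python
import math


def block_average(data, row_labels, col_labels):
    """Replace every entry by the mean of the co-cluster (block) it belongs to."""
    sums, counts = {}, {}
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] = sums.get(key, 0.0) + value
            counts[key] = counts.get(key, 0) + 1
    return [
        [sums[(row_labels[i], col_labels[j])] / counts[(row_labels[i], col_labels[j])]
         for j in range(len(row))]
        for i, row in enumerate(data)
    ]


def i_divergence(data, approx):
    """Sum of a*log(a/b) - a + b over all entries (values must be positive)."""
    total = 0.0
    for row_a, row_b in zip(data, approx):
        for a, b in zip(row_a, row_b):
            total += a * math.log(a / b) - a + b
    return total


# Toy station-by-year matrix with an obvious 2 x 2 block structure.
data = [[1, 1, 5, 5], [1, 1, 5, 5], [9, 9, 2, 2]]
good = i_divergence(data, block_average(data, [0, 0, 1], [0, 0, 1, 1]))
bad = i_divergence(data, block_average(data, [0, 1, 1], [0, 0, 1, 1]))
```

A co-clustering that matches the block structure leaves no information loss, while a mismatched row assignment gives a strictly positive divergence, which is the quantity an iterative algorithm of this kind would try to reduce.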
39

Kang, Zhen, Tianchen Huang, Shan Zeng, Hao Li, Lei Dong, and Chaofan Zhang. "A Method for Detection of Corn Kernel Mildew Based on Co-Clustering Algorithm with Hyperspectral Image Technology." Sensors 22, no. 14 (July 17, 2022): 5333. http://dx.doi.org/10.3390/s22145333.

Abstract:
Hyperspectral imaging can simultaneously acquire spectral and spatial information of the samples and is, therefore, widely applied in the non-destructive detection of grain quality. Supervised learning is the mainstream method of hyperspectral imaging for pixel-level detection of mildew in corn kernels, which requires a large number of training samples to establish the prediction or classification models. This paper presents an unsupervised redundant co-clustering algorithm (FCM-SC) based on multi-center fuzzy c-means (FCM) clustering and spectral clustering (SC), which can effectively detect non-uniformly distributed mildew in corn kernels. This algorithm first carries out fuzzy c-means clustering of sample features, extracts redundant cluster centers, merges the cluster centers by spectral clustering, and finally finds the category of corresponding cluster centers for each sample. It effectively solves the problems of the poor ability of the traditional fuzzy c-means clustering algorithm to classify the data with complex structure distribution and the complex calculation of the traditional spectral clustering algorithm. The experimental results demonstrated that the proposed algorithm could describe the complex structure of mildew distribution in corn kernels and exhibits higher stability, better anti-interference ability, generalization ability, and accuracy than the supervised classification model.
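The first stage described above, fuzzy c-means clustering of sample features, relies on a membership update that can be sketched as below. This is the textbook FCM update with fuzzifier `m`, not the authors' FCM-SC implementation; the function name `fcm_memberships` and the toy points are hypothetical, and the later steps (extracting redundant centers and merging them by spectral clustering) are omitted.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard FCM update: result[k][i] is the membership of point k in cluster i."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        # Euclidean distance from the point to every cluster center.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5 for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: give it crisp membership there.
            zeros = dists.count(0.0)
            memberships.append([1.0 / zeros if d == 0.0 else 0.0 for d in dists])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]
        )
    return memberships


points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships(points, centers)
```

Memberships in each row sum to one, and a point closer to a center receives a larger share of its membership from that center.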
40

Ghizlane, Ez-Zarrad, Sabbar Wafae, and Bekkhoucha Abdelkrim. "Features Clustering Around Latent Variables for High Dimensional Data." E3S Web of Conferences 297 (2021): 01070. http://dx.doi.org/10.1051/e3sconf/202129701070.

Abstract:
Clustering of variables is the task of grouping similar variables into different groups. It may be useful in several situations such as dimensionality reduction, feature selection, and redundancy detection. In the present study, we combine two methods of feature clustering: the clustering of variables around latent variables (CLV) algorithm and the k-means based co-clustering algorithm (kCC). Indeed, classical CLV cannot be applied to high dimensional data because this approach becomes tedious when the number of features increases.
41

Honda, Katsuhiro, Issei Hayashi, Seiki Ubukata, and Akira Notsu. "Three-Mode Fuzzy Co-Clustering Based on Probabilistic Concept and Comparison with FCM-Type Algorithms." Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 4 (July 20, 2021): 478–88. http://dx.doi.org/10.20965/jaciii.2021.p0478.

Abstract:
Three-mode fuzzy co-clustering is a promising technique for analyzing relational co-occurrence information among three mode elements. The conventional FCM-type algorithms achieve simultaneous fuzzy partition of three mode elements based on the fuzzy c-means (FCM) concept, but they often require careful tuning of three independent fuzzification parameters. In this paper, a novel three-mode fuzzy co-clustering algorithm is proposed by modifying the conventional aggregation criterion of three elements based on a probabilistic concept. The fuzziness degree of the three-mode partition can be easily tuned with a single parameter under the guideline of the probabilistic standard. The characteristic features of the proposed method are compared with the conventional algorithms through numerical experiments using an artificial dataset and are demonstrated in application to a real-world dataset of MovieLens movie evaluation data.
42

Wu, Xiao Hong, Tong Xiang Cai, Bin Wu, and Jun Sun. "Research on the Variety Discrimination of Apple Using a Hybrid Possibilistic Clustering." Advanced Materials Research 710 (June 2013): 768–71. http://dx.doi.org/10.4028/www.scientific.net/amr.710.768.

Abstract:
Near infrared reflectance (NIR) spectroscopy has been used to obtain NIR spectra of two varieties of apple samples. The dimensionality of the NIR spectra was reduced by principal component analysis (PCA), and discriminant information was extracted by linear discriminant analysis (LDA). Last, a hybrid possibilistic clustering algorithm (HPCA) was utilized as a classifier to discriminate the apple samples of different varieties. HPCA integrates the possibilistic clustering algorithm (PCA) and the improved possibilistic c-means (IPCM) clustering algorithm, and produces not only the membership values but also typicality values by simple computation of the sample co-variance. Experimental results showed that HPCA, as an unsupervised learning algorithm, could quickly and easily discriminate the apple varieties.
43

Liu, Ji, and Lei Li. "Network Community Detection Based on Co-Neighbor Modularity Matrix with Spectral Clustering." Applied Mechanics and Materials 55-57 (May 2011): 1237–41. http://dx.doi.org/10.4028/www.scientific.net/amm.55-57.1237.

Abstract:
The problem of community detection is one of the outstanding issues in the study of network systems. This paper presents a co-neighbor modularity matrix to measure the quality of community detection. The problem of community detection is projected into the clustering of eigenvectors in Euclidean space. Network community structure is detected with a spectral clustering algorithm, which is free from the noise of the initial mean points in the K-means algorithm. The experimental results suggest that the method is efficient in finding community structure.
44

Lan, Yu, Yan Bo, and Yao Baozhen. "Core Business Selection Based on Ant Colony Clustering Algorithm." Mathematical Problems in Engineering 2014 (2014): 1–6. http://dx.doi.org/10.1155/2014/136753.

Abstract:
The core business is the most important business of a diversified enterprise. In this paper, we first introduce the definition and characteristics of the core business and then describe the ant colony clustering algorithm. In order to test the effectiveness of the proposed method, Tianjin Port Logistics Development Co., Ltd. is selected as the research object. Based on the current state of the company's development, the core business of the company can be identified by the ant colony clustering algorithm. The results indicate that the proposed method is an effective way to determine the core business of a company.
45

Bhattacharya, Anindya, and Rajat K. De. "Bi-correlation clustering algorithm for determining a set of co-regulated genes." Bioinformatics 25, no. 21 (September 3, 2009): 2795–801. http://dx.doi.org/10.1093/bioinformatics/btp526.

46

SHEN, CHENGCHENG, and YING LIU. "A TRIPARTITE CLUSTERING ANALYSIS ON MICRORNA, GENE AND DISEASE MODEL." Journal of Bioinformatics and Computational Biology 10, no. 01 (February 2012): 1240007. http://dx.doi.org/10.1142/s0219720012400070.

Abstract:
Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expressions of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. Tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members and pass the hidden relational information along the relation chain by considering multi-type members. Here we report a spectral co-clustering algorithm for k-partite graph to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings.
47

Machap, Logenthiran, Afnizanfaizal Abdullah, and Afnizanfaizal Abdullah. "Functional analysis of cancer gene subtype from co-clustering and classification." Indonesian Journal of Electrical Engineering and Computer Science 18, no. 1 (April 1, 2020): 343. http://dx.doi.org/10.11591/ijeecs.v18.i1.pp343-350.

Abstract:
Cancer is a heterogeneous genetic disease with large phenotypic differences among different cancer types or even within the same cancer type. Recent advances in genome-wide profiling technologies offer a chance to explore molecular changes throughout the progression of cancer. Therefore, various statistical and machine learning algorithms have been designed and developed for the handling and interpretation of high-throughput microarray molecular data. Studies on the discovery of molecular subtypes have allowed cancers to be allocated into groups that are considered to share similar molecular and clinical characteristics. Thus, the main objective of this research is to discover cancer gene subtypes and classify genes to obtain higher accuracy. In particular, an improved co-clustering algorithm is used to discover cancer subtypes. The supervised infinite feature selection gene selection method is then combined with a multi-class SVM for classification of selected genes and further biological analysis. The analysis on breast cancer and glioblastoma multiforme identifies the top genes involved in cancer and the pathways present in the top genes of both cancers. The functional analysis is useful in the medical and pharmaceutical fields for cancer diagnosis and prognosis.
48

Rohe, Karl, Tai Qin, and Bin Yu. "Co-clustering directed graphs to discover asymmetries and directional communities." Proceedings of the National Academy of Sciences 113, no. 45 (October 21, 2016): 12679–84. http://dx.doi.org/10.1073/pnas.1525793113.

Abstract:
In directed graphs, relationships are asymmetric and these asymmetries contain essential structural information about the graph. Directed relationships lead to a new type of clustering that is not feasible in undirected graphs. We propose a spectral co-clustering algorithm called di-sim for asymmetry discovery and directional clustering. A Stochastic co-Blockmodel is introduced to show favorable properties of di-sim. To account for the sparse and highly heterogeneous nature of directed networks, di-sim uses the regularized graph Laplacian and projects the rows of the eigenvector matrix onto the sphere. A nodewise asymmetry score and di-sim are used to analyze the clustering asymmetries in the networks of Enron emails, political blogs, and the Caenorhabditis elegans chemical connectome. In each example, a subset of nodes have clustering asymmetries; these nodes send edges to one cluster, but receive edges from another cluster. Such nodes yield insightful information (e.g., communication bottlenecks) about directed networks, but are missed if the analysis ignores edge direction.
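The two preprocessing ideas named in this abstract, a regularized graph Laplacian and projection of rows onto the sphere, can be sketched in a few lines. The sketch assumes the common form L[i][j] = A[i][j] / sqrt((out_i + tau) * (in_j + tau)) with tau defaulting to the average out-degree; that form, the default, and the function names are assumptions for illustration and are not confirmed by the abstract. The singular value decomposition and the final k-means step are omitted.

```python
def regularized_laplacian(adj, tau=None):
    """Scale each edge by regularized out-degree of its source and in-degree of its target."""
    n = len(adj)
    out_deg = [sum(row) for row in adj]
    in_deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    if tau is None:
        # Assumed default: average out-degree (must be positive for the division below).
        tau = sum(out_deg) / n
    return [
        [adj[i][j] / ((out_deg[i] + tau) ** 0.5 * (in_deg[j] + tau) ** 0.5)
         for j in range(n)]
        for i in range(n)
    ]


def project_rows_to_sphere(rows):
    """Rescale every nonzero row to unit Euclidean length; zero rows are left unchanged."""
    projected = []
    for row in rows:
        norm = sum(v * v for v in row) ** 0.5
        projected.append([v / norm for v in row] if norm > 0 else list(row))
    return projected


# Small directed graph: node 0 -> 1, 0 -> 2, 1 -> 2.
adj = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
lap = regularized_laplacian(adj)
unit = project_rows_to_sphere([[3.0, 4.0], [0.0, 0.0]])
```

In a full pipeline the row projection would be applied to the matrices of leading left and right singular vectors of `lap`, so that sending and receiving behaviour are clustered separately.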
49

Du, Jia. "Research on Intelligent Tourism Information System Based on Data Mining Algorithm." Mobile Information Systems 2021 (September 23, 2021): 1–10. http://dx.doi.org/10.1155/2021/5727788.

Abstract:
Smart tourism represents a new idea of applying IT to increase the competitiveness and satisfaction of all stakeholders, including visitors as co-creators of tourism products and co-promoters of a destination. To improve the effect of smart tourism, this paper enhances common big data technology at the algorithm level to make big data more intuitive. We construct big data visualization technology and realize real-time online visualization of tourism data. In the Spark distributed environment, we use the conventional K clustering technique to improve the final output utilizing clustering means. The research results show that the smart tourism information system based on big data constructed in this paper can meet actual tourism information needs and user experience needs. The experimental results show that the proposed predictor, based on the improved algorithm, performs significantly better.
50

Peralta, Billy, and Luis Alberto Caro. "Improved Object Recognition with Decision Trees Using Subspace Clustering." Journal of Advanced Computational Intelligence and Intelligent Informatics 20, no. 1 (January 19, 2016): 41–48. http://dx.doi.org/10.20965/jaciii.2016.p0041.

Abstract:
Generic object recognition algorithms usually require complex classification models because of intrinsic difficulties arising from problems such as changes in pose, lighting conditions, or partial occlusions. Decision trees present an inexpensive alternative for classification tasks and offer the advantage of being simple to understand. On the other hand, a common scheme for object recognition is given by the appearances of visual words, also known as the bag-of-words method. Although multiple co-occurrences of visual words are more informative regarding visual classes, a comprehensive evaluation of such combinations is unfeasible because it would result in a combinatorial explosion. In this paper, we propose to obtain the multiple co-occurrences of visual words using a variant of the CLIQUE subspace-clustering algorithm for improving the object recognition performance of simple decision trees. Experiments on standard object datasets show that our method improves the accuracy of the classification of generic objects in comparison to traditional decision tree techniques that are similar, in terms of accuracy, to ensemble techniques. In the future we plan to evaluate other variants of decision trees, and apply other subspace-clustering algorithms.