Academic literature on the topic 'Co-clustering algorithm'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Co-clustering algorithm.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Co-clustering algorithm"

1

Kanzawa, Yuchi. "Bezdek-Type Fuzzified Co-Clustering Algorithm." Journal of Advanced Computational Intelligence and Intelligent Informatics 19, no. 6 (November 20, 2015): 852–60. http://dx.doi.org/10.20965/jaciii.2015.p0852.

Full text
Abstract:
In this study, two co-clustering algorithms based on Bezdek-type fuzzification of fuzzy clustering are proposed for categorical multivariate data. The two proposed algorithms are motivated by the fact that there are only two fuzzy co-clustering methods currently available – entropy regularization and quadratic regularization – whereas there are three fuzzy clustering methods for vectorial data: entropy regularization, quadratic regularization, and Bezdek-type fuzzification. The first proposed algorithm forms the basis of the second algorithm. The first algorithm is a variant of a spherical clustering method, with the kernelization of a maximizing model of Bezdek-type fuzzy clustering with multi-medoids. By interpreting the first algorithm in this way, the second algorithm, a spectral clustering approach, is obtained. Numerical examples demonstrate that the proposed algorithms can produce satisfactory results when suitable parameter values are selected.
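Bezdek-type fuzzification refers to the classical fuzzifier exponent m of Bezdek's fuzzy c-means. As a point of reference only, this is the standard Bezdek membership update given cluster distances, not the co-clustering algorithm proposed in the paper:

```python
import numpy as np

def bezdek_memberships(dist, m=2.0):
    """Classical Bezdek fuzzy-c-means membership update.

    dist: (n_points, n_clusters) array of positive distances.
    Returns memberships u with rows summing to 1:
        u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1))
    """
    power = 2.0 / (m - 1.0)
    ratio = dist[:, :, None] / dist[:, None, :]   # ratio[i, k, j] = d_ik / d_ij
    return 1.0 / (ratio ** power).sum(axis=2)

d = np.array([[1.0, 2.0]])        # one point, two cluster centres
u = bezdek_memberships(d, m=2.0)  # the closer centre gets the larger membership
```

The exponent m controls how fuzzy the partition is; as m approaches 1 the memberships become crisp.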
APA, Harvard, Vancouver, ISO, and other styles
2

Liu, Yongli, Jingli Chen, and Hao Chao. "A Fuzzy Co-Clustering Algorithm via Modularity Maximization." Mathematical Problems in Engineering 2018 (October 29, 2018): 1–11. http://dx.doi.org/10.1155/2018/3757580.

Full text
Abstract:
In this paper we propose a fuzzy co-clustering algorithm via modularity maximization, named MMFCC. Its objective function uses the modularity measure as the criterion for co-clustering object-feature matrices. After conversion into a constrained optimization problem, it is solved by an iterative alternating optimization procedure. The algorithm offers several advantages, such as directly producing a block-diagonal matrix and an interpretable description of the resulting co-clusters, and automatically determining the appropriate number of final co-clusters. Experimental studies on several benchmark datasets demonstrate that the algorithm yields higher-quality co-clusters, in terms of accuracy, than competing fuzzy co-clustering and crisp block-diagonal co-clustering algorithms.
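To make the criterion concrete, the following sketch evaluates a bipartite (Newman-style) modularity of a given co-clustering of an object-feature matrix. It illustrates the kind of measure being maximized, not the MMFCC objective or its optimization procedure:

```python
import numpy as np

def coclustering_modularity(A, row_labels, col_labels):
    """Bipartite modularity of a co-clustering of an object-feature matrix A:

        Q = (1/|A|) * sum_ij (A_ij - a_i * b_j / |A|) * [r_i == c_j]

    where a_i, b_j are row/column sums and |A| the grand total.
    """
    A = np.asarray(A, dtype=float)
    total = A.sum()
    a = A.sum(axis=1, keepdims=True)      # row degrees
    b = A.sum(axis=0, keepdims=True)      # column degrees
    expected = a @ b / total              # independence null model
    same = np.asarray(row_labels)[:, None] == np.asarray(col_labels)[None, :]
    return ((A - expected) * same).sum() / total

# A perfect 2x2 block-diagonal structure with the matching labels scores 0.5.
A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]])
Q = coclustering_modularity(A, [0, 0, 1, 1], [0, 0, 1, 1])
```

Swapping the column labels so rows and columns disagree drives Q negative, which is what makes modularity usable as a co-clustering criterion.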
3

Zhang, Yinghui. "A Kernel Probabilistic Model for Semi-supervised Co-clustering Ensemble." Journal of Intelligent Systems 29, no. 1 (December 30, 2017): 143–53. http://dx.doi.org/10.1515/jisys-2017-0513.

Full text
Abstract:
Co-clustering analyzes the row and column clusters of a dataset, and it is widely used in recommender systems. In general, different co-clustering models often obtain very different results on a dataset because each algorithm has its own optimization criterion. Combining different co-clustering results into a final one is an alternative way to improve the quality of co-clustering. In this paper, a semi-supervised co-clustering ensemble is described in detail, based on semi-supervised learning and ensemble learning. A semi-supervised co-clustering ensemble is a framework for combining multiple base co-clusterings with the side information of a dataset to obtain a stable and robust consensus co-clustering. First, the objective function of the semi-supervised co-clustering ensemble is formulated according to normalized mutual information. Then, a kernel probabilistic model for semi-supervised co-clustering ensembles (KPMSCE) is presented, its inference is described in detail, and the corresponding algorithm is designed. The proposed algorithm and several alternatives are compared in experiments on real datasets. The experimental results demonstrate that the proposed algorithm significantly outperforms the compared algorithms in terms of several indices.
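Normalized mutual information, the agreement score on which the ensemble objective above is formulated, can be computed directly. A minimal sketch (NMI with square-root normalization; practical ensembles would use an optimized library routine):

```python
import numpy as np
from math import log

def nmi(labels_a, labels_b):
    """Normalized mutual information between two flat clusterings:
    NMI = I(A;B) / sqrt(H(A) * H(B)), invariant to relabelling."""
    a, b = np.asarray(labels_a), np.asarray(labels_b)
    n = len(a)
    mi = 0.0
    for x in np.unique(a):
        for y in np.unique(b):
            pxy = np.sum((a == x) & (b == y)) / n
            if pxy > 0:
                px, py = np.sum(a == x) / n, np.sum(b == y) / n
                mi += pxy * log(pxy / (px * py))

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / n
        return -np.sum(p * np.log(p))

    denom = np.sqrt(entropy(a) * entropy(b))
    return mi / denom if denom > 0 else 1.0

score = nmi([0, 0, 1, 1], [1, 1, 0, 0])  # identical up to relabelling
```

A consensus co-clustering can then be sought by maximizing the average NMI between the consensus and the base co-clusterings, separately over rows and columns.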
4

Gu, Yi, and Kang Li. "Entropy-Based Multiview Data Clustering Analysis in the Era of Industry 4.0." Wireless Communications and Mobile Computing 2021 (April 30, 2021): 1–8. http://dx.doi.org/10.1155/2021/9963133.

Full text
Abstract:
In the era of Industry 4.0, single-view clustering algorithms struggle in the face of complex, multiview data. In recent years, multiview clustering, an extension of traditional single-view clustering, has become more and more popular. Although multiview clustering algorithms are more effective than single-view ones, almost all current multiview clustering algorithms share two weaknesses: (1) the collaborative multiview clustering strategy lacks theoretical support, and (2) the weight of each view is simply averaged. To solve these problems, we use the Havrda-Charvat entropy and a fuzzy index to construct a new collaborative multiview fuzzy c-means clustering algorithm with fuzzy weighting, called Co-MVFCM. The corresponding results show that Co-MVFCM has the best clustering performance among all the compared clustering algorithms.
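One common way entropy enters such objectives is through entropy-regularized view weights: minimizing a weighted sum of per-view losses plus an entropy term yields a softmax over the negative losses, so poorly fitting views are down-weighted instead of averaged. This generic scheme (not necessarily the exact Co-MVFCM update) can be sketched as:

```python
import numpy as np

def entropy_view_weights(losses, lam=1.0):
    """Entropy-regularised view weighting.

    Minimising  sum_v w_v * J_v + lam * sum_v w_v * log(w_v)
    subject to  sum_v w_v = 1  gives  w_v proportional to exp(-J_v / lam).
    """
    losses = np.asarray(losses, dtype=float)
    w = np.exp(-losses / lam)
    return w / w.sum()

w = entropy_view_weights([1.0, 1.0, 3.0])  # the poorly fitting third view is down-weighted
```

The regularization strength lam interpolates between winner-takes-all weighting (small lam) and uniform averaging (large lam), which is precisely the averaging weakness noted in the abstract.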
5

Jin, Chun Xia, Hui Zhang, and Qiu Chan Bai. "Text Clustering Algorithm of Co-Occurrence Word Based on Association-Rule Mining." Applied Mechanics and Materials 599-601 (August 2014): 1749–52. http://dx.doi.org/10.4028/www.scientific.net/amm.599-601.1749.

Full text
Abstract:
According to an analysis of text features, a document's co-occurring words express topic information more strongly and accurately. This paper therefore proposes a text clustering algorithm based on word co-occurrence and association-rule mining. The method uses association-rule mining to extract the word co-occurrences that express the topic information of a document, builds a document model and a co-occurrence-word similarity measure from them, and then applies hierarchical clustering based on word co-occurrence to cluster the texts. Experimental results show that the proposed method improves the efficiency and accuracy of text clustering compared with other algorithms.
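The core representation step, describing each document by its topic-bearing word co-occurrences, can be sketched as follows. For simplicity this filters pairs by document frequency, whereas the paper mines them with association rules:

```python
import numpy as np
from itertools import combinations
from collections import Counter

def cooccurrence_features(docs, min_count=2):
    """Represent each document by its within-document word pairs
    (co-occurrences), keeping pairs that appear in at least
    `min_count` documents.  Returns a binary doc-by-pair matrix
    suitable as input to a hierarchical clustering step."""
    pair_df = Counter()
    doc_pairs = []
    for doc in docs:
        pairs = set(combinations(sorted(set(doc.split())), 2))
        doc_pairs.append(pairs)
        pair_df.update(pairs)                 # document frequency of each pair
    vocab = sorted(p for p, c in pair_df.items() if c >= min_count)
    index = {p: i for i, p in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for d, pairs in enumerate(doc_pairs):
        for p in pairs:
            if p in index:
                X[d, index[p]] = 1.0
    return X, vocab

docs = ["data mining text", "text data mining", "horse jump sensor", "sensor horse jump"]
X, vocab = cooccurrence_features(docs)
# documents 0/1 and 2/3 share identical co-occurring word pairs
```

A similarity between documents (for instance cosine similarity on the rows of X) then feeds a standard agglomerative clustering routine.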
6

Hussain, Syed Fawad, and Shahid Iqbal. "CCGA: Co-similarity based Co-clustering using genetic algorithm." Applied Soft Computing 72 (November 2018): 30–42. http://dx.doi.org/10.1016/j.asoc.2018.07.045.

Full text
7

Hou, Jie, Xiufen Ye, Chuanlong Li, and Yixing Wang. "K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks." Genes 12, no. 1 (January 12, 2021): 87. http://dx.doi.org/10.3390/genes12010087.

Full text
Abstract:
Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for the construction of co-expression networks is weighted gene co-expression network analysis (WGCNA), which can identify highly co-expressed clusters of genes (modules). WGCNA identifies gene modules using hierarchical clustering. The major drawback of hierarchical clustering is that once two objects are clustered together, it cannot be reversed; thus, re-adjustment of the unbefitting decision is impossible. In this paper, we calculate the similarity matrix with the distance correlation for WGCNA to construct a gene co-expression network, and present a new approach called the k-module algorithm to improve the WGCNA clustering results. This method can assign all genes to the module with the highest mean connectivity with these genes. This algorithm re-adjusts the results of hierarchical clustering while retaining the advantages of the dynamic tree cut method. The validity of the algorithm is verified using six datasets from microarray and RNA-seq data. The k-module algorithm has fewer iterations, which leads to lower complexity. We verify that the gene modules obtained by the k-module algorithm have high enrichment scores and strong stability. Our method improves upon hierarchical clustering, and can be applied to general clustering algorithms based on the similarity matrix, not limited to gene co-expression network analysis.
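The reassignment at the heart of the k-module idea, moving each gene to the module with the highest mean connectivity to it, can be sketched from an adjacency matrix. This is an illustrative single pass, not the published pipeline:

```python
import numpy as np

def k_module_step(adj, labels):
    """One reassignment pass: move every gene to the module with which
    it has the highest mean connectivity (mean adjacency to that
    module's members, excluding the gene itself)."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    modules = np.unique(labels)
    new = labels.copy()
    for g in range(adj.shape[0]):
        best, best_score = labels[g], -np.inf
        for m in modules:
            members = np.flatnonzero((labels == m) & (np.arange(len(labels)) != g))
            if len(members) == 0:
                continue
            score = adj[g, members].mean()    # mean connectivity to module m
            if score > best_score:
                best, best_score = m, score
        new[g] = best
    return new

# Two dense blocks; gene 2 starts in the wrong module and gets moved.
adj = np.array([[0.0, 0.9, 0.8, 0.1],
                [0.9, 0.0, 0.7, 0.1],
                [0.8, 0.7, 0.0, 0.2],
                [0.1, 0.1, 0.2, 0.0]])
labels = np.array([0, 0, 1, 1])
new = k_module_step(adj, labels)
```

Unlike a hierarchical merge, this pass can undo an earlier bad grouping, which is exactly the drawback of hierarchical clustering the abstract describes.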
8

MA, PATRICK C. H., KEITH C. C. CHAN, and DAVID K. Y. CHIU. "CLUSTERING AND RE-CLUSTERING FOR PATTERN DISCOVERY IN GENE EXPRESSION DATA." Journal of Bioinformatics and Computational Biology 03, no. 02 (April 2005): 281–301. http://dx.doi.org/10.1142/s0219720005001053.

Full text
Abstract:
The combined interpretation of gene expression data and gene sequences is important for the investigation of the intricate relationships of gene expression at the transcription level. The expression data produced by microarray hybridization experiments can lead to the identification of clusters of co-expressed genes that are likely co-regulated by the same regulatory mechanisms. By analyzing the promoter regions of co-expressed genes, the common regulatory patterns characterized by transcription factor binding sites can be revealed. Many clustering algorithms have been used to uncover inherent clusters in gene expression data. In this paper, based on experiments using simulated and real data, we show that the performance of these algorithms could be further improved. For the clustering of expression data typically characterized by a lot of noise, we propose to use a two-phase clustering algorithm consisting of an initial clustering phase and a second re-clustering phase. The proposed algorithm has several desirable features: (i) it utilizes both local and global information by computing both a "local" pairwise distance between two gene expression profiles in Phase 1 and a "global" probabilistic measure of interestingness of cluster patterns in Phase 2, (ii) it distinguishes between relevant and irrelevant expression values when performing re-clustering, and (iii) it makes explicit the patterns discovered in each cluster for possible interpretations. Experimental results show that the proposed algorithm can be an effective algorithm for discovering clusters in the presence of very noisy data. The patterns that are discovered in each cluster are found to be meaningful and statistically significant, and cannot otherwise be easily discovered. Based on these discovered patterns, genes co-expressed under the same experimental conditions and range of expression levels have been identified and evaluated. 
When identifying regulatory patterns at the promoter regions of the co-expressed genes, we also discovered well-known transcription factor binding sites in them. These binding sites can provide explanations for the co-expressed patterns.
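A generic cluster-then-recluster pass can be sketched as follows: each cluster is summarized by a sign pattern, low-magnitude expression values are treated as irrelevant, and profiles are reassigned by pattern match. This is a crude stand-in for the paper's global probabilistic interestingness measure, shown only to convey the two-phase idea:

```python
import numpy as np

def recluster_by_pattern(X, labels, eps=0.5):
    """Second-phase reassignment: summarise each cluster by the sign
    pattern of its mean profile, treat values with |x| < eps as
    irrelevant, and move every profile to the cluster whose pattern it
    matches on the most relevant positions."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    patterns = {m: np.sign(X[labels == m].mean(axis=0)) for m in np.unique(labels)}
    new = labels.copy()
    for i, x in enumerate(X):
        relevant = np.abs(x) >= eps           # distinguish relevant values
        scores = {m: int(np.sum((np.sign(x) == p) & relevant))
                  for m, p in patterns.items()}
        new[i] = max(scores, key=scores.get)
    return new

X = np.array([[ 1.0,  1.2, -0.1],
              [ 0.9,  1.1,  0.1],
              [-1.0, -1.2,  0.2],
              [-0.9, -1.1, -0.2]])
labels = np.array([0, 0, 0, 1])   # profile 2 starts in the wrong cluster
new = recluster_by_pattern(X, labels)
```

The explicit sign patterns also give each cluster a human-readable description, mirroring feature (iii) of the proposed algorithm.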
9

Shang, Ronghua, Yang Li, and Licheng Jiao. "Co-evolution-based immune clonal algorithm for clustering." Soft Computing 20, no. 4 (February 7, 2015): 1503–19. http://dx.doi.org/10.1007/s00500-015-1602-z.

Full text
10

Liu, Yongli, Shuai Wu, Zhizhong Liu, and Hao Chao. "A fuzzy co-clustering algorithm for biomedical data." PLOS ONE 12, no. 4 (April 26, 2017): e0176536. http://dx.doi.org/10.1371/journal.pone.0176536.

Full text

Dissertations / Theses on the topic "Co-clustering algorithm"

1

Mohd, Yusoh Zeratul Izzah. "Composite SaaS resource management in cloud computing using evolutionary computation." Thesis, Queensland University of Technology, 2013. https://eprints.qut.edu.au/63280/1/Zeratul_Mohd_Yusoh_Thesis.pdf.

Full text
Abstract:
Cloud computing is an emerging computing paradigm in which IT resources are provided over the Internet as a service to users. One such service offered through the Cloud is Software as a Service, or SaaS. SaaS can be delivered in a composite form, consisting of a set of application and data components that work together to deliver higher-level functional software. SaaS is receiving substantial attention today from both software providers and users, and analyst firms predict a positive future market for it. This raises new challenges for providers managing SaaS, especially in large-scale data centres such as the Cloud. One of the challenges is providing management of Cloud resources for SaaS that guarantees SaaS performance while optimising resource use. Extensive research on the resource optimisation of Cloud services has not yet addressed the challenges of managing resources for composite SaaS. This research addresses that gap by focusing on three new problems of composite SaaS: placement, clustering and scalability. The overall aim is to develop efficient and scalable mechanisms that facilitate the delivery of high-performance composite SaaS for users while optimising the resources used. All three problems are characterised as highly constrained, large-scale and complex combinatorial optimisation problems. Therefore, evolutionary algorithms are adopted as the main technique for solving them. The first research problem concerns how a composite SaaS is placed onto Cloud servers to optimise its performance while satisfying the SaaS resource and response-time constraints. Existing research on this problem often ignores the dependencies between components and considers placement of a homogeneous type of component only. A precise formulation of the composite SaaS placement problem is presented.
A classical genetic algorithm and two versions of cooperative co-evolutionary algorithms are designed to manage the placement of heterogeneous types of SaaS components together with their dependencies, requirements and constraints. Experimental results demonstrate the efficiency and scalability of these new algorithms. In the second problem, SaaS components are assumed to be already running on Cloud virtual machines (VMs). However, due to the environment of a Cloud, the current placement may need to be modified. Existing techniques focused mostly on the infrastructure level rather than the application level. This research addresses the problem at the application level by clustering suitable components onto VMs to optimise resource use and maintain SaaS performance. Two versions of grouping genetic algorithms (GGAs) are designed to cater for the structural grouping of a composite SaaS. The first GGA uses a repair-based method and the second a penalty-based method to handle the problem constraints. The experimental results confirm that the GGAs always produce a better reconfiguration placement plan than a common heuristic for clustering problems. The third research problem deals with the replication or deletion of SaaS instances to cope with the SaaS workload. Determining a scaling plan that minimises the resources used while maintaining SaaS performance is a critical task. Additionally, the problem involves constraints and interdependency between components, making solutions even more difficult to find. A hybrid genetic algorithm (HGA) was developed to solve this problem by exploring the problem search space through its genetic operators and fitness function to determine the SaaS scaling plan. The HGA also uses the problem's domain knowledge to ensure that solutions meet the constraints and achieve the objectives.
The experimental results demonstrated that the HGA consistently outperforms a heuristic algorithm, achieving a low-cost scaling and placement plan. This research has identified three significant new problems for composite SaaS in the Cloud. Various types of evolutionary algorithms have been developed to address them, contributing to the field of evolutionary computation. The algorithms provide solutions for efficient resource management of composite SaaS in the Cloud that result in a low total cost of ownership for users while guaranteeing SaaS performance.
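The flavour of the evolutionary approach can be conveyed with a toy penalty-based genetic algorithm for component-to-server placement. Everything here (fitness, operators, parameters) is a simplified stand-in; the thesis's grouping GAs additionally handle component dependencies and offer repair as an alternative to penalties:

```python
import random

def penalty_fitness(assign, demand, capacity):
    """Cost = number of servers used + heavy penalty per unit of overload.
    A toy stand-in for penalty-based constraint handling."""
    load = {}
    for comp, srv in enumerate(assign):
        load[srv] = load.get(srv, 0) + demand[comp]
    over = sum(max(0, l - capacity) for l in load.values())
    return len(load) + 10 * over

def ga_placement(demand, n_servers, capacity, pop=30, gens=60, seed=0):
    """Minimal GA: binary tournament selection, one-point crossover,
    point mutation.  Each individual assigns components to servers."""
    rng = random.Random(seed)
    n = len(demand)
    fit = lambda s: penalty_fitness(s, demand, capacity)
    P = [[rng.randrange(n_servers) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        nxt = []
        for _ in range(pop):
            p1 = min(rng.sample(P, 2), key=fit)   # tournament of two
            p2 = min(rng.sample(P, 2), key=fit)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]           # one-point crossover
            if rng.random() < 0.3:                # point mutation
                child[rng.randrange(n)] = rng.randrange(n_servers)
            nxt.append(child)
        P = nxt
    return min(P, key=fit)

# Four components of demand 2 fit on two servers of capacity 4.
best = ga_placement(demand=[2, 2, 2, 2], n_servers=4, capacity=4)
```

The penalty weight trades off constraint satisfaction against consolidation; a repair-based variant would instead move components off overloaded servers before evaluating fitness.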
2

Schmutz, Amandine. "Contributions à l'analyse de données fonctionnelles multivariées, application à l'étude de la locomotion du cheval de sport." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1241.

Full text
Abstract:
With the growth of the market for smart devices that give athletes and their trainers systematic, objective and reliable follow-up, more and more parameters are collected for a single individual. An alternative to laboratory evaluation methods is the use of inertial sensors, which make it possible to monitor performance without hindering it, without space limits and without tedious initialization procedures. The data collected by these sensors can be viewed as multivariate functional data: quantitative entities evolving over time, collected simultaneously for the same statistical individual. The aim of this thesis is to find parameters for analysing the locomotion of athlete horses using a sensor placed in the saddle. This connected device (an inertial measurement unit, IMU) for equestrian sports collects acceleration and angular velocity over time, in the three spatial directions, at a sampling frequency of 100 Hz. The resulting database comprises 3221 canter strides, collected on straight and curved paths, from 58 show-jumping horses of varying ages and competition levels. We restricted our work to the prediction of three parameters: speed per stride, stride length and jump quality. To meet the first two objectives, we developed a multivariate functional clustering method that divides the database into more homogeneous sub-groups from the point of view of the collected signals. This method characterizes each group by its mean profile, easing understanding and interpretation. Surprisingly, however, this clustering model did not improve speed prediction: Support Vector Machines (SVMs) remained the model with the lowest percentage of errors above 0.6 m/s. The same holds for stride length, where an accuracy of 20 cm is reached with SVMs. These results may be explained by the fact that our database comprises only 58 horses, which is a very small number of individuals for clustering. We then extended the method to the co-clustering of multivariate functional curves in order to ease the mining of data collected for the same horse over time. This method could allow the detection and prevention of locomotor disorders, the main reason show-jumping horses are withdrawn. Finally, we investigated the links between jump quality and the signals collected by the IMU. Our first results show that the signals collected by the saddle alone are not sufficient to finely differentiate jump quality. Additional information will be needed, for example from complementary sensors or by expanding the database to include a more varied panel of horses and jump profiles.
3

Laclau, Charlotte. "Hard and fuzzy block clustering algorithms for high dimensional data." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB014.

Full text
Abstract:
With the increasing amount of data available, unsupervised learning has become an important tool for discovering underlying patterns without the need to label instances manually. Among the approaches proposed to tackle this problem, clustering is arguably the most popular. Clustering is usually based on the assumption that each group, also called a cluster, is distributed around a center defined in terms of all the features, whereas in some real-world applications dealing with high-dimensional data this assumption may be false. To this end, co-clustering algorithms were proposed: they describe clusters of instances by the subsets of features that are most relevant to them. The resulting latent structure of the data is composed of blocks usually called co-clusters. In the first two chapters, we describe two co-clustering methods that differentiate the relevance of features by their capacity to reveal the latent structure of the data, in a probabilistic framework and in a distance-based framework respectively. The probabilistic approach uses the mixture-model framework, where the irrelevant features are assumed to follow a probability distribution that is independent of the co-clustering structure. The distance-based (also called metric-based) approach relies on an adaptive metric in which each variable is assigned a weight defining its contribution to the resulting co-clustering. From the theoretical point of view, we prove the global convergence of the proposed algorithms using Zangwill's convergence theorem. In the last two chapters, we consider a special case of co-clustering in which, contrary to the original setting, each subset of instances is described by a unique subset of features, resulting in a diagonal structure of the initial data matrix. As in the first two contributions, we consider both probabilistic and metric-based approaches. The main idea of the proposed contributions is to impose two kinds of constraints: (1) we set the number of row clusters equal to the number of column clusters; (2) we seek a structure of the original data matrix that has maximal values on its diagonal (for instance, for binary data, we look for diagonal blocks composed of ones, with zeros outside the main diagonal). The proposed approaches enjoy the convergence guarantees derived from the results of the previous chapters. Finally, we present both hard and fuzzy versions of the proposed algorithms. We evaluate our contributions on a wide variety of synthetic and real-world benchmark binary and continuous data sets related to text-mining applications, and analyze the advantages and drawbacks of each approach. To conclude, we believe that this thesis explicitly covers a vast majority of the scenarios arising in hard and fuzzy co-clustering, and can be seen as a generalization of some popular biclustering approaches.
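The diagonal setting of the last two chapters can be illustrated with a toy alternating scheme for binary data: with equal numbers of row and column clusters, each row joins the column cluster holding most of its ones and vice versa, so mass concentrates on the diagonal blocks. This sketch omits the thesis's probabilistic and metric machinery and its convergence analysis:

```python
import numpy as np

def diagonal_coclustering(X, k, iters=20, seed=0):
    """Toy alternating diagonal co-clustering of a binary matrix.

    Row step: assign each row to the column cluster containing most of
    its ones.  Column step: symmetric.  Returns (row_labels, col_labels)
    with the same number k of clusters on both sides."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    r = rng.integers(k, size=X.shape[0])
    c = rng.integers(k, size=X.shape[1])
    for _ in range(iters):
        # mass of each row inside every column cluster
        col_mass = np.stack([X[:, c == m].sum(axis=1) for m in range(k)], axis=1)
        r = col_mass.argmax(axis=1)
        # mass of each column inside every row cluster
        row_mass = np.stack([X[r == m, :].sum(axis=0) for m in range(k)], axis=1)
        c = row_mass.argmax(axis=1)
    return r, c

X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]])
r, c = diagonal_coclustering(X, k=2)
```

Identical rows (and identical columns) are guaranteed to land in the same cluster after one pass, since they produce identical mass vectors.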
4

Ailem, Melissa. "Sparsity-sensitive diagonal co-clustering algorithms for the effective handling of text data." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB087.

Full text
Abstract:
In the current context, there is a clear need for Text Mining techniques to analyse the huge quantity of unstructured text documents available on the Internet. These textual data are often represented by sparse high dimensional matrices where rows and columns represent documents and terms respectively. Thus, it would be worthwhile to simultaneously group these terms and documents into meaningful clusters, making this substantial amount of data easier to handle and interpret. Co-clustering techniques serve precisely this purpose. Although many existing co-clustering approaches have been successful in revealing homogeneous blocks in several domains, these techniques are still challenged by the high dimensionality and sparsity characteristics exhibited by document-term matrices. Due to this sparsity, several co-clusters are primarily composed of zeros. While homogeneous, these co-clusters are irrelevant and must be filtered out in a post-processing step to keep only the most significant ones. The objective of this thesis is to propose new co-clustering algorithms tailored to take into account these sparsity-related issues. The proposed algorithms seek a block diagonal structure and allow the most useful co-clusters to be identified directly, which makes them especially effective for the text co-clustering task. Our contributions can be summarized as follows: First, we introduce and demonstrate the effectiveness of a novel co-clustering algorithm based on a direct maximization of graph modularity. While existing graph-based co-clustering algorithms rely on spectral relaxation, the proposed algorithm uses an iterative alternating optimization procedure to reveal the most meaningful co-clusters in a document-term matrix. Moreover, the proposed optimization has the advantage of avoiding the computation of eigenvectors, a task which is prohibitive when considering high dimensional data. 
This is an improvement over spectral approaches, where the eigenvectors computation is necessary to perform the co-clustering. Second, we use an even more powerful approach to discover block diagonal structures in document-term matrices. We rely on mixture models, which offer strong theoretical foundations and considerable flexibility that makes it possible to uncover various specific cluster structures. More precisely, we propose a rigorous probabilistic model based on the Poisson distribution and the well-known Latent Block Model. Interestingly, this model includes the sparsity in its formulation, which makes it particularly effective for text data. Estimating this model's parameters under the Maximum Likelihood (ML) and the Classification Maximum Likelihood (CML) approaches yields four co-clustering algorithms: a hard, a soft, and a stochastic variant, plus a fourth algorithm that leverages the benefits of both the soft and stochastic variants simultaneously. As a last contribution of this thesis, we propose a new biomedical text mining framework that includes some of the above mentioned co-clustering algorithms. This work shows the contribution of co-clustering to a real biomedical text mining problem. The proposed framework is able to propose new clues about the results of genome-wide association studies (GWAS) by mining PUBMED abstracts. This framework has been tested on asthma and allowed us to assess the strength of associations between asthma genes reported in previous GWAS as well as discover new candidate genes likely associated with asthma. In a nutshell, while several text co-clustering algorithms already exist, their performance can be substantially increased if more appropriate models and algorithms are available. According to the extensive experiments done on several challenging real-world text data sets, we believe that this thesis has served this objective well.
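The alternating modularity-maximization scheme described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the function name, round-robin initialization, and fixed iteration count are assumptions:

```python
import numpy as np

def coclus_modularity(A, k, n_iter=20):
    """Block-diagonal co-clustering by direct modularity maximization.

    Rows (documents) and columns (terms) are alternately reassigned to
    the diagonal block that most increases graph modularity; no
    eigenvector computation is required.
    """
    n, m = A.shape
    N = A.sum()
    # Modularity matrix: B_ij = A_ij - (row_sum_i * col_sum_j) / N
    B = A - np.outer(A.sum(axis=1), A.sum(axis=0)) / N
    w = np.arange(m) % k                  # round-robin column init (assumption)
    z = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        W = np.eye(k)[w]                  # m x k column-cluster indicator
        z = np.argmax(B @ W, axis=1)      # best diagonal block for each row
        Z = np.eye(k)[z]                  # n x k row-cluster indicator
        w = np.argmax(B.T @ Z, axis=1)    # best diagonal block for each column
    return z, w
```

Each pass costs only matrix-vector style products over the (sparse) data, which is the advantage over computing eigenvectors; in practice one would add random restarts, since the alternating scheme only reaches a local optimum.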
APA, Harvard, Vancouver, ISO, and other styles
5

Bozdag, Doruk. "Graph Coloring and Clustering Algorithms for Science and Engineering Applications." The Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1229459765.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Anand, K. "Methods for Blind Separation of Co-Channel BPSK Signals Arriving at an Antenna Array and Their Performance Analysis." Thesis, Indian Institute of Science, 1995. http://hdl.handle.net/2005/123.

Full text
Abstract:
Capacity improvement of Wireless Communication Systems is a very important area of current research. The goal is to increase the number of users supported by the system per unit bandwidth allotted. One important way of achieving this improvement is to use multiple antennas backed by intelligent signal processing. In this thesis, we present methods for blind separation of co-channel BPSK signals arriving at an antenna array. These methods consist of two parts, Constellation Estimation and Assignment. We give two methods for constellation estimation, the Smallest Distance Clustering and the Maximum Likelihood Estimation. While the latter is theoretically sound, the former is computationally simple and intuitively appealing. We show that the Maximum Likelihood Constellation Estimation is well approximated by the Smallest Distance Clustering Algorithm at high SNR. The Assignment Algorithm exploits the structure of the BPSK signals. We observe that both the methods for estimating the constellation vectors perform very well at high SNR and nearly attain the Cramér-Rao bounds. Using this fact and noting that the Assignment Algorithm causes negligible error at high SNR, we derive an upper bound on the probability of bit error for the above methods at high SNR. This upper bound falls very rapidly with increasing SNR, showing that our constellation estimation-assignment approach is very efficient. Simulation results are given to demonstrate the usefulness of the bounds.
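The smallest-distance constellation estimation idea can be illustrated with a short sketch: with d co-channel BPSK signals the noiseless output takes 2**d values, so clustering the received samples around their nearest centroid recovers the constellation at high SNR. The function name, initialization, and scalar (single-sensor) model are assumptions for illustration, not the thesis' exact algorithm:

```python
import numpy as np

def estimate_bpsk_constellation(y, n_signals=2, n_iter=50):
    """Estimate the constellation of co-channel BPSK signals.

    Repeatedly assigns each received sample to its smallest-distance
    centroid and re-averages; at high SNR the centroids converge to the
    2**n_signals constellation points.
    """
    y = np.asarray(y, dtype=complex)
    k = 2 ** n_signals
    # spread the initial centroids across the samples, sorted by real part
    idx = np.linspace(0, len(y) - 1, k).astype(int)
    centroids = np.sort_complex(y)[idx]
    for _ in range(n_iter):
        dist = np.abs(y[:, None] - centroids[None, :])
        labels = dist.argmin(axis=1)          # smallest-distance assignment
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = y[labels == c].mean()
    return np.sort_complex(centroids)
```

The subsequent assignment step, which maps each estimated constellation point back to a ±1 bit combination, is where the BPSK structure described in the abstract is exploited.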
APA, Harvard, Vancouver, ISO, and other styles
7

Médoc, Nicolas. "A visual analytics approach for multi-resolution and multi-model analysis of text corpora : application to investigative journalism." Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCB042/document.

Full text
Abstract:
As the production of digital texts grows exponentially, a greater need to analyze text corpora arises in various domains of application, insofar as they constitute inexhaustible sources of shared information and knowledge. We therefore propose in this thesis a novel visual analytics approach for the analysis of text corpora, implemented for the real and concrete needs of investigative journalism. Motivated by the problems and tasks identified with a professional investigative journalist, visualizations and interactions are designed through a user-centered methodology involving the user during the whole development process. Specifically, investigative journalists formulate hypotheses and explore exhaustively the field under investigation in order to multiply sources showing pieces of evidence related to their working hypothesis. Carrying out such tasks in a large corpus is however a daunting endeavor and requires visual analytics software addressing several challenging research issues covered in this thesis. First, the difficulty to make sense of a large text corpus lies in its unstructured nature. We resort to the Vector Space Model (VSM) and its strong relationship with the distributional hypothesis, leveraged by multiple text mining algorithms, to discover the latent semantic structure of the corpus. Topic models and biclustering methods are recognized to be well suited to the extraction of coarse-grained topics, i.e. groups of documents concerning similar topics, each one represented by a set of terms extracted from textual contents. We provide a new Weighted Topic Map visualization that conveys a broad overview of coarse-grained topics by allowing quick interpretation of contents through multiple tag clouds while depicting properties of the topical structure such as the relative importance of topics and their semantic similarity. 
Although the exploration of the coarse-grained topics helps locate topics of interest and their neighborhood, the identification of specific facts, viewpoints or angles related to events or stories requires a finer level of structure to represent topic variants. This nested structure, revealed by Bimax, a pattern-based overlapping biclustering algorithm, captures in biclusters the co-occurrences of terms shared by multiple documents and can disclose facts, viewpoints or angles related to events or stories. This thesis tackles issues related to the visualization of a large amount of overlapping biclusters by organizing term-document biclusters in a hierarchy that limits term redundancy and conveys their commonality and specificities. We evaluated the utility of our software through a usage scenario and a qualitative evaluation with an investigative journalist. In addition, the co-occurrence patterns of topic variants revealed by Bimax are determined by the enclosing topical structure supplied by the coarse-grained topic extraction method which is run beforehand. Nonetheless, little guidance is found regarding the choice of the latter method and its impact on the exploration and comprehension of topics and topic variants. Therefore we conducted both a numerical experiment and a controlled user experiment to compare two topic extraction methods, namely Coclus, a disjoint biclustering method, and hierarchical Latent Dirichlet Allocation (hLDA), an overlapping probabilistic topic model. The theoretical foundation of both methods is systematically analyzed by relating them to the distributional hypothesis. The numerical experiment provides statistical evidence of the difference between the resulting topical structure of both methods. The controlled experiment shows their impact on the comprehension of topics and topic variants, from the analyst's perspective. (...)
APA, Harvard, Vancouver, ISO, and other styles
8

Kyrgyzov, Ivan. "Recherche dans les bases de donnees satellitaires des paysages et application au milieu urbain: clustering, consensus et categorisation." Phd thesis, Télécom ParisTech, 2008. http://pastel.archives-ouvertes.fr/pastel-00004084.

Full text
Abstract:
Satellite images have found wide application in the analysis of natural resources and human activities. High-resolution images, e.g., SPOT5, are very numerous, which makes it highly worthwhile to develop new theoretical aspects and tools for image mining. The objective of the thesis is unsupervised image mining and it includes three main parts. In the first part we characterize the content of high-resolution images. We describe image zones by textural and geometrical features. Clustering algorithms are presented in the second part. A study of validity criteria and information measures is given to estimate clustering quality. A new criterion based on Minimum Description Length (MDL) is proposed to estimate the optimal number of clusters. Moreover, we propose a new hierarchical algorithm based on a kernel MDL criterion. A new "clustering combination" method is presented in the thesis to take advantage of different clustering algorithms. We develop a hierarchical algorithm to optimize an objective function based on a co-association matrix. A second method is proposed that converges to a global solution. We prove that the global minimum can be found using a "mean shift"-type algorithm. The advantages of this method are fast convergence and linear complexity. In the third part of the thesis a complete image mining protocol is proposed. Different clusterings are represented via the semantic relations between concepts.
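The co-association matrix at the heart of the clustering-combination step admits a very small sketch: count, for every pair of items, how often the input partitions agree, then link strongly co-associated pairs. This is a minimal evidence-accumulation illustration (names and the threshold-linking rule are assumptions), not the thesis' hierarchical or mean-shift optimizers:

```python
import numpy as np

def coassociation_consensus(partitions, threshold=0.5):
    """Combine several clusterings via their co-association matrix.

    C[i, j] is the fraction of input partitions placing items i and j in
    the same cluster; items are then grouped by transitively linking
    every pair with C[i, j] > threshold.
    """
    P = np.asarray(partitions)            # shape: (n_partitions, n_items)
    n = P.shape[1]
    C = np.mean(P[:, :, None] == P[:, None, :], axis=0)   # co-association
    # transitive closure of the thresholded similarity (tiny union-find)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if C[i, j] > threshold:
                parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return C, labels
```

The consensus labels are independent of how each input partition numbers its clusters, which is what makes the co-association matrix a convenient common ground for combining heterogeneous algorithms.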
APA, Harvard, Vancouver, ISO, and other styles
9

Guo, Pei Fang. "PalmPrints : a cooperative co-evolutionary clustering algorithm for hand-based biometric identification." Thesis, 2003. http://spectrum.library.concordia.ca/2283/1/MQ83865.pdf.

Full text
Abstract:
The thesis first introduces a new adaptive technique for finger upright reorientation using the Principle of Coordinate System Rotation. The empirical results demonstrate that reorienting the images of the fingers of a hand prior to any feature extraction consistently leads to more stable feature values, regardless of the features measured. Hand shape analysis, including Central Moments, Fourier Descriptors, and Zernike Moments, is characterized based on 1-D contour transformation. The main contribution of the thesis is the first use of a genetic algorithm to simultaneously achieve dimensionality reduction and object (hand image) clustering. A novel Cooperative Coevolutionary Clustering Algorithm (COCA) with dynamic clustering and feature selection has been developed to search for a proper number of clusters (without prior knowledge of it) and group hand images into these clusters based on a smaller set of new features. In addition to the main contribution of the study, an MSE Extended Fitness Function is presented which is particularly suited to an integrated dynamic clustering space. The proposed design and experimental implementation show that the dimensionality of the clustering space can be cut in half, and the GA evolves an average of 4 clusters with a very low standard deviation of 0.4714. The average number of misplaced hand images is 5.8 out of 100. These results open a new way towards other cooperative co-evolutionary applications, in which 3 or more populations are used to co-evolve solutions and designs consisting of 3 or more loosely coupled subsolutions or modules.
APA, Harvard, Vancouver, ISO, and other styles
10

Cho, Hyuk. "Co-clustering algorithms : extensions and applications." 2008. http://hdl.handle.net/2152/17809.

Full text
Abstract:
Co-clustering is a rather recent paradigm for unsupervised data analysis, but it has become increasingly popular because of its potential to discover latent local patterns that remain unapparent to usual unsupervised algorithms such as k-means. Wide deployment of co-clustering, however, requires addressing a number of practical challenges such as data transformation, cluster initialization, scalability, and so on. Therefore, this thesis focuses on developing sophisticated co-clustering methodologies to maturity and its ultimate goal is to promote co-clustering as an invaluable and indispensable unsupervised analysis tool for varied practical applications. To achieve this goal, we explore three specific tasks: (1) development of co-clustering algorithms to be functional, adaptable, and scalable (co-clustering algorithms); (2) extension of co-clustering algorithms to incorporate application-specific requirements (extensions); and (3) application of co-clustering algorithms broadly to existing and emerging problems in practical application domains (applications). As for co-clustering algorithms, we develop two fast Minimum Sum-Squared Residue Co-clustering (MSSRCC) algorithms [CDGS04], which simultaneously cluster data points and features via an alternating minimization scheme and generate co-clusters in a “checkerboard” structure. The first captures co-clusters with constant values, while the other discovers co-clusters with coherent “trends” as well as constant values. We note that the proposed algorithms are two special cases (bases 2 and 6 with Euclidean distance, respectively) of the general co-clustering framework, Bregman Co-clustering (BCC) [BDG+07], which contains six Euclidean BCC and six I-divergence BCC algorithms. 
Then, we substantially enhance the performance of the two MSSRCC algorithms by escaping from poor local minima and resolving the degeneracy problem of generating empty clusters in partitional clustering algorithms through the three specific strategies: (1) data transformation; (2) deterministic spectral initialization; and (3) local search strategy. Concerning co-clustering extensions, we investigate general algorithmic strategies for the general BCC framework, since it is applicable to a large class of distance measures and data types. We first formalize various data transformations for datasets with varied scaling and shifting factors, mathematically justify their effects on the six Euclidean BCC algorithms, and empirically validate the analysis results. We also adapt the local search strategy, initially developed for the two MSSRCC algorithms, to all the twelve BCC algorithms. Moreover, we consider variations of cluster assignments and cluster updates, including greedy vs. non-greedy cluster assignment, online vs. batch cluster update, and so on. Furthermore, in order to provide better scalability and usability, we parallelize all the twelve BCC algorithms, which are capable of co-clustering large-scaled datasets over multiple processors. Regarding co-clustering applications, we extend the functionality of BCC to incorporate application-specific requirements: (1) discovery of inverted patterns, whose goal is to find anti-correlation; (2) discovery of coherent co-clusters from noisy data, whose purpose is to do dimensional reduction and feature selection; and (3) discovery of patterns from time-series data, whose motive is to guarantee critical time-locality. Furthermore, we employ co-clustering to pervasive computing for mobile devices, where the task is to extract latent patterns from usage logs as well as to recognize specific situations of mobile-device users. 
Finally, we demonstrate the applicability of our proposed algorithms for aforementioned applications through empirical results on various synthetic and real-world datasets. In summary, we present co-clustering algorithms to discover latent local patterns, propose their algorithmic extensions to incorporate specific requirements, and provide their applications to a wide range of practical domains.
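The alternating minimization behind the constant-block MSSRCC variant can be sketched compactly: reassign rows, then columns, to the co-cluster whose block mean gives the smallest squared residue. A simplified illustration of the base scheme only (the function name and contiguous initialization are assumptions; the thesis adds spectral initialization and local search on top):

```python
import numpy as np

def mssrcc(A, k, l, n_iter=20):
    """Minimum Sum-Squared Residue Co-clustering, constant-block variant.

    Rows and columns are alternately reassigned to minimize the squared
    residue of each entry against its co-cluster mean, producing a
    "checkerboard" of k x l blocks.
    """
    n, m = A.shape
    r = (np.arange(n) * k) // n           # contiguous-block init (assumption)
    c = (np.arange(m) * l) // m
    for _ in range(n_iter):
        R, C = np.eye(k)[r], np.eye(l)[c]
        counts = np.maximum(R.sum(0)[:, None] * C.sum(0)[None, :], 1)
        M = (R.T @ A @ C) / counts        # co-cluster means, shape (k, l)
        # reassign each row to the row cluster with the smallest residue
        r = np.argmin(((A[:, None, :] - M[None, :, c]) ** 2).sum(-1), axis=1)
        R = np.eye(k)[r]
        counts = np.maximum(R.sum(0)[:, None] * C.sum(0)[None, :], 1)
        M = (R.T @ A @ C) / counts
        # reassign each column likewise
        c = np.argmin(((A.T[:, None, :] - M.T[None, :, r]) ** 2).sum(-1), axis=1)
    return r, c
```

Because each step can only lower the total squared residue, the scheme converges to a local minimum; the data transformation, spectral initialization, and local search strategies discussed in the abstract exist precisely to escape the poor local minima this bare version can fall into.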
APA, Harvard, Vancouver, ISO, and other styles

Books on the topic "Co-clustering algorithm"

1

Govaert, Gerard, and Mohamed Nadif. Co-Clustering. Wiley & Sons, Incorporated, John, 2013.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Nadif, Mohamed, and Gérard Govaert. Co-Clustering: Models, Algorithms and Applications. Wiley & Sons, Incorporated, John, 2014.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Nadif, Mohamed, and Gérard Govaert. Co-Clustering: Models, Algorithms and Applications. Wiley-Interscience, 2013.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

Nadif, Mohamed, and Gérard Govaert. Co-Clustering: Models, Algorithms and Applications. Wiley & Sons, Incorporated, John, 2013.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Nadif, Mohamed, and Gérard Govaert. Co-Clustering: Models, Algorithms and Applications. Wiley & Sons, Incorporated, John, 2013.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Co-clustering algorithm"

1

Laclau, Charlotte, and Mohamed Nadif. "Diagonal Co-clustering Algorithm for Document-Word Partitioning." In Advances in Intelligent Data Analysis XIV, 170–80. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-24465-5_15.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Hemalatha, Chunduru, and T. V. Sarath. "Analysis of Clustering Algorithm in VANET Through Co-Simulation." In Sustainable Communication Networks and Application, 441–50. Singapore: Springer Singapore, 2022. http://dx.doi.org/10.1007/978-981-16-6605-6_32.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Kharma, Nawwaf, Ching Y. Suen, and Pei F. Guo. "PalmPrints: A Novel Co-evolutionary Algorithm for Clustering Finger Images." In Genetic and Evolutionary Computation — GECCO 2003, 322–31. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003. http://dx.doi.org/10.1007/3-540-45105-6_38.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Wu, Meng-Lun, and Chia-Hui Chang. "Parallel Co-clustering with Augmented Matrices Algorithm with Map-Reduce." In Data Warehousing and Knowledge Discovery, 183–94. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-10160-6_17.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Zhou, Xiaowei, Fumin Ma, and Mengtao Zhang. "Clustering Ensemble Algorithm Based on an Improved Co-association Matrix." In Intelligent Equipment, Robots, and Vehicles, 805–15. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-7213-2_78.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Honda, Katsuhiro, Arina Kawano, and Akira Notsu. "A Greedy Fuzzy k-Member Co-clustering Algorithm and Collaborative Filtering Applicability." In Knowledge-Based Information Systems in Practice, 39–50. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-13545-8_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Tjhi, William-Chandra, and Lihui Chen. "A New Fuzzy Co-clustering Algorithm for Categorization of Datasets with Overlapping Clusters." In Advanced Data Mining and Applications, 328–39. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11811305_36.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Kwon, Bongjune, and Hyuk Cho. "Scalable Co-clustering Algorithms." In Algorithms and Architectures for Parallel Processing, 32–43. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-13119-6_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Bulteau, Laurent, Vincent Froese, Sepp Hartung, and Rolf Niedermeier. "Co-Clustering Under the Maximum Norm." In Algorithms and Computation, 298–309. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-13075-0_24.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Honda, Katsuhiro. "Fuzzy Clustering/Co-clustering and Probabilistic Mixture Models-Induced Algorithms." In Fuzzy Sets, Rough Sets, Multisets and Clustering, 29–43. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-47557-8_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Co-clustering algorithm"

1

Tjhi, William-Chandra, and Lihui Chen. "Robust fuzzy Co-clustering algorithm." In 2007 6th International Conference on Information, Communications & Signal Processing. IEEE, 2007. http://dx.doi.org/10.1109/icics.2007.4449868.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Van Nha Pham and Long Thanh Ngo. "Interval type-2 fuzzy co-clustering algorithm." In 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE, 2015. http://dx.doi.org/10.1109/fuzz-ieee.2015.7337960.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Rogovschi, Nicoleta, Lazhar Labiod, and Mohamed Nadif. "A spectral algorithm for topographical Co-clustering." In 2012 International Joint Conference on Neural Networks (IJCNN 2012 - Brisbane). IEEE, 2012. http://dx.doi.org/10.1109/ijcnn.2012.6252398.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Narang, Ankur, Abhinav Srivastava, and Naga Praveen Kumar Katta. "Distributed hierarchical co-clustering and collaborative filtering algorithm." In 2012 19th International Conference on High Performance Computing (HiPC). IEEE, 2012. http://dx.doi.org/10.1109/hipc.2012.6507497.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Hoseini, Elham, Sattar Hashemi, and Ali Hamzeh. "A levelwise spectral co-clustering algorithm for collaborative filtering." In the 6th International Conference. New York, New York, USA: ACM Press, 2012. http://dx.doi.org/10.1145/2184751.2184759.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Binh, Le Thi Cam, Pham Van Nha, and Pham The Long. "Fuzzy Co-clustering Algorithm for Multi-source Data Mining." In 19th World Congress of the International Fuzzy Systems Association (IFSA), 12th Conference of the European Society for Fuzzy Logic and Technology (EUSFLAT), and 11th International Summer School on Aggregation Operators (AGOP). Paris, France: Atlantis Press, 2021. http://dx.doi.org/10.2991/asum.k.210827.016.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Yao, Shixin, Guoxian Yu, Jun Wang, Carlotta Domeniconi, and Xiangliang Zhang. "Multi-View Multiple Clustering." In Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}. California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/572.

Full text
Abstract:
Multiple clustering aims at exploring alternative clusterings to organize the data into meaningful groups from different perspectives. Existing multiple clustering algorithms are designed for single-view data. We assume that the individuality and commonality of multi-view data can be leveraged to generate high-quality and diverse clusterings. To this end, we propose a novel multi-view multiple clustering (MVMC) algorithm. MVMC first adapts multi-view self-representation learning to explore the individuality encoding matrices and the shared commonality matrix of multi-view data. It additionally reduces the redundancy (i.e., enhances the individuality) among the matrices using the Hilbert-Schmidt Independence Criterion (HSIC), and collects shared information by forcing the shared matrix to be smooth across all views. It then uses matrix factorization on the individual matrices, along with the shared matrix, to generate diverse clusterings of high quality. We further extend multiple co-clustering to multi-view data and propose a solution called multi-view multiple co-clustering (MVMCC). Our empirical study shows that MVMC (MVMCC) can exploit multi-view data to generate multiple high-quality and diverse clusterings (co-clusterings), with superior performance to the state-of-the-art methods.
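The HSIC term used in this abstract to reduce redundancy between views has a compact empirical form. A minimal sketch of the biased estimator on two kernel matrices (not MVMC itself):

```python
import numpy as np

def hsic(K, L):
    """Biased empirical HSIC between two n x n kernel matrices K and L.

    HSIC is zero when the underlying variables are independent and grows
    with their dependence; MVMC-style methods penalize it so that the
    per-view individuality matrices stay diverse.
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

With linear kernels, penalizing this trace amounts to decorrelating the centered representations of the two views, which is why it serves as a redundancy (individuality-enhancing) regularizer.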
APA, Harvard, Vancouver, ISO, and other styles
8

Pham, Van Nha, Long Thanh Ngo, and Thao Duc Nguyen. "Feature-reduction fuzzy co-clustering algorithm for hyperspectral image segmentation." In 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE, 2017. http://dx.doi.org/10.1109/fuzz-ieee.2017.8015643.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Ramanathan, Venkatram, Wenjing Ma, Vignesh T. Ravi, Tantan Liu, and Gagan Agrawal. "Parallelizing an Information Theoretic Co-clustering Algorithm Using a Cloud Middleware." In 2010 IEEE International Conference on Data Mining Workshops (ICDMW). IEEE, 2010. http://dx.doi.org/10.1109/icdmw.2010.100.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Lu, Wei, and Ling Xue. "A Heuristic-Based Co-clustering Algorithm for the Internet Traffic Classification." In 2014 28th International Conference on Advanced Information Networking and Applications Workshops (WAINA). IEEE, 2014. http://dx.doi.org/10.1109/waina.2014.16.

Full text
APA, Harvard, Vancouver, ISO, and other styles