Journal articles: "Variables clustering"

1

Perricone, Chiara. "Clustering macroeconomic variables." Structural Change and Economic Dynamics 44 (March 2018): 23–33. http://dx.doi.org/10.1016/j.strueco.2018.02.001.

2

Hathaway, Richard J. "Clustering Random Variables." IETE Journal of Research 44, no. 4-5 (July 1998): 199–205. http://dx.doi.org/10.1080/03772063.1998.11416046.

3

Chen, Mingkun, and Evelyne Vigneau. "Supervised clustering of variables." Advances in Data Analysis and Classification 10, no. 1 (15 November 2014): 85–101. http://dx.doi.org/10.1007/s11634-014-0191-5.

4

Zhang, Hongmei, Yubo Zou, Will Terry, Wilfried Karmaus, and Hasan Arshad. "Joint Clustering With Correlated Variables." American Statistician 73, no. 3 (9 July 2018): 296–306. http://dx.doi.org/10.1080/00031305.2018.1424033.

5

Rubiano Moreno, Jesica, Carlos Alonso Malaver, Samuel Nucamendi Guillén, and Carlos López Hernández. "A clustering algorithm for ipsative variables." DYNA 86, no. 211 (1 October 2019): 94–101. http://dx.doi.org/10.15446/dyna.v86n211.77835.

Abstract:

The aim of this study is to introduce a new clustering method for ipsative variables. This method can be used for nominal or ordinal variables for which responses must be mutually exclusive, and it is independent of the data distribution. The proposed method is applied to outline motivational profiles for individuals based on a set of declared preferences. A case study is used to analyze the performance of the proposed algorithm by comparing its results with those of the PAM method. Results show that the proposed method generates a better segmentation and more differentiated groups. An extensive study was conducted to validate the performance of the clustering method against a set of random groups using clustering measures.

6

Forina, M., C. Armanino, and V. Raggio. "Clustering with dendrograms on interpretation variables." Analytica Chimica Acta 454, no. 1 (March 2002): 13–19. http://dx.doi.org/10.1016/s0003-2670(01)01517-3.

7

Saracco, J., and M. Chavent. "Clustering of Variables for Mixed Data." EAS Publications Series 77 (2016): 121–69. http://dx.doi.org/10.1051/eas/1677007.

8

Huh, Myung-Hoe, and Yong B. Lim. "Weighting variables in K-means clustering." Journal of Applied Statistics 36, no. 1 (31 October 2008): 67–78. http://dx.doi.org/10.1080/02664760802382533.

9

Vigneau, E., and E. M. Qannari. "Clustering of Variables Around Latent Components." Communications in Statistics - Simulation and Computation 32, no. 4 (11 January 2003): 1131–50. http://dx.doi.org/10.1081/sac-120023882.

10

Ghizlane, Ez-Zarrad, Sabbar Wafae, and Bekkhoucha Abdelkrim. "Features Clustering Around Latent Variables for High Dimensional Data." E3S Web of Conferences 297 (2021): 01070. http://dx.doi.org/10.1051/e3sconf/202129701070.

Abstract:

Clustering of variables is the task of grouping similar variables into different groups. It may be useful in several situations, such as dimensionality reduction, feature selection, and redundancy detection. In the present study, we combine two feature-clustering methods: the clustering of variables around latent variables (CLV) algorithm and the k-means based co-clustering algorithm (kCC). Indeed, classical CLV cannot be applied to high-dimensional data because this approach becomes tedious when the number of features increases.

11

Keller, Annette, and Frank Klawonn. "Fuzzy Clustering with Weighting of Data Variables." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 08, no. 06 (December 2000): 735–46. http://dx.doi.org/10.1142/s0218488500000538.

Abstract:

We introduce an objective function-based fuzzy clustering technique that assigns one influence parameter to each single data variable for each cluster. Our method is not only suited to detecting structures or groups of data that are not uniformly distributed over the structure's single domains, but also gives information about the influence of individual variables on the detected groups. In addition, our approach can be seen as a generalization of the well-known fuzzy c-means clustering algorithm.

12

Yang, Miin-Shen, Pei-Yuan Hwang, and De-Hua Chen. "Fuzzy clustering algorithms for mixed feature variables." Fuzzy Sets and Systems 141, no. 2 (January 2004): 301–17. http://dx.doi.org/10.1016/s0165-0114(03)00072-1.

13

Vichi, Maurizio, Donatella Vicari, and Henk A. L. Kiers. "Clustering and dimension reduction for mixed variables." Behaviormetrika 46, no. 2 (11 March 2019): 243–69. http://dx.doi.org/10.1007/s41237-018-0068-6.

14

Vigneau, E., K. Sahmer, E. M. Qannari, and D. Bertrand. "Clustering of variables to analyze spectral data." Journal of Chemometrics 19, no. 3 (2005): 122–28. http://dx.doi.org/10.1002/cem.909.

15

Raghuveer, Boddupally. "Clustering Methods of Acculturation." Journal of Clinical and Medical Case Reports and Reviews 2, no. 4 (24 October 2022): 1–2. http://dx.doi.org/10.59468/2837-469x/022.

Abstract:

The purpose of our study was to determine if acculturation variables from different acculturation domains form empirically extracted acculturation clusters. The findings of the present study lend additional support to the use of clustering methods as a way of including multiple domains of acculturation, thereby gaining a more comprehensive understanding of acculturation and its connection with psychosocial adjustment. The results also reinforce prior research findings that integration, or biculturalism, is an adaptive acculturation strategy.

16

Vigneau, Evelyne, Mingkun Chen, and El Mostafa Qannari. "ClustVarLV: An R Package for the Clustering of Variables Around Latent Variables." R Journal 7, no. 2 (2015): 134. http://dx.doi.org/10.32614/rj-2015-026.

17

Karimi, Sadegh, and Bahram Hemmateenejad. "Identification of discriminatory variables in proteomics data analysis by clustering of variables." Analytica Chimica Acta 767 (March 2013): 35–43. http://dx.doi.org/10.1016/j.aca.2012.12.050.

18

Singh, Nripendra. "Clustering Consumers on the Basis of Attitudinal Variables." Asia Pacific Business Review 5, no. 4 (October 2009): 146–55. http://dx.doi.org/10.1177/097324700900500413.

19

Bühlmann, Peter, Philipp Rütimann, Sara van de Geer, and Cun-Hui Zhang. "Correlated variables in regression: Clustering and sparse estimation." Journal of Statistical Planning and Inference 143, no. 11 (November 2013): 1835–58. http://dx.doi.org/10.1016/j.jspi.2013.05.019.

20

Pulido-Valdeolivas, I., D. Gómez-Andrés, J. A. Martin, J. López, E. Gómez-Barrena, and E. Rausell. "P6.14 Hierarchical clustering of Gillette Gait Index variables." Clinical Neurophysiology 122 (June 2011): S87. http://dx.doi.org/10.1016/s1388-2457(11)60303-9.

21

Csörgő, Sándor, and Wei Biao Wu. "On the clustering of independent uniform random variables." Random Structures & Algorithms 25, no. 4 (28 June 2004): 396–420. http://dx.doi.org/10.1002/rsa.20030.

22

Šulc, Zdeněk, and Hana Řezanková. "Evaluation of selected approaches to clustering categorical variables." Statistics in Transition new series 15, no. 4 (1 December 2014): 591–610. http://dx.doi.org/10.59170/stattrans-2014-039.

Abstract:

This paper focuses on recently proposed similarity measures and their performance in categorical variable clustering. It compares clustering results using three recently developed similarity measures (IOF, OF and Lin measures) with results obtained using two association measures for nominal variables (Cramér’s V and the uncertainty coefficient) and with the simple matching coefficient (the overlap measure). To eliminate the influence of a particular linkage method on the structure of final clusters, three linkage methods are examined (complete, single, average). The created groups (clusters) of variables can be considered as the basis for dimensionality reduction, e.g. by choosing one of the variables from a given group as a representative for the whole group. The quality of resulting clusters is evaluated by the within-cluster variability, expressed by the WCM coefficient, and by dendrogram analysis. The examined similarity measures are compared and evaluated using two real data sets from a social survey.
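
The idea in the abstract above (an association measure between nominal variables, followed by hierarchical linkage of the variables) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the dissimilarity `1 - Cramér's V`, the average-linkage loop, the function names `cramers_v` and `cluster_variables`, and the toy `survey` data are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations
import math


def cramers_v(x, y):
    """Cramér's V between two equally long sequences of category labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    row, col = Counter(x), Counter(y)
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


def cluster_variables(data, n_clusters):
    """Average-linkage agglomerative clustering of variables.

    data maps a variable name to its list of category labels.
    The dissimilarity between two variables is 1 - Cramér's V (an assumption).
    """
    names = list(data)
    dist = {
        frozenset(pair): 1.0 - cramers_v(data[pair[0]], data[pair[1]])
        for pair in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def avg_link(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy survey: "coughs" mirrors "smoker", "dialect" mostly follows "region".
survey = {
    "smoker": ["y", "y", "n", "n", "y", "y", "n", "n"],
    "coughs": ["y", "y", "n", "n", "y", "y", "n", "n"],
    "region": ["a", "b", "a", "b", "a", "b", "a", "b"],
    "dialect": ["a", "b", "a", "b", "a", "b", "b", "a"],
}
groups = cluster_variables(survey, n_clusters=2)
print(groups)
```

One variable per resulting group could then serve as a representative, which is the dimensionality-reduction use the abstract mentions.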

23

Yan, Jingdong, and Wuwei Liu. "An Ensemble Clustering Approach (Consensus Clustering) for High-Dimensional Data." Security and Communication Networks 2022 (16 May 2022): 1–9. http://dx.doi.org/10.1155/2022/5629710.

Abstract:

Due to the plurality of irrelevant attributes, sparse distribution, and complicated calculations in high-dimensional data, traditional clustering algorithms, such as K-means, do not perform well on high-dimensional data. To address the clustering problem of high-dimensional data, this paper studies an integrated clustering method for high-dimensional data. A method of subspace division based on minimum redundancy is proposed to solve the problem of subspace division of high-dimensional data; subspace division is improved by using the K-means algorithm. Additionally, this method uses mutual information between the characteristic variables of the data to replace the calculation in the K-means algorithm. The distance between the characteristic variables of the data is used to divide the data into subspaces according to the mutual information values between the characteristic variables of the data. To achieve high clustering accuracy and diversity based on clustering requirements, this paper uses a genetic algorithm as the consistency integration function. The fitness function is designed according to the clustering fusion target, and the selection operator is designed according to the maximum number of overlapping elements in the base clustering. The experimental results show that the clustering algorithm proposed in this paper outperforms other methods on most datasets and is an effective clustering integration algorithm. The proposed clustering algorithm is compared with other commonly used clustering fusion algorithms on datasets to prove the advantages of the proposed algorithm.

24

Vandewalle, Vincent. "Multi-Partitions Subspace Clustering." Mathematics 8, no. 4 (15 April 2020): 597. http://dx.doi.org/10.3390/math8040597.

Abstract:

In model-based clustering, it is often supposed that only one clustering latent variable explains the heterogeneity of the whole dataset. However, in many cases several latent variables could explain the heterogeneity of the data at hand. Finding such class variables could result in a richer interpretation of the data. In the continuous data setting, a multi-partition model-based clustering is proposed. It assumes the existence of several latent clustering variables, each one explaining the heterogeneity of the data with respect to some clustering subspace. It allows the multi-partitions and the related subspaces to be found simultaneously. Parameters of the model are estimated through an EM algorithm relying on a probabilistic reinterpretation of the factorial discriminant analysis. A model choice strategy relying on the BIC criterion is proposed to select the number of subspaces and the number of clusters per subspace. The obtained results are thus several projections of the data, each one conveying its own clustering of the data. The model's behavior is illustrated on simulated and real data.

25

Legland, D., and J. Beaugrand. "Automated clustering of lignocellulosic fibres based on morphometric features and using clustering of variables." Industrial Crops and Products 45 (February 2013): 253–61. http://dx.doi.org/10.1016/j.indcrop.2012.12.021.

26

Chang, Xiaopeng, Minghua Zhang, Xiang Zhang, and Sheng Zhang. "Two-Step Clustering for Mineral Prospectivity Mapping: A Case Study from the Northeastern Edge of the Jiaolai Basin, China." Minerals 14, no. 11 (28 October 2024): 1089. http://dx.doi.org/10.3390/min14111089.

Abstract:

The advancement of geological big data has rendered data-driven methodologies increasingly vital in Mineral Prospectivity Mapping. The effective integration of quantitative and qualitative data, including experiential and knowledge-based insights, is crucial in geological data fusion. Specifically, the conversion of raw data into samples and the selection of predictive methods are two core issues that constitute the focus of this study. Traditional clustering methods require the user to specify the number of clusters in advance. The two-step clustering can automatically determine the clustering result ‘k’ while analyzing both continuous and categorical variables, by building a Cluster Feature (CF) and using information criteria to merge nodes. In this study, we conducted an analysis utilizing stream sediment element data, residual gravity anomalies, and fault distribution through the two-step clustering method. Factor analysis (FA) was employed to reduce 16 elemental variables from stream sediments into five uncorrelated continuous variables; additionally, residual gravity anomalies were transformed from continuous to categorical variables via an interval-based method before being combined with fault distribution, resulting in seven variables for clustering. The research findings indicate that categorical variables significantly influence clustering results; concurrently, as the importance of continuous variables within the cluster increases, so does k. When only one categorical variable is present, residual gravity anomalies show significantly better clustering than fault distribution; however, when two categorical variables are involved, it is essential to consider the quantity of categories: more categories lead to poorer quality. The results from the Jiaolai Basin’s northeastern margin indicate a significant correlation with known gold deposits; two-step clustering is a promising and effective method for improving mineral prospecting efforts.

27

Raymaekers, Jakob, and Ruben H. Zamar. "Pooled variable scaling for cluster analysis." Bioinformatics 36, no. 12 (13 April 2020): 3849–55. http://dx.doi.org/10.1093/bioinformatics/btaa243.

Abstract:

Motivation: Many popular clustering methods are not scale-invariant because they are based on Euclidean distances. Even methods using scale-invariant distances, such as the Mahalanobis distance, lose their scale invariance when combined with regularization and/or variable selection. Therefore, the results from these methods are very sensitive to the measurement units of the clustering variables. A simple way to achieve scale invariance is to scale the variables before clustering. However, scaling variables is a very delicate issue in cluster analysis: a bad choice of scaling can adversely affect the clustering results. On the other hand, reporting clustering results that depend on measurement units is not satisfactory. Hence, a safe and efficient scaling procedure is needed for applications in bioinformatics and medical sciences research.

Results: We propose a new approach for scaling prior to cluster analysis based on the concept of pooled variance. Unlike available scaling procedures, such as the SD and the range, our proposed scale avoids dampening the beneficial effect of informative clustering variables. We confirm through an extensive simulation study and applications to well-known real-data examples that the proposed scaling method is safe and generally useful. Finally, we use our approach to cluster a high-dimensional genomic dataset consisting of gene expression data for several specimens of breast cancer cells tissue obtained from human patients.

Availability and implementation: An R-implementation of the algorithms presented is available at https://wis.kuleuven.be/statdatascience/robust/software.

Supplementary information: Supplementary data are available at Bioinformatics online.
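
To make the notion of pooled variance in the abstract above concrete, the following minimal sketch compares the overall standard deviation of a variable with its pooled within-group standard deviation. It is not the authors' algorithm: here the group labels are assumed to be known, whereas the abstract does not say how groups are obtained, and the function name `pooled_sd` and the toy values are hypothetical.

```python
import statistics


def pooled_sd(values, labels):
    """Pooled within-group standard deviation of a single variable."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    ss_within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_within += sum((v - mean) ** 2 for v in members)
    # Degrees of freedom: observations minus number of groups.
    dof = len(values) - len(groups)
    return (ss_within / dof) ** 0.5


# Hypothetical informative variable: two tight, well-separated groups.
informative = [1.0, 1.2, 0.8, 1.1, 0.9, 9.0, 9.2, 8.8, 9.1, 8.9]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

overall = statistics.stdev(informative)
pooled = pooled_sd(informative, labels)
gap = 9.0 - 1.0  # distance between the two group means

print("separation after SD scaling:    ", gap / overall)
print("separation after pooled scaling:", gap / pooled)
```

Dividing by the overall SD shrinks the gap between the groups because the gap itself inflates the SD; dividing by the pooled SD keeps the groups far apart, which is the dampening effect the abstract refers to.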

28

Başaran, Bülent. "Examining Preservice Teachers’ TPACK-21 Efficacies with Clustering Analysis in Terms of Certain Variables." Malaysian Online Journal of Educational Technology 8, no. 3 (1 July 2020): 84–99. http://dx.doi.org/10.17220/mojet.2020.03.005.

29

Hummel, Manuela, Dominic Edelmann, and Annette Kopp-Schneider. "Clustering of samples and variables with mixed-type data." PLOS ONE 12, no. 11 (28 November 2017): e0188274. http://dx.doi.org/10.1371/journal.pone.0188274.

30

Hajnal, Istvan, and Geert Loosveldt. "The Sensitivity of Hierarchical Clustering Solutions to Irrelevant Variables." Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique 50, no. 1 (March 1996): 56–70. http://dx.doi.org/10.1177/075910639605000105.

31

Brusco, Michael J. "Clustering Binary Data in the Presence of Masking Variables." Psychological Methods 9, no. 4 (2004): 510–23. http://dx.doi.org/10.1037/1082-989x.9.4.510.

32

Lim, Yaeji, Hee‐Seok Oh, and Ying Kuen Cheung. "Functional clustering of accelerometer data via transformed input variables." Journal of the Royal Statistical Society: Series C (Applied Statistics) 68, no. 3 (16 September 2018): 495–520. http://dx.doi.org/10.1111/rssc.12310.

33

Qannari, E. M., E. Vigneau, P. Luscan, A. C. Lefebvre, and F. Vey. "Clustering of variables, application in consumer and sensory studies." Food Quality and Preference 8, no. 5-6 (September 1997): 423–28. http://dx.doi.org/10.1016/s0950-3293(97)00008-6.

34

Yu, Chang, and Daniel Zelterman. "Sums of dependent Bernoulli random variables and disease clustering." Statistics & Probability Letters 57, no. 4 (May 2002): 363–73. http://dx.doi.org/10.1016/s0167-7152(02)00091-3.

35

Li, Ye, Yiyan Chen, and Qun Li. "A Clustering Algorithm for Triangular Fuzzy Normal Random Variables." International Journal of Fuzzy Systems 22, no. 7 (15 September 2020): 2083–100. http://dx.doi.org/10.1007/s40815-020-00933-7.

36

González-Rodríguez, Gil, Ana Colubi, Pierpaolo D’Urso, and Manuel Montenegro. "Multi-sample test-based clustering for fuzzy random variables." International Journal of Approximate Reasoning 50, no. 5 (May 2009): 721–31. http://dx.doi.org/10.1016/j.ijar.2009.01.003.

37

Fernández, Antonio, José A. Gámez, Rafael Rumí, and Antonio Salmerón. "Data clustering using hidden variables in hybrid Bayesian networks." Progress in Artificial Intelligence 2, no. 2-3 (9 April 2014): 141–52. http://dx.doi.org/10.1007/s13748-014-0048-3.

38

Lee, Yunjung, and Seyoung Park. "Spectral clustering of weighted variables on multi-omics data." Korean Journal of Applied Statistics 36, no. 3 (30 June 2023): 175–96. http://dx.doi.org/10.5351/kjas.2023.36.3.175.

39

Hagoel, Lea, Liora Ore, Efrat Neter, Zmira Silman, and Gad Rennert. "Clustering Women’s Health Behaviors." Health Education & Behavior 29, no. 2 (April 2002): 170–82. http://dx.doi.org/10.1177/109019810202900203.

Abstract:

This study attempts to characterize health lifestyles by subgrouping women with similar behavior patterns. Data on background, health behaviors, and perceptions were collected via phone interview from 1,075 Israeli women aged 50 to 74. From a cluster analysis conducted on health behaviors, three clusters emerged: a “health promoting” cluster (44.1%), women adhering to recommended behaviors; an “inactive” cluster (40.3%), women engaging in neither health-promoting nor compromising behaviors; and an “ambivalent” cluster (15.4%), women engaging somewhat in both health-promoting and compromising behaviors. Clustering was cross-tabulated by demographic and perceptual variables, further validating the subgrouping. The cluster solution was also validated by predicting another health behavior (mammography screening) for which there was an external validating source. Findings are discussed in comparison to published cluster solutions, culminating in suggestions for intervention alternatives. The concept of lifestyle was deemed appropriate to summarize the clustering of these behavioral, perceptual, and structural variables.

40

Rosing, K. E., and C. S. ReVelle. "Optimal Clustering." Environment and Planning A: Economy and Space 18, no. 11 (November 1986): 1463–76. http://dx.doi.org/10.1068/a181463.

Abstract:

Cluster analysis can be performed with several models. One method is to seek those clusters for which the total flow between all within-cluster members is a maximum. This model has, until now, been viewed as mathematically difficult because of the presence of products of integer variables in the objective function. In another optimization model of cluster analysis, the p-median, a central member is found for each cluster, so that relationships of cluster members with the various central members are maximized (or minimized). This problem, although mathematically tractable, is a less realistic formulation of the general clustering problem. The formulation of the maximum interflow problem is here transformed in stages into a linear analogue which is economically solvable. Computation experience with the several transformed stages is reported and a practical example of the analysis demonstrated.

41

Ma, Xinwei, Ruiming Cao, and Yuchuan Jin. "Spatiotemporal Clustering Analysis of Bicycle Sharing System with Data Mining Approach." Information 10, no. 5 (2 May 2019): 163. http://dx.doi.org/10.3390/info10050163.

Abstract:

The main objective of this study is to explore the spatiotemporal activity patterns of a bicycle sharing system by combining temporal and spatial attribute variables through clustering analysis. Specifically, three clustering algorithms, i.e., hierarchical clustering, K-means clustering, and expectation maximization clustering, are chosen to group the bicycle sharing stations. The temporal attribute variables are obtained through statistical analysis of bicycle sharing smart card data, and the spatial attribute variables are quantified by point of interest (POI) data around bicycle sharing docking stations, which reflect the influence of land use on the bicycle sharing system. According to the performance of the three clustering algorithms on six cluster validation measures, K-means clustering performed best for the case of Ningbo, China. The 477 bicycle sharing docking stations were then grouped into seven clusters. The results show that the stations of each cluster have their own unique spatiotemporal activity pattern, influenced by people’s travel habits and land use characteristics around the stations. This analysis will help bicycle sharing operators better understand the system usage and learn how to improve the service quality of the existing system.

42

Jamotton, Charlotte, Donatien Hainaut, and Thomas Hames. "Insurance Analytics with Clustering Techniques." Risks 12, no. 9 (5 September 2024): 141. http://dx.doi.org/10.3390/risks12090141.

Abstract:

The K-means algorithm and its variants are well-known clustering techniques. In actuarial applications, these partitioning methods can identify clusters of policies with similar attributes. The resulting partitions provide an actuarial framework for creating maps of dominant risks and unsupervised pricing grids. This research article aims to adapt well-established clustering methods to complex insurance datasets containing both categorical and numerical variables. To achieve this, we propose a novel approach based on Burt distance. We begin by reviewing the K-means algorithm to establish the foundation for our Burt distance-based framework. Next, we extend the scope of application of the mini-batch and fuzzy K-means variants to heterogeneous insurance data. Additionally, we adapt spectral clustering, a technique based on graph theory that accommodates non-convex cluster shapes. To mitigate the computational complexity associated with spectral clustering’s O(n^3) runtime, we introduce a data reduction method for large-scale datasets using our Burt distance-based approach.

43

Bello, Thaísa B., Anderson G. Costa, Thainara R. da Silva, Juliana L. Paes, and Marcus V. M. de Oliveira. "Tomato quality based on colorimetric characteristics of digital images." Revista Brasileira de Engenharia Agrícola e Ambiental 24, no. 8 (August 2020): 567–72. http://dx.doi.org/10.1590/1807-1929/agriambi.v24n8p567-572.

Abstract:

Results of evaluations using optical evaluation methods may be correlated with tomato quality and maturation. In this context, the objective of this study was to evaluate the correlation between tomato colorimetric and physico-chemical variables, clustering them as a function of maturation stages, using multivariate analysis. The experiment was conducted using 150 fruits and three maturation stages (immature, light red and mature). The physico-chemical variables were evaluated through traditional methods. The colorimetric variables were assessed on images in the RGB color model taken with a digital camera. The correlation between colorimetric and physico-chemical variables was analyzed using Pearson’s coefficient. Principal components analysis and the k-means clustering method were applied to three data sets: RGB isolated variables; colorimetric variables calculated by relation between the RGB bands (colorimetric indexes); and physico-chemical variables. The colorimetric variables presented a higher explanatory capacity for the maturation variation than the physico-chemical variables. The colorimetric indexes presented higher performance in clustering (accuracy of 0.98) tomatoes as a function of maturation.

44

Rosyada, Istina Alya, and Dina Tri Utari. "Penerapan Principal Component Analysis untuk Reduksi Variabel pada Algoritma K-Means Clustering." Jambura Journal of Probability and Statistics 5, no. 1 (4 June 2024): 6–13. http://dx.doi.org/10.37905/jjps.v5i1.18733.

Abstract:

K-Means clustering is a widely used clustering algorithm. However, it has the disadvantage that the performance of clustering data decreases if the variables of the processed data are immense. The complex variables problem in K-Means can be overcome by combining the Principal Component Analysis (PCA) variable reduction method. This study uses seven indicator variables for the welfare of the people of West Java Province in 2021 to measure the welfare level of districts/cities. The results of the analysis obtained two principal components based on eigenvalues. Clustering from cluster analysis with the K-Means with variable reduction using PCA formed the three best clusters where the number of members of each cluster consisted of 12, 8, and 7 districts/cities.
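
The pipeline described above (reduce the variables with PCA, keep components based on their eigenvalues, then run K-Means on the component scores) could look roughly like the sketch below. Several things here are assumptions rather than details from the abstract: the "eigenvalue greater than 1" retention rule, the plain NumPy K-Means loop, the function names `pca_kaiser` and `kmeans`, and the synthetic 27-by-7 data, which only mimics the reported shape (27 districts/cities, seven indicators) and is not the study's data.

```python
import numpy as np


def pca_kaiser(X):
    """Return scores on the components whose correlation-matrix eigenvalue exceeds 1."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize each column
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # ascending order for symmetric input
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0  # assumed retention rule
    return Z @ eigvecs[:, keep], eigvals


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd iterations with random initial centers drawn from the data."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic stand-in data: 27 rows, 7 correlated indicator variables.
rng = np.random.default_rng(42)
latent = rng.normal(size=(27, 2))
loadings = rng.normal(size=(2, 7))
X = latent @ loadings + 0.3 * rng.normal(size=(27, 7))

scores, eigvals = pca_kaiser(X)
labels, centers = kmeans(scores, k=3)
print("components kept:", scores.shape[1])
print("cluster sizes:", np.bincount(labels, minlength=3))
```

Clustering the low-dimensional scores instead of the raw indicators is the design choice the abstract argues for; the number of clusters (three here) would normally be chosen with a validation measure rather than fixed in advance.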

45

Zhu, Chuanze, Xu Zhong, Zhenjie Lin, Liming Wang, Wenzhong Li, and Sanglu Lu. "Multivariate Time Series Clustering based on Graph Convolutional Network." Journal of Physics: Conference Series 2522, no. 1 (1 June 2023): 012021. http://dx.doi.org/10.1088/1742-6596/2522/1/012021.

Abstract:

Multivariable time series (MTS) clustering is an important topic in time series data mining. The major challenge of MTS clustering is to capture the temporal correlations and the dependencies between multiple variables. In this paper, we propose a novel MTS clustering approach based on graph convolutional network (GCN), which is a powerful feature extractor for graph structure data. We regard each variable in MTS as a node in the graph and construct edges through the correlation between variables. Furthermore, GCN and deep learning back-propagation technology are used to continuously learn the relationship between multiple variables. Combining the learned variables with the characteristics of the time dimensions, the comprehensive features can be fused to form an effective representation for the MTS clustering task. We carry out extensive experimental analysis on four open time series data sets and six benchmark algorithms, which shows the superiority of the proposed method.
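
The graph construction mentioned in the abstract above (one node per variable, edges from the correlation between variables) and a single graph-convolution step can be illustrated as follows. This is a sketch under assumptions, not the paper's architecture: the 0.5 correlation threshold, the symmetric-normalized propagation rule with self-loops, the untrained random weights, and the names `correlation_graph` and `gcn_layer` are all chosen for this example only.

```python
import numpy as np


def correlation_graph(series, threshold=0.5):
    """Adjacency matrix with an edge where |correlation| >= threshold.

    series has shape (n_variables, n_timesteps); the threshold is an assumed value.
    """
    corr = np.corrcoef(series)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added inside the layer
    return adj


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)


# Hypothetical multivariate series: two sine-like and two cosine-like variables.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 200)
series = np.vstack([
    np.sin(t),
    np.sin(t) + 0.1 * rng.normal(size=t.size),
    np.cos(t),
    np.cos(t) + 0.1 * rng.normal(size=t.size),
])

adj = correlation_graph(series, threshold=0.5)
# Simple per-variable node features (mean and standard deviation over time).
features = np.column_stack([series.mean(axis=1), series.std(axis=1)])
weights = rng.normal(size=(2, 3))  # untrained weights, for illustration only
embeddings = gcn_layer(adj, features, weights)
print(adj)
print(embeddings.shape)
```

In a trained model the weights would be learned by back-propagation and the node embeddings combined with temporal features before clustering, as the abstract outlines.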

46

Hendricks, Renee, and Mohammad Khasawneh. "Cluster Analysis of Categorical Variables of Parkinson’s Disease Patients." Brain Sciences 11, no. 10 (29 September 2021): 1290. http://dx.doi.org/10.3390/brainsci11101290.

Abstract:

Parkinson’s disease (PD) is a chronic disease. No treatment stops its progression, and it presents symptoms in multiple areas. One way to understand the PD population is to investigate the clustering of patients by demographic and clinical similarities. Previous PD cluster studies included scores from clinical surveys, which provide a numerical but ordinal, non-linear value. In addition, these studies did not include categorical variables, as the clustering method utilized was not applicable to categorical variables. It was discovered that the numerical values of patient age and disease duration were similar among past cluster results, pointing to the need to exclude these values. This paper proposes a novel and automatic discovery method to cluster PD patients by incorporating categorical variables. No estimate of the number of clusters is required as input, whereas the previous cluster methods require a guess from the end user in order for the method to be initiated. Using a patient dataset from the Parkinson’s Progression Markers Initiative (PPMI) website to demonstrate the new clustering technique, our results showed that this method provided an accurate separation of the patients. In addition, this method provides an explainable process and an easy way to interpret clusters and describe patient subtypes.

47

Normoyle, Aline, et Shane Jensen. « Bayesian Clustering of Player Styles for Multiplayer Games ». Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 11, n^o 1 (24 juin 2021) : 163–69. http://dx.doi.org/10.1609/aiide.v11i1.12805.

Résumé :

With game play data, empirical approaches to clustering are typically based solely on game outcomes, e.g. kills, deaths, and score for each player. In this paper, we investigate a method for clustering players based on how a player’s choices relate to outcomes, or equivalently the latent player styles exhibited by players. Our approach is based on a Bayesian semi-parametric clustering method which has several advantages: the number of clusters does not need to be specified a priori; the technique can work with a very compact representation of each match (e.g. consisting primarily of indicator variables for player choices); a player can belong to multiple clusters and hence can have a hybrid style; and the resulting clusterings often have a straightforward interpretation. To demonstrate the approach, we apply our method to multiplayer match logs from Battlefield 3 consisting of over 1200 players and 500,000 matches.

48

Mogaraju, Jagadish Kumar. « Agglomerative and Divisive hierarchical cluster analysis of groundwater quality variables using opensource tools over YSR district, AP, India ». Journal of Scientific Research 66, n^o 04 (2022) : 15–20. http://dx.doi.org/10.37398/jsr.2022.660403.

Résumé :

Groundwater quality variables like F, Total Hardness (TH), Total Alkalinity (TA), Total Dissolved Solids (TDS), SO4, SAR, NA, EC, Cl, Ca, Mg, and pH were tested with hierarchical clustering analysis (HCA) to identify the groupings or clusters that exist in the dataset. The dataset was subjected to agglomerative and divisive hierarchical clustering. The observations were scaled to compare variables systematically. The clustering structure was determined using an agglomerative coefficient. Agglomerative approaches like complete, average, single, and Ward linkage were compared using agglomerative coefficients. The Ward approach suited the dataset best, indicating a strong clustering structure. The agglomerative coefficient obtained is 0.8666752, and the divisive coefficient is 0.8371531. The entanglement score attained was 0.26, demonstrating a good alignment with nominal entanglement. The principal component analysis resulted in two main components contributing 54.8% and 18.2% of the explained variance. The variables that are prominent in each PC are investigated and reported. The gap statistic and average silhouette method are used to determine the optimal number of clusters. Open-source software like R/RStudio is used for this analysis. This work concludes that clustering analysis is essential to understand the groundwater quality variables better.

49

Li, Wei Ping, De Qing Quan et Jun Cai. « Application of Data Mining in Sports in the Consumer Market Segmentation ». Applied Mechanics and Materials 631-632 (septembre 2014) : 280–83. http://dx.doi.org/10.4028/www.scientific.net/amm.631-632.280.

Résumé :

This paper combines data mining technology with the rich sports consumption data of a city household survey. Using the K-Means fast clustering method, sports consumer market models were constructed from different sets of variables. The research shows that a clustering model built on sports consumption content alone performs better than one built on demographics, sports consumption content, consumer psychology, and way of life together. According to the clustering results, city residents are divided into four consumer groups with different sports consumption features.

50

Mukherjee, Sudipto, Himanshu Asnani, Eugene Lin et Sreeram Kannan. « ClusterGAN : Latent Space Clustering in Generative Adversarial Networks ». Proceedings of the AAAI Conference on Artificial Intelligence 33 (17 juillet 2019) : 4610–17. http://dx.doi.org/10.1609/aaai.v33i01.33014610.

Résumé :

Generative Adversarial Networks (GANs) have obtained remarkable success in many unsupervised learning tasks and, unarguably, clustering is an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space. In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering-specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon that GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.
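The latent sampling this abstract mentions (a continuous part combined with a one-hot part) might look roughly like the sketch below. It covers only the sampling step, not the generator, inverse network, or loss. The function name `sample_latent`, the dimensions, and the noise scale `sigma` are assumptions chosen for illustration and are not taken from the abstract.

```python
import random

def sample_latent(n_clusters, noise_dim, sigma=0.1, rng=random):
    """Draw one latent vector: Gaussian noise followed by a one-hot code.

    Returns the concatenated vector and the index of the sampled cluster.
    `sigma` (noise standard deviation) is an assumed value.
    """
    k = rng.randrange(n_clusters)                      # uniform cluster choice
    one_hot = [1.0 if i == k else 0.0 for i in range(n_clusters)]
    noise = [rng.gauss(0.0, sigma) for _ in range(noise_dim)]
    return noise + one_hot, k

# Seeded generator so repeated runs draw the same sample.
rng = random.Random(0)
z, k = sample_latent(n_clusters=10, noise_dim=30, rng=rng)
```

The one-hot block makes the latent space discontinuous across clusters, while the continuous block leaves room for variation within a cluster; keeping `sigma` small keeps samples from different clusters well separated.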
